The Multi-Hash 3D Vision System
While many other 3D vision systems are strongly motivated by a specific class of industrial applications, our MULTI-HASH system represents our efforts in fundamental research on the recognition of 3D objects. With the MULTI-HASH system, we have extended some of our previous polynomial-time recognition systems to make use of surface color, in addition to object geometry.
The MULTI-HASH system is designed to handle objects composed of arbitrary second-order surfaces and to distinguish between identically-shaped but differently-colored objects. Here we show a typical MULTI-HASH scene:
As in any vision system, the first step in the recognition process is data acquisition. In MULTI-HASH, this is accomplished with a custom-designed color structured light scanner, which operates as follows: first, a standard red laser illuminates the scene and an image is acquired; the laser is then turned off, a broader strip of white light replaces it, and a second image is acquired. The overhead scanner then moves to the next incremental scan position and the process repeats.
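The alternating laser / white-light acquisition cycle can be sketched as a simple control loop. The four callables below are hypothetical stand-ins for the scanner hardware interface, not the actual MULTI-HASH API:

```python
def scan_scene(num_positions, acquire_image, set_laser, set_white_light, move_to):
    """Alternate laser and white-light exposures at each scan position.

    Returns a list of (laser_image, color_image) pairs, one per position.
    The callables are illustrative stand-ins for the scanner hardware.
    """
    pairs = []
    for pos in range(num_positions):
        move_to(pos)                       # advance the overhead scanner
        set_laser(True); set_white_light(False)
        laser_img = acquire_image()        # image 1: laser stripe only
        set_laser(False); set_white_light(True)
        color_img = acquire_image()        # image 2: white-light strip
        pairs.append((laser_img, color_img))
    return pairs
```

Each scan position thus yields a registered pair of images: one for range computation and one for color sampling.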
The laser-illuminated images are used to compute a depth map of the scene (as in the Tube Recognition System). In addition, at each laser-striped pixel location, the corresponding pixel in the white-light-illuminated image is sampled to determine the RGB color triplet at that point in the scene. The result is a color composite light-stripe image:
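The two operations above can be illustrated with a toy sketch: a simplified triangulation formula for depth (the baseline and focal-length values are made up for illustration, not the actual scanner calibration) and a lookup that reads the RGB triplet at each stripe pixel:

```python
def stripe_depth(pixel_offset_px, baseline_mm=200.0, focal_px=800.0):
    """Toy triangulation: depth from the lateral displacement of the
    laser stripe in the image. Baseline and focal length are
    illustrative values, not the MULTI-HASH calibration."""
    return baseline_mm * focal_px / pixel_offset_px

def sample_color(color_image, stripe_pixels):
    """At each laser-striped (row, col) location, read the RGB triplet
    from the white-light image acquired at the same scan position."""
    return [color_image[r][c] for (r, c) in stripe_pixels]
```

Because both images are taken from the same camera position, no registration step is needed to associate depth with color.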
A typical computed depth map is shown here:
The depth map is then segmented into distinct surfaces using a windowing segmentation algorithm. This algorithm generates very clean segmentation maps (especially considering the relatively sparse range data it receives as input):
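A much-simplified stand-in for this step is region growing over the depth grid: 4-connected pixels whose depths differ by less than a threshold are grouped into one surface. This is only a sketch of the general idea, not the windowing algorithm itself:

```python
def segment_depth_map(depth, thresh=5.0):
    """Label connected regions of similar depth in a 2D grid.
    None marks pixels with no range data; label 0 means unassigned."""
    rows, cols = len(depth), len(depth[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] is None or labels[r][c]:
                continue
            next_label += 1
            labels[r][c] = next_label
            stack = [(r, c)]
            while stack:                       # flood fill the region
                y, x = stack.pop()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and depth[ny][nx] is not None
                            and not labels[ny][nx]
                            and abs(depth[ny][nx] - depth[y][x]) < thresh):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
    return labels
```

Pixels with missing range data simply stay unlabeled, which matches the sparse input the real algorithm must cope with.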
After the scene has been segmented into distinct surfaces, a set of attributes is computed for each of these surfaces. These attributes include surface-type (e.g., planar, cylindrical, etc.), surface area, radius (for cylinders), average RGB values, etc. In our current implementation, the only surface appearance information we take into account in a surface description is its average RGB value; however, the underlying MULTI-HASH architecture allows a much richer set of surface attributes to be included, such as texture information or spatial color distribution characteristics.
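Two of the attributes named above, surface area and average RGB, are straightforward to compute from a labeled segmentation map. The sketch below assumes parallel 2D grids of labels and colors (a simplification of the actual surface-description code):

```python
def surface_attributes(labels, colors, pixel_area=1.0):
    """Per-surface description: area (pixel count times per-pixel area)
    and average RGB. `labels` and `colors` are parallel 2D grids;
    label 0 means background / no surface."""
    stats = {}
    for row_l, row_c in zip(labels, colors):
        for lab, (r, g, b) in zip(row_l, row_c):
            if lab == 0:
                continue
            n, sr, sg, sb = stats.get(lab, (0, 0, 0, 0))
            stats[lab] = (n + 1, sr + r, sg + g, sb + b)
    return {
        lab: {"area": n * pixel_area,
              "avg_rgb": (sr / n, sg / n, sb / n)}
        for lab, (n, sr, sg, sb) in stats.items()
    }
```

Richer appearance attributes (texture, spatial color distribution) would slot into the same per-surface dictionary.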
After the descriptions of each extracted surface have been generated, these surface features are used as keys into a multiple-attribute hash table. This hash table has been generated off-line by exposing actual model object instances to the sensor. In particular, the hash table generation algorithm attempts to select only the most discriminatory surface features for use in surface recognition; in addition, the bins are partitioned in such a way as to minimize the probabilistic entropy of the table - this results in a very efficient method for generating a limited set of possible scene-to-model match hypotheses.
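The core lookup can be sketched as follows: each attribute is quantized against a set of bin boundaries, the resulting tuple of bin indices is the hash key, and a key retrieves the list of model surfaces that fell into the same bin during off-line training. The boundaries and feature values below are invented for illustration; in MULTI-HASH they are chosen off-line to minimize the table's entropy:

```python
import bisect

def make_key(features, boundaries):
    """Quantize a feature vector into a tuple of bin indices.
    `boundaries` holds the sorted bin edges for each attribute."""
    return tuple(bisect.bisect(edges, f)
                 for f, edges in zip(features, boundaries))

def build_table(models, boundaries):
    """Off-line step: hash each (name, features) model surface into a bin."""
    table = {}
    for name, feats in models:
        table.setdefault(make_key(feats, boundaries), []).append(name)
    return table

def lookup(table, features, boundaries):
    """On-line step: hash a scene surface's features to retrieve the
    limited set of candidate model matches sharing its bin."""
    return table.get(make_key(features, boundaries), [])
```

Because a lookup touches only one bin, hypothesis generation is constant-time regardless of the number of models, which is the efficiency the entropy-driven bin partitioning is designed to preserve.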
The resulting scene-to-model hypotheses are verified or rejected, and the position and orientation of the recognized objects are computed. Our demonstration system then uses a PUMA robot to grasp and pick up the recognized objects as proof of successful recognition and pose determination.
L. Grewe and A. C. Kak, "Interactive Learning of a Multi-Attribute Hash Table Classifier for Fast Object Recognition," Computer Vision and Image Understanding, Vol. 61, No. 3, pp. 387-416, 1995.
L. Grewe and A. C. Kak, "Interactive Learning of Multiple Attribute Hash Table for Fast 3D Object Recognition," Proceedings of the Second IEEE CAD-Based Vision Workshop, Champion, PA, pp. 17-27, February 1994.
L. Grewe and A. C. Kak, "Integration of Geometric and Non-Geometric Attributes for Fast Object Recognition," 1993 SPIE Conference on the Applications of Artificial Intelligence, Orlando, April 1993.