ICP-1.2.html

Python: module ICP ICP (version 1.2, 2011-September-12) ICP.py Version: 1.2 Author: Avinash Kak (kak@purdue.edu) Date: 2011-September-12 Download Version 1.2: gztar bztar View version 1.2 code in browser Switch to Version 1.3 CHANGE LOG: Version 1.2 allows for a movie-like display of the superimposed model and data images. This makes it much easier to see the convergence of the registration between the two images. Another change in Version 1.2 is a size reduction of large binary images to speed up the computations. The size to which all images are now reduced is set by the 'calculation_image_size' parameter of the constructor. Previously, only the color and the grayscale images were subject to such size reduction. Version 1.2 also removes a bug in the mosaicked display of the results. Version 1.1 includes a new option for the constructor that lets the system decide as to which image to use as the model and which image as the data. In general, for color and grayscale images, you get superior registration between the two images if the image that results in fewer pixels for ICP processing is used as the data image. Version 1.1 also includes a better (although still extremely primitive) data generator for creating simple synthetic binary images that can be used to experiment with the ICP algorithm. INTRODUCTION: ICP stands for Iterative Closest Point algorithm. ICP algorithms are used to register two data sets (meaning making one data set spatially congruent with the other data set) by applying iteratively a rotation and translation to one data set until it is congruent with the other data set. In the domain of image processing, ICP can be used to register a data image recorded through a sensor against a model image that may be produced by a geographic information system (GIS). A typical application would be a UAV recording images as it flies over a terrain; a successful registration between such sensor images and the the model images produced by an on-board or satellite-connected GIS system would enable precise computation of the position and the orientation of the UAV vis-a-vis the terrain. The main goal of the pure-Python implementation of ICP presented here is to make it easier to experiment with the different aspects of such algorithms. USAGE: For the case of small binary images (small means that the width and the height are less than 100 pixels), a simple usage example would look like: import ICP icp = ICP.ICP( model_image = "triangle1.jpg", data_image = "triangle2.jpg", binary_or_color = "binary", iterations = 10, ) icp.icp() icp.display_results() icp.display_results_as_movie() For registering a shape in one image with a shape in another image, you'd use the binary case shown above. IMPORTANT: BY BINARY IMAGE, WE MEAN AN IMAGE IN WHICH THE ONLY NON-ZERO PIXELS ARE ON THE SHAPE BOUNDARIES. For large binary images, you are likely to use this module in this fashion: import ICP icp = ICP.ICP( model_image = "triangle1.jpg", data_image = "triangle2.jpg", binary_or_color = "binary", calculation_image_size = 100, iterations = 10, ) icp.icp() icp.display_results() icp.display_results_as_movie() For the case of color images (INCLUDING THE CASE OF GRAYSCALE IMAGES), the simplest calls to the module would look like import ICP icp = ICP( model_image = "modelterrain.jpg", data_image = "image_from_camera.jpg", binary_or_color = "color", iterations = 15, connectivity_threshold = 5, calculation_image_size = 100, display_step = 1, ) icp.icp() icp.display_results() icp.display_results_as_movie() where the parameter calculation_image_size controls the size of the image that will actually be used for ICP calculations. Color (and grayscale) images as output by sensors can be very large and it would be impractical to process all of the pixel data in the images. This module first smoothly reduces the size of the images so that the maximum dimension does not exceed calculation_image_size and then carries out ICP processing on the reduced-size images. The parameter connectivity_threshold controls how many of the edge pixels will be chosen for processing. In the current implementation, an edge pixel is chosen if it has at least connectivity_threshold number of other edge pixels in its 5x5 neighborhood. Should you choose to use the "color" option for processing images that are essentially binary shape images, make sure you use a small value such as 2 for the connectivity_threshold. If you do not care as to which image is used as a model and which as the data, you may get superior results if you choose the image with fewer pixels as the data image. For color and grayscale images, that would be the image which yields a smaller set of edge pixels that are supposed to be used for ICP processing. To let the system swap the model and the data images, you would need to turn on the auto_select_model_and_data option, as in import ICP icp = ICP( model_image = "modelterrain.jpg", data_image = "image_from_camera.jpg", binary_or_color = "color", auto_select_model_and_data = 1, iterations = 40, connectivity_threshold = 5, calculation_image_size = 100, display_step = 1, ) icp.icp() icp.display_results() icp.display_results_as_movie() The module also includes a static method gendata() to illustrate how one can create simple synthetic images to experiment with this ICP module. A call to this method looks like import ICP ICP.ICP.gendata( "triangle", (80,80), (10,10), 30, "newtriangle2.jpg" ) As currently programmed, the gendata() method constructs images with triangles and straight lines segments. CONSTRUCTOR PARAMETERS: auto_select_model_and_data: When set to 1, the system decides as to which of the two images you supplied to the constructor will actually be used as model and which as data. (DEFAULTS to 0) binary_or_color: Set to 'binary' for binary images and to 'color' for grayscale and color images (REQUIRED) calculation_image_size: The size to which a large image will be reduced for ICP processing (DEFAULTS to 100 for large images) connectivity_threshold: This parameter decides as to which edge pixels in a grayscale or a color image will participate in the ICP calculations. For an edge pixel to be accepted, it must have at least connectivity_threshold number of other edge pixels in a 5x5 neighborhood. (DEFAULTS to 5) data_image: The name of the data file (REQUIRED) debug: When set to 1, the module prints out additional information useful for debugging purposes. (DEFAULTS to 0) display_step: Controls how often the images at the end of successive iterations will be shown in the final display of the results. For example, when set to 5, every fifth image will be included for the final display. (DEFAULTS to 5 when the number of iterations exceed 40 and to 1 otherwise) error_threshold: When the difference in the registration errors between two successive iterations is less than the error_threshold, we stop iterating. (DEFAULTS to 0.0) iterations: The maximum number of iterations to try (DEFAULTS to 10) model_image: The name of the model file (REQUIRED) save_output_images: When set to 1, the module will save the images produced at the end of each iteration in the current directory. Such images are named "__resultX.jpg" for integer values of X that start from 0. (DEFAULTS to 0) METHODS: (1) You must call the method icp() for the basic ICP calculations: import ICP icp = ICP.ICP( ...constructor options ...) icp.icp() (2) The results in the mosaic mode are displayed by icp.display_results() (3) The movie version of the results that better shows the convergence of the registration between the model and the data images is seen by calling icp.display_results_as_movie() HOW THE RESULTS ARE DISPLAYED: The results are displayed using Tkinter graphics (meaning, actually, Tk graphics via the Tkinter interface). The registration results can be displayed in two different modes: a mosaic mode and a "movie" mode. In the former, the various iterations of the data image as it is subject to pose transforms to bring it into registration with the model image are shown in separate frames. In the movie mode, the model image and the various iterations of the data image are shown superimposed. It is easier to see registration convergence in the movie mode. In the mosaic mode, for binary images, the first row shows the model image followed by the data image. Subsequent rows display the data image being translated and rotated as it comes into congruence with the model image. Again in the mosaic mode, for color (and grayscale) images, the first row displays six images, three for the model and three for the data. In each set of three images, the first image is the actual photograph, the second the edges in the photo, and the third the pixels actually used for ICP calculations. Subsequent rows display the data image being brought into registration with the model image through successive iterations. Note that whereas the ICP calculations are carried out on only the sparse set of pixels shown as the third image in the model and the data sequences in the first row, the computed transformations are applied to the edge images shown as the second image for each. The display of results requires access to the "times.ttf" font file for the labels that are needed in the display. The module assumes that this file can be located through the paths stored in 'sys.path', or via the paths available through your environment variables, or at the following path: /usr/share/fonts/truetype/msttcorefonts/times.ttf which is typical for a Ubuntu install. If that is not the case, you would need to edit the 11th line of the code shown for display_results() method and enter the pathname to where "times.ttf" can be found. THE EXAMPLES DIRECTORY: The Examples subdirectory of the installation shows the following two scripts: binary_image_registration.py color_image_registration.py It is highly recommended that you play with these examples before modifying or extending the code in this module. There are four pairs (model and data) of images in this directory that you call the above scripts on. CAVEATS: Within the limitations caused by the fact that ICP calculations can get trapped in a local minimum, you can expect to get good results on binary images. As for color images, what results you get would depend greatly on how complex the images are, the quality of edge detection, and how the final selection of pixels for ICP computation is carried out. The implementation shown in this module uses very rudimentary edge detection and even more rudimentary final pixel selection. (That is because the focus of this module is on ICP and not on digital image processing.) To improve the results on color (and grayscale) images, you would need to replace the pixel extraction functions with your own implementations. Another option would be to carry out pixel extraction separately and to then feed the pixels thus selected as binary images to this module. When processing color (and grayscale) images, if you are free to decide as to which image to call model and which to call data, choose the one that results in fewer pixels for ICP processing as the data image. You are likely to get better results if you do so. If you turn the auto_select_model_and_data option on in the call to the constructor, the system will make that decision for you automatically. INSTALLATION: The ICP class was packaged using Distutils. For installation, execute the following command-line in the source directory (this is the directory that contains the setup.py file after you have downloaded and uncompressed the package): python setup.py install You have to have root privileges for this to work. On Linux distributions, this will install the module file at a location that looks like /usr/lib/python2.6/site-packages/ If you do not have root access, you have the option of working directly off the directory in which you downloaded the software by simply placing the following statements at the top of your scripts that use the ICP class: import sys sys.path.append( "pathname_to_ICP_directory" ) To uninstall the module, simply delete the source directory, locate where ICP was installed with "locate ICP" and delete those files. As mentioned above, the full pathname to the installed version is likely to look like /usr/lib/python2.6/site-packages/ICP* If you want to carry out a non-standard install of the ICP module, look up the on-line information on Disutils by pointing your browser to http://docs.python.org/dist/dist.html THEORETICAL BASIS: The first ICP algorithm was proposed by Paul Besl and Neil McKay in a now celebrated paper that appeared in 1992 in IEEE Transactions on PAMI. Since then various versions of the algorithm have been published by other folks for either speeding up the performance of the algorithm or for improving its accuracy. The algorithm implemented here is as simple as it can be. We model the relationship between model and data as R x_d + T = x_m where x_d denote the data points and x_m the model points, each a two-element vector. R is a 2x2 rotation matrix and T a 2-element translation vector. Since two planar figures may not be related by the above transformation (even when one figure "appears" to be a rotated version of the other figure) for an arbitrary location of the origin, move the origin to the mean point of the model by x_m = x_m - mean( x_m ) x_d = x_d - mean( x_m ) With regard to the calculation of R and T, let's express the data points and the CORRESPONDING model points as [x_d1, x_d2, ....., x_dn] and their CORRESPONDING model points [x_m1, x_m2, ....., x_mn] We can now write the following for estimating the rotation matrix: R . A = B where A = [x_d1, x_d2, ....., x_dn] B = [x_m1 - T, x_m2 - T, ....., x_mn - T] Both A and B are 2xn matrices. We can now write R . A . A^t = B . A^t where A^t is the transpose of A. So, we have the following as a least mean squares estimate for R: R = B . A^t . ( A . A^t )^-1 Since such an R may not obey the strict properties that must apply to a rotation matrix (it must be orthonormal), we condition it by first subjecting it to a singular value decomposition: U.S.V^t = svd( R ) and then writing the following for a better estimate for R: R = U . V^t We can now estimate T by T = mean( x_m ) - mean( R x_d ) The above assumes that we are carrying out an one-shot calculation of R and T. But ICP is iterative. The following applies to an iterative implementation of the above: For an iterative assessment of R and T, let's assume that we can decompose R as R = \deltaR . R_0 where we have previously calculated R_0 and now we wish to refine the estimate with the calculation of \deltaR. Plugging the above in the previous formulation: \deltaR . R_0 . A . A^t = B . A^t implying \deltaR . R_0 = B . A^t . ( A . A^t )^-1 which is to say: \deltaR = B . A^t . ( A . A^t )^-1 . R_0^t After you have calculated \deltaR, you can update the rotation matrix by R = \deltaR . R_0 At the end of each such update of the rotation matrix, the calculation of the translation vector remains the same as before T = mean( x_m ) - mean( R . x_d ) We can therefore write down the following steps for ICP computation: STEP 1: x_m = x_m - mean(x_m) x_d = x_d - mean(x_m) STEP 2: Initialize R and T: R = 1 0 0 1 T = 0 0 old_error = inf STEP 3: error = (1/N) \sum dist( x_m - (R * x_d + T) ) STEP 4: diff_error = abs(old_error - error) if (diff_error > threshold): old_error = error else: break STEP 5: for each x_d find its closest x_m by finding that x_m which minimizes the squared difference between the two sides of R . x_d + T = x_m STEP 6: A = [x_d1, x_d2, ....., x_dn] B = [x_m1 - T, x_m2 - T, ....., x_mn - T] Note that A will remain the same for ICP iterations, but B can change with each iteration depending on the what corresponding pixels are found for each data pixel. STEP 7: AATI = A^t inverse(A * A^t) STEP 8: R_update = B * AATI * R.T STEP 9: U,S,VT = svd(R_update) deter = determinant(U * VT) U[0,1] = U[0,1] * determinant U[1,1] = U[1,1] * determinant R_update = U * VT STEP 10: R = R_update * R STEP 11: T = mean(x_m) - mean( R * x_d ) STEP 12: Back to Step 3 ACKNOWLEDGMENTS: The author would like to thank Chad Aeschliman and Kyuseo Han for their considerable help with the debugging of the code presented here. ABOUT THE AUTHOR: Avi Kak is in the last stages of finishing up his multi-year three-volume Objects Trilogy project. See his web page at Purdue for what this project is all about. Imported Modules Image ImageChops ImageDraw ImageFilter ImageFont ImageTk Tkinter os scipy sys Classes __builtin__.object ICP class ICP(__builtin__.object) Methods defined here: __del__(self) __init__(self, *args, **kwargs) callbak(self, arg) construct_AATI_matrix(self) So we want to estimate R from R.A = B. If we had to construct a one-shot estimate for R (note ICP is iterative and not one-shot), we would write R.A=B as R.A.A^t = B.A^t and then R = B . A^t . (A . A^t)^-1 We will group together what comes after B on the right hand side and write AATI = A^t . (A . A^t)^-1 In the ICP implementation, this matrix will remain the same for all the iterations. construct_A_matrix(self) In the plane whose origin is at the model center, the relationship between the model points x_m and the data points x_d is given by R . x_d + T = x_m Let the list of the n chosen data points be given by A = [x_d1, x_d2, ....., x_dn] We can now express the relationship between the data and the model points by R . A = B where B is the list of the CORRESPONDING model points after we subtract the translation form each: B = [x_m1 - T, x_m2 - T, ......, x_mn - T] Eventually our goal will be to estimate R from the R.A = B relationship. display_results(self) display_results_as_movie(self) extract_data_pixels(self) Since extract_model_pixels() and extract_data_pixels() have the same logic, why can't we have just one such method called with two different arguments? These exist as separate methods to allow for the future possibility that one may want to process the model and the data images very differently. Presumably, the model images will be noiseless and will be output by some GIS system. On the other hand, the data images can be expected to be noisy and may suffer from optical and other distortions. extract_model_pixels(self) extract_pixels_from_color_image(self, image) This very simple routine would need to be either replaced in this class or overridden in a subclass of ICP for a more practical approach to the selection of pixels for ICP calculation. At this time, all we do is to choose a pixel if it has at least connectivity_threshold number of pixels in its 5x5 neighborhood. This is done with the hope that the pixels selected in this manner for ICP processing will fall mostly on the edges in the image. icp(self) initialize(self) move_to_model_origin(self) Since two patterns that are situated at different places in a plane may not be related by a Euclidean transform for an arbitrary placement of the origin even when one pattern appears to be a rotated version of the other, we will assume that the origin for ICP calculations will be at the "center" of the model image. Now our goal becomes to find an R and a T that will make the data pattern congruent with the model pattern with respect to this origin. setSizeForDisplay(self) Static methods defined here: gendata(feature, imagesize, position, orientation, output_image_name) Permissible values for feature: 'line', 'triangle' Permissible values for imagesize: (m,n) tuple for the size of the output image Permissible values for position: (x,y) pixel coordinates Permissible values for orientation: integer value for degrees The code here is just the simplest example of synthetic data generation for experimenting with ICP. You can obviously construct more complex model and data images by calling on the other shape drawing primitives of the ImageDraw class. When specifying coordinates, note the following .----------> positive x | | | V positive y A line is drawn from the first pair (x,y) coordinates to the second pair. Data descriptors defined here: __dict__ dictionary for instance variables (if defined) __weakref__ list of weak references to the object (if defined) Data __author__ = 'Avinash Kak (kak@purdue.edu)' __copyright__ = '(C) 2011 Avinash Kak. Python Software Foundation.' __date__ = '2011-September-12' __url__ = 'https://engineering.purdue.edu/kak/distICP/ICP-1.2.html' __version__ = '1.2'

Author Avinash Kak (kak@purdue.edu)

Data
		__author__ = 'Avinash Kak (kak@purdue.edu)' __copyright__ = '(C) 2011 Avinash Kak. Python Software Foundation.' __date__ = '2011-September-12' __url__ = 'https://engineering.purdue.edu/kak/distICP/ICP-1.2.html' __version__ = '1.2'