ICP (version 1.2, 2011-September-12) |
ICP.py
Version: 1.2
Author: Avinash Kak (kak@purdue.edu)
Date: 2011-September-12
Download Version 1.2:
gztar
bztar
View version 1.2 code in browser
Switch to Version 1.3
CHANGE LOG:
Version 1.2 allows for a movie-like display of the superimposed
model and data images. This makes it much easier to see the
convergence of the registration between the two images. Another
change in Version 1.2 is a size reduction of large binary images to
speed up the computations. The size to which all images are now
reduced is set by the 'calculation_image_size' parameter of the
constructor. Previously, only the color and the grayscale images
were subject to such size reduction. Version 1.2 also removes a
bug in the mosaicked display of the results.
Version 1.1 includes a new option for the constructor that lets the
system decide as to which image to use as the model and which image
as the data. In general, for color and grayscale images, you get
superior registration between the two images if the image that
results in fewer pixels for ICP processing is used as the data
image. Version 1.1 also includes a better (although still
extremely primitive) data generator for creating simple synthetic
binary images that can be used to experiment with the ICP
algorithm.
INTRODUCTION:
ICP stands for Iterative Closest Point algorithm. ICP algorithms
are used to register two data sets (meaning making one data set
spatially congruent with the other data set) by applying
iteratively a rotation and translation to one data set until it is
congruent with the other data set.
In the domain of image processing, ICP can be used to register a
data image recorded through a sensor against a model image that may
be produced by a geographic information system (GIS). A typical
application would be a UAV recording images as it flies over a
terrain; a successful registration between such sensor images and
the the model images produced by an on-board or satellite-connected
GIS system would enable precise computation of the position and the
orientation of the UAV vis-a-vis the terrain.
The main goal of the pure-Python implementation of ICP presented
here is to make it easier to experiment with the different aspects
of such algorithms.
USAGE:
For the case of small binary images (small means that the width and
the height are less than 100 pixels), a simple usage example would
look like:
import ICP
icp = ICP.ICP(
model_image = "triangle1.jpg",
data_image = "triangle2.jpg",
binary_or_color = "binary",
iterations = 10,
)
icp.icp()
icp.display_results()
icp.display_results_as_movie()
For registering a shape in one image with a shape in another image,
you'd use the binary case shown above. IMPORTANT: BY BINARY IMAGE,
WE MEAN AN IMAGE IN WHICH THE ONLY NON-ZERO PIXELS ARE ON THE SHAPE
BOUNDARIES. For large binary images, you are likely to use this
module in this fashion:
import ICP
icp = ICP.ICP(
model_image = "triangle1.jpg",
data_image = "triangle2.jpg",
binary_or_color = "binary",
calculation_image_size = 100,
iterations = 10,
)
icp.icp()
icp.display_results()
icp.display_results_as_movie()
For the case of color images (INCLUDING THE CASE OF GRAYSCALE IMAGES),
the simplest calls to the module would look like
import ICP
icp = ICP(
model_image = "modelterrain.jpg",
data_image = "image_from_camera.jpg",
binary_or_color = "color",
iterations = 15,
connectivity_threshold = 5,
calculation_image_size = 100,
display_step = 1,
)
icp.icp()
icp.display_results()
icp.display_results_as_movie()
where the parameter calculation_image_size controls the size of the image
that will actually be used for ICP calculations. Color (and grayscale)
images as output by sensors can be very large and it would be impractical
to process all of the pixel data in the images. This module first
smoothly reduces the size of the images so that the maximum dimension
does not exceed calculation_image_size and then carries out ICP processing
on the reduced-size images. The parameter connectivity_threshold
controls how many of the edge pixels will be chosen for processing.
In the current implementation, an edge pixel is chosen if it has at
least connectivity_threshold number of other edge pixels in its 5x5
neighborhood.
Should you choose to use the "color" option for processing images
that are essentially binary shape images, make sure you use a small
value such as 2 for the connectivity_threshold.
If you do not care as to which image is used as a model and which as
the data, you may get superior results if you choose the image with
fewer pixels as the data image. For color and grayscale images, that
would be the image which yields a smaller set of edge pixels that
are supposed to be used for ICP processing. To let the system swap
the model and the data images, you would need to turn on the
auto_select_model_and_data option, as in
import ICP
icp = ICP(
model_image = "modelterrain.jpg",
data_image = "image_from_camera.jpg",
binary_or_color = "color",
auto_select_model_and_data = 1,
iterations = 40,
connectivity_threshold = 5,
calculation_image_size = 100,
display_step = 1,
)
icp.icp()
icp.display_results()
icp.display_results_as_movie()
The module also includes a static method gendata() to illustrate how
one can create simple synthetic images to experiment with this ICP module.
A call to this method looks like
import ICP
ICP.ICP.gendata( "triangle", (80,80), (10,10), 30, "newtriangle2.jpg" )
As currently programmed, the gendata() method constructs images with
triangles and straight lines segments.
CONSTRUCTOR PARAMETERS:
auto_select_model_and_data: When set to 1, the system decides as to which
of the two images you supplied to the constructor will
actually be used as model and which as data. (DEFAULTS
to 0)
binary_or_color: Set to 'binary' for binary images and to 'color'
for grayscale and color images (REQUIRED)
calculation_image_size: The size to which a large image will be reduced
for ICP processing (DEFAULTS to 100 for large
images)
connectivity_threshold: This parameter decides as to which edge pixels in a
grayscale or a color image will participate in the ICP
calculations. For an edge pixel to be accepted, it
must have at least connectivity_threshold number of
other edge pixels in a 5x5 neighborhood. (DEFAULTS to 5)
data_image: The name of the data file (REQUIRED)
debug: When set to 1, the module prints out additional
information useful for debugging purposes. (DEFAULTS
to 0)
display_step: Controls how often the images at the end of successive
iterations will be shown in the final display of the
results. For example, when set to 5, every fifth image
will be included for the final display. (DEFAULTS to
5 when the number of iterations exceed 40 and to 1
otherwise)
error_threshold: When the difference in the registration errors
between two successive iterations is less than
the error_threshold, we stop iterating. (DEFAULTS
to 0.0)
iterations: The maximum number of iterations to try (DEFAULTS
to 10)
model_image: The name of the model file (REQUIRED)
save_output_images: When set to 1, the module will save the images produced
at the end of each iteration in the current directory.
Such images are named "__resultX.jpg" for integer values
of X that start from 0. (DEFAULTS to 0)
METHODS:
(1) You must call the method icp() for the basic ICP calculations:
import ICP
icp = ICP.ICP( ...constructor options ...)
icp.icp()
(2) The results in the mosaic mode are displayed by
icp.display_results()
(3) The movie version of the results that better shows the convergence
of the registration between the model and the data images is seen
by calling
icp.display_results_as_movie()
HOW THE RESULTS ARE DISPLAYED:
The results are displayed using Tkinter graphics (meaning,
actually, Tk graphics via the Tkinter interface). The registration
results can be displayed in two different modes: a mosaic mode and
a "movie" mode. In the former, the various iterations of the data
image as it is subject to pose transforms to bring it into
registration with the model image are shown in separate frames. In
the movie mode, the model image and the various iterations of the
data image are shown superimposed. It is easier to see
registration convergence in the movie mode.
In the mosaic mode, for binary images, the first row shows the
model image followed by the data image. Subsequent rows display
the data image being translated and rotated as it comes into
congruence with the model image. Again in the mosaic mode, for
color (and grayscale) images, the first row displays six images,
three for the model and three for the data. In each set of three
images, the first image is the actual photograph, the second the
edges in the photo, and the third the pixels actually used for ICP
calculations. Subsequent rows display the data image being brought
into registration with the model image through successive
iterations. Note that whereas the ICP calculations are carried out
on only the sparse set of pixels shown as the third image in the
model and the data sequences in the first row, the computed
transformations are applied to the edge images shown as the second
image for each.
The display of results requires access to the "times.ttf" font file for
the labels that are needed in the display. The module assumes that
this file can be located through the paths stored in 'sys.path', or
via the paths available through your environment variables, or at the
following path:
/usr/share/fonts/truetype/msttcorefonts/times.ttf
which is typical for a Ubuntu install. If that is not the case, you would
need to edit the 11th line of the code shown for display_results() method
and enter the pathname to where "times.ttf" can be found.
THE EXAMPLES DIRECTORY:
The Examples subdirectory of the installation shows the following
two scripts:
binary_image_registration.py
color_image_registration.py
It is highly recommended that you play with these examples before
modifying or extending the code in this module. There are four
pairs (model and data) of images in this directory that you call
the above scripts on.
CAVEATS:
Within the limitations caused by the fact that ICP calculations can get
trapped in a local minimum, you can expect to get good results on binary
images. As for color images, what results you get would depend greatly
on how complex the images are, the quality of edge detection, and how
the final selection of pixels for ICP computation is carried out. The
implementation shown in this module uses very rudimentary edge
detection and even more rudimentary final pixel selection. (That is
because the focus of this module is on ICP and not on digital image
processing.) To improve the results on color (and grayscale) images,
you would need to replace the pixel extraction functions with your own
implementations. Another option would be to carry out pixel extraction
separately and to then feed the pixels thus selected as binary images
to this module.
When processing color (and grayscale) images, if you are free to
decide as to which image to call model and which to call data,
choose the one that results in fewer pixels for ICP processing as
the data image. You are likely to get better results if you do so.
If you turn the auto_select_model_and_data option on in the call to
the constructor, the system will make that decision for you
automatically.
INSTALLATION:
The ICP class was packaged using Distutils. For installation,
execute the following command-line in the source directory (this is
the directory that contains the setup.py file after you have
downloaded and uncompressed the package):
python setup.py install
You have to have root privileges for this to work. On Linux
distributions, this will install the module file at a location that
looks like
/usr/lib/python2.6/site-packages/
If you do not have root access, you have the option of working
directly off the directory in which you downloaded the software by
simply placing the following statements at the top of your scripts
that use the ICP class:
import sys
sys.path.append( "pathname_to_ICP_directory" )
To uninstall the module, simply delete the source directory, locate
where ICP was installed with "locate ICP" and delete those files.
As mentioned above, the full pathname to the installed version is
likely to look like /usr/lib/python2.6/site-packages/ICP*
If you want to carry out a non-standard install of the ICP module,
look up the on-line information on Disutils by pointing your
browser to
http://docs.python.org/dist/dist.html
THEORETICAL BASIS:
The first ICP algorithm was proposed by Paul Besl and Neil McKay in
a now celebrated paper that appeared in 1992 in IEEE Transactions
on PAMI. Since then various versions of the algorithm have been
published by other folks for either speeding up the performance of
the algorithm or for improving its accuracy.
The algorithm implemented here is as simple as it can be. We model
the relationship between model and data as
R x_d + T = x_m
where x_d denote the data points and x_m the model points, each a
two-element vector. R is a 2x2 rotation matrix and T a 2-element
translation vector. Since two planar figures may not be related by
the above transformation (even when one figure "appears" to be a
rotated version of the other figure) for an arbitrary location of
the origin, move the origin to the mean point of the model by
x_m = x_m - mean( x_m )
x_d = x_d - mean( x_m )
With regard to the calculation of R and T, let's express the data
points and the CORRESPONDING model points as
[x_d1, x_d2, ....., x_dn]
and their CORRESPONDING model points
[x_m1, x_m2, ....., x_mn]
We can now write the following for estimating the rotation matrix:
R . A = B
where
A = [x_d1, x_d2, ....., x_dn]
B = [x_m1 - T, x_m2 - T, ....., x_mn - T]
Both A and B are 2xn matrices. We can now write
R . A . A^t = B . A^t
where A^t is the transpose of A. So, we have the following as a
least mean squares estimate for R:
R = B . A^t . ( A . A^t )^-1
Since such an R may not obey the strict properties that must apply
to a rotation matrix (it must be orthonormal), we condition it by
first subjecting it to a singular value decomposition:
U.S.V^t = svd( R )
and then writing the following for a better estimate for R:
R = U . V^t
We can now estimate T by
T = mean( x_m ) - mean( R x_d )
The above assumes that we are carrying out an one-shot calculation
of R and T. But ICP is iterative. The following applies to an
iterative implementation of the above:
For an iterative assessment of R and T, let's assume that we can
decompose R as
R = \deltaR . R_0
where we have previously calculated R_0 and now we wish to refine
the estimate with the calculation of \deltaR. Plugging the above
in the previous formulation:
\deltaR . R_0 . A . A^t = B . A^t
implying
\deltaR . R_0 = B . A^t . ( A . A^t )^-1
which is to say:
\deltaR = B . A^t . ( A . A^t )^-1 . R_0^t
After you have calculated \deltaR, you can update the rotation
matrix by
R = \deltaR . R_0
At the end of each such update of the rotation matrix, the
calculation of the translation vector remains the same as before
T = mean( x_m ) - mean( R . x_d )
We can therefore write down the following steps for ICP
computation:
STEP 1:
x_m = x_m - mean(x_m)
x_d = x_d - mean(x_m)
STEP 2:
Initialize R and T:
R = 1 0
0 1
T = 0
0
old_error = inf
STEP 3: error = (1/N) \sum dist( x_m - (R * x_d + T) )
STEP 4: diff_error = abs(old_error - error)
if (diff_error > threshold):
old_error = error
else:
break
STEP 5: for each x_d find its closest x_m by finding that x_m which
minimizes the squared difference between the two sides
of
R . x_d + T = x_m
STEP 6: A = [x_d1, x_d2, ....., x_dn]
B = [x_m1 - T, x_m2 - T, ....., x_mn - T]
Note that A will remain the same for ICP iterations,
but B can change with each iteration depending on the
what corresponding pixels are found for each data
pixel.
STEP 7: AATI = A^t inverse(A * A^t)
STEP 8: R_update = B * AATI * R.T
STEP 9: U,S,VT = svd(R_update)
deter = determinant(U * VT)
U[0,1] = U[0,1] * determinant
U[1,1] = U[1,1] * determinant
R_update = U * VT
STEP 10: R = R_update * R
STEP 11: T = mean(x_m) - mean( R * x_d )
STEP 12: Back to Step 3
ACKNOWLEDGMENTS:
The author would like to thank Chad Aeschliman and Kyuseo Han for
their considerable help with the debugging of the code presented
here.
ABOUT THE AUTHOR:
Avi Kak is in the last stages of finishing up his multi-year
three-volume Objects Trilogy project. See his web page at Purdue
for what this project is all about.
Imported Modules | ||||||
|
Classes | ||||||||
|
Data | ||
__author__ = 'Avinash Kak (kak@purdue.edu)' __copyright__ = '(C) 2011 Avinash Kak. Python Software Foundation.' __date__ = '2011-September-12' __url__ = 'https://engineering.purdue.edu/kak/distICP/ICP-1.2.html' __version__ = '1.2' |
Author | ||
Avinash Kak (kak@purdue.edu) |