Prof. Mireille Boutin

The Background Story

My work on unlabeled distance geometry originates from the period where I was a postdoc at Brown University. At the time, I was working on using invariants to represent objects in order to be able to recognize them using fewer computations than by comparing them directly. I was inspired by discussions with my colleague Senem Velipasalar, who was using statistics of features to represent objects. Thus I became interested in using statistics of invariants to represent objects and began to wonder in what circumstances statistics of invariants could separate the orbits of a group. In other words, under what circumstances could statistics of invariants be a lossless representation of the object modulo the corresponding group action.

I had already done work on the problem of recognizing curves and polygons modulo a group action. To take the next step, I was interested in going from ordered sets of points on a curve to unordered sets of points. I was especially interested in one particular question: is the distribution of the pairwise distances between a set of points in R^n (e.g., in the plane) a lossless representation of the set of points modulo rotation, translation, reflection, and scaling. I discussed this question with various people, including my postdoctoral advisor David Mumford, who told me that the answer was no. Still, I could not shake that question out of my mind, and I continued to try to “make it be true,” somehow. It was in London, Ontario at a conference that I was introduced to Gregor Kemper, who showed interest in my question and brought up the algebra knowledge needed to properly answer it. Thus began a very fruitful collaboration.

The Deterministic Case: Reconstructing the Shape of a Point Configurations from unlabeled pairwise distances

So the conjecture was that one could somehow, at least in theory, reconstruct a point configuration (up to a rigid motion) from the values of all its pairwise distances even if the labeling of the distances was unknown. Of course, I expected the conjecture to be false, but counterexamples were hard to find. With the help of the symbolic computation software MAGMA, Gregor was finally able to find a very nice one, which is given in our first paper. However, we were also able to prove that the conjecture is true for generic cases, as stated in the following theorem.

Theorem (B.-Kemper 2004): There exists a polynomial f: R^k x R^k x ... xR^k (n times) -> R such that if f(p1,p2,...,pn) is not equal to zero, then the point configuration formed by p1,p2,...,pn can be uniquely reconstructed, up to a rigid motion, from the multiset (i.e., bag) of its pairwise distances.

Corollary: Let p1,...pn and q1,...,qn be two sets of n points randomly drawn (following a non-degenerated probability law) in R^k. Then, with probability one, the following statement is true: there exists a rotation, translation, reflection and point relabeling mapping p1,p2,...,pn onto q1,q2,...,qn. if and only if the two sets of points have the same (unlabeled) pairwise distances.

Note that directly comparing the shape of two point configurations would be very computationally expensive. By representing the point configuration as a set of unlabeled distances, the comparison can be achieved in (low) polynomial time. Thus, we have a fast method for comparing the shape of two point configurations. One common approach for increasing the speed of an algorithm is to approximate the solution. In contrast with approximate methods, which are always inaccurate, our solution approach is accurate with probability one.

Our results were extended to other groups. We also derived a polynomial time method to check whether a configuration of point is on the zero set of the polynomial containing all the exceptional cases.

References:

M.Boutin and G. Kemper, On reconstructing n-point configurations from the distribution of distances or areas, Advances in Applied Mathematics, Vol. 32, Issue 4, Mar 2004, Pages 709-735
M.Boutin and G. Kemper, On reconstructing n-point configurations in P2 from a joint distribution of invariants, Applicable Algebra in Engineering, Communication and Computing 15 (6), 361-391

M. Boutin and G. Kenper, Which point configurations are determined by the distribution of their pairwise distances? Int. J. Comput. Geom. Appl. 17, 31 (2007).

The Noisy Case: Reconstructing the Shape of a noisy point configuration from distances samples

The existence of counterexamples, although not problematic from a theoretical perspective, brings up a serious issue in applications: ill-conditioning. More specifically, if two (generic) sets of distances are close to each other, they may correspond to completely different point configurations (e.g., small perturbations of two counterexamples). Thus, more work is needed in order to be able to compare two noisy point configurations in practice.

Our proposed approach is to view the point configuration as a noisy observation of an unknown point configuration, and to model the noise using a pre-determined noise model. For simplicity, we assume that the noise is Gaussian, and so the points observed are viewed as samples from a Gaussian mixture: each Gaussian in the mixture is centered around a true point in the unknown configuration corresponding to its mean. In that case, the question is whether the shape of the Gaussian mixture can be reconstructed from the probability distribution function of the distance between two points drawn independently following that Gaussian mixture model. Our results show that this is possible whenever the mean of the Gaussians in the mixture form a generic point configuration.

References:

H. Santos-Villalobos and M. Boutin, “Computationally efficient method to compare the shape of planar Gaussian mixtures from point samples," Journal of Electronic Imaging 21(2), 023023 (22 June 2012.

M. Boutin, K. Lee and M. Comer, Lossless Shape Representation using Invariant Statistics: the Case of Point-sets, Asilomar Conference, 2006.

M. Boutin and M. Comer, Faithful Shape Representation for 2D Gaussian Mixtures, ICIP 2007.

Extension: from point configurations to weighted graphs

Point configurations modulo rigid motion can be viewed as weighted graphs by associating the points in the configuration to the nodes,and the distance between any two points in the configuration to a weight between the two corresponding nodes in the graph. Thus recognizing a point configuration up to a rigid motion is a special case of weighted graph isomorphism. As we were investigating ways to generalize our results, we asked ourselves: can statistics of invariants be used to recognize weighted graphs op to isomorphism?

Two different types of invariant representations are investigated in this preprint. The first representation, the distribution of triangles, is used for connectivity graphs. The second representation consists of two distributions (weights; sums of any two adjacent weights) and is used to represent graphs whose weights are real-valued. As before, we show that a large number of graphs are reconstructible, up to isomorphism, from these invariant representations. In particular, for the second representation, the set of exceptional graphs has measure zero in the space of all real-valued weighted graphs.

References:

M.Boutin and G. Kemper, Lossless Representation of Graphs using Distribution, arXiv preprint.