Using least median of squares for structural superposition of flexible proteins

Yu-Shen Liu1    Yi Fang1    Karthik Ramani1,2    

1School of Mechanical Engineering
2School of Electrical and Computer Engineering (by courtesy)
Purdue University, West Lafayette, IN, 47907, USA

Conventional superposition methods use an ordinary least squares (LS) fit to compare two different conformations of the same protein. The main problem is that the LS fit is unstable and sensitive to local displacements. To overcome this problem, we present a new algorithm that superimposes two protein conformations by their atomic coordinates using a technique from robust statistics: least median of squares (LMS). To approximate the LMS optimization efficiently, we use the forward search technique. Our algorithm automatically detects and superimposes the rigid core regions of two conformations with small or large displacements. In contrast, most existing superposition techniques depend strongly on an initial LS estimate over the entire atom sets of the proteins, and they usually fail to superimpose two conformations with large displacements. The proposed algorithm is robust and does not require any prior knowledge of the flexible regions. Our fit tool has produced successful superpositions when applied to proteins for which two conformations are known.
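To make the idea concrete, the following is a minimal sketch of an LMS-style fit approximated by a forward search, not the implementation shipped with our fit tool. It assumes the two conformations are given as matched N x 3 coordinate arrays, and it uses the SVD-based Kabsch solution as the LS fit inside each step. The function names (`kabsch`, `sq_residuals`, `lms_superpose`) and the parameters `n_starts`, `start_size`, and `cutoff` are hypothetical choices made for illustration. Each search starts from a small random subset of atoms, refits, and grows the subset one atom at a time by keeping the atoms with the smallest residuals; the fit with the smallest median squared residual is kept.

```python
import numpy as np

def kabsch(P, Q):
    """Ordinary LS rigid fit: rotation R and translation t with P @ R.T + t ~ Q."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                  # 3x3 covariance of centered coordinates
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0   # avoid a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def sq_residuals(P, Q, R, t):
    """Squared distance of every transformed atom of P to its partner in Q."""
    return np.sum((P @ R.T + t - Q) ** 2, axis=1)

def lms_superpose(P, Q, n_starts=50, start_size=3, cutoff=1.0, seed=0):
    """Approximate the LMS fit by forward search (all parameters are assumptions).

    Returns (R, t, core), where core is a boolean mask of atoms lying within
    `cutoff` (same units as the coordinates) of their partners after the fit.
    """
    rng = np.random.default_rng(seed)
    n = len(P)
    h = n // 2 + 1                             # subset size at which each search stops
    best_med, best_fit = np.inf, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=start_size, replace=False)
        while True:
            R, t = kabsch(P[idx], Q[idx])      # LS fit on the current subset only
            r2 = sq_residuals(P, Q, R, t)      # residuals of ALL atoms under that fit
            if len(idx) >= h:
                break
            idx = np.argsort(r2)[: len(idx) + 1]   # forward step: grow by one atom
        med = np.median(r2)                    # LMS objective for this search
        if med < best_med:
            best_med, best_fit = med, (R, t)
    core = sq_residuals(P, Q, *best_fit) <= cutoff ** 2
    if core.sum() < 3:                         # too few atoms to refit; keep best search fit
        return best_fit[0], best_fit[1], core
    R, t = kabsch(P[core], Q[core])            # final LS refit on the detected core
    return R, t, core

def rot_z(deg):
    """Rotation matrix about the z axis (helper for the synthetic example)."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

# Synthetic example (not real protein data): 60 "atoms", of which the last 20
# undergo an extra rigid motion that mimics a flexible domain.
rng = np.random.default_rng(1)
P = rng.normal(scale=10.0, size=(60, 3))
R_true, t_true = rot_z(30.0), np.array([5.0, -3.0, 2.0])
Q = P @ R_true.T + t_true
Q[40:] = Q[40:] @ rot_z(90.0).T + np.array([30.0, 0.0, 20.0])

R, t, core = lms_superpose(P, Q)
core_rmsd = np.sqrt(sq_residuals(P[:40], Q[:40], R, t).mean())
R_ls, t_ls = kabsch(P, Q)                      # plain LS fit over all atoms, for comparison
ls_core_rmsd = np.sqrt(sq_residuals(P[:40], Q[:40], R_ls, t_ls).mean())
print(f"core atoms found: {core.sum()}  robust core RMSD: {core_rmsd:.2e}  LS core RMSD: {ls_core_rmsd:.2f}")
```

In this sketch the plain LS fit over all atoms is pulled away from the rigid part by the displaced atoms, while the forward search should recover the transform of the unmoved part. The fixed `cutoff` used to label core atoms is a simplification; a scale estimate derived from the median residual would be a natural alternative.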
  1. README.txt
  2. Tutorial.pdf
  5. Database(~64MB)



Multi-level superposition of Topo II: 1bgw (red) and 1bjt (green). (a) Level 1 (Core% = 56.4%). (b) Level 2 (Core% = 22.1%). (c) Level 3 (Core% = 11.7%). (d) Level 4 (Core% = 5.1%). Note that our method captures different rigid domains at successive levels of superposition; at each level, the superimposed rigid domain is outlined with a solid line.



We would like to thank Dr. Talapady Bhat for helpful comments during our work. This material is partly based upon work supported by the National Science Foundation under Grant No. IIS-0535156.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. We also acknowledge partial support from the National Institutes of Health (GM-075004).