A genetic algorithm for the identification of conformationally invariant regions in protein molecules

Understanding macromolecular function often relies on the comparison of different structural models of a molecule. In such a comparative analysis, identifying the part of the molecule that is conformationally invariant with respect to a set of conformers is a critical step, because the corresponding subset of atoms constitutes the reference for subsequent analysis, for example by least-squares superposition. A method is presented that categorizes atoms in a molecule as either conformationally invariant or flexible by automatic analysis of an ensemble of conformers (e.g. crystal structures from different crystal forms or molecules related by non-crystallographic symmetry). Different levels of coordinate precision, both for different models and for individual atoms, are taken explicitly into account via a modified form of Cruickshank's DPI [Cruickshank (1999), Acta Cryst. D55, 583-601] and are propagated into error-scaled difference distance matrices [Schneider (2000), Acta Cryst. D56, 715-721]. All pairwise error-scaled difference distance matrices are then analysed simultaneously using a genetic algorithm. The algorithm has been tested on several well-known examples and has been found to converge rapidly to reasonable results using a standard set of parameters. In addition to the description of the algorithm, a criterion is suggested for testing the identity of two three-dimensional models within experimental error without any explicit superposition.
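
The two computational ideas in the abstract, error-scaled difference distance matrices and a genetic search over atom subsets, can be illustrated with a small sketch. This is not the published implementation. It assumes that per-atom coordinate uncertainties are already available (the modified DPI is not reproduced here), that errors are isotropic and independent so the variance of a distance difference is the sum of the four atomic variances, and that a subset is scored by its size minus a penalty for atom pairs whose scaled difference exceeds a threshold. The function names, the threshold of 2.0, the penalty weight and the genetic operators are all hypothetical choices made for illustration.

```python
import math
import random


def distance_matrix(coords):
    """All pairwise interatomic distances for one conformer."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def error_scaled_ddm(coords_a, sig_a, coords_b, sig_b):
    """E[i][j] = (d_a[i][j] - d_b[i][j]) / sigma, with sigma from simple
    error propagation (assumption: isotropic, independent atomic errors)."""
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    n = len(coords_a)
    e = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                var = sig_a[i] ** 2 + sig_a[j] ** 2 + sig_b[i] ** 2 + sig_b[j] ** 2
                e[i][j] = (da[i][j] - db[i][j]) / math.sqrt(var)
    return e


def fitness(mask, matrices, threshold=2.0, penalty=2.0):
    """Hypothetical score: subset size minus a penalty per violating pair,
    counted over all error-scaled matrices at once."""
    idx = [i for i, bit in enumerate(mask) if bit]
    violations = sum(
        1
        for e in matrices
        for a, i in enumerate(idx)
        for j in idx[a + 1:]
        if abs(e[i][j]) > threshold
    )
    return len(idx) - penalty * violations


def evolve(matrices, n_atoms, pop_size=40, generations=60, p_mut=0.05, seed=0):
    """Toy genetic algorithm over bit masks (1 = atom treated as invariant)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_atoms)] for _ in range(pop_size)]

    def score(mask):
        return fitness(mask, matrices)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # elitism: carry the two best masks over unchanged
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=score)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)


# Toy data: six atoms on a line; in conformer B the last two are shifted by 3 A.
coords_a = [(0.0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (11.4, 0, 0), (15.2, 0, 0), (19.0, 0, 0)]
coords_b = coords_a[:4] + [(18.2, 0, 0), (22.0, 0, 0)]
sigmas = [0.1] * 6  # assumed coordinate uncertainties in A

e = error_scaled_ddm(coords_a, sigmas, coords_b, sigmas)
best = evolve([e], n_atoms=6)
print(best)
```

With more than two conformers, one matrix would be computed per pair and the whole list passed to `evolve`, which mirrors the simultaneous analysis of all pairwise matrices described above. No superposition is needed at any point, because only intramolecular distances enter the calculation.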