Multiple subunit fitting into a low-resolution density map of a macromolecular complex using a Gaussian mixture model

Takeshi Kawabata. Biophys J. 2008.

Abstract

Recently, electron microscopy measurement of single particles has enabled us to reconstruct low-resolution 3D density maps of large biomolecular complexes. If structures of the complex subunits can be solved by x-ray crystallography at atomic resolution, fitting these models into the 3D density map can generate an atomic-resolution model of the entire large complex. The fitting of multiple subunits, however, is generally computationally expensive; an efficient algorithm is therefore required. We developed a fast fitting program, "gmfit", which employs a Gaussian mixture model (GMM) to represent approximated shapes of the 3D density map and the atomic models. A GMM is a distribution function formed by adding together several 3D Gaussian density functions. Because our model analytically provides the integral of a product of two distribution functions, it enables us to quickly calculate the fitness of the density map and the atomic models. Using the integral, two types of potential energy function are introduced: the attraction potential energy between a 3D density map and each subunit, and the repulsion potential energy between subunits. A restraint energy for symmetry is also employed to build symmetrical oligomeric complexes. To find the optimal configuration of subunits, we randomly generated initial configurations of subunit models and performed steepest-descent minimization using forces and torques of the three potential energies. Comparison between an original density map and its GMM showed that the number of Gaussian distribution functions required for a given accuracy depended on both resolution and molecular size. We then performed test fitting calculations for simulated low-resolution density maps of atomic models of a homodimer, trimer, and hexamer, using different search parameters. The results indicated that our method was able to rebuild atomic models of a complex even for maps of 30 Å resolution if sufficient numbers (eight or more) of Gaussian distribution functions were employed for each subunit, and the symmetric restraints were assigned for complexes with more than three subunits. As a more realistic test, we tried to build an atomic model of the GroEL/ES complex by fitting 21-subunit atomic models into the 3D density map obtained by cryoelectron microscopy using the C7 symmetric restraints. A model with a low root mean-square deviation (14.7 Å) was obtained as the lowest-energy model, showing that our fitting method was reasonably accurate. Inclusion of other restraints from biological and biochemical experiments could further enhance the accuracy.
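
The analytic overlap integral mentioned in the abstract can be illustrated with a minimal Python sketch. It relies on the standard identity that the integral of the product of two normalized Gaussians with means μ_a, μ_b and covariances Σ_a, Σ_b equals a single Gaussian evaluated at μ_a − μ_b with covariance Σ_a + Σ_b. The function names and the (weight, mean, covariance) tuple layout are assumptions made for illustration; they are not taken from gmfit, and the energy functions of the paper are not reproduced here.

```python
import numpy as np

def gaussian_overlap(mu_a, cov_a, mu_b, cov_b):
    """Integral over all space of the product of two normalized Gaussians.

    Uses the closed form N(mu_a - mu_b; 0, cov_a + cov_b).
    """
    d = np.asarray(mu_a, float) - np.asarray(mu_b, float)
    cov = np.asarray(cov_a, float) + np.asarray(cov_b, float)
    k = d.size  # dimensionality (3 for density maps)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** k * np.linalg.det(cov))
    return float(norm * np.exp(-0.5 * d @ np.linalg.solve(cov, d)))

def gmm_overlap(gmm_a, gmm_b):
    """Overlap integral of two GMMs.

    Each GMM is a list of (weight, mean, covariance) tuples
    (a hypothetical layout chosen for this sketch).
    """
    return sum(
        wa * wb * gaussian_overlap(ma, ca, mb, cb)
        for wa, ma, ca in gmm_a
        for wb, mb, cb in gmm_b
    )

# Two-component GMM compared with itself and with a distant copy.
gmm = [(0.5, [0.0, 0.0, 0.0], np.eye(3)), (0.5, [5.0, 0.0, 0.0], np.eye(3))]
far = [(w, np.add(m, [100.0, 0.0, 0.0]), c) for w, m, c in gmm]
print(gmm_overlap(gmm, gmm), gmm_overlap(gmm, far))
```

In this sketch the cost of one evaluation grows with the product of the numbers of Gaussians in the two models, and not with the number of voxels or atoms, which is consistent with the speed argument made in the abstract.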

Figures

FIGURE 1

Outline of fitting of subunit atomic models into a 3D density map of their complex, using a GMM.

FIGURE 2

Expectation maximization algorithm (EM algorithm) estimates a GMM from observed 3D data points.
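
As an illustration of the idea in this caption, the following is a minimal sketch of EM for a GMM with full covariance matrices, applied to unweighted 3D points. The function name `fit_gmm_em`, the explicit `init_means` argument, the fixed iteration count, and the small regularization term `reg` are all assumptions of this sketch; how gmfit initializes the mixture or treats density values is not stated on this page.

```python
import numpy as np

def fit_gmm_em(points, init_means, n_iter=50, reg=1e-6):
    """Estimate GMM weights, means, and covariances with plain EM.

    points: (n, d) array; init_means: (k, d) starting means (assumed given).
    """
    x = np.asarray(points, float)
    n, d = x.shape
    means = np.array(init_means, float)
    k = means.shape[0]
    weights = np.full(k, 1.0 / k)
    # Start every component from the overall data covariance.
    covs = np.array([np.cov(x.T) + reg * np.eye(d) for _ in range(k)])

    for _ in range(n_iter):
        # E-step: log of weight * Gaussian density for each point/component.
        log_r = np.empty((n, k))
        for j in range(k):
            diff = x - means[j]
            maha = np.einsum("ij,ij->i", diff, np.linalg.solve(covs[j], diff.T).T)
            _, logdet = np.linalg.slogdet(covs[j])
            log_r[:, j] = np.log(weights[j]) - 0.5 * (
                maha + logdet + d * np.log(2.0 * np.pi)
            )
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)    # responsibilities

        # M-step: re-estimate parameters from the responsibilities.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs

# Two synthetic clusters of 3D points (illustrative data only).
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(10.0, 1.0, (200, 3))])
weights, means, covs = fit_gmm_em(pts, [pts[0], pts[-1]])
print(weights, means)
```

The sketch does not guard against a component losing all of its points; a production implementation would need to handle that case.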

FIGURE 3

Configurations (A–C) and corresponding pair tables (D–F) of subunits for typical point symmetry groups C3 (A and D), C4 (B and E), and D2 (C and F). A pair with the same letter code (a, b, or c) in the tables is a corresponding pair. Geometry of one subunit viewed from another subunit is equivalent to that of its corresponding pair.

FIGURE 4

Optimization of position and orientation of subunits (GMMs S1 and S2) to fit them into the fixed 3D density map of their complex (GMM C).

FIGURE 5

Correlation coefficient between the simulated low-resolution density map for the homotrimeric complex structure (PDB code: 1nic) and its GMM. The thick solid line, long-dashed line, thin solid line, and short-dashed line correspond to density maps of 10 Å, 15 Å, 20 Å, and 30 Å resolution, respectively.

FIGURE 6

Simulated low-resolution density maps and GMMs for the homotrimeric complex structure. (A) Atomic model of the complex (PDB code: 1nic). (B–D) Simulated density maps with 30 Å, 20 Å, and 15 Å resolutions, respectively. (E) GMM using three GDFs generated from the 30-Å map (B). (F) GMM using six GDFs generated from the 20-Å map (C). (G) GMM using 11 GDFs generated from the 15-Å map (D). Correlation coefficients for the corresponding density pairs (B and E, C and F, and D and G) are >0.98.

FIGURE 7

Correlation coefficient between the simulated low-resolution density map for the 21-subunit heterocomplex structure (PDB code: 1aon) and its GMM. The thick solid line, long-dashed line, thin solid line, and short-dashed line correspond to density maps of 10 Å, 15 Å, 20 Å, and 30 Å resolution, respectively. The thin dotted line corresponds to the correlation coefficients for the cryo-EM density map of the complex (EMDB code: emd_1046, resolution: 23.5 Å).

FIGURE 8

Simulated low-resolution density maps and GMMs for the 21-subunit heterocomplex structure. (A) Atomic model of the complex (PDB code: 1aon). (B–D) Simulated density maps with 30 Å, 20 Å, and 15 Å resolutions, respectively. (E) GMM using 21 GDFs generated from the 30-Å map (B). (F) GMM using 45 GDFs generated from the 20-Å map (C). (G) GMM using 95 GDFs generated from the 15-Å map (D). Correlation coefficients for the corresponding density pairs (B and E, C and F, and D and G) are >0.98.

FIGURE 9

Fitting models and 3D density maps for the homotrimer (PDB code: 1nic) showing the effect of the number of GDFs representing each subunit. (A) GMM using three GDFs generated from the 20-Å simulated low-resolution density map of the complex. (B) Energy-minimum GMMs using four GDFs for each subunit. (C) Energy-minimum GMMs using eight GDFs for each subunit. (D) Crystal structure for the homotrimer (PDB code: 1nic). (E) Atomic model of the complex structure corresponding to the model using four GDFs for each subunit (B). Its RMSD from the crystal structure (D) was 11.6 Å. (F) Atomic model of the complex structure corresponding to the model using eight GDFs for each subunit (C). Its RMSD from the crystal structure (D) was 3.5 Å. Both energy-minimum structures were generated without the symmetric restraint.

FIGURE 10

Fitting models and 3D density maps for the homohexamer (PDB code: 2rec) showing the effect of the symmetric restraint. (A) GMM using six GDFs generated from the 20-Å simulated low-resolution density map of the complex. (B) Energy-minimum GMMs without using the symmetric restraint. (C) Energy-minimum GMMs using the symmetric restraint. (D) Crystal structure for the homohexamer (PDB code: 2rec). (E) Atomic model of the complex structure corresponding to the model without the symmetric restraint (B). Its RMSD from the crystal structure (D) was 19.7 Å. (F) Atomic model of the complex structure corresponding to the model using the symmetric restraint (C). Its RMSD from the crystal structure (D) was 4.2 Å. Both energy-minimum structures were generated using eight GDFs for each subunit.

FIGURE 11

(A) 3D density map of the complex (EMDB code: emd_1046). (B) Atomic model of the complex (PDB code: 1aon) fitted into the 3D density map. (C) Energy-minimum model obtained by the Gaussian mixture fitting method. Its RMSD from the atomic complex model (B) was 14.7 Å.
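
The RMSD values quoted in these captions compare a fitted model with a reference structure. A minimal sketch of such a calculation is given below, assuming both coordinate sets list the same atoms in the same order and are already in the same reference frame, so that no superposition is performed. Whether the paper's values were computed this way, and over which atoms, is not stated on this page; the function name `rmsd` is illustrative.

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """Root mean-square deviation between two (n_atoms, 3) coordinate sets.

    Assumes matching atom order and a shared frame (no superposition).
    """
    a = np.asarray(coords_a, float)
    b = np.asarray(coords_b, float)
    if a.shape != b.shape:
        raise ValueError("coordinate arrays must have the same shape")
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

# Every atom displaced by the vector (3, 4, 0), i.e. by 5 length units.
reference = np.zeros((4, 3))
model = reference + np.array([3.0, 4.0, 0.0])
print(rmsd(reference, model))  # 5.0
```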

FIGURE 12

Local force F_A(r) and total force F_A for a distribution f_A, arising from the attractive overlap energy, E, of two GMMs, f_A and f_B.
