A Bayesian view on cryo-EM structure determination

Sjors H W Scheres. J Mol Biol. 2012.

Abstract

Three-dimensional (3D) structure determination by single-particle analysis of cryo-electron microscopy (cryo-EM) images requires many parameters to be determined from extremely noisy data. This makes the method prone to overfitting, that is, when structures describe noise rather than signal, in particular near their resolution limit where noise levels are highest. Cryo-EM structures are typically filtered using ad hoc procedures to prevent overfitting, but the tuning of arbitrary parameters may lead to subjectivity in the results. I describe a Bayesian interpretation of cryo-EM structure determination, where smoothness in the reconstructed density is imposed through a Gaussian prior in the Fourier domain. The statistical framework dictates how data and prior knowledge should be combined, so that the optimal 3D linear filter is obtained without the need for arbitrariness and objective resolution estimates may be obtained. Application to experimental data indicates that the statistical approach yields more reliable structures than existing methods and is capable of detecting smaller classes in data sets that contain multiple different structures.
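The way a Gaussian prior turns into a linear filter can be illustrated for a single Fourier component. This is only a minimal sketch, not the paper's reconstruction formula: it assumes one coefficient observed several times with known noise power, no CTF and no projection geometry, and the names `map_estimate`, `noise_power` and `signal_power` are hypothetical. Under those assumptions the posterior mean is the data average shrunk toward zero, with the amount of shrinkage set by the ratio of noise power to signal power.

```python
def ml_estimate(observations):
    """Plain average: the unregularized (maximum-likelihood) estimate."""
    return sum(observations) / len(observations)


def map_estimate(observations, noise_power, signal_power):
    """Posterior mean of one coefficient under a zero-mean Gaussian prior.

    noise_power  : assumed variance of the noise on each observation
    signal_power : assumed prior variance of the coefficient
    """
    data_term = sum(x / noise_power for x in observations)
    weight_term = len(observations) / noise_power
    # The 1 / signal_power term is the contribution of the prior.
    return data_term / (weight_term + 1.0 / signal_power)


# Three noisy observations of the same (complex) Fourier coefficient.
obs = [1.2 + 0.4j, 0.7 - 0.1j, 1.1 + 0.3j]

# Strong expected signal: the prior barely matters, result is close to the average.
strong_signal = map_estimate(obs, noise_power=1.0, signal_power=100.0)

# Weak expected signal: the estimate is damped toward zero.
weak_signal = map_estimate(obs, noise_power=1.0, signal_power=0.01)
```

When the assumed signal power is large relative to the noise, the estimate approaches the plain average; when it is small, the component is suppressed, which is the behaviour of a low-pass filter applied where the signal-to-noise ratio is poor.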

Copyright © 2011 Elsevier Ltd. All rights reserved.

Figures

Fig. 1

A schematic interpretation of the approach. A structure is iteratively refined through a two-step procedure. The first step, which is called Expectation in mathematical terms, has been labeled “Alignment.” In this step, computer-generated projections of the structure are compared with the experimental images, resulting in information about the relative orientations of the images. Orientations are not assigned in a discrete manner, but probability distributions over all possible assignments [$\Gamma_{i\phi}^{(n)}$] are calculated, and the sharpness of these distributions is determined by the power of the noise in the data. The second step is called Maximization and has been labeled “Smooth reconstruction.” In this step, the experimental images are combined with the prior information into a smooth, 3D reconstruction through Eq. (9), and updated estimates for the power of the noise and the signal in the data are obtained through Eqs. (10) and (11). The relative contributions of the data and the prior to the reconstruction are dictated by Bayes' law and depend on the power of the noise and the power of the signal in the data [see Eq. (9)]. The new structure and the updated estimates for the power of the noise and the signal are then used for a subsequent iteration. Iterations are typically stopped after a user-defined number or when the structures do not change anymore.
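The soft assignment in the Expectation step can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's Eqs. (9) to (11): images and reference projections are short real-valued lists, the noise is white Gaussian with a single power, the prior over orientations is flat, and the function name `orientation_posteriors` is hypothetical. It shows how the noise power controls the sharpness of the distribution over candidate orientations.

```python
import math


def orientation_posteriors(image, references, noise_power):
    """Normalized weights proportional to exp(-|image - ref|^2 / (2 * noise_power)).

    Assumes white Gaussian noise and a flat prior over the candidate orientations.
    """
    log_weights = []
    for ref in references:
        sq_dist = sum((x - r) ** 2 for x, r in zip(image, ref))
        log_weights.append(-sq_dist / (2.0 * noise_power))
    peak = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Three candidate projections (stand-ins for different orientations).
references = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image = [0.9, 0.2, -0.1]

# Low assumed noise power: nearly all weight goes to the best-matching orientation.
sharp = orientation_posteriors(image, references, noise_power=0.05)

# High assumed noise power: the distribution is almost uniform.
flat = orientation_posteriors(image, references, noise_power=50.0)
```

In a full refinement these weights would then be used to accumulate the weighted contributions of each image in the reconstruction step, rather than committing each image to a single orientation.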

Fig. 2

Thermosome test case. (a) Resolution estimates for the MAP (red) and XMIPP (green) refinements. Broken lines indicate resolution estimates as reported by the refinement program. The broken red line indicates the $\mathrm{SSNR}^{\mathrm{MAP}}$ values for the MAP refinement; the broken green line indicates the FSC values as estimated inside XMIPP by splitting the entire data set into two random halves at the final refinement iteration. Continuous lines indicate FSC values between two independently refined reconstructions. Each of these reconstructions was refined against two completely separate random halves of the data. (b) Reconstructed maps from the XMIPP (left) and MAP (right) refinements.
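A Fourier shell correlation (FSC) curve between two reconstructions can be computed along these lines. This is a minimal NumPy sketch, assuming two cubic maps of equal size and shells of one Fourier pixel in width; the function name `fourier_shell_correlation` and the synthetic test maps are illustrative and are not taken from the paper or from XMIPP.

```python
import numpy as np


def fourier_shell_correlation(map_a, map_b):
    """Return the FSC per integer-radius shell for two equally sized cubic maps."""
    fa = np.fft.fftn(map_a)
    fb = np.fft.fftn(map_b)
    n = map_a.shape[0]
    freq = np.fft.fftfreq(n) * n  # integer frequency indices along one axis
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2  # shells up to (but not including) the Nyquist index
    fsc = np.zeros(n_shells)
    for s in range(n_shells):
        mask = shell == s
        cross = np.real(np.sum(fa[mask] * np.conj(fb[mask])))
        norm = np.sqrt(np.sum(np.abs(fa[mask]) ** 2) * np.sum(np.abs(fb[mask]) ** 2))
        fsc[s] = cross / norm if norm > 0 else 0.0
    return fsc


# Synthetic stand-ins for two independently refined half-set reconstructions:
# a shared "signal" volume plus independent noise in each half.
rng = np.random.default_rng(0)
signal = rng.normal(size=(16, 16, 16))
half_1 = signal + 0.5 * rng.normal(size=signal.shape)
half_2 = signal + 0.5 * rng.normal(size=signal.shape)

fsc_identical = fourier_shell_correlation(signal, signal)
fsc_halves = fourier_shell_correlation(half_1, half_2)
```

Because the two synthetic halves share the same signal but carry independent noise, their FSC stays below 1 in every well-populated shell. The caption's point is that such a curve is only trustworthy when the two maps were refined against completely separate halves of the data.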

Fig. 3

GroEL test case. (a) Spherical average of the power of the signal ($\tau_l^2$, in black) and the annular average of the power of the noise in one of the micrographs ($\sigma_{ij}^2$, in red) as estimated by MAP refinement. (b) Resolution estimates for the MAP (red) and XMIPP (green) refinements. Broken lines indicate resolution estimates as reported by the refinement program. The broken red line indicates the $\mathrm{SSNR}^{\mathrm{MAP}}$ values for the MAP refinement; the broken green line indicates the FSC values as estimated inside XMIPP by splitting the entire data set into two random halves at the final refinement iteration. Continuous lines indicate FSC values between the reconstructions and the crystal structure (see also Experimental Procedures). (c) Guinier plots for the atomic model (black) and the sharpened reconstructions from the MAP (red) and XMIPP (green) refinements. Vertical dotted lines indicate the resolution range that was used to estimate the B-factor for sharpening the experimental reconstruction, using the atomic model as a reference. (d) Density maps for the atomic model at 8 Å resolution (top) and the sharpened reconstruction from the MAP (middle) and XMIPP (bottom) refinements.
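Estimating a B-factor from a Guinier plot amounts to fitting a straight line to the logarithm of the amplitude against 1/d² over a chosen resolution range. The sketch below assumes the common convention F(d) = F₀·exp(−B/(4d²)), so that B equals −4 times the fitted slope. It is a simplification: it fits the amplitudes of a single curve, whereas the caption describes using the atomic model as a reference (in which case one might fit the logarithm of the ratio of map to model amplitudes instead). The function name `estimate_b_factor` and the synthetic values are hypothetical.

```python
import math


def estimate_b_factor(resolutions, amplitudes, d_max, d_min):
    """Least-squares Guinier fit restricted to d_min <= d <= d_max (in Angstrom).

    Fits ln(amplitude) against 1/d^2 and returns B = -4 * slope.
    """
    xs, ys = [], []
    for d, amp in zip(resolutions, amplitudes):
        if d_min <= d <= d_max:
            xs.append(1.0 / d**2)
            ys.append(math.log(amp))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -4.0 * slope


# Synthetic, noise-free amplitudes that decay with an assumed B of 200 A^2.
assumed_b = 200.0
resolutions = [20.0, 15.0, 12.0, 10.0, 9.0, 8.0, 7.0]
amplitudes = [5.0 * math.exp(-assumed_b / (4.0 * d**2)) for d in resolutions]

# Fit only between 15 A and 8 A, analogous to the vertical dotted lines in a Guinier plot.
b_estimate = estimate_b_factor(resolutions, amplitudes, d_max=15.0, d_min=8.0)
```

Sharpening would then apply the opposite sign of the fitted decay to the map amplitudes, up to the resolution at which the data are still considered reliable.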

Fig. 4

Ribosome test case. (a) Reconstructed maps for a MAP refinement with K = 4 classes. Density for EF-G is shown in red, 50S subunits are shown in blue, 30S subunits are shown in yellow, and tRNAs are shown in green. The first two maps were interpreted as 70S ribosomes in complex with EF-G, the third map was interpreted as a 70S ribosome without EF-G, and the fourth map was interpreted as a 50S ribosomal subunit. For each class, the numbers of assigned particles that according to supervised classification correspond to ribosomes with EF-G and without EF-G are indicated in red and green, respectively. (The true class assignments are not known.) In addition, the resolution at which $\mathrm{SSNR}^{\mathrm{MAP}}$ drops below 1 is indicated for each class. (b) A representative reference-free class average (left) and two individual, unaligned experimental images (right) for particles assigned to the class corresponding to 70S particles without EF-G. (c) As (b), but for the class corresponding to 50S subunits.
