Linking crystallographic model and data quality

P Andrew Karplus et al. Science. 2012.

Abstract

In macromolecular x-ray crystallography, refinement R values measure the agreement between observed and calculated data. Analogously, R(merge) values reporting on the agreement between multiple measurements of a given reflection are used to assess data quality. Here, we show that despite their widespread use, R(merge) values are poorly suited for determining the high-resolution limit and that current standard protocols discard much useful data. We introduce a statistic that estimates the correlation of an observed data set with the underlying (not measurable) true signal; this quantity, CC*, provides a single statistically valid guide for deciding which data are useful. CC* also can be used to assess model and data quality on the same scale, and this reveals when data quality is limiting model improvement.

Figures

Figure 1

Higher-resolution data, even if weak, improve refinement behaviour. For each incremental step of resolution from X→Y (top legend), the pair of bars gives the changes in overall Rwork (blue) and Rfree (red) for the model refined at resolution Y relative to the model refined at resolution X, with both R values calculated at resolution X. The first pair of bars shows that, upon isotropic refinement, Rwork and Rfree dropped by 0.38% and 0.34%, respectively, when the refinement resolution limit was extended from 2.0 to 1.9 Å; the other pairs of bars show the improvement upon anisotropic refinement.

Figure 2

Data quality R values behave differently from those from crystallographic refinement, and useful data extend well beyond what standard cutoff criteria would suggest. Rmeas (squares) and Rpim (circles) are compared with Rwork (blue) and Rfree (red) from 1.42 Å resolution refinements against the EXP dataset. ⟨Ī/σ(Ī)⟩ (grey X) is also plotted. Inset is a close-up of the plot beyond 2 Å resolution.
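For readers unfamiliar with the merging statistics named in this caption, the following is a minimal sketch of how Rmerge, Rmeas, and Rpim are conventionally defined from repeated measurements of each reflection. The function name `merging_r_values` and the dictionary input layout are assumptions made for illustration; real data-processing programs also apply weighting and outlier rejection that are omitted here.

```python
import math

def merging_r_values(observations):
    """Return (Rmerge, Rmeas, Rpim) for a set of repeated measurements.

    observations: dict mapping a reflection id to a list of measured
    intensities for that unique reflection (hypothetical input layout).
    """
    num_merge = num_meas = num_pim = denom = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # a single measurement says nothing about agreement
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev
        # Rmeas corrects Rmerge for its dependence on multiplicity n.
        num_meas += math.sqrt(n / (n - 1)) * abs_dev
        # Rpim estimates the precision of the merged (averaged) intensity.
        num_pim += math.sqrt(1 / (n - 1)) * abs_dev
        denom += sum(intensities)
    return num_merge / denom, num_meas / denom, num_pim / denom

# Toy data: two unique reflections measured two and three times.
r_merge, r_meas, r_pim = merging_r_values({"a": [10, 12], "b": [20, 20, 23]})
```

Because all three statistics divide a sum of absolute deviations by a sum of intensities, they grow without bound as the signal fades, which is one reason they behave differently from refinement R values at high resolution.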

Figure 3

Signal as a function of resolution as measured by correlation coefficients. Plotted as a function of resolution for the EXP data are CC1/2 (diamonds) and the CC for a comparison with the 3ELN reference dataset (triangles). ⟨Ī/σ(Ī)⟩ (grey) is also shown. All determined CC1/2 values shown have expected standard errors of <0.025 (21, 22).
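CC1/2 is commonly obtained by randomly splitting the measurements of each unique reflection into two halves, averaging each half, and correlating the two sets of averages. The sketch below illustrates that idea under simple assumptions: the names `pearson` and `cc_half`, the dictionary input, and the fixed random seed are hypothetical, and no resolution binning or weighting is applied.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either half has zero variance.
    return cov / math.sqrt(var_x * var_y)

def cc_half(observations, seed=0):
    """Correlate the mean intensities of two random half-datasets.

    observations: dict mapping a reflection id to a list of measured
    intensities (hypothetical input layout).
    """
    rng = random.Random(seed)
    half_a, half_b = [], []
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # cannot be split into two halves
        shuffled = list(intensities)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half_a, half_b)

# Noise-free toy data: both halves agree exactly.
perfect = cc_half({1: [10, 10], 2: [20, 20], 3: [35, 35]})
```

In practice the calculation is repeated within resolution shells, so that the decay of CC1/2 toward zero marks where the measured signal disappears.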

Figure 4

The CC1/2 / CC* relationship and the utility of comparing CC* with CCwork and CCfree from a refined model. (A) Plotted is the analytical relationship (eqn. 3) between CC1/2 and CC* (black curve). Also roughly following the CC* curve are the CC values for the EXP data compared with 3ELN (triangles) as a function of CC1/2. (B) Plotted as a function of resolution are CC* (black solid) for the EXP dataset as well as CCwork (blue dashed) and CCfree (red dashed) calculated on intensities from the 1.42 Å refined model. Also shown are CCwork (blue dotted) and CCfree (red dotted) between the 1.42 Å refined model and the 3ELN dataset.
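The analytical relationship between CC1/2 and CC* is CC* = sqrt(2·CC1/2 / (1 + CC1/2)). A minimal sketch of that conversion follows; the function name `cc_star` is hypothetical, and returning 0.0 for non-positive CC1/2 (which can occur by chance in pure-noise shells) is an assumption of this sketch rather than part of the definition.

```python
import math

def cc_star(cc_half):
    """Estimate CC* from CC1/2 using CC* = sqrt(2*CC1/2 / (1 + CC1/2))."""
    if cc_half > 1.0:
        raise ValueError("CC1/2 cannot exceed 1")
    if cc_half <= 0.0:
        # Assumption: treat non-positive CC1/2 as no detectable signal.
        return 0.0
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))

# CC* is always at least as large as CC1/2 on the interval [0, 1].
for value in (1.0, 0.5, 0.1):
    print(f"CC1/2 = {value:.2f} -> CC* = {cc_star(value):.3f}")
```

Since CC* is expressed on the same scale as CCwork and CCfree, a CCwork that exceeds CC* in a resolution shell suggests overfitting, while a CCfree well below CC* suggests the model does not yet account for all the signal in the data.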

References

    1. Wilson AJC. Largest likely values for the reliability index. Acta Cryst. 1950;3:397–398.
    2. Brünger ATB. Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature. 1992;355:472.
    3. Arndt UW, Crowther RA, Mallett JFW. A computer-linked cathode-ray tube microdensitometer for X-ray crystallography. J. Phys. E: Sci. Instrum. 1968;1:510.
    4. Diederichs K, Karplus PA. Improved R-factors for diffraction data analysis in macromolecular crystallography. Nature Structural Biology. 1997;4:269.
    5. Weiss MS. Global indicators of X-ray data quality. J. Appl. Cryst. 2001;34:130.
