Assessing and maximizing data quality in macromolecular crystallography - PubMed

Review

Assessing and maximizing data quality in macromolecular crystallography

P Andrew Karplus et al. Curr Opin Struct Biol. 2015 Oct.

Abstract

The quality of macromolecular crystal structures depends, in part, on the quality and quantity of the data used to produce them. Here, we review recent shifts in our understanding of how to use data quality indicators to select a high resolution cutoff that leads to the best model, and of the potential to greatly increase data quality through the merging of multiple measurements from multiple passes of single crystals or from multiple crystals. Key factors supporting this shift are the introduction of more robust correlation coefficient based indicators of the precision of merged data sets as well as the recognition of the substantial useful information present in extensive amounts of data once considered too weak to be of value.

Copyright © 2015 Elsevier Ltd. All rights reserved.

Figures

Figure 1

Averaging multiple measurements can substantially enhance data quality. (a) CCanom is plotted as a function of resolution for a data set of 1080 1° images in a sulfur-SAD phasing case study [23]. Statistics for data merged from 30 (blue), 120 (cyan), 360 (green), 720 (orange), and 1080 (red) images are shown. Based on 30 images (3.5-fold multiplicity), there is no apparent anomalous signal beyond 4 Å, but with 720 images (75-fold multiplicity) the apparent signal extends beyond 3 Å resolution. The inset shows that the quality of the anomalous difference map (maximal _ρ_rms) increases substantially and then, as radiation damage systematically alters the structure, decreases even while CCanom stays high. (b–d) Behavior of CC1/2, _R_merge, and 〈_I/σ_〉mrgd as a function of resolution for individual crystals (breadth of values indicated by cyan swaths) and for a set of data merged from 18 crystals (red traces) and successfully used for sulfur-SAD phasing and refinement at 2.9 Å resolution [11••]. Insets show close-ups of the low or high resolution regions. According to the authors, the best individual crystal would only have been useful to ca. 3.2 Å resolution, and according to the panel (c) inset, the averaged data would have been truncated near 3.8 Å based on an _R_merge ~ 60% cutoff criterion.
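The effect of multiplicity on a correlation-coefficient-based indicator can be illustrated with a small simulation. The sketch below is not the procedure used by any data reduction program. It assumes Gaussian measurement noise, exponentially distributed true intensities, and hypothetical names (`cc_half`, `simulate`). It splits the repeated measurements of each reflection into two random halves, averages each half, and correlates the two sets of averages, which is the idea behind CC1/2.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations, rng):
    """CC1/2-style statistic.

    observations: one list of repeated intensity measurements per unique
    reflection. Each list is split at random into two halves, each half is
    averaged, and the two sets of averages are correlated.
    """
    half_a, half_b = [], []
    for obs in observations:
        if len(obs) < 2:
            continue  # a single measurement cannot be split
        shuffled = obs[:]
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half_a, half_b)


def simulate(multiplicity, n_reflections=2000, sigma=5.0, seed=0):
    """Simulate weak data: noise sigma is 5x the mean true intensity (assumed)."""
    rng = random.Random(seed)
    observations = []
    for _ in range(n_reflections):
        true_intensity = rng.expovariate(1.0)  # mean 1, illustrative only
        observations.append(
            [true_intensity + rng.gauss(0.0, sigma) for _ in range(multiplicity)]
        )
    return cc_half(observations, rng)


if __name__ == "__main__":
    for m in (4, 16, 64):
        print(f"multiplicity {m:3d}: CC1/2 = {simulate(m):.2f}")
```

With these assumed noise levels, the statistic is close to zero at low multiplicity and rises steadily as more measurements are averaged, qualitatively mirroring the trend in panel (a).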

Figure 2

Examples of tangible electron density map improvement enabled by extending resolution cutoffs. (a) Comparison of the 2Fo–Fc electron density (contoured at 1 _ρ_rms) for a region of the prokaryotic sodium channel pore using an 〈_I/σ_〉mrgd ~ 2 cutoff (_R_pim = 47%, 〈_I/σ_〉mrgd = 1.9, CC1/2 = 0.78) of 4.0 Å resolution (upper panel) versus a more generous CC1/2 ~ 0.1 based cutoff (_R_pim = 213%, 〈_I/σ_〉mrgd = 0.3, CC1/2 = 0.14) of 3.46 Å resolution (lower panel). The 4 Å resolution cutoff was already somewhat generous, as the _R_pim of 47% with a multiplicity of 12 would be expected to correspond to an _R_meas value of above 150% (47% × √12). Used with permission from Figure S1 of [48•]. (b) Comparison of the 2Fo–Fc electron density (contoured at 1 _ρ_rms) for a region of the E. coli YfbU protein using for the phase extension a fairly conventional cutoff (_R_meas = 77%, 〈_I/σ_〉mrgd = 3.5, CC1/2 = 0.85) of 3.1 Å resolution (upper panel) versus a more generous 〈_I/σ_〉mrgd ~ 0.5 or CC1/2 ~ 0.1 cutoff (_R_meas = 302%, 〈_I/σ_〉mrgd = 0.5, CC1/2 = 0.14) of 2.5 Å resolution (lower panel). The additional weak data did not just extend the resolution of the map, but improved the quality of the phases obtained at 3.1 Å resolution. Images used with permission from the International Union of Crystallography from Figure 3 of [40••] (http://dx.doi.org/10.1107/S1399004714005318).
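The conversion quoted in panel (a) follows from the definitions of the two indicators: for a reflection measured n times, _R_meas weights the deviations by √(n/(n−1)) and _R_pim by √(1/(n−1)), so the two differ by a factor of √n when every reflection has the same multiplicity. The sketch below works through the caption's numbers. The function name is hypothetical, and uniform multiplicity is an assumption that real data sets only approximate.

```python
import math


def r_meas_from_r_pim(r_pim, multiplicity):
    """Estimate R_meas from R_pim, assuming every reflection has the same multiplicity."""
    if multiplicity < 2:
        raise ValueError("multiplicity must be at least 2 for these indicators")
    return r_pim * math.sqrt(multiplicity)


if __name__ == "__main__":
    # Values from the caption: R_pim = 47% at a multiplicity of 12.
    estimate = r_meas_from_r_pim(0.47, 12)
    print(f"Estimated R_meas: {estimate:.0%}")
```

The estimate is consistent with the caption's statement that the corresponding _R_meas would exceed 150%.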

References

    1. Krojer T, Pike AC, von Delft F. Squeezing the most from every crystal: the fine details of data collection. Acta Crystallogr D: Biol Crystallogr. 2013;69(Pt 7):1303–1313.
    2. Zeldin OB, Brockhauser S, Bremridge J, Holton JM, Garman EF. Predicting the X-ray lifetime of protein crystals. Proc Natl Acad Sci U S A. 2013;110:20551–20556.
    3. Evans PR. An introduction to data reduction: space-group determination, scaling and intensity statistics. Acta Crystallogr D: Biol Crystallogr. 2011;67(Pt 4):282–292.
    4. Diederichs K. Crystallographic data and model quality. In: Ennifar, editor. Nucleic Acids Crystallography: Methods and Protocols. Vol. 320. Springer Science + Business Media; New York: 2015. (in press) [Provides a rather in-depth discussion of random and systematic errors that impact crystallographic data quality, the distinction between precision and accuracy, and the use of data quality indicators. Then walks readers through some lesser known tools that can be used to troubleshoot and optimize data reduction using XDS.]
    5. Karplus PA, Diederichs K. Linking crystallographic model and data quality. Science. 2012;336:1030–1033. [Introduces paired refinement concept, CC1/2 and CC* indicators. Uses these and difference map analyses to prove that conventional high-resolution cutoff criteria discard useful data. Further shows how data quality R-factors are not comparable to refinement R-factors and that their values should not be used in defining a high-resolution cutoff. That the new indicators are being used and practices are changing is indicated by the over 400 citations already garnered by the work.]
