Autofix for backward-fit sidechains: using MolProbity and real-space refinement to put misfits in their place

Jeffrey J Headd et al. J Struct Funct Genomics. 2009 Mar.

Abstract

Misfit sidechains in protein crystal structures are a stumbling block in using those structures to direct further scientific inference. Problems due to surface disorder and poor electron density are very difficult to address, but a large class of systematic errors is quite common even in well-ordered regions, resulting in sidechains fit backwards into local density in predictable ways. The MolProbity web site is effective at diagnosing such errors, and can perform reliable automated correction of a few special cases such as 180° flips of Asn or Gln sidechain amides, using all-atom contacts and H-bond networks. However, most at-risk residues involve tetrahedral geometry, and their valid correction requires rigorous evaluation of sidechain movement and sometimes backbone shift. The current work extends the benefits of robust automated correction to more sidechain types. The Autofix method identifies candidate systematic, flipped-over errors in Leu, Thr, Val, and Arg using MolProbity quality statistics, proposes a corrected position using real-space refinement with rotamer selection in Coot, and accepts or rejects the correction based on improvement in MolProbity criteria and on chi angle change. Criteria are chosen conservatively, after examining many individual results, to ensure valid correction. To test this method, Autofix was run on 945 representative PDB files and on the 50S ribosomal subunit of file 1YHQ, and the results were analyzed. Over 40% of Leu, Val, and Thr outliers and 15% of Arg outliers were successfully corrected, resulting in a total of 3,679 corrected sidechains, or 4 per structure on average.

Summary Sentences: A common class of misfit sidechains in protein crystal structures is due to systematic errors that place the sidechain backwards into the local electron density. A fully automated method called "Autofix" identifies such errors for Leu, Val, Thr, and Arg and corrects over one third of them, using MolProbity validation criteria and Coot real-space refinement of rotamers.
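
The identify, propose, and accept-or-reject flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `SidechainState` record, the clash count as a stand-in for "MolProbity criteria", and the `min_chi_change` threshold are all hypothetical, and only the <1% rotamer-score cutoff for outliers is taken from the figure captions. In practice the "after" state would come from a real-space refinement step in Coot, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class SidechainState:
    """Hypothetical record of validation values for one sidechain."""
    rotamer_score: float  # rotamer score in percent (outlier if < 1%)
    clash_count: int      # number of serious steric clashes (assumed criterion)
    chis: tuple           # chi angles in degrees


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def is_candidate(state, outlier_cutoff=1.0):
    """A residue is a candidate when its rotamer score is below the cutoff."""
    return state.rotamer_score < outlier_cutoff


def accept_correction(before, after, outlier_cutoff=1.0, min_chi_change=90.0):
    """Accept only if validation improves and some chi angle moved substantially.

    min_chi_change is an illustrative value, not one reported in the source.
    """
    no_longer_outlier = after.rotamer_score >= outlier_cutoff
    no_new_clashes = after.clash_count <= before.clash_count
    largest_change = max(
        angle_diff(a, b) for a, b in zip(before.chis, after.chis)
    )
    return no_longer_outlier and no_new_clashes and largest_change >= min_chi_change
```

Requiring a large chi change in addition to improved scores reflects the idea that a backward-fit sidechain is corrected by a flip rather than by a small adjustment; the conservative direction of each test (reject when in doubt) mirrors the abstract's statement that criteria were chosen conservatively.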

Figures

Fig. 1

Example Autofix correction of a Leu decoy rotamer from the 945-file dataset: Leu D 427 from 1A0E (Thermotoga neapolitana xylose isomerase) at 2.7 Å resolution. a (original) Leu D 427 in its deposited conformation, which is a rotamer outlier with an eclipsed χ angle and a clash with Leu D 430. b (both) Overlay, in stereo, of proposed corrected Leu rotamer (green) over the deposited conformation (pink). c (fixed) Corrected Leu D 427, in a favored mt rotamer. The clash with Leu D 430 has been alleviated and the bond angle idealized, with a somewhat better fit to the density. Images in Figs. 1, 2 and 4 were generated using KING [3]

Fig. 2

Example Autofix correction from the 50S ribosome: a Thr rotamer outlier, from protein L18e in the 1YHQ archaeal large ribosomal subunit (2.4 Å) [15], before and after correction. a (original) Thr O 3 in its deposited orientation, with fairly good fit to the density, but a serious clash with RNA backbone (Thr methyl to G 0 656 H5′), no H-bond, and a rotamer outlier. b (both) Overlay, in stereo, of proposed corrected Thr rotamer (green) over the original position (pink). c (fixed) Corrected Thr O 3, with equivalent fit to the density, no steric clashes, an excellent p rotamer, and now a strong H-bond from Thr OG1 to the 2′OH of G 0 655. C atoms are gray or black balls; O atoms are larger red balls. Steric clashes are shown as clusters of hot pink spikes, H-bonds as lenses of pale green dots

Fig. 3

Summary of Autofix results on 1YHQ 50S ribosomal subunit. Bar chart summary of correction results on Leu, Thr, Val, and Arg residues in 1YHQ. Gray bars represent the total number of each residue type in the file. Red represents the number of candidate outliers (<1% rotamer score). Blue represents the number of successfully corrected residues of each type: 7 Leu, 8 Val, 8 Thr, and 7 Arg, which are 63, 57, 67, and 25% of the outliers, respectively

Fig. 4

Before and after χ1–χ2 plots of the 2,037 accepted Leu corrections, for those identified as rotamer outliers (<1%) in our 945-file dataset and successfully corrected by Autofix. Contours are taken from the Top500 Leu set [1], with decoys removed; black lines are the 1% contours and gray lines are the 10% contours of rotamer score. a Before: χ1–χ2 plot for the original conformation of each corrected Leu outlier (thus outside the 1% contours). b After: χ1–χ2 plot of the final χ values for each successfully corrected Leu outlier (now inside the contours). Data points are color-coded by which rotamer they ended up in after correction: mt green, tp blue, tt red, mp brown, pp purple, tm yellow, mm hot pink, and pt orange. Note that for most rotamers, the corrected examples came from a well-defined decoy cluster approximately 180° distant
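
The two-letter rotamer names in this caption (mt, tp, tt, and so on) follow the convention in which p, t, and m denote a χ angle near +60°, 180°, and −60° respectively. A minimal sketch of that naming for a two-χ sidechain such as Leu is given below. It bins each angle to the nearest idealized value, which is a simplification: the actual rotamer library uses observed cluster centers and score contours rather than ideal staggered angles, and the function names here are hypothetical.

```python
def wrap(angle):
    """Map an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def chi_letter(chi):
    """Return 'p', 't', or 'm' for the nearest idealized staggered angle."""
    ideals = {"p": 60.0, "t": 180.0, "m": -60.0}
    return min(ideals, key=lambda k: abs(wrap(chi - ideals[k])))


def rotamer_name(chi1, chi2):
    """Two-letter name for a sidechain described by chi1 and chi2."""
    return chi_letter(chi1) + chi_letter(chi2)
```

For example, `rotamer_name(-65.0, 175.0)` yields `"mt"`. Because χ angles are periodic, differences are wrapped before comparison, so that 175° and −175° are treated as 10° apart rather than 350°.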

Fig. 5

Fig. 5

Summary of real-space correlation coefficients (RSCC) for corrected outlier residues before (gray) and after (black) Autofix, showing improvement for all 4 amino acid types. Median RSCC values are indicated by a vertical line. The box around the median spans the 25th to the 75th percentile. Whiskers end at the 1st or 99th percentile
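
The box-plot convention in this caption (median line, box from the 25th to the 75th percentile, whiskers at the 1st and 99th percentiles) can be computed as in the sketch below. It uses only the Python standard library; the function name and the choice of the "inclusive" interpolation method are assumptions, since the source does not say how the percentiles were calculated.

```python
import statistics


def box_summary(values):
    """Return the five percentiles used for the box plot described above.

    statistics.quantiles with n=100 returns 99 cut points, so the
    k-th percentile is at index k - 1.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {
        "whisker_low": cuts[0],    # 1st percentile
        "box_low": cuts[24],       # 25th percentile
        "median": cuts[49],        # 50th percentile
        "box_high": cuts[74],      # 75th percentile
        "whisker_high": cuts[98],  # 99th percentile
    }
```

Calling `box_summary` once on the RSCC values before correction and once on the values after correction, per residue type, would give the numbers needed to draw paired gray and black boxes.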

References

    1. Lovell SC, Word JM, Richardson JS, Richardson DC (2000) Proteins 40:389–408. doi:10.1002/1097-0134(20000815)40:3<389::AID-PROT50>3.0.CO;2-2
    2. Word JM, Lovell SC, Richardson JS, Richardson DC (1999) J Mol Biol 285:1735–1747. doi:10.1006/jmbi.1998.2401
    3. Davis IW, Leaver-Fay A, Chen VB, Block JN, Kapral GJ, Wang W, Murray LW, Arendall WB III, Snoeyink J, Richardson JS, Richardson DC (2007) Nucleic Acids Res 35:W375–W383. doi:10.1093/nar/gkm216
    4. Arendall WB III, Tempel W, Richardson JS, Zhou W, Wang S, Davis IW, Liu Z-J, Rose JP, Carson WM, Luo M, Richardson DC, Wang B-C (2005) J Struct Funct Genomics 6:1–11. doi:10.1007/s10969-005-3138-4
    5. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) Nucleic Acids Res 28:235–242. doi:10.1093/nar/28.1.235
