Improved prediction of protein side-chain conformations with SCWRL4
Georgii G Krivov et al. Proteins. 2009 Dec.
Abstract
Determination of side-chain conformations is an important step in protein structure prediction and protein design. Many methods for this task have been presented, although only a small number are in widespread use. SCWRL is one such method, and the SCWRL3 program (2003) has remained popular because of its speed, accuracy, and ease of use for homology modeling. However, higher accuracy at comparable speed is desirable. This has been achieved in a new program, SCWRL4, through: (1) a new backbone-dependent rotamer library based on kernel density estimates; (2) averaging over samples of conformations about the positions in the rotamer library; (3) a fast anisotropic hydrogen bonding function; (4) a short-range, soft van der Waals atom-atom interaction potential; (5) fast collision detection using k-discrete oriented polytopes; (6) a tree decomposition algorithm to solve the combinatorial problem; and (7) optimization of all parameters by determining the interaction graph within the crystal environment using symmetry operators of the crystallographic space group. Accuracies as a function of electron density of the side chains demonstrate that side chains with higher electron density are easier to predict than those with low electron density and presumed conformational disorder. For a testing set of 379 proteins, 86% of χ1 angles and 75% of χ1+2 angles are predicted correctly within 40° of the X-ray positions. Among side chains with higher electron density (25-100th percentile), these numbers rise to 89% and 80%. The new program maintains its simple command-line interface, designed for homology modeling, and is now available as a dynamic-linked library for incorporation into other software programs.
© 2009 Wiley-Liss, Inc.
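The accuracy criterion in the abstract (a χ angle counts as correct when it lies within 40° of the X-ray value, and χ1+2 requires both angles to be correct) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the list-of-lists input layout, and the inclusive `<=` comparison are assumptions, and the sketch ignores the symmetry of side chains such as Phe or Asp, where a 180° flip of the terminal χ angle is equivalent.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, reference, tol=40.0, depth=1):
    """Fraction of residues whose first `depth` chi angles are all within `tol` degrees.

    `predicted` and `reference` are parallel lists; each item is the list of
    chi angles (degrees) of one residue. Residues with fewer than `depth`
    chi angles are skipped (assumed convention for this sketch).
    """
    hits = total = 0
    for pred, ref in zip(predicted, reference):
        if len(pred) < depth or len(ref) < depth:
            continue
        total += 1
        if all(angle_diff(p, r) <= tol for p, r in zip(pred[:depth], ref[:depth])):
            hits += 1
    return hits / total if total else float("nan")


# Hypothetical toy data: three residues, the last one has only chi1.
predicted = [[-60.0, 180.0], [60.0, 90.0], [175.0]]
reference = [[-65.0, 170.0], [-60.0, 95.0], [-178.0]]
chi1 = chi_accuracy(predicted, reference, depth=1)
chi12 = chi_accuracy(predicted, reference, depth=2)
```

The wrap-around in `angle_diff` matters because dihedral angles are periodic: 175° and −178° differ by 7°, not 353°.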
Figures
Figure 1. Steps in SCWRL4 side-chain conformation prediction
Figure 2. _k_-Discrete Oriented Polytopes (kDOPs)
Left: examples of kDOPs in the plane (_k_=2,3,4) and in three dimensions (_k_=3,4). Right: Overlap test for kDOP A (black) and kDOP B (gray). The objects enclosed within the kDOPs may clash if one of the conditions shown is satisfied.
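The idea behind a kDOP overlap test can be sketched in the plane. A kDOP is stored as one interval per fixed direction; two kDOPs are disjoint as soon as their intervals are separated along any one direction, so the enclosed objects can clash only if the intervals overlap along every direction. The sketch below is a generic illustration and does not reproduce the conditions drawn in the figure; the function names and the choice of evenly spaced directions are assumptions, and `k` counts directions, following the caption's usage.

```python
import math


def unit_directions_2d(k):
    """k unit directions evenly spaced over a half-turn (assumed choice)."""
    return [(math.cos(math.pi * i / k), math.sin(math.pi * i / k)) for i in range(k)]


def build_kdop(points, directions):
    """Return one (lo, hi) interval per direction: the extent of the projected points."""
    slabs = []
    for dx, dy in directions:
        proj = [x * dx + y * dy for x, y in points]
        slabs.append((min(proj), max(proj)))
    return slabs


def kdops_overlap(a, b):
    """True if the intervals overlap along every direction (objects may clash)."""
    return all(lo_a <= hi_b and lo_b <= hi_a
               for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b))


# Hypothetical point sets standing in for two groups of atoms.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
seg = [(0.9, 0.9), (2.0, 2.0)]

dirs2 = unit_directions_2d(2)   # axis-aligned box
dirs4 = unit_directions_2d(4)   # adds the two diagonals
box_says_overlap = kdops_overlap(build_kdop(tri, dirs2), build_kdop(seg, dirs2))
kdop4_says_overlap = kdops_overlap(build_kdop(tri, dirs4), build_kdop(seg, dirs4))
```

In this toy case the two bounding boxes overlap, but the diagonal direction added at `k = 4` separates the shapes, which shows how more directions give a tighter bound at the cost of more interval comparisons.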
Figure 3. SCWRL4 van der Waals potential
The van der Waals potential used in SCWRL4 (solid line) is shown alongside a standard Lennard-Jones 6–12 potential (dotted line), with Eij = 1.
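For reference, the Lennard-Jones 6–12 curve in the comparison can be written in the common form E(d) = Eij[(R/d)^12 − 2(R/d)^6], which has its minimum of −Eij at d = R. The sketch below evaluates that form and, purely to illustrate what a "soft" potential means, a capped variant whose repulsive wall cannot exceed a fixed value. The SCWRL4 functional form is not given in this section, so `capped_lj`, its `cap` value, and all parameter names are assumptions, not the SCWRL4 potential.

```python
def lennard_jones(d, r_min=1.0, e_ij=1.0):
    """Standard 6-12 potential with minimum -e_ij at distance r_min."""
    s6 = (r_min / d) ** 6
    return e_ij * (s6 * s6 - 2.0 * s6)


def capped_lj(d, r_min=1.0, e_ij=1.0, cap=10.0):
    """Illustrative soft variant: same curve, but repulsion is limited to `cap`.

    This is a generic stand-in for a soft potential, not the SCWRL4 function.
    """
    return min(lennard_jones(d, r_min, e_ij), cap)


# At half the optimal distance the hard wall is 3968 * e_ij; the capped form stays at 10.
hard = lennard_jones(0.5)
soft = capped_lj(0.5)
```

A bounded repulsive term keeps small steric overlaps, which are common when side chains are restricted to discrete rotamers, from dominating the total energy.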
Figure 4. Hydrogen bond potential
Interaction of hydrogen bond acceptor O and hydrogen bond donor D. Unit vector n⃗ points from atom O to atom H. Unit vectors _e⃗_1 and _e⃗_2 point from atom O along each lone pair of electrons. Unit vector _e⃗_0 points from the hydrogen bond donor D to the hydrogen atom. α = cos⁻¹(−n⃗·_e⃗_0) is the angle between the D–H bond and the H…O vector, and β = cos⁻¹(n⃗·_e⃗_1) is the angle between the O lone pair and the O…H vector.
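The two angles can be computed from atomic coordinates by following the verbal definitions in the caption: α is measured between the D–H bond direction and the H…O direction, and β between a lone-pair direction on O and the O…H direction. The sketch below is a minimal geometric illustration under those definitions; the function names and the plain-tuple coordinate layout are assumptions, and the lone-pair direction is passed in rather than derived from the acceptor's bonding geometry.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def _angle_deg(u, v):
    """Angle between two unit vectors, with the dot product clamped to [-1, 1]."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def hbond_angles(donor, hydrogen, acceptor, lone_pair_dir):
    """Return (alpha, beta) in degrees for one donor/acceptor/lone-pair arrangement."""
    n = _unit(_sub(hydrogen, acceptor))      # unit vector from O to H
    e0 = _unit(_sub(hydrogen, donor))        # unit vector from D to H
    e1 = _unit(lone_pair_dir)                # unit vector from O along a lone pair
    alpha = _angle_deg(tuple(-c for c in n), e0)   # D-H bond vs. H...O vector
    beta = _angle_deg(n, e1)                       # lone pair vs. O...H vector
    return alpha, beta


# Hypothetical linear arrangement D-H...O with the lone pair pointing at H.
ideal = hbond_angles((3.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
# Same atoms, lone pair tilted 45 degrees away from the O...H line.
tilted = hbond_angles((3.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

With these conventions both angles are 0° for a perfectly aligned hydrogen bond and grow as the geometry degrades, which is what makes them convenient inputs to an anisotropic scoring term.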
Figure 5. Tree decomposition as generalization of biconnected component decomposition
At left, the graph used in the SCWRL3 paper is shown along with its biconnected component decomposition. At right, a tree decomposition of the same graph is shown. Residues in blue and green illustrate conditions 2 and 3 of a tree decomposition being satisfied. The relevant conditions are shown below the tree decomposition. At each node of the tree, those residues that are members of set L are shown to the left of the vertical bar, and those of set R are shown to the right. Set L consists of those residues shared with the parent of each node, and R the remaining residues of the node.
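The caption refers to numbered conditions that are only shown in the figure. In the standard textbook definition, a tree decomposition must satisfy three conditions: every vertex appears in some tree node (bag), both endpoints of every edge appear together in some bag, and the bags containing any given vertex form a connected subtree. The sketch below checks those three conditions and computes the L and R sets described in the caption. The function names and data layout are assumptions, the numbering may not match the figure's, and the code assumes `tree_edges` already form a tree.

```python
def is_tree_decomposition(graph_edges, bags, tree_edges):
    """Check the three standard tree-decomposition conditions.

    graph_edges: iterable of (u, v) pairs; bags: dict node -> set of vertices;
    tree_edges: iterable of (node, node) pairs, assumed to form a tree.
    """
    vertices = {v for edge in graph_edges for v in edge}
    # Every vertex is covered by at least one bag.
    if not all(any(v in bag for bag in bags.values()) for v in vertices):
        return False
    # Every edge has both endpoints inside a single bag.
    if not all(any(u in bag and v in bag for bag in bags.values())
               for u, v in graph_edges):
        return False
    # The bags containing a given vertex form a connected subtree.
    adj = {node: set() for node in bags}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)
    for v in vertices:
        nodes = {node for node, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            cur = stack.pop()
            for nxt in adj[cur]:
                if nxt in nodes and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if seen != nodes:
            return False
    return True


def split_l_r(bags, parent):
    """For each node return (L, R): L is shared with the parent, R is the rest."""
    result = {}
    for node, bag in bags.items():
        p = parent.get(node)
        left = bag & bags[p] if p is not None else set()
        result[node] = (left, bag - left)
    return result


# Hypothetical residue graph: a triangle a-b-c with a pendant residue d on c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
bags = {0: {"a", "b", "c"}, 1: {"c", "d"}}
valid = is_tree_decomposition(edges, bags, [(0, 1)])
lr = split_l_r(bags, {0: None, 1: 0})
```

Splitting each bag into L and R is what allows a bottom-up pass over the tree: for every assignment of the L residues, the best assignment of the R residues can be tabulated and handed to the parent node.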
Figure 6. Effect of SCWRL4 features on differences in SCWRL3 and SCWRL4 accuracy
The accuracy shown is the average absolute accuracy of the training set, which covers all side-chain dihedral angles (see text). “Atomic radii” = use of optimized radii; “Interpolation” = interpolation of rotamer library probabilities and dihedral angles; “Local BB” = adding interaction between side chain and atoms N, HN of residue _i_−1 and C, O of residue _i_+1, previously neglected in SCWRL3; “P=98%” = reading in top 98% of probability from rotamers sorted in descending order of frequency (90% in SCWRL3); “H-bonds” = new hydrogen bond potential; “New RL” = new rotamer library; “FRM” = flexible rotamer model; “Tuning of parameters” = tuning of FRM parameters and rotamer library weights.
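The “P=98%” setting can be read as a cumulative-probability cutoff: rotamers are sorted by decreasing probability and read in until the running total reaches the threshold. The sketch below illustrates that reading. The function name, the `(name, probability)` tuple layout, and the choice to include the rotamer that crosses the threshold are assumptions, not details confirmed by the source.

```python
def top_probability_rotamers(rotamers, threshold=0.98):
    """Keep the most probable rotamers until their cumulative probability reaches `threshold`.

    rotamers: list of (name, probability) pairs for one residue and backbone
    conformation. The rotamer that crosses the threshold is kept (assumed).
    """
    ranked = sorted(rotamers, key=lambda r: r[1], reverse=True)
    kept, cumulative = [], 0.0
    for name, prob in ranked:
        if cumulative >= threshold:
            break
        kept.append((name, prob))
        cumulative += prob
    return kept


# Hypothetical rotamer probabilities for a single residue.
library = [("r3", 0.15), ("r1", 0.50), ("r5", 0.01), ("r2", 0.30), ("r4", 0.04)]
kept_98 = [name for name, _ in top_probability_rotamers(library, 0.98)]
kept_90 = [name for name, _ in top_probability_rotamers(library, 0.90)]
```

Raising the cutoff from 90% to 98% admits rarer rotamers, which enlarges the search space but allows less common conformations to be predicted.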
Figure 7. Improvement in accuracy due to inclusion of crystal neighbors
The accuracy figures shown reflect the differences in average absolute accuracy for the testing set as described in the text.
Figure 8. Accuracy vs. relative surface area
Accuracy of SCWRL4 predictions is shown as a function of side-chain relative accessible surface area, calculated with kernel density estimates (see Methods): χ1 (black), χ1+2 (red), χ1+2+3 (orange), χ1+2+3+4 (blue) within 40°. The data points for 0% RSA were calculated separately from the kernel density estimates. The magenta curves are the probability density estimates of all side chains of each type in the crystal.
Figure 9. Accuracy vs. percentile of electron density
Accuracy of SCWRL4 predictions, calculated with kernel density estimates (see Methods), is shown as a function of the electron density percentile determined for each residue type. Curves for χ1 (black), χ1+2 (red), χ1+2+3 (orange), χ1+2+3+4 (blue) within 40° are shown.
Similar articles
- A graph-theory algorithm for rapid protein side-chain prediction.
Canutescu AA, Shelenkov AA, Dunbrack RL Jr. Protein Sci. 2003 Sep;12(9):2001-14. doi: 10.1110/ps.03154503. PMID: 12930999. Free PMC article.
- SIDEpro: a novel machine learning approach for the fast and accurate prediction of side-chain conformations.
Nagata K, Randall A, Baldi P. Proteins. 2012 Jan;80(1):142-53. doi: 10.1002/prot.23170. Epub 2011 Nov 9. PMID: 22072531. Free PMC article.
- Improved side-chain prediction accuracy using an ab initio potential energy function and a very large rotamer library.
Peterson RW, Dutton PL, Wand AJ. Protein Sci. 2004 Mar;13(3):735-51. doi: 10.1110/ps.03250104. PMID: 14978310. Free PMC article.
- Extending the accuracy limits of prediction for side-chain conformations.
Xiang Z, Honig B. J Mol Biol. 2001 Aug 10;311(2):421-30. doi: 10.1006/jmbi.2001.4865. PMID: 11478870.
- SCWRL and MolIDE: computer programs for side-chain conformation prediction and homology modeling.
Wang Q, Canutescu AA, Dunbrack RL Jr. Nat Protoc. 2008;3(12):1832-47. doi: 10.1038/nprot.2008.184. PMID: 18989261. Free PMC article.
Cited by
- Asymmetric fluctuation of overlapping dinucleosome studied by cryoelectron microscopy and small-angle X-ray scattering.
Shimizu M, Tanaka H, Nishimura M, Sato N, Nozawa K, Ehara H, Sekine SI, Morishima K, Inoue R, Takizawa Y, Kurumizaka H, Sugiyama M. PNAS Nexus. 2024 Oct 27;3(11):pgae484. doi: 10.1093/pnasnexus/pgae484. eCollection 2024 Nov. PMID: 39539301. Free PMC article.
- Approaching Pharmacological Space: Events and Components.
Vistoli G, Talarico C, Vittorio S, Lunghini F, Mazzolari A, Beccari A, Pedretti A. Methods Mol Biol. 2025;2834:151-169. doi: 10.1007/978-1-0716-4003-6_7. PMID: 39312164.
- Deep generative models of protein structure uncover distant relationships across a continuous fold space.
Draizen EJ, Veretnik S, Mura C, Bourne PE. Nat Commun. 2024 Sep 16;15(1):8094. doi: 10.1038/s41467-024-52020-2. PMID: 39294145. Free PMC article.
- Sequence-based engineering of pH-sensitive antibodies for tumor targeting or endosomal recycling applications.
Wei W, Sulea T. MAbs. 2024 Jan-Dec;16(1):2404064. doi: 10.1080/19420862.2024.2404064. Epub 2024 Sep 17. PMID: 39289783. Free PMC article.
Grants and funding
- R01 HG002302-05/HG/NHGRI NIH HHS/United States
- R01 GM84453/GM/NIGMS NIH HHS/United States
- R01 GM084453-07/GM/NIGMS NIH HHS/United States
- R01 HG02302/HG/NHGRI NIH HHS/United States
- P20 GM076222-03S1/GM/NIGMS NIH HHS/United States
- P20 GM076222/GM/NIGMS NIH HHS/United States
- R01 GM084453/GM/NIGMS NIH HHS/United States
- R01 HG002302/HG/NHGRI NIH HHS/United States
- P20 GM76222/GM/NIGMS NIH HHS/United States