Improved prediction of protein side-chain conformations with SCWRL4

Georgii G Krivov et al. Proteins. 2009 Dec.

Abstract

Determination of side-chain conformations is an important step in protein structure prediction and protein design. Many such methods have been presented, although only a small number are in widespread use. SCWRL is one such method, and the SCWRL3 program (2003) has remained popular because of its speed, accuracy, and ease of use for the purpose of homology modeling. However, higher accuracy at comparable speed is desirable. This has been achieved in a new program, SCWRL4, through: (1) a new backbone-dependent rotamer library based on kernel density estimates; (2) averaging over samples of conformations about the positions in the rotamer library; (3) a fast anisotropic hydrogen bonding function; (4) a short-range, soft van der Waals atom-atom interaction potential; (5) fast collision detection using k-discrete oriented polytopes; (6) a tree decomposition algorithm to solve the combinatorial problem; and (7) optimization of all parameters by determining the interaction graph within the crystal environment using symmetry operators of the crystallographic space group. Accuracies as a function of electron density of the side chains demonstrate that side chains with higher electron density are easier to predict than those with low electron density and presumed conformational disorder. For a testing set of 379 proteins, 86% of χ1 angles and 75% of χ1+2 angles are predicted correctly within 40° of the X-ray positions. Among side chains with higher electron density (25-100th percentile), these numbers rise to 89% and 80%, respectively. The new program maintains its simple command-line interface, designed for homology modeling, and is now available as a dynamically linked library for incorporation into other software programs.
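The accuracy criterion quoted above (a side chain counts as correct when its dihedral angles lie within 40° of the X-ray values) can be sketched as follows. This is a minimal illustration, not the evaluation code used by the authors: the function names are hypothetical, the threshold is assumed to be inclusive, and the symmetry of some side chains (for example, ring flips) is ignored.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (periodic in 360)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_correct(pred, ref, depth, tol=40.0):
    """True if chi1..chi<depth> are ALL within tol degrees of the reference."""
    return all(angle_diff(p, r) <= tol for p, r in zip(pred[:depth], ref[:depth]))


def chi_accuracy(pairs, depth, tol=40.0):
    """Fraction of side chains correct up to chi<depth>.

    pairs: list of (predicted_chis, reference_chis), each a list of degrees.
    Side chains with fewer than `depth` chi angles are skipped.
    """
    usable = [(p, r) for p, r in pairs if len(p) >= depth and len(r) >= depth]
    if not usable:
        return float("nan")
    return sum(chi_correct(p, r, depth, tol) for p, r in usable) / len(usable)


# Hypothetical data: (predicted, reference) chi angles for three side chains.
pairs = [
    ([60.0, 180.0], [65.0, 170.0]),
    ([-60.0, 90.0], [-70.0, 180.0]),
    ([180.0], [-175.0]),
]
print(chi_accuracy(pairs, 1), chi_accuracy(pairs, 2))
```

Under this reading, a χ1+2 figure counts a side chain only if both χ1 and χ2 are within tolerance, which is why it is never higher than the χ1 figure on the same residues.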

2009 Wiley-Liss, Inc.

Figures

Figure 1. Steps in SCWRL4 side-chain conformation prediction

Figure 2. _k_-Discrete Oriented Polytopes (kDOPs)

Left: examples of kDOPs in the plane (_k_=2,3,4) and in three dimensions (_k_=3,4). Right: Overlap test for kDOP A (black) and kDOP B (gray). The objects enclosed within the kDOPs may clash if one of the conditions shown is satisfied.
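The overlap test can be sketched with the standard interval formulation for this kind of bounding volume: each kDOP stores, for a fixed set of directions, the minimum and maximum projection of the enclosed points, and two kDOPs built on the same directions are disjoint as soon as their intervals are separated along any one direction. The code below is a minimal sketch under that assumption; the function names and the example directions are hypothetical and are not taken from SCWRL4.

```python
def dot(p, d):
    """Dot product of two equal-length coordinate tuples."""
    return sum(pi * di for pi, di in zip(p, d))


def make_kdop(points, directions):
    """Return one (min, max) projection interval per direction."""
    return [
        (min(dot(p, d) for p in points), max(dot(p, d) for p in points))
        for d in directions
    ]


def kdops_overlap(a, b):
    """True if the intervals overlap along every direction.

    A False result proves the enclosed objects cannot clash; a True result
    only means a clash is possible and a finer test is needed.
    """
    return all(
        lo_a <= hi_b and lo_b <= hi_a
        for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b)
    )


# Two triangles in the plane (hypothetical coordinates).
tri_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
tri_b = [(0.9, 0.9), (2.0, 0.9), (0.9, 2.0)]

axes = [(1.0, 0.0), (0.0, 1.0)]      # k = 2: axis-aligned box
axes_diag = axes + [(1.0, 1.0)]      # k = 3: one extra diagonal direction

print(kdops_overlap(make_kdop(tri_a, axes), make_kdop(tri_b, axes)))
print(kdops_overlap(make_kdop(tri_a, axes_diag), make_kdop(tri_b, axes_diag)))
```

With only the two coordinate axes the bounding boxes of the triangles overlap, so the pair would be passed on for a detailed check; adding the diagonal direction separates them, which illustrates why a larger _k_ gives a tighter bound at the cost of more interval comparisons.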

Figure 3. SCWRL4 van der Waals potential

The van der Waals potential used in SCWRL4 is shown (solid line) with a standard Lennard-Jones 6–12 potential (dotted line) with Eij = 1.
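For reference, the standard Lennard-Jones 6–12 potential used as the comparison curve can be written so that its minimum is −Eij at the optimal separation. The sketch below implements that form and, purely for illustration, one generic way to make a potential "soft" and "short-range": capping the repulsive wall and truncating at a cutoff distance. The capped variant, its name, and its parameter values are assumptions made here; this is not the SCWRL4 functional form, which the caption does not give.

```python
def lennard_jones(r, r_min, e_ij=1.0):
    """Standard 6-12 potential with minimum -e_ij at r == r_min."""
    x = r_min / r
    return e_ij * (x ** 12 - 2.0 * x ** 6)


def soft_vdw(r, r_min, e_ij=1.0, cap=10.0, cutoff=8.0):
    """Illustrative softened potential (NOT the SCWRL4 form).

    - repulsion is capped at `cap` so small overlaps are not infinitely penalized
    - the interaction is set to zero beyond `cutoff` (short-range)
    """
    if r >= cutoff:
        return 0.0
    if r <= 0.0:
        return cap
    return min(lennard_jones(r, r_min, e_ij), cap)


# Compare the two curves at a few separations (hypothetical r_min = 4.0).
for r in (2.0, 3.5, 4.0, 6.0, 9.0):
    print(r, lennard_jones(r, 4.0), soft_vdw(r, 4.0))
```

A softened repulsive wall matters for rotamer-based prediction because rotamers sample conformations discretely, so near-correct rotamers often show small atomic overlaps that a hard 6–12 wall would penalize excessively.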

Figure 4. Hydrogen bond potential

Interaction of hydrogen bond acceptor O and hydrogen bond donor D. Unit vector n⃗ is the vector from atom O to atom H. Unit vectors _e⃗_1 and _e⃗_2 are placed from atom O along each lone pair of electrons. Unit vector _e⃗_0 connects the hydrogen bond donor D to the hydrogen atom. α = cos⁻¹(−n⃗·_e⃗_1) is the angle between the D-H bond and the H…O vector, and β = cos⁻¹(n⃗·_e⃗_0) is the angle between the O-lone pair and the O…H vector.

Figure 5. Tree decomposition as generalization of biconnected component decomposition

At left, the graph used in the SCWRL3 paper is shown along with its biconnected component decomposition. At right, a tree decomposition of the same graph is shown. Residues in blue and green illustrate conditions 2 and 3 of a tree decomposition being satisfied. The relevant conditions are shown below the tree decomposition. At each node of the tree, those residues that are members of set L are shown to the left of the vertical bar, and those of set R are shown to the right. Set L consists of those residues shared with the parent of each node, and R the remaining residues of the node.
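The conditions referred to in the caption are not reproduced in the text here. The sketch below checks the three conditions of the standard graph-theoretic definition of a tree decomposition, assuming the caption's numbering follows the usual order, and then splits each node's residues into the sets L (shared with the parent) and R (the rest) as the caption describes. Function names and the small example graph are hypothetical.

```python
from collections import deque


def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three standard conditions; `tree_edges` is assumed to form a tree."""
    # Condition 1: every graph vertex appears in at least one bag.
    if not set(vertices) <= set().union(*bags.values()):
        return False
    # Condition 2: both endpoints of every graph edge share some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: the bags containing any given vertex form a connected subtree.
    adj = {n: set() for n in bags}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)
    for v in vertices:
        nodes = {n for n, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, queue = {start}, deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur] & nodes - seen:
                seen.add(nxt)
                queue.append(nxt)
        if seen != nodes:
            return False
    return True


def split_by_parent(bags, parent):
    """For each tree node return (L, R): residues shared with the parent, and the rest."""
    result = {}
    for node, bag in bags.items():
        shared = bag & bags[parent[node]] if parent[node] is not None else set()
        result[node] = (shared, bag - shared)
    return result


# Hypothetical interaction graph: edge a-b plus triangle b-c-d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
bags = {0: {"a", "b"}, 1: {"b", "c", "d"}}
print(is_tree_decomposition(vertices, edges, bags, [(0, 1)]))
print(split_by_parent(bags, {0: None, 1: 0}))
```

The L/R split is what makes a bottom-up dynamic program possible: at each node, the residues in R can be optimized for every assignment of the residues in L, and only that table is passed to the parent.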

Figure 6. Effect of SCWRL4 features on differences in SCWRL3 and SCWRL4 accuracy

The accuracy shown is average absolute accuracy of the training set, which covers all side-chain dihedral angles (see text). “Atomic radii” = use of optimized radii; “Interpolation” = interpolation of rotamer library probabilities and dihedral angles; “Local BB” = adding interaction between side chain and atoms N and HN of residue _i_−1 and atoms C and O of residue _i_+1, previously neglected in SCWRL3; “P=98%” = reading in top 98% of probability from rotamers sorted in descending order of frequency (90% in SCWRL3); “H-bonds” = new hydrogen bond potential; “New RL” = new rotamer library; “FRM” = flexible rotamer model; “Tuning of parameters” = tuning of FRM parameters and rotamer library weights.
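The “P=98%” setting can be illustrated with a short sketch: rotamers are sorted by descending probability and read in until their cumulative probability reaches the cutoff. The function name and the example probabilities are hypothetical, and whether the rotamer that crosses the threshold is itself included is an assumption made here.

```python
def select_rotamers(rotamers, p_cut=0.98):
    """Keep the most probable rotamers until cumulative probability reaches p_cut.

    rotamers: list of (label, probability) for one residue and backbone conformation.
    The rotamer that crosses the threshold is kept (an assumption of this sketch).
    """
    ordered = sorted(rotamers, key=lambda r: r[1], reverse=True)
    kept, total = [], 0.0
    for label, p in ordered:
        if total >= p_cut:
            break
        kept.append((label, p))
        total += p
    return kept


# Hypothetical rotamer probabilities for one residue.
library = [("r3", 0.15), ("r1", 0.50), ("r5", 0.01), ("r2", 0.30), ("r4", 0.04)]
print([label for label, _ in select_rotamers(library, 0.90)])  # SCWRL3-style cutoff
print([label for label, _ in select_rotamers(library, 0.98)])  # SCWRL4-style cutoff
```

Raising the cutoff from 90% to 98% admits rarer rotamers into the search, which enlarges the combinatorial problem but lets low-probability conformations be chosen when the environment favors them.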

Figure 7. Improvement in accuracy due to inclusion of crystal neighbors

The accuracy figures shown reflect the differences in average absolute accuracy for the testing set as described in the text.

Figure 8. Accuracy vs. relative surface area

Accuracy of SCWRL4 predictions is shown as a function of side-chain relative accessible surface area (RSA), calculated with kernel density estimates (see Methods): χ1 (black), χ1+2 (red), χ1+2+3 (orange), χ1+2+3+4 (blue) within 40°. The data points for 0% RSA were calculated separately from the kernel density estimates. The magenta curves are the probability density estimates of all side chains of each type in the crystal.

Figure 9. Accuracy vs. percentile of electron density

Accuracy of SCWRL4 predictions, calculated with kernel density estimates (see Methods), is shown as a function of the electron density percentile determined for each residue type. Curves for χ1 (black), χ1+2 (red), χ1+2+3 (orange), χ1+2+3+4 (blue) within 40° are shown.
