lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests

Valerio Mariani et al. Bioinformatics. 2013.

Abstract

Motivation: The assessment of protein structure prediction techniques requires objective criteria to measure the similarity between a computational model and the experimentally determined reference structure. Conventional similarity measures based on a global superposition of carbon α atoms are strongly influenced by domain motions and do not assess the accuracy of local atomic details in the model.

Results: The Local Distance Difference Test (lDDT) is a superposition-free score that evaluates local distance differences of all atoms in a model, including validation of stereochemical plausibility. The reference can be a single structure, or an ensemble of equivalent structures. We demonstrate that lDDT is well suited to assess local model quality, even in the presence of domain movements, while maintaining good correlation with global measures. These properties make lDDT a robust tool for the automated assessment of structure prediction servers without manual intervention.
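The core idea can be illustrated with a minimal sketch. It assumes a simplified atom representation (a tuple of residue index, atom name and coordinates) and a hypothetical function name, `lddt_sketch`; neither comes from the published software. The 15 Å inclusion radius is the default quoted in the Fig. 3 caption; the four tolerance thresholds (0.5, 1, 2 and 4 Å) are the defaults described in the full paper rather than in this abstract. Stereochemical checks and the handling of chemically equivalent atoms are deliberately left out.

```python
import math
from itertools import combinations

# Assumed atom layout for this sketch: (residue_index, atom_name, (x, y, z)).

def lddt_sketch(reference, model, inclusion_radius=15.0,
                thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Average, over the thresholds, of the fraction of reference
    distances that are preserved in the model."""
    ref = {(res, name): xyz for res, name, xyz in reference}
    mod = {(res, name): xyz for res, name, xyz in model}

    # Collect reference distances: atom pairs from different residues
    # that lie closer than the inclusion radius in the reference.
    ref_pairs = []
    for a, b in combinations(sorted(ref), 2):
        if a[0] == b[0]:
            continue  # skip pairs within the same residue
        d_ref = math.dist(ref[a], ref[b])
        if d_ref < inclusion_radius:
            ref_pairs.append((a, b, d_ref))
    if not ref_pairs:
        return 0.0

    fractions = []
    for tol in thresholds:
        preserved = 0
        for a, b, d_ref in ref_pairs:
            # A pair with an atom missing from the model is not preserved.
            if a in mod and b in mod:
                if abs(math.dist(mod[a], mod[b]) - d_ref) < tol:
                    preserved += 1
        fractions.append(preserved / len(ref_pairs))
    return sum(fractions) / len(fractions)


reference = [(1, "CA", (0.0, 0.0, 0.0)),
             (2, "CA", (3.0, 0.0, 0.0)),
             (3, "CA", (6.0, 0.0, 0.0))]
# Rigidly translating the model leaves every interatomic distance unchanged.
shifted = [(r, n, (x + 10.0, y - 5.0, z + 2.0)) for r, n, (x, y, z) in reference]
print(lddt_sketch(reference, shifted))
```

Because only interatomic distances are compared, no superposition step is needed: a rigidly translated or rotated copy of the reference scores the same as the reference itself.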

Availability and implementation: Source code, binaries for Linux and MacOSX, and an interactive web server are available at http://swissmodel.expasy.org/lddt.

Contact: torsten.schwede@unibas.ch.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.

Comparison of a predicted protein structure model with its reference structure for CASP target T0542. The target structure (shown in gray) consists of two domains. In (A), a predicted model (TS236, in color) is shown in full length, with the first domain superposed onto the target. For graphical illustration, (B) shows the two domains of the prediction separated according to CASP assessment units (AUs) and superposed individually onto the target structure. In both panels, the model is colored according to full-length lDDT scores along a traffic-light red-yellow-green gradient: red for low lDDT values, yellow for intermediate values and green for high values. As a superposition-free method, lDDT is insensitive to relative domain orientation and correctly identifies segments in the full-length model that deviate from the reference structure.

Fig. 2.

Determination of the optimal inclusion radius parameter Ro. Pearson correlations (R²) of whole-target lDDT scores (solid line) and of domain-based weight-averaged lDDT scores (dashed line) with domain-based weight-averaged GDC-all scores were computed for different values of the inclusion radius parameter Ro, over all CASP9 predictions for multidomain targets.

Fig. 3.

Correlation of whole-structure GDC-all and lDDT scores with domain-based weight-averaged GDC-all scores. For CASP9 predictions of multidomain targets, GDC-all scores (red dots) and lDDT scores (blue dots) were computed against the whole, unsplit target structures. For the lDDT scores, the default inclusion radius of 15 Å was used.

Fig. 4.

Baseline lDDT scores for models with simulated threading errors. lDDT scores of pseudo-models with threading errors are shown for two examples of different CATH Architectures: Alpha Horseshoe (left) and Beta Barrel (right). The lDDT score is plotted as a function of the introduced threading error (top). The histograms (bottom) show the distribution of these ‘baseline’ scores for threading error offsets >15 residues for the two architectures. The structure inlays show an example structure of the respective CATH Architecture. Peaks at large offsets indicate repetitive structural elements with locally correct arrangement.

Fig. 5.

Assessing stereochemical plausibility. This example illustrates the effect of the stereochemical quality checks on the lDDT score for a model (TS276, left, in ribbon representation) of target T0570-D1 with unrealistic stereochemistry (close-up, right). Residues with chemical bonds that are too short (1) or too long (2), as well as those with close atomic interactions (3) or impossible bond angles (4), result in lower scores during the lDDT computation.

Fig. 6.

Comparing a model against an ensemble of reference structures. The experimental reference structure for CASP target T0559 (human protein BC008182, PDB ID: 2L01) is an ensemble of NMR structures. The graph shows the effect of selecting a single structure as reference (GDC-all values as striped bars) in contrast to the multireference lDDT implementation (dotted bars). For this example, each structure within the ensemble was selected in turn as reference and compared with the other members.
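One plausible way to score a model against an ensemble is sketched below. It is an illustration under stated assumptions, not the published implementation: an atom pair is used only if it lies within the inclusion radius in every reference, and its distance in the model counts as preserved if it falls inside the range of distances observed across the ensemble, widened by the tolerance on both sides. The function name `multiref_lddt_sketch`, the atom layout (residue index, atom name, coordinates) and the threshold values are assumptions of this sketch.

```python
import math
from itertools import combinations

def multiref_lddt_sketch(references, model, inclusion_radius=15.0,
                         thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Score a model against several reference structures at once."""
    refs = [{(res, name): xyz for res, name, xyz in r} for r in references]
    mod = {(res, name): xyz for res, name, xyz in model}

    # Only atoms present in every reference can be compared.
    common = set(refs[0]).intersection(*refs[1:])

    pairs = []
    for a, b in combinations(sorted(common), 2):
        if a[0] == b[0]:
            continue  # skip pairs within the same residue
        dists = [math.dist(r[a], r[b]) for r in refs]
        # Keep the pair only if it is within the radius in all references.
        if max(dists) < inclusion_radius:
            pairs.append((a, b, min(dists), max(dists)))
    if not pairs:
        return 0.0

    fractions = []
    for tol in thresholds:
        preserved = 0
        for a, b, d_min, d_max in pairs:
            if a in mod and b in mod:
                d_mod = math.dist(mod[a], mod[b])
                # Preserved if inside the ensemble range widened by tol.
                if d_min - tol < d_mod < d_max + tol:
                    preserved += 1
        fractions.append(preserved / len(pairs))
    return sum(fractions) / len(fractions)


ref_a = [(1, "CA", (0.0, 0.0, 0.0)), (2, "CA", (3.0, 0.0, 0.0))]
ref_b = [(1, "CA", (0.0, 0.0, 0.0)), (2, "CA", (5.0, 0.0, 0.0))]
model = [(1, "CA", (0.0, 0.0, 0.0)), (2, "CA", (4.0, 0.0, 0.0))]
print(multiref_lddt_sketch([ref_a], model))         # single reference
print(multiref_lddt_sketch([ref_a, ref_b], model))  # ensemble of two
```

In this toy example the model distance (4 Å) lies between the two reference distances (3 Å and 5 Å), so the ensemble comparison does not penalize it, whereas comparison against either single reference does.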
