SnapDRAGON: a method to delineate protein structural domains from sequence data

J Mol Biol. 2002 Feb 22;316(3):839-51.

doi: 10.1006/jmbi.2001.5387.

PMID: 11866536

Richard A George et al. J Mol Biol. 2002.

Abstract

We describe a method to identify protein domain boundaries from sequence information alone based on the assumption that hydrophobic residues cluster together in space. SnapDRAGON is a suite of programs developed to predict domain boundaries based on the consistency observed in a set of alternative ab initio three-dimensional (3D) models generated for a given protein multiple sequence alignment. This is achieved by running a distance geometry-based folding technique in conjunction with a 3D-domain assignment algorithm. The overall accuracy of our method in predicting the number of domains for a non-redundant data set of 414 multiple alignments, representing 185 single and 231 multiple-domain proteins, is 72.4 %. Using domain linker regions observed in the tertiary structures associated with each query alignment as the standard of truth, inter-domain boundary positions are delineated with an accuracy of 63.9 % for proteins comprising continuous domains only, and 35.4 % for proteins with discontinuous domains. Overall, domain boundaries are delineated with an accuracy of 51.8 %. The prediction accuracy values are independent of the pair-wise sequence similarities within each of the alignments. These results demonstrate the capability of our method to delineate domains in protein sequences associated with a wide variety of structural domain organisation.
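The consistency idea in the abstract, where boundaries are called from what a set of alternative models agree on, can be illustrated with a small voting sketch. This is not the SnapDRAGON implementation: the function name `consensus_boundaries`, the `window` tolerance, and the `min_fraction` threshold are all hypothetical choices made for illustration, and the input is assumed to be one list of boundary positions per 3D model, already produced by some domain assignment step.

```python
def consensus_boundaries(model_boundaries, seq_len, window=10, min_fraction=0.5):
    """Return consensus boundary positions (0-based residue indices).

    model_boundaries: one list of boundary positions per alternative model.
    window and min_fraction are illustrative assumptions, not published values.
    """
    n_models = len(model_boundaries)
    if n_models == 0:
        return []

    # support[i] = number of models with a boundary within `window` residues of i
    support = [0] * seq_len
    for boundaries in model_boundaries:
        covered = set()
        for b in boundaries:
            lo = max(0, b - window)
            hi = min(seq_len - 1, b + window)
            covered.update(range(lo, hi + 1))
        for i in covered:  # each model votes at most once per position
            support[i] += 1

    threshold = min_fraction * n_models
    result = []
    i = 0
    while i < seq_len:
        if support[i] >= threshold:
            # extend over the contiguous run of well-supported positions
            j = i
            while j + 1 < seq_len and support[j + 1] >= threshold:
                j += 1
            peak = max(support[i:j + 1])
            peak_positions = [k for k in range(i, j + 1) if support[k] == peak]
            # report the middle of the best-supported stretch
            result.append(peak_positions[len(peak_positions) // 2])
            i = j + 1
        else:
            i += 1
    return result


# Five hypothetical models: three agree on a boundary near residue 50,
# one places a boundary at 120, and one predicts a single domain.
models = [[48], [50], [52], [120], []]
print(consensus_boundaries(models, seq_len=200, window=5))  # prints [50]
```

In this toy case the boundary near residue 50 is supported by three of five models and is kept, while the isolated call at 120 falls below the threshold and is discarded.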

Cited by

A modular kernel approach for integrative analysis of protein domain boundaries.
Yoo PD, Zhou BB, Zomaya AY. BMC Genomics. 2009 Dec 3;10(Suppl 3):S21. doi: 10.1186/1471-2164-10-S3-S21. PMID: 19958485. Free PMC article.
DoBo: Protein domain boundary prediction by integrating evolutionary signals and machine learning.
Eickholt J, Deng X, Cheng J. BMC Bioinformatics. 2011 Feb 1;12:43. doi: 10.1186/1471-2105-12-43. PMID: 21284866. Free PMC article.
Practical application of bioinformatics by the multidisciplinary VIZIER consortium.
Gorbalenya AE, Lieutaud P, Harris MR, Coutard B, Canard B, Kleywegt GJ, Kravchenko AA, Samborskiy DV, Sidorov IA, Leontovich AM, Jones TA. Antiviral Res. 2010 Aug;87(2):95-110. doi: 10.1016/j.antiviral.2010.02.005. Epub 2010 Feb 11. PMID: 20153379. Free PMC article. Review.
Bacterial expression strategies for human angiogenesis proteins.
Dieckman LJ, Zhang W, Rodi DJ, Donnelly MI, Collart FR. J Struct Funct Genomics. 2006 Mar;7(1):23-30. doi: 10.1007/s10969-006-9006-z. PMID: 16688392.
Prediction of protein domain with mRMR feature selection and analysis.
Li BQ, Hu LL, Chen L, Feng KY, Cai YD, Chou KC. PLoS One. 2012;7(6):e39308. doi: 10.1371/journal.pone.0039308. Epub 2012 Jun 15. PMID: 22720092. Free PMC article.
