CDD: NCBI's conserved domain database

Nucleic Acids Res. 2015 Jan;43(Database issue):D222-6.

doi: 10.1093/nar/gku1221. Epub 2014 Nov 20.

Aron Marchler-Bauer, Myra K Derbyshire, Noreen R Gonzales, Shennan Lu, Farideh Chitsaz, Lewis Y Geer, Renata C Geer, Jane He, Marc Gwadz, David I Hurwitz, Christopher J Lanczycki, Fu Lu, Gabriele H Marchler, James S Song, Narmada Thanki, Zhouxi Wang, Roxanne A Yamashita, Dachuan Zhang, Chanjuan Zheng, Stephen H Bryant

Aron Marchler-Bauer et al. Nucleic Acids Res. 2015 Jan.

Abstract

NCBI's CDD, the Conserved Domain Database, enters its 15th year as a public resource for the annotation of proteins with the location of conserved domain footprints. Going forward, we strive to improve the coverage and consistency of domain annotation provided by CDD. We maintain a live search system as well as an archive of pre-computed domain annotation for sequences tracked in NCBI's Entrez protein database, which can be retrieved for single sequences or in bulk. We also maintain import procedures so that CDD contains domain models and domain definitions provided by several collections available in the public domain, as well as those produced by an in-house curation effort. The curation effort aims at increasing coverage and providing finer-grained classifications of common protein domains, for which a wealth of functional and structural data has become available. CDD curation generates alignment models of representative sequence fragments, which are in agreement with domain boundaries as observed in protein 3D structure, and which model the structurally conserved cores of domain families as well as annotate conserved features. CDD can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.

Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.

Figures

Figure 1.

CD-Search reporting a ‘rescued’ domain annotation, which scores an _E_-value above the default reporting threshold of 0.01. The live search for the query sequence, derived from the PDB structure 2WOZ.
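The role of the reporting threshold in this caption can be illustrated with a minimal Python sketch. It only separates hits at or below the default E-value threshold of 0.01 from hits above it; the criteria by which CD-Search decides to 'rescue' a hit above the threshold are not described here and are not modeled. The `DomainHit` record, the accessions, the E-values, and the choice to treat the threshold as inclusive are all assumptions made for illustration, not CD-Search's actual output format or logic.

```python
from dataclasses import dataclass

# Default reporting threshold quoted in the Figure 1 caption.
DEFAULT_EVALUE_THRESHOLD = 0.01


@dataclass
class DomainHit:
    """Hypothetical record for one domain hit (not CD-Search's own format)."""
    accession: str
    evalue: float


def split_by_threshold(hits, threshold=DEFAULT_EVALUE_THRESHOLD):
    """Separate hits that meet the E-value threshold from those that exceed it.

    Assumption: a hit with an E-value exactly equal to the threshold passes.
    """
    reported = [h for h in hits if h.evalue <= threshold]
    above = [h for h in hits if h.evalue > threshold]
    return reported, above


# Made-up accessions and E-values, for illustration only.
hits = [
    DomainHit("model_A", 1e-30),
    DomainHit("model_B", 0.005),
    DomainHit("model_C", 0.2),
]
reported, above = split_by_threshold(hits)
print("at or below threshold:", [h.accession for h in reported])
print("above threshold:", [h.accession for h in above])
```

In this toy input, only `model_C` falls above the threshold, so it is the kind of hit that would be shown only if some separate rescue rule applied.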

Figure 2.

CD-Search results for SwissProt Q6XUD6, zoomed in to ‘residue level’ display so that the precise locations of domain boundaries and functional sites become apparent. Query sequence residues highlighted in bold print have been identified as part of a functional site (such as the ‘catalytic site’ mapping to R118 and D151, plus other residues not shown in this example). Structural motifs are shown as double-headed arrows.
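Residue labels such as R118 and D151 combine a one-letter amino acid code with a 1-based position in the query sequence. The following Python sketch shows one way to check such labels against a sequence. The function names are hypothetical, and the sequence is a made-up placeholder built for the example, not the actual sequence of Q6XUD6.

```python
import re


def parse_residue_label(label):
    """Split a label like 'R118' into ('R', 118); positions are 1-based."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))


def check_site_residues(sequence, labels):
    """Report, for each label, whether the sequence has that residue there."""
    result = {}
    for label in labels:
        amino_acid, position = parse_residue_label(label)
        in_range = 1 <= position <= len(sequence)
        result[label] = in_range and sequence[position - 1] == amino_acid
    return result


# Placeholder sequence: 160 alanines with R at position 118 and D at 151.
residues = ["A"] * 160
residues[117] = "R"
residues[150] = "D"
toy_sequence = "".join(residues)

print(check_site_residues(toy_sequence, ["R118", "D151", "K10"]))
```

A label that fails the check, like `K10` in this toy input, would indicate that the site annotation and the sequence are out of register.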
