CDD: conserved domains and protein three-dimensional structure - PubMed

Nucleic Acids Res. 2013 Jan;41(Database issue):D348-52.

doi: 10.1093/nar/gks1243. Epub 2012 Nov 28.

Aron Marchler-Bauer, Chanjuan Zheng, Farideh Chitsaz, Myra K Derbyshire, Lewis Y Geer, Renata C Geer, Noreen R Gonzales, Marc Gwadz, David I Hurwitz, Christopher J Lanczycki, Fu Lu, Shennan Lu, Gabriele H Marchler, James S Song, Narmada Thanki, Roxanne A Yamashita, Dachuan Zhang, Stephen H Bryant

Aron Marchler-Bauer et al. Nucleic Acids Res. 2013 Jan.

Abstract

CDD, the Conserved Domain Database, is part of NCBI's Entrez query and retrieval system and is also accessible via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. CDD provides annotation of protein sequences with the location of conserved domain footprints and functional sites inferred from these footprints. Pre-computed annotation is available via Entrez, and interactive search services accept single protein or nucleotide queries, as well as batch submissions of protein query sequences, using RPS-BLAST to rapidly identify putative matches. CDD incorporates several protein domain and full-length protein model collections, and maintains an active curation effort that aims to provide fine-grained classifications for major and well-characterized protein domain families, as supported by available protein three-dimensional (3D) structure and the published literature. To date, the majority of protein 3D structures are represented by models tracked by CDD, and CDD curators are characterizing novel families that emerge from protein structure determination efforts.

Figures

Figure 1.

This histogram illustrates the distribution of protein 3D structures between conserved domain superfamilies. Although the majority of superfamilies cannot be linked to a 3D structure representative, about one quarter of those that can be linked have only a single representative 3D structure. Data prepared with NCBI FLink (http://www.ncbi.nlm.nih.gov/Structure/flink/flink.cgi).

Figure 2.

CD-Search results for a nucleotide query sequence, the complete genome sequence of a Hepatitis B virus. Results have been obtained for three different reading frames used for translation of the nucleotide query. Consequently, the display is split into three panels, which are labeled with ‘RF +1’, ‘RF +2’ and ‘RF +3’.
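The three panels correspond to the three forward reading frames of the nucleotide query. As a rough illustration of what such a translation involves, the following minimal sketch translates a nucleotide sequence in frames +1, +2 and +3 using the standard genetic code. It is not CD-Search's own implementation, which the abstract does not describe; the function name `translate_forward_frames`, the `RF +n` dictionary keys and the toy input are assumptions made for this example, and reverse-strand frames are not covered.

```python
# Standard genetic code, built in TCAG order (assumption: NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate_forward_frames(nucleotides):
    """Translate a nucleotide sequence in the three forward reading frames.

    Hypothetical helper for illustration only. Returns a dict keyed by
    'RF +1', 'RF +2' and 'RF +3'. Stop codons are written as '*', and
    codons containing ambiguous bases are written as 'X'.
    """
    seq = nucleotides.upper().replace("U", "T")
    frames = {}
    for offset in range(3):
        # Step through complete codons only; a trailing partial codon is ignored.
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames[f"RF +{offset + 1}"] = "".join(
            CODON_TABLE.get(codon, "X") for codon in codons
        )
    return frames


if __name__ == "__main__":
    # Toy input, not taken from any real genome.
    for label, protein in translate_forward_frames("ATGGCCTAA").items():
        print(label, protein)
```

Each translated frame could then be searched separately against the domain models, which is why hits on different frames would appear in separate panels.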
