CATH--a hierarchic classification of protein domain structures
C A Orengo et al. Structure. 1997.
Abstract
Background: Protein evolution gives rise to families of structurally related proteins, within which sequence identities can be extremely low. As a result, structure-based classifications can be effective at identifying unanticipated relationships in known structures, and in optimal cases function can also be assigned. The ever-increasing number of known protein structures is too large to classify manually; automatic methods are therefore needed for fast evaluation of protein structures.
Results: We present a semi-automatic procedure for deriving a novel hierarchical classification of protein domain structures (CATH). The four main levels of our classification are protein class (C), architecture (A), topology (T) and homologous superfamily (H). Class is the simplest level, and it essentially describes the secondary structure composition of each domain. In contrast, architecture summarises the shape revealed by the orientations of the secondary structure units, such as barrels and sandwiches. At the topology level, sequential connectivity is considered, such that members of the same architecture might have quite different topologies. When structures belonging to the same T-level have suitably high similarities combined with similar functions, the proteins are assumed to be evolutionarily related and put into the same homologous superfamily.
Conclusions: Analysis of the structural families generated by CATH reveals the prominent features of protein structure space. We find that nearly a third of the homologous superfamilies (H-levels) belong to ten major T-levels, which we call superfolds, and furthermore that nearly two-thirds of these H-levels cluster into nine simple architectures. A database of well-characterised protein structure families, such as CATH, will facilitate the assignment of structure-function/evolution relationships to both known and newly determined protein structures.
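The four-level hierarchy described in the Results can be pictured as a nested lookup keyed by class, architecture, topology and homologous superfamily. The following is a minimal Python sketch under stated assumptions, not the authors' procedure: the `DomainAssignment` type, the function names, the level labels and the toy domains are all invented for illustration, and the structure comparison that actually assigns each level is not modelled. The second function shows how one might count H-levels per T-level, the kind of tally behind the superfold observation in the Conclusions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainAssignment:
    """One domain and its label at each of the four CATH levels (hypothetical type)."""
    domain_id: str       # invented identifier, not a real PDB or CATH entry
    protein_class: str   # C: secondary structure composition
    architecture: str    # A: overall shape, e.g. barrel or sandwich
    topology: str        # T: sequential connectivity
    superfamily: str     # H: inferred common ancestry


def build_hierarchy(assignments):
    """Nest domain ids as class -> architecture -> topology -> superfamily."""
    tree = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    )
    for a in assignments:
        tree[a.protein_class][a.architecture][a.topology][a.superfamily].append(
            a.domain_id
        )
    return tree


def superfamilies_per_topology(assignments):
    """Count distinct H-levels under each (C, A, T) node, largest first."""
    groups = defaultdict(set)
    for a in assignments:
        groups[(a.protein_class, a.architecture, a.topology)].add(a.superfamily)
    counts = {key: len(members) for key, members in groups.items()}
    # Sort by descending count, then by key so ties are ordered deterministically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy, invented assignments purely for illustration.
toy = [
    DomainAssignment("dom1", "alpha-beta", "sandwich", "T1", "H1"),
    DomainAssignment("dom2", "alpha-beta", "sandwich", "T1", "H2"),
    DomainAssignment("dom3", "alpha-beta", "sandwich", "T2", "H3"),
    DomainAssignment("dom4", "mainly-beta", "barrel", "T3", "H4"),
    DomainAssignment("dom5", "alpha-beta", "sandwich", "T1", "H1"),
]

hierarchy = build_hierarchy(toy)
ranking = superfamilies_per_topology(toy)
```

In this sketch, two domains share a superfamily only if they already agree at the C, A and T levels, which mirrors the nesting the abstract describes; T-levels at the top of `ranking` are the ones holding the most superfamilies.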