The TIGRFAMs database of protein families

Daniel H Haft et al. Nucleic Acids Res. 2003.

Abstract

TIGRFAMs is a collection of manually curated protein families consisting of hidden Markov models (HMMs), multiple sequence alignments, commentary, Gene Ontology (GO) assignments, literature references and pointers to related TIGRFAMs, Pfam and InterPro models. These models are designed to support both automated and manually curated annotation of genomes. TIGRFAMs contains models of full-length proteins and shorter regions at the levels of superfamilies, subfamilies and equivalogs, where equivalogs are sets of homologous proteins conserved with respect to function since their last common ancestor. The scope of each model is set by raising or lowering cutoff scores and choosing members of the seed alignment to group proteins sharing specific function (equivalog) or more general properties. The overall goal is to provide information with maximum utility for the annotation process. TIGRFAMs is thus complementary to Pfam, whose models typically achieve broad coverage across distant homologs but end at the boundaries of conserved structural domains. The database currently contains over 1600 protein families. TIGRFAMs is available for searching or downloading at www.tigr.org/TIGRFAMs.
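The way cutoff scores set the scope of a model can be illustrated with a minimal Python sketch. It assumes that each model carries two thresholds, here called a trusted cutoff and a noise cutoff, and that a hit is judged by comparing its HMM score against them. The class name, accessions, protein names and all numeric values are hypothetical and chosen only for illustration; they are not taken from the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for one HMM-based family model."""
    accession: str         # illustrative label, not a real accession
    level: str             # "equivalog", "subfamily" or "superfamily"
    trusted_cutoff: float  # assumed: scores at or above this are members
    noise_cutoff: float    # assumed: scores below this are non-members


def classify_hit(model: Model, score: float) -> str:
    """Classify one HMM score against a model's two cutoffs."""
    if score >= model.trusted_cutoff:
        return "member"
    if score < model.noise_cutoff:
        return "non-member"
    return "uncertain"  # between the cutoffs: left for manual review


# Invented scores for three proteins searched against both models.
scores = {"protA": 410.0, "protB": 250.0, "protC": 90.0}

# High cutoffs keep the model narrow (function-specific);
# low cutoffs let it reach more distant homologs.
narrow = Model("EXAMPLE_EQUIVALOG", "equivalog", 350.0, 200.0)
broad = Model("EXAMPLE_SUPERFAMILY", "superfamily", 100.0, 50.0)

narrow_calls = {name: classify_hit(narrow, s) for name, s in scores.items()}
broad_calls = {name: classify_hit(broad, s) for name, s in scores.items()}
```

With these invented numbers, the narrow model accepts only `protA`, while the broad model also accepts `protB`. This mirrors the idea that raising a cutoff restricts a model to proteins sharing a specific function and lowering it groups proteins by more general properties.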

Figures

Figure 1

Neighbor-joining phylogenetic tree of aromatic amino acid hydroxylases. The nodes of a neighbor-joining tree based on aligned sequences are labeled to show assigned function. The tree is shown rooted at the left such that bacterial phenylalanine-4-hydroxylases (Phe-4) represented by TIGR01267, a monomeric form, comprise the outgroup. Three other HMMs represent tetrameric eukaryotic forms of aromatic amino acid hydroxylases (Tyr-3: tyrosine-3-monooxygenase, Trp-5: tryptophan-5-monooxygenase). The four equivalog models are children of the Pfam model PF00351. Note that the three closely related sets of eukaryotic proteins could have been represented by an additional subfamily HMM.

Figure 2

HMM hit regions for pyruvate carboxylase. The thin line represents the polypeptide sequence. Bars represent hit regions for various HMMs. Numbers in square brackets show the current size of each family. The number for each domain is larger than the number for the equivalog model because each domain is distributed more broadly than solely among pyruvate carboxylases.
