Phylogenetic inference in protein superfamilies: analysis of SH2 domains
PMID: 9783222
Comparative Study
K Sjölander. Proc Int Conf Intell Syst Mol Biol. 1998.
Abstract
This work focuses on inferring evolutionary relationships in protein superfamilies and on using these relationships to identify key positions in the structure, to infer attributes on the basis of evolutionary distance, and to identify potential errors in sequence annotations. Relative entropy, a distance measure from information theory, is used in combination with Dirichlet mixture priors to estimate a phylogenetic tree for a set of proteins. The method infers key structural or functional positions in the molecule and guides the tree topology to preserve these important positions within subtrees. Minimum-description-length principles are used to determine a cut of the tree into subtrees, identifying the subfamilies in the data. The method is demonstrated on SH2-domain-containing proteins, resulting in a new subfamily assignment for Src2-drome and a suggested evolutionary relationship between Nck_human and Drk_drome, Sem5_caeel, Grb2_human and Grb2_chick.
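To illustrate how relative entropy between profiles can drive tree construction, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a single symmetric Dirichlet prior (a plain pseudocount) in place of a Dirichlet mixture, a symmetrized relative entropy summed over alignment columns, and a greedy bottom-up merge. The function names, the toy sequences, and the `pseudocount` parameter are all hypothetical, and the minimum-description-length cut is omitted.

```python
import math

AMINO = "ACDEFGHIKLMNPQRSTVWY"

def column_distribution(column, pseudocount=1.0):
    """Posterior-mean residue frequencies for one alignment column.

    Assumption: one symmetric Dirichlet prior (equal pseudocounts),
    a simplification of the Dirichlet mixture named in the abstract.
    """
    counts = {a: pseudocount for a in AMINO}
    for residue in column:
        if residue in counts:  # gaps and unknown symbols are ignored
            counts[residue] += 1.0
    total = sum(counts.values())
    return [counts[a] / total for a in AMINO]

def relative_entropy(p, q):
    """Relative entropy D(p || q) in nats; all entries are positive here."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def profile(seqs, pseudocount=1.0):
    """One smoothed distribution per column of a set of aligned sequences."""
    length = len(seqs[0])
    return [column_distribution([s[i] for s in seqs], pseudocount)
            for i in range(length)]

def profile_distance(seqs_a, seqs_b):
    """Symmetrized relative entropy between two profiles, summed over columns."""
    pa, pb = profile(seqs_a), profile(seqs_b)
    return sum(relative_entropy(p, q) + relative_entropy(q, p)
               for p, q in zip(pa, pb))

def agglomerate(named_seqs):
    """Greedy bottom-up merging; returns the tree as nested tuples of names."""
    clusters = [(name, [seq]) for name, seq in named_seqs.items()]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = profile_distance(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Toy aligned sequences (hypothetical, not SH2 data).
toy = {"a1": "ACDA", "a2": "ACDA", "b1": "WYWY", "b2": "WYWF"}
tree = agglomerate(toy)
print(tree)
```

Because each merged cluster is re-profiled from all of its member sequences, columns conserved within a subtree keep sharply peaked distributions, which is one simple way to let conserved positions influence which clusters join next.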