Hidden Markov models in computational biology. Applications to protein modeling
Comparative Study
J Mol Biol. 1994 Feb 4;235(5):1501-31.
doi: 10.1006/jmbi.1994.1104.
- PMID: 8107089
A Krogh et al. J Mol Biol. 1994.
Abstract
Hidden Markov Models (HMMs) are applied to the problems of statistical modeling, database searching and multiple sequence alignment of protein families and protein domains. These methods are demonstrated on the globin family, the protein kinase catalytic domain, and the EF-hand calcium binding motif. In each case the parameters of an HMM are estimated from a training set of unaligned sequences. After the HMM is built, it is used to obtain a multiple alignment of all the training sequences. It is also used to search the SWISS-PROT 22 database for other sequences that are members of the given protein family, or contain the given domain. The HMM produces multiple alignments of good quality that agree closely with the alignments produced by programs that incorporate three-dimensional structural information. When employed in discrimination tests (by examining how closely the sequences in a database fit the globin, kinase and EF-hand HMMs), the HMM is able to distinguish members of these families from non-members with a high degree of accuracy. Both the HMM and PROFILESEARCH (a technique used to search for relationships between a protein sequence and multiply aligned sequences) perform better in these tests than PROSITE (a dictionary of sites and patterns in proteins). The HMM appears to have a slight advantage over PROFILESEARCH in terms of lower rates of false negatives and false positives, even though the HMM is trained using only unaligned sequences, whereas PROFILESEARCH requires aligned training sequences. Our results suggest the presence of an EF-hand calcium binding motif in a highly conserved and evolutionarily preserved putative intracellular region of 155 residues in the alpha-1 subunit of L-type calcium channels, which play an important role in excitation-contraction coupling. This region has been suggested to contain the functional domains that are typical or essential for all L-type calcium channels, regardless of whether they couple to ryanodine receptors, conduct ions, or both.
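The abstract describes using a trained HMM to align sequences to the model. A standard way to do this for a profile HMM with match, insert and delete states is the Viterbi algorithm, which finds the most probable state path for a sequence. The following Python sketch illustrates that idea on a toy model. It is not the authors' implementation: the transition probabilities in `TRANS`, the helper `match_emissions`, the function `viterbi_align`, and the three-residue consensus used below are all hypothetical, and the end state is simply treated as one more silent match state.

```python
import math

NEG_INF = float("-inf")
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical transition probabilities, shared by every model position.
TRANS = {
    ("M", "M"): 0.90, ("M", "I"): 0.05, ("M", "D"): 0.05,
    ("I", "M"): 0.50, ("I", "I"): 0.40, ("I", "D"): 0.10,
    ("D", "M"): 0.80, ("D", "I"): 0.10, ("D", "D"): 0.10,
}
LOG_T = {k: math.log(v) for k, v in TRANS.items()}
LOG_INSERT = math.log(1.0 / len(AMINO_ACIDS))  # uniform insert emissions


def match_emissions(consensus, p_consensus=0.81):
    """Toy match-state emissions: the consensus residue gets p_consensus,
    the remaining mass is spread evenly over the other 19 residues."""
    p_other = (1.0 - p_consensus) / (len(AMINO_ACIDS) - 1)
    return [
        {aa: math.log(p_consensus if aa == c else p_other) for aa in AMINO_ACIDS}
        for c in consensus
    ]


def viterbi_align(seq, log_match):
    """Return (log probability, state path) of the most probable path."""
    L, n = len(log_match), len(seq)
    # V[s][j][i]: best log probability of a path that ends in state s at model
    # position j after emitting the first i residues. M at position 0 is begin.
    V = {s: [[NEG_INF] * (n + 1) for _ in range(L + 1)] for s in "MID"}
    back = {s: [[None] * (n + 1) for _ in range(L + 1)] for s in "MID"}
    V["M"][0][0] = 0.0

    def best(prev_j, prev_i, to_state):
        # Best predecessor among M, I, D at cell (prev_j, prev_i).
        return max((V[s][prev_j][prev_i] + LOG_T[(s, to_state)], s) for s in "MID")

    for j in range(L + 1):
        for i in range(n + 1):
            if j > 0 and i > 0:  # match state j emits residue i
                score, prev = best(j - 1, i - 1, "M")
                V["M"][j][i] = score + log_match[j - 1][seq[i - 1]]
                back["M"][j][i] = prev
            if i > 0:  # insert state j emits residue i
                score, prev = best(j, i - 1, "I")
                V["I"][j][i] = score + LOG_INSERT
                back["I"][j][i] = prev
            if j > 0:  # delete state j is silent
                score, prev = best(j - 1, i, "D")
                V["D"][j][i] = score
                back["D"][j][i] = prev

    # Assumption: the end state behaves like a silent match state.
    score, state = best(L, n, "M")
    path, j, i = [], L, n
    while (state, j, i) != ("M", 0, 0):  # walk back to the begin state
        path.append(f"{state}{j}")
        prev = back[state][j][i]
        if state == "M":
            j, i = j - 1, i - 1
        elif state == "I":
            i -= 1
        else:
            j -= 1
        state = prev
    return score, path[::-1]
```

With the toy model `match_emissions("HEL")`, calling `viterbi_align("HL", ...)` returns the path `M1, D2, M3`, so the missing residue is aligned to a delete state. Aligning every training sequence this way and stacking the residues by match state gives a multiple alignment. In a real application the probabilities would be estimated from the training sequences rather than fixed by hand.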
Similar articles
- HMM-ModE: improved classification using profile hidden Markov models by optimising the discrimination threshold and modifying emission probabilities with negative training sequences.
Srivastava PK, Desai DK, Nandi S, Lynn AM. BMC Bioinformatics. 2007 Mar 27;8:104. doi: 10.1186/1471-2105-8-104. PMID: 17389042. Free PMC article.
- Designing patterns for profile HMM search.
Sun Y, Buhler J. Bioinformatics. 2007 Jan 15;23(2):e36-43. doi: 10.1093/bioinformatics/btl323. PMID: 17237102.
- Fast model-based protein homology detection without alignment.
Hochreiter S, Heusel M, Obermayer K. Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8. PMID: 17488755.
- Hidden Markov model and its applications in motif findings.
Wu J, Xie J. Methods Mol Biol. 2010;620:405-16. doi: 10.1007/978-1-60761-580-4_13. PMID: 20652513. Review.
- Profile hidden Markov models.
Eddy SR. Bioinformatics. 1998;14(9):755-63. doi: 10.1093/bioinformatics/14.9.755. PMID: 9918945. Review.
Cited by
- The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
Motamayor JC, Mockaitis K, Schmutz J, Haiminen N, Livingstone D 3rd, Cornejo O, Findley SD, Zheng P, Utro F, Royaert S, Saski C, Jenkins J, Podicheti R, Zhao M, Scheffler BE, Stack JC, Feltus FA, Mustiga GM, Amores F, Phillips W, Marelli JP, May GD, Shapiro H, Ma J, Bustamante CD, Schnell RJ, Main D, Gilbert D, Parida L, Kuhn DN. Genome Biol. 2013 Jun 3;14(6):r53. doi: 10.1186/gb-2013-14-6-r53. PMID: 23731509. Free PMC article.
- Phylogeny Estimation Given Sequence Length Heterogeneity.
Smirnov V, Warnow T. Syst Biol. 2021 Feb 10;70(2):268-282. doi: 10.1093/sysbio/syaa058. PMID: 32692823. Free PMC article.
- Non-Markovian effects on protein sequence evolution due to site dependent substitution rates.
Rizzato F, Rodriguez A, Laio A. BMC Bioinformatics. 2016 Jun 24;17:258. doi: 10.1186/s12859-016-1135-1. PMID: 27342318. Free PMC article.
- MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts.
Deng X, Cheng J. BMC Bioinformatics. 2011 Dec 14;12:472. doi: 10.1186/1471-2105-12-472. PMID: 22168237. Free PMC article.
- Liquid-theory analogy of direct-coupling analysis of multiple-sequence alignment and its implications for protein structure prediction.
Kinjo AR. Biophys Physicobiol. 2015 Dec 11;12:117-9. doi: 10.2142/biophysico.12.0_117. eCollection 2015. PMID: 27493860. Free PMC article.