HMM sampling and applications to gene finding and alternative splicing
Simon L Cawley et al. Bioinformatics. 2003 Oct.
Abstract
The standard method of applying hidden Markov models to biological problems is to find a Viterbi (maximal weight) path through the HMM graph. The Viterbi algorithm reduces the problem of finding the most likely hidden state sequence that explains given observations to a dynamic programming problem for corresponding directed acyclic graphs. For example, in the gene finding application, the HMM is used to find the most likely underlying gene structure given a DNA sequence. In this note we discuss the applications of sampling methods for HMMs. The standard sampling algorithm for HMMs is a variant of the common forward-backward and backtrack algorithms, and has already been applied in the context of Gibbs sampling methods. Nevertheless, the practice of sampling state paths from HMMs does not seem to have been widely adopted, and important applications have been overlooked. We show how sampling can be used for finding alternative splicings for genes, including alternative splicings that are conserved between genes from related organisms. We also show how sampling from the posterior distribution is a natural way to compute probabilities for predicted exons and gene structures being correct under the assumed model. Finally, we describe a new memory-efficient sampling algorithm for certain classes of HMMs which provides a practical sampling alternative to the Hirschberg algorithm for optimal alignment. The ideas presented have applications not only to gene finding and HMMs but more generally to stochastic context-free grammars and RNA structure prediction.
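The sampling idea in the abstract can be illustrated with a minimal sketch: a forward pass followed by a stochastic (rather than maximizing) backtrack, which draws state paths from the posterior distribution given the observations. This is not the authors' implementation or their memory-efficient variant. The two-state model (`N` for non-coding, `E` for exon), all probabilities, the sequence `obs`, and the function names `forward` and `sample_path` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

# Hypothetical toy model (assumption, not from the paper):
# "N" = non-coding, "E" = exon; exons are GC-rich.
states = ["N", "E"]
init = {"N": 0.8, "E": 0.2}
trans = {"N": {"N": 0.9, "E": 0.1}, "E": {"N": 0.2, "E": 0.8}}
emit = {
    "N": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "E": {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10},
}
obs = "ATGCGCAT"  # illustrative sequence


def forward(obs, states, init, trans, emit):
    """Forward pass, rescaled at each position to avoid underflow."""
    f, prev = [], None
    for t, x in enumerate(obs):
        row = {}
        for s in states:
            if t == 0:
                p = init[s]
            else:
                p = sum(prev[r] * trans[r][s] for r in states)
            row[s] = p * emit[s][x]
        z = sum(row.values())
        prev = {s: v / z for s, v in row.items()}
        f.append(prev)
    return f


def sample_path(f, states, trans, rng):
    """Stochastic backtrack: draw one state path from P(path | obs)."""
    T = len(f)
    path = [None] * T
    # The last state is drawn in proportion to the final forward values.
    path[-1] = rng.choices(states, weights=[f[-1][s] for s in states])[0]
    # Each earlier state is drawn in proportion to f_t(s) * trans[s][next].
    for t in range(T - 2, -1, -1):
        nxt = path[t + 1]
        w = [f[t][s] * trans[s][nxt] for s in states]
        path[t] = rng.choices(states, weights=w)[0]
    return path


rng = random.Random(0)
f = forward(obs, states, init, trans, emit)
samples = [sample_path(f, states, trans, rng) for _ in range(5000)]

# Sampled estimate of P(state at position t is "E" | obs).
exon_freq = [sum(p[t] == "E" for p in samples) / len(samples)
             for t in range(len(obs))]
print([round(v, 3) for v in exon_freq])

# The most frequently sampled distinct paths, i.e. alternative parses.
print(Counter("".join(p) for p in samples).most_common(3))
```

Under these assumptions, the per-position frequencies approximate the posterior probability that a position is labeled as exon, and the distinct high-frequency paths play the role of alternative gene structures that a single Viterbi path would not reveal.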
Similar articles
- Training HMM structure with genetic algorithm for biological sequence analysis.
Won KJ, Prügel-Bennett A, Krogh A. Bioinformatics. 2004 Dec 12;20(18):3613-9. doi: 10.1093/bioinformatics/bth454. Epub 2004 Aug 5. PMID: 15297297
- Implementing EM and Viterbi algorithms for Hidden Markov Model in linear memory.
Churbanov A, Winters-Hilt S. BMC Bioinformatics. 2008 Apr 30;9:224. doi: 10.1186/1471-2105-9-224. PMID: 18447951 Free PMC article.
- Bayesian restoration of a hidden Markov chain with applications to DNA sequencing.
Churchill GA, Lazareva B. J Comput Biol. 1999 Summer;6(2):261-77. doi: 10.1089/cmb.1999.6.261. PMID: 10421527
- Hidden Markov model and its applications in motif findings.
Wu J, Xie J. Methods Mol Biol. 2010;620:405-16. doi: 10.1007/978-1-60761-580-4_13. PMID: 20652513 Review.
- How does DNA sequence motif discovery work?
D'haeseleer P. Nat Biotechnol. 2006 Aug;24(8):959-61. doi: 10.1038/nbt0806-959. PMID: 16900144 Review. No abstract available.
Cited by
- Artificial Intelligence and Cardiovascular Genetics.
Krittanawong C, Johnson KW, Choi E, Kaplin S, Venner E, Murugan M, Wang Z, Glicksberg BS, Amos CI, Schatz MC, Tang WHW. Life (Basel). 2022 Feb 14;12(2):279. doi: 10.3390/life12020279. PMID: 35207566 Free PMC article. Review.
- An overview and metanalysis of machine and deep learning-based CRISPR gRNA design tools.
Wang J, Zhang X, Cheng L, Luo Y. RNA Biol. 2020 Jan;17(1):13-22. doi: 10.1080/15476286.2019.1669406. Epub 2019 Sep 27. PMID: 31533522 Free PMC article. Review.
- UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling.
Bhattacharya D, Cao R, Cheng J. Bioinformatics. 2016 Sep 15;32(18):2791-9. doi: 10.1093/bioinformatics/btw316. Epub 2016 Jun 3. PMID: 27259540 Free PMC article.
- De novo protein conformational sampling using a probabilistic graphical model.
Bhattacharya D, Cheng J. Sci Rep. 2015 Nov 6;5:16332. doi: 10.1038/srep16332. PMID: 26541939 Free PMC article.
- Genome-wide inference of ancestral recombination graphs.
Rasmussen MD, Hubisz MJ, Gronau I, Siepel A. PLoS Genet. 2014 May 15;10(5):e1004342. doi: 10.1371/journal.pgen.1004342. eCollection 2014. PMID: 24831947 Free PMC article.