Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals
Gene Yeo et al. J Comput Biol. 2004.
Abstract
We propose a framework for modeling sequence motifs based on the maximum entropy principle (MEP). We recommend approximating short sequence motif distributions with the maximum entropy distribution (MED) consistent with low-order marginal constraints estimated from available data, which may include dependencies between nonadjacent as well as adjacent positions. Many maximum entropy models (MEMs) are specified by simply changing the set of constraints. Such models can be utilized to discriminate between signals and decoys. Classification performance using different MEMs gives insight into the relative importance of dependencies between different positions. We apply our framework to large datasets of RNA splicing signals. Our best models out-perform previous probabilistic models in the discrimination of human 5' (donor) and 3' (acceptor) splice sites from decoys. Finally, we discuss mechanistically motivated ways of comparing models.
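The idea of a maximum entropy distribution constrained by low-order marginals can be illustrated with a small sketch. The abstract does not say how the distribution is fitted; the code below assumes iterative proportional fitting over an explicitly enumerated sequence space, which is one standard way to do it and is feasible only for short motifs. The function names (`empirical_marginal`, `fit_maxent`, `log_odds`), the smoothing parameter `alpha`, and the 3-mer toy sequences are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import product
import math

ALPHABET = "ACGT"


def empirical_marginal(seqs, positions, alpha=1.0):
    """Smoothed frequency of each letter tuple seen at `positions`.

    A total pseudo-mass `alpha` (> 0, an assumption of this sketch) is spread
    uniformly over the cells, so marginals of different orders stay consistent.
    """
    cells = list(product(ALPHABET, repeat=len(positions)))
    counts = Counter(tuple(s[i] for i in positions) for s in seqs)
    total = len(seqs) + alpha
    return {k: (counts[k] + alpha / len(cells)) / total for k in cells}


def fit_maxent(seqs, constraints, length, n_iter=50, alpha=1.0):
    """Fit a distribution over all 4**length sequences by iterative
    proportional fitting, starting from the uniform distribution.

    `constraints` is a list of position tuples, e.g. (0,) for a single
    position or (0, 2) for a pair of nonadjacent positions.
    """
    p = {"".join(t): 0.25 ** length for t in product(ALPHABET, repeat=length)}
    targets = [(tuple(c), empirical_marginal(seqs, c, alpha)) for c in constraints]
    for _ in range(n_iter):
        for positions, target in targets:
            # Current model marginal on the constrained positions.
            model = defaultdict(float)
            for s, ps in p.items():
                model[tuple(s[i] for i in positions)] += ps
            # Rescale so the model marginal matches the target marginal.
            for s in p:
                key = tuple(s[i] for i in positions)
                p[s] *= target[key] / model[key]
    return p


def log_odds(seq, p_signal, p_decoy):
    """Log2 ratio of signal-model to decoy-model probability."""
    return math.log2(p_signal[seq] / p_decoy[seq])


# Toy 3-mers, invented for illustration (not real splice-site data).
signals = ["GTA", "GTA", "GTG", "GTA", "GTG", "GTA"]
decoys = ["CAT", "TCC", "ACG", "CGT", "TAC", "AGC"]

# Single-position constraints plus adjacent-pair constraints.
constraints = [(0,), (1,), (2,), (0, 1), (1, 2)]
p_signal = fit_maxent(signals, constraints, length=3)
p_decoy = fit_maxent(decoys, constraints, length=3)

for candidate in ("GTA", "CAT"):
    print(candidate, round(log_odds(candidate, p_signal, p_decoy), 2))
```

Changing the `constraints` list is all it takes to specify a different model. With single-position constraints only, the fitted distribution is the product of the per-position frequencies, that is, an ordinary weight matrix; adding a tuple such as `(0, 2)` introduces a dependency between nonadjacent positions. Comparing classification performance across such constraint sets is the kind of experiment the abstract describes.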
Similar articles
- A statistical approach for 5' splice site prediction using short sequence motifs and without encoding sequence data.
Meher PK, Sahu TK, Rao AR, Wahi SD. BMC Bioinformatics. 2014 Nov 25;15:362. doi: 10.1186/s12859-014-0362-6. PMID: 25420551. Free PMC article.
- Ordered partitioning reveals extended splice-site consensus information.
Weir M, Rice M. Genome Res. 2004 Jan;14(1):67-78. doi: 10.1101/gr.1715204. PMID: 14707171. Free PMC article.
- SpliceIT: a hybrid method for splice signal identification based on probabilistic and biological inference.
Malousi A, Chouvarda I, Koutkias V, Kouidou S, Maglaveras N. J Biomed Inform. 2010 Apr;43(2):208-17. doi: 10.1016/j.jbi.2009.09.004. Epub 2009 Sep 30. PMID: 19800027.
- Coevolution of genomic intron number and splice sites.
Irimia M, Penny D, Roy SW. Trends Genet. 2007 Jul;23(7):321-5. doi: 10.1016/j.tig.2007.04.001. Epub 2007 Apr 18. PMID: 17442445. Review.
- Beyond Moments: Extending the Maximum Entropy Principle to Feature Distribution Constraints.
Baggenstoss PM. Entropy (Basel). 2018 Aug 30;20(9):650. doi: 10.3390/e20090650. PMID: 33265739. Free PMC article. Review.
Cited by
- Human PRPF39 is an alternative splicing factor recruiting U1 snRNP to weak 5' splice sites.
Espinosa S, De Bortoli F, Li X, Rossi J, Wagley ME, Lo HG, Taliaferro JM, Zhao R. RNA. 2022 Oct 31;29(1):97-110. doi: 10.1261/rna.079320.122. Online ahead of print. PMID: 36316087. Free PMC article.
- Genetic screening of the FLCN gene identify six novel variants and a Danish founder mutation.
Rossing M, Albrechtsen A, Skytte AB, Jensen UB, Ousager LB, Gerdes AM, Nielsen FC, Hansen TV. J Hum Genet. 2017 Feb;62(2):151-157. doi: 10.1038/jhg.2016.118. Epub 2016 Oct 13. PMID: 27734835.
- Exome sequencing identifies mutations in the gene TTC7A in French-Canadian cases with hereditary multiple intestinal atresia.
Samuels ME, Majewski J, Alirezaie N, Fernandez I, Casals F, Patey N, Decaluwe H, Gosselin I, Haddad E, Hodgkinson A, Idaghdour Y, Marchand V, Michaud JL, Rodrigue MA, Desjardins S, Dubois S, Le Deist F, Awadalla P, Raymond V, Maranda B. J Med Genet. 2013 May;50(5):324-9. doi: 10.1136/jmedgenet-2012-101483. Epub 2013 Feb 19. PMID: 23423984. Free PMC article.
- Varying levels of complexity in transcription factor binding motifs.
Keilwagen J, Grau J. Nucleic Acids Res. 2015 Oct 15;43(18):e119. doi: 10.1093/nar/gkv577. Epub 2015 Jun 26. PMID: 26116565. Free PMC article.
- A deep learning approach to identify gene targets of a therapeutic for human splicing disorders.
Gao D, Morini E, Salani M, Krauson AJ, Chekuri A, Sharma N, Ragavendran A, Erdin S, Logan EM, Li W, Dakka A, Narasimhan J, Zhao X, Naryshkin N, Trotta CR, Effenberger KA, Woll MG, Gabbeta V, Karp G, Yu Y, Johnson G, Paquette WD, Cutting GR, Talkowski ME, Slaugenhaupt SA. Nat Commun. 2021 Jun 7;12(1):3332. doi: 10.1038/s41467-021-23663-2. PMID: 34099697. Free PMC article.