Gene structure prediction by linguistic methods
S Dong et al. Genomics. 1994 Oct.
Free article
Abstract
The higher-order structure of genes and other features of biological sequences can be described by means of formal grammars. These grammars can then be used by general-purpose parsers to detect and to assemble such structures by means of syntactic pattern recognition. We describe a grammar and parser for eukaryotic protein-encoding genes, which by some measures is as effective as current connectionist and combinatorial algorithms in predicting gene structures for sequence database entries. Parameters of the grammar rules are optimized for several different species, and mixing experiments are performed to determine the degree of species specificity and the relative importance of compositional, signal-based, and syntactic components in gene prediction.
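The idea of recognizing gene structure with a grammar can be illustrated by a deliberately tiny sketch. This is not the grammar or parser described in the paper. It assumes a toy grammar (gene -> exon (intron exon)*, with an intron being any span that starts with GT and ends with AG) plus the constraint that the spliced exons form an open reading frame. The names `parse_gene`, `is_orf`, and `MIN_INTRON` are hypothetical, and the compositional and signal scores mentioned in the abstract are omitted.

```python
from typing import List, Optional, Tuple

STOPS = {"TAA", "TAG", "TGA"}
MIN_INTRON = 4  # toy value (assumption); real introns are far longer


def is_orf(cds: str) -> bool:
    """True if cds is ATG ... stop, in frame, with no internal stop codon."""
    if len(cds) < 6 or len(cds) % 3 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return codons[-1] in STOPS and not any(c in STOPS for c in codons[1:-1])


def parse_gene(seq: str) -> Optional[List[Tuple[int, int]]]:
    """Return intron (start, end) spans that yield a valid gene, or None.

    Toy grammar: gene -> exon (intron exon)* ; intron -> 'GT' ... 'AG'.
    Spans are half-open, so seq[start:end] is the intron.
    """
    def search(pos: int, spliced: str, introns: List[Tuple[int, int]]):
        # Option 1: no further introns; the rest of seq is the last exon.
        if is_orf(spliced + seq[pos:]):
            return introns
        # Option 2: try every GT donor after pos, then every AG acceptor.
        for d in range(pos + 1, len(seq) - 1):
            if seq[d:d + 2] != "GT":
                continue
            for a in range(d + MIN_INTRON, len(seq)):
                if seq[a - 2:a] == "AG":
                    found = search(a, spliced + seq[pos:d], introns + [(d, a)])
                    if found is not None:
                        return found
        return None

    return search(0, "", [])
```

For example, on the made-up sequence `ATGGCC` + `GTAAGTTCAG` + `AAATAA` the unspliced sequence is out of frame, so the backtracking search has to remove the GT...AG span to obtain the reading frame `ATG GCC AAA TAA`. Exhaustive backtracking is only workable on short toy inputs; a practical gene finder would score candidate signals and prune the search.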
Similar articles
- Protein linguistics - a grammar for modular protein assembly?
Gimona M. Nat Rev Mol Cell Biol. 2006 Jan;7(1):68-73. doi: 10.1038/nrm1785. PMID: 16493414 Review.
- A graph grammar approach to artificial life.
Kniemeyer O, Buck-Sorlin GH, Kurth W. Artif Life. 2004 Fall;10(4):413-31. doi: 10.1162/1064546041766451. PMID: 15479546
- MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa.
Catalano D, Licciulli F, Turi A, Grillo G, Saccone C, D'Elia D. BMC Bioinformatics. 2006 Jan 24;7:36. doi: 10.1186/1471-2105-7-36. PMID: 16433928 Free PMC article.
- Evolution of universal grammar.
Nowak MA, Komarova NL, Niyogi P. Science. 2001 Jan 5;291(5501):114-8. doi: 10.1126/science.291.5501.114. PMID: 11141560
- DNA sequence analysis linguistic tools: contrast vocabularies, compositional spectra and linguistic complexity.
Bolshoy A. Appl Bioinformatics. 2003;2(2):103-12. PMID: 15130826 Review.
Cited by
- Evaluation of gene-finding programs on mammalian sequences.
Rogic S, Mackworth AK, Ouellette FB. Genome Res. 2001 May;11(5):817-32. doi: 10.1101/gr.147901. PMID: 11337477 Free PMC article.
- Gene prediction by spectral rotation measure: a new method for identifying protein-coding regions.
Kotlar D, Lavner Y. Genome Res. 2003 Aug;13(8):1930-7. doi: 10.1101/gr.1261703. Epub 2003 Jul 17. PMID: 12869578 Free PMC article.
- A brief review of computational gene prediction methods.
Wang Z, Chen Y, Li Y. Genomics Proteomics Bioinformatics. 2004 Nov;2(4):216-21. doi: 10.1016/s1672-0229(04)02028-5. PMID: 15901250 Free PMC article. Review.
- A view from the dark side.
Searls DB. PLoS Comput Biol. 2007 Jun;3(6):e105. doi: 10.1371/journal.pcbi.0030105. PMID: 17604444 Free PMC article. No abstract available.
- Finding and Characterizing Repeats in Plant Genomes.
Nicolas J, Tempel S, Fiston-Lavier AS, Cherif E. Methods Mol Biol. 2022;2443:327-385. doi: 10.1007/978-1-0716-2067-0_18. PMID: 35037215