Using multiple alignments to improve gene prediction
Comparative Study
Samuel S Gross et al. J Comput Biol. 2006 Mar.
Abstract
The multiple-species de novo gene prediction problem can be stated as follows: given an alignment of genomic sequences from two or more organisms, predict the location and structure of all protein-coding genes in one or more of the sequences. Here, we present a new system, N-SCAN (a.k.a. TWINSCAN 3.0), for addressing this problem. N-SCAN can model the phylogenetic relationships between the aligned genome sequences, context-dependent substitution rates, and insertions and deletions. An implementation of N-SCAN was created and used to generate predictions for the entire human genome and the genome of the fruit fly Drosophila melanogaster. Analyses of the predictions reveal that N-SCAN's accuracy in both human and fly exceeds that of all previously published whole-genome de novo gene predictors.
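To make concrete what it can mean to "model the phylogenetic relationships between the aligned genome sequences", the sketch below scores a single alignment column on a small tree using Felsenstein's pruning algorithm. It is an illustration only and is not N-SCAN's actual model: it assumes a plain Jukes-Cantor substitution model (no context dependence, no explicit insertions or deletions), and the tree `TOY_TREE`, its branch lengths, the species names, and all function names are hypothetical.

```python
import math

BASES = "ACGT"


def jc69_prob(a, b, t):
    """Jukes-Cantor probability that base a is replaced by base b over branch length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def partial_likelihoods(node, column):
    """Felsenstein pruning: P(leaves below node | base at node), for each base.

    A node is either a leaf name (str) or a tuple of (child, branch_length) pairs.
    """
    if isinstance(node, str):
        observed = column[node]
        if observed not in BASES:
            # Gap or unknown character: treated as missing data (an assumption).
            return {b: 1.0 for b in BASES}
        return {b: 1.0 if b == observed else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, t in node:
        child_like = partial_likelihoods(child, column)
        for parent_base in BASES:
            result[parent_base] *= sum(
                jc69_prob(parent_base, child_base, t) * child_like[child_base]
                for child_base in BASES
            )
    return result


def column_likelihood(tree, column):
    """Likelihood of one alignment column, with a uniform base distribution at the root."""
    root = partial_likelihoods(tree, column)
    return sum(0.25 * root[b] for b in BASES)


# Hypothetical tree: ((human:0.1, mouse:0.3):0.2, chicken:0.5)
TOY_TREE = (
    ((("human", 0.1), ("mouse", 0.3)), 0.2),
    ("chicken", 0.5),
)

if __name__ == "__main__":
    conserved = {"human": "A", "mouse": "A", "chicken": "A"}
    diverged = {"human": "A", "mouse": "C", "chicken": "G"}
    print("conserved column:", column_likelihood(TOY_TREE, conserved))
    print("diverged column: ", column_likelihood(TOY_TREE, diverged))
```

Under this toy model a fully conserved column receives a higher likelihood than a column in which every species differs. A predictor built on this idea could compare column likelihoods under different evolutionary models for different sequence features, but how N-SCAN does so is not described in the abstract.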
Similar articles
- The Gene-Finder computer tools for analysis of human and model organisms genome sequences.
Solovyev V, Salamov A. Proc Int Conf Intell Syst Mol Biol. 1997;5:294-302. PMID: 9322052
- Using ESTs to improve the accuracy of de novo gene prediction.
Wei C, Brent MR. BMC Bioinformatics. 2006 Jul 3;7:327. doi: 10.1186/1471-2105-7-327. PMID: 16817966 Free PMC article.
- Vertebrate gene finding from multiple-species alignments using a two-level strategy.
Carter D, Durbin R. Genome Biol. 2006;7 Suppl 1(Suppl 1):S6.1-12. doi: 10.1186/gb-2006-7-s1-s6. Epub 2006 Aug 7. PMID: 16925840 Free PMC article.
- Strategies for transcriptome analysis in nonmodel plants.
Ward JA, Ponnala L, Weber CA. Am J Bot. 2012 Feb;99(2):267-76. doi: 10.3732/ajb.1100334. Epub 2012 Feb 1. PMID: 22301897 Review.
- A brief review of computational gene prediction methods.
Wang Z, Chen Y, Li Y. Genomics Proteomics Bioinformatics. 2004 Nov;2(4):216-21. doi: 10.1016/s1672-0229(04)02028-5. PMID: 15901250 Free PMC article. Review.
Cited by
- Begin at the beginning: predicting genes with 5' UTRs.
Brown RH, Gross SS, Brent MR. Genome Res. 2005 May;15(5):742-7. doi: 10.1101/gr.3696205. PMID: 15867435 Free PMC article.
- Rule-based knowledge acquisition method for promoter prediction in human and Drosophila species.
Huang WL, Tung CW, Liaw C, Huang HL, Ho SY. ScientificWorldJournal. 2014;2014:327306. doi: 10.1155/2014/327306. Epub 2014 Jan 29. PMID: 24955394 Free PMC article.
- Evaluating high-throughput ab initio gene finders to discover proteins encoded in eukaryotic pathogen genomes missed by laboratory techniques.
Goodswen SJ, Kennedy PJ, Ellis JT. PLoS One. 2012;7(11):e50609. doi: 10.1371/journal.pone.0050609. Epub 2012 Nov 30. PMID: 23226328 Free PMC article.
- The UCSC genome browser and associated tools.
Kuhn RM, Haussler D, Kent WJ. Brief Bioinform. 2013 Mar;14(2):144-61. doi: 10.1093/bib/bbs038. Epub 2012 Aug 20. PMID: 22908213 Free PMC article.
- Global discriminative learning for higher-accuracy computational gene prediction.
Bernal A, Crammer K, Hatzigeorgiou A, Pereira F. PLoS Comput Biol. 2007 Mar 16;3(3):e54. doi: 10.1371/journal.pcbi.0030054. Epub 2007 Feb 2. PMID: 17367206 Free PMC article.