Improving gene recognition accuracy by combining predictions from two gene-finding programs
Comparative Study
Sanja Rogic et al. Bioinformatics. 2002 Aug.
Abstract
Motivation: Despite constant improvements in prediction accuracy, gene-finding programs are still unable to provide automatic gene discovery with the desired accuracy. Current programs identify up to 75% of exons correctly, and fewer than 50% of predicted gene structures correspond to actual genes. New approaches to computational gene-finding are clearly needed.
Results: In this paper we explore the benefits of combining predictions from existing gene prediction programs. We introduce three novel methods for combining predictions from the programs Genscan and HMMgene. The methods primarily aim to improve the exon-level accuracy of gene-finding by identifying more probable exon boundaries and by eliminating false positive exon predictions. This approach improves accuracy at both the nucleotide and exon levels, especially the latter, where the average improvement on the newly assembled dataset is 7.9% compared to the best result obtained by Genscan and HMMgene. When tested on a long genomic multi-gene sequence, our method that maintains reading frame consistency improved nucleotide-level specificity by 21.0% and exon-level specificity by 32.5% compared to the best result obtained by either of the two programs individually.
Availability: The scripts implementing our methods are available from http://www.cs.ubc.ca/labs/beta/genefinding/
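The general idea in the Results, combining two programs' exon predictions to choose more probable boundaries and to drop likely false positives, can be illustrated with a small sketch. This is not a reconstruction of the authors' three methods, which the abstract does not specify. The `Exon` record, the `combine_exons` rule, and the 0.75 probability threshold are all hypothetical choices made for illustration. The sketch assumes each program reports a probability per exon, and it uses the convention common in gene-finding evaluation where exon-level "specificity" is the fraction of predicted exons that are exactly correct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon record; coordinates are 1-based and inclusive."""
    start: int
    end: int
    strand: str = "+"
    prob: float = 0.0  # assumed per-exon probability reported by a program


def coords(exon):
    """Identity of an exon for exact-match comparison."""
    return (exon.start, exon.end, exon.strand)


def overlaps(a, b):
    """True if two exons share at least one base on the same strand."""
    return a.strand == b.strand and a.start <= b.end and b.start <= a.end


def combine_exons(preds_a, preds_b, threshold=0.75):
    """Illustrative combination rule (an assumption, not the paper's method):
    - exons with identical boundaries in both programs are kept;
    - where exons overlap but boundaries differ, the more probable one is kept;
    - exons predicted by only one program are kept only if prob >= threshold.
    Chains of overlaps and reading-frame consistency are not handled here.
    """
    combined = {}
    for own, other in ((preds_a, preds_b), (preds_b, preds_a)):
        for exon in own:
            partners = [o for o in other if overlaps(exon, o)]
            if any(coords(o) == coords(exon) for o in partners):
                best = exon  # both programs agree exactly
            elif partners:
                best = max(partners + [exon], key=lambda e: e.prob)
            elif exon.prob >= threshold:
                best = exon  # unsupported but confident prediction
            else:
                continue  # unsupported, low probability: treat as false positive
            combined[coords(best)] = best
    return sorted(combined.values(), key=lambda e: (e.start, e.end))


def exon_accuracy(predicted, actual):
    """Exon-level (sensitivity, specificity) based on exact boundary matches."""
    pred = {coords(e) for e in predicted}
    true = {coords(e) for e in actual}
    correct = len(pred & true)
    sensitivity = correct / len(true) if true else 0.0
    specificity = correct / len(pred) if pred else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data, invented for this example only.
    actual = [Exon(100, 200), Exon(300, 400), Exon(500, 600)]
    prog_a = [Exon(100, 200, "+", 0.90), Exon(300, 410, "+", 0.60),
              Exon(500, 600, "+", 0.85), Exon(700, 750, "+", 0.30)]
    prog_b = [Exon(100, 200, "+", 0.80), Exon(300, 400, "+", 0.70),
              Exon(520, 600, "+", 0.50), Exon(800, 850, "+", 0.40)]
    merged = combine_exons(prog_a, prog_b)
    for name, preds in (("A", prog_a), ("B", prog_b), ("combined", merged)):
        print(name, exon_accuracy(preds, actual))
```

On this toy input each program alone gets two of the three real exons right and also reports spurious ones, while the combined set recovers all three. Real data will not behave this cleanly, and a practical combiner would also need to reconcile overlapping chains and keep exons in a consistent reading frame.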