Using comparative genome analysis to identify problems in annotated microbial genomes
Review
Microbiology (Reading). 2010 Jul;156(Pt 7):1909-1917.
doi: 10.1099/mic.0.033811-0. Epub 2010 Apr 29.
- PMID: 20430813
Free article
Maria S Poptsova et al. Microbiology (Reading). 2010 Jul.
Abstract
Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes affect different types of analysis to different degrees. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is urgent and unresolved. Owing to progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.
Similar articles
- Re-annotation of genome microbial coding-sequences: finding new genes and inaccurately annotated genes.
Bocs S, Danchin A, Médigue C. BMC Bioinformatics. 2002;3:5. doi: 10.1186/1471-2105-3-5. Epub 2002 Feb 5. PMID: 11879526. Free PMC article.
- Combining multiple functional annotation tools increases coverage of metabolic annotation.
Griesemer M, Kimbrel JA, Zhou CE, Navid A, D'haeseleer P. BMC Genomics. 2018 Dec 19;19(1):948. doi: 10.1186/s12864-018-5221-9. PMID: 30567498. Free PMC article.
- MICheck: a web tool for fast checking of syntactic annotations of bacterial genomes.
Cruveiller S, Le Saux J, Vallenet D, Lajus A, Bocs S, Médigue C. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W471-9. doi: 10.1093/nar/gki498. PMID: 15980515. Free PMC article.
- Comparative Genome Annotation.
König S, Romoth L, Stanke M. Methods Mol Biol. 2018;1704:189-212. doi: 10.1007/978-1-4939-7463-4_6. PMID: 29277866. Review.
- Annotation, comparison and databases for hundreds of bacterial genomes.
Médigue C, Moszer I. Res Microbiol. 2007 Dec;158(10):724-36. doi: 10.1016/j.resmic.2007.09.009. Epub 2007 Oct 6. PMID: 18031997. Review.
Cited by
- Theoretical prediction and experimental verification of protein-coding genes in plant pathogen genome Agrobacterium tumefaciens strain C58.
Wang Q, Lei Y, Xu X, Wang G, Chen LL. PLoS One. 2012;7(9):e43176. doi: 10.1371/journal.pone.0043176. Epub 2012 Sep 11. PMID: 22984411. Free PMC article.
- EnzymeDetector: an integrated enzyme function prediction tool and database.
Quester S, Schomburg D. BMC Bioinformatics. 2011 Sep 23;12:376. doi: 10.1186/1471-2105-12-376. PMID: 21943292. Free PMC article.
- Divergence of protein-coding capacity and regulation in the Bacillus cereus sensu lato group.
Toby IT, Widmer J, Dyer DW. BMC Bioinformatics. 2014;15 Suppl 11(Suppl 11):S8. doi: 10.1186/1471-2105-15-S11-S8. Epub 2014 Oct 21. PMID: 25350501. Free PMC article.
- Proteogenomic Analysis Provides Novel Insight into Genome Annotation and Nitrogen Metabolism in Nostoc sp. PCC 7120.
Yu S, Yang M, Xiong J, Zhang Q, Gao X, Miao W, Ge F. Microbiol Spectr. 2021 Oct 31;9(2):e0049021. doi: 10.1128/Spectrum.00490-21. Epub 2021 Sep 15. PMID: 34523988. Free PMC article.
- Recognition of Protein-coding Genes Based on Z-curve Algorithms.
Guo FB, Lin Y, Chen LL. Curr Genomics. 2014 Apr;15(2):95-103. doi: 10.2174/1389202915999140328162724. PMID: 24822027. Free PMC article.