NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads

Gail L Rosen et al. Bioinformatics. 2011.

Abstract

Motivation: Datasets from high-throughput sequencing technologies have yielded a vast amount of data about organisms in environmental samples. Yet, it is still a challenge to assess the exact organism content in these samples because the task of taxonomic classification is too computationally complex to annotate all reads in a dataset. An easy-to-use webserver is needed to process these reads. While many methods exist, only a few are publicly available on webservers, and out of those, most do not annotate all reads.

Results: We introduce a webserver that implements the naïve Bayes classifier (NBC) to classify all metagenomic reads to their best taxonomic match. Results indicate that NBC can assign next-generation sequencing reads to their taxonomic classification and can find significant populations of genera that other classifiers may miss.
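To make the idea of classifying every read to its best taxonomic match concrete, here is a minimal sketch of a k-mer naïve Bayes classifier. It is an illustration only, not the NBC webserver's implementation: the toy genomes, the function names, the k-mer length of 3, the add-one smoothing, and the uniform prior over taxa are all assumptions chosen for brevity.

```python
import math
from collections import Counter

# Hypothetical toy "genomes"; real training data would be full reference genomes.
TOY_GENOMES = {
    "genus_A": "ACGTACGTACGTACGTACGT",
    "genus_B": "GGGCCCGGGCCCGGGCCCGG",
}


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def train(genomes, k):
    """Count k-mers per taxon; returns {taxon: (counts, total_kmers)}."""
    model = {}
    for taxon, seq in genomes.items():
        counts = Counter(kmers(seq, k))
        model[taxon] = (counts, sum(counts.values()))
    return model


def log_score(read, taxon_model, k):
    """Log-likelihood of the read's k-mers under one taxon.

    Add-one smoothing over the 4**k possible DNA k-mers (an assumption)
    keeps unseen k-mers from producing log(0).
    """
    counts, total = taxon_model
    vocab = 4 ** k
    return sum(
        math.log((counts[km] + 1) / (total + vocab)) for km in kmers(read, k)
    )


def classify(read, model, k):
    """Return the taxon with the highest score (uniform prior assumed)."""
    return max(model, key=lambda taxon: log_score(read, model[taxon], k))


model = train(TOY_GENOMES, k=3)
print(classify("ACGTACG", model, k=3))
print(classify("GGCCCGG", model, k=3))
```

Because the classifier always returns the highest-scoring taxon, every read receives an assignment. That mirrors the "classify all reads" behaviour described above, at the cost of also labelling reads that match no reference well.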

Availability: Publicly available at: http://nbc.ece.drexel.edu.

Figures

Fig. 1.

Percentage of reads assigned to a particular genus out of all 454 reads from the Biogas reactor community. CAMERA and NBC tend to agree for over 70% of the genera shown, while MG-RAST agrees with CAMERA and NBC for roughly 50%. WebCARMA bins fewer reads, and Galaxy has high variability. For the first 5602 reads (1.5 Mb web site limit), Phylopythia classifies only eight reads to the phylum level and is not included in the graph because it cannot make assignments at the genus level.
