TY  - JOUR
AU  - Chan, Chon-Kit Kenneth
AU  - Hsu, Arthur L.
AU  - Halgamuge, Saman K.
AU  - Tang, Sen-Lin
PY  - 2008
DA  - 2008/04/28
TI  - Binning sequences using very sparse labels within a metagenome
JO  - BMC Bioinformatics
SP  - 215
VL  - 9
IS  - 1
AB  - In metagenomic studies, a process called binning is necessary to assign contigs that belong to multiple species to their respective phylogenetic groups. Most of the current methods of binning, such as BLAST, k-mer and PhyloPythia, involve assigning sequence fragments by comparing sequence similarity or sequence composition with already-sequenced genomes that are still far from comprehensive. We propose a semi-supervised seeding method for binning that does not depend on knowledge of completed genomes. Instead, it extracts the flanking sequences of highly conserved 16S rRNA from the metagenome and uses them as seeds (labels) to assign other reads based on their compositional similarity.
SN  - 1471-2105
UR  - https://doi.org/10.1186/1471-2105-9-215
DO  - 10.1186/1471-2105-9-215
ID  - Chan2008
ER  -