FastGroup: a program to dereplicate libraries of 16S rDNA sequences
Comparative Study
V Seguritan et al. BMC Bioinformatics. 2001.
Abstract
Background: Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we describe FastGroup, a Java program designed to dereplicate libraries of 16S rDNA sequences. By dereplication we mean to: 1) compare all the sequences in a data set to each other, 2) group similar sequences together, and 3) output a representative sequence from each group. In this way, duplicate sequences are removed from a library.
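The three dereplication steps listed above can be illustrated with a minimal Java sketch. It is not FastGroup's actual code: the class and method names are hypothetical, sequences are compared position by position without gaps, and each sequence is compared only against the first member of each existing group (a greedy simplification of an all-against-all comparison).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of dereplication; not taken from the FastGroup source. */
class DereplicationSketch {

    /** Percent identity over the shared leading positions of two sequences (no gaps). */
    static double percentIdentity(String a, String b) {
        int n = Math.min(a.length(), b.length());
        if (n == 0) return 0.0;
        int matches = 0;
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) == b.charAt(i)) matches++;
        }
        return 100.0 * matches / n;
    }

    /**
     * Greedy grouping (an assumption of this sketch): a sequence joins the first
     * group whose first member is at least `threshold` percent identical to it;
     * otherwise it founds a new group.
     */
    static List<List<String>> group(List<String> seqs, double threshold) {
        List<List<String>> groups = new ArrayList<>();
        for (String s : seqs) {
            List<String> home = null;
            for (List<String> g : groups) {
                if (percentIdentity(g.get(0), s) >= threshold) {
                    home = g;
                    break;
                }
            }
            if (home == null) {
                home = new ArrayList<>();
                groups.add(home);
            }
            home.add(s);
        }
        return groups;
    }

    /** Output one representative per group; here simply the first member. */
    static List<String> representatives(List<List<String>> groups) {
        List<String> reps = new ArrayList<>();
        for (List<String> g : groups) reps.add(g.get(0));
        return reps;
    }
}
```

Calling `representatives(group(library, 97.0))` on a list of sequences returns one sequence per group, so exact and near-exact duplicates collapse to a single entry.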
Results: FastGroup was tested using a library of single-pass, bacterial 16S rDNA sequences cloned from coral-associated bacteria. We found that the optimal strategy for dereplicating these sequences was to: 1) trim ambiguous bases from the 5' end of the sequences and all sequence 3' of the conserved Bact517 site, 2) match the sequences from the 3' end, and 3) group sequences ≥97% identical to each other.
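The trimming and matching strategy above might be sketched in Java as follows. This is an illustration under stated assumptions, not FastGroup's implementation: the names are hypothetical, an "ambiguous base" is taken to be any character other than uppercase A, C, G, or T, only the contiguous run of ambiguous bases at the 5' end is removed, the conserved site is passed in as the exact string it takes in the read, and the site itself is kept when the 3' sequence is cut.

```java
/** Hypothetical sketch of the trimming and 3'-anchored matching steps. */
class TrimAndMatchSketch {

    /** Remove the leading run of ambiguous (non-ACGT) characters from the 5' end. */
    static String trimFivePrimeAmbiguous(String seq) {
        int i = 0;
        while (i < seq.length() && "ACGT".indexOf(seq.charAt(i)) < 0) i++;
        return seq.substring(i);
    }

    /**
     * Drop everything 3' of the first occurrence of `site`.
     * Keeping the site itself is an assumption of this sketch.
     * The sequence is returned unchanged if the site is not found.
     */
    static String trimThreePrimeOfSite(String seq, String site) {
        int idx = seq.indexOf(site);
        return idx < 0 ? seq : seq.substring(0, idx + site.length());
    }

    /** Percent identity with the two sequences aligned at their 3' ends (no gaps). */
    static double identityFromThreePrime(String a, String b) {
        int n = Math.min(a.length(), b.length());
        if (n == 0) return 0.0;
        int matches = 0;
        for (int k = 1; k <= n; k++) {
            if (a.charAt(a.length() - k) == b.charAt(b.length() - k)) matches++;
        }
        return 100.0 * matches / n;
    }

    /** Two trimmed sequences fall in the same group at or above the threshold (97 in the text). */
    static boolean sameGroup(String a, String b, double threshold) {
        return identityFromThreePrime(a, b) >= threshold;
    }
}
```

Anchoring the comparison at the 3' end means that, after both reads are cut at the same conserved site, position k from the end refers to the same column in each read even when the 5' trimming leaves them with different lengths.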
Conclusions: The FastGroup program simplifies the dereplication of 16S rDNA sequence libraries and prepares the raw sequences for subsequent analyses.
Figures
Figure 1
Graphical User Interface (GUI) for FastGroup.
Figure 2
Schematic of bacterial 16S rDNA showing conserved and hypervariable regions. Detailed information about the primers and their superposition on the bacterial 16S rDNA can be found at . Bact27F (5' AGA GTT TGA TCM TGG CTC AG 3') corresponds to positions 9–27 of the E. coli 16S rDNA and is similar to BSF8/20. Bact517 (5' ATT ACC GCG GCT GCT GG 3') corresponds to positions 517–534 of the E. coli 16S rDNA and is similar to BSF517/17. Bact1492R (5' TAC GGY TAC CTT GTT ACG ACT T 3') corresponds to positions 1492–1514 of the E. coli 16S rDNA. The approximate sites for hypervariable regions (V1–V3) are shown as shaded boxes.
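Two of the primers above contain IUPAC degenerate codes (M in Bact27F, Y in Bact1492R), so locating a primer site in a read needs more than exact string matching. The Java sketch below shows one way to do this; the class and method names are hypothetical, only a subset of IUPAC codes is handled, and the read is assumed to be uppercase and in the same orientation as the primer string supplied.

```java
/** Hypothetical sketch: locate a degenerate primer in a sequence using IUPAC codes. */
class PrimerMatchSketch {

    /** Bases allowed by an IUPAC code; unknown codes match nothing. */
    static String basesFor(char code) {
        switch (code) {
            case 'A': return "A";
            case 'C': return "C";
            case 'G': return "G";
            case 'T': return "T";
            case 'R': return "AG";
            case 'Y': return "CT";
            case 'M': return "AC";
            case 'K': return "GT";
            case 'S': return "CG";
            case 'W': return "AT";
            case 'N': return "ACGT";
            default:  return "";
        }
    }

    /**
     * Return the 0-based index of the first position where the primer matches,
     * or -1 if it matches nowhere. Spaces in the primer (as printed in the
     * figure legend) are ignored.
     */
    static int findPrimer(String seq, String primer) {
        String p = primer.replace(" ", "");
        for (int start = 0; start + p.length() <= seq.length(); start++) {
            boolean ok = true;
            for (int j = 0; j < p.length() && ok; j++) {
                ok = basesFor(p.charAt(j)).indexOf(seq.charAt(start + j)) >= 0;
            }
            if (ok) return start;
        }
        return -1;
    }
}
```

For example, `findPrimer(read, "AGA GTT TGA TCM TGG CTC AG")` accepts either A or C at the M position and rejects G or T there.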
Figure 3
Comparison of ClustalX and FastGroup analyses. The 16S rDNA library was aligned using ClustalX and a neighbor-joining (NJ) tree was constructed. The "ClustalX Clades" were made by grouping end nodes separated by approximately 3% divergence (i.e., the combined branch lengths). Sequences grouped together by FastGroup, using default trimming criteria and a percent sequence identity (PSI) threshold of 97%, were identified on this tree and color-coded.