Effects of experimental choices and analysis noise on surveys of the "rare biosphere"
Comparative Study
Appl Environ Microbiol. 2009 May;75(10):3263-70.
doi: 10.1128/AEM.01931-08. Epub 2009 Mar 6.
- PMID: 19270149
- PMCID: PMC2681628
Timothy J Hamp et al. Appl Environ Microbiol. 2009 May.
Abstract
When planning a survey of 16S rRNA genes from a complex environment, investigators face many choices including which primers to use and how to taxonomically classify sequences. In this study, we explored how these choices affected a survey of microbial diversity in a sample taken from the aerobic basin of the activated sludge of a North Carolina wastewater treatment plant. We performed pyrosequencing reactions on PCR products generated from primers targeting the V1-V2, V6, and V6-V7 variable regions of the 16S rRNA gene. We compared these sequences to 16S rRNA gene sequences found in a whole-genome shotgun pyrosequencing run performed on the same sample. We found that sequences generated from primers targeting the V1-V2 variable region had the best match to the whole-genome shotgun reaction across a range of taxonomic classifications from phylum to family. Pronounced differences between primer sets, however, occurred in the "rare biosphere" involving taxa that we observed in fewer than 11 sequences. We also examined the results of analysis strategies comparing a classification scheme using a nearest-neighbor approach to directly classifying sequences with a naïve Bayesian algorithm. Again, we observed pronounced differences between these analysis schemes in infrequently observed taxa. We conclude that if a study is meant to probe the rare biosphere, both the experimental conditions and analysis choices will have a profound impact on the observed results.
Figures
FIG. 1.
Sequence conservation as a function of alignment position for the 489,840 sequences in version 9.59 of the RDP. The x axis shows the position in the alignment as numbered by the E. coli 16S rRNA gene. The y axis shows the Shannon sequence entropy (see Materials and Methods), a widely used measure of conservation in multiple sequence alignments (23). Highly conserved positions within the alignment have a sequence entropy close to zero and hence are shown toward the top of the y axis. Positions of the hypervariable regions V1-V3 and V6-V7 are derived from Chakravorty et al. (3).
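The per-position entropy described in this caption can be illustrated with a minimal sketch. It assumes entropy is computed in bits (log base 2) over the characters observed in each alignment column; the logarithm base and the handling of gaps used in the paper are not stated here, and the function names and toy alignment below are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the characters in one alignment column."""
    total = len(column)
    counts = Counter(column)
    # Sum p * log2(1/p) over observed characters; a fully conserved
    # column has a single character with p = 1 and therefore entropy 0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Entropy at each position of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignment: positions 1-2 conserved, 3-4 variable.
toy_alignment = ["ACGT", "ACGA", "ACTC", "ACCG"]
entropies = alignment_entropy(toy_alignment)
```

In this toy case the two conserved positions score zero and the fully variable last position scores the four-letter maximum of 2 bits, which matches the caption's reading that low entropy marks conserved positions.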
FIG. 2.
Number of sequences assigned at the family classification level by the RDP classification algorithm to different sequences for the V1-V2, V6, and V6-V7 primers plotted against the number of 16S sequences assigned to the whole-genome shotgun sequence set. One has been added to each count to allow the data to be shown on a log-log plot.
FIG. 3.
Across taxonomic levels, the results of a linear regression on log-transformed data between sequences generated by PCR targeting the 16S rRNA gene and 16S sequences culled from our whole-genome shotgun sequence set. Assignments are by the RDP classification algorithm as in Fig. 2. For each sequence set, two separate regressions were constructed, one for the rare biosphere (circles) with taxa seen 10 or fewer times in that sequence set's PCRs and one for a common biosphere (squares) with taxa seen 11 or more times in the PCR reactions (Fig. 1, gray lines). The top panels show the −log10 of the P value of the null hypothesis that the slope of the regression equals zero. The middle panels show Pearson's R values while the bottom panel shows the number of taxa for which classifications are made. Note that a significant P value (top panel) can be produced by either a negative or positive correlation.
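The split described in this caption can be sketched as follows. The sketch assumes per-taxon counts are held in dictionaries, applies the add-one log10 transform mentioned for Fig. 2, and reports only Pearson's R and the number of taxa for each group; the P value on the regression slope is omitted. The function names and the toy counts are hypothetical and are not data from the study.

```python
import math

RARE_MAX = 10  # caption's cutoff: taxa seen 10 or fewer times in the PCR set


def pearson_r(xs, ys):
    """Pearson correlation, or None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def rare_common_correlations(pcr_counts, shotgun_counts):
    """Correlate log10(count + 1) values separately for rare and common taxa."""
    groups = {"rare": ([], []), "common": ([], [])}
    for taxon in sorted(set(pcr_counts) | set(shotgun_counts)):
        pcr = pcr_counts.get(taxon, 0)
        shotgun = shotgun_counts.get(taxon, 0)
        # Group membership is decided by the PCR count, as in the caption.
        xs, ys = groups["rare" if pcr <= RARE_MAX else "common"]
        xs.append(math.log10(shotgun + 1))
        ys.append(math.log10(pcr + 1))
    return {
        name: {"n_taxa": len(xs), "r": pearson_r(xs, ys)}
        for name, (xs, ys) in groups.items()
    }


# Hypothetical counts: common taxa agree, rare taxa disagree.
pcr = {"FamA": 20, "FamB": 50, "FamC": 100, "FamD": 1, "FamE": 2, "FamF": 5}
shotgun = {"FamA": 20, "FamB": 50, "FamC": 100, "FamD": 5, "FamE": 2, "FamF": 1}
result = rare_common_correlations(pcr, shotgun)
```

With these toy counts the common group correlates perfectly while the rare group correlates negatively, the kind of divergence the caption's separate regressions are designed to expose.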
FIG. 4.
Comparison of classifications at the family level made by JGast and the RDP classification algorithm.
FIG. 5.
Regressions across classification levels on log-transformed data showing the comparison between the RDP classification algorithm and JGast. Two regressions were constructed for each comparison: one for a common biosphere in which a taxon was observed 11 or more times under either JGast or RDP (squares) and one for a rare biosphere in which a taxon was observed fewer than 11 times under both classification schemes.
Similar articles
- Group-specific PCR primers for the phylum Acidobacteria designed based on the comparative analysis of 16S rRNA gene sequences.
Lee SH, Cho JC. J Microbiol Methods. 2011 Aug;86(2):195-203. doi: 10.1016/j.mimet.2011.05.003. Epub 2011 May 12. PMID: 21600936
- A comparison of primer sets for detecting 16S rRNA and hydrazine oxidoreductase genes of anaerobic ammonium-oxidizing bacteria in marine sediments.
Li M, Hong Y, Klotz MG, Gu JD. Appl Microbiol Biotechnol. 2010 Mar;86(2):781-90. doi: 10.1007/s00253-009-2361-5. Epub 2010 Jan 27. PMID: 20107988
- Scratching the surface of the rare biosphere with ribosomal sequence tag primers.
Neufeld JD, Li J, Mohn WW. FEMS Microbiol Lett. 2008 Jun;283(2):146-53. doi: 10.1111/j.1574-6968.2008.01124.x. Epub 2008 Apr 21. PMID: 18429998
- Comparison of species richness estimates obtained using nearly complete fragments and simulated pyrosequencing-generated fragments in 16S rRNA gene-based environmental surveys.
Youssef N, Sheik CS, Krumholz LR, Najar FZ, Roe BA, Elshahed MS. Appl Environ Microbiol. 2009 Aug;75(16):5227-36. doi: 10.1128/AEM.00592-09. Epub 2009 Jun 26. PMID: 19561178 Free PMC article.
- Review and re-analysis of domain-specific 16S primers.
Baker GC, Smith JJ, Cowan DA. J Microbiol Methods. 2003 Dec;55(3):541-55. doi: 10.1016/j.mimet.2003.08.009. PMID: 14607398 Review.
Cited by
- Single-cell genomics-based analysis reveals a vital ecological role of Thiocapsa sp. LSW in the meromictic Lake Shunet, Siberia.
Wu YT, Chiang PW, Tandon K, Rogozin DY, Degermendzhy AG, Tang SL. Microb Genom. 2021 Dec;7(12):000712. doi: 10.1099/mgen.0.000712. PMID: 34860152 Free PMC article.
- Analysis of 16S rRNA amplicon sequencing options on the Roche/454 next-generation titanium sequencing platform.
Tamaki H, Wright CL, Li X, Lin Q, Hwang C, Wang S, Thimmapuram J, Kamagata Y, Liu WT. PLoS One. 2011;6(9):e25263. doi: 10.1371/journal.pone.0025263. Epub 2011 Sep 23. PMID: 21966473 Free PMC article.
- VITCOMIC2: visualization tool for the phylogenetic composition of microbial communities based on 16S rRNA gene amplicons and metagenomic shotgun sequencing.
Mori H, Maruyama T, Yano M, Yamada T, Kurokawa K. BMC Syst Biol. 2018 Mar 19;12(Suppl 2):30. doi: 10.1186/s12918-018-0545-2. PMID: 29560821 Free PMC article.
- The "most wanted" taxa from the human microbiome for whole genome sequencing.
Fodor AA, DeSantis TZ, Wylie KM, Badger JH, Ye Y, Hepburn T, Hu P, Sodergren E, Liolios K, Huot-Creasy H, Birren BW, Earl AM. PLoS One. 2012;7(7):e41294. doi: 10.1371/journal.pone.0041294. Epub 2012 Jul 26. PMID: 22848458 Free PMC article.
- Considerations For Optimizing Microbiome Analysis Using a Marker Gene.
de la Cuesta-Zuluaga J, Escobar JS. Front Nutr. 2016 Aug 8;3:26. doi: 10.3389/fnut.2016.00026. eCollection 2016. PMID: 27551678 Free PMC article. Review.