Systematic design of 18S rRNA gene primers for determining eukaryotic diversity in microbial consortia

Luisa W Hugerth et al. PLoS One. 2014.

Abstract

High-throughput sequencing of ribosomal RNA gene (rDNA) amplicons has opened up the door to large-scale comparative studies of microbial community structures. The short reads currently produced by massively parallel sequencing technologies make the choice of sequencing region crucial for accurate phylogenetic assignments. While for 16S rDNA, relevant regions have been well described, no truly systematic design of 18S rDNA primers aimed at resolving eukaryotic diversity has yet been reported. Here we used 31,862 18S rDNA sequences to design a set of broad-taxonomic range degenerate PCR primers. We simulated the phylogenetic information that each candidate primer pair would retrieve using paired- or single-end reads of various lengths, representing different sequencing technologies. Primer pairs targeting the V4 region performed best, allowing discrimination with paired-end reads as short as 150 bp (with 75% accuracy at genus level). The conditions for PCR amplification were optimised for one of these primer pairs and this was used to amplify 18S rDNA sequences from isolates as well as from a range of environmental samples which were then Illumina sequenced and analysed, revealing good concordance between expected and observed results. In summary, the reported primer sets will allow minimally biased assessment of eukaryotic diversity in different microbial ecosystems.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Position and coverage of candidate primers.

Eighteen bp oligomers with 12 degrees of degeneracy were designed to match as many of the sequences as possible at each position of an alignment of 31,862 full-length unique eukaryotic 18S rDNA sequences using the DegePrime program. The proportion of the sequences matched by the best oligomer found for each position is depicted in black, with a line connecting adjacent points. The entropy of each position is depicted by a dotted grey line. The position numbering refers to the Saccharomyces cerevisiae strain FM-sc-08 18S ribosomal RNA gene, NCBI accession number Z75578. Dark red horizontal bars represent the oligomers chosen as candidate primers in this study. Primers which were later altered are marked in lighter red. Primers found in the literature are depicted in dark blue. Pink rectangles are used to highlight the hypervariable regions of the gene.
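The quantities plotted in Figure 1 can be illustrated with a small sketch. It is not the DegePrime algorithm; it only shows, under simplifying assumptions, how the degeneracy of an IUPAC-coded oligomer, the proportion of sequences it matches at a given alignment position, and the Shannon entropy of an alignment column might be computed. The function names and the toy alignment are hypothetical, and gaps or ambiguous bases in the target sequences are not handled.

```python
import math
from collections import Counter

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct plain sequences encoded by a degenerate primer."""
    result = 1
    for code in primer:
        result *= len(IUPAC[code])
    return result


def matches(primer, window):
    """True if every base in the window is allowed by the primer code."""
    return len(primer) == len(window) and all(
        base in IUPAC[code] for code, base in zip(primer, window)
    )


def coverage(primer, sequences, start):
    """Fraction of aligned sequences matched by the primer at `start`."""
    windows = [seq[start:start + len(primer)] for seq in sequences]
    return sum(matches(primer, w) for w in windows) / len(windows)


def column_entropy(sequences, position):
    """Shannon entropy (bits) of one alignment column."""
    counts = Counter(seq[position] for seq in sequences)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["ACGA", "ATGG", "ACGT", "AAGA"]
print(degeneracy("AYGR"))
print(coverage("AYGR", toy_alignment, 0))
print(column_entropy(toy_alignment, 1))
```

In this toy case the primer `AYGR` encodes four plain sequences and matches two of the four aligned sequences. A fully conserved column has an entropy of zero, while more variable columns score higher.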

Figure 2. Taxonomic distribution of sequences matching candidate primers.

The central circle represents the taxonomic distribution of the SILVA eukaryotic database. Each outer ring corresponds to the taxonomic distribution of sequences matching each primer candidate. Primers are marked in the figure and each colour corresponds to a kingdom or phylum as shown in the legend.

Figure 3. Ratio between number of unique amplicon sequences and full-length sequences.

The ratio between the number of unique amplicon sequences and unique near full-length sequences (starting at primer 391 and ending at 1786), for different primer pairs and read lengths/types. (A) Paired-end 150 bp reads (B) Paired-end 250 bp reads (C) Single-end 400 bp reads. Paired-end reads are connected by a black dashed line.
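The ratio described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the sequences are already aligned to common coordinates, that a paired-end read pair is represented by the first and last `read_length` bases of the amplicon, and that a pair long enough to span the amplicon recovers it in full. All names and the toy data are hypothetical.

```python
def simulated_read(amplicon, read_length, paired):
    """Return the part of an amplicon that a read (or read pair) would cover."""
    if not paired:
        # Single-end: only the first `read_length` bases are observed.
        return amplicon[:read_length]
    if len(amplicon) <= 2 * read_length:
        # Assumption: the two reads together span the whole amplicon.
        return amplicon
    # Paired-end: both ends are observed, the middle is not.
    return amplicon[:read_length] + "|" + amplicon[-read_length:]


def uniqueness_ratio(sequences, start, end, read_length, paired):
    """Unique simulated reads divided by unique input sequences."""
    unique_full = set(sequences)
    unique_reads = {
        simulated_read(seq[start:end], read_length, paired)
        for seq in unique_full
    }
    return len(unique_reads) / len(unique_full)


# Hypothetical toy sequences; the last one duplicates the first.
toy_sequences = [
    "AAAACCCCGGGG",
    "AAAATTCCGGGG",
    "AAAACCCCGGGT",
    "AAAACCCCGGGG",
]
print(uniqueness_ratio(toy_sequences, 0, 12, 4, paired=True))
print(uniqueness_ratio(toy_sequences, 0, 12, 4, paired=False))
```

A ratio close to 1 means the sequenced region distinguishes almost as many sequences as the near full-length gene does. Lower values mean that distinct full-length sequences collapse into identical reads.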

Figure 4. Specificity of taxonomic annotations at different taxonomic levels.

Specificity of taxonomic annotations at different taxonomic levels, for the different primer pairs and read lengths/types, when requiring 99% identity to the selected match. Only instances where the selected hit sequence was annotated down to family level are shown, which were on average 78% of the cases. Matches to the correct species are depicted in green, and to the correct genus in yellow. Matches to the level annotated immediately above genus are marked in orange. All other matches are considered misassignments and depicted in red.

Figure 5. Hamming distance between BLAST queries and hits.

A violin plot of the Hamming distance between the full-length sequence of BLAST queries and hits at a 99% identity cut-off. Inside each violin the boxplot is also depicted.
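For reference, the Hamming distance counts the positions at which two sequences of equal length differ. The sketch below assumes the two full-length sequences have already been aligned to the same length. How the authors treated gaps is not stated here, so this version simply counts a gap character as a mismatch against a base. The function name is hypothetical.

```python
def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


print(hamming_distance("ACGTAC", "ACCTAA"))
```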

Figure 6. Alpha- and beta-diversity of environmental samples.

(A) Rarefaction curves for OTUs at 97% similarity for environmental samples. (B) NMDS plot of the Bray-Curtis distance between 97%-similarity OTU profiles of the same samples.
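The two measures behind this figure can be sketched in a few lines. This is an illustration under stated assumptions, not the software used in the study: OTU profiles are taken to be lists of read counts in the same OTU order, and one point on a rarefaction curve is estimated from a single random subsample drawn without replacement. Function names and counts are hypothetical.

```python
import random


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two OTU count profiles."""
    shared = sum(min(a, b) for a, b in zip(profile_a, profile_b))
    total = sum(profile_a) + sum(profile_b)
    return 1 - 2 * shared / total


def rarefaction_point(otu_counts, depth, seed=0):
    """OTUs observed in one random subsample of `depth` reads."""
    # Expand the count table into one entry per read, labelled by OTU index.
    reads = [otu for otu, n in enumerate(otu_counts) for _ in range(n)]
    rng = random.Random(seed)
    return len(set(rng.sample(reads, depth)))


# Hypothetical count profiles for two samples.
sample_1 = [10, 0, 5]
sample_2 = [5, 5, 5]
print(bray_curtis(sample_1, sample_2))
print(rarefaction_point(sample_1, depth=5))
```

Repeating `rarefaction_point` over increasing depths, and averaging over several seeds, would trace out a curve like those in panel (A). A matrix of pairwise `bray_curtis` values is the usual input for an ordination such as the one in panel (B).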

Figure 7. Taxonomic classification of selected environmental samples.

(A) Marine water 1 (B) Soil (C) Wastewater sludge 2 (D) Moose rumen 1. An interactive HTML version of these plots and of the other environmental samples at deeper taxonomic resolution can be found as File S1.

Grants and funding

This work was funded by the Swedish Research Councils VR [grant 2011-5689], FORMAS [grant 2009-1174] and EC BONUS project BLUEPRINT partially funded by FORMAS through grants to A.F.A., the Luxembourg National Research Fund [grants ATTRACT/A09/03 and CORE/11/BM/1186762 to P.W., PHD-MARP-04 to H.R. and PRD-2011-1/SR to E.M.] and the Chinese Scholarship Council. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.