Classification and quantification of bacteriophage taxa in human gut metagenomes

Alison S Waller et al. ISME J. 2014 Jul.

Abstract

Bacteriophages have key roles in microbial communities, to a large extent shaping the taxonomic and functional composition of the microbiome, but data on the connections between phage diversity and the composition of communities are scarce. Using taxon-specific marker genes, we identified and monitored 20 viral taxa in 252 human gut metagenomic samples, mostly at the level of genera. On average, five phage taxa were identified in each sample, with up to three of these being highly abundant. The abundances of most phage taxa vary by up to four orders of magnitude between the samples, and several taxa that are highly abundant in some samples are absent in others. Significant correlations exist between the abundances of some phage taxa and human host metadata: for example, 'Group 936 lactococcal phages' are more prevalent and abundant in Danish samples than in samples from Spain or the United States of America. Quantification of phages that exist as integrated prophages revealed that the abundance profiles of prophages are highly individual-specific and remain unique to an individual over a 1-year time period, and prediction of prophage lysis across the samples identified hundreds of prophages that are apparently active in the gut and vary across the samples, in terms of presence and lytic state. Finally, a prophage-host network of the human gut was established and includes numerous novel host-phage associations.

Figures

Figure 1

Taxonomic classification and quantification of phages in the human gut and the ocean. (a) Relative rank of phage taxa in the human gut and the ocean. The abundance of each taxon was used to determine its rank, and the relative rank is the rank divided by the maximum rank in that sample set. Thus, a relative rank of 1 corresponds to the most abundant taxon in that sample set. Each column represents one of four sample sets: GOS (82 samples from the Global Ocean Sampling Expedition), 252.Gut.PFPR (only the subset of prophage-encoded genes in 252 gut metagenome samples), 61.Gut.Virome (all of the genes in 61 samples from three published studies of virus-enriched gut metagenomes) and 252.Gut.total (all of the genes contained within 252 gut metagenome samples). The columns are clustered by similarity, as illustrated by the dendrogram above the columns. (b) Abundance of each phage taxon within the 252 human gut samples (total metagenomic genes). These abundances were derived by dividing the length-normalized base coverages by the total gene abundance in each sample and multiplying by 10^9 to yield an abundance per Gbp. For those taxa for which marker genes are not quantitative (see Materials and methods), the recall is indicated in brackets.
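
The two quantities defined in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, ranks are assumed to ascend with abundance (so the most abundant taxon gets relative rank 1), ties are broken by taxon name, and "length-normalized base coverage" is assumed to mean covered bases divided by gene length.

```python
def relative_ranks(abundances):
    """Map each taxon to rank / max rank within one sample set.

    Assumption: ranks ascend with abundance, so the most abundant taxon
    has the maximum rank and therefore a relative rank of 1.0.
    Ties are broken by taxon name (an arbitrary illustrative choice).
    """
    ordered = sorted(abundances, key=lambda taxon: (abundances[taxon], taxon))
    max_rank = len(ordered)
    return {taxon: (i + 1) / max_rank for i, taxon in enumerate(ordered)}


def abundance_per_gbp(base_coverage, gene_length, total_gene_abundance):
    """Length-normalized coverage / total gene abundance, scaled by 10^9.

    Assumption: length normalization is base coverage divided by gene length.
    """
    length_normalized = base_coverage / gene_length
    return length_normalized / total_gene_abundance * 1e9


if __name__ == "__main__":
    # Hypothetical abundances for three made-up taxa.
    print(relative_ranks({"taxon_a": 10.0, "taxon_b": 1000.0, "taxon_c": 100.0}))
    print(abundance_per_gbp(3000.0, 1500.0, 4e6))
```

Because the relative rank is scaled by the number of ranked taxa, values from sample sets with different numbers of taxa fall on the same 0 to 1 scale, which is what makes the columns in panel (a) comparable.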

Figure 2

Correlation between the abundance of the group 936 lactococcal phages and the country of origin of the human subject in the study. Boxplots show the abundance of group 936 lactococcal phages in samples from individuals from different countries. FDR-corrected _P_-values between the groups are indicated on the boxplot.

Figure 3

Similarity between prophage abundance profiles of human gut metagenomic samples. For each sample, a prophage abundance distribution was determined (that is, the relative abundance of each prophage region detected in the sample). The Euclidean distance between each pair of samples' prophage abundance profiles was then determined. The red dots indicate the similarity between the prophage abundance profiles from the same individual sampled at different time points; the time between the sampling points is indicated on the x axis. The blue diamonds represent the similarity between each of the time-point samples and the next most similar sample. Over the whole 1-year sampling period, two samples from the same individual are always more similar to each other than to any other sample.
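
The comparison described in this caption can be sketched as follows. This is a minimal illustration, assuming each profile is a mapping from prophage region to relative abundance and that a region missing from a sample counts as zero; the function names, sample names and values are hypothetical.

```python
import math


def euclidean(profile_a, profile_b):
    """Euclidean distance between two prophage abundance profiles.

    Assumption: a prophage region absent from a profile has abundance 0.
    """
    regions = set(profile_a) | set(profile_b)
    return math.sqrt(
        sum((profile_a.get(r, 0.0) - profile_b.get(r, 0.0)) ** 2 for r in regions)
    )


def nearest_sample(name, profiles):
    """Return the name of the other sample closest to `name`."""
    others = (s for s in profiles if s != name)
    return min(others, key=lambda s: euclidean(profiles[name], profiles[s]))


if __name__ == "__main__":
    # Made-up profiles: individual A at two time points, individual B once.
    profiles = {
        "A_t0": {"prophage_1": 0.7, "prophage_2": 0.3},
        "A_t1": {"prophage_1": 0.6, "prophage_2": 0.4},
        "B_t0": {"prophage_2": 0.1, "prophage_3": 0.9},
    }
    print(nearest_sample("A_t0", profiles))
```

In the figure's terms, the red dots correspond to distances between samples of the same individual, and the blue diamonds to the distance from a sample to its closest sample from anyone else.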

Figure 4

Analysis of prophage lysis. (a) A comparison of the PtoH ratio for the 25 most abundant refG-predicted prophages across the 252 samples. The PtoH ratio for a prophage is the abundance of the prophage divided by the mean abundance of the host chromosome. Each row represents a predicted prophage; the bacterial host of the prophage is indicated on the right of each row, and a triangle or square on the left of the row indicates whether the prophage contains an antibiotic resistance (Res.) gene or virulence factor (Fact.) gene: Strep (streptomycin resistance), Tetra (tetracycline resistance), Bact (bacitracin resistance), CLPP (caseinolytic protease), GMD (GDP-mannose 4,6-dehydratase). Each column represents one of the 252 gut metagenomic samples, and the colour of the cell indicates the log10 PtoH ratio, with yellow indicating that the prophage is lysogenized, blue indicating that the prophage is absent and red indicating that the prophage is in lysis. (b) Trends of the PtoH ratio over time for the 25 most abundant prophages. The PtoH ratio was plotted for the two individuals with samples from three different time points. Similar temporal variability was observed in the individuals that were only sampled twice. (c) Correlation of prophage lysis with bacterial abundances across 252 samples. The x axis shows the log10 PtoH ratio for the Eubacterium rectale prophage in the 252 samples. The y axis shows the log10 relative abundance of the host bacterium in the samples. The Spearman correlation coefficient and FDR-corrected _P_-value are shown in the upper right-hand corner. The negative correlation indicates that in samples where the prophage is in the lytic phase, the relative abundance of the bacterial host is lower, presumably due to bacterial lysis by the phage, indicating that the PtoH ratio reflects the lytic state of the phage.
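
The PtoH ratio and the rank correlation in panel (c) can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the host chromosome abundance is assumed to be available as a list of values to average, the function names are hypothetical, and the Spearman coefficient is computed as a Pearson correlation on average ranks. The FDR correction and any cut-offs for calling a prophage lytic are not covered here.

```python
import math


def ptoh_ratio(prophage_abundance, host_abundances):
    """Prophage abundance divided by the mean host chromosome abundance.

    Assumption: `host_abundances` holds abundance values measured along
    the host chromosome, which are averaged here.
    """
    host_mean = sum(host_abundances) / len(host_abundances)
    return prophage_abundance / host_mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up values: higher log10 PtoH paired with lower host abundance.
    log_ptoh = [math.log10(ptoh_ratio(p, [10.0, 20.0])) for p in (15.0, 30.0, 150.0)]
    log_host = [-1.0, -1.5, -3.0]
    print(spearman(log_ptoh, log_host))
```

A ratio near 1 means the prophage is covered about as deeply as its host chromosome, while a ratio well above 1 means extra phage copies relative to the host, which is the signal the caption interprets as lysis.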

Figure 5

Network of temperate phages and their bacterial hosts in the gut. The cladogram on the right indicates the taxonomy of the phage taxa that were found as integrated prophages. A line connects each phage taxon to its bacterial host on the left side. The taxon labels beside the cladograms indicate the leaves of the branches; internal nodes involved in interactions are labelled within the cladograms. The colour of the line indicates whether the interaction was determined from analysis of assembled metagenomic scaftigs (scaft-prophages; red and pink) or from analysis of reference bacterial genomes that occur in the gut (refG-prophages; blue and turquoise). In addition, red and dark blue indicate interactions that have not been reported in the literature, whereas pink and turquoise represent known interactions.
