Meta-Analysis

Meta-analyses of human gut microbes associated with obesity and IBD

William A Walters et al. FEBS Lett. 2014.

Abstract

Recent studies have linked human gut microbes to obesity and inflammatory bowel disease (IBD), but consistent signals have been difficult to identify. Here we test for indicator taxa and general features of the microbiota that are consistent across studies of obesity and of IBD, focusing on studies involving high-throughput sequencing of the 16S rRNA gene (which we could process using a common computational pipeline). We find that IBD has a consistent signature across studies, allowing IBD subjects to be classified from non-IBD subjects with high accuracy. In contrast, although subjects can be classified as lean or obese within each individual study with statistically significant accuracy, consistent with the ability of the microbiota to experimentally transfer this phenotype, signatures of obesity are not consistent between studies even when the data are analyzed with consistent methods. The results suggest that correlations between microbes and clinical conditions with different effect sizes (e.g. the large effect size of IBD versus the small effect size of obesity) may require different cohort selection and analysis strategies.

Keywords: Gut microbiota; Metaanalysis.

Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

Figures

Figure 1. Relative abundance of phylum-level gut microbial taxa

Studies listed below are Zupancic [25], Wu [28], Human microbiome project [44], Turnbaugh [8], and Yatsunenko [45].

Figure 2. Ratios of Firmicutes:Bacteroidetes in normal versus obese BMI subjects

Means of ratios for each study/BMI category are shown. Error bars are standard error of the mean. The Turnbaugh study includes a number of samples with extremely low Bacteroidetes, leading to large standard error values.
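The effect described in this caption, where a few samples with very low Bacteroidetes inflate the standard error, can be illustrated with a minimal sketch. The function name, the toy counts, and the choice to skip samples with zero Bacteroidetes are assumptions made for illustration and are not taken from the paper.

```python
import math
import statistics


def fb_ratio_summary(samples):
    """Return (mean, SEM) of per-sample Firmicutes:Bacteroidetes ratios.

    `samples` is a list of (firmicutes_count, bacteroidetes_count) pairs.
    Samples with zero Bacteroidetes are skipped because the ratio is
    undefined there (an assumption of this sketch).
    """
    ratios = [f / b for f, b in samples if b > 0]
    if len(ratios) < 2:
        raise ValueError("need at least two samples with nonzero Bacteroidetes")
    mean = statistics.mean(ratios)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    sem = statistics.stdev(ratios) / math.sqrt(len(ratios))
    return mean, sem


# Hypothetical counts: three ordinary samples, then one with almost no Bacteroidetes.
typical = [(600, 300), (500, 200), (400, 250)]
with_outlier = typical + [(900, 1)]

print(fb_ratio_summary(typical))
print(fb_ratio_summary(with_outlier))
```

In this toy data a single sample with a ratio of 900 dominates both the mean and the standard error, which is the kind of behavior the caption attributes to the Turnbaugh samples.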

Figure 3. Alpha diversity (observed species) across studies

Metric is observed species (counts of unique OTUs). Sequence depth is 1000 sequences per sample, and subsampling was performed 10 times. P-values were calculated using a Monte Carlo simulation with 999 permutations.
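The procedure in this caption (subsample each sample to a fixed depth several times, count unique OTUs, then compare groups by permutation) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the test statistic used here (absolute difference in group means) is an assumption, since the caption does not state which statistic was permuted.

```python
import random
import statistics


def observed_species(otu_counts, depth, n_subsamples, rng):
    """Mean number of distinct OTUs over repeated subsamples drawn without replacement.

    `otu_counts` maps an OTU identifier to its sequence count in one sample.
    """
    pool = [otu for otu, count in otu_counts.items() for _ in range(count)]
    if len(pool) < depth:
        raise ValueError("sample has fewer sequences than the rarefaction depth")
    richness = [len(set(rng.sample(pool, depth))) for _ in range(n_subsamples)]
    return statistics.mean(richness)


def permutation_p_value(group_a, group_b, n_permutations, rng):
    """Two-sided Monte Carlo p-value for a difference in group means."""
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        a = pooled[:len(group_a)]
        b = pooled[len(group_a):]
        if abs(statistics.mean(a) - statistics.mean(b)) >= observed:
            hits += 1
    # Counting the observed arrangement itself keeps the p-value above zero.
    return (hits + 1) / (n_permutations + 1)
```

With the caption's settings one would call `observed_species(counts, 1000, 10, random.Random(0))` for each sample and then `permutation_p_value(normal_values, obese_values, 999, random.Random(0))`; with 999 permutations the smallest attainable p-value is 0.001.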

Figure 4. Alpha diversity (Shannon) across studies

Metric is Shannon (abundance and evenness). Sequence depth is 1000 sequences per sample, and subsampling was performed 10 times. P-values were calculated using a Monte Carlo simulation with 999 permutations.
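For readers unfamiliar with the metric, the Shannon index is H = -Σ p_i log(p_i), where p_i is the proportion of sequences assigned to OTU i. A minimal sketch follows; the default logarithm base of 2 is an assumption, since the caption does not say which base was used.

```python
import math


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no sequences")
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p) / math.log(base)
    return h


# Same richness (four OTUs), different evenness.
even = [25, 25, 25, 25]
uneven = [97, 1, 1, 1]
print(shannon_index(even), shannon_index(uneven))
```

Both toy samples contain four OTUs, so they have the same observed-species value, but the even community has the higher Shannon index. This is why the metric is described as reflecting both abundance and evenness.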

Figure 5. Alpha diversity (PD) for Turnbaugh et al. data [8] across clustering methods

Metric is phylogenetic diversity (a measure of branch length of the phylogenetic tree occupied by the sequences present in the samples). Sequence depth is 1000 sequences per sample, and subsampling was performed 10 times. P-values were calculated using a Monte Carlo simulation with 999 permutations.

Figure 6. Clustering of BMI samples with unweighted UniFrac

Shape/color key by study and BMI category: purple square, Zupancic normal; brown triangle, Zupancic obese; orange square, Turnbaugh normal; pink triangle, Turnbaugh obese; red circle, Wu normal; yellow diamond, Wu obese; dark blue circle, HMP normal; light blue triangle, HMP obese; green triangle, Yatsunenko normal; grey circle, Yatsunenko obese.

Figure 7. Comparison of supervised learning error ratios to clustering identity of data

Samples matching those used in Knights et al. [9] replicated the improved classification relative to random guessing (value of 1) for lean and obese subjects in the Turnbaugh [8] study; these are shown as the red line. The blue line shows the average error ratio for a random subsample of 30 obese and lean samples (subsampled 10× at each percent identity). The purple line shows the classification error ratio when all samples are used (61 lean versus 196 obese samples), which is essentially no better than random guessing at any clustering identity. The sequences were clustered using a de novo approach for each percent identity listed.
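To make the "value of 1" baseline concrete, here is a small sketch of an error ratio. It assumes the ratio is the baseline error (always guessing the most common class) divided by the classifier's estimated error, so that 1 means no improvement and larger values mean better classification. The caption does not spell out the direction of the ratio, so treat this definition and the function names as assumptions.

```python
def baseline_error(class_counts):
    """Error rate of always predicting the most common class."""
    total = sum(class_counts.values())
    if total <= 0:
        raise ValueError("no samples")
    return 1.0 - max(class_counts.values()) / total


def error_ratio(class_counts, classifier_error):
    """Baseline error divided by classifier error; 1.0 means no improvement."""
    if classifier_error <= 0:
        raise ValueError("classifier error must be positive")
    return baseline_error(class_counts) / classifier_error


# Class sizes from the caption: 61 lean versus 196 obese samples.
all_samples = {"lean": 61, "obese": 196}
print(baseline_error(all_samples))     # 61/257, roughly 0.237
print(error_ratio(all_samples, 0.12))  # 0.12 is a hypothetical classifier error
```

Under this definition, the unbalanced full data set already has a low baseline error because always guessing "obese" is right for 196 of 257 samples, so a classifier must do considerably better than that to push the ratio above 1.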

Figure 8. Receiver operator characteristic curve values for all Turnbaugh lean and obese subjects

The average ROC area under the curve (AUC) value (using the random forest method) for each clustering identity was calculated by averaging the optimized ROC values from 5 repeats of 10-fold cross-validation. A value of 0.5 indicates performance no better than random guessing, while 1.0 indicates perfect sensitivity and specificity. The sequences were clustered using a de novo approach for each percent identity listed.
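The interpretation of 0.5 and 1.0 can be checked with a short sketch that computes the area under the ROC curve as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count as one half). This rank-based formulation is a standard equivalent of the AUC; it is not the random forest pipeline used for the figure, and the function name and toy scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 0]
print(roc_auc(labels, [0.9, 0.8, 0.2, 0.1]))  # perfectly separated scores
print(roc_auc(labels, [0.5, 0.5, 0.5, 0.5]))  # uninformative scores
```

Averaging the values returned for each cross-validation repeat (for example with `statistics.mean`) would give a per-identity summary analogous to the one plotted.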

Figure 9. Phylum-level taxa plots for IBD subjects versus healthy controls

Seven most abundant phyla shown. HC=healthy controls, UC=ulcerative colitis, CCD=colonic Crohn's disease, ICD=ileal Crohn's disease. Error bars indicate standard error of the mean.

Figure 10. PCoA plots of healthy controls versus subjects with IBD

Distances were calculated with unweighted UniFrac. A. HC vs UC samples. B. HC vs CCD samples. C. HC vs ICD samples. Distances between healthy controls and all IBD categories are significantly different (p-value < 0.050, PERMANOVA tests with 999 permutations). Data were evenly sampled at 1004 sequences per sample.

Figure 10

Figure 10. PCoA plots of healthy controls versus subjects with IBD

Distances were calculated with unweighted UniFrac. A-HC vs UC samples B. HC vs CCD samples C. HC vs ICD samples. Distances between healthy controls and all IBD categories are significantly different (p-value < 0.050 with PERMANOVA tests (999 permutations). Data were evenly sampled at 1004 sequences per sample.

Figure 10

Figure 10. PCoA plots of healthy controls versus subjects with IBD

Distances were calculated with unweighted UniFrac. A-HC vs UC samples B. HC vs CCD samples C. HC vs ICD samples. Distances between healthy controls and all IBD categories are significantly different (p-value < 0.050 with PERMANOVA tests (999 permutations). Data were evenly sampled at 1004 sequences per sample.

Figure 11. Alpha diversity for IBD subjects and healthy controls

Y-axis indicates observed species value. Healthy control samples are significantly different from all inflammatory bowel disease categories with a p-value of < 0.05 (Monte Carlo permutation test, permutations = 999). The samples were repeatedly sampled (10×) at 1000 sequences/sample.

References

    1. Lozupone CA, Stombaugh J, Gonzalez A, Ackermann G, Wendel D, Vazquez-Baeza Y, Jansson JK, Gordon JI, Knight R. Meta-analyses of studies of the human microbiota. Genome research. 2013;23:1704–14.
    1. Liu Z, DeSantis TZ, Andersen GL, Knight R. Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers. Nucleic acids research. 2008;36:e120.
    1. DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P, Andersen GL. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and environmental microbiology. 2006;72:5069–72.
    1. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26:2460–1.
    1. Lozupone C, Knight R. UniFrac: a new phylogenetic method for comparing microbial communities. Applied and environmental microbiology. 2005;71:8228–35.