Microbiome connections with host metabolism and habitual diet from 1,098 deeply phenotyped individuals
Clinical Trial
Nat Med. 2021 Feb;27(2):321-332.
doi: 10.1038/s41591-020-01183-8. Epub 2021 Jan 11.
Francesco Asnicar, Sarah E Berry, Ana M Valdes, Long H Nguyen, Gianmarco Piccinno, David A Drew, Emily Leeming, Rachel Gibson, Caroline Le Roy, Haya Al Khatib, Lucy Francis, Mohsen Mazidi, Olatz Mompeo, Mireia Valles-Colomer, Adrian Tett, Francesco Beghini, Léonard Dubois, Davide Bazzani, Andrew Maltez Thomas, Chloe Mirzayi, Asya Khleborodova, Sehyun Oh, Rachel Hine, Christopher Bonnett, Joan Capdevila, Serge Danzanvilliers, Francesca Giordano, Ludwig Geistlinger, Levi Waldron, Richard Davies, George Hadjigeorgiou, Jonathan Wolf, José M Ordovás, Christopher Gardner, Paul W Franks, Andrew T Chan, Curtis Huttenhower, Tim D Spector, Nicola Segata
- PMID: 33432175
- PMCID: PMC8353542
- DOI: 10.1038/s41591-020-01183-8
Francesco Asnicar et al. Nat Med. 2021 Feb.
Abstract
The gut microbiome is shaped by diet and influences host metabolism; however, these links are complex and can be unique to each individual. We performed deep metagenomic sequencing of 1,203 gut microbiomes from 1,098 individuals enrolled in the Personalised Responses to Dietary Composition Trial (PREDICT 1) study, whose detailed long-term diet information, as well as hundreds of fasting and same-meal postprandial cardiometabolic blood marker measurements were available. We found many significant associations between microbes and specific nutrients, foods, food groups and general dietary indices, which were driven especially by the presence and diversity of healthy and plant-based foods. Microbial biomarkers of obesity were reproducible across external publicly available cohorts and in agreement with circulating blood metabolites that are indicators of cardiovascular disease risk. While some microbes, such as Prevotella copri and Blastocystis spp., were indicators of favorable postprandial glucose metabolism, overall microbiome composition was predictive for a large panel of cardiometabolic blood markers including fasting and postprandial glycemic, lipemic and inflammatory indices. The panel of intestinal species associated with healthy dietary habits overlapped with those associated with favorable cardiometabolic and postprandial markers, indicating that our large-scale resource can potentially stratify the gut microbiome into generalizable health levels in individuals without clinically manifest disease.
Conflict of interest statement
TD Spector, SE Berry, AM Valdes, F Asnicar, PW Franks, C Huttenhower, and N Segata are consultants to Zoe Global Ltd (“Zoe”). J Wolf, G Hadjigeorgiou, R Davies, J Capdevila, C Bonnett, R Hine, L Francis, F Giordano, and S Danzanvilliers are or have been employees of Zoe. The other authors have no conflict of interest to declare.
Figures
Extended Data Fig. 1. Alpha diversity linked with personal factors, habitual diet, fasting, and postprandial markers.
a, Microbiome alpha diversity computed using the Shannon index correlated with markers from the four categories: personal, habitual diet, fasting, and postprandial. Reported are the five strongest positive and negative Spearman correlations for each category with p < 0.05. All correlations and p-values are available in Supplementary Table 1. b, Inter-sample microbiome distances (beta diversity) were substantially lower, that is closer, among samples from the same individual (two weeks apart) than among samples from different individuals. Gut microbial communities in monozygotic twins were slightly more similar than in dizygotic twins (Mann–Whitney U test two-sided p = 0.06), which, in turn, were more similar than unrelated individuals (p < 1e-12), even after adjusting for age (p < 1e-12). c, After excluding twin status (that is, non-twin vs. monozygotic vs. dizygotic twins) from the model, personal factors still accounted for the greatest proportion of variance explained in overall microbial diversity, followed by dietary habits, fasting and postprandial cardiometabolic blood markers (by cumulative stepwise dbRDA). d, Cumulative (left bars) and individual (right bars) contributions for each metadata variable based on Bray-Curtis dissimilarity. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range.
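The two diversity measures named in this caption can be illustrated with a minimal sketch. It assumes the natural logarithm for the Shannon index (the caption does not state the base), and the abundance vectors are hypothetical, not study data.

```python
import math

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero abundance."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator

# Hypothetical abundance profiles over the same four taxa.
sample_even = [25, 25, 25, 25]
sample_skewed = [97, 1, 1, 1]

alpha_even = shannon_index(sample_even)      # highest possible for 4 taxa: ln(4)
alpha_skewed = shannon_index(sample_skewed)  # lower, one taxon dominates
beta = bray_curtis(sample_even, sample_skewed)
```

An evenly distributed community reaches the maximum Shannon value for its number of taxa, while a Bray-Curtis value of 0 means identical profiles and 1 means no shared abundance.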
Extended Data Fig. 2. Species-level correlation with single foods.
The figure shows the species-level correlations (Spearman) with single food quantities as estimated from the food frequency questionnaires. Only foods with at least 5 significant associations (q-value ≤ 0.2) are displayed. Species are sorted by the number of significant associations, and the top 30 are reported in the figure.
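The q-value threshold used here implies a multiple-testing correction. The caption does not name the procedure, so the sketch below assumes the Benjamini-Hochberg method, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# Hypothetical p-values for four species-food pairs.
pvals = [0.01, 0.04, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
n_significant = sum(q <= 0.2 for q in qvals)
```

With these inputs, three of the four pairs pass the q ≤ 0.2 threshold.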
Extended Data Fig. 3. Top foods, food groups, nutrients, and dietary patterns validated in the PREDICT 1 US cohort.
The RF regression model trained on the PREDICT 1 UK cohort was applied to the PREDICT 1 US participants, validating the associations with food-related variables found in the PREDICT 1 UK cohort.
Extended Data Fig. 4. Performance of Random Forest regression and classification on microbiome functional potential in predicting fasting measurements, total cholesterol and triglycerides in different lipoproteins.
The figure shows the performance of RF regression and classification models trained on microbiome gene family profiles. a, Prediction of the fasting measurements presented in Fig. 4a, sorted as in Fig. 4a. b,c, Prediction performance for total cholesterol (b) and triglycerides (c) in lipoproteins of different sizes. For each lipoprotein, we considered its concentration at both fasting and postprandial (6 h) time points, as well as the difference (rise) between the postprandial and fasting concentrations. Box plots show the distribution of the Spearman correlations (left axis) between real and predicted values using RF regression. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range. Circles show the median AUC (right axis) of RF classification in predicting the bottom quartile of the distribution vs. the top quartile.
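The classification task described here (bottom vs. top quartile, scored by AUC) can be sketched without any model. The code below only illustrates the evaluation step, using the rank-based definition of AUC; the measurements and classifier scores are hypothetical, and the quartile cut-offs use Python's default `statistics.quantiles` method, which is an assumption rather than the study's choice.

```python
import statistics

def quartile_labels(values):
    """Label each value 1 (top quartile), 0 (bottom quartile) or None (middle half)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return [1 if v >= q3 else 0 if v <= q1 else None for v in values]

def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical marker values and hypothetical classifier scores.
marker = [1, 2, 3, 4, 5, 6, 7, 8]
labels = quartile_labels(marker)
scores = [0.1, 0.4, 0.5, 0.5, 0.5, 0.5, 0.35, 0.8]
result = auc(scores, labels)
```

Samples in the middle half of the distribution receive no label and are ignored by `auc`, mirroring a classifier that only separates the two extreme quartiles.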
Extended Data Fig. 5. Distributions of BMI in each curatedMetagenomicData dataset.
The figure shows the distributions of BMI values for the datasets available in curatedMetagenomicData. These distributions were used to select the datasets with a BMI range comparable to that of the PREDICT 1 UK dataset (IQR of 5.5), defined as an interquartile range between 3.5 and 7.5, for use as validation datasets for the associations found. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range.
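The selection rule in this caption (keep datasets whose BMI interquartile range lies between 3.5 and 7.5) can be written as a small filter. This is a sketch only: the dataset names and BMI values are hypothetical, and the quartile interpolation method is an assumption.

```python
import statistics

def iqr(values):
    """Interquartile range using inclusive quartile interpolation."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def comparable_datasets(bmi_by_dataset, low=3.5, high=7.5):
    """Names of datasets whose BMI IQR falls inside [low, high]."""
    return [name for name, bmi in bmi_by_dataset.items() if low <= iqr(bmi) <= high]

# Hypothetical BMI values per dataset.
bmi_by_dataset = {
    "narrow": [22, 22.5, 23, 23.5, 24],   # IQR 1.0, too homogeneous
    "similar": [20, 22, 24, 26, 28],      # IQR 4.0, comparable
    "wide": [18, 24, 30, 36, 42],         # IQR 12.0, too heterogeneous
}
selected = comparable_datasets(bmi_by_dataset)
```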
Extended Data Fig. 6. Pairwise partial Spearman correlations between bacterial species and total lipids and cholesterol in lipoproteins.
a, The heatmap shows the species-level correlations with total lipids in lipoprotein variables at fasting, postprandial (6 h), and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, FDR-corrected, with q < 0.2. b, The heatmap shows the species-level correlations with total cholesterol in lipoprotein variables at fasting, postprandial (6 h), and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, FDR-corrected, with q < 0.2. All correlations, p-values, and q-values are available in Supplementary Table 6.
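A partial Spearman correlation measures the rank association between two variables after removing the contribution of one or more covariates. The sketch below uses the standard recursive partial-correlation formula on ranks; it is an illustration under that assumption, not the study's implementation, and all values are hypothetical.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def partial_spearman(x, y, covariates):
    """Spearman correlation of x and y controlling for the given covariates."""
    def partial(a, b, controls):
        if not controls:
            return pearson(a, b)
        z, rest = controls[-1], controls[:-1]
        r_ab, r_az, r_bz = partial(a, b, rest), partial(a, z, rest), partial(b, z, rest)
        return (r_ab - r_az * r_bz) / math.sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))
    return partial(ranks(x), ranks(y), [ranks(z) for z in covariates])

# Hypothetical species abundance, blood marker and a single covariate.
species = [1, 2, 3, 4, 5]
marker = [2, 1, 4, 3, 5]
covariate = [1, 3, 2, 5, 4]
unadjusted = partial_spearman(species, marker, [])
adjusted = partial_spearman(species, marker, [covariate])
```

With an empty covariate list the function reduces to the ordinary Spearman correlation. The formula is undefined when a variable is perfectly rank-correlated with a covariate, which this sketch does not guard against.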
Extended Data Fig. 7. Species-level correlations with triglycerides in lipoproteins.
The heatmap shows the species-level correlations with triglycerides in lipoprotein variables at fasting, postprandial (6 h), and the difference (rise) between the postprandial and fasting concentrations. The 30 species with the highest number of significant associations (FDR ≤ 0.2) are shown. The asterisk indicates a significant correlation between species and metadata variable using a two-sided t-test, FDR-corrected, with q < 0.2. All correlations, p-values, and q-values are available in Supplementary Table 6.
Extended Data Fig. 8. Pairwise partial Spearman correlations between bacterial gene families and pathway abundances with clinical and metabolic risk scores, glycaemic and inflammatory measures, and lipoproteins.
a, The heatmap shows gene family correlations with the set of metadata presented in Fig. 5a–c, reporting the top 2,000 genes, selected among those with at least 20% prevalence, by their number of significant correlations (q < 0.2). Gene family correlations show the same clusters as the species-level correlations in Fig. 5a–c. b, The heatmap shows pathway abundance correlations with the set of metadata presented in Fig. 5a–c, reporting all the pathways at 20% prevalence (349 in total). Pathway abundance correlations show the same cluster structure as the species-level correlations in Fig. 5a–c.
Extended Data Fig. 9. Concordance of Random Forest scores with species-level partial correlations.
Volcano plots of the scores assigned to each species by Random Forest and their partial correlation, showing an overall concordance between the two independent approaches. We considered the top 5 metadata variables for the six metadata categories: a, Foods, bacon (g) (corr. 0.49), garlic (g) (corr. 0.424), unsalted nuts (g) (corr. 0.422), dairy dessert (g) (corr. 0.421), salted nuts (g) (corr. 0.395). b, Food groups, nuts (corr. 0.468), tea and coffee (corr. 0.436), meat (corr. 0.42), legumes (corr. 0.374), vegetables (corr. 0.371). c, Nutrients, lactose (corr. 0.442), niacin (corr. 0.381), maltose (corr. 0.361), sucrose (corr. 0.344), total carbohydrates (corr. 0.324). d, Nutrients normalized by daily energy intake, magnesium (corr. 0.472), starch (corr. 0.436), total carbohydrates (corr. 0.422), non-starch polysaccharides (NSP) (corr. 0.421), lactose (corr. 0.414). e, Dietary patterns, healthy plant percentage (corr. 0.492), healthy PDI (corr. 0.472), HEI score (corr. 0.47), HFD (corr. 0.408), total plants percentage (corr. 0.388). f, Lipoproteins, M-HDL-L 6 h rise (corr. 0.406), IDL-C 6 h (corr. 0.4), HDL-L 6 h rise (corr. 0.397), XL-HDL-C 0 h (corr. 0.395), Total Cholesterol 4 h rise (corr. 0.391).
Extended Data Fig. 10. Prevotella copri and/or Blastocystis presence are indicators of a more favourable postprandial glucose response to meals.
a–c, Differential analysis of visceral fat, HFD and glucose iAUC 2 h after standardised breakfast according to the presence or absence of one or both of P. copri and Blastocystis. The analysis reveals that both these species are indicators of reduced visceral fat, good cholesterol and meal-driven increase of glucose. d,e, Differential analysis of C-peptide and triglycerides at different time points according to the presence or absence of one or both of P. copri and Blastocystis. The distributions of the concentrations for C-peptide and triglycerides were typically lower when one or both are absent. An asterisk between two box plots represents a significant p-value (p < 0.05) according to the two-sided Mann-Whitney U test. Box plots show first and third quartiles (boxes) and the median (middle line); whiskers extend up to 1.5× the interquartile range. P-values are available in Supplementary Table 8.
Fig. 1. The PREDICT 1 study associates gut microbiome structure with habitual diet and blood cardiometabolic markers.
(A) The PREDICT 1 study assessed the gut microbiome of 1,098 volunteers from the UK and US via metagenomic sequencing of stool samples. Phenotypic data obtained through in-person assessment, blood/biospecimen collection, and the return of validated study questionnaires queried a range of relevant host/environmental factors including (1) personal characteristics, such as age, BMI, and estimated visceral fat; (2) habitual dietary intake using semi-quantitative food frequency questionnaires (FFQs); (3) fasting; and (4) postprandial cardiometabolic blood and inflammatory markers, total lipid and lipoprotein concentrations, lipoprotein particle sizes, apolipoproteins, derived metabolic risk scores, glycaemic-mediated metabolites, and metabolites related to fatty acid metabolism. (B) Overall microbiome alpha diversity, estimated as the total number of confidently identified microbial species in a given sample (richness), was correlated with HDL-D (positive) and estimated hepatic steatosis (negative). The five strongest positive and negative Spearman’s correlations with q<0.05 are reported for each of the four categories. Top species based on Shannon diversity are reported in Extended Data Fig. 1A and all correlations are reported in Supplementary Table 1.
Fig. 2. Food quality, regardless of source, is linked to overall and feature-level composition of the gut microbiome.
(A) Specific components of habitual diet comprising foods, nutrients, and dietary indices are linked to the composition of the gut microbiome with variable strengths as estimated by machine learning regression and classification models. Boxplots report the correlation between the real value of each component and the value predicted by regression models across 100 training/testing folds (Methods). Circles denote median area-under-the-curve (AUC) values across 100 folds for a corresponding binary classifier between the highest and lowest quartiles (Methods). (B) Single Spearman’s correlations adjusted for BMI and age between microbial species and components of habitual diet with asterisks denoting significant associations (FDR q<0.2). The 30 microbial species with the highest number of significant associations across habitual diet categories are reported. All indices of dietary patterns are reported, whereas only food groups and nutrients (energy-adjusted) with at least 7 associations among the top 30 microbial species are reported (NSP: non-starch polysaccharides). Rows and columns are hierarchically clustered (complete linkage, Euclidean distance). Full heatmaps of foods and unadjusted nutrients are reported in Extended Data Fig. 2, and the full set of correlations is available in Supplementary Table 5. (C) Number of significant positive and negative associations (Spearman’s correlation p<0.2) between foods and taxa categorized by more and less healthy plant-based foods and more and less healthy animal-based foods according to the PDI. Taxa shown are the 20 species with the highest total number of significant associations regardless of category. (D) The association between the gut microbiome and coffee consumption in UK participants is dose-dependent, i.e. stronger when assessing heavy (e.g. >4 cups/d) vs. never drinkers, and was validated in the US cohort when applying the UK model. The reported ROC curves represent the performance of the classifier at varying classification thresholds with respect to the True Positive Rate (i.e. recall) and the False Positive Rate (i.e. 1 − specificity). (E-F) Among general dietary patterns and indices, the Healthy Food Diversity index (HFD) and the Alternate Mediterranean Diet score (aMED) were validated in the US cohort, thus showing consistency between the two populations on these two important dietary indices. Other validations of the UK model applied to the US cohort are reported in Extended Data Fig. 3.
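An ROC curve is traced by sweeping the classification threshold and recording, at each step, the True Positive Rate (TP / all positives) and the False Positive Rate (FP / all negatives). The sketch below shows that computation on hypothetical scores and labels; it is not the study's code.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per distinct threshold, strictest first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(p and l == 1 for p, l in zip(predicted, labels))
        fp = sum(p and l == 0 for p, l in zip(predicted, labels))
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical classifier scores; label 1 = heavy coffee drinker, 0 = never drinker.
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 0]
curve = roc_points(scores, labels)
```

Both rates are normalized by the true class sizes, so the curve always runs from (0, 0) to (1, 1) regardless of class balance.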
Fig. 3. Random forest machine learning models trained on microbial or functional profiles are capable of predicting obesity phenotypic markers, even on independent cohorts.
(A) Whole-microbiome machine learning models can assess personal factors with RF regression (boxplots and left-side y-axis) using only taxonomic or functional (i.e. pathway) microbiome features. Classification models (circles and right-side y-axis) exceed AUC 0.65 except for waist-to-hip ratio (WHR) and smoking. (B) We observed the highest correlations between the relative abundance of microbial species and age, BMI, and visceral fat. The link between microbial features and visceral fat was of greater effect and more often significant than with traditional BMI. (C) Using several independent datasets we confirmed correlations between single microbial species and BMI, with blue points denoting significant associations at p<0.05. (D) The machine learning model for BMI trained on PREDICT 1 data is reproducible in several external datasets (Extended Data Fig. 5), achieving correlations with true values exceeding those obtained in cross-validation of a single given dataset in five of seven cases. When the PREDICT 1 microbiome model is expanded to include other datasets (excluding those used for testing, i.e. a leave-one-dataset-out, or LODO, approach), the performance remains comparable, confirming the generalizability of the PREDICT 1 model on obesity-related indicators.
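The leave-one-dataset-out (LODO) scheme mentioned in panel D can be sketched as a split generator: each dataset is held out once for testing while all the others are pooled for training. The dataset names below are hypothetical placeholders.

```python
def leave_one_dataset_out(dataset_names):
    """Yield (training_datasets, held_out_dataset) pairs, one per dataset."""
    for held_out in dataset_names:
        training = [name for name in dataset_names if name != held_out]
        yield training, held_out

# Hypothetical cohort names; a model would be fit on `training` and scored on `held_out`.
splits = list(leave_one_dataset_out(["cohort_A", "cohort_B", "cohort_C"]))
```

Because no sample from the held-out dataset is ever seen during training, the resulting score reflects transfer across cohorts rather than within-cohort fit.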
Fig. 4. Fasting and postprandial cardiometabolic responses to standardized test meals associated with the microbiome.
(A) The strongest observed links according to correlation of the predicted versus collected measures between the gut microbiome and fasting metabolic blood markers. For measures of lipid concentration in lipoproteins, we report the five strongest correlations only. Indices are grouped in nine distinct categories, and boxplots report the correlation between the prediction of RF regression models trained on microbial taxa or pathway abundances across 100 training/testing folds and stars report regressor performance when trained on the UK cohort and evaluated on the independent US validation cohort (left-side y-axis). Circles denote AUC values for RF classification (right-side y-axis). (B-F) Performance of our microbiome-based ML model in estimating postprandial absolute levels and postprandial increases in cardiometabolic markers. Stars denote regression model results in our US validation cohort for postprandial measurements (not rises; Extended Data Fig. 4B-C). (B) RF regression and classification performance in predicting postprandial metabolic responses for clinic Meal 1 (breakfast) measured as iAUC at 6h for triglycerides and iAUC at 2h for glucose, C-peptide, and insulin. (C) Glycaemic-mediated postprandial iAUCs at 2h for the other meals (Supplementary Table 7), and (D) glycaemic-mediated markers absolute levels vs. rise. (E) Postprandial inflammatory measures (concentration and rise). (F) Postprandial lipoprotein measures (6h concentration and rise). (G) Overall agreement between RF regression and classification tasks for UK models applied to the independent US cohort. (H) RF microbiome-based model performance with postprandial changes (concentrations and rise) in lipoprotein concentration, composition, and size. Fasting and postprandial performance indices (correlation of the regressors’ outputs) were more tightly linked to gut community structure than were their corresponding postprandial rises.
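The incremental area under the curve (iAUC) summarizes a postprandial response as the area above the fasting baseline. The sketch below is a simplified version that assumes the first measurement is the baseline, clips values below it to zero, and applies the trapezoid rule; the study's exact iAUC definition is not given here, and the glucose readings are hypothetical.

```python
def incremental_auc(times, values):
    """Trapezoid-rule area above the baseline (first value); dips below count as zero."""
    baseline = values[0]
    increments = [max(v - baseline, 0.0) for v in values]
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (increments[i] + increments[i - 1]) / 2
    return area

# Hypothetical glucose readings (mmol/L) at 0, 30, 60 and 120 minutes after a meal.
times_min = [0, 30, 60, 120]
glucose = [5.0, 7.0, 6.0, 5.0]
glucose_iauc_2h = incremental_auc(times_min, glucose)  # units: mmol/L x min
```

Clipping before integration slightly overestimates the area where the curve crosses the baseline between two samples; a stricter implementation would interpolate the crossing point.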
Fig. 5. Species-level segregation into healthy and unhealthy microbial signatures of fasting and postprandial cardiometabolic markers.
(A) Associations (Spearman’s correlation, q<0.2 marked with stars) between single microbial species and fasting clinical risk measures and (B) glycaemic, inflammatory, and lipaemic indices. (C) Correlation between microbial species and the iAUC for glucose and C-peptide estimations based on clinical measurements before and after standardized meals. The 30 species with the highest number of significant correlations with distinct fasting and postprandial indices are shown. Rows are hierarchically clustered (complete linkage, Euclidean distance). (D) Microbe-metabolite correlations are very consistent when evaluated for fasting versus postprandial (6h) conditions (left panel). Associations with postprandial variations (rise) conversely often show opposing relationships, with several species positively correlated with fasting measures being negatively correlated with postprandial variation of the same metabolite (or vice versa, central panel). This was mitigated somewhat when comparing absolute postprandial responses with rise (right panel).
Fig. 6. The panel of 30 species showing the strongest overall correlations with a selection of markers of nutritional and cardiometabolic health.
The 30 species with the highest and lowest average ranks with diverse positive and negative cardiometabolic health and healthy diet indicators, respectively, are shown here. The rank of each microbe’s correlation with individual indicators is written within cells when significant (p<0.05). For each of the main categories of indices, we selected up to five representative markers (for “Personal” we considered only four as the remaining were highly correlated with visceral fat or not relevant in this context). Indices can be considered “positive” and “negative” depending on whether higher or lower values are a proxy for more or less healthy conditions.
Comment in
- Do diet and microbes really 'PREDICT' cardiometabolic risks?
Cani PD, Van Hul M. Nat Rev Endocrinol. 2021 May;17(5):259-260. doi: 10.1038/s41574-021-00480-7. PMID: 33627837. No abstract available.
- Do Prevotella copri and Blastocystis promote euglycaemia?
Janket SJ, Conte HA, Diamandis EP. Lancet Microbe. 2021 Nov;2(11):e565-e566. doi: 10.1016/S2666-5247(21)00215-9. Epub 2021 Sep 17. PMID: 35544079. No abstract available.
Similar articles
- The gut microbiome modulates the protective association between a Mediterranean diet and cardiometabolic disease risk.
Wang DD, Nguyen LH, Li Y, Yan Y, Ma W, Rinott E, Ivey KL, Shai I, Willett WC, Hu FB, Rimm EB, Stampfer MJ, Chan AT, Huttenhower C. Nat Med. 2021 Feb;27(2):333-343. doi: 10.1038/s41591-020-01223-3. Epub 2021 Feb 11. PMID: 33574608. Free PMC article.
- Acute and chronic improvement in postprandial glucose metabolism by a diet resembling the traditional Mediterranean dietary pattern: Can SCFAs play a role?
Vitale M, Giacco R, Laiola M, Della Pepa G, Luongo D, Mangione A, Salamone D, Vitaglione P, Ercolini D, Rivellese AA. Clin Nutr. 2021 Feb;40(2):428-437. doi: 10.1016/j.clnu.2020.05.025. Epub 2020 Jun 3. PMID: 32698959. Clinical Trial.
- Distinct Genetic and Functional Traits of Human Intestinal Prevotella copri Strains Are Associated with Different Habitual Diets.
De Filippis F, Pasolli E, Tett A, Tarallo S, Naccarati A, De Angelis M, Neviani E, Cocolin L, Gobbetti M, Segata N, Ercolini D. Cell Host Microbe. 2019 Mar 13;25(3):444-453.e3. doi: 10.1016/j.chom.2019.01.004. Epub 2019 Feb 21. PMID: 30799264.
- Micronutrients impact the gut microbiota and blood glucose.
Barra NG, Anhê FF, Cavallari JF, Singh AM, Chan DY, Schertzer JD. J Endocrinol. 2021 Jul 28;250(2):R1-R21. doi: 10.1530/JOE-21-0081. PMID: 34165440. Review.
- Dietary impact on fasting and stimulated GLP-1 secretion in different metabolic conditions - a narrative review.
Huber H, Schieren A, Holst JJ, Simon MC. Am J Clin Nutr. 2024 Mar;119(3):599-627. doi: 10.1016/j.ajcnut.2024.01.007. Epub 2024 Jan 11. PMID: 38218319. Free PMC article. Review.
Cited by
- Observation on clinical effect of Huoxue-Jiangtang decoction formula granules in treating prediabetes: a randomized prospective placebo-controlled double-blind trial protocol.
Zhang PX, Zeng L, Meng L, Li HL, Zhao HX, Liu DL. BMC Complement Med Ther. 2022 Oct 19;22(1):274. doi: 10.1186/s12906-022-03755-2. PMID: 36261813. Free PMC article.
- Dietary habits and the gut microbiota in military Veterans: results from the United States-Veteran Microbiome Project (US-VMP).
Brostow DP, Stamper CE, Stanislawski MA, Stearns-Yoder KA, Schneider A, Postolache TT, Forster JE, Hoisington AJ, Lowry CA, Brenner LA. Gut Microbiome (Camb). 2021 Apr 28;2:e1. doi: 10.1017/gmb.2021.1. eCollection 2021. PMID: 39296320. Free PMC article.
- Vitamin A carotenoids, but not retinoids, mediate the impact of a healthy diet on gut microbial diversity.
Valdes AM, Louca P, Visconti A, Asnicar F, Bermingham K, Nogal A, Wong K, Michelotti GA, Wolf J, Segata N, Spector TD, Berry SE, Falchi M, Menni C. BMC Med. 2024 Aug 7;22(1):321. doi: 10.1186/s12916-024-03543-4. PMID: 39113058. Free PMC article.
- The Potential Role of SCFAs in Modulating Cardiometabolic Risk by Interacting with Adiposity Parameters and Diet.
Ostrowska J, Samborowska E, Jaworski M, Toczyłowska K, Szostak-Węgierek D. Nutrients. 2024 Jan 16;16(2):266. doi: 10.3390/nu16020266. PMID: 38257159. Free PMC article.
- Fecal Metabolites as Biomarkers for Predicting Food Intake by Healthy Adults.
Shinn LM, Mansharamani A, Baer DJ, Novotny JA, Charron CS, Khan NA, Zhu R, Holscher HD. J Nutr. 2023 Jan 14;152(12):2956-2965. doi: 10.1093/jn/nxac195. PMID: 36040343. Free PMC article.
Grants and funding
- MR/N01183X/1/MRC_/Medical Research Council/United Kingdom
- P30 DK043351/DK/NIDDK NIH HHS/United States
- MR/N030125/1/MRC_/Medical Research Council/United Kingdom
- MR/M016560/1/MRC_/Medical Research Council/United Kingdom
- K23 DK125838/DK/NIDDK NIH HHS/United States
- 212904/Z/18/Z/WT_/Wellcome Trust/United Kingdom
- R01 CA230551/CA/NCI NIH HHS/United States
- WT_/Wellcome Trust/United Kingdom
- U01 CA230551/CA/NCI NIH HHS/United States