Discovery of robust and highly specific microbiome signatures of non-alcoholic fatty liver disease - PubMed

doi: 10.1186/s40168-024-01990-y.

Andrea Marfil-Sánchez, Xiuqiang Chen, Mohammad Mirhakkak, Huating Li, Weiping Jia, Aimin Xu, Henrik Bjørn Nielsen, Max Nieuwdorp, Rohit Loomba, Yueqiong Ni, Gianni Panagiotou

Emmanouil Nychas et al. Microbiome. 2025.

Abstract

Background: The pathogenesis of non-alcoholic fatty liver disease (NAFLD), which has a global prevalence of 30%, is multifactorial, and the involvement of gut bacteria has recently been proposed. However, finding robust bacterial signatures of NAFLD has been a great challenge, mainly because NAFLD co-occurs with other metabolic diseases.

Results: Here, we collected public metagenomic data and integrated the taxonomy profiles with in silico generated community metabolic outputs and detailed clinical data of 1206 Chinese subjects with or without metabolic diseases, including NAFLD (obese and lean), obesity, T2D, hypertension, and atherosclerosis. We identified highly specific microbiome signatures by building accurate machine learning models for NAFLD (accuracy = 0.845-0.917) that showed high portability (generalizability) and a low prediction rate (specificity) when applied to other metabolic diseases, and through a community approach involving differential co-abundance ecological networks. Moreover, using these signatures coupled with further mediation analysis and metabolic dependency modeling, we propose synergistic, defined microbial consortia associated with the NAFLD phenotype in overweight and lean individuals, respectively.

Conclusion: Our study reveals robust and highly specific NAFLD signatures and offers a more realistic microbiome-therapeutics approach over individual species for this complex disease.

Keywords: Gut microbiota; Machine learning; Metabolic diseases; Metabolomics; Microbial consortia; NAFLD; Network analysis.

© 2025. The Author(s).

Conflict of interest statement

Declarations. Ethics approval and consent to participate: For the cohorts that are not associated with a published study, ethics approvals were obtained from the Shanghai Jiao Tong University Affiliated Sixth People’s Hospital (approval no: 2015-65-(1)) and the University of Hong Kong / Hospital Authority Hong Kong West Cluster (approval no: UW 20-700), following the principles of the Declaration of Helsinki. Consent for publication: Written informed consent was obtained from all participants in all the cohorts that are not associated with a published study. Competing interests: M.N. is founder and scientific advisor of Caelus Health, which is commercializing A. soehngenii for metabolic disease treatment. R.L. serves as a consultant to Aardvark Therapeutics, Altimmune, Arrowhead Pharmaceuticals, AstraZeneca, Cascade Pharmaceuticals, Eli Lilly, Gilead, Glympse bio, Inipharma, Intercept, Inventiva, Ionis, Janssen Inc., Lipidio, Madrigal, Neurobo, Novo Nordisk, Merck, Pfizer, Sagimet, 89 bio, Takeda, Terns Pharmaceuticals and Viking Therapeutics. In addition, his institution received research grants from Arrowhead Pharmaceuticals, AstraZeneca, Boehringer-Ingelheim, Bristol-Myers Squibb, Eli Lilly, Galectin Therapeutics, Gilead, Intercept, Hanmi, Inventiva, Ionis, Janssen, Madrigal Pharmaceuticals, Merck, Novo Nordisk, Pfizer, Sonic Incytes and Terns Pharmaceuticals. R.L. is co-founder of LipoNexus Inc. All other authors declare that they have no competing interests.

Figures

Fig. 1

Study design overview and structural differences in the microbiome among NAFLD and closely associated diseases. A A graphical representation summarizing the cohort information, collected data, and analyses performed on 1206 samples. Detailed criteria on the group formation can be found in the methods section. Created with Biorender.com. B High interconnection between NAFLD-related clinical data and other clinical measures representative of different cardiometabolic diseases. Spearman’s rank-based correlations were used. C Comparison of bacterial species profiles, KEGG pathway profiles, and metabolite profiles among all disease groups adjusted by their respective controls, using non-metric multidimensional scaling (NMDS) of weighted UniFrac, Bray-Curtis, and Canberra distances, respectively. The error bars indicate the mean and standard errors of the mean. Differences were assessed using PERMANOVA and were considered significant if P < 0.05
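
As a rough illustration of two of the dissimilarity measures named in the Fig. 1C legend, the sketch below computes Bray-Curtis and Canberra distances between two abundance vectors. It is a minimal pure-Python sketch and not the authors' pipeline: the function names `bray_curtis` and `canberra` and the toy vectors are assumptions made for illustration, and weighted UniFrac is left out because it additionally requires a phylogenetic tree.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        # Two empty samples are treated as identical (an assumption).
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def canberra(a, b):
    """Canberra distance; features that are zero in both samples are skipped."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    distance = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            distance += abs(x - y) / denom
    return distance


# Toy abundance profiles (hypothetical values, not data from the study).
sample_a = [10, 0, 5]
sample_b = [5, 5, 5]
bc = bray_curtis(sample_a, sample_b)
cb = canberra(sample_a, sample_b)
```

Bray-Curtis is bounded between 0 and 1 and is dominated by abundant features, whereas Canberra weights each feature's relative difference equally, which may be one reason different measures suit different profile types.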

Fig. 2

Random forest model based on gut microbiota species and metabolites accurately and specifically predicts NAFLD-O. A Receiver operating characteristic (ROC) curves and confusion matrices evaluating the ability of random forest models to predict NAFLD. Each color represents the model performance using species and pathways (purple) or species and metabolites (red and blue) as features. Blue indicates the model performance when validated in an external US cohort. B Evaluation of model cross-study portability and prediction rate on NAFLD-L and other metabolic diseases. Models using species plus metabolites or species plus pathways as features were used. C Feature importance for the random forest model built with species and metabolites. The color indicates feature prediction as evaluated by Shapley values: blue for Control-NAFLD-O and red for NAFLD-O. D Abundance comparison between each disease and its control for the species selected in the model built with species and metabolites. ANCOM-II was used for statistical comparisons. Circle size corresponds to the mean difference, with a larger size/value indicating a stronger difference. Full circle: higher in control; empty circle: higher in disease. CLR: centered log ratio
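
The centered log ratio (CLR) named in the Fig. 2D legend can be sketched as follows: each value is log-transformed and the mean of the logs across features is subtracted, so the transformed values of a sample sum to zero. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `clr` and the `pseudocount` parameter are hypothetical, and the source does not state how zero counts were handled.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log ratio of one sample's abundance vector.

    The pseudocount is an assumed way to avoid log(0); set it to 0
    when all counts are strictly positive.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Toy counts for four species in one sample (hypothetical values).
transformed = clr([10, 0, 5, 85])
```

Because CLR values are relative to the sample's geometric mean, they are unchanged by rescaling the whole sample when no pseudocount is added, which is why the transform is commonly used for compositional sequencing data.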

Fig. 3

Signature modules of species networks associated with the NAFLD phenotype for NAFLD-O and NAFLD-L. A Venn diagram of species significantly different in abundance using ANCOM-II (cutoff = 0.6), adjusted by age, gender, BMI, HOMA-IR, and SBP, for the comparisons of NAFLD-O vs CTRL-NAFLD-O and NAFLD-L vs CTRL-NAFLD-L. The number of significantly different species that each case shares with other metabolic disease comparisons is displayed as a percentage under the Venn circles with arrows. Differentially abundant species analysis for metabolic diseases was done using ANCOM-II, adjusting for a set of clinical data depending on the disease-control comparison (Methods). B Multiscale embedded correlation network analysis illustrates the differential correlation of species in NAFLD-O vs CTRL-NAFLD-O and NAFLD-L vs CTRL-NAFLD-L. Only species pairs with significant differential correlations (empirical P < 0.01) were included. The color of the link indicates the type of correlation change in control/NAFLD. Sparse canonical correlation analysis (sCCA) was used to evaluate the contribution of each species to the correlation between the module and NAFLD-related clinical parameters (P < 0.05), which is displayed by the color of the nodes (blue: negative, red: positive). ANCOM-II was used to find species significantly different in abundance (cutoff = 0.6), adjusting for a set of clinical data depending on the disease-control comparison (Methods). Species found significantly different in abundance uniquely in NAFLD comparisons are marked with a triangle, whereas a diamond is used if the species also appears significant in the comparison of any other metabolic disease. The size of the node indicates the magnitude of the W statistic generated by ANCOM-II. C Table summarizing the numbers for each type of correlation change for the two modules in B. The types are depicted as from control to NAFLD (control/NAFLD) and colored as in B

Fig. 4

Specific microbial metabolites mediate the associations between gut microbiota species modules and NAFLD. A Analysis of the significant (P < 0.05) effect of species modules on NAFLD-related clinical data mediated by the DA metabolites for NAFLD-O and NAFLD-L. B Parallel coordinates charts showing the 133 mediation effects of in silico estimated metabolites that were significant at P < 0.05. Upper chart: NAFLD-O; lower chart: NAFLD-L. Each chart shows individual species from network modules (left), DA metabolites (middle), and NAFLD clinical data (right). The curved lines connecting the panels indicate the mediation effects, with line colors corresponding to different metabolites. The colors of feature names represent positive (red) or negative (blue) associations with NAFLD, as determined by sCCA for species and DA for metabolites and clinical data. C A graphical representation of bacterial synergistic communities generated from SMETANA for NAFLD-O and NAFLD-L. Metabolites exchanged with SMETANA score ≥ 1 are displayed. Created with Biorender.com
