DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays

Amrit Singh et al. Bioinformatics. 2019.

Abstract

Motivation: In the continuously expanding omics era, novel computational and statistical strategies are needed for data integration and for the identification of biomarkers and molecular signatures. We present Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO), a multi-omics integrative method that seeks common information across different data types by selecting a subset of molecular features while discriminating between multiple phenotypic groups.

Results: Using simulations and benchmark multi-omics studies, we show that DIABLO identifies features with superior biological relevance compared with existing unsupervised integrative methods, while achieving predictive performance comparable to state-of-the-art supervised approaches. DIABLO is versatile, allowing for module-based analyses and cross-over study designs. In two case studies, DIABLO identified both known and novel multi-omics biomarkers consisting of mRNAs, miRNAs, CpGs, proteins and metabolites.

Availability and implementation: DIABLO is implemented in the mixOmics R Bioconductor package, with functions for parameter selection and visualization to assist in the interpretation of the integrative analyses, along with tutorials on http://mixomics.org and in our Bioconductor vignette.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.

Simulation study. (A) Classification error rates (10-fold CV averaged over 20 simulations) for different fold changes (FCs) between groups and varying levels of noise (SD). The dashed line indicates random performance (error rate = 50%). (B) Types of variables selected by each classification method, among the 180 variables selected per method.
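As a rough illustration of how an error rate like the one in panel (A) can be estimated, the sketch below runs 10-fold CV on simulated two-group data and averages over 20 simulations. It is not the authors' simulation design: the nearest-centroid classifier, the additive group shift standing in for a fold change, the sample sizes and the helper names `simulate` and `cv_error` are all assumptions made for this example.

```python
import numpy as np

def simulate(n_per_group, n_features, shift, sd, rng):
    """Two balanced groups; group 1 is shifted by `shift` on every feature (assumed design)."""
    y = np.repeat([0, 1], n_per_group)
    x = rng.normal(0.0, sd, size=(2 * n_per_group, n_features))
    x[y == 1] += shift
    return x, y

def cv_error(x, y, rng, n_folds=10):
    """K-fold cross-validated error rate of a nearest-centroid classifier."""
    idx = rng.permutation(len(y))
    errors = 0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        # One centroid per group, estimated on the training folds only.
        centroids = np.stack(
            [x[train][y[train] == g].mean(axis=0) for g in (0, 1)]
        )
        # Squared Euclidean distance from each held-out sample to each centroid.
        dists = ((x[test][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        errors += int((dists.argmin(axis=1) != y[test]).sum())
    return errors / len(y)

def mean_cv_error(shift, sd, n_sims=20, seed=0):
    """Average the CV error over repeated simulations."""
    rng = np.random.default_rng(seed)
    rates = []
    for _ in range(n_sims):
        x, y = simulate(n_per_group=20, n_features=10, shift=shift, sd=sd, rng=rng)
        rates.append(cv_error(x, y, rng))
    return float(np.mean(rates))

separable = mean_cv_error(shift=10.0, sd=0.1)  # groups far apart relative to noise
no_signal = mean_cv_error(shift=0.0, sd=1.0)   # no group difference
print(separable, no_signal)
```

With no group difference, the averaged error should sit near the 50% chance level marked by the dashed line; with a large shift relative to the noise, it should fall to zero.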

Fig. 2.

Benchmark for colon cancer. (A) Number of selected features overlapping between supervised and unsupervised methods. (B) Number of correlated variables in the biomarker panels for various Pearson correlation cutoffs. (C) Top: network modularity of each multi-omics biomarker panel. Gray circles depict modules based on the edge betweenness index from the igraph R library. Bottom: consensus component plots depicting the separation of subjects in the high and low survival groups. Similar patterns were observed for the kidney, GBM and lung cancer datasets; see Supplementary Figures S5–S10.

Fig. 3.

A multi-omics biomarker panel predictive of breast cancer subtypes. (A) DIABLO consensus component plot based on the identified multi-omics biomarker panel: test samples are overlaid with 95% confidence ellipses calculated from the training data. (B) Network visualization of the biomarker panel highlighting correlated variables (absolute Pearson’s correlation >0.4) and four communities based on the edge betweenness index
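The edge rule described for panel (B), linking variables whose absolute Pearson correlation exceeds 0.4, can be sketched as follows. This is a minimal illustration on made-up data, not the mixOmics implementation; the function name `correlated_pairs` and the toy `mrna` and `protein` blocks are hypothetical.

```python
import numpy as np

def correlated_pairs(block_a, block_b, cutoff=0.4):
    """Return (i, j, r) for feature pairs whose |Pearson r| exceeds `cutoff`.

    Both blocks have shape (n_samples, n_features) with the same samples in rows.
    """
    a = np.asarray(block_a, dtype=float)
    b = np.asarray(block_b, dtype=float)
    # Standardise each feature (column) to zero mean and unit variance.
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    b = (b - b.mean(axis=0)) / b.std(axis=0)
    # Entry (i, j) is the Pearson correlation between a[:, i] and b[:, j].
    corr = a.T @ b / a.shape[0]
    rows, cols = np.where(np.abs(corr) > cutoff)
    return [(int(i), int(j), float(corr[i, j])) for i, j in zip(rows, cols)]

# Toy data (hypothetical): protein 0 tracks mRNA 0, protein 1 is independent noise.
rng = np.random.default_rng(0)
mrna = rng.normal(size=(50, 3))
protein = np.column_stack(
    [mrna[:, 0] + 0.1 * rng.normal(size=50), rng.normal(size=50)]
)
edges = correlated_pairs(mrna, protein)
print(edges)
```

Each returned triple can serve as a weighted edge in a graph library, after which a community detection step such as edge betweenness can be applied.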

Fig. 4.

Asthma study: cross-over design and module-based analysis. (A) DIABLO design includes module-based decomposition to discriminate pre- and post-allergen challenge samples. (B) Receiver operating characteristic curves comparing standard DIABLO and multilevel DIABLO for repeated measures (mDIABLO) using leave-one-out CV. (C) Component plots of the pre- and post-challenge samples (DIABLO and mDIABLO)

Cited by

Integrative microbiome and metabolome profiles reveal the impacts of periodontitis via oral-gut axis in first-trimester pregnant women.
Cheng T, Wen P, Yu R, Zhang F, Li H, Xu X, Zhao D, Liu F, Su W, Zheng Z, Yang H, Yao J, Jin L. J Transl Med. 2024 Sep 3;22(1):819. doi: 10.1186/s12967-024-05579-9. PMID: 39227984. Free PMC article.
Molecular Signatures of Idiopathic Pulmonary Fibrosis.
Konigsberg IR, Borie R, Walts AD, Cardwell J, Rojas M, Metzger F, Hauck SM, Fingerlin TE, Yang IV, Schwartz DA. Am J Respir Cell Mol Biol. 2021 Oct;65(4):430-441. doi: 10.1165/rcmb.2020-0546OC. PMID: 34038697. Free PMC article.
Comparative analysis of integrative classification methods for multi-omics data.
Novoloaca A, Broc C, Beloeil L, Yu WH, Becker J. Brief Bioinform. 2024 May 23;25(4):bbae331. doi: 10.1093/bib/bbae331. PMID: 38985929. Free PMC article.
Network-based integration of multi-omics data for clinical outcome prediction in neuroblastoma.
Wang C, Lue W, Kaalia R, Kumar P, Rajapakse JC. Sci Rep. 2022 Sep 14;12(1):15425. doi: 10.1038/s41598-022-19019-5. PMID: 36104347. Free PMC article.
Microbe-Immune Crosstalk: Evidence That T Cells Influence the Development of the Brain Metabolome.
Caspani G, Green M, Swann JR, Foster JA. Int J Mol Sci. 2022 Mar 17;23(6):3259. doi: 10.3390/ijms23063259. PMID: 35328680. Free PMC article.
