Simultaneous identification of multiple driver pathways in cancer

Mark D M Leiserson et al. PLoS Comput Biol. 2013.

Abstract

Distinguishing the somatic mutations responsible for cancer (driver mutations) from random, passenger mutations is a key challenge in cancer genomics. Driver mutations generally target cellular signaling and regulatory pathways consisting of multiple genes. This heterogeneity complicates the identification of driver mutations by their recurrence across samples, as different combinations of mutations in driver pathways are observed in different samples. We introduce the Multi-Dendrix algorithm for the simultaneous identification of multiple driver pathways de novo in somatic mutation data from a cohort of cancer samples. The algorithm relies on two combinatorial properties of mutations in a driver pathway: high coverage and mutual exclusivity. We derive an integer linear program that finds sets of mutations exhibiting these properties. We apply Multi-Dendrix to somatic mutations from glioblastoma, breast cancer, and lung cancer samples. Multi-Dendrix identifies sets of mutations in genes that overlap with known pathways (including Rb, p53, PI(3)K, and cell cycle pathways) and also novel sets of mutually exclusive mutations, including mutations in several transcription factors or other genes involved in transcriptional regulation. These sets are discovered directly from mutation data with no prior knowledge of pathways or gene interactions. We show that Multi-Dendrix outperforms other algorithms for identifying combinations of mutations and is also orders of magnitude faster on genome-scale data. Software is available at: http://compbio.cs.brown.edu/software.
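
The two properties named in the abstract, high coverage and mutual exclusivity, can be illustrated with a small scoring sketch. The abstract does not state the scoring function, so the code below assumes the coverage-minus-overlap weight used in the Dendrix line of work, W(M) = 2|Γ(M)| − Σ_g |Γ(g)|, where Γ(g) is the set of samples with a mutation in gene g and Γ(M) is the union over the genes in M. The exhaustive search is a stand-in for the integer linear program, which is not reproduced here; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def coverage(gene_set, mutations):
    """Return the samples that carry a mutation in at least one gene of gene_set."""
    covered = set()
    for gene in gene_set:
        covered |= mutations.get(gene, set())
    return covered


def weight(gene_set, mutations):
    """Assumed Dendrix-style weight: W(M) = 2*|Gamma(M)| - sum_g |Gamma(g)|.

    The weight grows with coverage and shrinks whenever a sample is
    mutated in more than one gene of the set (a loss of exclusivity).
    """
    total_mutated = sum(len(mutations.get(gene, set())) for gene in gene_set)
    return 2 * len(coverage(gene_set, mutations)) - total_mutated


def best_gene_set(mutations, k):
    """Exhaustively find a size-k gene set of maximum weight.

    This brute-force search is only feasible for tiny inputs; it stands in
    for the ILP described in the abstract and is not the authors' method.
    """
    return max(combinations(sorted(mutations), k),
               key=lambda gene_set: weight(gene_set, mutations))


# Hypothetical toy data: gene -> set of samples in which it is mutated.
toy = {
    "A": {"s1", "s2"},
    "B": {"s3", "s4"},
    "C": {"s5"},
    "D": {"s1", "s2", "s3"},
}

print(weight(("A", "B", "C"), toy))  # exclusive and covers all five samples
print(weight(("A", "D"), toy))       # overlapping mutations are penalized
print(best_gene_set(toy, 3))
```

In this toy input, genes A, B, and C are mutated in disjoint samples and together cover every sample, so their set scores higher than any set containing the overlapping gene D.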

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. The Multi-Dendrix pipeline.

Multi-Dendrix analyzes integrated mutation data from a variety of sources, including single-nucleotide mutations and copy number aberrations. Multiple gene sets are identified using a combinatorial optimization approach. The output is analyzed for subtype-specific mutations and summarized across multiple values of two parameters: the number of gene sets and the maximum size per gene set.

Figure 2. Multi-Dendrix results on the GBM dataset.

(Left) Nodes represent genes in four modules found by Multi-Dendrix for a given number of gene sets and given minimum and maximum gene set sizes. Genes with “(A)” appended are amplification events, genes with “(D)” appended are deletion events, and genes with no annotation are SNVs. Edges connect genes that appear in the same gene set for more than one value of the parameters, with labels indicating the fraction of parameter values for which the pair of genes appears in the same gene set. Color of nodes indicates membership in three signaling pathways previously noted as important for GBM: RB, p53, and RTK/RAS/PI(3)K signaling. Shape of nodes indicates genes whose mutations are associated with specific GBM subtypes, and dashed edges connect genes associated with different subtypes. The direct interactions statistic of this collection of gene sets is significant. (Middle) Known interactions between proteins in each set and p-value for the observed number of interactions. (Right) Mutation matrix for each of the four modules, with mutually exclusive (blue) and co-occurring (orange) mutations.
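
The edge rule in this caption (connect two genes if they share a gene set for more than one parameter value, and label the edge with the fraction of parameter values) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `consensus_edges`, the `min_runs` argument, and the placeholder gene names G1 to G5 are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def consensus_edges(runs, min_runs=2):
    """Summarize gene-pair co-membership across parameter settings.

    runs: one entry per parameter setting; each entry is a list of gene sets.
    Returns {frozenset({gene_a, gene_b}): fraction of runs in which the
    pair shares a gene set}, keeping only pairs seen in >= min_runs runs.
    """
    counts = Counter()
    for gene_sets in runs:
        pairs_in_run = set()  # count each pair at most once per run
        for gene_set in gene_sets:
            pairs_in_run.update(frozenset(p) for p in combinations(gene_set, 2))
        counts.update(pairs_in_run)
    return {pair: n / len(runs) for pair, n in counts.items() if n >= min_runs}


# Hypothetical output of four runs with different parameter values.
runs = [
    [{"G1", "G2", "G3"}, {"G4", "G5"}],
    [{"G1", "G2"}, {"G3", "G4", "G5"}],
    [{"G1", "G2", "G4"}, {"G3", "G5"}],
    [{"G1", "G3"}, {"G2", "G5"}],
]

edges = consensus_edges(runs)
for pair, fraction in sorted(edges.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), fraction)
```

Here the pair G1, G2 shares a gene set in three of the four runs and is labeled 0.75, while a pair that co-occurs in only one run gets no edge.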

Figure 3. Multi-Dendrix results on the BRCA dataset.

Graphical elements are as described in the Figure 2 caption, except for the following. Color of nodes indicates membership in four signaling pathways previously noted as important for BRCA: p53 signaling, PI(3)K/AKT signaling, cell cycle checkpoints, and p38-JNK1. The top row of each mutation matrix annotates the subtype of each patient. The regulatory interaction between GATA3 and CDH1 is shown as a dashed line. The direct interactions statistic of this collection of gene sets is significant.

Cited by

Identifying restrictions in the order of accumulation of mutations during tumor progression: effects of passengers, evolutionary models, and sampling.
Diaz-Uriarte R. BMC Bioinformatics. 2015 Feb 12;16:41. doi: 10.1186/s12859-015-0466-7. PMID: 25879190.
Systematic identification of cancer driving signaling pathways based on mutual exclusivity of genomic alterations.
Babur Ö, Gönen M, Aksoy BA, Schultz N, Ciriello G, Sander C, Demir E. Genome Biol. 2015 Feb 26;16(1):45. doi: 10.1186/s13059-015-0612-6. PMID: 25887147.
Analysis, identification and visualization of subgroups in genomics.
Völkel G, Laban S, Fürstberger A, Kühlwein SD, Ikonomi N, Hoffman TK, Brunner C, Neuberg DS, Gaidzik V, Döhner H, Kraus JM, Kestler HA. Brief Bioinform. 2021 May 20;22(3):bbaa217. doi: 10.1093/bib/bbaa217. PMID: 32954413. Review.
Adaptively Weighted and Robust Mathematical Programming for the Discovery of Driver Gene Sets in Cancers.
Xu X, Qin P, Gu H, Wang J, Wang Y. Sci Rep. 2019 Apr 11;9(1):5959. doi: 10.1038/s41598-019-42500-7. PMID: 30976053.
Review: Precision medicine and driver mutations: Computational methods, functional assays and conformational principles for interpreting cancer drivers.
Nussinov R, Jang H, Tsai CJ, Cheng F. PLoS Comput Biol. 2019 Mar 28;15(3):e1006658. doi: 10.1371/journal.pcbi.1006658. PMID: 30921324. Review.

Grants and funding

This work is supported by NSF grant IIS-1016648. BJR is supported by a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, an Alfred P. Sloan Research Fellowship, and an NSF CAREER Award (CCF-1053753). RS was supported by a research grant from the Israel Science Foundation (grant no. 241/11). MDML was supported by NSF GRFP DGE 0228243. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
