Whole genome copy number analyses reveal a highly aberrant genome in TP53 mutant lung adenocarcinoma tumors

Integrated Analyses of Copy Number Variations and Gene Expression in Lung Adenocarcinoma

PLoS ONE, 2011

Numerous efforts have been made to elucidate the etiology and improve the treatment of lung cancer, but the overall five-year survival rate is still only 15%. Identification of prognostic biomarkers for lung cancer using gene expression microarrays poses a major challenge in that very few overlapping genes have been reported among different studies. To address this issue, we have performed concurrent genome-wide analyses of copy number variation and gene expression to identify genes reproducibly associated with tumorigenesis and survival in non-smoking female lung adenocarcinoma. The genomic landscape of frequent copy number variable regions (CNVRs) in at least 30% of samples was revealed, and their aberration patterns were highly similar to several studies reported previously. Further statistical analysis for genes located in the CNVRs identified 475 genes differentially expressed between tumor and normal tissues (p < 10⁻⁵). We demonstrated the reproducibility of these genes in another lung cancer study (p = 0.0034, Fisher's exact test), and showed the concordance between copy number variations and gene expression changes by elevated Pearson correlation coefficients. Pathway analysis revealed two major dysregulated functions in lung tumorigenesis: survival regulation via AKT signaling and cytoskeleton reorganization. Further validation of these enriched pathways using three independent cohorts demonstrated effective prediction of survival. In conclusion, by integrating gene expression profiles and copy number variations, we identified genes/pathways that may serve as prognostic biomarkers for lung tumorigenesis.
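The two statistics named in this abstract, a Pearson correlation between copy number and expression and a Fisher's exact test for gene-list overlap, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the one-sided form of the test, and the toy values are assumptions made for illustration only.

```python
from math import comb, sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fisher_overlap_p(overlap, size_a, size_b, universe):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Probability of observing at least `overlap` shared genes when gene
    lists of sizes `size_a` and `size_b` are drawn from `universe` genes.
    """
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / total


# Hypothetical toy values for one gene across five samples
# (log2 copy-number ratio vs. log2 expression ratio).
copy_number = [-0.5, 0.0, 0.3, 0.8, 1.2]
expression = [-0.4, 0.1, 0.2, 0.9, 1.0]
r = pearson_r(copy_number, expression)

# Hypothetical overlap: 2 shared genes between two lists of 2 genes
# drawn from a universe of 4 genes.
p = fisher_overlap_p(2, 2, 2, 4)
```

A high `r` for a gene indicates that its expression tracks its copy number across samples, and a small `p` indicates that two gene lists share more genes than expected by chance.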

Common and Contrasting Genomic Profiles among the Major Human Lung Cancer Subtypes

Cold Spring Harbor Symposia on Quantitative Biology, 2005

Lung cancer is the leading cause of cancer mortality worldwide. With the recent success of molecularly targeted therapies in this disease, a detailed knowledge of the spectrum of genetic lesions in lung cancer represents a critical step in the development of additional effective agents. An integrated high-resolution survey of regional amplifications and deletions and gene expression profiling of non-small-cell lung cancers (NSCLC) identified 93 focal high-confidence copy number alterations (CNAs), with 21 spanning less than 0.5 Mb with a median of five genes. Most CNAs were novel and included high-amplitude amplification and homozygous deletion events. Pathogenic relevance of these genomic alterations was further reinforced by their recurrence and overlap with focal alterations of other tumor types. Additionally, the comparison of the genomic profiles of the two major subtypes of NSCLC, adenocarcinoma (AC) and squamous cell carcinoma (SCC), showed an almost complete overlap with the exception of one amplified region on chromosome 3, specific for SCC. Among the few genes overexpressed within this amplicon was p63, a known regulator of squamous cell differentiation. These findings suggest that the AC and SCC subtypes may arise from a common cell of origin and they are driven to their distinct phenotypic end points by altered expression of a limited number of key genes such as p63.

Characterizing the cancer genome in lung adenocarcinoma

Nature, 2007

Somatic alterations in cellular DNA underlie almost all human cancers [1]. The prospect of targeted therapies [2] and the development of high-resolution, genome-wide approaches [3-8] are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumors (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57 significantly recurrent events. We find that 26 of 39 autosomal chromosome arms show consistent large-scale copy-number gain or loss, of which only a handful have been linked to a specific gene. We also identify 31 recurrent focal events, including 24 amplifications and 7 homozygous deletions. Only six of these focal events are currently associated with known mutations in lung carcinomas. The most common event, amplification of chromosome 14q13.3, is found in ~12% of samples. On the basis of genomic and functional analyses, we identify NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung adenocarcinomas. More generally, our results indicate that many of the genes that are involved in lung adenocarcinoma remain to be discovered.

High-resolution genomic profiles of human lung cancer

Proceedings of the National Academy of Sciences, 2005

Lung cancer is the leading cause of cancer mortality worldwide, yet there exists a limited view of the genetic lesions driving this disease. In this study, an integrated high-resolution survey of regional amplifications and deletions, coupled with gene-expression profiling of non-small-cell lung cancer subtypes, adenocarcinoma and squamous-cell carcinoma (SCC), identified 93 focal copy-number alterations, of which 21 span <0.5 megabases and contain a median of five genes. Whereas all known lung cancer genes/loci are contained in the dataset, most of these recurrent copy-number alterations are previously uncharacterized and include high-amplitude amplifications and homozygous deletions. Notably, despite their distinct histopathological phenotypes, adenocarcinoma and SCC genomic profiles showed a nearly complete overlap, with only one clear SCC-specific amplicon. Among the few genes residing within this amplicon and showing consistent overexpression in SCC is p63, a known regulator of squamous-cell differentiation. Furthermore, intersection with the published pancreatic cancer comparative genomic hybridization dataset yielded, among others, two focal amplicons on 8p12 and 20q11 common to both cancer types. Integrated DNA-RNA analyses identified WHSC1L1 and TPX2 as two candidates likely targeted for amplification in both pancreatic ductal adenocarcinoma and non-small-cell lung cancer.
Keywords: array comparative genomic hybridization | expression profiling | lung adenocarcinoma | squamous-cell lung carcinoma | TP73L

Impact on Disease Development, Genomic Location and Biological Function of Copy Number Alterations in Non-Small Cell Lung Cancer

PLoS ONE, 2011

Lung cancer, of which more than 80% is non-small cell, is the leading cause of cancer-related death in the United States. Copy number alterations (CNAs) in lung cancer have been shown to be positionally clustered in certain genomic regions. However, it remains unclear whether genes with copy number changes are functionally clustered. Using a dense single nucleotide polymorphism array, we performed genome-wide copy number analyses of a large collection of non-small cell lung tumors (n = 301). We proposed a formal statistical test for CNAs between different groups (e.g., noninvolved lung vs. tumors, early vs. late stage tumors). We also customized the gene set enrichment analysis (GSEA) algorithm to investigate the overrepresentation of genes with CNAs in predefined biological pathways and gene sets (i.e., functional clustering). We found that CNA events increase substantially from germline to early-stage and then to late-stage tumors. In addition to being clustered by genomic position, CNAs tend to occur away from gene locations, especially in germline, noninvolved tissue and early-stage tumors. This tendency decreases from germline to early-stage and then to late-stage tumors, suggesting a relaxation of selection during tumor progression. Furthermore, genes with CNAs in non-small cell lung tumors were enriched in certain gene sets and biological pathways that play crucial roles in oncogenesis and cancer progression, demonstrating the functional aspect of CNAs in the context of biological pathways that were overlooked previously. We conclude that CNAs increase with disease progression and that CNAs are both positionally and functionally clustered. The potential functional capabilities acquired via CNAs may be sufficient for normal cells to transform into malignant cells.
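The abstract does not describe how the GSEA algorithm was customized. As background only, the sketch below shows the classic unweighted running-sum enrichment score that GSEA builds on, not the authors' modified version. The function name, gene labels, and ranking are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style enrichment score.

    Walk down the ranked list, stepping up by 1/n_hits for each gene in
    `gene_set` and down by 1/n_misses for every other gene. The score is
    the running-sum value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene_set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking of six genes, e.g. by CNA frequency (highest first).
ranked = ["A", "B", "C", "D", "E", "F"]
es_top = enrichment_score(ranked, {"A", "B"})     # set concentrated at the top
es_bottom = enrichment_score(ranked, {"E", "F"})  # set concentrated at the bottom
```

A score near +1 means the gene set sits at the top of the ranking and a score near -1 means it sits at the bottom. In practice, significance is assessed by comparing the observed score against scores from permuted data.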

Integrated molecular portrait of non-small cell lung cancers

BMC Medical Genomics, 2013

Background: Non-small cell lung cancer (NSCLC), a leading cause of cancer deaths, represents a heterogeneous group of neoplasms, mostly comprising squamous cell carcinoma (SCC), adenocarcinoma (AC) and large-cell carcinoma (LCC). The objectives of this study were to utilize integrated genomic data including copy-number alteration, mRNA, microRNA expression and candidate-gene full sequencing data to characterize the molecular distinctions between AC and SCC.
Methods: Comparative genomic hybridization followed by mutational analysis, gene expression and miRNA microarray profiling were performed on 123 paired tumor and non-tumor tissue samples from patients with NSCLC.
Results: At DNA, mRNA and miRNA levels we could identify molecular markers that discriminated significantly between the various histopathological entities of NSCLC. We identified 34 genomic clusters using aCGH data; several genes exhibited a different profile of aberrations between AC and SCC, including PIK3CA, SOX2, THPO, TP63 and PDGFB. Gene expression profiling analysis identified SPP1, CTHRC1 and GREM1 as potential biomarkers for early diagnosis of the cancer, and SPINK1 and BMP7 to distinguish between AC and SCC in small biopsies or in blood samples. Using an integrated genomics approach, we found in recurrently altered regions a list of three potential driver genes, MRPS22, NDRG1 and RNF7, which were consistently over-expressed in amplified regions, had widespread correlation with an average of ~800 genes throughout the genome and were highly associated with histological types. Using a network enrichment analysis, the targets of these potential drivers were seen to be involved in DNA replication, cell cycle, mismatch repair, p53 signalling pathway and other lung cancer related signalling pathways, and many immunological pathways. Furthermore, we also identified one potential driver miRNA, hsa-miR-944.
Conclusions: Integrated molecular characterization of AC and SCC helped identify clinically relevant markers and potential drivers, which are recurrent and stable changes at DNA level that have functional implications at RNA level and have strong association with histological subtypes.

The transcriptional landscape and mutational profile of lung adenocarcinoma

Genome Research, 2012

All cancers harbor molecular alterations in their genomes. The transcriptional consequences of these somatic mutations have not yet been comprehensively explored in lung cancer. Here we present the first large scale RNA sequencing study of lung adenocarcinoma, demonstrating its power to identify somatic point mutations as well as transcriptional variants such as gene fusions, alternative splicing events, and expression outliers. Our results reveal the genetic basis of 200 lung adenocarcinomas in Koreans including deep characterization of 87 surgical specimens by transcriptome sequencing. We identified driver somatic mutations in cancer genes including EGFR, KRAS, NRAS, BRAF, PIK3CA, MET, and CTNNB1. Candidates for novel driver mutations were also identified in genes newly implicated in lung adenocarcinoma such as LMTK2, ARID1A, NOTCH2, and SMARCA4. We found 45 fusion genes, eight of which were chimeric tyrosine kinases involving ALK, RET, ROS1, FGFR2, AXL, and PDGFRA. Among 17 recur...