Comprehensive analysis of transcriptome profiles in hepatocellular carcinoma
Yu Jin et al. J Transl Med. 2019.
Abstract
Background: Hepatocellular carcinoma (HCC) is the second most deadly cancer, with late presentation and limited treatment options. There is an urgent need to better understand HCC in order to identify early-stage biomarkers and uncover therapeutic targets for the development of novel therapies.
Methods: Deep transcriptome sequencing of tumor and paired non-tumor liver tissues was performed to comprehensively evaluate the profiles of both the host and HBV transcripts in HCC patients. Differential gene expression patterns and the dysregulated genes associated with clinical outcomes were analyzed. Somatic mutations were identified from the sequencing data and the deleterious mutations were predicted. Lastly, human-HBV chimeric transcripts were identified, and their distribution, potential function and expression association were analyzed.
Results: Expression profiling identified the significantly upregulated TP73 as a nodal molecule modulating expression of apoptotic genes. Approximately 2.5% of dysregulated genes significantly correlated with HCC clinical characteristics. Of the 110 identified genes, those involved in post-translational modification, cell division and/or transcriptional regulation were upregulated, while those involved in redox reactions were downregulated in tumors of patients with poor prognosis. Mutation signature analysis identified that somatic mutations in HCC tumors were mainly non-synonymous, frequently affecting genes in the micro-environment and cancer pathways. Recurrent mutations occur mainly in ribosomal genes. The most frequently mutated genes were generally associated with a poorer clinical prognosis. Lastly, transcriptome sequencing suggests that HBV replication in the tumors of HCC patients is rare. HBV-human fusion transcripts are a common observation, with favored HBV and host insertion sites being the HBx C-terminus and gene introns (in tumors) and introns/intergenic regions (in non-tumors), respectively. HBV-fused genes in tumors were mainly involved in RNA binding, while those in non-tumor tissues varied widely. These observations suggest that while HBV may integrate randomly during chronic infection, selective expression of functional chimeric transcripts may occur during tumorigenesis.
Conclusions: Transcriptome sequencing of HCC patients reveals key cancer molecules and clinically relevant pathways deregulated/mutated in HCC patients and suggests that while HBV may integrate randomly during chronic infection, selective expression of functional chimeric transcripts likely occurs during the process of tumorigenesis.
Keywords: Chimeric transcripts; Differentially expressed genes; HBV integration; Liver cancer; Somatic mutations.
Conflict of interest statement
The authors declare that they have no competing interests.
Figures
Fig. 1
Differentially expressed genes between tumors and adjacent non-tumor tissues. a Demographic and clinicopathological data of the 25 HCC patients recruited in this study. HCC staging was classified according to the tumor-node-metastasis system of the American Joint Committee on Cancer. b Differentially expressed genes between tumor and adjacent non-tumor tissues. Top: The number of differentially expressed transcripts/genes at various stages of the workflow. Bottom: Heat-map of the 4462 differentially expressed genes in tumors and adjacent non-tumor patient samples. c Significant pathways associated with the differentially expressed genes between tumors and adjacent non-tumors. FDR < 0.01. Z-score predicts the activation status of the pathway. FDR denotes False Discovery Rate. d Characteristics of upstream regulators. e Network of genes associated with the most activated regulatory molecule, TP73. f Differentially expressed genes associated with clinical parameters. (I–III) The genes associated with clinical parameters. The X-axis represents the fold-change of gene expression between tumors and adjacent non-tumorous tissues (T/N). The Y-axis represents the fold-change of gene expression between patients with less favorable clinical outcome (i.e. higher grade (poor), greater vascular invasion (VI+) and poorer survival) versus those with more favorable clinical outcome (i.e. lower grade (good), less vascular invasion (VI−) and better survival). The size of the bubble represents the −log2(FDR) of gene expression between T and NT. HR denotes hazard ratio. (IV) Biological pathways associated with differentially expressed genes associated with the various clinical characteristics. The X-axis shows the pathways that are significantly associated with clinically associated genes for the different clinical phenotypes. A red bar shows genes that are upregulated in tumors relative to non-tumors and associated with a worse clinical outcome, while a green bar represents genes that are downregulated in tumors relative to non-tumors and associated with a worse clinical phenotype
Fig. 2
Tumor-specific somatic mutations identified from RNA-seq. a Distribution of tumor-specific somatic mutations in different genomic regions. The number of total somatic mutations in various genomic regions and the numbers of recurrent mutations (numbers in yellow stars). Most of the somatic mutations reside in genic regions, and missense mutations in coding sequences accounted for ~ 50% of genic mutations. 5′ Upstream: mutation occurring within 5 kb upstream of genes. 3′ Downstream: mutation occurring within 5 kb downstream of genes. Splice donor: mutation that changes one of 2 bases at the 5′ end of an intron. Splice acceptor: mutation that changes one of 2 bases at the 3′ end of an intron. Splice region: mutation within 1–3 bases of the exon or 3–8 bases of the intron flanking the intron–exon boundary. Deleterious: mutation predicted to be damaging to protein function by both the PolyPhen-2 and SIFT algorithms. NMD: a mutation predicted to cause nonsense-mediated decay. b Association of mutations in HCC patients with clinical characteristics. The percentage of patients with mutations (Y-axis) for the various genes (X-axis) associated with the various clinical phenotypes (X-axis below the genes). Red balls denote bad prognosis (e.g. associated with high Edmondson grade tumor, late stage, necrosis or liver cirrhosis) while green balls represent good prognosis (protective genes associated with no tumor invasion/liver cirrhosis). Size represents significance of association, i.e. the larger the ball, the smaller the p-value. c Association of mutations in HCC patients with tumor size. Box plot shows tumor size of patients with and without mutation in the WASH1 gene. d Mutations in genes associated with overall survival. Patients with mutations in KIF21A (left panel) and LSS (right panel) have significantly shorter survival time (p-value < 0.01, Kaplan–Meier test). Green lines represent patients with no mutations in the gene while red lines represent patients with mutations in the gene
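The splice-site and "deleterious" categories defined in panel a can be read as simple rules. The following is a minimal sketch of those rules, not the annotation pipeline used in the study. It assumes a single forward-strand intron given in 1-based inclusive coordinates, an intron long enough that the donor and acceptor windows do not overlap, and hypothetical function names chosen only for illustration.

```python
def classify_splice_variant(pos, intron_start, intron_end):
    """Classify a single-base variant relative to one forward-strand intron.

    Coordinates are 1-based and inclusive (an assumed convention for this
    sketch). Rules follow the caption's definitions of splice donor,
    splice acceptor and splice region.
    """
    # Splice donor: one of the 2 bases at the 5' end of the intron.
    if intron_start <= pos <= intron_start + 1:
        return "splice_donor"
    # Splice acceptor: one of the 2 bases at the 3' end of the intron.
    if intron_end - 1 <= pos <= intron_end:
        return "splice_acceptor"
    # Splice region, exon side: within 1-3 bases of the flanking exon.
    on_exon_side = (intron_start - 3 <= pos <= intron_start - 1
                    or intron_end + 1 <= pos <= intron_end + 3)
    # Splice region, intron side: bases 3-8 of the intron from either end.
    on_intron_side = (intron_start + 2 <= pos <= intron_start + 7
                      or intron_end - 7 <= pos <= intron_end - 2)
    if on_exon_side or on_intron_side:
        return "splice_region"
    return "other"


def is_deleterious(polyphen2_damaging, sift_damaging):
    """A variant counts as deleterious only if BOTH predictors call it damaging."""
    return bool(polyphen2_damaging and sift_damaging)
```

For an intron spanning positions 101 to 200, position 101 would be labeled a splice donor, position 105 a splice region variant, and position 150 would fall outside all three categories. Reverse-strand genes would need the donor and acceptor ends swapped, which this sketch does not handle.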
Fig. 3
Coverage of HBV genome and proportion of patients with potentially intact pre-genomic RNA. a Coverage of HBV genome by sequencing reads. The circos plot shows the average coverage of each nucleotide of the HBV genome by the sequencing reads. Red and blue histograms show the average coverage in the 24 N and T samples, respectively. Coverage is significantly higher in the Pre-S and X genes. b Proportion of patients with potentially intact pre-genomic RNA. As the pre-genomic RNA is 3.5 kb and covers the entire HBV genome, the pre-genomic RNA is considered incomplete if any region of the HBV genome is not detected. Intact pre-genomic RNA is likely to be found in 20.83% and 8.33% of N and T samples, respectively, suggesting that HBV replication is not a common event in the livers of HCC patients. c Less variety of chimeric transcripts in the tumors of HCC patients. Boxplot showing average and median number of different chimeric transcripts. The table shows the total number of different chimeric transcripts, the number of samples which contain chimeric transcripts, and the average and median number of chimeric transcripts detected in T and N samples. The number of different chimeric transcripts detected in T samples is significantly lower than in N samples (p-value < 0.05, paired t-test), suggesting that a subset of functional chimeric transcripts is selected in the process of tumorigenesis. d Circos plot showing the distribution of fusion sites on the HBV genome. The fusion sites between the HBV and host sequences in the chimeric transcripts are significantly concentrated in the region between 1600 and 1900 of the HBV genome (near the end of the HBx gene) in both non-tumor and tumor samples (p-value < 0.001, random sampling test)
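The completeness rule in panel b can be sketched as a per-sample check that every position of the HBV genome has read support. The code below is an illustrative sketch only. It assumes "not detected" means zero read depth at a position (the caption does not state a threshold), uses a toy 10-position genome, and uses hypothetical function names. The counts of 5 and 2 samples are inferred from the reported percentages and the 24 samples per group, not stated in the source.

```python
def is_potentially_intact(coverage):
    """Return True if every position of the HBV genome has at least one read.

    coverage: list of per-nucleotide read depths for one sample.
    Zero depth as the 'not detected' cutoff is an assumption of this sketch.
    """
    return len(coverage) > 0 and all(depth > 0 for depth in coverage)


def percent_intact(samples):
    """Percentage of samples whose coverage spans the whole genome."""
    intact = sum(1 for coverage in samples if is_potentially_intact(coverage))
    return 100.0 * intact / len(samples)


# Toy data: a 10-position "genome" standing in for the HBV genome.
full = [3] * 10                 # every position covered
gapped = [3] * 5 + [0] + [3] * 4  # one uncovered position, so incomplete

non_tumor = [full] * 5 + [gapped] * 19   # 5 of 24 intact (inferred count)
tumor = [full] * 2 + [gapped] * 22       # 2 of 24 intact (inferred count)

print(round(percent_intact(non_tumor), 2))  # 20.83
print(round(percent_intact(tumor), 2))      # 8.33
```

With 24 samples per group, 5 and 2 fully covered samples reproduce the reported 20.83% and 8.33%. A real analysis would likely use a minimum depth threshold rather than a strict zero cutoff.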
Fig. 4
Distribution of fusion points on chromosomes. a Circos plot showing the distribution of fusion points on human chromosomes (hg19). Each red or blue bar represents a fusion site on the corresponding chromosome in N and T samples, respectively. b Proportion of fusion sites in genic and intergenic regions. Top panel: The left pie chart shows the proportion of genic and intergenic region in human genome. The two pie charts on the right show the proportion of HBV-host chimeric transcripts from the genic and intergenic regions in T and N samples, respectively. Genic region includes promoters, 5′- and 3′-UTRs, coding or non-coding exons and introns while intergenic region excludes the genic region. Bottom panel: Distribution of fusion points on functional regions of genes. c Functional annotation of genes with viral integration sites in N and T. Red bars represent the functional annotation of genes with viral integration sites identified in the tumor tissues while the green bars correspond to the functional annotation of genes with integration sites in non-tumor tissues. d Table showing genes fused with HBV in at least two chimeric transcripts. e Tumor chimeric transcripts predicted to alter regulatory elements and their association with expression. Top panel: Table showing putative regulatory sites of genes that are predicted to be affected by viral integration in tumor tissues. Genes with fusion sites that are associated with expression changes are in bold purple. Bottom panel: association between tumor chimeric transcripts and host gene or adjacent exon expression. Red bars represent gene/exon expression in non-tumor while blue bars represent expression in tumor tissues. f Distribution of fusion points in different classes of repeat regions. g Distribution of fusion points in long non-coding RNAs