Transcription-mediated gene fusion in the human genome
Pinchas Akiva et al. Genome Res. 2006 Jan.
Abstract
Transcription of a gene usually ends at a regulated termination point, preventing the RNA-polymerase from reading through the next gene. However, sporadic reports suggest that chimeric transcripts, formed by transcription of two consecutive genes into one RNA, can occur in human. The splicing and translation of such RNAs can lead to a new, fused protein, having domains from both original proteins. Here, we systematically identified over 200 cases of intergenic splicing in the human genome (involving 421 genes), and experimentally demonstrated that at least half of these fusions exist in human tissues. We showed that unique splicing patterns dominate the functional and regulatory nature of the resulting transcripts, and found intergenic distance bias in fused compared with nonfused genes. We demonstrate that the hundreds of fused genes we identified are only a subset of the actual number of fused genes in human. We describe a novel evolutionary mechanism where transcription-induced chimerism followed by retroposition results in a new, active fused gene. Finally, we provide evidence that transcription-induced chimerism can be a mechanism contributing to the evolution of protein complexes.
Figures
Figure 1.
A model for transcription-induced chimerism. The transcribed region spans both consecutive genes. When the pre-mRNA is spliced, it involves a 5′ splice site at the upstream gene and a 3′ splice site at the downstream gene, thus removing the intergenic region from the mature fused mRNA. The product is a hybrid mRNA containing exons from both genes.
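The model in Figure 1 can be illustrated with a toy sketch. This is not the authors' pipeline: the `Exon` class, the `fuse_transcript` helper, and the gene names `GENE_A` and `GENE_B` are hypothetical and used only for illustration. The sketch assumes the 5′ splice site sits at the end of a chosen upstream exon and the 3′ splice site at the start of a chosen downstream exon, so everything between them (including the intergenic region) is removed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Exon:
    gene: str    # hypothetical gene label
    number: int  # exon number within its gene, counted from 1


def fuse_transcript(upstream: List[Exon], downstream: List[Exon],
                    donor_exon: int, acceptor_exon: int) -> List[Exon]:
    """Return the exons of a toy chimeric mRNA.

    The 5' splice site (donor) is taken at the end of upstream exon
    `donor_exon`; the 3' splice site (acceptor) at the start of downstream
    exon `acceptor_exon`. Exons after the donor, the intergenic region, and
    exons before the acceptor are all spliced out.
    """
    head = [e for e in upstream if e.number <= donor_exon]
    tail = [e for e in downstream if e.number >= acceptor_exon]
    if not head or not tail:
        raise ValueError("both genes must contribute at least one exon")
    return head + tail


# Hypothetical example: a 3-exon upstream gene and a 4-exon downstream gene.
gene_a = [Exon("GENE_A", i) for i in range(1, 4)]
gene_b = [Exon("GENE_B", i) for i in range(1, 5)]

# Splice from the end of GENE_A exon 2 to the start of GENE_B exon 2.
chimera = fuse_transcript(gene_a, gene_b, donor_exon=2, acceptor_exon=2)
labels = [f"{e.gene}:{e.number}" for e in chimera]
print(labels)
```

In this toy case the last exon of the upstream gene and the first exon of the downstream gene are both skipped, which is only one of several possible patterns.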
Figure 2.
Intergenic splicing patterns. Exons (dark boxes) are numbered from 1 to n. Percentage of events is calculated out of the 212 events detected in our computational search. Thin triangles mark the intergenic splicing pattern. “GT” and “AG” stand for the 5′ and 3′ splice sites, respectively. Splice sites used for the intergenic splicing are in bold. Patterns are not mutually exclusive, and hence, the percentages sum up to more than 100%.
Figure 3.
Distribution of intergenic distances. Two data sets are compared: “all genes,” representing distances between 12,395 consecutive RefSeq pairs residing on the same strand in the human genome, and “fused genes,” representing distances between the 212 genes in the current analysis. _x_-axis, intergenic distance in kilobases, divided into bins of 10 kb (i.e., the “20” bin corresponds to distances between 10,001 and 20,000 bp). _y_-axis, percent of gene pairs out of each data set. Notably, fused genes tend to reside much closer together on the genome than the entire population of gene pairs.
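The binning described for Figure 3 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the sample distances are hypothetical. It assumes each bin is labeled by its upper edge in kb, as the caption's “20” example implies, and that a distance of 0 bp falls into the first bin (the caption does not say how zero or overlapping distances were handled).

```python
from collections import Counter
from typing import Dict, Iterable

BIN_WIDTH_BP = 10_000  # 10 kb bins, as in the caption


def bin_label_kb(distance_bp: int) -> int:
    """Return the upper edge (in kb) of the 10 kb bin holding distance_bp.

    Distances of 10,001 to 20,000 bp map to the "20" bin. A distance of 0
    is placed in the "10" bin, which is an assumption.
    """
    if distance_bp < 0:
        raise ValueError("distance must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    n_bins = max(1, (distance_bp + BIN_WIDTH_BP - 1) // BIN_WIDTH_BP)
    return n_bins * 10


def percent_per_bin(distances_bp: Iterable[int]) -> Dict[int, float]:
    """Percent of gene pairs in each bin, keyed by bin label in kb."""
    counts = Counter(bin_label_kb(d) for d in distances_bp)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in sorted(counts.items())}


# Hypothetical distances in bp, for illustration only.
example = [500, 9_000, 15_000, 42_000]
print(percent_per_bin(example))
```

Computing the percentages separately for each data set is what allows the two distributions to be compared on the same _y_-axis despite their very different sizes.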
Figure 4.
Selected examples of experimentally verified transcription-induced chimeras (TICs). Shown are snapshots taken from the UCSC genome browser (http://genome.ucsc.edu/), presenting the alignment of expressed sequences to the genome, as well as the location of CpG islands (Gardiner-Garden and Frommer 1987). A total of 70% of the downstream genes involved in TICs possess CpG islands in their 5′ regions, indicating that they are also regulated as single genes. Boxes represent exons, with thinner boxes representing the untranslated regions (UTRs). Arrowed thin lines represent introns. “PCR_verified” represents the chimeric sequence validated by RT–PCR. Beneath are RT–PCR results showing the fusion events (see Methods).
(A) Fusion transcript between NME1 and NME2 creating a predicted fused protein. This transcript is supported by 30 ESTs (one is shown in the figure), and was found to be ubiquitously expressed in human tissues. The same fusion event was also detected in mouse ESTs. In the gel image, lanes a indicate the NME1 wild-type (WT) transcript and lanes b indicate the fused (TIC) NME1-NME2 transcript.
(B) Fusion transcript between sialophorin (SPN) and quinolinate phosphoribosyltransferase (QPRT) demonstrates the donation of SPN regulatory sequence (5′UTR) to the QPRT transcript. The fusion was experimentally detected in RNAs from the Farage cell line (B-lymphoma). In the gel image, lanes a indicate the wild-type SPN and lanes b indicate the fused (TIC) SPN-QPRT transcript (validated product marked by black arrow).
(C) Fusion transcript between phosphatidylinositol-4-phosphate 5-kinase (PIP5K1A) and proteasome 26S subunit non-ATPase 4 (PSMD4) on chromosome 1q21. The fusion event was discovered by identification of a retroposed, chimeric, processed, expressed pseudogene residing on chromosome 10q23. No EST supports this fusion, but it was verified by RT–PCR. The RNA BC068549, expressed from the processed pseudogene on chromosome 10, is shown aligning to both genes. In the gel image, lanes a indicate the fused (TIC) PIP5K1A-PSMD4 transcript on chromosome 1 (validated product marked by black arrow); lanes b indicate the PIP5K1A (WT) transcript; and lanes c indicate the active transcription of the retroposed gene from chromosome 10, which was found to be ubiquitously expressed. Primers were designed from regions that have diverged between the fusion transcript and the retrogene, so that each product is uniquely amplified (Supplemental Table S2). Products were verified by direct sequencing.
Similar articles
- Tandem chimerism as a means to increase protein complexity in the human genome.
Parra G, Reymond A, Dabbouseh N, Dermitzakis ET, Castelo R, Thomson TM, Antonarakis SE, Guigó R. Genome Res. 2006 Jan;16(1):37-44. doi: 10.1101/gr.4145906. Epub 2005 Dec 12. PMID: 16344564. Free PMC article.
- Chimeric RNAs generated by intergenic splicing in normal and cancer cells.
Jividen K, Li H. Genes Chromosomes Cancer. 2014 Dec;53(12):963-71. doi: 10.1002/gcc.22207. Epub 2014 Aug 11. PMID: 25131334. Review.
- Discovery of CTCF-sensitive Cis-spliced fusion RNAs between adjacent genes in human prostate cells.
Qin F, Song Z, Babiceanu M, Song Y, Facemire L, Singh R, Adli M, Li H. PLoS Genet. 2015 Feb 6;11(2):e1005001. doi: 10.1371/journal.pgen.1005001. eCollection 2015 Feb. PMID: 25658338. Free PMC article.
- A cell-based splicing reporter system to identify regulators of cis-splicing between adjacent genes.
Chwalenia K, Qin F, Singh S, Li H. Nucleic Acids Res. 2019 Feb 28;47(4):e24. doi: 10.1093/nar/gky1288. PMID: 30590765. Free PMC article.
- Splicing, transcription, and chromatin: a ménage à trois.
Allemand E, Batsché E, Muchardt C. Curr Opin Genet Dev. 2008 Apr;18(2):145-51. doi: 10.1016/j.gde.2008.01.006. Epub 2008 Mar 26. PMID: 18372167. Review.
Cited by
- Recurrent reciprocal RNA chimera involving YPEL5 and PPP1CB in chronic lymphocytic leukemia.
Velusamy T, Palanisamy N, Kalyana-Sundaram S, Sahasrabuddhe AA, Maher CA, Robinson DR, Bahler DW, Cornell TT, Wilson TE, Lim MS, Chinnaiyan AM, Elenitoba-Johnson KS. Proc Natl Acad Sci U S A. 2013 Feb 19;110(8):3035-40. doi: 10.1073/pnas.1214326110. Epub 2013 Feb 4. PMID: 23382248. Free PMC article.
- The molecular evolution of spiggin nesting glue in sticklebacks.
Seear PJ, Rosato E, Goodall-Copestake WP, Barber I. Mol Ecol. 2015 Sep;24(17):4474-88. doi: 10.1111/mec.13317. Epub 2015 Aug 3. PMID: 26173374. Free PMC article.
- Chromosome 7 gain and DNA hypermethylation at the HOXA10 locus are associated with expression of a stem cell related HOX-signature in glioblastoma.
Kurscheid S, Bady P, Sciuscio D, Samarzija I, Shay T, Vassallo I, Criekinge WV, Daniel RT, van den Bent MJ, Marosi C, Weller M, Mason WP, Domany E, Stupp R, Delorenzi M, Hegi ME. Genome Biol. 2015 Jan 27;16(1):16. doi: 10.1186/s13059-015-0583-7. PMID: 25622821. Free PMC article.
- A Comprehensive Analysis of Cell Type-Specific Nuclear RNA From Neurons and Glia of the Brain.
Reddy AS, O'Brien D, Pisat N, Weichselbaum CT, Sakers K, Lisci M, Dalal JS, Dougherty JD. Biol Psychiatry. 2017 Feb 1;81(3):252-264. doi: 10.1016/j.biopsych.2016.02.021. Epub 2016 Feb 24. PMID: 27113499. Free PMC article.
- Knowing when to stop: Transcription termination on protein-coding genes by eukaryotic RNAPII.
Rodríguez-Molina JB, West S, Passmore LA. Mol Cell. 2023 Feb 2;83(3):404-415. doi: 10.1016/j.molcel.2022.12.021. Epub 2023 Jan 11. PMID: 36634677. Free PMC article. Review.
References
- Communi, D., Suarez-Huerta, N., Dussossoy, D., Savi, P., and Boeynaems, J.M. 2001. Cotranscription and intergenic splicing of human P2Y11 and SSF1 genes. J. Biol. Chem. 276: 16561-16566.
- Gardiner-Garden, M. and Frommer, M. 1987. CpG islands in vertebrate genomes. J. Mol. Biol. 196: 261-282.
Web site references
- http://www.ncbi.nlm.nih.gov/Genbank/; NCBI GenBank version 136 (June 2003).
- http://www.ncbi.nlm.nih.gov/genome/guide/human/; human genome build 33 (April 2003).
- http://genome.ucsc.edu/; This site contains the reference sequence and working draft assemblies for a large collection of genomes. It also provides a portal to the ENCODE project.