A non-EST-based method for exon-skipping prediction
Comparative Study
Genome Res. 2004 Aug;14(8):1617-23.
doi: 10.1101/gr.2572604.
- PMID: 15289480
- PMCID: PMC509271
Rotem Sorek et al. Genome Res. 2004 Aug.
Abstract
It is estimated that between 35% and 74% of all human genes can undergo alternative splicing. Currently, the most efficient methods for large-scale detection of alternative splicing use expressed sequence tags (ESTs) or microarray analysis. As these methods merely sample the transcriptome, splice variants that do not appear in deeply sampled tissues have a low probability of being detected. We present a new method by which we can predict that an internal exon is skipped (namely whether it is a cassette-exon) merely based on its naked genomic sequence and on the sequence of its mouse ortholog. No other data, such as ESTs, are required for the prediction. Using our method, which was experimentally validated, we detected hundreds of novel splice variants that were not detectable using ESTs. We show that a substantial fraction of the splice variants in the human genome could not be identified through current human EST or cDNA data.
Copyright 2004 Cold Spring Harbor Laboratory Press
Figures
Figure 1
Graphic representation of the differences between alternative and constitutive exons. In each of the following curves, constitutive exons are shown as squares and alternative exons as diamonds. (A) Length of the conserved region in the nearest 100 nt of the flanking upstream intron. _x_-axis, length of conserved region (best Sim4 local alignment); _y_-axis, percent of exons with an upstream conserved region greater than or equal to the value in x. Conservation was detected using local alignment with the 100 counterpart mouse intronic nt. A minimum hit was 12 consecutive perfectly matching nt. (B) Length of the conserved region in the nearest 100 nt of the flanking downstream intron. Axes as in A. (C) Exon size distribution. _x_-axis, exon size; _y_-axis, percent of exons with size less than or equal to the value in x. (D) Human–mouse exon identity. _x_-axis, percent identity in the global alignment of the human and the mouse exons; _y_-axis, percent of exons with identity greater than or equal to the value in x. (E) Human–mouse exon identity, for exons whose size is a multiple of 3. Axes as in D. Note that combining two features gives better separation of the two exon types.
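The features named in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the paper used Sim4 local alignments, whereas this sketch assumes the human and mouse sequences have already been aligned into equal-length strings with `-` for gaps. The function names and the returned dictionary keys are hypothetical; only the 12-nt minimum hit and the multiple-of-3 check come from the caption.

```python
MIN_HIT = 12  # minimum run of perfectly matching nt, as stated in the caption


def longest_perfect_run(a, b):
    """Longest run of identical, non-gap positions in two aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    best = cur = 0
    for x, y in zip(a, b):
        if x == y and x != "-":
            cur += 1
            best = max(best, cur)
        else:
            cur = 0
    return best


def percent_identity(a, b):
    """Percent of alignment columns that match (assumption: gaps count as mismatches)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def exon_features(human_exon, mouse_exon, up_human, up_mouse, down_human, down_mouse):
    """Collect the per-exon features plotted in panels A to E (illustrative only)."""
    exon_length = len(human_exon.replace("-", ""))
    return {
        "exon_length": exon_length,
        "multiple_of_3": exon_length % 3 == 0,
        "identity": percent_identity(human_exon, mouse_exon),
        "upstream_conserved": longest_perfect_run(up_human, up_mouse) >= MIN_HIT,
        "downstream_conserved": longest_perfect_run(down_human, down_mouse) >= MIN_HIT,
    }
```

In practice the flanking sequences would be the nearest 100 intronic nt on each side of the exon; the sketch leaves the extraction and alignment steps to the caller.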
Figure 2
Experimental validation for the existence of alternative splicing in selected predicted exons. RT–PCR for 15 exons (detailed in Table 2), for which no EST/cDNA indicating alternative splicing was found, was conducted over 14 different tissue types and cell lines (see Methods). Detected splice variants were confirmed by sequencing. For nine of these exons a splice isoform was detected in at least one of the tissues tested. Only a single tissue is shown here for each of these nine exons. Lane 1, DNA size marker. Lane 2, exon 2 skipping in FGF11 gene in ovary tissue (exon inclusion 344 nt; skipping 233 nt). Lane 3, exon 4 skipping in EFNA5 gene in ovary tissue (exon inclusion 287 nt; skipping 199 nt). Lane 4, exon 8 skipping in NCOA1 gene in placenta tissue (exon inclusion 377 nt; skipping 275 nt). Lane 5, exon 22 skipping in PAM gene in cervix tissue (exon inclusion 323 nt; skipping 215 nt). The additional upper band contains a novel exon in PAM. Lane 6, exon 9 skipping in GOLGA4 gene in uterus tissue (exon inclusion 288 nt; skipping 213 nt). Lane 7, exon 9 skipping in NPR2 gene in placenta tissue (exon inclusion 282 nt; skipping 207 nt). Lane 8, intron 8 retention in VLDLR gene in ovary tissue (wild type 324 nt; intron retention 427 nt). Lane 9, alternative acceptor site in exon 12 of BAZ1A in ovary tissue (wild type 351 nt; alternative acceptor variant 265 nt). The uppermost band represents a new exon in BAZ1A, inserted between exons 12 and 13. Lane 10, alternative acceptor site in exon 7 of SMARCD1 in uterus tissue (wild type 353 nt; exon 7 extension 397 nt).
Figure 3
Sensitivity vs. false-positive rate in classification rules. Each square on the curve represents the performance of a single classification rule. _x_-axis, 1-specificity, i.e., percent constitutive exons (false positives) retrieved by the rule. _y_-axis, sensitivity, i.e., percent alternative exons (true positives) identified by the rule. Values were computed relative to the training set. Rules that were used for this plot are provided as Supplemental material.
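As a rough illustration of how one point on such a curve could be computed, the sketch below scores a single classification rule against labeled exon sets. It follows the definitions in the caption (sensitivity as percent of alternative exons retrieved, 1-specificity as percent of constitutive exons retrieved). The function name, the dictionary keys, the example rule, and the toy exons are all invented for illustration and are not taken from the paper or its supplemental rules.

```python
def rule_performance(rule, alternative_exons, constitutive_exons):
    """Return (sensitivity, false-positive rate) in percent for one rule.

    `rule` is any callable that takes an exon record and returns True
    when it classifies the exon as alternative.
    """
    true_pos = sum(1 for exon in alternative_exons if rule(exon))
    false_pos = sum(1 for exon in constitutive_exons if rule(exon))
    sensitivity = 100.0 * true_pos / len(alternative_exons)
    false_positive_rate = 100.0 * false_pos / len(constitutive_exons)
    return sensitivity, false_positive_rate


# Hypothetical rule combining two features (thresholds are made up).
def example_rule(exon):
    return exon["identity"] >= 95 and exon["length"] % 3 == 0


# Toy records, invented purely to exercise the function.
toy_alternative = [
    {"identity": 98, "length": 90},
    {"identity": 90, "length": 90},
]
toy_constitutive = [
    {"identity": 97, "length": 91},
    {"identity": 99, "length": 120},
    {"identity": 80, "length": 120},
    {"identity": 96, "length": 100},
]

print(rule_performance(example_rule, toy_alternative, toy_constitutive))
```

Evaluating a family of rules this way and plotting each result as one square would give a curve of the kind described, with stricter rules trading sensitivity for fewer false positives.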
Similar articles
- How prevalent is functional alternative splicing in the human genome?
Sorek R, Shamir R, Ast G. Trends Genet. 2004 Feb;20(2):68-71. doi: 10.1016/j.tig.2003.12.004. PMID: 14746986 Review.
- Non-EST based prediction of exon skipping and intron retention events using Pfam information.
Hiller M, Huse K, Platzer M, Backofen R. Nucleic Acids Res. 2005 Oct 4;33(17):5611-21. doi: 10.1093/nar/gki870. Print 2005. PMID: 16204458 Free PMC article.
- Transcriptome and genome conservation of alternative splicing events in humans and mice.
Sugnet CW, Kent WJ, Ares M Jr, Haussler D. Pac Symp Biocomput. 2004:66-77. doi: 10.1142/9789812704856_0007. PMID: 14992493
- Gene structure prediction and alternative splicing analysis using genomically aligned ESTs.
Kan Z, Rouchka EC, Gish WR, States DJ. Genome Res. 2001 May;11(5):889-900. doi: 10.1101/gr.155001. PMID: 11337482 Free PMC article.
- Bioinformatics detection of alternative splicing.
Kim N, Lee C. Methods Mol Biol. 2008;452:179-97. doi: 10.1007/978-1-60327-159-2_9. PMID: 18566765 Review.
Cited by
- Alternative RNA splicing in stem cells and cancer stem cells: Importance of transcript-based expression analysis.
Ebrahimie E, Rahimirad S, Tahsili M, Mohammadi-Dehcheshmeh M. World J Stem Cells. 2021 Oct 26;13(10):1394-1416. doi: 10.4252/wjsc.v13.i10.1394. PMID: 34786151 Free PMC article. Review.
- A CpG island promoter drives the CXXC5 gene expression.
Yaşar P, Kars G, Yavuz K, Ayaz G, Oğuztüzün Ç, Bilgen E, Suvacı Z, Çetinkol ÖP, Can T, Muyan M. Sci Rep. 2021 Aug 2;11(1):15655. doi: 10.1038/s41598-021-95165-6. PMID: 34341443 Free PMC article.
- Rotavirus Infection Alters Splicing of the Stress-Related Transcription Factor XBP1.
Duarte M, Vende P, Charpilienne A, Gratia M, Laroche C, Poncet D. J Virol. 2019 Feb 19;93(5):e01739-18. doi: 10.1128/JVI.01739-18. Print 2019 Mar 1. PMID: 30541862 Free PMC article.
- The Expanding Landscape of Alternative Splicing Variation in Human Populations.
Park E, Pan Z, Zhang Z, Lin L, Xing Y. Am J Hum Genet. 2018 Jan 4;102(1):11-26. doi: 10.1016/j.ajhg.2017.11.002. PMID: 29304370 Free PMC article. Review.
- Analysis and Prediction of Exon Skipping Events from RNA-Seq with Sequence Information Using Rotation Forest.
Du X, Hu C, Yao Y, Sun S, Zhang Y. Int J Mol Sci. 2017 Dec 12;18(12):2691. doi: 10.3390/ijms18122691. PMID: 29231888 Free PMC article.
References
- Berget, S.M. 1995. Exon recognition in vertebrate splicing. J. Biol. Chem. 270: 2411–2414.
- Brett, D., Hanke, J., Lehmann, G., Haase, S., Delbruck, S., Krueger, S., Reich, J., and Bork, P. 2000. EST comparison indicates 38% of human mRNAs contain possible alternative splice forms. FEBS Lett. 474: 83–86.
- Burge, C. and Karlin, S. 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268: 78–94.
- Cartegni, L., Chew, S.L., and Krainer, A.R. 2002. Listening to silence and understanding nonsense: Exonic mutations that affect splicing. Nat. Rev. Genet. 3: 285–298.
WEB SITE REFERENCES
- http://genes.mit.edu/GENSCANinfo.html; GENSCAN.
- www.ncbi.nlm.nih.gov/dbEST; GenBank version 136 (June 2003).
- www.ncbi.nlm.nih.gov/genome/guide/human; Human genome (April 2003 assembly).