SpliceGrapher: detecting patterns of alternative splicing from RNA-Seq data in the context of gene models and EST data

Mark F Rogers et al. Genome Biol. 2012.

Abstract

We propose a method for predicting splice graphs that enhances curated gene models using evidence from RNA-Seq and EST alignments. Results obtained using RNA-Seq experiments in Arabidopsis thaliana show that predictions made by our SpliceGrapher method are more consistent with current gene models than predictions made by TAU and Cufflinks. Furthermore, analysis of plant and human data indicates that the machine learning approach used by SpliceGrapher is useful for discriminating between real and spurious splice sites, and can improve the reliability of detection of alternative splicing. SpliceGrapher is available for download at http://SpliceGrapher.sf.net.

Figures

Figure 1

Example of a predicted splice graph in A. thaliana. RNA-Seq alignment data were loaded along with gene model annotations to create a composite model that incorporates all available evidence. SpliceGrapher's visualization modules produce color-coded graphs, based on the color scheme used by Sircah [29], that make it easy to see the exons and introns involved in AS events. RNA-Seq read coverage across one of the introns was sufficient to allow SpliceGrapher to identify an intron retention event (exon outlined in blue). In addition, a novel splice junction (highlighted in green) provided SpliceGrapher with evidence for an alternative 3' splicing event (highlighted in orange). The number associated with each splice junction indicates how many reads align across it. Vertical bands in the background depict exon boundaries in the original gene model.
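To make the caption's vocabulary concrete, here is a minimal sketch of a splice graph in which exons are nodes and splice junctions are edges annotated with read counts. It is not SpliceGrapher's implementation or API: the class name, method names, coordinates, and the plus-strand convention (donor is the last exonic base, acceptor is the first exonic base of the next exon) are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SpliceGraph:
    """Toy splice graph (hypothetical, plus strand only)."""
    exons: set = field(default_factory=set)        # {(start, end)}
    junctions: dict = field(default_factory=dict)  # {(donor, acceptor): reads}

    def add_exon(self, start, end):
        self.exons.add((start, end))

    def add_junction(self, donor, acceptor, reads=0):
        key = (donor, acceptor)
        self.junctions[key] = self.junctions.get(key, 0) + reads

    def alt_acceptor_events(self):
        """Donors joined to more than one acceptor (alternative 3' sites)."""
        by_donor = defaultdict(set)
        for donor, acceptor in self.junctions:
            by_donor[donor].add(acceptor)
        return {d: sorted(a) for d, a in by_donor.items() if len(a) > 1}

    def retained_introns(self):
        """Pairs (intron, exon) where a single exon spans a spliced intron."""
        hits = []
        for donor, acceptor in self.junctions:
            for start, end in self.exons:
                if start <= donor and end >= acceptor:
                    hits.append(((donor, acceptor), (start, end)))
        return sorted(hits)


# Invented coordinates, chosen only to exercise both event types.
graph = SpliceGraph()
for exon in [(100, 200), (300, 400), (100, 400), (500, 600), (520, 600)]:
    graph.add_exon(*exon)
graph.add_junction(200, 300, reads=12)
graph.add_junction(400, 500, reads=8)
graph.add_junction(400, 520, reads=3)  # plays the role of a novel junction

print(graph.alt_acceptor_events())  # {400: [500, 520]}
print(graph.retained_introns())     # [((200, 300), (100, 400))]
```

In this toy graph the donor at 400 connects to two acceptors, which is the pattern the caption describes as an alternative 3' event, and the exon (100, 400) spans the intron between 200 and 300, which corresponds to intron retention.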

Figure 2

Splice graph prediction pipeline. SpliceGrapher predicts splice graphs using information from gene models, EST alignments and RNA-Seq data. RNA-Seq exonic alignments may be performed using any popular short-read alignment tool. RNA-Seq spliced alignments may be performed using a conventional short-read mapping tool with a database of splice junctions predicted by SpliceGrapher, or they may be performed using short-read spliced-alignment programs such as TopHat, followed by filtering using SpliceGrapher's database of predicted splice sites. SpliceGrapher incorporates all of this information to produce a comprehensive splice graph prediction.
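The filtering step mentioned in this caption can be sketched as a simple membership test: a junction reported by a spliced aligner is kept only if both of its ends appear in a set of predicted splice sites. The function name, the tuple layout, and the example coordinates below are assumptions for illustration; the actual format of SpliceGrapher's splice-site database is not described here.

```python
def filter_junctions(junctions, predicted_donors, predicted_acceptors):
    """Split junctions into (kept, rejected).

    junctions: iterable of (chrom, donor, acceptor) tuples (hypothetical layout).
    predicted_donors / predicted_acceptors: sets of (chrom, position) pairs.
    A junction is kept only when both of its ends are predicted splice sites.
    """
    kept, rejected = [], []
    for junction in junctions:
        chrom, donor, acceptor = junction
        supported = ((chrom, donor) in predicted_donors
                     and (chrom, acceptor) in predicted_acceptors)
        (kept if supported else rejected).append(junction)
    return kept, rejected


# Invented example: the third junction has no predicted site at either end.
aligned = [("Chr1", 200, 300), ("Chr1", 400, 520), ("Chr1", 410, 777)]
donors = {("Chr1", 200), ("Chr1", 400)}
acceptors = {("Chr1", 300), ("Chr1", 520)}

kept, rejected = filter_junctions(aligned, donors, acceptors)
print(kept)      # [('Chr1', 200, 300), ('Chr1', 400, 520)]
print(rejected)  # [('Chr1', 410, 777)]
```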

Figure 3

Ambiguities in RNA-Seq data. This figure demonstrates ambiguities that arise in RNA-Seq data that make isoform prediction challenging. Because there is read coverage across several introns, SpliceGrapher is not able to determine whether this is a result of a single intron retention event, or several independent events.
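A small sketch can illustrate why coverage alone leaves this question open. Under the simplifying assumption that an intron counts as "covered" when every base has at least a minimum read depth, each non-empty subset of covered introns is one way a single transcript could retain introns, and depth by itself does not favor any subset. The function names, the depth threshold, and the coordinates are hypothetical and are not taken from SpliceGrapher.

```python
from itertools import combinations


def covered_introns(depth, introns, min_depth=1):
    """Return introns whose every base has at least min_depth reads.

    depth: dict mapping position -> read depth (missing positions count as 0).
    introns: list of (start, end) pairs, inclusive coordinates.
    """
    return [(start, end) for start, end in introns
            if all(depth.get(pos, 0) >= min_depth
                   for pos in range(start, end + 1))]


def retention_hypotheses(covered):
    """Enumerate every non-empty subset of the covered introns."""
    return [subset
            for size in range(1, len(covered) + 1)
            for subset in combinations(covered, size)]


# Invented data: reads cover positions 11-25 fully and 31-33 only partially.
introns = [(11, 15), (21, 25), (31, 35)]
depth = {pos: 2 for pos in range(11, 26)}
depth.update({pos: 1 for pos in range(31, 34)})

covered = covered_introns(depth, introns)
print(covered)                             # [(11, 15), (21, 25)]
print(len(retention_hypotheses(covered)))  # 3
```

With two covered introns there are already three candidate explanations (either intron alone, or both together), and the number grows as 2^k - 1 for k covered introns.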

Figure 4

Example of a Cufflinks prediction. We provide the predictions made by Cufflinks for the same gene whose SpliceGrapher predictions are shown in Figure 1. Some of the splice junctions used by Cufflinks are predicted to be false positives by SpliceGrapher's accurate splice junction classifiers (red edges in the plot). These lead to detection of questionable AS events.

Figure 5

SpliceGrapher prediction from RNA-Seq and EST data. This example shows how SpliceGrapher can use both RNA-Seq data and EST data to produce predictions that incorporate the strengths of each data type. RNA-Seq data provide evidence for two novel splice junctions (fourth panel down, highlighted in green) that SpliceGrapher uses to infer an alternative 3' splicing event. EST alignments provide compelling evidence for an intron retention event. SpliceGrapher combines these predictions into the final predicted graph.
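One simple way to picture the combination step is as a union of features in which each feature remembers which evidence sources support it. The sketch below is an illustration under that assumption only; the function name, the feature tuples, and the source labels are hypothetical, and SpliceGrapher's actual merging logic may differ.

```python
def merge_evidence(sources):
    """Union features from several evidence sources.

    sources: dict mapping a source name to an iterable of hashable features.
    Returns a dict mapping each feature to the set of sources supporting it.
    """
    merged = {}
    for source_name, features in sources.items():
        for feature in features:
            merged.setdefault(feature, set()).add(source_name)
    return merged


# Invented features: junctions as (kind, donor, acceptor), retained introns
# as (kind, start, end).
evidence = {
    "gene_model": [("junction", 200, 300), ("junction", 400, 500)],
    "rnaseq": [("junction", 400, 500), ("junction", 400, 520)],
    "est": [("retained_intron", 200, 300)],
}

for feature, support in sorted(merge_evidence(evidence).items()):
    print(feature, sorted(support))
```

Keeping the supporting sources alongside each feature makes it possible to see afterwards which parts of the final graph come from the gene model, from RNA-Seq, or from ESTs.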

References

    1. Mortazavi A, Williams B, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5:621–628. doi: 10.1038/nmeth.1226.
    2. Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10:57–63. doi: 10.1038/nrg2484.
    3. Filichkin S, Priest H, Givan S, Shen R, Bryant D, Fox S, Wong W, Mockler T. Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome Res. 2010;20:45. doi: 10.1101/gr.093302.109.
    4. Harr B, Turner L. Genome-wide analysis of alternative splicing evolution among Mus subspecies. Mol Ecol. 2010;19:228–239.
    5. Ramani A, Calarco J, Pan Q, Mavandadi S, Wang Y, Nelson A, Lee L, Morris Q, Blencowe B, Zhen M, Fraser A. Genome-wide analysis of alternative splicing in Caenorhabditis elegans. Genome Res. 2011;21:342. doi: 10.1101/gr.114645.110.
