A change-point model for identifying 3'UTR switching by next-generation RNA sequencing
Wei Wang et al. Bioinformatics. 2014.
Abstract
Motivation: Next-generation RNA sequencing offers an opportunity to investigate the transcriptome at an unprecedented scale. Recent studies have revealed widespread alternative polyadenylation (polyA) in eukaryotes, which produces mRNA isoforms that differ in their 3' untranslated regions (3'UTRs) and thereby regulates the stability, localization and translation of mRNA. However, very few, if any, methods and tools are available for directly analyzing this type of alternative RNA processing event. Conventional methods rely on annotated polyA sites; yet such annotation remains incomplete, and identifying polyA sites is still challenging. The goal of this article is to develop methods for detecting 3'UTR switching without any prior knowledge of polyA annotations.
Results: We propose a change-point model based on a likelihood ratio test for detecting 3'UTR switching. We develop a directional testing procedure for identifying dramatic shortening or lengthening events in 3'UTRs, while controlling the mixed directional false discovery rate (mdFDR) at a nominal level. To our knowledge, this is the first approach to analyze 3'UTR switching directly without relying on any polyA annotations. Simulation studies and applications to two real datasets show that the proposed method is powerful, accurate and feasible for the analysis of next-generation RNA sequencing data.
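As a rough illustration of the change-point idea, the sketch below scans every candidate position along one 3'UTR, splits the reads from each condition into those upstream and downstream of the candidate, and scores the resulting 2x2 table with a likelihood ratio (G) statistic. This is a minimal sketch under stated assumptions, not the authors' Java implementation: the per-position count representation and the function names `g_statistic` and `scan_change_point` are hypothetical, and the published model may define its likelihood and test differently.

```python
import math


def g_statistic(a, b, c, d):
    """Likelihood ratio (G) statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    g = 0.0
    for obs, r, k in ((a, 0, 0), (b, 0, 1), (c, 1, 0), (d, 1, 1)):
        if obs > 0:  # empty cells contribute nothing to the statistic
            expected = rows[r] * cols[k] / n
            g += 2.0 * obs * math.log(obs / expected)
    return g


def scan_change_point(treatment, control):
    """Return (position, statistic) of the best-scoring split.

    treatment, control: per-position read counts along the same 3'UTR
    (an assumed input format). Position k means the split falls between
    index k-1 and index k. Returns (None, 0.0) if no split scores above zero.
    """
    if len(treatment) != len(control):
        raise ValueError("count vectors must have the same length")
    t_total, c_total = sum(treatment), sum(control)
    if t_total == 0 or c_total == 0:
        return None, 0.0
    best_pos, best_g = None, 0.0
    t_left = c_left = 0
    for k in range(1, len(treatment)):
        t_left += treatment[k - 1]
        c_left += control[k - 1]
        t_right, c_right = t_total - t_left, c_total - c_left
        # Skip splits that leave one side without any reads.
        if t_left + c_left == 0 or t_right + c_right == 0:
            continue
        g = g_statistic(t_left, t_right, c_left, c_right)
        if g > best_g:
            best_pos, best_g = k, g
    return best_pos, best_g


# Toy example: treatment coverage drops after position 5, control stays flat.
treatment = [10] * 5 + [1] * 5
control = [10] * 10
position, statistic = scan_change_point(treatment, control)
print(position, round(statistic, 2))
```

Because the statistic is maximized over all candidate positions, its null distribution is not a plain chi-square with one degree of freedom, so a real analysis would need a calibrated test rather than this raw score.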
Conclusions: The proposed method will fill a void among alternative RNA processing analysis tools for transcriptome studies. It can help to obtain additional insights from RNA sequencing data by understanding gene regulation mechanisms through the analysis of 3'UTR switching.
Availability and implementation: The software is implemented in Java and can be freely downloaded from http://utr.sourceforge.net/.
Contact: zhiwei@njit.edu or hongzhe@mail.med.upenn.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Figures
Fig. 1.
Illustration and notation of the change-point model for the 3′UTR switching problem. (A) Treatment process; (B) Control process; (C) Combined process. Isoform 2 has a higher percentage expressed in the treatment condition, leading to a higher ratio of short-read density in the common versus the extended region, as defined by the proximal and distal polyA sites, respectively
Fig. 2.
Power and FDR evaluation of the change-point model at the nominal level FDR = 0.05. The FDR for the change-point model was controlled at the nominal level. Fold change, expression level and change-point position all have an influence on 3′UTR switching detection
Fig. 3.
Power and mdFDR evaluation of the directional testing procedure at the nominal level mdFDR = 0.05. For all the settings, our proposed testing framework is capable of controlling mdFDR at the nominal level. It is easier to capture the 3′UTR switching events when the odds ratio (OR) is higher, the expression level is higher or the change point is closer to the middle
Fig. 4.
Examples of two MYC-dependent 3′UTR shortening events. The vertical lines indicate the estimated change points predicted by our proposed model. We observed dramatic changes before and after the predicted change points. Clearly, the two genes LDHA and OGDH tend to use the proximal polyA site instead of the distal site in siMYC-DHT. These change points are also consistent and supported by the polyA sites annotated in the PolyA_DB (colored bars in the PolyA_DB and Poly(A) tracks). Together, these results suggest that our proposed method works well to detect 3′UTR switching without relying on any polyA annotations
Fig. 5.
Examples of two shortening events that were identified by our method but missed by the linear trend test. The vertical lines indicate the change points predicted by our proposed model. We observed a clear change before and after the predicted change points, suggesting that our proposed method works well to detect 3′UTR switching without relying on any polyA annotation information. The two genes OAZ1 and SDC1 tend to use the shorter isoform in the MCF_7 cancer cell line in comparison with the control sample MCF_10A, and show clear shortening patterns
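Figs. 2 and 3 report error rates at a nominal level of 0.05. For readers unfamiliar with what controlling a false discovery rate at a nominal level involves, the sketch below implements the standard Benjamini-Hochberg step-up procedure on a list of per-gene p-values. It is an illustration only: the abstract does not state which procedure the authors use, the directional (mdFDR) procedure additionally assigns a direction to each rejection and is not reproduced here, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Standard step-up rule: find the largest rank r (1-based, in ascending
    order of p-value) with p_(r) <= alpha * r / m, then reject the r
    hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Toy p-values for four genes; only the first passes its threshold.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

The step-up rule can reject a hypothesis whose own p-value misses its threshold as long as a larger p-value further down the sorted list passes, which is why the loop records the last passing rank instead of stopping at the first failure.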