A compendium of RNA-binding motifs for decoding gene regulation
Accession codes
Data deposits
Raw and processed microarray data are available at GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE41235. The derived motifs and results of analyses are available at http://hugheslab.ccbr.utoronto.ca/supplementary-data/RNAcompete_eukarya/.
Acknowledgements
We thank H. van Bakel for computational support, A. Ramani and J. Calarco for discussions, Y. Wu, G. Rasanathan, M. Krishnamoorthy, O. Boright, A. Janska, J. Li, S. Talukder, A. Cote and S. Votruba for technical assistance, L. Sutherland for purchasing RBM5 protein and for feedback on the manuscript, S. Jain for software modified to create Fig. 2, and N. Barbosa-Morais for generating cRPKM values from autism RNA-seq data. We thank M. Kiledjian (PCBP1 and PCBP2), J. Stevenin (SRSF2 and SFRS7), S. Richard (QKI), M. Gorospe (TIA1), B. Chabot (SRSF9), A. Berglund (MBNL1), F. Pagani (DAZAP1), A. Bindereif (HNRNPL), M. Freeman (HNRNPK), E. Miska (LIN28A), K. Kohno (YBX1), M. Garcia-Blanco (PTBP1), R. Wharton (PUM-HD), C. Smibert (Vts1p) and M. Blanchette (Hrb27C, Hrb87F and Hrb98DE) for sending published constructs. This work was supported by funding from NIH (1R01HG00570 to T.R.H. and Q.D.M., R01GM084034 to K.W.L.), CIHR (MOP-49451 to T.R.H., MOP-93671 to Q.D.M., MOP-125894 to Q.D.M. and T.R.H., MOP-67011 to B.J.B., and MOP-14409 to H.D.L.), and the Intramural Program of the NIDDK (DK015602-05 to E.P.L.). K.B.C. and S.G. hold NSERC Alexander Graham Bell Canada Graduate Scholarships. M.T.W. was funded by fellowships from CIHR and CIFAR. H.S.N. holds a Charles H. Best Fellowship and was funded partially by awards from CIFAR to T.R.H. and B.J.F. M.I. is the recipient of an HFSP LT Fellowship.
Author information
Author notes
- Matthew T. Weirauch
Present address: Center for Autoimmune Genomics and Etiology (CAGE) and Divisions of Rheumatology and Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229, USA.
- Debashish Ray, Hilal Kazan, Kate B. Cook, Matthew T. Weirauch and Hamed S. Najafabadi
These authors contributed equally to this work.
Authors and Affiliations
- Donnelly Centre, University of Toronto, Toronto M5S 3E1, Canada
Debashish Ray, Matthew T. Weirauch, Hamed S. Najafabadi, Mihai Albu, Hong Zheng, Ally Yang, Hong Na, Manuel Irimia, Andrew G. Fraser, Benjamin J. Blencowe, Quaid D. Morris & Timothy R. Hughes
- Department of Computer Science, University of Toronto, Toronto M5S 2E4, Canada
Hilal Kazan & Quaid D. Morris
- Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
Kate B. Cook, Xiao Li, Serge Gueroussov, Howard D. Lipshitz, Andrew G. Fraser, Benjamin J. Blencowe, Quaid D. Morris & Timothy R. Hughes
- Department of Electrical and Computer Engineering, University of Toronto, Toronto M5S 3G4, Canada
Hamed S. Najafabadi, Brendan J. Frey & Quaid D. Morris
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, 20892, Maryland, USA
Leah H. Matzat, Ryan K. Dale & Elissa P. Lei
- Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
Sarah A. Smith, Christopher A. Yarosh, Behnam Nabet, Russ P. Carstens & Kristen W. Lynch
- Department of Biochemistry, Emory University School of Medicine, Atlanta, 30322, Georgia, USA
Seth M. Kelly & Anita H. Corbett
- Department of Biology and Center for Genomics and Systems Biology, New York University, New York, 10003, New York, USA
Desirea Mecenas & Fabio Piano
- Molecular and Cellular Pharmacology Program, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, 53706, Wisconsin, USA
Weimin Li, Rakesh S. Laishram & Richard A. Anderson
- Children’s Cancer Research Institute, UTHSCSA, San Antonio, 78229, Texas, USA
Mei Qiao & Luiz O. F. Penalva
Authors
- Debashish Ray
- Hilal Kazan
- Kate B. Cook
- Matthew T. Weirauch
- Hamed S. Najafabadi
- Xiao Li
- Serge Gueroussov
- Mihai Albu
- Hong Zheng
- Ally Yang
- Hong Na
- Manuel Irimia
- Leah H. Matzat
- Ryan K. Dale
- Sarah A. Smith
- Christopher A. Yarosh
- Seth M. Kelly
- Behnam Nabet
- Desirea Mecenas
- Weimin Li
- Rakesh S. Laishram
- Mei Qiao
- Howard D. Lipshitz
- Fabio Piano
- Anita H. Corbett
- Russ P. Carstens
- Brendan J. Frey
- Richard A. Anderson
- Kristen W. Lynch
- Luiz O. F. Penalva
- Elissa P. Lei
- Andrew G. Fraser
- Benjamin J. Blencowe
- Quaid D. Morris
- Timothy R. Hughes
Contributions
D.R., H.K., K.B.C., M.T.W. and H.S.N. made unique, essential and extensive contributions to the manuscript, and are ordered by amount of time and effort contributed. D.R. and H.K. developed most of the laboratory and computational components of RNAcompete, respectively. D.R., H.Z., A.Y., H.N., L.H.M., S.A.S., C.A.Y., S.M.K., B.N., D.M., W.L., R.S.L. and M.Q. cloned, expressed and purified the proteins. D.R. ran the RNAcompete assays, including data extraction. H.K. and K.B.C. processed the data, H.K. and K.B.C. generated motifs, and H.K., K.B.C., M.T.W. and H.S.N. performed the motif analyses. H.K. assembled the in vivo protein-RNA data sets. L.H.M. and R.K.D. performed and analysed RIP-seq data. K.B.C. developed the supplementary website and Figs 1 and 2 with assistance from H.K. and M.T.W. M.T.W. and M.A. created the cisBP-RNA database. M.T.W., H.S.N. and T.R.H. created Fig. 3. H.S.N. performed the analyses of human splicing, RNA stability data and human sequence conservation, and created Figs 4 and 5. M.I. and S.G. generated and analysed RNA-seq data and S.G. performed reporter-based RNA stability assays. X.L. performed Drosophila data analysis. H.D.L., F.P., A.H.C., R.P.C., B.J.F., R.A.A., K.W.L., L.O.F.P., E.P.L., B.J.B. and A.G.F. helped organize and support the project, and provided feedback on the manuscript. B.J.F., B.J.B. and A.G.F. provided critical advice and commentary on data analysis. Q.D.M. and T.R.H. conceived of the study, supervised the project and wrote the manuscript with contributions from D.R., H.K., K.B.C., B.J.B., A.F. and H.S.N.
Corresponding authors
Correspondence to Quaid D. Morris or Timothy R. Hughes.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Information
This file contains Supplementary Methods, Supplementary Figures 1-6, Supplementary Tables 1-4 and additional references. (PDF 2569 kb)
Supplementary Data 1
This file shows RNA-binding proteins with known consensus motifs. It contains panels for human and Drosophila listing RBPs with known consensus motifs as well as the Pubmed ID of the publication that defined the motif. (XLSX 27 kb)
Supplementary Data 2
The RNAcompete master file. This file contains data on all RNAcompete experiments indexed by motif ID including: name, systematic ID and species of protein queried, the resulting motif, amino acid sequence of plasmid insert, and information on binding conditions used. (XLSX 2614 kb)
Supplementary Data 3
Secondary structure analysis. This file contains data panels in which each row corresponds to a significantly enriched secondary structure context for a given RNAcompete experiment along with P-values and effect sizes. Classification panel summarizes analysis results by motif. (XLSX 30 kb)
Supplementary Data 4
Clustered E-scores. This file contains the data matrix used in Figures 1b and S7. (TXT 16827 kb)
Supplementary Data 5
Comparison of RNAcompete and literature motifs. This file shows the results of comparison with previously defined motifs for RNAcompete RBPs. (XLSX 515 kb)
Supplementary Data 6
AUROC scores for in vivo and in vitro defined motifs on in vivo binding data. This file contains AUROCs for RNAcompete motifs on in vivo binding data described in Table S2, along with motifs learned by Malarkey on these data and AUROC scores for previously defined motifs for these RBPs. (XLSX 19 kb)
Supplementary Data 7
Post-transcriptional regulation (PTR) analysis in human. This file contains additional details and results of PTR analysis in human including predicted RBP-transcript regulatory networks for splicing and stability analysis. (XLSX 1445 kb)
Supplementary Data 8
Post-transcriptional regulation (PTR) analysis in Drosophila. This file contains details and results of PTR analysis for Drosophila including lists of PTR categories enriched for RNAcompete-derived IUPAC motifs, weights of trained logistic regression classifiers, Drosophila RBP(s) associated with each IUPAC motif, and IUPAC motifs queried. (XLSX 44 kb)
Supplementary Data 9
Sources of gene and Pfam models. This file details sources for gene and protein models for all organisms used in cisBP-RNA and in this paper. It also indicates the Pfam models used to scan for RBDs. (XLSX 34 kb)
Cite this article
Ray, D., Kazan, H., Cook, K. et al. A compendium of RNA-binding motifs for decoding gene regulation. Nature 499, 172–177 (2013). https://doi.org/10.1038/nature12311
- Received: 08 January 2013
- Accepted: 17 May 2013
- Published: 10 July 2013
- Issue Date: 11 July 2013
- DOI: https://doi.org/10.1038/nature12311