A compendium of RNA-binding motifs for decoding gene regulation

Accession codes

Accessions

Gene Expression Omnibus

Data deposits

Raw and processed microarray data are available at GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE41235. The derived motifs and results of analyses are available at http://hugheslab.ccbr.utoronto.ca/supplementary-data/RNAcompete_eukarya/.

Acknowledgements

We thank H. van Bakel for computational support, A. Ramani and J. Calarco for discussions, Y. Wu, G. Rasanathan, M. Krishnamoorthy, O. Boright, A. Janska, J. Li, S. Talukder, A. Cote and S. Votruba for technical assistance, L. Sutherland for purchasing RBM5 protein and for feedback on the manuscript, S. Jain for software modified to create Fig. 2, and N. Barbosa-Morais for generating cRPKM values from autism RNA-seq data. We thank M. Kiledjian (PCBP1 and PCBP2), J. Stevenin (SRSF2 and SFRS7), S. Richard (QKI), M. Gorospe (TIA1), B. Chabot (SRSF9), A. Berglund (MBNL1), F. Pagani (DAZAP1), A. Bindereif (HNRNPL), M. Freeman (HNRNPK), E. Miska (LIN28A), K. Kohno (YBX1), M. Garcia-Blanco (PTBP1), R. Wharton (PUM-HD), C. Smibert (Vts1p) and M. Blanchette (Hrb27C, Hrb87F and Hrb98DE) for sending published constructs. This work was supported by funding from NIH (1R01HG00570 to T.R.H. and Q.D.M., R01GM084034 to K.W.L.), CIHR (MOP-49451 to T.R.H., MOP-93671 to Q.D.M., MOP-125894 to Q.D.M. and T.R.H., MOP-67011 to B.J.B., and MOP-14409 to H.D.L.), and the Intramural Program of the NIDDK (DK015602-05 to E.P.L.). K.B.C. and S.G. hold NSERC Alexander Graham Bell Canada Graduate Scholarships. M.T.W. was funded by fellowships from CIHR and CIFAR. H.S.N. holds a Charles H. Best Fellowship and was funded partially by awards from CIFAR to T.R.H. and B.J.F. M.I. is the recipient of an HFSP LT Fellowship.

Author information

Author notes

  1. Matthew T. Weirauch
    Present address: Center for Autoimmune Genomics and Etiology (CAGE) and Divisions of Rheumatology and Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229, USA.
  2. Debashish Ray, Hilal Kazan, Kate B. Cook, Matthew T. Weirauch and Hamed S. Najafabadi: These authors contributed equally to this work.

Authors and Affiliations

  1. Donnelly Centre, University of Toronto, Toronto M5S 3E1, Canada
    Debashish Ray, Matthew T. Weirauch, Hamed S. Najafabadi, Mihai Albu, Hong Zheng, Ally Yang, Hong Na, Manuel Irimia, Andrew G. Fraser, Benjamin J. Blencowe, Quaid D. Morris & Timothy R. Hughes
  2. Department of Computer Science, University of Toronto, Toronto M5S 2E4, Canada
    Hilal Kazan & Quaid D. Morris
  3. Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
    Kate B. Cook, Xiao Li, Serge Gueroussov, Howard D. Lipshitz, Andrew G. Fraser, Benjamin J. Blencowe, Quaid D. Morris & Timothy R. Hughes
  4. Department of Electrical and Computer Engineering, University of Toronto, Toronto M5S 3G4, Canada
    Hamed S. Najafabadi, Brendan J. Frey & Quaid D. Morris
  5. Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, 20892, Maryland, USA
    Leah H. Matzat, Ryan K. Dale & Elissa P. Lei
  6. Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
    Sarah A. Smith, Christopher A. Yarosh, Behnam Nabet, Russ P. Carstens & Kristen W. Lynch
  7. Department of Biochemistry, Emory University School of Medicine, Atlanta, 30322, Georgia, USA
    Seth M. Kelly & Anita H. Corbett
  8. Department of Biology and Center for Genomics and Systems Biology, New York University, New York, 10003, New York, USA
    Desirea Mecenas & Fabio Piano
  9. Molecular and Cellular Pharmacology Program, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, 53706, Wisconsin, USA
    Weimin Li, Rakesh S. Laishram & Richard A. Anderson
  10. Children’s Cancer Research Institute, UTHSCSA, San Antonio, 78229, Texas, USA
    Mei Qiao & Luiz O. F. Penalva

Authors

  1. Debashish Ray
  2. Hilal Kazan
  3. Kate B. Cook
  4. Matthew T. Weirauch
  5. Hamed S. Najafabadi
  6. Xiao Li
  7. Serge Gueroussov
  8. Mihai Albu
  9. Hong Zheng
  10. Ally Yang
  11. Hong Na
  12. Manuel Irimia
  13. Leah H. Matzat
  14. Ryan K. Dale
  15. Sarah A. Smith
  16. Christopher A. Yarosh
  17. Seth M. Kelly
  18. Behnam Nabet
  19. Desirea Mecenas
  20. Weimin Li
  21. Rakesh S. Laishram
  22. Mei Qiao
  23. Howard D. Lipshitz
  24. Fabio Piano
  25. Anita H. Corbett
  26. Russ P. Carstens
  27. Brendan J. Frey
  28. Richard A. Anderson
  29. Kristen W. Lynch
  30. Luiz O. F. Penalva
  31. Elissa P. Lei
  32. Andrew G. Fraser
  33. Benjamin J. Blencowe
  34. Quaid D. Morris
  35. Timothy R. Hughes

Contributions

D.R., H.K., K.B.C., M.T.W. and H.S.N. made unique, essential and extensive contributions to the manuscript, and are ordered by amount of time and effort contributed. D.R. and H.K. developed most of the laboratory and computational components of RNAcompete, respectively. D.R., H.Z., A.Y., H.N., L.H.M., S.A.S., C.A.Y., S.M.K., B.N., D.M., W.L., R.S.L. and M.Q. cloned, expressed and purified the proteins. D.R. ran the RNAcompete assays, including data extraction. H.K. and K.B.C. processed the data, H.K. and K.B.C. generated motifs, and H.K., K.B.C., M.T.W. and H.S.N. performed the motif analyses. H.K. assembled the in vivo protein-RNA data sets. L.H.M. and R.K.D. performed and analysed RIP-seq data. K.B.C. developed the supplementary website and Figs 1 and 2 with assistance from H.K. and M.T.W. M.T.W. and M.A. created the cisBP-RNA database. M.T.W., H.S.N. and T.R.H. created Fig. 3. H.S.N. performed the analyses of human splicing, RNA stability data and human sequence conservation, and created Figs 4 and 5. M.I. and S.G. generated and analysed RNA-seq data and S.G. performed reporter-based RNA stability assays. X.L. performed Drosophila data analysis. H.D.L., F.P., A.H.C., R.P.C., B.J.F., R.A.A., K.W.L., L.O.F.P., E.P.L., B.J.B. and A.G.F. helped organize and support the project, and provided feedback on the manuscript. B.J.F., B.J.B. and A.G.F. provided critical advice and commentary on data analysis. Q.D.M. and T.R.H. conceived of the study, supervised the project and wrote the manuscript with contributions from D.R., H.K., K.B.C., B.J.B., A.G.F. and H.S.N.

Corresponding authors

Correspondence to Quaid D. Morris or Timothy R. Hughes.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

This file contains Supplementary Methods, Supplementary Figures 1-6, Supplementary Tables 1-4 and additional references. (PDF 2569 kb)

Supplementary Data 1

This file shows RNA-binding proteins with known consensus motifs. It contains panels for human and Drosophila listing RBPs with known consensus motifs as well as the PubMed ID of the publication that defined the motif. (XLSX 27 kb)

Supplementary Data 2

The RNAcompete master file. This file contains data on all RNAcompete experiments indexed by motif ID including: name, systematic ID and species of protein queried, the resulting motif, amino acid sequence of plasmid insert, and information on binding conditions used. (XLSX 2614 kb)

Supplementary Data 3

Secondary structure analysis. This file contains data panels in which each row corresponds to a significantly enriched secondary structure context for a given RNAcompete experiment along with P-values and effect sizes. Classification panel summarizes analysis results by motif. (XLSX 30 kb)

Supplementary Data 4

Clustered E-scores. This file contains the data matrix used in Figures 1b and S7. (TXT 16827 kb)

Supplementary Data 5

Comparison of RNAcompete and literature motifs. This file shows the results of comparison with previously defined motifs for RNAcompete RBPs. (XLSX 515 kb)

Supplementary Data 6

AUROC scores for in vivo and in vitro defined motifs on in vivo binding data. This file contains AUROCs for RNAcompete motifs on in vivo binding data described in Table S2, along with motifs learned by Malarkey on these data and AUROC scores for previously defined motifs for these RBPs. (XLSX 19 kb)

Supplementary Data 7

Post-transcriptional regulation (PTR) analysis in human. This file contains additional details and results of PTR analysis in human including predicted RBP-transcript regulatory networks for splicing and stability analysis. (XLSX 1445 kb)

Supplementary Data 8

Post-transcriptional regulation (PTR) analysis in Drosophila. This file contains details and results of PTR analysis for Drosophila including lists of PTR categories enriched for RNAcompete-derived IUPAC motifs, weights of trained logistic regression classifiers, Drosophila RBP(s) associated with each IUPAC motif, and IUPAC motifs queried. (XLSX 44 kb)

Supplementary Data 9

Sources of gene and Pfam models. This file details sources for gene and protein models for all organisms used in cisBP-RNA and in this paper. Also indicates Pfam models used to scan for RBDs. (XLSX 34 kb)

Cite this article

Ray, D., Kazan, H., Cook, K. et al. A compendium of RNA-binding motifs for decoding gene regulation. Nature 499, 172–177 (2013). https://doi.org/10.1038/nature12311
