Systematic identification of mammalian regulatory motifs' target genes and functions

Nature Methods volume 5, pages 347–353 (2008)

Abstract

We developed an algorithm, Lever, that systematically maps metazoan DNA regulatory motifs or motif combinations to sets of genes. Lever assesses whether the motifs are enriched in _cis_-regulatory modules (CRMs), predicted by our PhylCRM algorithm, in the noncoding sequences surrounding the genes. Lever analysis allows unbiased inference of functional annotations to regulatory motifs and candidate CRMs. We used human myogenic differentiation as a model system to statistically assess greater than 25,000 pairings of gene sets and motifs or motif combinations. We assigned functional annotations to candidate regulatory motifs predicted previously and identified gene sets that are likely to be co-regulated via shared regulatory motifs. Lever allows moving beyond the identification of putative regulatory motifs in mammalian genomes, toward understanding their biological roles. This approach is general and can be applied readily to any cell type, gene expression pattern or organism of interest.
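The abstract does not state which statistic Lever uses to decide that a motif is enriched near a gene set. As a minimal sketch of the general idea only, the example below assumes each gene already has a single CRM score for one motif (for instance, the best-scoring candidate CRM in its flanking noncoding sequence) and measures enrichment as a rank-based area under the ROC curve. The function name `enrichment_auc`, the gene names and the scores are all hypothetical, and the AUC is a stand-in choice rather than the authors' documented method.

```python
from typing import Dict, Set


def enrichment_auc(scores: Dict[str, float], gene_set: Set[str]) -> float:
    """Probability that a random gene-set member has a higher CRM score
    than a random background gene (ties count as 0.5).

    A value near 1.0 suggests the motif's CRM scores are concentrated
    in the gene set; a value near 0.5 suggests no enrichment.
    """
    foreground = [s for g, s in scores.items() if g in gene_set]
    background = [s for g, s in scores.items() if g not in gene_set]
    if not foreground or not background:
        raise ValueError("need at least one foreground and one background gene")

    wins = 0.0
    for f in foreground:
        for b in background:
            if f > b:
                wins += 1.0
            elif f == b:
                wins += 0.5
    return wins / (len(foreground) * len(background))


# Hypothetical per-gene CRM scores for one motif (illustrative values only).
crm_scores = {"geneA": 9.1, "geneB": 7.4, "geneC": 2.0, "geneD": 1.5, "geneE": 0.3}

# Hypothetical gene set, e.g. genes induced during differentiation.
muscle_set = {"geneA", "geneB"}

auc = enrichment_auc(crm_scores, muscle_set)
print(f"AUC for motif vs. gene set: {auc:.2f}")
```

Repeating such a calculation for every pairing of motif (or motif combination) and gene set would yield a matrix of enrichment values. Screening many pairings this way would also require a correction for multiple hypothesis testing, which this sketch omits.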

Accession codes

Accessions

Gene Expression Omnibus

Acknowledgements

We thank E. Margulies and the ENCODE Multiple Sequence Alignment working group for generously allowing use of their phylogenetic tree before its publication; S. Asthana, S. Sunyaev, G. Kryukov, M. Berger, T. Siggers and A. Aboukhalil for helpful discussions; J. Chee, E. Mathewson and T. Sierra for technical assistance; S. Elledge, A. Friedman, T. Siggers, M. Berger and F. De Masi for critical reading of the manuscript; A. Donner (Brigham & Women's Hospital) for the generous gift of human lens epithelial cells; and K. Cichowski (Brigham & Women's Hospital) for kindly providing lentiviral reagents. This work was funded in part by a PhRMA Foundation Informatics Research Starter Grant (M.L.B.), a William F. Milton Fund Award (M.L.B.), a Harvard-MIT Division of Health Sciences & Technology (HST) Taplin Award (M.L.B.) and US National Institutes of Health (NIH) National Human Genome Research Institute (R01 HG002966 to M.L.B.). J.B.W. was supported in part by an NIH Training Grant T32 HL07627 and NIH Individual National Research Service Award F32 AR051287. A.A.P. was supported in part by a National Defense Science and Engineering Graduate Fellowship from the Department of Defense and an Athinoula Martinos Fellowship from HST. S.A.J. was supported in part by a US National Science Foundation Postdoctoral Research Fellowship in Biological Informatics.

Author information

Author notes

  1. Fangxue Sherry He
    Present address: Science Applications International Corporation–Frederick Inc., 1700 W. 7th St., Frederick, Maryland 21702, USA.
  2. Jason B Warner, Anthony A Philippakis and Savina A Jaeger: These authors contributed equally to this work.

Authors and Affiliations

  1. Division of Genetics, Department of Medicine, Harvard Medical School, Harvard Medical School New Research Building, Room 466D, 77 Ave. Louis Pasteur, Boston, 02115, Massachusetts, USA
    Jason B Warner, Anthony A Philippakis, Savina A Jaeger, Fangxue Sherry He, Jolinta Lin & Martha L Bulyk
  2. Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Harvard Medical School New Research Building, Room 466D, 77 Ave. Louis Pasteur, Boston, 02115, Massachusetts, USA
    Martha L Bulyk
  3. Harvard–Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology (HST), Harvard Medical School, Harvard Medical School New Research Building, Room 466D, 77 Ave. Louis Pasteur, Boston, 02115, Massachusetts, USA
    Anthony A Philippakis & Martha L Bulyk
  4. Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, 02138, Massachusetts, USA
    Anthony A Philippakis & Martha L Bulyk
  5. Department of Biology, MIT, 77 Massachusetts Ave., Cambridge, 02139, Massachusetts
    Jolinta Lin

Authors

  1. Jason B Warner
  2. Anthony A Philippakis
  3. Savina A Jaeger
  4. Fangxue Sherry He
  5. Jolinta Lin
  6. Martha L Bulyk

Contributions

J.B.W. participated in the experimental design, performed the experiments and participated in analysis of the results and drafting of the manuscript. A.A.P. conceived of the PhylCRM scoring algorithm, participated in programming PhylCRM and running PhylCRM analyses, the development of Lever, programming Lever, running Lever analyses and analyzing the results and drafting of the manuscript. S.A.J. optimized the performance and participated in programming PhylCRM, running PhylCRM analyses, development of Lever, programming Lever and running Lever analyses and in analysis of the results and drafting of the manuscript. F.S.H. assisted with programming PhylCRM and running PhylCRM analyses. J.L. assisted with the experiments. M.L.B. conceived of the study and participated in the study design, analysis of the results and drafting of the manuscript.

Corresponding author

Correspondence to Martha L Bulyk.

Cite this article

Warner, J., Philippakis, A., Jaeger, S. et al. Systematic identification of mammalian regulatory motifs' target genes and functions. Nat Methods 5, 347–353 (2008). https://doi.org/10.1038/nmeth.1188
