Systematic identification of mammalian regulatory motifs' target genes and functions
- Article
- Published: 02 March 2008
- Anthony A Philippakis1,3,4,
- Savina A Jaeger1,
- Fangxue Sherry He1,
- Jolinta Lin1,5 &
- …
- Martha L Bulyk1,2,3,4
Nature Methods volume 5, pages 347–353 (2008)
Abstract
We developed an algorithm, Lever, that systematically maps metazoan DNA regulatory motifs or motif combinations to sets of genes. Lever assesses whether the motifs are enriched in _cis_-regulatory modules (CRMs), predicted by our PhylCRM algorithm, in the noncoding sequences surrounding the genes. Lever analysis allows unbiased assignment of functional annotations to regulatory motifs and candidate CRMs. Using human myogenic differentiation as a model system, we statistically assessed more than 25,000 pairings of gene sets with motifs or motif combinations. We assigned functional annotations to previously predicted candidate regulatory motifs and identified gene sets that are likely to be co-regulated via shared regulatory motifs. Lever thus moves beyond the identification of putative regulatory motifs in mammalian genomes, toward an understanding of their biological roles. The approach is general and can be applied readily to any cell type, gene expression pattern or organism of interest.
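The idea of testing whether a motif is enriched near the genes of a set can be illustrated with a minimal sketch. The abstract does not state which statistic Lever uses, so the rank-based measure below (the area under the ROC curve, equivalent to a normalized Mann-Whitney U) is an assumption chosen for illustration. The function name `enrichment_auc`, the gene names and the scores are all hypothetical; each score stands in for the best CRM score for one motif in the noncoding sequence around a gene.

```python
from typing import Dict, Set


def enrichment_auc(scores: Dict[str, float], gene_set: Set[str]) -> float:
    """Return the probability that a randomly chosen gene inside the set
    scores higher than a randomly chosen gene outside it (ties count half).

    A value near 1.0 suggests the motif scores are enriched in the set;
    a value near 0.5 suggests no enrichment.
    """
    foreground = [s for g, s in scores.items() if g in gene_set]
    background = [s for g, s in scores.items() if g not in gene_set]
    if not foreground or not background:
        raise ValueError("need at least one gene inside and one outside the set")

    wins = 0.0
    for f in foreground:
        for b in background:
            if f > b:
                wins += 1.0
            elif f == b:
                wins += 0.5
    return wins / (len(foreground) * len(background))


# Hypothetical per-gene scores for a single motif (illustrative values only).
scores = {"geneA": 9.1, "geneB": 7.4, "geneC": 2.0, "geneD": 1.5, "geneE": 0.3}
gene_set = {"geneA", "geneB"}  # hypothetical co-expressed gene set

auc = enrichment_auc(scores, gene_set)
print(f"enrichment AUC = {auc:.2f}")
```

Repeating such a calculation over many gene sets and many motifs or motif combinations yields a matrix of pairings that could then be ranked and corrected for multiple testing; how Lever itself performs these steps is described in the full article, not in this sketch.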
Accession codes
Accessions
Gene Expression Omnibus
Acknowledgements
We thank E. Margulies and the ENCODE Multiple Sequence Alignment working group for generously allowing use of their phylogenetic tree before its publication; S. Asthana, S. Sunyaev, G. Kryukov, M. Berger, T. Siggers and A. Aboukhalil for helpful discussions; J. Chee, E. Mathewson and T. Sierra for technical assistance; S. Elledge, A. Friedman, T. Siggers, M. Berger and F. De Masi for critical reading of the manuscript; A. Donner (Brigham & Women's Hospital) for the generous gift of human lens epithelial cells; and K. Cichowski (Brigham & Women's Hospital) for kindly providing lentiviral reagents. This work was funded in part by a PhRMA Foundation Informatics Research Starter Grant (M.L.B.), a William F. Milton Fund Award (M.L.B.), a Harvard-MIT Division of Health Sciences & Technology (HST) Taplin Award (M.L.B.) and US National Institutes of Health (NIH) National Human Genome Research Institute (R01 HG002966 to M.L.B.). J.B.W. was supported in part by an NIH Training Grant T32 HL07627 and NIH Individual National Research Service Award F32 AR051287. A.A.P. was supported in part by a National Defense Science and Engineering Graduate Fellowship from the Department of Defense and an Athinoula Martinos Fellowship from HST. S.A.J. was supported in part by a US National Science Foundation Postdoctoral Research Fellowship in Biological Informatics.
Author information
Author notes
- Fangxue Sherry He
  Present address: Science Applications International Corporation–Frederick Inc., 1700 W. 7th St., Frederick, Maryland 21702, USA.
- Jason B Warner, Anthony A Philippakis and Savina A Jaeger: These authors contributed equally to this work.
Authors and Affiliations
- Division of Genetics, Department of Medicine, Harvard Medical School, Harvard Medical School New Research Building, Room 466D, 77 Ave. Louis Pasteur, Boston, 02115, Massachusetts, USA
  Jason B Warner, Anthony A Philippakis, Savina A Jaeger, Fangxue Sherry He, Jolinta Lin & Martha L Bulyk
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Harvard Medical School New Research Building, Room 466D, 77 Ave. Louis Pasteur, Boston, 02115, Massachusetts, USA
  Martha L Bulyk
- Harvard–Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology (HST), Harvard Medical School, Harvard Medical School New Research Building, Room 466D, 77 Ave. Louis Pasteur, Boston, 02115, Massachusetts, USA
  Anthony A Philippakis & Martha L Bulyk
- Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, 02138, Massachusetts, USA
  Anthony A Philippakis & Martha L Bulyk
- Department of Biology, MIT, 77 Massachusetts Ave., Cambridge, 02139, Massachusetts
  Jolinta Lin
Authors
- Jason B Warner
- Anthony A Philippakis
- Savina A Jaeger
- Fangxue Sherry He
- Jolinta Lin
- Martha L Bulyk
Contributions
J.B.W. participated in the experimental design, performed the experiments and participated in analysis of the results and drafting of the manuscript. A.A.P. conceived of the PhylCRM scoring algorithm, participated in programming PhylCRM and running PhylCRM analyses, the development of Lever, programming Lever, running Lever analyses and analyzing the results and drafting of the manuscript. S.A.J. optimized the performance and participated in programming PhylCRM, running PhylCRM analyses, development of Lever, programming Lever and running Lever analyses and in analysis of the results and drafting of the manuscript. F.S.H. assisted with programming PhylCRM and running PhylCRM analyses. J.L. assisted with the experiments. M.L.B. conceived of the study and participated in the study design, analysis of the results and drafting of the manuscript.
Corresponding author
Correspondence to Martha L Bulyk.
Cite this article
Warner, J., Philippakis, A., Jaeger, S. et al. Systematic identification of mammalian regulatory motifs' target genes and functions. Nat Methods 5, 347–353 (2008). https://doi.org/10.1038/nmeth.1188
- Received: 21 December 2007
- Accepted: 04 February 2008
- Published: 02 March 2008
- Issue Date: April 2008
- DOI: https://doi.org/10.1038/nmeth.1188