High-resolution DNA-binding specificity analysis of yeast transcription factors
- Cong Zhu1,9,
- Kelsey J.R.P. Byers1,9,
- Rachel Patton McCord1,2,9,
- Zhenwei Shi3,
- Michael F. Berger1,2,
- Daniel E. Newburger1,
- Katrina Saulrieta1,4,
- Zachary Smith1,4,
- Mita V. Shah1,5,
- Mathangi Radhakrishnan1,6,
- Anthony A. Philippakis1,2,7,
- Yanhui Hu3,
- Federico De Masi1,
- Marcin Pacek3,
- Andreas Rolfs3,
- Tal Murthy3,
- Joshua LaBaer3 and
- Martha L. Bulyk1,2,7,8,10
- 1 Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA;
- 2 Committee on Higher Degrees in Biophysics, Harvard University, Cambridge, Massachusetts 02138, USA;
- 3 Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA;
- 4 Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA;
- 5 Department of Biology, Wellesley College, Wellesley, Massachusetts 02481, USA;
- 6 Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA;
- 7 Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, Massachusetts 02115, USA;
- 8 Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- 9 These authors contributed equally to this work.
Abstract
Transcription factors (TFs) regulate the expression of genes through sequence-specific interactions with DNA-binding sites. However, despite recent progress in identifying in vivo TF binding sites by microarray readout of chromatin immunoprecipitation (ChIP-chip), nearly half of all known yeast TFs are of unknown DNA-binding specificities, and many additional predicted TFs remain uncharacterized. To address these gaps in our knowledge of yeast TFs and their cis regulatory sequences, we have determined high-resolution binding profiles for 89 known and predicted yeast TFs, over more than 2.3 million gapped and ungapped 8-bp sequences (“_k_-mers”). We report 50 new or significantly different direct DNA-binding site motifs for yeast DNA-binding proteins and motifs for eight proteins for which only a consensus sequence was previously known; in total, this corresponds to over a 50% increase in the number of yeast DNA-binding proteins with experimentally determined DNA-binding specificities. Among other novel regulators, we discovered proteins that bind the PAC (Polymerase A and C) motif (GATGAG) and regulate ribosomal RNA (rRNA) transcription and processing, core cellular processes that are constituent to ribosome biogenesis. In contrast to earlier data types, these comprehensive _k_-mer binding data permit us to consider the regulatory potential of genomic sequence at the individual word level. These _k_-mer data allowed us to reannotate in vivo TF binding targets as direct or indirect and to examine TFs' potential effects on gene expression in ∼1700 environmental and cellular conditions. These approaches could be adapted to identify TFs and cis regulatory elements in higher eukaryotes.
Footnotes
10 Corresponding author.
E-mail mlbulyk{at}receptor.med.harvard.edu; fax (617) 525-4705. [Supplemental material is available online at www.genome.org and at http://thebrain.bwh.harvard.edu/. Gene expression microarray data have been submitted to the Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) database under accession no. GSE13684. Protein-binding microarray data are available at http://thebrain.bwh.harvard.edu/ and in the UniPROBE database, http://thebrain.bwh.harvard.edu/uniprobe/.]
Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.090233.108.
- Received December 11, 2008.
- Accepted January 14, 2009.
Freely available online through the Genome Research Open Access option.
Copyright © 2009 by Cold Spring Harbor Laboratory Press