High-resolution DNA-binding specificity analysis of yeast transcription factors
doi: 10.1101/gr.090233.108. Epub 2009 Jan 21.
Cong Zhu, Kelsey J R P Byers, Rachel Patton McCord, Zhenwei Shi, Michael F Berger, Daniel E Newburger, Katrina Saulrieta, Zachary Smith, Mita V Shah, Mathangi Radhakrishnan, Anthony A Philippakis, Yanhui Hu, Federico De Masi, Marcin Pacek, Andreas Rolfs, Tal Murthy, Joshua Labaer, Martha L Bulyk
- PMID: 19158363
- PMCID: PMC2665775
- DOI: 10.1101/gr.090233.108
Cong Zhu et al. Genome Res. 2009 Apr.
Abstract
Transcription factors (TFs) regulate the expression of genes through sequence-specific interactions with DNA-binding sites. However, despite recent progress in identifying in vivo TF binding sites by microarray readout of chromatin immunoprecipitation (ChIP-chip), the DNA-binding specificities of nearly half of all known yeast TFs remain unknown, and many additional predicted TFs remain uncharacterized. To address these gaps in our knowledge of yeast TFs and their cis regulatory sequences, we have determined high-resolution binding profiles for 89 known and predicted yeast TFs across more than 2.3 million gapped and ungapped 8-bp sequences ("k-mers"). We report 50 new or significantly different direct DNA-binding site motifs for yeast DNA-binding proteins and motifs for eight proteins for which only a consensus sequence was previously known; in total, this corresponds to over a 50% increase in the number of yeast DNA-binding proteins with experimentally determined DNA-binding specificities. Among other novel regulators, we discovered proteins that bind the PAC (Polymerase A and C) motif (GATGAG) and regulate ribosomal RNA (rRNA) transcription and processing, core cellular processes that are integral to ribosome biogenesis. In contrast to earlier data types, these comprehensive k-mer binding data permit us to consider the regulatory potential of genomic sequence at the individual word level. These k-mer data allowed us to reannotate in vivo TF binding targets as direct or indirect and to examine TFs' potential effects on gene expression in approximately 1,700 environmental and cellular conditions. These approaches could be adapted to identify TFs and cis regulatory elements in higher eukaryotes.
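To make the notion of an 8-bp "word" concrete, the following minimal sketch enumerates ungapped k-mers and merges each with its reverse complement, since a double-stranded binding site reads as either sequence. It covers only the ungapped case; the gapped patterns that bring the total above 2.3 million are not enumerated here. The function names are illustrative and are not taken from the paper.

```python
from itertools import product

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def count_canonical_kmers(k):
    """Count distinct ungapped k-mers after merging reverse complements."""
    return len({canonical("".join(p)) for p in product("ACGT", repeat=k)})


if __name__ == "__main__":
    print(reverse_complement("GATGAG"))  # the PAC motif on the other strand
    print(count_canonical_kmers(8))
```

For k = 8 this gives 32,896 distinct ungapped words: 4^8 = 65,536 sequences, of which 256 are their own reverse complement.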
Figures
Figure 1.
PBM characterization of S. cerevisiae TF DNA-binding specificities. (A) Hierarchical clustering of PBM data over ungapped 8-mer _E_-scores determined for 89 yeast TFs. (B) Sequence logos for selected examples of newly discovered yeast TF DNA-binding site motifs.
Figure 2.
PBM _k_-mer binding profiles in most cases correspond well with ChIP-chip binding data. (A) For 33 of the 40 TFs for which we had both PBM- and ChIP-chip-derived motifs (Harbison et al. 2004), the PBM _k_-mer-derived potential targets were significantly enriched (AUC > 0.5, _P_ < 0.05) among the ChIP-chip “bound” regions, showing good agreement between the ChIP-chip in vivo data and our scoring of genes based on the in vitro PBM _k_-mer data. (B) For 11 out of 40 TFs, intergenic regions scored by the PBM 8-mer data are more highly enriched (>5% improvement in AUC; all PBM AUC _P_-values are <0.05) among the ChIP-chip “bound” regions as compared with those scored by the ChIP-chip-derived motif.
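The AUC comparison in this caption can be illustrated with a small sketch. It assumes each intergenic region has already been given a single score (how the paper derives that score from 8-mer data is not stated in this caption) and a label saying whether ChIP-chip called it "bound". The AUC is computed from the Mann-Whitney pair-counting identity; the function name and inputs are hypothetical.

```python
def auc(scores, bound):
    """Area under the ROC curve via Mann-Whitney pair counting.

    scores: per-region scores (higher means more likely a target).
    bound:  parallel booleans, True for ChIP-chip "bound" regions.
    """
    pos = [s for s, b in zip(scores, bound) if b]
    neg = [s for s, b in zip(scores, bound) if not b]
    if not pos or not neg:
        raise ValueError("need at least one bound and one unbound region")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # bound region outranks unbound region
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy scores, not data from the paper.
    print(auc([0.9, 0.8, 0.1, 0.2], [True, True, False, False]))
```

An AUC of 0.5 corresponds to no enrichment of high-scoring regions among the bound set, and 1.0 to perfect separation. The quadratic loop is fine for a sketch; a rank-based version would be preferable for genome-scale inputs.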
Figure 3.
Reclassification of TF occupancy at ChIP-chip “bound” (_P_ < 0.001) intergenic regions as likely being due to direct DNA-binding sites versus indirect association of the TF with the DNA. Blue bars _above_ the horizontal axis for each TF indicate the number of ChIP-chip bound intergenic regions that were previously called “indirect” (i.e., the regions do not contain a good match to the ChIP-chip motif as determined by MacIsaac et al. [2006]) and that are reclassified as potential “direct” TF targets by PBM data (i.e., the regions contain a PBM _k_-mer with an _E_-score > 0.45). Red bars _below_ the axis indicate the number of intergenic regions previously annotated as “direct” targets by MacIsaac et al. (2006) that are reclassified as potential sites of indirect TF association according to the PBM data (i.e., the regions do not contain any _k_-mers with _E_-score > 0.45).
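The reclassification rule in this caption (a region counts as a potential direct target if it contains any _k_-mer with _E_-score > 0.45) can be sketched as follows. This is a simplified illustration under stated assumptions: it scans ungapped 8-mers only, looks up both strands, and uses a toy E-score table with made-up values. The function names and labels are hypothetical and do not reproduce the authors' pipeline.

```python
E_SCORE_THRESHOLD = 0.45  # cutoff quoted in the figure caption

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_direct_site(region, e_scores, threshold=E_SCORE_THRESHOLD, k=8):
    """True if any k-mer in the region, on either strand, exceeds the cutoff."""
    region = region.upper()
    for i in range(len(region) - k + 1):
        kmer = region[i:i + k]
        for word in (kmer, reverse_complement(kmer)):
            # K-mers missing from the table are treated as unscored.
            if e_scores.get(word, float("-inf")) > threshold:
                return True
    return False


def reclassify(previous_label, region, e_scores):
    """Compare an earlier direct/indirect call with the k-mer based call."""
    direct_now = has_direct_site(region, e_scores)
    if previous_label == "indirect" and direct_now:
        return "indirect -> direct"
    if previous_label == "direct" and not direct_now:
        return "direct -> indirect"
    return "unchanged"


if __name__ == "__main__":
    toy_scores = {"GATGAGAT": 0.49}  # invented value for illustration
    print(reclassify("indirect", "TTGATGAGATTT", toy_scores))
```

Checking both strands matters because the table may store only one orientation of each word.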
Figure 4.
Pbf1 and Pbf2 regulate rRNA processing genes. (A) Predicted target genes of Pbf1 and Pbf2 are significantly repressed (CRACR _P_ < 10⁻¹²) after 20-min heat shock (shift from 25°C to 37°C) in wild-type, Δpbf1, and Δpbf2 strains, but not in the Δpbf1Δpbf2 double deletion strain, in Affymetrix gene expression profiling of triplicate biological replicate cultures. (B) Box plots indicating expression changes of rRNA processing genes containing at least one _k_-mer at _E_ ≥ 0.45 after 20-min heat shock in wild-type, Δpbf1, Δpbf2, and Δpbf1Δpbf2 strains, in the expression data from A. (C) Pbf1 and Pbf2 associate in vivo with the promoter regions of the rRNA processing genes SAS10, NOP2, MTR4, KRR1, and ERB1. ChIP-qPCR was performed on cells treated with 5-min heat shock, at predicted target sites in their upstream regions, and at a negative control region upstream of ENO2. Binding fold-enrichment was defined as the ratio of PCR product in “IP” versus “INPUT,” using an open reading frame-free region on chromosome V as an internal normalization control. Error bars indicate 1 SD from triplicate biological replicate cultures (*_P_ < 0.05; **_P_ < 0.01; two-sided Student's _t_-test). (D) Expression ratio of rRNA processing genes after heat shock. RT-qPCR data were generated for either untreated yeast or yeast treated with 20-min heat shock. Gene expression was normalized relative to ACT1 as an internal normalization control. Error bars indicate 1 SD from triplicate biological replicate cultures (*_P_ < 0.05; **_P_ < 0.01; two-sided Student's _t_-test compared with wild type).
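The fold-enrichment definition in panel C can be written out as a short calculation. The sketch below assumes the normalization is applied by dividing the target IP/INPUT ratio by the same ratio at the control region; the caption does not spell out the exact arithmetic, so that step is an assumption, and the quantities are taken to be linear-scale PCR product amounts. Function names and numbers are illustrative only.

```python
from statistics import mean, stdev


def fold_enrichment(ip_target, input_target, ip_control, input_control):
    """IP/INPUT ratio at the target, normalized to the control region.

    Assumes linear-scale PCR product amounts (not raw Ct values).
    """
    return (ip_target / input_target) / (ip_control / input_control)


def summarize(replicates):
    """Mean and sample SD across biological replicates (error bars = 1 SD)."""
    return mean(replicates), stdev(replicates)


if __name__ == "__main__":
    # Invented values for three replicate cultures, not data from the paper.
    folds = [
        fold_enrichment(8.0, 2.0, 1.0, 1.0),
        fold_enrichment(6.0, 2.0, 1.0, 1.0),
        fold_enrichment(10.0, 2.0, 1.0, 1.0),
    ]
    print(summarize(folds))
```

The same two-step pattern (ratio to an internal control, then mean and SD over triplicates) applies to the ACT1-normalized expression ratios in panel D.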
Figure 5.
Analysis of TFs' regulatory associations and coregulatory factors. (A) Two-dimensional hierarchical clustering of 89 TFs (rows) according to their CRACR statistics across 1693 expression conditions (columns). (B) Examples of predicted coregulatory TFs from A with distinct motifs, and their 8-mer binding profile correlations. Cluster annotations are derived from the literature and functional predictions from this study. A high-resolution heatmap with full labeling is available in Supplemental Figs. S11 and S12.
Similar articles
- Curated collection of yeast transcription factor DNA binding specificity data reveals novel structural and gene regulatory insights.
Gordân R, Murphy KF, McCord RP, Zhu C, Vedenko A, Bulyk ML. Genome Biol. 2011 Dec 21;12(12):R125. doi: 10.1186/gb-2011-12-12-r125. PMID: 22189060 Free PMC article.
- Identifying combinatorial regulation of transcription factors and binding motifs.
Kato M, Hata N, Banerjee N, Futcher B, Zhang MQ. Genome Biol. 2004;5(8):R56. doi: 10.1186/gb-2004-5-8-r56. Epub 2004 Jul 28. PMID: 15287978 Free PMC article.
- Distinguishing direct versus indirect transcription factor-DNA interactions.
Gordân R, Hartemink AJ, Bulyk ML. Genome Res. 2009 Nov;19(11):2090-100. doi: 10.1101/gr.094144.109. Epub 2009 Aug 3. PMID: 19652015 Free PMC article.
- Transcription factor-DNA binding: beyond binding site motifs.
Inukai S, Kock KH, Bulyk ML. Curr Opin Genet Dev. 2017 Apr;43:110-119. doi: 10.1016/j.gde.2017.02.007. Epub 2017 Mar 27. PMID: 28359978 Free PMC article. Review.
- Transcriptional networks: reverse-engineering gene regulation on a global scale.
Chua G, Robinson MD, Morris Q, Hughes TR. Curr Opin Microbiol. 2004 Dec;7(6):638-46. doi: 10.1016/j.mib.2004.10.009. PMID: 15556037 Review.
Cited by
- Cooperative assembly confers regulatory specificity and long-term genetic circuit stability.
Bragdon MDJ, Patel N, Chuang J, Levien E, Bashor CJ, Khalil AS. Cell. 2023 Aug 31;186(18):3810-3825.e18. doi: 10.1016/j.cell.2023.07.012. Epub 2023 Aug 7. PMID: 37552983 Free PMC article.
- Cooperativity boosts affinity and specificity of proteins with multiple RNA-binding domains.
Stitzinger SH, Sohrabi-Jahromi S, Söding J. NAR Genom Bioinform. 2023 Jun 9;5(2):lqad057. doi: 10.1093/nargab/lqad057. eCollection 2023 Jun. PMID: 37305168 Free PMC article.
- DNA binding specificity of all four Saccharomyces cerevisiae forkhead transcription factors.
Cooper BH, Dantas Machado AC, Gan Y, Aparicio OM, Rohs R. Nucleic Acids Res. 2023 Jun 23;51(11):5621-5633. doi: 10.1093/nar/gkad372. PMID: 37177995 Free PMC article.
- Double DAP-seq uncovered synergistic DNA binding of interacting bZIP transcription factors.
Li M, Yao T, Lin W, Hinckley WE, Galli M, Muchero W, Gallavotti A, Chen JG, Huang SC. Nat Commun. 2023 May 5;14(1):2600. doi: 10.1038/s41467-023-38096-2. PMID: 37147307 Free PMC article.
- Zinc cluster transcription factors frequently activate target genes using a non-canonical half-site binding mode.
Recio PS, Mitra NJ, Shively CA, Song D, Jaramillo G, Lewis KS, Chen X, Mitra RD. Nucleic Acids Res. 2023 Jun 9;51(10):5006-5021. doi: 10.1093/nar/gkad320. PMID: 37125648 Free PMC article.
References
- Angus-Hill M.L., Schlichter A., Roberts D., Erdjument-Bromage H., Tempst P., Cairns B.R. A Rsc3/Rsc30 zinc cluster dimer reveals novel roles for the chromatin remodeler RSC in gene expression and cell cycle control. Mol. Cell. 2001;7:741–751.
- Beer M.A., Tavazoie S. Predicting gene expression from sequence. Cell. 2004;117:185–198.
Grants and funding
- R01 HG003985-03/HG/NHGRI NIH HHS/United States
- R01 HG003985/HG/NHGRI NIH HHS/United States
- R01 HG003420-02/HG/NHGRI NIH HHS/United States
- R01 HG003420-03S1/HG/NHGRI NIH HHS/United States
- R01 HG003420-03/HG/NHGRI NIH HHS/United States
- R01 HG003420/HG/NHGRI NIH HHS/United States