Whole-Genome Discovery of Transcription Factor Binding Sites by Network-Level Conservation

  1. Moshe Pritsker1,
  2. Yir-Chung Liu1,2,
  3. Michael A. Beer1,2, and
  4. Saeed Tavazoie1,2,3

1 Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544, USA
2 The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA

Abstract

Comprehensive identification of DNA _cis_-regulatory elements is crucial for a predictive understanding of transcriptional network dynamics. Strong evidence suggests that these DNA sequence motifs are highly conserved between related species, reflecting strong selection on the network of regulatory interactions that underlie common cellular behavior. Here, we exploit a systems-level aspect of this conservation (the network-level topology of these interactions) to map transcription factor (TF) binding sites on a genomic scale. Using network-level conservation as a constraint, our algorithm finds 71% of known TF binding sites in the yeast _Saccharomyces cerevisiae_, using only 12% of the sequence of a phylogenetic neighbor. Most of the novel predicted motifs show strong features of known TF binding sites, such as functional category and/or expression profile coherence of their corresponding genes. Network-level conservation should provide a powerful constraint for the systematic mapping of TF binding sites in the larger genomes of higher eukaryotes.
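
The abstract does not spell out how network-level conservation is scored, so the following is only a minimal sketch of one plausible reading: a candidate k-mer is considered conserved at the network level if the genes it targets in one species overlap, more than expected by chance, with the orthologous genes it targets in a related species. The hypergeometric overlap test, the forward-strand-only substring match, and every name here (`hypergeom_tail`, `targets`, `conservation_score`, the toy sequences) are assumptions for illustration, not the authors' implementation.

```python
from math import comb

def hypergeom_tail(overlap, n_a, n_b, total):
    """P(X >= overlap) when n_b genes are drawn from `total` genes, n_a of which are marked."""
    upper = min(n_a, n_b)
    denom = comb(total, n_b)
    numer = sum(comb(n_a, k) * comb(total - n_a, n_b - k)
                for k in range(overlap, upper + 1))
    return numer / denom

def targets(kmer, upstream):
    """Genes whose upstream sequence contains the k-mer (forward strand only, for brevity)."""
    return {gene for gene, seq in upstream.items() if kmer in seq}

def conservation_score(kmer, upstream_a, upstream_b):
    """Overlap between the k-mer's target sets in two species, with its chance probability.

    Assumes orthologous genes share the same key in both dictionaries.
    """
    shared = upstream_a.keys() & upstream_b.keys()
    set_a = targets(kmer, upstream_a) & shared
    set_b = targets(kmer, upstream_b) & shared
    overlap = len(set_a & set_b)
    p_value = hypergeom_tail(overlap, len(set_a), len(set_b), len(shared))
    return overlap, p_value

# Toy upstream regions for six hypothetical ortholog pairs.
upstream_a = {"g1": "TTACGCGTAA", "g2": "CCACGCGTGG", "g3": "AAAAAAAAAA",
              "g4": "GGGGTTTTCC", "g5": "TTTTTTTTTT", "g6": "CACACACACA"}
upstream_b = {"g1": "GGACGCGTCC", "g2": "ATACGCGTAT", "g3": "CCCCCCCCCC",
              "g4": "TTGGTTGGTT", "g5": "GAGAGAGAGA", "g6": "TGTGTGTGTG"}

overlap, p_value = conservation_score("ACGCGT", upstream_a, upstream_b)
print(overlap, p_value)
```

In this toy case the k-mer occurs upstream of the same two genes in both species, so the overlap is complete; on a real genome the same score would be computed for every candidate k-mer and the most significant ones kept. Note that the score depends only on which genes carry the motif, not on where it sits within each upstream region.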

Footnotes