Discovering regulatory elements in non-coding sequences by analysis of spaced dyads

J van Helden et al. Nucleic Acids Res. 2000.

Abstract

The application of microarray and related technologies is currently generating a systematic catalog of the transcriptional response of any single gene to a multiplicity of experimental conditions. Clustering genes according to the similarity of their transcriptional response provides a direct hint to the regulons of the different transcription factors, many of which have still not been characterized. We have developed a new method for deciphering the mechanism underlying the common transcriptional response of a set of genes, i.e. discovering cis-acting regulatory elements from a set of unaligned upstream sequences. This method, called dyad analysis, is based on the observation that many regulatory sites consist of a pair of highly conserved trinucleotides, spaced by a non-conserved region of fixed width. The approach is to count the number of occurrences of each possible spaced pair of trinucleotides, and to assess its statistical significance. The method is highly efficient in the detection of sites bound by C6Zn2 binuclear cluster proteins, as well as other transcription factors. In addition, we show that the dyad and single-word analyses are efficient for the detection of regulatory patterns in gene clusters from DNA chip experiments. In combination, these programs should provide a fast and efficient way to discover new regulatory sites for as yet unknown transcription factors.
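The counting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' program: the function names, the single-strand scan, the expected frequency taken as the product of the two trinucleotide frequencies in the input set, and the binomial tail used as the significance score are all assumptions made for illustration. The abstract states only that occurrences are counted and their significance assessed.

```python
from collections import Counter
from math import exp, lgamma, log

BASES = set("ACGT")


def count_dyads(sequences, spacing):
    """Count each trinucleotide pair separated by `spacing` bases (one strand only)."""
    counts = Counter()
    positions = 0  # number of valid windows scanned
    span = 3 + spacing + 3
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - span + 1):
            left = seq[i:i + 3]
            right = seq[i + 3 + spacing:i + span]
            # Skip windows whose conserved parts contain ambiguous bases such as N.
            if BASES.issuperset(left + right):
                counts[(left, right)] += 1
                positions += 1
    return counts, positions


def trinucleotide_frequencies(sequences):
    """Relative frequency of each trinucleotide in the input set (assumed background)."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - 2):
            word = seq[i:i + 3]
            if BASES.issuperset(word):
                counts[word] += 1
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k <= 0 or p >= 1.0:
        return 1.0
    log_p, log_q = log(p), log(1.0 - p)
    tail = 0.0
    for j in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                   + j * log_p + (n - j) * log_q)
        tail += exp(log_pmf)
    return min(1.0, tail)


def score_dyads(sequences, spacing):
    """Return (left, right, occurrences, p_value) tuples, most significant first."""
    counts, positions = count_dyads(sequences, spacing)
    freqs = trinucleotide_frequencies(sequences)
    results = []
    for (left, right), occ in counts.items():
        # Assumed expectation: the two trinucleotides occur independently.
        expected_p = freqs[left] * freqs[right]
        results.append((left, right, occ, binomial_tail(occ, positions, expected_p)))
    return sorted(results, key=lambda r: r[3])


if __name__ == "__main__":
    # Toy upstream sequences, invented for illustration only.
    toy = ["CGGAAACCGTT", "ACGGTTTCCGA"]
    for left, right, occ, p in score_dyads(toy, spacing=3)[:3]:
        print(f"{left}n{{3}}{right}\tocc={occ}\tP={p:.4g}")
```

A fuller analysis would repeat the scan for every spacing of interest, consider both strands, and correct for the number of dyads tested. These refinements are omitted here to keep the sketch short.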

Figures

Figure 1

Comparison of predicted and known sites in the Lys family. (A) Patterns detected by the dyad analysis, with monad calibration. The box width reflects the statistical significance, calculated as described in Materials and Methods. (B) Patterns extracted with dyad analysis, using non-coding dyad frequencies as calibration. Note that the poly-TA boxes are filtered out. (C) Location of sites whose activity has been measured experimentally. The box thickness (or height) reflects the experimental activity as measured by Becker et al. (23). The label above each site shows the experimental value of activity. Only sites showing an activity >0.1 are displayed.

References

    1. Hieter,P. and Boguski,M. (1997) Science, 278, 601–602.
    2. DeRisi,J.L., Iyer,V.R. and Brown,P.O. (1997) Science, 278, 680–686.
    3. Chu,S., DeRisi,J., Eisen,M., Mulholland,J., Botstein,D., Brown,P.O. and Herskowitz,I. (1998) Science, 282, 699–705.
    4. Waterman,M.S., Arratia,R. and Galas,D.J. (1984) Bull. Math. Biol., 46, 515–527.
    5. Mengeritsky,G. and Smith,T.F. (1987) Comput. Appl. Biosci., 3, 223–227.
