Preferentially quantized linker DNA lengths in Saccharomyces cerevisiae

Ji-Ping Wang et al. PLoS Comput Biol. 2008.

Abstract

The exact lengths of linker DNAs connecting adjacent nucleosomes specify the intrinsic three-dimensional structures of eukaryotic chromatin fibers. Some studies suggest that linker DNA lengths preferentially occur at certain quantized values, differing one from another by integral multiples of the DNA helical repeat, approximately 10 bp; however, studies in the literature are inconsistent. Here, we investigate linker DNA length distributions in the yeast Saccharomyces cerevisiae genome, using two novel methods: a Fourier analysis of genomic dinucleotide periodicities adjacent to experimentally mapped nucleosomes and a duration hidden Markov model applied to experimentally defined dinucleosomes. Both methods reveal that linker DNA lengths in yeast are preferentially periodic at the DNA helical repeat (approximately 10 bp), obeying the form 10n + 5 bp (integer n). This 10 bp periodicity implies an ordered superhelical intrinsic structure for the average chromatin fiber in yeast.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. A diagram for extended dinucleotide frequency analysis.

(A) How the extended sequences are obtained on the genome. DNA sequence from experimentally mapped nucleosomes is extended in the 3′ direction on both strands. AA/TT/TA signals, when combined, are center symmetric, hence information from the 5′-extended sequence is implicitly included. (B) Preferred locations of AA/TT/TA signals over two consecutive nucleosomes, for linker DNA lengths of d = 5 bp (top) and 10 bp (bottom), respectively.

Figure 2. AA/TT/TA dinucleotide frequency in extended nucleosome sequence alignment.

With a combined encoding of AA/TT/TA dinucleotides, the upstream sequences are center symmetric with the downstream sequences, so only the downstream regions are plotted. Positions 1–147 correspond to the original center alignment of the mapped nucleosome cores, while positions 148 and above are the extended regions.
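The combined AA/TT/TA encoding described above can be sketched in a few lines. This is only an illustration, not the authors' code: the function name `aa_tt_ta_frequency` is hypothetical, and the sketch assumes the input sequences are already aligned and read in the same orientation.

```python
def aa_tt_ta_frequency(aligned_seqs):
    """Fraction of aligned sequences with AA, TT or TA starting at each position.

    Assumption: all sequences share the same alignment origin; positions past
    the shortest sequence are ignored.
    """
    signal = {"AA", "TT", "TA"}
    length = min(len(s) for s in aligned_seqs)
    counts = [0] * (length - 1)
    for seq in aligned_seqs:
        s = seq.upper()
        for i in range(length - 1):
            # Combined encoding: any of the three dinucleotides counts as a hit.
            if s[i:i + 2] in signal:
                counts[i] += 1
    n = len(aligned_seqs)
    return [c / n for c in counts]


if __name__ == "__main__":
    print(aa_tt_ta_frequency(["AATG", "TTAG"]))
```

Plotting the returned list against position would give a profile of the kind shown in Figure 2, under the stated assumptions.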

Figure 3. Fourier amplitude spectra of TT/AA/TA signals in nucleosome core/extended regions, compared to randomly shifted samples and random genomic samples.
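As a rough illustration of how a Fourier amplitude spectrum picks out a roughly 10 bp periodicity in a dinucleotide frequency profile, the sketch below computes a plain discrete Fourier transform of a mean-centered signal. The function name `amplitude_spectrum` and the normalization are assumptions made for this example; the paper's exact procedure may differ.

```python
import cmath
import math


def amplitude_spectrum(signal):
    """Return (period_in_positions, amplitude) pairs for a real-valued signal.

    The signal is mean-centered first so that the constant offset does not
    dominate; amplitudes are |DFT coefficient| / n.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coef = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append((n / k, abs(coef) / n))
    return spectrum


if __name__ == "__main__":
    # Synthetic profile with a 10-position period (illustrative values only).
    profile = [0.3 + 0.1 * math.cos(2 * math.pi * t / 10) for t in range(100)]
    period, amp = max(amplitude_spectrum(profile), key=lambda p: p[1])
    print(period, amp)
```

Comparing such a spectrum against spectra from shifted or random samples, as in Figure 3, is what indicates whether a peak is meaningful.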

Figure 4. Linker length distribution predicted under kernel smoothing (A) and mixture model (B).

The red curve is the raw frequency; the black curve is the same frequency smoothed with a Gaussian kernel of 0.75 bp bandwidth, drawn as a continuous curve for convenience.
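A minimal sketch of Gaussian kernel smoothing with a 0.75 bp bandwidth follows. The function name, the evaluation grid (`step`) and the padding (`pad`) are choices made for this illustration and are not taken from the paper.

```python
import math


def gaussian_kernel_smooth(freq, bandwidth=0.75, step=0.1, pad=None):
    """Smooth discrete linker-length frequencies into a continuous curve.

    freq maps linker length (bp) to a count or frequency. Returns a list of
    (x, density) points. pad extends the grid beyond the observed range
    (assumed default: 3 bandwidths).
    """
    if pad is None:
        pad = 3 * bandwidth
    lo, hi = min(freq) - pad, max(freq) + pad
    total = sum(freq.values())
    norm = 1.0 / (bandwidth * math.sqrt(2 * math.pi))
    n_points = int(round((hi - lo) / step)) + 1
    curve = []
    for i in range(n_points):
        x = lo + i * step
        # Each observed length contributes a Gaussian bump weighted by its frequency.
        dens = sum(
            w * norm * math.exp(-0.5 * ((x - d) / bandwidth) ** 2)
            for d, w in freq.items()
        ) / total
        curve.append((x, dens))
    return curve


if __name__ == "__main__":
    raw = {5: 3, 15: 2, 25: 1}  # made-up counts for illustration
    print(max(gaussian_kernel_smooth(raw), key=lambda p: p[1]))
```

A bandwidth well below the 10 bp spacing keeps neighboring peaks separate while still smoothing out single-bp noise.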

Figure 5. Linker DNA length distribution predicted for simulated dinucleosome sequences with periodic linker length (A–E) and uniform linker length (F).

Linker DNA length distribution predicted for simulated dinucleosome sequences with 15 bp periodic linker length (A–E) and uniform linker length (F). (A) True linker length distribution used for simulation of dinucleosome sequences. (B, C) Recovered linker length distribution under the kernel and mixture methods, respectively, for 2,000 simulated dinucleosome sequences. (D, E) Corresponding results for a subset of the simulated dinucleosomes comprising only 300 sequences. (F) Recovered linker lengths using the kernel method for 2,000 simulated dinucleosome sequences under the same model as in (B) but with a uniform linker length distribution on [1, …, 120].
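To make the two simulation settings concrete, here is a hedged sketch of how linker lengths could be drawn from a periodic distribution and from a uniform distribution on [1, …, 120]. The peak positions (10n + 5, taken from the abstract), the spread `sd`, and both function names are assumptions for this example; the caption does not give the exact distribution used in panel (A).

```python
import random


def sample_periodic_linkers(n_samples, max_len=120, period=10, offset=5,
                            sd=1.0, seed=0):
    """Draw linker lengths clustered around offset, offset + period, ...

    Assumed form: pick a peak center uniformly, add Gaussian jitter, round to
    an integer, and reject values outside [1, max_len].
    """
    rng = random.Random(seed)
    centers = list(range(offset, max_len + 1, period))
    lengths = []
    while len(lengths) < n_samples:
        d = round(rng.gauss(rng.choice(centers), sd))
        if 1 <= d <= max_len:
            lengths.append(d)
    return lengths


def sample_uniform_linkers(n_samples, max_len=120, seed=0):
    """Draw linker lengths uniformly from the integers 1..max_len."""
    rng = random.Random(seed)
    return [rng.randint(1, max_len) for _ in range(n_samples)]


if __name__ == "__main__":
    print(sample_periodic_linkers(10))
    print(sample_uniform_linkers(10))
```

Recovering the periodic input but not inventing periodicity from the uniform input is the kind of control that panels (A–F) describe.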

Figure 6. Simulated gel electrophoresis patterns under linker length distributions from the DHMM-mixture model.

Frequency is plotted versus number of nucleosomes in each oligonucleosome fragment band, shown on a log scale (since mobility in a gel electrophoretic separation is proportional to the logarithm of DNA fragment lengths) with simulated electrophoresis from left to right.
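The idea behind the simulated gel can be sketched as follows: build oligonucleosome fragment lengths from a core length plus sampled linkers, then place them on a log scale. The 147 bp core length comes from the Figure 2 caption; the fragment layout (k cores joined by k - 1 linkers, no flanking linker DNA) and the function names are assumptions for this illustration.

```python
import math
import random

CORE_BP = 147  # nucleosome core length used in the alignment (Figure 2)


def oligonucleosome_lengths(k, linker_sampler, n_fragments, rng):
    """Lengths of fragments containing k nucleosomes.

    Assumption: each fragment is k cores joined by k - 1 linkers, with no
    flanking linker DNA. linker_sampler(rng) returns one linker length in bp.
    """
    return [
        k * CORE_BP + sum(linker_sampler(rng) for _ in range(k - 1))
        for _ in range(n_fragments)
    ]


def band_positions(lengths):
    """Map fragment lengths to a log10 axis, as in the plotted gel pattern."""
    return [math.log10(length) for length in lengths]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical sampler: linkers drawn from a few 10n + 5 values.
    sampler = lambda r: r.choice([5, 15, 25, 35])
    for k in (1, 2, 3):
        lengths = oligonucleosome_lengths(k, sampler, 5, rng)
        print(k, lengths, [round(p, 3) for p in band_positions(lengths)])
```

Histogramming the log-scale positions for each k would give band profiles whose sharpness depends on how narrow the linker length distribution is.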

Figure 7. A diagram for the DHMM and linker DNA length estimation procedure.

The DHMM contains two oscillating states: nucleosome (N) and linker (L). The nucleosome state model (P_N) is a heterogeneous Markov chain trained on the nucleosome sequence alignment and held fixed throughout the DHMM. The linker state model (P_L) is a homogeneous Markov chain defined by the base composition at the first position q_1, the transition matrix v, and the linker length (duration) distribution F_L(d). The linker model is updated iteratively until convergence using the predicted linker DNAs between two nucleosomes. In particular, the linker length distribution F_L(d) is estimated using a kernel smoothing method or a mixture model method.
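One step of such a procedure, scoring candidate linker lengths for a single dinucleosome sequence, might look like the sketch below. It is a simplified illustration, not the authors' implementation: it assumes the first nucleosome starts at position 0, treats any DNA after the second nucleosome as linker-like sequence with no duration term, and uses hypothetical function names. The symbols q1, v and F_L follow the caption.

```python
import math


def markov_loglik(seq, first, trans_at):
    """Log-likelihood of seq under a first-order Markov chain.

    first[a] is P(first base = a); trans_at(i)[a][b] is P(b at i+1 | a at i).
    """
    if not seq:
        return 0.0
    ll = math.log(first[seq[0]])
    for i in range(len(seq) - 1):
        ll += math.log(trans_at(i)[seq[i]][seq[i + 1]])
    return ll


def linker_length_posterior(seq, core_len, nuc_first, nuc_trans, q1, v, F_L):
    """Posterior over linker length d for the layout N + L(d) + N + tail.

    nuc_trans is a list of position-specific transition tables (heterogeneous
    chain); v is a single table (homogeneous chain); F_L maps d to its prior.
    """
    def nuc(s):
        return markov_loglik(s, nuc_first, lambda i: nuc_trans[i])

    def link(s):
        return markov_loglik(s, q1, lambda i: v)

    scores = {}
    for d, p in F_L.items():
        end = 2 * core_len + d
        if d < 1 or p <= 0 or end > len(seq):
            continue  # skip lengths that do not fit in this sequence
        scores[d] = (math.log(p)
                     + nuc(seq[:core_len])
                     + link(seq[core_len:core_len + d])
                     + nuc(seq[core_len + d:end])
                     + link(seq[end:]))
    if not scores:
        raise ValueError("no feasible linker length for this sequence")
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {d: math.exp(s - top) for d, s in scores.items()}
    z = sum(weights.values())
    return {d: w / z for d, w in weights.items()}
```

In an iterative scheme of the kind the caption describes, posteriors like these would be pooled over all dinucleosomes to re-estimate F_L(d), q_1 and v, and the cycle repeated until the estimates stop changing.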
