Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning

Nature. 2008 Mar 13;452(7184):215-9.

doi: 10.1038/nature06745. Epub 2008 Feb 17.

Shawn J Cokus et al. Nature. 2008.

Abstract

Cytosine DNA methylation is important in regulating gene expression and in silencing transposons and other repetitive sequences. Recent genomic studies in Arabidopsis thaliana have revealed that many endogenous genes are methylated either within their promoters or within their transcribed regions, and that gene methylation is highly correlated with transcription levels. However, plants have different types of methylation controlled by different genetic pathways, and detailed information on the methylation status of each cytosine in any given genome is lacking. To this end, we generated a map at single-base-pair resolution of methylated cytosines for Arabidopsis, by combining bisulphite treatment of genomic DNA with ultra-high-throughput sequencing using the Illumina 1G Genome Analyser and Solexa sequencing technology. This approach, termed BS-Seq, unlike previous microarray-based methods, allows one to sensitively measure cytosine methylation on a genome-wide scale within specific sequence contexts. Here we describe methylation on previously inaccessible components of the genome and analyse the DNA methylation sequence composition and distribution. We also describe the effect of various DNA methylation mutants on genome-wide methylation patterns, and demonstrate that our newly developed library construction and computational methods can be applied to large genomes such as that of mouse.
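The sequence contexts that BS-Seq distinguishes (CG, CHG and CHH, where H stands for A, C or T) and the effect of bisulphite treatment on a read can be illustrated with a minimal Python sketch. The function names are hypothetical, only the forward strand is considered, and this is not the authors' pipeline; it only shows the underlying idea that unmethylated cytosines are read as T while methylated cytosines remain C.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at index i as 'CG', 'CHG' or 'CHH'.

    Returns None if seq[i] is not C or if the downstream bases needed
    to decide the context are missing or ambiguous (forward strand only).
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    nxt = seq[i + 1 : i + 3]  # up to two downstream bases
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) < 2 or nxt[0] not in "ACT" or nxt[1] not in "ACGT":
        return None
    return "CHG" if nxt[1] == "G" else "CHH"


def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulphite treatment followed by sequencing.

    Unmethylated C is read as T; C at any index listed in
    methylated_positions is protected and stays C.
    """
    protected = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq.upper())
    )


if __name__ == "__main__":
    read = "ACGTCCA"
    print(bisulfite_convert(read, {1}))
    print([cytosine_context(read, i) for i in range(len(read))])
```

Comparing a converted read with the reference sequence then indicates, for each reference cytosine, whether the read supports a methylated (C) or unmethylated (T) state.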

Figures

Figure 1. Methylation of different fractions of the Arabidopsis genome

a, Chromosome-wide distribution of methylation and correlation with repeats in sliding 100 kb windows. b, Methylation levels and siRNA abundance are plotted across different types of repeats and genes. c, High levels of methylation are detected at loci corresponding to siRNAs. d, Relationship between methylation levels and the length of different types of repeats and genes. e, From left to right, methylation levels of the three consecutive cytosines in the (CCCTAAA)n telomeric repeat unit, calculated in wild type and in the drm1 drm2 cmt3 mutant, respectively.
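A sliding-window methylation level of the kind plotted in panel a can be sketched as follows. The function name and input layout are hypothetical, and the level is assumed to be the fraction of reads showing C among all reads showing C or T at the cytosines in each window, which is a common convention for bisulphite data rather than a definition stated in this caption.

```python
def windowed_methylation(sites, window, step):
    """Compute methylation levels in sliding windows.

    sites: iterable of (position, c_reads, t_reads) per reference cytosine,
           where c_reads support methylation and t_reads support conversion.
    Returns a list of (window_start, level); level is None when the
    window contains no covered cytosine.
    """
    sites = list(sites)
    if not sites:
        return []
    last = max(pos for pos, _, _ in sites)
    result = []
    start = 0
    while start <= last:
        end = start + window
        in_window = [(c, t) for pos, c, t in sites if start <= pos < end]
        c_total = sum(c for c, _ in in_window)
        t_total = sum(t for _, t in in_window)
        covered = c_total + t_total
        result.append((start, c_total / covered if covered else None))
        start += step
    return result


if __name__ == "__main__":
    toy_sites = [(0, 8, 2), (5, 0, 10), (12, 5, 5)]
    for window_start, level in windowed_methylation(toy_sites, window=10, step=5):
        print(window_start, level)
```

For a chromosome-scale plot, `window` would be set to 100,000 to match the 100 kb windows in the caption; the step size is not given there and would have to be chosen.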

Figure 2. Sequence preferences for methylation in CG, CHG, and CHH contexts

Logos of sequence contexts that are preferentially methylated at the highest or lowest levels for 7-mer sequences in which the methylated cytosine is in the fifth position. In a, all genomic 7-mers in chromosome 1 were analyzed, while in b sequences were restricted to previously defined methylated sequences. The logo graphically displays the sequence enrichment at a particular position in the alignment of 7-mers in each class, measured in bits. The maximum sequence conservation per site is 2 bits (i.e., 1 base) when a site is perfectly conserved, and 0 if there is no preference for a nucleotide.
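The bit scale of such a logo follows from Shannon entropy: for four nucleotides, the information content of a position is 2 minus the entropy of its base frequencies. The sketch below, with hypothetical function names and no small-sample correction, computes this value per position for a set of aligned 7-mers.

```python
import math
from collections import Counter


def information_content(column):
    """Information content (bits) of one alignment column of A/C/G/T.

    IC = 2 - H, where H = -sum(p * log2(p)) over the observed bases.
    No small-sample correction is applied (a simplifying assumption).
    """
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


def logo_heights(kmers):
    """Per-position information content for equal-length aligned k-mers."""
    return [information_content(col) for col in zip(*kmers)]


if __name__ == "__main__":
    # Toy 7-mers with the cytosine of interest in the fifth position.
    toy_kmers = ["AATACGA", "CTTACGT", "GATTCGA", "TCAACGC"]
    print([round(h, 2) for h in logo_heights(toy_kmers)])
```

A column containing a single base gives 2 bits, and a column with all four bases at equal frequency gives 0 bits, matching the range described in the caption.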

Figure 3. Methylation shows periodic patterns

a, c, Correlation of the methylation status of cytosines in a CHH (a) and CHG (c) context. The _x_-axis indicates the distance between the two cytosines. The _y_-axis indicates the level of autocorrelation in methylation. The red line is a running average of windows that are ±2 bases around a single base. b, Fourier transform analysis of CHH methylation correlation. The _x_-axis indicates the number of cycles per 100 bases. The _y_-axis is the amplitude of the corresponding frequency. The peak at position 10 represents a periodicity of ten nucleotides, with a _p_-value smaller than 10^−108 for observing this periodicity value by chance in random permutations of the genome. In a-c, Monte Carlo sampling of three datasets each consisting of half the data was used to compute the means and standard deviations of the autocorrelations and Fourier transforms. Mean values are shown and error bars (a and b) represent standard deviations. In a and b, methylation from the whole genome was analyzed, while in c the analysis was restricted to previously defined methylated sequences (see Supplementary Fig. 15 for details).
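The link between an autocorrelation curve and a Fourier peak at "10 cycles per 100 bases" can be illustrated on synthetic data. The sketch below is not the authors' analysis: the function names are hypothetical, the input is an artificial 0/1 signal with a built-in ten-base period, and a plain discrete Fourier transform from the standard library stands in for whatever implementation was actually used.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Normalised autocorrelation of x for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    result = []
    for lag in range(1, max_lag + 1):
        cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
        result.append(cov / (n - lag) / var)
    return result


def dft_amplitudes(values):
    """Amplitudes of the mean-centred DFT for frequencies 0..len(values)//2.

    With 100 input values (lags 1..100), index k is the number of
    cycles per 100 bases.
    """
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    amplitudes = []
    for k in range(n // 2 + 1):
        total = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centred))
        amplitudes.append(abs(total) / n)
    return amplitudes


# Synthetic methylation calls: five methylated bases, then five unmethylated.
signal = [1 if i % 10 < 5 else 0 for i in range(1000)]
acf = autocorrelation(signal, max_lag=100)
amps = dft_amplitudes(acf)
peak = max(range(1, len(amps)), key=amps.__getitem__)
print("strongest frequency (cycles per 100 bases):", peak)
```

Because the synthetic signal repeats every ten positions, its autocorrelation oscillates with the same period and the strongest non-zero frequency corresponds to a period of 100 / 10 = 10 bases.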

Figure 4. BS-Seq profiling of methylation mutants in Arabidopsis and mouse

a, BS-Seq data mapping to protein-coding genes was plotted in 500 nucleotide sliding windows. Two vertical blue lines mark the boundaries between upstream regions and gene bodies (left) and between gene bodies and downstream regions (right). b, Distribution of methylation along chromosome 4 in 25 nucleotide sliding windows. In a and b, a horizontal blue line indicates zero percent methylation. c, Comparison of the amount of CG methylation in wild type and _mUhrf1_−/− embryonic stem cells, represented as the average number of CGs appearing per million sequenced nucleotides.

References

    1. Henderson IR, Jacobsen SE. Epigenetic Inheritance in Plants. Nature. 2007;447:418–424.
    2. Goll MG, Bestor TH. Eukaryotic cytosine methyltransferases. Annu Rev Biochem. 2005;74:481–514.
    3. Zhang X, et al. Genome-wide High-Resolution Mapping and Functional Analysis of DNA Methylation in Arabidopsis. Cell. 2006;126:1189–201.
    4. Zilberman D, Gehring M, Tran RK, Ballinger T, Henikoff S. Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat Genet. 2007;39:61–9.
    5. Vaughn MW, et al. Epigenetic Natural Variation in Arabidopsis thaliana. PLoS Biol. 2007;5:e174.
