Identification of alternative topological domains in chromatin

Darya Filippova et al. Algorithms Mol Biol. 2014.

Abstract

Chromosome conformation capture experiments have led to the discovery of dense, contiguous, megabase-sized topological domains that are similar across cell types and conserved across species. These domains are strongly correlated with a number of chromatin markers and have since been included in a number of analyses. However, functionally-relevant domains may exist at multiple length scales. We introduce a new and efficient algorithm that is able to capture persistent domains across various resolutions by adjusting a single scale parameter. The ensemble of domains we identify allows us to quantify the degree to which the domain structure is hierarchical as opposed to overlapping, and our analysis reveals a pronounced hierarchical structure in which larger stable domains tend to completely contain smaller domains. The identified novel domains are substantially different from domains reported previously and are highly enriched for insulating factor CTCF binding and histone marks at the boundaries.
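The abstract names dynamic programming and a single scale parameter but does not give the objective, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a window of bins k..l is scored by its total interaction frequency divided by its length raised to the power γ, minus the mean of that scaled score over all windows of the same length; the function name `find_domains` and this scoring rule are hypothetical. The recurrence then chooses a set of non-overlapping windows with maximal total score.

```python
from typing import List, Tuple


def find_domains(A: List[List[float]], gamma: float) -> List[Tuple[int, int]]:
    """Return non-overlapping (start, end) bin ranges, inclusive, maximizing
    the total of an ASSUMED quality score (not taken from the abstract)."""
    n = len(A)

    # 2D prefix sums so that any square block sum costs O(1).
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            P[i + 1][j + 1] = A[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]

    def scaled(k: int, l: int) -> float:
        # Sum of A over rows k..l and columns k..l, scaled by length**gamma.
        total = P[l + 1][l + 1] - P[k][l + 1] - P[l + 1][k] + P[k][k]
        return total / (l - k + 1) ** gamma

    # Mean scaled score for every window length (assumed normalization).
    mu = [0.0] * (n + 1)
    for length in range(1, n + 1):
        scores = [scaled(k, k + length - 1) for k in range(n - length + 1)]
        mu[length] = sum(scores) / len(scores)

    def quality(k: int, l: int) -> float:
        return scaled(k, l) - mu[l - k + 1]

    # opt[m] is the best total score for the first m bins.
    opt = [0.0] * (n + 1)
    back = [(0, False)] * (n + 1)
    for l in range(n):
        best, choice = float("-inf"), (0, False)
        for k in range(l + 1):
            q = quality(k, l)
            cand = opt[k] + max(q, 0.0)  # windows with q <= 0 are not domains
            if cand > best:
                best, choice = cand, (k, q > 0.0)
        opt[l + 1], back[l + 1] = best, choice

    # Trace back from the last bin, keeping only windows marked as domains.
    domains = []
    end = n
    while end > 0:
        start, is_domain = back[end]
        if is_domain:
            domains.append((start, end - 1))
        end = start
    return domains[::-1]
```

In this sketch a larger γ penalizes long windows more heavily, which is one way a single parameter could move the segmentation between coarse and fine domains. The running time is quadratic in the number of bins.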

Keywords: Alternative topological domains; Chromatin conformation capture; Dynamic programming.

Figures

Figure 1

Interaction matrix for a portion of human chromosome 1 from a recent Hi-C experiment by Dixon et al. [5]. Each axis represents a location on the chromosome with a step of 40 kbp. Densely interacting domains identified by the method of Dixon et al. are shown in red. Alternative domains are shown as dotted black lines on the upper triangular portion of the matrix. Visual inspection of the lower triangular portion suggests that domains could be completely nested within one another and could overlap heavily with Dixon et al.’s domains. This motivates the problem of identifying alternative domains across length scales.

Figure 2

Our approach identifies densely interacting domains across scales. (a) Our algorithm discovers domains with mean frequency value for inter- and intra-domain interactions (solid lines) at or better than that of Dixon et al. domains (dotted lines). Each solid line represents domains at a different resolution γ in human fibroblast cells. (b) Multiscale domains identified in human fibroblast cells by our dynamic program tend to have higher mean frequency than those of Dixon et al. (distributions are plotted after outliers > μ + 4σ were removed).
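The caption does not say how the mean frequencies are computed. One plausible reading, sketched here as an assumption, is to average interaction counts at each bin separation, separately for bin pairs inside the same domain and for all other pairs. The function name `mean_frequency_by_distance` and its arguments are hypothetical.

```python
from typing import Dict, List, Tuple


def mean_frequency_by_distance(
    A: List[List[float]],
    domains: List[Tuple[int, int]],
    max_dist: int,
) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Return (intra, inter): mean interaction frequency per bin separation d,
    for pairs in the same domain and for all remaining pairs."""
    n = len(A)

    # Map each bin to the index of its domain (None if outside every domain).
    dom_of = [None] * n
    for idx, (start, end) in enumerate(domains):
        for i in range(start, end + 1):
            dom_of[i] = idx

    sums = {"intra": {}, "inter": {}}
    counts = {"intra": {}, "inter": {}}
    for i in range(n):
        for d in range(1, max_dist + 1):
            j = i + d
            if j >= n:
                break
            same = dom_of[i] is not None and dom_of[i] == dom_of[j]
            key = "intra" if same else "inter"
            sums[key][d] = sums[key].get(d, 0.0) + A[i][j]
            counts[key][d] = counts[key].get(d, 0) + 1

    # Separations with no pairs of a given kind are simply absent.
    intra = {d: sums["intra"][d] / counts["intra"][d] for d in sums["intra"]}
    inter = {d: sums["inter"][d] / counts["inter"][d] for d in sums["inter"]}
    return intra, inter
```

Plotting `intra` and `inter` against d for two domain sets gives curves comparable in spirit to panel (a), although the exact normalization used in the figure is not stated here.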

Figure 3

Domain sizes and count across resolutions. The domain sizes increase and the domain count decreases as the resolution parameter drops. Above: plotted are maximum (red), average (blue), and minimum (green) domain size averaged over all chromosomes for the domains on human fibroblast cells (IMR90). The magenta line shows the average domain size for domains reported by Dixon et al. Below: the number of domains increases for higher values of resolution parameter. The magenta line displays domain count for Dixon et al.

Figure 4

Domain persistence across scales. (a) Domains identified by our algorithm (black) are smaller at higher resolutions and merge to form larger domains at γ close to 0. Visual inspection shows qualitative differences between consensus domains (red) and domains reported by Dixon et al. (green). Data shown for the first 4Mb of chromosome 1. (b) Variation of information for domains identified by our algorithm across different resolutions for chromosome 1 in human fibroblast cells.

Figure 5

Comparison of Dixon et al.’s domain set with the multiscale consensus set for chromosomes 1–22 (x-axis). We used the variation of information (VI) (y-axis) to compute distances between domain sets for the multiscale consensus set vs. Dixon et al. (blue dots) and the multiscale consensus vs. randomly shuffled domains (red diamonds).
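Variation of information between two partitions X and Y of the same bins is VI(X, Y) = H(X) + H(Y) − 2I(X; Y), which equals 2H(X, Y) − H(X) − H(Y). The sketch below computes it for two domain sets. How bins outside any domain are handled is not stated in the caption; treating each such bin as its own singleton cluster is an assumption made here, and the helper names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def domain_labels(domains: List[Tuple[int, int]], n_bins: int) -> List[int]:
    """Label each bin with its domain index. Bins outside every domain get a
    unique negative label (ASSUMPTION: uncovered bins are singletons)."""
    labels = [-(i + 1) for i in range(n_bins)]
    for idx, (start, end) in enumerate(domains):
        for i in range(start, end + 1):
            labels[i] = idx
    return labels


def variation_of_information(a: Sequence[int], b: Sequence[int]) -> float:
    """VI = 2*H(A, B) - H(A) - H(B), in nats, for two labelings of equal length."""
    if len(a) != len(b):
        raise ValueError("labelings must cover the same bins")
    n = len(a)

    def entropy(counts: Counter) -> float:
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    joint = Counter(zip(a, b))
    return 2 * entropy(joint) - entropy(Counter(a)) - entropy(Counter(b))
```

VI is zero for identical partitions and grows as the two domain sets disagree, which is why it can serve as a distance between a consensus set and either a reference set or a shuffled control.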

Figure 6

Enrichment for chromatin marks and histone modifications in domain boundaries. Enrichment of CTCF binding (a) in IMR90 and (b) in mESC and histone modifications (c), (d) in mESC around domain boundaries for our consensus set of persistent domains (left, blue), and for those identified by Dixon et al. (right, blue). Green lines represent the presence of CTCF at the midpoint of the topological domains.

Figure 7

Domain sets at various resolutions. The 10 best optimal and near-optimal solutions for resolutions γ = 0.5, 0.35, 0.15, 0.10 for a portion of human fibroblast chromosome 20 (IMR90). Variations in the domain assignments within a single γ and across resolutions correspond with visually identifiable, hierarchical regions of dense Hi-C interactions. All histone mark tracks were obtained from IMR90 cells. Plotted with WashU EpiGenome Browser [23].

References

    1. de Wit E, de Laat W. A decade of 3C technologies: insights into nuclear organization. Genes Dev. 2012;26(1):11–24. doi: 10.1101/gad.179804.111.
    2. Gibcus JH, Dekker J. The hierarchy of the 3D genome. Mol Cell. 2013;49(5):773–782. doi: 10.1016/j.molcel.2013.02.011.
    3. Cavalli G, Misteli T. Functional implications of genome topology. Nat Struct Mol Biol. 2013;20(3):290–299. doi: 10.1038/nsmb.2474.
    4. Fudenberg G, Getz G, Meyerson M, Mirny LA. High order chromatin architecture shapes the landscape of chromosomal alterations in cancer. Nat Biotechnol. 2011;29(12):1109–13. doi: 10.1038/nbt.2049.
    5. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485(7398):376–80. doi: 10.1038/nature11082.
