CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription

Cell. 2015 Dec 17;163(7):1611-27.

doi: 10.1016/j.cell.2015.11.024. Epub 2015 Dec 10.

Oscar Junhong Luo 1, Xingwang Li 2, Meizhen Zheng 1, Jacqueline Jufen Zhu 3, Przemyslaw Szalaj 4, Pawel Trzaskoma 5, Adriana Magalska 5, Jakub Wlodarczyk 5, Blazej Ruszczycki 5, Paul Michalski 1, Emaly Piecuch 3, Ping Wang 1, Danjuan Wang 1, Simon Zhongyuan Tian 1, May Penrad-Mobayed 6, Laurent M Sachs 7, Xiaoan Ruan 1, Chia-Lin Wei 8, Edison T Liu 1, Grzegorz M Wilczynski 5, Dariusz Plewczynski 9, Guoliang Li 10, Yijun Ruan 11

Zhonghui Tang et al. Cell. 2015.

Abstract

Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights into the topological mechanism of human variations and diseases.

Copyright © 2015 Elsevier Inc. All rights reserved.

Figures

Figure 1. Characteristics of ChIA-PET data for 3D genome mapping

A. Graphic of ChIA-PET mapping properties including binding peaks, enriched chromatin interactions, and non-enriched singleton PETs inferring topological neighborhood proximity. B. Comparison between CTCF ChIA-PET and in situ Hi-C data (GM12878). Left: loop/peak map views of CTCF ChIA-PET data at different zoom-in scopes. For each data track, loop view is at top, peak view at bottom; Y-axis indicates the contact frequency of loops (log10 scale) and intensity of binding peaks (linear scale). The maximum frequency and intensity are given in each data track. PET counts on the left side of each track show the numbers of interaction PETs detected in the given region. Middle & Right: CTCF ChIA-PET contact heatmap and matched zoom-in regions to the in situ Hi-C contact heatmap (Rao et al., 2014). Total numbers of sequence reads generated for the in situ Hi-C data, and the CTCF ChIA-PET data are given at the bottom. See also Figure S1.

Figure 2. CTCF-defined chromatin looping topology

A. Characterization of CTCF-mediated loops in relation to cohesin binding and CTCF-motif orientation. See also Figure S2E-G. B. CTCF and RAD21 binding patterns centered on CTCF-motif. ChIP-Exo data of CTCF (Rhee and Pugh, 2011) and ChIP-nexus data of RAD21 were plotted as density curves around CTCF-motif sites. Borders of DNA footprints were identified by occupancy peaks. The green peaks depict the two borders of CTCF occupancy. The dashed blue line shows the peak position depicting the 5’ border of the RAD21 footprint. The solid blue line shows bimodal peaks from the 3’ border of RAD21 occupancy. See also Data S1, II. C. A mapping browser screenshot shows the CTCF-defined chromatin interactions and contact domains. Hi-C determined TADs (Dixon et al., 2012) and in situ Hi-C identified loops (Rao et al., 2014) are also shown. CTCF-motif position and orientation at the corresponding interaction anchors are shown as red arrows. Insert: zoom-in regions highlight CCD boundaries having multiple CTCF-binding peaks and motifs with inward-facing orientation. D. Cumulative density plot shows the genomic span distribution of individual CTCF loops, CCDs, in situ Hi-C loops and Hi-C TADs. See also Figure S3C-D. E. Statistics of CTCF-motif orientation at CCD boundaries. See also Figure S3E. F. Cumulative density of the consistency of motif orientation for tandem loops residing within the same CCD unit (red). The rewired data (grey) refer to tandem looping directions assigned at random (either left or right). The observed tandem loops within a given CCD have significantly higher directional consistency than expected by chance. The _P_-value was calculated by the Wilcoxon test. See also Figure S3J. G. An example CCD with 2 simple-convergent (not crossing another loop anchor, red), 5 tandem (blue) and 3 complex-convergent (crossing another loop anchor, purple) loops from 8 anchors (a-h). The motifs in the 5 tandem loops are all in the rightward direction. Numbers in brackets depict the contact frequency. H. Proposed models: hairpin for convergent and coiled for tandem loops. I. Simulated 3D model of chromatin looping structure using the contact frequency and genomic span in G based on the folding principles proposed in H. This model is an average representation of data derived from millions of cells. The simulation is detailed in Extended Experimental Procedures. See also Figure S2 and S3.
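The convergent/tandem distinction in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_loop`, the strand encoding ("+" for a rightward-pointing motif, "-" for a leftward-pointing one), and the "divergent" label for the remaining combination are assumptions made for illustration only.

```python
def classify_loop(left_strand: str, right_strand: str) -> str:
    """Classify a loop from the CTCF-motif strands at its two anchors.

    Assumed encoding (not from the source): "+" = motif points rightward,
    "-" = motif points leftward. left_strand belongs to the upstream anchor,
    right_strand to the downstream anchor.
    """
    valid = {"+", "-"}
    if left_strand not in valid or right_strand not in valid:
        raise ValueError("strands must be '+' or '-'")
    if left_strand == "+" and right_strand == "-":
        return "convergent"      # motifs face each other (inward-facing)
    if left_strand == "+" and right_strand == "+":
        return "tandem-right"    # both motifs point rightward
    if left_strand == "-" and right_strand == "-":
        return "tandem-left"     # both motifs point leftward
    return "divergent"           # motifs face away from each other


# Hypothetical anchors, for illustration only.
for pair in [("+", "-"), ("+", "+"), ("-", "-"), ("-", "+")]:
    print(pair, classify_loop(*pair))
```

Under this encoding, the five tandem loops described in panel G would all fall into the "tandem-right" class.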

Figure 3. Relationship between CTCF/cohesin-mediated chromatin structure and RNAPII-associated transcriptional function

A. Browser view of a 5Mb genomic segment with 4 CCDs showing overlapping CTCF and RNAPII ChIA-PET data along with chromatin state (ChromHMM) and RNA-Seq data. In the ChromHMM track, red for active promoter, yellow for enhancer and green for transcribed region. See Extended Experimental Procedures for the detailed color code. B. Aggregation density plots showing histone modification (top), RNAPII binding and TSS (bottom) distribution profiles around the CTCF anchors and the loop regions. X-axis: CTCF-anchors were taken from the anchor center with ±10% extension proportional to the enclosed loop regions. Y-axis: Intensity (FPKM). C. Similar to B, but only around the anchor centers (±2kb). Anchors of convergent and tandem loops are analyzed separately. D. Similar to C, but ChIP-Seq data of selected TFs are plotted. Upper: ELF1 and ZEB1. Lower: CTCF, RAD21 and ZNF143. E. Boxplots for the ratio of H3K4me1/H3K4me3 ChIP-Seq at the anchors of convergent and tandem loops. A high ratio suggests enhancer potential; a low ratio indicates promoter function. F. Expression breadth (number of tissues a gene is expressed in) of CTCF anchor-genes and loop-genes in GM12878 (left). Anchor-genes (yellow) are significantly less likely to be tissue-specific than loop-genes (green) (P < 2.2e-16, nonparametric Kolmogorov-Smirnov test). Anchor-genes are further divided as active (red) and inactive (yellow) for analysis (right). The expression breadth of all genes (grey) is included as reference. G. Proposed chromatin model. Top: A schematic CCD with anchor-gene/enhancer and loop-gene/enhancer associated with RNAPII interactions. g: gene; e: enhancer. Bottom left: CTCF-mediated loop model shows relative anchor and loop positions. Dotted arrow lines indicate the connectivity brought by RNAPII. Bottom right: RNAPII-participated model shows that RNAPII draws loop-genes/enhancers towards the CTCF anchors, docking the RNAPII foci onto the CTCF-foci. H. Browser view of a CCD with complex sub-domain structures. It involves a number of anchor-genes/enhancers and loop-genes/enhancers, which are also connected by RNAPII-mediated loops. Orange and purple vertical bars highlight the promoters of anchor-gene and loop-gene, respectively. Right: A simulated 3D model for this topological domain mediated by CTCF/cohesin and the embedded transcriptional complex. This model is an average representation of data obtained from millions of cells. See also Figure S4 and Extended Experimental Procedures.

Figure 4. Haplotype mapping of chromatin interaction

A. Statistics of phased PETs in GM12878. Intra-chromosomal PETs were distinguished as _cis_-PETs and _trans_-PETs. A _cis_-PET has the two tags mapped to phased SNPs with the same haplotype (M-M or P-P); a _trans_-PET has the two tags mapped to phased SNPs in opposite haplotypes (M-P or P-M). B. Identification of haplotype chromatin interactions. Top: Schematic of the haplotype phasing of ChIA-PET mapping. Phased SNPs with CTCF or RNAPII binding were first identified. Interaction anchors overlapping with phased SNPs are referred to as “Phased anchors” (vertical bar indicates the phased SNP). Interaction loops originating from paired phased anchors are “Phased interactions” (red). Interactions with only one side originating from phased anchors are “Extended phased interactions” (yellow). All other interactions that cannot be reliably determined are “Unphased interactions” (grey). Bottom: An example CCD, where three phased SNPs are identified with significant haplotype bias in CTCF-binding. The SNP nucleotides are color coded for their haplotypes (red: maternal, blue: paternal). Allele-specific binding frequencies in PET counts are given in parentheses with corresponding color code. *: P << 0.05, Binomial test. C. Statistics of haplotype-biased anchors and interactions. Allele-specific genes associated with haplotype-biased RNAPII loops are also shown. D. Haplotype-specific super-long interactions mediated by CTCF connecting 3 loci: DXZ4, FIRRE and G6PD in ChrX. Upper: Contact heatmaps of the maternal (M) and paternal (P) homologs showing the contacts of the three loci identified only in the paternal homolog. Lower: Loop/peak view of interactions mediated by CTCF in paternal ChrX. Phased allele frequencies of the SNPs at the FIRRE locus are shown in aggregate. The simulated 3D models for ChrX and the DXZ4-FIRRE-G6PD segment are presented in Figure S7F-G. E. DNA-FISH validation of the DXZ4-FIRRE-G6PD interactions. Left: Expected conformations and probe design. Right: Microscopic image in a nucleus with two clusters of the three testing probes. The numbers of total examined nuclei and nuclei with the expected probe pattern are shown. The _P_-value was calculated by the binomial test. See also Figure S5.
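The _cis_/_trans_ definition in panel A can be expressed as a short Python sketch. The function name `classify_pet`, the use of `None` for a tag that does not overlap a phased SNP, and the example tag pairs are all hypothetical and serve only to illustrate the rule stated in the legend.

```python
from collections import Counter
from typing import Optional


def classify_pet(tag1_hap: Optional[str], tag2_hap: Optional[str]) -> str:
    """Label a PET from the haplotypes of its two tags.

    Each haplotype is "M" (maternal), "P" (paternal), or None when the tag
    does not overlap a phased SNP (an assumption of this sketch).
    """
    if tag1_hap is None or tag2_hap is None:
        return "unphased"
    if tag1_hap not in ("M", "P") or tag2_hap not in ("M", "P"):
        raise ValueError("haplotype must be 'M', 'P' or None")
    # Same haplotype on both tags (M-M or P-P) is cis; M-P or P-M is trans.
    return "cis" if tag1_hap == tag2_hap else "trans"


# Hypothetical tag pairs, not data from the paper.
pets = [("M", "M"), ("P", "P"), ("M", "P"), ("P", None)]
print(Counter(classify_pet(a, b) for a, b in pets))
```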

Figure 5. SNPs altering allelic CTCF chromatin interaction and the functional implication

A. Schematic of using SNP as single nucleotide “perturbation” for validation of CTCF-mediated chromatin interactions. In individual 1, phased SNP and allele-specific CTCF binding are used to determine the functional and dysfunctional alleles for CTCF interaction. However, one cannot immediately extrapolate that “no binding = no looping”. In individuals 2 and 3, homozygous alleles at the corresponding SNP location, possessing either the functional or the dysfunctional CTCF interaction allele, were analyzed for the presence or absence of CTCF binding and looping, respectively, thus validating the function of CTCF in mediating chromatin interaction. B. An example using data from GM12878, HeLa and MCF7 shows CCD structures perturbed by SNP. A phased SNP (maternal “T”, paternal “A”) is identified at the right boundary of a CCD in GM12878. Differential strength of CTCF binding (169:37) was detected and the CTCF loops were extrapolated based on the biased binding. At this SNP locus, both HeLa and MCF7 were homozygous “A/A” (dysfunctional CTCF allele), had no CTCF binding, and showed no chromatin contact originating from this site. *: P << 0.05; Binomial test. See also Figure S6A. C. An example in GM12878 illustrating CTCF tandem loop with allele-specificity and consequent impact on allele-biased transcription. A phased SNP (rs599134) is located in the CTCF-motif (dashed box highlighted) of a “tail” anchor of a tandem loop with the “head” anchor and CTCF-motif (highlighted in orange) proximal to the promoter of _CHI3L2_. The CTCF binding and looping in this region are paternal-specific, and the RNAPII binding and interactions are significantly paternal-biased as indicated by multiple heterozygous SNPs in this region. The expression of _CHI3L2_ also exhibited significant paternal-bias. In contrast, the genes (_CEPT1_ and _DENND2D_) immediately upstream of the tandem loop showed balanced expression. Nucleotide sequences of the highlighted CTCF-binding site are shown at the bottom with the motif underlined. *: _P_ << 0.05, Binomial test. D. Logos from 70 CTCF-motifs with allelic SNP disruption on CTCF interaction. Haplotype motifs with strong CTCF binding had the canonical consensus (top); motifs with weak CTCF binding displayed a deviated consensus (bottom), especially at position 14. Examples of SNPs in CTCF-motif disrupting CTCF-binding and looping patterns are shown in Data S1, III. E. CTCF-motif disrupted by SNP is linked to disease susceptibility. Top: An example of allele-specific disruption on CTCF-interaction by a SNP within a CTCF motif. SNP (rs12936231) resides at motif position 14 of a CTCF-interaction site. Middle: Linkage disequilibrium between this CTCF-SNP (rs12936231, in red box) and the other 6 asthma associated SNPs (in orange boxes) in the CEU population. These seven SNPs are identified in a significant LD block (D’ value > 0.5 and LOD ≥ 3) as highlighted in black triangle. Bottom: Haplotypes of these seven SNPs associated with asthma in the CEU population. The dysfunctional “C” allele of the CTCF-SNP (rs12936231) is frequently (0.422) associated with the risk alleles of the other 6 SNPs in CEU. See also Figure S6.
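As a rough illustration of how allele-biased counts such as the 169:37 CTCF binding ratio in panel B can be assessed, the sketch below computes an exact binomial P-value in Python. The legend names only a "Binomial test"; the null proportion of 0.5 (no allelic bias) and the two-sided alternative are assumptions of this sketch, and `binomial_two_sided_p` is a hypothetical helper, not the authors' code.

```python
from math import comb


def binomial_two_sided_p(a: int, b: int) -> float:
    """Exact two-sided binomial P-value for counts a vs b.

    Assumes a null success probability of 0.5, under which the
    distribution is symmetric, so the two-sided P-value is twice the
    upper tail starting at the larger count (capped at 1).
    """
    if a < 0 or b < 0:
        raise ValueError("counts must be non-negative")
    n = a + b
    larger = max(a, b)
    # Number of outcomes with at least `larger` successes out of n.
    tail = sum(comb(n, k) for k in range(larger, n + 1))
    return min(1.0, 2 * tail / 2 ** n)


# Allele-specific PET counts quoted in panel B (maternal vs paternal).
p = binomial_two_sided_p(169, 37)
print(f"P = {p:.3g}")
```

Integer arithmetic is used for the tail count so that large counts do not lose precision before the final division.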

Figure 6. Allele-biased chromatin interactions mediated by RNAPII

A. Schematic of using SNPs to investigate allelic-effects of transcription regulation via haplotype-biased occupancy and interaction mediated by RNAPII and TFs. B. Profile of allele-biased binding by 40 TFs at the allele-biased anchors (maternal 81, paternal 56) of RNAPII interactions. Each row represents an allele-biased RNAPII anchor with allele-biased binding by at least one TF. Each column represents one of the 40 tested TFs. Each colored tile indicates TF binding bias: maternal, red; paternal, blue. C. Genes involved in allele-specific RNAPII-mediated interactions with phased transcripts. Eighty-nine (89) genes showed significant allele-biased gene expression. D. Left: Boxplot of the expression levels of genes with (red box) and without (grey box) allele-bias. Genes with allele-bias (n=89) have significantly higher expression than the non-biased genes (n=393) (P = 1.9e-06). Right: MA-plot of the allele-biased gene expression levels. X-axis measures expression abundance, Y-axis indicates differential expression between the two haplotypes. M: maternal-biased, red, n=61; P: paternal-biased, blue, n=18; C: biased in the direction opposite to the corresponding haplotype-biased RNAPII interaction, black, n=10; N: No bias, grey, n=393. See also Table S5. E. An example shows allele-biased RNAPII binding/looping, and the regulatory effect on the associated genes. Left: Haplotype contact heatmaps (M, maternal; P, paternal) of the genomic segment indicating paternal haplotype-specific long-range chromatin interactions (blue arrows). Right: Loop/peak browser view. On the left side, an enhancer was identified. This enhancer overlaps with a phased SNP (maternal “T”, paternal “C”) and connects downstream to an RNAPII-mediated interaction complex involving 3 genes (LOC374443, CLEC2D, CLECL1). There are 11 phased SNPs in the multi-gene complex. Both the enhancer and the gene complex exhibited paternal-biased RNAPII binding and interactions. The expression of the 3 genes is also paternal-biased. In addition, the enhancer also showed paternal-biased binding by 3 B-cell specific TFs BCL3, EBF1 and TCF12. All allele-specific sequence reads coverage by RNAPII and TF binding and transcripts are shown in aggregate numbers in the parentheses. *: P<<0.05, Binomial test.

Figure 7. Chromatin model of CTCF foci and RNAPII transcription factories

A. 3D models of Chr1 at 3 resolutions (16Mb, 2Mb, 100bp) with views from different angles. The color bar indicates the proportional genomic coordinates of 3D models. B. An ensemble model of Chr1 folding dynamics in GM12878. C. 3D images of DNA-FISH for the two copies of Chr1 (red) in a nucleus from different angles. The positional patterns of the two probes (green and blue) indicate two chromosomal conformations, “open” and “close”. D. An overall model of chromosomal folding involving CTCF and RNAPII. Left: The chromosome in interphase is loosely organized with chromatin loops extended from the condensed chromosome axis core that maintains the overall conformation of chromosome territory. Right: Zoom-in transverse and longitudinal cross section views. CTCF locates on the surface of the chromosome axis core, defining the interface between the inner condensed (inactive) and the outer open (active) compartments for transcription. E. Super-resolution SIM microscopic images of CTCF and RNAPII immunostains. Left: CTCF (green) and RNAPII (red) foci in GM12878 nucleus. Middle: merged images from CTCF and RNAPII without (middle top) and with (middle bottom) DNA stain (Hoechst 33342, blue). Top right: zoom-in merged image. Bottom right: 3D reconstruction of co-localization with the depth of the scanned volume. Bar chart: Statistics of Spearman's correlation values between CTCF and RNAPII signals from 21 cell nuclei. Control (Ctrl) is from random sampling of 100 nm-sized CTCF and RNAPII immunostained images. Data are shown as mean with s.e.m. *: P < 0.001. F. FLIM of GM12878 nuclei subjected to FRET. Nuclei were stained immunofluorescently. Left: CTCF + RNAPII co-immunostained. Right: CTCF + non-immune IgG as negative control. Alexa488 labeled CTCF served as a donor for FRET, while Cy3 labeled RNAPII as an acceptor. Color-coded pixels correspond to values of mean fluorescent lifetimes as indicated by color bar below. Bottom: Distribution curve of fluorescence lifetime in the experiments, CTCF + RNAPII (blue) and CTCF + non-immune IgG (red). The occurrence of FRET between the donor and acceptor (co-localization of CTCF/RNAPII with the inter-molecular distances between the fluorophores ≤ 10 nm) is revealed by the shortening of the lifetime (nanoseconds) of the donor fluorescence. G. CTCF immunostain of lampbrush chromosome. Left: Light microscopy of oocyte, the germinal vesicle (GV, nucleus), and lampbrush chromosomes isolated from Pleurodeles waltl, with zoom-in confocal microscopy of lampbrush chromosomes stained by CTCF and IgG antibodies. The confocal microscopic images show that the CTCF signals are mostly concentrated along the chromosome axis, but the control IgG signals are scattered evenly. Right: Bar chart of immunostaining measurements on chromosome axis and laterally extended chromatin loops. The CTCF signals are significantly higher on chromosome axis than on chromatin loops (*: P < 0.001). Data are presented as mean with s.e.m. See also Figure S7, Data S2 and Extended Experimental Procedures.

References

    1. Bickmore WA. The spatial organization of the human genome. Annu Rev Genomics Hum Genet. 2013;14:67–84.
    2. Boyle S, Rodesch MJ, Halvensleben HA, Jeddeloh JA, Bickmore WA. Fluorescence in situ hybridization with high-complexity repeat-free oligonucleotide probes generated by massively parallel synthesis. Chromosome Res. 2011;19:901–909.
    3. Cremer M, Grasser F, Lanctot C, Muller S, Neusser M, Zinner R, Solovei I, Cremer T. Multicolor 3D fluorescence in situ hybridization for imaging interphase chromosomes. Methods Mol Biol. 2008;463:205–239.
    4. Cullen KE, Kladde MP, Seyfred MA. Interaction between transcription regulatory regions of prolactin chromatin. Science. 1993;261:203–206.
    5. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380.
