The long-range interaction landscape of gene promoters
Amartya Sanyal et al. Nature. 2012.
Abstract
The vast non-coding portion of the human genome is full of functional elements and disease-causing regulatory variants. The principles defining the relationships between these elements and distal target genes remain unknown. Promoters and distal elements can engage in looping interactions that have been implicated in gene regulation. Here we have applied chromosome conformation capture carbon copy (5C) to interrogate comprehensively interactions between transcription start sites (TSSs) and distal elements in 1% of the human genome representing the ENCODE pilot project regions. 5C maps were generated for GM12878, K562 and HeLa-S3 cells and results were integrated with data from the ENCODE consortium. In each cell line we discovered >1,000 long-range interactions between promoters and distal sites that include elements resembling enhancers, promoters and CTCF-bound sites. We observed significant correlations between gene expression, promoter-enhancer interactions and the presence of enhancer RNAs. Long-range interactions show marked asymmetry with a bias for interactions with elements located ∼120 kilobases upstream of the TSS. Long-range interactions are often not blocked by sites bound by CTCF and cohesin, indicating that many of these sites do not demarcate physically insulated gene domains. Furthermore, only ∼7% of looping interactions are with the nearest gene, indicating that genomic proximity is not a simple predictor for long-range interactions. Finally, promoters and distal elements are engaged in multiple long-range interactions to form complex networks. Our results start to place genes and regulatory elements in three-dimensional context, revealing their functional relationships.
Conflict of interest statement
The authors declare no competing financial interests.
Figures
Figure 1. 5C approach to identify looping interactions
a, 5C design. Reverse 5C primers were designed for HindIII fragments that contain a TSS (red; according to Gencode v7) and forward 5C primers for all other ‘distal’ HindIII fragments (blue). b, Heatmap of all interrogated TSS-distal fragment interactions in 14 ENCODE regions (ENm001-014) in K562 cells. Fragments are displayed in their genomic order. Each dark rectangular area in the heatmap denotes interactions within a single ENCODE region, while the remaining areas denote interactions between regions. ENCODE regions that are on the same chromosome show a higher interaction frequency (arrow) than regions that are on different chromosomes. c, d, Examples of 5C interaction profiles for two TSSs indicated by vertical orange bars (left: ASCL6 gene located in ENm002; right: γ-δ globin located in ENm009). The solid red lines show the expected interaction level (LOWESS line, Supplementary Methods); dashed red lines above and below indicate LOWESS ± 1 standard deviation. 5C signals that are significantly higher than expected in both biological replicates (green circles, False Discovery Rate = 1%) are considered looping interactions. Interactions that are significant in only one replicate (blue circles) are not considered high-confidence 5C looping interactions. 5C peak calling detects a long-range interaction between the TSS of ASCL6 and a distal CTCF-bound element in GM12878 cells. The approach identifies the known long-range interactions of γ-δ globin to HS-3,4,5 and -111 and several additional DHS and CTCF sites in K562 cells (ref. 2; labeled).
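The replicate rule in panels c and d (a signal counts as a looping interaction only if it is significant in both biological replicates at FDR = 1%) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the caption does not say how the FDR was controlled, so the Benjamini-Hochberg procedure is used here as an assumption, and the per-interaction p-values (presumably derived from comparing the observed 5C signal with the LOWESS expectation) are taken as given inputs. The function names are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.01):
    """Return the indices of p-values that pass Benjamini-Hochberg at `fdr`.

    Assumption: BH stands in for whatever FDR procedure the study used.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank whose p-value is below its BH threshold.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return set(order[:cutoff_rank])


def high_confidence_loops(pvals_rep1, pvals_rep2, fdr=0.01):
    """Keep only interactions that are significant in both replicates."""
    return benjamini_hochberg(pvals_rep1, fdr) & benjamini_hochberg(pvals_rep2, fdr)


# Toy p-values for four TSS-distal pairs, one list per replicate.
rep1 = [0.0001, 0.5, 0.002, 0.9]
rep2 = [0.0002, 0.001, 0.8, 0.7]
print(high_confidence_loops(rep1, rep2))
```

In this toy input, pairs 1 and 2 are each significant in only one replicate, which corresponds to the blue circles in the figure.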
Figure 2. Distribution of looping interactions across cell types and their relationship with chromatin features and gene expression
a, Venn diagram showing the number of unique and overlapping looping interactions across 3 cell types. b, Heatmap showing the enrichment/depletion of chromatin features in looping fragments compared to all interrogated fragments, based on genome-wide datasets from the ENCODE consortium (Supplemental Table 7). Features include Open Chromatin: UW DHS, Duke DHS and UNC-FAIRE; Active Marks: Broad Institute Histone H3K4me1/2/3, H4K20me1, H3K27ac, H3K9ac; CTCF: Broad Institute CTCF ChIP peaks; Inactive Marks: Broad Institute Histone H3K27me3; and 7-way segmentation: based on HMM prediction for indicated cells. We further grouped segmentation categories E and WE into “E-class”, TSS and PF into “P-class”, and R and T into “Broad Marks”. The color scale represents the fold enrichment (red) or depletion (blue). The numbers listed inside each box represent p-values of the significant enrichment/depletion for that mark (NS = not significant, grey; two-tailed hypergeometric test, corrected for multiple testing using Bonferroni). c, Venn diagram showing the number of unique and overlapping looping distal fragments (top) and looping interactions (bottom) among 4 functional groups in GM12878 cells. Distal fragments are classified into 4 non-exclusive groups based on the 7-way segmentation. Similarly, TSS-distal fragment interactions are classified based on the functional grouping of the distal fragments. The four functional groups are E-class (yellow), P-class (magenta), CTCF (cyan) and Unclassified (grey). d, Pie charts showing percentages and numbers of expressed/non-expressed TSSs looping or not looping to a particular group (E-, P-, CTCF or Unclassified; colored as in c) of distal fragments in GM12878 cells. TSSs with a CAGE value > 0 are deemed expressed. Significant enrichment for expressed TSSs in the looping or non-looping categories is indicated on top (hypergeometric test; p_hyper < 0.05). Significant differences in expression levels between TSSs in the looping vs the non-looping category are indicated on the left (Wilcoxon signed-rank test; p_Wilcoxon < 0.05).
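The enrichment statistics in panel b (two-tailed hypergeometric test with Bonferroni correction) can be illustrated with a short standard-library sketch. The caption does not specify how the two tails were combined, so doubling the smaller tail is an assumption made here, and all function names and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_tails(k, M, n, N):
    """Lower and upper tail probabilities for observing k marked fragments.

    M: all interrogated fragments, n: fragments carrying the chromatin feature,
    N: looping fragments, k: looping fragments carrying the feature.
    """
    total = comb(M, N)
    pmf = [comb(n, i) * comb(M - n, N - i) / total for i in range(min(n, N) + 1)]
    lower = sum(pmf[: k + 1])  # P(X <= k), depletion
    upper = sum(pmf[k:])       # P(X >= k), enrichment
    return lower, upper


def two_sided_p(k, M, n, N):
    """Two-tailed p-value; doubling the smaller tail is an assumed convention."""
    lower, upper = hypergeom_tails(k, M, n, N)
    return min(1.0, 2 * min(lower, upper))


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Toy example: 10 fragments, 5 with the mark, 5 looping, all 5 looping ones marked.
raw = [two_sided_p(5, 10, 5, 5), two_sided_p(3, 10, 5, 5)]
print(bonferroni(raw))
```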
Figure 3. Looping landscape of TSSs to distal fragments
a, Composite profile of the average number of group-specific looping interactions upstream and downstream of TSSs based on combined 5C interaction data from the 3 cell lines. The top panel shows the average looping profiles of all TSSs (left), of expressed TSSs (middle) and of non-expressed TSSs (right). The bottom set of plots shows the corresponding profiles of all interrogated TSS-distal element interactions (left), of expressed TSSs (middle) and of non-expressed TSSs (right). All the interaction data for a particular group for all 3 cell lines are binned with a sliding window of 150 kb (step size of 5 kb) and normalized for the number of TSSs. b, Histogram showing the number of distal fragments that are involved in looping with their target promoters skipping 0, 1, 2, …, 25 (and above) TSSs. c, Histogram showing the number of looping interactions that skip over 0, 1, 2, …, 25 (and above) restriction fragments bound by either CTCF (left) or by both CTCF and Rad21 (cohesin; right). In b and c, combined results for all 3 cell lines are plotted, and values above 24 on the x-axis are added and grouped as 25+. Percentages of looping interactions that skip ≥1 CTCF (left) or CTCF + cohesin (right) are indicated on top.
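The two computations described in this caption, the sliding-window profile of panel a (150 kb window, 5 kb step, normalized by the number of TSSs) and the count of elements skipped by a loop in panels b and c (capped at 25+), can be sketched as below. This is an illustrative reading of the caption, not the authors' code; the function names, the `span` parameter and the use of single genomic coordinates for TSSs and bound fragments are assumptions.

```python
from bisect import bisect_left, bisect_right


def sliding_window_profile(offsets, n_tss, span=500_000, window=150_000, step=5_000):
    """Average loops per TSS in overlapping windows around the TSS.

    offsets: distal-fragment position minus TSS position, in bp
    (negative = upstream). `span` is an assumed plotting range.
    """
    profile = []
    start = -span
    while start + window <= span:
        end = start + window
        count = sum(1 for o in offsets if start <= o < end)
        profile.append((start + window // 2, count / n_tss))
        start += step
    return profile


def count_skipped(tss_pos, distal_pos, sorted_sites, cap=25):
    """Number of sites (e.g. other TSSs or CTCF-bound fragments) lying strictly
    between a TSS and its looping partner, grouped at `cap` and above."""
    lo, hi = sorted((tss_pos, distal_pos))
    skipped = bisect_left(sorted_sites, hi) - bisect_right(sorted_sites, lo)
    return min(skipped, cap)


# Toy data: three loops around two TSSs, and four annotated TSS positions.
profile = sliding_window_profile([-120_000, -100_000, 50_000], n_tss=2,
                                 span=200_000, step=50_000)
print(profile[0])
print(count_skipped(100, 350, [100, 200, 300, 400]))
```

Using binary search on a pre-sorted list of sites keeps each skip count logarithmic in the number of annotated sites, which matters little at ENCODE-pilot scale but is a cheap default.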
Figure 4. Networks of looping interactions
a, Histogram showing the percentage of TSSs (left, red) or distal fragments (middle, blue) that are involved in 0, 1, 2, …, 10 (and above) looping interactions (degree, x-axis) in GM12878 cells. All the values for degrees that are >9 are grouped under degree 10+. The dark red bars represent the percentages of looping TSSs that are expressed, while light red bars represent the percentages of looping TSSs that are not expressed. Inset: the difference in percentages between looping TSSs that are expressed and not expressed for each degree. Right panel: degree distribution for each functional group of distal fragments. The average degrees (mean, μ) for TSSs and distal fragments are indicated. The first value is the mean degree considering all the TSS/distal fragments (looping + non-looping), while the second value is the mean degree of looping TSS/distal fragments (excluding degree = 0). b, Webplot showing the long-range looping interactions in the ENr132 region in K562 cells. The interrogated distal fragments (blue circles) and the TSSs (red circles) are positioned according to genomic coordinates, and the Gencode v7 gene annotation is indicated. The size of each red circle denotes whether that TSS is expressed (big circle) or not expressed (small circle). The thin grey lines show all the interactions that were interrogated. The colored lines show significant looping interactions between TSSs and distal fragments of a particular group.
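The degree statistics in panel a (percentage of nodes at each degree with a 10+ bin, plus the two mean degrees with and without non-looping nodes) can be written as a small sketch. The data layout, a list of (TSS, distal fragment) pairs, and the function name are assumptions for illustration only.

```python
from collections import Counter


def degree_summary(nodes, interactions, side=0, cap=10):
    """Degree distribution for TSSs (side=0) or distal fragments (side=1).

    nodes: all interrogated nodes of that type, including non-looping ones.
    interactions: iterable of (tss, distal) looping pairs.
    Returns (percent per degree 0..cap, mean over all nodes, mean over looping nodes).
    """
    degree = Counter(pair[side] for pair in interactions)
    degrees = [degree.get(node, 0) for node in nodes]
    hist = Counter(min(d, cap) for d in degrees)  # degrees >= cap share one bin
    percentages = {d: 100.0 * hist.get(d, 0) / len(nodes) for d in range(cap + 1)}
    looping = [d for d in degrees if d > 0]
    mean_all = sum(degrees) / len(degrees)
    mean_looping = sum(looping) / len(looping) if looping else 0.0
    return percentages, mean_all, mean_looping


# Toy network: four TSSs, three looping interactions.
tss_nodes = ["t1", "t2", "t3", "t4"]
loops = [("t1", "d1"), ("t1", "d2"), ("t2", "d1")]
percentages, mean_all, mean_looping = degree_summary(tss_nodes, loops)
print(percentages[0], mean_all, mean_looping)
```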
Comment in
- Genomics: users' guide to the human genome.
Skipper M. Nat Rev Genet. 2012 Oct;13(10):678. doi: 10.1038/nrg3329. Epub 2012 Sep 7. PMID: 22955793. No abstract available.
Similar articles
- Genome-wide map of regulatory interactions in the human genome.
Heidari N, Phanstiel DH, He C, Grubert F, Jahanbani F, Kasowski M, Zhang MQ, Snyder MP. Genome Res. 2014 Dec;24(12):1905-17. doi: 10.1101/gr.176586.114. Epub 2014 Sep 16. PMID: 25228660. Free PMC article.
- Three-dimensional genome architectural CCCTC-binding factor makes choice in duplicated enhancers at Pcdhα locus.
Wu Y, Jia Z, Ge X, Wu Q. Sci China Life Sci. 2020 Jun;63(6):835-844. doi: 10.1007/s11427-019-1598-4. Epub 2020 Apr 2. PMID: 32249388.
- An integrated encyclopedia of DNA elements in the human genome.
ENCODE Project Consortium. Nature. 2012 Sep 6;489(7414):57-74. doi: 10.1038/nature11247. PMID: 22955616. Free PMC article.
- CTCF and cohesin: linking gene regulatory elements with their targets.
Merkenschlager M, Odom DT. Cell. 2013 Mar 14;152(6):1285-97. doi: 10.1016/j.cell.2013.02.029. PMID: 23498937. Review.
- Many facades of CTCF unified by its coding for three-dimensional genome architecture.
Wu Q, Liu P, Wang L. J Genet Genomics. 2020 Aug;47(8):407-424. doi: 10.1016/j.jgg.2020.06.008. Epub 2020 Sep 1. PMID: 33187878. Review.
Cited by
- Complex genetics of pulmonary diseases: lessons from genome-wide association studies and next-generation sequencing.
Pouladi N, Bime C, Garcia JGN, Lussier YA. Transl Res. 2016 Feb;168:22-39. doi: 10.1016/j.trsl.2015.04.016. Epub 2015 May 7. PMID: 26006746. Free PMC article. Review.
- Architectural proteins, transcription, and the three-dimensional organization of the genome.
Cubeñas-Potts C, Corces VG. FEBS Lett. 2015 Oct 7;589(20 Pt A):2923-30. doi: 10.1016/j.febslet.2015.05.025. Epub 2015 May 22. PMID: 26008126. Free PMC article. Review.
- DNA cyclization and looping in the wormlike limit: Normal modes and the validity of the harmonic approximation.
Giovan SM, Hanke A, Levene SD. Biopolymers. 2015 Sep;103(9):528-38. doi: 10.1002/bip.22683. PMID: 26014845. Free PMC article.
- Pluripotency in 3D: genome organization in pluripotent cells.
Denholtz M, Plath K. Curr Opin Cell Biol. 2012 Dec;24(6):793-801. doi: 10.1016/j.ceb.2012.11.001. Epub 2012 Nov 27. PMID: 23199754. Free PMC article. Review.
- Identification of multi-loci hubs from 4C-seq demonstrates the functional importance of simultaneous interactions.
Jiang T, Raviram R, Snetkova V, Rocha PP, Proudhon C, Badri S, Bonneau R, Skok JA, Kluger Y. Nucleic Acids Res. 2016 Oct 14;44(18):8714-8725. doi: 10.1093/nar/gkw568. Epub 2016 Jul 20. PMID: 27439714. Free PMC article.
References
- Dekker J, Rippe K, Dekker M, Kleckner N. Capturing Chromosome Conformation. Science. 2002;295:1306–1311.