SnapShot: Chromosome Conformation Capture

Author manuscript; available in PMC: 2019 Feb 13.

Published in final edited form as: Cell. 2012 Mar 2;148(5):1068.e1–1068.e2. doi: 10.1016/j.cell.2012.02.019


The organization of the genome in the nuclear space is nonrandom and affects genome functions, including transcription, replication, and repair. Specific genomic regions, from the same or different chromosomes, frequently associate physically with each other and with nuclear structures, giving rise to an intricately compartmentalized nucleus. Examples of genome interactions are the association of an enhancer with a promoter and the clustering of genes, such as rDNA genes in the nucleolus. Genome interactions have traditionally been studied using fluorescence in situ hybridization (FISH), which allows visualization of the spatial relationship between distinct genes or genome regions. The limitations of this method are that only known interactions can be interrogated, only a few loci can be probed in one experiment, and resolution is limited by the optics of the microscope.

The family of chromosome conformation capture techniques (C-technologies) is a set of biochemical approaches for determining the physical interaction of genome regions. C-technology approaches invariably involve five steps: (1) formaldehyde fixation to crosslink chromatin at sites of physical interaction, (2) cleavage of chromatin by restriction enzyme or sonication, (3) ligation under dilute conditions that favor ligation between DNA ends captured on the same complex over ligation arising from random collisions, (4) detection of ligation junctions using molecular biology steps that vary with the method, and (5) computational analysis to determine the interaction frequencies captured by ligation of the crosslinked chromatin.

C-technologies (3C, 4C, 5C, Hi-C) differ in how ligation junctions are detected and in the scope of interactions they can probe. The 3C method tests the interaction between two known sites in the genome, 4C allows probing of unknown interactors of a known bait sequence, 5C identifies all regions of interaction within a given genome domain, and Hi-C probes all occurring interactions genome-wide in an unbiased fashion. Additional variants (ChIA-PET, ChIP-Loop) incorporate a protein precipitation step, allowing identification of genome interactions that involve a specific protein of interest. The choice of method depends strongly on the nature and scope of the biological question, but also on the availability of resources, including the amount of starting material and sequencing capacity. Many derivatives of the standard C-techniques have been developed, often inspired by the specific biological question addressed or with the goal of improving specificity or reducing background.

C-technologies are population-based methods. They produce relative contact probabilities rather than absolute contact frequencies. The population-based nature arises because each genomic locus gives only one pairwise ligation junction in one cell. To allow high coverage and quantitative appraisal of contact profiles, thousands to millions of genome equivalents (cells) containing multiple ligation junctions must be included and combined in each experiment. Correlations between C contacts and DNA FISH have indicated that an interchromosomal association that occurs in 3%–5% of cells in a population will typically be detected as positive in most C methods. More frequent associations generally result in stronger signals; however, the strength of the signal may also reflect the affinity of a physical interaction rather than its frequency.
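The distinction between relative and absolute values can be illustrated with a minimal sketch. It assumes ligation junctions have already been counted per pair of loci; the function name, the locus labels, and the counts are hypothetical and are not taken from any published pipeline. Dividing each count by the experiment total gives values that can be compared between experiments of different sequencing depth, but that say nothing about the absolute fraction of cells in which a contact occurs.

```python
def relative_contact_frequencies(junction_counts):
    """Convert raw ligation-junction counts into relative frequencies.

    junction_counts: dict mapping a pair of locus labels to a read count.
    Returns a dict with the same keys whose values sum to 1.
    """
    total = sum(junction_counts.values())
    if total == 0:
        raise ValueError("no ligation junctions were counted")
    return {pair: count / total for pair, count in junction_counts.items()}


# Two hypothetical experiments on the same sample, sequenced 10x apart in depth.
shallow = {("A", "B"): 10, ("A", "C"): 30}
deep = {("A", "B"): 100, ("A", "C"): 300}

shallow_rel = relative_contact_frequencies(shallow)
deep_rel = relative_contact_frequencies(deep)
print(shallow_rel)
print(deep_rel)
```

Both toy experiments yield the same relative profile even though their raw counts differ tenfold, which is why raw counts alone should not be read as contact frequencies.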

A critical step in data analysis is to determine whether an interaction, detected as a ligation junction, is specific. Contact frequency decreases exponentially with linear genomic distance up to a few Mb from the reference point. Therefore, the frequency of a specific contact in the vicinity of a locus is expected to be higher than the background of random collisions at that distance. A good indicator of specificity beyond the Mb range is the detection of a given interaction as clusters of signals from adjacent restriction fragments.
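One simple way to compare a contact against its distance-matched background is an observed-over-expected ratio, sketched below. This is an illustration under stated assumptions rather than the procedure of any particular C method: fragments are indexed consecutively along one chromosome, the difference in index stands in for linear genomic distance, the background at each separation is taken as the mean count of all pairs at that separation, and pairs with no reads are assumed to be present in the input with a count of zero. The function names and the toy counts are hypothetical.

```python
from collections import defaultdict


def expected_by_separation(contacts):
    """Mean junction count for each fragment separation |i - j|.

    contacts: dict mapping (i, j) fragment-index pairs to junction counts.
    """
    totals = defaultdict(float)
    n_pairs = defaultdict(int)
    for (i, j), count in contacts.items():
        separation = abs(i - j)
        totals[separation] += count
        n_pairs[separation] += 1
    return {s: totals[s] / n_pairs[s] for s in totals}


def observed_over_expected(contacts):
    """Ratio of each observed count to the mean count at the same separation."""
    expected = expected_by_separation(contacts)
    ratios = {}
    for (i, j), count in contacts.items():
        background = expected[abs(i - j)]
        if background > 0:  # skip separations with no signal at all
            ratios[(i, j)] = count / background
    return ratios


# Toy counts for five consecutive fragments (0..4).
toy_contacts = {
    (0, 1): 100, (1, 2): 90, (2, 3): 110, (3, 4): 100,  # separation 1
    (0, 2): 50, (1, 3): 50, (2, 4): 50,                 # separation 2
    (0, 3): 20, (1, 4): 60,                             # separation 3
}

ratios = observed_over_expected(toy_contacts)
for pair in sorted(ratios):
    print(pair, round(ratios[pair], 2))
```

In the toy data the pair (1, 4) has fewer reads than any neighbouring pair, yet it is enriched 1.5-fold over other pairs at the same separation. A ratio of this kind is only a first filter; consistent enrichment across adjacent restriction fragments remains the stronger indicator of specificity.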

The resolution of C methods is determined by the nature of the restriction enzyme(s) used and, in the case of methods that use sequencing for detection, also by the number of sequencing reads. The frequency of recognition sequences of an endonuclease with a four base-pair (bp) recognition site is, in principle, sixteen times higher than that of a six bp cutter. The use of a four bp cutter is expected to increase the resolution of contacts in the Mb range, where multiple ligation events are captured both for specific contacts and for background collisions. Beyond this range, however, where clusters of restriction fragments define contact regions of tens to hundreds of kb, the advantage of using a four bp cutter is expected to diminish. Although many genome-wide assays have used dedicated microarrays, high-throughput sequencing is becoming the method of choice for global detection of ligation junctions. Sequencing depth is a technical barrier for resolution in some approaches, such as Hi-C and ChIA-PET. PCR-based technologies overcome this limitation by amplifying a subset of contacts, with the tradeoff of reduced coverage. Because ligation products are pairwise, the required sequencing depth grows with the square of the increase in resolution. Genomic coverage per sequencing depth also depends on the size of the inspected genome. For example, similar sequencing power provides contact resolution of tens of kb in yeast, but only Mb resolution in the human genome.
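The arithmetic behind these statements can be sketched as follows. The sketch assumes equal and independent base frequencies, so that a recognition site of n bp occurs on average once every 4^n bp, and it uses round, approximate genome sizes (about 3 Gb for human and about 12 Mb for budding yeast). The function names and constants are illustrative only.

```python
def expected_site_spacing(site_length_bp):
    """Average spacing between recognition sites, assuming all four bases
    are equally likely and independent (an idealization)."""
    return 4 ** site_length_bp


def count_bin_pairs(genome_size_bp, bin_size_bp):
    """Number of distinct pairs of genomic bins, counting a bin with itself."""
    n_bins = -(-genome_size_bp // bin_size_bp)  # ceiling division
    return n_bins * (n_bins + 1) // 2


# Approximate, rounded genome sizes (assumptions for illustration).
HUMAN_BP = 3_000_000_000
YEAST_BP = 12_000_000

# A six bp site is expected every 4096 bp, a four bp site every 256 bp.
cutter_ratio = expected_site_spacing(6) // expected_site_spacing(4)

human_pairs_1mb = count_bin_pairs(HUMAN_BP, 1_000_000)
human_pairs_500kb = count_bin_pairs(HUMAN_BP, 500_000)
yeast_pairs_10kb = count_bin_pairs(YEAST_BP, 10_000)

print("four bp vs six bp cutter site frequency ratio:", cutter_ratio)
print("human bin pairs at 1 Mb:", human_pairs_1mb)
print("human bin pairs at 500 kb:", human_pairs_500kb)
print("yeast bin pairs at 10 kb:", yeast_pairs_10kb)
```

Under these assumptions, halving the bin size in the human genome roughly quadruples the number of bin pairs that reads must cover, and yeast at 10 kb bins has a pair count (about 0.7 million) of the same order as human at 1 Mb bins (about 4.5 million), consistent with the yeast and human comparison above.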


REFERENCES