Genome-wide assessment of sequence-intrinsic enhancer responsiveness at single-base-pair resolution

Nat Biotechnol. 2017 Feb;35(2):136-144.

doi: 10.1038/nbt.3739. Epub 2016 Dec 26.

Cosmas D Arnold et al. Nat Biotechnol. 2017 Feb.

Abstract

Gene expression is controlled by enhancers that activate transcription from the core promoters of their target genes. Although a key function of core promoters is to convert enhancer activities into gene transcription, whether and how strongly they activate transcription in response to enhancers has not been systematically assessed on a genome-wide level. Here we describe self-transcribing active core promoter sequencing (STAP-seq), a method to determine the responsiveness of genomic sequences to enhancers, and apply it to the Drosophila melanogaster genome. We cloned candidate fragments at the position of the core promoter (also called minimal promoter) in reporter plasmids with or without a strong enhancer, transfected the resulting library into cells, and quantified the transcripts that initiated from each candidate for each setup by deep sequencing. In the presence of a single strong enhancer, the enhancer responsiveness of different sequences differs by several orders of magnitude, and different levels of responsiveness are associated with genes of different functions. We also identify sequence features that predict enhancer responsiveness and discuss how different core promoters are employed for the regulation of gene expression.

Conflict of interest statement

Competing Financial Interests

The authors declare no competing financial interests.

Figures

Figure 1. STAP-seq identifies position and orientation of transcription initiation within arbitrary candidate fragments.

(a) Experimental setup of STAP-seq. Short candidate DNA fragments are cloned into a reporter construct that provides an enhancer and a reporter gene (short open reading frame (ORF)). Active candidates initiate reporter transcripts that start with sequence tags depicting the exact TSS. These tags are then sequenced and mapped to the reference genome. (b) UCSC genome browser screenshot depicting STAP-seq using the zfh1 enhancer. Tag coverage is shown in a strand-specific manner. (c) Cumulative STAP-seq^zfh1 tag counts around FlyBase-annotated TSSs (aTSS). (d) Metagene profile of STAP-seq^zfh1 tag counts and short-nuclear-capped-RNA-seq (scRNA-seq) signals at experimentally determined STAP-seq TSSs (eTSSs). (e) Agreement of STAP-seq^zfh1 and scRNA-seq for eTSSs that are shifted with respect to aTSSs by 1–5 nt.

Figure 2. Induced activities are consistent across developmental enhancers.

(a) UCSC genome browser screenshot showing STAP-seq signals of the focused screens using the indicated developmental (dev; zfh1, sgl, and ham) and housekeeping (hk; ncm and ssp3) enhancers. The depicted locus covers developmental and housekeeping genes. Pointed (pnt) codes for a transcription factor, whereas ATP synthase, coupling factor 6 (ATPsyn-Cf6), and Secretory 13 (sec13) code for components of ATP synthase and nuclear pore complex, respectively. (b) Scatterplots depicting the similarity of STAP-seq screens with two developmental (left) and two housekeeping (right) enhancers, respectively, and the dissimilarity between developmental- and housekeeping-enhancer screens (middle). (c) Bi-clustered heatmap depicting pairwise similarities. Pearson correlation coefficients (PCCs) of STAP-seq tag counts for three developmental and two housekeeping enhancers.
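The pairwise similarities in panel c can be illustrated with a minimal sketch that computes Pearson correlation coefficients between per-TSS tag-count vectors. The function names (`pearson`, `pcc_matrix`), the screen labels, and the toy counts are hypothetical, and the log2(count + 1) transform is an assumption made here for illustration; the caption does not state whether counts were transformed before correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(vx * vy)

def pcc_matrix(screens, log_transform=True):
    """Pairwise PCCs; `screens` maps a screen name to tag counts per TSS.

    The log2(count + 1) transform is an assumption of this sketch.
    """
    names = sorted(screens)
    values = {
        n: [math.log2(c + 1) if log_transform else float(c) for c in screens[n]]
        for n in names
    }
    return {(a, b): pearson(values[a], values[b]) for a in names for b in names}

# Hypothetical toy tag counts at five TSSs (not data from the paper).
toy_screens = {
    "dev_A": [120, 3, 45, 0, 800],
    "dev_B": [100, 5, 50, 1, 750],
    "hk_A": [2, 300, 4, 500, 10],
}
matrix = pcc_matrix(toy_screens)
```

In this toy input the two "dev" vectors correlate strongly with each other and negatively with the "hk" vector, mirroring the kind of grouping a bi-clustered heatmap would make visible.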

Figure 3. Induced activities are consistent across cell types.

(a) Scatterplot depicting STAP-seq tag counts for STAP-seq^zfh1 in S2 cells (x-axis) versus STAP-seq^tj in OSCs (y-axis) and their similarity (PCC). TSSs that are endogenously active exclusively in S2 cells or in OSCs, as measured by GRO-seq, are labeled blue or red, respectively (see also Supplementary Fig. 4). (b,c) Scatterplots comparing S2 cells and OSCs at aTSSs by STAP-seq (b) and GRO-seq (c). (d) Venn diagram depicting the overlap of aTSSs detected by STAP-seq^zfh1 and scRNA-seq in S2 cells for genomic regions covered in the focused STAP-seq screens. (e) Breakdown of aTSSs detected by scRNA-seq: 45.6% are also detected by STAP-seq^zfh1 and essentially all other aTSSs are detected by the focused STAP-seq screens with housekeeping enhancers. Only 13 aTSSs (3.6%) are not found by developmental and housekeeping STAP-seq screens combined. (f) Core promoter motif-enrichment analyses of aTSSs uniquely detected by either STAP-seq^zfh1 or scRNA-seq. NS, not significant.

Figure 4. Wide range of enhancer responsiveness and associated biological functions.

(a–c) Scatterplots showing the range of enhancer responsiveness at corrected aTSSs that contain exclusively TATA box, Inr, MTE, or DPE (a), random positions (b), and eTSSs (c), depicting replicate 1 versus replicate 2 in a and b, and sense versus antisense signals of replicate 1 in c. (d) Enhancer responsiveness according to STAP-seq versus luciferase induction by the zfh1 enhancer. Error bars, s.d.; n = 3. (e) Boxplot showing enhancer responsiveness for aTSSs of genes that are surrounded by 1 or 2 versus 5 or more enhancers (n = 1,325 and 139, respectively; Wilcoxon P value). Center line: median; limits: interquartile range; whiskers: 10th and 90th percentiles. (f) Heatmaps depicting enrichments for the most differentially enriched Gene Ontology (GO) categories and for defined sets of transcription factors among the 400 genes associated with the strongest or weakest eTSSs that contain exclusively TATA box, Inr, MTE, or DPE. (g) Berkeley Drosophila Genome Project (BDGP) in situ embryo images for genes representing the GO categories most strongly enriched near weak eTSSs.

Figure 5. Candidate sequences are predictive of responsiveness to developmental enhancers.

(a) Sequence logos summarizing position-specific nucleotide frequencies for eTSSs (bins of 2,000 sequences) ranked by decreasing enhancer responsiveness. (b,c) Position weight matrix (PWM) match scores for TATA box, Inr, and DPE motifs at eTSSs ranked by enhancer responsiveness (b), and aggregate quality scores of all three motifs from b (c). Center line: median; limits: interquartile range; whiskers: 5th and 95th percentiles. (d) Scatterplots of experimentally determined and predicted enhancer responsiveness for eTSSs, subthreshold positions, and random positions. Also included is predicted enhancer responsiveness after randomizing the assignment between the sequences and responsiveness (gray). MSE: mean squared error. (e) Five most predictive 5mers, their positions (bin out of 7 bins along the sequence), and weights, as well as the most similar known core promoter motifs.
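To make the "5mer, bin, weight" description in panel e concrete, the following is a minimal sketch of position-binned k-mer features combined in a weighted sum. It is not the authors' model: the function names (`binned_kmer_counts`, `linear_score`), the rule of assigning a k-mer to a bin by its start position, the toy sequence, and the weights are all hypothetical, and a linear combination is assumed here only because the caption mentions per-feature weights.

```python
from collections import Counter

def binned_kmer_counts(seq, k=5, n_bins=7):
    """Count k-mers per positional bin along the sequence.

    Assumption of this sketch: a k-mer belongs to the bin that
    contains its start position.
    """
    seq = seq.upper()
    n_starts = len(seq) - k + 1
    if n_starts < n_bins:
        raise ValueError("sequence too short for the requested bins")
    counts = Counter()
    for start in range(n_starts):
        kmer = seq[start:start + k]
        if set(kmer) <= set("ACGT"):  # skip k-mers with ambiguous bases
            bin_index = start * n_bins // n_starts  # 0 .. n_bins - 1
            counts[(kmer, bin_index)] += 1
    return counts

def linear_score(counts, weights, intercept=0.0):
    """Weighted sum over (k-mer, bin) features; absent features count as 0."""
    return intercept + sum(w * counts.get(f, 0) for f, w in weights.items())

# Hypothetical 35-nt toy sequence and weights (not values from the paper).
toy_seq = "TATAAA" + "G" * 29
toy_counts = binned_kmer_counts(toy_seq)
toy_weights = {("TATAA", 0): 2.0, ("GGGGG", 6): -0.25}
toy_score = linear_score(toy_counts, toy_weights)
```

Keeping the bin in the feature key lets the same 5mer carry different weights at different positions, which is the kind of position dependence the caption describes.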

Figure 6. Positions of endogenous transcription initiation in developmental enhancers and upstream of aTSSs have weak sequence-intrinsic enhancer responsiveness.

(a) Boxplot depicting enhancer responsiveness of positions that initiate transcription in S2 cells (≥5 scRNA-seq tags; leftmost four boxes) or are randomly selected from the D. melanogaster genome (rightmost box, ‘Random genomic positions’). ‘Corrected aTSSs, containing TATA box, Inr, MTE or DPE’, are position-corrected according to scRNA-seq as in Fig. 4a and Supplementary Fig. 6b. For ‘Distal enhancers’, we used STARR-seq enhancers that are more than 500 bp away from the nearest aTSS and for each enhancer considered the position with the highest scRNA-seq signal within ±250 bp around the STARR-seq peak summit on either strand (disregarding enhancers for which this signal was below 5 tags). For ‘Upstream antisense TSSs’, we considered the position with the highest scRNA-seq signal upstream and antisense of aTSSs until the 3′ end or, for divergent gene pairs, until 500 bp upstream of the 5′ end (aTSS) of the next gene. ‘Random scRNA-seq positions’ are aTSS- and enhancer-distal and not closely spaced with respect to each other. Also shown are P values from one-sided Wilcoxon rank-sum tests between the categories. Center line: median; limits: interquartile range; whiskers: 5th and 95th percentiles. (b) UCSC genome browser screenshots exemplifying representative loci of endogenous transcription initiation within enhancers as measured by scRNA-seq that have only weak STAP-seq signals.
