Identification of two types of GGAA-microsatellites and their roles in EWS/FLI binding and gene regulation in Ewing sarcoma

Kirsten M Johnson et al. PLoS One. 2017.

Abstract

Ewing sarcoma is a bone malignancy of children and young adults, frequently harboring the EWS/FLI chromosomal translocation. The resulting fusion protein is an aberrant transcription factor that uses highly repetitive GGAA-containing elements (microsatellites) to activate and repress thousands of target genes mediating oncogenesis. However, the mechanisms by which EWS/FLI interacts with microsatellites and regulates target gene expression are not clearly understood. Here, we profile genome-wide protein binding and gene expression. Using a combination of unbiased genome-wide computational and experimental analysis, we define GGAA-microsatellites in a Ewing sarcoma context. We identify two distinct classes of GGAA-microsatellites and demonstrate that EWS/FLI responsiveness is dependent on microsatellite length. At close-range "promoter-like" microsatellites, EWS/FLI binding and subsequent target gene activation are highly dependent on the number of GGAA-motifs. "Enhancer-like" microsatellites demonstrate length-dependent EWS/FLI binding, but minimal correlation for activated targets and none for repressed targets. Our data suggest that EWS/FLI binds to "promoter-like" and "enhancer-like" microsatellites to mediate activation and repression of target genes through different regulatory mechanisms. Such characterization contributes valuable insight to EWS/FLI transcription factor biology and clarifies the role of GGAA-microsatellites on a global genomic scale. This may provide a unique perspective on the role of non-coding DNA in cancer susceptibility and therapeutic development.

Conflict of interest statement

Competing Interests: SLL declares a conflict of interest as a member of the advisory board for Salarius Pharmaceuticals. SLL is also a listed inventor on United States Patent No. US 7,939,253 B2, “Methods and compositions for the diagnosis and treatment of Ewing’s Sarcoma,” and United States Patent No. US 8,557,532, “Diagnosis and treatment of drug-resistant Ewing’s sarcoma.” This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Schema and characteristics of repeat regions across genome.

(A) Schema of repeat regions. Regions with only one type of motif are called pure repeat regions, while those with both GGAA and TTCC are called mixed repeat regions. Each repeat region (purple box) is separated from the next by at least 20 bp of consecutive non-motif sequence. (B) Histogram of maximum number of consecutive motifs. (C) Histogram of total number of motifs. (D) Histogram of motif density of repeat regions. Density = (total number of motifs × 4 / length of region) × 100%. Bin width is 5%. (E) Histogram of length of repeat regions. Each bin is 100 bp wide (e.g., the first bin covers lengths of 0-100 bp). Bins with zero repeat regions are not shown. (F) The characteristics of pure and mixed repeat regions across the genome. Red line indicates the mean for each characteristic.
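The definitions in panels A to D can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the non-overlapping regex scan, and the rule that "consecutive" means back-to-back copies of the same motif are all assumptions made for illustration. Only the 20-bp separation rule, the pure/mixed distinction, and the density formula come from the caption.

```python
import re

MOTIF = re.compile(r"GGAA|TTCC")
MIN_GAP = 20  # bp of non-motif sequence separating two regions (from the caption)


def summarize(group):
    """Summarize one repeat region given its (position, motif) hits."""
    start, end = group[0][0], group[-1][0] + 4
    # Longest run of back-to-back copies of the same motif (assumed definition).
    longest = run = 1
    for (p0, m0), (p1, m1) in zip(group, group[1:]):
        run = run + 1 if (p1 == p0 + 4 and m1 == m0) else 1
        longest = max(longest, run)
    total = len(group)
    return {
        "start": start,
        "end": end,
        "length": end - start,
        "total_motifs": total,
        "max_consecutive": longest,
        # Density = (total motifs x 4 / region length) x 100%
        "density_pct": 100.0 * total * 4 / (end - start),
        "kind": "pure" if len({m for _, m in group}) == 1 else "mixed",
    }


def find_repeat_regions(seq):
    """Group GGAA/TTCC motifs into repeat regions and summarize each one."""
    hits = [(m.start(), m.group()) for m in MOTIF.finditer(seq.upper())]
    groups = []
    for pos, motif in hits:
        # Extend the current region while the gap to the previous motif is < MIN_GAP.
        if groups and pos - (groups[-1][-1][0] + 4) < MIN_GAP:
            groups[-1].append((pos, motif))
        else:
            groups.append([(pos, motif)])
    return [summarize(g) for g in groups]
```

For a toy sequence such as `"A"*5 + "GGAA"*3 + "AC" + "GGAA" + "A"*25 + "TTCC"*2 + "GGAA"`, this sketch reports one pure region followed by one mixed region, because the 25-bp stretch of non-motif sequence exceeds the 20-bp threshold.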

Fig 2. Nearest gene schema and genomic location of repeat regions.

(A) Schema showing the nearest gene (orange), which is the gene with the shortest distance calculated from its TSS to the middle of the repeat region. (B) Distribution of distances to nearest genes for each repeat region grouped by number of consecutive motifs. The percentages within each consecutive-motif category sum to 100%. (C) Comparisons of distance-to-nearest-gene for longer consecutive motifs to repeat regions with one to two consecutive motifs (i.e. ‘1–2’). * indicates the repeat regions are significantly closer to a gene than repeat regions with 1–2 consecutive motifs (p < 0.05). Red line represents the median distance-to-nearest-gene for repeat regions with 1–2 consecutive motifs. (D) Feature distribution for each consecutive motif category. (E) Proportions of repeat regions in each chromosome grouped by the number of consecutive motifs.
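The nearest-gene rule in panel A (shortest distance from a gene's TSS to the middle of the repeat region) can be sketched as follows. The function name, the `(name, tss)` tuple layout, and the example gene names are hypothetical, and the sketch assumes all coordinates lie on the same chromosome.

```python
def nearest_gene(region_start, region_end, genes):
    """Return (gene_name, distance) for the gene whose TSS is closest
    to the midpoint of the repeat region.

    genes: iterable of (name, tss) pairs on the same chromosome (assumed layout).
    """
    midpoint = (region_start + region_end) / 2
    name, tss = min(genes, key=lambda g: abs(g[1] - midpoint))
    return name, abs(tss - midpoint)


# Hypothetical example: a repeat region at 4000-4100 and two candidate genes.
example_genes = [("GENE_A", 1000), ("GENE_B", 5000)]
closest = nearest_gene(4000, 4100, example_genes)
```

A linear scan is enough for illustration; at genome scale a sorted TSS list with binary search would be the more usual choice.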

Fig 3. Characteristics of EWS/FLI-bound microsatellites.

(A) Permutation test shows that the number of EWS/FLI binding sites that overlap with repeat regions (n = 8,256) with a minimum of 3 consecutive motifs is significantly higher than expected by chance (p < 0.001). Red line denotes the significance limit (α = 0.05). Gray bars represent the number of overlaps in the random regions with EWS/FLI binding sites in 1,000 permutations. The black line represents the mean of overlaps in random regions (EVperm) and the green bar is the actual number of overlaps observed in repeat regions (Obs). (B) Boxplot of EWS/FLI fold-enrichment (relative to genomic background) and number of consecutive motifs in EWS/FLI-bound microsatellites showing a statistically significant increasing trend (p < 2.2 × 10^−16). The blue line is the estimated LOESS regression line of the mean with the estimated 95% confidence bands (shaded region). (C) Boxplot of EWS/FLI fold-enrichment and total number of motifs in EWS/FLI-bound microsatellites showing a positive correlation (p = 1.9 × 10^−10) and a non-linear trend (p < 0.05). The blue line is the estimated LOESS regression line of the mean with the estimated 95% confidence bands (shaded region). (D) Boxplot of EWS/FLI fold-enrichment and density, where Density = (total motifs × 4 / length of microsatellite) × 100%, showing a statistically significant positive correlation (p < 2.2 × 10^−16). The blue line is the estimated LOESS regression line of the mean with the estimated 95% confidence bands (shaded region).
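The logic of the permutation test in panel A can be sketched in Python. The caption does not describe how random regions were generated, so the approach here (re-placing each region uniformly at random on a single linear genome, keeping its length) is an assumption, as are the function names and the add-one p-value estimate.

```python
import random


def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def count_overlaps(regions, peaks):
    """Number of regions that overlap at least one binding site."""
    return sum(any(overlaps(r, p) for p in peaks) for r in regions)


def permutation_test(regions, peaks, genome_length, n_perm=1000, seed=0):
    """Compare the observed overlap count with counts from randomly placed regions."""
    rng = random.Random(seed)
    observed = count_overlaps(regions, peaks)
    at_least = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in regions:
            length = end - start
            # Re-place the region uniformly at random, preserving its length.
            new_start = rng.randrange(0, genome_length - length + 1)
            shuffled.append((new_start, new_start + length))
        if count_overlaps(shuffled, peaks) >= observed:
            at_least += 1
    # Add-one estimate so the p-value is never exactly zero.
    p_value = (at_least + 1) / (n_perm + 1)
    return observed, p_value
```

With 1,000 permutations the smallest p-value this estimate can return is 1/1001, which is consistent with reporting a result as p < 0.001.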

Fig 4. Correlation between EWS/FLI-bound microsatellites, GGAA-motif and gene expression.

(A) Scatter plot of expression of activated genes and EWS/FLI fold-enrichment at promoter-like microsatellites showing a positive correlation (r = 0.46, p = 3.35 × 10^−7). (B) Boxplot of EWS/FLI fold-enrichment and number of consecutive motifs at EWS/FLI-bound promoter-like microsatellites for activated genes showing a non-linear trend. Blue line is the estimated LOESS regression line of the mean with the estimated 95% confidence interval (shaded region). Overall, there is a statistically significant positive correlation (r = 0.43, p = 1.5 × 10^−6). (C) Boxplot of EWS/FLI-activated gene expression and number of consecutive motifs at promoter-like EWS/FLI-bound microsatellites for gene activation showing a non-linear trend, as seen in EWS/FLI binding intensities, and a statistically significant positive correlation (r = 0.23, p = 0.01). The blue line is the estimated LOESS regression line of the mean with the estimated 95% confidence bands (shaded region). (D) Scatter plot of expression of activated genes and EWS/FLI fold-enrichment at enhancer-like microsatellites showing a positive correlation (r = 0.15, p = 3.5 × 10^−4). (E) Boxplot of EWS/FLI fold-enrichment and number of consecutive motifs at EWS/FLI-bound enhancer-like microsatellites showing a positive correlation (r = 0.53, p = 2.2 × 10^−16). Blue line is the estimated LOESS regression line of the mean, with the standard error of the prediction shown as the shaded region. (F) Boxplot of EWS/FLI fold-enrichment and number of consecutive motifs at EWS/FLI-bound enhancer-like microsatellites associated with gene repression showing a positive correlation (r = 0.40, p < 2.2 × 10^−16). The blue line is the estimated LOESS regression line of the mean with the estimated 95% confidence bands (shaded region).

Fig 5. Schema of correlative associations between GGAA motifs in EWS/FLI-bound microsatellites for gene activation and repression.

Schematic illustrating EWS/FLI responsiveness at given loci across the genome. (A) Promoter-like (close-range) GGAA-microsatellites positively correlate with EWS/FLI binding and activation of genes in a length-dependent manner. (B) Enhancer-like (long-range) GGAA-microsatellites positively correlate with EWS/FLI binding, but the correlation with transcriptional regulation is only minimal for activated genes. (C) Promoter-like GGAA-microsatellites display no correlation with EWS/FLI binding and transcriptional repression. (D) Enhancer-like GGAA-microsatellites positively correlate with EWS/FLI binding; however, they do not confer gene expression.
