ChEA3: transcription factor enrichment analysis by orthogonal omics integration
Alexandra B Keenan et al. Nucleic Acids Res. 2019.
Abstract
Identifying the transcription factors (TFs) responsible for observed changes in gene expression is an important step in understanding gene regulatory networks. ChIP-X Enrichment Analysis 3 (ChEA3) is a transcription factor enrichment analysis tool that ranks TFs associated with user-submitted gene sets. The ChEA3 background database contains a collection of gene set libraries generated from multiple sources including TF-gene co-expression from RNA-seq studies, TF-target associations from ChIP-seq experiments, and TF-gene co-occurrence computed from crowd-submitted gene lists. Enrichment results from these distinct sources are integrated to generate a composite rank that improves the prediction of the correct upstream TF compared to ranks produced by individual libraries. We compare ChEA3 with existing TF prediction tools and show that ChEA3 performs better. By integrating the ChEA3 libraries, we illuminate general transcription factor properties such as whether the TF behaves as an activator or a repressor. The ChEA3 web-server is available from https://amp.pharm.mssm.edu/ChEA3.
© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
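The composite rank described in the abstract can be illustrated with a small sketch. The abstract does not define the integration methods, so the definitions here are assumptions: "mean rank" averages a TF's integer rank over the libraries that contain it, and "top rank" takes the TF's best rank after scaling by library size. The function names and the toy library data are hypothetical.

```python
from statistics import mean


def integrate_ranks(library_ranks):
    """Combine per-library TF ranks into two composite scores.

    library_ranks maps library name -> {TF: rank}, where rank 1 is best.
    Returns (mean_rank, top_rank); lower is better in both.
    Assumed definitions, not taken from the abstract.
    """
    raw = {}     # TF -> integer ranks, one per library covering the TF
    scaled = {}  # TF -> ranks divided by the number of TFs in that library
    for ranks in library_ranks.values():
        n_tfs = len(ranks)
        for tf, rank in ranks.items():
            raw.setdefault(tf, []).append(rank)
            scaled.setdefault(tf, []).append(rank / n_tfs)
    mean_rank = {tf: mean(values) for tf, values in raw.items()}
    top_rank = {tf: min(values) for tf, values in scaled.items()}
    return mean_rank, top_rank


def order(scores):
    """Sort TFs from best to worst score, breaking ties by name."""
    return sorted(scores, key=lambda tf: (scores[tf], tf))


# Toy input: two hypothetical libraries with different TF coverage.
example = {
    "lib_a": {"TF1": 1, "TF2": 2, "TF3": 3},
    "lib_b": {"TF2": 1, "TF1": 2},
}
mean_rank, top_rank = integrate_ranks(example)
print(order(mean_rank))
print(order(top_rank))
```

Averaging only over libraries that cover a TF avoids penalizing TFs that are absent from some libraries, at the cost of treating a rank from a small library the same as one from a large library.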
Figures
Figure 1.
Performance of the ChEA3 libraries and integration techniques in recovering the perturbed TFs from 946 TF loss-of-function (LOF) and gain-of-function (GOF) experiments from the TFpertGEOupdn benchmark dataset. (A) Mean ROC AUC and mean PR AUC over 5000 bootstrapped ROC and PR curves; (B) composite ROC curves generated from 5000 bootstrapped curves; (C) composite PR curves generated from 5000 bootstrapped curves; (D) the deviation of the cumulative distribution from uniform of the scaled rankings of each perturbed TF in the benchmarking dataset. Anderson-Darling test of uniformity: MeanRank P = 6.34 × 10^-7; TopRank P = 6.34 × 10^-7; ARCHS4 P = 6.34 × 10^-7; ENCODE P = 2.06 × 10^-6; Enrichr Queries P = 6.83 × 10^-7; GTEx P = 6.45 × 10^-7; Literature ChIP-seq P = 1.28 × 10^-6; ReMap P = 1.02 × 10^-6.
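The quantity plotted in panel (D) can be sketched as the empirical cumulative distribution of the scaled ranks minus the uniform CDF. This is a minimal illustration that assumes scaled ranks lie in [0, 1] with small values meaning the perturbed TF ranked near the top. The function name and grid size are hypothetical, and the Anderson-Darling test itself is not implemented here.

```python
from bisect import bisect_right


def ecdf_deviation(scaled_ranks, grid_size=101):
    """Return (xs, deviations), where deviation = ECDF(x) - x on an even grid.

    Under uniformly distributed ranks (no predictive signal) the deviation
    stays near zero; a positive bulge at small x means the perturbed TFs
    are ranked near the top more often than expected by chance.
    """
    ordered = sorted(scaled_ranks)
    n = len(ordered)
    xs = [i / (grid_size - 1) for i in range(grid_size)]
    deviations = []
    for x in xs:
        # Number of observations <= x, divided by n, is the ECDF at x.
        ecdf = bisect_right(ordered, x) / n
        deviations.append(ecdf - x)
    return xs, deviations


# Toy input: four hypothetical scaled ranks, evaluated on a 3-point grid.
xs, deviations = ecdf_deviation([0.0, 0.0, 0.5, 1.0], grid_size=3)
print(list(zip(xs, deviations)))
```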
Figure 2.
Fraction of the TFpertGEOupdn benchmarking dataset subset recovered in the top one percentile of rankings compared to the library TF coverage. (A) A heatmap visualizing transcription factor coverage for the ChEA3 libraries. (B) The fraction of the TFpertGEOupdn subset TFs recovered in the top percentile of ranks for each ChEA3 library. Only the TFpertGEOupdn gene sets where the perturbed TF was covered by the library were considered when computing the ‘Percent Subset Recovered’.
Figure 3.
Effect of input type on ChEA3 performance. The deviation of the cumulative distribution from uniform of the scaled rankings of perturbed TFs in the benchmarking dataset for: (A) TF overexpression or chemical activation experiments from TFpertGEOup; (B) TF overexpression or chemical activation experiments from TFpertGEOdn; (C) TF knockdown, knockout or chemical inactivation experiments from TFpertGEOup; and (D) TF knockdown, knockout or chemical inactivation experiments from TFpertGEOdn.
Figure 4.
Comparison of available TF prediction tools with ChEA3 using the hsTFpertGEO benchmarking dataset. (A) Composite ROC curves generated from 5000 bootstrapped curves; (B) composite PR curves generated from 5000 bootstrapped curves; (C) the deviation of the cumulative distribution from uniform of the scaled rankings of each perturbed TF in the benchmarking dataset; Anderson–Darling test of uniformity: VIPER GTEx Regulon P = 1.39 × 10^-6, MAGICACT P = 6.58 × 10^-5, TFEA.ChIP P = 2.47 × 10^-6, BART P = 2.34 × 10^-6, DoRothEA Regulon A P = 2.39 × 10^-6, DoRothEA Regulon B P = 2.22 × 10^-6, DoRothEA Regulon C P = 1.92 × 10^-6, DoRothEA Regulon D P = 1.71 × 10^-6, DoRothEA Regulon E P = 1.46 × 10^-6, DoRothEA Regulon TOP10score P = 1.46 × 10^-6; (D) mean ROC AUC and mean PR AUC over 5000 bootstrapped ROC and PR curves for available TF prediction tools as compared with ChEA3 benchmarked with hsTFpertGEO.
Figure 5.
Comparison of available TF prediction tools with ChEA3. (A) The percent of the perturbed TFs recovered by the tool in the top one percentile of ranks as compared to TF coverage of the tool. For the ‘Percent Subset Recovered’ metric, we consider only the subset of the hsTFpertGEO TF perturbation experiments where the TF is covered by the tool. (B) The percent of the perturbed TFs recovered by the tool in the top one percentile of ranks as compared to TF coverage of the tool. For the ‘Percent Total Recovered’ metric, we consider all 443 TF perturbation experiments in the hsTFpertGEO benchmarking datasets. (C) Mean AUROC over 5000 bootstrapped curves compared to tool TF coverage. (D) Mean AUPR over 5000 bootstrapped curves compared to tool TF coverage.
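The two recovery metrics in this caption differ only in their denominator, which a short sketch makes concrete. It assumes each experiment is summarized by the scaled rank of its perturbed TF, with None standing in for a TF that the tool does not cover. The function name and that encoding are hypothetical.

```python
def recovery_metrics(scaled_ranks, top_fraction=0.01):
    """Return (percent_subset_recovered, percent_total_recovered).

    scaled_ranks holds one entry per benchmark experiment: the scaled rank
    (0 to 1, small is better) of the perturbed TF, or None when the tool
    does not cover that TF.
    """
    covered = [rank for rank in scaled_ranks if rank is not None]
    hits = sum(1 for rank in covered if rank <= top_fraction)
    # Subset metric: only experiments whose TF is covered by the tool.
    subset = 100.0 * hits / len(covered) if covered else None
    # Total metric: every experiment counts, so low coverage lowers the score.
    total = 100.0 * hits / len(scaled_ranks) if scaled_ranks else None
    return subset, total


# Toy input: four hypothetical experiments, one with an uncovered TF.
subset, total = recovery_metrics([0.005, 0.5, None, 0.01])
print(subset, total)
```

A tool with narrow TF coverage can score well on the subset metric while scoring poorly on the total metric, which is why the caption reports both against coverage.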
Figure 6.
Scatterplots showing activating/repressing activity across TFs. Significant ORs (P < 0.05) are plotted. For uniformity, when examining loss-of-function TF perturbations, we consider -log(OR), as this will be positive if the TF acts as an activator of its targets and negative if it acts as a repressor. Conversely, we consider log(OR) for gain-of-function perturbations, which will be positive if the TF is an activator and negative if the TF acts as a repressor. Red arrows indicate TFs discussed in the results. (A) ORs from gain-of-function TF perturbations; (B) ORs from loss-of-function TF perturbations; (C) TF–target interactions from the TRRUST v2 database. For each TF, the percent of activating TF–target interactions (red) or repressive TF–target interactions (blue) from the subset of TF–target interactions in TRRUST v2 for which directionality is available.
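The sign convention in this caption can be written as a small function. The caption does not say how each odds ratio is computed, so the sketch takes the OR as given. The natural logarithm is an assumption (the sign does not depend on the base), and the function names and "GOF"/"LOF" labels are hypothetical.

```python
import math


def activity_score(odds_ratio, perturbation):
    """Signed log odds ratio: positive suggests activator, negative repressor.

    perturbation is "GOF" (gain-of-function) or "LOF" (loss-of-function).
    The sign is flipped for LOF so both experiment types share one scale.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    if perturbation == "GOF":
        return math.log(odds_ratio)
    if perturbation == "LOF":
        return -math.log(odds_ratio)
    raise ValueError("perturbation must be 'GOF' or 'LOF'")


def label(score):
    """Map a signed score to the role it suggests for the TF."""
    if score > 0:
        return "activator"
    if score < 0:
        return "repressor"
    return "undetermined"


# The same OR of 2.0 points in opposite directions for the two experiment types.
print(label(activity_score(2.0, "GOF")))
print(label(activity_score(2.0, "LOF")))
```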