BART: a transcription factor prediction tool with query gene sets or epigenomic profiles

Zhenjia Wang et al. Bioinformatics. 2018.

Abstract

Summary: Identifying the functional transcription factors that regulate a given gene set is an important problem in gene regulation studies. Conventional approaches for identifying transcription factors, such as DNA sequence motif analysis, cannot predict functional binding of specific factors and are not sensitive enough to detect factors binding at distal enhancers. Here, we present binding analysis for regulation of transcription (BART), a novel computational method and software package for predicting functional transcription factors that regulate a query gene set or associate with a query genomic profile, based on more than 6000 existing ChIP-seq datasets for over 400 factors in human or mouse. This method demonstrates the advantage of utilizing publicly available data for functional genomics research.

Availability and implementation: BART is implemented in Python and available at http://faculty.virginia.edu/zanglab/bart.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.

BART workflow. (A) A cis-regulatory profile is generated from the query gene set by MARGE or from a ChIP-seq dataset by genomic mapping. Yellow bars indicate union DNaseI hypersensitive sites (UDHS). (B) Each transcription factor binding profile from a ChIP-seq dataset is converted to a binary string showing presence or absence at each UDHS. (C) Top: Each ROC curve represents the performance of the query cis-regulatory profile from A in predicting a transcription factor profile from B; Bottom: The area under the ROC curve (AUC) is calculated for all datasets. (D) AUC values are grouped by factor, and a Wilcoxon test is performed for each factor compared with all datasets as background. In this example, cumulative distributions show significantly higher AUC for TF_a (red). (E) The Wilcoxon test statistic is calculated for each transcription factor from each dataset in the background for Z-score calculation. (F) BART outputs a ranked list of all transcription factors.
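
Steps B to D of the workflow can be illustrated with a minimal sketch. This is not BART's implementation: the function names (`average_ranks`, `auc`, `rank_sum_test`, `rank_factors`), the simulated binding data, and the factor labels other than TF_a are hypothetical and chosen only for illustration. The sketch assumes the Wilcoxon test in the caption is a one-sided rank-sum test, applies a normal approximation without tie correction, and takes the AUC values of all datasets as the background. The Z-score step (E) is omitted and factors are simply ordered by p-value.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def auc(scores, labels):
    """Area under the ROC curve, computed from the rank sum of bound sites."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both bound and unbound sites")
    ranks = average_ranks(scores)
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def rank_sum_test(sample, background):
    """One-sided rank-sum test; small p means sample tends to exceed background."""
    n1, n2 = len(sample), len(background)
    ranks = average_ranks(list(sample) + list(background))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return z, 0.5 * math.erfc(z / math.sqrt(2.0))


def rank_factors(query_scores, datasets):
    """Score every dataset by AUC, then test each factor against all datasets."""
    aucs = [(tf, auc(query_scores, binding)) for tf, binding in datasets]
    background = [a for _, a in aucs]
    results = []
    for tf in sorted({name for name, _ in aucs}):
        sample = [a for name, a in aucs if name == tf]
        z, p = rank_sum_test(sample, background)
        results.append((tf, z, p))
    return sorted(results, key=lambda r: r[2])


# Simulated data (assumption): one query score per UDHS, and binary binding
# profiles whose binding probability may depend on that score.
random.seed(0)
query = [random.random() for _ in range(2000)]


def simulate_binding(strength):
    return [1 if random.random() < 0.1 + strength * q else 0 for q in query]


datasets = (
    [("TF_a", simulate_binding(0.4)) for _ in range(6)]    # tracks the query
    + [("TF_b", simulate_binding(0.0)) for _ in range(6)]  # unrelated
    + [("TF_c", simulate_binding(0.0)) for _ in range(6)]  # unrelated
)

ranking = rank_factors(query, datasets)
for tf, z, p in ranking:
    print(f"{tf}\tz={z:.2f}\tp={p:.4f}")
```

In this toy setting the factor whose simulated binding follows the query profile should rank first. With real data, each binary profile would come from a ChIP-seq dataset rather than from a random draw.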
