A comparative analysis of genome-wide chromatin immunoprecipitation data for mammalian transcription factors
Comparative Study
Hongkai Ji et al. Nucleic Acids Res. 2006.
Abstract
Genome-wide location analysis (ChIP-chip, ChIP-PET) is a powerful technique to study mammalian transcriptional regulation. In order to obtain a basic understanding of the location data generated for mammalian transcription factors and potential issues in their analysis, we conducted a comparative study of eight independent ChIP experiments involving six different transcription factors in human and mouse. Our cross-study comparisons, to the best of our knowledge the first to analyze multiple datasets, revealed the importance of carefully chosen genomic controls in the de novo identification of key transcription factor binding motifs, raised issues about the interpretation of ubiquitously occurring sequence motifs, and demonstrated the clustering tendency of protein-binding regions for certain transcription factors.
Figures
Figure 1
Comparisons of de novo motif discovery from multiple ChIP studies. Eight experiments were examined here, including (A) Gli-chip, (B) ER-chip, (C) p53-PET, (D) Oct4-chip, (E) Sox2-chip, (F) Nanog-chip, (G) Oct4-PET and (H) Nanog-PET. For each factor, representative motifs recovered by de novo discovery are shown. The motif score reported by the Gibbs motif sampler for each motif is shown in parentheses. The complete lists of motifs reported by the Gibbs motif sampler are presented in Supplementary Figures S1–S8. For each reported motif, the relative enrichment levels _r_1, _r_2 and _r_3 were computed by comparing TFBS occurrence rates in ChIP regions to their counterparts in matched genomic controls (labeled ‘Matched’) or random genomic controls (labeled ‘Random’). The enrichment levels of all discovered motifs are compared here, and the data used to generate the figures are listed in Supplementary Tables S1–S8. The motifs responsible for the sequence-specific protein binding are underlined or indicated by arrows. For Nanog-chip and Nanog-PET, the motif responsible for the binding was unknown. In these two cases, the Oct-Sox composite motif is highlighted, and the relative enrichment levels of the previously proposed Nanog motif (4) (labeled ‘Nanog’) are also shown. For Gli-chip, ER-chip, Oct4-chip, Sox2-chip and Nanog-chip, enrichment levels relative to design-based controls were also computed and are shown in Supplementary Figure S9.
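The comparison of TFBS occurrence rates between ChIP regions and control regions can be illustrated with a minimal sketch. It is not the authors' procedure: the exact definitions of _r_1, _r_2 and _r_3 are not given in this record, so the code computes a single generic ratio, and it assumes that a motif is matched as an exact consensus string on either strand. The function names and the toy sequences are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def count_sites(seq, motif):
    """Count positions where the consensus matches on either strand.

    Assumption: exact string matching stands in for a real motif scan.
    """
    seq = seq.upper()
    motif = motif.upper()
    targets = {motif, revcomp(motif)}
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] in targets)


def occurrence_rate(seqs, motif):
    """Sites per base pair across a collection of regions."""
    total_bp = sum(len(s) for s in seqs)
    total_sites = sum(count_sites(s, motif) for s in seqs)
    return total_sites / total_bp


def relative_enrichment(chip_seqs, control_seqs, motif):
    """Ratio of the occurrence rate in ChIP regions to that in controls."""
    control_rate = occurrence_rate(control_seqs, motif)
    if control_rate == 0:
        return float("inf")
    return occurrence_rate(chip_seqs, motif) / control_rate


# Toy data (hypothetical sequences, not taken from the study).
chip = ["ACCCAGTT", "TTCTGGGA"]
control = ["ACCCAGTTAAAAAAAA"]
ratio = relative_enrichment(chip, control, "CCCAG")
```

With the same motif and the same ChIP regions, swapping the control set changes the denominator and therefore the apparent enrichment, which is why the choice between matched and random controls matters.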
Figure 2
Motifs that contain one or more CCCAG components. CCCAG components are highlighted by dashed rectangles. Arrows indicate possible repetitive or palindrome structures. Motifs are indexed by the experiment from which they were recovered (e.g. p53_M8 means that the motif was recovered from p53 data and was the eighth strongest one in that experiment in terms of motif score).
Figure 3
GC-content and cross-species conservation of ChIP-binding regions. (A) GC-content of ChIP-binding regions. Blue bar, genome-wide GC-content; cyan bar, GC-content for ChIP-binding regions; yellow bar, GC-content for genomic regions surveyed by the ChIP experiments; red bar, GC-content for matched genomic controls. The error bar shows three times the standard error of the GC-content estimate. For p53-PET, Oct4-PET and Nanog-PET, the regions surveyed by the ChIP experiments are the whole genome; for ER-chip, they are human chr21 and chr22; for Oct4-chip, Sox2-chip and Nanog-chip, they are promoter regions that span from −8 to +2 kb of TSS; for Gli-chip, they are regions tiled in the custom array. Matched genomic controls used here are the same control regions used for relative enrichment computation. Detailed base occurrence frequencies for A, C, G and T are listed in Supplementary Table S9. (B and C) Cumulative probability function of conservation scores for ChIP regions and genomes. Conservation scores are defined for each base pair and were linearly scaled to the interval [0, 255]. A larger score corresponds to a more conserved status. Human Genome, genome-wide conservation for human; Human Promoter, conservation for human promoter regions spanning from −8 to +2 kb of TSS; Human chr21 and 22, conservation for human chr21 and chr22; Mouse Genome, genome-wide conservation for mouse; Mouse tiled in Gli, conservation for regions surveyed in the Gli study, i.e. regions tiled in the custom array.
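A GC-content estimate with an error bar of three standard errors, as in panel A, might be computed along the following lines. This is a sketch under stated assumptions: the record does not say how the standard error was obtained, so the code assumes a binomial standard error over the counted bases, and it ignores ambiguous bases such as N. The function name `gc_content` and the toy sequences are hypothetical.

```python
import math


def gc_content(seqs):
    """Return (GC fraction, standard error) over a set of regions.

    Assumption: bases are treated as independent draws, so the standard
    error is the binomial sqrt(p * (1 - p) / n). Bases other than
    A, C, G and T are not counted.
    """
    gc = 0
    at = 0
    for s in seqs:
        s = s.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    n = gc + at
    if n == 0:
        raise ValueError("no A/C/G/T bases found")
    p = gc / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se


# Toy regions (hypothetical, not taken from the study).
p, se = gc_content(["GGCC", "AATT"])
error_bar = 3 * se  # half-width of the bar described in the caption
```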
Figure 4
Clustering tendency of binding regions. (A–C) Three examples of binding region clusters. The transcription factor that binds to DNA is shown at the top of each figure. The gene that is bound by the transcription factor is shown at the bottom of the figure. For the two ChIP-chip examples (A and B), binding regions are indicated by high fold enrichment of IP samples versus control samples (i.e. peaks in the figure). For the ChIP-PET example (C), binding regions are indicated by the blocks in the ‘Oct4-PET’ and ‘Nanog-PET’ tracks in the UCSC genome browser. (D–G) Cumulative probability functions (CDF) of the observed and simulated peak-to-peak distances. The simulated distance was taken to represent the random distribution. The observed and simulated distributions were fitted by Gamma and Exponential densities, respectively; the fitted CDFs are also shown.
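The peak-to-peak distance analysis in panels D–G can be sketched as follows. Several choices here are assumptions, not details from this record: distances are taken between adjacent peaks on the same chromosome, the Exponential rate is fitted by maximum likelihood (one over the mean), and the Gamma parameters are fitted by the method of moments. All function names and the toy peak list are hypothetical.

```python
import math
from collections import defaultdict


def peak_distances(peaks):
    """Distances between adjacent peaks on the same chromosome.

    `peaks` is an iterable of (chromosome, position) pairs.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in peaks:
        by_chrom[chrom].append(pos)
    distances = []
    for positions in by_chrom.values():
        positions.sort()
        distances.extend(b - a for a, b in zip(positions, positions[1:]))
    return distances


def fit_exponential(distances):
    """Maximum-likelihood rate of an Exponential density (1 / mean)."""
    return len(distances) / sum(distances)


def exponential_cdf(x, rate):
    """Exponential CDF, zero for non-positive x."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def fit_gamma_moments(distances):
    """Method-of-moments Gamma fit; returns (shape, scale)."""
    n = len(distances)
    mean = sum(distances) / n
    var = sum((d - mean) ** 2 for d in distances) / (n - 1)
    return mean * mean / var, var / mean


# Toy peaks (hypothetical, not taken from the study).
peaks = [("chr1", 100), ("chr1", 400), ("chr1", 150),
         ("chr2", 1000), ("chr2", 3000)]
distances = peak_distances(peaks)
rate = fit_exponential(distances)
shape, scale = fit_gamma_moments(distances)
```

Under this parameterization, a fitted Gamma shape below 1 means short distances are more frequent than an Exponential with the same mean would predict, which is one way to read a clustering tendency.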
Similar articles
- CEAS: cis-regulatory element annotation system.
  Ji X, Li W, Song J, Wei L, Liu XS. Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W551-4. doi: 10.1093/nar/gkl322. PMID: 16845068 Free PMC article.
- De novo prediction of cis-regulatory elements and modules through integrative analysis of a large number of ChIP datasets.
  Niu M, Tabari ES, Su Z. BMC Genomics. 2014 Dec 2;15:1047. doi: 10.1186/1471-2164-15-1047. PMID: 25442502 Free PMC article.
- Ab initio identification of putative human transcription factor binding sites by comparative genomics.
  Corà D, Herrmann C, Dieterich C, Di Cunto F, Provero P, Caselle M. BMC Bioinformatics. 2005 May 2;6:110. doi: 10.1186/1471-2105-6-110. PMID: 15865625 Free PMC article.
- Genomic studies of transcription factor-DNA interactions.
  Sikder D, Kodadek T. Curr Opin Chem Biol. 2005 Feb;9(1):38-45. doi: 10.1016/j.cbpa.2004.12.008. PMID: 15701451 Review.
- Serial analysis of binding elements for transcription factors.
  Chen J. Methods Mol Biol. 2009;567:113-32. doi: 10.1007/978-1-60327-414-2_8. PMID: 19588089 Review.
Cited by
- E2F in vivo binding specificity: comparison of consensus versus nonconsensus binding sites.
  Rabinovich A, Jin VX, Rabinovich R, Xu X, Farnham PJ. Genome Res. 2008 Nov;18(11):1763-77. doi: 10.1101/gr.080622.108. Epub 2008 Oct 3. PMID: 18836037 Free PMC article.
- Parameter estimation for robust HMM analysis of ChIP-chip data.
  Humburg P, Bulger D, Stone G. BMC Bioinformatics. 2008 Aug 18;9:343. doi: 10.1186/1471-2105-9-343. PMID: 18706106 Free PMC article.
- An integrated software system for analyzing ChIP-chip and ChIP-seq data.
  Ji H, Jiang H, Ma W, Johnson DS, Myers RM, Wong WH. Nat Biotechnol. 2008 Nov;26(11):1293-300. doi: 10.1038/nbt.1505. Epub 2008 Nov 2. PMID: 18978777 Free PMC article.
- A genome-scale analysis of the cis-regulatory circuitry underlying sonic hedgehog-mediated patterning of the mammalian limb.
  Vokes SA, Ji H, Wong WH, McMahon AP. Genes Dev. 2008 Oct 1;22(19):2651-63. doi: 10.1101/gad.1693008. PMID: 18832070 Free PMC article.
- Defense against territorial intrusion is associated with DNA methylation changes in the honey bee brain.
  Herb BR, Shook MS, Fields CJ, Robinson GE. BMC Genomics. 2018 Mar 26;19(1):216. doi: 10.1186/s12864-018-4594-0. PMID: 29580210 Free PMC article.
References
- Carroll J.S., Liu X.S., Brodsky A.S., Li W., Meyer C.A., Szary A.J., Eeckhoute J., Shao W., Hestermann E.V., Geistlinger T.R., et al. Chromosome-wide mapping of estrogen receptor binding reveals long-range regulation requiring the forkhead protein FoxA1. Cell. 2005;122:33–43.
- Wei C.L., Wu Q., Vega V.B., Chiu K.P., Ng P., Zhang T., Shahab A., Yong H.C., Fu Y., Weng Z., et al. A global map of p53 transcription-factor binding sites in the human genome. Cell. 2006;124:207–219.
- Loh Y.H., Wu Q., Chew J.L., Vega V.B., Zhang W., Chen X., Bourque G., George J., Leong B., Liu J., et al. The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nat. Genet. 2006;38:431–440.
- Liu X.S., Brutlag D.L., Liu J.S. An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nat. Biotechnol. 2002;20:835–839.