Identifying transcription factor binding sites through Markov chain optimization
Comparative Study
Kyle Ellrott et al. Bioinformatics. 2002.
Abstract
Even though every cell in an organism contains the same genetic material, not every cell expresses the same cohort of genes. Therefore, one of the major problems facing genomic research today is to determine not only which genes are differentially expressed and under what conditions, but also how the expression of those genes is regulated. The first step in determining differential gene expression is the binding of sequence-specific DNA binding proteins (i.e., transcription factors) to regulatory regions of the genes (i.e., promoters and enhancers). An important aspect of understanding how a given transcription factor functions is knowing the full range of sites it can bind and, consequently, the potential target genes it may regulate. In this study, we have developed a computer algorithm to scan genomic databases for transcription factor binding sites, based on a novel Markov chain optimization method, and used it to scan the human genome for sites that bind to hepatocyte nuclear factor 4 alpha (HNF4alpha). A list of 71 known HNF4alpha binding sites from the literature was used to train our Markov chain model. By looking at a window of 600 nucleotides around the transcription start site of each confirmed gene in the human genome, we identified 849 sites with varying binding potential and experimentally tested 109 of those sites for binding to HNF4alpha. Of the 109 sites tested, 77 proved to be new HNF4alpha binding sites with varying binding affinities (a 71% success rate). Therefore, this computational method for searching genomic databases for potential transcription factor binding sites is a powerful tool for investigating mechanisms of differential gene regulation.
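The abstract does not spell out the optimization method, so the following is only a minimal sketch of the general idea of training a Markov chain on known sites and scanning a sequence with it. It assumes a position-specific first-order chain with pseudocounts and a uniform background, which may differ from what the authors implemented. The function names, the pseudocount, the threshold, and the toy training sites are all hypothetical and are not taken from the paper.

```python
import math

ALPHABET = "ACGT"


def train_markov_chain(sites, pseudocount=1.0):
    """Fit a position-specific first-order Markov chain to aligned sites.

    Assumption: all sites have equal length and contain only A, C, G, T.
    Returns (log_start, log_trans), where log_trans[i][a][b] is the log
    probability of base b at position i+1 given base a at position i.
    """
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all training sites must have the same length")
    width = len(sites[0])

    # Distribution of the first base, smoothed with pseudocounts.
    start_counts = {b: pseudocount for b in ALPHABET}
    for site in sites:
        start_counts[site[0]] += 1
    start_total = sum(start_counts.values())
    log_start = {b: math.log(c / start_total) for b, c in start_counts.items()}

    # One smoothed transition table per adjacent pair of positions.
    log_trans = []
    for i in range(1, width):
        counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
        for site in sites:
            counts[site[i - 1]][site[i]] += 1
        log_trans.append({
            a: {b: math.log(c / sum(row.values())) for b, c in row.items()}
            for a, row in counts.items()
        })
    return log_start, log_trans


def score(window, model):
    """Log-odds of a window under the chain versus a uniform background."""
    log_start, log_trans = model
    total = log_start[window[0]]
    for i in range(1, len(window)):
        total += log_trans[i - 1][window[i - 1]][window[i]]
    return total - len(window) * math.log(0.25)


def scan(sequence, model, threshold):
    """Slide over the sequence and keep windows scoring at or above threshold."""
    width = len(model[1]) + 1
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in ALPHABET for base in window):
            continue  # skip windows with ambiguous bases such as N
        s = score(window, model)
        if s >= threshold:
            hits.append((start, window, s))
    return hits


# Toy data invented for illustration; these are NOT the 71 literature sites.
toy_sites = ["AGGTCA", "AGGTCA", "AGGTTA", "GGGTCA"]
model = train_markov_chain(toy_sites)
best = score("AGGTCA", model)
hits = scan("CCCCAGGTCACCCC", model, threshold=best - 1e-9)
print(hits)
```

In a setting like the one described, `scan` would be run over the 600-nucleotide window around each transcription start site, and the threshold would be chosen from the score distribution of the training sites rather than fixed by hand.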
Similar articles
- Analysis of protein dimerization and ligand binding of orphan receptor HNF4alpha.
  Bogan AA, Dallas-Yang Q, Ruse MD Jr, Maeda Y, Jiang G, Nepomuceno L, Scanlan TS, Cohen FE, Sladek FM. J Mol Biol. 2000 Sep 29;302(4):831-51. doi: 10.1006/jmbi.2000.4099. PMID: 10993727
- Transcription binding site prediction using Markov models.
  Abnizova I, Rust AG, Robinson M, Te Boekhorst R, Gilks WR. J Bioinform Comput Biol. 2006 Apr;4(2):425-41. doi: 10.1142/s0219720006001813. PMID: 16819793
- Integration of genome and chromatin structure with gene expression profiles to predict c-MYC recognition site binding and function.
  Chen Y, Blackwell TW, Chen J, Gao J, Lee AW, States DJ. PLoS Comput Biol. 2007 Apr 6;3(4):e63. doi: 10.1371/journal.pcbi.0030063. PMID: 17411336 Free PMC article.
- Eukaryotic transcription factor binding sites--modeling and integrative search methods.
  Hannenhalli S. Bioinformatics. 2008 Jun 1;24(11):1325-31. doi: 10.1093/bioinformatics/btn198. Epub 2008 Apr 21. PMID: 18426806 Review.
- A compilation and classification of DNA binding sites for protein transcription factors from vertebrates.
  Boulikas T. Crit Rev Eukaryot Gene Expr. 1994;4(2-3):117-321. doi: 10.1615/critreveukargeneexpr.v4.i2-3.10. PMID: 7881164 Review.
Cited by
- HNF4α isoforms: the fraternal twin master regulators of liver function.
  Radi SH, Vemuri K, Martinez-Lomeli J, Sladek FM. Front Endocrinol (Lausanne). 2023 Aug 3;14:1226173. doi: 10.3389/fendo.2023.1226173. eCollection 2023. PMID: 37600688 Free PMC article. Review.
- MODER2: first-order Markov modeling and discovery of monomeric and dimeric binding motifs.
  Toivonen J, Das PK, Taipale J, Ukkonen E. Bioinformatics. 2020 May 1;36(9):2690-2696. doi: 10.1093/bioinformatics/btaa045. PMID: 31999322 Free PMC article.
- A novel method for improved accuracy of transcription factor binding site prediction.
  Khamis AM, Motwalli O, Oliva R, Jankovic BR, Medvedeva YA, Ashoor H, Essack M, Gao X, Bajic VB. Nucleic Acids Res. 2018 Jul 6;46(12):e72. doi: 10.1093/nar/gky237. PMID: 29617876 Free PMC article.
- Evaluating tools for transcription factor binding site prediction.
  Jayaram N, Usvyat D, R Martin AC. BMC Bioinformatics. 2016 Nov 2;17(1):547. doi: 10.1186/s12859-016-1298-9. PMID: 27806697 Free PMC article.
- Diabetes-linked transcription factor HNF4α regulates metabolism of endogenous methylarginines and β-aminoisobutyric acid by controlling expression of alanine-glyoxylate aminotransferase 2.
  Burdin DV, Kolobov AA, Brocker C, Soshnev AA, Samusik N, Demyanov AV, Brilloff S, Jarzebska N, Martens-Lobenhoffer J, Mieth M, Maas R, Bornstein SR, Bode-Böger SM, Gonzalez F, Weiss N, Rodionov RN. Sci Rep. 2016 Oct 18;6:35503. doi: 10.1038/srep35503. PMID: 27752141 Free PMC article.