A census of human transcription factors: function, expression and evolution
Simon, I. et al. Serial regulation of transcriptional regulators in the yeast cell cycle. Cell 106, 697–708 (2001).
Accili, D. & Arden, K. C. FoxOs at the crossroads of cellular metabolism, differentiation, and transformation. Cell 117, 421–426 (2004).
Bain, G. et al. E2A proteins are required for proper B cell development and initiation of immunoglobulin gene rearrangements. Cell 79, 885–892 (1994).
Dynlacht, B. D. Regulation of transcription by proteins that control the cell cycle. Nature 389, 149–152 (1997).
Furney, S. J. et al. Structural and functional properties of genes involved in human cancer. BMC Genomics 7, 3 (2006).
Boyadjiev, S. A. & Jabs, E. W. Online Mendelian Inheritance in Man (OMIM) as a knowledgebase for human developmental disorders. Clin. Genet. 57, 253–266 (2000).
Bustamante, C. D. et al. Natural selection on protein-coding genes in the human genome. Nature 437, 1153–1157 (2005). Genome-wide study demonstrating that human TFs are under strong positive selection.
De, S., Lopez-Bigas, N. & Teichmann, S. A. Patterns of evolutionary constraints on genes in humans. BMC Evol. Biol. 8, 275 (2008).
Lopez-Bigas, N., De, S. & Teichmann, S. A. Functional protein divergence in the evolution of Homo sapiens. Genome Biol. 9, R33 (2008).
van Nimwegen, E. Scaling laws in the functional content of genomes. Trends Genet. 19, 479–484 (2003).
Vogel, C. & Chothia, C. Protein family expansions and biological complexity. PLoS Comput. Biol. 2, e48 (2006).
Levine, M. & Tjian, R. Transcription regulation and animal diversity. Nature 424, 147–151 (2003). A discussion of how progressively more elaborate transcriptional regulation has contributed to organismal complexity.
Lemon, B. & Tjian, R. Orchestrated response: a symphony of transcription factors for gene control. Genes Dev. 14, 2551–2569 (2000).
Wilson, D. et al. DBD – taxonomically broad transcription factor predictions: new content and functionality. Nucleic Acids Res. 36, D88–D92 (2008).
Perez-Rueda, E. & Collado-Vides, J. The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12. Nucleic Acids Res. 28, 1838–1847 (2000).
Moreno-Campuzano, S., Janga, S. C. & Perez-Rueda, E. Identification and analysis of DNA-binding transcription factors in Bacillus subtilis and other Firmicutes – a genomic approach. BMC Genomics 7, 147 (2006).
Park, J. et al. FTFD: an informatics pipeline supporting phylogenomic analysis of fungal transcription factors. Bioinformatics 24, 1024–1025 (2008).
Reece-Hoyes, J. S. et al. A compendium of Caenorhabditis elegans regulatory transcription factors: a resource for mapping transcription regulatory networks. Genome Biol. 6, R110 (2005).
Adryan, B. & Teichmann, S. A. FlyTF: a systematic review of site-specific transcription factors in the fruit fly Drosophila melanogaster. Bioinformatics 22, 1532–1533 (2006).
Gray, P. A. et al. Mouse brain organization revealed through direct genome-scale TF expression analysis. Science 306, 2255–2257 (2004).
Riano-Pachon, D. M. et al. PlnTFDB: an integrative plant transcription factor database. BMC Bioinformatics 8, 42 (2007).
Riechmann, J. L. et al. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science 290, 2105–2110 (2000). The first genome-wide survey of TF repertoires for eukaryotic organisms.
Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
Venter, J. C. et al. The sequence of the human genome. Science 291, 1304–1351 (2001).
Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nature Genet. 25, 25–29 (2000).
Hunter, S. et al. InterPro: the integrative protein signature database. Nucleic Acids Res. 35, D224–D228 (2008).
Messina, D. N. et al. An ORFeome-based analysis of human transcription factor genes and the construction of a microarray to interrogate their expression. Genome Res. 14, 2041–2047 (2004).
Roach, J. C. et al. Transcription factor expression in lipopolysaccharide-activated peripheral-blood-derived mononuclear cells. Proc. Natl Acad. Sci. USA 104, 16245–16250 (2007).
Kersey, P. J. et al. The International Protein Index: an integrated database for proteomics experiments. Proteomics 4, 1985–1988 (2004).
Wingender, E. et al. TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res. 28, 316–319 (2000).
Brunkow, M. E. et al. Disruption of a new forkhead/winged-helix protein, scurfin, results in the fatal lymphoproliferative disorder of the scurfy mouse. Nature Genet. 27, 68–73 (2001).
Wildin, R. S. et al. X-linked neonatal diabetes mellitus, enteropathy and endocrinopathy syndrome is the human equivalent of mouse scurfy. Nature Genet. 27, 18–20 (2001).
Hori, S., Nomura, T. & Sakaguchi, S. Control of regulatory T cell development by the transcription factor Foxp3. Science 299, 1057–1061 (2003).
Fontenot, J. D., Gavin, M. A. & Rudensky, A. Y. Foxp3 programs the development and function of CD4+ CD25+ regulatory T cells. Nature Immunol. 4, 330–336 (2003).
Marson, A. et al. Foxp3 occupancy and regulation of key target genes during T-cell stimulation. Nature 445, 931–935 (2007).
Zheng, Y. et al. Genome-wide analysis of Foxp3 target genes in developing and mature regulatory T cells. Nature 445, 936–940 (2007).
Satoda, N. et al. Value of FOXP3 expression in peripheral blood as rejection marker after miniature swine lung transplantation. J. Heart Lung Transplant. 27, 1293–1301 (2008).
Luscombe, N. M. et al. An overview of the structures of protein–DNA complexes. Genome Biol. 1, REVIEWS001 (2000). A review of DNA-binding domain structures and their interactions with DNA.
Su, A. I. et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl Acad. Sci. USA 101, 6062–6067 (2004). This paper presents the microarray experiments underlying the SymAtlas gene expression data for human and mouse organs, tissues and cell lines.
Ghaemmaghami, S. et al. Global analysis of protein expression in yeast. Nature 425, 737–741 (2003).
Liu, X. & Clarke, N. D. Rationalization of gene regulation by a eukaryotic transcription factor: calculation of regulatory region occupancy from predicted binding affinities. J. Mol. Biol. 323, 1–8 (2002).
Nagalakshmi, U. et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349 (2008).
Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nature Rev. Genet. 10, 57–63 (2009). A recent overview of the use of ultra-high-throughput sequencing technologies for measuring transcript levels.
Merscher, S. et al. TBX1 is responsible for cardiovascular defects in velo-cardio-facial/DiGeorge syndrome. Cell 104, 619–629 (2001).
Tang, C. J. et al. The zinc finger domain of Tzfp binds to the tbs motif located at the upstream flanking region of the Aie1 (aurora-C) kinase gene. J. Biol. Chem. 276, 19631–19639 (2001).
Pashmforoush, M. et al. Nkx2–5 pathways and congenital heart disease; loss of ventricular myocyte lineage specification leads to progressive cardiomyopathy and complete heart block. Cell 117, 373–386 (2004).
Koizume, S. et al. Heterogeneity in binding and gene-expression regulation by HIF-2α. Biochem. Biophys. Res. Commun. 371, 251–255 (2008).
Kimura, S. et al. The T/ebp null mouse: thyroid-specific enhancer-binding protein is essential for the organogenesis of the thyroid, lung, ventral forebrain, and pituitary. Genes Dev. 10, 60–69 (1996).
Morte, B. et al. Deletion of the thyroid hormone receptor α1 prevents the structural alterations of the cerebellum induced by hypothyroidism. Proc. Natl Acad. Sci. USA 99, 3985–3989 (2002).
Hirose, K. et al. cDNA cloning and tissue-specific expression of a novel basic helix-loop-helix/PAS factor (Arnt2) with close sequence similarity to the aryl hydrocarbon receptor nuclear translocator (Arnt). Mol. Cell Biol. 16, 1706–1713 (1996).
Alberti, S. et al. Neuronal migration in the murine rostral migratory stream requires serum response factor. Proc. Natl Acad. Sci. USA 102, 6148–6153 (2005).
Miano, J. M. et al. Restricted inactivation of serum response factor to the cardiovascular system. Proc. Natl Acad. Sci. USA 101, 17132–17137 (2004).
Gauthier-Rouviere, C. et al. p67SRF is a constitutive nuclear protein implicated in the modulation of genes required throughout the G1 period. Cell Regul. 2, 575–588 (1991).
Arsenian, S. et al. Serum response factor is essential for mesoderm formation during mouse embryogenesis. EMBO J. 17, 6289–6299 (1998).
Hill, C. S., Wynne, J. & Treisman, R. The Rho family GTPases RhoA, Rac1, and CDC42Hs regulate transcriptional activation by SRF. Cell 81, 1159–1170 (1995).
Treisman, R. Ternary complex factors: growth factor regulated transcriptional activators. Curr. Opin. Genet. Dev. 4, 96–101 (1994).
Mo, Y. et al. Crystal structure of a ternary SAP-1/SRF/c-fos SRE DNA complex. J. Mol. Biol. 314, 495–506 (2001).
Cooper, S. J. et al. Serum response factor binding sites differ in three human cell types. Genome Res. 17, 136–144 (2007).
Gineitis, D. & Treisman, R. Differential usage of signal transduction pathways defines two types of serum response factor target gene. J. Biol. Chem. 276, 24531–24539 (2001).
Gonzalez Bosc, L. V. et al. Nuclear factor of activated T cells and serum response factor cooperatively regulate the activity of an α-actin intronic enhancer. J. Biol. Chem. 280, 26113–26120 (2005).
Doi, H. et al. HERP1 inhibits myocardin-induced vascular smooth muscle cell differentiation by interfering with SRF binding to CArG box. Arterioscler. Thromb. Vasc. Biol. 25, 2328–2334 (2005).
Zhang, Y., Fillmore, R. A. & Zimmer, W. E. Structural and functional analysis of domains mediating interaction between the bagpipe homologue, Nkx3.1 and serum response factor. Exp. Biol. Med. (Maywood) 233, 297–309 (2008).
Amoutzias, G. D. et al. One billion years of bZIP transcription factor evolution: conservation and change in dimerization and DNA-binding site specificity. Mol. Biol. Evol. 24, 827–835 (2007).
Uht, R. M. et al. A conserved lysine in the estrogen receptor DNA binding domain regulates ligand activation profiles at AP-1 sites, possibly by controlling interactions with a modulating repressor. Nucl. Recept. 2, 2 (2004).
Garcia-Fernandez, J. The genesis and evolution of homeobox gene clusters. Nature Rev. Genet. 6, 881–892 (2005).
De la Houssaye, G. et al. ETS-1 and ETS-2 are upregulated in a transgenic mouse model of pigmented ocular neoplasm. Mol. Vis. 14, 1912–1928 (2008).
Albagli, O. et al. A model for gene evolution of the ets-1/ets-2 transcription factors based on structural and functional homologies. Oncogene 9, 3259–3271 (1994).
Itzkovitz, S., Tlusty, T. & Alon, U. Coding limits on the number of transcription factors. BMC Genomics 7, 239 (2006).
Luscombe, N. M. & Thornton, J. M. Protein–DNA interactions: amino acid conservation and the effects of mutations on binding specificity. J. Mol. Biol. 320, 991–1009 (2002).
Pavletich, N. P. & Pabo, C. O. Zinc finger–DNA recognition: crystal structure of a Zif268–DNA complex at 2.1 Å. Science 252, 809–817 (1991).
Nardelli, J. et al. Base sequence discrimination by zinc-finger DNA-binding domains. Nature 349, 175–178 (1991).
Honda, K. & Taniguchi, T. IRFs: master regulators of signalling by Toll-like receptors and cytosolic pattern-recognition receptors. Nature Rev. Immunol. 6, 644–658 (2006).
Bulger, M. et al. Conservation of sequence and structure flanking the mouse and human β-globin loci: the β-globin genes are embedded within an array of odorant receptor genes. Proc. Natl Acad. Sci. USA 96, 5129–5134 (1999).
Ben-Arie, N. et al. Olfactory receptor gene cluster on human chromosome 17: possible duplication of an ancestral receptor repertoire. Hum. Mol. Genet. 3, 229–235 (1994).
Scott, M. P. Vertebrate homeobox gene nomenclature. Cell 71, 551–553 (1992).
Dehal, P. et al. Human chromosome 19 and related regions in mouse: conservative and lineage-specific evolution. Science 293, 104–111 (2001).
Grimwood, J. et al. The DNA sequence and biology of human chromosome 19. Nature 428, 529–535 (2004).
Abbasi, A. A. & Grzeschik, K. H. An insight into the phylogenetic history of HOX linked gene families in vertebrates. BMC Evol. Biol. 7, 239 (2007).
Looman, C. et al. KRAB zinc finger proteins: an analysis of the molecular mechanisms governing their increase in numbers and complexity during evolution. Mol. Biol. Evol. 19, 2118–2130 (2002).
Huntley, S. et al. A comprehensive catalog of human KRAB-associated zinc finger genes: insights into the evolutionary history of a large family of transcriptional repressors. Genome Res. 16, 669–677 (2006).
Vogel, M. J. et al. Human heterochromatin proteins form large domains containing KRAB-ZNF genes. Genome Res. 16, 1493–1504 (2006).
Kikuta, H. et al. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. Genome Res. 17, 545–555 (2007).
Lee, A. P. et al. Highly conserved syntenic blocks at the vertebrate Hox loci and conserved regulatory elements within and outside Hox gene clusters. Proc. Natl Acad. Sci. USA 103, 6994–6999 (2006).
Kim, S. K. et al. A gene expression map for Caenorhabditis elegans. Science 293, 2087–2092 (2001).
Arendt, D. The evolution of cell types in animals: emerging principles from molecular studies. Nature Rev. Genet. 9, 868–882 (2008).
Wilson, C. A., Kreychman, J. & Gerstein, M. Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores. J. Mol. Biol. 297, 233–249 (2000).
Todd, A. E., Orengo, C. A. & Thornton, J. M. Evolution of function in protein superfamilies, from a structural perspective. J. Mol. Biol. 307, 1113–1143 (2001).
Freilich, S. et al. Relationship between the tissue-specificity of mouse gene expression and the evolutionary origin and function of the proteins. Genome Biol. 6, R56 (2005).
Hirayama, T. & Shinozaki, K. A cdc5+ homolog of a higher plant, Arabidopsis thaliana. Proc. Natl Acad. Sci. USA 93, 13371–13376 (1996).
Monod, J. & Jacob, F. Teleonomic mechanisms in cellular metabolism, growth, and differentiation. Cold Spring Harb. Symp. Quant. Biol. 26, 389–401 (1961).
King, M. C. & Wilson, A. C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975).
Nielsen, R. et al. A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 3, e170 (2005).
Lai, C. S. et al. A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413, 519–523 (2001).
Enard, W. et al. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418, 869–872 (2002).
Haygood, R. et al. Promoter regions of many neural- and nutrition-related genes have experienced positive selection during human evolution. Nature Genet. 39, 1140–1144 (2007).
Odom, D. T. et al. Tissue-specific transcriptional regulation has diverged significantly between human and mouse. Nature Genet. 39, 730–732 (2007). A ChIP–chip study showing that functionally equivalent TFs bind to different sites in the human and mouse genomes.
Khaitovich, P. et al. Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees. Science 309, 1850–1854 (2005).
Gilad, Y. et al. Expression profiling in primates reveals a rapid evolution of human transcription factors. Nature 440, 242–245 (2006). A microarray study suggesting that changes in gene expression levels might be important in distinguishing between different primate species.
Stroud, J. C. et al. Structure of the forkhead domain of FOXP2 bound to DNA. Structure 14, 159–166 (2006).
Lopez-Bigas, N., Blencowe, B. J. & Ouzounis, C. A. Highly consistent patterns for inherited human diseases at the molecular level. Bioinformatics 22, 269–277 (2006).
Darnell, J. E. Transcription factors as targets for cancer therapy. Nature Rev. Cancer 2, 740–749 (2002).
Engelkamp, D. & van Heyningen, V. Transcription factors in disease. Curr. Opin. Genet. Dev. 6, 334–342 (1996).
Jimenez-Sanchez, G., Childs, B. & Valle, D. Human disease genes. Nature 409, 853–855 (2001).
Wheeler, D. L. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 35, D5–D12 (2007).
Scherzer, C. R. et al. GATA transcription factors directly regulate the Parkinson's disease-linked gene α-synuclein. Proc. Natl Acad. Sci. USA 105, 10907–10912 (2008).
Sinha, S. et al. Systematic functional characterization of cis-regulatory motifs in human core promoters. Genome Res. 18, 477–488 (2008).
Martinez, N. J. et al. A C. elegans genome-scale microRNA network contains composite feedback motifs with high flux capacity. Genes Dev. 22, 2535–2549 (2008).
Berger, M. F. et al. Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence preferences. Cell 133, 1266–1276 (2008). A high-throughput assay of DNA-binding specificities for 168 mouse TFs using protein-binding microarrays.
Hallikas, O. et al. Genome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity. Cell 124, 47–59 (2006). A study combining SELEX and motif-finding methods to identify the DNA-binding specificities and target regions for five mammalian TFs.
Horak, C. E. et al. GATA-1 binding sites mapped in the β-globin locus by using mammalian chIp–chip analysis. Proc. Natl Acad. Sci. USA 99, 2924–2929 (2002).
Johnson, D. S. et al. Genome-wide mapping of in vivo protein–DNA interactions. Science 316, 1497–1502 (2007).
Robertson, G. et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nature Methods 4, 651–657 (2007). Refs 115 and 116 are the first uses of chromatin immunoprecipitation and ultra-high-throughput sequencing to determine genome-wide binding sites of mammalian TFs.
Gaspard, N. et al. An intrinsic mechanism of corticogenesis from embryonic stem cells. Nature 455, 351–357 (2008).