Control of Embryonic Stem Cell State

Cell. Author manuscript; available in PMC 2012 Mar 18.

PMCID: PMC3099475

NIHMSID: NIHMS283517

Whitehead Institute for Biomedical Research, Cambridge, MA 02142

Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142

Abstract

Embryonic stem cells and induced pluripotent stem cells, which can be propagated in culture in an undifferentiated state but induced to differentiate into specialized cell types, hold great promise for regenerative medicine. Moreover, these cells provide a powerful model system for studies of cellular identity and early mammalian development. Insights into the transcriptional control of embryonic stem cell state, including the regulatory circuitry underlying pluripotency and cellular reprogramming, have emerged from the study of these cells. These studies have also revealed fundamental mechanisms that control vertebrate gene expression, connect gene expression to chromosome structure, and contribute to human disease.

Introduction

Embryonic stem cells (ESCs) are pluripotent, self-renewing cells that are derived from the inner cell mass (ICM) of the developing blastocyst. Pluripotency is the capacity of a single cell to generate all cell lineages of the developing and adult organism. Self-renewal is the ability of a cell to proliferate in the same state. The molecular mechanisms that control ESC pluripotency and self-renewal are important to discover because they are key to understanding development. Because defects in development cause many different diseases, improved understanding of control mechanisms in pluripotent cells may lead to new therapies for these diseases.

ESCs have a gene expression program that allows them to self-renew yet remain poised to differentiate into essentially all cell types in response to developmental cues. Recent reviews have discussed developmental potency (Rossant, 2008), the nature of the pluripotent ground state of ESCs (Silva and Smith, 2008), ESC transcriptional regulatory circuitry (Chen et al., 2008a; Jaenisch and Young, 2008; Orkin et al., 2008) and cellular reprogramming into ESC-like states (Yamanaka and Blau, 2010). This review provides a synthesis of key concepts that explain how pluripotency and self-renewal are controlled transcriptionally. These concepts have emerged from genetic, biochemical and molecular studies of the transcription factors, cofactors, chromatin regulators and noncoding RNAs that control the ESC gene expression program.

The regulators of gene expression programs can participate in gene activation, establish a poised state for gene activation in response to developmental cues, or contribute to gene silencing (Figure 1). The molecular mechanisms by which these regulators generally participate in control of gene expression are the subject of other reviews (Bartel, 2009; Fuda et al., 2009; Ho and Crabtree, 2010; Li et al., 2007; Roeder, 2005; Surface et al., 2010; Taatjes, 2010). I describe here the regulators that have been implicated in control of ESC state and discuss how they contribute to pluripotency and self-renewal.

Figure 1. Models for transcriptionally active, poised and silent genes

Transcription factors, cofactors, chromatin regulators and ncRNA regulators can be found at active, poised and silent genes. At active genes, enhancers are typically bound by multiple transcription factors (TFs), which recruit cofactors that can interact with RNA polymerase II at the core promoter. RNA polymerase II generates a short transcript and pauses until pause-release factors and elongation factors allow further transcription. Chromatin regulators, which include nucleosome remodeling complexes such as Swi/Snf complexes and histone modifying enzymes such as TrxG, Dot1 and Set2, are recruited by transcription factors or the transcription apparatus and mobilize or modify local nucleosomes. Poised genes are rapidly activated when ESCs are stimulated to differentiate. At poised genes, transcription initiation and recruitment of TrxG can occur, but pause-release, elongation and recruitment of Dot1 and Set2 do not occur. The PcG and SetDB1 chromatin regulators can contribute to this repression, and these can be recruited by some transcription factors and by ncRNAs. The RNA polymerase II “ghost” in this model of poised genes reflects the low levels of the enzyme that are detected under steady-state conditions. Silent genes show little or no evidence of transcription initiation or elongation and are often occupied by chromatin regulators that methylate histone H3K9 and other residues. Some of these silent genes are probably silenced by mechanisms that depend on transcription of at least a portion of the gene (Buhler and Moazed, 2007; Grewal and Elgin, 2007; Zaratiegui et al., 2007).
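The distinctions drawn in this legend can be restated as a simple decision rule. The following Python sketch is only an illustration: the function name `classify_gene` and the feature labels are hypothetical, and the rule reduces the model to two observations (evidence of initiation and evidence of elongation), ignoring the quantitative differences seen in real data.

```python
def classify_gene(features):
    """Assign a gene to one of the three states in the model.

    `features` is a set of labels observed at the gene. The labels are
    illustrative assumptions, not a standard vocabulary.
    """
    initiated = "initiation" in features
    elongating = "elongation" in features
    if initiated and elongating:
        return "active"   # full transcription; TrxG, Dot1 and Set2 recruited
    if initiated:
        return "poised"   # initiation and TrxG, but no pause release or elongation
    return "silent"       # little or no initiation; often H3K9 methylation


# Hypothetical feature sets, one per state described in the legend.
examples = {
    "active-like": {"initiation", "elongation", "TrxG", "Dot1", "Set2"},
    "poised-like": {"initiation", "TrxG", "PcG"},
    "silent-like": {"H3K9 methylation"},
}
states = {name: classify_gene(f) for name, f in examples.items()}
```

In this sketch the chromatin regulators listed in each feature set do not drive the classification; they are included only to show which factors the legend associates with each state.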

Transcription Factors and Regulatory Circuitry

Transcription factors recognize specific DNA sequences and either activate or prevent transcription. Early studies into the transcriptional control of the E. coli lac operon created the framework for understanding gene control (Jacob and Monod, 1961). In the absence of lactose, the lac operon is repressed by the Lac repressor, which binds the lac operator and inhibits transcription by RNA polymerase. In the presence of lactose, the Lac repressor is lost and gene expression is activated by a transcription-activating factor that binds a nearby site and recruits RNA polymerase. The fundamental concept that emerged from these studies - that gene control relies on specific repressors and activators and the DNA sequence elements they recognize - continues to provide the foundation for understanding control of gene expression in all organisms.

In mammals, transcription factors make up the largest single class of proteins encoded in the genome, representing approximately 10% of all protein-coding genes (Levine and Tjian, 2003). Transcription factors bind to both promoter-proximal DNA elements and to more distal regions that can be nearby or hundreds of kilobases away. These distal elements that are involved in positive gene regulation are called enhancers, and these elements are generally bound by multiple transcription factors. Transcription factors can activate gene expression by recruiting the transcription apparatus and/or by stimulating release of RNA polymerase II from pause sites (Fuda et al., 2009). They can also recruit various chromatin regulators to promoter regions to modify and mobilize nucleosomes in order to increase access to local DNA sequences (Li et al., 2007).

In ESCs, the pluripotent state is largely governed by the core transcription factors Oct4, Sox2 and Nanog (Table 1) (Chambers and Smith, 2004; Niwa, 2007; Silva and Smith, 2008). Oct4 and Nanog were identified as key regulators based on their relatively unique expression pattern in ESCs and genetic experiments showing that they are essential for establishing or maintaining a robust pluripotent state (Chambers et al., 2003; Chambers and Smith, 2004; Mitsui et al., 2003; Nichols et al., 1998; Niwa et al., 2000). Oct4 functions as a heterodimer with Sox2 in ESCs, thus placing Sox2 among the key regulators (Ambrosetti et al., 2000; Avilion et al., 2003; Masui et al., 2007). Reprogramming of somatic cells into induced pluripotent stem (iPS) cells generally requires forced expression of Oct4 and Sox2, unless endogenous Sox2 is expressed in the somatic cell, consistent with the view that Oct4/Sox2 are key to establishing the ESC state. Although ESCs can be propagated in the absence of Nanog (Chambers et al., 2007), Nanog promotes a stable undifferentiated ESC state (Chambers et al., 2007), is necessary for pluripotency to develop in ICM cells (Silva et al., 2009), and co-occupies most sites with Oct4 and Sox2 throughout the ESC genome (Marson et al., 2008b), so it is included here as a component of the core regulatory circuitry.

Table 1

Transcriptional regulators implicated in control of ESC state*.

Type of Regulator         Function                                      References

Transcription factors
  Oct4                    Core circuitry                                1
  Sox2                    Core circuitry                                2
  Nanog                   Core circuitry                                3
  Tcf3                    Wnt signaling to core circuitry               4
  Stat3                   LIF signaling to core circuitry               5
  Smad1                   BMP signaling to core circuitry               6
  Smad2/3                 TGF-β/Activin/Nodal signaling                 7
  c-Myc                   Proliferation                                 8
  Esrrb                   Steroid hormone receptor                      9
  Sall4                   Embryonic regulator                           10
  Tbx3                    Mediates LIF signaling                        11
  Zfx                     Self-renewal                                  12
  Ronin                   Metabolism                                    13
  Klf4                    LIF signaling                                 14
  Prdm14                  ESC identity                                  15

Cofactors
  Mediator                Core circuitry                                16
  Cohesin                 Core circuitry                                17
  Paf1 complex            Couples transcription with histone            18
                          modification
  Dax1                    Oct4 inhibitor                                19
  Cnot3                   Myc/Zfx cofactor                              20
  Trim28                  Myc/Zfx cofactor                              21

Chromatin regulators
  Polycomb Group          Silencing of lineage-specific regulators      22
  SetDB1 (ESET)           Silencing of lineage-specific regulators      23
  esBAF                   Nucleosome mobilization                       24
  Chd1                    Nucleosome mobilization                       25
  Chd7                    Nucleosome mobilization                       26
  Tip60-p400              Histone acetylation                           27

ncRNA regulators
  miRNAs                  Fine tuning of pluripotency transcripts       28
  GC-rich ncRNAs          PcG complex recruitment                       29

Core regulatory circuitry

Two key concepts dominate our understanding of the function of the core transcription factors Oct4, Sox2 and Nanog in control of ESC state (Figure 2): 1) The core transcription factors function together to positively regulate their own promoters, forming an interconnected autoregulatory loop. 2) The core factors co-occupy and activate expression of genes necessary to maintain ESC state, while contributing to repression of genes encoding lineage-specific transcription factors whose absence helps prevent exit from the pluripotent state.

Figure 2. Core regulatory circuitry

Oct4, Sox2 and Nanog collaborate to regulate their own promoters, forming an interconnected autoregulatory loop. The Pou5f1 (Oct4), Sox2 and Nanog genes are represented as red boxes and proteins as blue balloons. These core transcription factors (O/S/N) function to activate expression of protein-coding and miRNA genes necessary to maintain ESC state, but they also occupy poised genes encoding lineage-specific protein and miRNA regulators whose repression is essential to maintaining that state. Additional transcription factors, such as the c-Myc/Max heterodimer (M/M), cause pause release at actively transcribed genes. A subset of the cofactors and chromatin regulators implicated in control of ESC state (Table 1) is shown.

The interconnected autoregulatory loop formed by Oct4, Sox2 and Nanog generates a bistable state for ESCs: residence in a positive-feedback-controlled gene expression program when the factors are expressed at appropriate levels, versus entrance into a differentiation program when any one of the master transcription factors is no longer functionally available (Boyer et al., 2005; Loh et al., 2006). This regulatory circuit likely explains the ability to jump-start the ESC gene expression program during reprogramming by forced expression of reprogramming factors (Jaenisch and Young, 2008). Thus, the ectopically expressed factors activate transcription of the endogenous Pou5f1 (Oct4), Sox2 and Nanog genes and thereby initiate the positive-feedback loop that sustains ongoing production of these factors from the endogenous genes in the absence of further input from the ectopically expressed factors. Some factors present in reprogramming cocktails, such as c-Myc, appear to facilitate activation of this interconnected autoregulatory circuitry by stimulating gene expression and proliferation more generally (Rahl et al., 2010).
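The logic of a bistable, self-sustaining loop can be illustrated with a deliberately simplified toy model. The Python sketch below collapses Oct4, Sox2 and Nanog into a single variable `x` that activates its own production through a Hill function and decays at a constant rate, with an optional transient "ectopic" input standing in for forced expression. The Hill form, all parameter values and the function name `simulate` are assumptions chosen for illustration; they are not measured quantities or a model taken from the cited studies.

```python
def simulate(x0, pulse_strength=0.0, pulse_duration=0.0, t_end=60.0, dt=0.01):
    """Integrate dx/dt = ectopic + vmax * x^n / (K^n + x^n) - deg * x (Euler).

    x0             -- starting level of the lumped core factor
    pulse_strength -- rate of ectopic production while the pulse is on
    pulse_duration -- time at which ectopic production is switched off
    Returns the level of x at time t_end.
    """
    vmax, K, n, deg = 1.0, 0.5, 4, 1.0  # illustrative parameters (assumed)
    x = x0
    for i in range(int(t_end / dt)):
        t = i * dt
        ectopic = pulse_strength if t < pulse_duration else 0.0
        production = vmax * x**n / (K**n + x**n)  # positive autoregulation
        x += dt * (ectopic + production - deg * x)
    return x


high_start = simulate(0.6)             # starts above the threshold (x = 0.5)
low_start = simulate(0.4)              # starts below the threshold
long_pulse = simulate(0.0, 0.3, 20.0)  # sustained ectopic input, then removed
short_pulse = simulate(0.0, 0.3, 1.0)  # ectopic input removed too early
```

With these assumed parameters the model has an "off" state at zero and an "on" state near 0.9, separated by a threshold at 0.5. A sufficiently long ectopic pulse pushes the system past the threshold, after which the loop sustains itself without further input; a pulse that ends too early decays back to the off state.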

Oct4, Sox2 and Nanog collaborate to activate a substantial fraction of the actively transcribed protein-coding and miRNA genes in ESCs (Chen et al., 2008b; Marson et al., 2008b). Sites co-occupied by the three core regulators generally have enhancer activity, and transcription of genes adjacent to such sites often depends on at least one of the trio (Chen et al., 2008b; Chew et al., 2005; Matoba et al., 2006). Oct4 and Nanog can bind and recruit multiple coactivators, as described below, accounting for their ability to activate genes.

Oct4, Sox2 and Nanog also occupy repressed genes encoding cell lineage-specific regulators, and the repression of these genes is essential for ESCs to maintain a stable pluripotent state and to undergo normal differentiation (Bilodeau et al., 2009; Boyer et al., 2005; Boyer et al., 2006; Lee et al., 2006; Loh et al., 2006; Marson et al., 2008b; Pasini et al., 2008; Pasini et al., 2004). The loss of these core regulators leads to rapid induction of a wide spectrum of genes encoding lineage-specific regulators, indicating that these genes are poised for activation. How might Oct4/Sox2 and Nanog act to repress these genes? The SetDB1 and Polycomb Group (PcG) chromatin regulators have been implicated in repression of these lineage-specific genes. Oct4 can bind sumoylated SetDB1, which catalyzes the repressive histone modification H3K9me3 at these genes (Yeap et al., 2009; Yuan et al., 2009). Polycomb Group (PcG) complexes can associate with nucleosomes with histone H3K9me3 (Margueron et al., 2009) and further contribute to repression through mechanisms described below. It is also possible that Oct4, Sox2 and Nanog activate transcription initiation in the extensive GC-rich promoter regions of these genes, and GC-rich RNA species produced from these regions might then recruit or stabilize Polycomb Group (PcG) complexes (Guenther and Young, 2010).

The ability of Oct4, Sox2 and Nanog to positively regulate genes necessary to maintain ESC state while repressing genes that would enable egress from this state explains, in part, the ability of ESCs to self-renew in an undifferentiated state, yet remain poised to differentiate into all cell types of the body in response to developmental cues. However, the factors and mechanisms that specify positive versus negative regulation by Oct4/Sox2/Nanog are not fully understood. Additional transcription factors are known to collaborate with Oct4, Sox2 and Nanog to regulate the ESC gene expression program, and these make various contributions to selective activation and repression of target genes, as described below.

Control of RNA polymerase II

Transcription factors control at least two major steps in gene expression (Fuda et al., 2009; Peterlin and Price, 2006; Rahl et al., 2010). Some transcription factors recruit RNA polymerase II to promoters, where the enzyme typically transcribes a short distance (approximately 35 bp), and then pauses or terminates. Other transcription factors recruit a cyclin-dependent kinase (Cdk9/cyclinT) called p-TEFb, which phosphorylates the polymerase and its associated pause control factors, allowing the enzyme to be released from pause sites and fully transcribe the gene. Oct4, Sox2 and Nanog interact with coactivators that bind to RNA polymerase II (Kagey et al., 2010), so the core regulators are involved in RNA polymerase II recruitment. In contrast, c-Myc, which plays important roles in ESC proliferation and self-renewal (Cartwright et al., 2005), does not appear to play an important role in RNA polymerase II recruitment, but rather binds to E-box sequences at core promoter sites and recruits p-TEFb, thus stimulating pause release (Rahl et al., 2010). A large proportion of the actively transcribed genes in ESCs are bound and regulated by both the core transcription factors and c-Myc (Figure 3A). Thus, Oct4/Sox2/Nanog apparently play dominant roles in selecting the set of ESC genes that will be actively transcribed and recruiting RNA polymerase II to these genes, while c-Myc regulates the efficiency with which these selected genes are fully transcribed. This likely explains why forced expression of c-Myc can enhance reprogramming efficiency and why this transcription factor plays such a potent role in proliferation of many cancer cells (Jaenisch and Young, 2008; Rahl et al., 2010).
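The division of labor described above can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that full-length transcript output is the product of an initiation rate (set by recruitment of RNA polymerase II) and the fraction of paused polymerases that are released. The multiplicative form, the numbers and the name `full_length_output` are assumptions, not measurements from the cited studies.

```python
def full_length_output(initiation_rate, release_fraction):
    """Toy model: transcripts completed per unit time.

    initiation_rate  -- polymerases recruited and initiated per unit time
    release_fraction -- fraction of paused polymerases released (0 to 1)
    """
    if not 0.0 <= release_fraction <= 1.0:
        raise ValueError("release_fraction must be between 0 and 1")
    return initiation_rate * release_fraction


# Hypothetical genes: only the first has polymerase recruited by core factors.
recruited = {"core_target": 10.0, "unbound_gene": 0.0}

# Compare a low and a high pause-release fraction across the same genes.
low_release = {g: full_length_output(r, 0.2) for g, r in recruited.items()}
high_release = {g: full_length_output(r, 0.8) for g, r in recruited.items()}
```

In this caricature, raising the release fraction (the role attributed to c-Myc) amplifies output only at genes where polymerase has already been recruited, which is consistent with the view that the core factors select the genes and c-Myc tunes how efficiently they are transcribed.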

Figure 3. Relationships between core and other transcription factors in regulatory circuitry and gene control

A) Overlap between actively transcribed genes occupied by core transcription factors (union of Oct4-, Sox2- and Nanog-bound genes) and those occupied by c-Myc. Active genes were defined as the set of genes occupied by both RNA polymerase II and nucleosomes with histone H3K79me3.

B) Frequency distribution showing how often c-Myc, Tcf3, Smad1, STAT3, Esrrb, Tbx3, Zfx, Ronin and Klf4 are associated with Oct4/Sox2/Nanog-occupied loci. Oct4, Sox2 and Nanog are the three transcription factors in the first bin. Binding was called at a high-confidence (p < 10^-9) threshold within a 50 bp window, so the actual number of factors bound to Oct4/Sox2/Nanog-occupied loci is somewhat higher.

C) Frequency distribution showing how often c-Myc, Tcf3, Smad1, STAT3, Esrrb, Tbx3, Zfx, Ronin and Klf4 are associated with Oct4/Sox2/Nanog-occupied genes (p < 10^-9).

D) Gene tracks showing example of an actively transcribed gene (Max) occupied by an Oct4/Sox2/Nanog enhancer and other transcription factors implicated in ESC control. ChIP-Seq data was obtained from GSE11431, GSE11724, GSE12680 and GSE22557.
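The frequency distributions in panels B and C amount to counting, for each Oct4/Sox2/Nanog-occupied locus, how many other factors have a binding call nearby. A minimal sketch of that kind of tally follows. The peak positions are made up, the helper `count_cobound` is hypothetical, and the 50 bp window is interpreted here as "within 50 bp of the locus position", which is an assumption; this is not the analysis pipeline used to produce the figure.

```python
from collections import Counter


def count_cobound(core_loci, factor_peaks, window=50):
    """Count, per core locus, the factors with a peak within `window` bp.

    core_loci    -- list of (chromosome, position) tuples
    factor_peaks -- dict mapping factor name to a list of (chromosome, position)
    """
    counts = []
    for chrom, pos in core_loci:
        n_factors = sum(
            any(c == chrom and abs(p - pos) <= window for c, p in peaks)
            for peaks in factor_peaks.values()
        )
        counts.append(n_factors)
    return counts


# Made-up coordinates for illustration only.
core_loci = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
factor_peaks = {
    "c-Myc": [("chr1", 1020)],
    "Tcf3": [("chr1", 990), ("chr2", 310)],
    "Stat3": [("chr1", 5200), ("chr2", 340)],
}

counts = count_cobound(core_loci, factor_peaks)  # factors per core locus
histogram = Counter(counts)                      # loci per factor count
```

Because each factor is counted only when its peak falls inside the window and passes the calling threshold, a tally of this kind tends to undercount true co-occupancy, which is the caveat noted in the legend for panel B.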

Multiple enhancers and enhanceosomes

Enhancers are generally bound by multiple transcription factors, forming large nucleoprotein complexes called enhanceosomes, which permit cooperative binding between transcription factors and allow for synergistic and combinatorial effects on gene regulation (Maniatis et al., 1998). The cooperative interactions among transcription factors binding to adjacent DNA sites and to cofactor complexes explain why multiple transcription factors are found together in the genome and why transcription factors bind stably to only a small subset of the millions of DNA sequence motifs present in the vertebrate genome. Many genes have multiple enhancers and thus multiple enhanceosomes (Levine and Tjian, 2003). In Drosophila, these multiple, seemingly redundant enhancers have been shown to contribute to phenotypic robustness during embryonic development (Frankel et al., 2010; Hong et al., 2008). That is, normal levels of gene expression are obtained despite environmental and genetic variability so long as genes are equipped with multiple enhancers.

In addition to Oct4, Sox2, Nanog and c-Myc, the transcription factors Tcf3, Smad1, Stat3, Esrrb, Sall4, Tbx3, Zfx, Ronin, Klf2, Klf4, Klf5 and Prdm14 have been shown to play important roles in control of ESC state (Table 1). The ChIP-Seq data that have been obtained for these transcription factors indicate that they can bind to loci occupied by Oct4/Sox2/Nanog as well as other loci (Figure 3B), forming sites that have been called multiple transcription factor binding loci (MTL) (Chen et al., 2008b; Kim et al., 2008). Several lines of evidence indicate that most MTLs are enhancers. Most MTLs are occupied by the p300 cofactor, and the subset of MTLs that are occupied by Oct4/Sox2/Nanog are also occupied by the mediator cofactor (Chen et al., 2008b; Kagey et al., 2010). All Oct4/Sox2/Nanog-containing MTLs tested to date have been shown to exhibit enhancer activity (Chen et al., 2008b). It is therefore likely that functional enhanceosomes are formed at most MTLs.

The evidence obtained thus far suggests that most Oct4/Sox2/Nanog-regulated genes are co-occupied by one or more of the other transcription factors implicated in control of ESCs (Figure 3A–C). Examination of gene tracks for the Max gene reveals a typical pattern, where the promoter region contains a site bound by Oct4, Sox2, Nanog, Tcf3 and Esrrb, and various other sites occupied by c-Myc, Zfx, Ronin and Klf4 (Figure 3D). Thus the functions of the core regulators (Oct4, Sox2 and Nanog) are augmented by the functions of many of the other transcription factors implicated in control of ESC state at actively transcribed target genes.

Signaling to the core regulatory circuitry

A simple concept has emerged from the study of transcription factors associated with signaling pathways that have roles in control of ESC state: these particular factors tend to co-occupy enhancers bound by Oct4, Sox2 and Nanog, thereby allowing direct control of genes within the core circuitry by these signaling pathways (Figure 4) (Chen et al., 2008b; Cole et al., 2008; Tam et al., 2008). ESCs were initially cultured on a layer of irradiated fibroblasts in order to obtain the necessary factors for self-renewal and pluripotency (Smith, 2001; Smith and Hooper, 1983). LIF, Wnt and ligands of the TGF-β/BMP signaling pathway were among factors supplied by the fibroblasts and found to influence murine ESC state (Okita and Yamanaka, 2006; Sato et al., 2004; Smith et al., 1998; Williams et al., 1988; Ying et al., 2003). Transcription factors associated with LIF, Wnt and BMP4 signaling pathways (Stat3, Tcf3 and Smad1) bind to loci bound by Oct4, Sox2 and Nanog (Chen et al., 2008a; Cole et al., 2008; Tam et al., 2008; Wu et al., 2006; Zhang et al., 2006). Thus, signals mediated by these pathways are delivered directly to the enhancers of genes within the core regulatory circuitry and can thereby have profound effects on pluripotency and self-renewal. This likely explains why manipulation of certain signaling pathways can enhance reprogramming (Lluis et al., 2008; Marson et al., 2008a).

Figure 4. Signaling to core regulatory circuitry

A) Model of an enhancer where transcription factors associated with LIF, Wnt and BMP4 signaling (STAT3, Tcf3 and Smad1, respectively) occupy sites near the core regulators.

B) Oct4 distal enhancer provides an example of a DNA element that is bound by the core regulators and signaling transcription factors and contains sequence motifs for each of these factors.

C) Frequency distribution showing how often signaling transcription factors (STAT3, Tcf3 and Smad1) are associated with Oct4/Sox2/Nanog-bound loci throughout the genome. Binding was called at a high-confidence (p < 10^-9) threshold within a 50 bp window, so the actual number of signaling transcription factors bound to Oct4/Sox2/Nanog-occupied loci is somewhat higher.

ChIP-Seq data was obtained from GSE11431, GSE11724, GSE12680 and GSE22557.

Transcriptional Cofactors

Cofactors include protein complexes that contribute to activation (coactivators) and repression (co-repressors) but do not have DNA-binding properties of their own. Some cofactors mobilize or modify nucleosomes and in these cases they are also considered chromatin regulators. Cofactors are generally expressed in most cell types, but it appears that ESCs are more sensitive than somatic cells to reduced levels of certain cofactors and chromatin regulators (Fazzio and Panning, 2010; Kagey et al., 2010).

Transcription factors that occupy active enhancers bind coactivators such as p300 and mediator, which in turn bind and control the activity of the transcription initiation apparatus (Conaway et al., 2005; Malik and Roeder, 2005; Roeder, 1998; Taatjes, 2010). The p300 cofactor occupies most active promoters in ESCs (Chen et al., 2008b). Reduced levels of p300 do not appear to adversely affect ESCs, but rather have a profound effect on ESC differentiation (Chen et al., 2008b; Zhong and Jin, 2009).

Recent studies have shown that mediator physically links Oct4/Sox2/Nanog-bound enhancers to the promoters of active genes in the core regulatory circuitry of ESCs (Figure 5)(Kagey et al., 2010). The form of mediator recruited to the Oct4/Sox2/Nanog-regulated promoters associates with the cohesin loading factor Nipbl, which provides a mechanism for cohesin loading at these sites. The mediator/cohesin complex forms a looped chromosome architecture between enhancers and core promoters that is necessary for normal gene activity. Mediator and cohesin co-occupy different promoters in different cell types, thus generating cell-type-specific DNA loops associated with the gene expression program of each cell.

Figure 5. Mediator and cohesin contribute to gene control in core circuitry

A) ChIP-Seq data at the Pou5f1 gene for transcription factors, mediator and cohesin, and the transcription apparatus (Pol2 and TBP). Note evidence for crosslinking of most components to both enhancer elements and core promoter. The numbers on the Y-axis are reads/million. ChIP-Seq data was obtained from GSE11431, GSE11724, GSE12680 and GSE22557.

B) Model for DNA looping by mediator and cohesin. Oct4, Sox2 and Nanog bind mediator, which binds RNA polymerase II at the core promoter, thus forming a loop between the enhancer and the core promoter. The transcription activator-bound form of mediator binds the cohesin loading factor Nipbl, which provides a means to load cohesin. Both mediator and cohesin are necessary for normal gene activity. This model contains a single DNA loop, but multiple enhancers may be bound simultaneously, generating multiple loops.

The CDK8 kinase subunit of the mediator complex can influence the activity of signaling transcription factors (Alarcon et al., 2009; Fryer et al., 2004; Gao et al., 2009; Taatjes, 2010). For example, CDK8-mediated phosphorylation of the linker region within SMAD1/5 or SMAD2/3 complexes can activate these transcription factor complexes, but it also targets them for proteasomal degradation. A dynamic cycle of transcription factor activation and destruction ensures that continuous pathway activation is necessary for continuous gene activation and may facilitate rapid changes in cell state when signaling is altered.

ESCs are also sensitive to changes in the levels of the Paf1 complex, which is associated with RNA polymerase II at active genes (Ding et al., 2009). Based on studies in yeast, the Paf1 complex couples transcription initiation and elongation with histone H3K4 and H3K36 methylation (Krogan et al., 2003). In ESCs, the Paf1 complex may also play this role, as knockdowns lead to reduced levels of histone H3K4me3 at actively transcribed genes (Ding et al., 2009).

Corepressors that have been implicated in control of ESC state include Dax1, Cnot3 and Trim28 (Hu et al., 2009; Sun et al., 2009). Overexpression of Dax1 causes ESC differentiation, likely due to an inhibitory interaction with Oct4 (Sun et al., 2009). Cnot3 and Trim28 co-occupy many promoters with c-Myc and Zfx, and likely contribute to control of proliferation and self-renewal. They differ somewhat in the additional promoters they occupy, which might explain why loss of Cnot3 causes ESCs to differentiate into trophectoderm while loss of Trim28 causes cells to differentiate into the primitive ectoderm lineage (Hu et al., 2009).

Chromatin Regulators in ESC Gene Activity and Silencing

Eukaryotic genomes are packaged into nucleosomes (Kornberg and Thomas, 1974; Olins and Olins, 1974), which provide a means to compact the genome and to influence gene expression. Early studies showed that nucleosomes can affect transcription in vitro (Knezetic and Luse, 1986; Lorch et al., 1987) and in vivo (Han and Grunstein, 1988; Kayne et al., 1988). Subsequent studies revealed that gene expression can be influenced by proteins that modify histones (Brownell et al., 1996) or mobilize nucleosomes (Cote et al., 1994; Imbalzano et al., 1994; Kwon et al., 1994). Chromatin regulators are generally recruited to genes by DNA-binding transcription factors, the transcription apparatus, or specific RNA species (Guenther and Young, 2010; Li et al., 2007; Roeder, 2005; Surface et al., 2010). The chromatin regulators that are known to contribute to the control of ESC state fall into two classes: histone-modifying enzymes and ATP-dependent chromatin remodeling complexes (Table 1).

Histone-Modifying Enzymes

The chromatin regulators known to have the most profound impact on ESC state are histone-modifying enzymes that repress genes encoding lineage-specific developmental regulators. These include the Polycomb group (PcG) protein complexes, SetDB1 and TIP60-p400. Polycomb group (PcG) and Trithorax group (TrxG) genes were discovered in Drosophila melanogaster as repressors and activators of Hox genes (Schuettengruber et al., 2007). TrxG proteins catalyze trimethylation of histone H3 lysine 4 (H3K4me3) at the promoters of active genes and facilitate maintenance of active gene states during development, in part by antagonizing the functions of PcG proteins. PcG protein complexes catalyze ubiquitylation of histone H2A lysine 119 (H2AK119u) and trimethylation of histone H3 lysine 27 (H3K27me3) and function in ESCs to help silence genes encoding key regulators of development, yet allow them to remain in a state that is “poised” for activation during differentiation (Azuara et al., 2006; Bernstein et al., 2006; Boyer et al., 2006; Bracken et al., 2006; Endoh et al., 2008; Lee et al., 2006; Li et al., 2010; Pan et al., 2007; Pasini et al., 2010; Peng et al., 2009; Shen et al., 2009; van der Stoop et al., 2008). PcG proteins are thought to inhibit transcription, at least in part, by restraining poised RNA polymerase molecules (Stock et al., 2007; Zhou et al., 2008b). ESCs lacking PcG protein complexes are unstable and tend to differentiate, but fail to execute differentiation programs appropriately (Leeb et al., 2010).

Multiple histone H3 lysine 9 methyltransferases have been implicated in control of ESC state (Bilodeau et al., 2009; Yeap et al., 2009; Yuan et al., 2009). A subset of the silent genes that encode lineage-specific developmental regulators, including those involved in generating the extraembryonic trophoblast lineage, are occupied and repressed by SetDB1, which catalyzes methylation of histone H3 lysine 9. Thus, multiple repressive mechanisms, involving methylation of H3K27 and H3K9 and ubiquitylation of histone H2A, are used to silence genes encoding lineage-specific developmental regulators.

The Tip60-p400 complex has multiple activities, among which is histone acetylation, and loss of this complex affects ESC morphology and state (Fazzio et al., 2008). It is found associated with active promoters in ESCs and appears to be recruited in two ways, directly by the H3K4me3 mark and indirectly by Nanog. Interestingly, the complex is also associated with nucleosomes with H3K4me3 at PcG-occupied genes encoding lineage-specific regulators, where it apparently facilitates repression of these poised genes. Because Tip60-p400 is generally found associated with active genes, its repressive function may derive from its potential role in facilitating transcription of ncRNAs that recruit or stabilize PcG complexes, as described below.

ATP-dependent nucleosome remodeling

ATP-dependent nucleosome remodeling complexes can be recruited by transcription factors and modified histones to the promoters of genes, where they enhance or reduce the access of transcriptional components to DNA sequences with resulting positive or negative effects on gene activity (Clapier and Cairns, 2009; Ho and Crabtree, 2010). Components of multiple ATP-dependent nucleosome-remodeling complexes have been implicated in control of ESC state (Table 1) (Bilodeau et al., 2009; Gaspar-Maia et al., 2009; Ho and Crabtree, 2010; Klochendler-Yeivin et al., 2000; Schnetz et al., 2010). A complex purified from ESCs called esBAF has been shown to be associated with the promoters of genes under the control of the core regulators Oct4, Sox2 and Nanog, and core subunits of this complex are essential for ESC maintenance (Ho and Crabtree, 2010). Chd1, a member of the chromodomain helicase DNA-binding (CHD) family of ATP-dependent chromatin remodelers, is associated with the promoters of active genes, and Chd1-deficient ESCs are incapable of giving rise to primitive endoderm (Gaspar-Maia et al., 2009). Another member of the CHD family, Chd7, is associated with active Oct4/Sox2/Nanog-bound enhancers in ESCs, where it is thought to fine-tune the expression levels of ESC-specific genes (Schnetz et al., 2010). Unlike mutations in esBAF and Chd1, which affect ESC state, the effects of changing Chd7 dosage are subtle and do not appear to affect pluripotency or self-renewal. Thus, multiple ATP-dependent nucleosome remodeling complexes are present at many key ESC genes.

Non-coding RNAs in ESC Regulatory Circuitry

The idea that noncoding RNA (ncRNA) might regulate genes was proposed at the dawn of studies on regulation of gene expression (Britten and Davidson, 1969; Jacob and Monod, 1961). It is now clear that noncoding RNA is involved in regulation of many important biological processes, including X inactivation, dosage compensation, imprinting, polycomb repression and silencing of repeated elements, as described in several recent reviews (Surface et al., 2010; Wilusz et al., 2009; Zaratiegui et al., 2007). Indeed, a variety of noncoding RNA species have been implicated in control of ESC state (Table 1). These include miRNAs, which can regulate the stability and translatability of mRNAs and, acting in this fashion, are essential for normal ESC self-renewal and cellular differentiation. They also include longer noncoding RNAs of various types, which have been implicated in recruitment of chromatin regulators such as the PcG complexes (Bracken and Helin, 2009; Guenther and Young, 2010; Surface et al., 2010; Wilusz et al., 2009).

miRNAs and control of ESC identity

Multiple lines of evidence indicate that miRNAs contribute to the control of early development. ESCs deficient in miRNA-processing enzymes show defects in differentiation and proliferation (Kanellopoulou et al., 2005; Murchison et al., 2005; Wang et al., 2007). Two key themes have emerged from studying the regulation of miRNA genes in ESCs (Marson et al., 2008b). First, the core regulators Oct4/Sox2/Nanog activate genes for miRNAs that are preferentially expressed in ESCs, and these miRNAs contribute to cell state maintenance and cell state transitions by fine-tuning the expression of key ESC genes and by promoting the rapid clearance of ESC transcripts during differentiation. Second, the core regulators co-occupy repressed lineage-specific miRNA genes with SetDB1 and PcG complexes, thus poising them for expression during differentiation.

The core circuitry controls the expression of miRNAs that fine-tune the expression of key transcripts and promote the rapid clearance of ESC-specific transcripts during differentiation (Figure 6). Several miRNA polycistrons that specify the most abundant miRNAs in ESCs, and that are silenced during early differentiation, are positively regulated by Oct4/Sox2/Nanog (Marson et al., 2008b). These include the mir-290–295 cluster; miRNAs with seed sequences in this family have been implicated in cell proliferation (He et al., 2005; O'Donnell et al., 2005; Wang et al., 2008) and have been shown to rescue the proliferation defects observed in miRNA-deficient ESCs (Kanellopoulou et al., 2005; Murchison et al., 2005; Wang et al., 2008; Wang et al., 2007). Furthermore, the zebrafish homolog of this miRNA family, miR-430, contributes to the rapid degradation of maternal transcripts in early zygotic development (Giraldez et al., 2006), and this miRNA family also promotes the clearance of transcripts in early mammalian development (Farh et al., 2005).

Figure 6. ESC core regulatory circuitry and differentiation

A) This model of core regulatory circuitry incorporates selected protein-coding and miRNA target genes. Oct4, Sox2 and Nanog directly activate transcription of genes whose products include the spectrum of transcription factors, cofactors, chromatin regulators and miRNAs that are known to contribute to ESC state. Oct4, Sox2 and Nanog are also associated with SetDB1 and PcG-repressed protein-coding and miRNA genes that are poised for differentiation.

B) The loss of ESC state during differentiation involves the silencing of the Oct4 gene, the proteolytic destruction of Nanog by caspase-3, and miRNA-mediated reduction in Oct4, Nanog and Sox2 mRNA levels.

The core transcription factors and PcG complexes co-occupy genes for miRNAs that are repressed in ESCs, but become selectively expressed in cells of the immune system (mir-155), pancreatic islets (mir-375), neural cells (mir-124 and mir-9), and differentiating ESCs (mir-296) (Figure 6A)(Marson et al., 2008b). This set of miRNA genes is thus poised to contribute to cell-fate decisions during development in the same fashion as genes encoding lineage-specific transcription factors that are co-occupied by the core regulators and PcG complexes. For example, mir-296, which is rapidly induced upon ESC differentiation, targets Nanog mRNA (Tay et al., 2008).

ncRNAs and Polycomb-mediated Silencing

Recent studies indicate that various ncRNA molecules recruit or stabilize PcG complexes at specific sites in the genome. Specific ncRNA molecules have recently been shown to recruit the PcG complexes to the X-inactivation center (Xic), the Kcnq1 domain, the INK4b/ARF/INK4a locus, the HOXD locus and potentially many other genomic loci in various cell types (Gupta et al., 2010; Pandey et al., 2008; Rinn et al., 2007; Tsai et al., 2010; Yap et al., 2010; Zhao et al., 2008). Short ncRNAs are transcribed bi-directionally by RNA polymerase II from most CpG islands (Core et al., 2008; Guenther et al., 2007; He et al., 2008; Seila et al., 2008), and some of these are able to recruit PcG complexes (Guenther and Young, 2010; Kanhere et al., 2010; Surface et al., 2010). Most PcG-occupied genes in ESCs show some evidence of RNA polymerase II binding or transcripts, but it is challenging to detect low or transient levels of RNA species that serve to initiate Polycomb-mediated silencing at these loci. Thus, much remains to be understood about the role of ncRNAs in PcG complex recruitment.

Transitioning from ES to Specialized States

The ESC regulatory circuitry is reconfigured when cells are stimulated to differentiate (Figure 6). The mechanisms that contribute to the loss of ESC state include the silencing of the Oct4 gene (Ben-Shushan et al., 1995; Feldman et al., 2006), the proteolytic destruction of Nanog by caspase-3 (Fujita et al., 2008), and miRNA-mediated reduction in Oct4, Nanog and Sox2 mRNA levels (Tay et al., 2008). The mechanisms that influence exit from the ESC state and entrance into new states include the activation of selected cell lineage-specific regulatory genes, downregulation of the miRNA regulator Lin28 with consequent maturation of the Let-7 miRNA (Viswanathan et al., 2008), activation of pioneer transcription factors that have occupied cell type-specific enhancers silently and thus without active promotion of transcription (Zaret et al., 2008), and modifications to the subunit composition of mediator, BAF and TFIID complexes (Deato et al., 2008; Deato and Tjian, 2007; Ho and Crabtree, 2010; Taatjes, 2010).

Insights into Disease Mechanisms

The study of ESC control has provided new insights into mechanisms that are involved in several human diseases. For example, improved understanding of the functions of transcription factors such as c-Myc, cofactor complexes such as mediator and cohesin, and chromatin regulators such as TrxG and PcG has provided new insights into the molecular pathways affected by mutations in these regulators.

Key aspects of the ESC gene expression program are recapitulated in cancer cells (Ben-Porath et al., 2008), and it has been argued that this is largely a consequence of c-Myc (Kim et al., 2010). c-Myc amplification is the most frequent somatic copy-number amplification in tumor cells (Beroukhim et al., 2010). Tumor cells that overexpress c-Myc have enhanced expression of proliferation genes, and this is likely due to the role of c-Myc in recruiting P-TEFb to effect RNA polymerase II pause release at these genes (Rahl et al., 2010). This insight suggests that therapeutic agents that target control of transcription elongation may be valuable for treating tumors that overexpress c-Myc.

Mutations in the genes encoding mediator, cohesin and the cohesin loading factor Nipbl can cause an array of human developmental syndromes and diseases. Mediator mutations have been associated with Opitz-Kaveggia (FG) syndrome, Lujan syndrome, schizophrenia, Transposition of the Great Arteries (TGA) syndrome and colon cancer progression (Ding et al., 2008; Firestein et al., 2008; Muncke et al., 2003; Philibert and Madan, 2007; Risheg et al., 2007; Schwartz et al., 2007). Mutations in Nipbl and cohesin are responsible for most cases of Cornelia de Lange syndrome, which is characterized by developmental defects and mental retardation and appears to be the result of mis-regulation of gene expression rather than chromosome cohesion or mitotic abnormalities (Krantz et al., 2004; Strachan, 2005; Tonkin et al., 2004). Knowledge that mediator, Nipbl and cohesin are linked at active promoters suggests therapies that might compensate for partial loss of transcriptional activity. The CDK8 kinase resides within a subcomplex of mediator that has repressive activities (Knuesel et al., 2009; Taatjes, 2010), so it is conceivable that small molecule antagonists of CDK8 would lead to an increase in transcriptionally active mediator/cohesin assemblies.

Mutations that affect the functions or levels of TrxG and PcG chromatin regulators have been implicated in a variety of cancers (Bracken and Helin, 2009; Krivtsov and Armstrong, 2007). The study of these regulators in ESCs and in cancer cells has revealed how repression of lineage-specific transcription factors and cell cycle regulators may contribute to cancer phenotypes. Chromatin regulators with enzymatic activities are a new class of targets for small molecule drug discovery and we can expect new developments in this field in the near future.

Summary and Outlook

How do regulators of the ESC gene expression program produce a self-renewing cell capable of differentiating into all the cells of the adult? Part of the answer is that the core transcription factors establish autoregulatory loops that help maintain their own expression, activate transcription of a large fraction of the active genes, and contribute to the poised state of lineage-specific genes. The core transcription factors frequently share enhancers with signaling transcription factors, so signal transduction pathways can deliver signals directly to the genes regulated by the core factors. At actively transcribed genes, other transcription factors implicated in proliferation and other aspects of self-renewal bind to sites that can be separate from the core enhancers, and modulate RNA expression levels through mechanisms that include release of paused polymerases. The core factors help create a poised state by recruiting SetDB1 and PcG chromatin regulators to genes encoding lineage-specific factors through mechanisms that are not yet fully understood.

Several of the regulatory features of ESCs probably operate to control cell identity in other cell types. The idea that cells may rely on a small number of master transcription factors for control of cell state is supported by reprogramming and transdifferentiation experiments, and this concept warrants further study in additional cell types (Graf and Enver, 2009; Vierbuchen et al., 2010; Zhou et al., 2008a). Identification of the master transcription factors for all cell types would improve our understanding of cell identity and could facilitate cell-based therapies. Signaling pathways can transmit information to enhancers bound by master regulators in ESCs and it will be useful to know if this extends to other cell types. It will be interesting to determine whether master transcription factors generally recruit mediator and cohesin to cell-type specific enhancers and if they generally function to activate large populations of genes necessary for cell identity. Finally, it will be important to learn if the differentiation potential of cells is simply a function of the set of genes selected for transcriptional activity together with the set of genes encoding lineage-specific regulators that are selected for silencing.

Acknowledgements

Many ideas discussed in this review emerged from conversations with Steve Bilodeau, Laurie Boyer, Megan Cole, Joan and Ron Conaway, Jerry Crabtree, Job Dekker, David Gifford, Amanda Fisher, Garrett Frampton, Matthew Guenther, Kristian Helin, Rudolf Jaenisch, Richard Jenner, Michael Kagey, Tony Lee, Stuart Levine, Charles Lin, John Lis, Alexander Marson, Alan Mullen, Jamie Newman, Huck Ng, Stuart Orkin, David Orlando, Renato Paro, Peter Rahl, Peter Reddien, Robert Roeder, Phillip Sharp, Dylan Taatjes, Ken Zaret, Len Zon, Robert Weinberg and Thomas Zwaka. I am also grateful to David Orlando and Steve Bilodeau for help with data analysis and figures.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References