POLYCOMB GROUP COMPLEXES – MANY COMBINATIONS, MANY FUNCTIONS

Author manuscript; available in PMC: 2011 Feb 13.

Published in final edited form as: Trends Cell Biol. 2009 Nov 4;19(12):692–704. doi: 10.1016/j.tcb.2009.10.001

Abstract

Polycomb Group (PcG) proteins are transcription regulatory proteins that control the expression of a variety of genes from early embryogenesis through birth to adulthood. PcG proteins form several complexes that are thought to collaborate to repress gene transcription. Individual PcG proteins have unique characteristics and mutations in genes encoding different PcG proteins cause distinct phenotypes. Histone modifications have important roles in some PcG protein functions, but they are not universally required. The mechanisms of gene-specific recruitment, transcription repression, and selective derepression of genes by vertebrate PcG proteins are incompletely understood. Future studies of this enigmatic group of developmental regulators are certain to produce unanticipated discoveries.

Introduction

Appropriate utilization of the information stored in the genome is controlled by a variety of regulatory proteins that associate with specific genes or genomic regions. Many of these proteins do not bind DNA directly, but are recruited to chromatin through interactions with histones or other DNA binding proteins. In this perspective, I review some of the findings from in vitro and cell-based studies of chromatin binding by Polycomb Group (PcG) proteins with the goal of arriving at a synthesis of the principles governing chromatin association by this intriguing group of transcription regulatory proteins. I reflect on the complementary information that can be obtained using single cell imaging compared with approaches that provide information about the average properties of a cell population. I address some of the apparent contradictions that have arisen from studies using different experimental approaches and attempt to point out ways to resolve these conflicts.

PcG genes were discovered in a screen of mutations affecting Drosophila development 1. In a series of studies in Drosophila, PcG proteins were found to control the activities of homeotic genes, and to do so in a manner that challenged conventional models of gene regulation. PcG proteins do not affect the initial pattern of homeotic gene expression, but they are required along with counteracting trithorax group (trxG) proteins to maintain the appropriate pattern of gene expression after the proteins that originally established the pattern are no longer present. PcG proteins are traditionally classified as epigenetic regulatory proteins. This class of regulatory mechanisms is characterized by the inheritance of a state of gene expression through multiple rounds of cell division, apparently independently of information provided by the DNA sequence.

Histone modifications are potential mediators of epigenetic regulatory mechanisms since they are transferred to daughter chromatids, albeit diluted by newly assembled nucleosomes. Approximately 60 different residues that can be modified have been identified in core histones. When the different modification states of each residue are considered and all theoretical binary combinations of known modifications are counted, the total number of combinations of modifications of a nucleosome is comparable to the number of genes in mammalian genomes. However, only a small number of histone modifications correlate with transcriptional activity and the causal relationship between transcription activation and many of these modifications remains unclear. Putative combinatorial mechanisms of gene regulation have been identified for a handful of combinations of histone modifications. Thus, the known repertoire of regulatory functions of histone modifications is a minute fraction of their theoretical potential. It is likely that histone modifications operate in concert with other regulatory mechanisms to achieve selective control of individual target genes.
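The scale of this combinatorial space can be illustrated with a rough count. The sketch below is only illustrative: the number of modification states per residue (`states_per_residue`) is a hypothetical parameter that the text does not specify, and the count covers only pairs of modifications located on two different residues.

```python
from math import comb


def pairwise_combinations(n_residues: int, states_per_residue: int) -> int:
    """Count pairs of modifications located on two different residues.

    Assumes, purely for illustration, that every residue has the same
    number of possible modified states.
    """
    residue_pairs = comb(n_residues, 2)  # choose two distinct residues
    return residue_pairs * states_per_residue ** 2  # one state on each residue


if __name__ == "__main__":
    # 60 modifiable residues, as stated in the text; states are assumed.
    for states in (1, 2, 3, 4):
        print(states, pairwise_combinations(60, states))
```

With about 60 residues and an assumed three or four states per residue, the count falls in the tens of thousands, the same order of magnitude as the number of genes in a mammalian genome.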

Many genes homologous to Drosophila PcG genes have been identified in vertebrates as well as in other multicellular eukaryotes. Mutations in some of these genes cause axial transformations and alter the expression patterns of homeotic genes 2–17. The similarities in these characteristics indicate that some of the functions of PcG proteins have been conserved during evolution. The number of genes encoding PcG proteins has expanded in mammals, suggesting a greater complexity of functions and the likely addition of new molecular mechanisms to the basic repertoire present in Drosophila.

In this review, I focus on vertebrate PcG proteins and refer to PcG proteins in Drosophila and other invertebrates only in cases where such comparisons are instructive. I focus on the core PcG proteins and discuss their interaction partners only in cases where their roles in PcG complexes are understood. PcG protein functions are counteracted by other regulatory protein complexes (Trithorax Group and histone demethylases among others), but I discuss these complexes only insofar as their activities are directly related to PcG functions. Finally, I discuss the functions of PcG proteins mainly in the context of normal development and differentiation with an emphasis on long-term control of gene expression. I regretfully bypass the important roles of these proteins in tumorigenesis as well as their regulation by transient signals and the cell cycle. For readers interested in the wide range of topics not covered, I recommend several reviews 18–30.

Biochemical studies of Drosophila PcG proteins as well as their mammalian homologues have revealed that they form at least two classes of complexes designated polycomb repressive complexes 1 and 2 (PRC1 and PRC2). Each of these classes includes several known complexes with distinct compositions and functions. There is potential for great combinatorial diversity in each class of complexes, and the PRC1 and PRC2 designations should not be taken to represent categories with uniform characteristics.

PRC1

PRC1 class complexes contain four core subunits homologous to Drosophila Polycomb (Pc), Sex combs extra (dRing1/Sce), Polyhomeotic (Ph), and Posterior sex combs (Psc) proteins. Each of these proteins has multiple homologues in vertebrates, classified respectively as the Cbx, Ring1, Phc, and Bmi1/Mel18 families (Fig. 1a). For convenience, I will refer to this class of complexes as PRC1, but it is important to keep in mind that complexes formed by different combinations of related proteins can have distinct functions. The diversity of PRC1 functions is further expanded by numerous interaction partners that can associate with one or more core PRC1 proteins, including RYBP, MBLR, NSPc1, SCML2 and L(3)MBT.

Figure 1.

Complexes formed by vertebrate PcG proteins. The subunits of core PRC1 (a) and PRC2 (b) complexes are indicated. It is likely that many combinations of subunits can associate with each other, but the permutations that form complexes have not been established. Each subunit is expressed in a distinct set of cells and tissues, so the compositions of the complexes are likely to vary depending on the cell type. Many interactions between individual subunits have been identified, and the contacts shown in the diagrams are not intended to represent those required for complex assembly. The unlabeled white ovals indicate that many PcG protein interaction partners are not shown and that many more remain to be identified. It is also likely that some complexes contain only a subset of the PcG proteins indicated.

The individual PRC1 proteins and the domains conserved in these proteins have few known functions that are independent of the complex. A notable exception is the ability of the chromodomains of Cbx family proteins and Drosophila Pc to bind H3 peptides containing di- or trimethylated lysine residues 31. Mammalian cells contain five Cbx family proteins whose chromodomains can bind H3 peptides trimethylated on K9 or K27 32. The binding affinities are relatively low (10 to >500 micromolar), and they differ only 2–3 fold between peptides trimethylated on K9 and on K27 33. These chromodomains do not bind H4 peptides trimethylated on K20 or monomethylated peptides. In addition, Ring1b contains a RING domain and has E3 ubiquitin ligase activity for H2A 34. The level of H2A ubiquitination in cells is dramatically reduced by conditional deletion of Ring1b and Ring1a 35.
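To give a sense of what dissociation constants in this range imply, the following sketch computes the fraction of chromodomain bound at a given peptide concentration. It assumes simple single-site equilibrium binding with the peptide in excess, which the text does not state; the function name and the example concentrations are hypothetical.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fraction of binding sites occupied at equilibrium (single-site model).

    Both arguments are in micromolar; the ligand is assumed to be in excess.
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand_um must be >= 0 and kd_um must be > 0")
    return ligand_um / (kd_um + ligand_um)


if __name__ == "__main__":
    # Hypothetical example: Kd values spanning the range quoted in the text,
    # including two that differ 3-fold (10 and 30 micromolar).
    for kd in (10.0, 30.0, 500.0):
        print(f"Kd = {kd:5.0f} uM -> fraction bound at 10 uM peptide: "
              f"{fraction_bound(10.0, kd):.2f}")
```

Under this model, a 3-fold difference in dissociation constant changes occupancy by at most 3-fold at any peptide concentration, so affinity differences of this size provide only modest discrimination between K9 and K27 trimethylated peptides.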

PRC2

PRC2 class complexes contain the Enhancer of zeste [E(Z)], Extra sex combs (Esc) and Suppressor of zeste 12 (Suz12) proteins in Drosophila and various combinations of their homologues in vertebrates (Ezh2/Ezh1, Eed and Suz12) 36–39 (Fig. 1b). Canonical PRC2 complexes have histone H3 lysine 27 methyltransferase activity toward nucleosomal substrates and appear to be the primary, and possibly the only, enzymes that produce di- and trimethylated H3 K27. In the interest of brevity, I will refer to both di- and trimethylation of H3 K27 as H3 K27 trimethylation. This is because in most studies only one of these modifications has been examined, and in many cases where both have been examined, they exhibit qualitatively similar properties. PRC2 can interact with additional proteins including RBBP4 (RbAp48/46), PHF1 (Polycomblike in Drosophila), AEBP2, and YY1 40–43.

Mammals have several complexes that contain different Eed isoforms in association with Ezh2, Suz12 and other proteins (PRC2, 3 and 4) 44, 45. Histone H1 inhibits H3 K27 trimethylation by these complexes in vitro and is a preferred substrate of some of the complexes (PRC2 and PRC4). Mammals also have PRC2 complexes that contain Ezh1 in association with Eed and Suz12. These complexes maintain H3 K27 trimethylation in Ezh2−/− ES cells at about a third of the genes trimethylated on H3 K27 in wild type cells 46. These complexes have little H3 K27 methyltransferase activity in vitro and can regulate gene expression independently of H3 K27 trimethylation 47. In the remainder of this review, I refer to all of these complexes generically as PRC2 since most studies have not distinguished among the various complexes and have used this name. It is important to keep in mind that this group of complexes is functionally diverse and that some of them have little H3 K27 methyltransferase activity in vitro despite their structural and evolutionary relationship with Drosophila PRC2.

The H3 K27 methyltransferase activity of PRC2 and the ability of the chromodomains of Cbx family proteins to bind trimethylated H3 K27 in vitro have given rise to the model that H3 K27 trimethylation by PRC2 is required for PRC1 recruitment to target genes (Fig. 2, Gene D). This model has been challenged by several observations of chromatin binding by various PRC1 proteins independent of H3 K27 trimethylation 48–51. Nevertheless, this model has retained its popular appeal, in part because of the complementary lysine methyltransferase and methyl-lysine binding activities of PRC2 and PRC1 and their functional relationships.

Figure 2.

Speculative models for PcG protein complex recruitment to target genes. PcG proteins could be recruited by sequence-specific DNA binding proteins (Genes A and C), non-coding RNA molecules (Genes B and D), or a combination of the two. These complexes could recruit either PRC1 (Genes A and B) or PRC2 alone (Gene C) or PRC2 followed by PRC1 (Gene D). Binding by the recognition factors and the PRC complexes could be sequential as shown or they could bind together. The recruitment of PRC1 to Gene D requires H3 K27 trimethylation, whereas the initial recruitment of PcG proteins to the other genes occurs independently of H3 K27 trimethylation. The pink ovals represent sequence-specific DNA binding proteins and the pink squiggles represent non-coding RNAs. The grey cylinders represent nucleosomes and the grey bar is H1. The light green circles are trimethylated H3 K27, and the dark green circle is trimethylated H1 K26. The other shapes and colors are the same as in Figure 1. Different shades of each color represent different members of protein families. Other putative mechanisms of PcG protein recruitment to target genes, including direct recognition of DNA sequence and/or structural elements, are not shown.

It is likely that PRC1 and PRC2 class complexes have both functions that are independent of each other and functions that are interdependent. Mutations in genes encoding subunits of either class of complexes can affect the recruitment of subunits of the other complex to selected genes under some conditions, but do not affect recruitment to all genes or under all conditions 52, 53.

Functions of PcG Proteins in Vertebrate Development

The functions of PRC1 and PRC2 proteins in mice have been investigated by targeted mutational analysis. Mutations in genes encoding most PRC1 proteins cause transformation of individual segments of the axial skeleton as well as various hematopoietic and neurological abnormalities 2–17. In contrast, deletion of Ring1b causes defective gastrulation 54. Mutations in the genes encoding PRC2 proteins, each of which eliminates most detectable H3 K27 trimethylation, cause early embryonic lethality soon after gastrulation 55–57. Embryonic stem (ES) cells derived from these embryos can be propagated in vitro, but the mutations in PRC2 subunits have distinct effects on the differentiation characteristics of these ES cells 46, 52, 58–60. Conditional deletion of Ring1b in combination with a deletion in Ring1a is the only PcG mutant genotype that is incompatible with ES cell proliferation 53.

Mutations in PRC1 and PRC2 subunits have dissimilar effects on the phenotypes of many different cell types. The roles of PRC2 proteins later in development have been investigated in only a few cases because of the early embryonic lethality caused by mutations in the genes encoding these proteins. Conditional deletion of Ezh2 in hematopoietic cells causes selective arrest of B cell differentiation and defective IgH gene rearrangement 61. In contrast, deletion of Bmi1 causes selective depletion of hematopoietic stem cells 13. Mutations in Eed and Bmi1 have opposing effects on bone marrow progenitor cell proliferation 62. Thus, mutations in genes encoding mouse PRC1 and PRC2 proteins cause dissimilar phenotypes at multiple developmental stages affecting many different cell types. Some of these differences may be due to compensatory effects among related proteins, but others are likely to reflect independent functions of PRC1 and PRC2 proteins.

Nuclear Organization and Dynamics

Subnuclear localization

Many endogenous PcG proteins as well as exogenously expressed fusions to fluorescent proteins are enriched in subnuclear foci known as polycomb bodies in cultured cells and Drosophila embryos 63–71. The number and appearance of polycomb bodies vary between different cell types, from hundreds of diffraction-limited dots to a few amorphous regions encompassing up to 10% of the nuclear volume. Many PcG proteins co-localize with each other. This has been interpreted to indicate that they bind the same sites on chromatin. However, it is not clear if polycomb body formation requires chromatin association.

The subnuclear distributions of different Cbx family proteins fused to fluorescent proteins are dissimilar in ES cells, suggesting that they do not co-localize with each other 72. Many Cbx family proteins form subnuclear foci in undifferentiated ES cells, but disperse during ES cell differentiation. Focus formation by Cbx proteins requires multiple conserved regions that can interact with different partners. The changes in distribution during ES cell differentiation are likely to reflect changes in these interactions.

To visualize the subnuclear locations of Cbx protein association with histones, bimolecular fluorescence complementation (BiFC) analysis of Cbx protein interactions with H3 has been used (Box 1 Fig. 4). BiFC complexes formed by different Cbx proteins with H3 in ES cells have unique distributions both in interphase nuclei and on metaphase chromosomes 51. The distributions of some BiFC complexes differ between pluripotent ES cells and mouse embryo fibroblasts, suggesting that the genes targeted by these Cbx proteins change during differentiation. These results indicate that BiFC complexes formed by different Cbx proteins bind different chromosomal regions and have distinct target gene specificities.

Box 1

Bimolecular fluorescence complementation (BiFC) analysis of chromatin binding protein association with histones. The BiFC assay enables visualization of protein interactions in living cells. This assay is based on the association of two fragments of a fluorescent protein when they are brought in proximity to each other by an interaction between proteins fused to the fragments. The BiFC assay has been adapted for visualization of PcG protein binding to H3 (Box 1 Fig. 4) 51. Association of the fluorescent protein fragments is facilitated when Cbx family proteins bind H3 embedded in chromatin. This enables determination of the subnuclear locations of Cbx family protein binding to chromatin as well as the chromosomal regions occupied by each Cbx protein. The direct visualization of chromatin binding in living cells also enables tracking of chromatin-associated complexes over time. BiFC analysis provides a method for investigation of protein interactions with chromatin that is complementary to other approaches. Chromatin immunoprecipitation detects the occupancy of individual genes in a large number of cells. Fluorescence photobleaching detects properties of the total population of molecules in a single cell. BiFC analysis of chromatin association detects the subpopulation of proteins that bind to histones in a single cell.

Box 1.

Visualization of chromatin binding in cells using bimolecular fluorescence complementation analysis. The blue crescent represents a component of a nucleosome-binding protein complex. The cluster of spheres represents histones assembled in a nucleosome. The green half-cylinders represent fragments of fluorescent proteins. Nucleosome binding facilitates association of the fluorescent protein fragments and produces a fluorescent signal at the sites of chromatin association.

Despite the frequent localization of PcG proteins in subnuclear foci, the significance of this pattern remains unknown. The number of polycomb bodies observed in most vertebrate cells is orders of magnitude smaller than the number of genes targeted by PcG proteins in several cell types 52, 73, 74. If polycomb bodies are the only sites of chromatin association of PcG proteins, a very large number of genes must be associated with each body. Since these genes are distributed throughout all chromosomes, the association of these genes would need to be re-established after each round of cell division. It is also possible that polycomb bodies form at a small subset of PcG target genes. In either case, the functional roles of polycomb bodies remain to be identified.

Dynamics of PRC1 proteins in cells

Studies of the mobilities of PRC1 proteins in both Drosophila and mammalian cells using photobleach recovery have challenged the view that the stable repression of gene expression by PcG proteins involves the assembly of static chromatin-associated complexes. The mobilities of Pc-GFP and Ph-GFP are high in Drosophila pre-blastoderm embryos (photobleach recovery t1/2 < 1 s). Their mobilities are retarded during differentiation (t1/2 = 10–20 s) 75. The exchange rates of Pc-GFP and Ph-GFP on polytene chromosomes (60 s < t1/2 < 500 s) are consistent with transient PRC1 association with specific genetic loci.

Many PRC1 proteins associate transiently with chromatin in mammalian cells. Bmi1-GFP in U2OS cells has photobleach recovery kinetics consistent with two populations, one that has high mobility (t1/2 = 30–80 s) and another that is immobile 70. A large proportion (> 85%) of the total population of each Cbx family protein fused to Venus fluorescent protein has high mobility in ES cells (t1/2 < 30 s) 72. These rates are orders of magnitude faster than the mean exchange rates of core histones, but are similar to the exchange rate of HP1 in ES cells 72, 76. The mobilities of Cbx proteins decrease during ES cell differentiation, suggesting that the avidities or frequencies of Cbx protein interactions with chromatin or other nuclear scaffolds increase. A small subpopulation of Cbx proteins is apparently immobile in ES cells and this proportion increases during differentiation. The significance of the differences in dynamics among different subpopulations of Cbx proteins remains to be elucidated.
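The half-times and mobile fractions quoted above can be related through a simple recovery model. The sketch below assumes a single exponential recovery for the mobile population plus an immobile fraction that does not recover; this is a common simplification and not necessarily the model fitted in the cited studies. The function and parameter names are hypothetical.

```python
import math


def frap_recovery(t_s: float, half_time_s: float, mobile_fraction: float) -> float:
    """Normalized fluorescence recovered at time t_s after photobleaching.

    The mobile population recovers exponentially with the given half-time;
    the immobile remainder (1 - mobile_fraction) never recovers.
    """
    if half_time_s <= 0 or not 0.0 <= mobile_fraction <= 1.0:
        raise ValueError("half_time_s must be > 0 and mobile_fraction in [0, 1]")
    rate = math.log(2) / half_time_s  # first-order rate constant from t1/2
    return mobile_fraction * (1.0 - math.exp(-rate * t_s))


if __name__ == "__main__":
    # Hypothetical values: 85% mobile population with a 30 s half-time.
    for t in (0, 30, 60, 300):
        print(t, round(frap_recovery(t, 30.0, 0.85), 3))
```

In this model the recovery curve plateaus at the mobile fraction rather than at 1, so a plateau below full recovery is what indicates an apparently immobile subpopulation.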

Gene Regulation by PcG proteins

Genes occupied by PcG proteins

The genes that are occupied by PcG proteins in vertebrate cells have been investigated primarily by the use of chromatin immunoprecipitation (ChIP) assays. Analysis of PcG protein occupancy at most promoters in mouse ES cells revealed a high degree of co-occupancy among mouse Eed, Suz12, Ring1b and Phc1 at promoters enriched in H3 K27 trimethylation 73. Most of these promoters are also enriched in H3 K4 trimethylation, and are known as “bivalent domains” 77. Genome-wide analysis of Suz12 occupancy in human ES cells demonstrated association predominantly (95%) within 1 kb of sites of transcription initiation 74.

Many of the genes occupied by PcG proteins in mouse and human ES cells are induced upon differentiation and are derepressed in cells lacking PRC2 subunits 73, 74. The analysis of promoters occupied by Suz12 and Cbx8 in immortalized human embryonic lung fibroblasts revealed less overlap in occupancy and a smaller proportion of occupied genes that are trimethylated on K27 of H3 than in ES cells 49. Some of the genes are derepressed by knock-down of PRC2 proteins, but a significant proportion of the genes is repressed by PRC2 knock-down. These results are consistent with both positive and negative effects of PRC2 on the expression of genes occupied by Cbx8 and Suz12. The high degree of overlap between the genes occupied by these proteins indicates that promoter occupancy by PRC1 and PRC2 is interrelated, but this does not establish either the order of recruitment or how binding by one complex affects binding by the other.
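Co-occupancy in ChIP studies of this kind is often summarized as the overlap between sets of occupied promoters. The sketch below shows one way such an overlap could be quantified; the gene identifiers are invented placeholders, and the choice of overlap measures is an assumption rather than the analysis used in the cited work.

```python
def overlap_stats(set_a: set[str], set_b: set[str]) -> dict[str, float]:
    """Summarize the overlap between two sets of occupied promoters."""
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "shared": len(shared),
        # Fraction of each set that is also found in the other set.
        "frac_of_a": len(shared) / len(set_a) if set_a else 0.0,
        "frac_of_b": len(shared) / len(set_b) if set_b else 0.0,
        # Jaccard index: shared promoters over all promoters in either set.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real data.
    suz12_bound = {"geneA", "geneB", "geneC", "geneD"}
    cbx8_bound = {"geneB", "geneC", "geneE"}
    print(overlap_stats(suz12_bound, cbx8_bound))
```

Reporting the shared fraction relative to each set separately is useful because the two proteins may occupy very different numbers of promoters, in which case a single summary number can hide an asymmetric relationship.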

Suz12 has a high degree of co-occupancy with the H2A.Z histone variant at promoter regions in mouse ES cells 78. The genes occupied by H2A.Z change during ES cell differentiation into neuronal progenitors, and H2A.Z and H3 K27 trimethylation are not correlated in these cells. Knock-down of Suz12 in ES cells reduces H2A.Z occupancy and knock-down of H2A.Z reduces both Suz12 and Ring1b occupancy, suggesting that their binding is interrelated. The molecular mechanisms that mediate the co-occupancy of H2A.Z and PcG proteins in mouse ES cells remain to be elucidated.

Regulation of ES cell proliferation and differentiation by PcG proteins

The high degree of overlap between the genes occupied by Eed, Suz12, Ring1b and Phc1 on the one hand and the genes occupied by regulators of ES cell pluripotency (Oct3/4, Sox2 and Nanog) on the other 73, 74, 79 suggested that PcG proteins could participate in the maintenance of ES cell pluripotency. The early embryonic lethality of mice with mutations in genes encoding PRC2 proteins and the inhibition of ES cell proliferation by combined Ring1b and Ring1a deletions indicate that these PcG proteins are important for the control of early mouse development. The tendency of Eed mutant ES cells to differentiate spontaneously and the activation of many differentiation-specific genes in ES cells containing mutations in Eed, or in Ring1b and Ring1a or in Ezh2 combined with Ezh1 knockdown 46, 53 suggest that the PcG proteins encoded by these genes contribute to pluripotency by preventing premature differentiation. Conversely, induction of differentiation in Suz12 mutant ES cells does not extinguish expression of several of the genes characteristic of pluripotent cells and does not induce expression of all differentiation-specific genes, suggesting that Suz12 is necessary to suppress pluripotency upon induction of differentiation. Different PcG proteins may therefore have opposite roles by promoting the pluripotency versus differentiation of ES cells. These PcG proteins may also have different functions in pluripotent and in differentiating cells.

Despite the roles PcG proteins play in repression of many differentiation-specific genes in ES cells, PRC2 proteins and H3 K27 trimethylation are not essential for ES cell proliferation in vitro 46, 52, 58–60. They are also not necessary for ES cells to differentiate into cells that express marker genes from any one of the three germ layers. In contrast, Ring1b or Ring1a is required for ES cell proliferation 53. The essential functions of Ring1b/1a in ES cell proliferation are apparently independent of PRC2 and H3 K27 trimethylation, although a difference in the genetic background or culture conditions of these cells could account for the more severe phenotype. Thus PRC2 proteins as well as Ring1b or Ring1a regulate the transcription of many genes in ES cells, but their functions appear to be independent of each other, and only the functions of Ring1b or Ring1a are necessary for ES cell proliferation.

Recruitment of PcG Proteins to Chromosomes and Genes

Recruitment of PcG proteins to the inactive X chromosome

X inactivation is a classical epigenetic silencing process in mammals. In females, one of the X chromosomes is randomly inactivated in each cell of the epiblast and the daughters of that X chromosome remain inactive in all somatic cells. Both PRC1 and PRC2 proteins are recruited to the inactive X in a manner that depends on the noncoding Xist RNA that is transcribed from the locus on X that initiates X inactivation 80–83. The initiation of both imprinted and random X inactivation is independent of Eed and H3 K27 trimethylation, but the maintenance of imprinted X inactivation in extra-embryonic tissues requires Eed 48, 84–86. The recruitment of Ring1b to autosomal loci expressing Xist in differentiating ES cells is independent of Eed and H3 K27 trimethylation, whereas Phc1 and Phc2 recruitment to these loci requires Eed 48. Stable repression of autosomal loci expressing Xist does not require Eed, suggesting that both transcription repression and the epigenetic inheritance of the transcriptional state of these loci are independent of H3 K27 trimethylation.

The mechanism of PcG protein recruitment to the inactive X is of great interest since it may provide clues to the more general question of how PcG proteins are recruited to specific target genes. In differentiating ES cells, a segment of Xist designated Repeat A is associated with Ezh2 and Suz12 before induction of differentiation in both male and female cells 87. It is not clear if this interaction occurs at the X chromosome, but if it does, it provides a potential mechanism for PRC2 recruitment. Intriguingly, knock-down of Ezh2 or Eed prevents Xist enrichment and H3 K27 trimethylation, but it does not prevent silencing of genes on the X chromosome. Since the recruitment of Ring1b to autosomal loci by Xist is independent of Eed and H3 K27 trimethylation 48, it is likely that additional mechanisms mediate Xist-dependent recruitment of some PRC1 components. Studies of PcG protein occupancy at Xist loci containing mutations in the Repeat A sequence that selectively alter binding by PcG proteins are necessary to establish the roles of these interactions in PcG recruitment and X inactivation.

Investigation of mechanisms that target PcG proteins to specific genes

The mechanisms that recruit PcG proteins to specific vertebrate genes as well as the mechanisms that derepress selected genes during differentiation are unknown. The effect of H3 K27 trimethylation on histone binding by the chromodomains of Cbx proteins does not explain how PcG proteins are targeted to individual genes in the first place. Nor does it explain how different PRC1 complexes are selectively targeted to different genes. The gene-specific recruitment of PcG proteins logically demands that specific DNA sequences originally determine the genomic loci that are occupied by these proteins.

In Drosophila, many of the genes that are repressed by PcG proteins contain Polycomb Response Elements (PREs) that are required for PcG protein recruitment and gene repression 88. PREs are recognized by Pleiohomeotic (Pho) and Pho-like (Phol) in association with other DNA binding proteins, and are required for PRC2 targeting in Drosophila 89, 90. Pho and Phol can interact with dSfmbt to form the Pho-repressive complex (PhoRC) 91.

The DNA sequences that recruit PcG proteins to their target genes in mammalian cells are largely unknown. Many of the regions occupied by PcG proteins in ES cells contain DNA sequences that are highly conserved in mammalian genomes 74, 77, 92. However, the relevance of these sequences for PcG protein recruitment or functions has not been tested. A region 50 kilobases upstream of the mouse MafB gene has characteristics resembling Drosophila PREs. This region is occupied by PcG proteins and represses the expression of reporter genes when they are integrated at ectopic sites in Drosophila or in mouse cells 93. The region contains several sequence elements that are also found at Drosophila PREs, but the roles of these sequence elements and the proteins that could bind to them in transcription repression have not been tested.

Nucleic acid binding by PcG protein interaction partners

Since none of the core PcG proteins have been shown to bind DNA with high sequence specificity, it is presumed that they are initially recruited to specific genes through interactions with partners that can impart sequence-specific DNA binding. The intense search for such partners has yielded a few candidates. One potential class of partners is the large number of sequence-specific DNA binding proteins that regulate gene transcription. Such sequence-specific DNA binding proteins can interact with both PRC1 and PRC2 subunits (Fig. 2, Genes A and C).

The vertebrate YY1 protein shares limited sequence similarity with Drosophila Pho. YY1 can complement some of the functions of Pho in Drosophila 94–96 and YY1 knockdown alters H3 K27 trimethylation of muscle-specific genes in skeletal myoblasts 97. However, the direct role of YY1 in PcG protein targeting in vertebrates has not been established. Several other DNA-binding proteins have been proposed to recruit PcG proteins to individual genes 98–100, but the direct roles of these proteins in PcG protein occupancy have not been established. Some PcG proteins co-purify with various DNA-binding protein complexes. Immuno-affinity purification of E2F6 interacting proteins identified Ring1a and Ring1b together with YAF2, H-L(3)MBT-like, DP1, Mga and Max 101. Mutations in mouse E2F6 and Bmi1 have synergistic effects on Hox gene regulation and on axial skeleton development 102. Similarly, Ring1a and Ring1b co-purify with the Bcl6 co-repressor (BCOR) and PCGF1/NSPc1 together with the FBXL10/JHDM1B H3 K36 demethylase and other proteins 103. However, it is not known if these complexes regulate the expression of PcG target genes.

Ring1b can interact with Oct3/4 in ES cell extracts 53 and conditional deletion of Oct3/4 in ES cells reduces Ring1b and Eed occupancy at all genes tested that are occupied by Oct3/4 in wild type ES cells. It is appealing to think that Oct3/4 binding can recruit Ring1b and possibly other PRC1 proteins to many genes occupied by PRC1 proteins in ES cells. However, conditional Oct3/4 deletion also reduces the expression of many PRC1 and PRC2 subunits and induces aborted differentiation of ES cells that ultimately results in cell death. It is therefore difficult to exclude the possibility that secondary effects of Oct3/4 depletion influence Ring1b or Eed occupancy.

The discovery that non-coding RNAs (ncRNAs) can repress the transcription of many genes has raised the intriguing possibility that gene-specific recruitment of PcG proteins could be directed by ncRNA (Fig. 2, Genes B and D). Both PRC1 and PRC2 proteins can bind RNA in vitro 32, 87. The HOTAIR ncRNA encoded in the HOXC homeotic gene cluster interacts with PRC2 proteins in cell extracts and is required for Suz12 recruitment, H3 K27 trimethylation and repression of a region of the HOXD cluster 104. The Kcnq1ot1 ncRNA encoded in the Kcnq1 cluster of voltage-gated potassium channel genes also interacts with PRC2 subunits in cell extracts 105, 106. Ezh2 and Ring1b are independently recruited to the mono-allelic site of Kcnq1ot1 expression and are both required for compaction of the chromatin domain and mono-allelic repression in early trophectodermal cells 106. Cbx7 association with the inactive X in permeabilized cells is reduced by RNase treatment 32. These results are consistent with the model that ncRNA can facilitate PcG protein association with several different loci. Further studies of the molecular mechanisms whereby ncRNA may facilitate PcG protein binding to specific genes are needed 107.

Roles of H3 K27 trimethylation in PRC1 recruitment

The H3 K27 trimethyl modification produced by PRC2 can be bound by the chromodomains of Pc and Cbx family proteins 31, 32, 39. Knock-down of Drosophila Esc reduces H3 K27 trimethylation as well as Pc occupancy at the Ubx PRE in S2 cells 38. When Drosophila larvae with temperature-sensitive E(Z) are grown at the restrictive temperature, Pc occupancy at the Ubx PRE is reduced in wing imaginal disc cells. Knock-down of Pho and Phol also reduces both E(Z) as well as Pc occupancy at the Ubx PRE 90.

The overall levels of H3 K27 trimethylation can be altered by changes in H3 K27 demethylase activity mediated by UTX and JMJD3. Knock-down of UTX in HEK293 cells increases H3 K27 trimethylation as well as the occupancy of Bmi1 and Ring1A at the Hoxa13 and Hoxc4 loci 108. Conversely, overexpression of JMJD3 reduces H3 K27 trimethylation and Cbx8 retention in permeabilized HeLa cells 109. The combination of gain-of-function and loss-of-function modifications in both methylase and demethylase activities corroborates the role of H3 K27 trimethylation in the recruitment of several PRC1 proteins to many genes (Fig. 2, Gene D).

Many, but not all, genes occupied by Ring1b and Phc1 in ES cells as well as those occupied by Cbx8 in a human embryonic lung cell line are modified by H3 K27 trimethylation 49, 73. In Drosophila, the sites of highest Pc occupancy often correspond to regions within core PREs where H3 K27 trimethylation is depleted, presumably because of nucleosome displacement 110–113. Pc occupancy in Drosophila is focused within narrow regulatory regions, whereas K27 trimethylation is distributed over wide chromatin domains sometimes encompassing multiple genes. Similarly, in mouse ES cells the mean level of H3 K27 trimethylation is reduced immediately upstream of the transcription start sites, which corresponds to the region of near-maximal PcG protein occupancy averaged across all promoters 73. Thus, there is a high correlation between H3 K27 trimethylation and PcG occupancy when different genes are compared, but their distributions within individual genes differ from one another in both Drosophila and mammalian cells.

The Hox gene clusters are the most extensively studied targets of PcG protein regulation. In anterior and posterior tissues of the developing embryo, the relative levels of Ring1b occupancy at the Hoxb8 gene correlate with the relative levels of H3 K27 trimethylation and correlate inversely with the levels of Hoxb8 transcription 114. In contrast, Cbx2, Phc1, and Mel18 occupy several regions of the gene regardless of their H3 K27 trimethylation status or the level of Hoxb8 transcription. It is therefore possible that different PRC1 proteins differ in their dependence on H3 K27 trimethylation and in their effects on gene transcription.

Recruitment of PRC1 complexes by mechanisms independent of H3 K27 trimethylation

The presence or absence of H3 K27 trimethylation alone cannot account for the selectivity of occupancy by different PRC1 proteins at different sites. Moreover, PRC1 recruitment to many sites occurs in the absence of PRC2 (Fig. 2, Genes A and B). Both random and imprinted X-inactivation initiate normally in Eed-deficient embryos that have no detectable H3 K27 trimethylation 85. Xist expression in an autosomal region recruits Ring1b and establishes long-term silencing in ES cells that lack functional Eed and have no detectable H3 K27 trimethylation 48. PRC1 proteins are recruited to heterochromatin of the paternal pro-nucleus following fertilization in zygotes with no detectable H3 K27 trimethylation produced by Ezh2-deficient gametes 50. Ring1b is recruited to the mono-allelic site of Kcnq1ot1 repression in embryos devoid of Ezh2 106. During the differentiation of Suz12-deficient ES cells lacking detectable H3 K27 trimethylation, Cbx8 and Bmi1 are recruited to most of the genes they occupy in wild type ES cells 52. Cbx family proteins form BiFC complexes with H3 in cells containing a null mutation in Eed that eliminates H3 K27 trimethylation 51. Deletion of the chromodomains of Cbx family proteins has no effect on their mobilities in the nucleus (with the exception of Cbx8, which becomes cytoplasmic) and has different effects on the efficiency of BiFC complex formation by different Cbx proteins with H3. Although H3 K27 trimethylation is dispensable for PRC1 recruitment to chromatin in many situations, this modification could stabilize chromatin association by PRC1 proteins under conditions where the factors that initially recruit PRC1 to chromatin are no longer present (see below).

If PRC1 protein recruitment to chromatin is independent of PRC2 activity, what causes the correlation between PRC1 and PRC2 occupancy and H3 K27 trimethylation in ES cells? One possibility is that the same mechanisms recruit or maintain the occupancy of PRC1 and PRC2 independently of each other. Alternatively, PRC1 may recruit PRC2 to some genes by unknown mechanisms. The conditional knockout of Ring1b and Ring1a in ES cells causes a reduction in Eed occupancy and H3 K27 trimethylation at some genes 53. Additional studies of the mechanisms that mediate the relationships in PRC1 and PRC2 occupancy and functions are required.

Proposed Mechanisms For Gene Repression

The mechanism(s) of transcription repression by PcG proteins are a challenging problem to solve because of the difficulty of reconstituting the native regulatory network in vitro. An additional challenge is that it is unclear which PcG proteins must bind the gene to repress its activity. The step(s) in the transcription cycle that are inhibited have also been unclear. Gene repression by PcG proteins is generally thought to involve PRC1 either alone (Fig. 3, Genes A and B) or in association with PRC2 (Fig. 3, Gene D). However, PRC2 is also likely to repress gene expression independently of PRC1 (Fig. 3, Gene C). Gene repression by PcG proteins has been proposed to involve changes in chromatin architecture (Fig. 3, Genes A and C). Alternatively, or in addition, gene repression by PcG proteins can involve changes in histone modifications (Fig. 3, Genes B and D).

Figure 3.

Proposed mechanisms for gene repression by PcG proteins. Chromatin compaction by either PRC1 (Gene A) or PRC2 (Gene C) could produce a chromatin configuration that blocks (red octagon) transcription by RNA polymerase II (green wedge). Alternatively, or in addition, histone H2A K119 ubiquitination (red star) by PRC1 alone (Gene B) or PRC1 in association with PRC2 (Gene D) could block RNA polymerase II. Other potential mechanisms of gene repression by PcG proteins, including modifications of the transcription machinery, are not shown.

Effects of PRC1 proteins on chromatin and DNA structures

PRC1 reconstituted using Pc, Ph, Psc and dRing purified from overexpressing insect cells can inhibit nucleosome remodeling by hSWI/SNF as well as compact nucleosomal arrays and promote interactions between chromatin fragments in vitro 115, 116. Mutations in Psc that eliminate the inhibition of hSWI/SNF remodeling activity and chromatin compaction in vitro correlate closely with genetically characterized Psc mutations that impair PcG repression of AbdB in wing imaginal disks 117. These results indicate that the effects of PRC1 on nucleosome remodeling in vitro correlate with gene repression by PcG proteins in vivo, consistent with the model that PRC1 represses gene expression through chromatin compaction.

The high mobilities of many PRC1 proteins in cells and their relatively fast exchange rates at individual genes in Drosophila 72, 75 can be reconciled with chromatin compaction and the inhibition of hSWI/SNF activity if PRC1 occupancy at regulatory elements is maintained at high levels. It is more difficult to account for the elevated rates of H3.3 incorporation that closely coincide with all sites of E(Z) and Psc occupancy within the BX-C and ANTP-C loci in Drosophila cells 113. Sites of PcG protein association in Drosophila cells are therefore associated with elevated rates of H3 exchange or turnover. The mechanisms that produce these effects and their roles in transcription regulation by PcG proteins remain unknown.

Reconstituted Drosophila PRC1 can also bind naked DNA 118. This binding activity is enhanced in the presence of Pho on DNA containing a Pho recognition sequence and flanking sequences thought to be recognized by PRC1 119. Scanning force microscopy imaging of the Pho-PRC1 complex on DNA revealed shortening of the DNA, consistent with wrapping of DNA around Pho-PRC1 120. It will be interesting to discover the roles of specific DNA binding and the topological change induced by PRC1 in transcription regulation by PcG proteins during development.

Histone ubiquitination

An anti-Ring1a immunopurified protein complex containing Ring1a, Ring1b, Bmi1 and Phc2 has H2A E3 ubiquitin ligase activity 34. Conditional deletion of Ring1b and Ring1a dramatically reduces H2A ubiquitination, suggesting that they are required for most H2A ubiquitin ligase activities 35. The E3 ubiquitin ligase activity of recombinant Ring1b is markedly stimulated by Bmi1 as well as Mel18 121–126. H2A ubiquitination is reduced in many regions of the mouse genome in MEFs lacking Bmi1 compared with wild type MEFs 127. The ubiquitin on H2A can be removed by many different deubiquitinases 128, 129, 130, 131. The roles of deubiquitinase enzymes in the regulation of gene expression by PcG proteins have not been described. In Drosophila, a different complex containing dRing in association with dKDM2 has been identified that can ubiquitinate H2A 132. A single amino acid substitution in Drosophila dRing (R65C) that eliminates H2A E3 ubiquitin ligase activity by Ring1b in vitro 34 causes a weak PcG phenotype, consistent with partial loss of function 133. It will be important to define the roles of H2A ubiquitination in PcG repression of specific genes and the mechanisms whereby H2A ubiquitination may inhibit gene transcription.

PcG regulation of transcription initiation and elongation

The effects of PcG on the recruitment of RNA polymerase II have been examined in Drosophila as well as in mammalian cells. In Drosophila, RNA polymerase II and TFIID occupy a PcG reporter constructed by fusing the bxd PRE to the hsp26 gene promoter 134. No KMnO4 hypersensitivity is detected at the repressed promoter, suggesting that strand separation is blocked. It will be interesting to determine if a similar mechanism represses endogenous PcG target genes since RNA polymerase II is enriched within the promoter regions of almost half of the genes tested in Drosophila S2 cells 135. Many non-PcG target genes with promoter-proximal polymerase enrichment also display KMnO4 hypersensitivity, consistent with transcriptionally engaged RNA polymerase II.

A majority of genes in mammalian cells have RNA polymerase II transcription complexes as well as H3 K4 trimethylation associated with their promoter regions 136, 137. Direct measurement of nascent transcription using a genome-wide nuclear run-on assay indicates that an equally large proportion of genes have transcriptionally engaged RNA polymerase II 138. There is an inverse correlation between the ratio of promoter-proximal to promoter-distal RNA polymerase II engagement and the level of gene activity, suggesting that mechanisms regulating gene activity could modulate promoter-proximal stalling. However, genes with exclusively promoter-proximal RNA polymerase II engagement are rare, suggesting that silenced genes generally lack transcriptionally engaged RNA polymerase II. Genes that produce short promoter-proximal transcripts in ES cells are also under-represented among genes occupied by Suz12 139. Thus, although promoter-proximal RNA polymerase II is prevalent both in vertebrate and invertebrate genomes, it is unclear how the transition to productive elongation is regulated and whether PcG proteins contribute to this regulation.

RNA polymerase II phosphorylated on Serine 5 (Ser5P) of the C-terminal repeats occupies many genes that are modified by H3 K27 trimethylation 140. It is not clear if this occupancy is restricted to promoters or extends several kilobases into transcribed sequences. Conditional deletion of Ring1b in Ring1a mutant ES cells causes little change in Ser5P RNA polymerase II occupancy at these genes, but results in a dramatic increase in the level of productive transcription of some of these genes. These results suggest that Ring1 proteins influence transcription at a step after Ser5P RNA polymerase II engagement, but it is unclear what step is affected and how it regulates the level of productive transcription.

Epigenetic inheritance of transcription regulation by PcG proteins

One hallmark of classical PcG silencing in Drosophila is inheritance of the repressed state through multiple rounds of DNA replication and cell division in the absence of the transcription regulatory proteins that originally establish the repressed state 21, 22. In vertebrates, PcG repression of many genes is reversed by stimuli that alter the cell state 73, 74. Since vertebrate factors that recruit PcG proteins to specific genes have not been identified, the roles of these factors and epigenetic mechanisms in the maintenance of PcG protein occupancy are unknown.

Mechanisms for the maintenance of gene-specific repression

Propagation of the regulatory information through each cell cycle requires mechanisms for the maintenance of this information during DNA replication and chromosome segregation. H3 K27 trimethylation provides a potential mechanism for the inheritance of regulatory information since the nucleosomes from parental chromatin are transferred to the daughter chromosomes during replication. PRC2 can bind trimethylated H3 K27 peptide in vitro and transient recruitment of exogenous PRC2 to a heterologous gene can establish persistent H3 K27 trimethylation and gene repression 141. Recruitment of PRC2 to trimethylated H3 K27 during DNA replication could potentially methylate newly assembled histones on the daughter chromosomes. The WD40 domain of Eed can selectively bind to trimethylated H3 K27 by virtue of a cage of aromatic residues in the binding pocket 142. Replacement of these aromatic residues in vertebrate Eed reduces H3 K27 trimethyl binding. The corresponding substitutions in Drosophila ESC eliminate its ability to complement esc and escl mutations and to maintain H3 K27 trimethylation in the vicinity of the bxd PRE. The WD40 domain of Eed can bind with similar affinities to trimethylated H1 K26, H3 K9 and H4 K20, all of which are associated with transcription repression. Trimethylated H3 K27 peptide stimulates PRC2 methyltransferase activity by an allosteric mechanism. The allosteric activation of methyltransferase activity is not strictly related to peptide binding affinity, suggesting that additional recognition mechanisms contribute to the allostery. These and other factors could contribute to the selective propagation of H3 K27 trimethylation separate from the other modified histones recognized by Eed.

PcG repression could be epigenetically inherited through continuous occupancy of repressed genes by one or more PcG protein complexes. Low, but detectable levels of PRC1 proteins are associated with the condensed chromosomes at all stages of mitosis 65–67, 143–145. PcG proteins are also associated with the inactive X during mitosis 80, 82, 83, 146. Thus, chromosome condensation does not displace all PcG proteins, indicating that PcG protein association with chromatin can maintain regulatory information through mitosis. PRC1 also remains bound to chromatin during DNA replication in vitro 147. It is not known if PRC1 segregation is stochastic or if some mechanism ensures PRC1 association with both daughter chromatids. PRC1 is thought to self-associate, but it is not known if this self-association can prevent dilution of the complexes over multiple rounds of replication and cell division. It will be important to distinguish the roles of chromatin-associated PcG protein complexes and of histone modification patterns in the epigenetic inheritance of transcription regulatory states.

Novel and Non-transcriptional Functions of PcG proteins

Since many aspects of PcG protein function have not been resolved despite years or even decades of concerted effort, it might be hoped that serendipitous discoveries by perceptive investigators might provide the necessary insight. Indeed, several observations have revealed unexpected characteristics of PcG proteins that might be related to their wide-ranging biological effects.

Antibodies against Drosophila topoisomerase II and Barren (a homologue of the Xenopus XCAP-H interaction partner of structural maintenance of chromosome (SMC) proteins) precipitate chromosomal regions that are occupied by Pc within the BX-C locus 148. Mutations in barren and ph show both defects in silencing of the mini-white reporter linked to the Fab7 PRE as well as defects in chromosome segregation. Mutations in genes encoding other PcG proteins also exhibit defects in mitotic chromosome segregation 149. The PcG proteins on mitotic chromosomes could participate directly in mitotic segregation, or PcG proteins may be required for the assembly of other complexes necessary for segregation.

PcG proteins and their interaction partners have enzymatic activities unrelated to histone modifications. The Cbx4 protein has SUMO E3 ligase activity for the homeodomain interacting protein kinase (HIPK2) 150. HIPK2 can in turn phosphorylate Cbx4, which stimulates its SUMO E3 ligase activity in response to DNA damage. Mel18 can interact with UBC9 and inhibit its SUMO E2 ligase activity 151.

The super sex combs (Sxc) gene encodes a glycosyltransferase that can modify Ph with N-acetylglucosamine (GlcNAc) 152. Regions of the genome associated with GlcNAc-modified proteins coincide with regions occupied by Ph and Pho in genome-wide chromatin immunoprecipitation analysis. Sxc mutants eliminate GlcNAc-modified protein binding to chromatin and PcG repression, but do not prevent Pho, Ez or Ph binding to chromatin. It will be interesting to determine the effects of these modifications on transcription regulation by PcG proteins.

Perspectives and Opportunities for the Future

The PcG is a fascinating regulatory system that has much to teach us about the principles that enable cells to maintain a stable identity in the face of stochastic molecular processes and fluctuations in their environments. The stability of this system is essential for maintenance of cell identity, yet it must be balanced against the need for developmental plasticity and both short- and long-term responses to extracellular stimuli.

The roles of nuclear architecture and PcG protein localization in transcription regulation have not been critically tested. Strategies for the selective and reversible control of subnuclear localization must be developed to test the effects of experimentally induced changes in nuclear localization on PcG protein functions. The contrast between the transient chromatin association of many PRC1 proteins and the stable maintenance of gene repression demands creative strategies for investigation of the significance of PcG protein dynamics in their functions.

Although several PcG proteins occupy overlapping sets of genes in ES cells, the relationship between PRC1 and PRC2 occupancy as well as the binding specificities of PRC1 and PRC2 complexes composed of different subunits remain incompletely understood. Future studies need to compare the binding specificities of different members of each family of PcG proteins. These studies must also examine the binding specificities of the intact complexes in addition to those of the individual subunits.

Studies in cell populations cannot detect variations in PcG protein occupancy among individual cells in the population. Such variation can be caused by cell cycle regulation or other sources of cell to cell variability. Ideally, studies of PcG protein association with chromatin should be conducted in the normal environment of the developing animal.

Efforts to identify the mechanisms whereby PcG proteins are first recruited to their target genes have focused on PRC2 interaction partners. The discovery that PRC1 proteins can be recruited in the absence of PRC2 or H3 K27 trimethylation indicates that this search should be expanded to include PRC1 interaction partners. Since different PcG proteins occupy different genes and since PcG repression of different genes must be independently regulated, it is likely that vertebrate PcG proteins are recruited and their occupancy is modulated by interactions with many different partners with unique target gene specificities.

By analogy with transcriptional co-repressors and co-activators, it seems likely that PcG protein complexes are recruited through combinatorial interactions with many alternative interaction partners at each gene. Identification of such partners using genetic screens or bioinformatic strategies can be difficult since recruitment can be mediated by interactions with multiple alternative partners. The identification of interaction partners using proteomic methods in combination with validation of their functional relevance in cells and animals provides one strategy for elucidation of the mechanisms of PcG recruitment.

The roles of PcG proteins in the control of pluripotency have been investigated in ES cells derived from the inner cell mass of blastocysts. Reprogramming of differentiated cells to pluripotency has expanded the range of cell types that can be used as a source of pluripotent cells. The efficiency of reprogramming is generally low, but can be enhanced by the expression of modifier genes. It could be interesting to determine if either deletion of genes encoding PcG proteins (conditionally in cases where they cause embryonic lethality) or overexpression of these proteins affects the efficiency of reprogramming.

The PcG proteins represent a unique group of transcription regulatory proteins. Nevertheless, the most significant lessons learned from the study of these proteins are likely to be in areas fundamental to all mechanisms of transcription regulation. PcG proteins must be targeted by mechanisms that are robust to stochastic fluctuations and that can locate specific regulatory regions embedded in the expanse of genomic DNA. They must maintain stable levels of gene expression, yet remain responsive to signals that trigger changes in cell fate. Finally, they must achieve a wide dynamic range of gene expression levels to mediate the transitions in developmental potential that control all stages of development. Discovery of the mechanisms that mediate these functions will provide insight into the fundamental processes that control the retrieval of genomic information.

Acknowledgements

I thank members of the Kerppola laboratory for stimulating discussions. The research is supported by the National Institute of General Medical Sciences (R01 GM086213).

References