Review: MOLECULAR BIOLOGY

Regulation of gene transcription by Polycomb proteins

Science Advances  04 Dec 2015:
Vol. 1, no. 11, e1500737
DOI: 10.1126/sciadv.1500737

Abstract

The Polycomb group (PcG) of proteins defines a subset of factors that physically associate and function to maintain the positional identity of cells from the embryo to adult stages. PcG has long been considered a paradigmatic model for epigenetic maintenance of gene transcription programs. Despite intensive research efforts to unveil the molecular mechanisms of action of PcG proteins, several fundamental questions remain unresolved: How many different PcG complexes exist in mammalian cells? How are PcG complexes targeted to specific loci? How does PcG regulate transcription? In this review, we discuss the diversity of PcG complexes in mammalian cells, examine newly identified modes of recruitment to chromatin, and highlight the latest insights into the molecular mechanisms underlying the function of PcGs in transcription regulation and three-dimensional chromatin conformation.

Keywords
  • Gene Regulation
  • Chromatin
  • Epigenetics

INTRODUCTION

The genome of eukaryotic cells is packaged in the nucleus in a macromolecular complex termed chromatin, which is formed by DNA together with RNA, histone, and non-histone proteins (Fig. 1). The complex was initially identified by the German cytologist Walter Flemming, who defined “chromatin” as the cellular structure visually detectable under the microscope after staining with a basic dye (“stainable material,” originally from the Latinized version of the Greek term Khroma) (1). The minimal structural unit of chromatin is the nucleosome, which is composed of 147 base pairs of DNA wrapped around an octamer formed by two copies of each of the histones H3, H2A, H2B, and H4 (2). Over the last few years, technological advancements, including the development of high-resolution microscopy technologies and chromosome conformation assays, have revealed that genome packaging within the nucleus is not random but is hierarchically organized into dynamically regulated structures. At the largest scale, interphase chromosomes occupy discrete regions in the nucleus known as “chromosome territories” (3). At increased resolution, distal regions within interphase chromosomes establish long-range interactions (with an average size in mammals of 1 Mb), forming structures known as “topologically associating domains” (TADs) (4–7). Finally, chromatin loops within TADs organize transcriptionally co-regulated genes and are important for defining cellular identity and other physiological processes (Fig. 1).

Fig. 1 Hierarchical layers of chromatin organization in mammalian cells.

Individual chromosomes cover a distinct region within the nucleus known as chromosome territory. At increasing resolution, chromosomes are composed of topologically associating domains (TADs), which are structural units defined by the high frequency of chromatin interactions between their loci that are partitioned by sharp boundaries. Within TADs, enhancer elements and active proximal promoters (both depicted in red) form chromatin loops, which are mediated and/or stabilized by protein effectors, noncoding RNAs (ncRNAs), and histone posttranslational modifications (PTMs). Enhancers and promoters are characterized by the presence of specific histone variants and PTMs on the histone tails. Upon transcription activation, elongating RNA polymerase II (RNAP, in green) is phosphorylated at Ser5 and Ser2 on its C-terminal domain (CTD) and begins to produce mRNA. Genomic regions that are transcriptionally silenced form repressed chromatin domains that are also stabilized by ncRNA and other repressive protein complexes. Finally, tracks of repetitive sequence are found in specific functional regions of the genome, including CpG islands (CGIs), in which cytosines can be modified (5-methylC and 5-hydroxymethylC).

The cell type–specific transcription activation of the genome is a major determinant of the cellular diversity found in multicellular organisms. The specific transcriptional program is initially induced by a triggering signal and can be sustained by epigenetic information across cell divisions until an additional input induces and establishes an alternative transcriptional program; this in turn can be sustained by another epigenetic mechanism (8). The epigenetic information is thought to reside in the following: (i) self-propagating transcriptional networks, such as the pluripotent network in embryonic stem cells (ESCs); (ii) ncRNAs, such as the X-inactive specific transcript (Xist) RNA; and (iii) chemical modifications of chromatin, including DNA modifications (that is, cytosine methylation and hydroxymethylation) and histone PTMs (8). To be considered as such, any epigenetic information must fulfill two criteria: (i) it must regulate gene transcription, and (ii) once present in a cell, it must self-propagate across cell divisions independently of the input signal and until the appearance of a replacing signal.

The Polycomb group (PcG) of proteins has long been considered to be a paradigmatic model for epigenetic regulation of gene silencing. PcG proteins are a collection of transcriptional regulatory factors that control gene expression and whose transcriptionally imposed silencing can be transmitted from embryos to adulthood (9, 10). In this review, we will discuss recent advances in PcG-mediated gene regulation. Specifically, we will focus on (i) the diversity of the Polycomb complexes that have been defined in mammals, (ii) their modes of recruitment to specific chromatin domains, and (iii) their roles in transcription regulation and chromatin architecture.

HOW MANY POLYCOMB COMPLEXES ARE PRESENT IN MAMMALIAN CELLS?

Polycomb proteins were initially identified in Drosophila melanogaster. Analysis of a fly mutant containing additional sex combs not only on the first but also on the second and third pairs of legs allowed researchers to identify the first Polycomb member (Pc) (11). Later, the Pc protein was suggested to be a negative regulator of homeotic genes, which are necessary for proper body segmentation of Drosophila during development. The characterization of mutants showing similar Polycomb phenotypes enabled the identification of 18 PcG genes in Drosophila that are required to regulate the proper activation of the Hox gene cluster (12). The number of PcG orthologs expanded remarkably during metazoan evolution, from 18 up to 37 members in mammals, most likely through multiple duplication events (12, 13). In addition, the sequences of mammalian paralogs diverged significantly, correlating with the evolution of complex traits in vertebrates (13). In the last few years, a growing number of additional proteins have been identified as members of the PcG complexes, adding an extra layer of complexity in mammals (14).

The different PcG proteins associate to form functionally distinct complexes that belong to two major families: the Polycomb repressive complexes 1 and 2 (PRC1 and PRC2, respectively) (Fig. 2). The complexes of each family have catalytic activity: PRC1 complexes have E3 ligase activity, and their main characterized substrate is histone H2A, which they monoubiquitinate at lysine 119 (H2Aub1), whereas PRC2 complexes contain methyltransferase activity and are mainly involved in generating the di-/trimethylated form of lysine 27 on histone H3 (H3K27me2/3) (14). Both the PRC1 and PRC2 complexes comprise core components that are always present and that contain basal catalytic activity in vitro (Fig. 2). In PRC1, the core components include the E3 ubiquitin ligase Ring1B and one Polycomb group of ring finger (Pcgf) protein (Fig. 2). The PRC2 core components are Suz12 (suppressor of zeste 12), which contains a zinc finger domain; Eed (embryonic ectoderm development), which contains a WD40 repeat domain that recognizes trimethylated peptides; and Ezh1/2 (enhancer of zeste 1 or 2) protein, which contains the SET domain responsible for the methyltransferase activity of the complex. PRC1 and PRC2 core components then interact with additional partners that regulate the enzymatic activity and/or define the mode of recruitment of the complex to chromatin (see below for details). Currently, the mammalian PRC1 family is subdivided into two subfamilies: canonical PRC1 (cPRC1; the functional homolog of Drosophila PRC1) and noncanonical PRC1 (ncPRC1), which includes a heterogeneous group of several complexes (Fig. 2) (15, 16). Likewise, the core components of PRC2 can associate with up to six different proteins (that have been identified to date), which then determine its functional specificity (17).

Fig. 2 PcG complexes in mammals.

(A and B) PcG complexes are classified into two major families: (A) PRC1 and (B) PRC2. Both families contain core subunits present in all the subcomplexes of the family. The interaction of the core complex with other accessory proteins defines the complete composition of each subcomplex. These accessory proteins have been found to regulate recruitment to specific chromatin domains and/or to modulate the catalytic activity of the core complex. PRC1 complexes are divided into cPRC1 and ncPRC1 (A). The core complex can associate with distinct Pcgf proteins, which allows for an alternative nomenclature. Accordingly, Pcgf2 and Pcgf4 are present in the cPRC1 complexes (PRC1.2 and PRC1.4, respectively), Pcgf2 and Pcgf4 are also associated with ncPRC1 complexes containing Rybp or YAF2, Pcgf3 and Pcgf5 are present in the ncPRC1 complexes (PRC1.3 and PRC1.5), Pcgf1 is present in the ncPRC1 complex PRC1.1 (also known as BCOR), and Pcgf6 is present in the ncPRC1 complex PRC1.6 (also known as E2F6.com). (B) The trimeric PRC2 core complex can associate with different proteins present in the PRC2 complex at the same time.

This complexity can result in terminology issues because the same Polycomb term can be used to refer to two related but functionally different PcG complexes. Although attempts to uniquely identify different PRC1 complexes have been undertaken (15, 16), a unified nomenclature is still missing. The growing number of partners and functions attributed to the PcG complexes demands a consensus in the nomenclature used by the scientific community (Fig. 2).

PRC1 complex classification

The PRC1 complexes have a very diverse composition. An initial classification differentiated between cPRC1 and ncPRC1 complexes (15, 16). This classification is mainly based on the presence of one Chromobox (Cbx) protein in cPRC1 complexes and of the Ring1 and Yy1-binding protein (Rybp), or its homolog YAF2, in ncPRC1 complexes (15, 16). However, this initial classification oversimplifies the actual diversity of the PRC1 complexes.

All PRC1 complexes contain Ring1B (also known as Ring2/RNF2), which has the E3 ubiquitin ligase activity of the complex (18, 19), as well as one of the Pcgf proteins (Pcgf1–6) (15). In an attempt to classify the different Pcgf-containing PRC1 complexes, Gao and collaborators defined six different groups (PRC1.1–6) according to the Pcgf member associated with the complex (Fig. 2) (15). Hence, the PRC1.2 and PRC1.4 complexes include Ring1B in association with Pcgf2/Mel-18 or Pcgf4/BMI-1, respectively, which stimulate the E3 ligase activity (19, 20), together with a Cbx protein (Cbx2, Cbx4, Cbx6, Cbx7, or Cbx8) and a polyhomeotic homolog protein (HPH1–3). On the other hand, PRC1.1, PRC1.3, PRC1.5, and PRC1.6 include Pcgf1/NsPC1, Pcgf3, Pcgf5, and Pcgf6/MBLR6, respectively, and the Ring1 and Yy1-binding protein Rybp or its homolog YAF2 (Fig. 2). However, this second classification excluded the complexes containing Rybp-Ring1B found in association with Pcgf2 and Pcgf4. An alternative nomenclature should also consider these last complexes.
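The PRC1.1–6 nomenclature described above is essentially a lookup from Pcgf subunit to complex name and subfamily. The following is a minimal sketch of that mapping in Python, intended only as an illustration; the class name `Prc1Complex` and the helper functions are hypothetical, and the table deliberately omits the Rybp-containing Pcgf2 and Pcgf4 complexes, which this nomenclature does not cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prc1Complex:
    """One entry of the PRC1.1-6 nomenclature (illustrative structure)."""
    name: str        # e.g. "PRC1.1"
    pcgf: str        # Pcgf subunit that defines the complex
    subfamily: str   # "cPRC1" or "ncPRC1"
    alias: str = ""  # alternative name given in the text, if any


# Entries follow the text and Fig. 2; Ring1B is common to all of them.
PRC1_COMPLEXES = [
    Prc1Complex("PRC1.1", "Pcgf1", "ncPRC1", "BCOR"),
    Prc1Complex("PRC1.2", "Pcgf2", "cPRC1"),
    Prc1Complex("PRC1.3", "Pcgf3", "ncPRC1"),
    Prc1Complex("PRC1.4", "Pcgf4", "cPRC1"),
    Prc1Complex("PRC1.5", "Pcgf5", "ncPRC1"),
    Prc1Complex("PRC1.6", "Pcgf6", "ncPRC1", "E2F6.com"),
]


def complexes_for(pcgf):
    """Return the names of the complexes defined by a given Pcgf subunit."""
    return [c.name for c in PRC1_COMPLEXES if c.pcgf == pcgf]


def by_subfamily(subfamily):
    """Return the names of the complexes assigned to a subfamily."""
    return [c.name for c in PRC1_COMPLEXES if c.subfamily == subfamily]


if __name__ == "__main__":
    print(by_subfamily("cPRC1"))
    print(complexes_for("Pcgf6"))
```

Because each Pcgf maps to exactly one entry here, the table makes the gap in the nomenclature explicit: a Rybp-Ring1B complex containing Pcgf2 or Pcgf4 has no name of its own in this scheme.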

cPRC1 complexes

The Cbx proteins are considered to be determinants for the recruitment of cPRC1 to chromatin. In mammals, the Cbx protein family is composed of eight members (Cbx1–8), all of which contain a conserved N-terminal chromodomain that binds methylated lysine residues. In addition, Cbx2, Cbx4, Cbx6, Cbx7, and Cbx8 contain a conserved C-terminal PcG box required for their interaction with Ring1A/B (21–23). In Drosophila, the orthologous Pc protein binds H3K27me3 in preference to H3K9me3, although both lysines are contained within an identical peptide sequence of ARKS (24, 25). However, the mammalian Cbx proteins associated with Ring1B have a wide range of affinities toward both marks without a distinct selectivity for one (26, 27). For instance, the Cbx2 chromodomain shows a clearer preference for H3K27me3; the Cbx4 and Cbx7 chromodomains bind with similar affinity to H3K9me3 and H3K27me3, but with a slight preference for H3K9me3; and the Cbx6 and Cbx8 chromodomains show a very weak affinity for both marks (26, 27). Although it discriminates poorly between methylated peptides in vitro, PRC1 containing the Cbx7 protein is depleted from chromatin in Eed knockout ESCs that lack H3K27me3 (28), suggesting that alternative mechanisms might exist to ensure the selective chromatin binding of PcG complexes. In this regard, Cbx2 also contains a putative DNA binding domain (23), and Cbx7 has a binding affinity for RNA via its chromodomain (29).

ncPRC1 complexes

The PRC1.1 complex, also named BCOR (30, 31), is the homolog of the Drosophila dRAF complex. In this complex, Pcgf1 enhances the catalytic activity of Ring1B in vitro and in vivo (30–32). A key factor in the complex is the histone demethylase Kdm2b, which harbors a CxxC domain involved in DNA binding and drives the targeting of the PRC1 complex to CGIs (32–34). Ectopic tethering of Kdm2b results in the de novo recruitment of PRC1, and hemizygous loss of the CxxC domain of Kdm2b results in a homeotic transformation and loss of ncPRC1 genomic occupancy, supporting its role in targeting the complex to specific loci (35).

The complexes PRC1.3 and PRC1.5 contain Ring1B associated with casein kinase 2 (CK2). In neurons, the PRC1.5 complex also contains AUTS2 (autism susceptibility candidate 2), and phosphorylation of Ring1B by CK2 inhibits its catalytic activity (36). Genome-wide studies suggested that PRC1.5 is recruited to active genes, uncovering a new function of the PRC1.5 complex in gene transcription (36).

The PRC1.6 complex, also named E2F6.com (37–39), contains the transcriptional repressor E2F6 in association with Ring1B-Pcgf6. Deletion of E2F6 in mice causes homeotic transformation of the axial skeleton (40). In addition, the complex contains the oncoprotein L3MBTL2, the deletion of which leads to early embryonic lethality in mice, and the transcription factors Max and Mga, which have been suggested to target the complex to E-boxes (39, 41). The complex is completed by its association with the Cbx3 protein, which interacts with H3K9me3; the histone deacetylases HDAC1 and HDAC2; the transcription factor Dp-1; the H3K9 methyltransferase G9a; the H3K4 demethylase Jarid1C; and the WD40 repeat protein Wdr5 (15, 41). However, the functions of these proteins in the context of PRC1.6 are not yet known.

cPRC1 versus ncPRC1

cPRC1 and ncPRC1 complexes have been compared to each other at the molecular and functional levels. At the molecular level, the cPRC1-Cbx and ncPRC1-Rybp complexes co-occupy common as well as distinct subsets of target genes (15, 42). Their distribution correlates with the levels of H3K27me3, with co-occupancy with H3K27me3 being higher for PRC1-Cbx than for PRC1-Rybp (15, 42). Furthermore, the presence of PRC1-Cbx7 (the most highly expressed Polycomb Cbx in ESCs) correlates with robust gene silencing, whereas genes uniquely occupied by PRC1-Rybp are moderately expressed (42). At the functional level, both complexes can compact nucleosomes in vitro. However, PRC1-Rybp exhibits increased E3 ligase activity as compared to PRC1-Cbx2 and PRC1-Cbx8, but not to PRC1-Cbx7 (15, 16).

Although this detailed comparative analysis has not been performed at the cellular level, recent data suggest that the PRC1 complexes that contain different Pcgf proteins exhibit cell type–specific functions. For instance, PRC1-Pcgf6 is required for ESC self-renewal (43), and PRC1-Pcgf4 is required for proliferation and self-renewal of neural stem cells (44, 45) and hematopoietic stem cells (HSCs) (46, 47). Overall, these studies suggest that Pcgfs determine specific, nonoverlapping functions of Ring1B complexes.

Similarly, the presence of different Cbx proteins dictates nonoverlapping functions of PRC1 complexes (28, 48, 49). Both in ESCs and HSCs, the PRC1 complex containing Cbx7 represses the lineage specification program. Conversely, PRC1-Cbx2, PRC1-Cbx4, and PRC1-Cbx8 repress the expression of genes involved in stem cell self-renewal (28, 48, 49). These results indicate that, first, PRC1 with distinct Cbx proteins exhibit nonredundant functions in stem cells; second, PRC1 complexes with specific Cbx subunits are recruited to specific loci in ESCs and differentiating cells; and last, these complexes exhibit a conserved molecular mechanism to regulate self-renewal in both embryonic and adult stem cells.

PRC2 core complex

As mentioned above, the three core protein subunits of the PRC2 complex are Suz12 (with a zinc finger domain), Eed (with a WD40 repeat domain that recognizes trimethylated peptides), and Ezh1/2 (with catalytic activity residing in its SET domain). These three components are present in a 1:1:1 stoichiometry (50) and are sufficient for the PRC2 core complex to have basal levels of methyltransferase activity in vitro (51). Considering the number of paralogs and isoforms of Ezh and Eed proteins, there are several potential trimeric complexes that can be assembled (Fig. 2).

The Ezh1 and Ezh2 proteins are mutually exclusive in the complex, and their expression seems to be complementary; for instance, Ezh2 is highly expressed in embryonic tissues and proliferating cells, whereas Ezh1 is mostly present in adult tissues and nondividing cells (52–56). Although PRC2 with Ezh2 efficiently methylates H3K27, PRC2 with Ezh1 has only a minor methyltransferase activity against this lysine, both in vitro (53) and in vivo (54). The function of PRC2-Ezh2 in mediating gene repression has been well characterized (9, 57, 58), but the function of PRC2-Ezh1 remains controversial. Both Ezh1 and Ezh2 target the same genes and repress transcription in mouse carcinoma cells (53). Nevertheless, in muscle cells, Ezh1 and Ezh2 occupy distinct sets of genes, with Ezh2 associated with the H3K27me3 mark and transcriptionally repressed genes, and Ezh1 present in active chromatin marked with H3K4me3 (55). Additionally, recent studies indicate that PRC2-Ezh2 is replaced by PRC2-Ezh1 during differentiation of myoblasts, and in HSCs and hippocampal cells, resulting in an activation of common target genes (56, 59, 60). These studies suggest that alternative assembly of core PRC2 components can result in different functions, rather than complementary active complexes.

In addition to the two related Ezh1 and Ezh2 genes, four isoforms of the Eed protein can be produced by an alternative translation start site from the same mRNA (58). However, the functional differences between PRC2 complexes containing distinct Eed isoforms still remain unclear (58, 61).
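The number of potential trimeric core complexes mentioned above follows from simple combinatorics: one Suz12, one of two Ezh paralogs, and one of four Eed isoforms. The sketch below enumerates these combinations. It is a counting exercise only; the isoform labels are placeholders introduced here, and the source does not state that every combination is actually assembled in cells.

```python
from itertools import product

# Core subunits as described in the text. The Eed isoform labels are
# hypothetical placeholders; the text only states that four isoforms exist.
SUZ12 = "Suz12"
EZH_PARALOGS = ["Ezh1", "Ezh2"]  # mutually exclusive in the complex
EED_ISOFORMS = [f"Eed isoform {i}" for i in range(1, 5)]


def candidate_core_trimers():
    """Enumerate every Suz12/Eed/Ezh combination with 1:1:1 stoichiometry."""
    return [(SUZ12, eed, ezh) for ezh, eed in product(EZH_PARALOGS, EED_ISOFORMS)]


if __name__ == "__main__":
    trimers = candidate_core_trimers()
    for trimer in trimers:
        print(" + ".join(trimer))
    print(f"{len(trimers)} potential trimeric core complexes")
```

Under these assumptions there are 2 x 4 = 8 candidate trimers, before any of the substoichiometric partners are taken into account.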

PRC2 partners and their function

Detailed proteomic analyses of the PRC2 complex indicate that the core trimeric PRC2 complex can associate with additional polypeptides at a substoichiometric level, resulting in different PRC2 complexes assembled in the same cell type (17, 50). The PRC2 cofactors identified to date are the retinoblastoma binding proteins 4 and 7 (Rbbp4/7; also known as RbAp48/p46), the adipocyte enhancer-binding protein 2 (Aebp2), the Jumonji AT-rich interactive domain 2 (Jarid2) protein, the Polycomb-like 1 [PCL1 or PHD finger protein 1 (PHF1)] protein, the related PCL2 [also known as metal response element binding transcription factor 2 (MTF2)] protein, PCL3 [also known as the PHD finger protein 19 (Phf19)], and two mammalian-specific proteins (C17orf96 and C10orf12) (17, 50). Some of these subunits can be present in the same PRC2 subcomplex, and their main functions are to regulate PRC2 enzymatic activity and/or its recruitment to specific genomic loci (Fig. 2) (17, 51, 62–65). However, considering that histone modifications and RNA can also regulate PRC2 activity (65–70), it is difficult to distinguish whether the effects on PRC2 methyltransferase activity in vivo are mediated directly by an allosteric effect of the partner present in the complex or indirectly by the partner-mediated recruitment of the core complex to a specific chromatin context.

The histone binding proteins Rbbp7/4 are the mammalian homologs of the Drosophila Nurf55 protein, which enhances dPRC2 activity in vitro (51, 62). Accordingly, the formation of the tetrameric complex (trimeric core plus Rbbp7/4) is thought to constitute a fully competent methyltransferase complex in vivo. In Drosophila, Nurf55 together with dSuz12 anchor the dEzh2 enzyme at chromatin, whereas dEed boosts its catalytic activity (62). The addition of Aebp2 enhances the enzymatic activity of PRC2 (51), and its DNA binding motif may be necessary to target the complex to specific genomic loci (71). The structure of human PRC2 complex was recently characterized in association with Aebp2, further supporting a possible function of this protein in stabilizing the complex (72). Despite the potential role of Aebp2 as a recruiter of PRC2 to DNA, its genomic occupancy has not been documented to date.

The fact that the PRC2 methyltransferase activity is enhanced by H3K27me3 (66, 67) and inhibited by H3K4me3 and H3K36me3 (73) offers an attractive model to explain how the PRC2-mediated spreading of H3K27me3 might be prevented by transcriptionally active chromatin domains. In addition, this model predicts that removing the H3K4me3 or H3K36me3 mark relieves the PRC2 enzymatic inhibition and allows transcriptionally active regions to be converted to silent chromatin domains. In line with this, Phf19/PCL3 contains a TUDOR domain that selectively binds to H3K36me3 (74, 75). Phf19 binds to active genes in ESCs and recruits the PRC2 complex together with a histone demethylase, such as NO66 or Kdm2b (74, 75). Experimental evidence indicates that, upon differentiation, the demethylases remove the H3K36me3 mark at target genes, thereby releasing the H3K36me3-mediated PRC2 inhibition and allowing for H3K27me3-induced transcription silencing (74, 75). Similarly, the TUDOR domains of Phf1/Pcl1 and Mtf2/Pcl2 bind H3K36me3 (76). Pcl1 targets the PRC2 complex to DNA damage sites decorated by H3K36me3, thus acting as a cofactor during the early DNA damage response (76). The PRC2-Pcl2 complex was found to bind to pluripotency genes and to negatively regulate their expression in ESCs; the authors proposed that PRC2 complexes containing PCL2 are positioned at the promoters of pluripotency genes to achieve rapid repression upon a differentiation stimulus (77).

Jarid2 is the best characterized PRC2 accessory protein. Jarid2 is an inactive member of the Jumonji family of transcriptional repressors and, in addition to the Jumonji domains, has two DNA binding domains. Several studies in ESCs highlight the strong relationship between Jarid2 and PRC2 (78–82). First, Jarid2 and PRC2 physically interact; second, their genome-wide occupancies largely overlap; third, co-occupancy is broadly interdependent, although this is lost during X chromosome inactivation (XCI) when Jarid2 is recruited independently of PRC2 (83); and fourth, both Jarid2 and PRC2 are required for proper differentiation of ESCs (78–82). At the molecular level, Jarid2 has been reported to play a role in PRC2 catalytic activity as both an activator (63, 79, 84, 85) and an inhibitor (78, 81). These contradictory results might be resolved by recent studies that show that Jarid2 activity on PRC2 can be modulated by Ezh2-mediated methylation (85). The methylated Jarid2 enhances the catalytic activity of PRC2 on recombinant nucleosomes in vitro. However, the methylated form also competes with H3K27me3 for binding to Eed and with the H3K27me3-mediated stimulatory effects. The results suggest that the role of Jarid2 in catalytic activity depends on the chromatin context. Finally, Jarid2 stabilizes PRC2’s occupancy at chromatin through its ability to interact with nucleosomes (84), ncRNAs (86), and H2Aub1 (87).

The last two partners identified, the mammalian-specific proteins C17orf96 (also known as esPRC2p48) and C10orf12, are poorly characterized. Quantitative proteomic assays show a strong interaction between C17orf96 and the trimeric core complex (50). Although C17orf96 lacks chromatin binding domains, it has been shown to interact with unmethylated DNA and nucleosomes (88). Recent data indicate that C17orf96 is located in CGIs in ESCs irrespective of the presence of PRC2 or H3K27me3 (89). Unexpectedly, knockdown of C17orf96 increases the association of Suz12 with chromatin and increases the levels of H3K27me3. This contrasts with previous studies showing that C17orf96 knockdown reduces the levels of H3K27me3, also in ESCs, and that C17orf96 enhances PRC2 activity in vitro (63). Further studies on these novel PRC2 interactors will clarify these seemingly contradictory results.

HOW ARE PRC1 AND PRC2 RECRUITED TO SPECIFIC LOCI?

Initial studies in Drosophila provided experimental evidence for a sequential binding of PcG complexes at PcG response elements (PREs). These elements are stretches of DNA that contain motifs for several sequence-specific DNA binding proteins and that can be located thousands of base pairs from the gene they regulate (90). The following sequential mode of recruitment was proposed: (i) dPRC2 targets PREs by interacting with sequence-specific DNA binding proteins, such as the PHO and PHO-like transcription factors; (ii) recruitment stimulates H3K27 methylation at the PRE loci; and (iii) this methylation provides a docking site for PRC1 complex recruitment through the chromodomain of the Pc protein. Transcription silencing is then imposed by blocking the access of chromatin remodeling complexes and/or directly inhibiting the transcriptional machinery at any step, from its recruitment to elongation (90). However, several indications support an alternative mode of recruitment of dPRC1: the low affinity of Pc for H3K27me3, the narrow distribution of PRC1 over PRE elements while H3K27me3 is spread over several kilobases of PRE-centered regions, and the physical interaction of PRC1 with PHO (90).

In mammals, several lines of evidence also argue against the initially proposed model as a general mechanism of action. First, among the thousands of PcG binding regions, only two functional PRE elements have been identified (91, 92). Second, PRC1 genome-wide distribution does not completely overlap with PRC2 and H3K27me3 (15, 16, 42, 48). Last, the distribution of PRC1 at some occupied loci and the global levels of H2Aub1 remain unaffected after H3K27me3 depletion in PRC2-depleted mouse ESCs (16). Rather than a unique mode of recruitment, several mechanisms have been proposed during the past decade to explain these anomalies; these include interactions with sequence-specific DNA binding proteins, histone modifications, ncRNAs, and unmodified DNA (Fig. 3).

Fig. 3 Mechanisms of PcG recruitment to chromatin.

(A to C) Three major mechanisms of recruitment of PcG complexes have been proposed: (A) a DNA-based mechanism in which PcG complexes are targeted to defined DNA sequences. DNA binding domains (DBD) present in different PcG complex components, such as Kdm2b or Aebp2, can mediate the recruitment to CGIs or CG-rich regions. Transient interaction with transcription factors (TF), such as Snail, can also mediate the recruitment of PcG to specific DNA sequences; (B) histone modifications can also mediate the recruitment of PcG through their interaction with chromatin “readers” present in the PcG complexes, such as Cbx and PCL proteins; (C) ncRNAs also interact with PcG complexes and are required for their recruitment to chromatin. Two examples of this are PRC2 recruitment mediated by Xist during XCI and by short nascent transcripts from active promoters. In this latter case, interaction with 5′-nascent RNAs negatively regulates PRC2 methyltransferase activity.

Recruitment mediated by DNA sequences and modifications

On the basis of sequence homology, the zinc finger protein Yy1 was found to be the mammalian ortholog of the DNA binding protein PHO. Studies in HeLa cells suggest that Yy1 could mediate PcG binding to specific loci (93). However, genome-wide analyses showing a poor overlap between PcG and Yy1 and an absence of Yy1 DNA binding elements at PcG target sites indicate that Yy1 is not a major regulator of PcG recruitment in mammals (94, 95). Additionally, other DNA binding proteins have been suggested to direct the binding of PcG to specific loci, such as Rest (96, 97) and Runx1 (98) for PRC1, and Snail for PRC2 (99). This suggests that PcG complexes can be recruited to discrete loci in particular circumstances, for which silencing needs to be efficiently imposed. The presence of Max and Mga in the PRC1-E2F6 complex and the moderate enrichment in E-boxes suggest that this PRC1 variant could be recruited by these sequence-specific transcription factors (41).

Results from several genomic and functional experiments indicate a strong correlation between PRC2 binding and CGIs. Genome-wide analysis of Suz12, Ezh2, and H3K27me3 occupancy shows a remarkable association of PRC2 with a particular subset of CGIs that lack predicted binding sites for transcriptional activators in pluripotent cells (94). Tethering assays using engineered ESCs that harbor bacterial artificial chromosomes containing CGI motifs indicate that these motifs are sufficient to induce de novo recruitment of a catalytically competent PRC2 (100). These results argue that the local configuration of chromatin is a major regulator of PRC2 recruitment and that the mere presence of permissive chromatin would be sufficient to induce PRC2 binding. Along these lines, Riising and colleagues recently showed that gene silencing is a signal that triggers PRC2 recruitment to CGIs (101). They showed that drug-mediated inhibition of RNAPII is followed by de novo recruitment of PRC2 to transcriptionally silenced chromatin in ESCs. After transcription inhibition, the ectopic recruitment of PRC2 is nonrandom and is instead directed to specific loci that contain nucleosome-free CGIs, which appear to be genuine PRC2 targets in other tissues and cell types (101). The fact that loss of PRC2 does not cause global changes in the transcriptome and that transcription silencing precedes PRC2 binding indicates that PRC2 acts to maintain silencing rather than to initiate the signaling cascade. The link between PcG and CGIs is further supported by a redistribution of H3K27me3 and H2Aub1 when global DNA methylation is abrogated in Dnmt1/3 mutant ESCs (102, 103). In addition, genomic analysis of the protein Ten-eleven translocation 1 (Tet1), an enzyme that converts methyl-cytosine to hydroxymethyl-cytosine, indicates that 95% of the Ezh2 target genes and 84% of Suz12 target genes are co-occupied by Tet1 in ESCs (104, 105). Tet1 deletion increases CGI methylation and results in a reduction of Ezh2 binding (104). Consistent with this functional relationship in ESCs, Tet1 interacts with the PRC2 complex, and Suz12 knockdown affects Tet1 binding and the hydroxymethyl-cytosine levels in bivalent promoters (105).

Nevertheless, a major question remains unanswered: How is PRC2 recruited to permissive CGIs? Several scenarios have been proposed that would increase DNA accessibility and potentially explain the recruitment of PRC2 to CGIs: the loss of cytosine methylation, the depletion of nucleosomes, and/or the eviction of transcription factors. Supporting the first option, the PRC2 components Aebp2 and Ezh2 have affinity for unmodified CpG-containing chromatin arrays in vitro, and cytosine methylation reduces their affinity (106). Alternatively, PRC1 has been proposed to bind unmethylated CGIs and thus trigger PRC2 recruitment. The recruitment of PRC1 is mediated by the histone demethylase Kdm2b, which contains a zinc finger CxxC domain that targets the ncPRC1 complex to unmethylated CGIs (32, 33). As further discussed below, ubiquitination of H2A mediated by PRC1 would then trigger PRC2 recruitment (35, 87).

Recruitment mediated by histone modifications

Affinity purification of Drosophila embryo nuclear extracts using H2Aub1 oligonucleosomes recovered several dPRC2 subunits, including Jarid2 and Aebp2 (87). The presence of H2Aub1 enhances the methyltransferase activity of PRC2 in vitro, an effect mediated by Aebp2 (87). An independent study showed that forced tethering of specific Pcgf subunits of the ncPRC1 to an artificial locus induces the de novo recruitment of an enzymatically competent PRC1 complex (35), in a PRC2-independent fashion. Moreover, ectopic recruitment of PRC1 correlates with an increase in H2Aub1 deposition and de novo recruitment of PRC2 (35). Remarkably, acute deletion of Ring1A and Ring1B in ESCs reduces the binding of PRC2 in a genome-wide manner (35). Whether there is a direct link between H2Aub1 and PRC2 recruitment, or whether both PRC1 and PRC2 complexes mutually sustain and/or stabilize their binding to chromatin, remains unknown.

Recently, a link between H3K9me3 and PRC2 has been explored. In tethering assays, the H3K9 methyltransferase G9a can recruit an enzymatically active PRC2 to target genes (107). Both methyltransferases interact physically and co-occupy a subset of genes in ESCs. The recruitment of PRC2 to common target genes relies on a catalytically active G9a (107). The fact that Eed can also efficiently bind methylated H3K9 mononucleosomes (66), and that G9a can monomethylate H3K27, suggests that both marks could provide an anchoring site for the PRC2 complex.

In addition to being methylated, H3K27 can also be acetylated by the acetyltransferases CBP and P300 (108). Acetylation and methylation at the same residue are mutually exclusive (108–110) and correspond to opposing transcription outcomes (111). Several pieces of evidence suggest that regions containing H3K27ac are nonpermissive for PRC2 binding (109, 110). A prior deacetylation by histone deacetylases is a prerequisite for PRC2 to bind to, and be active at, specific loci in ESCs and leukemia cells (109, 110). These results suggest a model in which the transition between the active (H3K27ac) and repressed (H3K27me3) states is mediated by the action of erasers and writers of methyl- or acetyl-modified H3K27 (109, 110).

Histone variants also play a role in PRC1 and PRC2 occupancy at their target genes. In ESCs, the histone variant H2AZ co-occupies a large number of bivalently marked developmental genes (at which both repressive H3K27me3 and active H3K4me3 are present), together with Suz12 and H3K27me3 (112, 113). However, several observations cast doubt on the mutual interdependence of their genome occupancy. First, in addition to bivalent genes (112), H2AZ occupancy seems to extend to active genes that lack PRC2 (113). Moreover, whereas one study shows that knockdown of either H2AZ or Suz12 affects the genome occupancy of the other (112), a further study found that H2AZ remains bound to active and silent genes in Ring1B or Eed knockout ESCs (113). In addition, colocalization of PRC2 and H2AZ is lost in differentiated ESCs, in which H3K27me3 associates with repressed regions and H2AZ occupancy switches toward active chromatin domains (112). Therefore, co-occupancy of PRC2 with H2AZ at bivalent genes may result merely from the configuration of bivalent regions. In ESCs, the histone variant H3.3 is also enriched at bivalent developmental genes together with H3K27me3 (114). Although nucleosome occupancy remains unaffected after H3.3 knockdown, H3K27me3 is severely reduced, as are Jarid2 and Suz12 occupancies. The physical interaction between PRC2 and the H3.3 chaperone HIRA, and the loss of PRC2 recruitment after HIRA knockdown, further support a functional link between PRC2 and this H3 variant. However, H3.3 is not enriched in strongly repressed genes lacking H3K4me3, suggesting a differential mechanism of PRC2 recruitment between bivalent and repressed genes.

Recruitment mediated by ncRNAs and coding RNAs

Several laboratories have reported a physical association of PRC2 components with thousands of both coding RNAs and ncRNAs in different cell types, which could indicate a possible promiscuity of PRC2 in RNA binding (86, 115–119). In contrast, other studies argue in favor of a more specific RNA binding (29, 69, 120–123). Attempts to resolve this apparent contradiction have been recently made, resulting in a model in which PRC2 binds RNAs with different affinities that range from mid to low nanomolar affinities in vitro (124). Nevertheless, whether this broad range of binding affinity is relevant in vivo remains unknown.
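To illustrate what a spread of dissociation constants could mean for occupancy, the sketch below uses a simple one-site equilibrium binding model. This model is an assumption introduced here for illustration (it is not taken from ref. 124), it assumes that the RNA is present at trace concentration relative to the protein, and the concentrations and Kd values are hypothetical.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of RNA bound at equilibrium in a one-site model.

    Assumes free protein concentration is approximately equal to total
    protein concentration (RNA in trace amounts).
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("protein_nM must be >= 0 and kd_nM must be > 0")
    return protein_nM / (protein_nM + kd_nM)


# Hypothetical values: one protein concentration, three dissociation constants.
protein_concentration_nM = 50.0
for kd in (5.0, 50.0, 500.0):
    bound = fraction_bound(protein_concentration_nM, kd)
    print(f"Kd = {kd:g} nM -> fraction bound = {bound:.2f}")
```

Under these assumptions, an RNA is half-bound when the protein concentration equals its Kd, so RNAs whose affinities differ by an order of magnitude can show very different occupancies at the same protein concentration.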

The models proposed for RNA-PcG functionality suggest that these RNAs can regulate gene transcription in cis (115, 118, 120, 121) as well as in trans (122, 123). For instance, a model for RNA-PcG interaction is provided by mammalian XCI. In mammalian females, the extra gene dosage attributed to the X chromosomes is compensated by a mitotically stable transcription silencing of one of the X chromosomes. XCI is mediated by the action of a specific long ncRNA, Xist RNA, encoded in an X-linked gene. Once expressed, Xist coats the X chromosome in cis and triggers the recruitment of chromatin remodeling machineries, including PcG proteins, which impose repressive DNA and histone methylation (120, 121). Despite numerous efforts to define the molecular mechanisms underlying XCI, there are still some experimental discrepancies. For instance, although the initial PRC2 binding region on Xist defined in vitro is required for XCI (121, 125), it seems to be dispensable for PRC2 recruitment in vivo (83). In undifferentiated ESCs, H3K27me3 is enriched over gene bodies and is excluded from CGIs upon ectopic induction of Xist (126). However, in differentiated ESCs, PRC2 and H3K27me3 are broadly distributed along the X chromosome (83, 127). During the early onset of Xist induction in pluripotent ESCs, the silenced X-linked genes do not associate with H3K27me3 (126). This correlates with a poor local colocalization of Xist with Ezh2, Eed, Suz12, and Ring1B, as analyzed with superresolution microscopy (126). These results temporally separate the initial silencing of X-linked genes from PRC2 recruitment and activity, challenging the prevalent model of PRC2 function in XCI.

Another connection between a cis regulatory lncRNA and PcG was also reported at the INK4b-ARF-INK4a locus (29, 128). The antisense lncRNA ANRIL is expressed from the INK4 gene and physically interacts with both Suz12 and the chromodomain of Cbx7, which recruit PRC2 and PRC1, respectively, to this locus. This leads to the deposition of H3K27me3 and H2Aub1 and causes transcription silencing of the INK4b-ARF-INK4a locus. Ultimately, repression of this tumor-suppressor locus triggers cell proliferation and inhibits senescence, therefore inappropriately driving oncogenesis.

Understanding the link between HOTAIR and PRC2 has also been the subject of intense research. HOTAIR is a conserved lncRNA of ~2 kb that acts in trans to impose transcription silencing throughout the genome (122, 123, 129, 130). It is transcribed from the HOXC locus in humans and mice and is thought to provide information about cell positioning by repressing other HOX loci (129, 131). Its deletion in mice causes homeotic transformation by derepression of the HOXD cluster, which correlates with a reduction in H3K27me3 and an increase in H3K4me3 (129, 131). HOTAIR binds to both PcG proteins and the histone demethylase LSD1, driving their occupancy at hundreds of different GA-enriched loci and resulting in H3K27me3 deposition and loss of H3K4me3. Detailed functional studies in human cells have shown that the interaction of HOTAIR and PRC2 is regulated by phosphorylation of Ezh2 by the cell cycle–regulated kinase Cdk1 (132). Additionally, a highly structured minimal region of 89 nucleotides in HOTAIR is sufficient to interact with the Ezh2-Eed dimer (133), which may provide insight into the structural basis of PcG’s RNA binding.

In addition to the RNA interactions with Xist, ANRIL, and HOTAIR, PRC2 components such as Ezh2 also bind the 5′-terminal part of nascent transcripts (118, 119, 134). Binding to the nascent RNAs inhibits the catalytic activity of PRC2 by an unknown mechanism (118, 134). The model proposed suggests that PRC2 initially “samples” the chromatin, and, on active genes, its activity is somehow inhibited by direct contact with nascent RNAs. When a signal triggers the recruitment of gene silencers, transcription of nascent RNAs disappears, and PRC2 actively imposes the mitotically stable silencing. Following the widely accepted idea, this model suggests that PRC2 is dispensable for initiating the transcription silencing, but rather is required for the mitotic maintenance of repressive states.

HOW DOES PcG MEDIATE TRANSCRIPTION REGULATION?

PRC1 and PRC2 regulate RNA polymerase

Seemingly, PcG complexes can alter the chromatin environment both through their catalytic activity, by imposing PTMs on histones, and independently of their catalytic activity, by inducing chromatin condensation (Fig. 4) (135–137). Although chromatin condensation restricts the actions of the ATP (adenosine 5′-triphosphate)–dependent chromatin remodeling complex SWI/SNF in vitro (135, 136, 138, 139), the physiological relevance of this mechanism of action remains unknown.

Fig. 4 Mechanisms of PcG-mediated transcription regulation.

PcG proteins mediate repression and activation of transcription. (A) Three major mechanisms of repression have been proposed. First, in bivalent promoters marked with both repressive (H3K27me3) and active (H3K4me3) histone PTMs, PcG complexes hold the poised RNAPII at the transcriptional start site (TSS), thereby inhibiting its release. Second, PcG complexes can compact chromatin. For PRC1, its ability to compact chromatin appears to be independent of its catalytic activity. Chromatin compaction is proposed to block the accessibility of chromatin remodeling complexes, such as the SWI/SNF complex, which is required during transcription activation. Third, deubiquitination and demethylation of histone H3 at gene bodies are required for efficient transcription elongation by RNAPII. Thus, the histone modifications imposed by PcG at gene bodies might prevent RNAPII processivity during transcription elongation. (B) PRC1 complexes can also regulate gene activation. Two different mechanisms have been proposed, both of which require the action of a protein kinase. The first mechanism involves the phosphorylation by Aurora B kinase of the deubiquitinase Usp16 and the E2-conjugating enzyme Ube2d3, which act in a coordinated manner to block PRC1 activity. Usp16 phosphorylation activates its deubiquitinase activity and hence leads to removal of the ubiquitin from H2AK119. In contrast, Ube2d3 phosphorylation inhibits its activity, thus impairing the E3 ligase activity of PRC1. Therefore, a catalytically impaired PRC1 complex would favor the recruitment of RNAPII to activate gene transcription. In the second mechanism, the CK2 within the PRC1.5 complex phosphorylates Ring1B on serine 168 and inhibits its catalytic activity. Additionally, the subunit Auts2 of the complex triggers the recruitment of the acetyltransferase P300, which acetylates histone tails and enhances transcription.

Transcription is a stepwise process in which the transcriptional machinery, led by RNA polymerase II (RNAPII), is first recruited to promoters, a process facilitated by transcription factors bound to proximal-promoter and distal regulatory enhancer regions. After an initial short transcription, RNAPII is phosphorylated on the serine at the fifth position (S5-Pi) of a heptapeptide that is repeated 52 times in the CTD. At this stage, RNAPII is held close to the TSS, poised for productive transcription elongation. Next, RNAPII is fired by phosphorylation on the serine at the second position of the same heptapeptide (S2-Pi) and is released from the proximity of the TSS toward the gene body. The transcriptional machinery then travels along the gene body together with chromatin remodeling factors that modify nucleosome conformation to favor the elongation of the complex. Finally, transcription terminates at the 3′ end of the gene (140).
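The stepwise process described above can be summarized as an ordered sequence of RNAPII states. The sketch below is only a bookkeeping aid under that simplification; the state names and the `next_state` helper are hypothetical labels introduced here, and the real process is neither strictly linear nor limited to these four states.

```python
from enum import Enum


class RnapiiState(Enum):
    """Simplified RNAPII states, listed in the order described in the text."""
    RECRUITED = "recruited to the promoter"
    POISED = "held near the TSS, CTD phosphorylated on serine 5"
    ELONGATING = "released into the gene body, CTD phosphorylated on serine 2"
    TERMINATED = "transcription terminated at the 3' end of the gene"


# Enum members iterate in definition order, which encodes the step order.
ORDER = list(RnapiiState)


def next_state(state):
    """Return the state that follows `state` in the simplified cycle."""
    index = ORDER.index(state)
    if index == len(ORDER) - 1:
        raise ValueError("transcription has already terminated")
    return ORDER[index + 1]


state = RnapiiState.RECRUITED
while state is not RnapiiState.TERMINATED:
    print(f"{state.name}: {state.value}")
    state = next_state(state)
print(f"{state.name}: {state.value}")
```

In this picture, the repression mechanisms discussed in this section act at specific transitions, for example by holding RNAPII in the poised state or by impairing elongation.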

The monoubiquitination of lysine 119 on H2A and the di-/trimethylation of lysine 27 on H3, by PRC1 and PRC2, respectively, are thought to block gene transcription directly (Fig. 4), although an indirect action of PcG over additional non-histone substrates that regulate transcription cannot be completely ruled out.

Globally, the H3K27me3 mark encompasses distal enhancers, proximal promoters/TSS, and gene bodies (141, 142), suggesting that its functional role could be dependent on the chromatin context. For instance, studies in Drosophila analyzing the genome-wide distribution of RNAPII, H3K27me3, and H3K4me3 in both wild-type and mutants of extra sex combs (the ortholog of mammalian Eed) indicate that H3K27me3 limits RNAPII recruitment to gene promoters (143). In addition, studies in human myeloid cells show that demethylation of H3K27me3 at genes also marked with H3K4me3 induces a release of paused polymerase but does not significantly affect RNAPII recruitment (144). Accordingly, during differentiation of muscle cells, demethylation of H3K27me3 on gene bodies is required for proper RNAPII elongation of developmentally regulated genes (Fig. 4) (145).

PRC2 is also required for deposition of mono- and dimethylation of H3K27 (146, 147). Recent data show that H3K27me1 is located in actively transcribed regions, promoting their transcription. Conversely, H3K27me2, which represents 70% of total H3, is broadly located along repressed chromatin domains (147). These results suggest that fine-tuning the methyltransferase activity of PRC2 might result in different transcriptional outputs.

H2Aub1 has also been implicated in restraining RNAPII elongation. Zhou and colleagues proposed that H2Aub1 prevents the recruitment of the histone chaperone FACT at promoter regions, thus inhibiting H2A-H2B dimer eviction and, consequently, RNAPII release (148). As an alternative model, Stock and collaborators suggested that H2Aub1 favors the switch of RNAPII to a yet-uncharacterized conformation that has less processive activity (149). Supporting this notion, deletion of Ring1b in ESCs causes the activation of poised developmental genes without affecting the transcription elongation–associated RNAPII, which is phosphorylated at serine 2/5 on its CTD (149). However, allele-specific genomic characterizations of PcG co-occupancy with RNAPII at different stages of the transcription process indicate that PcG repression is associated only with RNAPII phosphorylated at serine 5 on its CTD (RNAPII-S5P), across CpG-rich genes (150). At silent developmental genes, PcG and unproductive RNAPII-S5P co-occupy promoters and coding regions, which produce immature RNAs (Fig. 4). In a subset of moderately expressed genes that are mostly related to development and metabolism, the distribution of PcG and elongating RNAPII is allele-specific and mutually exclusive, with PcG and RNAPII-S5P co-occupying the repressed allele and elongating RNAPII-S2P/S5P/S7P occupying the other, active allele (150). Alternatively, H2Aub1 has been proposed to inhibit transcription initiation but not elongation, by preventing H3K4 di- and trimethylation during hepatocyte regeneration (151).

Transcription activation by PRC1

In recent years, a radically opposite scenario has emerged that places PcG complexes as activators of gene expression (36, 152, 153). During the differentiation of ESCs toward an ectodermal fate, PRC1 is required for the initial activation of developmental genes (152). This switch in the PRC1 role from repressor to activator is due to a switch in subunits from Cbx7 to Cbx8 (152). In this case, PRC1 also occupies H3K36me3-decorated active genes. How this switch favors gene activation remains unknown. In a study using quiescent lymphocytes, PRC1 was shown to be recruited to actively transcribed genes independently of PRC2 (153). In this context, PRC1 co-occupies active target genes together with the Aurora B kinase. This kinase phosphorylates and activates the Usp16 deubiquitinase, which then deubiquitinates H2Aub1 (Fig. 4). Additionally, Aurora B phosphorylates the E2-conjugating enzyme Ube2d3 to inhibit its catalytic activity, which is required for Ring1B E3 ligase function (Fig. 3) (153). The authors suggest that the activating role of PRC1 relies on the capacity of the complex to recruit active RNAPII (149, 150). Thus, in contrast to ESCs, in which the active form of PRC1 retains RNAPII in a poised state, the inactive PRC1 in quiescent lymphocytes would enable the processivity of RNAPII.

Recently, it has become clear that the PRC1.5 complex is involved in gene activation after its phosphorylation: the CK2 protein present in the complex phosphorylates Ring1B at serine 168 (36). This phosphorylation severely reduces the catalytic activity of Ring1B. Additionally, another subunit of the complex, the Auts2 protein, triggers the recruitment of p300, which acetylates histone tails and thereby favors transcription activation (Fig. 4) (36).

Finally, PcG seems to be a coactivator of gene expression by regulating local topological interactions. A recent study demonstrated that Ring1B plays a role in enhancer-promoter interactions at the developmental mouse gene Meis2 (154). Meis2 is repressed during early development and becomes active during midbrain development. The transition from the repressive to the active state correlates with the topological interaction of the Meis2 promoter and a midbrain-specific enhancer (MBE) within the gene. This expression transition is facilitated by Ring1B, which functions as a molecular bridge, thus facilitating the interaction of MBE, the Meis2 promoter, and a Ring1B-binding site located downstream of the poly(A) site of the gene (154). Considering that 25% of Ring1B-binding sites are outside of promoter regions in mouse neural ESCs (154), this suggests that the role of PRC1 on topological organization could be extended to other genes.

Regulation of chromatin structure by PcG

TADs are three-dimensional (3D) structures of the genome that are highly conserved across cell types and species. TADs are separated from each other by sharp boundaries that contain binding sites for architectural proteins such as CTCF and cohesin (Fig. 5). Different TADs segregate according to transcriptional levels, histone modifications, chromatin accessibility, and replication timing (4, 155, 156). Correspondingly, regions of the genome featuring transcriptionally permissive (or euchromatic) histone marks tend to form active TADs and are mainly found within the active nuclear environment or “HiC compartment A.” Active TADs spatially segregate from transcriptionally repressed TADs, which are associated with facultative heterochromatin (“HiC compartment B”) or the peripheral nuclear lamina (lamina-associated domains) (156).

Fig. 5 PcG shapes intra-TAD interactions.

In ESC nuclei, the genome is compartmentalized on the basis of the preferential interactions between genomic elements, forming multilooped structures called TADs (see text) of active chromatin (depicted in green) and transcriptionally repressed chromatin (in orange). Chromatin loops are flanked by insulator proteins such as the CTCF transcription factor (in gray). Top panel: Hypothetical chromosome conformation capture (3C) data showing pairwise interaction frequencies (in red) occurring between two active TADs (green) segregated from an inactive TAD (orange). The active and inactive TADs are densely marked by H3K4me3 and H3K27me3, respectively. Bottom panel: Upon ESC differentiation, the overall TAD structure and location of TAD boundaries are not altered, but small rearrangements occur, which correlate with a redistribution of histone PTMs.

Since the discovery of TADs and chromatin loops, particular attention has been paid to how these 3D genomic structures change during development and cellular differentiation, when cells need to precisely and dynamically tune the lineage-specific gene expression programs that are essential for maintaining cell identity (157). In ESCs, two special types of chromatin loops spatially organize the genes involved in cell identity. On the one hand, genes involved in ESC self-renewal are contained within the so-called super-enhancer domains. Transcription of these genes is governed by super-enhancers, intergenic regions characterized by an exceptionally high occupancy of the RNAPII coactivator Mediator, core pluripotency factors (Oct4, Sox2, and Nanog), and the histone mark H3K27ac. On the other hand, similar to the super-enhancer domains, the repressed lineage-specifying genes are organized within chromatin structures known as PcG domains. PcG domains average 112 kilo–base pair (kbp) and include most (70%) of the PcG-associated genes, contained within a loop of densely marked H3K27me3 chromatin flanked by CTCF/cohesin sites (142). Accordingly, PcG domains contain an exceptionally high density of PRC2 subunits along with their associated histone marks. Some of the best characterized PcG domains are the Hox gene clusters (HoxA-D). High-resolution 3C studies have revealed that inactive Hox genes form large H3K27me3-marked TADs located within HiC compartment A in the nucleus of ESCs and away from the lamina-associated nuclear periphery (158). Recent studies indicate that H3K27me3 regions may aggregate due to the intrinsic affinity between H3K27me3-decorated loci and the overall chromosome folding patterns of each cell type (158). During embryonic development, Hox genes are sequentially activated to allow for the correct patterning of the vertebrate body axis. When transcription is activated, specific Hox genes progressively segregate into an active TAD compartment (Fig. 4) (159, 160). This TAD reorganization is accompanied by a switch in histone modifications, whereby H3K27me3-marked regions are labeled with the opposing H3K4me3 mark during transcription activation (158, 159, 161).

TADs therefore change with the reorganization of chromatin marks, but is PcG/H3K27me3 required for TAD formation or maintenance? Chromosome conformation assays in ESCs show that loss of PRC2 (and therefore its associated H3K27me3 mark) had a minimal effect on global genome conformation, only causing the disruption of specific interactions between regions densely enriched by Polycomb within the TAD (162). Similar conclusions were obtained when analyzing the TAD organization of the inactive X chromosome, which showed no differences in TAD size or position in Eed knockout cells, whereas deletion of a 58-kbp region inside the TAD boundary led to changes in long-range interactions and transcription misregulation (7). Together, these results suggest that PRC2 is not a major driver of global genome architecture, although higher-resolution 3C experiments should be performed to rule out the possibility that PRC2 is required to maintain specific enhancer-promoter interactions during development and cellular differentiation.

Nonetheless, from these and other studies, the notion emerges that the integrity of the PcG domain boundaries is a strong determinant of TAD organization and transcriptional output. Accordingly, the presence of intact boundaries flanking PcG domains is required for full transcriptional repression of the genes within the domain, as specific mutations that disrupt CTCF boundaries lead to a relocalization of PcG proteins outside the boundaries, with a resulting increase in expression of the genes within the domain (142, 161, 163). Analogous results were found in Drosophila, in which depletion of dCTCF caused a decrease in H3K27me3 within the Polycomb domain (164). Overall, these data indicate that high occupancy of PcG proteins within TADs might help to stabilize and consolidate—but not to establish—topological domains of transcriptionally inactive regions of the genome.

CONCLUDING REMARKS

PcG complexes are evolutionarily conserved proteins that regulate gene expression. We have highlighted the current understanding of the molecular mechanisms by which PcG complexes regulate transcription. To the best of our knowledge, PcG proteins can modulate transcription by (i) associating with each other in a rational manner to form distinct functional complexes; (ii) targeting silenced as well as actively transcribed regions in a controlled fashion; (iii) imposing PTMs on histone tails and compacting chromatin, thus altering the chromatin environment; and (iv) modulating directly the transcriptional machinery, by regulating its accessibility to DNA and/or the processivity of RNAPII. Despite this broad overview, elucidating the details of these mechanisms still faces some major challenges. First, although there have been efforts to comprehensively characterize the complex subunits at the proteomic level, we lack an in-depth structural model of the PcG complexes, because their size and heterogeneity hamper a comprehensive and precise characterization of the three-dimensional structures. Additionally, elucidation of the complex 3D structures will require different structural approaches to be integrated, from nuclear magnetic resonance spectroscopy to proximity-ligation strategies combined with mass spectrometry. Constructing such structural models is important, because it will provide insight not only into the complex assembly but also into its genome targeting and mechanisms of regulation. Second, chromatin immunoprecipitation combined with massively parallel sequencing has been widely used to define the genomic distribution of different PcG proteins, which supports both redundant and nonoverlapping functions. A major goal in the future will be to understand the interplay between different complexes in a temporal manner, such as during cellular differentiation, as well as during oncogenic transformation. Third, despite the strong correlation of H3K27me3 and H2Aub1 with gene repression, it is not fully understood how these marks affect transcription. The identification of H3K27M gain-of-function mutations in a subset of highly aggressive pediatric gliomas might help to decipher the impact of H3K27me3 in gene regulation. A single mutation of histone H3.3 Lys27-to-Met27 results in a blockage of the PRC2 methyltransferase activity and H3K27me depletion (165). Recent proteomic studies from Drosophila animal models for this mutation show that nucleosomes containing the mutated histone H3 are devoid of PRC2 components and are enriched in bromodomain proteins (166). The extensive characterization of similar models at the molecular level will clarify the role of this PTM in gene transcription. Fourth, the emerging 3C technologies are providing new avenues for understanding how transcription of subsets of genes is regulated in a coordinated manner during biological processes. The role of PcG in defining global chromosome architecture is still unclear. However, detailed changes in promoter-enhancer interactions are missing due to the limited resolution of the 3C studies currently published. Hence, strategies to improve the genome interaction maps will clarify the precise role of PcG in regulating 3D chromatin architecture.

Box 1

Modes of epigenetic inheritance of H3K27me3.

During DNA replication, parental histones marked with PTMs are deposited back onto nascent DNA (167). Deposition of newly synthesized histones supplies the extra demand of nucleosomes at nascent DNA. The coexistence of both parental and new histones supports a model of precise maintenance of histone marks on mature DNA after replication fork progression, in which the marks on parental histones serve as a blueprint for the modification of new histones placed in the vicinity (167). For H3K27me3, the affinity of different PRC2 subunits, such as Eed, for the methyl mark would ensure its self-propagation by recruiting their cognate enzyme (66, 168). Therefore, considering this mechanism of propagation, H3K27me3 would fulfill the criteria to be considered an epigenetic feature stably maintained at a specific locus across cell divisions. However, recent proteomic analyses of radioactively labeled histones in HeLa cells challenge the view that H3K27me3 is inherited at specific nucleosomal positions. These studies indicate that, after replication, the total levels of H3K27me3 are not fully reestablished until the G1 phase of the next cell cycle (169–171). Indeed, detailed analyses show that faithful propagation of total levels of H3K27me3 requires the continuous modification of new histones as well as previously unmodified parental histones during several cell generations (171). This model of transmission indicates that, rather than requiring H3K27me3 to be deposited at a specific nucleosomal position, the mark is distributed throughout the locus to reach a threshold level required to maintain its epigenetic state (170, 171).
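The dilution-and-restoration logic of this model can be made concrete with a toy calculation. The sketch below is an illustrative assumption, not a model taken from the cited studies: it assumes that each round of replication halves the fraction of marked histones (parental histones mixed 1:1 with unmarked new histones) and that, in each subsequent time step, a fixed fraction of the remaining unmarked histones is modified. The function name and the parameters `steps_per_cycle` and `restore_rate` are hypothetical.

```python
def simulate_mark_level(cycles, steps_per_cycle, restore_rate, start=1.0):
    """Toy model of H3K27me3 dilution at replication and gradual restoration.

    Returns the fraction of marked histones after every restoration step.
    """
    if not 0.0 <= restore_rate <= 1.0:
        raise ValueError("restore_rate must be between 0 and 1")
    level = start
    trace = []
    for _ in range(cycles):
        # Replication: marked parental histones are diluted by unmarked new ones.
        level *= 0.5
        for _ in range(steps_per_cycle):
            # Restoration: a fixed fraction of unmarked histones gains the mark.
            level += restore_rate * (1.0 - level)
            trace.append(level)
    return trace


# Hypothetical parameters: 3 cell cycles, 4 steps per cycle, slow restoration.
for step, level in enumerate(simulate_mark_level(3, 4, 0.15), start=1):
    print(f"step {step:2d}: marked fraction = {level:.3f}")
```

With a slow restoration rate, the marked fraction in this toy model does not return to its starting value before the next replication, which mirrors the idea that total H3K27me3 levels are maintained by continuous modification across the locus rather than by immediate copying at fixed nucleosomal positions.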

As an alternative model of transmission, data from the Drosophila embryo indicate that the PcG proteins themselves convey epigenetic information to impose stable silencing and self-perpetuate across cell divisions (172). In Drosophila, proximity ligation assays have shown that the PcG proteins Pc and E(z), and the Trithorax protein Trx are associated with nascent DNA within 200 bp of the replication fork (172). The authors suggest that PcG and TrxG remain associated to the chromatin during replication fork progression and are ready to resume the marks at the nucleosomes on nascent DNA (172). In contrast to the situation in HeLa cells (171), trimethylated H3K27 and H3K4 were not detected in the initial 200 bp of nascent DNA in Drosophila embryos, although this cannot exclude that parental histones are loaded after this initial DNA track. Accordingly, specific deletion of a PRE, using the FLP recombinase at different time points during Drosophila development, leads to loss of silencing within one or a few cell generations (10). These data indicate that the PRE is required not only to initiate but also to maintain the silencing state, whereas the histone marks, once established, do not self-perpetuate throughout development, therefore questioning their epigenetic nature.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.

REFERENCES AND NOTES

Acknowledgments: We are indebted to L. Morey, P. Vizan, and M. Beringer for critical reading of the manuscript. We also thank all the members of Di Croce laboratory for helpful discussions. We thank V. A. Raker for author’s editing. We apologize to any colleagues whose work has not been cited due to space limitations. Funding: This work received the support of the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia (to S.A. and G.M.), the Lady Tata Memorial Trust (to S.A.), and “Ia Convocatoria de Ayudas Fundación BBVA a Investigadores, Innovadores y Creadores Culturales” (to G.M.M.). This work in Di Croce’s laboratory is supported by grants from the Spanish “Ministerio de Educación y Ciencia” (SAF2013-48926-P), from AGAUR (Agència de Gestió d’Ajuts Universitaris i de Recerca), from La Marató TV3, from the Fundación Vencer El Cáncer (VEC), Spanish Ministry of Economy and Competitiveness, “Centro de Excelencia Severo Ochoa 2013-2017” (SEV-2012-0208), and from the European Commission’s 7th Framework Program 4DCellFate (grant number 277899). Author contributions: S.A., G.M., and L.D.C. wrote the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials.