Research ArticleBIOCHEMISTRY

Chromatin capture links the metabolic enzyme AHCY to stem cell proliferation

See allHide authors and affiliations

Science Advances  06 Mar 2019:
Vol. 5, no. 3, eaav2448
DOI: 10.1126/sciadv.aav2448


Profiling the chromatin-bound proteome (chromatome) in a simple, direct, and reliable manner might be key to uncovering the role of yet uncharacterized chromatin factors in physiology and disease. Here, we have designed an experimental strategy to survey the chromatome of proliferating cells by using the DNA-mediated chromatin pull-down (Dm-ChP) technology. Our approach provides a global view of cellular chromatome under normal physiological conditions and enables the identification of chromatin-bound proteins de novo. Integrating Dm-ChP with genomic and functional data, we have discovered an unexpected chromatin function for adenosylhomocysteinase, a major one-carbon pathway metabolic enzyme, in gene activation. Our study reveals a new regulatory axis between the metabolic state of pluripotent cells, ribosomal protein production, and cell division during the early phase of embryo development, in which the metabolic flux of methylation reactions is favored in a local milieu.


Most fundamental cellular functions rely on chromatin-based processes, such as transcription, replication, and inheritability of genetic and epigenetic information. In mammals, chromatin structures are spatially and functionally organized into compartments by the combination of strong and weak molecular interactions (1). To date, our knowledge on the distributions of chromatin-bound factors is limited to a reduced number of proteins and histone modifications (2, 3). Genomic, transcriptomic, and global proteomic profiling can only predict the proteomic composition of the cellular chromatin. Moreover, proteins lacking a chromatin/DNA recognition motif are difficult to foresee as chromatin-associated factors, thereby limiting their characterization. Hence, a simple, straightforward, and reliable “chromatomic” approach is in high demand to define the proteomic composition of the cellular chromatin and to discover unpredicted chromatin-associated factors de novo.


In the last decade, several methods have been designed to study the proteomic composition of the chromatin, the so-called chromatome (4) (table S1). However, most of these methodologies have notable limitations, resulting in the lack of a unique and solid surveyor assay that serves as a gold standard method for the systematic chromatin-bound proteome profiling. These limitations include (i) the relative methodological complexity, which makes them unsuitable as a routine laboratory technique; (ii) the presence of contaminants (such as antibodies) that hamper the detection of proteins by mass spectrometry (MS); and (iii) the incompatibility of the purified chromatin preparations as chromatin immunoprecipitation (ChIP)–able material for downstream functional analysis. In 2011, two independent laboratories developed a technology to capture nascent DNA labeled with the thymidine analog 5-ethynyl-2′-deoxyuridine (EdU) during DNA replication (5, 6). These two groups used the simple and highly efficient “Click chemistry” to covalently link biotin moieties to nascent EdU-labeled DNA that was further captured using streptavidin columns. Combined with high-resolution MS, the Click-assisted chromatin purification method allowed the definition of the first unbiased picture of the proteins associated to DNA during replication in cancer and pluripotent stem cells (7, 8). On the basis of the recent DNA-mediated chromatin pull-down (Dm-ChP) technology (6) and with the aim of surveying the chromatome of proliferating cells in a simple and robust manner, we designed an experimental strategy to capture the whole EdU-labeled genome under physiological conditions (Fig. 1A). Our strategy relies on (i) the efficient incorporation of the thymidine analog EdU during DNA replication in a noncytotoxic manner, (ii) the addition of a biotin moiety to the incorporated EdU under mild conditions, and (iii) streptavidin-biotin affinity binding to capture sheared EdU-labeled chromatin. As shown in Fig. 1B, the incubation of mouse embryonic stem cells (ESCs) with EdU for a short period of 10 min enables its incorporation into nascent DNA in living cells, displaying a punctuated distribution throughout the nucleus of ESCs in S phase, together with the replication-associated protein RPA32. In contrast, after incubation for 20 hours, EdU was uniformly distributed in the nuclei of virtually all ESCs in culture, indicating a global EdU incorporation into their genomes (Fig. 1B). As depicted in Fig. 1B, our strategy includes a formaldehyde cross-link step (1% FA, 10 min) to preserve protein-protein and protein-DNA interactions during the sample preparation, similarly to the ChIP procedure (step 1). After a mild cell permeabilization (0.1% Triton X-100), a biotin moiety is attached in situ to the incorporated EdU under mild conditions by Click chemistry (step 2). Chromatin is sheared by sonication to yield fragments of about 500 base pairs (bp) (step 3; Fig. 1A and fig. S1A), and streptavidin-biotin affinity columns are used to capture EdU-labeled chromatin (step 4; Fig. 1A). Captured chromatin is extensively washed, reverse cross-linked, and eluted by heating under reducing conditions [2% SDS and 0.1 M dithiothreitol (DTT); step 5]. Last, chromatin-bound proteins are further analyzed by conventional blotting methods or by MS techniques (step 6; Fig. 1A). Specificity of chromatin capture can be monitored by dot blot (Fig. 1C) or Western blot (fig. S1B). While histone H3 was efficiently captured from ESCs incubated with EdU (+EdU; Fig. 1C and fig. S1B), it remained undetectable on eluates from nonincubated ESCs (−EdU). In contrast, the cytoskeleton vinculin or β-actin proteins were absent in eluates from either +EdU- or −EdU-incubated cells (Fig. 1C and fig. S1B). We show that a similar experimental pipeline can be used for other cell types (fig. S1C) under distinct growth conditions (e.g., grown in suspension or adherent cultures) (fig. S1C).

Fig. 1 Dm-ChP for chromatin-bound proteome profiling.

(A) Schematic representation of the Dm-ChP technique for chromatin-bound proteome profiling. Cells were incubated with a very low amount of EdU during a long pulse (0.1 μM EdU for 20 hours) to label the whole genome. The DNA and associated proteins were cross-linked with 1% formaldehyde (FA) for 10 min, and the cells were lysed (step 1). The EdU-labeled DNA was conjugated to a biotin group by a Click reaction (step 2) and then fragmented by sonication (step 3). Labeled DNA fragments were isolated using streptavidin magnetic beads (step 4) and eluted using Laemmli buffer (step 5). Eluted samples were analyzed by Western blot or high-resolution MS (step 6). μM, micromolar. LC-MS/MS, liquid chromatography–tandem mass spectrometry (B) ESCs were pulsed with EdU at the times indicated and then stained with an anti-RP32 antibody, and EdU was detected by a Click reaction. Insets show magnifications of the image. DAPI, 4′,6-diamidino-2-phenylindole. (C) Input material and eluates prepared by Dm-ChP from nonincubated (−EdU) and EdU-incubated mouse ESCs (+EdU) were analyzed in triplicate by dot blot using the indicated antibodies. Dashed red circles indicate the position of the dotted samples. (D) Experimental scheme for the proteomic survey of pluripotent ESCs using Dm-ChP with high-resolution MS (Dm-ChP–MS). (E) Volcano plot of the 1812 proteins identified in the proteomic analysis in ESCs. Proteins enriched in EdU-ESCs are shown on the right, and those enriched in (non-EdU) ESCs are shown on the left. FDR, false discovery rate. (F) Pie chart representing the functional groups, by manual curation, of the 488 chromatin-bound proteins found in ESCs by Dm-ChP–MS and Significance Analysis of INTeractome (SAINT). The number of proteins is indicated in parentheses. (G) Venn diagram indicating the overlap between the manually curated list of previously reported chromatin modifiers [from (6365)] and our Dm-ChP–MS dataset. The list includes the top expressed chromatin modifiers in mouse ESCs (RPKM > 50). RPKM, Reads Per Kilobase Million. (H) Selected functional protein networks associated with chromatin organization and remodeling, DNA modification, and pluripotency. Related subnetworks are depicted separately according to their functional roles. The legend indicates the color code for the log2 of the fold changes in +EdU/−EdU conditions and the presence and significance of each protein in our dataset. Selected protein complexes are indicated in red. ACF (ID_925), adenosine 5′-triphosphate–dependent chromatin assembly factor complex; Mll2 (ID_6460), mixed-lineage leukemia 2 complex; NoRC (ID_5694), nucleolar remodeling complex; NuRD (ID_61), nucleosome remodeling deacetylase complex; PRC2 (ID_996), polycomb repressive complex 2; Sin3A (ID_283), paired amphipathic helix protein Sin3a complex; WSTF (ID_236), Williams syndrome transcription factor–containing complex (abbreviations for protein complexes are given with the identifier number of the archetypic complex at the CORUM database). ADP, adenosine 5′-diphosphate.

Incubation times of 48 hours or longer with 10 μM EdU have been shown to be cytotoxic in several human cancer cell lines (911). Although this concentration is a hundred times more concentrated than that we use in our experimental pipeline (0.1 μM EdU for 20 hours), we analyzed whether incorporation of EdU affects normal cell behavior. We did not observe any changes in the transcriptional program of EdU-incubated ESCs by RNA sequencing (RNA-seq) (fig. S2A). Further, phosphorylation levels of histone H2AX at serine 139 (γH2AX), an early marker of DNA damage, remained unchanged after EdU incorporation (fig. S2B). We additionally analyzed the subcellular distribution of γH2AX by immunofluorescence. In both nonincubated and EdU-incubated ESCs, γH2AX displays a typical cell-to-cell heterogeneous staining with the presence of discrete nuclear foci, which are characteristic of normally growing asynchronous ESCs (fig. S2C) (12). Last, cell cycle profile analysis showed equivalent proportions of cells in different phases of the cell cycle in EdU-ESCs and nonincubated cells, which is a signal of normal cell cycle progression (fig. S2D). In sum, all these results indicate that the incorporation of EdU in ESCs using our experimental scheme is compatible with normal physiological conditions, and that, in combination with Dm-ChP technology, it enables one to survey the intact chromatome of proliferating cells.

Pluripotent ESCs have a unique open chromatin with transcriptionally active and poised genomic regions established by the interplay between pluripotent transcription factors and chromatin remodeling complexes (13). To date, a partial proteomic view of the chromatin of ESCs has been gained by focusing on genomic regions occupied by the pluripotent factors (14), or decorated by specific modified histones (15), or by characterizing the strongly chromatin-bound insoluble nuclear material (16), or focusing on nascent chromatin (8). With the aim of uncovering both the strong and weakly chromatin-bound proteome of pluripotent ESCs in a global and unbiased manner, we combined Dm-ChP with high-resolution MS (Dm-ChP–MS; Fig. 1D). In total, we measured the relative abundance of 1812 proteins using label-free quantitative proteomics (Fig. 1D). The volcano plot distribution showed a large number of proteins enriched on chromatin captured from EdU-incubated ESCs, while the eluates from nonincubated control ESCs displayed very few enriched proteins (Fig. 1E). To provide a comprehensive dataset for the pluripotent chromatin-bound proteome, we used the Significance Analysis of INTeractome (SAINT) algorithm to identify the bona fide protein-chromatin interactions in the Dm-ChP–MS experiments (17). Using this approach, we identified a total of 488 confident chromatin-bound proteins of ESCs, of which nearly 60% primarily have a typical chromatin-based function, such as chromatin organization and remodeling, DNA repair, replication and modification, transcription, and RNA processes (i.e., splicing and capping) (Fig. 1F). Gene Ontology (GO) term enrichment analysis showed nucleic acid and chromatin binding as top category terms related with the molecular function of these 488 proteins (fig. S3A). Collectively, we found that about 75% of the top expressed ESC-specific chromatin modifiers described in the literature were present in our Dm-ChP samples (Fig. 1G). In particular, we purified the complete polycomb repressive complex 2 (PRC2) and the nearly complete mixed-lineage leukemia 2/COMplex of Proteins ASSociated with Set1 (MLL2/COMPASS)–related complex 2; these are two key protein complexes associated with transmission of the epigenetic cellular memory (Fig. 1H) (18, 19). In addition, we found factors that have been functionally linked with self-renewal and pluripotency capacity of ESCs, including the Brg-associated factors (esBAFs) (20) and a protein network centered around the pluripotency factors POU5F1 and SOX2.

Together, our results constitute the first broad characterization of the chromatin-bound proteome in pluripotent cells, and it encompasses major DNA factors and chromatin-associated complexes essential for sustaining ESC identity and the pluripotency transcriptional program.

In addition to relevant, well-known chromatin-associated proteins, our dataset included proteins with no obvious connections with chromatin (Fig. 1F). In addition to their primary function, increasing experimental evidence supports a chromatin function for some of these proteins, including proteasome subunits (21), ribosomal subunits (22, 23), and ribonuclear proteins (24). We aimed to validate the strength of our Dm-ChP–MS strategy to identify the potential role of chromatin-associated proteins de novo. Among the most biologically and pathologically relevant proteins from this group, we identified adenosylhomocysteinase (AHCY, also known as S-adenosyl-l-homocysteine hydrolase), a major one-carbon pathway metabolic enzyme and one of the most conserved enzymes across living organisms, including bacteria, nematodes, yeast, plants, insects, and vertebrates (25) (fig. S3, B and C). The homotetrameric structure of AHCY is a result of a dimer of dimers that catalyzes the reversible hydrolysis of S-adenosylhomocysteine (SAH) to adenosine and l-homocysteine (26) (Fig. 2A). The AHCY substrate SAH is the product of the S-adenosyl-l-methionine (SAM)–dependent methyltransferases (MTases), which transfer the methyl group from SAM to a variety of cellular substrates, including DNA, RNA, and proteins. The mammalian AHCY is the only enzyme capable of hydrolyzing SAH. Accordingly, a chromosomal deletion that includes Ahcy is embryonically lethal in mice at the peri-implantation stage (between 4.5 and 5.5 days post coitum) (27). Moreover, AHCY dysfunction has been associated with a wide variety of human disorders, including vascular disease, cancer, diabetic nephropathy, severe myopathy, and neurodevelopmental delay followed by childhood death (2832). Despite the functional, pathological, and biological relevance of AHCY, and the presence of the methylation reactions in different subcellular compartments, no direct evidence exists for genomic recruitment or distribution of AHCY, and its role in stem cell biology remains unknown. We validated our MS data by Dm-ChP–Western blot using a specific antibody against AHCY (Fig. 2B). We confirmed the presence of a partial fraction of cellular pool of AHCY in the chromatin of ESCs growing in either serum-containing media plus leukemia inhibitory factor (LIF) or the so-called 2i-LIF media, which promotes the homogeneous expansion of preimplantation ground state–like ESCs. We confirmed the presence of AHCY in the nucleus of ESCs in culture, and the specificity of the nuclear signal was evaluated by reducing the expression of AHCY using short hairpin RNAs (shRNAs) against Ahcy (Fig. 2C). Using available single-cell RNA-seq data (33), we observed Ahcy expression during early mouse development (Fig. 2D). Ahcy expression levels increase with the acquisition of the pluripotency of blastomers, correlating with expression of the pluripotency marker Nanog (fig. S4) and the formation of the preimplantation blastocysts (Fig. 2D). Immunostaining and confocal microscopy analysis on mice embryos at the blastula stage revealed that AHCY is located in the cell nucleus (Fig. 2E), and it is most expressed in pluripotent NANOG-positive cells. These results indicate that AHCY has a nuclear function during early mammalian development.

Fig. 2 The AHCY enzyme is recruited to chromatin in pluripotent ESCs.

(A) Scheme of the methionine (Met) metabolic pathway. AHCY is depicted in red. MAT, methionine adenosyltransferase; MS, methionine synthase; MTase; methyltransferase; HCy, Homocysteine. (B) Input and chromatin isolated by Dm-ChP from ESCs, under serum-containing or 2iLIF culture conditions, with nonincubated (minus) or EdU-incubated cells (plus), were analyzed by Western blot using the indicated antibodies. H3 was used as a control of efficient chromatin purification by Dm-ChP. Note that specific AHCY bands correspond to the dimer in size that is resistant to reverse cross-linking by heating. (C) shControl and shAhcy-KD ESCs were stained with a specific anti-AHCY antibody and show the loss of a specific nuclear signal in all three shAhcy-KD cells. Nuclei were counterstained with DAPI. (D) Single-cell expression values of Ahcy during mouse development, from zygote to the late blastocyst stage. Each dot represents a cell from the original data of (33). scRNA-seq, single cell RNA-seq. (E) Representative confocal section covering the inner cell mass and the trophectoderm layers of mouse late preimplantation blastocysts (E4.5) immunostained for AHCY or NANOG.

We next studied the genomic occupancy of AHCY by ChIP followed by massive parallel sequencing (ChIP-seq). By combining the data from two independent ChIP-seq replicates, we identified 665 confident binding sites for AHCY in the mouse genome (Fig. 3A, fig. S5A, and table S3). We validated several AHCY peaks by ChIP and quantitative polymerase chain reaction (ChIP-qPCR; Fig. 3B) and confirmed that Ahcy knockdown led to reduction of the AHCY ChIP signal at specific promoters (Fig. 3B). Genome-wide analysis of ChIP-seq data revealed that AHCY is preferentially recruited to regions in close proximity to transcription start sites (TSS) (Fig. 3C), covering a narrow window around TSS of target genes (Fig. 3, D and E). We found that AHCY target genes are among the most highly expressed genes in ESCs (Fig. 3F) and are decorated with transcriptionally permissive histone posttranslational modifications (H3K36me3, H3K4me3, and H3K27ac) and only low levels of transcriptionally repressive marks (e.g., H3K27me3 and 5mC) (fig. S5). In addition, we found a high correlation between AHCY occupancy and key factors associated with the cancer-related ESC-specific Myc network, including Myc, TRIM28, and DMAP1 (Fig. 3G). The Myc network defined in ESCs is primary linked to proliferation and ribosomal activity and shares transcriptional activity in cancer cells (34), suggesting a potential role of AHCY in ESC proliferation. In agreement, we found that AHCY was enriched at TSS of a large number of ribosomal and mRNA processing genes (Fig. 3H), which are strongly expressed in ESCs, in line with their highly proliferative and transcriptional activity. In sum, these data indicate that AHCY associates at TSS of transcriptionally active genes in ESCs, with a potential direct impact in their transcriptional regulation at the chromatin level.

Fig. 3 AHCY occupies transcription start sites (TSS) of highly expressed genes in ESCs.

(A) Genomic visualization of AHCY ChIP-seq at several target sites. (B) AHCY ChIP-seq validation by ChIP-qPCR at TSS of AHCY target genes (Map2k5, Rbl1, Hdac2, Rps20, Rpl18, Rpl4, and Rpl17) and AHCY nontarget genomic regions (TSS of Spink2 and intergenic region at Chr15) as a negative control. ChIP-qPCRs were performed in shControl cells (n = 3) and shAhcy-KD ESCs with three independent shRNAs, showing the specific reduction in AHCY binding in shAhcy-KD ESCs (mean ± SD of three technical replicates). IgG, immunoglobulin G. (C) Genomic distribution of ChIP-seq peaks of AHCY. The spie chart represents the distribution of AHCY peaks corrected by the genome-wide distribution of each gene genomic feature (indicated in the background circle distribution). The spie charts indicate that AHCY preferentially occupies TSS neighborhood regions, including 5′ untranslated region (5′UTR) and proximal promoter regions. (D) Heat map showing the AHCY ChIP-seq occupancy profile around TSS (±5 kb) in two independent ChIP-seq replicates, with IgG as a negative control. (E) Meta-gene plot showing the AHCY ChIP-seq occupancy profile (red line) around TSS (±2 kb). IgG distribution is indicated by the gray line. (F) Box plots indicating the gene expression for the whole transcriptome in naïve ESCs, fractionated into five groups, from non-expressed genes (black box plot) to the highly expressed genes (light blue box plot). The red box plot indicates the expression of AHCY target genes. (G) Percentage of AHCY co-occupancy with chromatin factors from Myc, core, and PcG networks. (H) Pathway enrichment analysis of AHCY target genes. EGFR1, epidermal growth factor receptor 1; MAPK, mitogen-activated protein kinase; TNF-α, tumor necrosis factor–α; and NF-κB, nuclear factor κB.

The efficient knockdown of Ahcy had no effect on the levels of the DNA damage marker γH2AX, the formation of dome-shaped ESC colonies in culture (Fig. 4, A and B), or the expression levels of key pluripotency genes, such as Pou5f1, Nanog, and Sox2 (fig. S6), which are indications of normal stem cell identity. However, our unsuccessful attempts to generate full CRISPR knockout ESCs for Ahcy suggest that a low basal AHCY activity is essential for ESC fitness. Lowering the levels of AHCY resulted in a 40 to 50% reduction in the number of ESCs in culture (Fig. 4C). This reduction is in agreement with an increased number of cells in the G1 phase and a concomitant decrease of mitotically active S phase cells in shAhcy-KD ESC cultures (Fig. 4, D and E), thus indicating that AHCY functions to promote the proliferation of pluripotent cells.

Fig. 4 AHCY promotes the expression of ribosomal protein genes, protein synthesis, and proliferation in ESCs.

(A) Total cell extracts from shControl- or shAhcy (three independent shRNAs)–infected cells were analyzed by Western blot with the indicated antibodies. The protein vinculin was used as a loading control. (B) Images of ESC colonies growing under naïve culture conditions, displaying the typical dome shape in both shControl and shAhcy-KD ESCs. Scale bar, 100 μm. (C) Histogram indicating the relative number of shControl and shAhcy-KD ESCs after 48 hours of growing under naïve conditions (mean ± SE, n = 4 for shControl, shAhcy#1, and shAhcy#3 and n = 3 for shAhcy#2; *P ≤ 0.05, **P ≤ 0.01, two-tailed Student’s t test). (D) Cell cycle profile of shControl and shAhcy-KD ESCs analyzed by fluorescence-activated cell sorting (FACS). PI, propidium iodide. (E) Histogram indicating the percentage of shControl and shAhcy-KD ESCs in different phases of the cell cycle (mean ± SE; *P ≤ 0.05, two-tailed Student’s t test; n = 2 shControl and n = 5 shAhcy). (F) Gene set enrichment analysis (GSEA) plots for the gene expression data generated by RNA-seq from cells infected with shControl (n = 2) or shAhcy (n = 5). The complete mouse genome was ranked according to their log2 fold change expression between shControl- and shAhcy-infected cells. Genes down-regulated in shAhcy are distributed on the left of the plot, and those up-regulated are distributed on the right. The top significant KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway is shown together with the computed nominal P value, the q value, and the normalized enrichment score (NES). Distribution of ribosomal protein genes is indicated in blue lines, as well as the curve for the evolution of the gene density. (G) Violin plots indicating the expression of ribosomal protein genes calculated from RNA-seq. Each dot represents average expression of a ribosomal protein gene. Red dots indicate the ribosomal protein genes targeted by AHCY. ***P < 0.001, Wilcoxon matched pair signed-rank test. (H) Histogram indicates the relative incorporation of the methionine analog L-HPG in shControl and shAhcy-KD ESCs as analyzed by FACS. (I) Quantification of high and low percentage of L-HPG–incorporated cells (mean ± SE; *P ≤ 0.05, two-tailed Student’s t test; n = 3 shControl and n = 9 shAhcy). (J) Mouse preimplantation embryos (E3.5) were cultured ex vivo in the presence of 3-deazaadenosine (3-DZA) or DMSO (as vehicle) for 18 hours. Embryos were then pulsed once with L-HPG and labeled using the Click reaction. Developmental progression of embryos was evaluated by (K) counting the number of cells per embryo or (L) quantifying the total L-HPG incorporation per embryo and normalizing this by the number of cells. *P < 0.05, **P < 0.01, ***P < 0.001, Mann-Whitney test two-tailed; n = 3 for E3.5, and n = 10 for 18-hour vehicle and n = 9 for 18-hour AHCYi. (M) Model for the AHCY-dependent proliferation control of pluripotent cells. Nutrients are the primary source of cellular methionine, which, in turn, can be incorporated into the newly synthetized proteins (1) or it can be used to produce SAM as a methyl donor for MTases (2). Elevated levels of SAH produced by the transfer of the methyl group can inhibit the MTases. AHCY is recruited to highly transcribed genes, such as ribosomal protein genes, to favor the efficient methylation reactions at these loci and the production of mature transcripts in an efficient manner. Elevated ribosomal protein production sustains the rate of protein synthesis and proliferation in ESCs.

To gain insight into the mechanism of action of how AHCY promotes proliferation, we performed gene set enrichment analysis (GSEA) with transcriptome data from shControl and shAhcy-KD ESCs. We identified the ribosomal protein genes as the most prominent category of genes transcriptionally affected by AHCY depletion (Fig. 4, F and G). Proteomic analysis of shControl and shAhcy-KD ESCs indicated a consistent reduction in the ribosomal proteins encoded by AHCY target genes, further supporting their role in ribosome biogenesis (fig. S7, A and B). To test the impact of reducing ribosome production on protein synthesis upon AHCY depletion, we measured the efficiency of incorporating the methionine analog L-homopropargylglycine (L-HPG) in newly synthesized proteins at the single-cell level by fluorescence-activated cell sorting (FACS) (fig. S8, A and B). We discovered that the protein synthesis rate was remarkably reduced in shAhcy-KD ESCs as compared to shControl cells (Fig. 4, H and I). Similarly, the pharmacological blockage of AHCY activity with 3-deazaadenosine (3-DZA), a potent and high specific inhibitor, reduced the protein synthesis rate in ESCs in a dose-dependent manner (fig. S8, C and D). Mouse preimplantation embryos cultured in the presence of 3-DZA displayed severe developmental growth delay and a remarkable reduction in the protein synthesis rate (Fig. 4, J to L). Together, these findings support a functional role of AHCY in controlling the proliferation of pluripotent cells during early mammalian development by mechanistically regulating ribosomal protein gene expression and protein synthesis rate (Fig. 4M).


In summary, the Dm-ChP–MS pipeline presented in this study is a powerful method for chromatome profiling and for the identification of unpredicted chromatin-associated proteins. The experimental approach highlights its simplicity, its robustness, and its usability in a wide range of proliferating cells. In the last decade, several methodologies have been designed to study the proteomic composition of cells using different strategies. These methodologies fall into three major groups according to the material captured and the method used for capturing (table S1). Several of these strategies are relatively complex methodologies that include the genomic engineering of epitope-tagged proteins (e.g., HTB or TAP), the genomic insertion of DNA recognition sites at a specific locus (e.g., LexA), or the capture of specific genomic repetitive regions by oligo hybridization and pulldown (e.g., telomeres). Other chromatin-purifying methods result in a material with a remarkable abundance of contaminants, such as antibodies, that can mask the detection of low-abundant chromatin proteins. Last, traditional fractionation methodologies base their strategy on the inherent insolubility of the chromatin fraction, which is not an exclusive characteristic of the chromatin compartment. These strategies do not completely rule out the existence of contaminants, and even the more sophisticated one (as ChEP) required computational filtering based on complex machine-learning approaches to provide a probabilistic score for each protein identified as a putative chromatin factor. We believe that Dm-ChP methodology overcomes all these limitations by (i) being a simple experimental procedure easily implemented by researchers with limited experience in biochemical purification procedures, allowing a throughput approach; (ii) being a robust methodology with the absence of contaminants (e.g., antibodies), which could mask the detection of low-abundance chromatin proteins; and (iii) allowing the maintenance of the weak interactions of proteins due to in situ cross-link reaction. In addition, considering that the Dm-ChP relies on the incorporation of EdU during replication, we envision that Dm-ChP could be applied to purify chromatin from highly proliferative tumoral cells or from proliferating adult stem cells within a non-proliferative tissue, thereby highlighting important regulators in cancer and stem cell biology.

With the aim of testing the strength of Dm-ChP to identify relevant chromatin-associated proteins de novo, we have discovered the unexpected functional link of the metabolic enzyme AHCY with ribosomal protein gene transcription and stem cell proliferation. The biological and pathological relevance of AHCY is in line with its unique capacity to catabolize SAH metabolite. Elevated levels of SAH/SAM result in a competitive inhibition of SAM-dependent MTases (28). Several reports have highlighted a functional connection between AHCY and cancer cell fitness (3538). However, the proposed mechanism of action are diverse, including the activation of the DNA damage response in human cancer cell lines (3537) or attenuating the induction of mRNA cap methylation and protein synthesis upon ectopic expression of Myc in immortalized rat and human cells (38). By using Dm-ChP–MS, together with chromatin profiling by ChIP-seq, we have discovered that AHCY is a chromatin-associated protein that occupies TSS of active genes. We did not observe any sign of DNA damage response activation in shAhcy-KD ESCs, including comparable levels of γH2AX to control cells, the normal dome-shaped morphology of ESCs colonies, and the stable maintenance of pluripotent gene expression. This, together with the high correlation in genomic occupancy that we observed between AHCY and Myc-related factors, suggests that AHCY could be recruited to chromatin to facilitate an efficient mRNA cap methylation on specific genes in ESCs. Overall, our data point to a dual role of methionine for the control of protein synthesis: first, as a building block for protein production and, second, as a metabolic intermediate for the control of ribosome biogenesis (Fig. 4M).

By using Dm-ChP, we have provided additional data challenging the classical compartmentalization of cytosolic metabolic pathways and nuclear chromatin regulation (39). Recent studies propose that the chromatin recruitment of metabolic enzymes is considered an adaptive cell solution for the subcellular local demand of metabolites (4042). Our findings suggest that AHCY recruitment to spatially confined genomic regions might favor a balanced SAM/SAH composition at SAH-sensitive chromatin subcompartments, thereby facilitating a transcription-permissive environment. Beyond the functional significance for the control of pluripotent cell proliferation, our discovery provides a mechanistic insight into the long-lasting factual connection between perinatal nutrition, which fuels the methionine cycle, and embryo development.


Cell culture

E14TG2a from mouse embryo (male blastocyst, strain 129/Ola) (Sigma-Aldrich) were cultured on tissue culture plates coated with 0.1% gelatin (Millipore) under serum-free or serum conditions, as indicated. For serum-free (so-called 2iLIF) cultures, ESCs were grown in serum-free N2B27 medium [(Dulbecco’s modified Eagle’s medium (DMEM)/F12/Neurobasal 1:1, 0.5× N-2 supplement, 1× B-27 serum-free supplement, 50 μM 2-mercaptoethanol, bovine albumin fraction V 0.033%, 1× glutaMAX, 1× minimum essential medium (MEM) nonessential amino acids, and 1× penicillin-streptomycin; all from Gibco] supplemented with 1 μM PD0325901, 3 μM CHIR99021, and LIF. For serum-containing cultures, ESCs were cultured in knockout DMEM, 20% knockout serum replacement, 1× nonessential amino acids, 2 mM glutaMAX, 5 mM Hepes, and 0.5 mM 2-mercaptoethanol (all from Gibco) supplemented with LIF.

For stable depletion of gene expression, ESCs were incubated with media containing plasmid-based shRNA lentiviral particles for 24 hours. After 48 hours, infected cells were selected for 3 days with puromycin (2 μg/ml; Sigma).

For protein synthesis rate assays, cells were incubated with 50 μM L-HPG (Invitrogen) in serum-free and methionine-free DMEM (Gibco) supplemented with 0.5× N-2 supplement, 1× B-27 serum-free supplement, 50 μM 2-mercaptoethanol, bovine albumin fraction V 0.033%; 1× glutaMAX, 1× MEM nonessential amino acids, 1× penicillin-streptomycin, 1 μM PD0325901, 3 μM CHIR99021, and LIF. Additional leukemic and epithelial carcinoma human cell lines were cultured in suspension or under adherent conditions, as indicated in Table 1.

Table 1 List of human cell lines used in this study.

CIMA, Centro de Investigación Médica Aplicada, Spain; CRG, Center for Genomic Regulation, Spain; EC, epithelial carcinoma; FBS, fetal bovine serum (Hyclone); IMIM, Institut Hospital del Mar d’Investigacions Mèdiques, Spain; IRB, Research Institute Barcelona, Spain; LC, leukemic cells; NaPyr, sodium pyruvate (Gibco); P/S, penicillin-streptomycin (Gibco); Qx, glutaMAX (Gibco); RPMI 1640 (Gibco); GM-CSF, granulocyte-macrophage colony-stimulating factor; h.i., heat inactivated.

View this table:

Lentivirus production and infection

Lentiviruses were produced as described previously (43). Human embryonic kidney (HEK) 293 T packaging cells were transfected using the calcium phosphate transfection method with 5 μg of pCMV-VSV-G, 6 μg of pCMVDR-8.91, and 7 μg of the pLKO-shRNA (Sigma-Aldrich) plasmid, together with either TRC2 MISSION pLKO.5 harboring nonmammalian shRNA control or shRNAs against three different regions of the Ahcy transcript (#1, GAGCAAATGTCACCAACTTTG; #2, CGGTGGAGAAAGTGAACATCA; #3, GTGGACCCACCCAGATAAATA). Cells were incubated with the transfection mix for 12 to 16 hours, after which the medium was replaced by ESC (LIF-free) fresh medium. After 48 hours, lentiviral particles were collected, filtered with a 0.45-μm filter, and stored at −80°C for further use. Lentiviral particles were prepared according to the Spanish National Biosafety Commission guidelines and the relevant European directives.

Western blotting

Samples were analyzed by Western blot, as described previously (44). The antibodies used are listed in Table 2.

Table 2 List of antibodies used in this study.

DB, dot blot; IF, immunofluorescence; WB, Western blot; HRP, horseradish peroxidase.

View this table:

Fluorescence-activated cell sorting

Cell cycle analysis. Cells were dissociated to single cells with trypsin (Gibco) and fixed with ethanol. DNA was stained with propidium iodide (Sigma-Aldrich). Cells were analyzed by FACS using FACScalibur (Becton Dickinson), and data were analyzed using FlowJo v10. The number of single cells collected for analysis was 10,000.

Protein synthesis. Cells were dissociated to single cells with trypsin, fixed with 4% paraformaldehyde (PFA), and analyzed by Click reaction [100 mM tris-HCl (pH 8), 2 mM CuSO4, 2 μM Alexa Fluor 647 azide (Invitrogen), and 100 mM ascorbic acid] for 30 min at room temperature (RT). Cells were analyzed by FACS using LSRII (Becton Dickinson), and data were analyzed using FlowJo v10. The number of single cells collected for analysis was between 2000 and 9500.

Chromatin immunoprecipitation–quantitative polymerase chain reaction

ESCs were cross-linked in 1% PFA for 10 min at RT in a shaker. The fixation reaction was stopped by adding a final concentration of 0.125 M glycine (pH 7.4) to the culture and incubating for 5 min. Cells were washed twice with 1× phosphate-buffered saline (PBS) at RT and harvested in ice-cold 1× PBS plus protease inhibitor cocktail (PIC; Roche). Pellets were resuspended in 1.3 ml of ice-cold IP buffer, 1× volume SDS buffer [100 mM NaCl, 50 mM tris-HCl (pH 8), 5 mM EDTA (pH 8), and 0.5% SDS], and 0.5 volume Triton dilution buffer [100 mM NaCl, 100 mM tris-HCl (pH 8.6), 5 mM EDTA (pH 8), and 5% Triton X-100] with PIC (Roche). Samples were sonicated for 3 × 10 min (30 s on/30 s off) in a Bioruptor (Diagenode) at maximum output. After sonication, samples were centrifuged at 4°C at maximum speed for 15 min. To check the yield of chromatin shearing, 1 to 5% of input material was reverse cross-linked for 3 hours at 65°C in a shaker (1000 rpm) in high-salt buffer (1× PBS, 500 mM NaCl) plus 0.2 μg of proteinase K (Invitrogen), followed by PCR purification using a kit (Qiagen). A total of 50 μg of sheared chromatin of 200 to 500 bp was immunoprecipitated with 5 μg of antibody (anti-AHCY or rabbit IgG, as control) to a final volume of 500 μl. ChIP reactions were incubated overnight at 4°C on rotation. The next day, 30 μl of blocked protein A agarose beads [0.05% bovine serum albumin (BSA) on IP buffer] was added to the ChIP reactions and incubated for 2 hours at 4°C. After incubation, beads were washed three times with 1 ml of salt buffer [140 mM NaCl, 50 mM Hepes (pH 7.5), and 1% Triton X-100] and once with 1 ml of high-salt buffer [500 mM NaCl, 50 mM Hepes, (pH 7.5), and 1% Triton X-100]. ChIP samples were eluted using 200 μl of freshly prepared elution buffer (1% SDS and 100 mM NaHCO3) and reverse cross-linked at 65°C overnight with shaking (1000 rpm). DNA was purified using a PCR purification kit (Qiagen). DNA was eluted with 100 μl of water.

Real-time PCR reactions were performed using SYBR Green I PCR Master Mix (Roche) and the Roche LightCycler 480. The oligonucleotides used in this study are given in Table 3.

Table 3 List of primers used in this study.

View this table:

Chromatin immunoprecipitation sequencing

Chromatin material for ChIP-seq was prepared using the ChIP-IT High Sensitivity Kit from Active Motif, following the manufacturer’s instruction. ChIP-seq libraries were prepared with 2 to 10 ng of ChIP DNA material using the NEBNext Ultra DNA library Prep Kit for Illumina (New England Biolabs) following the manufacturer’s instruction. ChIP-seq libraries were amplified for 10 to 15 PCR cycles. Libraries were sequenced using a HiSeq 2500 sequencer (Illumina). Table 4 provides sequencing depth data.

Table 4 List of sequencing runs.

View this table:

RNA extraction and library preparation

RNA extraction was performed using the RNeasy Mini Kit (Qiagen) following the manufacturer’s instruction. RNA samples were quantified, and quality was evaluated using a Bioanalyzer (RIN > 9.9). A total of 0.5 μg of RNA was used as starting material for library preparation with rRNA (ribosomal RNA) depletion using the TruSeq Stranded Total RNA Library Prep Kit (Ilumina) following the manufacturer’s instruction. Libraries were sequenced using a HiSeq2500 sequencer (50 bp, single end) with HiSeq v4 chemistry. Sequencing depth details are provided in Table 4.

Dot blot analysis

Samples were spotted in 1-μl dots onto a nitrocellulose membrane (Protan, 0.2 μM, Amersham) in triplicate, air-dried, and subjected to standard blotting procedures with the indicated antibodies.

Embryo collection and culture

All animal procedures were performed in accordance with Spanish and Catalan laws and overseen by the Institutional Animal Care and Use Ethics Committee of the Barcelona Biomedical Research Park. Embryos were collected from 5- to 8-week-old (C57BL/6J) superovulated females crossed with males (B6D2F1/J). Embryos were collected at 98 hours after human chorionic gonadotrophin injection (phCG) at the early blastocyst stage. Flushed embryos were cultured in M16 (SIGMA) drops under oil at 37°C and 5% CO2 in the presence of dimethyl sulfoxide (DMSO) or 3-DZA (200 nM; Cayman Chemical) for 18 hours. Embryos were then incubated for an additional hour with 50 μM L-HPG, at which point embryos were fixed in 4% paraformaldehyde for 10 min.


Cells or flushed mouse embryos (CD1 strain, E4.5) were fixed in 4% PFA for 10 min, incubated for 30 min in blocking buffer [1% BSA, 10% fetal bovine serum (FBS), and 0.1% Triton X-100 in PBS], and then stained with primary antibodies for 2 hours at RT in blocking buffer with 5% FBS. Samples were washed three times for 5 min at RT with blocking buffer and incubated with conjugated secondary antibodies for 1 hour at RT. Where indicated, a Click reaction [100 mM tris-HCl (pH 8.0), 2 mM CuSO4, 2 μM Alexa Fluor 647 azide (Invitrogen), and 100 mM ascorbic acid] was performed for 30 min at RT. Samples were analyzed using a Leica TCS SPE inverted confocal microscope, and optical sections were captured. For quantification of signal intensity, the sum of the complete Z stack projection image was generated and quantified using ImageJ processing software (

Dm-ChP–mass spectrometry

Sample processing. Eluted proteins were reduced, alkylated, and digested to peptide mixes according to the filter-aided sample preparation (45) method using LysC [1:10 (w/w), enzyme/substrate] at 37°C overnight followed by trypsin [1:10 (w/w), enzyme/substrate] at 37°C for 8 hours. Tryptic peptide mixtures were desalted using a C18 UltraMicroSpin column (46).

LC-MS analysis. Samples were analyzed in an LTQ-Orbitrap Velos Pro mass spectrometer (Thermo Fisher Scientific) coupled to nano-LC (Proxeon) equipped with a reversed-phase chromatography 25-cm column with an inner diameter of 75 μm, packed with 1.9-μm C18 particles (Nikkyo Technos Co.). Chromatographic gradients changed from 97% buffer A, 3% buffer B to 65% buffer A, 35% buffer B over 120 min at a flow rate of 250 nl/min, in which buffer A was 0.1% formic acid in water and buffer B was 0.1% formic acid in acetonitrile. The instrument was operated in DDA mode, and full MS scans with one microscan at a resolution of 60,000 were used over a mass range of mass/charge ratio (m/z) 350 to 2000 with detection in the Orbitrap. Following each survey scan, the top 20 most intense ions with multiple charged ions above a threshold ion count of 5000 were selected for fragmentation at a normalized collision energy of 35%. Fragment ion spectra produced via collision-induced dissociation were acquired in the linear ion trap. All data were acquired with Xcalibur software v2.2.

Data analysis. Acquired data were analyzed using the Proteome Discoverer software suite (v1.4, Thermo Fisher Scientific) and Mascot search engine (v2.5). Data were searched against SwissProt mouse for the mouse cell lines and against SwissProt human for cancer tissue samples; in both cases, the most common contaminants were added (47). A precursor ion mass tolerance of 7 parts per million (ppm) at the MS1 level was used, and up to three miscleavages for trypsin were allowed. The fragment ion mass tolerance was set to 0.5 Da. Oxidation of methionine and protein acetylation at the N terminus were defined as variable modification. Carbamidomethylation on cysteines was set as a fixed modification. The identified peptides were filtered using false discovery rate (FDR) < 5%.

Interactome analysis. Analysis of specific chromatin interactors was carried out with SAINT, as previously described (17) and using a threshold corresponding to FDR < 1%. Protein interaction data were retrieved from the STRING database (48). Only experimentally validated protein-protein interactions, curated databases, and text-mining information were considered. Functionally related protein networks were visualized using Cytoscape (v3.6.1) (49).

Shotgun mass spectrometry

Sample processing. Cells were lysed for 30 min in urea lysis buffer (6 M urea and 0.2 M ammonium bicarbonate) at RT. Then, samples were sonicated (Bioruptor, Diagenode) for 10 min at high intensity (30 s on/30 s off pulses) and clarified by centrifugation, and the protein concentration was quantified with a BCA Protein Assay Kit (Thermo Fisher Scientific). Equal amounts of protein sample were analyzed by MS, as follows. Samples were reduced with DTT (10 mM, 1 hour, 37°C), alkylated in the dark with iodoacetamide (20 mM, 30 min, 25 C), and digested. The resulting protein extract was first diluted 1:3 with 200 mM NH4HCO3 and digested using LysC [1:10 (w/w), enzyme/substrate] at 37°C overnight followed by trypsin [1:10 (w/w), enzyme/substrate] at 37°C for 8 hours. Tryptic peptide mixtures were desalted using a C18 UltraMicroSpin column.

LC-MS analysis. Peptide mixes were analyzed using an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific, San Jose, CA, USA) coupled to an EasyLC [Thermo Scientific (Proxeon), Odense, Denmark]. Peptides were loaded directly onto the analytical column and were separated by reversed-phase chromatography using a 50-cm column with an inner diameter of 75 μm, packed with 2-μm C18 particles spectrometer (Thermo Scientific, San Jose, CA, USA). Chromatographic gradients started at 95% buffer A and 5% buffer B with a flow rate of 300 nl/min and gradually increased to 22% buffer B in 79 min and then to 35% buffer B in 11 min. After each analysis, the column was washed for 10 min with 5% buffer A and 95% buffer B. Buffer A was 0.1% formic acid in water and buffer B was 0.1% formic acid in acetonitrile.

The mass spectrometer was operated in DDA mode, and full MS scans with one microscan at a resolution of 120,000 were used over a mass range of m/z 350 to 1500 with detection in the Orbitrap. Auto gain control was set to 2E5 and dynamic exclusion was set to 60 s. In each cycle of DDA analysis following each survey scan, top-speed ions charged 2 to 7 above a threshold ion count of 1e4 were selected for fragmentation at a normalized collision energy of 28%. Fragment ion spectra produced via high-energy collision dissociation were acquired in the ion trap. All data were acquired with Xcalibur software v3.0.63.

Data analysis. Proteome Discoverer software suite (v2.2, Thermo Fisher Scientific) and Mascot search engine [v2.5, Matrix Science (2)] were used for peptide identification and quantification. The data were searched against an in-house generated database containing all proteins corresponding to mouse in the SwissProt database plus a list of common contaminants and all the corresponding decoy entries (released, April 2018). A precursor ion mass tolerance of 7 ppm at the MS1 level was used, and up to three missed cleavages for trypsin were allowed. The fragment ion mass tolerance was set to 0.5 Da. Oxidation of methionine and protein acetylation at the protein N-terminus were defined as variable modification, whereas carbamidomethylation on cysteines was set as a fixed modification. Identified peptides were filtered using a 5% FDR. Protein abundance was estimated using the area under the chromatographic peak of the three most intense peptides per protein. Data were log-transformed, and fold changes, P values, and q values were calculated to assess protein relative quantification.

DNA-mediated chromatin pulldown

Cells were pulsed for 20 hours (for ESCs) or 48 hours (for human cancer cell lines) with 0.1 μM of the thymidine analog, EdU (Invitrogen). Subsequently, the cells were fixed in 1% PFA for 10 min at RT and quenched with 0.125 mM glycine (pH 7) for 5 min at RT. Cells were harvested, pelleted by centrifugation (720g, 10 min at 4°C), and lysed for 30 min at 4°C in lysis buffer A [10 mM Hepes (pH 7.9), 10 mM KCl, 1.5 mM MgCl2, 0.34 M sucrose, 10% (v/v) glycerol, 1 mM DTT, 10 mM β-glycerol phosphate, 1 mM sodium orthovanadate, PIC (Roche), and 0.1% (v/v) Triton X-100]. Nuclei were pelleted by centrifugation (1300g, 4 min at 4°C), washed with PBS + PIC, and subjected to Click reaction for 30 min at RT with 0.2 mM biotin-azide (Invitrogen). In Click, an organic azide reacts with a terminal acetylene, and the nucleotide-exposed ethynyl residue of EdU is derivatized by a copper-catalyzed cycloaddition reaction, to form a covalent bond between EdU and biotin. Nuclei were repelleted by centrifugation (1300g, 4 min at 4°C), washed with PBS + PIC, and suspended in shearing buffer [20 mM tris-HCl (pH 7.4), 150 mM NaCl, 1 mM EDTA, 0.1% (w/v) SDS, 0.5% (w/v) sodium deoxycolate, 1% (v/v) Triton X-100, 10 mM β-glycerol phosphate, 1 mM sodium orthovanadate, and PIC]. Nuclei suspension was extensively sonicated (Bioruptor, Diagenode) for four to six cycles of 10 min at high intensity (30 s on/30 s off pulses). Lysates were centrifuged (20,800g, 20 min at 4°C), and supernatant was collected as input for further analysis. To analyze shearing efficiency, 5% of input material was reverse cross-linked by incubation overnight at 65°C with 250 mM NaCl and then digested with proteinase K (0.1 mg/ml) for 1 hour at 55°C. DNA was purified using the PCR purification kit from Qiagen, following the manufacturer’s instruction. For chromatin capture, input material was diluted 1:4 with blocking buffer [1% Triton X-100, 2 mM EDTA (pH 8), 150 mM NaCl, 20 mM tris-HCl (pH 8), 20 mM β-glycerol phosphate, 2 mM sodium orthovanadate, PIC, and salmon sperm DNA (10 mg/ml)] and then incubated with preblocked Dynabead M-280 streptavidin (Invitrogen) for 30 min at 4°C. Beads were washed twice with blocking buffer (without salmon sperm DNA), twice with high-salt blocking buffer (containing 500 mM NaCl), and once with tris-EDTA buffer (pH 8). Beads were suspended in modified Laemmli buffer [2% (v/v) SDS, 0.06 M tris-HCl (pH 6.5), and 0.1 M DTT] for 20 min at 95°C, and the supernatant was collected for analysis by dot blot, Western blot, or MS. For MS analysis of Dm-ChP samples, 30 × 106 to 40 × 106 ESCs were lysed in 5 ml of lysis buffer A, incubated in 1 ml of Click reaction buffer, sheared in 4 ml of shearing buffer, incubated with 0.5 ml of Dynabeads M-280 streptavidin, and resuspended in 150 ml of modified Laemmli buffer.

Bioinformatic analysis

Analysis of ChIP-seq data. ChIP-seq samples were mapped against the mm9 mouse genome assembly using BowTie with the option –m 1, to discard reads that could not be uniquely mapped to just one region (50). MACS (Model-based analysis of ChIP-Seq) was run with the default parameters, but the shift size was adjusted to 100 bp to perform the peak calling against the corresponding control sample (51). Peaks from two replicates were merged, and those with a fold-enrichment score above 4.50 were selected as confident AHCY binding sites. Genome distribution of each set of peaks was calculated by counting the number of peaks fitting on each class of region according to RefSeq annotations. Distal region is the region 2.5 kilo–base pairs (kbp) to 0.5 kbp upstream of the TSS; proximal region, from 0.5 kbp until the TSS; UTR, untranslated sequence; CDS, protein coding sequence; introns, intronic regions; and intergenic, the nonintronic regions of the genome. Peaks that overlapped with more than one genomic feature were proportionally counted the same number of times. To generate spie charts, the genome distribution of all features in the full genome was first calculated and then the R-caroline package was used to combine the spie chart of each set of peaks with the full genome distribution (52). Each set of target genes was retrieved by matching the ChIP-seq peaks in the region 2.5 kbp upstream of the TSS until the end of the transcripts as annotated in RefSeq. Heat maps displaying the density of ChIP-seq reads 5 kb around the TSS of each target gene set were generated by counting the number of reads in this region for each individual gene and normalizing this value with the total number of mapped reads of the sample. Genes on each ChIP heat map were ranked by the logarithm of the averaged number of reads on the same genomic region. The University of California Santa Cruz (UCSC) genome browser was used to generate the screenshots shown for each group of experiments (53).

GO term enrichment was performed using the DAVID (Database for Annotation, Visualization and Integrated Discovery) database with the total set of proteins against the entire mouse genome as the background (54). The Enrichr tool was used to analyze pathway enrichment (55). The Mouse Genome Informatics tool was used to retrieve the mammalian loss-of-function phenotype ( (56). The human disorders associated with selected genes were retrieved using the BioMart tool from Ensembl (57). Protein interaction data were retrieved from the STRING database (48) using the protein mode. Only interactions from protein-protein interaction databases, curated databases, and text-mining information were considered. The network was visualized using Cytoscape (v3.6.0) (49). Genomic co-occupancy was analyzed using the BindDB online tool ( Published datasets used for co-occupancy analysis and histone mark level analysis are shown in Table 5.

Table 5 List of available datasets used in this study.

View this table:

Analysis of RNA-seq data. RNA-seq samples were mapped against the mm9 mouse genome assembly using TopHat (58) with the option --g 1 to discard reads that could not be uniquely mapped to just one region. Cufflinks and Cuffdiff were run to quantify the expression in RPKMs of each annotated transcript in RefSeq and to identify the list of differentially expressed genes on each case (59). Qlucore Omics Explorer 3.4 software (Qlucore, Lund, Sweden) was used to perform GSEA with predefined gene sets, and GSEA software ( was used to perform GSEA with custom-defined gene sets (60, 61).

Accession numbers

Raw data and processed information of the ChIP-seq and RNA-seq experiments generated in this article were deposited in the National Center for Biotechnology Information Gene Expression Omnibus repository under the accession number GSE116799. The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (62) partner repository with the dataset identifier PXD011670.


Supplementary material for this article is available at

Fig. S1. Chromatin capture from pluripotent ESCs by Dm-ChP.

Fig. S2. Culture conditions for Dm-ChP are compatible with normal physiology.

Fig. S3. Biological and pathologically relevant chromatin-associated proteins in ESCs.

Fig. S4. Expression of key developmental genes during early mammalian development.

Fig. S5. AHCY is recruited to the TSS of transcriptionally active genes.

Fig. S6. Expression of pluripotency genes upon Ahcy knockdown.

Fig. S7. Proteomic analysis of ribosomal protein encoded by AHCY targets.

Fig. S8. Inhibition of AHCY reduces rate of protein synthesis.

Fig. S9. Gating strategy for cell cycle and HPG incorporation analysis by FACS.

Table S1. Principal methods developed to identify protein components of the chromatin.

Table S2. Proteomic data.

Table S3. ChIP-seq data.

Table S4. RNA-seq data.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank P. Vizán and members of the Di Croce laboratory for critical reading of the manuscript and insightful discussions, V.A. Raker for scientific editing, S. Nakagawa for advice on mouse embryo manipulation, and the CRG Genomics Unit, CRG/UPF Flow Cytometry Unit, and the CRG Advanced Light Microscopy Unit for assistance with sequencing, FACS, and microscopy services, respectively. Funding: We acknowledge support from the Spanish Ministry of Economy, Industry and Competitiveness to the EMBL partnership, Centro de Excelencia Severo Ochoa, the CERCA Programme/Generalitat de Catalunya, the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia (to S.A. and A.A.-C.), and the Lady Tata Memorial Trust (to S.A.). The CRG/UPF Proteomics Unit is a member of the ProteoRed PRB3 consortium that was supported by grant PT17/0019 of the PE I+D+i 2013–2016 from the Instituto de Salud Carlos III (ISCIII), ERDF, and “Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat de Catalunya” (2017SGR595). The Di Croce Laboratory was supported by grants from the Spanish Ministerio de Educación y Ciencia (BFU2016-75008-P), AGAUR, and La Marato TV3. Author contributions: S.A. conceived and planned this project; collected, analyzed, and interpreted data; and wrote the manuscript with input from coauthors. A.A.-C. collected, analyzed, and interpreted data. E.Bl. performed the bioinformatics analysis of deep sequencing data. E.Bo. and E.S. performed and analyzed proteomic data. C.C. conducted and analyzed data. L.D.C. conceived and planned this project. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The accession number for the ChIP-seq and RNA-seq data reported here is GSE116799. MS proteomics data were deposited in the ProteomeXchange Consortium via the PRIDE (62) partner repository with the dataset identifier PXD011670. All other data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

Stay Connected to Science Advances

Navigate This Article