Research Article: CELL BIOLOGY

A novel landscape of nuclear human CDK2 substrates revealed by in situ phosphorylation


Science Advances  17 Apr 2020:
Vol. 6, no. 16, eaaz9899
DOI: 10.1126/sciadv.aaz9899


Cyclin-dependent kinase 2 (CDK2) controls cell division and is central to oncogenic signaling. We used an “in situ” approach to identify CDK2 substrates within nuclei isolated from cells expressing CDK2 engineered to use adenosine 5′-triphosphate analogs. We identified 117 candidate substrates, ~40% of which are known CDK substrates. Previously unknown candidates were validated to be CDK2 substrates, including LSD1, DOT1L, and Rad54. The identification of many chromatin-associated proteins may have been facilitated by labeling conditions that preserved nuclear architecture and physiologic CDK2 regulation by endogenous cyclins. Candidate substrates include proteins that regulate histone modifications, chromatin, transcription, and RNA/DNA metabolism. Many of these proteins also coexist in multi-protein complexes, including epigenetic regulators, that may provide new links between cell division and other cellular processes mediated by CDK2. In situ phosphorylation thus revealed candidate substrates with a high validation rate and should be readily applicable to other nuclear kinases.


Advanced mass spectrometry (MS) and quantitative phosphoproteomics enable the identification of large sets of protein phosphorylation sites to comprehensively identify protein kinase substrates (1–4). However, despite many advances, determining which phosphorylation sites are direct and physiologic targets of a specific kinase remains challenging. Analog-sensitive kinases (AS-kinases) are an important tool for discovering kinase substrates. AS-kinases contain a mutation in a conserved “gatekeeper” residue that normally functions to restrict active site access to adenosine 5′-triphosphate (ATP; but not other nucleotides) (3, 5–7). Replacing the gatekeeper residue with a smaller amino acid renders AS-kinases capable of using bulky ATP analogs that cannot be used by normal kinases. By using bulky ATP analogs in combination with AS-kinases in various ways, the activity of an AS-kinase can be isolated from that of other cellular kinases.

In vitro applications of AS-kinases use ATP analogs to directly tag substrates with a label that distinguishes them from other cellular phosphoproteins [reviewed in (3, 6)]. However, because ATP analogs cannot enter intact cells, these approaches require in vitro conditions that can be prone to artifacts. For example, the use of highly active recombinant AS-kinases and cell lysates (in which cellular contexts are lost) can produce false-positive identifications that may greatly increase the experimental work needed to validate candidate substrates. In contrast, in vivo applications of AS-kinases typically identify proteins that are differentially phosphorylated in cells treated with highly specific AS-kinase inhibitors, sometimes in the context of AS mutations engineered into endogenous kinases (3, 6). While these methods are not prone to in vitro artifacts, they cannot readily distinguish direct substrates from phosphorylations that may be indirectly regulated by AS-kinase inhibition because substrates are not directly labeled by the AS-kinase.

Cyclin-dependent kinases (CDKs) are holoenzymes consisting of catalytic (CDK) and regulatory (cyclin) subunits that regulate cellular processes by phosphorylating complex substrate networks (8, 9). Many mammalian CDKs, including CDK1, CDK2, CDK5, CDK7, and CDK9, have proven amenable to AS mutations and have been studied using both direct and indirect approaches (10–18). CDK2 is activated by cyclin E and cyclin A. Cyclin E–CDK2 regulates cell cycle re-entry, G1 progression, and S phase entry, whereas cyclin A–CDK2 acts later in the cell cycle, where it coordinates S phase progression and functions in G2 and M phase cells. In previous work, we developed an in vitro approach using recombinant cyclin A/AS-CDK2 and thiophosphate labeling to identify >100 candidate cyclin A–CDK2 substrates in human cell lysates (19). The use of a thiophospho-ATP analog allowed a biochemical enrichment–based strategy and prevented ATP analog hydrolysis in cell lysates. Shokat and colleagues (10) also developed a thiophosphate-based method to study AS-CDK1 in vitro, which used alternate thiophosphate chemistry. These studies revealed candidate substrates common to both CDK2 and CDK1 and unique candidates for each kinase. However, the extent to which any of these proteins is predominantly a CDK1 versus CDK2 substrate in vivo is not known.

Because of its crucial roles in normal and neoplastic signaling, we sought to identify high-confidence CDK2 substrates using conditions that maintain near-physiologic CDK2 activity and preserve nuclear context and architecture. In this approach, termed “in situ phosphorylation,” we first stably expressed ectopic AS-CDK2 in cells, where it is activated by endogenous cyclins and thus subject to near-normal regulation. We then isolated nuclei from cells expressing either wild-type CDK2 (WT-CDK2) or analog-sensitive CDK2 (AS-CDK2) and performed in situ substrate thiophosphorylation by incubating the nuclei with an ATP-γ-S analog. Under these conditions, substrates are phosphorylated in a setting that better preserves CDK2’s normal subcellular interactions with its substrates. We subsequently used biochemical enrichment and MS to identify candidate substrates.

We identified ~150 AS-CDK2–specific thiophosphopeptides and 117 candidate CDK2 substrates. Remarkably, ~43% of these proteins are known CDK substrates, indicating that a high proportion of the candidates are physiologic CDK2 substrates. Moreover, we found that each of the candidates we tested was directly phosphorylated by CDK2 in vitro or in vivo, further supporting the idea that candidates identified through in situ phosphorylation contain a high proportion of bona fide CDK2 substrates. Many previously unidentified candidates are chromatin-associated proteins with roles in histone modification and chromatin remodeling; DNA metabolism, damage, and repair; and transcription and RNA metabolism, and whose identification was likely allowed by maintaining the nuclear contexts within which CDK2 normally functions. In summary, in situ conditions led to the efficient and more confident identification of CDK2 substrates. These methods should be readily applicable to other nuclear kinases with complex substrate networks.


Developing AS-CDK2 cells and substrate thiophosphorylation in isolated cell nuclei

We used retroviral transduction to generate pools of human embryonic kidney–293 (HEK293) cells that stably express either WT-HACDK2 or AS-HACDK2 [hemagglutinin (HA) tagged] at levels slightly less than endogenous CDK2 (Fig. 1A). A previous study revealed that AS-CDK2 has impaired cyclin A binding that prevents AS-CDK2 from effectively competing with endogenous CDK1 and CDK2 for cyclin A binding (12). Accordingly, AS-CDK2 had less kinase activity than WT-CDK2, as assayed by immunoprecipitation kinase assays using antibodies against the HA-tag (specific to ectopic CDK2), total CDK2, or cyclin E (Fig. 1, B and C). To overcome this defect, we inhibited endogenous CDK2 by using lentiviruses encoding a short hairpin RNA (shRNA) targeting the CDK2 3′ untranslated region (3′UTR; which is not present in the retroviral CDK2 complementary DNAs), which silenced endogenous CDK2 to almost undetectable levels (Fig. 1A). As predicted, endogenous CDK2 knockdown increased AS-CDK2 activity, presumably by enhancing cyclin association (Fig. 1, B and C). However, AS-HACDK2 activity remained lower than WT-HACDK2 activity even after endogenous CDK2 knockdown (Fig. 1B). Silencing endogenous CDK2 thus yielded detectable AS-CDK2 activity that was somewhat less than endogenous CDK2 activity, rather than supraphysiologic activity.

Fig. 1 Characterization of AS-CDK2 activity.

(A) HEK293 cells [control (con)] and HEK293 cells stably expressing WT-HACDK2 and AS-HACDK2 were transduced with lentiviral vectors expressing control or one of two different CDK2 shRNAs (#1 and #2) for 2 days. Cell lysates were immunoblotted with anti-CDK2 and anti–γ-tubulin (loading control) antibodies. The positions of CDK2 and ectopic HA-CDK2 are indicated. (B and C) Lysates from the above cells were immunoprecipitated (IP) with anti-HA (B), anti-CDK2, and anti–cyclin E antibodies (C), and histone H1 kinase assays were carried out in the presence of γ-32P-ATP.

Since the cell membrane is impermeable to nucleotides, we isolated nuclei using hypotonic methods to help preserve nuclear integrity and determined that ATP-γ-S enters isolated nuclei and that proteins could be readily labeled by incubating nuclei with ATP-γ-S (table S1). We then used nuclei isolated from WT- and AS-HACDK2 cells in conjunction with N6-(2-phenylethyl)–ATP-γ-S (PE–ATP-γ-S) to directly label nuclear CDK2 substrates (Fig. 2A). Cells have high concentrations of ATP (>1 mM) that competes with PE–ATP-γ-S for kinase occupancy. We thus included Mn2+ in the in situ kinase reaction, which improves thiophosphate labeling by allowing more effective competition with cellular ATP (20). Last, CDK2 substrates that are already highly phosphorylated by CDK2 in vivo may not be additionally labeled by the thiophospho-ATP analog in situ. We thus briefly pretreated cells with a CDK inhibitor (roscovitine) before harvesting and isolating nuclei to inhibit endogenous CDK2 and allow substrate dephosphorylation before labeling in situ (Fig. 2A).

Fig. 2 Schema for identification of CDK2 substrates using an in situ nuclear labeling assay.

(A) Scheme for in situ labeling. Asynchronously growing HEK293 cells stably expressing WT-CDK2 or AS-CDK2 were pretreated with roscovitine and harvested by trypsinization. Nuclei were isolated as depicted, incubated with PE–ATP-γ-S to allow thiophosphate labeling of proteins, and subsequently lysed (see Materials and Methods). (B) Scheme for thiophosphopeptide isolation and identification. Labeled nuclear extracts were digested with trypsin. The resulting peptides were incubated with disulfide beads, which capture both thiophosphopeptides and cysteine-containing peptides. Beads were eluted with dithiothreitol (DTT), and the eluted peptides were subjected to liquid chromatography coupled to tandem MS to identify the thiophosphopeptides. Specific thiophosphopeptide identifications that carried proline-directed phosphorylation motifs and were present in the AS-CDK2 cells but not control WT-CDK2 cells were considered as candidate CDK2 substrates. (C) Schematic of thiophosphopeptide and cysteine-containing peptide elution from disulfide beads by DTT.

Thiophosphopeptide recovery and identification of high confidence nuclear CDK2 substrates

Nuclei were lysed after thiophosphorylation and digested with trypsin, and the resultant tryptic peptides containing either thiophosphate or cysteine were bound to a disulfide resin (Thiopropyl Sepharose 6B) as previously described (Fig. 2, B and C) (19). We developed a new method to elute thiophosphopeptides from the disulfide beads with dithiothreitol (DTT), which retains the thiophosphate tag within the eluted peptides (thereby providing a unique mass signature for AS-CDK2 substrates) and allows the elution of cysteine-containing thiophosphopeptides. Although DTT elutes a large excess of peptides that simply contain cysteine, current MS instruments easily handle the additional sample complexity, and these peptides were readily excluded from subsequent analyses. Moreover, the mildly acidic conditions used favor the binding and elution of thiophosphopeptides over cysteine-containing peptides (21).
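The “unique mass signature” of a retained thiophosphate tag can be made concrete with a short calculation. The following is a minimal sketch, assuming standard monoisotopic atomic masses and treating the two modifications as net additions of HPO3 (phosphate) and HPO2S (thiophosphate). The names `MONOISOTOPIC` and `composition_mass` are illustrative and are not taken from the software used in this study.

```python
# Monoisotopic atomic masses in Da (standard values, rounded to 6 decimals).
MONOISOTOPIC = {"H": 1.007825, "O": 15.994915, "P": 30.973762, "S": 31.972071}


def composition_mass(composition):
    """Sum monoisotopic masses for an elemental composition such as {"H": 1, "O": 3}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


# Net mass added to a Ser/Thr residue by each modification (assumed compositions).
PHOSPHO = composition_mass({"H": 1, "P": 1, "O": 3})
THIOPHOSPHO = composition_mass({"H": 1, "P": 1, "O": 2, "S": 1})

print(f"phospho     : +{PHOSPHO:.4f} Da")
print(f"thiophospho : +{THIOPHOSPHO:.4f} Da")
print(f"difference  : +{THIOPHOSPHO - PHOSPHO:.4f} Da")
```

Under these assumptions, the thiophosphate tag adds roughly 95.94 Da, compared with roughly 79.97 Da for an ordinary phosphate. This difference of about 16 Da (one sulfur in place of one oxygen) is what lets a database search distinguish thiophosphopeptides from endogenous phosphopeptides.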

We performed a large-scale experiment, in which four replicates of cell nuclei from WT- or AS-HACDK2 cells were labeled with PE–ATP-γ-S (see Materials and Methods). We identified 4592 unique peptides, of which 464 were unique thiophosphopeptides (about 10% of the total eluted peptides), and almost all (~99%) of the remaining peptides contained cysteine (table S2). Among the 326 unique “stripped” thiophosphopeptides (distinguished only on the basis of primary sequence), 166 thiophosphopeptides were found only in AS-CDK2 samples, 32 only in control WT-CDK2 samples, and 128 in both (table S3). Candidate CDK2 substrates were defined as containing thiophosphopeptides that (i) were present only in AS-CDK2 samples and (ii) have primary sequences that contain proline-directed CDK phosphorylation motifs (serine or threonine followed by a proline, termed “S/T-P”). Applying these criteria, 156 of 166 (94%) of the CDK2-AS–specific thiophosphopeptides had at least one S/T-P site, which corresponds to 117 unique proteins. The Comet search engine localized the presumptive thiophosphorylation site to be the proline-directed S/T-P residues in 148 of 156 (95%) of the AS-specific S/T-P thiophosphopeptides (tables S3 and S4). Most (94%) of these 117 substrates are nuclear proteins, and only 33 of 117 had been found in our previous in vitro studies. About 95% of the phosphorylation sites carried the full CDK consensus motif S/T-P-X-R/K (where X is any amino acid), and ~95% of these phosphorylations have been reported in large-scale in vivo phosphoproteome analyses (22–25). We recovered six cysteine-containing thiophosphopeptides representing four unique proteins (tables S3 and S4). Fifty of 117 (43%) candidate substrates are known CDK substrates, indicating a remarkably high proportion of physiologic CDK substrates within the candidate list generated in situ (table S4).
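The two selection criteria described above (present only in AS-CDK2 samples, and containing an S/T-P motif) amount to a set difference followed by a sequence-pattern filter. The following minimal sketch illustrates that logic on invented toy sequences. The function name `candidate_peptides` and the example peptides are hypothetical, and the sketch checks only for a motif anywhere in the stripped sequence; it does not reproduce the site localization performed by the search engine.

```python
import re

# Proline-directed motif: Ser or Thr immediately followed by Pro.
MINIMAL_MOTIF = re.compile(r"[ST]P")
# Full CDK consensus S/T-P-X-R/K, where X is any residue.
FULL_MOTIF = re.compile(r"[ST]P.[RK]")


def candidate_peptides(as_peptides, wt_peptides):
    """Return stripped sequences found only in AS samples that contain an S/T-P motif."""
    as_only = set(as_peptides) - set(wt_peptides)
    return sorted(p for p in as_only if MINIMAL_MOTIF.search(p))


# Toy stripped sequences, invented for illustration only.
as_hits = ["AASPEKR", "LLTPGGK", "GGSAEEK", "VVSPTRK"]
wt_hits = ["VVSPTRK", "QQTADDK"]

candidates = candidate_peptides(as_hits, wt_hits)
full_consensus = [p for p in candidates if FULL_MOTIF.search(p)]

print(candidates)       # AS-only peptides with at least one S/T-P site
print(full_consensus)   # subset that also matches S/T-P-X-R/K
```

In this toy input, the peptide shared with the WT sample is discarded as background even though it carries an S/T-P site, and the AS-only peptide lacking any S/T-P site is discarded by the motif filter.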

In addition to known CDK substrates, we found a novel landscape of candidate CDK2 substrates, most of which function within key nuclear processes, including cell cycle control, histone modification and chromatin organization, DNA replication and repair, transcription, and RNA metabolism (Fig. 3A). The high proportion of candidates associated with chromatin modification, DNA metabolism, and transcription is likely the result of in situ conditions that preserve cyclin-CDK2 chromatin associations. For example, we identified 11 proteins that regulate histone modifications as candidate substrates, eight of which are previously unknown (DOT1L, JARID2, KAT6A, LSD1, MSL1, MSL3, SETDB1, and NSD2) and three of which are known CDK substrates (KMT2B, PHF8, and SUV39H1). We also found candidates with broader roles in chromatin remodeling (e.g., BCOR, BCL11A, DMAP1, INO80E, and SMARCA5) and known CDK substrates with chromatin remodeling functions (e.g., BRD4). Similarly, we identified many proteins that regulate transcription and RNA metabolism, and these include many previously unknown candidate CDK2 substrates (see Discussion).

Fig. 3 Functional categorization of CDK2 candidate substrates.

(A) Candidate substrates are grouped based on select gene ontology processes and with manual curation. Boldfaced proteins denote known CDK substrates. (B to G) Examples of CDK2 substrates and candidate substrates (green subunits) co-occurring in known Corum complexes. The Corum complex identification number is shown in the middle of each depicted complex. The following complexes are shown: (B) BASC complex, 6 of 12 components depicted; (C) Emerin complex 24, 6 of 6 components depicted; (D) NuRD complex, 6 of 7 components depicted; (E) LSD1 complex, 6 of 14 components depicted; (F) XFIM complex, 5 of 5 components depicted; and (G) SNF2h-cohesin-NuRD complex, 6 of 16 components depicted.

We further interrogated the relationships between the substrates in this network by searching protein-protein interaction databases. First, we manually curated the known physical interactions between all of the substrates with one another by using BioGRID 3.5 (table S5) (26). We found extensive physical interactions between many of the CDK2 substrates, reinforcing the master regulatory roles that CDK2 plays in processes such as cell division and DNA repair and replication. Next, we examined whether and how candidate substrates coexist within known multiprotein complexes by using the Corum database (27). Many known and candidate substrates coexist within these defined Corum complexes in both expected and unexpected ways, suggesting that CDK2 may globally regulate these complexes through phosphorylating multiple subunits thereof (Fig. 3, B and C, and see Discussion). One particularly notable finding was the co-occurrence of candidate CDK2 substrates in complexes with crucial epigenetic functions, particularly those that contain both LSD1 and the histone deacetylase 1/2 (HDAC1/2) (Fig. 3, D to G, and see Discussion). References for each of these specific complexes can be found on the Corum website.
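The co-occurrence analysis described above can be thought of as intersecting the substrate list with the subunit list of each annotated complex. The following is a minimal sketch of that idea, assuming complex membership is available as a simple mapping from complex name to subunits. The function `substrates_per_complex`, the `min_hits` threshold, and all complex and protein names below are placeholders invented for illustration; they are not Corum records and do not describe how the analysis in this study was implemented.

```python
def substrates_per_complex(complexes, substrates, min_hits=2):
    """Map complex name -> sorted substrate subunits, keeping complexes with >= min_hits."""
    substrate_set = set(substrates)
    hits = {}
    for name, subunits in complexes.items():
        shared = sorted(set(subunits) & substrate_set)
        if len(shared) >= min_hits:
            hits[name] = shared
    return hits


# Placeholder data: neither the complexes nor the proteins are real annotations.
complexes = {
    "complex_A": ["P1", "P2", "P3", "P4", "P5"],
    "complex_B": ["P2", "P6", "P7"],
    "complex_C": ["P8", "P9"],
}
substrates = {"P1", "P4", "P5", "P6"}

multi_substrate_complexes = substrates_per_complex(complexes, substrates)
print(multi_substrate_complexes)
```

Requiring at least two substrate subunits per complex is an arbitrary choice here; it simply reflects the interest in complexes where several components are candidate targets.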

In vitro validation of select candidate CDK2 substrates

The high proportion of known CDK substrates found in our screen suggested that many candidates may also be bona fide CDK2 substrates. We thus determined whether a group of nine candidates (representing proteins that were readily immunoprecipitated from transfected cells) were phosphorylated by recombinant cyclin A–CDK2 in vitro. These include (i) NSD2, a SET-domain histone lysine methyltransferase; (ii) DMAP1, a component of the NuA4 histone acetyltransferase complex; (iii) LSD1/KDM1A, a histone H3 Lys-4 demethylase; (iv) DOT1L, a methyltransferase that acts on Lys79 of histone H3; (v) BCL11A, a transcription factor (TF) associated with chromatin remodeling; (vi) PRPF3, a component of the precatalytic spliceosome; (vii) MSL3, a chromatin remodeling protein and regulator of histone H3 acetylation; and (viii) GTF3C2, a general TF required for RNA polymerase III–mediated transcription. Last, we recently validated Rad54 phosphorylation on both T31 and S49 by CDK2 in vitro and found that S49 phosphorylation inhibits Rad54’s branch migration activity (28).

We immunoprecipitated each of these nine proteins from transfected HEK293 cells and tested their ability to be phosphorylated by recombinant cyclin A–CDK2 in vitro using 32P-ATP (Fig. 4). All nine of these in situ candidates were validated to be direct CDK2 substrates in vitro. In the case of DMAP1, it was necessary to elute the immunoprecipitated protein for it to be phosphorylated (Fig. 4, C and F), whereas the other substrates were readily phosphorylated while bound to beads (Fig. 4, A and B). Immunoblots of the immunoprecipitated substrates are shown in Fig. 4 (D to F).

Fig. 4 Validation of select CDK2 candidate substrates in vitro.

(A) FLAG-tagged PRPF3, Myc-tagged BCL11A, and HA-tagged MSL3 and GTF3C2 were expressed in HEK293 cells and immunoprecipitated from cell lysate using antibodies against the tag. Washed immunoprecipitates were phosphorylated using purified cyclin A–CDK2 (or no kinase control) in the presence of 32P-ATP and detected by autoradiography. Histone H1 kinase assay was used as control. oe, overexposure. Lane 9 was empty. Asterisk (*) marks the expected band of the substrates. (B) FLAG-tagged NSD2, LSD1, and DOT1L were analyzed as in (A). (C) FLAG-tagged DMAP1 was analyzed as in (A), except that the washed immunoprecipitates were first eluted with a FLAG peptide and fractions of the eluate were used in the kinase assay. Lane 2 was empty. (D and E) Immunoblots (IB) of a fraction of the corresponding immunoprecipitates shown in (A) and (B) using the antibodies against the tag. In (D), lanes 2 and 4 were empty. (F) Immunoblot of the gel in (C) after autoradiography using FLAG antibody.

In vivo validation of select candidate CDK2 substrates

We next determined whether any of three of these previously unknown substrates (RAD54, DOT1L, and LSD1) are phosphorylated by CDK2 in vivo by either generating phosphosite-specific antibodies or using commercial phospho-specific antibodies. Because we identified a RAD54 phosphopeptide containing T31 in our substrate screen, we generated an anti-Rad54 antibody that detects phosphorylated T31 in a phosphatase-sensitive manner (fig. S1A). The anti-pT31 antibody also exhibited a small amount of reactivity against unphosphorylated Rad54, which was eliminated when the epitope was deleted (fig. S1B). Endogenous RAD54 was hypophosphorylated on T31 in asynchronous cells, but T31 phosphorylation rapidly increased when cells were treated with the phosphatase inhibitor okadaic acid, suggesting that endogenous Rad54 T31 phosphorylation is highly labile and opposed by a phosphatase (fig. S1C). Phospho-T31 was also readily detectable on ectopic Rad54 in vivo (without okadaic acid treatment); this was reduced by brief CDK inhibition with roscovitine (Fig. 5A) and, conversely, increased by cyclin A–CDK2 overexpression (Fig. 5B). Residual Rad54 T31 phosphorylation in roscovitine-treated cells likely reflects either incomplete T31 dephosphorylation during the brief period of CDK2 inhibition or a roscovitine-insensitive kinase that also phosphorylates T31. Because CDK2 directly phosphorylates T31 in vitro, these data are consistent with the idea that CDK2 directly phosphorylates RAD54 on T31 in vivo, although it is formally possible that another T31 kinase is up-regulated by cyclin A–CDK2 overexpression. Last, because CDK activity is highest in mitosis, we studied endogenous Rad54 in cells arrested in prometaphase with nocodazole (see flow profiles in fig. S1D) and found high levels of T31 phosphorylation that was almost completely inhibited by two CDK inhibitors, roscovitine and RO-3306 (a CDK1 inhibitor) (Fig. 5C). Thus, endogenous RAD54 is phosphorylated in mitosis on T31, likely by CDK2, CDK1, or both. Because we used a pT31-specific antibody, we were unable to assess Rad54 S49 phosphorylation in vivo.

Fig. 5 CDK2-dependent phosphorylation of RAD54, DOT1L, and LSD1 in vivo and cell cycle regulation of LSD1 T59 phosphorylation.

(A) Myc-tagged RAD54 WT and T31A mutant were immunoprecipitated from transfected HEK293 cells and immunoblotted for T31 phosphorylation and total RAD54 protein. Roscovitine (Rosc) was added 2 hours before harvest. Untransfected cells, control. Arrow denotes endogenous RAD54. (B) HEK293 cells were cotransfected with myc-tagged RAD54 and myc-tagged cyclin A–HACDK2 or vector control. RAD54 was immunoprecipitated and immunoblotted for T31 phosphorylation and total RAD54 protein (top), and the cell lysates were immunoblotted for cyclin A and CDK2 expression (bottom). γ-Tubulin, loading control. (C) HEK293 cells were treated with nocodazole (Noc) for 17 hours, and endogenous RAD54 was immunoprecipitated and immunoblotted for T31 phosphorylation and for total RAD54 protein. Roscovitine or RO-3306 was added 1 hour before harvest. “Noc-” indicates asynchronous cells. (D) FLAG-tagged DOT1L was analyzed as in (B). DOT1L phosphorylation was detected using an anti–phospho-S297 DOT1L antibody. Untransfected cells, control. (E) HEK293 cells were transfected with myc–cyclin A and HA-CDK2 or vector control. Endogenous LSD1 was immunoprecipitated and immunoblotted for T59 phosphorylation using anti–phospho-TP antibody and also for total LSD1 (top). Cell lysates were immunoblotted for cyclin A and CDK2 expression (bottom). γ-Tubulin, loading control. (F) Endogenous LSD1 was analyzed as in (C). (G) FLAG-tagged LSD1 WT and T59A mutant were immunoprecipitated from transfected HEK293 cells that were treated with nocodazole for 17 hours and immunoblotted for T59 phosphorylation and total LSD1. (H) HEK293 cells were synchronized at the G1-S boundary by double thymidine block and released. Cells were harvested at indicated times. Endogenous LSD1 was immunoprecipitated and immunoblotted for T59 phosphorylation using anti–phospho-TP antibody and also for total LSD1 (top). Cell lysates were immunoblotted for LSD1 and γ-tubulin (bottom). (I) Cell cycle profiles at the indicated times using flow cytometry.

To study DOT1L, we made a phospho-specific antibody against its pS297 epitope. We chose S297 instead of the S775 site identified in our screen because S297 is a CDK consensus site that falls within a conserved catalytic domain of DOT1L. We found that the antibody was specific for S297 (fig. S1F) and that coexpression of DOT1L with cyclin A–CDK2 strongly stimulated DOT1L S297 phosphorylation (Fig. 5D). We were unable to detect endogenous DOT1L protein and thus could not study its phosphorylation.

Last, we performed similar validation experiments for LSD1. Although we failed to make a phospho-specific antibody against T59 (the site we identified by MS), a commercial phospho-T/P antibody specifically detected T59 phosphorylation of LSD1 in vitro, as shown by the strong LSD1 phosphorylation after cyclin A–CDK2 treatment and the lack of detectable phosphorylation of an LSD1 T59A mutant (fig. S1E). Using this phospho-T/P antibody, we found that endogenous LSD1 is phosphorylated in vivo and that this phosphorylation is strongly stimulated by cyclin A–CDK2, suggesting that cyclin A–CDK2 phosphorylates LSD1 on T59 in vivo (Fig. 5E). Endogenous LSD1 was also strongly phosphorylated in nocodazole-treated cells in a roscovitine-dependent and RO-3306–dependent manner (Fig. 5F). This phosphorylation was completely eliminated when an ectopic T59A LSD1 mutant was compared with ectopic WT LSD1, indicating that pT59 is also recognized by the pT/P antibody in vivo (Fig. 5G). We also studied synchronized cells to determine whether LSD1 T59 phosphorylation exhibited cell cycle regulation. By following cells arrested in early S phase by a double thymidine block and release, we found that T59 phosphorylation was lowest in G1 phase, began to rise in S phase, and peaked in G2-M phase (Fig. 5, H and I). This pattern is highly reflective of CDK2 activity during the cell cycle, although we were unable to directly test this because CDK2 inhibition prevents cell cycle progression after cell synchronization and release. In summary, these data support the conclusion that Rad54, DOT1L, and LSD1 are in vivo substrates of CDK2.


CDK2 has critical roles in cell division, DNA replication, DNA damage and repair, and cell cycle checkpoints. Moreover, CDK2 mediates many oncogenic pathways and responses to anti-neoplastic agents. While many CDK2 substrates in these pathways have been identified, crucial aspects of normal and neoplastic CDK2 function remain unknown. We thus developed in situ phosphorylation to enable future mechanistic studies by identifying previously unknown physiologic substrates. In addition to many known CDK substrates, we identified ~70 previously undiscovered candidate substrates. Validation studies of a set of nine candidates confirmed that each was a CDK2 substrate, suggesting that many (or most) candidates will be found to be bona fide substrates. In situ phosphorylation identified a very different set of CDK2 substrates from those found in vitro, including many chromatin-associated proteins that we speculate were discovered because in situ kinase reactions are performed under conditions in which the normal subcellular relationships between CDK2 and its nuclear substrates are preserved.

One particularly interesting set of candidates includes enzymes that modify histones and chromatin. CDK2 affects histone abundance and activity in several ways: (i) by direct histone phosphorylation (29), (ii) by regulating histone transcription through NPAT phosphorylation (30), and (iii) through phosphorylation of histone-modifying enzymes, such as EZH2, which controls epigenetic gene silencing (31). Here, we identify several candidate CDK2 substrates with important roles in histone modification (DOT1L, JARID2, KAT6A, LSD1, MSL1, MSL3, SETDB1, and NSD2) or broader roles in chromatin remodeling (e.g., BCOR, DMAP1, INO80E, and SMARCA5).

Another interesting group of candidate substrates is composed of transcriptional regulators and TFs, which include proteins with roles in cancer (e.g., BCL11A and AF9) and more general transcriptional functions (e.g., GTF2I). We also found several proteins with roles in splicing and RNA metabolism (e.g., PRP3, PRP16/DHX38, and SRRM2). Again, we speculate that CDK2-mediated phosphorylation of these transcriptional and mRNA regulators may thus provide important mechanisms linking control of cell division with gene expression. Last, while CDK2’s roles in DNA replication, damage, and repair are well known, we identified previously undiscovered putative substrates with important roles within these well-studied pathways, including RIF1, RAD54, and XRCC1. Future studies are needed not only to validate these candidates but also to understand how they are regulated by CDK2-mediated phosphorylation during the cell cycle.

As noted above, many of the CDK2 substrates we found coexist within known complexes, suggesting that CDK2 may regulate these complexes by targeting multiple components for phosphorylation, perhaps within the context of the complexes themselves. For example, while many cell cycle and DNA repair proteins are found in complexes with canonical functions (e.g., BASC complex; Fig. 3B), these proteins are also found in complexes with unknown functions, such as the Emerin 24 complex (Fig. 3C). Studies of these atypical complexes may reveal new functions for CDK2.

One particularly interesting finding involves the presence of multiple CDK2 substrates within complexes with epigenetic functions, many of which contain both the LSD1 histone demethylase and the HDAC1/2 histone deacetylases. These complexes regulate gene expression through coordinated chromatin modification, including histone demethylation and deacetylation. In addition to LSD1 and HDACs, these complexes also contain unique subunits, which include previously undiscovered CDK2 substrates, including RREB1, MTA1, SMARCA5, ZMYM3, and GTF2I (Fig. 3, D to G). The presence of multiple CDK2 substrates within these epigenetic regulators suggests that CDK2 may directly affect their function via direct phosphorylation of these components, perhaps in cell cycle–specific ways. There are numerous additional examples of multiple CDK substrates contained within complexes that regulate transcription, cell division, ribosomal biogenesis, and other cellular processes. We speculate that these substrates link the control of cell division with the regulation of gene expression at many levels via their phosphorylation by CDK2.

Validation work is a bottleneck for all proteomic screens, and it is highly desirable to improve the ability of these screens to identify physiologic substrates. While we initially attempted to label CDK2 substrates in living cells using different delivery systems, we were unable to deliver ATP-γ-S analogs into live cells in an efficient and noninvasive way. We thus sought to preserve nuclear contexts while also allowing direct substrate labeling by performing in situ kinase assays in isolated nuclei. This in situ approach has notable advantages, including (i) maintenance of nuclear structures and the cellular contexts within which CDKs normally function, (ii) near-physiologic CDK2-AS abundance and activity, and (iii) maintaining thiophosphate signatures after elution, which allows more confident identification of peptides phosphorylated by the AS-kinase. As a result, the list of candidate CDK2 substrates was highly enriched for known CDK substrates, and all of the candidates that we studied were validated to be phosphorylated by CDK2. We thus believe that it is likely that many (or most) of the candidates that remain untested will prove to be bona fide CDK2 substrates.

Although we confirmed a high rate of validation, several factors may still contribute to false-positive substrate identifications. For example, nearly half of the thiophosphopeptides we identified were found in both CDK2-AS and CDK2-WT kinase assays, indicating that some AS-CDK2–independent background labeling occurs, with a signal-to-noise ratio of about 1 (table S3). This background likely results from cellular kinases known to use bulky ATP analogs (32). While it is also possible that PE–ATP-γ-S was hydrolyzed during the kinase reaction (thereby generating ATP-γ-S that can be used by many cellular kinases), we previously found that ATP-γ-S and PE–ATP-γ-S are stable in whole-cell lysates, and this is likely also true in isolated nuclei (19). Despite this background, by requiring that thiophosphopeptides contain S/T-P motifs to be considered as substrates, the vast majority of background thiophosphopeptides were eliminated from subsequent analyses. The few thiophosphopeptides present in the “WT-CDK2–only” samples indicate that our MS analyses did not reach saturation with respect to peptide identifications. Thus, some “AS-specific” candidate substrates may drop out if they are found in WT-CDK2 samples with additional replicates. However, thus far, our validations have not revealed many false-positive identifications. Last, because our methods require isolating intact nuclei, we could only identify CDK2 substrates in interphase cells. This limitation may explain why we did not detect elongation factor 2, which is a cytoplasmic protein phosphorylated by CDK2 only in mitosis (33).
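The extent of background labeling can be estimated from the stripped thiophosphopeptide counts quoted in the Results (166 AS-only, 32 WT-only, and 128 shared). The following is a minimal sketch of that arithmetic. The text does not state how the signal-to-noise ratio was defined or which peptide counts it used, so the AS-only to shared ratio computed here is an assumption made for illustration, and other counting conventions (for example, unstripped peptides) would give different values.

```python
# Unique stripped thiophosphopeptide counts quoted in the Results (table S3).
as_only, wt_only, shared = 166, 32, 128
total = as_only + wt_only + shared

# Fraction of identifications seen in both AS and WT samples (background labeling).
shared_fraction = shared / total

# One possible signal-to-noise definition (an assumption): AS-only vs. shared peptides.
as_only_to_shared = as_only / shared

print(f"total stripped thiophosphopeptides: {total}")
print(f"shared fraction: {shared_fraction:.1%}")
print(f"AS-only : shared ratio: {as_only_to_shared:.2f}")
```

With these particular counts, the shared peptides make up roughly two-fifths of the stripped identifications and the AS-only to shared ratio is somewhat above 1, which is of the same order as the background level described above.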

In conclusion, we describe an in situ implementation of AS-CDK2 for substrate discovery that allows CDK2 to phosphorylate its substrates under near-physiologic conditions. We also developed methodological advances to identify candidate AS-CDK2 substrates more definitively and comprehensively. As a result, we found a high percentage of known CDK substrates and a high rate of validation for previously undiscovered candidates. Moreover, the enrichment for chromatin-associated proteins with important roles in processes such as gene expression and DNA and RNA metabolism suggests that in situ conditions may allow new types of substrates to be identified for CDKs and other nuclear kinases that function within specific nuclear contexts. These methods should be readily adaptable to study other nuclear kinases that are amenable to AS mutations.


Experimental design

The objective of this study was to develop a near-physiologic environment for substrate phosphorylation by AS-CDK2. The primary study design provided two mechanisms to accomplish this goal. The first was to stably express AS-CDK2 at normal abundance and to allow the exogenous AS-CDK2 to be activated by endogenous cyclins, which are rate limiting for CDK2 activation. The second was to isolate nuclei and allow substrates to be phosphorylated by AS-CDK2 in a nearly normal nuclear environment. We used substrate thiophosphorylation for several reasons: (i) we previously characterized the ability of AS-CDK2 to use and hydrolyze ATP-γ-S, (ii) we previously developed efficient biochemical methods to enrich and identify thiophosphopeptides by MS, and (iii) thiophosphate allows AS-CDK2 substrates to be easily distinguished from other cellular phosphopeptides. As shown in Fig. 2, the overall design consisted of the following steps: (i) stable expression of AS-CDK2 in HEK293A cells, (ii) knockdown of endogenous CDK2 by shRNA, (iii) treatment of cells with roscovitine to inhibit endogenous CDKs, (iv) isolation of nuclei and incubation with PE–ATP-γ-S, (v) thiophosphopeptide enrichment and identification by MS, and (vi) substrate validation in vitro and in vivo.

Cell culture, transfection, retroviral transduction, and drug treatments

HEK293 cells were obtained from the American Type Culture Collection, tested for mycoplasma, and genetically authenticated during the course of this study. HEK293 cells were cultured in Dulbecco’s modified Eagle’s medium (11965-092, Gibco) supplemented with 10% (v/v) fetal bovine serum (10437-028, Gibco), penicillin (100 U/ml), and streptomycin (100 μg/ml; 15140-122, Gibco). Transient transfections were carried out using calcium phosphate. HEK293 cells stably expressing C-terminally HA-tagged WT-CDK2 or CDK2-F80A (AS-CDK2) were generated via retroviral transduction using a pBabe-based vector (puromycin) as previously described (34). For CDK2 knockdown via small hairpin RNA (shRNA), cells were transduced with lentiviruses encoding nontargeting control shRNA or two different shRNAs directed to noncoding regions of CDK2 mRNA (see below). Cells were harvested 2 days after infection without selection, and shRNA expression was confirmed by green fluorescent protein coexpression. Nocodazole (M1404, Sigma-Aldrich) was used at 40 ng/ml for 17 hours, roscovitine (R7772, Sigma-Aldrich) was used at 25 μM for 1 to 2 hours where indicated, and RO-3306 (sc-358700, Santa Cruz Biotechnology) was used at 1 μM for 1 hour. Okadaic acid (O-2220, LC Laboratories) was used at 200 nM for 90 min.

Antibodies, plasmids, enzymes, and other reagents

The following antibodies were used: Myc tag (1:5; 9E10 hybridoma supernatant, prepared in house), HA tag (12CA5 hybridoma supernatant, prepared in house), FLAG M2 (1:4000; F1804, Sigma-Aldrich), CDK2 (1:1000; D-12, sc-6248, Santa Cruz Biotechnology), cyclin A (1:1000; gift from J. Roberts, Fred Hutchinson Cancer Research Center), cyclin E (1:1000; HE111, sc-248, Santa Cruz Biotechnology), γ-tubulin (1:1000; sc-17787, Santa Cruz Biotechnology), RAD54 (1:1000; F-11, sc-374598, Santa Cruz Biotechnology), LSD1 (1:5000; A300-215A, Bethyl Laboratories), DOT1L (1:2000; A300-953A, Bethyl Laboratories), and phospho-T/P (1:1000; 9391S, Cell Signaling Technology). Phospho-T31 RAD54 and phospho-S297 DOT1L antibodies were custom made by Yenzym Antibodies (San Francisco) and raised against the synthetic phosphopeptides CEDWQPGLV-pT-PRKRK (RAD54) and CMRVVEL-pS-PLKGSVS (DOT1L).

Substrate expression plasmids were obtained as follows: FLAG-DMAP1 (C. Kim, Hanyang University), FLAG-NSD2 (J. Licht, University of Florida), HA-MSL3 (J. Cotes, Laval University), HA-GTF3C2 (R. White, University of York), FLAG-PRPF3 (S. M. Iguchi-Ariga, Hokkaido University), and N-terminal FLAG-DOT1L and C-terminal HA-DOT1L (Y. Zhang, Harvard). RAD54 was cloned into the pCS2+ vector with a C-terminal 5× myc tag. The deleted sequence for myc-RAD54-T31AΔ8 was EDWQPGLV. N-terminal 3× FLAG-tagged LSD1 was cloned into a pCS2+ vector. BCL11A was cloned into the pCS2+ 5myc vector. Cyclin A and CDK2 expression plasmids were pCS2+ 6 myc–cyclinA (gift from J. Roberts, Fred Hutchinson Cancer Research Center) and pCMV-CDK2HA (gift from S. van den Heuvel, Utrecht University, The Netherlands), respectively. All mutants were generated by site-directed mutagenesis using the QuikChange method (Stratagene) and sequenced. Lentiviral plasmids using the pGIPZ vector were purchased onsite, and the CDK2 3′UTR targeting sequences of the two shRNAs used were AGGATGAACAATTATATTT (#1) and AGGTTATATCCAATAGTAG (#2). Lambda protein phosphatase (P0753S) and recombinant cyclin A–CDK2 were purchased from New England Biolabs. Recombinant glutathione S-transferase (GST)–cyclin E–CDK2 (33) and GST-Rb (19) were prepared in house. ATP-γ-S was purchased from Millipore Sigma (catalog no. 119120). PE–ATP-γ-S was purchased from BIOLOG Life Sciences Institute (catalog no. P026, Bremen, Germany).

Immunoblotting, immunoprecipitation, kinase assays, cell synchronization, and flow cytometry

Immunoblotting, immunoprecipitation, kinase assays, and flow cytometry were performed as described (34). Images of full gels for each figure are shown in fig. S2. For the DMAP1 kinase assay, DMAP1 was first eluted from immunoprecipitates with FLAG peptide (150 μg/ml; F4799, Millipore Sigma) and concentrated using a MicroCon-30 (MRCF0R030, Millipore Sigma-Aldrich). For cell cycle analysis of LSD1 phosphorylation, HEK293 cells were first synchronized by double thymidine block. Briefly, cells were treated with 2 mM thymidine (T9250, Millipore Sigma-Aldrich) for 16 hours, washed once with fresh medium, and released into fresh medium supplemented with 24 μM 2′-deoxycytidine (951-77-9, Acros Organics) for 8.5 hours. Cells were then washed once with fresh medium and treated again with 2 mM thymidine for 18 hours before being released into fresh medium containing 2′-deoxycytidine. Cells were harvested by trypsinization at various times up to 25 hours, and a fraction of the cells was fixed for flow cytometry analysis.

Cell nuclei isolation and in situ kinase assay

HEK293 cells stably expressing WT-CDK2 or AS-CDK2 were grown on eight 15-cm plates (2 days after infection by CDK2 shRNA lentivirus) to ~80 to 90% confluence (~2 × 10⁷ cells per plate), treated with roscovitine for 1 hour, harvested by trypsinization in four replicates (two plates per cell pellet), and washed once with cold phosphate-buffered saline. Each cell pellet (total of eight) was washed twice by resuspending the cells in 1 ml of cold hypotonic lysis buffer 1 [10 mM Hepes (pH 7.4), 10 mM KCl, and 2 mM MgCl2] followed by centrifugation in a microcentrifuge (2000 rpm, 1 min). Cells were resuspended again in 1 ml of hypotonic lysis buffer 1 and incubated on ice for 15 min. Swollen cells were lysed by pipetting up and down five to seven times using a 1-ml syringe with a 26-gauge needle (305111, Becton, Dickinson and Company). Cell lysis and intact cell nuclei were verified by staining a small aliquot with trypan blue (T10282, Life Technologies). The slurry was slowly added to the surface of 6 ml of cold hypotonic lysis buffer 1 containing 30% sucrose (w/v) in a 15-ml Falcon tube. After centrifugation in a tabletop centrifuge at 1000g for 10 min, the buffer above the nuclei pellet was removed, and the pellet was washed three times by resuspending it in 1 ml of cold hypotonic lysis buffer 1 followed by centrifugation in a microcentrifuge (3000 rpm, 1 min). The nuclei preparation was checked again by trypan blue staining and microscopic examination. The final nuclei pellet was resuspended in 1.5× volume of hypotonic lysis buffer 1 containing PE–ATP-γ-S (final concentration of 0.5 mM) and MnCl2 (final concentration of 0.5 mM) and incubated at 30°C for 30 min. The nuclei slurry was occasionally mixed by tapping during the course of the reaction. After the reaction, the nuclei mix was briefly centrifuged (3000 rpm, 15 s) to remove most of the supernatant, and the pellet was flash-frozen in liquid nitrogen and stored or processed as described below.
ATP-γ-S labeling was done similarly starting with two 15-cm plates of WT-CDK2 cells and labeled at a final concentration of 0.5 mM.
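
Several centrifugation steps above are given in rpm, which translates to a different force on each rotor. As a rough aid for reproducing these steps on other instruments, the standard conversion RCF = 1.118 × 10^-5 × r × N^2 (r in cm, N in rpm) can be sketched as follows. The 7-cm rotor radius is an assumed example value and does not come from the protocol.

```python
import math

# Standard relation between relative centrifugal force (x g),
# rotor radius in cm, and rotor speed in rpm.
_RCF_CONSTANT = 1.118e-5


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return _RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Speed in rpm needed to reach a given RCF on a given rotor."""
    return math.sqrt(rcf / (_RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 7.0  # assumed microcentrifuge rotor radius (hypothetical)
    for rpm in (2000, 3000):
        print(f"{rpm} rpm at r = {radius_cm} cm: {rpm_to_rcf(rpm, radius_cm):.0f} x g")
```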

Purification of thiophosphorylated peptides

The frozen nuclei pellet was resuspended in 0.4 ml of hypotonic lysis buffer 2 [30 mM Hepes (pH 7.4), 10 mM EDTA, and benzonase (25 U/ml; 70746, Millipore Sigma)]. After incubation on ice for 30 min, Tween-20 was added to a final concentration of 0.1%, and the sample was sonicated using 20× 1-s pulses. Nuclei debris was pelleted by centrifugation at 20,000g for 10 min. The supernatant was digested with sequencing grade modified trypsin (Promega) at 1:20 ratio (w/w), and thiophosphopeptides from the peptide mixture were purified by binding to 40 μl of disulfide beads Thiopropyl Sepharose 6B (17042001, GE Healthcare) at pH 4.0 as previously described (19). Washed beads were eluted with 30 μl of 25 mM DTT (pH ~4 without buffering) in 5% acetonitrile/95% H2O at room temperature for 30 min. The eluate was acidified with tris(2-carboxyethyl)phosphine and formic acid to a final concentration of 5 mM and 0.1%, respectively, and analyzed directly by MS.

MS analysis and database search

Phosphopeptide samples were analyzed by nanoflow liquid chromatography (NanoLC) and electrospray ionization tandem MS (MS/MS) using an LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific) interfaced with an Agilent 1100 Nano Pump with electronically controlled split flow. For ATP-γ-S labeling, one sample was analyzed in duplicate MS runs, and for PE–ATP-γ-S labeling, eight samples (four WT-CDK2 and four AS-CDK2) were analyzed in duplicate MS runs (16 MS runs in total). Peptides were loaded in sequence onto a 75 μm (inner diameter) by 15 cm C18 microcapillary column, packed in house with Magic C18 AQ 5-μm resin (Michrom Bioresources), and resolved by a nonlinear gradient of 5 to 28% acetonitrile containing 0.1% formic acid at a flow rate of 300 nl/min over the course of 80 min. Each survey scan in the Orbitrap was followed by MS/MS scans of the nine most intense precursor ions in the linear ion trap. Acquired tandem spectra were searched against a human UniProt database (downloaded January 2015) with target-decoy using the Comet algorithm (version 2014.02) (35). Peptide search parameters included a precursor mass tolerance of 20 parts per million, one tryptic end per peptide, and differential mass modifications to methionine (+15.999) due to oxidation and to serine and threonine (+96.0329) due to thiophosphorylation. Search results were filtered using the Trans-Proteomic Pipeline (36) with a minimal iProphet (37) score of 0.75 and a corresponding peptide false discovery rate (FDR) between 0.5 and 1%.
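
To make the two numerical filters above concrete, the following sketch shows how a 20-ppm precursor tolerance is evaluated and how an FDR can be estimated from a target-decoy search at a given score cutoff. The simple decoy-counting estimate is used here only as an illustration; iProphet derives its error rates from a statistical model, and the function names and example values are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Mass error of an observed precursor in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 20.0) -> bool:
    """True if the precursor lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def decoy_fdr(matches, min_score: float) -> float:
    """Estimate FDR as decoys / targets among matches scoring >= min_score.

    `matches` is an iterable of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= min_score and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= min_score and is_decoy)
    return decoys / targets if targets else 0.0


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(within_tolerance(500.005, 500.000))   # about 10 ppm off
    matches = [(0.99, False), (0.90, False), (0.80, True),
               (0.76, False), (0.50, True), (0.40, False)]
    print(decoy_fdr(matches, min_score=0.75))
```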

Functional enrichment analysis of CDK2 substrates

A network made up of the candidate substrates was created by manually inputting the list into the STRING protein query within Cytoscape (38, 39) and analyzed using the STRING functional enrichment tool with an enrichment FDR value cutoff of 0.05. Select enriched functional categories were generated on the basis of the Gene Ontology process category.
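
The enrichment FDR cutoff refers to P values corrected for multiple testing. For readers less familiar with this kind of correction, a minimal sketch of the Benjamini-Hochberg procedure is given below. It is assumed here as a representative FDR method and is not taken from the STRING implementation; the P values are illustrative only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enriched_terms(term_pvalues, cutoff=0.05):
    """Names of terms whose adjusted P value is at or below the cutoff."""
    names = list(term_pvalues)
    adjusted = benjamini_hochberg([term_pvalues[n] for n in names])
    return [n for n, q in zip(names, adjusted) if q <= cutoff]


if __name__ == "__main__":
    # Hypothetical term names and P values, not results from the study.
    example = {"term A": 0.01, "term B": 0.04, "term C": 0.03, "term D": 0.5}
    print(enriched_terms(example))
```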


Supplementary material for this article is available at

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank members of the Clurman laboratory for helpful discussions during the course of this project. We thank S. Li, D. Shteynberg, Z. Sun, and members of the Moritz laboratory for technical and computational help throughout the method development. Funding: This publication was supported by grants 1R01CA193808 (B.E.C.), CA188347 (A.V.M.), and P30CA056036 and Cancer Center Support grant 2P30CA015704. This work was funded in part by the National Institutes of Health, National Institute of General Medical Sciences grant R01GM087221 (R.L.M.). Author contributions: Y.C. designed the method, performed nuclei kinase assays and phosphopeptide purifications, carried out MS and data analyses, and performed substrate validation experiments. J.H.C. carried out substrate validation experiments. J.S. performed AS-CDK2 cell line characterization and kinase assays. A.V.M. provided scientific input. R.L.M. provided MS and software resources. Y.C., R.L.M., and B.E.C. designed the research project. Y.C. and B.E.C. wrote the manuscript. Competing interests: The authors declare that they have no competing interests. B.E.C. is the founder and equity holder of Coho Therapeutics, a startup company that is focused on protein degradation and is completely unrelated to the work presented in this paper. Coho Therapeutics did not provide any funding or support for this work. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data (RAW files, search parameters, database file, and search results) have been deposited in the Institute for Systems Biology’s Peptide Atlas repository and can be accessed via the following Weblink: Additional data related to this paper may be requested from the authors.
