Research Article | CORONAVIRUS

Activity profiling and crystal structures of inhibitor-bound SARS-CoV-2 papain-like protease: A framework for anti–COVID-19 drug design

Vol. 6, no. 42, eabd4596

Abstract

Viral papain-like cysteine protease (PLpro, NSP3) is essential for SARS-CoV-2 replication and represents a promising target for the development of antiviral drugs. Here, we used a combinatorial substrate library and performed comprehensive activity profiling of SARS-CoV-2 PLpro. On the scaffold of the best hits from positional scanning, we designed optimal fluorogenic substrates and irreversible inhibitors with a high degree of selectivity for SARS PLpro. We determined crystal structures of two of these inhibitors in complex with SARS-CoV-2 PLpro, which reveal their inhibitory mechanisms and provide a molecular basis for the observed substrate specificity profiles. Last, we demonstrate that SARS-CoV-2 PLpro harbors deISGylating activity similar to SARS-CoV-1 PLpro but that its ability to hydrolyze K48-linked Ub chains is diminished, a difference for which our sequence and structure analysis provides a basis. Together, this work reveals the molecular rules governing PLpro substrate specificity and provides a framework for development of inhibitors with potential therapeutic value or drug repurposing.

INTRODUCTION

Three coronavirus epidemics have emerged so far in this century. In November 2002 in Foshan, China, the first known case of human infection with severe acute respiratory syndrome coronavirus (SARS-CoV) was reported (1). By July 2003, more than 8000 SARS cases had been detected in 27 countries. The main symptoms of SARS-CoV infection were influenza-like and included fever, headache, malaise, shivering, and diarrhea. Only a few cases of infection occurred between December 2003 and January 2004 (2). The implementation of infection control measures ended the global SARS outbreak. Ten years after the SARS pandemic, a new coronavirus, Middle East respiratory syndrome coronavirus (MERS-CoV), was diagnosed in a man in Saudi Arabia (3). Because of international travel by infected people, MERS-CoV has spread worldwide. A total of 2502 laboratory-confirmed cases of MERS-CoV infection were reported from September 2012 to the end of December 2019, including 858 associated deaths. In December 2019, a novel coronavirus, SARS-CoV-2, formerly known as the 2019 novel coronavirus (2019-nCoV), was identified in Wuhan, China (4, 5). Current studies indicate that this coronavirus is similar to SARS-CoV. Although these three coronaviruses (SARS-CoV, MERS-CoV, and SARS-CoV-2) are highly pathogenic in humans, there is no effective antiviral treatment. Therefore, current studies are focused on rapid development of vaccines and antiviral drugs to prevent and treat coronavirus infection.

One of the attractive antiviral drug targets is the SARS-CoV–encoded cysteine protease, papain-like protease (PLpro) (6). This enzyme recognizes the tetrapeptide LXGG motif found between viral proteins nsp1 and nsp2, nsp2 and nsp3, and nsp3 and nsp4 (nsp1/2, nsp2/3, and nsp3/4) (7, 8). The hydrolysis of the peptide bond on the carboxyl side of glycine at the P1 position leads to the release of nsp1, nsp2, and nsp3 proteins, which are essential for viral replication. In vitro studies have shown that SARS-CoV PLpro harbors two other proteolytic activities, removal of ubiquitin (Ub) and of the Ub-like (Ubl) protein ISG15 (interferon-induced gene 15) from cellular proteins (9–11). Ubiquitinated and ISGylated substrates are more efficiently hydrolyzed by SARS-CoV PLpro than small substrates containing the C-terminal LRGG motif (11, 12). These results indicated a more complex mechanism of substrate recognition than only the interaction of the S4-S1 pockets of the enzyme with a tetrapeptide fragment. Further studies revealed that SARS-CoV PLpro has two distinct Ub binding subsites (SUb1 and SUb2) and recognizes Lys48-linked polyUb chains for polyUb chain editing and/or deubiquitination of polyubiquitinated proteins (13–15).

Due to its deubiquitinating and deISGylating activities, SARS-CoV PLpro performs an important role in the innate immune response during viral infection (16, 17). SARS-CoV PLpro is involved in inhibiting the production of cytokines and chemokines that are responsible for the activation of the host innate immune response against viral infection (18–20). For these reasons, this enzyme is an important molecular target in the design of SARS-CoV antiviral drugs. Despite substantial research efforts in the development of SARS-CoV inhibitors, efficacy data for these compounds from clinical trials are missing (21–23). Nevertheless, we hypothesize that information gained over the past years for SARS-CoV PLpro could be immediately translated into the timely study of SARS-CoV-2 PLpro to accelerate the development of new antivirals and drug retargeting approaches.

A molecular understanding of CoV-2 PLpro substrate specificity, structure, and mechanism would greatly facilitate development of effective PLpro inhibitors by enabling rational design and research on drug retargeting. In this study, we first performed comprehensive activity profiling of SARS-CoV-2 PLpro using our novel chemical approach, HyCoSuL (Hybrid Combinatorial Substrate Library) (24). The results reveal the molecular rules governing PLpro substrate specificity. Leveraging this information, we next designed and biochemically characterized potent inhibitors (VIR250 and VIR251) harboring high selectivity for SARS-CoV-2 PLpro and the related SARS-CoV-1 PLpro versus other proteases. We determined crystal structures of VIR250 and VIR251 in complex with SARS-CoV-2 PLpro, which reveal their inhibitory mechanisms and provide a structural basis for the observed substrate specificity profiles. The unexpected finding that the P4 amino acids of VIR250 and VIR251 occupy opposite sides of the broad S4 pocket of SARS-CoV-2 PLpro, and that there are additional regions of this pocket that are unengaged by either inhibitor, raises the possibility that our structures will inform future drug discovery efforts. Last, we examined processing of Ub and Ubl protein variants by SARS-CoV-1 and -2 PLpro. These studies revealed that SARS-CoV-2 PLpro harbors deISGylating activities similar to SARS-CoV-1 PLpro, but its ability to hydrolyze K48-linked Ub chains is substantially diminished, a difference for which our sequence and structural analyses provide a basis. This finding is important, given the role of Ub and ISG15 conjugation in evasion of the host innate immune response. Together, our data also give hope for the design of a drug that can act as a pan-selective inhibitor against both SARS-CoV PLpro and SARS-CoV-2 PLpro and may have some universal value against emerging coronaviruses in the near future.

RESULTS

Substrate specificity profile

SARS-CoV-2 PLpro recognizes the tetrapeptide LXGG motif found between viral proteins nsp1 and nsp2, nsp2 and nsp3, and nsp3 and nsp4 (Fig. 1A) (7, 8). Hydrolysis of the peptide bond on the carboxyl side of glycine at the P1 position leads to the release of nsp1, nsp2, and nsp3 proteins, which are essential for viral replication. SARS-CoV-2 PLpro also harbors deubiquitinating and deISGylating activities and recognizes the conserved LRGG motif at the C terminus of these proteins (Fig. 1A). Our previous studies of SARS-CoV-1 PLpro substrate preferences using a combinatorial substrate library containing only natural amino acids revealed that this protease recognizes the LXGG motif at the P4-P1 positions with broad substrate specificity at the P3 position (25). These results suggest that more detailed mapping of the binding pocket architecture should facilitate the design of new, active substrates and optimal peptide sequences for inhibitor development efforts. To achieve this goal, we developed a defined and combinatorial substrate library (HyCoSuL) containing a wide variety of nonproteinogenic amino acids (24).
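The LXGG recognition rule described above can be illustrated with a minimal sketch that scans a one-letter amino acid sequence for Leu-X-Gly-Gly and reports the position immediately after the P1 glycine, where cleavage would occur. The function name `find_lxgg_sites` and the toy sequence are hypothetical and are not taken from any viral polyprotein; the sketch encodes only the sequence motif, not the exosite interactions that also contribute to substrate recognition.

```python
import re

# Lookahead pattern so that overlapping LXGG motifs are all reported.
# P4 = Leu, P3 = any residue (X), P2 = Gly, P1 = Gly.
LXGG = re.compile(r"(?=(L[A-Z]GG))")


def find_lxgg_sites(sequence):
    """Return (motif, cleavage_index) pairs for each LXGG motif.

    cleavage_index is the 0-based index of the first residue after the
    P1 glycine, i.e. the scissile bond lies between index - 1 and index.
    """
    seq = sequence.upper()
    sites = []
    for match in LXGG.finditer(seq):
        start = match.start(1)
        sites.append((match.group(1), start + 4))
    return sites


if __name__ == "__main__":
    toy = "MKTAYLNGGAPSELRGGKVDE"  # hypothetical sequence for illustration only
    for motif, cut in find_lxgg_sites(toy):
        print(motif, toy[:cut], "|", toy[cut:])
```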

Since tetrapeptide fluorogenic substrates are not very efficiently hydrolyzed by enzymes exhibiting deubiquitinating activity, we designed and synthesized the P2 defined library with a general structure of Ac-LRXG-ACC (X: 19 natural and 109 nonproteinogenic amino acids) and a HyCoSuL, where three positions were fixed and one position contains an equimolar mixture of 19 amino acids (Mix) (P3 sublibrary: Ac-Mix-P3-Gly-Gly-ACC, P4 sublibrary: Ac-P4-Mix-Gly-Gly-ACC; P3 and P4—a natural or nonproteinogenic amino acid) (26). By design of libraries with tailored peptide scaffold toward deubiquitinases (DUBs), we could reach the highest possible concentration of individual fluorogenic substrates in each sublibrary during the assay.
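The library layout described above can be written out as a short sketch. The three general structures (Ac-LRXG-ACC, Ac-Mix-P3-Gly-Gly-ACC, and Ac-P4-Mix-Gly-Gly-ACC) and the counts of 19 natural and 109 nonproteinogenic amino acids come from the text; the function names and the short residue list are illustrative assumptions, since the full residue set is given only in table S1.

```python
N_NATURAL = 19            # natural amino acids at the defined position (from the text)
N_NONPROTEINOGENIC = 109  # nonproteinogenic amino acids (from the text)


def p2_library(residues):
    """Defined P2 library: Leu and Arg fixed at P4/P3, X varied at P2."""
    return [f"Ac-Leu-Arg-{x}-Gly-ACC" for x in residues]


def p3_sublibrary(residues):
    """P3 sublibrary: equimolar mixture (Mix) at P4, defined residue at P3."""
    return [f"Ac-Mix-{x}-Gly-Gly-ACC" for x in residues]


def p4_sublibrary(residues):
    """P4 sublibrary: defined residue at P4, equimolar mixture (Mix) at P3."""
    return [f"Ac-{x}-Mix-Gly-Gly-ACC" for x in residues]


if __name__ == "__main__":
    # Illustrative subset only; names follow the abbreviations used in the text.
    example_residues = ["Gly", "Dap", "Phe(guan)", "hTyr", "Abu(Bth)"]
    print("Substrates per defined position:", N_NATURAL + N_NONPROTEINOGENIC)
    for name in p2_library(example_residues)[:2]:
        print(name)
    for name in p3_sublibrary(example_residues)[1:3]:
        print(name)
    for name in p4_sublibrary(example_residues)[3:]:
        print(name)
```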

P2 library screening revealed that SARS-CoV and SARS-CoV-2 PLpro have very high substrate specificity at this position: only glycine is accepted (fig. S1). Both proteases exhibit a broad substrate preference at the P3 position (Fig. 1B). The S3 pocket of SARS-CoV and SARS-CoV-2 PLpro can tolerate not only positively charged residues like Phe(guan), Dap, Dab, Arg, Lys, Orn, and hArg but also hydrophobic amino acids, such as hTyr, Phe(F5), Cha, Met, Met(O), Met(O)2, and d-hPhe (amino acid structures are presented in table S1). These enzymes do not recognize acidic residues or most d-amino acids (the exceptions are d-Arg, d-hPhe, d-Lys, and d-Phg). The S4 pocket of SARS-CoV and SARS-CoV-2 PLpro can accommodate hydrophobic residues only; among natural amino acids, practically only leucine is tolerated (being the best hit for SARS-CoV-2 PLpro) (Fig. 1B). SARS-CoV PLpro recognized two nonproteinogenic residues better than leucine at the P4 position [hTyr and hTyr(Me)]. Other bulky amino acids are also accepted [≥30%, hPhe, Abu(Bth), Phe(3-I), Cys(Bzl), Cys(MeBzl), Cys(4-MeOBzl), hSer(Bzl), and Dht] (Fig. 1B).

Design and kinetic analysis of tetrapeptide fluorogenic substrates

To validate the library-screening data, we designed tetrapeptide fluorogenic substrates to find the optimal sequences recognized by SARS-CoV and SARS-CoV-2 PLpro. We analyzed the substrate specificity profiles of both SARS PLpro enzymes at the P4-P2 positions and selected the most preferred amino acids [P2: Gly; P3: Dap and Phe(guan); P4: hTyr, hPhe, and Abu(Bth)] (Fig. 1E). Kinetic analysis revealed that some designed substrates were better recognized by SARS-CoV-1 PLpro, with Ac-hTyr-Dap-Gly-Gly-ACC being cleaved almost 2.5 times more efficiently than the endogenous Ac-LRGG-ACC. In the case of SARS-CoV-2 PLpro, we did not find a substantial difference between Ac-LRGG-ACC and any of the tested substrates (Fig. 1C). It is important to note that substitution of Arg at the P3 position by the relatively small Dap did not affect binding to the S3 pocket and yielded very good substrates (Fig. 1C). Thus, data obtained from combinatorial screening translate very well into individual substrates and demonstrate a very high level of similarity between the two investigated enzymes.

Next, we wanted to see whether incorporation of nonproteinogenic amino acids in P4 and P3 positions of peptide sequence can result in selective tetrapeptide substrates. We tested the substrates with four enzymes that exhibit deubiquitinating activity—SARS-CoV PLpro, SARS-CoV-2 PLpro, MERS-CoV PLpro, and human DUB UCH-L3. We have found that none of the substrates with nonproteinogenic amino acids in the sequence were substantially recognized at 10 μM either by MERS-CoV PLpro (2.5 μM) or human DUB UCH-L3 (8 μM) (Fig. 1D). In line with previous data, Ac-LRGG-ACC was recognized by all four enzymes (Fig. 1D).

Development of PLpro inhibitors

To further analyze the selectivity of peptide sequences with nonproteinogenic amino acids, we converted two substrates [Ac-hTyr-Dap-Gly-Gly-ACC and Ac-Abu(Bth)-Dap-Gly-Gly-ACC] into inhibitors by exchanging the fluorescent tag for a reactive group, vinylmethyl ester (VME). A VME group was selected due to its broad reactivity toward DUBs (inhibitor selectivity is determined by the tetrapeptide sequence). The results from kinetic analysis of SARS-CoV PLpro and SARS-CoV-2 PLpro inhibitors reflected those of substrate hydrolysis (Fig. 2, A and B). Ac-hTyr-Dap-Gly-Gly-VME (hereafter referred to as VIR251) was a more potent but less selective inhibitor of these enzymes than Ac-Abu(Bth)-Dap-Gly-Gly-VME (hereafter referred to as VIR250). Both compounds exhibit high selectivity for SARS-PLpro variants and robustly inhibit both SARS-CoV PLpro and SARS-CoV-2 PLpro activities. In contrast, practically no inhibition of human DUB UCH-L3 and only a slight inhibition of MERS-PLpro was observed (Fig. 2, A and B). Furthermore, incubation of HeLa lysates with Ub-VME yields a cross-linking profile that is unaltered by titrations of VIR250 or VIR251 (Fig. 2C). Since the major cross-linking targets of Ub-VME are known to be human DUB enzymes, these data suggest that VIR250 and VIR251 do not cross-react with human DUBs. This is an important finding in the search for a selective antiviral molecule with minimal cross-reactivity with human DUBs.

Structures of CoV-2 PLpro in complex with VIR250 and VIR251

We next set out to determine crystal structures of SARS-CoV-2 PLpro in complex with VIR250 and VIR251 to gain insights into the molecular mechanism by which these molecules inhibit SARS-CoV-2 PLpro activity as well as the basis for the observed substrate selectivity profile. Catalytic cysteine-111 of CoV-2 PLpro engages in Michael addition to the β carbon of the vinyl group of the VME warheads of VIR250 and VIR251, resulting in formation of a covalent thioether linkage (Fig. 2D). Large-scale cross-linking reactions yielded CoV-2 PLpro-VIR250 and CoV-2 PLpro-VIR251 complexes of yield and purity sufficient for growth of diffraction-quality crystals. The structure of CoV-2 PLpro in complex with VIR250 (Fig. 2D) [Protein Data Bank (PDB): 6WUU] was determined by molecular replacement using the recently determined structure of apo CoV-2 PLpro (PDB: 6W9C) and was refined at 2.79 Å resolution to R/Rfree values of 0.195/0.230 (table S2). This structure was used as the molecular replacement search model for determination of the structure of CoV-2 PLpro in complex with VIR251 (Fig. 2E) (PDB: 6WX4). The CoV-2 PLpro/VIR251 structure was refined at 1.65 Å resolution to R/Rfree values of 0.170/0.196 (table S2).

Comparison of apo CoV-2 PLpro to the CoV-2 PLpro/VIR250 and CoV-2 PLpro/VIR251 complexes reveals similar overall structures with the exception of the β14-β15 loop, which is situated proximal to the active site and undergoes a conformational change that is likely due to inhibitor binding (fig. S2) (see below). This analysis shows that there are also slight rigid body rotations of the finger and Ubl domains of CoV-2 PLpro that are likely due to crystal packing effects. Analysis of the structures reveals extensive electron density projecting from the catalytic Cys111 side chain of CoV-2 PLpro into which all the atoms of VIR250 and VIR251 could unambiguously be placed (Fig. 2, D and E). Furthermore, the covalent bonds between Cys111 and both VIR250 and VIR251 are clear (Fig. 2, D and E). As anticipated, both VIR250 and VIR251 occupy the S4-S1 pockets of CoV-2 PLpro in proximity to the active site and adopt similar structures with the exception of the orientation of the P4 substituents, which will be discussed in greater detail below. The P4 position is the only region of chemical divergence between VIR250 and VIR251, with an Abu(Bth) in VIR250 and an hTyr in VIR251 (Fig. 2, A and B).

Molecular recognition of VIR250 and VIR251

Analysis of the CoV-2 PLpro/VIR250 (PDB: 6WUU) and CoV-2 PLpro/VIR251 (PDB: 6WX4) complexes reveals a similar network of interacting residues, with ~560 Å² of a total of ~775 Å² solvent-accessible area of VIR251 and ~600 Å² of a total of ~800 Å² solvent-accessible area of VIR250 buried upon complex formation. With the exception of the P4 positions of VIR250 and VIR251, which engage largely in hydrophobic interactions with CoV-2 PLpro, most of the interactions at the P1-P3 positions of both inhibitors are mediated through polar interactions and hydrogen bonds (Fig. 3, A and B). At the P1 position of VIR250, GlyVME is covalently linked via a thioether bond to catalytic Cys111 of CoV-2 PLpro and engages in a backbone-backbone hydrogen bond to Gly271 (Fig. 3A). At the P2 position of VIR250, Gly engages in two backbone-backbone hydrogen bonds to Gly163 and in van der Waals contacts to Leu163 and Tyr164 of CoV-2 PLpro, and P3 Dap of VIR250 participates in a backbone-backbone hydrogen bond with Gly271 (Fig. 3A). The network of backbone-backbone hydrogen bonds formed at the P3-P1 positions of VIR250 is fully conserved in VIR251 (Fig. 3B). In contrast, while the methylester group from the GlyVME warhead of VIR250 engages in a hydrogen bond with His272 from the catalytic triad of CoV-2 PLpro, the corresponding methylester of VIR251 participates in hydrogen bonds with the Trp106 and Asn109 side chains, which are proposed to contribute to oxyanion hole stabilization (Fig. 3, A and B). Trp106 adopts a different conformation and is poorly ordered in the VIR250 complex (Fig. 3A).
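As a quick arithmetic check on the approximate surface areas quoted above, the fraction of each inhibitor's solvent-accessible area that becomes buried on complex formation can be computed directly. The sketch below uses only the rounded values from the text; the function and variable names are illustrative.

```python
def buried_fraction(buried_area, total_area):
    """Fraction of the solvent-accessible area buried on complex formation."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    return buried_area / total_area


# Approximate values (in square angstroms) as quoted in the text:
# (buried area, total solvent-accessible area of the inhibitor).
complexes = {
    "VIR250": (600.0, 800.0),
    "VIR251": (560.0, 775.0),
}

if __name__ == "__main__":
    for name, (buried, total) in complexes.items():
        print(f"{name}: {100 * buried_fraction(buried, total):.1f}% buried")
```

With these rounded inputs, roughly three quarters of each inhibitor's accessible surface is buried, consistent with the similar interaction networks described for the two complexes.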

There are important differences in how the side chains of the P3 and P4 positions of VIR250 and VIR251 engage CoV-2 PLpro. The side-chain amine of Dap at the P3 position of VIR250 engages in a hydrogen bond with the backbone carbonyl oxygen of Tyr268, and the P4 Abu(Bth) projects toward Met208, Pro247, Pro248, and Thr301, where it engages in a network of van der Waals interactions (Fig. 3A). In contrast, in VIR251 it is the backbone amine of P3 Dap that engages in the hydrogen bond to the carbonyl oxygen of Tyr268, and unexpectedly, hTyr at the P4 position projects toward the opposite side of the S4 pocket compared to Abu(Bth) from VIR250 by extending toward Pro248, Tyr264, and Tyr268 of CoV-2 PLpro and participating in a distinct network of van der Waals interactions (Fig. 3B). This new network of interactions is facilitated by a 1.5-Å shift of the β14-β15 loop (Asn267, Tyr268, and Gln269) toward the hTyr of VIR251 (Fig. 3C), thereby enabling many novel contacts that could not occur in the absence of this shift. Notably, all of the CoV-2 PLpro residues involved in contacts to both VIR250 and VIR251 are fully conserved in SARS-CoV-1 PLpro, and the overall structures of the two SARS PLpro variants are very similar in the catalytic site of the enzyme, which likely accounts for the ability of these inhibitors to target both enzymes (Figs. 3C and 5).

In terms of how our structures correlate with the observed substrate selectivity profiles described above, P2 dependence on Gly is the result of residues from the β14-β15 and α5-α6 loops of CoV-2 PLpro (notably, Leu162, Tyr264, Cys270, Gly271, and Tyr273) clamping down on top of the P2 position, leaving no room for side-chain atoms at the R position (Fig. 3, A to C). The preference for positive and hydrophobic residues and selection against acidic residues at the P3 position is likely the result of its broader pocket and proximity to the acidic carbonyl oxygens of Tyr268, Gln269, and Leu162, the side chain of Asp164, as well as the hydrophobic side chains of Leu162 and Tyr268 (Fig. 3, A to C). At the P4 position, the strong preference for bulky hydrophobic residues can be explained by the hydrophobic nature of the P4 binding pocket that is largely formed by residues Met208, Pro247, Pro248, Tyr264, and Tyr268 (Fig. 3, A and B). Notably, the very deep and broad nature of the S4 pocket of SARS-CoV-2 PLpro has been exploited by the Abu(Bth) and hTyr side chains at P4 of VIR250 and VIR251, which, as noted above, project toward different ends of the S4 pocket and engage in distinct networks of contacts (Fig. 3C). That said, there remain regions at the deepest parts of this pocket, particularly an acidic patch formed by Asp164, Tyr273, and Thr301, that could potentially be exploited for development of more potent inhibitors.

Processing of Ub and Ubl variants by CoV-1 and CoV-2 PLpro

To more thoroughly examine these differences, we performed a comparison of the kinetics of SARS-CoV-1 and CoV-2 PLpro processing of LRGG-ACC, Ub-ACC, and ISG15-AMC fluorogenic substrates. The results of this experiment show that SARS-CoV-2 PLpro processes Ub-ACC fourfold less efficiently compared to SARS-CoV-1 PLpro and that SARS-CoV-2 PLpro processes ISG15-AMC 60-fold more efficiently than Ub-ACC (Fig. 4C). Furthermore, SARS-CoV-2 PLpro, like SARS-CoV-1 PLpro, more robustly processes K48 tetraUb compared to K63 tetraUb (Fig. 4D), and cross-links to the ABP Ub-VS similarly to SARS-CoV-1 PLpro, MERS PLpro, and the human DUB USP2CD (Fig. 4E). Yet, in side-by-side comparison, SARS-CoV-2 PLpro demonstrates a substantially diminished ability to process K48 tetraUb compared to SARS-CoV-1 PLpro (Fig. 4F). This was an unexpected finding, as we and others have shown before that SARS-CoV-1 PLpro displays a preference for recognition of K48 diUb linkages over ISG15 (13, 15). As expected and shown before, both MERS PLpro and USP2CD efficiently process both types of Ub chains.

The substantially diminished ability of SARS-CoV-2 PLpro to process K48 polyUb chains compared to SARS-CoV-1 PLpro was unexpected considering the very high overall similarity between the enzymes (83% identity, 9% similarity) (Fig. 3D). To try to reconcile this apparent contradiction, we compared our SARS-CoV-2 PLpro structures with the previously reported structure of SARS-CoV-1 PLpro in complex with K48 diUb (Fig. 5) (PDB: 5E6J) (15). This structure revealed three key interfaces: (i) the catalytic site that accommodates the C terminus of Ub with L73, R74, G75, and G76 constituting the S4-S1 residues; (ii) a binding site for the “S1 Ub,” which is the Ub N-terminal to the cleavage site in the K48 polyUb chain; and (iii) a binding site for the “S2 Ub,” which is N-terminal to the S1 Ub (Fig. 5). Comparative analysis of the catalytic sites of SARS-CoV-1 and 2 PLpro shows a 100% conservation of residues involved in contacts to the S4-S1 positions of S1 Ub, VIR250, and VIR251 (Fig. 3D) and expectedly a very similar structure in this region (Fig. 5). While the Ub S1 site harbors more variability than the catalytic site, the overall amino acid conservation is still very high (83% identity, 17% similarity) (Fig. 3D) and the structures align well in this region (Fig. 5).

In contrast to the catalytic and S1 Ub sites, the S2 Ub site of SARS-CoV-2 PLpro harbors much less conservation at the amino acid level (67% identity, 13% similarity) compared to SARS-CoV-1 PLpro (Fig. 3D), and there are several structural differences in these regions that are important for molecular recognition of the S2 Ub (Fig. 5). A key interaction surface at this interface is formed by the Ile44 hydrophobic patch (formed by Leu8, Ile44, and Val70), and in the SARS-CoV-1 PLpro/K48 diUb structure, the Ile44 patch of the S2 Ub engages in a network of hydrophobic contacts with Phe70, Leu75, and others (Fig. 5). Leu75 is changed to a threonine in SARS-CoV-2, which would be unable to engage in a similar network of contacts with S2 Ub as leucine. Furthermore, much of the SARS-CoV-2 structure in proximity to the S2 Ub site adopts a slightly different structure, including a ~3-Å translation of Thr75 of SARS-CoV-2 PLpro relative to Leu75 of SARS-CoV-1 PLpro and other notable amino acid changes of S66V and E77P (Fig. 5). Last, Glu179 of SARS-CoV-1 PLpro engages in hydrogen bonds to Thr9 and Lys11 of S2 Ub (Fig. 5). This residue has changed to an aspartate in SARS-CoV-2 PLpro, which is a conservative change, but the shorter aspartate side chain is unable to engage in a similar set of contacts (Fig. 5). Based on our analysis, we posit that the diminished ability of SARS-CoV-2 PLpro to process K48 polyUb is largely due to the aforementioned differences at the S2 Ub binding site. Consistent with this hypothesis, mutation of Leu75 of SARS-CoV-1 PLpro to serine resulted in a fivefold reduction in binding of K48 diUb with no apparent effect on monoUb (13). In combination with other changes in SARS-CoV-2 PLpro such as E179D, there appears to be a cumulative effect of several relatively minor changes between SARS-CoV-1 and -2 PLpro at the S2 binding site that together have a substantial effect on their ability to process K48 polyUb.
Whether these changes also account for the apparent preference of SARS-CoV-2 PLpro for ISG15 over Ub and whether these intriguing differences in the function of SARS-CoV-1 and 2 PLpro have any effect on the biology of the viruses remain to be seen. Notably, our conclusion regarding the deISGylating activity and diminished processing of K48-Ub linkages by CoV-2 PLpro relative to CoV-1 PLpro has been independently corroborated by two preprints (27, 28) and one recent manuscript (29).

DISCUSSION AND CONCLUSIONS

The outbreak of the current coronavirus pandemic leading to COVID-19 disease has markedly accelerated research into effective drugs and a vaccine to treat this disease. SARS-CoV-2 PLpro is an excellent candidate for antiviral drug development, as its inhibitors would not only block virus replication but also inhibit the dysregulation of signaling cascades in infected cells (9). A detailed molecular understanding of CoV-2 PLpro substrate specificity, structure, and mechanism would greatly facilitate development of effective PLpro inhibitors by enabling rational design and research on drug retargeting, and this was the major focus of our study. In this study, we examined SARS-CoV-2 PLpro substrate preferences at positions P4-P2 and compared them directly with those of the well-known protein from the 2002/03 SARS virus, SARS-CoV PLpro. For this purpose, we used positional scanning technology with natural and nonproteinogenic amino acids (HyCoSuL) (24). Library screening revealed that both enzymes recognize only Gly at P2 and have broad substrate specificity at P3 and rather narrow substrate specificity at the P4 position. Moreover, direct analysis of the preferences of both enzymes demonstrates that the architecture of the S4-S2 pockets is almost identical, because they recognize natural and nonproteinogenic amino acids in a very similar way. The differences in activity for a given amino acid between the two enzymes observed at some positions are very small, and there are no amino acids that are recognized by one enzyme only. This is also confirmed by the analysis of the amino acids building the S4-S2 pockets in both enzymes, which are identical (Fig. 1B and fig. S1). This is critically important for applying information from past research on SARS-CoV PLpro inhibitors or drug retargeting immediately to SARS-CoV-2 PLpro.
Analysis of kinetic parameters for tetrapeptide substrates for both enzymes shows a high degree of similarity in terms of kcat/Michaelis constant (KM) values, indicating that the catalytic efficiencies of both enzymes are also similar. The sequences containing nonproteinogenic amino acids at the P4-P3 positions were recognized only by the two SARS PLpro enzymes, not by MERS-PLpro or the human DUB UCH-L3. This opens the door to the potential application of the specific SARS-PLpro substrates developed in our work in cell culture studies, such as localization of the targeted proteases and virus.

We next leveraged the information we gained regarding the molecular rules governing substrate selectivity by SARS-CoV-2 PLpro to develop covalent inhibitors VIR250 and VIR251. These inhibitors proved to be active and selectively inhibited the SARS-CoV-1 and -2 PLpro, but exhibited much weaker activity toward MERS-PLpro and practically no activity toward human DUB UCH-L3. This is valuable information in terms of conducting research toward the search for peptide antiviral compounds targeted to this enzyme. Our crystal structures of VIR250 and VIR251 in complex with SARS-CoV-2 PLpro reveal their inhibitory mechanisms and provide a structural basis for the observed substrate specificity profiles. Furthermore, the unexpected findings that the P4 amino acids of VIR250 and VIR251 occupy opposite sides of the broad S4 pocket of SARS-CoV-2 PLpro and that there are additional regions of this pocket that are unengaged by either inhibitor raise the possibility that our structures will inform future drug discovery efforts aimed at generating more potent inhibitors. Comparative analysis of the substrate specificity of SARS-CoV-2-Mpro and SARS-CoV-2 PLpro indicates that they have markedly different substrate specificity (30). This indicates that for peptidic inhibitors, it will be impossible to design an inhibitor that will act on both enzymes simultaneously. However, if peptidic inhibitors were found for both proteases separately, then it would probably be possible to use them as a cocktail. Another possible approach is searching for a small-molecule inhibitor that would promiscuously inhibit both Mpro and PLpro. Such an inhibitor would certainly be very beneficial in the therapeutic treatment of COVID-19, but it should be remembered that it could cross-react with human cysteine proteases, which could lead to undesirable side effects (31–33).

Furthermore, our substrate specificity studies conducted for SARS-CoV-2-Mpro (30) and here for SARS-CoV-2 PLpro indicate that both enzymes have substrate specificity virtually identical to that of their homologs from the previous SARS virus. Thus, the shapes of the binding pockets are virtually unchanged. This is valuable information from the standpoint of designing inhibitors as drugs for these enzymes. For the next SARS-type coronavirus that emerges in the future, there will definitely be a need to create a new vaccine, which is a time-consuming process. On the other hand, antiviral drugs developed on the basis of knowledge obtained from studies on SARS-1 and SARS-2 proteases will have a chance for immediate use in treatment through drug repurposing. This further indicates the high potential of both proteases as medical targets. Another possible application of our inhibitors is their use as selective ABPs to visualize SARS-CoV-2 PLpro activity in cells or even in COVID-19 diagnostics. Similar studies have already been conducted on the use of ABPs for the smallpox K7L protease and for Zika, West Nile, or dengue virus proteases (34–36).

Last, we examined processing of Ub and Ubl variants by SARS-CoV-1 and -2 PLpro and found that SARS-CoV-2 PLpro harbors deISGylating activities similar to SARS-CoV-1 PLpro but its ability to hydrolyze K48-linked Ub chains is substantially diminished. This was an unexpected result considering the very high sequence identity between SARS-CoV-1 and -2 PLpro; however, our structure analysis revealed subtle structural and sequence variations in the S2 Ub binding site of SARS-CoV-2 PLpro that we posit collectively diminish the ability of the S2 Ub of K48 polyUb to bind and subsequently be processed. Furthermore, analysis of the enzyme kinetics of the Ub-ACC substrate indicates that it is efficiently processed by the enzyme, but the difference between the tetrapeptide substrate and Ub is only about 10-fold, whereas in the case of SARS-CoV-1 PLpro, this difference is around 60-fold (Fig. 4C). This indicates some differences between the two enzymes in their interactions at the exosite binding region, related to amino acid identity and similarity. Given the role of Ub and ISG15 conjugation in evasion of the host innate immune responses, whether these intriguing differences in the function of SARS-CoV-1 and -2 PLpro have any effect on the biology of the viruses remains to be seen and will be the topic of future studies. Notably, two preprints (27, 28) and one recent manuscript (29) have all independently come to the same conclusion as we have regarding the deISGylating activity and diminished processing of K48-Ub linkages of CoV-2 PLpro relative to CoV-1 PLpro and have validated CoV-2 PLpro as a viable target for antiviral development.

Collectively, our work has revealed the molecular rules governing PLpro substrate specificity and a very high level of sequence and structural similarity between SARS-CoV-1 and -2 PLpro in the substrate binding pocket. These findings signal that previously discovered information on SARS-CoV-1 PLpro can immediately be applied to the search for effective antiviral molecules and retargeting of known drugs for the inhibition of SARS-CoV-2 PLpro. Furthermore, structures of the novel inhibitors VIR250 and VIR251 in complex with SARS-CoV-2 PLpro provide a framework for rational development of inhibitors with improved potency and of ABPs. It is worth noting that a flurry of preprints have reported SARS-CoV-1 PLpro drug-repurposing studies against SARS-CoV-2 PLpro, showing that existing compounds can inhibit it (27–29, 37). We believe our profiling and crystallographic studies open up additional avenues in developing inhibitors with improved properties. Together, our data also provide hope for the design of a drug that can act as a pan-selective inhibitor against both SARS-CoV-1 PLpro and SARS-CoV-2 PLpro and may have some universal value against emerging coronaviruses in the near future.

MATERIALS AND METHODS

Plasmids

For biochemical assays, the cDNA for PLpro corresponding to amino acids 745 to 1061 of SARS-CoV-2 NSP3 was codon-optimized for Escherichia coli expression, synthesized, and cloned into pGEX6P-1 (GE Healthcare, UK) using the Bam HI and Not I sites by Gene Universal (USA) for expression as a PreScission protease cleavable N-terminally glutathione S-transferase (GST)–tagged protein (table S3). For crystallization studies, the codon-optimized SARS-CoV-2 cDNA was cloned into the Nde I and Xho I sites for expression with a C-terminal uncleavable 6× His tag (table S3). The plasmids were transformed into the BL21 (DE3) Codon Plus E. coli strain for protein expression.

Protein expression and purification

SARS-CoV PLpro, UCH-L3, and MERS-PLpro were obtained as described earlier (14, 25). SARS-CoV-2 PLpro–transformed cells were grown in LB broth at 37°C with shaking until the optical density at 600 nm reached 1.5. Isopropyl-β-d-thiogalactopyranoside (0.1 mM) and ZnSO4 (0.1 mM) were added to induce protein expression overnight at 18°C. The cell pellet was resuspended in lysis buffer [20 mM tris-Cl (pH 8.0), 350 mM NaCl, and 2 mM β-mercaptoethanol] and lysed using sonication. The lysate was cleared by centrifugation at 35,000g for 30 min at 4°C. The cleared lysate was loaded onto Glutathione Sepharose 4B (GE), followed by washing with lysis buffer. The GST-tagged PLpro was eluted in lysis buffer supplemented with 20 mM reduced glutathione (pH 8.0). The fusion protein was cleaved using GST-PreScission protease at 4°C overnight, followed by desalting and passing through fresh glutathione beads to remove cleaved GST and PreScission protease. The sample was further purified using Superdex 200-pg size-exclusion columns (GE) equilibrated with 20 mM tris-Cl (pH 8.0), 40 mM NaCl, and 2 mM dithiothreitol (DTT). The purified protein was then concentrated to ~10 mg/ml and snap-frozen in liquid nitrogen for later use.

Reagents

The reagents used for the solid-phase peptide synthesis (SPPS) were as follows: Rink amide (RA) resin (particle size 100 to 200 mesh; loading 0.74 mmol/g), 2-chlorotrityl chloride resin (particle size 100 to 200 mesh; loading 0.97 mmol/g), all 9-fluorenyl methoxycarbonyl–amino acids, O-benzotriazole-N,N,N,N-tetramethyl-uronium-hexafluoro-phosphate, 2-(1-H-7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HATU), piperidine, diisopropylcarbodiimide, and trifluoroacetic acid (TFA), purchased from Iris Biotech GmbH (Marktredwitz, Germany); anhydrous N-hydroxybenzotriazole (HOBt) from Creosalus (Louisville, KY, USA); 2,4,6-collidine (2,4,6-trimethylpyridine), high-performance liquid chromatography (HPLC)–grade acetonitrile, triisopropylsilane (TIPS), tBu-N-allyl carbamate, toluene, methyl acrylate, dichlorophenylborane, and second generation Grubbs catalyst from Sigma-Aldrich (Poznan, Poland); and N,N-diisopropylethylamine from VWR International (Gdansk, Poland). N,N-dimethylformamide (DMF), dichloromethane (DCM), methanol (MeOH), diethyl ether (Et2O), acetic acid (AcOH), and phosphorus pentoxide (P2O5) were obtained from Avantor (Gliwice, Poland). Individual substrates, Ub-ACC, and B-Ub-VME were purified by HPLC on a Waters M600 solvent delivery module with a Waters M2489 detector system using a semipreparative Wide Pore C8 Discovery column and Jupiter 10-μm C4 300-Å column (250 mm × 10 mm). The solvent composition was as follows: phase A (water/0.1% TFA) and phase B (acetonitrile/0.1% TFA). The purity of each compound was confirmed with an analytical HPLC system using a Jupiter 10-μm C4 300-Å column (250 mm × 4.6 mm). The solvent composition was as follows: phase A (water/0.1% TFA) and phase B (acetonitrile/0.1% TFA); gradient, from 5 to 95% B over a period of 15 or 20 min. The molecular weight of each substrate and B-Ub-VME was confirmed by high-resolution mass spectrometry (HRMS) on a Waters LCT Premier XE high-resolution mass spectrometer with electrospray ionization and a time-of-flight module.

Combinatorial and defined substrate library synthesis

A detailed protocol for combinatorial and defined tetrapeptide fluorogenic substrate library synthesis has been described elsewhere (26).

Determination of SARS-CoV and SARS-CoV-2 PLpro substrate specificity

Library screening was performed using a spectrofluorometer (Molecular Devices SpectraMax Gemini XPS) in 96-well plates containing substrates and enzymes. Assay conditions were 1 μl of substrate in dimethyl sulfoxide (DMSO) and 99 μl of enzyme, which had been incubated for 15 min at 37°C in assay buffer [150 mM NaCl, 20 mM tris, and 5 mM DTT (pH 8.0) for SARS-CoV PLpro; 5 mM NaCl, 20 mM tris, and 5 mM DTT (pH 8.0) for SARS-CoV-2 PLpro]. The final substrate concentration in each well was 200 μM for the combinatorial library and 100 μM for the defined P2 library. The final enzyme concentration was 1 μM SARS-CoV PLpro and 0.5 μM SARS-CoV-2 PLpro for P3 and P4 sublibraries and 0.1 μM SARS-CoV PLpro and 75 nM SARS-CoV-2 PLpro for Ac-Leu-Arg-P2-Gly-ACC. The release of ACC was measured continuously for 45 min (λex = 355 nm, λem = 460 nm). SARS-CoV and SARS-CoV-2 PLpro substrate specificity profiles were established by setting the highest rate (in relative fluorescence units per second), obtained for the best substrate, to 100% and scaling the other results accordingly.
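The normalization step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis script used in this study: the function name and the example rates are hypothetical, and the rates are assumed to be already background-corrected slopes in relative fluorescence units per second.

```python
def normalize_profile(rates_rfu_per_s):
    """Scale hydrolysis rates so that the best substrate equals 100%.

    rates_rfu_per_s maps a substrate (or sublibrary) label to its
    measured rate in relative fluorescence units per second (RFU/s).
    """
    best = max(rates_rfu_per_s.values())
    if best <= 0:
        raise ValueError("at least one substrate must show a positive rate")
    return {name: 100.0 * rate / best for name, rate in rates_rfu_per_s.items()}


# Hypothetical rates for three residues at one position (made-up numbers).
example_rates = {"Leu": 4.0, "hTyr": 8.0, "Ala": 1.0}
profile = normalize_profile(example_rates)
```

With these made-up inputs, the fastest substrate is assigned 100% and the others are expressed as a fraction of it, which is what allows positions screened at different enzyme concentrations to be compared on one scale.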

Synthesis of tetrapeptide fluorogenic substrates and Ub-ACC

Individual fluorogenic substrates were synthesized on a solid support using the SPPS method as previously described (24, 38). Each substrate was purified by HPLC and analyzed using analytical HPLC and HRMS. The purity of each compound was ≥95%. The individual substrates were dissolved at 20 mM in DMSO and stored at −80°C until use.

Kinetic studies of individual tetrapeptide substrates and Ub-ACC

Individual substrate hydrolysis was measured in the same assay conditions as for library screening. The final substrate concentration was 10 μM, SARS-CoV and SARS-CoV-2 PLpro concentration was 0.1 μM, MERS-CoV PLpro was 2.5 μM, and UCH-L3 was 8.8 μM. MERS-CoV PLpro and UCH-L3 were incubated for 30 min at 37°C in assay buffer [MERS-CoV PLpro: 150 mM NaCl, 20 mM tris, 5 mM DTT (pH 8.0); UCH-L3: 50 mM Hepes, 0.5 mM EDTA, 5 mM DTT (pH 7.5)] before being added to the wells of the plate. The measurements were repeated at least three times, and the results are presented as mean values with SDs. Kinetic parameters were determined for selected tetrapeptide substrates and Ub-ACC toward SARS-CoV and SARS-CoV-2 PLpro. Wells contained 20 μl of substrate in assay buffer at eight different concentrations (0.88 to 20 μM) and 80 μl of enzyme (0.5 μM SARS-CoV and SARS-CoV-2 PLpro for tetrapeptide substrates and 80 nM SARS-CoV-2 PLpro and 10 nM SARS-CoV PLpro for Ub-ACC). Substrate hydrolysis was measured for 30 min at the appropriate wavelength (λex = 355 nm, λem = 460 nm). Each experiment was carried out at least three times, and the results are reported as averages with SD. Due to the precipitation of tetrapeptide substrates at high concentrations, only the specificity constant (kcat/KM) was determined. When [S0] << KM, the plot of vi (the initial velocity) versus [S0] yields a straight line with a slope representing Vmax/KM, so that kcat/KM = slope/E (E, total enzyme concentration). The kinetic parameter for ISG15-AMC toward SARS-CoV-2 PLpro was determined in the same manner as described above. The final enzyme concentration was 1 nM, and the final substrate concentration ranged from 0.3 to 5 μM.
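The slope-based estimate of kcat/KM described above can be illustrated with a short Python sketch. It assumes that initial velocities have already been converted from fluorescence units to molar units per second (for example with a fluorophore calibration curve, a step not detailed here) and that E is the final enzyme concentration in the well. The function name and the synthetic data are hypothetical.

```python
def specificity_constant(s0_molar, v0_molar_per_s, enzyme_molar):
    """Estimate kcat/KM (M^-1 s^-1) from initial velocities at [S0] << KM.

    Under that condition the Michaelis-Menten equation reduces to
    v0 = (kcat/KM) * E * [S0], so the slope of v0 versus [S0] divided
    by the total enzyme concentration E gives kcat/KM.
    """
    n = len(s0_molar)
    if n < 2 or n != len(v0_molar_per_s):
        raise ValueError("need at least two paired ([S0], v0) points")
    mean_s = sum(s0_molar) / n
    mean_v = sum(v0_molar_per_s) / n
    # Ordinary least-squares slope of v0 against [S0] (units: s^-1).
    sxx = sum((s - mean_s) ** 2 for s in s0_molar)
    sxy = sum((s - mean_s) * (v - mean_v)
              for s, v in zip(s0_molar, v0_molar_per_s))
    slope = sxy / sxx
    return slope / enzyme_molar


# Synthetic, noise-free data generated for kcat/KM = 1000 M^-1 s^-1
# and E = 0.5 uM (illustrative values, not measurements).
enzyme = 0.5e-6
example_s0 = [1e-6, 2e-6, 4e-6, 8e-6]
example_v0 = [1000.0 * enzyme * s for s in example_s0]
kcat_over_km = specificity_constant(example_s0, example_v0, enzyme)
```

A fit with a free intercept is used here so that a constant background rate does not bias the slope; forcing the line through the origin is an equally common choice and gives the same answer for ideal data.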

SARS-CoV and SARS-CoV-2 PLpro inhibitor and B-Ub-VME synthesis

Inhibitor synthesis was performed in three sequential stages. In the first step, vinyl methyl ester as a reactive group was synthesized according to a published protocol (39). tBu-N-allyl carbamate (500 mg, 3.2 mmol) was dissolved in 10 ml of anhydrous toluene. Methyl acrylate (580 μl, 6.4 mmol), dichlorophenylborane (42 μl, 0.32 mmol), and second generation Grubbs catalyst (50 mg) were added. The reaction was carried out under reflux at 40°C with stirring overnight. After 12 hours, the solvent was evaporated under reduced pressure, and the mixture was purified by column chromatography on silica gel (Hex/EtOAc, 5:1). The crude product was obtained as a yellowish oil. tBu group deprotection was performed by adding TFA/DCM/TIPS [4.2 ml, 3/1/0.2 (v/v/v)] cleavage mixture for 45 min with stirring. TFA*H2N-Gly-VME was then crystallized in cold Et2O and stored at −20°C. In the second step, Ac-P4-P3-Gly-OH fragments were synthesized using 2-chlorotrityl chloride resin as previously described (40). In the last step, the Ac-P4-P3-Gly-OH fragment (1.2 equiv.) was coupled to the reactive group (1 equiv.) using HATU (1.2 equiv.) and 2,4,6-collidine (3 equiv.) as coupling reagents in DMF. The reaction was carried out at room temperature (RT) with stirring for 2 hours. The reaction mixture was diluted in ethyl acetate; washed once with 5% citric acid, once with 5% NaHCO3, and once with brine; dried over MgSO4; and concentrated under reduced pressure. To remove side-chain amino acid protecting groups, Ac-P4-P3-Gly-Gly-VME was added to a mixture of TFA/DCM/TIPS [% (v/v/v), 70:27:3]. After 30 min, solvents were removed, and the inhibitor was purified by HPLC. B-Ub-VME was synthesized according to a synthetic protocol described elsewhere (41, 42).

Determination of DUB inhibition

To assess the activity and selectivity of the designed SARS-CoV and SARS-CoV-2 PLpro inhibitors, DUBs were incubated with inhibitors at eight different concentrations (2.3 to 300 μM) for 30 min at 37°C in assay buffers. DUB residual activity was estimated using Ac-Leu-Arg-Gly-Gly-ACC (50 μM). Assay conditions were 20 μl of inhibitor, 60 μl of DUB (0.3 μM SARS-CoV PLpro, 0.1 μM SARS-CoV-2 PLpro, 2.5 μM MERS-CoV PLpro, and 8 μM UCH-L3), and 20 μl of substrate (50 μM). Inhibition assays were measured for 40 min and repeated at least three times. The results are reported as mean values with SDs.
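As an illustration of how such an inhibitor titration can be laid out and evaluated, the Python sketch below builds an eight-point concentration series and computes residual activity. Two points are assumptions rather than statements from the protocol: the spacing is taken to be two-fold, which is consistent with the reported endpoints (300 μM halved seven times gives about 2.3 μM), and residual activity is expressed relative to an uninhibited control well. The function names and the example rates are hypothetical.

```python
def twofold_series(top_conc_um, n_points):
    """Return n_points concentrations (uM), halving from top_conc_um downward."""
    return [top_conc_um / (2 ** i) for i in range(n_points)]


def residual_activity_percent(rate_with_inhibitor, rate_without_inhibitor):
    """Residual enzyme activity relative to an uninhibited control, in percent."""
    if rate_without_inhibitor <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * rate_with_inhibitor / rate_without_inhibitor


# Assumed two-fold dilution series starting at 300 uM.
series = twofold_series(300.0, 8)

# Made-up rates (RFU/s): inhibited well versus DMSO-only control.
example_residual = residual_activity_percent(2.0, 8.0)
```

Plotting residual activity against the inhibitor concentrations in `series` then gives the usual dose-response view for comparing selectivity across the enzymes tested.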

Crystallization

SARS-CoV-2 PLpro (3 μM) was reacted with 30 μM peptide inhibitor in 5 mM NaCl, 20 mM tris-HCl (pH 8.0) at 37°C for 20 min. Protein was concentrated using a 30-kD cutoff Amicon Ultra Filter and desalted into 5 mM NaCl, 20 mM tris-HCl (pH 8.0), and 10 mM DTT. Final protein concentration was 5 to 10 mg/ml. VIR250 complex crystals were grown by mixing 0.4 μl of protein and 0.4 μl of well solution containing 0.2 M lithium citrate tribasic and 20% polyethylene glycol (molecular weight 3350) on a 96-well sitting plate at 18°C. Crystals were cryo-protected with well solution plus 25% (v/v) ethylene glycol and snap-frozen in liquid nitrogen. VIR251 complex crystals were grown by mixing 0.2 μl of protein sample with 0.2 μl of well solution containing 0.8 M potassium sodium tartrate tetrahydrate, 0.1 M tris-HCl (pH 8.5), and 0.5% (w/v) polyethylene glycol monomethyl ether 5000 on a 96-well sitting plate at 18°C. Crystals were cryo-protected with 25% (v/v) ethylene glycol and flash-frozen in liquid nitrogen.

Structure determination and refinement

A complete data set was collected from the SARS-CoV-2 PLpro/VIR250 crystals to 2.79 Å resolution at the Advanced Photon Source, NE-CAT beamline 24-IDC at a wavelength of 0.979 Å. The data set was indexed, integrated, and scaled using HKL2000. The crystal belongs to space group P21 with unit cell dimensions a = 58.4 Å, b = 189.7 Å, c = 63.1 Å, and β = 98.7°. There are four SARS-CoV-2 PLpro/VIR250 complexes per asymmetric unit. The structure was solved by molecular replacement using the program PHASER. The search model was the apo SARS-CoV-2 PLpro structure (PDB: 6W9C). Apparent ligand density in both Fo-Fc and 2Fo-Fc maps was observed projecting off Cys111 after the first round of refinement. The model and restraints for VIR250 were prepared using Phenix.Elbow. The model of SARS-CoV-2 PLpro/VIR250 was subjected to iterative rounds of refinement and rebuilding using PHENIX (43) and COOT (44).

For SARS-CoV-2 PLpro/VIR251 crystals, data were collected and processed as described above for VIR250 to a resolution of 1.65 Å. The crystal belongs to space group I222 with unit cell dimensions a = 44.9 Å, b = 113.5 Å, and c = 151.1 Å. There is one SARS-CoV-2 PLpro/VIR251 complex per asymmetric unit. The structure was determined by molecular replacement with PHASER, and the search model was the SARS-CoV-2 PLpro/VIR250 structure above (PDB: 6WUU). The structure with ligand was refined as described above for the VIR250 structure.
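As a rough consistency check on the reported number of complexes per asymmetric unit, the unit cell volume and Matthews coefficient can be estimated from the cell parameters given above. The Python sketch below is illustrative only and was not part of the structure determination. It assumes cell edges in angstroms, a protein mass of about 36 kDa per complex (an approximation introduced here), and the standard number of symmetry-equivalent positions for each space group (2 for P21, 8 for I222).

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume (cubic angstroms) from edges (angstroms) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, sym_ops, copies_per_asu, mass_da):
    """Matthews coefficient: cell volume per dalton of protein in the cell."""
    return volume / (sym_ops * copies_per_asu * mass_da)


ASSUMED_MASS_DA = 36000.0  # approximate mass per complex; an assumption

# P21 crystal (VIR250 complex): 2 symmetry positions, 4 copies per asymmetric unit.
v_vir250 = cell_volume(58.4, 189.7, 63.1, beta=98.7)
vm_vir250 = matthews_coefficient(v_vir250, 2, 4, ASSUMED_MASS_DA)

# I222 crystal (VIR251 complex): 8 symmetry positions, 1 copy per asymmetric unit.
v_vir251 = cell_volume(44.9, 113.5, 151.1)
vm_vir251 = matthews_coefficient(v_vir251, 8, 1, ASSUMED_MASS_DA)
```

Under these assumptions, both crystals give values of roughly 2.4 and 2.7 Å³/Da, which fall within the range commonly observed for protein crystals and are therefore compatible with the stated asymmetric unit contents.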

The final two models for PLpro-VIR250 and PLpro-VIR251 complexes have R/Rfree values of 0.195/0.230 and 0.170/0.196, respectively. The two structures also have excellent geometry as assessed using Molprobity: favored (95.3%), allowed (4.6%), and outliers (0.1%) for the PLpro/VIR250 structure and favored (97.0%), allowed (3.0%), and outliers (0%) for the PLpro/VIR251 structure.

PLpro-Ub/Ubl ABP panel assay

The probes used in this experiment (fig. S3, 1 to 4) were generous gifts of UbiQ. Development of the probes has been previously described: Probe 1 (45, 46), Probe 2 (46), Probe 3 (47), Probe 4 (48, 49). PLpro (3 μM) was incubated with 30 μM inhibitor or DMSO at 37°C for 20 min and put on ice. The reaction buffer contained 5 mM NaCl and 20 mM tris-HCl (pH 8.0). Then, the indicated Ub/Ubl ABPs were mixed with PLpro at 4.5 and 2.7 μM, respectively, at RT for 2 min. Reactions were terminated by adding SDS sample buffer and subjected to SDS-PAGE and SYPRO staining.

SARS-CoV and SARS-CoV-2 PLpro labeling by B-Ub-VME

Enzymes (200 nM) were incubated with different B-Ub-VME concentrations (100, 200, 400, 800, and 1000 nM) in assay buffer [150 mM NaCl, 20 mM tris, and 5 mM DTT (pH 8.0) for SARS-CoV PLpro; 5 mM NaCl, 20 mM tris, and 5 mM DTT (pH 8.0) for SARS-CoV-2 PLpro] for 45 min at 37°C. Then, 3× SDS/DTT was added, and the samples were boiled for 5 min at 95°C and resolved on 4 to 12% bis-tris Plus 12-well gels. Electrophoresis was performed at 200 V for 29 min. Next, the proteins were transferred to a nitrocellulose membrane (0.2 μm, Bio-Rad) for 60 min at 10 V. The membrane was blocked with 2% bovine serum albumin (BSA) in tris-buffered saline with 0.1% (v/v) Tween 20 (TBS-T) for 60 min at RT. B-Ub-VME was detected with a fluorescent streptavidin Alexa Fluor 647 conjugate (1:10,000) in TBS-T with 1% BSA using an Azure Biosystems Sapphire Biomolecular Imager and Azure Spot Analysis Software.

Gel-based Ub chain cleavage assays and Ub-VS labeling

Tetra-Ub chains (K48- and K63-linked; Boston Biochem) were cleaved in a reaction volume of 10 μl [in 20 mM tris (pH 7.5), 150 mM NaCl, and 5 mM DTT] with 25 to 500 nM PLpro or USP2 catalytic domain (Boston Biochem), as indicated in figures. Ub-Vinyl Sulfone-labeling was performed in 10 μl [in 20 mM tris (pH 7.5), 150 mM NaCl, and 5 mM DTT] with 1.5 μM Ub-VS (Boston Biochem) and 0.25 μM PLpro or USP2 catalytic domain. Reactions were incubated at 37°C for 30 min, terminated with sample loading buffer (4X LDS, Invitrogen), and analyzed by SDS-PAGE (4 to 12% bis-tris, NuPAGE) and SYPRO Ruby staining. Gels were imaged using an Azure Biosystems c500 imager.

HeLa lysate assay

HeLa cells were cultured in DMEM supplemented with 10% fetal bovine serum, 2 mM L-glutamine, and antibiotics (100 U/ml penicillin, 100 μg/ml streptomycin) in a humidified 5% CO2 atmosphere at 37°C. Approximately 1,200,000 cells were harvested and washed three times with PBS. The cell pellet was lysed in buffer containing 20 mM tris, 150 mM NaCl, and 5 mM DTT (pH 8.0) using a sonicator. The cell lysate was centrifuged for 10 min, and the supernatant was collected. Cell lysates were incubated with or without inhibitors (Ac-Abu(Bth)-Dap-Gly-Gly-VME and Ac-hTyr-Dap-Gly-Gly-VME) at four different concentrations (25, 50, 100, and 200 μM) for 30 min at 37°C. Next, 300 nM B-Ub-VME was added, and the samples were incubated for 30 min at 37°C. Then, the samples were combined with 3× SDS/DTT, boiled, and run on a gel. Electrophoresis, protein transfer to a nitrocellulose membrane, and probe visualization were conducted in the same manner as described above.