Research Article | BIOENGINEERING

Protein engineering by highly parallel screening of computationally designed variants


Science Advances  20 Jul 2016:
Vol. 2, no. 7, e1600692
DOI: 10.1126/sciadv.1600692
  • Fig. 1 Structure of the ubiquitin-USP21 complex.

    Ubiquitin and USP21 are shown in gray and blue, respectively, with the engineered ubiquitin region shown in pink.

  • Fig. 2 Schematic of the parallel protein engineering strategy.

    Computational protein design is performed on the human ubiquitin interface (positions 54 to 71) binding USP21 [Protein Data Bank (PDB) ID: 3I3T]. Fixed backbone design is performed on protein ensembles, whereas flexible backbone design is performed directly on the initial crystal structure. Unique sequences (2000), 18 amino acids (AA) in length, were extracted for each of the three design strategies and reverse-translated (NT) for synthesis on a DNA microarray. Libraries were constructed by amplifying the DNA microarray product and were subsequently screened by either phage display or Y2H. Deep sequencing of the final screening products recovered the designed ubiquitin variants that tightly bound USP21.

  • Fig. 3 Sequence logos of ubiquitin variants that tightly bind USP21.

    (A to C) Sequence logos of computationally designed ubiquitin variants derived from MD, CONCOORD, and Backrub ensembles for (A) 2000 of the best ranked designed variants, (B) variants selected by phage display, and (C) variants selected by Y2H. (D and E) Sequence logos of ubiquitin variants selected for USP21 binding (D) from a biased naïve library screened by phage display and (E) predicted by random forest models from simulated naïve libraries following the NNK nucleotide randomization scheme. The simulated libraries were biased toward the ubiquitin wild-type nucleic acid composition 70% of the time. The x and y axes correspond to the designed ubiquitin positions and the information content in bits, respectively, as determined by WebLogo (39). Amino acids colored black, green, and blue correspond to hydrophobic, neutral, and hydrophilic residues, respectively.

  • Fig. 4 PCA of the designed ubiquitin variants.

    (A and B) The first two principal components (PCs), capturing 35% of the variance, are formed from a set of 37 unbound ubiquitin and 137 bound ubiquitin crystal structures, which include the wild-type ubiquitin (WT Ub) and a tight-binding ubiquitin variant (Ubv21.4) bound to USP21. (A) Ubiquitin backbone models from the top 2000 scoring variants from each of the MD, CONCOORD, and Backrub ensembles were projected into the first two PCs. (B) MD, CONCOORD, and Backrub structures associated with the 215 ubiquitin variants found to tightly bind USP21 by phage display are projected into the first two PCs. (C and D) Sequences of 215 designed variants identified by phage display and 26 variants from a biased naïve library found to tightly bind USP21 in addition to the wild-type ubiquitin sequence were used for PCA over (C) 7 sequence positions (62, 63, 64, 66, 68, 70, and 71) engineered by the biased naïve library (25% of the variance) or (D) all 18 designed positions (21% of the variance).

  • Fig. 5 Ubiquitin variant validation.

    (A) Dose-response curve of USP21 activity inhibition by wild-type ubiquitin and the Ubv10 ubiquitin design. IC50 concentrations were evaluated as the ubiquitin concentration inhibiting USP21 activity by 50%. The Ubv10 dose-response curve has an IC50 value of 4.2 nM. (B) A strong correlation exists between the IC50 concentrations and the log-transformed deep sequencing counts.

  • Fig. 6 Microenvironment of the ubiquitin-USP21 interface.

    Molecular models for four ubiquitin variants were superimposed on wild-type ubiquitin (PDB ID: 3I3T). (A to D) Designed residues (bold) for (A) Ubv1 (cyan), (B) Ubv2 (magenta), (C) Ubv4 (yellow), and (D) Ubv10 (green) are shown relative to wild-type ubiquitin (pink) as colored sticks. USP21 contact residues are shown as gray sticks. Wild-type ubiquitin Glu64 was substituted with Phe, Asn, His, and Asn in Ubv1, Ubv2, Ubv4, and Ubv10, respectively, which removes the unfavorable charge interaction with USP21 Asp438. The Ala65 substitution, present in all variants, removes the unfavorable charge interaction with Glu395. (E) Thr66 was maintained in all variants and makes a hydrogen bond with Glu395, whereas Phe68 facilitates hydrophobic interactions with USP21 residues Leu378, Phe423, and Leu450.
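
The y axis of the sequence logos in Fig. 3 reports information content in bits. As a rough illustration of that quantity, the sketch below computes per-position information content for a set of aligned protein sequences as log2(20) minus the Shannon entropy of the observed residue frequencies. It is a minimal sketch, not the WebLogo implementation: the function name `information_content` and the toy alignment are hypothetical, and the small-sample correction that logo tools may apply is omitted.

```python
import math
from collections import Counter


def information_content(sequences, alphabet_size=20):
    """Return per-position information content in bits.

    Computed as log2(alphabet_size) - H, where H is the Shannon entropy
    of the residue frequencies at each position. No small-sample
    correction is applied (an assumption of this sketch).
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")

    max_bits = math.log2(alphabet_size)
    bits = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        # Shannon entropy of the observed residue distribution at this position
        entropy = sum(-(n / total) * math.log2(n / total) for n in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Hypothetical toy alignment, not data from the article
toy_alignment = ["FAT", "FAT", "NAT", "HAS"]
for position, value in enumerate(information_content(toy_alignment), start=1):
    print(f"position {position}: {value:.3f} bits")
```

A fully conserved position reaches the maximum of log2(20), about 4.32 bits, whereas a position with many equally likely residues approaches zero.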

Supplementary Materials

  • Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/2/7/e1600692/DC1

    fig. S1. Structural comparisons of the ubiquitin backbone models.

    fig. S2. IC50 and affinity validation of a subset of the designed ubiquitin variants against USP21.

    fig. S3. Venn diagrams of the designed ubiquitin variants recovered by phage display and Y2H.

    fig. S4. PCA of sequences identified by Y2H.

    fig. S5. Random forest regression model for sequence count prediction.

    fig. S6. Sequence logos of ubiquitin variants predicted to tightly bind USP21 by an ensemble of random forest models for variants derived from MD, CONCOORD, and Backrub.

    fig. S7. Y2H screening of ubiquitin library against USP21.

    table S1. Jensen-Shannon divergence of designed ubiquitin variants derived from MD, CONCOORD, and Backrub ensembles compared to the wild-type sequence and ubiquitin variants recovered from a biased naïve library.

    table S2. IC50 and associated deep sequencing read counts for four selected low-nanomolar binders to USP21.

    table S3. Deep sequencing read counts of ubiquitin variants surviving phage display and Y2H selections.

    table S4. Isothermal titration calorimetry of Ubv10 binding USP21.

  • Supplementary Materials

    This PDF file includes fig. S1 to S7, table S1, table S2, the legend for table S3, and table S4, with titles as listed above.

    Other Supplementary Material for this manuscript includes the following:

    • table S3 (Microsoft Excel format). Deep sequencing read counts of ubiquitin variants surviving phage display and Y2H selections.
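
Table S1 compares sets of variants using the Jensen-Shannon divergence. The sketch below shows one plausible way to compute it per position between two sets of aligned sequences, using base-2 logarithms so that the result lies between 0 and 1 bit. How the authors aggregated positions or handled pseudocounts is not stated here, so the helper names (`column_frequencies`, `jensen_shannon`) and the toy sequences are assumptions for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(sequences, pos):
    """Residue frequencies at one aligned position, ordered as AMINO_ACIDS."""
    counts = Counter(seq[pos] for seq in sequences)
    total = sum(counts.values())
    return [counts.get(aa, 0) / total for aa in AMINO_ACIDS]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits between two distributions."""
    # The mixture m is nonzero wherever p or q is nonzero, so the KL terms are finite
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical toy sets, not data from the article
designed = ["FA", "NA"]
reference = ["EA", "EA"]
for pos in range(2):
    p = column_frequencies(designed, pos)
    q = column_frequencies(reference, pos)
    print(f"position {pos + 1}: JSD = {jensen_shannon(p, q):.3f} bits")
```

Identical residue distributions give a divergence of 0, and distributions with no residues in common give the maximum of 1 bit.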

