Research Article | STRUCTURAL BIOLOGY

Solving a new R2lox protein structure by microcrystal electron diffraction


Science Advances  07 Aug 2019:
Vol. 5, no. 8, eaax4621
DOI: 10.1126/sciadv.aax4621

Abstract

Microcrystal electron diffraction (MicroED) has recently shown potential for structural biology. It enables the study of biomolecules from micrometer-sized 3D crystals that are too small to be studied by conventional x-ray crystallography. However, to date, MicroED has only been applied to redetermine protein structures that had already been solved previously by x-ray diffraction. Here, we present the first new protein structure—an R2lox enzyme—solved using MicroED. The structure was phased by molecular replacement using a search model of 35% sequence identity. The resulting electrostatic scattering potential map at 3.0-Å resolution was of sufficient quality to allow accurate model building and refinement. The dinuclear metal cofactor could be located in the map and was modeled as a heterodinuclear Mn/Fe center based on previous studies. Our results demonstrate that MicroED has the potential to become a widely applicable tool for revealing novel insights into protein structure and function.

INTRODUCTION

Electrons, similar to x-rays and neutrons, are a powerful source for diffraction experiments (1–3). Because of the strong interactions between electrons and matter, crystals that are considered as powder in x-ray crystallography can be treated as single crystals by microcrystal electron diffraction (MicroED) (4). This enables structure determination of molecules from micrometer- to nanometer-sized three-dimensional (3D) crystals that are too small for conventional x-ray diffraction (4–8). Furthermore, MicroED can be applied to study biomolecules of low molecular weight that are beyond what can be resolved by single-particle cryo–electron microscopy (cryo-EM) imaging (1, 9).

Over the past decades, 3D ED methods have been developed for structure determination of small inorganic compounds (10–12) and organic molecules (13–15). At the early stages of 3D ED method development, tilting of the crystal was done manually, while diffraction patterns were collected on negative film. It could take years before sufficient data were obtained and processed to determine the crystal structure (16). The computerization of transmission electron microscopes (TEMs) and the development of charge-coupled device detectors allowed software to be developed that can semi-automatically collect 3D ED data in less than an hour (11, 13). Owing to the recent advancement in complementary metal-oxide semiconductor and hybrid detector technology, it is now feasible to collect diffraction data in movie mode while continuously rotating the crystal (5, 12, 17). Benefiting from these technological advances, data collection and structure refinement can now be performed within hours (18, 19). Furthermore, peptide structures have been solved ab initio using high-resolution MicroED data (20, 21). Since 2013, several research groups have shown that it is feasible to redetermine already known protein structures using MicroED (table S1), and it is only very recently that a new polymorph of hen egg white lysozyme was unveiled by MicroED, but again phased using a previously determined structure of the identical protein (22).

The sample handling of MicroED is similar to that of cryo-EM, while the data collection and processing are similar to those used in x-ray crystallography, making the technique highly adaptable to existing cryo-EM and general TEM laboratories (4–8, 21). Therefore, it is of great benefit to further develop MicroED methods to meet the needs of scientists in a wide community.

Here, we report the next step in the development of MicroED by solving a novel protein structure, Sulfolobus acidocaldarius R2-like ligand-binding oxidase, SaR2lox. The R2lox metalloenzyme family was discovered a decade ago from its similarities with the ribonucleotide reductase R2 protein (23). Its physiological function remains unknown, but the two crystal structures solved for proteins of this family (23, 24) reveal a four-helix bundle core accommodating a dinuclear metal cluster, characterized by electron paramagnetic resonance as Mn(III)/Fe(III) (25), that interacts with a long-chain fatty acid.

We demonstrate that MicroED data can be collected from SaR2lox 3D microcrystals by the continuous rotation method (5, 17). Conventional x-ray crystallography software can be used directly for processing ED data [XDS (26)], determining the phases using a homologous protein model of 35% sequence identity [Phaser (27)], and refining the model structure [phenix.refine (28)]. These results illustrate that MicroED is a powerful tool for determining novel protein structures with sample requirements complementing those of x-ray crystallography, single-particle cryo-EM, and x-ray free-electron lasers.

RESULTS

Micrometer-sized 3D crystals of SaR2lox were grown using conventional hanging drop vapor diffusion and are barely large enough to be distinguished under an optical microscope from precipitate and phase separation occurring in the drop (Fig. 1A). The mother liquor used for growing microcrystals contains 44% (v/v) polyethylene glycol (PEG) 400. The viscosity of the sample prevented preparation of thin vitrified cryo-EM samples using the traditional deposit-blot-plunge routine. Thus, we used manual backside blotting to remove excess liquid and vitrified the sample in liquid ethane (29). The plate-like crystals are triangular in shape and a few micrometers in size (Fig. 1B). The thickness of the crystals is estimated to be less than 0.5 μm. We note that it is crucial to reduce the thickness of the protective layer of vitrified ice as much as possible to collect ED data of high signal-to-noise ratio from the crystals. Even in the presence of 44% (v/v) PEG 400, the backside blotting approach provided good distribution of crystals in a thin layer of vitreous ice (Fig. 1, A and B). This ability makes MicroED a more generally applicable tool in structural biology as highly viscous solutions are common in crystallization of macromolecules.

Fig. 1 Overview of the MicroED experiment.

(A) SaR2lox microcrystals (pointed out by arrows) viewed under an optical microscope. (B) Overfocused TEM image of a typical diffracting SaR2lox crystal preserved in vitrified ice. The volume of this particular crystal is estimated to be approximately 2 μm³, which contains approximately 6 × 10⁶ unit cells. (C) Typical ED pattern of the SaR2lox crystal. The frame was taken after an accumulated electron dose of 4.3 e⁻/Å². (D) Reconstructed reciprocal lattice showing limited data completeness with predominantly missing reflections in the direction of c* (fig. S1); missing observations are shown in white, observed reflections are in a rainbow color scheme, and systematic absences are shown in pink.

MicroED data were collected by continuously rotating a single crystal in the electron beam at a rate of 0.45°/s. The exposure time of each frame was 2 s, integrating over 0.90° of the reciprocal lattice. Data were typically collected over a total rotation range of 54° within 2 min. The crystals diffracted beyond a resolution of 3.0 Å (Fig. 1C). The electron dose rate applied was approximately 0.08 e⁻/Å² per second, resulting in a total dose of less than 10 e⁻/Å² per dataset. Because of their plate-like shape, a morphology commonly observed in protein crystallography, the SaR2lox crystals are dispersed on the TEM grid with a preferred orientation. A total of 35 ED datasets were collected from the plate-like SaR2lox crystals. Data were processed using crystallography software XDS (26). We could determine that the crystals are orthorhombic with a primitive unit cell using rotation ED processing software (REDp) (11). Two screw axes were identified from the reflection conditions (Fig. 1C). On the basis of unit cell consistency and cross-correlation, 21 of the 35 datasets were merged. Because of the preferred orientation of the crystals, the data completeness increased from 50.9% for a single dataset to only 62.8% (Fig. 1D and fig. S1). However, the multiplicity (~32) and overall I/σ(I) (6.12 up to a resolution of 3.0 Å) improved drastically (table S2). We demonstrated previously that the resulting structural model can be improved by merging data from a large number of crystals (8).
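
The collection parameters quoted above are mutually consistent, which can be checked with a few lines of arithmetic. The following Python sketch is only an illustration, not part of the published workflow; the function name `collection_summary` is hypothetical, and the calculation assumes no detector dead time between frames.

```python
def collection_summary(rotation_rate_deg_s, exposure_s, total_range_deg, dose_rate_e_A2_s):
    """Derive per-dataset quantities from continuous-rotation settings.

    Assumes the crystal rotates at a constant rate and that frames are
    recorded back to back (no readout gap), which is a simplification.
    """
    wedge_deg = rotation_rate_deg_s * exposure_s        # rotation integrated per frame
    duration_s = total_range_deg / rotation_rate_deg_s  # time to sweep the full range
    n_frames = round(duration_s / exposure_s)           # frames per dataset
    total_dose = dose_rate_e_A2_s * duration_s          # accumulated dose in e-/A^2
    return {
        "wedge_deg": wedge_deg,
        "duration_s": duration_s,
        "n_frames": n_frames,
        "total_dose_e_A2": total_dose,
    }


# Values taken from the text: 0.45 deg/s, 2 s frames, 54 deg range, 0.08 e-/A^2/s.
summary = collection_summary(0.45, 2.0, 54.0, 0.08)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

With these inputs the sweep takes about 2 min and the accumulated dose stays below the 10 e⁻/Å² stated in the text.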

The processed MicroED data were of sufficient quality to solve and refine the structure of SaR2lox (Fig. 2A). The closest homolog to SaR2lox in the Protein Data Bank (PDB) is an R2lox protein from Mycobacterium tuberculosis (MtR2lox) sharing 33.8% identity of the full-length protein and 44% identity over the most conserved two-thirds of the protein sequence [PDB ID 3EE4 (23)] (fig. S2). Using a modified search model of 35% sequence identity, we were able to phase the merged MicroED data by molecular replacement using Phaser (27), obtaining a clear single solution in space group P2₁2₁2. The structure was iteratively built and refined in COOT (30) and phenix.refine (28) (table S3). The refined electrostatic potential map provided sufficient detail to model side chains and allowed rebuilding of the main chain (Fig. 2, B and D), although certain parts of the maps were less well defined along the c* direction (owing to the incomplete data). The dinuclear metal cofactor was located in the map and modeled as a heterodinuclear Mn/Fe center based on previous studies on this protein family (23, 24). To assess the data quality and eliminate the influence of model bias, we generated a composite omit map covering the entire unit cell, confirming the correct interpretation for modeling the structure (Fig. 2, C and E).

Fig. 2 High-quality electrostatic potential maps allowing accurate model interpretation.

(A) Overall structure of SaR2lox solved by MicroED with three colored selections as examples to show electrostatic potential maps in (B) to (E). Electrostatic scattering potential maps 2Fo − Fc (contoured at 1σ; colored blue) and Fo − Fc (contoured at ±3σ; colored green and red for positive and negative peaks, respectively) and simulated annealing composite omit maps (contoured at 1σ; colored magenta) are shown for residues 164 to 177 (orange) and 235 to 249 (yellow) in (B) and (C), respectively, and for residues 202 to 217 (cyan) in (D) and (E), respectively. Simulated annealing composite omit electrostatic potential maps are calculated with sequential 5% fractions of the structure omitted. Only observed reflections were used for map calculations, i.e., no missing F(obs) were restored using a weighted F(calc). Despite the low completeness, the data produce high-quality well-resolved maps. Oxygen and nitrogen atoms are colored red and blue, respectively, and carbons are colored according to the selection previously mentioned.

The final model of SaR2lox shows a protein backbone Cα root mean square deviation of 0.94 Å compared to the structure of MtR2lox used for molecular replacement. This value is within the expected range from proteins of this size and sequence identity level. The structure presented here confirms the dimeric biological assembly and the ferritin-like helix bundle overall fold, previously seen in the R2lox protein family (fig. S3) (23, 24). Furthermore, the structure reveals new biochemically important features of the SaR2lox enzyme, differing from the two structures known in this family: The substrate-binding pocket is reshaped and shows an altered electrostatic potential distribution (Fig. 3). In addition, the sequence identity with existing structures is lower than 40%, which indicates a possibly divergent enzyme function (31). Thus, our results suggest that SaR2lox has a substrate specificity different from that of the enzymes previously structurally characterized. There are additional ongoing biochemical studies in our laboratory to confirm this hypothesis.

Fig. 3 Differences between substrate-binding pockets support substrate specificity divergences in the R2lox metalloenzyme family.

The molecular surface and identity of residues defining the substrate-binding pocket in the SaR2lox structure (A) are compared with the two already known structures of this protein family, (B) MtR2lox (PDB ID 3EE4) (23) and (C) GkR2lox (PDB ID 4HR0) (24). Although the Mn- and Fe-coordinating residues are fully conserved, the distal residues lining the cavity are remarkably divergent, drastically affecting the shape and electrostatic contact potentials of the pocket accommodating the putative substrate. This suggests that the substrate specificity could be different between these three enzymes, and thus, their function could be divergent. Electrostatic contact potentials plotted on molecular surfaces are colored in a gradient from red (negative) to blue (positive). Ligands modeled as myristic acid and palmitic acid for MtR2lox and GkR2lox, respectively, are colored yellow. A ligand was not modeled in the SaR2lox structure due to a very weak signal in the corresponding scattering potential map, but its presence cannot be excluded. Carbon, nitrogen, oxygen, manganese, and iron atoms are colored gray, blue, red, purple, and orange, respectively. We note that the missing stretch of unmodeled residues in SaR2lox (250 to 260) is not shared with MtR2lox or GkR2lox and is likely to affect distal closure of the cavity, further emphasizing differences between proteins.

DISCUSSION

MicroED data can be collected from crystals previously considered to be insufficient in size, which is its main advantage over x-ray crystallography. There are several aspects where ED can be further improved. For instance, by developing sample preparation methods such as cryogenic focused ion beam (29, 32–34), the orientation of the crystals may be precisely controlled to achieve near 100% data completeness. Furthermore, the MicroED data are not yet as accurate or as precise as generally achieved by x-ray crystallography, as indicated by poorer crystallographic quality indicators, e.g., R factors. Rapid development of data collection strategies, as well as improvements of TEM hardware, may further accelerate the development of MicroED. Our results show that MicroED can be used to determine novel protein structures, even under the circumstances frequently encountered in macromolecular crystallography, where a viscous sample environment, preferred orientation, and poor signal-to-noise ratio complicate data collection and structure determination.

MATERIALS AND METHODS

Cloning, expression, and protein purification

A construct encoding full-length S. acidocaldarius R2lox (accession number WP_011278976) was polymerase chain reaction–amplified from genomic DNA [DSM number 639, obtained from DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen), Braunschweig, Germany] and inserted into pET-46 Ek/LIC (Novagen) using the following primers: GACGACGACAAGATGAAGGAAAAATTACTTGAATTCAGAAGT and GAGGAGAAGCCCGGTTATTTGTCCAGCTTAATCTCCTCTATGAC.

Expression was carried out in Escherichia coli BL21(DE3) (Novagen). Cells were cultured at 37°C in terrific broth medium (Formedium) supplemented with ampicillin (50 μg/ml) in a benchtop bioreactor system (Harbinger). When an optical density at 600 nm of 0.7 was reached, expression was induced with 0.5 mM isopropyl β-d-1-thiogalactopyranoside and allowed to continue for 3 hours. Then, cells were harvested by centrifugation and stored at −80°C. The N-terminal His₆-tagged SaR2lox protein was purified via immobilized metal-ion affinity chromatography and size exclusion chromatography. Cells were resuspended in buffer A [25 mM Hepes-Na (pH 7.0), 300 mM NaCl, and 20 mM imidazole] and disrupted by high-pressure homogenization. The lysate was cleared by centrifugation and applied to a nickel–nitrilotriacetic acid agarose (Protino) gravity flow column. The beads were washed extensively with buffer B (buffer A containing 40 mM imidazole). Protein was then eluted using buffer C (buffer A containing 250 mM imidazole), concentrated using Vivaspin 20 centrifugal concentrators with a 30,000 molecular weight cutoff polyethersulfone membrane (Sartorius), and applied to a HiLoad 16/60 Superdex 200 prep grade size exclusion column (GE Healthcare) equilibrated in a final buffer of 25 mM Hepes-Na (pH 7.0) and 50 mM NaCl. Fractions corresponding to the pure SaR2lox protein were pooled, concentrated to 16 mg/ml, aliquoted, flash-frozen in liquid nitrogen, and stored at −80°C. The protein concentration was obtained using a calculated molecular weight for this construct of 38,006 and an experimentally determined extinction coefficient at 280 nm for metal-bound protein of 52.13 mM⁻¹ cm⁻¹.
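
The conversion from absorbance at 280 nm to mass concentration follows the Beer–Lambert law. The short Python sketch below illustrates that conversion with the extinction coefficient and molecular weight quoted above; the function name, the 1-cm path length default, and the dilution parameter are assumptions made for the example and do not describe the instrument settings actually used.

```python
def concentration_mg_per_ml(a280, epsilon_mM_cm=52.13, mw_g_per_mol=38006.0,
                            path_cm=1.0, dilution_factor=1.0):
    """Convert a measured A280 into a protein concentration in mg/ml.

    Beer-Lambert: A = epsilon * c * l, so c (mM) = A / (epsilon * l).
    The dilution factor scales the result back to the undiluted stock.
    """
    molar_mM = a280 * dilution_factor / (epsilon_mM_cm * path_cm)
    # mM x g/mol gives mg/l; divide by 1000 to obtain mg/ml.
    return molar_mM * mw_g_per_mol / 1000.0


# Concentration corresponding to one absorbance unit at a 1-cm path length.
per_absorbance_unit = concentration_mg_per_ml(1.0)
print(f"{per_absorbance_unit:.3f} mg/ml per A280 unit")
```

Under these assumptions, a 16 mg/ml stock would absorb far above the linear range of a typical spectrophotometer at a 1-cm path length, so in practice a diluted sample or a shorter path would be measured.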

Crystallization

SaR2lox protein was crystallized using the hanging drop vapor diffusion method. A volume of 2 μl of a protein solution (8 mg/ml) was mixed with 2 μl of reservoir solution consisting of 44% (v/v) PEG 400, 0.2 M lithium sulfate, and 0.1 M sodium acetate (pH 3.4). Plate-like crystals grew within 48 hours at 21°C.

Sample preparation

A cryo-EM sample of SaR2lox was prepared by freezing the crystals in a thin layer of vitrified ice. A thin and uniform vitrified ice layer is crucial for obtaining MicroED data with a high signal-to-noise ratio. At the same time, the ice layer has to protect the crystals from dehydration under vacuum inside the TEM. The 4-μl hanging drop was deposited onto a QUANTIFOIL R 3.5/1 (300 mesh) Cu holey carbon TEM grid. The excess liquid was removed by manual backside blotting. The grid was then rapidly plunge-frozen in liquid ethane. We note that the automated blotting and vitrification routine using an FEI Vitrobot Mark IV was not efficient in removing the viscous mother liquor while leaving a sufficient number of crystals on the TEM grid.

Data collection

MicroED data were collected on a JEOL JEM-2100 (LaB₆ filament) TEM operated at 200 kV, with a Gatan 914 cryo-transfer holder. Before searching for suitable crystals, the electron beam was aligned, while the center of the TEM grid was brought to the mechanical eucentric height. During crystal searching, by inserting a 50-μm condenser lens aperture, the size of the electron beam was set to be slightly larger than the field of view (6 μm) on the side entry Orius detector. As soon as a suitable crystal was found, the beam was blanked while we set up for MicroED data collection. These measures were taken to avoid unnecessary electron dose on surrounding crystals. An area with a diameter of 2 μm as defined by a selected area aperture was used to select the region of interest on the crystal. ED data were collected by continuously rotating the SaR2lox crystal under the electron beam while simultaneously collecting the diffraction patterns on a fast Timepix hybrid pixel detector (Amsterdam Scientific Instruments). Data were collected at a sample-to-detector distance of 1830 mm, equivalent to 0.001198 Å⁻¹ per pixel.
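
The reciprocal-space sampling quoted above can be reproduced from the accelerating voltage and the sample-to-detector distance. The Python sketch below is an illustrative calculation, not the authors' code: it assumes a physical pixel pitch of 55 μm for the Timepix detector (a value not stated in the text), uses the small-angle approximation, and the function names are hypothetical.

```python
import math

# CODATA constants in SI units.
PLANCK = 6.62607015e-34
ELECTRON_MASS = 9.1093837015e-31
ELEMENTARY_CHARGE = 1.602176634e-19
LIGHT_SPEED = 299792458.0


def electron_wavelength_angstrom(voltage_v):
    """Relativistic de Broglie wavelength of an electron, in angstrom."""
    energy = ELEMENTARY_CHARGE * voltage_v
    rest_energy = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(2.0 * ELECTRON_MASS * energy * (1.0 + energy / (2.0 * rest_energy)))
    return PLANCK / momentum * 1e10


def reciprocal_pixel_size(pixel_mm, distance_mm, wavelength_a):
    """Reciprocal-space length per pixel (1/angstrom), small-angle approximation."""
    return pixel_mm / (distance_mm * wavelength_a)


wavelength = electron_wavelength_angstrom(200e3)  # 200 kV, from the text
# 0.055 mm pixel pitch is an assumption; 1830 mm distance is from the text.
recip_pixel = reciprocal_pixel_size(0.055, 1830.0, wavelength)
# Distance from the direct beam, in pixels, of a reflection at 3.0 angstrom.
radius_px = (1.0 / 3.0) / recip_pixel

print(f"wavelength: {wavelength:.5f} A")
print(f"sampling:   {recip_pixel:.6f} 1/A per pixel")
print(f"3.0 A ring: {radius_px:.0f} px from the beam center")
```

Under these assumptions the computed sampling agrees with the quoted 0.001198 Å⁻¹ per pixel.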

Data processing

The native format of the collected data is tagged image file (TIF). The data were converted to the Super Marty View (SMV) format using a Python script developed in-house. The important metadata of the experiment were written automatically to the header of each SMV frame. Data were processed using XDS (26). The data were scaled and merged based on unit cell consistency, correlation coefficients between the datasets (analyzed with XDS nonisomorphism) (35), I/σ(I), and resolution using XSCALE (26). Data were converted to MTZ format using POINTLESS (36) and merged with AIMLESS (37), and structure factor amplitudes were calculated using TRUNCATE (38). Figure 1D and fig. S1 were prepared with the Reflection Data Viewer in Phenix (39).
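
To make the conversion step concrete, the following Python sketch shows how an SMV frame can be assembled: a fixed-size ASCII header of `KEY=value;` lines enclosed in braces, followed by the pixel data as unsigned 16-bit integers. This is not the in-house script mentioned above, which is available from the authors on request; the function names, the tiny 4 × 4 demo frame, the choice of header keys, and the 0.055-mm pixel size are assumptions for illustration, and reading the TIF input is omitted.

```python
import struct


def build_smv_header(fields, header_bytes=512):
    """Return a space-padded ASCII SMV header of exactly header_bytes bytes."""
    lines = ["{", f"HEADER_BYTES={header_bytes:5d};"]
    lines += [f"{key}={value};" for key, value in fields.items()]
    lines.append("}\f")  # closing brace followed by a form feed
    encoded = "\n".join(lines).encode("ascii")
    if len(encoded) > header_bytes:
        raise ValueError("metadata does not fit in the SMV header")
    return encoded.ljust(header_bytes, b" ")


def smv_frame_bytes(fields, pixels):
    """Concatenate the header and little-endian unsigned 16-bit pixel data."""
    body = struct.pack(f"<{len(pixels)}H", *pixels)
    return build_smv_header(fields) + body


# Hypothetical 4 x 4 frame; distance and oscillation range follow the text.
demo_fields = {
    "DIM": 2,
    "BYTE_ORDER": "little_endian",
    "TYPE": "unsigned_short",
    "SIZE1": 4,
    "SIZE2": 4,
    "PIXEL_SIZE": 0.055,    # mm, assumed detector pixel pitch
    "WAVELENGTH": 0.02508,  # angstrom, electrons at 200 kV
    "DISTANCE": 1830.0,     # mm
    "OSC_START": 0.0,       # degrees
    "OSC_RANGE": 0.9,       # degrees per frame
}
demo_frame = smv_frame_bytes(demo_fields, list(range(16)))
print(len(demo_frame), "bytes")
```

Writing `demo_frame` to a file with an `.img` extension would give a frame in the layout sketched here; which header keys a given processing program requires should be checked against that program's documentation.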

Structure solution and refinement

The structure was solved by molecular replacement using Phaser (27) with atomic scattering factors for electrons. A truncated search model was created using the atomic coordinates of the R2-like ligand-binding oxidase from M. tuberculosis (PDB ID 3EE4) (23) using Sculptor (39). A well-contrasted solution was obtained with one molecule per asymmetric unit in the space group P2₁2₁2 [log-likelihood gain (LLG) = 161, translation function Z-score (TFZ) = 14.0]. A solution in the alternative space group P2₁2₁2₁ was substantially worse (LLG = 48, TFZ = 5.6), confirming the absence of the screw axis along c*. The structure model was refined using rigid body refinement directly after molecular replacement in phenix.refine (28). The structure was iteratively built using COOT (30) and refined using phenix.refine (28) with atomic scattering factors for electrons, automatic weighting of the geometry term, and group B factors per residue. Structure solution and refinement were performed using the merged intensities.

Certain regions of the map were less well resolved, particularly residues 250 to 260. As the map was difficult to interpret, we did not attempt to place those residues. Furthermore, a segment of an α helix, residues 235 to 249, appeared to be shifted judging from the difference potential map after attempting to fit the corresponding side chains. This part was corrected and refined by real-space refinement in COOT (30) using geometrical restraints. Simulated annealing composite omit maps were generated using phenix.composite_omit_map (28), calculated with sequential 5% fractions of the structure omitted.

The model was validated using MolProbity (40). Table S3 lists the crystallographic statistics in which the test set represents 5% of the reflections. The core root mean square deviation values between structures were calculated by the secondary-structure matching tool (41). Figures 2 and 3 and fig. S3 were prepared using the PyMOL Molecular Graphics System, version 2.2.3, Schrödinger LLC. Electrostatic protein contact potentials were generated with the vacuum electrostatics tool in PyMOL, using the default values for cavity detection radius and cutoff and using a level range of ±50.

SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/8/eaax4621/DC1

Fig. S1. Views of reciprocal space showing the merged intensities of SaR2lox.

Fig. S2. Sequence alignment between SaR2lox (UniProt identifier Q4J6V7) and R2lox from M. tuberculosis MtR2lox (UniProt identifier P9WH69) used as molecular replacement search model for the structure presented in this study (PDB ID 3EE4) (23).

Fig. S3. Overall superimposition of crystal structures of R2lox dimers.

Table S1. Overview of protein structures in the PDB determined by MicroED.

Table S2. ED merging statistics for a single-crystal dataset and merging of 11, 14 and 21 crystal datasets.

Table S3. Data collection and refinement statistics of SaR2lox.

References (42–61)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.

REFERENCES AND NOTES

Acknowledgments: H.L. thanks P. Stenmark for discussions about this manuscript.

Funding: We acknowledge financial support from the Knut and Alice Wallenberg Foundation through the project grants 3DEM-NATUR (no. 2012.0112 to X.Z.) and Wallenberg Academy Fellows (no. 2017.0275 to M.H.), the Science for Life Laboratory through the pilot project grant Electron Nanocrystallography, the European Research Council (HIGH-GEAR 724394 to M.H.), and the Swedish Research Council (2017-04018 to M.H. and 2017-05333 to H.X.).

Author contributions: H.X. contributed to project design, conception, ED data collection, ED data analysis, manuscript writing, and figure creation. H.L. contributed to project design, crystal growth, structure determination, manuscript writing, and figure creation. M.T.B.C. contributed to ED data analysis, structure determination, manuscript writing, and figure creation. J.Z. contributed to ED data collection. J.J.G. contributed to cloning. X.Z. and M.H. contributed to project design, data analysis, conception, and manuscript writing.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: The atomic coordinates of the SaR2lox structure are deposited in the PDB under accession code 6QRZ. Raw MicroED data are available upon request. The Python script used for converting diffraction frames from TIF to MRC and IMG (SMV format) is available upon request. REDp can be downloaded at mmk.su.se/zou. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.