Accurate high-throughput screening based on digital protein synthesis in a massively parallel femtoliter droplet array


Science Advances  21 Aug 2019:
Vol. 5, no. 8, eaav8185
DOI: 10.1126/sciadv.aav8185


We report a general strategy based on the digital counting principle that enables the efficient acquisition of enzyme mutants with desired activities from just a few clones within a day. We prepared a high-density femtoliter droplet array, consisting of 1 million uniform droplets per 1 cm², to carry out high-throughput protein synthesis and screening. Single DNA molecules were randomly distributed into each droplet following a Poisson process to initiate the protein synthesis with coupled cell-free transcription and translation reactions and then recovered by a microcapillary. The protein yield in each droplet was proportional to the number of DNA molecules, meaning that droplets with apparent intensities higher than the Poisson distribution–predicted maximum can be readily identified as the exact hits exhibiting the desired increased activity. We improved the activity of an alkaline phosphatase up to nearly 20-fold by using less than 10 nl of reagents.


The increasing demand for high-performance enzymes associated with industrial biocatalysis, theranostic techniques, and genetic engineering has driven dramatic advances as well as flourishing creativity in protein engineering under the concept of directed evolution, a laboratory mimic of Darwinian evolution but with an accelerated process. The directed evolution of proteins has seen great success; some notable examples include fluorescent proteins with improved brightness or altered spectra, antibodies with improved antigen-binding affinities, and enzymes with improved activity, thermostability, and solvent tolerance or altered substrate selectivity. These achievements led to new research tools, and some of them deepened our understanding of the molecular evolution processes in nature (1–4). Over the past decades, diverse strategies for library preparation or screening have been extensively proposed or demonstrated, but the primary framework of laboratory evolution of proteins, composed of iterative rounds of mutagenesis, mutant characterization, and candidate enrichment, remains unchanged (5).

With the rise of microfluidic technologies in the past decades represented by fluorescence-activated cell sorting and droplet emulsion, many integrated and high-throughput screenings have been developed for either cells or cell-free reactors, in which the target protein is expressed and assayed. Although the cell-based approach was demonstrated to be broadly useful for protein screening, it necessitates special care for the host fitness and always requires sophisticated genetic engineering, for reasons such as cytotoxic effects and heterologous expression. Cell-free transcription and translation systems disconnect the protein of interest from the organismal constraints and, as such, provide a much more straightforward way to explore the protein sequence space, which is guaranteed by the central dogma. However, the transcriptional and translational machineries, although simpler than a living cell, are still highly complex. Severe challenges remain in achieving a fine control of the water-oil interface at micrometer scales during in vitro compartmentalization, a process linking a genotype to its corresponding phenotype. On the other hand, in the case of enzymes, the apparent signal intensity associated with the activity of each mutant is always masked by a fluctuating protein expression level, causing false positives or false negatives. As a consequence, in general, a less stringent threshold (allowing for a larger coverage of potential hits) and iterated rounds of screening (allowing for the gradual elimination of false positives and the accumulation of the real hits) have been the default framework of directed evolution, until now.

Here, we report a strategy allowing for easy separation of the enzyme yield information from the apparent intensity by introducing an emerging concept of digital counting based on Poisson statistics (6), enabling accurate and unambiguous determination of hits. We prepared massively parallel, highly uniform, and individually addressable femtoliter droplets where single DNA molecules can be encapsulated following a stochastic process, transcribed and translated with a coupled versatile cell-free system, and recovered by an external microcapillary. The great number of droplet reactors provides statistically significant evidence regarding the general capability of the cell-free protein synthesis (CFPS) initiating from a single DNA molecule. The stable droplet array allows time-resolved kinetic measurements for every single clone. The uniform dimension of every droplet allows a Poisson distribution of DNA molecules over the array. As the protein yield in each droplet is proportional to the number of DNA molecules (as we prove in the following sections), a few exceptional droplets with intensities higher than the Poisson distribution–predicted maximum can be attributed reliably to an improved activity, rather than an elevated expression level. Our digital counting scheme is insusceptible to the commonly ambiguous threshold in other schemes and provides a high accuracy for the determination of hits in high-throughput screening. We used fluorescent proteins to establish the overall working principle, evolved an enzyme, and screened out a new mutant with one of the highest activities to date within a single day, which has never been discovered in the past despite the fact that the enzyme was well characterized.


Preparation of femtoliter droplet array (FemDA)

We prepared a planar droplet array consisting of over 10⁶ uniform droplets per 1 cm² on a micropatterned substrate. Each droplet was trapped in a microchamber composed of a hydrophilic bottom (glass) and a hydrophobic barrier (CYTOP, a perfluoropolymer) (Fig. 1). The basic principle of the microfabrication process was proposed in a previous report (fig. S1A) (7), but the dimension of individual microchambers often fluctuated over the patterned area of the substrate. The resulting uneven volumes of the droplets formed made it difficult to carry out a parallel and quantitative measurement within a single device or among different batches of devices. After extended microscopic observations, we found that perfluorocarbon oils such as FC-40 (3M) can slowly dissolve the amorphous fluoropolymer CYTOP, resulting in a gradual leakage of the encapsulated contents and a cross-talk among the droplets.

Fig. 1 Schematic illustration of the preparation of FemDA used for CFPS driven by single DNA molecules.

The integrated device is composed of a hybrid hydrophilic-inside–hydrophobic-outside microchamber array substrate and a microfluidic channel. Cell-free protein synthesis (CFPS) solution, flush oil, and sealing oil are sequentially injected into the channel to form individual droplets inside the microchambers. FemDA, femtoliter droplet array.

We addressed the problems above by completely changing the protocol for the droplet formation. We found that keeping the relative humidity of the clean room at 40 to 50% is critical for achieving a fine etching of CYTOP, fully exposing the hydrophilic bottom (Supplementary Text and fig. S1, B to G). Then, we established a two-step oil-sealing strategy to enable an extremely stable and robust formation of the droplet array (Fig. 1). An ideal oil for this system should at least fulfill the following criteria: (i) it should have a lower surface tension than CYTOP (<19 mN/m), (ii) it should not dissolve CYTOP, (iii) it should be as immiscible as possible with water, and (iv) it should be biocompatible. We searched and found a hydrofluoroether oil, ASAHIKLIN AE-3000 (AGC), and a nonionic fluorosurfactant, SURFLON S-386 (AGC), that can dissolve in AE-3000 and stabilize the water/oil interface at very low concentrations. The extremely low surface tension (fig. S2) and a high density (1.47 g/cm³) allowed AE-3000 [with 0.1 weight % (wt %) S-386] to spread out all over the surface of CYTOP and completely remove any trace amount of aqueous phase from the surface of CYTOP. The fully exposed hydrophilic bottom of microchambers retained the aqueous solution inside the microchamber well. The nonionic characteristic of the surfactant also seems preferable for complex biochemical applications (8, 9). Since AE-3000 exhibits a relatively high evaporation rate, we performed a follow-up oil exchange procedure by injecting a perfluoropolyether oil, Fomblin Y25 (Solvay), heavier (1.90 g/cm³) than AE-3000, to replace AE-3000. Fomblin Y25 completely suppressed the evaporation of aqueous droplets. We call the first oil (AE-3000 with 0.1 wt % S-386) the “flush oil” and the Fomblin oil the “sealing oil,” respectively. This two-step oil-sealing strategy was able to produce unprecedentedly uniform and stable femtoliter droplets over a large area (fig. S1, H to I). Neither visible droplet shrinkage nor cross-contamination among droplets was observed over at least 24 hours (fig. S1, J and K, and movie S1). The unprecedented simplicity, stability, and uniformity were all realized at the femtoliter volume within minutes, without complex instrumentation.

CFPS using FemDA

We used a coupled transcription and translation system reconstituted from the purified components in our study (10). A variant of yellow fluorescent protein (cp173-mVenus) was used as the model protein. For CFPS in droplets (movie S2), a straight microfluidic channel was assembled reversibly with the microchamber array substrate (Fig. 1). A mixture solution containing the template DNA and the CFPS components was injected into the channel, followed by sequential injection of the flush oil and the sealing oil. We observed a stochastic occurrence of fluorescent droplets over the array (Fig. 2A and movie S3), as a result of the DNA concentration being less than one molecule per droplet. The histogram of fluorescence intensities of 1.6 × 10⁵ droplets from the array showed a discrete distribution of the fluorescence intensity (a measure of the quantity of fluorescent proteins) (Fig. 2B). Furthermore, the histogram was nicely fitted by a sum of Gaussian distributions of equal peak-to-peak intervals. As a bulk measurement of CFPS revealed a DNA concentration–dependent dynamics of protein expression in low DNA concentrations (fig. S3) where the supply of DNA template would be the rate-limiting factor of CFPS, the histogram suggested an occupancy of different numbers of DNA molecules in each droplet. Further statistical analysis showed that the probability of occurrence of droplets containing different numbers of DNA molecules was a perfect fit to a Poisson distribution (Fig. 2C), as expected from a random distribution of single DNA molecules into each droplet. The input concentration (calculated on the basis of the dilution of the DNA sample), the observed concentration (calculated from the histogram), and the Poisson-fitted concentration of template DNA were consistent with each other. The Poisson statistics based on the quantitative measurement for a large population of droplets provides the only way to judge the presence of single DNA molecules without direct visualization of DNA itself.
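The occupancy statistics described above can be illustrated with a short calculation. The following is a minimal sketch rather than the analysis code used in the study: it assumes that the 1.6 × 10⁵ droplets of Fig. 2B and the average of 0.5 DNA molecules per droplet reported for Fig. 2C belong to the same experiment, and the function and variable names are ours.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a droplet holds exactly k DNA molecules."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 0.5              # average DNA molecules per droplet (value quoted for Fig. 2C)
n_droplets = 160_000   # droplets in the histogram (assumed to be the same array)

# Expected number of droplets for each occupancy k = 0..5
expected = {k: n_droplets * poisson_pmf(k, lam) for k in range(6)}
for k, n in expected.items():
    print(f"k = {k}: P = {poisson_pmf(k, lam):.5f}, expected droplets = {n:.0f}")

# Fraction of droplets expected to contain at least one DNA molecule
fraction_occupied = 1.0 - poisson_pmf(0, lam)
print(f"fraction of droplets with DNA: {fraction_occupied:.3f}")
```

Under these assumptions, roughly 39% of the droplets are expected to contain DNA, and most of those contain a single molecule, which is why the histogram peaks shrink rapidly with increasing occupancy.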

Fig. 2 CFPS from single DNA molecules with FemDA.

(A) mVenus fluorescent protein synthesis. An array containing hundreds of thousands of droplets can be divided into hundreds of frames (movie S3) to be captured by a fluorescence microscope with a motorized scanning stage. (B) Histogram of fluorescence intensities of droplets from an array. The histogram showed a discrete distribution of fluorescence intensities corresponding to the quantity of proteins in every droplet. The histogram was well fitted by a sum of Gaussian distributions of the equal peak-to-peak intervals, which suggests an occupancy of different numbers of DNA molecules per droplet. (C) Histogram of the DNA occupancy in a droplet. The probability of occurrence of droplets containing different numbers of DNA molecules up to 5 [the largest number observed in the experiment of (A)] was perfectly fitted by a Poisson distribution ($P = e^{-\lambda}\lambda^{k}/k!$, where λ is the expected average number of DNA molecules per droplet and k is the actual number of DNA molecules in a droplet) with an average of 0.5 DNA molecules per droplet, as expected for a random distribution of DNA molecules. (D to H) Several examples of CFPS for other fluorescent proteins (mseCFP, mNeonGreen, tdTomato, mRuby2, and smURFP). (I) Enzyme synthesis coupled with a fluorogenic reaction. To visualize and quantify the synthesis of ALP, 6,8-difluoro-4-methylumbelliferyl phosphate as the fluorogenic substrate was premixed with CFPS components, producing highly fluorescent 6,8-difluoro-7-hydroxy-4-methylcoumarin (DiFMU) upon enzymatic cleavage of the phosphate group. Scale bars, 10 μm. A.U., arbitrary units.

We synthesized different kinds of fluorescent proteins to prove the general capability of FemDA for protein synthesis with single DNA molecules. Various fluorescent proteins (mseCFP, mNeonGreen, tdTomato, mRuby2, and smURFP) spanning the visible spectrum were all successfully synthesized with FemDA at room temperature or 37°C (Fig. 2, D to H, and fig. S4). Each protein has unique properties not only in the wavelength but also in the folding kinetics, photostability, and pH stability, among other features (11, 12). In addition to the fully reconstituted expression system, cell lysates were also compatible with FemDA (fig. S5). These results proved the robustness and the superior optical transparency of the integrated system.

We also tested the feasibility of CFPS of enzymes in FemDA. We performed CFPS of Escherichia coli alkaline phosphatase (ALP), a homodimeric secretory protein containing two intramolecular disulfide bonds per monomer essential for its activity, coupled with a fluorogenic reaction in FemDA (Fig. 2I and movie S4). We added oxidizing agents and enzymes necessary for the disulfide bond formation into the cell-free system. A fluorogenic substrate [6,8-difluoro-4-methylumbelliferyl phosphate (DiFMUP)] was stable and nonfluorescent until an enzymatic cleavage of the phosphate group, producing highly fluorescent product 6,8-difluoro-7-hydroxy-4-methylcoumarin (DiFMU). We carried out this enzymatic hydrolysis reaction under Vmax conditions using excess DiFMUP, to retain a constant turnover rate against substrate consumption for a reasonable period of time. Because of the small coefficient of variation of the expression level (15.5% on average; fig. S4F), the apparent fluorescence intensity of droplets provides a good measure of the number of template DNA molecules.

Recovery of single DNA molecules

In this study, we integrated a glass microcapillary (inner diameter, 0.5 μm; outer diameter, 1 μm) with FemDA to recover single DNA molecules from the droplet of interest (Fig. 3A). The microchannel can be peeled off from the array substrate after protein synthesis. A constant inner pressure against the capillary force was applied inside the microcapillary when moving the microcapillary closer to the droplet of interest. A transient drop in pressure (down to 0 hPa for 1 s) was triggered when the tip was inserted into the droplet, resulting in an instant suction of the femtoliter solution (movie S5). Transfer of the recovered contents was triggered by touching the tip of the microcapillary, with the inner wall of the polymerase chain reaction (PCR) tube preloaded with PCR reagents. This action broke the tip of the microcapillary and thus enlarged the diameter of the outlet; the capillary force is decreased in inverse proportion to the capillary diameter. The constant inner pressure, therefore, pushed the liquid out of the microcapillary as a result of a disruption of the force balance. The transferred DNA sample is then subjected to PCR.
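The force balance invoked above can be made concrete with the Young–Laplace relation for a cylindrical capillary, ΔP = 4γ cos θ / d. The sketch below is only an order-of-magnitude illustration under assumptions that are not taken from the study: the surface tension of pure water in air (about 0.072 N/m), perfect wetting (θ = 0), and a hypothetical outlet diameter of 10 μm after the tip is broken. The actual interface in the experiment involves oil and CFPS solution, so the absolute values will differ.

```python
import math

GAMMA_WATER = 0.072  # N/m, water-air surface tension near room temperature (assumption)

def capillary_pressure_hpa(diameter_m: float,
                           gamma: float = GAMMA_WATER,
                           contact_angle_deg: float = 0.0) -> float:
    """Young-Laplace capillary pressure in hPa for a cylindrical capillary."""
    theta = math.radians(contact_angle_deg)
    pressure_pa = 4.0 * gamma * math.cos(theta) / diameter_m
    return pressure_pa / 100.0  # 1 hPa = 100 Pa

intact_tip = capillary_pressure_hpa(0.5e-6)   # inner diameter stated in the text
broken_tip = capillary_pressure_hpa(10e-6)    # hypothetical diameter after breaking

print(f"intact tip (0.5 um): {intact_tip:.0f} hPa")
print(f"broken tip (10 um):  {broken_tip:.0f} hPa")
print(f"pressure ratio: {intact_tip / broken_tip:.0f}x")
```

Because the capillary pressure scales as 1/d, any enlargement of the outlet lowers the pressure that holds the liquid in place, so a constant inner pressure that was balanced at the intact tip exceeds it once the tip is broken.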

Fig. 3 Single-molecule DNA recovery and amplification.

(A) Schematic workflow for the recovery of single DNA molecules. The silicone rubber [polydimethylsiloxane (PDMS)] can be peeled off from the array substrate before recovery. The recovery is carried out with a glass microcapillary equipped on an XYZ-axis manipulator. Every droplet of interest is determined via digital image processing, which allows for fast addressing and moving of the microcapillary toward target positions. Each recovered sample is subjected to amplification [polymerase chain reaction (PCR) or reverse transcription PCR (RT-PCR)], making conventional detection and purification of nucleic acids available. The purified DNA can directly be used for various kinds of downstream applications. (B) Proof of demonstration of nucleic acid recovery using fluorescent proteins. After mVenus synthesis, individually addressable droplets allowed a microcapillary prefilled with water to suck up whole contents out of the microchamber without disturbing surrounding droplets. Each recovered mRNA (lanes 1 to 4) or DNA (lanes 5 to 8) can all be amplified efficiently with RT-PCR followed by another round of PCR, or two rounds of PCR, respectively. The process featuring the changes of bright field and fluorescence before and after the recovery was recorded in movie S5. NC, negative control (nonfluorescent droplet) for DNA amplification; PC, positive control (purified target DNA, 1010 bp); M1, 100-bp DNA ladder; M2, 500-bp DNA ladder.

We carried out a model recovery experiment using the fluorescent protein (Fig. 3B). We proved that two rounds of 30-cycle PCR were sufficient for amplifying a single DNA molecule to a level largely exceeding the detection limit of gel electrophoresis (fig. S6A). The liquid trapped in the microcapillary moved straight to the PCR tube upon the release of the capillary action. Nevertheless, it may be the case that DNA remained inside the microcapillary. Therefore, in addition to the contact-triggered injection, we snapped the entire tip off (about 2 mm, corresponding to the length of the prefilled water) and subjected it to another PCR as a control (fig. S6B). Only the contact-triggered injection for the positive droplets gave a positive band (fig. S6C), proving that DNA did not remain in the microcapillary after the transfer. The robustness of the recovery method was proved with multiple parallel samples not only for DNA but also for mRNA (Fig. 3B and fig. S6, B to E). Consistent results across the repeats revealed a stable performance across the series of processes including sampling, transfer, amplification, and detection. Compared to a well-known trade-off between sorting rate and error rate in droplet flow sorting systems, our efficient DNA recovery technique and the unambiguous on/off signals over the planar array achieved an unprecedented 10⁴-fold enrichment for a very small minority of advantageous genes (see Supplementary Text). The amplicons can be purified for Sanger sequencing or used directly as the template DNA of CFPS. Our well-designed control experiments established the “world-to-chip” interface for the recovery of single DNA molecules from femtoliter space.

Directed evolution of enzymes

We applied the integrated platform for a rapid screening of a highly active E. coli ALP. ALP is an enzyme that has been extensively investigated for over a hundred years (13) and has been widely applied in enzyme-linked immunosorbent assays (ELISAs) and molecular cloning. We chose the wild-type (WT) E. coli ALP as the starting reference for this screening. A single-site saturation mutagenesis library covering the 97th to 144th amino acids (Ser102 was identified as the catalytic residue) of the E. coli ALP, substituting each site against all 20 possible natural amino acids, was prepared using NNK degenerate primers (14). We used an array (approximately 2 mm by 9 mm) consisting of about 1.8 × 10⁵ droplets, with an input concentration λ = 0.1 of the library DNA, which ensured over 10× scanning of the whole library and kept the theoretical number of droplets containing four or more DNA molecules less than one. Because of the potentially diversified enzymatic activities resulting from the mutagenesis, as anticipated, there was no ideal sum of Gaussian distributions (Fig. 4A). Only a few droplets showed exceptionally high fluorescence, confirmed by the histogram. These droplets can be expected to encapsulate mutant DNA encoding highly active enzymes, rather than containing multiple DNA molecules. We recovered these candidate droplets and amplified each sample with the established two rounds of PCR. The purified PCR product was directly used as the template DNA for in-tube CFPS. The CFPS solution was highly diluted 10⁶ times and then injected into FemDA to rapidly confirm the enzymatic activity by digital assays without the need of complicated subcloning, protein purification, and quantification (15). This efficient system enables protein library expression, screening, and activity validation all within a single day.

Fig. 4 High-throughput screening of an ALP library.

A single-site saturation mutagenesis library covering the 97th to 144th (a total of 48) amino acids of the E. coli ALP, substituting each site against all 20 possible natural amino acids, was prepared using NNK degenerate primers. We used an array (2 mm by 9 mm) consisting of 1.8 × 10⁵ microchambers with a loading concentration λ = 0.1 of the DNA library, which ensured about 10× scanning of the whole library and suppressed the theoretical number of droplets containing four or more DNA molecules to be less than one. (A) Histograms of fluorescence intensities of individual droplets from the CFPS of WT ALP (top) or its mutant library (bottom). The histogram counted only the fluorescent droplets. Because of the diversified enzymatic activity resulting from the mutagenesis, there was no ideal shape of the sum of Gaussian distributions (although the peaks corresponding to different numbers of DNA molecules may be identified). Only a few droplets showed exceptionally high fluorescence (labeled with red bars). It can be expected that these droplets encapsulated mutant DNA encoding highly active enzymes rather than multiple DNA molecules. (B) Turnover numbers (kcat) of the WT and mutant ALP with substrate DiFMUP. The W109I, W109L, and D101S mutants exhibited 5.5 to 8.3 times the kcat of WT at pH 9.25. The error bars represent the SD, and each was calculated from three independent measurements using the same lot of a purified enzyme sample. (C) kcat of the WT and mutant ALP with substrate 4-MUP. The W109L, D101S, and W109I mutants exhibited 9.0 to 17.2 times the kcat of the WT at pH 9.25. The error bars represent the SD, and each was calculated based on the ensemble of droplets from the array.

This single round screening experiment rapidly confirmed three mutants (D101S, W109I, and W109L), with activities much higher than the WT. We used two different substrates structurally similar to each other, 4-methylumbelliferyl phosphate (4-MUP) and DiFMUP, to evaluate the mutant activity with either digital assays or conventional bulk assays (Fig. 4, B and C, and fig. S7, A and B). These mutants exhibited kcat values 5 to 8 times those of the WT with DiFMUP, and kcat values 9 to 17 times those of the WT with 4-MUP. The different extent of the improvements may be attributed to subtle substrate specificities. Given the high general interest in this important enzyme, numerous mutagenesis works based on sophisticated x-ray crystallography or nuclear magnetic resonance spectroscopy have been performed in the past decades (16). D101S next to the catalytic residue of Ser102 was the most effective mutation site so far, attributable to an accelerated rate-determining step of Pi release (17). Unexpectedly, we not only reproduced the D101S mutation but also discovered a new “hotspot” Trp109 that has never been taken into account. The activity of both W109L and W109I is comparable to D101S with the substrate DiFMUP, and the activity of W109I is even 35% higher than that of D101S for the substrate 4-MUP. The increased structural flexibility around the catalytic site may contribute to the increased activity (fig. S8).

Mathematical model of digital screening

The high hit rate of our screening approach can be well interpreted with a nonlinear mathematical model based on the Poisson theory, given that the protein yield from every template DNA is basically identical to each other (fig. S9). In our single DNA–triggered cell-free expression system, the protein expression level is solely dependent on the discrete number of the encapsulated DNA, which follows a Poisson process, as we showed. In general, we may aim to screen a mutant with an activity “k” or more times that of the WT. We are therefore able to define the relationship between the number of droplets (m) and the input concentration of template DNA (λ, which refers to the average number of DNA molecules per droplet) with the aid of the parameter k. The boundary condition here can be set to be as follows: The expected total number of droplets containing k and more DNA molecules is 1, which generated inequality (1). Inequality (1) guides us to choose a proper combination of m and λ (Fig. 5). In practice, the number m may always be fixed because of the fixed design in microfabrication. Therefore, the most important information that inequality (1) can tell us is the fact that the concentration λ should preferably be less than a threshold (the semitransparent color-shaded area in Fig. 5), to avoid possible pseudo-hits. The threshold is largely determined by the value k. The higher k we want to achieve, the higher λ that may be applied, under the restriction of inequality (1). The red circle coordinate in Fig. 5 indicated our practice demonstrated in Fig. 4, in which we successfully obtained several mutants with desired activities (i.e., >4-fold improvement) by recovering a few candidates just from a single round of screening. From another viewpoint, inequality (1) can also help us to maximize the capacity of a given array by approaching the boundary condition.

$$m \le \frac{1}{1 - e^{-\lambda}\sum_{i=0}^{k-1}\frac{\lambda^{i}}{i!}} \qquad (1)$$

$$f \le \frac{m\lambda}{L} \qquad (2)$$
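The boundary condition described above (an expected count of at most one droplet holding k or more DNA molecules among m droplets) can be solved numerically for the largest admissible λ. The following is a minimal sketch, not the authors' implementation: the function names, the bisection search, and the upper search bound of λ = 5 are our own choices.

```python
import math

def expected_pseudo_hits(m: int, lam: float, k: int) -> float:
    """Expected number of droplets, out of m, that hold k or more DNA molecules."""
    p_below_k = math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k))
    return m * (1.0 - p_below_k)

def max_lambda(m: int, k: int, upper: float = 5.0, iterations: int = 100) -> float:
    """Largest lam in (0, upper) with expected_pseudo_hits(m, lam, k) <= 1.

    Bisection is valid because P(N >= k) increases monotonically with lam.
    """
    low, high = 0.0, upper
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if expected_pseudo_hits(m, mid, k) <= 1.0:
            low = mid
        else:
            high = mid
    return low

# Array size used for the E. coli ALP screening in Fig. 4
m = 180_000
for k in (3, 4, 5):
    print(f"k = {k}: maximum lambda = {max_lambda(m, k):.4f}")

print(f"pseudo-hits at lambda = 0.1, k = 4: {expected_pseudo_hits(m, 0.1, 4):.2f}")
```

For m = 1.8 × 10⁵ and k = 4, the threshold obtained this way lies slightly above 0.1, so the loading used in Fig. 4 sits inside the preferred region, and the threshold rises as k increases.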

Fig. 5 Rational uses of the digital screening system.

On the basis of the visualization for inequality (1), this chart focuses on the relationship between the number of droplets (m) and the loading concentration of template DNA (λ, which refers to the average number of DNA molecules per droplet). The concentration λ is preferably less than a threshold (the semitransparent color-shaded area) to avoid the possible pseudo-hits. The higher the activity (k) desired, the higher the λ that can be applied. The red and blue circle coordinates indicate our practices demonstrated in this study, with which we successfully obtained several mutants with desired activities (red, >4-fold activity improvement; blue, >3-fold activity improvement) by recovering only a few candidates from a single-round screening. From another point of view, inequality (1) can also help us to maximize the capacity of a given array, through approaching the boundary condition (solid colored line), as demonstrated by the instance of the square coordinate.

We can, of course, perform a screening experiment outside the preferred range defined by inequality (1) at the risk of encountering pseudo-hits, to cover a relatively large library within a relatively small array. In other words, the maximum size of the mutant library that can be tolerated by a given array is always another key concern in determining the library size (L) to be prepared. We introduced herein a new parameter “f” as a measure of the frequency of every mutant in a library to be interrogated on average. The relationship of m, λ, L, and f can be expressed in a straightforward manner with inequality (2). Therefore, the maximum size of the library is Lmax = mλ/f (table S1), where m and f are predetermined and λ can be obtained from inequality (1) or Fig. 5. State-of-the-art technologies can prepare a gene library with a predefined size. The combination of inequality (1) and inequality (2) provides a complete solution for users to customize and optimize their strategy to suit every specific screening requirement.
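The relation Lmax = mλ/f can be evaluated directly for the three loading conditions reported in this study. The sketch below is illustrative only: the coverage values f are taken from the approximate scanning depths quoted in the text (over 10×, 4×, and 5×), and the labels are ours.

```python
def max_library_size(m: int, lam: float, f: float) -> float:
    """Largest library size L that is still sampled f times on average."""
    return m * lam / f

# (label, droplets m, loading lambda, average coverage f)
conditions = [
    ("E. coli ALP, residues 97-144 (red circle)", 180_000, 0.1, 10),
    ("E. coli ALP, full-length scan (blue circle)", 576_000, 0.01, 4),
    ("C. marina ALP (square)", 260_000, 0.1, 5),
]

for label, m, lam, f in conditions:
    print(f"{label}: Lmax = {max_library_size(m, lam, f):.0f} variants")
```

Because m is fixed by the device and λ is capped by the pseudo-hit boundary, the only remaining way to accommodate a larger library on a given array is to accept a lower coverage f.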

To further prove the effectiveness of this theory, we carried out 10 additional individual screenings to cover almost the full length of the E. coli ALP. Each of the single-site saturation mutagenesis library targeting about 40 amino acids was subjected to a single round of screening. The library DNA in a final concentration of λ = 0.01 was introduced into FemDA consisting of 576,000 droplets, which ensured a 4× scanning of the library and was suitable for acquiring mutants with >3-fold activity improvement (the blue circle coordinate in Fig. 5). We obtained 12 mutants with the desired activities (fig. S7C). Except for a few mutants that have been reported previously, most of the mutants were newly identified by this screening. In contrast to conventional screening works that relied on labor-intensive repetitive efforts, the proposed paradigm in our study makes the high-throughput screening highly rational and predictable for the accurate determination of the candidate hits.

We attempted another screening for improved activity of an ALP cloned from a psychrophilic marine bacterium, Cobetia marina. The C. marina ALP has been recognized as the most active ALP among known microbial or mammalian ALPs (18). The disulfide bond enhancer is no longer required for the CFPS reaction due to the absence of disulfide bonds. No crystal structure has been reported for this relatively new member of the ALP superfamily. We prepared a site saturation mutagenesis library arbitrarily covering its 4th to 253rd amino acids using specific codons (following the E. coli codon preference) for each site and subjected the library to a screening experiment in a DNA concentration of λ = 0.1 and using 2.6 × 10⁵ droplets, which ensured a 5× scanning of the library and maximized the capacity of the given array by approaching the boundary condition (the square coordinate in Fig. 5). A few top-ranked droplets with fluorescence intensities beyond the Poisson-predicted maximum revealed a single converged mutation site Gly168, which was substituted by either Lys or Arg. We carried out the same digital enzymatic assay at pH 9.25 for every mutant. Both mutant enzymes, G168K and G168R, showed an identical improved kcat of 4200 ± 280 s⁻¹ solely for DiFMUP, twice the kcat of WT, while the mutant activity toward 4-MUP was comparable to the WT. Therefore, the activity improvement could be attributed to a reinforced electrostatic interaction between the fluorinated substrate and the positively charged Lys/Arg, which is putatively located in the narrow entrance to the catalytic pocket (18). The C. marina ALP mutant G168K/R set a new world record in the race of discovery and the creation of highly active ALP, in just a few days.


The feasibility of quantitative measurement based on Poisson statistics in our study relies on the dramatic improvement in the uniformity and stability of the droplets. On the basis of our comprehensive and rational screening for the ideal oil and surfactant, we established this unconventional strategy for the large-scale preparation of femtoliter droplets with a dramatically improved quality. The surfactant fitting to FemDA stabilized the water/oil interface, prolonging the life span of droplets at the femtoliter scale. The biocompatible and leakage-free in vitro compartmentalization has long been raised as a bottleneck issue (19, 20). The complete isolation of microchambers in our system prevents the droplets from direct contact and hence thoroughly eliminated the possibility of unwanted coalescence or aggregation that has been frequently raised as an issue in a droplet emulsion (21, 22). Herein, we not only disclosed a new formulation of the oil and the surfactant that showed unprecedented performance but also detailed our selection criteria and the characterization methods, which is likely more important and helpful to other researchers who intend to explore other oils and surfactants according to their differing needs. During the troubleshooting process for the microfabrication, we were made aware of the difficulties when applying unconventional materials to traditional micro-electro-mechanical systems (MEMS) processes. The low surface energy of CYTOP, much lower than that of conventional materials such as glass or silicon wafer, prevents most of the photoresists from being adhered to the surface, but this is crucial for the perfect sealing of tiny aqueous droplets into the femtoliter space. Further advances toward facile and reliable fabrication using such unconventional materials can be pursued in the future following the growing demand and interest in the high-performance microdevices.

The uniform dimension of every droplet is an important prerequisite of the Poisson distribution of DNA molecules (6), which leads to a convincing conclusion that protein synthesis in droplets results from single DNA molecules. Otherwise, an internal fluorescent dye must be used as a volumetric marker for calibration, but the reaction volume may nonlinearly alter the reaction dynamics at microscales, diminishing the reliability of the calibration. Our FemDA, featuring a homogeneous structure, thus saves substantial effort in data interpretation. The simple procedure for droplet formation protects biomacromolecules from potential damage caused by the vigorous vortexing or centrifugation usually involved in the preparation of water-in-oil emulsions (23). In addition, the highly ordered and uniform droplets in our system provide positioning information, which opens up an opportunity for system extension with image processing–guided robotic recovery (24, 25).

The Poisson-based model proposed herein is useful not only for the planar droplet array but also for droplet flow systems, as long as the variation of protein expression levels in the droplet emulsions can be decreased to an extent that allows resolving single DNA molecules. It should be noted that this model guarantees the high accuracy of hit identification only if the desired mutant exists in the library. A few rounds of tentative protein expression in a population of droplets may be necessary for selecting an optimal screening criterion because no one knows exactly by how many fold the mutant activity can be improved in a given library before a practical trial. Moreover, no library is able to cover a fully randomized protein, even one composed of far fewer than 100 residues (26). There is always a compromise in the number of residues subjected to mutagenesis as well as in the library size. It is therefore still highly probable that a given library contains zero improved mutants. In such cases, our digital screening scheme can quickly conclude with high confidence that the given library does not contain any improved mutant. This is also beneficial for accelerating the search of sequence space by quickly changing the strategy of library preparation, rather than wasting time on repeatedly testing a useless library. In circumstances where small activity improvements are expected in a library, the digital screening prefers a small library (e.g., a focused library rather than a completely random library) or requires a large number of droplets, and the Poisson-based model supports every round of additional mutagenesis and screening for accumulating beneficial mutations. In this sense, the full integration of an optimal starting point, a well-designed library, and our digital screening can dramatically increase the probability of obtaining improved mutants.

FemDA in the context of cell-free directed evolution provides a powerful means to explore brand-new mutants where the relevant mutation sites may be highly conserved across species and may not even exist in living organisms owing to the lack of evolutionary drive and/or the complexity of the in vivo metabolic network (27). Through expanding the search scope beyond the limited residues closely adjacent to the catalytic center, without having to worry about the labor intensity, we were able to identify the new mutation sites (Trp109 of the E. coli ALP and Gly168 of the C. marina ALP) located relatively far away (>10 Å) from the catalytic site. They are not ligand binding sites for coenzymes or metallic ions, based on the current understanding from the exact or putative structure information of ALPs (13, 18). Previous studies over the past decades, including very recent reports, did not discuss or identify the residue Trp109 at all (28, 29), despite the fact that extensive mutagenesis efforts have been made to improve the activity or to interrogate the structure-function relationship of this essential enzyme (16). The combination of FemDA and CFPS could be reliable and advantageous over cell-based screening, particularly for secretory proteins, whose expression is often difficult and susceptible to the cellular environment (30). The improved ALPs are expected to be useful for accelerating clinical diagnosis based on ELISA because the time required for accumulation of a sufficient amount of reaction products is shortened as the activity of the labeling enzyme increases. The cold-adapted feature of the highly active C. marina ALP mutant is particularly attractive for daily molecular cloning experiments, as the mutant enzyme is able to efficiently catalyze the dephosphorylation reaction even at room temperature and can be rapidly inactivated. Further elucidation of the mechanism of the activity improvement may fill some gaps that remain in the theory of catalysis and refine the strategies toward focused mutagenesis.

As a result of the intrinsic property of single-molecule sensitivity of the digital assay, opportunities based on FemDA are no longer restricted by the quantity of enzymes that can be synthesized per clone. Therefore, it should be possible to integrate various types of expression systems (31) or advanced gene libraries into FemDA regardless of the protein synthesis efficiency. A fine on/off control of orthogonal translation reactions should become more important than the efforts in improving the protein yield. Further, an exciting extension of our technology may also link to the fast-growing field of de novo enzyme design with the means of directed evolution (32, 33), in which plenty of room for substantial improvement of activity exists. This has particularly attracted the growing attention in generating chemical transformations not known to exist in natural biocatalysis (30, 34, 35). Such integration may also offer a higher order of directed evolution with posttranslational modifications or unnatural amino acids for the future development of protein engineering, beyond the canonical genetic alphabet (36, 37).


Experimental design

The objective of this study was to establish rapid enzyme screening via an accurate hit selection. The femtoliter droplets were used to encapsulate single DNA molecules, carry out high-throughput protein synthesis with a cell-free transcription and translation system, and link the genotype to the corresponding phenotype. The droplet array was imaged by a microscope. The image data were analyzed by software developed based on Fiji. The microcapillary as a world-to-chip interface was used to recover a single femtoliter droplet that contains the template DNA encoding highly active mutants. The Poisson statistics provides an accurate analytical tool to determine the droplet of interest without being influenced by the protein expression level. A model enzyme was used to demonstrate this approach.

Template DNA preparation

A T7 expression vector pRSET-B (Invitrogen) was used as the cloning vector for the preparation of recombinant plasmids. Each gene encoding for cp173-mVenus, mseCFP, mNeonGreen, tdTomato, mRuby2, smURFP, E. coli ALP, and C. marina ALP was inserted in between the multiple cloning sites of the pRSET-B vector via In-Fusion cloning (In-Fusion HD Cloning Kit, Clontech), respectively. After transformation (HIT-JM109 competent cells, RBC Bioscience) and cell culture, each plasmid was extracted, purified (NucleoSpin plasmid QuickPure, Takara-Bio), and sequenced (FASMAC). A T7 promoter and a T7 terminator were located at the upstream and downstream of the insert, respectively, and a primer set (forward primer, GCGAAATTAATACGACTCACTATAGGG; reverse primer, GTTATGCTAGTTATTGCTCAGCGG) targeting these two regions was used for PCR amplification (PrimeSTAR Max DNA Polymerase, Takara-Bio). The amplicon was purified (NucleoSpin gel and PCR clean-up, Takara-Bio) and quantified (NanoDrop, Thermo Fisher Scientific) for use as a linear template DNA in cell-free expression.

Preparation of mutagenesis libraries

We constructed a single-site saturation mutagenesis library in which every codon of amino acid residue in the E. coli ALP (isozyme 3, EC gene was individually randomized. In brief, we used inverse PCR to amplify the expression plasmid with nonoverlapping adjacent primers designed for each clone. A degenerate codon NNK (N: A, T, G, or C; K: G or T) was placed at the 5′ end of the forward primer (14). The PCR product was subjected to 5′-phosphorylation and self-ligation (T4 DNA ligase). The ligation product (plasmid) was used as the PCR template for the preparation of the linear template DNA of cell-free expression. To prepare the site saturation mutagenesis library of the C. marina ALP, we used specific codons preferred by E. coli for each site instead of the degenerate NNK codons.

Preparation of microchamber array device

The ambient humidity of the microfabrication room was maintained around 40 to 50%. A cover glass (No. 1, Matsunami Glass) was sonicated for 15 min in 8 M sodium hydroxide solution (FUJIFILM Wako Pure Chemical), rinsed with ultrapure water, and dried under an air stream. The hydroxylated cover glass was soaked in 0.05 vol % (3-aminopropyl)triethoxysilane (in ethanol) (Sigma-Aldrich) for 1 hour, rinsed with ultrapure water, and dried under an air stream. The silanized cover glass was baked at 80°C for 5 min on a hotplate (TH-900, As One) to stabilize the siloxane bonds. A perfluoropolymer (CYTOP CTL-816AP, AGC) was spin-coated on the cover glass at 3400 rpm for 30 s and baked at 80°C for 30 min and then at 200°C for 1 hour on the hotplate, resulting in a thickness of 3 μm. A positive photoresist (AZ P4903, AZ Electronic Materials) was spin-coated on the CYTOP layer at 7500 rpm for 60 s and cured at 110°C for 5 min on the hotplate. After a spontaneous rehydration process for 30 min, the photoresist was exposed by a mask aligner (BA100it, Nanometric Technology) with a chrome photomask fabricated via electron beam lithography (F5112, Advantest). The exposed photoresist was developed (AZ 300 MIF, AZ Electronic Materials) and used as a mask for the following dry-etching process. The photoresist-uncovered CYTOP was selectively removed with O2 plasma (O2, 50 standard cubic centimeters per minute; pressure, 10 Pa; power, 50 W; time, 27 min) generated by a reactive ion etching system (RIE-10NR, Samco), exposing the hydrophilic bottom surface of the glass substrate. After removal of the photoresist mask by a sequential rinse with acetone, 2-propanol, and pure water, microchambers with a hydrophobic sidewall and hydrophilic bottom formed the microchamber array. 
This fabrication process ensured the complete removal of hydrophobic polymers and consequent full exposure of the hydrophilic glass substrate, which is important for the stable retention of aqueous solution in the femtoliter space.

The resulting microchamber array was covered with a PDMS (Sylgard 184, Dow Corning) channel. The PDMS channel with a height of 135 μm was fabricated via replica molding. A mixture of base and curing agent in a 10 (base):1 (curing agent) ratio was deaerated (planetary centrifugal mixer AR-100, Thinky), poured on a patterned master, deaerated again in a vacuum chamber, and cured at 60°C overnight. The inlet and outlet holes of the PDMS channel were punched by a biopsy puncher (Kai Industries). After assembly with the microchamber array, liquids could be guided through the holes into the channel and trapped directly inside the microchambers via a sharp chilling process using an aluminum block prechilled on ice. The simple chilling process was able to remove air from the microchambers because the solubility of air in water increases as the temperature decreases. A subsequent oil sealing removed the aqueous solution outside the microchambers to produce FemDA. The diameter and depth of each microchamber were 4 and 3 μm, respectively, resulting in a volume of about 38 fl.
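
The stated droplet volume follows from the chamber geometry. As a minimal check, the sketch below treats each microchamber as an ideal cylinder, which is an assumption (the real etched sidewalls may deviate from it), and uses the identity 1 μm³ = 1 fl. The function name is illustrative.

```python
import math


def cylinder_volume_fl(diameter_um: float, depth_um: float) -> float:
    """Volume of an ideal cylindrical chamber in femtoliters (1 um^3 = 1 fl)."""
    radius_um = diameter_um / 2.0
    return math.pi * radius_um ** 2 * depth_um


volume_fl = cylinder_volume_fl(diameter_um=4.0, depth_um=3.0)
print(f"chamber volume of about {volume_fl:.1f} fl")
```

With a 4-μm diameter and a 3-μm depth, the ideal-cylinder volume is 12π μm³, or roughly 38 fl.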

CFPS using the microchamber array device

We prepared a flush oil composed of a hydrofluoroether (AE-3000, AGC) with 0.1 wt % fluorosurfactant (SURFLON S-386, AGC). The flush oil was equilibrated with buffer components of the cell-free system by vortex mixing for 30 s, incubation for 10 min, and centrifugal separation at 2 × 10⁴ g for 5 min. The equilibrated oil was chilled on ice before use. We synthesized proteins from single DNA molecules encapsulated in the droplets using a cell-free transcription and translation system (PURExpress In Vitro Protein Synthesis Kit, New England Biolabs). The protein synthesis working solution was composed of 4.0 μl of solution A, 3.0 μl of solution B, 0.2 μl of recombinant RNase inhibitor (40 U/μl; Takara-Bio), template DNA with desired concentrations, biliverdin (10 μM; Sigma-Aldrich) as required, disulfide bond enhancer 1 and disulfide bond enhancer 2 (0.4 μl for each, PURExpress Disulfide Bond Enhancer, New England Biolabs) as required, fluorogenic substrate (1.0 mM DiFMUP, Invitrogen; or 1 mM fluorescein diphosphate, AAT Bioquest) as required, and nuclease-free water (Invitrogen) to a final volume of 10 μl (the total volume may be arbitrarily scaled down or up). The reaction solution was injected into the microchannel by pipette, and a sharp chilling process was applied for a few seconds to remove air from the microchambers. We then injected the equilibrated flush oil to isolate each microchamber, followed immediately by injection of the perfluoropolyether oil (Fomblin Y25, Solvay) to seal individual microchambers. This system featuring zero dead-volume allows the displaced reaction solution to be collected, frozen, or reused. Last, we sealed the inlet and the outlet of the microchannel, each with a piece of foil tape. With the exception of smURFP, which was synthesized at 37°C (Thermo Plate, TP-CHSQ-C, Tokai Hit), the FemDA-based CFPSs in this study were carried out at room temperature. In particular, for the E. coli ALP screening experiment, we applied the library DNA at a concentration of λ = 0.1 to an array (approximately 2 mm by 9 mm) consisting of about 1.8 × 10⁵ droplets, keeping the theoretical number of droplets containing four or more DNA molecules less than one: $1.8 \times 10^{5} \times \left[1 - e^{-0.1}\left(1 + \frac{0.1^{1}}{1!} + \frac{0.1^{2}}{2!} + \frac{0.1^{3}}{3!}\right)\right] \approx 0.7$.
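
The loading criterion above can be reproduced numerically. The following sketch computes the expected number of droplets holding at least a given number of DNA molecules. It also derives a "Poisson-predicted maximum" copy number, taken here to be the largest k for which at least one droplet with k or more copies is expected; this definition is our reading of the text, and the helper names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(k) = exp(-lam) * lam**k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_droplets_at_least(k_min: int, lam: float, n_droplets: float) -> float:
    """Expected number of droplets containing k_min or more DNA molecules."""
    tail = 1.0 - sum(poisson_pmf(k, lam) for k in range(k_min))
    return n_droplets * tail


def poisson_predicted_max(lam: float, n_droplets: float) -> int:
    """Largest k with at least one expected droplet holding >= k copies
    (assumed definition, see the lead-in)."""
    k = 0
    while expected_droplets_at_least(k + 1, lam, n_droplets) >= 1.0:
        k += 1
    return k


n_four_plus = expected_droplets_at_least(4, lam=0.1, n_droplets=1.8e5)
k_max = poisson_predicted_max(lam=0.1, n_droplets=1.8e5)
print(f"expected droplets with >= 4 copies: {n_four_plus:.2f}")
print(f"Poisson-predicted maximum copy number: {k_max}")
```

For λ = 0.1 and 1.8 × 10⁵ droplets, fewer than one droplet is expected to hold four or more copies, so under this definition a droplet brighter than three copies of the parent gene would be flagged as a candidate hit.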


All images were captured using an inverted fluorescence microscope (Eclipse Ti-E, Nikon) equipped with either an electron-multiplying charge-coupled device camera (ImagEM C9100-13, Hamamatsu; or iXon Ultra 897, Andor) or a scientific complementary metal-oxide-semiconductor camera (ORCA-flash 4.0 C11440-22C, Hamamatsu). We used a light-emitting diode light source (SPECTRA X Light Engine, Lumencor) to provide illumination with filter sets (from Semrock or Nikon): (i) excitation (Ex), 390/40 nm; dichroic, 405 nm; emission (Em), 452/45 nm (for DiFMU); (ii) Ex, 427/10 nm; dichroic, 458 nm; Em, 483/32 nm (for mseCFP); (iii) Ex, 480/40 nm; dichroic, 505 nm; Em, 535/50 nm (for mNeonGreen, and fluorescein); (iv) Ex, 504/12 nm; dichroic, 515 nm; Em, 542/28 nm (for mVenus); (v) Ex, 554/23 nm; dichroic, 573 nm; Em, 609/54 nm (for tdTomato, and mRuby2); and (vi) Ex, 630/38 nm; dichroic, 655 nm; Em, 694/44 nm. A 60× objective lens (Plan Apo VC; numerical aperture, 1.4; Nikon) was used for imaging experiments.

Recovery of single DNA molecules

A single-use glass microcapillary (FemtoTips, Eppendorf) was connected to a pump (FemtoJet 5247 or FemtoJet 4i, Eppendorf), and the movement of the microcapillary was controlled by a micromanipulator (TransferMan 4r, Eppendorf). The micromanipulator was mounted on the body of the microscope. A pipette tip (Microloader, Eppendorf) was used to add 1 μl of pure water into the tip of the microcapillary. A constant inner pressure of 60 hPa against capillary force was applied inside the microcapillary by the pump. The focal plane was kept constant at the position of the upper surface of the microchamber array during the recovery experiment. As we lowered the microcapillary toward the droplet of interest, a latent image of the microcapillary was first observed via eyepieces under bright field, which revealed the approximate location of the microcapillary. A pressure drop (down to 0 hPa, kept for 1 s) was triggered when the tip was inserted into the target droplet, resulting in an instant suction of the femtoliter solution. The large concentration difference inside and outside the microcapillary facilitates a quick diffusion of the contents into the preloaded pure water. A fast release of the recovered contents into a PCR tube preloaded with PCR reagents (with the same primer set as the one used in the preparation of the linear template DNA) was triggered by touching the tip of the microcapillary to the inner wall of the PCR tube. The capillary force was decreased in inverse proportion to the diameter of the tip, and the constant inner pressure pushed the liquid out of the microcapillary as a result of the disrupted force balance. Because of the large dilution factor (>10⁸-fold) of the contents of the CFPS solution, the following PCR was not affected by the impurities. Two rounds of normal PCR in a volume of 25 μl, each consisting of 30 cycles, were able to amplify a single DNA molecule to a level exceeding the detection limit of agarose gel electrophoresis. The first-round PCR solution (0.5 μl) was added into the second-round PCR without DNA purification. The amplicons were purified for Sanger sequencing, or they may be used directly as the template DNA of CFPS.

Enzyme activity measurement based on digital assays

The E. coli ALP was synthesized using PURExpress spiked with the disulfide bond enhancer (New England Biolabs) in a PCR tube for 3 hours at 37°C. The C. marina ALP was synthesized using PURExpress spiked with 100 μM zinc acetate but without the disulfide bond enhancer. Template DNA (150 ng) was added to 15 μl of reaction solution. The resulting solution containing ALP was diluted 10⁶-fold with a buffer (Solution I from PUREfrex kit, GeneFrontier). The diluted enzyme solution was mixed with 3 mM 4-MUP (Invitrogen) or 1.5 mM DiFMUP (Invitrogen) in an assay buffer [1 M diethanolamine, 1 mM MgCl2, and 0.05 vol % Tween-20 (pH 9.25)]. The mixture was immediately introduced and sealed into the microchamber array device, so that single enzyme molecules could be loaded stochastically into the microchambers following a Poisson distribution. The turnover number, kcat (s⁻¹), was determined with a time-course measurement at room temperature, combined with a calibration curve of 4-MU (Invitrogen) or DiFMU (Invitrogen). The time-course measurement showed a discrete distribution of the hydrolysis rate corresponding to one, two, or three enzyme molecules encapsulated in the droplet. The difference between adjacent slopes in the time-course data was used to extract the increment of the hydrolysis rate resulting from exactly one enzyme molecule.
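
The conversion from a slope increment to a turnover number can be sketched as follows. The sketch assumes that the slopes have already been converted to product concentration per unit time (M/s) with the calibration curve and that every droplet has the nominal 38-fl volume. The slope values are illustrative placeholders, not measured data, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def slope_increment(slopes_by_occupancy: list[float]) -> float:
    """Mean difference between adjacent slope clusters (0, 1, 2, ... enzymes)."""
    pairs = zip(slopes_by_occupancy, slopes_by_occupancy[1:])
    diffs = [upper - lower for lower, upper in pairs]
    return sum(diffs) / len(diffs)


def kcat_from_increment(delta_slope_molar_per_s: float,
                        droplet_volume_fl: float = 38.0) -> float:
    """Product molecules formed per second by one enzyme molecule (s^-1)."""
    volume_liters = droplet_volume_fl * 1e-15  # 1 fl = 1e-15 l
    return delta_slope_molar_per_s * volume_liters * AVOGADRO


# Placeholder cluster slopes in M/s for droplets holding 0, 1, 2, and 3 enzymes.
slopes = [0.0, 1.8e-7, 3.6e-7, 5.4e-7]
kcat = kcat_from_increment(slope_increment(slopes))
print(f"kcat of about {kcat:.0f} s^-1")
```

Taking the difference between adjacent clusters, rather than the absolute slope of the one-enzyme cluster, cancels any background hydrolysis rate shared by all droplets.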

Enzyme activity measurement based on microtiter plate

The ALP gene was cloned into the pRSET-B expression vector along with a signal peptide (MKQSTIALALLPLLFTPVTKA) and then transformed and expressed in E. coli C43 (DE3) cells. The expressed enzyme was purified from the cultured cells using a nickel–nitrilotriacetic acid column (Qiagen) and further purified with gel filtration (Superdex 200, GE Healthcare). The concentration of the purified enzyme samples was determined by the bicinchoninic acid assay (BCA protein assay kit, Pierce). Five picomolar ALP was mixed with 1.5 mM DiFMUP or 3 mM 4-MUP in the assay buffer [1 M diethanolamine, 1 mM MgCl2, and 0.05 vol% Tween-20 (pH 9.25)]. The kinetic measurement was carried out on a plate reader (FlexStation 3, Molecular Devices) at room temperature with 364-nm excitation and 448-nm emission wavelengths.

CFPS based on microtiter plate

We synthesized the fluorescent protein cp173-mVenus using the PURExpress system in a microtiter plate (non-binding, μClear, 384-well plate, Greiner Bio-One) at room temperature. The composition of the reaction solution was the same as that used for the FemDA, except for the DNA concentration. A serial dilution of the template DNA was added to a 25-μl final volume along with a blank control, resulting in final DNA concentrations of 7.41, 3.70, 1.85, 0.93, 0.46, 0.23, 0.11, 0.058, and 0 nM. A time-course measurement of the fluorescence intensity was carried out by the plate reader at room temperature with 513-nm excitation and 528-nm emission wavelengths.

Statistical analysis

The fluorescence intensities of droplets were extracted using Fiji with a homemade Java plugin (available upon inquiry to the corresponding author). Gaussian fitting was done by KaleidaGraph (Synergy). The equation $P = e^{-\lambda} \lambda^{k} / k!$ (where λ is the expected average number of DNA molecules per droplet and k is the physical number of DNA molecules in a droplet, which should be 0, 1, 2, 3, …) was used for Poisson fitting. Other data from the plate reader were analyzed by Excel (Microsoft Office). The reported values correspond to the means of a minimum of three independent experiments. Data are presented as mean ± SD.
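
As an illustration of how λ might be estimated from droplet counts, the sketch below uses two standard estimators: the maximum-likelihood estimate (the mean copy number) and the digital-counting estimate from the fraction of empty droplets. These are swapped-in techniques, because the fitting procedure itself is not described here beyond the equation. The counts are synthetic, and all names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P = exp(-lam) * lam**k / k!"""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def fit_lambda_mle(counts_by_k: list[int]) -> float:
    """Maximum-likelihood lambda: mean number of molecules per droplet."""
    total = sum(counts_by_k)
    return sum(k * n for k, n in enumerate(counts_by_k)) / total


def lambda_from_empty_fraction(counts_by_k: list[int]) -> float:
    """Estimate lambda from P(0) = exp(-lam), using only the empty droplets."""
    return -math.log(counts_by_k[0] / sum(counts_by_k))


# Synthetic example: numbers of droplets assigned to k = 0, 1, 2, 3 copies.
counts = [9048, 905, 45, 2]
lam_hat = fit_lambda_mle(counts)
lam_empty = lambda_from_empty_fraction(counts)
expected = [sum(counts) * poisson_pmf(k, lam_hat) for k in range(len(counts))]

print(f"lambda (MLE) = {lam_hat:.4f}, lambda (empty fraction) = {lam_empty:.4f}")
print("expected counts:", [round(x, 1) for x in expected])
```

Comparing the expected counts with the observed ones gives a quick check of whether the droplet population is consistent with Poisson loading.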


Supplementary material for this article is available at

Supplementary Text

Fig. S1. Microfabrication, optimization, and characterization for FemDA device.

Fig. S2. Measurement of the dynamic surface tension of the flush oil.

Fig. S3. Bulk-phase CFPS.

Fig. S4. Histogram of fluorescence intensities of droplets from various protein samples on FemDA.

Fig. S5. CFPS by E. coli cell lysates.

Fig. S6. Verification and establishment of amplification protocol for one DNA molecule.

Fig. S7. Analyses of the E. coli ALP mutants.

Fig. S8. Changes of interaction energy between the E. coli ALP 109th amino acid residue and its surrounding amino acid residues.

Fig. S9. Fluorescent fusion proteins to assess target protein yield.

Table S1. Theoretical maximum size of gene library for specific settings of screening.

Movie S1. Fluorescence recovery after photobleaching (FRAP) experiment for the confirmation of the robust droplet formation.

Movie S2. Time-course measurement for CFPS of fluorescent protein mVenus.

Movie S3. CFPS of mVenus using FemDA.

Movie S4. Time-course measurement for CFPS of ALP.

Movie S5. Nucleic acid recovery from femtoliter droplets with a microcapillary.

References (38–40)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank AGC Inc. for offering free samples of oils and surfactants. We thank Hitachi High-Technologies Corporation, S. Okada (JAMSTEC), and Y. Kanasaki (JAMSTEC) for technical support in materials characterization. We thank Abbott Japan Co. Ltd. for offering E. coli ALP D101S mutant and assay reagents. We thank J. Morijiri (AIST) and K. Kurosawa (JAMSTEC) for technical assistance in gene library preparation. We thank R. Yaginuma (UTokyo) for technical support in enrichment experiment and full-length screening of E. coli ALP, and M. Hara (UTokyo), K. Saito (UTokyo), and I. Horie (UTokyo) for technical support in genetic and biochemical analysis of mutant ALP. We thank E. Chiyoda (UTokyo) and K. Kurosawa (JAMSTEC) for technical support in microfabrication works. We thank M. Hirai (JAMSTEC), M. Miyazaki (JAMSTEC), and T. Toyofuku (JAMSTEC) for assistance in imaging facility setup. We thank T. Sumida (JAMSTEC), T. Nunoura (JAMSTEC), S. Deguchi (JAMSTEC), and K. Takai (JAMSTEC) for useful discussion. We thank C. Chen (JAMSTEC) for improving the English language. A part of microfabrication work was conducted in the Center for Nano Lithography & Analysis, The University of Tokyo, supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan. Funding: This work was supported by a JSPS Postdoctoral Fellowship for Overseas Researchers (to Y.Z.); a Grant-in-Aid for JSPS Research Fellow (to Y.Z.); ImPACT Program of Council for Science, Technology and Innovation, Japan Science and Technology Agency (to H.N.); the budget of Japan Agency for Marine-Earth Science and Technology (to Y.Z.); and JSPS KAKENHI grant number JP18K14260 (to Y.Z.). Author contributions: Y.Z. conceived this study, designed and performed experiments, proposed the mathematical model, and wrote the manuscript. H.K. and R.I. performed experiments in the early stage of this project. Y.M. developed software for data analysis and performed experiments in the late stage of this project. K.M. prepared mutagenesis library. Y.S. performed theoretical calculations for mutants. H.U. contributed to the late stage of this project. K.V.T. contributed to the early stage and the late stage of this project. H.N. conceived the idea and supervised the project. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
