Environmental changes bridge evolutionary valleys

See allHide authors and affiliations

Science Advances  22 Jan 2016:
Vol. 2, no. 1, e1500921
DOI: 10.1126/sciadv.1500921


In the basic fitness landscape metaphor for molecular evolution, evolutionary pathways are presumed to follow uphill steps of increasing fitness. How evolution can cross fitness valleys is an open question. One possibility is that environmental changes alter the fitness landscape such that low-fitness sequences reside on a hill in alternate environments. We experimentally test this hypothesis on the antibiotic resistance gene TEM-15 β-lactamase by comparing four evolutionary strategies shaped by environmental changes. The strategy that included initial steps of selecting for low antibiotic resistance (negative selection) produced superior alleles compared with the other three strategies. We comprehensively examined possible evolutionary pathways leading to one such high-fitness allele and found that an initially deleterious mutation is key to the allele’s evolutionary history. This mutation is an initial gateway to an otherwise relatively inaccessible area of sequence space and participates in higher-order, positive epistasis with a number of neutral to slightly beneficial mutations. The ability of negative selection and environmental changes to provide access to novel fitness peaks has important implications for natural evolutionary mechanisms and applied directed evolution.

  • molecular evolution
  • fitness landscape
  • epistasis
  • beta-lactamase
  • directed evolution
  • environmental changes


The fitness landscape metaphor is often depicted as a three-dimensional hilly surface with distinct fitness peaks and valleys (Fig. 1). This representation flattens the high dimensionality of actual landscapes to aid our understanding of what a fitness landscape represents. Fitness landscapes depict the change in fitness along possible evolutionary pathways (1), and thus, the shape of the fitness landscape affects the dynamics and outcomes of evolution. Accordingly, much effort has gone into understanding the shape of the landscape and the degree to which ruggedness manifests in actual evolutionary scenarios (25). Ruggedness results from complex interactions between mutations, especially those exhibiting reciprocal sign epistasis (6). In rugged landscapes, local fitness maxima exist as “traps,” and therefore, the evolutionary endpoint of adaptation from a given starting point is not predetermined as it would be in a smooth landscape. The endpoint of each evolutionary pathway depends on selective history, landscape topology, and chance. However, one way in which evolution might escape traps and cross fitness valleys is through changes in the environment that remodel the landscape (1). Natural environments are intrinsically variable, and evolutionary pathways are contingent on past environmental interactions and selective history (7). Thus, the dynamics and complexity of selective pressures are likely to contribute to the routes of natural evolutionary pathways. There is a growing appreciation of the complex and important relationship between the environment and the fitness and epistatic effects of mutations (8, 9). Understanding the roles that environmental changes and landscape ruggedness play in adaptive pathways would allow insights into the predictability and potentiality of evolutionary pathways.

Fig. 1 Simplified model representation of how evolution in alternating environments that modulate the fitness landscape can lead to the crossing of fitness valleys.

The xy plane represents sequence space, whereas the z axis represents fitness. The solid red arrow indicates an evolutionary pathway following increasing positive selection pressure. (A) A sequence (circle) sits at an optimum separated from global optima by a fitness valley. (B) A change in the environment alters the landscape such that evolution brings the gene to a new sequence that previously resided in the valley. (C) Upon return to the original environment, the sequence resides in the valley but can now evolve to the global optimum, which lies uphill.

How changes in environment or selection affect evolutionary outcomes is underexplored experimentally (8). One way to address this topic is to ask whether increasingly stringent positive selection is the best strategy for optimization. This strategy is ubiquitously used in directed evolution experiments; however, on a rugged landscape, positive selection alone may quickly reach a local maximum, limiting evolutionary capacity. Studies of evolution under strong selection have indicated that evolution can follow few mutational pathways using positive selection, supporting the predictability of evolutionary endpoints (10). How evolution might cross fitness valleys remains an unresolved problem (1, 2). Models and simulations have suggested that random strategies, such as stochastic tunneling or recombination, may occasionally result in the crossing of fitness valleys (11, 12). Other studies have examined the effects of changing the starting point of evolution to arrive at different fitness maxima (13), but experiments rarely fundamentally change the nature of selection over the course of experimental evolution. Exceptions such as selection on alternating antibiotics (14) and a regimen of neutral drift followed by positive selection (1521) have met with mixed success in evolving improved function.


Experimental design

We wondered whether complex selection dynamics could facilitate the crossing of fitness valleys. Specifically, we examined whether environmental changes causing dynamic shifts in a protein fitness landscape would result in evolutionary pathways ending in distinct, potentially superior evolutionary endpoints. We designed an experiment in which environmental changes enabled selection toward sequences previously residing in a low-fitness valley and tested whether such an evolutionary path would arrive at distinct and superior outcomes (Figs. 1 and 2). We used Escherichia coli having a bandpass genetic circuit designed for tunable selection of β-lactam antibiotic resistance as a model system for systematically examining such a scenario (22). For cells with this gene circuit, β-lactam antibiotics such as ampicillin are required to induce the expression of tetracycline resistance and to do so even at sublethal levels of the β-lactam. When cells are plated in the presence of tetracycline and ampicillin, growth requires enough β-lactamase activity to degrade the ampicillin below lethal levels, but not too much so as to remove the signal that induces tetracycline resistance. This genetic circuit allowed us to perform selections for β-lactam resistance within a narrow window of resistance (that is, one can select for resistance above a first threshold but below a second, higher threshold) (Fig. 2A). Thus, selections for low but nonzero resistance can be performed. We term such selections “negative” selections in the sense that high fitness (that is, the ability of the cells to grow) in such an environment is achieved only by alleles conferring low levels of β-lactam resistance. Similarly, neutral selection for maintenance of about the original level of resistance can be performed. These bandpass selections are tuned by the level of the β-lactam antibiotic in the environment and require the addition of tetracycline. Thus, these environments reshape the fitness landscape analogously to the depiction in Fig. 1, and our engineered bacteria are a suitable model system for examining evolution under landscape-altering environmental changes.

Fig. 2 Schematic and course of experimental evolution on the β-lactamase gene.

(A) General schematic. After a comprehensive saturation mutagenesis, selection proceeds in one of three manners depending on the environment. Positive selection (blue) enriches for sequences that provide resistance above a certain level. Neutral selection (yellow) enriches for sequences conferring about the same resistance as the starting gene. Negative selection (red) enriches for sequences conferring very low but above-background resistance. The latter two selection schemes are enabled by an engineered bandpass gene circuit present in the cell (22) and require the addition of the antibiotic tetracycline and different levels of cefotaxime, which constitute the difference in environment from the positive selection. (B to E) Course of the evolution experiments. Experiments differ in the first three rounds in which the bacteria underwent (B) positive selection, (C) neutral selection, (D) negative selection, or (E) oscillating selection. Each round represents both mutagenesis (line segments) and selection at the indicated level of cefotaxime (endpoints). Red circles indicate bandpass selections, and blue circles indicate positive selections. The dashed horizontal line indicates the resistance level conferred by TEM-15. Graphed resistance values indicate a lower limit on the resistance values of the entire population during selection (that is, the concentration at which the selection was performed).

The β-lactamase allele TEM-1 provides high antibiotic resistance to ampicillin but very low resistance to cefotaxime. Clinically isolated alleles of TEM-1 conferring increased cefotaxime resistance include TEM-15 (E104K/G238S) and TEM-52 (E104K/M182T/G238S), with the latter increasing resistance 4000-fold (23). Six independent directed evolution experiments on TEM-1 that imposed positive selection for cefotaxime resistance found these same three mutations in the highest-resistance alleles (13, 2428). The GKTS allele, which adds the A42G mutation, has the highest resistance observed in these experiments (10, 27, 29), though other combinations of amino acids at these positions can confer slightly higher resistance (30). These directed evolution studies have used a typical methodology: mutation of the gene, positive selection for higher resistance, and iteration of these processes over several rounds.

Evolution experiments

We subjected TEM-15 to eight rounds of experimental evolution using four different selection strategies: positive selection, neutral selection, negative selection, and oscillating selection (Fig. 2, B to D). As a control for evolution from low-resistance alleles using only positive selection, we also subjected TEM-1 to positive selection. The differences in the selection strategies occurred in the first three rounds of evolution and were followed by five rounds of positive selection to climb toward local fitness maxima. The positive selection experiments represent evolution in a fixed environment with a static landscape and are the typical directed evolution strategy. Because of the bandpass selection, our neutral selection avoids selection for alleles conferring resistance far above the starting allele, unlike previous directed evolution studies using neutral drift (1521). During the negative selection steps in the oscillating and negative selection schemes, we set the level of resistance needed for growth above the background level of resistance exhibited by E. coli without a β-lactamase gene. We reasoned that mutations that completely deactivate a protein might be exceedingly common (such as nonsense mutations), whereas maintenance of above-background level of activity would better reflect the desired scenario that the gene is still required for cell viability and that it encodes a protein with some enzyme activity.

We constructed five plasmids with distinct DNA barcodes identifying the five experiments to monitor and ensure against cross-contamination between the five different selection experiments. One round of evolution consisted of comprehensive codon mutagenesis on the plasmid-borne β-lactamase genes (isolated from the previous round), transformation of the newly mutated plasmid library into fresh E. coli cells, and selection on agar plates for the desired level of cefotaxime resistance (Fig. 2A). At least 106 library members were generated and subjected to selection in each round. Selections were tuned such that typically >100 and sometimes thousands of independent clones appeared at each step to limit the effects of evolutionary bottlenecking on the experimental outcome (data S1). Populations under negative and oscillating selection recovered from negative selection to resistance above that of TEM-15 after one round of positive selection (Fig. 2, D and E). This result supports the idea of the commonality of compensating mutations or suppressor mutations in restoring protein activity (31, 32) but might also reflect a reversal of deleterious mutations in combination with beneficial mutations.

Evolution outcomes

At the end of the experiment, all strategies resulted in bacteria with a high cefotaxime resistance phenotype. The coding sequences of the genes from 47 of the most resistant colonies from each selection strategy were sequenced (data S2 to S6). All selection strategies showed sequence clustering, indicating that each strategy, for the most part, reached distinct areas of sequence space (Fig. 3A). This result emphasizes how much the environment and the nature of selective pressure can shape evolutionary outcomes. Sequences resulting from negative selection showed both the highest average genetic distance from TEM-15 (Fig. 3A and fig. S1) and the highest diversity (fig. S2). We discerned no apparent pattern in the location of mutations on the protein structure (fig. S3). We chose seven distinct sequences from each selection experiment (based on diversity within the tree) to subclone and analyze the effects of mutations on the coding sequence as opposed to those elsewhere on the plasmid. The coding sequences were ligated into a fresh plasmid backbone and transformed into fresh E. coli cells for resistance testing. All libraries produced alleles that conferred resistance higher than that conferred by GKTS, though an approximately eightfold drop in resistance upon subcloning of the genes indicated that mutations elsewhere on the plasmids contributed to the high resistance observed after round 8. We supposed that mutations elsewhere likely consisted of those that increased plasmid copy number or promoter strength. Strikingly, the negative selection strategy evolved alleles that conferred the highest resistance, with five of the seven alleles providing an eightfold resistance increase over that of GKTS (Fig. 3B). This result is interesting in light of theoretical predictions that a low starting fitness may dramatically enhance the chances for a population to reach the global maximum (33).

Fig. 3 Diversity and conferred resistance of alleles after the eighth round of evolution.

(A) Unrooted tree of Hamming distances between randomly selected alleles. Branches are color-coded by the evolution experiment as indicated. Scale bar indicates 1 unit of Hamming distance (that is, one missense mutation). (B) Increase in resistance conferred by seven alleles from each selection strategy as compared to TEM-1. Resistance was measured using the MIC assay. The seven alleles were selected from among those depicted in the tree to represent diverse sequences. The dashed line represents resistance conferred by the GKTS allele, whereas the TEM-15 allele conferred a 64-fold higher resistance relative to TEM-1.

Deep sequencing of populations during the course of the negative selection experiment

We next sought insights into how negative selection produced this outcome. Deep sequencing of the population at each round of the negative selection regimen revealed a high expansion of genetic diversity during negative selection, consistent with our expectation that deleterious mutations are common (fig. S4). In contrast, during the subsequent positive selection steps, increasing purifying selection is apparent (fig. S4). In particular, the A184I mutation was highly enriched in the population (figs. S5 and S6). However, a large amount of diversity was maintained throughout the experiment. A total of 5836 unique sequences were found in deep sequencing data after the first positive selection, demonstrating that, despite rapidly changing modes of selection, we maintained a high diversity of alleles following the negative selection regimen. In addition, 85% (40 of 47) of the sequences from the final selection were unique, with an average difference of 6.1 AA mutations between unique mutants.

Evolutionary pathway reconstruction using comprehensive fitness measurements

To determine whether the environmental change resulting in negative selection was a key factor in the evolutionary history of an allele conferring a high resistance, we attempted to reconstruct all possible evolutionary intermediates between TEM-15 and one of the fittest alleles resulting from the negative selection regimen (BS-NEG-4) (fig. S7). This allele contained nine missense mutations relative to TEM-15 [A11S, A42G, L49M, Q90R, A184I, I208W, A224V, F230S, and R241F in Ambler notation (34)] and conferred an eightfold higher cefotaxime resistance than GKTS. Among the mutations in BS-NEG-4, A11S, A42G, L49M, and A224V have been reported in other TEM alleles, though none were present together in the same allele (35). We created a library designed to contain all 512 combinations of these nine mutations to determine the protein fitness (w) of all possible evolutionary intermediates in the evolution of BS-NEG-4 (assuming no mutation reversals or multiple mutations at the same amino acid position). Protein fitness quantifies the relative ability of the protein to confer a cefotaxime resistance phenotype: the higher the resistance, the higher the fitness. We determined the fitness of 98.8% (506 of 512) of all possible combinations of these nine mutations in a high-throughput manner involving bandpass selection and deep sequencing (data S7) (36). This method uses the bandpass circuit. Cells with a given level of cefotaxime resistance will grow best within a certain range of cefotaxime concentrations. A cell with higher resistance requires high concentrations, and a cell with lower resistance requires lower concentrations. Fitness is thus determined by plating the library on plates containing tetracycline and different cefotaxime concentrations, deep sequencing the β-lactamase gene on each plate, counting how many times each allele appears on each plate, and calculating the center of the concentration range at which the allele allows growth. This center concentration can be thought of as a proxy for the minimum inhibitory concentration (MIC) for cefotaxime. Fitness is expressed as this center concentration normalized to the center concentration of a reference (in this case, the resistance conferred by TEM-15). One mutation, F230S, was particularly deleterious (wF230S = 0.09 ± 0.01, with the fitness of TEM-15 set to a value of 1.0) and remained so even in combination with one to two other mutations, making it a highly likely starting point on the evolutionary pathway resulting in BS-NEG-4.

Analysis of fitness landscapes and evolution pathways using selection-weighted attraction graphing

We sought a method to visualize the high-dimensional space of the fitness landscape composed of the 512 potential evolutionary intermediates between TEM-15 and BS-NEG-4. For this purpose, we introduce selection-weighted attraction graphing (SWAG), a force-directed graphing method in which nodes represent sequences and edges connect nodes differing by only one missense mutation (Fig. 4). Forces between connected nodes are weighted proportionally to the selection strength, as determined by the nodes’ respective fitness values. These forces are applied on the xy plane, and the graph can be depicted in three dimensions by adding fitness as the z axis. In the graph, edges represent possible single steps in the evolutionary pathway, and the distance between connected nodes on the xy plane is inversely proportional to the likelihood of taking that step under positive selection.

Fig. 4 SWAG of the possible evolutionary intermediates between TEM-15 and BS-NEG-4.

Nodes are sequences, and edges connect nodes that differ by one missense mutation. Forces between connected nodes are weighted proportional to the selection strength, as determined by the node’s fitness values. (A) Two-dimensional SWAG. Edges are curved to aid visualization. (B to D) Three-dimensional SWAG has protein fitness as the z axis and is color-coded by (B) cluster, (C) whether the sequence contains F230S, and (D) protein fitness value.

SWAG analysis of the intermediates between TEM-15 and BS-NEG-4 resulted in eight major clusters of sequences representing local fitness peaks (Fig. 4, A and B). Such clustering was robust to the weighting of selection strength in the forces. No clustering was observed when fitness values were randomly reassigned to the nodes. Clusters comprise groups of sequences with many interconnections (fig. S8) that tend to lead in succession to ever-increasing fitness values. The graph can be seen to have two “ranges” of four clusters each separated by a large gap devoid of nodes (Fig. 4). The two ranges differ by whether their sequences contain F230S (Fig. 4C). The range containing F230S lies far from the start point TEM-15 and contains the endpoint BS-NEG-4. The proximal range contains TEM-15 and nearby nodes. Interrange edges primarily connect low-fitness alleles, typically with the lower fitness node lying in the F230S range.

Although we cannot exclude the possibility that BS-NEG-4 could result from positive selection, we can say that the evolution of BS-NEG-4 is extremely unlikely under positive selection. Of the 9! = 362,880 possible pathways between TEM-15 and BS-NEG-4, we could analyze 347,616 (the missing pathways are due to six alleles not being frequent enough in the library to analyze fitness data on those alleles). Of the 347,616 pathways analyzed, only 16 involve increasing fitness as each mutation is added (Fig. 5). Furthermore, evolution under positive selection is more likely to terminate at intermediates that are local maxima than to reach BS-NEG-4 (Fig. 6). The SWAG analysis pictorially depicts this result. First, eight of the possible nine initial mutations (the exception being F230S) are located in the proximal range (fig. S9). Second, there are many more steps within a cluster than between clusters (fig. S8A). Finally, steps within a cluster tend to be fitter sequences than steps between clusters (fig. S8B), making intercluster steps even less likely. Thus, starting from TEM-15, the most likely mutational pathways in a constant, positive selection environment progress first into one of the clusters of the proximal range and then preferentially climb within that cluster to local fitness maxima. However, when the environment changes such that low resistance is beneficial (that is, under negative selection), mutational steps to low-fitness values in the depths of the F230S range become advantageous (Fig. 4D). Then, upon a reversion back to the original environment (that is, back to positive selection), sequences tend to evolve up from the depths in the clusters of the F230S range, enabling access to BS-NEG-4. The environmental change can cause interrange bridges to form across gaps in the SWAG landscape that end in “gateway” sequences having F230S, allowing access to the sequence space of that range of peaks. Without this bridge, evolution of BS-NEG-4 is extremely unlikely. We conclude that evolution of BS-NEG-4 is extremely unlikely to occur under positive selection but can be accessed through a series of environmental changes that modulate the selective pressure on the gene.

Fig. 5 Radial graph of all analyzed pathways from TEM-15 to BS-NEG-4.

Each unit branch extending from the center indicates the gain of one of the nine amino acids from TEM-15 until the endpoint containing the full set of mutations is reached. (A) The line color indicates the log of the fitness change from the addition of the mutation. This graph pictorially shows that most pathways have steps with decreased fitness. (B) Same data as in (A), but with the color indicating the fitness relative to TEM-15 after the addition of the mutation. This graph pictorially shows that early steps in most pathways pass through mutants with fitness values less than that of TEM-15. Data is the same in A and B; only color coding is different.

Fig. 6 Three-dimensional SWAG of all evolutionary pathways in which fitness increases monotonically with each additional mutation.

Fitness is represented by height on the z axis. Color indicates clusters. Each node is sized by the likelihood of arrival, as measured by PageRank algorithm weighted by selection strength between nodes.

A deleterious gateway mutation allowing passage through a fitness valley via epistatic effects

We next turned our attention to the role of F230S in the evolution of BS-NEG-4. F230S is very deleterious to TEM-15 for cefotaxime resistance but is only mildly deleterious to ampicillin resistance in TEM-1 (36). The different effects of F230S on TEM-1 and TEM-15 might originate from a strong negative pairwise epistatic interaction with G238S, which lies at the opposite end of the same β-strand. Because we suspected F230S to be key in the evolution of this allele, we compared the epistatic and fitness effects of mutations of evolutionary intermediates containing F230S to intermediates lacking F230S (Fig. 7). Although alleles with F230S tend to have lower fitness, especially among intermediates with few mutations (Fig. 7A), several intermediates demonstrate higher fitness with the addition of F230S, including the global optimum among these sequences, which contains all but the Q90R and A184I mutations. We individually verified the MIC conferred by several of the fittest variants that contain F230S and observed up to a 32-fold improvement in resistance over that provided by GKTS (data S8). Seven of nine of the mutations of BS-NEG-4 are slightly beneficial as the sole mutation in TEM-15. However, the higher-order epistatic effect of combinations of these mutations in the absence of F230S is increasingly negative (Fig. 7B), consistent with previous observations that beneficial mutations usually interact with negative epistasis, creating diminishing returns (3739). In contrast, higher-order epistatic interactions in intermediates containing F230S are almost always positive [252 of 256 (98%)] and become increasingly positive as mutations accumulate up to about four to five mutations, where the higher-order epistasis plateaus at ε = +0.7. This epistasis value means that the average fitness of proteins having the F230S mutation is five times higher than expected based on the effects of the individual mutations in TEM-15. We emphasize that this strong positive epistasis with F230S only occurs in the higher-order epistasis terms. The benefits of F230S are not so apparent when one examines the effects of F230S on pairwise epistasis among the other eight mutations, though F230S tends to have a slightly positive effect on the pairwise epistasis between certain mutations (fig. S10). We suggest that F230S is more than just a key gateway mutation for traversing the fitness valley. F230S also serves to alleviate the negative interactions between beneficial mutations, creating the pathway to the final high-activity allele.

Fig. 7 Protein fitness and epistasis along possible evolutionary pathways between TEM-15 and BS-NEG-4.

(A) Protein fitness values of all possible evolutionary intermediates. (B) N-order epistasis between all mutations as a function of the number of mutations. Proteins containing F230S are noted in red, whereas all other proteins are shown in blue. Points are jittered for clarity. Larger points indicate the mean for each data set, with bars extending to the SD.


Our results are important for applied directed evolution experiments, which almost exclusively use positive selection as a strategy. In our study, a selection regimen including negative selection generated the highest activity variants analyzed and outperformed strict positive selection. This result may appear counterintuitive to protein engineers but could prove useful in escaping plateaus in protein functionality. A few directed evolution studies have demonstrated the advantage of neutral drift in the evolution of activity toward alternate substrates (1521), but our results are the first to show that a regimen involving negative selection can rival or even, perhaps, outperform other strategies, including strict positive selection and neutral drift. Although the neutral selection regimen performed here also accesses alternate fitness maxima, negative selection reached more distant sequences with superior properties on average. Thus, negative selection may be a more effective means of generating highly diverse and effective alternative starting points for directed evolution experiments.

Our study lends further credence to models of rough fitness landscapes shaped by epistasis. SWAG enabled us to visualize and gain insights into potential evolutionary pathways using network analysis. Our technique creates statistical inference from evolutionary data that can be understood intuitively. We demonstrate SWAG visualization of a landscape with an evolutionary valley and multiple fitness peaks. SWAG landscape visualization should be broadly useful for depicting fitness landscapes and evolutionary pathways. Our experiments address the predictions of Wright (1) regarding the importance of environmental change in evolutionary processes. We present evidence that environmental changes that alter the fitness landscape can extend access to areas of sequence space that are difficult to explore in a constant environment. In our particular example, this access resulted in superior evolutionary outcomes. Whether our result is the exception or the rule will require testing of other genes and systems.



All experiments used the expression plasmid pSkunk3 with a 9–base pair (bp) barcode indicating selection type downstream of the β-lactamase gene. The barcodes did not alter the level of antibiotic resistance conferred by the plasmid. The plasmid pSkunk3 contains the gene under the control of the isopropyl-β-D-thiogalactopyranoside (IPTG)–inducible tac promoter and confers spectinomycin resistance. Induction of β-lactamase expression was accomplished using 300 μM IPTG and the maintenance antibiotic spectinomycin (50 μg/ml).


PFunkel mutagenesis was performed as described (36, 40). Briefly, plasmids were transformed into E. coli CJ236 to prepare single-stranded template DNA containing uracil, using filamentous phage R408. Mutagenic oligonucleotides with degenerate codon NNN were targeted to every codon in the β-lactamase gene. Following mutagenesis, all reaction libraries were transformed into electrocompetent NEB 5-alpha cells (New England Biolabs) to prepare a naïve library for each round. All naïve libraries exceeded 106 transformants following mutagenesis.


For bandpass experiments, plasmid libraries were isolated from the NEB 5-alpha cells and transformed into E. coli SNO301 in the presence of plasmid pTS40 (22). Bandpass selections were performed, as described previously (22, 36), on solid LB agar plates with spectinomycin (25 μg/ml), chloramphenicol (25 μg/ml), and 300 μM IPTG at 37°C for 24 hours [except for tetracycline (10 μg/ml) and various concentrations of cefotaxime]. Selections under non-bandpass conditions were performed in NEB 5-alpha cells on LB broth agar containing 300 μM IPTG, spectinomycin (25 μg/m), and various concentrations of cefotaxime at 37°C for 24 hours. Roughly 106 colony-forming units were plated for each selection experiment. Cell stocks were not previously exposed to β-lactam antibiotics before selection. At the end of the experiment, the 47 alleles were picked for sequencing based on colony size on plates, with antibiotics at the final selection level (that is, the largest 47 colonies were picked, first from the highest-concentration plate and then from the next highest-concentration plate, as necessary to reach 47).

Screening and prevention of cross-contamination

Between rounds, the 9-bp barcodes on each plasmid were used to monitor for contamination via a polymerase chain reaction (PCR)–based screen. Primers were designed to specifically amplify DNA containing each barcode to diagnose the presence of cross-contamination between the selection experiments. In the event of contamination, the desired fraction of the contaminated library was amplified with the appropriate barcodes and ligated into fresh vector to eliminate the contaminants.

Deep sequencing of populations during evolution experiments

Bacteria were recovered from plates with LB broth containing 10% glycerol and pelleted by centrifugation. Plasmid DNA was isolated from the pelleted cells. Amplicons were prepared for deep sequencing via PCR from each plasmid prep with the addition of a 3-bp barcode on either side of the amplicons indicating the round from which each amplicon was isolated. PacBio deep sequencing was performed on these amplicons using a three-pass circular consensus. Custom MATLAB scripts described previously (36) were used to align, analyze, and quantify reads and mutation composition. Reads were filtered for quality score, length, and quantity of insertions and deletions. Each read was then aligned to the reference gene to identify barcodes and mutations.

Deep sequencing to determine protein fitness

Fitness values were determined essentially as previously described (36) and summarized here in a way that includes the minor differences. Bandpass selection selects for cells with activity at or near a particular level, as determined by the cefotaxime concentration. Thus, plating at a variety of different cefotaxime concentrations partitions the library into bins of relative fitness levels. Thus, by deep sequencing the plasmid DNA to determine the frequency of each allele on each plate, the fitness for each mutant can be determined.

To determine fitness values, we performed selections over 11 plates containing doubling levels of cefotaxime (from 0.01 to 10.24 μg/ml) under bandpass conditions [300 μM IPTG, spectinomycin and chloramphenicol (25 μg/ml), and tetracycline (10 μg/ml) on LB broth agar at 37°C for 24 hours]. Selected libraries were swept with LB broth with 10% glycerol, cells were pelleted, and plasmid DNA was isolated. Amplicons were generated directly from each plasmid DNA prep. During PCR, 3-bp barcodes indicating the cefotaxime concentration on which each sequence was found were added at both ends. After amplifying the library from each plate, PacBio deep sequencing data were generated for each selection condition. We used custom Matlab scripts to align, analyze, and quantify reads and amino acid mutation composition. Reads were filtered for quality score (reported average quality score > 30, or average reported probability of error = 10–3), length (length of read less than 1100 bp but more than 930 bp), and quality of alignment to the reference gene (entirety of reference gene aligned to read) and the barcode (perfect match accepted only). We tabulated allele counts for each plate. In our previous work (36), the counts were used directly in the fitness calculation. Here, we adjusted the counts to reflect the desired proportion of sequencing reads from each plate as determined by the frequency of colonies appearing on that plate. Next, for each protein sequence, the plate with the highest counts was found, and the counts in that plate and in the surrounding two plates (five in total) were used to calculate the final fitness measurement as before (36). Only fitnesses for protein sequences with five or more counts were considered. Fitness values (w) were defined relative to the TEM-15 fitness.

Phylogenetic analysis

We used the Hamming distance between any two given sequences at the amino acid level to construct a distance matrix using custom scripts in MATLAB (fig. S1). Next, we applied a neighbor-joining algorithm in MATLAB to generate the tree from all combined sequences and cluster branches. The distance tree was visualized using FigTree (41).

Selection-weighted attraction graphing

SWAG network analysis was accomplished using Gephi (42) and calculated fitness values. Forces between nodes were weighted by the calculated relative selection strength between nodes (sr, the ratio of fitnesses wN and wN+1 between node N and node N + 1, which differ by one amino acid mutation)Embedded Image(1)Nodes that differed by only one amino acid mutation were connected. The ForceAtlas2 algorithm (43) was used to generate graphs. Clustering was accomplished using the modularity algorithm weighted by selection strength. PageRank (44) was implemented and also weighted by selection strength.

Epistasis calculations

Pairwise epistasis between mutation A with fitness wA and mutation B with fitness wB was calculated asEmbedded Image(2)in which wο is the wild-type fitness (45). Higher-order epistasis of order N with mutations i,j,k,…,n was calculated using Eq. 3Embedded Image(3)Among three mutations A, B, and C, the amount by which a third mutation changes the epistasis is the same regardless of which mutation is considered the third mutationEmbedded Image(4)in which εi•j|k refers to the pairwise epistasis between i and j in the context of an allele containing mutation k. This calculation was used in fig. S10.

MIC testing

MIC testing was performed in NEB 5-alpha cells by growing 1 ml of each culture to be tested overnight in LB broth in the presence of spectinomycin (50 μg/ml). Saturated growths were diluted 1:100 and grown until above OD600 (optical density at 600 nm) = 0.3. All samples were then diluted back to OD600 = 0.003, and 1 μl was spotted onto LB broth agar plates with 300 μM IPTG and various concentrations of cefotaxime in twofold increments. Growth was evaluated after 24 hours of incubation at 37°C. The MIC was the lowest concentration of cefotaxime at which no growth was visible.


Supplementary material for this article is available at

Fig. S1. Heat map of the matrix of Hamming similarity between unique proteins found from 47 of the most resistant alleles from each selection regimen following round 8.

Fig. S2. Comparison of the differences in sequences obtained using the different selection regimen.

Fig. S3. Structural mapping of mutations found in each selection scheme following round 8.

Fig. S4. Heat map of the frequency of mutations observed during the negative evolution regimen, as analyzed by deep sequencing.

Fig. S5. Stacked frequency distribution of mutations found by deep sequencing at each step in the negative evolution regimen as a function of codon position over eight rounds.

Fig. S6. Stacked distribution of mutations found in the top 47 alleles from all selection strategies following round 8 as a function of codon position in β-lactamase.

Fig. S7. Structural mapping of mutations found in the BS-NEG-4 allele, which confers a high resistance to cefotaxime.

Fig. S8. Pathways between clusters found in SWAG analysis.

Fig. S9. Two-dimensional SWAG landscape of pathways between TEM-15 and BS-NEG-4 with the nine single mutants indicated.

Fig. S10. Change in pairwise epistasis (Δε3) between two mutations on the pathway from TEM-15 to BS-NEG-4 in response to the addition of F230S.

Data S1. Evolutionary history of selection schemes.

Data S2. Sequences resulting from the positive selection regimen (TEM-1).

Data S3. Sequences resulting from the positive selection regimen (TEM-15).

Data S4. Sequences resulting from the neutral selection regimen.

Data S5. Sequences resulting from the negative selection regimen.

Data S6. Sequences resulting from the oscillating selection regimen.

Data S7. Fitness and epistasis values for BS-NEG-4 pathways.

Data S8. Additional MIC data.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank the Johns Hopkins Deep Sequencing and Microarray Core Facility for performing the PacBio deep sequencing. Funding: This work was supported by the National Science Foundation (grants DEB-0950939 and CBET-1402101 to M.O.). Author contributions: B.S. and M.O. conceived the approach, analyzed the data, and wrote the paper. B.S. performed all experiments. Competing interests: M.O. is co-inventor on patent applications for the mutagenesis technique. This intellectual property has been licensed to Revolve Biotechnologies, for which M.O. serves as scientific advisor and has stock options. B.S. co-founded Revolve Biotechnologies and holds equity in the company. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
View Abstract

Stay Connected to Science Advances

Navigate This Article