## Abstract

Adaptation, where a population evolves increasing fitness in a fixed environment, is typically thought of as a hill-climbing process on a fitness landscape. With a finite genome, such a process eventually leads the population to a fitness peak, at which point fitness can no longer increase through individual beneficial mutations. Instead, the ruggedness of typical landscapes due to epistasis between genes or DNA sites suggests that the accumulation of multiple mutations (via a process known as stochastic tunneling) can allow a population to continue increasing in fitness. However, it is not clear how such a phenomenon would affect long-term fitness evolution. By using a spin-glass type model for the fitness function that takes into account microscopic epistasis, we find that hopping between metastable states can mechanistically and robustly give rise to a slow, logarithmic average fitness trajectory.

## INTRODUCTION

Adaptation dynamics has often been discussed in the context of fitness-parametrized landscapes, where the fitness of mutants or the fitness effects of mutations are assumed to be drawn from some distribution that could depend on the current fitness of the genotype (*1*–*3*). The resulting dynamics (e.g., fitness trajectories) depends heavily on the form of that distribution and its variation with fitness. When adopting such an approach, one often assumes an infinite genome such that a population never runs out of possible beneficial mutations.

Organisms have a finite genome, and in a large population where only beneficial mutations have a chance of fixing, one would expect the number of available beneficial mutations to deplete as cells continuously gain mutations in a constant environment (*4*–*6*). This occurs until the population eventually reaches a fitness peak where there are no possible beneficial mutations left. The fitness trajectory would therefore reach a plateau if this is the only single fitness peak in the landscape or if this process of gaining single beneficial mutations is the only way by which fitness can increase.

Interactions between loci in the genome, known as epistasis, imply that the fitness effect of a mutation depends on the states of other loci. In particular, even if individual mutations are deleterious, they could be jointly beneficial. These effects have been observed experimentally and result in the fitness landscape being rugged with many fitness peaks (*7*–*10*). In a large population, neutral and deleterious mutations cannot fix on their own, and these mutants would die out in a finite time. Escape from any local fitness peak would therefore require cells with a deleterious mutation to gain additional mutations before going extinct. This cell with multiple mutations then has a chance of fixing in the population if the mutations have a net beneficial effect. This mechanism of crossing fitness valleys has been termed stochastic tunneling and has been found to be important when the population size is sufficiently large (*11*, *12*). At a local fitness peak, this allows the population to continue increasing in fitness. However, how this affects the form of the long-term fitness trajectory has not been studied.

In this work, we investigate how stochastic tunneling affects the average fitness trajectory of an evolving population using a spin-glass model for the fitness function and a model for the dynamics of mutation accumulation that allows a deleterious mutation to fix through fitter double mutants. We find that while the trajectory that arises within the hill-climbing regime depends on the distribution of interaction strengths between mutational sites and eventually reaches a plateau, hopping between metastable states (MSs) can robustly give rise to a slow, logarithmic fitness trajectory that is reminiscent of glassy dynamics.

## RESULTS

### The model

We consider a genome as a finite sequence of *L* sites, where each site represents a DNA base or a gene and each site can take two possible discrete states, α* _{i}* = ±1 and

*i*= 1,2, …

*L*(Fig. 1A). The fitness, here equivalent to the exponential growth rate of a cell, depends on the states of all sites and is given by

*13*)], and

*F*

_{offset}is a constant whose value is chosen such that the fitness of the initial strain is 1. This is a convenient choice because we are interested in the trajectory of the relative fitness (i.e., fitness relative to the wild-type strain).

The strengths of the pairwise interactions are captured by the symmetric interaction matrix *J* (Fig. 1A) and are randomly drawn from the following distribution* _{h}* = (1 − β)Δ and

*L*), while Δ affects the fitness effects of mutations. In our analysis, Δ is chosen such that the fitness effects of fixed mutations are, on average, of the order of a few percent (

*14*). Because biological networks are typically sparse (

*15*,

*16*), we assume that ρ is small. One could also include an additional term in Eq. 1 representing weak fully connected pairwise interactions, but this does not seem to affect the relevant structural properties of the fitness landscape as long as stronger sparse interactions are present (section SC.). For the rest of the paper, we set ρ = 0.05, but our findings hold for other values of ρ as long as it is sufficiently small (sections SC and SE).

Our model can be extended to include higher-order interaction terms if desired and, in fact, approaches that of the commonly used Kauffman’s NK model in the limit of including orders up to *K* (*17*–*19*), which is the number of genes each gene interacts with, and in our model is determined by ρ. However, to keep things computationally tractable even for large ρ, we will not include these higher-order interactions in our analysis. It is important to note that although we only consider pairwise interactions, each site can interact with a large number of other sites. This is in contrast to the NK model with *K* = 2. Similar spin-glass–type models have also been used to study other systems such as the conformational states of proteins (*20*, *21*) and networks of neurons (*22*, *23*). Although our model assumes that the fitness effects of independent mutations add up, mapping to the other commonly used multiplicative null model (*24*) gives similar trajectories (section SA).

### Dynamics of mutation accumulation

We consider a population of fixed size *N* in a continuous culture such that growth occurs in a constant environment, and hence, the fitness landscape does not change over time. Our results also hold for batch cultures subject to the standard dilution protocol (section SF).

We model the dynamics using the Moran process, where, at each time step, a random cell is chosen to leave the population and a random cell is chosen to divide. We assume that mutations occur at a constant rate μ per cell per division and that all sites mutate with equal probability. Whenever a mutation occurs, its fitness effect *N*μ ≪ 1, where the time between successful mutations emerging is much longer than the time taken for a successful mutation to fix. This allows us to assume that if a mutation fixes, it immediately takes over the whole population. The population can then be thought of as performing an adaptive walk through a multidimensional genotypic space (Fig. 1C). Within this model, the fixation probability of a single mutant is known to be given by (*25*)

Taking ∣*s*∣ ≪ 1 and the large population limit, all deleterious and neutral mutations will eventually go extinct, while *p*_{f} ≈ *s* for *s* ≫ 1/*N*.

During the time taken for deleterious/neutral mutants to go extinct, it might be possible for them to gain a second mutation such that the net fitness effect *s*_{eff} of the double mutant is positive. This double mutant then has a chance of fixing in the population. When the system is at a local fitness maximum, this mechanism of gaining multiple mutations is the only way the fitness can continue to increase (Fig. 1, C and D). Hence, we investigated the stability of the local peaks to multiple mutations and found that, for a small ρ, many of the fitness maxima in our model are unstable to double mutations (section SC).

We therefore take into account the probability *p*_{d} that a single mutant with fitness effect *s*_{1} ≤ 0 gains a second mutation and eventually fixes, which in the large *N* limit and for ∣*s*_{1}∣ ≪ 1 is given by (*26*)*s*_{eff}〉 = 〈 max (*s*_{eff},0)〉 ≪ 1, with the averaging carried out over all *L* possible second mutation sites. Intuitively, *n*_{d} is at most ∼1/ ∣*s*_{1}∣ and *p*_{d} is the probability that a successful double mutant emerges during this period (*12*).

This implies that if a population is large enough such that *N* ≪ 1/μ) if μ is sufficiently small μ ≪ 〈*s*_{eff}〉. For typical *s*_{eff} of a few percent, this can be easily satisfied. For the SSWM assumption to hold, a large population already implies that the mutation rate must be very small: μ ≪ 1/*N*. Nevertheless, we still expect stochastic tunneling to dominate the fixation of deleterious mutations at large population sizes because for any given *N*μ, *p*_{f} (Eq. 3) decreases with *N* faster than *p*_{d} (Eq. 4) for *s* < 0.

To take into account the number of possible deleterious mutations, which is large for a long genome and can be larger than the number of possible beneficial double mutants, we constructed a typical quenched fitness landscape and, for any given state, calculated the relative probabilities of the next successful mutation event falling into one of three possible categories: (A) beneficial mutation fixing, (B) deleterious mutation fixing on its own, and (C) stochastic tunneling via a double mutant. Within our regime of interest, as long as the number of available beneficial mutations, which we refer to as the rank of the state, is positive, the next successful event must be of type A (Fig. 1E). At an MS (rank = 0), for any given *L*, the probability of C substantially outweighs that of B for sufficiently large *N* (Fig. 1F).

Because the fraction of first mutants in the population is always small, the double mutants can be assumed to be competing against the original population once it emerges. Therefore, given that a first deleterious mutant has produced a successful double mutant, the relative probabilities of the second mutation site are proportional to their fixation probabilities *p*_{f}(*s*_{eff}).

### Effective simulation model

Putting the above elements together, our effective simulation model is as follows: From any given state *s*, and accept this mutation with probability *p*_{f}(*s*). If *p*_{d} (Eq. 4), which is obtained by considering all possible *L* second mutation sites and calculating the effective selection coefficients of their corresponding double mutants *s*_{eff}. If a successful double mutant emerges, we then draw the second mutation site randomly with weights proportional to their fixation probabilities *p*_{f}(*s*_{eff}). For convenience, within our simulations, time is measured in terms of the number of mutational attempts (because at each time step we mutate a site and ask if the mutation successfully fixes). Although *N* does not explicitly enter in the simulations, we discuss in section SB the range of *N* for which our assumptions and results hold.

### Hill-climbing regime: Relaxation toward a fitness peak

We first consider the regime where the initial state is far from a fitness peak and adaptation brings the population closer to a fitness peak.

If there is no epistasis (*J* = 0), there is a single well-defined peak in the fitness landscape. As mutations accumulate, the number of available beneficial mutations left, which we refer to as the rank of the system, decreases monotonically until the system eventually reaches a fitness plateau (Fig. 2A). In this case, the average fitness trajectory is uniquely defined by the distribution *p*(*h*) of *h* and for *p*(*h*) that follows a normal distribution, *27*).

In the presence of epistasis (*J* > 0), a mutation at a specific site can be beneficial or deleterious depending on the states of other sites. This is known as sign epistasis because the sign of the fitness effect of a mutation changes with the accumulation of other mutations and is responsible for a rugged fitness landscape. The rank of the system is therefore no longer guaranteed to decrease monotonically with time. By varying the relative contribution of the field and interaction terms (Eq. 1), we found that while the trajectory still follows a power law (Fig. 2, A and B), the relaxation to the first encountered local fitness maximum slows down as the relative strength of the interaction term increases (section D). Here, as in the case when *J* = 0, the exact form of the fitness trajectory is also sensitive to the chosen form of *p*(*h*) and *p*(*J*) but eventually reaches a fitness plateau.

### Hopping between MSs

With multiple local fitness maxima, the only way for fitness to continue increasing from any of these maxima is to first gain a deleterious mutation (that does not fix by itself) followed by one or more beneficial ones (Fig. 1D). In our model, these fitness maxima are MSs that are stable to single mutations but are unstable to double mutations. Once the system escapes from an MS via a double mutant, it may enter a state with rank > 0, which is now able to gain one or more single beneficial mutations before again entering another MS, and the process repeats itself. This dynamics results in a scenario where the system spends a long time in MSs but transits between MSs relatively quickly. The phenomenon that fitness goes through periods of stasis followed by rapid jumps is known as “epochal dynamics” and has also been observed and studied in other models, such as nearly neutral holey fitness landscapes, where the epochal nature comes about from degeneracies in the genotype-to-fitness mapping (*28*–*30*).

To obtain the average fitness trajectory, instead of repeating the simulations multiple times (which would be computationally expensive because the escape events from MSs are rare and the space of possible trajectories is large), we constructed a Markov chain by mapping out all the possible states that a state can go to and by calculating the corresponding transition rates (Fig. 3A). We start the system in a randomly chosen MS (by randomly drawing a genotypic state and successively flipping sites chosen from all possible beneficial mutation sites with equal probability), assuming that the cells are already relatively close to a local fitness peak. Because double-mutation transitions are typically much slower than single-mutation transitions (Fig. 3B), we only include double-mutation transitions out of MSs, i.e., when there are no possible single-mutation transitions out of a state. The transition probability of going to a state via a single mutation at site *i* with effect *s*^{(i)} > 0 is given by *p _{ij}* of going to a double mutant state via mutations at sites

*i*and

*j*with net effect

*s*

^{(ij)}is given by

*i*followed by

*j*, and vice versa for the second term. The fixation probability of a mutant

*p*

_{f}(

*s*) and the probability of a deleterious single mutant becoming a successful double mutant

*p*

_{d}(

*s*

_{1}, 〈

*s*

_{eff}〉) (Eq. 4) are as defined previously, and 〈

*s*

^{(ik)}〉 = ∑

*max(*

_{k}*s*

^{(ik)},0)/

*L*. The probability of staying in the same state after a mutation event is then 1 − ∑

*− ∑*

_{i}p_{i}_{i, j}

*p*

_{i, j}. By propagating the dynamics using this matrix (Methods), we find that the average fitness increases approximately logarithmically with time (Fig. 3C).

Because the time taken to escape from an MS is much longer than the time spent between MSs, the dynamics is governed by the hopping between these MSs. This is reminiscent of Bouchaud’s trap model, where these MSs can be considered as “traps” with long trapping times τ (*31*), which is the average time spent in the state before transiting to another state. We also find that the average trapping time 〈τ〉 increases with time (Fig. 3D), supporting the concept that the system becomes trapped in deeper and deeper states that are harder to get out of. This is also why our model exhibits aging—the dependence on the system properties on the time from the start of the experiment—one of the hallmarks of glassy systems (section SE).

In the trap model, one assumes that the time taken to overcome energy barriers follows the Arrhenius law (*31*). Here, although we did not a priori assume any relationship between the trapping time τ and the fitness of a state, we find that, on average, τ increases exponentially with *F* (Fig. 3E). Because *t* ∼ τ and *F* ∼ log(τ), the logarithmic fitness trajectory *F* ∼ log(*t*) naturally emerges.

There are two factors that determine τ: (i) the number of escape paths *n*_{p} (i.e., number of possible double mutations that increases *F*) and (ii) the average escape rates through one of those escape paths 〈λ〉. More specifically

While λ depends on the fitness difference between two states and is a representative of the energy barrier between states, *n*_{p} governs the strength of the entropic barrier, which refers to the low probability of a favorable set of mutations occurring. We find that the exponential increase in τ with *F* is predominantly due to the exponential decrease in *n*_{p} with *F* (Fig. 3F) (section SE). This could be related to the exponential decrease in the number of local minima with energy in typical spin-glass energy landscapes (section SE) (*32*–*34*). Repeating the analysis with transition probabilities drawn from a uniform distribution gives a similar fitness trajectory (section SE), suggesting that it can be attributed to the following property of the fitness landscape: The number of beneficial double mutants out of an MS is negatively correlated with the fitness of the MS (section SE). This mechanism for generating a slow fitness trajectory is also robust to the choice of parameter distributions (section SE).

## DISCUSSION

Biological systems in a constant environment exhibit quenched disorder, e.g., the interactions between genes are fixed and encoded in the genome, and what we typically observe experimentally are individual trajectories on a specific fitness landscape defined by a fixed set of parameters rather than the averaged trajectory over all possible realizations of the landscape. While there have been attempts to map out the fitness landscape for a few genes (*35*–*37*), repeating the process for the full landscape over all possible genotypes currently seems out of reach. In this work, we illustrate the utility of a microscopic model that explicitly takes into account specific quenched interactions between sites on a genome. We find that such a model captures realistic features of the dynamics and can provide insight into long-standing questions such as what the possible mechanisms driving long-term evolution are.

The dynamics of adaptation depends on both the fraction of available beneficial mutations in the genome and the distribution of their fitness effects. While it is common in other models to make assumptions about the beneficial mutation rate and the specific form for the fitness effect distribution, including its variation with current fitness (“macroscopic epistasis”), these aspects of the dynamics, such as the decrease in the beneficial mutation rate, emerge naturally from the microscopic model presented here.

If the number of beneficial mutations is depleted over time, as it occurs in our model, the system eventually reaches a local fitness maximum. In this case, although adaptation is commonly thought of as a hill-climbing process that only involves the accumulation of beneficial mutations, the escape from MSs via multiple mutations is crucial for the continued increase in fitness over many generations and can robustly give rise to a logarithmic fitness trajectory.

The slow fitness trajectory that emerges from hopping between MSs arises due to the ruggedness of the fitness landscape. This is reminiscent of glassy dynamics, where the existence of multiple local energy minima has also been the cause of slow relaxations in many other models of glassy systems such as spin glasses, structural glasses, electron glasses, polymers, and granular materials (*38*–*43*). However, the exact dynamics for the fitness increase in our model is different from that in these other scenarios and thus provides an alternative mechanism for how slow relaxations could arise in a real system.

A logarithmic fitness trajectory has also been known to emerge on an uncorrelated, House-of-Cards fitness landscape, where mutant fitness values are assumed to be drawn from the same distribution regardless of the fitness of the current state (*1*, *44*). If this distribution has an exponentially decaying tail, the number of available beneficial mutations will also decrease exponentially with fitness. This has a similar flavor to the dynamics observed in our model in that the slowness of the fitness trajectory is governed by entropic factors. However, it has been shown that the number of steps to a local maximum *N*_{steps} on an uncorrelated landscape is very short [*N*_{steps} ∼ log(*L*)] (*4*). In contrast, the fitness landscape studied here is highly correlated (fig. S1B), and hence, both *N*_{steps} and the number of steps between successive MSs *d* increase linearly with *L* (section SC). The presence of long-range correlations shows that the fitness values of successive MSs are still highly correlated. If each genotype was allowed to jump beyond the correlation length of the landscape by gaining many [∼*O*(*L*)] mutations in one step, the dynamics would effectively correspond to that on an uncorrelated landscape. Studies of this “long-jump” adaptation have been carried out for the evolution of NK Boolean networks, in which a logarithmic fitness trajectory has been observed (*44*). However, here, we found that even when considering realistic dynamics for the accumulation of single individual mutations, as long as one takes into account the phenomenon of stochastic tunneling, a slow, logarithmic average fitness trajectory can still arise on such a highly correlated landscape.

Our model can potentially be extended to account for different forms of the interaction matrix. We have so far assumed that each site interacts with a random subset of other sites, but there might be some other structure in the interaction network between genes that may affect the dynamics. For example, genes might be connected in such a way where beneficial mutations always change potential deleterious mutations into beneficial ones. This could maintain or increase the rank of the system even if it is far from an MS.

Together, our theoretical study demonstrates the utility of a microscopic model in providing a mechanistic understanding of the evolutionary dynamics and in allowing us to probe details of the system that might not be accessible in a macroscopic model. Using this approach, we explored the consequences of epistasis on fitness trajectories, both in the hill-climbing regime and after the population reaches a fitness peak, and found that hopping between MSs via stochastic tunneling can robustly give rise to a logarithmic trajectory.

## METHODS

### Obtaining average fitness trajectories

We obtained the probability distribution *k* using *M _{ij}* is the probability of going from state

*j*to state

*i*in one time step (i.e., after each mutation event).

The elements of this transition matrix are determined as follows:

1) From a non-MS, the transition probability *p _{i}* of going to a state via a single mutation at site

*i*with effect

*s*

^{(i)}> 0 is given by

2) From an MS, the probability *p _{ij}* of going to a double mutant state via mutations at sites

*i*and

*j*with net effect

*s*

^{(ij)}is given by

*i*followed by

*j*, and vice versa for the second term, the fixation probability of a mutant

*p*

_{f}(

*s*) = max(

*s*,0), 〈

*s*

^{(ik)}〉 = ∑

*max(*

_{k}*s*

^{(ik)},0)/

*L*, and the probability of a deleterious single mutant becoming a successful double mutant

3) The probability of staying in the same state after a mutation event is 1 − ∑* _{i}p_{i}* − ∑

_{i, j}

*p*

_{i, j}.

With this transition matrix, the average fitness *F _{k}* at time

*k*can then be found from

## SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/7/eaav3842/DC1

Section SA. Choice of null model for the combined effect of multiple independent mutations

Section SB. Range of validity for population size *N*

Section SC. Structural features of fitness landscapes

Section SD. Relaxation to local fitness maximum

Section SE. Hopping between MSs

Section SF. Batch culture

Fig. S1. Properties of fitness landscape as a function of *L*.

Fig. S2. Properties of fitness peaks as a function of *L* when additional weak interactions

between all sites are included.

Fig. S3. Figure showing how properties of MSs vary with fitness for ρ = 0.01 (black diamonds, *F*_{offset} = −3.4), ρ = 0.025 (red circles, *F*_{offset} = −3.8), ρ = 0.05 (green triangles, *F*_{offset} = −4.2), ρ = 0.075 (blue squares, *F*_{offset} = −4.4), and ρ = 0.1 (purple crosses, *F*_{offset} = −4.5).

Fig. S4. The number of connecting MSs *n*_{s}, which is the number of MSs that the system can transit to next from the current MS, correlates with the number of double mutant escape paths out of a state *n*_{p}.

Fig. S5. Relaxation toward a single local fitness maximum slows down with increasing degree of epistasis.

Fig. S6. Changing the distribution of fixation probabilities does not significantly change the functional form of the fitness trajectory.

Fig. S7. Other distributions for the nonzero elements of the interaction matrix give similar form for the fitness trajectory.

Fig. S8. Logarithmic fitness trajectories are also observed for different values of ρ.

Fig. S9. The decay of the two-time correlation function depends on both the time difference Δ*t* and the initial time of the measurement *t*_{w}, implying that the system ages.

Reference (*45*)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is **not** for commercial advantage and provided the original work is properly cited.

## REFERENCES AND NOTES

**Acknowledgments:**We thank B. Good, M. Desai, J. Lin, and P.-Y. Ho for useful discussions and feedback.

**Funding:**This research was supported by the National Science Foundation through MRSEC DMR 14-20570 and PHY-1125915, the Alfred P. Sloan Foundation Research Fellowship, the Harvard Deans Competitive Fund for Promising Scholarship, the Milton Fund, the Kavli Foundation, and the Volkswagen Foundation.

**Author contributions:**Y.G., M.V., and A.A. designed the research, performed the research, and wrote the paper.

**Competing interests:**The authors declare that they have no competing interests.

**Data and materials availability:**All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

- Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).