## Abstract

Extreme events are ubiquitous in a wide range of dynamical systems, including turbulent fluid flows, nonlinear waves, large-scale networks, and biological systems. We propose a variational framework for probing conditions that trigger intermittent extreme events in high-dimensional nonlinear dynamical systems. We seek the triggers as the probabilistically feasible solutions of an appropriately constrained optimization problem, where the function to be maximized is a system observable exhibiting intermittent extreme bursts. The constraints are imposed to ensure the physical admissibility of the optimal solutions, that is, significant probability for their occurrence under the natural flow of the dynamical system. We apply the method to a body-forced incompressible Navier-Stokes equation, known as the Kolmogorov flow. We find that the intermittent bursts of the energy dissipation are independent of the external forcing and are instead caused by the spontaneous transfer of energy from large scales to the mean flow via nonlinear triad interactions. The global maximizer of the corresponding variational problem identifies the responsible triad, hence providing a precursor for the occurrence of extreme dissipation events. Specifically, monitoring the energy transfers within this triad allows us to develop a data-driven short-term predictor for the intermittent bursts of energy dissipation. We assess the performance of this predictor through direct numerical simulations.

## INTRODUCTION

A plethora of dynamical systems exhibit intermittent behavior manifested through sporadic bursts in the time series of their observables. These extreme events produce values of the observable that are several SDs away from its mean, resulting in heavy tails of the corresponding probability distribution. Important examples include climate phenomena (*1*, *2*), rogue waves in oceanic and optical systems (*3*–*5*), and large deviations in turbulent flows (*6*–*8*). Because these extreme phenomena typically have marked consequences, their quantification and prediction are of great interest.

Significant progress has been made in the computation of extreme statistics, both through direct numerical simulations and through indirect methods such as the theory of large deviations [see the work of Touchette (*9*) for a review]. Although these methods estimate the probability distribution of the extreme events, they do not inform us about the underlying mechanisms that lead to the extremes, nor are they capable of predicting individual extreme events.

For systems operating near an equilibrium or systems that are nearly integrable, perturbative methods have been successful in identifying the resonant interactions that cause the extreme events (*10*, *11*). For systems that are not perturbations from these trivial limits, a general framework for probing the transition mechanism to extreme states is missing. These systems, such as turbulent fluid flows and water waves, are also typically high-dimensional and nonlinear where the nonlinearities create a complex network of interdependent interactions among many degrees of freedom (*12*–*17*).

Here, we propose a variational framework to probe the underlying conditions that lead to extreme events in these high-dimensional complex systems. More specifically, we derive precursors of extreme events as the solutions of a finite-time constrained optimization problem. The functional to be maximized is the observable whose time series exhibit the extreme events. The constraints are designed to ensure that the triggers belong to the system attractor and therefore reflect physically relevant phenomena. If the lifetime of the extreme events are short compared to the typical dynamical time scales of the system, then the finite-time optimization problem can be replaced with its instantaneous counterpart. For the instantaneous problem, we derive the Euler-Lagrange equations that can be solved numerically using Newton-type iterations.

We apply the variational framework to the Kolmogorov flow, a two-dimensional Navier-Stokes equation driven by a monochromatic body forcing. At sufficiently high Reynolds numbers, this flow is known to exhibit intermittent bursts of energy dissipation rate (*18*, *19*). We first show that these extreme bursts are due to the internal transfers of energy through nonlinearities, as opposed to phase locking with the external forcing. Because of the high number of involved degrees of freedom and their complex interactions, deciphering the modes responsible for the extreme energy dissipation rate is not straightforward. However, the optimal solution to our variational method isolates the triad interaction responsible for this transfer of energy. Monitoring this triad along trajectories of the Kolmogorov flow, we find that, on the onset of the extreme bursts, the energy is transferred spontaneously from a large-scale Fourier mode to the mean flow, leading to growth of the energy input rate and consequently the energy dissipation rate.

We then use the derived large-scale mode as a predictor for the extreme events. Specifically, by tracking the energy of this mode, we develop a data-driven short-term predictor of intermittency in the Kolmogorov flow. We assess the effectiveness of the prediction scheme on extensive direct numerical simulations by explicitly quantifying its success rate, as well as the false-positive and false-negative rates.

## RESULTS

### Variational formulation of extreme events

Consider the general evolution equations(1A)(1B)(1C)where *u* : Ω × ℝ^{+} → ℝ^{d} belongs to an appropriate function space *X* and completely determines the state of the system. The initial condition *X* ∋ *u*_{0} : Ω → ℝ^{d} is specified at the time *t*_{0}, and Ω ⊂ℝ^{d} is an open bounded domain. The differential operators and can be potentially nonlinear. A wide range of physical models can be written as a set of partial differential equations (PDEs), as in Eqs. 1A to 1C. For instance, for incompressible fluid flows, Eq. 1A is the momentum equation and Eq. 1B is the incompressibility condition where (*u*) = ∇ · *u*. For simplicity, we will denote a trajectory of Eqs. 1A to 1C by *u*(*t*).

Let *I* : *X* → ℝ denote an observable whose time series *I*(*u*(*t*)) along a typical trajectory *u*(*t*) exhibits intermittent bursts (see Fig. 1A). Drawing upon the near-integrable case, we view the system as consisting of a background chaotic attractor, which has small regions of instability (see Fig. 1B). Once a trajectory reaches an instability region, it is momentarily repelled away from the background attractor, resulting in a burst in the time series of the observable. Our goal here is to probe the instability region(s) by using a combination of observed data from the system and the governing equations of the system. We also require the instability regions to have a nonzero probability of occurrence under the natural flow of the dynamical system. This constraint is of particular importance because it excludes “exotic” states with extreme growth of *I* but with a negligible probability of being observed in practice [see the constraint *C*(*u*_{0}) = *c*_{0} in Eq. 2B].

We formulate this task as a constrained optimization problem. Assume that there is a typical time scale τ ∈ ℝ^{+} over which the bursts in the observable *I* develop (see Fig. 1A). We seek initial conditions *u*_{0} whose associated observable *I*(*u*(*t*)) attains a maximal growth within time τ. More precisely, we seek the solutions to the constrained optimization problem(2A)(2B)where the optimization variable is the initial condition *u*(*t*_{0}) = *u*_{0} of systems 1A to 1C. The set of critical states is required to satisfy the constraints in Eq. 2B to enforce two important properties. The first property ensures that *u*(*t*) obeys the governing Eqs. 1A to 1C as opposed to being an arbitrary one-parameter family of functions. The second property *C*(*u*_{0}) = *c*_{0}, where *C* : *X* → ℝ^{k}, is a codimension *k* constraint. This constraint is enforced to ensure the nonzero probability of occurrence, that is, states that are sufficiently close to the chaotic background attractor. The set of probabilistically feasible states can be generally described by exploiting basic physical properties of the chaotic attractor such as average energy along different components of the state space or the second-order statistics. The precise form of the constraint *C*(*u*_{0}) = *c*_{0} is problem-dependent and will shortly be discussed in more detail. We point out that more general inequality constraints of the form *c*_{min} ≤ *C*(*u*_{0}) ≤ *c*_{max} may also be used. However, the treatment of these inequality constraints is not discussed in this paper.

We expect the set of solutions to problems 2A and 2B to unravel the mechanisms underpinning the intermittent bursts of the observable. Although it is unlikely that a generic trajectory of the system passes exactly through one of the maximizers, by continuity, any trajectory passing through a sufficiently small open neighborhood of the maximizer (that is, the instability regions of Fig. 1B) will result in a similar observable burst.

We emphasize that an optimization problem similar to Eqs. 2A and 2B has been pursued before in special contexts. The largest finite-time Lyapunov exponent can be formulated as Eqs. 2A and 2B, where the observable is the amplitude of infinitesimal perturbations after finite time. The maximizer is the corresponding finite-time Lyapunov vector (*20*, *21*). In a similar context, Pringle and Kerswell (*22*) seek optimal finite-amplitude perturbations that trigger transition to turbulence in the pipe flow. They formulate the unknown optimal perturbation as the solution of a constrained optimization problem similar to Eqs. 2A and 2B, with the observable being the *L*^{2} norm of the fluid velocity field. Ayala and Protas (*23*–*25*) consider the finite-time singularity formation for Navier-Stokes equations. They also use a variational method to seek the initial conditions that could lead to finite-time singularities. In these studies, the emphasis is given to the analysis of the most “unstable” states, but the physical properties of the attractor are not taken into account.

The standard approach for solving the PDE-constrained optimization problem (Eqs. 2A and 2B) is an adjoint-based gradient iterative method (*26*–*28*). This method is computationally very expensive because, at each iteration, the gradient direction needs to be evaluated as the solution of an adjoint PDE. If the growth time scale τ is small compared to the typical time scales of the observable, then it is reasonable to replace the finite-time growth problem (Eqs. 2A and 2B) with its instantaneous counterpart(3A)(3B)

Problems 3A and 3B seek initial states *u*_{0} for which the instantaneous growth of the observable *I* along the corresponding solution *u*(*t*) is maximal.

We point out that the large instantaneous derivatives of *I* do not necessarily imply a subsequent burst in the observable because the growth may not always be sustained at later times along the trajectory *u*(*t*). As a result, the set of solutions to this instantaneous problem may differ significantly from the finite-time problem (Eqs. 2A and 2B). Nonetheless, the solutions to the instantaneous problem can still be insightful. In addition, as we show below, these solutions can be obtained at a relatively low computational cost.

### Optimal solutions

First, we derive an equivalent form of Problems 3A and 3B. Taking the time derivative of the time series *I*(*u*(*t*)) yields (d/d*t*)*I*(*u*(*t*)) = d*I*(*u*; ∂_{t}*u*), where d*I*(*u*; *v*) := lim_{ε→0} [*I*(*u* + ε*v*) − *I*(*u*)]/ε denotes the Gâteaux differential of *I* at *u* evaluated along *v*. Using Eq. 1A to substitute for ∂_{t}*u*, we obtain the following optimization problem, which is equivalent to problems 3A and 3B(4A)(4B)where(5)

Note that the first constraint in Eq. 3B is simplified because we have already used Eq. 1A, and it only remains to enforce Eq. 1B. For notational simplicity, we omit the subscript from *u*_{0}.

If *J* : *X* → ℝ is a continuous map and the subset *S* = {*u* ∈ *X* : (*u*) = 0, *C*(*u*) = *c*_{0}} is compact in *X*, problems 4A and 4B have at least one solution. This follows from the fact that the image of a compact set under a continuous transformation is compact. Therefore, *J*(*S*) ⊂ℝ is compact, which implies that *J*(*S*) is bounded and closed (*29*). Therefore, *J* is bounded and attains its maximum (and minimum) on *S*. The uniqueness of the maximizer is not generally guaranteed. However, the set of maximizers (and minimizers) of *J* is a compact subset of *S* (*30*).

As we show in section S1, if *X* is a Hilbert space with the inner product 〈⋅,⋅〉_{X} and the operator is linear, then every solution of the optimization problem (Eqs. 4A and 4B) satisfies the set of Euler-Lagrange equations(6A)(6B)(6C)

Here, ^{†} is the adjoint of , and *J*′(*u*) and *C*_{i}′(*u*) are the unique identifiers of the Gâteaux differentials d*J*(*u*;⋅) and d*Ci*(*u*;⋅), such that d*J*(*u*;*v*) = 〈*J*′(*u*), *v*〉_{X} and d*C*_{i}(*u*;*v*) = 〈*C*_{i}′(*u*),*v*〉_{X} for all *v*. The existence and uniqueness of *J′*(*u*) and *C*_{i}′(*u*) are guaranteed by the Riesz representation theorem (*31*). Here, *C*_{i} are the components of the map *C* = (*C*_{1}, *C*_{2}, ⋯, *C*_{k}). The function α : Ω → ℝ and the vector β = (β_{1}, ⋯, β_{k}) ∈ ℝ^{k} are unknown Lagrange multipliers to be determined together with the optimal state *u* : Ω → ℝ^{d}.

### Application to Navier-Stokes equations

We consider the Navier-Stokes equations(7)where *u* : Ω × ℝ^{+} → ℝ^{d} is the fluid velocity field, *p* : Ω × ℝ^{+} → ℝ is the pressure field, and ν = *Re*^{−1} is the nondimensional viscosity, which coincides with the reciprocal of the Reynolds number *Re*. Here, we consider two-dimensional flows (*d* = 2) over the domain Ω = [0, 2π] × [0, 2π] with periodic boundary conditions. The flow is driven by the monochromatic Kolmogorov forcing *f*(*x*) = sin(*k*_{f}*y*)*e*_{1}, where *k*_{f} = (0, *k*_{f}) is the forcing wave number and the vectors *e*_{i} denote the standard basis in ℝ^{d}. In the following, we assume that the velocity fields are square integrable for all times, that is, *X* = *L*^{2} (Ω).

The kinetic energy *E* (per unit volume), the energy dissipation rate *D*, and the energy input rate *I* are defined, respectively, by(8)where |Ω| denotes the area of the domain, that is, |Ω| = (2π)^{2}. Along any trajectory *u*(*t*), these three quantities satisfy . We use the energy dissipation rate *D* to define the eddy turnover time, , where denotes the expected value.

The Kolmogorov flow admits the laminar solution . For the forcing wave number *k*_{f} = 1, the laminar solution is the global attractor of the system at any Reynolds number (*32*). If the forcing is applied at a higher wave number and the Reynolds number is sufficiently large, then the laminar solution becomes unstable. In particular, numerical evidence suggests that, for *k*_{f} = 4 and sufficiently large Reynolds numbers, the Kolmogorov flow is chaotic and exhibits intermittent bursts of energy dissipation (*18*, *33*, *34*). This is manifested in Fig. 2A, showing the time series of the energy dissipation *D* at Reynolds number *Re* = 40 with *k*_{f} = 4.

A closer inspection reveals that each burst of the energy dissipation *D* is shortly preceded by a burst in the energy input *I* (see Fig. 2B). Therefore, we expect the mechanism behind the bursts in the energy input to be also responsible for the bursts in the energy dissipation. As we show in section S2.1, the energy input is given by *I*(*u*) = −|*a*(*k*_{f})| sin(φ(*k*_{f})), where *a*(*k*) are the Fourier modes of the velocity field *u* and φ(*k*) are their corresponding phases such that *a*(*k*) = |*a*(*k*)| exp(*i*φ(*k*)). We refer to the Fourier mode *a*(*k*_{f}) as the mean flow. The energy input *I* can grow through two mechanisms: (i) alignment between the phase of the mean flow and the external forcing, that is, φ(*k*_{f}) → −π/2, and (ii) growth of the mean flow energy |*a*(*k*_{f})|.

Examining the alignment between the forcing and the velocity field rules out mechanism (i) (cf. fig. S1). The remaining mechanism (ii) is possible through the nonlinear term in the Navier-Stokes equation. This nonlinearity redistributes the system energy among various Fourier modes *a*(*k*) through triad interactions of the modes whose wave numbers (*k*, *p*, *q*) satisfy *k* = *p* + *q* (see section S2.2 for further details). Because of the high number of active modes involved in the intricate network of triad interactions, it is unclear which triad (or triads) is (are) responsible for the nonlinear transfer of energy to the mean flow during the extreme events. As we show below, our variational approach identifies the modes involved in this transfer. However, before obtaining the optimal solution, we need to specify the explicit form of the constraint *C*(*u*) = *c*_{0}.

### Constraints

The constraint *C*(*u*) = *c*_{0} is imposed to ensure that the optimal solutions are physically admissible, that is, that they are sufficiently close to the attractor and thus have a nonzero probability of occurrence. For instance, the solutions to a wide range of dissipative PDEs are known to converge asymptotically to a finite-dimensional subset of the state space (*35*). The maximizers of the functionals (Eq. 2A or Eq. 3A) that are far from this asymptotic attractor are physically irrelevant because they correspond to a transient phase that cannot be sustained.

In most applications, including the Kolmogorov flow, the system attractor is not explicitly known. Therefore, the physical relevance of the optimal solutions needs to be ensured otherwise. Here, we consider constraints of the form(9)where *A* is a linear operator. Several physically important quantities can be written as function 9. For instance, if *A* is the identity operator, then *C* coincides with the kinetic energy, *C*(*u*) = *E*(*u*). If *A* is the gradient operator, *A* = ∇, then we have *C*(*u*) = *D*(*u*) × (*Re*/2), which can be used to constrain the energy dissipation rate. A more general class of these operators can be constructed as follows. Let *u* = ∑_{i}α_{i}*v*_{i}, where {*v*_{i}} is a principal component basis, that is, it diagonalizes the covariance operator of *u*. Define *A* as the diagonal linear operator such that *A*(*v*_{i}) = *v*_{i}/σ_{i}, where is the SD of α_{i}. Then, the constraint takes the form . This corresponds to an ellipsoid, describing points that have equal probability of occurrence when we approximate the statistics of the background attractor by a Gaussian measure (*36*). Note that the constraints of Eq. 9 are codimension one, *C* : *X* → ℝ, and hence a special case of the codimension *k* constraint in Eq. 2B.

Excluding the intermittent bursts, the energy dissipation of the Kolmogorov flow exhibits small oscillations around its mean value [*D*] (see Fig. 2A). On the basis of this observation, we seek optimal solutions of Eqs. 4A and 4B, which are constrained to have the energy dissipation *D* = [*D*]. This results into the constraint (Eq. 9) with *A* = ∇ and *C*(*u*) = *c*_{0} = [*D*] × (*Re*/2). We approximate the mean value [*D*] from direct numerical simulations. At *Re* = 40, for instance, we have [*D*] ≃ 0.117.

### Probing the extreme energy transfers

The functional *J* (see Eq. 5) associated with the energy input *I* reads(10)

The associated Euler-Lagrange equations (Eqs. 6A to 6C) read(11A)(11B)(11C)where *J*′(*u*) = (∇*f* + ∇*f*^{T}) *u* + νΔ*f*, ^{†}(α) = −∇α, and *C*′(*u*) = *A*^{†} *Au* (see section S2.3 for the derivations). We set *A* = ∇ to enforce a constant energy dissipation constraint. This implies that *A*^{†}*Au* = −Δ*u*.

Using the symmetries of Eqs. 11A to 11C, we find that it admits the pair of exact solutions , , and . More complex solutions, with unknown closed forms, may exist. We approximate these solutions using the Newton iterations described in section S3.

At each *Re*, we initiated several Newton iterations from random initial conditions. In addition to the pair of exact solutions (*u*_{±}, α_{±}, and β_{±}), the iterations yielded one nontrivial solution. Figure 3 shows the resulting three branches of solutions including the exact solution *u*_{+} (solid black), the exact solution *u*_{−} (dashed black), and the nontrivial solution (red circles). For small Reynolds numbers, our Newton searches only returned the exact solutions. At *Re* ≃ 3.1, a bifurcation takes place where a new nontrivial solution is born. This solution appears to be a global maximizer because no other solutions were found. Because the intermittent bursts are only observed for *Re* > 35, we focus the following analysis on this range of Reynolds numbers.

The nontrivial optimal solution converges to an asymptotic limit as the *Re* increases. This is discernible from the plateau of the red curve in Fig. 3 and the select solutions shown in its outset. The three most dominant Fourier modes present in this asymptotic solution are the forcing wave number (0, *k*_{f}) and the wave numbers (1, 0) and (1, *k*_{f}) together with their complex conjugate pairs. Incidentally, these wave numbers form a triad, (0, *k*_{f}) + (1, 0) = (1, *k*_{f}). The dominant mode of the optimal solution corresponds to the wave number (1, 0) whose modulus |*a*(1, 0)| is one order of magnitude larger than the other nonzero modes.

Next, we turn to the direct numerical simulations of the Kolmogorov flow and monitor the three Fourier modes *a*(0, *k*_{f}), *a*(1, 0), and *a*(1, *k*_{f}). We find that the energy transfers within this triad underpin the intermittent bursts of the mean flow energy |*a*(0, *k*_{f})| and, hence, the energy input rate *I*. Figure 4A shows the time series of *I* = −Im[*a*(0, *k*_{f})] and the Fourier mode |*a*(1, 0)| along a typical trajectory of the Kolmogorov flow at *Re* = 40. The bursts of the energy input rate *I* are nearly concurrent with extreme dips in the modulus |*a*(1, 0)|. A similar concurrent behavior was observed for other trajectories and at higher Reynolds numbers (see section S6).

This observation reveals that, before a burst, the mode *a*(1, 0) transfers a significant portion of its energy budget to the mean flow *a*(0, *k*_{f}) through the triad interaction of the modes *a*(0, *k*_{f}), *a*(1, 0), and *a*(1, *k*_{f}). This leads to the increase in the energy of the mean flow and, therefore, the energy input rate *I*, which in turn leads to the growth of the energy dissipation rate *D*.

One can go one step further and inquire about the reason for the release of energy from mode *a*(1, 0) to the mean flow *a*(0, *k*_{f}). The answer to this question, involving the relative phases of the modes and their interactions with other triads, is beyond the scope of the present work and will be addressed elsewhere. It is tempting to study these interactions by truncating the Kolmogorov flow to the modes *a*(0, *k*_{f}), *a*(1, 0), and *a*(1, *k*_{f}) and their complex conjugates. Unfortunately, these severe truncations fail to illuminate because the dynamics of the truncated system severely departs from the original Navier-Stokes equations (*37*, *38*).

### Prediction of extreme events

Given the above observation that the optimal solution primarily consists of mode (1, 0), we choose this mode to formulate our data-driven prediction scheme. The decrease in the energy of the mode (1, 0) precedes the increase in the energy of the mean flow. This enables the data-driven short-term prediction of extreme bursts of the energy dissipation by observing the modulus |*a*(1, 0, *t*)|. More specifically, relatively small values of λ(*t*) := |*a*(1, 0, *t*)|, along a solution *u*(*t*), signal the high probability of an upcoming burst in the energy dissipation.

To quantify this, we consider the conditional probability(12)where is the maximum of the energy dissipation over the future time interval [*t* + *t*_{i}, *t* + *t*_{f}]. This conditional probability measures the likelihood of the future maximum value of the energy dissipation belonging to the interval [*D*_{1}, *D*_{2}], given that the present value of the indicator λ(*t*) belongs to [λ_{1}, λ_{2}]. The constant parameters *t*_{f} > *t*_{i} > 0 determine the future time interval [*t* + *t*_{i}, *t* + *t*_{f}]. In the following, we set *t*_{i} = 1 ≃ 2.2*t*_{e} and *t*_{f} = 2 ≃ *t*_{i} + 2.2*t*_{e}. The length of the time window *t*_{f} − *t*_{i} is long enough to ensure that the extreme event (if it exists) is contained in the time interval [*t* + *t*_{i}, *t* + *t*_{f}]. The choice of the prediction time *t*_{i} will be discussed shortly. The reported results are robust to small perturbations to all parameters.

Figure 5A shows the conditional probability density corresponding to Eq. 12. We observe that relatively small values of λ correlate strongly with the high future values of the energy dissipation *D*. For instance, when λ < 0.4, the value of *D*_{m} is most likely larger than 0.2. Conversely, when λ is larger than 0.4, the future values of the future energy dissipation *D*_{m} are smaller than 0.2.

We seek an appropriate value λ_{0} such that λ(*t*) < λ_{0} predicts an extreme burst of energy dissipation over the future time interval [*t* + *t*_{i}, *t* + *t*_{f}]. Denote the extreme event threshold by *D*_{e}, such that *D* > *D*_{e} constitutes an extreme burst of energy dissipation. We define the probability of an upcoming extreme event *P*_{ee} as(13)which measures the likelihood that *D*_{m}(*t*) > *D*_{e} assuming that λ(*t*) = λ_{0}. Here, we set the threshold of the extreme event *D*_{e} as the mean value of the energy dissipation plus 2 SDs, ≃ 0.194. The extreme event probability *P*_{ee} can be computed from the probability density shown in Fig. 5A (see section S5 for details).

Figure 5B shows the probability of extreme events *P*_{ee} as a function of the parameter λ_{0}. If, at time *t*, the values of λ(*t*) are larger than 0.5, the probability of a future extreme event, that is, *D*_{m}(*t*) > *D*_{e}, is nearly zero. The probability of a future extreme event increases as λ(*t*) decreases. At λ(*t*) ≃ 0.4, the probability is 50%. If λ(*t*) < 0.3, then the likelihood of an upcoming extreme event is nearly 100%. The horizontal dashed line in Fig. 5A marks the transition line from the low likelihood of an upcoming extreme event *P*_{ee} < 0.5 to the higher likelihood *P*_{ee} > 0.5. This line, together with the vertical line *D*_{m} = *D*_{e}, divides the conditional probability density into four regions: (I) correct rejections [*P*_{ee} < 0.5 and *D*_{m}(*t*) < *D*_{e}]: correct prediction of no upcoming extremes; (II) false positives [*P*_{ee} > 0.5 but *D*_{e}(*t*) < *D*_{e}]: the indicator predicts an upcoming extreme event but no extreme event actually takes place; (III) hits [*P*_{ee} > 0.5 and *D*_{m}(*t*) > *D*_{e}]: correct prediction of an upcoming extreme event; and (IV) false negatives [*P*_{ee} < 0.5 but *D*_{m}(*t*) > *D*_{e}]: An extreme event takes place, but the indicator fails to predict it.

A reliable indicator of upcoming extreme events must maximize the number of correct rejections (quadrant I) and hits (quadrant III) while, at the same time, having minimal false positives (quadrant II) and false negatives (quadrant IV). From nearly 100,000 predictions made, only 0.26% false negatives and 0.85% false positives were recorded. The number of hits was 5.6%, and the number of correct rejections amounts to 93.3% of all predictions made. As we show in the Supplementary Materials, this amounts to a 95.6% success rate for the prediction of the extreme events (see eq. S24 and table S1). Note that the high percentage of correct rejections compared to the hits is a mere consequence of the fact that the extreme events are rare.

An additional desirable property of an indicator is its ability to predict the upcoming extremes well in advance of the events taking place. The chosen prediction time *t*_{i} = 1 is approximately twice the eddy turnover time *t*_{e}. In comparison, it takes approximately one eddy turnover time (on average) for the energy dissipation rate to grow from to its extreme value .

The prediction time *t*_{i} can always be increased at the cost of increasing false positives and/or false negatives. For instance, with the choice *t*_{i} = 2 ≃ 4.3*t*_{e} and *t*_{f} = 3 ≃ *t*_{i} + 2.2*t*_{e}, prediction of the extreme events *D*_{m} > *D*_{e} returns 1.2% false negatives and 0.6% false positives. The number of hits decreases slightly to 5.3%, as does the number of correct rejections (92.9%), which amounts to a success rate of 82% in the extreme event prediction (see eq. S24). Therefore, the prediction time *t*_{i} = 2 still yields satisfactory predictions. Upon increasing *t*_{i} further, eventually, the number of hits becomes comparable to the number of false negatives, at which point the predictions are unreliable.

## DISCUSSION

A method for the computation of precursors of extreme events in complex turbulent systems is introduced here. The new approach combines basic physical properties of the chaotic attractor (such as energy distribution along different directions of phase space) obtained from data with stability properties induced by the governing equations. The method is formulated as a constrained optimization problem, which can be solved explicitly if the time scale of the extreme events is short compared to the typical time scales of the system. To demonstrate the approach, we consider a stringent test case, the Kolmogorov flow, which has a turbulent attractor with positive Lyapunov exponents and intermittent extreme bursts of energy dissipation. We can correctly identify the triad of modes associated with the extreme events. Moreover, the derived precursors allow for the formulation of an accurate short-term prediction scheme for the intermittent bursts. The results demonstrate the robustness and applicability of the approach on systems with high-dimensional chaotic attractors.

## MATERIALS AND METHODS

The Navier-Stokes equations and the corresponding Euler-Lagrange equations were solved numerically with a standard pseudospectral code with *N* × *N* Fourier modes and 2/3 dealiasing. For *Re* = 60, 80, and 100, we used *N* = 256 to fully resolve the velocity fields. However, at *Re* = 40, this resolution was unnecessarily high, and hence, we used *N* = 128. The temporal integration of the Navier-Stokes equations was carried out with a fourth-order Runge-Kutta scheme.

## SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/3/9/e1701533/DC1

section S1. Derivation of the Euler-Lagrange equation

section S2. The Navier-Stokes equation

section S3. Newton iterations

section S4. Sensitivity to parameters

section S5. Computing the probability of extreme events

section S6. Supporting computational results

fig. S1. Evolution of the energy input versus mean flow.

fig. S2. Triad interactions.

fig. S3. Sensitivity of the optimal solutions.

fig. S4. Joint PDFs for higher Reynolds numbers.

fig. S5. Prediction of intermittent bursts at higher Reynolds numbers.

table S1. Simulation parameters.

movie S1. The prediction of an extreme event in the Kolmogorov flow.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is **not** for commercial advantage and provided the original work is properly cited.

## REFERENCES AND NOTES

**Acknowledgments:**

**Funding:**This work was supported by Office of Naval Research grant N00014-15-1-2381, Air Force Office of Scientific Research grant FA9550-16- 1-0231, and Army Research Office grant 66710-EG-YIP.

**Author contributions:**M.F. and T.P.S. designed and performed the research and wrote the paper.

**Competing interests:**The authors declare that they have no competing interests.

**Data and materials availability:**All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors. M.F. and T.P.S. claim responsibility for all figures. Certain codes and data used in this manuscript are publicly available on GitHub (https://github.com/mfarazmand/VariExtEvent).

- Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).