Research Article | Applied Physics

Combinatorial optimization by simulating adiabatic bifurcations in nonlinear Hamiltonian systems


Science Advances  19 Apr 2019:
Vol. 5, no. 4, eaav2372
DOI: 10.1126/sciadv.aav2372

Abstract

Combinatorial optimization problems are ubiquitous but difficult to solve. Hardware devices for these problems have recently been developed by various approaches, including quantum computers. Inspired by recently proposed quantum adiabatic optimization using a nonlinear oscillator network, we propose a new optimization algorithm simulating adiabatic evolutions of classical nonlinear Hamiltonian systems exhibiting bifurcation phenomena, which we call simulated bifurcation (SB). SB is based on adiabatic and chaotic (ergodic) evolutions of nonlinear Hamiltonian systems. SB is also suitable for parallel computing because of its simultaneous updating. Implementing SB with a field-programmable gate array, we demonstrate that the SB machine can obtain good approximate solutions of an all-to-all connected 2000-node MAX-CUT problem in 0.5 ms, which is about 10 times faster than a state-of-the-art laser-based machine called a coherent Ising machine. SB will accelerate large-scale combinatorial optimization harnessing digital computer technologies and also offer a new application of computational and mathematical physics.

INTRODUCTION

In social life and industry, one often encounters problems of finding the best combination among an enormous number of candidates. These combinatorial optimization problems are mathematically formulated as minimization or maximization of certain functions of discrete variables, which are called objective functions or cost functions. These problems are notoriously difficult because of combinatorial explosion (1, 2). Thus, special-purpose hardware devices for these problems are expected to be useful. In particular, "Ising machines" designed for finding a ground state of the Ising spin model (3) have recently attracted much attention because many combinatorial optimization problems can be mapped to the Ising problem (4), including very-large-scale integrated circuit design (5), drug design (6), and financial portfolio management (7). These machines have been developed by various approaches: quantum computers based on quantum annealing (8, 9) or quantum adiabatic optimization (10–12) implemented with superconducting circuits (13, 14), coherent Ising machines (CIMs) implemented with laser pulses (15–20), and Ising machines based on simulated annealing (SA) (1, 21, 22) implemented with digital circuits such as field-programmable gate arrays (FPGAs) (23–29).

Inspired by recently proposed quantum adiabatic optimization using a nonlinear oscillator network (30–34), here, we propose a new heuristic algorithm for the Ising problem. In this algorithm, which we call simulated bifurcation (SB), we numerically simulate adiabatic evolutions of classical nonlinear Hamiltonian systems exhibiting bifurcation phenomena (35), where the two branches of the bifurcation in each nonlinear oscillator correspond to the two states of each Ising spin (30, 34). [The CIMs also use two branches of a bifurcation for the two states of an Ising spin (15–20), but they are not Hamiltonian systems.] Unlike simulations of quantum systems, we can efficiently simulate the Hamiltonian systems (36) using current digital computers. This enables treatment of large-scale problems with dense connectivity, which are challenging for near-term quantum computers. Moreover, SB allows one to update variables simultaneously at each time step. This is in contrast to SA, which generally requires one-by-one updating to guarantee convergence. [Simultaneous updating is allowed only for isolated spins (22, 24).] This difference between SB and SA implies that accelerating SB by massively parallel processing is easier than accelerating SA. Exploiting these advantages, we implement an ultrafast all-to-all connected 2000-spin Ising machine based on SB with a single FPGA and demonstrate that our machine is about 10 times faster than a state-of-the-art CIM (17, 19). In addition, we also solve an all-to-all connected 100,000-spin Ising problem with continuous parameters by SB using a graphics processing unit (GPU) cluster, which is about 10 times faster than our best SA and about 1000 times faster than the SA software provided by (22). This suggests that SB can accelerate large-scale combinatorial optimization without specially designed hardware devices or custom circuits.

While quantum adiabatic optimization is based on the quantum adiabatic theorem (37, 38), SB is based on adiabatic and chaotic (ergodic) evolutions of classical nonlinear Hamiltonian systems. We will discuss this mechanism of SB using the simulation results of a simple example instead of giving a mathematically rigorous proof of the operational principle of SB, which is left as an interesting problem for future work.

RESULTS

SB algorithm

The N-spin Ising problem without external magnetic fields is formulated as follows. A dimensionless Ising energy is given by

$E_{\mathrm{Ising}}(\mathbf{s}) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} J_{i,j}\, s_i s_j$   (1)

where $s_i$ denotes the $i$th Ising spin, which takes a value of 1 or −1, $\mathbf{s} = (s_1, s_2, \dots, s_N)$ is the vector representation of a spin configuration, and $J_{i,j}\,(= J_{j,i})$ is the coupling coefficient between the $i$th and $j$th spins ($J_{i,i} = 0$). The problem is to find a spin configuration minimizing the Ising energy. This problem is mathematically equivalent to a famous combinatorial optimization problem named MAX-CUT (5, 17, 18, 39): divide the nodes of a weighted graph into two groups maximizing the total weight (called "cut value") of the edges cut by the division. By setting the coupling coefficients as $J_{i,j} = -w_{i,j}$ ($w_{i,j} = w_{j,i}$ is the weight between the $i$th and $j$th nodes), we can solve MAX-CUT using Ising machines (see also the Supplementary Materials) (5, 17, 18). Since the total number of spin configurations is $2^N$, it is extremely difficult to find exact solutions for large N. Unless the topology of the graph has some special structure, such as two-dimensional lattices, this problem is known to be nondeterministic polynomial time (NP)–hard (2, 3). Hence, it is strongly believed that there exists no efficient exact algorithm for this problem. In practice, one often uses heuristic algorithms, such as SA, to find approximate but practically useful solutions as fast as possible. Here, we propose SB as a new option for these heuristic algorithms. [For obtaining exact solutions of small-size problems, the SA machine called "Digital Annealer" in (29) may be the fastest so far.]
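As a concrete illustration of Eq. 1 and of the MAX-CUT mapping $J_{i,j} = -w_{i,j}$, the following minimal Python sketch (not part of the paper; the instance size and random ±1 weights are illustrative assumptions) evaluates the Ising energy and the corresponding cut value of a spin configuration.

```python
# Minimal sketch: Ising energy of Eq. 1 and the MAX-CUT cut value under J = -w.
import numpy as np

def ising_energy(J, s):
    """E_Ising(s) = -1/2 * sum_ij J_ij s_i s_j, with J symmetric and zero diagonal."""
    return -0.5 * s @ J @ s

def cut_value(w, s):
    """Total weight of the edges cut by the partition encoded in s (s_i = +1/-1)."""
    # An edge (i, j) is cut when s_i != s_j, i.e., (1 - s_i s_j)/2 = 1.
    return 0.25 * np.sum(w * (1.0 - np.outer(s, s)))

# Toy instance: random +/-1 weights on an all-to-all graph (illustrative only).
rng = np.random.default_rng(0)
N = 8
w = np.triu(rng.choice([-1.0, 1.0], size=(N, N)), k=1)
w = w + w.T                      # symmetric weights, zero diagonal
J = -w                           # MAX-CUT -> Ising mapping used in the text
s = rng.choice([-1.0, 1.0], size=N)
print(ising_energy(J, s), cut_value(w, s))
```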

For the Ising problem, a new kind of quantum adiabatic optimization using a network of Kerr-nonlinear parametric oscillators (KPOs) has recently been proposed (30–34). The quantum mechanical Hamiltonian in this approach is given by (30, 34)

$H_q(t) = \hbar \sum_{i=1}^{N} \left[ \frac{K}{2} a_i^{\dagger 2} a_i^{2} - \frac{p(t)}{2} \left( a_i^{\dagger 2} + a_i^{2} \right) + \Delta_i a_i^{\dagger} a_i \right] - \hbar \xi_0 \sum_{i=1}^{N} \sum_{j=1}^{N} J_{i,j} a_i^{\dagger} a_j$   (2)

where ℏ is the reduced Planck constant, $a_i^{\dagger}$ and $a_i$ are the creation and annihilation operators, respectively, for the $i$th oscillator, K is the positive Kerr coefficient, p(t) is the time-dependent parametric two-photon pumping amplitude, $\Delta_i$ is the positive detuning frequency between the resonance frequency of the $i$th oscillator and half the pumping frequency, and $\xi_0$ is a positive constant with the dimension of frequency. The initial state of each KPO is set to the vacuum state, and the pumping amplitude p(t) is gradually increased from zero to a sufficiently large value. The constant $\xi_0$ is set to a sufficiently small value such that the vacuum state is the ground state of the initial Hamiltonian (30, 34). Then, each KPO finally becomes a coherent state with a positive or negative amplitude via a quantum adiabatic bifurcation (30, 34), and the sign of the final amplitude for the $i$th KPO provides the $i$th Ising spin of the ground state of the Ising model, which is guaranteed by the quantum adiabatic theorem (30, 34).

The corresponding classical Hamiltonian system is derived from a classical approximation where the expectation value of $a_i$ is approximated as a complex amplitude $x_i + \mathrm{i} y_i$ (where i denotes the imaginary unit) (30, 34). Here, the real and imaginary parts, $x_i$ and $y_i$, are a pair of canonical conjugate variables, such as a position and a momentum, for the $i$th KPO. The classical mechanical Hamiltonian and the equations of motion for this classical system are given by (30, 34)

$H_c(\mathbf{x}, \mathbf{y}, t) = \sum_{i=1}^{N} \left[ \frac{K}{4} \left( x_i^2 + y_i^2 \right)^2 - \frac{p(t)}{2} \left( x_i^2 - y_i^2 \right) + \frac{\Delta_i}{2} \left( x_i^2 + y_i^2 \right) \right] - \frac{\xi_0}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} J_{i,j} \left( x_i x_j + y_i y_j \right)$   (3)

$\dot{x}_i = \frac{\partial H_c}{\partial y_i} = \left[ K \left( x_i^2 + y_i^2 \right) + p(t) + \Delta_i \right] y_i - \xi_0 \sum_{j=1}^{N} J_{i,j} y_j$   (4)

$\dot{y}_i = -\frac{\partial H_c}{\partial x_i} = -\left[ K \left( x_i^2 + y_i^2 \right) - p(t) + \Delta_i \right] x_i + \xi_0 \sum_{j=1}^{N} J_{i,j} x_j$   (5)

where the dots denote differentiation with respect to time t. It was empirically found that this classical system can also find good approximate solutions with considerably high probability, where the sign of the final $x_i$ provides the $i$th spin of the solutions (30, 34). This result suggests a new approach to the Ising problem. Unlike quantum computers, we do not need to build a real machine described by the above Hamiltonian, because, instead, we can simulate such a machine efficiently using classical computers.
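For reference, a short Python sketch (our illustration, with arbitrary placeholder parameter values and a uniform detuning) of the right-hand sides of Eqs. 4 and 5; note that it contains two product-sum operations, $J\mathbf{y}$ and $J\mathbf{x}$, a point that becomes relevant when comparing with the simplified SB equations below.

```python
# Sketch (not the authors' code) of the classical KPO equations of motion,
# Eqs. 4 and 5, assuming a uniform detuning Delta_i = Delta.
import numpy as np

def kpo_rhs(x, y, p, J, K=1.0, Delta=1.0, xi0=0.1):
    """Return (dx/dt, dy/dt) for the classical KPO network."""
    r2 = x**2 + y**2
    xdot = (K * r2 + p + Delta) * y - xi0 * (J @ y)   # Eq. 4
    ydot = -(K * r2 - p + Delta) * x + xi0 * (J @ x)  # Eq. 5
    return xdot, ydot
```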

However, Eqs. 4 and 5 are not suitable for fast numerical simulation. By disregarding some terms proportional to the momenta y, which vary around zero (30, 34), we simplify Eqs. 4 and 5 as follows

$H_{\mathrm{SB}}(\mathbf{x}, \mathbf{y}, t) = \sum_{i=1}^{N} \frac{\Delta}{2} y_i^2 + V(\mathbf{x}, t) = \sum_{i=1}^{N} \frac{\Delta}{2} y_i^2 + \sum_{i=1}^{N} \left[ \frac{K}{4} x_i^4 + \frac{\Delta - p(t)}{2} x_i^2 \right] - \frac{\xi_0}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} J_{i,j} x_i x_j$   (6)

$\dot{x}_i = \frac{\partial H_{\mathrm{SB}}}{\partial y_i} = \Delta y_i$   (7)

$\dot{y}_i = -\frac{\partial H_{\mathrm{SB}}}{\partial x_i} = -\left[ K x_i^2 - p(t) + \Delta \right] x_i + \xi_0 \sum_{j=1}^{N} J_{i,j} x_j$   (8)

where $V(\mathbf{x}, t)$ denotes the potential energy, and all the detunings have been assumed to be the same value Δ. The resultant classical system is a network of Duffing oscillators (35, 40) with mass $\Delta^{-1}$ and coupling coefficients $\{\xi_0 J_{i,j}\}$. [A similar simplification for the CIM simulation has recently been reported in (41).]

Hereafter, we treat all the parameters with the dimension of frequency (correspondingly, also time) as dimensionless quantities because we use Eqs. 7 and 8 as an algorithm for solving mathematical problems, and therefore, the physical meanings of the parameters are unimportant. x and y are also dimensionless by definition.

The SB algorithm is based on Eqs. 7 and 8. Note that the new Hamiltonian in Eq. 6 is separable with respect to positions and momenta, unlike the previous one in Eq. 3. Therefore, Eqs. 7 and 8 can be solved by the explicit symplectic Euler method (see also Methods) (36) instead of implicit methods or, e.g., the explicit fourth-order Runge-Kutta method. The simplicity of the explicit symplectic Euler method is crucial for hardwiring the SB algorithm with custom circuits, such as FPGAs. Its stability is also important, because it allows one to set the time step to a large value, which results in high-speed simulation. [The symplectic Euler method is first-order accurate. Higher-order explicit symplectic methods, such as the Störmer-Verlet method (36), can also be used, but this is effective neither for the speed and stability of the present simulation nor for the solution accuracy for the Ising problem. Hence, we adopt the simplest first-order method. See Methods for details.]

The SB algorithm is as follows. All the variables, x and y, are initially set around zero. Gradually increasing p(t) from zero, we solve Eqs. 7 and 8 numerically by the explicit symplectic Euler method. (For further speedup, we modify the method from its standard form. See Methods for details.) The sign of the final xi provides the ith spin of an approximate solution of the Ising problem.
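The following Python sketch illustrates this procedure with the plain (unmodified) explicit symplectic Euler update. It is our minimal reimplementation for illustration only, using K = Δ = 1, a linear ramp of p(t) from 0 to 1, and $\xi_0 = 0.7\Delta/(\sigma\sqrt{N})$ as quoted later in the text; it is not the authors' FPGA or GPU code, and the time step and initial momenta are assumptions.

```python
# Minimal SB sketch (illustrative only): integrate Eqs. 7 and 8 with the explicit
# symplectic Euler method while p(t) is ramped linearly from 0 to 1, then read out
# the signs of the final positions as the Ising spins.
import numpy as np

def simulated_bifurcation(J, n_steps=1000, dt=0.5, K=1.0, Delta=1.0, seed=0):
    N = J.shape[0]
    sigma = np.std(J[np.triu_indices(N, k=1)])   # SD of the coupling elements
    xi0 = 0.7 * Delta / (sigma * np.sqrt(N))
    rng = np.random.default_rng(seed)
    x = np.zeros(N)
    y = 0.1 * (rng.random(N) - 0.5)              # initial values set around zero
    for n in range(n_steps):
        p = (n + 1) / n_steps                    # linearly increasing pump
        x += Delta * y * dt                                        # Eq. 7
        y -= (K * x**3 + (Delta - p) * x - xi0 * (J @ x)) * dt     # Eq. 8
    return np.sign(x)

# Usage on a small random +/-1 instance (illustrative):
rng = np.random.default_rng(1)
N = 32
J = np.triu(rng.choice([-1.0, 1.0], size=(N, N)), k=1); J = J + J.T
s = simulated_bifurcation(J)
print(-0.5 * s @ J @ s)   # Ising energy of the obtained configuration
```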

SB has further algorithmic advantages: Eqs. 7 and 8 have only one product-sum operation with respect to the coupling matrix J, which is the most computation-intensive part of this algorithm, unlike Eqs. 4 and 5, which include two such product-sum operations; the variables can be updated simultaneously, and therefore massively parallel processing can be fully exploited to accelerate SB; and SB includes only additions and multiplications, with no probabilistic processes, and can therefore be easily hardwired, e.g., with FPGAs.

Mechanism of SB

The mechanism of SB is qualitatively explained as follows. When p(t) is gradually increased from zero to a sufficiently large value $p_f$, each oscillator exhibits a bifurcation with two stable branches whose positions are approximately given by $s_i \sqrt{(p_f - \Delta)/K}$ ($s_i = \pm 1$) (34). Correspondingly, the potential energy finally has $2^N$ minima corresponding to the $2^N$ spin configurations, which are approximately given by

$V\!\left( \mathbf{s} \sqrt{(p_f - \Delta)/K},\, t \right) = -\frac{N (p_f - \Delta)^2}{4K} - \frac{\xi_0}{2} \frac{p_f - \Delta}{K} \sum_{i=1}^{N} \sum_{j=1}^{N} J_{i,j} s_i s_j$   (9)

Note that the first term is independent of s, and the second term is proportional to the Ising energy when s is regarded as a spin configuration. In the adiabatic evolution with slowly varying p(t), the state will follow one of the potential energy minima near the present state. Assuming that minima with relatively low final energies appear earlier and stay lower during the adiabatic evolution than those with relatively high final energies, we expect that the state will arrive at a minimum with a relatively low final energy. Thus, we will find a spin configuration with a low Ising energy as an approximate solution.
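As a quick sanity check of Eq. 9 (our own illustration, with arbitrary small parameters and a random instance), the potential V(x, t) of Eq. 6 evaluated at a bifurcated point $x_i = s_i\sqrt{(p_f - \Delta)/K}$ can be compared numerically with the closed form above:

```python
# Numerical check that V(x, t) at x_i = s_i*sqrt((p_f - Delta)/K) reproduces the
# constant term plus the Ising-energy-proportional term of Eq. 9.
import numpy as np

K, Delta, xi0, pf = 1.0, 1.0, 0.05, 2.0
rng = np.random.default_rng(0)
N = 6
J = np.triu(rng.choice([-1.0, 1.0], size=(N, N)), k=1); J = J + J.T
s = rng.choice([-1.0, 1.0], size=N)

x = s * np.sqrt((pf - Delta) / K)
V_direct = np.sum(0.25 * K * x**4 + 0.5 * (Delta - pf) * x**2) - 0.5 * xi0 * x @ J @ x
V_eq9 = -N * (pf - Delta)**2 / (4 * K) - 0.5 * xi0 * (pf - Delta) / K * (s @ J @ s)
print(np.isclose(V_direct, V_eq9))   # True
```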

Instead of showing the general validity of the above qualitative description, here, we show the situation of a simple example: the two-spin Ising model with ferromagnetic coupling (J1,2 = J2,1 = 1), the ground states of which are (up, up) and (down, down). This will give us an intuitive understanding of SB, which is enough to use SB as a heuristic algorithm.

Figure 1 (A and B) shows the simulation results for this problem with a linearly increased pumping amplitude p(t) = εpt. As shown in this figure, after two bifurcations, the potential energy finally has four minima corresponding to the four configurations of the two-spin Ising model.

Fig. 1 Dynamics in SB with two oscillators for the two-spin Ising problem with ferromagnetic coupling.

(A) Time-dependent pumping amplitude, $p(t) = \varepsilon_p t$ ($\varepsilon_p = 0.01$), in the simulation (top) and time evolutions of x and y (oscillating thin lines in middle and bottom panels). Bold lines represent x of stable fixed points (potential energy minima): $x_1 = x_2 = 0$ ($p \leq \Delta - \xi_0 = 0.4$), $x_1 = x_2 = \pm\sqrt{(p - \Delta + \xi_0)/K}$ ($p > \Delta - \xi_0 = 0.4$), and $x_1 = -x_2 = \pm\sqrt{(p - \Delta - \xi_0)/K}$ ($p > \Delta + \xi_0 = 0.6$). (B) Trajectories (circulating lines in white) in three time intervals, (0, 40) (top), (40, 60) (middle), and (60, 200) (bottom), and potential energy V(x, t) measured from the total energy E(t) = H(x(t), y(t), t) at the final times of the intervals. The boundaries of the time intervals are indicated in (A) by the dotted (t = 40) and dashed (t = 60) vertical lines. The loops (magenta) show the boundaries of the energetically allowable regions defined by V(x, t) − E(t) ≤ 0 at the final times. Other parameters are set as K = 1, Δ = 0.5, $\xi_0$ = 0.1, $x_1(0) = x_2(0) = y_1(0) = 0$, and $y_2(0) = 0.1$. (C and D) Similar results for the p(t) given in the top panel in (C) are shown. The boundaries of the time intervals are indicated in (C) by the dotted (t = 100) and dashed (t = 131.2) vertical lines.

Before the first bifurcation, the potential energy has a single minimum at the origin, around which the trajectory circulates, as shown in the top panel of Fig. 1B. After the first bifurcation, two stable fixed points (potential energy minima) corresponding to the ground states, (up, up) and (down, down), appear earlier than the other two minima corresponding to (up, down) and (down, up). The state adiabatically follows one of the two minima that appear earlier. Thus, SB successfully finds a ground state (down, down).

This adiabatic evolution is explained as follows. First, the total energy E(t) and the potential energies at the minima, $V_{\pm,\pm}(t)$ (+ and − correspond to up and down, respectively), are formulated from their time derivatives as

$E(t) = E_0 - \frac{\varepsilon_p}{2} \int_0^t |\mathbf{x}(t')|^2 \, dt'$   (10)

$V_{\pm,\pm}(t) = -\frac{\varepsilon_p}{2} \int_0^t |\mathbf{X}_{\pm,\pm}(t')|^2 \, dt'$   (11)

where $E_0$ is the initial total energy, which is assumed to be small, and $\mathbf{X}_{\pm,\pm}(t)$ denote the positions of the minima. Before the corresponding bifurcation, $\mathbf{X}_{\pm,\pm}(t)$ are defined as zero. Note that the total energy monotonically decreases because of the second term in Eq. 10 describing the negative work done by the time-dependent pumping term. After the first bifurcation, $\mathbf{X}_{+,+}(t)$ and $\mathbf{X}_{-,-}(t)$ leave the origin, and $\mathbf{x}(t)$ starts to circulate around them, instead of the origin. This leads to the increase of $|\mathbf{x}(t)|$ and therefore the further decrease of the total energy. As a result, the energetically allowable region defined by V(x, t) − E(t) ≤ 0 (inside the loops in Fig. 1B) separates into two parts, as shown in the middle panel of Fig. 1B. In the present simulation, $\mathbf{x}(t)$ is confined in the allowable region around $\mathbf{X}_{-,-}(t)$. After that, $\mathbf{x}(t)$ circulates around $\mathbf{X}_{-,-}(t)$, which leads to $\int |\mathbf{x}(t')|^2 dt' \approx \int |\mathbf{X}_{-,-}(t')|^2 dt'$. From this and Eqs. 10 and 11, the difference between E(t) and $V_{-,-}(t)$ is kept constant at a small value. Thus, $\mathbf{x}(t)$ is kept in the allowable region around $\mathbf{X}_{-,-}(t)$, as shown in the bottom panel of Fig. 1B. In summary, SB finds a ground state by adiabatically following one of the stable fixed points (potential energy minima) corresponding to the ground states that appear at the first bifurcation point.

The above mechanism of SB suggests that solutions obtained by SB are determined at the first bifurcation point. Since $\mathbf{x}(t)$ at the first bifurcation point is given by the maximum-eigenvalue eigenvector of the coupling matrix J (see the Supplementary Materials), SB may provide the signs of the elements of the eigenvector as an approximate solution, which is actually an approximate solution obtained by a continuous reduction method (30, 34). As shown in the next subsection, however, SB can provide much better solutions than this. This fact implies that there may exist another mechanism of SB; that is, SB may continue to search for better solutions after the first bifurcation.

To investigate this new mechanism of SB, we also performed another simulation where p(t) is given in the top panel of Fig. 1C. The results are shown in Fig. 1 (C and D). In this case, the initial value of p(t) is higher than the two bifurcation points, and therefore, there are four potential energy minima even at the initial time. While p(t) is constant, $\mathbf{x}(t)$ moves in the energetically allowable region including the four minima, as shown in the top panel of Fig. 1D. After that, the allowable region separates into four parts, as shown in the middle panel of Fig. 1D. Then, $\mathbf{x}(t)$ is confined in one of the four regions. Eventually, SB successfully finds a ground state, (up, up).

Since $\mathbf{x}(t)$ also moves around the other potential energy minima, as shown in the top and middle panels of Fig. 1D, the failure probability may be finite. To evaluate the success probability, we repeat this simulation $10^4$ times with different initial conditions, where $x_1(0) = x_2(0) = 0$ and $y_1(0)$ and $y_2(0)$ are set randomly from the interval (−0.1, 0.1). As a result, we obtain ground states with a probability of about 0.83. This result indicates that lower energy states (better solutions) are obtained with higher probabilities even for initial conditions where the energetically allowable region includes multiple potential energy minima.
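A minimal Monte Carlo sketch of this kind of success-probability estimate is shown below. It uses the linearly ramped p(t) of Fig. 1A rather than the constant-then-ramped p(t) of Fig. 1C (whose exact schedule is not reproduced here), and our own time step and trial count, so the estimated probability need not match the quoted 0.83.

```python
# Estimate the success probability for the ferromagnetic two-spin problem by
# repeating the SB dynamics (Eqs. 7 and 8) with random initial momenta.
import numpy as np

def run_two_spin(y0, K=1.0, Delta=0.5, xi0=0.1, eps_p=0.01, dt=0.05, T=200.0):
    """Integrate Eqs. 7 and 8 for two ferromagnetically coupled oscillators."""
    J = np.array([[0.0, 1.0], [1.0, 0.0]])
    x = np.zeros(2)
    y = np.array(y0, dtype=float)
    for n in range(int(T / dt)):
        p = eps_p * (n + 1) * dt               # p(t) = eps_p * t, as in Fig. 1A
        x += Delta * y * dt
        y -= (K * x**3 + (Delta - p) * x - xi0 * (J @ x)) * dt
    return np.sign(x)

rng = np.random.default_rng(0)
trials = 1000
hits = 0
for _ in range(trials):
    s = run_two_spin(rng.uniform(-0.1, 0.1, size=2))
    hits += int(s[0] == s[1])                  # ground states: (up, up), (down, down)
print(hits / trials)
```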

This interesting result can be explained by assuming the ergodicity of this nonlinear Hamiltonian system, implying that a trajectory will eventually visit all states in the energetically allowable region of phase space (36). This is suggested by the top panel of Fig. 1D, where $\mathbf{x}(t)$ moves around the two low minima more frequently than around the two high minima. From the ergodicity, the success probability may be proportional to the phase-space volume of the allowable region satisfying $x_1 x_2 > 0$. The success probability numerically estimated from the phase-space volume for the middle panel of Fig. 1D is about 0.841, which is remarkably close to the actual success probability of 0.83. This numerical fact supports the mechanism based on the ergodicity. In general, the ergodicity suggests that $\mathbf{x}(t)$ moves around low potential energy minima for a longer time than around high minima, which results in lower Ising energies. In the next subsection, we provide additional numerical evidence for this mechanism; that is, SB can find much better solutions for a 2000-spin Ising problem than the one obtained at the first bifurcation. Thus, we conclude that SB finds lower minima among many minima by harnessing the ergodicity. [Very recently, a new approach to the Ising problem using a classical Hamiltonian system with "classical spins" mimicking quantum annealing has been proposed (42), where a fixed point is tracked by the technique called shortcut to adiabaticity. Although this approach also uses classical Hamiltonian dynamics, it is quite different from SB, because it does not exploit the ergodic search but just tracks a fixed point. In addition, it has been unclear so far whether it can achieve high-speed computation with its complicated equations.]

Solving an all-to-all connected 2000-node MAX-CUT problem by a single-FPGA SB machine

From the advantages and properties of SB, we expect that SB will be useful for approximately solving large-scale Ising or MAX-CUT problems with dense connectivity as fast as possible. To check this expectation, we compare our SB machine with a state-of-the-art CIM (17, 19), because this machine has achieved outstanding performance for these problems. For this comparison, we solved an all-to-all connected 2000-node MAX-CUT problem named K2000, which was solved by the CIM (17, 19).

We implemented an SB-based 2000-spin Ising machine with a single FPGA. The product-sum operation, $\sum_{j=1}^{N} J_{i,j} x_j$, in Eq. 8, which is the most computation-intensive part of SB, is performed efficiently by massively parallel processing. Up to 8192 $J_{i,j} x_j$ terms are processed in a single clock cycle, which is about four times the total number of spins. See the Supplementary Materials for details.

One of the main results in (17) is that the CIM reached a target energy given by the Goemans-Williamson semidefinite programming (GW-SDP) algorithm (17, 38) in about 70 μs in the best case among 100 trials, which is about 50 times shorter than the time taken by the highly tuned SA in (17). [The SA in (17, 19) is very fast because all the elements of the coupling matrix J are held in the cache of a single core. Thus, parallel computing with multiple cores cannot accelerate this SA because of communication overheads.] In comparison, our single-FPGA SB machine reaches the GW-SDP value in only 58 μs even in the worst case among 100 trials, as shown in Fig. 2A.

Fig. 2 Performance of SB machine implemented with an FPGA for K2000.

(A) Time evolutions of Ising energies. (The computation times do not include the loading of data and the output of the results; see the Supplementary Materials.) Constants in SB are set as K = Δ = 1 and $\xi_0 = 0.7\Delta/(\sigma\sqrt{N})$ (σ = 1 is the SD of the elements of J). p(t) is linearly increased from 0 to 1, while the number of time steps is increased from 1 to $N_{\mathrm{step}}$. Bold lines: 100-trial average energies of SB with $N_{\mathrm{step}}$ = 40 (left), CIM from (17, 19) (middle), and SA from (19) (right). [The CIM data are the 26-trial average, eliminating 74 trials because of the instability of optical parametric oscillators (17, 19).] Dashed thin and dotted thin lines sandwiching the bold lines: lower (best case) and upper (worst case) envelopes of the 100 traces. Dotted thin horizontal line: GW-SDP (17). Dashed thin horizontal line: 100-trial average energy of HNN after convergence at a local minimum. Table: computation times to reach the HNN and GW-SDP energies. N/A, not applicable. (B) Histograms of 100 cut values obtained by SB with $N_{\mathrm{step}}$ = 186 in 0.5 ms (top), CIM in 5 ms from (17) (middle), and SA in 50 ms from (17) (bottom). The numbers in the panels are the average cut values of the 100 trials.

Another main result in (17) is that the CIM took only 5 ms to obtain solutions as good (cut values as large) as those the highly tuned SA obtained in 50 ms, where the cut value of the MAX-CUT problem is given by $-\frac{E_{\mathrm{Ising}}}{2} - \frac{1}{4}\sum_{i=1}^{N}\sum_{j=1}^{N} J_{i,j}$ with the Ising parameters (see the Supplementary Materials). Figure 2B shows that our SB machine takes only 0.5 ms to obtain better solutions on average than both the CIM in 5 ms and the SA in 50 ms.

In Fig. 2A, we also show the energy obtained by the Hopfield neural network (HNN) of two-state neurons (19, 43). The HNN is the most naïve local search algorithm, and therefore, more sophisticated algorithms should outperform the HNN. (The HNN is equivalent to SA at zero temperature; see the Supplementary Materials.) Note that the HNN energy is lower than the GW-SDP energy. Our SB machine reaches the HNN energy about 13 times faster than the CIM on average. From these results, we conclude that our single-FPGA SB machine is about 10 times faster than the CIM developed in (17).

As mentioned in the last subsection, at the first bifurcation point, SB may find an approximate solution determined by the maximum-eigenvalue eigenvector of the coupling matrix J (30, 34). In the case of K2000, this approximate solution corresponds to $E_{\mathrm{Ising}}$ = −55,724, or a cut value of 27,342. This is much worse than the solutions actually obtained by SB, as shown in Fig. 2. This fact supports the mechanism of SB based on the ergodicity discussed in the last subsection.

Solving an all-to-all connected 100,000-node MAX-CUT problem by a GPU-cluster SB machine

We have further solved an all-to-all connected 100,000-node MAX-CUT problem with continuous weights set randomly from the interval [−1, 1], which is named M100000. [See the Supplementary Materials for its definition with a pseudo-random number generator in (44).] Note that M100000 has about 5 billion edges, the data size of which is too large for FPGAs. We instead use two large computational systems and compare their performance. System 1 is a GPU cluster using eight GPUs communicating with one another via a fast link. System 2 is a PC cluster composed of multiple nodes, each with two 14-core processors, communicating with one another via a fast link. We also solved M100000 by the HNN and by the SA developed by us based on (22). Our SA is remarkably fast, e.g., about 100 times faster than the software provided by (22). See the Supplementary Materials for details.
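For readers who want to experiment at a smaller scale, the following sketch generates an all-to-all instance with continuous weights uniform in [−1, 1]. The actual M100000 instance is defined with a specific pseudo-random number generator in the Supplementary Materials (44), so this generator is only an illustrative stand-in with the same weight statistics ($\sigma = 1/\sqrt{3}$).

```python
# Illustrative generator (not the definition of M100000) for a dense MAX-CUT
# instance with weights drawn uniformly from [-1, 1].
import numpy as np

def random_dense_instance(N, seed=0):
    rng = np.random.default_rng(seed)
    w = np.triu(rng.uniform(-1.0, 1.0, size=(N, N)), k=1)
    w = w + w.T                  # symmetric weights, zero diagonal
    return -w                    # coupling matrix J = -w for the Ising formulation

J = random_dense_instance(1000)
print(np.std(J[np.triu_indices(1000, k=1)]))   # close to 1/sqrt(3) ~ 0.577
```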

The comparison of computation times for SB and SA is shown in Fig. 3A. The fastest implementation is the GPU cluster for SB and the 50-core PC cluster for SA. This difference comes from the fact that the massively parallel processing of the GPU cluster is effective for SB, but not for SA, because of the lack of parallelizability of the spin update and the resultant large communication overheads. Figure 3B shows the computation times of SB and SA in their fastest implementations. The dotted lines roughly give the envelopes of the time-evolution curves; that is, the dotted lines indicate the best cases with respect to $N_{\mathrm{step}}$ and $N_{\mathrm{sweep}}$. From this comparison at the same Ising energies below the HNN value, it turns out that the GPU-cluster SB machine is about 10 times faster than our fastest SA and about 1000 times faster than the software provided by (22). This demonstrates that SB allows one to accelerate large-scale combinatorial optimization without specially designed hardware devices or custom circuits.

Fig. 3 Performance of SB machines implemented with a GPU cluster and a PC cluster for M100000.

(A) Computation times of SB ($N_{\mathrm{step}}$ = 100) and SA ($N_{\mathrm{sweep}}$ = 100) for M100000. (The computation times do not include the loading of data and the output of the results; see the Supplementary Materials.) Constants in SB are set as K = Δ = 1 and $\xi_0 = 0.7\Delta/(\sigma\sqrt{N})$ ($\sigma = 1/\sqrt{3}$ is the SD of the elements of J). p(t) is linearly increased from 0 to 1, while the number of time steps is increased from 1 to $N_{\mathrm{step}}$. In the SA, the annealing speed is controlled by the total number of "sweeps" $N_{\mathrm{sweep}}$, and the inverse temperature is increased linearly from 0 to 0.05. Full and empty circles: SB and SA, respectively, using a PC cluster, the number of cores of which is given by the horizontal axis (each node of the PC cluster uses 1 core or 25 cores). Dotted and dashed lines: SB and SA, respectively, using a GPU cluster with eight GPUs. (B) Time evolutions of Ising energies (thin lines). Full and empty circles connected by dotted lines: final values of SB and SA, respectively, with different $N_{\mathrm{step}}$ and $N_{\mathrm{sweep}}$ (both are 10, 20, 50, 100, 200, 500, and 1000 from the left). Dashed horizontal line: 10-trial average of HNN after convergence to a local minimum.

DISCUSSION

We have proposed a new heuristic algorithm for the Ising problem, which we call SB, inspired by quantum adiabatic optimization using a nonlinear oscillator network. Exploiting the advantages of SB, such as simultaneous updating and simple calculations, we have realized an ultrafast all-to-all connected 2000-spin Ising machine based on SB using a single FPGA, which is about 10 times faster than a state-of-the-art CIM. We have also solved an all-to-all connected 100,000-node MAX-CUT problem with continuous weights using a GPU-cluster SB machine, which is about 10 times faster than our fastest SA. We found that the computation speed of the GPU-cluster SB machine is limited by the memory bandwidth of each GPU (see the Supplementary Materials). This indicates that hardware devices with broader memory bandwidths specially designed for SB, such as application-specific integrated circuits or multi-FPGA systems, will substantially improve the performance. The mechanism of SB is based on adiabatic and chaotic (ergodic) evolutions of the nonlinear Hamiltonian system. We have discussed the mechanism using simulation results of a simple model. The mechanism of SB seems to be related to adiabatic invariants of classical integrable systems (36, 40, 45, 46) and also ergodic adiabatic invariants of classical nonintegrable systems (47–49). The mathematically rigorous treatment of the mechanism using these concepts is left for future work. Another interesting direction of research is to extend SB to thermal states at finite temperature, e.g., using the Nosé-Hoover method (36), which will relate SB to nonequilibrium statistical mechanics (50).

METHODS

Modified explicit symplectic Euler method

The symplectic method (36) is a stable numerical method for solving Hamiltonian equations of motion. Hereafter, the flow map by a Hamiltonian H with time step Δt is denoted by Φ(Δt, H). Since the flow map by the kinetic energy alone, Φ(Δt, H − V), or by the potential energy alone, Φ(Δt, V), is obviously symplectic, their composition Φ(Δt, V) ∘ Φ(Δt, H − V) is also symplectic (36). Since the present Hamiltonian is separable with respect to x and y, we obtained the explicit symplectic Euler method for the present Hamiltonian

$x_i(t_{n+1}) = x_i(t_n) + \Delta\, y_i(t_n)\, \Delta t$   (12)

$y_i(t_{n+1}) = y_i(t_n) - \left[ K x_i^3(t_{n+1}) + (\Delta - p)\, x_i(t_{n+1}) - \xi_0 \sum_{j=1}^{N} J_{i,j} x_j(t_{n+1}) \right] \Delta t$   (13)

where $t_n = n \Delta t$ is the $n$th discretized time and $p = p(t_{n+1})$. The point of this method is that the momenta are updated using the updated positions. This method is not only simple but also stable for considerably large Δt. Thus, we can solve the Hamiltonian equations fast using a large Δt.

For realizing further speedup, we modified this method as follows. In the SB algorithm, the product-sum operation, $\sum_{j=1}^{N} J_{i,j} x_j$, is the most computation-intensive part. To mitigate this computation, we split the Hamiltonian as $H_{\mathrm{SB}} = M\, \frac{H_{\mathrm{SB}} - H_J}{M} + H_J$, where M is an integer not smaller than 2 and $H_J = -\frac{\xi_0}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} J_{i,j} x_i x_j$. Since the flow maps Φ(Δt, (HSB − HJ)/M) = Φ(Δt/M, HSB − HJ) and Φ(Δt, HJ) are obviously symplectic, their composition is also symplectic. Thus, we obtained the modified explicit symplectic Euler method for SB

$x_i^{(m+1)} = x_i^{(m)} + \Delta\, y_i^{(m)}\, \delta t$   (14)

$y_i^{(m+1)} = y_i^{(m)} - \left[ K \left( x_i^{(m+1)} \right)^3 + (\Delta - p)\, x_i^{(m+1)} \right] \delta t$   (15)

$x_i(t_{n+1}) = x_i^{(M)}$   (16)

$y_i(t_{n+1}) = y_i^{(M)} + \xi_0 \sum_{j=1}^{N} J_{i,j} x_j^{(M)}\, \Delta t$   (17)

where δt = Δt/M, m = 0, …, M − 1, $x_i^{(0)} = x_i(t_n)$, $y_i^{(0)} = y_i(t_n)$, and $p = p(t_{n+1})$, disregarding its detailed time dependence because p(t) varies slowly. Note that the flow map Φ(Δt, HSB − HJ), which includes nonlinear terms and therefore induces instability, is replaced by the M-time repetition of Φ(δt, HSB − HJ) with the smaller time step δt = Δt/M. Thus, this method is more stable than the original one. We found that the original and modified methods are stable when Δt ≤ 0.5 and Δt ≤ 1 (for large M), respectively, where K = Δ = 1 and $\xi_0 = 0.7\Delta/(\sigma\sqrt{N})$, and p(t) is increased linearly from 0 to 1, as in the present simulations. Thus, using the modified method, we can set Δt to a value twice as large as in the original method, and consequently, a speedup by a factor of 2 can be achieved. In the present simulations, we set Δt = 0.9 and M = 2 in Fig. 2 and Δt = 0.9 and M = 5 in Fig. 3.
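A sketch of one full step of this modified update is shown below (our illustration; the parameter defaults are placeholders). It makes explicit that the product-sum J @ x is evaluated only once per step Δt, while the local nonlinear part is advanced in M substeps of size δt = Δt/M.

```python
# One step of the modified explicit symplectic Euler method (Eqs. 14-17).
import numpy as np

def sb_step_modified(x, y, p, J, dt, M=2, K=1.0, Delta=1.0, xi0=0.05):
    ddt = dt / M
    for _ in range(M):                    # Eqs. 14 and 15, repeated M times
        x = x + Delta * y * ddt
        y = y - (K * x**3 + (Delta - p) * x) * ddt
    y = y + xi0 * (J @ x) * dt            # Eqs. 16 and 17: single product-sum per step
    return x, y
```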

The second-order symplectic method called the Störmer-Verlet method is given by the flow map Φ(Δt/2, HSB − V) ∘ Φ(Δt, V) ∘ Φ(Δt/2, HSB − V) (36). Because of Φ(Δt, V), the stability of this method is equivalent to that of the explicit symplectic Euler method. We found that this method is stable for M100000 when Δt ≤ 0.5. Thus, this method does not contribute to a speedup. We also found that its second-order accuracy for the Hamiltonian equations does not contribute to the solution accuracy for the Ising problem. This may be because we need only the signs of the positions, which will be robust against numerical errors. Therefore, we adopted the above modified explicit symplectic Euler method.

SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/5/4/eaav2372/DC1

Solving MAX-CUT with Ising machines

First bifurcation in SB

Setting of ξ0

HNN of two-state neurons

Simulated annealing

FPGA implementation of SB for K2000

Definition of M100000

Eight-GPU implementation of SB and SA for M100000

PC-cluster implementation of SB and SA for M100000

Fig. S1. Computation times of our best SA and the SA software provided by (22) for M100000.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.

REFERENCES AND NOTES

Acknowledgments: We acknowledge useful comments from A. Shields and T. Kanao. We also thank Y. Kaneko, M. Kujiraoka, and Y. Tanizawa for their support. Funding: The authors acknowledge no funding in support of this research. Author contributions: H.G. discovered and developed the SB algorithm, found a modified symplectic Euler method, confirmed the high performance of SB by solving K2000 and M100000 with a PC cluster, explained the mechanism of SB, and wrote the manuscript. K.T. suggested using the symplectic Euler method, implemented a fast SB machine with an FPGA for K2000, and wrote the part about the FPGA implementation in the Supplementary Materials. A.R.D. implemented a fast SB machine with a GPU cluster for M100000 and wrote the part about the GPU-cluster implementation in the Supplementary Materials. Competing interests: H.G. and K.T. are inventors on two U.S. patent applications related to this work filed by the Toshiba Corporation (no. 16/123146, filed 6 September 2018; no. 16/118646, filed 31 August 2018). The authors declare that they have no other competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.