Research ArticlePHYSICS

# A quantum machine learning algorithm based on generative models

See allHide authors and affiliations

Science Advances  07 Dec 2018:
Vol. 4, no. 12, eaat9004
DOI: 10.1126/sciadv.aat9004

### Figures

• Fig. 1 Classical and quantum generative models.(A) Illustration of a factor graph, which includes widely used classical generative models as its special cases. A factor graph is a bipartite graph where one group of the vertices represents variables (denoted by circles) and the other group of vertices represents positive functions (denoted by squares) acting on the connected variables. The corresponding probability distribution is given by the product of all these functions. For instance, the probability distribution in (A) is p(x1, x2, x3, x4, x5) = f1(x1, x2, x3, x4)f2(x1, x4, x5)f3(x3, x4)/Z, where Z is a normalization factor. Each variable connects to at most a constant number of functions that introduce correlations in the probability distribution. (B) Illustration of a tensor network state. Each unshared (shared) edge represents a physical (hidden) variable, and each vertex represents a complex function of the variables on its connected edges. The wave function of the physical variables is defined as a product of the functions on all the vertices, after summation (contraction) of the hidden variables. Note that a tensor network state can be regarded as a quantum version of the factor graph after partial contraction (similar to the marginal probability in the classical case), with positive real functions replaced by complex functions. (C) Definition of a QGM introduced in this paper. The state |Q〉 represented here is a special kind of tensor network state, with the vertex functions fixed to be three types as shown on the right side. Without the single-qubit invertible matrix Mi, which contains the model parameters, the wave function connected by Hadamard and identity matrices just represent a graph state. To get a probability distribution from this model, we measure a subset of n qubits (among total m qubits corresponding to the physical variables) in the computational basis under this state. The unmeasured m − n qubits are traced over to get the marginal probability distribution P({xi}) of the measured n qubits. We prove in this paper that the P({xi}) is general enough to include probability distributions of all the classical factor graphs and special enough to allow a convenient quantum algorithm for the parameter training and inference.
• Fig. 2 Efficient representation of factor graphs by the QGM.(A) General form of correlation functions of two binary variables in a factor graph, with parameters a, b, c being real. This correlation acts as the building block for general correlations in any factor graphs by use of the universal approximation theorem (25). (B) Notations of some common tensors and their identities: D is a diagonal matrix with diagonal elements with x = 0, 1; Z is the diagonal Pauli matrix diag(1, −1); and . (C) Representation of the building block correlator f(x1, x2) in a factor graph [see (A)] by the QGM with one hidden variable (unmeasured) between two visible variables x1, x2 (measured in the computational basis). As illustrated in this figure, f(x1, x2) can be calculated as contraction of two copies of a diagram from Fig. 1C as 〈Q|x1, x2〉〈x1, x2|Q〉, where x1, x2 are measurement results of the two visible variables. We choose the single-bit matrix D1, D2 to be diagonal with and . For simplification of this graph, we have used the identities shown in (B). (D) Further simplification of the graph in (C), where we choose the form of the single-bit matrix M†M acting on the hidden variable to be M†M = λ1| + 〉〈 + | + λ2| − 〉〈 − | with positive eigenvalues λ1, λ2. We have used the identities in (B) and the relation HZH = X, where X (H) denotes the Pauli (Hadamard) matrix, respectively. By solving the values of λ1, λ2, d1(x1), d2(x2) in terms of a, b, c (see the proof of theorem 1), this correlator of the QGM exactly reproduces the building block correlator f(x1, x2) of the factor graph.
• Fig. 3 Illustration of our training algorithm for the QGM.(A) Training and inference of the QGM are reduced to measuring certain operators under the state |Q(z)〉. The key step of the quantum algorithm is therefore to prepare the state |Q(z)〉, which is achieved by recursive quantum phase estimation of the constructed parent Hamiltonian. The variables in the set z whose values are specified are called conditioned variables, whereas the other variables that carry the binary physical index are called unconditioned variables. We group the variables in a way such that each group contains only one unconditioned variable and different groups are connected by a small constant number of edges (representing virtual indices or hidden variables). Each group then defines a tensor with one physical index (denoted by p) and a small constant number of virtual indices (denoted by i, j, k in the figure). (B) Tensor network representation of |Q(z)〉, where a local tensor is defined for each group specified in (A). (C) Tensor network representation of |Qt〉, where |Qt〉 are the series of states reduced from |Qz〉. In each step of the reduction, one local tensor is moved out. The moved-out local tensors are represented by the unfilled circles, each carrying a physical index set to 0. For the edges between the remaining tensor network and the moved-out tensors, we set the corresponding virtual indices to 0 (represented by the unfilled circles). (D) Construction of the parent Hamiltonian. The figure shows how to construct one term in the parent Hamiltonian, which corresponds to a group of neighboring local tensors such as those in the dashed box in (C). After contraction of all virtual indices among the group, we get a tensor Lpqr,ij, which defines a linear map L from the virtual indices i, j to the physical indices p, q, r. As the indices i, j take all the possible values, the range of this mapping L spans a subspace range(L) in the Hilbert space Hp,q,r of the physical indices p, q, r. This subspace has a complementary orthogonal subspace inside Hp,q,r, denoted by comp(L). The projector to the subspace comp(L) then defines one term in the parent Hamiltonian, and by this definition, |Qt〉 lies in the kernel space of this projector. We construct each local term with a group of neighboring tensors. Each local tensor can be involved in several Hamiltonian terms [as illustrated in (C) by the dashed box and the dotted box]; thus, some neighboring groups have non-empty overlap, and they generate terms that, in general, do not commute. By this method, one can construct the parent Hamiltonian whose ground state generally uniquely defines the state |Qt〉 (30) [technically, this step requires the tensor network to be injective (29), a condition that is generically satisfied for random choice of the tensor networks (30)]. (E) States involved in the evolution from |Qt −1〉 to |Qt〉 by quantum phase estimation applied on their parent Hamiltonians. represent the states orthogonal to |Qt −1〉, |Qt〉, respectively, inside the two-dimensional (2D) subspace spanned by |Qt −1〉 and |Qt〉. The angle between |Qt〉 and |Qt −1〉 is determined by the overlap ηt = |〈Qt|Qt −1〉|2. (F) State evolution under recursive application of the quantum phase estimation algorithm. Starting from the state |Qt −1〉, we always stop at the state |Qt〉, following any branch of this evolution, where ηt and 1 − ηt denote the probabilities of the corresponding outcomes.

### Supplementary Materials

• Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/12/eaat9004/DC1

Section S1. A brief introduction to generative models

Section S2. Representational power and generalization ability

Section S3. Reduction of typical generative models to factor graphs

Section S4. Universal approximation theorem

Section S5. Proof of theorem 2

Section S6. NP hardness for preparation of an arbitrary QGM state |Q

Section S7. Parent Hamiltonian of the state |Q

Section S8. Training and inference in the QGM

Section S9. Proof of theorem 3

Section S10. Applying the QGM to practical examples

Fig. S1. Parameter space of factor graph and QGM.

Fig. S2. Probabilistic graphical models.

Fig. S3. Energy-based neural networks.

Fig. S4. Simulating graphs of unbounded degrees with graphs of constantly bounded degrees.

Fig. S5. Illustration of the universal approximation theorem by a restricted Boltzmann machine.

Fig. S6. #P-hardness for the QGM.

Fig. S7. Contraction between two local tensors using the structure of QGM state and conditioned variables.

Fig. S8. Construction of the history state.

Fig. S9. Flow chart for a machine learning process using the QGM.

References (3840)

• ## Supplementary Materials

This PDF file includes:

• Section S1. A brief introduction to generative models
• Section S2. Representational power and generalization ability
• Section S3. Reduction of typical generative models to factor graphs
• Section S4. Universal approximation theorem
• Section S5. Proof of theorem 2
• Section S6. NP hardness for preparation of an arbitrary QGM state |Q
• Section S7. Parent Hamiltonian of the state |Q
• Section S8. Training and inference in the QGM
• Section S9. Proof of theorem 3
• Section S10. Applying the QGM to practical examples
• Fig. S1. Parameter space of factor graph and QGM.
• Fig. S2. Probabilistic graphical models.
• Fig. S3. Energy-based neural networks.
• Fig. S4. Simulating graphs of unbounded degrees with graphs of constantly bounded degrees.
• Fig. S5. Illustration of the universal approximation theorem by a restricted Boltzmann machine.
• Fig. S6. #P-hardness for the QGM.
• Fig. S7. Contraction between two local tensors using the structure of QGM state and conditioned variables.
• Fig. S8. Construction of the history state.
• Fig. S9. Flow chart for a machine learning process using the QGM.
• References (3840)

Download PDF

Files in this Data Supplement: