Quantum Gibbs Sampling Using Szegedy Operators Robert R. Tucci P.O. Box 226 0 1 Bedford, MA 01730 0 2 [email protected] n a January 14, 2010 J 4 1 ] h p - t n a u Abstract q [ We present an algorithm for doing Gibbs sampling on a quantum computer. The 2 v algorithm combines phase estimation for a Szegedy operator, and Grover’s algorithm. 7 Foranyǫ > 0, thealgorithmwillsampleaprobabilitydistributionin ( 1 )stepswith 4 O √δ 6 precision (ǫ). Here δ is the distance between the two largest eigenvalue magnitudes 1 O of the transition matrix of the Gibbs Markov chain used in the algorithm. It takes . 0 (1) steps to achieve the same precision if one does Gibbs sampling on a classical 1 O δ 9 computer. 0 : v i X r a 1 1 Introduction In Ref.[1], Szegedy proposed a quantum walk operator for each classical Markov chain. In Ref.[2], Somma et al. proposed a method for doing simulated annealing on a quantum computer. In Ref.[3], Wocjan et al. showed how to improve the Somma et al. algorithm. The algorithms of Somma et al. and Wocjan et al. both use Szegedy operators. In Ref.[4], I presented computer programs called QuSAnn and Multiplexor Expander that implement ideas of Refs.[2] and [3], and also some of my own ideas about quantum multiplexors. InRef.[5], IdescribedoneparticularalgorithmfordoingGibbsandMetropolis- Hastings sampling of a classical Bayesian network (i.e., a probability distribution) on a quantum computer. In this paper, I describe a different algorithm for doing Gibbs sampling on a quantum computer. Unlike my first algorithm, this one uses Szegedy operators. Forany ǫ > 0, thisnew algorithmwill sample aBayesian network in ( 1 ) O √δ steps with precision (ǫ). Here δ is the distance between the two largest eigenvalue O magnitudes of the transition matrix of the Gibbs Markov chain used in the algorithm. It takes (1) steps to achieve the same precision if one does Gibbs sampling on a O δ classical computer. This paper assumes that its reader has read the section entitled “Notation and Preliminaries” in Ref.[5]. The reader should refer to Refs.[5, 4] for clarification when any notation of this paper eludes him. 2 Dual Gibbs Markov Chains In this section, we will discuss dual “Gibbs” Markov chains with transition matrices M and M , respectively. These two transition matrices are both defined in terms of 1 2 a single classical Bayesian network x. 2.1 Definitions of M and M 1 2 Consider a classical Bayesian net with N nodes, labeled x ,x ,...,x where nds 1 2 Nnds x S foreachj. (Asusualinmypapers, Iindicaterandomvariablesbyunderlining j ∈ xj them.) Let x = (x ,x ,...,x ). Let x assume values in a set S which has 1 2 Nnds x N = 2NB elements. S Let π(x) = P(x = x) (1) for all x S . x ∈ For N = 3 and x,y S , let nds x ∈ M (y x) = P (y x ,x )P (y x ,y )P (y y ,y ) , (2) 1 x x x 1 2 3 x x x 2 3 1 x x x 3 1 2 | 1| 2 3 | 2| 3 1 | 3| 1 2 | 2 and M (y x) = P (y y ,y )P (y y ,x )P (y x ,x ) . (3) 2 x x x 1 2 3 x x x 2 3 1 x x x 3 1 2 | 1| 2 3 | 2| 3 1 | 3| 1 2 | (M (y x) can be obtained by swapping x and y in the conditioned arguments of 2 i i | M (y x).) Note that M (y x) = 1 and M (y x) = 1. Define M and M for 1 | y 1 | y 2 | 1 2 arbitrary Nnds using thPe same pattern. M1 aPnd M2 are transition matrices of the type typical for Gibbs sampling. (See Ref.[5] for an introduction to Gibbs sampling and the more general Metropolis-Hastings sampling). Youcancheck thatπ()isnotadetailedbalanceofeitherM norM separately. 1 2 However, the following property is true. We will refer to this property by saying that π() is a detailed balance of the pair (M ,M ). 1 2 Claim 1 M (y x)π(x) = M (x y)π(y) (4) 1 2 | | for all x,y S . x ∈ proof: Let P(x ,x ,...) stand for P(x = x ,x = x ,...). Assume N = 3 to begin j k j j k k nds with. One has M (y x) P(y x ,x )P(y x ,y )P(y y ,y ) 1 1 2 3 2 3 1 3 1 2 | = | | | (5) M (x y) P(x x ,x )P(x x ,y )P(x y ,y ) 2 1 2 3 2 3 1 3 1 2 | | | | P(y ,x ,x )P(y ,x ,y )P(y ,y ,y ) 1 2 3 2 3 1 3 1 2 = (6) P(x ,x ,x )P(x ,x ,y )P(x ,y ,y ) 1 2 3 2 3 1 3 1 2 P(y ,x ,x )P(y ,y ,x )P(y) 1 2 3 1 2 3 = (7) P(x)P(y ,x ,x )P(y ,y ,x ) 1 2 3 1 2 3 P(y) = . (8) P(x) A proof for an arbitrary number N of nodes follows the same pattern. nds QED 2.2 Eigenvalues of M , M and M 1 2 hyb Let Λ (y x) = M (y x) , (9) j j | q | for j = 1,2 and x,y S . It’s convenient to define a hybrid function of M and M , x 1 2 ∈ as follows: 3 M (y x) = Λ (x y)Λ (y x) (10) hyb 2 1 | | | for x,y S . (Note that unlike M (y x) and M (y x), M (y x) is not a probability x 1 2 hyb ∈ | | | function in y, its first argument.) Define the quantum states (π)η = [π(x)]η x (11) | i | i Xx for η = 1,1. (Note that only the η = 1 state is normalized in the sense of quantum 2 2 mechanics.) Claim 2 M π = π for j = 1,2 , (12) j | i | i and M √π = √π . (13) hyb | i | i Also, M , M and M have the same eigenvalues. 1 2 hyb proof: Taking the square root of both sides of the pair detailed balance statement Eq.(4), we get Λ (y x) π(x) = Λ (x y) π(y) . (14) 1 2 | | p p Therefore, 1 1 M (y x) = Λ (x y) Λ (x y) π(y) = M (x y) π(y) . (15) hyb 2 2 2 | | π(x) | π(x) | p p p p Hence, M (y x)π(x) = M (x y)π(y) = π(y) , (16) 1 2 | | Xx Xx M (x y)π(y) = M (y x)π(x) = π(x) , (17) 2 1 | | Xy Xy and 1 M (y x) π(x) = M (x y) π(y) π(x) = π(y) . (18) hyb 2 | π(x) | Xx p Xx p p p p OrdertheelementsofthefinitesetS insomepreferredway. Usethispreferred x order to represent M , M and M as matrices. Define a diagonal matrix D whose 1 2 hyb 4 diagonal entries are the numbers π(x) for each x S , with the x ordered in the x ∈ preferred order: D = diag[(π(x)) ] . (19) x ∀ Since M2T = D−1M1D ,MhTyb = D−21M2D21 , (20) it follows that det(M λ) = det(M λ) = det(M λ) (21) 1 2 hyb − − − for any λ C. ∈ QED Lettheeigenvalues1 ofM (andalsoofM andM )bem ,m ,...m C hyb 1 2 0 1 NS−1 ∈ with m = 1 > m m ... m . Define m to be the corresponding 0 | 1| ≥ | 2| ≥ | NS−1| | ji eigenvectors of M (but not necessarily of M and M ). Thus hyb 1 2 M m = m m , (22) hyb j j j | i | i for j = 0,1,...,N 1. In particular, m = √π . S 0 − | i | i For each j, define ϕ [0, π] and η [0,2π) so that m = eiηj cosϕ . (Thus, j ∈ 2 j ∈ j j cosϕ 0). Note that m = 1 so ϕ = 0. The M eigenvalue gap δ is defined as j 0 0 1 ≥ ϕ2 δ = 1 m . δ 1 when ϕ is small. −| 1| ≈ 2 1 3 Q-Embeddings U and U 1 2 In this section, we will define a “q-embedding” U of M , for j = 1,2. (For more j j information about q-embeddings, see Ref.[5].) For simplicity, we begin this section by considering a Bayesian net with only 3 nodes x ,x ,x , and such that each of these nodes is binary (i.e., S = Bool for 1 2 3 xj j = 1,2,3). At the end of this section, we will show how to remove these restrictions and make our treatment valid for general Bayesian networks. Using the same language as Ref.[5], consider a unitary matrix U of the form 1 shown in Fig.1, with its multiplexor gates defined as follows. Let x k Bool jh i ∈ and x k Bool for any j,k. U has 3 analogous gates (a.k.a. nodes) labeled ′jh i ∈ 1 (x 2 ,x 2 ,x 2 ), (x 3 ,x 3 ,x 3 ), and (x 4 ,x 4 ,x 4 ). Consider the first ′1h i 3h i 2h i ′2h i ′1h i 3h i ′3h i ′2h i ′1h i ofthesefordefiniteness. LettheprobabilityamplitudeA(x 2 ,x 2 ,x 2 x 1 ,x 1 ,x 1 ) ′1h i 3h i 2h i| ′1h i 3h i 2h i of node (x 2 ,x 2 ,x 2 ) satisfy the constraint ′1h i 3h i 2h i 1There must be a single eigenvalue 1 and all others must have a magnitude strictly smaller than one because of the Frobenius-PerronTheorem. The eigenvalues may be complex. 5 x 1 1 2 x 1 2 3 2 x 1 U = 4 3 2 x3 1 Ry 1 1 4 Ry 3 x2 1 Ry 4 x3 1 Figure 1: Unitary matrix U expressed as a product of quantum multiplexors. 1 A(x 2 ,x 2 ,x 2 x 1 = 0,x 1 ,x 1 ) (23) ′1h i 3h i 2h i| ′1h i 3h i 2h i = qPx1|x3,x2(x′1h2i|x3h2i,x2h2i)δxx22hh21iiδxx33hh21ii . (24) If we indicate non-zero entries by a plus sign, 000 001 010 011 ··· (x′1,x3,x2)=000 + ··· + 001 ··· + 010 ··· A = + (25) 011 ··· + 100 ··· + 101 ··· + 110 ··· + 111 ··· G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) → eiθ~bσY ⊗P~b = G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) , (26) X ~b Bool2 ∈ for some θ R. Here the right pointing arrow means that the expression at the ~b ∈ origin of the arrow can be extended to the expression at the target of the arrow. From the above definition of U , it follows that, for x,x,y,y Bool3, 1 ′ ′ ∈ 6 y x 1 1 h | | i y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) x 2 2 h | | i hy| U |xi = hy3| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |x3i (27) y′ 1 0 ⊗3 y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) 0 h | | i h 1′| | i hy2′| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |0i y 0 h 3′| | i = Λ (y x)δ(y,x) . (28) 1 ′ | Hence, x x U1 | i = | i or U1|0i⊗3|xi = (Λ1|xi)|xi . (29) 0 ⊗3 Λ1 |xi | i R 4 x 1 y 1 4 3 x 1 R y 2 2 U = 4 3 Ry x3 1 2 3 2 x 1 1 2 x 1 2 x 1 3 Figure 2: Unitary matrix U expressed as a product of quantum multiplexors. 2 Besides U , it is convenient to consider another unitary matrix called U . We 1 2 define U to be of the form of Fig.2, where the multiplexors are defined in such a way 2 that U satisfies, for all x,x,y,y Bool3, 2 ′ ′ ∈ y 0 1 h | | i hy2| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |0i y 0 3 y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) 0 h | U | i⊗ = h 3| | i (30) y 2 x h ′| | ′i y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) x h 1′| | ′1i hy2′| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |x′2i y x = Λ2(hy3′x|′)δ(y′,x′) . | ′3i (31) | 7 Hence U2 |0i⊗3= Λ2 |x′i or U2|x′i|0i⊗3 = |x′i(Λ2|x′i) . (32) x x ′ ′ | i | i U is called the q-embedding of M for j = 1,2. j j N Ba A = R y R N y Bb R y Figure 3: The unitary matrix A is a quantum embedding of a probability matrix P(b a), where a has N bits and b has N bits. Ba Bb | Sofarwehaveconsideredtheq-embeddingsU andU forthecaseofaclassical 1 2 Bayesian network x with 3 binary nodes. What if x has N nodes and some of those nds nodes have more than 2 states? In that case, we must use several qubits (horizontal lines) for each node x (and an equal number of qubits for the dual node x ) in i ′i Figs.1 and 2. More specifically, suppose P(x x ,x ,...,x ) equals P(b a) where 1| 2 3 Nnds | a BoolNBa and b BoolNBb. For the number of bits N , define the number of Ba ∈ ∈ states N = 2NBa. Likewise, let N = 2NBb. The constraint Eq.(24) generalizes to Sa Sb A(b,a˜ ˜b = 0,a) = P(b a)δa˜ , (33) | | a p where a,a˜ BoolNBa and b,˜b BoolNBb. Eq.(33) can be expressed in matrix form ∈ ∈ as follows: ˜ (b = 0,a) → (b,a˜) D0,0 [A(b,a˜ ˜b = 0,a)] = D1,0 , (34) | ↓ ··· DNSb−1,0 where, for all b BoolNBb, Db,0 RNSaXNSa are diagonal matrices with entries ∈ ∈ (Db,0) = P(b a)δa˜ . (35) a,a˜ | a p 8 By adding more columns to the matrix of Eq.(34), one can extended it (see section entitled “Q-Embeddings” in Ref.[5]) to a square matrix which can be expressed in terms of multiplexors as in Fig.3. The Markov Blanket MB(i) for a node x of the classical Bayesian network x i satisfies (see section entitled “Notation and Preliminaries” in Ref.[5]) P(x x ) = P(x x ) . (36) i i c i MB(i) | {} | If the set MB(i) is strictly smaller than the set i c, this property can be used to { } reduce the number of controls for the multiplexor in U and U corresponding to 1 2 P(x x ). i i c | {} Giventhetwo q-embeddings U andU foraBayesian network x, wecandefine 1 2 a unitary matrix U as follows U = U2†U1 . (37) Matrix U has the following highly desirable property: Claim 3 For any j,k 0,1,...,N 1 , S ∈ { − } 0 m h | U | ki = m δk . (38) m 0 j j j h | | i proof: 0 m y ΛT x x m mh | U2†U1 |0 ki = m y (cid:20) hy| 2 (cid:21)(cid:20) Λ |xi (cid:21) h | ki (39) h j| | i Xy,x h j| i h | 1| i = m y ΛT(y x)Λ (y x) x m (40) h j| i 2 | 1 | h | ki Xy,x = m M m = m δk . (41) h j| hyb| ki j j QED 4 Szegedy Quantum Walk Operator W In this section, we will define a special type of Szegedy quantum walk operator W corresponding to a Bayesian net x. We will then find the eigenvalues of W. 4.1 Definition of W As in Ref.[4], define the projection operator πˆ and its dual projection operator πˇ by 9 0 0 — πˆ = | ih | , πˇ = πˆ = . (42) — l l 0 0 | ih | Then the Szegedy quantum walk operator W for the Bayesian net x is defined by W = U( 1)πˇU ( 1)πˆ . (43) † − − 4.2 Eigenvalues of W To find the eigenvalues of W, we will use the following identities. Claim 4 πˆ m 0 = m 0 , (44a) j j | i | i πˆ(U ) m 0 = m m 0 , (44b) j j j l | i | i πˆ( U ) m 0 = m m 0 , (44c) l † | j i ∗j| j i for all j 0,1,...,N 1 . S ∈ { − } proof: From the definition of πˆ, we see that 0 0 πˆ | i = | i . (45) m m j j | i | i Also, 0 0 0 m 0 πˆ(U ) | i = | ih | U | ji = m | i , (46) l m m m 0 j m | ji Xk | kih k| | i | ji and 0 0 m 0 0 πˆ(l U†) |mi = |mih 0k| U† |mi = m∗j |mi . (47) | ji Xk | kih | | ji | ji QED An immediate consequence of Claim 4 is that m 0 U m 0 = m 0 πˆU m 0 = m δk , (48) h j | l | k i h j | l | k i j j for j,k 0,1,...,N 1 . S ∈ { − } Note that since m = 1, Eq.(48) implies that 0 m 0 = U m 0 . (49) 0 0 | i l | i 10