ebook img

Monte Carlo Quasi-Heatbath by approximate inversion PDF

11 Pages·0.13 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Monte Carlo Quasi-Heatbath by approximate inversion

SCSC-TR-98-13 9 9 9 1 n Monte Carlo Quasi-Heatbath a by approximate inversion J 7 2 v Philippe de Forcrand 5 Swiss Center for Scientific Computing (SCSC) 2 ETH-Zu¨rich, CH-8092 Zu¨rich 0 Switzerland 1 [email protected] 1 8 9 / t a m - d n o Abstract c : When sampling the distribution P(φ~) exp( Aφ~ 2), a global v ∝ −| | heatbath normally proceeds by solving the linear system Aφ~ = ~η, i X where ~η is a normal Gaussian vector, exactly. This paper shows how r to preserve the distribution P(φ~) while solving the linear system with a arbitrarily low accuracy. Generalizations are presented. PACS numbers: 02.70.L, 02.50.N, 52.65.P,11.15.H, 12.38.G In Monte Carlo simulations, it is frequently the case that one wants to sample a vector φ~ from a distribution of the Gaussian type exp( Aφ~ 2). Typically,φ~ hasmanycomponents,andAisalarge,sparsem∝atrix. I−n|latt|ice field theory, φ~ is the value of the continuum field φ~ at regular grid points, and A is the discretized version of some differential operator . Illustrative A examples used in this paper are = m+ip~ (free field) and = m+ip/ A A (Dirac operator). The goal of the Monte Carlo simulation is to provide independent configurations of φ~ at the least cost. The brute-force approach consists of drawing successive random vec- tors ~η(k) from the normal Gaussian distribution exp( ~η 2), and of solving Aφ~(k) = ~η(k). The solution of this linear system c−a|n|be efficiently ob- tained with an iterative linear solver (Conjugate Gradient if A is Hermi- tian, BiCGStab otherwise). This approach can be called a global heatbath, because φ~(k+1) has no memory of φ~(k): the heatbath has touched all the components of φ~. To avoid a bias, the solver must be iterated to full con- vergence, which is often prohibitively expensive. One may try to limit the accuracy while maintaining the bias below statistical errors, but this re- quires a delicate compromise difficult to tune a priori. A notable example of this global heatbath approach is thestochastic evaluation of inverse Dirac matrix elements, where several hundred “noise vectors” ~η(k) are inverted to yield (A†A)−1 φ φ† . An abundant literature has been devoted to the ij ≈ h i jik optimization of this procedure [1, 2]. For the free field or the Dirac operator mentioned above, the number of iterations of the solver required to reach a given accuracy grows like the correlation length ξ 1/m. Thus the work per new, independent φ~ is c ξz ≡ where z, the dynamical critical exponent, is 1. However, the prefactor c is large. For this reason, local updates, where only one component of φ~ is changed at a time, are often preferred. They usually provide an indepen- dent φ~ after an amount of work c′ ξ2, but with a much smaller prefactor c′ [3]. This paper presents an adaptation of the global heatbath which allows for arbitrarily low accuracy in the solution of Aφ~ = ~η, thus reducing the prefactor c, while maintaining the correct distribution. This is obtained by the introduction of an accept/reject test of the Metropolis type, making the procedure a “quasi-heatbath.” The method is described in the next section. Generalizations, including a local version, are presented afterwards. 1 1 Quasi-Heatbath Efficient Monte Carlo often relies on the subtle introduction of auxiliary degrees of freedom. Consider here a vector χ~ distributed according to 1 exp( χ~ Aφ~ 2). Note that Z is a constant (πN for an N-component Zχ −| − | χ complex vector) independent of φ~. Therefore, the original distribution of φ~, 1 exp( Aφ~ 2), is unchanged by the introduction of χ~: Zφ −| | 1 φ~ e−|Aφ~|2 = 1 φ~ χ~ e−|Aφ~|2−|χ~−Aφ~|2. (1) Z Z D Z Z Z D D φ φ χ We can now alternate Monte Carlo steps on φ~ and χ~, with the following prescription: 1. Perform a global heatbath on χ~: χ~ Aφ~+~η, (2) ←− where ~η is a normal Gaussian vector; 2. ReflectAφ~ withrespecttotheminimumof thequadraticform(Aφ~ 2+ χ~ Aφ~ 2): | | | − | Aφ~ χ~ Aφ~, ←− − i.e., φ~ A−1χ~ φ~. (3) ←− − Step 2 conserves the probability of φ~ but is not ergodic. Step 1 provides the ergodicity. Note that step 2 exchanges the two terms Aφ~ 2 and χ~ Aφ~ 2 in the quadratic form. Since χ~ Aφ~ in step 1 is set to a|new| rando|m−vecto|r ~η, Aφ~ at the end of step 2 is eq−ual to ~η. Therefore, a completely decorrelated φ~ has been generated. The vector χ~ is not needed any longer and can be discarded. This two-step algorithm can now be modified slightly. The vector A−1χ~ in Eq.(3) need not be computed exactly. Consider an approximate solution ζ~ with Aζ~ = χ~ ~r, where ~r = ~0 is the residual. Step 2 should now be considered as a −way to propos6 e a candidate φ~′ = ζ~ φ~ in a Metropolis procedure. Since ζ~ is completely independent of φ~ or φ−~′, the probability of proposing φ~′ given φ~ is the same as that of proposing φ~ given φ~′. Detailed balance will be satisfied with the additional step: 2 3. Accept the candidate φ~′ = ζ~ φ~ with probability − P (φ~ φ~′) = min(1,e−∆S), (4) acc → where ∆S = Aφ~′ 2+ χ~ Aφ~′ 2 Aφ~ 2 χ~ Aφ~ 2. | | | − | −| | −| − | Simple algebra shows that ∆S = 2Re(r† (Aφ~ Aφ~′)), · − which is antisymmetric under the exchange φ~ φ~′, as it should be. If the linearsystemAζ~ = χ~ issolvedexactly(~r =~0),t↔hen∆S = 0andonerecovers the original global heatbath with acceptance 1. Otherwise, the candidate φ~′ may be rejected, in which case φ~(k) must be included once more in the Monte Carlo sequence: φ~(k+1) = φ~(k). As the residual is allowed to grow, the average acceptance falls. But no bias is introduced: the distribution of φ~ remains 1 exp( Aφ~ 2). Zφ −| | The optimal magnitude of ~r is thus the result of a compromise between accuracy and acceptance. The average acceptance of the prescription (4) is erfc( (∆S)2 /8) [5]. Here (∆S)2 can be evaluated as a function of the h i h i conveprgence criterion ǫ of the linear solver. If the solver yields a residual ~r such that ||~r|| ǫ, then (∆S)2 8Nǫ2, where Aφ~, Aφ~′, and ~r have been ||χ~|| ≤ h i ≤ considered independent random Gaussian vectors with variance N, N, and 2ǫ2N, respectively, and N is the number of their components. Therefore, the acceptance is simply acceptance erfc(ǫ√N). (5) h i≈ Inotherwords,theacceptanceisentirelydeterminedbyǫandN,thenumber of degrees of freedom (the volume) of the system, and is independent of the matrix A. To maintain a constant acceptance as the volume grows, the convergence criterion for the solution of Aζ~ = χ~ should vary like 1/√N. An accuracy ǫ 10−3 10−4 provides an acceptance of 80-90% up to systems ∼ − of 106 degrees of freedom. There is no need for higher accuracy. The convergence of an iterative solver is typically exponential: ǫ n ∼ e−n/ξ after n iterations. Therefore, the above prescription reduces the work byafactorof2to3comparedtotheusualapproachwhichiteratesthesolver until “full” convergence (which typically means ǫ < 10−8 10−12). Illustra- ∼ − tive results are shown in Fig. 1 for the case of the Wilson–Dirac operator. This figure shows the number of iterations, the acceptance, and the work per independent φ~ as a function of ǫ. The acceptance obeys erfc(c ǫ√N), 3 wherec < 1(0.75 here)reflects thefactthattheresidualis always smaller(c times smaller on average) than required by the stopping criterion. For this system of N = 49 152 variables (84 lattice), the optimal stopping criterion is near 10−3. Figure 1: As the stopping criterion in the iterative solver is varied, the number of solver iterations (top) and the acceptance of the quasi-heatbath (middle)change. TheacceptanceiswelldescribedbyEq.(5)(solidline). The work per new φ~ (bottom) shows a clear minimum. The optimal stopping criterion depends on the system size only (49 152 here). 4 2 Generalizations A. Over- and under-relaxation Consider a modification of Eq.(1) with a parameter λ: 1 φ~ e−|Aφ~|2 = 1 φ~ χ~ e−|Aφ~|2−|χ~−λAφ~|2. (6) Z Z D Z Z Z D D φ φ χ The same 3-step algorithm of Section 1 now reads: 1. Heatbath on χ~: χ~ λ Aφ~+~η; (7) ←− 2. Reflection of φ~ abouttheapproximate minimumof thequadratic form: 2λ φ~′ = ζ~ φ~, (8) 1+λ2 − where Aζ~ = χ~ ~r; − 3. Accept φ~′ with probability P (φ~ φ~′) = min(1,e−∆S) where ∆S = acc Aφ~′ 2+ χ~ λAφ~′ 2 Aφ~ 2 χ~ →λAφ~ 2; and, by simple algebra, | | | − | −| | −| − | ∆S = 2λ Re(r† (Aφ~ Aφ~′)). (9) · − Thus, as λ decreases from 1, (∆S)2 also decreases, which boosts the acceptance. On the other hand, Ehq.(8) inidicates that φ~′ approaches φ~ as λ 0, so that φ~′ and φ~ become very (anti)correlated. The parame−ter λ → allows interpolation between simple reflection (λ = 0) and no motion at all (λ = + ). In fact, substituting Eq.(7) into Eq.(8) gives ∞ 1 λ2 2λ φ~′ = − φ~+ (A−1~η ~r). (10) −1+λ2 1+λ2 − Taking~r =~0, onecan identify this prescriptionwith thatof Adler’s stochas- tic over-relaxation (AOR) [4]: φ~′ = (1 ω)φ~+ ω(2 ω)A−1~η, (11) − q − provided ω = 2 . The quasi-heatbath can be viewed as a flexible, global 1+λ2 generalization of Adler’s AOR. 5 It is clear from Eq.(9) that λ < 1 allows for a looser convergence crite- rion ǫ 1/λ. However, the work to reach convergence typically grows like ∼ log ǫ, whereas Eq.(10) indicates that the number of Monte Carlo steps to d−ecorrelate φ~ will grow like 1 . Therefore, it seems inadvisable to depart λ2 from λ =1. Nonetheless, there are many situations where a completely independent φ~ at each Monte Carlo step is a wasteful luxury. When the matrix A fluc- tuates and depends on other variables U, it will take some time for the U to equilibrate in the new background φ~(k+1). Equilibration will be achieved quickly over short distances, more slowly over large ones. In that case it is useful to refresh the short-wavelength modes of φ~ at every MC step, but not the long-wavelength ones. The situation is similar for the stochastic evaluation of inverse Dirac matrix elements: one is interested in estimating (A†A)−1, where the distance i j is short. Refreshing the long-wavelength ij | − | modes every time is wasteful. B. Selective mode refresh The quasi-heatbath may be tailored for this purposeby modifying the basic Eq.(1) to 1 φ~ e−|Aφ~|2 = 1 φ~ χ~ e−|Aφ~|2−|χ~−Cφ~|2. (12) Z Z D Z Z Z D D φ φ χ ThematrixC playstheroleoftheearlierλA,exceptthatnowλdepends ontheeigenmodeconsidered. Thethreebasicstepsofthealgorithmbecome: 1. Heatbath on χ~: χ~ Cφ~+~η; (13) ←− 2. Reflection of φ~ abouttheapproximate minimumof thequadratic form: φ~′ = ζ~ φ~, (14) − where 1 (A†A+C†C)ζ~ = C†χ~ ~r; (15) 2 − 3. Accept φ~′ with probability P (φ~ φ~′)= min(1,e−∆S), where acc → ∆S = 2Re(r† (φ~ φ~′)). · − Forsimplicity,considerthecasewhereC andAcommute. Thecandidate φ~′ can be expressed as 6 φ~′ = (A†A+C†C)−1(A†A C†C)φ~+2(A†A+C†C)−1(C†~η ~r). − − − One wishes to obtain a heatbath (λ 1 in Section A) on short-wavelength ∼ modes. This implies a cancellation of eigenvalues in (A†A C†C) for short − wavelengths. For long wavelengths a heatbath is not necessary, and one could have λ 0 or + . One possible way to implement this would be ∼ ∞ C = F−1ΛFA, where F is the Fourier transform and Λ is a diagonal matrix with entries λ(~k) growing from 0 to 1 with momentum ~k . However, for operators | | A of the free-field or Dirac type, a simpler and equivalent way consists of modifying the mass parameter m to m > m. This is equivalent to λ(~k )= C | | (m2 +k2)/(m2 +k2). C q The mass which enters into the linear system to solve (15) is m = eff (m2+m2)/2. As m is increased, so is m . The work to approximately C C eff q solve Eq.(15) decreases as1/m . Therefore, oneachieves thedesiredeffect eff of refreshing short-wavelength modes at cheaper cost. By drawing m ran- C domly from a suitable distribution at each MC step, the tailored refreshing of all Fourier modes with the desired frequency can be achieved. C. Local version The quasi-heatbath described so far is a global update procedure: all com- ponents of φ~ are updated together. A local version readily suggests it- self: restricting the auxiliary vector χ~ to have only 1 non-zero component, χ = χ δ (or any subset of components). i i,i0 Eq.(1) then becomes 1 φ~ e−|Aφ~|2 = 1 φ~ χ~ e−|Aφ~|2−|χ−(Aφ~)i0|2. (16) Z Z D Z Z Z D D φ φ χ The algorithm is unchanged: 1. Heatbath on χ: χ (Aφ~) +η; ←− i0 2. Approximate reflection of φ~: φ~′ = ζ~ φ~, where Aζ~ = χ~ ~r; − − 3. Accept φ~′ with probability P (φ~ φ~′)= min(1,e−∆S), acc → where ∆S = Re(r† (Aφ~ Aφ~′))+Re(r⋆ ((Aφ~) (Aφ~′) )). · − i0 · i0 − i0 7 Inthiscase, ζ~istheapproximateGreen’sfunctionofAforasourceati . 0 Itwillhave asupportof size (ξ), sothatthelocal change inχ willinduce O i0 a change in φ over a whole domain. By varying i from 1 to N, one sweeps 0 the whole system and generates a new vector φ~(k+1). If the acceptance is maintained close to 1, φ~(k+1) will essentially be uncorrelated with φ~(k). However, the work per local update is proportional to ξd in d dimensions, so that this approach becomes very inefficient when the correlation length ξ is large. Nevertheless, it may be advantageous for moderate ξ. The reason is that the approximate solution ζ~ A−1χδ(i ) need not be obtained by a 0 ≈ Krylov method, which applies successive powers of A to the initial residual. Instead, onemay searchforthebestsolution ζ~amongallvectors oflocalized support, for instance, i and its nearest neighbours. 0 D. Adler’s stochastic over-relaxation Finally, the local variable χ δ(i,i ) may interact with φ~ in the simplest way, 0 with a contact interaction. This modifies Eq.(1) to 1 φ~ e−|Aφ~|2 = 1 φ~dχ e−|Aφ~|2−|χ−λφ(i0)|2. (17) Z Z D Z Z Z D φ φ χ If one chooses to update only φ(i ) and leave the other components of φ~ 0 unchanged, then there is no need to invert the matrix A. The algorithm simplifies to: 1. Heatbath on χ: χ λφ(i )+η; 0 ←− 2. Reflection of φ(i ) with respect to the minimum of the quadratic form 0 (m2+λ2)φ(i )2 +(φ(i )† (ψ λχ)+H.c.)+constant, 0 0 | | · − where m2 (A†A) and ψ (A†A) φ(j). ≡ i0i0 ≡ i0j This reflection is exact, and so the acceptance test disappears. The new reflected value is λχ ψ φ′(i ) = 2 − φ(i ) 0 1+λ2 − 0 1 λ2 2 2λ = − φ(i ) ψ+ η. −1+λ2 0 − 1+λ2 1+λ2 This prescription is identical to Adler’s stochastic over-relaxation [4] with the change of notation ω 2/(1+λ2). ↔ 8 3 Conclusion The quasi-heatbath Eqs.(2)–(4) is a simple and efficient method to globally change a vector φ~ distributed according to 1 e−|Aφ~|2. Like the global heat- Zφ bath consisting of solving Aφ~ = ~η, where ~η is a Gaussian vector, exactly at each Monte Carlo step, the quasi-heatbath also has dynamical critical exponent 1. The prefactor is reduced by a factor of 2 to 3 because the linear system Aφ~ = ~η can now be solved approximately. Whatever the level of accuracy, an acceptance test maintains the exact distribution e−|Aφ~|2. The most efficient choice for the accuracy level is (1/√N), where N is the O volume of the system. Several generalizations of the quasi-heatbath have been proposed. A simple modification makes it possible to refresh each of the Fourier compo- nents of Aφ~ at aprescribedrate. Alocal version may beadvantageous when the correlation length is moderate. In a limiting case, this version becomes identical to Adler’s stochastic over-relaxation. I thank Massimo D’Elia for interesting discussions and valuable com- ments. 9

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.