An Accelerated Uzawa Method for Application to Frictionless Contact Problem Yoshihiro Kanno † The Uzawa method is a method for solving constrained optimization prob- lems, and is often used in computational contact mechanics. The simplicity of this method is an advantage, but its convergence is slow. This paper presents an accelerated variant of the Uzawa method. The proposed method can be viewed as application of an accelerated projected gradient method to the Lagrangian dual problem. Preliminary numerical experiments suggest thattheconvergenceoftheproposedmethodismuchfasterthantheoriginal 7 Uzawa method. 1 0 Keywords 2 n Fast first-order method, accelerated gradient scheme, contact mechanics, a nonsmooth mechanics, Uzawa method, convex optimization. J 3 1 1 Introduction ] C It has been recognized well that contact mechanics has close relation with optimization O and variational inequalities [6, 27]. The static frictionless contact problem of a linear h. elastic body, also called Signorini’s problem, is one of the most fundamental problems t in contact mechanics. This is a boundary value problem to find the equilibrium configu- a m ration of an elastic body, where some portion of the boundary of the body can possibly [ touchthesurfaceofarigidobstacle(orthesurfaceofanotherelasticbody). Positivedis- tance between the elastic body and the obstacle surface (i.e., positive gap) implies zero 1 v contact pressure (i.e., zero reaction), while nonzero reaction implies zero gap. This dis- 8 junction nature can be described by using complementarity conditions. Moreover, the 5 frictionless contact problem can be formulated as a continuous optimization problem 6 3 under inequality constraints [27]. 0 TheUzawamethodisknownasaclassicalmethodforsolvingconstrainedoptimization . 1 problems [2, 5, 25]. Due to ease in implementation, the Uzawa method is often applied 0 to contact problems [10, 12, 18–22, 26]. Major drawback of the Uzawa method is that 7 1 its convergence is slow; it exhibits only linear convergence in general. : Recently, accelerated, or “optimal” [14, 15], first-order methods have received sub- v i stantial attention, particularly for solving large-scale optimization problems; see, e.g., X [3, 4, 13, 16]. Advantages of most of these methods include ease of implementation, r a cheap computation per each iteration, and fast local convergence. Application of an † Laboratory for Future Interdisciplinary Research of Science and Technology, Institute of Innova- tive Research, Tokyo Institute of Technology, Nagatsuta 4259, Yokohama 226-8503, Japan. E-mail: [email protected]. Phone: +81-45-924-5364. Fax: +81-45-924-5978. 1 accelerated first-order method to computational mechanics can be found in [11]. In this paper, we apply the acceleration scheme in [3] to the Uzawa method. The paper is organized as follows. Section 2 provides an overview of the necessary backgroundofthefrictionlesscontactproblemandtheUzawamethod. Section3presents an accelerated Uzawa method. Section 4 reports the results of preliminary numerical experiments. Some conclusions are drawn in section 5. In our notation, (cid:62) denotes the transpose of a vector or a matrix. For two vectors x = (x ) ∈ Rn and y = (y ) ∈ Rn, we define i i min{x,y} = (min{x ,y },...,min{x ,y })(cid:62). 1 1 n n We use Rn to denote the nonpositive orthant, i.e., Rn = {x = (x ) ∈ Rn | x ≤ 0 (i = − − i i 1,...,n)}. For y ∈ Rn, we use Π (y) to denote the projection of y on Rn, i.e., − − Π (y) = argmin{(cid:107)x−y(cid:107) | x ∈ Rn} = min{y,0}. − − We use diag(x) to denote a diagonal matrix, the vector of diagonal components of which is x. 2 Fundamentals of frictionless contact and Uzawa method We briefly introduce the frictionless contact problem of an elastic body; see, e.g., [27] and [6] for fundamentals of contact mechanics. Consider an elastic body subjected to a static load and a rigid obstacle fixed in space. Since the body cannot penetrate the surface of the obstacle, the deformation of the body is constrained from one side by the obstacle surface. We also assume the absence of friction and adhesion between the body surface and the obstacle surface. The set of these conditions is called the frictionless unilateral contact. Suppose that the conventional finite element procedure is adopted for discretization of the elastic body. Let u ∈ Rd denote the nodal displacement vector, where d is the numberofdegreesoffreedomofdisplacements. Weuseπ(u)todenotethetotalpotential energy caused by u. At the (unknown) equilibrium state, the nodes on a portion of the body surface can possibly make contact with the obstacle surface. Such nodes are called the contact candidate nodes, and the number of them is denoted by m. We use g (u) i (i = 1,...,m)todenotethegapbetweentheithcontactcandidatenodeandtheobstacle surface. The non-penetration conditions are then formulated as g (u) ≥ 0, i = 1,...,m. (1) i The displacement vector at the equilibrium state minimizes the total potential energy under the constraints in (1). Namely, the frictionless contact problem is formulated as follows: Minimize π(u) (2a) subject to g (u) ≥ 0, i = 1,...,m. (2b) i 2 The Uzawa method solving problem (2) is listed in Algorithm 1 [1, 2, 5, 25]. It is worth noting that step 2 of Algorithm 1 can be performed as the equilibrium analysis with a conventional finite element code. Also, step 3 is simple to implement. Due to sucheaseinimplementation, theUzawamethodiswidelyusedincomputationalcontact mechanics. Algorithm 1 Uzawa method Require: α > 0, r(0) ∈ Rm − 1: for k = 0,1,... do m 2: u(k) solves ∇π(u)+(cid:88)ri(k)∇gi(u) = 0 i=1 3: ri(k+1) := min{0,ri(k)+αgi(u(k))} (i = 1,...,m) 4: end for As shown in Ciarlet [5, section 9.4], the Uzawa method can be viewed as a projected gradient method solving the Lagrange dual problem. Essentials of this observation are repeated here. The Lagrangian L : Rd×Rm → R∪{−∞} associated with problem (2) is given by m (cid:88) π(u)+ rigi(u) if r ≤ 0, L(u,r) = (3) i=1 −∞ otherwise. Here, the Lagrange multipliers, r ,...,r , correspond to the reactions. The inequality 1 m constraints imposed on the reactions in (3) correspond to the non-adhesion conditions. Define ψ : Rm → R∪{−∞} by ψ(r) = inf{L(u,r) | u ∈ Rd}, (4) which is the Lagrange dual function. The Lagrange dual problem of (2) is then formu- lated as follows: Maximize ψ(r) (5a) subject to r ≤ 0. (5b) Since ψ is the pointwise infimum of a family of affine functions of r, it is concave (even if problem (2) is not convex). Therefore, the dual problem (5) is convex. Let α > 0 be an arbitrary constant. A point r ∈ Rm is optimal for problem (5) if and only if it satisfies r = Π (r+α∇ψ(r)). (6) − This fixed point relation yields the iteration of the projected gradient method as follows (see, e.g., [17]): r(k+1) := Π (r(k)+α∇ψ(r(k))). (7) − 3 Here, α plays a role of the step size. For given r¯ ∈ Rm, define u by − r¯ u = argmin{L(u,r¯) | u ∈ Rd}. (8) r¯ It can be shown that the relations ∂ ψ(r¯) = g (u ), i = 1,...,m (9) i r¯ ∂r i holds, when π is convex and g ,...,g are concave [5, Theorem 9.3-3]. Substitution of 1 m (9) into (7) results in step 3 of Algorithm 1. Thus, the Uzawa method for problem (2) is viewed as the projected gradient method applied to the Lagrange dual problem (5). It is worth noting that a reasonable stopping criterion, (cid:107)r(k)−r(k+1)(cid:107) ≤ (cid:15) with threshold (cid:15), is derived from the optimality condition in (6). If we assume the small deformation, the total potential energy is given by 1 π(u) = u(cid:62)Ku−p(cid:62)u. 2 Here, K ∈ Rd×d is the stiffness matrix, which is a constant positive definite symmetric matrix, and p ∈ Rd is the external nodal force vector. Moreover, g is linearized as i g (u) = h −n(cid:62)u, i = 1,...,m. (10) i i i For notational simplicity, we rewrite (10) as g(u) = h−Nu, where h ∈ Rm is a constant vector, and N ∈ Rm×d is a constant matrix. The upshot is that, in the small deformation theory, the frictionless contact problem is reduced to the following convex quadratic programming (QP) problem: 1 Minimize u(cid:62)Ku−p(cid:62)u (11a) 2 subject to h−Nu ≥ 0. (11b) The step size, α, of the Uzawa method for solving problem (11) is chosen as follows. Let λ (K) and σ (N) denote the minimum eigenvalue of K and the maximum singular 1 d value of σ (N). If α is chosen so that d α ∈]0,2λ (K)/σ (N)[, 1 d then it is shown that the Uzawa method converges [5, section 9.4]. 4 3 Accelerated Uzawa method In this section, we present an accelerated version of the Uzawa method. The Uzawa method in Algorithm 1 is essentially viewed as the projected gradient method (7). This method is considered a special case of the proximal gradient method, because the proximal operator of the indicator function of a nonempty closed convex set is reduced to the projection onto the set [17]. Therefore, it is natural to apply the acceleration scheme for the proximal gradient method [3] to the projected gradient update (7). This results in the update r(k+1) := Π (ρ(k)+α∇ψ(ρ(k))), (12) − ρ(k+1) := r(k)+ω (r(k+1)−r(k)). (13) k+1 Here, ω ∈ [0,1) is an extrapolation parameter which is to be determined so that the k convergence acceleration is achieved. In [3], ω is chosen as k 1(cid:16) (cid:113) (cid:17) τ := 1+ 1+τ2 , k+1 2 k τ −1 k ω := k+1 τ k+1 with τ := 1. The sequence of the dual objective values, {ψ(r(k))}, converges to the 0 optimal value with rate O(1/k2) [3]. Theacceleratedmethodin(12)and(13)isnotguaranteedtobemonotoneinthedual objective value. Therefore, we follow O’Donoghue and Cand`es [16] in incorporating an adaptive restart technique of the acceleration scheme. Namely, we perform the restart procedurewheneverthemomentumterm, r(k+1)−r(k), andthegradientoftheobjective function, ∇ψ(r(k)), make an obtuse angle, i.e., ∇ψ(r(k))(cid:62)(r(k+1)−r(k)) < 0. As the upshot, the accelerated Uzawa method with adaptive restart is listed in Algo- rithm 2. One reasonable stopping criterion is (cid:107)ρ(k)−r(k+1)(cid:107) ≤ (cid:15) (14) at step 4. Inthecaseofthesmalldeformationtheory, steps2and3ofAlgorithm2aresimplified as follows. • Step 2: Let u(k) be the solution to the system of linear equations Ku = p+N(cid:62)ρ(k). (15) Here, thecoefficientmatrixK iscommontoalltheiterations. Hence, wecarryout the Cholesky factorization only at the first iteration; at the following iterations, we can solve (15) only with the back-substitutions. • Step 3: γ(k) := h−Nu(k). 5 Algorithm 2 accelerated Uzawa method with restart Require: α > 0, r(0) ∈ Rm, ρ(0) := r(0), τ := 1 − 0 1: for k = 0,1,... do m 2: u(k) solves ∇π(u)+(cid:88)ρ(ik)∇gi(u) = 0 i=1 3: γ(k) := g(u(k)) 4: r(k+1) := min{0,ρ(k)+αγ(k)} 1(cid:16) (cid:113) (cid:17) 5: τk+1 := 2 1+ 1+4τk2 6: if (γ(k))(cid:62)(r(k+1)−r(k)) ≥ 0 then τ −1 7: ρ(k+1) := r(k+1)+ k (r(k+1)−r(k)) τ k+1 8: else 9: ρ(k+1) := r(k+1) 10: τk+1 := 1 11: end if 12: end for Figure 1: An elastic body on the obstacle. 4 Preliminary numerical experiments Consider an elastic body shown in Figure 1. The body is in the plane-stress state, with thickness5mm,width60mm,andheight20mm. Itconsistsofanisotropichomogeneous material with Young’s modulus 200GPa and Poisson’s ratio 0.3. The bottom edge of the body is on the rigid obstacle, and the left edge is fixed by supports. The uniform downwardtractionof50kPaisappliedtothetopedge, andtheuniformupwardtraction of 500kPa is applied to the right edge. The body is discretized into N ×N four-node X Y quadrilateral (Q4) finite elements, where N (= 3N ) is varied to generate problem X Y instances with different sizes. The number of degrees of freedom of displacements is d = 2N (N + 1) and the number of contact candidate nodes is m = N . At the X Y X equilibrium state, about 73.3% of contact candidate nodes are in contact with nonzero reactions. We assume the small deformation, and solve QP problem (11). Computation was carried out on a 2.2GHz Intel Core i5 processor with 8GB RAM. The proposed algorithm was implemented with MATLAB ver. 9.0.0. The threshold for the stopping criteria in (14) is set to (cid:15) = 10−6. Comparison was performed with QUADPROG [23], SDPT3 ver. 4.0 [24] with CVX ver. 2.1 [9], and PATH ver. 4.7.03 [8] via 6 100 ∗| ψ /| ) k (x 10-5 ψ − ∗ ψ 10-10 0 500 1000 1500 2000 Iteration, k (a) 102 100 al u sid 10-2 e R 10-4 10-6 0 500 1000 1500 2000 Iteration (b) Figure 2: Convergence history for (N ,N ) = (30,10). (a) The dual objective value; X Y and (b) the total residual. “Solid line” the accelerated Uzawa method with restart; “dotted line” the accelerated Uzawa method without restart; and “dashed line” the Uzawa method. the MATLAB Interface [7]. QUADPROG is a MATLAB built-in function for QP. We use an implementation of an interior-point method with setting the termination threshold, the parameter TolFun, to 10−8. SDPT3 implements a primal-dual interior-point method for solving conic programming problems. A QP problem is converted to the standard form of conic programming by CVX, where the parameter cvx precision for controlling the solver precision is set to high. PATH is a nonsmooth Newton method to solve mixed complementarity problems, and hence can solve the KKT condition for QP. Figure 2 reports the convergence history of the proposed algorithm (Algorithm 2). It also shows the result of the accelerated Uzawa method without restart scheme, and that of Algorithm 1 (i.e., the Uzawa method without acceleration). Figure 2(a) shows the convergence history of the dual objective function. It is observed that the acceler- ation and restart schemes drastically speed up the convergence. Particularly, with the proposed algorithm, the dual objective value seems to converge quadratically and mono- 7 100 102 Time (s) 101 Residual 10-5 100 10-10 10-1 103 104 103 104 No.ofdegreesoffreedom,d No.ofdegreesoffreedom,d (a) (b) Figure 3: Comparisonof(a)thecomputationaltime; and(b)thetotalresidual. “◦”The accelerated Uzawa method with restart; “(cid:52)” the Uzawa method; “×” SDPT3; “+” QUADPROG; and “(cid:3)” PATH. tonically. The residual in Figure 2(b) is defined by using the KKT condition. Namely, we define vectors e (j = 1,2,3,4) by j e = Ku(k)−f −N(cid:62)r(k), 1 e = min{g(u(k)),0}, 2 e = max{r(k),0}, 3 e = −diag(g(uk))r(k), 4 and the value of (cid:107)(e ,e ,e ,e )(cid:107) is reported in Figure 2(b). 1 2 3 4 Figure 3 compares the computational results of the five methods. It is observed in Figure 3(a) that QUADPROG is fastest for almost all the problem instances. The computational time required by PATH is very small for small instances, but drastically increases as the instance size increases. On the other hand, it is observed in Figure 3(b) that PATH converges to the highest accuracy. For large instances, the proposed method isthesecond-fastestmethodamongthefivemethods. Thecomputationaltimeof SDPT3 is larger than that of the proposed method. Also, SDPT3 converges to the lowest accu- racy. The residuals of the solutions obtained by the proposed method, the unaccelerated Uzawa method, and QUADPROG are similar. The computational time required by the unaccelerated Uzawa method is very large. The upshot is that, within the small deformation theory, the proposed method is quite efficient, but the interior-point method for QP (QUADPROG) is more efficient. If we consider large deformation, problem (2) is not convex, and hence QUADPROG and SDPT3 cannot be adopted. Efficiency of the proposed method, that solves the convex dual problem, remains to be studied. 8 5 Concluding remarks This paper has presented an accelerated Uzawa method. The method can be recognized as an accelerated projected gradient method solving the Lagrangian dual problem of a constrained optimization problem. It has been shown in the numerical experiments that the acceleration and restart schemes drastically speed up the convergence of the Uzawa method. Besides, the proposed method is easy to implement. Many possibilities of extensions could be considered. To update the primal variables, the Uzawa method solves a system of linear equations. For large-scale problems, this processmightbereplacedbyafastfirst-orderminimizationmethodofaconvexquadratic function, e.g., [13]. Also, an extension to large deformation problems remains to be explored. Acknowledgments This work is partially supported by JSPS KAKENHI 26420545. References [1] P. Alart, A. Curnier: A mixed formulation for frictional contact problems prone to Newton like solution methods. Computer Methods in Applied Mechanics and Engineering, 92, 353–375 (1991). [2] G. Allaire: Numerical Analysis and Optimization. Oxford University Press, New York (2007). [3] A. Beck, M. Teboulle: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2, 183–202 (2009). [4] S. Becker, J. Bobin, E. J. Cand`es: NESTA: A fast and accurate first-order method for sparse recovery. SIAM Journal on Imaging Sciences, 4, 1–39 (2011). [5] P. G. Ciarlet: Introduction to Numerical Linear Algebra and Optimisation. Cam- bridge University Press, Cambridge (1989). [6] G. Duvaut, J. L. Lions: Inequalities in Mechanics and Physics. Springer-Verlag, Berlin (1976). [7] M. C. Ferris, T. S. Munson: Interfaces to PATH 3.0: Design, implementation and usage. Computational Optimization and Applications, 12, 207–227 (1999). [8] M. C. Ferris, T. S. Munson: Complementarity problems in GAMS and the PATH solver. Journal of Economic Dynamics and Control, 24, 165–188 (2000). [9] M. Grant, S. Boyd: Graph implementations for nonsmooth convex programs. In: V. Blondel, S. Boyd, H. Kimura (eds.), Recent Advances in Learning and Control (A Tribute to M. Vidyasagar), Springer (2008). pp. 95–110. 9 [10] J. Haslinger, R. Kuˇcera, Z. Dost´al: An algorithm for the numerical realization of 3D contact problems with Coulomb friction. Journal of Computational and Applied Mathematics, 164–165, 387–408 (2004). [11] Y.Kanno: Afastfirst-orderoptimizationapproachtoelastoplasticanalysisofskele- tal structures. Optimization and Engineering, 17, 861–896 (2016). [12] J. Koko: Uzawa block relaxation domain decomposition method for a two-body frictionless contact problem. Applied Mathematics Letters, 22, 1534–1538 (2009). [13] Y. T. Lee, A. Sidford: Efficient accelerated coordinate descent methods and faster algorithmsforsolvinglinearsystems.IEEE54thAnnualSymposiumonFoundations of Computer Science, Berkeley (2013) pp. 147–156. [14] Y.Nesterov: Amethodofsolvingaconvexprogrammingproblemwithconvergence rate O(1/k2). Soviet Mathematics Doklady, 27, 372–376 (1983). [15] Y. Nesterov: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic Publishers, Dordrecht (2004). [16] B. O’Donoghue, E. Cand`es: Adaptive restart for accelerated gradient schemes. Foundations of Computational Mathematics, 15, 715–732 (2015). [17] N.Parikh, S.Boyd: Proximalalgorithms.Foundations and Trends in Optimization, 1, 127–239 (2014). [18] M. A. Puso, T. A. Laursen, J. Solberg: A segment-to-segment mortar contact method for quadratic elements and large deformations. Computer Methods in Ap- plied Mechanics and Engineering, 197, 555–566 (2008). [19] M. Raous: Quasistatic Signorini problem with Coulomb friction and coupling to adhesion.In: P.Wriggers, P.Panagiotopoulos(eds.), New Developments in Contact Problems, Springer-Verlag, Wien (1999) pp. 101–178. [20] E. M. Rudoy: Domain decomposition method for a model crack problem with a possible contact of crack edges. Computational Mathematics and Mathematical Physics, 55, 305–316 (2015). [21] I˙. Temizer: A mixed formulation of mortar-based frictionless contact. Computer Methods in Applied Mechanics and Engineering, 223–224, 173–185 (2012). [22] I˙.Temizer,P.Wriggers,T.J.R.Hughes: Three-dimensionalmortar-basedfrictional contact treatment in isogeometric analysis with NURBS. Computer Methods in Applied Mechanics and Engineering, 209–212, 115–128 (2012). [23] The MathWorks, Inc.: MATLAB Documentation. http://www.mathworks.com/ (Accessed November 2016). 10