NOTES ON OPTIMAL CONTROL THEORY
with economic models and exercises

Andrea Calogero
Dipartimento di Matematica e Applicazioni – Università di Milano-Bicocca
([email protected])

October 30, 2018

Contents

1 Introduction to Optimal Control
  1.1 Some examples
  1.2 Statement of problems of Optimal Control
    1.2.1 Admissible control and associated trajectory
    1.2.2 Optimal Control problems

2 The simplest problem of OC
  2.1 The necessary condition of Pontryagin
    2.1.1 The proof in a particular situation
  2.2 Sufficient conditions
  2.3 First generalizations
    2.3.1 Initial/final conditions on the trajectory
    2.3.2 On minimum problems
  2.4 The case of Calculus of Variation
  2.5 Examples and applications
    2.5.1 The curve of minimal length
    2.5.2 A problem of business strategy I
    2.5.3 A two-sector model
    2.5.4 A problem of inventory and production I
  2.6 Singular and bang-bang controls
    2.6.1 The building of a mountain road: a singular control
  2.7 The multiplier as shadow price I: an exercise

3 General problems of OC
  3.1 Problems of Bolza, of Mayer and of Lagrange
  3.2 Problems with fixed or free final time
    3.2.1 Fixed final time
    3.2.2 Free final time
    3.2.3 The proof of the necessary condition
    3.2.4 The moonlanding problem
  3.3 The Bolza problem in Calculus of Variations
    3.3.1 Labor adjustment model of Hamermesh
  3.4 Existence and controllability results
    3.4.1 Linear time optimal problems
  3.5 Time optimal problem
    3.5.1 The classical example of Pontryagin and its boat
    3.5.2 The Dubin car
  3.6 Infinite horizon problems
    3.6.1 The model of Ramsey
  3.7 Current Hamiltonian
    3.7.1 A model of optimal consumption with log-utility I

4 Constrained problems of OC
  4.1 The general case
  4.2 Pure state constraints
    4.2.1 Commodity trading
  4.3 Isoperimetric problems in CoV
    4.3.1 Necessary conditions with regular constraints
    4.3.2 The multiplier ν as shadow price
    4.3.3 The foundation of Cartagena
    4.3.4 The Hotelling model of socially optimal extraction

5 OC with dynamic programming
  5.1 The value function: necessary conditions
    5.1.1 The final condition on V: a first necessary condition
    5.1.2 Bellman's Principle of optimality
  5.2 The Bellman-Hamilton-Jacobi equation
    5.2.1 Necessary condition of optimality
    5.2.2 Sufficient condition of optimality
    5.2.3 Affine Quadratic problems
  5.3 Regularity of V and viscosity solution
    5.3.1 Viscosity solution
  5.4 More general problems of OC
    5.4.1 On minimum problems
  5.5 Examples and applications
    5.5.1 A problem of business strategy II
    5.5.2 A problem of inventory and production II
  5.6 The multiplier as shadow price II: the proof
  5.7 Infinite horizon problems
    5.7.1 Particular infinite horizon problems
    5.7.2 A model of consumption with HARA-utility
    5.7.3 Stochastic consumption: the idea of Merton's model
    5.7.4 A model of consumption with log-utility II
  5.8 Problems with discounting and salvage value
    5.8.1 A problem of selecting investment

Chapter 1

Introduction to Optimal Control

1.1 Some examples

Example 1.1.1. The curve of minimal length and the isoperimetric problem
Suppose we are interested in finding the curve of minimal length joining two distinct points in the plane. Suppose that the two points are (0,0) and (a,b). Clearly we can suppose that a = 1. Hence we are looking for a function x : [0,1] → R such that x(0) = 0 and x(1) = b.
The length of such a curve is defined by \int_0^1 ds, i.e. as the "sum" of arcs of infinitesimal length ds; applying the theorem of Pythagoras to the infinitesimal triangle with legs dt and dx, we obtain

    (ds)^2 = (dt)^2 + (dx)^2  ⇒  ds = \sqrt{1 + \dot x^2}\, dt,

where \dot x = dx/dt. Hence the problem is

    \min_x \int_0^1 \sqrt{1 + \dot x^2(t)}\, dt
    x(0) = 0,  x(1) = b.                                    (1.1)

It is well known that the solution is a line. We will solve this problem in subsection 2.5.1.
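As a quick numerical illustration (my sketch, not part of the original notes), one can discretize the length functional in (1.1) and compare the straight line with another curve having the same endpoints; the straight line gives the smaller value. All function names below are illustrative.

```python
import math

def curve_length(x, n=10_000):
    """Approximate the functional \\int_0^1 sqrt(1 + x'(t)^2) dt
    using forward differences on a uniform grid."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = i * h
        slope = (x(t + h) - x(t)) / h
        total += math.sqrt(1.0 + slope * slope) * h
    return total

b = 2.0
line = lambda t: b * t                                  # straight line from (0,0) to (1,b)
detour = lambda t: b * t + math.sin(math.pi * t)        # same endpoints, longer path

print(curve_length(line))    # ≈ sqrt(1 + b^2) ≈ 2.2361
print(curve_length(detour))  # noticeably larger
```

This does not prove minimality, of course; the proof via the necessary conditions comes in subsection 2.5.1.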
A more complicated problem is to find the closed curve in the plane of assigned length such that the area inside such curve is maximum: we call this problem the foundation of Cartagena.¹ This is the isoperimetric problem. Without loss of generality, we consider a curve x : [0,1] → R such that x(0) = x(1) = 0. Clearly the area delimited by the curve and the t axis is given by \int_0^1 x(t)\, dt. Hence the problem is

    \max_x \int_0^1 x(t)\, dt
    x(0) = 0,  x(1) = 0,                                    (1.2)
    \int_0^1 \sqrt{1 + \dot x^2(t)}\, dt = A > 1.

Note that the length of the interval [0,1] is exactly 1 and hence it is reasonable to require A > 1. We will present the solution in subsection 4.3.3.

¹When Cartagena was founded, as much land was granted for its construction as a man could circumscribe in one day with his plow: what form should the furrow have so that it encloses the maximum possible land, given the length of furrow that a man can dig in a day? Or, mathematically speaking, what is the shape with the maximum area among all the figures with the same perimeter?

Example 1.1.2. A problem of business strategy
A factory produces a unique good with a rate x(t) at time t. At every moment, such production can either be reinvested to expand the productive capacity or sold. The initial productive capacity is α > 0; such capacity grows at the reinvestment rate. Taking into account that the selling price is constant, what fraction u(t) of the output at time t should be reinvested to maximize total sales over the fixed period [0,T]?
Let us introduce the function u : [0,T] → [0,1]; clearly, if u(t) is the fraction of the output x(t) that we reinvest, (1 − u(t))x(t) is the part of x(t) that we sell at time t at the fixed price P > 0. Hence the problem is

    \max_{u \in C} \int_0^T (1 − u(t)) x(t) P\, dt
    \dot x = ux                                             (1.3)
    x(0) = α,
    C = {u : [0,T] → [0,1] ⊂ R,  u ∈ KC}

where α and T are positive and fixed. We will present the solution in subsection 2.5.2 and in subsection 5.5.1.

Example 1.1.3. The building of a mountain road
The altitude of a mountain is given by a differentiable function y, with y : [t_0, t_1] → R.
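The dynamics in (1.3) are easy to explore numerically. The sketch below (my illustration, not from the notes, with arbitrary parameter values) Euler-integrates \dot x = ux for strategies of the form "reinvest everything until time s, then sell everything" and compares the resulting total sales.

```python
def total_sales(switch, T=2.0, alpha=1.0, P=1.0, n=20_000):
    """Euler simulation of x' = u*x with u = 1 before `switch`, u = 0 after;
    accumulates the objective \\int_0^T (1-u) x P dt."""
    dt = T / n
    x, sales = alpha, 0.0
    for i in range(n):
        t = i * dt
        u = 1.0 if t < switch else 0.0
        sales += (1.0 - u) * x * P * dt     # revenue from the sold fraction
        x += u * x * dt                     # capacity grows with reinvestment
    return sales

for s in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(s, round(total_sales(s), 4))
```

For these parameters the best sampled switch time is s = 1 = T − 1, which anticipates the bang-bang structure of the solution found in subsection 2.5.2.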
We have to construct a road: let us determine the shape of the road, i.e. the altitude x = x(t) of the road in [t_0, t_1], such that the slope of the road never exceeds α, with α > 0, and such that the total cost of the construction

    \int_{t_0}^{t_1} (x(t) − y(t))^2\, dt

is minimal. Clearly the problem is

    \min_{u \in C} \int_{t_0}^{t_1} (x(t) − y(t))^2\, dt
    \dot x = u                                              (1.4)
    C = {u : [t_0, t_1] → [−α, α] ⊂ R,  u ∈ KC}

where y is an assigned and continuous function. We will present in subsection 2.6.1 the solution of this problem, introduced in chapter IV of [22].

Example 1.1.4. "In boat with Pontryagin".
Suppose we are on a boat that at time t = 0 has distance d_0 > 0 from the pier of the port and has velocity v_0 in the direction of the port. The boat is equipped with a motor that provides an acceleration or a deceleration. We are looking for a strategy to arrive at the pier in the shortest time with a "soft docking", i.e. with vanishing speed at the final time T.
We denote by x = x(t) the distance from the pier at time t, by \dot x the velocity of the boat and by \ddot x = u the acceleration (\ddot x > 0) or deceleration (\ddot x < 0). In order to obtain a "soft docking", we require x(T) = \dot x(T) = 0, where the final time T is clearly unknown. We note that our strategy depends only on our choice of u(t) at every time. Hence the problem is the following:

    \min_{u \in C} T
    \ddot x = u
    x(0) = d_0,  \dot x(0) = v_0,                           (1.5)
    x(T) = \dot x(T) = 0,
    C = {u : [0,∞) → [−1,1] ⊂ R}

where d_0 and v_0 are fixed and T is free.
This is one of the possible ways to introduce a classic example due to Pontryagin; it shows the various and complex situations that arise in optimal control problems (see page 23 in [22]). We will solve this problem in subsection 3.5.1. △

Example 1.1.5. A model of optimal consumption.
Consider an investor who, at time t = 0, is endowed with an initial capital x(0) = x_0 > 0. At any time he and his heirs decide about their rate of consumption c(t) ≥ 0. Thus the capital stock evolves according to

    \dot x = rx − c

where r > 0 is a given and fixed rate of return.
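To see the bang-bang mechanism behind (1.5) at work, here is a minimal Euler simulation (my sketch, not from the notes) for the special case d_0 = 1 and a boat initially at rest (v_0 = 0): full acceleration toward the pier for one time unit followed by full braking achieves a soft docking at T = 2. The general strategy, with its switching curve, is derived in subsection 3.5.1.

```python
def simulate(u, x0, v0, T, n=10_000):
    """Euler-integrate the double integrator x'' = u(t)
    from x(0) = x0, x'(0) = v0; returns (x(T), x'(T))."""
    dt = T / n
    x, v = x0, v0
    for i in range(n):
        a = u(i * dt)
        x += v * dt
        v += a * dt
    return x, v

# Candidate bang-bang strategy: thrust toward the pier (u = -1) for one
# time unit, then brake at full power (u = +1) for one time unit.
u = lambda t: -1.0 if t < 1.0 else 1.0
xT, vT = simulate(u, 1.0, 0.0, 2.0)
print(xT, vT)   # both close to 0: a soft docking at time T = 2
```

With any constant control the two final conditions x(T) = \dot x(T) = 0 cannot in general both be met, which is why the optimal control must switch.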
The investor's time utility for consuming at rate c(t) is U(c(t)). The investor's problem is to find a consumption plan so as to maximize his discounted utility

    \int_0^∞ e^{−δt} U(c(t))\, dt

where δ, with δ ≥ r, is a given discount rate, subject to the solvency constraint that the capital stock x(t) must be positive for all t ≥ 0 and vanish at ∞. Then the problem is

    \max_{c \in C} \int_0^∞ e^{−δt} U(c)\, dt
    \dot x = rx − c
    x(0) = x_0 > 0,                                         (1.6)
    x ≥ 0,  \lim_{t→∞} x(t) = 0,
    C = {c : [0,∞) → [0,∞)}

with δ ≥ r ≥ 0 fixed constants. We will solve this problem in subsections 3.7.1 and 5.7.4 for a logarithmic utility function, and in subsection 5.7.2 for a HARA utility function. △

One of the real problems that inspired and motivated the study of optimal control problems is the following, the so-called "moonlanding problem".

Example 1.1.6. The moonlanding problem.
Consider the problem of a spacecraft attempting to make a soft landing on the moon using a minimum amount of fuel. To define a simplified version of this problem, let m = m(t) ≥ 0 denote the mass, h = h(t) ≥ 0 and v = v(t) denote the height and vertical velocity of the spacecraft above the moon, and u = u(t) denote the thrust of the spacecraft's engine. Hence at the initial time t = 0 we have initial height and vertical velocity h(0) = h_0 > 0 and v(0) = v_0 < 0; at the final time T, equal to the first time the spacecraft reaches the moon, we require h(T) = 0 and v(T) = 0. Such final time T is not fixed. Clearly

    \dot h = v.

Let M denote the mass of the spacecraft without fuel, F the initial amount of fuel and g the gravitational acceleration of the moon. The equation of motion of the spacecraft is

    m \dot v = u − mg,

where m = M + c and c(t) is the amount of fuel at time t. Let α be the maximum thrust attainable by the spacecraft's engine (α > 0 and fixed): the thrust u, with 0 ≤ u(t) ≤ α, of the spacecraft's engine is the control for the problem and is related to the amount of fuel by

    \dot m = \dot c = −ku,

with k a positive constant.
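Before the general treatment, one can probe (1.6) numerically within the restricted family of linear feedback policies c = βx (an assumption made here purely for illustration). Under such a policy x(t) = x_0 e^{(r−β)t}, so the solvency constraint requires β > r, and for U(c) = log c the discounted utility can be integrated directly. The sketch below is mine, not from the notes.

```python
import math

def discounted_log_utility(beta, x0=1.0, r=0.03, delta=0.05,
                           horizon=400.0, n=400_000):
    """Utility of the policy c = beta*x, under which x(t) = x0 e^{(r-beta)t};
    integrates e^{-delta t} log(c(t)) dt by the midpoint rule, truncating
    the infinite horizon where the discount factor is negligible."""
    dt = horizon / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        log_c = math.log(beta * x0) + (r - beta) * t
        total += math.exp(-delta * t) * log_c * dt
    return total

for beta in (0.035, 0.05, 0.08):
    print(beta, round(discounted_log_utility(beta), 4))
```

Among these sampled policies the best is β = 0.05 = δ, i.e. consuming at the rate given by the discount factor, which is in line with the log-utility solution derived later in the notes.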
[Figure: on the left, the spacecraft at time t = 0 and, on the right, the forces u and mg that act on it.]

The problem is to land using a minimum amount of fuel:

    \min (m(0) − m(T)) = M + F − \max m(T).

Let us summarize the problem:

    \max_{u \in C} m(T)
    \dot h = v,
    m \dot v = u − mg,
    \dot m = −ku,                                           (1.7)
    h(0) = h_0,  h(T) = 0,
    v(0) = v_0,  v(T) = 0,
    m(0) = M + F,  m(t) ≥ M,
    h(t) ≥ 0,
    C = {u : [0,T] → [0,α]}

where h_0, M, F, g, −v_0, k and α are positive and fixed constants; the final time T is free. The solution of this problem is very hard; we will present it in subsection 3.2.4. △

1.2 Statement of problems of Optimal Control

1.2.1 Admissible control and associated trajectory

Let us consider a problem where the development of the system is given by a function

    x : [t_0, t_1] → R^n,  with  x = (x_1, x_2, ..., x_n),

with n ≥ 1. At every time t, the value x(t) describes our system. We call x the state variable (or trajectory): the state variable is at least a continuous function. We suppose that the system has an initial condition, i.e.

    x(t_0) = α,                                             (1.8)

where α = (α_1, α_2, ..., α_n) ∈ R^n. In many situations we require that the trajectory x satisfies a final condition; in order to do that, let us introduce a set T ⊂ [t_0, ∞) × R^n called the target set. In this case the final condition is

    (t_1, x(t_1)) ∈ T.                                      (1.9)

For example, the final condition x(t_1) = β with β fixed in R^n has T = {(t_1, β)} as target set; if we have a fixed final time t_1 and no conditions on the trajectory at such final time, then the target set is T = {t_1} × R^n.
Let us suppose that our system depends on some particular choice (or strategy) at every time. Essentially we suppose that the strategy of our system is given by a measurable² function

    u : [t_0, t_1] → U,  with  u = (u_1, u_2, ..., u_k),

where U is a fixed closed set in R^k called the control set. We call such function u the control variable.
However, in some situations and models it is reasonable to require that the admissible controls are in KC and not only measurable (this is the point of view in the book [13]). The fact that u determines the system is represented by the dynamics, i.e. the relation

    \dot x(t) = g(t, x(t), u(t)),                           (1.10)

where g : [t_0, t_1] × R^n × R^k → R^n. From a mathematical point of view we are interested in solving the Ordinary Differential Equation (ODE) of the form

    \dot x = g(t, x, u)  in [t_0, t_1],
    x(t_0) = α,                                             (1.11)
    (t_1, x(t_1)) ∈ T,

where u is an assigned function. In general, without assumptions on g and u, it is not possible to guarantee that there exists a unique solution of (1.10) defined on all the interval [t_0, t_1]; moreover, since the function t ↦ g(t, x, u(t)) can fail to be regular, we have to be precise about the notion of "solution" of such an ODE. In the next pages we will give a more precise definition of solution x of (1.11).

²In many of the situations that follow we will restrict our attention to the class of piecewise continuous functions (and replace "measurable" with "KC"); more precisely, we denote by KC([t_0, t_1]) the space of piecewise continuous functions u on [t_0, t_1], i.e. u is continuous in [t_0, t_1] up to a finite number of points τ such that \lim_{t→τ⁺} u(t) and \lim_{t→τ⁻} u(t) exist and are finite.

Controllability

Let us give some examples that show the difficulty of associating a trajectory to a control:

Example 1.2.1. Let us consider

    \dot x = 2u \sqrt{x}  in [0,1],
    x(0) = 0.

Prove that the function u(t) = a, with a a positive constant, is not an admissible control, since the two functions x_1(t) = 0 and x_2(t) = a²t² both solve the previous ODE.

Example 1.2.2. Let us consider

    \dot x = u x²  in [0,1],
    x(0) = 1.

Prove that the function u(t) = a, with a a constant, is an admissible control if and only if a < 1. Prove that the trajectory associated to such control is x(t) = 1/(1 − at).

Example 1.2.3. Let us consider

    \dot x = u x  in [0,2],
    x(0) = 1,
    x(2) = 36,
    |u| ≤ 1.

Prove³ that the set of admissible controls is empty.
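The claim of Example 1.2.2 can be checked numerically; the sketch below (illustrative, not from the notes) verifies on a grid that x(t) = 1/(1 − at) satisfies \dot x = a x² when a < 1.

```python
def check_trajectory(a, n=1000, h=1e-5, tol=1e-6):
    """Check numerically that x(t) = 1/(1 - a t) solves x' = a x^2 on (0,1),
    comparing a centered difference quotient with the right-hand side."""
    x = lambda s: 1.0 / (1.0 - a * s)
    for i in range(1, n):
        t = i / n
        lhs = (x(t + h) - x(t - h)) / (2 * h)   # numerical derivative
        rhs = a * x(t) ** 2
        if abs(lhs - rhs) > tol * max(1.0, abs(rhs)):
            return False
    return True

print(check_trajectory(0.5))   # True: for a < 1 the trajectory exists on all [0,1]

# For a >= 1 the same formula has a pole at t = 1/a <= 1, so the solution
# blows up inside [0,1] and the constant control u ≡ a is not admissible.
```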
The problem of investigating the possibility of finding admissible controls for an optimal control problem is called controllability (see section 3.4). In order to guarantee the solution of (1.11), the following well-known theorem is fundamental:

Theorem 1.1. Let us consider G = G(t,x) : [t_0, t_1] × R^n → R^n and let G be continuous and Lipschitz continuous with respect to x in an open set D ⊆ R^{n+1} with (t_0, α) ∈ D ⊂ R × R^n. Then there exists a neighborhood I ⊂ R of t_0 such that the ODE

    \dot x = G(t, x),
    x(t_0) = α

³Note that 0 ≤ \dot x = ux ≤ x and x(0) = 1 imply 0 ≤ x(t) ≤ e^t; hence x(2) ≤ e² < 36.
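Theorem 1.1 is the classical Picard-Lindelöf existence and uniqueness result, whose proof rests on the convergence of the Picard iterates x_{k+1}(t) = α + \int_{t_0}^t G(s, x_k(s)) ds. A discretized sketch of that iteration (mine, for illustration only):

```python
import math

def picard(G, t0, alpha, t, iterations=30, n=200):
    """Picard iteration x_{k+1}(t) = alpha + \\int_{t0}^t G(s, x_k(s)) ds,
    discretized on a uniform grid with left-endpoint quadrature; the
    iterates converge when G is Lipschitz continuous in x."""
    ts = [t0 + (t - t0) * i / n for i in range(n + 1)]
    xs = [alpha] * (n + 1)                    # x_0 ≡ alpha
    for _ in range(iterations):
        new, acc = [alpha], 0.0
        for i in range(n):
            step = ts[i + 1] - ts[i]
            acc += G(ts[i], xs[i]) * step
            new.append(alpha + acc)
        xs = new
    return xs[-1]

# G(t,x) = x is Lipschitz: the unique solution of x' = x, x(0) = 1 is e^t.
print(picard(lambda t, x: x, 0.0, 1.0, 1.0))   # ≈ e ≈ 2.718 (up to grid error)
```

By contrast, for the non-Lipschitz right-hand side of Example 1.2.1, G(t,x) = 2a\sqrt{x} with x(0) = 0, the iteration started from x_0 ≡ 0 stays at 0 forever and cannot detect the second solution a²t², which is exactly the failure of uniqueness that Theorem 1.1 rules out.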