Chapter 7: The optimal control system
We study systems whose behavior can be modeled by a set of two
ordinary differential equations

    \dot{x}_1 = f_1(x_1, x_2, u), \quad \dot{x}_2 = f_2(x_1, x_2, u), \quad \text{or} \quad \dot{x}_i = f_i(x_1, x_2, u)

We now wish to control the system from X_0 at t = t_0 to X_1 at t = t_1 in such a
way that the cost functional

    J = \int_{t_0}^{t_1} f_0(x_1, x_2, u)\, dt

is minimized, where X_0, X_1 and t_0 are fixed and t_1 is unspecified.
Let u^*(t) be an optimal control and x^*(t) the corresponding optimal
path. Consider a small variation of u^*, u = u^* + \delta u(t), with corresponding
path (x_1^* + \delta x_1, x_2^* + \delta x_2). This varied path will not arrive at X_1 at t = t_1 but at a slightly
different time t_1 + \delta t. The end conditions give

    x_i^*(t_1 + \delta t) + \delta x_i(t_1 + \delta t) = x_{i1}, \quad i = 1, 2,

and, since x_i^*(t_1) = x_{i1},

    x_i^*(t_1 + \delta t) - x_i^*(t_1) + \delta x_i(t_1 + \delta t) = 0, \quad i = 1, 2.

Based on a first-order approximation, we have

    \dot{x}_i^*(t_1)\, \delta t + \delta x_i(t_1) = 0, \quad \text{or} \quad \delta x_i(t_1) = -f_i(t_1)\, \delta t, \quad i = 1, 2,   (1)

where f_i(t_1) denotes f_i(x_1^*(t_1), x_2^*(t_1), u^*(t_1)).
The consequent change \delta J is

    \delta J = \int_{t_0}^{t_1 + \delta t} f_0(x_1 + \delta x_1, x_2 + \delta x_2, u + \delta u)\, dt - \int_{t_0}^{t_1} f_0(x_1, x_2, u)\, dt

which, to first order, is

    \delta J = f_0(t_1)\, \delta t + \int_{t_0}^{t_1} \left( \frac{\partial f_0}{\partial x_1} \delta x_1 + \frac{\partial f_0}{\partial x_2} \delta x_2 + \frac{\partial f_0}{\partial u} \delta u \right) dt   (2)
In view of the constraints \dot{x}_i = f_i(x_1, x_2, u), we multiply by the Lagrange
multipliers \psi_i(t) and integrate with respect to t, giving

    I_i = \int_{t_0}^{t_1} \psi_i(t) \left( \dot{x}_i - f_i(x_1, x_2, u) \right) dt = 0

whose variation is

    \delta I_i = \delta t\, \psi_i(t_1) \left[ \dot{x}_i(t_1) - f_i(x_1, x_2, u) \right] + \int_{t_0}^{t_1} \psi_i(t) \left[ \frac{d(\delta x_i)}{dt} - \frac{\partial f_i}{\partial x_1} \delta x_1 - \frac{\partial f_i}{\partial x_2} \delta x_2 - \frac{\partial f_i}{\partial u} \delta u \right] dt = 0   (3)
The first term in (3) vanishes in view of the constraints, and, integrating by parts,

    \int_{t_0}^{t_1} \psi_i(t)\, \frac{d(\delta x_i)}{dt}\, dt = \left[ \psi_i(t)\, \delta x_i \right]_{t_0}^{t_1} - \int_{t_0}^{t_1} \dot{\psi}_i(t)\, \delta x_i\, dt
    = \psi_i(t_1)\, \delta x_i(t_1) - \psi_i(t_0)\, \delta x_i(t_0) - \int_{t_0}^{t_1} \dot{\psi}_i(t)\, \delta x_i\, dt
    = -f_i(t_1)\, \psi_i(t_1)\, \delta t - \int_{t_0}^{t_1} \dot{\psi}_i(t)\, \delta x_i\, dt   (4)

using \delta x_i(t_0) = 0 and Eq. (1).
Substituting (4) into (3) gives

    \delta I_i = -\int_{t_0}^{t_1} \psi_i(t) \left( \frac{\partial f_i}{\partial x_1} \delta x_1 + \frac{\partial f_i}{\partial x_2} \delta x_2 + \frac{\partial f_i}{\partial u} \delta u \right) dt - f_i(t_1)\, \psi_i(t_1)\, \delta t - \int_{t_0}^{t_1} \dot{\psi}_i(t)\, \delta x_i\, dt = 0   (5)
Now, combining (2) and (5), we have \delta(J + I_1 + I_2) = 0, or

    \int_{t_0}^{t_1} \left( \frac{\partial f_0}{\partial x_1} - \psi_1 \frac{\partial f_1}{\partial x_1} - \psi_2 \frac{\partial f_2}{\partial x_1} - \dot{\psi}_1 \right) \delta x_1\, dt + \int_{t_0}^{t_1} \left( \frac{\partial f_0}{\partial x_2} - \psi_1 \frac{\partial f_1}{\partial x_2} - \psi_2 \frac{\partial f_2}{\partial x_2} - \dot{\psi}_2 \right) \delta x_2\, dt
    + \int_{t_0}^{t_1} \left( \frac{\partial f_0}{\partial u} - \psi_1 \frac{\partial f_1}{\partial u} - \psi_2 \frac{\partial f_2}{\partial u} \right) \delta u\, dt + \left[ f_0(t_1) - f_1(t_1)\, \psi_1(t_1) - f_2(t_1)\, \psi_2(t_1) \right] \delta t = 0   (6)
We introduce the function

    H = -f_0(x_1, x_2, u) + \psi_1 f_1(x_1, x_2, u) + \psi_2 f_2(x_1, x_2, u)

Eq. (6) becomes

    -\int_{t_0}^{t_1} \left( \frac{\partial H}{\partial x_1} + \dot{\psi}_1 \right) \delta x_1\, dt - \int_{t_0}^{t_1} \left( \frac{\partial H}{\partial x_2} + \dot{\psi}_2 \right) \delta x_2\, dt - \int_{t_0}^{t_1} \frac{\partial H}{\partial u}\, \delta u\, dt - H(t_1)\, \delta t = 0   (7)
If we choose \dot{\psi}_i = -\partial H / \partial x_i, Eq. (7) becomes

    \int_{t_0}^{t_1} \frac{\partial H}{\partial u}\, \delta u\, dt + H(t_1)\, \delta t = 0
Now we have the Euler equation

    \frac{\partial H}{\partial u} = 0   (8)

with the natural boundary condition

    H(t_1) = 0   (9)

where t = t_1 is not specified.
In order to minimize the cost functional

    J = \int_{t_0}^{t_1} f_0(x_1, x_2, u)\, dt,

the scalar function

    H = -f_0(x_1, x_2, u) + \psi_1 f_1(x_1, x_2, u) + \psi_2 f_2(x_1, x_2, u)

is required to attain its maximum with respect to u at u = u^*(t).
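The maximization step can be illustrated numerically. The sketch below is a minimal illustration, not part of the chapter's derivation: it takes the concrete choices f_0 = u^2/2 and f_1 = x_1 + u (so H is concave in u) and locates the maximizer of H over a grid of u values, recovering the stationary point u = \psi_1 given by \partial H / \partial u = 0.

```python
# Brute-force maximization of H(u) = -f0(u) + psi1 * f1(x1, u) over a grid.
# Illustrative problem: f0 = u**2/2 and f1 = x1 + u, so dH/du = -u + psi1
# and the exact maximizer is u = psi1.
def argmax_H(psi1, x1, lo=-10.0, hi=10.0, n=200001):
    best_u, best_H = lo, None
    for i in range(n):
        u = lo + (hi - lo) * i / (n - 1)
        H = -0.5 * u * u + psi1 * (x1 + u)   # H = -f0 + psi1 * f1
        if best_H is None or H > best_H:
            best_u, best_H = u, H
    return best_u
```

For instance, argmax_H(1.5, 0.7) returns a value within one grid step of \psi_1 = 1.5.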
Example-1:
Consider a system whose behavior is \dot{x}_1 = -x_1, with solution
x_1 = c e^{-t}. If no control is imposed on the system, then the system has a
tendency to move closer to the origin. If we ask it to go from x_1 = 2 to
x_1 = 1, and do not specify how long it should take, then it can do so
without incurring any cost at all. However, it will not go from x_1 = 1 to
x_1 = 2 unless it is forced by a control large enough to overcome its
tendency to move towards the origin.
Example-2:
Consider the simple one-dimensional problem of controlling the
system \dot{x}_1 = -x_1 + u from x_1 = 1 at t = 0 to x_1 = 2 at some time t_1 in
such a way that

    J = \frac{1}{2} \int_0^{t_1} u^2\, dt

is minimized.
We have f_0(x_1, x_2, u) = u^2/2, f_1(x_1, x_2, u) = -x_1 + u, f_2(x_1, x_2, u) = 0, and

    H = -u^2/2 + \psi_1(-x_1 + u), \quad \dot{\psi}_1 = -\frac{\partial H}{\partial x_1} = \psi_1, \quad \psi_1 = A e^{t}

For H to attain its maximum, we have

    \frac{\partial H}{\partial u} = -u + \psi_1 = -u + A e^{t} = 0 \quad \text{and} \quad \frac{\partial^2 H}{\partial u^2} = -1 < 0

Thus, we have u = A e^{t}, and \dot{x}_1 = -x_1 + A e^{t}, or x_1 = \frac{1}{2} A e^{t} + B e^{-t}.
Now applying the end conditions x_1(0) = 1 and x_1(t_1) = 2, we have

    A = 2(2 - e^{-t_1}) / (e^{t_1} - e^{-t_1}), \quad B = (e^{t_1} - 2) / (e^{t_1} - e^{-t_1})

and the natural boundary condition gives

    H(t_1) = -\frac{1}{2} A^2 e^{2 t_1} + A e^{t_1} \left( A e^{t_1} - \frac{1}{2} A e^{t_1} - B e^{-t_1} \right) = -AB = 0

This reduces to AB = 0; but A \ne 0, leading to B = 0, i.e. e^{t_1} = 2, or t_1 = \ln 2.
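These values can be checked numerically. The following sketch (assuming the dynamics \dot{x}_1 = -x_1 + u; the RK4 integrator and the step count are arbitrary illustrative choices) recomputes A and B at t_1 = \ln 2, integrates the controlled system, and accumulates the cost J = \frac{1}{2}\int u^2\, dt:

```python
import math

t1 = math.log(2.0)
A = 2 * (2 - math.exp(-t1)) / (math.exp(t1) - math.exp(-t1))  # expect A = 2
B = (math.exp(t1) - 2) / (math.exp(t1) - math.exp(-t1))       # expect B = 0

def u(t):
    return A * math.exp(t)   # optimal control u = A e^t

def f(t, x):
    return -x + u(t)         # state equation x1' = -x1 + u

# RK4 integration of x1 from x1(0) = 1 up to t1; the cost integral
# J = (1/2) * integral of u^2 is accumulated with the midpoint rule.
x1, J, n = 1.0, 0.0, 1000
h = t1 / n
for i in range(n):
    t = i * h
    k1 = f(t, x1)
    k2 = f(t + h/2, x1 + h/2 * k1)
    k3 = f(t + h/2, x1 + h/2 * k2)
    k4 = f(t + h, x1 + h * k3)
    x1 += h/6 * (k1 + 2*k2 + 2*k3 + k4)
    J += 0.5 * u(t + h/2) ** 2 * h
```

The state ends at the target x_1 = 2, and J comes out near e^{2 t_1} - 1 = 3, the exact cost of this control.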
Example-3:
Now we consider the system

    \dot{x}_1 = x_2, \quad \dot{x}_2 = u, \quad |u| \le k,

and the problem of controlling it from a given initial point (a, b) at t = 0 to the origin in as
short a time as possible. The cost for this problem is simply t_1, and this
we need to write in the form

    J = \int_0^{t_1} f_0(x_1, x_2, u)\, dt

Choosing f_0(x_1, x_2, u) = 1 gives J = t_1. So in this problem f_0(x_1, x_2, u) = 1,
f_1(x_1, x_2, u) = x_2, f_2(x_1, x_2, u) = u, and we have

    H = -1 + \psi_1 x_2 + \psi_2 u

with

    \dot{\psi}_1 = -\frac{\partial H}{\partial x_1} = 0, \quad \dot{\psi}_2 = -\frac{\partial H}{\partial x_2} = -\psi_1

so \psi_1 = A, \psi_2 = B - At, where A and B are arbitrary constants.
Thus

    H = -1 + A x_2 + (B - At) u
Here we need to tackle optimal control problems for which the values
taken by the control are restricted to a finite region, |u| \le k.
Consider a truck at M, a distance x from L. The equation of motion is

    m\ddot{x} = F, \quad |F| \le K

where K is a constant.
At each time t we can choose the magnitude and direction of F so
as to control the behavior of the truck. We think of F/m as our time-
dependent control function u(t) and write the equation of motion as

    \ddot{x} = u(t), \quad |u| \le K/m

Now let x_1 = x and x_2 = \dot{x}.
We can represent the state of the system by a point in the (x_1, x_2) plane,
and the origin represents the state in which the system is at rest at L. We
then have a pair of first-order differential equations for x_1 and x_2,

    \dot{x}_1 = x_2, \quad \dot{x}_2 = u^*, \quad |u^*| \le K/m

and

    H = -1 + A x_2 + (B - At) u^*
The maximum value of H is attained at u^* = K/m if B - At > 0 and at
u^* = -K/m if B - At < 0. Since B - At vanishes for at most one value of t, the control is allowed to switch
once, from K/m to -K/m or vice versa, at the instant when B - At = 0. If we eliminate t we obtain

    \frac{dx_2}{dx_1} = \frac{u^*}{x_2} \quad \text{and} \quad x_2^2 = 2 u^* x_1 + \text{constant}
Each of the two values of u * corresponds to a family of parabolas.
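The parabola families can be confirmed directly from the closed-form motion under a constant control. A quick sketch (the initial state (a, b) and the value of u below are arbitrary illustrations):

```python
# Under constant control u, the motion is x2(t) = b + u*t and
# x1(t) = a + b*t + u*t**2/2, so x2**2 - 2*u*x1 is constant along the
# trajectory: each value of u gives a family of parabolas in (x1, x2).
def invariant(a, b, u, t):
    x1 = a + b * t + 0.5 * u * t * t
    x2 = b + u * t
    return x2 * x2 - 2 * u * x1
```

Evaluating the invariant at several times t for a fixed (a, b, u) always returns the same number.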
There are only two routes that lead to the origin: the lower half of
the L_+ parabola (u^* = K/m) through O and the upper half of the L_-
parabola (u^* = -K/m) through O. If an initial point (a, b) lies on either of these,
then it can go to O without switching. Suppose that (a, b) lies above POQ,
at R. There is no direct route to O, but we are allowed one switch, and
the final section of the path must be along OP or OQ. There is only one
possibility: the L_- path through R intersects OQ, while the L_+ path through
R does not intersect OP. Thus, we must take the L_- path (u^* = -K/m)
through R until it intersects OQ, and then switch to u^* = K/m in order to
travel to O along an arc of OQ. Similar arguments apply for an initial point S
that lies below POQ.
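The one-switch strategy can be simulated directly. The sketch below is a minimal illustration (k = K/m = 1, and the step size, stopping tolerance and starting state are arbitrary choices): the curve POQ is x_1 = -x_2 |x_2| / (2k), and the control is -k above it and +k below it.

```python
def bang_bang(x1, x2, k=1.0, dt=1e-4, t_max=20.0):
    """Simulate x1' = x2, x2' = u with the one-switch rule:
    u = -k above the curve x1 = -x2*|x2|/(2k), u = +k below it."""
    t = 0.0
    while t < t_max:
        if x1 * x1 + x2 * x2 < 1e-4:       # close enough to the origin
            return t
        s = x1 + x2 * abs(x2) / (2 * k)    # which side of POQ we are on
        u = -k if s > 0 else k
        x1 += x2 * dt                       # Euler step
        x2 += u * dt
        t += dt
    return t
```

Starting from rest at x_1 = 1 the simulated time is close to 2, the value obtained analytically by switching at x_1 = 1/2, x_2 = -1.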
Let us see if our optimal solution fits common-sense ideas about
the fastest way of returning the truck to rest at L. Suppose that at t = 0
the state of the system is represented by the point R; that is, the truck is to
the right of L (x_1 = x > 0) and moving away from L (x_2 = \dot{x} > 0). Our first
priority is to stop it and reverse the direction of its velocity, so we use the
motor to produce a force directed towards L. This will slow the truck down and
eventually force it back towards L. If the force remains in this direction, the truck will
not only reach L but pass through it, so at some stage the direction of the force
must be reversed. We are free to choose F(t) at each t, provided |F(t)| \le K;
it must be negative at the start and positive at the end. Clearly
a large force will bring the truck back to L faster than a
smaller one, so it is natural to choose F(t) = -K at the start and
F(t) = +K at the end. The difficulty is in choosing the time at which to
switch from one to the other. Switch too early and the truck will come to
rest before it reaches L; switch too late and it will overshoot. Therefore,
our choice must be exactly right to achieve the fastest return of the
truck to rest at L.
Example-4: The glucose problem
The level of glucose is governed by the state equation

    \dot{x}_1 = -\lambda x_1 + u

where the control u = u(t) satisfies the constraint 0 \le u \le m. The level is
to be controlled from x_1 = a at t = 0 to x_1 = c at some time T in such a
way that

    J = \int_0^{T} u\, dt

is minimized. Find the optimal control and the corresponding value of J.
Solution:
    H = -u + \psi_1(-\lambda x_1 + u) = -\lambda x_1 \psi_1 + u(\psi_1 - 1)

The maximum of H with respect to u is attained for

    u = 0 \text{ when } \psi_1 < 1, \quad u = m \text{ when } \psi_1 > 1

The corresponding state equation

    \dot{x}_1 = -\lambda x_1 + u, \quad \text{where } u = 0 \text{ or } m,

integrates to give

    x_1 = B e^{-\lambda t} + u/\lambda

Applying the end conditions, we obtain

    B = a - u/\lambda, \quad T = \frac{1}{\lambda} \ln \left( \frac{u - \lambda a}{u - \lambda c} \right)

There are two cases:
There are two cases:
(i) a > c. Here we are decreasing the glucose level. The control u = m
gives T < 0 and clearly cannot transfer the level to its required final
value x_1 = c. However, the control u = 0 gives T = \frac{1}{\lambda} \ln(a/c) > 0.
This control takes the system to x_1 = c in a finite time, and the
corresponding value of J is zero. That is, the system can get to
x_1 = c with no cost.
(ii) a < c. Here we are increasing the glucose level. The control u = 0
gives T a negative value and is useless. The control u = m gives a
finite positive value of T (provided m > \lambda c). Thus the only control that satisfies the
maximum principle and transfers the system from x_1 = a to x_1 = c
when a < c is u = m, with corresponding cost

    J = mT = \frac{m}{\lambda} \ln \left( \frac{m - \lambda a}{m - \lambda c} \right)
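A quick numerical check of case (ii) can be sketched as follows (\lambda = 1, m = 3, a = 1, c = 2 are illustrative values with a < c and m > \lambda c, so the control u = m applies):

```python
import math

lam, m, a, c = 1.0, 3.0, 1.0, 2.0

T = math.log((m - lam * a) / (m - lam * c)) / lam   # transfer time under u = m
J = m * T                                            # corresponding cost

def f(x):
    return -lam * x + m    # state equation x1' = -lam*x1 + m

# RK4 integration from x1(0) = a over [0, T]; it should end at x1 = c.
x1, n = a, 2000
h = T / n
for _ in range(n):
    k1 = f(x1)
    k2 = f(x1 + h/2 * k1)
    k3 = f(x1 + h/2 * k2)
    k4 = f(x1 + h * k3)
    x1 += h/6 * (k1 + 2*k2 + 2*k3 + k4)
```

With these numbers T = \ln 2 and J = 3 \ln 2, and the integration indeed ends at the target level c = 2.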
Problem-1: The system \dot{x}_1 = x_1 + u is to be controlled from
x_1 = 0 at t = 0 to x_1 = 2 at t = 1 in such a way that

    J = \frac{1}{2} \int_0^1 \left( 3 x_1^2 + u^2 \right) dt

is minimized. Find the optimal control u^*.
Time-optimal control of linear systems
We consider here systems with two variables x_1(t), x_2(t)
describing the state of the system and a single control variable u(t) that
is forced to take its values in |u| \le 1. We let the system be governed by a
pair of linear differential equations

    \dot{x}_1 = a x_1 + b x_2 + l u
    \dot{x}_2 = c x_1 + d x_2 + m u

with |u| \le 1 and a, b, c, d, l, m given constants.
In matrix notation we have

    \dot{\mathbf{x}} = A \mathbf{x} + \mathbf{l} u

where

    A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \text{and} \quad \mathbf{l} = \begin{pmatrix} l \\ m \end{pmatrix}

Given that the system can be controlled from a given initial point
\mathbf{x}(t_0) = \mathbf{x}_0 to a given target point \mathbf{x}(t_1) = \mathbf{x}_1 by an admissible control |u| \le 1,
we will find the optimal control u^*(t) for which

    J = \int_{t_0}^{t_1} 1\, dt = t_1 - t_0

is minimized.
We need to maximize H as a function of u, where

    H = -1 + \psi_1(a x_1 + b x_2 + l u) + \psi_2(c x_1 + d x_2 + m u)
      = -1 + \psi_1(a x_1 + b x_2) + \psi_2(c x_1 + d x_2) + (l \psi_1 + m \psi_2) u

The co-state equations are

    \dot{\psi}_1 = -\frac{\partial H}{\partial x_1} = -(a \psi_1 + c \psi_2), \quad \dot{\psi}_2 = -\frac{\partial H}{\partial x_2} = -(b \psi_1 + d \psi_2)

or, in matrix notation,

    \dot{\boldsymbol{\psi}} = -A^T \boldsymbol{\psi}, \quad \text{where} \quad \boldsymbol{\psi} = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}   (1)

Since H is linear in u, to maximize H we need u = 1 or u = -1, depending
on the sign of the coefficient l \psi_1 + m \psi_2. Thus the only controls that can
lead to a minimum time of transfer are those of the form

    u^* = \operatorname{sgn}(l \psi_1 + m \psi_2)

These are piecewise constant controls that are discontinuous at the zeros
of

    S = l \psi_1(t) + m \psi_2(t),

which is called the switching function; the control switches from 1 to -1 or from -1 to
1 whenever S = 0. In the time interval between two zeros of S the control
is constant, so the state equations become

    \dot{\mathbf{x}} = A \mathbf{x} + \mathbf{l} u^*, \quad \text{where } u^* = 1 \text{ or } -1,
and the form of the trajectories in the (x_1, x_2) plane is easily found in
each case. Provided ad - bc \ne 0, the trajectories for u^* = 1 will have an
isolated singularity at the intersection of

    a x_1 + b x_2 + l = 0 \quad \text{and} \quad c x_1 + d x_2 + m = 0   (2)

while the trajectories for u^* = -1 will have an isolated singularity at the
intersection of

    a x_1 + b x_2 - l = 0 \quad \text{and} \quad c x_1 + d x_2 - m = 0   (3)
The behavior of both families of trajectories is determined by the
eigenvalues of the system matrix A. The trajectory pattern is the same as
the pattern of the trajectories of the uncontrolled system \dot{\mathbf{x}} = A\mathbf{x}. The
only difference is that the whole phase-plane pattern is translated so that
the singularity is at the solution of (2) for u^* = 1 and at the solution of (3)
for u^* = -1. Recall that a singular point in the phase plane represents a
solution that is constant for all t.
Now if A has real eigenvalues so does A^T, so the solution of the co-
state equations (1) must be of the form

    \boldsymbol{\psi} = \mathbf{h}\, e^{-q_1 t} + \mathbf{k}\, e^{-q_2 t}

where \mathbf{h}, \mathbf{k} are eigenvectors corresponding to the real eigenvalues
q_1, q_2 of A^T.
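The switching function S = l \psi_1 + m \psi_2 then has the same form \alpha e^{-q_1 t} + \beta e^{-q_2 t}, which can vanish at most once when q_1 \ne q_2; so with distinct real eigenvalues the control switches at most once. A small numerical sketch of this fact (q_1 = -1, q_2 = -5 and the coefficient ranges are illustrative choices):

```python
import math, random

# S(t) = alpha*exp(-q1*t) + beta*exp(-q2*t) with q1 != q2 real:
# S(t) = 0 forces exp((q1 - q2)*t) = -beta/alpha, which has at most one
# solution, so the sign of S changes at most once.
def sign_changes(q1, q2, alpha, beta, t_max=5.0, n=5000):
    count, prev = 0, None
    for i in range(n + 1):
        t = t_max * i / n
        s = alpha * math.exp(-q1 * t) + beta * math.exp(-q2 * t)
        if s == 0.0:
            continue
        cur = s > 0
        if prev is not None and cur != prev:
            count += 1
        prev = cur
    return count

random.seed(0)
worst = max(sign_changes(-1.0, -5.0, random.uniform(-1, 1), random.uniform(-1, 1))
            for _ in range(200))
```

Over the random sample, worst never exceeds 1, while a case such as sign_changes(-1.0, -5.0, -3.0, 1.0) exhibits exactly one switch.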
Example-1:
The system \dot{x}_1 = -3 x_1 + 2 x_2 + 5u, \dot{x}_2 = 2 x_1 - 3 x_2 is to be controlled
from a general initial state to the origin in minimum time. Find the optimal
control when the control satisfies the constraint |u| \le 1.
Solution:
We have

    H = -1 + \psi_1(-3 x_1 + 2 x_2 + 5u) + \psi_2(2 x_1 - 3 x_2) = -1 + \psi_1(-3 x_1 + 2 x_2) + \psi_2(2 x_1 - 3 x_2) + 5 \psi_1 u

and

    \dot{\psi}_1 = -\frac{\partial H}{\partial x_1} = 3 \psi_1 - 2 \psi_2, \quad \dot{\psi}_2 = -\frac{\partial H}{\partial x_2} = -2 \psi_1 + 3 \psi_2.

The function H is maximized as a function of u, |u| \le 1, by
u^* = \operatorname{sgn}(5 \psi_1) = \pm 1. The corresponding solutions of the state equations are
the trajectories of the linear system

    \dot{\mathbf{x}} = \begin{pmatrix} -3 & 2 \\ 2 & -3 \end{pmatrix} \mathbf{x} + \begin{pmatrix} 5 \\ 0 \end{pmatrix} u^*, \quad u^* = \pm 1

The eigenvalues of

    \begin{pmatrix} -3 & 2 \\ 2 & -3 \end{pmatrix}

are -1 and -5. Since the eigenvalues are both negative, the trajectories
are those of a stable node. For u^* = 1 the singularity is at the solution of
-3 x_1 + 2 x_2 + 5 = 0, 2 x_1 - 3 x_2 = 0; that is, at x_1 = 3, x_2 = 2. For u^* = -1 the
singularity is at the solution of -3 x_1 + 2 x_2 - 5 = 0, 2 x_1 - 3 x_2 = 0; that is, at
x_1 = -3, x_2 = -2. In both cases there are straight-line trajectories of slope
+1 and -1. Let L_+ be a typical path corresponding to u^* = 1 and L_- be a
typical path corresponding to u^* = -1.
In order to reach the origin in minimum time the phase point (x_1, x_2) must
travel along L_+ and L_- paths and can switch from one to the
other at most once. The system must arrive at O on an L_+ path or an L_-
path. The slope of these paths at O is dx_2/dx_1 = 0 / (5 u^*) = 0.
If the initial state of the system happens to lie on either P_+O or Q_-O,
then the optimal control sequence must be 1 or -1, respectively. The
system goes directly to O without a switch.
Now consider any initial state W lying above P_+OQ_-.
It cannot be taken to O on an L_+ path because of the singularity at
P_+: the state of the system would simply be attracted towards P_+.
However, the singularity for the L_- trajectories lies on the other side of
P_+OQ_-, and any such path starting at W intersects P_+O, which is the
u^* = 1 path to the origin. Hence there is a control sequence -1, 1 with one
switch which takes an initial state W to O. This must be the time-optimal
control for initial states lying above P_+OQ_-. A similar argument shows that
the time-optimal control for initial states below P_+OQ_- must be 1, -1, and
we can write down the optimal synthesis

    u^* = +1 below P_+OQ_- and on P_+O
    u^* = -1 above P_+OQ_- and on Q_-O
The figure shows some typical optimal paths. Note that every
initial state can be controlled to the origin in minimum time. Complete
control of this system is possible because the system itself is inherently
stable: the uncontrolled system has a stable node at O, so the system
has a natural tendency to move towards O. As we shall see in the next
example, unstable systems are not as tractable.
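The numbers quoted in this example are easy to confirm with elementary 2x2 linear algebra; a standalone sketch (eig2 and solve2 are ad-hoc helper names):

```python
def eig2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    disc = (tr * tr - 4 * det) ** 0.5
    return (tr - disc) / 2, (tr + disc) / 2

def solve2(a, b, c, d, r1, r2):
    """Solve a*x + b*y = r1, c*x + d*y = r2 by Cramer's rule."""
    det = a * d - b * c
    return (r1 * d - r2 * b) / det, (a * r2 - c * r1) / det

q = eig2(-3, 2, 2, -3)                    # eigenvalues (-5, -1): stable node
sing_plus = solve2(-3, 2, 2, -3, -5, 0)   # u* = +1 singularity: (3, 2)
sing_minus = solve2(-3, 2, 2, -3, 5, 0)   # u* = -1 singularity: (-3, -2)
```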
Example-2:
The system \dot{x}_1 = 3 x_1 + 2 x_2 + 5u, \dot{x}_2 = 2 x_1 + 3 x_2 is to be controlled
from a general initial state to the origin in minimum time. Find the optimal
control when the control satisfies the constraint |u| \le 1.
Solution:
We have

    H = -1 + \psi_1(3 x_1 + 2 x_2 + 5u) + \psi_2(2 x_1 + 3 x_2) = -1 + \psi_1(3 x_1 + 2 x_2) + \psi_2(2 x_1 + 3 x_2) + 5 \psi_1 u

and

    \dot{\psi}_1 = -\frac{\partial H}{\partial x_1} = -(3 \psi_1 + 2 \psi_2), \quad \dot{\psi}_2 = -\frac{\partial H}{\partial x_2} = -(2 \psi_1 + 3 \psi_2).

The function H is maximized as a function of u, |u| \le 1, by
u^* = \operatorname{sgn}(5 \psi_1) = \pm 1. The corresponding solutions of the state equations are
the trajectories of the linear system

    \dot{\mathbf{x}} = \begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix} \mathbf{x} + \begin{pmatrix} 5 \\ 0 \end{pmatrix} u^*, \quad u^* = \pm 1

The eigenvalues of

    \begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix}

are 1 and 5. Since the eigenvalues are both positive the trajectories are
those of an unstable node. For u^* = 1 the singularity is at the solution of
3 x_1 + 2 x_2 + 5 = 0, 2 x_1 + 3 x_2 = 0; that is, at x_1 = -3, x_2 = 2. For u^* = -1 the
singularity is at the solution of 3 x_1 + 2 x_2 - 5 = 0, 2 x_1 + 3 x_2 = 0; that is, at
x_1 = 3, x_2 = -2.
In both cases there are straight-line trajectories of slope +1 and -1.
Let L_+ be a typical path corresponding to u^* = 1 and L_- be a typical path
corresponding to u^* = -1. It becomes apparent that the behavior of this
system is strikingly different from that of Example-1.
Consider the L_+ trajectory through O. Only states that lie on the finite
section PO to the left of O can reach the origin.
Now consider an initial state such as W that lies above POQ.
If we choose u^* = 1, then W will be driven to infinity; it cannot be
controlled to O with one switch of control because the L_+ path through W
cannot intersect QO. If we choose u^* = -1, then control to O with one
switch is possible only if this trajectory intersects PO. For most initial
states such as W the L_- path through it fails to hit PO. Thus control to O
with one switch is impossible for most initial states. There is, however, a
region near O for which time-optimal control is possible.
Consider the L_- trajectory that passes through P; call it \Gamma_-.
This curve has a geometrical intersection with PO but does
not yield a route to O; \Gamma_- is useful only as part of the boundary of the
region in which control to O is possible. An L_- trajectory lying below \Gamma_-
and above QO intersects PO at an ordinary point, and a switch of the
control to u^* = 1 would then send the system to O along part of PO. If,
however, W lies above \Gamma_-, then the L_- path through it does not intersect
PO, and such initial states cannot be controlled to O with one switch.
Thus the only initial states that can be driven to the origin by the
sequence -1, 1 are those lying between PO and \Gamma_-. Similarly, the only
states that can be controlled to O by the sequence 1, -1 are those lying
between QO and the L_+ trajectory \Gamma_+ through Q. Outside the region bounded by \Gamma_- and \Gamma_+ there
can be no time-optimal path to the origin. Inside the region bounded by
\Gamma_- and \Gamma_+ the optimal synthesis must be

    u^* = -1 above POQ and on QO
    u^* = +1 below POQ and on PO
Example-3:
The system \dot{x}_1 = x_1 + 3 x_2 + 7u, \dot{x}_2 = 3 x_1 + x_2 + 5u is to be controlled
from a general initial state to the origin in minimum time. Find the optimal
control when the control satisfies the constraint |u| \le 1.
Solution:
We have

    H = -1 + \psi_1(x_1 + 3 x_2 + 7u) + \psi_2(3 x_1 + x_2 + 5u) = -1 + \psi_1(x_1 + 3 x_2) + \psi_2(3 x_1 + x_2) + (7 \psi_1 + 5 \psi_2) u

and

    \dot{\psi}_1 = -\frac{\partial H}{\partial x_1} = -(\psi_1 + 3 \psi_2), \quad \dot{\psi}_2 = -\frac{\partial H}{\partial x_2} = -(3 \psi_1 + \psi_2).

The function H is maximized as a function of u, |u| \le 1, by
u^* = \operatorname{sgn}(7 \psi_1 + 5 \psi_2) = \pm 1. The corresponding solutions of the state
equations are the trajectories of the linear system

    \dot{\mathbf{x}} = \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} \mathbf{x} + \begin{pmatrix} 7 \\ 5 \end{pmatrix} u^*, \quad u^* = \pm 1

The eigenvalues of

    \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}

are -2 and 4, so the singularity is a saddle point. The singularity for L_-
paths is at (1, 2), and it is at (-1, -2) for the L_+ paths. There are straight-line
trajectories of slope +1 and -1, and at the origin both families of
trajectories have slope 5/7.
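As in the earlier examples, the quoted values can be verified with a short 2x2 computation; a standalone sketch (the helper names are ad hoc):

```python
def eig2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    disc = (tr * tr - 4 * det) ** 0.5
    return (tr - disc) / 2, (tr + disc) / 2

def solve2(a, b, c, d, r1, r2):
    """Solve a*x + b*y = r1, c*x + d*y = r2 by Cramer's rule."""
    det = a * d - b * c
    return (r1 * d - r2 * b) / det, (a * r2 - c * r1) / det

q = eig2(1, 3, 3, 1)                  # eigenvalues (-2, 4): a saddle point
p_plus = solve2(1, 3, 3, 1, -7, -5)   # u* = +1 singularity: (-1, -2)
p_minus = solve2(1, 3, 3, 1, 7, 5)    # u* = -1 singularity: (1, 2)
slope_at_O = 5 / 7                    # dx2/dx1 at the origin for either u*
```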
If the initial state lies on P_+O or Q_-O, then minimum time to the
origin is attained by the control sequence 1 or -1, respectively.
Consider the control sequence -1, 1. The phase point reaches O by
travelling along part of P_+O, and the previous section of the path must have been
along an L_- path that intersects P_+O.
As the figure shows, no L_- path lying below CQD can intersect P_+O, so no
initial point that is below CQD can be controlled to the origin by the
control sequence -1, 1. The only initial states that can reach O using
the control sequence -1, 1 are those lying in the region between CQD
and Q_-OP_+. Similarly, the only initial states that can reach O with the
control sequence 1, -1 are those lying in the region between APB and
Q_-OP_+. Therefore, inside the infinite strip we have the usual time-optimal
synthesis

    u^* = -1 below Q_-OP_+ and on Q_-O
    u^* = +1 above Q_-OP_+ and on P_+O