Harvard University
Division of Applied Sciences
Engineering Science 202 Class Notes #12.1
Estimation and Control of Dynamic Systems
Professor Y. C. Ho Fall 1997
ELEMENTS OF PROBABILITY AND RANDOM PROCESSES
1. Definition of a random variable: A r.v. x is characterized by its density function
p(x), and is approximately characterized by its mean and variance
	x̄ ≡ E[x] ≡ ∫_{-∞}^{∞} x p(x) dx
	σ_x² ≡ Var(x) ≡ ∫_{-∞}^{∞} (x − x̄)² p(x) dx
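These definitions can be checked numerically; a minimal Monte Carlo sketch (the uniform density on [0,1] is an arbitrary choice for the example, not part of the notes):

```python
import random

# For x uniform on [0,1]: E[x] = 1/2 and Var(x) = 1/12 ≈ 0.0833.
# Sample averages approximate the integrals defining the mean and variance.
random.seed(0)
N = 200_000
samples = [random.random() for _ in range(N)]

mean = sum(samples) / N
var = sum((s - mean) ** 2 for s in samples) / N
print(mean, var)  # ≈ 0.5, ≈ 0.0833
```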
2. Definition of vector random variables: Similarly, x ≡ [x_1, . . ., x_n]^T is characterized
by p(x_1, . . ., x_n) ≡ p(x) and approximately characterized by
	x̄ ≡ E[x] ≡ ∫_{-∞}^{∞} x p(x) dx ≡ [x̄_1, . . ., x̄_n]^T
	Σ_x ≡ Cov(x) ≡ ∫_{-∞}^{∞} (x − x̄)(x − x̄)^T p(x) dx ,  with entries (Σ_x)_ij = ∫_{-∞}^{∞} (x_i − x̄_i)(x_j − x̄_j) p(x) dx
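A small numerical illustration of the covariance matrix (a sketch; the construction x = [u, u + v] with u, v independent standard normals is an assumption chosen so that Σ_x = [[1, 1], [1, 2]]):

```python
import random

# Estimate Σ_x from samples of the 2-vector x = (u, u + v),
# where u, v are independent N(0,1). Then Σ_x = [[1, 1], [1, 2]].
random.seed(1)
N = 100_000
xs = []
for _ in range(N):
    u, v = random.gauss(0, 1), random.gauss(0, 1)
    xs.append((u, u + v))

mean = [sum(x[i] for x in xs) / N for i in range(2)]
cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in xs) / N
        for j in range(2)] for i in range(2)]
print(cov)  # ≈ [[1, 1], [1, 2]]
```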
3. Definition of Random Sequences: A r.s. is nothing but a collection of r.v.'s
indexed by a parameter, usually time t, i.e. . . . , x_{-1}, x_0, x_1, x_2, . . . It can
certainly be thought of as a giant vector of r.v.'s. The only additional definition we
need to add is the concept of the correlation function, which is analogous to the
covariances between the components of a vector r.v. (defined in §2 above); a
correlation is a covariance defined between r.v.'s indexed by different times t and τ, i.e.
	R(t,τ) ≡ E[(x_t − x̄_t)(x_τ − x̄_τ)^T] = ∫_{-∞}^{∞} (x_t − x̄_t)(x_τ − x̄_τ)^T p(x) dx
4. The simplest r.s. is the purely random sequence, a.k.a. White Noise. Its joint
density function can be written as
	p(. . . , x_{-1}, x_0, x_1, x_2, . . .) = Π_i p(x_i)
Example: Gaussian white noise
	w_t ~ GWN(0, Q_t):  E[w_t] = 0 ,  E[w_t w_τ^T] = Q_t δ_{tτ}
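The defining property can be seen in simulation; a sketch for a scalar sequence (Q = 2.0 is an arbitrary choice for the example):

```python
import random

# Scalar Gaussian white noise w_t ~ GWN(0, Q): samples at different
# times are independent, so the sample correlation E[w_t w_{t+k}]
# is ≈ Q for lag k = 0 and ≈ 0 for any lag k ≠ 0.
random.seed(2)
Q = 2.0
N = 200_000
w = [random.gauss(0, Q ** 0.5) for _ in range(N)]

def corr(k):
    return sum(w[t] * w[t + k] for t in range(N - k)) / (N - k)

print(corr(0))  # ≈ Q = 2
print(corr(5))  # ≈ 0
```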
5. The next order of complexity is the Markov r.s., which has the property that
	p(x_{t+1} / x_t, x_{t-1}, . . .) = p(x_{t+1} / x_t)
“knowledge of the present separates the past from the future”
A direct consequence of the Markov property is
	p(x_0, x_1, x_2, . . .) = Π_t p(x_{t+1} / x_t) p(x_0)
which is just slightly more complicated than the purely random sequence.
Furthermore, so long as we are willing to increase the dimension of the random
vector x_t, any dependence on the finite past can be re-formulated as a Markov
sequence¹. Thus, the Markov r.s. is fairly general.
If x_t above takes on only discrete values then we have a Markov Chain. Let the
steady-state probability of each state be π_i. Then π = [ . . . π_i . . . ] obeys
	π = πP
where P_ij is the transition probability from state i at time t to state j at time t+1.
The continuous state analog is
	p(x_{t+1}) = ∫ p(x_{t+1} / x_t) p(x_t) dx_t
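The fixed point π = πP can be found by simply iterating the chain; a sketch (the 2-state transition matrix below is an assumption for illustration, with steady state π = [0.8, 0.2]):

```python
# Repeatedly apply π ← πP; for an ergodic chain this converges to the
# stationary distribution satisfying π = πP.
P = [[0.9, 0.1],
     [0.4, 0.6]]

pi = [0.5, 0.5]
for _ in range(200):
    pi = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]

print(pi)  # ≈ [0.8, 0.2], and indeed [0.8, 0.2] = [0.8, 0.2] P
```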
6. A stationary r.s. is one for which
	p(. . . x_t, x_{t+1}, x_{t+2}, . . .) = p(. . . x_{t+τ}, x_{t+τ+1}, x_{t+τ+2}, . . .)
A wide-sense stationary r.s. is one for which stationarity is required only of the
correlation function: R(t,τ) = R(t−τ).
A Gaussian r.s. is one for which the joint density function p(x) is Gaussian.
7. Conditional probability of two r.v.'s, x and y:
	p(x/y) ≡ p(x,y)/p(y) = [p(y/x)/p(y)] p(x)
which is also known as “Bayes' Rule,” characterizing the relationship between the
prior and the posterior knowledge about x.
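A worked discrete instance of the rule (the two hypotheses and all numbers below are arbitrary, for illustration only):

```python
# Bayes' rule p(x/y) = p(y/x) p(x) / p(y) for two discrete hypotheses.
prior = {'x0': 0.7, 'x1': 0.3}          # p(x): prior knowledge about x
likelihood = {'x0': 0.2, 'x1': 0.9}     # p(y/x) for one observed y

p_y = sum(prior[x] * likelihood[x] for x in prior)          # p(y)
posterior = {x: likelihood[x] * prior[x] / p_y for x in prior}

print(posterior)  # the observation shifts belief toward x1
```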
¹For example, suppose x_{t+1} depends on x_t and x_{t-1}; we redefine y_{t+1} = [x_{t+1}, x_t]^T; then
p(y_{t+1}/y_t, y_{t-1}, . . .) = p(y_{t+1}/y_t).
8. Let y = f(x) where x is a random variable. Then y is also a random variable, with a
density p(y) determined by f and p(x).
9. §7 and §8 are easy to carry out if the random variables involved are Gaussian.
If
	x ~ N(x̄, Σ_x)  and  y = Ax
then
	y ~ N(Ax̄, AΣ_xA^T)
Proof: For the case z = x + y with x ~ N(x̄, Σ_x) and y ~ N(ȳ, Σ_y), we can prove this via the
convolution formula directly.
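The transformation rule can be verified by sampling; a scalar sketch (the values x̄ = 1, σ² = 4, a = 3 are arbitrary):

```python
import random

# If x ~ N(x̄, σ²) and y = a x, then y ~ N(a x̄, a² σ²):
# here a x̄ = 3 and a² σ² = 36.
random.seed(3)
x_bar, sigma2, a = 1.0, 4.0, 3.0
N = 200_000
ys = [a * random.gauss(x_bar, sigma2 ** 0.5) for _ in range(N)]

m = sum(ys) / N
v = sum((y - m) ** 2 for y in ys) / N
print(m, v)  # ≈ 3, ≈ 36
```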
Application:
	x_t ~ N(x̄_t, P_t) and w_t ~ N(w̄_t, Q_t) ;  x_t and w_t are independent
	x_{t+1} = Φx_t + w_t
Consider x ≡ [x_t, w_t]^T; then
	x ~ N( [x̄_t, w̄_t]^T , diag(P_t, Q_t) ) ,  A = [Φ  I] ,  y = Ax = x_{t+1}
and
	x_{t+1} ~ N(Φx̄_t + w̄_t, ΦP_tΦ^T + Q_t)     (*)
The r.s. x_t is a Gauss-Markov random sequence with p(x_0) ~ N(x̄_0, P_0) and
p(x_{t+1}/x_t) ~ N(Φx_t, Q_t). All Gauss-Markov Random Sequences (GMRS) can be
represented by a linear system driven by Gaussian white noise.
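The propagation (*) is a simple recursion; a scalar sketch with w̄ = 0 (the values φ = 0.5, q = 1 and the initial conditions are arbitrary):

```python
# Time update (*) for x_{t+1} = Φ x_t + w_t in the scalar case:
# mean:       x̄ ← φ x̄        (w̄ = 0 here)
# covariance: P ← φ P φ + q
phi, q = 0.5, 1.0
x_bar, P = 2.0, 3.0

for _ in range(100):
    x_bar = phi * x_bar
    P = phi * P * phi + q

# With |φ| < 1 the covariance settles at P = q / (1 - φ²) = 4/3.
print(x_bar, P)  # ≈ 0, ≈ 1.3333
```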
The continuous time analog is
	w(t) ~ GWN(0, Q(t)):  E[w(t)] = 0 ,  E[w(t)w(τ)^T] = Q(t)δ(t−τ)
and the continuous time GMRP is
	dx/dt = Ax + w  ==>
	dx̄/dt = Ax̄ + w̄ ;  E(x_0) = x̄_0
	dP/dt = AP + PA^T + Q ;  Cov(x_0) = P_0
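The moment equations can be integrated numerically; a scalar Euler sketch (a = −1, q = 2 are arbitrary, with a < 0 so a steady state P → −q/(2a) exists):

```python
# Euler integration of the scalar moment equations
#   dx̄/dt = a x̄       (w̄ = 0 here)
#   dP/dt = 2 a P + q
a, q = -1.0, 2.0
x_bar, P = 1.0, 0.0
dt = 1e-3
for _ in range(20_000):        # integrate out to t = 20
    x_bar += dt * a * x_bar
    P += dt * (2 * a * P + q)

print(x_bar)  # → 0
print(P)      # → -q/(2a) = 1
```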
The general Markov process analog is
	p(x_{t+1}) = ∫ p(x_{t+1}/x_t) p(x_t) dx_t
If
	[x]      [x̄]   [Σ_xx  Σ_xy]
	[y] ~ N( [ȳ] , [Σ_yx  Σ_yy] )
then
	p(x/y) ~ N( x̄ + Σ_xy Σ_yy⁻¹ (y − ȳ) ,  Σ_xx − Σ_xy Σ_yy⁻¹ Σ_yx )
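A scalar instance of the conditioning formula (the construction x ~ N(0, 1), y = x + v with v ~ N(0, r) is an assumption for illustration, giving Σ_xx = 1, Σ_xy = 1, Σ_yy = 1 + r):

```python
# Conditional Gaussian: mean x̄ + Σ_xy/Σ_yy (y − ȳ),
# variance Σ_xx − Σ_xy²/Σ_yy.
r = 0.5
S_xx, S_xy, S_yy = 1.0, 1.0, 1.0 + r

y_obs = 1.2                       # an arbitrary observed value of y
cond_mean = 0.0 + S_xy / S_yy * (y_obs - 0.0)
cond_var = S_xx - S_xy ** 2 / S_yy

print(cond_mean)  # 0.8: the estimate moves toward the observation
print(cond_var)   # r/(1+r) = 1/3 < 1: conditioning shrinks the variance
```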
Proof: again for two dimensions, via a graphical illustration.
Application: x = x_{t+1} ~ N(x̄_{t+1}, M_{t+1}),  y = z_{t+1} ≡ Hx_{t+1} + v_{t+1}  where  v_{t+1} ~ N(0, R_{t+1})
and v_{t+1} is independent of x_{t+1}  ==>
	ȳ = Hx̄_{t+1} ,  Σ_yy = HM_{t+1}H^T + R_{t+1} ,  Σ_xy = Cov[x_{t+1} z_{t+1}^T] = M_{t+1}H^T
and
	x̂_{t+1} = x̄_{t+1} + M_{t+1}H^T(HM_{t+1}H^T + R_{t+1})⁻¹ (z_{t+1} − Hx̄_{t+1})
	P_{t+1} = M_{t+1} − M_{t+1}H^T(HM_{t+1}H^T + R_{t+1})⁻¹ HM_{t+1}
with p(x/y) ~ N(x̂_{t+1}, P_{t+1})     (**)
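The measurement update (**) in the scalar case with H = 1 (the values M = 4, R = 1, and the observation z = 2.5 are arbitrary, for illustration):

```python
# Scalar version of (**): with H = 1,
#   x̂ = x̄ + M/(M + R) · (z − x̄)   and   P = M − M²/(M + R).
M, R = 4.0, 1.0           # prior variance, measurement-noise variance
x_bar = 0.0               # prior mean of x_{t+1}
z = 2.5                   # observed z_{t+1}

K = M / (M + R)                   # gain M Hᵀ (H M Hᵀ + R)⁻¹ = 0.8
x_hat = x_bar + K * (z - x_bar)
P = M - K * M

print(x_hat)  # 2.0: the estimate moves most of the way to z
print(P)      # 0.8 < M: the measurement reduces the variance
```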
The continuous time analog is
	dx/dt = 0 ;  z(t) = Hx(t) + v(t) ;  v(t) ~ GWN(0, R):  E[v(t)v(τ)^T] = Rδ(t−τ)
and
	dx̂/dt = PH^TR⁻¹(z − Hx̂) ,  x̂(0) = x̄(0)
	dP/dt = −PH^TR⁻¹HP ,  P(0) = P_0