Discrete distributions

Let X be a stochastic variable with possible values {x1, . . . , xk} and
P(X = xi) = pi. Of course, sum_{i=1}^{k} pi = 1.

Define F_0 = 0 and F_i = p1 + . . . + pi, so that F_1 = p1,
F_2 = p1 + p2, . . . , F_k = 1. Each interval I_i = (F_{i−1}, F_i]
then corresponds to a single value of x.

An algorithm for simulating a value for x is then:

  u ∼ U[0, 1]
  for i = 1, 2, . . . , k do
    if u ∈ (F_{i−1}, F_i] then
      x ← xi
    end if
  end for

Proof & Note

Proof.
  P(X = xi) = P(u ∈ (F_{i−1}, F_i])
            = P(u ≤ F_i) − P(u ≤ F_{i−1})
            = F_i − F_{i−1}
            = (p1 + . . . + pi) − (p1 + . . . + p_{i−1}) = pi

Note: We may have k = ∞.
  • The algorithm is not necessarily very efficient. If k is large,
    many comparisons are needed.
  • This generic algorithm works for any discrete distribution. For
    specific distributions there exist alternative algorithms.
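The interval-search algorithm above can be sketched in Python; the function name `sample_discrete` and the running-sum formulation of the F_i are my own choices:

```python
import random

def sample_discrete(values, probs, rng=random):
    """Sample x_i with probability p_i: draw u ~ U[0, 1] and return
    the x_i whose interval (F_{i-1}, F_i] contains u."""
    u = rng.random()
    F = 0.0  # running cumulative sum F_i
    for x, p in zip(values, probs):
        F += p
        if u <= F:  # u falls in (F_{i-1}, F_i]
            return x
    return values[-1]  # guard against floating-point round-off in sum(probs)

# Example: P(X = 1) = 0.2, P(X = 2) = 0.5, P(X = 3) = 0.3
x = sample_discrete([1, 2, 3], [0.2, 0.5, 0.3])
```

As the slide notes, this linear scan needs up to k comparisons per draw; for large k a binary search over the F_i would be faster.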
Bernoulli distribution

Let S = {0, 1}, P(X = 0) = 1 − p, P(X = 1) = p.
Thus X ∼ Bin(1, p).

The algorithm now becomes:

  u ∼ U[0, 1]
  x = I(u ≤ p)

[Figure: CDF of the Bernoulli distribution, jumping from 0 to 1 − p
at x = 0 and from 1 − p to 1 at x = 1.]

Binomial distribution

Let X ∼ Bin(n, p).
The generic algorithm from before can be used, but involves tedious
calculations which may cause overflow difficulties if n is large.
An alternative is to sum n independent Bernoulli trials:

  x = 0
  for i = 1, 2, . . . , n do
    generate u ∼ U[0, 1]
    if u ≤ p then
      x ← x + 1
    end if
  end for
  return x
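Both algorithms translate directly to Python; the function names `bernoulli` and `binomial` are illustrative:

```python
import random

def bernoulli(p, rng=random):
    """One Bernoulli(p) draw: x = I(u <= p)."""
    return 1 if rng.random() <= p else 0

def binomial(n, p, rng=random):
    """Bin(n, p) as a sum of n independent Bernoulli(p) trials,
    avoiding the long cumulative sums of the generic algorithm."""
    x = 0
    for _ in range(n):
        if rng.random() <= p:
            x += 1
    return x
```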
Geometric and negative binomial distribution

The negative binomial distribution gives the probability of needing
x trials to get r successes, where the probability of a success is
given by p. We write X ∼ NB(r, p). (For r = 1 this is the geometric
distribution.)

The generic algorithm can still be used, but an alternative is:

  s = 0          (# of successes)
  x = 0          (# of tries)
  while s < r do
    x ← x + 1
    u ∼ U[0, 1]
    if u ≤ p then
      s ← s + 1
    end if
  end while
  return x

Poisson distribution

Let X ∼ Po(λ), i.e. f(x) = λ^x e^{−λ} / x!, x = 0, 1, 2, . . . .
An alternative to the generic algorithm is to count the events of a
rate-λ Poisson process that fall in the time interval [0, 1]:

  x = 0          (# of events)
  t = 0          (time)
  while t < 1 do
    Δt ∼ Exp(λ)
    t ← t + Δt
    x ← x + 1
  end while
  x ← x − 1
  return x

It remains to decide how to generate Δt ∼ Exp(λ).
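A sketch of both samplers in Python; the function names are illustrative, and for Δt ∼ Exp(λ) I use the standard library's `random.expovariate`, since the slides only construct an exponential generator afterwards:

```python
import random

def negative_binomial(r, p, rng=random):
    """Number of trials needed to obtain r successes, X ~ NB(r, p)."""
    s = 0  # successes so far
    x = 0  # trials so far
    while s < r:
        x += 1
        if rng.random() <= p:
            s += 1
    return x

def poisson(lam, rng=random):
    """Count the events of a rate-lam Poisson process in [0, 1]:
    accumulate Exp(lam) inter-arrival times until t >= 1."""
    x, t = 0, 0.0
    while t < 1.0:
        t += rng.expovariate(lam)  # Δt ~ Exp(lam)
        x += 1
    return x - 1  # the last event landed beyond t = 1
```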
Change of variables formula

Let X be a continuous random variable with density fX(x).
Consider now the random variable Y = g(X), where for example
Y = exp(X), Y = 1/X, . . . .

Question: What is the density fY(y) of Y?

For a strictly monotone and differentiable function g we can apply
the change of variables formula:

  fY(y) = fX(g^{−1}(y)) · |dg^{−1}(y)/dy|

Proof via the cumulative distribution function (CDF) FY(y) of Y
(blackboard).

Example

Consider X ∼ U[0, 1] and Y = −log(X), i.e. y = g(x) = −log(x).
The inverse function and its first derivative are:

  g^{−1}(y) = exp(−y),   dg^{−1}(y)/dy = −exp(−y)

Application of the change of variables formula leads to:

  fY(y) = 1 · |−exp(−y)| = exp(−y)

It follows: Y ∼ Exp(1)! Thus, this is a simple way to generate
exponentially distributed random variables!
More generally, Y = −(1/λ) log(X) leads to random variables from an
exponential distribution with parameter λ: Y ∼ Exp(λ).
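The transformation Y = −(1/λ) log(X) gives a one-line exponential generator. In this sketch I feed 1 − u into the logarithm to avoid log(0), since Python's `random.random()` can return exactly 0; 1 − u is just as uniform on (0, 1]:

```python
import math
import random

def exponential(lam, rng=random):
    """Exp(lam) draw via the transformation Y = -(1/lam) * log(X),
    X ~ U[0, 1]."""
    u = 1.0 - rng.random()  # u in (0, 1], so log(u) is finite
    return -math.log(u) / lam
```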
Inverse cumulative distribution function

More generally, the inversion method (or probability integral
transform approach) can be used to generate samples from an
arbitrary continuous distribution with density f(x) and CDF F(x):

1. Generate a random variable U from the standard uniform
   distribution on the interval [0, 1].
2. Then X = F^{−1}(U) is a random variable from the target
   distribution.

Proof.
  fX(x) = fU(F(x)) · F′(x) = 1 · f(x) = f(x)

Inverse cumulative distribution function (II)

Let X have density f(x), x ∈ R, and CDF F(x) = ∫_{−∞}^{x} f(z) dz.

[Figure: a density f(x) (top) and its CDF F(x) (bottom); a uniform
draw u on the vertical axis of the CDF is mapped to x = F^{−1}(u)
on the horizontal axis.]

Simulation algorithm:

  u ∼ U[0, 1]
  x = F^{−1}(u)
  return x
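A minimal inversion-method sketch; the helper name `inverse_transform` is my own, and the usage example relies on the Exp(λ) inverse CDF F^{−1}(u) = −log(1 − u)/λ, which follows from F(x) = 1 − exp(−λx):

```python
import math
import random

def inverse_transform(F_inv, rng=random):
    """Inversion method: draw u ~ U[0, 1] and return x = F^{-1}(u)."""
    return F_inv(rng.random())

# Usage example: Exp(lam) has F(x) = 1 - exp(-lam * x),
# so F^{-1}(u) = -log(1 - u) / lam.
lam = 2.0
x = inverse_transform(lambda u: -math.log(1.0 - u) / lam)
```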
Standard Cauchy distribution

Density and CDF of the standard Cauchy distribution are:

  f(x) = (1/π) · 1/(1 + x²)   and   F(x) = 1/2 + arctan(x)/π

The inverse CDF is thus:

  F^{−1}(y) = tan(π(y − 1/2))

Random numbers from the standard Cauchy distribution can easily
be generated by sampling U1, . . . , Un from U[0, 1] and then
computing tan[π(Ui − 1/2)].

Note

The inversion method cannot always be used! We must have a
formula for F(x) and be able to find F^{−1}(u). This is for example
not possible for the normal, χ², gamma and t-distributions.
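The Cauchy recipe above as inversion in Python (the function name `cauchy` is illustrative):

```python
import math
import random

def cauchy(rng=random):
    """Standard Cauchy draw via inversion: x = tan(pi * (u - 1/2))."""
    u = rng.random()
    return math.tan(math.pi * (u - 0.5))
```

Note that u = 1/2 maps to x = 0, the median of the Cauchy distribution, and values of u near 0 or 1 produce the heavy tails.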