Inverse Transform Method
Overview
Inverse transform sampling is a method for generating random numbers from any probability distribution by using its inverse cumulative distribution $F^{-1}(x)$. Recall that the cumulative distribution function for a random variable $X$ is $F_X(x) = P(X \le x)$. In what follows, we assume that our computer can, on demand, generate independent realizations of a random variable $U$ uniformly distributed on $[0, 1]$.
Algorithm
Continuous Distributions
Assume we want to generate a random variable X with cumulative distribution function (CDF) FX . The
inverse transform sampling algorithm is simple:
1. Generate $U \sim \text{Unif}(0, 1)$
2. Let $X = F_X^{-1}(U)$
Then $X$ will follow the distribution governed by the CDF $F_X$, which was our desired result.
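To see why this works, assume $F_X$ is continuous and strictly increasing, so that $F_X^{-1}$ is well defined. Then

$$P(X \le x) = P\left(F_X^{-1}(U) \le x\right) = P\left(U \le F_X(x)\right) = F_X(x),$$

where the last equality holds because $P(U \le u) = u$ for any $u \in [0, 1]$ when $U \sim \text{Unif}(0, 1)$. So $X$ has CDF $F_X$, as claimed.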
Note that this algorithm works in general but is not always practical. For example, inverting $F_X$ is easy if $X$ is an exponential random variable, but it is harder if $X$ is a normal random variable, whose CDF has no closed-form inverse.
Discrete Distributions
Now we will consider the discrete version of the inverse transform method. Assume that $X$ is a discrete random variable such that $P(X = x_i) = p_i$. The algorithm proceeds as follows:
1. Generate $U \sim \text{Unif}(0, 1)$
2. Determine the index $k$ such that $\sum_{j=1}^{k-1} p_j < U \le \sum_{j=1}^{k} p_j$, and return $X = x_k$.
Notice that the second step requires a search.
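As an illustration of step 2 with a hypothetical draw: suppose $p = (0.1, 0.4, 0.2, 0.3)$ and we draw $U = 0.62$. The cumulative sums are $0.1, 0.5, 0.7, 1.0$, and since $0.5 < 0.62 \le 0.7$, we take $k = 3$ and return $X = x_3$.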
Assume our random variable X can take on any one of k values with probabilities p1 ,..., pk . We
implement the algorithm below, assuming these probabilities are stored in a vector called [Link].
# note: this inefficient implementation is for pedagogical purposes
# in general, consider using the rmultinom() function
[Link] <- function( [Link] ) {
  U <- runif(1)
  if (U <= [Link][1]) {
    return(1)
  }
  for (state in 2:length([Link])) {
    if (sum([Link][1:(state-1)]) < U && U <= sum([Link][1:state])) {
      return(state)
    }
  }
}
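The repeated calls to sum() inside the loop can also be avoided. As a sketch (the function name here is illustrative, not from the text above), R's cumsum() and findInterval() locate the index $k$ in one vectorized step:

```r
# vectorized version of the search in step 2: find the first index k
# whose cumulative probability is >= U (left.open = TRUE gives the
# half-open intervals (sum_{j<k} p_j, sum_{j<=k} p_j] from the algorithm)
discrete_inv_sample <- function(prob) {
  U <- runif(1)
  findInterval(U, cumsum(prob), left.open = TRUE) + 1
}
```

This computes the cumulative sums once instead of recomputing them at every loop iteration.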
Continuous Example: Exponential Distribution
Assume $Y$ is an exponential random variable with rate parameter $\lambda = 2$. Recall that the probability density function is $p(y) = 2e^{-2y}$ for $y \ge 0$. First, we compute the CDF:

$$F_Y(x) = P(Y \le x) = \int_0^x 2e^{-2y} \, dy = 1 - e^{-2x}$$

Solving for the inverse CDF, we get

$$F_Y^{-1}(y) = -\frac{\ln(1 - y)}{2}$$

Using our algorithm above, we first generate $U \sim \text{Unif}(0, 1)$ and then set $X = F_Y^{-1}(U) = -\frac{\ln(1 - U)}{2}$. We do this in the R code below and compare the histogram of our samples with the true density of $Y$.
# inverse transform sampling
[Link] <- 1000
U <- runif([Link])
X <- -log(1-U)/2
# plot
hist(X, freq=F, xlab='X', main='Generating Exponential R.V.')
curve(dexp(x, rate=2) , 0, 3, lwd=2, xlab = "", ylab = "", add = T)
Indeed, the plot indicates that our random variables are following the intended distribution.
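Beyond the visual check, a Kolmogorov-Smirnov test offers a quantitative sanity check. The sketch below regenerates the samples so it is self-contained:

```r
# optional sanity check: regenerate exponential samples via inverse
# transform and test them against the Exp(rate = 2) CDF
set.seed(1)
U <- runif(1000)
X <- -log(1 - U) / 2
ks.test(X, "pexp", rate = 2)  # a large p-value is consistent with Exp(2)
```

A small p-value here would signal a mismatch between the samples and the intended distribution.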
Discrete Example
Let’s assume we want to simulate a discrete random variable $X$ with the following distribution:

x_i    P(X = x_i)
1      0.1
2      0.4
3      0.2
4      0.3
Below we simulate from this distribution using the [Link]() function above, and
plot both the true probability vector, and the empirical proportions from our simulation.
[Link] <- 1000
[Link] <- c(0.1, 0.4, 0.2, 0.3)
names([Link]) <- 1:4
samples <- numeric([Link])
for (i in seq_len([Link])) {
  samples[i] <- [Link]([Link])
}
barplot([Link], main='True Probability Mass Function')
barplot(table(samples), main='Empirical Probability Mass Function')
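As a cross-check (a sketch with illustrative names), R's built-in sample() draws from the same PMF, so for large $n$ its empirical proportions should agree with both the bar plots above and our sampler's output:

```r
# cross-check: draw from the same PMF with R's built-in sampler and
# compare empirical proportions to the true probabilities
p <- c(0.1, 0.4, 0.2, 0.3)
n <- 10000
builtin <- sample(1:4, size = n, replace = TRUE, prob = p)
prop.table(table(builtin))
```

Agreement between the two samplers, and convergence of both toward $p$ as $n$ grows, is what the law of large numbers predicts.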