
14.384 Time Series Analysis, Fall 2007


Professor Anna Mikusheva
Paul Schrimpf, scribe
November 29, 2007

Lecture 25

MCMC: Metropolis Hastings Algorithm

A good reference is Chib and Greenberg (The American Statistician 1995).


Recall that the key object in Bayesian econometrics is the posterior distribution:

p(θ|Y_T) = f(Y_T|θ)p(θ) / ∫ f(Y_T|θ̃)p(θ̃) dθ̃

It is often difficult to compute this distribution. In particular, the integral in the denominator is
difficult. So far, we have gotten around this by using conjugate priors – classes of distributions for
which we know the form of the posterior. Generally, it is easy to compute the numerator, f(Y_T|θ)p(θ),
but it is hard to compute the normalizing constant, the integral in the denominator, ∫ f(Y_T|θ̃)p(θ̃) dθ̃.
One approach is to try to compute this integral in some clever way. Another, more common approach is
Markov Chain Monte Carlo (MCMC). The goal here is to generate a random sample θ_1, ..., θ_N from
p(θ|Y_T). We can then use moments from this sample to approximate moments of the posterior
distribution. For example,

E(θ|Y_T) ≈ (1/N) Σ_{n=1}^{N} θ_n
There are a number of methods for generating random samples from an arbitrary distribution.

Acceptance-Rejection Method (AR)


The goal is to simulate ξ ∼ π(x). We can calculate, for each x, the value of a function f(x) such that
π(x) = f(x)/k. The constant k is unknown. We have some candidate pdf h(x) from which we can simulate
draws, and there is a known constant c such that

f(x) ≤ ch(x) for all x
We simulate draws from π(x) as follows:


1. Draw z ∼ h(x), u ∼ U[0, 1]
2. If u ≤ f(z)/(ch(z)), then ξ = z. Otherwise repeat (1)

The intuition of the procedure is the following: let v = uch(z) and imagine the joint distribution of
(v, z). It is uniform on the region under the graph of ch(z) and above the z-axis, i.e. uniform on
{(v, z) : z ∈ Spt(h), 0 ≤ v ≤ ch(z)}. It is then fairly easy to see that if we accept ξ = z, the joint
distribution of (v, ξ) is uniform with support {(v, ξ) : ξ ∈ Spt(π), 0 ≤ v ≤ f(ξ)}. Then (for the same
reason that h(z) is the marginal density of (v, z)), the marginal density of ξ is f(ξ)/k = π(ξ). More
formally,

Cite as: Anna Mikusheva, course materials for 14.384 Time Series Analysis, Fall 2007.
MIT OpenCourseWare ([Link] Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Proof. Let ρ be the probability of rejecting a single draw. Then,

P(ξ ≤ x) = P(z_1 ≤ x, u_1 ≤ f(z_1)/(ch(z_1))) (1 + ρ + ρ² + ...)
         = (1/(1−ρ)) P(z_1 ≤ x, u_1 ≤ f(z_1)/(ch(z_1)))
         = (1/(1−ρ)) E_z[ P(u ≤ f(z)/(ch(z)) | z) 1{z ≤ x} ]     (iterated expectations)
         = (1/(1−ρ)) ∫_{−∞}^{x} (f(z)/(ch(z))) h(z) dz
         = ∫_{−∞}^{x} f(z)/(c(1−ρ)) dz     (f(z)/(c(1−ρ)) is a density, so c(1−ρ) = k)
         = ∫_{−∞}^{x} π(z) dz

A major drawback of this method is that it may lead us to reject many draws before we finally accept
one. This can make the procedure inefficient. If we choose c and h(z) poorly, then f(z)/(ch(z)) could be
very small for many z. It will be especially difficult to choose a good c and h(·) when we do not know
much about π(z).
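The steps above can be sketched in a few lines of code. This is a minimal illustration, not part of the
original notes: the target f, candidate h, and constant c below are assumptions chosen for the example
(a standard normal target up to scale, with a Laplace candidate).

```python
import random
import math

def accept_reject(f, h_sample, h_pdf, c, n):
    """Draw n samples from the density proportional to f via acceptance-rejection.

    f        : unnormalized target, f(x) = k * pi(x) with k unknown
    h_sample : function returning a draw z ~ h
    h_pdf    : candidate density h(x)
    c        : known constant with f(x) <= c * h(x) for all x
    """
    draws = []
    while len(draws) < n:
        z = h_sample()                    # step 1: candidate draw z ~ h
        u = random.random()               # step 1: u ~ U[0, 1]
        if u <= f(z) / (c * h_pdf(z)):    # step 2: accept with prob f(z)/(c h(z))
            draws.append(z)
    return draws

# Illustrative example: f(x) = exp(-x^2/2), so pi is standard normal with
# k = sqrt(2*pi). Candidate: Laplace(0,1), h(x) = exp(-|x|)/2. Then
# f(x)/h(x) = 2*exp(|x| - x^2/2) <= 2*exp(1/2), so c = 2*exp(1/2) works.
f = lambda x: math.exp(-x * x / 2)
h_pdf = lambda x: 0.5 * math.exp(-abs(x))
h_sample = lambda: random.expovariate(1.0) * random.choice([-1, 1])
c = 2 * math.exp(0.5)

random.seed(0)
sample = accept_reject(f, h_sample, h_pdf, c, 20000)
print(sum(sample) / len(sample))  # should be close to 0, the mean of pi
```

Note that the closer ch(x) hugs f(x), the fewer rejections; here the expected acceptance probability is
k/c, so a loose bound c directly wastes draws.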

Markov Chains
A Markov chain is a stochastic process where the distribution of x_{t+1} depends only on x_t:
P(x_{t+1} ∈ A | x_t, x_{t−1}, ...) = P(x_{t+1} ∈ A | x_t) for all A.
Definition 1. A transition kernel is a function, P (x, A), which is a probability measure in the second
argument. It gives the probability of moving from x into the set A.
We want to study the behavior of a sequence of draws x_1 → x_2 → ... where we move around according
to a transition kernel. Suppose the distribution of x_k is π*; then the distribution of y = x_{k+1}
satisfies

π̃(y) dy = ∫_ℝ π*(x) P(x, dy) dx

Definition 2. π* is an invariant measure (with respect to the transition kernel P(x, A)) if π̃ = π*.


Under some regularity conditions, the distribution of a Markov chain converges to its unique invariant
distribution.
In MCMC the goal is to simulate a draw from π. We need to find a transition kernel P(x, dy) such that
π is its invariant measure. Let us suppose that π is continuous. We will consider the class of kernels

P(x, dy) = p(x, y) dy + r(x) δ_x(dy)

where δ_x is the point mass at x; i.e. we stay at x with probability r(x), and otherwise y is
distributed according to a pdf proportional to p(x, y). Note that p(x, y) is not exactly a density
because it does not integrate to 1: ∫ P(x, dy) = 1 = ∫ p(x, y) dy + r(x), so ∫ p(x, y) dy ≤ 1.
Definition 3. A transition kernel is reversible if π(x)p(x, y) = π(y)p(y, x)
Theorem 4. If a transition kernel is reversible, then π is invariant.
There are more general conditions under which a Markov chain converges. Generally, if the transition
kernel is irreducible (it can reach any point from any other point) and aperiodic (not periodic, i.e.
the greatest common divisor of {n : y can be reached from x in n steps} is 1), then the chain converges
to an invariant distribution.
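Theorem 4 is easy to verify numerically for a finite-state chain. The sketch below (not from the notes;
the three-state target pi and the symmetric proposal Q are illustrative assumptions) builds a kernel
satisfying detailed balance π(x)p(x, y) = π(y)p(y, x) and checks that π is then invariant, i.e. πP = π.

```python
import numpy as np

pi = np.array([0.2, 0.3, 0.5])          # illustrative target distribution

# Symmetric proposal plus a Metropolis-style acceptance step gives a
# reversible kernel; r(i) collects the leftover "stay put" probability.
Q = np.full((3, 3), 1.0 / 3.0)          # symmetric proposal
P = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        if i != j:
            P[i, j] = Q[i, j] * min(1.0, pi[j] / pi[i])
    P[i, i] = 1.0 - P[i].sum()          # r(i): probability of staying at i

# Detailed balance: pi[i] * P[i, j] == pi[j] * P[j, i] for all i, j
flows = pi[:, None] * P
assert np.allclose(flows, flows.T)
# Hence pi is invariant: pi P = pi
assert np.allclose(pi @ P, pi)
print(pi @ P)                           # prints a vector equal to pi
```

The point of the proof of Theorem 4 is visible in `flows`: reversibility makes the probability flow
from i to j equal the flow from j to i, so total mass at each state is unchanged by one step.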


Metropolis-Hastings
Suppose we have a Markov chain in state x. We want to simulate a draw from a transition kernel p(x, y)
with invariant measure π, but we do not know the form of p(x, y). We do know how to compute a function
proportional to π, f(x) = kπ(x). Assume that we can draw y ∼ q(x, y), a pdf with respect to y (so
∫ q(x, y) dy = 1). Consider using this q as a transition kernel. Notice that if

π(x)q(x, y) > π(y)q(y, x)

then we would move from x to y too often. This suggests that rather than always moving to the new y we
draw, we should only move with some probability, α(x, y). If we construct α(x, y) such that

π(x)q(x, y)α(x, y) = π(y)q(y, x)α(y, x)

then we will have a reversible transition kernel with invariant measure π. We can take:
α(x, y) = min{1, π(y)q(y, x) / (π(x)q(x, y))}

We can calculate α(x, y) because although we do not know π(x), we do know f(x) = kπ(x), so we can
compute the ratio. In summary, the Metropolis-Hastings algorithm is: given x_j, we move to x_{j+1} by

1. Generate a draw, y, from q(x_j, ·)
2. Calculate α(x_j, y)
3. Draw u ∼ U[0, 1]
4. If u < α(x_j, y), then x_{j+1} = y. Otherwise x_{j+1} = x_j
Then the marginal distribution of x_j will converge to π. In practice, we begin the chain at an
arbitrary x_0, run the algorithm many, say M, times, then use the last N < M draws as a sample from π.
Note that although the marginal distribution of each x_j is (approximately) π, the x_j are
autocorrelated. This is not a problem for computing moments from the draws (although the higher the
autocorrelation, the more draws we need to get the same accuracy), but if we want to put standard
errors on these moments, we need to take the autocorrelation into account.
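The four steps above can be sketched as follows, for the common random-walk special case where q is
symmetric and the q-ratio cancels out of α. This is an illustrative sketch, not the notes' own code;
the target f, starting point, step size, and burn-in fraction are assumptions chosen for the example.

```python
import random
import math

def metropolis_hastings(f, x0, step, m):
    """Random-walk Metropolis-Hastings for a target known up to scale.

    f    : unnormalized target density, f(x) = k * pi(x)
    x0   : arbitrary starting point
    step : standard deviation of the N(0, step^2) proposal increment
    m    : total number of iterations M
    """
    x = x0
    chain = []
    for _ in range(m):
        y = x + random.gauss(0.0, step)   # 1. draw y ~ q(x, .)
        alpha = min(1.0, f(y) / f(x))     # 2. q symmetric, so the q-ratio cancels
        u = random.random()               # 3. u ~ U[0, 1]
        if u < alpha:                     # 4. accept, else stay at x
            x = y
        chain.append(x)
    return chain

# Illustrative target: pi(x) proportional to exp(-x^2/2) (standard normal).
# Start far from the mode and discard the first half of the chain as burn-in.
random.seed(1)
f = lambda x: math.exp(-x * x / 2)
draws = metropolis_hastings(f, x0=5.0, step=1.0, m=40000)
kept = draws[20000:]
print(sum(kept) / len(kept))  # should be close to E(x) = 0
```

Because only the ratio f(y)/f(x) enters, the unknown normalizing constant k never needs to be
computed, which is exactly what makes the method usable for posteriors.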

Choice of q()
• Random walk chain: q(x, y) = q1(y − x), i.e. y = x + ε, ε ∼ q1. This can be a nice choice because
if q1 is symmetric, q1(z) = q1(−z), then q(y, x)/q(x, y) drops out of α(x, y). Popular choices of q1
are normal and U[−a, a]. Note that there is a tradeoff between step size in the chain and rejection
probability when choosing σ² = Eε². Choosing σ² too large will lead to many draws of y from low
probability areas (low π), and as a result we will reject lots of draws. Choosing σ² too small will
lead us to accept most draws, but not move very much, and we will have difficulty covering the whole
support of π. In either case, the autocorrelation in our draws will be very high and we'll need more
draws to get a good sample from π.
• Independence chain: q(x, y) = q1(y)
• If π(y) ∝ ψ(y)h(y) and we can sample from h, take q(x, y) = h(y). This also simplifies α(·).
• Autocorrelated chain: y = a + B(x − a) + ε with B < 0; this leads to negative autocorrelation in y.
The hope is that this offsets some of the positive autocorrelation inherent in the procedure.
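The step-size tradeoff for the random walk chain is easy to see numerically. The sketch below (an
illustration with assumed values, not from the notes) runs random-walk Metropolis on a standard normal
target and reports the fraction of accepted moves for a very small and a very large σ.

```python
import random
import math

def acceptance_rate(step, m=20000, seed=2):
    """Fraction of accepted moves in random-walk Metropolis on a standard
    normal target (f(x) = exp(-x^2/2), known up to scale)."""
    random.seed(seed)
    f = lambda x: math.exp(-x * x / 2)
    x, accepted = 0.0, 0
    for _ in range(m):
        y = x + random.gauss(0.0, step)
        if random.random() < min(1.0, f(y) / f(x)):
            x = y
            accepted += 1
    return accepted / m

# Tiny steps are almost always accepted but barely move the chain;
# huge steps mostly propose y in low-pi regions and are rejected.
print(acceptance_rate(0.1))   # high acceptance, slow exploration
print(acceptance_rate(25.0))  # low acceptance
```

Neither extreme is good: in both cases the chain's autocorrelation is high, which is the point of the
tradeoff described in the first bullet.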

Cite as: Anna Mikusheva, course materials for 14.384 Time Series Analysis, Fall 2007.
MIT OpenCourseWare ([Link] Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].