The goal of the M-H algorithm is to design a Markov chain so that its
stationary distribution is the same as the desired distribution $p(x)$,
i.e., after the ``burn-in'' period of some $T$ iterations, the
consecutive states $X_{T+1}, X_{T+2}, \ldots$ of the chain are statistically
equivalent to samples drawn from $p(x)$.
M-H sampling assumes an arbitrary proposal distribution $q(y|x)$
that depends on the current state $x$ of the Markov chain; for example,
a multivariate normal distribution with mean vector equal to $x$ and
some covariance matrix. A candidate state $Y$ generated by the proposal
distribution is accepted with the probability
$$\alpha(X_t, Y) = \min\left(1,\; \frac{p(Y)\,q(X_t|Y)}{p(X_t)\,q(Y|X_t)}\right).$$
The algorithm proceeds as follows:
  Initialize X_0, set t=0.
  Repeat {
    get a sample point Y from q(.|X_t)
    get a sample value u from a Uniform(0,1)
    if u < \alpha(X_t, Y) then X_{t+1}=Y (Y accepted)
    else X_{t+1}=X_t (Y rejected)
    t=t+1
  }
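The loop above can be sketched in Python. As an illustration we assume a standard normal target $p(x)$ and a symmetric Gaussian random-walk proposal, so the ratio $q(X_t|Y)/q(Y|X_t)$ cancels in $\alpha$; both choices are assumptions for this sketch, not part of the notes.

```python
import math
import random

def metropolis_hastings(log_p, proposal, x0, n_iters, burn_in):
    """M-H loop following the pseudocode above.

    log_p    -- log-density of the target (up to an additive constant)
    proposal -- maps the current state x to a candidate Y; assumed
                SYMMETRIC here, so alpha = min(1, p(Y)/p(X_t))
    """
    x = x0
    samples = []
    for t in range(n_iters):
        y = proposal(x)                       # sample Y from q(.|X_t)
        log_alpha = min(0.0, log_p(y) - log_p(x))
        u = random.random()                   # sample u from Uniform(0,1)
        if u < math.exp(log_alpha):
            x = y                             # Y accepted: X_{t+1} = Y
        # else Y rejected: X_{t+1} = X_t
        if t >= burn_in:                      # discard the burn-in period
            samples.append(x)
    return samples

# Illustrative run: standard normal target, random-walk proposal.
random.seed(0)
log_p = lambda x: -0.5 * x * x
proposal = lambda x: x + random.gauss(0.0, 1.0)
draws = metropolis_hastings(log_p, proposal, x0=0.0,
                            n_iters=20000, burn_in=2000)
```

After burn-in the retained draws should have sample mean near 0 and sample variance near 1, matching the standard normal target.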
Proof of M-H algorithm
Now we show that the states of the Markov chain generated by the M-H
algorithm do satisfy the requirement. First we denote the probability
of the transition from a given state $x$ to the next state $y \ne x$ by
$$P(x \rightarrow y) = q(y|x)\,\alpha(x, y).$$
Step 1: First we show that the detailed balance equation
defined below always holds:
$$p(x)\,P(x \rightarrow y) = p(y)\,P(y \rightarrow x).$$
Substituting the definition of $P(x \rightarrow y)$,
\begin{eqnarray*}
p(x)\,P(x \rightarrow y)
  &=& p(x)\,q(y|x)\,\min\left(1,\; \frac{p(y)\,q(x|y)}{p(x)\,q(y|x)}\right) \\
  &=& \min\bigl(p(x)\,q(y|x),\; p(y)\,q(x|y)\bigr) \\
  &=& p(y)\,q(x|y)\,\min\left(\frac{p(x)\,q(y|x)}{p(y)\,q(x|y)},\; 1\right) \\
  &=& p(y)\,q(x|y)\,\alpha(y, x) \;=\; p(y)\,P(y \rightarrow x).
\end{eqnarray*}
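Detailed balance can also be verified numerically on a small discrete state space, where the transition kernel $P(i \rightarrow j) = q(j|i)\,\alpha(i, j)$ is an explicit matrix. The three-state target and proposal below are arbitrary illustrative choices, not from the notes.

```python
# Build the M-H transition matrix for a 3-state chain and check
# detailed balance p(i) P(i->j) = p(j) P(j->i) for every pair.
p = [0.2, 0.3, 0.5]                      # target distribution (assumed)
q = [[0.1, 0.4, 0.5],                    # q[i][j] = proposal prob of j given i
     [0.3, 0.3, 0.4],
     [0.2, 0.6, 0.2]]

n = len(p)
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        if i != j:
            alpha = min(1.0, p[j] * q[j][i] / (p[i] * q[i][j]))
            P[i][j] = q[i][j] * alpha    # P(i -> j) = q(j|i) alpha(i, j)
    P[i][i] = 1.0 - sum(P[i])            # rejected mass stays at state i

# Detailed balance holds for every pair (i, j)
balance_ok = all(
    abs(p[i] * P[i][j] - p[j] * P[j][i]) < 1e-12
    for i in range(n) for j in range(n)
)

# Consequently p is stationary: p P = p
p_next = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
```

Here `balance_ok` is True and `p_next` equals `p`, which is exactly the stationarity property proved in Step 2.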
Step 2: Next we show that if $X_t$ is from $p(x)$, then $X_{t+1}$
generated by the M-H method will also be from the same $p(x)$. Rewrite the
balance equation as