Metropolis-Hastings algorithm

The goal of the M-H algorithm is to design a Markov chain whose stationary distribution is the desired distribution $p(X)$, i.e., after a ``burn-in'' period of some $m$ iterations, the subsequent states $X_t$ $(t>m)$ of the chain are statistically equivalent to samples drawn from $p(X)$.

An arbitrary proposal distribution $q(\cdot\vert X)$ is first chosen, for example, a multivariate normal distribution with some mean vector and covariance matrix. A candidate state $Y$ generated by the proposal distribution is accepted with probability

\begin{displaymath}\alpha(X_t,Y)=\min\left(1, \frac{p(Y)q(X_t\vert Y)}{p(X_t)q(Y\vert X_t)}\right) \end{displaymath}

If the candidate $Y$ is accepted, the next state is $X_{t+1}=Y$; if $Y$ is rejected, the next state remains the same as the current state, $X_{t+1}=X_t$.
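
In the common special case of a symmetric proposal, $q(Y\vert X_t)=q(X_t\vert Y)$ (e.g., a normal distribution centered at the current state $X_t$), the proposal densities cancel and the acceptance probability reduces to

\begin{displaymath}\alpha(X_t,Y)=\min\left(1, \frac{p(Y)}{p(X_t)}\right) \end{displaymath}

which is the original Metropolis algorithm; the general expression above is the Hastings extension.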

  Initialize X_0, set t=0.
  Repeat {
    get a sample point Y from q(.|X_t)
    get a sample value u from Uniform(0,1)
    if u < \alpha(X_t, Y) then X_{t+1}=Y (Y accepted)
    else X_{t+1}=X_t (Y rejected)
    t=t+1
  }
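
The pseudocode above translates into a minimal Python sketch. The standard normal target density and the Gaussian random-walk proposal with an assumed step size are illustrative choices only, not part of the algorithm itself:

  import numpy as np

  def metropolis_hastings(p, x0, n_samples, step=1.0, burn_in=1000, seed=0):
      # Draw samples from an unnormalized density p using a Gaussian
      # random-walk proposal q(.|x) = Normal(x, step^2), which is
      # symmetric, so the proposal densities cancel in alpha.
      rng = np.random.default_rng(seed)
      x = x0
      samples = []
      for t in range(burn_in + n_samples):
          y = x + step * rng.standard_normal()   # candidate Y ~ q(.|X_t)
          alpha = min(1.0, p(y) / p(x))          # acceptance probability
          if rng.uniform() < alpha:              # u ~ Uniform(0,1)
              x = y                              # Y accepted
          samples.append(x)                      # on rejection X_{t+1}=X_t
      return np.array(samples[burn_in:])         # discard burn-in period

  # Example: standard normal target (normalizing constant not needed)
  samples = metropolis_hastings(lambda x: np.exp(-0.5 * x**2), 0.0, 5000)
  print(samples.mean(), samples.var())           # should be near 0 and 1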

Proof

Now we show that the states of the Markov chain generated by the M-H algorithm indeed satisfy this requirement. First, denote the probability of a transition from a given state $X$ to the next state $Y$ by

\begin{displaymath}P(X \rightarrow Y)\stackrel{\triangle}{=}P(Y\vert X)=q(Y\vert X)\alpha(X,Y) \end{displaymath}

where $q(Y\vert X)$ is the probability that $Y$ is generated given the current state $X$, and $\alpha(X,Y)$ is the probability that $Y$ is accepted as the next state. (If $Y$ is rejected, the chain stays at $X$; this rejection part of the kernel is concentrated at $Y=X$ and satisfies the detailed balance equation below trivially, so we only need to consider $Y \neq X$.)

Step 1: First we show that the detailed balance equation defined below always holds:

\begin{displaymath}P(Y\vert X)p(X)=P(X\vert Y)p(Y), \mbox{ i.e., }
q(Y\vert X) \alpha(X,Y)p(X)=q(X\vert Y) \alpha(Y,X)p(Y) \end{displaymath}

Consider the following three exhaustive cases for the ratio $r\stackrel{\triangle}{=}p(Y)q(X\vert Y)/(p(X)q(Y\vert X))$:

Case 1, $r=1$: Then $\alpha(X,Y)=\alpha(Y,X)=1$, and both sides equal $q(Y\vert X)p(X)=q(X\vert Y)p(Y)$.

Case 2, $r<1$: Then $\alpha(X,Y)=r$ and $\alpha(Y,X)=1$, so the left-hand side becomes $q(Y\vert X)\,r\,p(X)=q(X\vert Y)p(Y)$, which is the right-hand side.

Case 3, $r>1$: Then $\alpha(X,Y)=1$ and $\alpha(Y,X)=1/r$, and the equation holds by the same argument as Case 2 with the roles of $X$ and $Y$ exchanged.

In all three cases the detailed balance equation holds.
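
The detailed balance equation can also be checked numerically on a small discrete state space; the target $p$ and proposal $q$ below are random, chosen purely for illustration:

  import numpy as np

  rng = np.random.default_rng(1)
  n = 4
  p = rng.random(n); p /= p.sum()        # arbitrary target distribution p(x)
  q = rng.random((n, n))
  q /= q.sum(axis=1, keepdims=True)      # proposal: q[x, y] = q(y|x)

  # Acceptance probabilities alpha[x, y] = min(1, p(y)q(x|y)/(p(x)q(y|x)))
  alpha = np.minimum(1.0, (p[None, :] * q.T) / (p[:, None] * q))
  P = q * alpha                          # P[x, y] = q(y|x) alpha(x, y)
  P[np.diag_indices(n)] += 1.0 - P.sum(axis=1)   # rejected mass stays at x

  # Detailed balance: p(x)P(y|x) = p(y)P(x|y), i.e., diag(p) @ P is symmetric
  assert np.allclose(p[:, None] * P, (p[:, None] * P).T)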

Step 2: Next we show that if $X_t$ is from $p(X)$, then $X_{t+1}$ generated by the M-H method will also be from the same $p(X)$. Rewrite the balance equation as

\begin{displaymath}P(X_{t+1}\vert X_t)p(X_t)=P(X_t\vert X_{t+1})p(X_{t+1}) \end{displaymath}

and integrate both sides with respect to $X_t$ to get

\begin{displaymath}\int P(X_{t+1}\vert X_t)p(X_t)\,dX_t=\int P(X_t\vert X_{t+1})p(X_{t+1})\,dX_t=p(X_{t+1}) \end{displaymath}

The left-hand side is the marginal distribution of $X_{t+1}$ under the assumption that $X_t$ is drawn from $p(X)$; the right-hand side equals $p(X_{t+1})$ because the transition probability $P(X_t\vert X_{t+1})$ integrates to $1$ over $X_t$. Therefore $X_{t+1}$ is also distributed according to the same $p(X)$.
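
As a concrete sanity check of this conclusion, consider a tiny two-state example (target $p=(1/4,\,3/4)$ with a proposal that always suggests the other state, both assumed here for illustration); the M-H kernel can be written down by hand and $p$ verified to be stationary:

  import numpy as np

  p = np.array([0.25, 0.75])     # target distribution on states {0, 1}
  # Proposal always suggests the other state (symmetric), so:
  # from 0 propose 1: alpha = min(1, p(1)/p(0)) = 1
  # from 1 propose 0: alpha = min(1, p(0)/p(1)) = 1/3, rejection mass 2/3
  P = np.array([[0.0, 1.0],
                [1/3, 2/3]])     # M-H transition matrix P[x, y] = P(y|x)
  assert np.allclose(p @ P, p)   # sum_x p(x)P(y|x) = p(y): p is stationary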

