
Bayesian Inference

The goal of Bayesian inference is to infer the model parameters, denoted by $\theta$, from the observed data, denoted by $D$. Let the prior (a priori) distribution of the parameters be $P(\theta)$ and the marginal distribution of the data be $P(D)$. The joint probability of the data and the parameters is then

\begin{displaymath}P(D,\theta)=P(D\vert\theta) P(\theta) \end{displaymath}

The posterior distribution of the model parameters is obtained from Bayes' theorem:

\begin{displaymath}P(\theta\vert D)=\frac{P(\theta)P(D\vert\theta)}{P(D)}
=\frac{P(\theta)P(D\vert\theta)}{\int P(\theta) P(D\vert\theta) d\theta}
\end{displaymath}

where $P(D\vert\theta)=L(\theta\vert D)$ is the likelihood function of the parameters $\theta$, given the observed data $D$. This equation can be interpreted as

\begin{displaymath}\mbox{posterior}=\frac{\mbox{likelihood} \times \mbox{prior}}{\mbox{normalization factor}} \propto
\mbox{prior} \times \mbox{likelihood} \end{displaymath}
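
As an illustration of posterior = likelihood times prior over a normalization factor, the following minimal sketch evaluates Bayes' theorem numerically on a grid. The coin-flip model, the uniform prior, the data (7 heads in 10 flips), and the name `grid_posterior` are all assumptions made for this example; none of them come from the text above.

```python
from math import comb

def grid_posterior(heads, flips, num_points=1001):
    """Approximate P(theta | D) on a uniform grid over [0, 1].

    Hypothetical example model: theta is the probability of heads,
    D is the number of heads observed in a fixed number of flips.
    """
    step = 1.0 / (num_points - 1)
    thetas = [i * step for i in range(num_points)]
    # Prior P(theta): uniform density on [0, 1] (an assumption of this sketch).
    prior = [1.0 for _ in thetas]
    # Likelihood P(D | theta): binomial probability of the observed count.
    likelihood = [
        comb(flips, heads) * t**heads * (1.0 - t) ** (flips - heads)
        for t in thetas
    ]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    # Normalization factor P(D): Riemann-sum approximation of the integral
    # of P(theta) P(D | theta) over theta.
    evidence = sum(unnormalized) * step
    posterior = [u / evidence for u in unnormalized]
    return thetas, posterior, evidence

thetas, posterior, evidence = grid_posterior(heads=7, flips=10)
mode = thetas[posterior.index(max(posterior))]
print(f"P(D) ~ {evidence:.4f}, posterior mode ~ {mode:.3f}")
```

A grid like this is only practical when $\theta$ has one or two components; the number of grid points grows exponentially with the dimension of $\theta$.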

In a maximum-likelihood problem, the goal is to find $\theta$ that maximizes the likelihood $L$:

\begin{displaymath}\theta^*=\arg\max_{\theta}\;L(\theta\vert D) \end{displaymath}

which, when $L$ is differentiable and the maximum lies in the interior of the parameter space, can be obtained by solving the likelihood equation:

\begin{displaymath}\frac{\partial}{\partial \theta} L(\theta\vert D)
=\frac{\partial}{\partial \theta} P(D\vert\theta)=0 \end{displaymath}
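
To make the likelihood equation concrete, here is a small sketch for an assumed coin-flip model with $k$ heads in $n$ flips (the model and the function names are illustrative, not taken from the text). Setting the derivative of $\theta^k(1-\theta)^{n-k}$ to zero gives $\theta^*=k/n$; the sketch checks that value against a brute-force search. It maximizes the log-likelihood rather than $L$ itself, which leaves the maximizer unchanged because the logarithm is monotonic.

```python
from math import log

def log_likelihood(theta, heads, flips):
    """Log of theta^heads * (1 - theta)^(flips - heads), constants dropped."""
    return heads * log(theta) + (flips - heads) * log(1.0 - theta)

def mle_closed_form(heads, flips):
    """Root of the likelihood equation for the assumed coin-flip model."""
    return heads / flips

def mle_grid_search(heads, flips, num_points=999):
    """Brute-force argmax over an interior grid (avoids log(0) at 0 and 1)."""
    candidates = [(i + 1) / (num_points + 1) for i in range(num_points)]
    return max(candidates, key=lambda t: log_likelihood(t, heads, flips))

theta_closed = mle_closed_form(heads=7, flips=10)
theta_grid = mle_grid_search(heads=7, flips=10)
print(theta_closed, theta_grid)
```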

Bayesian inference can be used to evaluate any feature of the posterior distribution that can be written as a function $f(\theta)$ of the parameters, whose posterior expectation is

\begin{displaymath}E[f(\theta)\vert D]=\int f(\theta) P(\theta\vert D) d\theta
=\frac{\int f(\theta) P(\theta) P(D\vert\theta)d\theta}{\int P(\theta) P(D\vert\theta) d\theta} \end{displaymath}

The integrals in this expression are typically high-dimensional, and in most applications analytical evaluation of $E[f(\theta)\vert D]$ is intractable. In such cases, Monte Carlo integration, including Markov chain Monte Carlo (MCMC), can be used.
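
The ratio of integrals for $E[f(\theta)\vert D]$ can be approximated by plain Monte Carlo integration: draw $\theta_i$ from the prior $P(\theta)$ and weight each draw by its likelihood $P(D\vert\theta_i)$. The sketch below does this for an assumed coin-flip model with a uniform prior and 7 heads in 10 flips; the function name `posterior_expectation` and its parameters are hypothetical. This is not MCMC, only the simplest estimator of the same quantity.

```python
import random

def posterior_expectation(f, likelihood, sample_prior, num_samples=100_000, seed=0):
    """Estimate E[f(theta) | D] as sum(f * w) / sum(w), with w = P(D | theta_i).

    theta_i are independent draws from the prior, so the numerator and the
    denominator estimate the two integrals in the ratio (up to a common 1/N).
    """
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(num_samples):
        theta = sample_prior(rng)
        weight = likelihood(theta)
        numerator += f(theta) * weight
        denominator += weight
    return numerator / denominator

heads, flips = 7, 10  # assumed example data
estimate = posterior_expectation(
    f=lambda theta: theta,  # posterior mean of theta
    # The binomial coefficient is omitted: it cancels in the ratio.
    likelihood=lambda theta: theta**heads * (1.0 - theta) ** (flips - heads),
    sample_prior=lambda rng: rng.random(),  # uniform prior on [0, 1)
)
print(f"Monte Carlo posterior mean ~ {estimate:.3f}")
```

For this model the posterior is a Beta(8, 4) distribution with mean $8/12$, which gives a reference value for the estimate. Sampling from the prior becomes inefficient when the posterior is much narrower than the prior, because most draws then receive negligible weight.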


Ruye Wang 2006-10-11