
Appendix A: Kullback-Leibler (KL) divergence

The KL divergence between two discrete distributions $P$ and $Q$ is defined as

\begin{displaymath}
KL(P\vert\vert Q) \stackrel{\triangle}{=} \sum_i P_i \log\frac{P_i}{Q_i}
= -\sum_i P_i \log Q_i - \left(-\sum_i P_i \log P_i\right) = H(P,Q)-H(P)
\end{displaymath}

where $H(P)=-\sum_i P_i \log P_i$ is the entropy of distribution $P$ and $H(P,Q)=-\sum_i P_i \log Q_i$ is the cross-entropy of distributions $P$ and $Q$. Their difference, also called the relative entropy, measures the divergence between the two distributions. According to Gibbs' inequality, $H(P,Q)\ge H(P)$, with equality $H(P,Q)=H(P)$ holding if and only if $Q=P$. Therefore $KL(P\vert\vert Q) \ge 0$, with $KL(P\vert\vert Q)=0$ if and only if $Q=P$.
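To make the definition concrete, the following is a minimal Python sketch (not part of the original notes) that computes $KL(P\vert\vert Q)$ for two discrete distributions and checks Gibbs' inequality numerically; the function name kl_divergence and the example distributions are illustrative assumptions.

\begin{verbatim}
import numpy as np

def kl_divergence(p, q):
    # KL(P||Q) = sum_i P_i log(P_i / Q_i) for discrete distributions.
    # Terms with P_i = 0 contribute 0 (by the convention 0 log 0 = 0);
    # Q_i must be positive wherever P_i > 0 for the sum to be finite.
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

# Hypothetical example distributions over three outcomes:
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
print(kl_divergence(p, q))  # positive, since P != Q
print(kl_divergence(p, p))  # 0.0, since the distributions coincide
\end{verbatim}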


