
Geometric Interpretation of KLT

Assume the random variables in $X=[x_0,\cdots, x_{N-1}]^T$ have a normal joint probability density function:

$\begin{displaymath}p(x_0,\cdots, x_{N-1})=N(X, M_X, \Sigma_X)= \frac{1}{(2\pi)^{N/2} \left\vert \Sigma_X \right\vert^{1/2}} \exp\left[ -\frac{1}{2}(X-M_X)^T\Sigma_X^{-1}(X-M_X)\right] \end{displaymath}$

where $M_X$ and $\Sigma_X$ are the mean vector and covariance matrix of $X$, respectively. When $N=1$, $\Sigma_X$ and $M_X$ become $\sigma_x^2$ and $\mu_x$, respectively, and the density function becomes the single-variable normal distribution.
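As a rough numerical illustration of the density above, the following sketch evaluates $N(X, M_X, \Sigma_X)$ with NumPy and checks that for $N=1$ it agrees with the single-variable normal density with variance $\sigma_x^2$. The function name `normal_pdf` and all numerical values are assumptions chosen for illustration, not part of the original text.

```python
import numpy as np

def normal_pdf(x, mean, cov):
    """Evaluate the N-dimensional normal density N(x, mean, cov)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    n = mean.size
    diff = x - mean
    # Quadratic form (x - mean)^T cov^{-1} (x - mean), without forming the inverse
    quad = diff @ np.linalg.solve(cov, diff)
    norm_const = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm_const

# N = 1: the covariance matrix is the scalar variance sigma^2 (assumed values)
mu, sigma = 1.0, 2.0
x = 0.5
p_multi = normal_pdf(x, mu, sigma ** 2)
p_single = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

# N = 2 with identity covariance: the peak value at the mean is 1 / (2 pi)
p_peak = normal_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

The quadratic form is computed with `np.linalg.solve` rather than an explicit matrix inverse, which is usually the more numerically stable choice.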

The shape of this normal distribution in the N-dimensional space can be found by considering the iso-value hyper-surface determined by the equation

$\begin{displaymath}N(X,M_X,\Sigma_X)=c_0 \end{displaymath}$

where $c_0$ is a positive constant. Equivalently, this equation can be written as

$\begin{displaymath}(X-M_X)^T \Sigma_X^{-1} (X-M_X) = c_1 \end{displaymath}$

where $c_1$ is another constant related to $c_0$ and $\Sigma_X$. In particular, with $N=2$ variables $x_0$ and $x_1$, we have

$\begin{displaymath}\begin{array}{rcl} (X-M_X)^T \Sigma_X^{-1} (X-M_X) & = & [x_0-\mu_{x_0}, x_1-\mu_{x_1}] \left[ \begin{array}{cc} a & b/2 \\ b/2 & c \end{array} \right] \left[ \begin{array}{c} x_0-\mu_{x_0} \\ x_1-\mu_{x_1} \end{array} \right] \\ & = & a(x_0-\mu_{x_0})^2+b(x_0-\mu_{x_0})(x_1-\mu_{x_1})+c(x_1-\mu_{x_1})^2 = c_1 \end{array}\end{displaymath}$

Here we have assumed

$\begin{displaymath} \Sigma_X^{-1}=\left[ \begin{array}{cc} a & b/2 \\ b/2 & c \end{array} \right] \end{displaymath}$

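The above quadratic equation represents an ellipse (rather than any other quadratic curve) centered at $M_X=[\mu_{x_0}, \mu_{x_1}]^T$ , because $\Sigma_X^{-1}$ , as well as $\Sigma_X$ , is positive definite, so that $a>0$ and the determinant is positive (equivalently, the discriminant $b^2-4ac$ of the quadratic is negative):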

$\begin{displaymath}\left\vert \Sigma_X^{-1} \right\vert = ac-b^2/4 > 0 \end{displaymath}$
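The two-variable case can be checked numerically. The sketch below, with assumed example values for the mean, covariance and $c_1$, reads $a$, $b$, $c$ off $\Sigma_X^{-1}$, tests the ellipse condition, and verifies that points generated on the iso-value curve satisfy the quadratic equation. The points are generated via a Cholesky factor of $\Sigma_X$, which is one convenient parameterization and not something prescribed by the text.

```python
import numpy as np

# Hypothetical 2-D example values (not from the text)
mean = np.array([1.0, -2.0])
cov = np.array([[4.0, 1.5],
                [1.5, 1.0]])
c1 = 2.0

# Read a, b, c off the inverse covariance [[a, b/2], [b/2, c]]
inv_cov = np.linalg.inv(cov)
a, b, c = inv_cov[0, 0], 2.0 * inv_cov[0, 1], inv_cov[1, 1]

# Ellipse condition: determinant a*c - b^2/4 > 0 together with a > 0
det_inv = a * c - b ** 2 / 4.0

# Points on the iso-value curve: mean + sqrt(c1) * L u, with L L^T = cov and |u| = 1
L = np.linalg.cholesky(cov)
theta = np.linspace(0.0, 2.0 * np.pi, 200)
unit_circle = np.stack([np.cos(theta), np.sin(theta)])     # shape (2, 200)
points = mean[:, None] + np.sqrt(c1) * (L @ unit_circle)   # shape (2, 200)

# Evaluate a*dx0^2 + b*dx0*dx1 + c*dx1^2 at every point
dx0 = points[0] - mean[0]
dx1 = points[1] - mean[1]
quad = a * dx0 ** 2 + b * dx0 * dx1 + c * dx1 ** 2
```

Every entry of `quad` should equal `c1` up to rounding, since $(Lu)^T (LL^T)^{-1} (Lu) = u^T u = 1$.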

When $N>2$ , the equation $N(X, M_X, \Sigma_X)=c_0$ represents a hyper-ellipsoid in the N-dimensional space. The center of this ellipsoid is determined by $M_X$ , and its shape (extent and orientation) by $\Sigma_X$ .
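For reference, the relation between the two constants can be worked out explicitly by taking the logarithm of both sides of $N(X, M_X, \Sigma_X)=c_0$. This is a short derivation under the assumption $0 < c_0 \le (2\pi)^{-N/2}\left\vert \Sigma_X \right\vert^{-1/2}$ , so that the iso-value surface is non-empty:

$\begin{displaymath} \frac{1}{(2\pi)^{N/2} \left\vert \Sigma_X \right\vert^{1/2}} \exp\left[ -\frac{1}{2}(X-M_X)^T\Sigma_X^{-1}(X-M_X)\right] = c_0 \;\;\Longrightarrow\;\; (X-M_X)^T\Sigma_X^{-1}(X-M_X) = -2\ln\left[ (2\pi)^{N/2} \left\vert \Sigma_X \right\vert^{1/2} c_0 \right] = c_1 \end{displaymath}$

Under the stated assumption the argument of the logarithm is at most 1, so $c_1 \ge 0$ , with $c_1=0$ corresponding to the single point $X=M_X$ .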

When $X=[x_0,\cdots, x_{N-1}]^T$ is completely decorrelated by KLT:

$\begin{displaymath}Y=\Phi^T X \end{displaymath}$

the covariance matrix becomes diagonalized:

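$\begin{displaymath} \Sigma_Y =\Lambda= \left[ \begin{array}{cccc} \lambda_0 & 0 & \cdots & 0 \\ 0 & \lambda_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{N-1} \end{array} \right] = \left[ \begin{array}{cccc} \sigma^2_{y_0} & 0 & \cdots & 0 \\ 0 & \sigma^2_{y_1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2_{y_{N-1}} \end{array} \right] \end{displaymath}$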

and equation $N(X, M_X, \Sigma_X)=c_0$ becomes $N(Y,M_Y, \Sigma_Y)=c_0$ , or

$\begin{displaymath} (Y-M_Y)^T\Sigma_Y^{-1}(Y-M_Y) =\sum_{i=0}^{N-1} \frac{(y_i-\mu_{y_i})^2}{\lambda_i} =\sum_{i=0}^{N-1} \frac{(y_i-\mu_{y_i})^2}{\sigma^2_{y_i}} =c_1 \end{displaymath}$

This equation represents a standard (axis-aligned) ellipsoid in the N-dimensional space. In other words, the KLT $Y=\Phi^T X$ rotates the coordinate system so that the ellipsoid associated with the normal distribution of $X$ becomes a standardized ellipsoid associated with the normal distribution of $Y$ , whose axes are parallel to $\phi_i$ ( $i=0, \cdots, N-1$ ), the axes of the new coordinate system, with the corresponding semi-axes of length $\sqrt{c_1\lambda_i}=\sqrt{c_1}\,\sigma_{y_i}$ , i.e., proportional to $\sqrt{\lambda_i}=\sigma_{y_i}$ .
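The rotation can be illustrated with a small NumPy sketch. It assumes a hypothetical 3-dimensional mean and covariance, takes $\Phi$ to be the eigenvector matrix of $\Sigma_X$ (computed with `np.linalg.eigh`), and checks that $\Phi^T\Sigma_X\Phi$ is diagonal and that the quadratic form is the same in both coordinate systems. The semi-axis lengths are computed as $\sqrt{c_1\lambda_i}$ , which for $c_1=1$ reduce to $\sqrt{\lambda_i}$ .

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mean and covariance for a 3-D example (assumed values)
mean_x = np.array([1.0, 0.0, -1.0])
cov_x = np.array([[4.0, 1.2, 0.5],
                  [1.2, 2.0, 0.3],
                  [0.5, 0.3, 1.0]])

# Eigen-decomposition cov_x = Phi Lambda Phi^T; the columns of phi are the phi_i
eigvals, phi = np.linalg.eigh(cov_x)
order = np.argsort(eigvals)[::-1]          # sort so lambda_0 >= lambda_1 >= ...
eigvals, phi = eigvals[order], phi[:, order]

# KLT of the covariance and the mean: Sigma_Y = Phi^T Sigma_X Phi, M_Y = Phi^T M_X
cov_y = phi.T @ cov_x @ phi
mean_y = phi.T @ mean_x

# The quadratic form takes the same value for X and for Y = Phi^T X
x = rng.normal(size=3)
y = phi.T @ x
quad_x = (x - mean_x) @ np.linalg.solve(cov_x, x - mean_x)
quad_y = np.sum((y - mean_y) ** 2 / eigvals)

# Semi-axis lengths of the iso-value ellipsoid for a chosen constant c1
c1 = 1.0
semi_axes = np.sqrt(c1 * eigvals)
```

Because `phi` is orthogonal, the transform changes only the orientation of the coordinate axes, not distances, which is why the ellipsoid keeps its shape and merely becomes axis-aligned.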

The standardization of the ellipsoid is the essential reason why KLT has the two desirable properties: (a) decorrelation and (b) compaction of energy, as illustrated in the figure:
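The two properties can also be checked numerically on sampled data. The following is a minimal sketch assuming hypothetical correlated 2-D normal data; the covariance values, sample size and variable names are illustrative choices, and the eigenvectors are estimated from the sample covariance rather than from a known $\Sigma_X$ .

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated 2-D data (assumed covariance, not from the text)
cov_x = np.array([[3.0, 2.0],
                  [2.0, 2.0]])
samples_x = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_x, size=5000)  # (5000, 2)

# Estimate the covariance and its eigenvectors from the data
sample_cov_x = np.cov(samples_x, rowvar=False)
eigvals, phi = np.linalg.eigh(sample_cov_x)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, phi = eigvals[order], phi[:, order]

# KLT applied to every sample: each row becomes y = Phi^T x
samples_y = samples_x @ phi
sample_cov_y = np.cov(samples_y, rowvar=False)

# (a) decorrelation: the off-diagonal covariance of Y vanishes
off_diag = sample_cov_y[0, 1]

# (b) energy compaction: total variance is preserved but concentrated in y_0
energy_x = np.diag(sample_cov_x)
energy_y = np.diag(sample_cov_y)
fraction_x0 = energy_x.max() / energy_x.sum()
fraction_y0 = energy_y[0] / energy_y.sum()
```

Here `fraction_y0` is at least as large as `fraction_x0`, because the largest eigenvalue of a covariance matrix is never smaller than any of its diagonal entries, while the total variance (the trace) is unchanged by the rotation.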


Ruye Wang 2004-09-29