
KLT of images

Each of a set of $N$ images, with $r$ rows, $c$ columns, and $K=r\times c$ pixels, can be represented as a K-D vector by concatenating its $c$ columns (or its $r$ rows). The $N$ images can then be represented by a $K\times N$ array ${\bf X}$, with each column holding one of the $N$ images:

\begin{displaymath}
{\bf X}_{K\times N}=[{\bf x}_{c_1},\cdots,{\bf x}_{c_N}],
\;\;\;\;\;\;
({\bf X}^T)_{N\times K}=[{\bf x}_{r_1},\cdots,{\bf x}_{r_K}]
\end{displaymath}

where ${\bf x}_{c_i}$ is a K-D column vector containing the $K$ pixels of the $i$th image, and ${\bf x}_{r_j}^T$ (the $j$th row of ${\bf X}$) is an N-D row vector containing the pixels at the same position in all $N$ images. If ${\bf X}$ is not degenerate, its rank is

\begin{displaymath}
R=\Rank({\bf X})=\min(N,K)
\end{displaymath}
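
As a concrete illustration, here is a minimal NumPy sketch of building the data array ${\bf X}$ from a set of images (the image data and variable names are placeholders, not from the text):

\begin{verbatim}
import numpy as np

# N images of r rows and c columns (placeholder data; in practice these
# would be loaded from files)
r, c, N = 4, 3, 10
images = [np.random.rand(r, c) for _ in range(N)]

K = r * c
# Concatenate the c columns of each image into a K-D vector (column-major
# flattening), then place the N vectors side by side as the columns of X
X = np.column_stack([img.flatten(order='F') for img in images])   # K x N

print(X.shape)                     # (K, N)
print(np.linalg.matrix_rank(X))    # min(N, K) if X is not degenerate
\end{verbatim}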

A KLT can be applied to either the column or row vectors of the data array ${\bf X}$, depending on whether the column or row vectors are treated as the realizations (samples) of a random vector.

If the $N$ column vectors ${\bf x}_{c_i}$ are treated as the samples, the relevant covariance matrix is the $K\times K$ matrix ${\bf\Sigma}_c={\bf XX}^T$; if the $K$ row vectors ${\bf x}_{r_j}^T$ are treated as the samples, it is the $N\times N$ matrix ${\bf\Sigma}_r={\bf X}^T{\bf X}$. The eigenvalue problems associated with these two covariance matrices are equivalent, in the sense that they have the same nonzero eigenvalues, and their eigenvectors are related (up to normalization) by ${\bf V}={\bf X}{\bf U}$ or ${\bf U}={\bf X}^T{\bf V}$, where ${\bf V}$ and ${\bf U}$ are the eigenvector matrices of ${\bf\Sigma}_c$ and ${\bf\Sigma}_r$, respectively. We can therefore solve the eigenvalue problem of whichever of the two covariance matrices has the lower dimension.
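
This equivalence can be checked with a short NumPy sketch (random placeholder data; the names Sigma_c and Sigma_r follow the notation above):

\begin{verbatim}
import numpy as np

K, N = 6, 3
X = np.random.rand(K, N)

Sigma_c = X @ X.T            # K x K
Sigma_r = X.T @ X            # N x N

lc, V = np.linalg.eigh(Sigma_c)    # eigenvalues in ascending order
lr, U = np.linalg.eigh(Sigma_r)

# The N nonzero eigenvalues coincide (Sigma_c has K-N additional zeros)
print(np.allclose(lc[-N:], lr))

# X U gives (unnormalized) eigenvectors of Sigma_c, column by column:
# Sigma_c (X U) = X (X^T X U) = (X U) Lambda
print(np.allclose(Sigma_c @ (X @ U), (X @ U) * lr))
\end{verbatim}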

Owing to the nature of the KLT, most of the energy/information contained in the $N$ images, i.e., the variation among them, is concentrated in the first few eigen-images corresponding to the largest eigenvalues, while the remaining eigen-images can be omitted without losing much energy/information. This is the foundation of various KLT-based image compression and feature extraction algorithms. Subsequent operations such as image recognition and classification can then be carried out in a much lower-dimensional space after the KLT.
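
A minimal NumPy sketch of this energy compaction, keeping only the $m$ eigen-images with the largest eigenvalues (random placeholder data; following the convention above, no mean image is subtracted):

\begin{verbatim}
import numpy as np

K, N, m = 256, 40, 5          # K pixels per image, N images, m components kept
X = np.random.rand(K, N)      # placeholder data array, one image per column

lam, V = np.linalg.eigh(X @ X.T)    # eigenvalues in ascending order
Vm = V[:, -m:]                      # m eigen-images with the largest eigenvalues

Z = Vm.T @ X                  # m x N: compact representation of the N images
X_hat = Vm @ Z                # reconstruction from only m components

# Fraction of the total energy retained; the squared reconstruction error
# ||X - X_hat||_F^2 equals the sum of the discarded eigenvalues
print(lam[-m:].sum() / lam.sum())
print(np.linalg.norm(X - X_hat)**2, lam[:-m].sum())
\end{verbatim}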

In image recognition, the goal is typically to classify or recognize objects of interest, such as hand-written alphanumeric characters or human faces, represented in an image of $K$ pixels. As not all $K$ pixels are needed to represent the image object, the KLT can be carried out to compact most of the information into a small number of components:

\begin{displaymath}
{\bf Z}={\bf V}^T{\bf X}
\end{displaymath}

where the transform matrix ${\bf V}$ is the eigenvector matrix of the covariance matrix ${\bf\Sigma}_c={\bf XX}^T$:

\begin{displaymath}
{\bf XX}^T{\bf V}={\bf V}{\bf\Lambda}
\end{displaymath}
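
A minimal sketch of this transform and its inverse (random placeholder data; with the full eigenvector matrix the KLT is simply an orthogonal change of basis):

\begin{verbatim}
import numpy as np

K, N = 8, 5
X = np.random.rand(K, N)

_, V = np.linalg.eigh(X @ X.T)   # V is K x K and orthogonal: V^T V = I
Z = V.T @ X                      # forward KLT of the N images

print(np.allclose(V @ Z, X))     # inverse transform recovers X exactly
\end{verbatim}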

When $K\ll N$, the KLT matrix ${\bf V}$ can be obtained by directly solving this eigenvalue problem of the $K\times K$ covariance matrix. However, when $K\gg N$, the complexity of this eigenvalue problem may be too high. In this case we can obtain the KLT matrix by ${\bf V}={\bf XU}$ as shown above, where ${\bf U}$ is the $N \times N$ eigenvector matrix of ${\bf\Sigma}_r={\bf X}^T{\bf X}$:

\begin{displaymath}
{\bf X}^T{\bf XU}={\bf U}{\bf\Lambda}
\end{displaymath}

This eigenvalue problem can be more easily solved if $N\ll K$. Now the desired KLT can be carried out as

\begin{displaymath}
{\bf Z}={\bf V}^T{\bf X}=({\bf XU})^T{\bf X}={\bf U}^T({\bf X}^T{\bf X})
=[({\bf X}^T{\bf X}){\bf U}]^T=({\bf U}{\bf\Lambda})^T
={\bf\Lambda}{\bf U}^T
\end{displaymath}
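
A sketch of this indirect computation, under the assumption $K\gg N$ (random placeholder data; note that the columns of ${\bf V}={\bf XU}$ as defined here are not normalized to unit length):

\begin{verbatim}
import numpy as np

K, N = 10000, 20              # many more pixels than images
X = np.random.rand(K, N)

# Solve the small N x N eigenvalue problem of X^T X instead of the K x K one
lam, U = np.linalg.eigh(X.T @ X)    # X^T X U = U Lambda

V = X @ U                     # K x N, columns are (unnormalized) eigenvectors of X X^T
Z = np.diag(lam) @ U.T        # the transform Lambda U^T, no K x K work needed

print(np.allclose(Z, V.T @ X))      # same result as computing V^T X directly
# Unit-length eigen-images, if needed, are obtained as V / np.sqrt(lam)
\end{verbatim}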

Example

Consider $N=2$ images of $K=4$ pixels each, represented by the two columns of the data array


\begin{displaymath}
{\bf X}=\left[\begin{array}{cc}4&8\\ 6&8\\ 1&3\\ 9&6\end{array}\right]
\end{displaymath}


The two nonzero eigenvalues of ${\bf\Sigma}_c={\bf XX}^T$ (equal to the eigenvalues of ${\bf\Sigma}_r={\bf X}^T{\bf X}$) and the corresponding unit eigenvectors are

\begin{displaymath}
{\bf\Lambda}=\left[\begin{array}{cc}15.12&0\\ 0&291.88\end{array}\right],
\;\;\;\;\;\;
{\bf V}=\left[\begin{array}{rr}
0.5715 & 0.5071\\
0.1830 & 0.5838\\
0.3114 & 0.1710\\
-0.7369 & 0.6105\end{array}\right]
\end{displaymath}

The remaining two eigenvalues of the $4\times 4$ matrix ${\bf XX}^T$ are zero, as the rank of ${\bf X}$ is $R=\min(N,K)=2$.


Using the full $4\times 4$ eigenvector matrix of ${\bf XX}^T$ (whose first two columns, associated with the zero eigenvalues, span the null space of ${\bf X}^T$), the transform ${\bf Y}={\bf V}^T{\bf X}$ of the two images is

\begin{displaymath}
{\bf Y}=\left[\begin{array}{rr}
0.0000 & 0.0000\\
0.0000 & 0.0000\\
-2.9368 & 2.5484\\
11.1971 & 12.9037\end{array}\right]
\end{displaymath}

i.e., all of the energy of the two images is carried by the last two components, those associated with the nonzero eigenvalues.
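
The numbers in this example can be reproduced with a few lines of NumPy (a sketch; the signs of individual eigenvectors, and hence of the rows of ${\bf Y}$, are arbitrary and may differ):

\begin{verbatim}
import numpy as np

X = np.array([[4., 8.],
              [6., 8.],
              [1., 3.],
              [9., 6.]])            # K=4 pixels, N=2 images

lam, V = np.linalg.eigh(X @ X.T)    # eigenvalues approx. 0, 0, 15.12, 291.88
Y = V.T @ X                         # only the last two rows are nonzero

print(np.round(lam, 2))
print(np.round(Y, 4))
\end{verbatim}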


Ruye Wang 2016-04-06