The KLT can be applied to a set of images for various purposes, such as data compression and feature extraction. Consider a set of $N$ images, each containing $M$ pixels, so that the image data can be represented as an $M \times N$ 2-D array $X$, whose $n$-th column is the $n$-th image written as an $M$-dimensional column vector (with the mean removed). There are two alternative ways to carry out the KLT, depending on how the random vector is defined based on the image data: either each image is treated as a sample of an $M$-dimensional random vector, or the $N$ values at each pixel position are treated as a sample of an $N$-dimensional random vector. The corresponding covariance matrices are
$$\Sigma_M = \frac{1}{N} X X^T \qquad\qquad (86)$$

$$\Sigma_N = \frac{1}{M} X^T X \qquad\qquad (87)$$
As shown previously, these two covariance matrices share the same non-zero eigenvalues (up to the constant factors). The eigenequations for $\Sigma_M$ and $\Sigma_N$, with the constant coefficients $1/N$ and $1/M$ neglected, are:

$$X X^T \phi_i = \lambda_i \phi_i \qquad\qquad (88)$$

$$X^T X \psi_i = \lambda_i \psi_i \qquad\qquad (89)$$
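As a quick numerical check of Eqs. (86)-(89), the following sketch (not from the original text; the sizes and variable names are illustrative) builds a random data matrix and verifies that $X X^T$ and $X^T X$ share their non-zero eigenvalues.

```python
import numpy as np

N, M = 8, 100                              # N images, M pixels each (toy sizes)
rng = np.random.default_rng(0)
X = rng.standard_normal((M, N))            # data matrix, one image per column
X -= X.mean(axis=1, keepdims=True)         # remove the mean image

lam_M = np.linalg.eigvalsh(X @ X.T)        # eigenvalues of the M x M matrix, Eq. (88)
lam_N = np.linalg.eigvalsh(X.T @ X)        # eigenvalues of the N x N matrix, Eq. (89)

# The N largest eigenvalues of X X^T equal those of X^T X; the remaining ones are zero.
print(np.allclose(np.sort(lam_M)[-N:], np.sort(lam_N)))   # True
```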
We can now apply the KLT to each of the $M$ different $N$-dimensional vectors, one for each pixel position across the $N$ images, to obtain another $N$-dimensional vector for the same pixel in a set of $N$ eigen-images, as shown below. After the KLT, most of the energy/information contained in the $N$ images, representing the variations among all $N$ images, is concentrated in the first few eigen-images corresponding to the greatest eigenvalues, while the remaining eigen-images can be omitted without losing much energy/information. This is the foundation for various KLT-based image compression and feature extraction algorithms: subsequent operations such as image recognition and classification can all be carried out in a much lower-dimensional space, as illustrated in the sketch below.
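The following sketch (array names and shapes are assumptions, not from the original) shows one way to compute the eigen-images described above: the $N$-dimensional vector at each pixel is transformed by the eigenvector matrix of its covariance, and the resulting components are reshaped back into $N$ eigen-images ordered by decreasing eigenvalue.

```python
import numpy as np

def klt_eigen_images(images):
    """images: array of shape (N, H, W), N images of H*W pixels.
    Returns (eigen_images, eigenvalues), ordered by decreasing eigenvalue."""
    N, H, W = images.shape
    X = images.reshape(N, -1).astype(float)      # N x M, one row per image (M = H*W)
    X -= X.mean(axis=1, keepdims=True)           # center the per-pixel N-vectors
    C = X @ X.T / X.shape[1]                     # N x N covariance of per-pixel vectors (cf. Eq. (87))
    lam, V = np.linalg.eigh(C)                   # ascending eigenvalues
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    Y = V.T @ X                                  # KLT of every per-pixel vector
    return Y.reshape(N, H, W), lam

# Toy usage: the cumulative energy shows how much is kept by the first few eigen-images.
frames = np.random.default_rng(1).random((8, 32, 32))
eig_imgs, lam = klt_eigen_images(frames)
print(np.cumsum(lam) / lam.sum())
```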
We now consider some such applications.
In remote sensing, images of the surface of the Earth or of other planets such as Mars are taken by a multispectral camera system on board a satellite, for various studies (e.g., geology, geography, etc.). The camera system has an array of sensors, typically a few tens or even over a hundred, each sensitive to a different wavelength band in the visible and infrared range of the electromagnetic spectrum. Depending on the number of sensors, the data are referred to as either multispectral or hyperspectral images.
These sensors produce a set of $N$ images covering the same surface area on the ground. For the same position in these images, there are $N$ pixel values, one from each wavelength band, which together represent the spectral profile characterizing the material of the surface area corresponding to that pixel. A typical application of multispectral or hyperspectral image data is to classify the pixels into different types of materials (different types of rock, vegetation, pollution, etc.). When $N$ is large, the KLT can be used to reduce the dimensionality without losing much information.
Specifically, we consider the $N$ values associated with each pixel as forming an $N$-dimensional random vector, and then carry out the KLT to reduce its dimensionality. All classification can then be carried out in this lower-dimensional space, thereby significantly reducing the computational complexity, as in the sketch below.
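A hedged sketch of this dimensionality reduction (the band count, array names, and the 95% energy threshold are illustrative choices, not values from the text): the spectral vector of every pixel is projected onto the leading eigenvectors of the band covariance matrix, and classification can then operate on the reduced features.

```python
import numpy as np

def reduce_bands(cube, energy=0.95):
    """cube: (N, H, W) array, N spectral bands. Keep the fewest KLT components
    whose eigenvalues retain at least `energy` of the total."""
    N, H, W = cube.shape
    X = cube.reshape(N, -1).astype(float)             # N x (H*W) spectral vectors
    Xc = X - X.mean(axis=1, keepdims=True)
    lam, V = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])  # N x N band covariance
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    d = int(np.searchsorted(np.cumsum(lam) / lam.sum(), energy)) + 1
    Y = V[:, :d].T @ Xc                               # d x (H*W) reduced features
    return Y.reshape(d, H, W), d

cube = np.random.default_rng(2).random((64, 16, 16))  # toy 64-band "image"
features, d = reduce_bands(cube)
print(d, features.shape)                              # classify pixels using d values each
```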
A sequence of eight frames from a video of a moving escalator and their eigen-images are shown in the upper and lower parts of the figure below, respectively.
It is interesting to observe that the first eigen-image, corresponding to the greatest eigenvalue (left panel of the third row of the figure), represents mostly the static scene common to all image frames and carries most of the energy, while the subsequent eigen-images represent mostly the motion in the video, i.e., the variation between the frames. For example, the motion of the people riding on the escalator is mostly reflected in the first few eigen-images following the first one, while the motion of the escalator stairs is mostly reflected in the subsequent eigen-images.
The covariance matrix and the energy distribution among the eight components, before and after the KLT, are plotted below.
We see that, due to the strong correlation between the values of the same pixel in neighboring frames, the covariance matrix before the KLT (left) can be modeled by a squared-exponential function, while the covariance matrix after the KLT (middle) is completely decorrelated and the energy is highly compacted into a small number of principal components (here the first component), as also clearly shown in the comparison of the energy distributions before and after the KLT (right).
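The decorrelation claim can be checked numerically. The sketch below (toy correlated data standing in for the eight frame values per pixel; everything here is an assumption for illustration) confirms that the covariance of the KLT components is diagonal, with the eigenvalues on its diagonal.

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy stand-in: 8 correlated values per sample (like 8 frame values per pixel)
L = np.tril(rng.standard_normal((8, 8))) + 2 * np.eye(8)
X = L @ rng.standard_normal((8, 2000))            # 8 x 2000, rows are correlated
Xc = X - X.mean(axis=1, keepdims=True)
C_before = Xc @ Xc.T / Xc.shape[1]                # correlated: large off-diagonal terms

lam, V = np.linalg.eigh(C_before)
Y = V.T @ Xc                                      # KLT of every sample
C_after = Y @ Y.T / Y.shape[1]                    # diagonal: components are decorrelated

print(np.allclose(C_after, np.diag(lam)))         # True
print(np.sort(lam)[::-1] / lam.sum())             # energy fraction per component
```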
Twenty images of faces:
The eigen-images after KLT:
Percentage of energy contained in each of the 20 components:
| Component | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
| Percentage energy (%) | 48.5 | 11.6 | 6.1 | 4.6 | 3.8 | 3.7 | 2.6 | 2.5 | 1.9 | 1.9 | 1.8 | 1.6 | 1.5 | 1.4 | 1.3 | 1.2 | 1.1 | 1.1 | 0.9 | 0.8 |
| Cumulative energy (%) | 48.5 | 60.1 | 66.2 | 70.8 | 74.6 | 78.3 | 81.0 | 83.5 | 85.4 | 87.3 | 89.1 | 90.7 | 92.2 | 93.6 | 94.9 | 96.1 | 97.2 | 98.2 | 99.2 | 100.0 |
Reconstructed faces using 95% of the total information (15 out of 20 components):
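A sketch of this kind of reconstruction (array names, face sizes, and the random stand-in data are assumptions): keep only the $m$ leading KLT components and invert the transform to approximate the original images.

```python
import numpy as np

def klt_reconstruct(images, m):
    """images: (N, H, W), N images. Rebuild them from the first m of N KLT components."""
    N, H, W = images.shape
    X = images.reshape(N, -1).astype(float)          # N x M, one row per image
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    lam, V = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    V = V[:, np.argsort(lam)[::-1]]                  # eigenvectors, descending eigenvalues
    Y = V.T @ Xc                                     # all N KLT components
    X_hat = V[:, :m] @ Y[:m] + mean                  # keep only the first m components
    return X_hat.reshape(N, H, W)

faces = np.random.default_rng(4).random((20, 64, 64))  # stand-in for the 20 face images
approx = klt_reconstruct(faces, m=15)                   # 15 of 20 components, as above
```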
The goal here is to recognize handwritten digits from 0 to 9 in an image, such as those shown in the figure below, which contains a set of samples of each of the ten digits. Each sample image can be represented by a $D$-dimensional vector formed by concatenating all columns (or rows) of the image. The KLT can then be carried out to significantly reduce the dimensionality of these vectors from $D$ to some much smaller $d'$, based on either the covariance matrix of all sample vectors, representing the overall distribution of these data points, or the between-class scatter matrix considered previously, representing the separability of the ten classes.
Specifically, we use the $d'$ eigenvectors corresponding to the greatest eigenvalues of the covariance matrix or of the between-class scatter matrix to form a $D \times d'$ transform matrix. After the data are transformed by the KLT, any classification algorithm can be carried out in the much reduced $d'$-dimensional space, as in the sketch below.
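A minimal sketch of this step (variable names, the digit-image size, and the random data are all hypothetical): build the $D \times d'$ transform matrix from the leading eigenvectors of either the overall covariance matrix or the between-class scatter matrix, and project the sample vectors into the $d'$-dimensional space.

```python
import numpy as np

def klt_transform(samples, labels, d, use_between_class=False):
    """samples: (n, D) rows of concatenated pixels; labels: (n,) class ids.
    Returns the samples projected into a d-dimensional space."""
    n, D = samples.shape
    mu = samples.mean(axis=0)
    if use_between_class:
        # Between-class scatter: class-size-weighted outer products of mean deviations
        S = np.zeros((D, D))
        for c in np.unique(labels):
            mc = samples[labels == c].mean(axis=0)
            S += np.sum(labels == c) * np.outer(mc - mu, mc - mu)
    else:
        Xc = samples - mu
        S = Xc.T @ Xc / n                            # overall covariance matrix
    lam, V = np.linalg.eigh(S)
    A = V[:, np.argsort(lam)[::-1][:d]]              # D x d transform matrix
    return (samples - mu) @ A                        # n x d reduced features

# Toy usage with made-up 16x16 digit images flattened to D = 256
rng = np.random.default_rng(5)
X, y = rng.random((500, 256)), rng.integers(0, 10, 500)
Z = klt_transform(X, y, d=9, use_between_class=True) # rank of the scatter matrix is at most 9
print(Z.shape)                                       # (500, 9)
```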
The energy distribution over all signal components is plotted below for the original signal (top), after the KLT based on the covariance matrix (middle), and after the KLT based on the between-class scatter matrix (bottom).
For the KLT based on the covariance matrix, a larger number of components is needed to keep 95.1% of the total energy, whereas the KLT based on the between-class scatter matrix, whose rank is 9 for the ten classes, requires only nine principal components, corresponding to its nine non-zero eigenvalues, to keep 100% of the total energy representing the separability information.
The corresponding $D$-dimensional eigenvectors can be visualized as eigenimages when converted back into 2-D arrays; they form a basis in terms of which the original images can be represented as linear combinations, as shown in the figure below. The tenth eigenimage, corresponding to a zero eigenvalue, contains only random noise.
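To display eigenvectors as eigenimages, each $D$-dimensional column is simply reshaped back into a 2-D array, undoing the column concatenation. The sketch below uses random stand-ins and an assumed 16x16 image size purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the D x d' eigenvector matrix from the previous sketch (D = 256, d' = 10)
V = np.random.default_rng(7).standard_normal((256, 10))

fig, axes = plt.subplots(1, 10, figsize=(15, 2))
for i, ax in enumerate(axes):
    ax.imshow(V[:, i].reshape(16, 16), cmap="gray")  # undo the column concatenation
    ax.set_axis_off()
    ax.set_title(f"eig {i + 1}")
plt.show()
```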
If we keep only the first two or three principal components (corresponding to the greatest eigenvalues) after the KLT, the dataset can be visualized as shown in the figure below, where the sample points in each of the ten classes are color-coded. It can be seen that, even when the dimensionality is drastically reduced from $D$ to three or even two, it is still possible to separate the ten classes reasonably well.
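A brief sketch of this visualization (random stand-in data; plotting details are assumptions): project every sample onto the first two principal components and scatter-plot them, color-coded by class.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(6)
X = rng.random((500, 256))                      # stand-in for the flattened digit images
y = rng.integers(0, 10, 500)                    # stand-in class labels

Xc = X - X.mean(axis=0)
lam, V = np.linalg.eigh(Xc.T @ Xc / len(Xc))    # covariance of the samples
Z = Xc @ V[:, np.argsort(lam)[::-1][:2]]        # first two principal components

plt.scatter(Z[:, 0], Z[:, 1], c=y, cmap="tab10", s=8)
plt.xlabel("1st principal component")
plt.ylabel("2nd principal component")
plt.colorbar(label="digit class")
plt.show()
```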