Due to the KLT's properties of signal decorrelation and energy compaction, it can be used for data compression by reducing the dimensionality of the dataset. Specifically, we carry out the following steps (a numerical sketch of the whole procedure is given after the list):

- Find the mean vector ${\bf m}_x$ and the covariance matrix ${\bf\Sigma}_x$ of the random vectors ${\bf x}$ based on the given dataset ${\bf X}=[{\bf x}_1,\cdots,{\bf x}_N]$ containing $N$ samples of ${\bf x}$.
- Find the eigenvalues $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_d$ of ${\bf\Sigma}_x$, sorted in descending order, and their corresponding eigenvectors ${\bf v}_1,\cdots,{\bf v}_d$.
- Determine the reduced dimensionality $d'<d$ so that the percentage of energy contained after the transform is no less than a given threshold (e.g., 95%):

  $$\frac{\sum_{i=1}^{d'}\lambda_i}{\sum_{i=1}^{d}\lambda_i}\ge 95\% \tag{77}$$
- Construct a $d\times d'$ transform matrix composed of the $d'$ eigenvectors corresponding to the $d'$ greatest eigenvalues of ${\bf\Sigma}_x$:

  $$ {\bf V}'=[ {\bf v}_1,\cdots,{\bf v}_{d'} ]_{d\times d'} \tag{78}$$

  and carry out the KLT based on ${\bf V}'$:

  $$ {\bf y}=\left[\begin{array}{c} y_1\\ \vdots\\ y_{d'}\end{array}\right]_{d'\times 1}
  ={\bf V}'^T{\bf x}
  =\left[\begin{array}{c} {\bf v}_1^T\\ \vdots\\ {\bf v}_{d'}^T\end{array}\right]_{d'\times d}
  \left[\begin{array}{c} x_1\\ \vdots\\ x_d\end{array}\right]_{d\times 1} \tag{79}$$
This is a lossy compression by which the dimensionality is reduced from $d$ to $d'$, with the following error representing the percentage of information lost:

$$\frac{\sum_{i=d'+1}^{d}\lambda_i}{\sum_{i=1}^{d}\lambda_i} \tag{80}$$

But as $\lambda_{d'+1},\cdots,\lambda_d$ are the $d-d'$ smallest eigenvalues, this error is minimal (e.g., no more than 5% for a 95% threshold).
- Reconstruct ${\bf x}$ by the inverse KLT:

  $$ {\bf x}=\left[\begin{array}{c} x_1\\ \vdots\\ x_d\end{array}\right]_{d\times 1}
  \approx {\bf V}'{\bf y}
  =[ {\bf v}_1,\cdots,{\bf v}_{d'} ]_{d\times d'}
  \left[\begin{array}{c} y_1\\ \vdots\\ y_{d'}\end{array}\right]_{d'\times 1}
  =\sum_{i=1}^{d'} y_i{\bf v}_i \tag{81}$$
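The whole procedure above can be illustrated by a short Python/NumPy sketch. This is only an illustrative example under the definitions above, not part of the original text; the function and variable names (e.g., `klt_compress`, `energy_threshold`) are made up here, and the data are centered by the sample mean before the transform:

```python
import numpy as np

def klt_compress(X, energy_threshold=0.95):
    """KLT-based compression of a d x N data matrix X (one sample per column),
    following Eqs. (77)-(81). Returns the transformed data, the reconstruction,
    the reduced dimensionality d', and the fraction of energy lost."""
    d, N = X.shape
    m = X.mean(axis=1, keepdims=True)        # mean vector
    Xc = X - m                               # centered data
    Sigma = Xc @ Xc.T / N                    # d x d covariance matrix
    lam, V = np.linalg.eigh(Sigma)           # eigenvalues in ascending order
    lam, V = lam[::-1], V[:, ::-1]           # re-sort in descending order

    energy = np.cumsum(lam) / np.sum(lam)    # energy retained for each d' (Eq. 77)
    d_prime = int(np.searchsorted(energy, energy_threshold)) + 1

    Vp = V[:, :d_prime]                      # d x d' transform matrix (Eq. 78)
    Y = Vp.T @ Xc                            # forward KLT, d' x N (Eq. 79)
    X_hat = Vp @ Y + m                       # reconstruction by inverse KLT (Eq. 81)
    error = 1.0 - energy[d_prime - 1]        # fraction of energy lost (Eq. 80)
    return Y, X_hat, d_prime, error

# Example: 10-dimensional data whose last 5 components are strongly
# correlated with the first 5, so a small d' retains most of the energy.
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 500))
A[5:] = 0.95 * A[:5] + 0.05 * rng.standard_normal((5, 500))
Y, A_hat, d_prime, error = klt_compress(A)
print(d_prime, error)
```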
The rank of the estimated covariance matrix ${\bf\Sigma}_x$ in Eq. (40) based on the dataset ${\bf X}$ is no greater than $N$. When $N<d$, the rank of ${\bf\Sigma}_x$ is at most $N$, i.e., there are no more than $N$ non-zero eigenvalues while all remaining $d-N$ eigenvalues are zero. Also, if the dimensionality $d$ is high, the computational complexity $O(d^3)$ for solving this eigenvalue problem of ${\bf\Sigma}_x$ may be expensive. In this case, we can find the $N$ non-zero eigenvalues and their corresponding eigenvectors of ${\bf\Sigma}_x$ by solving the eigenvalue problem of an $N\times N$ matrix with a much lower complexity of $O(N^3)$.
Specifically, for convenience and without loss of generality, we assume that the data have zero mean ${\bf m}_x={\bf 0}$, so that the covariance matrix of the given dataset ${\bf X}=[{\bf x}_1,\cdots,{\bf x}_N]_{d\times N}$ can be estimated as ${\bf\Sigma}_x={\bf X}{\bf X}^T/N$, and the eigenequation of this $d\times d$ matrix can be written as

$${\bf\Sigma}_x{\bf v}=\frac{1}{N}{\bf X}{\bf X}^T{\bf v}=\lambda{\bf v} \tag{82}$$

Pre-multiplying ${\bf X}^T$ on both sides, we get the eigenequation of an $N\times N$ matrix ${\bf X}^T{\bf X}/N$:

$$\frac{1}{N}{\bf X}^T{\bf X}\,({\bf X}^T{\bf v})=\lambda\,({\bf X}^T{\bf v}),
\;\;\;\;\mbox{i.e.}\;\;\;\;
\frac{1}{N}{\bf X}^T{\bf X}\,{\bf w}=\lambda{\bf w} \tag{83}$$
where ${\bf w}={\bf X}^T{\bf v}$ is the eigenvector of ${\bf X}^T{\bf X}$. We see that ${\bf X}{\bf X}^T$ and ${\bf X}^T{\bf X}$ share the same non-zero eigenvalues. When $N\ll d$, we can solve the eigenvalue problem in Eq. (83) with complexity $O(N^3)$, instead of that of the original problem in Eq. (82) with complexity $O(d^3)$, to get the $N$ non-zero eigenvalues $\lambda_1,\cdots,\lambda_N$ and the corresponding eigenvectors ${\bf w}_1,\cdots,{\bf w}_N$. To find the eigenvectors ${\bf v}$ of the original matrix ${\bf X}{\bf X}^T$, we pre-multiply ${\bf X}$ on both sides of the eigenequation above and get

$$\frac{1}{N}{\bf X}{\bf X}^T({\bf X}{\bf w})=\lambda\,({\bf X}{\bf w}) \tag{84}$$
Comparing this with Eq. (82), we see that ${\bf X}{\bf w}$ is proportional to the original eigenvector ${\bf v}$ corresponding to $\lambda$, and they become the same when normalized.
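This relationship can be checked numerically. The following NumPy sketch (again only an illustrative example with made-up dimensions, assuming zero-mean data as above) solves the small $N\times N$ problem of Eq. (83), recovers the eigenvectors of the $d\times d$ covariance matrix through ${\bf X}{\bf w}$ as in Eq. (84), and compares the result with the direct solution of Eq. (82):

```python
import numpy as np

rng = np.random.default_rng(1)
d, N = 500, 20                          # high dimensionality, few samples (N << d)
X = rng.standard_normal((d, N))         # treated as zero-mean data, as assumed above

# Small N x N problem, Eq. (83): (X^T X / N) w = lambda w
lam_small, W = np.linalg.eigh(X.T @ X / N)
lam_small, W = lam_small[::-1], W[:, ::-1]       # descending order

# Recover eigenvectors of the d x d covariance matrix, Eq. (84): v proportional to X w
V = X @ W
V /= np.linalg.norm(V, axis=0)                   # normalize each column

# Direct (expensive) solution of the d x d problem, Eq. (82), for comparison
lam_big, U = np.linalg.eigh(X @ X.T / N)
lam_big, U = lam_big[::-1], U[:, ::-1]

print(np.allclose(lam_small, lam_big[:N]))       # the N non-zero eigenvalues agree
# The recovered eigenvectors agree with the directly computed ones up to sign.
print(np.allclose(np.abs(np.sum(V * U[:, :N], axis=0)), 1.0))
```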