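Consider a general orthogonal transform pair defined as
$$ Y = A^T X, \qquad X = A Y $$
where $X$ and $Y$ are $N$ by 1 vectors and $A$ is an arbitrary $N$ by $N$ orthogonal
matrix ($A^{-1} = A^T$).
We represent $A$ by its column vectors $A_i$ $(i = 1, \dots, N)$ as
$$ A = [A_1, \dots, A_N] $$
or
$$ A^T = \begin{bmatrix} A_1^T \\ \vdots \\ A_N^T \end{bmatrix} $$
Now the $i$th component of $Y$ can be written as
$$ y_i = A_i^T X $$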
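The transform pair can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using an orthogonal matrix taken from the QR factorization of a random matrix; the names `A`, `x`, `y`, and `x_rec` are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# Assumption: an arbitrary orthogonal matrix, obtained here from the
# QR factorization of a random square matrix.
A, _ = np.linalg.qr(rng.standard_normal((N, N)))

x = rng.standard_normal(N)   # signal vector X
y = A.T @ x                  # forward transform  Y = A^T X
x_rec = A @ y                # inverse transform  X = A Y

# i-th component of Y as the inner product of the i-th column of A with X
y_by_columns = np.array([A[:, i] @ x for i in range(N)])

orthogonal = np.allclose(A.T @ A, np.eye(N))     # A^{-1} = A^T
reconstructs = np.allclose(x_rec, x)             # X is recovered from Y
columns_agree = np.allclose(y_by_columns, y)     # y_i = A_i^T X
print(orthogonal, reconstructs, columns_agree)
```

All three checks should hold up to floating-point rounding for any orthogonal matrix, not just the random one used here.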
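As we assume the mean vector of $X$ is zero, $M_X = E(X) = 0$ (and obviously we also have
$M_Y = E(Y) = A^T M_X = 0$), we have $\Sigma_X = E(XX^T)$, and the variances of the $i$th elements of
$X$ and $Y$ are
$$ \sigma_{x_i}^2 = E(x_i^2) = E(e_{x_i}) $$
and
$$ \sigma_{y_i}^2 = E(y_i^2) = E(e_{y_i}) $$
where $e_{x_i} = x_i^2$ and $e_{y_i} = y_i^2$ represent the energy contained in the
$i$th component of $X$ and $Y$, respectively. In other words, the trace of
$\Sigma_X$ (the sum of all the diagonal elements of the matrix) represents the
expectation of the total amount of energy contained in the signal $X$:
$$ \mathrm{tr}\,\Sigma_X = \sum_{i=1}^{N} \sigma_{x_i}^2 = E\left( \sum_{i=1}^{N} x_i^2 \right) = E\left( \|X\|^2 \right) $$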
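Since an orthogonal transform $A$ does not change the length of a vector $X$,
i.e.,
$$ \|Y\| = \|A^T X\| = \|X\|, $$
where
$$ \|X\| = \sqrt{X^T X} = \sqrt{\sum_{i=1}^{N} x_i^2}, $$
the total energy contained in the signal vector $X$ is conserved after the
orthogonal transform.
(This conclusion can also be obtained from the fact that orthogonal transforms
do not change the trace of a matrix: $\mathrm{tr}\,\Sigma_Y = \mathrm{tr}(A^T \Sigma_X A) = \mathrm{tr}\,\Sigma_X$.)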
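Energy conservation is easy to verify on a small case. The sketch below uses only the Python standard library and a 2 by 2 rotation as the orthogonal matrix; the angle, the test vector, and the covariance matrix `sigma_x` are arbitrary values assumed for this example.

```python
import math

theta = 0.7                      # arbitrary rotation angle (assumed value)
c, s = math.cos(theta), math.sin(theta)
A = [[c, -s],
     [s,  c]]                    # 2x2 rotation, an orthogonal matrix

def transpose(P):
    return [list(row) for row in zip(*P)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

# Y = A^T X for an example signal vector
x = [3.0, -1.5]
y = [A[0][0] * x[0] + A[1][0] * x[1],
     A[0][1] * x[0] + A[1][1] * x[1]]

energy_x = sum(v * v for v in x)     # ||X||^2
energy_y = sum(v * v for v in y)     # ||Y||^2

# Trace of Sigma_Y = A^T Sigma_X A for an example covariance matrix
sigma_x = [[4.0, 1.2],
           [1.2, 1.0]]
sigma_y = matmul(matmul(transpose(A), sigma_x), A)
trace_x = sigma_x[0][0] + sigma_x[1][1]
trace_y = sigma_y[0][0] + sigma_y[1][1]

print(energy_x, energy_y)
print(trace_x, trace_y)
```

The two energies agree, and so do the two traces, although the individual diagonal entries of `sigma_y` differ from those of `sigma_x`: the transform redistributes energy among components without changing the total.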
We next define
$$ S_m(A) = \sum_{i=1}^{m} E(y_i^2) = \sum_{i=1}^{m} \sigma_{y_i}^2 $$
where $m \le N$. $S_m(A)$ is a function of the transform matrix $A$ and
represents the amount of energy contained in the first $m$ components of $Y$.
Since the total energy is conserved, the ratio $S_m(A) / \mathrm{tr}\,\Sigma_X$ is the fraction
of energy contained in the first $m$ components. In the following we will
show that $S_m(A)$ is maximized when the transform $A$ is the
KLT matrix $\Phi$:
$$ S_m(\Phi) \ge S_m(A), $$
i.e., the KLT optimally compacts energy into a few components of the signal.
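To make the claim concrete, the energy captured by the first $m$ components can be compared across several orthogonal transforms. This is a sketch under stated assumptions: NumPy is available, the covariance matrix is a randomly generated symmetric positive semidefinite matrix, and the KLT matrix is taken to be the eigenvector matrix of the covariance with eigenvalues in descending order. The helper `energy_in_first_m` is a hypothetical name introduced here.

```python
import numpy as np

def energy_in_first_m(A, sigma_x, m):
    """S_m(A): sum of the first m diagonal entries of Sigma_Y = A^T Sigma_X A."""
    sigma_y = A.T @ sigma_x @ A
    return float(np.trace(sigma_y[:m, :m]))

rng = np.random.default_rng(1)
N, m = 5, 2

# Hypothetical covariance matrix, symmetric positive semidefinite by construction
B = rng.standard_normal((N, N))
sigma_x = B @ B.T
total = float(np.trace(sigma_x))

identity = np.eye(N)                                     # no transform at all
random_orth, _ = np.linalg.qr(rng.standard_normal((N, N)))

# KLT matrix: eigenvectors of Sigma_X ordered by descending eigenvalue
# (np.linalg.eigh returns eigenvalues in ascending order, so reverse the columns)
eigvals, eigvecs = np.linalg.eigh(sigma_x)
phi = eigvecs[:, ::-1]

for name, A in [("identity", identity), ("random", random_orth), ("KLT", phi)]:
    s_m = energy_in_first_m(A, sigma_x, m)
    print(f"{name:8s} S_m = {s_m:8.3f}  share = {s_m / total:.3f}")
```

With $m = N$ every transform reports the same total, which is the energy conservation property; for $m < N$ the KLT row should show the largest share.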
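Consider
$$ S_m(A) = \sum_{i=1}^{m} E(y_i^2) = \sum_{i=1}^{m} E\left[ (A_i^T X)(X^T A_i) \right] = \sum_{i=1}^{m} A_i^T E(XX^T) A_i = \sum_{i=1}^{m} A_i^T \Sigma_X A_i $$
Now we need to find a transform matrix $A$ so that
$$ S_m(A) \rightarrow \max, \qquad \text{subject to } A_j^T A_j = 1 \quad (j = 1, \dots, m) $$
The constraint $A_j^T A_j = 1$ guarantees that the column vectors of $A$
are normalized. This constrained optimization problem can be solved by the Lagrange
multiplier method as shown below.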
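We let
$$
\begin{aligned}
\frac{\partial}{\partial A_i} \left[ S_m(A) - \sum_{j=1}^{m} \lambda_j (A_j^T A_j - 1) \right]
&= \frac{\partial}{\partial A_i} \sum_{j=1}^{m} \left[ A_j^T \Sigma_X A_j - \lambda_j A_j^T A_j + \lambda_j \right] \\
&= \frac{\partial}{\partial A_i} \left[ A_i^T \Sigma_X A_i - \lambda_i A_i^T A_i \right] \\
&\stackrel{*}{=} 2 \Sigma_X A_i - 2 \lambda_i A_i = 0
\end{aligned}
$$
(* The last equality is explained in the handout reviewing linear algebra.)
We see that the column vectors of $A$ must be the eigenvectors of
$\Sigma_X$:
$$ \Sigma_X A_i = \lambda_i A_i $$
i.e., the transform matrix must be
$$ A = [\phi_1, \dots, \phi_N] = \Phi $$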
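The step marked with * rests on the gradient of a quadratic form. A short component-wise derivation is sketched below, assuming only that the covariance matrix $\Sigma_X$ is symmetric; the symbols $a_k$ (entries of the column vector $A_i$) and $s_{kl}$ (entries of $\Sigma_X$) are introduced here for illustration.

```latex
\begin{aligned}
A_i^T \Sigma_X A_i &= \sum_{k=1}^{N} \sum_{l=1}^{N} a_k \, s_{kl} \, a_l, \\
\frac{\partial}{\partial a_p} \left( A_i^T \Sigma_X A_i \right)
  &= \sum_{l=1}^{N} s_{pl} \, a_l + \sum_{k=1}^{N} a_k \, s_{kp}
   = 2 \sum_{l=1}^{N} s_{pl} \, a_l
   \qquad (\text{since } s_{kp} = s_{pk}), \\
\frac{\partial}{\partial A_i} \left( A_i^T \Sigma_X A_i \right) &= 2 \, \Sigma_X A_i,
\qquad
\frac{\partial}{\partial A_i} \left( \lambda_i A_i^T A_i \right) = 2 \, \lambda_i A_i .
\end{aligned}
```

The second identity is the special case of the first with the identity matrix in place of $\Sigma_X$. Setting the difference of the two gradients to zero gives the eigenvalue equation.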
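Thus we have proved that the optimal transform is indeed the KLT, and
$$ S_m(\Phi) = \sum_{i=1}^{m} \phi_i^T \Sigma_X \phi_i = \sum_{i=1}^{m} \lambda_i $$
where the $i$th eigenvalue $\lambda_i$ of $\Sigma_X$ is also the average (expectation)
energy contained in the $i$th component of the transformed signal $Y = \Phi^T X$.
If we choose those $\phi_i$ that correspond to the $m$ largest eigenvalues of
$\Sigma_X$:
$$ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge \cdots \ge \lambda_N, $$
then $S_m(\Phi)$ will achieve its maximum.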
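The relation between eigenvalues and component energies can also be observed on sampled data. The following is a minimal sketch, assuming NumPy is available and using synthetic correlated Gaussian samples as a stand-in for a real signal; the covariance matrix is estimated from the samples, so the numbers describe the sample rather than the underlying distribution.

```python
import numpy as np

rng = np.random.default_rng(42)
N, m, n_samples = 4, 2, 20000

# Hypothetical zero-mean signal: correlated Gaussian samples, one per row
mixing = rng.standard_normal((N, N))
X = rng.standard_normal((n_samples, N)) @ mixing.T
X -= X.mean(axis=0)                      # enforce the zero-mean assumption

sigma_x = (X.T @ X) / n_samples          # sample estimate of E(X X^T)

# eigh returns ascending eigenvalues; reverse to get descending order
eigvals, eigvecs = np.linalg.eigh(sigma_x)
eigvals, phi = eigvals[::-1], eigvecs[:, ::-1]

Y = X @ phi                              # row-wise form of Y = Phi^T X
energy_y = (Y ** 2).mean(axis=0)         # sample E(y_i^2) per component

fraction = np.cumsum(eigvals) / eigvals.sum()   # cumulative energy share
print("eigenvalues      :", np.round(eigvals, 3))
print("E(y_i^2)         :", np.round(energy_y, 3))
print("cumulative share :", np.round(fraction, 3))
print(f"share in first {m} components: {fraction[m - 1]:.3f}")
```

Because the same sample covariance is used for both the eigen-decomposition and the energy estimate, the second printed row should match the first up to rounding, and the cumulative share indicates how many components are needed to retain a given portion of the energy.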
 
 
   
Ruye Wang
2004-09-29