When the number of features is large, solving the eigenvalue problem of the
matrix may be very time-consuming. As a compromise, we can use another
orthogonal transform, such as the DFT or WHT, instead of the KLT for the
transform.
Obviously, the DFT and WHT do not depend on the feature selection criterion.
The reason why they can be used to replace the KLT is that
orthogonal transforms in general tend to decorrelate signals, so that the
energy/information (separability information here) is concentrated in a small
number of components while the others contain little. (However, this energy
compaction is suboptimal compared to that of the KLT.) To achieve the best
feature selection effect, we should choose the rows of the DFT or WHT matrix
corresponding to the largest values.
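
The row selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method of the text: the score used for ranking is taken to be the mean squared value (energy) of each transformed component, because the quantity being ranked is not spelled out here, and any per-component separability score could be passed to the selection step instead. The function names `hadamard`, `component_energy`, and `select_rows`, as well as the toy data, are hypothetical.

```python
import random


def hadamard(n):
    """Sylvester-type Hadamard (WHT) matrix, scaled so its rows are orthonormal.

    n must be a power of 2.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    h = [[1.0]]
    while len(h) < n:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = n ** -0.5
    return [[v * scale for v in row] for row in h]


def component_energy(rows, samples):
    """Mean squared projection of the samples onto each transform row.

    ASSUMPTION: energy stands in for the ranking score; replace it with the
    separability criterion actually in use.
    """
    energies = []
    for row in rows:
        total = 0.0
        for x in samples:
            y = sum(r * v for r, v in zip(row, x))
            total += y * y
        energies.append(total / len(samples))
    return energies


def select_rows(scores, m):
    """Indices of the m rows with the largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:m]


# Toy data (hypothetical): 8 strongly correlated features per sample,
# i.e. a shared random level plus a little independent noise.
rng = random.Random(0)
samples = []
for _ in range(200):
    level = rng.gauss(0.0, 1.0)
    samples.append([level + rng.gauss(0.0, 0.1) for _ in range(8)])

W = hadamard(8)
scores = component_energy(W, samples)
keep = select_rows(scores, m=2)

# Reduced feature vectors: project each sample onto the selected rows only.
reduced = [[sum(r * v for r, v in zip(W[i], x)) for i in keep] for x in samples]
```

Because the features in this toy data are highly correlated, nearly all of the energy falls on the constant first row of the WHT matrix, which illustrates the energy compaction the paragraph relies on. No eigenvalue problem is solved at any point; the cost is only the fixed transform plus a sort of the scores.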