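First, note that the separability criterion $J$ in the transformed space $\mathbf{y}=\mathbf{A}^T\mathbf{x}$, taken as the trace of the scatter matrix of the transformed data, can be expressed in terms of the symmetric scatter matrix $\mathbf{S}$ of the original data:
\[ J(\mathbf{A}) = \mathrm{tr}\left(\mathbf{A}^T\mathbf{S}\mathbf{A}\right) = \sum_{i=1}^{M} \mathbf{a}_i^T\mathbf{S}\,\mathbf{a}_i \]
where $\mathbf{a}_i$ is the $i$th column of $\mathbf{A}=[\mathbf{a}_1,\dots,\mathbf{a}_M]$.
The optimal transform matrix $\mathbf{A}$ which maximizes $J(\mathbf{A})$ can be obtained by solving the following optimization problem:
\[ \max_{\mathbf{A}}\; J(\mathbf{A}) = \sum_{i=1}^{M} \mathbf{a}_i^T\mathbf{S}\,\mathbf{a}_i \quad \text{subject to} \quad \mathbf{a}_i^T\mathbf{a}_i = 1, \quad i=1,\dots,M \]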
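Here we have further assumed that $\mathbf{A}$ is an orthogonal matrix (a justifiable constraint, as orthogonal matrices conserve the energy/information in the signal vector). This constrained optimization problem can be solved by the method of Lagrange multipliers. With $\mathbf{S}$ denoting the symmetric scatter matrix and $\mathbf{a}_i$ the $i$th column of $\mathbf{A}$, setting the derivative of the Lagrangian with respect to each $\mathbf{a}_i$ to zero gives:
\[ \frac{\partial}{\partial \mathbf{a}_i}\left[\sum_{j=1}^{M} \mathbf{a}_j^T\mathbf{S}\,\mathbf{a}_j - \sum_{j=1}^{M}\lambda_j\left(\mathbf{a}_j^T\mathbf{a}_j - 1\right)\right] = 2\,\mathbf{S}\,\mathbf{a}_i - 2\lambda_i\mathbf{a}_i = \mathbf{0} \]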
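We see that the column vectors $\mathbf{a}_i$ of $\mathbf{A}$ must be the orthogonal eigenvectors of the symmetric matrix $\mathbf{S}$:
\[ \mathbf{S}\,\mathbf{a}_i = \lambda_i\mathbf{a}_i, \quad i=1,\dots,M \]
i.e., the transform matrix must be
\[ \mathbf{A} = [\mathbf{a}_1,\dots,\mathbf{a}_M] = [\boldsymbol{\phi}_1,\dots,\boldsymbol{\phi}_M] \]
where $\boldsymbol{\phi}_i$ denotes the normalized eigenvector of $\mathbf{S}$ with eigenvalue $\lambda_i$.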
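Thus we have proved that the optimal feature selection transform is the principal component transform (KLT), which, as we have shown before, tends to compact most of the energy/information (representing separability here) into a small number of components. Therefore the $M$ new features can be obtained by
\[ \mathbf{y} = \mathbf{A}^T\mathbf{x} \]
and, with $\boldsymbol{\phi}_i$ and $\lambda_i$ the eigenvectors and eigenvalues of the symmetric matrix $\mathbf{S}$ that form the columns of $\mathbf{A}$, the criterion becomes
\[ J = \sum_{i=1}^{M} \boldsymbol{\phi}_i^T\mathbf{S}\,\boldsymbol{\phi}_i = \sum_{i=1}^{M} \lambda_i \]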
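As a minimal numerical sketch of this result (not part of the original derivation), the following NumPy code uses the covariance of synthetic data as a stand-in for the symmetric scatter matrix, takes its leading eigenvectors as the columns of the transform, and checks that the trace criterion equals the sum of the corresponding eigenvalues. The names `S`, `A`, `klt_transform`, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def klt_transform(S, M):
    """Return the N x M matrix whose columns are the eigenvectors of the
    symmetric matrix S with the M largest eigenvalues, plus those eigenvalues."""
    # eigh is intended for symmetric matrices; eigenvalues come back ascending
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1][:M]
    return eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))   # synthetic data: N = 5 features, 200 samples as columns
S = np.cov(X)                   # stand-in symmetric scatter matrix (an assumption)

A, lam = klt_transform(S, M=2)
Y = A.T @ X                     # new features y = A^T x for every sample
J = np.trace(A.T @ S @ A)       # criterion evaluated in the transformed space

print(np.allclose(A.T @ A, np.eye(2)))  # columns of A are orthonormal
print(np.isclose(J, lam.sum()))         # J equals the sum of the chosen eigenvalues
```

Using `np.linalg.eigh` rather than `np.linalg.eig` is a deliberate choice here: it exploits the symmetry of the matrix and returns real eigenvalues with orthonormal eigenvectors.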
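Obviously, to maximize $J$, we just need to choose the $M$ eigenvectors $\boldsymbol{\phi}_i$ corresponding to the $M$ largest eigenvalues of the symmetric matrix $\mathbf{S}$:
\[ \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_M \ge \dots \ge \lambda_N \]
In the subspace spanned by these $M$ new features, $J$ will be maximized.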
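The optimality claim can also be checked numerically. The sketch below assumes a randomly generated symmetric positive semidefinite matrix `S` as a hypothetical scatter matrix and compares the criterion for the eigenvectors with the largest eigenvalues against the criterion for random orthonormal bases obtained by QR decomposition. All names (`criterion`, `A_top`, `J_random`) are illustrative.

```python
import numpy as np

def criterion(S, A):
    """Trace criterion J = tr(A^T S A)."""
    return np.trace(A.T @ S @ A)

rng = np.random.default_rng(1)
N, M = 6, 2
B = rng.normal(size=(N, N))
S = B @ B.T                            # random symmetric PSD matrix (an assumption)

eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
A_top = eigvecs[:, -M:]                # eigenvectors of the M largest eigenvalues
J_top = criterion(S, A_top)

J_random = []
for _ in range(1000):
    # Reduced QR of a random N x M matrix yields orthonormal columns
    Q, _ = np.linalg.qr(rng.normal(size=(N, M)))
    J_random.append(criterion(S, Q))

# Small tolerance guards against floating-point round-off
print(J_top >= max(J_random) - 1e-9)
```

A random search like this cannot prove optimality; it only illustrates that no sampled orthonormal basis exceeds the eigenvector solution.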
Ruye Wang
2016-11-30