Here we first consider a set of simple supervised classification algorithms that assign an unlabeled sample $\mathbf{x}$ to one of the $K$ known classes $C_1,\dots,C_K$ based on a set of training samples $\{\mathbf{x}_1,\dots,\mathbf{x}_N\}$, where each sample $\mathbf{x}_n$ is labeled by $y_n=k$, indicating that it belongs to class $C_k$.
Given an unlabeled pattern $\mathbf{x}$, we first find its $k$ nearest neighbors in the training dataset, and then assign $\mathbf{x}$ to one of the $K$ classes by a majority vote of the $k$ neighbors based on their class labels. The voting can be weighted so that closer neighbors are weighted more heavily than those that are farther away. In particular, when $k=1$, $\mathbf{x}$ is assigned to the class of its single closest neighbor.
While the k-NN method is simple and straightforward, its computational cost is high, as classifying each unlabeled pattern requires computing its distance to every point in the training set.
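As a rough sketch of the procedure just described (not part of the original notes), a weighted k-NN classifier might be implemented as follows; the array names and the inverse-distance weighting scheme are assumptions for illustration:

```python
import numpy as np

def knn_classify(x, X_train, y_train, k=3, weighted=False):
    """Assign x to a class by (optionally weighted) majority vote of its k nearest training samples."""
    # Euclidean distances from x to every training sample (the costly step noted above)
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]            # indices of the k closest samples
    # Inverse-distance weights if requested, otherwise equal votes
    weights = 1.0 / (dists[nearest] + 1e-12) if weighted else np.ones(k)
    votes = {}
    for idx, w in zip(nearest, weights):
        votes[y_train[idx]] = votes.get(y_train[idx], 0.0) + w
    return max(votes, key=votes.get)            # class with the largest (weighted) vote
```

With $k=1$ this reduces to assigning $\mathbf{x}$ to the class of its single closest neighbor.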
Given a set of training data points $\{\mathbf{x}_1,\dots,\mathbf{x}_{N_k}\}$ all belonging to the $k$th class $C_k$ ($k=1,\dots,K$), we can find their mean and covariance to represent the class:

$$\mathbf{m}_k=\frac{1}{N_k}\sum_{n=1}^{N_k}\mathbf{x}_n \tag{1}$$

$$\boldsymbol{\Sigma}_k=\frac{1}{N_k}\sum_{n=1}^{N_k}(\mathbf{x}_n-\mathbf{m}_k)(\mathbf{x}_n-\mathbf{m}_k)^T \tag{2}$$
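A minimal sketch of Eqs. (1) and (2), assuming the training samples of class $C_k$ are stored row-wise in a NumPy array `X_k`:

```python
import numpy as np

def class_mean_cov(X_k):
    """Estimate the mean vector (Eq. 1) and covariance matrix (Eq. 2) of one class.
    X_k has shape (N_k, d): one d-dimensional training sample per row."""
    N_k = X_k.shape[0]
    m_k = X_k.mean(axis=0)          # Eq. (1): sample mean
    D = X_k - m_k                   # deviations from the mean
    Sigma_k = (D.T @ D) / N_k       # Eq. (2): sample covariance (normalized by N_k)
    return m_k, Sigma_k
```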
We could simply use the Euclidean distance $d(\mathbf{x},\mathbf{m}_k)=\|\mathbf{x}-\mathbf{m}_k\|$ between $\mathbf{x}$ and $\mathbf{m}_k$, and assign $\mathbf{x}$ to the class with the nearest mean. But such a classification may not be reliable, as the Euclidean distance does not take into consideration the covariance $\boldsymbol{\Sigma}_k$ representing how the samples of $C_k$ are distributed in the feature space, as illustrated by the following example.
Example 1: As illustrated in the figure below (left plot), a point $x$ in 1-D space is to be classified into one of two classes represented by their corresponding Gaussian pdfs:

$$p_k(x)=\mathcal{N}(x\mid\mu_k,\sigma_k^2)=\frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\left[-\frac{(x-\mu_k)^2}{2\sigma_k^2}\right],\qquad k=1,2 \tag{3}$$
We see the distance $d(x,C_k)$ should be positively related to $|x-\mu_k|$, but inversely related to $\sigma_k$; i.e., we can define $d(x,C_k)=|x-\mu_k|/\sigma_k$. Based on this distance, $x$ is assigned to whichever class yields the smaller value of $d(x,C_k)$, which may differ from the class chosen by the unnormalized distance $|x-\mu_k|$.
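The comparison in Example 1 can be reproduced in a few lines; the specific means and standard deviations below are hypothetical stand-ins for the values shown in the figure:

```python
import numpy as np

# Hypothetical 1-D classes as (mu, sigma) pairs, not the actual values from the figure
classes = {"C1": (0.0, 4.0), "C2": (10.0, 1.0)}
x = 6.0                                          # point to classify

for name, (mu, sigma) in classes.items():
    d_euc = abs(x - mu)                          # plain distance to the mean
    d_norm = abs(x - mu) / sigma                 # variance-normalized distance d(x, C_k)
    print(f"{name}: |x-mu| = {d_euc:.2f}, |x-mu|/sigma = {d_norm:.2f}")

# The unnormalized distance favors C2 (|6-10| = 4 < |6-0| = 6), but after dividing by
# the spread, C1 is closer (6/4 = 1.5 < 4/1 = 4), so x would be assigned to C1 instead.
```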
In a higher-dimensional feature space, we can carry out classification based on the more generally defined Mahalanobis distance between a point $\mathbf{x}$ and a distribution represented by $\mathbf{m}_k$ and $\boldsymbol{\Sigma}_k$:

$$d_M(\mathbf{x},C_k)=\sqrt{(\mathbf{x}-\mathbf{m}_k)^T\boldsymbol{\Sigma}_k^{-1}(\mathbf{x}-\mathbf{m}_k)} \tag{4}$$
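A sketch of a minimum-Mahalanobis-distance classifier built on Eq. (4); the function and variable names are assumptions, and a small ridge term is added only to guard against a singular covariance:

```python
import numpy as np

def mahalanobis(x, m_k, Sigma_k, ridge=1e-9):
    """Mahalanobis distance (Eq. 4) between point x and the class described by (m_k, Sigma_k)."""
    d = x - m_k
    Sigma_inv = np.linalg.inv(Sigma_k + ridge * np.eye(Sigma_k.shape[0]))
    return float(np.sqrt(d @ Sigma_inv @ d))

def classify(x, means, covs):
    """Assign x to the class whose Mahalanobis distance is smallest."""
    dists = [mahalanobis(x, m, S) for m, S in zip(means, covs)]
    return int(np.argmin(dists))
```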
Example 2: As illustrated in the above figure (right plot), two samples $\mathbf{x}_1$ and $\mathbf{x}_2$ are to be classified into either of two classes, each characterized by its mean vector and covariance matrix:

$$C_1:\ (\mathbf{m}_1,\ \boldsymbol{\Sigma}_1) \tag{5}$$

$$C_2:\ (\mathbf{m}_2,\ \boldsymbol{\Sigma}_2) \tag{6}$$
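To make Example 2 concrete, here is a sketch with hypothetical means and covariances standing in for the values of Eqs. (5) and (6); it shows how the Mahalanobis distance can assign a sample differently than the Euclidean distance to the class means:

```python
import numpy as np

# Hypothetical class parameters, standing in for Eqs. (5) and (6)
m1, Sigma1 = np.array([0.0, 0.0]), np.array([[9.0, 0.0], [0.0, 1.0]])   # C1: elongated along the first axis
m2, Sigma2 = np.array([6.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])   # C2: isotropic

x = np.array([3.5, 0.0])                         # sample to classify

for name, m, S in [("C1", m1, Sigma1), ("C2", m2, Sigma2)]:
    d_euc = np.linalg.norm(x - m)                                 # Euclidean distance to the mean
    d_mah = np.sqrt((x - m) @ np.linalg.inv(S) @ (x - m))         # Mahalanobis distance, Eq. (4)
    print(f"{name}: Euclidean = {d_euc:.2f}, Mahalanobis = {d_mah:.2f}")

# Euclidean distance favors C2 (2.5 < 3.5), but once C1's large spread along the first
# axis is taken into account, the Mahalanobis distance favors C1 (about 1.17 < 2.5).
```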