Here we first consider a set of simple supervised classification algorithms that assign an unlabeled sample $\mathbf{x}$ to one of $C$ known classes based on a set of training samples $\{\mathbf{x}_1,\dots,\mathbf{x}_N\}$, where each sample $\mathbf{x}_n$ is labeled by $y_n\in\{1,\dots,C\}$, indicating that it belongs to class $C_{y_n}$.
Given an unlabeled pattern $\mathbf{x}$, we first find its $k$ nearest neighbors in the training dataset, and then assign $\mathbf{x}$ to one of the $C$ classes by a majority vote of these neighbors based on their class labels. The voting can be weighted so that closer neighbors are weighted more heavily than those that are farther away. In particular, when $k=1$, $\mathbf{x}$ is assigned to the class of its single closest neighbor.
While the k-NN method is simple and straightforward, its computational cost is high, as classifying any unlabeled pattern requires computing its distances to all $N$ data points in the training set.
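As a minimal sketch of the idea (not part of the original text), the following Python/NumPy function implements k-NN with an optional inverse-distance weighting of the votes; the function name and the weighting scheme are illustrative choices.

```python
import numpy as np

def knn_classify(x, X_train, y_train, k=3, weighted=False):
    """Assign the unlabeled pattern x by a (possibly weighted) majority
    vote among its k nearest neighbors in the training set."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances to all N training points
    nn = np.argsort(dists)[:k]                    # indices of the k closest samples
    w = 1.0 / (dists[nn] + 1e-12) if weighted else np.ones(k)
    votes = {}                                    # accumulate (weighted) votes per class label
    for idx, weight in zip(nn, w):
        votes[y_train[idx]] = votes.get(y_train[idx], 0.0) + weight
    return max(votes, key=votes.get)              # label with the largest total vote

# Tiny usage example: with k=1 the pattern takes its closest neighbor's class
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
y = np.array([0, 0, 1])
print(knn_classify(np.array([4.5, 5.0]), X, y, k=1))   # -> 1
```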
Given the set of $N_k$ training data points $D_k=\{\mathbf{x}_n \mid y_n=k\}$, all belonging to the $k$th class $C_k$, we can find their mean $\mathbf{m}_k$ and covariance $\mathbf{\Sigma}_k$ to represent the class:
$$\mathbf{m}_k=\frac{1}{N_k}\sum_{\mathbf{x}_n\in D_k}\mathbf{x}_n,\qquad
\mathbf{\Sigma}_k=\frac{1}{N_k}\sum_{\mathbf{x}_n\in D_k}(\mathbf{x}_n-\mathbf{m}_k)(\mathbf{x}_n-\mathbf{m}_k)^T
\tag{1}$$
An unlabeled sample $\mathbf{x}$ is then assigned to the class whose mean it is closest to in terms of some distance $d$:

$$\text{if}\quad d(\mathbf{x},\mathbf{m}_k)<d(\mathbf{x},\mathbf{m}_l)\ \ \text{for all}\ l\ne k,\quad\text{then}\quad \mathbf{x}\in C_k
\tag{2}$$
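For concreteness, here is a short Python/NumPy sketch of this minimum-distance scheme, with the distance left as a pluggable function (Euclidean by default, as discussed next); the helper names fit_class_stats and min_distance_classify are our own, not from the text.

```python
import numpy as np

def fit_class_stats(X_train, y_train):
    """Estimate the mean m_k and covariance Sigma_k of each class, as in Eq. (1)."""
    stats = {}
    for k in np.unique(y_train):
        Xk = X_train[y_train == k]                  # training samples of class k
        mk = Xk.mean(axis=0)                        # class mean m_k
        Sk = np.cov(Xk, rowvar=False, bias=True)    # class covariance Sigma_k (1/N_k normalization)
        stats[k] = (mk, Sk)
    return stats

def min_distance_classify(x, stats, dist=lambda x, m, S: np.linalg.norm(x - m)):
    """Assign x to the class minimizing dist(x, m_k, Sigma_k), as in Eq. (2)."""
    return min(stats, key=lambda k: dist(x, *stats[k]))
```

Passing a covariance-aware distance (such as the Mahalanobis distance defined below) as dist is all that is needed to turn this into the classifier used in the examples that follow.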
We could simply use the Euclidean distance $d(\mathbf{x},\mathbf{m}_k)=\|\mathbf{x}-\mathbf{m}_k\|$ between $\mathbf{x}$ and $\mathbf{m}_k$. But such a classification may not be reliable, as the Euclidean distance does not take into consideration the covariance $\mathbf{\Sigma}_k$, which represents how the samples of the class are distributed in the feature space, as illustrated by the following example.
Example 1: As illustrated in the figure below (left plot), a point $x$ in 1-D space is to be classified into one of the two classes $C_1$ and $C_2$, represented by their corresponding Gaussian pdfs:
$$p_k(x)=\mathcal{N}(x\mid\mu_k,\sigma_k^2)
=\frac{1}{\sqrt{2\pi}\,\sigma_k}\exp\!\left(-\frac{(x-\mu_k)^2}{2\sigma_k^2}\right),
\qquad k=1,2
\tag{3}$$
We see that the distance from $x$ to class $C_k$ should be positively related to $|x-\mu_k|$ but inversely related to $\sigma_k$, i.e., we can define $d(x,C_k)=|x-\mu_k|/\sigma_k$. Based on this distance, $x$ is assigned to the class with the smaller $d(x,C_k)$, which, as the plot shows, can be the class with the larger spread $\sigma_k$ even though its mean $\mu_k$ is farther from $x$ in the Euclidean sense.
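Since the actual $\mu_k$ and $\sigma_k$ of the plot are not given in the text, the following tiny numerical sketch uses hypothetical values chosen only to illustrate how the variance-normalized distance can reverse the Euclidean decision:

```python
import numpy as np

# Hypothetical 1-D parameters (illustration only): class 1 is narrow, class 2 is wide
mu = np.array([0.0, 6.0])        # means mu_1, mu_2
sigma = np.array([0.5, 4.0])     # standard deviations sigma_1, sigma_2
x = 2.0

print(np.abs(x - mu))            # Euclidean distances [2. 4.] -> class 1 looks closer
d = np.abs(x - mu) / sigma       # normalized distances |x - mu_k| / sigma_k
print(d)                         # [4. 1.] -> class 2 is actually "closer"
print("assigned class:", np.argmin(d) + 1)   # -> 2
```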
In a higher-dimensional feature space, we can carry out classification based on the more generally defined Mahalanobis distance between a point $\mathbf{x}$ and a distribution represented by $\mathbf{m}_k$ and $\mathbf{\Sigma}_k$:
$$d_M(\mathbf{x},C_k)=\sqrt{(\mathbf{x}-\mathbf{m}_k)^T\mathbf{\Sigma}_k^{-1}(\mathbf{x}-\mathbf{m}_k)}
\tag{4}$$
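A one-function Python sketch of Eq. (4); note that in 1-D it reduces to the $|x-\mu|/\sigma$ distance of Example 1. Solving a linear system rather than forming the explicit inverse of $\mathbf{\Sigma}_k$ is just a numerical-stability choice.

```python
import numpy as np

def mahalanobis(x, m, S):
    """Mahalanobis distance between point x and a class with mean m and covariance S (Eq. 4)."""
    d = np.atleast_1d(x - m)
    S = np.atleast_2d(S)
    # d^T Sigma^{-1} d, computed by solving S z = d instead of inverting S
    return float(np.sqrt(d @ np.linalg.solve(S, d)))
```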
Example 2: As illustrated in the above figure (right plot), two samples $\mathbf{x}_1$ and $\mathbf{x}_2$ are to be classified into either of two classes $C_1$ and $C_2$, each represented by its mean vector and covariance matrix:
$$C_k:\ (\mathbf{m}_k,\,\mathbf{\Sigma}_k),\qquad k=1,2
\tag{5}$$

Each sample is assigned to the class of the smaller Mahalanobis distance:

$$\mathbf{x}_i\in C_k\quad\text{if}\quad d_M(\mathbf{x}_i,C_k)<d_M(\mathbf{x}_i,C_l)\ \ \text{for}\ l\ne k
\tag{6}$$
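Because the numerical parameters of Example 2 come from the figure and are not given in the text, the following sketch uses hypothetical means, covariances, and sample points purely to show how the comparison in Eqs. (5)-(6) is carried out; it reuses the mahalanobis helper sketched above.

```python
import numpy as np

def mahalanobis(x, m, S):
    d = x - m
    return float(np.sqrt(d @ np.linalg.solve(S, d)))

# Hypothetical class parameters (illustration only, not the figure's values)
m1, S1 = np.array([0.0, 0.0]), np.eye(2)           # tight, isotropic class C1
m2, S2 = np.array([4.0, 4.0]), 9.0 * np.eye(2)     # widely spread class C2

for i, x in enumerate([np.array([1.0, 0.5]), np.array([2.0, 2.0])], start=1):
    d1, d2 = mahalanobis(x, m1, S1), mahalanobis(x, m2, S2)
    label = 1 if d1 < d2 else 2
    print(f"x_{i}: d_M to C1 = {d1:.2f}, d_M to C2 = {d2:.2f} -> class {label}")
# x_1 -> class 1; x_2 -> class 2, even though x_2 is equally far
# from both means in the Euclidean sense
```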