- $P(\omega_i)$: the a priori probability that an arbitrary pattern belongs to class $\omega_i$.
- $P(\omega_i|\mathbf{x})$: the a posteriori conditional probability that a specific pattern $\mathbf{x}$ belongs to class $\omega_i$.
- $p(\mathbf{x})$: the density distribution of patterns in all classes.
- $p(\mathbf{x}|\omega_i)$: the conditional density distribution of all patterns belonging to $\omega_i$.
Note that $p(\mathbf{x})$ is the weighted sum of all $p(\mathbf{x}|\omega_i)$ for $i=1,\cdots,C$:

$$p(\mathbf{x})=\sum_{i=1}^C p(\mathbf{x}|\omega_i)\,P(\omega_i)$$
- The Bayes' Theorem

$$P(\omega_i|\mathbf{x})=\frac{p(\mathbf{x}|\omega_i)\,P(\omega_i)}{p(\mathbf{x})}$$
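As a minimal numeric sketch of the theorem (the two classes, the prior values, and the density values below are made-up illustration numbers, not taken from the text):

```python
import numpy as np

# Hypothetical two-class example: priors P(w_i) and class-conditional
# densities p(x|w_i) evaluated at one specific pattern x (made-up values).
priors = np.array([0.6, 0.4])          # P(w_1), P(w_2)
likelihoods = np.array([0.02, 0.05])   # p(x|w_1), p(x|w_2)

# p(x) is the weighted sum of the class-conditional densities.
evidence = np.sum(likelihoods * priors)

# Bayes' theorem: P(w_i|x) = p(x|w_i) P(w_i) / p(x)
posteriors = likelihoods * priors / evidence
print(posteriors, posteriors.sum())    # the posteriors sum to 1
```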
- Training
The a priori probability $P(\omega_i)$ can be estimated from the training samples as $P(\omega_i)\approx n_i/n$, the fraction of the $n$ training samples that belong to class $\omega_i$, assuming the training samples are randomly chosen from all the patterns.
We also need to estimate $p(\mathbf{x}|\omega_i)$. If we don't have any good reason to believe otherwise, we will assume the density to be a normal distribution:

$$p(\mathbf{x}|\omega_i)=N(\mathbf{m}_i,\Sigma_i)=\frac{1}{(2\pi)^{N/2}|\Sigma_i|^{1/2}}\exp\left[-\frac{1}{2}(\mathbf{x}-\mathbf{m}_i)^T\Sigma_i^{-1}(\mathbf{x}-\mathbf{m}_i)\right]$$

where the mean vector $\mathbf{m}_i$ and the covariance matrix $\Sigma_i$ can be estimated from the training samples as shown before.
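A sketch of the training step, assuming the training samples are collected in a NumPy array `X` (one row per pattern) with integer class labels `y`; the function name and variable names are illustrative, not from the text:

```python
import numpy as np

def train_gaussian_classes(X, y):
    """Estimate P(w_i), mean vector m_i, and covariance S_i per class."""
    params = {}
    n = len(X)
    for c in np.unique(y):
        Xc = X[y == c]                      # training samples of class c
        prior = len(Xc) / n                 # P(w_i) ~ n_i / n
        mean = Xc.mean(axis=0)              # sample mean vector m_i
        cov = np.cov(Xc, rowvar=False)      # sample covariance matrix S_i
        params[c] = (prior, mean, cov)
    return params
```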
- Classification
A given pattern $\mathbf{x}$ of unknown class is classified to $\omega_i$ if it is most likely to belong to $\omega_i$ (optimal classifier), i.e.:

$$\mathbf{x}\in\omega_i,\;\;\;\text{if}\;\;P(\omega_i|\mathbf{x})>P(\omega_j|\mathbf{x}),\;\;\forall j\ne i$$
As shown above, the a posteriori probability $P(\omega_i|\mathbf{x})$ can be written as $p(\mathbf{x}|\omega_i)P(\omega_i)/p(\mathbf{x})$, and the denominator $p(\mathbf{x})$ can be dropped as it is common to all $P(\omega_i|\mathbf{x})$'s; therefore a discriminant function

$$g_i(\mathbf{x})=p(\mathbf{x}|\omega_i)\,P(\omega_i)$$

can be used in the classification:

$$\mathbf{x}\in\omega_i,\;\;\;\text{if}\;\;g_i(\mathbf{x})>g_j(\mathbf{x}),\;\;\forall j\ne i$$
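A sketch of the classification step, reusing the hypothetical `train_gaussian_classes` output above; it evaluates the equivalent logarithmic form $\ln p(\mathbf{x}|\omega_i)+\ln P(\omega_i)$ of the discriminant (the logarithm is monotonic, so the comparison is unchanged) to avoid numerical underflow:

```python
import numpy as np

def log_discriminant(x, prior, mean, cov):
    """g_i(x) in log form: ln p(x|w_i) + ln P(w_i) for a Gaussian class."""
    d = len(mean)
    diff = x - mean
    log_density = -0.5 * (d * np.log(2 * np.pi)
                          + np.log(np.linalg.det(cov))
                          + diff @ np.linalg.inv(cov) @ diff)
    return log_density + np.log(prior)

def classify(x, params):
    """Assign x to the class whose discriminant function is largest."""
    scores = {c: log_discriminant(x, *p) for c, p in params.items()}
    return max(scores, key=scores.get)
```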
Given all $C$ discriminant functions, we can partition the N-D feature space into $C$ regions $R_1,\cdots,R_C$ by the boundaries between any two regions $R_i$ and $R_j$, represented by $g_i(\mathbf{x})=g_j(\mathbf{x})$.
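The partition can be made concrete by evaluating all discriminant functions on a grid of feature values and recording, at each point, which class wins; the boundary between $R_i$ and $R_j$ is where the winning class changes. Below is a sketch for a 2-D feature space, reusing the hypothetical `classify` helper above (integer class labels assumed):

```python
import numpy as np

def decision_regions(params, x_range, y_range, steps=200):
    """Label each point of a 2-D grid with the class of largest g_i(x)."""
    xs = np.linspace(*x_range, steps)
    ys = np.linspace(*y_range, steps)
    labels = np.empty((steps, steps), dtype=int)
    for i, xv in enumerate(xs):
        for j, yv in enumerate(ys):
            labels[j, i] = classify(np.array([xv, yv]), params)
    return xs, ys, labels   # region boundaries: where the label changes
```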