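In principal component analysis (PCA), each pattern $\mathbf{x}=[x_1,\dots,x_d]^T$ is treated as a random vector, each component $x_i$ of which is a random variable with mean and variance
$$\mu_i = E(x_i), \qquad \sigma_i^2 = E\left[(x_i-\mu_i)^2\right], \qquad i=1,\dots,d \tag{37}$$
The mean vector and covariance matrix of $\mathbf{x}$ are
$$\mathbf{m} = E(\mathbf{x}) = [\mu_1,\dots,\mu_d]^T \tag{38}$$
$$\boldsymbol{\Sigma} = E\left[(\mathbf{x}-\mathbf{m})(\mathbf{x}-\mathbf{m})^T\right], \qquad \sigma_{ij} = E\left[(x_i-\mu_i)(x_j-\mu_j)\right] \tag{39}$$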
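Usually the joint probability density function $p(\mathbf{x})$ of the random vector $\mathbf{x}$ is unknown. In this case, the mean vector $\mathbf{m}$ and covariance matrix $\boldsymbol{\Sigma}$ of $\mathbf{x}$ can be estimated by the method of maximum likelihood estimation (MLE) based on a set of $N$ observed data samples $\{\mathbf{x}_1,\dots,\mathbf{x}_N\}$:
$$\hat{\mathbf{m}} = \frac{1}{N}\sum_{n=1}^N \mathbf{x}_n, \qquad \hat{\boldsymbol{\Sigma}} = \frac{1}{N}\sum_{n=1}^N (\mathbf{x}_n-\hat{\mathbf{m}})(\mathbf{x}_n-\hat{\mathbf{m}})^T \tag{40}$$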
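The maximum likelihood estimates of the mean vector and covariance matrix are the sample mean and the sample covariance normalized by 1/N (not 1/(N-1)). A minimal NumPy sketch of this estimate follows; the function name `mle_mean_cov`, the row-per-sample layout of `X`, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def mle_mean_cov(X):
    """Return the MLE mean vector and covariance matrix.

    X is assumed to be an (N, d) array with one sample per row.
    """
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    m_hat = X.mean(axis=0)           # sample mean, shape (d,)
    Xc = X - m_hat                   # centered samples
    Sigma_hat = Xc.T @ Xc / N        # 1/N normalization (MLE), shape (d, d)
    return m_hat, Sigma_hat

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
m_hat, Sigma_hat = mle_mean_cov(X)
```

The result should agree with `np.cov(X, rowvar=False, bias=True)`, which applies the same 1/N normalization.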
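Note that the rank of the estimated covariance matrix $\hat{\boldsymbol{\Sigma}}$ is at most $N-1$. It is a sum of $N$ rank-one terms, one for each of the $N$ samples in the dataset $\{\mathbf{x}_1,\dots,\mathbf{x}_N\}$ (assumed to be independent), and the centered samples satisfy the additional constraint
$$\sum_{n=1}^N (\mathbf{x}_n-\hat{\mathbf{m}}) = \mathbf{0} \tag{41}$$
where $\hat{\mathbf{m}}$ is the estimated mean vector, so that only $N-1$ of the centered samples can be linearly independent.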
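The rank bound can be checked numerically when there are fewer samples than dimensions. The sketch below uses hypothetical sizes (5 samples in 10 dimensions) and synthetic Gaussian data; the explicit tolerance passed to `np.linalg.matrix_rank` is also an assumption chosen to suppress round-off.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 5, 10                          # hypothetical: fewer samples than dimensions
X = rng.normal(size=(N, d))           # one sample per row

m_hat = X.mean(axis=0)
Xc = X - m_hat                        # centered samples; their sum is the zero vector
Sigma_hat = Xc.T @ Xc / N             # d x d estimated covariance matrix

# Numerical rank, ignoring singular values below the tolerance
rank = np.linalg.matrix_rank(Sigma_hat, tol=1e-10)
print("rank:", rank, "upper bound N-1:", N - 1)
```

Although the estimated covariance matrix is 10 by 10, its rank cannot exceed 4 here, so it is singular and cannot be inverted directly.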
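The variance $\sigma_i^2$ can be treated as the dynamic energy contained in $x_i$, or the amount of information carried by $x_i$, while the trace $\operatorname{tr}\boldsymbol{\Sigma}=\sum_{i=1}^d \sigma_i^2$ can be considered as the total amount of dynamic energy contained in $\mathbf{x}$. Also, the covariance $\sigma_{ij}$ can be considered as the mutual energy, a measure of the correlation between $x_i$ and $x_j$. By normalizing the covariance $\sigma_{ij}$, we get the correlation coefficient between $x_i$ and $x_j$:
$$r_{ij} = \frac{\sigma_{ij}}{\sigma_i\,\sigma_j} = \frac{\sigma_{ij}}{\sqrt{\sigma_i^2\,\sigma_j^2}} \tag{42}$$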
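As a small worked illustration, the correlation coefficients can be obtained from a covariance matrix by dividing each entry by the product of the two corresponding standard deviations, and the total energy is the trace. The function name `correlation_from_cov` and the 2 by 2 matrix below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def correlation_from_cov(Sigma):
    """Normalize a covariance matrix into a correlation matrix."""
    Sigma = np.asarray(Sigma, dtype=float)
    sigma = np.sqrt(np.diag(Sigma))          # standard deviations sigma_i
    return Sigma / np.outer(sigma, sigma)    # r_ij = sigma_ij / (sigma_i * sigma_j)

# Hypothetical covariance matrix: variances 4 and 1, covariance 1.2
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

R = correlation_from_cov(Sigma)
total_energy = np.trace(Sigma)               # sum of the variances: 4 + 1
```

Here the standard deviations are 2 and 1, so the off-diagonal correlation coefficient is 1.2 / (2 * 1) = 0.6, and the diagonal of `R` is all ones.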
Examples
Six normally distributed 2-D datasets are generated with zero mean and the following covariance matrices:
(43) |
These data points are plotted below, together with the correlation coefficient on top of each dataset.
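A comparable experiment can be sketched in NumPy: draw zero-mean 2-D Gaussian samples for a given covariance matrix and compare the correlation coefficient implied by the matrix with the one estimated from the samples. The three covariance matrices, the sample size, and the random seed below are hypothetical stand-ins, not the matrices used for the datasets in this section.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical covariance matrices (uncorrelated, positive, negative correlation)
covs = [
    np.array([[1.0, 0.0], [0.0, 1.0]]),
    np.array([[1.0, 0.8], [0.8, 1.0]]),
    np.array([[1.0, -0.8], [-0.8, 1.0]]),
]

results = []
for S in covs:
    # 5000 zero-mean samples, one per row
    X = rng.multivariate_normal(mean=np.zeros(2), cov=S, size=5000)
    r_true = S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])   # from the covariance matrix
    r_est = np.corrcoef(X, rowvar=False)[0, 1]      # estimated from the samples
    results.append((r_true, r_est))

for r_true, r_est in results:
    print(f"r from covariance: {r_true:+.2f}   r from samples: {r_est:+.2f}")
```

With this many samples the estimated coefficient should land close to the value implied by the covariance matrix, and its sign indicates the orientation of the elongated point cloud in a scatter plot.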