The method of factor analysis (FA) models a set of observed manifest variables $\mathbf{x} \in \mathbb{R}^d$ as a linear combination of a set of unobserved hidden latent variables, or common factors, $\mathbf{z} \in \mathbb{R}^p$ with $p < d$, to explain and reveal the variability and dependency among the observed variables, which are typically correlated, in terms of the latent variables, which are assumed to be independent and therefore uncorrelated. FA can therefore be considered a method for dimensionality reduction.
Specifically, we assume each of the observed variables in $\mathbf{x}$ is a linear combination of the factors in $\mathbf{z}$:

$$x_i = \mu_i + \sum_{j=1}^p \lambda_{ij} z_j + \varepsilon_i, \qquad i = 1, \dots, d \qquad (121)$$

or in matrix form

$$\mathbf{x} = \boldsymbol{\mu} + \boldsymbol{\Lambda} \mathbf{z} + \boldsymbol{\varepsilon} \qquad (122)$$

where $\boldsymbol{\varepsilon}$ is a noise term and $\boldsymbol{\Lambda} \in \mathbb{R}^{d \times p}$ is the loading matrix:

$$\boldsymbol{\Lambda} = \begin{bmatrix} \lambda_{11} & \cdots & \lambda_{1p} \\ \vdots & \ddots & \vdots \\ \lambda_{d1} & \cdots & \lambda_{dp} \end{bmatrix} \qquad (123)$$
However, as $\mathbf{z}$ is unavailable, it needs to be estimated together with the model parameters $\boldsymbol{\theta} = (\boldsymbol{\Lambda}, \boldsymbol{\Psi})$ at the same time, based on the given dataset $\mathbf{X} = [\mathbf{x}_1, \dots, \mathbf{x}_N]$, typically with $N \gg d$.
This can be done by the general method of expectation-maximization (EM), an iterative algorithm for finding the maximum likelihood (ML) or maximum a posteriori (MAP) estimate of some model parameters, widely used in machine learning.
Specifically, we treat both $\mathbf{x}$ and $\mathbf{z}$ as random vectors, and make the following assumptions:

$$\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}) \qquad (124)$$

$$\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Psi}), \qquad \boldsymbol{\Psi} = \mathrm{diag}(\psi_1, \dots, \psi_d) \qquad (126)$$

$$\mathrm{Cov}(\mathbf{z}, \boldsymbol{\varepsilon}) = \mathbf{0} \qquad (128)$$
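As a quick sanity check of the generative model under these assumptions, the following sketch (the dimensions and parameter values are arbitrary, chosen only for illustration) draws samples of $\mathbf{x} = \boldsymbol{\mu} + \boldsymbol{\Lambda}\mathbf{z} + \boldsymbol{\varepsilon}$ and verifies that the sample covariance approaches $\boldsymbol{\Lambda}\boldsymbol{\Lambda}^T + \boldsymbol{\Psi}$:

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, N = 5, 2, 100_000                      # arbitrary illustrative sizes

Lam = rng.normal(size=(d, p))                # loading matrix Lambda
psi = rng.uniform(0.5, 1.5, size=d)          # diagonal entries of Psi
mu = np.zeros(d)

# x = mu + Lam z + eps,  with z ~ N(0, I) and eps ~ N(0, Psi)
z = rng.normal(size=(N, p))
eps = rng.normal(size=(N, d)) * np.sqrt(psi)
x = mu + z @ Lam.T + eps

# The sample covariance of x should approach Lam Lam^T + Psi
emp_cov = np.cov(x, rowvar=False)
model_cov = Lam @ Lam.T + np.diag(psi)
max_gap = np.max(np.abs(emp_cov - model_cov))
```

With $N = 10^5$ samples the maximum entrywise gap between the two matrices is small (well below 0.1 for these values).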
Based on the assumptions above, we desire to find the conditional pdf $p(\mathbf{z} \mid \mathbf{x})$ of the latent variable $\mathbf{z}$ given the observed variable $\mathbf{x}$. To do so, we first find the pdf of $\mathbf{x}$, which, as a linear combination of the two normally distributed random vectors $\mathbf{z}$ and $\boldsymbol{\varepsilon}$, is also normally distributed with $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}_x)$, where

$$\boldsymbol{\Sigma}_x = \mathrm{Cov}(\mathbf{x}) = E\left[(\boldsymbol{\Lambda}\mathbf{z} + \boldsymbol{\varepsilon})(\boldsymbol{\Lambda}\mathbf{z} + \boldsymbol{\varepsilon})^T\right] = \boldsymbol{\Lambda}\boldsymbol{\Lambda}^T + \boldsymbol{\Psi} \qquad (129)$$

The cross-covariance between $\mathbf{x}$ and $\mathbf{z}$ is

$$\mathrm{Cov}(\mathbf{x}, \mathbf{z}) = E\left[(\boldsymbol{\Lambda}\mathbf{z} + \boldsymbol{\varepsilon})\mathbf{z}^T\right] = \boldsymbol{\Lambda} \qquad (130)$$

so $\mathbf{x}$ and $\mathbf{z}$ are jointly normal:

$$\begin{bmatrix} \mathbf{x} \\ \mathbf{z} \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \boldsymbol{\mu} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} \boldsymbol{\Lambda}\boldsymbol{\Lambda}^T + \boldsymbol{\Psi} & \boldsymbol{\Lambda} \\ \boldsymbol{\Lambda}^T & \mathbf{I} \end{bmatrix} \right) \qquad (132)$$

Conditioning this joint Gaussian on $\mathbf{x}$ gives

$$p(\mathbf{z} \mid \mathbf{x}) = \mathcal{N}(\mathbf{z} \mid \mathbf{m}, \mathbf{V}) \qquad (133)$$

with

$$\mathbf{m} = E[\mathbf{z} \mid \mathbf{x}] = \boldsymbol{\Lambda}^T (\boldsymbol{\Lambda}\boldsymbol{\Lambda}^T + \boldsymbol{\Psi})^{-1} (\mathbf{x} - \boldsymbol{\mu}) \qquad (134)$$

$$\mathbf{V} = \mathrm{Cov}(\mathbf{z} \mid \mathbf{x}) = \mathbf{I} - \boldsymbol{\Lambda}^T (\boldsymbol{\Lambda}\boldsymbol{\Lambda}^T + \boldsymbol{\Psi})^{-1} \boldsymbol{\Lambda} \qquad (135)$$
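These conditional moments can be illustrated numerically. Since the posterior mean $E[\mathbf{z} \mid \mathbf{x}]$ is linear in $\mathbf{x} - \boldsymbol{\mu}$, a least-squares regression of sampled $\mathbf{z}$ on $\mathbf{x} - \boldsymbol{\mu}$ should recover the matrix $\boldsymbol{\Lambda}^T \boldsymbol{\Sigma}_x^{-1}$; a sketch with arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)
d, p, N = 4, 2, 200_000                  # arbitrary illustrative sizes

Lam = rng.normal(size=(d, p))
psi = rng.uniform(0.5, 1.5, size=d)
mu = np.zeros(d)

z = rng.normal(size=(N, p))
x = mu + z @ Lam.T + rng.normal(size=(N, d)) * np.sqrt(psi)

Sigma = Lam @ Lam.T + np.diag(psi)       # Cov(x) = Lam Lam^T + Psi
W = Lam.T @ np.linalg.inv(Sigma)         # posterior mean: E[z|x] = W (x - mu)
V = np.eye(p) - W @ Lam                  # posterior covariance Cov(z|x)

# The best linear predictor of z from (x - mu) is W, so least squares
# on the samples should recover W up to Monte Carlo error
W_hat, *_ = np.linalg.lstsq(x - mu, z, rcond=None)   # (d, p), estimates W.T
gap = np.max(np.abs(W_hat - W.T))
```

The regression coefficients match $\boldsymbol{\Lambda}^T \boldsymbol{\Sigma}_x^{-1}$ closely, and $\mathbf{V}$ comes out positive definite, as a conditional covariance must be.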
The computational complexity for the inversion of the $d \times d$ matrix $\boldsymbol{\Lambda}\boldsymbol{\Lambda}^T + \boldsymbol{\Psi}$ is $O(d^3)$. However, by applying the Woodbury matrix identity

$$(\boldsymbol{\Lambda}\boldsymbol{\Lambda}^T + \boldsymbol{\Psi})^{-1} = \boldsymbol{\Psi}^{-1} - \boldsymbol{\Psi}^{-1}\boldsymbol{\Lambda}\left(\mathbf{I} + \boldsymbol{\Lambda}^T \boldsymbol{\Psi}^{-1} \boldsymbol{\Lambda}\right)^{-1} \boldsymbol{\Lambda}^T \boldsymbol{\Psi}^{-1}$$

only the $p \times p$ matrix $\mathbf{I} + \boldsymbol{\Lambda}^T \boldsymbol{\Psi}^{-1} \boldsymbol{\Lambda}$ needs to be inverted (the diagonal $\boldsymbol{\Psi}$ is trivially inverted), reducing the complexity to $O(p^3)$ with $p < d$.
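The identity is easy to verify numerically; a minimal sketch with arbitrary dimensions, where only a $p \times p$ matrix is inverted on the Woodbury side:

```python
import numpy as np

rng = np.random.default_rng(2)
d, p = 500, 5                             # a p x p inverse replaces a d x d one

Lam = rng.normal(size=(d, p))
psi = rng.uniform(0.5, 1.5, size=d)       # diagonal of Psi

# Direct O(d^3) inversion of the d x d covariance
direct = np.linalg.inv(Lam @ Lam.T + np.diag(psi))

# Woodbury: Psi^{-1} - Psi^{-1} Lam (I + Lam^T Psi^{-1} Lam)^{-1} Lam^T Psi^{-1}
Psi_inv = np.diag(1.0 / psi)              # Psi is diagonal, trivial to invert
core = np.linalg.inv(np.eye(p) + Lam.T @ Psi_inv @ Lam)  # only p x p, O(p^3)
woodbury = Psi_inv - Psi_inv @ Lam @ core @ Lam.T @ Psi_inv
```

The two results agree to machine precision.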
The model parameter $\boldsymbol{\theta} = (\boldsymbol{\Lambda}, \boldsymbol{\Psi})$ can now be estimated based on the given dataset $\mathbf{X}$ by the EM algorithm, an iterative process of the following two steps:
E-step: Find the expectation of the log likelihood function of the model parameter $\boldsymbol{\theta}$, to be maximized in the following M-step.
Find the likelihood function of $\boldsymbol{\theta}$ based on the observed dataset $\mathbf{X}$ (all $N$ samples assumed to be i.i.d.):

$$L(\boldsymbol{\theta}) = \prod_{n=1}^N p(\mathbf{x}_n, \mathbf{z}_n \mid \boldsymbol{\theta}) = \prod_{n=1}^N p(\mathbf{x}_n \mid \mathbf{z}_n, \boldsymbol{\theta}) \, p(\mathbf{z}_n) \qquad (140)$$

and its log

$$\ell(\boldsymbol{\theta}) = \sum_{n=1}^N \left[ \log p(\mathbf{x}_n \mid \mathbf{z}_n, \boldsymbol{\theta}) + \log p(\mathbf{z}_n) \right] \qquad (141)$$
We then find the expectation of the log likelihood function, denoted by $Q(\boldsymbol{\theta})$, with respect to the latent variable $\mathbf{z}$ based on Eq. (136):

$$Q(\boldsymbol{\theta}) = E_{\mathbf{z} \mid \mathbf{x}}\left[\ell(\boldsymbol{\theta})\right] = \sum_{n=1}^N E\left[ \log p(\mathbf{x}_n \mid \mathbf{z}_n, \boldsymbol{\theta}) + \log p(\mathbf{z}_n) \right] \qquad (143)$$
M-step: Find the optimal model parameter $\boldsymbol{\theta}$ that maximizes the expectation $Q(\boldsymbol{\theta})$ of the log likelihood obtained in the E-step.
We set to zero the derivative of $Q(\boldsymbol{\theta})$ with respect to each of the two parameters in $\boldsymbol{\theta} = (\boldsymbol{\Lambda}, \boldsymbol{\Psi})$ (see derivative with respect to matrix):

$$\frac{\partial Q(\boldsymbol{\theta})}{\partial \boldsymbol{\Lambda}} = \mathbf{0}, \qquad \frac{\partial Q(\boldsymbol{\theta})}{\partial \boldsymbol{\Psi}} = \mathbf{0} \qquad (144)$$

Solving the resulting equations yields the closed-form updates

$$\boldsymbol{\Lambda}_{new} = \left[ \sum_{n=1}^N (\mathbf{x}_n - \boldsymbol{\mu}) \, E[\mathbf{z}_n \mid \mathbf{x}_n]^T \right] \left[ \sum_{n=1}^N E[\mathbf{z}_n \mathbf{z}_n^T \mid \mathbf{x}_n] \right]^{-1}$$

$$\boldsymbol{\Psi}_{new} = \frac{1}{N} \, \mathrm{diag}\left\{ \sum_{n=1}^N (\mathbf{x}_n - \boldsymbol{\mu})(\mathbf{x}_n - \boldsymbol{\mu})^T - \boldsymbol{\Lambda}_{new} \sum_{n=1}^N E[\mathbf{z}_n \mid \mathbf{x}_n] (\mathbf{x}_n - \boldsymbol{\mu})^T \right\}$$

The second equation is due to the fact that $\boldsymbol{\Psi}$ is assumed to be diagonal.
Here $E[\mathbf{z}_n \mid \mathbf{x}_n]$ can be considered as the estimate of the latent variable $\mathbf{z}_n$, while $\mathrm{Cov}(\mathbf{z}_n \mid \mathbf{x}_n)$ represents the uncertainty of the estimate. The second-order moment needed above follows from the conditional mean and covariance:

$$E[\mathbf{z}_n \mathbf{z}_n^T \mid \mathbf{x}_n] = \mathrm{Cov}(\mathbf{z}_n \mid \mathbf{x}_n) + E[\mathbf{z}_n \mid \mathbf{x}_n] \, E[\mathbf{z}_n \mid \mathbf{x}_n]^T \qquad (147)$$
In summary, here are the steps of the EM method:

E-step: Find $E[\mathbf{z}_n \mid \mathbf{x}_n]$ and $E[\mathbf{z}_n \mathbf{z}_n^T \mid \mathbf{x}_n]$ in Eq. (146), based on the conditional pdf $p(\mathbf{z} \mid \mathbf{x})$ in Eq. (137), which in turn is based on the current estimate of $\boldsymbol{\theta} = (\boldsymbol{\Lambda}, \boldsymbol{\Psi})$;

M-step: Find $\boldsymbol{\Lambda}$ and $\boldsymbol{\Psi}$ in Eqs. (145) and (149), based on $E[\mathbf{z}_n \mid \mathbf{x}_n]$ and $E[\mathbf{z}_n \mathbf{z}_n^T \mid \mathbf{x}_n]$;
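The two steps above can be sketched as a minimal (unoptimized) implementation. This is only an illustration under the assumptions stated earlier: the function name and the initialization are my own choices, and $\boldsymbol{\mu}$ is simply set to the sample mean.

```python
import numpy as np

def factor_analysis_em(X, p, n_iter=500, seed=0):
    """Minimal EM for factor analysis.  X: (N, d) data; p: number of factors.
    Returns mu (d,), loading matrix Lam (d, p), noise variances psi (d,)."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu                                  # centered data
    S = Xc.T @ Xc                                # sum_n (x_n - mu)(x_n - mu)^T
    Lam = rng.normal(scale=0.1, size=(d, p))     # arbitrary initialization
    psi = Xc.var(axis=0)                         # start Psi at sample variances

    for _ in range(n_iter):
        # E-step: posterior moments of z_n given x_n under current (Lam, psi)
        W = Lam.T @ np.linalg.inv(Lam @ Lam.T + np.diag(psi))
        Ez = Xc @ W.T                            # row n holds E[z_n | x_n]
        V = np.eye(p) - W @ Lam                  # Cov(z_n | x_n), same for all n
        Ezz = N * V + Ez.T @ Ez                  # sum_n E[z_n z_n^T | x_n]

        # M-step: closed-form updates for Lam and the diagonal Psi
        Lam = (Xc.T @ Ez) @ np.linalg.inv(Ezz)
        psi = np.diag(S - Lam @ (Ez.T @ Xc)) / N
    return mu, Lam, psi

# Usage: fit synthetic data.  Lam is identifiable only up to rotation,
# so compare the implied covariance Lam Lam^T + Psi instead of Lam itself.
rng = np.random.default_rng(3)
d, p, N = 6, 2, 50_000
Lam_true = rng.normal(size=(d, p))
psi_true = rng.uniform(0.5, 1.5, size=d)
X = rng.normal(size=(N, p)) @ Lam_true.T + rng.normal(size=(N, d)) * np.sqrt(psi_true)

mu, Lam, psi = factor_analysis_em(X, p)
err = np.max(np.abs((Lam @ Lam.T + np.diag(psi))
                    - (Lam_true @ Lam_true.T + np.diag(psi_true))))
```

Note that the posterior covariance $\mathbf{V}$ is shared by all samples, so it is computed once per iteration; the per-sample means are obtained with a single matrix product.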