The EM method for the Gaussian mixture model can be generalized to estimate the parameters $\theta$ in the joint distribution $p(\mathbf{x},z|\theta)$ of a generic probabilistic model, where $z$ is a hidden variable that takes any of the values $z_1,\dots,z_K$ with the corresponding probabilities $P(z=z_1),\dots,P(z=z_K)$, satisfying $\sum_{k=1}^K P(z=z_k)=1$.
By marginalizing over the hidden variable $z$ we get (similar to how we get Eq. (214)):

$$p(\mathbf{x}|\theta)=\sum_{k=1}^K p(\mathbf{x},z=z_k|\theta)
=\sum_{k=1}^K P(z=z_k)\,p(\mathbf{x}|z=z_k,\theta) \tag{268}$$
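For example, in the Gaussian mixture model discussed earlier, the hidden variable $z=z_k$ indicates which of the $K$ Gaussian components generated $\mathbf{x}$. Writing the component priors as $\pi_k=P(z=z_k)$ (notation assumed here, which may differ from that of the earlier section), Eq. (268) takes the familiar form

$$p(\mathbf{x}|\theta)=\sum_{k=1}^K \pi_k\,\mathcal{N}(\mathbf{x}|\boldsymbol{\mu}_k,\boldsymbol{\Sigma}_k),
\qquad \theta=\{\pi_k,\boldsymbol{\mu}_k,\boldsymbol{\Sigma}_k,\;k=1,\dots,K\}$$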
Based on the training set $\mathcal{X}=\{\mathbf{x}_1,\dots,\mathbf{x}_N\}$ containing $N$ i.i.d. samples, we can further get the likelihood function of the parameter $\theta$ (similar to Eq. (219)):

$$L(\theta)=p(\mathcal{X}|\theta)=\prod_{n=1}^N p(\mathbf{x}_n|\theta) \tag{269}$$
and the log likelihood, where $Q_n(z_k)$ is any distribution over the $K$ values of $z$ satisfying $Q_n(z_k)\ge 0$ and $\sum_{k=1}^K Q_n(z_k)=1$:

$$l(\theta)=\log L(\theta)=\sum_{n=1}^N\log\sum_{k=1}^K p(\mathbf{x}_n,z=z_k|\theta)
=\sum_{n=1}^N\log\sum_{k=1}^K Q_n(z_k)\,\frac{p(\mathbf{x}_n,z=z_k|\theta)}{Q_n(z_k)}
\ge\sum_{n=1}^N\sum_{k=1}^K Q_n(z_k)\log\frac{p(\mathbf{x}_n,z=z_k|\theta)}{Q_n(z_k)} \tag{270}$$

The last step is due to Jensen's inequality $\log\mathrm{E}[u]\ge\mathrm{E}[\log u]$ for the concave logarithmic function $\log(\cdot)$.
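As a quick numerical sanity check of the inequality (a made-up two-point example), take weights $Q=(0.5,\,0.5)$ and ratios $u=(1,\,4)$; then

$$\log(0.5\cdot 1+0.5\cdot 4)=\log 2.5\approx 0.916
\;\ge\; 0.5\log 1+0.5\log 4\approx 0.693,$$

so the logarithm of the average is indeed no less than the average of the logarithms.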
The final expression is a lower bound of the log likelihood function, which becomes tight when the equality holds, i.e., when the expression inside the logarithm is not a random variable but some constant $c$:

$$\frac{p(\mathbf{x}_n,z=z_k|\theta)}{Q_n(z_k)}=c \qquad\text{for all }k=1,\dots,K$$

Such a $Q_n$ can be found by noting that $Q_n(z_k)=p(\mathbf{x}_n,z=z_k|\theta)/c$ must sum to one over $k$, i.e.,

$$c=\sum_{k=1}^K p(\mathbf{x}_n,z=z_k|\theta)=p(\mathbf{x}_n|\theta) \tag{271}$$
Now we have

$$Q_n(z_k)=\frac{p(\mathbf{x}_n,z=z_k|\theta)}{c}
=\frac{p(\mathbf{x}_n,z=z_k|\theta)}{\sum_{k'=1}^K p(\mathbf{x}_n,z=z_{k'}|\theta)}
=\frac{p(\mathbf{x}_n,z=z_k|\theta)}{p(\mathbf{x}_n|\theta)}
=p(z=z_k|\mathbf{x}_n,\theta) \tag{272}$$
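As a toy numerical illustration of Eq. (272) (hypothetical numbers), suppose for some sample $\mathbf{x}_n$ with $K=2$ we have $p(\mathbf{x}_n,z=z_1|\theta)=0.03$ and $p(\mathbf{x}_n,z=z_2|\theta)=0.01$; then

$$Q_n(z_1)=\frac{0.03}{0.03+0.01}=0.75,\qquad Q_n(z_2)=\frac{0.01}{0.04}=0.25,$$

i.e., the posterior simply renormalizes the joint probabilities over the possible values of the hidden variable.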
We see that if we set $Q_n(z_k)$ to be the same as the posterior probability $p(z=z_k|\mathbf{x}_n,\theta)$ of $z$ given $\mathbf{x}_n$, then the right hand side of Eq. (270) is the same as the log likelihood $l(\theta)$, because the ratio inside the logarithm becomes the constant $p(\mathbf{x}_n|\theta)$ for every $k$. The optimal parameters $\theta$ in $p(\mathbf{x},z|\theta)$ that maximize this log likelihood can be found by iterating the following two steps, based on some initial guess of $\theta$:
- The E-step:
  $$Q_n(z_k)=p(z=z_k|\mathbf{x}_n,\theta)
  =\frac{p(\mathbf{x}_n,z=z_k|\theta)}{\sum_{k'=1}^K p(\mathbf{x}_n,z=z_{k'}|\theta)},
  \qquad n=1,\dots,N,\quad k=1,\dots,K \tag{273}$$
  We note that this is the same as Eq. (221) for the specific Gaussian mixture model.
- The M-step: with $Q_n(z_k)$ fixed, update $\theta$ by maximizing the lower bound in Eq. (270):
  $$\theta=\arg\max_\theta\sum_{n=1}^N\sum_{k=1}^K Q_n(z_k)\log\frac{p(\mathbf{x}_n,z=z_k|\theta)}{Q_n(z_k)} \tag{274}$$
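For the Gaussian mixture model in particular, the maximization in Eq. (274) over $\theta=\{\pi_k,\boldsymbol{\mu}_k,\boldsymbol{\Sigma}_k\}$ has the well-known closed-form solution (sketched here in assumed notation, which may differ from the earlier GMM section):

$$\pi_k=\frac{1}{N}\sum_{n=1}^N Q_n(z_k),\qquad
\boldsymbol{\mu}_k=\frac{\sum_{n=1}^N Q_n(z_k)\,\mathbf{x}_n}{\sum_{n=1}^N Q_n(z_k)},\qquad
\boldsymbol{\Sigma}_k=\frac{\sum_{n=1}^N Q_n(z_k)\,(\mathbf{x}_n-\boldsymbol{\mu}_k)(\mathbf{x}_n-\boldsymbol{\mu}_k)^T}{\sum_{n=1}^N Q_n(z_k)}$$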
As $Q_n(z_k)$ in the E-step depends on $\theta$, which in turn is found in the M-step depending on $Q_n(z_k)$, these two steps need to be carried out alternately and iteratively until the final convergence of $\theta$.
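To make the alternation concrete, below is a minimal, self-contained Python sketch of the EM loop for a one-dimensional Gaussian mixture (the function name, toy data and initialization are illustrative assumptions, not from the text): the E-step computes the posterior $Q_n(z_k)$ of Eq. (273), the M-step applies the closed-form Gaussian-mixture maximizers of the lower bound in Eq. (274), and the two steps alternate until the log likelihood stops changing.

```python
# Minimal EM sketch for a 1-D Gaussian mixture with K components
# (illustrative only; instantiates Eq. (273) and Eq. (274)).
import numpy as np

def em_gmm_1d(x, K=2, iters=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    N = len(x)
    # Initial guess of theta = (pi_k, mu_k, var_k)
    pi = np.full(K, 1.0 / K)
    mu = rng.choice(x, K, replace=False)
    var = np.full(K, np.var(x))
    prev_ll = -np.inf
    for _ in range(iters):
        # E-step (Eq. 273): Q[n, k] = p(z = z_k | x_n, theta),
        # i.e., the joint p(x_n, z_k | theta) renormalized over k
        joint = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
                / np.sqrt(2 * np.pi * var)
        Q = joint / joint.sum(axis=1, keepdims=True)
        # M-step (Eq. 274): closed-form maximizers for the Gaussian mixture
        Nk = Q.sum(axis=0)
        pi = Nk / N
        mu = (Q * x[:, None]).sum(axis=0) / Nk
        var = (Q * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        # Alternate the two steps until the log likelihood converges
        ll = np.log(joint.sum(axis=1)).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi, mu, var

# Toy usage: samples drawn from two well-separated Gaussians
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
print(em_gmm_1d(x, K=2))
```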