The EM method for the Gaussian mixture model can be generalized to estimate the parameters in ${\bf\theta}$ of a generic probabilistic model $p({\bf x}, z\,\vert\,{\bf\theta})$, where $z$ is a hidden variable that takes any of the $K$ values $z_1,\dots,z_K$ with the corresponding probabilities $P_1,\dots,P_K$, satisfying $\sum_{k=1}^K P_k=1$. By marginalizing the hidden variable $z$ we get (similar to how we get Eq. (214)):

$$
p({\bf x}\,\vert\,{\bf\theta}) = \sum_{k=1}^K p({\bf x}, z_k\,\vert\,{\bf\theta})
= \sum_{k=1}^K P_k\, p({\bf x}\,\vert\,z_k, {\bf\theta})
\qquad (268)
$$
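For illustration, the marginalization in Eq. (268) can be evaluated numerically for a hypothetical two-component model; the weights and component densities in this sketch are arbitrary assumptions, not taken from the text.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical hidden variable z with K = 2 values and probabilities P_1, P_2
P = np.array([0.3, 0.7])                         # sum to 1
components = [norm(loc=-1.0, scale=1.0),         # p(x | z_1, theta), assumed
              norm(loc=2.0, scale=0.5)]          # p(x | z_2, theta), assumed

x = 0.5
# Eq. (268): p(x | theta) = sum_k p(x, z_k | theta) = sum_k P_k p(x | z_k, theta)
p_x = sum(P_k * comp.pdf(x) for P_k, comp in zip(P, components))
print(p_x)
```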
Based on the training set ${\bf X}=\{{\bf x}_1,\dots,{\bf x}_N\}$ containing $N$ i.i.d. samples, we can further get the likelihood function of the parameter ${\bf\theta}$ (similar to Eq. (219)):

$$
L({\bf\theta}) = p({\bf X}\,\vert\,{\bf\theta}) = \prod_{n=1}^N p({\bf x}_n\,\vert\,{\bf\theta})
\qquad (269)
$$
and the log likelihood

$$
l({\bf\theta}) = \log L({\bf\theta}) = \sum_{n=1}^N \log p({\bf x}_n\,\vert\,{\bf\theta})
= \sum_{n=1}^N \log \sum_{k=1}^K P_k\left[\frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{P_k}\right]
\ge \sum_{n=1}^N \sum_{k=1}^K P_k \log\left[\frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{P_k}\right]
\qquad (270)
$$

The last step is due to Jensen's inequality for the concave logarithmic function $\log(\cdot)$.
The final expression is a lower bound of the log likelihood function, which becomes tight (i.e., the equality holds) if the expression inside the brackets is not a random variable but some constant $c$, i.e.,

$$
\frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{P_k} = c
\qquad \text{for all } k=1,\dots,K.
$$

This $P_k$ can be found by requiring the probabilities to sum to one:

$$
\sum_{k=1}^K P_k = \frac{1}{c}\sum_{k=1}^K p({\bf x}_n, z_k\,\vert\,{\bf\theta})
= \frac{p({\bf x}_n\,\vert\,{\bf\theta})}{c} = 1,
\qquad \text{so that} \quad c = p({\bf x}_n\,\vert\,{\bf\theta}),
$$

i.e.,

$$
P_k = \frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{c}
= \frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{p({\bf x}_n\,\vert\,{\bf\theta})}
= p(z_k\,\vert\,{\bf x}_n, {\bf\theta})
\qquad (271)
$$
Now we have

$$
\sum_{n=1}^N \sum_{k=1}^K P_k \log\left[\frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{P_k}\right]
= \sum_{n=1}^N \sum_{k=1}^K p(z_k\,\vert\,{\bf x}_n, {\bf\theta})
\log\left[\frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{p(z_k\,\vert\,{\bf x}_n, {\bf\theta})}\right]
= \sum_{n=1}^N \sum_{k=1}^K p(z_k\,\vert\,{\bf x}_n, {\bf\theta}) \log p({\bf x}_n\,\vert\,{\bf\theta})
= \sum_{n=1}^N \log p({\bf x}_n\,\vert\,{\bf\theta}) = l({\bf\theta})
\qquad (272)
$$
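To see the bound numerically, the following sketch (with made-up joint probabilities, assumed only for illustration) compares the per-sample lower bound of Eq. (270) against $\log p({\bf x}_n\vert{\bf\theta})$ for an arbitrary choice of $P_k$ and for the posterior choice of Eq. (271); only the latter makes the bound tight, as Eq. (272) states.

```python
import numpy as np

# Made-up joint probabilities p(x_n, z_k | theta) for a single sample and K = 3
p_joint = np.array([0.02, 0.10, 0.05])
log_px = np.log(p_joint.sum())                   # log p(x_n | theta)

def lower_bound(P):
    """One term of Eq. (270): sum_k P_k log[ p(x_n, z_k | theta) / P_k ]."""
    return np.sum(P * np.log(p_joint / P))

P_uniform = np.full(3, 1.0 / 3.0)                # an arbitrary distribution over z
P_posterior = p_joint / p_joint.sum()            # Eq. (271): p(z_k | x_n, theta)

print(lower_bound(P_uniform) <= log_px)               # True by Jensen's inequality
print(np.isclose(lower_bound(P_posterior), log_px))   # True: the bound is tight
```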
We see that if we set $P_k$ to be the same as the posterior probability $p(z_k\,\vert\,{\bf x}_n, {\bf\theta})$ of $z_k$, then the right hand side of Eq. (270) is the same as the log likelihood $l({\bf\theta})$. The optimal parameters in ${\bf\theta}$ that maximize this log likelihood can be found by iterating the following two steps, based on some initial guess of ${\bf\theta}$:
- The E-step:

  $$
  P_k = p(z_k\,\vert\,{\bf x}_n, {\bf\theta})
  = \frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{p({\bf x}_n\,\vert\,{\bf\theta})}
  = \frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{\sum_{k'=1}^K p({\bf x}_n, z_{k'}\,\vert\,{\bf\theta})}
  \qquad (273)
  $$

  We note that this is the same as Eq. (221) for the specific Gaussian mixture model.
- The M-step:

  $$
  {\bf\theta} = \argmax_{\bf\theta} \sum_{n=1}^N E_z\log\left[\frac{p({\bf x}_n, z\,\vert\,{\bf\theta})}{P_z}\right]
  = \argmax_{\bf\theta} \sum_{n=1}^N \sum_{k=1}^K P_k \log\left[\frac{p({\bf x}_n, z_k\,\vert\,{\bf\theta})}{P_k}\right]
  \qquad (274)
  $$

  A rough code sketch of both steps is given after this list.
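As a concrete instance of the two steps (an illustration added here, not the text's own code), the sketch below writes Eqs. (273) and (274) for a one-dimensional Gaussian mixture, where the M-step maximizer has a closed form; the function and variable names (`e_step`, `m_step`, `weights`, `means`, `stds`) are assumed for illustration.

```python
import numpy as np
from scipy.stats import norm

def e_step(x, weights, means, stds):
    """E-step, Eq. (273): posterior P_k = p(z_k | x_n, theta) for every sample x_n."""
    # joint p(x_n, z_k | theta) = P(z_k) p(x_n | z_k, theta); shape (N, K)
    joint = weights * norm.pdf(x[:, None], loc=means, scale=stds)
    return joint / joint.sum(axis=1, keepdims=True)

def m_step(x, post):
    """M-step, Eq. (274): maximize sum_n sum_k P_k log[p(x_n, z_k | theta) / P_k]
    over theta; for a 1-D Gaussian mixture the maximizer has this closed form."""
    Nk = post.sum(axis=0)                               # effective count per component
    weights = Nk / len(x)                               # mixture weights P(z_k)
    means = (post * x[:, None]).sum(axis=0) / Nk        # posterior-weighted means
    stds = np.sqrt((post * (x[:, None] - means) ** 2).sum(axis=0) / Nk)
    return weights, means, stds
```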
As $P_k$ in the E-step depends on ${\bf\theta}$, which in turn is found in the M-step based on $P_k$, these two steps need to be carried out alternately and iteratively until the final convergence of ${\bf\theta}$.
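A minimal sketch of this alternation, reusing the hypothetical `e_step` and `m_step` helpers from the previous block; the synthetic data, initial guess, and stopping tolerance are all assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up data from two Gaussians, just to exercise the alternating loop
x = np.concatenate([rng.normal(-1.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])

# Initial guess of theta = (weights, means, stds)
weights = np.array([0.5, 0.5])
means = np.array([0.0, 1.0])
stds = np.array([1.0, 1.0])

for _ in range(200):
    post = e_step(x, weights, means, stds)          # E-step, Eq. (273)
    new_params = m_step(x, post)                    # M-step, Eq. (274)
    converged = np.allclose(new_params[1], means, atol=1e-8)
    weights, means, stds = new_params
    if converged:                                   # stop once theta stops changing
        break

print(weights, means, stds)
```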