The method of probabilistic PCA (PPCA) can be considered
as a special case of factor analysis based on the model
${\bf x}={\bf W}{\bf z}+\boldsymbol\mu+{\bf e}$, when the random noise
${\bf e}$ is assumed to have an isotropic normal distribution with
covariance matrix ${\bf\Sigma}_e=\sigma^2{\bf I}$. More specifically,
if we further let $\sigma^2$ approach zero, each
iteration of the EM algorithm for the PPCA becomes extremely
simple, as we will see below.
Same as in Eqs. (125) and (127) for FA, here
we also have the normal distributions of both ${\bf x}$ and
${\bf z}\vert{\bf x}$:

$p({\bf x}) = N({\bf x};\;\boldsymbol\mu,\;{\bf C}),\qquad {\bf C}={\bf W}{\bf W}^T+\sigma^2{\bf I}$    (150)

$p({\bf z}\vert{\bf x}) = N\left({\bf z};\;{\bf M}^{-1}{\bf W}^T({\bf x}-\boldsymbol\mu),\;\sigma^2{\bf M}^{-1}\right),\qquad {\bf M}={\bf W}^T{\bf W}+\sigma^2{\bf I}$    (151)
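As a quick numerical sanity check, the marginal covariance ${\bf C}$ in Eq. (150) and the posterior moments in Eq. (151) can be computed directly. This is a minimal NumPy sketch (rather than Matlab); the dimensions, the noise variance, and the data are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, l, sigma2 = 5, 2, 0.1           # observed dim, latent dim, noise variance (illustrative)
W = rng.standard_normal((d, l))    # factor loading matrix

C = W @ W.T + sigma2 * np.eye(d)   # marginal covariance of x, Eq. (150)
M = W.T @ W + sigma2 * np.eye(l)   # the l-by-l matrix M in Eq. (151)

x = rng.standard_normal(d)         # one sample point (zero mean assumed)
Ez = np.linalg.solve(M, W.T @ x)   # posterior mean of z given x
cov_z = sigma2 * np.linalg.inv(M)  # posterior covariance of z given x
```

Note that all matrix inversions involve only the small $l\times l$ matrix ${\bf M}$, never the $d\times d$ covariance ${\bf C}$.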
Now Eqs. (137) through (139) are rewritten as:

$E[{\bf z}_n] = {\bf M}^{-1}{\bf W}^T({\bf x}_n-\bar{\bf x})$    (153)

$E[{\bf z}_n{\bf z}_n^T] = \sigma^2{\bf M}^{-1} + E[{\bf z}_n]\,E[{\bf z}_n]^T$    (154)
The two steps of the EM algorithm become:

E-step: compute $E[{\bf z}_n]$ and $E[{\bf z}_n{\bf z}_n^T]$ by Eqs. (153) and (154);

M-step: update ${\bf W}$ and $\sigma^2$ by

${\bf W}_{new} = \left[\sum_{n=1}^N({\bf x}_n-\bar{\bf x})E[{\bf z}_n]^T\right]
\left[\sum_{n=1}^N E[{\bf z}_n{\bf z}_n^T]\right]^{-1}$    (155)

$\sigma^2_{new} = \frac{1}{Nd}\sum_{n=1}^N\left[
\Vert{\bf x}_n-\bar{\bf x}\Vert^2
- 2E[{\bf z}_n]^T{\bf W}_{new}^T({\bf x}_n-\bar{\bf x})
+ tr\left(E[{\bf z}_n{\bf z}_n^T]{\bf W}_{new}^T{\bf W}_{new}\right)\right]$    (158)
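One full EM update can be sketched in a few lines of NumPy. This follows the standard PPCA updates of Eqs. (153) through (158); the data, dimensions, and initial values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d, l, N = 4, 2, 200
X = rng.standard_normal((d, N))           # data, one point per column (illustrative)
Xc = X - X.mean(axis=1, keepdims=True)    # centered data x_n - xbar

W = rng.standard_normal((d, l))           # current estimate of the loading matrix
sigma2 = 1.0                              # current estimate of the noise variance

# E-step: posterior moments of the latent variables, Eqs. (153)-(154)
M = W.T @ W + sigma2 * np.eye(l)
Minv = np.linalg.inv(M)
Ez = Minv @ W.T @ Xc                      # l x N, E[z_n] in columns
S_zz = N * sigma2 * Minv + Ez @ Ez.T      # sum over n of E[z_n z_n^T]

# M-step: update W and sigma^2, Eqs. (155) and (158)
W_new = (Xc @ Ez.T) @ np.linalg.inv(S_zz)
sigma2_new = (np.sum(Xc**2)
              - 2 * np.sum(Ez * (W_new.T @ Xc))
              + np.trace(S_zz @ W_new.T @ W_new)) / (N * d)
```

In practice the E-step and M-step above are simply repeated until the estimates stop changing.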
In summary, here are the steps in the EM algorithm for PPCA:

1. Initialize ${\bf W}$ and $\sigma^2$;
2. E-step: compute $E[{\bf z}_n]$ and $E[{\bf z}_n{\bf z}_n^T]$ by Eqs. (153) and (154);
3. M-step: update ${\bf W}$ and $\sigma^2$ by Eqs. (155) and (158);
4. Repeat steps 2 and 3 until convergence.
Specifically, if we further assume the data are centered so that
$\bar{\bf x}={\bf 0}$, and the latent variables in
${\bf z}$ are deterministic, i.e., their posterior covariance
$\sigma^2{\bf M}^{-1}$ in Eq. (151) vanishes. At the limit where
$\sigma^2\rightarrow 0$,
${\bf x}$ is simply a linear combination of the latent variables
in ${\bf z}$:

${\bf x} = {\bf W}{\bf z}$    (161)
Here are the two EM steps for the PPCA:

E-step: ${\bf Z} = ({\bf W}^T{\bf W})^{-1}{\bf W}^T{\bf X}$    (162)

M-step: ${\bf W}_{new} = {\bf X}{\bf Z}^T({\bf Z}{\bf Z}^T)^{-1}$    (163)
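One pass of these two steps can be verified directly: if the data actually lie in the column space of ${\bf W}$, the E-step of Eq. (162) recovers the latent variables exactly, and the M-step of Eq. (163) reproduces ${\bf W}$. A NumPy sketch with made-up dimensions and data:

```python
import numpy as np

rng = np.random.default_rng(2)
d, l, N = 6, 2, 50
W = rng.standard_normal((d, l))       # an assumed loading matrix
Z_true = rng.standard_normal((l, N))  # true latent variables
X = W @ Z_true                        # noise-free data: x = Wz exactly

# E-step, Eq. (162): Z = (W'W)^{-1} W' X
Z = np.linalg.solve(W.T @ W, W.T @ X)

# M-step, Eq. (163): W_new = X Z' (Z Z')^{-1}
W_new = X @ Z.T @ np.linalg.inv(Z @ Z.T)
```

With noise-free data both steps are exact; with real data they only decrease the squared error, which is why the iteration below is needed.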
However, we note that the model ${\bf x}_n={\bf W}{\bf z}_n$ is an
over-constrained linear system of $d$ equations (the dimensionality
of ${\bf x}_n$) but only $l<d$ unknowns in ${\bf z}_n$, and the
least-squares solution that minimizes the squared error
$\varepsilon=\Vert{\bf x}_n-{\bf W}{\bf z}_n\Vert^2$ can be found by the
left pseudo-inverse $({\bf W}^T{\bf W})^{-1}{\bf W}^T$ of ${\bf W}$:

$\frac{d\varepsilon}{d{\bf z}_n} = -2{\bf W}^T({\bf x}_n-{\bf W}{\bf z}_n) = {\bf 0}$    (164)

${\bf z}_n = ({\bf W}^T{\bf W})^{-1}{\bf W}^T{\bf x}_n$    (165)

or, for all $N$ data points in ${\bf X}=[{\bf x}_1,\cdots,{\bf x}_N]$ at once:

${\bf Z} = ({\bf W}^T{\bf W})^{-1}{\bf W}^T{\bf X}$    (166)
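The left pseudo-inverse solution of such an over-constrained system coincides with what a general-purpose least-squares solver returns, which gives an easy check. A NumPy sketch with made-up dimensions:

```python
import numpy as np

rng = np.random.default_rng(3)
d, l = 8, 3
W = rng.standard_normal((d, l))  # tall matrix: d equations, l < d unknowns
x = rng.standard_normal(d)

# least-squares solution via the left pseudo-inverse (W'W)^{-1} W'
z_pinv = np.linalg.solve(W.T @ W, W.T @ x)

# the same solution from the general-purpose solver
z_lstsq, *_ = np.linalg.lstsq(W, x, rcond=None)
```

In numerical practice one prefers `solve` or `lstsq` over forming the inverse explicitly, but the result is the same.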
Again, as the E-step and M-step are interdependent,
they need to be carried out iteratively:

E-step: ${\bf Z} = ({\bf W}^T{\bf W})^{-1}{\bf W}^T{\bf X}$    (167)

M-step: ${\bf W}_{new} = {\bf X}{\bf Z}^T({\bf Z}{\bf Z}^T)^{-1}$    (168)

$er = \Vert{\bf W}_{new}-{\bf W}\Vert$    (169)

The iteration terminates when the error $er$ falls below some tolerance.
When compared to the regular PCA method that finds all $d$
eigenvalues and the corresponding eigenvectors all at once by
solving the eigenequation
${\bf\Sigma}_x\boldsymbol\phi=\lambda\boldsymbol\phi$,
the PPCA method only finds the $l$
column vectors of matrix ${\bf W}$ as the basis vectors that span
a subspace with much reduced dimensionality but containing the
most essential information in the data. However, unlike in PCA,
the basis vectors in PPCA are not necessarily orthogonal to each
other.
The implementation of the PPCA is extremely simple, as shown
in the Matlab code below. Based on some initialized ${\bf W}$,
the EM iteration is composed of the E-step and M-step:

    er=inf;                   % initialize the error to enter the loop
    while er>tol
        Z=inv(W'*W)*W'*X;     % find Z given W in E-step
        Wnew=X*Z'*inv(Z*Z');  % find W given Z in M-step
        er=norm(Wnew-W);      % the LS error
        W=Wnew;
    end
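The same loop can be written in NumPy, which also makes it easy to verify that the converged direction agrees with the principal eigenvector found by regular PCA. The synthetic 2-D data below are made up for illustration, stretched along one axis so the principal direction is well defined:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 500
# synthetic 2-D data, widely spread along the first axis (illustrative)
X = np.diag([3.0, 0.5]) @ rng.standard_normal((2, N))

W = rng.standard_normal((2, 1))  # random initial direction
tol, er = 1e-9, np.inf
while er > tol:
    Z = np.linalg.solve(W.T @ W, W.T @ X)     # E-step
    W_new = X @ Z.T @ np.linalg.inv(Z @ Z.T)  # M-step
    er = np.linalg.norm(W_new - W)            # the LS error
    W = W_new

# compare with the principal eigenvector of the sample covariance
C = X @ X.T / N
vals, vecs = np.linalg.eigh(C)
v_pca = vecs[:, -1]                  # eigenvector of the largest eigenvalue
w = (W / np.linalg.norm(W)).ravel()  # normalized converged direction
```

For $l=1$ this iteration behaves much like the power method on the sample covariance, which is why only a handful of iterations are needed.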
Example 1:
The figure below shows the first few iterations of the PPCA based
on the EM method presented above, for a set of data points in a
two-dimensional space. The straight line in the plots indicates the
direction of ${\bf w}$, which is initially set along a random
direction, but quickly approaches the direction corresponding to
the principal component obtained by the PCA method, along which the
data points are most widely spread.
Example 2:
The figure below shows the first few iterations of the PPCA EM
algorithm for another set of data points. The straight lines in
the plots indicate the directions of the principal components found.
Example 3:
The figure below shows the visualization of the dataset for
the hand-written digits from 0 to 9, containing 2240 data
points in a 256-D space, but now linearly mapped to a 3-D
space spanned by the three factors found by the PPCA method.