
FastICA Algorithm

Summarizing the objective functions discussed above, we see a common goal of maximizing a function of the form $\sum_i E\{ G(y_i) \}$, where $y_i={\mathbf w}_i^T {\mathbf x}$ is a component of ${\mathbf y}={\mathbf Wx}$:

\begin{displaymath}\sum_i E\{ G(y_i) \}=\sum_i E\{ G( {\mathbf w}_i^T {\mathbf x} ) \} \end{displaymath}

where ${\mathbf w}_i^T$ is the $i$th row vector of the matrix ${\mathbf W}$. We first consider one particular component (with the subscript $i$ dropped). This is a constrained optimization problem, which can be solved by the method of Lagrange multipliers with the objective function

\begin{displaymath}O({\mathbf w})=E\{ G( {\mathbf w}^T {\mathbf x} ) \}
-\beta( {\mathbf w}^T{\mathbf w}-1)/2 \end{displaymath}

The second term is the constraint expressing the fact that the rows (and columns) of the orthogonal matrix ${\mathbf W}$ are normalized, i.e., ${\mathbf w}^T{\mathbf w}=1$. Setting the derivative of $O({\mathbf w})$ with respect to ${\mathbf w}$ to zero, we get

\begin{displaymath}F({\mathbf w})\stackrel{\triangle}{=}
\frac{\partial O({\mathbf w})}{\partial {\mathbf w}}
=E\{ {\mathbf x}g( {\mathbf w}^T {\mathbf x} ) \}-\beta {\mathbf w}=0 \end{displaymath}

where $g(z)=dG(z)/dz$ is the derivative of the function $G(z)$. This system of algebraic equations can be solved iteratively by the Newton-Raphson method:

\begin{displaymath}{\mathbf w} \Leftarrow {\mathbf w}-J_F^{-1}({\mathbf w}) F({\mathbf w}) \end{displaymath}

where $J_F({\mathbf w})$ is the Jacobian of function $F({\mathbf w})$:

\begin{displaymath}
J_F({\mathbf w})=\frac{\partial F}{\partial {\mathbf w}}=
E\{{\mathbf xx}^T g'({\mathbf w}^T{\mathbf x})\}-\beta{\mathbf I}
\end{displaymath}

As the data ${\mathbf x}$ has been whitened during preprocessing, $E\{{\mathbf xx}^T\}={\mathbf I}$, and the first term on the right can be approximated by separating the expectation of the two factors:

\begin{displaymath}E\{{\mathbf xx}^T g'({\mathbf w}^T{\mathbf x})\}
\approx E\{{\mathbf xx}^T\}\,E\{g'({\mathbf w}^T{\mathbf x})\}
=E\{g'({\mathbf w}^T{\mathbf x})\} {\mathbf I}
\end{displaymath}

and the Jacobian becomes diagonal:

\begin{displaymath}J_F({\mathbf w})=[E\{g'({\mathbf w}^T{\mathbf x})\}-\beta] {\mathbf I} \end{displaymath}

and the Newton-Raphson iteration becomes:

\begin{displaymath}{\mathbf w} \Leftarrow {\mathbf w}
-\frac{1}{E\{g'({\mathbf w}^T{\mathbf x})\}-\beta}
[E\{ {\mathbf x}g( {\mathbf w}^T {\mathbf x} ) \}-\beta {\mathbf w}]
\end{displaymath}

Multiplying both sides by the scalar $\beta-E\{g'({\mathbf w}^T{\mathbf x})\}$, we get

\begin{displaymath}{\mathbf w} \Leftarrow E\{ {\mathbf x}g( {\mathbf w}^T {\mathbf x} ) \}
-E\{g'({\mathbf w}^T{\mathbf x})\} {\mathbf w} \end{displaymath}
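To verify this step, note that the factor $\beta-E\{g'({\mathbf w}^T{\mathbf x})\}$ cancels the denominator $E\{g'({\mathbf w}^T{\mathbf x})\}-\beta$ of the fraction up to a sign, so the right-hand side of the iteration expands as

\begin{displaymath}
[\beta-E\{g'({\mathbf w}^T{\mathbf x})\}]\,{\mathbf w}
+E\{ {\mathbf x}g( {\mathbf w}^T {\mathbf x} ) \}-\beta {\mathbf w}
=E\{ {\mathbf x}g( {\mathbf w}^T {\mathbf x} ) \}
-E\{g'({\mathbf w}^T{\mathbf x})\}\,{\mathbf w}
\end{displaymath}

where the two $\beta {\mathbf w}$ terms cancel.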

Note that we still write ${\mathbf w}$ on the left-hand side, although its value has actually been multiplied by a scalar. This is taken care of by renormalization, as shown in the following FastICA algorithm (a code sketch follows the listing):
  1. Choose an initial random guess for ${\mathbf w}$
  2. Iterate:

    \begin{displaymath}{\mathbf w} \Leftarrow E\{ {\mathbf x}g( {\mathbf w}^T {\mathbf x} ) \}
-E\{g'({\mathbf w}^T{\mathbf x})\} {\mathbf w} \end{displaymath}

  3. Normalize:

    \begin{displaymath}{\mathbf w} \Leftarrow {\mathbf w}/\vert\vert{\mathbf w}\vert\vert \end{displaymath}

  4. If not converged, go back to step 2.
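As a concrete illustration, here is a minimal sketch of this one-unit iteration in Python/NumPy. The derivation above leaves $G$ unspecified; the sketch assumes the common choice $G(y)=\log\cosh(y)$, so that $g(y)=\tanh(y)$ and $g'(y)=1-\tanh^2(y)$, and it assumes the data matrix holds already centered and whitened samples as columns, with the expectations replaced by sample averages.

\begin{verbatim}
import numpy as np

def fastica_one_unit(X, max_iter=200, tol=1e-8, seed=0):
    """One-unit FastICA iteration (sketch).

    X : (n, m) array of centered, whitened data, one sample per column.
    Uses g(y) = tanh(y), g'(y) = 1 - tanh(y)^2, i.e., G(y) = log cosh(y).
    Returns a unit-norm weight vector w of length n.
    """
    n, m = X.shape
    rng = np.random.default_rng(seed)

    # Step 1: initial random guess, normalized to unit length
    w = rng.standard_normal(n)
    w /= np.linalg.norm(w)

    for _ in range(max_iter):
        y = w @ X                               # w^T x for all m samples
        g = np.tanh(y)
        g_prime = 1.0 - g ** 2

        # Step 2: w <- E{x g(w^T x)} - E{g'(w^T x)} w
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w

        # Step 3: normalize w <- w / ||w||
        w_new /= np.linalg.norm(w_new)

        # Step 4: stop when w and w_new point in the same (or opposite)
        # direction, since w is only defined up to sign
        if abs(abs(w_new @ w) - 1.0) < tol:
            return w_new
        w = w_new
    return w
\end{verbatim}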

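A small numerical demo in the same spirit, applying the sketch above to an illustrative two-source mixture (the source signals and the mixing matrix below are arbitrary assumptions, chosen only for illustration):

\begin{verbatim}
# Demo: separate two synthetic sources (illustrative setup).
rng = np.random.default_rng(1)
t = np.linspace(0, 8, 2000)
S = np.vstack([np.sign(np.sin(3 * t)),      # square wave source
               rng.laplace(size=t.size)])   # super-Gaussian source
A = np.array([[1.0, 0.5],
              [0.6, 1.0]])                  # arbitrary mixing matrix
X = A @ S                                   # observed mixtures

# Preprocessing: center and whiten (as in the previous section)
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
X_white = E @ np.diag(d ** -0.5) @ E.T @ X

w = fastica_one_unit(X_white)
y = w @ X_white                             # one recovered component
\end{verbatim}

Recovering all the rows of ${\mathbf W}$ requires repeating the iteration while keeping each new ${\mathbf w}$ orthogonal to those already found (deflation), a step not covered by the one-unit derivation above.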


Ruye Wang 2018-03-26