next up previous
Next: Pseudo-inverse Up: algebra Previous: Normal matrices and diagonalizability

Singular value decomposition

SVD Theorem: An $M\times N$ matrix ${\bf A}$ of rank $R\le \min(M,N)$ can be diagonalized by two unitary matrices ${\bf U}_{M\times M}$ and ${\bf V}_{N\times N}$:

\begin{displaymath}
{\bf U}^*{\bf AV}={\bf\Lambda}^{1/2}
=diag[\sqrt{\lambda_1...
...Sigma},
\;\;\;\;\;\;\;\;
{\bf A}={\bf U}{\bf\Sigma}{\bf V}^*
\end{displaymath}

where

Proof: Let $\lambda_n$ and ${\bf v}_n$ be the nth eigenvalue and the corresponding normalized eigenvector of the $N\times N$ matrix ${\bf A}^*{\bf A}$ satisfying the following eigenequation:

\begin{displaymath}
{\bf A}^*{\bf A}{\bf v}_n=\lambda_n {\bf v}_n,\;\;\;\;(n=1,\cdots,N)
\end{displaymath}

As $({\bf A}^*{\bf A})^*={\bf A}^*{\bf A}$ is Hermitian (symmetric if ${\bf A}$ is real), its eigenvalues $\lambda_n$ are real and its normalized eigenvectors are orthonormal: ${\bf v}^*_i{\bf v}_j=\delta_{ij}$, i.e., the eigenvector matrix defined as ${\bf V}_{N\times N}$ is unitary (orthogonal if ${\bf A}$ is real):

\begin{displaymath}
{\bf V}^*={\bf V}^{-1},\;\;\;\;\mbox{or}\;\;\;\;\;{\bf V}^*{\bf V}={\bf I}
\end{displaymath}

The eigenequation can be written in matrix form:

\begin{displaymath}
{\bf A}^*{\bf A}{\bf V}={\bf V}{\bf\Lambda}
\;\;\;\;\;\mbox{or}\;\;\;\;\;\;
{\bf V}^*{\bf A}^*{\bf A}{\bf V}={\bf\Lambda}
\end{displaymath}

where ${\bf\Lambda}=diag[\lambda_1,\cdots,\lambda_N]$.

Pre-multiplying ${\bf A}$ on both sides of ${\bf A}^*{\bf A}{\bf v}_n=\lambda_n {\bf v}_n$ we get

\begin{displaymath}
{\bf A}{\bf A}^*{\bf A}{\bf v}_n=\lambda_n {\bf A}{\bf v}_n
\end{displaymath}

We see that ${\bf A}{\bf v}_n$ is the eigenvector of the Hermitian matrix ${\bf A}{\bf A}^*$ corresponding to its eigenvalue $\lambda_n$, with norm:

\begin{displaymath}
\vert\vert{\bf A}{\bf v}_n\vert\vert=\sqrt{({\bf A}{\bf v}_n...
... v}_n}
=\sqrt{ {\bf v}^*_n\lambda_n{\bf v}_n}=\sqrt{\lambda_n}
\end{displaymath}

We define the normalized eigenvector of ${\bf A}{\bf A}^*$ as

\begin{displaymath}
{\bf u}_n={\bf A}{\bf v}_n/\sqrt{\lambda_n}
\end{displaymath}

As ${\bf A}{\bf A}^*$ is Hermitian, its normalized eigenvectors are orthonormal, which can also be shown as:

\begin{displaymath}
{\bf u}^*_i{\bf u}_j
=\frac{({\bf A}{\bf v}_i)^*}{\sqrt{\lam...
...a_{ij}\lambda_j}{\sqrt{\lambda_i}\sqrt{\lambda_j}}=\delta_{ij}
\end{displaymath}

We define

\begin{displaymath}
{\bf U}=[{\bf u}_1,\cdots,{\bf u}_M]
={\bf A}[{\bf v}_1/\s...
...\bf v}_M/\sqrt{\lambda_M}]
={\bf A}{\bf V}{\bf\Lambda}^{-1/2}
\end{displaymath}

which is also unitary

\begin{displaymath}
{\bf U}^*={\bf U}^{-1},\;\;\;\;\mbox{or}\;\;\;\;\;{\bf U}^*{\bf U}={\bf I}
\end{displaymath}

Now we have

\begin{displaymath}
{\bf U}^*{\bf A}{\bf V}=({\bf A}{\bf V}{\bf\Lambda}^{-1/2})^...
...{\bf\Lambda}^{-1/2}{\bf\Lambda}={\bf\Lambda}^{1/2}={\bf\Sigma}
\end{displaymath}

Q.E.D.

When ${\bf A}^*={\bf A}$ is Hermitian (symmetric if real), i.e., ${\bf A}{\bf A}^*={\bf A}^*{\bf A}={\bf A}^2$, then if

\begin{displaymath}
{\bf A}{\bf\phi}_i=\lambda_i{\bf\phi}_i
\end{displaymath}

we also have

\begin{displaymath}
{\bf A}^*{\bf A}{\bf\phi}_i={\bf A}^*{\bf A}{\bf\phi}_i
={\bf A}^2{\bf\phi}_i=\lambda^2{\bf\phi}_i
\end{displaymath}

i.e., the singular values $\sigma_i=\sqrt{\lambda_i^2}=\vert\lambda_i\vert$ are the absolute values of its eigenvalues $\sigma_i({\bf A})=\vert\lambda_i({\bf A})\vert$, and ${\bf U}={\bf V}={\bf\Phi}$.

The SVD equation

\begin{displaymath}
{\bf U}^*{\bf A}{\bf V}={\bf\Lambda}^{1/2}={\bf\Sigma}
\end{displaymath}

can be considered as the forward SVD transform. Pre-multiplying ${\bf U}$ and post-multiplying ${\bf V}^*$ on both sides, we get the inverse transform:

\begin{displaymath}
{\bf A}={\bf U}{\bf\Sigma}{\bf V}^*
=[{\bf u}_1,\cdots\cdo...
...nd{array}\right]
=\sum_{k=1}^R \sigma_k[{\bf u}_k{\bf v}_k^*]
\end{displaymath}

by which the original matrix ${\bf A}$ is represented as a linear combination of $R$ matrices $[{\bf u}_k{\bf v}_k^*]$ weighted by the singular values $\sqrt{\lambda_k}$ ($k=1,\ldots,R$). We can rewrite both the forward and inverse SVD transform as a pair of forward and inverse transforms:

\begin{displaymath}
\left\{ \begin{array}{l}
{\bf\Sigma}={\bf\Lambda}^{1/2}={...
.../2}{\bf V}^*={\bf U}{\bf\Sigma}{\bf V}^*.
\end{array} \right.
\end{displaymath}

Given the SVD of an $M\times N$ matrix ${\bf A}={\bf U}{\bf\Lambda}^{1/2}{\bf V}^*={\bf U}{\bf\Sigma}{\bf V}^*$, its pseudo-inverse can be found to be

\begin{displaymath}
{\bf A}^-=({\bf V}^*)^{-1}{\bf\Sigma}^{-}{\bf U}^{-1}
={\bf V}{\bf\Sigma}^{-}{\bf U}^*
\end{displaymath}

where ${\bf\Sigma}^{-}$ is pseudo-inverse of ${\bf\Sigma}$, composed of the reciprocals $1/\sigma_k$ of the $R$ singular values along the diagonal.

The $M\times N$ matrix ${\bf A}={\bf U}{\bf\Sigma}{\bf V}^*$ can be considered as alinear transformation that converts a vector ${\bf x}\in \mathbb{C}^N$ to another vector ${\bf y}\in\mathbb{C}^M$ in three steps:

\begin{displaymath}
{\bf y}={\bf A}{\bf x}={\bf U} {\bf\Sigma} {\bf V}^*\;{\bf x}
\end{displaymath}

  1. Rotate vector ${\bf x}$ by the unitary matrix ${\bf V}^*$:

    \begin{displaymath}
{\bf y}_1={\bf V}^*\; {\bf x}.
\end{displaymath}

  2. Scale each dimention $Y_k$ of ${\bf y}_1$ by a factor of $\sigma_k$ ($k=1,\ldots,R$):

    \begin{displaymath}
{\bf y}_2={\bf\Sigma}\;{\bf y}_1={\bf\Sigma} {\bf V}^*\;{\bf x}.
\end{displaymath}

  3. Rotate vector ${\bf y}_2$ by the unitary matrix ${\bf U}$:

    \begin{displaymath}
{\bf y}={\bf U} {\bf y}_2={\bf U}{\bf\Sigma} {\bf V}^*\;{\bf x}
={\bf A}{\bf x}.
\end{displaymath}

SVDtransform.png

The figure below illustrates the transformation of the three vertices of a triangle in 2-D space by a matrix ${\bf A}={\bf U}{\bf\Sigma}{\bf V}*^$, which first rotates the vertices by 45 degrees CCW, scale horitontally and vertically by a factor of 3 and 2, respectively, and then rotate CW by 30 degrees.

SVDtransformation.gif


next up previous
Next: Pseudo-inverse Up: algebra Previous: Normal matrices and diagonalizability
Ruye Wang 2015-04-27