Next: Pseudo-inverse Up: algebra Previous: Normal matrices and diagonalizability

Singular value decomposition

SVD Theorem: An $M\times N$ matrix ${\bf A}$ of rank $R\le \min(M,N)$ can be diagonalized by two unitary matrices ${\bf U}_{M\times M}$ and ${\bf V}_{N\times N}$ :

$\begin{displaymath} {\bf U}^*{\bf AV}={\bf\Lambda}^{1/2} =diag[\sqrt{\lambda_1... ...Sigma}, \;\;\;\;\;\;\;\; {\bf A}={\bf U}{\bf\Sigma}{\bf V}^* \end{displaymath}$

where

${\bf U}_{M\times M}=[{\bf u}_1,\ldots,{\bf u}_M]$ is a unitary matrix ( ${\bf U}^{-1}={\bf U}^*$ ) composed of the left singular vectors satisfying

$\begin{displaymath} {\bf A}{\bf A}^*{\bf u}_n=\lambda_n {\bf u}_n,\;\;\;\;\;(n=1,\cdots,M), \end{displaymath}$

or in matrix form

$\begin{displaymath} {\bf A}{\bf A}^*{\bf U}={\bf U}{\bf\Lambda}, \;\;\;\;\;\;\;\;\;\;\;\; {\bf U}^*{\bf A}{\bf A}^*{\bf U}={\bf\Lambda} \end{displaymath}$
${\bf V}_{N\times N}=[{\bf v}_1,\ldots,{\bf v}_N]$ is a unitary matrix ( ${\bf V}^{-1}={\bf V}^*$ ) composed of the right singular vectors satisfying

$\begin{displaymath} {\bf A}^*{\bf A}{\bf v}_n=\lambda_n {\bf v}_n,\;\;\;\;\;(n=1,\ldots,N) \end{displaymath}$

or in matrix form

$\begin{displaymath} {\bf A}^*{\bf A}{\bf V}={\bf V}{\bf\Lambda}, \;\;\;\;\;\;\;\;\;\;\;\; {\bf V}^*{\bf A}^*{\bf A}{\bf V}={\bf\Lambda} \end{displaymath}$
${\bf\Sigma}={\bf\Lambda}^{1/2}$ is an $M\times N$ diagonal matrix with non-zero singular values $\sigma_n=\sqrt{\lambda_n}$ $(n=1,\ldots,R)$ of ${\bf A}$

Proof: Let $\lambda_n$ and ${\bf v}_n$ be the nth eigenvalue and the corresponding normalized eigenvector of the $N\times N$ matrix ${\bf A}^*{\bf A}$ satisfying the following eigenequation:

$\begin{displaymath} {\bf A}^*{\bf A}{\bf v}_n=\lambda_n {\bf v}_n,\;\;\;\;(n=1,\cdots,N) \end{displaymath}$

As $({\bf A}^*{\bf A})^*={\bf A}^*{\bf A}$ is Hermitian (symmetric if ${\bf A}$ is real), its eigenvalues $\lambda_n$ are real and its normalized eigenvectors are orthonormal: ${\bf v}^*_i{\bf v}_j=\delta_{ij}$ , i.e., the eigenvector matrix defined as ${\bf V}_{N\times N}$ is unitary (orthogonal if ${\bf A}$ is real):

$\begin{displaymath} {\bf V}^*={\bf V}^{-1},\;\;\;\;\mbox{or}\;\;\;\;\;{\bf V}^*{\bf V}={\bf I} \end{displaymath}$

The eigenequation can be written in matrix form:

$\begin{displaymath} {\bf A}^*{\bf A}{\bf V}={\bf V}{\bf\Lambda} \;\;\;\;\;\mbox{or}\;\;\;\;\;\; {\bf V}^*{\bf A}^*{\bf A}{\bf V}={\bf\Lambda} \end{displaymath}$

where ${\bf\Lambda}=diag[\lambda_1,\cdots,\lambda_N]$ .

Pre-multiplying ${\bf A}$ on both sides of ${\bf A}^*{\bf A}{\bf v}_n=\lambda_n {\bf v}_n$ we get

$\begin{displaymath} {\bf A}{\bf A}^*{\bf A}{\bf v}_n=\lambda_n {\bf A}{\bf v}_n \end{displaymath}$

We see that ${\bf A}{\bf v}_n$ is the eigenvector of the Hermitian matrix ${\bf A}{\bf A}^*$ corresponding to its eigenvalue $\lambda_n$ , with norm:

$\begin{displaymath} \vert\vert{\bf A}{\bf v}_n\vert\vert=\sqrt{({\bf A}{\bf v}_n... ... v}_n} =\sqrt{ {\bf v}^*_n\lambda_n{\bf v}_n}=\sqrt{\lambda_n} \end{displaymath}$

We define the normalized eigenvector of ${\bf A}{\bf A}^*$ as

$\begin{displaymath} {\bf u}_n={\bf A}{\bf v}_n/\sqrt{\lambda_n} \end{displaymath}$

As ${\bf A}{\bf A}^*$ is Hermitian, its normalized eigenvectors are orthonormal, which can also be shown as:

$\begin{displaymath} {\bf u}^*_i{\bf u}_j =\frac{({\bf A}{\bf v}_i)^*}{\sqrt{\lam... ...a_{ij}\lambda_j}{\sqrt{\lambda_i}\sqrt{\lambda_j}}=\delta_{ij} \end{displaymath}$

We define

$\begin{displaymath} {\bf U}=[{\bf u}_1,\cdots,{\bf u}_M] ={\bf A}[{\bf v}_1/\s... ...\bf v}_M/\sqrt{\lambda_M}] ={\bf A}{\bf V}{\bf\Lambda}^{-1/2} \end{displaymath}$

which is also unitary

$\begin{displaymath} {\bf U}^*={\bf U}^{-1},\;\;\;\;\mbox{or}\;\;\;\;\;{\bf U}^*{\bf U}={\bf I} \end{displaymath}$

Now we have

$\begin{displaymath} {\bf U}^*{\bf A}{\bf V}=({\bf A}{\bf V}{\bf\Lambda}^{-1/2})^... ...{\bf\Lambda}^{-1/2}{\bf\Lambda}={\bf\Lambda}^{1/2}={\bf\Sigma} \end{displaymath}$

Q.E.D.

When ${\bf A}^*={\bf A}$ is Hermitian (symmetric if real), i.e., ${\bf A}{\bf A}^*={\bf A}^*{\bf A}={\bf A}^2$ , then if

$\begin{displaymath} {\bf A}{\bf\phi}_i=\lambda_i{\bf\phi}_i \end{displaymath}$

we also have

$\begin{displaymath} {\bf A}^*{\bf A}{\bf\phi}_i={\bf A}^*{\bf A}{\bf\phi}_i ={\bf A}^2{\bf\phi}_i=\lambda^2{\bf\phi}_i \end{displaymath}$

i.e., the singular values $\sigma_i=\sqrt{\lambda_i^2}=\vert\lambda_i\vert$ are the absolute values of its eigenvalues $\sigma_i({\bf A})=\vert\lambda_i({\bf A})\vert$ , and ${\bf U}={\bf V}={\bf\Phi}$ .

The SVD equation

$\begin{displaymath} {\bf U}^*{\bf A}{\bf V}={\bf\Lambda}^{1/2}={\bf\Sigma} \end{displaymath}$

can be considered as the forward SVD transform. Pre-multiplying ${\bf U}$ and post-multiplying ${\bf V}^*$ on both sides, we get the inverse transform:

$\begin{displaymath} {\bf A}={\bf U}{\bf\Sigma}{\bf V}^* =[{\bf u}_1,\cdots\cdo... ...nd{array}\right] =\sum_{k=1}^R \sigma_k[{\bf u}_k{\bf v}_k^*] \end{displaymath}$

by which the original matrix ${\bf A}$ is represented as a linear combination of

matrices $[{\bf u}_k{\bf v}_k^*]$ weighted by the singular values $\sqrt{\lambda_k}$ ( $k=1,\ldots,R$ ). We can rewrite both the forward and inverse SVD transform as a pair of forward and inverse transforms:

$\begin{displaymath} \left\{ \begin{array}{l} {\bf\Sigma}={\bf\Lambda}^{1/2}={... .../2}{\bf V}^*={\bf U}{\bf\Sigma}{\bf V}^*. \end{array} \right. \end{displaymath}$

Given the SVD of an $M\times N$ matrix ${\bf A}={\bf U}{\bf\Lambda}^{1/2}{\bf V}^*={\bf U}{\bf\Sigma}{\bf V}^*$ , its pseudo-inverse can be found to be

$\begin{displaymath} {\bf A}^-=({\bf V}^*)^{-1}{\bf\Sigma}^{-}{\bf U}^{-1} ={\bf V}{\bf\Sigma}^{-}{\bf U}^* \end{displaymath}$

where ${\bf\Sigma}^{-}$ is pseudo-inverse of ${\bf\Sigma}$ , composed of the reciprocals $1/\sigma_k$ of the

singular values along the diagonal.

The $M\times N$ matrix ${\bf A}={\bf U}{\bf\Sigma}{\bf V}^*$ can be considered as alinear transformation that converts a vector ${\bf x}\in \mathbb{C}^N$ to another vector ${\bf y}\in\mathbb{C}^M$ in three steps:

$\begin{displaymath} {\bf y}={\bf A}{\bf x}={\bf U} {\bf\Sigma} {\bf V}^*\;{\bf x} \end{displaymath}$

Rotate vector ${\bf x}$ by the unitary matrix ${\bf V}^*$ :

$\begin{displaymath} {\bf y}_1={\bf V}^*\; {\bf x}. \end{displaymath}$
Scale each dimention of ${\bf y}_1$ by a factor of $\sigma_k$ ( $k=1,\ldots,R$ ):

$\begin{displaymath} {\bf y}_2={\bf\Sigma}\;{\bf y}_1={\bf\Sigma} {\bf V}^*\;{\bf x}. \end{displaymath}$
Rotate vector ${\bf y}_2$ by the unitary matrix ${\bf U}$ :

$\begin{displaymath} {\bf y}={\bf U} {\bf y}_2={\bf U}{\bf\Sigma} {\bf V}^*\;{\bf x} ={\bf A}{\bf x}. \end{displaymath}$

The figure below illustrates the transformation of the three vertices of a triangle in 2-D space by a matrix ${\bf A}={\bf U}{\bf\Sigma}{\bf V}*^$ , which first rotates the vertices by 45 degrees CCW, scale horitontally and vertically by a factor of 3 and 2, respectively, and then rotate CW by 30 degrees.

Next: Pseudo-inverse Up: algebra Previous: Normal matrices and diagonalizability

Ruye Wang 2015-04-27