
Eigenvalues and eigenvectors

For any $n\times n$ square matrix ${\bf A}$, if there exist a nonzero vector ${\bf\phi}$ and a scalar $\lambda$ such that

\begin{displaymath}{\bf A}{\bf\phi}=\lambda {\bf\phi} \end{displaymath}

then $\lambda$ and ${\bf\phi}$ are called an eigenvalue and an eigenvector of matrix ${\bf A}$, respectively. Note that an eigenvector ${\bf\phi}$ is only unique up to a scaling factor, i.e., if ${\bf\phi}$ is an eigenvector of ${\bf A}$, so is $c{\bf\phi}$ for any nonzero constant $c$. Typically, for uniqueness, we keep ${\bf\phi}$ normalized so that $\vert\vert{\bf\phi}\vert\vert=1$.

To obtain $\lambda$, rewrite the above equation as

\begin{displaymath}(\lambda {\bf I}-{\bf A}) {\bf\phi} =0 \end{displaymath}

which is a homogeneous equation system. For this system to have a non-zero solution for ${\bf\phi}$, the determinant of its coefficient matrix must be zero:

\begin{displaymath}
det(\lambda {\bf I}-{\bf A})=\vert \lambda {\bf I}-{\bf A} \vert = 0
\end{displaymath}

Solving this $n$th-order polynomial equation in $\lambda$ (the characteristic equation), we get $n$ eigenvalues $\{ \lambda_1,\cdots,\lambda_n \}$. Substituting each $\lambda_i$ back into the equation system, we get the corresponding eigenvector ${\bf\phi}_i$. We can put all $n$ equations ${\bf A}{\bf\phi}_i=\lambda_i{\bf\phi}_i$ together and have

\begin{displaymath}
{\bf A}[{\bf\phi}_1,\cdots,{\bf\phi}_n]
=[{\bf\phi}_1,\cdots,{\bf\phi}_n]
\left[ \begin{array}{cccc}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{array} \right]
\end{displaymath}

or in a more compact matrix form,

\begin{displaymath}
{\bf A}{\bf\Phi}={\bf\Phi}{\bf\Lambda},
\;\;\;\;\;\;\mbox{or}\;\;\;\;\;\;\;
{\bf\Phi}^{-1}{\bf A}{\bf\Phi}={\bf\Lambda}
\end{displaymath}

where

\begin{displaymath}
{\bf\Lambda}=diag[\lambda_1,\cdots,\lambda_n]
\;\;\;\;\;\;\mbox{and}\;\;\;\;\;\;\;
{\bf\Phi}=[{\bf\phi}_1,\cdots,{\bf\phi}_n]
\end{displaymath}

are the diagonal eigenvalue matrix and eigenvector matrix, respectively.
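The diagonalization above can be checked numerically. As a minimal sketch (using NumPy, which is not part of the original text, and an arbitrarily chosen example matrix), `np.linalg.eig` returns the eigenvalues and the eigenvector matrix ${\bf\Phi}$ whose columns are the normalized eigenvectors:

```python
import numpy as np

# An arbitrarily chosen example matrix
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eig returns the eigenvalues and the matrix Phi whose
# columns are the corresponding (normalized) eigenvectors
lam, Phi = np.linalg.eig(A)
Lam = np.diag(lam)          # the diagonal eigenvalue matrix Lambda

# Verify the compact form A Phi = Phi Lambda
print(np.allclose(A @ Phi, Phi @ Lam))                    # True
# and the diagonalization Phi^{-1} A Phi = Lambda
print(np.allclose(np.linalg.inv(Phi) @ A @ Phi, Lam))     # True
```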

The spectrum of an $N\times N$ square matrix ${\bf A}$ is the set of its eigenvalues $\{\lambda_1,\cdots,\lambda_N\}$. The spectral radius of ${\bf A}$, denoted by $\rho({\bf A})$, is the maximum of the absolute values of its eigenvalues:

\begin{displaymath}
\rho({\bf A})=\max(\vert\lambda_1\vert,\cdots,\vert\lambda_N\vert)
\end{displaymath}

The absolute value of a complex number $z=x+jy$ is $\vert z\vert=\sqrt{x^2+y^2}$, the Euclidean distance between $z$ and the origin of the complex plane. If all eigenvalues are sorted such that $\vert\lambda_1\vert\ge\cdots \ge\vert\lambda_N\vert$ then $\rho({\bf A})=\vert\lambda_1\vert$.
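As a small illustration (a NumPy sketch with an example matrix chosen so the eigenvalues are easy to check by hand), the spectral radius is just the largest eigenvalue magnitude:

```python
import numpy as np

# Example matrix with characteristic equation
# lambda^2 + 3 lambda + 2 = 0, i.e. eigenvalues -1 and -2
A = np.array([[ 0.0,  1.0],
              [-2.0, -3.0]])

lam = np.linalg.eigvals(A)
rho = np.max(np.abs(lam))   # spectral radius rho(A)
print(np.isclose(rho, 2.0)) # True: rho(A) = |-2| = 2
```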

The trace and determinant of ${\bf A}$ can be obtained from its eigenvalues

\begin{displaymath}
tr({\bf A}) =\sum_{k=1}^n \lambda_k,
\;\;\;\;\;\mbox{and}\;\;\;\;\;\;\;
det({\bf A}) =\prod_{k=1}^n \lambda_k
\end{displaymath}

It can also be shown that the eigenvalues of a matrix are invariant under any unitary transform ${\bf B}={\bf R}^*{\bf A}{\bf R}$, where ${\bf R}$ is unitary (${\bf R}{\bf R}^*={\bf I}$), while the eigenvectors are simply transformed by ${\bf R}$. To prove this, we first assume ${\bf\Lambda}=diag(\lambda_1,\cdots,\lambda_N)$ and ${\bf\Phi}$ are the eigenvalue and eigenvector matrices of a square matrix ${\bf A}$:

\begin{displaymath}
{\bf A}{\bf\Phi}={\bf\Phi}{\bf\Lambda}
\end{displaymath}

and ${\bf\Sigma}=diag(\sigma_1,\cdots,\sigma_N)$ and ${\bf\Psi}$ are the eigenvalue and eigenvector matrices of ${\bf B}={\bf R}^*{\bf A}{\bf R}$, a unitary transform of ${\bf A}$:

\begin{displaymath}
{\bf B}{\bf\Psi}=\left({\bf R}^*{\bf A}{\bf R}\right){\bf\Psi}={\bf\Psi}{\bf\Sigma}
\end{displaymath}

Left-multiplying both sides by ${\bf R}$ and using ${\bf R}{\bf R}^*={\bf I}$, we recover the eigen-equation of ${\bf A}$

\begin{displaymath}
{\bf R}{\bf R}^*{\bf A}{\bf R}{\bf\Psi}
={\bf A}{\bf R}{\bf\Psi}={\bf R}{\bf\Psi}{\bf\Sigma}
\end{displaymath}

with the same eigenvalue matrix, and with eigenvector matrices related by

\begin{displaymath}
{\bf R}{\bf\Psi}={\bf\Phi},\;\;\;\;\;\;{\bf\Sigma}={\bf\Lambda}
\end{displaymath}
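The invariance of the spectrum can be checked directly. A minimal sketch (NumPy, with a rotation matrix as an example of a real unitary, i.e. orthogonal, transform):

```python
import numpy as np

# Upper-triangular A, so its eigenvalues 2 and 3 are visible by inspection
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# A real unitary (orthogonal) matrix R: rotation by 30 degrees
t = np.pi / 6
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

B = R.conj().T @ A @ R          # the unitary transform B = R* A R

# The spectra of A and B are identical (sorted for comparison)
print(np.allclose(np.sort(np.linalg.eigvals(A)),
                  np.sort(np.linalg.eigvals(B))))   # True
```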

If ${\bf A}$ is Hermitian (symmetric if real), such as a covariance matrix, then all of its eigenvalues $\lambda_i$ are real and all of its eigenvectors ${\bf\phi}_i$ are orthogonal and can be normalized (orthonormal):

\begin{displaymath}
<{\bf\phi}_i,{\bf\phi}_j>=\delta_{ij}=\left\{\begin{array}{ll}
1&i=j\\ 0&i\ne j\end{array}\right.
\end{displaymath}

and the eigenvector matrix ${\bf\Phi}$ is unitary (orthogonal if ${\bf A}$ is real):

\begin{displaymath}
{\bf\Phi}^{-1}={\bf\Phi}^*
\end{displaymath}

and we have

\begin{displaymath}
{\bf\Phi}^{-1}{\bf A}{\bf\Phi}={\bf\Phi}^*{\bf A}{\bf\Phi}={\bf\Lambda},
\end{displaymath}

Left and right multiplying the two sides by ${\bf\Phi}$ and ${\bf\Phi}^*={\bf\Phi}^{-1}$, respectively, we get

\begin{displaymath}
{\bf A}={\bf\Phi} {\bf\Lambda}{\bf\Phi}^*
=[{\bf\phi}_1,\cdots,{\bf\phi}_n]
\left[ \begin{array}{ccc}
\lambda_1 & & \\
 & \ddots & \\
 & & \lambda_n
\end{array} \right]
\left[ \begin{array}{c}
{\bf\phi}_1^* \\ \vdots \\ {\bf\phi}_n^*
\end{array} \right]
=\sum_{i=1}^n \lambda_i {\bf\phi}_i {\bf\phi}_i^*
\end{displaymath}
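For Hermitian/symmetric matrices, NumPy provides the specialized routine `np.linalg.eigh`, which guarantees real eigenvalues and an orthonormal eigenvector matrix. A minimal sketch verifying both the orthogonality of ${\bf\Phi}$ and the spectral decomposition (example matrix chosen arbitrarily):

```python
import numpy as np

# A real symmetric matrix (the real case of Hermitian)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is specialized for Hermitian/symmetric input:
# real eigenvalues, orthonormal eigenvector columns
lam, Phi = np.linalg.eigh(A)

# Phi is orthogonal: Phi^T Phi = I, i.e. Phi^{-1} = Phi^T
print(np.allclose(Phi.T @ Phi, np.eye(2)))          # True

# Spectral decomposition A = sum_i lambda_i phi_i phi_i^T
A_rebuilt = sum(l * np.outer(p, p) for l, p in zip(lam, Phi.T))
print(np.allclose(A, A_rebuilt))                    # True
```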

The generalized eigenvalue problem of two symmetric matrices ${\bf A}={\bf A}^T$ and ${\bf B}={\bf B}^T$ is to find a scalar $\lambda$ and the corresponding vector ${\bf\phi}$ for the following equation to hold:

\begin{displaymath}
{\bf A}{\bf\phi}_i=\lambda_i{\bf B}{\bf\phi}_i,
\;\;\;\;\;\;\;(i=1,\cdots,n)
\end{displaymath}

or in matrix form

\begin{displaymath}
{\bf A}{\bf\Phi}={\bf B}{\bf\Phi}{\bf\Lambda}
\end{displaymath}

The eigenvalue and eigenvector matrices ${\bf\Lambda}$ and ${\bf\Phi}$ can be found by converting this into a standard eigenvalue problem: when ${\bf B}$ is invertible, the equation is equivalent to ${\bf B}^{-1}{\bf A}{\bf\Phi}={\bf\Phi}{\bf\Lambda}$, the eigen-equation of ${\bf B}^{-1}{\bf A}$.
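A minimal NumPy sketch of this reduction (the matrices ${\bf A}$ and ${\bf B}$ are arbitrary illustrative choices, with ${\bf B}$ invertible; `scipy.linalg.eigh` can also solve the generalized problem directly, but plain NumPy suffices here):

```python
import numpy as np

# Symmetric A and symmetric invertible B (illustrative choices)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[2.0, 0.0],
              [0.0, 1.0]])

# With B invertible, A phi = lambda B phi is equivalent to the
# standard eigenproblem (B^{-1} A) phi = lambda phi
lam, Phi = np.linalg.eig(np.linalg.solve(B, A))

# Verify the matrix form A Phi = B Phi Lambda
print(np.allclose(A @ Phi, B @ Phi @ np.diag(lam)))   # True
```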

The Rayleigh quotient of two symmetric matrices ${\bf A}$ and ${\bf B}$ is a function of a vector ${\bf w}$ defined as:

\begin{displaymath}
R({\bf w})=\frac{{\bf w}^T{\bf A}{\bf w}}{{\bf w}^T{\bf B}{\bf w}}
\end{displaymath}

To find the optimal ${\bf w}$ corresponding to the extremum (maximum or minimum) of $R({\bf w})$, we find its derivative with respect to ${\bf w}$:

\begin{displaymath}
\frac{d}{d{\bf w}}R({\bf w})
=\frac{2{\bf A}{\bf w}({\bf w}^T{\bf B}{\bf w})
-2{\bf B}{\bf w}({\bf w}^T{\bf A}{\bf w})}
{({\bf w}^T{\bf B}{\bf w})^2}
\end{displaymath}

Setting it to zero we get

\begin{displaymath}
{\bf A}{\bf w}({\bf w}^T{\bf B}{\bf w})={\bf B}{\bf w}({\bf w}^T{\bf A}{\bf w}),
\;\;\;\;\;\;\mbox{i.e.,}\;\;\;\;\;\;
{\bf A}{\bf w}=\frac{{\bf w}^T{\bf A}{\bf w}}{{\bf w}^T{\bf B}{\bf w}}{\bf B}{\bf w}
=R({\bf w}){\bf B}{\bf w}=\lambda {\bf B}{\bf w}
\end{displaymath}

The second equation can be recognized as a generalized eigenvalue problem with $\lambda=R({\bf w})$ being the eigenvalue and ${\bf w}$ the corresponding eigenvector. Solving it, the vector ${\bf w}={\bf\phi}$ corresponding to the maximum/minimum eigenvalue $\lambda=R({\bf w})$ maximizes/minimizes the Rayleigh quotient.
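The fact that $R({\bf w})$ evaluated at a generalized eigenvector equals the corresponding eigenvalue can be checked numerically. A minimal NumPy sketch (example matrices chosen arbitrarily, with ${\bf B}$ invertible; the helper `rayleigh` is introduced here for illustration only):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[2.0, 0.0],
              [0.0, 1.0]])

def rayleigh(w, A, B):
    """Rayleigh quotient R(w) = (w^T A w) / (w^T B w)."""
    return (w @ A @ w) / (w @ B @ w)

# Generalized eigenpairs via the equivalent standard problem B^{-1} A
lam, Phi = np.linalg.eig(np.linalg.solve(B, A))

# At each eigenvector the quotient equals the eigenvalue, so the
# largest/smallest eigenvalue gives the max/min of R(w)
for l, phi in zip(lam, Phi.T):
    print(np.isclose(rayleigh(phi, A, B), l))   # True for every pair
```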


Ruye Wang 2014-06-05