In the previous section we obtained the solution of the equation ${\bf Ax}={\bf b}$, together with the bases of the four subspaces of ${\bf A}$, based on its rref. Here we consider an alternative and better way to solve the same equation and to find a set of orthogonal bases that also span the four subspaces, based on the pseudo-inverse ${\bf A}^-$ and the singular value decomposition (SVD) of ${\bf A}$. The solution obtained this way is optimal in a certain sense, as shown below.
Consider the SVD of an $M\times N$ matrix ${\bf A}$ of rank $R$:

$$ {\bf A}={\bf U}{\bf\Sigma}{\bf V}^T \qquad (160) $$
where

$$ {\bf\Sigma}=\left[\begin{array}{cccccc}\sigma_1&&&&&0\\ &\ddots&&&&\\ &&\sigma_R&&&\\ &&&0&&\\ &&&&\ddots&\\ 0&&&&&0\end{array}\right]_{M\times N} \qquad (161) $$
is an $M\times N$ matrix with the $R$ non-zero singular values $\sigma_1\ge\cdots\ge\sigma_R>0$ of ${\bf A}$ along the diagonal (starting from the top-left corner), while all other components are zero, and ${\bf U}=[{\bf u}_1,\cdots,{\bf u}_M]$ and ${\bf V}=[{\bf v}_1,\cdots,{\bf v}_N]$ are two orthogonal matrices of dimensions $M\times M$ and $N\times N$, respectively. The column vectors ${\bf u}_i$ $(i=1,\cdots,M)$ and ${\bf v}_j$ $(j=1,\cdots,N)$ are called the left and right singular vectors of ${\bf A}$, respectively, and they can be used as orthonormal bases to span, respectively, $\mathbb{R}^M$ and its subspaces $C({\bf A})$ and $N({\bf A}^T)$, and $\mathbb{R}^N$ and its subspaces $C({\bf A}^T)$ and $N({\bf A})$.
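The decomposition and the orthogonality of ${\bf U}$ and ${\bf V}$ are easy to check numerically with the Matlab function svd used in the homework below. A minimal sketch, assuming a small $3\times 4$ matrix of rank $2$ chosen purely for illustration (it is not the matrix of the examples or homework):

```matlab
% Illustrative 3x4 matrix of rank R = 2 (third row = sum of the first two)
A = [1 2 3 4;
     2 3 4 5;
     3 5 7 9];
[U, S, V] = svd(A);        % A = U*S*V', U is 3x3, S is 3x4, V is 4x4
disp(norm(A - U*S*V'))     % ~0: the factors reproduce A
disp(norm(U'*U - eye(3)))  % ~0: U is orthogonal
disp(norm(V'*V - eye(4)))  % ~0: V is orthogonal
disp(diag(S)')             % singular values; only two are (numerically) non-zero
```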
To see this, we rewrite the SVD of ${\bf A}$ as:

$$ {\bf A}={\bf U}{\bf\Sigma}{\bf V}^T=\sum_{r=1}^R\sigma_r\,{\bf u}_r{\bf v}_r^T \qquad (162) $$

Each column vector ${\bf a}_n$ of ${\bf A}$ can be expressed as a linear combination of the first $R$ columns of ${\bf U}$ corresponding to the non-zero singular values:

$$ {\bf a}_n=\sum_{r=1}^R(\sigma_r v_{nr})\,{\bf u}_r,\qquad n=1,\cdots,N \qquad (163) $$

where $v_{nr}$ denotes the $n$-th component of ${\bf v}_r$.
Taking the transpose on both sides of the SVD we also get:

$$ {\bf A}^T={\bf V}{\bf\Sigma}^T{\bf U}^T=\sum_{r=1}^R\sigma_r\,{\bf v}_r{\bf u}_r^T \qquad (164) $$

i.e., each row vector of ${\bf A}$ (each column ${\bf a}^{(m)}$ of ${\bf A}^T$) can be expressed as a linear combination of the first $R$ columns of ${\bf V}$ corresponding to the non-zero singular values:

$$ {\bf a}^{(m)}=\sum_{r=1}^R(\sigma_r u_{mr})\,{\bf v}_r,\qquad m=1,\cdots,M \qquad (165) $$

where $u_{mr}$ denotes the $m$-th component of ${\bf u}_r$.
We therefore see that the first $R$ columns of ${\bf U}$ and ${\bf V}$ corresponding to the non-zero singular values span, respectively, the column space $C({\bf A})$ and the row space $C({\bf A}^T)$:

$$ C({\bf A})=\mbox{span}({\bf u}_1,\cdots,{\bf u}_R),\qquad C({\bf A}^T)=\mbox{span}({\bf v}_1,\cdots,{\bf v}_R) \qquad (166) $$
and the remaining $M-R$ columns of ${\bf U}$ and $N-R$ columns of ${\bf V}$ corresponding to the zero singular values span, respectively, the left null space $N({\bf A}^T)$ orthogonal to $C({\bf A})$, and the null space $N({\bf A})$ orthogonal to $C({\bf A}^T)$:

$$ N({\bf A}^T)=\mbox{span}({\bf u}_{R+1},\cdots,{\bf u}_M),\qquad N({\bf A})=\mbox{span}({\bf v}_{R+1},\cdots,{\bf v}_N) \qquad (167) $$
| Subspace | Definition | Dimension | Basis |
|---|---|---|---|
| $C({\bf A})$ | column space (image) of ${\bf A}$ | $R$ | first $R$ columns of ${\bf U}$ corresponding to non-zero singular values |
| $N({\bf A}^T)$ | left null space of ${\bf A}$ | $M-R$ | last $M-R$ columns of ${\bf U}$ corresponding to zero singular values |
| $C({\bf A}^T)$ | row space of ${\bf A}$ (column space of ${\bf A}^T$) | $R$ | first $R$ columns of ${\bf V}$ corresponding to non-zero singular values |
| $N({\bf A})$ | null space of ${\bf A}$ | $N-R$ | last $N-R$ columns of ${\bf V}$ corresponding to zero singular values |
| $\mathbb{R}^N$ | domain of ${\bf A}$ | $N$ | all columns of ${\bf V}$ |
| $\mathbb{R}^M$ | codomain of ${\bf A}$ | $M$ | all columns of ${\bf U}$ |
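The table can be realized numerically by counting the singular values above a small tolerance to estimate the rank $R$, and splitting the columns of ${\bf U}$ and ${\bf V}$ accordingly. A sketch, reusing the illustrative matrix from above:

```matlab
A = [1 2 3 4; 2 3 4 5; 3 5 7 9];    % illustrative matrix, rank 2
[M, N] = size(A);
[U, S, V] = svd(A);
s = diag(S);
R = sum(s > max(M, N)*eps(s(1)));   % numerical rank (tolerance on singular values)
Ubasis_col   = U(:, 1:R);           % basis of the column space C(A)
Ubasis_lnull = U(:, R+1:M);         % basis of the left null space N(A')
Vbasis_row   = V(:, 1:R);           % basis of the row space C(A')
Vbasis_null  = V(:, R+1:N);         % basis of the null space N(A)
disp(norm(A*Vbasis_null))           % ~0: A maps N(A) to zero
disp(norm(A'*Ubasis_lnull))         % ~0: A' maps N(A') to zero
```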
The SVD method can be used to find the pseudo-inverse of an $M\times N$ matrix ${\bf A}$ of rank $R$:

$$ {\bf A}^-={\bf V}{\bf\Sigma}^-{\bf U}^T \qquad (168) $$

where both ${\bf A}^-$ and

$$ {\bf\Sigma}^-=\left[\begin{array}{cccccc}1/\sigma_1&&&&&0\\ &\ddots&&&&\\ &&1/\sigma_R&&&\\ &&&0&&\\ &&&&\ddots&\\ 0&&&&&0\end{array}\right]_{N\times M} \qquad (169) $$

are $N\times M$ matrices.
Consider the following four cases (verified numerically in the sketch after this list):

- If $M=N=R$, then ${\bf A}^-={\bf A}^{-1}$, and ${\bf A}^-{\bf A}={\bf A}{\bf A}^-={\bf I}$.
- If $M>N=R$, then ${\bf A}$ has full column rank, and ${\bf A}^-$ is a left inverse:

$$ {\bf A}^-{\bf A}={\bf V}{\bf\Sigma}^-{\bf U}^T{\bf U}{\bf\Sigma}{\bf V}^T={\bf V}{\bf\Sigma}^-{\bf\Sigma}{\bf V}^T={\bf V}{\bf V}^T={\bf I}_{N\times N} \qquad (170) $$

However,

$$ {\bf A}{\bf A}^-={\bf U}{\bf\Sigma}{\bf V}^T{\bf V}{\bf\Sigma}^-{\bf U}^T={\bf U}{\bf\Sigma}{\bf\Sigma}^-{\bf U}^T\ne{\bf I}_{M\times M} \qquad (171) $$

as the $M\times M$ matrix ${\bf\Sigma}{\bf\Sigma}^-$ with only $N$ 1's along the diagonal is not full rank and is unequal to ${\bf I}_{M\times M}$.
- If $N>M=R$, then ${\bf A}$ has full row rank, and ${\bf A}^-$ is a right inverse:

$$ {\bf A}{\bf A}^-={\bf U}{\bf\Sigma}{\bf V}^T{\bf V}{\bf\Sigma}^-{\bf U}^T={\bf U}{\bf\Sigma}{\bf\Sigma}^-{\bf U}^T={\bf U}{\bf U}^T={\bf I}_{M\times M} \qquad (172) $$

However,

$$ {\bf A}^-{\bf A}={\bf V}{\bf\Sigma}^-{\bf\Sigma}{\bf V}^T\ne{\bf I}_{N\times N} \qquad (173) $$

as the $N\times N$ matrix ${\bf\Sigma}^-{\bf\Sigma}$ with only $M$ 1's along the diagonal is not a full-rank identity matrix ${\bf I}_{N\times N}$.
- If $R<\min(M,N)$, then neither ${\bf\Sigma}^-{\bf\Sigma}$ nor ${\bf\Sigma}{\bf\Sigma}^-$ is full rank, as they have only $R$ non-zero 1's along the diagonal; therefore ${\bf A}^-$ is neither a left nor a right inverse:

$$ {\bf A}^-{\bf A}\ne{\bf I}_{N\times N},\qquad {\bf A}{\bf A}^-\ne{\bf I}_{M\times M} \qquad (174) $$
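The construction of ${\bf\Sigma}^-$ in (169) and the failure of both inverse properties in the fourth case can be checked directly. A sketch, again assuming the illustrative $3\times 4$ matrix of rank $2$:

```matlab
A = [1 2 3 4; 2 3 4 5; 3 5 7 9];    % M = 3, N = 4, R = 2, so R < min(M,N)
[M, N] = size(A);
[U, S, V] = svd(A);
s = diag(S);
R = sum(s > max(M, N)*eps(s(1)));
Sinv = zeros(N, M);                 % Sigma^- is N x M
Sinv(1:R, 1:R) = diag(1 ./ s(1:R)); % reciprocals of the non-zero singular values
Ainv = V * Sinv * U';               % A^- = V * Sigma^- * U', as in (168)
disp(norm(Ainv - pinv(A)))          % ~0: agrees with Matlab's pinv(A)
disp(norm(Ainv*A - eye(N)))         % not ~0: A^- is not a left inverse here
disp(norm(A*Ainv - eye(M)))         % not ~0: A^- is not a right inverse here
```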
We further note that the matrices ${\bf U}$ and ${\bf V}$ are related to each other by:

$$ {\bf A}{\bf V}={\bf U}{\bf\Sigma}\qquad\mbox{or}\qquad{\bf A}^T{\bf U}={\bf V}{\bf\Sigma}^T \qquad (175) $$

or in vector form:

$$ {\bf A}{\bf v}_i=\sigma_i{\bf u}_i,\quad{\bf A}^T{\bf u}_i=\sigma_i{\bf v}_i\quad(i=1,\cdots,R);\qquad {\bf A}{\bf v}_i={\bf 0}\;(i>R),\quad{\bf A}^T{\bf u}_i={\bf 0}\;(i>R) \qquad (176) $$

indicating how the individual columns are related.
We see that the last $N-R$ columns ${\bf v}_{R+1},\cdots,{\bf v}_N$ of ${\bf V}$ form an orthogonal basis of $N({\bf A})$, and the last $M-R$ columns ${\bf u}_{R+1},\cdots,{\bf u}_M$ of ${\bf U}$ form an orthogonal basis of $N({\bf A}^T)$.
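These column relations can also be confirmed numerically, e.g., for the illustrative matrix used above:

```matlab
A = [1 2 3 4; 2 3 4 5; 3 5 7 9];      % illustrative matrix, rank 2
[U, S, V] = svd(A);
disp(norm(A*V - U*S))                 % ~0: A*V = U*Sigma, as in (175)
disp(norm(A'*U - V*S'))               % ~0: A'*U = V*Sigma'
disp(norm(A*V(:,1) - S(1,1)*U(:,1)))  % ~0: A*v_1 = sigma_1*u_1, as in (176)
disp(norm(A*V(:,4)))                  % ~0: v_4 lies in the null space N(A)
```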
We now show that the optimal solution of the linear system ${\bf Ax}={\bf b}$ can be obtained based on the pseudo-inverse ${\bf A}^-$ of ${\bf A}$:

$$ {\bf x}_{svd}={\bf A}^-{\bf b}={\bf V}{\bf\Sigma}^-{\bf U}^T{\bf b} \qquad (177) $$

Specifically, if $M=N=R$, then ${\bf A}^-={\bf A}^{-1}$, and ${\bf x}_{svd}={\bf A}^{-1}{\bf b}$ is the unique and exact solution. In general, when $R<M$ or $R<N$, the solution may not exist, or it may not be unique, but ${\bf x}_{svd}$ is still an optimal solution in two ways, from the perspective of both the domain and the codomain of the linear mapping ${\bf A}:\;\mathbb{R}^N\rightarrow\mathbb{R}^M$, as shown below.
- In the domain $\mathbb{R}^N$:

  Pre-multiplying ${\bf V}^T$ on both sides of the pseudo-inverse solution ${\bf x}_{svd}={\bf V}{\bf\Sigma}^-{\bf U}^T{\bf b}$ given above, we get:

$$ {\bf V}^T{\bf x}_{svd}=[{\bf v}_1,\cdots,{\bf v}_N]^T{\bf x}_{svd}={\bf V}^T{\bf V}{\bf\Sigma}^-{\bf U}^T{\bf b}={\bf\Sigma}^-[{\bf u}_1,\cdots,{\bf u}_M]^T{\bf b} \qquad (178) $$

  or in component form:

$$ \left[\begin{array}{c}{\bf v}_1^T{\bf x}_{svd}\\ \vdots\\ {\bf v}_R^T{\bf x}_{svd}\\ {\bf v}_{R+1}^T{\bf x}_{svd}\\ \vdots\\ {\bf v}_N^T{\bf x}_{svd}\end{array}\right]=\left[\begin{array}{c}{\bf u}_1^T{\bf b}/\sigma_1\\ \vdots\\ {\bf u}_R^T{\bf b}/\sigma_R\\ 0\\ \vdots\\ 0\end{array}\right] \qquad (179) $$

  - The first $R$ components ${\bf v}_r^T{\bf x}_{svd}={\bf u}_r^T{\bf b}/\sigma_r$ $(r=1,\cdots,R)$ are the projections of ${\bf x}_{svd}$ onto $C({\bf A}^T)$ spanned by ${\bf v}_1,\cdots,{\bf v}_R$ corresponding to the non-zero singular values;
  - The last $N-R$ components ${\bf v}_r^T{\bf x}_{svd}=0$ $(r=R+1,\cdots,N)$ are the projections of ${\bf x}_{svd}$ onto $N({\bf A})$ spanned by ${\bf v}_{R+1},\cdots,{\bf v}_N$ corresponding to the zero singular values.
  We see that ${\bf x}_{svd}$ is entirely in the row space $C({\bf A}^T)$, containing no homogeneous component in $N({\bf A})$, i.e., ${\bf x}_{svd}$ has the minimum norm (closest to the origin) compared to any other possible solution ${\bf x}={\bf x}_{svd}+{\bf x}_h$, such as those found previously based on the rref of ${\bf A}$, containing a non-zero homogeneous component ${\bf x}_h\in N({\bf A})$:

$$ \Vert{\bf x}\Vert^2=\Vert{\bf x}_{svd}+{\bf x}_h\Vert^2=\Vert{\bf x}_{svd}\Vert^2+\Vert{\bf x}_h\Vert^2\ge\Vert{\bf x}_{svd}\Vert^2 \qquad (180) $$

  As ${\bf x}_{svd}\perp{\bf x}_h$, ${\bf x}_{svd}$ is the projection of any such solution ${\bf x}$ onto $C({\bf A}^T)$. The complete solution can be found as

$$ {\bf x}={\bf x}_{svd}+\sum_{i=R+1}^N c_i{\bf v}_i \qquad (181) $$

  for any set of coefficients $c_{R+1},\cdots,c_N$.
- In the codomain $\mathbb{R}^M$:

  We now consider the result ${\bf b}_{svd}={\bf A}{\bf x}_{svd}$ produced by the pseudo-inverse solution ${\bf x}_{svd}$, which, as a linear combination of the columns of ${\bf A}$, is in its column space $C({\bf A})$:

$$ {\bf b}_{svd}={\bf A}{\bf x}_{svd}={\bf U}{\bf\Sigma}{\bf V}^T{\bf V}{\bf\Sigma}^-{\bf U}^T{\bf b}={\bf U}{\bf\Sigma}{\bf\Sigma}^-{\bf U}^T{\bf b} \qquad (182) $$

  Pre-multiplying ${\bf U}^T$ on both sides we get:

$$ {\bf U}^T{\bf b}_{svd}={\bf\Sigma}{\bf\Sigma}^-{\bf U}^T{\bf b} \qquad (183) $$

  or in component form:

$$ \left[\begin{array}{c}{\bf u}_1^T{\bf b}_{svd}\\ \vdots\\ {\bf u}_R^T{\bf b}_{svd}\\ {\bf u}_{R+1}^T{\bf b}_{svd}\\ \vdots\\ {\bf u}_M^T{\bf b}_{svd}\end{array}\right]=\left[\begin{array}{c}{\bf u}_1^T{\bf b}\\ \vdots\\ {\bf u}_R^T{\bf b}\\ 0\\ \vdots\\ 0\end{array}\right] \qquad (184) $$

  We see that ${\bf b}_{svd}$ is entirely in the column space $C({\bf A})$. If ${\bf b}\in C({\bf A})$, then ${\bf b}_{svd}={\bf b}$ and ${\bf x}_{svd}$ is an exact solution. But if ${\bf b}\notin C({\bf A})$, then ${\bf b}_{svd}\ne{\bf b}$ and ${\bf x}_{svd}$ is not an exact solution. But as ${\bf b}_{svd}$ is the projection of ${\bf b}$ onto $C({\bf A})$, the error $\Vert{\bf b}_{svd}-{\bf b}\Vert$ is minimized in comparison to that of any other possible ${\bf b}'={\bf A}{\bf x}'\in C({\bf A})$:

$$ \Vert{\bf b}_{svd}-{\bf b}\Vert\le\Vert{\bf b}'-{\bf b}\Vert \qquad (185) $$

  We therefore see that ${\bf x}_{svd}$ is the optimal solution.
Summarizing the two aspects above, we see that the pseudo-inverse solution ${\bf x}_{svd}={\bf A}^-{\bf b}$ is optimal in the sense that both its norm $\Vert{\bf x}_{svd}\Vert$ and its error $\Vert{\bf A}{\bf x}_{svd}-{\bf b}\Vert$ are minimized, as demonstrated in the sketch after this list:

- If the solution is not unique because $R<N$, then the complete solution can be found by adding the entire null space to it: ${\bf x}={\bf x}_{svd}+N({\bf A})$.
- If no solution exists because ${\bf b}\notin C({\bf A})$, then ${\bf x}_{svd}$ is the optimal approximate solution with minimum error.
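Both optimality claims can be demonstrated numerically by comparing ${\bf x}_{svd}$ against another exact solution (shifted by a null-space component) and by checking the residual when the right-hand side is moved out of $C({\bf A})$. The matrix and vectors below are illustrative assumptions, not the data of the example that follows:

```matlab
A = [1 2 3 4; 2 3 4 5; 3 5 7 9];   % illustrative matrix, rank 2
[U, S, V] = svd(A);
% Case 1: b in C(A), so an exact solution exists
b  = A * [1; 1; 1; 1];             % constructed to lie in the column space
xs = pinv(A) * b;                  % x_svd, the pseudo-inverse solution
x2 = xs + 2*V(:,4);                % another exact solution (null-space shift)
disp(norm(A*xs - b))               % ~0: x_svd solves the system exactly
disp([norm(xs), norm(x2)])         % norm(xs) is the smaller: minimum norm
% Case 2: b' not in C(A), so no exact solution exists
bp = b + U(:,3);                   % u_3 spans N(A'), so bp is outside C(A)
xp = pinv(A) * bp;                 % optimal approximate solution
disp(norm(xp - xs))                % ~0: same result, since b is the projection of bp
disp(norm(A*xp - bp))              % = 1 = ||u_3||, the minimum possible residual
```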
Example: Given the same system ${\bf Ax}={\bf b}$ considered in the previous examples, we now solve it using the pseudo-inverse method. We first find the SVD of ${\bf A}$ in terms of the matrices ${\bf U}$, ${\bf\Sigma}$, and ${\bf V}$. The particular solution of the system with the given right-hand side ${\bf b}$ is ${\bf x}_{svd}={\bf A}^-{\bf b}$, which is in $C({\bf A}^T)$ spanned by the first two columns ${\bf v}_1$ and ${\bf v}_2$ of ${\bf V}$, perpendicular to $N({\bf A})$ spanned by the last two columns ${\bf v}_3$ and ${\bf v}_4$. Note that this solution ${\bf x}_{svd}$ is actually the first component, in $C({\bf A}^T)$, of the particular solution ${\bf x}_p$ found in the previous section by Gauss-Jordan elimination, which itself is not in $C({\bf A}^T)$. Adding the null space to the particular solution, we get the complete solution:

$$ {\bf x}={\bf x}_{svd}+c_3{\bf v}_3+c_4{\bf v}_4 $$
If the right-hand side is replaced by some ${\bf b}'\notin C({\bf A})$, no solution exists. However, we can still find the pseudo-inverse solution as the optimal approximate solution: ${\bf x}'_{svd}={\bf A}^-{\bf b}'$, which is the same as the solution for ${\bf b}$, indicating that ${\bf b}$ happens to be the projection of ${\bf b}'$ onto $C({\bf A})$ spanned by ${\bf u}_1$ and ${\bf u}_2$. The result produced by ${\bf x}'_{svd}$ is ${\bf b}_{svd}={\bf A}{\bf x}'_{svd}$, which is the projection of ${\bf b}'$ onto $C({\bf A})$, with a minimum error distance $\Vert{\bf b}_{svd}-{\bf b}'\Vert$, indicating that ${\bf x}'_{svd}$ is the optimal approximate solution.
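The projection property used in this example (and asked for again in the homework) takes only a few lines to check: projecting any particular solution onto $C({\bf A}^T)$, spanned by the first $R$ columns of ${\bf V}$, recovers ${\bf x}_{svd}$. A sketch with an assumed stand-in for a row-reduction solution:

```matlab
A = [1 2 3 4; 2 3 4 5; 3 5 7 9];   % illustrative matrix, rank 2
[U, S, V] = svd(A);
b  = A * [1; 1; 1; 1];
xs = pinv(A) * b;                  % x_svd lies in C(A')
xp = xs + 3*V(:,3) - V(:,4);       % stand-in particular solution with null-space parts
Vr = V(:, 1:2);                    % first R = 2 right singular vectors span C(A')
pr = Vr * (Vr' * xp);              % orthogonal projection of xp onto C(A')
disp(norm(pr - xs))                % ~0: the projection recovers x_svd
```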
Homework 3:

- Use the Matlab function svd(A) to carry out the SVD of the coefficient matrix ${\bf A}$ of the linear system in Homework 2 problem 3. Find ${\bf U}$, ${\bf\Sigma}$, and ${\bf V}$. Verify that ${\bf U}$ and ${\bf V}$ are orthogonal, i.e., ${\bf U}^T{\bf U}={\bf U}{\bf U}^T={\bf I}$ and ${\bf V}^T{\bf V}={\bf V}{\bf V}^T={\bf I}$.
- Obtain ${\bf\Sigma}^-$, i.e., find the transpose of ${\bf\Sigma}$ and then replace each non-zero singular value by its reciprocal. Verify that ${\bf\Sigma}^-{\bf\Sigma}\ne{\bf I}$ and ${\bf\Sigma}{\bf\Sigma}^-\ne{\bf I}$. Then find the pseudo-inverse ${\bf A}^-={\bf V}{\bf\Sigma}^-{\bf U}^T$.
- Use the Matlab function pinv(A) to find the pseudo-inverses of ${\bf A}$ and ${\bf A}^T$. Compare them to what you obtained in the previous part.
- Identify the bases of the four subspaces $C({\bf A})$, $N({\bf A}^T)$, $C({\bf A}^T)$, and $N({\bf A})$ based on ${\bf U}$ and ${\bf V}$. Verify that these bases and those you found previously (problem 3 of Homework 2) span the same spaces, i.e., the basis vectors of one basis can be written as a linear combination of those of the other, for each of the four subspaces.
- Use the pseudo-inverse ${\bf A}^-$ found above to solve the system ${\bf Ax}={\bf b}$. Find the two particular solutions ${\bf x}_1$ and ${\bf x}_2$ corresponding to two different right-hand sides ${\bf b}_1$ and ${\bf b}_2$.
- How are the two results ${\bf x}_1$ and ${\bf x}_2$ related to each other? Give your explanation. Why can't you find a solution when ${\bf b}={\bf b}_2$, but the SVD method can?
- Verify that ${\bf A}{\bf x}_1={\bf b}_1$. Also find the error (or residual) ${\bf r}_i={\bf b}_i-{\bf A}{\bf x}_i$ for each of the two results above. Verify that ${\bf r}_2\ne{\bf 0}$, if ${\bf b}_2\notin C({\bf A})$.
- In Homework 2 you used the row reduction method to solve the system, and you should have found a particular solution ${\bf x}_p$; another solution is also easy to see by inspection. Show that the projection of these solutions onto $C({\bf A}^T)$, spanned by the first two columns of ${\bf V}$, is the same as the particular solution ${\bf x}_{svd}$ found by the SVD method.
- Show that ${\bf x}_{svd}$ is the projection of ${\bf x}_p$ onto $C({\bf A}^T)$, the 2-D subspace spanned by the first two columns of ${\bf V}$. Can you also show that it is the projection of ${\bf x}_p$ onto $C({\bf A}^T)$ spanned by the basis you obtained in Homework 2 (not necessarily orthogonal)?
- Give an expression of the null space $N({\bf A})$, and then write the complete solution in the form ${\bf x}={\bf x}_{svd}+{\bf x}_h$ with ${\bf x}_h\in N({\bf A})$. Verify that any complete solution so generated satisfies ${\bf Ax}={\bf b}$.