In numerical methods, it is important to know how reliable the result
produced by a given algorithm is, i.e., how sensitive it is to the various
perturbations caused by the inevitable noise in the given data (e.g.,
observational error) and by computational errors in the system parameters
(e.g., truncation error).
The algorithm can also be considered as a generic system or function with
the given data as the input and the numerical result as the output. We can
therefore ask how well or ill conditioned the system is. In general, if a
small change in the input or the system parameters can cause a large
difference in the output, then the result is too sensitive to the
perturbations and therefore not reliable. Whether a problem is well or ill
conditioned (the system is well or ill behaved) can therefore be measured
based on such sensitivities:
- If a small change in the input causes a small change in the output,
the problem is well conditioned (well behaved).
- If a small change in the input can cause a large change in the output,
the problem is ill conditioned (ill behaved).
Specifically, how well or ill conditioned a problem is can be measured
quantitatively by the absolute or relative condition numbers
defined below:
- The absolute condition number is the upper bound of the ratio between
the change in the output and the change in the input:
$$\hat\kappa=\sup_{\delta x}\frac{\|\delta f\|}{\|\delta x\|} \tag{1}$$
- The relative condition number is the upper bound of the ratio between
the relative change in the output and the relative change in the input:
$$\kappa=\sup_{\delta x}\frac{\|\delta f\|/\|f\|}{\|\delta x\|/\|x\|} \tag{2}$$
Here $\|\cdot\|$ is the norm of a variable
(real or complex scalar, vector, matrix, function, etc.), a real scalar
representing its "size". In the following, we will mostly use the
2-norm $\|\mathbf{x}\|_2=(\sum_i|x_i|^2)^{1/2}$ for any vector $\mathbf{x}$
and the corresponding spectral norm
$\|\mathbf{A}\|_2=\sigma_{\max}(\mathbf{A})$, the greatest singular value,
for any matrix $\mathbf{A}$. Specifically, the spectral norm
satisfies the following properties:
$$\|\mathbf{A}\mathbf{x}\|_2\le\|\mathbf{A}\|_2\,\|\mathbf{x}\|_2,\qquad
\|\mathbf{A}\mathbf{B}\|_2\le\|\mathbf{A}\|_2\,\|\mathbf{B}\|_2 \tag{3}$$
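As a quick numerical check (an illustration added here, not part of the
original derivation), the following Matlab/Octave lines verify these norm
properties for a random matrix and vector:
% sanity check of the spectral norm properties in Eq. (3)
A = randn(4,4);      % random test matrix
B = randn(4,4);      % second random test matrix
x = randn(4,1);      % random test vector
nA = norm(A,2);      % spectral norm = greatest singular value of A
fprintf('norm(A*x)=%g <= norm(A)*norm(x)=%g\n', norm(A*x,2), nA*norm(x,2));
fprintf('norm(A*B)=%g <= norm(A)*norm(B)=%g\n', norm(A*B,2), nA*norm(B,2));
fprintf('norm(A,2)=%g, max singular value=%g\n', nA, max(svd(A)));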
As the relative condition number compares the normalized changes in both
the input and the output, its value is invariant to the specific units used
to measure them. The change in the input can be normalized either by the
original input $x$ or, equivalently for small perturbations, by the
perturbed input $x+\delta x$. The same is true for the change in the output.
Obviously, the smaller the condition number, the better conditioned
the problem is and the less sensitive the solution is to perturbations
of the data (error, noise, approximation, etc.); the larger the condition
number, the more ill conditioned the problem is.
In the following, we specifically consider the condition numbers
defined for some different systems.
System with single input and output variables
A system with a single input and a single output can be described
by a function $y=f(x)$ to be evaluated at a given input $x$. The
condition number of the function $f$ at the point $x$ is defined as the
greatest (worst-case) ratio between the relative changes of the output and
the input:
$$\kappa(x)=\lim_{\varepsilon\to 0}\,\sup_{|\delta x|\le\varepsilon}
\frac{|f(x+\delta x)-f(x)|\,/\,|f(x)|}{|\delta x|\,/\,|x|} \tag{4}$$
Here $\sup$ is the symbol for supremum (least upper bound) and $|x|$
is the norm of the scalar variable $x$, which is either the absolute
value of $x$ if $x$ is real or the modulus of $x$ if $x$ is complex.
Note that in general the condition number is a function of $x$.
When the perturbation $\delta x$ is small, the function can be approximated
by the first two terms of its Taylor expansion,
$$f(x+\delta x)\approx f(x)+f'(x)\,\delta x \tag{5}$$
i.e.,
$$\delta f=f(x+\delta x)-f(x)\approx f'(x)\,\delta x \tag{6}$$
Taking the absolute value or modulus on both sides, we get
$$|\delta f|\approx|f'(x)|\,|\delta x|=\hat\kappa\,|\delta x| \tag{7}$$
where $\hat\kappa=|f'(x)|$ is the absolute condition number.
Dividing both sides by $|f(x)|$, we get
$$\frac{|\delta f|}{|f(x)|}\approx\frac{|f'(x)|\,|x|}{|f(x)|}\,
\frac{|\delta x|}{|x|}=\kappa\,\frac{|\delta x|}{|x|} \tag{8}$$
where $\kappa$ is the relative condition number:
$$\kappa=\frac{|x\,f'(x)|}{|f(x)|} \tag{9}$$
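As an illustration (added here, with an assumed sample function
$f(x)=e^x$ that is not from the original text), the two condition numbers
in Eqs. (7) and (9) can be computed directly; for this function $\kappa=|x|$:
% condition numbers of the sample function f(x) = exp(x)
f  = @(x) exp(x);                     % assumed sample function
df = @(x) exp(x);                     % its derivative
x  = 10;
kappa_abs = abs(df(x));               % Eq. (7): |f'(x)|
kappa_rel = abs(x*df(x))/abs(f(x));   % Eq. (9): |x f'(x)/f(x)|, equal to |x| here
fprintf('absolute condition number: %g\n', kappa_abs);
fprintf('relative condition number: %g\n', kappa_rel);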
The discussion above is for the task of evaluating the output $y=f(x)$
of a system as a function of the input $x$. However, if the task is to
solve a given equation $f(x)=y$, for the purpose of finding $x$ treated
as the output that satisfies the equation with $y$ treated as the
input, the problem needs to be converted into the form of evaluating the
inverse function $g(y)=f^{-1}(y)$ at $y$. The absolute condition number of
this function $g(y)$ is
$$\hat\kappa_g=|g'(y)|=\left|\frac{d}{dy}f^{-1}(y)\right|
=\frac{1}{|f'(x)|} \tag{10}$$
which is the reciprocal of the absolute condition number of
the corresponding evaluation problem. We therefore see that the larger
$|f'(x)|$, or the smaller $1/|f'(x)|$, the better conditioned the
equation-solving problem is.
For example, for two functions $f_1$ and $f_2$ with $|f_1'(x)|>|f_2'(x)|$
at the point of interest, for the problem
of evaluating the function, $f_1$ is more ill-conditioned than $f_2$,
but for the problem of solving $f(x)=y$, $f_1$ is better conditioned
than $f_2$.
Example: Consider the following function and its derivative:
(11)
(12)
(13)
- At a first point $x_1$,
(14)
(15)
(16)
At this point, the problem of evaluating $f(x)$ is well-conditioned.
- At a second point $x_2$,
(17)
(18)
(19)
At this point, the function is ill-conditioned.
We see that the function is well-conditioned at $x_1$ but ill-conditioned
at $x_2$.
However, the problem of solving the equation $f(x)=0$ is very
ill-conditioned: in the neighborhood of the root, $1/|f'(x)|$
is very large, so any value of $x$ in a wide range could result in
$f(x)\approx 0$ and therefore be considered a solution, which is
certainly not reliable.
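To illustrate this behavior (with a stand-in function chosen here, not the
one used in the example above), consider a function that is nearly flat
around its root:
% a function that is nearly flat around its root x = 1 (a stand-in example)
f  = @(x) (x-1).^3;
df = @(x) 3*(x-1).^2;
x  = 1 + 1e-3;                              % a point near the root
fprintf('f(x) = %g\n', f(x));               % about 1e-9, essentially zero
fprintf('1/|f''(x)| = %g\n', 1/abs(df(x))); % huge absolute condition number
% every x in [0.99, 1.01] gives |f(x)| <= 1e-6, so a wide range of values
% could be accepted as a numerical root of f(x) = 0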
Systems with multiple input and output variables
The results above can be generalized to a system of multiple inputs
$\mathbf{x}=[x_1,\dots,x_n]^T$ and multiple outputs
$\mathbf{f}(\mathbf{x})=[f_1(\mathbf{x}),\dots,f_m(\mathbf{x})]^T$
represented by a function $\mathbf{f}:\mathbb{R}^n\to\mathbb{R}^m$. A change
$\delta\mathbf{x}$ in the input will cause a certain change in the output:
$$\delta\mathbf{f}=\mathbf{f}(\mathbf{x}+\delta\mathbf{x})
-\mathbf{f}(\mathbf{x}) \tag{20}$$
When $\|\delta\mathbf{x}\|$ is small, the output can be approximated by the
first two terms of its Taylor series expansion:
$$\mathbf{f}(\mathbf{x}+\delta\mathbf{x})\approx\mathbf{f}(\mathbf{x})
+\mathbf{J}(\mathbf{x})\,\delta\mathbf{x} \tag{21}$$
where $\mathbf{J}(\mathbf{x})=\partial\mathbf{f}/\partial\mathbf{x}$ with
elements $J_{ij}=\partial f_i/\partial x_j$ is the Jacobian matrix of the
function, and, using the norm property in Eq. (3), we get
$$\|\delta\mathbf{f}\|\approx\|\mathbf{J}(\mathbf{x})\,\delta\mathbf{x}\|
\le\hat\kappa\,\|\delta\mathbf{x}\| \tag{22}$$
where $\hat\kappa$ is the absolute condition number defined as:
$$\hat\kappa=\|\mathbf{J}(\mathbf{x})\| \tag{23}$$
Dividing both sides of the equation above by $\|\mathbf{f}(\mathbf{x})\|$,
we get
$$\frac{\|\delta\mathbf{f}\|}{\|\mathbf{f}(\mathbf{x})\|}
\le\frac{\|\mathbf{J}(\mathbf{x})\|\,\|\mathbf{x}\|}{\|\mathbf{f}(\mathbf{x})\|}\,
\frac{\|\delta\mathbf{x}\|}{\|\mathbf{x}\|}
=\kappa\,\frac{\|\delta\mathbf{x}\|}{\|\mathbf{x}\|} \tag{24}$$
where $\kappa$ is the relative condition number defined as:
$$\kappa=\frac{\|\mathbf{J}(\mathbf{x})\|\,\|\mathbf{x}\|}
{\|\mathbf{f}(\mathbf{x})\|} \tag{25}$$
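As a numerical illustration (a sketch added here, with an assumed example
function not taken from the text), the condition numbers in Eqs. (23) and
(25) can be estimated with a finite-difference Jacobian:
% condition numbers of a sample vector function f(x) = [x1*x2; x1+x2]
f = @(x) [x(1)*x(2); x(1)+x(2)];     % assumed example function
x = [2; 3];
n = numel(x);  m = numel(f(x));  h = 1e-6;
J = zeros(m,n);
for j = 1:n                          % finite-difference Jacobian, column by column
    e = zeros(n,1);  e(j) = h;
    J(:,j) = (f(x+e) - f(x)) / h;
end
kappa_abs = norm(J,2);                          % Eq. (23)
kappa_rel = norm(J,2)*norm(x,2)/norm(f(x),2);   % Eq. (25)
fprintf('absolute: %g, relative: %g\n', kappa_abs, kappa_rel);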
In particular, consider a linear system of $n$ inputs $\mathbf{x}$
and $n$ outputs $\mathbf{b}$ described by
$$\mathbf{A}\mathbf{x}=\mathbf{b} \tag{26}$$
where $\mathbf{A}$ is an invertible $n\times n$ matrix.
Consider the following two problems, both associated with this linear
system:
- Evaluation: given the input $\mathbf{x}$, evaluate the output
$\mathbf{b}=\mathbf{A}\mathbf{x}$.
- Equation solving: given the input $\mathbf{b}$, solve
$\mathbf{A}\mathbf{x}=\mathbf{b}$ for the output
$\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}$.
We see that both the evaluation and equation-solving problems
($\mathbf{b}=\mathbf{A}\mathbf{x}$ and $\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}$)
have the same relative condition number
$\kappa(\mathbf{A})=\|\mathbf{A}\|_2\,\|\mathbf{A}^{-1}\|_2$, as sketched below.
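The detailed derivation is not reproduced here; the following is a brief
sketch of the standard argument, using Eq. (25) with $\mathbf{J}=\mathbf{A}$
or $\mathbf{J}=\mathbf{A}^{-1}$ and the norm properties in Eq. (3):
$$\text{Evaluation }(\mathbf{b}=\mathbf{A}\mathbf{x}):\quad
\kappa=\frac{\|\mathbf{A}\|\,\|\mathbf{x}\|}{\|\mathbf{A}\mathbf{x}\|}
=\frac{\|\mathbf{A}\|\,\|\mathbf{A}^{-1}\mathbf{b}\|}{\|\mathbf{b}\|}
\le\|\mathbf{A}\|\,\|\mathbf{A}^{-1}\|$$
$$\text{Solving }(\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}):\quad
\kappa=\frac{\|\mathbf{A}^{-1}\|\,\|\mathbf{b}\|}{\|\mathbf{A}^{-1}\mathbf{b}\|}
=\frac{\|\mathbf{A}^{-1}\|\,\|\mathbf{A}\mathbf{x}\|}{\|\mathbf{x}\|}
\le\|\mathbf{A}\|\,\|\mathbf{A}^{-1}\|$$
In both cases the worst-case relative condition number is the same bound
$\kappa(\mathbf{A})=\|\mathbf{A}\|\,\|\mathbf{A}^{-1}\|$, which is therefore
taken as the condition number of the matrix $\mathbf{A}$ itself.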
Let the singular values of $\mathbf{A}$ be
$\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_n>0$;
then the singular values of $\mathbf{A}^{-1}$ are
$1/\sigma_n\ge\cdots\ge 1/\sigma_1$, and the spectral norms of the two
matrices are their respective greatest singular values:
$\|\mathbf{A}\|_2=\sigma_1=\sigma_{\max}$ and
$\|\mathbf{A}^{-1}\|_2=1/\sigma_n=1/\sigma_{\min}$. Now we have
$$\kappa(\mathbf{A})=\|\mathbf{A}\|_2\,\|\mathbf{A}^{-1}\|_2
=\frac{\sigma_{\max}}{\sigma_{\min}} \tag{40}$$
We see that the condition number of $\mathbf{A}$ is large if its
$\sigma_{\max}$ and $\sigma_{\min}$
are far apart in value, but it is small otherwise.
The condition number $\kappa(\mathbf{A})=\sigma_{\max}/\sigma_{\min}$
is a measurement of how close $\mathbf{A}$
is to singularity. When $\mathbf{A}$ is singular, one or more
of its singular values are zero, i.e., $\sigma_{\min}=0$, and then
$\kappa(\mathbf{A})=\infty$, i.e., any problem
associated with a singular matrix is worst conditioned. On the other
hand, as all singular values of an identity matrix are 1,
its condition number is $\kappa(\mathbf{I})=1$, the smallest condition
number possible, i.e., any problem associated with the identity matrix
is best conditioned.
In Matlab, the function cond(A,p)
generates the condition number
of matrix A based on its p-norm. When p=2, this is the ratio between
the greatest and smallest singular values. The following Matlab commands
all give the same result:
cond(A,2)
norm(A,2)*norm(inv(A),2)
s=svd(A); s(1)/s(end)
We now further consider the worst-case scenarios
for both evaluating
the matrix-vector product $\mathbf{b}=\mathbf{A}\mathbf{x}$
and solving the linear equation system
$\mathbf{A}\mathbf{x}=\mathbf{b}$. In the first case we
find $\mathbf{b}$ as the output given $\mathbf{x}$ as the input, and in
the second case we find $\mathbf{x}$ as the output given $\mathbf{b}$ as
the input. To do so, we first carry out the SVD decomposition of
$\mathbf{A}$ to get $\mathbf{A}=\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^T$,
with the singular values in
$\boldsymbol{\Sigma}=\mathrm{diag}(\sigma_1,\dots,\sigma_n)$ sorted in
descending order: $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_n$. The inverse of
$\mathbf{A}$ can also be SVD decomposed:
$\mathbf{A}^{-1}=\mathbf{V}\boldsymbol{\Sigma}^{-1}\mathbf{U}^T$.
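The detailed worst-case expressions are not reproduced here; the following
is a brief sketch of the standard argument for the evaluation problem (the
solving problem is analogous, with $\mathbf{A}^{-1}$ in place of
$\mathbf{A}$). If the input lies along the right singular vector of the
smallest singular value, $\mathbf{x}=\mathbf{v}_n$, while the perturbation
lies along that of the greatest, $\delta\mathbf{x}=\epsilon\,\mathbf{v}_1$,
then
$$\frac{\|\delta\mathbf{b}\|/\|\mathbf{b}\|}{\|\delta\mathbf{x}\|/\|\mathbf{x}\|}
=\frac{\|\mathbf{A}\,\delta\mathbf{x}\|\,\|\mathbf{x}\|}
{\|\mathbf{A}\mathbf{x}\|\,\|\delta\mathbf{x}\|}
=\frac{\epsilon\,\sigma_1\cdot 1}{\sigma_n\cdot\epsilon}
=\frac{\sigma_1}{\sigma_n}=\kappa(\mathbf{A})$$
i.e., the bound $\kappa(\mathbf{A})$ is attained in this worst case.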
Example 1: Consider solving the linear system
$\mathbf{A}\mathbf{x}=\mathbf{b}$ with
(45)
The singular values of $\mathbf{A}$ are $\sigma_1$ and $\sigma_2$,
and the condition number is
(46)
which is small, indicating this is a well-behaved system. Given two similar
inputs $\mathbf{b}$ and $\mathbf{b}'$ with a small difference
$\delta\mathbf{b}=\mathbf{b}'-\mathbf{b}$, we find the corresponding
solutions:
(47)
We have
(48)
and
(49)
Example 2: Consider the same problem with a different coefficient matrix:
(50)
The singular values of $\mathbf{A}$ are $\sigma_1$ and $\sigma_2$,
and the condition number is
(51)
indicating that the matrix $\mathbf{A}$ is close to singular and the system
is ill-conditioned. The solutions corresponding to the same two inputs
$\mathbf{b}$ and $\mathbf{b}'$ are
(52)
We can further find
(53)
and the condition number is:
(54)
We see that a small relative change in the input caused a huge relative
change in the output (2000 times greater).
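As a generic illustration of this contrast (using made-up matrices, not the
ones given in Examples 1 and 2), the following Matlab/Octave sketch compares
a well-conditioned and a nearly singular 2-by-2 system:
% contrast a well-conditioned and a nearly singular system (made-up matrices)
A1 = [2 1; 1 3];                  % well-conditioned matrix, cond about 2.6
A2 = [1 1; 1 1.0001];             % nearly singular matrix, cond about 4e4
b  = [1; 1];                      % original input (right-hand side)
db = 1e-4*[1; -1];                % small perturbation of the input
for C = {A1, A2}
    A  = C{1};
    x  = A\b;                     % solution for the original input
    xp = A\(b+db);                % solution for the perturbed input
    fprintf('cond = %10.4g  rel. input change = %.1e  rel. output change = %.1e\n', ...
            cond(A,2), norm(db)/norm(b), norm(xp-x)/norm(x));
end
For the nearly singular matrix, the relative change in the output is several
orders of magnitude greater than that in the input, as predicted by its
large condition number.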
Example 3: Consider
(55)
with
(56)
As $\mathbf{A}$ is nearly singular, its condition number is
large, indicating that any problem associated with this $\mathbf{A}$
is ill-conditioned. We will now specifically consider both problems
of evaluating and solving the linear system.
Associated with whether a problem is well or ill conditioned is the
issue of approximate solutions. For the problem of solving an
equation system $\mathbf{f}(\mathbf{x})=\mathbf{b}$, such as
$\mathbf{A}\mathbf{x}=\mathbf{b}$, there exist two types
of approximate solutions:
- The approximate solution $\hat{\mathbf{x}}$ corresponds to a small
residual in the image of the function:
$$\|\mathbf{f}(\hat{\mathbf{x}})-\mathbf{b}\|\approx 0 \tag{67}$$
- The approximate solution $\hat{\mathbf{x}}$ is close to the true
solution $\mathbf{x}^*$ satisfying $\mathbf{f}(\mathbf{x}^*)=\mathbf{b}$
(typically unknown) in the domain of the function:
$$\|\hat{\mathbf{x}}-\mathbf{x}^*\|\approx 0 \tag{68}$$
If the problem is well-conditioned, the two different criteria
are consistent, i.e., the two approximations are similar: whenever the
residual $\|\mathbf{f}(\hat{\mathbf{x}})-\mathbf{b}\|$ is small, the error
$\|\hat{\mathbf{x}}-\mathbf{x}^*\|$ is also small. However, if the problem
is ill-conditioned, the two criteria can be very different.
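The following Matlab/Octave sketch (with a made-up nearly singular matrix,
not the one in Example 3) demonstrates how a small residual can coexist with
a large error when the problem is ill-conditioned:
% a small residual does not imply a small error for an ill-conditioned system
A  = [1 1; 1 1.0001];             % nearly singular, cond(A) is about 4e4
xt = [1; 1];                      % "true" solution, assumed known for the demo
b  = A*xt;                        % corresponding right-hand side
xa = [2; 0.0001];                 % a trial approximate solution
fprintf('residual norm: %g\n', norm(A*xa - b));   % small: criterion (67) is met
fprintf('error norm   : %g\n', norm(xa - xt));    % large: criterion (68) fails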