The Newton-Raphson method discussed above for solving a single-variable equation can be generalized to the case of multivariate equation systems containing $M$ equations of $N$ variables in $\mathbf{x}=[x_1,\dots,x_N]^T$:

$$f_m(x_1,\dots,x_N)=0,\;\;\;(m=1,\dots,M),\qquad\text{or}\qquad \mathbf{f}(\mathbf{x})=\mathbf{0} \tag{95}$$
To solve the equation system, we first consider the Taylor series expansion of each of the $M$ functions in the neighborhood of the initial point $\mathbf{x}_0=[x_{01},\dots,x_{0N}]^T$:

$$f_m(\mathbf{x})=f_m(\mathbf{x}_0)+\sum_{n=1}^N\frac{\partial f_m(\mathbf{x}_0)}{\partial x_n}(x_n-x_{0n})+O\!\left(\|\mathbf{x}-\mathbf{x}_0\|^2\right),\;\;\;(m=1,\dots,M) \tag{96}$$
where $O\!\left(\|\mathbf{x}-\mathbf{x}_0\|^2\right)$ represents the second and higher order terms in the series beyond the linear term, which can be neglected if $\|\mathbf{x}-\mathbf{x}_0\|$ is small. These equations can be expressed in matrix form

$$\mathbf{f}(\mathbf{x})\approx\mathbf{f}(\mathbf{x}_0)+\mathbf{J}(\mathbf{x}_0)(\mathbf{x}-\mathbf{x}_0)=\mathbf{f}(\mathbf{x}_0)+\mathbf{J}(\mathbf{x}_0)\,\Delta\mathbf{x} \tag{97}$$

where $\Delta\mathbf{x}=\mathbf{x}-\mathbf{x}_0$, while $\mathbf{f}(\mathbf{x}_0)$ and $\mathbf{J}(\mathbf{x}_0)=\left[\partial f_m/\partial x_n\right]_{M\times N}$ are the function and its Jacobian matrix both evaluated at $\mathbf{x}_0$. We further consider solving the equation system $\mathbf{f}(\mathbf{x})=\mathbf{0}$ in the following two cases:
- $M=N$: The number of equations is the same as the number of unknowns, the Jacobian $\mathbf{J}$ is an $N\times N$ square matrix, and its inverse $\mathbf{J}^{-1}$ exists in general. In the special case where $\mathbf{f}(\mathbf{x})$ is linear, the Taylor series contains only the first two terms while all higher order terms are zero, and the approximation in Eq. (97) becomes exact. To find the root satisfying $\mathbf{f}(\mathbf{x})=\mathbf{0}$, we set $\mathbf{f}(\mathbf{x})$ in Eq. (97) to zero and solve the resulting equation for $\mathbf{x}$ to get

$$\mathbf{x}=\mathbf{x}_0-\mathbf{J}^{-1}(\mathbf{x}_0)\,\mathbf{f}(\mathbf{x}_0) \tag{98}$$
As in general $\mathbf{f}(\mathbf{x})$ is nonlinear, it can only be approximated by the first two terms of the Taylor series; consequently the result above is only an approximation of the optimal solution. But this approximation can be improved iteratively to approach the optimal solution $\mathbf{x}^*$:

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) \tag{99}$$
where we have defined $\Delta\mathbf{x}_n=\mathbf{x}_{n+1}-\mathbf{x}_n=-\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$ as the increment, which can also be denoted by $\mathbf{d}_n$ to represent the search or Newton direction. The iteration moves in the N-D space spanned by $x_1,\dots,x_N$ from some initial guess $\mathbf{x}_0$ along such a path that all function values $|f_m(\mathbf{x})|$ are reduced. Same as in the univariate case, a scaling factor $0<\delta\le 1$ can be used to control the step size of the iteration

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\delta\,\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) \tag{100}$$
When $\delta<1$, the step size becomes smaller and the convergence of the iteration is slower; however, we have a better chance of not skipping a solution, which may happen if $\mathbf{f}(\mathbf{x})$ is not smooth and the step size is too big.
The algorithm is listed below (a Python sketch follows the list):

- Select $\mathbf{x}_0$ and tolerance $tol$
- Obtain $\mathbf{f}(\mathbf{x}_0)$ and $\mathbf{J}(\mathbf{x}_0)$
- Obtain $\Delta\mathbf{x}_0=-\mathbf{J}^{-1}(\mathbf{x}_0)\,\mathbf{f}(\mathbf{x}_0)$
- $\mathbf{x}_1=\mathbf{x}_0+\Delta\mathbf{x}_0$, $n=1$
- While $\|\mathbf{f}(\mathbf{x}_n)\|>tol$ do
  - $\Delta\mathbf{x}_n=-\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$
  - $\mathbf{x}_{n+1}=\mathbf{x}_n+\Delta\mathbf{x}_n$, $n=n+1$
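Below is a minimal Python sketch of this iteration, assuming the caller supplies the function and its analytical Jacobian (the names `newton_system`, `f`, and `J`, as well as the test system, are our own illustration, not from the text); the optional scaling factor $\delta$ of Eq. (100) is included:

```python
import numpy as np

def newton_system(f, J, x0, delta=1.0, tol=1e-10, max_iter=100):
    """Solve f(x) = 0 with M = N by the Newton-Raphson iteration of Eqs. (99)-(100)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:        # stop when ||f(x)|| <= tol
            break
        # solve J(x) dx = -f(x) rather than forming the inverse explicitly
        dx = np.linalg.solve(J(x), -fx)
        x = x + delta * dx                  # scaled Newton step, Eq. (100)
    return x

# Example system: x^2 + y^2 - 4 = 0, x*y - 1 = 0
f = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
J = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(newton_system(f, J, [2.0, 0.5]))      # one root near (1.932, 0.518)
```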
- $M>N$: There are more equations than unknowns, i.e., equation $\mathbf{f}(\mathbf{x})=\mathbf{0}$ is an over-constrained system, and the Jacobian $\mathbf{J}$ is an $M\times N$ non-square matrix without an inverse, i.e., no solution exists for the equation $\mathbf{f}(\mathbf{x})=\mathbf{0}$ in general. But we can still seek to find an optimal solution $\mathbf{x}^*$ that minimizes the following sum-of-squares error:

$$\varepsilon(\mathbf{x})=\frac{1}{2}\,\|\mathbf{f}(\mathbf{x})\|^2=\frac{1}{2}\sum_{m=1}^M f_m^2(\mathbf{x}) \tag{101}$$
The gradient vector of $\varepsilon(\mathbf{x})$ is:

$$\mathbf{g}(\mathbf{x})=\nabla\varepsilon(\mathbf{x})=\frac{d}{d\mathbf{x}}\left[\frac{1}{2}\,\mathbf{f}^T(\mathbf{x})\,\mathbf{f}(\mathbf{x})\right] \tag{102}$$

The nth component of $\mathbf{g}(\mathbf{x})$ is

$$g_n(\mathbf{x})=\frac{\partial\varepsilon(\mathbf{x})}{\partial x_n}=\sum_{m=1}^M f_m(\mathbf{x})\,\frac{\partial f_m(\mathbf{x})}{\partial x_n}=\sum_{m=1}^M f_m(\mathbf{x})\,J_{mn},\;\;\;(n=1,\dots,N) \tag{103}$$
where $J_{mn}=\partial f_m/\partial x_n$ is the component in the mth row and nth column of the Jacobian matrix $\mathbf{J}$ of $\mathbf{f}(\mathbf{x})$. Now the gradient can be written as

$$\mathbf{g}(\mathbf{x})=\nabla\varepsilon(\mathbf{x})=\mathbf{J}^T(\mathbf{x})\,\mathbf{f}(\mathbf{x}) \tag{104}$$
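To make Eq. (104) concrete, the following fragment (our own illustration; the over-determined example system is arbitrary) compares the analytical gradient $\mathbf{J}^T\mathbf{f}$ with a central-difference gradient of $\varepsilon(\mathbf{x})$:

```python
import numpy as np

# Over-determined system: M = 3 equations, N = 2 unknowns
f = lambda v: np.array([v[0] + v[1] - 1.0, v[0]**2 - v[1], v[0]*v[1] + 2.0])
J = lambda v: np.array([[1.0, 1.0], [2*v[0], -1.0], [v[1], v[0]]])
eps = lambda v: 0.5 * np.dot(f(v), f(v))   # sum-of-squares error, Eq. (101)

x = np.array([0.3, -0.7])
g_analytic = J(x).T @ f(x)                  # gradient by Eq. (104)

h = 1e-6                                    # central-difference check of the gradient
g_numeric = np.array([(eps(x + h*e) - eps(x - h*e)) / (2*h) for e in np.eye(2)])
print(g_analytic, g_numeric)                # the two should agree closely
```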
If $\mathbf{f}(\mathbf{x})$ is linear and can therefore be represented as the sum of the first two terms of its Taylor series in Eq. (97), then the gradient is:

$$\mathbf{g}(\mathbf{x})=\mathbf{J}^T\left[\mathbf{f}(\mathbf{x}_0)+\mathbf{J}\,\Delta\mathbf{x}\right] \tag{105}$$
where $\mathbf{x}_0$ is any chosen initial guess. If we assume $\mathbf{x}$ is the optimal solution at which $\varepsilon(\mathbf{x})$ is minimized and $\mathbf{g}(\mathbf{x})$ is zero:

$$\mathbf{g}(\mathbf{x})=\mathbf{J}^T\left[\mathbf{f}(\mathbf{x}_0)+\mathbf{J}\,\Delta\mathbf{x}\right]=\mathbf{0} \tag{106}$$

then the increment $\Delta\mathbf{x}$ can be found by solving the equation above:

$$\Delta\mathbf{x}=-\left(\mathbf{J}^T\mathbf{J}\right)^{-1}\mathbf{J}^T\,\mathbf{f}(\mathbf{x}_0)=-\mathbf{J}^-\,\mathbf{f}(\mathbf{x}_0) \tag{107}$$
Here $\mathbf{J}^-=\left(\mathbf{J}^T\mathbf{J}\right)^{-1}\mathbf{J}^T$ is the pseudo-inverse of the non-square matrix $\mathbf{J}$. Now the optimal solution can be found as:

$$\mathbf{x}=\mathbf{x}_0+\Delta\mathbf{x}=\mathbf{x}_0-\mathbf{J}^-\,\mathbf{f}(\mathbf{x}_0) \tag{108}$$
However, as $\mathbf{f}(\mathbf{x})$ is nonlinear in general, the sum of the first two terms of its Taylor series is only an approximation. Consequently the result above is not the optimal solution, but an estimate which can be improved by carrying out this step iteratively:

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\mathbf{J}^-(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)=\mathbf{x}_n-\left(\mathbf{J}^T\mathbf{J}\right)^{-1}\mathbf{J}^T\,\mathbf{f}(\mathbf{x}_n) \tag{109}$$
This iteration will converge to $\mathbf{x}^*$ at which $\mathbf{g}(\mathbf{x}^*)=\mathbf{J}^T(\mathbf{x}^*)\,\mathbf{f}(\mathbf{x}^*)=\mathbf{0}$, and the squared error $\varepsilon(\mathbf{x}^*)$ is minimized. In particular, for a linear equation system $\mathbf{f}(\mathbf{x})=\mathbf{A}\mathbf{x}-\mathbf{b}=\mathbf{0}$, the Jacobian is simply $\mathbf{J}=\mathbf{A}$, and the optimal solution is

$$\mathbf{x}=\mathbf{x}_0-\mathbf{A}^-\left(\mathbf{A}\mathbf{x}_0-\mathbf{b}\right)=\mathbf{x}_0-\mathbf{x}_0+\mathbf{A}^-\mathbf{b}=\left(\mathbf{A}^T\mathbf{A}\right)^{-1}\mathbf{A}^T\,\mathbf{b} \tag{110}$$

i.e., the optimal solution can be found from any initial guess in a single step. This result is the same as that in Eq. (7).
Comparing Eqs. (99) and (109), we see that the two algorithms are essentially the same, with the only difference that the regular inverse $\mathbf{J}^{-1}$ is used when $M=N$, but the pseudo-inverse $\mathbf{J}^-=\left(\mathbf{J}^T\mathbf{J}\right)^{-1}\mathbf{J}^T$ is used when $M>N$ and $\mathbf{J}^{-1}$ does not exist.
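The $M>N$ iteration of Eq. (109) can be sketched in Python as follows (again our own illustration; `newton_pseudoinverse` and the over-determined test system are assumptions). Note that solving the least-squares subproblem with `np.linalg.lstsq` is numerically preferable to forming $(\mathbf{J}^T\mathbf{J})^{-1}$ explicitly:

```python
import numpy as np

def newton_pseudoinverse(f, J, x0, tol=1e-10, max_iter=100):
    """Minimize ||f(x)||^2 for M > N by iterating Eq. (109)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx, Jx = f(x), J(x)
        # dx = -J^- f(x); lstsq solves min ||Jx dx + fx|| without forming (J^T J)^{-1}
        dx = np.linalg.lstsq(Jx, -fx, rcond=None)[0]
        if np.linalg.norm(dx) < tol:
            break
        x = x + dx
    return x

# Over-determined example: 3 equations in 2 unknowns
f = lambda v: np.array([v[0] + v[1] - 2.0, v[0] - v[1], v[0]**2 + v[1]**2 - 2.5])
J = lambda v: np.array([[1.0, 1.0], [1.0, -1.0], [2*v[0], 2*v[1]]])
print(newton_pseudoinverse(f, J, [2.0, 0.5]))   # least-squares solution near (1, 1)
```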
The Newton-Raphson method assumes the availability of the analytical expressions of all partial derivatives $\partial f_m/\partial x_n$ in the Jacobian matrix $\mathbf{J}$. However, when this is not the case, they need to be approximated by the forward or central difference (secant) method:

$$J_{mn}=\frac{\partial f_m(\mathbf{x})}{\partial x_n}\approx\frac{f_m(\mathbf{x}+\delta\,\mathbf{e}_n)-f_m(\mathbf{x})}{\delta}\qquad\text{or}\qquad J_{mn}\approx\frac{f_m(\mathbf{x}+\delta\,\mathbf{e}_n)-f_m(\mathbf{x}-\delta\,\mathbf{e}_n)}{2\delta}$$

where $\delta$ is a small increment and $\mathbf{e}_n$ is the nth standard basis vector.
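A possible implementation of this approximation (a sketch of ours; `numerical_jacobian` and its defaults are not from the text):

```python
import numpy as np

def numerical_jacobian(f, x, delta=1e-6, central=True):
    """Approximate the M x N Jacobian of f at x by finite differences."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    Jm = np.zeros((fx.size, x.size))
    for n in range(x.size):
        e = np.zeros_like(x)
        e[n] = delta
        if central:
            Jm[:, n] = (f(x + e) - f(x - e)) / (2 * delta)  # central difference
        else:
            Jm[:, n] = (f(x + e) - fx) / delta              # forward difference
    return Jm
```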
Example 1

Example 2

With one initial guess $\mathbf{x}_0$, we get a root; with another initial guess, we get another root.
Broyden's method
In the Newton-Raphson method, two main operations are carried out in each iteration: (a) evaluate the Jacobian matrix $\mathbf{J}_n=\mathbf{J}(\mathbf{x}_n)$, and (b) obtain its inverse $\mathbf{J}_n^{-1}$. To avoid these expensive operations, we can consider using Broyden's method, one of the quasi-Newton methods, which approximates the inverse of the Jacobian $\mathbf{J}_{n+1}^{-1}$ from the $\mathbf{J}_n^{-1}$ in the previous iteration step, so that it can be updated iteratively from the initial $\mathbf{J}_0^{-1}=\mathbf{J}^{-1}(\mathbf{x}_0)$.
We first consider, in the single-variable case, how to estimate the next derivative $f'_{n+1}=f'(x_{n+1})$ from the current $f'_n=f'(x_n)$ by the secant method:

$$f'_{n+1}\approx\frac{f_{n+1}-f_n}{x_{n+1}-x_n}=\frac{\Delta f_n}{\Delta x_n}=f'_n+\frac{\Delta f_n-f'_n\,\Delta x_n}{\Delta x_n} \tag{113}$$

where $\Delta f_n=f(x_{n+1})-f(x_n)$ and $\Delta x_n=x_{n+1}-x_n$. The equation above indicates that the derivative in the $(n+1)$th step can be estimated by adding the estimated increment $(\Delta f_n-f'_n\,\Delta x_n)/\Delta x_n$ to the derivative $f'_n$ in the current $n$th step. Having obtained $f'_{n+1}$, we can use the same iteration as in the Newton-Raphson method to find $x_{n+2}$:

$$x_{n+2}=x_{n+1}-\frac{f(x_{n+1})}{f'_{n+1}} \tag{114}$$
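In Python, this single-variable version can be sketched as follows (our own illustration; the quadratic test function is arbitrary):

```python
def broyden_scalar(f, x0, x1, tol=1e-12, max_iter=100):
    """Quasi-Newton iteration for a single variable, per Eqs. (113)-(114):
    the derivative is re-estimated by the secant slope at each step."""
    fp = (f(x1) - f(x0)) / (x1 - x0)   # initial derivative estimate
    for _ in range(max_iter):
        fx = f(x1)
        if abs(fx) < tol:
            break
        x2 = x1 - fx / fp              # Newton-like step, Eq. (114)
        if x2 == x1:                   # no progress; avoid division by zero
            break
        fp = (f(x2) - fx) / (x2 - x1)  # secant update of f', Eq. (113)
        x1 = x2
    return x1

print(broyden_scalar(lambda x: x**2 - 2.0, 1.0, 2.0))  # converges to sqrt(2)
```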
This method for the single-variable case can be generalized to the multivariate case for solving $\mathbf{f}(\mathbf{x})=\mathbf{0}$. Following the way we estimate the increment of the derivative of a single-variable function in Eq. (113), here we can estimate the increment of the Jacobian of a multivariate function:

$$\mathbf{J}_{n+1}=\mathbf{J}_n+\frac{\left(\Delta\mathbf{f}_n-\mathbf{J}_n\,\Delta\mathbf{x}_n\right)\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n} \tag{115}$$

where $\Delta\mathbf{f}_n=\mathbf{f}(\mathbf{x}_{n+1})-\mathbf{f}(\mathbf{x}_n)$ and $\Delta\mathbf{x}_n=\mathbf{x}_{n+1}-\mathbf{x}_n$. Now in each iteration, we can update the estimated Jacobian as well as the estimated root:

$$\mathbf{x}_{n+1}=\mathbf{x}_n-\mathbf{J}_n^{-1}\,\mathbf{f}(\mathbf{x}_n),\qquad \mathbf{J}_{n+1}=\mathbf{J}_n+\frac{\left(\Delta\mathbf{f}_n-\mathbf{J}_n\,\Delta\mathbf{x}_n\right)\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n} \tag{116}$$
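The rank-one update of Eq. (115) takes only a line of NumPy (a sketch of ours; `broyden_update` is a name we introduce):

```python
import numpy as np

def broyden_update(Jn, dx, df):
    """Rank-one Broyden update of the Jacobian estimate, Eq. (115),
    where dx = x_{n+1} - x_n and df = f(x_{n+1}) - f(x_n)."""
    return Jn + np.outer(df - Jn @ dx, dx) / (dx @ dx)
```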
The algorithm can be further improved so that the inverse of the Jacobian is avoided. Specifically, consider the inverse Jacobian:

$$\mathbf{J}_{n+1}^{-1}=\left[\mathbf{J}_n+\frac{\left(\Delta\mathbf{f}_n-\mathbf{J}_n\,\Delta\mathbf{x}_n\right)\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n}\right]^{-1} \tag{117}$$

We can apply the Sherman-Morrison formula:

$$\left(\mathbf{A}+\mathbf{u}\mathbf{v}^T\right)^{-1}=\mathbf{A}^{-1}-\frac{\mathbf{A}^{-1}\mathbf{u}\mathbf{v}^T\mathbf{A}^{-1}}{1+\mathbf{v}^T\mathbf{A}^{-1}\mathbf{u}} \tag{118}$$

to the right-hand side of the equation above by defining

$$\mathbf{A}=\mathbf{J}_n,\qquad \mathbf{u}=\frac{\Delta\mathbf{f}_n-\mathbf{J}_n\,\Delta\mathbf{x}_n}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n},\qquad \mathbf{v}=\Delta\mathbf{x}_n \tag{119}$$

and rewrite it as:

$$\mathbf{J}_{n+1}^{-1}=\mathbf{J}_n^{-1}+\frac{\left(\Delta\mathbf{x}_n-\mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n\right)\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}}{\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n} \tag{120}$$
We see that the next $\mathbf{J}_{n+1}^{-1}$ can be iteratively estimated directly from the previous $\mathbf{J}_n^{-1}$, thereby avoiding computing the inverse of $\mathbf{J}_{n+1}$ altogether. The algorithm is listed below (a Python sketch follows the list):

- Select $\mathbf{x}_0$ and tolerance $tol$
- Find $\mathbf{f}(\mathbf{x}_0)$, $\mathbf{J}_0=\mathbf{J}(\mathbf{x}_0)$, and $\mathbf{J}_0^{-1}$
- $\Delta\mathbf{x}_0=-\mathbf{J}_0^{-1}\,\mathbf{f}(\mathbf{x}_0)$
- $\mathbf{x}_1=\mathbf{x}_0+\Delta\mathbf{x}_0$, $n=0$
- While $\|\mathbf{f}(\mathbf{x}_{n+1})\|>tol$ do
  - $\Delta\mathbf{f}_n=\mathbf{f}(\mathbf{x}_{n+1})-\mathbf{f}(\mathbf{x}_n)$, update $\mathbf{J}_{n+1}^{-1}$ by Eq. (120)
  - $\Delta\mathbf{x}_{n+1}=-\mathbf{J}_{n+1}^{-1}\,\mathbf{f}(\mathbf{x}_{n+1})$
  - $\mathbf{x}_{n+2}=\mathbf{x}_{n+1}+\Delta\mathbf{x}_{n+1}$, $n=n+1$
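Putting the pieces together, here is a minimal Python sketch of the whole algorithm (the names and the test system are our own; the analytical Jacobian is needed only at the initial point):

```python
import numpy as np

def broyden(f, x0, J0, tol=1e-10, max_iter=200):
    """Broyden's method with the inverse-Jacobian update of Eq. (120).
    J0 is the Jacobian at x0; only its inverse is ever formed."""
    x = np.asarray(x0, dtype=float)
    Jinv = np.linalg.inv(J0)                  # initial inverse Jacobian
    fx = f(x)
    for _ in range(max_iter):
        if np.linalg.norm(fx) < tol:
            break
        dx = -Jinv @ fx                       # quasi-Newton step
        x_new = x + dx
        df = f(x_new) - fx                    # Delta f_n
        # Sherman-Morrison update of the inverse Jacobian, Eq. (120)
        Jinv = Jinv + np.outer(dx - Jinv @ df, dx @ Jinv) / (dx @ (Jinv @ df))
        x, fx = x_new, f(x_new)
    return x

# Same test system as before: x^2 + y^2 = 4, x*y = 1
f = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
J0 = np.array([[4.0, 1.0], [0.5, 2.0]])       # Jacobian at the initial guess (2, 0.5)
print(broyden(f, [2.0, 0.5], J0))             # converges to a root near (1.932, 0.518)
```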