We note that the iteration $x_{n+1} = x_n - f'(x_n)/f''(x_n)$ is just Eq. (NewtonRaphson1d) in the chapter on solving equations: the Newton-Raphson method applied to solving the equation $f'(x) = 0$, the necessary condition for an extremum of $f(x)$.
Newton's method for the minimization of a single-variable function $f(x)$ can be generalized to the minimization of a multivariable function $f(\mathbf{x}) = f(x_1,\ldots,x_N)$. Again, we approximate $f(\mathbf{x})$ by a quadratic function $g(\mathbf{x})$ containing only the first three terms of its Taylor series at some initial point $\mathbf{x}_0$:
$$f(\mathbf{x}) \approx g(\mathbf{x}) = f(\mathbf{x}_0) + \mathbf{g}_0^T(\mathbf{x}-\mathbf{x}_0) + \frac{1}{2}(\mathbf{x}-\mathbf{x}_0)^T \mathbf{H}_0 (\mathbf{x}-\mathbf{x}_0) \qquad (12)$$
where $\mathbf{g}_0$ and $\mathbf{H}_0$ are respectively the gradient vector and the Hessian matrix of the function $f(\mathbf{x})$ at $\mathbf{x}_0$:
$$\mathbf{g}_0 = \mathbf{g}(\mathbf{x}_0) = \left.\frac{d}{d\mathbf{x}} f(\mathbf{x})\right|_{\mathbf{x}_0}
= \left[\frac{\partial f}{\partial x_1},\ldots,\frac{\partial f}{\partial x_N}\right]^T_{\mathbf{x}_0},
\qquad
\mathbf{H}_0 = \mathbf{H}(\mathbf{x}_0) = \left[\frac{\partial^2 f}{\partial x_i \partial x_j}\right]_{\mathbf{x}_0} \qquad (13)$$
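As a concrete illustration, the gradient and Hessian in Eq. (13) can be approximated numerically by central differences when analytical derivatives are inconvenient. The sketch below is only an illustration; the test function and the evaluation point are made-up examples, not taken from the text.

\begin{verbatim}
import numpy as np

def gradient(f, x, h=1e-5):
    """Central-difference approximation of the gradient of f at x (Eq. 13)."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x (Eq. 13)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

# Made-up test function: f(x, y) = x^2 + 2 y^2 + x y
f = lambda x: x[0]**2 + 2*x[1]**2 + x[0]*x[1]
x0 = np.array([1.0, -1.0])
print(gradient(f, x0))   # approximately [ 1, -3]
print(hessian(f, x0))    # approximately [[2, 1], [1, 4]]
\end{verbatim}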
The stationary point of $g(\mathbf{x})$ can be found from any initial guess $\mathbf{x}_0$ by setting the derivative of $g(\mathbf{x})$ to zero:
$$\frac{d}{d\mathbf{x}} g(\mathbf{x}) = \mathbf{g}_0 + \mathbf{H}_0 (\mathbf{x} - \mathbf{x}_0) = \mathbf{0} \qquad (14)$$
and solving the resulting equation to get
$$\mathbf{x} = \mathbf{x}_0 - \mathbf{H}_0^{-1} \mathbf{g}_0 \qquad (15)$$
Similar to the single-variable case, in which the stationary point $x^*$ is either a minimum or a maximum depending on whether the second-order derivative $f''(x^*)$ is greater or smaller than zero, in the multivariable case the function value $f(\mathbf{x}^*)$ at the stationary point is a maximum, a minimum, or a saddle point, depending on the curvature of the function as represented by the Hessian matrix $\mathbf{H} = \mathbf{H}(\mathbf{x}^*)$ evaluated at $\mathbf{x}^*$ (a numerical check of these cases is sketched after the list below):
- If $\mathbf{H}$ is positive definite (all eigenvalues are positive), then $f(\mathbf{x})$ is convex and $f(\mathbf{x}^*)$ is the global minimum;
- If $\mathbf{H}$ is negative definite (all eigenvalues are negative), then $f(\mathbf{x})$ is concave and $f(\mathbf{x}^*)$ is the global maximum;
- If $\mathbf{H}$ is indefinite (with both positive and negative eigenvalues), then $f(\mathbf{x})$ is neither convex nor concave, and $f(\mathbf{x}^*)$ is a saddle point (a maximum in some directions but a minimum in others).
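The three cases above can be distinguished numerically by computing the eigenvalues of the Hessian. A minimal sketch, using made-up example matrices rather than any Hessian from the text:

\begin{verbatim}
import numpy as np

def classify_hessian(H, tol=1e-12):
    """Classify a symmetric Hessian by the signs of its eigenvalues."""
    eig = np.linalg.eigvalsh(H)     # eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "positive definite: the stationary point is a minimum"
    if np.all(eig < -tol):
        return "negative definite: the stationary point is a maximum"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "indefinite: the stationary point is a saddle point"
    return "semi-definite: higher-order terms are needed to decide"

print(classify_hessian(np.array([[2.0, 0.0], [0.0, 3.0]])))    # minimum
print(classify_hessian(np.array([[2.0, 0.0], [0.0, -3.0]])))   # saddle point
\end{verbatim}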
If $f(\mathbf{x})$ is quadratic and convex, the optimization problem is called convex programming, and its minimum can be easily obtained. (Equivalently, if $f(\mathbf{x})$ is quadratic and concave, its maximum can be easily obtained.)
But if $f(\mathbf{x})$ is not quadratic, the result above is in general not the minimum or maximum; it can still be used as an approximation of the solution, however, and improved iteratively:
$$\mathbf{x}_{n+1} = \mathbf{x}_n - \mathbf{H}_n^{-1} \mathbf{g}_n = \mathbf{x}_n + \Delta\mathbf{x}_n \qquad (16)$$
where $\mathbf{g}_n = \mathbf{g}(\mathbf{x}_n)$, $\mathbf{H}_n = \mathbf{H}(\mathbf{x}_n)$, and the increment $\Delta\mathbf{x}_n = -\mathbf{H}_n^{-1}\mathbf{g}_n$ is called the Newton search direction. We note that this iteration is just a generalization of $x_{n+1} = x_n - f'(x_n)/f''(x_n)$ in the 1-D case, where the gradient $\mathbf{g}$ generalizes $f'(x)$ and the Hessian $\mathbf{H}$ generalizes $f''(x)$; the iteration $\mathbf{x}_{n+1} = \mathbf{x}_n - \mathbf{H}_n^{-1}\mathbf{g}_n$ above is just Eq. (NRiterationMeqN) in the chapter on solving equations, the Newton-Raphson method applied to solving the equation $\mathbf{g}(\mathbf{x}) = \mathbf{0}$, the necessary condition for an extremum of $f(\mathbf{x})$. The computational complexity of each iteration is $O(N^3)$ due to the inverse operation $\mathbf{H}_n^{-1}$.
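A minimal sketch of the iteration in Eq. (16), assuming the gradient and Hessian are supplied as callables; the quadratic test function at the bottom is a made-up example, not one from the text. Note that the Newton step is obtained by solving the linear system $\mathbf{H}_n \Delta\mathbf{x}_n = -\mathbf{g}_n$ rather than by forming $\mathbf{H}_n^{-1}$ explicitly, which is cheaper and numerically safer.

\begin{verbatim}
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-8, max_iter=100):
    """Newton iteration x_{n+1} = x_n - H_n^{-1} g_n of Eq. (16)."""
    x = np.asarray(x0, dtype=float)
    for n in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:           # gradient ~ 0: stationary point
            break
        dx = np.linalg.solve(hess(x), -g)     # Newton search direction
        x = x + dx
    return x

# Made-up example: f(x, y) = (x - 1)^2 + 10 (y + 2)^2
grad = lambda x: np.array([2*(x[0] - 1), 20*(x[1] + 2)])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 20.0]])
print(newton_minimize(grad, hess, [0.0, 0.0]))   # converges to [1, -2] in one step
\end{verbatim}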
Example 1:
An over-constrained nonlinear equation system $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ of $M$ equations and $N < M$ unknowns can be solved by minimizing the following sum of squared errors (SSE):
$$\varepsilon(\mathbf{x}) = \frac{1}{2}\|\mathbf{f}(\mathbf{x})\|^2 = \frac{1}{2}\sum_{m=1}^M f_m^2(\mathbf{x}) \qquad (17)$$
To use the Newton-Raphson method, we first find the gradient of the error function:
$$\mathbf{g}_\varepsilon(\mathbf{x}) = \frac{d}{d\mathbf{x}}\varepsilon(\mathbf{x}) = \mathbf{J}^T \mathbf{f}(\mathbf{x}) \qquad (18)$$
of which the $i$th component is
$$g_i = \frac{\partial \varepsilon(\mathbf{x})}{\partial x_i}
= \sum_{m=1}^M f_m(\mathbf{x})\,\frac{\partial f_m(\mathbf{x})}{\partial x_i}
= \sum_{m=1}^M J_{mi}\, f_m(\mathbf{x}) \qquad (19)$$
where $J_{mi} = \partial f_m(\mathbf{x})/\partial x_i$ is the component of the function's Jacobian $\mathbf{J}$ in the $m$th row and $i$th column.
We further find the component of the Hessian $\mathbf{H}_\varepsilon$ of the error function in the $i$th row and $j$th column:
$$H_{ij} = \frac{\partial^2 \varepsilon(\mathbf{x})}{\partial x_i \partial x_j}
= \sum_{m=1}^M \left( \frac{\partial f_m}{\partial x_i}\frac{\partial f_m}{\partial x_j}
+ f_m(\mathbf{x})\,\frac{\partial^2 f_m}{\partial x_i \partial x_j} \right)
\approx \sum_{m=1}^M \frac{\partial f_m}{\partial x_i}\frac{\partial f_m}{\partial x_j}
= \sum_{m=1}^M J_{mi} J_{mj} \qquad (20)$$
Here $J_{mj} = \partial f_m(\mathbf{x})/\partial x_j$ is the component of the Jacobian $\mathbf{J}$ in the $m$th row and $j$th column, and we have dropped the second-order terms $f_m(\mathbf{x})\,\partial^2 f_m/\partial x_i \partial x_j$.
Now the Hessian matrix can be written as
$$\mathbf{H}_\varepsilon \approx \mathbf{J}^T \mathbf{J} \qquad (21)$$
Based on the Newton-Raphson method, the optimal solution $\mathbf{x}^*$ that minimizes $\varepsilon(\mathbf{x})$ can be found iteratively:
$$\mathbf{x}_{n+1} = \mathbf{x}_n - \mathbf{H}_\varepsilon^{-1}\mathbf{g}_\varepsilon
= \mathbf{x}_n - (\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T \mathbf{f}(\mathbf{x}_n) \qquad (22)$$
This is the same as the pseudo-inverse method in Eq. (NRiterationMneqN) in the chapter on solving equations.
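The iteration of Eq. (22), commonly known as the Gauss-Newton method, can be sketched as below, assuming the residual function $\mathbf{f}(\mathbf{x})$ and its Jacobian $\mathbf{J}(\mathbf{x})$ are available as callables. The over-constrained test problem at the end (fitting $y = a e^{bt}$ to three data points) is a made-up example, not one from the text.

\begin{verbatim}
import numpy as np

def gauss_newton(f, jac, x0, tol=1e-10, max_iter=100):
    """Iteration x_{n+1} = x_n - (J^T J)^{-1} J^T f(x_n) of Eq. (22)."""
    x = np.asarray(x0, dtype=float)
    for n in range(max_iter):
        r = f(x)                                   # residual vector f(x_n)
        J = jac(x)                                 # M x N Jacobian
        dx = np.linalg.solve(J.T @ J, -J.T @ r)    # solve the normal equations
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Made-up example: fit y = a exp(b t) to three points (M = 3 equations, N = 2 unknowns)
t = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.7, 7.4])
f   = lambda x: x[0] * np.exp(x[1] * t) - y
jac = lambda x: np.column_stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)])
print(gauss_newton(f, jac, [1.0, 0.8]))   # converges to (a, b) close to (1, 1)
\end{verbatim}

Because the second-order terms of Eq. (20) are dropped, only first derivatives (the Jacobian) are needed in this iteration.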
Example 2:
Consider a quadratic function of two variables, together with its gradient and Hessian. Fixing two of its coefficients and varying the third, we obtain the quadratic surfaces shown in Figure 2.2:
- For the first three choices of the varying coefficient, the Hessian $\mathbf{H}$ is positive definite, so the stationary point $\mathbf{x}^*$ is the minimum.
- For the last choice, $\mathbf{H}$ is indefinite, so $\mathbf{x}^*$ is a saddle point (a minimum in one direction but a maximum in another).
Figure 2.2: Quadratic Surfaces (figures/QuadraticSurf.eps)
We can speed up the convergence by using a bigger step size $\delta$ in the iteration. However, if $\delta$ is too big, the solution may be skipped over and the iteration may fail to converge, oscillating around the solution; even worse, the iteration may diverge. For these reasons, a smaller step size $\delta$ may sometimes be preferred.
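One way to implement this trade-off is to scale the Newton step by a fixed factor; a minimal sketch, where the step size symbol $\delta$ and the stopping test are assumptions rather than details from the text:

\begin{verbatim}
import numpy as np

def damped_newton(grad, hess, x0, delta=0.5, tol=1e-8, max_iter=500):
    """Newton iteration with a fixed step size: x_{n+1} = x_n - delta * H_n^{-1} g_n."""
    x = np.asarray(x0, dtype=float)
    for n in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - delta * np.linalg.solve(hess(x), g)   # scaled Newton step
    return x
\end{verbatim}

With $\delta < 1$ each step is more conservative, trading speed for robustness; $\delta > 1$ may accelerate convergence but risks the oscillation or divergence described above.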
In summary, Newton's method approximates the function $f(\mathbf{x})$ at an estimated solution $\mathbf{x}_n$ by a quadratic function (the first three terms of its Taylor series) based on the gradient $\mathbf{g}_n$ and Hessian $\mathbf{H}_n$ of the function at $\mathbf{x}_n$, and treats the vertex of this quadratic as the updated estimate $\mathbf{x}_{n+1}$.
Newton's method requires the Hessian matrix as well as the gradient to be available. Moreover, it is necessary to calculate the inverse of the Hessian matrix in each iteration, which may be computationally expensive.
Example 3:
Newton's method is applied to solving a nonlinear equation system of $N$ variables with a known exact solution $\mathbf{x}^*$. The equations can be expressed in vector form as $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ and solved as an optimization problem with the objective function $\varepsilon(\mathbf{x}) = \frac{1}{2}\|\mathbf{f}(\mathbf{x})\|^2$. Starting from an initial guess $\mathbf{x}_0$, the algorithm converges after 13 iterations to an approximate solution that matches the exact solution $\mathbf{x}^*$ to the required accuracy.
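Since the system itself is not reproduced here, the sketch below applies the same procedure to a small stand-in system of two equations in two unknowns, chosen purely for illustration:

\begin{verbatim}
import numpy as np

# Stand-in system (not the one from the text): f1 = x^2 + y^2 - 4 = 0, f2 = x y - 1 = 0
f   = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
jac = lambda x: np.array([[2*x[0], 2*x[1]], [x[1], x[0]]])

x = np.array([2.0, 0.5])                        # initial guess
for n in range(50):
    J, r = jac(x), f(x)
    dx = np.linalg.solve(J.T @ J, -J.T @ r)     # step of Eq. (22), minimizing ||f||^2 / 2
    x = x + dx
    if np.linalg.norm(dx) < 1e-10:
        break
print(n, x)   # converges to a root near (1.932, 0.518) in a few iterations
\end{verbatim}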