Newton-Raphson Method (Multivariate)
The Newton-Raphson method discussed above for solving a single-variable equation $f(x) = 0$ can be generalized to the case of multivariate equation systems containing $M$ equations of $N$ variables in $\mathbf{x} = [x_1, \ldots, x_N]^T$:

$$\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_1(x_1, \ldots, x_N) \\ \vdots \\ f_M(x_1, \ldots, x_N) \end{bmatrix} = \mathbf{0} \tag{65}$$
Again, we first consider the Taylor series expansion of each of the $M$ functions in the neighborhood of an initial guess $\mathbf{x}_0$:

$$f_m(\mathbf{x}) = f_m(\mathbf{x}_0) + \sum_{n=1}^{N} \frac{\partial f_m(\mathbf{x}_0)}{\partial x_n}\,(x_n - x_{0n}) + O\!\left(\|\mathbf{x} - \mathbf{x}_0\|^2\right), \qquad m = 1, \ldots, M \tag{66}$$
where $O(\|\mathbf{x} - \mathbf{x}_0\|^2)$ represents the second and higher order terms in the series beyond the linear term, which can be neglected if $\|\mathbf{x} - \mathbf{x}_0\|$ is small. These $M$ equations can be expressed in matrix form

$$\begin{bmatrix} f_1(\mathbf{x}) \\ \vdots \\ f_M(\mathbf{x}) \end{bmatrix} \approx \begin{bmatrix} f_1(\mathbf{x}_0) \\ \vdots \\ f_M(\mathbf{x}_0) \end{bmatrix} + \begin{bmatrix} \partial f_1/\partial x_1 & \cdots & \partial f_1/\partial x_N \\ \vdots & \ddots & \vdots \\ \partial f_M/\partial x_1 & \cdots & \partial f_M/\partial x_N \end{bmatrix}_{\mathbf{x}_0} \begin{bmatrix} x_1 - x_{01} \\ \vdots \\ x_N - x_{0N} \end{bmatrix} \tag{67}$$
or more concisely

$$\mathbf{f}(\mathbf{x}) \approx \mathbf{f}(\mathbf{x}_0) + \mathbf{J}(\mathbf{x}_0)\,\Delta\mathbf{x} = \mathbf{f}_0 + \mathbf{J}_0\,\Delta\mathbf{x} \tag{68}$$

where $\Delta\mathbf{x} = \mathbf{x} - \mathbf{x}_0$, and $\mathbf{f}_0 = \mathbf{f}(\mathbf{x}_0)$ and $\mathbf{J}_0 = \mathbf{J}(\mathbf{x}_0)$ are respectively the function $\mathbf{f}(\mathbf{x})$ and its Jacobian matrix $\mathbf{J}(\mathbf{x})$, both evaluated at $\mathbf{x}_0$. We further consider solving the equation system $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ in the following two cases:
- $M = N$: The number of equations is the same as the number of unknowns, so the Jacobian $\mathbf{J}$ is a square matrix and its inverse $\mathbf{J}^{-1}$ exists in general. In the special case where $\mathbf{f}(\mathbf{x})$ is linear, the Taylor series contains only the first two terms while all higher order terms are zero, and the approximation in Eq. (68) becomes exact with $\mathbf{f}(\mathbf{x}) = \mathbf{f}_0 + \mathbf{J}_0\,\Delta\mathbf{x}$. To find the root $\mathbf{x}^*$ of the equation, we set $\mathbf{f}_0 + \mathbf{J}_0\,\Delta\mathbf{x}$ to zero and solve the resulting equation to get

$$\mathbf{x} = \mathbf{x}_0 - \mathbf{J}_0^{-1}\,\mathbf{f}_0 \tag{69}$$
For example, if $\mathbf{f}(\mathbf{x}) = \mathbf{A}\mathbf{x} - \mathbf{b}$ and $\mathbf{x}_0 = \mathbf{0}$, then $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$, the exact solution of the linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$.
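As a quick check, here is a minimal Python/NumPy sketch of this one-step solution; the matrix $\mathbf{A}$ and vector $\mathbf{b}$ are arbitrary placeholders:

```python
import numpy as np

# One Newton step from x0 = 0 on the linear system f(x) = A x - b lands
# exactly on the root x = A^{-1} b, since the Taylor expansion of a
# linear function is exact. A and b are hypothetical example values.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x0 = np.zeros(2)
f0 = A @ x0 - b                    # f(x0)
J0 = A                             # the Jacobian of a linear map is constant
x = x0 - np.linalg.solve(J0, f0)   # one Newton step: x0 - J0^{-1} f0

print(x)                           # [2. 3.]
print(np.allclose(A @ x, b))       # True: the root is exact
```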
If $\mathbf{f}(\mathbf{x})$ is nonlinear, the sum of the first two terms of the Taylor series is only an approximation, and the result in Eq. (69) is only an approximate root, which can be further improved iteratively to move from the initial guess $\mathbf{x}_0$ towards the root $\mathbf{x}^*$:

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) = \mathbf{x}_n + \Delta\mathbf{x}_n \tag{70}$$
where $\Delta\mathbf{x}_n = -\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n)$ is the increment, representing the search direction at the $n$th step. The iteration moves $\mathbf{x}_n$ in the N-D space spanned by $x_1, \ldots, x_N$ from some initial guess $\mathbf{x}_0$ along such a path that all function values $f_m(\mathbf{x})\ (m = 1, \ldots, M)$ are reduced. As in the univariate case, a scaling factor $0 < \delta < 1$ can be used to control the step size of the iteration

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \delta\,\mathbf{J}^{-1}(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) \tag{71}$$
The step size can be adjusted based on the specific function shape. When $\delta < 1$, the step size becomes smaller and the convergence of the iteration is slower, but there is a better chance of not skipping over a solution, which may happen if $\mathbf{f}(\mathbf{x})$ is not smooth and the step size is too big. The algorithm is listed below (where the tolerance $tol$ is preset to a small value); a code sketch follows the listing:
- Select $\mathbf{x}_0$
- Obtain $\mathbf{f}(\mathbf{x}_0)$ and $\mathbf{J}(\mathbf{x}_0)$
- Obtain $\Delta\mathbf{x}_0 = -\mathbf{J}^{-1}(\mathbf{x}_0)\,\mathbf{f}(\mathbf{x}_0)$
- $n = 0$
- While $\|\mathbf{f}(\mathbf{x}_n)\| > tol$ do
  - $\mathbf{x}_{n+1} = \mathbf{x}_n + \delta\,\Delta\mathbf{x}_n$
  - Find $\mathbf{f}(\mathbf{x}_{n+1})$ and $\Delta\mathbf{x}_{n+1} = -\mathbf{J}^{-1}(\mathbf{x}_{n+1})\,\mathbf{f}(\mathbf{x}_{n+1})$
  - $n = n + 1$
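A minimal Python/NumPy sketch of this algorithm follows; the test system (a circle intersected with a parabola) and its analytical Jacobian are hypothetical choices for illustration. Note that it solves $\mathbf{J}\,\Delta\mathbf{x} = -\mathbf{f}$ instead of forming $\mathbf{J}^{-1}$ explicitly, which is numerically preferable:

```python
import numpy as np

def newton_raphson(f, J, x0, delta=1.0, tol=1e-10, max_iter=100):
    """Multivariate Newton-Raphson iteration of Eq. (71); delta scales the step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        dx = np.linalg.solve(J(x), -fx)   # solve J dx = -f instead of inverting J
        x = x + delta * dx
    return x

# Hypothetical test system: x^2 + y^2 = 4 (circle) and y = x^2 (parabola)
f = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[1] - v[0]**2])
J = lambda v: np.array([[2*v[0], 2*v[1]], [-2*v[0], 1.0]])

root = newton_raphson(f, J, x0=[1.0, 1.0])
print(root, f(root))   # root near (1.2496, 1.5616), residual ~ 0
```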
- $M > N$: There are more equations than unknowns, i.e., the equation system $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ is over-constrained, and the Jacobian $\mathbf{J}$ is an $M \times N$ non-square matrix with no inverse, i.e., no exact solution exists for the equation $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ in general. But we can still seek to find an optimal solution $\mathbf{x}^*$ that minimizes the following sum-of-squares error:

$$\varepsilon(\mathbf{x}) = \|\mathbf{f}(\mathbf{x})\|^2 = \sum_{m=1}^{M} f_m^2(\mathbf{x}) \tag{72}$$
The gradient vector of $\varepsilon(\mathbf{x})$ is:

$$\mathbf{g}(\mathbf{x}) = \nabla\varepsilon(\mathbf{x}) = \left[\frac{\partial \varepsilon}{\partial x_1}, \ldots, \frac{\partial \varepsilon}{\partial x_N}\right]^T \tag{73}$$

The $n$th component of $\mathbf{g}(\mathbf{x})$ is

$$g_n(\mathbf{x}) = \frac{\partial \varepsilon(\mathbf{x})}{\partial x_n} = 2 \sum_{m=1}^{M} f_m(\mathbf{x})\,\frac{\partial f_m(\mathbf{x})}{\partial x_n} = 2 \sum_{m=1}^{M} f_m(\mathbf{x})\,J_{mn}(\mathbf{x}) \tag{74}$$

where $J_{mn} = \partial f_m/\partial x_n$ is the component in the $m$th row and $n$th column of the Jacobian $\mathbf{J}$. Now the gradient can be written as

$$\mathbf{g}(\mathbf{x}) = 2\,\mathbf{J}^T(\mathbf{x})\,\mathbf{f}(\mathbf{x}) \tag{75}$$
If, in particular, $\mathbf{f}(\mathbf{x})$ is linear with a constant Jacobian $\mathbf{J}(\mathbf{x}) = \mathbf{J}_0$, then it can be represented exactly as the sum of the first two terms of its Taylor series in Eq. (68), and the gradient is:

$$\mathbf{g}(\mathbf{x}) = 2\,\mathbf{J}_0^T\left(\mathbf{f}_0 + \mathbf{J}_0\,\Delta\mathbf{x}\right) \tag{76}$$

where $\mathbf{x}_0$ is any chosen initial guess. If we assume $\mathbf{x}^* = \mathbf{x}_0 + \Delta\mathbf{x}$ is the optimal solution at which $\varepsilon(\mathbf{x})$ is minimized, then $\mathbf{g}(\mathbf{x}^*)$ is zero:

$$\mathbf{g}(\mathbf{x}^*) = 2\,\mathbf{J}_0^T\left(\mathbf{f}_0 + \mathbf{J}_0\,\Delta\mathbf{x}\right) = \mathbf{0} \tag{77}$$
then by solving this equation we get the optimal increment

$$\Delta\mathbf{x} = -\left(\mathbf{J}_0^T\,\mathbf{J}_0\right)^{-1}\mathbf{J}_0^T\,\mathbf{f}_0 = -\mathbf{J}_0^-\,\mathbf{f}_0 \tag{78}$$

Here $\mathbf{J}^- = (\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T$ is the pseudo-inverse (see the section on the pseudo-inverse) of the non-square matrix $\mathbf{J}$. Now the optimal solution can be found as:

$$\mathbf{x}^* = \mathbf{x}_0 + \Delta\mathbf{x} = \mathbf{x}_0 - \mathbf{J}_0^-\,\mathbf{f}_0 \tag{79}$$
For example, if $\mathbf{f}(\mathbf{x}) = \mathbf{A}\mathbf{x} - \mathbf{b}$ and $\mathbf{x}_0 = \mathbf{0}$, we have $\mathbf{x}^* = \mathbf{A}^-\,\mathbf{b} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\,\mathbf{b}$, the same pseudo-inverse solution obtained previously for the over-constrained linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$.
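The following Python/NumPy snippet, with an arbitrary $3 \times 2$ system as a placeholder, checks that this pseudo-inverse solution agrees with a standard least-squares solver:

```python
import numpy as np

# Hypothetical over-constrained linear system: 3 equations, 2 unknowns
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Pseudo-inverse solution of Eq. (79): x = (A^T A)^{-1} A^T b
x_pinv = np.linalg.solve(A.T @ A, A.T @ b)

# The same result from the library least-squares routine
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_pinv)                        # approximately [0.6667, 0.5]
print(np.allclose(x_pinv, x_lstsq))  # True
```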
When $\mathbf{f}(\mathbf{x})$ is nonlinear in general, the sum of the first two terms of its Taylor series is only its approximation, and the result above is only an approximation of the root, which can be further improved iteratively to move from the initial guess towards the root:

$$\mathbf{x}_{n+1} = \mathbf{x}_n - \mathbf{J}^-(\mathbf{x}_n)\,\mathbf{f}(\mathbf{x}_n) \tag{80}$$

This iteration will converge to $\mathbf{x}^*$ at which $\mathbf{g}(\mathbf{x}^*) = 2\,\mathbf{J}^T(\mathbf{x}^*)\,\mathbf{f}(\mathbf{x}^*) = \mathbf{0}$, and the squared error $\varepsilon(\mathbf{x}^*) = \|\mathbf{f}(\mathbf{x}^*)\|^2$ is minimized.
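This iteration is the Gauss-Newton method for nonlinear least squares. A minimal Python/NumPy sketch follows; the exponential-fitting example (model $y = a\,e^{bt}$ with four data points, so $M = 4$ residuals and $N = 2$ unknowns) is a hypothetical choice, and the step is computed by solving the normal equations $\mathbf{J}^T\mathbf{J}\,\Delta\mathbf{x} = -\mathbf{J}^T\mathbf{f}$ rather than forming $\mathbf{J}^-$ explicitly:

```python
import numpy as np

def gauss_newton(f, J, x0, tol=1e-10, max_iter=100):
    """Iteration of Eq. (80): x <- x - J^-(x) f(x), with J^- the pseudo-inverse."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx, Jx = f(x), J(x)
        # pseudo-inverse step via the normal equations: J^T J dx = -J^T f
        dx = np.linalg.solve(Jx.T @ Jx, -Jx.T @ fx)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Hypothetical example: fit (a, b) of the model y = a*exp(b*t) to 4 points
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.0, 2.7, 3.6, 4.9])
f = lambda p: p[0] * np.exp(p[1] * t) - y                     # M = 4 residuals
J = lambda p: np.column_stack([np.exp(p[1] * t),              # df/da
                               p[0] * t * np.exp(p[1] * t)])  # df/db

print(gauss_newton(f, J, x0=[1.0, 0.1]))   # roughly a ~ 2.0, b ~ 0.3
```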
Comparing Eqs. (70) and (80), we see that they are essentially the same, the only difference being that the regular inverse $\mathbf{J}^{-1}$ is used when $M = N$, while the pseudo-inverse $\mathbf{J}^-$ is used when $M > N$ and $\mathbf{J}^{-1}$ does not exist.
The Newton-Raphson method assumes the availability of the analytical expressions of all partial derivatives $J_{mn} = \partial f_m/\partial x_n$ in the Jacobian matrix $\mathbf{J}$. When this is not the case, they need to be approximated by the forward or central difference (secant) method:

$$J_{mn} \approx \frac{f_m(\mathbf{x} + \delta\,\mathbf{e}_n) - f_m(\mathbf{x})}{\delta} \qquad \text{or} \qquad J_{mn} \approx \frac{f_m(\mathbf{x} + \delta\,\mathbf{e}_n) - f_m(\mathbf{x} - \delta\,\mathbf{e}_n)}{2\delta} \tag{81}$$

where $\delta$ is a small increment and $\mathbf{e}_n$ is the $n$th standard basis vector.
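A possible Python/NumPy implementation of such a finite-difference Jacobian, with the test function and step size chosen arbitrarily for illustration:

```python
import numpy as np

def numerical_jacobian(f, x, delta=1e-6, central=True):
    """Approximate the M x N Jacobian of f at x by finite differences, Eq. (81)."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    J = np.empty((fx.size, x.size))
    for n in range(x.size):
        e = np.zeros_like(x)
        e[n] = delta                 # perturb only the n-th variable
        if central:
            J[:, n] = (f(x + e) - f(x - e)) / (2 * delta)   # central difference
        else:
            J[:, n] = (f(x + e) - fx) / delta               # forward difference
    return J

# Check against the analytical Jacobian of a small test function
f = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[1] - v[0]**2])
x = np.array([1.0, 1.0])
J_exact = np.array([[2.0, 2.0], [-2.0, 1.0]])
print(np.allclose(numerical_jacobian(f, x), J_exact, atol=1e-6))  # True
```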
Example 1

…

Example 2

…

With $\mathbf{x}_0 = \ldots$, we get a root: …

With a different $\mathbf{x}_0 = \ldots$, we get another root: …
Broyden's Method

In the Newton-Raphson method, two main operations are carried out in each iteration: (a) evaluate the Jacobian matrix $\mathbf{J}(\mathbf{x}_n)$, and (b) obtain its inverse $\mathbf{J}^{-1}(\mathbf{x}_n)$. To avoid the expensive computation of these operations, we can consider using Broyden's method, one of the so-called quasi-Newton methods, which approximates the inverse of the Jacobian $\mathbf{J}^{-1}(\mathbf{x}_{n+1})$ from the $\mathbf{J}^{-1}(\mathbf{x}_n)$ of the previous iteration step, so that it can be updated iteratively from the initial $\mathbf{J}^{-1}(\mathbf{x}_0)$.
We first consider, in the single-variable case, how to estimate the next derivative $f'_{n+1} = f'(x_{n+1})$ from the current $f'_n = f'(x_n)$ by the secant method:

$$f'_{n+1} \approx \frac{\Delta f_n}{\Delta x_n} = \frac{f'_n\,\Delta x_n + \left(\Delta f_n - f'_n\,\Delta x_n\right)}{\Delta x_n} = f'_n + \frac{\Delta f_n - f'_n\,\Delta x_n}{\Delta x_n} = f'_n + \Delta f'_n \tag{82}$$
where

- $\Delta f_n = f(x_{n+1}) - f(x_n)$ is the true increment of the function over the interval $\Delta x_n = x_{n+1} - x_n$;
- $f'_n\,\Delta x_n$ is the estimated increment of the function based on the previous derivative $f'_n$;
- $\Delta f'_n$ is the estimated increment of the derivative:

$$\Delta f'_n = \frac{\Delta f_n - f'_n\,\Delta x_n}{\Delta x_n} \tag{83}$$
The equation above indicates that the derivative $f'_{n+1}$ in the $(n+1)$st step can be estimated by adding the estimated increment $\Delta f'_n$ to the derivative $f'_n$ in the current $n$th step.
Figure 1.16: Broyden's Method
Having obtained $f'_{n+1}$, we can use the same iteration as in the Newton-Raphson method to find $x_{n+2}$:

$$x_{n+2} = x_{n+1} - \frac{f(x_{n+1})}{f'_{n+1}} \tag{84}$$
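A minimal single-variable sketch of Eqs. (82)-(84) in Python; the test function $f(x) = x^2 - 2$, whose root is $\sqrt{2}$, is a hypothetical choice:

```python
# Estimate the derivative by the secant slope of Eq. (82), then take the
# Newton-like step of Eq. (84). This is the classical secant method.
f = lambda x: x**2 - 2.0

x_prev, x = 1.0, 2.0                       # two starting points
for _ in range(20):
    fx, f_prev = f(x), f(x_prev)
    if abs(fx) < 1e-12 or fx == f_prev:    # converged (or slope undefined)
        break
    d = (fx - f_prev) / (x - x_prev)       # Eq. (82): secant estimate of f'
    x_prev, x = x, x - fx / d              # Eq. (84): Newton-like step
print(x)                                   # ~1.41421356, i.e., sqrt(2)
```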
This method for the single-variable case can be generalized to the multivariate case for solving $\mathbf{f}(\mathbf{x}) = \mathbf{0}$. Following the way we estimated the increment of the derivative of a single-variable function in Eq. (83), here we can estimate the increment of the Jacobian of a multivariate function:

$$\Delta\mathbf{J}_n = \frac{\left(\Delta\mathbf{f}_n - \mathbf{J}_n\,\Delta\mathbf{x}_n\right)\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n} \tag{85}$$

where $\Delta\mathbf{x}_n = \mathbf{x}_{n+1} - \mathbf{x}_n$ and $\Delta\mathbf{f}_n = \mathbf{f}(\mathbf{x}_{n+1}) - \mathbf{f}(\mathbf{x}_n)$.
Now in each iteration, we can update the estimated Jacobian as well as the estimated root:

$$\mathbf{J}_{n+1} = \mathbf{J}_n + \Delta\mathbf{J}_n, \qquad \mathbf{x}_{n+2} = \mathbf{x}_{n+1} - \mathbf{J}_{n+1}^{-1}\,\mathbf{f}(\mathbf{x}_{n+1}) \tag{86}$$
The algorithm can be further improved so that computing the inverse of the Jacobian $\mathbf{J}_{n+1}$ is avoided. Specifically, consider the inverse Jacobian:

$$\mathbf{J}_{n+1}^{-1} = \left(\mathbf{J}_n + \frac{\left(\Delta\mathbf{f}_n - \mathbf{J}_n\,\Delta\mathbf{x}_n\right)\Delta\mathbf{x}_n^T}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n}\right)^{-1} \tag{87}$$
We can apply the Sherman-Morrison formula (see the section on the Sherman-Morrison-Woodbury identity):

$$\left(\mathbf{A} + \mathbf{u}\,\mathbf{v}^T\right)^{-1} = \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1}\,\mathbf{u}\,\mathbf{v}^T\,\mathbf{A}^{-1}}{1 + \mathbf{v}^T\,\mathbf{A}^{-1}\,\mathbf{u}} \tag{88}$$

to the right-hand side of the equation above by defining

$$\mathbf{A} = \mathbf{J}_n, \qquad \mathbf{u} = \frac{\Delta\mathbf{f}_n - \mathbf{J}_n\,\Delta\mathbf{x}_n}{\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n}, \qquad \mathbf{v} = \Delta\mathbf{x}_n \tag{89}$$

and rewrite it as (the denominator simplifies because $1 + \mathbf{v}^T\mathbf{A}^{-1}\mathbf{u} = \Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n \,/\, \Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n$, so the factor $\Delta\mathbf{x}_n^T\,\Delta\mathbf{x}_n$ cancels):

$$\mathbf{J}_{n+1}^{-1} = \mathbf{J}_n^{-1} + \frac{\left(\Delta\mathbf{x}_n - \mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n\right)\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}}{\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n} \tag{90}$$
We see that the next $\mathbf{J}_{n+1}^{-1}$ can be iteratively estimated directly from the previous $\mathbf{J}_n^{-1}$, thereby avoiding computing the inverse of $\mathbf{J}_{n+1}$ altogether.
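A quick numeric spot-check in Python/NumPy, using random matrices and increments as placeholders, confirms that the rank-one update of the inverse in Eq. (90) agrees with directly inverting the updated Jacobian of Eq. (86):

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(4, 4))        # placeholder Jacobian J_n
J_inv = np.linalg.inv(J)
dx = rng.normal(size=(4, 1))       # placeholder increments as column vectors
df = rng.normal(size=(4, 1))

J_next = J + (df - J @ dx) @ dx.T / (dx.T @ dx)   # Eq. (86)
J_next_inv = J_inv + ((dx - J_inv @ df) @ dx.T @ J_inv
                      / (dx.T @ J_inv @ df))      # Eq. (90)

print(np.allclose(J_next_inv, np.linalg.inv(J_next)))   # True
```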
The algorithm is listed below:

- Select $\mathbf{x}_0$
- Find $\mathbf{f}(\mathbf{x}_0)$, $\mathbf{J}(\mathbf{x}_0)$, and $\mathbf{J}_0^{-1} = \mathbf{J}^{-1}(\mathbf{x}_0)$
- $\mathbf{x}_1 = \mathbf{x}_0 - \mathbf{J}_0^{-1}\,\mathbf{f}(\mathbf{x}_0)$
- $n = 0$
- While $\|\mathbf{f}(\mathbf{x}_{n+1})\| > tol$ do
  - $\Delta\mathbf{x}_n = \mathbf{x}_{n+1} - \mathbf{x}_n$, $\Delta\mathbf{f}_n = \mathbf{f}(\mathbf{x}_{n+1}) - \mathbf{f}(\mathbf{x}_n)$
  - $\mathbf{J}_{n+1}^{-1} = \mathbf{J}_n^{-1} + \left(\Delta\mathbf{x}_n - \mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n\right)\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1} \,/\, \left(\Delta\mathbf{x}_n^T\,\mathbf{J}_n^{-1}\,\Delta\mathbf{f}_n\right)$
  - $\mathbf{x}_{n+2} = \mathbf{x}_{n+1} - \mathbf{J}_{n+1}^{-1}\,\mathbf{f}(\mathbf{x}_{n+1})$
  - $n = n + 1$
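A minimal Python/NumPy sketch of Broyden's method with the inverse update of Eq. (90); the test system is the same hypothetical circle/parabola example used in the Newton-Raphson sketch above:

```python
import numpy as np

def broyden(f, x0, J0, tol=1e-10, max_iter=100):
    """Broyden's method: only the initial Jacobian J0 is formed and inverted;
    thereafter the inverse is updated by the rank-one formula of Eq. (90)."""
    x_prev = np.asarray(x0, dtype=float)
    J_inv = np.linalg.inv(J0)
    x = x_prev - J_inv @ f(x_prev)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        dx = (x - x_prev).reshape(-1, 1)
        df = (fx - f(x_prev)).reshape(-1, 1)
        # Sherman-Morrison rank-one update of the inverse Jacobian, Eq. (90)
        J_inv += (dx - J_inv @ df) @ dx.T @ J_inv / (dx.T @ J_inv @ df)
        x_prev, x = x, x - J_inv @ fx
    return x

f = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[1] - v[0]**2])
J0 = np.array([[2.0, 2.0], [-2.0, 1.0]])    # analytical Jacobian at x0 = (1, 1)
root = broyden(f, x0=[1.0, 1.0], J0=J0)
print(root, np.linalg.norm(f(root)))        # root near (1.2496, 1.5616)
```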