Optimization with Equality Constraints

Optimization problems subject to equality constraints can be formulated in general as:

    \min_{\mathbf{x}} f(\mathbf{x}), \qquad \mbox{subject to} \quad g_i(\mathbf{x}) = 0, \;\;\; i = 1, \cdots, m    (152)

Here we assume $m$ is no bigger than $N$, as in general there does not exist a solution that satisfies more than $N$ equations in the $N$-D space.

This problem can be visualized in the special case with $N=2$ and $m=1$, where both $f(x,y)$ and $g(x,y)$ are surfaces defined over the 2-D space spanned by $x$ and $y$, and $g(x,y)=0$ is the intersection curve of the surface $g(x,y)$ and the 2-D plane. The solution $\mathbf{x}^* = (x^*, y^*)$ must satisfy the following two conditions:

  • $\mathbf{x}^*$ is on the curve $g(x,y) = 0$, so that the equality constraint is satisfied;
  • $\mathbf{x}^*$ is on a contour $f(x,y) = c$ (for some value $c$) so that $f(\mathbf{x}^*)$ is an extremum, i.e., it does not increase or decrease while moving along the curve $g(x,y)=0$ in the neighborhood of $\mathbf{x}^*$.

For both conditions to be satisfied, the two contours $f(x,y) = c$ and $g(x,y) = 0$ must coincide at $\mathbf{x}^*$, i.e., they must have the same tangent line and therefore parallel gradients (perpendicular to the tangents):

    \nabla f(\mathbf{x}^*) = \lambda\,\nabla g(\mathbf{x}^*)    (153)

where $\nabla f = [\partial f/\partial x,\; \partial f/\partial y]^T$ is the gradient of $f(x,y)$, and $\lambda$ is a constant scaling factor, which can be either positive if the two gradients are in the same direction, or negative if they are in opposite directions. As here we are only interested in whether the two curves coincide or not, the sign of $\lambda$ can be either positive or negative.
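As a quick numerical sketch of the parallel-gradient condition in Eq. (153), consider the hypothetical example (not from the text) $f(x,y) = x^2 + y^2$ with constraint $g(x,y) = x + y - 1 = 0$, whose constrained minimum is at $(1/2, 1/2)$:

```python
import numpy as np

# Hypothetical example: f(x, y) = x^2 + y^2, g(x, y) = x + y - 1.
# The constrained minimum is at (1/2, 1/2); check Eq. (153) there.

def grad_f(x, y):
    return np.array([2.0 * x, 2.0 * y])   # gradient of f

def grad_g(x, y):
    return np.array([1.0, 1.0])           # gradient of g (constant here)

gf = grad_f(0.5, 0.5)
gg = grad_g(0.5, 0.5)

# Two 2-D vectors are parallel iff their 2x2 determinant vanishes.
det = gf[0] * gg[1] - gf[1] * gg[0]
assert abs(det) < 1e-12                   # gradients are indeed parallel

lam = gf[0] / gg[0]                       # scaling factor lambda in Eq. (153)
print(lam)                                # 1.0 for this example
```

Here the two gradients point in the same direction, so $\lambda > 0$; flipping the sign of $g$ would flip the sign of $\lambda$ without changing the solution.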

Figure 2.18: Equality Constrained Optimization (figures/Lagrange.eps)

To find such a point $\mathbf{x}^*$, we first construct a new objective function, called the Lagrange function (or simply Lagrangian):

    L(x, y, \lambda) = f(x, y) + \lambda\,g(x, y)    (154)

where $\lambda$ is the Lagrange multiplier, which can be either positive or negative (i.e., the sign in front of $\lambda g(x,y)$ can be either plus or minus), and then set the gradient of this objective function with respect to $x$ and $y$ as well as $\lambda$ to zero:

    \nabla L(x, y, \lambda) = \mathbf{0}    (155)

This equation contains two parts:

    \nabla_{x,y} L = \nabla f + \lambda\,\nabla g = \mathbf{0}, \qquad
    \frac{\partial L}{\partial \lambda} = g(x, y) = 0    (156)

by which both Eq. (153) and the equality constraint are satisfied as desired.

This is an equation system containing $N + 1 = 3$ equations for the same number of unknowns $x$, $y$, and $\lambda$:

    \frac{\partial L}{\partial x} = \frac{\partial f}{\partial x} + \lambda\frac{\partial g}{\partial x} = 0, \qquad
    \frac{\partial L}{\partial y} = \frac{\partial f}{\partial y} + \lambda\frac{\partial g}{\partial y} = 0, \qquad
    \frac{\partial L}{\partial \lambda} = g(x, y) = 0    (157)

The solution $\mathbf{x}^* = (x^*, y^*)$ and $\lambda^*$ of this equation system satisfies the two desired relationships: (a) $\nabla f(\mathbf{x}^*) + \lambda^*\,\nabla g(\mathbf{x}^*) = \mathbf{0}$, i.e., the two gradients are parallel as in Eq. (153), and (b) $g(\mathbf{x}^*) = 0$.
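The equation system (157) can be solved symbolically. A minimal sketch with SymPy, using the hypothetical example $f = x^2 + y^2$ and $g = x + y - 1$ (not from the text):

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam')
f = x**2 + y**2          # hypothetical objective
g = x + y - 1            # hypothetical constraint g(x, y) = 0

L = f + lam * g          # Lagrange function, Eq. (154)
eqs = [sp.diff(L, v) for v in (x, y, lam)]   # the three equations of Eq. (157)
sol = sp.solve(eqs, (x, y, lam), dict=True)[0]
# Solution: x = 1/2, y = 1/2, lam = -1
print(sol[x], sol[y], sol[lam])
```

The negative multiplier simply reflects the sign convention $L = f + \lambda g$; with $L = f - \lambda g$ the same point would come out with $\lambda = 1$.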

In particular, consider the following two possible cases:

  • If $\lambda = 0$, then

        \nabla L = \nabla f = \mathbf{0}    (158)

    indicating $\mathbf{x}^*$ is an extremum independent of the constraint, i.e., the constraint $g(\mathbf{x}) = 0$ is inactive, and the optimization is unconstrained.
  • If $\lambda \ne 0$, then $\nabla f(\mathbf{x}^*) \ne \mathbf{0}$ (as in general $\nabla g(\mathbf{x}^*) \ne \mathbf{0}$), indicating $\mathbf{x}^*$ is not an extremum without the constraint, i.e., the constraint is active, and the optimization is indeed constrained.
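The inactive case can be checked numerically. In the hypothetical example below (not from the text), the constraint line $g = x - y = 0$ happens to pass through the unconstrained minimum of $f = x^2 + y^2$ at the origin, so the multiplier comes out zero:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam')
f = x**2 + y**2      # unconstrained minimum at (0, 0)
g = x - y            # constraint line that passes through (0, 0)

L = f + lam * g
sol = sp.solve([sp.diff(L, v) for v in (x, y, lam)], (x, y, lam), dict=True)[0]
# lam = 0: the constraint is inactive, the problem is effectively unconstrained
print(sol[x], sol[y], sol[lam])
```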

The discussion above can be generalized from 2-D to an $N$-dimensional space, in which the optimal solution $\mathbf{x}^*$ is to be found that extremizes the objective $f(\mathbf{x})$ subject to $m$ equality constraints $g_i(\mathbf{x}) = 0\;(i = 1, \cdots, m)$, each representing a contour surface in the $N$-D space. The solution $\mathbf{x}^*$ must satisfy the following two conditions:

  • $\mathbf{x}^*$ is on all $m$ contour surfaces so that $g_i(\mathbf{x}^*) = 0\;(i = 1, \cdots, m)$, i.e., it is on the intersection of these surfaces, a curve in the space (e.g., the intersection of two surfaces in 3-D space is a curve);
  • $\mathbf{x}^*$ is on a contour surface $f(\mathbf{x}) = c$ (for some value $c$) so that $f(\mathbf{x}^*)$ is an extremum, i.e., it does not increase or decrease while moving along the curve in the neighborhood of $\mathbf{x}^*$.

Combining these conditions, we see that the intersection (a curve or a surface) of the $m$ surfaces $g_i(\mathbf{x}) = 0$ must coincide with the contour surface $f(\mathbf{x}) = c$ at $\mathbf{x}^*$, i.e., the intersection of the tangent surfaces of $g_i(\mathbf{x}) = 0$ through $\mathbf{x}^*$ must coincide with the tangent surface of $f(\mathbf{x}) = c$ at $\mathbf{x}^*$; therefore the gradient $\nabla f(\mathbf{x}^*)$ must be a linear combination of the $m$ gradients $\nabla g_i(\mathbf{x}^*)$:

    \nabla f(\mathbf{x}^*) = \sum_{i=1}^m \lambda_i\,\nabla g_i(\mathbf{x}^*)    (159)

as shown in Fig. 2.19.
Figure 2.19: Derivation of Lagrange Function (figures/Lagrange1.eps)

To find such a solution $\mathbf{x}^*$ of the optimization problem, we first construct the Lagrange function:

    L(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \sum_{i=1}^m \lambda_i\,g_i(\mathbf{x})    (160)

where $\boldsymbol{\lambda} = [\lambda_1, \cdots, \lambda_m]^T$ is a vector of the $m$ Lagrange multipliers. We then set its gradient to zero:

    \nabla L(\mathbf{x}, \boldsymbol{\lambda}) = \mathbf{0}    (161)

to get the following two equation systems of $N$ and $m$ equations respectively:

    \nabla_{\mathbf{x}} L = \nabla f(\mathbf{x}) + \sum_{i=1}^m \lambda_i\,\nabla g_i(\mathbf{x}) = \mathbf{0}    (162)

and

    \frac{\partial L}{\partial \lambda_i} = g_i(\mathbf{x}) = 0, \qquad i = 1, \cdots, m    (163)

The first set of $N$ equations indicates that the gradient $\nabla f(\mathbf{x}^*)$ at $\mathbf{x}^*$ is a linear combination of the $m$ gradients $\nabla g_i(\mathbf{x}^*)$, while the second set of $m$ equations guarantees that $\mathbf{x}^*$ also satisfies the $m$ equality constraints, both as desired. Solving these equations, we get the desired solution $\mathbf{x}^*$ together with the Lagrange multipliers $\lambda_1, \cdots, \lambda_m$. This is the Lagrange multiplier method, by which the constrained optimization problem is reformulated so that the solution can be found by solving the equation system obtained by setting the gradient of the Lagrange function to zero.
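The $N + m$ equations of (162) and (163) can be solved the same way as in the 2-D case. A sketch with SymPy for a hypothetical 3-D example (not from the text) with $N = 3$ and $m = 2$:

```python
import sympy as sp

# Hypothetical example: extremize f = x^2 + y^2 + z^2
# subject to x + y + z = 1 and x - y = 1.
x, y, z, l1, l2 = sp.symbols('x y z l1 l2')
f = x**2 + y**2 + z**2
g1 = x + y + z - 1
g2 = x - y - 1

L = f + l1 * g1 + l2 * g2                          # Lagrange function, Eq. (160)
eqs = [sp.diff(L, v) for v in (x, y, z, l1, l2)]   # N + m = 5 equations, Eqs. (162)-(163)
sol = sp.solve(eqs, (x, y, z, l1, l2), dict=True)[0]
# Solution: x = 5/6, y = -1/6, z = 1/3, l1 = -2/3, l2 = -1
print(sol[x], sol[y], sol[z], sol[l1], sol[l2])
```

Both multipliers are nonzero here, so both constraints are active: the constrained extremum is not an extremum of $f$ alone.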

In particular, if $\lambda_i = 0$, then the $i$th constraint is not active. If furthermore $\lambda_i = 0$ for all $i = 1, \cdots, m$, then

    \nabla L = \nabla f = \mathbf{0}    (164)

indicating $\mathbf{x}^*$ is an extremum independent of any of the constraints, i.e., the optimization problem is unconstrained.