This is the same equation as $\mathbf{X}\mathbf{M}\mathbf{1} = \frac{1}{t}\mathbf{1}$ given above, but here it is labeled as complementarity, so that these conditions can be compared with the KKT conditions of the original problem. We see that the two sets of KKT conditions are similar to each other, but with the following differences:
- The non-negativity $\mathbf{x} \ge \mathbf{0}$ in the primal feasibility of the original KKT conditions is dropped;
- Consequently, the inequality $\boldsymbol{\mu} \ge \mathbf{0}$ in the dual feasibility of the original KKT conditions is also dropped;
- There is an extra term $1/t$ in complementarity, which will vanish when $t \rightarrow \infty$.
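To make the comparison concrete, the two sets of conditions can be written side by side (a reconstruction based on the differences just listed, for the problem $\min f(\mathbf{x})$ subject to $\mathbf{g}(\mathbf{x})=\mathbf{0}$ and $\mathbf{x}\ge\mathbf{0}$, with $\mathbf{X}=\text{diag}(x_1,\dots,x_n)$ and $\mathbf{M}=\text{diag}(\mu_1,\dots,\mu_n)$):

$$\text{original:}\;\begin{cases}\nabla f(\mathbf{x})-\mathbf{J}_g^T\boldsymbol{\lambda}-\boldsymbol{\mu}=\mathbf{0}\\ \mathbf{g}(\mathbf{x})=\mathbf{0},\quad\mathbf{x}\ge\mathbf{0}\\ \boldsymbol{\mu}\ge\mathbf{0}\\ \mathbf{X}\mathbf{M}\mathbf{1}=\mathbf{0}\end{cases}\qquad \text{modified:}\;\begin{cases}\nabla f(\mathbf{x})-\mathbf{J}_g^T\boldsymbol{\lambda}-\boldsymbol{\mu}=\mathbf{0}\\ \mathbf{g}(\mathbf{x})=\mathbf{0}\\ \mathbf{X}\mathbf{M}\mathbf{1}=\frac{1}{t}\mathbf{1}\end{cases}$$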
The optimal solution that minimizes the objective function $f(\mathbf{x})$ can be found by solving the simultaneous equations in the modified KKT conditions (with no inequalities) given above. To do so, we first express the equations in the KKT conditions above as a nonlinear equation system $\mathbf{F}(\mathbf{x},\boldsymbol{\lambda},\boldsymbol{\mu})=\mathbf{0}$, and find its Jacobian matrix $\mathbf{J}_F$ composed of the partial derivatives of the function $\mathbf{F}(\mathbf{x},\boldsymbol{\lambda},\boldsymbol{\mu})$ with respect to each of the three variables $\mathbf{x}$, $\boldsymbol{\lambda}$, and $\boldsymbol{\mu}$:

$$\mathbf{J}_F=\begin{bmatrix} \nabla^2_{\mathbf{x}}L & -\mathbf{J}_g^T & -\mathbf{I}\\ \mathbf{J}_g & \mathbf{0} & \mathbf{0}\\ \mathbf{M} & \mathbf{0} & \mathbf{X} \end{bmatrix} \tag{243}$$

where the last row follows from $\partial(\mathbf{X}\mathbf{M}\mathbf{1})/\partial\mathbf{x}=\mathbf{M}$ and $\partial(\mathbf{X}\mathbf{M}\mathbf{1})/\partial\boldsymbol{\mu}=\mathbf{X}$, and

$$\nabla^2_{\mathbf{x}}L = \nabla^2 f(\mathbf{x}) - \sum_j \lambda_j \nabla^2 g_j(\mathbf{x}) \tag{244}$$

is symmetric as the Hessian matrices $\nabla^2 f(\mathbf{x})$ and $\nabla^2 g_j(\mathbf{x})$ are symmetric.
We can now use the Newton-Raphson method to find the solution of the nonlinear equation $\mathbf{F}(\mathbf{x},\boldsymbol{\lambda},\boldsymbol{\mu})=\mathbf{0}$ iteratively:

$$\begin{bmatrix}\mathbf{x}\\ \boldsymbol{\lambda}\\ \boldsymbol{\mu}\end{bmatrix}_{n+1} = \begin{bmatrix}\mathbf{x}\\ \boldsymbol{\lambda}\\ \boldsymbol{\mu}\end{bmatrix}_n + \delta\begin{bmatrix}\Delta\mathbf{x}\\ \Delta\boldsymbol{\lambda}\\ \Delta\boldsymbol{\mu}\end{bmatrix} \tag{245}$$

where $\delta$ is a parameter that controls the step size and $\mathbf{d}=(\Delta\mathbf{x},\Delta\boldsymbol{\lambda},\Delta\boldsymbol{\mu})$ is the search direction (Newton direction), which can be found by solving the following equation:
$$\begin{bmatrix} \nabla^2_{\mathbf{x}}L & -\mathbf{J}_g^T & -\mathbf{I}\\ \mathbf{J}_g & \mathbf{0} & \mathbf{0}\\ \mathbf{M} & \mathbf{0} & \mathbf{X} \end{bmatrix} \begin{bmatrix}\Delta\mathbf{x}\\ \Delta\boldsymbol{\lambda}\\ \Delta\boldsymbol{\mu}\end{bmatrix} = -\mathbf{F}(\mathbf{x},\boldsymbol{\lambda},\boldsymbol{\mu}) = -\begin{bmatrix} \nabla f(\mathbf{x})-\mathbf{J}_g^T\boldsymbol{\lambda}-\boldsymbol{\mu}\\ \mathbf{g}(\mathbf{x})\\ \mathbf{X}\mathbf{M}\mathbf{1}-\frac{1}{t}\mathbf{1} \end{bmatrix} \tag{246}$$
To get the initial values for the iteration, we first find any point $\mathbf{x}_0$ inside the feasible region and then $\boldsymbol{\mu}_0$ (e.g., from the complementarity condition, $\mu_{0i} = 1/(t\,x_{0i})$), based on which we further find $\boldsymbol{\lambda}_0$ by solving the first equation in the KKT conditions given above.
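For the special case of linear constraints $\mathbf{g}(\mathbf{x})=\mathbf{A}\mathbf{x}-\mathbf{b}=\mathbf{0}$ treated below, the initialization might look as follows in Matlab (a minimal sketch with hypothetical data; the least-squares solve for $\boldsymbol{\lambda}_0$ and the choice $\mu_{0i}=1/(t_0 x_{0i})$ are illustrative, and $\nabla f = \mathbf{c}$ for a linear objective):

\begin{verbatim}
% Initialization sketch for  min c'*x  s.t.  A*x = b,  x >= 0
A  = [1 2 1 0; 3 1 0 1];  b = [4; 6];  c = [-1; -1; 0; 0];  % hypothetical
t0 = 1;                          % initial barrier parameter
x0 = [0.5; 0.5; 2.5; 4.0];       % strictly positive, satisfies A*x0 = b
mu0  = 1 ./ (t0 * x0);           % from perturbed complementarity x_i*mu_i = 1/t
lam0 = A' \ (c - mu0);           % first KKT equation, in least-squares sense
\end{verbatim}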
In fact, the equation system above can be separated into two subsystems, which are easier to solve. Pre-multiplying $\mathbf{X}^{-1}$ to the third equation:

$$\mathbf{M}\,\Delta\mathbf{x} + \mathbf{X}\,\Delta\boldsymbol{\mu} = \frac{1}{t}\mathbf{1} - \mathbf{X}\mathbf{M}\mathbf{1} \tag{247}$$

we get

$$\mathbf{X}^{-1}\mathbf{M}\,\Delta\mathbf{x} + \Delta\boldsymbol{\mu} = \frac{1}{t}\mathbf{X}^{-1}\mathbf{1} - \boldsymbol{\mu} \tag{248}$$
Adding this to the first equation:

$$\nabla^2_{\mathbf{x}}L\,\Delta\mathbf{x} - \mathbf{J}_g^T\,\Delta\boldsymbol{\lambda} - \Delta\boldsymbol{\mu} = -\nabla f(\mathbf{x}) + \mathbf{J}_g^T\boldsymbol{\lambda} + \boldsymbol{\mu} \tag{249}$$

we get

$$\left(\nabla^2_{\mathbf{x}}L + \mathbf{X}^{-1}\mathbf{M}\right)\Delta\mathbf{x} - \mathbf{J}_g^T\,\Delta\boldsymbol{\lambda} = -\nabla f(\mathbf{x}) + \mathbf{J}_g^T\boldsymbol{\lambda} + \frac{1}{t}\mathbf{X}^{-1}\mathbf{1} \tag{250}$$

Combining this equation with the second one, we get an equation system of two vector variables $\Delta\mathbf{x}$ and $\Delta\boldsymbol{\lambda}$ with a symmetric coefficient matrix:

$$\begin{bmatrix} \nabla^2_{\mathbf{x}}L + \mathbf{X}^{-1}\mathbf{M} & -\mathbf{J}_g^T\\ -\mathbf{J}_g & \mathbf{0} \end{bmatrix} \begin{bmatrix}\Delta\mathbf{x}\\ \Delta\boldsymbol{\lambda}\end{bmatrix} = \begin{bmatrix} -\nabla f(\mathbf{x}) + \mathbf{J}_g^T\boldsymbol{\lambda} + \frac{1}{t}\mathbf{X}^{-1}\mathbf{1}\\ \mathbf{g}(\mathbf{x}) \end{bmatrix} \tag{251}$$
Solving this equation system we get $\Delta\mathbf{x}$ and $\Delta\boldsymbol{\lambda}$, and we can further find $\Delta\boldsymbol{\mu}$ by solving the third equation

$$\mathbf{M}\,\Delta\mathbf{x} + \mathbf{X}\,\Delta\boldsymbol{\mu} = \frac{1}{t}\mathbf{1} - \mathbf{X}\mathbf{M}\mathbf{1} \tag{252}$$

to get

$$\Delta\boldsymbol{\mu} = \frac{1}{t}\mathbf{X}^{-1}\mathbf{1} - \boldsymbol{\mu} - \mathbf{X}^{-1}\mathbf{M}\,\Delta\mathbf{x} \tag{253}$$
We now consider the two special cases of linear and quadratic
programming:
- Linear programming:

$$\min\ \mathbf{c}^T\mathbf{x}\qquad\text{subject to}\qquad \mathbf{A}\mathbf{x}=\mathbf{b},\quad\mathbf{x}\ge\mathbf{0} \tag{254}$$

We have
- $\nabla f(\mathbf{x}) = \mathbf{c}$;
- $\nabla^2 f(\mathbf{x}) = \mathbf{0}$;
- $\mathbf{g}(\mathbf{x}) = \mathbf{A}\mathbf{x}-\mathbf{b}$, so $\mathbf{J}_g = \mathbf{A}$ and $\nabla^2 g_j(\mathbf{x}) = \mathbf{0}$.

The Lagrangian is

$$L(\mathbf{x},\boldsymbol{\lambda},\boldsymbol{\mu}) = \mathbf{c}^T\mathbf{x} - \boldsymbol{\lambda}^T(\mathbf{A}\mathbf{x}-\mathbf{b}) - \boldsymbol{\mu}^T\mathbf{x} \tag{255}$$

The KKT conditions are:

$$\mathbf{c} - \mathbf{A}^T\boldsymbol{\lambda} - \boldsymbol{\mu} = \mathbf{0},\qquad \mathbf{A}\mathbf{x} - \mathbf{b} = \mathbf{0},\qquad \mathbf{X}\mathbf{M}\mathbf{1} = \frac{1}{t}\mathbf{1}$$

The search direction of the iteration can be found by solving this equation system:

$$\begin{bmatrix} \mathbf{0} & -\mathbf{A}^T & -\mathbf{I}\\ \mathbf{A} & \mathbf{0} & \mathbf{0}\\ \mathbf{M} & \mathbf{0} & \mathbf{X} \end{bmatrix}\begin{bmatrix}\Delta\mathbf{x}\\ \Delta\boldsymbol{\lambda}\\ \Delta\boldsymbol{\mu}\end{bmatrix} = -\begin{bmatrix} \mathbf{c}-\mathbf{A}^T\boldsymbol{\lambda}-\boldsymbol{\mu}\\ \mathbf{A}\mathbf{x}-\mathbf{b}\\ \mathbf{X}\mathbf{M}\mathbf{1}-\frac{1}{t}\mathbf{1} \end{bmatrix} \tag{256}$$

By solving the equation $\mathbf{A}\mathbf{x}=\mathbf{b}$ (choosing a solution with strictly positive components) we get the initial value $\mathbf{x}_0$.
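A minimal Matlab sketch of the interior point iteration for the LP problem, following Eqs. (245) and (256) (the function name ipmLP, the fixed iteration count, and the update schedule for $t$ are illustrative choices, not the original listing):

\begin{verbatim}
function [x, lam, mu] = ipmLP(A, b, c, x0, t0, alpha, delta, K)
% Interior point method (sketch) for  min c'*x  s.t.  A*x = b,  x >= 0.
% x0: strictly positive initial point with A*x0 = b;  t0: initial barrier
% parameter, scaled up by alpha each iteration;  delta: Newton step size.
[m, n] = size(A);
x  = x0;   t = t0;
mu = 1 ./ (t * x);               % from perturbed complementarity
lam = A' \ (c - mu);             % from the first KKT equation (least squares)
for k = 1:K
    X = diag(x);   M = diag(mu);
    F = [c - A'*lam - mu;        % dual feasibility (stationarity)
         A*x - b;                % primal feasibility
         X*M*ones(n,1) - ones(n,1)/t];   % perturbed complementarity
    J = [zeros(n), -A', -eye(n);
         A, zeros(m, m), zeros(m, n);
         M, zeros(n, m), X];     % Jacobian of Eq. (256)
    d = -J \ F;                  % Newton direction
    x   = x   + delta * d(1:n);  % a practical code would shorten delta
    lam = lam + delta * d(n+1:n+m);   % to keep x > 0 and mu > 0
    mu  = mu  + delta * d(n+m+1:end);
    t = alpha * t;               % tighten the barrier
end
end
\end{verbatim}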
Example: A given linear programming problem is first converted into the standard form of Eq. (254), $\min\ \mathbf{c}^T\mathbf{x}$ subject to $\mathbf{A}\mathbf{x}=\mathbf{b}$ and $\mathbf{x}\ge\mathbf{0}$, where slack variables are appended to $\mathbf{x}$ to convert the inequality constraints into the equality $\mathbf{A}\mathbf{x}=\mathbf{b}$.
Figure 2.27: Interior Point Method (figures/InteriorPointLPexample.eps)
We choose an initial value $\mathbf{x}_0$ inside the feasible region, and an initial parameter $t_0$, which is scaled up by a constant factor in each iteration. We also used the full Newton step size of $\delta=1$. After 8 iterations, the algorithm converged to the optimal solution $\mathbf{x}^*$ corresponding to the maximum function value $f(\mathbf{x}^*)$, as shown in Fig. 2.27. At the end of the iteration, we also get the values of the slack variables.
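For instance, the sketch above could be exercised on a small problem of this form (the data below are hypothetical, not the example plotted in Fig. 2.27):

\begin{verbatim}
% Hypothetical 2-D LP:  max x1 + x2
%   s.t.  x1 + 2*x2 <= 4,  3*x1 + x2 <= 6,  x1, x2 >= 0
% Standard form with slack variables x3, x4 (min -(x1 + x2)):
A  = [1 2 1 0; 3 1 0 1];   b = [4; 6];   c = [-1; -1; 0; 0];
x0 = [0.5; 0.5; 2.5; 4.0];     % strictly positive, satisfies A*x0 = b
[x, lam, mu] = ipmLP(A, b, c, x0, 1, 2, 1, 30);
\end{verbatim}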
- Quadratic programming:

$$\min\ \frac{1}{2}\mathbf{x}^T\mathbf{Q}\mathbf{x} + \mathbf{c}^T\mathbf{x}\qquad\text{subject to}\qquad \mathbf{A}\mathbf{x}=\mathbf{b},\quad\mathbf{x}\ge\mathbf{0} \tag{257}$$

We have
- $\nabla f(\mathbf{x}) = \mathbf{Q}\mathbf{x} + \mathbf{c}$;
- $\nabla^2 f(\mathbf{x}) = \mathbf{Q}$;
- $\mathbf{g}(\mathbf{x}) = \mathbf{A}\mathbf{x}-\mathbf{b}$, so $\mathbf{J}_g = \mathbf{A}$ and $\nabla^2 g_j(\mathbf{x}) = \mathbf{0}$.

The Lagrangian is

$$L(\mathbf{x},\boldsymbol{\lambda},\boldsymbol{\mu}) = \frac{1}{2}\mathbf{x}^T\mathbf{Q}\mathbf{x} + \mathbf{c}^T\mathbf{x} - \boldsymbol{\lambda}^T(\mathbf{A}\mathbf{x}-\mathbf{b}) - \boldsymbol{\mu}^T\mathbf{x} \tag{258}$$

The KKT conditions are:

$$\mathbf{Q}\mathbf{x} + \mathbf{c} - \mathbf{A}^T\boldsymbol{\lambda} - \boldsymbol{\mu} = \mathbf{0},\qquad \mathbf{A}\mathbf{x}-\mathbf{b}=\mathbf{0},\qquad \mathbf{X}\mathbf{M}\mathbf{1}=\frac{1}{t}\mathbf{1}$$

The equation system for the Newton-Raphson method is:

$$\begin{bmatrix} \mathbf{Q} & -\mathbf{A}^T & -\mathbf{I}\\ \mathbf{A} & \mathbf{0} & \mathbf{0}\\ \mathbf{M} & \mathbf{0} & \mathbf{X} \end{bmatrix}\begin{bmatrix}\Delta\mathbf{x}\\ \Delta\boldsymbol{\lambda}\\ \Delta\boldsymbol{\mu}\end{bmatrix} = -\begin{bmatrix} \mathbf{Q}\mathbf{x}+\mathbf{c}-\mathbf{A}^T\boldsymbol{\lambda}-\boldsymbol{\mu}\\ \mathbf{A}\mathbf{x}-\mathbf{b}\\ \mathbf{X}\mathbf{M}\mathbf{1}-\frac{1}{t}\mathbf{1} \end{bmatrix} \tag{259}$$
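A minimal Matlab sketch of the corresponding QP solver, obtained from the LP version above by changing only the objective-dependent terms (again, the function name and iteration parameters are illustrative choices, not the original listing):

\begin{verbatim}
function [x, lam, mu] = ipmQP(Q, A, b, c, x0, t0, alpha, delta, K)
% Interior point method (sketch) for  min 0.5*x'*Q*x + c'*x
% s.t.  A*x = b,  x >= 0;  same structure as ipmLP above.
[m, n] = size(A);
x = x0;   t = t0;
mu  = 1 ./ (t * x);              % from perturbed complementarity
lam = A' \ (Q*x + c - mu);       % from the first KKT equation
for k = 1:K
    X = diag(x);   M = diag(mu);
    F = [Q*x + c - A'*lam - mu;
         A*x - b;
         X*M*ones(n,1) - ones(n,1)/t];
    J = [Q, -A', -eye(n);
         A, zeros(m, m), zeros(m, n);
         M, zeros(n, m), X];     % Jacobian of Eq. (259)
    d = -J \ F;                  % Newton direction
    x   = x   + delta * d(1:n);
    lam = lam + delta * d(n+1:n+m);
    mu  = mu  + delta * d(n+m+1:end);
    t   = alpha * t;
end
end
\end{verbatim}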
Example: A 2-D quadratic programming problem shown below is to minimize four different quadratic functions under the same linear constraints as the linear programming problem in the previous example. In general, the quadratic function is given in the matrix form of Eq. (257), with four different sets of the parameters $\mathbf{Q}$ and $\mathbf{c}$.
As shown in the four panels in the figure below, the iteration of the interior point algorithm brings the solution from the same initial position $\mathbf{x}_0$ to the final position, the optimal solution, for each of the four objective functions $f(\mathbf{x})$. The optimal solution is on the boundary of the feasible region in all cases except the third one, where it is inside the feasible region, i.e., the optimization problem is effectively unconstrained. Also note that the trajectories of the solution during the iteration go straight from the initial guess to the final optimal solution, following the negative gradient direction of the quadratic function, in all cases except the last one, where the trajectory makes a turn while approaching the vertical boundary, evidently due to the significantly greater value of the log barrier function close to the boundary.
Figure 2.28: Interior Point Example (figures/InteriorPointQPexample.eps)
We also note that the dual problem in the SVM algorithm discussed previously is in the same form as Eq. (257), i.e., it is a quadratic programming problem that can be solved by the interior point algorithm.
We first consider using the central path method to solve the following inequality constrained optimization problem:

$$\min_{\mathbf{x}}\ f(\mathbf{x})\qquad\text{subject to}\qquad \mathbf{g}(\mathbf{x})\ge\mathbf{0}$$

Here the inequality constraints are expressed in vector form in terms of the vector function $\mathbf{g}(\mathbf{x}) = [g_1(\mathbf{x}),\dots,g_m(\mathbf{x})]^T$.
The Lagrangian of the problem is

$$L(\mathbf{x},\boldsymbol{\lambda}) = f(\mathbf{x}) - \boldsymbol{\lambda}^T\mathbf{g}(\mathbf{x}) \tag{260}$$

where $\boldsymbol{\lambda} = [\lambda_1,\dots,\lambda_m]^T$ is for the $m$ multipliers, and $\mathbf{G}(\mathbf{x}) = \text{diag}(g_1(\mathbf{x}),\dots,g_m(\mathbf{x}))$ is a diagonal matrix of the $m$ constraint functions. The KKT conditions for $\mathbf{x}^*$ and $\boldsymbol{\lambda}^*$ to be the solution of this problem are:

$$\nabla f(\mathbf{x}) - \sum_{i=1}^m \lambda_i\nabla g_i(\mathbf{x}) = \mathbf{0},\qquad \mathbf{g}(\mathbf{x})\ge\mathbf{0},\qquad \boldsymbol{\lambda}\ge\mathbf{0},\qquad \lambda_i\,g_i(\mathbf{x}) = 0\quad(i=1,\dots,m)$$
Instead of solving these conditions directly, we minimize a new objective function combining $f(\mathbf{x})$ with a barrier:

$$\min_{\mathbf{x}}\ f_t(\mathbf{x}) = f(\mathbf{x}) + \frac{1}{t}\,\phi(\mathbf{x})$$

where $\phi(\mathbf{x})$, called the logarithmic barrier function, is defined as:

$$\phi(\mathbf{x}) = -\sum_{i=1}^m \ln g_i(\mathbf{x})$$

which is smooth and differentiable. To minimize this new objective function we set its gradient to zero:

$$\nabla f_t(\mathbf{x}) = \nabla f(\mathbf{x}) - \frac{1}{t}\sum_{i=1}^m \frac{1}{g_i(\mathbf{x})}\nabla g_i(\mathbf{x}) = \mathbf{0}$$

We further define:

$$\lambda_i = \frac{1}{t\,g_i(\mathbf{x})}\qquad(i=1,\dots,m)$$

Here $\lambda_i$ is non-negative because $t>0$ and $g_i(\mathbf{x})>0$. These $m$ equations can be expressed in matrix form:

$$\mathbf{G}(\mathbf{x})\,\boldsymbol{\lambda} = \frac{1}{t}\,\mathbf{1}$$
Then the above becomes

$$\nabla f(\mathbf{x}) - \sum_{i=1}^m \lambda_i \nabla g_i(\mathbf{x}) = \mathbf{0} \tag{261}$$

The corresponding dual Lagrangian turns out to take the same form as that of the original given above:

$$L(\mathbf{x},\boldsymbol{\lambda}) = f(\mathbf{x}) - \boldsymbol{\lambda}^T\mathbf{g}(\mathbf{x}) \tag{262}$$

and the solution $(\mathbf{x}^*,\boldsymbol{\lambda}^*)$ that minimizes $f_t(\mathbf{x})$ has to satisfy the KKT conditions, the above two equations of both variables $\mathbf{x}$ and $\boldsymbol{\lambda}$:

$$\begin{cases} \nabla f(\mathbf{x}) - \sum_{i=1}^m \lambda_i\nabla g_i(\mathbf{x}) = \mathbf{0}\\[4pt] \mathbf{G}(\mathbf{x})\,\boldsymbol{\lambda} = \frac{1}{t}\,\mathbf{1} \end{cases} \tag{263}$$
Comparing these with the KKT conditions of the original problem, we see they are very similar, except for the following differences:
- The inequality $\mathbf{g}(\mathbf{x})\ge\mathbf{0}$ in the primal feasibility of the original KKT conditions is dropped;
- The inequality $\boldsymbol{\lambda}\ge\mathbf{0}$ in the dual feasibility of the original KKT conditions is dropped, as $t>0$ and $g_i(\mathbf{x})>0$ guarantee $\lambda_i = 1/(t\,g_i(\mathbf{x})) > 0$;
- There is an extra term $\frac{1}{t}$ in complementarity. However, it will vanish if we let $t\rightarrow\infty$, as shown in the illustration below.
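As a small illustration (a hypothetical example, not taken from the figures), consider $f(x)=x^2$ with the single constraint $g(x)=x-1\ge 0$. The barrier problem and its stationarity condition give

$$\min_x\;\left\{x^2-\frac{1}{t}\ln(x-1)\right\}\;\Longrightarrow\; 2x-\frac{1}{t(x-1)}=0\;\Longrightarrow\; x^*(t)=\frac{1+\sqrt{1+2/t}}{2}$$

so $x^*(t)\rightarrow 1$ and $\lambda(t)=1/\big(t(x^*(t)-1)\big)=2x^*(t)\rightarrow 2$ as $t\rightarrow\infty$, recovering the KKT solution $x^*=1$, $\lambda^*=2$ of the original problem, while the complementarity term $\lambda\,g(x^*(t))=1/t$ vanishes.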
The given inequality-constrained optimization problem can now be solved by the central path method based on the new KKT conditions without any inequality constraints. It starts from an initial point $\mathbf{x}_0$ inside the feasible region with an initial $t_0$, and approaches the solution on the boundary iteratively. In each (outer) iteration, the value of the parameter $t$ is increased from its previous value, and the best solution $\mathbf{x}^*(t)$ corresponding to this $t$ is obtained by solving the two equations given above by the (inner) iteration of the Newton-Raphson method. Eventually, when $t\rightarrow\infty$, $\mathbf{x}^*(t)$ approaches some point on the boundary, at which $f(\mathbf{x})$ is minimized subject to the inequality constraints.
Specifically, in each step of the outer iteration, we find the increments $\Delta\mathbf{x}$ and $\Delta\boldsymbol{\lambda}$ by solving the following linear equation system:

$$\begin{bmatrix} \nabla^2 f(\mathbf{x}) - \sum_{i=1}^m \lambda_i\nabla^2 g_i(\mathbf{x}) & -\mathbf{J}_g^T(\mathbf{x})\\ \boldsymbol{\Lambda}\,\mathbf{J}_g(\mathbf{x}) & \mathbf{G}(\mathbf{x}) \end{bmatrix}\begin{bmatrix}\Delta\mathbf{x}\\ \Delta\boldsymbol{\lambda}\end{bmatrix} = -\begin{bmatrix} \nabla f(\mathbf{x}) - \mathbf{J}_g^T(\mathbf{x})\,\boldsymbol{\lambda}\\ \mathbf{G}(\mathbf{x})\,\boldsymbol{\lambda} - \frac{1}{t}\,\mathbf{1} \end{bmatrix} \tag{264}$$

where the 2-by-2 block matrix on the left, with $\boldsymbol{\Lambda} = \text{diag}(\lambda_1,\dots,\lambda_m)$ and $\mathbf{J}_g$ the Jacobian of $\mathbf{g}$, is the Jacobian of the function combining the two functions above, with respect to both variables $\mathbf{x}$ and $\boldsymbol{\lambda}$.
We then update the two variables to get $\mathbf{x}_{n+1} = \mathbf{x}_n + \delta\,\Delta\mathbf{x}$ and $\boldsymbol{\lambda}_{n+1} = \boldsymbol{\lambda}_n + \delta\,\Delta\boldsymbol{\lambda}$ based on their previous values $\mathbf{x}_n$ and $\boldsymbol{\lambda}_n$. The iteration will approach the solutions $\mathbf{x}^*(t)$ and $\boldsymbol{\lambda}^*(t)$ (both functions of the parameter $t$) that minimize the dual Lagrangian $L(\mathbf{x},\boldsymbol{\lambda})$ corresponding to the specific $t$:

$$L(\mathbf{x}^*(t),\boldsymbol{\lambda}^*(t)) = f(\mathbf{x}^*(t)) - \sum_{i=1}^m \lambda_i^*(t)\,g_i(\mathbf{x}^*(t)) = f(\mathbf{x}^*(t)) - \frac{m}{t} \tag{265}$$

The difference $f(\mathbf{x}^*(t)) - L(\mathbf{x}^*(t),\boldsymbol{\lambda}^*(t)) = m/t$ is the duality gap, which approaches zero when $t\rightarrow\infty$.
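The following Matlab sketch runs this outer/inner iteration on a small hypothetical problem, minimizing $\|\mathbf{x}-\mathbf{p}\|^2$ over the triangle $x_1\ge 0$, $x_2\ge 0$, $1-x_1-x_2\ge 0$ (the data, step sizes, and iteration counts are illustrative assumptions):

\begin{verbatim}
% Central path method (sketch) for a hypothetical 2-D problem:
%   min ||x - p||^2   s.t.  g(x) = [x1; x2; 1-x1-x2] >= 0
p   = [2; 2];               % target outside the feasible triangle
x   = [0.2; 0.2];           % strictly feasible initial point
lam = ones(3, 1);           % initial multipliers
t   = 1;                    % barrier parameter
Jg  = [1 0; 0 1; -1 -1];    % Jacobian of g (constant, as g is linear)
for outer = 1:30
    for inner = 1:20        % Newton-Raphson on Eq. (264)
        g   = [x(1); x(2); 1 - x(1) - x(2)];
        F   = [2*(x - p) - Jg'*lam;  diag(g)*lam - ones(3,1)/t];
        J   = [2*eye(2), -Jg';  diag(lam)*Jg, diag(g)];
        d   = -J \ F;
        x   = x   + 0.5*d(1:2);    % damped step; a practical code would
        lam = lam + 0.5*d(3:5);    % also ensure g(x) > 0 and lam > 0
    end
    t = 2*t;                % duality gap m/t = 3/t shrinks by half
end
disp(x')                    % the constrained minimizer is [0.5 0.5]
\end{verbatim}

Here $\nabla^2 f = 2\mathbf{I}$ and the Hessians of the linear $g_i$ vanish, so the (1,1) block of the Jacobian in Eq. (264) reduces to $2\mathbf{I}$.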
The duality relationship is useful as sometimes solving the dual problem may be easier than solving the primal one. Even if the duality gap is not zero, by solving the dual problem we can still get the tightest bound for the true solution, either the maximal lower bound for a minimization problem or the minimal upper bound for a maximization problem, as a good estimate of the true solution when it is unavailable.