Duality and KKT Conditions

Previously we considered the constrained minimization problem:

#math835# #tex2html_wrap_indisplay19998# (183)

#math836# #tex2html_wrap_indisplay20000# (184)

#tex2html_wrap_inline8622#

#math837# #tex2html_wrap_indisplay20003# (185)

#math838##tex2html_wrap_inline8624#.

In particular, if the objective function #tex2html_wrap_inline8626# and all inequality constraints #math839##tex2html_wrap_inline8628# are convex and the equality constraints #math840##tex2html_wrap_inline8630# are linear (affine), i.e., the feasible region is convex (Section #convexity#4231>), then the problem is a <#4232#>convex optimization<#4232#> with some nice properties as shown below.

This constrained minimization problem given above can be called the <#4233#>primal problem<#4233#>, as there also exists an equivalent <#4234#>dual problem<#4234#>, which may be easier to solve. To construct the dual problem, we first define the <#4235#>dual function<#4235#> as the lower bound of the Lagrangian in Eq. (#LagrangianPrimal#4236>) to be minimized:

#math841# #tex2html_wrap_indisplay20009# (186)

We further find the tightest lower bound, the maximum of the dual function #math842##tex2html_wrap_inline8632# over its variables #math843##tex2html_wrap_inline8634# and #math844##tex2html_wrap_inline8636#, by solving the following dual problem:

#math845# #tex2html_wrap_indisplay20014# (187)

Here #math846##tex2html_wrap_inline8638# for the equality constrains of the primal problem can be either positive or negative, i.e., it is not constrained in the dual problem. However, the sign of #math847##tex2html_wrap_inline8640# for the inequalitie constraints of the primal minimization problem needs to satisfy the following according to Eq. (#mugminmax#4277>) (the second inequality for this minimization primal problem):

#math848# #tex2html_wrap_indisplay20018# (188)

i.e., #math849##tex2html_wrap_inline8642# is subject to the inequality constraint of the dual problem. We denote the optimal solution of the dual problem by #math850##tex2html_wrap_inline8644#, and the maximum of the dual objective function by #math851##tex2html_wrap_inline8646#.

Obviously the primal objective function of any feasible #tex2html_wrap_inline8648# is always no less than its lower bound, the dual objective function of any feasible #math852##tex2html_wrap_inline8650#:

#math853# #tex2html_wrap_indisplay20025# (189)

and this inequality also holds for the optimal solutions #tex2html_wrap_inline8652# and #math854##tex2html_wrap_inline8654#:

#math855# #tex2html_wrap_indisplay20029# (190)

The difference #tex2html_wrap_inline8656# is defined as the <#4305#>duality gap<#4305#>. Obviously #tex2html_wrap_inline8658# always holds and the duality is called <#4306#>weak duality<#4306#>. In some problems where certain conditions (<#4307#>Slater's conditions<#4307#>) are satisfied, #tex2html_wrap_inline8660#, i.e., #tex2html_wrap_inline8662# is the tightest lower bound, the reality is called <#4308#>strong duality<#4308#>.

We now consider the necessary and sufficient conditions for the solutions of both the primal and dual problems. We first introduce the <#4309#>Karush-Kuhn-Tucker (KKT)<#4309#> conditions:

<#4311#>Stationarity:<#4311#> #tex2html_wrap_inline8664# maximizes or minimizes the Lagrangian, i.e.,
#math856# #tex2html_wrap_indisplay20036# (191)
<#4321#>Primal feasibility:<#4321#> #tex2html_wrap_inline8666# must be inside the feasible region:
#math857# #tex2html_wrap_indisplay20039# (192)
<#4330#>Dual feasibility:<#4330#> All multipliers in #math858##tex2html_wrap_inline8668# for the inequalities must take proper sign as specified by Table #PolarityTable#4332>:
#math859# #tex2html_wrap_indisplay20042# (193)
<#4337#>Complementarity:<#4337#> Either #tex2html_wrap_inline8670# or
#math860##tex2html_wrap_inline8672# must be zero depending on whether the jth constraint is inactive or not:
- if inactive, then #math861##tex2html_wrap_inline8674#, but #tex2html_wrap_inline8676#;
- if active, then #tex2html_wrap_inline8678#, but #math862##tex2html_wrap_inline8680#.
Combining both cases, we get (Eq. (#mugeqzVector#4343>)):
#math863# #tex2html_wrap_indisplay20050# (194)

We then prove the following theorem regarding the KKT conditions and the solutions of the primal and dual problems.

<#4351#>Theorem:<#4351#>

Necessary conditions:
If #tex2html_wrap_inline8682# and #math864##tex2html_wrap_inline8684# are solutions for the respective primal and dual problems with strong duality, then they satisfy all KKT conditions.
Sufficient conditions:
If #tex2html_wrap_inline8686# and #math865##tex2html_wrap_inline8688# satisfy all KKT conditions, then they are solutions for the respective primal and dual problems, and strong duality holds, when the primal objective function #math866##tex2html_wrap_inline8690# and the inequality constraints #tex2html_wrap_inline8692# are convex, and equality constraints #tex2html_wrap_inline8694# are linear.

<#4363#>Proof:<#4363#>

Necessary conditions:

As #tex2html_wrap_inline8696# and #math867##tex2html_wrap_inline8698# are respectively the primal and dual solutions, the primal and dual feasibilities must be satisfied, and so is stationarity as the optimal solution #tex2html_wrap_inline8700# satisfies Eq. (#primaldL0#4369>). In addition, we also have #math868##tex2html_wrap_inline8702#. To show complementarity also holds, consider

#math869#

#tex2html_wrap_indisplay20063#	#tex2html_wrap_indisplay20064#	#tex2html_wrap_indisplay20065#
	#tex2html_wrap_indisplay20066#	#tex2html_wrap_indisplay20067#
	#tex2html_wrap_indisplay20068#	#tex2html_wrap_indisplay20069#
	#tex2html_wrap_indisplay20070#	#tex2html_wrap_indisplay20071#
	#tex2html_wrap_indisplay20072#	#tex2html_wrap_indisplay20073#	(195)

The last inequality is due to Eq. (#muggt0#4401>). But as #tex2html_wrap_inline8704# due to the strong duality, the inequalities need to be equalities, i.e., #math870##tex2html_wrap_inline8706# and complementarity holds.

Sufficient conditions:
If #tex2html_wrap_inline8708# and #math871##tex2html_wrap_inline8710# satisfy the KKT conditions, then they are feasible for the respective primal and dual problems. Also, as #tex2html_wrap_inline8712# and all #tex2html_wrap_inline8714# are convex and #tex2html_wrap_inline8716# are linear, Lagrangian in Eq. (#LagrangianPrimal#4411>) as a sum of convex functions is also convex (see Section #convexity#4412>), and stationarity in Eq. (#KKTCondition1#4413>) indicates #tex2html_wrap_inline8718# is a global minimum, therefore

#math872#

#tex2html_wrap_indisplay20083# #tex2html_wrap_indisplay20084# #tex2html_wrap_indisplay20085#

#tex2html_wrap_indisplay20086# #tex2html_wrap_indisplay20087# (196)

where the second line is due to the KKT conditions in Eqs. (#KKTCondition2#4430>) and (#KKTCondition4#4431>).

In summary, we see that the KKT conditions are necessary and satisfied by all solutions of constrained optimization problems, and they can also be sufficient for the solution of such problems if the primal problem further satisfies some additional conditions of convexity, such as in quadratic program if the quadratic objective function is also positive definite. In such a case, strong duality holds and the solution of the dual problem is the tightest lower bound of the primal, the same as the solution of the primal.

Consider, as an example of the inequality constrained minimization problems, the support vector machine for binary classification (Chapter #SVM#4433>). To find the decision boundary that optimally separates two classes, we need to solve the following minimization problem constrained by a set of inequalities imposed by the data samples #math873##tex2html_wrap_inline8720# in the training set:

#math874# #tex2html_wrap_indisplay20090# (197)

where #math875##tex2html_wrap_inline8722#. We note that the quadratic objective function is convex and so are the affine inequality constraints, i.e., this is a convex problem. The solution #tex2html_wrap_inline8724# with minimum norm #tex2html_wrap_inline8726# subject to the inequality constraints can be found by solving the dual problem in terms of the Lagrange multipliers in #math876##tex2html_wrap_inline8728#, subject to certain inequality constraints (detailed derivation in Section #SVM#4454>):

#math877# #tex2html_wrap_indisplay20096# (198)

As the dual function can be written a quadratic form:

#math878# #tex2html_wrap_indisplay20098# (199)

where #tex2html_wrap_inline8730# is an #tex2html_wrap_inline8732# symmetric matrix of which the component in the mth row and nth column is #math879##tex2html_wrap_inline8734#, this is a quadratic programming problem, which can be solved by the interior point algorithm to be discussed in Section #interiorPoint#4485>.

#tex2html_wrap_indisplay20083#	#tex2html_wrap_indisplay20084#	#tex2html_wrap_indisplay20085#
	#tex2html_wrap_indisplay20086#	#tex2html_wrap_indisplay20087#	(196)