Optimization with Inequality Constraints

Optimization problems subject to inequality constraints can be formulated in general as:

$\displaystyle \left\{\begin{tabular}{ll}
max/min: & $f({\bf x})=f(x_1,\cdots,x_N)$\\
s. t.: & $g_j(x_1,\cdots,x_N)\ge 0,\;\;\;\;\;\;\;\;(j=1,\cdots,n)$
\end{tabular}\right.$ (185)

Again, to visualize the problem we first consider an example with $N=2$ and $n=1$, as shown in the figure below for the minimization (left) and maximization (right) of $f({\bf x})$ subject to $g({\bf x})\ge 0$. The constrained solution ${\bf x}^*=[x^*_1,\,x^*_2]^T$ is on the boundary of the feasible region satisfying $g(x_1,x_2)=0$, while the unconstrained extremum is outside the feasible region.

[Figure KKT0a: minimization (left) and maximization (right) of $f({\bf x})$ subject to $g({\bf x})\ge 0$]

Consider the following two possible cases.

In the first case, the unconstrained extremum of $f({\bf x})$ lies outside the feasible region, as in the figure above. The constrained solution ${\bf x}^*$ then lies on the boundary satisfying $g({\bf x}^*)=0$, i.e., the constraint is active, and the gradients of $f$ and $g$ are parallel at ${\bf x}^*$, so that $\bigtriangledown f({\bf x}^*)=\mu^*\bigtriangledown g({\bf x}^*)$ with $\mu^*\ne 0$.

In the second case, the unconstrained extremum of $f({\bf x})$ lies inside the feasible region with $g({\bf x}^*)>0$, i.e., the constraint is inactive. The solution is simply the unconstrained extremum satisfying $\bigtriangledown f({\bf x}^*)={\bf0}$, which corresponds to $\mu^*=0$.

Summarizing the two cases above, we see that $g({\bf x}^*)=0$ but $\mu^*\ne 0$ in the first case, while $g({\bf x}^*)\ne 0$ but $\mu^*=0$ in the second case, i.e., the following complementary slackness condition holds in either case:

$\displaystyle \mu^*\,g({\bf x}^*)=0$ (190)

The discussion above can be generalized from the 2-D case to an $N>2$ dimensional space, in which the optimal solution ${\bf x}^*$ is to be found to extremize the objective $f({\bf x})$ subject to $n$ inequality constraints $g_i({\bf x})\ge 0\;(i=1,\cdots,n)$. To solve this inequality constrained optimization problem, we first construct the Lagrangian:

$\displaystyle L({\bf x},{\bf\mu})=f({\bf x})-\sum_{i=1}^n\mu_i\,g_i({\bf x})$ (191)

We note that in some of the literature, a plus sign is used in front of the summation in the second term. This is equivalent to our formulation here so long as the sign of $\mu$ indicated in Table 188 is negated.
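Explicitly, the plus-sign convention defines the Lagrangian as

$\displaystyle L({\bf x},{\bf\mu})=f({\bf x})+\sum_{i=1}^n\mu'_i\,g_i({\bf x}),
\;\;\;\;\;\;\;\;\mu'_i=-\mu_i$

which yields the same stationarity conditions as equation (191), with each multiplier negated.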

We now set the gradient of the Lagrangian to zero:

$\displaystyle \bigtriangledown_{{\bf x},{\bf\mu}} L({\bf x},{\bf\mu})
=\bigtriangledown_{{\bf x},{\bf\mu}}\left[f({\bf x})
-\sum_{i=1}^n\mu_i\,g_i({\bf x})\right]
={\bf0}$ (192)

and get two equation systems of $N$ and $n$ equations, respectively:

$\displaystyle \bigtriangledown_{\bf x}f({\bf x})
=\sum_{i=1}^n\mu_i \bigtriangledown_{\bf x} g_i({\bf x})$ (193)

and

$\displaystyle \frac{\partial L({\bf x},{\bf\mu})}{\partial\mu_i}
=g_i({\bf x})=0\;\;\;\;(i=1,\cdots,n)$ (194)

The result above for inequality constrained problems is the same as that for the equality constrained problems considered before. However, we note that there is an additional requirement regarding the sign of the scaling coefficients. For an equality constrained problem, the direction of the gradient $\bigtriangledown g({\bf x})$ is of no concern, i.e., the sign of $\lambda$ is unrestricted; but here for an inequality constrained problem, the sign of $\mu$ needs to be consistent with those shown in Table 188, otherwise the constraints may be inactive.
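As a sketch of how equations (193) and (194) can be applied, consider minimizing $f(x_1,x_2)=x_1^2+x_2^2$ subject to $g(x_1,x_2)=x_1+x_2-1\ge 0$ (a hypothetical instance chosen for illustration), assuming the constraint is active. The stationarity and constraint equations are then linear in $(x_1,x_2,\mu)$ and can be solved directly:

```python
import numpy as np

# Minimize f(x1,x2) = x1^2 + x2^2 subject to g(x1,x2) = x1 + x2 - 1 >= 0,
# assuming the constraint is active at the solution.
# Stationarity (193): grad f = mu * grad g  =>  2*x1 = mu,  2*x2 = mu
# Active constraint (194):                      x1 + x2 = 1
# These three equations are linear in (x1, x2, mu):
A = np.array([[2.0, 0.0, -1.0],   # 2*x1 - mu = 0
              [0.0, 2.0, -1.0],   # 2*x2 - mu = 0
              [1.0, 1.0,  0.0]])  # x1 + x2   = 1
b = np.array([0.0, 0.0, 1.0])

x1, x2, mu = np.linalg.solve(A, b)
print(x1, x2, mu)   # 0.5 0.5 1.0
```

Since $\mu^*=1>0$ is consistent with the sign requirement for minimization with a $\ge 0$ constraint, the constraint is indeed active and ${\bf x}^*=[0.5,\,0.5]^T$ is the constrained minimum.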

We now consider the general optimization of an N-D objective function $f({\bf x})$ subject to multiple constraints, both equalities and inequalities:

$\displaystyle \begin{tabular}{ll}
max/min: & $f({\bf x})=f(x_1,\cdots,x_N)$\\
s. t.: & $\left\{\begin{array}{l}
h_i({\bf x})=0,\;\;\;\;\;\;\;\;(i=1,\cdots,m)\\
g_j({\bf x})\ge 0,\;\;\;\;\;\;\;\;(j=1,\cdots,n)
\end{array}\right.$
\end{tabular}$ (195)

For notational convenience, we represent the $m+n$ equality and inequality constraints in vector form as:

$\displaystyle \begin{tabular}{ll}
max/min: & $f({\bf x})$\\
s. t.: & $\left\{\begin{array}{l}
{\bf h}({\bf x})={\bf0}\\
{\bf g}({\bf x})\le{\bf0}\;\;\mbox{or}\;\;{\bf g}({\bf x})\ge{\bf0}
\end{array}\right.$
\end{tabular}$ (196)

where ${\bf h}({\bf x})=[h_1({\bf x}),\cdots,h_m({\bf x})]^T$ and ${\bf g}({\bf x})=[g_1({\bf x}),\cdots,g_n({\bf x})]^T$.

To solve this optimization problem, we first construct the Lagrangian

$\displaystyle L({\bf x},{\bf\lambda},{\bf\mu})
=f({\bf x})-\sum_{i=1}^m\lambda_i\,h_i({\bf x})-\sum_{j=1}^n\mu_j\,g_j({\bf x})
=f({\bf x})-{\bf\lambda}^T {\bf h}({\bf x})-{\bf\mu}^T{\bf g}({\bf x})$ (197)

where the Lagrange multipliers in ${\bf\lambda}=[\lambda_1,\cdots,\lambda_m]^T$ and ${\bf\mu}=[\mu_1,\cdots,\mu_n]^T$ are for the $m$ equality and $n$ non-negative constraints, respectively, and then set its gradient with respect to both ${\bf\lambda}$ and ${\bf\mu}$ as well as ${\bf x}$ to zero. The solution ${\bf x}^*$ can then be obtained by solving the resulting equation system. While $\lambda_i$ can be either positive or negative, the sign of $\mu_j$ needs to be consistent with those specified in Table 188; otherwise the corresponding inequality constraint is inactive.
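Such problems can also be solved numerically. As a sketch, the following uses SciPy's `minimize` on a hypothetical instance of (196) with one equality and one inequality constraint (in SciPy's convention, an `'ineq'` constraint likewise means the constraint function is non-negative):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical instance of (196): minimize f(x) = x1^2 + x2^2
# subject to  h(x) = x1 - x2 = 0   (equality)
# and         g(x) = x1 + x2 - 1 >= 0   (inequality, active here).
f = lambda x: x[0]**2 + x[1]**2
cons = [{'type': 'eq',   'fun': lambda x: x[0] - x[1]},
        {'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1.0}]

res = minimize(f, x0=np.array([2.0, 0.0]), constraints=cons)
print(res.x)   # approximately [0.5, 0.5]
```

The solver (SLSQP, SciPy's default when constraints are given) internally works with the same Lagrangian stationarity conditions discussed above.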

Example:

Find the extremum of $f(x_1,x_2)=x^2_1+x^2_2$ subject to each of the three different constraints: $x_1+x_2=1$, $x_1+x_2\le 1$, and $x_1+x_2\ge 1$.

[Figure optConstrainedEx: contours of $f(x_1,x_2)=x_1^2+x_2^2$ and the constraint boundary $x_1+x_2=1$]
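Interpreting the extremum as the minimum (the quadratic $f$ is unbounded above), the three variants can be checked numerically; this is a sketch using SciPy, where an `'ineq'` constraint means the constraint function is non-negative:

```python
import numpy as np
from scipy.optimize import minimize

# Minimize f(x1,x2) = x1^2 + x2^2 under each of the three constraints.
f = lambda x: x[0]**2 + x[1]**2
x0 = np.array([2.0, 0.0])

# x1 + x2 = 1: equality constraint, solution on the line
eq = minimize(f, x0, constraints=[{'type': 'eq',   'fun': lambda x: x[0] + x[1] - 1}])
# x1 + x2 <= 1: constraint inactive, unconstrained minimum is feasible
le = minimize(f, x0, constraints=[{'type': 'ineq', 'fun': lambda x: 1 - x[0] - x[1]}])
# x1 + x2 >= 1: constraint active, solution on the boundary
ge = minimize(f, x0, constraints=[{'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1}])

print(eq.x)   # approximately [0.5, 0.5]
print(le.x)   # approximately [0.0, 0.0]
print(ge.x)   # approximately [0.5, 0.5]
```

The results illustrate the two cases discussed earlier: for $x_1+x_2\le 1$ the constraint is inactive ($\mu^*=0$) and the solution is the unconstrained minimum at the origin, while for $x_1+x_2=1$ and $x_1+x_2\ge 1$ the constraint is active and the solution $[0.5,\,0.5]^T$ lies on the boundary.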