Optimization with Inequality Constraints

Optimization problems subject to inequality constraints can be formulated in general as:

$\displaystyle \left\{\begin{tabular}{ll}
max/min: & $f({\bf x})=f(x_1,\cdots,x_N)$\\
s. t.: & $g_j(x_1,\cdots,x_N)\ge 0,\;\;\;\;\;\;\;\;(j=1,\cdots,n)$
\end{tabular}\right.$ (185)

Again, to visualize the problem we first consider an example with $N=2$ and $n=1$, as shown in the figure below for the minimization (left) and maximization (right) of $f({\bf x})$ subject to $g({\bf x})\ge 0$. The constrained solution ${\bf x}^*=[x^*_1,\,x^*_2]^T$ is on the boundary of the feasible region satisfying $g(x_1,x_2)=0$, while the unconstrained extremum is outside the feasible region.

[Figure KKT0a: minimization (left) and maximization (right) of $f({\bf x})$ subject to $g({\bf x})\ge 0$]

Consider the following two possible cases.

In the first case, the unconstrained extremum of $f({\bf x})$ lies outside the feasible region, as in the figure above. The constrained solution ${\bf x}^*$ then lies on the boundary satisfying $g({\bf x}^*)=0$, i.e., the constraint is active, and the gradients of $f$ and $g$ are parallel at ${\bf x}^*$, so that $\bigtriangledown f({\bf x}^*)=\mu^*\bigtriangledown g({\bf x}^*)$ with $\mu^*\ne 0$.

In the second case, the unconstrained extremum of $f({\bf x})$ lies inside the feasible region with $g({\bf x}^*)>0$, i.e., the constraint is inactive. The solution is simply the unconstrained extremum satisfying $\bigtriangledown f({\bf x}^*)={\bf0}$, which corresponds to $\mu^*=0$.

Summarizing the two cases above, we see that $g({\bf x}^*)=0$ but $\mu^*\ne 0$ in the first case, while $g({\bf x}^*)\ne 0$ but $\mu^*=0$ in the second case, i.e., the following complementary slackness condition holds in either case:

$\displaystyle \mu^*\,g({\bf x}^*)=0$ (190)

The discussion above can be generalized from the 2-D case to an $N>2$ dimensional space, in which the optimal solution ${\bf x}^*$ is to be found to extremize the objective $f({\bf x})$ subject to $n$ inequality constraints $g_i({\bf x})\ge 0\;(i=1,\cdots,n)$. To solve this inequality constrained optimization problem, we first construct the Lagrangian:

$\displaystyle L({\bf x},{\bf\mu})=f({\bf x})-\sum_{i=1}^n\mu_i\,g_i({\bf x})$ (191)

We note that in some of the literature, a plus sign is used in front of the summation in the second term. This is equivalent to our formulation here so long as the sign of $\mu$ indicated in Table 188 is negated.
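Explicitly, the plus-sign convention defines the Lagrangian as

$\displaystyle L({\bf x},{\bf\mu})=f({\bf x})+\sum_{i=1}^n\mu'_i\,g_i({\bf x}),
\;\;\;\;\;\;\;\;\mu'_i=-\mu_i$

which yields the same stationarity conditions as equation (191), with each multiplier negated.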

We now set the gradient of the Lagrangian to zero:

$\displaystyle \bigtriangledown_{{\bf x},{\bf\mu}} L({\bf x},{\bf\mu})
=\bigtriangledown_{{\bf x},{\bf\mu}}\left[f({\bf x})
-\sum_{i=1}^n\mu_i\,g_i({\bf x})\right]
={\bf0}$ (192)

and get two equation systems of $N$ and $n$ equations, respectively:

$\displaystyle \bigtriangledown_{\bf x}f({\bf x})
=\sum_{i=1}^n\mu_i \bigtriangledown_{\bf x} g_i({\bf x})$ (193)

and

$\displaystyle \frac{\partial L({\bf x},{\bf\mu})}{\partial\mu_i}
=g_i({\bf x})=0\;\;\;\;(i=1,\cdots,n)$ (194)

The result above for inequality constrained problems is the same as that for the equality constrained problems considered before. However, we note that there is an additional requirement regarding the sign of the scaling coefficients. For an equality constrained problem, the direction of the gradient $\bigtriangledown g({\bf x})$ is of no concern, i.e., the sign of $\lambda$ is unrestricted; but here for an inequality constrained problem, the sign of $\mu$ needs to be consistent with those shown in Table 188, otherwise the constraints may be inactive.
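As a sketch of how equations (193) and (194) can be applied, consider minimizing $f(x_1,x_2)=x_1^2+x_2^2$ subject to $g(x_1,x_2)=x_1+x_2-1\ge 0$ (a hypothetical instance chosen for illustration), assuming the constraint is active. The stationarity and constraint equations are then linear in $(x_1,x_2,\mu)$ and can be solved directly:

```python
import numpy as np

# Minimize f(x1,x2) = x1^2 + x2^2 subject to g(x1,x2) = x1 + x2 - 1 >= 0,
# assuming the constraint is active at the solution.
# Stationarity (193): grad f = mu * grad g  =>  2*x1 = mu,  2*x2 = mu
# Active constraint (194):                      x1 + x2 = 1
# These three equations are linear in (x1, x2, mu):
A = np.array([[2.0, 0.0, -1.0],   # 2*x1 - mu = 0
              [0.0, 2.0, -1.0],   # 2*x2 - mu = 0
              [1.0, 1.0,  0.0]])  # x1 + x2   = 1
b = np.array([0.0, 0.0, 1.0])

x1, x2, mu = np.linalg.solve(A, b)
print(x1, x2, mu)   # 0.5 0.5 1.0
```

Since $\mu^*=1>0$ is consistent with the sign requirement for minimization with a $\ge 0$ constraint, the constraint is indeed active and ${\bf x}^*=[0.5,\,0.5]^T$ is the constrained minimum.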

We now consider the general optimization of an N-D objective function $f({\bf x})$ subject to multiple constraints, both equalities and inequalities:

$\displaystyle \begin{tabular}{ll}
max/min: & $f({\bf x})=f(x_1,\cdots,x_N)$\\
s. t.: & $\left\{\begin{array}{l}
h_i({\bf x})=0,\;\;\;\;\;\;\;\;(i=1,\cdots,m)\\
g_j({\bf x})\ge 0,\;\;\;\;\;\;\;\;(j=1,\cdots,n)
\end{array}\right.$
\end{tabular}$ (195)

For notational convenience, we represent the $m+n$ equality and inequality constraints in vector form as:

$\displaystyle \begin{tabular}{ll}
max/min: & $f({\bf x})$\\
s. t.: & $\left\{\begin{array}{l}
{\bf h}({\bf x})={\bf0}\\
{\bf g}({\bf x})\le{\bf0}\;\;\mbox{or}\;\;{\bf g}({\bf x})\ge{\bf0}
\end{array}\right.$
\end{tabular}$ (196)

where ${\bf h}({\bf x})=[h_1({\bf x}),\cdots,h_m({\bf x})]^T$ and ${\bf g}({\bf x})=[g_1({\bf x}),\cdots,g_n({\bf x})]^T$.

To solve this optimization problem, we first construct the Lagrangian

$\displaystyle L({\bf x},{\bf\lambda},{\bf\mu})
=f({\bf x})-\sum_{i=1}^m\lambda_i\,h_i({\bf x})-\sum_{j=1}^n\mu_j\,g_j({\bf x})
=f({\bf x})-{\bf\lambda}^T {\bf h}({\bf x})-{\bf\mu}^T{\bf g}({\bf x})$ (197)

where the Lagrange multipliers in ${\bf\lambda}=[\lambda_1,\cdots,\lambda_m]^T$ and ${\bf\mu}=[\mu_1,\cdots,\mu_n]^T$ are for the $m$ equality and $n$ non-negative constraints, respectively, and then set its gradient with respect to both ${\bf\lambda}$ and ${\bf\mu}$ as well as ${\bf x}$ to zero. The solution ${\bf x}^*$ can then be obtained by solving the resulting equation system. While $\lambda_i$ can be either positive or negative, the sign of $\mu_j$ needs to be consistent with those specified in Table 188; otherwise the corresponding inequality constraint is inactive.
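Such problems can also be solved numerically. As a sketch, the following uses SciPy's `minimize` on a hypothetical instance of (196) with one equality and one inequality constraint (in SciPy's convention, an `'ineq'` constraint likewise means the constraint function is non-negative):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical instance of (196): minimize f(x) = x1^2 + x2^2
# subject to  h(x) = x1 - x2 = 0   (equality)
# and         g(x) = x1 + x2 - 1 >= 0   (inequality, active here).
f = lambda x: x[0]**2 + x[1]**2
cons = [{'type': 'eq',   'fun': lambda x: x[0] - x[1]},
        {'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1.0}]

res = minimize(f, x0=np.array([2.0, 0.0]), constraints=cons)
print(res.x)   # approximately [0.5, 0.5]
```

The solver (SLSQP, SciPy's default when constraints are given) internally works with the same Lagrangian stationarity conditions discussed above.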

Example:

Find the extremum of $f(x_1,x_2)=x^2_1+x^2_2$ subject to each of the three different constraints: $x_1+x_2=1$, $x_1+x_2\le 1$, and $x_1+x_2\ge 1$.

[Figure optConstrainedEx: contours of $f(x_1,x_2)=x_1^2+x_2^2$ and the constraint boundary $x_1+x_2=1$]
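Interpreting the extremum as the minimum (the quadratic $f$ is unbounded above), the three variants can be checked numerically; this is a sketch using SciPy, where an `'ineq'` constraint means the constraint function is non-negative:

```python
import numpy as np
from scipy.optimize import minimize

# Minimize f(x1,x2) = x1^2 + x2^2 under each of the three constraints.
f = lambda x: x[0]**2 + x[1]**2
x0 = np.array([2.0, 0.0])

# x1 + x2 = 1: equality constraint, solution on the line
eq = minimize(f, x0, constraints=[{'type': 'eq',   'fun': lambda x: x[0] + x[1] - 1}])
# x1 + x2 <= 1: constraint inactive, unconstrained minimum is feasible
le = minimize(f, x0, constraints=[{'type': 'ineq', 'fun': lambda x: 1 - x[0] - x[1]}])
# x1 + x2 >= 1: constraint active, solution on the boundary
ge = minimize(f, x0, constraints=[{'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1}])

print(eq.x)   # approximately [0.5, 0.5]
print(le.x)   # approximately [0.0, 0.0]
print(ge.x)   # approximately [0.5, 0.5]
```

The results illustrate the two cases discussed earlier: for $x_1+x_2\le 1$ the constraint is inactive ($\mu^*=0$) and the solution is the unconstrained minimum at the origin, while for $x_1+x_2=1$ and $x_1+x_2\ge 1$ the constraint is active and the solution $[0.5,\,0.5]^T$ lies on the boundary.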