For a decision hyperplane $H_0:\ \mathbf{w}^T\mathbf{x} + b = 0$ to separate the two classes $P = \{(\mathbf{x}_i, +1)\}$ and $N = \{(\mathbf{x}_i, -1)\}$, it has to satisfy

$$y_i(\mathbf{w}^T\mathbf{x}_i + b) > 0$$

for all $m$ training samples $(\mathbf{x}_i, y_i)$ with labels $y_i \in \{+1, -1\}$: a positive sample on the positive side and a negative sample on the negative side both make the product positive.
The optimal plane should be in the middle of the two classes, so that the distance from the plane to the closest point on either side is the same. We define two additional planes $H_+$ and $H_-$ that are parallel to $H_0$ and go through the point closest to the plane on either side:

$$H_+:\ \mathbf{w}^T\mathbf{x} + b = +1, \qquad H_-:\ \mathbf{w}^T\mathbf{x} + b = -1$$

(the scale of $\mathbf{w}$ and $b$ can always be chosen so that the right-hand sides are $\pm 1$).
Moreover, the distances from the origin to the three parallel planes $H_-$, $H_0$ and $H_+$ are, respectively, $|b+1|/\|\mathbf{w}\|$, $|b|/\|\mathbf{w}\|$, and $|b-1|/\|\mathbf{w}\|$, and the distance between planes $H_-$ and $H_+$ is $2/\|\mathbf{w}\|$. Our goal is to find the optimal decision hyperplane in terms of $\mathbf{w}$ and $b$ with maximal distance between $H_-$ and $H_+$, or, equivalently, minimal $\|\mathbf{w}\|^2/2$. Now the classification problem can be formulated as:
$$\min_{\mathbf{w},\,b}\ \frac{1}{2}\|\mathbf{w}\|^2 \qquad \text{subject to} \qquad y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1, \quad i = 1, \dots, m.$$
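As an illustration (not part of the original derivation), this hard-margin problem can be handed directly to a generic convex solver. The sketch below assumes the `numpy` and `cvxpy` packages and a made-up, linearly separable two-cluster toy dataset; names such as `X`, `y`, `w`, `b` are illustrative only.

```python
# Minimal sketch: solve the hard-margin primal QP
#   minimize (1/2)||w||^2   subject to   y_i (w^T x_i + b) >= 1
# on a toy, linearly separable 2-D dataset (assumed setup, not from the text).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.3, (20, 2)),      # class P, labels +1
               rng.normal([-2, -2], 0.3, (20, 2))])   # class N, labels -1
y = np.hstack([np.ones(20), -np.ones(20)])

w = cp.Variable(2)
b = cp.Variable()
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(y, X @ w + b) >= 1]        # y_i (w^T x_i + b) >= 1
cp.Problem(objective, constraints).solve()

print("w =", w.value, "b =", b.value)
print("margin width 2/||w|| =", 2 / np.linalg.norm(w.value))
```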
Since the objective function is quadratic, this constrained optimization problem is called a quadratic program (QP) problem. (If the objective function were linear instead, the problem would be a linear program (LP) problem.) This QP problem can be solved by the method of Lagrange multipliers, by minimizing the following Lagrangian
$$L_p(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^m \alpha_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + b) - 1 \right], \qquad \alpha_i \ge 0$$

with respect to $\mathbf{w}$ and $b$. Setting the derivatives to zero,

$$\frac{\partial L_p}{\partial \mathbf{w}} = \mathbf{0} \;\Longrightarrow\; \mathbf{w} = \sum_{i=1}^m \alpha_i y_i \mathbf{x}_i, \qquad\qquad \frac{\partial L_p}{\partial b} = 0 \;\Longrightarrow\; \sum_{i=1}^m \alpha_i y_i = 0,$$

and substituting these conditions back into $L_p$ gives the dual problem:

$$\max_{\boldsymbol{\alpha}}\ L_D(\boldsymbol{\alpha}) = \sum_{i=1}^m \alpha_i - \frac{1}{2}\sum_{i=1}^m \sum_{j=1}^m \alpha_i \alpha_j y_i y_j\,\mathbf{x}_i^T\mathbf{x}_j \qquad \text{subject to} \quad \alpha_i \ge 0, \ \ \sum_{i=1}^m \alpha_i y_i = 0.$$
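The substitution step can be spelled out briefly (this follows the standard derivation and is not quoted from the original): using $\mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i$ and $\sum_i \alpha_i y_i = 0$,

$$L_p = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^m \alpha_i y_i\,\mathbf{w}^T\mathbf{x}_i - b\sum_{i=1}^m \alpha_i y_i + \sum_{i=1}^m \alpha_i
= \frac{1}{2}\|\mathbf{w}\|^2 - \|\mathbf{w}\|^2 - 0 + \sum_{i=1}^m \alpha_i
= \sum_{i=1}^m \alpha_i - \frac{1}{2}\sum_{i=1}^m \sum_{j=1}^m \alpha_i \alpha_j y_i y_j\,\mathbf{x}_i^T\mathbf{x}_j,$$

which is exactly $L_D$.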
Solving this dual problem (an easier problem than the primal one), we get the optimal multipliers $\alpha_i$, from which the weight vector $\mathbf{w} = \sum_{i=1}^m \alpha_i y_i \mathbf{x}_i$ of the optimal plane can be found. Those points on either of the two planes $H_+$ and $H_-$ (for which the equality $y_i(\mathbf{w}^T\mathbf{x}_i + b) = 1$ holds) are called support vectors, and they correspond to positive Lagrange multipliers $\alpha_i > 0$. The training depends only on the support vectors, while all other samples away from the planes $H_+$ and $H_-$ are not important.
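To make this concrete, here is a minimal sketch of the dual, under the same assumptions and toy data-generating setup as the primal sketch above (`numpy` and `cvxpy` assumed). It solves the dual QP, reads off the support vectors as the samples with $\alpha_i > 0$ (the `1e-5` threshold is an arbitrary numerical tolerance), and reconstructs $\mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i$.

```python
# Sketch: solve the dual QP
#   maximize  sum_i alpha_i - (1/2) sum_ij alpha_i alpha_j y_i y_j x_i^T x_j
#   subject to  alpha_i >= 0  and  sum_i alpha_i y_i = 0
# on the same kind of toy data as the primal sketch (assumed, not from the text).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.3, (20, 2)),      # class P, labels +1
               rng.normal([-2, -2], 0.3, (20, 2))])   # class N, labels -1
y = np.hstack([np.ones(20), -np.ones(20)])

G = y[:, None] * X                     # rows are y_i * x_i, so the quadratic
alpha = cp.Variable(len(y))            # term is ||G^T alpha||^2
objective = cp.Maximize(cp.sum(alpha) - 0.5 * cp.sum_squares(G.T @ alpha))
constraints = [alpha >= 0, y @ alpha == 0]
cp.Problem(objective, constraints).solve()

sv = np.where(alpha.value > 1e-5)[0]   # support vectors: positive multipliers
w = G.T @ alpha.value                  # w = sum_i alpha_i y_i x_i
print("support vector indices:", sv)
print("w =", w)
```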
For a support vector $\mathbf{x}_i$ (on the $H_-$ or $H_+$ plane), the constraining condition is

$$y_i(\mathbf{w}^T\mathbf{x}_i + b) = 1, \qquad i \in sv,$$

where $sv$ denotes the index set of the support vectors.
Since $y_i \in \{+1, -1\}$ implies $1/y_i = y_i$, the offset can be obtained from any single support vector as $b = y_i - \mathbf{w}^T\mathbf{x}_i$; in practice it is usually averaged over all support vectors:

$$b = \frac{1}{N_{sv}} \sum_{i \in sv} \left( y_i - \mathbf{w}^T\mathbf{x}_i \right),$$

where $N_{sv}$ is the number of support vectors.
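As a cross-check (an illustration under the same toy-data assumptions, not part of the original text), a library SVM exposes the same quantities: the support vectors, the weight vector $\mathbf{w}$, and the offset $b$. The sketch below assumes scikit-learn is available; a very large `C` approximates the hard-margin case described here.

```python
# Sketch: fit a linear SVM with scikit-learn (assumed available), read off the
# support vectors, w, and b, and check b against the averaging formula above.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.3, (20, 2)),      # class P, labels +1
               rng.normal([-2, -2], 0.3, (20, 2))])   # class N, labels -1
y = np.hstack([np.ones(20), -np.ones(20)])

clf = SVC(kernel="linear", C=1e6).fit(X, y)           # large C ~ hard margin

w = clf.coef_[0]                       # w = sum_i alpha_i y_i x_i
b = clf.intercept_[0]                  # offset b
sv_X = clf.support_vectors_            # the support vectors themselves
sv_y = y[clf.support_]                 # their labels

print("number of support vectors:", len(sv_X))
print("w =", w, "b =", b)
# b recovered by averaging y_i - w^T x_i over the support vectors:
print("averaged b =", np.mean(sv_y - sv_X @ w))
```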