
Mapping to a high-dimensional space

In classification problems, it is sometimes the case that if the d-dimensional data points ${\bf x}_n,\;(n=1,\cdots,N)$ are nonlinearly mapped into a higher dimensional space by ${\bf y}=f({\bf x})$, different classes/clusters in the data can be better separated than in the original space. For example, suppose two groups of data points in a 2D space $(x_1, x_2)$ form two concentric circles centered at the origin. It is obviously impossible to linearly separate these two groups of points in the 2D space. However, if the 2D points are nonlinearly mapped into a 3D space by $(x_1, x_2) \mapsto (x_1, x_2, x_1^2+x_2^2)$, the two groups can easily be separated by a plane, as the third dimension corresponds to the squared radius of the circles, as the sketch below shows.
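The following is a minimal numerical sketch of this two-circle example (NumPy assumed; the radii 1 and 3 and the separating value 5 are illustrative choices, not from the text):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Two concentric circles in 2D, both centered at the origin
theta = rng.uniform(0, 2 * np.pi, 100)
inner = np.column_stack([np.cos(theta), np.sin(theta)])          # radius 1
outer = np.column_stack([3 * np.cos(theta), 3 * np.sin(theta)])  # radius 3

def lift(x):
    """Nonlinear map (x1, x2) -> (x1, x2, x1^2 + x2^2)."""
    return np.column_stack([x[:, 0], x[:, 1], x[:, 0]**2 + x[:, 1]**2])

# In the lifted 3D space the third coordinate is the squared radius,
# so the plane y3 = 5 (any value between 1 and 9 works) separates the
# two groups, which no line in the original 2D space can do.
print(lift(inner)[:, 2].max())  # ~1.0, all inner points below the plane
print(lift(outer)[:, 2].min())  # ~9.0, all outer points above the plane
\end{verbatim}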

The drawback of doing classification in a high-dimensional space is the increased computational cost. This difficulty can be overcome if the computation depends exclusively on inner products $({\bf y} \cdot {\bf y}')={\bf y}^T{\bf y}'$ of the mapped vectors, so that the costly explicit mapping can be avoided by the kernel trick, based on the kernel defined as:

\begin{displaymath}k({\bf x},{\bf x'})={\bf y}^T{\bf y}'=f({\bf x})^T f({\bf x}') \end{displaymath}
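As a concrete instance of such a kernel (an illustrative example, not taken from the text above): the homogeneous polynomial kernel $k({\bf x},{\bf x}')=({\bf x}^T{\bf x}')^2$ on 2D inputs corresponds to the explicit feature map $f(x_1,x_2)=(x_1^2,\;\sqrt{2}\,x_1 x_2,\;x_2^2)$. The sketch below verifies numerically that the two computations agree, with the kernel side never forming the 3D feature vectors:

\begin{verbatim}
import numpy as np

def f(x):
    """Explicit feature map whose inner product equals (x^T x')^2:
       f(x1, x2) = (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def k(x, xp):
    """Same quantity computed directly in the original 2D space."""
    return (x @ xp) ** 2

x  = np.array([1.0, 2.0])
xp = np.array([3.0, -1.0])

print(f(x) @ f(xp))  # inner product in the 3D feature space: 1.0
print(k(x, xp))      # identical value via the kernel trick:   1.0
\end{verbatim}

The saving grows with the degree of the kernel: the kernel side always costs a single d-dimensional inner product, while the explicit feature space grows combinatorially in size.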

The kernel method can be applied to any algorithm whose computations can be carried out solely in terms of inner products between data points, allowing it to exploit a nonlinear mapping into a high-dimensional feature space without the associated computational cost. The support vector machine (SVM) is a typical application of the kernel method, and kernel PCA (KPCA) is another.


Ruye Wang 2007-03-26