
The Competitive Learning Law

For mathematical convenience, we assume that all input vectors and all weight vectors are normalized, in the sense that their components sum to one:

\begin{displaymath}\sum_{i=1}^n x_i=1, \;\;\;\;\sum_{i=1}^n w_{ij}=1\;\;\;(j=1,\cdots,m) \end{displaymath}

During training, all input patterns are presented to the input layer one at a time in a random order. Every time a pattern $X$ is presented to the input layer of the network, the weights are modified by the following learning law:

\begin{displaymath}W_i^{new}=W_i^{old}+\eta(X-W_i^{old})u_i=W_i^{old}+\triangle W_i u_i \end{displaymath}

where

\begin{displaymath}u_i=\left\{ \begin{array}{ll} 1 & \mbox{if $y_i$\ is the winner} \\
0 & \mbox{otherwise} \end{array}
\right. \end{displaymath}


\begin{displaymath}\triangle W_i \stackrel{\triangle}{=}\eta(X-W_i^{old}) \end{displaymath}

and $0<\eta<1$ is the learning rate.

This learning law can now be written as

\begin{displaymath}\left\{ \begin{array}{ll}
\mbox{for the winner:} & W_j^{new}=(1-\eta)W_j^{old}+\eta X \\
\mbox{for all losers:} & W_i^{new}=W_i^{old}\;\;(i \neq j)
\end{array} \right. \end{displaymath}

We note that the new weight vector is still normalized:

\begin{displaymath}\sum_{i=1}^n w_{ij}^{new}=(1-\eta)\sum_{i=1}^n w_{ij}^{old}
+\eta \sum_{i=1}^n x_i=(1-\eta)+\eta=1
\end{displaymath}
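
As a quick numerical check (an illustrative example with made-up numbers, not from the text): take $n=2$, $\eta=0.5$, $W_j^{old}=(0.2,\;0.8)^T$ and $X=(0.6,\;0.4)^T$, both normalized. Then

\begin{displaymath}W_j^{new}=(1-0.5)\left(\begin{array}{c}0.2\\0.8\end{array}\right)
+0.5\left(\begin{array}{c}0.6\\0.4\end{array}\right)
=\left(\begin{array}{c}0.4\\0.6\end{array}\right) \end{displaymath}

whose components again sum to one.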

We see that the winner's weight vector is modified so that it moves closer to the current input vector $X$, while all other weight vectors remain unchanged. Since for an output node $y_j$ to win, its weight vector $W_j$ has to satisfy

\begin{displaymath}W_j^TX=\vert W_j\vert\,\vert X\vert\cos\phi > W_i^TX\;\;\;\mbox{for any $i\neq j$} \end{displaymath}

where $\phi$ is the angle between the two vectors $X$ and $W_j$; in other words, the distance between $X$ and $W_j$

\begin{displaymath}\vert W_j-X\vert^2=\vert W_j\vert^2+\vert X\vert^2-2\vert W_j\vert\,\vert X\vert\cos\phi \end{displaymath}

must be smaller than that between $X$ and any other $W_i$, we realize that the learning law always pulls the weight vector closest to the current input vector even closer to it, so that the corresponding winning node becomes more likely to win whenever a pattern similar to the current $X$ is presented in the future. The overall effect of such a learning process is to pull the weight vector of each output node toward the center of a cluster of similar input patterns, and that node will then win the competition whenever a pattern in the cluster is presented. If there exist $c$ clusters in the feature space, each will be represented by an output node. The remaining $m-c$ output nodes may never win and therefore do not represent any cluster.
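
The whole training procedure can likewise be sketched in Python (a minimal sketch under the stated assumptions; the data, parameter values, and names are illustrative only):

import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    # Scale so that components sum to one, matching the normalization assumption.
    return v / v.sum(axis=-1, keepdims=True)

def train(X, m, eta=0.1, epochs=50):
    # X: (num_patterns, n) normalized input patterns; m: number of output nodes.
    n = X.shape[1]
    W = normalize(rng.random((m, n)))          # random normalized initial weights
    for _ in range(epochs):
        for x in rng.permutation(X):           # present patterns in random order
            j = np.argmax(W @ x)               # winner: largest inner product W_j^T X
            W[j] = (1 - eta) * W[j] + eta * x  # pull winner toward X; losers unchanged
    return W

# Two made-up clusters of normalized 3-D patterns.
A = normalize(rng.random((20, 3)) + np.array([5.0, 0.0, 0.0]))
B = normalize(rng.random((20, 3)) + np.array([0.0, 0.0, 5.0]))
W = train(np.vstack([A, B]), m=4)   # with c=2 clusters, roughly m-c=2 nodes never win

After training, each winning node's weight vector sits near the center of one cluster, as described above.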


Ruye Wang 2002-12-09