In machine learning, artificial neural networks are a category of algorithms inspired by the biological neural networks in the brain and designed to carry out both supervised and unsupervised learning tasks, such as classification and clustering. To understand how such algorithms work, we first consider some basic concepts of biological neural systems.
The human brain consists of about $10^{11}$ neurons interconnected through about $10^{14}$ to $10^{15}$ synaptic junctions to form millions of neural networks. Hundreds of specialized cortical areas are formed based on these networks for different information processing tasks.
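Functionally, a neuron consists of three parts: the dendrites, which receive input signals from other neurons; the cell body (soma), which integrates these inputs; and the axon, which carries the output signal to other neurons through synapses.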
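The function of a neuron can be modeled mathematically. Each neuron, modeled as a node in the neural network, receives an input signal or stimulus $x_i$ from each of $n$ other neurons, and its activation, or net input, is the weighted sum of all such inputs:

$$a = \sum_{i=1}^n w_i x_i \tag{2}$$

where $w_i$ is the weight of the connection carrying the $i$th input. In vector form, with $\mathbf{w} = [w_1, \dots, w_n]^T$ and $\mathbf{x} = [x_1, \dots, x_n]^T$,

$$a = \mathbf{w}^T \mathbf{x} \tag{3}$$

and the output of the neuron is

$$y = g(a) = g(\mathbf{w}^T \mathbf{x}) \tag{4}$$

Here $g$ is an activation function, which typically takes one of the following forms (linear, threshold, sigmoid, and hyperbolic tangent):

$$g(a) = a \tag{5}$$

$$g(a) = \begin{cases} 1 & a \ge 0 \\ 0 & a < 0 \end{cases} \tag{6}$$

$$g(a) = \frac{1}{1 + e^{-a}} \tag{7}$$

$$g(a) = \tanh(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}} \tag{8}$$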
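As a minimal sketch of the neuron model described above, the following Python code computes the net input as a weighted sum and passes it through an activation function. The function names, the example weights, and the particular selection of activation functions (linear, threshold, sigmoid, hyperbolic tangent) are illustrative assumptions rather than part of the original text.

```python
import math

def net_input(weights, inputs):
    """Net input (activation): the weighted sum of all inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

# Four commonly used activation functions (an illustrative selection).
def linear(a):
    return a

def threshold(a):
    return 1.0 if a >= 0 else 0.0

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def tanh(a):
    return math.tanh(a)

def neuron_output(weights, inputs, activation=sigmoid):
    """Output of one neuron: the activation function applied to the net input."""
    return activation(net_input(weights, inputs))

# Hypothetical example: a neuron with three inputs.
x = [1.0, 0.5, -1.0]
w = [0.2, -0.4, 0.1]
a = net_input(w, x)                           # weighted sum of the inputs
y_threshold = neuron_output(w, x, threshold)  # binary output
y_sigmoid = neuron_output(w, x, sigmoid)      # output in the open interval (0, 1)
```

The threshold function yields a binary output, while the sigmoid and hyperbolic tangent are smooth alternatives whose outputs lie in $(0, 1)$ and $(-1, 1)$ respectively.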
The function of a neural network can be modeled mathematically as a hierarchical structure shown below containing multiple layers of neurons, called nodes in the context of artificial neural networks:
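As a rough sketch of such a layered structure, the code below propagates an input vector through a sequence of layers, where each node computes a weighted sum of the previous layer's outputs followed by an activation function. The layer sizes, the weight values, and the choice of a sigmoid activation are assumptions made only for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def layer_forward(weights, inputs, activation=sigmoid):
    """Compute the outputs of one layer.

    weights[j][i] is the weight from input node i to node j of this layer.
    """
    return [activation(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]

def network_forward(layers, inputs):
    """Feed the input through every layer in turn and return the final outputs."""
    signal = inputs
    for weights in layers:
        signal = layer_forward(weights, signal)
    return signal

# Hypothetical network: 2 input nodes -> 3 hidden nodes -> 1 output node.
hidden = [[0.5, -0.5], [1.0, 1.0], [0.0, 0.0]]
output = [[1.0, -1.0, 0.5]]
y = network_forward([hidden, output], [1.0, 1.0])
```

Each inner list of weights corresponds to one node, so the number of inner lists in a layer fixes the number of nodes in that layer.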
The learning paradigms of neural networks are listed below, depending on how the input and output of the network are interpreted.
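A pattern associator is the most general form of neural network: it learns and stores the associative relationship between two sets of patterns represented by vectors, so that presenting an input pattern $\mathbf{x}_k$ recalls the associated output pattern $\mathbf{y}_k$:

$$\mathbf{x}_k \longrightarrow \mathbf{y}_k, \qquad k = 1, \dots, K \tag{9}$$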
Human memory is associative in the sense that, given one pattern, some associated pattern(s) may be produced. Examples include (Evolution, Darwin), (Einstein, $E = mc^2$), and (food, sounding bell, salivation).
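To make the idea of pattern association concrete, here is a small sketch of one classical way to store pattern pairs: a linear associator trained with a Hebbian outer-product rule. This particular learning rule, the bipolar ($\pm 1$) encoding, and the example patterns are assumptions chosen for illustration; the text above does not prescribe a specific method. Exact recall is shown here only for mutually orthogonal input patterns.

```python
def sign(v):
    return 1 if v >= 0 else -1

def train_associator(pairs):
    """Build the weight matrix as the sum of outer products y x^T over all pairs."""
    n_in = len(pairs[0][0])
    n_out = len(pairs[0][1])
    W = [[0] * n_in for _ in range(n_out)]
    for x, y in pairs:
        for j in range(n_out):
            for i in range(n_in):
                W[j][i] += y[j] * x[i]
    return W

def recall(W, x):
    """Recall the output pattern associated with input x (thresholded weighted sums)."""
    return [sign(sum(w * xi for w, xi in zip(row, x))) for row in W]

# Hypothetical pattern pairs; the two input patterns are orthogonal.
pairs = [
    ([1, 1, -1, -1], [1, -1]),
    ([1, -1, 1, -1], [-1, -1]),
]
W = train_associator(pairs)
first = recall(W, [1, 1, -1, -1])
second = recall(W, [1, -1, 1, -1])
```

Because the stored inputs are orthogonal, presenting one of them excites only its own outer-product term, so the associated output is reproduced without interference from the other pair.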
As a special pattern associator, an auto-associator associates a prestored pattern with an incomplete or noisy version of that pattern, so that the stored pattern can be recalled from the corrupted one.
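The following sketch illustrates auto-association with a simple Hopfield-style network: patterns are stored with a Hebbian outer-product rule (with no self-connections), and a noisy input is repeatedly updated until it stops changing. The storage rule, the synchronous update, and the example patterns are assumptions made for illustration and are not taken from the text.

```python
def sign(v):
    return 1 if v >= 0 else -1

def store(patterns):
    """Build a symmetric weight matrix from bipolar patterns, with a zero diagonal."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def recall(W, x, steps=5):
    """Update all nodes synchronously until the state is stable or steps run out."""
    state = list(x)
    for _ in range(steps):
        new_state = [sign(sum(w * s for w, s in zip(row, state))) for row in W]
        if new_state == state:
            break
        state = new_state
    return state

# Two hypothetical, mutually orthogonal stored patterns.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, 1, -1, -1, 1, 1, -1, -1]
W = store([p1, p2])

# A noisy version of p1 with its first component flipped.
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
restored = recall(W, noisy)
```

In this small example the corrupted component is corrected after a single update, and the stored patterns themselves are left unchanged by the update.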
Another special kind of pattern associator takes a vector input $\mathbf{x}$ and produces a real value $y$ as a multivariable function $y = f(\mathbf{x})$ at its only output node.
A classifier is a variation of the pattern associator in which the output patterns are a set of categorical symbols representing different classes $C_1, \dots, C_M$, i.e., each input pattern $\mathbf{x}$ is classified by the network into one of the classes:

$$\mathbf{x} \in C_m, \qquad m \in \{1, \dots, M\} \tag{10}$$
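One common convention for turning network outputs into a class decision, assumed here purely for illustration, is to use one output node per class and assign the input to the class whose node has the largest net input. The weights and class labels below are hypothetical.

```python
def classify(weights, x, labels):
    """Return the label of the output node with the largest weighted sum."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]

# Hypothetical weights: one row per output node (class), two input features.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
labels = ["C1", "C2", "C3"]

first = classify(weights, [2.0, 1.0], labels)
second = classify(weights, [-1.0, -2.0], labels)
```

In practice the weights would be learned from labeled training patterns rather than set by hand as in this sketch.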
Clustering is an unsupervised learning process: the network automatically discovers the regularity in the inputs, so that similar patterns are detected and grouped together in the same cluster or class.
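A minimal sketch of how similar patterns can be grouped without labels is winner-take-all competitive learning: each output node holds a prototype vector, the node closest to the current input wins, and only the winner's prototype is moved toward that input. This specific algorithm, the learning rate, the initial prototypes, and the toy data are assumptions made for illustration.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest(prototypes, x):
    """Index of the prototype closest to x (the winning node)."""
    return min(range(len(prototypes)),
               key=lambda k: squared_distance(prototypes[k], x))

def competitive_learning(data, prototypes, rate=0.5, epochs=10):
    """Move the winning prototype a fraction `rate` of the way toward each input."""
    prototypes = [list(p) for p in prototypes]  # work on a copy
    for _ in range(epochs):
        for x in data:
            k = nearest(prototypes, x)
            prototypes[k] = [p + rate * (xi - p)
                             for p, xi in zip(prototypes[k], x)]
    return prototypes

# Hypothetical data forming two well-separated groups.
data = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],
        [5.0, 5.0], [5.2, 4.9], [4.9, 5.1]]
initial = [[0.5, 0.5], [4.0, 4.0]]

prototypes = competitive_learning(data, initial)
labels = [nearest(prototypes, x) for x in data]
```

After training, each prototype sits near one group of inputs, and the index of the winning node serves as the cluster label for a given pattern.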