The hierarchical structure of the network model is illustrated in
Fig. 1. The input layer simulates V1 neurons, which are
selective to local velocities. The visual field is covered by an array of
small patches (represented by the 7 by 7 squares in the first
layer in Fig. 1a) each containing 32 V1 nodes for 8 directions
and 4 speeds (Fig. 1b). The middle layer is composed of
groups each containing a set of more than 32 component
MT nodes, which are all fully connected to all
V1 nodes in a local region in the input layer composed of
patches.
Two neighboring middle-layer groups overlap by 3 patches in the input
layer (and so share the V1 nodes in those patches). The output layer has a set of
groups (three in Fig. 1a) each containing more than 32 pattern
MT nodes fully connected to all the middle layer nodes. The number of groups
in the output layer is arbitrary as they are independent of each other (nodes
in different groups do not interact, as there is no lateral connection between
two groups). The weights between two consecutive layers are determined by
a competitive learning algorithm, as discussed in the Appendix.
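
The layout of the input layer and the receptive regions of the middle-layer groups can be made concrete with a short sketch. The 7 by 7 patch array, the 8 directions, the 4 speeds, and the 3-patch overlap come from the description above. The region size (3 by 3 patches) and the stride (2 patches) are assumptions made here for illustration only, since the text does not state them; they were chosen because they reproduce the stated 3-patch overlap between neighboring groups. The names `v1_index`, `region_patches`, and `middle_groups` are likewise hypothetical.

```python
# Sketch of the input-layer layout and middle-layer receptive regions.
N_PATCH = 7                      # 7 x 7 patches in the input layer (Fig. 1a)
N_DIR, N_SPEED = 8, 4            # preferred directions and speeds (Fig. 1b)
V1_PER_PATCH = N_DIR * N_SPEED   # 32 V1 nodes per patch

REGION = 3   # ASSUMPTION: each middle-layer group sees a 3 x 3 block of patches
STRIDE = 2   # ASSUMPTION: neighboring blocks are shifted by 2 patches


def v1_index(row, col, direction, speed):
    """Flat index of the V1 node at patch (row, col) with the given tuning."""
    return ((row * N_PATCH + col) * N_DIR + direction) * N_SPEED + speed


def region_patches(top, left):
    """Set of (row, col) patches covered by the block whose corner is (top, left)."""
    return {(top + dr, left + dc) for dr in range(REGION) for dc in range(REGION)}


def middle_groups():
    """Map each group's top-left corner to the set of patches it is connected to."""
    starts = range(0, N_PATCH - REGION + 1, STRIDE)
    return {(t, l): region_patches(t, l) for t in starts for l in starts}


groups = middle_groups()
n_v1 = N_PATCH * N_PATCH * V1_PER_PATCH          # total V1 nodes
shared = groups[(0, 0)] & groups[(0, 2)]         # two horizontal neighbors
fan_in = REGION * REGION * V1_PER_PATCH          # V1 inputs per component MT node

print(n_v1)                        # 1568
print(len(groups))                 # 9 groups under the assumed region and stride
print(len(shared))                 # 3 shared patches, matching the text
print(len(shared) * V1_PER_PATCH)  # 96 shared V1 nodes under these assumptions
print(fan_in)                      # 288
```

Under these assumed values, each component MT node would receive input from 288 V1 nodes, and two horizontally or vertically adjacent groups would share one row or column of 3 patches. A different region size or stride would change the group count and fan-in, so these figures should be read as a consequence of the assumptions rather than as values reported for the model.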