The tuning surface of a component MT node was first obtained by plotting the responses of the node to motion patterns of all directions and speeds; it is shown as the top-left plot in Fig. 3. As can be seen, the MT node responded most strongly to (and became the winner for) a motion of a certain speed and direction, while it responded less strongly (and was no longer the winner) to other motion patterns of similar directions and speeds. This property closely resembles that of real MT neurons ([36]).
The network model was then tested with transparent motion, which was simulated
by presenting two sets of dots moving in different directions to all patches
of the input layer. (A plaid pattern composed of two moving gratings could be
treated the same way.) The angle between the two motion directions was varied
systematically in fixed increments, and the tuning surfaces of a component MT
node to the five resulting motions are shown in
Fig. 3. It can be seen that when the angle between the two
directions gets larger, the single peak of the tuning surface splits into two
lower peaks, each corresponding to one of the two motions. At the same time,
due to the V1 nodes' inhibition in the null direction, the two motion stimuli
suppress each other more strongly as their directions move further apart, until
eventually the response of the MT node is totally suppressed when the two
motions are in opposite directions.
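The splitting and mutual suppression described above can be illustrated with a minimal sketch. It is not the network model itself: it assumes Gaussian direction tuning of width `sigma`, and it models the null-direction inhibition as a Gaussian centred opposite each motion with weight `w_inh`. The function names, the parameter values, and the five angles in the loop are all hypothetical choices for illustration, not the values used in the experiment.

```python
import numpy as np

PREFS = np.arange(-180.0, 180.0, 1.0)  # direction axis in degrees (1-degree grid)

def wrap_deg(d):
    """Wrap angle differences into [-180, 180)."""
    return (d + 180.0) % 360.0 - 180.0

def bump(center, sigma):
    """Gaussian direction tuning centred on `center` (assumed tuning shape)."""
    return np.exp(-0.5 * (wrap_deg(PREFS - center) / sigma) ** 2)

def tuning_profile(angle, sigma=30.0, w_inh=1.0):
    """Response along the direction axis to two motions `angle` degrees apart."""
    dirs = (-angle / 2.0, angle / 2.0)
    excitation = sum(bump(d, sigma) for d in dirs)
    # Assumption: each motion inhibits the direction opposite to it (null direction).
    inhibition = sum(bump(d + 180.0, sigma) for d in dirs)
    return np.maximum(excitation - w_inh * inhibition, 0.0)

def count_peaks(profile, thresh=0.05):
    """Count circular local maxima that exceed `thresh`."""
    prev, nxt = np.roll(profile, 1), np.roll(profile, -1)
    return int(np.sum((profile > prev) & (profile >= nxt) & (profile > thresh)))

# Illustrative angles only; print angle, number of peaks, and peak height.
for angle in (0, 45, 90, 135, 180):
    p = tuning_profile(angle)
    print(angle, count_peaks(p), round(float(p.max()), 2))
```

Under these assumptions the profile has one peak at small angles, two lower peaks at intermediate angles, and no response when the motions are opposite, because excitation and inhibition then cancel exactly.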
By integrating the responses of many component MT nodes in the middle layer, the pattern MT nodes in the output layer can detect the global motions presented to the input layer. Depending on the angle between the two directions of the transparent motion, a pattern MT node will respond in three different ways: it will detect either (a) a single global coherent motion (e.g., a moving plaid), when the angle is small and there is a single peak in the component MT nodes' tuning, (b) two separate global motions (e.g., two moving gratings) when the angle is bigger and there are two peaks in the component MT nodes' tuning, or (c) no global motion at all when the two peaks become too weak to be detected as they are too far apart (in opposite directions).
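The three-way outcome can be sketched as a simple decision rule on a pooled direction profile. This is only an illustration of cases (a) to (c): the function name `classify_global_motion`, the detection threshold, and the peak-counting readout are assumptions, and the actual pattern MT nodes integrate component responses through learned connections rather than an explicit rule.

```python
import numpy as np

def classify_global_motion(pooled, detect_thresh=0.2):
    """Label a circular direction profile by its number of detectable peaks.

    `pooled` is a hypothetical 1-D array of pooled component MT responses,
    indexed by direction; `detect_thresh` is an assumed detection level.
    """
    pooled = np.asarray(pooled, dtype=float)
    prev, nxt = np.roll(pooled, 1), np.roll(pooled, -1)
    is_peak = (pooled > prev) & (pooled >= nxt) & (pooled > detect_thresh)
    n_peaks = int(is_peak.sum())
    if n_peaks == 0:
        return "none"         # case (c): peaks too weak to be detected
    if n_peaks == 1:
        return "coherent"     # case (a): a single global motion
    return "transparent"      # case (b): separate global motions

print(classify_global_motion([0.0, 1.0, 0.0, 0.0]))
print(classify_global_motion([0.0, 0.6, 0.1, 0.6, 0.0, 0.0]))
print(classify_global_motion([0.0, 0.1, 0.0, 0.1, 0.0]))
```

The threshold plays the role of the detectability limit in case (c): weak peaks are present in the last profile but fall below it.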
These results can account for the findings of several psychophysical and
physiological studies. First, it was found in [63] that
subjects perceived either a coherent plaid motion when the angle
between the directions of two moving gratings was smaller than a threshold
value, or two separate gratings when the angle was greater than a threshold
value. This finding can
be easily explained by the splitting of the single tuning peak (coherent plaid
motion) into two (grating motions) as the angle increases. Second, as found in
[41], the MT cells did not respond to the transparent motion
of two sets of random dots moving in opposite directions, when each dot was
paired with another moving in the other direction. However, the MT cells did
respond when the dots were not paired. This finding can be readily explained by
the model whose output is almost totally suppressed under paired dots moving in
opposite directions. If the dots are not paired, some of the component MT nodes
will respond to one motion, some to the other. After the integration of the
output layer, two motions will be detected by the pattern MT nodes in the
output layer. Third, it was found in [64] that the perceived
plaid motion direction was strongly biased towards the direction of the
higher-contrast grating. This can be explained by a response property
of V1 reported in another study [65]: the response of V1
increases monotonically with the contrast of the input. This property
implies that the direction tuning peak of the MT cells will be higher when the
contrast of the input increases. When the tuning peaks corresponding to two
grating motions are close enough, they combine into a single peak whose center
is closer to the higher peak corresponding to the grating of higher contrast.
As a result, the direction of the perceived plaid motion is biased toward the
grating with higher contrast.
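The contrast bias can be checked numerically with a small sketch. It assumes Gaussian tuning peaks whose heights scale linearly with contrast, which is one simple instance of the monotonic contrast response; the function name `merged_peak_direction`, the width `sigma`, and the example directions and contrasts are hypothetical.

```python
import numpy as np

def merged_peak_direction(d1, d2, c1, c2, sigma=30.0):
    """Direction (degrees) of the maximum of two summed tuning peaks.

    d1, d2: grating directions; c1, c2: peak heights, assumed proportional
    to contrast. No circular wrapping is applied, so the directions should
    lie well inside (-180, 180).
    """
    dirs = np.linspace(-180.0, 180.0, 3601)  # 0.1-degree grid
    profile = (c1 * np.exp(-0.5 * ((dirs - d1) / sigma) ** 2)
               + c2 * np.exp(-0.5 * ((dirs - d2) / sigma) ** 2))
    return float(dirs[np.argmax(profile)])

# Equal contrasts: the merged peak sits midway between the two gratings.
print(merged_peak_direction(-20.0, 20.0, 1.0, 1.0))
# Higher contrast for the grating at -20 degrees: the peak shifts toward it.
print(merged_peak_direction(-20.0, 20.0, 1.0, 0.5))
```

The two gratings are placed 40 degrees apart with `sigma` of 30 degrees so that the peaks merge into one, which is the regime in which the bias is described.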
Whether there are one or two peaks in the MT nodes' direction tuning is also affected by the width of the tuning. If the tuning is narrow, a pattern MT node is more likely to detect two separate motions; otherwise, it is more likely to detect a single motion. This property can be related to the physiological finding [45], discussed earlier, that component MT cells respond either to the motions of two gratings or to the coherent motion of a plaid, depending on the luminance of the intersection areas of the two gratings. We speculate that some segmentation cue (whether there are two gratings or a plaid in the scene) from outside the motion pathway may act as an additional input to MT and change the sharpness of MT tuning, thereby causing the different motion percepts.
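The dependence on tuning width can be made concrete with a sketch that assumes two equal Gaussian tuning peaks. For that idealized case the sum is bimodal only when the separation exceeds twice the standard deviation, so the same pair of motions yields two peaks under narrow tuning and one under broad tuning. The function name and the numerical values are hypothetical.

```python
import numpy as np

def n_peaks_two_bumps(sep_deg, sigma_deg, rel_thresh=0.05):
    """Count peaks in the sum of two equal Gaussians `sep_deg` apart."""
    x = np.linspace(-180.0, 180.0, 3601)  # 0.1-degree grid, no wrapping
    f = (np.exp(-0.5 * ((x - sep_deg / 2.0) / sigma_deg) ** 2)
         + np.exp(-0.5 * ((x + sep_deg / 2.0) / sigma_deg) ** 2))
    mid = f[1:-1]
    # Interior local maxima that are not negligible relative to the maximum.
    is_peak = (mid > f[:-2]) & (mid >= f[2:]) & (mid > rel_thresh * f.max())
    return int(is_peak.sum())

# Same 60-degree separation, two assumed tuning widths.
print(n_peaks_two_bumps(60.0, 20.0))  # narrow tuning (60 > 2 * 20)
print(n_peaks_two_bumps(60.0, 40.0))  # broad tuning (60 < 2 * 40)
```

In this picture, an external segmentation cue that sharpens the tuning would act like a reduction of `sigma_deg`, moving the same stimulus from the one-peak to the two-peak regime.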