Simulation results

The tuning surface of a component MT node, shown as the top-left plot in Fig. 3, was first obtained by plotting the node's responses to motion patterns of all directions and speeds. As can be seen, the MT node responded most strongly to (and became the winner for) motion of a certain speed and direction, while it responded less strongly (and was no longer the winner) to other motion patterns of similar directions and speeds. This property closely resembles that of real MT neurons ([36]).

**Figure 3:** The tuning surface of the MT nodes.
The horizontal axes represent direction (0 to 7) and speed (0 to ... ...rent motion of two directions 0, 45, 90, 135, and 180 degrees apart.

The network model was then tested with transparent motion, which was simulated by presenting two sets of dots moving in different directions to all patches of the input layer. (A plaid pattern composed of two moving gratings could be treated the same way.) The angle between the two motion directions was varied systematically from $0^{\circ}$ to $180^{\circ}$ in $45^{\circ}$ increments. The tuning surfaces of a component MT node for these five motions are shown in Fig. 3. As the angle between the two directions grows, the single peak of the tuning surface splits into two lower peaks, each corresponding to one of the two motions. At the same time, because the V1 nodes are inhibited in the null direction, the two motion stimuli suppress each other more strongly as their directions move farther apart, until the response of the MT node is totally suppressed when the two motions are in opposite directions.
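The splitting and mutual suppression described above can be illustrated with a toy calculation. The sketch below is not the network model of this paper: it assumes a ring of hypothetical nodes with Gaussian direction tuning (`sigma_deg`), a subtractive null-direction inhibition of strength `k_inhib`, and rectification at zero. All function names and parameter values are illustrative assumptions.

```python
import math

def circ_diff(a_deg, b_deg):
    """Signed angular difference a - b, wrapped into [-180, 180) degrees."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

def bump(delta_deg, sigma_deg):
    """Gaussian direction tuning (an assumed shape, not taken from the paper)."""
    return math.exp(-0.5 * (delta_deg / sigma_deg) ** 2)

def tuning_curve(stim_dirs, sigma_deg=30.0, k_inhib=1.0, step_deg=1.0):
    """Rectified response of hypothetical nodes, one per preferred direction."""
    n = int(round(360.0 / step_deg))
    curve = []
    for i in range(n):
        pref = i * step_deg
        # Excitation from every stimulus close to the preferred direction.
        excite = sum(bump(circ_diff(pref, d), sigma_deg) for d in stim_dirs)
        # Assumed null-direction inhibition: drive from stimuli that move
        # close to the direction opposite to `pref`.
        inhibit = sum(bump(circ_diff(pref + 180.0, d), sigma_deg) for d in stim_dirs)
        curve.append(max(0.0, excite - k_inhib * inhibit))
    return curve

def count_peaks(curve, floor=1e-6):
    """Number of local maxima above `floor` on the circular direction axis."""
    n = len(curve)
    return sum(
        1 for i in range(n)
        if curve[i] > floor
        and curve[i] > curve[i - 1]
        and curve[i] >= curve[(i + 1) % n]
    )

def sweep(angles=(0, 45, 90, 135, 180), center=90.0):
    """Peak count and peak height for each angle between the two motions."""
    rows = []
    for angle in angles:
        dirs = [center - angle / 2.0, center + angle / 2.0]
        curve = tuning_curve(dirs)
        rows.append((angle, count_peaks(curve), max(curve)))
    return rows

for angle, n_peaks, height in sweep():
    print(f"angle={angle:3d}  peaks={n_peaks}  max response={height:.2f}")
```

With these assumed parameters the sketch gives a single peak at $0^{\circ}$ and $45^{\circ}$, two lower peaks at $90^{\circ}$ and $135^{\circ}$, and no response at $180^{\circ}$. The angle at which the peak splits depends on the assumed tuning width, so it should not be read as a prediction of the model.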

By integrating the responses of many component MT nodes in the middle layer, the pattern MT nodes in the output layer can detect the global motions presented to the input layer. Depending on the angle between the two directions of the transparent motion, a pattern MT node will respond in one of three ways: it will detect (a) a single global coherent motion (e.g., a moving plaid) when the angle is small and there is a single peak in the component MT nodes' tuning, (b) two separate global motions (e.g., two moving gratings) when the angle is larger and there are two peaks in the component MT nodes' tuning, or (c) no global motion at all when the two motions are in opposite directions and the peaks become too weak to be detected.
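The three cases can be written as a simple readout rule. The sketch below is only an illustration: it assumes the integrated component responses are available as a list indexed by direction (0 to 7, as on the direction axis of Fig. 3), and it uses a hypothetical detection `threshold` and hypothetical labels that are not part of the model description.

```python
def classify_global_motion(responses, threshold=0.2):
    """Map a circular direction-tuning profile to one of the three outcomes.

    `responses` holds one value per direction bin; `threshold` is an assumed
    detection level below which a peak is treated as too weak to detect.
    """
    n = len(responses)
    peaks = [
        i for i in range(n)
        if responses[i] >= threshold
        and responses[i] > responses[i - 1]        # index -1 wraps around
        and responses[i] >= responses[(i + 1) % n]
    ]
    if not peaks:
        return "none", peaks          # case (c): nothing detectable
    if len(peaks) == 1:
        return "coherent", peaks      # case (a): one global motion
    return "transparent", peaks       # case (b): separate global motions

# Hand-made profiles over 8 direction bins, for illustration only.
print(classify_global_motion([0.1, 0.6, 1.5, 0.6, 0.1, 0.0, 0.0, 0.0]))
print(classify_global_motion([0.0, 0.9, 0.3, 0.9, 0.0, 0.0, 0.0, 0.0]))
print(classify_global_motion([0.0, 0.1, 0.05, 0.1, 0.0, 0.0, 0.0, 0.0]))
```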

These results can account for the findings of several psychophysical and physiological studies. First, it was found in [63] that subjects perceived a coherent plaid motion when the angle between the directions of two moving gratings was smaller than $90^{\circ}$, and two gratings when the angle was greater than $90^{\circ}$. This finding can be explained by the splitting of the single tuning peak (coherent plaid motion) into two peaks (grating motions) as the angle increases. Second, as found in [41], MT cells did not respond to the transparent motion of two sets of random dots moving in opposite directions when each dot was paired with another dot moving in the other direction. However, the MT cells did respond when the dots were not paired. This finding can be readily explained by the model, whose output is almost totally suppressed when paired dots move in opposite directions. If the dots are not paired, some of the component MT nodes will respond to one motion and some to the other. After integration in the output layer, two motions will be detected by the pattern MT nodes. Third, it was found in [64] that the perceived direction of plaid motion was strongly biased towards the direction of the higher-contrast grating. This can be explained by a response property of V1 reported in another study [65]: the response of V1 increases monotonically with the contrast of the input. This property implies that the direction tuning peak of the MT cells will be higher when the contrast of the input increases. When the tuning peaks corresponding to the two grating motions are close enough, they combine into a single peak whose center lies closer to the higher peak, which corresponds to the grating of higher contrast. As a result, the direction of the perceived plaid motion is biased toward the grating with higher contrast.
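The contrast bias in the third finding can be checked with a small numerical sketch. It assumes two Gaussian tuning peaks of equal width whose heights are simply proportional to grating contrast (the source states only that the V1 response increases monotonically with contrast), and it works on a linear rather than circular direction axis. The function name and all parameter values are hypothetical.

```python
import math

def merged_peak_direction(dir_a, dir_b, contrast_a, contrast_b,
                          sigma_deg=30.0, step_deg=0.5):
    """Direction (degrees) at which the summed tuning curve is largest.

    Peak heights are taken to be proportional to contrast, which is an
    assumption made for this illustration.
    """
    lo = min(dir_a, dir_b) - 3.0 * sigma_deg
    hi = max(dir_a, dir_b) + 3.0 * sigma_deg
    n_steps = int(round((hi - lo) / step_deg))
    best_dir, best_resp = lo, -1.0
    for i in range(n_steps + 1):
        theta = lo + i * step_deg
        resp = (contrast_a * math.exp(-0.5 * ((theta - dir_a) / sigma_deg) ** 2)
                + contrast_b * math.exp(-0.5 * ((theta - dir_b) / sigma_deg) ** 2))
        if resp > best_resp:
            best_dir, best_resp = theta, resp
    return best_dir

# Two gratings 45 degrees apart: equal contrast, then a stronger first grating.
equal = merged_peak_direction(45.0, 90.0, 1.0, 1.0)
biased = merged_peak_direction(45.0, 90.0, 1.0, 0.5)
print(f"equal contrast peak: {equal:.1f} deg, unequal contrast peak: {biased:.1f} deg")
```

With equal contrasts the merged peak sits midway between the two grating directions; lowering the contrast of one grating moves the peak toward the other, which is the direction of the bias reported in [64].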

Whether there are one or two peaks in the MT nodes' direction tuning is also affected by the width of the tuning. If the nodes are narrowly tuned, a pattern MT node is more likely to detect two separate motions; if they are broadly tuned, it is more likely to detect a single motion. This property can be related to the physiological finding [45], discussed earlier, that component MT cells respond either to the motions of two gratings or to the coherent motion of a plaid, depending on the luminance of the intersection areas of the two gratings. We speculate that a segmentation cue from outside the motion pathway (indicating whether there are two gratings or a plaid in the scene) may act as an additional input to MT and change the sharpness of MT tuning, thereby causing the different motion percepts.
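The dependence on tuning width can be made explicit under a simplifying assumption that is not part of the model itself: each motion contributes a Gaussian tuning peak of equal height and width $\sigma$, the two directions lie at $\pm\Delta/2$ around their midpoint, and inhibition is ignored. The summed tuning is then

$$
r(\theta) = e^{-(\theta - \Delta/2)^2 / (2\sigma^2)} + e^{-(\theta + \Delta/2)^2 / (2\sigma^2)} ,
$$

which is symmetric about the midpoint, so $r'(0) = 0$, and its curvature there is

$$
r''(0) = \frac{2}{\sigma^2} \left( \frac{\Delta^2}{4\sigma^2} - 1 \right) e^{-\Delta^2 / (8\sigma^2)} .
$$

The midpoint is therefore a local minimum, with one peak on each side, when $\Delta > 2\sigma$, and a single maximum when $\Delta < 2\sigma$. For an illustrative width of $\sigma = 30^{\circ}$ the peak splits once the directions are more than $60^{\circ}$ apart, whereas for $\sigma = 50^{\circ}$ it remains single up to $100^{\circ}$. In this simplified picture, sharpening the tuning lowers the angle at which two separate motions appear.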