It is hoped that this chapter gives the reader some ideas about how a biological visual system might detect and process motion and, at the same time, some hints about how motion may be detected and processed from an engineering point of view. For example, the methods discussed above may be helpful in building a computer vision system that specializes in motion detection, such as detecting global motion or the heading direction of a vehicle (which is closely related to the focus of expansion in the scene). However, it is premature to assume that these are the ways a biological visual system detects motion. It should be emphasized that what is discussed above is based on our current understanding of biological motion processing. Even in the most heavily studied visual areas, such as V1, MT, and MST, we still see only the tip of the iceberg and many questions remain, let alone the many more questions regarding the areas beyond. In the following we list some of these questions.
First, as mentioned earlier, it is not the case that the four types of visual cues (orientation, velocity, spectral composition, and binocular disparity) are processed by four separate pathways to form the four aspects of visual perception (form, motion, color, and depth). Both psychophysical and neurophysiological findings have shown that the detection and perception of visual motion are influenced by various segmentation cues. The results of Stoner and Albright [45] discussed in a previous section are good examples. Similar findings have been reported in psychophysical studies [62], [63], which show that segmentation information other than luminance, such as the relative contrasts, spatial frequencies, and directions of motion of the two gratings, also affects the perceptual coherence of the resulting motion. These findings strongly suggest that, in addition to motion information, segmentation information also plays an important role in motion perception, although how it does so is unknown.
Second, almost all efforts to understand visual motion detection assume that motion information is processed along the V1-MT-MST pathway. However, several studies show that the majority of MT cells remain directionally selective after inactivation of V1 ([61], [37], [38]). This can be explained by another path from the retina to MT, which goes from the retina to the superior colliculus, then to the pulvinar nucleus of the thalamus, and finally to MT. This pathway is intact after a V1 lesion, so it seems likely to be a major source of the residual input to MT. Moreover, combining a V1 lesion with a superior colliculus lesion eliminates all visually driven activity in MT. But little is known about this additional pathway to MT.
Third, an interesting question is what happens beyond MT and MST to form the visual perception of motion. Not much is known. But we can at least cite some very exciting neurophysiological studies ([58], [59], [60]) which found that microstimulating neurons in areas MT and MST can influence perceptual judgements of motion direction. These findings indicate that the responses of individual neurons are directly related to visual perception, wherever it is formed in the brain, but little is known about how this happens.
Fourth, a recent psychophysical study [66] showed that lesions in some parts of the cerebellum can also impair the perception of visual motion. This is the first report that the cerebellum is also involved in visual perception. Since all known visual areas are in the cerebrum, and the cerebellum is known mainly for its essential role in the coordination of movement, this finding raises many more questions about how and where else visual motion information is processed in the brain.
Finally, attention also plays an important role in visual processing. Increased attention enhances performance not only at the behavioral level [68], [69], but also at the neuronal level. It was reported in [67], [68] that an increased amount of attention directed toward a stimulus can enhance the responsiveness and selectivity of the V4 neurons that process it. However, no effect of attention was found in V1. Recently, evidence has been found [70] that there may be attentional modulation of MT responses. These findings show that both the temporal pathway (responsible for object recognition) and the parietal pathway (responsible for motion perception) are affected by attention as early as V4 and MT, while their common preprocessing stage, V1, is not affected. Based on psychophysical experiments, a two-level model of motion processing has been suggested [69]: motion is detected automatically at the lower level in the absence of attention to the stimulus, and it is then further processed at the higher level, mediated by attention. Attention may be a top-down process carried out by some kind of feedback projection from the higher levels of the hierarchy to the lower ones, but little is known about how this actually takes place at the neuronal level. For a more general discussion of visual awareness, the reader is referred to [71], [72].
In summary, what we have discussed in this chapter is only a very preliminary understanding of, and speculation about, how our visual system might detect and process motion. We have much more to learn, not only about visual areas such as V1, MT, and MST, but also about the areas beyond them that are more closely related to visual awareness, attention, and perception. And, from an engineering point of view, the more we find out about biological visual systems, the more we may be able to apply to artificial systems in order to build more intelligent machines. In this regard, the study of vision is almost unlimited.