Marr and Poggio (1977) proposed an algorithm that solves the correspondence
problem based on the gray levels (assumed to be black and white for simplicity)
of the two retinal images. Their algorithm has the following two phases:
1. Inverse project the two images so that the rays from the two eyes
   intersect to form a 3D grid. Each intersection is assigned a value of
   1 if the rays from the two retinal images carry the same color
   (both white or both black), or a value of 0 if they carry different
   colors (one white, one black). The intersections set to 1 may lie
   on object surfaces; the intersections set to 0 cannot, and are
   therefore excluded from the rest of the algorithm.
2. Iterate to eliminate some of the candidate surface points using two
   assumed constraints:
   - Surface Opacity: The object surface is opaque, so that along
     a ray only the point closest to the eye is visible;
   - Surface Continuity: The surface is in general continuous
     and smooth, i.e., neighboring points tend to have the same or
     similar depths.
The iteration is carried out by treating the grid of intersections as
a dynamic network (called a neural network, although it has nothing to
do with actual neural wiring) in which each intersection represents
a neuron connected to its neighbors by either excitatory or
inhibitory weights. The excitatory connections run in the horizontal
direction, parallel to the two retinal images, to promote depth
continuity; the inhibitory connections run along the rays, so that
only the intersection in front of all others along a ray remains
active while the others are inhibited. Once the network has
stabilized, the active intersections form the
reconstructed object surface, as shown below.
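The two phases can be sketched for a single pair of binary scanlines. This is only an illustrative sketch, not the authors' implementation. The function names, the neighborhood `radius`, the inhibition weight `eps`, the threshold `theta`, and the update rule (excitation minus weighted inhibition plus the initial match value, then thresholded) are all assumptions made for the example. A grid node `(d, x)` stands for the intersection of the ray through left pixel `x` and the ray through right pixel `x - d`.

```python
def initial_matches(left, right, max_disp):
    """Phase 1: grid[d][x] = 1 where left[x] and right[x - d] have the same colour."""
    n = len(left)
    return [[1 if x - d >= 0 and left[x] == right[x - d] else 0
             for x in range(n)]
            for d in range(max_disp + 1)]


def cooperate(grid0, radius=2, eps=0.25, theta=1.5, iterations=10):
    """Phase 2: iterate excitation within a depth layer and inhibition along rays.

    radius, eps, theta and the update rule are assumed values for illustration.
    """
    n_disp, n = len(grid0), len(grid0[0])
    state = [row[:] for row in grid0]
    for _ in range(iterations):
        new = [[0] * n for _ in range(n_disp)]
        for d in range(n_disp):
            for x in range(n):
                if not grid0[d][x]:
                    continue  # excluded in phase 1, never reconsidered
                # Excitation: active neighbours at the same depth (continuity).
                lo, hi = max(0, x - radius), min(n, x + radius + 1)
                excite = sum(state[d][x2] for x2 in range(lo, hi) if x2 != x)
                # Inhibition: active nodes on the same left-eye or right-eye ray (opacity).
                inhibit = 0
                for d2 in range(n_disp):
                    if d2 == d:
                        continue
                    inhibit += state[d2][x]        # same left-eye ray: same x
                    x2 = x + (d2 - d)              # same right-eye ray: x2 - d2 == x - d
                    if 0 <= x2 < n:
                        inhibit += state[d2][x2]
                if excite - eps * inhibit + grid0[d][x] >= theta:
                    new[d][x] = 1
        if new == state:  # network has stabilized
            break
        state = new
    return state
```

For example, with `right` equal to `left` shifted by two pixels, `initial_matches` fills the layer `d = 2` with ones wherever the shift is defined, alongside chance matches in the other layers. `cooperate` then removes the chance matches that lack enough support from neighbors at the same depth. With the assumed parameters the true layer is always retained, but how many false matches are pruned depends on the input and on the parameter choice.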
While this algorithm solves the correspondence problem, it is seriously
challenged by the random-dot experiment (Prazdny 1984) shown below, in which
neighboring points in the retinal images may come from 3D points at different
depths, i.e., the assumption of surface continuity no longer holds. Human
subjects, however, can solve this stereogram effortlessly.
Another algorithm, by Jones and Malik (1992), is based on the idea of
spatial filtering. This method assumes that a set of biologically inspired
spatial filters, tuned to different spatial frequencies (scales) and different
orientations, is available at every retinal location. Each local region in the
retinal image is filtered by this set of filters, and the outputs are stored
as the elements of a vector representing the region. Instead of matching
pixels in the left and right retinal images, this method matches the vector for
a position in one image against the vectors of a set of regions in the
neighborhood (at different lateral displacements) of the corresponding
position in the other image. The binocular disparity is obtained from the
pair of vectors that match best.
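The vector-matching idea can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the published method. The filter bank (first and second Gaussian derivatives at two scales), the border handling, the sum-of-squared-differences error, and all function names are hypothetical choices. A 1D signal also has no orientation, so only scale varies here.

```python
import math


def gaussian_derivative_bank(sigmas=(1.0, 2.0), half_width=4):
    """Assumed filter bank: 1st and 2nd Gaussian derivatives at each scale."""
    bank = []
    ks = range(-half_width, half_width + 1)
    for s in sigmas:
        g = [math.exp(-k * k / (2 * s * s)) for k in ks]
        bank.append([-k / (s * s) * gk for k, gk in zip(ks, g)])                  # odd, edge-like
        bank.append([(k * k / s ** 4 - 1 / (s * s)) * gk for k, gk in zip(ks, g)])  # even, bar-like
    return bank


def response_vector(signal, x, bank):
    """Filter the region around x with every kernel; the outputs form its vector."""
    half = len(bank[0]) // 2
    n = len(signal)
    vec = []
    for kernel in bank:
        acc = 0.0
        for i, w in enumerate(kernel):
            j = min(max(x + i - half, 0), n - 1)  # replicate border samples
            acc += w * signal[j]
        vec.append(acc)
    return vec


def best_disparity(left, right, x, bank, max_disp):
    """Return the lateral displacement d whose right-image vector best matches the left."""
    v_left = response_vector(left, x, bank)
    best_d, best_err = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d < 0:
            break
        v_right = response_vector(right, x - d, bank)
        err = sum((a - b) ** 2 for a, b in zip(v_left, v_right))
        if err < best_err:
            best_d, best_err = d, err
    return best_d
```

Calling `best_disparity(left, right, x, gaussian_derivative_bank(), max_disp)` for each `x` gives a disparity estimate per position. Because each vector summarizes a whole neighborhood at several scales, the comparison is less ambiguous than matching individual pixel values, which is the motivation given in the text above.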
Ruye Wang
1999-11-10