A Short Treatise on Stereo Vision

At the beginning of this chapter, Prof. Banchoff mentions a sneaky way to fool someone: take a photograph of a room and mount a life-size print of it just inside the room. The viewer cannot tell the difference between the two-dimensional image and the three-dimensional space it represents. Actually, Prof. Banchoff was simplifying for the sake of his audience; true depth perception is more complex, and the image must be placed at least ten feet from the doorway to be effective. This is because of the particularities of our system of depth perception, which must solve a number of dimensional problems, as I will briefly explain.

Each eye perceives only a flat, 2-D image of the world. Actually, it is not "flat" but a projection onto the curved 2-D interior (the retina) of the spherical eye. The images on the two retinas are nearly identical, but are displaced slightly in the horizontal direction because the eyes sit at different positions in the head. The issue then becomes how to convert two 2-D images into a single 3-D image of the world; one theory involves the formation of a "2-and-a-half-D image" (a fractal?) which contains most, but not all, of the depth information. Combining the two images requires deciding which point in the left image corresponds to which point in the right image. This "correspondence problem" is a major issue in vision research, because it is theoretically possible to match any point on the left image with any point on the right image; only one of these matchings will be correct, however. An example of an "incorrect" matching can be seen in "magic eye" stereograms. Interestingly, motion perception also involves a correspondence problem; there it requires matching across the dimension of time rather than space.
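To make the correspondence problem concrete, here is a toy sketch in Python. It is not a model of what the brain does; it simply slides one row of brightness values over another and scores each horizontal shift. The function names and the sample rows are invented for illustration. With an irregular texture only one shift fits, but with a repeating pattern several shifts fit equally well, which is the kind of ambiguity that allows a false match.

```python
def matching_costs(left, right, max_shift):
    """Mean absolute difference between left[i] and right[i - d] for each shift d.

    Assumes max_shift < len(left) so that every shift compares at least one pair.
    """
    costs = []
    for d in range(max_shift + 1):
        diffs = [abs(left[i] - right[i - d]) for i in range(d, len(left))]
        costs.append(sum(diffs) / len(diffs))
    return costs


def best_shifts(left, right, max_shift):
    """Return every shift that ties for the lowest matching cost."""
    costs = matching_costs(left, right, max_shift)
    lowest = min(costs)
    return [d for d, cost in enumerate(costs) if cost == lowest]


# Hypothetical row of brightness values with an irregular texture.
textured_right = [3, 7, 1, 9, 4, 6, 2, 8, 5, 0, 7, 3]
# The left eye's row: the same texture displaced three samples sideways.
textured_left = [0, 0, 0] + textured_right[:-3]

# Hypothetical row with a pattern that repeats every four samples.
periodic_right = [1, 5, 2, 8] * 4
# The left eye's row: the same pattern displaced one sample sideways.
periodic_left = periodic_right[-1:] + periodic_right[:-1]

print(best_shifts(textured_left, textured_right, 5))   # one candidate shift
print(best_shifts(periodic_left, periodic_right, 6))   # more than one candidate
```

In the repeating case the true shift of one sample and a false shift of five samples score identically, so something beyond local similarity is needed to choose between them.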

Perception of depth involves three primary types of cues: stereopsis, oculomotor cues, and pictorial cues. Stereopsis is what is traditionally thought of as "depth perception"; it involves an evaluation of correspondence. For objects farther from the eye, the images on the retinas will differ less (the difference between the angles at which the image strikes each eye is smaller) than for closer objects. Oculomotor cues function in individual eyes and use feedback from the eye muscles to determine depth; in particular, the tension of the ciliary muscles, which change focal depth by altering the curvature of the lens, and the angle at which the eye is pointing. A closer object will require greater accommodation of the lens and will require the eye to be angled further toward the central axis of the head. Any of these mechanisms (except perhaps evaluation of focal depth, which may be negated if the camera has the right focal depth), especially when combined with the pictorial cues which I will explain shortly, will indicate that the objects in the sneaky 2-D picture are all equidistant from, and very close to, the eye. Unfortunately, none of these mechanisms is effective past a distance of about ten feet, after which the only effective cues to depth are the monocular pictorial cues.
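A little trigonometry shows why the angle cues fade with distance. The sketch below assumes both eyes fixate a point straight ahead and assumes an eye separation of 63 mm (my own round figure, not a number from the chapter). The angle between the two lines of sight is then 2·atan(separation / (2·distance)). The function and constant names are made up for this example.

```python
import math

FEET_TO_METERS = 0.3048
EYE_SEPARATION_M = 0.063  # assumed typical value, for illustration only


def vergence_angle_deg(distance_m, separation_m=EYE_SEPARATION_M):
    """Angle between the two lines of sight when fixating a point straight ahead."""
    return math.degrees(2.0 * math.atan(separation_m / (2.0 * distance_m)))


def angle_difference_deg(near_m, far_m, separation_m=EYE_SEPARATION_M):
    """How much the vergence angle changes between a nearer and a farther point."""
    return (vergence_angle_deg(near_m, separation_m)
            - vergence_angle_deg(far_m, separation_m))


for feet in (1, 3, 10, 30, 100):
    distance = feet * FEET_TO_METERS
    print(f"{feet:>4} ft: {vergence_angle_deg(distance):7.3f} degrees")
```

The angle falls from roughly twelve degrees at one foot to a little over one degree at ten feet, and it keeps shrinking after that. The calculation does not by itself establish the ten-foot cutoff, which also depends on how finely the visual system can sense these angles, but it shows how little there is left to measure at a distance.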

Pictorial cues (so called because they may be manipulated in paintings) provide the majority of depth information, and the only information past ten feet, at which point the eyes are pointing straight ahead, are focused at infinity, and receive an effectively identical image. In fact, approximately ten percent of the population (including Lisa Eckstein) is "stereo-blind," and cannot use any binocular cues to depth. Pictorial cues include size (closer objects appear larger), height (objects on the ground appear higher in the visual field the farther away they are), occlusion (objects behind other objects appear farther away) and perspective (parallel lines appear to converge as they recede into the distance). All of these are preserved by the sneaky 2-D photo, so if it is placed far enough away, the observer will indeed be fooled, until she moves her head, using the information from motion parallax (farther objects move more slowly than closer objects) to detect the ruse.
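Motion parallax can be sketched with the same kind of trigonometry. If the observer moves her head sideways, an object that was straight ahead appears to shift by an angle of atan(head movement / distance). In a real room, near and far objects shift by different amounts; in a flat photo, everything sits at the distance of the print and shifts together. The distances below are hypothetical, chosen only to illustrate the point.

```python
import math


def angular_shift_deg(head_move_m, distance_m):
    """Apparent change in direction of an object straight ahead after a sideways head movement."""
    return math.degrees(math.atan(head_move_m / distance_m))


def compare_room_and_photo(head_move_m, object_distances_m, photo_distance_m):
    """Return {name: (shift in the real room, shift in the flat photo)} in degrees."""
    photo_shift = angular_shift_deg(head_move_m, photo_distance_m)
    return {
        name: (angular_shift_deg(head_move_m, distance), photo_shift)
        for name, distance in object_distances_m.items()
    }


# Hypothetical scene: a chair 2 m away, a window 6 m away, a photo mounted at 3.5 m.
shifts = compare_room_and_photo(0.10, {"chair": 2.0, "window": 6.0}, 3.5)
for name, (real, flat) in shifts.items():
    print(f"{name}: real room {real:.2f} degrees, photo {flat:.2f} degrees")
```

In the real room the chair shifts about three times as far as the window; in the photo both shift by the same amount, and that mismatch is what gives the trick away.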

How did I get so smart? Cog Sci 77. Try it.

Prof. Banchoff's Appreciative Response, now with dave's handy comments

At the risk of carrying this on too far . . . Still more Prof. B. comments.

-david stanke