Stere-oopsis (Part 1)

Not a typo. Stereopsis is the technical term for the perception of depth using binocular disparity, or in other words by each eye (apparently) seeing a slightly different version of the same scene. Because each eye has a different 'line of sight', parallax occurs between the images seen along those separate lines of sight. Which again just means the two images are different, which you can easily test by covering or closing one eye and then the other, alternately, and noticing how the scene in front of you changes slightly.
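For anyone who wants to put a rough number on that difference, here is a minimal sketch of the geometry only. It computes the angle between the two lines of sight to a point straight ahead, and says nothing about how depth is actually perceived. The function name and the 0.065 m eye separation are my own assumptions for illustration (a commonly quoted adult figure), not measurements from anywhere in this post.

```python
import math


def parallax_angle_deg(eye_separation_m: float, distance_m: float) -> float:
    """Angle in degrees between the two eyes' lines of sight to a point
    straight ahead, midway between the eyes, at the given distance."""
    half_separation = eye_separation_m / 2
    return math.degrees(2 * math.atan(half_separation / distance_m))


# Assumed typical adult eye separation; a hypothetical value for illustration.
EYE_SEPARATION_M = 0.065

for distance_m in (0.5, 2.0, 10.0, 100.0):
    angle = parallax_angle_deg(EYE_SEPARATION_M, distance_m)
    print(f"{distance_m:6.1f} m -> {angle:.3f} degrees")
```

The angle shrinks quickly as the point moves away, which fits the closing-one-eye test: the shift is obvious for things near your face and hard to notice for things across the street.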

I said not a typo because this theory is probably wrong. So more "oooops" than ops. But it's far and away the most common theory of how we see in depth, which I've talked about and questioned a few times, see here. (Note that links will show below this post, you'll have to scroll down.)

There's no doubt you will see profound depth using stereopsis techniques, such as the ones most commonly used in 3D pictures and films. There's no question it 'works', but there's also no question kids fervently believe in Santa, and that the toys arrive, just as they expect them to. Stereoscopic techniques don't prove stereopsis theories; those theories were the eggs, arriving firmly after the chicken of stereoscopic images was already on the scene.

There would be many ways to explore the problems with these theories. One key reason to be suspicious is that there's actually an enormous black box right at the heart of stereopsis ideas, without which they're meaningless. That black box is the brain. The standard theories of 3D or depth perception make the brain do all of the work of turning the two apparently separate images into one 3D image. There's rarely any explanation of how the brain apparently does this, and thus it's a black box, in engineering terms - something that just works without any explanation of what's going on, in the box as it were. The space or depth we all experience every day is 'made up' by the brain, we don't perceive or live it directly.

Does that seem reasonable to you? That as you look around, wherever you are right now, the depth you perceive in the scene is actually just an artifact of your brain? As I've mentioned in the earlier posts linked to above, I think, close one eye and I'm sure most of you will notice depth is actually still there. The almost obsession with binocular parallax in standard talk about and work in 3D often ignores the (much greater number of) monocular cues for depth completely, for starters. (But that's only a relatively minor omission, I hope to show.) So what's with this idea about depth being about combining two different images and processing them in your head, which kindly then adds depth to them, while combining them into a single image? Why would we have evolved in this world of depth only to then need this sophisticated brain to come along before we could notice any of this depth?

Not beating around the bush, the rot started with the idea that our eyes take in the world around us, flatten it into an image, an image then transferred to our retinas, and which the retina then passes on to the brain for processing, to add the depth and other bits. The camera obscura (see earlier posts here) and other optical devices, up to and including the cameras we all use today, have misled us into thinking really bizarre things about images in general. We incredibly have no problem with the idea that an image is projected onto our retina, and then somehow our brain 'looks' at this image, as if there were another little us up in our heads to do the looking. This leads to an even more absurd infinite regress, because the little extra us in our heads looking at this image on the retina itself needs an eye, which takes in an image, which some other little self in its brain then looks at, etc. etc. To really understand 3D (itself a revealing term, as we'll see), you need to start with understanding why this idea of an image is so misleading.

It would be easy to blame cameras and optics for this flattened image in the way many artists do, using a standard critique of photography that it gives a depth-less 'snapshot' of reality, compared to the many more layers of mediation and meaning (supposedly) woven into a painting, for example. The problem with this critique is that it has always mistaken the beautiful, almost magical endpoint of the photographic process for the process itself. The clean, apparently 'real' photographic image is the outcome of many, many more layers of process and mediation than any painting, but all of that process gets black-boxed, like the brain in stereopsis. Before digital photography came onto the scene, you had firstly all of the detailed, complicated task of building a camera that worked, with mirrors and lenses and shutters and apertures, incredibly intricate mechanisms combining centuries of skill and knowledge. Then all of the knowledge, skill and complexity in making a film that will capture your light at the end of its journey through your camera, and then the complicated chemistry and technique of developing the film, printing your images on photo-receptive paper (itself a sophisticated technology developed over many years), etc. When you put all of those layers back in the frame, it's quite bizarre really that anybody ever thought a photographic image was this pure, flattened snapshot that it's so often accused of being. With digital photography many more layers again of complexity get added, with sensor technology, digital image manipulation, etc. etc. How simple and unmediated a blob of paint seems when you add the back story to photography!

The semi-magical output of photography has seduced us into believing that images are some sort of ghostly, depth-less apparitions floating around in the world which our magnificent brains then add all meaning and depth to, for our benefit. Part of an old hatchet job undertaken by people like Locke and Descartes, and pursued with vigour today by most neuroscientists and cognitive scientists, psychologists, philosophers, and people who want to understand vision, to name only a few. A separating of primary qualities from secondary qualities, where primary qualities are (apparently) things that exist outside our minds, in objects themselves, such as how solid something is, its extension in space, its number etc. And secondary qualities then being a whole 'subjective' layer added to objects and their primary qualities when we perceive them, such as colour, taste, smell, and of course in the traditional theory of stereopsis, depth or "3D". Whitehead used to name this splitting of the world into primary and secondary bits "the bifurcation of nature", and it's quite surprising that more haven't firstly noticed what a hatchet job this is - splitting all of reality in half! - and secondly what sort of loopy mysticism it leads you into, as that lump of electrified meat in your skull takes on the ability to create so much of what we experience as reality. (And your brain really is just a lump of electrified meat, albeit a complex piece of meat - I defy anybody to cut a brain open and show us an 'image' stored there - again what has become the standard idea about images is rank mysticism).

We need to pay respect to images by making them things again. Rather than these ghostly, flat, meaningless-in-themselves apparitions supposedly floating around in space, separate to the objects they're (again, supposedly) images of and waiting for our brains to add depth and colour and meaning to. Our marvellous brains! There is so much unmitigated, fawning tripe written about our brains today, so much of what's beautiful and astonishing about the world and our existence is shoved into that black box, and more all the time. Not only did we have the monumental vanity to split reality in half, to bifurcate it, but then as we stare across this (fictional) abyss of our own making between the world and our experience, we multiply the vanity a thousandfold by then trying to paper over the split with our own brains. As if reality had no colour or texture or meaning until we evolved, together with our apparently marvellous brains, to give it those qualities. Astounding arrogance. So many of the gee whiz stories you see from much of modern neuroscience - despite the best efforts of the embodied cognition and extended mind schools - extolling the marvels of our brains, are deeply offensive when seen from the perspective of this artificial split in reality that we made for ourselves. So many of the supposedly amazing characteristics of our brains and how they affect our experience are really just the genuinely amazing world we live in trying to sneak back into the picture, and we in our colossal, confused arrogance attribute it all to ourselves, to our brains. (Confused because it's hard work pretending to produce reality every day, up in our heads - no wonder depression is now so widespread.)

The brain is a lump of electrified meat! The most rabid chest-beating empiricist goes weak at the knees before this lump of meat, attributing it with the power to clothe reality itself with colour, depth and meaning. Again, cut a brain open and find an image in there. If you want to be an empiricist, a table-thumping realist, then be one! Recognise the brain for what it empirically is, a lump of electrified meat, and leave all the black-boxed mystical rubbish about it being a sort of cinema in our heads at home. (Steven Poole's wonderful piece here, and also the fascinating work of embodied cognition theorists like Andy Clark, and the "spread mind" of Riccardo Manzotti, give a taste of where the real cutting-edge science is heading here.)

To flesh this out more specifically, it's difficult to give a better general overview of the issue here than this one from James Gibson's seminal and still largely unappreciated work on vision, The Ecological Approach to Visual Perception:

"The very notion of an image as a flattened-out object, a sort of pancake of a solid body, is shown to be misleading. It begins to appear that most of what has been written about pictures and images over the centuries is misleading, or hopelessly vague. We should forget it all and start fresh. The information for the perception of an object is not its image. The information in light to specify something does not have to resemble it, or copy it, or be a simulacrum or even an exact projection. Nothing is copied in the light to the eye of an observer, not the shape of a thing, not the surface of it, not its substance, not its colour, and certainly not its motion." (p. 304).

Gibson was no armchair psychologist. In World War 2 he worked with the American military, training pilots, and realised as part of that work that the traditional theories of depth perception were of no use in that sort of real life situation. Gibson realised almost straight away that vision was not this abstract process of images floating around in the air and happening upon retinas and brains; it was the outcome of the interaction of the human body with its environment (thus an 'ecological' theory, though not in any sort of green way, but in the scientific sense of taking into account the environment of an organism). Riccardo Manzotti has explored this from another direction in recent times, see for example here.

Reading Gibson you see all of the idealisms of traditional theories of stereopsis fall away. Gibson was a brilliant experimentalist, a committed, world-leading, practical scientist. Reading him describe the difficulties researchers traditionally face in trying to work with the traditional theories, mostly to do with having them almost strap subjects to tables with various apparatus to keep the head still enough to try to capture images falling on retinas, you know straight away that something is wrong with the traditional theory. The human body didn't evolve strapped to tables with heads held still by elaborate devices. One of the first things Gibson notes is that we don't just sit absolutely still and see, we move. Our vision evolved as part of our being in and moving in the world, and this movement is central to that evolution and our survival. Gibson also talks extensively about depth perception (an expression he detested), and about the 'cues' (binocular and monocular) traditionally thought to deliver us the perception of depth, and rejects the whole thing - the idea that depth is perceived from cues was to him hopelessly abstract, and just plain wrong when you tested it in real life situations. (The idea of cues is synonymous with the idea that the mind/brain is separate from the world, that bifurcation or split again, and the mind/brain then interprets the cues to produce reality as we experience it.)

This isn't an essay on the work of Gibson, but he no doubt saw where the traditional theories of 3D are limited. Incidentally I said above that the term "3D" itself has issues, and in a way it perfectly encapsulates everything wrong with the traditional theory. It's a term borrowed from the standard Cartesian geometry we all learn at school, where space is measured using x, y and z axes, the z axis being one of depth: x, y and z are the 3 "D's" (dimensions). You definitely can measure out distances in this way, but it's another thing altogether to then assume that's how we, as human bodies, evolved to experience what we think of as depth. Gibson's experiments essentially led him to conclude that all perception was about the continuous flux of varying and non-varying elements in our experience of our bodies as part of a wider world, from which what we usually think of as depth derives. For example travelling in a vehicle and looking out the front will produce an experience of the world flowing towards you, and expanding and foreshortening in various ways. Ways that are different if you look out the side or the back of the vehicle. In each case it's that changing flow which gives us a direct perception of the relationship of our body to our surroundings. No brain or mind processing data or adding anything, a direct immersion in our surroundings, with our perception being the expression of that immersion in every moment, and our percepts (colours, depth, sound etc.) extended beyond the boundaries of our bodies out into the world around us. Not as out-there as it sounds: if you see a red rose, why is that red in your head, and not 'out there' in the rose too?

So why does a standard stereoscopic image work? Two images with parallax, a slightly different point of view, combined into one view with a stereoscope or via 'free viewing'? As Charles Wheatstone observed at the birth of modern stereoscopy in the 1830s, and as noted above, using each eye separately does give you that different view on a scene. Wheatstone, like so many after him, observed this difference but then jumped straight to a conclusion that had no obvious link with it:

"It being thus established that the mind perceives an object of three dimensions by means of the two dissimilar pictures projected by it on the two retinæ..." (Contributions to the Physiology of Vision.—Part the First. On some remarkable, and hitherto unobserved, Phenomena of Binocular Vision, "Philosophical Transactions" of the Royal Society of London, Vol. 128, pp. 371 - 394.)

It's one thing to notice that each eye alone has a different point of view on your surroundings, but what justifies saying that this difference is what produces depth or space in our experience? Wheatstone rightly notices these eye-based parallax phenomena, perhaps for the first time in Western science ("...they seem to have escaped the attention of every philosopher and artist who has treated of the subjects of vision and perspective.."), with the exception he notes of Leonardo Da Vinci, but he begins by assuming what he's trying to prove - that depth was produced in his brain, and so these images from each eye must be what created the depth when they arrived there. And his stereo experimental equipment confirms this view for him, because it produces images that pop out at him in that 3D way we now all know.

Is there a simpler explanation, one more consistent with a body properly immersed in its surroundings, rather than looking out at them from the other side of a fictional divide between what's in their heads and the world around them? I think there is, one that restores depth to the world itself. I'll have a shot at it in Part 2.

