Oh! Say Can You See?: A Perceptual Problem
Leo Segedin   |   March 14, 2007

In my paper last November, I argued that negative responses to paintings were often based, not on taste or aesthetic judgments, but rather on 'visual illiteracy', the lack of ability or skill in seeing what was represented in them. I claimed that this ability was determined to a great extent by presumptions about what there was to be seen, e.g. Renaissance or Impressionist beliefs about vision. I illustrated this idea with reference to Albert Wolff's late 19th century negative judgment of Renoir's paintings and my own experience in learning to see Cezanne's. This description was challenged. It was claimed that what I described was not a matter of seeing. Since all normal people, regardless of cultural experience, are assumed to have the same brain physiology, they must have seen the same thing. The issue, therefore, was not vision, but, rather, a matter of how such pictures were perceived. We are likely to refer to such perception as 'understanding' and 'interpretation'. Up to a point, of course, this is true. Wolff had access to the same visual information as anyone who 'understood' Impressionist paintings, but because he never learned to understand them, he misinterpreted them. This would be similar to the 18th century Chinese person, who, when looking at a Renaissance portrait, asked, "Are Europeans dirty on one side of their faces?" We can say that the Chinese viewer did not understand European concepts of chiaroscuro and therefore misinterpreted the picture.

But although words like 'interpreting' and 'understanding' are necessary to describe what happens when we look at paintings, they are also misleading. While such words are often used to mean the same thing as 'seeing', they also have an intellectual sense. Certainly this is implicit in the idea that 'literacy' is a necessary aspect of art as a symbol system and that painting is a kind of visual knowledge. (Jerome Bruner refers to art as a "mode of knowing".) This usage, however, should not distract from the fact that the experience I am trying to describe is essentially non-verbal. We can no more verbalize the experience of perceiving a painting than we can the experience of hearing music. We can say that in order to see a painting, we must first understand it. For example, I could understand and interpret Cezanne's paintings - lecture about them - before I could, in this sense, 'see' them. Intellectual understanding, however, does not always result in perception and it is possible to 'see' something without understanding it. However we use language to describe the experience of perception, there still remains a fundamental difference between vision and knowledge. I will argue in this paper that although people are born with the same visual potentiality, they do not necessarily develop the same visual skills and that the physical and cultural environments in which such skills develop are factors in determining how and what they see. Whatever their understanding, the Chinese person literally saw dirty faces in the same way that Wolff saw putrescent flesh in these representations.


Many of us assume that perception of the external world consists entirely of an image of that world projected onto our retinas. From this image, we are supposed to be able to identify objects and locate them in space. We assume that such images are universal and, therefore, we are likely to believe that it is possible to see and describe the world objectively. A scientist studying nature is supposed to have this neutral, realistic vision and, with hand-eye coordination, can copy the image he or she sees in a microscope or telescope. Anyone with this acquired skill should be able to represent what they see. We should also be able to recognize representations of these images on a 2D surface. The 5th century B.C. Greek artist, Zeuxis, was famous for painting grapes so realistic that he fooled birds. We are likely to believe that a blind man learning to see later in life, as well as infants, animals and even insects, have this innocent vision and can see their environment as it really is without interpretation or cultural bias.

None of this is true and research does not support it. Unfortunately for the Zeuxis legend, there is no such thing as an innocent eye. In fact, birds as well as insects, frogs and fish respond only to the most essential characteristics of what they are looking at. The outline of a cow is sufficient to lure tsetse flies into a trap. Two dark round shapes of different size and the silhouette of the head and body of a bird will cause baby birds to open their mouths. High flying eagles respond to the movement of a rabbit, not the rabbit itself. Frogs will snap their tongues at anything that moves; they can even make mistakes determining how far away objects are, sometimes confusing a predator with prey. Rats have trouble distinguishing between squares and circles. A chimpanzee cannot distinguish a triangle made up of small circles.

Even among humans, infants 2-6 months old will respond with a smile to any face-like form. There is even physiological evidence that there is a center in the brain devoted entirely to responses to smiles. It is likely that infants will also respond to other expressions. There is apparently also a built-in tendency to see images where there is only the slightest resemblance. Human faces will be seen in abstract patterns in wallpaper as well as in a close arrangement of 3 circles. Two horizontally arranged circular forms, such as knotholes or patterns on butterfly wings and peacock tail feathers, will be seen as eyes. We tend to see animals in clouds, trees, moss patterns, scribbles, smudges, etc. The evolutionary psychologist Steven Pinker believes that there is a module in the brain especially devoted to perspective, to depth cues such as texture gradients, overlapping and converging lines. In his view, such abilities are the same for 2D and 3D experiences.

To whatever degree perceptual skills may be hard-wired in the brain, they develop through adaptation and learning. In fact, without experience, no functional hard-wiring can occur. For example, when rats first look at simple, geometric forms, they see only the closest part. Their recognition of the whole form is learned slowly and depends originally on multiple visual fixations. A blind person who has learned to see late in life may be able to see a figure against a ground, but will have great difficulty telling whether it is a square or a triangle and cannot name them. This recognition is completely destroyed if the object is slightly altered. He has trouble identifying colors. He has no sense of 3D space and sees objects far away as small objects up close. (In frustration he often reverts to a tactile and kinesthetic world.) Since people living in dense forests, when taken out of the forest, also see distant objects as small rather than far away, we can assume that, although we are born with the innate capacity to discern distance, we must experience it at the appropriate time for the ability to develop.

Perceptual abilities thus do not require a picture of the world; all are based on what capacities the viewer brings to the perception. Perception is not a mere copy of appearances, but a structure of selected, pertinent data determined by the physiology of the eye and brain. It is an awareness of certain privileged and relevant aspects of what is there. It is a process of detecting, of picking out what is being looked for. And the more relevant to what we are looking for, the less resemblance there has to be. (Consider being hungry or horny).


Vision is a process that produces from images of the external world a description that is useful to the viewer and not cluttered with irrelevant information.

- Neuroscientist, David Marr (1982)

Perception is not determined simply by stimulus patterns; rather it is a dynamic searching for the best interpretation of the available data.

- Psychologist, Richard Gregory (1966)

How do people see? Light is projected onto the retinas of our eyes. Our eyes explore this light, trying to pick up useful information. Although there are images projected on the retina, they are incidental.

"Such images are not like pictures. They are not something looked at by an observer. It is a distribution of energy on a sensory mosaic, not a replica, or a copy, or a model or a record of an external form. It is a continuous 'input' as the computer theorists say. It starts impulses in the optic nerve. A retinal image is no more like a picture than an auditory stimulus is like a phonograph record in the ear".

- Psychologist, J. J. Gibson (1950)

Not only is there no picture of the external, physical world projected on the retina; we don't even see the stimulations of the eye. By the time we see - before it becomes information - the stimulations have been processed. The retina consists of 130 to 180 million receptors, but the optic nerve has only 1.2 million fibers. Obviously, some selection has occurred before the stimulation reaches the brain. The patterns of light rays stimulating the retina are continuously changing. The eyes are constantly moving, shifting several times a second and refocusing, and the head is always moving. The same light stimulates different parts of the retina. The stimulations are projected upside down. Yet the outside world appears stable because, first, the retina selects constant features from all this constantly changing stimulation by detecting sudden contrast of light intensity. It establishes perception of vertical edges, horizontal edges, corners, gradations of tones, etc., but not objects in 3D space. Axons from the retina go to the visual thalamus, then to the middle layer of neocortex and then to the 5 layers of the cortex itself. Built into the cortex are structures specifically designed for certain perceptual skills and these develop, not only according to their inner dynamics, but also to the needs and experiences of the individual.

Not all information is visual. Tactile and kinesthetic senses provide information and, in fact, are essential to the perception of space. The brain computes this information together with what the eyes supply to create our sense of 3D reality. This reality thus involves a lot more than the appearance of what is in front of us from a single point of view. For this reason, there is a difference between representations of the appearance of objects in space and recognitions of objects, as well as size, shape, location, movement, etc. These are very different kinds of information and often contradict each other when we try to represent such experiences on a 2D surface. Since perception is determined by all these variables, it obviously cannot be the same for everyone.

Neurobiologist, Semir Zeki (1999) writes:
The brain is no mere passive chronicler of the external physical reality but an active participant in generating the visual image according to its own rules and programs.

Cognitive scientist, Donald Hoffman (1998) writes:
…the rules of universal vision allow a child to acquire specific rules for constructing visual scenes. These specific rules are at work when the child, having learned to see, looks upon and understands visual scenes.


"Being immersed in a single culture sometimes blinds us to the fact that apparently basic frameworks are in fact cultural artifacts. Two dimensional representations of three dimensional life as in drawings, photographs and motion pictures are examples of what we in our culture have mastered early in life and have grown accustomed to, so that we do not have to go through the inferential process of reconstructing the real world from the photograph. It can be otherwise with people not accustomed to such two dimensional representations…"

- Anthropologist, Psychiatrist, D. Price-Williams

Photographs and similar representations are seen as being realistic, but even though a 2D picture can contain shapes and marks that refer to the external world, there is little similarity between them. Pictures are flat, have rectangular edges, borders, surface textures, are usually smaller than what they represent, can be black and white, are partial, freeze motion, etc. They can be carried, turned, seen from different distances, in different light. Since it is neither possible nor necessary to duplicate all the light from the world focused on the retina on a 2D surface, all potential data in pictures are selective and structured. For example, in research among non-Western peoples, some photographs used consist of frontal views of faces; others of three-quarter views or profiles. Some figures are represented as static and frontal, others in motion, some in color or black and white. Some are line drawings with or without color, tone or texture. Some are abstract, 2D or 3D images. Therefore the assumption that the primary response to pictures is to whether they are 2 or 3 dimensional is entirely inadequate to determine perceptual abilities.

We are oblivious to how important the differences between these "realistic" pictures are to people unfamiliar with them. Each of these different kinds of pictures can cause different responses in perceivers. The psychologist, Paul Ekman, working with photographs, showed that Papua New Guinean highlanders could recognize facial expressions in frontal photographs of faces of Berkeley students. Pinker believes that this indicated that they could recognize 3D representations but, in fact, the ability to distinguish facial expressions may be distinct from 3D perception, limited to frontal views and, as we have suggested before, hard-wired in the human brain.

A. K. Forge (1970) found that the Abelam, in Papua New Guinea:
…when shown photographs of themselves in action, or of any pose other than face or full figure looking directly at the camera … cease to be able to 'see' the photograph at all. …turning the photograph in all directions … could rarely identify individuals and had a tendency to regard any brightly colored photograph as a tambaran display

Mrs. Donald Fraser, in the 1920s (1932), offers this description of an African woman looking at a picture:
She discovered in turn the nose, the mouth, the eye, but where was the other eye? I tried by turning my profile, to explain why she could only see one eye but she hopped around to my other side to point out that I possessed another eye which the other lacked.

So far, all this research has to do with perception of photographs of human faces and figures. But even when such 3D forms are recognized, the perception of depth on a 2D surface may not be. Some people are apparently even immune to this illusion altogether.

When Floyd and Lyn Ausburn (1982) gave adolescents in Papua New Guinea:
a drawing of a 3D geometric figure, some clay and matchsticks, and asked them to make a model to match what they saw in the drawing. Most of them produced a two-dimensional model rather than a three-dimensional one, suggesting that they did not see the three-dimensions in the drawing.

Westerners often don't recognize this lack of skill when they try to educate non-Westerners using photographs. Sherman W. Selden of Columbia University (1971), working in "East Africa" (Kenya? Uganda?), wrote that:
I first became aware of this cultural bias when I watched a cadet teacher displaying to his class a professionally taken photograph of the Greek Parthenon. He explained that these ruins were lying on the ground where they had once been standing. After class when I pointed out to him that the ruins were standing upright, he asked me, "Then why does the photograph show them lying on the ground?" Obviously he could not correctly visualize the scene from the colored picture of it.*

When photographs from Look magazine with various kinds of depth represented were shown to 8- to 17-year-old Ugandan, British and American students, it was found that:

…in general Ugandans perceive spatial relationships quite different from the child trained in England and America. The African had difficulty in visualizing the relative position of objects in the photographs. He was not competent in translating the two-dimensional world of the camera's eye into the three-dimensional world of reality.

William Hudson (1960) compared literate and illiterate groups in South Africa in the perception of three dimensional cues such as superimposition (one object partially obscuring another object), perspective and object size in drawings and photographs. The literate sample perceived depth in the representations far more frequently than the illiterate group who had been exposed to pictorial matter, and not one illiterate perceived a photograph as three dimensional.

…formal schooling and immersion in a culture containing pictures, books and magazines are factors in determining the tendencies to use depth cues in viewing 2d representation of 3d objects.

- Segall, et al.

It would seem that different kinds of images as well as separate features stimulate different parts of the brain. Some perceptual responses are apparently more hard-wired than others, but all grow, in differing degrees, out of environmental and cultural experiences. Recognition of static, frontal views is easier than other views, probably because such pictures are based on recognition of pertinent information rather than the need to see an illusion of 3D on a flat surface. Although there may be structures in the brain devoted specifically to the perception of depth, the ability to see this illusion requires learning.


Sense data are taken, not merely given: we learn to perceive. "Why can't you draw what you see?" is the immemorial cry of the teacher to the student looking down the microscope for the first time at some quite unfamiliar preparation he is called upon to draw. The teacher has forgotten, and the student himself will soon forget, that what he sees conveys no information until he knows beforehand the kind of thing he is expected to see.

- Zoologist, Sir Peter Medawar (1996)

If we do not know what we are looking at, we cannot tell what visual data are relevant. We, therefore, approach unfamiliar objects and pictures with a hypothesis as to what they are, and try to fit the data into this expectation.

Robert Laws, a Scottish missionary active in Nyasaland (Malawi) (1901), wrote:
Take a picture in black and white, and the native cannot see it. You may tell the natives: 'This is a picture of an ox and a dog'; and the people will look at it and look at you and that look says that they consider you a liar. Perhaps you say again: 'This is a picture of an ox and a dog'. Well perhaps they will tell you what they think this time! If there are boys about, you say: 'This really is a picture of an ox and a dog. Look at the horn of the ox, and there is his tail!' And the boy will say: 'Oh! Yes and there is the dog's nose and eyes and ears…'

This is from a report of a study of members of a lowland Ethiopian tribe, the Me'ens, looking at line drawings of a buck and leopard:
Experimenter: Points at picture: 'What do you see?'
Subject: I'm looking closely. That is a tail. This is a foot. That is a leg joint. Those are horns.
E: What is the whole thing?
S: Wait. Slowly, I am still looking. Let me look and I will tell you. In my country this is a waterbuck.


E: Points to picture: 'What do you see?'
S: What is this? It has horns, leg…front and back, tail, eyes. Is it a goat? A sheep? Is it a goat?

In all these situations, subjects, with effort, ultimately recognized the parts that made up the whole picture. The process of learning to see both the external world and pictures thus involves searching for information by studying selected parts, details of what is presented and trying to fit them into what we know. Only familiar, relevant data are included in the perception. Non-Westerners examined the pictures; various hypotheses were tested; various elements were recognized and linked in an attempt to arrive at a comprehensive whole. This approach is also evident in the way engineers, medical students, and anyone else studying blueprints, x-rays, etc. learn to see. The art historian, E. H. Gombrich, applies the same approach to the perception of paintings when he describes learning to see a painting as "a piecemeal affair that starts with random shots and these are followed by the search for a coherent whole". This 'making and matching' way of developing 'visual literacy' thus appears to illustrate an aspect of a universal perceptual process.

Sometimes erroneous perceptual assumptions are made. Wolff tried to fit the 'broken color' of Renoir's nudes into inappropriate Renaissance notions of colored forms. Even scientists have used mistaken hypotheses as to what they are looking at, resulting in an incorrect representation. For example, Leonardo da Vinci, for all of his skill and the accuracy of his drawings of the human heart, still did not deviate significantly from Galen's erroneous, 1,400 year old account of it. He included descriptions of little channels required by Galen's assumptions about the heart's function, but which were not there. In 1610, Galileo, looking at a grainy image of the planet, Saturn, through a primitive telescope, could not tell that it was a sphere with a ring around it. Because he did not know that the planet had a ring, he saw it as three bodies. When years later, even with a superior telescope, the Dutch scientist, Christiaan Huygens, first looked at Saturn, he still could not see the rings because he did not have an appropriate hypothesis. Later, when he did, he could see the rings.

A visually 'illiterate' person has the same problem when trying to draw even a familiar form. An example can be found in some demonstrations I did on the stage in NEIU's auditorium in front of large Art in Society classes. In one demonstration, I had two visually illiterate students come on the stage - one sat in a chair in front of the other student who was standing at an easel with a pad of drawing paper on it. The student at the easel was told to look at the sitting student and draw what he or she saw. For this exercise, all the skill the student needed was the ability to draw an oval for the head, two vertical, parallel lines for the neck and two horizontal, parallel lines for the shoulders. 100% of the time, the student drew the neck lines under the head oval and the shoulder lines under that. However, from that position, the neck was not visible and the shoulder lines would touch the oval of the head near its center.

Any teacher of life drawing will tell you that the major problem facing the student is not hand-eye coordination, but learning to see the model. In fact, all initial exercises have that purpose. For example, most classes begin with what are called 'gesture' or 'action' drawings, one-minute sketches which focus on the movement of the model's body. No time is allowed for the student to notice contours. Another kind of drawing, called 'contour' drawing, has the opposite purpose: concentration on the contours of the body while ignoring any other aspect. Different exercises emphasize other features of which the student would be oblivious without this visual education.

Perception thus also involves intellectual learning and cognition, what we know about what we are looking at. As the philosopher, Nelson Goodman (1972) wrote:
Perception depends heavily on conceptual schemata. "There is no innocent eye." The raw material of vision cannot be extracted from the finished product. Our schemata (schema: a pattern imposed on complex reality or experience to mediate perception) may change and evolve, be revised or replaced, be suggested or informed, by factors of all kinds, but without some schema there is no perception.

Since these general rules of perception apply to how we see paintings, any aesthetic and emotional responses are either derived from such perceptions or are secondary to them. If we have no schema or an inappropriate one for the perception of a painting, we cannot 'see' it and if we can't see it, our aesthetic responses, whether positive or not, are purely personal.

*Later a colleague came to me with a photograph of an orchestra conductor, arms outstretched, taken from a very low angle in front of the podium. He had no idea of what the man in the picture was doing. "Is this a picture of Christ being crucified?" he asked. Yet this picture had been published by the United States Information Agency in a book designed to inform the people of other countries that Americans were concerned with the arts.

© 2007 Leopold Segedin. All rights reserved.