Culture, Perception, and Images: Do You See What I See?

Is it possible to make pictures of what we see independent of culture? Most of us take for granted that seeing is an automatic, physiological process and that people in all cultures, regardless of their intellectual capacities, see the world the same way we do. We also believe that there are objective ways, like photography and drawings by trained artists, of representing what we see. Then why, if we ignore personal, expressive variations, do photographs of the same subject by different photographers look so different? If we all see the same things the same way, why do artists’ representations of what they see look so different? Why can certain autistic ‘savant’ children draw what they see without training? Also, if seeing is automatic, why does visual representation have a history? And how did prehistoric man first learn to draw what he saw? Finally, when we represent what we see, what are we supposed to see and what should an objective representation of it look like?


Most people consider a photograph an objective representation of what we see, produced by technology that is independent of culture. A photograph is supposed to be an impersonal, mechanical copy of what is in front of the camera. Underlying this belief is the idea that the image of what is in front of the camera is projected on the retina of the eye in the same way that it is projected on the film in the back of the camera. According to this view, this image is, therefore, an accurate copy of reality. So strong is this belief that a photograph is frequently characterized as reproducing or even being a surrogate for what we see. For example, according to the philosopher Roger Scruton, “…from studying a photograph, (an observer) may come to know how something looked in the same way he might know it if he had actually seen it.” The famous French film critic André Bazin goes even further. He writes, “The photographic image is the object itself, the object freed from the conditions of time and space that govern it.…”

This notion of photography as realistic representation is based on two assumptions about how a camera works. One assumption is that the camera functions like an eye; the second is that the camera shows what is in front of it. In other words, a photograph is said to show what we see or what is there, as if these two assumptions were the same. But they are not the same, and neither premise is true. Actually, there is little resemblance between the eye and the camera, or between the photograph and what it represents. As the photographer Joel Snyder says, it would be more accurate to say that “the camera shows us what we would have seen at a certain moment of time, from a certain vantage point, if we kept our head immobile and closed one eye and if we saw with the equivalent of a 150-mm or a 24-mm lens and if we saw things in Agfacolor or in Tri-X developed in D-76 and printed on Kodabromide #3 paper.” Rather than showing us what we would see, he says, the camera shows what we would see if our eyes worked like a camera.

A photograph is not what the eye sees. A photograph is two dimensional; what is projected on the retina is not. In fact, nothing like an image is formed which can then be inspected and interpreted by the brain or mind. In the living, active eye, the so-called ‘image’ on the retina “is kept in constant involuntary motion; it drifts away from the fovea, ‘flicked’ back, while the drifting movement itself vibrates at up to 150 cycles per second.” Our eyes jump from one point to another, near and far, refocusing each time. We perceive space through head and body movements and binocular vision, along with anything projected on the retina. Different receptors in the retina select and respond to edges, corners and gradations of light. Vision is selective. We see what we attend to, and we learn what to attend to. “We do not see all of our surroundings, but construct a viewer-centered three dimensional view of our environment.”

In contrast, a photograph is a piece of paper on which there are patterns of light, dark and color. These patterns are the result of chemical reactions to projected light rays. We call both the paper and such patterns images. If we think of paper images as physical artifacts, then small, rectangular, two dimensional images describing three dimensional space can have no direct physical resemblance to what they represent.

In ordinary perception, there is no single eye, no single point of view, no perspective, no vanishing point and no picture plane, and so any patterns on the surface of the paper which indicate such entities are synthetic constructions. Since these constructions give a form to the chaos of visual stimulation, we see them as if they represent what we see in the real world. Some psychologists claim that, under very contrived circumstances, these patterns can come close to matching the light stimulation on the retina of the eye. But, in the same way that words are basic to a novel, these patterns are really more characteristic of the image than of our perception. This, of course, is not to suggest that a photograph is an arbitrary convention or that it does not in some way depict what it represents; it is rather to point out that a photograph looks like a photograph. It is not a duplicate of what it represents.

Thus, photography, like any other method of representation, is a cultural product and, as such, has a history. Light projections were not originally seen as pictures. Projections from ‘pinhole cameras’ have been used since ancient times to observe solar eclipses. Arab and Latin natural philosophers studied such projections to develop a theory of vision which was generally accepted during the Medieval period, but there is no evidence that such projections were considered for use in the production of pictures until well into the 16th century, 100 years after the development of perspective theory. Before then, there was no concept of a pictorial image of visual phenomena. In other words, the idea that a picture could represent the appearance of things did not exist until the Renaissance.

In fact, although there were representations of objects, there were no representations of three dimensional space before the development of pictorial images. This concept of visual reality was established before the invention of the photograph. When the photograph was invented in 1839, the problem for painters was not how to make a picture look like a photograph, but rather how to make the photograph look like their pictures. In fact, the first fixed photographs were not seen as an advancement in realistic representation, but rather as convenient, representational models to help professional and amateur artists make traditional drawings. They were cheaper than live models. Delacroix, Gauguin, Manet, Degas, even Picasso, used photographs as sources of information, not to make their paintings more realistic. Only later was photography seen by historians and philosophers as an indication of progress in the representation of nature in art history.


If the pre-Renaissance artist did not see appearances, what did he see? What did he represent before he learned how to represent appearances? In fact, what do we normally see when we are not looking at representations? What is non-pictorial vision? Rather obviously and necessarily, such vision focuses on what is necessary to function in everyday life. Since we live in a three dimensional world, ordinary perception operates in three dimensions. Although we ‘see’ objects in that world, we (like chimpanzees and other uncultured but practical animals) do not pay attention to or represent their functionally irrelevant visual aspects. We do not see one point perspective, light and shade or foreshortening; nor do we see line, form or color. Such representations of visual aspects are two dimensional, cultural constructions. Rather, we see solid, three dimensional objects - buildings, furniture and people. Ordinary vision involves perceiving the ‘objectness’ of objects, their stable, physical, tactile qualities, as we move among them. Practically speaking, this requires our eyes to detect the permanent and invariant properties of objects, as recognized and formed from many angles.

If the artist represents objects as seen within this non-pictorial vision, then, instead of focusing on appearances, a ‘realistic’ image would emphasize an object’s distinguishing, physical characteristics. In a sense, it is the difference between representing what we see and what we know. This can be seen in images made by children and ancient Egyptians. People can be represented by describing what makes them most identifiable – for example, profiles of noses and feet, front views of eyes and shoulders, as in Egyptian wall painting. This is quite different from describing their appearance – say, that of a foreshortened arm from a particular point of view, as in Renaissance painting. There are many other such identifying characteristics: two adjacent dots, for example, can represent eyes or nostrils, depending on their location in the image. Objects of the same size will be represented at the same size regardless of how distant they are from the observer because they are the same size, even though they appear smaller. Most likely, a difference in size would indicate the importance of the object, rather than its location in space.

While it is impossible and impractical to separate the appearance of objects from our concept of them, most images in the history of civilizations emphasize their physical characteristics. Culture, social and practical functions, and media determine to what extent either is emphasized. As a result, the word ‘seeing’ refers confusingly to both the appearances and the ‘objectness’ of what we perceive. Most of the time, in this essay, it refers to appearances.

We are likely to think that, when artists represent what they see, they are seeing this reality directly, but they really are seeing three dimensional, visual reality as a two dimensional picture. In other words, in order to represent pictorially, an artist must see pictorially. And since there have been several different modes of pictorial representation through Western history, there must also be different kinds of pictorial vision. Therefore, perception itself cannot be an unchanging, physiological process. Human vision must also vary historically and culturally. In fact, people did not see pictorially until they turned from the otherworldliness of the Medieval period to concerns with the physical world during the Renaissance. It then became important that they be aware that such visual phenomena existed and should be turned into a picture. This requires attention to appearances – to perspective, light and shade, foreshortening, contour, etc. In the 1300s, as artists first became aware of such phenomena and learned how to represent them, people looked with awe at the realism of Giotto’s paintings in the Arena Chapel; 100 years later, as they saw more and representational techniques became more sophisticated, Masaccio became the ideal, realist painter.

After they saw the paintings of Leonardo and Michelangelo, the realism of both Giotto and Masaccio looked conventional. We did not see one point perspective until Brunelleschi first represented it in 1420; we did not understand it until Alberti published the optical theories on which it was based fifteen years later. We did not see sunrises the same way after we saw Monet’s paintings of 1872. In fact, the Impressionists changed our whole way of seeing color in nature. In our culture, even learning to draw is not a matter of developing hand-eye coordination, but rather of learning to see. People, regardless of their culture, who have not received this visual education literally cannot see any of this. They are, in effect, visual illiterates.

Pictorial vision is two dimensional in that it reduces three dimensional perception to what can be represented on a two dimensional surface. This idea of pictorial vision came to dominate Western intellectual and scientific culture because of its apparent objectivity. When Renaissance artists introduced the idea that two dimensional pictures could and should represent the appearance of reality, philosophers began to think that the perception of the three dimensional world itself was the interpretation of two dimensional images, but that the ‘vulgar’, the uneducated, were oblivious to this ‘insight’. Pictorial vision thus became an elitist, philosophical preoccupation. Only sophisticated intellectuals, in their consideration of how we know reality, understood that perception was an interpretation, a function of the mind. In the 17th century, the English philosopher John Locke wrote, “When we set before our eyes a round globe… it is certain that the idea thereby imprinted in our mind is of a flat circle” and in the 18th century, the Scottish philosopher David Hume wrote, “It is commonly allowed by philosophers that all bodies which discover themselves to the eye appear as if painted on a plane surface”. Some psychologists continue this idea today when they present a picture of an object to subjects in psychological experiments as if it were what it represents or when they analyze perception as a two-dimensional image projected on the retina. But pictorial perception is really the reverse of this concept. We are really converting three dimensional perceptions into two dimensions. We do not see a sphere as a circle which we then judge to be round, as Locke proposed. We see a sphere which we represent as a circle. (This may also be the source of Scruton’s and Bazin’s idea that a photograph is a surrogate for reality. If perception itself is two dimensional, then there would be little difference between perception and its mechanical copy. This is also consistent with what I wrote in my paper on ‘image as substitute’, for certainly the photograph is being used as a substitute for what it represents.)


Western children require training before they can represent what they see pictorially. They also go through stages of development, in which they progress from making conceptual, conventional or symbolic images to perceptual images. Chinese children require training, but do not go through such stages. They can draw pictorially at the age of four. Complicating this issue of representation even further is the fact that there are people who apparently can represent what they see without training or going through stages of development. Certain autistic children can represent what they see after only a quick glance. They have been labeled ‘idiot savants’. Oliver Sacks has called them ‘prodigies.’ These autistic children have almost total recall of optical experience, apparently without any conceptual structure or symbolic schemas. One famous example was Nadia, a developmentally delayed, autistic, three-and-a-half-year-old girl who could make visually accurate pictures of animals from memory. She had no language ability. Her drawings showed a sense of space, of foreshortening and of visual contours. Psychologists say that she had an ability to draw spontaneously because she had direct access to ‘lower’ levels of neural information prior to its being conceptualized. Such talents apparently recede or disappear with the onset of maturity, possibly as a result of the development of language.

This visual skill in people with no language raises questions about the origin and significance of image making in the history of civilization. Are pictures by these savants exceptions to the notion that image making is a product of culture, or do they suggest that everyone has potential access to these skills? Were prehistoric cave paintings necessarily the product of culture, or did they arise spontaneously from the primitive, prehistoric, human mind? The extraordinary drawings on the walls of the 30,000-year-old Chauvet caves have been interpreted by many historians as an indication that civilization is far older than previously thought, because they assume such drawings to have been accompanied by language, ritual and other signs of the modern mind. However, the British psychologist Nicholas Humphrey disagrees.

He argues in his essay, “Cave Art, Autism, and the Evolution of the Human Mind”, that “comparison of the cave art with the drawings made by a young autistic girl, Nadia, reveals surprising similarities in content and style”. This similarity suggests to him that these prehistoric artists may have been “little given to symbolic thought, have had no great interest in communication and have been essentially self taught and untrained.” In other words, Humphrey believes that images can arise spontaneously from perceptions based on primitive brain activity without the intervention of culture. Adam Gopnik, in an essay called “Learning To Draw” in The New Yorker last June, conjectures that learning to draw may be a ‘delving back’ into the pre-conceptual levels of our mind rather than a slow growing craft.

However, both Humphrey and the psychologists ignore information which contradicts their conclusions. In “Nadia”, the same book by Lorna Selfe on which they relied for descriptions of Nadia’s abilities, Selfe writes:

However, it is important to point out that Nadia did not normally work from direct perceptual experience. Although she did not copy her drawings from pictures, most of her drawings did have originals. These were taken from children’s books, newspapers and wall prints…

Thus, although Nadia may have had a compulsion to draw and a fantastic, visual memory, her images did not come spontaneously from ‘lower levels of neural information’, as Humphrey and the other psychologists believed; they came from pictures, which, of course, are the products of culture. Nadia based her drawings on images in which objects were represented in traditional, Western modes. The three dimensional forms of the horses she drew were already reduced to two-dimensional, paper surfaces. Problems of drawing contours of foreshortened, moving horses, so difficult for normal draftsmen before the invention of photography, were already solved for her. In other words, she did not make pictures of horses; she made pictures of pictures of horses, which is not the same thing. As far as I know, she never saw a real horse. Even her few drawings from life are based on what she learned from studying and drawing such pictures.

Nadia’s drawings looked like traditional, Western drawings because she had Western models for representation. In the same way, the drawings of Chinese and Japanese autistic savants look like Chinese and Japanese drawings. All children, regardless of their age, their culture or their level of cognition, have been exposed to images in the representational modes of their culture: children cannot avoid seeing pictures in magazines, pictures hung on the wall and pictures in the books that are read to them. It is not a matter of copying images; rather it is a matter of visualizing perception in particular, cultural modes, just as thought is articulated in particular languages. It is no different for representational artists in any culture. There is no alternative.


Two-dimensional pictures are often used as substitutes for three dimensional objects in psychological experiments, and this is confusing and deceptive. Humphrey ignores this difference or confuses the two in trying to establish that there is a natural way of seeing – an ‘innocent eye’, so to speak - independent of language and a history of image making. Nadia shows that it is possible to make images representing three dimensional forms without language, but with awareness of a particular, representational imagery. Therefore, although language may not be necessary for the development of representational skill, a long tradition of image making independent of language might be. Such images may have been used for communication, serving social functions prior to or along with language. There may even be a brain structure for image making, a different ‘frame of mind’, as we assume there to be for language. There may even be a compulsion to represent experiences visually as well as verbally.

I will even conjecture that the first images of animals in prehistoric caves did not come from direct observation of animals, but rather from accidental images of animals found in natural formations on cave walls. If the contemporary human mind can find faces in clouds and Madonnas in pieces of cheese, a prehistoric human might see an image of a horse in a natural configuration on a cave wall, revealed dramatically in flickering torchlight, and emphasize its representational aspects with lines and colored pigments. Eventually, using this found image as a model, he might create his own forms and

Thus, image making cannot exist outside of culture any more than language can. There can be no unmediated, objective observation and representation of nature. Also, there can be no representational images coming out of ‘lower levels of neural information’, independent of cultural context. The photograph and drawings of autistic artists are not exceptions. Images come out of a history of previous images. They come before perception in the sense that we do not see first and then make images of what we see; we see what images make us aware of before we can make images of what we see. For all practical purposes, we are unaware of this cultural world in which we live. Experientially, like water to fish or air to humans, culture does not exist. We are oblivious to the way it forms our vision and its representation as well as our social beliefs.

