Top-down and bottom-up theories of perception

Psychologists often distinguish between top-down and bottom-up approaches to information processing. In top-down approaches, knowledge or expectations are used to guide processing. Bottom-up approaches, by contrast, resemble the structuralist approach: sensory data are pieced together until a bigger picture emerges. One of the strongest advocates of a bottom-up approach was J.J. Gibson (1904-1979), who articulated a theory of direct perception. According to this theory, the real world provides sufficient contextual information for our visual systems to perceive directly what is there, unmediated by higher cognitive processes. Gibson developed the notion of affordances, which are those aspects of objects or environments that allow an individual to perform an action. His emphasis on the match between individual and environment led him to call his approach ecological. Most psychologists now argue that both bottom-up and top-down processes are involved in perception.

Bottom-up theories

Template matching

[Figure: TheCat.png] What do you read above?

One way for people to recognize objects in their environment would be to compare what they see with templates stored in memory. For example, if I can achieve a match between the large red object I see in the street and my stored representation of a London bus, then I recognize a London bus. However, one difficulty for this theory is illustrated in the figure. Here, we have no problem reading the middle letter of one word as an H and the middle letter of the other as an A, even though the two shapes are identical. A single template cannot explain why the same shape is recognized as two different letters. A second problem is that we continue to recognize most objects regardless of the perspective from which we see them (e.g. from the front, side, back, bottom, top, etc.). Template theory would therefore require a nearly infinite store of templates, which hardly seems credible.
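
The perspective problem can be made concrete with a minimal sketch. It assumes, purely for illustration, that a visual input is a 5x5 grid of ink (1) and blank (0) cells and that recognition requires an exact match with a stored template. The grids, the names TEMPLATES and match_template, and the exact-match rule are all assumptions of this sketch rather than part of any published model.

```python
# Hypothetical 5x5 "retinal images": "1" = ink, "0" = blank.
TEMPLATES = {
    "T": ("11111", "00100", "00100", "00100", "00100"),
    "L": ("10000", "10000", "10000", "10000", "11111"),
}


def match_template(image, templates=TEMPLATES):
    """Return the label whose stored template equals the image exactly, else None."""
    for label, template in templates.items():
        if tuple(image) == template:
            return label
    return None


# A T in exactly the stored position is recognised.
upright_t = ("11111", "00100", "00100", "00100", "00100")

# The same letter shifted down by one row no longer matches anything.
shifted_t = ("00000", "11111", "00100", "00100", "00100")

print(match_template(upright_t))
print(match_template(shifted_t))
```

Under these assumptions the shifted T goes unrecognised unless a further template is stored for that position, and the same would hold for every other position, size and orientation.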

The first couple of minutes or so of the following video discuss the issue of templates:

Prototype theories

An alternative to template theory is based on prototype matching. Instead of comparing a visual array to a stored template, the array is compared to a stored prototype, the prototype being a kind of average of many other patterns. The perceived array does not need to match the prototype exactly for recognition to occur, so long as there is a family resemblance. For example, if I am looking down on a London bus from above, its size and redness enable me to recognize it as a bus, even though the shape does not match my prototype. There is good evidence that people do form prototypes after exposure to a series of related stimuli. For instance, in one study people were shown a series of patterns that were related to a prototype, but not the prototype itself. When later shown a series of distractor patterns plus the prototype, the participants identified the prototype as a pattern they had seen previously [1].
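
The idea of a prototype as "a kind of average" can be sketched in a few lines. The sketch assumes, for illustration only, that each pattern is a list of four numbers, that the prototype is the mean of the studied patterns, and that a pattern feels familiar when its Euclidean distance from the prototype is below a threshold. The numbers, the threshold and the function names are hypothetical and are not taken from the study mentioned above.

```python
def build_prototype(exemplars):
    """Average each feature across the studied exemplars."""
    return [sum(values) / len(values) for values in zip(*exemplars)]


def distance(a, b):
    """Euclidean distance between two patterns of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def seems_familiar(pattern, prototype, threshold=2.0):
    """Treat a pattern as familiar if it lies close enough to the prototype."""
    return distance(pattern, prototype) <= threshold


# Four studied patterns, each a small distortion of [5, 5, 5, 5].
studied = [[4, 6, 5, 5], [6, 4, 5, 5], [5, 5, 4, 6], [5, 5, 6, 4]]
prototype = build_prototype(studied)

unseen_prototype = [5, 5, 5, 5]  # never shown during study
distractor = [9, 1, 8, 2]        # unrelated pattern

print(seems_familiar(unseen_prototype, prototype))
print(seems_familiar(distractor, prototype))
```

In this toy setting the never-studied pattern [5, 5, 5, 5] coincides with the stored average, so it is judged familiar, while the distractor is not. This mirrors the direction of the finding described above without claiming to model it.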

Feature-matching theories

Feature-matching theories propose that we decompose visual patterns into a set of critical features, which we then try to match against features stored in memory. For example, in memory I have stored the information that the letter "Z" comprises two horizontal lines, one oblique line, and two acute angles, whereas the letter "Y" has one vertical line, two oblique lines, and one acute angle. I have similar stored knowledge about the other letters of the alphabet. When I am presented with a letter, the process of recognition involves identifying its lines and angles and comparing these to the stored information about all letters of the alphabet. If presented with a "Z", as long as I can identify the features then I should recognise it as a "Z", because no other letter of the alphabet shares this combination of features. The best-known model of this kind is Oliver Selfridge's Pandemonium.
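
The matching step described above can be sketched as a simple lookup. The feature lists for "Z" and "Y" come from the text. The entries for "L" and "T", the feature names and the function recognise_letter are assumptions added for illustration. This is a plain table lookup, not an implementation of Pandemonium.

```python
from collections import Counter

# Stored feature descriptions. "Z" and "Y" follow the text above;
# "L" and "T" are hypothetical entries added for this sketch.
LETTER_FEATURES = {
    "Z": Counter(horizontal=2, oblique=1, acute_angle=2),
    "Y": Counter(vertical=1, oblique=2, acute_angle=1),
    "L": Counter(vertical=1, horizontal=1, right_angle=1),
    "T": Counter(vertical=1, horizontal=1, right_angle=2),
}


def recognise_letter(detected):
    """Return the one letter whose stored features equal the detected ones, else None."""
    observed = Counter(detected)
    matches = [letter for letter, features in LETTER_FEATURES.items()
               if features == observed]
    return matches[0] if len(matches) == 1 else None


print(recognise_letter({"horizontal": 2, "oblique": 1, "acute_angle": 2}))
print(recognise_letter({"vertical": 1, "oblique": 2, "acute_angle": 1}))
print(recognise_letter({"oblique": 3}))
```

Recognition succeeds only when the detected combination of features is unique to one stored letter. Any combination that is missing from the table, such as three oblique lines, is left unrecognised.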

One source of evidence for feature matching comes from Hubel and Wiesel's research, which found that the visual cortex of cats contains neurons that respond only to specific features (e.g. one type of neuron might fire when a vertical line is presented, another when a horizontal line moving in a particular direction is shown).

Some authors have distinguished between local features and global features. In a paper titled "Forest before trees", David Navon suggested that "global" features are processed before "local" ones. He showed participants large letters, either "H"s or "S"s, that were made up of smaller letters, either small Hs or small Ss. People were faster to identify the larger letter than the smaller ones, and the response time was the same regardless of whether the smaller letters (the local features) were Hs or Ss. However, when required to identify the smaller letters, people responded more quickly when the large letter was of the same type as the smaller letters.
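
To make the structure of such stimuli concrete, the following sketch builds a large letter out of small ones as plain text. The 5x5 outlines in LARGE_SHAPES and the function names are assumptions made for illustration and are not Navon's actual materials.

```python
# Hypothetical 5x5 outlines for the large (global) letters; "X" marks a filled cell.
LARGE_SHAPES = {
    "H": ("X...X", "X...X", "XXXXX", "X...X", "X...X"),
    "S": ("XXXXX", "X....", "XXXXX", "....X", "XXXXX"),
}


def navon_figure(global_letter, local_letter):
    """Draw the large letter using copies of the small letter."""
    rows = LARGE_SHAPES[global_letter]
    return "\n".join(
        "".join(local_letter if cell == "X" else " " for cell in row)
        for row in rows
    )


def is_consistent(global_letter, local_letter):
    """True when the large letter and its small letters are of the same type."""
    return global_letter == local_letter


# An inconsistent stimulus: a large H made of small Ss.
print(navon_figure("H", "S"))
print(is_consistent("H", "S"))
```

In the terms used above, the large H is the global feature and the small Ss are the local features. The consistent stimuli are the ones where is_consistent returns True.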

One difficulty for feature-matching theory comes from the fact that we are normally able to read slanted handwriting that does not seem to conform to the feature description given above. For example, if I write a letter "L" in a slanted fashion, I cannot match this to a stored description that states that L must have a vertical line. Another difficulty arises from trying to generalise the theory to the natural objects that we encounter in our environment.
