The Saliency Map is a topographically arranged map that represents visual saliency of a corresponding visual scene.
One of the most severe problems of perception is information overload. Peripheral sensors generate afferent signals more or less continuously and it would be computationally costly to process all this incoming information all the time. Thus, it is important for the nervous system to make decisions on which part of the available information is to be selected for further, more detailed processing, and which parts are to be discarded. Furthermore, the selected stimuli need to be prioritized, with the most relevant being processed first and the less important ones later, thus leading to a sequential treatment of different parts of the visual scene. This selection and ordering process is called selective attention. Among many other functions, attention to a stimulus has been considered necessary for it to be perceived consciously (see Attention and Consciousness and Visual Awareness; but see Koch and Tsuchiya (2007) for a different viewpoint).
What determines which stimuli are selected by the attentional process and which are discarded? Many interacting factors contribute to this decision. It has proven useful to distinguish between bottom-up and top-down factors. The former are all those that depend only on the instantaneous sensory input, without taking into account the internal state of the organism. Top-down control, on the other hand, does take into account the internal state, such as the goals the organism has at this time, its personal history and experiences, etc. A dramatic example of a stimulus that attracts attention through bottom-up mechanisms is a firecracker going off suddenly. An example of top-down attention is a hungry animal focusing on difficult-to-find food items while ignoring more "salient" stimuli.
Given the difficulty of accurately measuring or even quantifying the internal states of an organism, those aspects of attentional control that are independent of these states, i.e., bottom-up attention, are easier to understand than those that are influenced by them. Possibly the most influential attempt at understanding bottom-up attention and the underlying neural mechanisms was made by Christof Koch and Shimon Ullman (Koch and Ullman, 1985). They proposed that the different visual features that contribute to attentive selection of a stimulus (color, orientation, movement, etc.) are combined into a single topographically organized map, the saliency map, which integrates the normalized information from the individual feature maps into one global measure of conspicuity. In analogy to the center-surround representations of elementary visual features, bottom-up saliency is thus determined by how different a stimulus is from its surround, in many submodalities and at many scales. To quote from Koch and Ullman (1985, p. 221): "Saliency at a given location is determined primarily by how different this location is from its surround in color, orientation, motion, depth etc."
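The combination of feature maps described above can be illustrated with a minimal sketch. It is not the Koch and Ullman model or any published implementation: it uses a single spatial scale, a simple min-max normalization, and plain averaging across features, all of which are assumptions made for illustration. The function names (`center_surround`, `normalize`, `saliency_map`) and the toy 5x5 scene are hypothetical.

```python
def center_surround(feature, radius=1):
    """Absolute difference between each cell and the mean of its neighbours."""
    rows, cols = len(feature), len(feature[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                feature[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            # A 1x1 map has no surround; treat its contrast as zero.
            surround = sum(neighbours) / len(neighbours) if neighbours else feature[r][c]
            out[r][c] = abs(feature[r][c] - surround)
    return out


def normalize(fmap):
    """Rescale a map to [0, 1]; a flat map becomes all zeros."""
    lo = min(min(row) for row in fmap)
    hi = max(max(row) for row in fmap)
    if hi == lo:
        return [[0.0] * len(row) for row in fmap]
    return [[(v - lo) / (hi - lo) for v in row] for row in fmap]


def saliency_map(features):
    """Average the normalized center-surround contrast of all feature maps."""
    contrast_maps = [normalize(center_surround(f)) for f in features]
    rows, cols = len(features[0]), len(features[0][0])
    return [
        [sum(m[r][c] for m in contrast_maps) / len(contrast_maps) for c in range(cols)]
        for r in range(rows)
    ]


# Toy scene (assumed data): one bright cell in the intensity map,
# and a uniform color map that contributes no local contrast.
intensity = [[0.0] * 5 for _ in range(5)]
intensity[2][3] = 1.0
color = [[0.5] * 5 for _ in range(5)]

sal = saliency_map([intensity, color])
```

In this toy scene the single bright cell differs most from its surround, so it receives the highest saliency value, while the uniform color map adds nothing. A fuller model would repeat the contrast computation at several spatial scales and for more features.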
The saliency map was designed as input to the control mechanism for covert selective attention. Koch and Ullman (1985) posited that the most salient location (in the sense defined above) in a visual scene would be a good candidate for attentional selection. Once a topographic map of saliency is established, this location is obtained by computing the position of the maximum in this map with a Winner-Take-All mechanism. After the selection is made, activity at the selected location is suppressed, which may correspond to the psychophysically observed "inhibition of return" mechanism. The next selection then falls on the location of the second-highest value in the saliency map, and a succession of these events generates a sequential scan of the visual scene. This role of the saliency map in controlling which locations in the visual scene are attended is close to that of the "master map" postulated in the "Feature Integration Theory" proposed by Treisman and Gelade (1980).
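The selection loop described above (find the maximum, suppress it, repeat) can be sketched as follows. This is a simplified illustration, not a neural Winner-Take-All network: the maximum is found by direct search, and inhibition of return is modeled by permanently removing a small square neighbourhood around each winner. The function name `scan_path`, the parameter `ior_radius`, and the toy map are assumptions introduced here.

```python
def scan_path(saliency, n_fixations, ior_radius=1):
    """Return attended locations in order of decreasing saliency."""
    sal = [row[:] for row in saliency]  # work on a copy; the input is untouched
    rows, cols = len(sal), len(sal[0])
    path = []
    for _ in range(n_fixations):
        # Winner-take-all: select the location holding the maximum value.
        r, c = max(
            ((i, j) for i in range(rows) for j in range(cols)),
            key=lambda ij: sal[ij[0]][ij[1]],
        )
        path.append((r, c))
        # Inhibition of return: suppress the winner and its neighbourhood
        # so that the next iteration selects a different location.
        for rr in range(max(0, r - ior_radius), min(rows, r + ior_radius + 1)):
            for cc in range(max(0, c - ior_radius), min(cols, c + ior_radius + 1)):
                sal[rr][cc] = float("-inf")
    return path


# Toy saliency map (assumed data) with three conspicuous locations.
toy = [
    [0.1, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

path = scan_path(toy, 3)
```

Here `path` visits the three conspicuous locations in order of decreasing saliency. In this sketch the suppression never decays; a time-limited suppression would be closer to the psychophysical inhibition of return, but that choice is left open here.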
The Koch and Ullman study was purely conceptual. The first actual implementation of a saliency map was described by Niebur and Koch (1996). Their saliency map model made use of color, intensity, orientation and motion cues. They applied it both to simplified visual input (as is typically used in psychophysical experiments) and to complex natural scenes, and they demonstrated sequential scanning of the visual scene in order of decreasing salience (see the example below). Later work refined the model (Itti et al, 1998; Itti and Koch 2001). The source code to compute saliency maps is freely available at http://ilab.usc.edu/toolkit/downloads.shtml.
Bottom-up mechanisms (and thus the saliency map) do not completely determine attentional selection. In many cases, top-down influences play an important role and can override bottom-up saliency cues, as in the example of the hungry animal discussed above (see also Underwood et al, 2006). Various mechanisms have been proposed to integrate top-down influences in the saliency map, starting with its very first implementation by Niebur and Koch (1996), who proposed that spatial selective attention (as in a Posner task; Posner 1980) would result from spatially defined additions to the saliency map.
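The idea of a spatially defined addition to the saliency map can be illustrated with a small sketch. It shows only the general principle of adding a top-down term at a cued location; the shape and size of the added region, the function names `add_spatial_bias` and `argmax2d`, and all numerical values are assumptions for illustration and are not taken from Niebur and Koch (1996).

```python
def add_spatial_bias(saliency, center, strength, radius=1):
    """Return a copy of the map with `strength` added around `center`."""
    rows, cols = len(saliency), len(saliency[0])
    biased = [row[:] for row in saliency]
    r0, c0 = center
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            biased[r][c] += strength
    return biased


def argmax2d(m):
    """Location (row, column) of the largest value in a 2D list."""
    return max(
        ((r, c) for r in range(len(m)) for c in range(len(m[0]))),
        key=lambda rc: m[rc[0]][rc[1]],
    )


# Bottom-up map (assumed data): the strongest stimulus is at (1, 1),
# a weaker one at (2, 3).
bottom_up = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.75, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.25],
    [0.0, 0.0, 0.0, 0.0],
]

# A spatial cue at (2, 3) adds activity there before selection takes place.
cued = add_spatial_bias(bottom_up, center=(2, 3), strength=0.75)

winner_without_cue = argmax2d(bottom_up)
winner_with_cue = argmax2d(cued)
```

Without the cue the winner is the stronger bottom-up stimulus; with the cue, the added activity lets the weaker stimulus at the cued location win the competition. This is one simple way in which a top-down signal can override bottom-up saliency.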
Not everybody agrees that an explicit saliency map is necessary to control attention; see, e.g., VanRullen (2003).
The figure shows a complex visual scene and the corresponding saliency map, as computed from the algorithm in Niebur and Koch (1996). The scene is static so the motion component of the algorithm does not yield a contribution. The surf line is well-represented in the saliency map since it combines input from several feature maps: intensity, orientation and color all have substantial local contrast at several spatial scales in this area. The same is the case for the clouds and the island in the distance.
Beyond the original application of the saliency map as the stage of a control system for covert attention, it has found use in other, related areas. Perhaps the most immediate extension is to predict eye movements ("overt attention"; e.g., Parkhurst et al, 2002, Underwood et al, 2006). There are numerous technical applications in which the saliency map is typically used to prioritize selection, e.g. to identify the most important information in visual input streams and to use this to improve performance in generating or transmitting visual data (review: Parkhurst and Niebur 2002). Even an "inverse" saliency map has been used, to de-emphasize salient image regions and to direct attention to other regions (Su et al 2004). Another original application of saliency maps is to generate synthetic vision for simulated actors in virtual environments (Courty and Marchand 2003). Saliency maps have also been integrated in a VLSI hardware model of visual selective attention (Indiveri 2000).
Anatomical localization of the saliency map
The original definition of the saliency map by Koch and Ullman (1985) is in terms of neural processes and transformations, rather than in terms of cognitive or higher-order constructs. The question of where the saliency map is located in the brain thus arises quite naturally. There is no logical necessity that it arises in one particular location, and it could be understood as a functional map whose components are distributed over many brain areas. It is also possible that there is more than one topographically organized saliency map. However, given that many feature maps of early vision are, in fact, localized in specific parts of the central nervous system, it has been proposed that the same might also be the case for the saliency map. Koch and Ullman (1985) proposed that it may be located in the lateral geniculate nucleus of the thalamus, an area previously suggested as playing a major role in attentional control by Crick (1984). Another thalamic nucleus, the pulvinar, is known to be involved in attention (Robinson and Petersen 1992) and has also been suggested as a candidate for housing the saliency map. Another possibility is the superior colliculus, likewise known to be involved in the control of attention (Kustov and Robinson 1996). Several neocortical areas have been suggested as well, including V1 (Li 2002), V4 (Mazer and Gallant 2003), and posterior parietal cortex (Gottlieb 2007). Thus, there are a number of identified candidates, which may correspond to different flavors of salience, perhaps more bottom-up driven in some areas and more strongly modulated by behavioral goals in others.
Saliency maps in other sensory modalities
While the discussion so far has been on the use of saliency maps in vision, it has also been proposed that saliency maps could be employed in auditory perception (Kayser et al, 2005). At this time, the author is not aware of an analogous claim in the third major spatially organized modality, i.e. somatosensory perception.
- Courty, N. and Marchand, E. Visual perception based on salient features. Proc. of 2003 IEEE/RSJ Intl. Conference on Intelligent Robots and Systems. Las Vegas, Nevada 2003
- Gottlieb, J. From Thought to Action: The Parietal Cortex as a Bridge between Perception, Action, and Cognition. Neuron 53(1): 9-16 (2007)
- Kayser C, Petkov CI, Lippert M, and Logothetis NK. Mechanisms for allocating auditory attention: an auditory saliency map. Current Biology 15(21):1943-1947 (2005)
- Indiveri G. Modeling selective attention using a neuromorphic analog VLSI device. Neural Computation 12(12):2857-80 (2000)
- Itti, L, Koch, C. and Niebur, E. A Model of Saliency-Based Visual Attention for Rapid Scene Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(11):1254-1259 (1998)
- Itti, L and Koch, C. Computational Modeling of Visual Attention, Nature Reviews Neuroscience 2(3):194-203 (2001)
- Koch, C. and Ullman, S. Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology 4:219-227 (1985).
- Koch, C and Tsuchiya, N. Attention and consciousness: two distinct brain processes. Trends Cogn Sci. 11(1):16-22 (2007)
- Kustov, A.A and Robinson, D. L. Shared neural control of attentional shifts and eye movements. Nature 384(6604):74-77 (1996)
- Li Z. A saliency map in primary visual cortex. Trends in Cognitive Sciences 6(1): 9-16 (2002)
- Mazer, J.A. and Gallant, J.L. Goal-related activity in area V4 during free viewing visual search: Evidence for a ventral stream salience map. Neuron 40: 1241-1250 (2003)
- Niebur, E. and Koch, C. Control of Selective Visual Attention: Modeling the `Where' Pathway. Neural Information Processing Systems 8:802-808 (1996)
- Parkhurst, D. and Law, K. and Niebur, E. Modelling the role of salience in the allocation of visual selective attention. Vision Research 42(1):107-123 (2002)
- Parkhurst, D. and Niebur, E., Variable resolution displays: a theoretical, practical and behavioral evaluation. Human Factors 44(4):611-29 (2002)
- Posner, M. I. Orienting of attention. Quart. J. Exp. Psychol 32: 3-25 (1980)
- Robinson, D. L. and Petersen, S. E. The pulvinar and visual salience. Trends in Neurosciences 15(4):127-132 (1992)
- Su, S. L., Durand, F. and Agrawala, M. An Inverted Saliency Model for Display Enhancement. In Proceedings of 2004 MIT Student Oxygen Workshop, Ashland, MA (2004)
- Treisman, A. M. and Gelade, G. A feature-integration theory of attention. Cognitive Psychology 12(1): 97-136 (1980)
- Thompson, K. G. and Schall, J. D. Antecedents and correlates of visual detection and awareness in macaque prefrontal cortex. Vision Research 40(10-12):1523-1538 (2000)
- Underwood, G., Foulsham, T., van Loon, E., Humphreys, L. and Bloyce, J. Eye movements during scene inspection: A test of the saliency map hypothesis. European Journal of Cognitive Psychology 18(3):321-342 (2006)
- VanRullen R. Visual saliency and spike timing in the ventral visual pathway. J Physiol Paris 97(2-3):365-77 (2003)
- Valentino Braitenberg (2007) Brain. Scholarpedia, 2(11):2918.
- John G. Taylor (2007) CODAM model. Scholarpedia, 2(11):1598.
- Keith Rayner and Monica Castelhano (2007) Eye movements. Scholarpedia, 2(10):3649.
- S. Murray Sherman (2006) Thalamus. Scholarpedia, 1(9):1583.
- Laurent Itti (2007) Visual salience. Scholarpedia, 2(9):3327.
Author's homepage: http://cnslab.mb.jhu.edu
Open source implementation of a visual saliency map model: http://ilab.usc.edu/toolkit