Keith Rayner and Monica Castelhano (2007), Scholarpedia, 2(10):3649. doi:10.4249/scholarpedia.3649. Revision #126973.
Eye movements are a behavior that can be measured and their measurement provides a sensitive means of learning about cognitive and visual processing. Although eye movements have been examined for some time, it has only been in the last few decades that their measurement has led to important discoveries about psychological processes that occur during such tasks as reading, visual search, and scene perception.
Basic characteristics of eye movements in visual cognition
Although we have the impression that we can process the entire visual field in a single fixation, in reality we would be unable to fully process the information outside of foveal vision if we were unable to move our eyes (Rayner, 1978, 1998).
Because of acuity limitations in the retina, eye movements are necessary for processing the details of the array. Our ability to discriminate fine detail drops off markedly outside of the fovea in the parafovea (extending out to about 5 degrees on either side of fixation) and in the periphery (everything beyond the parafovea). (See Figure 1).
While we are reading or searching a visual array for a target or simply looking at a new scene, our eyes move every 200-350 ms. These eye movements serve to move the fovea (the high resolution part of the retina encompassing 2 degrees at the center of the visual field) to an area of interest in order to process it in greater detail.
During the actual eye movement (or saccade), vision is suppressed and new information is acquired only during the fixation (the period of time when the eyes remain relatively still).
While it is true that we can move our attention independently of where the eyes are fixated, it does not seem to be the case in everyday viewing. The separation between attention and fixation is often attained in very simple tasks (Posner, 1980); however, in tasks like reading, visual search, and scene perception, covert attention and overt attention (the exact eye location) are tightly linked.
Because eye movements are essentially motor movements, it takes time to plan and execute a saccade. In addition, the end-point is pre-selected before the beginning of the movement.
While it has generally been assumed that the two eyes move in synchrony and that they fixate the same point in space, recent research clearly demonstrates that this is not the case and the two eyes are frequently deviated from each other (Liversedge, Rayner, White, Findlay, & McSorley, 2006; Liversedge, White, Findlay, & Rayner, 2006).
There is considerable evidence that the nature of the task influences eye movements. A summary of the average amount of time spent on each fixation and the average distance the eyes move in reading, visual search, and scene perception is shown in Table 1.
|Task||Typical mean fixation duration (ms)||Mean Saccade Size (degrees)|
|Silent Reading||225-250||2 (8-9 letter spaces)|
|Oral Reading||275-325||1.5 (6-7 letter spaces)|
Although the values presented in the table are quite representative of the different tasks, they are averages: each task shows a range of average fixation durations, and within each task there is considerable variability in both fixation durations and saccade lengths.
Brief history of eye movements in visual cognition
At one time, researchers believed that the eyes and the mind were not tightly linked during information processing tasks like reading, visual search, and scene perception. This conclusion was based on the relatively long latencies of eye movements (or reaction time of the eyes) and the large variability in the fixation time measures.
They questioned the influence of cognitive factors on fixations given that eye movement latency was so long and the fixation times were so variable. It seemed unlikely that cognitive factors could influence fixation times from fixation to fixation.
An underlying assumption was that everything proceeded in a serial fashion and that cognitive processes could not influence anything except very late in a fixation, if at all. However, a great deal of research using improved eye trackers has since established a tight link between the eye and the mind, and it is now clear that saccades can be programmed in parallel (Becker & Jürgens, 1979) and, furthermore, that information processing continues in parallel with saccade programming.
Eye movements in reading
In reading, unlike other tasks, saccade size is measured in character spaces rather than degrees of visual angle, because character spaces have been shown to be the more appropriate unit. If the size of the print is held constant and the viewing distance is varied (so that there are either more or fewer characters per degree of visual angle), how far the eyes travel is determined by character spaces, not visual angle (Morrison & Rayner, 1981).
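The relation between character spaces and visual angle can be illustrated with a little trigonometry. The following is a minimal sketch, not taken from the studies cited: the function name `saccade_in_degrees` and the example character width and viewing distances are assumptions chosen for illustration, and the span is assumed to be centered on the line of sight.

```python
import math

def saccade_in_degrees(n_chars, char_width_cm, viewing_distance_cm):
    """Visual angle (degrees) subtended by n_chars character spaces.

    Hypothetical helper: assumes the span is centered on the line of sight
    and that every character space has the same physical width.
    """
    extent_cm = n_chars * char_width_cm
    return math.degrees(2 * math.atan(extent_cm / (2 * viewing_distance_cm)))

# The same 8-character saccade at two viewing distances (illustrative values).
for distance_cm in (40, 80):
    angle = saccade_in_degrees(8, 0.3, distance_cm)
    print(f"{distance_cm} cm: 8 character spaces = {angle:.2f} degrees")
```

Doubling the viewing distance roughly halves the visual angle of the same 8-character saccade, which is why a finding stated in character spaces does not translate into a fixed number of degrees.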
Another important characteristic of eye movements while reading is that about 10-15% of the time readers move their eyes (regress) back to previously read material in the text. These regressions, as they are called, tend to depend on the difficulty of the text.
As would be expected, saccade size and fixation duration are also both modulated by text difficulty: as the text becomes more difficult, saccade size decreases, fixation durations increase, and regressions increase.
From these measures alone, it is very clear that global properties of the text influence eye movements greatly. In addition, these three main global measures (saccade size, fixation duration and number of regressions) are also influenced by the type of material being read and the reader’s goals in reading (Rayner & Pollatsek, 1989). For instance, reading a text for understanding produces a very different pattern of eye movement measures when compared to skimming a text while proofreading (Figure 3).
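As an illustration of the three global measures, the sketch below computes them from a list of fixations on a single line of text. The data format (character position, duration in ms), the function name `global_measures`, and the example values are assumptions made for this example; return sweeps between lines are not handled.

```python
from statistics import mean

def global_measures(fixations):
    """Summarize a fixation sequence recorded on one line of text.

    fixations: list of (character_position, duration_ms) tuples in
    temporal order (a hypothetical format chosen for this sketch).
    """
    durations = [duration for _, duration in fixations]
    # Signed saccade sizes in character spaces; negative means leftward.
    moves = [b[0] - a[0] for a, b in zip(fixations, fixations[1:])]
    forward = [m for m in moves if m > 0]
    regressions = [m for m in moves if m < 0]
    return {
        "mean_fixation_ms": mean(durations),
        "mean_forward_saccade_chars": mean(forward) if forward else 0.0,
        "regression_rate": len(regressions) / len(moves) if moves else 0.0,
    }

# Made-up example: five fixations, one of which follows a regression.
example = [(3, 220), (11, 240), (19, 260), (14, 300), (24, 230)]
print(global_measures(example))
```

Here a regression is counted as any leftward saccade, which matches left-to-right scripts such as English only.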
In addition to global effects, studies have shown clear local effects on words. Measures in these studies focus on the processing of a target word (versus looking at an average measure that is pooled from all words in a sentence, such as the average fixation duration). Local measures include: first fixation duration (the duration of the first fixation on a word), single fixation duration (those cases where only a single fixation is made on a word), and gaze duration (the sum of all fixations on a word prior to moving to another word).
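The local measures defined above can be sketched in a few lines. This is an illustrative implementation under stated assumptions: fixations are given as (word index, duration in ms) pairs in temporal order, the function name `local_measures` is hypothetical, and only the first run of consecutive fixations on the target word is used.

```python
def local_measures(fixations, target):
    """First fixation, single fixation, and gaze duration for one word.

    fixations: list of (word_index, duration_ms) tuples in temporal order
    (a hypothetical format). Returns None if the word was never fixated.
    """
    first_run = []
    for word, duration in fixations:
        if word == target:
            first_run.append(duration)
        elif first_run:
            break  # the eyes moved to another word; later refixations are excluded
    if not first_run:
        return None
    return {
        "first_fixation": first_run[0],
        "single_fixation": first_run[0] if len(first_run) == 1 else None,
        "gaze_duration": sum(first_run),
    }

# Made-up example: word 2 is fixated twice, left, and later revisited.
example = [(0, 210), (1, 230), (2, 250), (2, 180), (3, 220), (2, 300)]
print(local_measures(example, target=2))
```

The later 300 ms refixation on word 2 is deliberately excluded, since gaze duration sums only the fixations made before the eyes move to another word.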
A very important issue in reading is how much information the reader is able to process and use during a single fixation, which, as we’ve noted above, typically lasts for 200-250 ms. This region is referred to as the perceptual span (also called the functional field of view or, less often, the region of effective vision). Although we have the impression that we can see an entire line of text or even an entire page of text, this is an illusion. This fact has been clearly demonstrated in a number of studies over the years that use a “gaze-contingent moving window paradigm”, introduced by McConkie and Rayner (1975; Rayner & Bertera, 1979). For more information on the “gaze-contingent moving window paradigm” and other gaze-contingent paradigms, see Eye-Contingent Experimental Paradigms.
Studies have demonstrated that English readers acquire useful information from an asymmetrical region around the fixation point (extending 3-4 character spaces to the left of fixation and about 14-15 character spaces to the right). Research has also found that readers do not utilize information from the words on the line below the currently fixated line (Pollatsek, Raney, LaGasse, & Rayner, 1993).
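The logic of a moving window display can be sketched as follows. This is a simplified illustration, not the procedure of any particular study: the function name `moving_window`, the choice to replace letters with "x" while preserving spaces, and the default window of 4 character spaces to the left and 14 to the right are assumptions based on the span reported above.

```python
def moving_window(text, fixation, left=4, right=14, mask="x"):
    """Return the line as it might be displayed for one fixation.

    Characters inside the window [fixation - left, fixation + right] are
    shown normally; letters outside it are replaced by `mask`. Spaces are
    kept so that word boundaries remain visible (one possible variant).
    """
    shown = []
    for index, char in enumerate(text):
        inside = fixation - left <= index <= fixation + right
        shown.append(char if inside or char == " " else mask)
    return "".join(shown)

line = "The quick brown fox jumps over the lazy dog"
# In a real experiment the window is redrawn whenever the eyes move.
for fixation in (2, 10, 25):
    print(moving_window(line, fixation))
```

In an actual gaze-contingent experiment the display would be updated from eye tracker samples in real time; here the fixation positions are simply supplied by hand.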
As briefly mentioned above, the difficulty of the text being read has an impact on eye movement patterns (fixation duration, saccade length, and frequency of regressing to previously read text). Over the past few years, it has become very clear that how long the eyes remain in place is influenced by a host of linguistic factors.
These factors include:
- the frequency of the fixated word (Inhoff & Rayner, 1986; Rayner & Duffy, 1986),
- how predictable the fixated word is (Ehrlich & Rayner, 1981; Rayner & Well, 1996),
- how many meanings the fixated word has (Duffy, Morris, & Rayner, 1988; Sereno, O’Donnell, & Rayner, 2006),
- when the meaning of the word was acquired (Juhasz & Rayner, 2003, 2006),
- semantic relations between the word and prior words (Carroll & Slowiaczek, 1986; Morris, 1994), and
- how familiar the word is (Williams & Morris, 2004).
For a more in-depth review of these factors and others, see Rayner (1998).
Recently, a number of sophisticated computational models (see Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Rayner, & Pollatsek, 2003; Engbert, Nuthmann, Richter, & Kliegl, 2005) have been presented which do a good job of accounting for the characteristics of eye movements during reading.
Eye movements and visual search
Fixation durations in search tend to be highly variable. Some studies report average fixation times as short as 180 ms while others report averages on the order of 275 ms. This is undoubtedly due to the fact that the difficulty level of a search (i.e., how dense or cluttered the array is) and the exact nature of the search task will strongly influence how long viewers pause on each item (see Figure 4 and Figure 5).
Typically, saccade size is a bit larger than in reading (though saccades can be quite short with very dense arrays).
When an array is very cluttered (with many objects and distractors), the search becomes more demanding than when the array is simple. The eye movements on each of these types of arrays typically reflect this property of an array (Bertera & Rayner, 2000; Greene & Rayner, 2001a, 2001b). As the array becomes more complicated, we see an increase in the fixation duration and the number of fixations, as well as a decrease in the average saccade length (Vlaskamp & Hooge, 2006).
Research has shown that in visual search, the perceptual span varies as a function of the difficulty of the distractor letters. That is, the perceptual span is smaller when distractor items are visually similar to the target than when the distractor items are distinctly different.
This suggested that there were two qualitatively different regions within the span: a decision region (where information about the presence or absence of a target is available), and a preview region where some item information is available, but where information on the absence of a target is not yet available.
Where a person will most likely fixate while performing a visual task is often described using a saliency map (e.g., Guided Search Model, Cave & Wolfe, 1990; Wolfe, 1994, 2001). The Guided Search model is perhaps the best known and has sparked a great deal of interest in the guiding mechanisms of visual search. According to this model, guidance within a search array arises from two sources: bottom-up guidance (which prioritizes items according to how much they differ from their neighbors) and top-down guidance (which prioritizes items according to how many features they share with the target).
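The combination of the two sources of guidance can be illustrated with a toy priority computation. This is a rough sketch of the general idea only, not the Guided Search model itself: the feature coding, the equal weights, the use of all other items as "neighbors", and the function names are assumptions made for this example.

```python
def bottom_up(item, others):
    """Mean proportion of features on which the item differs from the others."""
    if not others:
        return 0.0
    differences = [
        sum(a != b for a, b in zip(item, other)) / len(item) for other in others
    ]
    return sum(differences) / len(differences)

def top_down(item, target):
    """Proportion of features the item shares with the search target."""
    return sum(a == b for a, b in zip(item, target)) / len(item)

def priority_map(items, target, w_bottom_up=0.5, w_top_down=0.5):
    """Weighted sum of both sources for every item (weights are arbitrary)."""
    scores = []
    for index, item in enumerate(items):
        others = items[:index] + items[index + 1:]
        scores.append(
            w_bottom_up * bottom_up(item, others)
            + w_top_down * top_down(item, target)
        )
    return scores

# Hypothetical array: each item is coded as (color, orientation).
items = [("red", "vertical"), ("green", "vertical"),
         ("green", "vertical"), ("green", "horizontal")]
scores = priority_map(items, target=("red", "vertical"))
print(scores.index(max(scores)))  # index of the item with the highest priority
```

In this toy array the target item both differs from most of the other items and matches the target features, so it receives the highest combined priority.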
Eye movements and scene perception
When people look at scenes, not every part of the scene is fixated. This is largely because in scene perception information can be obtained over a wider region than in reading and, possibly, visual search arrays (Figure 6).
However, it is also clear that the important aspects of the scene are typically fixated (and generally looked at for longer periods than less important parts of the scene).
The average fixation duration in scene perception tends to be longer than that in reading, and likewise the average saccade size tends to be larger (Figure 6).
The gist of a scene has been defined as the general scene concept (Potter, 1999), and most often refers to a scene’s basic-level category in the literature (Oliva, 2005). Researchers have also found that in addition to a scene’s concept, the general scene layout is quickly extracted (Sanocki & Epstein, 1997; Sanocki, 2003).
One very important general finding is that viewers are able to acquire scene gist in a single glance. That is, the gist of the scene is understood so quickly that it is processed even before the eyes begin to move (De Graef, 2005). Indeed, recent research has shown that with only 40 ms of exposure, the visual system can extract enough information to process the scene’s gist (Castelhano & Henderson, 2007b).
It has become clear that the eyes can quickly go to parts of a scene that are relevant and important. The pioneering work of Buswell (1935) and Yarbus (1967) first documented that a viewer’s gaze is drawn to important aspects of a visual scene and that the task goal strongly influences eye movements. Much of the research that followed illustrated that the eyes are drawn to informative areas in a scene quickly (Antes, 1974; Mackworth & Morandi, 1967).
Other studies have also made it clear that saliency of different parts of the scene greatly influences where viewers tend to fixate (Parkhurst & Niebur, 2003; Mannan, Ruddock, & Wooding, 1995, 1996). Saliency is typically defined in terms of low-level components of the scene (such as contrast, color, intensity, brightness, spatial frequency, etc.).
There are a number of computational models (Baddeley & Tatler, 2006; Findlay & Walker, 1999; Itti & Koch, 2000, 2001; Parkhurst, Law, & Niebur, 2002) that use the concept of a saliency map to model eye fixation locations in scenes. With this approach, the bottom-up properties of a scene (i.e., the saliency map) are used to make explicit predictions about the most visually salient regions of the scene. The models are basically used to derive predictions about the distribution of fixations on a given scene based on these prominent regions.
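To make the idea of a saliency map concrete, the sketch below computes a very crude one from a small grid of intensity values and picks the most salient location as the predicted fixation. It is not an implementation of any of the models cited above: it uses only local intensity contrast (each cell compared with the mean of its immediate neighbors), and the function names and example grid are assumptions made for this example.

```python
def contrast_saliency(image):
    """Saliency of each cell = |intensity - mean intensity of its neighbors|.

    image: list of equal-length rows of numbers, at least 2 cells in total.
    """
    rows, cols = len(image), len(image[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            saliency[r][c] = abs(image[r][c] - sum(neighbors) / len(neighbors))
    return saliency

def most_salient(saliency):
    """Row and column of the peak of the map (the predicted fixation)."""
    cells = ((r, c) for r in range(len(saliency)) for c in range(len(saliency[0])))
    return max(cells, key=lambda rc: saliency[rc[0]][rc[1]])

# Hypothetical 5 x 5 "scene": uniform background with one bright spot.
scene = [[0.0] * 5 for _ in range(5)]
scene[1][3] = 1.0
print(most_salient(contrast_saliency(scene)))
```

Published models typically combine several feature channels (such as color, intensity, and orientation) at multiple spatial scales; the single-channel contrast used here only illustrates how a map of local differences yields a predicted fixation location.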
Research has shown that higher-level factors also have a strong influence on where viewers direct their gaze in a scene (Castelhano & Henderson, 2007a; Henderson & Castelhano, 2005; Henderson & Ferreira, 2004). For instance, see Figure 7.
Recently, Torralba, Oliva, Castelhano, and Henderson (2006) presented a computational model that incorporates the influence of top-down and cognitive strategies.
How much information is extracted from a single fixation on a scene? As noted at the beginning of this section, it is known that the extent of the visual field used to extract useful information is much larger in scene viewing than it is in reading. In an early study, Nelson and Loftus (1980) examined object recognition as a function of the closest fixation on that object. Results showed that objects located within about 2.6 degrees from fixation were generally recognized. The results also suggested that information acquired from the region 2-3 degrees around fixation is qualitatively different from information acquired from regions further away (see Henderson & Hollingworth, 1999; Henderson, Williams, Castelhano, & Falk, 2003).
The question of how large the perceptual span is during scene viewing hasn’t been answered as conclusively as it has in reading or visual search. It seems that objects can be located up to 4 degrees from the point of fixation and tagged for a saccade target, but it is not clear what the perceptual span is for other types of information (see Henderson & Ferreira, 2004, for a review). And yet, it does appear that viewers typically gain useful information from a fairly wide region of the scene and that it probably varies as a function of the scene and the task of the viewer.
Interestingly, viewers are rather insensitive to large changes in a scene (McConkie, 1991; Grimes & McConkie, 1995; Grimes, 1996). This phenomenon is referred to as Change Blindness. For instance, see Figure 8.
Research has found that where the eyes are placed plays a significant role in change blindness. Hollingworth and Henderson (2002) found that, when fixation location was taken into account, people's ability to detect a change was significantly higher when both the pre-change and post-change regions of a scene were fixated. This study highlighted the importance of encoding and retrieving specific details of the scene in order to be able to detect changes in these images.
When do viewers move their eyes when looking at scenes? Past studies have shown that attention precedes an eye movement to a new location within a scene (Henderson, 1992; van Diepen & d’Ydewalle, 2003). So, it would follow that the eyes will move once the visual information at the center of vision has been processed and a new fixation location has been selected and programmed (for a review, see Henderson, 2007).
Research also suggests that at the fovea information is extracted very rapidly, and attention is directed to the periphery almost immediately following the extraction of information (70-120 ms) to choose a viable saccade target. The general timing of the switch between central and peripheral information processing is currently being investigated; however, the inherent variability across scenes makes it difficult to find as specific a time frame as in reading.
General comments on eye movements
Although there are obviously many differences between these tasks, there are some general principles that are likely to hold across them (see also Rayner, 1995, 1998).
- How much information is processed on any fixation (the perceptual span or functional field of view) varies as a function of the task. The perceptual span is clearly smaller in reading than in either scene perception or visual search. Hence, for example, fixations in scene perception tend to be longer and saccades are longer because more information is being processed in a single fixation.
- The difficulty of the stimulus influences eye movements: in reading, when the text becomes more difficult, eye fixations get longer and saccades get shorter; likewise in scene perception and visual search, when the array is more crowded, cluttered, or dense, fixations get longer and saccades get shorter.
- The difficulty of the specific task (reading for comprehension versus reading for gist, searching for a person in a scene versus looking at the scene for a memory test, and so on) obviously influences how the eyes move.
- In all three tasks there is some evidence (Najemnik & Geisler, 2005; Rayner, 1998) that viewers integrate information poorly across fixations and it is more critical that information is processed efficiently on each fixation.
For further information on eye movements in visual cognition, we suggest:
- Findlay and Gilchrist (2003) “Active Vision -- The Psychology of Looking and Seeing”. Oxford: Oxford University Press.
- Henderson, J. M., & Ferreira, F. (2004). The interface of language, vision, and action: Eye movements and the visual world. New York: Psychology Press.
References
- Antes, J.R. (1974). The time course of picture viewing. Journal of Experimental Psychology, 103, 62-70.
- Baddeley, R. J., & Tatler, B. W. (2006). High frequency edges (but not contrast) predict where we fixate: a Bayesian system identification analysis. Vision Research, 46, 2824-2833.
- Becker, W., & Jürgens, R. (1979). Analysis of the saccadic system by means of double step stimuli. Vision Research, 19, 967-983.
- Bertera, J. H., & Rayner, K. (2000). Eye movements and the span of effective stimulus in visual search. Perception & Psychophysics, 62, 576-585.
- Buswell, G. T. (1935). How people look at pictures. Chicago: University of Chicago Press.
- Carroll, P.J., & Slowiaczek, M.L. (1986). Constraints on semantic priming in reading: A fixation time analysis. Memory & Cognition, 14, 509-522.
- Castelhano, M. S., & Henderson, J. M. (2007a). Initial scene representations facilitate eye movement guidance in visual search. Journal of Experimental Psychology: Human Perception and Performance, 33(4), 753-763.
- Castelhano, M. S., & Henderson, J. M. (2007b). The influence of color on perception of scene gist. Journal of Experimental Psychology: Human Perception and Performance, in press.
- De Graef, P. (2005). Semantic effects on object selection in real-world scene perception. In G. Underwood (ed), Cognitive processes in eye guidance. Oxford: Oxford University Press.
- Duffy, S. A., Morris, R. K., & Rayner, K. (1988). Lexical ambiguity and fixation times in reading. Journal of Memory and Language, 27, 429-446.
- Engbert, R., Nuthmann, A., Richter, E., & Kliegl, R. (2005). SWIFT: A dynamical model of saccade generation during reading. Psychological Review, 112, 777-813.
- Ehrlich, S. F., & Rayner, K. (1981). Contextual effects on word perception and eye movements during reading. Journal of Verbal Learning and Verbal Behavior, 20, 641-655.
- Greene, H. (2006). The control of fixation duration in visual search. Perception, 35, 303-315.
- Greene, H., & Rayner, K. (2001a). Eye movements and familiarity effects in visual search. Vision Research, 41, 3763-3773.
- Greene, H., & Rayner, K. (2001b). Eye-movement control in direction-coded visual search. Perception, 29, 363-372.
- Grimes, J. (1996). On the failure to detect changes in scenes across saccades. In K. Akins (Ed.), Vancouver studies in cognitive science: Vol. 5. Perception (pp. 89–110). New York: Oxford University Press.
- Grimes, J., & McConkie, G. (1995). On the insensitivity of the human visual system to image changes made during saccades. In K. Akins (Ed.), Problems in Perception. Oxford, UK: Oxford University Press.
- Henderson, J. M. (1992). Identifying objects across saccades: Effects of extrafoveal preview and flanker object context. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 521-530.
- Henderson, J.M. (2007). Regarding scenes. Current Directions in Psychological Science, 16(4), 219–222.
- Henderson, J.M., & Castelhano, M.S. (2005). Eye movements and visual memory for scenes. In G. Underwood (Ed.), Cognitive Processes in Eye Guidance (pp. 213-235). Oxford University Press.
- Henderson, J. M., & Hollingworth, A. (1999). The role of fixation position in detecting scene changes across saccades. Psychological Science, 10, 438-443.
- Henderson, J. M., & Ferreira, F. (2004). Scene perception for psycholinguists. In J. M. Henderson & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world (pp. 1-58). New York: Psychology Press.
- Henderson, J.M., Williams, C., Castelhano, M.S., & Falk, R. (2003).
- Inhoff, A. W., & Rayner, K. (1986). Parafoveal word processing during eye fixations in reading: Effects of word frequency. Perception & Psychophysics, 40, 431-439.
- Itti, L., & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40, 1489-1506.
- Itti, L., & Koch, C. (2001). Computational modeling of visual attention. Nature Reviews: Neuroscience, 2, 194-203.
- Juhasz, B.J., & Rayner, K. (2003). Investigating the effects of a set of intercorrelated variables on eye fixation durations in reading. Journal of Experimental Psychology: Learning, Memory & Cognition, 29, 1312-1318.
- Juhasz, B.J., & Rayner, K. (2006). The role of age-of-acquisition and word frequency in reading: Evidence from eye fixation durations. Visual Cognition, 13, 846-863.
- Liversedge, S.P., Rayner, K., White, S.J., Findlay, J.M., & McSorley, E. (2006). Binocular coordination of the eyes during reading. Current Biology, 16, 1726-1729.
- Liversedge, S.P., White, S.J., Findlay, J.M., & Rayner, K. (2006). Binocular coordination of eye movements during reading. Vision Research, 46, 2363-2374.
- Mackworth, N. H., & Morandi, A. J. (1967). The gaze selects informative details within pictures. Perception & Psychophysics, 2, 547-552.
- Mannan, S. K., Ruddock, K. H., & Wooding, D. S. (1995). Automatic control of saccadic eye movements made in visual inspection of briefly presented 2-D images. Spatial Vision, 9, 363-386.
- Mannan, S. K., Ruddock, K. H., & Wooding, D. S. (1996). The relationship between the locations of spatial features and those of fixation made during visual examination of briefly presented images. Spatial Vision, 10, 165-188.
- McConkie, G.W. (1991). Perceiving a stable visual world. In Proceedings of the Sixth European Conference on Eye Movements (pp. 5–7). Leuven, Belgium: Laboratory of Experimental Psychology.
- Morrison, R. E., & Rayner, K. (1981). Saccade size in reading depends upon character spaces and not visual angle. Perception & Psychophysics, 30, 395-396.
- Nelson, W.W., & Loftus, G.R. (1980). The functional visual field during picture viewing. Journal of Experimental Psychology: Human Learning and Memory, 6, 391-399.
- Nodine, C. F., Carmody, D. P., & Herman, E. (1979). Eye movements during visual search for artistically embedded targets. Bulletin of the Psychonomic Society, 13, 371-374.
- Oliva, A. (2005). Gist of the scene. In L. Itti, G. Rees, & J.K. Tsotsos (Eds.), The Encyclopedia of Neurobiology of Attention (pp. 251-256). San Diego: Elsevier.
- Parkhurst, D. J., & Niebur, E. (2003). Scene content selected by active vision. Spatial Vision, 16, 125–154.
- Parkhurst, D., Law, K., & Niebur, E. (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42, 107-123.
- Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3-25.
- Potter, M. (1999). Understanding Sentences and Scenes: The role of conceptual short-term memory. In V. Coltheart (Ed.), Fleeting Memories, (pp. 13-46). Boston: MIT Press.
- Pollatsek, A., Raney, G.E., LaGasse, L., & Rayner, K. (1993). The use of information below fixation in reading and visual search. Canadian Journal of Experimental Psychology, 47, 179-200.
- Rayner, K. (1978). Eye movements in reading and information processing. Psychological Bulletin, 85, 618-660.
- Rayner, K. (1995). Eye movements and cognitive processes in reading, visual search, and scene perception. In J. M. Findlay, R. Walker, & R.W. Kentridge (Eds.), Eye movement research: Mechanisms, processes and applications (pp. 3-22). Amsterdam: North Holland.
- Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372-422.
- Rayner, K., & Duffy, S.A. (1986). Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition, 14, 191-201.
- Rayner, K., & Pollatsek, A. (1989). The psychology of reading. Englewood Cliffs, NJ: Prentice Hall.
- Rayner, K., & Well, A. D. (1996). Effects of contextual constraint on eye movements in reading: A further examination. Psychonomic Bulletin & Review, 3, 504-509.
- Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a model of eye movement control in reading. Psychological Review, 105, 125-157.
- Reichle, E. D., Rayner, K., & Pollatsek, A. (2003). The E-Z Reader model of eye movement control in reading: Comparisons to other models. Behavioral and Brain Sciences, 26, 445-476.
- Sereno, S.C., O'Donnell, P.J., & Rayner, K. (2006). Eye movements and lexical ambiguity resolution: Investigating the subordinate bias effect. Journal of Experimental Psychology: Human Perception and Performance, 32, 335-350.
- Torralba, A., Oliva, A., Castelhano, M.S., & Henderson, J.M. (2006). Contextual guidance of attention in natural scenes. Psychological Review, 113, 766-786.
- van Diepen, P. M. J., & d'Ydewalle, G. (2003). Early peripheral and foveal processing in fixations during scene perception. Visual Cognition, 10, 79-100.
- Vlaskamp, B.N.S., & Hooge, I.T.C. (2006). Crowding degrades saccadic search performance. Vision Research, 46, 417-425.
- Wolfe, J.M. (2001). Guided Search 4.0: A guided search model that does not require memory for rejected distractors [Abstract]. Journal of Vision, 1(3), 349.
- Wolfe, J. M. (1994). Guided Search 2.0: A revised model of visual search. Psychonomic Bulletin and Review, 1(2), 202-238.
- Williams, R.S., & Morris, R.K. (2004). Eye movements, word familiarity, and vocabulary acquisition. European Journal of Cognitive Psychology, 16, 312-339.
- Yarbus, A. (1967). Eye movements and vision. New York: Plenum Press.
- Mark Aronoff (2007) Language. Scholarpedia, 2(5):3175.
- John Dowling (2007) Retina. Scholarpedia, 2(12):3487.
- Ernst Niebur (2007) Saliency map. Scholarpedia, 2(8):2675.