Charles Spence, Crossmodal Research Laboratory, Department of Experimental Psychology, Oxford University
CORRESPONDENCE TO: Professor Charles Spence, Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford, OX1 3UD, United Kingdom. Tel: +44-1865-271364; Fax: +44-1865-310447
Crossmodal Attention. Attention refers to those processes that allow for the selective processing of incoming sensory stimuli. Mechanisms of attention help us to prioritize those stimuli that are most relevant to achieving our current goals and/or to performing the task at hand. The term ‘attention’ is used to describe those processes that give rise to a temporary change (often enhancement) in signal processing. This change will often be manifest in only a subset of the stimuli presented at any time. Researchers have attempted to distinguish attentional effects from other temporary changes in the efficiency of information processing, such as those induced by changes in arousal and/or alerting. These latter processes can be contrasted with attention both on the grounds of their non-selectivity (i.e., increased arousal or alertness tends to influence the processing of all incoming stimuli; that is, its effects are stimulus non-specific), and behaviourally, by the fact that while alerting, for example, can speed up a person’s response it also tends to result in increased errors (i.e., perceptual sensitivity is not enhanced). (More recently, however, it has become somewhat more difficult to distinguish between attention, alerting, and arousal. This is both because researchers have started to argue that certain kinds of attention effect may only lead to a speeding of participants’ responses (i.e., without any concomitant change in perceptual sensitivity; see Prinzmetal et al., 2005a, b, 2009) and also because there is some evidence of the selectivity of alerting effects (e.g., in terms of the modality affected; e.g., Posner, 1978).)
Attention can be oriented either endogenously or exogenously: People orient their attention endogenously whenever they voluntarily choose to attend to something, such as when listening to a particular individual at a noisy cocktail party, say, or when concentrating on the texture of the object that they happen to be holding in their hands. By contrast, exogenous orienting occurs when a person’s attention is captured reflexively (i.e., involuntarily) by the sudden onset of an unexpected event, such as when someone calls our name at a noisy cocktail party, or when a mosquito suddenly lands on our arm. However, our attention can also be captured by intrinsically salient or biologically significant stimuli. Attended stimuli tend to be processed more thoroughly and more rapidly than other potentially distracting (‘unattended’) stimuli (Posner, 1978; Spence & Parise, 2010). Although attention research has traditionally considered selection among the competing inputs within just a single sensory modality at a time (most often vision; see Driver, 2001, for a review), the last couple of decades have seen a burgeoning of interest in the existence and nature of any crossmodal constraints on our ability to selectively attend to a particular sensory modality, spatial location, event, or object (Spence & Driver, 2004). In fact, crossmodal interactions in attention have now been demonstrated between most combinations of visual, auditory, tactile, olfactory, gustatory, and even painful stimuli (Calvert et al., 2004).
Attending to a sensory modality
One of the most fundamental questions in crossmodal attention research concerns the extent to which people can selectively direct their attention toward a particular sensory modality (audition, for example) at the expense of the processing of stimuli presented in the other modalities. Spence and his colleagues have conducted a number of studies showing that voluntarily (i.e., endogenously) attending to one sensory modality can result in the facilitation of people’s (speeded) spatial discrimination responses to stimuli presented in that modality when compared to situations in which their attention has been directed to another modality instead (Spence et al., 2000a, 2001a, 2002; see also Ashkenazi & Marks, 2004). The presentation of a non-predictive cue stimulus in a particular modality has also been shown to result in the short-lasting exogenous orienting of attention toward the modality in which the cue was presented (see Spence et al., 2001a; Turatto et al., 2002, 2004).
Interestingly, endogenously attending to a particular sensory modality does not always give rise to a particularly large facilitation of behavioural performance (e.g., Alais et al., 2006; Shiffrin & Grantham, 1974). For example, Alais et al. recently observed little decrement in perceptual sensitivity when the participants in their study monitored the stimuli presented in two sensory modalities (audition and vision) rather than just one (i.e., audition or vision). The visual task in Alais et al.’s study consisted of ‘low-level’ contrast discrimination, while the auditory task involved participants having to discriminate the pitch of target sounds (i.e., another low-level task). The results showed that auditory thresholds were slightly (but significantly) higher in the bimodal divided attention condition than in the unimodal focused attention condition. By contrast, visual thresholds were completely unaffected by whether performance was assessed in the focused or bimodal divided attention blocks.
In another study by Larsen et al. (2003), participants were briefly presented with pairs of degraded letters, one spoken and the other presented visually on a screen. In the focused attention condition, the participants only had to report the letter presented in one pre-specified modality (either audition or vision), while in the divided attention condition, they had to try and report which letter had been presented in each modality. Once again, no significant cost was associated with dividing attention (though it should be noted that only 8 participants were tested in this study and the main effect of focusing versus dividing attention was ‘borderline-significant’). What is more, the probability of correctly reporting the target in one modality was found to be independent of whether the target was reported correctly in the other modality or not. That said, performance did improve significantly if the same letter happened to be presented in both modalities.
The most parsimonious account of the large body of data published over the last century or so on the effects of attending to a modality is that: 1) Generally speaking, people find it easier to divide their attention between tasks presented in different modalities than between tasks presented within the same sensory modality (e.g., Hancock et al., 2007; Lavie, 2005; Proctor & Proctor, 1979; Sarter, 2007; Treisman & Davies, 1973; Wickens, 1992, 2008); 2) The costs associated with attending to the wrong sensory modality appear to be larger than the benefits associated with focusing attention on a particular modality, when compared to performance in a neutral baseline divided attention condition (see Alais et al., 2006; Spence et al., 2001a; see also Bonnel & Hafter, 1998); 3) Divided attention costs are more likely to be observed in identification/discrimination tasks than in tasks requiring simple detection (Bonnel & Hafter, 1998; Eijkman & Vendrik, 1965; Hein et al., 2006); 4) Tasks requiring speeded responding by participants are likely to show larger effects of attending to a sensory modality than unspeeded responding tasks (Spence et al., 2001a; though see also Spence & Parise, 2010); 5) Crossmodal attentional effects are more likely to be observed when participants have to respond on the basis of higher-level stimulus attributes (such as, for example, relating to a target’s semantic identity; e.g., Treisman & Davies, 1973; though see also Larsen et al., 2003) or amodally-defined spatial location (Soto-Faraco et al., 2002) than when responding to lower-level (and likely more modality-specific) stimulus attributes such as colour or pitch; and 6) The costs of divided attention tend to be somewhat more apparent for auditory than for visual targets (e.g., Alais et al., 2006; Hein et al., 2006; see also Proctor & Proctor, 1979).
Over the years, a number of studies utilizing a range of different behavioural tasks/paradigms have delivered results that fit broadly with the conclusions outlined above: For example, it appears that temporal attentional processing deficits associated with trying to process more than one near-simultaneously-presented stimulus, as indexed by phenomena such as the attentional blink (AB; e.g., Arnell, 2006; Arnell & Jenkins, 2004; Duncan et al., 1996; Hein et al., 2006, 2007; Jolicoeur, 1999; Soto-Faraco et al., 2002; Soto-Faraco & Spence, 2002; Van der Burg et al., 2007), inattentional blindness (IB; Sinnett et al., 2006), or repetition blindness/deafness (RB/RD; Soto-Faraco & Spence, 2002), are much more severe within a particular sensory modality than between modalities. Similarly, a number of studies of perceptual load (see Lavie, 1995, 2005) also support the view that perceptual resources are more limited within a modality than between modalities (Otten et al., 2000; Rees et al., 2001; Tellinghuisen & Nowak, 2003). Interestingly, however, other dual-task processing deficits, such as the psychological refractory period (PRP), appear to be just as large no matter whether the sequentially-presented target stimuli are presented in the same modality or in different sensory modalities (e.g., Pashler, 1994; Spence, 2008).
What, then, of the neural correlates of focusing attention on a particular sensory modality? A growing number of neuroimaging studies have started to highlight the neural consequences of (or mechanisms underlying) the effects of focusing attention on a particular sensory modality (e.g., Johnson & Zatorre, 2005, 2006; Kawashima et al., 1995, 1999; Macaluso et al., 2002a; Roland, 1982). These studies have highlighted the increased neural activity that is seen in those cortical areas traditionally associated with the attended modality (though see Ghazanfar & Schroeder, 2006) and a suppression of neural activity in those cortical areas associated with the competing or ignored sensory modality (or modalities). For example, when attention is drawn away from an auditory event by the presence of a visual stimulus, and particularly by attending to vision (i.e., when performing a visual task, as compared with a non-competitive baseline condition), auditory cortex, especially secondary auditory cortical areas, shows decreased activity in response to auditory stimuli (Johnson & Zatorre, 2005; Laurienti et al., 2002; Shomstein & Yantis, 2004). The available evidence suggests that the extent of this suppression may well depend upon the difficulty of the task at hand (Hairston et al., 2008). By analyzing the functional connectivity between the auditory and visual cortical areas in participants performing auditory and visual tasks, Johnson and Zatorre (2005, 2006) have been able to highlight a reciprocal inverse relationship, with decreasing visual activation correlating with increased auditory activation and vice versa (see also Just et al., 2008).
Macaluso et al. (2002a) conducted an influential positron emission tomography (PET) study of the neural correlates associated with attending to either vision or touch. The participants in their study were presented with visual and tactile stimuli simultaneously on both sides (i.e., four stimuli were presented at once, two on either side). The stimuli consisted of both single and double pulses, and the participants’ task was to respond verbally whenever a target (a double pulse) was presented in the attended modality. When the participants focused their attention on the tactile modality, activation was observed in the parietal operculi. Meanwhile, activation was seen in the posterior parietal and superior premotor cortices when attention was focused on the visual modality instead. Other researchers have shown that attending to tactile stimuli can also give rise to an increase in activity in primary and secondary somatosensory cortex (e.g., Burton et al., 1999; Chapman & Meftah, 2005; Sterr et al., 2007).
Crossmodal attention and visual dominance
Over the years, it has frequently been claimed that humans preferentially direct their attentional resources toward the visual modality (e.g., see Posner et al., 1976; Spence et al., 2001b). Evidence in support of this claim has come from a number of different sources: One intriguing example that, it has been argued, supports an attentional account of visual dominance comes from research on the Colavita effect (see Colavita, 1974; see Spence, 2009, for a review). In his now-classic study, Colavita reported that while people find it easy to make speeded modality discrimination/detection responses to auditory and visual stimuli when they are presented in isolation, they often fail to respond to auditory stimuli when they are presented at the same time as visual targets (see also Hecht & Reiner, 2009). In an influential review paper, Posner et al. (1976) argued that the Colavita effect was one of a range of laboratory phenomena illustrating visual dominance that could be explained in terms of participants having a tendency to direct their attention toward the visual modality, perhaps to make up for the inferior alerting properties of visual stimuli. However, recent empirical research has shown that while attentional manipulations may, on occasion, be successful in eliminating vision’s dominance over audition in the Colavita effect (see Koppen & Spence, 2007a, b; Sinnett et al., 2007b), they cannot be used to reverse it (i.e., to show auditory dominance, as would be predicted according to the attentional account of the phenomenon; see Spence et al., in press b, for a review).
There are now many examples in the literature demonstrating vision’s dominance over audition and touch/proprioception. So, for example, vision typically dominates over audition in localization judgments, as highlighted by evidence from studies of the ventriloquism effect (see Bertelson & Aschersleben, 1998; see Bertelson & de Gelder, 2004, for a review). Vision has also been shown to modulate auditory perception in the McGurk effect (see Alsius et al., 2005; McGurk & MacDonald, 1976). Similarly, vision dominates over touch and proprioception in determining where people feel their arm to be in the rubber hand illusion (Botvinick & Cohen, 1998; Ehrsson et al., 2004). Audition has, however, been shown to dominate over (or modulate) vision in several other tasks (e.g., see Sekuler et al., 1997; Shams et al., 2000; Shipley, 1964; Welch et al., 1986). As a general rule, spatial tasks typically result in visual dominance whereas temporal tasks more often result in auditory dominance.
Over the last few years, maximum likelihood estimation (MLE) has been shown to provide an impressive account of many of the findings from sensory dominance research. According to Ernst and Banks (2002), which sense dominates in any given situation depends on the variance associated with each perceptual estimate. They suggest that such perceptual estimates may be ‘optimal’ in the sense that each modality’s estimate is weighted by its reliability (i.e., the inverse of its variance). Thus, our brains appear to integrate noisy sensory inputs such that the variance associated with the multisensory estimate is maximally reduced. Since Ernst and Banks’ original paper was published, several other research groups have successfully used MLE in order to model the sensory dominance observed in a number of other behavioural paradigms (see e.g., Alais & Burr, 2004; Beierholm et al., 2005; Gori et al., 2008; Morgan et al., 2008; van Beers et al., 2002).
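The reliability-weighted combination rule described above can be sketched in a few lines. This is an illustrative sketch only, not code from Ernst and Banks (2002): the function name combine_mle and the example numbers are assumptions introduced here, and the rule as written assumes independent, unbiased, Gaussian-distributed noise in each modality.

```python
from typing import Sequence, Tuple


def combine_mle(estimates: Sequence[float],
                variances: Sequence[float]) -> Tuple[float, float]:
    """Combine unimodal estimates by inverse-variance (reliability) weighting.

    Assumes independent, unbiased, Gaussian noise in each modality.
    Returns the combined estimate and the variance of that estimate.
    """
    if len(estimates) != len(variances) or len(estimates) == 0:
        raise ValueError("need exactly one variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # Reliability of each cue is the inverse of its variance.
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    # Each cue's weight is its share of the total reliability.
    combined = sum(r * e for r, e in zip(reliabilities, estimates)) / total
    # The combined variance is never larger than the smallest input variance.
    return combined, 1.0 / total


# Hypothetical numbers: vision locates an event at 0 deg (variance 1),
# audition locates the same event at 10 deg (variance 4).
location, variance = combine_mle([0.0, 10.0], [1.0, 4.0])
print(location, variance)
```

With these made-up values, the combined estimate lies closer to the more reliable (visual) estimate, and its variance is lower than that of either input. Increasing the visual variance (e.g., by degrading the visual stimulus) would shift the weighting toward audition, which is how this scheme accommodates both visual and auditory dominance.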
In Ernst and Banks’s (2002) study and in many other early studies of sensory dominance, only a single stimulus was presented in each sensory modality and hence there was essentially no binding problem (i.e., deciding which stimulus from each modality should be bound together). However, given that we typically operate in more complex multi-stimulus environments, the binding problem becomes more apparent, and this has led researchers in subsequent studies to move to a Bayesian decision theory account of sensory dominance and multisensory integration, i.e., an account that incorporates both priors and likelihood functions (e.g., Ernst & Bülthoff, 2004; Roach et al., 2006; Wozny et al., 2008). Researchers have, for example, argued that we likely have a prior to bind stimuli that happen to be presented from more-or-less the same spatial location (Gepshtein et al., 2005; though see also Helbig & Ernst, 2007). Given the excellent job that Bayesian decision theory does in accounting for the literature on sensory dominance, the question arises as to whether there is any role left for attention. At present, the answer isn’t altogether clear, with some researchers arguing that there may be (e.g., Battaglia et al., 2003; though see Witten & Knudsen, 2005), while others have found no effect of attentional manipulations (using a dual-task manipulation to reduce the perceptual resources available in one modality) on the relative weightings of the unimodal inputs (Helbig & Ernst, 2008). The latter results support an ‘early’, rather than a ‘late’, model of integration (i.e., whereby cue combination occurs prior to the effects of attention; see also Andersen et al., 2005).
People can orient their spatial attention either overtly or covertly: Overt orienting occurs when we shift our eyes, head, hands, and/or tongue in order to more efficiently process a given environmental stimulus. By contrast, covert orienting occurs when we shift our attention without making any overt movements. Covert orienting is currently of most interest to cognitive neuroscientists studying crossmodal selective attention. It should, though, be noted that many of the same neural structures control both types of attentional orienting, and indeed, covert orienting has often been shown to occur as a precursor to overt orienting (e.g., Rorden & Driver, 1999; Rorden et al., 2002). The majority of studies of crossmodal spatial attention published to date have adapted the spatial cuing paradigm first popularized by Posner back in the late 1970s (e.g., Posner, 1978; see Wright & Ward, 2008, for a recent review).
In terms of crossmodal links in spatial attention, the term crossmodal is typically used to refer to those situations in which the orienting of a person’s spatial attention in one sensory modality (such as vision) results in a concomitant shift of attention in one or more of their other sensory modalities (such as audition or touch) to the same location (or object; Turatto et al., 2005) at the same time. The central question for cognitive neuroscientists interested in crossmodal attention research concerns how the brain’s attentional resources are coordinated, or linked, between the various spatial senses (i.e., vision, audition, and touch). How is it, for example, that people can select just that subset of information that is relevant to their current goals from amongst the abundance of multisensory information impinging on the various sensory receptors at any one time?
Most researchers fall into one of four camps regarding the nature (and even the very existence) of crossmodal links in spatial attention. According to the modality-specific attentional resources account (see Hancock et al., 2007; Sarter, 2007; Wickens, 1992, 2008), there are relatively independent visual, auditory, and tactile attentional systems in the human brain (see Figure 1A). According to this account (which posits that crossmodal links in spatial attention do not exist), people should be able to direct their visual attention to one location (as indicated schematically by the arrow in the figure) while at the same time directing their auditory or tactile attention in different directions (since the attentional systems are independent). Other cognitive neuroscientists, meanwhile, have argued that there is a single supramodal attentional system in the human brain, such that people can only attend to a single location (or object) at any given time. The claim here is that people simply cannot ‘split’ their attention between different locations simultaneously (see Figure 1B). According to the supramodal account, all stimuli, no matter what their modality, that are presented from a location that is attended should receive preferential processing relative to stimuli that are presented elsewhere.
A third possibility is that there might be some intermediate form of organization of attentional resources instead. So, for example, according to Spence and Driver’s (1996) ‘separate-but-linked’ hypothesis (see Driver & Spence, 2004, for a review), there may be separate auditory, visual, and tactile attentional systems at the earliest levels of information processing. However, these attentional systems are subsequently linked, such that people’s attention is typically focused on the same region of space in the different modalities, but importantly does not always have to be (see Figure 1C). Posner (1990) has also proposed a somewhat different hybrid attentional system, one involving interconnected modality-specific, and supramodal, attentional systems (see Figure 1D).
Of course, any one of these models of crossmodal attention can be combined with the evidence concerning people’s ability to ‘selectively attend to a modality’ (discussed above). Accordingly, the operation of any one of the models highlighted in Figure 1 can presumably be modified by the biasing of a person’s attention toward, or away from, a particular sensory modality (or modalities; see Spence et al., 2001a). Given such a possibility, it becomes all the more difficult to discriminate between these various models of crossmodal attention on the basis of behavioural data alone. This is one of the principal reasons why the various techniques of cognitive neuroscience, such as functional magnetic resonance imaging (fMRI) and transcranial magnetic stimulation (TMS), are increasingly being brought to bear on this topic (e.g., Chambers et al., 2004, 2007; Kida et al., 2007; Macaluso et al., 2000a, b, 2002a, b).
So what of the empirical evidence? Well, it has been shown that if people deliberately choose to direct their spatial attention to a particular location in one sensory modality, their endogenous attention in the other modalities will tend to follow to the same location, albeit at a somewhat reduced level (that is, the attentional benefits will be smaller; see Driver & Spence, 2004, for a review). So, for example, if participants are instructed to attend to their left hand because a tactile target is more likely to be presented from that location (than from the other hand), visual targets will also be responded to preferentially when they are presented near the left hand (rather than near the other hand). Eimer and his colleagues have shown that these attentional effects are location- (rather than hemispace-) specific, and, what is more, the spatial focus of attention in the secondary modality is exactly the same as in the primarily attended modality (see Eimer & van Velzen, 2005; Eimer et al., 2004). Evidence in support of the ‘separate-but-linked’ hypothesis of crossmodal links in endogenous spatial attention in humans comes from the results of studies showing that although they find it difficult, people can nevertheless still direct their spatial attention in different directions in different modalities at the same time (albeit with reduced efficiency). So, for example, under the appropriate experimental conditions, people can preferentially process visual stimuli presented near their left hand while simultaneously showing a small but significant attentional benefit when responding to tactile stimuli presented to their right hand.
Such results are inconsistent with both the modality-specific and supramodal accounts of crossmodal spatial attention (see Figure 1), and hence have been taken to provide support for one of the hybrid models, such as the separate-but-linked account (see also Eimer, 2004; Chambers et al., 2004; Driver & Spence, 2004; Kida et al., 2007; Santangelo, Fagioli, & Macaluso, 2010).
But what of the crossmodal links that constrain the deployment of exogenous spatial attention? In a typical exogenous orienting study, a spatially-nonpredictive cue is presented shortly before (typically within 300 ms) a target appears on either the same or opposite side. The target may be presented either in the same sensory modality as the cue or in a different one. Importantly, however, the target is just as likely to be presented on the same side as the cue as on the opposite side. Even though participants are often explicitly instructed to try to ignore the cue as much as possible (since it provides no useful information with regard to the likely location of the target), the results of innumerable studies have now shown that participants typically cannot ignore the cue (even after thousands of trials). Instead, they tend to respond more rapidly (and/or accurately) to targets presented at the cued, as opposed to the uncued, location, at least when the target is presented within a few hundred milliseconds of the cue. These crossmodal attention effects occur even under conditions where the modality of the cue is completely irrelevant to the participant’s task (i.e., when auditory cues are presented prior to participants performing a unimodal visual task; see Spence, 2001). Studies of exogenous crossmodal spatial attention have demonstrated that perceptual sensitivity is also enhanced at the cued location (McDonald et al., 2000; though see Prinzmetal et al., 2005a, b, 2009). What is more, people tend to become aware of stimuli presented at the cued location sooner than when the same stimuli are presented elsewhere, a phenomenon known as ‘prior entry’ (McDonald et al., 2005; see Spence, in press a; Spence & Parise, 2010, for reviews).
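The two behavioural measures mentioned above (the response-time benefit at the cued location and perceptual sensitivity) are conventionally summarized as a cuing effect and as the signal detection index d'. The following is a minimal sketch of how they might be computed; the function names and all of the numbers are hypothetical and are not taken from any of the studies cited.

```python
from statistics import NormalDist, mean
from typing import Sequence


def cuing_effect_ms(cued_rts: Sequence[float],
                    uncued_rts: Sequence[float]) -> float:
    """Mean RT at the uncued location minus mean RT at the cued location.

    Positive values indicate faster responses (facilitation) at the cued
    location; negative values indicate slower responses there.
    """
    return mean(uncued_rts) - mean(cued_rts)


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Signal detection sensitivity: z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical response times (ms) from one participant.
cued = [400.0, 420.0, 410.0]
uncued = [440.0, 450.0, 430.0]
print(cuing_effect_ms(cued, uncued))

# Hypothetical hit and false-alarm rates at the cued and uncued locations.
print(d_prime(0.80, 0.20), d_prime(0.70, 0.30))
```

Separating the two measures matters because a response-time benefit on its own could reflect a change in response criterion rather than in perception, whereas a higher d' at the cued location indicates a genuine change in perceptual sensitivity.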
The presentation of auditory, tactile, or visual spatially-nonpredictive cues results in a rapid exogenous shift of spatial attention to the cued location. This attentional shift facilitates the subsequent processing of auditory, tactile, and visual targets at that location. While early studies of crossmodal exogenous orienting tended to present the stimuli from only one cue and target location on either side of a central fixation point, more recent studies have demonstrated that crossmodal exogenous spatial orienting effects can actually be quite spatially-specific (e.g., Gray et al., 2009; see Spence et al., 2004, for a review). In fact, it would appear that the spatial distribution of a person’s exogenous attention following the presentation of a peripheral cue depends on the modality of the cue, with visual cues typically giving rise to more spatially-focused attentional effects than tactile cues, which in turn seem to cue attention more narrowly than do auditory cues (see Gray et al., 2009; Spence et al., 2004). What is more, the narrow focus of spatial attention that is typically seen following visual cuing helps to explain why visual cues have not always been shown to influence auditory performance (especially when participants have had to make auditory elevation discrimination responses; see Prime et al., 2008; Spence, 2001).
Traditionally, it was thought that exogenous spatial cuing effects were automatic (e.g., see Spence, 2001). However, the latest research has highlighted the fact that both intramodal and crossmodal cuing effects may be effectively eliminated under conditions where the participant is engaged in another attentionally-demanding perceptual task at the same time. This is true regardless of whether that task (i.e., the target stimulus) is presented in the same modality as the cue or in a different one (Spence & Santangelo, 2009). Interestingly, multisensory cues (e.g., audiovisual or audiotactile) seem to be capable of capturing a person’s spatial attention no matter what else they may be doing at the same time, i.e., no matter how high the perceptual load (see Spence, in press; Spence & Santangelo, 2009, for reviews).
Researchers are currently debating the extent to which such facilitatory crossmodal effects should be considered in terms of crossmodal links in exogenous spatial attention versus in terms of the results of multisensory integration, as popularized by the work of Stein and his colleagues at the single cell level (see Bolognini et al., 2005; Macaluso et al., 2000a, 2001; McDonald et al., 2001; Spence et al., 2004). One way to tease these two explanations apart in the future may be in terms of relative stimulus timing: Crossmodal exogenous attentional effects should peak when the onset of the cue precedes that of the target by up to 100-200 ms, whereas multisensory integration effects should be maximal when the cue and target are presented at around the same time (see King & Palmer, 1985; Meredith et al., 1987; Shore et al., 2006; Spence et al., 2004).
As the interval between the onset of a spatially-nonpredictive peripheral cue and target lengthens beyond around 300-400 ms, participants may start to respond more slowly to targets at the cued (as compared to the uncued) location. This phenomenon, known as ‘inhibition of return’ (IOR; Posner & Cohen, 1984), is typically reported in speeded simple detection studies, but has, on occasion, also been observed in speeded discrimination tasks (see Klein, 2000; Lupiáñez, in press). Spence et al. (2000b) have demonstrated that IOR occurs between all possible combinations of visual, auditory, and tactile stimuli. In their study, a random sequence of auditory, visual, and tactile stimuli was presented to either side of fixation, with each target requiring a speeded simple detection response. Participants’ responses to targets in all three modalities were slowed significantly when the target on the preceding trial had been presented on the same rather than on the opposite side. This pattern of results suggests that IOR is a supramodal phenomenon. Many researchers believe that IOR may reflect an inhibitory tag attached to a location following the exogenous orienting of attention to a location where no target (i.e., event of interest) was found (Klein, 2000).
Maintaining crossmodal correspondence following posture change
Having demonstrated the existence of crossmodal links in both endogenous and exogenous spatial attention, and in IOR, between all possible combinations of auditory, visual, and tactile stimuli, one of the most important issues currently facing crossmodal attention researchers concerns how (and even whether) the brain updates the mapping (or correspondence) between the senses when people change their posture. Note that each of our senses initially codes information according to a different frame of reference: So, for example, at the earliest stages of information processing, visual stimuli are coded retinotopically, auditory stimuli tonotopically, and tactile stimuli somatotopically. The question therefore arises as to how the various cues processed by our different senses are coordinated into a common frame of reference for the control of attention, and subsequently, action (see Pöppel, 1979; Spence & Driver, 2004).
In order to investigate whether crossmodal links in spatial attention are updated following posture change, researchers typically conduct experiments in which participants have to cross their hands over the midline (e.g., so that their left hand lies in the right side of space and their right hand is positioned on the left side), or else deviate their gaze (to either the left or right) while keeping their head fixed straight ahead (Azañón & Soto-Faraco, 2008; Ferlazzo et al., 2002; Spence & Driver, 2004; Spence et al., 2008). These postural manipulations are designed to vary the mapping between the senses. The results of several such studies have now demonstrated that crossmodal links in spatial attention (both endogenous and exogenous) are updated following such changes in posture. So, for example, people find it easier to attend to tactile stimuli presented to their left hand and to visual stimuli on their left side when their hands are placed in an uncrossed posture (so that their left hand is on the left side of their body, and their right hand on the right). By contrast, they find it easier to concentrate on their left hand and on visual stimuli on their right side when their hands are crossed over the midline (such that their left hand now lies in the right hemispace; see Driver & Spence, 2004; Spence et al., 2008, for reviews). Similarly, Spence et al. (2004) have also demonstrated that auditory cues exogenously draw people’s visual attention to more-or-less their correct external location, regardless of the position of their eyes with respect to the head. A similar finding has also been observed when a tactile stimulus is presented to one of a participant’s hands while the hands are held in a crossed posture (see Kennett et al., 2001, 2002). Results such as these have led many researchers to conclude that the ‘space’ in which attention is directed is itself a multisensory construct (see Spence & Driver, 2004).
At present, though, it is not so clear whether IOR also updates for any changes in posture (Driver & Spence, 1998b; Röder et al., 2002).
It is, though, worth bearing in mind that in many of these studies the participants were required to maintain their deviated gaze or crossed hand posture for several minutes at a time. Thus, such results do not necessarily tell us anything about how quickly spatial remapping occurs. Research that is more directly relevant to the question of how quickly the position of a stimulus is remapped (or correctly spatially coded) comes from a study reported by Azañón and Soto-Faraco (2008). They used a version of the orthogonal spatial cuing paradigm in which the participants had to discriminate the elevation of visual targets presented shortly after a tactile cue that was presented to either hand. Their results showed that when the participants’ hands were placed in a crossed posture, the remapping of tactile stimuli into external coordinates took time. Thus, depending on the time at which attention is probed, perceptual facilitation/performance can be shown to be governed by different reference frames: Somatotopic at very short cue-target onset asynchronies (of up to approximately 100 ms), and spatiotopic (meaning that tactile stimuli are referred to their correct external location) at longer intervals.
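The time course described above can be summarised as a simple rule. The following Python sketch is purely illustrative and is not the analysis used by Azañón and Soto-Faraco (2008): the function name, the argument names, and the treatment of the approximately 100 ms boundary as a sharp threshold are all assumptions made here for the sake of the example.

```python
def facilitated_side(cued_hand, posture, soa_ms, remap_threshold_ms=100.0):
    """Return the side of external space ('left' or 'right') at which a
    visual target would be expected to benefit from a tactile cue.

    remap_threshold_ms is a hypothetical hard cut-off standing in for the
    'approximately 100 ms' boundary mentioned in the text.
    """
    if cued_hand not in ("left", "right"):
        raise ValueError("cued_hand must be 'left' or 'right'")
    if posture not in ("uncrossed", "crossed"):
        raise ValueError("posture must be 'uncrossed' or 'crossed'")

    opposite = {"left": "right", "right": "left"}

    # Somatotopic frame: the cue is coded by the anatomical side of the hand.
    somatotopic_side = cued_hand

    # Spatiotopic frame: the cue is coded by where the hand lies in external space.
    spatiotopic_side = cued_hand if posture == "uncrossed" else opposite[cued_hand]

    # Short cue-target asynchronies follow the somatotopic frame,
    # longer ones follow the remapped (external) frame.
    if soa_ms <= remap_threshold_ms:
        return somatotopic_side
    return spatiotopic_side


# With crossed hands, a cue to the left hand is predicted to help
# left-side visual targets early on and right-side targets later.
early = facilitated_side("left", "crossed", soa_ms=60)
late = facilitated_side("left", "crossed", soa_ms=360)
```

With uncrossed hands the two reference frames agree, so the predicted side does not depend on the asynchrony; it is only in the crossed posture that the two frames make opposite predictions.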
Neural underpinnings of crossmodal spatial attention
One of the most heavily investigated topics currently in crossmodal attention research concerns how (and where) such crossmodal links in spatial attention are mediated in the human brain (see Spence & Driver, 2004; Wright & Ward, 2008). One popular suggestion has been that multisensory maps, such as those found in the superior colliculus (SC; where spatially aligned maps of visual, auditory, and tactile space lie, one superimposed on top of the other), might mediate at least some of the spatial attentional cuing effects that have been observed behaviourally in the laboratory. (The SC is a small sub-cortical brain structure that controls the overt orienting of the oculomotor system (see Stein & Meredith, 1993).) However, there is currently much debate over the extent to which exogenous crossmodal spatial attention effects (typically observed by cognitive psychologists in awake human participants) and multisensory integration effects (traditionally observed at the single cell level in the superior colliculus of anaesthetized animals by neurophysiologists) actually represent the same underlying neural phenomenon (see above). That is, researchers are still trying to ‘bridge the gap’ between the different cognitive neuroscience methods and the different levels of analysis at which crossmodal spatial attention and multisensory integration are studied (see Macaluso et al., 2000a, 2001; McDonald et al., 2001).
However, over the last decade or so, numerous studies using a variety of cognitive neuroscience techniques have started to uncover some of the most important brain areas involved in the crossmodal orienting of spatial attention. So, for example, Macaluso et al. (2000b) conducted an influential PET study in which streams of tactile or visual stimuli were presented bilaterally. They demonstrated that the effects of endogenous (or sustained) spatial attention to one or other side had both modality-specific and multisensory consequences. In particular, unimodal spatial effects were observed in modality-specific areas of the brain, such as the superior occipital gyrus (for vision) and the superior postcentral gyrus (for touch). Meanwhile, multisensory spatial effects (i.e., neural activations that were seen regardless of which modality was being attended) were demonstrated in the intraparietal sulcus (a polysensory association area) and occipitotemporal junction. Interestingly, sustained tactile spatial attention to one side versus the other has also been shown to result in a modulation of activity in contralateral visual cortical areas (see Macaluso et al., 2002b). Similar effects, although involving a higher level of activation, were seen when the participants’ visual attention was directed toward one side or the other.
Macaluso et al. (2000a) used fMRI to investigate crossmodal links in spatial attention between vision and touch. The participants in their study had to make simple speeded detection responses to visual targets that were presented randomly from either side of fixation. On half of the trials, a tactile stimulus was presented to the participant’s right hand at the same time as the visual target (note that this aspect of the experimental design has subsequently caused some controversy as to whether this study was really investigating crossmodal attention or, rather, multisensory integration; see Macaluso et al., 2001; McDonald et al., 2001). The results showed greater activation of the visual cortex (lingual gyrus) on trials where the visual and tactile stimuli were presented from the same position. An analysis of effective connectivity suggested that the influence of tactile cues/stimuli on unimodal visual cortex was mediated by back-projections from multimodal inferior parietal areas (such as the supramarginal gyrus).
One idea that is currently popular in this area of research is that multisensory influences on ‘unimodal’ brain areas might arise as a result of feedback or back-projection influences upon them, from multisensory convergence zones and/or attentional control structures (Driver & Noesselt, 2008; Driver & Spence, 1998a; Macaluso & Driver, 2005; Stein & Stanford, 2008; Wright & Ward, 2008). However, evidence is also emerging of the importance of direct connections between primary sensory cortices and of rapid feed-forward integration (see Cappe et al., 2009; Falchier et al., in press).
It is, however, important to note here that many cognitive neuroscientists (e.g., Eimer & Van Velzen, 2002; Macaluso et al., 2002b) have perhaps been a little too hasty to jump from such findings to the conclusion that shifts of spatial attention are necessarily controlled by a supramodal attentional (or spatial representational) system in the human brain. As pointed out by Chambers et al. (2004), such neuroimaging results (i.e., demonstrating that the activation of a particular brain area occurs no matter which modality is attended) do not in and of themselves provide unequivocal support for supramodal control of spatial attention. Instead, such findings are equally consistent with the hypothesis that spatial attentional orienting may be mediated by modality-specific processes that are activated in synchrony but, crucially, may be anatomically independent of one another.
Chambers et al. (2004) were able to address this potential criticism of the neuroimaging data in a transcranial magnetic stimulation (TMS) study. The participants in their study endogenously directed their attention to the left or right in order to make a speeded elevation discrimination response to vibrotactile targets presented to either the thumb or index finger. The side on which the target would be presented was predicted with a validity of 75% by a central visual arrow cue. Applying TMS to the right hemisphere for 300 ms during the presentation of the visual cue impaired endogenous orienting to visual but not somatosensory events. In particular, applying TMS over the supramarginal gyrus in the right inferior parietal lobe resulted in a significant reduction in spatial cuing effects for visual targets while leaving tactile attention unaffected. Such results provide more compelling evidence in support of the view that this part of the brain has a modality-specific role in the spatial orienting of visual but not tactile attention. They are clearly inconsistent with the existence of a single supramodal attentional system in the human brain and are instead more consistent with some degree of modality-specificity of attentional processing within parietal cortex. It should, though, be noted that while Macaluso et al. (2002b) looked at the neural correlates of sustained attention to one side or the other in their PET study, Chambers et al.’s study, using a trial-by-trial cuing design, was more concerned with the neural mechanisms underlying the shifting of spatial attention.
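The size of a spatial cuing effect in trial-by-trial cuing designs of this kind is conventionally expressed as the difference in mean performance between invalidly and validly cued trials. The Python sketch below illustrates the arithmetic for reaction times under a 75% cue validity. It is a minimal illustration only: the function names and the reaction-time values are hypothetical and are not taken from Chambers et al. (2004).

```python
from statistics import mean


def make_cue_schedule(n_trials, validity=0.75):
    """Return a list of booleans (True = validly cued trial) in which the
    requested proportion of trials is valid. In a real experiment the
    order would be shuffled; it is left unshuffled here for clarity."""
    n_valid = round(n_trials * validity)
    return [True] * n_valid + [False] * (n_trials - n_valid)


def cuing_effect(rts_ms, valid_flags):
    """Spatial cuing effect: mean RT on invalid trials minus mean RT on
    valid trials. Positive values indicate a benefit of valid cuing."""
    valid = [rt for rt, is_valid in zip(rts_ms, valid_flags) if is_valid]
    invalid = [rt for rt, is_valid in zip(rts_ms, valid_flags) if not is_valid]
    if not valid or not invalid:
        raise ValueError("need at least one valid and one invalid trial")
    return mean(invalid) - mean(valid)


# Hypothetical reaction times (ms) for eight trials: six valid, two invalid.
schedule = make_cue_schedule(8)
rts = [500, 510, 490, 505, 495, 500, 540, 560]
effect = cuing_effect(rts, schedule)  # a 50 ms cuing effect for these made-up values
```

A reduction in spatial cuing effects, such as that reported following TMS, would correspond to this difference shrinking toward zero in the stimulated condition relative to a control condition.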
Chambers et al. (2007) have gone on in subsequent research to investigate the neural substrates of crossmodal links in exogenous spatial attention as well. The participants in their study were presented with a spatially-nonpredictive lateralized visual or tactile cue shortly before a visual or tactile target, once again requiring an elevation discrimination response (i.e., an orthogonal cuing design was used). TMS delivered synchronously with the tactile (but not visual) cue over right supramarginal gyrus and angular gyrus reduced the magnitude of the spatial cuing effects to the targets presented in both modalities. These latter results therefore suggest that the inferior parietal cortex mediates exogenous shifts of crossmodal (and, for that matter, intramodal) spatial attention.
Thus, taken together, the available evidence is currently consistent with the view that while exogenous spatial attention (and IOR) may be controlled by a supramodal orienting system, endogenous spatial attention is controlled by a system that approximates to the separate-but-linked account originally put forward by Spence and Driver (1996). Prinzmetal and colleagues (e.g., Prinzmetal et al., 2005a, b, 2009) have argued that exogenous and endogenous spatial attention may operate in importantly different ways, such that while endogenous orienting enhances the perceptual representation of an attended stimulus, exogenous orienting simply increases the likelihood that a participant will respond first to the stimulus presented in one location rather than another. Hence, whereas endogenous orienting is more likely to affect the accuracy of a participant’s responses, exogenous orienting is more likely to influence the speed of a participant’s responses instead.
Applying crossmodal attention research
It seems likely that in the years to come, our growing understanding of the nature of the crossmodal links that constrain the deployment of spatial attention will increasingly help researchers to provide guidelines with which to facilitate the effective design of multimodal (or multisensory) user interfaces (Ferris & Sarter, 2008; Sarter, 2007; Spence & Ho, 2008). For example, research in the field of applied cognitive psychology has already shown that people find it particularly difficult to hold a conversation on a mobile phone, while simultaneously driving a car (Spence & Read, 2003; see Ho & Spence, 2008, chapter 2, for a review). One of the major problems in this multisensory dual-task situation (a problem, note, which is not anticipated by the modality-specific resources account of crossmodal attention outlined earlier) may be that people find it difficult to attend visually out of the windscreen to watch the road ahead, while simultaneously trying to listen to the voice coming from the phone by their ear (due, presumably, to the existence of robust crossmodal links in endogenous spatial attention, see above; see also Just et al., 2008). Spence and Read proposed that performance in this situation could be improved if the speaker’s voice were to be presented from directly in front of the driver, to take advantage of the underlying crossmodal links that constrain the deployment of endogenous spatial attention (see Driver & Spence, 2004; Ho & Spence, 2008).
Similarly, a better understanding of the nature of the crossmodal links underlying exogenous attentional orienting may also lead to the design of more effective non-visual (and multisensory) warning signals (see also Ho & Spence, 2009): Indeed, the latest research by Ho, Reed, and Spence (2007) has shown that bimodal audiotactile cues appear to capture the spatial attention of drivers in a simulator setting far more effectively than unisensory warning signals (at least when the various unisensory cues are presented from the same spatial location, or direction; see Spence & Santangelo, 2009, for a review).
Recent laboratory results from Van der Burg et al. (2008a, 2009; see also Ngo & Spence, submitted) are also relevant here. These researchers have demonstrated that a sound or tactile stimulus does not need to be presented from the same location as a visual target in order to capture a person’s spatial attention and facilitate their visual search performance. The participants in their studies were presented with arrays of tilted line segments whose colour changed unpredictably from red to green and back again. The participants’ task was to determine whether a target line segment, located somewhere in the display, was oriented horizontally or vertically. Van der Burg et al. have conducted a number of studies demonstrating that the presentation of either a spatially-nonpredictive auditory or tactile stimulus can dramatically enhance participants’ visual search performance provided that it is presented in synchrony with the colour change of the target element. Thus, it appears that mere temporal synchrony (of the auditory or tactile event) can be sufficient to enhance the saliency of the (visual) target stimulus, hence leading to attentional capture (Van der Burg et al., 2008b; though see also Fujisaki et al., 2006). Such results therefore hold the promise of enhancing the visual search of interface operators who are faced with overly complex visual displays (see Spence et al., in press a).
In conclusion, research on crossmodal attention has come a long way over the last 30 years or so, with scientists highlighting the existence of extensive crossmodal interactions in attention between the senses (Calvert et al., 2004; Spence & Driver, 2004). While the majority of studies have tended to focus on audiovisual interactions, a growing body of research now shows similar constraints operating between many other pairs of sensory modalities as well (see Spence & Gallace, 2007, for a review). While researchers have, for many years, tended to study crossmodal links in either endogenous or exogenous spatial orienting, many of the studies that have been published over the last couple of years have started to investigate how the two forms of attention interact (e.g., Chica et al., 2007; Koelewijn et al., 2009a, b; Santangelo et al., 2009), as they presumably do most of the time in everyday life. What is more, while much of the recent research on crossmodal attention has focused on spatial attention, it is important to note that there are equally important crossmodal effects in the purely temporal domain as well, as demonstrated, for example, by the crossmodal attentional capture documented in serial audiovisual search tasks (e.g., Dalton & Spence, 2007).
To date, most of the research has focused on crossmodal attention in “normal” healthy adult human participants. However, there is now growing interest in understanding any changes in crossmodal attention that may occur across the lifespan (e.g., Hugenschmidt et al., 2009; Poliakoff et al., 2006), and following brain damage (Brozzoli et al., 2006; Rapp & Hendel, 2003; Sarri et al., 2006; Sinnett et al., 2007a). There has even been some progress in extending these crossmodal (and/or multisensory) research paradigms / theoretical approaches to the animal domain (e.g., Delano et al., 2007; Narins et al., 2003, 2005; Skals et al., 2005; Uetake & Kudo, 1994). To give but one example, chinchillas have now been shown to exhibit decreased cochlear sensitivity when performing a visual (but not an auditory) task, with the magnitude of this decrease correlating with the demands of the animal’s visual task (see Delano et al., 2007).
- Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal integration. Current Biology, 14, 257-262.
- Alais, D., Morrone, C., & Burr, D. (2006). Separate attentional resources for vision and audition. Proceedings of the Royal Society B, 273, 1339-1345.
- Alsius, A., Navarra, J., Campbell, R., & Soto-Faraco, S. (2005). Audiovisual integration of speech falters under high attention demands. Current Biology, 15, 1-5.
- Andersen, T. S., Tiippana, K., & Sams, M. (2005). Maximum likelihood integration of rapid flashes and beeps. Neuroscience Letters, 380, 155-160.
- Arnell, K. M. (2006). Visual, auditory, and cross-modality dual-task costs: Electrophysiological evidence for an amodal bottleneck on working memory consolidation. Perception & Psychophysics, 68, 447-457.
- Arnell, K. M., & Jenkins, R. (2004). Revisiting within modality and cross-modality attentional blinks: Effects of target-distractor similarity. Perception & Psychophysics, 66, 1147-1161.
- Ashkenazi, A., & Marks, L. E. (2004). Effect of endogenous attention on detection of weak gustatory and olfactory flavors. Perception & Psychophysics, 66, 596-608.
- Azañón, E., & Soto-Faraco, S. (2008). Changing reference frames during the encoding of tactile events. Current Biology, 18, 1044-1049.
- Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and auditory signals for spatial localization. Journal of the Optical Society of America A, 20, 1391-1397.
- Beierholm, U. R., Quartz, S. R., & Shams, L. (2009). Bayesian priors are encoded independently from likelihoods in human multisensory perception. Journal of Vision, 9(5):23, 1-9.
- Bertelson, P., & Aschersleben, G. (1998). Automatic visual bias of perceived auditory location. Psychonomic Bulletin & Review, 5, 482-489.
- Bertelson, P., & de Gelder, B. (2004). The psychology of multimodal perception. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 141-177). Oxford: Oxford University Press.
- Bolognini, N., Frassinetti, F., Serino, A., & Làdavas, E. (2005). “Acoustical vision” of below threshold stimuli: Interaction among spatially converging audiovisual inputs. Experimental Brain Research, 160, 273-282.
- Bonnel, A.-M., & Hafter, E. R. (1998). Divided attention between simultaneous auditory and visual signals. Perception & Psychophysics, 60, 179-190.
- Botvinick, M., & Cohen, J. (1998). Rubber hands ‘feel’ touch that eyes see. Nature, 391, 756.
- Brozzoli, C., Demattè, M. L., Pavani, F., Frassinetti, F., & Farnè, A. (2006). Neglect and extinction: Within and between sensory modalities. Restorative Neurology and Neuroscience, 24, 217-232.
- Burton, H., Abend, N. S., MacLeod, A.-M.K., Sinclair, R. J., Snyder, A. Z., & Raichle, M. E. (1999). Tactile attention tasks enhance activation in somatosensory regions of parietal cortex: A positron emission tomography study. Cerebral Cortex, 9, 662-674.
- Calvert, G. A., Spence, C., & Stein, B. E. (Eds.). (2004). The handbook of multisensory processes. Cambridge, MA: MIT Press.
- Cappe, C., Morel, A., Barone, P., & Rouiller, E. M. (2009). The thalamocortical projection systems in primates: An anatomical support for multisensory and sensorimotor interplay. Cerebral Cortex, 19, 2025-2037.
- Chambers, C. D., Payne, J. M., & Mattingley, J. B. (2007). Parietal disruption impairs reflexive spatial attention within and between sensory modalities. Neuropsychologia, 45, 1715-1724.
- Chambers, C. D., Stokes, M. G., & Mattingley, J. B. (2004). Modality-specific control of strategic spatial attention in parietal cortex. Neuron, 44, 925-930.
- Chapman, C. E., & Meftah, E. M. (2005). Independent controls of attentional influences in primary and secondary somatosensory cortex. Journal of Neurophysiology, 94, 4094-4107.
- Chica, A., Sanabria, D., Lupiáñez, J., & Spence, C. (2007). Comparing intramodal and crossmodal cuing in the endogenous orienting of spatial attention. Experimental Brain Research, 179, 353-364, 531.
- Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16, 409-412.
- Dalton, P., & Spence, C. (2007). Attentional capture in serial audiovisual search tasks. Perception & Psychophysics, 69, 422-438.
- Delano, P. H., Elgueda, D., Hamame, C. M., & Robles, L. (2007). Selective attention to visual stimuli reduces cochlear sensitivity in chinchillas. Journal of Neuroscience, 27, 4146-4153.
- Driver, J. (2001). A selective review of selective attention research from the past century. British Journal of Psychology, 92, 53-78.
- Driver, J., & Noesselt, T. (2008). Multisensory interplay reveals crossmodal influences on ‘sensory-specific’ brain regions, neural responses, and judgments. Neuron, 57, 11-23.
- Driver, J., & Spence, C. (1998a). Crossmodal attention. Current Opinion in Neurobiology, 8, 245-253.
- Driver, J., & Spence, C. (1998b). Crossmodal links in spatial attention. Proceedings of the Royal Society Section B, 353, 1-13.
- Driver, J., & Spence, C. (2004). Crossmodal spatial attention: Evidence from human performance. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 179-220). Oxford, UK: Oxford University Press.
- Duncan, J., Martens, S., & Ward, R. (1997). Restricted attentional capacity within but not between sensory modalities. Nature, 387, 808-810.
- Ehrsson, H. H., Spence, C., & Passingham, R. E. (2004). That’s my hand! Activity in premotor cortex reflects feeling of ownership of a limb. Science, 305, 875-877.
- Eijkman, E., & Vendrik, J. H. (1965). Can a sensory system be specified by its internal noise? Journal of the Acoustical Society of America, 37, 1102-1109.
- Eimer, M. (2004). Electrophysiology of human crossmodal spatial attention. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 221-245). Oxford: Oxford University Press.
- Eimer, M., & Van Velzen, J. (2002). Crossmodal links in spatial attention are mediated by supramodal control processes: Evidence from event-related potentials. Psychophysiology, 39, 437-449.
- Eimer, M., & Van Velzen, J. (2005). Spatial tuning of tactile attention modulates visual processing within hemifields: An ERP investigation of crossmodal attention. Experimental Brain Research, 166, 402-410.
- Eimer, M., Van Velzen, J., & Driver, J. (2004). ERP evidence for cross-modal audiovisual effects of endogenous spatial attention within hemifields. Journal of Cognitive Neuroscience, 16, 272-288.
- Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415, 429-433.
- Ernst, M. O., & Bülthoff, H. H. (2004). Merging the senses into a robust percept. Trends in Cognitive Sciences, 8, 162-169.
- Falchier, A., Schroeder, C. E., Hackett, T. A., Lakatos, P., Nascimento-Silva, S., Ulbert, I., Karmos, G., & Smiley, J. F. (in press). Projection from visual areas V2 and prostriata to caudal auditory cortex in the monkey. Cerebral Cortex.
- Farah, M. J., Wong, A. B., Monheit, M. A., & Morrow, L. A. (1989). Parietal lobe mechanisms of spatial attention: Modality-specific or supramodal? Neuropsychologia, 27, 461-470.
- Ferlazzo, F., Couyoumdjian, A., Padovani, T., & Belardinelli, M. O. (2002). Head-centered meridian effect on auditory spatial attention orienting. Quarterly Journal of Experimental Psychology (A), 55, 937-963.
- Ferris, T. K., & Sarter, N. B. (2008). Cross-modal links among vision, audition, and touch in complex environments. Human Factors, 50, 17-26.
- Fujisaki, W., Koene, A., Arnold, D., Johnston, A., & Nishida, S. (2006). Visual search for a target changing in synchrony with an auditory signal. Proceedings of the Royal Society (B), 273, 865-874.
- Gepshtein, S., Burge, J., Ernst, M. O., & Banks, M. S. (2005). The combination of vision and touch depends on spatial proximity. Journal of Vision, 5, 1013-1023.
- Ghazanfar, A. A., & Schroeder, C. E. (2006). Is neocortex essentially multisensory? Trends in Cognitive Sciences, 10, 278-285.
- Gori, M., Del Viva, M., Sandini, G., & Burr, D. C. (2008). Young children do not integrate visual and haptic information. Current Biology, 18, 694-698.
- Gray, R., Mohebbi, R., & Tan, H. Z. (2009). The spatial resolution of crossmodal attention: Implications for the design of multimodal interfaces. ACM Transactions on Applied Perception, 6, 1-14.
- Hairston, W. D., Hodges, D. A., Casanova, R., Hayasaka, S., Kraft, R., Maldjian, J. A., & Burdette, J. H. (2008). Closing the mind’s eye: Deactivation of visual cortex related to auditory task difficulty. Neuroreport, 19, 151-154.
- Hancock, P. A., Oron-Gilad, T., & Szalma, J. L. (2007). Elaborations of the multiple-resource theory of attention. In A. F. Kramer, D. A. Wiegmann, & A. Kirlik (Eds.), Attention: From theory to practice (pp. 45-56). Oxford: Oxford University Press.
- Hecht, D., & Reiner, M. (2009). Sensory dominance in combinations of audio, visual and haptic stimuli. Experimental Brain Research, 193, 307-314.
- Hein, G., Alink, A., Kleinschmidt, A., & Müller, N. G. (2007). Competing neural responses for auditory and visual decisions. PLoS ONE, 3, e320.
- Hein, G., Parr, A., & Duncan, J. (2006). Within-modality and cross-modality attentional blinks in a simple discrimination task. Perception & Psychophysics, 68, 54-61.
- Helbig, H. B., & Ernst, M. O. (2007). Knowledge about a common source can promote visual-haptic integration. Perception, 36, 1523-1533.
- Helbig, H. B., & Ernst, M. O. (2008). Visual-haptic cue weighting is independent of modality-specific attention. Journal of Vision, 8(10):21, 1-16.
- Ho, C., Reed, N., & Spence, C. (2007). Multisensory in-car warning signals for collision avoidance. Human Factors, 49, 1107-1114.
- Ho, C., & Spence, C. (2008). The multisensory driver: Implications for ergonomic car interface design. Aldershot: Ashgate Publishing.
- Ho, C., & Spence, C. (2009). Using peripersonal warning signals to orient a driver’s gaze. Human Factors, 51, 539-556.
- Hugenschmidt, C. E., Peiffer, A. M., McCoy, T. P., Hayasaka, S., & Laurienti, P. J. (2009). Preservation of crossmodal selective attention in healthy aging. Experimental Brain Research, 198, 273-285.
- Johnson, J. A., & Zatorre, R. J. (2005). Attention to simultaneous unrelated auditory and visual events: Behavioral and neural correlates. Cerebral Cortex, 15, 1609-1620.
- Johnson, J. A., & Zatorre, R. J. (2006). Neural substrates for dividing and focusing attention between simultaneous auditory and visual events. Neuroimage, 31, 1673-1681.
- Jolicoeur, P. (1999). Restricted attentional capacity between sensory modalities. Psychonomic Bulletin & Review, 6, 87-92.
- Just, M. A., Kellar, T. A., & Cynkar, J. (2008). A decrease in brain activation associated with driving when listening to someone speak. Brain Research, 1205, 70-80.
- Kawashima, R., Imaizumi, S., Mori, K., Okada, K., Goto, R., Kiritani, S., Ogawa, A., & Fukuda, H. (1999). Selective visual and auditory attention toward utterances – A PET study. Neuroimage, 10, 209-215.
- Kawashima, R., O'Sullivan, B. T., & Roland, P. E. (1995). Positron-emission tomography studies of cross-modality inhibition in selective attentional tasks: Closing the "mind's eye". Proceedings of the National Academy of Sciences, USA, 92, 5969-5972.
- Kennett, S., Eimer, M., Spence, C., & Driver, J. (2001). Tactile-visual links in exogenous spatial attention under different postures: Convergent evidence from psychophysics and ERPs. Journal of Cognitive Neuroscience, 13, 462-478.
- Kennett, S., Spence, C., & Driver, J. (2002). Visuo-tactile links in covert exogenous spatial attention remap across changes in unseen hand posture. Perception & Psychophysics, 64, 1083-1094.
- Kida, T., Inui, K., Wasaka, T., Akatsuka, K., Tanaka, E., & Kakigi, R. (2007). Time-varying cortical activations related to visual-tactile cross-modal links in spatial selective attention. Journal of Neurophysiology, 97, 3585-3596.
- King, A. J., & Palmer, A. R. (1985). Integration of visual and auditory information in bimodal neurones in the guinea-pig superior colliculus. Experimental Brain Research, 60, 492-500.
- Klein, R. (2000). Inhibition of return. Trends in Cognitive Sciences, 4, 138-147.
- Koelewijn, T., Bronkhorst, A., & Theeuwes, J. (2009a). Competition between auditory and visual spatial cues during visual task performance. Experimental Brain Research, 195, 593-602.
- Koelewijn, T., Bronkhorst, A., & Theeuwes, J. (2009b). Auditory and visual capture during focused visual attention. Journal of Experimental Psychology: Human Perception & Performance, 35, 1303-1315.
- Koppen, C., & Spence, C. (2007a). Seeing the light: Exploring the Colavita visual dominance effect. Experimental Brain Research, 180, 737-754.
- Koppen, C., & Spence, C. (2007b). Assessing the role of stimulus probability on the Colavita visual dominance effect. Neuroscience Letters, 418, 266-271.
- Larsen, A., McIlhagga, W., Baert, J., & Bundesen, C. (2003). Seeing or hearing? Perceptual independence, modality confusions, and crossmodal congruity effects with focused and divided attention. Perception & Psychophysics, 65, 568-574.
- Laurienti, P. J., Burdette, J. H., Wallace, M. T., Yen, Y.-F., Field, A. S., & Stein, B. E. (2002). Deactivation of sensory-specific cortex by cross-modal stimuli. Journal of Cognitive Neuroscience, 14, 1-10.
- Lavie, N. (1995). Perceptual load as a necessary condition for selective attention. Journal of Experimental Psychology: Human Perception & Performance, 21, 451-468.
- Lavie, N. (2005). Distracted and confused?: Selective attention under load. Trends in Cognitive Sciences, 9, 75-82.
- Lupiáñez, J. (in press). Inhibition of return. In A. C. Nobre & J. T. Coull (Eds.), Attention and time. Oxford, UK: Oxford University Press.
- Macaluso, E., & Driver, J. (2005). Multisensory spatial interactions: A window onto functional integration in the human brain. Trends in Neurosciences, 28, 264-271.
- Macaluso, E., Frith, C., & Driver, J. (2000a). Modulation of human visual cortex by crossmodal spatial attention. Science, 289, 1206-1208.
- Macaluso, E., Frith, C., & Driver, J. (2000b). Selective spatial attention in vision and touch: Unimodal and multimodal mechanisms revealed by PET. Journal of Neurophysiology, 83, 3062-3075.
- Macaluso, E., Frith, C. D., & Driver, J. (2001). A reply to J. J. McDonald, W. A. Teder-Sälejärvi, & L. M. Ward, Multisensory integration and crossmodal attention effects in the human brain. Science, 292, 1791.
- Macaluso, E., Frith, C. D., & Driver, J. (2002a). Directing attention to locations and to sensory modalities: Multiple levels of selective processing revealed with PET. Cerebral Cortex, 12, 357-368.
- Macaluso, E., Frith, C. D., & Driver, J. (2002b). Supramodal effects of covert spatial orienting triggered by visual or tactile events. Journal of Cognitive Neuroscience, 14, 389-401.
- McDonald, J. J., Teder-Sälejärvi, W. A., & Hillyard, S. A. (2000). Involuntary orienting to sound improves visual perception. Nature, 407, 906-908.
- McDonald, J. J., Teder-Sälejärvi, W. A., Di Russo, F., & Hillyard, S. A. (2005). Neural basis of auditory-induced shifts in visual time-order perception. Nature Neuroscience, 8, 1197-1202.
- McDonald, J. J., Teder-Sälejärvi, W. A., & Ward, L. M. (2001). Multisensory integration and crossmodal attention effects in the human brain. Science, 292, 1791.
- McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746-748.
- Meredith, M. A., Nemitz, J. W., & Stein, B. E. (1987). Determinants of multisensory integration in superior colliculus neurons. I. Temporal factors. Journal of Neuroscience, 7, 3215-3229.
- Morgan, M. L., DeAngelis, G. C., & Angelaki, D. E. (2008). Multisensory integration in macaque visual cortex depends on cue reliability. Neuron, 59, 662-673.
- Narins, P. M., Grabul, D. S., Soma, K. K., Gaucher, P., & Hödl, W. (2005). Cross-modal integration in a dart-poison frog. Proceedings of the National Academy of Sciences, 102, 2425-2429.
- Narins, P. M., Hödl, W., & Grabul, D. S. (2003). Bimodal signal requisite for agonistic behavior in a dart-poison frog, Epipedobates femoralis. Proceedings of the National Academy of Sciences, 100, 577-580.
- Ngo, M. K., & Spence, C. (submitted). Evaluating the effectiveness of temporally synchronous and spatially informative cues in visual search. Attention, Perception, & Psychophysics.
- Otten, L. J., Alain, C., & Picton, T. W. (2000). Effects of visual attentional load on auditory processing. NeuroReport, 11, 875-880.
- Pashler, H. (1994). Dual-task interference in simple tasks: Data and theory. Psychological Bulletin, 116, 220-244.
- Poliakoff, E., Ashworth, S., Lowe, C., & Spence, C. (2006). Vision and touch in ageing: Crossmodal selective attention and visuotactile spatial interactions. Neuropsychologia, 44, 507-517.
- Pöppel, E. (1973). Comments on "Visual system's view of acoustic space". Nature, 243, 231.
- Posner, M. I. (1978). Chronometric explorations of mind. Hillsdale, NJ: Erlbaum.
- Posner, M. I. (1990). Hierarchical distributed networks in the neuropsychology of selective attention. In A. Caramazza (Ed.), Cognitive neuropsychology and neurolinguistics: Advances in models of cognitive function and impairment (pp. 187-210). Hillsdale, NJ: Erlbaum.
- Posner, M. I., & Cohen, Y. (1984). Components of visual orienting. In H. Bouma & D. G. Bouwhuis (Eds.), Attention and performance: Control of language processes (Vol. 10, pp. 531-556). Hillsdale, NJ: Erlbaum.
- Posner, M. I., Nissen, M. J., & Klein, R. M. (1976). Visual dominance: An information-processing account of its origins and significance. Psychological Review, 83, 157-171.
- Prime, D. J., McDonald, J. J., Green, J., & Ward, L. M. (2008). When crossmodal attention fails: A controversy resolved? Canadian Journal of Experimental Psychology, 62, 192-197.
- Prinzmetal, W., McCool, C., & Park, S. (2005a). Attention: Reaction time and accuracy reveal different mechanisms. Journal of Experimental Psychology: General, 134, 73-92.
- Prinzmetal, W., Park, S., & Garrett, R. (2005b). Involuntary attention and identification accuracy. Perception & Psychophysics, 67, 1344-1353.
- Rapp, B., & Hendel, S. K. (2003). Principles of cross-modal competition: Evidence from deficits of attention. Psychonomic Bulletin & Review, 10, 210-219.
- Rees, G., Frith, C., & Lavie, N. (2001). Processing of irrelevant visual motion during performance of an auditory attention task. Neuropsychologia, 39, 937-949.
- Roach, N. W., Heron, J., & McGraw, P. V. (2006). Resolving multisensory conflict: A strategy for balancing the costs and benefits of audio-visual integration. Proceedings of the Royal Society B, 273, 2159-2168.
- Röder, B., Spence, C., & Rösler, F. (2002). Assessing the effect of posture changes on tactile inhibition of return. Experimental Brain Research, 143, 453-462.
- Roland, P. E. (1982). Cortical regulation of selective attention in man. A regional cerebral blood flow study. Journal of Neurophysiology, 48, 1059-1078.
- Rorden, C., & Driver, J. (1999). Does auditory attention shift in the direction of an upcoming saccade? Neuropsychologia, 37, 357-377.
- Rorden, C., Greene, K., Sasine, G. M., & Baylis, G. C. (2002). Enhanced tactile performance at the destination of an upcoming saccade. Current Biology, 12, 1-6.
- Santangelo, V., Olivetti Belardinelli, M., Spence, C., & Macaluso, E. (2009). Multisensory interactions between voluntary and stimulus-driven spatial attention mechanisms across sensory modalities. Journal of Cognitive Neuroscience, 21, 2384-2397.
- Sarri, M., Blankenburg, F., & Driver, J. (2006). Neural correlates of crossmodal visual-tactile extinction and of tactile awareness revealed by fMRI in a right-hemisphere stroke patient. Neuropsychologia, 44, 2398-2410.
- Sarter, N. B. (2007). Multiple-resource theory as a basis for multimodal interface design: Success stories, qualifications, and research needs. In A. F. Kramer, D. A. Wiegmann, & A. Kirlik (Eds.), Attention: From theory to practice (pp. 187-195). Oxford: Oxford University Press.
- Sekuler, R., Sekuler, A. B., & Lau, R. (1997). Sound alters visual motion perception. Nature, 385, 308.
- Shams, L., Kamitani, Y., & Shimojo, S. (2000). What you see is what you hear: Sound induced visual flashing. Nature, 408, 788.
- Shiffrin, R. M., & Grantham, D. W. (1974). Can attention be allocated to sensory modalities? Perception & Psychophysics, 15, 460-474.
- Shipley, T. (1964). Auditory flutter-driving of visual flicker. Science, 145, 1328-1330.
- Shomstein, S., & Yantis, S. (2004). Control of attention shifts between vision and audition in human cortex. Journal of Neuroscience, 24, 10702-10706.
- Shore, D. I., Barnes, M. E., & Spence, C. (2006). The temporal evolution of the crossmodal congruency effect. Neuroscience Letters, 392, 96-100.
- Sinnett, S., Costa, A., & Soto-Faraco, S. (2006). Manipulating inattentional blindness within and across sensory modalities. Quarterly Journal of Experimental Psychology, 59, 1425-1442.
- Sinnett, S., Juncadella, M., Rafal, R., Azañón, E., & Soto-Faraco, S. (2007a). A dissociation between visual and auditory hemi-inattention: Evidence from temporal order judgements. Neuropsychologia, 45, 552-560.
- Sinnett, S., Spence, C., & Soto-Faraco, S. (2007b). Visual dominance and attention: The Colavita effect revisited. Perception & Psychophysics, 69, 673-686.
- Skals, N., Anderson, P., Kanneworff, M., Löfstedt, C., & Surlykke, A. (2005). Her odours make him deaf: Crossmodal modulation of olfaction and hearing in a male moth. Journal of Experimental Biology, 208, 595-601.
- Soto-Faraco, S., & Spence, C. (2002). Modality-specific auditory and visual temporal processing deficits. Quarterly Journal of Experimental Psychology (A), 55, 23-40.
- Soto-Faraco, S., Spence, C., Fairbank, K., Kingstone, A., Hillstrom, A. P., & Shapiro, K. (2002). A crossmodal attentional blink between vision and touch. Psychonomic Bulletin & Review, 9, 731-738.
- Spence, C. (2001). Crossmodal attentional capture: A controversy resolved? In C. Folk & B. Gibson (Eds.), Attention, distraction and action: Multiple perspectives on attentional capture. Advances in psychology, 133 (pp. 231-262). Amsterdam: Elsevier Science BV.
- Spence, C. (2008). Cognitive neuroscience: Searching for the bottleneck in the brain. Current Biology, 18, R965-R968.
- Spence, C. (2009). Explaining the Colavita visual dominance effect. Progress in Brain Research, 176, 245-258.
- Spence, C. (in press a). Prior entry: Attention and temporal perception. To appear in A. C. Nobre & J. T. Coull (Eds.), Attention and time. Oxford: Oxford University Press.
- Spence, C. (in press b). Crossmodal spatial attention. The Year in Cognitive Neuroscience.
- Spence, C., Bentley, D. E., Phillips, N., McGlone, F. P., & Jones, A. K. P. (2002). Selective attention to pain: A psychophysical investigation. Experimental Brain Research, 145, 395-402.
- Spence, C., & Driver, J. (1996). Audiovisual links in endogenous covert spatial attention. Journal of Experimental Psychology: Human Perception and Performance, 22, 1005-1030.
- Spence, C., & Driver, J. (Eds.). (2004). Crossmodal space and crossmodal attention. Oxford, UK: Oxford University Press.
- Spence, C., & Ho, C. (2008). Multisensory warning signals for event perception and safe driving. Theoretical Issues in Ergonomics Science, 9, 523-554.
- Spence, C., Kettenmann, B., Kobal, G., & McGlone, F. P. (2000a). Selective attention to the chemosensory modality. Perception & Psychophysics, 62, 1265-1271.
- Spence, C., Lloyd, D., McGlone, F., Nicholls, M. E. R., & Driver, J. (2000b). Inhibition of return is supramodal: A demonstration between all possible pairings of vision, touch and audition. Experimental Brain Research, 134, 42-48.
- Spence, C., Ngo, M., Lee, J.-H., & Tan, H. (in press a). Solving the correspondence problem in multisensory interface design. Advances in Haptics.
- Spence, C., Nicholls, M. E. R., & Driver, J. (2001a). The cost of expecting events in the wrong sensory modality. Perception & Psychophysics, 63, 330-336.
- Spence, C., & Parise, C. (2010). Prior entry. Consciousness & Cognition. http://dx.doi.org/10.1016/j.concog.2009.12.001
- Spence, C., Parise, C., & Chen, Y.-C. (in press b). The Colavita visual dominance effect. To appear in M. M. Murray & M. Wallace (Eds.), Frontiers in the neural bases of multisensory processes.
- Spence, C., Pavani, F., Maravita, A., & Holmes, N. P. (2008). Multi-sensory interactions. In M. C. Lin & M. A. Otaduy (Eds.), Haptic rendering: Foundations, algorithms, and applications (pp. 21-52). Wellesley, MA: AK Peters.
- Spence, C., & Read, L. (2003). Speech shadowing while driving: On the difficulty of splitting attention between eye and ear. Psychological Science, 14, 251-256.
- Spence, C., & Santangelo, V. (2009). Capturing spatial attention with multisensory cues. Hearing Research, 258, 134-142.
- Spence, C., Shore, D. I., & Klein, R. M. (2001b). Multisensory prior entry. Journal of Experimental Psychology: General, 130, 799-832.
- Stein, B. E., & Meredith, M. A. (1993). The merging of the senses. Cambridge, MA: MIT Press.
- Stein, B. E., & Stanford, T. R. (2008). Multisensory integration: Current issues from the perspective of the single neuron. Nature Reviews Neuroscience, 9, 255-267.
- Sterr, A., Shen, S., Zaman, A., Roberts, N., & Szameitat, A. (2007). Activation of SI is modulated by attention: A random effects fMRI study using mechanical stimuli. NeuroReport, 18, 607-611.
- Tellinghuisen, D. J., & Nowak, E. J. (2003). The inability to ignore auditory distractors as a function of visual task perceptual load. Perception & Psychophysics, 65, 817-828.
- Treisman, A. M., & Davies, A. (1973). Divided attention to ear and eye. In S. Kornblum (Ed.), Attention and performance (Vol. 4, pp. 101-117). New York: Academic Press.
- Turatto, M., Benso, F., Galfano, G., Gamberini, L., & Umiltà, C. (2002). Non-spatial attentional shifts between audition and vision. Journal of Experimental Psychology: Human Perception and Performance, 28, 628-639.
- Turatto, M., Galfano, G., Bridgeman, B., & Umiltà, C. (2004). Space-independent modality-driven attentional capture in auditory, tactile and visual systems. Experimental Brain Research, 155, 301-310.
- Turatto, M., Mazza, V., & Umiltà, C. (2005). Crossmodal object-based attention: Auditory objects affect visual processing. Cognition, 96, B55-B64.
- Uetake, K., & Kudo, Y. (1994). Visual dominance over hearing in feed acquisition procedure of cattle. Applied Animal Behaviour Science, 42, 1-9.
- Van Beers, R. J., Wolpert, D. M., & Haggard, P. (2002). When feeling is more important than seeing in sensorimotor adaptation. Current Biology, 12, 834-837.
- Van der Burg, E., Olivers, C. N. L., Bronkhorst, A. W., Koelewijn, T., & Theeuwes, J. (2007). The absence of an auditory-visual attentional blink is not due to echoic memory. Perception & Psychophysics, 69, 1230-1241.
- Van der Burg, E., Olivers, C. N. L., Bronkhorst, A. W., & Theeuwes, J. (2008a). Non-spatial auditory signals improve spatial visual search. Journal of Experimental Psychology: Human Perception and Performance, 34, 1053-1065.
- Van der Burg, E., Olivers, C. N. L., Bronkhorst, A. W., & Theeuwes, J. (2008b). Audiovisual events capture attention: Evidence from temporal order judgments. Journal of Vision, 8(5):2, 1-10.
- Van der Burg, E., Olivers, C. N. L., Bronkhorst, A. W., & Theeuwes, J. (2009). Poke and pop: Tactile-visual synchrony increases visual saliency. Neuroscience Letters, 450, 60-64.
- Welch, R. B., DuttonHurt, L. D., & Warren, D. H. (1986). Contributions of audition and vision to temporal rate perception. Perception & Psychophysics, 39, 294-300.
- Wickens, C. D. (1992). Engineering psychology and human performance (2nd ed.). New York: HarperCollins.
- Wickens, C. D. (2008). Multiple resources and mental workload. Human Factors, 50, 449-454.
- Witten, I. B., & Knudsen, E. I. (2005). Why seeing is believing: Merging auditory and visual worlds. Neuron, 48, 489-496.
- Wozny, D. R., Beierholm, U. R., & Shams, L. (2008). Human trimodal perception follows optimal statistical inference. Journal of Vision, 8(3):24, 1-11.
- Wright, R. D., & Ward, L. M. (2008). Orienting of attention. Oxford: Oxford University Press.
- Kimron L. Shapiro, Jane Raymond, Karen Arnell (2009) Attentional blink. Scholarpedia, 4(6):3320.
- David Spiegelhalter and Kenneth Rice (2009) Bayesian statistics. Scholarpedia, 4(8):5230.
- Valentino Braitenberg (2007) Brain. Scholarpedia, 2(11):2918.
- Zhong-Lin Lu and Barbara Anne Dosher (2007) Cognitive psychology. Scholarpedia, 2(8):2769.
- William D. Penny and Karl J. Friston (2007) Functional imaging. Scholarpedia, 2(5):1478.
- Seiji Ogawa and Yul-Wan Sung (2007) Functional magnetic resonance imaging. Scholarpedia, 2(10):3105.
- Daniel J. Simons (2007) Inattentional blindness. Scholarpedia, 2(5):3244.
- Raymond Klein and Jason Ivanoff (2008) Inhibition of return. Scholarpedia, 3(10):3650.
- Rodolfo Llinas (2008) Neuron. Scholarpedia, 3(8):1490.
- Dale Purves (2009) Neuroscience. Scholarpedia, 4(8):7204.
- Arkady Pikovsky and Michael Rosenblum (2007) Synchronization. Scholarpedia, 2(12):1459.
- Anthony T. Barker and Ian Freeston (2007) Transcranial magnetic stimulation. Scholarpedia, 2(10):2936.
- Jeremy Wolfe and Todd S. Horowitz (2008) Visual search. Scholarpedia, 3(7):3325.
- Calvert, G. A., Spence, C., & Stein, B. E. (Eds.). (2004). The handbook of multisensory processes. Cambridge, MA: MIT Press.
- Spence, C., & Driver, J. (Eds.). (2004). Crossmodal space and crossmodal attention. Oxford, UK: Oxford University Press.
- Wright, R. D., & Ward, L. M. (2008). Chapter 8: Crossmodal attention shifts. In R. D. Wright & L. M. Ward, Orienting of attention (pp. 199-227). Oxford: Oxford University Press.
University of Oxford, Department of Experimental Psychology, Crossmodal Research Laboratory: http://psyweb.psy.ox.ac.uk/xmodal/default.htm