Theories of perception

From Scholarpedia
This article has not yet been published; it may contain inaccuracies, unapproved changes, or be unfinished.

There is wide agreement that science benefits from tight interaction between theory and experiment. In the field of mammalian perception, however, a structured theoretical space is lacking. Contemporary theories of perception, many of them implicitly deduced from experimental designs, form a patchy theoretical landscape. This article is an attempt to describe families of implicit and explicit theories of perception, mostly for the visual and tactile modalities, in one structured plane. The two axes spanning this plane are (i) the brain-world (BW) axis, along which external information is acquired by a given brain, and (ii) the brain-brain (BB) axis, along which different brains interact.



How do we perceive our environment? Despite decades of intensive research, the scientific community does not seem to be converging on an agreed direction. To begin with, there is no agreement about the general scheme of perception: is perception ‘direct’ or ‘indirect’? Does it depend on active sensor movements? Is it based on the construction of internal representations? Moreover, the neurobiology of perception seems to progress for the most part independently of its theory, arguably in part because a structured theoretical landscape is lacking. This article proposes such a landscape, which, despite its simplicity (or in fact thanks to its simplicity), can form an initial step towards a productive theory-experiment dialogue.

The brain-world (BW) axis

Along this axis information about the environment of a specific brain is acquired. We adopt here the Umwelt viewpoint of von Uexkull, according to which the world perceived by a given brain (Bi) is unique to that brain (Wi) [1]. In line with von Uexkull, whenever we use the term “brain” here we in fact refer to the entire organism, and primarily to the entire perceptual system in that organism, including the sensory organs and their muscles.

Schematic classification

We categorize theories of perception (TOPs), whether implicit or explicit, into five schematic classes, which are not necessarily mutually exclusive (Figure 1). For this generic scheme we consider two fundamental states, ‘world state’ and ‘brain state’, and assume that perceptual acquisition updates the brain state according to the world state such that the brain state forms an updated model of the world state [2, 3]. In the following we briefly describe each class and provide several representative examples of implicit or explicit theories.

Figure 1: Theories of perception taxonomy.
  1. Bottom-Up TOPs. Acquisition of information from the world is done in one direction, from W to B, and in a feedforward manner. Various aspects of such processing have been suggested over the years, including: integration of neuronal representations of individual features (Feature Integration Theory, FIT; [4]), classification by feedforward transformations (DEEP network, [5, 6]; HMAX [7]), ignition of Local Activations (LA; [8]), activation of a Global Work Space (GWS; [9, 10]) and others.
  2. Bottom-Up-Top-Down TOPs. Acquisition of information from the world is done in two directions, from W towards B (bottom-up, BU) and from B towards W (top-down, TD). Different schemes suggest different types of interactions between the two processing streams. The Reverse Hierarchy Theory (RHT) suggests that the gist of the scene is acquired via a rapid propagation in the bottom-up direction and the details are acquired via top-down processes, whose depth and level of scrutiny depend on the context [11, 12]. The BU/TD Segmentation (BUTD) scheme proposes that BU and TD processing streams run in parallel and interact at different brain levels, matching stored knowledge with segmentation constraints [13].
  3. Bottom-Up Reentrant TOPs. Acquisition of information from the world is done in the BU direction, from W towards B, and includes local closed-loop dynamics in one or more processing stations. Reentrant processing is proposed to facilitate data integration and categorization and to increase processing robustness [14, 15].
  4. Closed-loop TOPs. Acquisition of information from the world is done in loops connecting B and W. The processing obeys global closed-loop dynamics which link world elements and brain elements. Loop dynamics may follow an internal control signal (Perceptual Control Theory, PCT [16]) or converge to perceptual attractors (Closed-Loop Perception, CLP [17]).
  5. Motor-sensory TOPs. Perception is hypothesized to emerge from motor-sensory interactions and to depend on sensorimotor contingencies (SMCs, [18]). Neuronal implementation is not stated explicitly, and thus this type of TOP may be integrated with the previous ones. Specifically, it quite naturally complements closed-loop TOPs [19].
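
The contrast between the bottom-up and closed-loop classes above can be illustrated with a toy simulation. The following is a minimal sketch, not a model taken from any of the cited theories: the scalar world and brain states, the saturating `tanh` sensor, the `gain` parameter, and both function names are assumptions introduced purely for illustration. In the feedforward case the estimate is read out from a single sensory sample; in the closed-loop case the sensor is moved according to its own reading, and the loop iterates until the brain-world pair settles at a fixed point (a stand-in for a perceptual attractor).

```python
import math


def feedforward_percept(world, sensor_position=0.0):
    """Single-pass estimate from one sample of a saturating (hypothetical) sensor."""
    sample = math.tanh(world - sensor_position)  # sensor output saturates for large offsets
    return sensor_position + sample              # one read-out, no feedback to the sensor


def closed_loop_percept(world, sensor_position=0.0, gain=0.5, tol=1e-6, max_steps=1000):
    """Iterate a sensor-movement loop until the sensory signal vanishes.

    Returns the final sensor position (the toy 'percept') and the number
    of loop iterations used.
    """
    for step in range(max_steps):
        sample = math.tanh(world - sensor_position)
        if abs(sample) < tol:            # loop has settled at its fixed point
            return sensor_position, step
        sensor_position += gain * sample  # the reading steers the sensor
    return sensor_position, max_steps


if __name__ == "__main__":
    world_state = 3.0
    print("feedforward estimate:", feedforward_percept(world_state))
    print("closed-loop estimate:", closed_loop_percept(world_state)[0])
```

With a world state far from the initial sensor position, the single-pass estimate is limited by sensor saturation, whereas the loop keeps moving the sensor until the offset is negligible. This only illustrates the architectural difference; it makes no claim about neuronal implementation.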

Dynamic classification

BW acquisition processes can in principle follow discrete or continuous dynamics. This distinction is related to, but probably cannot be reduced to, the distinction between discrete dynamical systems and continuous dynamical systems, which is formulated only in terms of their descriptive equations.
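
For reference, the equation-level distinction mentioned above is conventionally written as follows. The symbols x, f, n, and t are generic notation assumed here and are not taken from the cited works. A discrete dynamical system is a map iterated in steps, whereas a continuous dynamical system is a flow defined by a differential equation:

```latex
\begin{align}
  x_{n+1} &= f(x_n), \qquad n = 0, 1, 2, \dots && \text{(discrete)} \\
  \frac{dx}{dt} &= f\bigl(x(t)\bigr), \qquad t \in \mathbb{R} && \text{(continuous)}
\end{align}
```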

If perception follows discrete dynamics it can be localized in space and time. That is, it has starting and ending spatiotemporal coordinates and in principle it can be put “on hold” – be paused and continued later. In contrast, if perception follows continuous dynamics it does not have starting or ending spatiotemporal coordinates and it cannot be put “on hold” – if paused it cannot be continued later. Accordingly, with discrete dynamics perception can be based on transformations between static representations (where x is termed ‘static’ if there exists a time window, short as it may be, in which x does not change) (e.g., [20]). In contrast, with continuous dynamics no static events exist whatsoever (e.g., [21, 22]).

The brain-brain (BB) axis

Figure 2: Brain states.

Along this axis information about the environment is exchanged between brains based on their brain states. BB communication is also based on channels of BW acquisition (naturally, since for each brain the other brain is part of its world). For example, auditory perception is used for speech, and visual perception for written symbols. The information carried in these channels, however, is symbolic, and it is transferred between the brain states of the two brains (e.g., B1 and B2; Figure 2). We term the fundamental items transferred in BB communication “ideas”, following Descartes’ terminology [23]. These ideas often represent “substances” perceived in the worlds of these two brains (W1 and W2). An objective world (W) can be inferred from the collective behavior of the two (or more) brains.

The most typical BB channels used for conveying ideas about external substances are those related to language:

  1. Speech. The physical channel is based on the auditory system. This communication is rhythmic, active on both sides (production and perception) and usually interactive (closing a production-perception loop).
  2. Sign language. The physical channel is typically based on the visual system. This communication is rhythmic, active on both sides (production and perception) and usually interactive (closing a production-perception loop).
  3. Script. The physical channel is typically based on the visual system. Both writing and reading are rhythmic and active, but are typically not interactive.

BB communications of ideas should follow discrete dynamics given the discrete nature of ideas. The physical communication, via BW channels, may follow either discrete or continuous dynamics, as discussed above.

BW-BB interactions

Interactions between the two axes can occur at many levels. For example (Figure 2, large arrows): BU processes (including those containing reentrant loops) may add feedforward levels to convey the internal brain state (which would form an Internal Representation, IR, in this case) to the BB channel. BU-TD acquisition processes may interact bi-directionally with the BB channel, where sites of interaction may span a range between the top of the BU hierarchy (more likely for RHT) and earlier BU junctions (more likely for BU/TD segmentation). Closed-loop acquisition schemes may prefer closed-loop interactions with the BB channel, with sites of interaction spanning those parts of the BW loops that are accessible for conscious report.

BW-BB interactions can be embodied, in each brain, via synaptic interactions between any projection from a BB-related station (e.g., speech recipient station) and a BW-related station (e.g., sensory brain areas). Two major candidates, not necessarily mutually exclusive, come to mind here. One is the set of feedback (TD) connections projecting from high-level to low-level sensory stations. Another is the efference copies, collaterals of motor-related projections that innervate sensory stations. The dense distribution of these junctions allows BW-BB interactions in virtually all processing stations. These interactions can be unidirectional, bi-directional, open-loop or closed-loop. One crucial transformation must be acknowledged here. Whereas BB communication is based on discrete signals, typically representing perceptual categories, BW communication may be continuous and non-categorical. Thus, BW-BB interactions are likely to include transformations between continuous and discrete representations.
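
The continuous-to-discrete transformation mentioned above can be sketched, in its simplest form, as nearest-prototype categorization. This is a hypothetical illustration only: the one-dimensional feature, the prototype values, and the names `PROTOTYPES` and `categorize` are assumptions, and no claim is made that the brain implements the mapping this way.

```python
# Hypothetical category prototypes on a one-dimensional continuous feature
# (e.g., a normalized stimulus intensity); labels and values are illustrative.
PROTOTYPES = {"low": 0.0, "medium": 0.5, "high": 1.0}


def categorize(value):
    """Map a continuous BW-side value to the nearest discrete BB-side label."""
    return min(PROTOTYPES, key=lambda label: abs(PROTOTYPES[label] - value))


if __name__ == "__main__":
    for value in (0.45, 0.55, 0.90):
        print(value, "->", categorize(value))
```

In this sketch the distinct continuous values 0.45 and 0.55 map to the same symbol, which illustrates that such a transformation is many-to-one: the discrete label conveyed through a BB channel carries less detail than the continuous value acquired through a BW channel.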

Empirical approaches

There are two external anchors for empirical studies addressing theories of perception: the world state and the report. Psychophysical and behavioral approaches typically monitor and manipulate these two anchors rigorously. Unlike these two anchors, the internal brain state cannot be monitored, or manipulated, in a rigorous manner. In fact, we can currently sample only a negligible fraction of the relevant neuronal activity in any given condition. Thus, empirical discrimination between available theories should probably proceed in stages, starting with well-designed behavioral experiments and continuing with prediction-based neuronal experiments.


References

  1. Uexkull, J.v., Theoretical biology. 1926, London: K. Paul, Trench, Trubner & Co. Ltd.
  2. Tishby, N. and D. Polani, Information theory of decisions and actions, in Perception-Action Cycle. 2011, Springer. p. 601-636.
  3. Friston, K., The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 2010. 11(2): p. 127-38.
  4. Treisman, A.M. and G. Gelade, A feature-integration theory of attention. Cognit Psychol, 1980. 12(1): p. 97-136.
  5. Kriegeskorte, N., Deep neural networks: a new framework for modeling biological vision and brain information processing. Annual Review of Vision Science, 2015. 1: p. 417-446.
  6. Cadieu, C.F., et al., Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Computational Biology, 2014. 10(12): p. e1003963.
  7. Poggio, T. and T. Serre, Models of visual cortex. Scholarpedia, 2013. 8(4): p. 3516.
  8. Noy, N., et al., Ignition’s glow: Ultra-fast spread of global cortical activity accompanying local “ignitions” in visual cortex during conscious visual perception. Consciousness and Cognition, 2015. 35: p. 206-224.
  9. Baars, B.J., The conscious access hypothesis: origins and recent evidence. Trends Cogn Sci, 2002. 6(1): p. 47-52.
  10. Dehaene, S., M. Kerszberg, and J.P. Changeux, A neuronal model of a global workspace in effortful cognitive tasks. Proc Natl Acad Sci U S A, 1998. 95(24): p. 14529-34.
  11. Hochstein, S. and M. Ahissar, View from the top: hierarchies and reverse hierarchies in the visual system. Neuron, 2002. 36(5): p. 791-804.
  12. Ahissar, M. and S. Hochstein, The reverse hierarchy theory of visual perceptual learning. Trends Cogn Sci, 2004. 8(10): p. 457-64.
  13. Borenstein, E. and S. Ullman, Combined top-down/bottom-up segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008. 30(12): p. 2109-2125.
  14. Edelman, G.M. and J.A. Gally, Reentry: a key mechanism for integration of brain function. Frontiers in Integrative Neuroscience, 2013. 7.
  15. Enns, J.T. and V. Di Lollo, What's new in visual masking? Trends Cogn Sci, 2000. 4(9): p. 345-352.
  16. Powers, W.T., Feedback: beyond behaviorism. Science, 1973. 179(71): p. 351-6.
  17. Ahissar, E. and E. Assa, Perception as a closed-loop convergence process. eLife, 2016. 5: p. e12830.
  18. O'Regan, J.K. and A. Noe, A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 2001. 24(5): p. 939-73; discussion 973-1031.
  19. Buhrmann, T., E.A. Di Paolo, and X. Barandiaran, A dynamical systems account of sensorimotor contingencies. Frontiers in Psychology, 2013. 4.
  20. Marr, D., Vision. 1982, San Francisco: W. H. Freeman.
  21. Van Gelder, T. and R.F. Port, It’s about time: An overview of the dynamical approach to cognition. Mind as motion: Explorations in the dynamics of cognition, 1995. 1: p. 43.
  22. Kelso, J.S., Dynamic patterns: The self-organization of brain and behavior. 1997: MIT Press.
  23. Descartes, R. and J. Cottingham, René Descartes: Meditations on First Philosophy: With Selections from the Objections and Replies. 2013: Cambridge University Press.
