# Signal-to-noise ratio in neuroscience

Curator: Simon R Schultz

Signal-to-Noise ratio (SNR) generically means the dimensionless ratio of signal power to noise power. It has a long history of being used in neuroscience as a measure of the fidelity of signal transmission and detection by neurons and synapses. This article is intended to provide a point of reference for neurophysiologists and to help steer around occasional confusion in the literature concerning the use of the signal-to-noise ratio in neuroscience.

## What is signal-to-noise ratio (in the context of a neuroscience experiment)?

The signal-to-noise ratio allows us to quantify the size of the applied or controlled signal relative to fluctuations that are outside experimental control. It has general applicability to the analysis of sensory discrimination (by nerve cells and by whole organisms) and to the performance of networks.

A common use of SNR is to compare the quality of electrophysiological recordings containing events (for instance action potentials) recorded in the presence of noise. This measure is used (although often just approximately by reading off an oscilloscope by eye) to decide whether a recording location is adequate to begin spike sorting, or whether to move the electrode on. This can be quantified by the ratio of the variances of the event signal train and the noise. Another application, which we will discuss in more detail, is the use of signal-to-noise ratio to characterise the reliability of neural information transmission. For the sake of choosing language we will examine SNR in a sensory system, although a motor system could be studied the same way.

A sensory neuroscience experiment typically involves the collection of neural responses of some kind over repeated trials in which a particular stimulus signal is presented. Over the course of an experiment, a distribution of different stimuli (signals) will be presented, and a distribution of responses to each stimulus recorded. Note that the SNR values we will discuss here in general depend upon the choice of input signal used by the experimenter - they thus characterise the combination of the system plus external stimulus, rather than being an intrinsic property of the system itself. Two classes of experiments might be undertaken, and calculating SNR for each differs slightly.

#### Discrete stimuli

Here, the stimulus takes on discrete values because of limitations on the data collecting capacity of the experiment (for instance, where 30 degree increments of the orientation of a grating stimulus are used) or because of the fundamental nature of the experiment (in a signal detection task, for instance). Let us examine the situation where the response to discrete stimulus s (one of S stimuli in total) is $$r_s\ .$$ In this case, the SNR is $SNR = \frac{P_S}{P_N} = \frac{E[r_s^2]}{\sigma_N^2}$ where $$E[.]$$ denotes the expectation over stimuli, i.e. $$1/S \sum_s r_s^2$$ if stimuli are equiprobable, and $$\sigma_N^2$$ is the noise variance, i.e. it measures the amount of variability that still occurs when the stimulus is held to a fixed value, which can be measured in practice by taking the variance of the distribution of responses across trials where that stimulus is repeated. If this noise variance does not depend upon the stimulus (but just takes a fixed value $$\sigma_N^2$$), then the SNR is simply as given above. If the noise variance depends upon which stimulus has been presented (which for instance will be true in a Poisson spiking scenario where the noise variance is equal to the mean response for the stimulus), then the average noise variance must be used, such that the signal-to-noise ratio becomes $SNR = \frac{1/S \sum_s r_s^2}{1/S \sum_s \sigma^2_N(s)} \ .$ Note that the extension to non-equal stimulus probabilities is straightforward.
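The discrete-stimulus calculation above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `discrete_snr` and the toy data are hypothetical, and it assumes that $$r_s$$ is estimated by the trial-averaged response to stimulus s and that the noise variance is estimated by the sample variance across repeated trials of that stimulus.

```python
from statistics import mean, variance


def discrete_snr(responses, probs=None):
    """SNR for discrete stimuli.

    responses: one list of single-trial responses per stimulus.
    probs: stimulus probabilities (assumed equiprobable if omitted).
    """
    n_stim = len(responses)
    if probs is None:
        probs = [1.0 / n_stim] * n_stim
    # Signal power: expectation over stimuli of the squared mean response.
    signal_power = sum(p * mean(r) ** 2 for p, r in zip(probs, responses))
    # Noise power: stimulus-averaged across-trial variance, which also
    # covers the case where the variance depends on the stimulus.
    noise_power = sum(p * variance(r) for p, r in zip(probs, responses))
    return signal_power / noise_power


# Hypothetical data: three stimuli, two trials each.
toy_responses = [[2.0, 4.0], [9.0, 11.0], [5.0, 7.0]]
snr = discrete_snr(toy_responses)
```

Passing `probs` explicitly gives the extension to non-equal stimulus probabilities mentioned above.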

In the special case of a signal detection task (two stimuli: signal present or absent), with the signal causing a change in the response $$\Delta r\ ,$$ then the signal power becomes $$P_S = (\Delta r)^2\ .$$ In this case the noise variance is still measured as the variance across experimental trials.

#### Stimuli continuously varying in time

Figure 1: Signal transmission across a toy "synapse", which here is modeled simply as a 7th order Butterworth filter with cutoff frequency at 200 Hz. A White noise input signal (lowpass filtered at 1 kHz). B Gaussian distribution of inputs in A. C Mean signal in the response (blue line) and several traces of the noise in the response (grey lines). D Probability densities of signal (filled circles) and noise (open circles). This simulation was motivated by the experimental results of de Ruyter van Steveninck and Laughlin (1996). Source code available from the curator.

An example of a continuous stimulus is found in the study of signal transmission from photoreceptors through chemical synapses to the non-spiking large monopolar cells of the blowfly Calliphora vicina (de Ruyter van Steveninck and Laughlin 1996). In this scenario, a pseudo-random contrast signal is applied to the photoreceptors through a light-emitting diode, and the graded-potential output of the neuron (at the other side of the synapse) is recorded. How are the signal and noise measured? Note that if the diode output were to be used directly to represent the signal, then the noise would have to be measured in the same units; this would mean referring the noise to the input (Horowitz and Hill, 1989) - see the Exemplars section below for one method. However, it is instead possible to measure both signal and noise power at the neural response recording site.

Figure 1 uses a computer simulation of a simplified version of the experiment to explain how this can be done. Here, a white noise stimulus waveform (a short segment of which is shown in Fig. 1A) is presented on many repeated trials. In the real experiment, it controls the contrast of an LED; here, as shown in Fig. 1B, it has zero mean and unit standard deviation. In the simulation, the signal is passed through a "synapse", modeled simply as a 7th order Butterworth filter with cutoff frequency at 200 Hz. The mean over trials of the output of this model is the average response shown by the solid line in Fig. 1C; the distribution of response values over the length of the stimulus sequence is shown by the filled circles in Fig. 1D. By squaring the modulus of the Fourier transform of this response, and normalizing appropriately, we arrive at the power spectral density (shown in Fig. 2A) of the signal $$P_S(f)$$ (where f is frequency). Several examples of the fluctuations around that mean response on individual trials are shown by the grey lines in Fig. 1C; these have a probability density (over the length of the sequence and over trials) given by the open circles in Fig. 1D. By averaging together the power spectral densities of these fluctuation traces, we obtain the noise power spectral density $$P_N(f)$$ shown in Fig. 2A. We now have everything we need to calculate the signal-to-noise ratio for this scenario. But note: to do this, we have assumed that the noise is independent from trial to trial. If this is not the case, for instance due to the presence of very slow fluctuations, our procedure will not be valid.

Figure 2: Calculation of signal-to-noise ratio for the model synapse described in Figure 1. A Power spectral densities for the presynaptic signal (white noise low-pass filtered at 1 kHz), the average postsynaptic signal, and the postsynaptic fluctuations (noise) from the simulation. B Signal-to-noise ratio at each frequency. Source code available from the curator.

The signal-to-noise ratio pertaining to each frequency is $SNR(f) = \frac{P_S(f)}{P_N(f)}$ with $$P_S(f)$$ and $$P_N(f)$$ defined as described above. For the simulation of Fig. 1, this is shown in Fig. 2B. The overall signal-to-noise ratio is the ratio of the area under the signal and noise power spectral density curves, $SNR = \frac{\int df P_S(f) }{\int df P_N(f) } \ .$
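The procedure for repeated continuous stimuli can be sketched as follows. This is not the curator's simulation code: as labeled assumptions, the "synapse" here is a simple first-order low-pass filter rather than the 7th order Butterworth filter of Fig. 1, the noise is additive, Gaussian and independent across trials, and the sample count, trial count, noise level and all function names are illustrative choices. A direct DFT is used so that the sketch needs only the standard library.

```python
import cmath
import random


def periodogram(x, dt):
    """Power spectral density estimate of a real sequence sampled every dt s.

    Direct DFT over the non-negative frequencies; the normalisation
    cancels when two spectra computed this way are divided.
    """
    n = len(x)
    psd = []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        psd.append(abs(coeff) ** 2 * dt / n)
    return psd


def lowpass(x, alpha):
    """First-order low-pass filter (assumed stand-in for the toy synapse)."""
    y, out = 0.0, []
    for v in x:
        y += alpha * (v - y)
        out.append(y)
    return out


rng = random.Random(0)
n_samples, n_trials, dt = 128, 20, 1e-3

# The same zero-mean, unit-variance stimulus is repeated on every trial.
stimulus = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
clean = lowpass(stimulus, 0.3)
# Each trial adds fresh noise, independent from trial to trial.
trials = [[c + rng.gauss(0.0, 0.1) for c in clean] for _ in range(n_trials)]

# Signal: the mean response over trials. Noise: fluctuations around that mean.
mean_resp = [sum(tr[t] for tr in trials) / n_trials for t in range(n_samples)]
p_signal = periodogram(mean_resp, dt)
noise_spectra = [
    periodogram([tr[t] - mean_resp[t] for t in range(n_samples)], dt)
    for tr in trials
]
p_noise = [sum(s[k] for s in noise_spectra) / n_trials for k in range(len(p_signal))]

freqs = [k / (n_samples * dt) for k in range(len(p_signal))]  # Hz
snr_f = [ps / pn for ps, pn in zip(p_signal, p_noise)]        # SNR(f)
snr_total = sum(p_signal) / sum(p_noise)                       # ratio of areas
```

With a finite number of trials the mean response still contains a small amount of residual noise, so estimates of this kind are slightly biased; the bias shrinks as the number of trials grows.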

## Relationship to discriminability

In the section "Discrete Stimuli" above, we discussed a signal detection scenario, where a signal shifts between two values (which we can call 0 and $$\Delta r$$), in the presence of additive Gaussian noise of variance $$\sigma_N^2\ .$$ The probability of correctly detecting that the signal has taken the higher value is going to depend only upon how far apart the two signal levels are, and the standard deviation of the noise. The discriminability of these signal levels is $$d'=\frac{\Delta r}{\sigma_N}\ ,$$ as illustrated in Figure 3. It can easily be seen from the definition of signal-to-noise ratio above that $SNR = \frac{(\Delta r)^2}{\sigma_N^2} = (d')^2\ .$

Figure 3: d' measures the distance between two normal distributions of equal variance, in units of standard deviation. An ideal (maximum likelihood) observer detects the signal if the value x observed is above the intersection of the two curves, indicated by the vertical arrow placed d'/2 from the centre of each Gaussian.

$$d'$$ is a commonly used measure of discriminability in psychophysics (Green and Swets 1966). The probability of correct detection $$P_C$$ in this scenario can be found by integrating over the noise (Green and Swets 1966, Rieke et al. 1998). $$P_C$$ is given by the sum of the probability of correct detections ('hits') and the probability of correct rejections; equivalently, it is one minus the sum of the probabilities of misses and false positives. Using the latter, we can see that for signals occurring on 50% of trials, $P_C = 1-\frac{1}{2}\frac{1}{\sqrt{2\pi}} \int_{d'/2}^\infty dx e^{-x^2/2} - \frac{1}{2}\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{d'/2} dx e^{-(x-d')^2/2}$ with the prefactors of one half before the integrals coming from the equal probability of the signal being present or absent. Thus, with a small amount of algebra, $P_C = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{d'/2} dx e^{-x^2/2} = \phi(d'/2)$ where $$\phi(x)$$ is the cumulative normal distribution function. This can also be written as $P_C = \frac{1}{2}\left[ 1+\mathrm{erf}\left(\frac{d'}{2\sqrt{2}}\right)\right] =\frac{1}{2}\left[ 1+\mathrm{erf}\sqrt{\frac{SNR}{8}}\right]$ where $$\mathrm{erf}(\cdot)$$ is the error function.

Figure 4: Percent correct detections for an ideal observer in a detection task as a function of SNR.

As the SNR approaches zero, an ideal observer can still make 50% correct discriminations - simply by guessing - and of course as the SNR becomes large, performance approaches 100% correct. At SNR=1, the percentage of correct discriminations is 69% - this is a common definition of the threshold for detection in the psychophysics literature. The relationship between SNR and percent correct for the simple signal detection task described here is shown in Figure 4.
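The relationship between SNR, $$d'$$ and percent correct for this equal-variance Gaussian detection task can be evaluated directly from the error-function expression. The following is a small sketch; the function names are illustrative and not taken from any cited source.

```python
import math


def d_prime(delta_r, sigma_n):
    """Discriminability: separation of the two means in noise standard deviations."""
    return delta_r / sigma_n


def prob_correct(snr):
    """Ideal-observer probability correct, P_C = (1/2) [1 + erf(sqrt(SNR / 8))].

    Assumes equal-variance Gaussian noise and a signal present on 50% of trials.
    """
    return 0.5 * (1.0 + math.erf(math.sqrt(snr / 8.0)))


p_chance = prob_correct(0.0)      # 0.5: pure guessing
p_threshold = prob_correct(1.0)   # about 0.69, the threshold quoted above
p_example = prob_correct(d_prime(2.0, 1.0) ** 2)  # SNR = (d')^2 = 4
```

Sweeping `snr` over a range of values reproduces a curve of the kind described for Figure 4.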

## Relationship to mutual information

Note that while it is of course possible to calculate $$SNR(f)$$ no matter how the signal and noise are distributed, we can interpret SNR as an "information" quantity (in the sense of Shannon mutual information) when the signal and noise both follow Gaussian distributions. For the case of a discrete time channel with additive Gaussian noise (see Cover and Thomas, 1991), the mutual information I can be expressed as $I = \frac{1}{2} \log_2 \left( 1+ SNR\right)$ bits per transmission. This can be derived directly as the difference between the entropy of the Signal+Noise distribution with variance $$\sigma_{Tot}^2 = \sigma_S^2 + \sigma_N^2$$ (i.e. total entropy) and the entropy of the Noise distribution with variance $$\sigma_N^2\ .$$ If an independent sample were received once every $$\Delta t$$ seconds, information would be transmitted at the rate $$\frac{1}{2\Delta t} \log_2 \left( 1+ SNR\right)$$ bits/second.
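As a quick numerical check of the discrete-time expression, the sketch below evaluates the information per sample and the corresponding rate. The function names and the example values (SNR of 3, one sample every 10 ms) are illustrative assumptions.

```python
import math


def gaussian_channel_bits(snr):
    """Mutual information per sample, I = (1/2) log2(1 + SNR), in bits."""
    return 0.5 * math.log2(1.0 + snr)


def information_rate(snr, dt):
    """Bits per second when one independent sample arrives every dt seconds."""
    return gaussian_channel_bits(snr) / dt


bits_per_sample = gaussian_channel_bits(3.0)   # 0.5 * log2(4) = 1 bit
bits_per_second = information_rate(3.0, 0.01)  # 1 bit every 10 ms
```

Because of the logarithm, doubling the SNR adds at most half a bit per sample rather than doubling the information.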

What happens when the samples are not strictly independent and discrete? Shannon (1949) derived an expression for the limiting information capacity for the continuous channel with Gaussian noise, $I = \int_0^\infty \log_2 \left( 1+SNR(f)\right) df\ .$ This information capacity provides an upper bound on the mutual information (which is reached when the signal as well as the noise follows a Gaussian distribution). This result can be appreciated as follows: the power spectrum of a Gaussian process can be seen as an ordered list of the variances of frequency components. Each of these frequencies can be viewed as a "symbol", and the information carried follows the form of the information per symbol (sample) above. Then the total information is obtained by summing over all independent symbols, and normalizing to express it as an information rate - these steps being accomplished by integrating over frequencies. See Shannon (1949) for the full derivation. A note of caution: this upper bound assumes additive Gaussian noise, which may not be a realistic assumption in many neural systems; if this does not apply, the bound may not hold (Rozell and Johnson 2005).
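Given $$SNR(f)$$ sampled on a frequency grid, the capacity integral can be approximated numerically. The sketch below uses the trapezoidal rule; the function name, the grid and the flat SNR profile are hypothetical, and the result is only an upper bound under the additive Gaussian noise assumption stated above.

```python
import math


def capacity_bits_per_second(freqs, snr_f):
    """Trapezoidal approximation of the integral of log2(1 + SNR(f)) df.

    freqs: increasing frequencies in Hz; snr_f: SNR at those frequencies.
    """
    total = 0.0
    for i in range(len(freqs) - 1):
        df = freqs[i + 1] - freqs[i]
        left = math.log2(1.0 + snr_f[i])
        right = math.log2(1.0 + snr_f[i + 1])
        total += 0.5 * (left + right) * df
    return total


# Hypothetical flat SNR of 3 across a 100 Hz band: log2(4) = 2 bits per Hz.
example_freqs = [0.0, 50.0, 100.0]
example_snr = [3.0, 3.0, 3.0]
capacity = capacity_bits_per_second(example_freqs, example_snr)
```

For a flat SNR over a bandwidth W this reduces to the familiar form W log2(1 + SNR), which gives a simple sanity check on the numerical result.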

## Common pitfalls

• Signal-to-noise ratio and evoked-to-spontaneous ratio have occasionally been confused with one another. The former is a measure of the fidelity of signaling, and for this purpose the noise is necessarily measured in terms of the variability across trials (or data segments) in which the same signal is present. The latter is a measure of the ratio of the amplitude of elicited responses to the average level of ongoing activity. The difference between these measures can be appreciated by noting that increasing the average level of spontaneous activity would not by itself affect the signal-to-noise ratio (an appropriately tuned receiver would simply subtract out such a constant offset, leaving the reliability of the signaling channel unaffected), whereas it may strongly affect the evoked-to-spontaneous ratio.
• Using the ratio of amplitudes rather than of powers of the signal and the noise. Because the amplitude ratio is the square root of the power ratio, substituting it into expressions that assume a power ratio (such as the relation to $$d'$$ or the Shannon formula) gives incorrect values and may produce misleading conclusions.
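The second pitfall can be made concrete with a small numerical sketch. The amplitude ratio of 3 and the variable names are hypothetical; the information expression is the discrete-time Gaussian channel formula given earlier.

```python
import math

amplitude_ratio = 3.0                # signal amplitude / noise standard deviation
power_ratio = amplitude_ratio ** 2   # the SNR as defined in this article: 9

# Bits per sample from the Gaussian channel formula with the correct power ratio.
bits_correct = 0.5 * math.log2(1.0 + power_ratio)
# The same formula with the amplitude ratio mistakenly substituted for the SNR.
bits_mistaken = 0.5 * math.log2(1.0 + amplitude_ratio)
```

Here the mistaken substitution gives exactly 1 bit per sample, while the correct power ratio gives about 1.66 bits, so the error is not a harmless rescaling.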

## SNR measurements in neuroscience - some exemplars

• Signal-to-noise considerations are often treated qualitatively in the literature (see for example Barlow and Levick 1969) as the motivation for analyses using approaches such as signal detection theory. However, cases where the SNR has been determined quantitatively are less common.
• Bialek et al. (1991) measured SNR in the movement-sensitive H1 neuron of the blowfly Calliphora erythrocephala, using an approximately white noise random velocity stimulus as a signal. One clever aspect of this study was the use of a reconstruction approach to calculate the noise power spectral density referred to the input. This approach meant that the SNR could be computed as the ratio between the actual stimulus power spectral density and the input-referred noise power density, at each frequency band. They then computed the rate of information transmission via the Shannon formula.
• The example used in the section above for continuous stimuli is drawn from the work of de Ruyter van Steveninck and Laughlin (1996). Recording from the non-spiking large monopolar cells of the blowfly eye, they measured signal and noise power spectral density from the graded potential responses, and thus calculated the SNR in each frequency band (and, via Shannon's formula, the information rate). Since the stimulus directly modulated the photoreceptor, calculating SNR this way allowed the rate of information transmission across a chemical synapse to be directly measured. A more recent paper by Simmons and de Ruyter van Steveninck (2005) applied a similar approach to the synapse between two classes of ocellar neurons in the locust.
• In another line of research, the effect of the neuromodulator acetylcholine on signal-to-noise characteristics of cortical neuron activity was studied (Sato et al. 1987). However, the quantity measured might be better described as evoked-to-spontaneous ratio, as the denominator in $$S/N$$ (they actually show $$S/(S+N)$$) reflected the average level of spontaneous activity rather than trial to trial variability. A similar approach was taken by Sherman and Guillery in their studies of bursting in the lateral geniculate nucleus (reviewed in Sherman and Guillery 2002). See also commentary in Disney and Schultz (2004).
• Signal-to-noise ratio has been used as a metric to characterise the performance of neural networks. An example is provided by Dayan and Willshaw (1991), who examined the learning rules that result from maximising the SNR for a class of associative matrix memories.
• Zohary, Shadlen and Newsome (1994) computed the SNR for a population of cells, under a pooling model that assumes that a decoder sums spikes from all cells in the pool without reference to their origin (but see Reich et al. 2001 for evidence that this assumption may lead to missing some information content). A Pearson noise correlation coefficient of 0.12 was assumed, based on the average of measurements from their recordings from cortical area MT. The study found that the noise correlation results in a "diminishing return" to the SNR due to saturation as more neurons are added to the pool. See also discussion in Rieke et al. (1998).
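The "diminishing return" from pooling correlated neurons can be illustrated with a simple calculation. This is a sketch under stated assumptions, not the exact computation of Zohary et al.: all cells are taken to have the same mean signal and noise variance, every pair shares the same noise correlation coefficient, the decoder simply sums the responses, and SNR is expressed as a power ratio as elsewhere in this article. Under those assumptions, summing N responses scales the signal power by N squared and the noise variance by N(1 + (N - 1)ρ). The function name `pooled_snr` is hypothetical.

```python
def pooled_snr(snr_single, n_cells, rho):
    """Power SNR of the summed response of n_cells identical neurons.

    snr_single: SNR of one neuron; rho: uniform pairwise noise correlation.
    Summed signal power grows as n_cells**2, summed noise variance as
    n_cells * (1 + (n_cells - 1) * rho).
    """
    return n_cells * snr_single / (1.0 + (n_cells - 1) * rho)


rho = 0.12  # the noise correlation quoted above
pool_sizes = [1, 10, 100, 1000]
gains = [pooled_snr(1.0, n, rho) for n in pool_sizes]
ceiling = 1.0 / rho  # limit of the SNR gain as the pool grows without bound
```

With uncorrelated noise (ρ = 0) the SNR grows linearly with pool size, whereas with ρ = 0.12 the gain in this simplified model saturates at roughly 8.3 times the single-cell SNR.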

## References

• HB Barlow and WR Levick (1969). Three factors limiting the reliable detection of light by retinal ganglion cells of the cat. J. Physiol. 200:1-24.
• W Bialek, F Rieke, RR de Ruyter van Steveninck and D Warland (1991). Reading a neural code. Science 252:1854-57.
• TM Cover and JA Thomas (1991). Elements of Information Theory. John Wiley and Sons, New York, USA.
• P Dayan and DJ Willshaw (1991). Optimising synaptic learning rules in linear associative memories. Biol. Cybernetics 65:253-265.
• AA Disney and SR Schultz (2004). Hallucinations and acetylcholine: Signal or noise? Behavioral and Brain Sciences 27(6):790-791.
• DM Green and JA Swets (1966). Signal Detection Theory and Psychophysics. 1988 Reprint Edition, Peninsula Publishing, Los Altos CA, USA.
• P Horowitz and W Hill (1989). The Art of Electronics. Second Edition. Cambridge University Press, Cambridge, UK.
• DS Reich, F Mechler and JD Victor (2001). Independent and redundant information in nearby cortical neurons. Science 294:2566-8.
• F Rieke, D Warland, RR de Ruyter van Steveninck and W Bialek (1998). Spikes: exploring the neural code. MIT Press, Cambridge, USA.
• CJ Rozell and DH Johnson (2005). Examining methods for estimating mutual information in spiking systems. Neurocomputing 65:429-34.
• RR de Ruyter van Steveninck and SB Laughlin (1996). The rate of information transfer at graded-potential synapses. Nature 379:642-645.
• H Sato, Y Hata, H Masui and T Tsumoto (1987). A functional role of cholinergic innervation to neurons in the cat visual cortex. J. Neurophysiol. 58(4):765-780.
• CE Shannon (1949). Communication in the presence of noise. Proceedings of the IRE, 37(1):10-21. Reprinted in Proceedings of the IEEE, 86(2):447-458, Feb. 1998.
• SM Sherman and RW Guillery (2002). The role of the thalamus in the flow of information to the cortex. Phil. Trans. R. Soc. Lond. B 357:1695-1708.
• PJ Simmons and R de Ruyter van Steveninck (2005). Reliability of signal transfer at a tonically transmitting, graded potential synapse of the locust ocellar pathway. J. Neurosci. 25(33):7529-37.
• E Zohary, MN Shadlen and WT Newsome (1994). Correlated discharge rate and its implication for psychophysical performance. Nature 370:140-143.
