# Source localization


Curator: Rey R. Ramírez

Neuroelectromagnetic source imaging (NSI) is the scientific field devoted to modeling and estimating the spatiotemporal dynamics of the neuronal currents throughout the brain that generate the electric potentials and magnetic fields measured with noninvasive or invasive electromagnetic (EM) recording technologies (Ha93; Ba01; Ha05). Unlike the images produced by fMRI, which are only indirectly related to neuroelectrical activity through neurovascular coupling (e.g., the BOLD signal), the current source density or activity images that NSI techniques generate are direct estimates of electrical activity in neuronal populations. In the past few decades, researchers have developed better source localization techniques that are robust to noise and that are well informed by anatomy, neurophysiology, magnetic resonance imaging (MRI), and realistic volume conduction physics (Io90; Ge95; Fu99; Gr01; Vr01; Li02; Wo03). State-of-the-art algorithms can localize many simultaneously active sources and even determine their variable spatial extents and single-trial dynamics.

## Neuronal origin of the electromagnetic signals

The measured EM signals that are generated by the brain are thought to be primarily due to ionic current flow in the apical dendrites of cortical pyramidal neurons and the associated return currents throughout the volume conductor (Ok93; Ok97). Neurons that have dendritic arbors with closed field geometry (e.g., interneurons) produce no externally measurable signals (Hu69). However, some non-pyramidal neurons, such as the Purkinje neurons, do produce externally measurable signals (Ni71). Although still controversial, it is believed that some source localization methods can accurately image the activity of deep brain structures, such as the basal ganglia, amygdala, hippocampus, brain stem, and thalamus (Ri91; Vo96; Te97; Li99). However, single neurons produce weak fields, and if the current flow is spatiotemporally incoherent (e.g., a local desynchronization), the fields end up canceling. Thus, EM recordings are particularly suited for studying spatiotemporally coherent and locally synchronized collective neural dynamics. There is a limit to how much current density a patch of cortex can support, and this limit seems to be similar across species (Ok93). Thus, large-amplitude fields/potentials entail distributed synchronized oscillations.

## Measurement modalities

All the EM recording technologies share the important benefit of high temporal resolution (< 1 ms). However, they measure different, yet closely related, physical quantities at different spatial scales that are generated by the current density source distribution located throughout the brain, heart, muscles, eyes, and the environment. In principle, the inverse methods described here can be applied to data acquired using magnetoencephalography (MEG), scalp electroencephalography (sEEG), intracranial EEG (iEEG) or electrocorticography (ECoG), or combinations of these measurement modalities, as long as the appropriate realistic forward models are available to compute the gain vectors needed for localization (Wo03; Wo04; Wo06).

### Magnetoencephalography (MEG)

In MEG, an array of sensors is used to measure components of the magnetic vector field surrounding the head (Ha93; Vr01). The first magnetoencephalogram was recorded with a single coil (Co68); since the invention of the superconducting quantum interference device (SQUID) (Zi71), MEG data have been recorded with SQUIDs (Co72). Current state-of-the-art systems have many sensors (>275) organized as a helmet-shaped array of magnetometers and/or gradiometers (planar or radial). Distant reference sensors are used to eliminate noise (Vr01). Magnetically shielded rooms have also been improved, and active shielding technology has been developed to further reduce the effects of sources outside the shielded room and to reduce the need for heavily shielded rooms.

### Electroencephalography (EEG)

In sEEG, an array of electrodes is placed on the scalp surface to measure the electric potential scalar field relative to a reference electrode (Ni05; Nu05). EEG recording technology has progressed greatly since the first human recordings by Hans Berger in 1924 (Be29) and the later work by Edgar Adrian (Ad34). Modern state-of-the-art EEG systems use caps with as many as 256 active electrodes. Some EEG systems can be used for simultaneous EEG/MEG or EEG/fMRI recordings. Researchers are also working on wireless acquisition and on dry electrode technologies that do not use conductive gel.

### Invasive electrical recordings

In patients undergoing iEEG or ECoG, grid or strip electrode arrays are neurosurgically placed to record the electric potential closer to the sources and undistorted by the skull (La03; Le05; Ca06). Grid electrode arrays are commonly rectangular in shape and have inter-electrode distances of about 1 cm or less. Invasive measurements of the local field potential (LFP) can be recorded by depth electrodes, electrode grids, and laminar electrodes (Pe54; Ul01; Sa03). Invasive recordings are usually not localized because the LFP is treated as strictly local. However, a local mixing process also generates the LFPs, and electrodes can pick up far-field potentials of strong sources. Thus, inverse models are still desirable.

Figure 1: Illustration of the forward and inverse problems. The forward problem (see volume conduction article) can be summarized as the problem of predicting the electric potential or magnetic field vector that would be externally measured at the sensors if some source was active inside the brain (e.g., a unit dipole). The inverse problem (the topic of this article) can be summarized as the problem of estimating the current density or activity values of the source that generated a measured electric potential or magnetic field vector. In this example, the measurement vector (shown on the left) is actually a scalp map of an independent component learned from the unaveraged data measured during a visually-cued reaching task. The interpolated color encodes the potential on the scalp surface. An estimated source of this component is shown on the right. Because of non-uniqueness, an infinite number of such estimates could have generated the electric potential vector on the left. Figure modified from Ra07b.

## An overview of all the models

This section presents a quick overview of several aspects of modeling that directly affect source estimation, even though they are not technically inverse modeling per se.

### Source modeling

The source model refers to the mathematical model used to approximate the current density. The source model most often used is the equivalent electric dipole model, which approximates the primary current density within a patch or volume as a point source $$\mathbf{j^p(r')}=\mathbf{q}\delta(\mathbf{r'}-\mathbf{r})\ ,$$ where $$\delta(\mathbf{r})$$ is the Dirac delta function and $$\mathbf{q}=\textstyle \int \mathbf{j^p(r')}\mathbf{dr'}$$ is the dipole moment (Ha93; Ba01). Other source models, such as the blurred dipole (i.e., a dipole with surrounding monopoles), the quadrupole, or higher multipolar expansions, can also be used for forward modeling. In addition, an alternative irrotational current density source model, which excludes the solenoidal component of the current density, can be used to formulate the forward problem as a mapping from the scalar field of local field potentials throughout the brain to the scalar field of measured potentials of EEG or ECoG. This approach reduces the number of unknowns, thereby making the inverse problem less underdetermined. This irrotational source model approach for forward/inverse modeling is called ELECTRA (Gr00). Once the LFP vector is estimated, the irrotational current density vector can be computed by taking its gradient. Although we will assume a dipole source model for the rest of this article, the linear system of equations of ELECTRA can also be solved with many of the inverse methods explained here.

### Anatomical modeling

For subject-specific anatomical modeling, an MRI scan must be obtained from each individual. The T1 scan is segmented into white and gray matter volumes from which topologically correct white and gray matter cortical surfaces are tessellated. Sources can then be constrained to reside only in the gray matter volume by constructing a volumetric source space (i.e., a set of coordinates where candidate dipoles are allowed to have nonzero amplitudes). However, the cortical gray matter volume is usually modeled as a surface since NSI technology probably does not have the spatial resolution to discriminate sources at different cortical layers. With this surface approach, dipole orientations can easily be constrained to point normal to the cortical surfaces (i.e., in the direction of the apical dendrites of pyramidal neurons). Non-cortical structures can be modeled as volumetric source sub-spaces without dipole orientation constraints. MRI scans are also used to segment other head subvolumes (e.g., scalp/skin, skull, CSF, and brain envelope volume) and to tessellate their associated surfaces (e.g., scalp/skin, inner and outer skull, and brain envelope surfaces), which are used for forward modeling of volume conduction. If a subject's MRI is not available, a standardized MRI brain atlas (e.g., MNI) can be warped to optimally fit the subject's anatomy based on the subject's digitized head shape. The warped atlas can then be used as a standardized volumetric source space and for standardized forward modeling.

### Coordinate transformations

To compute the forward EM signals that would be produced by a unit dipole somewhere in the brain, both the sensor and dipole positions and orientations must be expressed in the same coordinate system. This is usually done by transforming (i.e., translating and rotating based on fiducial markers) the sensor positions and orientations to the coordinate system of the MRI where the dipoles are modeled. Errors in finding the true fiducial marker coordinates in the MRI space can result in poor co-registration. For better co-registration an automatic algorithm (e.g., iterative closest point (ICP) matching) can be used to match the digitized head-shape or EEG electrode surface to the skin surface segmented from the MRI. Even with optimal co-registration, one must keep in mind that some of the modeled meshes may be misaligned with the true brain structures not only due to transformation errors but also due to intrinsic structural errors in the MRI data sets (e.g., due to susceptibility artifacts and/or mistaken tissue segmentation).

### Forward modeling

In order to localize the primary current density one has to model the EM signals produced by both the primary (i.e., impressed) and the secondary (i.e., volume, return) current density throughout the head volume conductor, which in reality has an inhomogeneous and anisotropic conductivity profile. Analytic MEG forward solutions can be computed if the volume conductor is approximated by an isotropic sphere or a set of overlapping spheres (Sa87; Hu99). The same is true for EEG but using concentric spherical shells with different isotropic conductivities. The level of realism of the forward volume conduction head model is particularly important since the measured signals have significant contributions from volume currents. The different volume conductor materials (e.g., scalp, skull layers, CSF, gray matter, white matter) have different conductivities and different levels of anisotropy. These need to be modeled in great spatial detail to compute a realistic basis of gain vectors used to represent the data. Much progress has been made toward realistic EM forward modeling using numerical techniques such as the Boundary Element Method (BEM) and the Finite Element Method (FEM) (Ha87; Ak04; Wo06). BEM assumes homogeneous and isotropic conductivity throughout the volume of each tissue shell (e.g., brain, CSF, skull, skin) but not across the boundaries of the shells. FEM usually also assumes homogeneity and isotropy within each tissue type, but in contrast to BEM, can also be used to model the anisotropy of white matter (using DTI) and that of the skull's spongiform and compact layers (using geometric models). Although realistic modeling exploits any available subject-specific information from Magnetic Resonance Imaging (MRI) (e.g., T1, T2, PD, DTI), standardized BEM or FEM head models can be used as a first approximation for subjects without an MR scan (Fu02; Da06).
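As a concrete illustration of the analytic spherical model, the single-sphere MEG forward solution (Sa87) can be sketched in a few lines of NumPy; the sensor position, dipole position, and dipole moment below are arbitrary example values, not taken from the article:

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum permeability (T*m/A)

def sarvas_field(r, r0, q):
    """Magnetic field at sensor position r of a current dipole q at r0
    inside a conducting sphere centered at the origin (Sarvas 1987)."""
    a_vec = r - r0
    a = np.linalg.norm(a_vec)
    rn = np.linalg.norm(r)
    F = a * (rn * a + rn**2 - np.dot(r0, r))
    grad_F = (a**2 / rn + np.dot(a_vec, r) / a + 2 * a + 2 * rn) * r \
           - (a + 2 * rn + np.dot(a_vec, r) / a) * r0
    q_x_r0 = np.cross(q, r0)
    return MU0 / (4 * np.pi * F**2) * (F * q_x_r0 - np.dot(q_x_r0, r) * grad_F)

sensor = np.array([0.0, 0.0, 0.12])   # 12 cm from sphere center (example)
dipole_pos = np.array([0.0, 0.0, 0.07])

# A radial dipole in a sphere is magnetically silent:
b_radial = sarvas_field(sensor, dipole_pos, np.array([0.0, 0.0, 1e-8]))

# A tangential dipole at the same location is not:
b_tan = sarvas_field(sensor, dipole_pos, np.array([1e-8, 0.0, 0.0]))
```

Evaluating this formula for unit dipoles at every source point yields the columns of the gain matrix used throughout the inverse methods below.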

### Inverse modeling

The goal of inverse modeling, the topic of this paper, is to estimate the location and strengths of the current sources that generate the measured EM data. This problem of source localization is an ill-posed inverse problem. There are an infinite number of solutions that explain the measured data equally well because silent sources (i.e., sources that generate no measurable EM signals) exist, and these can always be added to a solution without affecting the data fit (He53). Because of this nonuniqueness, a priori information is needed to constrain the space of feasible solutions (Sa87; Ha93). Nonuniqueness is handled by making assumptions about the nature of the sources (e.g., number of sources, anatomical and neurophysiological constraints, prior probability density functions, norms, smoothness, correlation, covariance models, sparsity, diversity measures, spatial extent constraints, etc.). Thus, the accuracy and validity of the estimates depend to some extent on the biological correctness of the assumptions and priors adopted in our models. This is why priors should not only be informed by neurophysiology domain knowledge, but should also be flexible and adaptive to particular data sets. The rest of this paper focuses on different inverse modeling approaches. In such a short space it would be impossible to mention all inverse algorithms that exist. Thus, we focus on three basic approaches that encompass most algorithms: 1) non-linear dipole or source fitting; 2) spatial scanning or beamforming; and 3) source-space based algorithms explained within a general Bayesian framework.

## Data pre-processing

Before localization, data is usually preprocessed with uni- and multivariate techniques. The measured univariate time-series of each electric or magnetic recording channel is typically represented as a row vector, and is usually cleaned by mean subtraction, high- and low-pass filtering, detrending, harmonic analysis, and/or downsampling. Signals can also be transformed to the time-frequency/scale domains with Fourier/wavelet methods, or can be band-pass filtered and Hilbert transformed to extract the instantaneous amplitude and phase (Ta96; Se99; Gr01; Fr07). Frequency-specific noise can easily be filtered out in the frequency or time domains. Filtering or time-frequency analysis is also useful for studying wavelike activity and oscillations at specific frequency bands: the slow oscillation (<1 Hz), delta (1-4 Hz), theta (5-8 Hz), alpha (8-12 Hz), mu (8-12 Hz and 18-25 Hz), spindles (~14 Hz), low beta (13-20 Hz), high beta (20-30 Hz), gamma (30-80 Hz), high gamma or omega (80-200 Hz), ripples (~200 Hz), and sigma bursts (~600 Hz).
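The band-pass/Hilbert step can be sketched as follows; the sampling rate, signal, and filter order are assumed example values for a simulated single-channel alpha-band recording:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

# Simulated one-channel recording: a 10 Hz "alpha" oscillation in noise.
fs = 250.0                               # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
t = np.arange(0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)

# Zero-phase band-pass filter in the alpha band (8-12 Hz).
b, a = butter(4, [8.0, 12.0], btype="band", fs=fs)
x_alpha = filtfilt(b, a, x)

# The analytic signal yields instantaneous amplitude and phase.
analytic = hilbert(x_alpha)
amplitude = np.abs(analytic)
phase = np.angle(analytic)
```

The same recipe, repeated per band, gives the band-limited amplitude and phase time-series listed above.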

The continuously and simultaneously acquired time-series of all MEG, EEG, and/or ECoG channels can be concatenated to form a multivariate data matrix $$\mathbf{B}\in\Re^{{d_m}\times{d_t} }\ ,$$ where $$d_m$$ is the number of measurement channels and $$d_t$$ is the number of time points. Correlated noise, which is generated by non-brain sources such as the heart, eyes, and muscles, can be filtered out using blind source separation, subspace methods, and machine learning methods (Ma97; Uu97; Ma02; Ta02; Pa05; Zu07). Different linear transformations that use only the second-order statistics of the data can be applied to the data for noise removal (e.g., signal-space projection (SSP), Principal Component Analysis (PCA)). The denoised $$\mathbf{B}$$ matrix can be cut into epochs time-locked to an event (e.g., stimulus onset) for single trial analysis or averaged across epochs to extract the event-related potential and/or field (ERP/F) (Lu05). The ERP/F can then be localized by many different inverse methods. Alternatively, blind source separation algorithms that use higher order statistics or temporal information (e.g., infomax/maximum-likelihood independent component analysis (ICA) or second-order blind identification (SOBI)), can be applied to the entire unaveraged multivariate time series to learn a data representation basis of sensor mixing vectors (associated with maximally independent time-courses) that can be localized separately, and to reject non-brain components (i.e., denoising) (Be95; Ma97; Ma02; Ta02). Time-frequency analysis can be performed on the activation functions of each component, and the source event related spectral perturbation (ERSP), or inter-trial coherence (ITC) can be computed (De04). Convolutive ICA algorithms in the time and time-frequency domains can model more complex spatiotemporal source dynamics (An03; Dr07; Le07).
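The epoching and averaging step can be sketched with synthetic data; the matrix sizes, event latencies, and window lengths below are arbitrary assumptions:

```python
import numpy as np

# Synthetic continuous data matrix B (channels x time), as in the text.
rng = np.random.default_rng(0)
d_m, d_t = 32, 5000
B = rng.standard_normal((d_m, d_t))

events = np.array([500, 1500, 2500, 3500])  # sample indices of stimulus onsets
pre, post = 100, 300                        # samples kept before/after each event

# Cut time-locked epochs and average them to obtain the ERP/F.
epochs = np.stack([B[:, e - pre:e + post] for e in events])  # (n_epochs, d_m, pre+post)
erp = epochs.mean(axis=0)                                    # (d_m, pre+post)
```

The averaged `erp` matrix (or any single epoch) can then be passed to the inverse methods described below.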

Regardless of any transformation or averaging on the data, the data to be simultaneously solved can be represented as a $${d_m}\times{d_v}$$ real or complex matrix $$\mathbf{B}\ ,$$ which contains $$d_v$$ measurement column vectors. For example, if $$\mathbf{B}$$ is a time-series, $${\mathbf{b}}_\tau = (b_{1\tau} ,...,b_{d_m \tau} )^T$$ is the $$d_m$$-dimensional measurement column vector at time $$\tau\ .$$ But $$\mathbf{B}$$ need not be a time-series; it can also be any given set of vectors obtained from the data that benefit from simultaneous source localization (e.g., a subspace of the data). When $$d_v=1\ ,$$ the single measurement problem is recovered. This case is also used for localizing individual sensor maps obtained from a decomposition of the data (e.g., ICA).

## Parametric dipole modeling

The goal of inverse methods is to estimate the location and strengths of the current sources that generate $$\mathbf{B}\ .$$ However, because of nonuniqueness a priori assumptions are needed to constrain the space of feasible solutions. The most common assumption is that the measurements were generated by a small number of brain regions that can be modeled using equivalent dipoles. These algorithms minimize a data-fit cost function defined in a multidimensional space with dimension equal to the number of parameters. Once the algorithm converges to a local minimum in the multidimensional space of parameters, the optimal parameters (each corresponding to a dimension) are found. The algorithm estimates five nonlinear parameters per dipole: the x, y, and z dipole position values, and the two angles necessary to define a dipole orientation in 3D space. However, in the MEG spherical volume conductor model only one angle (on the tangent space of the sphere) is necessary because the radial dipole component is silent. The amplitudes are linear parameters estimated directly from the data as explained below for the cases of uncorrelated or correlated noise.

### Uncorrelated noise model

Parametric dipole fitting algorithms minimize a data-fit cost function such as the Frobenius norm of the residual, $\tag{1} \min_{\mathbf{s}} ||\mathbf{B}-\mathbf{\hat B}||_F^2=||\mathbf{B}-\mathbf{L_s\hat J_s}||_F^2=||\mathbf{B}-\mathbf{L_sL_s^{\dagger}B}||_F^2=||(\mathbf{I}-\mathbf{L_sL_s^{\dagger}})\mathbf{B}||_F^2=||\mathbf{P_{L_s}^\perp}\mathbf{B}||_F^2$

where $$\mathbf{s}$$ refers to all the nonlinear parameters, $$\mathbf{s}=\lbrace \mathbf{r}_i , \mathbf{\theta}_i \rbrace\ ,$$ which are the positions and orientations of all the dipoles in the model that are optimized to minimize the data fit cost (Sc91; Ba01). $$\mathbf{\hat B}$$ is the explained forward data given by the generative linear model$\mathbf{\hat B}=\mathbf{L_s\hat J_s}\ ,$ where $$\mathbf{\hat J_s}=\mathbf{L_s^{\dagger}B}\in\Re^{{d_s}\times{d_v} }$$ is the estimated current matrix containing the moments of $$d_s$$ dipoles. The ith row vector of $$\mathbf{\hat J_s}$$ contains the moments of a dipole located at position $$\mathbf{r}_i$$ with orientation $$\mathbf{\theta}_i\ .$$ $$\mathbf{L_s}$$ is the lead-field matrix containing $$d_s$$ $$d_m$$-dimensional column vectors called gain vectors. They are computed for $$d_s$$ unit dipoles located at different positions $$\mathbf{r}_i=(x_i,y_i,z_i)^T$$ and with different orientations $$\mathbf{\theta}_i=(\phi_i,\omega_i)^T\ .$$ The dipole orientations, however, can be obtained linearly if we optimize only the positions and include the gain vectors of all 3 dipole components pointing in the (x,y,z) directions, so that $$\mathbf{L_s}$$ is a $$d_m$$ by $$3d_s$$ matrix and $$\mathbf{J_s}$$ is a $$3d_s$$ by $$d_v$$ matrix. $$\mathbf{I}$$ is the $$d_m$$-dimensional identity matrix, $$\mathbf{L_s}^{\dagger}$$ is the pseudoinverse of $$\mathbf{L_s}\ ,$$ and $$\mathbf{P_{L_s}^\perp}$$ is the orthogonal projection operator onto the null space of $$\mathbf{L_s}\ .$$ Note that in this approach the gain matrix needs to be recomputed at each iteration for any given parameters. Also, note that this approach is equivalent to a maximum likelihood estimate of the parameters using a Gaussian likelihood noise model.
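A minimal numerical sketch of the cost in Eq. (1), using a simulated gain matrix and noiseless data so that the residual at the true nonlinear parameters vanishes (all sizes are arbitrary example values):

```python
import numpy as np

# Simulated candidate gain matrix L_s and noiseless data B = L_s J.
rng = np.random.default_rng(1)
d_m, d_s, d_v = 64, 3, 10          # channels, dipole components, time points
L_s = rng.standard_normal((d_m, d_s))
J_true = rng.standard_normal((d_s, d_v))
B = L_s @ J_true

# Linear amplitude estimate J_hat = pinv(L_s) B, as in Eq. (1).
J_hat = np.linalg.pinv(L_s) @ B

# Residual cost ||P_perp B||_F^2 = ||B - L_s J_hat||_F^2.
residual = B - L_s @ J_hat
cost = np.linalg.norm(residual, "fro") ** 2
```

In a full fitting algorithm, this cost is evaluated inside the nonlinear optimizer loop, with `L_s` recomputed for each candidate set of positions and orientations.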

### Correlated noise model

In the presence of correlated noise, a modified cost function can be minimized (Sa87). If $$\mathbf{C}_\varepsilon$$ is the noise covariance matrix, which can be decomposed as $$\mathbf{C}_\varepsilon=\mathbf{V}\Lambda^2 \mathbf{V}^T\ ,$$ where $$\mathbf{V}$$ is an orthonormal matrix, and $$\Lambda=diag(\lambda _1 , \ldots ,\lambda_{d_m})\ ,$$ then the new cost function can be expressed as $\tag{2} \min_{\mathbf{s}} \left\| {{\mathbf{PB} } - {\mathbf{P\hat B} } } \right\|_F^2 = \left\| {{\mathbf{PB} } - {\mathbf{PL_s} } {\mathbf{\hat J_s} } } \right\|_F^2\ ,$

where $$\mathbf{P}= \mathbf{V}\Lambda^{-1}\mathbf{V}^T\ .$$
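The whitening operator $$\mathbf{P}$$ can be verified numerically with a simulated noise covariance; after applying it, the noise covariance becomes the identity, recovering the uncorrelated case of Eq. (1):

```python
import numpy as np

# Simulated symmetric positive-definite noise covariance C_eps.
rng = np.random.default_rng(2)
d_m = 16
A = rng.standard_normal((d_m, d_m))
C_eps = A @ A.T + d_m * np.eye(d_m)

# Eigendecomposition C_eps = V diag(lambda^2) V^T, then P = V Lambda^{-1} V^T.
lam2, V = np.linalg.eigh(C_eps)
P = V @ np.diag(1.0 / np.sqrt(lam2)) @ V.T

# Whitened noise covariance: P C_eps P^T should equal the identity.
C_white = P @ C_eps @ P.T
```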


### Global minimization

These cost functions are usually minimized using nonlinear optimization algorithms (e.g., Nelder-Mead downhill simplex, Levenberg-Marquardt). Unfortunately, when the number of dipoles is increased (e.g., $$d_s>2$$), the cost suffers from many local minima. Furthermore, it should be noted that by adding a spatial term to the data-fit cost function dipoles can be constrained to reside as close as desired to the gray matter volume. However, such spatial penalties can introduce more local minima problems. Global minimization can be achieved for a time-series containing maybe up to seven dipoles by using more computationally intensive algorithms such as simulated annealing, multistart simplex algorithms, or genetic algorithms (Hu98; Uu98). Alternatively, instead of selecting a point estimate, one can use Markov Chain Monte Carlo algorithms to make Bayesian inferences about the number of sources and their spatial extents and to compute probabilistic maps of activity anatomically constrained to gray matter (Sc99; Be01). It is important to distinguish the cost function from the optimization algorithms. Although the standard costs for dipole fitting have many local minima, other costs, such as the negative log marginal likelihood (see sparse Bayesian learning section), have fewer local minima, and can also be minimized with nonlinear dipole fitting algorithms.

## Spatial scanning and beamforming

An alternative approach to the ill-posed bioelectromagnetic inverse problem is to independently scan for dipoles within a grid containing candidate locations (i.e., source points). Here the goal is to estimate the activity at a source point or region while suppressing the crosstalk from other regions, so that they affect the estimate at the region of interest as little as possible.

### Matched filter

The simplest spatial filter, a matched filter, is obtained by normalizing the columns of the leadfield matrix and transposing this normalized dictionary. The spatial filter for location s is given by $\tag{3} {\mathbf{W}}_s^T = \frac{{ {{\mathbf{L} }_s^T } } }{ {\left\| {{\mathbf{L} }_s } \right\|_F } }\ .$ This approach essentially projects the data onto the column vectors of the dictionary. Although this guarantees that, when only one source is active, the absolute maximum of the estimate corresponds to the true source location, this filter is not recommended since the single-source assumption is usually not valid, and since the spatial resolution of the filter is low given the high correlation between dictionary columns. This approach can be extended to fast recursive algorithms, such as matching pursuit and its variants, which sequentially project the data or residual onto the unused dictionary columns to obtain fast suboptimal sparse estimates.
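A sketch of the matched-filter scan of Eq. (3) on a simulated leadfield; with a single active source and noiseless data, the Cauchy-Schwarz inequality guarantees that the absolute maximum of the scan lands on the true location (the grid size and source index are arbitrary):

```python
import numpy as np

# Simulated leadfield dictionary: one gain vector per candidate location.
rng = np.random.default_rng(3)
d_m, d_g = 64, 500
L = rng.standard_normal((d_m, d_g))

# Matched filter: normalize each column, then project the data onto it.
W = L / np.linalg.norm(L, axis=0)

b = L[:, 123]              # data generated by source 123 alone (noiseless)
scan = W.T @ b             # matched-filter map over the whole grid
best = np.argmax(np.abs(scan))
```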

### Multiple signal classification (MUSIC)

The MUSIC cost function is given by $\tag{4} M_s = \frac{ {\left\| {\left( {{\mathbf{I} } - {\mathbf{U} }_s {\mathbf{U} }_s^T } \right){\mathbf{L} }_s } \right\|_2^2 } }{ {\left\| {{\mathbf{L} }_s } \right\|_2^2 } } = \frac{ {\left\| {P_{{\mathbf{U} }_s }^ \bot {\mathbf{L} }_s } \right\|_2^2 } }{ {\left\| {{\mathbf{L} }_s } \right\|_2^2 } }\ ,$ where $${\mathbf{B}} = {\mathbf{USV}}^T$$ is the singular value decomposition (SVD) of the data, $${\mathbf{U}}_s$$ is a matrix with the first $$d_s$$ left singular vectors that form the signal subspace, and $$\mathbf{L}_s$$ is the gain vector for the dipole located at $$r_i$$ and with orientation $$\theta_i$$ (obtained from anatomy or using the generalized eigenvalue decomposition) (Mo98). $$P_{\mathbf{U}_s }^ \perp$$ is an orthogonal projection operator onto the data noise subspace. The MUSIC map is the reciprocal of the cost function at all locations scanned. This map can be used to guide a recursive parametric dipole fitting algorithm.
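A sketch of the MUSIC scan of Eq. (4) on simulated noiseless data with two fixed dipoles; the grid size, source indices, and dimensions are arbitrary example values:

```python
import numpy as np

# Simulated grid of candidate gain vectors and data from two sources.
rng = np.random.default_rng(4)
d_m, d_g, d_t, d_s = 32, 200, 100, 2
L = rng.standard_normal((d_m, d_g))
active = [20, 150]
B = L[:, active] @ rng.standard_normal((d_s, d_t))

# Signal subspace: first d_s left singular vectors of the data.
U, _, _ = np.linalg.svd(B)
Us = U[:, :d_s]

# MUSIC cost: normalized projection of each gain vector onto the
# noise subspace (Eq. 4); the map is the reciprocal of the cost.
proj = L - Us @ (Us.T @ L)                  # P_perp_{Us} L
cost = np.sum(proj**2, axis=0) / np.sum(L**2, axis=0)
music_map = 1.0 / cost
```

The two smallest costs (largest map values) fall at the true source indices, which is how the map can seed a recursive dipole-fitting algorithm.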

### Linearly constrained minimum-variance (LCMV) beamforming

In contrast, the minimum variance beamformer attempts to minimize the beamformer output power subject to a unity gain constraint: $\tag{5} \min_{{\mathbf{W} }_s} \quad tr\left( {{\mathbf{W} }_s^T {\mathbf{CW} }_s } \right)$

subject to $${\mathbf{W} }_s^T {\mathbf{L} }_s = {\mathbf{I} }$$


where $${\mathbf{C}}$$ is the data covariance matrix, $${\mathbf{L}}_s$$ is the $$d_m$$ by 3 gain matrix at source point s, and $${\mathbf{W}}_s$$ is the $$d_m$$ by 3 spatial filtering matrix (Va97). The solution to this problem is given by $\tag{6} {\rm{ }}{\mathbf{W}}_s^T = \left( {{\mathbf{L} }_s^T {\mathbf{C} }^{ - 1} {\mathbf{L} }_s } \right)^{ - 1} {\mathbf{L} }_s^T {\mathbf{C} }^{ - 1}\ .$

The parametric source activity at source point s is given by $${\mathbf{A}}_{\rm{s}} = {\mathbf{W}}_s^T {\mathbf{B}}\ .$$ This can be performed at each source-point of interest to obtain a distributed map of activity. This beamforming approach can be expanded to a more general Bayesian graphical model that uses event timing information to model evoked responses, while suppressing interference and noise sources (Zu07). This approach uses a variational Bayesian EM algorithm to compute the likelihood of a dipole at each grid location.
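A sketch of the LCMV weights of Eq. (6) at one source point, using a simulated gain matrix and a regularized covariance estimated from random data; the unity-gain constraint $$\mathbf{W}_s^T \mathbf{L}_s = \mathbf{I}$$ can be verified directly:

```python
import numpy as np

# Simulated 3-column gain matrix and regularized data covariance.
rng = np.random.default_rng(5)
d_m = 48
L_s = rng.standard_normal((d_m, 3))
C = np.cov(rng.standard_normal((d_m, 1000))) + 0.1 * np.eye(d_m)

# LCMV weights: W_s^T = (L^T C^-1 L)^-1 L^T C^-1 (Eq. 6).
Ci = np.linalg.inv(C)
W_s = Ci @ L_s @ np.linalg.inv(L_s.T @ Ci @ L_s)

# The unity-gain constraint of Eq. (5): W_s^T L_s = I.
G = W_s.T @ L_s
```

Applying `W_s.T` to the data matrix then yields the 3-component source activity at that point, repeated over the grid to form a distributed map.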

### Synthetic aperture magnetometry (SAM)

In synthetic aperture magnetometry (SAM), a nonlinear beamformer, an optimization algorithm is used at each source-point to find the dipole orientation that maximizes the ratio of the total source power over noise power, the so-called pseudo-Z deviate $\tag{7} {\rm{ Z = }}\sqrt {\frac{{{\mathbf{W} }_s^T {\mathbf{CW} }_s } }{{{\mathbf{W} }_s^T {\mathbf{C} }_n {\mathbf{W} }_s } } } = \sqrt {\frac{P}{n} }$

where $${\mathbf{C}}_n$$ is the noise covariance, usually based on some control recording or assumed to be a multiple of the identity matrix (Vr01). This maximization generates a scalar beamformer (i.e., $${\mathbf{L}}_s$$ is a vector) with optimal dipole orientations in terms of SNR. This improves the spatial resolution of SAM relative to that of LCMV beamforming. To generate statistical parametric maps between an active task period (a) and a control period (c), the so-called pseudo-T statistic can be computed as $\tag{8} T{\rm{ = }}\frac{{P{\rm{(} }a{\rm{) - } }P{\rm{(} }c{\rm{)} } } }{{n{\rm{(} }a{\rm{) + } }n{\rm{(} }c{\rm{)} } } }\ .$

Such maps usually show more focal activity since they contrast the differences between two states. Other scalar beamformers can be implemented. For example, an anatomically constrained beamformer (ACB) can be obtained by simply constraining the dipole orientations to be orthogonal to the cortical surfaces (Hi03).
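A brute-force sketch of the SAM-style orientation search of Eq. (7) at a single source point; the covariances and gain matrix are simulated, and random unit orientations stand in for a proper nonlinear optimizer:

```python
import numpy as np

# Simulated x,y,z gain columns, data covariance, and noise covariance.
rng = np.random.default_rng(7)
d_m = 32
L_s = rng.standard_normal((d_m, 3))
C = np.cov(rng.standard_normal((d_m, 2000))) + 0.1 * np.eye(d_m)
C_n = 0.1 * np.eye(d_m)            # assumed noise covariance (multiple of I)
Ci = np.linalg.inv(C)

best_z, best_q = -np.inf, None
for _ in range(200):
    q = rng.standard_normal(3)
    q /= np.linalg.norm(q)                     # candidate unit orientation
    l = L_s @ q                                # scalar-beamformer gain vector
    w = Ci @ l / (l @ Ci @ l)                  # scalar LCMV weight vector
    z = np.sqrt((w @ C @ w) / (w @ C_n @ w))   # pseudo-Z of Eq. (7)
    if z > best_z:
        best_z, best_q = z, q
```

Keeping the orientation with the largest pseudo-Z at every source point yields the scalar beamformer described above.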

Figure 2: Examples of spatial scans and beamforming. For comparison purposes, the measurement vector localized by these four methods is the scalp map of the independent component shown in Fig 1. The data covariance matrix (used in methods b-d) was computed using all of the data. (a) Matched filter; (b) Anatomically constrained beamforming (ACB); (c) Linearly constrained minimum variance (LCMV) beamformer; (d) Synthetic aperture magnetometry (SAM). Any activity below 10% of the maximum was cropped to show the curvature map on inflated cortical surfaces. Figure modified from Ra07b.

### Dynamic imaging of coherent sources (DICS)

Beamforming can be performed in the frequency domain using the dynamic imaging of coherent sources (DICS) algorithm, whose spatial filter matrix for frequency f is given by $\tag{9} {\rm{ }}{\mathbf{W}}_s^T (f) = \left( {{\mathbf{L} }_s^T {\mathbf{C} }(f)^{ - 1} {\mathbf{L} }_s } \right)^{ - 1} {\mathbf{L} }_s^T {\mathbf{C} }(f)^{ - 1}$

where $${\mathbf{C}}(f)$$ is the cross-spectral density matrix for frequency f (Gr01). Note that the covariance matrix has simply been replaced by the cross-spectral density matrix.

### Other spatial filtering methods

All the spatial filter vectors explained so far depend on the gain vectors associated only with the region of interest (i.e., they don't depend on the gain vectors associated with the rest of the source space). There are other more direct approaches to spatial filtering that incorporate the gain vectors associated with both the region of interest and the rest of the source space, and that do not necessarily use the measured covariance matrix. In the Backus-Gilbert method, a different spread matrix is computed for each candidate source location (Gr98; Gr99). The goal is to penalize the side lobes of the resolution kernels (i.e., the row vectors of the resolution matrix, defined as $$\mathbf{R}=\mathbf{O}\mathbf{L}\ ,$$ where $$\mathbf{L}$$ is the leadfield matrix for the entire sourcespace (see next section) and $$\mathbf{O}$$ is any optimized linear operator that gives the source estimates when multiplied with the data). This usually results in a wider main lobe. In the spatially optimal fast initial analysis (SOFIA) algorithm, virtual leadfields are constructed that are well concentrated within a region of interest compared to the rest of the sourcespace (Bo99). The region of interest can be moved to every source-point. A similar approach is adopted in the Local Basis Expansion (LBEX) algorithm, which solves a generalized eigenvalue problem to maximize the concentration of linear combinations of leadfields (Mi06). As a final remark, it should be emphasized that all of the spatial filtering algorithms presented scan one source-point or local region at a time, but can be expanded with multi-source scanning protocols that search through combinations of sources. Although multisource scanning methods can recover perfectly synchronized sources (which are usually missed by single-source methods), there is no agreed protocol to scan the immense space of multisource configurations.

## Sourcespace-based distributed and sparse methods

Instead of performing low-dimensional nonlinear optimization or spatial scanning, one can assume dipoles at all possible candidate locations of interest within a grid and/or mesh called the sourcespace (e.g., source-points in grey matter) and then solve the underdetermined linear system of equations $\tag{10} {\mathbf{B}} = {\mathbf{LJ}} + \Upsilon$

for $$\mathbf{\hat J}\in\Re^{{d_s}\times{d_v} }$$ ($$d_s$$ now being the number of dipole components throughout the sourcespace). The leadfield matrix $$\mathbf{L}\in\Re^{{d_m}\times{d_s} }$$ relates the current space to the measurement space. $$\Upsilon \in \Re ^{d_m {\rm{ } } \times {\rm{ } }d_v }$$ is the noise matrix, usually assumed to be Gaussian. Since there is no unique solution to this problem, priors are needed to find solutions of interest. The remaining algorithms are best presented within a more abstract, general Bayesian framework that makes the source prior assumptions explicit using probability density functions. Bayes' theorem $\tag{11} p({\mathbf{J}}|{\mathbf{B}}) = \frac{{p({\mathbf{B} }|{\mathbf{J} })p({\mathbf{J} })} }{{p({\mathbf{B} })} }$

states that the posterior probability of the sources given the measurements equals the likelihood multiplied by the prior probability, divided by a normalization factor called the evidence.
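The underdetermined forward model of eq. 10 can be made concrete with a minimal numerical sketch. All dimensions and matrices below are illustrative assumptions (a random stand-in leadfield rather than one computed from a real head model):

```python
import numpy as np

# Illustrative dimensions: d_m sensors, d_s source dipole components
# (d_s >> d_m, so the system is underdetermined), d_v time samples.
rng = np.random.default_rng(0)
d_m, d_s, d_v = 32, 500, 100

L = rng.standard_normal((d_m, d_s))              # stand-in leadfield matrix
J = np.zeros((d_s, d_v))                         # true current density
J[[40, 250], :] = rng.standard_normal((2, d_v))  # two active source rows
noise = 0.01 * rng.standard_normal((d_m, d_v))   # Gaussian noise matrix
B = L @ J + noise                                # eq. 10: B = L J + noise

# Infinitely many J reproduce B exactly, which is why the priors of
# eq. 11 are needed to single out solutions of interest.
print(B.shape)  # (32, 100)
```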

### Bayesian maximum a posteriori probability (MAP) estimates

A Gaussian likelihood model is assumed, $\tag{12} p({\mathbf{B}}|{\mathbf{\hat J}}) = \exp \left( { - \frac{1}{{2\sigma ^2 } }\left\| {{\mathbf{B} } - {\mathbf{L\hat J} } } \right\|_F^2 } \right) \ ,$

together with a very useful family of prior models using generalized Gaussian marginal probability density functions (pdf) $\tag{13} p({\mathbf{\hat J}}) \propto \exp \left( { - {\mathop{\rm sgn}} (p)\sum\limits_{{\rm{i = 1} } }^{\rm{n} } {\left\| {{\mathbf{\hat J} }{(i,\;\,:)} } \right\|_q^p } } \right)$

where p specifies the shape of the pdf, which determines the sparsity of the estimate, and q (usually 2) specifies the norm of $${\mathbf{\hat J}}{(i,\,:)}\ ,$$ the ith row vector of $${\mathbf{\hat J}}\ .$$ Because $$p({\mathbf{B}})$$ does not affect the location of the posterior mode, it can be ignored, and thus the MAP point estimate can be computed by $\tag{14} {\mathbf{\hat J}}_{MAP} = \arg \; \max_{{\mathbf{\hat J} } } \;\ln p({\mathbf{\hat J} }|{\mathbf{B} }) = \;\ln p({\mathbf{B} }|{\mathbf{\hat J} }) + \ln p({\mathbf{\hat J} }) = \ln p_\Upsilon ({\mathbf{B} } - {\mathbf{L\hat J} }) + \ln p({\mathbf{\hat J} })\ .$

Maximizing the log posterior (which is a monotonic function) rather than the posterior, illustrates how these MAP approaches are equivalent to algorithms that minimize p-norm-like measures $\tag{15} \min_{\mathbf{\hat J}} {\rm{ }}\left\| {{\mathbf{B} } - {\mathbf{L\hat J} } } \right\|_F^2 + \lambda {\mathop{\rm sgn} } (p)\,\sum_{{\rm{i = 1} } }^{\rm{n} } {\left\| {{\mathbf{\hat J} }{(i,\;:)} } \right\|_q^p } ,\;\forall {\rm{ } }\left\| {{\mathbf{\hat J} }{(i,\;:)} } \right\|_q \ne 0$

where $$\lambda$$ is the noise regularization parameter, which can be fixed across iterations, learned, stabilized (e.g., l-curve, generalized cross validation), or adjusted to achieve a desired representation error using the discrepancy principle, $\tag{16} {\rm{ }}\left\| {{\mathbf{B} } - {\mathbf{L\hat J} } } \right\|_F^2 = \varepsilon = {\rm{ } }\left\| \Upsilon \right\|_F^2\ .$

MAP estimates using a Gaussian prior (p=2) are equivalent to noise-regularized minimum-l2-norm solutions, $\tag{17} {\mathbf{\hat J}} = {\mathbf{L}}^T \left( {{\mathbf{LL} }^T + \lambda {\mathbf{I} } } \right)^{ - 1}{\mathbf{B} }\ ,$

which are widely distributed and suffer from depth bias (i.e., deep sources are mislocalized to superficial source-points) (Ah92; Wa92; Ge98). Weighted minimum-norm algorithms can be used to partially compensate for this depth bias. The inverse operator is then given by $\tag{18} {\mathbf{\hat J}} = {\mathbf{W}}_a {\mathbf{W}}_a^T {\mathbf{L}}^T \left( {{\mathbf{LW} }_a {\mathbf{W} }_a^T {\mathbf{L} }^T + \lambda {\mathbf{I} } } \right)^{ - 1} {\mathbf{B} }$

where $${\mathbf{W}}_a$$ is a diagonal matrix (e.g., $${\mathbf{W}}_a = diag\left( {\left\| {l_i } \right\|_2^{ - 1} } \right)\ ,$$ $${\mathbf{W}}_a = diag\left( {\left\| {l_i } \right\|_2^{ - 1/2} } \right) \ ,$$ a 3-D Gaussian function, or fMRI priors) (Io90; Fu99). Non-diagonal $${\mathbf{W}}_a {\mathbf{W}}_a^T$$ matrices can be used to incorporate source covariance and smoothness constraints (Pa99). To suppress correlated noise, the matrix $$\lambda {\mathbf{I}}$$ can be replaced with the noise covariance matrix, $$C_\upsilon\ ,$$ which can be estimated from the data. Such approaches are often called adaptive. Likewise, MAP estimates using a Laplacian prior (p=1) are equivalent to minimum-l1-norm solutions. These are usually computed using linear programming, and have recently been called minimum current estimates (MCE) (Ma95; Uu99), but can be computed efficiently for a time-series using a generalized Focal Underdetermined System Solver (FOCUSS) algorithm $\tag{19} {\mathbf{\hat J}}_k = {\mathbf{W}}_k {\mathbf{W}}_k^T {\mathbf{L}}^T \left( {{\mathbf{LW} }_k {\mathbf{W} }_k^T {\mathbf{L} }^T + \lambda {\mathbf{I} } } \right)^{ - 1} {\mathbf{B} }\ ,$

where $\tag{20} {\mathbf{W}}_k = {\mathbf{W}}_a diag\left( {\left\| {{\mathbf{\hat J} }_{k - 1} (i,:)} \right\|_q^{1 - {\textstyle{p \over 2} } } } \right)$

and $${\mathbf{\hat J}}_{k - 1}$$ is the estimate from iteration k-1 (Go95; Go97; Ra99; Ra02; Co05; Ra05; Ra08). Importantly, FOCUSS is equivalent to an Expectation Maximization (EM) algorithm used to find MAP estimates in a hierarchical Bayesian framework in which the hyperparameters are integrated out and the prior pdf is a Gaussian scale mixture (Wi07b; Fi02). To localize a very long time-series quickly, the time-series matrix $$\mathbf{B}$$ or its covariance matrix can be decomposed with the SVD and replaced in eq. 19 with the matrix $$\mathbf{U}\left(\mathbf{S}\right)^{-1/2}\ ,$$ where $$\mathbf{U}$$ and $$\mathbf{S}$$ are the matrices of left singular vectors and singular values (truncated to any desired rank), respectively. Although FOCUSS works optimally with a Laplacian prior (p=1), it can also be used to find MAP estimates with other generalized Gaussian priors. When p=-2, the Magnetic Field Tomography (MFT) algorithm is recovered if the update rule is based on the current modulus, only one iteration is performed, and the a priori weight matrix is a 3-D Gaussian used for depth bias compensation (Io90; Ri91; Ta99). Importantly, MAP estimates have different modes for different $$\mathbf{W}_a$$ matrices (Wi07b), and computer simulations have shown that $${\mathbf{W}}_a = diag\left( {\left\| {l_i } \right\|_2^{ - 1} } \right)$$ and p=1 work optimally (Ra05). If one is unsure whether to use a Gaussian or a Laplacian prior, Markov chain Monte Carlo (MCMC) methods can be used to learn which lp-norm is optimal for a particular data set (Au05).

### Dynamic statistical parametric maps (dSPM)

Another approach to compensate for depth bias is the noise-normalized dynamic statistical parametric map (dSPM) technique (Da00; Li02). First, the linear inverse operator, which is equivalent to a correlated noise-regularized pseudoinverse or Wiener operator, is computed by $\tag{21} {\mathbf{P}} = {\mathbf{C}}_s {\mathbf{L}}^T \left( {{\mathbf{LC} }_s {\mathbf{L} }^T + {\mathbf{C} }_n } \right)^{ - 1}$

where $$\mathbf{C}_s$$ is the source covariance matrix, usually assumed to be the identity matrix. Then the noise-normalized operator is computed, which in the case of fixed dipole orientations is given by $\tag{22} {\mathbf{P}}_{norm} = diag\left( v \right)^{ - 1/2} {\mathbf{P}}\ ,$

where $$v = diag\left( {{\mathbf{PC} }_n {\mathbf{P} }^T } \right)\ .$$ Note that dSPM performs noise-normalization after the inverse operator $$\mathbf{P}$$ has been computed.
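The two dSPM steps (eqs. 21-22) amount to a few matrix operations. The sketch below uses random stand-in matrices and an identity source covariance, as assumed above; after normalization, each row of the operator passes unit noise variance:

```python
import numpy as np

rng = np.random.default_rng(2)
d_m, d_s = 16, 200
L = rng.standard_normal((d_m, d_s))   # stand-in leadfield (fixed orientations)
Cs = np.eye(d_s)                      # source covariance (identity assumption)
Cn = 0.1 * np.eye(d_m)                # stand-in noise covariance

# eq. 21: Wiener-style inverse operator
P = Cs @ L.T @ np.linalg.inv(L @ Cs @ L.T + Cn)

# eq. 22: normalize each row by the noise variance it transmits
v = np.diag(P @ Cn @ P.T)
P_norm = P / np.sqrt(v)[:, None]

# Each standardized row now yields unit noise variance by construction.
check = np.diag(P_norm @ Cn @ P_norm.T)
print(np.allclose(check, 1.0))  # True
```

Note that, as stated above, the normalization is applied after $$\mathbf{P}$$ has been computed, so it rescales the estimates without changing their spatial patterns row by row.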

### sLORETA

An alternative approach for depth-bias compensation is the sLORETA technique (Pa02). In contrast to the dSPM method, the inverse operator is weighted as a function of the resolution matrix, $$\mathbf{R=PL}\ ,$$ that is associated with the inverse and forward operators $$\mathbf{P}$$ and $$\mathbf{L}\ .$$ For fixed dipole orientations, the pseudo-statistics of power and absolute activation are respectively given by $\tag{23} \varphi_i=\frac{\mathbf{j}_i^2}{\mathbf{R}_{ii}}$

and $$\phi_i=\sqrt{\varphi_i}\ .$$


Thus, the standardized sLORETA inverse operator is given by $\tag{24} \mathbf{P}_{sloreta}=diag(r)^{-1/2}\mathbf{P}$

where $$r=diag(\mathbf{R})\ .$$ Interestingly, the sLORETA algorithm is equivalent to the first step of the sparse Bayesian learning (SBL) algorithm explained in the next section.
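The standardization of eqs. 23-24 can be sketched as follows, with a random stand-in leadfield and a plain regularized minimum-norm operator as $$\mathbf{P}\ .$$ The sketch also illustrates a well-known property of sLORETA: for a single noiseless dipole, the pseudo-power peaks exactly at the true source-point (a Cauchy-Schwarz argument):

```python
import numpy as np

rng = np.random.default_rng(3)
d_m, d_s = 16, 200
L = rng.standard_normal((d_m, d_s))   # stand-in leadfield (fixed orientations)
lam = 1e-2                            # illustrative regularization parameter

# Regularized minimum-norm inverse operator (identity source covariance)
P = L.T @ np.linalg.inv(L @ L.T + lam * np.eye(d_m))

# Resolution matrix R = P L; standardize by its diagonal (eq. 24)
r = np.diag(P @ L)
P_sloreta = P / np.sqrt(r)[:, None]

# Single noiseless dipole at a chosen source-point: the standardized
# pseudo-power (eq. 23) is maximized at the true location.
i_true = 70
phi = (P_sloreta @ L[:, i_true]) ** 2
print(int(np.argmax(phi)) == i_true)  # True
```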

Figure 3: Examples of distributed methods. For comparison purposes, the measurement vector localized by these four methods is the scalp map of the independent component shown in Fig 1. (a) Minimum-L2-norm without dipole orientation constraints (current density modulus); (b) Weighted minimum-L2-norm with dipole orientation constraints from segmented cortical surfaces; (c) Dynamic statistical parametric maps (dSPM) with noise normalization and anatomical dipole orientation constraints; (d) Standardized low-resolution electromagnetic tomography (sLORETA) with anatomical dipole orientation constraints. Activity below 10% of the maximum was cropped to reveal the curvature map on inflated cortical surfaces. Figure modified from Ra07b.

### Sparse Bayesian learning (SBL) and automatic relevance determination (ARD)

Instead of finding MAP point estimates using fixed priors, one can use the evidence framework of sparse Bayesian learning (SBL) to learn adaptive parametrized priors from the data itself (Ma92; Ne96; Ti01; Wi04; Sa04; Ra05; Ra06a; Ra07a; Nu07a; Ra08). This approach is an important alternative because the posterior mode may not be representative of the full posterior; a better point estimate, the posterior mean, can be obtained by tracking the posterior probability mass. This is achieved by maximizing a tractable Gaussian approximation of the evidence, also known as the type-II likelihood or marginal likelihood $\tag{25} p({\mathbf{B}};\Sigma _{\mathbf{B}} ) = \int {p({\mathbf{B}}|{\mathbf{J}})p({\mathbf{J}};\Sigma _{\mathbf{B}} )} dJ = N(0,\Sigma _{\mathbf{B}} )$

or equivalently by minimizing the negative log marginal likelihood $\tag{26} L\left( \gamma \right) = - \log p({\mathbf{B}};\Sigma _{\mathbf{B}} ) = d_v \log \left| {\Sigma _{\mathbf{B}} } \right| + trace\left( {{\mathbf{B} }^T \Sigma _{\mathbf{B} }^{ - 1} {\mathbf{B} } } \right)$

where $$\Sigma _{\mathbf{B}} = {\mathbf{L}}\Sigma _{\mathbf{J}} {\mathbf{L}}^T + \Sigma _\varepsilon\ .$$ The diagonal matrix $$\Sigma _{\mathbf{J}} = \Gamma = diag(\gamma _i )$$ is the prior source covariance matrix which contains the vector of hyperparameters on the diagonal (i.e., the variances). In the ARD framework the precisions (i.e., inverse variances) are Gamma distributed. The matrix $$\Sigma _\varepsilon$$ is the noise covariance matrix, which can be assumed to be a multiple of the identity matrix (e.g., $$\sigma_\varepsilon^2\mathbf{I}\ ,$$ where $$\sigma_\varepsilon^2$$ is the noise variance, a hyperparameter that can also be learned from the data or empirically obtained from the measurements). The evidence maximization is achieved by using an Expectation-Maximization update rule $\tag{27} \gamma _i^{(k + 1)} = \frac{1}{{d_v r_i } }\left\| {\gamma _i^{(k)} {\mathbf{L} }_{(:,\;i)}^T \left( {\Sigma _{\mathbf{B} }^{(k)} } \right)^{ - 1} {\mathbf{B} } } \right\|_F^2 + \frac{1}{{r_i } }trace\left( {\gamma _i^{(k)} {\mathbf{I} } - \gamma _i^{(k)} {\mathbf{L} }_{(:,\;i)}^T \left( {\Sigma _{\mathbf{B} }^{(k)} } \right)^{ - 1} {\mathbf{L} }_{(:,\;i)}^{} \gamma _i^{(k)} } \right)$

or alternatively using a fixed-point gradient update rule $\tag{28} \gamma _i^{(k + 1)} = \frac{1}{{d_v r_i} }\left\| {\gamma _i^{(k)} {\mathbf{L} }_{(:,\;i)}^T \left( {\Sigma _{\mathbf{B} }^{(k)} } \right)^{ - 1} {\mathbf{B} } } \right\|_F^2 \left( {trace\left( {\gamma _i^{(k)} {\mathbf{I} } - \gamma _i^{(k)} {\mathbf{L} }_{(:,\;i)}^T \left( {\Sigma _{\mathbf{B} }^{(k)} } \right)^{ - 1} {\mathbf{L} }_{(:,\;i)}^{} \gamma _i^{(k)} } \right)} \right)^{ - 1}$

where $$r_i$$ is the rank of $${\mathbf{L}}_{(:,\;i)}{\mathbf{L}}_{(:,\;i)}^T\ ,$$ and $${\mathbf{L}}_{(:,\;i)}$$ is a matrix with column vectors from $$\mathbf{L}$$ that are controlled by the same hyperparameter (Ra06b; Wi07b). With fixed dipole orientations $${\mathbf{L}}_{(:,\;i)}$$ is a vector, but with loose orientations $${\mathbf{L}}_{(:,\;i)}$$ is a $$d_m \times 3$$ matrix. For patch source models involving dipoles within a region, $${\mathbf{L}}_{(:,\;i)}$$ is a matrix containing all gain vectors associated with the local patch of cortex. The gradient update rule is much faster than the EM rule, and is almost identical to the update rule used in the sLORETA/FOCUSS hybrid algorithm, which is not expressed in the ARD framework (Sc05; Ra05). Once the optimal hyperparameters have been learned, the posterior mean is given by $\tag{29} {\mathbf{\hat J}}=E[{\mathbf{J}}|{\mathbf{B}};\Sigma _{\mathbf{J}} ] = \Sigma _{\mathbf{J}} {\mathbf{L}}^T \left( {\Sigma _{\mathbf{B}} } \right)^{ - 1} {\mathbf{B}}\ .$

It is important to note that many SBL algorithms are distinguished by the parametrization of the source covariance matrix $$\Sigma _{\mathbf{J}} =\sum_{i = 1}^{d_s } {{\mathbf{C} }_i \gamma _i }$$ (Wi07b). In fact, if only a few hyperparameters are used to control many source-points, the parametrization cannot support sparse estimates. For example, in the restricted maximum likelihood (ReML) algorithm one of the source covariance components is the identity matrix, controlled by a single hyperparameter (Fr02; Ph05; Ma06; Wi07b).

Figure 4: Examples of sparse inverse methods. For comparison purposes, the measurement vector localized by these four methods is the scalp map of the independent component shown in Fig 1. (a) Minimum-L1-norm (MAP estimation with Laplacian pdf) with anatomic dipole orientation constraints from segmented cortical surfaces; (b) Geodesic multiscale minimum-L1-norm with dipole orientation constraints; (c) Sparse Bayesian learning (SBL) with dipole orientation constraints; (d) Geodesic multiscale sparse Bayesian learning with dipole orientation constraints. Note that SBL estimates are sparser than MAP estimates. Also note that MAP estimation with a Laplacian prior results in some spurious sources, which are eliminated with the multiscale geodesic MAP estimation approach. Activity below 10% of the maximum was cropped to reveal the curvature map on inflated cortical surfaces. Figure modified from Ra07b.

In standard SBL, $${\mathbf{C}}_i=e_i e_i^T\ ,$$ where $$e_i$$ is a vector with zeros everywhere except at the ith element, where it is one. This delta function parametrization can be extended to box car functions in which $$e_i$$ takes a value of 1 for all three dipole components or for a patch of cortex. More interestingly, the $$e_i$$ can be substituted by geodesic basis functions $$\psi _i$$ (e.g., a 2-D Gaussian current density function) centered at the ith source-point and with some spatial standard deviation (Sa04; Ra06a; Ra06b; Ra07a; Ra08). More generally, the source covariance can be composed of components across many spatial scales, by using multiple $$\psi _i$$ located at the ith source-point but with different spatial standard deviations (Ra06a; Ra06b; Ra07a; Ra08). This approach can estimate the spatial extent of distributed sources by using a mixture model of geodesic Gaussians at different spatial scales. This multiscale approach has also been extended to MAP estimation (Ra07b; Ra08).

Although we have assumed here a non-informative hyperprior on the precisions, $$p(\gamma _i^{-1})=\gamma _i\ ,$$ since the degrees-of-freedom parameter of the Gamma distribution is set to 0, this does not need to be the case (Sa04; Wi07b). The problem of finding optimal hyperpriors, which can handle multimodal posteriors and eliminate the use of improper priors, has been addressed by making this parameter non-zero (yet small) or by introducing MCMC strategies (Nu07a; Nu07b). In practice, the non-informative hyperprior works well and avoids the problem of determining the optimal hyperprior. Finally, as explained for MAP estimation, a very long time-series can be localized quickly by localizing, instead of the time-series matrix $$\mathbf{B}\ ,$$ the matrix $$\mathbf{U}\left(\mathbf{S}\right)^{-1/2}\ ,$$ where $$\mathbf{U}$$ and $$\mathbf{S}$$ are the left singular vector and singular value matrices of $$\mathbf{B}$$ (truncated to any desired rank).
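The dimensionality-reduction idea behind this SVD trick can be sketched as follows: for any linear inverse operator, localizing only the few retained left singular vectors of a long (here exactly low-rank) time-series and rescaling afterwards reproduces localizing the full matrix. The leadfield, operator, rank, and scaling convention below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
d_m, d_s, d_v, rank = 16, 200, 5000, 8
L = rng.standard_normal((d_m, d_s))   # stand-in leadfield
# Long time-series with exactly `rank` underlying temporal components
B = rng.standard_normal((d_m, rank)) @ rng.standard_normal((rank, d_v))

# Truncated SVD of the long time-series
U, s, Vt = np.linalg.svd(B, full_matrices=False)
U, s, Vt = U[:, :rank], s[:rank], Vt[:rank]

# Any linear inverse operator (here plain regularized minimum norm)
P = L.T @ np.linalg.inv(L @ L.T + 1e-2 * np.eye(d_m))

# Localize only `rank` columns instead of all d_v samples, then expand
# back to the full source time-series with the stored s and Vt.
J_fast = (P @ U) * s @ Vt
J_full = P @ B
print(np.allclose(J_fast, J_full))  # True
```

The cost of the inverse step thus scales with the chosen rank rather than with the number of time samples, which is what makes single-trial, long-recording localization practical.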

Figure 5: Movie of single-trial brain activity (1 sec) for a dependent subspace learned using independent vector analysis (IVA) and localized using multiscale geodesic SBL. The subspace is obtained from the SVD of the temporal covariance matrix computed by inverse Fourier transform of a single independent vector time-frequency series. Movie modified from Ra07a.

## Conclusions

The relative strengths of different localization algorithms offer an opportunity to select the most appropriate algorithm, constraints, and priors for a given experiment. If one expects only a few focal sources, then dipole fitting algorithms may be sufficient. If one expects distributed sources, then beamforming, spatial scans, or distributed MAP-based estimation algorithms are appropriate. If one expects sparse sources, then MAP estimation with a Laplacian prior may better reflect the true sources. If one is more interested in finding representative sparse estimates of the whole posterior, then SBL is the right choice. If one expects distributed sources with variable levels of spatial extent, then SBL or MAP (with Laplacian prior) estimation using a mixture model of multiscale basis functions is recommended.

## References

• Adrian E, Mathews B. (1934): The Berger Rhythm: Potential changes from the occipital lobes in man. Brain 57:355-385.
• Ahlfors SP, Ilmoniemi RJ, Hamalainen MS. (1992): Estimates of visually evoked cortical currents. Electroencephalogr Clin Neurophysiol 82(3):225-36.
• Aine C, Huang M, Stephen J, Christner R. (2000): Multistart algorithms for MEG empirical data analysis reliably characterize locations and time courses of multiple sources. Neuroimage 12(2):159-72.
• Akalin-Acar Z, Gencer NG. (2004): An advanced boundary element method (BEM) implementation for the forward problem of electromagnetic source imaging. Physics in Medicine and Biology 49(21):5011-5028.
• Anemuller J, Sejnowski TJ, Makeig S. (2003): Complex independent component analysis of frequency-domain electroencephalographic data. Neural Networks 16(9):1311-1323.
• Auranen T, Nummenmaa A, Hamalainen MS, Jaaskelainen IP, Lampinen J, Vehtari A, Sams M. (2005): Bayesian analysis of the neuromagnetic inverse problem with l(p)-norm priors. Neuroimage 26(3):870-84.
• Baillet S, Mosher JC, Leahy RM. (2001): Electromagnetic Brain Mapping. IEEE Signal Processing Magazine 18(6):14-30.
• Bell AJ, Sejnowski TJ. (1995): An information-maximization approach to blind separation and blind deconvolution. Neural Comput 7(6):1129-59.
• Berger H. (1929): Über das Elektroenkephalogramm des Menschen. Archiv für Psychiatrie 87:527-570.
• Bertrand C, Ohmi M, Suzuki R, Kado H. (2001): A probabilistic solution to the MEG inverse problem via MCMC methods: the reversible jump and parallel tempering algorithms. IEEE Trans Biomed Eng 48(5):533-42.
• Bolton JPR, Gross J, Liu AK, Ioannides AA. (1999): SOFIA: spatially optimal fast initial analysis of biomagnetic signals. Phys Med Biol 44:87-103.
• Canolty RT, Edwards E, Dalal SS, Soltani M, Nagarajan SS, Kirsch HE, Berger MS, Barbaro NM, Knight RT. (2006): High gamma power is phase-locked to theta oscillations in human neocortex. Science 313(5793):1626-1628.
• Cohen D. (1968): Magnetoencephalography: evidence of magnetic fields produced by alpha-rhythm currents. Science 161:784-6.
• Cohen D. (1972): Magnetoencephalography: Detection of the brain's electrical activity with a superconducting magnetometer. Science 175:664-6.
• Cotter SF, Rao BD, Engan K, Kreutz-Delgado K. (2005): Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Trans Signal Processing 53(7):2477-2488.
• Dale AM, Liu AK, Fischl BR, Buckner RL, Belliveau JW, Lewine JD, Halgren E. (2000): Dynamic statistical parametric mapping: combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron 26(1):55-67.
• Darvas F, Ermer JJ, Mosher JC, Leahy RM. (2006): Generic head models for atlas-based EEG source analysis. Hum Brain Mapp 27(2):129-43.
• Delorme A, Makeig S. (2004): EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 134(1):9-21.
• Dyrholm M, Makeig S, Hansen LK. (2007): Model Selection for Convolutive ICA with an Application to Spatiotemporal Analysis of EEG. Neural Computation 19(4):934-955.
• Figueiredo MAT. 2002. Adaptive sparseness using Jeffreys prior. Advances in Neural Information Processing Systems: MIT Press. p 697-704.
• Freeman WJ. 2007. Hilbert Transform for Brain Waves. Scholarpedia. p. 7514.
• Friston KJ, Penny W, Phillips C, Kiebel S, Hinton G, Ashburner J. (2002): Classical and Bayesian inference in neuroimaging: theory. Neuroimage 16(2):465-83.
• Fuchs M, Kastner J, Wagner M, Hawes S, Ebersole JS. (2002): A standardized boundary element method volume conductor model. Clin Neurophysiol 113(5):702-12.
• Fuchs M, Wagner M, Kohler T, Wischmann HA. (1999): Linear and nonlinear current density reconstructions. J Clin Neurophysiol 16(3):267-95.
• Gencer NG, Williamson SJ. (1998): Differential characterization of neural sources with the bimodal truncated SVD pseudo-inverse for EEG and MEG measurements. IEEE Trans Biomed Eng 45(7):827-38.
• George JS, Aine CJ, Mosher JC, Schmidt DM, Ranken DM, Schlitt HA, Wood CC, Lewine JD, Sanders JA, Belliveau JW. (1995): Mapping function in the human brain with magnetoencephalography, anatomical magnetic resonance imaging, and functional magnetic resonance imaging. J Clin Neurophysiol 12(5):406-31.
• Gorodnitsky IF, George JS, Rao BD. (1995): Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. Electroencephalogr Clin Neurophysiol 95(4):231-51.
• Gorodnitsky I, Rao BD. (1997): Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm. IEEE Trans Signal Processing 45(3):600–616.
• Grave de Peralta Menendez R, Hauk O, Gonzalez Andino S, Vogt H, Michel C. (1998): Linear inverse solutions with optimal resolution kernels applied to electromagnetic tomography. Human Brain Mapping 5(6):454-467.
• Grave de Peralta Menendez R, Gonzalez Andino SL. (1999): Backus and Gilbert method for vector fields. Hum Brain Mapp 7(3):161-5.
• Grave de Peralta Menendez R, Gonzalez Andino SL, Morand S, Michel CM, Landis T. (2000): Imaging the electrical activity of the brain: ELECTRA. Hum Brain Mapp 9(1):1-12.
• Gross J, Ioannides AA. (1999): Linear transformations of data space in MEG. Phys Med Biol 44(8):2081-97.
• Gross J, Kujala J, Hamalainen M, Timmermann L, Schnitzler A, Salmelin R. (2001): Dynamic imaging of coherent sources: Studying neural interactions in the human brain. Proc Natl Acad Sci U S A 98(2):694-9.
• Halchenko YO, Hanson SJ, Pearlmutter BA. 2005. Multimodal Integration: fMRI, MRI, EEG, MEG. In: Landini L, Positano V, Santarelli MF, editors. Advanced Image Processing in Magnetic Resonance Imaging: Dekker. p. 223-265.
• Hamalainen M, Hari R, Ilmoniemi R, Knuutila J, Lounasmaa O. (1993): Magnetoencephalography—theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev.Mod.Phys. 65(2):413-497.
• Hamalainen M, Sarvas J. (1987): Feasibility of the homogenous head model in the interpretation of the magnetic fields. Phys Med Biol 32:91-97.
• Helmholtz Hv. (1853): Ueber einige Gesetze der Vertheilung elektrischer Strome in korperlichen Leitern, mit Anwendung auf die thierisch-elektrischen Versuche. Ann Phys Chem 89:211-233,353-377.
• Hillebrand A, Barnes GR. (2003): The use of anatomical constraints with MEG beamformers. Neuroimage 20(4):2302-13.
• Huang M, Aine CJ, Supek S, Best E, Ranken D, Flynn ER. (1998): Multi-start downhill simplex method for spatio-temporal source localization in magnetoencephalography. Electroencephalogr Clin Neurophysiol 108(1):32-44.
• Huang MX, Mosher JC, Leahy RM. (1999): A sensor-weighted overlapping-sphere head model and exhaustive head model comparison for MEG. Phys Med Biol 44(2):423-40.
• Huang MX, Dale AM, Song T, Halgren E, Harrington DL, Podgorny I, Canive JM, Lewis S, Lee RR. (2006): Vector-based spatial-temporal minimum L1-norm solution for MEG. Neuroimage 31(3):1025-37.
• Hubbard JI, Llinás RR, Quastel DMJ. 1969. Electrophysiological analysis of synaptic transmission. Baltimore: Williams & Wilkins Co. ix, 372 p.
• Ioannides AA, Bolton JP, Clarke CJS. (1990): Continuous probabilistic solutions to the biomagnetic inverse problem. Inverse Probl. 6:523-542.
• Lachaux JP, Rudrauf D, Kahane P. (2003): Intracranial EEG and human brain mapping. Journal of Physiology Paris 97:613-628.
• Leahy RM, Mosher JC, Spencer ME, Huang MX, Lewine JD. (1998): A study of dipole localization accuracy for MEG and EEG using a human skull phantom. Electroencephalogr Clin Neurophysiol 107(2):159-73.
• Lee I, Kim T, Lee TW. (2007): Fast fixed-point independent vector analysis algorithms for convolutive blind source separation. Signal Process 87(8):1859-1871.
• Lee IK, Worrell G, Makeig S. 2005. Relationships between concurrently recorded scalp and intracranial electrical signals in humans. 11th Annual Meeting of the Organization for Human Brain Mapping. Toronto, Ontario.
• Liu AK, Dale AM, Belliveau JW. (2002): Monte Carlo simulation studies of EEG and MEG localization accuracy. Hum Brain Mapp 16(1):47-62.
• Liu H, Gao X, Schimpf PH, Yang F, Gao S. (2004): A recursive algorithm for the three-dimensional imaging of brain electric activity: Shrinking LORETA-FOCUSS. IEEE Trans Biomed Eng 51(10):1794-802.
• Liu L, Ioannides AA, Streit M. (1999): Single trial analysis of neurophysiological correlates of the recognition of complex objects and facial expressions of emotion. Brain Topogr 11(4):291-303.
• Luck SJ. 2005. An Introduction to the Event-Related Potential Technique: The MIT Press.
• Mackay DJC. (1992): Bayesian Interpolation. Neural Computation 4(3):415-447.
• Makeig S. (1993): Auditory event-related dynamics of the EEG spectrum and effects of exposure to tones. Electroencephalogr Clin Neurophysiol 86(4):283-93.
• Makeig S, Bell AJ, Jung TP, Sejnowski TJ. 1996. Independent Component Analysis of Electroencephalographic Data. Advances in Neural Information Processing Systems: NIPS, 8. Denver, CO: MIT Press. p 145-151.
• Makeig S, Jung TP, Bell AJ, Ghahremani D, Sejnowski TJ. (1997): Blind separation of auditory event-related brain responses into independent components. Proceedings of the National Academy of Sciences of the United States of America 94(20):10979-10984.
• Makeig S, Westerfield M, Jung TP, Enghoff S, Townsend J, Courchesne E, Sejnowski TJ. (2002): Dynamic brain sources of visual evoked responses. Science 295(5555):690-694.
• Matsuura K, Okabe Y. (1995): Selective minimum-norm solution of the biomagnetic inverse problem. IEEE Trans Biomed Eng 42(6):608-15.
• Mattout J, Phillips C, Penny WD, Rugg MD, Friston KJ. (2006b): MEG source localization under multiple constraints: an extended Bayesian framework. Neuroimage 30(3):753-67.
• Mitra PP, Maniar H. (2006): Concentration maximization and local basis expansions (LBEX) for linear inverse problems. IEEE Trans on Biomed Eng 53(9):1775-1782.
• Mosher JC, Lewis PS, Leahy RM. (1992): Multiple dipole modeling and localization from spatio-temporal MEG data. IEEE Trans Biomed Eng 39(6):541-57.
• Mosher JC, Leahy RM. (1998): Recursive MUSIC: a framework for EEG and MEG source localization. IEEE Trans Biomed Eng 45(11):1342-54.
• Neal R. 1996. Bayesian Learning in Neural Networks.: Springer.
• Nicholson C, Llinas R. (1971): Field potentials in the alligator cerebellum and theory of their relationship to Purkinje cell dendritic spikes. J Neurophysiol 34(4):509-31.
• Niedermeyer E, Lopes da Silva FH. 2005. Electroencephalography : basic principles, clinical applications, and related fields. Philadelphia: Lippincott Williams & Wilkins. xiii, 1309 p.
• Nummenmaa A, Auranen T, Hamalainen MS, Jaaskelainen IP, Lampinen J, Sams M, Vehtari A. (2007a): Hierarchical Bayesian estimates of distributed MEG sources: theoretical aspects and comparison of variational and MCMC methods. Neuroimage 35(2):669-85.
• Nummenmaa A, Auranen T, Hamalainen MS, Jaaskelainen IP, Sams M, Vehtari A, Lampmen J. (2007b): Automatic relevance determination based hierarchical Bayesian MEG inversion in practice. Neuroimage 37(3):876-889.
• Nunez PL. 2005. Electric fields of the brain. 2nd edition. Oxford University Press.
• Okada Y. (1993): Empirical bases for constraints in current-imaging algorithms. Brain Topogr 5(4):373-7.
• Okada YC, Wu J, Kyuhou S. (1997): Genesis of MEG signals in a mammalian CNS structure. Electroencephalogr Clin Neurophysiol 103(4):474-85.
• Oostendorp TF, Vanoosterom A. (1989): Source Parameter-Estimation in Inhomogeneous Volume Conductors of Arbitrary Shape. IEEE Trans on Biomed Eng 36(3):382-391.
• Parra LC, Spence CD, Gerson AD, Sajda P. (2005): Recipes for the Linear Analysis of EEG. Neuroimage 28(2):326-341.
• Pascual-Marqui RD, Lehmann D, Koenig T, Kochi K, Merlo MC, Hell D, Koukkou M. (1999): Low resolution brain electromagnetic tomography (LORETA) functional imaging in acute, neuroleptic-naive, first-episode, productive schizophrenia. Psychiatry Res 90(3):169-79.
• Pascual-Marqui RD. (2002): Standardized low-resolution brain electromagnetic tomography (sLORETA): technical details. Methods Find Exp Clin Pharmacol 24 Suppl D:5-12.
• Penfield W, Jasper HH. 1954. Epilepsy and the functional anatomy of the human brain. Boston: Little. 898 p.
• Phillips C, Mattout J, Rugg MD, Maquet P, Friston KJ. (2005): An empirical Bayesian solution to the source reconstruction problem in EEG. Neuroimage 24(4):997-1011.
• Ramirez RR, Kronberg E, Ribary U, Llinás R. 2003. Recursive Weighted Minimum-norm Algorithms for Neuromagnetic Source Imaging Using Diversity Measure Minimization: Analysis of Spatial Resolution. Soc. Neurosci. Abstr., Vol. 29, Program No. 863.20.
• Ramirez RR. 2005. Neuromagnetic Source Imaging of Spontaneous and Evoked Human Brain Dynamics. New York: New York University School of Medicine. 452 p.
• Ramírez RR, Makeig S. 2006a. Neuroelectromagnetic source imaging using multiscale geodesic neural bases and sparse Bayesian learning. 12th Annual Meeting of the Organization for Human Brain Mapping. Florence, Italy.
• Ramírez RR, Wipf D, Rao B, Makeig S. 2006b. Sparse Bayesian Learning for estimating the spatial orientations and extents of distributed sources. Biomag 2006 - 15th International Conference on Biomagnetism. Vancouver, BC, Canada.
• Ramírez RR, Makeig S. 2007a. Neuroelectromagnetic source imaging of spatiotemporal brain dynamical patterns using frequency-domain independent vector analysis (IVA) and geodesic sparse Bayesian learning (gSBL). 13th Annual Meeting of the Organization for Human Brain Mapping. Chicago, USA.
• Ramirez RR, Makeig S. 2007b. Neuroelectromagnetic source imaging (NSI) toolbox and EEGLAB module. 37th annual meeting of the Society for Neuroscience, November, San Diego, CA.
• Ramirez RR, Makeig S. (2008): Neuroelectromagnetic Source Imaging Using Multiscale Geodesic Basis Functions with Sparse Bayesian Learning or MAP estimation. Neural Computation. (in preparation).
• Rao B, Kreutz-Delgado K. (1999): An affine scaling methodology for best basis selection. IEEE Trans Signal Processing 1:187–202.
• Rao BD, Engan K, Cotter SF, Palmer J, Kreutz-Delgado K. (2002): Subset selection in noise based on diversity measure minimization. IEEE Trans Signal Processing 51(3):760-770.
• Ribary U, Ioannides AA, Singh KD, Hasson R, Bolton JP, Lado F, Mogilner A, Llinas R. (1991): Magnetic field tomography of coherent thalamocortical 40-Hz oscillations in humans. Proc Natl Acad Sci U S A 88(24):11037-41.
• Sarnthein J, Morel A, von Stein A, Jeanmonod D. (2003): Thalamic theta field potentials and EEG: high thalamocortical coherence in patients with neurogenic pain, epilepsy and movement disorders. Thalamus Related Syst 2(3):231-238.
• Sarvas J. (1987): Basic mathematical and electromagnetic concepts of the biomagnetic inverse problem. Phys Med Biol 32(1):11-22.
• Sato M, Yoshioka T, Kajihara S, Toyama K, Goda N, Doya K, Kawato M. (2004): Hierarchical Bayesian estimation for MEG inverse problem. Neuroimage 23:806-826.
• Scherg M, Berg P. (1991): Use of prior knowledge in brain electromagnetic source analysis. Brain Topogr 4(2):143-50.
• Schimpf PH, Liu H, Ramon C, Haueisen J. (2005): Efficient electromagnetic source imaging with adaptive standardized LORETA/FOCUSS. IEEE Trans Biomed Eng 52(5):901-8.
• Schmidt DM, George JS, Wood CC. (1999): Bayesian inference applied to the electromagnetic inverse problem. Hum Brain Mapp 7(3):195-212.
• Sekihara K, Nagarajan S, Poeppel D, Miyashita Y. (1999): Time-frequency MEG-MUSIC algorithm. IEEE Trans Med Imaging 18(1):92-7.
• Sekihara K, Nagarajan SS, Poeppel D, Marantz A, Miyashita Y. (2002): Application of an MEG eigenspace beamformer to reconstructing spatio-temporal activities of neural sources. Hum Brain Mapp 15(4):199-215.
• Tallon-Baudry C, Bertrand O, Delpuech C, Pernier J. (1996): Stimulus specificity of phase-locked and non-phase-locked 40 Hz visual responses in human. J Neurosci 16(13):4240-9.
• Tang AC, Pearlmutter BA, Malaszenko NA, Phung DB, Reeb BC. (2002): Independent components of magnetoencephalography: localization. Neural Comput 14(8):1827-58.
• Taylor JG, Ioannides AA, Muller-Gartner HW. (1999): Mathematical analysis of lead field expansions. IEEE Trans Med Imaging 18(2):151-63.
• Tesche CD. (1997): Non-invasive detection of ongoing neuronal population activity in normal human hippocampus. Brain Res 749(1):53-60.
• Tipping ME. (2001): Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1:211-244.
• Ulbert I, Halgren E, Heit G, Karmos G. (2001): Multiple microelectrode-recording system for human intracortical applications. Journal of Neuroscience Methods 106(1):69-79.
• Uusitalo MA, Ilmoniemi RJ. (1997): Signal-space projection method for separating MEG or EEG into components. Medical and Biological Engineering and Computing 35(2):135-140.
• Uutela K, Hamalainen M, Salmelin R. (1998): Global optimization in the localization of neuromagnetic sources. IEEE Trans Biomed Eng 45(6):716-23.
• Uutela K, Hamalainen M, Somersalo E. (1999): Visualization of magnetoencephalographic data using minimum current estimates. Neuroimage 10(2):173-80.
• Van Veen BD, van Drongelen W, Yuchtman M, Suzuki A. (1997): Localization of brain electrical activity via linearly constrained minimum variance spatial filtering. IEEE Trans Biomed Eng 44(9):867-80.
• Volkmann J, Joliot M, Mogilner A, Ioannides AA, Lado F, Fazzini E, Ribary U, Llinas R. (1996): Central motor loop oscillations in parkinsonian resting tremor revealed by magnetoencephalography. Neurology 46(5):1359-70.
• Vrba J, Robinson SE. (2001): Signal processing in magnetoencephalography. Methods 25(2):249-71.
• Wang JZ, Williamson SJ, Kaufman L. (1992): Magnetic source images determined by a lead-field analysis: the unique minimum-norm least-squares estimation. IEEE Trans Biomed Eng 39(7):665-75.
• Wipf DP, Rao BD. (2004): Sparse Bayesian Learning for Basis Selection. IEEE Trans Signal Processing 52:2153-2164.
• Wipf DP, Rao BD. (2007a): An Empirical Bayesian Strategy for Solving the Simultaneous Sparse Approximation Problem. IEEE Trans Signal Processing. 55(7):3704-3716.
• Wipf DP, Ramírez RR, Palmer JA, Makeig S, Rao BD. 2007b. Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization. Advances in Neural Information Processing Systems (NIPS). Vancouver, BC, Canada: MIT Press.
• Wolters CH. 2003. Influence of Tissue Conductivity Inhomogeneity and Anisotropy on EEG/MEG based Source Localization in the Human Brain. Leipzig: University of Leipzig. 253 p.
• Wolters CH, Grasedyck L, Hackbusch W. (2004): Efficient computation of lead field bases and influence matrix for the FEM-based EEG and MEG inverse problem. Inverse Problems 20(4):1099-1116.
• Wolters CH, Anwander A, Tricoche X, Weinstein D, Koch MA, MacLeod RS. (2006): Influence of tissue conductivity anisotropy on EEG/MEG field and return current computation in a realistic head model: A simulation and visualization study using high-resolution finite element modeling. Neuroimage 30(3):813-826.
• Zimmerman JE, Frederick NV. (1971): Miniature ultrasensitive superconducting magnetic gradiometer and its use in cardiography and other applications. Appl Phys Lett 19(1):16-19.
• Zumer J, Attias H, Sekihara K, Nagarajan S. (2007): A probabilistic algorithm integrating source localization and noise suppression for MEG and EEG data. NeuroImage 37:102-115.

Internal references

• Jan A. Sanders (2006) Averaging. Scholarpedia, 1(11):1760.
• Valentino Braitenberg (2007) Brain. Scholarpedia, 2(11):2918.
• Eugene M. Izhikevich (2006) Bursting. Scholarpedia, 1(3):1300.
• William D. Penny and Karl J. Friston (2007) Functional imaging. Scholarpedia, 2(5):1478.
• Rodolfo Llinas (2008) Neuron. Scholarpedia, 3(8):1490.
• Jeff Moehlis, Kresimir Josic, Eric T. Shea-Brown (2006) Periodic orbit. Scholarpedia, 1(7):1358.
• Arkady Pikovsky and Michael Rosenblum (2007) Synchronization. Scholarpedia, 2(12):1459.
• Carsten Wolters and Jan C de Munck (2007) Volume conduction. Scholarpedia, 2(3):1738.