Signal Processing - January 2016 - 16

The bold font is used to denote vectors or matrices.
Data and parameters are displayed by italic and regular
font, respectively. Subscript index t denotes the discrete time, whereas superscript indices s, k, and l
denote the subject, condition, and trial, respectively.
E [·] and C [·] denote the expectation and covariance
operators, respectively. Some common notations for
probability distributions and stochastic processes used
in this article are listed in Table 1.

variables and unknown parameters for data interpretation. When
the data likelihood $p(\mathbf{X} \mid \mathbf{Z}, \boldsymbol{\theta})$ is a Gaussian distribution, the
problem becomes regression, which includes the problem of
electromagnetic imaging. When the data likelihood includes a
category label $c$, such as $p(\mathbf{X} \mid \mathbf{Z}, \boldsymbol{\theta}, c)$ (e.g., the brain-state
classification problem), the inference problem becomes
$p(c \mid \mathbf{X}) = \int p(c, \mathbf{Z} \mid \mathbf{X})\, d\mathbf{Z} = \int p(c \mid \mathbf{X}, \mathbf{Z})\, p(\mathbf{Z} \mid \mathbf{X})\, d\mathbf{Z}$,
which can be decomposed into two steps: estimation of $p(\mathbf{Z} \mid \mathbf{X})$
and estimation of $p(c \mid \mathbf{Z})$.
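
The two-step decomposition can be illustrated with a small Monte Carlo
sketch. It is not the method of this tutorial: it assumes that $c$
depends on $\mathbf{X}$ only through $\mathbf{Z}$, that the posterior $p(\mathbf{Z} \mid \mathbf{X})$ is
Gaussian for a single latent vector, and that $p(c = 1 \mid \mathbf{Z})$ is a
logistic function. The function name `predictive_class_prob` and the
values of `z_mean`, `z_cov`, `w`, and `b` are hypothetical.

```python
import numpy as np


def predictive_class_prob(z_mean, z_cov, w, b, n_samples=5000, seed=0):
    """Monte Carlo estimate of p(c=1 | X) = integral of p(c=1 | Z) p(Z | X) dZ."""
    rng = np.random.default_rng(seed)
    # Step 1 (assumed already done): p(Z | X) is Gaussian with the given moments.
    z = rng.multivariate_normal(z_mean, z_cov, size=n_samples)
    # Step 2 (assumed form): p(c=1 | Z) is a logistic function of Z.
    p_c_given_z = 1.0 / (1.0 + np.exp(-(z @ w + b)))
    # Averaging over posterior samples approximates the integral over Z.
    return float(p_c_given_z.mean())


# Illustrative (made-up) posterior moments and classifier weights.
z_mean = np.array([1.0, -0.5])
z_cov = np.array([[0.5, 0.1], [0.1, 0.3]])
w = np.array([2.0, 1.0])
b = 0.0

p_mc = predictive_class_prob(z_mean, z_cov, w, b)
p_plugin = float(1.0 / (1.0 + np.exp(-(z_mean @ w + b))))
print("Monte Carlo p(c=1 | X):", p_mc)
print("Plug-in at posterior mean:", p_plugin)
```

In this toy setting, averaging over the posterior gives a less extreme
probability than plugging in the posterior mean, because the
uncertainty in $\mathbf{Z}$ is carried into the class prediction.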
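For electromagnetic brain imaging, $f$ can be approximated as a
linear function of $\mathbf{Z}$ as follows:

$$
\mathbf{X} = \mathbf{A}\mathbf{Z} + \mathbf{E} \quad \text{or} \quad \mathbf{x}_t = \mathbf{A}\mathbf{z}_t + \mathbf{e}_t, \qquad (2)
$$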


where $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_T] \in \mathbb{R}^{N \times T}$ is the EEG/MEG sensor data
matrix with $N$ sensors and $T$ time points. The source activity
matrix $\mathbf{Z} = [\mathbf{z}_1, \ldots, \mathbf{z}_T] \in \mathbb{R}^{M \times T}$ is associated with $M$ latent brain
sources. Unless stated otherwise, throughout this tutorial, we
assume that the EEG/MEG data have been preprocessed with a
proper clean-up procedure to remove nonbrain biological artifacts.
In the spatiotemporal decomposition problem, $\mathbf{A} \in \mathbb{R}^{N \times M}$
is interpreted as a mixing matrix. In the inverse problem, $\mathbf{A}$ is
known as the lead-field matrix, which can be obtained by solving the
forward problem (from Maxwell's equations) based on the structural
information of the subject's head, as well as the electric and geometric
properties of the electric sources and the volume conductor. Each
unknown source (a row of $\mathbf{Z}$) often represents the magnitude of a
neural current dipole located at the $r$th (discretized) voxel or candidate
location distributed throughout the brain. These candidate locations
can be obtained by segmenting a structural magnetic resonance
(MR) scan of a human subject and tessellating the brain volume with
a set of vertices. Since the number of brain sources far exceeds the
number of sensors, i.e., $M \gg N$, reconstructing brain sources from
EEG/MEG data is a highly ill-posed problem with an infinite number
of solutions. Further anatomical or functional constraints should,
therefore, be incorporated to restrict the solution space. Anatomically,
in reasonable settings, the dipole sources are restricted to lie
on the cerebral cortex, with their orientations perpendicular
to the cortical surface. Functionally, spatial smoothness and sparsity
are the most widely used constraints.

[TABLE 1] Abbreviated notations for probability distributions
and stochastic processes.

Notation                                                Description
$\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$    Gaussian (mean $\boldsymbol{\mu}$, covariance $\boldsymbol{\Sigma}$)
$\mathcal{N}^{+}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$    Half-Gaussian (same parameterization as Gaussian)
$\mathcal{G}(a, b)$                                     Gamma (shape parameter $a > 0$, rate parameter $b > 0$)
$\mathcal{IG}(a, b)$                                    Inverse gamma (shape parameter $a > 0$, scale parameter $b > 0$)
$\mathcal{W}(\nu, \mathbf{W})$                          Wishart (degree of freedom $\nu > 0$, scale matrix $\mathbf{W}$)
$\mathrm{Be}(p)$                                        Bernoulli distribution ($0 < p < 1$)
$\mathcal{B}(a, b)$                                     Beta distribution (shape parameters $a > 0$, $b > 0$)
$\mathcal{GP}(0, C)$                                    Gaussian process (GP) (zero mean function, covariance function $C$)
$\mathcal{DP}(\alpha, G_0)$                             Dirichlet process (DP) (concentration parameter $\alpha > 0$, base measure $G_0$)
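
To make the ill-posedness of the inverse problem concrete, the
following NumPy sketch builds a toy problem with $M \gg N$ and shows two
different source matrices that reproduce the same sensor data. The
random matrix `A` is only a stand-in for a lead-field matrix (a real
one comes from a head model), the sizes are arbitrary, and the helper
`min_norm` is a hypothetical name for the classical Tikhonov-regularized
minimum-norm estimate, used here purely as an illustration rather than
as the estimator advocated in this tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, T = 8, 50, 20                      # sensors, sources, time points (M >> N)

# Hypothetical stand-in for a lead-field matrix; a real one comes from a head model.
A = rng.standard_normal((N, M))

# Sparse ground truth: only two of the M sources are active.
Z_true = np.zeros((M, T))
Z_true[[3, 17], :] = rng.standard_normal((2, T))
X = A @ Z_true                           # noiseless data, X = A Z


def min_norm(A, X, lam=0.0):
    """Tikhonov-regularized minimum-norm estimate A^T (A A^T + lam I)^{-1} X."""
    n_sensors = A.shape[0]
    return A.T @ np.linalg.solve(A @ A.T + lam * np.eye(n_sensors), X)


Z_mn = min_norm(A, X)                    # the exact solution with the smallest norm

# Rows N..M-1 of Vt from the full SVD span the null space of A.
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[N:].T                    # shape (M, M - N)
Z_alt = Z_mn + null_basis @ rng.standard_normal((M - N, T))

print("Z_mn fits data: ", np.allclose(A @ Z_mn, X))
print("Z_alt fits data:", np.allclose(A @ Z_alt, X))
print("norms:", np.linalg.norm(Z_mn), np.linalg.norm(Z_alt))
```

Both `Z_mn` and `Z_alt` explain the data equally well, and neither
equals the sparse `Z_true`. The data alone cannot choose among them,
which is why anatomical and functional constraints are needed.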
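An extension of the static linear model (2) is a Markovian state-space
model, also known as the dynamic factor analysis model [3]:

$$
\begin{aligned}
\mathbf{z}_t &= \mathbf{F}\mathbf{z}_{t-1} + \mathbf{v}_t \\
\mathbf{x}_t &= \mathbf{A}\mathbf{z}_t + \mathbf{e}_t,
\end{aligned}
$$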


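where $\mathbf{F}$ is a time-invariant state-transition matrix for the latent
state $\mathbf{z}_t$, $\mathbf{A} \in \mathbb{R}^{N \times M}$ can be either a factor loading matrix (for
modeling low-dimensional sources, where $M \ll N$) or a lead-field
matrix (for modeling high-dimensional sources, where $M \gg N$),
and $\mathbf{v}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ and $\mathbf{e}_t \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_e)$ denote zero-mean Gaussian
dynamic and measurement noise, respectively. Simple linear algebra
will yield $\mathbf{x}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{A}\boldsymbol{\Sigma}_z\mathbf{A}^\top + \boldsymbol{\Sigma}_e)$, where $\boldsymbol{\Sigma}_z$ denotes the
marginal covariance of $\mathbf{z}_t$.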
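
The linear algebra behind this marginal distribution can be spelled out
as follows. This is a sketch under assumptions that the text does not
state explicitly: $\mathbf{z}_t$ is zero mean, $\mathbf{e}_t$ is independent of $\mathbf{z}_t$,
$\mathbf{v}_t$ is independent of $\mathbf{z}_{t-1}$, and the process is stationary (which
requires all eigenvalues of $\mathbf{F}$ to lie strictly inside the unit
circle), so that the covariance of $\mathbf{z}_t$ does not depend on $t$.

$$
\begin{aligned}
\mathrm{E}[\mathbf{x}_t] &= \mathbf{A}\,\mathrm{E}[\mathbf{z}_t] + \mathrm{E}[\mathbf{e}_t] = \mathbf{0}, \\
\mathrm{C}[\mathbf{x}_t] &= \mathbf{A}\,\mathrm{C}[\mathbf{z}_t]\,\mathbf{A}^\top + \mathrm{C}[\mathbf{e}_t]
  = \mathbf{A}\boldsymbol{\Sigma}_z\mathbf{A}^\top + \boldsymbol{\Sigma}_e, \\
\boldsymbol{\Sigma}_z &= \mathrm{C}[\mathbf{F}\mathbf{z}_{t-1} + \mathbf{v}_t]
  = \mathbf{F}\boldsymbol{\Sigma}_z\mathbf{F}^\top + \mathbf{I}.
\end{aligned}
$$

Because $\mathbf{x}_t$ is a linear combination of jointly Gaussian variables,
it is itself Gaussian. The last line is a discrete-time Lyapunov
equation that determines $\boldsymbol{\Sigma}_z$ from $\mathbf{F}$ under the stationarity
assumption.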
For the most part, we assume that $\mathbf{e}_t$ is a noise-plus-interference
term and, for simplicity, that the $\mathbf{e}_t$ are drawn independently
from $\mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_e)$. However, temporal correlations can easily be
incorporated using a simple transformation outlined in [4] or
using the spatiotemporal framework introduced in [5]. Initially,
we assume that $\boldsymbol{\Sigma}_e$ is known; in later sections, we also derive
how $\boldsymbol{\Sigma}_e$ is estimated from data.
For brain-state classification, a labeled state or class variable,
$c$, is additionally available for the training data. The objective is to
determine $p(c \mid \mathbf{X})$ for the test (unlabeled) data. Three common
problems within brain-state classification are disease diagnosis,
behavioral state classification, and BCIs. In disease diagnosis,
the class or state variable corresponds to the disease diagnosis
group, and the observed sensor data are used to infer
whether the given EEG/MEG observations carry a
signature of the disease group. In behavioral state classification,
the problem is to infer the evolving behavioral state from a
subject's EEG/MEG data. The most common example is to infer
the sleep stage (e.g., awake, slow-wave sleep, and rapid-eye-movement
sleep). A second example is to determine the time
period when abnormal epileptiform activity is present in the EEG/MEG
data of a patient with epilepsy. Finally, in BCIs, we learn
the brain state associated with the intended state $c$ of the user,
and subsequently infer the intended brain state $c$ from new
data $\mathbf{X}$.



