IEEE Computational Intelligence Magazine - May 2022 - 44
On the other hand, MC-Dropout might lack some expressiveness
and may not fully capture the uncertainty associated
with the model predictions [98]. It also lacks flexibility compared
to other Bayesian methods for online or active learning.
2) Bayes via Stochastic Gradient Descent
Stochastic gradient descent (SGD) and related algorithms are at
the core of modern machine learning. The initial goal of SGD
is to provide an algorithm that converges to an optimal point
estimate solution while having only noisy estimates of the gradient
of the objective function. This is especially useful when
the training data has to be split into mini-batches. The parameter
update rule at time t can be written as:
$$\Delta\theta_t = \frac{\epsilon_t}{2}\left(\frac{N}{n}\nabla\log p(D_{t,y} \mid D_{t,x}, \theta_t) + \nabla\log p(\theta_t)\right) \qquad (43)$$
where $D_t$ is a mini-batch subsampled at time t from the complete
dataset D, $\epsilon_t$ is the learning rate at time t, N is the size of
the whole dataset and n the size of the mini-batch.
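The update rule (43) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a scalar parameter, a unit-variance Gaussian likelihood, a zero-mean Gaussian prior with variance `PRIOR_VAR`, and a polynomially decaying learning rate. All function names and constants are hypothetical.

```python
import random

PRIOR_VAR = 10.0  # assumed variance of the Gaussian prior on theta

def grad_log_likelihood(theta, batch):
    """Gradient of sum_i log N(y_i | theta, 1) over the mini-batch."""
    return sum(y - theta for y in batch)

def grad_log_prior(theta):
    """Gradient of log N(theta | 0, PRIOR_VAR)."""
    return -theta / PRIOR_VAR

def sgd_step(theta, batch, N, eps):
    """One update following (43): the mini-batch gradient is rescaled by N/n."""
    n = len(batch)
    grad = (N / n) * grad_log_likelihood(theta, batch) + grad_log_prior(theta)
    return theta + 0.5 * eps * grad

rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(1000)]  # synthetic dataset D
theta = 0.0
for t in range(2000):
    batch = rng.sample(data, 32)        # mini-batch D_t, n = 32
    eps = 1e-3 / (1.0 + t) ** 0.55      # learning rate decaying toward zero
    theta = sgd_step(theta, batch, len(data), eps)

# Closed-form MAP estimate for this toy model, for comparison with theta.
theta_map = sum(data) / (len(data) + 1.0 / PRIOR_VAR)
```

With the learning rate driven toward zero, `theta` should settle close to `theta_map`, which is the point-estimate behaviour described in the text.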
SGD, or related optimization algorithms such as ADAM
[93], can be reinterpreted as a Markov Chain algorithm [99].
Usually, the hyperparameters of the algorithm are tweaked to
ensure that the chain converges to a Dirac distribution, whose
position gives the final point estimate. This is done by reducing
$\epsilon_t$ toward zero while ensuring that $\sum_{t=0}^{\infty} \epsilon_t = \infty$. However, if
the learning rate is reduced toward a strictly positive value, the
underlying Markov Chain will converge to a stationary distribution.
If a Bayesian prior is accounted for in the objective
function, then this stationary distribution can be an approximation
of the corresponding posterior.
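For the first regime, where the step size is driven toward zero without its sum converging, one commonly used family of schedules is a polynomial decay. It is given here only as an illustration, not as the choice made in the source; the constants $a$, $b$ and $\gamma$ are assumed hyperparameters.

```latex
\epsilon_t = a\,(b + t)^{-\gamma}, \qquad a > 0,\; b > 0,\; \gamma \in (0.5, 1]
\;\Longrightarrow\;
\sum_{t=0}^{\infty} \epsilon_t = \infty
\quad\text{and}\quad
\sum_{t=0}^{\infty} \epsilon_t^{2} < \infty .
```

The first sum diverges because $\gamma \le 1$, and the second converges because $2\gamma > 1$.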
a) MCMC Algorithms Based on the SGD Dynamic
To approximately sample the posterior using the SGD algorithm,
a specific MCMC method, called stochastic gradient
Langevin dynamic (SGLD) [100], has been developed, see
Algorithm 7. Coupling SGD with Langevin dynamic leads to
a slightly modified update step:
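$$\Delta\theta_t = \frac{\epsilon_t}{2}\left(\frac{N}{n}\nabla\log p(D_{t,y} \mid D_{t,x}, \theta_t) + \nabla\log p(\theta_t)\right) + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \epsilon_t) \qquad (44)$$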
Welling et al. [100] showed that this method leads to a Markov
Chain that samples the posterior if $\epsilon_t$ goes toward zero. However,
in that case, the successive samples become increasingly
autocorrelated. To address this problem, the authors proposed to
stop reducing $\epsilon_t$ at some point, thus making the samples only
an approximation of the posterior. Nevertheless, SGLD offers
better theoretical guarantees compared to other MCMC methods
when the dataset is split into mini-batches. This makes the
algorithm useful in Bayesian deep learning.
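The SGLD update (44) can be sketched on the same kind of toy problem. This is a minimal illustration under stated assumptions, not a reference implementation: a scalar parameter, a unit-variance Gaussian likelihood, a Gaussian prior of variance `prior_var`, and a small constant step size, which corresponds to the case where the reduction of the learning rate has been stopped. The function name and all constants are hypothetical.

```python
import math
import random

def sgld_chain(data, n_steps, batch_size, eps, prior_var=10.0, seed=0):
    """Run SGLD with a constant step size eps and return every visited theta."""
    rng = random.Random(seed)
    N = len(data)
    theta = rng.gauss(0.0, 1.0)  # draw theta_0 from an initial distribution
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_lik = sum(y - theta for y in batch)   # grad of log-likelihood on the batch
        grad_prior = -theta / prior_var            # grad of log-prior
        drift = 0.5 * eps * ((N / batch_size) * grad_lik + grad_prior)
        eta = rng.gauss(0.0, math.sqrt(eps))       # injected noise with variance eps
        theta = theta + drift + eta
        samples.append(theta)
    return samples

rng = random.Random(1)
data = [rng.gauss(2.0, 1.0) for _ in range(1000)]
samples = sgld_chain(data, n_steps=20000, batch_size=32, eps=1e-5)
kept = samples[2000:]  # discard the initial transient

# Exact posterior of this conjugate toy model, for comparison.
post_var = 1.0 / (len(data) + 1.0 / 10.0)
post_mean = post_var * sum(data)

est_mean = sum(kept) / len(kept)
est_var = sum((s - est_mean) ** 2 for s in kept) / len(kept)
```

Because the step size is held constant, `est_mean` and `est_var` only approximate `post_mean` and `post_var`. The mini-batch gradient noise adds a small amount of extra variance on top of the injected noise.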
To favor the exploration of the posterior, one can use warm
restarts of the algorithm [101], i.e., restarting the algorithm at a
new random position $\theta_0$ and with a large learning rate $\epsilon_0$. This
offers multiple benefits. The main one is to avoid the mode collapse
problem [102]. In the case of a BNN, the true Bayesian posterior
is usually a complex multimodal distribution, as multiple
and sometimes not equivalent parametrizations $\theta$ of the network
can fit the training set. Favoring exploration over precise reconstruction
can help to achieve a better picture of those different
modes. Then, as parameters sampled from the same mode are
likely to make the model generalize in a similar manner, using
warm restarts enables a much better estimate of the epistemic
uncertainty when processing unseen data, even if this approach
provides only a very rough approximation of the exact posterior.
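The effect of warm restarts can be sketched on a one-dimensional bimodal density. The example below is an assumption-laden illustration: the density $\log p(\theta) = -(\theta^2 - 4)^2 + \text{const}$, with modes near $\theta = \pm 2$, is a hypothetical stand-in for a multimodal BNN posterior. The gradient is computed exactly rather than from mini-batches, and the step-size schedule and all names are invented for the example.

```python
import math
import random

def grad_log_post(theta):
    """Gradient of the toy log-density log p(theta) = -(theta**2 - 4)**2 + const."""
    return -4.0 * theta * (theta ** 2 - 4.0)

def langevin_run(theta0, n_steps, eps0, rng):
    """Langevin updates from theta0, starting with a large step size that decays."""
    theta = theta0
    for t in range(n_steps):
        eps = eps0 / (1.0 + t / 100.0)
        theta += 0.5 * eps * grad_log_post(theta) + rng.gauss(0.0, math.sqrt(eps))
    return theta

def warm_restarts(n_restarts, n_steps=2000, eps0=1e-2, seed=0):
    """Restart the chain from a new random position and keep each final sample."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_restarts):
        theta0 = rng.uniform(-3.0, 3.0)  # new random starting position
        finals.append(langevin_run(theta0, n_steps, eps0, rng))
    return finals

finals = warm_restarts(8)
# Record which of the two modes (negative or positive) each restart ended in.
modes_found = {1 if th > 0 else -1 for th in finals}
```

A single chain started in one well is very unlikely to cross the barrier at $\theta = 0$, so it would report only one mode. Collecting the final samples of several restarts gives each mode a chance to be visited, at the price of a coarse approximation of the relative weights of the modes.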
Similar to other MCMC methods, this approach still suffers
from a huge memory footprint. This is why a number of
authors have proposed methods that are more similar to traditional
variational inference than to an MCMC algorithm.
b) Variational Inference Based on SGD Dynamic
Instead of an MCMC algorithm, SGD dynamic can be used as
a variational inference method to learn a distribution by using
Laplace approximation. Laplace approximation fits a Gaussian
posterior by using the maximum a posteriori estimate as the
mean and the inverse of the Hessian H of the loss (assuming
the loss is the log likelihood) as covariance matrix:
$$p(\theta \mid D) \approx \mathcal{N}(\hat{\theta}, H^{-1}) \qquad (45)$$
Computing $H^{-1}$ is usually intractable for large neural network
architectures. Thus, approximations are used, most of the time
by analysing the variance of the gradient descent algorithm
[88], [89], [103]. However, while those methods are able to capture
the fine shape of one mode of the posterior, they cannot fit
multiple modes.
Algorithm 7 Stochastic Gradient Langevin Dynamic (SGLD).
Draw $\theta_0 \sim$ Initial probability distribution;
for t = 0 to E do
    Select a mini-batch $(D_{t,x}, D_{t,y}) \subset D$;
    $f(\theta_t) = -\frac{N}{n}\log p(D_{t,y} \mid D_{t,x}, \theta_t) - \log p(\theta_t)$;
    $\Delta\theta_f = \mathrm{backprop}_{\theta_t}(f)$;
    Draw $\eta_t \sim \mathcal{N}(0, \epsilon_t)$;
    $\theta_{t+1} = \theta_t - \frac{\epsilon_t}{2}\Delta\theta_f + \eta_t$;
end for
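The Laplace approximation in (45) can be worked through on a model small enough for the Hessian to be exact. The sketch below is illustrative only and rests on several assumptions: a single scalar parameter $\theta$, Bernoulli observations with success probability $\sigma(\theta)$, a zero-mean Gaussian prior, and $H$ taken as the Hessian of the negative log-posterior at the MAP estimate. The function names and the data (30 successes in 40 trials) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neg_log_post_grad_hess(theta, k, m, prior_var):
    """Gradient and Hessian of the negative log-posterior for k successes in m trials."""
    s = sigmoid(theta)
    grad = m * s - k + theta / prior_var
    hess = m * s * (1.0 - s) + 1.0 / prior_var
    return grad, hess

def laplace_approximation(k, m, prior_var=4.0, n_iter=50):
    """Find the MAP estimate by Newton's method, then return (mean, variance)."""
    theta = 0.0
    for _ in range(n_iter):
        grad, hess = neg_log_post_grad_hess(theta, k, m, prior_var)
        theta -= grad / hess
    _, hess = neg_log_post_grad_hess(theta, k, m, prior_var)
    return theta, 1.0 / hess  # Gaussian mean and covariance H^{-1}

theta_hat, var = laplace_approximation(k=30, m=40)
```

Here `theta_hat` plays the role of $\hat{\theta}$ and `var` the role of $H^{-1}$. For a network with many parameters, $H$ becomes a large matrix, which is why the approximations mentioned above are needed.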
Lakshminarayanan et al. [102] proposed using warm restarts
to obtain different point estimate networks instead of fitting a
parametric distribution. This method, called deep ensembles
(see Figure 10 and Algorithm 8), has been used in the past to
perform model averaging. The main contribution of [102] was
to show that it enables well-calibrated error estimates. While
Lakshminarayanan et al. [102] claim that their method is non-Bayesian,
it has been shown that their approach can still be
understood from a Bayesian point of view [12], [104]. When
regularization is used, the different point estimates should correspond
to modes of a Bayesian posterior. This can be