IEEE Computational Intelligence Magazine - May 2022 - 41
the number of variational parameters $\phi$ from $O(n^2)$ to $O(n)$,
where $n$ is the number of model parameters $\theta$. Appendix A
gives more details on how these simplifications are implemented
in practice.
Appendix A
Implementing parameter-efficient normal distributions
for variational inference
Given a random vector $\epsilon$ in which each independent and
identically distributed component follows a standard normal
distribution $\mathcal{N}(0, 1)$, one can obtain a sample $\theta$ from a
normal distribution $\mathcal{N}(\mu, \Sigma)$ using the following formula:
$$\theta = \mu + R\epsilon, \tag{55}$$
where $R$ is a matrix such that $\Sigma = RR^\top$.
When a variational inference algorithm needs to learn the covariance
matrix of $\theta$, it is often more convenient to learn $R$. Assuming
$\theta$ has $n$ entries, $RR^\top$ is the Cholesky decomposition of
$\Sigma$. As such, $R$ is a lower triangular matrix and $O(n^2)$
variational parameters are required to learn the exact covariance
matrix. This quickly becomes computationally prohibitive as $n$ grows.
A straightforward simplification is to only consider a diagonal
approximation of $\Sigma$. This can be enforced by learning only the
diagonal coefficients of $R$, meaning only $O(n)$ variational
parameters are required (Figure A1a). This approach can be
extended to learn more correlation coefficients by learning a block
diagonal [88] covariance matrix, which can be done by learning
the corresponding lower triangular entries in $R$ (Figure A1b). If
the maximal size of the nonzero blocks is fixed to be $w$, $O(w \cdot n)$
variational parameters are required. The major drawback of this
model is that the index of two given parameters determines
whether their covariance can be learned by the variational distribution.
This is not always ideal, as it is hard to predict which
parameters will be the most correlated and need to be positioned
close to one another. An alternative is to learn a diagonal plus low
rank approximation of $\Sigma$ [89]. This is done by sampling a vector
$\epsilon$ with $n + r$ (instead of $n$) components. $R$ is then defined as:
$$R = [D, L], \tag{56}$$
where $D$ is a diagonal matrix of size $n \times n$ and $L$ is a lower
triangular matrix of size $n \times r$ (Figure A1c). This means that the
model has more flexibility to learn the correlation between all the
components of $\theta$ while only requiring $O(r \cdot n)$ variational
parameters.
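The sampling rules of Appendix A can be illustrated with a short NumPy sketch. It is not taken from the article: the function names, the toy values of `mu`, `d` and `L`, and the choice to split $\epsilon$ rather than build $R = [D, L]$ explicitly are all assumptions made for illustration. The sketch draws samples with the diagonal plus low rank parameterization and compares their empirical covariance with the implied $\Sigma = RR^\top = D^2 + LL^\top$.

```python
import numpy as np


def sample_diagonal(mu, d, eps):
    """theta = mu + D eps with D = diag(d); eps has n components."""
    return mu + d * eps


def sample_diag_plus_low_rank(mu, d, L, eps):
    """theta = mu + [D, L] eps (Eq. 56); eps has n + r components."""
    n = mu.shape[0]
    # Split eps instead of materializing the n x (n + r) matrix R.
    return mu + d * eps[..., :n] + eps[..., n:] @ L.T


def num_variational_parameters(n, r):
    """Nonzero entries of R for each parameterization (assumes r <= n)."""
    return {
        "full_cholesky": n * (n + 1) // 2,
        "diagonal": n,
        "diag_plus_low_rank": n + n * r - r * (r - 1) // 2,
    }


rng = np.random.default_rng(0)
n, r, num_samples = 4, 2, 200_000
mu = np.array([0.0, 1.0, 2.0, 3.0])
d = np.array([0.5, 1.0, 1.5, 2.0])
# Hypothetical low-rank factor; entries above the main diagonal are zero.
L = np.array([[1.0, 0.0], [0.5, 1.0], [-0.5, 0.3], [0.2, -0.7]])

eps = rng.standard_normal((num_samples, n + r))
theta = sample_diag_plus_low_rank(mu, d, L, eps)
theta_diag = sample_diagonal(mu, d, eps[:, :n])

implied_cov = np.diag(d**2) + L @ L.T          # Sigma = R R^T
empirical_cov = np.cov(theta, rowvar=False)
max_cov_error = np.abs(empirical_cov - implied_cov).max()
print("largest covariance deviation:", max_cov_error)
print(num_variational_parameters(n=1000, r=10))
```

In a variational inference setting, `mu`, `d` and `L` would be the learned variational parameters. The parameter counts show the gap between the full Cholesky factor, which grows as $O(n^2)$, and the two approximations, which grow as $O(n)$ and $O(r \cdot n)$.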
FIGURE A1 Nonzero entries in $R$ when learning a diagonal (a),
block diagonal (b) or diagonal plus low rank (c) approximation of $\Sigma$.
Algorithm 5 Bayes-by-backprop algorithm.
$\phi = \phi_0$;
for $i = 0$ to $N$ do
  Draw $\epsilon \sim q(\epsilon)$;
  $\theta = t(\epsilon, \phi)$;
  $f(\theta, \phi) = \log(q_\phi(\theta)) - \log(p(D_y \mid D_x, \theta)\, p(\theta))$;
  $\Delta_\phi f = \mathrm{backprop}_\phi(f)$;
  $\phi = \phi - \alpha \Delta_\phi f$;
end for
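As an illustration of Algorithm 5, the following NumPy sketch runs the loop on a toy Bayesian linear regression. Everything specific here is an assumption rather than part of the article: the synthetic data, the $\mathcal{N}(0, I)$ prior, the known noise level, the mean-field Gaussian $q_\phi$ with $\phi = (m, \rho)$ and standard deviation $\exp(\rho)$, and the hand-written derivatives that stand in for the backprop step. The gradient is assembled term by term following Eq. (38).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): y = X @ w_true + Gaussian noise with known std.
num_points, dim, noise_std = 20, 2, 1.0
X = rng.standard_normal((num_points, dim))
w_true = np.array([1.0, -0.5])
y = X @ w_true + noise_std * rng.standard_normal(num_points)


def grad_theta_neg_log_joint(theta):
    """d/dtheta of -log(p(D_y | D_x, theta) p(theta)) with a N(0, I) prior."""
    residual = y - X @ theta
    return theta - X.T @ residual / noise_std**2


def bayes_by_backprop(num_steps=10_000, lr=1e-3):
    # phi = (m, rho): mean and log standard deviation of q_phi.
    m = np.zeros(dim)
    rho = np.full(dim, -1.0)
    for _ in range(num_steps):
        eps = rng.standard_normal(dim)          # Draw eps ~ q(eps)
        s = np.exp(rho)
        theta = m + s * eps                     # theta = t(eps, phi)

        # f = log q_phi(theta) - log(p(D_y | D_x, theta) p(theta)).
        # Partial derivatives of f, written out by hand in place of backprop.
        df_dtheta = -(theta - m) / s**2 + grad_theta_neg_log_joint(theta)
        df_dm = (theta - m) / s**2              # direct dependence via log q
        df_drho = -1.0 + (theta - m) ** 2 / s**2

        # Eq. (38): total gradient = df/dtheta * dtheta/dphi + df/dphi.
        grad_m = df_dtheta * 1.0 + df_dm        # dtheta/dm = 1
        grad_rho = df_dtheta * (s * eps) + df_drho  # dtheta/drho = s * eps

        m -= lr * grad_m                        # phi = phi - alpha * grad
        rho -= lr * grad_rho
    return m, np.exp(rho)


m, s = bayes_by_backprop()

# Closed-form posterior of this conjugate model, for comparison only.
precision = np.eye(dim) + X.T @ X / noise_std**2
exact_mean = np.linalg.solve(precision, X.T @ y / noise_std**2)
mean_field_std = 1.0 / np.sqrt(np.diag(precision))
print("learned mean:", m, "exact mean:", exact_mean)
print("learned std: ", s, "mean-field optimum:", mean_field_std)
```

Because this toy model is conjugate, the exact posterior is available in closed form, which gives a sanity check that a real Bayesian neural network would not have. With a single-sample gradient the learned values fluctuate around their targets, so they should only be expected to match approximately. In practice an automatic differentiation framework would replace the hand-written derivatives.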
C. Bayes by Backpropagation
Variational inference offers a good mathematical tool for
Bayesian inference, but it needs to be adapted to deep learning.
The main problem is that stochasticity stops backpropagation
from functioning at the internal nodes of a network [46]. Different
solutions have been proposed to mitigate this problem,
including probabilistic backpropagation [90] and Bayes-by-backprop
[91]. The latter may appear more familiar to deep learning
practitioners, so we focus on Bayes-by-backprop in this
tutorial. Bayes-by-backprop is a practical implementation
of SVI combined with a reparametrization trick [92] to
ensure backpropagation works as usual.
The idea is to use a random variable $\epsilon \sim q(\epsilon)$ as a
nonvariational source of noise. $\theta$ is not sampled directly but
obtained via a deterministic transformation $t(\epsilon, \phi)$ such that
$\theta = t(\epsilon, \phi)$ follows $q_\phi(\theta)$. $\epsilon$ is sampled
and thus changes at each iteration but can still be considered a
constant with regard to other variables. All other transformations
being non-stochastic, backpropagation works as usual for the
variational parameters $\phi$, meaning the training loop can be
implemented analogously to the training loop of a non-stochastic
neural network; see Algorithm 5. The general formula for the ELBO
becomes:
$$\int q_\phi(t(\epsilon, \phi)) \log\left(\frac{P(t(\epsilon, \phi), D)}{q_\phi(t(\epsilon, \phi))}\right) \left|\mathrm{Det}(\nabla_\epsilon t(\epsilon, \phi))\right| d\epsilon. \tag{37}$$
This is tedious to work with. Instead, to estimate the gradient
of the ELBO, Blundell et al. [91] proposed to use the fact that if
$q_\phi(\theta)\,d\theta = q(\epsilon)\,d\epsilon$, then for a differentiable
function $f(\theta, \phi)$, we have:
$$\frac{\partial}{\partial \phi} \int q_\phi(\theta) f(\theta, \phi)\, d\theta = \int q(\epsilon) \left(\frac{\partial f(\theta, \phi)}{\partial \theta} \frac{\partial \theta}{\partial \phi} + \frac{\partial f(\theta, \phi)}{\partial \phi}\right) d\epsilon. \tag{38}$$
A proof is provided in [91]. We also provide in Appendix B an
alternative proof to give more details on when we can assume
$q_\phi(\theta)\,d\theta = q(\epsilon)\,d\epsilon$. A sufficient condition is
for $t(\epsilon, \phi)$ to be invertible with respect to $\epsilon$ and for
the distributions $q(\epsilon)$ and $q_\phi(\theta)$ to not be degenerate.
For the case where the weights are treated as stochastic variables,
and thus constitute the hypothesis $H$, the training loop can be
implemented as described in Algorithm 5.
The objective function $f$ corresponds to an estimate of the
ELBO from a single sample. This means that the gradient