IEEE Computational Intelligence Magazine - May 2023 - 64
FIGURE 6. An illustration of knowledge distillation. FKT refers to Feature-based Knowledge
Transfer, RKT refers to Relation-based Knowledge Transfer, and LKT refers to Logits-based
Knowledge Transfer.
also be extended to gradient computations.
DoReFa-Net [46] quantized gradients
to numbers with a bit-width of less
than 8 before back-propagation. It allows
quantization of the network during training
or fine-tuning. Instead of quantizing a
single weight, vector quantization
approaches focus on factorizing the
weight matrix by weight clustering and
sharing [47], [48].
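As a rough illustration of weight clustering and sharing, the sketch below groups scalar weights into k clusters with a plain one-dimensional k-means and stores only a small codebook plus one index per weight. It is a simplified, assumed variant (scalar rather than vector codewords, linear centroid initialization) and is not the specific method of [47] or [48]; the function names `share_weights` and `reconstruct` are hypothetical.

```python
def nearest(w, centroids):
    """Index of the centroid closest to weight w."""
    return min(range(len(centroids)), key=lambda j: abs(w - centroids[j]))


def share_weights(weights, k, iters=20):
    """Cluster scalar weights into k shared values (1-D k-means).

    Returns (codebook, indices): each weight is represented by the index
    of its centroid, so only k floats plus small integers are stored.
    """
    lo, hi = min(weights), max(weights)
    # Assumption: centroids start evenly spaced over the weight range.
    centroids = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    for _ in range(iters):
        # Assignment step: each weight joins its nearest centroid.
        assign = [nearest(w, centroids) for w in weights]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [w for w, a in zip(weights, assign) if a == j]
            if members:
                centroids[j] = sum(members) / len(members)
    indices = [nearest(w, centroids) for w in weights]
    return centroids, indices


def reconstruct(codebook, indices):
    """Rebuild the (approximate) weights from the shared codebook."""
    return [codebook[i] for i in indices]


weights = [-1.02, -0.98, 0.01, -0.01, 0.99, 1.03]
codebook, indices = share_weights(weights, k=3)
approx = reconstruct(codebook, indices)
```

With k shared values, each index needs only about log2(k) bits, which is where the storage saving comes from; the reconstruction error depends on how well the weights cluster.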
Instead of using only pruning or quantization
for network compression, a combination
of the two has generally been
reported to yield a higher compression
rate. Han et al. presented a three-stage
pipeline that consists of pruning, weight
sharing by vector quantization, and Huffman
encoding to compress a pre-trained
model, and it achieved state-of-the-art
performance at that time [49]. Both pruning
and quantization are effective ways to
reduce model size for efficient learning.
One limitation is that they can only deal
with models that share a common network
architecture: a cumbersome network is
compressed to an efficient one of the
same design with fewer weights (or neurons)
and/or lower precision. They cannot be
used across heterogeneous networks,
a case that can instead be addressed with
the knowledge distillation technique.
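To make the combination of pruning and quantization concrete, the following minimal sketch applies magnitude pruning and then symmetric uniform quantization to a flat list of weights. It is only an illustration under stated assumptions: it swaps in uniform quantization for the weight sharing and Huffman coding stages of the pipeline in [49], and the names `magnitude_prune` and `quantize_uniform`, the sparsity level, and the bit-width are all hypothetical choices.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Weight positions ordered from least to most important by |w|.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to signed integer codes.

    Returns (codes, scale); code * scale approximates the original weight.
    Zero always maps to code 0, so pruned weights stay pruned.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0] * len(weights), 1.0
    q_max = 2 ** (bits - 1) - 1  # e.g. 7 for 4-bit signed codes
    scale = max_abs / q_max
    codes = [round(w / scale) for w in weights]
    return codes, scale


weights = [0.8, -0.05, 0.3, 0.02, -0.6, 0.1]
pruned = magnitude_prune(weights, sparsity=0.5)   # assumed 50% sparsity
codes, scale = quantize_uniform(pruned, bits=4)   # assumed 4-bit codes
dequantized = [c * scale for c in codes]
```

Note that both steps keep the layout of the weight list unchanged, which reflects the limitation discussed above: the compressed model has the same architecture as the original.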
3) Knowledge Distillation
Knowledge distillation (KD) is a flexible
yet efficient approach to compress AI
models. It aims to transfer the knowledge
learnt from a cumbersome network
to a compact one, which is also known
as the 'Teacher-Student' learning framework
[50]. With the knowledge from a
teacher, a lightweight student network
can achieve performance competitive with
the cumbersome (teacher) network yet at
a higher computational efficiency. The
success of KD relies mainly on two key
factors: the knowledge and the distilling
schemes.
The knowledge information can be
categorized into three different types: logits-based,
feature-based, and relation-based
knowledge [51], which are depicted
in Fig. 6. The vanilla KD approach leverages
softened logits from a cumbersome
teacher as the knowledge to guide the
training process of a student [52]. Romero
et al. extended it by introducing the 'hint'
knowledge from intermediate layers of a
teacher network, which can be termed
feature-based knowledge [53]. Instead of
directly minimizing the discrepancy of feature
maps between the teacher and the
student, other advanced feature-based
knowledge has also been explored. For
instance, Zagoruyko and Komodakis
regarded the teacher's attention maps
derived from feature maps as the knowledge
to be transferred [54]. Different from
the aforementioned two knowledge
types, relation-based KD methods emphasize
the transfer of intra-relationships
between data samples [55], [56], [57] or
correlations between feature maps from
multiple intermediate layers [58], [59].
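The logits-based case can be sketched in a few lines. The example below computes a commonly used form of the vanilla KD objective for a single sample: a KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature, plus a cross-entropy term on the hard label. The temperature, the weighting factor `alpha`, and the function names are assumptions made for illustration, not values taken from [52].

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperature gives softer outputs."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: KL(teacher || student) on the softened distributions.
    soft = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    # Hard term: cross-entropy with the ground-truth label at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft-term gradients on a
    # comparable scale as the temperature changes.
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard


teacher = [1.0, 2.0, 3.0]
matched = kd_loss([1.0, 2.0, 3.0], teacher, label=2, alpha=1.0)
mismatched = kd_loss([3.0, 2.0, 1.0], teacher, label=2, alpha=1.0)
```

Feature-based and relation-based variants replace the soft term with a distance between intermediate feature maps, attention maps, or pairwise sample relations, while the overall structure of the objective stays similar.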
Along with this knowledge information,
various distilling schemes have also
been proposed. Although most KD
methods follow the conventional single-teacher
single-student distilling scheme,
some other promising schemes have also
been explored in the literature. The
scheme of distilling informative knowledge
from an ensemble of teachers instead
of a single teacher could better enhance
the student's generalization ability [60],
[61]. Apart from distilling knowledge
from large-scale networks, another
research direction intends to boost the
performance of a compact network by
transferring the knowledge between networks
that share an identical architecture
(also termed self-knowledge distillation)
[62]. Furlanello et al. proposed a
sequential distilling scheme called Born-Again
Networks (BANs), which takes the
model from the earlier generation as the teacher
and trains a newly initialized identical
model [62], [63]. To better align feature
representations between high-capacity
and low-capacity networks, adversarial
and contrastive distilling schemes are often
adopted. To demonstrate the feasibility of
transferring knowledge between two disparate
network architectures, Xu et al.
exploited an adversarial distilling scheme to
transfer knowledge from a much bigger
LSTM-based teacher to a smaller CNN-based
student [64].
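As a small illustration of the multi-teacher scheme mentioned above, one simple way to combine an ensemble is to average the teachers' temperature-softened class probabilities and use the result as the student's soft target. This is an assumed, minimal variant and not necessarily the scheme used in [60] or [61]; the function name `ensemble_soft_targets` and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperature gives softer outputs."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits_list, temperature=4.0):
    """Average the softened class probabilities of several teachers."""
    probs = [softmax(logits, temperature) for logits in teacher_logits_list]
    n_teachers = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_teachers for c in range(n_classes)]


# Two hypothetical teachers that disagree on the top class.
targets = ensemble_soft_targets([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
```

The averaged target is still a valid probability distribution, so it can be plugged into the same soft-target loss used in single-teacher distillation.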
Generally, KD methods are more
flexible than pruning and quantization:
the teacher and student networks
in KD can have homogeneous or
heterogeneous architectures.
However, in most cases, KD methods
require more training effort, as they need
to train both the teacher and student networks.
In fact, most of the training effort
goes to the teacher network,
which is often large in size and complex.
To address this issue, one way is to consider
adopting a trained network as the
teacher [65]. Building a shared community
for sharing trained AI models can significantly
reduce repeated model training
efforts, which can be an interesting direction
for sustainable AI.
Computation-efficient AI can lead to
compressed AI models, which improve
the environmental sustainability of AI in two
main aspects. Compressed models can be
deployed on resource-limited devices
(e.g., embedded systems) with less memory
and computational power. This also