IEEE Circuits and Systems Magazine - Q3 2023 - 25

Table 6.
MCUNet outperforms existing methods for memory-efficient face detection on WIDER FACE [136] dataset. Compared
to RNNPool-Face-C [105], MCUNet-L can achieve similar mAP at 3.4× smaller peak SRAM and 1.6× smaller
computation. The model statistics are profiled on 640 × 480 RGB input images following [105].
| Method | MACs ↓ | Peak Memory (fp32) ↓ |
|---|---|---|
| EXTD [137] | 8.49G | 18.8MB (9.9×) |
| LFFD [138] | 9.25G | 18.8MB (9.9×) |
| RNNPool-Face-C [105] | 1.80G | 6.44MB (3.4×) |
| MCUNet-L | 1.10G | 1.89MB (1.0×) |
| EagleEye [139] | 0.08G | 1.17MB (1.8×) |
| RNNPool-Face-A [105] | 0.10G | 1.17MB (1.8×) |
| MCUNet-S | 0.11G | 672kB (1.0×) |
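The ratios quoted in the caption of Table 6 can be re-derived from the listed values. The short check below is only a sketch: the dictionary names are illustrative, and the numbers are copied from the MCUNet-L comparison group of the table.

```python
# Values copied from the MCUNet-L comparison group of Table 6.
peak_memory_mb = {
    "EXTD": 18.8,
    "LFFD": 18.8,
    "RNNPool-Face-C": 6.44,
    "MCUNet-L": 1.89,
}
macs_g = {"RNNPool-Face-C": 1.80, "MCUNet-L": 1.10}

# Peak-memory ratio of each method relative to MCUNet-L, to one decimal place.
baseline_mb = peak_memory_mb["MCUNet-L"]
memory_ratio = {
    name: round(mb / baseline_mb, 1) for name, mb in peak_memory_mb.items()
}

# Computation ratio of RNNPool-Face-C relative to MCUNet-L.
mac_ratio = round(macs_g["RNNPool-Face-C"] / macs_g["MCUNet-L"], 1)

for name, ratio in memory_ratio.items():
    print(f"{name}: {ratio}x the peak memory of MCUNet-L")
print(f"RNNPool-Face-C: {mac_ratio}x the MACs of MCUNet-L")
```

The computed values agree with the parenthesized ratios in the table and with the 3.4× and 1.6× figures in the caption.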
from cloud training. Tiny IoT devices (e.g., microcontrollers)
typically have limited SRAM, such as 256 KB. Such a small
memory budget is hardly enough for the inference of deep
learning models [8], [9], [44], [45], [48], [141], [142], [143],
let alone the training, which requires extra computation for
the backward pass and extra memory for intermediate
activations [109]. On the other hand, modern deep learning
training frameworks (e.g., PyTorch [89] and TensorFlow [144])
are usually designed for cloud servers and require a large
memory footprint (> 300 MB) even when training a small
model (e.g., MobileNetV2-w0.35 [4]) with batch size 1. The
huge gap (> 1000×) makes it impossible to run them on tiny
IoT devices. Furthermore, devices like microcontrollers are
bare-metal and have neither an operating system nor the
runtime support needed by existing training frameworks.
Therefore, we need to jointly design the algorithm and the
system to enable tiny on-device training.
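As a rough check of the quoted gap, taking the 300 MB framework footprint and the 256 KB SRAM budget at face value and assuming binary units (1 MB = 1024 KB, an assumption not stated in the text):

```latex
\frac{300\ \text{MB}}{256\ \text{KB}}
  = \frac{300 \times 1024\ \text{KB}}{256\ \text{KB}}
  = 1200 > 1000 .
```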
Deep learning training systems such as PyTorch [89],
TensorFlow [144], JAX [92], MXNet [91], etc., do not consider
the tight resources on edge devices. Edge deep
learning inference frameworks like TVM [93], TF-Lite
[94], NCNN [97], etc., provide a slim runtime, but lack
the support for back-propagation. There are low-cost
transfer learning algorithms, such as training only the
final classifier layer or updating only the biases [52]. However,
existing training systems cannot translate the theoretical
savings into measured savings. The downstream
accuracy of such update schemes is also low (Figure 18).
There is a need for training systems that can effectively
utilize the limited resources on edge devices.
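To make the theoretical saving of these cheap update schemes concrete, the sketch below uses a plain linear layer y = Wx + b. The bias gradient equals the output gradient, so the layer input need not be kept, whereas the weight gradient needs the saved input. The function names and the toy layer sizes are hypothetical, and the activation count ignores whatever nonlinearities must save for their own backward pass.

```python
def linear_backward(grad_y, x=None):
    """Gradients of y = W x + b for a single sample.

    grad_b needs only grad_y; grad_w additionally needs the saved input x.
    """
    grad_b = list(grad_y)
    grad_w = None
    if x is not None:
        # Outer product of the output gradient and the saved input.
        grad_w = [[g * xi for xi in x] for g in grad_y]
    return grad_w, grad_b


def saved_activation_elems(layer_input_sizes, scheme):
    """Number of layer-input elements kept for the backward pass (toy model)."""
    if scheme == "full":
        return sum(layer_input_sizes)      # every layer keeps its input
    if scheme == "last_layer":
        return layer_input_sizes[-1]       # only the classifier input is kept
    if scheme == "bias_only":
        return 0                           # bias gradients need no inputs
    raise ValueError(f"unknown scheme: {scheme}")


grad_w, grad_b = linear_backward([1.0, -2.0], x=[3.0, 4.0])
_, grad_b_only = linear_backward([1.0, -2.0])  # no input stored

sizes = [3072, 1024, 256, 64]  # hypothetical per-layer input sizes
for scheme in ("full", "last_layer", "bias_only"):
    print(scheme, saved_activation_elems(sizes, scheme))
```

Whether a runtime actually frees the unused activations is a separate question, which is the gap between theoretical and measured savings noted above.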
Table 6 (continued). mAP on WIDER FACE [136].

| Method | mAP ↑ Easy | mAP ↑ Medium | mAP ↑ Hard | mAP (≤3 faces) ↑ Easy | mAP (≤3 faces) ↑ Medium | mAP (≤3 faces) ↑ Hard |
|---|---|---|---|---|---|---|
| EXTD [137] | 0.90 | 0.88 | 0.82 | 0.93 | 0.93 | 0.91 |
| LFFD [138] | 0.91 | 0.88 | 0.77 | 0.83 | 0.83 | 0.82 |
| RNNPool-Face-C [105] | 0.92 | 0.89 | 0.70 | 0.95 | 0.94 | 0.92 |
| MCUNet-L | 0.92 | 0.90 | 0.70 | 0.94 | 0.93 | 0.92 |
| EagleEye [139] | 0.74 | 0.70 | 0.44 | 0.79 | 0.78 | 0.75 |
| RNNPool-Face-A [105] | 0.77 | 0.75 | 0.53 | 0.81 | 0.79 | 0.77 |
| MCUNet-S | 0.85 | 0.81 | 0.55 | 0.90 | 0.89 | 0.87 |

Enabling tiny on-device training through algorithm-system co-design raises two unique challenges: (1) the model is quantized on edge devices, and a real quantized graph is difficult to optimize due to mixed-precision tensors and the lack of batch normalization layers [145]; (2) the limited hardware resources (memory and computation) of tiny hardware do not allow full back-propagation, as the memory usage can easily exceed the SRAM of microcontrollers by more than an order of magnitude. To cope with these difficulties, TinyTraining proposes the following designs.
A. Quantization Aware Scaling
Neural networks usually need to be quantized to fit the limited memory of edge devices [8], [146]. For an fp32 linear layer $y_{\text{fp32}} = W_{\text{fp32}} x_{\text{fp32}} + b_{\text{fp32}}$, the int8 quantized counterpart is:

$$\bar{y}_{\text{int8}} = \text{cast2int8}\left[ s_{\text{fp32}} \cdot \left( \bar{W}_{\text{int8}} \bar{x}_{\text{int8}} + \bar{b}_{\text{int32}} \right) \right], \tag{1}$$

where $\bar{\cdot}$ denotes the tensor being quantized to fixed-point numbers, and $s$ is a floating-point scaling factor to project the results back into int8 range. The gradient update for the weights can be presented as $\bar{W}'_{\text{int8}} = \text{cast2int8}(\bar{W}_{\text{int8}} - \alpha \cdot G_{\bar{W}})$, where $\alpha$ is the learning rate and $G_{\bar{W}}$ is the gradient of the weights. After applying the gradient update, the weights are rounded back to 8-bit integers.
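The quantized forward pass of Equation (1) and the rounded weight update can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the TinyTraining implementation: nested Python lists stand in for tensors, `quantized_linear` and `sgd_step_int8` are hypothetical names, `cast2int8` is assumed to round to the nearest integer (using Python's `round`) and clamp to [-128, 127], and int32 overflow is not modeled.

```python
def cast2int8(values):
    """Round each value to the nearest integer and clamp to the int8 range."""
    return [max(-128, min(127, round(v))) for v in values]


def quantized_linear(w_int8, x_int8, b_int32, scale):
    """Equation (1): integer accumulate, add int32 bias, rescale, cast to int8."""
    acc = [
        sum(w * x for w, x in zip(row, x_int8)) + b  # wide integer accumulator
        for row, b in zip(w_int8, b_int32)
    ]
    return cast2int8([scale * a for a in acc])


def sgd_step_int8(w_int8, grad_w, lr):
    """Weight update W' = cast2int8(W - lr * G), applied row by row."""
    return [
        cast2int8([w - lr * g for w, g in zip(w_row, g_row)])
        for w_row, g_row in zip(w_int8, grad_w)
    ]


w = [[10, -20], [30, 40]]   # int8 weights
x = [5, 7]                  # int8 input
b = [106, -50]              # int32 bias
y = quantized_linear(w, x, b, scale=0.0625)

# Toy gradients (hypothetical values) expressed in int8 weight units.
w_new = sgd_step_int8(w, grad_w=[[4.0, -4.0], [0.5, 400.0]], lr=0.5)
print(y, w_new)
```

In this toy run the update of 0.25 applied to the weight 30 is lost to rounding, and the large update to the weight 40 saturates at -128, which gives a feel for why the size of the gradient relative to the int8 weights matters.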
1) Gradient Scale Mismatch: Unlike fine-tuning a floating-point
model on the cloud, training with a real quantized
graph³ is difficult: the quantized graph has tensors of
different bit-precisions (int8, int32, fp32, as shown in
Equation (1)) and lacks batch normalization [145] layers
(fused), leading to unstable gradient updates.
Optimizing a quantized graph often leads to lower
accuracy compared to the floating-point counterpart.
A possible hypothesis is that the quantization process
³ Note that this is contrary to the fake quantization graph, which is widely
used in quantization-aware training [146].