Figure 8. Simulated ADC output with offset based on the sense pass rate for different W/L and retraining accuracy curves for Flash-ADC and SAR-ADC. Panels (a)-(f): box plots of Actual ADC Output With Offset versus Ideal ADC Output for Flash ADC and SAR ADC at W/L = 2 and W/L = 4 (legend: 25%~75%, Range Within 1.5 IQR, Median Line, Mean), and Accuracy versus Finetune Iteration for Flash ADC and SAR ADC with 2-bit weights (1 epoch ~ 250 iterations; W/L = 2, 3, 4; Training Accuracy and Testing Accuracy, the latter ~91%).

THIRD QUARTER 2021 IEEE CIRCUITS AND SYSTEMS MAGAZINE 45

Most existing CIM architectural designs assume sufficient on-chip memory capacity. However, their exceptionally high area cost prevents them from being deployed on edge devices, where the area budget is limited. For example, it is estimated that, at the 7 nm/22 nm node, storing the entire 7.9 MB DenseNet-121 model on-chip requires