34 IEEE CIRCUITS AND SYSTEMS MAGAZINE FOURTH QUARTER 2021

[Figure 8 (continued): block diagram. Labeled blocks recoverable from the figure: 10-bit speech input (work time: 80 ms/1 s); energy-based SNR unit (threshold: 15 dB); 0-average-based VAD unit; voice-activated clock gating (inactivated when SNR < 15 dB; the voice-activated signal remains "1" in any case when SNR ≥ 15 dB); windowing with Hamming coefficient ROM (0.156 KB) and SRAM #1 (0.156 KB) for storing new and loading old half-frame data; radix-2 butterfly (BF2) stages with 64 × 20b and 128 × 20b SRAM; modulo-CORDIC computing |s| = sqrt(Real² + Img²) with scale factor 0.6073; Mel filter coefficient ROM (0.625 KB); Log2 LUT; log-Mel stage; 16-bit MFCC output.]

Figure 8. Precision self-adaptive MFCC architecture with proposed R2SDF FFT and approximate multiplication and addition using Dual-Vdd. (Continued)