
MC2-RAM: An In-8T-SRAM Computing Macro Featuring Multi-Bit Charge-Domain Computing and ADC-Reduction Weight Encoding



Abstract:

In-memory computing (IMC) is a promising hardware architecture for circumventing the memory wall in data-intensive applications such as deep learning. Among various memory technologies, static random-access memory (SRAM) stands out thanks to its high computing accuracy, reliability, and scalability to advanced technology nodes. This paper presents a novel multi-bit capacitive convolution in-SRAM computing macro for high-accuracy, high-throughput, and high-efficiency deep learning inference. It realizes fully parallel charge-domain multiply-and-accumulate (MAC) within compact 8-transistor 1-capacitor (8T1C) SRAM arrays whose cells are only 41% larger than standard 6T cells. It performs MAC with multi-bit activations without the conventional digital bit-serial shift-and-add scheme, drastically improving throughput for high-precision CNN models. An ADC-reduction encoding scheme complements the compact SRAM design by halving the number of required ADCs for energy and area savings. A 576×130 macro with 64 ADCs is evaluated in 65nm with post-layout simulations, showing 4.60 TOPS/mm² compute density and 59.7 TOPS/W energy efficiency with 4/4-bit activations/weights. MC2-RAM also achieves excellent linearity, with an output-voltage standard deviation of only 0.14 mV (4.5% of the LSB) in Monte Carlo simulations.
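To make the charge-domain MAC idea concrete, the following is a behavioral sketch (not the paper's circuit) of capacitive charge sharing: each cell's capacitor is charged to a voltage proportional to its local activation-weight product, and shorting equal capacitors together averages those voltages, so the shared-line voltage encodes the scaled dot product. Unit capacitors, zero parasitics, and the `V_LSB` value are illustrative assumptions.

```python
import numpy as np

# Behavioral model of charge-domain MAC via capacitive charge sharing.
# Idealized assumptions (not from the paper): unit capacitors, no parasitics,
# and each cell capacitor charged to V_LSB * (a_i * w_i).
V_LSB = 0.003  # hypothetical LSB step in volts

def charge_domain_mac(activations, weights):
    # Per-cell voltages proportional to the local products a_i * w_i.
    v_cells = V_LSB * np.asarray(activations, dtype=float) * np.asarray(weights, dtype=float)
    # Shorting N equal capacitors averages their voltages, so the
    # shared line settles to (V_LSB / N) * sum(a_i * w_i).
    return v_cells.mean()

a = np.random.randint(0, 16, size=576)  # 4-bit activations
w = np.random.randint(0, 16, size=576)  # 4-bit weights (unsigned, for simplicity)
v_out = charge_domain_mac(a, w)
digital_mac = int(np.dot(a, w))
# In this ideal model, v_out * len(a) / V_LSB recovers the exact dot product.
```

In the real macro the averaged voltage is digitized by the shared ADCs, and nonidealities (capacitor mismatch, parasitics) perturb `v_out`; the abstract's 0.14 mV output-voltage deviation quantifies exactly that effect.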
Date of Conference: 26-28 July 2021
Date Added to IEEE Xplore: 04 August 2021
Conference Location: Boston, MA, USA

I. Introduction

Deep convolutional neural networks (CNNs) have achieved unprecedented success in artificial intelligence (AI) over the past decade. However, the intensive computation required even for inference makes it challenging to deploy pre-trained models on resource-constrained edge devices. The essential and computationally dominant operation in CNN models, the convolution, requires an overwhelming number of multiply-and-accumulate (MAC) operations with excessive on-/off-chip memory accesses. It is well known that the energy bottleneck of such computation lies in data movement rather than in the arithmetic itself, leading to the so-called memory wall [1].
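For a rough sense of scale, the MAC count of a single convolutional layer grows with the product of output resolution, kernel size, and channel counts. A minimal sketch (the layer dimensions below are illustrative, not taken from the paper):

```python
# Each of the h_out * w_out * c_out output activations of a convolution
# requires k * k * c_in multiply-and-accumulate operations.
def conv_macs(h_out, w_out, c_out, k, c_in):
    return h_out * w_out * c_out * k * k * c_in

# Example: a 3x3 convolution mapping 64 to 128 channels on a 56x56 feature map.
macs = conv_macs(56, 56, 128, 3, 64)
print(macs)  # 231211008 -- over 200 million MACs for this single layer
```

Multiplying such per-layer counts across the dozens of layers in a modern CNN, and noting that each MAC also implies operand fetches from memory, shows why data movement rather than arithmetic dominates the energy cost.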

References
1. M. Horowitz, "1.1 Computing's energy problem (and what we can do about it)", 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pp. 10-14, 2014.
2. N. Verma, H. Jia, H. Valavi, Y. Tang, M. Ozatay, L.-Y. Chen, et al., "In-memory computing: Advances and prospects", IEEE Solid-State Circuits Magazine, vol. 11, no. 3, pp. 43-55, 2019.
3. V. Joshi, M. Le Gallo, S. Haefeli, I. Boybat, S. R. Nandakumar, C. Piveteau, et al., "Accurate deep neural network inference using computational phase-change memory", Nature Communications, vol. 11, no. 1, pp. 1-13, 2020.
4. P. Yao, H. Wu, B. Gao, J. Tang, Q. Zhang, W. Zhang, et al., "Fully hardware-implemented memristor convolutional neural network", Nature, vol. 577, no. 7792, pp. 641-646, 2020.
5. M.-H. Wu, M.-S. Huang, Z. Zhu, F.-X. Liang, M.-C. Hong, J. Deng, et al., "Compact probabilistic poisson neuron based on back-hopping oscillation in STT-MRAM for all-spin deep spiking neural network", IEEE Symposium on VLSI Technology (VLSI), June 2020.
6. H. Valavi, P. J. Ramadge, E. Nestler and N. Verma, "A 64-Tile 2.4-Mb In-Memory-Computing CNN Accelerator Employing Charge-Domain Compute", IEEE Journal of Solid-State Circuits (JSSC), vol. 54, pp. 1789-1799, June 2019.
7. X. Si, J.-J. Chen, Y.-N. Tu, W.-H. Huang, J.-H. Wang, Y.-C. Chiu, W.-C. Wei, S.-Y. Wu, X. Sun, R. Liu, et al., "A twin-8T SRAM computation-in-memory unit-macro for multibit CNN-based AI edge processors", IEEE Journal of Solid-State Circuits (JSSC), vol. 55, no. 1, pp. 189-202, 2019.
8. A. Biswas and A. P. Chandrakasan, "CONV-SRAM: An energy-efficient SRAM with in-memory dot-product computation for low-power convolutional neural networks", IEEE Journal of Solid-State Circuits, vol. 54, no. 1, pp. 217-230, 2018.
9. Z. Jiang, S. Yin, J.-S. Seo and M. Seok, "C3SRAM: An in-memory-computing SRAM macro based on robust capacitive coupling computing mechanism", IEEE Journal of Solid-State Circuits, vol. 55, no. 7, pp. 1888-1897, 2020.
10. S. K. Gonugondla, M. Kang and N. Shanbhag, "A 42pJ/decision 3.12 TOPS/W robust in-memory machine learning classifier with on-chip training", 2018 IEEE International Solid-State Circuits Conference (ISSCC), pp. 490-492, 2018.
11. H. Jia, H. Valavi, Y. Tang, J. Zhang and N. Verma, "A programmable heterogeneous microprocessor based on bit-scalable in-memory computing", IEEE Journal of Solid-State Circuits (JSSC), 2020.
12. X. Si et al., "15.5 A 28nm 64Kb 6T SRAM computing-in-memory macro with 8b MAC operation for AI edge chips", 2020 IEEE International Solid-State Circuits Conference (ISSCC), pp. 246-248, Feb. 2020.
13. J. Yue et al., "14.3 A 65nm computing-in-memory-based CNN processor with 2.9-to-35.8 TOPS/W system energy efficiency using dynamic-sparsity performance-scaling architecture and energy-efficient inter/intra-macro data reuse", IEEE International Solid-State Circuits Conference (ISSCC), pp. 234-236, 2020.
14. H. Kim, T. Yoo, T. T.-H. Kim and B. Kim, "Colonnade: A reconfigurable SRAM-based digital bit-serial compute-in-memory macro for processing neural networks", IEEE Journal of Solid-State Circuits (JSSC), 2021.
15. H. Jia, M. Ozatay, Y. Tang, H. Valavi, R. Pathak, J. Lee, et al., "15.1 A programmable neural-network inference accelerator based on scalable in-memory computing", IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, pp. 236-238, 2021.
16. Q. Dong, M. E. Sinangil, B. Erbagci, D. Sun, W. Khwa, H. Liao, et al., "15.3 A 351TOPS/W and 372.4GOPS compute-in-memory SRAM macro in 7nm FinFET CMOS for machine-learning applications", 2020 IEEE International Solid-State Circuits Conference (ISSCC), pp. 242-244, Feb. 2020.
17. C.-X. Xue et al., "15.4 A 22nm 2Mb ReRAM compute-in-memory macro with 121-28TOPS/W for multibit MAC computing for tiny AI edge devices", IEEE International Solid-State Circuits Conference (ISSCC), pp. 244-246, 2020.
18. M. Kang, M.-S. Keel, N. R. Shanbhag, S. Eilert and K. Curewitz, "An energy-efficient VLSI architecture for pattern recognition via deep embedding of computation in SRAM", 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8326-8330, May 2014.
19. J. Zhang, Z. Wang and N. Verma, "In-memory computation of a machine-learning classifier in a standard 6T SRAM array", IEEE Journal of Solid-State Circuits, vol. 52, no. 4, pp. 915-924, 2017.
20. Z. Chen, Z. Yu, Q. Jin, Y. He, J. Wang, S. Lin, et al., "CAP-RAM: A charge-domain in-memory computing 6T-SRAM for accurate and precision-programmable CNN inference", IEEE Journal of Solid-State Circuits (JSSC), pp. 1924-1935, 2021.
21. B. Razavi, "The current-steering DAC [a circuit for all seasons]", IEEE Solid-State Circuits Magazine, vol. 10, no. 1, pp. 11-15, 2018.
22. K. D. Choo, J. Bell and M. P. Flynn, "Area-efficient 1GS/s 6b SAR ADC with charge-injection-cell-based DAC", 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 460-461, 2016.
