I. Introduction
Deep convolutional neural networks (CNNs) have achieved unprecedented success in artificial intelligence (AI) over the past decade. However, the intensive computation required even for inference makes it challenging to deploy pre-trained models on resource-constrained edge devices. The essential and computationally dominant operation in CNN models, the convolution, demands an overwhelming number of multiply-and-accumulate (MAC) operations along with excessive on-/off-chip memory accesses. It is well known that the energy bottleneck of such computation lies in data movement rather than in the arithmetic itself, a phenomenon known as the memory wall [1].
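To make the scale of the MAC workload concrete, the following sketch applies the standard per-layer MAC count formula (output height × output width × output channels × kernel height × kernel width × input channels) to a hypothetical layer shape; the specific dimensions are illustrative assumptions, not taken from the paper.

```python
def conv_macs(h_out: int, w_out: int, c_in: int, c_out: int, k: int) -> int:
    """MAC count for one 2-D convolutional layer with a k x k kernel.

    Each of the h_out * w_out * c_out output elements requires
    k * k * c_in multiply-and-accumulate operations.
    """
    return h_out * w_out * c_out * (k * k * c_in)

# Hypothetical ResNet-like layer: 3x3 kernel, 64 -> 128 channels,
# 56x56 output feature map.
macs = conv_macs(h_out=56, w_out=56, c_in=64, c_out=128, k=3)
print(f"{macs:,} MACs")  # roughly 2.3e8 MACs for this single layer
```

A single layer of this shape already requires on the order of hundreds of millions of MACs, and a full network stacks dozens of such layers, which is why data movement for weights and activations dominates the energy budget.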