I. Introduction
As DNNs become deeper and more complicated, the number of operations and parameters also increases significantly. For example, VGG-16 [1], a representative DNN, has 138 million parameters, which amount to 138 MB of storage with 8-bit precision for weights and activations. As a result, data movement between the computing units and the memory units becomes the bottleneck. In particular, expensive off-chip DRAM accesses occur frequently due to the limited on-chip SRAM buffer size for DNN workloads. There have been many research efforts on the design of application-specific integrated circuit (ASIC) accelerators such as Eyeriss [2] and TPU [3], where the parameters are stored in a global buffer and the computation is still performed by digital multiply-and-accumulate (MAC) arrays.

Compute-in-memory (CIM) is an efficient paradigm to address the memory wall problem in DNN hardware acceleration [4]. The convolution operation essentially consists of vector-matrix multiplications (VMMs), which take up most of the computation in DNNs. The crossbar structure supports analog VMM by activating multiple rows and performing current summation along the bit lines (BLs), as sketched in the formulation at the end of this section. Emerging non-volatile memories (eNVMs) such as phase change memory (PCM) [5] and resistive random-access memory (RRAM) [6] provide promising solutions for designing CIM-based accelerators owing to their smaller cell size than SRAM at the same technology node.

Although these eNVM-based CIM architectures are promising, grand challenges remain in designing a practical CIM accelerator that supports both training and inference. First, most of the CIM architectures proposed so far, such as PRIME [7] and ISAAC [8], support inference only; the data flow for training with CIM is largely unexplored. Second, the impact of ADC resolution on training/inference accuracy is rarely examined. Third, the asymmetric and nonlinear conductance tuning of eNVM cells introduces significant training accuracy loss [9], making it difficult to utilize their multilevel states for in-situ training.
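As a brief illustration of the crossbar VMM mentioned above (the notation here is introduced only for exposition and is not tied to any specific architecture cited in this section), applying read voltages on the word lines and sensing the summed currents on the bit lines realizes a vector-matrix product in the analog domain:

\[
I_j = \sum_{i} G_{ij} \, V_i ,
\]

where $V_i$ is the read voltage applied on word line $i$ (encoding the input activation), $G_{ij}$ is the programmed conductance of the eNVM cell at the crossing of word line $i$ and bit line $j$ (encoding the weight), and $I_j$ is the current accumulated along bit line $j$ by Kirchhoff's current law, which is subsequently digitized by an ADC.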