I. Introduction
Neural networks (NNs) are critical drivers of new technologies such as image processing and speech recognition. Modern NNs have millions of trainable parameters [1], [2], demanding large amounts of memory and computational resources. This makes training difficult to perform on-chip. As a result, most hardware architectures for NNs perform training off-chip on CPUs/GPUs or in the cloud, and support only inference on the final FPGA/ASIC device [3]–[5]. Unfortunately, off-chip training yields a non-reconfigurable on-chip network that cannot support training-time optimizations over model structure and hyperparameters. This severely hinders the development of independent NN devices that a) dynamically adapt themselves to new models and data, and b) do not depend on costly, possibly insecure cloud resources or power-hungry data centers for training.