
Hardware Acceleration in Large-Scale Tensor Decomposition for Neural Network Compression



Abstract:

A tensor is a multi-dimensional array that is widely embedded in neural networks. The multiply-accumulate (MAC) operations involved in a large-scale tensor introduce high computational complexity. Since such a tensor usually features a low rank, the computational complexity can be greatly reduced through canonical polyadic decomposition (CPD). This work presents an energy-efficient hardware accelerator that implements randomized CPD on large-scale tensors for neural network compression. A mixing method that combines the Walsh-Hadamard transform and the discrete cosine transform is proposed to replace the fast Fourier transform, yielding faster convergence. It reduces the computations for the transformation by 83% and the computations for solving the required least-squares problem by 75%. The proposed accelerator is flexible enough to support the decomposition of tensors up to 512×512×9×9 in size. Compared to a prior dedicated processor for tensor computation, this work supports larger tensors and achieves 112× lower latency under the same conditions.
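The abstract does not detail the proposed mixing method, but a structured random projection of this flavor can be sketched. The snippet below is a minimal, hypothetical illustration in which random sign flips, an orthonormal Walsh-Hadamard transform, and a DCT are composed before row subsampling, standing in for an FFT-based sketch; the function name mixed_wht_dct_sketch and every detail of the construction are assumptions, not the paper's design.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import hadamard

def mixed_wht_dct_sketch(A, k, seed=None):
    """Hypothetical structured sketch: random sign flips, an orthonormal
    Walsh-Hadamard transform mixed with a DCT, then uniform row sampling.
    A stand-in illustration, not the paper's exact mixing method."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    assert n & (n - 1) == 0, "the Walsh-Hadamard transform needs a power-of-two size"
    signs = rng.choice([-1.0, 1.0], size=n)            # random sign flips
    H = hadamard(n) / np.sqrt(n)                       # orthonormal WHT matrix
    mixed = dct(H @ (signs[:, None] * A), axis=0, norm='ortho')  # WHT, then DCT
    rows = rng.choice(n, size=k, replace=False)        # keep k mixed rows
    return np.sqrt(n / k) * mixed[rows]                # rescale the subsample

# Usage: sketch a tall matrix before solving a least-squares problem on it
A = np.random.randn(256, 32)
SA = mixed_wht_dct_sketch(A, k=64, seed=0)
print(SA.shape)  # (64, 32)
```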
Date of Conference: 07-10 August 2022
Date Added to IEEE Xplore: 22 August 2022
Conference Location: Fukuoka, Japan


I. Introduction

A tensor is an array with multiple dimensions. It appears in many applications, especially in deep learning, because the structure of a neural network is inherently multi-dimensional, with dimensions contributed by the feature maps and the filters (also known as kernels). Modern neural networks usually contain a large number of parameters, and the multiply-accumulate (MAC) operations involved in a large-scale tensor introduce high computational complexity, making it challenging to deploy neural networks on resource-constrained devices. The tensors involved in these neural networks usually feature a low-rank property [1]. This property can be leveraged to compress the networks, thereby reducing their computational complexity and memory usage, as the sketch below illustrates.
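As a hedged illustration of this low-rank compression, the sketch below applies a plain alternating-least-squares (ALS) canonical polyadic decomposition to a small four-way convolution kernel. It is a generic numpy reconstruction of CPD, not the paper's randomized accelerator algorithm, and the kernel shape and rank are arbitrary assumptions chosen only to show the parameter reduction.

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Khatri-Rao product of a list of factor matrices."""
    out = mats[0]
    for m in mats[1:]:
        out = (out[:, None, :] * m[None, :, :]).reshape(-1, out.shape[1])
    return out

def cp_als(T, rank, iters=50, seed=0):
    """Plain ALS for canonical polyadic decomposition (illustrative only;
    the paper accelerates a randomized variant of this computation)."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((s, rank)) for s in T.shape]
    for _ in range(iters):
        for n in range(T.ndim):
            others = [factors[m] for m in range(T.ndim) if m != n]
            kr = khatri_rao(others)                            # product of the rest
            Tn = np.moveaxis(T, n, 0).reshape(T.shape[n], -1)  # mode-n unfolding
            factors[n] = np.linalg.lstsq(kr, Tn.T, rcond=None)[0].T
    return factors

# Compress a hypothetical conv kernel of shape out_ch x in_ch x kH x kW
K = np.random.randn(32, 16, 3, 3)
factors = cp_als(K, rank=8)
print(K.size, sum(f.size for f in factors))  # 4608 dense weights vs. 432 factor entries
```

With rank 8, the four factor matrices hold (32+16+3+3)·8 = 432 values in place of the 4608 dense kernel weights, and the MAC count of the corresponding convolution shrinks accordingly.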

