Neural Network Compression Based on Tensor Ring Decomposition

Abstract:

Deep neural networks (DNNs) have made great breakthroughs and seen applications in many domains. However, the unmatched accuracy of DNNs comes at the cost of considerable memory consumption and high computational complexity, which restricts their deployment on conventional desktops and portable devices. To address this issue, low-rank factorization, which decomposes the neural network parameters into smaller matrices or tensors, has emerged as a promising technique for network compression. In this article, we propose leveraging the emerging tensor ring (TR) factorization to compress the neural network. We investigate the impact of both parameter tensor reshaping and TR decomposition (TRD) on the total number of compressed parameters. To achieve the maximal parameter compression, we propose an algorithm based on prime factorization that simultaneously identifies the optimal tensor reshaping and TRD. In addition, we discover that different execution orders of the core tensors result in varying computational complexities. To identify the optimal execution order, we construct a novel tree structure. Based on this structure, we propose a top-to-bottom splitting algorithm to schedule the execution of core tensors, thereby minimizing computational complexity. We have performed extensive experiments using three kinds of neural networks on three different datasets. The experimental results demonstrate that, compared with three state-of-the-art algorithms for low-rank factorization, our algorithm achieves better performance with much lower memory consumption and lower computational complexity.

Published in: IEEE Transactions on Neural Networks and Learning Systems ( Volume: 36, Issue: 3, March 2025)

Page(s): 5388 - 5402

Date of Publication: 30 April 2024

PubMed ID: 38687669

DOI: 10.1109/TNNLS.2024.3383392

I. Introduction

Recently, deep neural networks (DNNs) have made great breakthroughs and found applications in various domains, including natural language processing [1], [2], [3], semantic segmentation [4], and image and video recognition [5]. These networks typically consist of millions of parameters, require hundreds of megabytes of storage, and take considerable time to process. Moreover, each evolution in network architecture brings a further increase in the number of parameters, which raises the storage and processing requirements even more. As a result, deploying DNNs on resource-limited conventional desktops and portable devices is challenging. It is therefore critical to reduce both the number of parameters and the processing complexity for DNNs to be practical.
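
To make the scale of the possible parameter reduction concrete, the following is a minimal sketch of how a tensor ring (TR) format stores a reshaped weight tensor. It is not the article's algorithm. It relies only on the standard TR definition, in which core k has shape (r_k, n_k, r_{k+1}) and the last rank wraps around to the first. The layer size, the two reshapings, the uniform rank of 8, and the helper name `tr_param_count` are all illustrative assumptions.

```python
from math import prod


def tr_param_count(dims, ranks):
    """Count the parameters of a tensor ring format.

    Core k has shape (ranks[k], dims[k], ranks[k + 1]); the ring closes,
    so the rank following the last one is ranks[0].
    """
    if len(dims) != len(ranks):
        raise ValueError("dims and ranks must have the same length")
    d = len(dims)
    return sum(ranks[k] * dims[k] * ranks[(k + 1) % d] for k in range(d))


# Hypothetical fully connected layer with a 1024 x 1024 weight matrix.
dense_params = 1024 * 1024

# Two hypothetical reshapings holding the same number of elements.
shape_a = (32, 32, 32, 32)   # 32**4 == 2**20
shape_b = (4,) * 10          # 4**10 == 2**20
assert prod(shape_a) == dense_params
assert prod(shape_b) == dense_params

rank = 8  # assumed uniform TR rank, chosen only for illustration
count_a = tr_param_count(shape_a, [rank] * len(shape_a))
count_b = tr_param_count(shape_b, [rank] * len(shape_b))

print(dense_params, count_a, count_b)
```

With these assumed values the same 2^20 weights are represented by 8192 parameters under the first reshaping and 2560 under the second, which illustrates why the choice of reshaping matters for compression. A fixed rank does not give the same approximation accuracy under different reshapings, so the counts alone say nothing about the resulting model accuracy.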
