
On-Chip Communication Network for Efficient Training of Deep Convolutional Networks on Heterogeneous Manycore Systems


Abstract:

Convolutional Neural Networks (CNNs) have shown a great deal of success in diverse application domains including computer vision, speech recognition, and natural language processing. However, as the size of datasets and the depth of neural network architectures continue to grow, it is imperative to design high-performance and energy-efficient computing hardware for training CNNs. In this paper, we consider the problem of designing specialized CPU-GPU based heterogeneous manycore systems for energy-efficient training of CNNs. It has already been shown that the typical on-chip communication infrastructures employed in conventional CPU-GPU based heterogeneous manycore platforms are unable to handle both CPU and GPU communication requirements efficiently. To address this issue, we first analyze the on-chip traffic patterns that arise from the computational processes associated with training two deep CNN architectures, namely, LeNet and CDBNet, to perform image classification. By leveraging this knowledge, we design a hybrid Network-on-Chip (NoC) architecture, which consists of both wireline and wireless links, to improve the performance of CPU-GPU based heterogeneous manycore platforms running the above-mentioned CNN training workloads. The proposed NoC achieves 1.8× reduction in network latency and improves the network throughput by a factor of 2.2 for training CNNs, when compared to a highly-optimized wireline mesh NoC. For the considered CNN workloads, these network-level improvements translate into 25 percent savings in full-system energy-delay-product (EDP). This demonstrates that the proposed hybrid NoC for heterogeneous manycore architectures is capable of significantly accelerating training of CNNs while remaining energy-efficient.
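The abstract reports a 25 percent saving in full-system energy-delay-product (EDP), which is simply energy multiplied by execution time. The sketch below illustrates how such a saving is computed; the energy and delay figures are hypothetical placeholders, not values from the paper.

```python
# Energy-delay product (EDP) = total energy consumed * execution time.
# Lower is better: it rewards designs that are both fast and energy-efficient.

def edp(energy_joules: float, delay_seconds: float) -> float:
    """Full-system energy-delay product."""
    return energy_joules * delay_seconds

# Hypothetical numbers for a baseline wireline mesh NoC vs. a hybrid NoC.
baseline = edp(energy_joules=100.0, delay_seconds=10.0)
proposed = edp(energy_joules=90.0, delay_seconds=8.33)

savings = 1.0 - proposed / baseline
print(f"EDP savings: {savings:.0%}")  # roughly 25% for these made-up inputs
```

Because EDP is a product, a design that trades a small energy increase for a large delay reduction (or vice versa) can still come out ahead on this metric.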
Published in: IEEE Transactions on Computers (Volume: 67, Issue: 5, 01 May 2018)
Page(s): 672-686
Date of Publication: 27 November 2017


1 Introduction

Deep learning techniques have seen great success in diverse application domains including speech processing, computer vision, and natural language processing [1]. While the fundamental ideas of deep learning have been around since the mid-1980s [2], the two main reasons for their recent success are: 1) the availability of large-scale training data; and 2) advances in computing hardware to efficiently train large-scale neural networks using this data.

