I. Introduction
In recent years, academia and industry have proposed a large number of customized spatial accelerators to handle various tensor computations efficiently [1][2][3][4]. Although these accelerators share similar overall functionality, their architectural details, performance, and power consumption can vary significantly depending on how tasks are allocated and in what order they are executed. This combination of task allocation and execution order is referred to as the dataflow, and finding a suitable dataflow is a key issue in the design and deployment of spatial accelerators.

Existing accelerators adopt a variety of dataflows. TPU [2] uses a weight-stationary dataflow on a systolic array, which offers the versatility to support multiple applications. Cambricon [3] employs an H-tree broadcast dataflow to reduce congestion and power consumption in long-distance data transfers. Eyeriss [1] introduces a row-stationary dataflow tailored to convolution operations, maximizing weight reuse and minimizing power consumption. Magnet [4] proposes a local output-stationary dataflow and a local weight-stationary dataflow, which exploit data reuse across multiple levels of the memory hierarchy by keeping data resident in local caches.
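To make the notion of dataflow concrete, the following minimal sketch contrasts two loop orderings of the same dense matrix multiplication C = A × W: a weight-stationary schedule, in which a weight element is held fixed while all partial products that need it are generated, and an output-stationary schedule, in which each output element is accumulated to completion before moving on. The NumPy formulation and function names are illustrative assumptions of ours and are not taken from any of the cited designs; on real hardware the "stationary" loop levels correspond to data pinned in processing-element registers rather than software loops.

```python
import numpy as np

def matmul_weight_stationary(A, W):
    """Weight-stationary schedule (illustrative): each weight W[k, n] is read
    once and held fixed while every partial product that uses it is computed."""
    M, K = A.shape
    K2, N = W.shape
    assert K == K2
    C = np.zeros((M, N))
    for k in range(K):
        for n in range(N):
            w = W[k, n]            # weight stays "resident" for the inner loop
            for m in range(M):
                C[m, n] += A[m, k] * w   # partial sums stream in and out
    return C

def matmul_output_stationary(A, W):
    """Output-stationary schedule (illustrative): each output C[m, n] is
    accumulated to completion locally, so partial sums are never written back
    until they are final."""
    M, K = A.shape
    _, N = W.shape
    C = np.zeros((M, N))
    for m in range(M):
        for n in range(N):
            acc = 0.0              # partial sum stays local until fully reduced
            for k in range(K):
                acc += A[m, k] * W[k, n]   # weights and inputs stream past
            C[m, n] = acc
    return C

# Both schedules compute the same result; they differ only in which operand
# is kept stationary and therefore in which data movements are amortized.
A = np.random.rand(4, 8)
W = np.random.rand(8, 3)
assert np.allclose(matmul_weight_stationary(A, W), matmul_output_stationary(A, W))
```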