An Architecture-Agnostic Dataflow Mapping Framework on CGRA | IEEE Conference Publication | IEEE Xplore

An Architecture-Agnostic Dataflow Mapping Framework on CGRA


Abstract:

Coarse-Grained Reconfigurable Arrays (CGRAs) are promising accelerators in the rapidly evolving field of high-performance computing (HPC). However, their potential is limited by the inability of compilers to efficiently map complex application kernels to architectures. In this paper, we propose an architecture-agnostic mapping framework called AGILE, which has a loosely coupled flow comprising architecture modeling and dataflow intermediate representation (IR) generation, hierarchical mapping, and design space exploration (DSE). We extend mapping to an end-to-end flow for better evaluation of architectures; flexible modeling and IR allow for adapting to and exploring various architectures; and a hierarchical dataflow mapping methodology enables better evaluation of proposed architectures. The experiments show that our partitioning algorithm adapts to large-scale dataflow graphs (DFGs) and that divide-and-conquer reduces the problem size. On the baseline CGRA, our framework improves throughput by 1.86× and 1.4× compared to CGRA-ME and LISA, respectively, and accelerates mapping significantly, being 133× and 33.3× faster. On the Hycube architecture, we improve utilization and throughput by 4× compared to Morpher. Moreover, the DSE of a range of architectures demonstrates the effectiveness of our approach.
Date of Conference: 27-31 May 2024
Date Added to IEEE Xplore: 26 July 2024
Conference Location: San Francisco, CA, USA


I. Introduction

Coarse-Grained Reconfigurable Arrays (CGRAs) are emerging as promising accelerators endowed with a wealth of processing elements (PEs), memory units, input/output blocks, and routing resources. The operations in the dataflows of diverse application kernels are mapped onto the CGRA and executed in a pipelined manner. At runtime, Time-Extended CGRAs (TECs) [1] ensure that the predetermined sequence of configurations stored in the configuration memory can reconfigure the array and execute the dataflow. Fig. 1 illustrates a typical ADRES architecture with a mesh structure that allows resource replication in the time dimension. A CGRA cyclically executes a set of configurations, which is well suited to application loop kernels: the dataflow of a loop can be mapped onto a TEC for pipelining.
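The idea of replicating mesh resources in the time dimension and cycling through a fixed set of configurations can be illustrated with a minimal Python sketch. It is not AGILE's architecture model: the function names, the 4-neighbour connectivity, the one-cycle hop latency, and the parameter `ii` (the number of stored configurations, often called the initiation interval) are all assumptions made for illustration.

```python
from itertools import product


def time_extended_mesh(rows, cols, ii):
    """Replicate a rows x cols PE mesh over `ii` time steps (hypothetical model).

    Each node is (t, r, c): PE (r, c) under configuration t.
    A value produced at time step t can be consumed at step (t + 1) % ii
    by the same PE or by one of its 4-connected mesh neighbours.
    The wrap-around models the cyclic reuse of the configurations.
    """
    edges = {}
    for t, r, c in product(range(ii), range(rows), range(cols)):
        nxt = (t + 1) % ii
        targets = [(nxt, r, c)]  # the PE keeps the value for itself
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                targets.append((nxt, r2, c2))
        edges[(t, r, c)] = targets
    return edges


def config_slot(cycle, ii):
    """Configuration index active in a given absolute clock cycle."""
    return cycle % ii


# A 2 x 2 mesh with two stored configurations.
graph = time_extended_mesh(rows=2, cols=2, ii=2)
```

In this sketch, mapping a loop kernel would amount to assigning each operation to a node `(t, r, c)` and each data dependency to a path through `graph`; `config_slot` shows why an operation placed at time step `t` repeats every `ii` cycles, which is what pipelines successive loop iterations.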

