Abstract:
Coarse-Grained Reconfigurable Arrays (CGRAs) are promising accelerators in the rapidly evolving field of high-performance computing (HPC). However, their potential is limited by the inability of compilers to efficiently map complex application kernels to architectures. In this paper, we propose an architecture-agnostic mapping framework called AGILE, which has a loosely coupled flow comprising architecture modeling and dataflow intermediate representation (IR) generation, hierarchical mapping, and design space exploration (DSE). We extend mapping to an end-to-end flow for better evaluation of architectures; flexible modeling and IR allow adapting to and exploring various architectures; and the hierarchical dataflow mapping methodology enables better evaluation of proposed architectures. Experiments show that our partitioning algorithm adapts to large-scale dataflow graphs (DFGs) and that the divide-and-conquer approach reduces the problem size. On the baseline CGRA, our framework achieves 1.86× and 1.4× throughput improvements compared to CGRA-ME and LISA, respectively, and maps 133× and 33.3× faster. On the HyCUBE architecture, we improve utilization and throughput by 4× compared to Morpher. Moreover, DSE over a range of architectures demonstrates the effectiveness of our approach.
Published in: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Date of Conference: 27-31 May 2024
Date Added to IEEE Xplore: 26 July 2024
- IEEE Keywords
- Index Terms
- Functional Framework
- Model Architecture
- Flexible Model
- High-performance Computing
- Design Space
- Intermediate Representation
- Partitioning Algorithm
- Hierarchical Map
- Throughput Improvement
- Design Space Exploration
- Benchmark
- Multi-objective Optimization
- Simulated Annealing
- Mapping Algorithm
- Comparative Mapping
- Bayesian Optimization
- Simulated Annealing Algorithm
- Scheduling Algorithm
- Processes In Place
- Graph Partitioning
- Memory Operations
- Clock Cycles
- Routing Algorithm
- Divide-and-conquer Approach
- Pareto Optimal Set
- Partitioning Results
- Depth-first
- Current Cycle
- Memory Unit