I. Introduction
Coarse-Grained Reconfigurable Arrays (CGRAs) are emerging as promising accelerators that provide abundant processing elements (PEs), memory units, input/output blocks, and routing resources. The operations in the dataflow of diverse application kernels are mapped onto the CGRA and executed in a pipelined manner. At runtime, Time-Extended CGRAs (TECs) [1] reconfigure the array according to a predetermined sequence of configurations stored in the configuration memory and thereby execute the dataflow. Fig. 1 illustrates a typical ADRES architecture with a mesh structure that allows resource replication in the time dimension. A CGRA executes its set of configurations cyclically, which suits application loop kernels: the dataflow of a loop can be mapped onto a TEC and pipelined.
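To make the time-extended view concrete, the following minimal sketch enumerates the (configuration step, PE) slots obtained by replicating a mesh in the time dimension and shows which configuration is active in each cycle under cyclic execution. The 4x4 array size, the three stored configurations, and the names `Slot`, `time_extended_slots`, and `active_config` are illustrative assumptions and are not taken from the architecture in Fig. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One PE at one configuration step in the time-extended view (hypothetical model)."""
    step: int  # index of the configuration within the repeating sequence
    row: int
    col: int


def time_extended_slots(rows, cols, num_configs):
    """Replicate a rows x cols PE mesh once per stored configuration."""
    return [
        Slot(step, r, c)
        for step in range(num_configs)
        for r in range(rows)
        for c in range(cols)
    ]


def active_config(cycle, num_configs):
    """Configurations repeat cyclically, so the active one is cycle mod num_configs."""
    return cycle % num_configs


# Assumed example: a 4x4 mesh with 3 configurations in the configuration memory.
slots = time_extended_slots(4, 4, 3)
print(len(slots))                                # 48 mapping slots in total
print([active_config(t, 3) for t in range(7)])   # [0, 1, 2, 0, 1, 2, 0]
```

Under these assumptions, a mapper would place each loop operation into one such slot, and the repeating configuration index is what lets successive loop iterations overlap in a pipeline.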