I. Introduction
Networks-on-Chip (NoCs) are a key component of almost all chips today. Their application domains range from (i) many-core chips in HPC supercomputers and high-end servers with tens to hundreds of homogeneous cores [1]–[3], to (ii) mobile and embedded SoCs with tens of heterogeneous cores and controllers [4], [5], to (iii) GPUs with hundreds of SMs [6], to (iv) domain-specific accelerators, such as those for machine learning, with hundreds of processing elements (PEs) [7], [8]. Without loss of generality, we refer to endpoint cores, accelerators, PEs, caches, etc. as “IPs” in this work. The NoC is the interconnect backbone that connects the IPs communicating with each other, and it is itself a critical IP block for plug-and-play designs. As the number of IPs in a hardware system increases, the communication fabric must also scale so that it does not become a performance or energy bottleneck.