I. Introduction
Massive industrial and academic research efforts aim to continuously enhance the performance and energy efficiency of many-core processors. Networks-on-chip (NoCs) are the standard choice for on-chip communication in these processors, which are used in servers [1], [2], [3], [4] and personal computers [5] and execute a wide range of applications, including deep learning [6], [7], [8], [9], [10]. Because modern applications exchange immense amounts of data between processing cores (e.g., 2.7 TB for a graph processing application), NoC performance is critical to overall system performance and to pre-silicon evaluation [6], [7]. Although cycle-accurate NoC simulations provide accurate analysis, they consume up to 60% of the full-system simulation time [11]. Analytical modeling of NoC performance is a promising alternative to cycle-accurate simulation for speeding up performance evaluation. Besides providing fast NoC performance estimates, analytical models must be accurate and general enough to capture the important aspects of industrial many-core processors across different generations.