1. Introduction
The heterogeneous architecture of edge system-on-chips (SoCs) poses new challenges for neural architecture search (NAS) [1]-[4]. Beyond pursuing high accuracy, NAS must also contend with heterogeneity-induced pipeline imbalance and a rapidly expanding search space. First, the large differences in throughput and energy efficiency among heterogeneous units lead to low hardware utilization when an SoC executes a network as a pipeline, referred to as imbalanced utilization. The computational demand varies from layer to layer, whereas the hardware characteristics are fixed at design time. When network layers are mapped to different heterogeneous units for pipelined execution, the mismatch between each unit's execution speed and the rate at which data arrives degrades pipeline performance. Second, resource heterogeneity sharply increases the dimensionality of NAS and causes the search space to grow explosively, referred to as space explosion [5]. For a network with N layers and S deployable heterogeneous units, evaluating deployability expands the search space by a factor of S^N, because each layer can be mapped to any of the S units.
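Both effects can be illustrated with a minimal sketch. It is a toy model, not the method of this paper, and rests on stated assumptions: each unit acts as one pipeline stage, a layer's latency on a unit is its cost divided by that unit's speed, and steady-state throughput is set by the slowest stage. The function names, layer costs, and unit speeds are all hypothetical.

```python
from itertools import product


def count_mappings(num_layers: int, num_units: int) -> int:
    """Number of layer-to-unit mappings: each of N layers picks one of S units."""
    return num_units ** num_layers


def enumerate_mappings(num_layers: int, num_units: int):
    """Yield every mapping as a tuple whose i-th entry is the unit of layer i."""
    return product(range(num_units), repeat=num_layers)


def pipeline_utilization(layer_costs, unit_speeds, mapping):
    """Per-unit utilization under the toy model (assumption, see lead-in).

    A unit's stage time is the summed latency of the layers mapped to it.
    The slowest stage is the bottleneck; every other unit idles for the
    remainder of each pipeline period.
    """
    stage_time = [0.0] * len(unit_speeds)
    for cost, unit in zip(layer_costs, mapping):
        stage_time[unit] += cost / unit_speeds[unit]
    bottleneck = max(stage_time)
    return [t / bottleneck for t in stage_time]


if __name__ == "__main__":
    layer_costs = [4.0, 2.0, 2.0]  # hypothetical per-layer compute demand
    unit_speeds = [2.0, 1.0]       # hypothetical throughput of two units

    # Space explosion: S^N candidate mappings for a fixed architecture.
    print(count_mappings(3, 2))    # 8
    print(count_mappings(20, 4))   # 1099511627776

    # Imbalanced utilization: the fast unit idles half of every period.
    print(pipeline_utilization(layer_costs, unit_speeds, (0, 1, 1)))  # [0.5, 1.0]
```

Even in this small case the faster unit is busy only half the time under the mapping shown, and exhaustive enumeration is already impractical at 20 layers and 4 units.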