I. Introduction
Deep learning has been successfully applied to many applications and even demonstrates beyond-human capability in some cases, such as image recognition. Convolutional neural network (CNN), which mainly consists of convolutional layers and fully-connected layers, is one of the most commonly used models for deep learning. Generally, training for CNNs is conducted in the cloud and inference can be performed on edge devices. Specialized processors for CNN inference have been well developed [1]–[4].