1. Introduction
Deep convolutional neural networks (CNNs) have been successfully used in various computer vision applications such as image classification [24,11], object detection [21] and semantic segmentation [15]. However, running most of the widely used CNNs requires heavy computation and storage, so they can typically be deployed only on PCs with modern GPU cards. For example, processing one image with VGGNet [24] demands over 500MB of memory and a very large number of multiplications, which makes it almost impossible to deploy on edge devices such as autonomous cars and micro robots. Although these pre-trained CNNs have a large number of parameters, Han et al. [6] showed that discarding over 85% of the weights in a given neural network would not obviously damage its performance, which demonstrates that there is significant redundancy in these CNNs.