I. Introduction
Convolutional neural networks (CNNs) have been among the most successful applications of deep learning in recent years. Because CNNs are effective at processing large images, many industries hope to deploy deep learning models on small, low-power edge devices. Typical edge devices have limited resources, with only a few hundred kilobytes to tens of megabytes of static memory, so complex models cannot be deployed on them easily. Several methods exist for compressing models, and model pruning is among the easier ones to implement.