
Global Static Pruning via Adaptive Sample Complexity Awareness



Abstract:

Dynamic pruning leverages the feature information of each input sample to dynamically adjust the network structure, generating multiple subnetworks suited to different sample complexities. However, it inevitably introduces higher computational complexity and increased memory consumption. In addition, complex multi-stage pipelines are required to counteract the performance degradation caused by pruning. In this paper, a simple yet effective global static pruning method based on Adaptive Sample Complexity Awareness, called ASCA, is proposed, which achieves model compression without pre-training or fine-tuning. Specifically, an adaptive sample complexity-aware static pruning method is proposed, which leverages the task loss to guide the network in enhancing or suppressing the feature learning of samples of varying complexity. Then, a new mask binarization loss is proposed to automatically distinguish important from unimportant channels, avoiding the impact of hand-crafted thresholds on pruning performance. Extensive experiments demonstrate that ASCA outperforms state-of-the-art pruning methods on the CIFAR-10 and ImageNet datasets.
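The page does not give the exact form of the mask binarization loss, so the following is only a minimal PyTorch sketch of one common way to realize such an objective: a penalty of the form m(1 − m), which is zero only at the binary endpoints and so pushes each soft channel mask toward a hard keep/prune decision without a hand-crafted threshold. The function name `mask_binarization_loss` and the sigmoid-parameterized masks are illustrative assumptions, not the paper's implementation.

```python
import torch

def mask_binarization_loss(masks: torch.Tensor) -> torch.Tensor:
    """Push soft channel masks m in [0, 1] toward 0 or 1.

    The penalty m * (1 - m) vanishes exactly at the binary
    endpoints and peaks at m = 0.5, so minimizing it drives each
    mask toward a hard keep/prune decision.
    """
    return (masks * (1.0 - masks)).mean()

# Example: learnable per-channel masks for a 64-channel layer,
# squashed into [0, 1] with a sigmoid (an assumed parameterization).
logits = torch.randn(64, requires_grad=True)
masks = torch.sigmoid(logits)
loss = mask_binarization_loss(masks)
loss.backward()
```

Under an objective of this kind, channels whose masks converge to 0 can be removed outright, so the important/unimportant split emerges from training rather than from a manually chosen threshold.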
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
Conference Location: Hyderabad, India


I. Introduction

Convolutional neural networks (CNNs) have achieved unprecedented success and demonstrated state-of-the-art performance in various domains. However, as the performance of CNNs has rapidly improved, their demand for computational and memory resources has also increased, limiting their deployment on resource-constrained embedded devices. Therefore, model compression and acceleration techniques have been proposed, including knowledge distillation [1], quantization [2], network architecture search [3], and channel pruning [4]–[6]. Among them, channel pruning has gained widespread attention due to its ability to achieve hardware acceleration without requiring specialized acceleration libraries.
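To make the last point concrete: pruning whole channels leaves behind a smaller but still dense layer, so the compressed model runs on standard convolution kernels with no sparse-acceleration library. Below is a minimal PyTorch sketch assuming a simple L1-norm keep criterion (a common baseline, not the criterion proposed in this paper); `prune_conv_channels` is a hypothetical helper.

```python
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep: torch.Tensor) -> nn.Conv2d:
    """Build a smaller dense Conv2d keeping only the output channels
    indexed by `keep`. The result is an ordinary dense layer."""
    pruned = nn.Conv2d(conv.in_channels, len(keep),
                       kernel_size=conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned

# Keep the 32 output channels with the largest L1 weight norm.
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
norms = conv.weight.abs().sum(dim=(1, 2, 3))
keep = norms.topk(32).indices
small = prune_conv_channels(conv, keep)
```

Note that this slices only the output channels of a single layer; in a full network, the input channels of the next layer (and any intervening BatchNorm parameters) must be sliced to match.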

