Abstract:
Recently, many plug-and-play self-attention modules (SAMs) have been proposed to enhance model performance by exploiting the internal information of deep convolutional neural networks. However, most SAMs are attached individually to every block of the backbone as a matter of course, so the extra computational cost and parameter count grow with network depth. To address this issue, we first explore, empirically and theoretically, the Lottery Ticket Hypothesis for Self-attention Networks (LTH4SA): a full self-attention network contains a subnetwork with sparse self-attention connections that can (1) accelerate inference, (2) reduce the extra parameter increment, and (3) maintain accuracy. Furthermore, we propose a simple yet effective policy-gradient-based baseline method to search for the ticket, i.e., a connection scheme that satisfies the three conditions above. Extensive experiments on widely used benchmarks and popular self-attention networks demonstrate the effectiveness of our method. Moreover, our experiments show that the searched ticket transfers to other vision tasks, e.g., crowd counting and segmentation. Code: https://github.com/gbup-group/EAN-efficient-attention-network
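To make the search procedure concrete, below is a minimal sketch of a policy-gradient (REINFORCE) search over binary "attach a SAM to this block or not" decisions, which is the kind of connection-scheme search the abstract describes. This is not the authors' implementation: the Bernoulli policy parameterization, `num_blocks`, `reward_fn`, the learning rate, and the moving-average baseline are all illustrative assumptions; in the paper's setting the reward would come from validation accuracy traded off against the extra cost of the attached SAMs.

```python
# Sketch: REINFORCE over a binary connection scheme for M backbone blocks.
# All hyperparameters and the reward are placeholder assumptions.
import torch

num_blocks = 16                                       # M blocks in the backbone (assumed)
logits = torch.zeros(num_blocks, requires_grad=True)  # policy parameters, one per block
optimizer = torch.optim.Adam([logits], lr=0.1)

def reward_fn(mask: torch.Tensor) -> torch.Tensor:
    # Placeholder reward so the sketch runs end to end. A real reward would
    # evaluate validation accuracy of the backbone with SAMs attached only
    # where mask == 1, penalized by the number of attached modules.
    sparsity_penalty = mask.sum() / num_blocks
    return torch.rand(()) - 0.5 * sparsity_penalty

baseline = 0.0  # moving-average baseline for variance reduction
for step in range(100):
    dist = torch.distributions.Bernoulli(logits=logits)
    mask = dist.sample()                              # one candidate connection scheme
    reward = reward_fn(mask)
    baseline = 0.9 * baseline + 0.1 * reward.item()
    # REINFORCE: maximize E[reward] by minimizing -(reward - baseline) * log p(mask)
    loss = -(reward.item() - baseline) * dist.log_prob(mask).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

ticket = (torch.sigmoid(logits) > 0.5).int()          # final sparse connection scheme
print("searched ticket:", ticket.tolist())
```

The searched `ticket` plays the role of the connection scheme: blocks marked 1 keep their SAM, blocks marked 0 skip it, which is how the subnetwork avoids the per-block cost of a full self-attention network.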
Date of Conference: 15-19 July 2024
Date Added to IEEE Xplore: 30 September 2024
Index Terms:
- Convolutional Neural Network
- Residual Convolutional Neural Network
- Computational Cost
- Deep Network
- Deep Convolutional Neural Network
- Baseline Methods
- Vision Tasks
- Sparse Connectivity
- Self-attention Module
- Connection Scheme
- Self-attention Network
- Search Space
- Search Method
- Attention Network
- Semantic Segmentation
- Extra Cost
- Validation Accuracy
- Original Network
- Appendix For Details
- Vector Of Ones
- Neural Architecture Search
- M Blocks
- Policy Gradient
- People Counting
- Extra Computational Cost
- Network Pruning