Swift Parameter-free Attention Network for Efficient Super-Resolution | IEEE Conference Publication | IEEE Xplore

Swift Parameter-free Attention Network for Efficient Super-Resolution



Abstract:

Single Image Super-Resolution (SISR) is a crucial task in low-level computer vision, aiming to reconstruct high-resolution images from low-resolution counterparts. Conventional attention mechanisms have significantly improved SISR performance but often result in complex network structures and a large number of parameters, leading to slow inference speed and large model size. To address this issue, we propose the Swift Parameter-free Attention Network (SPAN), a highly efficient SISR model that balances parameter count, inference speed, and image quality. SPAN employs a novel parameter-free attention mechanism, which leverages symmetric activation functions and residual connections to enhance high-contribution information and suppress redundant information. Our theoretical analysis demonstrates the effectiveness of this design in achieving the attention mechanism's purpose. We evaluate SPAN on multiple benchmarks, showing that it outperforms existing efficient super-resolution models in terms of both image quality and inference speed, achieving a significant quality-speed trade-off. This makes SPAN highly suitable for real-world applications, particularly in resource-constrained scenarios. Notably, we won first place in both the overall performance track and the runtime track of the NTIRE 2024 efficient super-resolution challenge. Our code and models are made publicly available at https://github.com/hongyuanyu/span.
Date of Conference: 17-18 June 2024
Date Added to IEEE Xplore: 27 September 2024
Conference Location: Seattle, WA, USA

1. Introduction

Single Image Super-Resolution (SISR) is a well-established task in low-level computer vision, which aims to reconstruct a high-resolution image from a single low-resolution image. This task has broad applicability in enhancing image quality across various domains [16], [37], [43], [44], [48], [49], [57]. The advent of deep learning has led to significant advancements in this field [2], [10], [12], [19], [24], [32], [34], [36], [50], [59]. Recent progress in super-resolution tasks has been largely driven by the attention mechanism. Numerous state-of-the-art super-resolution networks incorporate attention mechanisms or even employ larger vision transformers (ViTs) as the model architecture [6], [8], [20], [27], [32], [35], [42], [53], [60]. These networks emphasize key features and long-distance dependencies between patches through attention maps, capturing a wider range of contextual information to ensure continuity of details and accuracy of edge textures. However, the computational requirements of the attention mechanism, which involve complex network structures and a substantial number of additional parameters, lead to challenges such as large model size and slow inference speed. These challenges limit the applicability of these models, hindering their use in efficient, high-speed computing scenarios, such as SISR tasks on resource-constrained mobile devices.
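The parameter-free attention idea described above can be illustrated with a minimal, purely pedagogical sketch: an attention map is computed by applying a fixed symmetric activation to a residual branch, so no learnable weights are added. This is an assumption-laden toy, not SPAN's actual implementation; the choice of `sigmoid(x) - 0.5` as the symmetric activation and the use of 1-D lists in place of convolutional feature maps are illustrative simplifications.

```python
import math

def parameter_free_attention(features, residual):
    # Illustrative sketch only: SPAN's real attention operates on
    # convolutional feature maps; 1-D lists stand in for features here.
    def sym_act(v):
        # A shifted sigmoid is symmetric about the origin
        # (sym_act(-v) == -sym_act(v)), so deviations of either sign
        # are weighted alike and zero activations map to zero attention.
        return 1.0 / (1.0 + math.exp(-v)) - 0.5

    # The attention map is computed with no learnable parameters,
    # then rescales the feature branch element-wise.
    attn = [sym_act(v) for v in residual]
    return [f * a for f, a in zip(features, attn)]
```

In this toy, large-magnitude responses in the residual branch receive weights approaching ±0.5, while near-zero (redundant) responses are suppressed toward zero, mirroring the stated goal of enhancing high-contribution information without adding parameters or inference-time overhead beyond element-wise operations.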

