
Transformer for Single Image Super-Resolution



Abstract:

Single image super-resolution (SISR) has witnessed great strides with the development of deep learning. However, most existing studies focus on building more complex networks with a massive number of layers. Recently, more and more researchers have started to explore the application of Transformers in computer vision tasks. However, the heavy computational cost and high GPU memory occupation of the vision Transformer cannot be ignored. In this paper, we propose a novel Efficient Super-Resolution Transformer (ESRT) for SISR. ESRT is a hybrid model, which consists of a Lightweight CNN Backbone (LCB) and a Lightweight Transformer Backbone (LTB). Among them, LCB can dynamically adjust the size of the feature map to extract deep features at a low computational cost. LTB is composed of a series of Efficient Transformers (ET), which occupy only a small amount of GPU memory thanks to the specially designed Efficient Multi-Head Attention (EMHA). Extensive experiments show that ESRT achieves competitive results at a low computational cost. Compared with the original Transformer, which occupies 16,057M of GPU memory, ESRT occupies only 4,191M. All code is available at https://github.com/luissen/ESRT.
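The memory saving described in the abstract comes from restricting self-attention to segments of the token sequence, so the attention map shrinks from one n x n matrix to several (n/s) x (n/s) blocks. The sketch below illustrates this idea; it is not the authors' exact EMHA module (the class name, the plain linear projections, and the split factor are assumptions; see the linked repository for the real implementation).

import torch
import torch.nn as nn

class SegmentedSelfAttention(nn.Module):
    # Sketch of segment-wise multi-head self-attention: the n tokens are
    # split into `splits` groups and attention is computed inside each
    # group, so peak memory for the attention map drops from O(n^2) to
    # O(n^2 / splits). Names and structure are illustrative assumptions,
    # not ESRT's exact EMHA design.
    def __init__(self, dim, heads=8, splits=4):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.splits = heads, splits
        self.scale = (dim // heads) ** -0.5
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (batch, n_tokens, dim); n_tokens % splits == 0
        b, n, d = x.shape
        h, s = self.heads, self.splits
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)

        # Reshape to (batch, heads, segments, tokens_per_segment, head_dim).
        def split(t):
            return (t.view(b, n, h, d // h).transpose(1, 2)
                     .reshape(b, h, s, n // s, d // h))
        q, k, v = split(q), split(k), split(v)

        # Attention only within each segment: (b, h, s, n/s, n/s).
        attn = torch.softmax((q @ k.transpose(-2, -1)) * self.scale, dim=-1)
        out = (attn @ v).reshape(b, h, n, d // h).transpose(1, 2).reshape(b, n, d)
        return self.proj(out)

As a quick shape check: with 4 splits, a 4,096-token feature map yields four 1,024 x 1,024 attention blocks per head instead of one 4,096 x 4,096 matrix.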
Date of Conference: 19-20 June 2022
Date Added to IEEE Xplore: 23 August 2022
Conference Location: New Orleans, LA, USA


1. Introduction

Single image super-resolution (SISR) aims at recovering a super-resolution (SR) image from its degraded low-resolution (LR) counterpart, which is a useful technology for overcoming resolution limitations in many applications. However, it is still an ill-posed problem, since infinitely many HR images can be degraded to the same LR input. To address this issue, numerous deep neural networks have been proposed [10, 13, 18, 21, 22, 26, 39, 40, 45]. Although these methods have achieved outstanding performance, they cannot be easily deployed in real applications due to their high computational cost and memory footprint. To solve this problem, many recurrent and lightweight networks have been proposed, such as DRCN [19], SRRFN [23], IMDN [16], IDN [17], CARN [2], ASSLN [46], MAFFSRN [31], and RFDN [27]. All of these models concentrate on constructing a more efficient network structure, but the reduced network capacity can lead to degraded performance.
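The ill-posedness can be made concrete with the standard degradation model used throughout the SISR literature (a common formulation, not quoted from this paper):

I_{LR} = (I_{HR} \otimes k) \downarrow_s + n,

where \otimes denotes convolution with a blur kernel k, \downarrow_s denotes downsampling by scale factor s, and n is additive noise. Because this mapping is many-to-one, any LR input is consistent with infinitely many HR images, and a prior, here a learned network, is needed to select a plausible reconstruction.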

