U²-Former: Nested U-Shaped Transformer for Image Restoration via Multi-View Contrastive Learning

IEEE Journals & Magazine | IEEE Xplore


Abstract:

While Transformer has achieved remarkable performance in various high-level vision tasks, it remains challenging to exploit its full potential in image restoration. The crux lies in the limited depth at which Transformer can be applied in the typical encoder-decoder framework for image restoration, owing to the heavy self-attention computation load and inefficient communication across layers at different depths (scales). In this paper, we present a deep and effective Transformer-based network for image restoration, termed U²-Former, which employs the self-attention of Transformer as the core operation for feature learning and performs image restoration in a deep encoding and decoding space. Specifically, it leverages a nested U-shaped structure to facilitate interactions across layers with different scales of feature maps. Furthermore, we improve the computational efficiency of the basic Transformer block by introducing a simple yet effective feature-filtering mechanism that compresses the token representation. Beyond the typical supervision schemes for image restoration, our U²-Former also performs multi-view contrastive learning, which constructs positive pairs from various aspects, to learn noise-sensitive but content-irrelevant features and further decouple the noise component from the background image. Extensive experiments on various image restoration tasks, including reflection removal, rain streak removal, and dehazing, demonstrate the effectiveness of the proposed U²-Former.
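The abstract mentions a feature-filtering mechanism that compresses the token representation before self-attention, but does not give its formulation here. As a minimal sketch of the general idea (the scoring projection, top-k selection, and keep ratio below are assumptions for illustration, not the paper's actual design), one could score each token with a learned projection and retain only the highest-scoring fraction:

```python
import numpy as np

def filter_tokens(tokens, w_score, keep_ratio=0.5):
    """Hypothetical token-filtering step: score each token with a
    learned projection and keep only the top-scoring fraction,
    shrinking the sequence that self-attention must process."""
    scores = tokens @ w_score                  # (n,) relevance score per token
    k = max(1, int(len(tokens) * keep_ratio))  # how many tokens survive
    keep = np.argsort(scores)[-k:]             # indices of the top-k tokens
    return tokens[np.sort(keep)]               # preserve the original order

rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 32))         # 64 tokens, 32-dim embeddings
w = rng.standard_normal(32)                    # stand-in for a learned scorer
compressed = filter_tokens(tokens, w, keep_ratio=0.25)
print(compressed.shape)                        # (16, 32)
```

Since self-attention cost grows quadratically with sequence length, keeping a quarter of the tokens cuts the attention cost by roughly a factor of sixteen, which is what makes deeper Transformer stacks affordable.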
Page(s): 168 - 181
Date of Publication: 19 June 2023



I. Introduction

Image restoration is an important yet challenging research problem spanning many computer vision tasks, such as image reflection removal, image deraining, and image dehazing. To reconstruct an uncorrupted image efficiently, accurate perception of diverse noise patterns plays a key role. Most existing state-of-the-art methods [1], [2], [3] for image restoration are built on CNN architectures due to their excellent feature-learning performance. Owing to the inherent nature of the convolutional operation, however, these methods recognize noise patterns only from features learned within a local view of the input image; deeper networks enlarge this view, but it remains local. A global perception of the whole image is nevertheless crucial for image restoration.
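To make the local-versus-global contrast concrete, a single self-attention step mixes every position with every other position, so the receptive field is global after one layer, whereas a convolution only sees its kernel window. The minimal NumPy sketch below (a single head with the query/key/value projections omitted; not the paper's actual block) shows this:

```python
import numpy as np

def self_attention(x):
    """Single-head self-attention without learned projections:
    every output token is a weighted mix of ALL input tokens,
    so one layer already has a global receptive field."""
    scores = x @ x.T / np.sqrt(x.shape[1])              # (n, n) similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # row-wise softmax
    return weights @ x                                  # global mixing

x = np.random.default_rng(1).standard_normal((49, 8))   # e.g. a 7x7 patch grid
out = self_attention(x)
print(out.shape)                                        # (49, 8)
```

By contrast, a 3x3 convolution at the same resolution would let each output position depend on only 9 neighbors, and the dependency radius grows by just one patch per layer.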

