KNN Local Attention for Image Restoration


Abstract:

Recent works attempt to integrate the non-local operation with CNNs or Transformers, achieving remarkable performance in image restoration tasks. The global similarity, however, suffers from a lack of locality and a high computational complexity that is quadratic in the input resolution. The local attention mechanism alleviates these issues by introducing the inductive bias of locality with convolution-like operators. However, by focusing only on adjacent positions, local attention suffers from an insufficient receptive field for image restoration. In this paper, we propose a new attention mechanism for image restoration, called k-NN Image Transformer (KiT), that rectifies the above-mentioned limitations. Specifically, the KiT groups k-nearest-neighbor patches with locality sensitive hashing (LSH), and the grouped patches are aggregated into each query patch by performing a pair-wise local attention. In this way, the pair-wise operation establishes non-local connectivity while maintaining the desired properties of the local attention, i.e., the inductive bias of locality and linear complexity in the input resolution. The proposed method outperforms state-of-the-art restoration approaches on image denoising, deblurring, and deraining benchmarks. The code will be available soon.
Date of Conference: 18-24 June 2022
Date Added to IEEE Xplore: 27 September 2022
Conference Location: New Orleans, LA, USA

1. Introduction

Image restoration aims to recover a clean image from various types of degradation (e.g., noise, blur, rain, and compression artifacts), and it has a large impact on the performance of downstream tasks such as image classification [14], [56], object detection [22], [46], and segmentation [4], [10], to name a few. It is a highly ill-posed inverse problem, as multiple solutions may exist for a single degraded image. Recent restoration works [17], [36], [76] attempt to establish a mapping between clean and degraded images by leveraging the representation power of convolutional neural networks (CNNs). The series of local operations used in CNNs is, however, inherently less capable of capturing long-range dependencies, limiting its ability to exploit global information over an entire image. To enlarge the receptive field, increasing network depth [51], dilated convolution [66], and hierarchical architectures [40] have been proposed, but the receptive field still fails to cover global information, as it remains limited to local regions.

Recently, the non-local operation, which has mostly contributed to non-learning-based restoration approaches [5], [15], has re-emerged as a promising solution with the success of non-local neural networks [58]. As similar patterns tend to repeat within a natural image, non-local self-similarity, which computes the response at a single position as a weighted sum over all positions, has served as an important cue for image restoration [16], [28], [32], [37], [38], [43], [53], [77], [78]. The non-local self-similarity of [58] can capture long-range dependencies within deep networks, but its quadratic complexity with respect to the input feature resolution limits the network capacity. Consequently, it is employed only in relatively low-resolution feature maps of specific layers [16], [32], [77].

Comparisons of different attention approaches: (a) Global attention [18], [45], [57] computes self-similarity between patches globally, (b) local attention [33], [59] measures self-similarity within a single patch at the pixel-level, and (c) the proposed method aggregates similar k patches with a pair-wise local attention at the pixel-level.
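The mechanism in (c) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name lsh_group_attention, the random-projection hash, and the fixed-size sorted-bucket grouping are assumptions made for brevity, whereas the actual KiT retrieves the k nearest neighbors per query patch via LSH.

import torch
import torch.nn.functional as F

def lsh_group_attention(patches, k=4, n_bits=8, seed=0):
    """Illustrative sketch of LSH-based patch grouping with pair-wise attention.

    patches: (N, C) flattened patch features, with N divisible by k.
    1) Hash every patch with random sign projections (a simple LSH).
    2) Sort patches by hash code so that similar patches become adjacent,
       then split them into groups of k (approximate k-NN grouping).
    3) Run pair-wise softmax attention inside each group and scatter the
       results back to the original patch order.
    """
    N, C = patches.shape
    g = torch.Generator().manual_seed(seed)

    # 1) Random-projection LSH codes, one integer bucket key per patch.
    planes = torch.randn(C, n_bits, generator=g)
    codes = (patches @ planes > 0).long()                  # (N, n_bits)
    keys = (codes * (2 ** torch.arange(n_bits))).sum(-1)   # (N,)

    # 2) Group approximately similar patches together.
    order = keys.argsort()
    groups = patches[order].view(N // k, k, C)              # (G, k, C)

    # 3) Pair-wise attention within each group of k patches.
    attn = torch.einsum('gic,gjc->gij', groups, groups) / C ** 0.5
    out = torch.einsum('gij,gjc->gic', F.softmax(attn, dim=-1), groups)

    # Undo the sort so outputs line up with the original patch order.
    result = torch.empty_like(patches)
    result[order] = out.reshape(N, C)
    return result

# Toy usage: 64 patches of dimension 32.
y = lsh_group_attention(torch.randn(64, 32), k=4)

With a fixed group size k, the attention cost grows linearly with the number of patches, which matches the linear-complexity property claimed in the abstract, while the grouping step supplies non-local connectivity beyond adjacent positions.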
