1. Introduction
Lightweight image restoration (IR) or enhancement techniques are essential for correcting the inherent flaws of images captured in the wild, especially those taken by devices with limited computational power. These techniques aim to reconstruct high-quality images from their distorted low-quality counterparts. However, lightweight IR with the popular vision Transformer [14] remains relatively unexplored. Although many recent Transformer [52] networks have advanced the IR domain [7], [9], [56], [65], [69], their large number of parameters makes them infeasible for real-world applications. Moreover, even state-of-the-art lightweight IR networks incur substantial computational costs [5], [10], [33], [38], [73]. Another problem is that some IR models focus mainly on expanding the receptive field locally [9], [10], [33], [56], [73], which is insufficient to capture the global dependencies in an image. This is critical because IR networks need to refer to repeated patterns and textures distributed throughout the image [18], [38]. Meanwhile, other models enlarge the receptive field globally [5], [65], [69] but overlook important local (spatial) information, which is conventionally essential for recovery tasks [9], [10], [21], [56]. Fig. 1 visualizes several examples in which successful IR depends on the ability to consider both local and global features of a given distorted low-quality image, underscoring the significance of this problem.