Abstract:
Diffusion models have become a rising star in image super-resolution (SR) tasks. However, it is not trivial to apply diffusion models to light field (LF) image SR, which requires maintaining both the high-quality visual appearance of each sub-aperture image (SAI) and the angular consistency between different SAIs. This paper proposes the first diffusion-based LF image SR model, namely LFSRDiff, by incorporating the LF disentanglement mechanism and residual modeling. Specifically, we introduce a disentangled U-Net (Distg U-Net) for diffusion models, enabling improved extraction and fusion of the spatial and angular information in LF images. Furthermore, we leverage residual modeling in diffusion to learn the residual between the upsampled low-resolution image and the ground-truth high-resolution image, which significantly accelerates model training and yields superior results compared to learning the high-resolution image directly. Extensive experiments conducted on five datasets demonstrate the effectiveness of our approach, which produces realistic SR results and achieves the best perceptual quality in terms of the LPIPS metric. Code is publicly available at https://github.com/chaowentao/LFSRDiff.
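The residual-modeling idea summarized above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the diffusion model's training target is assumed to be the residual between the ground-truth high-resolution image and an upsampled low-resolution image, and nearest-neighbor upsampling stands in for whatever interpolation the authors use. All function names below are hypothetical.

```python
import numpy as np

def upsample(lr, scale):
    # Nearest-neighbor upsampling as a simple stand-in (the paper's
    # actual interpolation method is not specified in the abstract).
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)

def residual_target(hr, lr, scale):
    # Diffusion target under residual modeling: r = HR - upsample(LR).
    # The network then learns to denoise toward this residual rather
    # than toward the full HR image.
    return hr - upsample(lr, scale)

def reconstruct(lr, predicted_residual, scale):
    # At inference, the predicted residual is added back onto the
    # upsampled LR image to form the final SR output.
    return upsample(lr, scale) + predicted_residual
```

Because the residual is typically close to zero-mean and low-energy, it is an easier distribution for the diffusion process to model than the full image, which is consistent with the faster training reported in the abstract.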
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025