
Towards Perceptually Plausible Training of Image Restoration Neural Networks


Abstract:

Learning-based black-box approaches have proven successful at several tasks in the image and video processing domain. Many of these approaches depend on gradient descent and back-propagation, which require computing the gradient of the loss function. However, many visual metrics are not differentiable and, despite their superior accuracy, cannot be used to train neural networks for imaging tasks. As a result, most image restoration neural networks are trained with the mean squared error. In this paper, we investigate visual-system-based metrics in order to provide perceptual loss functions that can replace the mean squared error in gradient-descent-based algorithms. We also share preliminary results of the proposed approach.
Date of Conference: 06-09 November 2019
Date Added to IEEE Xplore: 19 December 2019
Conference Location: Istanbul, Turkey

I. Introduction

Visual metrics have numerous use cases in the visual processing domain. They play an important role in the development, evaluation, and optimization of many visual processing algorithms. There are various approaches to developing visual metrics. While some metrics focus on signal-driven calculations [1] [8] [10], others focus on modeling the visual system [2] [3]. Metrics that rely on signal-driven calculations model quality perception as a continuous function. Visual-model-based metrics, on the other hand, such as VDP [3] and HDR-VDP [2], predict the perceptual quality of images more accurately and are tuned on just-noticeable differences at near-threshold values. Although they are more accurate, they have high computational complexity, since they are derived from different components of the human visual system (HVS) whose parameters are fitted to psychophysical measurements. Additionally, this complexity results in non-differentiable models, which prevents them from being used in many visual processing applications.
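To make the contrast concrete: signal-driven metrics such as SSIM [1] can be written in closed form and are differentiable almost everywhere, which is why they are popular substitutes for the mean squared error in restoration losses [4]. The sketch below (our illustration, not the method proposed in this paper) shows a minimal single-window SSIM loss in NumPy; the full metric instead averages this statistic over local Gaussian windows.

```python
import numpy as np

def ssim(x, y, dynamic_range=1.0):
    """Global (single-window) structural similarity between two images
    whose pixel values lie in [0, dynamic_range]."""
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

def ssim_loss(prediction, target):
    """Perceptual loss: 1 - SSIM, so a perfect reconstruction scores 0."""
    return 1.0 - ssim(prediction, target)
```

Because every operation above is a smooth function of the pixel values, the same expression written in an automatic-differentiation framework yields gradients for back-propagation; HVS-based metrics such as HDR-VDP [2] contain components that break this property.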

References
1.
Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity", IEEE Trans. Image Process., vol. 13, no. 4, pp. 600-612, April 2004.
2.
R. Mantiuk, K. J. Kim, A. G. Rempel and W. Heidrich, "HDR-VDP-2: A calibrated visual metric for visibility and quality predictions in all luminance conditions", ACM Trans. Graph., vol. 30, no. 4, pp. 40-140, July 2011.
3.
S. Daly, "The visible differences predictor: an algorithm for the assessment of image fidelity" in Digital Images and Human Vision, Cambridge, MA, USA:MIT Press, pp. 179-206, 1993.
4.
H. Zhao, O. Gallo, I. Frosio and J. Kautz, "Loss Functions for Image Restoration With Neural Networks", IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 47-57, March 2017.
5.
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et al., "Generative adversarial nets", Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2 (NIPS'14), pp. 2672-2680, 2014.
6.
J. Johnson, A. Alahi and L. Fei-Fei, "Perceptual losses for real-time style transfer and super-resolution", European Conference on Computer Vision (ECCV), pp. 694-711, 2016.
7.
T. Ritschel, M. Ihrke, J. R. Frisvad, J. Coppens, K. Myszkowski and H.-P. Seidel, "Temporal Glare: Real-Time Dynamic Simulation of the Scattering in the Human Eye", Computer Graphics Forum, vol. 28, no. 2, pp. 183-192, 2009.
8.
D. Whitaker, R. Steen and D. Elliott, "Light scatter in the normal young, elderly, and cataractous eye demonstrates little wavelength dependency", Optometry and Vision Science, vol. 70, no. 11, pp. 963-968, 1993.
9.
A. Stockman and L. Sharpe, "The spectral sensitivities of the middle- and long-wavelength-sensitive cones derived from measurements in observers of known genotype", Vision Res., vol. 40, no. 13, pp. 1711-1737, 2000.
10.
T. O. Aydin, R. K. Mantiuk and H.-P. Seidel, "Extending quality metrics to full luminance range images", Proc. SPIE, vol. 6806, pp. 68060B-68060B-10, 2008.
11.
T. Tariq, J. L. Gonzalez and M. Kim, "A HVS-inspired Attention Map to Improve CNN-based Perceptual Losses for Image Restoration", 2019.
12.
K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition", 2014.
13.
S. Wu, M. Zhang, G. Chen and K. Chen, "A New Approach to Compute CNNs for Extremely Large Images", CIKM, 2017.
14.
R. Zhang, P. Isola, A. A. Efros, E. Shechtman and O. Wang, "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 586-595, 2018.
15.
R. Mantiuk, S. Daly, K. Myszkowski and H. Seidel, "Predicting visible differences in high dynamic range images: model and its calibration", Proc. SPIE, vol. 5666, pp. 204-214, 2005.
16.
J. Foley, "Human luminance pattern-vision mechanisms: masking experiments require a new model", Journal of the Optical Society of America A, vol. 11, no. 6, pp. 1710-1719, 1994.
17.
A. Watson, "The cortex transform: Rapid computation of simulated neural images", Computer Vision, Graphics, and Image Processing, vol. 39, no. 3, pp. 311-327, 1987.
18.
E. Simoncelli and W. Freeman, "The steerable pyramid: a flexible architecture for multi-scale derivative computation", Proceedings, International Conference on Image Processing, IEEE Comput. Soc. Press, vol. 3, pp. 444-447, 1995.
19.
Odena et al., "Deconvolution and Checkerboard Artifacts", Distill, 2016, [online] Available: http://doi.org/10.23915/distill.00003.
20.
L. Jin, J. Y. Lin, S. Hu, H. Wang, P. Wang, I. Katsavounidis, et al., "Statistical Study on Perceived JPEG Image Quality via MCL-JCI Dataset Construction and Analysis", Electronic Imaging, pp. 1-9, 2016.