Loading [MathJax]/extensions/MathMenu.js
Towards Perceptually Plausible Training of Image Restoration Neural Networks | IEEE Conference Publication | IEEE Xplore

Towards Perceptually Plausible Training of Image Restoration Neural Networks


Abstract:

Learning-based black-box approaches have proven to be successful at several tasks in image and video processing domain. Many of these approaches depend on gradient-descen...Show More

Abstract:

Learning-based black-box approaches have proven to be successful at several tasks in image and video processing domain. Many of these approaches depend on gradient-descent and back-propagation algorithms which requires to calculate the gradient of the loss function. However, many of the visual metrics are not differentiable, and despite their superior accuracy, they cannot be used to train neural networks for imaging tasks. Most of the image restoration neural networks rely on mean squared error to train. In this paper, we investigate visual system based metrics in order to provide perceptual loss functions that can replace mean squared error for gradient descent-based algorithms. We also share our preliminary results on the proposed approach.
Date of Conference: 06-09 November 2019
Date Added to IEEE Xplore: 19 December 2019
ISBN Information:

ISSN Information:

Conference Location: Istanbul, Turkey

I. Introduction

Visual metrics has numerous use cases in visual processing domain. They play an important role in the development, evaluation, and optimization of many visual processing algorithms. There are various approaches to develop visual metrics. While some of the metrics focus on signal driven calculations [1] [8] [10], some focus on modeling the visual system [2] [3]. Metrics which relies on signal driven calculations model the quality perception as a continuous function. On the other hand, Visual Model based metrics, such as VDP [3] and HDR-VDP [2], can predict the perceptual quality of the images more accurately and tuned on the Just Noticeable Differences around near threshold values. Although they are more accurate, they have a high computational complexity since they are derived from different components of Human Visual System (HVS) where the data is collected from a set of psychophysical measurements. Additionally, this complexity results in non-differentiable models which prevents them to be used in many visual processing applications.

Contact IEEE to Subscribe

References

References is not available for this document.