1 Introduction
Single-image super-resolution (SISR) has long been a fundamental problem in low-level vision, aiming to recover a high-resolution (HR) image from an observed low-resolution (LR) input. Years of effort from the research community have brought remarkable progress in this field, especially with the boom of deep learning techniques [1], [2], [3], [4], [5]. However, most existing methods assume a pre-defined degradation process (e.g., bicubic downsampling) from the HR image to the LR one, an assumption that rarely holds for real-world images with complex degradations.

To fill this gap, growing attention has been paid in recent years to approaches that handle unknown degradations, namely blind SR. Despite many exciting improvements, these methods still tend to fail in real-world scenarios: their performance is usually limited to certain kinds of inputs and drops dramatically in other cases. The main reason is that they continue to make assumptions about the degradation types of the input LR image. Fig. 1a illustrates this with four different LR inputs, each matching the degradation assumed by a state-of-the-art method, yet all derived from the same HR image. Consequently, when given an arbitrary input that deviates from their assumed data distributions, these methods inevitably produce much less pleasing results.

Fig. 1b shows different SR results for a real-world image cropped from the famous film Forrest Gump, generated by four state-of-the-art methods. None of them lives up to our expectation of a good viewing experience, since this real-world image does not strictly follow their assumptions on inputs. Indeed, it is not uncommon to be unsure which method to choose for a given image, or whether a high-quality result is even attainable with existing methods.
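To make the pre-defined degradation assumption concrete, the sketch below shows how non-blind SR training pairs are typically synthesized by plain bicubic downsampling. This is an illustrative example (function name, scale factor, and image size are our own choices), written with Pillow's `Image.resize`:

```python
import numpy as np
from PIL import Image

def bicubic_degrade(hr: np.ndarray, scale: int = 4) -> np.ndarray:
    """Synthesize an LR image from an HR one by bicubic downsampling,
    the pre-defined degradation most non-blind SR methods assume."""
    h, w = hr.shape[:2]
    img = Image.fromarray(hr)
    lr = img.resize((w // scale, h // scale), resample=Image.BICUBIC)
    return np.asarray(lr)

# Toy HR image; real-world degradations also involve blur, noise, or
# compression artifacts, which this simple model fails to capture.
hr = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
lr = bicubic_degrade(hr, scale=4)
print(lr.shape)  # (16, 16, 3)
```

A model trained only on such pairs implicitly learns to invert bicubic downsampling, which explains the sharp performance drop when the true degradation differs.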