1. Introduction
Single-image super-resolution (SISR) takes a low-resolution (LR) image and estimates the corresponding high-resolution (HR) image. Earlier methods, such as bicubic interpolation, fill in missing information between pixels by interpolation and therefore require no training data [3], [7]. Although these methods preserve gross image structures, interpolation schemes do not guarantee the recovery of fine details in HR images and often produce fuzzy or blurred results. While such methods exploit information from neighboring pixels, they have no means of recovering information from correlations among image patches and their semantics.

Learning-based methods have proven effective at exploiting these correlations when given a large set of labeled data. In particular, deep learning methods [9], [10], [16] have demonstrated success in restoring blurred regions with higher contrast, essentially recovering fine image details. Among learning-based models, one of the most widely used is SRCNN [2]. It performs super-resolution in an end-to-end fashion using a convolutional neural network (CNN), and many later models build on its architecture. While its end-to-end structure is simple, SRCNN's relatively shallow depth prevents it from fully exploiting low-level image features when recovering fine details. Recently, many learning-based methods targeting the effective recovery of high-frequency details have been proposed, employing deeper networks to capture low-level features in an end-to-end manner.
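To make the interpolation baseline mentioned above concrete, the following is a minimal sketch of cubic convolution interpolation in one dimension, using the Keys kernel with a = -0.5 (the parameter commonly used by bicubic resizers); bicubic image upscaling applies this same 1-D routine separably along rows and then columns. Function names here are illustrative, not from any particular library.

```python
import math

def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (a = -0.5, a common bicubic choice)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

def cubic_interp_1d(samples, t):
    """Interpolate a value at fractional position t from a 1-D signal,
    weighting the four nearest samples and clamping indices at the borders."""
    i = math.floor(t)
    value = 0.0
    for k in range(i - 1, i + 3):
        j = min(max(k, 0), len(samples) - 1)  # clamp to valid range
        value += samples[j] * cubic_kernel(t - k)
    return value

def upscale_1d(samples, factor):
    """Upscale a 1-D signal by an integer factor."""
    n = len(samples) * factor
    return [cubic_interp_1d(samples, i / factor) for i in range(n)]
```

Note that the kernel equals 1 at x = 0 and 0 at the other integers, so the interpolant passes exactly through the original samples; the blur the text describes comes from the fact that no new high-frequency content is created between them.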
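As a rough sketch of the SRCNN pipeline described above (not the authors' implementation), the forward pass below chains three convolutions, 9x9 patch extraction, 1x1 non-linear mapping, and 5x5 reconstruction, with ReLU after the first two, applied to a bicubic-upscaled input. The toy random weights and input size are assumptions for illustration; n1 = 64 and n2 = 32 follow the original SRCNN setting.

```python
import numpy as np

def conv2d(x, w):
    """'Valid' 2-D convolution (cross-correlation) of a (C, H, W) input
    with (Cout, C, k, k) filters: no padding, stride 1."""
    k = w.shape[-1]
    # windows: (C, H-k+1, W-k+1, k, k)
    win = np.lib.stride_tricks.sliding_window_view(x, (k, k), axis=(1, 2))
    # contract channel and kernel dims against each output filter
    return np.einsum('chwij,ocij->ohw', win, w)

def srcnn_forward(y, params):
    """Three-layer SRCNN-style forward pass on a bicubic-upscaled image y."""
    w1, w2, w3 = params
    h = np.maximum(conv2d(y, w1), 0)   # 9x9 patch extraction, n1 feature maps
    h = np.maximum(conv2d(h, w2), 0)   # 1x1 non-linear mapping, n2 maps
    return conv2d(h, w3)               # 5x5 reconstruction

# Toy random weights and input (hypothetical, for shape illustration only).
rng = np.random.default_rng(0)
params = (rng.normal(0, 0.01, (64, 1, 9, 9)),
          rng.normal(0, 0.01, (32, 64, 1, 1)),
          rng.normal(0, 0.01, (1, 32, 5, 5)))
y = rng.random((1, 33, 33))            # 33x33 grayscale sub-image
out = srcnn_forward(y, params)         # valid convs shrink 33 -> 25 -> 25 -> 21
```

The shrinking spatial extent also makes the depth limitation visible: each added valid convolution costs border pixels, which is one reason later, deeper architectures adopt padding and residual connections.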