I. Introduction
With the rapid development of intelligent surveillance systems, surveillance cameras have been deployed in many areas, including security and protection systems. Surveillance images, especially face images, can provide very important clues for criminal investigation. However, the resolution of a surveillance camera is usually not High-Definition (HD) (see Fig. 1(a)). Even with a higher-resolution camera, the long distance between the subject and the camera leaves the face of interest at low resolution (see Fig. 1(c)). In both cases the extracted face provides almost no useful information (see Fig. 1(b)). Moreover, in real surveillance scenarios, image quality is further degraded by many environmental factors, such as underexposure, optical blurring, and defocusing. Consequently, the face images of interest are too blurred to be identified by humans.

In order to obtain enough facial detail for recognition, a technique called face super-resolution, or face hallucination, is adopted to generate a High-Resolution (HR) face image from Low-Resolution (LR) images. Existing image hallucination methods fall mainly into two categories: reconstruction-based techniques and learning-based techniques. The former rely on registering and aligning multiple LR images of the same scene with sub-pixel accuracy, and are therefore more susceptible to ill-conditioned registration and inappropriate blurring operators [1]. The latter, with the help of a set of training examples, can achieve better performance and higher magnification factors. We focus on learning-based methods in the sequel.

Fig. 1. Typical frames from surveillance videos. (a) and (c) are surveillance images from a CIF-size camera and a 720P camera, respectively; (b) shows the two faces of interest extracted from (a) and (c).