I. Introduction
Establishing pixel-level correspondences between two images is an essential basis for a wide range of computer vision tasks, such as visual localization [1], [2], 3D scene reconstruction [3], and simultaneous localization and mapping (SLAM) [4]. Such correspondences are usually estimated by sparse local feature extraction and matching [5], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]. A local feature consists of a keypoint and a descriptor. However, the scale invariance of existing keypoint detectors and descriptors is insufficient to handle large scale changes [17]. When the scale change between two images is large, matching local features yields few inlier correspondences; we refer to this as the scale problem of local features in this paper. Conversely, if the scale difference between two images is small, we say that the two images are at related scale levels in scale space [18], [19].