1. Introduction
Thanks to advances in sensing technology, camera-carrying mobile platforms have become increasingly accessible to a wide variety of engineering and science disciplines. This, in turn, has driven scientific progress mainly in the areas of computer vision, image processing, and machine learning. Within these fields, image matching (or registration) has secured its position as a core component of many more sophisticated tasks. It is defined as the procedure of overlaying two images that share some overlapping area. To overlay the images, the coordinate transformation (motion) between their coordinate frames (with the top-left corner as origin) must be calculated. The success of several high-level methods in computer vision and robotics (e.g., mapping, 3D reconstruction, and localization) relies on image matching.

Image registration methods are commonly grouped into three categories [22]: optical flow [11, 19], Fourier-transform-based [16], and feature-based. Over the last two decades, developments in feature point detection and description (most notably the Scale-Invariant Feature Transform (SIFT) [12] and Speeded-Up Robust Features (SURF) [2]) have made it possible to compute the transformation between images even under extreme conditions (e.g., large scale and viewpoint changes). These advances have led researchers to favor feature-based methods. Deep-learning-based methods have also been proposed for feature detection and matching (e.g., [13, 21]), and a comparative benchmark was presented in [1]. The D2-Net framework for joint detection and description of local features was proposed in [6]; it outperforms other methods in localization tasks, although it has some limitations in image matching.
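To make the core idea concrete, the following is a minimal sketch (not taken from any cited method) of the feature-based pipeline's final step: once matched keypoint coordinates are available, e.g., from SIFT or SURF descriptors, the coordinate transformation between the two image frames can be estimated by least squares. Here an affine model is assumed for simplicity, and the function name `estimate_affine` is our own.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares affine transform mapping src points to dst points.

    src, dst: (N, 2) arrays of matched keypoint coordinates in the two
    images (origin at the top-left corner, as in the text). Returns a
    2x3 matrix A such that dst ~= [x, y, 1] @ A.T for each point.
    """
    n = src.shape[0]
    # Each correspondence contributes two linear equations in the six
    # affine parameters; stack a column of ones for the translation part.
    X = np.hstack([src, np.ones((n, 1))])        # shape (N, 3)
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)  # shape (3, 2)
    return sol.T                                   # shape (2, 3)

# Synthetic check: points related by a known scaling and translation.
rng = np.random.default_rng(0)
src = rng.uniform(0, 100, size=(20, 2))
true_A = np.array([[1.2, 0.0, 5.0],
                   [0.0, 1.2, -3.0]])
dst = np.hstack([src, np.ones((20, 1))]) @ true_A.T
A = estimate_affine(src, dst)
```

In practice the matches contain outliers, so robust estimators such as RANSAC are typically wrapped around this kind of least-squares fit, and a full homography is often estimated instead of an affine transform.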