1. Introduction
Absolute camera pose estimation is a fundamental step in many computer vision applications, such as Structure-from-Motion (SfM) [19], [33], [34], [39] and visual localization [32], [37], [38]. Given a pre-acquired 3D model of the world, we aim to estimate the most accurate camera pose of an unseen query image. In practice, as illustrated on the left-hand side of Figure 2, this problem is often addressed by sequentially solving two distinct subproblems: first, a feature matching problem that seeks to establish putative 2D-3D correspondences between the 3D point cloud and the image to be localized, and then a Perspective-n-Point (PnP) problem that uses these correspondences as inputs to minimize a sum of so-called reprojection errors w.r.t. the camera pose. The Reprojection Error (RE) is a function of a 2D-3D correspondence and the camera pose. It is computed by reprojecting the 3D point, using the camera pose, into the query image plane, computing the Euclidean distance between this reprojection and its putative 2D correspondent, and applying a robust loss function, such as Geman-McClure or Tukey's biweight [3], [47]. The robust loss reduces the influence of outlier 2D-3D correspondences.
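To make the RE computation concrete, the following is a minimal sketch (not the paper's implementation) of a robust reprojection error for a single 2D-3D correspondence, assuming a pinhole camera with known intrinsics K, a pose given by rotation R and translation t, and the Geman-McClure loss mentioned above; the function names and the scale parameter sigma are illustrative choices, not taken from the paper.

```python
import numpy as np

def geman_mcclure(r, sigma=1.0):
    # Geman-McClure robust loss: saturates for large residuals,
    # which limits the influence of outlier correspondences.
    return (r ** 2) / (r ** 2 + sigma ** 2)

def reprojection_error(P_world, p_obs, R, t, K, sigma=1.0):
    # Transform the 3D point into the camera frame using the pose (R, t).
    P_cam = R @ P_world + t
    # Project onto the image plane with the intrinsics K (perspective division).
    p_hom = K @ P_cam
    p_proj = p_hom[:2] / p_hom[2]
    # Euclidean distance between the reprojection and the putative 2D match.
    r = np.linalg.norm(p_proj - p_obs)
    # Apply the robust loss to the residual.
    return geman_mcclure(r, sigma)

# Example usage with made-up intrinsics, pose, and correspondence:
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
P = np.array([0.1, -0.2, 4.0])   # 3D point in world coordinates
p = np.array([340.0, 200.0])     # putative 2D correspondent in pixels
print(reprojection_error(P, p, R, t, K, sigma=3.0))
```

In the PnP step described above, a sum of such terms over all putative correspondences would be minimized w.r.t. the pose (R, t).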