SECTION I.
Introduction
The use of cameras and inertial sensors for three-dimensional (3-D) structure and motion estimation has received
considerable attention from the robotics community. Both sensor types are cheap, ubiquitous, and complementary. A
single moving camera is an exteroceptive sensor that allows us to measure appearance and geometry of a 3-D scene, up
to an unknown metric scale; an inertial measurement unit (IMU) is a proprioceptive sensor that renders metric scale of
monocular vision and gravity observable [1] and provides robust and accurate
interframe motion estimates. Applications of visual–inertial odometry (VIO) range from autonomous navigation in
GPS-denied environments to 3-D reconstruction and augmented reality.
The existing literature on VIO imposes a tradeoff between accuracy and computational efficiency (a detailed
review is given in Section II). On one hand, filtering approaches enable
fast inference, but their accuracy is deteriorated by the accumulation of linearization errors. On the other hand,
full smoothing approaches, based on nonlinear optimization, are accurate, but computationally demanding. Fixed-lag
smoothing offers a compromise between accuracy and efficiency; however, it is not clear how to set the length of the
estimation window so as to guarantee a given level of performance.
In this study, we show that it is possible to overcome this tradeoff. We design a VIO system that enables fast
incremental smoothing and computes the optimal maximum a posteriori (MAP) estimate in real time. An
overview of our approach is given in Section IV.
The first step toward this goal is the development of a novel preintegration theory. The use of
preintegrated IMU measurements was first proposed in [2] and consists
of combining many inertial measurements between two keyframes into a single relative motion constraint. We build upon
this study and present a preintegration theory that properly addresses the manifold structure of the rotation group
$\mathrm{SO}(3)$. Our preintegration theory is
presented in Sections V–VI.
Compared with [2], our theory offers a more formal treatment of the
rotation noise, and avoids singularities in the representation of rotations. Furthermore, we are able to derive all
necessary Jacobians in analytic form: specifically, we report the analytic Jacobians of the residuals, the noise
propagation, and the a posteriori bias correction in the appendix of this paper.
Our second contribution is to frame the IMU preintegration theory into a factor graph model. This enables the
application of incremental smoothing algorithms, such as iSAM2 [3], which
avoid the accumulation of linearization errors and offer an elegant way to trade off accuracy against efficiency. Inspired
by [4] and [5], we also
adopt a structureless model for the visual measurements, which allows us to eliminate a large number
of variables (i.e., all 3-D points) during incremental smoothing, further accelerating the computation (see
Section VII). In contrast to [5],
we use the structureless model in an incremental smoothing framework. This has two main advantages: we do not need to
delay the processing of visual measurements, and we can relinearize the visual measurements multiple times.
In order to demonstrate the effectiveness of our model, we integrated the proposed IMU preintegration in a
state-of-the-art VIO pipeline and tested it on real and simulated datasets (see
Section VIII). Our theoretical development leads to tangible practical advantages: an implementation of the
approach proposed in this paper performs full smoothing at a rate of 100 Hz and achieves superior accuracy with
respect to competitive state-of-the-art filtering and optimization approaches.
Besides the technical contribution, the paper also provides a tutorial contribution for practitioners. In
Section III and across the paper, we provide a concise summary of
uncertainty representation on manifolds and exemplary derivations for uncertainty propagation and Jacobian
computation. The complete derivation of all equations and Jacobians—necessary to implement our model—is
given in the appendix.
This paper is an extension of our previous work [6] with additional
experiments, an in-depth discussion of related work, and comprehensive technical derivations. The results of the new
experiments highlight the accuracy of bias estimation, demonstrate the consistency of our approach, and provide
comparisons against full-batch estimation. We release our implementation of the preintegrated IMU and structureless
vision factors in the GTSAM 4.0 optimization toolbox [7].
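As a pointer for practitioners, the snippet below sketches how the released preintegration factor can be used from the GTSAM 4.0 Python wrapper. It is a minimal sketch based on our reading of the toolbox API; class and method names should be verified against the GTSAM documentation, and the measurement container imu_between_keyframes is a hypothetical placeholder.
\begin{verbatim}
# Minimal usage sketch of the preintegrated IMU factor in GTSAM 4.0 [7].
# API names reflect our reading of the GTSAM Python wrapper; verify against
# the toolbox documentation. `imu_between_keyframes` is a placeholder.
import numpy as np
import gtsam
from gtsam.symbol_shorthand import X, V, B   # pose, velocity, bias keys

params = gtsam.PreintegrationParams.MakeSharedU(9.81)  # gravity along -z
params.setAccelerometerCovariance(np.eye(3) * 1e-3)
params.setGyroscopeCovariance(np.eye(3) * 1e-4)
params.setIntegrationCovariance(np.eye(3) * 1e-8)

pim = gtsam.PreintegratedImuMeasurements(params, gtsam.imuBias.ConstantBias())
for acc, omega in imu_between_keyframes:               # placeholder input
    pim.integrateMeasurement(acc, omega, 0.005)        # 200-Hz IMU assumed

graph = gtsam.NonlinearFactorGraph()
graph.add(gtsam.ImuFactor(X(0), V(0), X(1), V(1), B(0), pim))
\end{verbatim}
In a complete pipeline, the graph would also contain the structureless vision factors of Section VII and would be solved incrementally with iSAM2 [3].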
SECTION II.
Related Work
Related work on VIO can be organized along three main dimensions. The first dimension is the number of camera poses
involved in the estimation. While full smoothers (or batch nonlinear least-squares
algorithms) estimate the complete history of poses, fixed-lag smoothers (or sliding window
estimators) consider a window of the latest poses, and filtering approaches only estimate
the latest state. Both fixed-lag smoothers and filters marginalize older states and absorb the corresponding
information in a Gaussian prior.
The second dimension regards the representation of the uncertainty for the measurements and the Gaussian priors: the
Extended Kalman Filter (EKF) represents the uncertainty using a covariance matrix; instead,
information filters and smoothers resort to the information matrix (the inverse of the covariance) or the
square root of the information matrix [3],
[8].
Finally, the third dimension distinguishes existing approaches by looking at the number of times in which the
measurement model is linearized. While a standard EKF (in contrast to the iterated EKF) processes a
measurement only once, a smoothing approach allows linearizing multiple times.
While the terminology is vast, the underlying algorithms are tightly related. For instance, it can be shown that the
iterated EKF equations are equivalent to the Gauss–Newton algorithm, commonly used for smoothing
[9].
A. Filtering
Filtering algorithms enable efficient estimation by restricting the inference process to the latest state of the
system. The complexity of the EKF grows quadratically with the number of estimated landmarks; therefore, a small number
of landmarks (on the order of 20) is typically tracked to allow real-time operation
[10]–[12]. An alternative
is to adopt a “structureless” approach, where landmark positions are marginalized out of the state vector.
An elegant example of this strategy is the multistate constraint Kalman filter (MSC-KF)
[5]. The structureless approach requires keeping previous poses in the state
vector by means of stochastic cloning [13].
A drawback of using a structureless approach for filtering is that the processing of landmark measurements needs to
be delayed until all measurements of a landmark are obtained [5]. This
hinders accuracy as the filter cannot use all current visual information. Marginalization is also a source of errors
as it locks in linearization errors and erroneous outlier measurements. Therefore, it is particularly important to
filter out spurious measurements as a single outlier can irreversibly corrupt the filter
[14]. Further, linearization errors introduce drift in the estimate and render
the filter inconsistent. An effect of inconsistency is that the estimator becomes overconfident,
resulting in a nonoptimal information fusion. Generally, the VIO problem has four unobservable directions: the global
position and the orientation around the gravity direction (yaw) [15],
[16]. In [16], it is shown that the
linearization at the wrong estimate results in only three unobservable directions (the global position); hence, an
erroneous linearization adds spurious information in the yaw direction to the Gaussian prior, which renders the filter
inconsistent. This problem was addressed with the first-estimates Jacobian approach
[17], which ensures that a state is not updated with different linearization
points—a source of inconsistency. In the observability-constrained EKF (OC-EKF), an estimate of
the unobservable directions is maintained, which allows updating the filter only along directions that are observable
[16], [18]. A thorough analysis of
the VIO observability properties is given in [1],
[15], and [19].
B. Fixed-Lag Smoothing
Fixed-lag smoothers estimate the states that fall within a given time window, while marginalizing out older
states [20]–
[24]. In a maximum likelihood estimation setup, fixed-lag smoothers lead to an optimization problem over a set
of recent states. For nonlinear problems, smoothing approaches are generally more accurate than filtering, since they
relinearize past measurements [25]. Moreover, these approaches are more
resilient to outliers, which can be discarded a posteriori (i.e., after the optimization), or whose effect can be
mitigated by using robust cost functions. On the downside, the marginalization of the states outside the estimation
window leads to dense Gaussian priors which hinder efficient inference. For this reason, it has been proposed to drop
certain measurements in the interest of sparsity [24]. Furthermore, due to
marginalization, fixed-lag smoothers share part of the issues of filtering (consistency, build-up of linearization
errors) [18], [22],
[26].
C. Full Smoothing
Full smoothing methods estimate the entire history of the states (camera trajectory and 3-D landmarks), by solving a
large nonlinear optimization problem [27]–[31]. Full smoothing guarantees the highest accuracy; however, real-time
operation quickly becomes infeasible as the trajectory and the map grow over time. Therefore, it has been proposed to
discard frames except selected keyframes [24],
[32]–[34] or to run the
optimization in a parallel thread, using a tracking and mapping dual architecture
[20], [35]. A breakthrough has been the development of
incremental smoothing techniques (iSAM [36], iSAM2
[3]), which leverage the expressiveness of factor graphs
to maintain sparsity and to identify and update only the typically small subset of variables affected by a new
measurement.
Nevertheless, the high rate of inertial measurements (usually 100 Hz to 1 kHz) still constitutes a
challenge for smoothing approaches. A naive implementation would require adding a new state at every IMU measurement,
which quickly becomes impractically slow [37]. Therefore, inertial
measurements are typically integrated between frames to form relative motion constraints
[24], [30],
[38]–[40]. For standard
IMU integration between two frames, the initial condition is given by the state estimate at the first frame. However,
at every iteration of the optimization, the state estimate changes, which requires repeating the IMU integration
between all frames [24]. Lupton and Sukkarieh
[2] show that this repeated integration can be avoided by a reparametrization
of the relative motion constraints. Such reparametrization is called IMU preintegration.
In this study, we build upon the seminal work [2] and bring the theory of
IMU preintegration to maturity by properly addressing the manifold structure of the rotation group $\mathrm{SO}(3)$. The work
[2] adopted Euler angles as the global parametrization for rotations. Using Euler angles and applying the usual
averaging and smoothing techniques of Euclidean spaces for state propagation and covariance estimation is not properly
invariant under the action of rigid transformations [41],
[42]. Moreover, Euler angles are known to have singularities. Our study, on
the other hand, provides a formal treatment of the rotation measurements (and the corresponding noise), and provides a
complete derivation of the MAP estimator. We also derive analytic expressions for the Jacobians (needed for the
optimization), which, to the best of our knowledge, have not been previously reported in the literature. In the
experimental section, we show that a proper representation of the rotation manifold results in higher accuracy and
robustness, leading to tangible advantages over the original proposal [2].
SECTION III.
Preliminaries
In this paper, we formulate VIO in terms of MAP estimation. In our model, MAP estimation leads to a nonlinear
optimization problem that involves quantities living on smooth manifolds (e.g., rotations, poses). Therefore, before
delving into details, we conveniently review some useful geometric concepts. This section can be skipped by the expert
reader.
We structure this section as follows: Section III-A provides useful
notions related to two main Riemannian manifolds: the Special Orthogonal Group $\mathrm{SO}(3)$ and the Special Euclidean Group $\mathrm{SE}(3)$. Our presentation is based on
[43] and [44].
Section III-B introduces a suitable model for describing uncertain rotations
in $\mathrm{SO}(3)$.
Section III-C reviews optimization on manifolds, following standard
references [45].
A. Notions of Riemannian Geometry
1) Special Orthogonal Group
$\mathrm{SO}(3)$ describes the group of 3-D
rotation matrices and it is formally defined as $ \mathrm{SO}(3) \doteq
\lbrace \mathtt{R}\in \mathbb {R}^{3\times 3} \; {:} \; \mathtt{R}^{\mathsf {T}}\mathtt{R}= {\mathbf {I}}, \det
(\mathtt{R})=1 \rbrace $. The group operation is the usual matrix multiplication, and the
inverse is the matrix transpose. The group $\mathrm{SO}(3)$ also forms a smooth manifold. The tangent space to the manifold (at the identity) is denoted as
$\mathfrak {so}(3)$, which is also called the
Lie algebra and coincides with the space of $3 \times 3$
skew symmetric matrices. We can identify every skew symmetric matrix with a vector in
$\mathbb {R}^{3}$ using the hat
operator
\begin{equation}
\boldsymbol{\omega }^\wedge = \left[\begin{array}{c}\omega _1 \\
\omega _2 \\
\omega _3 \end{array}\right]^\wedge = \left[\begin{array}{ccc}0 & -\omega _3 & \omega _2 \\
\omega _3 & 0 & -\omega _1 \\
-\omega _2 & \omega _1 & 0 \end{array}\right] \in \mathfrak {so}(3).
\end{equation}
Similarly, we can map a skew symmetric matrix to a vector in $\mathbb {R}^{3}$ using the vee operator
$(\cdot)^\vee$: for a skew symmetric matrix
$\mathtt{S} = \boldsymbol{\omega }^\wedge$, the
vee operator is such that $\mathtt{S}^\vee = \boldsymbol{\omega }$
. A property of skew symmetric matrices that will be useful later on is
\begin{equation}
\mathbf {a}^\wedge \; \mathbf {b} = -\mathbf {b}^\wedge \; \mathbf {a} \quad \forall \; \mathbf {a},\mathbf {b}\in
\mathbb {R}^{3}.
\end{equation}
The exponential map (at the identity) $\exp: \mathfrak {so}(3) \rightarrow \mathrm{SO}(3)$ associates
an element of the Lie algebra to a rotation and coincides with the standard matrix exponential (Rodrigues’
formula)
\begin{equation}
\exp (\boldsymbol{\phi }^\wedge) = {\mathbf I}+ \frac{\sin (\Vert \boldsymbol{\phi }\Vert)}{\Vert \boldsymbol{\phi
}\Vert } \boldsymbol{\phi }^\wedge + \frac{1-\cos (\Vert \boldsymbol{\phi }\Vert)}{\Vert \boldsymbol{\phi }\Vert ^2}
\left(\boldsymbol{\phi }^\wedge \right)^2.
\end{equation}
A first-order approximation of the exponential map that we will use later on is
\begin{equation}
\exp (\boldsymbol{\phi }^\wedge) \approx \mathbf {I}+ \boldsymbol{\phi }^\wedge \;.
\end{equation}
The logarithm map (at the identity) associates a matrix
$\mathtt{R}\ne \mathbf {I}$ in $\mathrm{SO}(3)$
to a skew symmetric matrix
\begin{equation}
\log (\mathtt{R}) = \frac{\varphi \cdot (\mathtt{R}- \mathtt{R}^{\mathsf {T}}) }{2\sin (\varphi)} \text{ with } \varphi
= \cos ^{-1}\left(\frac{\mathrm{tr} \left(\mathtt{R}\right)-1}{2} \right).
\end{equation}
Note that $\log (\mathtt{R})^{\vee} = \varphi \, \mathbf {a}$, where $\mathbf {a}$ and $\varphi$ are the
rotation axis and the rotation angle of $\mathtt{R}$, respectively. If $\mathtt{R}=\mathbf {I}$, then $\varphi =0$ and
$\mathbf {a}$ is undetermined and can therefore
be chosen arbitrarily.
The exponential map is a bijection if restricted to the open ball $\Vert
\boldsymbol{\phi }\Vert < \pi$, and the corresponding inverse is the logarithm map.
However, if we do not restrict the domain, the exponential map is only surjective (it is no longer injective), since every vector $\boldsymbol{\phi }= (\varphi +2k\pi)\mathbf {a}$,
$k\in \mathbb {Z}$, would be an admissible
logarithm of $\mathtt{R}$.
For notational convenience, we adopt “vectorized” versions of the exponential and logarithm map
\begin{equation}
\begin{array}{cccccccc} \mathrm{Exp}:& \mathbb {R}^{3} &\rightarrow & \mathrm{SO}(3) &;&
\boldsymbol{\phi }&\mapsto & \exp (\boldsymbol{\phi }^\wedge)\\
\mathrm{Log}:& \mathrm{SO}(3) &\rightarrow & \mathbb {R}^{3} &;& \mathtt{R}&\mapsto & \log
(\mathtt{R})^\vee \end{array}
\end{equation}
which operate directly on vectors, rather than on skew symmetric matrices in $\mathfrak {so}(3)$.
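The vectorized maps can be implemented in a few lines; the sketch below (ours, reusing hat and vee from the earlier snippet) implements Exp via Rodrigues' formula (3) and Log via (5), with a guard for the small-angle case:
\begin{verbatim}
# Sketch (ours) of the vectorized maps Exp and Log in (6); reuses hat/vee.
import numpy as np

def Exp(phi):
    """Rodrigues' formula (3): R^3 -> SO(3)."""
    angle = np.linalg.norm(phi)
    if angle < 1e-10:
        return np.eye(3) + hat(phi)            # first-order approximation (4)
    W = hat(phi)
    return (np.eye(3) + np.sin(angle) / angle * W
            + (1.0 - np.cos(angle)) / angle**2 * W @ W)

def Log(R):
    """Logarithm map (5): SO(3) -> R^3, with rotation angle in [0, pi)."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if angle < 1e-10:
        return vee(R - np.eye(3))              # near the identity
    return vee(angle * (R - R.T) / (2.0 * np.sin(angle)))

phi = np.array([0.3, -0.2, 0.5])
assert np.allclose(Log(Exp(phi)), phi)         # bijective for ||phi|| < pi
\end{verbatim}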
Later, we will use the following first-order approximation:
\begin{equation}
\mathrm{Exp}(\boldsymbol{\phi }+\delta \boldsymbol{\phi }) \approx \mathrm{Exp}(\boldsymbol{\phi }) \;
\mathrm{Exp}(\mathtt{J}_{r}(\boldsymbol{\phi })\delta \boldsymbol{\phi }).
\end{equation}
The term $\mathtt{J}_{r}(\boldsymbol{\phi })$
is the right Jacobian of
$\mathrm{SO}(3)$ [43, p. 40] and relates additive increments in the tangent space to
multiplicative increments applied on the right-hand side (see Fig. 1)
\begin{equation}
\mathtt{J}_{r}(\boldsymbol{\phi }) = \mathbf {I}- \frac{1-\cos (\Vert \boldsymbol{\phi }\Vert)}{\Vert \boldsymbol{\phi
}\Vert ^2}\boldsymbol{\phi }^\wedge +\frac{\Vert \boldsymbol{\phi }\Vert - \sin (\Vert \boldsymbol{\phi }\Vert)}{\Vert
\boldsymbol{\phi }\Vert ^{3}}(\boldsymbol{\phi }^\wedge)^2.
\end{equation}
A similar first-order approximation holds for the logarithm
\begin{equation}
\mathrm{Log}\big (\; \mathrm{Exp}(\boldsymbol{\phi }) \; \mathrm{Exp}(\delta \boldsymbol{\phi }) \;\big)\approx
\boldsymbol{\phi }+\mathtt{J}_{r}^{-1}(\boldsymbol{\phi })\delta \boldsymbol{\phi }
\end{equation}
where the inverse of the right Jacobian is
\begin{equation*}
\mathtt{J}_{r}^{-1}(\boldsymbol{\phi }) = \mathbf {I}+ \frac{1}{2}\boldsymbol{\phi }^\wedge + \left(\frac{1}{\Vert
\boldsymbol{\phi }\Vert ^2} -\frac{1+\cos (\Vert \boldsymbol{\phi }\Vert)}{2\Vert \boldsymbol{\phi }\Vert \sin (\Vert
\boldsymbol{\phi }\Vert)} \right)(\boldsymbol{\phi }^\wedge)^2. \; \nonumber
\end{equation*}
The right Jacobian $\mathtt{J}_{r}(\boldsymbol{\phi })$
and its inverse
$\mathtt{J}_{r}^{-1}(\boldsymbol{\phi })$ reduce to the identity matrix for $\Vert \boldsymbol{\phi }\Vert =0$.
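A quick numeric sanity check (ours, reusing Exp, Log, and hat from the snippets above) of the right Jacobian (8) and the approximations (7) and (9):
\begin{verbatim}
# Numeric check (ours) of (7) and (9) using the right Jacobian (8).
import numpy as np

def Jr(phi):
    """Right Jacobian of SO(3), as in (8)."""
    angle = np.linalg.norm(phi)
    if angle < 1e-10:
        return np.eye(3)
    W = hat(phi)
    return (np.eye(3) - (1.0 - np.cos(angle)) / angle**2 * W
            + (angle - np.sin(angle)) / angle**3 * W @ W)

phi = np.array([0.4, -0.1, 0.2])
dphi = 1e-5 * np.array([1.0, -2.0, 0.5])
# (7): additive increment ~ multiplicative increment on the right.
assert np.allclose(Exp(phi + dphi), Exp(phi) @ Exp(Jr(phi) @ dphi), atol=1e-9)
# (9): Log(Exp(phi) Exp(dphi)) ~ phi + Jr(phi)^{-1} dphi.
assert np.allclose(Log(Exp(phi) @ Exp(dphi)),
                   phi + np.linalg.solve(Jr(phi), dphi), atol=1e-9)
\end{verbatim}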
Another useful property of the exponential map is
\begin{align}
\mathtt{R}\ \mathrm{Exp}(\boldsymbol{\phi})\ \mathtt{R}^{\mathsf {T}}&= \exp (\mathtt{R}\boldsymbol{\phi}^{\wedge}
\mathtt{R}^{\mathsf {T}}) = \mathrm{Exp}(\mathtt{R}\boldsymbol{\phi}) \\
\Leftrightarrow \quad \mathrm{Exp}(\boldsymbol{\phi})\ \mathtt{R}&= \mathtt{R}\ \mathrm{Exp}(\mathtt{R}^{\mathsf
{T}}\boldsymbol{\phi}).
\end{align}
2) Special Euclidean Group
$\mathrm{SE}(3)$ describes the group of rigid
motions in 3-D, which is the semidirect product of $\mathrm{SO}(3)$
and $\mathbb {R}^{3}$, and it is defined as $\mathrm{SE}(3) \doteq \lbrace
(\mathtt{R},\mathbf {p}) \; {:} \; \mathtt{R}\in \mathrm{SO}(3), \mathbf {p}\in \mathbb {R}^{3} \rbrace $. Given $\mathtt{T}_1, \mathtt{T}_2 \in \mathrm{SE}(3)$
, the group operation is $\mathtt{T}_1 \cdot
\mathtt{T}_2 = (\mathtt{R}_1 \mathtt{R}_2,\; \mathbf {p}_1 + \mathtt{R}_1 \mathbf {p}_2)$,
and the inverse is $\mathtt{T}_1^{-1}= (\mathtt{R}_1^{\mathsf {T}},\;
-\mathtt{R}_1^{\mathsf {T}}\mathbf {p}_1)$. The exponential map and the
logarithm map for $\mathrm{SE}(3)$ are defined in [44]. However, these are not needed in
this paper for reasons that will be clear in Section III-C.
B. Uncertainty Description in SO(3)
A natural definition of uncertainty in $\mathrm{SO}(3)$ is to define a distribution in the tangent space, and then map it to $\mathrm{SO}(3)$ via the exponential map
(6) [44],
[46], [47]:
\begin{equation}
\tilde{\mathtt{R}}=\mathtt{R}\, \mathrm{Exp}(\epsilon),\quad \epsilon \sim \mathcal{N} (0, \Sigma)
\end{equation}
where $\mathtt{R}$ is
a given noise-free rotation (the mean) and ${\epsilon}$
is a small normally distributed perturbation with zero mean and covariance ${\Sigma}$.
To obtain an explicit expression for the distribution of
$\tilde{\mathtt{R}}$, we start from the integral of the Gaussian distribution in
$\mathbb {R}^{3}$
\begin{equation}
\int _{\mathbb {R}^{3}} p\,(\epsilon) d\epsilon = \int _{\mathbb {R}^{3}} \alpha e^{-\frac{1}{2} \left\Vert \epsilon
\right\Vert ^2_{\Sigma } } \text{d}\epsilon = 1
\end{equation}
where $\alpha = 1 / \sqrt{(2\pi)^{3} \det
(\Sigma)}$ and $\left\Vert \epsilon \right\Vert
^2_{\Sigma } \doteq \epsilon ^{\mathsf {T}}\Sigma ^{-1}\epsilon$ is the squared
Mahalanobis distance with covariance $\Sigma$.
Then, applying the change of coordinates $\epsilon =
\mathrm{Log}(\mathtt{R}^{-1}\tilde{\mathtt{R}})$ (this is the inverse of
(12) when $\Vert \epsilon
\Vert < \pi$), the integral (13)
becomes
\begin{equation}
\int _{\mathrm{SO}(3)} \beta (\tilde{\mathtt{R}}) \; e^{-\frac{1}{2} \left\Vert
\mathrm{Log}(\mathtt{R}^{-1}\tilde{\mathtt{R}})\right\Vert ^2_{\Sigma } } \; \text{d}\tilde{\mathtt{R}}= 1
\end{equation}
where $\beta (\tilde{\mathtt{R}})$ is a normalization factor. The normalization factor assumes the form $\beta (\tilde{\mathtt{R}}) = \alpha / |\det ({\mathcal {J}}(\tilde{\mathtt{R}}))|$, where ${\mathcal J}(\tilde{\mathtt{R}}) \doteq
\mathtt{J}_{r}(\mathrm{Log}(\mathtt{R}^{-1}\tilde{\mathtt{R}}))$ and $\mathtt{J}_{r}(\cdot)$ is the right Jacobian
(8); ${\mathcal J}(\tilde{\mathtt{R}})$ is a by-product of the change of variables; see
[46] for a derivation.
From the argument of (14), we can directly read our
“Gaussian” distribution in $\mathrm{SO}(3)$ as
\begin{equation}
p\,(\tilde{\mathtt{R}}) = \beta (\tilde{\mathtt{R}}) \; e^{-\frac{1}{2} \left\Vert
\mathrm{Log}(\mathtt{R}^{-1}\tilde{\mathtt{R}})\right\Vert ^2_{\Sigma } }.
\end{equation}
For small covariances, we can approximate $\beta
\simeq \alpha$, as
$\mathtt{J}_{r}(\mathrm{Log}(\mathtt{R}^{-1}\tilde{\mathtt{R}}))$ is well approximated by
the identity matrix when $\tilde{\mathtt{R}}$
is close to $\mathtt{R}$. Note that
(14) already assumes a relatively small
covariance $\Sigma$, since it
“clips” the probability tails outside the open ball of radius
$\pi$ (this is due to the reparametrization
$\epsilon = \mathrm{Log}(\mathtt{R}^{-1}\tilde{\mathtt{R}})$, which restricts
$\epsilon$ to $\Vert \epsilon \Vert < \pi$). Approximating $\beta$ as a constant, the negative log-likelihood of a rotation
$\mathtt{R}$, given a measurement
$\tilde{\mathtt{R}}$ distributed as in
(15), is
\begin{eqnarray}
{\mathcal L}(\mathtt{R}) &=& \frac{1}{2} \left\Vert
\mathrm{Log}\left(\mathtt{R}^{-1}\tilde{\mathtt{R}}\right)\right\Vert ^2_{\Sigma } + \text{const} \nonumber\\
&=& \frac{1}{2} \left\Vert \mathrm{Log}(\tilde{\mathtt{R}}^{-1}\mathtt{R})\right\Vert ^2_{\Sigma } +
\text{const}
\end{eqnarray}
which geometrically can be interpreted as the squared angle (geodesic distance in
$\mathrm{SO}(3)$) between $\tilde{\mathtt{R}}$ and $\mathtt{R}$ weighted by the inverse uncertainty $\Sigma ^{-1}$.
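As an illustration (ours, reusing Exp and Log from the sketches above), the negative log-likelihood (16) can be evaluated directly from this geodesic interpretation:
\begin{verbatim}
# Sketch (ours) of the negative log-likelihood (16), up to a constant.
import numpy as np

def rotation_nll(R, R_meas, Sigma):
    r = Log(R.T @ R_meas)                 # residual in the tangent space
    return 0.5 * r @ np.linalg.solve(Sigma, r)

Sigma = 1e-4 * np.eye(3)                  # isotropic rotation noise (example)
R = Exp(np.array([0.1, 0.2, -0.3]))
R_meas = R @ Exp(np.array([0.01, 0.0, 0.0]))   # perturbed measurement
print(rotation_nll(R, R_meas, Sigma))     # 0.5 * 0.01^2 / 1e-4 = 0.5
\end{verbatim}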
C. Gauss–Newton Method on Manifold
A standard Gauss–Newton method in Euclidean space works by repeatedly optimizing a quadratic approximation of
the (generally nonconvex) objective function. Solving the quadratic approximation reduces to solving a set of linear
equations (normal equations), and the solution of this local approximation is used to update the
current estimate. Here, we recall how to extend this approach to (unconstrained) optimization problems whose variables
belong to some manifold ${\mathcal M}$.
Let us consider the optimization problem
\begin{equation}
\min _{x \in {\mathcal M}} f(x)
\end{equation}
where the variable $x$
belongs to a manifold ${\mathcal M}$; for the
sake of simplicity we consider a single variable in (17), while
the description easily generalizes to multiple variables.
Contrary to the Euclidean case, one cannot directly approximate
(17) as a quadratic function of $x$. This
is due to two main reasons. First, working directly on $x$ leads to an overparametrization of the problem (e.g., we parametrize a rotation matrix with nine
elements, while a 3-D rotation is completely defined by a vector in
$\mathbb {R}^{3}$) and this can make the normal equations underdetermined.
Second, the solution of the resulting approximation does not belong to
${\mathcal M}$ in general.
A standard approach for optimization on manifolds [45],
[48] consists of defining a retraction ${\mathcal R}_x$, which is a bijective map between an element
$\delta x$ of the tangent space (at
$x$) and a neighborhood of $x\in {\mathcal M}$. Using the retraction, we can reparametrize
our problem as
\begin{equation}
\min _{x \in {\mathcal M}} f(x) \quad \Rightarrow \quad \min _{\delta x \in \mathbb {R}^n} f({\mathcal R}_x(\delta x)).
\end{equation}
The reparametrization is usually called lifting
[45]. Roughly speaking, we work in the tangent space defined at the current
estimate, which locally behaves as a Euclidean space. The use of the retraction allows us to frame the optimization
problem over a Euclidean space of suitable dimension (e.g., $\delta x \in
\mathbb {R}^{3}$ when we work in
$\mathrm{SO}(3)$). We can now apply standard optimization techniques to the problem on the
right-hand side of (18). In the Gauss–Newton framework,
we build a quadratic approximation of the cost around the current estimate. Then, we solve the quadratic approximation to get a vector
$\delta x^\star$ in the tangent space. Finally,
the current guess on the manifold is updated as
\begin{equation}
\hat{x} \leftarrow {\mathcal R}_{\hat{x}} (\delta x^\star).
\end{equation}
This “lift-solve-retract” scheme can be generalized to any trust-region
method [45]. Moreover, it provides a grounded and unifying
generalization of the error state model, commonly used in the aerospace literature for
filtering [49] and recently adopted in robotics for optimization
[23], [34].
We conclude this section by discussing the choice of the retraction
${\mathcal R}_x$. A possible retraction is the exponential map. It is known that,
computationally, this may not be the most convenient choice (see [50]).
In this study, we use the following retraction for $\mathrm{SO}(3)$:
\begin{equation}
{\mathcal R}_\mathtt{R}(\delta \boldsymbol{\phi }) = \mathtt{R}\; \mathrm{Exp}(\delta \boldsymbol{\phi }), \qquad \delta
\boldsymbol{\phi }\in \mathbb {R}^{3}
\end{equation}
and for $\mathrm{SE}(3)$, we use the retraction at $\mathtt{T} \doteq
(\mathtt{R},\mathbf {p})$
\begin{equation}
{\mathcal R}_\mathtt{T}(\delta \boldsymbol{\phi }, \delta \mathbf {p}) = (\mathtt{R}\; \mathrm{Exp}(\delta
\boldsymbol{\phi }), \ \mathbf {p}+ \mathtt{R}\; \delta \mathbf {p}), \quad \left[ \delta \boldsymbol{\phi }\ \ \delta
\mathbf {p}\right] \in \mathbb {R}^6
\end{equation}
which explains why in Section III-A we only
defined the exponential map for $\mathrm{SO}(3)$
: with this choice of retraction, we never need to compute the exponential map for $\mathrm{SE}(3)$.
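To illustrate the lift-solve-retract scheme, the sketch below (ours, reusing the helpers from Section III-A) runs Gauss–Newton on a toy rotation-averaging problem, using the residual Jacobian from (9) and the retraction (20):
\begin{verbatim}
# Toy "lift-solve-retract" Gauss-Newton on SO(3) (ours): estimate one
# rotation from noisy rotation measurements R_m = R_true Exp(noise).
import numpy as np

def Jr_inv(phi):
    """Inverse right Jacobian of SO(3) (see Section III-A)."""
    angle = np.linalg.norm(phi)
    if angle < 1e-10:
        return np.eye(3)
    W = hat(phi)
    return (np.eye(3) + 0.5 * W
            + (1.0 / angle**2
               - (1.0 + np.cos(angle)) / (2.0 * angle * np.sin(angle))) * W @ W)

rng = np.random.default_rng(0)
R_true = Exp(np.array([0.5, -0.3, 0.8]))
meas = [R_true @ Exp(0.05 * rng.standard_normal(3)) for _ in range(20)]

R = np.eye(3)                                  # initial guess
for _ in range(10):                            # Gauss-Newton iterations
    H, g = np.zeros((3, 3)), np.zeros(3)
    for Rm in meas:
        r = Log(Rm.T @ R)                      # lift: residual in tangent space
        J = Jr_inv(r)                          # Jacobian of r w.r.t. dphi, by (9)
        H, g = H + J.T @ J, g + J.T @ r
    dphi = np.linalg.solve(H, -g)              # solve the normal equations
    R = R @ Exp(dphi)                          # retract, as in (20)
print(np.linalg.norm(Log(R_true.T @ R)))       # small residual angle
\end{verbatim}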
SECTION IV.
MAP Visual–Inertial State Estimation
We consider a VIO problem in which we want to track the state of a sensing system (e.g., a mobile
robot, a UAV, or a handheld device), equipped with an IMU and a monocular camera. We assume that the IMU frame “
$\text{B}$” coincides with the body frame
we want to track, and that the transformation between the camera and the IMU is fixed and known from prior calibration
(see Fig. 2). Furthermore, we assume that a front end
provides image measurements of 3-D landmarks at unknown positions. The front end also selects a subset of images,
called keyframes [32], for which we want to compute a
pose estimate. Section VIII-B1 discusses implementation aspects,
including the choice of the front end in our experiments.
A. State
The state of the system at time $i$ is
described by the IMU orientation, position, velocity, and biases
\begin{equation}
\mathbf {x}_i \doteq [\mathtt{R}_i, \mathbf {p}_i, \mathbf {v}_i, \mathbf {b}_i].
\end{equation}
The pose $(\mathtt{R}_i, \mathbf {p}_i)$ belongs to $\mathrm{SE}(3)$,
while velocities live in a vector space, i.e., $\mathbf {v}_i \in \mathbb
{R}^{3}$. IMU biases can be written as $\mathbf
{b}_i = [\mathbf {b}^g_i \;\; \mathbf {b}^a_i] \in \mathbb {R}^6$, where $\mathbf {b}^g_i, \mathbf {b}^a_i \in \mathbb {R}^{3}$ are the
gyroscope and accelerometer bias, respectively.
Let $\mathcal {K}_k$ denote the set of all
keyframes up to time $k$. In our approach, we
estimate the state of all keyframes
\begin{equation}
\mathcal {X}_k \doteq \lbrace \mathbf {x}_i \rbrace _{i \in \mathcal {K}_k}.
\end{equation}
In our implementation, we adopt a structureless approach (cf.,
Section VII), hence, the 3-D landmarks are not part of the variables to be
estimated. However, the proposed approach generalizes in a straightforward manner to also estimate the landmarks and
the camera intrinsic and extrinsic calibration parameters.
B. Measurements
The inputs to our estimation problem are the measurements from the camera and the IMU. We denote with $\mathcal {C}_i$ the image measurements at keyframe
$i$. At time $i$, the camera can observe multiple landmarks $l$, hence, $\mathcal
{C}_i$ contains multiple image measurements
$\mathbf {z}_{{\rm il}}$. With slight abuse of notation, we write $l \in \mathcal {C}_i$ when a landmark $l$ is seen at time
$i$.
We denote with $\mathcal {I}_{{\rm ij}}$ the
set of IMU measurements acquired between two consecutive keyframes $i$
and $j$. Depending
on the IMU measurement rate and the frequency of selected keyframes, each set $\mathcal {I}_{{\rm ij}}$ can contain from a small number to
hundreds of IMU measurements. The set of measurements collected up to time
$k$ is
\begin{equation}
\mathcal {Z}_k \doteq \lbrace \mathcal {C}_i, \mathcal {I}_{{\rm ij}}\rbrace _{(i,j)\in \mathcal {K}_k}.
\end{equation}
C. Factor Graphs and MAP Estimation
The posterior probability of the variables $\mathcal {X}_k$, given the available visual and inertial measurements
$\mathcal {Z}_k$ and priors $p\,(\mathcal
{X}_0)$ is
\begin{eqnarray}
p\,(\mathcal {X}_k | \mathcal {Z}_k) &\propto & \ p\,(\mathcal {X}_0) p\,(\mathcal {Z}_k | \mathcal {X}_k)
\overset{(a)}{=} p\,(\mathcal {X}_0) \prod _{(i,j)\in \mathcal {K}_k} p\,(\mathcal {C}_i, \mathcal {I}_{{\rm ij}}|
\mathcal {X}_k) \nonumber \\
&\overset{(b)}{=} & p\,(\mathcal {X}_0) \prod _{(i,j)\in \mathcal {K}_k} p\,(\mathcal {I}_{{\rm ij}}| \mathbf
{x}_i, \mathbf {x}_j) \prod _{i \in \mathcal {K}_k} \prod _{l \in \mathcal {C}_i} p\,(\mathbf {z}_{{\rm il}} | \mathbf
{x}_i).\quad
\end{eqnarray}
The factorizations (a) and (b) follow from a standard independence assumption among the
measurements. Furthermore, the Markovian property is applied in (b) (e.g., an image measurement at time
$i$ only depends on the state at time
$i$).
As the measurements $\mathcal {Z}_k$ are
known, we are free to eliminate them as variables and consider them as parameters of the joint probability factors
over the actual unknowns. This naturally leads to the well-known factor graph representation, a class of bipartite
graphical models that can be used to represent such factored densities [51],
[52]. A schematic representation of the connectivity of the factor graph
underlying the VIO problem is given in Fig. 3 (the connectivity of the
structureless vision factors will be clarified in Section VII). The factor
graph is composed of nodes for unknowns and nodes for the probability factors defined on them, and the graph structure
expresses which unknowns are involved in each factor.
The MAP estimate $\mathcal {X}_k^\star$
corresponds to the maximum of (25), or equivalently, the
minimum of the negative log-posterior. Under the assumption of zero-mean Gaussian noise, the negative log-posterior
can be written as a sum of squared residual errors
\begin{eqnarray}
\mathcal {X}_k^\star &\doteq & \arg \min _{\mathcal {X}_k} \; -\log _e \;p\,(\mathcal {X}_k | \mathcal {Z}_k)\\
&=& \arg \min _{\mathcal {X}_k} \; \Vert \mathbf {r}_{0}\Vert ^2_{\mathbf {\Sigma }_0} + \sum _{(i,j)\in
\mathcal {K}_k} \Vert \mathbf {r}_{\mathcal {I}_{{\rm ij}}} \Vert ^2_{\mathbf {\Sigma }_{{\rm ij}}} + \sum _{i \in
\mathcal {K}_k}\sum _{l \in \mathcal {C}_i} \Vert \mathbf {r}_{\mathcal {C}_{{\rm il}}} \Vert ^2_{\mathbf {\Sigma
}_{\mathcal {C}}} \nonumber
\end{eqnarray}
where $\mathbf {r}_{0}$, $\mathbf {r}_{\mathcal {I}_{{\rm ij}}}$, and $\mathbf {r}_{\mathcal {C}_{{\rm il}}}$ are the
residual errors associated with the measurements, and $\mathbf {\Sigma }_0$, $\mathbf {\Sigma }_{{\rm ij}}$, and $\mathbf {\Sigma }_{\mathcal {C}}$ are the corresponding covariance matrices. Roughly speaking, the residual error is a function of
$\mathcal {X}_k$ that quantifies the mismatch
between a measured quantity and the predicted value of this quantity given the state $\mathcal {X}_k$ and the priors. The goal of the following
sections is to provide expressions for the residual errors and the covariances.
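For concreteness, each term in (26) is a squared Mahalanobis norm of a residual; a toy evaluation (ours, with made-up numbers) follows, while the residuals themselves are derived in Sections VI and VII.
\begin{verbatim}
# Toy evaluation (ours) of one term of the MAP cost (26).
import numpy as np

def mahalanobis_sq(r, Sigma):
    """Squared Mahalanobis norm ||r||^2_Sigma = r^T Sigma^{-1} r."""
    return r @ np.linalg.solve(Sigma, r)

# Hypothetical residual and covariance, standing in for r_Iij, Sigma_ij.
r = np.array([0.01, -0.02, 0.005])
print(mahalanobis_sq(r, 1e-4 * np.eye(3)))
\end{verbatim}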
SECTION V.
IMU Model and Motion Integration
An IMU commonly includes a three-axis accelerometer and a three-axis gyroscope and allows us to measure the rotation
rate and the acceleration of the sensor with respect to an inertial frame. The measurements, namely ${}_\text{{B}}\tilde{\mathbf {a}}(t)$ and ${}_\text{{B}}\tilde{\boldsymbol{\omega }}_{\text{{W}}\text{{B}}}(t)$, are affected by additive white noise $\boldsymbol{\eta }$
and a slowly varying sensor bias $\mathbf {b}$
\begin{eqnarray}
{}_\text{{B}}\tilde{\boldsymbol{\omega }}_{\text{{W}}\text{{B}}}(t) &=& {}_\text{{B}}\boldsymbol{\omega
}_{\text{{W}}\text{{B}}}(t) + \mathbf {b}^g(t) + \boldsymbol{\eta }^{g}(t) \\
{}_\text{{B}}\tilde{\mathbf {a}}(t) &=& \mathtt{R}_{\text{{W}}\text{{B}}}^{\mathsf
{T}}(t)\left({}_\text{{W}}\mathbf {a}(t) - {}_\text{{W}}\mathbf {g}\right) + \mathbf {b}^a(t) + \boldsymbol{\eta
}^{a}(t). \quad
\end{eqnarray}
In our notation, the prefix $\text{B}$ denotes that the corresponding quantity is expressed in the frame $\text{B}$ (cf., Fig. 2
). The pose of the IMU is described by the transformation $\lbrace
\mathtt{R}_{\text{{W}}\text{{B}}}, {}_\text{{W}}\mathbf {p}\rbrace$, which maps a point
from sensor frame $\text{B}$ to $\text{W}$. The vector ${}_\text{{B}}\boldsymbol{\omega }_{\text{{W}}\text{{B}}}(t) \in \mathbb {R}^{3}$ is the instantaneous angular velocity of $\text{B}$
relative to $\text{W}$ expressed in coordinate frame $\text{B}$, while ${}_\text{{W}}\mathbf {a}(t) \in \mathbb {R}^{3}$
is the acceleration of the sensor;
${}_\text{{W}}\mathbf {g}$ is the gravity vector in world coordinates. We neglect the
effects due to Earth's rotation, which amounts to assuming that
$\text{W}$ is an inertial frame.
The goal now is to infer the motion of the system from IMU measurements. For this purpose, we introduce the
following kinematic model [49], [53]:
\begin{equation}
\dot{\mathtt{R}}_{\text{{W}}\text{{B}}} = \mathtt{R}_{\text{{W}}\text{{B}}} \ {}_\text{{B}}\boldsymbol{\omega
}_{\text{{W}}\text{{B}}}^\wedge, \qquad {}_\text{{W}}\dot{\mathbf {v}}= {}_\text{{W}}\mathbf {a}, \qquad
{}_\text{{W}}\dot{\mathbf {p}}= {}_\text{{W}}\mathbf {v}
\end{equation}
which describes the evolution of the pose and velocity of $\text{B}$.
The state at time $t+\Delta t$ is obtained by
integrating (29):
\begin{align*}
\mathtt{R}_{\text{{W}}\text{{B}}}(t+\Delta t) &= \mathtt{R}_{\text{{W}}\text{{B}}}(t) \; \mathrm{Exp}\left(\int
_t^{t+\Delta t} {}_\text{{B}}\boldsymbol{\omega }_{\text{{W}}\text{{B}}}(\tau) d\tau \right)\\
{}_\text{{W}}\mathbf {v}(t+\Delta t) &= {}_\text{{W}}\mathbf {v}(t) + \int _t^{t+\Delta t} {}_\text{{W}}\mathbf
{a}(\tau) d\tau \nonumber \\
{}_\text{{W}}\mathbf {p}(t+\Delta t) &= {}_\text{{W}}\mathbf {p}(t) + \int _t^{t+\Delta t} {}_\text{{W}}\mathbf
{v}(\tau) d\tau + \iint _t^{t+\Delta t} {}_\text{{W}}\mathbf {a}(\tau) d\tau ^2.
\end{align*}
Assuming that ${}_\text{{W}}\mathbf {a}$ and ${}_\text{{B}}\boldsymbol{\omega
}_{\text{{W}}\text{{B}}}$ remain constant in the time interval $[t,t+\Delta t]$, we can write
\begin{align}
\mathtt{R}_{\text{{W}}\text{{B}}}(t+\Delta t) &= \mathtt{R}_{\text{{W}}\text{{B}}}(t) \;
\mathrm{Exp}\left({}_\text{{B}}\boldsymbol{\omega }_{\text{{W}}\text{{B}}}(t) \Delta t\right) \nonumber \\
{}_\text{{W}}\mathbf {v}(t+\Delta t) &= {}_\text{{W}}\mathbf {v}(t) + {}_\text{{W}}\mathbf {a}(t) \Delta t
\nonumber \\
{}_\text{{W}}\mathbf {p}(t+\Delta t) &= {}_\text{{W}}\mathbf {p}(t) + {}_\text{{W}}\mathbf {v}(t) \Delta t +
\frac{1}{2}\,{}_\text{{W}}\mathbf {a}(t) \Delta t^2.
\end{align}
Using (27) and
(28), we can write
${}_\text{{W}}\mathbf {a}$ and
${}_\text{{B}}\boldsymbol{\omega }_{\text{{W}}\text{{B}}}$ as a function of the IMU
measurements, hence, (30) becomes
\begin{align}
\mathtt{R}(t+\Delta t) &=\ \mathtt{R}(t) \ \mathrm{Exp}\left(\left(\tilde{\boldsymbol{\omega }}(t) - \mathbf
{b}^g(t)-\boldsymbol{\eta }^{{\rm gd}}(t) \right) \Delta t\right) \nonumber \\
\mathbf {v}(t+\Delta t) &=\ \mathbf {v}(t) + \mathbf {g}\Delta t + \mathtt{R}(t) \left(\tilde{\mathbf {a}}(t)-
\mathbf {b}^a(t) - \boldsymbol{\eta }^{{\rm ad}}(t)\right) \Delta t \nonumber \\
\mathbf {p}(t+\Delta t) &=\ \mathbf {p}(t) + \mathbf {v}(t) \Delta t + \frac{1}{2}\,\mathbf {g}\Delta t^2 \nonumber
\\
&\quad + \frac{1}{2}\,\mathtt{R}(t) \left(\tilde{\mathbf {a}}(t)- \mathbf {b}^a(t) - \boldsymbol{\eta }^{{\rm
ad}}(t)\right) \Delta t^2
\end{align}
where we dropped the coordinate frame subscripts for readability (the notation should be
unambiguous from now on). This numeric integration of the velocity and position assumes a constant orientation
$\mathtt{R}(t)$ for the time of integration
between two measurements, which is not an exact solution of the differential equation
(29) for measurements with nonzero rotation rate. In practice,
the use of a high-rate IMU mitigates the effects of this approximation. We adopt the integration scheme
(31) as it is simple and amenable to modeling and uncertainty
propagation. While we show that this integration scheme performs very well in practice, we remark that for slower IMU
measurement rates, one may consider using higher order numerical integration methods
[54]–[57].
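One step of the scheme (31), for noise-free measurements with known biases, reads as follows (our sketch, reusing Exp from Section III):
\begin{verbatim}
# One Euler step of (31) (ours), with noise terms omitted for clarity.
import numpy as np

def integrate_step(R, v, p, a_meas, w_meas, b_a, b_g, g, dt):
    """Propagate (R, v, p) over one IMU sample of duration dt, as in (31)."""
    a = a_meas - b_a                    # bias-corrected specific force (body)
    w = w_meas - b_g                    # bias-corrected rotation rate (body)
    p_new = p + v * dt + 0.5 * g * dt**2 + 0.5 * (R @ a) * dt**2
    v_new = v + g * dt + (R @ a) * dt
    R_new = R @ Exp(w * dt)             # constant rotation rate over dt
    return R_new, v_new, p_new

g = np.array([0.0, 0.0, -9.81])         # gravity in the world frame
R, v, p = np.eye(3), np.zeros(3), np.zeros(3)
# Hovering: the accelerometer measures -g in the body frame, so v, p stay 0.
R, v, p = integrate_step(R, v, p, np.array([0.0, 0.0, 9.81]), np.zeros(3),
                         np.zeros(3), np.zeros(3), g, 0.005)
\end{verbatim}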
The covariance of the discrete-time noise $\boldsymbol{\eta }^{{\rm gd}}$
is a function of the sampling rate and relates to the continuous-time spectral noise
$\boldsymbol{\eta }^{g}$ via $\text{Cov}(\boldsymbol{\eta }^{{\rm gd}}(t)) = \frac{1}{\Delta
t}\text{Cov}(\boldsymbol{\eta }^{g}(t))$. The same relation holds for $\boldsymbol{\eta }^{{\rm ad}}$ (cf., [58, Appendix]).
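For example (ours, with made-up numbers), a gyroscope with continuous-time noise density $\sigma_g$ sampled every $\Delta t$ seconds has discrete-time covariance $(\sigma_g^2/\Delta t)\,\mathbf{I}$:
\begin{verbatim}
# Discretizing the gyroscope noise (ours; example values, not from the paper).
import numpy as np
sigma_g, dt = 1.7e-4, 0.005                 # rad/s/sqrt(Hz), 200-Hz IMU
cov_eta_gd = (sigma_g**2 / dt) * np.eye(3)  # Cov(eta_gd) = Cov(eta_g) / dt
\end{verbatim}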
SECTION VI.
IMU Preintegration on Manifold
While (31) could be readily seen as a probabilistic constraint in
a factor graph, it would require including states in the factor graph at a high rate. Intuitively,
(31) relates states at times $t$ and $t+\Delta t$,
where $\Delta t$
is the sampling period of the IMU; hence, we would have to add new states to the estimation at every new IMU
measurement [37].
Here, we show that all measurements between two keyframes at times $k=i$
and $k=j$ (see
Fig. 4) can be summarized in a single compound measurement, named
preintegrated IMU measurement, which constrains the motion between the consecutive keyframes. This concept was
first proposed in [2] using Euler angles; we extend it by developing a
suitable theory for preintegration on the manifold $\mathrm{SO}(3)$.
We assume that the IMU is synchronized with the camera and provides measurements at discrete times
$k$ (cf.,
Fig. 4). Iterating the IMU integration (31) for all
$\Delta t$ intervals between two consecutive
keyframes at times $k=i$ and $k=j$ (cf., Fig. 4),
we find
\begin{align}
\mathtt{R}_j =&\ \mathtt{R}_i \prod _{k=i}^{j-1}\mathrm{Exp}\left(\left(\tilde{\boldsymbol{\omega }}_k - \mathbf
{b}^g_k-\boldsymbol{\eta }^{{\rm gd}}_k \right) \Delta t\right) \\
\mathbf {v}_j =&\ \mathbf {v}_i + \mathbf {g}\Delta t_{{\rm ij}} + \sum _{k=i}^{j-1} \mathtt{R}_k \Big
(\tilde{\mathbf {a}}_k-\mathbf {b}^a_k-\boldsymbol{\eta }^{{\rm ad}}_k \Big)\Delta t \nonumber\\
\mathbf {p}_j =&\ \mathbf {p}_i + \sum _{k=i}^{j-1} \Big [ \mathbf {v}_k \Delta t + \frac{1}{2}\,\mathbf {g}\Delta
t^2 + \frac{1}{2}\,\mathtt{R}_k \Big (\tilde{\mathbf {a}}_k - \mathbf {b}^a_k - \boldsymbol{\eta }^{{\rm ad}}_k
\Big)\Delta t^2 \Big ] \nonumber
\end{align}
where we introduced the shorthands $\Delta t_{{\rm
ij}}\doteq \sum _{k=i}^{j-1}\Delta t$ and
$(\cdot)_i\doteq (\cdot)(t_i)$ for readability. While
(32) already provides an estimate of the motion between time
$t_i$ and $t_j$, it has the drawback that the integration in
(32) has to be repeated whenever the linearization point at time
$t_i$ changes
[24] [intuitively, a change in the rotation $\mathtt{R}_i$ implies a change in all future rotations
$\mathtt{R}_k$, $k=i,\ldots,j-1$, and makes it necessary to re-evaluate the summations
and products in (32)].
We want to avoid recomputing the above integration whenever the linearization point at time $t_i$ changes. Therefore, we follow
[2] and define the following relative motion increments that
are independent of the pose and velocity at $t_i$:
\begin{align}
\Delta\, \mathtt{R}_{{\rm ij}} &\doteq \mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j = \prod
_{k=i}^{j-1}\mathrm{Exp}\left(\left(\tilde{\boldsymbol{\omega }}_k - \mathbf {b}^g_k-\boldsymbol{\eta }^{{\rm gd}}_k
\right) \Delta t\right) \nonumber \\
\Delta \mathbf {v}_{{\rm ij}} &\doteq \mathtt{R}_i^{\mathsf {T}}\left(\mathbf {v}_j - \mathbf {v}_i - \mathbf
{g}\Delta t_{{\rm ij}}\right) = \sum _{k=i}^{j-1} \Delta\, \mathtt{R}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf
{b}^a_k - \boldsymbol{\eta }^{{\rm ad}}_k\right)\Delta t \nonumber \\
\Delta \mathbf {p}_{{\rm ij}} &\doteq \mathtt{R}_i^{\mathsf {T}}\left(\mathbf {p}_j-\mathbf {p}_i - \mathbf
{v}_i\Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2\right) \nonumber \\
&= \sum _{k=i}^{j-1} \left[ \Delta \mathbf {v}_{{\rm ik}}\Delta t + \frac{1}{2}\Delta\, \mathtt{R}_{{\rm
ik}}\left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_k - \boldsymbol{\eta }^{{\rm ad}}_k\right)\Delta t^2 \right]
\end{align}
where $\Delta\, \mathtt{R}_{{\rm ik}} \doteq
\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_k$ and
$\Delta \mathbf {v}_{{\rm ik}} \doteq \mathtt{R}_i^{\mathsf {T}}\left(\mathbf {v}_k - \mathbf {v}_i - \mathbf {g}\Delta
t_{{\rm ik}}\right)$. We highlight that, in contrast to the “delta” rotation
$\Delta\, \mathtt{R}_{{\rm ij}}$, neither
$\Delta \mathbf {v}_{{\rm ij}}$ nor
$\Delta \mathbf {p}_{{\rm ij}}$ correspond to
the true physical change in velocity and position; rather, they are defined in a way that makes the right-hand
side of (33) independent of the state at time $i$ as well as of gravitational effects. Indeed, we will be able to
compute the right-hand side of (33) directly from the inertial
measurements between the two keyframes.
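The following sketch (ours; noise- and bias-free for simplicity, reusing Exp from Section III) verifies numerically that the increments in (33), computed only from the measurements, match their definitions in terms of the states:
\begin{verbatim}
# Numeric check (ours) that the increments (33) depend only on the
# measurements; noise- and bias-free for simplicity. Reuses Exp.
import numpy as np

rng = np.random.default_rng(1)
dt, N = 0.005, 50
g = np.array([0.0, 0.0, -9.81])
w_meas = [0.2 * rng.standard_normal(3) for _ in range(N)]  # gyro samples
a_meas = [1.0 * rng.standard_normal(3) for _ in range(N)]  # specific force

# Integrate the full state (31)/(32) from an arbitrary initial condition.
R = Exp(rng.standard_normal(3))
v, p = rng.standard_normal(3), rng.standard_normal(3)
R_i, v_i, p_i = R.copy(), v.copy(), p.copy()
for w, a in zip(w_meas, a_meas):
    p = p + v * dt + 0.5 * g * dt**2 + 0.5 * (R @ a) * dt**2
    v = v + g * dt + (R @ a) * dt
    R = R @ Exp(w * dt)

# Preintegrate the same measurements, never touching (R_i, v_i, p_i) (33).
dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
for w, a in zip(w_meas, a_meas):
    dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
    dv = dv + (dR @ a) * dt
    dR = dR @ Exp(w * dt)

Dt = N * dt
assert np.allclose(dR, R_i.T @ R)
assert np.allclose(dv, R_i.T @ (v - v_i - g * Dt))
assert np.allclose(dp, R_i.T @ (p - p_i - v_i * Dt - 0.5 * g * Dt**2))
\end{verbatim}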
Unfortunately, the summations and products in (33) are still a
function of the bias estimate. We tackle this problem in two steps. In
Section VI-A, we assume $\mathbf {b}_i$ is known; then, in Section VI-C we show how to avoid
repeating the integration when the bias estimate changes.
In the rest of the paper, we assume that the bias remains constant between two keyframes:
\begin{equation}
\mathbf {b}^g_i = \mathbf {b}^g_{i+1} = \cdots = \mathbf {b}^g_{j-1}, \qquad
\mathbf {b}^a_i = \mathbf {b}^a_{i+1} = \cdots = \mathbf {b}^a_{j-1}.
\end{equation}
A. Preintegrated IMU Measurements
Equation (33) relates the states of keyframes $i$ and $j$ (left-hand side) to the measurements (right-hand side). In this sense, it can already be understood
as a measurement model. Unfortunately, it has a fairly intricate dependence on the measurement noise, and this
complicates a direct application of MAP estimation; intuitively, the MAP estimator requires to clearly define the
densities (and their log-likelihood) of the measurements. In this section, we manipulate
(33) so as to make the derivation of the measurement
log-likelihood easier. In practice, we isolate the noise terms of the individual inertial measurements in
(33). As discussed above, throughout this section we assume that the bias at
time $t_i$ is known.
Let us start with the rotation increment $\Delta\, \mathtt{R}_{{\rm ij}}$
in (33). We use the first-order
approximation (7) (rotation noise is “small”) and
rearrange the terms, by “moving” the noise to the end, using the relation
(11):
\begin{align}
\Delta\, \mathtt{R}_{{\rm ij}} &\stackrel{\text{eq.}(7)}{\simeq } \prod _{k=i}^{j-1} \left[
\mathrm{Exp}\left(\left(\tilde{\boldsymbol{\omega }}_k - \mathbf {b}^g_i \right) \Delta t\right) \mathrm{Exp}\left(-
\mathtt{J}^k_{r}\; \boldsymbol{\eta }^{{\rm gd}}_k \; \Delta t \right) \right] \nonumber \\
&\stackrel{\text{eq.}(11)}{=} \Delta\, \tilde{\mathtt{R}}_{{\rm ij}} \prod _{k=i}^{j-1} \mathrm{Exp}\left(-
\Delta\, \tilde{\mathtt{R}}_{k+1j}^{\mathsf {T}}\; \mathtt{J}^k_{r}\; \boldsymbol{\eta }^{{\rm gd}}_k \; \Delta
t\right) \nonumber \\
& \ \doteq \Delta\, \tilde{\mathtt{R}}_{{\rm ij}} \mathrm{Exp}\left(- \delta \boldsymbol{\phi }_{{\rm ij}} \right)
\end{align}
with $\mathtt{J}^k_{r}\doteq
\mathtt{J}_{r}((\tilde{\boldsymbol{\omega }}_k - \mathbf {b}^g_i)\Delta t)$. In the last
line of (35), we defined the preintegrated rotation
measurement $\Delta\, \tilde{\mathtt{R}}_{{\rm ij}} \doteq \prod
_{k=i}^{j-1} \mathrm{Exp}\left(\left(\tilde{\boldsymbol{\omega }}_k - \mathbf {b}^g_i \right) \Delta t\right)$
, and its noise $\delta \boldsymbol{\phi }_{{\rm
ij}}$, which will be further analyzed in the next section.
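A numeric check (ours, reusing Exp and Jr from the earlier sketches) confirms the first-order model (35) for small gyroscope noise:
\begin{verbatim}
# Numeric check (ours) of the first-order noise model (35). Reuses Exp, Jr.
import numpy as np

rng = np.random.default_rng(2)
dt, N = 0.005, 20
w = [0.3 * rng.standard_normal(3) for _ in range(N)]     # omega_k - b_i^g
eta = [1e-5 * rng.standard_normal(3) for _ in range(N)]  # gyro noise eta_k

# Suffix products: suffix[k] = prod_{m=k}^{N-1} Exp(w_m dt), so that
# Delta Rtilde_{k+1,j} in (35) is suffix[k+1].
suffix = [np.eye(3)]
for wk in reversed(w):
    suffix.insert(0, Exp(wk * dt) @ suffix[0])

dphi = sum(suffix[k + 1].T @ Jr(w[k] * dt) @ eta[k] * dt for k in range(N))

dR_noisy = np.eye(3)
for wk, nk in zip(w, eta):
    dR_noisy = dR_noisy @ Exp((wk - nk) * dt)            # noisy increments

assert np.allclose(dR_noisy, suffix[0] @ Exp(-dphi), atol=1e-8)
\end{verbatim}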
Substituting (35) back into the expression of $\Delta \mathbf {v}_{{\rm ij}}$ in
(33), using the first-order approximation
(4) for
$\mathrm{Exp}\left(- \delta \boldsymbol{\phi }_{{\rm ij}} \right)$, and dropping
higher-order noise terms, we obtain
\begin{align}
\Delta \mathbf {v}_{{\rm ij}} &\stackrel{\text{eq.}(4)}{\simeq } \sum _{k=i}^{j-1} \Delta\,
\tilde{\mathtt{R}}_{{\rm ik}}({\mathbf I}- \delta \boldsymbol{\phi }_{{\rm ik}}^\wedge) \left(\tilde{\mathbf {a}}_k -
\mathbf {b}^a_i \right)\Delta t - \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta t
\nonumber \\
& \overset{\text{eq.} (2)}{=} \Delta \tilde{\mathbf {v}}_{{\rm ij}} + \sum _{k=i}^{j-1} \left[ \Delta\,
\tilde{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i \right)^\wedge \delta \boldsymbol{\phi
}_{{\rm ik}} \Delta t - \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta t \right]
\nonumber \\
&\;\;\doteq \Delta \tilde{\mathbf {v}}_{{\rm ij}} - \delta \mathbf {v}_{{\rm ij}}
\end{align}
where we defined the preintegrated velocity measurement $\Delta \tilde{\mathbf {v}}_{{\rm ij}}\doteq \sum _{k=i}^{j-1} \Delta\,
\tilde{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i \right) \Delta t$ and its noise $\delta \mathbf {v}_{{\rm ij}}$
.
Similarly, substituting (35) and
(36) in the expression of
$\Delta \mathbf {p}_{{\rm ij}}$ in (33)
, and using the first-order approximation (4), we obtain
\begin{align}
\Delta \mathbf {p}_{{\rm ij}} & \stackrel{\text{eq.}(4)}{\simeq } \sum _{k=i}^{j-1} \Big [ (\Delta \tilde{\mathbf
{v}}_{{\rm ik}} - \delta \mathbf {v}_{{\rm ik}})\Delta t + \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ik}}({\mathbf
I}- \delta \boldsymbol{\phi }_{{\rm ik}}^\wedge) \nonumber \\
& \quad \left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i\right)\Delta t^2 - \frac{1}{2} \Delta\,
\tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta t^2 \Big ] \nonumber \\
& \overset{\text{eq.} (2)}{=} \Delta \tilde{\mathbf {p}}_{{\rm ij}} + \sum _{k=i}^{j-1} \Big [ - \delta \mathbf
{v}_{{\rm ik}}\Delta t + \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf
{b}^a_i \right)^\wedge \delta \boldsymbol{\phi }_{{\rm ik}} \Delta t^2 \nonumber \\
& \qquad - \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta t^2 \Big ]
\nonumber \\
& \doteq \Delta \tilde{\mathbf {p}}_{{\rm ij}} - \delta \mathbf {p}_{{\rm ij}}
\end{align}
where we defined the preintegrated position measurement $\Delta \tilde{\mathbf {p}}_{{\rm ij}}$ and its noise
$\delta \mathbf {p}_{{\rm ij}}$.
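All three preintegrated measurements can be accumulated in a single pass over the raw IMU samples. The sketch below (ours; the sample values are hypothetical) implements the definitions of $\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}$, $\Delta \tilde{\mathbf {v}}_{{\rm ij}}$, and $\Delta \tilde{\mathbf {p}}_{{\rm ij}}$ directly; note that the position and velocity must be updated before the rotation, so that each step uses the quantities indexed ${\rm ik}$:

```python
import numpy as np
from scipy.spatial.transform import Rotation  # compact SO(3) exponential

def preintegrate(gyro, acc, b_g, b_a, dt):
    """Accumulates the preintegrated measurements defined in (35)-(37).
    Returns (dR, dv, dp) between keyframes i and j."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, acc):
        a_corr = a - b_a
        # dp and dv use the rotation/velocity accumulated so far (index ik).
        dp = dp + dv * dt + 0.5 * dR @ a_corr * dt**2
        dv = dv + dR @ a_corr * dt
        dR = dR @ Rotation.from_rotvec((w - b_g) * dt).as_matrix()
    return dR, dv, dp

# Hypothetical IMU samples between two keyframes (200 Hz, 1 s apart).
N, dt = 200, 0.005
gyro = [np.array([0.0, 0.0, 0.5])] * N
acc = [np.array([0.1, 0.0, 9.9])] * N
dR, dv, dp = preintegrate(gyro, acc, b_g=np.zeros(3), b_a=np.zeros(3), dt=dt)
```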
Substituting the expressions (35),
(36), and (37) back
in the original definition of $\Delta\, \mathtt{R}_{{\rm ij}},\Delta
\mathbf {v}_{{\rm ij}},\Delta \mathbf {p}_{{\rm ij}}$ in
(33), we finally get our preintegrated measurement model
(remember $\mathrm{Exp}\left(-\delta \boldsymbol{\phi }_{{\rm ij}}
\right)^{\mathsf {T}}= \mathrm{Exp}\left(\delta \boldsymbol{\phi }_{{\rm ij}} \right)$)
\begin{align}
\Delta\, \tilde{\mathtt{R}}_{{\rm ij}} &= \mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j \mathrm{Exp}\left(\delta
\boldsymbol{\phi }_{{\rm ij}} \right) \nonumber \\
\Delta \tilde{\mathbf {v}}_{{\rm ij}} &= \mathtt{R}_i^{\mathsf {T}}\left(\mathbf {v}_j - \mathbf {v}_i - \mathbf
{g}\Delta t_{{\rm ij}}\right) + \delta \mathbf {v}_{{\rm ij}} \nonumber \\
\Delta \tilde{\mathbf {p}}_{{\rm ij}} &= \mathtt{R}_i^{\mathsf {T}}\left(\mathbf {p}_j-\mathbf {p}_i - \mathbf
{v}_i\Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2\right) + \delta \mathbf {p}_{{\rm ij}}
\end{align}
where our compound measurements are written as a function of the (to-be-estimated) state
“plus” a random noise, described by the random vector $[\delta
\boldsymbol{\phi }_{{\rm ij}}^{\mathsf {T}}, \delta \mathbf {v}_{{\rm ij}}^{\mathsf {T}}, \delta \mathbf {p}_{{\rm
ij}}^{\mathsf {T}}]^{\mathsf {T}}$.
To wrap up the discussion in this section, we manipulated the measurement model (33) and rewrote it as (38). The advantage of (38) is that, for a suitable distribution of the noise, it makes the definition of the log-likelihood straightforward. For instance, the (negative) log-likelihood of measurements with zero-mean additive Gaussian noise [last two lines in (38)] is a quadratic function. Similarly, if $\delta \boldsymbol{\phi }_{{\rm ij}}$ is a zero-mean Gaussian noise, we can readily compute the (negative) log-likelihood associated with $\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}$. The nature of the noise terms is discussed in the following section.
B. Noise Propagation
In this section, we derive the statistics of the noise vector $[\delta
\boldsymbol{\phi }_{{\rm ij}}^{\mathsf {T}}, \delta \mathbf {v}_{{\rm ij}}^{\mathsf {T}}, \delta \mathbf {p}_{{\rm
ij}}^{\mathsf {T}}]^{\mathsf {T}}$. While we already observed that it is convenient to approximate the noise vector as zero-mean and normally distributed, it is of paramount importance to accurately model the noise covariance. Indeed, the noise covariance has a strong influence on the MAP estimator [the inverse noise
covariance is used to weight the terms in the optimization (26)]. In
this section, we therefore provide a derivation of the covariance $\mathbf
{\Sigma }_{{\rm ij}}$ of the preintegrated measurements
\begin{equation}
\boldsymbol{\eta }^{\Delta }_{{\rm ij}} \doteq \left[\delta \boldsymbol{\phi }_{{\rm ij}}^{\mathsf {T}}, \delta \mathbf
{v}_{{\rm ij}}^{\mathsf {T}}, \delta \mathbf {p}_{{\rm ij}}^{\mathsf {T}}\right]^{\mathsf {T}}\sim {\mathcal
N}(\mathbf {0}_{9 \times 1}, \mathbf {\Sigma }_{{\rm ij}}).
\end{equation}
We first consider the preintegrated rotation noise
$\delta \boldsymbol{\phi }_{{\rm ij}}$. Recall from
(35) that
\begin{equation}
\mathrm{Exp}\left(- \delta \boldsymbol{\phi }_{{\rm ij}} \right) \doteq \prod _{k=i}^{j-1} \mathrm{Exp}\left(- \Delta\,
\tilde{\mathtt{R}}_{k+1j}^{\mathsf {T}}\mathtt{J}^k_{r}\; \boldsymbol{\eta }^{{\rm gd}}_k \; \Delta t\right).
\end{equation}
Taking the $\mathrm{Log}$ on both sides and changing signs, we get
\begin{equation}
\delta \boldsymbol{\phi }_{{\rm ij}} = - \mathrm{Log}\left(\prod _{k=i}^{j-1} \mathrm{Exp}\left(- \Delta\,
\tilde{\mathtt{R}}_{k+1j}^{\mathsf {T}}\mathtt{J}^k_{r}\; \boldsymbol{\eta }^{{\rm gd}}_k \; \Delta t\right) \right).
\end{equation}
Repeated application of the first-order approximation
(9) (recall that
$\boldsymbol{\eta }^{{\rm gd}}_k$ as well as
$\delta \boldsymbol{\phi }_{{\rm ij}}$ are small rotation noises, hence the right Jacobians
are close to the identity) produces
\begin{equation}
\delta \boldsymbol{\phi }_{{\rm ij}} \simeq \sum _{k=i}^{j-1} \Delta\, \tilde{\mathtt{R}}_{k+1j}^{\mathsf {T}}\;
\mathtt{J}^k_{r}\; \boldsymbol{\eta }^{{\rm gd}}_k \; \Delta t.
\end{equation}
Up to first order, the noise $\delta
\boldsymbol{\phi }_{{\rm ij}}$ is zero-mean and Gaussian, as it is a linear combination of
zero-mean noise terms $\boldsymbol{\eta }^{{\rm gd}}_k$. This is desirable, since it brings the rotation measurement model
(38) exactly into the form (12).
Dealing with the noise terms $\delta \mathbf {v}_{{\rm ij}}$ and $\delta \mathbf {p}_{{\rm ij}}$ is now easy: these are linear combinations of the acceleration noise $\boldsymbol{\eta }^{{\rm ad}}_k$ and the preintegrated rotation
noise $\delta \boldsymbol{\phi }_{{\rm ij}}$,
hence they are also zero-mean and Gaussian. Simple manipulation leads to
\begin{eqnarray}
\delta \mathbf {v}_{{\rm ij}} &\simeq& \sum _{k=i}^{j-1} \left[ - \Delta\, \tilde{\mathtt{R}}_{{\rm ik}}
\left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i \right)^\wedge \delta \boldsymbol{\phi }_{{\rm ik}} \Delta t + \Delta\,
\tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta t \right] \nonumber\\
\delta \mathbf {p}_{{\rm ij}} &\simeq& \sum _{k=i}^{j-1} \left[ \delta \mathbf {v}_{{\rm ik}}\Delta t -
\frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i \right)^\wedge \delta
\boldsymbol{\phi }_{{\rm ik}} \Delta t^2 \right.\nonumber\\
&&\left.+\, \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta}^{{\rm ad}}_k \Delta t^2 \right]
\end{eqnarray}
where the relations are valid up to the first order.
Equations (42) and (43) express the preintegrated noise $\boldsymbol{\eta }^{\Delta }_{{\rm ij}}$ as a linear function of the IMU measurement noise $\boldsymbol{\eta }^{d}_k \doteq [\boldsymbol{\eta }^{{\rm gd}}_k, \boldsymbol{\eta }^{{\rm ad}}_k]$, $k=i,\ldots,j-1$. Therefore, from the knowledge of the covariance of $\boldsymbol{\eta }^{d}_k$ (given in the IMU specifications), we can compute the covariance of $\boldsymbol{\eta }^{\Delta }_{{\rm ij}}$, namely $\mathbf {\Sigma }_{{\rm ij}}$, by simple linear propagation.
In Appendix IX-A, we provide a more convenient way to compute $\mathbf {\Sigma }_{{\rm ij}}$. In particular, we show that $\mathbf {\Sigma }_{{\rm ij}}$ can be computed in iterative form: as a new IMU measurement arrives, we simply update $\mathbf {\Sigma }_{{\rm ij}}$, rather than recomputing it from scratch. The iterative computation leads to simpler expressions and is better suited to online inference.
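A sketch of one such iterative update is given below (ours; the exact expressions are derived in Appendix IX-A). The $9\times 9$ linearization matrices simply restate the recursions (42)--(43) with state ordering $[\delta \boldsymbol{\phi }, \delta \mathbf {v}, \delta \mathbf {p}]$; the discrete-time gyroscope/accelerometer covariances Qg, Qa are placeholders:

```python
import numpy as np

def hat(v):
    """Maps a 3-vector to the corresponding skew-symmetric matrix."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def propagate_cov(Sigma, dR_ik, a_corr, Jr, dR_step, Qg, Qa, dt):
    """One-step update of the 9x9 preintegrated noise covariance.
    dR_ik is the rotation preintegrated so far, a_corr the bias-corrected
    accelerometer sample, Jr the right Jacobian of the current gyro step,
    and dR_step = Exp((w - b_g) dt) the incremental rotation."""
    A = np.eye(9)
    A[0:3, 0:3] = dR_step.T                        # dphi <- dR^T dphi
    A[3:6, 0:3] = -dR_ik @ hat(a_corr) * dt        # dv   <- dv - dR (a)^ dphi dt
    A[6:9, 0:3] = -0.5 * dR_ik @ hat(a_corr) * dt**2
    A[6:9, 3:6] = np.eye(3) * dt                   # dp   <- dp + dv dt + ...
    B = np.zeros((9, 6))
    B[0:3, 0:3] = Jr * dt                          # gyro noise enters dphi
    B[3:6, 3:6] = dR_ik * dt                       # accel noise enters dv
    B[6:9, 3:6] = 0.5 * dR_ik * dt**2              # ... and dp
    Q = np.block([[Qg, np.zeros((3, 3))],
                  [np.zeros((3, 3)), Qa]])
    return A @ Sigma @ A.T + B @ Q @ B.T

# Usage: start from Sigma = 0 at k = i and fold in each new sample.
Sigma = np.zeros((9, 9))
Qg, Qa = 1e-6 * np.eye(3), 1e-5 * np.eye(3)  # hypothetical discrete covariances
Sigma = propagate_cov(Sigma, np.eye(3), np.array([0., 0., 9.8]),
                      np.eye(3), np.eye(3), Qg, Qa, 0.005)
```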
C. Incorporating Bias Updates
In the previous section, we assumed that the bias $\lbrace \bar{\mathbf
{b}}^{a}_i, \bar{\mathbf {b}}^{g}_i \rbrace$ that is used during preintegration between
$k=i$ and $k=j$ is correct and does not change. In practice, however, the bias estimate is likely to change by a small amount $\delta \mathbf {b}$ during optimization. One solution would be to recompute the delta measurements when the bias changes; however, that is computationally expensive. Instead, given a bias update
$\mathbf {b}\leftarrow \bar{\mathbf {b}}+\delta \mathbf {b}$, we can update the delta
measurements using a first-order expansion
\begin{align}
\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\mathbf {b}^g_i) &\simeq \Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\bar{\mathbf
{b}}^g_i)\; \mathrm{Exp}\Bigg (\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf {b}^g} \delta
\mathbf {b}^g\Bigg) \\
\Delta \tilde{\mathbf {v}}_{{\rm ij}}(\mathbf {b}^g_i, \mathbf {b}^a_i) & \simeq \Delta \tilde{\mathbf {v}}_{{\rm
ij}}(\bar{\mathbf {b}}^g_i, \bar{\mathbf {b}}^a_i) + \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial
\mathbf {b}^g} \delta \mathbf {b}^g_i + \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^a}
\delta \mathbf {b}^a_i \nonumber \\
\Delta \tilde{\mathbf {p}}_{{\rm ij}}(\mathbf {b}^g_i, \mathbf {b}^a_i) & \simeq \Delta \tilde{\mathbf {p}}_{{\rm
ij}}(\bar{\mathbf {b}}^g_i, \bar{\mathbf {b}}^a_i) + \frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial
\mathbf {b}^g} \delta \mathbf {b}^g_i + \frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^a}
\delta \mathbf {b}^a_i. \nonumber
\end{align}
This is similar to the bias correction in [2] but
operates directly on $\mathrm{SO}(3)$. The
Jacobians $\lbrace \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}}
}{\partial \mathbf {b}^g}, \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^g}, \ldots
\rbrace$ (computed at $\bar{\mathbf {b}}_i$
, the bias estimate at integration time) describe how the measurements change due to a
change in the bias estimate. The Jacobians remain constant and can be precomputed during the preintegration. The
derivation of the Jacobians is very similar to the one we used in Section VI-A
to express the measurements as a large value plus a small perturbation and is given in
Appendix IX-B.
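As an illustration, applying the update (44) amounts to a handful of matrix-vector products once the Jacobians are stored. A minimal sketch (ours, with a hypothetical dictionary J of precomputed Jacobians) is:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def correct_for_bias(dR, dv, dp, J, db_g, db_a):
    """First-order bias correction (44). J holds the precomputed Jacobians,
    evaluated at the bias estimate used during integration."""
    dR_new = dR @ Rotation.from_rotvec(J['dR_dbg'] @ db_g).as_matrix()
    dv_new = dv + J['dv_dbg'] @ db_g + J['dv_dba'] @ db_a
    dp_new = dp + J['dp_dbg'] @ db_g + J['dp_dba'] @ db_a
    return dR_new, dv_new, dp_new

# Hypothetical usage with zero Jacobians and a small bias perturbation.
J = {k: np.zeros((3, 3)) for k in
     ('dR_dbg', 'dv_dbg', 'dv_dba', 'dp_dbg', 'dp_dba')}
dR, dv, dp = correct_for_bias(np.eye(3), np.zeros(3), np.zeros(3),
                              J, np.full(3, 0.01), np.full(3, 0.02))
```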
D. Preintegrated IMU Factors
Given the preintegrated measurement model in (38), and since the measurement noise is zero-mean and Gaussian (with covariance $\mathbf {\Sigma }_{{\rm ij}}$) up to first order (39), it is now easy to write the residual errors as $ \mathbf
{r}_{\mathcal {I}_{{\rm ij}}} \doteq [\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}^{\mathsf {T}}, \mathbf {r}_{\Delta
\mathbf {v}_{{\rm ij}}}^{\mathsf {T}}, \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}^{\mathsf {T}}]^{\mathsf {T}}\in
\mathbb {R}^9$, where
\begin{align}
\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}} \doteq &\ \mathrm{Log}\left(\left(\Delta\, \tilde{\mathtt{R}}_{{\rm
ij}}(\bar{\mathbf {b}}^g_i) \mathrm{Exp}\left(\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf
{b}^g} \delta \mathbf {b}^g\right) \right)^{\mathsf {T}}\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j \right) \nonumber \\
\mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}} \doteq &\ \mathtt{R}_i^{\mathsf {T}}\left(\mathbf {v}_j-\mathbf {v}_i -
\mathbf {g}\Delta t_{{\rm ij}}\right)\nonumber \\
-& \left[ \Delta \tilde{\mathbf {v}}_{{\rm ij}}(\bar{\mathbf {b}}^g_i, \bar{\mathbf {b}}^a_i) + \frac{\partial
\Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^g} \delta \mathbf {b}^g + \frac{\partial \Delta \bar{\mathbf
{v}}_{{\rm ij}}}{\partial \mathbf {b}^a} \delta \mathbf {b}^a \right] \nonumber \\
\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}} \doteq &\ \mathtt{R}_i^{\mathsf {T}}\big (\mathbf {p}_j-\mathbf {p}_i -
\mathbf {v}_i\Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2\big) \nonumber \\
-& \left[ \Delta \tilde{\mathbf {p}}_{{\rm ij}}(\bar{\mathbf {b}}^g_i, \bar{\mathbf {b}}^a_i) + \frac{\partial
\Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^g} \delta \mathbf {b}^g + \frac{\partial \Delta \bar{\mathbf
{p}}_{{\rm ij}}}{\partial \mathbf {b}^a} \delta \mathbf {b}^a \right]
\end{align}
in which we also included the bias updates of (44).
According to the “lift-solve-retract” method (see Section III-C), at each Gauss–Newton iteration we need to reparametrize (45) using the retraction (21). Then, the “solve” step requires linearizing the resulting cost around the current estimate. For the purpose of linearization, it is convenient to compute analytic expressions of the Jacobians of the residual errors, which we derive in Appendix IX-C.
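For reference, evaluating the residuals (45) is straightforward once the bias-corrected preintegrated measurements are available. A minimal sketch (ours) is given below, where Log is the $\mathrm{SO}(3)$ logarithm and g is the gravity vector:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def log_so3(R):
    """Logarithm map of SO(3): rotation matrix -> rotation vector."""
    return Rotation.from_matrix(R).as_rotvec()

def imu_residuals(Ri, pi, vi, Rj, pj, vj, dR, dv, dp, g, dt_ij):
    """Evaluates (45), assuming dR, dv, dp are the preintegrated
    measurements already bias-corrected as in (44)."""
    r_R = log_so3(dR.T @ Ri.T @ Rj)
    r_v = Ri.T @ (vj - vi - g * dt_ij) - dv
    r_p = Ri.T @ (pj - pi - vi * dt_ij - 0.5 * g * dt_ij**2) - dp
    return np.concatenate([r_R, r_v, r_p])  # 9-vector r_I_ij
```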
E. Bias Model
When presenting the IMU model (27), we said that
biases are slowly time-varying quantities. Hence, we model them with a “Brownian motion,” i.e., integrated
white noise
\begin{equation}
\dot{\mathbf {b}}^{g}(t) = \boldsymbol{\eta }^{{\rm bg}}, \qquad \dot{\mathbf {b}}^a(t) = \boldsymbol{\eta }^{{\rm ba}}.
\end{equation}
Integrating (46) over the time interval $[t_i,t_j]$ between two consecutive keyframes $i$ and $j$, we get
\begin{equation}
\mathbf {b}^{g}_j = \mathbf {b}^{g}_i + \boldsymbol{\eta }^{{\rm bgd}}, \qquad \mathbf {b}^a_j = \mathbf {b}^a_i +
\boldsymbol{\eta }^{{\rm bad}}
\end{equation}
where, as done before, we use the shorthand
$\mathbf {b}^{g}_i \doteq \mathbf {b}^{g}(t_i)$, and we define the discrete noises
$\boldsymbol{\eta }^{{\rm bgd}}$ and
$\boldsymbol{\eta }^{{\rm bad}}$, which have
zero mean and covariance $\mathbf {\Sigma }^{{\rm bgd}} \doteq \Delta
t_{{\rm ij}} \text{Cov}(\boldsymbol{\eta }^{{\rm bg}})$ and $\mathbf {\Sigma }^{{\rm bad}} \doteq \Delta t_{{\rm ij}} \text{Cov}(\boldsymbol{\eta
}^{{\rm ba}})$, respectively (cf., [58, Appendix]).
The model (47) can be readily included in our factor graph,
as a further additive term in (26) for all consecutive keyframes
\begin{equation}
\Vert \mathbf {r}_{\mathbf {b}_{{\rm ij}}} \Vert ^2 \doteq \Vert \mathbf {b}^{g}_j - \mathbf {b}^{g}_i \Vert
^2_{\mathbf {\Sigma }^{{\rm bgd}}} +\Vert \mathbf {b}^{a}_j - \mathbf {b}^{a}_i \Vert ^2_{\mathbf {\Sigma }^{\rm bad}}.
\end{equation}
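In implementation terms, (48) is a simple “between” factor whose covariances scale linearly with the keyframe interval. A minimal sketch (ours) evaluating its cost is:

```python
import numpy as np

def bias_residual_cost(bg_i, bg_j, ba_i, ba_j, cov_bg, cov_ba, dt_ij):
    """Squared Mahalanobis norm in (48); the discrete covariances
    Sigma^bgd, Sigma^bad scale linearly with the keyframe interval."""
    S_bg = dt_ij * cov_bg
    S_ba = dt_ij * cov_ba
    r_g, r_a = bg_j - bg_i, ba_j - ba_i
    return r_g @ np.linalg.solve(S_bg, r_g) + r_a @ np.linalg.solve(S_ba, r_a)
```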
SECTION VII.
Structureless Vision Factors
In this section, we introduce our structureless model for vision measurements. The key feature of our approach is
the linear elimination of landmarks. Note that the elimination is repeated at each Gauss–Newton iteration,
hence, we are still guaranteed to obtain the optimal MAP estimate.
Visual measurements contribute to the cost (26) via the sum
\begin{equation}
\sum _{i \in \mathcal {K}_k}\sum _{l \in \mathcal {C}_i} \Vert \mathbf {r}_{\mathcal {C}_{{\rm il}}} \Vert ^2_{\mathbf
{\Sigma }_{\mathcal {C}}} = \sum _{l=1}^L\sum _{i \in \mathcal {X}(l)} \Vert \mathbf {r}_{\mathcal {C}_{{\rm il}}}
\Vert ^2_{\mathbf {\Sigma }_{\mathcal {C}}}
\end{equation}
which, on the right-hand side, we rewrote as a sum of contributions of each landmark
$l=1,\ldots,L$. In
(49), $\mathcal {X}(l)$
denotes the subset of keyframes in which $l$
is seen.
A fairly standard model for the residual error of a single-image measurement $\mathbf {z}_{{\rm il}}$ is the reprojection error
\begin{equation}
\mathbf {r}_{\mathcal {C}_{{\rm il}}} = \mathbf {z}_{{\rm il}} - \pi (\mathtt{R}_i,\mathbf {p}_i,\mathbf {\rho }_l)
\end{equation}
where $\mathbf {\rho }_l \in \mathbb {R}^{3}$
denotes the position of the $l$th landmark, and $\pi (\cdot)$
is a standard perspective projection, which also encodes the (known) IMU-camera transformation $\mathtt{T}_{\text{{B}}\text{{C}}}$.
Direct use of (50) would require including the landmark positions $\mathbf {\rho }_l$, $l=1,\ldots,L$, in the optimization, which negatively impacts computation. Therefore, in the following, we adopt a structureless approach that avoids optimizing over the landmarks while still retrieving the MAP estimate.
As recalled in Section III-C, at each GN iteration, we lift
the cost function, using the retraction (21). For the
vision factors, this means that the original residuals (49)
become
\begin{equation}
\sum _{l = 1}^L \sum_{i \in \mathcal {X}(l)} \Vert \mathbf {z}_{{\rm il}} - \check{\pi }(\delta \boldsymbol{\phi
}_i,\delta \mathbf {p}_i,\delta \mathbf {\rho }_l) \Vert ^2_{\mathbf {\Sigma }_{\mathcal {C}}}
\end{equation}
where $\delta \boldsymbol{\phi }_i, \delta \mathbf
{p}_i, \delta \mathbf {\rho }_l$ are now Euclidean corrections, and $\check{\pi }(\cdot)$ is the lifted cost function. The
“solve” step in the GN method is based on linearization of the residuals
\begin{equation}
\sum _{l = 1}^L \sum _{i \in \mathcal {X}(l)} \Vert \mathbf {F}_{{\rm il}} \delta \mathbf {T}_i + \mathbf {E}_{{\rm
il}} \delta \mathbf {\rho }_l - \mathbf {b}_{{\rm il}} \Vert ^2
\end{equation}
where $\delta \mathbf {T}_i \doteq [\delta \boldsymbol{\phi }_i \; \delta \mathbf {p}_i]^{\mathsf {T}}$; the Jacobians $\mathbf {F}_{{\rm il}}, \mathbf {E}_{{\rm il}}$, and the vector $\mathbf {b}_{{\rm il}}$ (all normalized by $\mathbf {\Sigma }_{\mathcal {C}}^{1/2}$) result from the linearization. The vector $\mathbf {b}_{{\rm il}}$ is the residual error at the linearization point.
Writing the second sum in (52) in matrix form we get
\begin{equation}
\sum _{l = 1}^L \Vert \mathbf {F}_{l} \; \delta \mathbf {T}_{\mathcal {X}(l)} + \mathbf {E}_{l} \; \delta \mathbf {\rho
}_l - \mathbf {b}_{l} \Vert ^2
\end{equation}
where $\mathbf {F}_{l}, \mathbf {E}_{l}, \mathbf
{b}_{l}$ are obtained by stacking $\mathbf
{F}_{{\rm il}},\mathbf {E}_{{\rm il}}, \mathbf {b}_{{\rm il}}$, respectively, for all
$i \in \mathcal {X}(l)$.
Since a landmark $l$ appears in a single term
of the sum (53), for any given choice of the pose perturbation
$\delta \mathbf {T}_{\mathcal {X}(l)}$, the
landmark perturbation $\delta \mathbf {\rho }_l$
that minimizes the quadratic cost $\Vert \mathbf {F}_{l} \; \delta \mathbf
{T}_{\mathcal {X}(l)} + \mathbf {E}_{l} \; \delta \mathbf {\rho }_l - \mathbf {b}_{l} \Vert ^2$ is
\begin{equation}
\delta \mathbf {\rho }_l = - \,\left(\mathbf {E}_{l}^{\mathsf {T}}\mathbf {E}_{l}\right)^{-1}\mathbf {E}_{l}^{\mathsf
{T}}(\mathbf {F}_{l} \; \delta \mathbf {T}_{\mathcal {X}(l)} - \mathbf {b}_{l}).
\end{equation}
Substituting (54) back into (53), we can eliminate the variable $\delta \mathbf {\rho }_l$ from the optimization problem
\begin{equation}
\sum _{l = 1}^L \Vert \left({\mathbf I}- \mathbf {E}_{l} (\mathbf {E}_{l}^{\mathsf {T}}\mathbf {E}_{l})^{-1}\mathbf
{E}_{l}^{\mathsf {T}}\right) \left(\mathbf {F}_{l} \; \delta \mathbf {T}_{\mathcal {X}(l)} - \mathbf {b}_{l} \right)
\Vert ^2
\end{equation}
where ${\mathbf I}- \mathbf {E}_{l}(\mathbf {E}_{l}^{\mathsf {T}}\mathbf {E}_{l})^{-1}\mathbf {E}_{l}^{\mathsf {T}}$ is the orthogonal projector onto the complement of the column space of $\mathbf {E}_{l}$. In Appendix IX-D, we show that the cost (55) can be further manipulated, leading to a more efficient implementation.
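A compact sketch of this elimination for a single landmark is given below (ours; the stacking conventions are hypothetical but consistent with (53)–(55)):

```python
import numpy as np

def eliminate_landmark(F_l, E_l, b_l):
    """Given the stacked Jacobians for one landmark (F_l maps pose
    perturbations, E_l the 3-D landmark perturbation, b_l the residual),
    returns the projected system of (55), which only involves poses."""
    # Orthogonal projector onto the complement of col(E_l).
    Q = np.eye(E_l.shape[0]) - E_l @ np.linalg.solve(E_l.T @ E_l, E_l.T)
    return Q @ F_l, Q @ b_l   # cost ||Q F dT - Q b||^2

def backsubstitute_landmark(F_l, E_l, b_l, dT):
    """Optimal landmark perturbation (54) for a given pose update dT."""
    return -np.linalg.solve(E_l.T @ E_l, E_l.T @ (F_l @ dT - b_l))
```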
This approach is well known in the bundle adjustment literature as the Schur complement trick,
where a standard practice is to update the linearization point of $\mathbf
{\rho }_l$ via back-substitution
[61]. In contrast, we obtain the updated landmark positions from the linearization point of the poses using a
fast linear triangulation. Using this approach, we reduce a large set of factors (51), which involve both poses and landmarks, to a smaller set of $L$ factors (55), which involve only poses. In particular, the factor
corresponding to the landmark $l$ only involves
the states $\mathcal {X}(l)$ observing
$l$, creating the connectivity pattern of
Fig. 3. The same approach is also used in MSC-KF
[5] to avoid the inclusion of landmarks in the state vector. However, since
MSC-KF can only linearize and absorb a measurement once, the processing of measurements needs to be delayed until all
measurements of the same landmark are observed. This does not apply to the proposed optimization-based approach, which
allows for multiple relinearizations and the incremental inclusion of new measurements.
SECTION VIII.
Experimental Analysis
We tested the proposed approach on both simulated and real data.
Section VIII-A reports simulation results, showing that our approach is accurate, fast, and consistent.
Section VIII-B compares our approach against the state of the art,
confirming its superior accuracy in real indoor and outdoor experiments.
A. Simulation Experiments
We simulated a camera following a circular trajectory with a 3 m radius and a sinusoidal vertical motion. The total
length of the trajectory is 120 m. While moving, the camera observes landmarks as depicted in
Fig. 5. The number of landmark observations per frame is limited to 50. To
simulate a realistic feature tracker, we corrupt the landmark measurements with isotropic Gaussian noise with standard
deviation $\sigma _\text{px}=1$ pixel. The
camera has a focal length of 315 pixels and runs at a rate of 2.5 Hz (simulating keyframes). The simulated
acceleration and gyroscope measurements are computed from the analytic derivatives of the parametric trajectory and
additionally corrupted by white noise and slowly time-varying bias terms, according to the IMU model in
(27).
To evaluate our approach, we performed a Monte Carlo analysis with 50 simulation runs, each
with different realizations of process and measurement noise. In each run, we compute the MAP estimate using the IMU
and the vision models presented in this paper. The optimization (whose solution is the MAP estimate) is solved using
the incremental smoothing algorithm iSAM2 [3]. iSAM2 uses the Bayes
tree [62] data structure to obtain an efficient variable ordering that minimizes fill-in in the square-root information matrix and, thus, minimizes computation time. Further, iSAM2 exploits the fact that new measurements often have only a local effect on the MAP estimate, and hence applies incremental updates directly to the square-root information matrix, re-solving only for the variables affected by a new measurement.
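For concreteness, a minimal back-end sketch using the GTSAM 4.x Python bindings is shown below; the factor and parameter names are those of the library, but the noise densities, IMU stream, and graph structure are hypothetical and far simpler than the full pipeline:

```python
import gtsam
import numpy as np
from gtsam.symbol_shorthand import B, V, X  # bias, velocity, pose keys

# IMU preintegration parameters (hypothetical noise densities).
params = gtsam.PreintegrationParams.MakeSharedU(9.81)  # gravity along -z
params.setGyroscopeCovariance(1e-5 * np.eye(3))
params.setAccelerometerCovariance(1e-4 * np.eye(3))
params.setIntegrationCovariance(1e-8 * np.eye(3))

bias0 = gtsam.imuBias.ConstantBias()
pim = gtsam.PreintegratedImuMeasurements(params, bias0)
for _ in range(100):  # hypothetical 200 Hz IMU stream, device at rest
    pim.integrateMeasurement(np.array([0., 0., 9.81]), np.zeros(3), 0.005)

graph = gtsam.NonlinearFactorGraph()
values = gtsam.Values()
graph.add(gtsam.PriorFactorPose3(X(0), gtsam.Pose3(),
                                 gtsam.noiseModel.Isotropic.Sigma(6, 1e-3)))
graph.add(gtsam.PriorFactorVector(V(0), np.zeros(3),
                                  gtsam.noiseModel.Isotropic.Sigma(3, 1e-3)))
graph.add(gtsam.PriorFactorConstantBias(B(0), bias0,
                                        gtsam.noiseModel.Isotropic.Sigma(6, 1e-2)))
# One preintegrated IMU factor summarizes all 100 measurements.
graph.add(gtsam.ImuFactor(X(0), V(0), X(1), V(1), B(0), pim))
for k in (0, 1):
    values.insert(X(k), gtsam.Pose3())
    values.insert(V(k), np.zeros(3))
values.insert(B(0), bias0)

isam = gtsam.ISAM2()
isam.update(graph, values)       # incremental smoothing step
estimate = isam.calculateEstimate()
```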
In the following, we present the results of our experiments, organized in four sections:
pose estimation accuracy and timing;
consistency;
bias estimation accuracy; and
first-order bias correction.
Then, in Section VIII-A5, we compare our approach against the original
proposal of [2].
1) Pose Estimation Accuracy and Timing
The optimal MAP estimate is given by the batch nonlinear optimization of the least-squares
objective in (26). However, as shown on the left in
Fig. 6, the computational cost of the batch optimization quickly increases
as the trajectory length grows. A key ingredient that makes our approach extremely efficient is the use of the
incremental smoothing algorithm iSAM2 [3], which performs
close-to-optimal inference, while preserving real-time capability. Fig. 7
shows that the accuracy of iSAM2 is practically the same as the batch estimate. In odometry problems, the iSAM2
algorithm results in approximately constant update time per frame (see Fig. 6
, left), which in our experiment is approximately 10 ms per update (see
Fig. 6, right).
2) Consistency
For generic motion, the VIO problem has four unobservable degrees of freedom, three corresponding to the global
translation and one to the global orientation around the gravity direction (yaw), see
[16]. A VIO algorithm must preserve these observability properties and avoid
inclusion of spurious information along the unobservable directions, which would result in inconsistency
[16]. Fig. 8 reports orientation
and position errors with the corresponding $3\sigma$ bounds, confirming that our approach is consistent. In the VIO problem, the gravity direction is
observable, hence, the uncertainty on roll and pitch remains bounded. In contrast, global yaw and position cannot be
measured and the uncertainty slowly grows over time.
To present more substantial evidence of the fact that our estimator is consistent, we recall a standard measure of
consistency, the average normalized estimation error squared (NEES)
[63]. The NEES is the squared estimation error $\boldsymbol{\epsilon }_k$ normalized by the
estimator-calculated covariance ${\boldsymbol{\Sigma }}_k$
\begin{equation}
\eta _k \doteq \boldsymbol{\epsilon }_k^{\mathsf {T}}\hat{\boldsymbol{\Sigma }}_k^{-1}\boldsymbol{\epsilon }_k \qquad
\quad \text{(NEES).}
\end{equation}
The error in estimating the current pose is computed as
\begin{equation}
\boldsymbol{\epsilon }_k \doteq \big [ \mathrm{Log}\left(\hat{\mathtt{R}}_k^{\mathsf {T}}\mathtt{R}_k^\text{gt}\right),
\ \hat{\mathtt{R}}_k^{\mathsf {T}}(\hat{\mathbf {p}}_k-\mathbf {p}_k^\text{gt}) \big ]^{\mathsf {T}}
\end{equation}
where the superscript “gt” denotes ground-truth states and $(\hat{\mathtt{R}}_k, \hat{\mathbf {p}}_k)$ denotes the estimated pose at time $k$. Note that the
error (57) is expressed in the body frame and it is consistent
with our choice of the retraction in (21) (intuitively, the
retraction applies the perturbation in the body frame).
The average NEES over $N$ independent Monte Carlo runs can be computed by averaging the NEES values
\begin{equation}
\bar{\eta }_k = \frac{1}{N}\sum _{i=1}^N \eta ^{(i)}_{k} \qquad \quad \text{(average NEES)}
\end{equation}
where $\eta ^{(i)}_{k}$
is the NEES computed at the $i$th Monte Carlo
run. If the estimator is consistent, then $N \bar{\eta }_k$ follows a $\chi ^2_n$ distribution with $n=\text{dim}(\boldsymbol{\epsilon }_k) \cdot N$ degrees of freedom [63, p. 234]. We evaluate this hypothesis with a $\chi ^2_n$ acceptance test [63, p. 235]. For a significance level $\alpha = 2.5\%$ and $n = \text{dim}(\boldsymbol{\epsilon }_k) \cdot N = 6 \times 50$, the acceptance region of the test is given by the two-sided probability concentration region $\bar{\eta }_k \in [5.0,7.0]$. If $\bar{\eta }_k$ rises significantly above the upper bound, the estimator is overconfident; if it tends below the lower bound, it is conservative. In VIO, one usually wants to avoid overconfident estimators: the fact that $\bar{\eta }_k$ exceeds the upper bound is an indication that the estimator is including spurious information in the inference process.
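The acceptance bounds and the average NEES itself are easy to reproduce; a short sketch (ours, using SciPy) is:

```python
import numpy as np
from scipy.stats import chi2

N, d = 50, 6                # Monte Carlo runs, error dimension
n = N * d                   # total degrees of freedom
alpha = 0.025               # significance level, split two-sided
lo = chi2.ppf(alpha / 2, n) / N
hi = chi2.ppf(1 - alpha / 2, n) / N
print(f"accept if average NEES lies in [{lo:.1f}, {hi:.1f}]")  # ~ [5.0, 7.0]

def average_nees(errors, covariances):
    """errors: (N, d) per-run errors at one time step;
    covariances: (N, d, d) estimator-calculated covariances."""
    vals = [e @ np.linalg.solve(S, e) for e, S in zip(errors, covariances)]
    return np.mean(vals)
```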
In Fig. 9, we report the average NEES of the proposed approach. The
average NEES approaches the lower bound but, more importantly, it remains below the upper bound at 7.0 (black dots),
which assures that the estimator is not overconfident. We also report the average rotational and translational NEES to allow a comparison with the OC-EKF in [16] and [18], which obtains similar results by explicitly enforcing the observability properties in the EKF.
3) Bias Estimation Accuracy
Our simulations allow us to compare the estimated gyroscope and accelerometer bias with the true biases that were
used to corrupt the simulated inertial measurements. Fig. 10 shows that
the biases estimated by our approach (in blue) correctly track the ground truth biases (in red). Note that, since we
use a smoothing approach, at each step, we potentially change the entire history of the bias estimates, hence, we
visualize the bias estimates using multiple curves. Each curve represents the history of the estimated biases from
time zero (left-most extreme of the blue curve) to the current time (right-most extreme of the blue curve).
4) First-Order Bias Correction
We performed an additional Monte Carlo analysis to evaluate the a posteriori bias correction
proposed in Section VI-C. The preintegrated measurements are computed
with the bias estimate at the time of integration. However, as seen in Fig. 10
, the bias estimate for an older preintegrated measurement may change when more information becomes available.
To avoid repeating the integration when the bias estimate changes, we perform a first-order correction of the
preintegrated measurement according to (44). The accuracy of this
first-order bias correction is reported in Fig. 11. To compute the
statistics, we integrated 100 random IMU measurements with a given bias estimate $\bar{\mathbf {b}}_i$, which results in the preintegrated
measurements $\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\bar{\mathbf {b}}_i),
\Delta \tilde{\mathbf {v}}_{{\rm ij}}(\bar{\mathbf {b}}_i)$, and $\Delta \tilde{\mathbf {p}}_{{\rm ij}}(\bar{\mathbf {b}}_i)$.
Subsequently, a random perturbation $\delta \mathbf {b}_i$ with magnitude between 0.04 and 0.2 was applied to both the gyroscope and accelerometer bias. We
repeated the integration at $\bar{\mathbf {b}}_i + \delta \mathbf {b}_i$
to obtain $\Delta\, \tilde{\mathtt{R}}_{{\rm
ij}}(\bar{\mathbf {b}}_i + \delta \mathbf {b}_i), \Delta \tilde{\mathbf {v}}_{{\rm ij}}(\bar{\mathbf {b}}_i + \delta
\mathbf {b}_i)$, and $\Delta \tilde{\mathbf
{p}}_{{\rm ij}}(\bar{\mathbf {b}}_i + \delta \mathbf {b}_i)$. This ground-truth result was
then compared against the first-order correction in (44) to
compute the error of the approximation. The errors resulting from the first-order approximation are negligible, even
for the relatively large bias perturbations.
5) Advantages Over the Euler-Angle-Based Formulation
In this section, we compare the proposed IMU preintegration with the original formulation of
[2], based on Euler angles. We observe three main problems with the
preintegration using Euler angles, which are avoided in our formulation.
The first drawback is that, in contrast to the integration using the exponential map in
(30), the rotation integration based on Euler angles is only exact up
to the first order. For the interested reader, we recall the rotation rate integration using Euler angles in
Appendix IX-E. On the left of Fig. 12, we report the integration
errors committed by the Euler angle parametrization when integrating angular rates with randomly selected rotation
axes and magnitude in the range from 1 to 3 rad/s. The integration error in Euler angles accumulates quickly when the
sampling time $\Delta t$ or the angular rate
$\tilde{\boldsymbol{\omega }}$ are large. On the
other hand, the proposed approach, which performs integration directly on the rotation manifold, is exact, regardless of the values of $\Delta t$ and $\tilde{\boldsymbol{\omega }}$.
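This effect is easy to reproduce numerically. The sketch below (ours) integrates a constant body rate with first-order Euler-angle kinematics (using the standard $zyx$ rate mapping; the paper's exact formulation is recalled in Appendix IX-E) and compares the result with the closed-form exponential map, which is exact for constant $\tilde{\boldsymbol{\omega }}$:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def euler_rate_matrix(roll, pitch):
    """Standard zyx mapping from body rates to Euler-angle rates."""
    sr, cr = np.sin(roll), np.cos(roll)
    tp, cp = np.tan(pitch), np.cos(pitch)
    return np.array([[1, sr * tp, cr * tp],
                     [0, cr, -sr],
                     [0, sr / cp, cr / cp]])

omega = np.array([0.5, 0.3, 0.8])  # rad/s, constant body rate (|w| ~ 1 rad/s)
dt, T = 0.01, 1.0
eul = np.zeros(3)                  # [roll, pitch, yaw]
for _ in range(int(T / dt)):       # first-order integration of Euler kinematics
    eul = eul + euler_rate_matrix(eul[0], eul[1]) @ omega * dt

R_euler = Rotation.from_euler('ZYX', eul[::-1]).as_matrix()
R_exact = Rotation.from_rotvec(omega * T).as_matrix()  # exact for constant rate
err_deg = np.degrees(np.linalg.norm(
    Rotation.from_matrix(R_euler.T @ R_exact).as_rotvec()))
print(f"Euler-integration error after {T:.0f} s: {err_deg:.3f} deg")
```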
The second drawback is that the Euler parametrization is not fair
[64], which means that, given the preintegrated Euler angles
$\tilde{\boldsymbol{\theta }}$, the negative log-likelihood ${\mathcal L}(\boldsymbol{\theta }) = \frac{1}{2}\Vert \tilde{\boldsymbol{\theta }}-
\boldsymbol{\theta }\Vert ^2_\Sigma$ is not invariant under the action of rigid body
transformations. On the right of Fig. 12, we show experimentally how the
log-likelihood changes when the frame of reference is rotated around randomly selected rotation axes. This essentially
means that an estimator using Euler angles may give different results for different choices of the world frame (cf. Fig. 2). On the other hand, the $\mathrm{SO}(3)$ parametrization is easily seen to be fair (the negative log-likelihood (16) is left invariant), and this is confirmed by Fig. 12 (right).
The third drawback is the existence of so-called gimbal lock singularities. For a $zyx$ Euler angle parametrization, the singularity is reached at
pitch values of $\theta = \frac{\pi }{2}+n\pi$,
for $n \in \mathbb {Z}$. To evaluate the effect
of the singularity and how it affects the computation of preintegrated measurement noise, we performed the following
Monte Carlo analysis. We simulated a set of trajectories that reach maximum pitch values $\theta _\text{max}$ of increasing magnitude. For each
trajectory, we integrate the rotation uncertainty using the Euler parametrization and the proposed on-manifold
approach. The ground-truth covariance is instead obtained through sampling. We use the Kullback–Leibler (KL)
divergence to quantify the mismatch between the estimated covariances and the ground-truth one. The results of this
experiment are shown in Fig. 13, where we observe that the closer we get to the singularity, the worse the noise propagation using Euler angles becomes. On the other hand, the proposed approach can accurately estimate the measurement covariance, independently of the motion of the platform.
B. Real Experiments
We integrated the proposed inertial factors in a monocular VIO pipeline to benchmark its performance against the
state of the art. In the following, we first discuss our implementation, and then present the results from an indoor
experiment with a motion-capture ground truth. Finally, we show results from longer trajectories in outdoor
experiments. The results confirm that our approach is more accurate than the state-of-the-art filtering and fixed-lag
smoothing algorithms, and enables fast inference in real-world problems.
1) Implementation
Our implementation consists of a high frame rate tracking front end based on SVO [65] and an optimization back-end based on iSAM2
[3]. The front-end tracks salient features in the image at camera rate, while the back-end optimizes in parallel
the state of selected keyframes as described in this paper.
SVO [65] is a precise and robust monocular visual odometry system that
employs sparse image alignment to estimate incremental motion and tracks features by minimizing the
photometric error between subsequent frames. In contrast to tracking features individually, as in standard Lucas–Kanade tracking, we exploit the known depth of features from previous triangulations. This allows
us to track all features as a bundle in a single optimization that satisfies epipolar constraints; hence, outliers
only originate from erroneous triangulations. In the visual--inertial setting, we further exploit the availability of
accurate rotation increments, obtained by integrating angular velocity measurements from the gyroscope. These
increments are used as rotation priors in the sparse-image-alignment algorithm, and this increases the overall
robustness of the system. The motion estimation is combined with an outlier resistant probabilistic triangulation
method that is implemented with a recursive Bayesian filter. The high frame-rate motion estimation combined with the
robust depth estimation results in increased robustness in scenes with repetitive and high-frequency texture (e.g.,
asphalt). The output of SVO is a set of selected keyframes with feature tracks corresponding to triangulated landmarks. This
data is passed to the back end that computes the visual--inertial MAP estimate in
(26) using iSAM2 [3].
We remark that our approach does not marginalize out past states. Therefore, while the approach is designed for fast
VIO, if desired, it could be readily extended to incorporate loop closures.
2) Indoor Experiments
The indoor experiment shows that the proposed approach is more accurate than two competitive state-of-the-art
approaches, namely OKVIS [24] and MSCKF [5]. The experiment is performed on the 430-m-long indoor trajectory of Fig. 14. The dataset was recorded with a forward-looking VI-Sensor [66] that
consists of an ADIS16448 MEMS IMU and two embedded WVGA monochrome cameras (we only use the left camera). Intrinsic
and extrinsic calibration was obtained using [59]. The camera runs at
20 Hz and the IMU at 800 Hz. Ground truth poses are provided by a Vicon system mounted in the room; the
hand-eye calibration between the Vicon markers and the camera is computed using a least-squares
method [67].
Fig. 15 compares the proposed system against the OKVIS algorithm
[24], and an implementation of the MSCKF filter
[5]. Both algorithms represent the current state of the art in VIO: OKVIS among optimization-based approaches, and MSCKF among filtering methods. We obtained the datasets as well as the
trajectories computed with OKVIS and MSCKF from the authors of [24]. We
use the relative error metrics proposed in [68] to obtain error
statistics. The metric evaluates the relative error by averaging the drift over trajectory segments of different
length ($\lbrace \text{10, 40, 90, 160, 250, 360}\rbrace$ m in Fig. 15). Our approach exhibits less drift
than the state of the art, achieving 0.3 m drift on average over 360 m traveled distance; OKVIS and MSCKF
accumulate an average error of 0.7 m. We observe significantly less drift in the yaw direction for the proposed approach, while the error in the pitch and roll directions remains constant for all methods due to the observability of the gravity direction.
We highlight that these three algorithms use different front-end feature tracking systems, which influence the
overall performance of the approach. Therefore, while in Section VIII-A,
we discussed only aspects that are related to the preintegration theory, in this section, we evaluate the proposed
system as a whole (SVO, preintegration, structureless vision factors, iSAM2).
Evaluating the consistency in real experiments by means of analyzing the average NEES is difficult as one would have
to evaluate and average the results of multiple runs of the same trajectory with different realizations of the sensor
noise. In Fig. 17, we show the error plots with the three-sigma bounds
for a single run. The result is consistent as the estimation errors remain within the bounds of the estimated
uncertainty. In this experiment, we aligned only the first frame of the trajectory with the Vicon trajectory.
Therefore, analyzing the drift over 400 m is very prone to errors in the initial pose from the ground truth or errors
in the hand-eye calibration of the system.
Fig. 16 illustrates the time required by the back end to compute the
full MAP estimate, by running iSAM2 with ten optimization iterations. The experiment was performed on a standard
laptop (Intel i7, 2.4 GHz). The average update time for iSAM2 is 10 ms. The peak corresponds to the start of the
experiment in which the camera was not moving. In this case, the number of tracked features becomes very large making
the back end slightly slower. The SVO front end requires approximately 3 ms to process a frame on the laptop
while the back-end runs in a parallel thread and optimizes only keyframes. Although the processing times of OKVIS were
not reported, the approach is described as computationally demanding [24]. OKVIS needs to repeat the IMU integration at every change of the linearization point, which we avoid by using the preintegrated IMU measurements.
3) Outdoor Experiments
The second experiment is performed on an outdoor trajectory, and compares the proposed approach against the
Google Tango Peanut sensor (mapper version 3.15), which is an engineered VIO
system. We rigidly attached the VI-Sensor to a Tango device and walked around an office building.
Fig. 18 depicts the trajectory estimates for our approach and Google
Tango. The trajectory starts and ends at the same location; hence, we can report the end-to-end error, which is 1.5 m for the proposed approach and 2.2 m for the Google Tango sensor.
In Fig. 18, we also show the estimated landmark positions (in green).
3-D points are not estimated by our approach (which uses a structureless vision model), but are triangulated from our
trajectory estimate for visualization purposes.
The third experiment is the one in Fig. 19. The trajectory goes across three floors of an office building and eventually returns to the initial location on the ground floor. In this case too, the proposed approach achieves a very small end-to-end error (0.5 m), while Tango accumulates a 1.4 m error.
We remark that Tango and our system use different sensors, hence, the reported end-to-end errors only allow for a
qualitative comparison. However, the IMUs of both sensors exhibit similar noise characteristics
[69], [70] and the Tango camera has a
significantly larger field-of-view and better shutter speed control than our sensor. Therefore, the comparison is
still valuable to assess the accuracy of the proposed approach.
A video demonstrating the execution of our approach for the real experiments discussed in this section can be viewed
at https://youtu.be/CsJkci5lfco.
SECTION IX.
Conclusion
This paper proposes a novel preintegration theory, which provides a grounded way to model a large number of IMU measurements as a single motion constraint. Our proposal improves over related works that perform integration in a global frame, e.g., [5], [23], as we do not commit to a linearization point during integration. Moreover, it leverages the seminal work on preintegration [2], bringing to maturity the preintegration and uncertainty propagation in $\mathrm{SO}(3)$.
As a second contribution, we discuss how to use the preintegrated IMU model in a VIO pipeline; we adopt a
structureless model for visual measurements which avoids optimizing over 3-D landmarks. Our VIO approach uses iSAM2 to
perform constant-time incremental smoothing.
An efficient implementation of our approach requires 10 ms to perform inference (back end), and 3 ms for
feature tracking (front end). Experimental results also confirm that our approach is more accurate than the
state-of-the-art alternatives, including filtering- and optimization-based techniques.
We release the source code of the IMU preintegration and the structureless vision factors in the GTSAM 4.0 optimization toolbox [7] and provide additional theoretical derivations and implementation details in the Appendix of this paper.
ACKNOWLEDGEMENTS
The authors would like to thank S. Williams and R. Roberts for helping with an early implementation in GTSAM, and S. Lynen
and S. Leutenegger for providing datasets and results of their algorithms. Furthermore, we are thankful to M. Chen for
the detailed comments and suggestions on the previous version of this paper.
Appendix
SECTION A.
Iterative Noise Propagation
In this section, we show that the computation of the preintegrated noise covariance, discussed in Section VI-B, can be carried out in iterative form, which leads to simpler expressions and is better suited to online inference.
Let us start from the preintegrated rotation noise in (42).
To write $\delta \boldsymbol{\phi }_{{\rm ij}}$
in iterative form, we simply take the last term ($k=j-1$) out of the sum and rearrange the terms as follows:
\begin{align}
\delta \boldsymbol{\phi }_{{\rm ij}} & \simeq \sum _{k=i}^{j-1} \Delta\, \tilde{\mathtt{R}}_{k+1j}^{\mathsf
{T}}\mathtt{J}^k_{r}\boldsymbol{\eta }^{{\rm gd}}_k \Delta t\nonumber \\
&= \sum _{k=i}^{j-2} \Delta\, \tilde{\mathtt{R}}_{k+1j}^{\mathsf {T}}\mathtt{J}^k_{r}\boldsymbol{\eta }^{{\rm
gd}}_k \Delta t + \overbrace{\Delta\, \tilde{\mathtt{R}}_{jj}^{\mathsf {T}}}^{= {\mathbf I}_{3\times 3}}
\mathtt{J}_{r}^{j-1} \boldsymbol{\eta }^{{\rm gd}}_{j-1} \Delta t \nonumber \\
&= \sum _{k=i}^{j-2} (\overbrace{\Delta\, \tilde{\mathtt{R}}_{k+1j-1} \Delta\, \tilde{\mathtt{R}}_{j-1j}}^{=
\Delta\, \tilde{\mathtt{R}}_{k+1j}})^{\mathsf {T}}\mathtt{J}^k_{r}\boldsymbol{\eta }^{{\rm gd}}_k \Delta t +
\mathtt{J}_{r}^{j-1} \boldsymbol{\eta }^{{\rm gd}}_{j-1} \Delta t \nonumber \\
&= \Delta\, \tilde{\mathtt{R}}_{j-1j}^{\mathsf {T}}\sum _{k=i}^{j-2} \Delta\, \tilde{\mathtt{R}}_{k+1j-1}^{\mathsf
{T}}\mathtt{J}^k_{r}\boldsymbol{\eta }^{{\rm gd}}_k \Delta t + \mathtt{J}_{r}^{j-1} \boldsymbol{\eta }^{{\rm
gd}}_{j-1} \Delta t \nonumber \\
&= \Delta\, \tilde{\mathtt{R}}_{j-1j}^{\mathsf {T}}\delta \boldsymbol{\phi }_{{\rm ij}-1} + \mathtt{J}_{r}^{j-1}
\boldsymbol{\eta }^{{\rm gd}}_{j-1} \Delta t.
\end{align}
Repeating the same process for $\delta \mathbf {v}_{{\rm ij}}$ in (43)
\begin{align}
\delta \mathbf {v}_{{\rm ij}} &= \sum _{k=i}^{j-1} \left[ - \Delta\, \tilde{\mathtt{R}}_{{\rm ik}}
\left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i \right)^\wedge \delta \boldsymbol{\phi }_{{\rm ik}} \Delta t + \Delta\,
\tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta t \right] \nonumber\\
&= \sum _{k=i}^{j-2} \left[ - \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i
\right)^\wedge \delta \boldsymbol{\phi }_{{\rm ik}} \Delta t + \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta
}^{{\rm ad}}_k \Delta t \right] \nonumber \\
&\quad \ - \Delta\, \tilde{\mathtt{R}}_{{\rm ij}-1} \left(\tilde{\mathbf {a}}_{j-1} - \mathbf {b}^a_i
\right)^\wedge \delta \boldsymbol{\phi }_{i{j-1}} \Delta t + \Delta\, \tilde{\mathtt{R}}_{i{j-1}} \boldsymbol{\eta
}^{{\rm ad}}_{j-1} \Delta t \nonumber \\
& = \delta \mathbf {v}_{{\rm ij}-1} - \Delta\, \tilde{\mathtt{R}}_{{\rm ij}-1} \left(\tilde{\mathbf {a}}_{j-1} -
\mathbf {b}^a_i \right)^\wedge \delta \boldsymbol{\phi }_{i{j-1}} \Delta t \nonumber\\
&\quad + \Delta\, \tilde{\mathtt{R}}_{i{j-1}} \boldsymbol{\eta }^{{\rm ad}}_{j-1} \Delta t.
\end{align}
Doing the same for $\delta \mathbf {p}_{{\rm ij}}$ in (43), and noting that $\delta \mathbf {p}_{{\rm ij}}$ can be written as a function of $\delta \mathbf {v}_{{\rm ij}}$ [cf. the expression of $\delta \mathbf {v}_{{\rm ij}}$ in (43)], we obtain
\begin{align}
&\delta \mathbf {p}_{{\rm ij}} = \sum _{k=i}^{j-1} \left[ \delta \mathbf {v}_{{\rm ik}} \Delta t - \frac{1}{2}
\Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i \right)^\wedge \delta
\boldsymbol{\phi }_{{\rm ik}} \Delta t^2\right.\nonumber\\
&\qquad\;\;\; \left. + \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta
t^2 \right] \nonumber \\
&\;\;\quad = \sum _{k=i}^{j-2} \left[ \delta \mathbf {v}_{{\rm ik}} \Delta t - \frac{1}{2} \Delta\,
\tilde{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \mathbf {b}^a_i \right)^\wedge \delta \boldsymbol{\phi
}_{{\rm ik}} \Delta t^2\right.\nonumber\\
&\qquad\;\;\; \left.+ \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ik}} \boldsymbol{\eta }^{{\rm ad}}_k \Delta t^2
\right] \nonumber \\
&\qquad\;\;\; + \delta \mathbf {v}_{{\rm ij}-1} \Delta t - \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ij}-1}
\left(\tilde{\mathbf {a}}_{j-1} - \mathbf {b}^a_i \right)^\wedge \delta \boldsymbol{\phi }_{{\rm ij}-1} \Delta
t^2\nonumber\\
&\qquad\;\;\;+ \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{i{j-1}} \boldsymbol{\eta }^{{\rm ad}}_{j-1} \Delta t^2 =
\delta \mathbf {p}_{{\rm ij}-1} \nonumber \\
&\qquad\;\;\; + \delta \mathbf {v}_{{\rm ij}-1} \Delta t - \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{{\rm ij}-1}
\left(\tilde{\mathbf {a}}_{j-1} - \mathbf {b}^a_i \right)^\wedge \delta \boldsymbol{\phi }_{{\rm ij}-1} \Delta t^2
\nonumber \\
&\qquad\;\;\; + \frac{1}{2} \Delta\, \tilde{\mathtt{R}}_{i{j-1}} \boldsymbol{\eta }^{{\rm ad}}_{j-1} \Delta t^2.
\end{align}
Recalling that $\boldsymbol{\eta }^{\Delta }_{{\rm
ik}} \doteq [\delta \boldsymbol{\phi }_{{\rm ik}}, \delta \mathbf {v}_{{\rm ik}}, \delta \mathbf {p}_{{\rm ik}}]$
, and defining the IMU measurement noise
$\boldsymbol{\eta }^d_k \doteq [\boldsymbol{\eta }^{{\rm gd}}_k \; \; \boldsymbol{\eta }^{{\rm ad}}_k]$,
we can finally write (59)–(61) in compact matrix form as
\begin{equation}
\boldsymbol{\eta }^{\Delta }_{{\rm ij}} = \mathbf {A}_{j-1} \boldsymbol{\eta }^{\Delta }_{{\rm ij}-1} + \mathbf
{B}_{j-1} \boldsymbol{\eta }^d_{j-1}.
\end{equation}
From the linear model (62) and given the covariance $\mathbf {\Sigma }_{\boldsymbol{\eta }} \in \mathbb {R}^{6 \times 6}$ of the raw IMU measurement noise $\boldsymbol{\eta }^d_k$, it is now possible to compute the preintegrated measurement covariance iteratively
\begin{equation}
\mathbf {\Sigma }_{{\rm ij}} = \mathbf {A}_{j-1} \mathbf {\Sigma }_{{\rm ij}-1} \mathbf {A}_{j-1}^{\mathsf {T}}+
\mathbf {B}_{j-1} \mathbf {\Sigma }_{\boldsymbol{\eta }} \mathbf {B}_{j-1}^{\mathsf {T}}
\end{equation}
starting from the initial condition $\mathbf {\Sigma }_{ii} = \mathbf {0}_{9 \times 9}$.
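As an illustration of this recursion (not part of the original derivation), the propagation (62)–(63) maps directly to a few lines of code. The following Python/NumPy sketch, with hypothetical input names, folds one IMU sample at a time into the $9 \times 9$ preintegration covariance; the matrices $\mathbf {A}_k$ and $\mathbf {B}_k$ are assumed to have been assembled from the current preintegrated rotation and the bias-corrected measurements as prescribed by (62).
\begin{verbatim}
import numpy as np

def propagate_covariance(Sigma, A, B, Sigma_eta):
    """One step of (63): Sigma <- A Sigma A^T + B Sigma_eta B^T."""
    return A @ Sigma @ A.T + B @ Sigma_eta @ B.T

# Example raw IMU noise covariance (values made up for illustration).
sigma_g, sigma_a = 1.7e-4, 2.0e-3
Sigma_eta = np.diag([sigma_g**2] * 3 + [sigma_a**2] * 3)   # 6x6

# Starting from Sigma_ii = 0_{9x9}, accumulate all samples between keyframes.
Sigma = np.zeros((9, 9))                     # blocks: [dphi, dv, dp]
for A_k, B_k in imu_linearizations:          # hypothetical (9x9, 9x6) pairs
    Sigma = propagate_covariance(Sigma, A_k, B_k, Sigma_eta)
\end{verbatim}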
SECTION B.
Bias Correction via First-Order Updates
In this section, we provide a complete derivation of the first-order bias correction proposed in
Section VI-C.
Let us assume that we have computed the preintegrated variables at a given bias estimate $\bar{\mathbf {b}}_i \doteq [\bar{\mathbf {b}}^g_i\;\;\bar{\mathbf {b}}^a_i]$, and let us denote the corresponding preintegrated measurements as
\begin{equation}
\Delta \bar{\mathtt{R}}_{{\rm ij}} \doteq \Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\bar{\mathbf {b}}_i), \ \Delta
\bar{\mathbf {v}}_{{\rm ij}}\doteq \Delta \tilde{\mathbf {v}}_{{\rm ij}}(\bar{\mathbf {b}}_i), \ \Delta \bar{\mathbf
{p}}_{{\rm ij}}\doteq \Delta \tilde{\mathbf {p}}_{{\rm ij}}(\bar{\mathbf {b}}_i).
\end{equation}
In this section, we want to devise an expression to “update” $\Delta \bar{\mathtt{R}}_{{\rm ij}}$, $\Delta \bar{\mathbf {v}}_{{\rm ij}}$, $\Delta \bar{\mathbf {p}}_{{\rm ij}}$ when our bias estimate
changes.
Consider the case in which we get a new estimate $\hat{\mathbf {b}}_i
\leftarrow \bar{\mathbf {b}}_i + \delta \mathbf {b}_i$, where $\delta \mathbf {b}_i$ is a small correction
w.r.t. the previous estimate $\bar{\mathbf {b}}_i$.
We start with the bias correction for the preintegrated rotation measurement. The key idea here is to write
$\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\hat{\mathbf {b}}_i)$ (the preintegrated measurement at the new bias estimate) as a function of $\Delta \bar{\mathtt{R}}_{{\rm ij}}$ (the preintegrated
measurement at the old bias estimate), “plus” a first-order correction. Recalling
(35), we write $\Delta\,
\tilde{\mathtt{R}}_{{\rm ij}}(\hat{\mathbf {b}}_i)$ as
\begin{equation}
\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\hat{\mathbf {b}}_i) = \prod _{k=i}^{j-1}
\mathrm{Exp}\big(\big(\tilde{\boldsymbol{\omega }}_k - \hat{\mathbf {b}}^g_i \big) \Delta t\big).
\end{equation}
Substituting $\hat{\mathbf {b}}_i = \bar{\mathbf {b}}_i + \delta \mathbf {b}_i$ in the previous expression and using the first-order approximation (4) in each factor (recall that $\delta \mathbf {b}_i$ is assumed to be small), we obtain
\begin{align}
\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\hat{\mathbf {b}}_i) &= \prod _{k=i}^{j-1}
\mathrm{Exp}\big(\big(\tilde{\boldsymbol{\omega }}_k - (\bar{\mathbf {b}}^g_i+\delta \mathbf {b}^{g}_i) \big) \Delta
t\big) \\
&\simeq \prod _{k=i}^{j-1} \mathrm{Exp}\big(\big(\tilde{\boldsymbol{\omega }}_k - \bar{\mathbf {b}}^g_i \big)
\Delta t\big) \mathrm{Exp}\big(- \mathtt{J}^k_{r}\; \delta \mathbf {b}^{g}_i \; \Delta t\big). \nonumber
\end{align}
Now, we rearrange the terms in the product by “moving” the terms involving $\delta \mathbf {b}^{g}_i$ to the end, using the relation (11)
\begin{equation}
\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\hat{\mathbf {b}}_i) = \Delta \bar{\mathtt{R}}_{{\rm ij}} \prod _{k=i}^{j-1}
\mathrm{Exp}\big(- \Delta\, \tilde{\mathtt{R}}_{k+1j}\big(\bar{\mathbf {b}}_i\big)^{\mathsf {T}}\mathtt{J}^k_{r}\;
\delta \mathbf {b}^{g}_i \; \Delta t\big)
\end{equation}
where we used the fact that, by definition, $\Delta \bar{\mathtt{R}}_{{\rm ij}} = \prod _{k=i}^{j-1} \mathrm{Exp}\left(\left(\tilde{\boldsymbol{\omega }}_k - \bar{\mathbf {b}}^{g}_i \right) \Delta t\right)$. Repeated application of the first-order approximation (7) (recall that $\delta \mathbf {b}^{g}_i$ is small; hence, the right Jacobians are close to the identity) produces
\begin{align}
\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\hat{\mathbf {b}}_i) &\simeq \Delta \bar{\mathtt{R}}_{{\rm ij}}
\mathrm{Exp}\left(\sum _{k=i}^{j-1} - \Delta\, \tilde{\mathtt{R}}_{k+1j}(\bar{\mathbf {b}}_i)^{\mathsf
{T}}\mathtt{J}^k_{r}\delta \mathbf {b}^{g}_i \Delta t \right) \nonumber \\
&=\Delta \bar{\mathtt{R}}_{{\rm ij}} \mathrm{Exp}\Big (\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial
\mathbf {b}^g} \delta \mathbf {b}^g_i\Big).
\end{align}
Using (68), we can now update the
preintegrated rotation measurement $\Delta\, \tilde{\mathtt{R}}_{{\rm
ij}}(\bar{\mathbf {b}}_i)$ to get $\Delta\,
\tilde{\mathtt{R}}_{{\rm ij}}(\hat{\mathbf {b}}_i)$ without repeating the integration.
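As a minimal sketch of this update (assuming SciPy's rotation utilities for the $\mathrm{SO}(3)$ exponential; names are illustrative), correcting the preintegrated rotation for a new bias estimate costs a single $3 \times 3$ matrix product instead of re-integrating all gyroscope samples:
\begin{verbatim}
import numpy as np
from scipy.spatial.transform import Rotation

def Exp(phi):
    """SO(3) exponential map: rotation vector -> rotation matrix."""
    return Rotation.from_rotvec(phi).as_matrix()

def update_delta_R(delta_R_bar, dR_dbg, delta_bg):
    """First-order bias update (68):
    DeltaR(b_hat) ~= DeltaR(b_bar) Exp(dDeltaR/db^g @ delta_b^g)."""
    return delta_R_bar @ Exp(dR_dbg @ delta_bg)
\end{verbatim}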
Let us now focus on the bias correction of the preintegrated velocity
$\Delta \tilde{\mathbf {v}}_{{\rm ij}}(\hat{\mathbf {b}}_i)$
\begin{align*}
&\Delta \tilde{\mathbf {v}}_{{\rm ij}}(\hat{\mathbf {b}}_i) = \sum _{k=i}^{j-1} \Delta\, \tilde{\mathtt{R}}_{{\rm
ik}}(\hat{\mathbf {b}}_i) \left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i - \delta \mathbf {b}^a_i \right)\Delta t
\nonumber\\
&\stackrel{(68)}{\simeq } \sum \limits _{k=i}^{j-1} \Delta \bar{\mathtt{R}}_{{\rm ik}} \mathrm{Exp}\Bigg
(\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ik}} }{\partial \mathbf {b}^g} \delta \mathbf {b}^g_i\Bigg)
\left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i - \delta \mathbf {b}^a_i \right)\Delta t \nonumber \\
&\stackrel{(4)}{\simeq } \sum \limits _{k=i}^{j-1} \Delta \bar{\mathtt{R}}_{{\rm ik}} \left({\mathbf I}+ \Bigg
(\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ik}} }{\partial \mathbf {b}^g} \delta \mathbf {b}^g_i\Bigg)^\wedge
\right) \left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i - \delta \mathbf {b}^a_i \right)\Delta t \nonumber \\
&\stackrel{(a)}{\simeq } \Delta \bar{\mathbf {v}}_{{\rm ij}}- \sum _{k=i}^{j-1} \Delta \bar{\mathtt{R}}_{{\rm ik}}
\Delta t \delta \mathbf {b}^a_i\nonumber\\
&\quad + \sum _{k=i}^{j-1} \Delta \bar{\mathtt{R}}_{{\rm ik}} \Bigg(\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm
ik}} }{\partial \mathbf {b}^g} \delta \mathbf {b}^g_i\Bigg)^\wedge \left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i
\right)\Delta t \nonumber \\
&\stackrel{(2)}{=} \Delta \bar{\mathbf {v}}_{{\rm ij}}- \sum _{k=i}^{j-1} \Delta \bar{\mathtt{R}}_{{\rm ik}} \Delta
t \delta \mathbf {b}^a_i\nonumber\\
\end{align*}
\begin{align}
&\quad - \sum _{k=i}^{j-1} \Delta \bar{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i
\right)^\wedge \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ik}} }{\partial \mathbf {b}^g} \Delta t \; \delta \mathbf
{b}^g_i \nonumber \\
&= \Delta \bar{\mathbf {v}}_{{\rm ij}}+ \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^a}
\delta \mathbf {b}^a_i + \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^g} \delta \mathbf
{b}^g_i
\end{align}
where in $(a)$ we used $\Delta \bar{\mathbf {v}}_{{\rm ij}}= \sum _{k=i}^{j-1} \Delta \bar{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i \right)\Delta t$. Exactly the same derivation can be repeated for $\Delta \tilde{\mathbf {p}}_{{\rm ij}}(\hat{\mathbf {b}}_i)$. Summarizing, the Jacobians used for the a posteriori bias update in (44) are
\begin{equation*}
\begin{array}{ccl}\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf {b}^g} &=& - \sum
_{k=i}^{j-1} \left[ \Delta\, \tilde{\mathtt{R}}_{k+1j}(\bar{\mathbf {b}}_i)^{\mathsf {T}}\; \mathtt{J}^k_{r}\; \Delta
t \right] \\
\frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^a} &=& - \sum _{k=i}^{j-1} \Delta
\bar{\mathtt{R}}_{{\rm ik}} \Delta t \nonumber \\
\frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^g} &=& - \sum _{k=i}^{j-1} \Delta
\bar{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i \right)^\wedge \frac{\partial \Delta
\bar{\mathtt{R}}_{{\rm ik}} }{\partial \mathbf {b}^g} \Delta t \nonumber \\
\frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^a} &=& \sum _{k=i}^{j-1}
\frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ik}}}{\partial \mathbf {b}^a} \Delta t -\frac{1}{2} \Delta
\bar{\mathtt{R}}_{{\rm ik}} \Delta t^2 \nonumber \\
\frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^g} &=& \sum _{k=i}^{j-1}
\frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ik}}}{\partial \mathbf {b}^g} \Delta t -\frac{1}{2} \Delta
\bar{\mathtt{R}}_{{\rm ik}} \left(\tilde{\mathbf {a}}_k - \bar{\mathbf {b}}^a_i \right)^\wedge \frac{\partial \Delta
\bar{\mathtt{R}}_{{\rm ik}} }{\partial \mathbf {b}^g} \Delta t^2. \nonumber \end{array}
\end{equation*}
Note that the Jacobians can be computed incrementally, as new measurements arrive.
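To make the incremental computation concrete, here is a sketch (Python/NumPy, with hypothetical variable names) of how the five Jacobians above can be folded forward as each IMU sample $k$ arrives. Note that the position and velocity blocks must be updated before the rotation block, since they consume its old value:
\begin{verbatim}
import numpy as np

def skew(v):
    """Hat operator (.)^: 3-vector -> 3x3 skew-symmetric matrix."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fold_in_sample(J, R_ik, dR_inc, Jr_k, a_tilde, b_a, dt):
    """Fold IMU sample k into the bias Jacobians (3x3 blocks in dict J).

    R_ik   : preintegrated rotation DeltaR_ik before this sample
    dR_inc : incremental rotation Exp((w_tilde_k - b_g) dt)
    Jr_k   : right Jacobian of SO(3) evaluated at (w_tilde_k - b_g) dt
    """
    aH = skew(a_tilde - b_a)
    # Position and velocity first: they use the *old* rotation Jacobian.
    J['dp_dba'] += J['dv_dba'] * dt - 0.5 * dt**2 * R_ik
    J['dp_dbg'] += J['dv_dbg'] * dt - 0.5 * dt**2 * R_ik @ aH @ J['dR_dbg']
    J['dv_dba'] += -R_ik * dt
    J['dv_dbg'] += -(R_ik @ aH @ J['dR_dbg']) * dt
    # Rotation Jacobian last: dR_dbg <- dR_inc^T dR_dbg - Jr_k dt.
    J['dR_dbg'] = dR_inc.T @ J['dR_dbg'] - Jr_k * dt
    return J
\end{verbatim}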
SECTION C.
Jacobians of Residual Errors
In this section, we provide analytic expressions for the Jacobian matrices of the residual errors in
(45). These Jacobians are crucial when using iterative optimization
techniques (e.g., the Gauss–Newton method of Section III-C) to minimize
the cost in (26).
“Lifting” the cost function (see Section III-C) consists of
substituting the following retractions:
\begin{equation}
\begin{array}
{rclrcl}\mathtt{R}_i &\leftarrow & \mathtt{R}_i \ \mathrm{Exp}(\delta \boldsymbol{\phi }_i), & \mathtt{R}_j
&\leftarrow & \mathtt{R}_j \ \mathrm{Exp}(\delta \boldsymbol{\phi }_j)\\
\mathbf {p}_i &\leftarrow & \mathbf {p}_i + \mathtt{R}_i \delta \mathbf {p}_i, & \mathbf {p}_j
&\leftarrow & \mathbf {p}_j + \mathtt{R}_j \delta \mathbf {p}_j \\
\mathbf {v}_i &\leftarrow & \mathbf {v}_i + \delta \mathbf {v}_i, & \mathbf {v}_j &\leftarrow &
\mathbf {v}_j + \delta \mathbf {v}_j \\
\delta \mathbf {b}^g_i &\leftarrow & \delta \mathbf {b}^g_i + \tilde{\delta }{\mathbf {b}^g_i}, & \delta
\mathbf {b}^a_i &\leftarrow & \delta \mathbf {b}^a_i + \tilde{\delta }{\mathbf {b}^a_i}.
\end{array}
\end{equation}
The process of lifting makes the residual errors a function defined on a vector space, on
which it is easy to compute Jacobians. Therefore, in the following sections, we derive the Jacobians w.r.t. the
vectors $\delta \boldsymbol{\phi }_i, \delta \mathbf {p}_i, \delta \mathbf
{v}_i, \delta \boldsymbol{\phi }_j, \delta \mathbf {p}_j, \delta \mathbf {v}_j, \tilde{\delta }{\mathbf {b}^g_i},
\tilde{\delta }{\mathbf {b}^a_i}$.
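As a small illustration (a sketch with hypothetical names, not the paper's implementation), applying the retractions above to a state estimate looks as follows; the rotation is updated on the manifold, the position is perturbed in the body frame, and the remaining quantities are updated by plain vector sums:
\begin{verbatim}
import numpy as np
from scipy.spatial.transform import Rotation

def Exp(phi):
    """SO(3) exponential map: rotation vector -> rotation matrix."""
    return Rotation.from_rotvec(phi).as_matrix()

def retract(x, d):
    """Apply the retractions above: state dict x, tangent increments d."""
    return {'R':   x['R'] @ Exp(d['dphi']),    # on-manifold rotation update
            'p':   x['p'] + x['R'] @ d['dp'],  # body-frame perturbation
            'v':   x['v'] + d['dv'],
            'dbg': x['dbg'] + d['dbg_tilde'],
            'dba': x['dba'] + d['dba_tilde']}
\end{verbatim}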
1) Jacobians of $\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}$
Since $\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}$ is linear in $\delta \mathbf {b}^g_i$ and $\delta \mathbf {b}^a_i$, and the retraction is simply a vector sum, the Jacobians of $\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}$ w.r.t. $\tilde{\delta }{\mathbf {b}^g_i}, \tilde{\delta }{\mathbf {b}^a_i}$ are the matrix coefficients of $\delta \mathbf {b}^g_i$ and $\delta \mathbf {b}^a_i$. Moreover, $\mathtt{R}_j$ and $\mathbf {v}_j$ do not appear in $\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}$; hence, the Jacobians w.r.t. $\delta \boldsymbol{\phi }_j, \delta \mathbf {v}_j$ are zero. Let us focus on the following remaining Jacobians:
\begin{align}
&\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathbf {p}_i+\mathtt{R}_i\delta \mathbf {p}_i) \nonumber \\
&= \mathtt{R}_i^{\mathsf {T}}\left(\mathbf {p}_j - \mathbf {p}_i - \mathtt{R}_i \delta \mathbf {p}_i - \mathbf
{v}_i \Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2 \right)- C \nonumber\\
&= \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathbf {p}_i) + (-{\mathbf I}_{3\times 3}) \delta \mathbf {p}_i
\end{align}
\begin{align}
&\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathbf {p}_j+\mathtt{R}_j\delta \mathbf {p}_j)\nonumber \\
&=\mathtt{R}_i^{\mathsf {T}}\left(\mathbf {p}_j + \mathtt{R}_j\delta \mathbf {p}_j - \mathbf {p}_i - \mathbf {v}_i
\Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2 \right)- C \nonumber\\
&= \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathbf {p}_j) + (\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j) \delta
\mathbf {p}_j \\
&\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathbf {v}_i+\delta \mathbf {v}_i)\nonumber \\
&=\mathtt{R}_i^{\mathsf {T}}\left(\mathbf {p}_j - \mathbf {p}_i - \mathbf {v}_i \Delta t_{{\rm ij}} - \delta
\mathbf {v}_i \Delta t_{{\rm ij}} - \frac{1}{2}\mathbf {g}\Delta t_{{\rm ij}}^2 \right)- C\nonumber\\
&= \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathbf {v}_i) + (- \mathtt{R}_i^{\mathsf {T}}\Delta t_{{\rm
ij}})\delta \mathbf {v}_i \\
&\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathtt{R}_i\, \mathrm{Exp}(\delta \boldsymbol{\phi }_i))\nonumber \\
&=(\mathtt{R}_i \mathrm{Exp}(\delta \boldsymbol{\phi }_i))^{\mathsf {T}}\left(\mathbf {p}_j -\mathbf {p}_i -
\mathbf {v}_i \Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2 \right)- C\nonumber\\
&\stackrel{(4)}{\simeq } ({\mathbf I}- \delta \boldsymbol{\phi }_i^\wedge)\mathtt{R}_i^{\mathsf {T}}\left(\mathbf
{p}_j -\mathbf {p}_i - \mathbf {v}_i \Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2 \right)-C
\nonumber \\
&\stackrel{(2)}{=} \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}(\mathtt{R}_i) + \left(\mathtt{R}_i^{\mathsf
{T}}\left(\mathbf {p}_j -\mathbf {p}_i - \mathbf {v}_i \Delta t_{{\rm ij}} - \frac{1}{2}\,\mathbf {g}\Delta t_{{\rm
ij}}^2\right) \right)^\wedge \delta \boldsymbol{\phi }_i
\end{align}
where we used the shorthand $C \doteq \Delta
\tilde{\mathbf {p}}_{{\rm ij}}+ \frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^g_i} \delta
\mathbf {b}^g_i + \frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^a_i} \delta \mathbf
{b}^a_i$. Summarizing, the Jacobians of
$\mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}$ are
\begin{equation*}
\begin{array}{ll}\frac{\partial \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}}{\partial \delta \boldsymbol{\phi }_i} =
\left(\mathtt{R}_i^{\mathsf {T}}\left(\mathbf {p}_j - \mathbf {p}_i - \mathbf {v}_i \Delta t_{{\rm ij}} -
\frac{1}{2}\,\mathbf {g}\Delta t_{{\rm ij}}^2\right)\right)^\wedge & \frac{\partial \mathbf {r}_{\Delta \mathbf
{p}_{{\rm ij}}}}{\partial \delta \boldsymbol{\phi }_j} = \mathbf {0}\\
\frac{\partial \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}}{\partial \delta \mathbf {p}_i} = -{\mathbf I}_{3\times 3}
& \frac{\partial \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}}{\partial \delta \mathbf {p}_j} =
\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j \\
\frac{\partial \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}}{\partial \delta \mathbf {v}_i} = - \mathtt{R}_i^{\mathsf
{T}}\Delta t_{{\rm ij}} & \frac{\partial \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}}{\partial \delta \mathbf
{v}_j} = \mathbf {0}\\
\frac{\partial \mathbf {r}_{\Delta \mathbf {p}_{{\rm ij}}}}{\partial \tilde{\delta }{\mathbf {b}^a_i}} = -
\frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^a_i} & \frac{\partial \mathbf {r}_{\Delta
\mathbf {p}_{{\rm ij}}}}{\partial \tilde{\delta }{\mathbf {b}^g_i}}= - \frac{\partial \Delta \bar{\mathbf {p}}_{{\rm
ij}}}{\partial \mathbf {b}^g_i}.\nonumber
\end{array}
\end{equation*}
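For concreteness, the table above can be assembled in a few lines. The sketch below (Python/NumPy; names are hypothetical) returns the nonzero $3 \times 3$ blocks, with $\frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^g_i}$ and $\frac{\partial \Delta \bar{\mathbf {p}}_{{\rm ij}}}{\partial \mathbf {b}^a_i}$ taken from the bias Jacobians derived in Section B:
\begin{verbatim}
import numpy as np

def skew(v):
    """Hat operator (.)^: 3-vector -> 3x3 skew-symmetric matrix."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def jacobians_r_dp(R_i, R_j, p_i, p_j, v_i, g, dt_ij, dp_dbg, dp_dba):
    """Nonzero Jacobian blocks of r_{Delta p_ij} w.r.t. the lifted variables."""
    y = R_i.T @ (p_j - p_i - v_i * dt_ij - 0.5 * g * dt_ij**2)
    return {'dphi_i': skew(y),          # (R_i^T (p_j - p_i - ...))^
            'dp_i':   -np.eye(3),
            'dp_j':   R_i.T @ R_j,
            'dv_i':   -R_i.T * dt_ij,
            'dbg':    -dp_dbg,
            'dba':    -dp_dba}          # dphi_j and dv_j blocks are zero
\end{verbatim}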
2) Jacobians of $\mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}$
As in the previous section, $\mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}$
is linear in $\delta \mathbf {b}^g_i$ and $\delta \mathbf {b}^a_i$,
hence, the Jacobians of $\mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}$
w.r.t. $\tilde{\delta }{\mathbf {b}^g_i},
\tilde{\delta }{\mathbf {b}^a_i}$ are simply the matrix coefficients of $\delta \mathbf {b}^g_i$ and $\delta \mathbf {b}^a_i$. Moreover, $\mathtt{R}_j$,
$\mathbf {p}_i$, and $\mathbf {p}_j$ do not appear in $\mathbf {r}_{\Delta \mathbf {v}_{{\rm
ij}}}$, hence, the Jacobians w.r.t. $\delta
\boldsymbol{\phi }_j, \delta \mathbf {p}_i, \delta \mathbf {p}_j$ are zero. The remaining
Jacobians are computed as
\begin{align}
\mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}(\mathbf {v}_i + \delta \mathbf {v}_i) &= \mathtt{R}_i^{\mathsf
{T}}\left(\mathbf {v}_j-\mathbf {v}_i-\delta \mathbf {v}_i - \mathbf {g}\Delta t_{{\rm ij}}\right) - D \nonumber \\
&= \mathbf {r}_{\Delta \mathbf {v}}(\mathbf {v}_i) -\mathtt{R}_i^{\mathsf {T}}\delta \mathbf {v}_i \\
\mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}(\mathbf {v}_j + \delta \mathbf {v}_j) &= \mathtt{R}_i^{\mathsf
{T}}\left(\mathbf {v}_j + \delta \mathbf {v}_j-\mathbf {v}_i - \mathbf {g}\Delta t_{{\rm ij}}\right) - D \nonumber \\
&= \mathbf {r}_{\Delta \mathbf {v}}(\mathbf {v}_j) + \mathtt{R}_i^{\mathsf {T}}\delta \mathbf {v}_j \\
\mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}(\mathtt{R}_i \; \mathrm{Exp}(\delta \boldsymbol{\phi }_i))& =
\left(\mathtt{R}_i \; \mathrm{Exp}(\delta \boldsymbol{\phi }_i) \right)^{\mathsf {T}}\left(\mathbf {v}_j-\mathbf {v}_i
- \mathbf {g}\Delta t_{{\rm ij}} \right) - D \nonumber \\
&\stackrel{(4)}{\simeq } ({\mathbf I}-\delta \boldsymbol{\phi }_i^\wedge) \mathtt{R}_i^{\mathsf {T}}\left(\mathbf
{v}_j-\mathbf {v}_i - \mathbf {g}\Delta t_{{\rm ij}}\right) - D \nonumber \\
&\stackrel{(2)}{=} \mathbf {r}_{\Delta \mathbf {v}}(\mathtt{R}_i) + \left(\mathtt{R}_i^{\mathsf {T}}\left(\mathbf
{v}_j-\mathbf {v}_i - \mathbf {g}\Delta t_{{\rm ij}}\right)\right)^\wedge \delta \boldsymbol{\phi }_i
\end{align}
with $D \doteq \left[ \Delta \tilde{\mathbf
{v}}_{{\rm ij}}+ \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^g_i} \delta \mathbf {b}^g_i
+ \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^a_i} \delta \mathbf {b}^a_i \right]$
. Summarizing, the Jacobians of $\mathbf
{r}_{\Delta \mathbf {v}_{{\rm ij}}}$ are
\begin{equation*}
\begin{array}{ll}\frac{\partial \mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}}{\partial \delta \boldsymbol{\phi }_i} =
\left(\mathtt{R}_i^{\mathsf {T}}\left(\mathbf {v}_j-\mathbf {v}_i - \mathbf {g}\Delta t_{{\rm
ij}}\right)\right)^\wedge & \frac{\partial \mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}}{\partial \delta
\boldsymbol{\phi }_j} = \mathbf {0}\\
\frac{\partial \mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}}{\partial \delta \mathbf {p}_i} = \mathbf {0}&
\frac{\partial \mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}}{\partial \delta \mathbf {p}_j} = \mathbf {0}\\
\frac{\partial \mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}}{\partial \delta \mathbf {v}_i} = -\mathtt{R}_i^{\mathsf
{T}}& \frac{\partial \mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}}{\partial \delta \mathbf {v}_j} =
\mathtt{R}_i^{\mathsf {T}}\\
\frac{\partial \mathbf {r}_{\Delta \mathbf {v}_{{\rm ij}}}}{\partial \tilde{\delta }{\mathbf {b}^a_i}} = -
\frac{\partial \Delta \bar{\mathbf {v}}_{{\rm ij}}}{\partial \mathbf {b}^a_i} & \frac{\partial \mathbf {r}_{\Delta
\mathbf {v}_{{\rm ij}}}}{\partial \tilde{\delta }{\mathbf {b}^g_i}} = - \frac{\partial \Delta \bar{\mathbf {v}}_{{\rm
ij}}}{\partial \mathbf {b}^g_i}. \nonumber \end{array}
\end{equation*}
3) Jacobians of $\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}$
The derivation of the Jacobians of $\mathbf {r}_{\Delta\,
\mathtt{R}_{{\rm ij}}}$ is slightly more involved. We first note that $\mathbf {p}_i, \mathbf {p}_j, \mathbf {v}_i, \mathbf {v}_j, \delta \mathbf {b}^a_i$
do not appear in the expression of $\mathbf
{r}_{\Delta\, \mathtt{R}_{{\rm ij}}}$, hence, the corresponding Jacobians are zero. The
remaining Jacobians can be computed as
\begin{eqnarray}
&&\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}(\mathtt{R}_i \, \mathrm{Exp}(\delta \boldsymbol{\phi
}_i))\nonumber\\
&&\qquad = \mathrm{Log}\left(\left(\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}\big(\bar{\mathbf {b}}^g_i\big) E
\right)^{\mathsf {T}}\left(\mathtt{R}_i \, \mathrm{Exp}\big(\delta \boldsymbol{\phi }_i\big) \right)^{\mathsf
{T}}\mathtt{R}_j \right) \nonumber \\
&&\qquad = \mathrm{Log}\left(\left(\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\bar{\mathbf {b}}^g_i) E
\right)^{\mathsf {T}}\mathrm{Exp}(-\delta \boldsymbol{\phi }_i) \mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j \right)
\nonumber \\
&&\qquad \overset{(11)}{=} \mathrm{Log}\left(\left(\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\bar{\mathbf
{b}}^g_i) E \right)^{\mathsf {T}}\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j \mathrm{Exp}(-\mathtt{R}_j^{\mathsf
{T}}\mathtt{R}_i \delta \boldsymbol{\phi }_i) \right) \nonumber \\
&&\qquad \overset{(9)}{\simeq } \mathbf {r}_{\Delta\, \mathtt{R}}(\mathtt{R}_i) - \mathtt{J}_{r}^{-1}(\mathbf
{r}_{\Delta\, \mathtt{R}}(\mathtt{R}_i)) \mathtt{R}_j^{\mathsf {T}}\mathtt{R}_i \delta \boldsymbol{\phi }_i \\
&&\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}(\mathtt{R}_j\, \mathrm{Exp}(\delta \boldsymbol{\phi
}_j))\nonumber\\
&&\qquad = \mathrm{Log}\left(\left(\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}(\bar{\mathbf {b}}^g_i) E
\right)^{\mathsf {T}}\mathtt{R}_i^{\mathsf {T}}(\mathtt{R}_j \, \mathrm{Exp}\big(\delta \boldsymbol{\phi }_j)\big)
\right) \nonumber \\
&&\qquad \overset{(9)}{\simeq } \mathbf {r}_{\Delta\, \mathtt{R}}(\mathtt{R}_j) + \mathtt{J}_{r}^{-1}(\mathbf
{r}_{\Delta\, \mathtt{R}}(\mathtt{R}_j)) \delta \boldsymbol{\phi }_j
\end{eqnarray}
\begin{align*}
\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}&(\delta \mathbf {b}^g_i+\tilde{\delta }{\mathbf {b}^g_i})\nonumber\\
&= \mathrm{Log}\Bigg (\Bigg (\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}\big(\bar{\mathbf {b}}^g_i\big)
\mathrm{Exp}\Bigg (\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf {b}^g} \big(\delta \mathbf
{b}^g_i + \tilde{\delta }{\mathbf {b}^g_i}\big) \Bigg) \Bigg)^{\mathsf {T}}\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j
\Bigg) \nonumber \\
&\overset{(7)}{\simeq } \mathrm{Log}\Bigg (\Bigg (\Delta\, \tilde{\mathtt{R}}_{{\rm ij}}\big(\bar{\mathbf
{b}}^g_i\big) \ E \ \mathrm{Exp}\Bigg (\mathtt{J}_{r}^b \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial
\mathbf {b}^g} \tilde{\delta }{\mathbf {b}^g_i}\Bigg) \Bigg)^{\mathsf {T}}\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j
\Bigg) \nonumber \\
&= \mathrm{Log}\Bigg (\mathrm{Exp}\Bigg (-\mathtt{J}_{r}^b \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}}
}{\partial \mathbf {b}^g} \tilde{\delta }{\mathbf {b}^g_i}\Bigg) \big(\Delta\, \tilde{\mathtt{R}}_{{\rm
ij}}\big(\bar{\mathbf {b}}^g_i\big) \ E \big)^{\mathsf {T}}\mathtt{R}_i^{\mathsf {T}}\mathtt{R}_j \Bigg) \nonumber \\
&= \mathrm{Log}\Bigg (\mathrm{Exp}\Bigg (-\mathtt{J}_{r}^b \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}}
}{\partial \mathbf {b}^g} \tilde{\delta }{\mathbf {b}^g_i}\Bigg) \mathrm{Exp}\big(\mathbf {r}_{\Delta\,
\mathtt{R}_{{\rm ij}}}\big(\delta \mathbf {b}^g_i\big) \big) \Bigg) \nonumber \\
&\overset{(11)}{=} \mathrm{Log}\Bigg (\mathrm{Exp}\big (\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}\big(\delta
\mathbf {b}^g_i\big) \big) \nonumber \\
&\quad\; \cdot \mathrm{Exp}\Bigg (- \mathrm{Exp}\big(\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}\big(\delta
\mathbf {b}^g_i\big) \big)^{\mathsf {T}}\mathtt{J}_{r}^b \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial
\mathbf {b}^g} \tilde{\delta }{\mathbf {b}^g_i}\Bigg)\Bigg) \nonumber \\
\end{align*}
\begin{align}
&\overset{(9)}{\simeq } \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}\big(\delta \mathbf {b}^g_i\big) \nonumber \\
&\quad\; - \mathtt{J}_{r}^{-1}\big(\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}\big(\delta \mathbf
{b}^g_i\big)\big) \mathrm{Exp}\big(\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}\big(\delta \mathbf {b}^g_i\big)
\big)^{\mathsf {T}}\mathtt{J}_{r}^b \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf {b}^g} \
\tilde{\delta }{\mathbf {b}^g_i}
\end{align}
where we used the shorthands $E \doteq \mathrm{Exp}\left(\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf {b}^g} \delta \mathbf {b}^g_i\right)$ and $\mathtt{J}_{r}^b \doteq \mathtt{J}_{r}\left(\frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf {b}^g} \delta \mathbf {b}^g_i \right)$. In summary, the Jacobians of $\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}$ are
\begin{equation}
\begin{array}{ll}\frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \delta \boldsymbol{\phi }_i} = -
\mathtt{J}_{r}^{-1}(\mathbf {r}_{\Delta\, \mathtt{R}}(\mathtt{R}_i)) \mathtt{R}_j^{\mathsf {T}}\mathtt{R}_i &
\frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \delta \mathbf {p}_i} = \mathbf {0}\\
\frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \delta \mathbf {v}_i} = \mathbf {0}&
\frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \delta \boldsymbol{\phi }_j} =
\mathtt{J}_{r}^{-1}(\mathbf {r}_{\Delta\, \mathtt{R}}(\mathtt{R}_j)) \\
\frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \delta \mathbf {p}_j} = \mathbf {0}&
\frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \delta \mathbf {v}_j} = \mathbf {0}\\
\frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \tilde{\delta }{\mathbf {b}^a_i}} = \mathbf
{0}& \frac{\partial \mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}}{\partial \tilde{\delta }{\mathbf {b}^g_i}} =
\alpha \end{array}
\end{equation}
with $\alpha = -\mathtt{J}_{r}^{-1}\left(\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}(\delta \mathbf {b}^g_i) \right) \mathrm{Exp}\left(\mathbf {r}_{\Delta\, \mathtt{R}_{{\rm ij}}}(\delta \mathbf {b}^g_i) \right)^{\mathsf {T}}\mathtt{J}_{r}^b \frac{\partial \Delta \bar{\mathtt{R}}_{{\rm ij}} }{\partial \mathbf {b}^g}$.
SECTION D.
Structureless Vision Factors: Null Space Projection
In this section, we provide a more efficient implementation of the structureless vision factors, described in
Section VII.
Let us consider (55). Recall that $ \mathbf {Q}\doteq ({\mathbf I}- \mathbf {E}_{l} (\mathbf {E}_{l}^{\mathsf {T}}\mathbf {E}_{l})^{-1}\mathbf {E}_{l}^{\mathsf {T}}) \in \mathbb {R}^{2n_l \times 2n_l}$ is an orthogonal projector of $\mathbf {E}_{l}$, where $n_l$ is the number of cameras observing landmark $l$. Roughly speaking, $\mathbf {Q}$ projects any vector in $\mathbb {R}^{2n_l}$ onto the null space of the matrix $\mathbf {E}_{l}^{\mathsf {T}}$, i.e., the orthogonal complement of the column space of $\mathbf {E}_{l}$. Since $\mathbf {E}_{l} \in \mathbb {R}^{2 n_l \times 3}$ has rank 3, this null space has dimension $2n_l-3$. Any basis $\mathbf {E}^\perp _{l} \in \mathbb {R}^{2n_l \times (2n_l-3)}$ of the null space of $\mathbf {E}_{l}^{\mathsf {T}}$ satisfies the following relation [71]:
\begin{equation}
\mathbf {E}^\perp _{l} \left((\mathbf {E}^\perp _{l})^{\mathsf {T}}\mathbf {E}^\perp _{l} \right)^{-1}(\mathbf
{E}^\perp _{l})^{\mathsf {T}}= {\mathbf I}- \mathbf {E}_{l} (\mathbf {E}_{l}^{\mathsf {T}}\mathbf {E}_{l})^{-1}\mathbf
{E}_{l}^{\mathsf {T}}.
\end{equation}
A basis for this null space can be easily computed from $\mathbf {E}_{l}$ using the SVD. Such a basis is unitary, i.e., it satisfies $(\mathbf {E}^\perp _{l})^{\mathsf {T}}\mathbf {E}^\perp _{l} = {\mathbf I}$. Substituting (82) into (55), and recalling that $\mathbf {E}^\perp _{l}$ is a unitary matrix, we obtain
\begin{align}
&\sum _{l = 1}^L \Vert \mathbf {E}^\perp _{l} (\mathbf {E}^\perp _{l})^{\mathsf {T}}\left(\mathbf {F}_{l} \; \delta
\mathbf {T}_{\mathcal {X}(l)} - \mathbf {b}_{l} \right) \Vert ^2 \nonumber\\
&\quad= \sum _{l = 1}^L \big (\mathbf {E}^\perp _{l} \big(\mathbf {E}^\perp _{l}\big)^{\mathsf {T}}\big(\mathbf
{F}_{l}\, \delta \mathbf {T}_{\mathcal {X}(l)} - \mathbf {b}_{l} \big) \big)^{\mathsf {T}}\big (\mathbf {E}^\perp _{l}
\big(\mathbf {E}^\perp _{l}\big)^{\mathsf {T}}\nonumber \\
&\quad\quad\;\big(\mathbf {F}_{l}\, \delta \mathbf {T}_{\mathcal {X}(l)} - \mathbf {b}_{l}\big) \big) \nonumber\\
&\quad= \sum _{l = 1}^L \left(\mathbf {F}_{l}\, \delta \mathbf {T}_{\mathcal {X}(l)} - \mathbf {b}_{l}
\right)^{\mathsf {T}}\mathbf {E}^\perp _{l} \overbrace{\big(\mathbf {E}^\perp _{l}\big)^{\mathsf {T}}\mathbf {E}^\perp
_{l}}^{={\mathbf I}_{3\times 3}} \big(\mathbf {E}^\perp _{l}\big)^{\mathsf {T}} \nonumber \\
&\quad\quad\;\left(\mathbf {F}_{l}\, \delta \mathbf {T}_{\mathcal {X}(l)} - \mathbf {b}_{l} \right)\nonumber\\
&\quad =\sum _{l = 1}^L \Vert (\mathbf {E}^\perp _{l})^{\mathsf {T}}\left(\mathbf {F}_{l} \; \delta \mathbf
{T}_{\mathcal {X}(l)} - \mathbf {b}_{l} \right) \Vert ^2
\end{align}
which is an alternative representation of the cost function (55). This representation is usually preferable from a computational standpoint, as it avoids the matrix inversion in (55) and requires fewer matrix multiplications.
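A possible implementation of this projection (a sketch assuming NumPy's SVD; names are illustrative) computes the unitary basis $\mathbf {E}^\perp _{l}$ once per landmark and evaluates the projected residual without ever forming $(\mathbf {E}_{l}^{\mathsf {T}}\mathbf {E}_{l})^{-1}$:
\begin{verbatim}
import numpy as np

def nullspace_basis(E):
    """Unitary basis of the null space of E^T via the SVD.

    E is 2n_l x 3 with rank 3; the last 2n_l - 3 left singular vectors
    span the orthogonal complement of its column space.
    """
    U, _, _ = np.linalg.svd(E)      # full SVD: U is 2n_l x 2n_l
    return U[:, E.shape[1]:]        # drop the first rank(E) = 3 columns

def projected_cost_term(E, F, dT, b):
    """One summand of the projected cost: ||(E_perp)^T (F dT - b)||^2."""
    r = nullspace_basis(E).T @ (F @ dT - b)
    return float(r @ r)
\end{verbatim}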
SECTION E.
Rotation Rate Integration Using Euler Angles
In this section, we recall how to integrate rotation rate measurements using the Euler angle parametrization. Let
$\tilde{\boldsymbol{\omega }}_k$ be the rotation
rate measurement at time $k$ and
$\boldsymbol{\eta }^{g}_k$ be the corresponding
noise. Then, given the vector of Euler angles at time $k$, namely $\boldsymbol{\theta }_k \in \mathbb {R}^{3}$
, we can integrate the rotation rate measurement
$\tilde{\boldsymbol{\omega }}_k$ and get
$\boldsymbol{\theta }_{k+1}$ as
\begin{equation}
\boldsymbol{\theta }_{k+1} = \boldsymbol{\theta }_k + \big[E^{\prime }\big(\boldsymbol{\theta }_k\big)\big]^{-1}
\big(\tilde{\boldsymbol{\omega }}_k - \boldsymbol{\eta }^{g}_k\big)\Delta t
\end{equation}
where the matrix $E^{\prime }(\boldsymbol{\theta
}_k)$ is the conjugate Euler angle rate matrix
[72]. The covariance of
$\boldsymbol{\theta }_{k+1}$ can be approximated by a first-order propagation as
\begin{equation}
\mathbf {\Sigma }^{\text{Euler}}_{k+1} = \mathbf {A}_{k} \mathbf {\Sigma }^{\text{Euler}}_{k} \mathbf {A}_{k}^{\mathsf
{T}}+ \mathbf {B}_{k} \mathbf {\Sigma }_{\boldsymbol{\eta }} \mathbf {B}_{k}^{\mathsf {T}}
\end{equation}
where $\mathbf {A}_k \doteq {\mathbf I}_{3 \times 3}+ \frac{\partial \big([E^{\prime }(\boldsymbol{\theta }_k)]^{-1} \tilde{\boldsymbol{\omega }}_k \big)}{\partial \boldsymbol{\theta }_k}\Delta t$, $\mathbf {B}_k = - [E^{\prime }(\boldsymbol{\theta }_k)]^{-1} \Delta t$, and $\mathbf {\Sigma }_{\boldsymbol{\eta }}$ is the covariance of the measurement noise $\boldsymbol{\eta }^{{\rm gd}}_k$.
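To illustrate (and to fix one concrete convention, since $E^{\prime }(\boldsymbol{\theta })$ depends on the Euler sequence), the following sketch integrates one rotation rate sample under the common $ZYX$ (roll-pitch-yaw) parametrization; the matrix below is the standard inverse rate mapping for that sequence and is an assumption of this example, not a definition from the text:
\begin{verbatim}
import numpy as np

def E_prime_inv(theta):
    """[E'(theta)]^{-1} for ZYX (roll-pitch-yaw) Euler angles: maps a body
    rotation rate to Euler-angle rates. Singular at pitch = +/- pi/2
    (gimbal lock), which motivates the on-manifold treatment of SO(3)."""
    r, p, _ = theta
    sr, cr = np.sin(r), np.cos(r)
    cp, tp = np.cos(p), np.tan(p)
    return np.array([[1.0, sr * tp, cr * tp],
                     [0.0, cr,      -sr    ],
                     [0.0, sr / cp, cr / cp]])

def integrate_euler(theta, omega_tilde, dt):
    """One step of the update above, with the noise at its zero mean."""
    return theta + E_prime_inv(theta) @ omega_tilde * dt
\end{verbatim}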