1 Introduction
Localization and mapping are two major topics in research on autonomous robot navigation and positioning, and both are essential for many applications, such as autonomous cars, search and rescue, and household robots. Visual odometry (VO) is an important approach to localization and mapping, but a robust solution is hard to obtain because the images captured by the camera are easily affected by environmental conditions and the results are sensitive to measurement noise. Since inertial measurement units (IMUs) can typically acquire accurate short-term measurements at a high rate and are resistant to external interference, combining visual and inertial measurements has become a popular means of compensating for the errors of both IMUs and cameras and, ultimately, of providing more accurate estimates for localization and mapping.