I. Introduction
Mobile crowdsensing (MCS), a critical component of the Internet of Things (IoT) [1], leverages the sensors (e.g., GPS, camera, and microphone) embedded in individuals' mobile smart devices (e.g., mobile phones) to collect sensing data and crowd wisdom for complex sensing tasks [2], [3], such as indoor localization [4], object tracking [5], event detection [6], smart city management [7], and environmental monitoring [8]. In addition, many commercial MCS platforms have been developed, such as EasyShift, Fieldagent, and SmartRoadSense. In practical MCS data integration tasks, heterogeneous sensing data are widespread. For example, one real sensing task on EasyShift asks participants to find several frozen breakfast products in the frozen food section of a given store, where the required data include not only numerical GPS data but also images of the breakfast products and text descriptions of the frozen food.