I. Introduction
Ubiquitous mobile devices, together with their rich sets of sensors, have fostered mobile crowdsensing as a distributed paradigm for gathering large-scale, highly diverse, and rich context information for various sensing applications. Despite its potential, mobile crowdsensing is hindered by challenges related to communication efficiency and user privacy. A solution such as federated learning protects privacy but does not address the communication overhead: it requires multiple rounds of exchanging data or model parameters between the devices and the server, so significant communication is unavoidable. Furthermore, in terms of performance, it is more promising to train a model on a large amount of sensory data at a central server than on IoT devices with limited computing resources. In particular, the straggler effect in federated learning means that performance is bounded by the computation capabilities of the mobile devices [1]–[3]. Therefore, how to crowdsource massive sensory data with little communication overhead while preserving privacy remains a key open problem.