I. Introduction
Federated learning (FL) is an emerging distributed learning paradigm that enables multiple edge devices to train a common global model without sharing their individual data [1]. This privacy-friendly approach to data analytics across massive numbers of devices is envisioned as a promising way to realize pervasive intelligence [2]. However, in many real-world applications, mobile devices are equipped with different local resources, which raises new challenges for on-demand local training [3]. Given the differing status of local resources (e.g., computing capability and communication channel state) and personalized efficiency constraints (e.g., latency and energy), it is crucial to customize training strategies for heterogeneous edge devices.