I. Introduction
As a framework for distributed online computing and model training, federated learning (FL) has shown significant potential in applications such as the Internet of Things, autonomous driving, and remote medical care [1]. FL enables individual mobile clients to collectively train a global model without releasing their data [2]. Specifically, each client trains its local model independently on its local dataset and sends the gradient of the local model to a server. The server aggregates these gradients and broadcasts the aggregated global parameters to assist the clients in their subsequent local training.
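As a minimal sketch of this workflow, assuming a standard FedSGD-style aggregation rule (the specific scheme used here may differ): at round $t$, each client $k$ computes a local gradient $g_k^t = \nabla F_k(w^t)$ of its local objective $F_k$ at the current global parameter $w^t$, and the server forms the update
\begin{equation*}
w^{t+1} = w^t - \eta \sum_{k=1}^{K} \frac{n_k}{n}\, g_k^t,
\end{equation*}
where $\eta$ is the learning rate, $n_k$ is the size of client $k$'s local dataset, and $n = \sum_{k=1}^{K} n_k$; the updated $w^{t+1}$ is then broadcast back to the clients.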