I. Introduction
The past few years have witnessed an increasing migration of data-driven applications from the centralized cloud to mobile devices, driven by rising privacy concerns. Originating from distributed learning [1], Federated Learning (FL) learns a centralized model while the training data remains held privately by end users [2]–[10]. The users compute local models in parallel and send their updates to a centralized parameter server for aggregation. The server averages the users' updates and pushes the averaged model back to all users as the starting point for the next iteration.
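As an illustration, the train-average-broadcast loop described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of any cited system: the names `local_update`, `server_average`, and `federated_round` are hypothetical, the local objective is assumed to be least-squares regression, and the server is assumed to take an unweighted mean (weighting each user by its number of samples is a common variant).

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Dataset = Sequence[Tuple[Vector, float]]  # (features, target) pairs held by one user


def local_update(model: Vector, data: Dataset, lr: float = 0.1, steps: int = 5) -> Vector:
    """Hypothetical local training: a few gradient steps of least squares on private data."""
    w = list(model)  # start from the model pushed by the server
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def server_average(models: Sequence[Vector]) -> Vector:
    """Server side: coordinate-wise unweighted mean of the users' local models."""
    return [sum(col) / len(models) for col in zip(*models)]


def federated_round(global_model: Vector, user_data: Sequence[Dataset]) -> Vector:
    """One iteration: users train locally (in parallel in practice), the server averages."""
    local_models = [local_update(global_model, d) for d in user_data]
    return server_average(local_models)


if __name__ == "__main__":
    # Two users whose private data both follow y = 2x (assumed toy data).
    users = [[([1.0], 2.0)], [([2.0], 4.0)]]
    model = [0.0]
    for _ in range(20):
        model = federated_round(model, users)
    print(model)
```

Only model parameters cross the network in this sketch; the raw `(features, target)` pairs never leave `local_update`, which is the property that motivates FL in the first place.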