I. Introduction
With the rapid growth of the Internet of Things (IoT), IoT devices generate a large amount of data every day [1]. To take advantage of this distributed data, edge machine learning algorithms are being developed to realize intelligent applications in wireless networks [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11]. Federated learning (FL) [12] has been proposed as a method to collaboratively train a machine learning model on the local data of many wireless devices. Compared with traditional centralized learning, which transmits large amounts of raw data to a cloud server for training, FL can effectively protect data privacy because the local data are not exchanged. However, in FL the IoT devices, also called workers, must perform the local updates of the training model with their own computing power. This can greatly increase the computation burden on the workers, especially when training deep neural networks with high computational complexity. When workers have low computing power, the FL training time can be significantly prolonged, which can impede the practical application of FL. In addition, performing local updates entirely with the workers' own computing power increases their energy consumption.