I. Introduction
With the rapid development of the Internet of Things (IoT), massive numbers of smart devices generate high volumes of data, including images and voice recordings, which feed data-driven artificial intelligence (AI) models in applications such as staff recognition and risk prediction [1], [2]. However, traditional machine learning requires users to send personal information to a centralized server, which puts data privacy and confidentiality at risk. For example, cameras that share photos with a server may expose sensitive locations. To address this challenge, federated learning (FL) has emerged as a distributed machine learning paradigm in which clients carry out training themselves so that users' data remain local [3], [4]. Rather than sending personal data to the centralized server, each client trains the shared global model received from the server on its own data and uploads only the resulting gradients for aggregation, so that raw data never leave the device.
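To make the FL workflow described above concrete, the following is a minimal illustrative sketch, not the method of any cited work. It assumes a one-parameter linear model, a squared-error loss, and an unweighted average as the aggregation rule; the names `local_gradient`, `aggregate`, and `run_rounds`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (feature x, label y); toy data, assumed for illustration


def local_gradient(w: float, data: List[Sample]) -> float:
    """Client side: gradient of the mean squared error of y_hat = w * x on local data."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def aggregate(gradients: List[float]) -> float:
    """Server side: unweighted average of the client gradients (an assumed rule)."""
    return sum(gradients) / len(gradients)


def run_rounds(clients: List[List[Sample]], rounds: int = 200, lr: float = 0.05) -> float:
    """Run several communication rounds and return the final global parameter."""
    w = 0.0  # global model held by the server
    for _ in range(rounds):
        # The server broadcasts w; each client computes a gradient on its own data.
        grads = [local_gradient(w, data) for data in clients]
        # Only gradients reach the server, which updates the global model.
        w -= lr * aggregate(grads)
    return w


# Three clients whose private samples all follow y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5)],
]
final_w = run_rounds(clients)
print(f"global parameter after training: {final_w:.4f}")
```

In this sketch the server sees only one number per client per round, while the samples themselves stay in each client's list. Practical systems typically weight client contributions by local dataset size and may run several local steps per round; those refinements are omitted here for brevity.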