I. Introduction
Deep learning has been instrumental in advances in both data analysis and artificial intelligence (AI) [1], and its algorithms have been applied successfully across nearly every area of industry, science, and engineering [2]. However, traditional deep learning often requires large amounts of training data and centralized training, both of which can be impractical or even impossible in real deployments. For example, mobile devices cannot easily collect and transfer large volumes of data to a central server over limited bandwidth [3]. Centralized training is also often infeasible because of data privacy requirements and data silos, which stem from information-security considerations, relevant laws and regulations, and intense competition among technology giants [4].

In recent years, federated learning (FL) has emerged in response to the growing demand for privacy preservation and related concerns in machine learning applications [5]. In contrast to the traditional approach of centralized data storage and model training, this paradigm lets multiple devices train a global model collaboratively by uploading their local models to a server while keeping their private data local. FL centers on privacy preservation, knowledge transfer, and collaboration among data owners, all of which are essential for many real-world AI applications. After several years of development, FL has produced promising results, particularly in industries that handle sensitive data, such as finance, healthcare, electronic communication, and government affairs [6], [7], [8].
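The collaborative scheme described above can be sketched concretely. The snippet below is a minimal toy illustration of one common instantiation of this idea, weighted model averaging in the style of FedAvg; the linear model, the synthetic client data, and all function names are illustrative assumptions, not part of the surveyed literature. Each client runs a few gradient steps on its private data, and the server averages the returned parameters weighted by local dataset size, so the raw data never leaves the clients.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # One client's local training: a few gradient-descent steps
    # on a least-squares objective, using only its private (X, y).
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(global_w, clients):
    # One communication round: every client trains locally and
    # uploads its model; the server averages the models, weighted
    # by each client's number of samples.
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Toy setup (assumption): two clients hold disjoint noiseless data
# generated from the same underlying linear model.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (40, 60):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(30):
    w = fedavg_round(w, clients)
```

After 30 rounds the aggregated model `w` recovers `true_w` closely, even though the server never sees either client's data, only their parameter vectors. Real FL systems add client sampling, secure aggregation, and compression on top of this basic loop.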