I. Introduction
With the continued development of deep learning, the widespread use of modern mobile devices, and the maturing of technologies such as the Internet of Things (IoT), the demand for running model computation and prediction on edge devices continues to rise [1], [2], [3]. The data held on this growing number of edge devices is essential for improving the performance of deep models on the client side. In traditional machine learning, each device must upload its local data to a central server, where the model is trained [4]. To speed up training on massive data, one solution is distributed machine learning [5], which distributes training across multiple nodes. However, distributed machine learning methods neither protect private user data on edge devices nor address the problem that data are not independent and identically distributed (non-i.i.d.) across clients. Such a training strategy, in which a global server can access all data, may lead to leakage of user data and increase the risk of attack [6], [7], [8].