
Edge-based Protection Against Malicious Poisoning for Distributed Federated Learning



Abstract:

Federated learning has been proposed to break down data islands and protect privacy. Especially in the big data environment, participating users can jointly build a model without sharing private sensitive data. However, as the number of end devices grows and the model becomes more complex, highly concurrent access to the cloud server often introduces communication delay and places a heavy burden on the computing power of end devices. To address this problem, we introduce Unmanned Aerial Vehicle (UAV) swarms as mobile edge nodes for end devices. UAV swarms can provide caching and computing resources for end devices, so edge aggregation of parameters can be performed on the UAV swarms to reduce direct access to the cloud server. Meanwhile, the distributed end-edge-cloud federated learning architecture based on UAV swarms is an open environment, which may contain malicious end devices or external channel eavesdroppers. Such adversaries may poison the training data sets or model parameters to reduce the classification accuracy of the model. To resist malicious poisoning, the UAV swarms compute the cosine similarities between local parameters and their edge aggregation parameters and exclude malicious parameters that do not conform to the trend of collaborative convergence. The reliable parameters are then aggregated again and uploaded to the cloud server with a Schnorr signature to ensure the authenticity of the data. We analyze the security of the proposed scheme and verify through experiments that it can effectively resist malicious poisoning and improve the accuracy of the model.
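A minimal sketch (not the authors' implementation) of the edge-side filtering step described above: local updates whose cosine similarity to the first-pass edge aggregate falls below a threshold are excluded before re-aggregation. The flattened-vector representation, the threshold value, and the function names are illustrative assumptions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened parameter vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def filter_and_aggregate(local_updates: list, threshold: float = 0.5) -> np.ndarray:
    """Drop updates that deviate from the collaborative convergence trend,
    then average the remaining (reliable) updates on the edge node."""
    # First-pass edge aggregation over all received local parameters.
    edge_agg = np.mean(local_updates, axis=0)

    # Keep only updates that point roughly in the same direction as the
    # edge aggregate; low similarity suggests a poisoned update.
    reliable = [u for u in local_updates
                if cosine_similarity(u, edge_agg) >= threshold]

    # Re-aggregate the reliable updates (fall back to the first pass if
    # everything was filtered out).
    return np.mean(reliable, axis=0) if reliable else edge_agg
```

In the scheme described above, the result of this re-aggregation would then be signed (e.g., with a Schnorr signature, as the abstract states) before being forwarded to the cloud server; the signing step is omitted here.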
Date of Conference: 04-06 May 2022
Conference Location: Hangzhou, China


I. Introduction

The surge of smartphones, Internet of Things (IoT) devices and other end devices has led to the arrival of the big data era [1]. Deep learning provides an effective means for processing massive data [2], such as managing large volumes of patient data for disease prediction or performing independent safety audits from system logs. However, centralized deep learning often leads to the disclosure of users' data and a series of privacy problems. Federated learning (FL) [3] has been proposed to solve this dilemma. FL allows users to participate in global training without sharing private sample data, thereby protecting the privacy of users' data. Specifically, each user trains the global model on a private dataset and uploads only the updated parameters (i.e., weights and biases) to the central cloud server for aggregation; this process is repeated until the model converges. However, as the number of users participating in training increases and more complex deep learning models are used, the parameters uploaded by users become larger and larger, which inevitably causes bandwidth contention and communication delay [4]. Communication compression methods such as sketched updates [5] alleviate the communication pressure by compressing the uploaded gradients, but they lose gradient information and reduce the accuracy of the model. To relieve the pressure on communication and computation, FL has gradually evolved from an end-cloud to an end-edge-cloud architecture.
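As a rough illustration of the end-cloud workflow just described, the sketch below shows one synchronous federated-averaging round in which each client trains on its private data and uploads only parameters. The `client.local_train` interface, the NumPy parameter representation, and the sample-size weighting are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def fedavg_round(global_params: dict, clients: list) -> dict:
    """One synchronous round: broadcast, local training, weighted averaging."""
    updates, sizes = [], []
    for client in clients:
        # Each client trains the current global model on its private dataset
        # and returns only the updated parameters, never the raw data.
        local_params, n_samples = client.local_train(dict(global_params))
        updates.append(local_params)
        sizes.append(n_samples)

    total = sum(sizes)
    # Cloud-side aggregation: sample-size-weighted average of each tensor.
    return {
        name: sum(w * u[name] for w, u in zip(sizes, updates)) / total
        for name in global_params
    }
```

In the end-edge-cloud architecture considered in this paper, the aggregation step would first run on the UAV-swarm edge nodes, and the cloud server would only aggregate the edge-level results.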
