
Federated Learning With Differential Privacy: Algorithms and Performance Analysis



Abstract:

Federated learning (FL), as a type of distributed machine learning, is capable of significantly preserving clients' private data from being exposed to adversaries. Nevertheless, private information can still be divulged by analyzing uploaded parameters from clients, e.g., weights trained in deep neural networks. In this paper, to effectively prevent information leakage, we propose a novel framework based on the concept of differential privacy (DP), in which artificial noise is added to parameters at the clients' side before aggregating, namely, noising before model aggregation FL (NbAFL). First, we prove that the NbAFL can satisfy DP under distinct protection levels by properly adapting different variances of artificial noise. Then we develop a theoretical convergence bound on the loss function of the trained FL model in the NbAFL. Specifically, the theoretical bound reveals the following three key properties: 1) there is a tradeoff between convergence performance and privacy protection levels, i.e., better convergence performance leads to a lower protection level; 2) given a fixed privacy protection level, increasing the number $N$ of overall clients participating in FL can improve the convergence performance; and 3) there is an optimal number of aggregation times (communication rounds) in terms of convergence performance for a given protection level. Furthermore, we propose a $K$-client random scheduling strategy, where $K$ ($1 \leq K < N$) clients are randomly selected from the $N$ overall clients to participate in each aggregation. We also develop a corresponding convergence bound on the loss function in this case, and show that the $K$-client random scheduling strategy retains the above three properties. Moreover, we find that there is an optimal $K$ that achieves the best convergence performance at a fixed privacy level. Evaluations demonstrate that our theoretical results are consistent with simulations, thereby facilitating the design of various privacy-preserving FL algorithms with different tradeoff requirements on convergence performance and privacy levels.
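To make the noising-before-aggregation idea concrete, the following is a minimal sketch of one NbAFL-style communication round with $K$-client random scheduling. It assumes the standard Gaussian mechanism with a hypothetical sensitivity of $2C$ for a clipped update (the paper derives its own, round-dependent noise variances, so the calibration below is a placeholder); function names such as nbafl_round and gaussian_std are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def clip_update(w, clip_norm):
    """Clip a local parameter vector to L2 norm at most clip_norm (the bound C)."""
    norm = np.linalg.norm(w)
    return w * min(1.0, clip_norm / max(norm, 1e-12))

def gaussian_std(epsilon, delta, sensitivity):
    """Noise std. dev. of the standard Gaussian mechanism for one exposure.

    sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon (valid for
    epsilon <= 1). The paper's analysis yields tighter variances that
    also account for the total number of communication rounds.
    """
    return np.sqrt(2.0 * np.log(1.25 / delta)) * sensitivity / epsilon

def nbafl_round(local_updates, K, clip_norm, epsilon, delta):
    """One round: sample K of N clients; each clips its update and adds
    Gaussian noise *before* upload; the server averages the noisy uploads."""
    N = len(local_updates)
    chosen = rng.choice(N, size=K, replace=False)
    sigma = gaussian_std(epsilon, delta, sensitivity=2.0 * clip_norm)
    noisy = [clip_update(local_updates[i], clip_norm)
             + rng.normal(0.0, sigma, size=local_updates[i].shape)
             for i in chosen]
    return np.mean(noisy, axis=0)  # server-side aggregation

# Toy usage: N = 10 clients, K = 5 scheduled per round, 4 model parameters.
updates = [rng.normal(size=4) for _ in range(10)]
w_global = nbafl_round(updates, K=5, clip_norm=1.0, epsilon=1.0, delta=1e-5)
print(w_global)
```

Because the noise is injected at the clients before transmission, the server (and any eavesdropper on the uplink) only ever sees perturbed parameters, which is what distinguishes this scheme from adding noise at the aggregator.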
Page(s): 3454 - 3469
Date of Publication: 17 April 2020



I. Introduction

It is anticipated that big data-driven artificial intelligence (AI) will soon be applied in many aspects of our daily life, including medical care, agriculture, transportation systems, etc. At the same time, the rapid growth of Internet-of-Things (IoT) applications calls for secure and reliable data mining and learning in distributed systems [1]–[3]. When integrating AI into a variety of IoT applications, distributed machine learning (ML) is preferred for many data processing tasks, as it defines parametrized functions from inputs to outputs as compositions of basic building blocks [4], [5]. Federated learning (FL) is a recent advance in distributed ML in which data are acquired and processed locally at the client side, and then the updated ML parameters are transmitted to a central server for aggregation [6]–[8]. The goal of FL is to fit a model generated by an empirical risk minimization (ERM) objective. However, FL also poses several key challenges, such as private information leakage, expensive communication costs between servers and clients, and device variability [9]–[15].
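For concreteness, the ERM objective in FL typically takes the following standard form; the notation here ($p_i$, $\mathcal{D}_i$, $\ell$) is generic rather than quoted from this paper:

```latex
\min_{\mathbf{w}} F(\mathbf{w}) \triangleq \sum_{i=1}^{N} p_i F_i(\mathbf{w}),
\qquad
F_i(\mathbf{w}) = \frac{1}{|\mathcal{D}_i|}
\sum_{(x_j, y_j) \in \mathcal{D}_i} \ell(\mathbf{w}; x_j, y_j),
```

where $F_i$ is the local empirical risk of client $i$ over its private dataset $\mathcal{D}_i$, $\ell$ is a per-sample loss, and $p_i = |\mathcal{D}_i| / \sum_{k=1}^{N} |\mathcal{D}_k|$ weights clients by data volume. Each client minimizes its own $F_i$ locally and uploads only the resulting parameters, which is why raw data never leave the client.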
