1 Introduction
Federated learning (FL) decouples the ability to construct a machine learning model from the need to store the data in the cloud. A global model is obtained by aggregating the local models of hundreds or thousands of clients without exposing their local data or training processes to any third party [1], [2], [3]. The participants can be sensors, home gateways, micro servers, small cells, or smartphones equipped with storage and computation capabilities. Motivating applications include training image classifiers [4], next-word predictors on users’ smartphones [2], and smart wearable healthcare [5]. Unlike conventional distributed machine learning [6], an FL system consists of a large number of clients that may possess erroneous data (e.g., mislabeled data), which seriously hinders the global model from achieving good performance [7], [8]. For instance, data collected by crowdsourcing [9] or by web crawlers may contain mislabeled samples. One of our experiments in Section 6 shows that a two-class image classifier trained by FL on datasets crawled from image search engines suffered an accuracy drop from 91.6% to 88.2% due to the presence of 9% mislabeled data.
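To make the aggregation step above concrete, the following is a minimal sketch of a FedAvg-style weighted average of client models, assuming each client uploads its locally trained weights together with its local sample count; the function and variable names (aggregate, client_updates) are illustrative and are not taken from the cited works.

    # Minimal sketch of FedAvg-style aggregation (assumption: weighting by
    # each client's local sample count). Illustrative only.
    import numpy as np

    def aggregate(client_updates):
        """client_updates: list of (weights, num_samples) pairs, where
        `weights` is a list of numpy arrays, one per model layer."""
        total = sum(n for _, n in client_updates)
        # Start from zero tensors shaped like the first client's weights.
        global_weights = [np.zeros_like(layer) for layer in client_updates[0][0]]
        for weights, n in client_updates:
            for g, w in zip(global_weights, weights):
                g += (n / total) * w  # weight each client by its data size
        return global_weights

In such a scheme the server only ever sees model parameters, never raw training samples, which is what makes erroneous local data hard to detect from the server side.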