
PIVODL: Privacy-Preserving Vertical Federated Learning Over Distributed Labels



Impact Statement:
Federated learning is a distributed machine learning framework proposed for privacy preservation. Most federated learning algorithms work on horizontally partitioned data, with only a few exceptions considering vertically partitioned data, which is widely seen in the real world. However, existing vertical federated learning makes the unrealistic assumption that data labels reside on only one device, and no research reported so far considers data labels distributed across multiple client devices. The PIVODL framework reported in this work allows us to build a secure vertical federated XGBoost system in which the labels may be distributed either on one device or across multiple devices, making it possible to apply federated learning to a wider range of real-world problems.

Abstract:

Federated learning (FL) is an emerging privacy-preserving machine learning protocol that allows multiple devices to collaboratively train a shared global model without revealing their private local data. Nonparametric models such as gradient boosting decision trees (GBDTs) have been commonly used in FL for vertically partitioned data. However, existing studies assume that all data labels are stored on a single client, which may be unrealistic for real-world applications. Therefore, in this article, we propose a secure vertical FL framework, named privacy-preserving vertical federated learning system over distributed labels (PIVODL), to train GBDTs with data labels distributed on multiple devices. Both homomorphic encryption and differential privacy are adopted to prevent label information from being leaked through transmitted gradients and leaf values. Our experimental results show that both information leakage and model performance degradation of the proposed PIVODL are negligible.
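The abstract states that homomorphic encryption prevents label information from leaking through transmitted gradients. The sketch below illustrates the underlying primitive with a toy Paillier-style additively homomorphic scheme: a label-holding party can encrypt per-sample gradients, and another party can sum the ciphertexts for split evaluation without ever decrypting individual values. The key sizes, function names, and parameters here are illustrative only and are not taken from the paper; real systems use keys of 2048 bits or more.

```python
import math
import random

def keygen(p=1000003, q=999983):
    """Toy Paillier keypair from two small primes (illustrative only)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # with generator g = n + 1, mu = lambda^{-1} mod n
    return (n,), (n, lam, mu)

def encrypt(pk, m):
    """Encrypt integer m; randomness r makes ciphertexts non-deterministic."""
    (n,) = pk
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(sk, c):
    n, lam, mu = sk
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n

def he_add(pk, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    (n,) = pk
    return c1 * c2 % (n * n)
```

A party holding only the public key can thus aggregate encrypted gradients (e.g., `he_add(pk, g1_enc, g2_enc)`) and return the encrypted sum, so individual gradients, and hence labels, stay hidden from it.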
Published in: IEEE Transactions on Artificial Intelligence ( Volume: 4, Issue: 5, October 2023)
Page(s): 988 - 1001
Date of Publication: 28 December 2021
Electronic ISSN: 2691-4581


I. Introduction

Data privacy has become a major concern in modern societies, and the recently enacted General Data Protection Regulation restricts the indiscriminate sharing and exchange of personal data. This poses a significant barrier to model training, since standard centralized machine learning algorithms require collecting and storing training data on a single cloud server. To tackle this issue, federated learning (FL) [1] was proposed to enable multiple edge devices to collaboratively train a shared global model while keeping all users’ data on their local devices.
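The collaborative training loop described above can be sketched as a federated averaging round: each client updates the shared model on its own private data, and the server averages only the resulting parameters, never the data. The one-parameter least-squares model and client datasets below are illustrative assumptions, not part of the paper.

```python
def local_step(w, data, lr=0.01):
    """One gradient step of squared loss (w*x - y)^2 on a client's private data."""
    g = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * g

def fedavg_round(w_global, client_datasets):
    """Each client trains locally; only model weights reach the server."""
    local_ws = [local_step(w_global, d) for d in client_datasets]
    return sum(local_ws) / len(local_ws)

# Two clients whose private data follow y = 2x; the server never sees (x, y).
client_a = [(1.0, 2.0), (2.0, 4.0)]
client_b = [(3.0, 6.0)]
w = 0.0
for _ in range(200):
    w = fedavg_round(w, [client_a, client_b])
```

After enough rounds, the global weight `w` approaches the underlying slope 2.0 even though no client data ever left its device.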

