
NoPeek: Information leakage reduction to share activations in distributed deep learning


Abstract:

For distributed machine learning with sensitive data, we demonstrate how minimizing distance correlation between raw data and intermediary representations reduces leakage of sensitive raw data patterns across client communications while maintaining model accuracy. Leakage (measured using distance correlation between input and intermediate representations) is the risk associated with the invertibility of raw data from intermediary representations. This can prevent client entities that hold sensitive data from using distributed deep learning services. We demonstrate that our method is resilient to such reconstruction attacks and is based on reduction of distance correlation between raw data and learned representations during training and inference with image datasets. We prevent such reconstruction of raw data while maintaining information required to sustain good classification accuracies.
Date of Conference: 17-20 November 2020
Date Added to IEEE Xplore: 16 February 2021
Conference Location: Sorrento, Italy

I. Introduction

Data sharing and distributed computation with security, privacy, and safety have been identified as important current trends in applying data mining and machine learning to healthcare, computer vision, cyber-security, the internet of things, distributed systems, data fusion, and finance [1]–[9]. Siloed data held by multiple client (device or organizational) entities that do not trust each other due to sensitivity and privacy concerns poses a barrier to distributed machine learning. This paper proposes a way to mitigate the reconstruction of raw data by malicious attackers in such distributed machine learning settings. Our approach is based on minimizing a statistical dependency measure called distance correlation [10]–[14] between the raw data and any intermediary communications exchanged among the clients or server participating in distributed deep learning. We also ensure that the learnt representations maintain reasonable classification accuracy, keeping the model useful while protecting sensitive raw data from reconstruction by an attacker situated in any of the untrusted clients participating in distributed machine learning.
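To make the idea concrete, below is a minimal sketch (not the authors' released code) of how a batch-level distance correlation term between raw inputs and intermediate activations could be added to a standard classification loss during training. It assumes PyTorch; the helper names, the sample estimator of distance correlation, and the weighting coefficient alpha are illustrative assumptions rather than details taken from the paper.

# Hedged sketch: combining a distance-correlation leakage term with a task loss.
import torch
import torch.nn.functional as F

def pairwise_distances(x: torch.Tensor) -> torch.Tensor:
    # Euclidean distance matrix between rows of a flattened batch (n, d).
    x = x.flatten(start_dim=1)
    return torch.cdist(x, x, p=2)

def distance_correlation(x: torch.Tensor, z: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    # Sample distance correlation (Szekely et al.) between raw inputs x
    # and their intermediate representations z, computed over one batch.
    a, b = pairwise_distances(x), pairwise_distances(z)
    # Double-center each pairwise distance matrix.
    A = a - a.mean(dim=0, keepdim=True) - a.mean(dim=1, keepdim=True) + a.mean()
    B = b - b.mean(dim=0, keepdim=True) - b.mean(dim=1, keepdim=True) + b.mean()
    dcov2 = (A * B).mean()          # squared distance covariance
    dvar_x = (A * A).mean()         # squared distance variance of x
    dvar_z = (B * B).mean()         # squared distance variance of z
    return torch.sqrt(dcov2.clamp(min=0.0) / torch.sqrt(dvar_x * dvar_z).clamp(min=eps))

def nopeek_style_loss(inputs, labels, activations, logits, alpha: float = 0.1) -> torch.Tensor:
    # Total objective: alpha * leakage term + classification term.
    # alpha is an assumed value trading off reconstruction resistance against accuracy.
    leakage = distance_correlation(inputs, activations)
    task = F.cross_entropy(logits, labels)
    return alpha * leakage + task

In a split-learning reading of this sketch, activations would be the cut-layer output that a client communicates onward, so the leakage term penalizes statistical dependence between what is shared and the raw inputs while the cross-entropy term preserves task accuracy.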

