I. Introduction
Recently, Big Data analytics has seen a paradigm shift towards Edge AI (i.e., pushing intelligence towards the edge). This shift is mainly driven by the need to move computation toward data sources in an effort to reduce communication costs as well as enhance privacy and security [1], [2]. These efforts led to the formation of the Federated Learning (FL) paradigm, which transformed traditional distributed machine learning (ML) training methods. Many service providers, such as Google, Facebook, and Apple, use FL to train global models for natural language processing (NLP) and computer vision (CV) tasks to serve applications such as virtual keyboards, object detection, image classification, and recommendation systems [3]–[7]. FL is also commonly used with distributed data such as medical images [8] and smart camera images [9].

In FL, the central server managing the models ships them to the clients' end-devices, on which training is performed locally to preserve the privacy and security of user data. Due to the lack of control over client devices, FL environments are highly heterogeneous, which presents a variety of challenges. The process is participatory and relies on the availability of the clients and their data: clients produce and store the application data used to locally train the model, and they contribute their model updates to the central server for incorporation into the global model.

Time-to-accuracy is a vital performance measure for training quality and is the focus of much work in this area [2], [10]–[13]. Generally, the objective is to reduce the time-to-accuracy by improving statistical efficiency (fewer training rounds to reach a target accuracy) and reducing the training time per round. Reducing the training time requires time-efficient training algorithms, hardware acceleration, and bandwidth-efficient communication methods on the devices [2], [13], [14].
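To make the FL training loop described above concrete, the following is a minimal, self-contained Python sketch of one communication round: the server ships the global model to clients, each client trains locally on its private data, and the server aggregates the returned updates. It is illustrative only; the weighted-averaging aggregation follows the widely used FedAvg scheme rather than any specific system from the cited work, and the function names (`local_train`, `federated_round`) and the toy least-squares objective are hypothetical.

```python
import numpy as np

def local_train(global_weights, client_data, lr=0.1, epochs=1):
    """Hypothetical local update: full-batch gradient steps on the
    client's private data for a toy least-squares objective."""
    w = global_weights.copy()
    X, y = client_data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of MSE loss
        w -= lr * grad
    return w  # only the model update leaves the device, not the data

def federated_round(global_weights, clients):
    """One FL round: ship the model, train locally on each client,
    then aggregate by data-size-weighted averaging (FedAvg-style)."""
    updates, sizes = [], []
    for data in clients:
        updates.append(local_train(global_weights, data))
        sizes.append(len(data[1]))
    coeffs = np.array(sizes) / sum(sizes)
    return sum(c * u for c, u in zip(coeffs, updates))

# Toy simulation: three clients with heterogeneous amounts of
# private linear-regression data drawn around the same true model.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (50, 80, 30):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=n)))

w = np.zeros(2)
for _ in range(20):  # communication rounds
    w = federated_round(w, clients)
print(w)  # approaches true_w without pooling raw client data
```

In this framing, time-to-accuracy is roughly the number of such rounds needed to reach a target accuracy multiplied by the wall-clock time per round, which is why both statistical efficiency and per-round training/communication time matter.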