
Partial Training Mechanism to Handle the Impact of Stragglers in Federated Learning with Heterogeneous Clients



Abstract:

Federated Learning (FL) allows distributed devices, known as clients, to train Machine Learning (ML) models collaboratively without sharing sensitive data. A characteristic of FL for mobile and IoT environments is system heterogeneity among clients, which can vary from low-end devices with constrained communication and computing resources to powerful devices with high-speed network access and dedicated GPUs. As the server must wait for all the clients to communicate their updates, slow clients (a.k.a. stragglers) significantly increase the training time. To tackle this problem, we propose FedPulse, a Partial Training (PT) based mechanism to mitigate the effect of stragglers in FL. The idea is to reduce the training time by dynamically allocating smaller submodels to resource-constrained clients. Experimental results on well-known classification datasets show that the proposed solution outperforms other submodel allocation mechanisms and reduces the training time by up to 58% with an accuracy loss of less than 1% compared to FedAvg.
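To make the partial-training idea above concrete, below is a minimal sketch, assuming a simple width-based allocation rule: slower clients receive a narrower slice of each dense layer, and the server averages every parameter over the clients whose submodel covered it. All names (width_for, extract_submodel, merge_submodels) and the allocation policy are illustrative assumptions, not FedPulse's actual interface or algorithm.

```python
# Sketch of partial-training (PT) style submodel allocation in FL.
# Assumption: slower clients get a narrower prefix of each dense layer;
# the server averages each parameter over the clients that trained it.
import numpy as np

def width_for(client_speed, min_ratio=0.25):
    """Map a client's relative speed in (0, 1] to a submodel width ratio."""
    return max(min_ratio, min(1.0, client_speed))

def extract_submodel(global_weights, ratio):
    """Take the leading `ratio` fraction of rows/columns of each dense layer.
    A real system would keep the model's input and output dimensions fixed."""
    sub = []
    for W in global_weights:
        rows = max(1, int(W.shape[0] * ratio))
        cols = max(1, int(W.shape[1] * ratio))
        sub.append(W[:rows, :cols].copy())
    return sub

def merge_submodels(global_weights, client_updates):
    """Average each parameter over the clients whose submodel covered it."""
    new_weights = [W.copy() for W in global_weights]
    sums = [np.zeros_like(W) for W in global_weights]
    counts = [np.zeros_like(W) for W in global_weights]
    for sub in client_updates:
        for i, Ws in enumerate(sub):
            r, c = Ws.shape
            sums[i][:r, :c] += Ws
            counts[i][:r, :c] += 1
    for i, W in enumerate(new_weights):
        covered = counts[i] > 0
        W[covered] = sums[i][covered] / counts[i][covered]
    return new_weights

# Toy round: two dense layers, three clients with different speeds.
rng = np.random.default_rng(0)
global_weights = [rng.normal(size=(8, 8)), rng.normal(size=(8, 4))]
client_speeds = {"fast": 1.0, "medium": 0.6, "slow": 0.3}

updates = []
for name, speed in client_speeds.items():
    sub = extract_submodel(global_weights, width_for(speed))
    # Stand-in for local training: a small perturbation of the submodel.
    updates.append([W + 0.01 * rng.normal(size=W.shape) for W in sub])

global_weights = merge_submodels(global_weights, updates)
```

The sketch only illustrates how heterogeneous submodels can be cut from and merged back into a single global model; in practice the allocation would be re-estimated each round from observed client latencies, which is the dynamic aspect the abstract refers to.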
Date of Conference: 26-29 June 2024
Date Added to IEEE Xplore: 31 October 2024
Conference Location: Paris, France

I. Introduction

The popularization of mobile and IoT devices allows the collection of large amounts of data at the network’s edge, which enables the training of large Machine Learning (ML) models for a wide range of applications, such as next-word prediction and object detection. Traditionally, the edge devices send their local data to a powerful central server, which trains the model in a centralized manner. However, this mechanism raises several privacy concerns as the local data might be sensitive, and regulations such as the General Data Protection Regulation (GDPR) might restrict data collection from those devices [1], making centralized training infeasible.

