
Eliminating Communication Bottlenecks in Cross-Device Federated Learning with In-Network Processing at the Edge


Abstract:

Nowadays, cross-device federated learning (FL) is the key to delivering personalized services for mobile users and has been widely employed in production by companies like Google, Microsoft, and Alibaba. With the explosive increase in the number of participants, the central FL server, which acts as the manager and aggregator of cross-device model training, gets overloaded and becomes the system bottleneck. Inspired by the emerging wave of edge computing, an interesting question arises: could edge clouds help cross-device FL systems overcome this bottleneck? This article provides a cautiously optimistic answer by proposing INP, an FL-specific In-Network Processing framework, along with a novel Model Download Protocol (MDP) and Model Upload Protocol (MUP). With MDP and MUP, edge cloud nodes along the paths in INP can eliminate duplicated model downloads and pre-aggregate associated gradient uploads for the central FL server, thus alleviating its bottleneck effect and significantly accelerating the entire training process.
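To make the idea concrete, below is a minimal Python sketch of what an edge cloud node could do under MDP and MUP: serve the round's model from a local cache so the central server sees one download per edge node instead of one per device, and fold each device's gradient upload into a running partial aggregate that is flushed upstream once per round. The actual MDP/MUP wire formats and APIs are not given in this excerpt, so all names and structures here (EdgeAggregator, fetch_from_server, and the dict-of-numbers gradient representation) are illustrative assumptions.

```python
# Illustrative sketch only: MDP/MUP details are assumptions, not the
# paper's actual protocol. Models and gradients are modeled as dicts.

class EdgeAggregator:
    """Edge cloud node that deduplicates model downloads (MDP idea)
    and pre-aggregates gradient uploads (MUP idea) for the FL server."""

    def __init__(self, fetch_from_server):
        self._fetch = fetch_from_server  # callable: () -> model dict
        self._cached_model = None        # one server download serves all local EDs
        self._grad_sum = None            # running weighted sum of gradients
        self._num_samples = 0            # total samples, for weighted averaging

    def serve_download(self):
        # MDP: fetch the current global model from the server at most
        # once per round, then answer every local ED from the cache.
        if self._cached_model is None:
            self._cached_model = self._fetch()
        return self._cached_model

    def accept_upload(self, gradient, n_samples):
        # MUP: fold this ED's upload into the partial aggregate instead
        # of relaying it upstream; the server never sees individual uploads.
        if self._grad_sum is None:
            self._grad_sum = {k: g * n_samples for k, g in gradient.items()}
        else:
            for k, g in gradient.items():
                self._grad_sum[k] += g * n_samples
        self._num_samples += n_samples

    def flush_to_server(self):
        # End of round: emit one partial aggregate and reset, so server
        # traffic scales with edge nodes rather than with devices.
        partial = (self._grad_sum, self._num_samples)
        self._cached_model, self._grad_sum, self._num_samples = None, None, 0
        return partial
```

Under this assumed design, the central server's per-round traffic drops from one transfer per end device to one per edge node, which is the bottleneck relief the abstract claims.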
Date of Conference: 16-20 May 2022
Date Added to IEEE Xplore: 11 August 2022
Conference Location: Seoul, Korea, Republic of


I. Introduction

By sharing models rather than raw privacy-sensitive data, cross-device Federated Learning (FL), an emerging distributed machine learning approach that enables end devices like mobile phones to train models cooperatively, has been widely deployed in production for personalization services such as on-device item ranking, next-word prediction, content suggestions for on-device keyboards, and real-time e-commerce recommendations [1]–[5]. Typically, FL models are trained iteratively: in each round, a set of dynamically selected end devices (EDs) first download the current model from the central FL server (FLS) to launch on-device training, then upload their local gradients back to the FLS, which aggregates the received gradients to obtain the new model [1]. As the number of EDs grows, the central FLS inevitably becomes the bottleneck of the entire FL system. Optimizing the performance of FL systems, and in particular removing the bottleneck effect of the FLS, is therefore the key to supporting very large-scale federated learning tasks [3].
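To see why the FLS saturates, consider a minimal Python sketch of the generic round just described, using FedAvg-style weighted aggregation; local_train() and the per-device sample counts are illustrative assumptions, not the paper's code. The FLS performs one model download and one gradient upload per selected ED, so its per-round traffic grows linearly with the cohort size.

```python
# Minimal sketch of one generic cross-device FL round, as described
# above. local_train() and models-as-dicts are illustrative assumptions.

def fl_round(server_model, selected_devices, lr=1.0):
    grad_sum, total_samples = None, 0
    for device in selected_devices:
        model = dict(server_model)        # download: 1 model-sized transfer per ED
        gradient, n = device.local_train(model)  # on-device training
        # upload: 1 gradient-sized transfer per ED, aggregated at the FLS
        if grad_sum is None:
            grad_sum = {k: g * n for k, g in gradient.items()}
        else:
            for k, g in gradient.items():
                grad_sum[k] += g * n
        total_samples += n
    # Weighted-average gradient step yields the new global model; the FLS
    # handled 2 * len(selected_devices) model-sized transfers this round.
    return {k: w - lr * grad_sum[k] / total_samples
            for k, w in server_model.items()}
```

Because both the sum and the sample count are associative, this aggregation can be split across intermediaries, which is precisely what INP exploits at edge cloud nodes.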
