
Gradient Scheduling With Global Momentum for Asynchronous Federated Learning in Edge Environment



Abstract:

Federated Learning has attracted widespread attention in recent years because it allows massive edge nodes to collaboratively train machine learning models without sharing their private data sets. However, these edge nodes are usually heterogeneous in computational capability and statistically different in data distribution, i.e., non-independent and identically distributed (non-IID), leading to significant performance degradation. Although existing asynchronous training methods can solve the heterogeneity issue, they cannot prevent the non-IID problem from reducing the convergence rate. In this article, we propose a novel paradigm that schedules gradients with partial averaging and applies global momentum (GSGM) for asynchronous training over non-IID data sets in an edge environment. Our key idea is to apply global momentum and partial averaging to the biased gradients computed on edge nodes after scheduling, in order to keep the training process stable. Empirical results demonstrate that GSGM adapts well to different degrees of non-IID data and brings 20% gains in training stability for popular optimization algorithms, with enhanced accuracy on the Fashion-MNIST and CIFAR-10 data sets.
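
The update summarized above can be pictured as a server-side step that first partially averages a small buffer of asynchronously arriving client gradients and then applies a global momentum term. The Python sketch below is illustrative only and is not the authors' exact GSGM algorithm; the class name, the buffer size k, the momentum coefficient beta, and the learning rate lr are all assumptions made for exposition.

    import numpy as np

    class GlobalMomentumServer:
        """Illustrative asynchronous server: partial averaging plus a
        global momentum buffer (a sketch, not the paper's exact GSGM)."""

        def __init__(self, model_dim, k=4, lr=0.01, beta=0.9):
            self.weights = np.zeros(model_dim)   # shared global model
            self.momentum = np.zeros(model_dim)  # global momentum buffer
            self.buffer = []                     # gradients awaiting partial averaging
            self.k = k                           # gradients per partial average
            self.lr = lr
            self.beta = beta

        def receive_gradient(self, grad):
            """Called whenever an edge node finishes a local step, in arrival order."""
            self.buffer.append(grad)
            if len(self.buffer) >= self.k:
                # Partial averaging smooths the bias of individual non-IID gradients.
                avg_grad = np.mean(self.buffer, axis=0)
                self.buffer.clear()
                # Global momentum accumulates a stable update direction across rounds.
                self.momentum = self.beta * self.momentum + avg_grad
                self.weights -= self.lr * self.momentum
            return self.weights

Partial averaging damps the variance introduced by any single node's skewed data, while the momentum term keeps the global update direction stable across asynchronous arrivals.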
Published in: IEEE Internet of Things Journal (Volume: 9, Issue: 19, 01 October 2022)
Page(s): 18817 - 18828
Date of Publication: 25 March 2022


I. Introduction

In recent years, edge devices, such as mobile and IoT devices, have become the most popular tools for providing intelligent services due to their rapidly growing computing and communication capacity [1], [2]. Unlike traditional computing platforms, edge devices are typically located at the edge of the network and, at the same time, generate large volumes and diverse types of big data [3]–[5]. In the traditional cloud computing paradigm, big data are collected from end devices, then shuffled and evenly distributed to computing nodes in the cloud center. The training data on each node are therefore essentially a random sample of the whole data set, and thus independent and identically distributed (IID) [6], [7]. In the distributed computing paradigm, by contrast, users tend to keep the raw data local to protect their privacy while reducing the bandwidth consumed by transmitting data to a remote server [8]–[10]. Because of these advantages, Federated Learning has emerged as a popular way for future AI applications to train locally on end devices instead of training centrally [11]–[13]. Specifically, training on local data contributes to the optimization of a shared model that enables intelligent services [14], [15]. However, the data sets of different end users may not be IID; e.g., photographs taken at different locations may have entirely different characteristics, which causes difficulties for ML applications such as target detection. This non-IID issue is inevitable and must be accounted for by Federated Learning algorithms [16].
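
To make the non-IID issue concrete, federated learning experiments commonly simulate it with label-skewed partitioning: each node receives samples from only a few classes. The sketch below is illustrative and not taken from the paper; the function names and the shards_per_node parameter are assumptions chosen for exposition.

    import numpy as np

    def iid_partition(labels, num_nodes, seed=0):
        """IID baseline: shuffle all sample indices, then split them evenly."""
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(labels))
        return np.array_split(idx, num_nodes)

    def label_skew_partition(labels, num_nodes, shards_per_node=2):
        """Non-IID: sort indices by label, cut them into shards, and give each
        node a few shards, so every node sees only a small subset of classes."""
        idx = np.argsort(labels)
        shards = np.array_split(idx, num_nodes * shards_per_node)
        return [np.concatenate(shards[i::num_nodes]) for i in range(num_nodes)]

    labels = np.repeat(np.arange(10), 100)               # 10 classes, 100 samples each
    parts = label_skew_partition(labels, num_nodes=5)
    print([np.unique(labels[p]).tolist() for p in parts])  # each node sees only 2 classes

Under such a split, a gradient computed on any single node is biased toward that node's own classes, which is exactly the instability that gradient scheduling with global momentum is designed to counteract.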

