MergeSFL: Split Federated Learning with Feature Merging and Batch Size Regulation


Abstract:

Recently, federated learning (FL) has emerged as a popular technique for edge AI to mine valuable knowledge in edge computing (EC) systems. To boost the performance of AI applications, large-scale models have received increasing attention due to their excellent generalization abilities. However, training and transmitting large-scale models incurs a significant computing and communication burden on resource-constrained workers, and exchanging entire models may violate model privacy. To relieve the burden on workers and protect model privacy, split federated learning (SFL) has been proposed, integrating both data and model parallelism. Beyond resource limitations, SFL faces two other critical challenges in EC systems, i.e., statistical heterogeneity and system heterogeneity. To address these challenges, we propose a novel SFL framework, termed MergeSFL, which incorporates feature merging and batch size regulation into SFL. Concretely, feature merging combines the features from workers into a mixed feature sequence, which is approximately equivalent to the features derived from IID data and is employed to promote model accuracy, while batch size regulation assigns diverse and suitable batch sizes to heterogeneous workers to improve training efficiency. Moreover, MergeSFL jointly optimizes these two strategies upon their coupled relationship to further enhance the performance of SFL. Extensive experiments are conducted on a physical platform with 80 NVIDIA Jetson edge devices, and the results show that MergeSFL improves the final model accuracy by 5.82% to 26.22%, with a speedup of about 1.39x to 4.14x, compared to the baselines.
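To make the two strategies in the abstract concrete, the following is a minimal PyTorch-style sketch of one server-side step combining feature merging and batch size regulation, not the authors' exact algorithm. It assumes a split model where each worker holds a client-side sub-model up to the cut layer and the server holds the remainder; the names (merged_training_step, client_models, server_model, worker_batches) are illustrative, and the per-worker mini-batches are assumed to already have their regulated, possibly different sizes.

```python
import torch
import torch.nn.functional as F

def merged_training_step(client_models, server_model, worker_batches, optimizer):
    """One SFL step with feature merging: each worker pushes its local mini-batch
    (whose regulated size may differ across workers) through its client-side
    sub-model up to the cut layer; the server concatenates and shuffles all the
    received features into one mixed sequence that approximates an IID batch,
    then runs a forward/backward pass on the server-side sub-model."""
    feats, labels = [], []
    for model, (x, y) in zip(client_models, worker_batches):
        feats.append(model(x))                 # features ("smashed data") at the cut layer
        labels.append(y)
    merged_x = torch.cat(feats, dim=0)         # merge features from all workers
    merged_y = torch.cat(labels, dim=0)
    perm = torch.randperm(merged_x.size(0))    # shuffle to mix the non-IID sources
    merged_x, merged_y = merged_x[perm], merged_y[perm]

    optimizer.zero_grad()
    loss = F.cross_entropy(server_model(merged_x), merged_y)
    loss.backward()   # gradients flow back through the merged features to the client-side sub-models
    optimizer.step()
    return loss.item()
```

In this sketch the shuffled concatenation is what makes the server-side update look as if it were computed on a single IID batch, while the unequal per-worker batch sizes let slower workers contribute fewer samples per step.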
Date of Conference: 13-16 May 2024
Date Added to IEEE Xplore: 23 July 2024
Conference Location: Utrecht, Netherlands


I. Introduction

As an emerging and popular technique in edge AI, federated learning (FL) is proposed to train a globally shared model through collaboration among workers (e.g., IoT devices) in a data-parallel fashion [1]–[6]. Under the coordination of the parameter server (PS), participating workers periodically train deep learning (DL) models on their local datasets and then push the models to the PS for global aggregation without exposing their raw data. FL has been leveraged by Google to develop the Gboard application, improving user experience in a privacy-preserving manner [7]. To boost the performance of AI applications or services, it is usually practical and effective to scale up the parameters of DL models [8], [9]. However, training large-scale models is challenging for resource-constrained workers due to their CPU and memory limitations [10]–[12]. Additionally, transmitting large-scale models between workers and the PS incurs significant communication latency, and exchanging entire models may violate model privacy [13], [14].
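For reference, the data-parallel FL procedure described above can be sketched as a FedAvg-style round, where each worker trains a copy of the global model locally and the PS aggregates the returned parameters. This is a simplified illustration under assumed names (fedavg_round, worker_loaders) and assumes each worker's data is supplied as a list of (input, label) mini-batches and that all parameters are floating point.

```python
import copy
import torch
import torch.nn.functional as F

def fedavg_round(global_model, worker_loaders, lr=0.01, local_epochs=1):
    """One synchronous FL round: every worker trains a copy of the global model
    on its private data, then the PS averages the returned parameters (FedAvg)."""
    local_states, weights = [], []
    for batches in worker_loaders:               # each worker: a list of (x, y) mini-batches
        model = copy.deepcopy(global_model)      # local copy of the current global model
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(local_epochs):
            for x, y in batches:
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()
        local_states.append(model.state_dict())
        weights.append(sum(len(y) for _, y in batches))   # weight by local data size

    total = float(sum(weights))
    avg_state = {k: sum((w / total) * s[k].float()
                        for w, s in zip(weights, local_states))
                 for k in local_states[0]}
    global_model.load_state_dict(avg_state)      # PS installs the aggregated model
    return global_model
```

Note that the raw data never leaves a worker; only model parameters are exchanged, which is exactly where the communication and model-privacy costs discussed above arise for large-scale models.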
