A Novel Resource Management Framework for Blockchain-Based Federated Learning in IoT Networks


Abstract:

At present, centralized learning models used for IoT applications that generate large amounts of data face several challenges, such as bandwidth scarcity, high energy consumption, increased use of computing resources, poor connectivity, high computational complexity, reduced privacy, and large data-transfer latency. To address these challenges, Blockchain-Enabled Federated Learning Networks (BFLNs) have recently emerged, which exchange only trained model parameters rather than raw data. BFLNs provide enhanced security along with improved energy efficiency and Quality-of-Service (QoS). However, BFLNs suffer from an exponentially growing action space when deciding the various parameter levels for training and block generation. Motivated by these challenges, in this work we propose an actor-critic Reinforcement Learning (RL) method to model the Machine Learning Model Owner (MLMO) in selecting the optimal set of parameter levels, thereby addressing the exponential growth of the action space in BFLNs. Furthermore, owing to implicit entropy exploration, the actor-critic RL method balances the exploration-exploitation trade-off and outperforms most off-policy methods on large discrete action spaces. Therefore, in this work, considering a mobile-device scenario, the MLMO decides the data and energy levels that the mobile devices use for training and determines the block generation rate. This minimizes system latency and reduces overall cost while achieving the target accuracy. Specifically, we use Proximal Policy Optimization (PPO) as an on-policy actor-critic method with its two variants, one based on Monte Carlo (MC) returns and another based on the Generalized Advantage Estimate (GAE). Our analysis shows that PPO achieves better exploration and sample efficiency, shorter training time, and consistently higher cumulative rewards compared to the off-policy Deep Q-Network (DQN).
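For readers unfamiliar with the two PPO variants named above, the following minimal sketch (not taken from the paper; function names and hyperparameter values are illustrative assumptions) shows how Monte Carlo (MC) returns, GAE advantages, and PPO's clipped surrogate loss are typically computed.

```python
# Illustrative sketch of the advantage estimators used by the two PPO variants
# (MC returns vs. GAE) and of PPO's clipped surrogate loss. All names and
# hyperparameters (gamma, lam, eps) are assumptions, not the paper's settings.
import numpy as np

def mc_returns(rewards, gamma=0.99):
    """Discounted Monte Carlo returns: G_t = sum_k gamma^k * r_{t+k}."""
    returns = np.zeros(len(rewards), dtype=float)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """GAE: A_t = sum_k (gamma*lam)^k * delta_{t+k},
    with delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    values = np.append(values, 0.0)  # bootstrap terminal state value as 0
    advantages = np.zeros(len(rewards), dtype=float)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

def ppo_clip_loss(ratios, advantages, eps=0.2):
    """PPO clipped surrogate objective (negated for gradient descent),
    where ratios = pi_new(a|s) / pi_old(a|s)."""
    unclipped = ratios * advantages
    clipped = np.clip(ratios, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))
```

In the MC variant, the advantage is typically formed as the MC return minus the critic's value estimate, whereas the GAE variant trades bias against variance through the extra smoothing parameter lam.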
Published in: IEEE Transactions on Sustainable Computing (Volume: 9, Issue: 4, July-Aug. 2024)
Page(s): 648 - 660
Date of Publication: 26 January 2024


I. Introduction

Recent advancements in Artificial Intelligence (AI)-enabled Internet of Things (IoT) are primarily driven by the huge volumes of data generated by smart devices [1]. These devices are equipped with a large number of sensors that process real-time data. For model training, large amounts of data are gathered and delivered to a centralized server with powerful computing capacity. However, several challenges arise when sending raw data from distributed IoT devices to a central server. For instance, centralized learning using raw data suffers from privacy concerns, data latency issues, network scalability problems, a single point of failure, limited autonomy of individual devices, and increased cost.

