I. Introduction
Federated learning (FL) is a privacy-preserving machine learning paradigm that facilitates collaborative training without transferring raw data from participating devices to a server [1], [2], [3]. However, running FL on resource-constrained devices is challenging because the computationally expensive training of deep neural networks (DNNs) runs entirely on the devices, which is a known bottleneck [4], [5], [6].