I. Introduction
Federated Learning (FL) is a recent distributed Machine Learning (ML) paradigm in which clients collaboratively train an ML model without sharing their training data with a central server or with other participating clients. Practical applications of FL range from ‘cross-device’ scenarios, involving a very large number of unreliable clients that each hold a small number of samples, to ‘cross-silo’ scenarios involving fewer, more reliable clients that each hold more data [1]. FL has significant economic potential: cross-device tasks include mobile-keyboard next-word prediction [2], voice detection [3], and even proof-of-work for blockchain systems [4], while cross-silo tasks include hospitals jointly training healthcare models [5] and financial institutions building fraud detectors [6]. FL has been of particular interest for training large Deep Neural Networks (DNNs), owing to their state-of-the-art performance across a wide range of tasks.