I. Introduction
Many datasets are inherently decentralized in nature and are distributed across multiple devices owned by different users. Traditional machine learning settings involve aggregating data samples from these users into a central repository and training a machine learning model on it. This movement of data from local devices to a central repository poses two key challenges. Firstly, it compromises the privacy and security of the data. Policies such as the General Data Protection Regulation (GDPR) [1] and Health Insurance Portability and Accountability Act (HIPAA) [2] stipulate provisions that make such movement difficult. Secondly, it imposes communication overheads which, depending on the setting, may be prohibitively expensive.