1 Introduction
Owing to its remarkable performance, deep learning (DL) has enabled a wide range of applications, from image classification, object detection, and speech recognition to disease diagnosis and financial fraud detection [1], [2], [3]. A major factor in achieving high accuracy is leveraging big data to learn high-level features. The intuitive approach is to gather the data centrally and then train the DL model. However, data can often be highly private or sensitive; data collected from medical sensors [4] and microphones [5] are such cases. Consequently, users may resist sharing their data with service/cloud providers to build a DL model. In addition, the data aggregator must comply with data regulations such as the General Data Protection Regulation (GDPR) [6] and the California Privacy Rights Act (CPRA) [7]. On the other hand, centralized data might be mishandled or improperly managed by service providers, e.g., incidentally accessed by unauthorized parties [8], used for unsolicited analytics, or compromised through network and system security vulnerabilities, resulting in data breaches [9], [10]. Therefore, there is a demand for training DL models without aggregating or accessing the sensitive raw data that resides on the client side [11], [12], [13], [14], [15].