I. Introduction
Machine learning models, and in particular Neural Networks (NNs), are increasingly applied to nonlinear "cognitive" problems such as natural language processing and computer vision. These models learn from a dataset in the training phase and make predictions on new, previously unseen data in the inference/prediction/classification phase with ever-increasing accuracy. However, the compute- and power-intensive nature of NNs prevents their effective deployment in resource-constrained environments, such as mobile scenarios. Hardware acceleration on Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs) offers a roadmap for enabling NNs in these scenarios [1]–[4]. However, like general-purpose devices, hardware accelerators are also susceptible to faults, both permanent (hard) and transient (soft), arising from Single Event Upsets (SEUs), manufacturing defects, and below-safe-voltage operation [5], [6]. The ever-increasing rate of these faults in nano-scale technology nodes can directly impact the accuracy of NNs.