I. Introduction
Machine learning has become ubiquitous in fields such as image recognition, speech recognition, and machine translation [1], [2], [3]. Deep learning with neural networks has come to dominate machine learning because of its ease of training and strong generalization [4]. Neural networks have grown markedly in complexity in recent years [1], [2], [4], with the storage and compute requirements of deep learning outpacing hardware improvements [5], [6]. This growing cost makes it difficult to run deep neural networks on resource-constrained devices. Stochastic computing (SC) aims to improve the efficiency of deep learning [7], [8]. SC's compact computation units promise reduced computation area, improved data reuse, and lower memory-access costs [3]. Because SC requires only a single gate each for multiplication and addition, it enables a large number of multiply–accumulate units on the same hardware, which greatly improves operator reuse and alleviates the memory bottlenecks of deep learning accelerators [3].
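To make the single-gate claim concrete, the following is a minimal software sketch of unipolar stochastic computing (illustrative only; the function names and stream length are assumptions, not from this paper). A value in [0, 1] is encoded as a random bitstream whose fraction of 1s equals the value; a bitwise AND then multiplies two streams, and a 2:1 multiplexer driven by a p = 0.5 select stream performs scaled addition.

```python
import random

def to_stream(p, n, rng):
    # Unipolar SC encoding: each bit is 1 with probability p.
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    # Decode: the value is the fraction of 1s in the stream.
    return sum(bits) / len(bits)

def sc_multiply(a, b):
    # One AND gate per bit pair: P(a AND b) = P(a) * P(b)
    # for independent streams.
    return [x & y for x, y in zip(a, b)]

def sc_scaled_add(a, b, sel):
    # A 2:1 MUX with a p = 0.5 select stream computes (a + b) / 2.
    return [y if s else x for x, y, s in zip(a, b, sel)]

rng = random.Random(0)
n = 100_000
a = to_stream(0.6, n, rng)   # encodes 0.6
b = to_stream(0.5, n, rng)   # encodes 0.5
prod = from_stream(sc_multiply(a, b))                        # ~0.6 * 0.5 = 0.30
half_sum = from_stream(sc_scaled_add(a, b, to_stream(0.5, n, rng)))  # ~(0.6 + 0.5) / 2 = 0.55
```

The accuracy of the decoded result improves with stream length, which is the usual latency-versus-precision trade-off in SC designs.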