I. Introduction
In-memory computing (IMC) has been widely studied for its potential to break the memory wall [1]–[4]. IMC-based image processors have been widely applied to image recognition and retrieval [5]–[8]. However, as image resolution keeps increasing, massive amounts of large image data must be handled, which poses severe challenges for IMC hardware design: larger memory arrays with high stability and reliability are required, and the power consumption of the peripheral circuits can become significant [9]–[13]. As a result, image-processing tasks such as neural network training become slow and energy-intensive. A pre-processing unit capable of image compression and feature extraction can make image processing more efficient by greatly reducing the quantity of data to be processed, which leads to higher speed and lower energy consumption. So far, such data compression has normally been done in software at low speed, because it is difficult for some emerging memories to perform these operations with high accuracy, so that precision extension techniques become necessary [14]–[18]. Flash memory is a good candidate for this task because, owing to its mature manufacturing technology, flash memory cells and arrays have great potential for more precise matrix computation in large arrays [19]–[24].