I. Introduction
As a vital component of convolutional neural networks, an efficient hardware implementation of the convolution operation is of great importance [1]. This operation involves both addition and multiplication, and often demands extensive numerical processing for complex networks and applications. In the conventional von Neumann architecture, the memory-wall bottleneck leads to substantial power dissipation and latency. To address these challenges, the data-centric In-Memory-Computing (IMC) architecture has been proposed as a compelling solution, leveraging emerging nonvolatile memories [2]. Among these memory technologies, flash memory stands out as a promising candidate owing to its mature processing technology and cost-effectiveness [3]. Nevertheless, recognition accuracy can degrade significantly in the presence of noise during large-scale image-pixel processing. In addition, large array sizes inevitably lead to pronounced power consumption and reliability concerns [4].
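To make the addition-and-multiplication structure of convolution concrete, the following is a minimal illustrative sketch (not the paper's implementation); the function name `conv2d` and the use of NumPy are assumptions for demonstration. Each output pixel is computed as a sum of element-wise products, i.e., a sequence of multiply-accumulate operations, which is precisely the workload that IMC architectures aim to execute inside the memory array.

```python
import numpy as np

def conv2d(image, kernel):
    # Naive valid-mode 2D convolution (illustrative only):
    # every output element is one multiply-accumulate reduction
    # over a kernel-sized window of the input.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Element-wise multiply, then accumulate.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Small example: a 4x4 input with a 2x2 summing kernel.
img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((2, 2))
result = conv2d(img, k)  # 3x3 output
```

Counting the operations in the inner reduction (kh*kw multiplications and kh*kw-1 additions per output pixel) makes clear why large networks incur heavy data movement between memory and compute units under a von Neumann architecture.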