I. Introduction
Garbage collection (GC) is essential in flash memory storage such as solid-state drives (SSDs), owing to the erase-before-write nature of flash memory. GC blocks IO and thereby increases the latency of subsequent requests, which at the 99th percentile can reach more than 100 times the median latency [1]. Such latency fails to satisfy the quality-of-service (QoS) requirements of real-time and quality-critical systems, e.g., the service level agreements (SLAs) of enterprise server systems. Reducing long tail latency is therefore very important. In addition, as the flash memory architecture changes from a 2-D planar structure to a 3-D stacked structure, the block size (the number of pages per block) increases, which lengthens the blocking time caused by page copy operations in GC and further aggravates the long tail latency problem [2].