I. Introduction
In modern Commercial Off-The-Shelf (COTS) multicore systems, many memory requests can be sent to the main memory system in parallel at any given time, for two reasons. First, each core employs a variety of techniques (such as non-blocking caches, out-of-order issue, and speculative execution) to hide memory access latency. These techniques allow the core to continue executing new instructions while it waits for the memory requests of earlier instructions to complete. Second, multiple cores can run multiple threads, each of which generates its own memory requests.