I. Introduction
Chip multiprocessors (CMPs), which contain multiple CPUs on the same die, are expected to be the main components for building future computer systems [27]. This expectation rests on several observations: clock frequency scaling is approaching its limits, the design complexity and power consumption of complex single-CPU systems keep increasing, homogeneous multi-CPU designs are easier to verify and validate, and a CMP makes it possible to exploit both thread-level and instruction-level parallelism. In fact, several CMP-based architectures [17], [20], [25], [1], [14] have already found their way into the commercial market. In the long run, the number of cores in CMPs is expected to increase [15], [12].