I. Introduction
The current trend of increasing the number of cores in multicore compute clusters leads to significant growth in the length of the on-chip interconnects. In particular, the L1–L2 interconnect, which must meet low latency requirements, becomes very long because it has to span the large physical area occupied by the cores and the multiple L2 cache banks. Replacing the shared L2 cache with private L2 caches is not an efficient alternative, because private caches underutilize L2 memory when cores have different memory demands. 3-D integrated circuit (3-D IC) technology offers a solution to the wire length problem.