I. Introduction
For almost two decades, the gap between processor and memory speeds has been growing [1]. Efficient memory utilization is therefore increasingly important for performance-critical applications such as large-scale simulations or interactive games. To identify optimization opportunities, a developer has to understand how the application's computations interact with memory and map that knowledge back to the program flow and source code.