Abstract:
The rising computing demands of scientific endeavors often require the creation and management of High Performance Computing (HPC) systems for running experiments and processing vast amounts of data. These HPC systems generally operate at peak performance, consuming a large quantity of electricity, even though their workload varies over time. Understanding the behavioral patterns (i.e., phases) of HPC systems during their use is key to adjusting performance to resource demand and hence improving energy efficiency. In this paper, we describe (i) a method to detect phases of an HPC system based on its workload, and (ii) a partial phase recognition technique that works cooperatively with on-the-fly dynamic management. We implement a prototype that guides the use of energy-saving capabilities to demonstrate the benefits of our approach. Experimental results reveal the effectiveness of the phase detection method under real-life workloads and benchmarks. A comparison with baseline unmanaged execution shows that the partial phase recognition technique saves up to 15% of energy with less than 1% performance degradation.
Date of Conference: 17-19 December 2012
Date Added to IEEE Xplore: 17 January 2013