I. Introduction
Distributed systems aggregate the power of heterogeneous, geographically distributed computational resources that span multiple domains in order to provide high-performance or high-throughput computing [6]. Managing these resources raises many challenges, including determining the right subset of resources for a specific application and scheduling the job on it. Load balancing is a simple technique that aims to level the load across the distributed system. However, such an approach is insufficient for large-scale applications with user-defined objectives in terms of execution time, throughput, or latency. To address this limitation, numerous heuristics have been proposed [5], [11]. All of these algorithms, however, presuppose that a specific resource has already been chosen. Given the complexity of the objectives and of the resource and job parameters, we need an effective and efficient mechanism for resource characterization and selection. This task is a major aspect of the resource management problem, which deals with allocating and scheduling computing resources such that both user and provider objectives are met.