1. Introduction
With the rise of service-oriented business models (e.g., cloud computing), it has become increasingly important to ensure that Quality-of-Service (QoS) requirements are met throughout the period of service provision. One notable example is the service-level agreement (SLA) between a service provider and a client, which usually defines a set of performance metrics (PMs) and their thresholds in qualitative and quantitative terms. By structure, PMs can be categorized as atomic or composite [19]. Typical examples of atomic PMs are “link failure”, “delay” and “utilization”, which can be determined by direct run-time monitoring. One common form of PM composition is the aggregation of atomic PMs over a specific time interval, such as “average availability”, “maximum response time” and “top 10%”. A more complex form is behavioral aggregation, such as “failure rate in ten consecutive requests” and “expected time to reach maximum queue size”. Extensive research efforts have been dedicated to QoS management in the presence of uncertainty (e.g., [2], [23], [29]).
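To make the distinction between atomic and composite PMs concrete, the following minimal sketch computes three of the composite PMs named above from per-request atomic observations. The sample data, the function names, and the sliding-window reading of “failure rate in ten consecutive requests” are assumptions made for illustration only; they are not taken from any particular SLA framework.

```python
# Hypothetical atomic PM samples, one per request (illustrative data only).
up = [True, True, False, True, True, True, True, False, True, True]
response_ms = [12.0, 15.5, 250.0, 14.2, 13.9, 16.1, 12.7, 300.0, 15.0, 14.4]


def average_availability(samples):
    """Interval aggregation: fraction of samples in which the service was up."""
    return sum(samples) / len(samples)


def maximum_response_time(samples):
    """Interval aggregation: largest observed response time."""
    return max(samples)


def failure_rate_in_window(samples, window=10):
    """One possible reading of a behavioral PM: the failure fraction over
    every run of `window` consecutive requests (assumed sliding window)."""
    return [
        samples[i:i + window].count(False) / window
        for i in range(len(samples) - window + 1)
    ]


if __name__ == "__main__":
    print(average_availability(up))
    print(maximum_response_time(response_ms))
    print(failure_rate_in_window(up, window=10))
```

The atomic PMs here are the raw `up` and `response_ms` observations; each composite PM is a function of those observations, which is why it cannot be read off a single run-time measurement.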