1. Introduction
The increasing complexity of managing stored data and the economic benefits of consolidation are driving storage systems towards a service-oriented paradigm, in which personal and corporate clients purchase storage space and access bandwidth to store and retrieve their data. This paper addresses the performance and provisioning of server resources in storage data centers. In a typical setup, Service Level Agreements (SLAs) between the service provider and clients stipulate guarantees on throughput [1], [2] or latency [3], [4] for rate-controlled clients. The service provider must provision sufficient resources to meet these performance guarantees, based on estimates of the resource demands of the individual clients and the aggregate capacity requirements of the client mix admitted into the system. The run-time scheduler must isolate the individual clients from each other, so that they receive their reservations without interference from misbehaving clients with demand overruns, and must schedule their requests on the server appropriately [5].
A fundamental challenge in data center operations is the need to deal effectively with high-variance bursty workloads arising in network and storage server traffic [6], [7], [8]. These workloads are characterized by unpredictable bursty periods during which the instantaneous arrival rate can significantly exceed the long-term average rate. In the absence of explicit mechanisms to deal with these bursts, their effects are not confined to the localized regions where they occur, but spill over and affect otherwise well-behaved regions of the workload as well. As a consequence, although the bursty portion may be only a small fraction of the entire workload, it has a disproportionate effect on performance and provisioning decisions.
This “tail wagging the dog” situation forces the server to make unduly conservative estimates of resource requirements, which leads to excessive provisioning and energy consumption costs and unnecessarily limits the number of clients admitted into the system.