I. Introduction
Scalability has become a major advantage of cloud data centers, where application owners can dynamically scale the underlying computing resources as demand changes. Two properties make this especially relevant for web applications. First, many web applications are built on an n-tier architecture, in which each tier of the system can be easily scaled by adding or removing servers. Second, web applications naturally have bursty workloads, where the peak workload in rush hours can be 10X higher than the overall average [1], [2]; provisioning statically for the peak workload can waste a significant amount of computing resources and power. Therefore, scalability is extremely important for web applications to achieve high resource efficiency.