I. Introduction
Recent scalable streaming protocols for on-demand delivery of popular media content promise significant server and network bandwidth savings (e.g., [6], [10], [13], [14]). However, a key unresolved issue is how to design content delivery systems that use these protocols. This problem involves placing replicas of popular objects closer to some of the client sites so as to reduce content delivery cost. The key questions are how many replicas to create, where to place each replica, where to route client requests, and how to route the streams that the clients receive. The goal considered in this paper is to minimize total delivery cost, which in general includes both total network and total server delivery cost. Once the replicas are placed for minimum delivery cost, packet-loss recovery can be provided by using techniques such as those described in [4] or [18], while client latency can be minimized by storing a small prefix of each object closer to the clients.