1 Introduction
Despite the rapid development of network infrastructures, applications requiring quality-assured data transfers over the best-effort Internet continue to suffer from limited bandwidth, highly variable delay, and losses. In the literature, several studies have suggested that an application can increase its aggregate throughput, and reduce its end-to-end delay and losses, by distributing data over multiple paths [1], [2], [14], [16], [20], [36]. Nonetheless, efforts to support multiple paths at the IP layer or below, e.g., [13], [40], have so far not been attractive, since they require deploying a new, IP-incompatible network infrastructure. Recently, application-level overlay networks have been used to establish multipath-capable environments over the Internet [4], [5], [31]. The overlay network approach avoids the modifications to the existing network infrastructure that such lower-layer approaches require [13], [40]. However, since an overlay network is deployed on top of an underlay network, overlay traffic has to compete with underlay traffic for resources, and congestion can occur. As a result, a significant challenge in providing high-performance data transfers over multipath overlay networks is to infer the congestion states of overlay paths and to deal with fluctuations in overlay path performance.
This paper addresses this challenge using a path state monitoring mechanism that complies with the end-to-end principle and captures the congestion states of overlay paths. Based on the captured path states, a traffic control mechanism based on Markov Decision Processes (MDPs) simultaneously routes traffic over multiple overlay paths while optimizing QoS metrics such as end-to-end delay, jitter, and consecutive losses. This control mechanism can quickly adapt its transmission strategy to the dynamics of the underlying network. Consequently, it can avoid congestion by dynamically shifting traffic among multiple overlay paths.
This congestion-avoidance ability arises naturally from the adaptation, or “learning”, process; hence, no explicit congestion-avoidance mechanism is required. The proposed mechanism is designed to work above the transport layer, where decision timing is less critical, and it can be used in settings such as multimedia streaming and data distribution.