TCP Stalls at the Server Side: Measurement and Mitigation


Abstract:

TCP is an important factor affecting user-perceived performance of Internet applications. Diagnosing the causes behind TCP performance issues in the wild is essential for better understanding the current shortcomings in TCP. This paper presents a TCP flow performance analysis framework that classifies causes of TCP stalls. The framework forms the basis of a tool that we use to analyze packet-level traces of three services (cloud storage, software download, and web search) deployed by a popular service provider. We find that as many as 20% of the flows are stalled for half of their lifetime. Network-related causes, especially timeout retransmissions, dominate the stalls. A breakdown of the causes for timeout retransmission stalls reveals that double retransmission and tail retransmission are among the top contributors. The importance of these causes, however, depends on the specific service. Based on these observations, we propose smart-retransmission time out (S-RTO), a mechanism that mitigates timeout retransmission stalls through careful and gentle aggression for retransmission. S-RTO is evaluated in a controlled network and also in a production network. The results consistently show that it is effective at improving TCP performance, especially for short flows.
Published in: IEEE/ACM Transactions on Networking ( Volume: 27, Issue: 1, February 2019)
Page(s): 272 - 287
Date of Publication: 27 December 2018


I. Introduction

Today’s Internet users are increasingly concerned about the perceived performance (i.e., throughput and latency). Current popular applications rely heavily on TCP. Therefore, large service providers, e.g., Google and Amazon, are trying to improve TCP performance, especially at the server side where they have better control.
