1 Introduction
Today's data centers widely employ cluster computing frameworks (e.g., MapReduce [1], Dryad [2], CIEL [3], and Spark [4]) to meet ever-growing data processing and analysis demands. In these frameworks, a data-intensive job is divided into multiple successive data-parallel computation stages, and a succeeding stage cannot start until all of its required inputs, i.e., the outputs of the preceding stage, are in place. Recent studies [5], [6], [7] have shown that this intermediate data transmission is a non-negligible phase of job execution; for example, it accounts for 33 percent of the job running time in Facebook's system [5]. Accordingly, speeding up data transfers between computation stages accelerates job completion and improves resource utilization [5], [6], [7].
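To make the stage dependency concrete, the following is a minimal sketch of a two-stage job written against Spark's Python API; the input path and application name are hypothetical. The shuffle triggered by reduceByKey is precisely the intermediate data transfer discussed above: no reduce task can run until every map task has materialized its output.

```python
# Minimal two-stage PySpark job (hypothetical input path) illustrating
# the stage barrier: the reduce stage cannot begin until all map-side
# shuffle outputs are in place.
from pyspark import SparkContext

sc = SparkContext("local[*]", "stage-barrier-example")

counts = (
    sc.textFile("hdfs:///tmp/input.txt")       # stage 1: map side
      .flatMap(lambda line: line.split())      # tokenize each line
      .map(lambda word: (word, 1))             # emit (word, 1) pairs
      # reduceByKey forces a shuffle: all intermediate map outputs must
      # be transferred over the network before stage 2 can start.
      .reduceByKey(lambda a, b: a + b)         # stage 2: reduce side
)

print(counts.collect())
sc.stop()
```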