
Latency-Oriented Elastic Memory Management at Task-Granularity for Stateful Streaming Processing



Abstract:

In a streaming application, an operator is usually instantiated into multiple tasks for parallel processing. Tasks across operators have different memory demands due to different processing logic (e.g., stateful vs. stateless tasks). The memory demands of tasks from the same operator can also vary and fluctuate due to workload variability. Improper memory provisioning causes some tasks to exhibit relatively high latency, or even unbounded latency that eventually leads to system instability. We found that the task with the maximum latency of an operator has a significant, even decisive, impact on the end-to-end latency. In this paper, we present our task-level memory manager. Based on our quantitative modeling of memory and task-level latency, the manager adaptively allocates the optimal memory size to each task to minimize the end-to-end latency. We integrate our memory management into Apache Flink. Experiments show that our memory management significantly reduces end-to-end latency for various applications at different scales and configurations, compared with Flink's native setting.
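To make the allocation idea concrete, below is a minimal sketch of per-task memory allocation, assuming each task exposes an estimated latency-versus-memory curve. The function allocate_memory, the greedy step size, and the toy latency curves are illustrative assumptions, not the paper's actual model or algorithm.

# Hypothetical sketch: spread a memory budget across tasks so that the
# maximum per-task latency (which dominates end-to-end latency) shrinks.
# The latency curves below are assumed placeholders, NOT the paper's model.

from typing import Callable, Dict, List

def allocate_memory(
    tasks: List[str],
    latency_model: Dict[str, Callable[[int], float]],  # task -> latency(MB)
    budget_mb: int,
    step_mb: int = 64,
) -> Dict[str, int]:
    """Greedy allocation: repeatedly grant one step of memory to the task
    whose current estimated latency is highest, assuming latency is
    non-increasing in memory. Returns memory size in MB per task."""
    alloc = {t: step_mb for t in tasks}              # minimal allocation first
    remaining = budget_mb - step_mb * len(tasks)
    while remaining >= step_mb:
        # the task currently bounding the operator (and E2E) latency
        worst = max(tasks, key=lambda t: latency_model[t](alloc[t]))
        alloc[worst] += step_mb
        remaining -= step_mb
    return alloc

if __name__ == "__main__":
    # Toy curves: a stateful task benefits more from extra memory
    # (fewer state-backend cache misses) than a stateless one.
    models = {
        "stateful-0": lambda mb: 500.0 / mb + 2.0,
        "stateful-1": lambda mb: 300.0 / mb + 2.0,
        "stateless-0": lambda mb: 5.0,
    }
    print(allocate_memory(list(models), models, budget_mb=1024))

In this sketch the budget repeatedly goes to whichever task currently bounds latency, mirroring the observation that the slowest task of an operator largely determines end-to-end latency; the paper's quantitative memory-latency model would take the place of the toy curves.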
Date of Conference: 17-20 May 2023
Date Added to IEEE Xplore: 29 August 2023
Conference Location: New York City, NY, USA


I. Introduction

Stream Processing Engines (SPEs) [13], [26], [34], [37] have emerged to provide quick analysis for applications that require real-time responses and whose input data arrives in the form of streams, such as quantitative finance, network monitoring, and alert triggering. These applications typically require consistently short end-to-end (E2E) latency, the time from when a data item is generated to when its result is output. Short E2E latency brings a smooth user experience, guarantees on service-level agreements, and substantial profits in financial markets. For example, autonomous high-frequency algorithmic trading [36] must adjust to market fluctuations in a timely manner, or traders may miss trading opportunities or even incur losses.
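As a point of reference, the E2E latency above can be written per record. This is a minimal formalization under the assumption that each record carries its generation timestamp; the notation is not taken from the paper:

% assumption: per-record timestamps are available
L_{\mathrm{e2e}}(r) = t_{\mathrm{output}}(r) - t_{\mathrm{generate}}(r),
\qquad
L_{\mathrm{op}}(o) \approx \max_{i \in \mathrm{tasks}(o)} L_i

The second expression simply restates the paper's observation that an operator's contribution to E2E latency is governed by its slowest task.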

References
[1] Apache Spark. [Online]. Available: https://spark.apache.org.
[2] Apache Storm. [Online]. Available: http://storm.apache.org.
[3] Nexmark GitHub. [Online]. Available: https://github.com/nexmark/nexmark.
[4] RocksDB state store in Spark. [Online]. Available: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#rocksdb-state-store-implementation.
[5] Uppsala University Linear Road implementations. [Online]. Available: http://www.it.uu.se/research/group/udbl/lr.html.
[6] A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. S. Maskey, E. Ryvkina, et al., "Linear Road: a stream data management benchmark", Proceedings of the Thirtieth International Conference on Very Large Data Bases - Volume 30, pp. 480-491, 2004.
[7] A. Awad, J. Traub and S. Sakr, "Adaptive watermarks: A concept drift-based approach for predicting event-time progress in data streams", EDBT, pp. 622-625, 2019.
[8] E. Begoli, T. Akidau, F. Hueske, J. Hyde, K. Knight and K. Knowles, "One SQL to rule them all: an efficient and syntactically idiomatic approach to management of streams and tables", Proceedings of the 2019 International Conference on Management of Data, pp. 1757-1772, 2019.
[9] E. Boutin, J. Ekanayake, W. Lin, B. Shi, J. Zhou, Z. Qian, et al., "Apollo: Scalable and coordinated scheduling for cloud-scale computing", 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pp. 285-300, 2014.
[10] R. Bruno, D. Patricio, J. Simão, L. Veiga and P. Ferreira, "Runtime object lifetime profiler for latency sensitive big data applications", Proceedings of the Fourteenth EuroSys Conference 2019, pp. 1-16, 2019.
[11] B. Burns, B. Grant, D. Oppenheimer, E. Brewer and J. Wilkes, "Borg, Omega, and Kubernetes", Communications of the ACM, vol. 59, no. 5, pp. 50-57, 2016.
[12] P. Carbone, S. Ewen, G. Fóra, S. Haridi, S. Richter and K. Tzoumas, "State management in Apache Flink®: consistent stateful distributed stream processing", Proceedings of the VLDB Endowment, vol. 10, no. 12, pp. 1718-1729, 2017.
[13] P. Carbone, A. Katsifodimos, S. Ewen, V. Markl, S. Haridi and K. Tzoumas, "Apache Flink: Stream and batch processing in a single engine", Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, vol. 36, no. 4, 2015.
[14] K. M. Chandy and L. Lamport, "Distributed snapshots: Determining global states of distributed systems", ACM Transactions on Computer Systems (TOCS), vol. 3, no. 1, pp. 63-75, 1985.
[15] H. Che, Y. Tung and Z. Wang, "Hierarchical web caching systems: Modeling, design and experimental results", IEEE Journal on Selected Areas in Communications, vol. 20, no. 7, pp. 1305-1314, 2002.
[16] W. Chen, A. Pi, S. Wang and X. Zhou, "Pufferfish: Container-driven elastic memory management for data-intensive applications", Proceedings of the ACM Symposium on Cloud Computing, pp. 259-271, 2019.
[17] A. Dabirmoghaddam, M. M. Barijough and J. Garcia-Luna-Aceves, "Understanding optimal caching and opportunistic caching at “the edge” of information-centric networks", Proceedings of the 1st ACM Conference on Information-Centric Networking, pp. 47-56, 2014.
[18] A. Floratou, A. Agrawal, B. Graham, S. Rao and K. Ramasamy, "Dhalion: self-regulating stream processing in Heron", Proceedings of the VLDB Endowment, vol. 10, no. 12, pp. 1825-1836, 2017.
[19] C. Fricker, P. Robert and J. Roberts, "A versatile and accurate approximation for LRU cache performance", 2012 24th International Teletraffic Congress (ITC 24), pp. 1-8, 2012.
[20] T. Z. Fu, J. Ding, R. T. Ma, M. Winslett, Y. Yang and Z. Zhang, "DRS: dynamic resource scheduling for real-time analytics over fast streams", 2015 IEEE 35th International Conference on Distributed Computing Systems, pp. 411-420, 2015.
[21] B. Gedik, "Partitioning functions for stateful data parallelism in stream processing", The VLDB Journal, vol. 23, no. 4, pp. 517-539, 2014.
[22] A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker and I. Stoica, "Dominant resource fairness: Fair allocation of multiple resource types", 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI 11), 2011.
[23] B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, et al., "Mesos: A platform for fine-grained resource sharing in the data center", 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI 11), 2011.
[24] M. Hoffmann, A. Lattuada, F. McSherry, V. Kalavri, J. Liagouris and T. Roscoe, "Megaphone: Latency-conscious state migration for distributed streaming dataflows", Proceedings of the VLDB Endowment, vol. 12, no. 9, pp. 1002-1015, 2019.
[25] V. Kalavri, J. Liagouris, M. Hoffmann, D. Dimitrova, M. Forshaw and T. Roscoe, "Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows", 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pp. 783-798, 2018.
[26] S. Kulkarni, N. Bhagat, M. Fu, V. Kedigehalli, C. Kellogg, S. Mittal, et al., "Twitter Heron: Stream processing at scale", Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 239-250, 2015.
[27] L. Liu and H. Xu, "Elasecutor: Elastic executor scheduling in data analytics systems", Proceedings of the ACM Symposium on Cloud Computing, pp. 107-120, 2018.
[28] X. Liu and R. Buyya, "Resource management and scheduling in distributed stream processing systems: A taxonomy, review and future directions", ACM Computing Surveys (CSUR), vol. 53, no. 3, pp. 1-41, 2020.
[29] Y. Mao, Y. Huang, R. Tian, X. Wang and R. T. Ma, "Trisk: Task-centric data stream reconfiguration", Proceedings of the ACM Symposium on Cloud Computing, pp. 214-228, 2021.
[30] V. Martina, M. Garetto and E. Leonardi, "A unified approach to the performance analysis of caching systems", IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, pp. 2040-2048, 2014.
