Tangram: High-Resolution Video Analytics on Serverless Platform with SLO-Aware Batching | IEEE Conference Publication | IEEE Xplore


Abstract:

The cloud-edge collaborative computing paradigm is a promising solution for high-resolution video analytics systems. The key lies in reducing redundant data and managing fluctuating inference workloads effectively. Previous work has focused on extracting regions of interest (RoIs) from videos and transmitting them to the cloud for processing. However, a naive Infrastructure as a Service (IaaS) resource configuration falls short in handling highly fluctuating workloads, leading to violations of Service Level Objectives (SLOs) and inefficient resource utilization. Moreover, these methods neglect the potential benefits of batching RoIs to leverage parallel processing. In this work, we introduce Tangram, an efficient serverless cloud-edge video analytics system fully optimized for both communication and computation. Tangram adaptively aligns the RoIs into patches and transmits them to the scheduler in the cloud. The system employs a unique “stitching” method to batch patches of various sizes from the edge cameras. Additionally, we develop an online SLO-aware batching algorithm that judiciously determines the optimal invocation time of the serverless function. Experiments on our prototype show that Tangram can reduce bandwidth consumption and computation cost by up to 74.30% and 66.35%, respectively, while keeping SLO violations within 5% and accuracy loss negligible.
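The abstract does not spell out the SLO-aware batching algorithm, so the following is only a minimal sketch of the general idea of choosing when to invoke a serverless function: keep collecting patches while there is still slack before the earliest deadline, and invoke once the batch is full or waiting longer would risk an SLO violation. The names `Patch`, `estimated_latency`, and `should_invoke`, the linear latency model, and the `max_batch` limit are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patch:
    arrival: float   # seconds: when the patch reached the scheduler
    deadline: float  # seconds: arrival time plus the SLO budget


def estimated_latency(batch_size: int) -> float:
    """Hypothetical latency model: fixed overhead plus a per-patch cost."""
    return 0.05 + 0.01 * batch_size


def should_invoke(queue: List[Patch], now: float, max_batch: int = 8) -> bool:
    """Decide whether to invoke the function on the queued patches right now."""
    if not queue:
        return False
    if len(queue) >= max_batch:
        return True
    earliest = min(p.deadline for p in queue)
    # Time we can still afford to wait while leaving room to run a batch
    # that is one patch larger than the current queue.
    slack = earliest - now - estimated_latency(len(queue) + 1)
    return slack <= 0
```

A scheduler loop under these assumptions would call `should_invoke` whenever a patch arrives or a timer fires, and a real system would replace the linear model with latency profiled on the target platform.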
Date of Conference: 23-26 July 2024
Date Added to IEEE Xplore: 22 August 2024
Conference Location: Jersey City, NJ, USA

I. Introduction

High-resolution cameras are increasingly prevalent in various edge applications, such as surveillance [1], traffic monitoring [2], and augmented reality [3]. High-resolution video analytics based on advanced computer vision models has become a vibrant research topic in recent years [4]–[7].

References
1.
S. Wang, S. Yang and C. Zhao, "Surveiledge: Real-time video query based on collaborative cloud-edge deep learning", Proc. of IEEE INFOCOM, pp. 2519-2528, 2020.
2.
J. Li, L. Liu, H. Xu, S. Wu and C. J. Xue, "Cross-camera inference on the constrained edge", Proc. of IEEE INFOCOM, pp. 1-10, 2023.
3.
L. Liu, H. Li and M. Gruteser, "Edge assisted real-time object detection for mobile augmented reality", Proc. of ACM MobiCom, pp. 1-16, 2019.
4.
B. Zhang, X. Jin, S. Ratnasamy, J. Wawrzynek and E. A. Lee, "Awstream: Adaptive wide-area streaming analytics", Proc. of ACM SIGCOMM, pp. 236-252, 2018.
5.
Y. Wang, W. Wang, J. Zhang, J. Jiang and K. Chen, "Bridging the edge-cloud barrier for real-time advanced vision analytics", Proc. of USENIX HotCloud, pp. 1-7, 2019.
6.
H. Wang, Q. Li, H. Sun, Z. Chen, Y. Hao, J. Peng, et al., "Vabus: Edge-cloud real-time video analytics via background understanding and subtraction", IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 90-106, 2023.
7.
H. Li, D. Zhang, Y. Dai, N. Liu, L. Cheng, J. Li, et al., "Gp-nerf: Generalized perception nerf for context-aware 3d scene understanding", Proc. of IEEE/CVF CVPR, 2024.
8.
Youtube bit rates, [online] Available: https://support.google.com/youtube/answer/2853702?hl=en.
9.
Q. Zhang, K. Du, N. Agarwal, R. Netravali and J. Jiang, "Understanding the potential of server-driven edge video analytics", Proc. of ACM HotMobile, pp. 8-14, 2022.
10.
K. Du, A. Pervaiz, X. Yuan, A. Chowdhery, Q. Zhang, H. Hoffmann, et al., "Server-driven video streaming for deep learning inference", Proc. of ACM SIGCOMM, pp. 557-570, 2020.
11.
L. Zhang, Y. Zhang, X. Wu, F. Wang, L. Cui, Z. Wang, et al., "Batch adaptative streaming for video analytics", Proc. of IEEE INFOCOM, pp. 2158-2167, 2022.
12.
R. Xu, R. Kumar, P. Wang, P. Bai, G. Meghanath, S. Chaterji, et al., "Approxnet: Content and contention-aware video object classification system for embedded clients", ACM Transactions on Sensor Networks, vol. 18, no. 1, pp. 1-27, 2021.
13.
W. Zhang, Z. He, L. Liu, Z. Jia, Y. Liu, M. Gruteser, et al., "Elf: accelerate high-resolution mobile deep vision with content-aware parallel offloading", Proc. of ACM MobiCom, pp. 201-214, 2021.
14.
B. Chen, Z. Yan and K. Nahrstedt, "Context-aware image compression optimization for visual analytics offloading", Proc. of ACM MM, pp. 27-38, 2022.
15.
J. Yi, S. Choi and Y. Lee, "Eagleeye: Wearable camera-based person identification in crowded urban spaces", Proc. of ACM MobiCom, pp. 1-14, 2020.
16.
X. Ran, H. Chen, X. Zhu, Z. Liu and J. Chen, "Deepdecision: A mobile deep learning framework for edge video analytics", Proc. of IEEE INFOCOM, pp. 1421-1429, 2018.
17.
Y. Dong, G. Gao, R. Wang and Z. Yan, "Collaborative video analytics on distributed edges with multi agent deep reinforcement learning", arXiv preprint, 2022.
18.
M. Zhu, K. Han, E. Wu, Q. Zhang, Y. Nie, Z. Lan, et al., "Dynamic resolution network", Proc. of NeurIPS, pp. 27319-27330, 2021.
19.
A. Ali, R. Pinciroli, F. Yan and E. Smirni, "Optimizing inference serving on serverless platforms", Proceedings of the VLDB Endowment, vol. 15, no. 10, pp. 2071-2084, 2022.
20.
Y. Lu, S. Jiang, T. Cao and Y. Shu, "Turbo: Opportunistic enhancement for edge video analytics", Proc. of ACM SenSys, pp. 263-276, 2022.
21.
R. Ma, Y. Zhan, Y. Xia, C. Wu, L. Yang and R. Gao, "Sonnet: A control-theoretic approach for resource allocation in cluster management", vol. 153, pp. 169-181, 2024.
22.
S. Fouladi, R. S. Wahby, B. Shacklett, K. Balasubramaniam, W. Zeng, R. Bhalerao, et al., "Encoding fast and slow: Low-latency video processing using thousands of tiny threads", Proc. of USENIX NSDI, pp. 363-376, 2017.
23.
S. Jiang, Z. Lin, Y. Li, Y. Shu and Y. Liu, "Flexible high-resolution object detection on edge devices with tunable latency", Proc. of ACM MobiCom, pp. 559-572, 2021.
24.
X. Wang, X. Zhang, Y. Zhu, Y. Guo, X. Yuan, L. Xiang, et al., "Panda: A gigapixel-level human-centric video dataset", Proc. of IEEE/CVF CVPR, pp. 3268-3278, 2020.
25.
D. Crankshaw, X. Wang, G. Zhou, M. J. Franklin, J. E. Gonzalez and I. Stoica, "Clipper: A low-latency online prediction serving system", Proc. of USENIX NSDI, pp. 613-627, 2017.
26.
C. Zhang, M. Yu, W. Wang and F. Yan, "Enabling cost-effective slo-aware machine learning inference serving on public cloud", IEEE Transactions on Cloud Computing, vol. 10, pp. 1765-1779, 2020.
27.
C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking", Proc. of IEEE CVPR, vol. 2, pp. 246-252, 1999.
28.
Alibaba cloud function compute, [online] Available: https://www.alibabacloud.com/product/function-compute.
29.
Alibaba Cloud, Alibaba cloud function compute billing overview, [online] Available: https://www.alibabacloud.com/help/en/fc/product-overview/billing-overview.
30.
P. Yu, Y. Qiu, X. Jin and M. Chowdhury, "Orloj: Predictably serving unpredictable dnns", arXiv preprint, 2022.