
OS4C: An Open-Source SR-IOV System for SmartNIC-Based Cloud Platforms


Abstract:

Smart network interface cards (SmartNICs) are programmable network cards that enable the flexible offloading of network- and application-level functionality. The last several years have seen a significant rise in research related to smart, programmable NICs. Meanwhile, several open-source FPGA-based NIC and networking projects have emerged. However, these projects lack many key features necessary for strong performance in cloud settings. We identify Single Root Input/Output Virtualization (SR-IOV) as one of the key missing features in these open-source implementations. SR-IOV enables cloud vendors to grant cloud tenants direct access to hardware resources, dramatically reducing the software overheads of device virtualization. We present OS4C, which extends the popular open-source NIC Corundum [1] with support for SR-IOV. We demonstrate that OS4C can improve virtual machine P99.9 network tail latency by up to 17x, throughput by up to 4x, and CPU effort by up to 3.9x compared to software virtualization. On top of this system, we provide a novel weighted round-robin scheduler that enables tenants and providers to control weight distributions and overhaul the Corundum simulation framework to support multi-tenant tests and performance insights.
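To illustrate the general idea behind weighted round-robin scheduling mentioned in the abstract, the following is a minimal software sketch. It is not OS4C's hardware scheduler, whose design is not described on this page; the function name, the tenant labels `vf0` and `vf1`, and the packet names are hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import Deque, Dict, List


def weighted_round_robin(
    queues: Dict[str, Deque[str]], weights: Dict[str, int]
) -> List[str]:
    """Drain per-tenant queues, serving up to weights[t] items per tenant per round.

    Generic textbook weighted round-robin; an assumption-laden sketch,
    not the scheduler implemented in OS4C.
    """
    for tenant in queues:
        # A non-positive weight would starve a tenant and never drain its queue.
        if weights.get(tenant, 0) < 1:
            raise ValueError(f"tenant {tenant!r} needs a weight of at least 1")

    order: List[str] = []
    while any(queues.values()):  # an empty deque is falsy
        for tenant, queue in queues.items():
            # Serve at most `weight` items from this tenant, then move on.
            for _ in range(weights[tenant]):
                if not queue:
                    break
                order.append(queue.popleft())
    return order


if __name__ == "__main__":
    # Hypothetical tenants: vf0 is given twice the share of vf1.
    demo_queues = {
        "vf0": deque(["a0", "a1", "a2", "a3"]),
        "vf1": deque(["b0", "b1"]),
    }
    print(weighted_round_robin(demo_queues, {"vf0": 2, "vf1": 1}))
```

In this sketch, a tenant with weight 2 is served two queued items for every one item served to a tenant with weight 1, as long as both have work pending. The function consumes the queues it is given.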
Date of Conference: 07-13 July 2024
Date Added to IEEE Xplore: 28 August 2024
Conference Location: Shenzhen, China

I. Introduction

Over a decade ago, researchers noticed a growing gap between server CPU performance and network throughput demands. This drove them to explore how to make CPU networking tasks more efficient. For example, MegaPipe [2] proposed a novel network stack that reduced operating system overheads. This gap has not disappeared; it has become even more pronounced because link speeds have grown faster than single-core CPU performance [3]. Many researchers and companies have developed novel hardware and software features to improve network performance and narrow this widening gap [2], [4]–[7].

References
1.
A. Forencich, A. C. Snoeren, G. Porter and G. Papen, "Corundum: An open-source 100-gbps nic", 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 38-46, 2020.
2.
S. Han, S. Marshall, B.-G. Chun and S. Ratnasamy, "MegaPipe: A new programming interface for scalable network I/O" in 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12), Hollywood, CA:USENIX Association, pp. 135-148, Oct. 2012.
3.
Q. Cai, S. Chaudhary, M. Vuppalapati, J. Hwang and R. Agarwal, "Understanding host network stack overheads" in Proceedings of the 2021 ACM SIGCOMM 2021 Conference ser. SIGCOMM '21, New York, NY, USA:Association for Computing Machinery, pp. 65-77, 2021.
4.
T. Herbert and W. de Bruijn, Scaling in the linux networking stack, 2018, [online] Available: https://www.kernel.org/doc/Documentation/networking/scaling.txt.
5.
R. Shashidhara, T. Stamler, A. Kaufmann and S. Peter, "FlexTOE: Flexible TCP offload with Fine-Grained parallelism" in 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22), Renton, WA:USENIX Association, pp. 87-102, Apr. 2022.
6.
S. Peter, J. Li, I. Zhang, D. R. K. Ports, D. Woos, A. Krishnamurthy, et al., "Arrakis: The operating system is the control plane" in 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), Broomfield, CO:USENIX Association, pp. 1-16, Oct. 2014.
7.
DPDK Project, Programmer's Guide, 2022, [online] Available: http://doc.dpdk.org/guides/index.html.
8.
D. Firestone et al., "Azure accelerated networking: SmartNICs in the public cloud" in 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), Renton, WA:USENIX Association, pp. 51-66, Apr. 2018.
9.
R. Horner, How can smartnics move your data center forward?, July 2022, [online] Available: https://blogs.synopsys.com/from-silicon-to-software/2022/07/12/what-is-a-smartnic/.
10.
K. Srinivasan, The rise of smartnics, 2021, [online] Available: https://semiengineering.com/the-rise-of-smartnics/.
11.
NVIDIA Corporation, NVIDIA BlueField-2 Ethernet DPU User Guide, 2022, [online] Available: https://docs.nvidia.com/networking/display/BlueField2DPuenuG/NVIDIA+BlueField-2+Ethernet+DPU+User+Guide.
12.
AMD Xilinx Alveo SN1000 SmartNICs Data Sheet, Apr 2022, [online] Available: https://docs.xilinx.com/v/u/en-US/ds989-sn1000.
13.
J. W. Lockwood, N. McKeown, G. Watson, G. Gibb, P. Hartke, J. Naous, et al., "Netfpga-an open platform for gigabit-rate network switching and routing", 2007 IEEE International Conference on Microelectronic Systems Education (MSE'07), pp. 160-161, 2007.
14.
Opennic, 2022, [online] Available: https://github.com/Xilinx/open-nic/.
15.
S. Smith, Y. Ma, M. Lanz, B. Dai, M. Ohmacht, B. Sukhwani, et al., "OS4C: An open-source SR-IOV system for SmartNIC-based cloud platforms", 2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2024.
16.
AMD Xilinx AMD Alveo U280, 2023, [online] Available: https://www.xilinx.com/content/dam/xilinx/publications/product-briefs/alveo-u280-product-brief.pdf.
17.
cocotb, 2023, [online] Available: https://github.com/cocotb/cocotb.
18.
Z. Zhao, H. Sadok, N. Atre, J. C. Hoe, V. Sekar and J. Sherry, "Achieving 100gbps intrusion prevention on a single server" in 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), USENIX Association, pp. 1083-1100, Nov. 2020.
19.
N. Lazarev, S. Xiang, N. Adit, Z. Zhang and C. Delimitrou, "Dagger: Efficient and fast rpcs in cloud microservices with near-memory reconfigurable nics" in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems ser. ASPLOS '21, New York, NY, USA:Association for Computing Machinery, pp. 36-51, 2021.
20.
J. Zhang, H. Huang, L. Zhu, S. Ma, D. Rong, Y. Hou, et al., "Smartds: Middle-tier-centric smartnic enabling application-aware message split for disaggregated block storage" in Proceedings of the 50th Annual International Symposium on Computer Architecture ser. ISCA '23, New York, NY, USA:Association for Computing Machinery, 2023.
21.
R. Ma, E. Georganas, A. Heinecke, S. Gribok, A. Boutros and E. Nurvitadhi, "Fpga-based ai smart nics for scalable distributed ai training systems", IEEE Computer Architecture Letters, vol. 21, no. 2, pp. 49-52, 2022.
22.
Frequently Asked Questions, 2022, [online] Available: https://github.com/Xilinx/open-nic/blob/main/FAQ.md.
23.
E. P. Martin, Deep dive into virtio-networking and vhost-net, Sep 2019, [online] Available: https://www.redhat.com/en/blog/deep-dive-virtio-networking-and-vhost-net.
24.
J. Liu, "Evaluating standard-based self-virtualizing devices: A performance study on 10 gbe nics with sr-iov support", 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp. 1-12, 2010.
25.
Z. Huang, R. Ma, J. Li, Z. Chang and H. Guan, "Adaptive and scalable optimizations for high performance sr-iov", 2012 IEEE International Conference on Cluster Computing, pp. 459-467, 2012.
26.
Y. Dong, X. Yang, J. Li, G. Liao, K. Tian and H. Guan, "High performance network virtualization with sr-iov", Journal of Parallel and Distributed Computing, vol. 72, no. 11, pp. 1471-1480, 2012.
27.
Intel Corporation Intel 64 and IA-32 Architectures Software Developer's Manual, 2016.
28.
Arm AMBA AXI and ACE Protocol Specification, 2013, [online] Available: https://developer.arm.com/documentation/ihi0022/e.
29.
A. Forencich, Axi interface modules for cocotb, 2023, [online] Available: https://github.com/alexforencich/cocotbext-axi.
30.
A. Forencich, Pci express simulation framework for cocotb, 2023, [online] Available: https://github.com/alexforencich/cocotbext-pcie.