In-Depth Evaluation of a Lower-Level Direct-Verbs API on InfiniBand-based Clusters: Early Experiences


Abstract:

Many High-Performance Computing (HPC) clusters around the world use some variation of InfiniBand interconnects, all of which are powered by the “Verbs” API. Verbs supply a quick, efficient, and developer-friendly method of passing data buffers between nodes through their interconnect(s). In recent years, the MLX5-DV (Direct Verbs) API has emerged as a way to access and expose low-level structures and buffers to the developer. In principle, MLX5-DV is meant to improve performance over Verbs by removing intermediate layers of software abstraction. In this paper, we examine what this means for potential performance improvement and how MLX5-DV compares to its higher-level counterpart. In addition, we offer insights on how application developers and MPI programmers can use this to their advantage, based on initial experiences with benchmark and application-level results.
Date of Conference: 15-19 May 2023
Date Added to IEEE Xplore: 04 August 2023
Conference Location: St. Petersburg, FL, USA

I. Introduction

The InfiniBand Verbs (“Verbs”, “IB-Verbs”) API [5] has been an integral component of HPC clusters for over twenty years, owing to its portability and ease of use. It is a well-known library of functions and data structures that serves as the backbone of several libraries in the HPC community. It is used primarily for inter-node data transfers through send- and receive-based operations, and it supports a wide range of low-level data structures that control how these messages are processed at the network level.
