Efficient Personalized and Non-Personalized Alltoall Communication for Modern Multi-HCA GPU-Based Clusters


Abstract:

Graphics Processing Units (GPUs) have become ubiquitous in today's supercomputing clusters, primarily because of their high compute capability and power efficiency. The Message Passing Interface (MPI) is a widely adopted programming model for the large-scale GPU-based applications run on such clusters. Modern GPU-based systems are equipped with multiple Host Channel Adapters (HCAs). Previously, researchers have leveraged multi-HCA systems to accelerate inter-node transfers between CPUs using point-to-point primitives. In this work, we show the need for collective-level, multi-rail-aware algorithms, using MPI_Allgather as an example. We then propose an efficient multi-rail MPI_Allgather algorithm and extend it to MPI_Alltoall. We analyze the performance of these algorithms using the OSU Micro-Benchmarks (OMB) suite, and demonstrate approximately 30% and 43% improvement in non-personalized and personalized communication benchmarks, respectively, compared with state-of-the-art MPI libraries on 128 GPUs.
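The core multi-rail idea the abstract alludes to is striping a single large transfer across the available HCAs so each rail carries only a fraction of the bytes. The following toy Python sketch (not the paper's implementation; `stripe` and `reassemble` are hypothetical helper names) illustrates the splitting and reassembly step under the assumption of contiguous, evenly sized chunks:

```python
# Toy model of multi-rail message striping: a message destined for one
# peer is split into contiguous chunks, one per rail (HCA), then
# reassembled at the receiver. Hypothetical helpers for illustration
# only -- the paper's actual algorithm operates inside MPI collectives.

def stripe(message: bytes, num_rails: int) -> list:
    """Split a message into num_rails contiguous chunks (last may be short)."""
    chunk = -(-len(message) // num_rails)  # ceiling division
    return [message[i * chunk:(i + 1) * chunk] for i in range(num_rails)]

def reassemble(chunks: list) -> bytes:
    """Concatenate chunks back into the original message."""
    return b"".join(chunks)

msg = bytes(range(16))
rails = stripe(msg, 2)
assert reassemble(rails) == msg
# With two rails, each HCA carries roughly half the bytes, so the
# per-rail volume -- and, ideally, the transfer time -- is halved.
```

A collective-level multi-rail algorithm must additionally decide, per step of the collective, which rail each chunk uses, which is what distinguishes it from simply striping independent point-to-point transfers.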
Date of Conference: 18-21 December 2022
Date Added to IEEE Xplore: 26 April 2023
Conference Location: Bengaluru, India

I. Introduction

Graphics Processing Units (GPUs) are among the accelerators gaining prominence in modern supercomputing systems. This trend is evident from the fact that eight of the top ten systems on the Top500 [11] list were powered by GPUs at the time this paper was written. These accelerators enable supercomputers to run massively parallel application workloads from domains such as scientific computing and Deep Learning.

