Scalable MPI Collectives using SHARP: Large Scale Performance Evaluation on the TACC Frontera System


Abstract:

The Message-Passing Interface (MPI) is the de-facto standard for designing and executing applications on massively parallel hardware. MPI collectives provide a convenient abstraction for multiple processes/threads to communicate with one another. Mellanox's HDR InfiniBand switches provide Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) capabilities to offload collective communication to the network and reduce CPU involvement in the process. In this paper, we design and implement SHARP-based solutions for MPI_Reduce and MPI_Barrier in MVAPICH2-X. We evaluate the impact that proposed and existing SHARP-based solutions for the MPI_Allreduce, MPI_Reduce, and MPI_Barrier operations have on the performance of these collectives on the 8th-ranked TACC Frontera HPC system. Our experimental evaluation of the SHARP-based designs shows up to a 5.4X reduction in latency for Reduce, 5.1X for Allreduce, and 7.1X for Barrier at the full system scale of 7,861 nodes over a host-based solution.
Date of Conference: 13 November 2020
Date Added to IEEE Xplore: 04 January 2021
Conference Location: Atlanta, GA, USA

I. Introduction

Supercomputing systems have grown in size and scale over the last decade. Two key drivers fueling this growth are the current trends in multi-/many-core architectures and the availability of commodity, RDMA-enabled, high-performance interconnects such as InfiniBand (IB) [1]. Such HPC systems allow scientists and engineers to tackle grand challenges in various scientific domains. Users of HPC systems rely on parallel programming models to parallelize their applications and obtain performance improvements.
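To make the collectives under study concrete, the sketch below is a minimal MPI micro-benchmark in C that times the three operations evaluated in this paper: MPI_Barrier, MPI_Reduce, and MPI_Allreduce. It is illustrative only and does not itself implement the SHARP offload; with MVAPICH2-X, the offload is typically selected at run time (for example, via the MV2_ENABLE_SHARP environment variable described in the MVAPICH2 user guide), which is an assumption about deployment rather than a detail taken from this paper.

/*
 * Illustrative sketch: times one invocation of each collective studied in the
 * paper. Latency numbers are per-call and unaveraged; a real benchmark would
 * iterate and report minimum/average latency across many calls.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local = (double)rank, global = 0.0;

    /* Barrier: pure synchronization, no data movement. */
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    MPI_Barrier(MPI_COMM_WORLD);
    double t_barrier = MPI_Wtime() - t0;

    /* Reduce: every rank contributes; only the root receives the result. */
    t0 = MPI_Wtime();
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    double t_reduce = MPI_Wtime() - t0;

    /* Allreduce: every rank contributes and every rank receives the result. */
    t0 = MPI_Wtime();
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double t_allreduce = MPI_Wtime() - t0;

    if (rank == 0)
        printf("ranks=%d barrier=%.3f us reduce=%.3f us allreduce=%.3f us\n",
               size, t_barrier * 1e6, t_reduce * 1e6, t_allreduce * 1e6);

    MPI_Finalize();
    return 0;
}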
