Abstraction Layer For Standardizing APIs of Task-Based Engines


Abstract:

We introduce AL4SAN, a lightweight library for abstracting the APIs of task-based runtime engines. AL4SAN unifies the expression of tasks and their data dependencies. It supports various dynamic runtime systems relying on compiler technology and user-defined APIs. It enables a single application to employ different runtimes and their respective scheduling components, while keeping the user oblivious to the underlying hardware configuration. AL4SAN exposes common front-end APIs and connects to different back-end runtimes. We report performance and overhead assessments on various shared- and distributed-memory systems, possibly equipped with hardware accelerators. A range of workloads, from compute-bound to memory-bound regimes, serve as proxies for current scientific applications. The low overhead (less than 10 percent) observed across these workloads makes AL4SAN suitable for fast development of task-based numerical algorithms. More interestingly, AL4SAN enables runtime interoperability by switching runtimes at runtime. Blending runtime systems yields a twofold speedup on a task-based generalized symmetric eigenvalue solver, relative to state-of-the-art implementations. The ultimate goal of AL4SAN is not to create a new runtime, but to strengthen the co-design of existing runtimes and applications, while facilitating user productivity and code portability. The code of AL4SAN is freely available at https://github.com/ecrc/al4san, with extensions in progress.
Published in: IEEE Transactions on Parallel and Distributed Systems (Volume: 31, Issue: 11, 01 November 2020)
Page(s): 2482 - 2495
Date of Publication: 07 May 2020


1 Introduction

Task-based programming models have become ubiquitous in scientific computing. Over the last decade, they have demonstrated that they can deliver performance from the bottom of the software stack, with numerical libraries [1], [2], [3], [4], [5], [6], [7], all the way up to computational simulations and applications [8], [9], [10], [11]. Thanks to their fine-grained computations, task-based numerical libraries and applications have proven their ability to reduce idle time due to load imbalance, to hide data movement behind computation, and to weaken artificial synchronization points between processing units. Although these libraries and applications can mitigate many overheads, they rely on dynamic engines, or runtime systems, to abstract the underlying hardware complexity from end-users. These runtime systems efficiently marshal task data dependencies and schedule the corresponding computational kernels on the available hardware resources. A myriad of dynamic runtime systems exists to support task-based programming models on shared- and distributed-memory systems, possibly equipped with hardware accelerators [12]. The lack of API standardization makes it cumbersome for developers of task-based applications and libraries to exploit different runtimes and their respective features: porting code to a specific task-based engine, in order to execute on a given hardware system, requires changes to the original code.
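
To make the idea of a common front-end over interchangeable back-ends concrete, the following is a minimal sketch in Python. It is not AL4SAN's API: every name in it (TaskContext, insert_task, switch_backend, Access, SequentialBackend, ThreadPoolBackend) is hypothetical and chosen only for illustration. The sketch assumes that a task declares each operand with a read or write access mode, that the front-end derives dependencies from those declarations, and that a back-end only decides how tasks are executed.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum


class Access(Enum):
    """Hypothetical access modes a task declares for each operand."""
    READ = "read"
    WRITE = "write"


class SequentialBackend:
    """Hypothetical back-end: runs every task immediately, in submission order."""

    def submit(self, func, args, deps):
        # Submission order already satisfies all dependencies, so deps is unused.
        func(*args)
        return None

    def wait_all(self):
        pass


class ThreadPoolBackend:
    """Hypothetical back-end: runs tasks on a thread pool, honoring dependencies."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._futures = []

    def submit(self, func, args, deps):
        def run():
            # Block until every task this one depends on has completed.
            for dep in deps:
                dep.result()
            func(*args)

        future = self._pool.submit(run)
        self._futures.append(future)
        return future

    def wait_all(self):
        for future in self._futures:
            future.result()
        self._futures.clear()


class TaskContext:
    """Hypothetical front-end: one task-insertion call for any back-end."""

    def __init__(self, backend):
        self._backend = backend
        self._last_writer = {}  # id(data) -> handle of the last writing task
        self._readers = {}      # id(data) -> handles of readers since that write

    def insert_task(self, func, *operands):
        """Insert a task; each operand is a (data, Access) pair."""
        deps = []
        for data, mode in operands:
            key = id(data)
            writer = self._last_writer.get(key)
            if writer is not None:
                deps.append(writer)                       # wait for last writer
            if mode is Access.WRITE:
                deps.extend(self._readers.get(key, []))   # wait for readers too
        handle = self._backend.submit(func, [d for d, _ in operands], deps)
        for data, mode in operands:
            key = id(data)
            if mode is Access.WRITE:
                self._last_writer[key] = handle
                self._readers[key] = []
            else:
                self._readers.setdefault(key, []).append(handle)

    def wait_all(self):
        self._backend.wait_all()
        self._last_writer.clear()
        self._readers.clear()

    def switch_backend(self, backend):
        # Drain pending tasks, then route later tasks to the new back-end.
        self.wait_all()
        self._backend = backend


def scale(x):
    x[0] *= 2.0


def accumulate(x, y):
    y[0] += x[0]


def demo(first_backend, second_backend):
    ctx = TaskContext(first_backend)
    a, b = [1.0], [10.0]
    ctx.insert_task(scale, (a, Access.WRITE))
    ctx.insert_task(accumulate, (a, Access.READ), (b, Access.WRITE))
    ctx.switch_backend(second_backend)
    ctx.insert_task(scale, (b, Access.WRITE))
    ctx.wait_all()
    return a[0], b[0]
```

Calling demo(SequentialBackend(), ThreadPoolBackend()) and demo(ThreadPoolBackend(), SequentialBackend()) runs the same task code in both cases; only the object handed to the front-end changes. Under the assumptions above, this is the portability property that a standardized task API aims for, and draining pending tasks before switch_backend is one simple way to change engines in the middle of a run.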
