Communication-Aware Parallelization Strategies for High Performance Applications

Abstract:

With the advent of multicore processor architectures and the existence of a huge legacy code base, the need for efficient and scalable parallelizing compilers is growing. While multicore processors were seen as the way forward to address known challenges such as the memory, power, and ILP walls, efficient parallelization that makes use of the multiple cores is still an open issue. In this paper, we present two complementary tools, MCROF and XPU, which provide an alternative development path for parallelizing applications and which address the challenges of identifying potential parallelism and exploiting it in a different way. The MCROF tool provides a detailed profile of the data flowing inside an application, and the XPU programming paradigm provides an intuitive and simple interface to express parallelism as well as the necessary runtime support. We demonstrate through two different use cases that up to 4× better performance can be achieved than with available commercial compilers.

Published in: 2015 IEEE Computer Society Annual Symposium on VLSI

Date of Conference: 08-10 July 2015

Date Added to IEEE Xplore: 29 October 2015

DOI: 10.1109/ISVLSI.2015.89

Conference Location: Montpellier, France

I. Introduction

The number of transistors per chip is growing due to technology scaling, while increasing the clock rate of processors is becoming technologically less viable [1]. The current trend is therefore to integrate a growing number of processing cores on a chip, forcing parallelizing compilers to mature rapidly and to generate efficient code for multicore processors. Most parallelizing compilers focus on loop parallelization, as most of the execution time is spent in loops. However, scalable parallelism is in many cases not realizable because memory accesses and interprocessor communication are the bottlenecks. Recent research makes it clear that memory accesses and data transfers account for the majority of the power consumption [2] [3] and thus need to be addressed and handled more explicitly in order to achieve (power-)efficient performance. This paper presents a tool chain that parallelizes an application based on the data flowing inside it, and shows how this information helps in (manually) mapping the algorithm onto the architecture using intuitive parallel constructs. We present one use case, Canny Edge Detection, in detail, as well as the performance numbers for a second application, fluidanimate.
