Configurable Graphic Processing Unit Oriented OpenCL Kernel Performance Prediction Based on LSTM and Transformer | IEEE Conference Publication | IEEE Xplore

Abstract:

With the rapid development of big data and artificial intelligence, demand for high-performance computing is growing. Heterogeneous computing is an efficient and cost-effective approach that combines the central processing unit, graphics processing unit (GPU), digital signal processor, and other types of processing elements to reduce power consumption and increase computing capacity. OpenCL is a common standard for programming heterogeneous devices. OpenCL code can run on different hardware, but the parameter combination of an OpenCL kernel must be retuned to achieve the best performance on each OpenCL device. However, the kernel parameters yield thousands of possible combinations, and testing all of them costs a great deal of time and resources. To address this problem, a model that integrates a bidirectional long short-term memory (LSTM) network with a transformer using a multi-head attention mechanism is proposed to predict the performance of an OpenCL kernel under various parameter combinations. To evaluate the feasibility and effectiveness of the LSTM-Transformer deep learning model, performance predictions for a parameterizable single-precision general matrix multiply GPU kernel with 241600 parameter combinations are implemented and verified. The sequence forecasting results of the hybrid deep learning model are compared with those of classical machine learning models; the LSTM-Transformer model converges faster and achieves higher accuracy than the other machine learning models.
Date of Conference: 15-17 August 2022
Date Added to IEEE Xplore: 14 February 2023
Conference Location: Hefei, China

1 Introduction

In recent decades, the pursuit of higher computing performance has driven computer processors from single-core to multi-core designs and has given rise to the graphics processor (GPU), the digital signal processor (DSP), the stream processor, and other parallel processors. As a result, heterogeneous computing has become extremely important, and the Open Computing Language (OpenCL), an open standard for cross-platform parallel programming, was proposed to implement it. An OpenCL program lets the processors cooperate with one another so that each contributes its computing power [1].
