An Effective Approach for Implementing Sparse Matrix-Vector Multiplication on Graphics Processing Units

Abstract:

Sparse matrix vector multiplication, SpMV, is often a performance bottleneck in iterative solvers. Recently, Graphics Processing Units, GPUs, have been deployed to enhance the performance of this operation. We present a blocked version of the Transposed Jagged Diagonal storage format which is tailored for GPUs, BTJAD. We develop a highly optimized SpMV kernel that takes advantage of the properties of the BTJAD storage format and reuses loaded values of the source vector in the registers of a GPU. Using 62 matrices with different sparsity patterns and executing on an NVIDIA Tesla T10 GPU, we compare the performance of our kernel with that of the SpMV kernels in NVIDIA's library. Our kernel achieves superior execution throughputs for matrices that are non-uniform in their nonzero row lengths, outperforming the best available kernels by up to 4.67x. When executing on the Fermi class GeForce GTX480 GPU which has a larger register file size, the maximum speedup achieved by our kernel improves to 6.6x.

Published in: 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems

Date of Conference: 25-27 June 2012

Date Added to IEEE Xplore: 18 October 2012

DOI: 10.1109/HPCC.2012.68

Conference Location: Liverpool, UK

I. Introduction

Sparse matrix computations arise in numerous engineering and science applications. One of the most common kernels is sparse matrix-vector multiplication (SpMV). This operation is one of the most fundamental performance bottlenecks in solving sparse linear systems and eigenvalue problems. In SpMV, the operation $y = Ax$ is performed, where $A$ is a sparse matrix and $x$ and $y$ are dense vectors. Vectors $x$ and $y$ provide the only opportunities for data reuse, since each element of $A$ is used only once. Sparse matrices use special data structures that store only the nonzero elements and hence eliminate unnecessary storage and computations [1] [2]. However, these structures introduce indirect and irregular memory accesses which, when combined with the lack of data reuse, lead to low computational intensity, i.e., a low number of arithmetic operations per memory reference [3].
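To make the indirect access pattern concrete, the following is a minimal sketch of SpMV, commonly written y = Ax, over a matrix held in the widely used compressed sparse row (CSR) layout. CSR is assumed here purely for illustration; it is not the BTJAD format proposed in this paper, and the names `csr_spmv`, `row_ptr`, `col_idx`, and `values` are hypothetical.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Return y = A x for a sparse matrix A stored in CSR form (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # values[k] is read exactly once; x is read indirectly through col_idx[k].
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix (only the five nonzeros are stored):
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```

Each nonzero contributes one multiply and one add, but requires loading a value, a column index, and an indirectly addressed element of x. That ratio is the low computational intensity described above, and it is why reusing elements of x matters for performance.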
