Achieving Scalability in a k-NN Multi-GPU Network Service with Centaur

Abstract:

Centaur is a GPU-centric architecture for building a low-latency approximate k-Nearest-Neighbors network server. We implement a multi-GPU distributed data flow runtime that enables efficient and scalable network request processing on GPUs. The runtime eliminates GPU management overheads from the CPU, making the server throughput and response time largely agnostic to the CPU load, speed, or number of dedicated CPU cores. Our experiments show that the server achieves near-perfect scaling for 16 GPUs, beating the throughput of a highly optimized CPU-driven server by 35% while maintaining an average request latency of about 2 ms. Furthermore, it requires only a single CPU core to run, achieving over an order of magnitude higher throughput than the standard CPU-driven server architecture in this setting.

Published in: 2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT)

Date of Conference: 23-26 September 2019

Date Added to IEEE Xplore: 07 November 2019

DOI: 10.1109/PACT.2019.00027

Conference Location: Seattle, WA, USA

I. Introduction

High-concurrency, memory-demanding server applications are ubiquitous in high-performance computing systems and data centers [12]. They impose three distinct requirements on developers: low, strictly bounded response time for client requests; high throughput for server efficiency; and enough physical memory to keep the data set resident so that these performance goals can be met. Fulfilling all of these requirements together is a significant challenge.
