
Runtime Performance Prediction for Deep Learning Models with Graph Neural Network


Abstract:

Deep learning models have been widely adopted in many application domains. Predicting the runtime performance of deep learning models, such as GPU memory consumption and training time, is important for boosting development productivity and reducing resource waste. The reason is that improper configurations of hyperparameters and neural architectures can result in many failed training jobs or unsatisfactory models. However, the runtime performance prediction of deep learning models is challenging because of the hybrid programming paradigm, complicated hidden factors within the framework runtime, enormous model configuration space, and broad differences among models. In this paper, we propose DNNPerf, a novel ML-based tool for predicting the runtime performance of deep learning models using Graph Neural Network. DNNPerf represents a model as a directed acyclic computation graph and incorporates a rich set of performance-related features based on the computational semantics of both nodes and edges. We also propose a new Attention-based Node-Edge Encoder for the node and edge features. DNNPerf is evaluated on thousands of configurations of real-world and synthetic deep learning models to predict their GPU memory consumption and training time. The experimental results show that DNNPerf achieves accurate predictions, with an overall error of 7.4% for the training time prediction and an overall error of 13.7% for the GPU memory consumption prediction, confirming its effectiveness.
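The paper's own implementation is not reproduced on this page. As a rough, non-authoritative sketch of the idea described above (a model represented as a directed acyclic computation graph whose node and edge features are fused by an attention mechanism before a graph-level readout regresses a runtime metric), the PyTorch snippet below is a minimal illustration; the layer design, feature dimensions, and the GRU-based node update are assumptions, not DNNPerf's actual Attention-based Node-Edge Encoder.

import torch
import torch.nn as nn

class AttentionNodeEdgeEncoder(nn.Module):
    """One message-passing layer that attends over each node's incoming edges."""

    def __init__(self, node_dim, edge_dim, hidden_dim):
        super().__init__()
        self.node_proj = nn.Linear(node_dim, hidden_dim)
        self.msg_proj = nn.Linear(node_dim + edge_dim, hidden_dim)
        self.attn = nn.Linear(2 * hidden_dim, 1)
        self.update = nn.GRUCell(hidden_dim, hidden_dim)

    def forward(self, x, edge_index, edge_attr):
        # x: [num_nodes, node_dim]; edge_index: [2, num_edges] as (src, dst);
        # edge_attr: [num_edges, edge_dim].
        h = torch.relu(self.node_proj(x))
        src, dst = edge_index
        # Each edge's message combines source-node features with edge features.
        msg = torch.relu(self.msg_proj(torch.cat([x[src], edge_attr], dim=-1)))
        # Attention score per edge, normalized over each destination node.
        score = self.attn(torch.cat([h[dst], msg], dim=-1)).squeeze(-1)
        alpha = torch.zeros_like(score)
        for node in dst.unique():
            mask = dst == node
            alpha[mask] = torch.softmax(score[mask], dim=0)
        # Weighted sum of incoming messages, then a gated node update.
        agg = torch.zeros_like(h).index_add_(0, dst, alpha.unsqueeze(-1) * msg)
        return self.update(agg, h)

class RuntimePredictor(nn.Module):
    """Encodes the computation graph and regresses one runtime metric."""

    def __init__(self, node_dim, edge_dim, hidden_dim=64):
        super().__init__()
        self.encoder = AttentionNodeEdgeEncoder(node_dim, edge_dim, hidden_dim)
        self.readout = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1))

    def forward(self, x, edge_index, edge_attr):
        h = self.encoder(x, edge_index, edge_attr)
        return self.readout(h.mean(dim=0))   # graph-level prediction

# Toy usage: 3 operators (e.g., Conv -> ReLU -> Loss) and 2 tensor edges.
x = torch.rand(3, 8)                          # node features (op type, FLOPs, ...)
edge_index = torch.tensor([[0, 1], [1, 2]])   # edges 0 -> 1 and 1 -> 2
edge_attr = torch.rand(2, 4)                  # edge features (tensor shape, bytes, ...)
model = RuntimePredictor(node_dim=8, edge_dim=4)
print(model(x, edge_index, edge_attr))        # untrained, so the value is arbitrary

In a real setting, the node features would encode operator-level attributes (e.g., operator type, FLOPs, parameter count) and the edge features would encode tensor-level attributes (e.g., the shape and byte size of the tensor passed between operators), in line with the abstract's notion of performance-related features derived from the computational semantics of nodes and edges.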
Date of Conference: 14-20 May 2023
Date Added to IEEE Xplore: 11 July 2023
Conference Location: Melbourne, Australia

I. Introduction

In recent years, deep learning (DL) has been widely adopted in many application domains, such as computer vision [1], speech recognition [2], and natural language processing [3]. Like many traditional software systems, DL models are highly configurable via a set of configuration options for hyperparameters (e.g., the batch size and the dropout rate) and neural architectures (e.g., the number of layers). To find a configuration of a DL model that satisfies specific requirements, developers usually run a large number of training jobs (e.g., via automated machine learning tools) to explore diverse configurations.
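As a concrete, purely illustrative example of how a runtime-performance predictor fits into such configuration search, the Python sketch below filters a hyperparameter grid by predicted GPU memory consumption and training time before any job is launched; the predictor functions, budget values, and search space are hypothetical placeholders, not the paper's method or data.

from itertools import product

GPU_MEMORY_MB = 16_000      # assumed per-GPU memory budget
MAX_HOURS_PER_JOB = 12      # assumed per-job time budget

def predict_gpu_memory_mb(cfg):
    # Placeholder heuristic standing in for a trained predictor (e.g., DNNPerf).
    return 3.0 * cfg["batch_size"] * cfg["num_layers"]

def predict_training_hours(cfg):
    # Placeholder heuristic standing in for a trained predictor.
    return 0.05 * cfg["num_layers"] * cfg["hidden_size"] / cfg["batch_size"]

search_space = {
    "batch_size": [16, 32, 64, 128],
    "num_layers": [12, 24, 48],
    "hidden_size": [768, 1024],
}
candidates = [dict(zip(search_space, v)) for v in product(*search_space.values())]

# Launch only configurations predicted to fit both budgets, so that training
# jobs unlikely to succeed (e.g., GPU out-of-memory) are never started.
feasible = [cfg for cfg in candidates
            if predict_gpu_memory_mb(cfg) <= GPU_MEMORY_MB
            and predict_training_hours(cfg) <= MAX_HOURS_PER_JOB]
print(f"{len(feasible)} of {len(candidates)} configurations pass the budget check")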
