Fast and Communication-Efficient Algorithm for Distributed Support Vector Machine Training


Abstract:

Support Vector Machines (SVMs) are widely used as supervised learning models for classification problems in machine learning. Training SVMs on large datasets is extremely challenging because of excessive storage and computational requirements. To tackle so-called big data problems, one needs to design scalable distributed algorithms that parallelize model training, and to develop efficient implementations of these algorithms. In this paper, we propose a distributed algorithm for SVM training that is scalable and communication-efficient. The algorithm uses a compact representation of the kernel matrix, based on the QR decomposition of low-rank approximations, to reduce both the computation and the storage required in the training stage. This is accompanied by a considerable reduction in the communication required for a distributed implementation of the algorithm. Experiments on benchmark data sets with up to five million samples demonstrate negligible communication overhead and scalability on up to 64 cores. Execution times improve substantially on those of other widely used packages. Furthermore, the proposed algorithm has linear time complexity with respect to the number of samples, making it well suited to SVM training in decentralized environments such as smart embedded systems and the edge-based Internet of Things (IoT).
Published in: IEEE Transactions on Parallel and Distributed Systems ( Volume: 30, Issue: 5, 01 May 2019)
Page(s): 1065 - 1076
Date of Publication: 07 November 2018
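
The compact kernel representation mentioned in the abstract can be illustrated with a small single-machine sketch. The abstract does not say which low-rank approximation the paper uses, so this example assumes a Nyström approximation with randomly chosen landmark points and an RBF kernel; the names `rbf_kernel`, `compact_kernel`, `rank`, and `gamma` are hypothetical and are not taken from the paper. The sketch only shows how a QR decomposition of a low-rank factor yields an n x r orthonormal factor plus an r x r core in place of the full n x n kernel matrix. It does not reproduce the distributed algorithm.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def compact_kernel(X, rank, gamma=0.5, seed=0):
    """Return (Q, R) such that K is approximated by Q @ (R @ R.T) @ Q.T.

    Assumption: the low-rank factor comes from a Nystrom approximation
    with `rank` random landmarks. This is an illustrative choice, not
    necessarily the approximation used in the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=rank, replace=False)

    C = rbf_kernel(X, X[idx], gamma)  # n x r: kernel against the landmarks
    W = C[idx]                        # r x r: kernel among the landmarks

    # Pseudo-inverse square root of W, dropping near-zero eigenvalues.
    evals, evecs = np.linalg.eigh(W)
    keep = evals > 1e-10 * evals.max()
    W_inv_sqrt = (evecs[:, keep] / np.sqrt(evals[keep])) @ evecs[:, keep].T

    G = C @ W_inv_sqrt                # n x r low-rank factor, K ~ G @ G.T
    Q, R = np.linalg.qr(G)            # reduced QR: Q is n x r, R is r x r
    return Q, R


# Usage: 1000 samples, rank 50. Only Q (n x r) and the r x r core are kept.
X = np.random.default_rng(42).normal(size=(1000, 5))
Q, R = compact_kernel(X, rank=50)
core = R @ R.T
print(Q.shape, core.shape)
```

With n samples and rank r, this stores on the order of n·r + r² numbers instead of n², which is the kind of storage reduction the abstract refers to.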

1 Introduction

Machine learning is at the core of solving real-world challenges in sectors such as energy, transportation, finance, business analytics, healthcare, and manufacturing. With the influx of huge amounts of digital data from sensors, social media, mobile devices, and online transactions, it has become increasingly challenging to store, process, and analyze data for predictive analytics.
