
Analysis of Particle Swarm Optimisation for Training Support Vector Machines



Abstract:

Particle swarm optimisation can easily be used to optimise a function constrained by a set of linear equalities. The converging linear particle swarm optimisation algorithm has been developed specifically for this purpose. Training of a support vector machine requires the solution of such a linearly constrained quadratic optimisation problem. Due to the combinatorial explosion experienced when solving large instances of this quadratic optimisation problem, support vector machines are notorious for poor scalability when working with large datasets. The converging linear particle swarm optimisation algorithm, in conjunction with a decomposition approach, has therefore been put forward as an alternative method for training support vector machines, with a view towards addressing the problem of scalability. In this paper, a number of problems and shortcomings that may be encountered when implementing the converging linear particle swarm optimisation algorithm for training support vector machines, such as premature convergence due to the clamping procedure employed in the original algorithmic implementations, are highlighted, and an approach to overcome these shortcomings is proposed. Finally, an analysis pertaining to the scalability of the proposed approach is conducted.
Date of Conference: 01-04 December 2020
Date Added to IEEE Xplore: 05 January 2021
Conference Location: Canberra, ACT, Australia

I. Introduction

Support vector machines (SVMs) [1] are a popular classification technique. The principle behind SVMs is that a number of support vectors are determined by solving a quadratic optimisation problem. These support vectors are subsequently used in order to define a decision boundary for the classifier. To apply SVMs to datasets that are not linearly separable, the well-known kernel trick [2] may be used, mapping the data into a higher dimensional space in which the dataset is linearly separable. The basic quadratic optimisation model of an SVM is given by: \begin{equation*}\max f(\alpha)=\sum_{i=1}^{n}\alpha_{i}-\frac{1}{2}\sum_{i,j=1}^{n}\alpha_{i}\alpha_{j}y_{i}y_{j}K(\mathrm{x}_{i},\ \mathrm{x}_{j}), \tag{1}\end{equation*}

subject to the constraints \begin{equation*}\sum_{i=1}^{n}\alpha_{i}y_{i} = 0 \tag{2}\end{equation*}
\begin{equation*}0\leq \alpha_{i} \leq C \quad \text{for all } i\in\{1,\ldots, n\}, \tag{3}\end{equation*}
where $K(\mathrm{x}_{i}, \mathrm{x}_{j})$ denotes the kernel function, $\{(\mathrm{x}_{i}, y_{i})\}_{i=1}^{n}$ denotes a labelled dataset containing samples from two classes, where $y_{i} \in \{-1, +1\}$, and $n$ denotes the number of observations. The observations $\mathrm{x}_{i}$ corresponding to non-zero entries of $\alpha$ are the support vectors, while $C$ is a constant parameter which quantifies the penalty incurred in the objective function for misclassified observations. The optimisation problem in equations (1)-(3) is solved when the Karush-Kuhn-Tucker (KKT) conditions [3] are satisfied.
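For concreteness, the following minimal Python sketch evaluates the dual objective in equation (1) and checks a candidate α for feasibility with respect to constraints (2) and (3); the RBF kernel, the toy data, and all variable names are illustrative assumptions rather than details taken from the paper.

import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gram matrix with entries K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))

def dual_objective(alpha, y, K):
    # f(alpha) = sum_i alpha_i - 0.5 * sum_{i,j} alpha_i alpha_j y_i y_j K(x_i, x_j)
    v = alpha * y
    return alpha.sum() - 0.5 * v @ K @ v

def is_feasible(alpha, y, C, tol=1e-8):
    # Equality constraint (2): sum_i alpha_i y_i = 0
    # Box constraint (3): 0 <= alpha_i <= C for all i
    return abs(alpha @ y) <= tol and np.all(alpha >= -tol) and np.all(alpha <= C + tol)

# Toy two-class dataset with labels in {-1, +1} (purely illustrative)
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.1], [0.9, 1.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
K = rbf_kernel(X)

alpha = np.full(4, 0.5)  # satisfies (2) here because the two classes are balanced
print(dual_objective(alpha, y, K), is_feasible(alpha, y, C=1.0))

In a particle swarm setting such as the one discussed in this paper, each particle position is a candidate α that must remain feasible with respect to both checks, with the box constraint (3) typically enforced by the clamping procedure mentioned in the abstract.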
