Sparse LS-SVMs with L0-Norm Minimization

Improved sparse least-squares support vector machines

Neurocomputing, 2002

Earlier work (in press) describes a weighted least-squares formulation of the support vector machine for regression problems and presents a simple algorithm for sparse approximation of the typically fully dense kernel expansions obtained with this method. In this paper, we present an improved method for achieving sparsity in least-squares support vector machines, which takes into account the residuals of all training patterns, rather than only those incorporated in the sparse kernel expansion. The superiority of this algorithm is demonstrated on the motorcycle and Boston housing data sets.
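
For reference, the dense LS-SVM regression model that this kind of pruning starts from is obtained from a linear system rather than a quadratic program; a standard form of that system (notation assumed here for illustration, not taken from the abstract) is

\[
\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & K + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix},
\qquad
f(x) = \sum_{i=1}^{N} \alpha_i\, k(x, x_i) + b,
\]

with kernel matrix \(K_{ij} = k(x_i, x_j)\) and regularization constant \(\gamma\). Since every \(\alpha_i\) is generically nonzero, the expansion involves all N training points; the improvement described above scores candidate prunings using the residuals of all N patterns rather than only the points retained in the reduced expansion.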

Sparse least squares support vector machine classifiers

Neural Processing Letters, 2000

Since the early 1990s, Support Vector Machines (SVMs) have attracted increasing attention owing to their applicability to a large number of problems. To overcome the high computational complexity of traditional Support Vector Machines, a new technique, the Least-Squares SVM (LS-SVM), was recently introduced, but unfortunately a very attractive feature of the SVM, namely its sparseness, was lost. The LS-SVM reduces the required computation to solving a set of linear equations, which embodies all available information about the learning process. By applying modifications to this equation set, we present a Least Squares version of the Least Squares Support Vector Machine (LS²-SVM). The proposed modification speeds up the calculations and provides better results, but most importantly it yields a sparse solution.
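
The abstract does not spell out the modified equation set; a minimal numpy sketch of the general idea it points to (fitting a kernel expansion restricted to a reduced set of columns in a least-squares sense) might look as follows. The kernel choice, the way the retained columns are picked, and the ridge-style regularization are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def reduced_ls_fit(X, y, support_idx, lam=1e-2, sigma=1.0):
    """Fit y ~ sum_j beta_j k(x, x_j) + b using only the kernel columns in support_idx.

    The full n x n kernel system becomes an n x (m+1) overdetermined system
    (m = number of retained columns), solved here as ridge-regularized least
    squares -- an illustrative stand-in for the LS^2-SVM reduction.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X[support_idx], sigma)     # n x m reduced kernel block
    A = np.hstack([np.ones((n, 1)), K])          # bias column + kernel columns
    # Normal equations with a small ridge term for numerical stability.
    w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
    return w[0], w[1:]                           # bias b, coefficients beta

# Toy usage: fit a noisy sine using only a small subset of columns.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
b, beta = reduced_ls_fit(X, y, support_idx=np.arange(0, 200, 13))
```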

IP-LSSVM: A two-step sparse classifier

Pattern Recognition Letters, 2009

We present in this work a two-step sparse classifier called IP-LSSVM, which is based on the Least Squares Support Vector Machine (LS-SVM). The LS-SVM formulation solves the learning problem with a system of linear equations. Although this solution is simpler, sparseness in the support vectors is lost. Many works on LS-SVM focus on improving the support vector representation in the least squares approach, since the support vectors are the only vectors that must be stored for later use of the machine and can also serve directly as a reduced subset that represents the initial data set. The proposed classifier combines the advantages of both SVM and LS-SVM: automatic detection of support vectors and a solution obtained simply by solving systems of linear equations. IP-LSSVM was compared with other sparse LS-SVM classifiers from the literature: LS²-SVM, Pruning, Ada-Pinv and RRS+LS-SVM. The experiments were performed on four important benchmark databases in machine learning and on two artificial databases created to show the detected support vectors visually. The results show that IP-LSSVM is a viable alternative to SVMs, since both have similar features, supported by literature results, while IP-LSSVM has a simpler and more understandable formulation.

A hybrid approach for sparse least squares support vector machines

Fifth International Conference on Hybrid Intelligent Systems (HIS'05), 2005

We present in this work a hybrid strategy for training Least Squares Support Vector Machines (LS-SVMs), in order to eliminate their greatest drawback compared to the original Support Vector Machines (SVMs): the lack of automatic detection of support vectors, the so-called loss of sparseness. The main characteristic of LS-SVMs is their low computational complexity compared to SVMs, without loss of solution quality, because both are based on the same principles. In this work, we use a sample selection technique called Reduced Remaining Subset (RRS), based on a modified nearest-neighbor rule, to choose the best samples to represent each class. The LS-SVM then uses the selected samples as support vectors to find the decision surface between the classes. Experiments are presented comparing the proposed approach with two existing methods that also aim to impose sparseness in LS-SVMs; a sketch of the two-stage structure follows.
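
The abstract does not reproduce the RRS rule itself; the sketch below uses a generic condensed nearest-neighbour selection purely as a stand-in for it, followed by a standard LS-SVM classifier solve on the selected samples, to illustrate the two-stage structure of the hybrid approach. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def condensed_subset(X, y, seed=0):
    """Generic condensed nearest-neighbour selection (a stand-in for RRS):
    keep any sample that the current prototype set misclassifies under 1-NN."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    keep = [order[0]]
    for i in order[1:]:
        d = np.linalg.norm(X[keep] - X[i], axis=1)
        if y[keep][np.argmin(d)] != y[i]:     # misclassified -> add as prototype
            keep.append(i)
    return np.array(keep)

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Standard LS-SVM classifier solve on the (already reduced) sample set."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma ** 2))
    Omega = (y[:, None] * y[None, :]) * K
    n = len(y)
    A = np.block([[np.zeros((1, 1)), y[None, :]],
                  [y[:, None], Omega + np.eye(n) / gamma]])
    sol = np.linalg.solve(A, np.concatenate([[0.0], np.ones(n)]))
    return sol[0], sol[1:]                    # bias b, support values alpha

# Hybrid pipeline: select prototypes first, then train the LS-SVM only on them.
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 2)); y = np.sign(X[:, 0] + X[:, 1])
idx = condensed_subset(X, y)
b, alpha = lssvm_train(X[idx], y[idx])
```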

Sparse approximation using least squares support vector machines

2000

In least squares support vector machines (LS-SVMs) for function estimation, Vapnik's ε-insensitive loss function has been replaced by a cost function which corresponds to a form of ridge regression. In this way nonlinear function estimation is done by solving a linear set of equations instead of a quadratic programming problem. The LS-SVM formulation also involves fewer tuning parameters. However, a drawback is that sparseness is lost in the LS-SVM case. In this paper we investigate imposing sparseness by pruning support values from the sorted support value spectrum which results from the solution of the linear system.
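
The pruning procedure outlined in this abstract is commonly implemented as an iterative loop: solve the full linear system, drop the points with the smallest |α_i| (the low end of the sorted support value spectrum), and re-solve on the remaining points. Below is a minimal sketch under that reading, with an RBF kernel and pruning fractions chosen purely for illustration.

```python
import numpy as np

def lssvm_regression(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM regression linear system; returns bias b and support values alpha."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma ** 2))
    n = len(y)
    A = np.block([[np.zeros((1, 1)), np.ones((1, n))],
                  [np.ones((n, 1)), K + np.eye(n) / gamma]])
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]

def prune_lssvm(X, y, keep_fraction=0.3, drop_per_step=0.05, **kw):
    """Iteratively remove the samples with the smallest |alpha| and retrain."""
    idx = np.arange(len(y))
    while len(idx) > keep_fraction * len(y):
        b, alpha = lssvm_regression(X[idx], y[idx], **kw)
        n_drop = max(1, int(drop_per_step * len(idx)))
        keep = np.argsort(np.abs(alpha))[n_drop:]   # discard smallest support values
        idx = idx[keep]
    b, alpha = lssvm_regression(X[idx], y[idx], **kw)
    return idx, b, alpha

# Toy usage on a 1-D regression problem.
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, (150, 1))
y = np.sinc(X[:, 0]) + 0.05 * rng.standard_normal(150)
idx, b, alpha = prune_lssvm(X, y)
```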

Sparse Support Vector Machine for pattern recognition

2013 International Conference on High Performance Computing & Simulation (HPCS), 2013

The support vector machine (SVM) is one of the most popular classification techniques in the pattern recognition community. However, the SVM tends to perform poorly when the training samples contain outliers. In this paper, we borrow an idea from compressive sensing and introduce an extra term into the objective function of the standard SVM in order to achieve a sparse representation. Furthermore, instead of using the l0 norm, we adopt the l1 norm in our sparse SVM. In most cases, our method achieves higher classification rates than the standard SVM because of sparser support vectors, and it is more robust to outliers in the data sets. Experimental results show that the proposed SVM is efficient in pattern recognition applications.
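
The abstract does not give the modified objective explicitly; one common way to write this kind of l1-sparsified kernel SVM (an illustrative assumption, using expansion coefficients β as the penalized quantity) is

\[
\min_{\beta, b, \xi}\;
\tfrac{1}{2} \sum_{i,j} \beta_i \beta_j k(x_i, x_j)
+ C \sum_i \xi_i
+ \lambda \lVert \beta \rVert_1
\quad \text{s.t.} \quad
y_i \Big( \sum_j \beta_j k(x_i, x_j) + b \Big) \ge 1 - \xi_i,\; \xi_i \ge 0,
\]

where the l1 term acts as a convex surrogate for the l0 count of nonzero coefficients and therefore drives many β_j, and hence many support vectors, to zero.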

Support Vector Machines With Constraints for Sparsity in the Primal Parameters

IEEE Transactions on Neural Networks, 2000

This paper introduces a new support vector machine (SVM) formulation to obtain sparse solutions in the primal SVM parameters, providing a new method for feature selection based on SVMs. This new approach includes additional constraints to the classical ones that drop the weights associated with those features that are likely to be irrelevant. A ν-SVM formulation has been used, where ν indicates the fraction of features to be considered. This paper presents two versions of the proposed sparse classifier, a 2-norm SVM and a 1-norm SVM, the latter having a reduced computational burden with respect to the first one. Additionally, an explanation is provided about how the presented approach can be readily extended to multiclass classification or to problems where groups of features, rather than isolated features, need to be selected. The algorithms have been tested on a variety of synthetic and real data sets and compared against other state-of-the-art SVM-based linear feature selection methods, such as the 1-norm SVM and the doubly regularized SVM. The results show the good feature selection ability of the approaches. Index Terms: Feature group selection, feature selection, margin maximization, multiclass classification, support vector machines.
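
For context, the ν-SVM formulation that this work builds on, in its standard form (background knowledge rather than material from the abstract), is

\[
\min_{w, b, \xi, \rho}\;
\tfrac{1}{2} \lVert w \rVert^2 - \nu \rho + \tfrac{1}{n} \sum_i \xi_i
\quad \text{s.t.} \quad
y_i (w^{\top} x_i + b) \ge \rho - \xi_i,\; \xi_i \ge 0,\; \rho \ge 0,
\]

where ν normally bounds the fraction of margin errors and support vectors. In the sparse formulation described above, the abstract states that ν instead controls the fraction of features to be retained, via the extra constraints imposed on the primal weight vector w.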

ϵ-Sparse Representations: Generalized Sparse Approximation and the Equivalent Family of SVM Tasks

A relation between a family of generalized Support Vector Machine (SVM) problems and the novel ϵ-sparse representation is provided. In defining ϵ-sparse representations, we use a natural generalization of the classical ϵ-insensitive cost function for vectors. The insensitive parameter of the SVM problem is transformed into component-wise insensitivity, and thus overall sparsification is replaced by component-wise sparsification. The connection between these two problems is built through the generalized Moore-Penrose inverse of the Gram matrix associated with the kernel.
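
The Moore-Penrose connection mentioned at the end can be made slightly more concrete; the equation below is only the standard minimum-norm fact it builds on, the precise definitions being in the paper itself. For a Gram matrix G with entries G_{ij} = k(x_i, x_j) and targets y,

\[
\boldsymbol{\alpha} = G^{+} \mathbf{y}
\]

is the minimum-norm solution of \(G \boldsymbol{\alpha} = \mathbf{y}\) (exact whenever y lies in the range of G); the component-wise ϵ-insensitivity described above then relaxes this fit coordinate by coordinate, which is what yields component-wise rather than overall sparsification.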

A Comparison of Pruning Algorithms for Sparse Least Squares Support Vector Machines

Lecture Notes in Computer Science, 2004

Least Squares Support Vector Machines (LS-SVM) is a proven method for classification and function approximation. In comparison to the standard Support Vector Machine (SVM), it only requires solving a linear system, but it lacks sparseness in the number of solution terms, so pruning can be applied. Standard ways of pruning the LS-SVM consist of recursively solving the approximation problem and subsequently omitting data that had a small error in the previous pass; these schemes are based on the support values. We suggest a slightly adapted variant that improves the performance significantly. We assess the relative regression performance of these pruning schemes in a comparison with two subset selection schemes (adapted for pruning), one based on the QR decomposition (supervised) and one that searches the most representative feature vector span (unsupervised), as well as random omission and backward selection, on independent test sets in benchmark experiments.

Sparse LS-SVMs using additive regularization with a penalized validation criterion

2004

This paper is based on a new way of determining the regularization trade-off in least squares support vector machines (LS-SVMs) via a mechanism of additive regularization which was recently introduced in [6]. This framework enables computational fusion of the training and validation levels and makes it possible to train the model and find the regularization constants by solving a single linear system at once. In this paper we show that this framework allows a penalized validation criterion that leads to sparse LS-SVMs. The model, regularization constants and sparseness then follow from a convex quadratic program.