Sparse kernel learning with LASSO and Bayesian inference algorithm
Related papers
L1 LASSO Modeling and Its Bayesian Inference
Lecture Notes in Computer Science, 2008
A new iterative procedure for solving regression problems with the so-called LASSO penalty [1] is proposed, based on generative Bayesian modeling and inference. The algorithm produces the anticipated parsimonious (sparse) regression models that generalize well on unseen data. The proposed algorithm is quite robust, and there is no need to specify any model hyperparameters. A comparison with state-of-the-art methods for constructing sparse regression models, such as the relevance vector machine (RVM) and the local regularization assisted orthogonal least squares regression (LROLS), is given.
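For context, the LASSO penalty referred to above is the standard L1-penalised least-squares criterion, and its Bayesian reading (a generic statement, not the paper's specific generative model) is that the LASSO solution coincides with the posterior mode under independent double exponential (Laplace) priors on the coefficients:

```latex
\hat{\beta}_{\mathrm{LASSO}}
  = \arg\min_{\beta}\; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 ,
\qquad
\lVert \beta \rVert_1 = \sum_{j} \lvert \beta_j \rvert .
% With y \mid \beta \sim \mathcal{N}(X\beta, \sigma^2 I) and p(\beta_j) \propto \exp(-\tau \lvert \beta_j \rvert),
% the negative log-posterior is \tfrac{1}{2\sigma^2}\lVert y - X\beta \rVert_2^2 + \tau \lVert \beta \rVert_1 + \mathrm{const},
% so the posterior mode is the LASSO solution with \lambda = \tau \sigma^2.
```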
Sparsity via new Bayesian Lasso
Periodicals of Engineering and Natural Sciences (PEN), 2020
The Lasso estimate can be interpreted as the posterior mode when the regression parameters have double exponential prior densities [1]. In this paper, we propose a Scale Mixture of Normals with Rayleigh (SMNR) mixing density on the coefficient variances to represent the double exponential distribution. A hierarchical model formulation with a Gibbs sampler under SMNR is presented as an alternative Bayesian analysis of the classical lasso minimization problem. Two simulation examples explore the solution paths of the Ridge, Lasso, Bayesian Lasso, and New Bayesian Lasso (R, L, BL, NBL) regression methods, assessing prediction accuracy through the bias of the estimates at different sample sizes; the bias results indicate that lasso regression performs well, followed by the NBL. The Median Mean Absolute Deviations (MMAD) criterion is used to compare the performance of the regression methods on real data; the MMAD results indicate that the proposed method (NBL) performs better than the others.
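The hierarchy sketched in the abstract builds on the familiar scale-mixture-of-normals representation of the double exponential density; the classical Bayesian Lasso uses an exponential mixing density on the coefficient variances, which the paper replaces by a Rayleigh mixing density (the exact SMNR hierarchy and its Gibbs sampler are given in the paper and are not reproduced here):

```latex
% Standard scale-mixture representation of the double exponential (Laplace) density:
\frac{a}{2}\, e^{-a \lvert \beta \rvert}
  = \int_{0}^{\infty}
    \frac{1}{\sqrt{2\pi s}}\, e^{-\beta^{2}/(2s)}
    \cdot \frac{a^{2}}{2}\, e^{-a^{2} s / 2}\, ds .
% A zero-mean normal with random variance s, mixed over an exponential density;
% the SMNR construction replaces this exponential mixing density with a Rayleigh density.
```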
Significant vector learning to construct sparse kernel regression models
Neural Networks, 2007
A novel significant vector (SV) regression algorithm is proposed in this paper, based on an analysis of Chen's orthogonal least squares (OLS) regression algorithm. The proposed regularized SV algorithm finds the significant vectors in a successive greedy process in which, compared with the classical OLS algorithm, the orthogonalization step has been removed. The performance of the proposed algorithm is comparable to that of the OLS algorithm, while it avoids the computational cost of the orthogonalization required by the OLS algorithm.
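As a rough illustration of the successive greedy selection described above, the sketch below selects kernel regressors one at a time by their correlation with the current residual and refits the selected weights by regularised least squares. It is a minimal illustration only, not the paper's SV algorithm; the Gaussian kernel, the `width` and `ridge` values, and the names `gaussian_kernel` and `greedy_kernel_regression` are all choices made for the example.

```python
import numpy as np

def gaussian_kernel(X, centers, width=1.0):
    # Design matrix whose columns are Gaussian kernels centred at the training points.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def greedy_kernel_regression(X, y, n_vectors=10, width=1.0, ridge=1e-6):
    # Successive greedy selection without orthogonalization (illustrative sketch).
    Phi = gaussian_kernel(X, X, width)                  # full candidate dictionary
    selected, residual, w = [], y.copy(), None
    for _ in range(n_vectors):
        # Score every column by its normalised squared correlation with the residual.
        scores = (Phi.T @ residual) ** 2 / (np.einsum("ij,ij->j", Phi, Phi) + 1e-12)
        scores[selected] = -np.inf                      # never reselect a column
        selected.append(int(np.argmax(scores)))
        # Refit the weights of the currently selected columns (ridge-stabilised).
        S = Phi[:, selected]
        w = np.linalg.solve(S.T @ S + ridge * np.eye(len(selected)), S.T @ y)
        residual = y - S @ w
    return np.array(selected), w

# Toy usage on a noisy sinc function.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sinc(X[:, 0]) + 0.05 * rng.standard_normal(100)
centers, weights = greedy_kernel_regression(X, y, n_vectors=8)
```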
IEEE Transactions on Neural Networks, 2004
In the last few years, the support vector machine (SVM) method has motivated new interest in kernel regression techniques. Although the SVM has been shown to exhibit excellent generalization properties in many experiments, it suffers from several drawbacks, both of a theoretical and a technical nature: the absence of probabilistic outputs, the restriction to Mercer kernels, and the steep growth of the number of support vectors with increasing size of the training set. In this paper, we present a different class of kernel regressors that effectively overcome the above problems. We call this approach generalized LASSO regression. It has a clear probabilistic interpretation, can handle learning sets that are corrupted by outliers, produces extremely sparse solutions, and is capable of dealing with large-scale problems. For regression functionals which can be modeled as iteratively reweighted least-squares (IRLS) problems, we present a highly efficient algorithm with guaranteed global convergence. This defines a unified framework for sparse regression models in the very rich class of IRLS models, including various types of robust regression models and logistic regression. Performance studies for many standard benchmark datasets effectively demonstrate the advantages of this model over related approaches.
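The IRLS template the abstract refers to can be sketched generically: each iteration solves a weighted least-squares problem whose weights are recomputed from the current residuals. The toy example below uses a Huber loss purely to illustrate that template; it is not the generalized LASSO algorithm of the paper, and the name `irls_huber` and the threshold `delta` are choices made for the example.

```python
import numpy as np

def irls_huber(X, y, delta=1.0, n_iter=50, tol=1e-8):
    # Generic iteratively reweighted least squares for a Huber loss (illustrative sketch).
    n, p = X.shape
    w = np.ones(n)                     # observation weights
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Weighted least-squares step, with a tiny ridge term for numerical stability.
        W = np.diag(w)
        beta_new = np.linalg.solve(X.T @ W @ X + 1e-10 * np.eye(p), X.T @ W @ y)
        r = y - X @ beta_new
        # Huber weights: quadratic loss near zero, linear in the tails.
        w = np.where(np.abs(r) <= delta, 1.0, delta / np.maximum(np.abs(r), 1e-12))
        if np.linalg.norm(beta_new - beta) < tol:
            return beta_new
        beta = beta_new
    return beta
```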
Lecture Notes in Computer Science, 2001
Sparse kernel regressors have become popular by applying the support vector method to regression problems. Although this approach has been shown to exhibit excellent generalization properties in many experiments, it suffers from several drawbacks: the absence of probabilistic outputs, the restriction to Mercer kernels, and the steep growth of the number of support vectors with increasing size of the training set. In this paper we present a new class of kernel regressors that effectively overcome the above problems. We call this new approach generalized LASSO regression. It has a clear probabilistic interpretation, produces extremely sparse solutions, can handle learning sets that are corrupted by outliers, and is capable of dealing with large-scale problems.
IEEE Conference on Decision and Control and European Control Conference, 2011
We consider the problem of sparse estimation in a Bayesian framework. We outline the derivation of the Lasso in terms of marginalization of a particular Bayesian model. A different marginalization of the same probabilistic model also leads to a different, nonconvex estimator in which hyperparameters are optimized. The arguments are extended to problems where groups of variables have to be estimated. An alternative to the Group Lasso is derived, together with its connection to Multiple Kernel Learning approaches. Our estimator is nonconvex, but one of its versions requires optimization with respect to only one scalar variable. Theoretical arguments and numerical experiments show that the new technique obtains sparse solutions that are more accurate than those of the two convex estimators.
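Schematically, and without reproducing the paper's specific model, the two marginalizations mentioned above can be contrasted as follows: starting from a conditionally Gaussian prior with a hyperprior on its variance, one can either integrate out the hyperparameter and take the posterior mode over the coefficients (with an exponential hyperprior this recovers a Laplace prior and hence the Lasso), or integrate out the coefficients and optimize the resulting marginal likelihood over the hyperparameters, which yields a nonconvex, hyperparameter-optimizing estimator:

```latex
% Route 1: marginalize the hyperparameter, then take the MAP estimate over \beta
p(\beta) = \int_{0}^{\infty} \mathcal{N}(\beta \mid 0, \lambda)\, p(\lambda)\, d\lambda ,
\qquad
\hat{\beta} = \arg\max_{\beta}\, p(y \mid \beta)\, p(\beta) .
% Route 2: marginalize \beta, then optimize the hyperparameter (type-II maximum likelihood)
\hat{\lambda} = \arg\max_{\lambda} \int p(y \mid \beta)\, \mathcal{N}(\beta \mid 0, \lambda)\, d\beta .
```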
Sparse Bayesian modeling with adaptive kernel learning
IEEE Transactions on Neural Networks, 2009
Sparse kernel methods are very efficient in solving regression and classification problems. The sparsity and performance of these methods depend on selecting an appropriate kernel function, which is typically achieved using a cross-validation procedure. In this paper, we propose an incremental method for supervised learning, which is similar to the relevance vector machine (RVM) but also learns the parameters of the kernels during model training. Specifically, we learn different parameter values for each kernel, resulting in a very flexible model. In order to avoid overfitting, we use a sparsity-enforcing prior that controls the effective number of parameters of the model. We present experimental results on artificial data to demonstrate the advantages of the proposed method, and we provide a comparison with the typical RVM on several commonly used regression and classification data sets.
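For readers unfamiliar with the RVM-style mechanism alluded to above: each weight receives its own zero-mean Gaussian prior with an individual precision, and maximizing the marginal likelihood over these precisions drives many of them to infinity, pruning the corresponding basis functions. This is the standard sparse Bayesian learning construction; the paper's adaptive per-kernel parameters sit on top of it.

```latex
p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i} \mathcal{N}\!\left(w_i \mid 0, \alpha_i^{-1}\right),
\qquad
\hat{\boldsymbol{\alpha}} = \arg\max_{\boldsymbol{\alpha}} \int p(\mathbf{y} \mid \mathbf{w})\, p(\mathbf{w} \mid \boldsymbol{\alpha})\, d\mathbf{w} .
% Weights whose precision \alpha_i \to \infty are pinned at zero and their kernels are pruned.
```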
Statistics, Optimization & Information Computing, 2019
This paper compares the performance characteristics of penalty estimators, namely LASSO and ridge regression (RR), with the least squares estimator (LSE), restricted estimator (RE), preliminary test estimator (PTE) and the Stein-type estimators. Under the assumption of an orthonormal design matrix for a given regression model, we find that the RR estimator dominates the LSE, RE, PTE, Stein-type estimators and the LASSO estimator uniformly, while, similarly to [17], neither the LASSO nor the LSE, PTE and Stein-type estimators dominate one another. Our conclusions are based on the analysis of L2-risks and relative risk efficiencies (RRE), together with the related RRE tables and graphs.
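For concreteness, under the orthonormal design assumption used in the paper (the Gram matrix equal to the identity), both penalty estimators have well-known closed forms in terms of the LSE coordinates, with k and lambda denoting the ridge and LASSO tuning parameters, respectively:

```latex
\hat{\beta}^{\mathrm{RR}}_{j} = \frac{\tilde{\beta}_{j}}{1 + k} ,
\qquad
\hat{\beta}^{\mathrm{LASSO}}_{j}
  = \operatorname{sgn}(\tilde{\beta}_{j})\,\bigl(\lvert \tilde{\beta}_{j} \rvert - \lambda\bigr)_{+} ,
\qquad
\tilde{\beta}_{j} = (X^{\top} y)_{j} .
% Ridge shrinks every coordinate proportionally, while the LASSO soft-thresholds and sets
% small coordinates exactly to zero; the paper's risk comparisons contrast these behaviours
% with the LSE, RE, PTE and Stein-type estimators.
```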
2006
Yuan and Lin (2004) proposed the grouped LASSO, which achieves shrinkage and selection simultaneously, as the LASSO does, but works on blocks of covariates. That is, the grouped LASSO provides a model in which some blocks of regression coefficients are exactly zero. The grouped LASSO is useful when there are meaningful blocks of covariates, such as polynomial regression terms and dummy variables derived from categorical variables. In this paper, we propose an extension of the grouped LASSO, called ‘Blockwise Sparse Regression’ (BSR). The BSR achieves shrinkage and selection simultaneously on blocks of covariates, similarly to the grouped LASSO, but it works for general loss functions, including generalized linear models. An efficient computational algorithm is developed and a blockwise standardization method is proposed. Simulation results show that the BSR is a compromise between ridge and LASSO for logistic regression. The proposed method is illustrated with two datasets.
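For reference, the grouped LASSO criterion of Yuan and Lin penalizes the unsquared L2 norm of each coefficient block, which sets entire blocks to zero; the square-root-of-group-size weights shown below are one common convention. The BSR described above keeps this blockwise penalty but allows a general loss, such as a generalized linear model negative log-likelihood, in place of the squared error:

```latex
\min_{\beta}\;
\frac{1}{2}\,\Bigl\lVert y - \sum_{g=1}^{G} X_{g}\beta_{g} \Bigr\rVert_{2}^{2}
+ \lambda \sum_{g=1}^{G} \sqrt{p_{g}}\, \lVert \beta_{g} \rVert_{2} .
% \beta_g is the coefficient block for group g, p_g its size, and X_g its columns of the design matrix;
% because the block norm is not squared, whole blocks of coefficients can be exactly zero.
```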
Sparse Kernel Regression Modelling Based on L1 Significant Vector Learning
2005 International Conference on Neural Networks and Brain, 2005
A novel L1 significant vector (SV) regression algorithm is proposed in this paper. The proposed regularized L1 SV algorithm finds the significant vectors in a successive greedy process. The performance of the proposed algorithm is comparable to that of the OLS algorithm, while it avoids the computational cost of the orthogonalization required by the OLS algorithm.