Sign-regularized Multi-task Learning
Related papers
Efficient multi-task feature learning with calibration
Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14, 2014
Multi-task feature learning has been proposed to improve the generalization performance by learning the shared features among multiple related tasks, and it has been successfully applied to many real-world problems in machine learning, data mining, computer vision and bioinformatics. Most existing multi-task feature learning models simply assume a common noise level for all tasks, which may not be the case in real applications. Recently, a Calibrated Multivariate Regression (CMR) model has been proposed, which calibrates different tasks with respect to their noise levels and achieves superior prediction performance over the non-calibrated one. A major challenge is how to solve the CMR model efficiently, as it is formulated as a composite optimization problem consisting of two non-smooth terms. In this paper, we propose a variant of the calibrated multi-task feature learning formulation by including a squared norm regularizer. We show that the dual problem of the proposed formulation is a smooth optimization problem with a piecewise sphere constraint. The simplicity of the dual problem enables us to develop fast dual optimization algorithms with low per-iteration cost. We also provide a detailed convergence analysis for the proposed dual optimization algorithm. Empirical studies demonstrate that the dual optimization algorithm converges quickly and is much more efficient than the primal optimization algorithm. Moreover, the calibrated multi-task feature learning algorithms with and without the squared norm regularizer achieve similar prediction performance and both outperform the non-calibrated ones. Thus, the proposed variant not only enables us to develop fast optimization algorithms, but also preserves the superior prediction performance of calibrated multi-task feature learning over the non-calibrated one.
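For concreteness, a calibrated multi-task feature learning objective of the kind described above can be sketched as follows; this is a hedged illustration consistent with the abstract (non-squared per-task losses providing the calibration, an ℓ2,1 shared-feature regularizer, and the additional squared-norm term), not the authors' exact formulation or scaling:

```latex
\min_{W=[w_1,\dots,w_T]}\;
\sum_{t=1}^{T} \lVert y_t - X_t w_t \rVert_2
\;+\; \lambda_1 \sum_{j=1}^{d} \lVert w^j \rVert_2
\;+\; \lambda_2 \lVert W \rVert_F^2
```

Here $w^j$ denotes the $j$-th row of $W$; the non-squared losses adapt implicitly to each task's noise level, and the added $\lVert W \rVert_F^2$ term is what yields the smooth dual problem described in the abstract.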
Multi-stage multi-task feature learning
Journal of Machine Learning Research, 2013
Multi-task sparse feature learning aims to improve the generalization performance by exploiting the shared features among tasks. It has been successfully applied to many applications, including computer vision and biomedical informatics. Most of the existing multi-task sparse feature learning algorithms are formulated as a convex sparse regularization problem, which is usually suboptimal due to its looseness in approximating an ℓ0-type regularizer. In this paper, we propose a non-convex formulation for multi-task sparse feature learning based on a novel non-convex regularizer. To solve the non-convex optimization problem, we propose a Multi-Stage Multi-Task Feature Learning (MSMTFL) algorithm; we also provide intuitive interpretations and a detailed convergence and reproducibility analysis for the proposed algorithm. Moreover, we present a detailed theoretical analysis showing that MSMTFL achieves a better parameter estimation error bound than the convex formulation. Empirical studies on both synthetic and real-world data sets demonstrate the effectiveness of MSMTFL in comparison with state-of-the-art multi-task sparse feature learning algorithms.
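As an illustration of the multi-stage idea, the sketch below alternates between solving a weighted convex subproblem and recomputing per-feature penalties from the current solution, so that features whose rows are already large stop being penalized (a capped-ℓ1-style rule). The helper `weighted_lasso`, the ISTA solver, and the parameters `lam` and `theta` are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def weighted_lasso(X, y, lam_vec, n_iter=500):
    """Plain ISTA for min 0.5*||Xw - y||^2 + sum_j lam_vec[j]*|w_j| (illustrative helper)."""
    n, d = X.shape
    L = np.linalg.norm(X, 2) ** 2            # Lipschitz constant of the smooth part
    w = np.zeros(d)
    for _ in range(n_iter):
        g = X.T @ (X @ w - y)                # gradient of the least-squares term
        z = w - g / L
        w = np.sign(z) * np.maximum(np.abs(z) - lam_vec / L, 0.0)   # soft-thresholding
    return w

def multi_stage_mtl(X_list, y_list, lam=0.1, theta=0.5, n_stages=5):
    """Multi-stage multi-task feature learning sketch with capped-l1-style reweighting."""
    d = X_list[0].shape[1]
    penalties = lam * np.ones(d)             # stage 0: ordinary convex penalty on every feature
    for _ in range(n_stages):
        # The row-weighted l1 subproblem decouples into one weighted Lasso per task.
        W = np.column_stack([weighted_lasso(X, y, penalties)
                             for X, y in zip(X_list, y_list)])
        # Rows whose l1 norm already exceeds theta are no longer penalized at the next stage,
        # mimicking a capped-l1 (non-convex) regularizer across stages.
        row_l1 = np.abs(W).sum(axis=1)
        penalties = lam * (row_l1 < theta).astype(float)
    return W
```

Because the row-wise weighted ℓ1 penalty decouples across columns, each stage in this sketch reduces to independent weighted Lasso problems, one per task.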
SLAW: Scaled Loss Approximate Weighting for Efficient Multi-Task Learning
ArXiv, 2021
Multi-task learning (MTL) is a subfield of machine learning with important applications, but the multi-objective nature of optimization in MTL leads to difficulties in balancing training between tasks. The best MTL optimization methods require individually computing the gradient of each task’s loss function, which impedes scalability to a large number of tasks. In this paper, we propose Scaled Loss Approximate Weighting (SLAW), a method for multi-task optimization that matches the performance of the best existing methods while being much more efficient. SLAW balances learning between tasks by estimating the magnitude of each task’s gradient without performing any extra backward passes. We provide theoretical and empirical justification for SLAW’s estimation of gradient magnitudes. Experimental results on non-linear regression, multi-task computer vision, and virtual screening for drug discovery demonstrate that SLAW is significantly more efficient than strong baselines without sacrificing performance.
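A minimal sketch of the loss-balancing idea follows: per-task weights are derived from running statistics of the raw loss values, so no additional backward passes are needed. The class name, the moving-average estimator, and the hyperparameters are illustrative assumptions, not SLAW's exact rule:

```python
import numpy as np

class LossScaler:
    """Balance task losses via running estimates of their scale (illustrative, not SLAW's exact rule)."""

    def __init__(self, n_tasks, beta=0.99, eps=1e-8):
        self.mean = np.zeros(n_tasks)       # running mean of each task's loss
        self.sq_mean = np.zeros(n_tasks)    # running mean of each task's squared loss
        self.beta, self.eps = beta, eps

    def weights(self, losses):
        losses = np.asarray(losses, dtype=float)
        # Exponential moving statistics of the raw loss values (no extra backward passes).
        self.mean = self.beta * self.mean + (1 - self.beta) * losses
        self.sq_mean = self.beta * self.sq_mean + (1 - self.beta) * losses ** 2
        scale = np.sqrt(np.maximum(self.sq_mean - self.mean ** 2, 0.0)) + self.eps
        # Down-weight tasks whose losses fluctuate on a larger scale so no task dominates.
        w = 1.0 / scale
        return w * len(losses) / w.sum()    # normalize so weights sum to the number of tasks

# Usage sketch:
#   w = scaler.weights([l.item() for l in task_losses])
#   total_loss = sum(w_i * l_i for w_i, l_i in zip(w, task_losses))
```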
Learning with Whom to Share in Multi-task Feature Learning
2011
In multi-task learning (MTL), multiple tasks are learnt jointly. A major assumption for this paradigm is that all those tasks are indeed related so that the joint training is appropriate and beneficial. In this paper, we study the problem of multi-task learning of shared feature representations among tasks, while simultaneously determining “with whom” each task should share.
Clustered Multi-Task Learning Via Alternating Structure Optimization
Advances in neural information processing systems, 2011
Multi-task learning (MTL) learns multiple related tasks simultaneously to improve generalization performance. Alternating structure optimization (ASO) is a popular MTL method that learns a shared low-dimensional predictive structure on hypothesis spaces from multiple related tasks. It has been applied successfully in many real-world applications. As an alternative MTL approach, clustered multi-task learning (CMTL) assumes that multiple tasks follow a clustered structure, i.e., tasks are partitioned into a set of groups where tasks in the same group are similar to each other, and that such a clustered structure is unknown a priori. The objectives in ASO and CMTL differ in how multiple tasks are related. Interestingly, we show in this paper the equivalence relationship between ASO and CMTL, providing significant new insights into ASO and CMTL as well as their inherent relationship. The CMTL formulation is non-convex, and we adopt a convex relaxation to the CMTL formulation. We further...
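For intuition, a clustered multi-task objective of this kind is often written by penalizing each task's weight vector toward the mean of its (unknown) cluster; the following is a hedged sketch rather than the paper's exact convex relaxation:

```latex
\min_{W,\;\{\mathcal{I}_1,\dots,\mathcal{I}_k\}}\;
\sum_{t=1}^{T} \mathcal{L}(w_t; X_t, y_t)
\;+\; \lambda \sum_{c=1}^{k} \sum_{t \in \mathcal{I}_c} \bigl\lVert w_t - \bar{w}_c \bigr\rVert_2^2,
\qquad \bar{w}_c = \frac{1}{\lvert \mathcal{I}_c \rvert} \sum_{t \in \mathcal{I}_c} w_t
```

The non-convexity stems from the unknown assignments $\mathcal{I}_c$ (a k-means-like structure on the task weights), which is what the convex relaxation mentioned in the abstract addresses.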
Distributed variance regularized Multitask Learning
2016 International Joint Conference on Neural Networks (IJCNN), 2016
Past research on Multitask Learning (MTL) has focused mainly on devising adequate regularizers and less on their scalability. In this paper, we present a method to scale up MTL methods which penalize the variance of the task weight vectors. The method builds upon the alternating direction method of multipliers to decouple the variance regularizer. It can be efficiently implemented by a distributed algorithm, in which the tasks are first independently solved and subsequently corrected to pool information from other tasks. We show that the method works well in practice and converges in a few distributed iterations. Furthermore, we empirically observe that the number of iterations is nearly independent of the number of tasks, yielding a computational gain of O(T) over standard solvers. We also present experiments on a large URL classification dataset, which is challenging both in terms of volume of data points and dimensionality. Our results confirm that MTL can obtain superior performance over either learning a common model or independent task learning.
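For reference, the variance regularizer referred to above penalizes the spread of the task weight vectors around their mean; one common way to write the objective (a hedged sketch, not necessarily the paper's exact scaling) is:

```latex
\min_{w_1,\dots,w_T}\;
\sum_{t=1}^{T} \mathcal{L}(w_t; X_t, y_t)
\;+\; \frac{\lambda}{T} \sum_{t=1}^{T} \bigl\lVert w_t - \bar{w} \bigr\rVert_2^2,
\qquad \bar{w} = \frac{1}{T} \sum_{t=1}^{T} w_t
```

In a consensus-style ADMM split, each task minimizes its own loss plus a proximal term independently, and the averaging step then pools information across tasks, matching the solve-then-correct scheme described in the abstract.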
Heterogeneous multitask learning with joint sparsity constraints
2009
Multitask learning addresses the problem of learning related tasks that presumably share some commonalities on their input-output mapping functions. Previous approaches to multitask learning usually deal with homogeneous tasks, such as purely regression tasks, or entirely classification tasks. In this paper, we consider the problem of learning multiple related tasks of predicting both continuous and discrete outputs from a common set of input variables that lie in a high-dimensional feature space. All of the tasks are related in the sense that they share the same set of relevant input variables, but the amount of influence of each input on different outputs may vary. We formulate this problem as a combination of linear regressions and logistic regressions, and model the joint sparsity as the L1/L∞ or L1/L2 norm of the model parameters. Among several possible applications, our approach addresses an important open problem in genetic association mapping, where the goal is to discover genetic markers that influence multiple correlated traits jointly. In our experiments, we demonstrate our method in this setting, using simulated and clinical asthma datasets, and we show that our method can effectively recover the relevant inputs with respect to all of the tasks.
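The two joint-sparsity penalties named in the abstract can be written explicitly; with $W \in \mathbb{R}^{d \times T}$ holding one coefficient vector per task in its columns and $w^j$ denoting the $j$-th row (the row/column convention may be transposed in the paper):

```latex
\lVert W \rVert_{1,\infty} = \sum_{j=1}^{d} \max_{t} \lvert w_{jt} \rvert,
\qquad
\lVert W \rVert_{1,2} = \sum_{j=1}^{d} \lVert w^j \rVert_2
```

Both couple the tasks at the feature level: an input variable is either kept, possibly with different magnitudes, for all regression and classification outputs, or dropped for all of them.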
Adaptive weight assignment scheme for multi-task learning
IAES International Journal of Artificial Intelligence (IJ-AI), 2022
Deep learning-based models are used regularly in many applications nowadays. Generally, we train a single model on a single task. However, we can train multiple tasks on a single model under multi-task learning (MTL) settings. This provides many benefits, such as less training time, a single model serving multiple tasks, reduced overfitting, and improved performance. To train a model in multi-task learning settings, we need to sum the loss values from different tasks. In vanilla multi-task learning settings we assign equal weights, but since not all tasks are equally difficult, we need to allocate more weight to the more difficult tasks. Improper weight assignment also reduces the performance of the model. In this paper, we propose a simple weight assignment scheme that improves the performance of the model and puts more emphasis on difficult tasks. We tested our method's performance on both image and textual data and also compared it against two popular weight ...
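To make the loss-summation point concrete, the snippet below combines per-task losses with weights that emphasize tasks whose losses are improving most slowly, a common proxy for difficulty. The softmax-over-loss-ratio rule and the parameter names are illustrative assumptions, not the scheme proposed in the paper:

```python
import torch

def weighted_total_loss(task_losses, prev_losses, temperature=1.0):
    """Combine per-task losses with weights that emphasize harder tasks.

    task_losses : list of scalar loss tensors for the current step.
    prev_losses : list of recent average loss values (floats), used as a difficulty proxy.
    The softmax-over-relative-loss rule is an illustrative choice, not the paper's scheme.
    """
    current = torch.stack([l.detach() for l in task_losses])
    previous = torch.tensor(prev_losses)
    # Tasks whose loss has decreased least relative to their history get more weight.
    ratios = current / (previous + 1e-8)
    weights = torch.softmax(ratios / temperature, dim=0) * len(task_losses)
    return sum(w * l for w, l in zip(weights, task_losses))
```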
Exploiting Unrelated Tasks in Multi-Task Learning
2012
We study the problem of learning a group of principal tasks using a group of auxiliary tasks, unrelated to the principal ones. In many applications, joint learning of unrelated tasks which use the same input data can be beneficial. The reason is that prior knowledge about which tasks are unrelated can lead to sparser and more informative representations for each task, essentially screening out idiosyncrasies of the data distribution. We propose a novel method which builds on a prior multitask methodology by ...