Mixed-Transfer: Transfer Learning over Mixed Graphs
Related papers
A survey on heterogeneous transfer learning
Journal of Big Data, 2017
Transfer learning has been demonstrated to be effective for many real-world applications, as it exploits knowledge present in labeled training data from a source domain to enhance a model's performance in a target domain that has little or no labeled training data. Utilizing a labeled source, or auxiliary, domain to aid a target task can greatly reduce the cost and effort of collecting enough training labels to create an effective model for the new target distribution. Currently, most transfer learning methods assume the source and target domains share the same feature space, which greatly limits their applicability, because it may be difficult to collect auxiliary labeled source-domain data that shares the same feature space as the target domain. Recently, heterogeneous transfer learning methods have been developed to address this limitation. This, in effect, extends transfer learning to many other real-world tasks such as cross-language text categorization and text-to-image classification. Heterogeneous transfer learning is characterized by source and target domains with differing feature spaces, possibly compounded by other issues such as differing data distributions and label spaces. These can present significant challenges, as one must develop a method to bridge the feature spaces, data distributions, and other gaps that may be present in these cross-domain learning tasks. This paper contributes a comprehensive survey and analysis of current methods for heterogeneous transfer learning, providing an updated, centralized outlook on current methodologies.
Proceedings of the 18th ACM conference on …, 2009
Transfer learning is the task of leveraging the information from labeled examples in some domains to predict the labels of examples in another domain. It finds abundant practical applications, such as sentiment prediction, image classification, and network intrusion detection. In this paper, we propose a graph-based transfer learning framework. It propagates label information from the source domain to the target domain via an example-feature-example tripartite graph, and puts more emphasis on the labeled examples from the target domain via an example-example bipartite graph. Our framework is semi-supervised and nonparametric in nature, and thus more flexible. We also develop an iterative algorithm, with a convergence guarantee, so that the framework scales to large applications. Compared with existing transfer learning methods, the proposed framework propagates label information in a principled way, via the common features, both to the features irrelevant to the source domain and to the unlabeled examples in the target domain. Experimental results on three real data sets demonstrate the effectiveness of our algorithm.
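To make the propagation idea concrete, here is a minimal sketch of label spreading through shared features, assuming a common nonnegative example-by-feature representation; the function name and the simple fixed-point update are illustrative, not the paper's exact tripartite-plus-bipartite formulation:

```python
import numpy as np

def propagate_labels(X_src, y_src, X_tgt, alpha=0.9, iters=50):
    """Toy label propagation over an example-feature graph.

    X_src, X_tgt: nonnegative example-by-feature matrices over one
    shared feature space (the common features that bridge the domains).
    y_src: source labels in {-1, +1}. Returns soft scores for X_tgt.
    """
    X = np.vstack([X_src, X_tgt])                 # only source rows are labeled
    y = np.concatenate([y_src, np.zeros(len(X_tgt))])

    # Normalized bipartite weights: examples -> features -> examples.
    d_ex = X.sum(axis=1, keepdims=True) + 1e-12
    d_ft = X.sum(axis=0, keepdims=True) + 1e-12
    S = (X / np.sqrt(d_ex)) / np.sqrt(d_ft)
    W = S @ S.T                                   # example-example walk via features

    f = y.copy()
    for _ in range(iters):                        # converges for alpha < 1
        f = alpha * (W @ f) + (1 - alpha) * y
    return f[len(X_src):]
```

Because the walk passes through features shared by both domains, source labels reach target examples even when no direct source-target links exist, which is the intuition behind the tripartite construction.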
IEEE Transactions on Services Computing
Transfer learning aims to learn classifiers for a target domain by transferring knowledge from a source domain. However, due to two main issues, feature discrepancy and distribution divergence, transfer learning can be a very difficult problem in practice. In this paper, we present a framework called TLF that builds a classifier for a target domain with only a few labeled training records by transferring knowledge from a source domain with many labeled records. While existing methods often focus on one issue and leave the other for future work, TLF handles both simultaneously. In TLF, we alleviate feature discrepancy by identifying shared label distributions that act as pivots to bridge the domains. We handle distribution divergence by simultaneously optimizing the structural risk functional, the joint distributions between domains, and the manifold consistency underlying the marginal distributions. Moreover, for manifold consistency we exploit the data's intrinsic properties by identifying the k nearest neighbors of a record, where the value of k is determined automatically by TLF. Furthermore, since negative transfer is not desired, we consider only the source records that belong to the source pivots during knowledge transfer. We evaluate TLF on seven publicly available natural datasets and compare its performance against eleven state-of-the-art techniques. We also evaluate the effectiveness of TLF in some challenging situations. Our experimental results, including statistical sign test and Nemenyi test analyses, indicate a clear superiority of the proposed framework over the state-of-the-art techniques.
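As a sketch of the manifold-consistency ingredient alone (structural risk and distribution alignment omitted, and with k as an explicit parameter where TLF chooses it automatically), a kNN-graph Laplacian penalty looks like this:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def manifold_penalty(X, f, k=5):
    """Graph-smoothness penalty: sum over edges of w_ij * (f_i - f_j)^2.

    X: records, f: per-record predictions. A small value means the
    predictions vary smoothly across each record's k nearest neighbors.
    """
    W = kneighbors_graph(X, n_neighbors=k)       # sparse kNN adjacency
    W = 0.5 * (W + W.T)                          # symmetrize
    L = np.diag(np.asarray(W.sum(axis=1)).ravel()) - W.toarray()
    return float(f @ L @ f)
```

Adding such a term to a training objective discourages a classifier from splitting records that lie close together on the data manifold.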
Multi-transfer: Transfer learning with multiple views and multiple sources
Statistical Analysis and Data Mining, 2014
Transfer learning, which aims to help the learning task in a target domain by leveraging knowledge from auxiliary domains, has been demonstrated to be effective in different applications, e.g., text mining and sentiment analysis. In addition, in many real-world applications, auxiliary data are described from multiple perspectives and usually carried by multiple sources. For example, to help classify videos on YouTube, which offer three views/perspectives (image, voice, and subtitles), one may borrow data from Flickr, Last.FM, and Google News. Although any single instance in these auxiliary domains covers only part of the views available on YouTube, the pieces of information they carry can complement each other. In this paper, we define this transfer learning problem as Transfer Learning with Multiple Views and Multiple Sources. Since different sources may have different probability distributions, and different views may complement or be inconsistent with each other, merging all the data in a simplistic manner will not give optimal results. Thus, we propose a novel algorithm that leverages knowledge from different views and sources collaboratively, letting views from different sources complement each other through a co-training-style framework while correcting for the distribution differences across domains, as sketched below. We conduct empirical studies on several real-world datasets and show that the proposed approach can improve classification accuracy by up to 8% over different state-of-the-art baselines.
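A caricature of the co-training-style loop, assuming per-view feature matrices and scikit-learn classifiers (the source reweighting that corrects distribution differences is omitted):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cotrain(views_l, y, views_u, rounds=5, per_round=10):
    """One classifier per view; each round the views jointly label the
    unlabeled examples they are most confident about, and those
    pseudo-labeled examples move into the labeled pool."""
    X_l = [v.copy() for v in views_l]
    y_l = y.copy()
    X_u = [v.copy() for v in views_u]
    for _ in range(rounds):
        if len(X_u[0]) == 0:
            break
        # Average the views' probabilistic votes on the unlabeled pool.
        probs = [LogisticRegression(max_iter=1000).fit(Xl, y_l).predict_proba(Xu)
                 for Xl, Xu in zip(X_l, X_u)]
        avg = np.mean(probs, axis=0)
        pick = np.argsort(-avg.max(axis=1))[:per_round]   # most confident
        X_l = [np.vstack([Xl, Xu[pick]]) for Xl, Xu in zip(X_l, X_u)]
        y_l = np.concatenate([y_l, avg[pick].argmax(axis=1)])
        keep = np.setdiff1d(np.arange(len(avg)), pick)
        X_u = [Xu[keep] for Xu in X_u]
    return X_l, y_l
```

The intended property is that a view missing or weak in one source can be compensated by confident predictions coming from another.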
Asymmetric Heterogeneous Transfer Learning: A Survey
Proceedings of the 6th International Conference on Data Science, Technology and Applications
One of the main prerequisites in most machine learning and data mining tasks is that all available data originates from the same domain. In practice, we often cannot meet this requirement due to poor data quality, unavailable data, or missing data attributes (e.g., the cold-start problem for a new task). A possible solution is to combine data from different domains, represented by different feature spaces, that relate to the same task. We can also transfer knowledge from a different but related task that has already been learned. Such a solution is called transfer learning, and it is very helpful in cases where collecting data is expensive, difficult, or impossible. This overview focuses on current progress in a new and distinctive area of transfer learning: asymmetric heterogeneous transfer learning. This type of transfer learning considers the same task solved using data from different feature spaces. Through suitable mappings between these feature spaces we can obtain more data for solving data mining tasks. We discuss approaches and methods for solving this type of transfer learning task. Furthermore, we cover the most commonly used metrics and the possibility of using metric or similarity learning.
Transfer learning on heterogeneous feature spaces via spectral transformation
IEEE International Conference on Data Mining (ICDM), 2010
Labeled examples are often expensive and time-consuming to obtain. One practically important problem is: can labeled data from other related sources help predict the target task, even when they have (a) different feature spaces (e.g., image vs. text data), (b) different data distributions, and (c) different output spaces? This paper proposes a solution and discusses the conditions under which it is possible and highly likely to produce better results. The method first uses spectral embedding to unify the different feature spaces of the target and source data sets, even when they are completely different. The principle is to cast this into an optimization objective that preserves the original structure of the data while maximizing the similarity between the two domains. Second, a judicious sample-selection strategy is applied to select only related source examples. Finally, a Bayesian approach is applied to model the relationship between the different output spaces. Together, these three steps bridge related heterogeneous sources in order to learn the target task. For example, among the 12 experimental data sets, images with wavelet-transform features are used to predict another set of images whose features are constructed from a color-histogram space. By using examples extracted from heterogeneous sources, the models can reduce the error rate by as much as 50% compared with methods using only examples from the target task.
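A simplified stand-in for the spectral step, assuming each domain is summarized by a symmetric affinity matrix: embed each domain via the bottom nontrivial eigenvectors of its normalized graph Laplacian, giving both domains coordinates of the same dimensionality (the paper's objective additionally couples the embeddings so the domains align, which this sketch omits):

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import laplacian

def spectral_embed(K, dim=10):
    """Laplacian-eigenmap embedding from a symmetric affinity matrix K."""
    L = laplacian(K, normed=True)
    vals, vecs = eigh(L)           # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]      # skip the trivial constant eigenvector

# Usage with hypothetical kernels for two heterogeneous domains:
#   Z_img = spectral_embed(K_image)   # e.g., wavelet-feature affinities
#   Z_txt = spectral_embed(K_text)    # e.g., color-histogram affinities
# Both now live in R^dim, so one classifier can serve both after alignment.
```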
Heterogeneous transfer learning techniques for machine learning
Iran Journal of Computer Science, 2018
The main assumption in machine learning and data mining is that the training data and the future data have the same distribution and the same features. However, in many real-world applications, this assumption may not hold. For example, we may have a classification task in one domain of interest but sufficient training data only in another domain. In heterogeneous transfer learning, a model is trained on data from one domain and tested on another. When the transfer is successful, it can significantly improve performance by avoiding expensive labeling effort. Over the past few years, transfer learning has emerged as a new learning framework to address this issue, and heterogeneous transfer learning has been among its most active research areas in recent years. In this study, we discuss the relationship between heterogeneous transfer learning and other machine learning settings, including domain adaptation, multi-task learning, sample selection bias, and covariate shift. We also explore some main challenges for future work in heterogeneous transfer learning.
Graph Transfer Learning via Adversarial Domain Adaptation with Graph Convolution
IEEE Transactions on Knowledge and Data Engineering, 2022
This paper studies the problem of cross-network node classification to overcome the insufficiency of labeled data in a single network. It aims to leverage the label information in a partially labeled source network to assist node classification in a completely unlabeled or partially labeled target network. Existing methods for single-network learning cannot solve this problem due to the domain shift across networks. Some multi-network learning methods rely heavily on the existence of cross-network connections and are thus inapplicable to this problem. To tackle the problem, we propose a novel graph transfer learning framework, AdaGCN, that leverages the techniques of adversarial domain adaptation and graph convolution. It consists of two components: a semi-supervised learning component and an adversarial domain adaptation component. The former learns class-discriminative node representations from the given label information of the source and target networks, while the latter adversarially reduces the distribution shift between the two networks so that the learned representations transfer across them.
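The two components can be sketched in a few lines of PyTorch, assuming hypothetical layer sizes; gradient reversal is the standard mechanism behind adversarial domain adaptation, though AdaGCN's exact architecture may differ:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass, negated gradient on the backward
    pass: the encoder is pushed toward domain-invariant features."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class GCNLayer(nn.Module):
    """One graph convolution: relu(A_hat @ X @ W), with A_hat the
    symmetrically normalized adjacency including self-loops."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, A_hat, X):
        return torch.relu(self.lin(A_hat @ X))

encoder = GCNLayer(128, 64)          # hypothetical sizes
classifier = nn.Linear(64, 7)        # class logits (semi-supervised part)
discriminator = nn.Linear(64, 2)     # source-vs-target logits (adversarial part)

def forward_pass(A_hat, X, lam=1.0):
    H = encoder(A_hat, X)
    return classifier(H), discriminator(GradReverse.apply(H, lam))
```

Training minimizes classification loss on labeled nodes while the discriminator, fed through the reversal layer, pushes source and target embeddings to be indistinguishable.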
Regularization for Graph-Based Transfer Learning Text Classification
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2019
In machine learning classification problems, it is common to assume that the train and test sets follow a similar underlying distribution. When this is not true, the problem can be seen as one of transfer learning. Sometimes a set of already trained source classification models is available. This work focuses on how best to use these models as an ensemble of classifiers on data from a new but related domain, assuming the data used to train those models is not available. This scenario is common in distributed systems, where sharing all data is technically difficult or raises privacy concerns. Most current solutions are graph-based methods that propagate the labels given by the models into the new-domain data. We propose a regularization method for one of the best current solutions, OAC³, that improves accuracy on several text classification datasets.
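The propagation-plus-prior idea can be sketched in closed form, assuming a normalized affinity matrix over the target data and soft predictions from the shared source models; the alpha trade-off stands in for regularization generically and is not the specific OAC³ regularizer proposed in the paper:

```python
import numpy as np

def regularized_propagation(S, model_preds, alpha=0.8):
    """Label spreading seeded by an ensemble of pre-trained models.

    S: (n, n) symmetrically normalized affinity matrix of target data.
    model_preds: list of (n, c) soft predictions from source models
    (the only shared artifact; their training data stays private).
    """
    Y = np.mean(model_preds, axis=0)                    # ensemble prior
    n = S.shape[0]
    # Closed-form fixed point of F = alpha*S*F + (1-alpha)*Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, (1 - alpha) * Y)
    return F.argmax(axis=1)
```

Smaller alpha trusts the source models more; larger alpha trusts the geometry of the new domain's data more.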
Ontology-Driven Cross-Domain Transfer Learning
Frontiers in Artificial Intelligence and Applications, 2020
The aim of transfer learning is to reuse learned knowledge across different contexts. In the particular case of cross-domain transfer (also known as domain adaptation), reuse happens across different but related knowledge domains. While there have been promising first results in combining learning with symbolic knowledge to improve cross-domain transfer, the singular ability of ontologies to provide classificatory knowledge has not yet been fully exploited by the machine learning community. We show that ontologies, if properly designed, are able to support transfer learning by improving generalization and discrimination across classes. We propose an architecture based on direct attribute prediction for combining ontologies with a transfer learning framework, as well as an ontology-based solution for cross-domain generalization based on the integration of top-level and domain ontologies. We validate the solution in an experiment on an image classification task, demonstrating the benefit of ontological knowledge for cross-domain generalization.
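Direct attribute prediction can be sketched as follows, assuming a flat binary class-attribute matrix as the ontological signal (illustrative names; the paper integrates top-level and domain ontologies rather than a flat matrix):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def dap_scores(X_train, A_train, X_test, class_attr):
    """Score classes by matching predicted attributes to class signatures.

    A_train: (n, m) binary attribute labels for training images.
    class_attr: (c, m) binary matrix of which attributes the ontology
    asserts for each class, including classes unseen in training.
    """
    m = A_train.shape[1]
    # One probabilistic classifier per attribute.
    p = np.column_stack([
        LogisticRegression(max_iter=1000)
        .fit(X_train, A_train[:, j])
        .predict_proba(X_test)[:, 1]
        for j in range(m)
    ])
    # Agreement between predicted attributes and each class signature.
    return p @ class_attr.T + (1 - p) @ (1 - class_attr.T)
```

Because class signatures come from the ontology rather than from training data, the same attribute predictors can generalize to classes in a new domain.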