Ensemble Relational Learning based on Selective Propositionalization
Related papers
A taxonomy of weight learning methods for statistical relational learning
Machine Learning
Statistical relational learning (SRL) frameworks are effective at defining probabilistic models over complex relational data. They often use weighted first-order logical rules where the weights of the rules govern probabilistic interactions and are usually learned from data. Existing weight learning approaches typically attempt to learn a set of weights that maximizes some function of data likelihood; however, this does not always translate to optimal performance on a desired domain metric, such as accuracy or F1 score. In this paper, we introduce a taxonomy of search-based weight learning approaches for SRL frameworks that directly optimize weights on a chosen domain performance metric. To effectively apply these search-based approaches, we introduce a novel projection, referred to as scaled space (SS), that is an accurate representation of the true weight space. We show that SS removes redundancies in the weight space and captures the semantic distance between the possible weight ...
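The search-based idea in this abstract can be illustrated with a toy sketch (all data and names hypothetical, not from the paper): instead of maximizing likelihood, rule weights are tuned by random search directly on the domain metric, here F1.

```python
import random

# Toy setup: each "rule" either fires (1) or not (0) on an example;
# a weighted vote over fired rules, thresholded, yields a prediction.
rules_fired = [
    [1, 0, 1],  # example 1: rules 0 and 2 fire
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]
labels = [1, 1, 0, 0]

def predict(weights, fired):
    score = sum(w * f for w, f in zip(weights, fired))
    return 1 if score >= 1.0 else 0

def f1(weights):
    preds = [predict(weights, f) for f in rules_fired]
    tp = sum(p == y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

random.seed(0)
best_w, best_f1 = None, -1.0
for _ in range(200):  # random search, scored directly on F1
    w = [random.uniform(0, 2) for _ in range(3)]
    s = f1(w)
    if s > best_f1:
        best_w, best_f1 = w, s
```

The paper's contribution is a better-behaved search space (the "scaled space" projection) for exactly this kind of metric-driven search; the random sampler above is only a stand-in for any search strategy.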
Relational Ensemble Classification
Sixth International Conference on Data Mining (ICDM'06), 2006
Relational classification aims at including relations among entities, for example taking relations between documents such as a common author or citations into account. Moreover, considering more than one relation can further improve classification accuracy.
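The core combination step can be sketched minimally (all names and data hypothetical): one classifier is trained per relation, and their per-document predictions are merged by majority vote.

```python
# One classifier per relation (e.g. co-author links, citation links),
# combined by majority vote over their predictions for each document.
def vote(predictions):
    """Majority vote over per-relation predictions for one document."""
    return max(set(predictions), key=predictions.count)

per_relation_preds = {
    "doc1": ["AI", "AI", "DB"],  # predictions from 3 relation-specific models
    "doc2": ["DB", "DB", "DB"],
}
ensemble = {doc: vote(preds) for doc, preds in per_relation_preds.items()}
```

A weighted vote (weighting each relation by its validation accuracy) is a natural refinement of the same scheme.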
Statistical relational learning: Four claims and a survey
2003
Statistical relational learning (SRL) research has made significant progress over the last 5 years. We have successfully demonstrated the feasibility of a number of probabilistic models for relational data, including probabilistic relational models, Bayesian logic programs, and relational probability trees, and the interest in SRL is growing. However, in order to sustain and nurture the growth of SRL as a subfield we need to refocus our efforts on the science of machine learning—moving from demonstrations to comparative and ablation studies.
Structure Learning for Relational Logistic Regression: An Ensemble Approach
2018
We consider the problem of learning Relational Logistic Regression (RLR). Unlike standard logistic regression, the features of RLRs are first-order formulae with associated weight vectors instead of scalar weights. We turn the problem of learning RLR to learning these vector-weighted formulae and develop a learning algorithm based on the recently successful functional-gradient boosting methods for probabilistic logic models. We derive the functional gradients and show how weights can be learned simultaneously in an efficient manner. Our empirical evaluation on standard and novel data sets demonstrates the superiority of our approach over other methods for learning RLR.
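The functional-gradient boosting loop described here can be sketched in a propositional stand-in (toy data, not the paper's RLR formulation): each round fits a weak regressor to the pointwise gradient of the log-likelihood, y − P(y=1|x), and adds it to the model.

```python
import math

# Minimal functional-gradient boosting sketch for a logistic model.
X = [0.0, 1.0, 2.0, 3.0]
y = [0, 0, 1, 1]

def sigmoid(f):
    return 1.0 / (1.0 + math.exp(-f))

F = [0.0] * len(X)  # additive model, starts at zero
for _ in range(20):
    # pointwise functional gradient of the log-likelihood: y - P(y=1|x)
    grad = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
    # fit a depth-1 regression "stump" to the gradient: best threshold
    # split, predicting the mean gradient on each side
    best = None
    for t in X:
        left = [g for xi, g in zip(X, grad) if xi <= t]
        right = [g for xi, g in zip(X, grad) if xi > t]
        lv = sum(left) / len(left) if left else 0.0
        rv = sum(right) / len(right) if right else 0.0
        err = sum((g - (lv if xi <= t else rv)) ** 2
                  for xi, g in zip(X, grad))
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    F = [fi + (lv if xi <= t else rv) for xi, fi in zip(X, F)]

probs = [sigmoid(fi) for fi in F]
```

In the paper the weak learners are first-order formulae with vector weights rather than threshold stumps, but the gradient-fitting loop has the same shape.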
Stochastic Propositionalization for Efficient Multi-relational Learning
Lecture Notes in Computer Science, 2008
The efficiency of multi-relational data mining algorithms, addressing the problem of learning First Order Logic (FOL) theories, strongly depends on the search method used for exploring the hypothesis space and on the coverage test assessing the validity of the learned theory against the training examples. A way of tackling the complexity of this kind of learning system is to use a propositional method that reformulates a multi-relational learning problem into an attribute-value one. We propose a population-based algorithm that, using a stochastic propositionalization method, efficiently learns complete FOL definitions.
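The reformulation step can be illustrated with a toy sketch (all relations and names hypothetical): randomly sampled relational patterns become boolean columns, turning the multi-relational problem into an ordinary attribute-value table.

```python
import random

# Toy relational data: author_of(paper, author) and cites(paper, paper).
author_of = {("p1", "alice"), ("p2", "alice"), ("p2", "bob"), ("p3", "bob")}
cites = {("p2", "p1"), ("p3", "p2")}

def make_feature(author, via_citation):
    """A sampled relational pattern, turned into a boolean test on a paper."""
    def test(paper):
        if via_citation:
            # does `paper` cite some paper written by `author`?
            return any((paper, q) in cites and (q, author) in author_of
                       for q, _ in author_of)
        return (paper, author) in author_of
    return test

random.seed(1)
authors = ["alice", "bob"]
features = [make_feature(random.choice(authors), random.random() < 0.5)
            for _ in range(4)]  # stochastic sampling of patterns

# propositional (attribute-value) table: one boolean column per feature
table = {p: [int(f(p)) for f in features] for p in ["p1", "p2", "p3"]}
```

A population-based search, as in the paper, would iteratively keep and recombine the sampled patterns that discriminate best on the training examples.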
Statistical Relational Learning: A State-Of-The-Art Review
Journal of Engineering and Technology, 2019
The objective of this paper is to review the state-of-the-art of statistical relational learning (SRL) models developed to deal with machine learning and data mining in relational domains in presence of missing, partially observed, and/or noisy data. It starts by giving a general overview of conventional graphical models, first-order logic and inductive logic programming approaches as needed for background. The historical development of each SRL key model is critically reviewed. The study also focuses on the practical application of SRL techniques to a broad variety of areas and their limitations.
Approximate Relational Reasoning by Stochastic Propositionalization
Studies in Computational Intelligence, 2010
For many real-world applications it is important to choose the right representation language. While the setting of First Order Logic (FOL) is the most suitable one to model the multi-relational data of real and complex domains, it raises the question of the computational complexity of the knowledge induction process. A way of tackling the complexity of such real domains, in which many relationships are required to model the objects involved, is to use a method that reformulates a multi-relational learning task into an attribute-value one. In this chapter we present an approximate reasoning method able to keep the complexity of a relational problem low by using a stochastic inference procedure. The complexity of the relational language is decreased by means of a propositionalization technique, while the NP-completeness of the deduction is tackled using an approximate query evaluation. The proposed approximate reasoning technique has been used to solve the problem of relational rule induction as well as the task of relational clustering. An anytime algorithm has been used for the induction, implemented by a population-based method, able to efficiently extract knowledge from relational data, while the clustering task, both unsupervised and supervised, has been solved using a Partitioning Around Medoids (PAM) clustering algorithm. The validity of the proposed techniques has been demonstrated through an empirical evaluation on real-world datasets.
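The PAM step mentioned in the abstract can be sketched on toy 1-D data (standing in for a relational distance between first-order descriptions; all values hypothetical): starting from arbitrary medoids, every medoid/non-medoid swap is tried, and a swap is kept whenever it lowers the total assignment cost.

```python
# Minimal PAM (Partitioning Around Medoids) sketch on toy 1-D data.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]

def dist(a, b):
    return abs(a - b)

def total_cost(medoids):
    # each point pays the distance to its nearest medoid
    return sum(min(dist(p, m) for m in medoids) for p in points)

medoids = [points[0], points[1]]  # arbitrary initial medoids, k = 2
improved = True
while improved:
    improved = False
    for i in range(len(medoids)):
        for p in points:
            if p in medoids:
                continue
            candidate = medoids[:i] + [p] + medoids[i + 1:]
            if total_cost(candidate) < total_cost(medoids):
                medoids, improved = candidate, True

# assign each point to its nearest medoid
clusters = {m: [p for p in points
                if min(medoids, key=lambda m2: dist(p, m2)) == m]
            for m in medoids}
```

PAM only needs pairwise distances, never coordinates, which is what makes it suitable for relational objects where only a (possibly approximate) distance is available.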
An Optimized Approach for Feature Extraction in Multi-Relational Statistical Learning
Journal of Scientific and Industrial Research (JSIR), 2021
Features derived from relational data are often used to enhance the predictions of statistical models, and the number of candidate features grows as the feature space grows. We propose a framework that generates features for feature selection using a support vector machine, with (1) augmentation of relational concepts using a classification-type approach and (2) various strategies for generating features. Classification is used to enrich the feature space by adding new techniques for creating features, leading to improved model accuracy. Generating features at run time leads to models with higher accuracy than generating all features in advance. Our results across different data mining applications and relations substantially improve on existing results.
A Toolbox for Learning from Relational Data with Propositional and Multi-instance Learners
Lecture Notes in Computer Science, 2004
Most databases employ the relational model for data storage. To use this data in a propositional learner, a propositionalization step has to take place. Similarly, the data has to be transformed to be amenable to a multi-instance learner. The Proper Toolbox contains an extended version of RELAGGS, the Multi-Instance Learning Kit MILK, and can also combine the multi-instance data with aggregated data from RELAGGS. RELAGGS was extended to handle arbitrarily nested relations and to work with both primary keys and indices. For MILK the relational model is flattened into a single table and this data is fed into a multi-instance learner. REMILK finally combines the aggregated data produced by RELAGGS and the multi-instance data, flattened for MILK, into a single table that is once again the input for a multi-instance learner. Several well-known datasets are used for experiments which highlight the strengths and weaknesses of the different approaches.
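The RELAGGS-style aggregation described here can be sketched minimally (table and column names hypothetical): each one-to-many relation is summarized into fixed-width aggregate columns, yielding a single table that any propositional learner can consume.

```python
from statistics import mean

# Toy one-to-many relation: each customer has several order amounts.
orders = {
    "c1": [10.0, 20.0, 30.0],
    "c2": [5.0],
}

def aggregate(rows):
    """Summarize a variable-length set of rows into fixed-width features."""
    return {
        "n_orders": len(rows),
        "total": sum(rows),
        "mean": mean(rows),
        "max": max(rows),
    }

# single flat table, one row per customer: aggregation-based
# propositionalization as performed by RELAGGS
flat_table = {cust: aggregate(rows) for cust, rows in orders.items()}
```

The multi-instance route taken by MILK keeps the individual order rows as a bag per customer instead of aggregating them; REMILK, as the abstract notes, joins both representations into one table.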