Data-Driven Constructive Induction

DATA-DRIVEN CONSTRUCTIVE INDUCTION: A Methodology and Its Applications

1998

The presented methodology concerns constructive induction, viewed generally as a process combining two intertwined searches: first for the "best" representation space, and second for the "best" hypothesis in that space. The first search employs a range of operators for improving the initial representation space, such as operators for generating new attributes, selecting the best attributes among the given ones, and abstracting attributes. In the methodology presented, these operators are chosen on the basis of an analysis of the training data, hence the term data-driven. The second search applies AQ-type rule learning to the examples projected at each iteration onto the newly modified representation space. The aim of the search is to determine a generalized description of examples that optimizes a task-oriented multicriterion evaluation function. The two searches are intertwined, as they are executed in a loop in which one feeds into the other. Experimental applications of the methodology to text categorization and natural scene interpretation demonstrate the significant practical utility of the proposed methodology.
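The loop described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `one_rule` is a trivial single-attribute learner standing in for AQ-type rule learning, `expand_space` is a hypothetical operator that adds pairwise equality attributes, and plain training accuracy replaces the multicriterion evaluation function.

```python
from collections import Counter, defaultdict

def one_rule(rows, labels, attrs):
    """Stand-in learner (not AQ): best single-attribute lookup rule."""
    best = None
    for a in attrs:
        table = defaultdict(Counter)
        for row, y in zip(rows, labels):
            table[row[a]][y] += 1
        # Map each attribute value to its majority class.
        mapping = {v: c.most_common(1)[0][0] for v, c in table.items()}
        acc = sum(mapping[row[a]] == y for row, y in zip(rows, labels)) / len(labels)
        if best is None or acc > best[0]:
            best = (acc, a, mapping)
    return best

def expand_space(rows, attrs):
    """Hypothetical representation operator: one equality attribute per pair."""
    new_attrs = list(attrs)
    new_rows = [dict(row) for row in rows]  # copy, so the input is not mutated
    for i, a in enumerate(attrs):
        for b in attrs[i + 1:]:
            name = f"({a}=={b})"
            if name not in new_attrs:
                new_attrs.append(name)
                for row in new_rows:
                    row[name] = row[a] == row[b]
    return new_rows, new_attrs

def constructive_induction(rows, labels, attrs, max_iters=3, target_acc=1.0):
    """Alternate hypothesis search and representation-space search."""
    acc, attr, mapping = one_rule(rows, labels, attrs)       # hypothesis search
    for _ in range(max_iters):
        if acc >= target_acc:
            break
        rows, attrs = expand_space(rows, attrs)              # representation search
        acc, attr, mapping = one_rule(rows, labels, attrs)   # hypothesis search again
    return acc, attr, mapping

# A concept that no single original attribute can express.
rows = [{"x": 0, "y": 0}, {"x": 0, "y": 1}, {"x": 1, "y": 0}, {"x": 1, "y": 1}]
labels = ["same", "diff", "diff", "same"]
acc, attr, mapping = constructive_induction(rows, labels, ["x", "y"])
print(acc, attr, mapping)
```

In this toy case neither `x` nor `y` alone separates the classes, so the loop changes the representation space once and the learner then picks the constructed attribute.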

A Practical Approach for Knowledge-Driven Constructive Induction

Citeseer

Learning problems can be difficult for many reasons; one of them is an inadequate representation space or description language. Features can be considered as a representational language; when this language contains more features than necessary, subset selection helps ...

A Relevancy Filter for Constructive Induction

IEEE Expert / IEEE Intelligent Systems, 1998

... algorithms enable the learner to extend its vocabulary with new terms if, for a given set of training examples, the learner's vocabulary is too restricted to solve the learning task. We propose a filter that selects potentially relevant terms from the set of constructed terms and eliminates terms that are irrelevant for the learning task. Restricting constructive induction (or predicate invention) to relevant terms allows a much larger space of constructed terms to be explored. The elimination of irrelevant terms is especially well-suited for learners of large time or space complexity, such as genetic algorithms and artificial neural networks.
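A filter of this kind might look like the sketch below. The abstract does not state the irrelevance criterion, so the one used here is an assumption: a Boolean term is dropped when some other term is true on at least the same positive examples and false on at least the same negative examples. The function names and the toy data are hypothetical.

```python
def coverage(values, labels):
    """Return (positives where the term is true, negatives where it is false)."""
    pairs = list(enumerate(zip(values, labels)))
    pos_true = frozenset(i for i, (v, y) in pairs if y and v)
    neg_false = frozenset(i for i, (v, y) in pairs if not y and not v)
    return pos_true, neg_false

def relevancy_filter(terms, labels):
    """Keep only terms that no other term covers (assumed criterion)."""
    cov = {name: coverage(vals, labels) for name, vals in terms.items()}
    kept = []
    for name, (pos, neg) in cov.items():
        dominated = False
        for other, (pos_o, neg_o) in cov.items():
            if other == name:
                continue
            if pos <= pos_o and neg <= neg_o:
                # Exact duplicates: keep only the first one listed.
                strictly_better = (pos, neg) != (pos_o, neg_o)
                if strictly_better or other in kept:
                    dominated = True
                    break
        if not dominated:
            kept.append(name)
    return kept

labels = [True, True, False, False]        # two positive, two negative examples
terms = {
    "t1": [True, False, False, True],      # covered by t3
    "t2": [True, True, True, False],
    "t3": [True, False, False, False],
}
print(relevancy_filter(terms, labels))
```

Here `t1` is removed because `t3` is true on the same positive example and false on more negatives, while `t2` and `t3` each discriminate something the other does not.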

Data-driven constructive induction in AQ17-PRE: A method and experiments

IEEE Transactions on Applications and Industry, 1991

A method is presented for constructive induction, in which new attributes are constructed as various functions of original attributes. Such a method is called data-driven constructive induction, because new attributes are derived from an analysis of the data (examples) rather than from the generated rules. Attribute construction and rule generation are repeated until a termination condition, such as the satisfaction of a rule quality measure, is met. The first step of this method, the generation of new attributes, has been implemented in AQ17-PRE. Initial experiments with AQ17-PRE have shown that it leads to an improvement of the learned rules in terms of both their simplicity and their accuracy on testing examples.

Multistrategy Constructive Induction

This paper presents a method for multistrategy constructive induction that integrates two inferential learning strategies (empirical induction and deduction) and two computational methods (data-driven and hypothesis-driven). The method generates inductive hypotheses in an iteratively modified representation space. The operators modifying the representation space are classified into "constructors," which expand the space (by generating additional attributes) and "destructors," which contract the space (by removing low relevance attributes or abstracting attribute values). Constructors generate new dimensions (attributes) by analyzing original or transformed examples (data-driven) and by analyzing the rules obtained in the previous iteration (hypothesis-driven). Destructors detect the irrelevant components of the representation space by rule-based inference or statistical analysis. The method has been implemented in the AQ17-MCI program. The preliminary results from applying...

CONSTRUCTIVE INDUCTION FROM DATA IN AQ17-DCI: Further Experiments

1991

This paper presents a method for data-driven constructive induction, which generates new problem-oriented attributes by combining the original attributes according to a variety of heuristic rules. The combinations of attributes are defined by different logical and/or mathematical operators, thus producing a potentially very large space of features. This space is reduced by applying an "attribute quality" evaluation function which selects the "best" set of features. The data, enhanced with the new attributes, are used to generate rules, which are then evaluated by a "rule quality" function. Attribute construction and rule generation are repeated until a termination condition is satisfied. Attributes produced by the method often represent meaningful and useful concepts. The program implementing the method, AQ17-DCI, has been experimentally applied to a number of problems and produces very satisfactory results. These results are comparable to those of the best existing machine learning methods.
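The construct-then-select step described above can be sketched as follows. The operator set, the attribute naming scheme, and the use of information gain as the "attribute quality" function are all assumptions chosen for illustration; the abstract does not specify which operators or quality measure AQ17-DCI uses.

```python
from collections import Counter
from itertools import combinations
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_gain(values, labels):
    """Information gain of a discrete attribute (assumed quality measure)."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical logical and mathematical operators for combining two attributes.
OPERATORS = {
    "eq": lambda a, b: a == b,
    "gt": lambda a, b: a > b,
    "sum": lambda a, b: a + b,
    "prod": lambda a, b: a * b,
}

def construct_attributes(rows, names):
    """Apply every operator to every attribute pair; return name -> values."""
    candidates = {}
    for a, b in combinations(names, 2):
        for op_name, op in OPERATORS.items():
            candidates[f"{op_name}({a},{b})"] = [op(r[a], r[b]) for r in rows]
    return candidates

def select_best(candidates, labels, k):
    """Keep the k candidate attributes with the highest quality score."""
    ranked = sorted(candidates, key=lambda n: info_gain(candidates[n], labels),
                    reverse=True)
    return ranked[:k]

# Toy data: the class is True exactly when x equals y.
rows = [{"x": x, "y": y} for x in range(3) for y in range(3)]
labels = [r["x"] == r["y"] for r in rows]
candidates = construct_attributes(rows, ["x", "y"])
print(select_best(candidates, labels, k=2))
```

With more than a handful of attributes and operators the candidate space grows quickly, which is why a selection step of some kind is needed before rule generation.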

Principled constructive induction

1989

A framework for the construction of new features for hard classification tasks is discussed. The approach brings together ideas from the fields of machine learning, computational geometry, and pattern recognition. Two heuristics for evaluation of newly-constructed features are proposed, and their statistical significance verified. Finally, it is shown how the proposed framework can be used to combine techniques for selection of representative examples with techniques for construction of new features, in order to solve difficult problems in learning from examples.

Multistrategy Constructive Induction: AQ17-MCI

1993

This paper presents a method for multistrategy constructive induction that integrates two inferential learning strategies (empirical induction and deduction) and two computational methods (data-driven and hypothesis-driven). The method generates inductive hypotheses in an iteratively modified representation space. The operators modifying the representation space are classified into "constructors," which expand the space (by generating additional attributes) and "destructors," which contract the space (by removing low relevance attributes or abstracting attribute values). Constructors generate new dimensions (attributes) by analyzing original or transformed examples (data-driven) and by analyzing the rules obtained in the previous iteration (hypothesis-driven). Destructors detect the irrelevant components of the representation space by rule-based inference or statistical analysis. The method has been implemented in the AQ17-MCI program. The preliminary results from ap...

Constructive induction on decision trees

Proceedings of the Eleventh International Joint …, 1989

Selective induction techniques perform poorly when the features are inappropriate for the target concept. One solution is to have the learning system construct new features automatically; unfortunately feature construction is a difficult and poorly understood problem. In this paper we present a definition of feature construction in concept learning, and offer a framework for its study based on four aspects: detection, selection, generalization, and evaluation. This framework is used in the analysis of existing learning systems and as the basis for the design of a new system, CITRE. CITRE performs feature construction using decision trees and simple domain knowledge as constructive biases. Initial results on a set of spatial-dependent problems suggest the importance of domain knowledge and feature generalization, i.e., constructive induction.

RULES-5: a rule induction algorithm for classification problems involving continuous attributes

Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, 2003

This paper presents RULES-5, a new induction algorithm for effectively handling problems involving continuous attributes. RULES-5 is a 'covering' algorithm that extracts IF-THEN rules from examples presented to it. The paper first reviews existing methods of rule extraction and dealing with continuous attributes. It then describes the techniques adopted for RULES-5 and gives a step-by-step example to illustrate their operation. The paper finally gives the results of applying RULES-5 and other algorithms to benchmark problems. These clearly show that RULES-5 generates rule sets that are more accurate than those produced by its immediate predecessor RULES-3 Plus and by a well-known commercially available divide-and-conquer machine learning algorithm.
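For readers unfamiliar with the term, a generic covering loop can be sketched as below. This is not RULES-5: it handles only discrete attributes, and the greedy scoring (precision first, then number of target examples matched) is an assumption. The function names and the toy data are hypothetical.

```python
def covers(rule, example):
    """A rule is a dict of attribute -> required value (a conjunction)."""
    return all(example[a] == v for a, v in rule.items())

def grow_rule(examples, target):
    """Greedily add conditions until only target-class examples are covered."""
    rule, covered = {}, list(examples)
    while any(e["class"] != target for e in covered):
        best, best_score = None, (-1.0, 0)
        for seed in (e for e in covered if e["class"] == target):
            for attr, value in seed.items():
                if attr == "class" or attr in rule:
                    continue
                matched = [e for e in covered if e[attr] == value]
                hits = sum(e["class"] == target for e in matched)
                score = (hits / len(matched), hits)
                if score > best_score:
                    best, best_score = (attr, value), score
        if best is None:      # no attribute left to test (inconsistent data)
            break
        rule[best[0]] = best[1]
        covered = [e for e in covered if covers(rule, e)]
    return rule

def learn_rules(examples, target):
    """Covering loop: learn a rule, remove the positives it covers, repeat."""
    remaining = [e for e in examples if e["class"] == target]
    negatives = [e for e in examples if e["class"] != target]
    rules = []
    while remaining:
        rule = grow_rule(remaining + negatives, target)
        rules.append(rule)
        remaining = [e for e in remaining if not covers(rule, e)]
    return rules

examples = [
    {"outlook": "sunny", "windy": "no", "class": "play"},
    {"outlook": "sunny", "windy": "yes", "class": "stay"},
    {"outlook": "rainy", "windy": "no", "class": "stay"},
    {"outlook": "overcast", "windy": "yes", "class": "play"},
    {"outlook": "overcast", "windy": "no", "class": "play"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
]
for rule in learn_rules(examples, "play"):
    conditions = " AND ".join(f"{a} = {v}" for a, v in rule.items())
    print(f"IF {conditions} THEN play")
```

Each pass through the outer loop yields one IF-THEN rule, and the examples it explains are set aside before the next rule is grown, which is what distinguishes covering from divide-and-conquer tree induction.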