Obtaining best parameter values for accurate classification

The effect of threshold values on association rule based classification accuracy

2007

Classification Association Rule Mining (CARM) systems operate by applying an Association Rule Mining (ARM) method to obtain classification rules from a training set of previously classified data. The rules thus generated will be influenced by the choice of ARM parameters employed by the algorithm (typically support and confidence threshold values). In this paper we examine the effect that this choice has on the predictive accuracy of CARM methods.
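As a minimal sketch of what the support and confidence thresholds mean for a single classification rule, the following computes both measures on a toy training set and checks them against user-chosen thresholds. The record layout, the function names, and the `TOY` data are assumptions made for illustration; they are not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical record layout: (set of items, class label).
Record = Tuple[FrozenSet[str], str]

TOY: List[Record] = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"a", "c"}), "no"),
    (frozenset({"b"}), "no"),
]


def support_confidence(records: List[Record],
                       antecedent: FrozenSet[str],
                       label: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> label."""
    n = len(records)
    # Labels of all records whose items contain the antecedent.
    covered = [lbl for items, lbl in records if antecedent <= items]
    both = sum(1 for lbl in covered if lbl == label)
    support = both / n if n else 0.0
    confidence = both / len(covered) if covered else 0.0
    return support, confidence


def passes(records: List[Record], antecedent: FrozenSet[str], label: str,
           min_sup: float, min_conf: float) -> bool:
    """True if the rule meets both thresholds."""
    sup, conf = support_confidence(records, antecedent, label)
    return sup >= min_sup and conf >= min_conf


sup, conf = support_confidence(TOY, frozenset({"a"}), "yes")
```

On this toy data the rule {a} -> yes is matched by three records, two of which carry the label, so raising the confidence threshold from 0.6 to 0.7 is enough to exclude it. This is the kind of sensitivity the abstract refers to.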

Threshold tuning for improved classification association rule mining

2005

One application of Association Rule Mining (ARM) is to identify Classification Association Rules (CARs) that can be used to classify future instances from the same population as the data being mined. Most CARM methods first mine the data for candidate rules, then prune these using coverage analysis of the training data. In this paper we describe a CARM algorithm that avoids the need for coverage analysis, and a technique for tuning its threshold parameters to obtain more accurate classification.
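The coverage analysis that this abstract says most CARM methods rely on can be sketched roughly as follows. This is a generic, CBA-style illustration under stated assumptions (rules arrive already sorted by precedence, and a rule is kept only if it correctly classifies at least one record not yet covered); it is not the algorithm proposed in the paper, which avoids this step. All names and data are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Record = Tuple[FrozenSet[str], str]  # (items, class label)
Rule = Tuple[FrozenSet[str], str]    # (antecedent, predicted label)


def coverage_prune(rules: List[Rule], records: List[Record]) -> List[Rule]:
    """Keep each rule that correctly classifies some still-uncovered record."""
    remaining = list(records)
    kept: List[Rule] = []
    for antecedent, label in rules:  # assumed pre-sorted by precedence
        if not remaining:
            break
        matched = [r for r in remaining if antecedent <= r[0]]
        if any(lbl == label for _, lbl in matched):
            kept.append((antecedent, label))
            # Every record matched by a kept rule counts as covered.
            remaining = [r for r in remaining if not antecedent <= r[0]]
    return kept


records: List[Record] = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"a", "c"}), "no"),
    (frozenset({"b"}), "no"),
]
candidates: List[Rule] = [
    (frozenset({"a"}), "yes"),
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"b"}), "no"),
]
pruned = coverage_prune(candidates, records)
```

Here the second candidate is dropped because the first rule has already covered every record it matches.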

Selection of Significant Rules in Classification Association Rule Mining

2005

Classification Rule Mining (CRM) is a Data Mining technique for the extraction of hidden Classification Rules (CRs) from a given database, the objective being to build a classifier to classify “unseen” data. One recent approach to CRM is to use Association Rule Mining (ARM) techniques to identify the desired CRs, i.e. Classification Association Rule Mining (CARM).

A Novel Rule Weighting Approach in Classification Association Rule Mining

Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), 2007

Regardless of which particular CARM algorithm is used, a similar set of CARs is always generated from data, and a classifier is usually presented as an ordered list of CARs, based on a selected rule ordering strategy. Hence, to produce an accurate classifier, it is essential to develop a rational rule ordering mechanism. In the past decade, a number of rule ordering strategies have been introduced that can be categorized under three headings: (1) support-confidence, (2) rule weighting, and (3) hybrid. In this paper, we propose an alternative rule weighting scheme, namely CISRW (Class-Item Score based Rule Weighting), and develop a rule weighting based rule ordering mechanism based on CISRW. Subsequently, two hybrid rule ordering strategies are further introduced by combining (1) and CISRW. The experimental results show that the three proposed CISRW based/related rule ordering strategies perform well with respect to the accuracy of classification.
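To make the idea of a classifier as an ordered list of CARs concrete, here is a small sketch of the support-confidence ordering category followed by first-match classification. The tie-breaking order (confidence, then support, then shorter antecedent) and the default-class fallback are assumptions in the style of CBA. CISRW itself is not reproduced, since the abstract does not define its scoring formula.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    antecedent: FrozenSet[str]
    label: str
    support: float
    confidence: float


def order_rules(rules: List[Rule]) -> List[Rule]:
    """Sort by confidence, then support (both descending), then antecedent size."""
    return sorted(rules,
                  key=lambda r: (-r.confidence, -r.support, len(r.antecedent)))


def classify(ordered: List[Rule], items: FrozenSet[str], default: str) -> str:
    """Return the label of the first rule whose antecedent matches the record."""
    for rule in ordered:
        if rule.antecedent <= items:
            return rule.label
    return default  # no rule fired


rules = [
    Rule(frozenset({"a"}), "yes", 0.50, 0.67),
    Rule(frozenset({"a", "c"}), "no", 0.25, 1.00),
    Rule(frozenset({"b"}), "no", 0.25, 0.50),
]
ordered = order_rules(rules)
```

Because only the first matching rule decides the class, swapping the ordering key changes predictions even when the rule set stays the same.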

Classification Based on Association-Rule Mining Techniques : A General Survey and Empirical Comparative Evaluation

2011

In this paper, classification and association rule mining algorithms are discussed and demonstrated. In particular, the paper covers the problem of association rule mining and investigates and compares popular association rule algorithms. The classic problem of classification in data mining is also discussed. The paper also considers the use of association rule mining in classification, for which a recently proposed algorithm is demonstrated. Finally, a comprehensive experimental study on 13 UCI data sets is presented to evaluate and compare traditional and association rule based classification techniques with regard to classification accuracy, number of derived rules, rule features, and processing time.

A new approach to classification based on association rule mining

Classification is one of the key issues in the fields of decision sciences and knowledge discovery. This paper presents a new approach for constructing a classifier, based on an extended association rule mining technique in the context of classification. The characteristic of this approach is threefold: first, applying the information gain measure to the generation of candidate itemsets; second, integrating the process of frequent itemsets generation with the process of rule generation; third, incorporating strategies for avoiding rule redundancy and conflicts into the mining process. The corresponding mining algorithm proposed, namely GARC (Gain based Association Rule Classification), produces a classifier with satisfactory classification accuracy, compared with other classifiers (e.g., C4.5, CBA, SVM, NN). Moreover, in terms of association rule based classification, GARC could filter out many candidate itemsets in the generation process, resulting in a much smaller set of rules than that of CBA.
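The information gain measure mentioned first in this abstract can be illustrated with a short sketch. It scores a single item by how much knowing its presence or absence reduces class entropy. How GARC actually applies the measure during candidate itemset generation is not described in the abstract, so the per-item binary split and all names below are assumptions.

```python
import math
from collections import Counter
from typing import FrozenSet, List, Tuple

Record = Tuple[FrozenSet[str], str]  # (items, class label)


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(records: List[Record], item: str) -> float:
    """Entropy reduction from splitting records on presence of `item`."""
    n = len(records)
    if n == 0:
        return 0.0
    labels = [lbl for _, lbl in records]
    with_item = [lbl for items, lbl in records if item in items]
    without_item = [lbl for items, lbl in records if item not in items]
    remainder = sum(len(part) / n * entropy(part)
                    for part in (with_item, without_item))
    return entropy(labels) - remainder


data: List[Record] = [
    (frozenset({"a"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"b"}), "no"),
    (frozenset({"b"}), "no"),
]
gain_a = information_gain(data, "a")
gain_z = information_gain(data, "z")
```

An item that separates the classes perfectly, such as "a" here, scores the full class entropy, while an item that never occurs scores zero and would be a natural candidate to filter out early.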

An Efficient Association Rule Mining Algorithm for Classification

Lecture Notes in Computer Science

In this paper, we propose a new Association Rule Mining algorithm for Classification (ARMC). Our algorithm extracts the set of rules, specific to each class, using a fuzzy approach to select the items and does not require the user to provide thresholds. ARMC is experimentally evaluated and compared to state-of-the-art classification algorithms, namely CBA, PART and RIPPER. Results of experiments on standard UCI benchmarks show that our algorithm outperforms the above-mentioned approaches in terms of mean accuracy.

Integrating Classification and Association Rule Mining

1998

Classification rule mining aims to discover a small set of rules in the database that forms an accurate classifier. Association rule mining finds all the rules existing in the database that satisfy some minimum support and minimum confidence constraints. For association rule mining, the target of discovery is not pre-determined, while for classification rule mining there is one and only one predetermined target. In this paper, we propose to integrate these two mining techniques. The integration is done by focusing on mining a special subset of association rules, called class association rules (CARs). An efficient algorithm is also given for building a classifier based on the set of discovered CARs. Experimental results show that the classifier built this way is, in general, more accurate than that produced by the state-of-the-art classification system C4.5. In addition, this integration helps to solve a number of problems that exist in the current classification systems.
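A rough sketch of what mining class association rules involves: enumerate antecedent itemsets, pair each with a class label, and keep the pairs that satisfy the minimum support and minimum confidence constraints. The brute-force enumeration, the `max_len` cap, and the toy data are assumptions chosen for brevity. The paper describes an efficient algorithm that this sketch does not attempt to reproduce.

```python
from itertools import combinations
from typing import FrozenSet, List, Tuple

Record = Tuple[FrozenSet[str], str]            # (items, class label)
CAR = Tuple[FrozenSet[str], str, float, float]  # (antecedent, label, sup, conf)


def mine_cars(records: List[Record], min_sup: float, min_conf: float,
              max_len: int = 2) -> List[CAR]:
    """Brute-force search for rules antecedent -> label meeting both thresholds."""
    n = len(records)
    items = sorted({i for its, _ in records for i in its})
    labels = sorted({lbl for _, lbl in records})
    cars: List[CAR] = []
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            antecedent = frozenset(combo)
            covered = [lbl for its, lbl in records if antecedent <= its]
            if not covered:
                continue  # antecedent never occurs in the data
            for label in labels:
                both = covered.count(label)
                sup = both / n
                conf = both / len(covered)
                if sup >= min_sup and conf >= min_conf:
                    cars.append((antecedent, label, sup, conf))
    return cars


toy: List[Record] = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"a", "c"}), "no"),
    (frozenset({"b"}), "no"),
]
found = mine_cars(toy, min_sup=0.5, min_conf=0.6)
```

The consequent is always a class label, which is what distinguishes CARs from general association rules, where the target of discovery is not fixed in advance.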

Classifying Using Specific Rules with High Confidence

2010 Ninth Mexican International Conference on Artificial Intelligence, 2010

In this paper, we introduce a new strategy for mining the set of Class Association Rules (CARs), that allows building specific rules with high confidence. Moreover, we introduce two propositions that support the use of a confidence threshold value equal to 0.5. We also propose a new way for ordering the set of CARs based on rule size and confidence values. Our results show a better average classification accuracy than those obtained by the best classifiers based on CARs reported in the literature.

Association rule evaluation for classification purposes

Association rules have proved to be useful in building both partial and complete classification models. This paper analyzes alternative measures which could replace confidence in order to evaluate the suitability of a given association rule with respect to the classification problem we try to solve when building a classification model.
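For orientation, the sketch below computes confidence alongside two widely used alternatives, lift and conviction, from simple counts. The abstract does not say which measures the paper analyzes, so these two are only assumed examples of the kind of measure that could replace confidence. The function name and parameters are hypothetical.

```python
import math
from typing import Dict


def rule_measures(n: int, n_ante: int, n_label: int, n_both: int) -> Dict[str, float]:
    """Measures for a rule A -> c from record counts.

    n: all records; n_ante: records matching A; n_label: records of class c;
    n_both: records matching A that are also of class c.
    """
    if n <= 0 or n_ante <= 0 or n_label <= 0:
        raise ValueError("n, n_ante and n_label must be positive")
    p_label = n_label / n
    confidence = n_both / n_ante
    # Lift compares confidence with the prior probability of the class.
    lift = confidence / p_label
    # Conviction is unbounded when the rule is never wrong.
    if confidence == 1.0:
        conviction = math.inf
    else:
        conviction = (1 - p_label) / (1 - confidence)
    return {"confidence": confidence, "lift": lift, "conviction": conviction}


m = rule_measures(n=4, n_ante=3, n_label=2, n_both=2)
```

Unlike confidence, both alternatives take the class prior into account, so a rule predicting an already frequent class is not rewarded merely for that frequency.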