Closed Set Based Discovery of Small Covers for Association Rules

1999

In this paper, we address the problem of the understandability and usefulness of the set of discovered association rules. This problem is important since real-life databases lead most of the time to several thousands of rules with high confidence. We thus propose new algorithms based on the Galois closed sets to limit the extraction to small informative covers for exact and approximate rules, and small structural covers for approximate rules. Once frequent closed itemsets - which constitute a generating set for both frequent itemsets and association rules - have been discovered, no additional database pass is needed to derive these covers. Experiments conducted on real-life databases show that these algorithms are efficient and valuable in practice.
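The claim that no further database pass is needed can be illustrated with a minimal sketch. It assumes a hypothetical table `CLOSED` of frequent closed itemsets with support counts (toy values chosen for illustration, not taken from the paper's experiments). The helper names `support` and `confidence` are likewise illustrative. The support of a frequent itemset equals the support of its smallest closed superset, which is the largest support among all closed supersets.

```python
# Hypothetical frequent closed itemsets with their support counts
# (toy values for illustration; minimum support count assumed to be 2).
CLOSED = {
    frozenset(): 5,
    frozenset("c"): 4,
    frozenset("ac"): 3,
    frozenset("be"): 4,
    frozenset("bce"): 3,
    frozenset("abce"): 2,
}


def support(itemset, closed=CLOSED):
    """Support of an itemset, derived from the closed itemsets alone.

    The smallest closed superset has the same support as the itemset,
    and every other closed superset has a support that is no larger,
    so taking the maximum over closed supersets is enough.
    Returns None when the itemset is not frequent.
    """
    itemset = frozenset(itemset)
    candidates = [sup for c, sup in closed.items() if itemset <= c]
    return max(candidates) if candidates else None


def confidence(antecedent, consequent, closed=CLOSED):
    """Confidence of antecedent -> consequent, without touching the database."""
    base = support(antecedent, closed)
    whole = support(frozenset(antecedent) | frozenset(consequent), closed)
    if base is None or whole is None:
        return None
    return whole / base


if __name__ == "__main__":
    print(support("ab"))            # derived from the closed itemset {a, b, c, e}
    print(confidence("a", "c"))     # exact rule
    print(confidence("c", "b"))     # approximate rule
```

In this sketch a rule is exact when its confidence is 1 and approximate otherwise; both kinds are evaluated from the closed itemsets only.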

Closed-set-based Discovery of Bases of Association Rules

The output of an association rule miner is often huge in practice. This is why several concise lossless representations have been proposed, such as the "essential" or "representative" rules. We revisit the algorithm given by Kryszkiewicz (Int. Symp. Intelligent Data Analysis 2001, Springer-Verlag LNCS 2189, 350-359) for mining representative rules. We show that its output is sometimes incomplete, due to an oversight in its mathematical validation. We propose alternative complete generators and we extend the approach to an existing closure-aware basis similar to, and often smaller than, the representative rules, namely the basis B*.

Mining Minimal Non-redundant Association Rules Using Frequent Closed Itemsets

2000

The problem of the relevance and usefulness of extracted association rules is of primary importance because, in the majority of cases, real-life databases lead to several thousand association rules with high confidence, many of which are redundant. Using the closure of the Galois connection, we define two new bases for association rules whose union is a generating set for all valid association rules with support and confidence. These bases are characterized using frequent closed itemsets and their generators; they consist of the non-redundant exact and approximate association rules having minimal antecedents and maximal consequents, i.e. the most relevant association rules. Algorithms for extracting these bases are presented, and results of experiments carried out on real-life databases show that the proposed bases are useful and that their generation is not time consuming.
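The idea of exact rules with minimal antecedents and maximal consequents can be sketched as follows. This is a brute-force illustration, not the authors' extraction algorithm: it assumes a hypothetical toy database `DB`, treats a generator as a frequent itemset with no proper subset of equal support, and emits one rule per generator whose closure is strictly larger. Approximate rules are left out, and all function names are illustrative.

```python
from itertools import combinations

# Hypothetical toy database: one set of items per transaction.
DB = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def support(itemset, db):
    """Number of transactions that contain the itemset."""
    return sum(1 for t in db if itemset <= t)


def closure(itemset, db):
    """Galois closure: items common to all transactions containing the itemset.

    Assumes the itemset occurs in at least one transaction.
    """
    covering = [t for t in db if itemset <= t]
    return frozenset(set.intersection(*covering))


def minimal_generators(db, minsup):
    """Frequent non-empty itemsets with no proper subset of equal support.

    Checking the immediate subsets suffices because support can only
    grow or stay equal when an item is removed.
    """
    items = sorted(set().union(*db))
    gens = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            g = frozenset(combo)
            sup = support(g, db)
            if sup >= minsup and all(support(g - {i}, db) > sup for i in g):
                gens.append(g)
    return gens


def exact_rule_basis(db, minsup):
    """Exact rules g -> closure(g) minus g, one per generator with a non-empty consequent."""
    rules = []
    for g in minimal_generators(db, minsup):
        consequent = closure(g, db) - g
        if consequent:
            rules.append((g, consequent))
    return rules


if __name__ == "__main__":
    for g, c in exact_rule_basis(DB, minsup=2):
        print(sorted(g), "->", sorted(c))
```

Each antecedent is minimal because it is a generator, and each consequent is maximal because it contains everything in the closure that the antecedent does not already hold.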

Mining Most Generalization Association Rules Based on Frequent Closed Itemset

2012

Association rule mining plays an important role in knowledge discovery and data mining. The rules obtained by some previous works based on support and confidence measures might be redundant to a certain degree. This paper thus proposes the concept of most generalization association rules (MGARs), which are more compact than three previous rule types: traditional association rules, non-redundant association rules and minimal non-redundant association rules. Some theorems relating to the properties of MGARs are derived as well, and an algorithm based on these theorems is then proposed for effectively pruning unpromising rules early. Hash tables are used to check whether the generated rules are redundant or not. Experimental results show that the number of MGARs generated from a database is much smaller than the number of non-redundant association rules and the number of minimal non-redundant association rules.

Discovering Frequent Closed Itemsets for Association Rules

1999

In this paper, we address the problem of finding frequent itemsets in a database. Using the closed itemset lattice framework, we show that this problem can be reduced to the problem of finding frequent closed itemsets. Based on this statement, we can construct efficient data mining algorithms by limiting the search space to the closed itemset lattice rather than the subset lattice. Moreover, we show that the set of all frequent closed itemsets suffices to determine a reduced set of association rules, thus addressing another important data mining problem: limiting the number of rules produced without information loss. We propose a new algorithm, called A-Close, using a closure mechanism to find frequent closed itemsets. We conducted experiments to compare our approach to the commonly used frequent itemset search approach. Those experiments showed that our approach is very valuable for dense and/or correlated data, which represent an important part of existing databases.
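One way to restrict the search to closed itemsets is sketched below. It is a simplified level-wise illustration under stated assumptions, not a reproduction of A-Close: candidates are grown Apriori-style, a candidate is dropped when it is infrequent or has the same support as one of its immediate subsets, and the closure of each surviving candidate is recorded. The toy database `DB` and all function names are hypothetical.

```python
# Hypothetical toy database: one set of items per transaction.
DB = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def support(itemset, db):
    """Number of transactions that contain the itemset."""
    return sum(1 for t in db if itemset <= t)


def closure(itemset, db):
    """Items common to all transactions containing the itemset (assumed non-empty cover)."""
    covering = [t for t in db if itemset <= t]
    return frozenset(set.intersection(*covering))


def frequent_closed_levelwise(db, minsup):
    """Map each frequent closed itemset to its support count."""
    if not db or len(db) < minsup:
        return {}
    empty = frozenset()
    closed = {closure(empty, db): len(db)}   # closure of the empty itemset
    kept_prev = {empty: len(db)}             # survivors of the previous level
    level = {frozenset([i]) for i in set().union(*db)}
    while level:
        kept = {}
        for cand in level:
            subsets = [cand - {i} for i in cand]
            # Every immediate subset must have survived the previous level.
            if any(s not in kept_prev for s in subsets):
                continue
            sup = support(cand, db)
            if sup < minsup:
                continue
            # Same support as a subset: the candidate adds no new closure.
            if any(kept_prev[s] == sup for s in subsets):
                continue
            kept[cand] = sup
            closed[closure(cand, db)] = sup
        keys = list(kept)
        # Join survivors that differ in exactly one item to form the next level.
        level = {a | b for a in keys for b in keys if len(a | b) == len(a) + 1}
        kept_prev = kept
    return closed


if __name__ == "__main__":
    for itemset, sup in sorted(frequent_closed_levelwise(DB, 2).items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), sup)
```

On dense or correlated data many itemsets share a closure, so far fewer candidates survive each level than in a plain frequent itemset search; the toy database here only hints at that effect.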

New algorithms for fast discovery of association rules

1997

Discovery of association rules is an important problem in database mining. In this paper we present new algorithms for fast association mining, which scan the database only once, addressing the open question of whether all the rules can be efficiently extracted in a single database pass. The algorithms use novel itemset clustering techniques to approximate the set of potentially maximal frequent itemsets.
