A New Approach for Extracting Closed Frequent Patterns and their Association Rules using Compressed Data Structure
Related papers
A Survey on Closed Frequent Pattern Mining
International Journal of Computer Applications, 2013
Association rule mining has attracted many researchers, and several algorithms for the effective discovery of association rules have been proposed. Despite the vast literature on closed frequent itemset discovery and association rule mining, we still cannot say that most of the problems have been solved. This motivates our study of closed frequent itemsets and association rule mining. In this paper we review a few algorithms for closed frequent itemset mining and present a comparison.
An Efficient Mining Algorithm for Closed Frequent Itemsets and its Associated Data
2020
A database is a repository of information. Automatically retrieving patterns from a database provides the requisite information and is in great demand in various domains of science and engineering. Effective pattern mining methods such as pattern discovery and association rule mining have been developed and find applicability across a wide gamut, from science and medicine to military and engineering applications. Contemporary retrieval methods such as pattern discovery and association rule mining algorithms are useful only for retrieving the data. Their limitation is that they cannot provide a complete association and relationship among the diverse patterns that are retrieved. This paper addresses that limitation by designing a new algorithm (CFIM) which generates closed frequent patterns and their associated data concurrently. CFIM makes explicit the relationship between the patterns and their associated data.
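To make the idea of closed frequent patterns paired with their associated data concrete, the following is a minimal brute-force sketch. It is not the CFIM algorithm, whose details the abstract does not give. It only applies the standard definition: a frequent itemset is closed when no proper superset has the same support. The function name `closed_frequent_itemsets`, the absolute `min_support` count, and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support):
    """Map each closed frequent itemset to the ids of the transactions containing it.

    Brute force over all item combinations: suitable only for tiny examples.
    """
    items = sorted({item for t in transactions for item in t})

    # Step 1: collect every frequent itemset together with its transaction ids.
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            tids = frozenset(
                i for i, t in enumerate(transactions) if itemset <= set(t)
            )
            if len(tids) >= min_support:
                frequent[itemset] = tids

    # Step 2: keep an itemset only if no proper superset has the same support.
    closed = {}
    for itemset, tids in frequent.items():
        absorbed = any(
            itemset < other and len(other_tids) == len(tids)
            for other, other_tids in frequent.items()
        )
        if not absorbed:
            closed[itemset] = tids
    return closed


# Hypothetical toy data: transaction ids are the list positions 0..3.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"c", "d"},
]
result = closed_frequent_itemsets(transactions, min_support=2)
for itemset, tids in result.items():
    print(sorted(itemset), sorted(tids))
```

In this toy data `{a}` and `{b}` are frequent but not closed, because `{a, b}` occurs in exactly the same three transactions; only `{a, b}`, `{c}` and `{d}` are reported, each with the transactions that support it.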
Fuzzy Frequent Pattern Mining by Compressing Large Databases
International Journal of Advance Engineering and Research Development, 2015
The task of extracting useful and interesting knowledge from large data is called data mining. It has many aspects, such as clustering, classification, anomaly detection, and association rule mining. Among these data mining aspects, association rule mining has gained a lot of interest among researchers. Some applications of association mining include analysis of stock databases, mining of web data, diagnosis in the medical domain, and analysis of customer behavior. In the past, researchers developed many algorithms for mining frequent itemsets, but these algorithms generate candidate itemsets. To overcome this, tree-based approaches for mining frequent patterns were developed; they perform the mining operation by constructing a tree with items on its nodes, which eliminates the disadvantage of most of the earlier algorithms. The paper tries to address the problem of finding frequent itemsets by compressing the fuzzy FP tree, which confines itemsets into fuzzy regions with their membership values. Applying the compression mechanism results in a compact tree structure that reduces the computation time. The proposed method is compared with the conventional method to analyze its performance.
A Compression Based Methodology to Mine All Frequent Items
International Journal of Trend in Scientific Research and Development, 2018
Data mining is not new. People who first discovered how to start fire and that the earth is round also discovered knowledge, which is the main idea of data mining. Data mining, also called Knowledge Discovery in Databases, is one of the latest research areas; it has emerged in response to the tsunami, or flood, of data that the world is facing nowadays. It has taken up the challenge of developing techniques that can help humans discover useful patterns in data. One such important technique is frequent pattern mining. This paper presents a compression-based technique for mining frequent items from a transaction data set.
High performance frequent patterns extraction using compressed fp-tree
Many algorithms have been proposed to improve the performance of mining frequent patterns from transaction databases. Pattern growth algorithms like FP-Growth based on the FP-tree are more efficient than candidate generation and test algorithms. In this paper, we propose a new data structure named Compressed FP-Tree (CFP-Tree) and an algorithm named CT-PRO that performs better than the current algorithms including FP-Growth, OpportuneProject, and Apriori. The number of nodes in a CFP-Tree can be up to 50% less than in the corresponding FP-Tree. CT-PRO is empirically compared with FP-Growth, OpportuneProject, Apriori and CT-ITL using datasets that reveal the effective performance range of these algorithms. CT-PRO is also extended for mining very large databases and its scalability evaluated experimentally.
Mining compressed frequent-pattern sets
2005
A major challenge in frequent-pattern mining is the sheer size of its mining results. In many cases, a high min_sup threshold may discover only commonsense patterns but a low one may generate an explosive number of output patterns, which severely restricts its usage. In this paper, we study the problem of compressing frequent-pattern sets. Typically, frequent patterns can be clustered with a tightness measure δ (called δ-cluster), and a representative pattern can be selected for each cluster. Unfortunately, finding a minimum set of representative patterns is NP-Hard. We develop two greedy methods, RPglobal and RPlocal. The former has the guaranteed compression bound but higher computational complexity. The latter sacrifices the theoretical bounds but is far more efficient. Our performance study shows that the compression quality using RPlocal is very close to RPglobal, and both can reduce the number of closed frequent patterns by almost two orders of magnitude. Furthermore, RPlocal mines even faster than FPClose [11], a very fast closed frequent-pattern mining method. We also show that RPglobal and RPlocal can be combined together to balance the quality and efficiency.
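The selection of representative patterns can be illustrated with a generic greedy set-cover sketch. It is neither RPglobal nor RPlocal, whose details the abstract does not give. Several things here are assumptions made for illustration: the distance between two patterns is taken to be the Jaccard distance between their supporting transaction sets, a representative is taken to cover a pattern when the pattern is a subset of it and their distance is at most δ, and representatives are drawn from the input patterns themselves. The names `jaccard_distance` and `greedy_representatives` are hypothetical.

```python
def jaccard_distance(tids_a, tids_b):
    """1 - |A ∩ B| / |A ∪ B| on two sets of transaction ids (assumed measure)."""
    union = tids_a | tids_b
    if not union:
        return 0.0
    return 1.0 - len(tids_a & tids_b) / len(union)


def greedy_representatives(patterns, delta):
    """Greedy set cover over patterns.

    patterns: dict mapping frozenset(items) -> frozenset(transaction ids).
    Returns the chosen representatives in the order they were picked.
    """
    # For each candidate representative, list the patterns it covers:
    # sub-patterns whose supporting transactions are within distance delta.
    covers = {
        rep: {
            p
            for p, tids in patterns.items()
            if p <= rep and jaccard_distance(tids, rep_tids) <= delta
        }
        for rep, rep_tids in patterns.items()
    }

    uncovered = set(patterns)
    chosen = []
    while uncovered:
        # Pick the candidate covering the most still-uncovered patterns;
        # ties are broken by pattern length, then alphabetically.
        best = max(
            covers,
            key=lambda rep: (len(covers[rep] & uncovered), len(rep), sorted(rep)),
        )
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


# Hypothetical toy input: four frequent patterns with their transaction ids.
patterns = {
    frozenset({"a"}): frozenset({0, 1, 2, 3}),
    frozenset({"b"}): frozenset({0, 1, 2}),
    frozenset({"a", "b"}): frozenset({0, 1, 2}),
    frozenset({"c"}): frozenset({4, 5}),
}
chosen = greedy_representatives(patterns, delta=0.3)
print([sorted(p) for p in chosen])
```

Every pattern covers itself at distance 0, so the loop always makes progress and terminates. With δ = 0.3, `{a, b}` stands in for `{a}` and `{b}` as well, and `{c}` remains its own representative, so four patterns compress to two.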
A New Tree Structure To Extract Frequent Pattern
2013
Frequent pattern mining is a heavily researched area in the field of data mining with a wide range of applications. Finding frequent patterns (or items) plays an essential role in data mining, and efficient algorithms to discover them are essential in data mining research. A number of research works have been published that present new algorithms, or improvements on existing algorithms, to solve data mining problems efficiently. Among them, the Apriori algorithm was the first algorithm proposed in this field. Alongside changes and improvements to the Apriori algorithm, algorithms that compress a large database into a small tree data structure, such as the FP tree, CAN tree and CP tree, have been developed. These algorithms are partition-based: they use a divide-and-conquer method that decomposes the mining task into a smaller set of tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. In this paper I propose a novel tree structure - extension of CP tree...
Improving the Efficiency of Frequent Pattern Mining by Compact Data Structure Design
Lecture Notes in Computer Science, 2003
Mining frequent patterns has been a topic of active research because it is computationally the most expensive step in association rule discovery. In this paper, we discuss the use of compact data structure design for improving the efficiency of frequent pattern mining. It is based on our work in developing efficient algorithms that outperform the best available frequent pattern algorithms on a number of typical data sets. We discuss improvements to the data structure design that have resulted in faster frequent pattern discovery. The performance of our algorithms is studied by comparing their running times on typical test data sets against the fastest Apriori, Eclat, FP-Growth and OpportuneProject algorithms. We discuss the performance results as well as the strengths and limitations of our algorithms.
Fast frequent itemset mining using compressed data representation
21st IASTED International Multi-Conference …, 2003
Fast frequent itemset mining using compressed data representation. RP Gopalan, YG Sucahyo. 21st IASTED International Multi-Conference on Applied Informatics, 1203-1208, 2003. Discovering association rules by identifying ...