Adaptive and Resource-Aware Mining of Frequent Sets

An efficient parallel and distributed algorithm for counting frequent sets

2003

Due to the huge increase in the number and dimension of available databases, efficient solutions for counting frequent sets are nowadays very important within the Data Mining community. Several sequential and parallel algorithms were proposed, which in many cases exhibit excellent scalability. In this paper we present ParDCI, a distributed and multithreaded algorithm for counting the occurrences of frequent sets within transactional databases.

Efficiently Mining Frequent Itemsets in Transactional Databases

Journal of Marine Science and Technology, 2016

Discovering frequent itemsets is an essential task in association rule mining, and it is considered computationally expensive. Frequent pattern growth (FP-growth) is one of the best algorithms for mining frequent patterns. However, many experimental results have shown that building conditional FP-trees during mining consumes most of the CPU time of the FP-growth method. In addition, storing the FP-trees requires a lot of space. This paper presents a new approach for mining frequent itemsets from a transactional database without building the conditional FP-trees, which saves considerable computing time and memory. Experimental results on datasets obtained from the FIMI repository website indicate that our method substantially reduces running time and memory usage.

A sampling-based framework for parallel mining frequent patterns

2006

Data mining is an emerging research area, whose goal is to discover potentially useful information embedded in databases. Due to the wide availability of huge amounts of data and the imminent need for turning such data into useful knowledge, data mining has attracted a great deal of attention in recent years. Frequent pattern mining has been a focal topic in data mining research. The goal of frequent pattern mining is to discover the patterns whose number of occurrences in the datasets is above a predefined threshold. Depending on how a pattern is defined, frequent pattern mining covers various mining problems, such as frequent itemset mining, sequential pattern mining and so on. Frequent pattern mining has numerous applications, such as the analysis of customer purchase patterns, web access patterns, natural disasters or alarm sequences, disease treatments and DNA sequences. Many algorithms have been presented for mining frequent patterns since the introduction of...
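The threshold-based definition above can be illustrated with a brute-force count for the itemset case. This is only a minimal sketch, not any of the algorithms listed on this page: the function name `frequent_itemsets`, the `max_size` cap, and the choice to treat "at least `min_count`" as the threshold are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_count, max_size=2):
    """Count every itemset of up to max_size items and keep the frequent ones.

    Hypothetical helper: enumerates all candidate itemsets per transaction,
    which is exponential in general and only suitable for tiny examples.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates, fix item order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # Assumption: an itemset is frequent if it occurs in >= min_count transactions.
    return {itemset: n for itemset, n in counts.items() if n >= min_count}


transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_count=3)
```

With `min_count=3` only the three single items survive; lowering it to 2 also admits every pair, which shows how strongly the threshold controls the size of the output.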

A generalized parallel algorithm for frequent itemset mining

A parallel algorithm for finding the frequent itemsets in a set of transactions is presented. The frequent individual items are identified by their index. We assume that the number of processors (m) is less than the number of frequent items (n). In the first stage, every processor P_i, i ∈ {1, ..., m − 1}, sequentially computes the frequent itemsets from the interval I_i = [(i − 1)·p + 1, i·p], where p = ⌊n/m⌋. The processor P_m computes the frequent itemsets from the interval I_m = [(m − 1)·p + 1, n]. In the second stage, the parallel algorithm is applied. The processor P_i computes, step by step, the sets FI_{I_i,I_j} of the frequent itemsets with individual items from the intervals I_{i,j} = I_i ∪ I_{i+1} ∪ ... ∪ I_j, j = i+1, ..., m. In order to compute the set FI_{I_i,I_j}, the processor P_i uses FI_{I_i,I_{j−1}} obtained in the previous step and FI_{I_{i+1},I_j} received from the processor P_{i+1}. The main advantage of our parallel algorithm is that it uses a communication pattern known before algorithm start,...
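The first-stage partition described above can be sketched in a few lines. The function name `item_intervals` is hypothetical; the interval bounds follow the formulas in the abstract, with p = ⌊n/m⌋ and the last processor taking the remainder. The second-stage merging of itemset collections is not shown.

```python
def item_intervals(n, m):
    """Return the 1-based item intervals I_1..I_m for n frequent items and m processors.

    Hypothetical helper that mirrors the abstract: processors 1..m-1 each get
    p = n // m consecutive item indices, and processor m gets everything left.
    """
    if not 0 < m < n:
        raise ValueError("the abstract assumes fewer processors than frequent items")
    p = n // m
    intervals = [((i - 1) * p + 1, i * p) for i in range(1, m)]
    intervals.append(((m - 1) * p + 1, n))  # last interval absorbs the remainder
    return intervals


intervals = item_intervals(10, 3)
```

For n = 10 and m = 3 this gives p = 3, so the last processor handles four items while the others handle three each.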

A scalable multi-strategy algorithm for counting frequent sets

2002

In this paper we present DCI, a new data mining algorithm for frequent set counting. We also discuss in depth the parallelization strategies used in the design of ParDCI, the distributed and multi-threaded algorithm derived from DCI. Multiple heuristic strategies are adopted within DCI, so that the algorithm is able to adapt its behavior not only to the features of the specific computing platform, but also to the features of the dataset being processed.

IJERT-An Efficient Approach for Frequent Pattern Mining Using Parallel Computing

International Journal of Engineering Research and Technology (IJERT), 2014

https://www.ijert.org/an-efficient-approach-for-frequent-pattern-mining-using-parallel-computing https://www.ijert.org/research/an-efficient-approach-for-frequent-pattern-mining-using-parallel-computing-IJERTV3IS071244.pdf Frequent itemset mining is a heavily researched field of data mining. Apriori and FP-Growth are the most traditional algorithms for it. Developing a fast and efficient algorithm for frequent pattern mining is a challenging task. In this paper, we improve the efficiency of the Apriori algorithm using Hadoop concepts and techniques to handle the big data problem.

An Efficient Approach for Parallel and Incremental Mining of Frequent Pattern in Transactional Database

2015

In this paper, we provide an overview of parallel incremental association rule mining, one of the prominent ideas in the new and rapidly emerging research area of data mining. Frequent itemset mining (FIM) is a useful tool for discovering frequently co-occurring items. Since its inception, a number of significant FIM algorithms have been developed to increase mining performance. But when the dataset is huge, both the computational cost and the memory use can be too high. In this paper, we propose parallelizing the FP-Growth algorithm using MapReduce. The approach splits the mining task into a number of sub-tasks, executes these sub-tasks in parallel on nodes, and then combines the results into the final result. Experiments show that this increases computational speed compared to Apriori and FP-Growth. General Terms: Data Mining, Association Rule Mining, Incremental Data Mining.
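The split, parallel execute, and combine pattern described above can be sketched on a single machine. This is not the paper's method and not FP-Growth itself: it only counts single-item supports, which is the kind of first pass a parallel FP-Growth would need, and it uses a thread pool in place of a MapReduce cluster. All function names here are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split(transactions, parts):
    """Split the transaction list into `parts` shards (round-robin)."""
    return [transactions[i::parts] for i in range(parts)]


def map_count(shard):
    """Map step: count, within one shard, how many transactions contain each item."""
    counts = Counter()
    for transaction in shard:
        counts.update(set(transaction))  # each item counted once per transaction
    return counts


def parallel_item_counts(transactions, parts=2):
    """Run the map step on every shard in parallel, then sum the partial counts."""
    shards = split(transactions, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(map_count, shards))
    # Reduce step: combine the per-shard counters into global supports.
    return reduce(lambda a, b: a + b, partials, Counter())


transactions = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
supports = parallel_item_counts(transactions, parts=2)
```

Because the per-shard counts are simply added, the result does not depend on how the transactions are sharded, which is what makes this step easy to distribute.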

A highly parallel algorithm for frequent itemset mining

Advances in Pattern …, 2010

Abstract. Mining frequent itemsets in large databases is a widely used technique in Data Mining. Several sequential and parallel algorithms have been developed, although, when dealing with high data volumes, the execution of those algorithms takes more time and resources ...