A fast parallel algorithm for frequent itemsets mining

Abstract

Mining frequent itemsets from large databases is an important computational task with many applications. The best known of these is the market-basket problem, in which we have a large number of items and want to know which items are bought together. A more recent application treats web pages as baskets and the pages they link to as items: pages with many common references may be about the same topic. In this paper we present a parallel algorithm for mining frequent itemsets. We provide experimental evidence that our algorithm scales well, and we discuss the merits of parallelization for this problem.
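
To make the market-basket problem concrete, the following is a minimal sketch of support counting over partitioned data. It is not the algorithm presented in the paper, which the abstract does not describe; it is a brute-force illustration in which the function names (`count_itemsets`, `frequent_itemsets`), the parameters `min_support`, `max_size`, and `n_parts`, and the sample baskets are all assumptions made for this example. The partitions are counted one after another here, but since each count is independent, each partition could in principle be handed to a separate worker and the counts merged afterwards.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(baskets, max_size):
    """Count how many baskets contain each itemset of size 1..max_size."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(basket))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def frequent_itemsets(baskets, min_support, max_size=2, n_parts=2):
    """Return itemsets that occur in at least min_support baskets."""
    # Hypothetical partitioning: split the baskets round-robin into n_parts.
    parts = [baskets[i::n_parts] for i in range(n_parts)]
    total = Counter()
    for part in parts:
        # Each partition is counted independently, then merged by addition.
        total.update(count_itemsets(part, max_size))
    return {s: c for s, c in total.items() if c >= min_support}


# Illustrative data only; not taken from the paper.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(baskets, min_support=2)
```

With these four baskets and `min_support=2`, every single item and every pair of items qualifies as frequent. Enumerating all subsets of each basket grows exponentially with `max_size`, which is one reason practical miners prune candidates and why parallelization is attractive for large databases.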
