Novel parallel method for mining frequent patterns on multi-core shared memory systems

Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems - DISCS-2013, 2013

Abstract

Frequent pattern mining is an important problem in data mining with many practical applications. Current parallel methods for mining frequent patterns perform inconsistently across different database types and under-utilize the benefits of multi-core shared memory machines. We present ShaFEM, a novel parallel frequent pattern mining method, to address these issues. Our method dynamically adapts to the data characteristics so that it performs efficiently on both sparse and dense databases. Its lock-free parallel mining approach minimizes the need for synchronization and maximizes data independence to enhance scalability. Its structure lends itself well to dynamic job scheduling, resulting in a well-balanced load on new multi-core shared memory architectures. We evaluate ShaFEM on a 12-core multi-socket server and find that our method runs 2.1 to 5.8 times faster than the state-of-the-art parallel method. In some test cases, ShaFEM saves 4.9 days and 12.8 hours of execution time compared with that method.
