Article ID: IJEET_11_08_009 Frequent Itemsets in Big Data
The discovery of frequent items in big data or very large datasets is not a new technique, but many of the existing algorithms and approaches need fine-tuning. This paper deals with very large data using a divide-and-conquer approach in which the raw dataset is partitioned into many parts, based on the size of the input data and the number of processes the algorithm uses to unearth the frequent itemsets. The proposed approach computes the count (native support) of each item present in the individual partitions, with no pruning carried out at this stage. In the next stage the discovered itemsets are combined and the universal support is computed to prune away the unpromising itemsets; the data is then divided again to calculate the native support. This process continues until all frequent itemsets are unearthed. The proposed algorithm, the Split and Rule algorithm (SR algorithm), is compared with many existing algorithms to demonstrate its versatility and efficiency in terms of execution time and memory consumption.
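The two-stage loop described in the abstract (native support per partition without pruning, then universal support with pruning, repeated level by level) might be sketched roughly as follows. This is an illustrative reading of the abstract only, not the authors' SR implementation: the function names (`split`, `native_support`, `next_candidates`, `mine_frequent_itemsets`), the round-robin partitioning, the absolute `min_support` threshold, and the Apriori-style candidate join are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def split(transactions, n_parts):
    """Divide the dataset into n_parts partitions (round-robin; an assumption)."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def native_support(partition, candidates):
    """Count each candidate itemset within one partition; nothing is pruned here."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def next_candidates(frequent, size):
    """Build candidates of the given size whose (size - 1)-subsets all survived.

    This Apriori-style join is an assumption; the abstract does not say how
    larger itemsets are generated.
    """
    items = sorted(set().union(*frequent))
    return {
        frozenset(combo)
        for combo in combinations(items, size)
        if all(frozenset(sub) in frequent for sub in combinations(combo, size - 1))
    }


def mine_frequent_itemsets(transactions, min_support, n_parts=2):
    """Return {itemset: universal support} for itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    partitions = split(transactions, n_parts)
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    size = 1
    while candidates:
        # Stage 1: native support in every partition, with no pruning.
        local_counts = [native_support(p, candidates) for p in partitions]
        # Stage 2: merge into universal support, then prune unpromising itemsets.
        universal = sum(local_counts, Counter())
        frequent = {c: n for c, n in universal.items() if n >= min_support}
        result.update(frequent)
        # Repeat with larger candidates built from the survivors.
        size += 1
        candidates = next_candidates(set(frequent), size)
    return result
```

Because universal support is a plain sum of the per-partition counts, the result in this sketch does not depend on how many partitions are used; the partition count would only affect how the counting work could be spread across processes.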