Mining Periodic Patterns from Non-binary Transactions
Related papers
An Efficient Approach to Mine Periodic-Frequent Patterns in Transactional Databases
Lecture Notes in Computer Science, 2012
Recently, temporal occurrences of the frequent patterns in a transactional database have been exploited as an interestingness criterion to discover a class of user-interest-based frequent patterns, called periodic-frequent patterns. Informally, a frequent pattern is said to be periodic-frequent if it occurs at regular intervals specified by the user throughout the database. The basic model of periodic-frequent patterns is based on the notion of "single constraints." Using this model to mine periodic-frequent patterns containing both frequent and rare items leads to a dilemma called the "rare item problem." To confront the problem, an alternative model based on the notion of "multiple constraints" has been proposed in the literature. The periodic-frequent patterns discovered with this model do not satisfy the downward closure property. As a result, it is computationally expensive to mine periodic-frequent patterns with the model. Furthermore, it has been observed that this model still generates some uninteresting patterns as periodic-frequent patterns. With this motivation, we propose an efficient model based on the notion of "multiple constraints." The periodic-frequent patterns discovered with this model satisfy the downward closure property. Hence, periodic-frequent patterns can be efficiently discovered. A pattern-growth algorithm is also discussed for the proposed model. Experimental results show that the proposed model is effective.
Fast and Memory Efficient Mining of Periodic Frequent Patterns
2018
Periodic frequent pattern mining, the process of finding frequent patterns which occur periodically in databases, is an important data mining task that supports decision making. Though several algorithms have been proposed for their discovery, most employ a two-stage process to evaluate the periodicity of patterns: they first derive the set of periods of a pattern from its coverset, and then evaluate the periodicity from the derived set of periods. This two-step process makes algorithms for discovering periodic frequent patterns inefficient in both time and memory. In this paper, we present solutions to reduce both runtime and memory consumption in periodic frequent pattern mining. We achieve this by evaluating the periodicity of patterns without deriving the set of periods from their coversets. Our experimental results show that our proposed solutions are efficient both in reducing the runtime and memory consumption in the discovery of p...
Novel Techniques to Reduce Search Space in Periodic-Frequent Pattern Mining
Lecture Notes in Computer Science, 2014
Periodic-frequent patterns are an important class of regularities that exist in a transactional database. Informally, a frequent pattern is said to be periodic-frequent if it appears at a regular interval specified by the user (i.e., periodically) in a database. A pattern-growth algorithm, called PFP-growth, has been proposed in the literature to discover the patterns. This algorithm constructs a tid-list for a pattern and performs a complete search on the tid-list to determine whether the corresponding pattern is a periodic-frequent or a non-periodic-frequent pattern. In very large databases, the tid-list of a pattern can be very long. As a result, performing a complete search over a pattern's tid-list can make pattern mining computationally expensive. In this paper, we make an effort to reduce the computational cost of mining the patterns. In particular, we apply greedy search on a pattern's tid-list to determine the periodic interestingness of a pattern. Greedy search lets us prune non-periodic-frequent patterns using a sub-optimal solution, while still finding periodic-frequent patterns with the globally optimal solution, thus reducing the computational cost of mining the patterns without missing any knowledge pertaining to the periodic-frequent patterns. We introduce two novel pruning techniques and extend them to improve the performance of PFP-growth. We call the resulting algorithm PFP-growth++. Experimental results show that PFP-growth++ is runtime efficient and highly scalable.
Efficient Mining of Non-Redundant Periodic Frequent Patterns
Vietnam Journal of Computer Science
Periodic frequent patterns are frequent patterns which occur at periodic intervals in databases. They are useful in decision making where event occurrence intervals are vital. Traditional algorithms for discovering periodic frequent patterns, however, often report a large number of such patterns, most of which are often redundant as their periodic occurrences can be derived from other periodic frequent patterns. Using such redundant periodic frequent patterns in decision making would often be detrimental, if not trivial. This paper addresses the challenge of eliminating redundant periodic frequent patterns by employing the concept of deduction rules in mining and reporting only the set of non-redundant periodic frequent patterns. It subsequently proposes and develops a Non-redundant Periodic Frequent Pattern Miner (NPFPM) to achieve this purpose. Experimental analysis on benchmark datasets shows that NPFPM is efficient and can effectively prune the set of redundant periodic frequent...
Mining periodic-frequent patterns with maximum items' support constraints
Proceedings of the Third Annual ACM Bangalore Conference, 2010
Frequent pattern mining approaches based on a single minimum support (minsup), like Apriori and FP-growth, suffer from the "rare item problem" while extracting frequent patterns. That is, at high minsup, frequent patterns consisting of rare items will be missed, and at low minsup, the number of frequent patterns explodes. In the literature, efforts have been made to extract rare frequent patterns under the "multiple minimum support framework". In this framework, "rare frequent patterns" can be extracted by specifying the minsup of a pattern using one of two models: the minimum constraint model and the maximum constraint model. In the literature, an approach has also been proposed to extract only those frequent patterns which occur periodically. The basic model of periodic-frequent patterns is based on a single minsup constraint. It was observed that the periodic-frequent pattern mining approach also suffers from the "rare item problem". An effort has been made to extract rare periodic-frequent patterns using the minimum constraint model. In this paper, we propose a pattern-growth approach to extract rare periodic-frequent patterns by specifying minsup under the maximum constraint model. Experimental results show that the proposed approach is efficient.
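The difference between the two models named in this abstract can be illustrated with a minimal Python sketch. It assumes the usual formulation of the multiple minimum support framework, in which each item carries its own minimum support and a pattern's minsup is the smallest (minimum constraint model) or largest (maximum constraint model) of its items' values. The function name, item names, and numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable


def pattern_minsup(pattern: Iterable[str],
                   item_minsup: Dict[str, int],
                   model: str) -> int:
    """Return the minsup a pattern must reach under the chosen model.

    Assumption: each item has its own minimum support, and the pattern's
    threshold is the min or max of those values depending on the model.
    """
    values = [item_minsup[item] for item in pattern]
    if model == "minimum":
        return min(values)
    if model == "maximum":
        return max(values)
    raise ValueError("model must be 'minimum' or 'maximum'")


# Hypothetical per-item minimum supports: one frequent item, one rare item.
item_minsup = {"bread": 100, "caviar": 5}
pattern = ["bread", "caviar"]
support = 20  # hypothetical number of transactions containing the pattern

print(support >= pattern_minsup(pattern, item_minsup, "minimum"))  # True
print(support >= pattern_minsup(pattern, item_minsup, "maximum"))  # False
```

Under these assumed values, the same pattern is accepted by the minimum constraint model and rejected by the maximum constraint model, which is the stricter of the two.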
Efficient discovery of periodic-frequent patterns in very large databases
Journal of Systems and Software, 2016
Periodic-frequent patterns (or itemsets) are an important class of regularities that exist in a transactional database. Finding these patterns involves discovering all frequent patterns that satisfy the user-specified maximum periodicity constraint. This constraint controls the maximum inter-arrival time of a pattern in a database. The time complexity to measure periodicity of a pattern is O(n), where n represents the number of timestamps at which the corresponding pattern has appeared in a database. As n usually represents a high value in voluminous databases, determining the periodicity of every candidate pattern in the itemset lattice makes the periodic-frequent pattern mining a computationally expensive process. This paper introduces a novel approach to address this problem. Our approach determines the periodic interestingness of a pattern by adopting greedy search. The basic idea of our approach is to discover all periodic-frequent patterns by eliminating aperiodic patterns based on suboptimal solutions. The best and worst case time complexities of our approach to determine the periodic interestingness of a frequent pattern are O(1) and O(n), respectively. We introduce two pruning techniques and propose a pattern-growth algorithm to find these patterns efficiently. Experimental results show that our algorithm is runtime efficient and highly scalable as well.
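The O(1) best case and O(n) worst case mentioned above can be illustrated with a simple early-termination scan. This is a sketch only, not the paper's greedy algorithm or its pruning techniques. It assumes that timestamps are sorted, that the database starts at timestamp 0, and that the gap between the last occurrence and the end of the database also counts as a period; `is_periodic`, `max_per`, and `db_end` are hypothetical names.

```python
from typing import Iterable


def is_periodic(timestamps: Iterable[int], max_per: int, db_end: int) -> bool:
    """Check the maximum periodicity constraint, stopping at the first violation.

    Assumptions: timestamps are sorted in ascending order, the database
    starts at 0, and the trailing gap up to db_end is treated as a period.
    """
    previous = 0
    for ts in timestamps:
        if ts - previous > max_per:
            # The pattern is already aperiodic; no need to read further.
            return False
        previous = ts
    return db_end - previous <= max_per


# First gap (0 -> 5) already exceeds max_per, so the scan stops immediately.
print(is_periodic([5, 6, 7], max_per=3, db_end=8))     # False
# Every gap is within max_per, so all timestamps must be examined.
print(is_periodic([1, 3, 5, 7], max_per=2, db_end=8))  # True
```

An aperiodic pattern can be rejected as soon as one oversized gap is seen, whereas a periodic pattern can only be confirmed after its whole timestamp list has been read.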
Towards fast and memory efficient discovery of periodic frequent patterns
Journal of Information and Telecommunication
Periodic frequent pattern (PFP) mining, the process of discovering frequent patterns that occur at regular periods in databases, is an important data mining task that supports decision making. Although several algorithms have been proposed for discovering PFPs, most of them employ a two-stage approach: they first derive the set of periods of a pattern from its coverset and subsequently evaluate the pattern's periodicity from the derived set of periods. This two-stage approach makes existing algorithms inefficient in both runtime and memory usage. This paper presents solutions for reducing both the runtime and the memory usage of discovering periodic frequent patterns. This is achieved by evaluating the periodicity of patterns without deriving the set of periods from their coversets. Experimental analysis on benchmark datasets shows that the proposed solutions are efficient in reducing both the runtime and memory usage in mining periodic frequent patterns.
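The contrast between the two-stage approach and evaluating periodicity directly from the coverset can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's actual method: periodicity is taken to be the largest gap between consecutive transaction ids, the database is assumed to start at id 0, and the gap from the last occurrence to `db_size` is counted as a period. Both function names are hypothetical.

```python
from typing import List, Sequence


def periodicity_two_stage(coverset: Sequence[int], db_size: int) -> int:
    """Stage 1: materialize every period. Stage 2: take the maximum."""
    periods: List[int] = []
    previous = 0
    for tid in coverset:
        periods.append(tid - previous)
        previous = tid
    periods.append(db_size - previous)  # trailing gap (assumed convention)
    return max(periods)


def periodicity_single_pass(coverset: Sequence[int], db_size: int) -> int:
    """Track only the largest gap seen so far; no list of periods is stored."""
    previous = 0
    largest = 0
    for tid in coverset:
        largest = max(largest, tid - previous)
        previous = tid
    return max(largest, db_size - previous)


coverset = [1, 4, 5, 9]  # hypothetical sorted transaction ids of one pattern
print(periodicity_two_stage(coverset, db_size=10))    # 4
print(periodicity_single_pass(coverset, db_size=10))  # 4
```

Both functions return the same value, but the single-pass version keeps two integers instead of a list whose length grows with the coverset.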
Mining Interesting Periodicities of Temporal Patterns
2008
Data mining, also known as knowledge discovery from datasets, has been recognized as an important area of database research. This area can be defined as efficiently discovering interesting patterns from large datasets. In this paper, a generic method is proposed to extract interesting periodicities of patterns from large datasets in which the transactions are associated with patterns and the time intervals in which the patterns hold. Considering the hierarchy associated with time stamps of the form day-date-hour-minutes-seconds, different types of periodic patterns, such as daily, weekly, and monthly patterns, can be extracted.
Discovering Productive Periodic Frequent Patterns in Transactional Databases
Periodic frequent pattern mining is an important data mining task that supports decision making. However, it often produces a large number of periodic frequent patterns, most of which are not useful because their periodicities are due to the random occurrence of uncorrelated items. Such periodic frequent patterns would most often be detrimental in decision making where correlations between the items of periodic frequent patterns are vital. To enable the mining of periodic frequent patterns with correlated items, we employ a correlation test on periodic frequent patterns and introduce productive periodic frequent patterns as the set of periodic frequent patterns with correlated items. Finally, we develop the productive periodic frequent pattern (PPFP) framework for mining these productive periodic frequent patterns. PPFP is efficient, and the productiveness measure removes the periodic frequent patterns with uncorrelated items.
Effective periodic pattern mining in time series databases
The goal of analyzing a time series database is to find whether and how frequently a periodic pattern is repeated within the series. Periodic pattern mining is the problem of finding such temporal regularity. However, most existing algorithms have a major limitation in mining patterns of interest to the user: they can only mine patterns of a specific length in which all events follow one another sequentially in exact positions. There are, however, scenarios where a pattern can be flexible; that is, it may be interesting and can be mined by skipping any number of unimportant events between important events, so that the pattern has variable length. Moreover, existing algorithms can detect only a specific type of periodicity in time series databases and require interaction from the user to determine the periodicity. In this paper, we propose an algorithm for periodic pattern mining in time series databases which does not rely on the user for the period value or period type of the pattern and can detect all types of periodic patterns at the same time; these flexibilities are missing in existing algorithms. The proposed algorithm lets the user generate different kinds of patterns by skipping intermediate events in a time series database and find the periodicity of those patterns within the database. It is an improvement over pattern generation using a suffix tree, because suffix-tree-based algorithms are weak in this particular area of pattern generation. Compared with existing algorithms, the proposed algorithm improves the generation of different kinds of interesting patterns and detects whether a generated pattern is periodic or not.
We have tested the performance of our algorithm on both synthetic and real-life data from different domains and found a large number of interesting event sequences that existing algorithms miss. The proposed algorithm was also efficient in generating flexible patterns and detecting their periodicity on both types of data.