Fast Generation of Best Interval Patterns for Nonmonotonic Constraints

Mining Patterns with a Balanced Interval

In many applications it is useful to know which patterns occur with a balanced interval, e.g., a certain combination of phone numbers is called almost every Friday, or a group of products sells well on Tuesdays and Thursdays. In previous work we proposed a new measure of support (the number of occurrences of a pattern in a dataset), where we count the number of times a pattern occurs (nearly) in the middle between two other occurrences. If the number of non-occurrences between two occurrences of a pattern stays almost the same, then we call the pattern balanced. It was noticed that some very frequent patterns obviously also occur with a balanced interval, meaning in every transaction. However, more interesting patterns might occur, e.g., every three transactions. Here we discuss a solution using standard deviation and average. Furthermore, we propose a simpler approach for pruning patterns with a balanced interval, making estimating the pruning threshold more intuitive.
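The standard-deviation-and-average idea can be illustrated with a small sketch. It is not the authors' measure: the function names, the population standard deviation, and the thresholds `max_std` and `min_occurrences` are assumptions chosen only to show how the gaps between occurrences might be summarized.

```python
from statistics import mean, pstdev

def occurrence_gaps(transactions, pattern):
    """Distances (in transactions) between consecutive occurrences of pattern."""
    pattern = set(pattern)
    positions = [i for i, t in enumerate(transactions) if pattern <= set(t)]
    return [b - a for a, b in zip(positions, positions[1:])]

def is_balanced(transactions, pattern, max_std=0.5, min_occurrences=3):
    """Hypothetical test: a pattern is 'balanced' when its gaps barely vary.

    max_std and min_occurrences are illustrative thresholds, not values
    taken from the paper.
    """
    gaps = occurrence_gaps(transactions, pattern)
    if len(gaps) < max(1, min_occurrences - 1):
        return False  # too few occurrences to talk about an interval
    return pstdev(gaps) <= max_std

transactions = [
    {"a", "b"}, {"c"}, {"d"}, {"a", "b", "c"}, {"c"},
    {"e"}, {"a", "b"}, {"d"}, {"c"}, {"a", "b", "e"},
]

gaps_ab = occurrence_gaps(transactions, {"a", "b"})
print(gaps_ab, mean(gaps_ab), is_balanced(transactions, {"a", "b"}))
gaps_c = occurrence_gaps(transactions, {"c"})
print(gaps_c, is_balanced(transactions, {"c"}))
```

Here the average gap tells how often the pattern recurs (every three transactions for `{"a", "b"}`), while the standard deviation tells how regular that recurrence is.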

Optimal Approach for Temporal Patterns Discovery

FLAIRS, 2003

This paper presents a new technique for discovering temporal patterns when the primitives considered are intervals. The Apriori technique is the one most commonly used to deal with temporal patterns over point primitives. An extension of this technique was proposed by Höppner to deal with interval primitives. In this paper, we show that it is not necessary to discover all patterns; instead, it is sufficient to discover the set of optimum "interesting" patterns, which is smaller than the set of all significant patterns. For this task, we introduce a new approach to reduce the combinatorial explosion of generated patterns. The resulting technique, called TPGIP (Temporal Patterns Generation with Interval Primitives), is used to discover the optimal set of interesting patterns efficiently. TPGIP exploits some symmetric properties of interval algebra and uses a partial patterns structure to explore the pattern set efficiently when generating candidate patterns. Some experimental and comparative results are shown at the end of this paper.

Robust mining of time intervals with semi-interval partial order patterns

Proceedings of the 2010 SIAM International Conference on Data Mining, 2010

We present a new approach to mining patterns from symbolic interval data that extends previous approaches by allowing semi-intervals and partially ordered patterns. The mining algorithm combines and adapts efficient algorithms from sequential pattern and itemset mining for discovery of the new semi-interval patterns. The semi-interval patterns and semi-interval partial order patterns are more flexible than patterns over full intervals, and are empirically demonstrated to be more useful as features in classification settings. We performed an extensive empirical evaluation on seven real life interval databases totalling over 146k intervals from more than 400 classes demonstrating the flexibility and usefulness of the patterns.

Constraint-based sequential pattern mining: the pattern-growth methods

Journal of Intelligent Information Systems, 2007

Constraints are essential for many sequential pattern mining applications. However, there is no systematic study on constraint-based sequential pattern mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-pattern mining does not fit our mission well. An extended framework is developed based on a sequential pattern growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into the sequential pattern mining under this new framework. Moreover, this framework can be extended to constraint-based structured pattern mining as well.

Keywords: Sequential pattern mining • Frequent pattern mining • Mining with constraints • Pattern-growth methods

This research is supported in part by NSERC Grant 312194-05, NSF Grants IIS-0308001, IIS-0513678, BDI-0515813 and National Science Foundation of China (NSFC) grants No. 60303008 and 69933010. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

Efficiently Mining Constrained Subsequence Patterns

Advanced Data Mining and Applications, 2018

Big time series data are generated daily by various application domains such as environment monitoring, internet of things, health care, industry and science. Mining this massive data is a very challenging task because conventional data mining algorithms are unable to scale effectively with massive time series data. Moreover, applying a global classification approach to highly similar and noisy data will hinder the classification performance. Therefore, utilizing constrained subsequence patterns in data mining applications increases efficiency and accuracy, and could provide useful insight into the data. To address the above-mentioned limitations, we propose an efficient subsequence processing technique with preference constraints. Then, we introduce a sub-pattern analysis for time series data. The objective of the sub-pattern analysis is to maximize the interclass separability using a localization approach. Furthermore, we make use of the deviation from a correlation constraint as an objective to minimize in our problem, and we include users' preferences as an objective to maximize in proportion to users' preferred time intervals. We experimentally validate the efficiency and effectiveness of our proposed algorithm using real data to demonstrate its superiority and efficiency when compared to recently proposed correlation-based subsequence search algorithms.

Pseudo Projection Based Approach to Discover Time Interval Sequential Pattern

2015

Data mining is the process of finding mysterious and interesting patterns in a transactional database. Sequential mining is one of the major sub-areas of data mining and aims to find frequent sequences. As straightforward sequential pattern mining methods do not consider the time intervals between transaction occurrences, it is impossible to predict the time interval between any two transactions mined as frequent sequences. There are several constraints for finding effective and frequent sequential patterns. In this paper, I take the time interval between two successive transactions. Time interval sequential mining is the process of finding frequent sequential patterns while considering a time interval constraint between two successive transactions. This paper proposes a modified version of the I-PrefixSpan algorithm, called NI-PrefixSpan, to find time interval sequential patterns with a pseudo projection table. Later, this paper identifies various advantages and drawbacks with this appro...

Constrained pattern mining in the new era

Knowledge and Information Systems, 2015

Twenty years of research on frequent itemset mining, or pattern mining, has led to the existence of a set of efficient algorithms for identifying different types of patterns, from transactional to sequential. Despite the great advances in this field, big data brought a completely new context to operate, with new challenges arising from the growth in data size, dynamics and complexity. These challenges include the shift not only from static to dynamic data, but also from tabular to complex data sources, such as social networks (expressed as graphs) and data warehouses (expressed as multi-relational models). In this new context, and more than ever, users need effective ways to control the large number of discovered patterns, and to be able to choose what patterns to consider at each time. The most accepted and common approach to minimize these drawbacks has been to capture and represent the semantics of the domain through constraints, and use them not only to reduce the number of results, but also to focus the algorithms in areas where it is more likely to gain information and return more interesting results. The use of constraints in pattern mining has been widely studied, and there are a lot of proposed types of constraints and pushing strategies. In this paper, we present a new global view of the work done on the incorporation of constraints in the pattern mining process. In particular, we propose a new framework for constrained pattern mining, that allows us to organize and analyze existing algorithms and strategies, based on the different types and properties of constraints, and on the data sources they are able to handle.

Keywords: Data mining • Pattern mining • Domain knowledge • Constraints

This work was partially supported by FCT-Fundação para a Ciência e a Tecnologia, under Project D2PM

Mining First-Order Temporal Interval Patterns with Regular Expression Constraints

2007

Most methods for temporal pattern mining assume that time is represented by points on a straight line starting at some initial instant. In this paper, we consider a new kind of first-order temporal pattern, specified in Allen's Temporal Interval Logic, where time is explicitly represented by intervals. We present the algorithm MILPRIT for mining temporal interval patterns, which uses variants of the classical level-wise search algorithms. MILPRIT allows a broad spectrum of constraints over temporal patterns to be incorporated in the mining process. Some experimental results over synthetic and real data are presented.
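For readers unfamiliar with Allen's interval relations, the following sketch classifies a pair of intervals into one of the thirteen basic relations. It is not part of MILPRIT; the function name, the tuple representation `(start, end)` with `start < end`, and the relation labels are assumptions made for illustration.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b.

    Intervals are (start, end) tuples with start < end (an assumption of
    this sketch); the result is one of the 13 basic relations.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    # From here on the two intervals share more than a single point.
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    return "overlaps" if s1 < s2 else "overlapped-by"

fever, cough = (1, 5), (3, 8)
print(allen_relation(fever, cough))
print(allen_relation(cough, fever))
```

Swapping the arguments always yields the inverse relation, which is the kind of symmetry interval-based miners can use to avoid examining both orderings of a pair.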

Mining periodic-frequent patterns with maximum items' support constraints

Proceedings of the Third Annual ACM Bangalore Conference, 2010

The single minimum support (minsup) based frequent pattern mining approaches like Apriori and FP-growth suffer from the "rare item problem" while extracting frequent patterns. That is, at high minsup, frequent patterns consisting of rare items will be missed, and at low minsup, the number of frequent patterns explodes. In the literature, efforts have been made to extract rare frequent patterns under the "multiple minimum support framework". In this framework, "rare frequent patterns" can be extracted by specifying the minsup of a pattern using one of two models: the minimum constraint model and the maximum constraint model. In the literature, an approach has been proposed to extract only those frequent patterns which occur periodically. The basic model of periodic-frequent patterns is based on a single minsup constraint. It was observed that the periodic-frequent pattern mining approach also suffers from the "rare item problem". An effort has been made to extract rare periodic-frequent patterns using the minimum constraint model. In this paper, we have proposed a pattern-growth approach to extract rare periodic-frequent patterns by specifying minsup under the maximum constraint model. Experimental results show that the proposed approach is efficient.
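The interplay of the two conditions can be sketched as follows. This is a brute-force check, not the pattern-growth algorithm of the paper, and it rests on two assumptions the abstract does not spell out: that the periodicity of a pattern is its largest gap between consecutive occurrences (counting the gaps from the start and to the end of the database), and that under the maximum constraint model the minsup of a pattern is the largest of its items' minimum supports. All names (`item_minsup`, `max_per`, and the functions) are hypothetical.

```python
def tid_list(transactions, pattern):
    """1-based ids of the transactions that contain every item of pattern."""
    pattern = set(pattern)
    return [tid for tid, items in enumerate(transactions, start=1)
            if pattern <= set(items)]

def max_period(tids, n_transactions):
    """Largest gap between consecutive occurrences, including the gaps
    from the start of the database and to its end (assumed definition)."""
    points = [0] + tids + [n_transactions]
    return max(b - a for a, b in zip(points, points[1:]))

def is_periodic_frequent(transactions, pattern, item_minsup, max_per):
    """Check support and periodicity; minsup is the maximum of the items'
    minimum supports (assumed reading of the maximum constraint model)."""
    tids = tid_list(transactions, pattern)
    minsup = max(item_minsup[item] for item in pattern)
    return (len(tids) >= minsup
            and max_period(tids, len(transactions)) <= max_per)

transactions = [
    {"a", "b"}, {"b"}, {"a", "b", "r"}, {"b"},
    {"a", "r"}, {"a", "b"}, {"b", "r"}, {"a", "b"},
]
item_minsup = {"a": 4, "b": 4, "r": 2}  # the rare item r gets a lower minsup

for pattern in ({"a", "b"}, {"r"}, {"a", "r"}):
    print(sorted(pattern),
          is_periodic_frequent(transactions, pattern, item_minsup, max_per=3))
```

With a single minsup of 4 the rare item `r` would be discarded outright; giving it its own lower threshold keeps `{"r"}` while `{"a", "r"}` is still held to the stricter threshold of `a`.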

Mining Closed Frequent Intervals from Interval Data

2012

Interval data is widely encountered in several areas of data mining applications. In the context of discovering interesting temporal patterns in interval data, the notion of closed frequent intervals is proposed in this paper. A pioneering method to determine the set of closed frequent intervals in an interval database is presented. A rigorous mathematical proof has been provided to substantiate the correctness of the proposed method.