Efficiently Mining Constrained Subsequence Patterns

Big time series data are generated daily by application domains such as environmental monitoring, the Internet of Things, health care, industry, and science. Mining these massive data is challenging because conventional data mining algorithms do not scale effectively to massive time series. Moreover, applying a global classification approach to highly similar and noisy data hinders classification performance. Utilizing constrained subsequence patterns in data mining applications therefore increases efficiency and accuracy, and could provide useful insight into the data. To address the above-mentioned limitations, we propose an efficient subsequence processing technique with preference constraints. We then introduce a sub-pattern analysis for time series data, whose objective is to maximize interclass separability using a localization approach. Furthermore, we treat the deviation from a correlation constraint as an objective to minimize, and we include user preferences as an objective to maximize in proportion to the users' preferred time intervals. We experimentally validate the efficiency and effectiveness of our proposed algorithm on real data and demonstrate its superiority over recently proposed correlation-based subsequence search algorithms.