A Frequent Pattern Discovery Method for Outlier Detection

FP-outlier: Frequent pattern based outlier detection

Computer Science and Information Systems, 2005

An outlier in a dataset is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data. Detection of such outliers is important for many applications and has recently attracted much attention in the data mining research community. In this paper, we present a new method to detect outliers by discovering frequent patterns (or frequent itemsets) in the data set. Outliers are defined as the data transactions whose itemsets contain fewer frequent patterns. We define a measure called FPOF (Frequent Pattern Outlier Factor) to detect the outlier transactions and propose the FindFPOF algorithm to discover outliers. The experimental results show that our approach outperforms existing methods in identifying interesting outliers.
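The idea of scoring a transaction by the frequent patterns it contains can be sketched as follows. This is a minimal illustration, not the FindFPOF algorithm itself: it assumes FPOF is the sum of the supports of the frequent patterns contained in a transaction, divided by the total number of frequent patterns, and it enumerates itemsets by brute force instead of using a real frequent-pattern miner. The function names `frequent_patterns` and `fpof` and the toy data are hypothetical.

```python
from itertools import combinations


def frequent_patterns(transactions, min_support):
    """Return {itemset: support} for every non-empty itemset with support >= min_support.

    Brute-force enumeration, only suitable for tiny examples (assumption:
    a real implementation would use Apriori, FP-growth, or similar).
    """
    n = len(transactions)
    counts = {}
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def fpof(transaction, patterns):
    """Sum of supports of the frequent patterns contained in the transaction,
    divided by the total number of frequent patterns."""
    if not patterns:
        return 0.0
    t = set(transaction)
    return sum(s for p, s in patterns.items() if p <= t) / len(patterns)


# Toy data: four similar transactions and one that shares little with them.
data = [{"a", "b", "c"}] * 4 + [{"a", "d"}]
patterns = frequent_patterns(data, min_support=0.5)
scores = [fpof(t, patterns) for t in data]
```

Under this definition, a lower score means the transaction contains fewer frequent patterns, so the lowest-scoring transactions are the outlier candidates.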

An Outlier Detection Based on Frequent Pattern

2013

Outlier detection and analysis is an important data mining task that in recent years has been widely used in applications such as fraud detection, marketing analysis, medical analysis, and network intrusion detection. This work focuses on EFPOR, an efficient outlier detection method for transactional datasets. The algorithm detects outliers based on the frequent patterns of itemsets within transactions. The work first describes the EFPOR algorithm and then derives its time complexity. Finally, the efficiency of the algorithm is verified by comparison with other algorithms, and the experimental results are presented.

Anytime algorithm for frequent pattern outlier detection

International Journal of Data Science and Analytics, 2016

Outlier detection consists of detecting anomalous observations in data. During the past decade, outlier detection methods based on the concept of frequent patterns have been proposed. Such methods require mining all frequent patterns to compute the outlier factor of each transaction. Despite recent progress in the pattern mining field, this approach remains too expensive to provide results within a short response time of only a few seconds. In this paper, we provide the first anytime method for calculating the frequent pattern outlier factor (FPOF). This method, which can be interrupted at any time by the end user, accurately approximates FPOF by mining a sample of patterns. It also computes the maximum error on the estimated FPOF to help the user stop the process at the right time. Experiments show the value of this method for very large datasets, where exhaustive mining fails to provide good approximate solutions. For the same budget in number of patterns, the accuracy of our anytime approximate method exceeds that of the baseline approach.
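To make the idea of approximating a pattern-based outlier score from a sample of patterns concrete, here is a rough sketch. It is not the paper's algorithm and computes no error bound. It assumes patterns are drawn with probability proportional to their support (via a two-step draw: a transaction weighted by its number of non-empty subsets, then a uniform non-empty subset), and it scores each transaction by the fraction of sampled patterns it contains, which is a simplified normalization chosen for illustration. The names `sample_pattern` and `approximate_scores` are hypothetical.

```python
import random


def sample_pattern(transactions, weights, rng):
    """Draw one non-empty itemset with probability proportional to its support."""
    # Step 1: pick a transaction, weighted by its count of non-empty subsets.
    t = rng.choices(transactions, weights=weights, k=1)[0]
    items = sorted(t)
    # Step 2: pick a uniform non-empty subset by rejection sampling.
    while True:
        pattern = frozenset(i for i in items if rng.random() < 0.5)
        if pattern:
            return pattern


def approximate_scores(transactions, n_samples, seed=0):
    """Estimate, per transaction, the support-weighted share of patterns it contains.

    n_samples is the pattern budget; a larger budget gives a tighter estimate.
    Assumes at least one transaction is non-empty.
    """
    rng = random.Random(seed)
    transactions = [frozenset(t) for t in transactions]
    weights = [2 ** len(t) - 1 for t in transactions]  # empty transactions get weight 0
    hits = [0] * len(transactions)
    for _ in range(n_samples):
        pattern = sample_pattern(transactions, weights, rng)
        for i, t in enumerate(transactions):
            if pattern <= t:
                hits[i] += 1
    return [h / n_samples for h in hits]


# Toy data: five identical transactions and one that shares nothing with them.
data = [{"a", "b", "c"}] * 5 + [{"x"}]
estimates = approximate_scores(data, n_samples=2000)
```

Because the loop only accumulates counts, it could be stopped after any number of samples and still return a usable estimate, which is the sense in which a sampling scheme like this lends itself to anytime use.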

A Survey for Different Approaches of Outlier Detection in Data Mining

An outlier is defined as an event that deviates too much from other events. Identifying outliers can lead to the discovery of useful and meaningful knowledge. An outlier is something that happens only occasionally and is not a regular activity. Outlier detection has been studied extensively in the past decade. However, most existing research has focused on algorithms based on specific knowledge, and comparisons of outlier detection approaches are still rare. This paper focuses on the different kinds of outlier detection approaches and compares their pros and cons. We divide outlier detection approaches into two groups: classic outlier approaches and spatial outlier approaches. Classic outlier approaches identify outliers in real transaction datasets and can be grouped into statistical, distance-based, deviation-based, and density-based approaches. Spatial outlier approaches detect outliers in spatial datasets, which differ from transaction data, and can be categorized into space-based and graph-based approaches. Finally, a comparison of the outlier detection approaches is given.

Detection of Anomalous Value in Data Mining

Kalpa Publications in Engineering

In a database of numeric values, outliers are the points that differ from other values or are inconsistent with the rest of the data. They can be novel, abnormal, unusual, or noisy information. Outliers attract more attention than the bulk of the data. The challenges of outlier detection grow with the increasing complexity, size, and variety of datasets. The problem is how to manage outliers in a dataset and how to evaluate them. This paper describes an improved approach that uses outlier detection as a pre-processing step to detect the outliers and then applies a rectangle fit algorithm, in order to analyze the effects of the outliers on the analysis of the dataset.

Survey on Outlier Detection in Data Mining

International Journal of Computer Applications, 2013

Data mining is used to extract useful information from a collection of databases or data warehouses, and in recent years it has become an important field. This paper surveys data mining and the techniques it uses to extract useful information, such as clustering, as well as the techniques used to detect outliers. It also presents the techniques that different researchers have used to detect outliers and to present the results efficiently to the user.

An experimental analysis of outliers detection on static exaustive datasets

INTERNATIONAL JOURNAL OF LATEST TRENDS IN ENGINEERING AND TECHNOLOGY, 2016

Data mining, in general, focuses on finding non-trivial, hidden, and useful information in various types of data, aided by advances in information technology. Data streams are ubiquitous: they arise in many application domains, from online financial transactions to medicine and space research centers, where satellites continuously generate data streams. Clustering, classification, and association are closely tied to data mining [1]. In recent years, existing database querying methodologies have proved insufficient for extracting useful information, so researchers are now primarily working on new techniques to meet these growing requirements. Outlier detection is an important research issue that aims to find objects that are considerably different from, or inconsistent with, the rest of the database. Efficient and effective detection of outliers minimizes the risk of making poor decisions based on erroneous data and helps in identifying and preventing the effects of malicious or faulty behavior. One important factor in data mining is that the increasing dimensionality of data gives rise to a number of computational challenges.

Outlier Detection for Different Applications: Review

2013

Outlier detection is a data mining application. Outliers include noisy data, which has been studied in various domains, and a range of fairly generic techniques has already been researched. We survey various techniques and applications of outlier detection and provide a novel approach that is more useful for beginners. The proposed approach helps to clean data at the university level in less time and with high accuracy. The survey covers existing outlier detection techniques and the applications in which noisy data occurs. The paper gives a critical review of the techniques used in different applications of outlier detection that merit further research, and of the knowledge-based data they yield, which is useful in research activities. Wherever anomalies are present, they can be detected with outlier detection techniques and monitored accordingly.