Improving Efficiency of Apriori Algorithms for Sequential Pattern Mining
Related papers
INTRUSION DETECTION AND ANOMALY DETECTION SYSTEM USING SEQUENTIAL PATTERN MINING
Security methods today range from password-protected access to firewalls, all used to protect data and networks from attackers. These methods are often not enough to protect data. Intrusion Detection Systems (IDS) are one way to secure the data on critical systems. Most research focuses on the effectiveness and accuracy of intrusion detection, but these efforts target intrusions at the operating system and network level only. They cannot detect unexpected system behavior caused by malicious transactions in databases. Detecting interference with information stored in a database is known as database intrusion detection. It relies on recording the execution of transactions; if a recognized pattern deviates from the regular patterns, it is considered an intrusion. The problem identified with this process is that the algorithm used may not identify all patterns. This can affect detection in two ways: 1) regular patterns may be missing from the database; 2) the detection process may neglect some new patterns. We therefore propose a sequential data mining method using a new Modified Apriori Algorithm. The algorithm increases the accuracy and rate of pattern detection. The proposed model uses the Apriori algorithm with modifications.
Finding Frequent Itemsets using Apriori Algorithm to Detect Intrusions in Large Dataset
2014
With the growth of hacking and exploitation tools and the invention of new ways of intrusion, intrusion detection and prevention is becoming the major challenge in the world of network security. The increasing network traffic and data on the Internet are making this task more demanding. Various approaches are used in intrusion detection, but unfortunately none of the systems so far is completely flawless. The false positive rates make it extremely hard to analyze and react to attacks. Intrusion detection systems using data mining approaches make it possible to search for patterns and rules in large amounts of audit data. In this paper, we present a model that integrates association rules into intrusion detection in order to design and implement a network intrusion detection system. Our technique is used to generate attack rules that will detect attacks in network audit data using anomaly detection. This shows that the association rules mining algorithm is capable of detecting ne...
Intrusion Detection System Based on Frequent Pattern Mining
2014
Due to the dramatic increase in internet usage, users are facing various attacks day by day. Consequently, the research area of intrusion detection must stay fresh with new challenges. An intrusion detection system identifies a set of malicious actions that compromise the integrity, confidentiality, and availability of information resources. The major contribution is to apply a data mining approach to a network intrusion detection system. Among the several techniques of data mining, association rule mining with the FP-growth algorithm is used to find the frequent itemsets of the incoming packet database. Based on these itemsets, anomaly detection is added. The system predicts whether an incoming data packet is normal or an attack. The performance of the proposed system is tested using the KDD-99 datasets.
Association Rule Pattern Mining Approaches Network Anomaly Detection
2015
The research area of intrusion detection is growing, with new attack challenges arising day by day. An intrusion detection system identifies a set of malicious actions that compromise the integrity, confidentiality, and availability of information resources. The major objective of this paper is to apply association rule pattern mining approaches to a network intrusion detection system. In this paper, the traditional FP-growth algorithm, one of the association algorithms, is modified and used to mine itemsets from a large database. The required statistics from large databases are gathered into a smaller data structure (FP-tree). The itemsets generated from the FP-tree are used as profiles for anomaly detection in the proposed system.
Procedia Computer Science, 2017
With the fast growth of internet users and technology in Indonesia, threats coming from the internet are rising. Such threats are common to all users in the world. Malware has grown rapidly and its behavior is becoming more advanced. Given these problems, it is important to know how malware is growing and what the characteristics of malware attacks in Indonesia are. This research uses data taken from Intrusion Detection System sensors of Id-SIRTII/CC, Ministry of Information and Communication, Indonesia. It finds the types of attack that occur frequently using Frequent Item Set Mining. The data are then visualized to give a better analysis result and an overview of the internet security condition in Indonesia in 2013. At a minimum support of 95% in frequent item set mining (both Apriori and FP-Max), we found that the frequently occurring malware attacks are SQL attack, Malware Virus DNS and DoS. The largest malware in our data has only slightly less than 80% support, while other patterns have support values of more than 90%.
Intrusion Detection Technology Research Based on Apriori Algorithm
Intrusion detection is one of the important parts of a security system. Research on it has been carried out at home and abroad for decades. However, with the variety of new attack methods, intrusion detection methods and algorithms are also required to improve. By analyzing the technology of Intrusion Detection Systems and data mining, this paper uses the Apriori algorithm, the classic association rule algorithm, in a Web-based Intrusion Detection System and applies the rule base generated by the Apriori algorithm to identify a variety of attacks, improving the overall performance of the detection system.
Sequential pattern mining for ICT risk assessment and management
Journal of Logical and Algebraic Methods in Programming, 2019
ICT risk assessment and management relies on the analysis of data on the joint behavior of a target system and its attackers. The tools in the Haruspex suite model intelligent, goal-oriented attackers that reach their goals through sequences of attacks. The tools synthetically generate these sequences through a Monte Carlo method that runs multiple simulations of the attacker behavior. This paper presents a sequential pattern mining analysis of the attack sequence database to extract a high-level and succinct understanding of the attacker strategies against the system to assess. Such an understanding is expressed as a set of sequential patterns that cover, and possibly partition, the attack sequences. This set can be extracted in isolation, or in contrast with the behavior of other attackers. In the latter case, the patterns represent a signature of the behavior of an attacker. The dynamic tools of the suite use this signature to deploy dynamic countermeasures that reduce the security risk. We formally motivate the need for using the class of maximal sequential patterns in covering attack sequences, instead of frequent or closed sequential patterns. When contrasting the behavior of different attackers, we resort to distinguishing sequential patterns. We report an extensive experimentation on a system with 36 nodes, 6 attackers, and 600K attack sequences.
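To make the distinction between frequent and maximal sequential patterns concrete, here is a minimal Python sketch. It is not the method of the paper above: the attack names, the brute-force enumeration, and the function names (`is_subsequence`, `frequent_patterns`, `maximal_patterns`) are all illustrative assumptions, and the enumeration is only practical for toy databases.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))

def frequent_patterns(database, min_support):
    """Brute force: enumerate every subsequence of every sequence (toy sizes only)."""
    candidates = set()
    for seq in database:
        for r in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), r):
                candidates.add(tuple(seq[i] for i in idx))
    return {p for p in candidates if support(p, database) >= min_support}

def maximal_patterns(frequent):
    """Keep only frequent patterns that are not contained in a longer frequent pattern."""
    return {
        p for p in frequent
        if not any(p != q and is_subsequence(p, q) for q in frequent)
    }

# Hypothetical attack sequences, invented for illustration.
attack_db = [
    ("scan", "exploit", "escalate"),
    ("scan", "exploit", "exfiltrate"),
    ("scan", "escalate"),
]
frequent = frequent_patterns(attack_db, min_support=2)
maximal = maximal_patterns(frequent)
```

In this toy database five patterns are frequent, but only two of them are maximal, which illustrates why a maximal set gives a more succinct cover of the sequences than the full frequent set.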
ARAA: A Fast Advanced Reverse Apriori Algorithm for Mining Association Rules in Web Data
International Journal of Engineering and Technology
This paper proposes an effective algorithm for mining frequent sequence patterns from web data by applying association rules based on Apriori, known as the Advanced Reverse Apriori Algorithm (ARAA). It also shows the limitations of the existing Apriori and Reverse Apriori algorithms. Our approach is based on reverse scans. Experimental work shows that the proposed algorithm works better than the two existing algorithms. ARAA can substantially reduce the multiple scans needed for frequent sequence pattern generation, which results in less processing overhead. A comparative study performed on all three approaches shows that our algorithm improves the mining process significantly compared to Apriori and Reverse Apriori based mining algorithms, especially over the whole database. The advantages of ARAA are reduced execution time and increased throughput.
Keywords: Association Rule, Apriori Algorithm, Reverse Apriori, Web Usage, Frequent Sequence Patterns
I. INTRODUCTION
Data mining is the process of extracting useful information from large datasets by combining statistical and artificial intelligence methods. It aims at finding interesting correlations, frequent patterns, and associations among sets of items [1] in the data sources. Association rule mining has long been a topic of research in data mining. Mining algorithms were introduced to find the associations among large sets of items in a transaction database. The mining process is divided into two phases: the first phase discovers the frequent (large) item sets, based on counts obtained by scanning the transaction data, whereas the second phase builds association rules from the large item sets found in the first phase.
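The two phases described above can be sketched in Python as follows. This is a minimal illustration of classic level-wise Apriori followed by confidence-based rule generation, not the ARAA algorithm itself. The function names, the use of absolute support counts, and the toy transactions are assumptions made for the example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Phase 1: level-wise search for item sets with support count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct item on its own.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # One scan of the data per level to count each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge pairs of frequent item sets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

def association_rules(frequent, min_confidence):
    """Phase 2: derive rules lhs -> rhs whose confidence >= min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent item set is frequent, so it was counted.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules

# Hypothetical transactions, invented for illustration.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
itemsets = frequent_itemsets(transactions, min_support=3)
rules = association_rules(itemsets, min_confidence=0.75)
```

Each pass of the `while` loop corresponds to one scan of the transaction data, which is the cost that scan-reducing variants of Apriori aim to lower.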
This paper discusses the problems of the existing Apriori Algorithm and Reverse Apriori Algorithm for association rule mining, and proposes an algorithm based on the existing Reverse Apriori Algorithm, known as the Advanced Reverse Apriori Algorithm (ARAA), for finding sequence patterns in the filtered data set of a web log. Sequential pattern mining is the procedure of applying data mining techniques to a sequential database in order to uncover the associations that exist among an ordered list of events. This paper focuses on discovering frequent sequence patterns from web data through the Apriori Algorithm (AA), the Reverse Apriori Algorithm (RAA) and the Advanced Reverse Apriori Algorithm (ARAA), and performs a comparative study of all three algorithms in terms of the number of scans required to obtain the frequent sequence patterns. It also justifies why ARAA is better than AA and RAA.
A. Problem Statement
Agrawal and Srikant first proposed the problem of sequence pattern mining [3, 14]. Sequential pattern mining involves finding a combinatorially explosive number of intermediate sequences in large datasets. This paper discusses the problem of sequence pattern mining with respect to web log data. The problem is divided into two sub-problems. The first is to find the item sets whose occurrences exceed a predefined threshold value in the large database. These item sets are termed frequent or large item sets. This first problem is further divided into two sub-problems: (1) candidate large item set generation and (2) frequent item set generation. Out of the candidates, the item sets whose support exceeds the user-defined threshold value are considered frequent or large item sets. The second problem of sequence pattern mining is to generate association rules from those large item sets, subject to a minimum confidence constraint. Let I = {i1, i2, ..., in} be a set of n discrete literals called items, and let T be a set of variable-length transactions over I, where each transaction contains an item set of (i1,