Maximal Association Rules: A Tool for Mining Associations in Text
Related papers
A Survey of Association Rule Mining in Text applications
In data mining, association rule mining is a prominent research field concerned with discovering frequent patterns in data repositories, whether real-world or synthetic datasets. Association rule mining is constrained in that every rule must satisfy a set of constraints such as minimum support and confidence. The objective of this survey is to discuss the basic techniques of association rule mining and the concepts of text mining. Transactions of text documents are available in various data warehouses. In particular, this analysis covers some text-based medical applications. This work integrates one association rule mining algorithm, namely Apriori, into text mining in order to find interesting patterns that can be easily understood through visualization techniques.
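The minimum support and confidence constraints mentioned in this abstract can be illustrated with a small sketch. It assumes each document is reduced to a set of keywords; the function names and the toy medical keywords are hypothetical and are not taken from the survey.

```python
from typing import FrozenSet, List


def support(docs: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of documents that contain every keyword in `itemset`."""
    return sum(1 for d in docs if itemset <= d) / len(docs)


def confidence(docs: List[FrozenSet[str]],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    s_a = support(docs, antecedent)
    if s_a == 0:
        return 0.0
    return support(docs, antecedent | consequent) / s_a


# Toy keyword sets (illustrative data only).
docs = [
    frozenset({"fever", "cough", "flu"}),
    frozenset({"fever", "flu"}),
    frozenset({"cough", "asthma"}),
    frozenset({"fever", "cough"}),
]

# Rule under test: {fever} -> {flu}
rule_support = support(docs, frozenset({"fever", "flu"}))
rule_conf = confidence(docs, frozenset({"fever"}), frozenset({"flu"}))
```

A rule is kept only when both values reach user-chosen thresholds; the thresholds themselves are application-specific.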
TRUMIT: a tool to support large-scale mining of text association rules
2011
Due to the nature of textual data, the application of association rule mining to text corpora has attracted the attention of the research community for years. In this paper we demonstrate a system that can efficiently mine association rules from text. The system annotates terms using several annotators, and extracts text association rules between terms or categories of terms. An additional contribution of this work is the inclusion of novel unsupervised evaluation measures for weighting and ranking the importance of the text rules. We demonstrate the functionalities of our system with two text collections, a set of Wikileaks documents, and one from TREC-7.
Textmining: Generating association rules from textual data
Text mining is an emerging research area whose goal is to discover additional information from hidden patterns in large unstructured text collections. Given a collection of text documents, most text mining approaches perform knowledge-discovery operations on labels associated with each document, which are usually keywords that result from non-trivial keyword-labeling processes. In this paper, we are especially interested in extracting associations from unstructured databases, in particular full text. The aim of this paper is twofold. First, to propose a conceptual approach, based on formal concept analysis [GANT99], to discover knowledge, formally represented by association rules, from a large textual corpus. Second, to introduce an algorithm that derives additional, implicit association rules from the already discovered association rules by using an associated taxonomy.
A Text Mining Technique Using Association Rules Extraction
International Journal of …, 2007
This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It depends on keyword features to discover association rules amongst ...
Object-Oriented Data Structure for Text Association Rule Mining
2007
Mining association rules is being actively studied for transaction databases, but its extension to text applications is relatively novel. Most previous studies implement an Apriori-like approach, which requires multiple passes over the database to find all frequent itemsets. However, for some types of databases, such as bibliographic databases where the data are very sparse, the Apriori algorithm becomes very costly. In this paper, we propose a new algorithm called Object-Oriented Association Rule Mining (OOARM), which uses a special object-oriented data structure that holds all relevant information. The algorithm can find all frequent itemsets in a single scan of the database. Performance studies show that it is faster than the Apriori algorithm. All rules that involve a certain itemset can be generated in real time. That, in turn, opens new opportunities to organize and explore relationships in large text data sets.
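To make the "multiple passes" cost concrete, here is a minimal sketch of a level-wise, Apriori-like search that counts how many times it scans the transactions. It is a textbook-style illustration, not the OOARM algorithm from the paper; the function name and the toy keyword data are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def apriori(transactions: List[Set[str]],
            min_support: float) -> Tuple[Dict[FrozenSet[str], float], int]:
    """Return all frequent itemsets with their support, plus the number of scans."""
    n = len(transactions)
    # Level 1 candidates: every single item.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent: Dict[FrozenSet[str], float] = {}
    passes = 0
    k = 1
    while current:
        passes += 1  # one full scan of the transactions per level
        counts = {c: 0 for c in current}
        for t in transactions:
            for c in current:
                if c <= t:
                    counts[c] += 1
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        keys = list(level)
        candidates = {a | b for a in keys for b in keys if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent, passes


# Toy keyword transactions (illustrative data only).
transactions = [
    {"data", "mining", "text"},
    {"data", "mining"},
    {"text", "mining"},
    {"data", "text", "mining"},
]
frequent, passes = apriori(transactions, min_support=0.5)
```

On this toy input the search scans the transactions once per itemset size, which is the cost a single-scan structure aims to avoid.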
Mining associations in text in the presence of background knowledge
1996
This paper describes the FACT system for knowledge discovery from text. It discovers associations − patterns of co-occurrence − amongst keywords labeling the items in a collection of textual documents. In addition, FACT is able to use background knowledge about the keywords labeling the documents in its discovery process. FACT takes a query-centered view of knowledge discovery, in which a discovery request is viewed as a query over the implicit set of possible results supported by a collection of documents, and where background knowledge is used to specify constraints on the desired results of this query process. Execution of a knowledge-discovery query is structured so that these background-knowledge constraints can be exploited in the search for possible results. Finally, rather than requiring a user to specify an explicit query expression in the knowledge-discovery query language, FACT presents the user with a simple-to-use graphical interface to the query language, with the language providing a well-defined semantics for the discovery actions performed by a user through the interface.
Mining the Text using Association Rule Mining Technique
2020
As the amount of text available in electronic form continues to increase at an alarming rate, tools to manage these textual resources effectively will become critical. Because users spend a lot of time finding documents or information in texts, an information retrieval system tries to reduce their access time by classifying and clustering documents. Text mining is therefore a popular and necessary approach to this problem. The largest amount of work in text mining has been in the areas of categorization, classification and clustering of documents. Text mining offers many methods for finding useful information. Among these methods, association rule mining is well suited to finding the most frequent words that occur in a document collection. Association rule analysis is the task of discovering association rules that occur frequently in a given set of texts. Our proposed system has been developed by applying the preprocessing steps of a text mining system and...
Using Soft Set Theory for Mining Maximal Association Rules in Text Data
2016
Mining maximal association rules with soft set theory, based on the concept of frequent maximal itemsets that appear maximally in many records, has been developed in recent years. This method has been shown to be very effective for mining interesting association rules that are not obtained by methods for regular association rule mining. Several algorithms have been developed to solve the problem, but overall they retain weaknesses related to memory use and mining time. In this paper, we propose an effective strategy for maximal rule mining based on soft set theory that consists of the following steps: 1) Build the tree Max_IT_Tree, where each node contains a maximal itemset X, the category of X, the set of transactions in which X is maximal, and the support of the maximal itemset X for each category. 2) From the tree Max_IT_Tree built in the previous step, build a tree Max_Item_IT_Tree so that each maximal itemset has child nodes where each node contains ite...
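The information stored per node in step 1 (a maximal itemset, its category, and the transactions in which it is maximal) can be sketched with a flat dictionary instead of a tree. This is not the paper's Max_IT_Tree. It assumes the usual definition from the maximal association rules literature, namely that an itemset X is maximal in a transaction when X is exactly the set of that transaction's items belonging to X's category. The names `maximal_itemsets` and `category_of` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set, Tuple


def maximal_itemsets(
    transactions: Dict[int, Set[str]],
    category_of: Dict[str, str],
) -> Dict[Tuple[str, FrozenSet[str]], Set[int]]:
    """Map (category, maximal itemset) to the ids of transactions where it is maximal."""
    index: Dict[Tuple[str, FrozenSet[str]], Set[int]] = defaultdict(set)
    for tid, items in transactions.items():
        # Group this transaction's items by category.
        by_cat: Dict[str, Set[str]] = defaultdict(set)
        for item in items:
            by_cat[category_of[item]].add(item)
        # The full group per category is the one itemset that is maximal here
        # (assumed definition, see lead-in).
        for cat, members in by_cat.items():
            index[(cat, frozenset(members))].add(tid)
    return dict(index)


# Toy taxonomy and keyword-labelled documents (illustrative data only).
category_of = {
    "usa": "country", "canada": "country", "france": "country",
    "corn": "topic", "fish": "topic",
}
transactions = {
    1: {"usa", "canada", "corn"},
    2: {"usa", "canada", "fish"},
    3: {"usa", "canada", "france", "fish"},
}
index = maximal_itemsets(transactions, category_of)
```

Here {usa, canada} is contained in all three documents but is maximal only in documents 1 and 2, because document 3 also mentions france. The size of each transaction set gives the maximal support count for that itemset in its category.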
ACM Computing Surveys, 2006
The task of finding correlations between items in a dataset, association mining, has received considerable attention over the last decade. This article presents a survey of association mining fundamentals, detailing the evolution of association mining algorithms from the seminal to the state-of-the-art. This survey focuses on the fundamental principles of association mining, that is, itemset identification, rule generation, and their generic optimizations.