Multi-Relational Data Mining A Comprehensive Survey
Related papers
Multi Relational Data Mining Approaches: A Data Mining Technique
International Journal of Computer Applications, 2012
Multi-relational data mining (MRDM) has developed as an alternative way of handling structured data such as that held in an RDBMS, because it mines multiple tables directly. In MRDM, patterns span multiple tables (relations) of a relational database. Spreading the data over many tables causes problems for conventional data mining in practice. To deal with this, one either constructs a single table by propositionalisation or uses a multi-relational data mining algorithm. MRDM approaches have been successfully applied in bioinformatics. Three popular pattern-finding techniques, classification, clustering and association, are frequently used in MRDM. Applying data mining directly to multiple tables avoids expensive join operations and semantic losses. This paper focuses on some application areas of MRDM and future directions, as well as a comparison of ILP, GM, SSDM and MRDM.
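As a rough illustration of the propositionalisation route mentioned in this abstract, the sketch below collapses a one-to-many relation into aggregate columns on the target table. The relation names (`patients`, `lab_tests`), the helper `propositionalise`, and the choice of count/mean/max aggregates are all assumptions made for the example; the paper does not prescribe them.

```python
from collections import defaultdict

# Hypothetical relations: patients is the target table,
# lab_tests holds zero or more rows per patient.
patients = [{"pid": "a", "label": 1}, {"pid": "b", "label": 0}]
lab_tests = [
    {"pid": "a", "value": 2.0},
    {"pid": "a", "value": 4.0},
    {"pid": "b", "value": 1.0},
]

def propositionalise(targets, children, key, field):
    """Summarise the child rows of each target row as fixed-width features."""
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[field])

    table = []
    for target in targets:
        values = grouped.get(target[key], [])
        table.append({
            **target,
            f"{field}_count": len(values),
            f"{field}_mean": sum(values) / len(values) if values else None,
            f"{field}_max": max(values) if values else None,
        })
    return table

# One row per patient, ready for a single-table learner.
flat_table = propositionalise(patients, lab_tests, key="pid", field="value")
```

The result keeps one row per target entity, at the price of discarding whatever structure the chosen aggregates do not capture, which is the kind of semantic loss the multi-relational algorithms aim to avoid.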
MR-Radix: a multi-relational data mining algorithm
Human-centric Computing and Information Sciences, 2012
Background: Because the multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, allowing data mining to be applied to multiple tables directly and thus avoiding expensive join operations and semantic losses, this work proposes an algorithm based on that approach. Methods: To compare the performance of the traditional and multi-relational approaches to mining association rules, this paper presents an empirical study of PatriciaMine, a traditional algorithm, and its proposed multi-relational counterpart, MR-Radix. Results: This work showed performance advantages of the multi-relational approach over several tables, which avoids the high cost of join operations across multiple tables and semantic losses. MR-Radix ran faster than PatriciaMine, despite handling complex multi-relational patterns. The utilized memory indicates a more conservative growth curve for MR-Radix tha...
A Review: Data mining over Multi-Relations
International Journal of Computer Applications, 2013
Multi-relational data mining enables pattern mining from multiple tables. Multi-relational data mining algorithms can be used as a practical proposal to overcome the deficiencies of conventional algorithms. They extract frequent patterns directly from different registers in an efficient manner, without the need to transfer the data into a single table. On the other hand, the available memory is often not enough to process large amounts of data. For this reason, algorithms that are careful with memory use are essential for mining large repositories. The paper provides an overview of multi-relational data mining techniques and classification algorithms. It also defines frequent pattern mining and discusses various architectures and issues related to multi-table data mining. Much work has been proposed in this area, some of which is discussed in this paper.
On Multi-Relational Data Mining for Foundation of Data Mining
2007 IEEE/ACS International Conference on Computer Systems and Applications, 2007
Multi-Relational Data Mining (MRDM) deals with knowledge discovery from relational databases consisting of one or multiple tables. As a typical technique for MRDM, inductive logic programming (ILP) has the power of dealing with reasoning related to various data mining tasks in a "unified" way. Like granular computing (GrC), ILP-based MRDM models the data and the mining process on these data through intension and extension of concepts. Unlike GrC, however, the inference ability of ILP-based MRDM lies in the powerful Prolog-like search engine. Although this important feature suggests that through ILP, MRDM can contribute to the foundation of data mining (FDM), the interesting perspective of "ILP-based MRDM for FDM" has not been investigated in the past. In this paper, we examine this perspective. We provide justification and observations, and report results of related experiments. The primary objective of this paper is to draw the attention of FDM researchers to the ILP-based MRDM perspective.
A User-Driven Association Rule Mining Based on Templates for Multi-Relational Data
Journal of Computer Science, 2018
Data mining algorithms for finding association rules are an important tool for extracting knowledge from databases. However, these algorithms produce an enormous number of rules, many of which may be redundant or irrelevant for a specific decision-making process. These algorithms also do not take previous knowledge and hypotheses into account. Moreover, most existing data mining approaches look for patterns in a single data table, ignoring the relations present in relational databases. The contribution of this paper is a multi-relational data mining algorithm based on association rules, called TBMR-Radix, which considers previous knowledge and hypotheses through the use of the Templates technique. Applying this approach to two real databases, we were able to reduce the number of generated rules, use the existing knowledge about the data and reduce the waste of computational resources during processing. Our experiments show that the developed algorithm was also able to perform in a multi-relational environment, while MR-Radix, which does not use the Templates technique, was not.
Multi-Relational Data Mining using Probabilistic Models Research Summary
2001
Abstract. We are often faced with the challenge of mining data represented in relational form. Unfortunately, most statistical learning methods work only with "flat" data representations. Thus, to apply these methods, we are forced to convert the data into a flat form, thereby not only losing its compact representation and structure but also potentially introducing statistical skew. These drawbacks severely limit the ability of current statistical methods to mine relational databases.
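The statistical skew mentioned in this abstract can be seen in a small sketch. The toy `customers` and `orders` relations and both helper functions are hypothetical, chosen only to show how an inner join repeats a parent row once per child row and drops parents that have no children.

```python
from collections import Counter

# Hypothetical toy data: one row per customer, many orders per customer.
customers = [
    {"id": 1, "segment": "retail"},
    {"id": 2, "segment": "retail"},
    {"id": 3, "segment": "wholesale"},
]
orders = [
    {"customer_id": 3, "amount": 40},
    {"customer_id": 3, "amount": 55},
    {"customer_id": 3, "amount": 70},
    {"customer_id": 1, "amount": 10},
]

def flatten(customers, orders):
    """Inner-join customers with orders into one flat table."""
    by_id = {c["id"]: c for c in customers}
    return [{**by_id[o["customer_id"]], **o} for o in orders]

def segment_share(rows):
    """Fraction of rows that fall in each segment."""
    counts = Counter(r["segment"] for r in rows)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}

flat = flatten(customers, orders)
true_share = segment_share(customers)  # computed per customer
flat_share = segment_share(flat)       # computed per joined row
```

In this toy case one customer in three is wholesale, yet three of the four joined rows are wholesale, and customer 2 vanishes from the flat table because it has no orders. A learner that treats each joined row as an independent example would see a distorted distribution.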
Confidence-based concept discovery in multi-relational data mining
Proceedings of the International …, 2008
Multi-relational data mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency to store data in relational databases. Several relational knowledge discovery systems have been developed employing various search strategies, heuristics, language pattern limitations and hypothesis evaluation criteria, in order to cope with an intractably large search space and to generate high-quality patterns. In this work, a new ILP-based concept discovery method is described in which user-defined specifications are relaxed. Moreover, this new method works directly on relational databases. In addition, a new confidence-based pruning is used in this technique. A set of experiments is conducted to test the performance of the new method.
Mining Frequent Patterns from Multi-Dimensional Relational Sequences
2007
Data mining algorithms look for patterns in data. While most existing data mining approaches look for patterns in a single data table, multi-relational data mining (MRDM) approaches look for patterns that involve multiple tables (relations) from a relational database. Mining data which consists of complex/structured objects also falls within the scope of this field, since the normalized representation of such objects in a relational database requires multiple tables. Following the mainstream of MRDM research, the most common types of patterns and approaches considered in data mining have been extended to the multi-relational case, and MRDM now encompasses relational association rule discovery, relational classification rules, relational decision and regression trees, and probabilistic relational models, among others. At the same time, MRDM methods have been successfully applied across many application areas, ranging from the analysis of business data, through bioinformatics and pharmacology, to Web mining and spatial data mining. Our goal is to bring together researchers and practitioners of data mining interested in methods for finding patterns in expressive languages from multi-relational / structured data and their applications. The workshop is the sixth of its kind. It follows the success of the workshops on Multi
FOIL-D: Efficiently Scaling FOIL for Multi-relational Data Mining of Large Datasets
Lecture Notes in Computer Science, 2004
Multi-relational rule mining is important for knowledge discovery in relational databases as it allows for discovery of patterns involving multiple relational tables. Inductive logic programming (ILP) techniques have had considerable success on a variety of multi-relational rule mining tasks; however, most ILP systems do not scale to very large datasets. In this paper we present two extensions to a popular ILP system, FOIL, that improve its scalability. (i) We show how to interface FOIL directly to a relational database management system. This enables FOIL to run on data sets that previously had been out of its scope. (ii) We describe estimation methods, based on histograms, that significantly decrease the computational cost of learning a set of rules. We present experimental results that indicate that on a set of standard ILP datasets, the rule sets learned using our extensions are equivalent to those learned with standard FOIL but at considerably less cost.
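For readers unfamiliar with FOIL, the cost the authors aim to reduce comes largely from scoring candidate literals with FOIL's information gain, which requires counting positive and negative bindings for each candidate. The sketch below shows only the standard gain formula; it is not the FOIL-D extension, and the histogram-based estimation described in the abstract is not reproduced here. The function name and the default `t = p1` (a common simplification when the added literal introduces no new variables) are assumptions of this example.

```python
import math

def foil_gain(p0, n0, p1, n1, t=None):
    """FOIL information gain for adding one literal to a rule.

    p0, n0: positive / negative bindings covered before adding the literal
            (p0 is assumed to be greater than zero).
    p1, n1: positive / negative bindings covered after adding it.
    t:      positive bindings of the old rule still covered by the new one;
            defaults to p1 (assumption, see lead-in).
    """
    if p1 == 0:
        return 0.0  # the literal excludes every positive binding
    if t is None:
        t = p1
    info_before = math.log2(p0 / (p0 + n0))
    info_after = math.log2(p1 / (p1 + n1))
    return t * (info_after - info_before)

# A literal that keeps 3 of 4 positives and removes all 4 negatives.
gain = foil_gain(p0=4, n0=4, p1=3, n1=0)
```

Each call needs the four counts, and obtaining them exactly means evaluating the candidate rule against the data; estimating such counts instead is where an approach like the one in the abstract could save work.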
Multi-relational Algorithm for Mining Association Rules in Large Databases
2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies, 2011
Multi-relational data mining enables pattern mining from multiple tables. Existing multi-relational algorithms for mining association rules are not able to process large volumes of data, because the amount of memory required exceeds the amount available. The proposed algorithm, MR-Radix, presents a framework that promotes the optimization of memory usage. It also uses the concept of partitioning to handle large volumes of data. The original contribution of this proposal is to enable superior performance compared with other related algorithms and, moreover, to successfully complete the task of mining association rules in large databases, bypassing the problem of available memory. One of the tests showed that MR-Radix uses fourteen times less memory than GFP-growth.