Unsupervised and Supervised Pattern Learning in Graph Data

Learning from Supervised Graphs

Studies in Computational Intelligence, 2007

We describe an approach to learning patterns in relational data represented as a graph. The approach, implemented in the Subdue system, searches for patterns that maximally compress the input graph. Subdue can be used for supervised learning, as well as unsupervised pattern discovery and clustering. Mining graph-based data raises challenges not found in linear attribute-value data, and additional requirements can further complicate the problem. In particular, we describe how concepts can be learned from training examples that are embedded in a single connected graph, or supervised graph. We demonstrate the technique using data from a NASA SST domain as well as a homeland security domain.
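As a rough illustration of what "maximally compress the input graph" can mean, the sketch below scores a candidate pattern by a size-based ratio, size(G) / (size(S) + size(G|S)), where size counts vertices plus edges. This is a simplified stand-in under stated assumptions, not Subdue's actual implementation: the function name `compression_value` is hypothetical, instances are assumed not to overlap, and each instance is assumed to collapse into a single placeholder vertex.

```python
def compression_value(g_vertices, g_edges, s_vertices, s_edges, instances):
    """Score a pattern S by how much it shrinks graph G (higher is better).

    Assumptions (illustrative only): size = vertices + edges, instances of S
    do not overlap, and each instance is replaced by one placeholder vertex.
    """
    size_g = g_vertices + g_edges
    size_s = s_vertices + s_edges
    # Each instance turns s_vertices vertices into 1 and drops its s_edges edges.
    compressed_vertices = g_vertices - instances * (s_vertices - 1)
    compressed_edges = g_edges - instances * s_edges
    size_g_given_s = compressed_vertices + compressed_edges
    return size_g / (size_s + size_g_given_s)


# A triangle pattern (3 vertices, 3 edges) in a graph with 20 vertices and 25 edges.
frequent = compression_value(20, 25, 3, 3, instances=4)
rare = compression_value(20, 25, 3, 3, instances=1)
print(frequent > rare)
```

A value above 1 means that describing the pattern once and then the compressed graph is smaller than the original graph; a pattern that occurs only once falls below 1 here, so it would not be preferred.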

Eighth workshop on mining and learning with graphs

2011

The Eighth Workshop on Mining and Learning with Graphs (MLG) was held at KDD 2010 in Washington DC. It brought together a variety of researchers interested in analyzing data that is best represented as a graph. Examples include the WWW, social networks, biological networks, communication networks, and many others. The importance of being able to effectively mine and learn from such data is growing, as more and more structured and semi-structured data is becoming available.

Graph-based Mining of Complex Data

Advanced Information and Knowledge Processing

We describe an approach to learning patterns in relational data represented as a graph. The approach, implemented in the Subdue system, searches for patterns that maximally compress the input graph. Subdue can be used for supervised learning, as well as unsupervised pattern discovery and clustering. Mining graph-based data raises challenges not found in linear attribute-value data, and additional requirements can further complicate the problem. In particular, we describe how Subdue can incrementally process structured data that arrives as a stream. We also employ these techniques to learn structural concepts from examples embedded in a single large connected graph.

Graph-based data mining

IEEE Intelligent Systems, 2000

The large amount of data collected today is quickly overwhelming researchers' abilities to interpret the data and discover interesting patterns in it. In response to this problem, researchers have developed techniques and systems for discovering concepts in databases [1-3]. Much of the collected data, however, has an explicit or implicit structural component (spatial or temporal), which few discovery systems are designed to handle [4]. So, in addition to the need to accelerate data mining of large databases, there is an urgent need to develop scalable tools for discovering concepts in structural databases. One method for discovering knowledge in structural data is the identification of common substructures within the data. Substructure discovery is the process of identifying concepts describing interesting and repetitive substructures within structural data. The discovered substructure concepts allow abstraction from the detailed data structure and provide relevant attributes for interpreting the data. The substructure discovery method is the basis of Subdue, which performs data mining on databases represented as graphs. The system performs two key data-mining techniques: unsupervised pattern discovery and supervised concept learning from examples. Our test applications have demonstrated the scalability and effectiveness of these techniques on a variety of structural databases.
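To make "repetitive substructures" concrete, the following minimal sketch counts the smallest possible substructures in a labeled graph: single edges described by (source label, edge label, target label). It is only an illustrative first step under assumed names (`count_edge_patterns`, the toy labels), not the discovery procedure used by Subdue, which grows and evaluates larger substructures.

```python
from collections import Counter


def count_edge_patterns(vertex_labels, edges):
    """Count how often each (source label, edge label, target label) triple occurs."""
    counts = Counter()
    for source, edge_label, target in edges:
        counts[(vertex_labels[source], edge_label, vertex_labels[target])] += 1
    return counts


# Toy graph (hypothetical data): vertex ids map to shape labels.
vertex_labels = {1: "triangle", 2: "square", 3: "triangle", 4: "square", 5: "circle"}
edges = [(1, "on", 2), (3, "on", 4), (5, "on", 1)]

patterns = count_edge_patterns(vertex_labels, edges)
most_common_pattern, frequency = patterns.most_common(1)[0]
print(most_common_pattern, frequency)
```

Frequent single-edge patterns like these could serve as seeds that a discovery system extends one edge at a time.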

Efficient Mining of Graph-Based Data

International Conference on Artificial Intelligence, 2000

With the increasing amount of structural data being collected, there arises a need to efficiently mine information from this type of data. The goal of this research is to provide a system that performs data mining on structural data represented as a labeled graph. We demonstrate how the graph-based discovery system Subdue can be used to perform

Structural correlation pattern mining for large graphs

Proceedings of the Eighth Workshop on Mining and Learning with Graphs - MLG '10, 2010

In this paper we define the Structural Correlation Pattern (SCP) mining problem, which consists of determining correlations among vertex attributes and dense components in an undirected graph. Vertex attributes play an important role in several real-life graphs and SCPs help to understand how they relate to the associated graph topology. SCPs may describe, for example, interesting relationships between personal characteristics and the community structure in social networks. We also propose an efficient algorithm, called SCORP, to extract SCPs from large graphs, and compare it against a naive approach for SCP mining, demonstrating its scalability and efficiency. We also discuss the application of SCORP to two actual scenarios, co-authorship networks and social music discovery, showing relevant results that demonstrate the applicability of the proposed approach.
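The idea of correlating vertex attributes with dense components can be sketched as follows: take the vertices that carry a given attribute set, look at the subgraph they induce, and measure the fraction of them that sit in a dense component. This is a simplified illustration, not the SCORP algorithm: as a loudly labeled assumption, "dense component" is approximated here by membership in a triangle, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def structural_correlation(adjacency, attributes, attribute_set):
    """Fraction of vertices carrying attribute_set that lie in a triangle
    of the subgraph induced by those vertices (triangle = assumed density proxy)."""
    carriers = {v for v, attrs in attributes.items() if attribute_set <= attrs}
    if not carriers:
        return 0.0
    # Restrict each carrier's neighborhood to other carriers.
    induced = {v: adjacency.get(v, set()) & carriers for v in carriers}
    in_dense = set()
    for v in carriers:
        for u, w in combinations(induced[v], 2):
            if w in induced[u]:
                in_dense.update((v, u, w))
    return len(in_dense) / len(carriers)


# Undirected toy graph: a triangle a-b-c plus a tail c-d-e.
edge_list = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
adjacency = {}
for u, v in edge_list:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

attributes = {"a": {"jazz"}, "b": {"jazz"}, "c": {"jazz"}, "d": {"jazz"}, "e": {"rock"}}
jazz_score = structural_correlation(adjacency, attributes, {"jazz"})
print(jazz_score)
```

In this toy graph three of the four "jazz" vertices form a triangle, so the attribute is strongly tied to the dense part of the graph, while "rock" is not.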

Machine Learning on Graph-Structured Data

Anais Estendidos do XXXVI Simpósio Brasileiro de Banco de Dados (SBBD Estendido 2021), 2021

Several real-world complex systems have graph-structured data, including social networks, biological networks, and knowledge graphs. A continuous increase in the quantity and quality of these graphs demands learning models to unlock the potential of this data and execute tasks, including node classification, graph classification, and link prediction. This tutorial presents machine learning on graphs, focusing on how representation learning - from traditional approaches (e.g., matrix factorization and random walks) to deep neural architectures - fosters carrying out those tasks. We also introduce representation learning over dynamic and knowledge graphs. Lastly, we discuss open problems, such as scalability and distributed network embedding systems.
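As a small illustration of the random-walk family of representation learning mentioned above, the sketch below generates uniform random walks over a graph. In such approaches the walks are typically treated like sentences and passed to a word-embedding model, a step omitted here. The function name `random_walks`, its parameters, and the toy graph are assumptions for illustration and are not taken from the tutorial.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks; adjacency maps each node to a neighbor list."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy undirected graph stored as neighbor lists.
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = random_walks(adjacency, walk_length=5, walks_per_node=2)
print(len(walks), walks[0])
```

Nodes that co-occur often within these walks are the ones an embedding model would place close together.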

Graph-Based Structural Pattern Learning

2006


Graph Mining : An Overview

2009

In the early years of data mining and knowledge discovery in databases, method development focused on rigidly and plainly structured data. Most often efforts were even confined to data that can be represented as a simple table, which describes a set of sample cases by attribute-value pairs. Recent years, however, have seen a constantly growing interest in the analysis of more complex data, with a less rigid and/or more sophisticated structure.

Iterative Structure Discovery in Graph-Based Data

International Journal on Artificial Intelligence Tools, 2005

Much of current data mining research is focused on discovering sets of attributes that discriminate data entities into classes, such as shopping trends for a particular demographic group. In contrast, we are working to develop data mining techniques to discover patterns consisting of complex relationships between entities. Our research is particularly applicable to domains in which the data is event-driven or relationally structured. In this paper we present approaches to address two related challenges: the need to assimilate incremental data updates and the need to mine monolithic datasets. Many realistic problems are continuous in nature and therefore require a data mining approach that can evolve discovered knowledge over time. Similarly, many problems present data sets that are too large to fit into dynamic memory on conventional computer systems. We address incremental data mining by introducing a mechanism for summarizing discoveries from previous data increments so that the g...