Machine Learning on Graph-Structured Data

Representation Learning on Graphs

Computer Communications and Networks, 2018

This dissertation covers two separate topics. The first studies a dynamic spectrum access algorithm that improves the throughput of satellite communication (SATCOM) in uncertain environments. The second investigates real-time distributed representation learning for streaming and complex networks.

1 Cognitive Satellite Communications

Dynamic spectrum access (DSA) allows a secondary user to access spectrum holes that are not occupied by primary users. However, DSA normally operates under uncertainty in a complex SATCOM environment, which can cause more spectrum sensing errors or even service disruption. DSA therefore requires a decision-making process to determine optimally which channels to sense and access. To this end, I propose a solution that addresses the uncertainty in SATCOM to maximize system throughput. Specifically, the DSA decision-making process is formulated as a Partially Observable Markov Decision Process (POMDP). Simulation results demonstrate the effectiveness of the proposed DSA strategy.

2 Distributed Real-time Representation Learning of Large Networks

Large-scale networks have attracted significant attention as a source from which hidden information can be extracted and analyzed. In particular, graph embedding learns representations of the original network in a lower-dimensional vector space while preserving as much as possible of the original structural information and the similarity among nodes. I propose a real-time distributed graph embedding algorithm (RTDGE) that embeds streaming graph data in a distributed fashion by combining a novel edge partition approach with an incremental negative sampling approach. Furthermore, a real-time distributed streaming data processing platform is prototyped on Kafka and Storm. On this platform, real-time Twitter network data can be retrieved, partitioned and processed for state-of-the-art tasks including synonymic user detection, community classification and visualization.

For complex knowledge graphs, existing works cannot capture the complex connection patterns and do not consider the impact of complicated relations, because the relationships are unquantifiable. In this dissertation, a novel hierarchical embedding algorithm is proposed that constructs a multi-layer graph to measure hierarchically both the structural similarities and the impact of relations. A representation learning model is then designed based on an entity's context, which is generated by taking random walks on the multi-layer content graph. Experimental results show that the proposed model outperforms state-of-the-art techniques.
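The abstract above generates an entity's context by taking random walks on a graph. The multi-layer construction itself is not described here, so the following is only a minimal sketch of the general idea on a plain single-layer graph: uniform random walks, followed by extraction of (center, context) pairs within a sliding window. The names `random_walk`, `context_pairs`, the toy adjacency dictionary and the window size are all illustrative assumptions, not the dissertation's method.

```python
import random


def random_walk(adj, start, length, rng):
    """Uniform random walk visiting up to `length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def context_pairs(walk, window):
    """(center, context) pairs for nodes at most `window` positions apart."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Toy undirected graph (hypothetical data, adjacency lists).
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
rng = random.Random(0)  # fixed seed so runs are repeatable

# One walk per start node; each walk becomes a "sentence" of node ids.
walks = [random_walk(adj, node, 5, rng) for node in adj]
pairs = [p for w in walks for p in context_pairs(w, window=2)]
```

Pairs of this kind are what a skip-gram style model would typically be trained on; the choice of walk length and window size is a tuning decision that the abstract does not specify.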

Learning Representations of Graph Data - A Survey

ArXiv, 2019

Deep Neural Networks have shown tremendous success in the areas of object recognition, image classification and natural language processing. However, designing optimal Neural Network architectures that can learn and output arbitrary graphs is an ongoing research problem. The objective of this survey is to summarize and discuss the latest advances in methods for learning representations of graph data. We start by identifying commonly used types of graph data and reviewing the basics of graph theory. This is followed by a discussion of the relationships between graph kernel methods and neural networks. Next we identify the major approaches used for learning representations of graph data, namely kernel approaches, convolutional approaches, graph neural network approaches, graph embedding approaches and probabilistic approaches. A variety of methods under each approach are discussed, and the survey concludes with a brief discussion of the future of learning representations of graph data.

Node Classification in Complex Social Graphs via Knowledge-Graph Embeddings and Convolutional Neural Network

Lecture Notes in Computer Science, 2020

The interactions between humans and their environment, comprising living and non-living entities, can be studied via Social Network Analysis (SNA). Node classification and community detection are still open research problems in SNA, which has made SNA an appealing domain in Artificial Intelligence (AI) research. Inherent facts about social network structures can be harnessed to train AI models that address node classification and community detection problems in SNA. Hence, crucial aspects such as the individual attributes of spatial social actors and the underlying patterns of relationship binding these social actors must be taken into consideration when analyzing the social network. These factors determine the nature and dynamics of a given social network. In this paper, we propose a framework, Representation Learning via Knowledge-Graph Embeddings and ConvNet (RLVECN), for studying and extracting meaningful facts from social network structures to aid in node classification as well as community detection tasks. Our proposition uses an edge sampling approach to exploit features of the social graph by learning the context of each actor with respect to neighboring actors/nodes, with the goal of generating a vector-space embedding per actor. Subsequently, these relatively low-dimensional vector embeddings are fed as input features to a downstream classifier for classification tasks on the social graph/network. RLVECN has been trained, tested, and evaluated on real-world social networks.

graph2vec: Learning Distributed Representations of Graphs

ArXiv, 2017

Recent works on representation learning for graph structured data predominantly focus on learning distributed representations of graph substructures such as nodes and subgraphs. However, many graph analytics tasks such as graph classification and clustering require representing entire graphs as fixed length feature vectors. While the aforementioned approaches are naturally unequipped to learn such representations, graph kernels remain the most effective way of obtaining them. However, these graph kernels use handcrafted features (e.g., shortest paths, graphlets, etc.) and hence are hampered by problems such as poor generalization. To address this limitation, in this work, we propose a neural embedding framework named graph2vec to learn data-driven distributed representations of arbitrarily sized graphs. graph2vec's embeddings are learnt in an unsupervised manner and are task agnostic. Hence, they could be used for any downstream task such as graph classification, clustering and...

DeepMap: Learning Deep Representations for Graph Classification

2020

Graph-structured data arise in many scenarios. A fundamental problem is to quantify the similarities of graphs for tasks such as classification. Graph kernels are positive-semidefinite functions that decompose graphs into substructures and compare them. One problem in the effective implementation of this idea is that the substructures are not independent, which leads to a high-dimensional feature space. In addition, graph kernels cannot capture the high-order complex interactions between vertices. To mitigate these two problems, we propose a framework called DeepMap to learn deep representations for graph feature maps. The learnt deep representation for a graph is a dense and low-dimensional vector that captures complex high-order interactions in a vertex neighborhood. DeepMap extends Convolutional Neural Networks (CNNs) to arbitrary graphs by aligning vertices across graphs and building the receptive field for each vertex. We empirically validate DeepMap on various graph classificati...

Learning Deep Graph Representations via Convolutional Neural Networks

IEEE Transactions on Knowledge and Data Engineering, 2021

Graph-structured data arise in many scenarios. A fundamental problem is to quantify the similarities of graphs for tasks such as classification. R-convolution graph kernels are positive-semidefinite functions that decompose graphs into substructures and compare them. One problem in the effective implementation of this idea is that the substructures are not independent, which leads to a high-dimensional feature space. In addition, graph kernels cannot capture the high-order complex interactions between vertices. To mitigate these two problems, we propose a framework called DEEPMAP to learn deep representations for graph feature maps. The learned deep representation for a graph is a dense and low-dimensional vector that captures complex high-order interactions in a vertex neighborhood. DEEPMAP extends Convolutional Neural Networks (CNNs) to arbitrary graphs by generating aligned vertex sequences and building the receptive field for each vertex. We empirically validate DEEPMAP on various graph classification benchmarks and demonstrate that it achieves state-of-the-art performance.
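To make the phrase "decompose graphs into substructures and compare them" concrete, here is a small sketch of one well-known kernel of this family, the Weisfeiler-Lehman subtree kernel. It is not DEEPMAP itself. Each graph is mapped to a count vector (its feature map) over refined vertex labels, and the kernel value is the dot product of two such vectors. The function names `wl_features` and `wl_kernel` and the toy graphs are assumptions made for illustration.

```python
from collections import Counter


def wl_features(adj, labels, iterations):
    """Feature map: counts of vertex labels over `iterations` WL refinements."""
    current = dict(labels)
    counts = Counter(current.values())
    for _ in range(iterations):
        # New label = own label plus the sorted multiset of neighbor labels.
        current = {
            v: (current[v], tuple(sorted((current[u] for u in adj[v]), key=repr)))
            for v in adj
        }
        counts.update(current.values())
    return counts


def wl_kernel(f, g):
    """Dot product of two sparse feature maps (hence positive semidefinite)."""
    return sum(f[k] * g[k] for k in f.keys() & g.keys())


# Two toy graphs with identical vertex labels (hypothetical data).
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
same_label = {0: "x", 1: "x", 2: "x"}

f_tri = wl_features(triangle, same_label, iterations=1)
f_path = wl_features(path, same_label, iterations=1)
print(wl_kernel(f_tri, f_tri), wl_kernel(f_tri, f_path), wl_kernel(f_path, f_path))
# prints: 18 12 14
```

The feature map grows with every distinct refined label, which illustrates the high-dimensionality issue the abstract raises: the count vector is sparse and its dimension is not fixed in advance.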

A Survey on Graph Representation Learning Methods

arXiv (Cornell University), 2022

Graph representation learning has been a very active research area in recent years. The goal of graph representation learning is to generate graph representation vectors that accurately capture the structure and features of large graphs. This is especially important because the quality of the graph representation vectors affects their performance in downstream tasks such as node classification, link prediction and anomaly detection. Many techniques have been proposed for generating effective graph representation vectors, and they generally fall into two categories: traditional graph embedding methods and graph neural network (GNN) based methods. These methods can be applied to both static and dynamic graphs. A static graph is a single fixed graph, while a dynamic graph evolves over time as nodes and edges are added or deleted. In this survey, we review the graph embedding methods in both traditional and GNN-based categories for both static and dynamic graphs and include the recent papers published until the time of submission. In addition, we summarize a number of limitations of GNNs and the proposed solutions to these limitations. Such a summary has not been provided in previous surveys. Finally, we explore some open and ongoing research directions for future work.

A Comprehensive Survey on Graph Neural Networks

IEEE Transactions on Neural Networks and Learning Systems, 2020

Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
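As a rough illustration of the convolutional category named in the taxonomy above, the sketch below implements one propagation step of a widely used graph convolution rule, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python. The abstract does not single out this rule, so treat it as one common instance instead of the survey's definition. The helper names `matmul` and `gcn_layer` and the toy inputs are assumptions.

```python
import math


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One step: ReLU(D^-1/2 (A + I) D^-1/2 . H . W) on dense lists."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, x) for x in row] for row in out]


# Two nodes joined by one edge, one input feature, two output features.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0, -1.0]]  # hypothetical fixed weights; normally learned
hidden = gcn_layer(adj, features, weights)
```

With these inputs every normalized entry is 0.5, so both nodes aggregate to the mean feature 2.0 before the weights and the ReLU are applied. Stacking several such layers lets information travel across longer paths in the graph.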

An Overview on Network Representation Learning

2021

Representation learning has proven its usefulness in many tasks such as image classification and text mining. The goal of network representation learning is to learn a distributed vector representation for each vertex in a network, which is increasingly recognised as an essential part of network analysis. Some network representation learning techniques learn from the network structure alone. In practice, however, the vertices of a network carry rich data (such as text) that cannot be used within the traditional algorithmic frameworks. By showing that DeepWalk, a state-of-the-art network representation method, is equivalent to matrix factorisation, we propose text-associated DeepWalk (TADW). Within the matrix factorisation framework, TADW incorporates the text features of vertices into network representation learning. We compare our method with different baseline methods by applying them to the multi-class classification of vertices. The experimental results show that our method outperforms other baselines on all three datas...

GRAPE for fast and scalable graph processing and random-walk-based embedding

Nature Computational Science

Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing...