Heterogeneous Hypergraph Embedding for Graph Classification (original) (raw)

A Survey on Hypergraph Representation Learning

ACM Computing Surveys

Hypergraphs have attracted increasing attention in recent years thanks to their flexibility in naturally modeling a broad range of systems where high-order relationships exist among their interacting parts. This survey reviews the newly born hypergraph representation learning problem, whose goal is to learn a function to project objects - most commonly nodes - of an input hyper-network into a latent space such that both the structural and relational properties of the network can be encoded and preserved. We provide a thorough overview of existing literature and offer a new taxonomy of hypergraph embedding methods by identifying three main families of techniques, i.e., spectral, proximity-preserving, and (deep) neural networks. For each family, we describe its characteristics and our insights in a single yet flexible framework and then discuss the peculiarities of individual methods, as well as their pros and cons. We then review the main tasks, datasets, and settings in which hyperg...

Heterogeneous Network Representation Learning: A Unified Framework With Survey and Benchmark

IEEE Transactions on Knowledge and Data Engineering, 2020

Since real-world objects and their interactions are often multimodal and multi-typed, heterogeneous networks have been widely used as a more powerful, realistic, and generic superclass of traditional homogeneous networks (graphs). Meanwhile, representation learning (a.k.a. embedding) has recently been intensively studied and shown effective for various network mining and analytical tasks. Since there has already been a broad body of heterogeneous network embedding (HNE) algorithms but no dedicated survey, as the first contribution of this work, we pioneer in providing a unified paradigm for the systematic categorization and analysis over the merits of various existing HNE algorithms. Moreover, existing HNE algorithms, though mostly claimed generic, are often evaluated on different datasets. Understandable due to the natural application favor of HNE, such indirect comparisons largely hinder the proper attribution of improved task performance towards effective data preprocessing and novel technical design, especially considering the various ways possible to construct a heterogeneous network from real-world application data. Therefore, as the second contribution, we create four benchmark datasets with various properties regarding scale, structure, attribute/label availability, and etc. from different sources, towards the comprehensive evaluation of HNE algorithms. As the third contribution, we carefully refactor and amend the implementations of and create friendly interfaces for ten popular HNE algorithms, and provide all-around comparisons among them over multiple tasks and experimental settings.

Efficient and Effective Edge-wise Graph Representation Learning

Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Graph representation learning (GRL) is a powerful tool for graph analysis, which has gained massive attention from both academia and industry due to its superior performance in various real-world applications. However, the majority of existing works for GRL are dedicated to node-based tasks and thus focus on producing node representations. Despite such methods can be used to derive edge representations by regarding edges as nodes, they suffer from sub-par result utility in practical edge-wise applications, such as financial fraud detection and review spam combating, due to neglecting the unique properties of edges and their inherent drawbacks. Moreover, to our knowledge, there is a paucity of research devoted to edge representation learning. These methods either require high computational costs in sampling random walks or yield severely compromised representation quality because of falling short of capturing high-order information between edges. To address these challenges, we present TER and AER, which generate high-quality edge representation vectors based on the graph structure surrounding edges and edge attributes, respectively. In particular, TER can accurately encode high-order proximities of edges into low-dimensional vectors in a practically efficient and theoretically sound way, while AER augments edge attributes through a carefullydesigned feature aggregation scheme. Our extensive experimental study demonstrates that the combined edge representations of TER and AER can achieve significantly superior performance in terms of edge classification on 8 real-life datasets, while being up to one order of magnitude faster than 16 baselines on large graphs. CCS CONCEPTS • Computing methodologies → Supervised learning by classification; Factorization methods; • Mathematics of computing → Graph algorithms.

Uniting Heterogeneity, Inductiveness, and Efficiency for Graph Representation Learning

ArXiv, 2021

With the ubiquitous graph-structured data in various applications, models that can learn compact but expressive vector representations of nodes have become highly desirable. Recently, bearing the message passing paradigm, graph neural networks (GNNs) have greatly advanced the performance of node representation learning on graphs. However, a majority class of GNNs are only designed for homogeneous graphs, leading to inferior adaptivity to the more informative heterogeneous graphs with various types of nodes and edges. Also, despite the necessity of inductively producing representations for completely new nodes (e.g., in streaming scenarios), few heterogeneous GNNs can bypass the transductive learning scheme where all nodes must be known during training. Furthermore, the training efficiency of most heterogeneous GNNs has been hindered by their sophisticated designs for extracting the semantics associated with each meta path or relation. In this paper, we propose wide and deep message ...

An Overview on Network Representation Learning

2021

Representation learning has proven its usefulness in many activities such as photography and text mining. The goal of network representation learning is to learn distributed vector representation for each vertex in the networks, an essential feature of network analysis is now increasingly recognised. Some techniques of network representation research network systems for learning. In effect, vertices of the network contain rich data (such as text), that cannot be used with the traditional algorithmic frameworks. We suggest DeepWalk in text-associated form, by showing that DeepWalk, a high-tech network representation solution, is equal to matrix factorisation (TADW). In the context of matrix factorisation, TADW introduce text features of vertices in network representation research. Through applying them to the multi classifying of vertices, we compare our system and different baseline methods.The experimental results show that, our method outperforms other baselines on all three datas...

Exploiting Relational Information in Social Networks using Geometric Deep Learning on Hypergraphs

Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval

Online social networks are constituted by a diverse set of entities including users, images and posts which makes the task of predicting interdependencies between entities challenging. We need a model that transfers information from a given type of relations between entities to predict other types of relations, irrespective of the type of entity. In order to devise a generic framework, one needs to capture the relational information between entities without any entity dependent information. However, there are two challenges: (a) a social network has an intrinsic community structure. In these communities, some relations are much more complicated than pairwise relations, thus cannot be simply modeled by a graph; (b) there are different types of entities and relations in a social network, taking into account all of them makes it difficult to formulate a model. In this paper, we claim that representing social networks using hypergraphs improves the task of predicting missing information about an entity by capturing higher-order relations. We study the behavior of our method by performing experiments on CLEF dataset consisting of images from Flickr, an online photo sharing social network. CCS CONCEPTS • Computing methodologies → Multi-task learning; Neural networks;

A Survey on Graph Representation Learning Methods

arXiv (Cornell University), 2022

Graphs representation learning has been a very active research area in recent years. The goal of graph representation learning is to generate graph representation vectors that capture the structure and features of large graphs accurately. This is especially important because the quality of the graph representation vectors will affect the performance of these vectors in downstream tasks such as node classification, link prediction and anomaly detection. Many techniques have been proposed for generating effective graph representation vectors, which generally fall into two categories: traditional graph embedding methods and graph neural nets (GNN) based methods. These methods can be applied to both static and dynamic graphs. A static graph is a single fixed graph, while a dynamic graph evolves over time and its nodes and edges can be added or deleted from the graph. In this survey, we review the graph embedding methods in both traditional and GNN-based categories for both static and dynamic graphs and include the recent papers published until the time of submission. In addition, we summarize a number of limitations of GNNs and the proposed solutions to these limitations. Such a summary has not been provided in previous surveys. Finally, we explore some open and ongoing research directions for future work.

Network representation learning: models, methods and applications

SN Applied Sciences, 2019

With the rise of large-scale social networks, network mining has become an important sub-domain of data mining. Generating an efficient network representation is one important challenge in applying machine learning to network data. Recently, representation learning methods are widely used in various domains to generate low dimensional latent features from complex high dimensional data. A significant amount of research effort is made in the past few years to generate node representations from graph-structured data using representation learning methods. Here, we provide a detailed study of the latest advancements in the field of network representation learning (also called network embedding). We first discuss the basic concepts and models of network embedding. Further, we build a taxonomy of network embedding methods based on the type of networks and review the major research works that come under each category. We then cover the major datasets used in network embedding research and describe the major applications of network embedding with respect to various network mining tasks. Finally, we provide various directions for future work which enhance further research.

Representation Learning on Multi-layered Heterogeneous Network

Machine Learning and Knowledge Discovery in Databases. Research Track

Network data can often be represented in a multi-layered structure with rich semantics. One example is e-commerce data, containing user-user social network layer and item-item context layer, with cross-layer user-item interactions. Given the dual characters of homogeneity within each layer and heterogeneity across layers, we seek to learn node representations from such a multilayered heterogeneous network while jointly preserving structural information and network semantics. In contrast, previous works on network embedding mainly focus on single-layered or homogeneous networks with one type of nodes and links. In this paper we propose intra-and cross-layer proximity concepts. Intralayer proximity simulates propagation along homogeneous nodes to explore latent structural similarities. Cross-layer proximity captures network semantics by extending heterogeneous neighborhood across layers. Through extensive experiments on four datasets, we demonstrate that our model achieves substantial gains in different real-world domains over state-of-the-art baselines.

Network Embedding: An Overview

2019

Networks are one of the most powerful structures for modeling problems in the real world. Downstream machine learning tasks defined on networks have the potential to solve a variety of problems. With link prediction, for instance, one can predict whether two persons will become friends on a social network. Many machine learning algorithms, however, require that each input example is a real vector. Network embedding encompasses various methods for unsupervised, and sometimes supervised, learning of feature representations of nodes and links in a network. Typically, embedding methods are based on the assumption that the similarity between nodes in the network should be reflected in the learned feature representations. In this paper, we review significant contributions to network embedding in the last decade. In particular, we look at four methods: Spectral Clustering, DeepWalk, Large-scale Information Network Embedding (LINE), and node2vec. We describe each method and list its advanta...