Latent Feature Kernels for Link Prediction on Sparse Graphs
Related papers
Kernels for Link Prediction with Latent Feature Models
2011
Predicting new links in a network is a problem of interest in many application domains. Most prediction methods use information about the network's entities, such as node attributes, to build a model of links; network structure is usually not exploited except for networks with similarity or relatedness semantics. In this work, we use network structure for link prediction on a more general class of networks through latent feature models. The difficulty is that such models are hard to train directly on large data. We propose a kernel method that solves this problem by casting link prediction as a binary classification problem. The key idea is not to infer latent features explicitly but to represent them implicitly in the kernels, which makes the method scalable to large networks. In contrast to other methods based on latent feature models, ours inherits the advantages of the kernel framework: optimality, efficiency, and nonlinearity. We apply our method to real protein-protein interaction data to demonstrate its merits.
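As an illustration of the kernel idea described above (not the paper's own construction), the sketch below builds a node-level regularized Laplacian kernel from the adjacency matrix, lifts it to a symmetrized pairwise kernel over node pairs, and trains an SVM on observed links versus sampled non-links. The function names, the choice of kernel, and the beta parameter are assumptions made here for illustration.

import numpy as np
from sklearn.svm import SVC

def regularized_laplacian_kernel(A, beta=0.1):
    # Node-level kernel K = (I + beta * L)^-1, with L the graph Laplacian of A.
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.inv(np.eye(len(A)) + beta * L)

def pair_gram(K, pairs_a, pairs_b):
    # Symmetrized pairwise kernel between node pairs (u, v) and (s, t):
    # k((u, v), (s, t)) = K[u, s] * K[v, t] + K[u, t] * K[v, s]
    G = np.zeros((len(pairs_a), len(pairs_b)))
    for i, (u, v) in enumerate(pairs_a):
        for j, (s, t) in enumerate(pairs_b):
            G[i, j] = K[u, s] * K[v, t] + K[u, t] * K[v, s]
    return G

# A: adjacency matrix; train_pairs / labels: observed links and sampled non-links
# K = regularized_laplacian_kernel(A)
# clf = SVC(kernel="precomputed").fit(pair_gram(K, train_pairs, train_pairs), labels)
# scores = clf.decision_function(pair_gram(K, test_pairs, train_pairs))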
Latent linkage semantic kernels for collective classification of link data
Journal of Intelligent Information Systems, 2006
Generally, links among objects exhibit certain patterns and contain rich semantic clues. These clues can be used to improve classification accuracy. However, real-world link data may exhibit more complex regularities; for example, some links are noisy and carry no human editorial endorsement about semantic relationships. To effectively capture such regularity, this paper proposes latent linkage semantic kernels (LLSKs) by first introducing linkage kernels to model the local and global dependency structure of a link graph and then applying the singular value decomposition (SVD) in the kernel-induced space. For computational efficiency on large datasets, we also develop a block-based algorithm for LLSKs. A kernel-based contextual dependency network (KCDN) model is then presented to exploit the dependencies in a network of objects for collective classification. We provide experimental results showing that the KCDN model, together with LLSKs, is relatively robust on datasets with complex link regularity, and that the block-based computation method scales well with varying problem sizes.
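A minimal sketch of the two ingredients named above, under assumptions of this write-up rather than the paper's exact formulation: a linkage kernel that blends in-link and out-link co-occurrence structure, followed by a truncated eigendecomposition that projects the kernel onto its leading latent directions. The alpha blend, the rank, and the function names are illustrative choices.

import numpy as np

def linkage_kernel(A, alpha=0.5):
    # Blend bibliographic-coupling (A A^T) and co-citation (A^T A) structure.
    return alpha * (A @ A.T) + (1.0 - alpha) * (A.T @ A)

def latent_semantic_kernel(K, rank=50):
    # Keep only the top eigen-directions of the (symmetric) linkage kernel.
    vals, vecs = np.linalg.eigh(K)
    idx = np.argsort(vals)[::-1][:rank]
    return vecs[:, idx] @ np.diag(vals[idx]) @ vecs[:, idx].T

# K_latent = latent_semantic_kernel(linkage_kernel(A)) can then serve as the
# similarity matrix for a downstream collective classifier.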
Graph-based features for supervised link prediction
Neural Networks (IJCNN), …, 2011
The growing ubiquity of social networks has spurred research in link prediction, which aims to predict new connections based on existing ones in the network. The 2011 IJCNN Social Network Challenge asked participants to separate real edges from fake ones in a set of 8960 edges sampled from an anonymized, directed graph depicting a subset of relationships on Flickr. Our method incorporates 94 distinct graph features, used as input for classification with Random Forests. We present a three-pronged approach to the link prediction task, along with several novel variations on established similarity metrics, and discuss the challenges of processing a graph with more than a million nodes. We found that the best classification results were achieved by combining a large number of features that model different aspects of the graph structure. Our method achieved an area under the receiver operating characteristic (ROC) curve of 0.9695, the second-best overall score in the competition and the best score that did not de-anonymize the dataset.
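For a sense of what the feature-based pipeline looks like at a small scale (the paper itself computes 94 features on a directed graph with over a million nodes), the hypothetical sketch below derives a handful of classical similarity metrics per candidate edge with networkx and feeds them to a Random Forest. The specific features and the n_estimators setting are illustrative assumptions, not the authors' configuration.

import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def edge_features(G, pairs):
    # A few classical similarity scores per candidate edge (u, v).
    jac = {(u, v): p for u, v, p in nx.jaccard_coefficient(G, pairs)}
    aai = {(u, v): p for u, v, p in nx.adamic_adar_index(G, pairs)}
    pa = {(u, v): p for u, v, p in nx.preferential_attachment(G, pairs)}
    rows = []
    for u, v in pairs:
        common = len(list(nx.common_neighbors(G, u, v)))
        rows.append([jac[(u, v)], aai[(u, v)], pa[(u, v)], common])
    return np.array(rows)

# G: undirected training graph; train_pairs / labels: real vs. fake edges
# clf = RandomForestClassifier(n_estimators=500)
# clf.fit(edge_features(G, train_pairs), labels)
# scores = clf.predict_proba(edge_features(G, test_pairs))[:, 1]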
A Topical Graph Kernel for Link Prediction in Labeled Graphs
This paper proposes a solution to the problem of link prediction in labeled graphs that have additional text information associated with the nodes. By fitting a topic model on the text corpus and applying some further processing, we compute the topics of interest to each node. We propose a walk-based graph kernel that incorporates a node's interests and thus represents structural as well as textual information. We then predict the existence of unseen links using a kernelized SVM. Our experiments on an author citation network show that our method is effective and significantly outperforms a network-oriented approach.
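A hedged sketch of the combination described above, not the paper's actual kernel: topic distributions estimated with LDA weight the graph's edges by the topical similarity of their endpoints, and a von Neumann-style walk kernel over the weighted graph scores candidate links (or serves as the Gram matrix for a kernelized SVM). The n_topics and lam parameters, and the particular walk kernel, are assumptions.

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def topical_walk_kernel(A, node_texts, n_topics=20, lam=0.01):
    # Topic distribution per node from the text attached to it.
    X = CountVectorizer(stop_words="english").fit_transform(node_texts)
    theta = LatentDirichletAllocation(n_components=n_topics).fit_transform(X)
    sim = theta @ theta.T          # topical-interest similarity between nodes
    W = A * sim                    # topic-weighted adjacency
    # Von Neumann walk kernel: geometric sum over walks, discounted by length.
    # lam must be small enough that the series converges.
    return np.linalg.inv(np.eye(len(A)) - lam * W)

# K = topical_walk_kernel(A, node_texts); score a candidate link (u, v) by K[u, v]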
Learning spectral graph transformations for link prediction
Proceedings of the 26th Annual …, 2009
We present a unified framework for learning link prediction and edge weight prediction functions in large networks, based on the transformation of a graph's algebraic spectrum. Our approach generalizes several graph kernels and dimensionality reduction methods and provides a method to estimate their parameters efficiently. We show how the parameters of these prediction functions can be learned by reducing the problem to a one-dimensional regression problem whose runtime only depends on the method's reduced rank and that can be inspected visually. We derive variants that apply to undirected, weighted, unweighted, unipartite and bipartite graphs. We evaluate our method experimentally using examples from social networks, collaborative filtering, trust networks, citation networks, authorship graphs and hyperlink networks.
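The following sketch, under simplifying assumptions, shows the flavor of the framework: eigendecompose a source snapshot of the graph, project a target matrix of newer edges onto that eigenbasis, and fit a low-degree polynomial spectral transformation f(lambda) with ordinary least squares, i.e. the one-dimensional curve fit mentioned above. The rank, the polynomial form, and the function name are choices made here, not the paper's.

import numpy as np

def learn_spectral_transform(A_source, A_target, rank=64, degree=4):
    # Reduced-rank eigendecomposition of the source graph's adjacency matrix.
    vals, U = np.linalg.eigh(A_source)
    idx = np.argsort(np.abs(vals))[::-1][:rank]
    vals, U = vals[idx], U[:, idx]
    # Project the target (newer edges) onto the source eigenbasis.
    t = np.array([U[:, k] @ A_target @ U[:, k] for k in range(rank)])
    # One-dimensional regression: fit f(lambda) = sum_j c_j * lambda^j to (vals, t).
    V = np.vander(vals, degree + 1)
    coeffs, *_ = np.linalg.lstsq(V, t, rcond=None)
    return U, np.polyval(coeffs, vals)

# U, f_vals = learn_spectral_transform(A_old, A_new)
# prediction matrix: U @ np.diag(f_vals) @ U.T scores unseen edges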
A Latent Space Mapping for Link Prediction
Network modeling can be approached using either discriminative or probabilistic models. In the task of link prediction a probabilistic model will give a probability for the existence of a link; while in some scenarios this may be beneficial, in others a hard discriminative boundary needs to be set. Hence the use of a discriminative classifier is preferable. In domains such as image analysis and speaker recognition, probabilistic models have been used as a mechanism from which features can be extracted. This paper examines using a probabilistic model built on the entire graph to extract features to predict the existence of unknown links between two nodes. It demonstrates how features extracted from the model as well as the predicted probability of a link existing can aid the classification process.
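A minimal sketch of the feature-extraction idea, assuming a simple latent-space model (a truncated SVD of the adjacency matrix) as a stand-in for the paper's probabilistic graph model: the model's node embeddings and its reconstructed link score for a pair become inputs to a discriminative classifier. The rank and the particular feature construction are assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def latent_pair_features(A, pairs, rank=32):
    # Latent node positions from a rank-limited factorization of the graph.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Z = U[:, :rank] * np.sqrt(s[:rank])
    rows = []
    for u, v in pairs:
        score = Z[u] @ Z[v]                    # the model's link score for (u, v)
        rows.append(np.concatenate([np.abs(Z[u] - Z[v]), [score]]))
    return np.array(rows)

# clf = LogisticRegression(max_iter=1000)
# clf.fit(latent_pair_features(A, train_pairs), labels)
# probs = clf.predict_proba(latent_pair_features(A, test_pairs))[:, 1]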
Link classification with probabilistic graphs
Journal of Intelligent Information Systems, 2014
Collective prediction with latent graphs
Proceedings of the 20th ACM international …, 2011
Collective classification in relational data has become an important and active research topic in the last decade. It exploits the dependencies of instances in a network to improve predictions. Related applications include hyperlinked document classification, social network analysis and collaboration network analysis. Most traditional collective classification models study the scenario in which a large number of labeled examples (labeled nodes) is available. However, in many real-world applications, labeled data are extremely difficult to obtain. For example, in network intrusion detection, there may be only a limited number of identified intrusions whereas there is a huge set of unlabeled nodes. In this situation, most of the data have no connection to labeled nodes; hence, no supervision knowledge can be obtained from the local connections. In this paper, we propose to explore various latent linkages among the nodes and judiciously integrate the linkages to generate a latent graph. This is achieved by finding a graph that maximizes the linkages among the training data with the same label and maximizes the separation among the data with different labels. The objective is cast as an optimization problem and solved with quadratic programming. Finally, we apply label propagation on the latent graph to make predictions. Experiments show that the proposed model, LNP (Latent Network Propagation), can improve learning accuracy significantly. For instance, when only 10% of the examples are labeled, the accuracies of all the comparison models are below 63%, while that of the proposed model is 74%.
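To make the pipeline concrete, the rough sketch below combines several candidate linkage graphs with fixed weights (the paper learns these weights with quadratic programming) and then runs iterative label propagation on the resulting latent graph. The alpha damping factor, the normalization, and the fixed weights are simplifications introduced here.

import numpy as np

def propagate(W, y, labeled_mask, alpha=0.9, iters=50):
    # Iterative label propagation on graph W: F <- alpha * S @ F + (1 - alpha) * Y.
    d = W.sum(axis=1)
    S = W / (np.sqrt(np.outer(d, d)) + 1e-12)  # symmetric normalization
    Y = np.zeros((len(y), int(y.max()) + 1))
    Y[labeled_mask, y[labeled_mask]] = 1.0     # seed the labeled nodes
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F.argmax(axis=1)

# W_latent = sum(w_k * W_k for w_k, W_k in zip(weights, linkage_graphs))
# preds = propagate(W_latent, y, labeled_mask)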
Graph Kernel-Based Learning for Gene Function Prediction from Gene Interaction Network
2007
Prediction of gene functions is a major challenge to biologists in the post-genomic era. Interactions between genes and their products compose networks that can be used to infer gene functions. Most previous studies used heuristic approaches based on either local or global information of gene interaction networks to assign unknown gene functions. In this study, we propose a graph kernel-based method that captures the structure of gene interaction networks to predict gene functions. We conducted an experimental study on a test-bed of P53-related genes. The experimental results demonstrate that our proposed method performs better than the baseline methods.
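As a generic illustration of graph kernel-based node classification (not necessarily the kernel used in the paper), the sketch below builds a diffusion kernel over the gene interaction network and trains a kernel SVM to assign functional categories; the beta parameter and the choice of diffusion kernel are assumptions.

import numpy as np
from scipy.linalg import expm
from sklearn.svm import SVC

def diffusion_kernel(A, beta=0.5):
    # K = exp(-beta * L), with L the Laplacian of the interaction network.
    L = np.diag(A.sum(axis=1)) - A
    return expm(-beta * L)

# K = diffusion_kernel(A)
# clf = SVC(kernel="precomputed")
# clf.fit(K[np.ix_(train_idx, train_idx)], train_labels)
# preds = clf.predict(K[np.ix_(test_idx, train_idx)])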
Link Prediction based on Deep Latent Feature Model by Fusion of Network Hierarchy Information
Tehnicki vjesnik - Technical Gazette, 2020
Link prediction aims to predict latent edges from the existing network structure and has become one of the hot topics in complex network research. Latent feature models used for link prediction project the original network directly into a latent space. However, traditional latent feature models cannot fully characterize the deep structural information of complex networks, so their prediction ability on sparse networks is limited. To address these problems, we propose a novel link prediction model based on a deep latent feature model built with Deep Non-negative Matrix Factorization (DNMF). The DNMF method obtains more comprehensive network structure information through multi-layer factorization. Experiments on ten typical real networks show that the proposed method outperforms state-of-the-art link prediction methods.
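A very rough sketch of multi-layer (deep) non-negative matrix factorization for link scoring, assuming greedy layer-wise factorization with scikit-learn's NMF; the paper's DNMF additionally fuses network hierarchy information and fine-tunes the layers jointly, which is omitted here, and the layer sizes are illustrative.

import numpy as np
from sklearn.decomposition import NMF

def deep_nmf_scores(A, layer_sizes=(128, 64, 32)):
    # Greedy layer-wise factorization: each layer factorizes the previous
    # layer's coefficient matrix H, so A ~= W1 @ W2 @ ... @ Wk @ Hk.
    Ws, H = [], A.astype(float)
    for r in layer_sizes:
        model = NMF(n_components=r, init="nndsvda", max_iter=500)
        W = model.fit_transform(H)             # H ~= W @ model.components_
        Ws.append(W)
        H = model.components_
    recon = Ws[0]
    for W in Ws[1:]:
        recon = recon @ W
    return recon @ H                           # higher score => more likely link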