The maximum capability of a topological feature in link prediction

Link Prediction in Social Networks using Computationally Efficient Topological Features

The Third IEEE International Conference on Social Computing (SocialCom2011), MIT, Boston, USA

Online social networking sites have become increasingly popular over the last few years. As a result, new interdisciplinary research directions have emerged in which social network analysis methods are applied to networks containing hundreds of millions of users. Unfortunately, links between individuals may be missing either due to imperfect acquisition processes or because they are not yet reflected in the online network (i.e., friends in the real world did not form a virtual connection). Existing link prediction techniques lack the scalability required for full application on a continuously growing social network. The primary bottleneck in link prediction techniques is extracting the structural features required for classifying links. In this paper we propose a set of simple, easy-to-compute structural features that can be analyzed to identify missing links. We show that by using simple structural features, a machine learning classifier can successfully identify missing links, even when applied to the hard problem of classifying links between individuals with at least one common friend. A new friends measure that we developed is shown to be a good predictor for missing links. An evaluation experiment was performed on five large social network datasets: Facebook, Flickr, YouTube, Academia and TheMarker. Our methods can provide social network site operators with the capability of helping users to find known, offline contacts and to discover new friends online. They may also be used for exposing hidden links in an online social network.

Common neighbours and the local-community-paradigm for topological link prediction in bipartite networks

New Journal of Physics, 2015

Bipartite networks are powerful descriptions of complex systems characterized by two different classes of nodes and connections allowed only across but not within the two classes. Unveiling physical principles, building theories and suggesting physical models to predict bipartite links such as product-consumer connections in recommendation systems or drug-target interactions in molecular networks can provide priceless information to improve e-commerce or to accelerate pharmaceutical research. The prediction of nonobserved connections starting from those already present in the topology of a network is known as the link-prediction problem. It represents an important subject both in many-body interaction theory in physics and in new algorithms for applied tools in computer science. The rationale is that the existing connectivity structure of a network can suggest where new connections can appear with higher likelihood in an evolving network, or where nonobserved connections are missing in a partially known network. Surprisingly, current complex network theory presents a theoretical bottleneck: a general framework for local-based link prediction directly in the bipartite domain is missing. Here, we overcome this theoretical obstacle and present a formal definition of common neighbour index (CN) and local-community-paradigm (LCP) for bipartite networks. As a consequence, we are able to introduce the first node-neighbourhood-based and LCP-based models for topological link prediction that utilize the bipartite domain. We performed link prediction evaluations in several networks of different size and of disparate origin, including technological, social and biological systems. Our models significantly improve topological prediction in many bipartite networks because they exploit local physical driving-forces that participate in the formation and organization of many real-world bipartite networks. 
Furthermore, we present a local-based formalism that makes it possible to implement neighbourhood-based link prediction intuitively and entirely in the bipartite domain.

Link Prediction in Human Complex Network Based on Random Walk with Global Topological Features

IETI Transactions on Data Analysis and Forecasting (iTDAF)

Link Prediction in Human Complex Networks aims to predict the missing, deleted, or future link formations. These complex networks are represented graphically, consisting of nodes and links, also referred to as vertices and edges, respectively. We employ Link Prediction techniques on four different human-related networks to determine the most effective methods in the Human Complex domain. The techniques utilized are similarity-based, primarily focused on determining the similarity score of each network. We select four algorithms that demonstrated superior results in other complex networks and implement them on human-related networks. Our goal is to predict links that have been removed from the network in order to evaluate the prediction accuracy of the applied techniques. To accomplish this, we convert the datasets into adjacency matrices and divide them into training and probe sets. The training session is then conducted, followed by the testing of the data. The selected techniques ...
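The evaluation procedure described above (hide some links, score the unlinked pairs on what remains, and check whether the hidden links rank highly) can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the common-neighbours score, the helper names `split_edges`, `adjacency` and `common_neighbours`, the 20% probe fraction, and the toy graph are all assumptions, since the abstract is truncated before it names the four algorithms.

```python
import random
from itertools import combinations


def split_edges(edges, probe_fraction=0.2, seed=0):
    """Randomly hide a fraction of edges as the probe set (assumed ratio)."""
    rng = random.Random(seed)
    shuffled = sorted(edges)
    rng.shuffle(shuffled)
    n_probe = max(1, int(len(shuffled) * probe_fraction))
    return shuffled[n_probe:], shuffled[:n_probe]  # training, probe


def adjacency(nodes, edges):
    """Build an undirected adjacency map from the training edges only."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbours(adj, u, v):
    """Similarity score: number of neighbours shared by u and v."""
    return len(adj[u] & adj[v])


# Toy undirected graph (hypothetical data), each edge stored as (small, large).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (2, 4), (4, 5)]
nodes = sorted({n for e in edges for n in e})

train, probe = split_edges(edges)
adj = adjacency(nodes, train)

# Score every pair that is not linked in the training graph, best first.
candidates = [(u, v) for u, v in combinations(nodes, 2) if v not in adj[u]]
ranked = sorted(candidates, key=lambda p: common_neighbours(adj, *p), reverse=True)
```

A method does well under this setup when the pairs in `probe` appear near the top of `ranked`; any other similarity index can be swapped in for `common_neighbours` without changing the rest of the sketch.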

Link Prediction Using Supervised Machine Learning based on Aggregated and Topological Features

ArXiv, 2020

Link prediction is an important task in social network analysis. There are different characteristics (features) in a social network that can be used for link prediction. In this paper, we evaluate the effectiveness of aggregated features and topological features in link prediction using supervised learning. The aggregated features, in a social network, are some aggregation functions of the attributes of the nodes. Topological features describe the topology or structure of a social network, and its underlying graph. We evaluated the effectiveness of these features by measuring the performance of different supervised machine learning methods. Specifically, we selected five well-known supervised methods including J48 decision tree, multi-layer perceptron (MLP), support vector machine (SVM), logistic regression and Naive Bayes (NB). We measured the performance of these five methods with different sets of features of the DBLP Dataset. Our results indicate that the combination of aggregat...

Discriminative Topological Features Reveal Biological Network Mechanisms

2004

Background: Recent genomic and bioinformatic advances have motivated the development of numerous network models intending to describe graphs of biological, technological, and sociological origin. In most cases the success of a model has been evaluated by how well it reproduces a few key features of the real-world data, such as degree distributions, mean geodesic lengths, and clustering coefficients. Often pairs of models can reproduce these features with indistinguishable fidelity despite being generated by vastly different mechanisms. In such cases, these few target features are insufficient to distinguish which of the different models best describes real world networks of interest; moreover, it is not clear a priori that any of the presently-existing algorithms for network generation offers a predictive description of the networks inspiring them.

Local Topological Signatures for Network-Based Prediction of Biological Function

Lecture Notes in Computer Science, 2013

In biology, similarity in structure or sequence between molecules is often used as evidence of functional similarity. In protein interaction networks, structural similarity of nodes (i.e., proteins) is often captured by comparing node signatures (vectors of topological properties of neighborhoods surrounding the nodes). In this paper, we ask how well such topological signatures predict protein function, using protein interaction networks of the organism Saccharomyces cerevisiae. To this end, we compare two node signatures from the literature (the graphlet degree vector and a signature based on the graph spectrum) and our own simple node signature based on basic topological properties. We find the connection between topology and protein function to be weak but statistically significant. Surprisingly, our node signature, despite its simplicity, performs on par with the other more sophisticated node signatures. In fact, we show that just two metrics, the link count and transitivity, are enough to classify protein function at a level on par with the other signatures, suggesting that detailed topological characteristics are unlikely to aid in protein function prediction based on protein interaction networks.
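A two-metric node signature of the kind mentioned above might look like the following sketch. It assumes that "link count" means node degree and that "transitivity" means the local clustering coefficient; the abstract does not spell out either definition, and the function name `node_signature` and the toy network are hypothetical.

```python
from itertools import combinations


def node_signature(adj, node):
    """Return (link count, transitivity) for one node.

    Assumption: link count is the node degree, and transitivity is the
    local clustering coefficient, i.e. the fraction of neighbour pairs
    that are themselves linked.
    """
    neighbours = adj[node]
    degree = len(neighbours)
    if degree < 2:
        return degree, 0.0  # no neighbour pairs to close
    linked_pairs = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    possible_pairs = degree * (degree - 1) / 2
    return degree, linked_pairs / possible_pairs


# Toy interaction network (hypothetical data), stored as an adjacency map.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
signatures = {n: node_signature(adj, n) for n in adj}
```

Each node is reduced to a two-element vector, which could then be passed to any standard classifier as a feature.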

Graph-based features for supervised link prediction

Neural Networks (IJCNN), …, 2011

The growing ubiquity of social networks has spurred research in link prediction, which aims to predict new connections based on existing ones in the network. The 2011 IJCNN Social Network challenge asked participants to separate real edges from fake in a set of 8960 edges sampled from an anonymized, directed graph depicting a subset of relationships on Flickr. Our method incorporates 94 distinct graph features, used as input for classification with Random Forests. We present a three-pronged approach to the link prediction task, along with several novel variations on established similarity metrics. We discuss the challenges of processing a graph with more than a million nodes. We found that the best classification results were achieved through the combination of a large number of features that model different aspects of the graph structure. Our method achieved an area under the receiver-operator characteristic (ROC) curve of 0.9695, the 2nd best overall score in the competition and the best score which did not de-anonymize the dataset.
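The area under the ROC curve quoted above can be read as the probability that a randomly chosen real edge receives a higher score than a randomly chosen fake one, with ties counted as half. A minimal sketch of that pairwise computation follows; the function name `auc` and the scores are illustrative assumptions, not competition data.

```python
def auc(real_scores, fake_scores):
    """Probability that a real edge outscores a fake one (ties count half)."""
    wins = 0.0
    for r in real_scores:
        for f in fake_scores:
            if r > f:
                wins += 1.0
            elif r == f:
                wins += 0.5
    return wins / (len(real_scores) * len(fake_scores))


# Hypothetical classifier outputs for real and fake edges.
real = [0.9, 0.8, 0.4]
fake = [0.7, 0.3, 0.4]
score = auc(real, fake)
```

The pairwise loop is quadratic in the number of edges, which is fine for a few thousand test edges; a rank-based formulation would be preferable for larger sets.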

Future link regression using supervised learning on graph topology

2014

Link prediction provides useful information for a variety of graph models, including communication, biochemical, and social networks. The goal of link prediction is usually to predict novel interactions (modeled as links/edges) between previously unconnected nodes in a graph. Link prediction is used on social networks to suggest future friends and in protein networks to suggest possible undiscovered pairwise interactions. Link prediction does not model repeat interactions or make any predictions about the number of interactions. To capture these, we need to predict the weight of links. We call this problem link regression.

JUCS - Journal of Universal Computer Science

Several real-world phenomena, including social, communication, transportation, and biological networks, can be efficiently expressed as graphs. This enables the deployment of graph algorithms to infer information from such complex network interactions and to enhance the accuracy of graph applications, including link prediction, node classification, and clustering. However, the large size and complexity of the network data limit the efficiency of learning algorithms in making decisions from such graph datasets. To overcome these limitations, graph embedding techniques are usually adopted. However, many studies not only assume static networks but also pay less attention to preserving the network's topological and centrality information, which is key in analyzing networks. In order to fill these gaps, we propose a novel end-to-end unified Topological Similarity and Centrality driven Hybrid Deep Learning model for Temporal Link Prediction (TSC-TLP). First, we extract topological sim...

International Journal of Electrical and Computer Engineering (IJECE), 2022

In recent years, the study of social networks and the analysis of these networks in various fields have grown significantly. One of the most widely studied problems in social networks is link prediction, which has recently been very popular among researchers. A link in a social network means communication between members of the network, which can include friendships, cooperation, writing a joint article or even membership in a common place such as a company or club. The main purpose of link prediction is to investigate the possibility of links between members being created or deleted in the future state of the network by analyzing its current state. In this paper, three new similarity criteria, degree neighbor similarity (DNS), path neighbor similarity (PNS) and degree path neighbor similarity (DPNS), are introduced using neighbor-based and path-based similarity criteria, both of which use graph structures. The results have been tested based on area under ...
