Model of complex networks based on citation dynamics

Growing complex network of citations of scientific papers: Modeling and measurements

Physical Review E

To quantify the mechanism of complex network growth, we focus on the network of citations of scientific papers and use a combination of theoretical and experimental tools to uncover the microscopic details of this growth. Namely, we develop a stochastic model of citation dynamics based on a copying/redirection/triadic closure mechanism. In a complementary and coherent way, the model accounts both for the statistics of references of scientific papers and for their citation dynamics. Originating in empirical measurements, the model is cast in such a way that it can be verified quantitatively in every aspect. Such verification is performed by measuring the citation dynamics of physics papers. The measurements reveal nonlinear citation dynamics, the nonlinearity being intricately related to network topology. The nonlinearity has far-reaching consequences, including non-stationary citation distributions, diverging citation trajectories of similar papers, and runaways or "immortal papers" with infinite citation lifetime. Thus, our most important finding is nonlinearity in complex network growth. In a more specific context, our results can be a basis for quantitative probabilistic prediction of the citation dynamics of individual papers and of the journal impact factor.

Large-scale structure of time evolving citation networks

The European Physical Journal B, 2007

In this paper we examine a number of methods for probing and understanding the large-scale structure of networks that evolve over time. We focus in particular on citation networks, networks of references between documents such as papers, patents, or court cases. We describe three different methods of analysis, one based on an expectation-maximization algorithm, one based on modularity optimization, and one based on eigenvector centrality. Using the network of citations between opinions of the United States Supreme Court as an example, we demonstrate how each of these methods can reveal significant structural divisions in the network, and how, ultimately, the combination of all three can help us develop a coherent overall picture of the network's shape.

Statistical modeling of the temporal dynamics in a large scale-citation network

2016

Citation networks of papers are vast networks that grow over time. The way a citation network grows is not entirely random but follows a preferential attachment relationship: highly cited papers are more likely to be cited by newly published papers. The result is a network whose degree distribution follows a power law. This growth of the citation network of papers will be modeled with a negative binomial regression coupled with a logistic growth and/or Cauchy distribution curve. Then a Barabási-Albert model based on the negative binomial models, and a combination of the Dirichlet and multinomial distributions, will be used to simulate a network that follows preferential attachment between newly added nodes and existing nodes. Acknowledgements I would like to thank everyone at University of Arkansas for being so helpful throughout the three years of my master studies. Thanks to all the faculty and staff for enabling my education goals and providing the opportunities fo...
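
As an illustration of the preferential attachment rule described in this abstract, here is a minimal sketch in Python. It is not the thesis's negative binomial or Dirichlet-multinomial procedure. The function name `grow_citation_network`, the fixed number of references per paper, and the "+1" offset that lets uncited papers receive a first citation are all assumptions made for the example.

```python
import random


def grow_citation_network(n_papers, refs_per_paper=3, seed=0):
    """Grow a toy citation network by linear preferential attachment.

    Each new paper cites up to `refs_per_paper` distinct earlier papers,
    chosen with probability proportional to (citations + 1).
    Returns the list of citation counts, indexed by paper.
    """
    rng = random.Random(seed)
    citations = [0]  # paper 0 starts uncited
    urn = [0]        # paper i appears (citations[i] + 1) times in the urn
    for new in range(1, n_papers):
        k = min(refs_per_paper, new)  # cannot cite more papers than exist
        targets = set()
        while len(targets) < k:
            # Drawing uniformly from the urn is drawing proportionally
            # to (citations + 1); repeats are discarded by the set.
            targets.add(rng.choice(urn))
        for t in targets:
            citations[t] += 1
            urn.append(t)
        citations.append(0)
        urn.append(new)
    return citations


if __name__ == "__main__":
    counts = grow_citation_network(2000)
    print("most cited paper has", max(counts), "citations")
```

Plotting a histogram of the returned counts on log-log axes is one way to inspect the heavy tail that the abstract attributes to preferential attachment.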

On the Stability of Citation Networks

ArXiv, 2021

Citation networks can reveal much important information regarding the development of science and the relationship between different areas of knowledge. Thus, many studies have analyzed the topological properties of such networks. Frequently, citation networks are created using articles acquired from a set of relevant keywords or queries. Here, we study the robustness of citation networks with regard to the keywords that were used for collecting the respective articles. A perturbation approach is proposed, in which the influence of missing keywords on the topology and community structure of citation networks is quantified. In addition, the relationship between keywords and the community structure of citation networks is studied using networks generated from a simple model. We find that, owing to its highly modular structure, the community structure of citation networks tends to be preserved even when many relevant keywords are left out. Furthermore, the proposed model can reflect th...

Effect of citation patterns on network structure

We propose a model for an evolving citation network that incorporates the citation pattern followed in a particular discipline. We define the citation pattern in a discipline by three factors: the average number of references per article, the probability of citing an article based on its age, and the number of citations it already has. We also consider the average number of articles published per year in the discipline. We propose that the probability of citing an article based on its age can be modeled by a lifetime distribution. The lifetime distribution models the citation lifetime of an average article in a particular discipline. We find that the citation lifetime distribution in a particular discipline predicts the topological structure of the citation network in that discipline. We show that the power law exponent depends on the three factors that define the citation pattern. Finally, we fit the data from the Physical Review D journal to obtain the citation pattern and calculate the total degree distribution for the citation network.

Stochastic Dynamical Model of a Growing Citation Network Based on a Self-Exciting Point Process

Physical Review Letters, 2012

We put under experimental scrutiny the preferential attachment model that is commonly accepted as a generating mechanism of scale-free complex networks. To this end we chose a citation network of physics papers and traced the citation history of 40 195 papers published in one year. Contrary to common belief, we find that the citation dynamics of the individual papers follows superlinear preferential attachment, with an exponent of 1.25-1.3. Moreover, we show that the citation process cannot be described as a memoryless Markov chain since there is a substantial correlation between the present and recent citation rates of a paper. Based on our findings we construct a stochastic growth model of the citation network, perform numerical simulations based on this model and achieve an excellent agreement with the measured citation distributions.
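
To make the difference between linear and superlinear preferential attachment concrete, the sketch below computes attachment probabilities proportional to (k + k0) raised to a power alpha, where k is a paper's current citation count. The default alpha of 1.25 is taken from the range quoted in the abstract; the offset `k0` and the function name are assumptions for illustration, and the memory effect (correlation with recent citation rates) described in the abstract is not modeled here.

```python
def attachment_probabilities(citations, alpha=1.25, k0=1.0):
    """Return the probability that each paper receives the next citation.

    The weight of a paper with k citations is (k + k0) ** alpha.
    alpha = 1 is linear preferential attachment; alpha > 1 is superlinear.
    k0 (hypothetical offset) gives uncited papers a nonzero weight.
    """
    weights = [(k + k0) ** alpha for k in citations]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    counts = [0, 9, 99]
    print("linear:     ", attachment_probabilities(counts, alpha=1.0))
    print("superlinear:", attachment_probabilities(counts, alpha=1.25))
```

With alpha above 1, the share of new citations going to the most cited paper is larger than under the linear rule, which is the qualitative effect behind runaway papers.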

A Simple Model for Complex Networks with Arbitrary Degree Distribution and Clustering

Lecture Notes in Computer Science, 2007

We present a stochastic model for networks with arbitrary degree distributions and average clustering coefficient. Many descriptions of networks are based solely on their computed degree distribution and clustering coefficient. We propose a statistical model based on these characterizations. This model generalizes models based solely on the degree distribution. We present alternative parameterizations of the model. Each parameterization of the model is interpretable and tunable. We present a simple Markov Chain Monte Carlo (MCMC) algorithm to generate networks with the specified characteristics. We provide an algorithm based on MCMC to infer the network properties from network data and develop statistical inference for the model. The model is generalizable to include mixing based on attributes and other complex social structure.

Analysis of reference and citation copying in evolving bibliographic networks

Journal of Informetrics

Extensive literature demonstrates how the copying of references (links) can lead to the emergence of various structural properties (e.g., power-law degree distribution and bipartite cores) in bibliographic and other similar directed networks. However, it is also well known that the copying process is incapable of mimicking the number of directed triangles in such networks; neither does it have the power to explain the obsolescence of older papers. In this paper, we propose RefOrCite, a new model that allows for copying of both the references from (i.e., out-neighbors of) as well as the citations to (i.e., in-neighbors of) an existing node. In contrast, the standard copying model (CP) only copies references. While retaining its spirit, RefOrCite differs from the Forest Fire (FF) model in ways that make RefOrCite amenable to mean-field analysis for degree distribution, triangle count, and densification. Empirically, RefOrCite gives the best overall agreement with observed degree distribution, triangle count, diameter, h-index, and the growth of citations to newer papers.
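
The distinction the abstract draws between copying references (out-neighbors) and copying citations (in-neighbors) can be sketched with a toy growth rule. This is not the RefOrCite specification: the single randomly chosen seed paper, the independent copy probabilities `p_ref` and `p_cite`, and the function name are assumptions made for the example.

```python
import random


def grow_by_copying(n_papers, p_ref=0.3, p_cite=0.3, seed=0):
    """Grow a toy citation network by copying from one seed paper.

    Each new paper cites a uniformly chosen earlier paper (the seed paper),
    copies each of its references with probability p_ref, and copies each
    paper that cites it with probability p_cite. Setting p_cite = 0 leaves
    only reference copying.
    Returns (refs, cited_by): dicts mapping a paper to a set of papers.
    """
    rng = random.Random(seed)
    refs = {0: set()}      # refs[i]: papers that i cites (out-neighbors)
    cited_by = {0: set()}  # cited_by[i]: papers citing i (in-neighbors)
    for new in range(1, n_papers):
        seed_paper = rng.randrange(new)
        targets = {seed_paper}
        for r in sorted(refs[seed_paper]):       # copy references
            if rng.random() < p_ref:
                targets.add(r)
        for c in sorted(cited_by[seed_paper]):   # copy citations
            if rng.random() < p_cite:
                targets.add(c)
        refs[new] = targets
        cited_by[new] = set()
        for t in targets:
            cited_by[t].add(new)
    return refs, cited_by


if __name__ == "__main__":
    refs, cited_by = grow_by_copying(1000)
    print("max in-degree:", max(len(v) for v in cited_by.values()))
```

Copying a citation links the new paper to a more recent paper that already cites the seed paper, which is one intuitive route by which newer papers can gain citations and directed triangles can form.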

Randomizing growing networks with a time-respecting null model

Physical Review E

Complex networks are often used to represent systems that are not static but grow with time: people make new friendships, new papers are published and refer to the existing ones, and so forth. To assess the statistical significance of measurements made on such networks, we propose a randomization methodology, a time-respecting null model, that preserves both the network's degree sequence and the time evolution of individual nodes' degree values. By preserving the temporal linking patterns of the analyzed system, the proposed model is able to factor out the effect of the system's temporal patterns on its structure. We apply the model to the citation network of Physical Review scholarly papers and the citation network of US movies. The model reveals that the two data sets are strikingly different with respect to their degree-degree correlations, and we discuss the important implications of this finding on the information provided by paradigmatic node centrality metrics such as indegree and Google's PageRank. The randomization methodology proposed here can be used to assess the significance of any structural property in growing networks, which could bring new insights into the problems where null models play a critical role, such as the detection of communities and network motifs.
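
One simple way to randomize a growing network while keeping each node's degree growth roughly on schedule is to shuffle link targets only within fixed time windows. The sketch below illustrates that idea and is not necessarily the procedure used in the paper; the function name `shuffle_within_windows`, the `(time, source, target)` edge format with integer times, and the fixed `window` width are assumptions.

```python
import random
from collections import defaultdict


def shuffle_within_windows(edges, window, seed=0):
    """Randomize targets of time-stamped edges inside each time window.

    edges: list of (time, source, target) with integer times.
    Within every window of width `window`, the targets are permuted among
    the edges, so each node keeps the number of links it sends and
    receives in that window.
    """
    rng = random.Random(seed)
    by_window = defaultdict(list)
    for t, s, d in edges:
        by_window[t // window].append((t, s, d))
    randomized = []
    for w in sorted(by_window):
        chunk = by_window[w]
        targets = [d for _, _, d in chunk]
        rng.shuffle(targets)
        randomized.extend((t, s, d) for (t, s, _), d in zip(chunk, targets))
    return randomized


if __name__ == "__main__":
    toy = [(0, "a", "x"), (1, "b", "y"), (2, "c", "x"), (10, "d", "a")]
    print(shuffle_within_windows(toy, window=10))
```

This naive shuffle can produce self-links, duplicate links, or a link to a node that appears later in the same window, so a practical version would need to reject or repair such cases.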

A Realistic Model for Complex Networks

2003

It appeared recently that the classical random network model used to represent complex networks does not capture their main properties (clustering, degree distribution). Since then, various attempts have been made to provide network models having these properties. We propose here the first model which meets the following challenges: it produces networks which have the three main desired properties, it is based on some real-world observations, and it is sufficiently simple to make it possible to prove its main properties. We first give an overview of the field by presenting the main models introduced so far, then we discuss some observations on complex networks which lead us to the definition of our model. We then show that the model has the expected properties and that it can actually be seen as a general model for complex networks.