Large-scale structure of time evolving citation networks

Time and Citation Networks

ArXiv, 2015

Citation networks emerge from a number of different social systems, such as academia (from published papers), business (through patents) and law (through legal judgements). A citation represents a transfer of information, and so studying the structure of the citation network will help us understand how knowledge is passed on. What distinguishes citation networks from other networks is time; documents can only cite older documents. We propose that existing network measures do not take account of the strong constraint imposed by time. We will illustrate our approach with two types of causally aware analysis. We apply our methods to the citation networks formed by academic papers on the arXiv, to US patents and to US Supreme Court judgements. We show that our tools can reveal that citation networks which appear to have very similar structure by standard network measures turn out to have significantly different properties. We interpret our results as indicating that many papers in a bib...
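The time constraint described above (documents can only cite older documents) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the function names, the `published` mapping and the toy data are all hypothetical, and the longest citation chain is used only as one example of a measure that respects the ordering imposed by time.

```python
from collections import defaultdict


def violates_time_order(citations, published):
    """Return the (citing, cited) pairs whose cited document is not strictly older."""
    return [(a, b) for a, b in citations if published[b] >= published[a]]


def longest_chain_lengths(citations, published):
    """Length of the longest citation chain starting at each document.

    Arcs point from the citing document to the cited one (an assumption of
    this sketch). Raises ValueError if any citation breaks the time order.
    """
    if violates_time_order(citations, published):
        raise ValueError("some citations point to documents that are not older")
    refs = defaultdict(list)
    for citing, cited in citations:
        refs[citing].append(cited)
    depth = {}
    # Oldest first, so every cited document is processed before its citers.
    for doc in sorted(published, key=published.get):
        depth[doc] = max((depth[r] + 1 for r in refs[doc]), default=0)
    return depth


# Hypothetical toy data: publication years and citing -> cited pairs.
published = {"A": 2000, "B": 2003, "C": 2005, "D": 2008}
citations = [("B", "A"), ("C", "B"), ("D", "C"), ("D", "A")]
print(longest_chain_lengths(citations, published))
```

Because the documents are processed in order of publication, a single pass is enough; no general graph search is needed once the time order is known to hold.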

Growing complex network of citations of scientific papers: Modeling and measurements

Physical Review E

To quantify the mechanism of complex network growth, we focus on the network of citations of scientific papers and use a combination of theoretical and experimental tools to uncover microscopic details of this network growth. Namely, we develop a stochastic model of citation dynamics based on a copying/redirection/triadic closure mechanism. In a complementary and coherent way, the model accounts both for the statistics of references of scientific papers and for their citation dynamics. Originating in empirical measurements, the model is cast in such a way that it can be verified quantitatively in every aspect. Such verification is performed by measuring the citation dynamics of Physics papers. The measurements revealed nonlinear citation dynamics, the nonlinearity being intricately related to network topology. The nonlinearity has far-reaching consequences, including non-stationary citation distributions, diverging citation trajectories of similar papers, and runaways or "immortal papers" with infinite citation lifetime. Thus, our most important finding is nonlinearity in complex network growth. In a more specific context, our results can be a basis for quantitative probabilistic prediction of the citation dynamics of individual papers and of the journal impact factor.

Effect of citation patterns on network structure

We propose a model for an evolving citation network that incorporates the citation pattern followed in a particular discipline. We define the citation pattern in a discipline by three factors: the average number of references per article, the probability of citing an article based on its age, and the number of citations it already has. We also consider the average number of articles published per year in the discipline. We propose that the probability of citing an article based on its age can be modeled by a lifetime distribution, which describes the citation lifetime of an average article in a particular discipline. We find that the citation lifetime distribution in a particular discipline predicts the topological structure of the citation network in that discipline. We show that the power law exponent depends on the three factors that define the citation pattern. Finally, we fit the data from the Physical Review D journal to obtain the citation pattern and calculate the total degree distribution for the citation network.
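The ingredients listed above (references per article, age-dependent citing probability, existing citations, articles per year) can be combined in a small growth simulation. The sketch below is an illustration only and rests on assumptions that are not in the abstract: the lifetime distribution is taken to be exponential with a hypothetical `mean_lifetime`, the attachment weight is `(citations + 1)` times the age factor, and papers published in the same year do not cite each other.

```python
import math
import random


def citation_probabilities(ages, citation_counts, mean_lifetime=5.0):
    """Probability of citing each existing article.

    Weight = (citations + 1) * exp(-age / mean_lifetime); the exponential
    lifetime is an assumption of this sketch, not taken from the paper.
    """
    weights = [(k + 1) * math.exp(-age / mean_lifetime)
               for age, k in zip(ages, citation_counts)]
    total = sum(weights)
    return [w / total for w in weights]


def grow_network(n_years, papers_per_year, refs_per_paper,
                 mean_lifetime=5.0, seed=0):
    """Grow a citation network year by year, starting from one seed article."""
    rng = random.Random(seed)
    birth, cites = [0], [0]  # publication year and citation count per article
    for year in range(1, n_years + 1):
        existing = len(birth)
        ages = [year - b for b in birth]
        probs = citation_probabilities(ages, cites, mean_lifetime)
        gained = [0] * existing
        for _ in range(papers_per_year):
            k = min(refs_per_paper, existing)
            # Sample with replacement, then drop repeated targets.
            for target in set(rng.choices(range(existing), weights=probs, k=k)):
                gained[target] += 1
        for i, g in enumerate(gained):
            cites[i] += g
        birth.extend([year] * papers_per_year)
        cites.extend([0] * papers_per_year)
    return birth, cites


birth, cites = grow_network(n_years=30, papers_per_year=20, refs_per_paper=5)
print(max(cites), sum(cites) / len(cites))
```

Changing `mean_lifetime` or `refs_per_paper` and comparing the resulting citation counts gives a rough feel for how the citation pattern shapes the degree distribution.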

An Efficient Algorithm for Citation Networks Analysis. Paper presented at EASST’94

2013

This paper proposes very efficient algorithms, linear in the number of arcs, for determining Hummon and Doreian’s arc weights SPLC and SPNP in a citation network, and presents some theoretical properties of these weights. The nonacyclicity problem in citation networks is discussed. An approach for identifying an important small subnetwork on the basis of arc weights is proposed and illustrated on the citation networks of the SOM (self-organizing maps) literature and US patents.
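To give a feel for how a path-based arc weight can be computed in time linear in the number of arcs, the sketch below counts, for every arc, the source-to-sink paths that pass through it. This is a simpler, related weight and not the SPLC or SPNP definitions themselves, which differ in which vertices may start and end a path. The function name, the arc direction convention and the toy network are assumptions of this illustration. A cyclic input is rejected, which is one simple way of surfacing the nonacyclicity problem.

```python
from collections import defaultdict, deque


def source_sink_path_counts(arcs):
    """For each arc (u, v), count the source-to-sink paths running through it.

    Two passes over a topological order give O(nodes + arcs) running time.
    Raises ValueError if the network contains a cycle.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Kahn's algorithm for a topological order.
    indegree = {n: len(pred[n]) for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("network contains a cycle")

    # Number of paths from any source to n, and from n to any sink.
    from_sources, to_sinks = {}, {}
    for n in order:
        from_sources[n] = sum(from_sources[p] for p in pred[n]) if pred[n] else 1
    for n in reversed(order):
        to_sinks[n] = sum(to_sinks[s] for s in succ[n]) if succ[n] else 1

    return {(u, v): from_sources[u] * to_sinks[v] for u, v in arcs}


# Hypothetical toy network.
arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
print(source_sink_path_counts(arcs))
```

Arcs with the largest counts lie on the most paths, so keeping only arcs above a threshold is one simple way to extract a small subnetwork.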

Model of complex networks based on citation dynamics

Proceedings of the 22nd International Conference on World Wide Web, 2013

Complex networks of real-world systems are believed to be controlled by common phenomena, producing structures far from regular or random. These include scale-free degree distributions, small-world structure and assortative mixing by degree, which are also the properties captured by different random graph models proposed in the literature. However, many (non-social) real-world networks are in fact disassortative by degree. Thus, we here propose a simple evolving model that generates networks with most common properties of real-world networks including degree disassortativity. Furthermore, the model has a natural interpretation for citation networks with different practical applications.

Efficient algorithms for citation network analysis

Arxiv preprint cs/0309023, 2003

This paper proposes very efficient algorithms, linear in the number of arcs, for determining Hummon and Doreian's arc weights SPLC and SPNP in a citation network, and presents some theoretical properties of these weights. The nonacyclicity problem in citation networks is discussed. An approach for identifying an important small subnetwork on the basis of arc weights is proposed and illustrated on the citation networks of the SOM (self-organizing maps) literature and US patents.

Case study to approaches to finding patterns in citation networks

Analysis of a dataset including a network of LED patents and their metadata is carried out using several methods in order to answer questions about the domain. We are interested in finding the relationship between the metadata and the network structure; for example, are central patents in the network produced by larger or smaller companies? We begin by exploring the structure of the network without any metadata, applying known techniques in citation analysis and a simple clustering scheme. These techniques are then combined with metadata analysis to draw preliminary conclusions about the dataset.

Statistical modeling of the temporal dynamics in a large scale-citation network

2016

Citation networks of papers are vast networks that grow over time. The way a citation network grows is not entirely random but follows preferential attachment: highly cited papers are more likely to be cited by newly published papers. The result is a network whose degree distribution follows a power law. The growth of the citation network is modeled with a negative binomial regression coupled with a logistic growth and/or Cauchy distribution curve. Then a Barabási-Albert model based on the negative binomial models, together with a combination of the Dirichlet and multinomial distributions, is used to simulate a network that follows preferential attachment between newly added nodes and existing nodes.

Distance measures for dynamic citation networks

Physica A: Statistical Mechanics and its Applications, 2010

Acyclic digraphs arise in many natural and artificial processes. Among the broader set, dynamic citation networks represent a substantively important form of acyclic digraphs. For example, the study of such networks includes the spread of ideas through academic citations, the spread of innovation through patent citations, and the development of precedent in common law systems. The specific dynamics that produce such acyclic digraphs not only differentiate them from other classes of graphs, but also provide guidance for the development of meaningful distance measures. In this article, we develop our sink distance measure and apply it, together with the single-linkage hierarchical clustering algorithm, to both a two-dimensional directed preferential attachment model and empirical data drawn from the first quarter century of decisions of the United States Supreme Court. Despite applying the simplest combination of distance measures and clustering algorithms, analysis reveals that more accurate and more interpretable clusterings are produced by this scheme.