The Citation Merit of Scientific Publications

Rank analysis of most cited publications, a new approach for research assessments

arXiv (Cornell University), 2023

Citation metrics are the best tools for research assessments. However, current metrics may be misleading in research systems that simultaneously pursue different goals, such as the advance of science and incremental innovations, because their publications have different citation distributions. We estimate the contribution to the progress of knowledge by studying only a limited number of the most cited papers, which are dominated by publications pursuing this progress. To field-normalize the metrics, we replace the number of citations with the rank position of papers from one country in the global list of papers. Using synthetic series of lognormally distributed numbers, we developed the Rk-index, which is calculated from the global ranks of the 10 highest numbers in each series, and demonstrate its equivalence to the number of papers in top percentiles, P_top 0.1% and P_top 0.01%. In real cases, the Rk-index is simple and easy to calculate, and evaluates the contribution to the progress of knowledge better than less stringent metrics. Although further research is needed, rank analysis of the most cited papers is a promising approach for research evaluation. It is also demonstrated that, for this purpose, domestic and collaborative papers should be studied independently.

Universality of citation distributions: Toward an objective measure of scientific impact

Proceedings of The National Academy of Sciences, 2008

We study the distributions of citations received by a single publication within several disciplines, spanning broad areas of science. We show that the probability that an article is cited c times has large variations between different disciplines, but all distributions are rescaled on a universal curve when the relative indicator c_f = c/c_0 is considered, where c_0 is the average number of citations per article for the discipline. In addition we show that the same universal behavior occurs when citation distributions of articles published in the same field, but in different years, are compared. These findings provide a strong validation of c_f as an unbiased indicator for citation performance across disciplines and years. Based on this indicator, we introduce a generalization of the h-index suitable for comparing scientists working in different fields.
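The relative indicator described in this abstract can be illustrated with a minimal sketch. It assumes each paper is given as a hypothetical `(field, citations)` pair and takes c_0 as the plain mean over the papers supplied for that field; the function name `relative_citations` and the sample data are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def relative_citations(papers):
    """Return c_f = c / c_0 for each (field, citations) pair.

    c_0 is the mean citation count of the paper's field, computed here
    from the supplied list only (an assumption of this sketch).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for field, c in papers:
        totals[field] += c
        counts[field] += 1
    c0 = {field: totals[field] / counts[field] for field in totals}
    # Guard against a field whose papers are all uncited.
    return [c / c0[field] if c0[field] > 0 else 0.0 for field, c in papers]


# Made-up counts: two fields with very different citation levels.
papers = [
    ("math", 2), ("math", 4), ("math", 6),
    ("biology", 20), ("biology", 40), ("biology", 60),
]
cf = relative_citations(papers)
print(cf)
```

In this toy data the two fields differ tenfold in raw citations, yet the rescaled values coincide, which is the kind of collapse onto a common curve the abstract reports.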

Comprehensive Evaluation of Publication and Citation Metrics for Quantifying Scholarly Influence

IEEE Access, 2023

Ranking researchers by their scientific impact is a crucial task for the scientific community, because such rankings support decisions such as awarding scholarships, granting tenure, recognizing achievements, and giving promotions. Numerous parameters have been proposed in the literature for ranking researchers, such as publication count, citation count, coauthor count, the h-index, and its extensions. The current state of the art indicates that no universally accepted parameter exists that can identify the most influential researchers. It is therefore necessary to determine an optimal parameter that can effectively rank authors. To identify the best parameter, a few researchers have conducted evaluative surveys. These surveys applied a limited number of indices to small and imbalanced datasets and relied on fictional cases and scenarios, which has made it difficult to ascertain the relative importance and impact of each parameter. This research evaluates the h-index and thirty-two of its variants, which belong to the category of indices based on the number of publications and citation counts. We collected data on 1050 researchers working in the mathematical domain for our experiments. For the benchmark dataset, we collected data on the awardees of the last two decades from four different societies in the mathematical domain. First, we computed the correlations among the index values to assess their similarities and differences. The results revealed a high degree of correlation between the h-index and twenty-four of its variants. However, some of the indices showed weak correlation, signifying that their rankings are highly dissimilar to those of the other indices. Second, we checked the position of the awardees in the top 10, top 50, and top 100 records returned by the ranking list of each index. The outcome of this last step shows that the A index, E index, H core citation, H2 lower index, K index, M index, and Woeginger index retrieved almost 80% of awardees in the top 10% of the ranked list. Further analysis revealed that most of the award winners in the top tier belonged to the IMS, LMS, and AMS societies and were returned by the hg index, g index, k index, etc., indicating a relationship between these societies and the indices.
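For readers unfamiliar with the indices being compared, the following is a minimal sketch of three of them under their commonly cited definitions: the h-index (largest h such that h papers have at least h citations each), the g-index (largest g such that the g most cited papers together have at least g² citations, capped here at the number of papers), and the A-index (mean citations of the papers in the h-core). The function names and the sample citation list are assumptions for illustration; the study's own implementation is not shown in the abstract.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(ranked, start=1):
        if c < rank:
            break
        h = rank
    return h


def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations.

    This sketch caps g at the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    g, total = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g


def a_index(citations):
    """Mean citation count of the h-core (the h most cited papers)."""
    h = h_index(citations)
    if h == 0:
        return 0.0
    ranked = sorted(citations, reverse=True)
    return sum(ranked[:h]) / h


# Made-up citation counts for one hypothetical researcher.
cites = [10, 8, 5, 4, 3]
print(h_index(cites), g_index(cites), a_index(cites))
```

Because the g-index and A-index both reward heavily cited papers inside the core, they can rank two researchers with the same h-index differently, which is one reason such variants do not always correlate perfectly with h.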

A Simple Index for the High-Citation Tail of Citation Distribution to Quantify Research Performance in Countries and Institutions

PLoS ONE, 2011

Background: Conventional scientometric predictors of research performance such as the number of papers, citations, and papers in the top 1% of highly cited papers cannot be validated in terms of the number of Nobel Prize achievements across countries and institutions. The purpose of this paper is to find a bibliometric indicator that correlates with the number of Nobel Prize achievements. Methodology/Principal Findings: This study assumes that the high-citation tail of citation distribution holds most of the information about high scientific performance. Here I propose the x-index, which is calculated from the number of national articles in the top 1% and 0.1% of highly cited papers and has a subtractive term to discount highly cited papers that are not scientific breakthroughs. The x-index, the number of Nobel Prize achievements, and the number of national articles in Nature or Science are highly correlated. The high correlations among these independent parameters demonstrate that they are good measures of high scientific performance because scientific excellence is their only common characteristic. However, the x-index has superior features as compared to the other two parameters. Nobel Prize achievements are low frequency events and their number is an imprecise indicator, which in addition is zero in most institutions; the evaluation of research making use of the number of publications in prestigious journals is not advised. Conclusion: The x-index is a simple and precise indicator for high research performance.

On the evolution and utility of annual citation indices

We study the statistics of citations made to the top ranked indexed journals for Science and Social Science databases in the Journal Citation Reports using different measures. Total annual citation and impact factor, as well as a third measure called the annual citation rate, are used to make the detailed analysis. We observe that the distribution of the annual citation rate has a universal feature: it shows a maximum at the rate scaled by half the average, irrespective of how the journals are ranked, and even across Science and Social Science journals, and fits well to a log-Gumbel distribution. Correlations between different quantities are studied and a comparative analysis of the three measures is presented. The newly introduced annual citation rate factor helps in understanding the effect of scaling the number of citations by the total number of publications. The effect of the impact factor on authors contributing to the journals as well as on editorial policies is also discussed.

The citer-success-index: a citer-based indicator to select a subset of elite papers

Scientometrics, 2014

The goal of this paper is to introduce the citer-success-index (cs-index), i.e. an indicator that uses the number of different citers as a proxy for the impact of a generic set of papers. For each article of interest, a comparison term is defined, which represents the number of citers that, on average, an article published in a certain period and scientific field is expected to "infect", and this term is compared with the actual number of citers of the article. Similarly to the recently proposed success-index (Franceschini et al. Scientometrics 92:621-6415, 2011), the cs-index allows the selection of a subset of "elite" papers. The cs-index is analyzed from a conceptual and empirical perspective. Special attention is devoted to the study of the link between the number of citers and cited authors relating to articles from different fields, and the possible correlation between the cs-index and the success-index. Some advantages of the cs-index are that (i) it can be applied to multidisciplinary groups of papers, thanks to the field-normalization that it achieves at the level of the individual paper, and (ii) it is not significantly affected by self citers and recurrent citers. The main drawback is its computational complexity.

Differences in citation impact across scientific fields

2012

This paper has two aims: (i) to introduce a novel method for measuring which part of overall citation inequality can be attributed to differences in citation practices across scientific fields, and (ii) to implement an empirical strategy for making meaningful comparisons between the number of citations received by articles in the 22 broad fields distinguished by Thomson Scientific. The paper is based on a model in which the number of citations received by any article is a function of the article's scientific influence and the field to which it belongs. The model includes a key assumption according to which articles in the same quantile of any field citation distribution have the same degree of citation impact in their respective field. Using a dataset of 4.4 million articles published in 1998-2003 with a five-year citation window, we find that differences in citation practices between the 22 fields account for about 14% of overall citation inequality. Our empirical strategy for making comparisons of citation counts across fields is based on the strong similarities found in the behavior of citation distributions over a large quantile interval. We obtain three main results. Firstly, we provide a set of exchange rates to convert citations in any field into citations in the all-fields case. (This can be done for articles in the interval between, approximately, the 71st and the 99th percentiles of their citation distributions.) The answer is very satisfactory for 20 out of 22 fields. Secondly, when the raw citation data is normalized with our exchange rates, the effect of differences in citation practices is reduced to approximately 2% of overall citation inequality in the normalized citation distributions. Thirdly, we provide an empirical explanation of why the usual normalization procedure based on the fields' mean citation rates is found to be equally successful.

A two-dimensional stratification of publication, citation and h-index data

Research performance evaluation based on citation data is primarily two-dimensional with quantity and quality being the main orthogonal dimensions. Usually the data for the agent or actor is readily available from aggregators such as Web of Science or Scopus in the form of the number of papers P, the number of citations C and the h-index. The ratio of citations to papers (i = C/P) is a quality measure called impact, while P itself is an indicator of quantity or size. P, i and C can be classified as size-dependent, size-independent and composite indicators respectively. A scatter plot of i vs P is a two-dimensional quality-quantity map (or phase diagram) and from this it is possible to identify what are called the skyline and shoreline boundaries showing the upper and lower bounds of performance. Recent studies have shown that quality does not necessarily grow with size; there is a noticeable scale-dependent stratification. In this paper we show how we can also present h-C or h-P data as scatter plots with skylines and shorelines of extreme performance. The h-index taken alone tends to compress quality and quantity into a single score overlooking the complex stratification that takes place. The two-dimensional representation shown here highlights the danger of compressing performance to a single number.
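The quantities named in this abstract can be computed from a plain list of per-paper citation counts. The sketch below assumes that input form; the function name `profile` and the sample numbers are hypothetical. It returns the coordinates one would plot on the i vs P, h vs C, or h vs P maps, and does not attempt to construct the skyline or shoreline boundaries, whose exact definition is not given here.

```python
def profile(citations):
    """Return (P, C, i, h) for one agent's list of per-paper citations.

    P: number of papers (size), C: total citations (composite),
    i = C / P: impact (size-independent), h: h-index.
    """
    P = len(citations)
    C = sum(citations)
    i = C / P if P else 0.0
    ranked = sorted(citations, reverse=True)
    # For a descending list, "c >= rank" holds exactly for the first h ranks.
    h = sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)
    return P, C, i, h


# Two made-up agents with the same h-index but different size and impact.
small = profile([10, 8, 5, 4, 3])
large = profile([4, 4, 4, 4, 1, 1, 0, 0, 0, 0])
print(small, large)
```

The two agents share h = 4 while differing in P, C, and i, which illustrates the abstract's point that a single score can hide the two-dimensional structure.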

Field-normalized citation impact indicators using algorithmically constructed classification systems of science

Journal of Informetrics, 2015

We study the problem of normalizing citation impact indicators for differences in citation practices across scientific fields. Normalization of citation impact indicators is usually done based on a field classification system. In practice, the Web of Science journal subject categories are often used for this purpose. However, many of these subject categories have a quite broad scope and are not sufficiently homogeneous in terms of citation practices. As an alternative, we propose to work with algorithmically constructed classification systems. We construct these classification systems by performing a large-scale clustering of publications based on their citation relations. In our analysis, 12 classification systems are constructed, each at a different granularity level. The number of fields in these systems ranges from 390 to 73,205 in granularity levels 1-12. This contrasts with the 236 subject categories in the WoS classification system. Based on an investigation of some key characteristics of the 12 classification systems, we argue that working with a few thousand fields may be an optimal choice. We then study the effect of the choice of a classification system on the citation impact of the 500 universities included in the 2013 edition of the CWTS Leiden Ranking. We consider both the MNCS and the PP top 10% indicator. Globally, for all the universities taken together, citation impact indicators generally turn out to be relatively insensitive to the choice of a classification system. Nevertheless, for individual universities, we sometimes observe substantial differences between indicators normalized based on the journal subject categories and indicators normalized based on an appropriately chosen algorithmically constructed classification system.
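A simplified sketch of the two indicators mentioned in this abstract may help show why the field classification matters. It assumes the usual reading of MNCS as the mean of each paper's citations divided by the mean citations of its field, and of PP top 10% as the share of a unit's papers that reach the top 10% of their field. Normalization by publication year and document type is omitted, ties at the threshold are handled naively, and all names and data are hypothetical.

```python
from collections import defaultdict


def field_stats(world):
    """Map each field to (mean citations, top-10% citation threshold).

    world: list of (field, citations) pairs for all publications.
    """
    by_field = defaultdict(list)
    for field, c in world:
        by_field[field].append(c)
    stats = {}
    for field, cites in by_field.items():
        ranked = sorted(cites, reverse=True)
        k = max(1, (len(ranked) + 9) // 10)  # ceil(n / 10) in integer arithmetic
        stats[field] = (sum(ranked) / len(ranked), ranked[k - 1])
    return stats


def mncs_and_pp_top10(unit, stats):
    """Return (MNCS, PP top 10%) for a unit's list of (field, citations)."""
    ratios = [c / stats[f][0] if stats[f][0] > 0 else 0.0 for f, c in unit]
    in_top = [c >= stats[f][1] for f, c in unit]
    return sum(ratios) / len(ratios), sum(in_top) / len(in_top)


# Made-up world: field "A" is cited ten times less than field "B".
world = [("A", c) for c in range(10)] + [("B", 10 * c) for c in range(10)]
stats = field_stats(world)
university = [("A", 9), ("B", 45)]
mncs, pp10 = mncs_and_pp_top10(university, stats)
print(mncs, pp10)
```

Changing how papers are assigned to fields changes both the field means and the thresholds in `stats`, so the same publication list can score differently under a different classification system.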