Citation Resolution: A method for evaluating context-based citation recommendation systems

Context-aware citation recommendation

Proceedings of the 19th international conference on World wide web - WWW '10, 2010

When you write papers, how many times do you want to make some citations at a place but you are not sure which papers to cite? Do you wish to have a recommendation system which can recommend a small number of good candidates for every place that you want to make some citations? In this paper, we present our initiative of building a context-aware citation recommendation system. High-quality citation recommendation is challenging: the recommended citations should not only be relevant to the paper under composition but also match the local contexts of the places where citations are made. Moreover, it is far from trivial to model how the topic of the whole paper and the contexts of the citation places should affect the selection and ranking of citations. To tackle the problem, we develop a context-aware approach. The core idea is to design a novel non-parametric probabilistic model which can measure the context-based relevance between a citation context and a document. Our approach can recommend citations for a context effectively. Moreover, it can recommend a set of citations for a paper with high quality. We implement a prototype system in CiteSeerX. An extensive empirical evaluation in the CiteSeerX digital library against many baselines demonstrates the effectiveness and the scalability of our approach.

Applying Core Scientific Concepts to Context-Based Citation Recommendation

2016

The task of recommending relevant scientific literature for a draft academic paper has recently received significant interest. In our effort to ease the discovery of scientific literature and augment scientific writing, we aim to improve the relevance of results based on a shallow semantic analysis of the source document and the potential documents to recommend. We investigate the utility of automatic argumentative and rhetorical annotation of documents for this purpose. Specifically, we integrate automatic Core Scientific Concepts (CoreSC) classification into a prototype context-based citation recommendation system and investigate its usefulness to the task. We frame citation recommendation as an information retrieval task and we use the categories of the annotation schemes to apply different weights to the similarity formula. Our results show interesting and consistent correlations between the type of citation and the type of sentence containing the relevant information.
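The idea of weighting a similarity formula by sentence category can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `weighted_overlap_score`, the plain term-overlap similarity, and the example weights are hypothetical and are not the formula or the weights used in the paper.

```python
from collections import Counter


def weighted_overlap_score(query_tokens, doc_sentences, class_weights, default_weight=1.0):
    """Score a candidate document against a citation context.

    doc_sentences is a list of (label, tokens) pairs, where label is the
    rhetorical category assigned to the sentence (hypothetical input format).
    Each sentence contributes its term overlap with the query, scaled by the
    weight of its category.
    """
    query_counts = Counter(query_tokens)
    score = 0.0
    for label, tokens in doc_sentences:
        weight = class_weights.get(label, default_weight)
        sentence_counts = Counter(tokens)
        # Overlap: shared term occurrences between the query and this sentence.
        overlap = sum(min(query_counts[t], c) for t, c in sentence_counts.items())
        score += weight * overlap
    return score


# Illustrative weights only; real weights would have to be tuned on data.
weights = {"Method": 2.0, "Background": 0.5}
document = [
    ("Method", ["citation", "graph"]),
    ("Background", ["recommendation", "citation"]),
]
score = weighted_overlap_score(["citation", "recommendation"], document, weights)
```

In this toy setup, a match inside a sentence labeled "Method" counts four times as much as the same match in a "Background" sentence, which is the kind of category-dependent weighting the abstract describes.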

RefSeer: A citation recommendation system

IEEE/ACM Joint Conference on Digital Libraries, 2014

Citations are important in academic dissemination. To help researchers check the completeness of citations while authoring a paper, we introduce a citation recommendation system called RefSeer. Researchers can use this system while authoring papers to find related works to cite. It can also be used by reviewers to check the completeness of a paper's references. RefSeer presents both topic-based global recommendations and citation-context-based local recommendations. By evaluating the quality of recommendations, we show that such a recommendation system can recommend citations with good precision and recall. We also show that our recommendation system is very efficient and scalable.

Recommending citations: translating papers into references

2012

When we write or prepare to write a research paper, we always have appropriate references in mind. However, there are most likely references that we have missed and that should have been read and cited. As such, a good citation recommendation system would improve not only our paper but also, overall, the efficiency and quality of literature search.

Recommending citations for academic papers

Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, 2007

We approach the problem of academic literature search by considering an unpublished manuscript as a query to a search system. We use the text of previous literature as well as the citation graph that connects it to find relevant related material. We evaluate our technique with manual and automatic evaluation methods, and find an order of magnitude improvement in mean average precision as compared to a text similarity baseline.
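Mean average precision, the metric reported above, can be computed as in the following sketch. It uses the standard definition of the metric; the function names and the toy data are assumptions for illustration and do not come from the paper. Ranked lists are assumed to contain no duplicates.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant item's rank
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy example: two query manuscripts with known reference lists.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
map_score = mean_average_precision(runs)
```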

In-text citation’s frequencies-based recommendations of relevant research papers

PeerJ Computer Science, 2021

For the past half-century, identifying relevant documents has been an active area of research, driven by the rapid growth of data on the web. Traditional models for retrieving relevant documents rely on bibliographic information such as bibliographic coupling, co-citation, and direct citation. Recently, however, the scientific community has started to employ textual features to improve the accuracy of existing models. In our previous study, we found that analyzing citations at a deep level (i.e., the content level) can play a paramount role in finding more relevant documents than analysis at the surface level (i.e., bibliographic details alone). We found that cited and citing papers have a high degree of relevancy when the cited paper is cited in-text more than five times in the citing paper. This paper extends our previous study by evaluating on a comprehensive dataset. Moreover, the study results are also compared with other s...
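The in-text frequency criterion can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes numeric bracketed markers such as `[3]` or `[2, 7]`, ignores ranges such as `[1-3]` and author-year styles, and the names `in_text_citation_counts` and `frequently_cited` are hypothetical.

```python
import re
from collections import Counter

# Matches numeric citation markers like [3] or [2, 7] (assumed citation style).
CITATION_PATTERN = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def in_text_citation_counts(text):
    """Count how often each reference number is cited in the body text."""
    counts = Counter()
    for match in CITATION_PATTERN.finditer(text):
        for ref_id in match.group(1).split(","):
            counts[int(ref_id)] += 1
    return counts


def frequently_cited(text, threshold=5):
    """Return reference numbers cited in-text more than `threshold` times."""
    counts = in_text_citation_counts(text)
    return sorted(ref for ref, n in counts.items() if n > threshold)


# Toy text: reference 1 is cited seven times, references 2 and 3 less often.
sample = "Prior work [1]. " * 6 + "Others [2, 3]. " + "See also [1,2]."
candidates = frequently_cited(sample)
```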

Context-Based Collaborative Filtering for Citation Recommendation

IEEE Access, 2015

Citation recommendation is an interesting and significant research area as it solves the information overload in academia by automatically suggesting relevant references for a research paper. Recently, with the rapid proliferation of information technology, research papers are rapidly published in various conferences and journals. This makes citation recommendation a highly important and challenging discipline. In this paper, we propose a novel citation recommendation method that uses only easily obtained citation relations as source data. The rationale underlying this method is that, if two citing papers are significantly co-occurring with the same citing paper(s), they should be similar to some extent. Based on the above rationale, an association mining technique is employed to obtain the paper representation of each citing paper from the citation context. Then, these paper representations are pairwise compared to compute similarities between the citing papers for collaborative filtering. We evaluate our proposed method through two relevant real-world data sets. Our experimental results demonstrate that the proposed method significantly outperforms the baseline method in terms of precision, recall, and F1, as well as mean average precision and mean reciprocal rank, which are metrics related to the rank information in the recommendation list.
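The evaluation metrics named above can be computed as in this sketch, which uses their standard definitions. The function names and toy data are assumptions for illustration and are not taken from the paper.

```python
def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall, and F1 for one recommendation list."""
    recommended_set = set(recommended)
    relevant = set(relevant)
    hits = len(recommended_set & relevant)
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_reciprocal_rank(runs):
    """Mean over queries of 1 / rank of the first relevant item (0 if none)."""
    if not runs:
        return 0.0
    total = 0.0
    for ranked, relevant in runs:
        for rank, doc in enumerate(ranked, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(runs)


p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], {"a", "c", "e"})
mrr = mean_reciprocal_rank([
    (["x", "a"], {"a"}),  # first relevant item at rank 2
    (["b", "c"], {"b"}),  # first relevant item at rank 1
    (["q"], {"z"}),       # no relevant item retrieved
])
```

Precision, recall, and F1 ignore the order of the list, while mean reciprocal rank (like mean average precision) rewards placing relevant papers near the top, which is why the abstract groups them separately.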

GORC: A large contextual citation graph of academic papers

2019

We introduce the Semantic Scholar Graph of References in Context (GORC), a large contextual citation graph of 81.1M academic publications, including parsed full text for 8.1M open access papers, across broad domains of science. Each paper is represented with rich paper metadata (title, authors, abstract, etc.), and where available: cleaned full text, section headers, figure and table captions, and parsed bibliography entries. In-line citation mentions in full text are linked to their corresponding bibliography entries, which are in turn linked to in-corpus cited papers, forming the edges of a contextual citation graph. To our knowledge, this is the largest publicly available contextual citation graph; the full text alone is the largest parsed academic text corpus publicly available. We demonstrate the ability to identify similar papers using these citation contexts and propose several applications for language modeling and citation-related tasks.

CitePrompt: Using Prompts to Identify Citation Intent in Scientific Papers

arXiv (Cornell University), 2023

Citations in scientific papers not only help us trace the intellectual lineage but also are a useful indicator of the scientific significance of the work. Citation intents prove beneficial as they specify the role of the citation in a given context. We present a tool CitePrompt which uses the hitherto unexplored approach of prompt learning for citation intent classification. We argue that with the proper choice of the pretrained language model, the prompt template, and the prompt verbalizer, we can not only get results that are better than or comparable to those obtained with the state-of-the-art methods but also do it with much less exterior information about the scientific document. We report state-of-the-art results on the ACL-ARC dataset, and also show significant improvement on the SciCite dataset over all baseline models except one. As suitably large labelled datasets for citation intent classification can be quite hard to find, in a first, we propose the conversion of this task to the few-shot and zero-shot settings. For the ACL-ARC dataset, we report a 53.86% F1 score for the zero-shot setting, which improves to 63.61% and 66.99% for the 5-shot and 10-shot settings respectively.