The Role of Context in Detecting Previously Fact-Checked Claims

Findings of the Association for Computational Linguistics: NAACL 2022

Recent years have seen the proliferation of disinformation and misinformation online, thanks to the freedom of expression on the Internet and to the rise of social media. Two solutions have been proposed to address the problem: (i) manual fact-checking, which is accurate and credible, but slow and non-scalable, and (ii) automatic fact-checking, which is fast and scalable, but lacks explainability and credibility. With the accumulation of enough manually fact-checked claims, a middle-ground approach has emerged: checking whether a given claim has previously been fact-checked. This can be done automatically, and thus quickly, while also offering credibility and explainability, thanks to the human fact-checking and the explanations in the associated fact-checking article. This is a relatively new and understudied research direction, and here we focus on claims made in a political debate, where context really matters. Thus, we study the impact of modeling the context of the claim: both on the source side, i.e., in the debate, and on the target side, i.e., in the fact-checking explanation document. We do this by modeling the local context and the global context, as well as by means of co-reference resolution and reasoning over the target text using Transformer-XH. The experimental results show that each of these represents a valuable information source, but that modeling the source-side context is more important, and can yield 10+ points of absolute improvement.

1 Introduction

The fight against the spread of dis/mis-information in social media has become an urgent social and political issue. Social media have been widely used not only for social good but also to mislead entire communities.
Many fact-checking organizations, such as FactCheck.org, Snopes, PolitiFact, and FullFact, among many others, together with broader international initiatives such as the Credibility Coalition and Eufactcheck, have emerged in the past few years to address the issue (Stencel, 2019). It has also become a major concern for government entities, companies, and national and international agencies. At the same time, there have been efforts to develop automatic systems to detect and to flag