AllegatorTrack: Combining and Reporting Results of Truth Discovery from Multi-source Data

VERA: A Platform for Veracity Estimation over Web Data

Proc. of the 25th International World Wide Web Conference (WWW 2016)

Social networks and the Web in general are characterized by multiple information sources that often claim conflicting data values. Data veracity is hard to estimate, especially when there is no prior knowledge about the sources or the claims, and in time-dependent scenarios (e.g., crisis situations) where initially only a few observers can report the first pieces of information. Despite the wide set of recently proposed truth discovery approaches, no "one-fits-all" solution has emerged for estimating the veracity of online information in open contexts. However, analyzing the space of conflicting information and disagreeing sources can be valuable, as can ensembling multiple truth discovery methods. This demonstration presents VERA, a Web-based platform that supports information extraction from Web textual data and micro-texts from Twitter and estimates data veracity. Given a user query, VERA systematically extracts entities and relations from Web content, structures them as claims relevant to the query, and gathers further conflicting/corroborating information. VERA combines multiple truth discovery algorithms through ensembling and returns the veracity label and score of each data value as well as the trustworthiness scores of the sources. VERA will be demonstrated through several real-world scenarios to show its potential value for fact-checking from Web data.
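The ensembling step described above can be pictured as a weighted vote over the outputs of individual truth discovery algorithms. The sketch below only illustrates that idea under simplifying assumptions (each algorithm returns a boolean label and a confidence per claim; the algorithm names, weights, and claim ids are made up); it is not VERA's actual combination logic.

```python
from collections import defaultdict

def ensemble_claims(per_algorithm_results, weights=None):
    """Combine per-claim (label, confidence) outputs of several truth
    discovery algorithms with a weighted vote.

    per_algorithm_results: {algo_name: {claim_id: (bool_label, confidence)}}
    weights: optional {algo_name: float}; defaults to uniform weights.
    Returns {claim_id: (ensemble_label, ensemble_score)}.
    """
    weights = weights or {a: 1.0 for a in per_algorithm_results}
    votes = defaultdict(float)    # signed, weighted evidence per claim
    totals = defaultdict(float)   # total weight seen per claim
    for algo, claims in per_algorithm_results.items():
        w = weights.get(algo, 1.0)
        for claim_id, (label, conf) in claims.items():
            votes[claim_id] += w * conf * (1 if label else -1)
            totals[claim_id] += w
    return {c: (votes[c] >= 0, abs(votes[c]) / totals[c]) for c in votes}

# Toy example with hypothetical algorithm names and claim ids.
results = {
    "majority_voting": {"c1": (True, 0.9), "c2": (False, 0.6)},
    "truthfinder":     {"c1": (True, 0.7), "c2": (True, 0.55)},
    "accu":            {"c1": (False, 0.5), "c2": (False, 0.8)},
}
print(ensemble_claims(results))
```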

Truth Discovery Algorithms: An Experimental Evaluation

Technical Report, May 2014, http://arxiv.org/abs/1409.6428

A fundamental problem in data fusion is to determine the veracity of multi-source data in order to resolve conflicts. While previous work in truth discovery has proved useful in practice for specific settings of source behavior or data set characteristics, there has been limited systematic comparison of the competing methods in terms of efficiency, usability, and repeatability. We remedy this deficit by providing a comprehensive review of 12 state-of-the-art algorithms for truth discovery. We provide reference implementations and an in-depth evaluation of the methods based on extensive experiments on synthetic and real-world data. We analyze aspects of the problem that have not been explicitly studied before, such as the impact of initialization and parameter setting, convergence, and scalability. We provide an experimental framework for extensively comparing the methods in a wide range of truth discovery scenarios where source coverage, numbers and distributions of conflicts...
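Most of the surveyed methods share an iterative fixed point in which source trustworthiness and claim confidence are recomputed from each other until convergence. The following sketch shows that common skeleton in the spirit of Sums-like (Hubs-and-Authorities style) baselines; it is not a reference implementation of any particular algorithm from the paper, and the normalization and convergence choices are illustrative assumptions.

```python
def truth_discovery(claims, n_iter=50, tol=1e-6):
    """Iterative source-trust / claim-confidence fixed point.

    claims: {claim_id: set_of_source_ids}; grouping of mutually exclusive
            claims about the same data item is omitted for brevity.
    Returns (claim_confidence, source_trust) dictionaries scaled to [0, 1].
    """
    sources = {s for srcs in claims.values() for s in srcs}
    trust = {s: 1.0 for s in sources}                      # uniform prior
    for _ in range(n_iter):
        # Claim confidence: total trust of the sources asserting it.
        conf = {c: sum(trust[s] for s in srcs) for c, srcs in claims.items()}
        # Source trust: total confidence of the claims it asserts.
        new_trust = {s: sum(conf[c] for c, srcs in claims.items() if s in srcs)
                     for s in sources}
        # Normalize both so the fixed point does not blow up.
        max_c, max_t = max(conf.values()), max(new_trust.values())
        conf = {c: v / max_c for c, v in conf.items()}
        new_trust = {s: v / max_t for s, v in new_trust.items()}
        delta = max(abs(new_trust[s] - trust[s]) for s in sources)
        trust = new_trust
        if delta < tol:
            break
    return conf, trust

# Toy example: two conflicting claims, one backed by more sources.
claims = {"capital_of_france_is_paris": {"s1", "s2"},
          "capital_of_france_is_lyon": {"s3"}}
print(truth_discovery(claims))
```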

Toward Automated Factchecking

Digital Threats: Research and Practice

In an effort to assist factcheckers in the process of factchecking, we tackle the claim detection task, one of the necessary stages prior to determining the veracity of a claim. It consists of identifying the set of sentences, out of a long text, deemed capable of being factchecked. This article is a collaborative work between Full Fact, an independent factchecking charity, and academic partners. Leveraging the expertise of professional factcheckers, we develop an annotation schema and a benchmark for automated claim detection that is more consistent across time, topics, and annotators than are previous approaches. Our annotation schema has been used to crowdsource the annotation of a dataset with sentences from UK political TV shows. We introduce an approach based on universal sentence representations to perform the classification, achieving an F1 score of 0.83, with over 5% relative improvement over the state-of-the-art methods ClaimBuster and ClaimRank. The system was deployed in...
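At a high level, claim detection as described above is a binary sentence classification task over sentence embeddings. The sketch below illustrates that setup, with the sentence-transformers library standing in for the universal sentence representations used in the paper; the example sentences, labels, and encoder choice are illustrative assumptions, not the authors' pipeline or data.

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Toy training data: 1 = checkable factual claim, 0 = not checkable.
sentences = [
    "Unemployment fell by 3% last year.",          # checkable
    "The NHS treated two million more patients.",  # checkable
    "I think we can all agree on that.",           # not checkable
    "Thank you for joining us tonight.",           # not checkable
]
labels = [1, 1, 0, 0]

# Any sentence encoder can stand in for universal sentence representations;
# all-MiniLM-L6-v2 is a small, commonly used choice.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode(sentences)

# A simple linear classifier on top of the fixed sentence embeddings.
clf = LogisticRegression(max_iter=1000).fit(X, labels)

test = ["Crime rose in every region of the country.",
        "Good evening and welcome to the programme."]
print(list(zip(test, clf.predict(encoder.encode(test)))))
```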