Developing a Machine Learning Event Factuality Classifier using the FactBank Corpus
Related papers
FactBank: A corpus annotated with event factuality
Recent work in computational linguistics points out the need for systems to be sensitive to the veracity or factuality of events as mentioned in text; that is, to recognize whether events are presented as corresponding to actual situations in the world, situations that have not happened, or situations of uncertain interpretation. Event factuality is an important aspect of the representation of events in discourse, but the annotation of such information poses a representational challenge, largely because factuality is expressed through the interaction of numerous linguistic markers and constructions. Many of these markers are already encoded in existing corpora, albeit in a somewhat fragmented way. In this article, we present FACTBANK, a corpus annotated with information concerning the factuality of events. Its annotation has been carried out from a descriptive framework of factuality grounded on both theoretical findings and data analysis. FactBank is built on top of TimeBank, adding to it an additional level of semantic information.
Overview of FACT at IberLEF 2020: Events Detection and Classification
2020
In this paper we present the second edition of the FACT shared task (Factuality Annotation and Classification Task), included in IberLEF 2020. The main objective of this task is to advance the study of the factuality of the events mentioned in texts. This year, the FACT task includes a subtask on event identification in addition to the factuality classification subtask. We describe the submitted systems as well as the corpus used, which is the same one used in FACT 2019 but extended with annotations for nominal events.
The EVALITA 2016 Event Factuality Annotation Task (FactA)
2016
This report describes the FactA (Event Factuality Annotation) Task presented at the EVALITA 2016 evaluation campaign. The task aimed to evaluate systems that identify the factuality profile of events. Motivations, datasets, evaluation metrics, and post-evaluation results are presented and discussed.
Overview of FACT at IberLEF 2019: Factuality Analysis and Classification Task
2019
In this paper we describe the FACT shared task (Factuality Annotation and Classification Task), included in the First Iberian Languages Evaluation Forum (IberLEF). Factuality is understood, following [6], as the category that determines the factual status of events, that is, whether or not events are presented as certain. In order to analyze event references in texts, it is crucial to determine whether they are presented as having taken place or as potential or unaccomplished events. This information can be used for different applications such as Question Answering, Information Extraction, or Incremental Timeline Construction. Despite its centrality for Natural Language Understanding, this task has been under-researched, with the work by [7] as a reference for English and [8] for Spanish. For Italian, a task similar to FACT has been proposed in the past [4]. The bottleneck to advancing on this task has usually been the lack of annotated resources, together with its inherent difficulty. ...
From structure to interpretation: A double-layered annotation for event factuality
Current work from different areas in the field points out the need for systems to be sensitive to the factuality nature of events mentioned in text; that is, to recognize whether events are presented as corresponding to real situations in the world, situations that have not happened, or situations of uncertain status. Event factuality is a necessary component for representing events in discourse, but for annotation purposes it poses a representational challenge because it is expressed through the interaction of a varied set of structural markers. Part of these factuality markers is already encoded in some of the existing corpora but always in a partial way; that is, missing an underlying model that is capable of representing the factuality value resulting from their interaction. In this paper, we present FactBank, a corpus of events annotated with factuality information which has been built on top of TimeBank. Together, TimeBank and FactBank offer a double-layered annotation of event factuality: where TimeBank encodes most of the basic structural elements expressing factuality information, FactBank adds a representation of the resulting factuality interpretation.
Factuality Classification Using BERT Embeddings and Support Vector Machines
2020
Factuality can be defined as the category that determines the status of events with respect to the certainty with which they are presented. The first edition of the FACT task focused mainly on determining the factuality of verb-based events. The present edition aims at identifying noun-based events and determining the factuality of all events, whether verbs or nouns. We participated in Subtask 1 of the FACT 2020 task, which is to automatically propose a factual tag for each event in the text. In this paper we present a method that extracts features such as BERT embeddings, Word2Vec embeddings, and TF-IDF (Term Frequency-Inverse Document Frequency) scores of commonly recurring words, along with other manually extracted features, and passes them to an SVM (Support Vector Machine) classifier. Our system achieved an F1-score of 36.6% and an accuracy of 59.9%, which is quite satisfactory relative to the performance of other systems.
FacTA: Evaluation of Event Factuality and Temporal Anchoring
Proceedings of the Second Italian Conference on Computational Linguistics CLiC-it 2015
In this paper we describe FacTA, a new task connecting the evaluation of factuality profiling and temporal anchoring, two closely related aspects of event processing. The proposed task aims at providing a complete evaluation framework for factuality profiling, at taking the first steps in the direction of narrative container evaluation for Italian, and at making available benchmark data for high-level semantic tasks.
FACT2020: Factuality Identification in Spanish Text
2020
In this article we present our proposal for tasks 1 and 2 of the FACT (Factuality Analysis and Classification Task) challenge. The objective of task 1 is to create a system capable of classifying given events found in Spanish texts. Although we present several approaches, the best-performing classifier uses recurrent neural networks trained on embeddings of the event word and its surrounding context, reporting a macro F1 score of 0.6. For task 2, a simple rule-based modeling approach is used, reaching a macro F1 score of 0.84.
Are You Sure That This Happened? Assessing the Factuality Degree of Events in Text
2012
Identifying the veracity, or factuality, of event mentions in text is fundamental for reasoning about eventualities in discourse. Inferences derived from events judged as not having happened, or as being only possible, are different from those derived from events evaluated as factual. Event factuality involves two separate levels of information. On the one hand, it deals with polarity, which distinguishes between positive and negative instantiations of events. On the other, it has to do with degrees of certainty (e.g., possible, probable), a level of information generally subsumed under the category of epistemic modality. This article aims at contributing to a better understanding of how event factuality is articulated in natural language. For that purpose, we put forward a linguistically oriented computational model which has at its core an algorithm articulating the effect of factuality relations across levels of syntactic embedding. As a proof of concept, this model has been implemented in De Facto, a factuality profiler for eventualities mentioned in text, and tested against a corpus built specifically for the task, yielding an F1 of 0.70 (macro-averaging) and 0.80 (micro-averaging). Each of these two measures compensates for the over-emphasis present in the other (on either the less or the more populated categories), and they can therefore be interpreted as the lower and upper bounds of De Facto's performance.
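The difference between the macro- and micro-averaged F1 figures reported in this abstract can be illustrated with a minimal sketch. The function name `f1_scores`, the two labels, and the toy data are assumptions made for illustration only; they are not De Facto's tag set, data, or evaluation code.

```python
from collections import Counter


def f1_scores(gold, pred):
    """Return (macro_f1, micro_f1) for single-label classification."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gets a false positive
            fn[g] += 1  # gold label gets a false negative

    def f1(t, f_pos, f_neg):
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0

    # Macro: average the per-class F1 values, so every class counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts first, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy data: one frequent class and one rare class.
gold = ["fact"] * 8 + ["possible"] * 2
pred = ["fact"] * 8 + ["fact", "possible"]
macro, micro = f1_scores(gold, pred)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In this toy case the single error falls on the rare class, so the macro average drops below the micro average. This mirrors the reading of the two scores as emphasizing the less and the more populated categories, respectively.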
Prototype of the Fact Mining Toolkit
2010
Executive Summary: This deliverable describes a prototype of the Fact Mining Toolkit, which provides fact extraction functionality and is designed as a Web Service with nine components: one for pre-processing plain text, six core components providing annotations, assertions and categories, and two components for rendering the output, either in RDF or as a graphical visualization. As an example application we developed a News Fact Extraction Service, which applies fact extraction to a stream of news articles.