Introduction to Special Issue on Advances in Question Answering
Related papers
An annotation protocol for evaluative stance in discourse, 2020
This paper is part of the work carried out in the funded research project Stance and subjectivity in discourse: towards an integrated model of the analysis of epistemicity, effectivity, evaluation and intersubjectivity from a critical discourse perspective (PGC2018-095798-B-I00). In this paper we propose a protocol for the annotation of evaluative stance across discourse types. We have used the protocol to annotate four 100,000-word corpora in English: opinion articles (The Guardian and The Times), science popularization in the press (The Guardian and The Times), political discourse (speeches delivered by British politicians) and fora on social issues (REDDIT). The development of the protocol has gone through two main stages. The first stage consisted of a preliminary theoretical definition of the model of evaluative stance and its main categories, drawing on research on stance, evaluation and critical discourse analysis, together with methods for the identification of metaphoricity (du Bois 2007, Martin and White 2005, Pragglejaz Group 2007, van Leeuwen 2008, Wodak and Meyer 2015, among others). The preliminary model was tested on samples of the corpora, and the protocol subsequently underwent an initial refinement and revision. The second stage consisted of establishing a good degree of inter-rater reliability for the full annotation of the corpora. The inter-rater reliability procedure was carried out by three researchers (Hidalgo-Downing, Pérez-Sobrino, and Williams-Camus), who individually annotated samples from the corpora in four successive rounds. A joint discussion followed each round to resolve conflicting annotations and to refine the protocol for the ensuing round. The goal of this series of annotations was to determine whether inter-rater reliability in the identification of evaluative stance varied across researchers, rounds and genres.
The results of the inter-rater reliability tests show a consistent increase in the kappa scores for the value category (positive vs negative evaluation) and, to a lesser extent, for metaphoricity (although, in both cases, kappa scores showed moderate to high agreement). These rounds were complemented with two rounds of annotation of sample texts by the full team (all seven researchers participating in this project) in order to ensure a shared understanding and uniform application of the protocol's criteria in the annotation of the whole corpora.
Going beyond traditional QA systems: challenges and keys in opinion question answering
2010
The treatment of factual data has been widely studied in different areas of Natural Language Processing (NLP). However, processing subjective information still poses important challenges. This paper presents research aimed at assessing techniques that have been suggested as appropriate in the context of subjective information: Opinion Question Answering (OQA). We evaluate the performance of an OQA system with these new components and propose methods to optimally tackle the issues encountered. We assess the impact of including additional resources and processes with the purpose of improving system performance on two distinct blog datasets. The improvements obtained for the different combinations of tools are statistically significant. We thus conclude that the proposed approach is adequate for the OQA task, offering a good strategy for dealing with opinionated questions.
Abstract: The exponential growth of subjective information in the context of Web 2.0 has created the need for Natural Language Processing tools capable of analysing and processing these data for concrete applications. Such tools require training on corpora annotated with this type of information at a very fine-grained level in order to capture the linguistic phenomena that carry emotional load. This article describes EmotiBlog, a fine-grained model for the annotation of subjectivity. We present its creation process and show that it yields improvements for machine-learning systems. To this end, we use several corpora covering texts of different genres: a collection of newspaper articles in reported speech, the collection of news headlines annotated with polarity and emotion from SemEval 2007 (Task 14), and ISEAR, a corpus of real-life expressions of emotion. We also show that other resources can be integrated with EmotiBlog. The results demonstrate that, thanks to its structure and annotation parameters, the proposed model, EmotiBlog, offers considerable advantages for training systems that work on opinion mining and emotion detection.
Annotating Speaker Stance in Discourse: The Brexit Blog Corpus
The aim of this study is to explore the possibility of identifying speaker stance in discourse, to provide an analytical resource for it, and to evaluate the level of agreement across annotators. We also explore to what extent language users agree about what kinds of stances are expressed in natural language use, or whether their interpretations diverge. In order to perform this task, a comprehensive cognitive-functional framework of ten stance categories was developed based on previous work on speaker stance in the literature. A corpus of opinionated texts was compiled, the Brexit Blog Corpus (BBC). An analytical protocol and interface (Active Learning and Visual Analytics) for the annotations was set up and the data were independently annotated by two annotators. The annotation procedure, the annotation agreements and the co-occurrence of more than one stance in the utterances are described and discussed. The careful, analytical annotation process has returned satisfactory inter- and intra-annotation agreement scores, resulting in a gold standard corpus, the final version of the BBC.
Analyzing Opinions and Argumentation in News
2014
Analyzing opinions and arguments in news editorials and op-eds is an interesting and challenging task. The challenges lie at multiple levels: the text has to be analyzed at the discourse level (paragraphs and above) and also at lower levels (sentence, phrase and word). The abundance of implicit opinions involving sarcasm, irony and bias adds further complexity to the task. The available methods and techniques in sentiment analysis and opinion mining are still largely focused on the lower levels, i.e., up to the sentence level. However, the task requires applying concepts from a number of closely related sub-disciplines: Sentiment Analysis, Argumentation Theory, Discourse Analysis, Computational Linguistics, Logic and Reasoning, etc. The primary argument of this paper is that partial solutions to the problem can be achieved by developing linguistic resources and using them to automatically annotate texts for opinions and arguments. This paper d...
Towards Building Annotated Resources for Analyzing Opinions and Argumentation in News Editorials
This paper describes an annotation scheme for argumentation in opinionated texts such as newspaper editorials, developed from a corpus of approximately 500 English texts from Nepali and international newspaper sources. We present the results of analysis and evaluation of the corpus annotation: currently, the inter-annotator agreement kappa value is 0.80, which indicates substantial agreement between the annotators. We also discuss some of the linguistic resources (key factors for distinguishing facts from opinions, an opinion lexicon, an intensifier lexicon, a pre-modifier lexicon, a modal verb lexicon, a reporting verb lexicon, general opinion patterns from the corpus, etc.) developed as a result of our corpus analysis, which can be used to identify an opinion or a controversial issue, arguments supporting an opinion, the orientation of the supporting arguments and their strength (intrinsic, relative and in terms of persuasion). These resources form the backbone of our work especially for per...
Discourse Level Opinion Relations: An Annotation Study
2008
This work proposes opinion frames as a representation of discourse-level associations that arise from related opinion targets and which are common in task-oriented meeting dialogs. We define the opinion frames and explain their interpretation. Additionally, we present an annotation scheme that realizes the opinion frames and, via human annotation studies, we show that these can be reliably identified.