Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified?
Related papers
Parsimonious Argument Annotations for Hate Speech Counter-narratives
arXiv (Cornell University), 2022
We present an enrichment of the Hateval corpus of hate speech tweets (Basile et al., 2019) aimed at facilitating automated counter-narrative generation. As in previous work (Chung et al., 2019), manually written counter-narratives are associated with the tweets. However, this information alone seems insufficient to obtain satisfactory language models for counter-narrative generation. We have therefore also annotated the tweets with argumentative information based on Wagemans (2016), which we believe can help in building convincing and effective counter-narratives for hate speech against particular groups. We discuss the adequacy and difficulties of this annotation process and present several baselines for automatic detection of the annotated elements. Preliminary results show that automatic annotators come close to human annotators in detecting some aspects of argumentation, while other aspects reach only low or moderate levels of inter-annotator agreement.
Quantifying the impact of context on the quality of manual hate speech annotation
Natural Language Engineering, 2022
The quality of annotations in manually annotated hate speech datasets is crucial for automatic hate speech detection. This contribution focuses on the positive effects of manually annotating online comments for hate speech within the context in which the comments occur. We quantify the impact of context availability with a carefully designed experiment: two annotation rounds, one in-context and one out-of-context, are performed on the same English YouTube data (more than 10,000 comments), using the same annotation schema and platform and the same highly trained annotators, with annotation quality quantified through inter-annotator agreement. Our results show that the presence of context has a significant positive impact on the quality of the manual annotations. This positive impact is more noticeable among replies than among comments, although replies are harder to annotate consistently overall. We also corroborate previous research reporting that out-of-context annotation favours assigning non-hate-speech labels, and show further that this tendency is especially present among comments inciting violence, a highly relevant category for hate speech research and society overall. We believe that this work will improve future annotation campaigns even beyond hate speech and motivate further research on the highly relevant questions of data annotation methodology in natural language processing, especially in light of the field's expanding scope of application.
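As a concrete illustration of the agreement comparison at the heart of this experiment, here is a minimal sketch (with invented labels, not the paper's data or pipeline) that computes Cohen's kappa for a pair of annotators in each round:

```python
# Minimal sketch: comparing inter-annotator agreement between an
# in-context and an out-of-context annotation round.
# The labels below are invented for illustration; the paper's data
# and exact agreement statistic may differ.
from sklearn.metrics import cohen_kappa_score

# Labels from two annotators on the same comments, per round.
in_context_a  = ["hate", "ok", "hate", "ok", "hate", "ok"]
in_context_b  = ["hate", "ok", "hate", "ok", "hate", "hate"]
out_context_a = ["ok",   "ok", "hate", "ok", "ok",   "ok"]
out_context_b = ["hate", "ok", "ok",   "ok", "hate", "ok"]

kappa_in  = cohen_kappa_score(in_context_a,  in_context_b)
kappa_out = cohen_kappa_score(out_context_a, out_context_b)

print(f"in-context kappa:     {kappa_in:.2f}")
print(f"out-of-context kappa: {kappa_out:.2f}")
# A higher in-context kappa would support the paper's finding that
# context improves annotation quality.
```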
Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators’ Disagreement
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Since state-of-the-art approaches to offensive language detection rely on supervised learning, it is crucial to quickly adapt them to the continuously evolving scenario of social media. While several approaches have been proposed to tackle the problem from an algorithmic perspective, so as to reduce the need for annotated data, less attention has been paid to the quality of these data. Following a recent trend, we focus on the level of agreement among annotators while selecting data to create offensive language datasets, a task involving a high level of subjectivity. Our study comprises the creation of three novel datasets of English tweets covering different topics, each with five crowd-sourced judgments per tweet. We also present an extensive set of experiments showing that selecting training and test data according to different levels of annotator agreement has a strong effect on classifier performance and robustness. Our findings are further validated in cross-domain experiments and studied using a popular benchmark dataset. We show that such hard cases, where agreement is low, are not necessarily due to poor-quality annotation, and we advocate for a higher presence of ambiguous cases in future datasets, particularly in test sets, to better account for the different points of view expressed online.
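To make the selection strategy concrete, here is a minimal sketch of bucketing tweets by agreement among five crowd-sourced judgments; the data structure and the 0.8 threshold are illustrative assumptions, not the authors' code:

```python
# Minimal sketch: bucketing tweets by crowd agreement level, assuming
# five binary offensive/not-offensive judgments per tweet. Field names
# and the threshold are illustrative assumptions.
from collections import Counter

tweets = [
    {"text": "example tweet 1", "judgments": [1, 1, 1, 1, 1]},  # unanimous
    {"text": "example tweet 2", "judgments": [1, 1, 1, 1, 0]},  # 4 of 5
    {"text": "example tweet 3", "judgments": [1, 1, 0, 0, 1]},  # 3 of 5, ambiguous
]

def agreement_level(judgments):
    """Fraction of judgments matching the majority label."""
    majority_count = Counter(judgments).most_common(1)[0][1]
    return majority_count / len(judgments)

high_agreement = [t for t in tweets if agreement_level(t["judgments"]) >= 0.8]
ambiguous      = [t for t in tweets if agreement_level(t["judgments"]) < 0.8]

print(len(high_agreement), "high-agreement;", len(ambiguous), "ambiguous")
# Training on one bucket and testing on the other exposes the
# robustness effects the paper reports.
```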
Handling Disagreement in Hate Speech Modelling
Communications in computer and information science, 2022
Hate speech annotation for training machine learning models is an inherently ambiguous and subjective task. In this paper, we adopt a perspectivist approach to data annotation, model training, and evaluation for hate speech classification. We first focus on the annotation process and argue that it drastically influences the final data quality. We then present three large hate speech datasets that incorporate annotator disagreement and use them to train and evaluate machine learning models. As our main contribution, we propose evaluating machine learning models through the lens of disagreement, applying proper performance measures to evaluate both annotator agreement and model quality. We further argue that annotator agreement poses intrinsic limits on the performance achievable by models. When comparing models and annotators, we observed that they achieve consistent levels of agreement across datasets. We reflect upon our results and propose some methodological and ethical considerations that can stimulate the ongoing discussion on hate speech modelling and classification with disagreement.
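The perspectivist evaluation idea, scoring the model with the same chance-corrected agreement measure used between humans, can be sketched as follows (labels are invented; the paper's exact measures may differ):

```python
# Minimal sketch of perspectivist evaluation: treat the model as one
# more annotator and score it with the same chance-corrected agreement
# measure used between humans. Data are invented for illustration.
from sklearn.metrics import cohen_kappa_score

annotator_1 = [1, 0, 1, 1, 0, 1, 0, 0]
annotator_2 = [1, 0, 0, 1, 0, 1, 1, 0]
model_preds = [1, 0, 1, 1, 0, 0, 0, 0]

human_kappa = cohen_kappa_score(annotator_1, annotator_2)
model_kappa = cohen_kappa_score(annotator_1, model_preds)

print(f"human-human kappa: {human_kappa:.2f}")
print(f"model-human kappa: {model_kappa:.2f}")
# If model-human agreement approaches human-human agreement, the model
# is near the ceiling that annotator disagreement imposes.
```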
Annotating Hate Speech: Three Schemes at Comparison
Annotated data are essential to train and benchmark NLP systems. The reliability of the annotation, i.e. low inter-annotator disagreement, is a key factor, especially when dealing with highly subjective phenomena occurring in human language. Hate speech (HS), in particular, is intrinsically nuanced and hard to fit into any fixed scale, so crisp classification schemes for its annotation often show their limits. We test three annotation schemes on a corpus of HS in order to produce more reliable data. While rating scales and best-worst scaling are more expensive annotation strategies, our experimental results suggest that they are worth implementing from an HS detection perspective.
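For readers unfamiliar with best-worst scaling, a common count-based scorer can be sketched in a few lines; the trial format below is an illustrative assumption, not the paper's annotation interface:

```python
# Minimal sketch of count-based best-worst-scaling scoring: for each
# item, score = (#times chosen most hateful - #times chosen least
# hateful) / #appearances. The tuple format is an illustrative
# assumption, not the paper's annotation format.
from collections import defaultdict

# Each trial: (items shown, item judged MOST hateful, item judged LEAST hateful)
trials = [
    (("t1", "t2", "t3", "t4"), "t1", "t4"),
    (("t1", "t2", "t3", "t5"), "t2", "t5"),
    (("t2", "t3", "t4", "t5"), "t2", "t4"),
]

counts = defaultdict(lambda: {"most": 0, "least": 0, "seen": 0})
for items, most, least in trials:
    for item in items:
        counts[item]["seen"] += 1
    counts[most]["most"] += 1
    counts[least]["least"] += 1

scores = {item: (c["most"] - c["least"]) / c["seen"] for item, c in counts.items()}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
# Scores in [-1, 1] place each tweet on a continuous hatefulness scale,
# sidestepping the limits of crisp category schemes.
```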
The Gab Hate Corpus: A collection of 27k posts annotated for hate speech
2018
We present the Gab Hate Corpus (GHC), consisting of 27,665 posts from the social network service gab.com, each annotated for the presence of “hate-based rhetoric” by a minimum of three annotators. Posts were labeled according to a coding typology derived from a synthesis of hate speech definitions across legal precedent, previous hate speech coding typologies, and definitions from psychology and sociology, comprising hierarchical labels indicating dehumanizing and violent speech as well as indicators of targeted groups and rhetorical framing. We provide inter-annotator agreement statistics and perform a classification analysis in order to validate the corpus and establish performance baselines. The GHC complements existing hate speech datasets in its theoretical grounding and by providing a large, representative sample of richly annotated social media posts.
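When each post carries at least three annotations, as in the GHC, a common way to derive a single training label is majority vote; the sketch below assumes binary labels and invented data, not the corpus's actual format:

```python
# Minimal sketch: majority-vote aggregation of three annotations per
# post into a single gold label. Label values and structure are
# illustrative assumptions about the corpus format.
from statistics import mode

posts = [
    {"id": 1, "labels": [1, 1, 0]},  # hate-based rhetoric: yes (2 of 3)
    {"id": 2, "labels": [0, 0, 0]},
    {"id": 3, "labels": [1, 0, 0]},
]

for post in posts:
    post["gold"] = mode(post["labels"])  # majority label

print([(p["id"], p["gold"]) for p in posts])
```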
Factoring Hate Speech: A New Annotation Framework to Study Hate Speech in Social Media
The 7th Workshop on Online Abuse and Harms (WOAH)
In this work we propose a novel annotation scheme which factors hate speech into five separate discursive categories. To evaluate our scheme, we construct a corpus of over 2.9M Twitter posts containing hateful expressions directed at Jews, and annotate a sample dataset of 1,050 tweets. We present a statistical analysis of the annotated dataset, discuss annotation examples, and conclude with promising directions for future work.
Building and Annotating a Codeswitched Hate Speech Corpora
International Journal of Information Technology and Computer Science
Presidential campaign periods are a major trigger event for hate speech on social media in almost every country. A systematic review of previous studies indicates inadequate publicly available annotated datasets and hardly any evidence of theoretical underpinning for the annotation schemes used for hate speech identification. This situation stifles the development of empirically useful data for research, especially in supervised machine learning. This paper describes the methodology used to develop a multidimensional hate speech framework based on the duplex theory of hate [1], whose components include distance, passion, commitment to hate, and hate as a story. Subsequently, an annotation scheme based on the framework was used to annotate a random sample of ~51k tweets drawn from ~400k tweets collected during the August and October 2017 presidential campaign period in Kenya. This resulted in a gold-standard codeswitched dataset that could be used for comparative and empiri...
Directions for NLP Practices Applied to Online Hate Speech Detection
2022
Addressing hate speech in online spaces has been conceptualized as a classification task that uses Natural Language Processing (NLP) techniques. Through this conceptualization, the hate speech detection task has relied on common conventions and practices from NLP. For instance, inter-annotator agreement is conceptualized as a way to measure dataset quality, and certain metrics and benchmarks are used to ensure model generalization. However, hate speech is a deeply complex and situated concept that eludes such static and disembodied practices. In this position paper, we critically reflect on these methodologies for hate speech detection, argue that many conventions in NLP are poorly suited to the problem, and encourage researchers to develop methods that are more appropriate for the task.
Automated Hate Speech Target Identification
2021
We present a new human-labelled Slovenian Twitter dataset annotated for hate speech targets, and attempt automated hate speech target classification via different machine learning approaches. This work represents, to our knowledge, one of the first attempts to solve a Slovene-based text classification task with an autoML approach. Our results show that the classification task is a difficult one, both in terms of annotator agreement and in terms of classifier performance. The best performing classifier is SloBERTa-based, followed by AutoBOT-neurosymbolic-full.
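For orientation, a bag-of-words baseline for such a target-classification task can be set up in a few lines; this is not the paper's SloBERTa or autoBOT system, and the texts and target labels are invented:

```python
# Minimal sketch of a bag-of-words baseline for hate speech target
# classification. This is NOT the paper's SloBERTa or autoBOT setup;
# texts and target labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts   = ["tweet one ...", "tweet two ...", "tweet three ...", "tweet four ..."]
targets = ["migrants", "journalists", "migrants", "politicians"]

baseline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
baseline.fit(texts, targets)
print(baseline.predict(["tweet five ..."]))
```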