Investigating cross-lingual training for offensive language detection

Deep-BERT: Transfer Learning for Classifying Multilingual Offensive Texts on Social Media

Computer Systems Science and Engineering

Offensive messages on social media are frequently used to harass and criticize people. Many promising algorithms have been developed in recent studies to identify offensive texts, but most analyze text in a unidirectional manner, whereas a bidirectional method can capture semantic and contextual information in sentences and improve performance. In addition, many separate models exist for identifying offensive texts in either monolingual or multilingual settings, but few can detect both. In this study, a detection system for both monolingual and multilingual offensive texts is developed by combining a deep convolutional neural network with bidirectional encoder representations from transformers (Deep-BERT) to identify offensive posts on social media that are used to harass others. The paper explores a variety of ways to deal with multilingualism, including collaborative multilingual and translation-based approaches. Deep-BERT is then tested on Bengali and English datasets with different BERT pre-trained word-embedding techniques; the proposed Deep-BERT outperforms all existing offensive text classification algorithms, reaching an accuracy of 91.83%. The proposed model achieves state-of-the-art performance in classifying both monolingual and multilingual offensive texts.
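A minimal sketch of the kind of BERT-plus-CNN architecture the abstract describes, assuming a multilingual BERT encoder feeding 1-D convolutions over the token sequence; the checkpoint, kernel widths, and layer sizes are illustrative assumptions, not the paper's configuration:

```python
# Sketch of a BERT + CNN ("Deep-BERT"-style) classifier; sizes are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertCnnClassifier(nn.Module):
    def __init__(self, model_name="bert-base-multilingual-cased", num_labels=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        # 1-D convolutions over the token dimension capture local n-gram patterns
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, 128, kernel_size=k) for k in (2, 3, 4)]
        )
        self.dropout = nn.Dropout(0.1)
        self.fc = nn.Linear(128 * 3, num_labels)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) -> (batch, hidden, seq_len) for Conv1d
        hidden_states = self.bert(
            input_ids, attention_mask=attention_mask).last_hidden_state
        x = hidden_states.transpose(1, 2)
        # Max-pool each convolutional feature map over time, then concatenate
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
batch = tokenizer(["example post"], return_tensors="pt", padding=True)
logits = BertCnnClassifier()(batch["input_ids"], batch["attention_mask"])
```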

AlexU-BackTranslation-TL at SemEval-2020 Task 12: Improving Offensive Language Detection Using Data Augmentation and Transfer Learning

Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

Social media platforms, online news commenting spaces, and many other public forums have become widely known for issues of abusive behavior such as cyber-bullying and personal attacks. In this paper, we use the annotated tweets of the Offensive Language Identification Dataset (OLID) to train three levels of deep learning classifiers to solve the three sub-tasks associated with the dataset. Sub-task A is to determine whether the tweet is toxic. Then, for offensive tweets, sub-task B requires determining whether the toxicity is targeted. Finally, for sub-task C, we predict the target of the offense, i.e., a group, an individual, or another entity. In our solution, we tackle the class imbalance in the dataset by using back translation for data augmentation and utilizing a fine-tuned BERT model in an ensemble of deep learning classifiers. We used this solution to participate in the three English sub-tasks of SemEval-2020 Task 12. The proposed solution achieved macro-averaged F1 scores of 0.91393, 0.6300, and 0.57607 in sub-tasks A, B, and C, placing 8th, 14th, and 21st, respectively.
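A minimal sketch of the back-translation augmentation idea, assuming MarianMT checkpoints and French as the pivot language; both are illustrative choices, not confirmed by the paper:

```python
# Back-translation sketch for augmenting minority-class tweets.
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_fr_tok, en_fr = load("Helsinki-NLP/opus-mt-en-fr")
fr_en_tok, fr_en = load("Helsinki-NLP/opus-mt-fr-en")

def back_translate(texts):
    # English -> French
    fr = en_fr.generate(**en_fr_tok(texts, return_tensors="pt", padding=True))
    fr_texts = en_fr_tok.batch_decode(fr, skip_special_tokens=True)
    # French -> English: paraphrases that keep the label but vary the wording
    en = fr_en.generate(**fr_en_tok(fr_texts, return_tensors="pt", padding=True))
    return fr_en_tok.batch_decode(en, skip_special_tokens=True)

print(back_translate(["you are all a bunch of clowns"]))
```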

Zero-shot Cross-lingual Content Filtering: Offensive Language and Hate Speech Detection

2021

We present a system for zero-shot cross-lingual offensive language and hate speech classification. The system was trained on English datasets and tested on detecting hate speech and offensive social media content in a number of languages without any additional training. Experiments show an impressive ability of both models to generalize from English to other languages, although there is an expected gap in performance between the tested cross-lingual models and the monolingual models. The best-performing model (the offensive content classifier) is available online as a REST API.
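A minimal sketch of the zero-shot setup, assuming a multilingual encoder fine-tuned only on English data and then applied unchanged to other languages; the checkpoint name below is a placeholder, not the paper's released model:

```python
# Zero-shot sketch: a multilingual encoder fine-tuned only on English
# offensive-language data is applied as-is to other languages.
# "xlm-roberta-base" is a placeholder; in practice you would load your own
# English-fine-tuned checkpoint (the paper serves its model as a REST API).
from transformers import pipeline

clf = pipeline("text-classification", model="xlm-roberta-base")

# No target-language training data is involved: multilingual pre-training
# alone carries the decision boundary across languages.
print(clf(["Das ist ein völlig harmloser Satz."]))
```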

Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models

arXiv, 2022

Zero-shot cross-lingual transfer learning has been shown to be highly challenging for tasks involving a lot of linguistic specificities or when a cultural gap is present between languages, as in hate speech detection (e.g., hate speech towards Chinese communities spiked in 2020 with the emergence of the COVID-19 pandemic). In this paper, we highlight this limitation for hate speech detection in several domains and languages using strict experimental settings. We then propose to train on multilingual auxiliary tasks (sentiment analysis, named entity recognition, and tasks relying on syntactic information) to improve zero-shot transfer of hate speech detection models across languages. We show how hate speech detection models benefit from a cross-lingual knowledge proxy brought by auxiliary-task fine-tuning and highlight these tasks' positive impact on bridging the linguistic and cultural hate speech gap between languages.
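A minimal sketch of auxiliary-task training via hard parameter sharing, assuming one shared multilingual encoder with task-specific heads; the task set, label counts, and checkpoint are illustrative assumptions:

```python
# Hard-parameter-sharing sketch: one shared multilingual encoder, separate
# heads for hate speech, sentiment, and NER. Sizes are placeholders.
import torch.nn as nn
from transformers import AutoModel

class MultiTaskModel(nn.Module):
    def __init__(self, encoder_name="xlm-roberta-base"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        h = self.encoder.config.hidden_size
        self.heads = nn.ModuleDict({
            "hate": nn.Linear(h, 2),       # sentence-level: hateful or not
            "sentiment": nn.Linear(h, 3),  # sentence-level: neg/neu/pos
            "ner": nn.Linear(h, 9),        # token-level: BIO tags
        })

    def forward(self, task, input_ids, attention_mask):
        out = self.encoder(
            input_ids, attention_mask=attention_mask).last_hidden_state
        if task == "ner":
            return self.heads[task](out)       # per-token logits
        return self.heads[task](out[:, 0])     # [CLS]-style pooled logits
```

Batches from the auxiliary tasks and the hate speech task are interleaved during fine-tuning, so the shared encoder absorbs cross-lingual signal from all of them.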

GruPaTo at SemEval-2020 Task 12: Retraining mBERT on Social Media and Fine-tuned Offensive Language Models

Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

We introduce an approach to multilingual offensive language detection based on the mBERT transformer model. We download extra training data from Twitter in English, Danish, and Turkish and use it to retrain the model. We then fine-tune the model on the provided training data and, in some configurations, implement a transfer learning approach exploiting the typological relatedness of English and Danish. Our systems obtained good results across the three languages (0.9036 for EN, 0.7619 for DA, and 0.7789 for TR).
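A minimal sketch of the retraining step, assuming continued masked-language-model pre-training of mBERT on the scraped tweets before task fine-tuning; file names and hyperparameters are illustrative assumptions:

```python
# Domain-adaptive retraining sketch: continue mBERT's MLM pre-training on
# in-domain tweets, then fine-tune the adapted checkpoint on the task data.
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# Placeholder file: one scraped tweet per line (EN/DA/TR mixed).
tweets = load_dataset("text", data_files={"train": "tweets_en_da_tr.txt"})["train"]
tweets = tweets.map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mbert-twitter", num_train_epochs=1),
    train_dataset=tweets,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()  # the adapted checkpoint is then fine-tuned on labelled data
```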

Using Transfer-based Language Models to Detect Hateful and Offensive Language Online

Proceedings of the Fourth Workshop on Online Abuse and Harms, 2020

Distinguishing hate speech from non-hate offensive language is challenging, as hate speech does not always include offensive slurs and offensive language does not always express hate. Here, four deep learners based on Bidirectional Encoder Representations from Transformers (BERT), with either general or domain-specific language models, were tested against two datasets containing tweets labelled as 'Hateful', 'Normal', or 'Offensive'. The results indicate that the attention-based models profoundly confuse hate speech with offensive and normal language. However, the pre-trained models outperform state-of-the-art results in accurately predicting the hateful instances.
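A minimal sketch of the experimental setup, assuming a BERT checkpoint fine-tuned for the three-way Hateful/Normal/Offensive labelling; the checkpoint, data rows, and hyperparameters are placeholders (swap in a domain-specific checkpoint to reproduce the comparison):

```python
# Three-class fine-tuning sketch for Hateful/Normal/Offensive tweets.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

labels = {"Hateful": 0, "Normal": 1, "Offensive": 2}
data = Dataset.from_dict({  # placeholder rows standing in for labelled tweets
    "text": ["an example normal tweet", "an example offensive tweet"],
    "label": [labels["Normal"], labels["Offensive"]],
})

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
data = data.map(lambda b: tok(b["text"], truncation=True,
                              padding="max_length", max_length=64),
                batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)
Trainer(model=model,
        args=TrainingArguments(output_dir="bert-hate3", num_train_epochs=3),
        train_dataset=data).train()
```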

FBK-DH at SemEval-2020 Task 12: Using Multi-channel BERT for Multilingual Offensive Language Detection

Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

In this paper we present our submission to sub-task A of SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2). For Danish, Turkish, Arabic, and Greek, we develop an architecture based on transfer learning and relying on a two-channel BERT model, in which the English BERT and the multilingual one are combined after creating a machine-translated parallel corpus for each language in the task. For English, instead, we adopt a more standard single-channel approach. We find that, in a multilingual scenario where some languages have little training data, using parallel BERT models with machine-translated data can give systems more stability, especially when dealing with noisy data. Even though machine translation of social media data may be imperfect, it does not hurt the overall classification performance.
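A minimal sketch of a two-channel model in this spirit, assuming the original tweet is encoded by mBERT and its machine translation into English by English BERT, with the pooled vectors concatenated before classification; checkpoints and dimensions are illustrative assumptions:

```python
# Two-channel sketch: original-language text through mBERT, its English
# machine translation through English BERT, concatenated for classification.
import torch
import torch.nn as nn
from transformers import AutoModel

class TwoChannelBert(nn.Module):
    def __init__(self, num_labels=2):
        super().__init__()
        self.mbert = AutoModel.from_pretrained("bert-base-multilingual-cased")
        self.enbert = AutoModel.from_pretrained("bert-base-cased")
        h = self.mbert.config.hidden_size + self.enbert.config.hidden_size
        self.classifier = nn.Linear(h, num_labels)

    def forward(self, orig_ids, orig_mask, trans_ids, trans_mask):
        # Channel 1: original-language tweet; channel 2: its English translation
        a = self.mbert(orig_ids,
                       attention_mask=orig_mask).last_hidden_state[:, 0]
        b = self.enbert(trans_ids,
                        attention_mask=trans_mask).last_hidden_state[:, 0]
        return self.classifier(torch.cat([a, b], dim=-1))
```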

Transfer language selection for zero-shot cross-lingual abusive language detection

Information Processing & Management, 2022

• Zero-shot cross-lingual transfer can yield good results for abusive language detection.
• Linguistic similarity metrics can be used to find an optimal cross-lingual transfer language, at least for abusive language detection.
• Choosing a transfer language by intuition, for example by looking only at the same language family, is not optimal.
• The World Atlas of Language Structures can be quantified into an effective linguistic similarity metric.
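A minimal sketch of the last highlight's idea, quantifying WALS features into a similarity score for choosing a transfer language; the feature dictionaries below are tiny placeholders, not real WALS data:

```python
# WALS-based similarity sketch: represent each language by its WALS feature
# values and score candidate transfer languages by the share of shared values.
def wals_similarity(lang_a: dict, lang_b: dict) -> float:
    shared = [f for f in lang_a if f in lang_b]
    if not shared:
        return 0.0
    return sum(lang_a[f] == lang_b[f] for f in shared) / len(shared)

wals = {  # placeholder feature IDs/values for illustration only
    "english": {"81A": "SVO", "85A": "Prep", "51A": "NoCase"},
    "danish":  {"81A": "SVO", "85A": "Prep", "51A": "NoCase"},
    "turkish": {"81A": "SOV", "85A": "Postp", "51A": "Case"},
}

target = "danish"
candidates = [l for l in wals if l != target]
best = max(candidates, key=lambda l: wals_similarity(wals[l], wals[target]))
print(best)  # pick the most similar language as the transfer source
```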

Transfer Learning for Hate Speech Detection in Social Media

arXiv, 2019

Today, the internet is an integral part of our daily lives, enabling people to be more connected than ever before. However, this greater connectivity and access to information increase exposure to harmful content such as cyber-bullying and cyber-hatred. Models based on machine learning and natural language processing offer a way to make online platforms safer by autonomously identifying hate speech in web text. However, the main difficulty is annotating a sufficiently large number of examples to train these models. This paper uses a transfer learning technique to leverage two independent datasets jointly and build a single representation of hate speech. We build an interpretable two-dimensional visualization tool of the constructed hate speech representation, dubbed the Map of Hate, in which multiple datasets can be projected and comparatively analyzed. The hateful content is annotated differently across the two datasets (racist and sexist in one dataset, hateful and offensive in the other). However, the common representation successfully projects the harmless class of both datasets into the same space and can be used to uncover labeling errors (false positives). We also show that the joint representation boosts prediction performance when only a limited amount of supervision is available.
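A minimal sketch of a Map-of-Hate-style view, assuming a shared sentence encoder and a t-SNE projection to two dimensions; the paper's exact encoder and projection method may differ:

```python
# Project posts from two differently-annotated datasets, embedded by one
# shared encoder, into a 2-D map. Texts and checkpoint are placeholders.
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in shared encoder
texts_a = ["example racist post", "harmless post a"]     # dataset 1 (placeholder)
texts_b = ["example offensive post", "harmless post b"]  # dataset 2 (placeholder)

emb = encoder.encode(texts_a + texts_b)
xy = TSNE(n_components=2, perplexity=2).fit_transform(emb)

plt.scatter(xy[: len(texts_a), 0], xy[: len(texts_a), 1], label="dataset 1")
plt.scatter(xy[len(texts_a):, 0], xy[len(texts_a):, 1], label="dataset 2")
plt.legend()
plt.show()  # harmless content from both datasets should overlap in the map
```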