Defining and detecting toxicity on social media: context and knowledge are key

Characterizing Variation in Toxic Language by Social Context

2020

How two people speak to one another depends heavily on the nature of their relationship. For example, the same phrase said to a friend in jest may be offensive to a stranger. In this paper, we apply this simple observation to study toxic comments in online social networks. We curate a collection of 6.7K tweets containing potentially toxic terms from users with different relationship types, as determined by the nature of their follower-friend connection. We find that such tweets between users with no connection are nearly three times as likely to be toxic as those between users who are mutual friends, and that taking into account this relationship type improves toxicity detection methods by about 5% on average. Furthermore, we provide a descriptive analysis of how toxic language varies by relationship type, finding for example that mildly offensive terms are used to express hostility more commonly between users with no social connection than users who are mutual friends.
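
A rough illustration (not the authors' code) of the core idea: folding the sender-recipient relationship type into an otherwise standard text classifier. The column names, the relationship categories, and the toy data are assumptions made for demonstration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: tweet text, follower-friend relationship, toxicity label.
df = pd.DataFrame({
    "text": ["you absolute idiot", "you absolute idiot lol", "great game last night"],
    "relationship": ["none", "mutual", "mutual"],  # e.g., none / one-way / mutual
    "toxic": [1, 0, 0],  # same phrase, different label depending on relationship
})

# Text features plus a one-hot encoding of the relationship type.
features = ColumnTransformer([
    ("text", TfidfVectorizer(), "text"),
    ("rel", OneHotEncoder(handle_unknown="ignore"), ["relationship"]),
])

model = Pipeline([("features", features), ("clf", LogisticRegression())])
model.fit(df[["text", "relationship"]], df["toxic"])
```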

Predicting Different Types of Subtle Toxicity in Unhealthy Online Conversations

ArXiv, 2021

This paper investigates the use of machine learning models for the classification of unhealthy online conversations containing one or more forms of subtler abuse, such as hostility, sarcasm, and generalization. We leveraged a public dataset of 44K online comments, healthy and unhealthy, labeled with seven forms of subtle toxicity. We were able to distinguish between these comments with a top micro F1-score, macro F1-score, and ROC-AUC of 88.76%, 67.98%, and 0.71, respectively. Hostile comments were easier to detect than other types of unhealthy comments. We also conducted a sentiment analysis, which revealed that most types of unhealthy comments were associated with a slightly negative sentiment, with hostile comments being the most negative.
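
For reference, the three metrics reported above can be computed as in the sketch below for a multi-label setup; the toy arrays are placeholders, not the paper's data.

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

# Toy multi-label ground truth and predicted probabilities (3 labels, 4 comments).
y_true = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]])
y_prob = np.array([[0.9, 0.2, 0.1], [0.3, 0.8, 0.7], [0.6, 0.9, 0.2], [0.1, 0.2, 0.6]])
y_pred = (y_prob >= 0.5).astype(int)  # threshold probabilities at 0.5

print("micro F1:", f1_score(y_true, y_pred, average="micro"))   # pools all labels
print("macro F1:", f1_score(y_true, y_pred, average="macro"))   # averages per-label F1
print("ROC-AUC :", roc_auc_score(y_true, y_prob, average="macro"))
```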

Defending Digital Discourse: Developing a Toxic Comment Classifier for Fostering Healthy Online Communities

International Journal of Computer Science and Engineering, 2024

Trolls and abusive users infiltrate online communities and undermine the healthy interactions their members could otherwise have. Our work therefore aims to develop models for the automatic detection and classification of toxic comments. The study proceeds in four steps. The first is data preparation, in which the data is loaded and preprocessed. The second is Exploratory Data Analysis (EDA), which describes the toxic labels in the data and how they vary. Third, the text is standardized using preprocessing techniques such as lowercasing and punctuation removal before model training. Finally, logistic regression and Naive Bayes models are trained to label each category of the toxicity classifier. Accuracy exceeded 96% across all categories: 96.9% for toxic comments, 97.2% for severe toxicity, 97.7% for obscenity, 98.9% for threats, 97.1% for insults, and 96.9% for identity hate. The full pipeline ran in only 2 minutes and 58.24 seconds, indicating its efficiency and scalability.
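
A minimal sketch of the pipeline as described, assuming Jigsaw-style label names: lowercasing and punctuation removal, TF-IDF features, and one binary logistic regression per toxicity category (a Naive Bayes variant would swap in MultinomialNB). The toy comments are placeholders.

```python
import re
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def clean(text: str) -> str:
    """Standardize text: lowercase and strip punctuation."""
    return re.sub(r"[^\w\s]", " ", text.lower())

# Placeholder frame standing in for the real comment dataset.
df = pd.DataFrame({
    "comment_text": ["You are awful!!", "I will find you", "what the $#@% is this",
                     "go back to your country", "Nice post :)"],
    "toxic":         [1, 1, 1, 1, 0],
    "severe_toxic":  [0, 1, 0, 0, 0],
    "obscene":       [0, 0, 1, 0, 0],
    "threat":        [0, 1, 0, 0, 0],
    "insult":        [1, 0, 0, 0, 0],
    "identity_hate": [0, 0, 0, 1, 0],
})

# One binary classifier per label, sharing a single TF-IDF representation.
model = Pipeline([
    ("tfidf", TfidfVectorizer(preprocessor=clean)),
    ("ovr", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])
model.fit(df["comment_text"], df[LABELS].values)
```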

Detection of toxicity in social media: A study on semantic orientation and linguistic structure

2022

Social networks’ astonishing increase in popularity has allowed users to stay connected to friends and family and to make new connections, whether personal, academic, or professional. The benefits social media offered during the COVID-19 pandemic are beyond doubt, as it provided a virtual environment for meetings and social interaction. However, social networks have a dark side, which manifests as toxic content. The toxicity present across social media platforms has alarmed users, researchers, and companies alike, prompting a growing number of studies on the detection and prevention of toxicity in social networks. Although the term toxic is not easy to pin down, the research community has worked from its own understanding and needs to define what counts as toxic, what forms of toxic content appear online, and how to detect them using various Machine Learning approaches. This work addresses toxicity detection on social media, focusing on the semantic orientation bias and the linguistic structure of messages. In particular, it builds on anisotropy: the observation that word vectors are not spread uniformly through the multidimensional space but are oriented in a particular direction. For this reason, we use Static Word Embeddings, as they preserve the semantic properties of the words they represent. We performed experiments on vector proximity and orientation proximity to check whether these factors can predict new toxic messages. The second foundation of this work is to explore whether linguistic structure influences the detection of toxic content: that is, whether certain words, categories, or structures carry more weight in toxicity detection, and how sentence vectors can be composed to address the same question at the sentence level. Several experiments illustrated which linguistic content was most relevant to consider: at the word level, we selected nouns and excluded stopwords (as they carry an inherent semantic orientation bias); at the sentence level, we performed the composition process linearly with a simple global average composition function, which takes the mean of all the word vectors in a sentence to obtain a sentence vector. The results confirm that toxic content indeed shows a directional orientation bias toward the same region of semantic space, and that linguistic structure plays a role in such content.
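
The two core ideas, global-average sentence composition over static embeddings and orientation proximity to known-toxic content, can be sketched as follows. The 4-dimensional "embeddings" and the toxic-centroid construction are illustrative assumptions standing in for real static vectors such as GloVe or word2vec.

```python
import numpy as np

# Placeholder static word embeddings (word -> vector); real ones would be
# GloVe/word2vec vectors with hundreds of dimensions.
EMB = {
    "idiot":  np.array([0.9, 0.1, -0.3, 0.2]),
    "stupid": np.array([0.8, 0.2, -0.4, 0.1]),
    "sunny":  np.array([-0.5, 0.7, 0.6, 0.0]),
    "day":    np.array([-0.4, 0.6, 0.5, 0.1]),
}

def sentence_vector(tokens):
    """Global average composition: mean of the word vectors in the sentence."""
    return np.mean([EMB[t] for t in tokens if t in EMB], axis=0)

def cosine(a, b):
    """Orientation proximity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Centroid of known-toxic sentence vectors defines the "toxic direction".
toxic_centroid = np.mean([sentence_vector(["idiot"]),
                          sentence_vector(["stupid"])], axis=0)

for sent in (["stupid", "idiot"], ["sunny", "day"]):
    print(sent, round(cosine(sentence_vector(sent), toxic_centroid), 3))
```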

Are These Comments Triggering? Predicting Triggers of Toxicity in Online Discussions

2020

Understanding the causes or triggers of toxicity adds a new dimension to the prevention of toxic behavior in online discussions. In this research, we define toxicity triggers in online discussions as non-toxic comments that lead to toxic replies. We then build a neural network-based prediction model for toxicity triggers. The model incorporates text-based features and derived features from previous studies that pertain to shifts in sentiment, topic flow, and discussion context. Our findings show that triggers of toxicity contain identifiable features and that, by incorporating shift features together with the discussion context, they can be detected with a ROC-AUC score of 0.87. We discuss implications for online communities as well as possible further analysis of online toxicity and its root causes.
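
A rough sketch (an assumption, not the authors' architecture) of the feature combination described above: text features of the candidate comment concatenated with a derived sentiment-shift score, fed to a small neural network.

```python
import numpy as np
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

# Hypothetical non-toxic comments and whether they drew toxic replies.
comments = ["the new policy takes effect monday",
            "here is a neutral summary of the debate",
            "the schedule moved to 9am"]
is_trigger = [1, 0, 0]

# Derived shift feature: reply sentiment minus comment sentiment (placeholder values).
sentiment_shift = np.array([[-0.8], [-0.1], [0.0]])

tfidf = TfidfVectorizer()
X_text = tfidf.fit_transform(comments)
X = hstack([X_text, sentiment_shift])  # text features + shift feature

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
clf.fit(X, is_trigger)
```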

Detection and Classification of Online Toxic Comments

2021

In the current century, social media has created many job opportunities and become a unique place for people to express their opinions freely. But like every coin, it has two sides: alongside its benefits, social media has many drawbacks. A small fraction of users takes advantage of the system, misusing this opportunity to express a toxic mindset (e.g., insults, verbal sexual harassment, and other foul behavior), and cyberbullying has consequently become a major problem. If the hurtful, toxic language posted on platforms like Twitter, Instagram, and Facebook can be filtered out, the online world becomes a safer and more harmonious place. We drew initial ideas from existing toxic comment classifiers, then built on what we found to make the most user-friendly product possible. For this project, we created a Toxic Comments Classifier which will classify the comments depending on the category of t...

ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter

2020

The convenience of social media has also enabled its misuse, potentially resulting in toxic behavior. Nearly 66% of internet users have observed online harassment, and 41% claim personal experience, with 18% facing severe forms of online harassment. This toxic communication has a significant impact on the well-being of young individuals, affecting mental health and, in some cases, resulting in suicide. These communications exhibit complex linguistic and contextual characteristics, making recognition of such narratives challenging. In this paper, we provide a multimodal dataset of toxic social media interactions between confirmed high school students, called ALONE (AdoLescents ON twittEr), along with descriptive explanations. Each instance of interaction includes tweets, images, emoji, and related metadata. Our observations show that individual tweets do not provide sufficient evidence for toxic behavior, and meaningful use of context in interactions can enable highlighting or exonerat...

Topic-driven toxicity: Exploring the relationship between online toxicity and news topics

PLoS ONE, 2020

Hateful commenting, also known as ‘toxicity’, frequently takes place within news stories in social media. Yet, the relationship between toxicity and news topics is poorly understood. To analyze how news topics relate to the toxicity of user comments, we classify topics of 63,886 online news videos of a large news channel using a neural network and topical tags used by journalists to label content. We score 320,246 user comments from those videos for toxicity and compare how the average toxicity of comments varies by topic. Findings show that topics like Racism, Israel-Palestine, and War & Conflict have more toxicity in the comments, and topics such as Science & Technology, Environment & Weather, and Arts & Culture have less toxic commenting. Qualitative analysis reveals five themes: Graphic videos, Humanistic stories, History and historical facts, Media as a manipulator, and Religion. We also observe cases where a typically more toxic topic becomes non-toxic and where a typically less toxic topic becomes “toxicified” when it involves sensitive elements, such as politics and religion. Findings suggest that news comment toxicity can be characterized as topic-driven toxicity that targets topics rather than as vindictive toxicity that targets users or groups. Practical implications suggest that humanistic framing of the news story (i.e., reporting stories through real everyday people) can reduce toxicity in the comments of an otherwise toxic topic.
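
The topic-level comparison at the heart of this study reduces to grouping per-comment toxicity scores by topic; a minimal sketch, with a placeholder frame standing in for the real scored comments:

```python
import pandas as pd

# Placeholder: each comment carries its video's topic and a toxicity score in [0, 1].
comments = pd.DataFrame({
    "topic": ["Racism", "Racism", "Arts & Culture", "Environment & Weather"],
    "toxicity": [0.82, 0.64, 0.08, 0.12],
})

# Rank topics by mean comment toxicity (count shown for context).
by_topic = (comments.groupby("topic")["toxicity"]
            .agg(["mean", "count"])
            .sort_values("mean", ascending=False))
print(by_topic)
```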