Predicting Different Types of Subtle Toxicity in Unhealthy Online Conversations
Related papers
An Intense Study of Machine Learning Research Approach to Identify Toxic Comments
International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 2022
Most comments in public online forums are constructive, but a significant proportion are toxic. Raw comments contain numerous errors, so the dataset must be processed through a variety of cleaning tasks before being fed to classification models. In this study, we propose classifying toxic comments with a machine learning approach on a multilingual toxic comment dataset. Logistic regression is applied to the processed dataset to distinguish toxic from non-toxic comments, and a multi-headed model estimates each type of toxicity (obscene, insult, severe toxic, threat, and identity hate) versus non-toxicity. We implemented four models (LSTM, GRU, RNN, and BiLSTM) to detect toxic comments. All models are written in Python 3 with a simple structure that can be adapted to other tasks. The classification results show that all models solve the task effectively, with BiLSTM achieving the best accuracy.
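For illustration, a minimal sketch of the multi-headed setup this abstract describes: TF-IDF features with one independent logistic regression per toxicity label. The file path and column names (`train.csv`, `comment_text`) are hypothetical stand-ins, not taken from the paper.

```python
# Illustrative sketch: TF-IDF features + one logistic regression per label.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

df = pd.read_csv("train.csv")  # hypothetical Jigsaw-style dataset

clf = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=50_000, ngram_range=(1, 2))),
    # One independent binary classifier per toxicity label ("multi-headed")
    ("lr", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])
clf.fit(df["comment_text"], df[LABELS])

# Each prediction is a 0/1 vector over the six labels
print(clf.predict(["thanks for the helpful edit", "you are an idiot"]))
```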
Are These Comments Triggering? Predicting Triggers of Toxicity in Online Discussions
2020
Understanding the causes or triggers of toxicity adds a new dimension to the prevention of toxic behavior in online discussions. In this research, we define a toxicity trigger in an online discussion as a non-toxic comment that leads to toxic replies. We then build a neural network-based prediction model for toxicity triggers. The model incorporates text-based features and derived features from previous studies that pertain to shifts in sentiment, topic flow, and discussion context. Our findings show that toxicity triggers contain identifiable features and that, by incorporating shift features with the discussion context, they can be detected with a ROC-AUC score of 0.87. We discuss implications for online communities and possible further analysis of online toxicity and its root causes.
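The sketch below is hedged and illustrative, not the authors' exact architecture: it joins a learned text representation with the kind of derived shift features (sentiment shift, topic flow, context) the abstract mentions. All shapes and layer sizes are assumptions.

```python
# Hedged sketch: a two-input network fusing comment text with shift features.
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, layers

VOCAB, SEQ_LEN, N_SHIFT = 20_000, 100, 8  # assumed sizes

text_in = layers.Input(shape=(SEQ_LEN,), name="comment_tokens")
shift_in = layers.Input(shape=(N_SHIFT,), name="shift_features")  # sentiment/topic/context

x = layers.Embedding(VOCAB, 128)(text_in)
x = layers.Bidirectional(layers.LSTM(64))(x)
h = layers.Concatenate()([x, shift_in])      # fuse text and shift features
h = layers.Dense(64, activation="relu")(h)
out = layers.Dense(1, activation="sigmoid", name="is_trigger")(h)

model = Model([text_in, shift_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])

# Smoke test on random data
tokens = np.random.randint(0, VOCAB, size=(32, SEQ_LEN))
shifts = np.random.rand(32, N_SHIFT).astype("float32")
labels = np.random.randint(0, 2, size=(32, 1))
model.fit([tokens, shifts], labels, epochs=1, verbose=0)
```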
Toxic Comment Classification Using Neural Networks and Machine Learning
International Advanced Research Journal in Science, Engineering and Technology, 2018
A cornucopia of data is generated through online human conversation and interaction. This has contributed considerably to the quality of human life, but it also carries serious dangers: highly toxic online text communication leads to personal attacks, online provocation, and harassment. This has engaged both industry and the research community in recent years, with several attempts to identify an efficient model for online toxic comment classification and prediction. However, these efforts are still in their infancy, and new approaches and architectures are required. In parallel, the constant explosion of information makes the development of new machine learning tools for managing this data a basic need. Fortunately, advances in big data management, hardware, and cloud computing allow the development of deep learning approaches that have shown very encouraging performance so far. Recently, Convolutional Neural Networks and Recurrent Neural Networks have been applied to text classification. In this work, we apply this approach to detecting toxic comments in a large pool of documents provided by a Kaggle competition on Wikipedia's talk page edits, which divides toxicity into six labels: toxicity, severe toxicity, obscenity, threat, insult, and identity hate.
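As a concrete example of the CNN approach on the six labels, here is an illustrative sketch; the hyperparameters are assumptions, not taken from the paper.

```python
# Illustrative CNN text classifier for the six toxicity labels.
from tensorflow.keras import Sequential, layers

VOCAB, SEQ_LEN, N_LABELS = 20_000, 200, 6

model = Sequential([
    layers.Input(shape=(SEQ_LEN,)),
    layers.Embedding(VOCAB, 128),
    layers.Conv1D(128, kernel_size=5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    # Independent sigmoids: a comment can carry several labels at once
    layers.Dense(N_LABELS, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```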
International Journal of Computer Science and Engineering, 2024
Trolls and abusive users tend to penetrate online communities and ruin the healthy interactions that members could otherwise have. Our work therefore aims to develop models for the automatic detection and classification of toxic comments. The study is executed in four steps. The first is data preparation, in which the data is loaded and preprocessed. The second is Exploratory Data Analysis (EDA), where we describe the toxic labels in the data and how they vary. The text is then standardized using preprocessing techniques such as lowercasing and punctuation removal before model training. For model training, logistic regression and Naive Bayes models are used to label each category of toxicity. More than 96% accuracy is achieved across the categories: 96.9% for toxic comments, 97.2% for severe toxicity, 97.7% for obscenity, 98.9% for threats, 97.1% for insults, and 96.9% for identity hate. The entire run took only 2 minutes and 58.24 seconds, indicating the approach's efficiency and scalability.
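A minimal sketch of the four-step pipeline described above, assuming a Jigsaw-style CSV with a `comment_text` column and one binary column per label; the exact preprocessing and hyperparameters are assumptions.

```python
# Minimal sketch: clean text, vectorize, then compare Logistic Regression
# and Naive Bayes per label. Paths and columns are assumed.
import re
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

def clean(text: str) -> str:
    """Lowercase and strip punctuation (the standardization step)."""
    return re.sub(r"[^\w\s]", "", text.lower())

df = pd.read_csv("train.csv")  # hypothetical Jigsaw-style dataset
X = CountVectorizer(max_features=30_000).fit_transform(df["comment_text"].map(clean))

for label in ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, df[label], test_size=0.2, random_state=0)
    for model in (LogisticRegression(max_iter=1000), MultinomialNB()):
        acc = accuracy_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
        print(f"{label:14s} {type(model).__name__:18s} acc={acc:.3f}")
```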
IJERT-Multilabel Toxic Comment Detection and Classification
International Journal of Engineering Research and Technology (IJERT), 2021
https://www.ijert.org/multilabel-toxic-comment-detection-and-classification
https://www.ijert.org/research/multilabel-toxic-comment-detection-and-classification-IJERTV10IS050012.pdf
Toxic comments are hateful online comments that are disrespectful or abusive towards an individual or community. With the boom of the internet, many users have been brought to online social discussion platforms. These platforms were created to exchange ideas, learn new things, and have meaningful conversations, but toxic comments prevent many users from making their points in online discussions, degrading the quality of discussion. In this paper we check the toxicity of a comment and, if it is toxic, classify it into different categories to examine the type of toxicity. We apply different machine learning and deep learning algorithms to our dataset and select the best based on our evaluation methodology. Going forward, we seek high performance from our machine learning and deep learning models, which will help limit the toxicity present on various discussion sites.
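One way the check-then-classify idea could be sketched, using placeholder linear models rather than the paper's (unspecified) chosen algorithms; `fit_two_stage`, `classify`, and their inputs are hypothetical names.

```python
# Hypothetical two-stage sketch: a binary "is it toxic?" gate, then a
# multilabel type classifier trained only on the toxic subset.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

TYPES = ["severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def fit_two_stage(texts, is_toxic, type_labels):
    """texts: list[str]; is_toxic: (n,) 0/1 array; type_labels: (n, 5) 0/1 array."""
    vec = TfidfVectorizer(max_features=30_000).fit(texts)
    X = vec.transform(texts)
    gate = LogisticRegression(max_iter=1000).fit(X, is_toxic)
    toxic_rows = np.flatnonzero(is_toxic)  # stage 2 trains on toxic comments only
    typer = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    typer.fit(X[toxic_rows], np.asarray(type_labels)[toxic_rows])
    return vec, gate, typer

def classify(comment, vec, gate, typer):
    x = vec.transform([comment])
    if gate.predict(x)[0] == 0:
        return "non-toxic"
    return [t for t, flag in zip(TYPES, typer.predict(x)[0]) if flag]
```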
Detection and Classification of Online Toxic Comments
2021
In the current century, social media has created many job opportunities and has become a unique place for people to freely express their opinions. But like every coin, social media has two sides: along with the pros come many cons. A few users take advantage of the system and misuse the opportunity to express their toxic mindset (i.e., insults, verbal sexual harassment, foul behavior, etc.), and cyberbullying has consequently become a major problem. If we can filter out the hurtful, toxic words expressed on social media platforms like Twitter, Instagram, and Facebook, the online world will become a safer and more harmonious place. We gained initial ideas by researching current toxic comment classifiers to come up with this design, then made the most user-friendly product possible. For this project, we created a Toxic Comments Classifier which will classify the comments depending on the category of t...
International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022
Conversational toxicity is a problem that might drive people to stop truly expressing themselves and seeking out other people's opinions for fear of being attacked or harassed. The purpose of this research is to employ natural language processing (NLP) techniques to detect toxicity in writing, which could be used to alert people before they transmit potentially toxic messages. NLP is a branch of machine learning and artificial intelligence that enables computers to understand and interpret human language rather than simply read it: machines can comprehend written or spoken text and execute tasks such as speech recognition, sentiment analysis, text classification, and automatic text summarization.
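A minimal sketch of the "alert before transmitting" idea under stated assumptions: `toxicity_model.joblib` is a hypothetical, previously trained scikit-learn pipeline exposing `predict_proba`, and the threshold is illustrative.

```python
# Sketch of a pre-send toxicity check against a hypothetical trained model.
import joblib

model = joblib.load("toxicity_model.joblib")  # hypothetical artifact

def check_before_send(message: str, threshold: float = 0.7) -> bool:
    """Return True if the message looks safe; warn the user otherwise."""
    p_toxic = model.predict_proba([message])[0][1]
    if p_toxic >= threshold:
        print(f"Warning: this message may be toxic (p={p_toxic:.2f}).")
        return False
    return True
```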
Investigating Bias In Automatic Toxic Comment Detection: An Empirical Study
ArXiv, 2021
With the surge in online platforms, there has been an upsurge in user engagement on these platforms via comments and reactions. A large portion of such textual comments are abusive, rude, and offensive to the audience. With machine learning systems in place to check comments coming onto a platform, biases present in the training data get passed on to the classifier, leading to discrimination against certain classes, religions, and genders. In this work, we evaluate different classifiers and features to estimate the bias in these classifiers along with their performance on the downstream task of toxicity classification. Results show that improvement in the performance of automatic toxic comment detection models is positively correlated with mitigating biases in these models. In our work, an LSTM with attention mechanism proved to be a better modelling strategy than a CNN model. Further analysis shows that fastText embeddings are marginally preferable to GloVe embeddings when training models for toxi...
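One common bias probe consistent with this setup (illustrative, not necessarily the paper's exact metric) compares per-subgroup ROC-AUC against the overall AUC; a large gap suggests the classifier treats that group differently. The identity-term list is hypothetical.

```python
# Illustrative subgroup-AUC probe for a trained toxicity classifier.
import numpy as np
from sklearn.metrics import roc_auc_score

IDENTITY_TERMS = ["muslim", "christian", "gay", "female", "black"]

def subgroup_auc(texts, y_true, y_score):
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    print(f"overall    AUC: {roc_auc_score(y_true, y_score):.3f}")
    for term in IDENTITY_TERMS:
        mask = np.array([term in t.lower() for t in texts])
        # AUC is only defined when both classes appear in the subgroup
        if mask.any() and len(set(y_true[mask])) == 2:
            print(f"{term:10s} AUC: {roc_auc_score(y_true[mask], y_score[mask]):.3f}")
```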
PROVOKE: Toxicity trigger detection in conversations from the top 100 subreddits
2022
Promoting healthy discourse on community-based online platforms like Reddit can be challenging, especially when conversations show ominous signs of toxicity. In this study, we therefore find the turning points (i.e., toxicity triggers) that make conversations toxic. Before finding toxicity triggers, we built and evaluated various machine learning models to detect toxicity in Reddit comments. We then used our best-performing model, a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model that achieved an area under the receiver operating characteristic curve (AUC) score of 0.983, to detect toxicity. Next, we constructed conversation threads and used the toxicity prediction results to build a training set for detecting toxicity triggers. This entailed using our large-scale dataset to refine the definition of toxicity triggers and to build a trigger detection dataset from 991,806 conversation threads in the top 100 communities on Reddit. We then extracted sentiment shift, topical shift, and context-based features from the trigger detection dataset and used them to build a dual embedding biLSTM neural network that achieved an AUC score of 0.789. Our analysis of the trigger detection dataset showed that some triggering keywords, like 'racist' and 'women', are common across all communities, while others are specific to certain communities, like 'overwatch' in r/Games. The implication is that toxicity trigger detection algorithms can leverage generic approaches but must also tailor detection to specific communities.
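For reference, a minimal sketch of the paper's first stage, fine-tuning BERT as a binary toxicity detector with Hugging Face Transformers; the data file, column names, and hyperparameters are assumptions, not taken from the paper.

```python
# Hedged sketch: fine-tune BERT for binary toxicity detection.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Hypothetical file with "text" and "label" (0 = non-toxic, 1 = toxic) columns
ds = load_dataset("csv", data_files={"train": "reddit_comments.csv"})
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=128,
                          padding="max_length"), batched=True)

args = TrainingArguments(output_dir="toxicity-bert",
                         per_device_train_batch_size=16,
                         num_train_epochs=2, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=ds["train"]).train()
```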