Brute Force Works Best Against Bullying

Extracting Patterns of Harmful Expressions for Cyberbullying Detection

Cyberbullying, or humiliating and slandering people through the Internet, has recently been recognized as a serious social problem that harms the mental health of Internet users. In Japan, to deal with the problem, volunteer members of the Parent-Teacher Association (PTA) manually read through the Web to spot cyberbullying entries. To help PTA members in their uphill task, we propose a novel method for automatic detection of malicious content on the Internet. The method is based on a combinatorial approach resembling brute force search algorithms, applied to language classification. The method extracts sophisticated patterns from sentences and uses them in classification. We tested the method on actual data containing cyberbullying provided by the Human Rights Center. The results show that our method outperformed previous methods. It is also more efficient, as it requires minimal human effort.
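As a rough illustration of the combinatorial idea in the abstract above, the sketch below enumerates every ordered combination of a sentence's tokens up to a small length, marking skipped words with `*`. This is only an assumed reading of "brute force" pattern extraction, not the authors' published algorithm; the function names, the `*` gap marker, and the `max_len` cutoff are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def extract_patterns(tokens, max_len=3):
    """Return all ordered token combinations of up to max_len elements.

    Elements that were not adjacent in the sentence are separated by '*'
    (an assumed gap marker, not taken from the paper).
    """
    patterns = set()
    for n in range(1, min(max_len, len(tokens)) + 1):
        for idxs in combinations(range(len(tokens)), n):
            parts = [tokens[idxs[0]]]
            for prev, cur in zip(idxs, idxs[1:]):
                if cur - prev > 1:  # at least one token was skipped
                    parts.append("*")
                parts.append(tokens[cur])
            patterns.add(" ".join(parts))
    return patterns


def pattern_counts(sentences, max_len=3):
    """Count in how many sentences each pattern occurs."""
    counts = Counter()
    for sentence in sentences:
        counts.update(extract_patterns(sentence.lower().split(), max_len))
    return counts
```

Patterns that occur frequently in harmful sentences and rarely in harmless ones could then serve as classification features. The number of combinations grows quickly with sentence length, which is why a cutoff such as `max_len` is assumed here.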

Cyberbullying Detection using Natural Language Processing

International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022

Around the world, the use of the Internet and social media has increased exponentially, and they have become an integral part of daily life. They allow people to share their thoughts, feelings, and ideas with their loved ones. But with social networking sites becoming more popular, cyberbullying is on the rise. Using technology as a medium to bully someone is known as cyberbullying. The Internet can be a source of abusive and harmful content and can cause harm to others. Social networking sites provide a convenient medium for harassment and bullying, and youngsters who use these sites are vulnerable to attacks. Bullying can have long-term effects on adolescents' ability to socialize and build lasting friendships. Victims of cyberbullying often feel humiliated. Social media users can often hide their identity, which makes it easier to misuse the available features. The use of offensive language has become one of the most prominent issues on social networking sites. Offensive language is text containing any form of abusive conduct intended to hurt others. Cyberbullying frequently leads to serious mental and physical distress, particularly for women and children, and sometimes drives them to suicide. The purpose of this project is to develop an effective technique to detect and prevent cyberbullying on social networking sites using Natural Language Processing and machine learning algorithms. The dataset used for this project was collected from Kaggle; it contains Twitter data that was labeled to train the algorithms. Several classifiers are trained to recognize bullying actions. The evaluation of the proposed model on the cyberbullying dataset shows that Logistic Regression achieves better accuracy than the SVM, Random Forest, Naive Bayes, and XGBoost algorithms.

IRJET- INTERNATIONAL RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY (IRJET) Automated Detection of Cyberbullying Using Machine Learning

IRJET, 2020

Growing use of the Internet and easier access to online communities such as social media have led to the emergence of cybercrime. Cyberbullying is very common nowadays. Because it is not tracked, it may harm any individual, business, society, or country; in recent days, riots appear to have occurred because of statements made by one community about another. It is therefore important to identify content that spreads hate or harms a community. Text processing with NLP (natural language processing) is an emerging field, and with the help of NLP and machine learning algorithms such as Naive Bayes, Random Forest, and SVM we are going to identify cyberbullying on Twitter. The objectives of this implementation are given in the objectives section. We will use OCR to recognize characters in images in order to find image-based cyberbullying, and its impact on individuals will then be checked on a dummy system. Machine learning and natural language processing techniques are used to identify the characteristics of a cyberbullying exchange and to automatically detect cyberbullying by matching textual data to the identified traits. On the basis of our extensive literature review, we categorise existing approaches into 4 main classes, namely supervised learning, lexicon-based, rule-based, and mixed-initiative approaches. Supervised learning-based approaches typically use classifiers such as SVM and Naïve Bayes to develop predictive models for cyberbullying detection. Index Terms: cyberbullying, natural language processing, machine learning algorithms, social networking.

Text Mining Techniques for Cyberbullying Detection: State of the Art

Advances in Science, Technology and Engineering Systems Journal

The dramatic growth of social media in recent years has been associated with the emergence of new types of bullying. Platforms such as Facebook, Twitter, YouTube, and others are now privileged ways to disseminate all kinds of information. Indeed, communicating through social media without revealing one's real identity has created an ideal atmosphere for cyberbullying, where people can pour out their hatred. It has therefore become urgent to find automated methods to detect cyberbullying through text mining techniques. Many researchers have recently investigated various approaches, and the number of scientific studies on this topic is growing very rapidly. Nonetheless, the methods used to classify the phenomenon and the evaluation methods are still under discussion. Consequently, comparing results between studies and assessing their performance is still difficult. The current systematic review has therefore been conducted with the aim of surveying the studies conducted so far by the research community on cyberbullying classification based on text. In order to direct future studies on the topic toward a more consistent and compatible perspective on recent works, we undertook a deep review of the evaluation methods, features, dataset size, language, and dataset source of the latest research in this field. We chose to focus on techniques that adopted neural networks and machine learning algorithms. After conducting systematic searches and applying the inclusion criteria, 16 different studies were included. It was found that the best accuracy was achieved when a deep learning approach was used, particularly a CNN approach. It was also found that SVM is the most common classifier in both Arabic and Latin languages and outperformed the other classifiers. Also, the most widely used feature is the N-gram, especially bigrams and trigrams.
Furthermore, the results show that Twitter is the main source of the collected datasets, and that there are no unified datasets. There is also a shortage of studies on Arabic texts for cyberbullying identification in contrast with English texts.
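The N-gram features that the review above names as the most widely used can be sketched in a few lines. This is a minimal illustration assuming simple whitespace tokenization; the helper names are hypothetical and are not taken from any of the surveyed studies.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of contiguous n-token tuples in tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_trigram_features(text):
    """Count bigrams and trigrams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, 2) + ngrams(tokens, 3))
```

The resulting counts can be fed to a classifier such as an SVM as sparse features. Real systems usually add proper tokenization and frequency cutoffs, which are left out here.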

Using Machine Learning to Detect Cyberbullying

Cyberbullying is the use of technology as a medium to bully someone. Although it has been an issue for many years, the recognition of its impact on young people has recently increased. Social networking sites provide a fertile medium for bullies, and teens and young adults who use these sites are vulnerable to attacks. Through machine learning, we can detect language patterns used by bullies and their victims, and develop rules to automatically detect cyberbullying content.

Natural Language Processing and Naïve Bayes Classifier Algorithm to Automate the Detection of Cyberbullying

International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2023

The impact of social media on contemporary culture has been unprecedented, making it the most significant medium of our times. While it has had a positive effect on people's worldview, social media has also been linked to a rise in undesirable phenomena such as cyberbullying, cyberstalking, and cybercrime. Cyberbullying, in particular, can have a negative impact on individuals' mental health and has even been identified as the root cause of mental health issues in some cases. The proliferation of sexually explicit comments and the spread of rumors by multiple individuals are some of the negative influences that have been observed in the social media ecosystem. In recent years, academics have become increasingly concerned about the indicators of online harassment. Our goal is to develop a system that can detect instances of online abuse using Natural Language Processing (NLP) and Naïve Bayes, among other techniques. Cultural norms have shifted dramatically due to the rapid transmission of the COVID-19 virus, resulting in a rise in cyberbullying, especially among adolescents. The younger generation is more likely to engage in this practice, which has become more widespread with the stratospheric rise in popularity of various online engagement-promoting platforms. The COVID-19 pandemic has changed the way people interact online and has contributed to an increase in cyberbullying. As more people began working from home, bullying became a more significant concern. Our proposed system includes modules for data cleansing, text mining, word embedding, and regression analysis, among others. We use lemmatization for text mining, which enhances the model's precision. We also use VADER for feature extraction, which generates word vectors, that is, distributed numerical representations of word attributes. Additionally, Naïve Bayes is used for data categorization to prevent overfitting in the proposed model.
This would help in creating vectors that connect words with similar meanings.
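To make the Naïve Bayes categorization step mentioned above concrete, here is a small multinomial Naïve Bayes classifier with add-one (Laplace) smoothing, written in plain Python. It is a sketch under stated assumptions, not the system described in the abstract: lemmatization, VADER, and word embeddings are omitted, and the function names and the toy training sentences in the usage note are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect per-class document counts and per-class word counts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest smoothed log-probability."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            # add-one smoothing avoids zero probabilities
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

For example, after `model = train_nb(["you are stupid", "have a nice day"], ["bully", "ok"])`, calling `predict_nb(model, "you are stupid")` returns the label whose training words best match the input.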

Cyberbullying Detection - Technical Report 2/2018, Department of Computer Science AGH, University of Science and Technology

ArXiv, 2018

The research described in this paper concerns automatic cyberbullying detection in social media. There are two goals to achieve: building a gold standard cyberbullying detection dataset and measuring the performance of the Samurai cyberbullying detection system. The Formspring dataset provided in a Kaggle competition was re-annotated as a part of the research. The annotation procedure is described in detail and, unlike many other recent data annotation initiatives, does not use Mechanical Turk for finding people willing to perform the annotation. The new annotation seems to be more coherent than the old one, since all tested cyberbullying detection systems performed better on the former. The performance of the Samurai system is compared with 5 commercial systems and one well-known machine learning algorithm used for classifying textual content, namely Fasttext. It turns out that Samurai scores the best in all measures (accuracy, precision and recall), while Fasttext is the sec...
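The three measures named in the abstract above (accuracy, precision and recall) can be computed from predicted and gold labels as follows. This is a generic sketch of the standard binary definitions, not the evaluation code used in the report; the function name and the convention of returning 0.0 when a denominator is zero are assumptions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall
```

Because cyberbullying datasets are typically imbalanced, precision and recall on the bullying class are usually more informative than accuracy alone.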

Detection of Cyberbullying using Machine Learning

International Journal for Research in Applied Science and Engineering Technology IJRASET, 2020

Cyberbullying is a type of bullying in which technology is used as a medium to harass someone. With the recent boom of the web and social media platforms, the number of users is growing, and the primary users of social networks are mostly adolescents and young adults. As much as these platforms are used for getting new information and for entertainment, bullies increasingly use them to attack victims. Because of the rise in cyberbullying, there is a need to develop an appropriate method for the detection and prevention of cyberbullying. A growing body of work is emerging on automated approaches to cyberbullying detection. These approaches use machine learning and natural language processing techniques to identify the characteristics of a cyberbullying exchange and automatically detect cyberbullying by matching textual data. The primary goal of this project is to detect cyberbullying by combining both image and textual information. Test cases are used to characterize the dataset and identify the bullying. Machine learning techniques are used to efficiently predict and detect cyberbullying.

Cyber Bullying Text Detection Using Machine Learning

International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022

With the rise of the Internet, the use of social media has exploded, and it has emerged as the most powerful networking platform of the 21st century. However, excessive social networking often has negative consequences for society, contributing to unwanted phenomena such as online abuse, harassment, cyberbullying, cybercrime, and trolling. Cyberbullying causes severe mental and physical distress in many people, particularly women and children, and can even lead to suicide. The damaging social impact of online harassment is attracting attention. Several incidents of online harassment, such as sharing personal chats, spreading rumours, and making sexual remarks, have recently occurred all over the world. As a result, researchers are paying closer attention to detecting bullying texts or messages on social media. By combining natural language processing and machine learning, the aim of this study is to design and build an effective method for detecting abusive and bullying texts online. The accuracy of six different machine learning techniques is evaluated using different features, particularly the count vectorizer.
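The count vectorizer feature mentioned above turns each text into a vector of word counts over a shared vocabulary. The sketch below shows the idea in plain Python, assuming lowercase whitespace tokenization; it mirrors the concept behind scikit-learn's `CountVectorizer` but does not reproduce its tokenization rules, and the function names are hypothetical.

```python
from collections import Counter


def fit_vocabulary(docs):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def transform(docs, vocab):
    """Return one count vector per document, with columns ordered as in vocab."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        # dicts preserve insertion order, so this follows the column indices
        rows.append([counts.get(tok, 0) for tok in vocab])
    return rows
```

Each row can then be passed to any of the classifiers being compared. Words that were not seen when the vocabulary was fitted are simply dropped at transform time.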

IRJET- Prevention of Cyber Bullying Using Machine Learning Approach

IRJET, 2021

Online interaction among people happens mostly through social media. Recent developments in social media pose challenges to the research community, in particular the challenge of analyzing online interactions among people. There are several social networking sites where people can share their views on a particular topic. Recent research reveals that, on average, 20 to 40% of all teenagers have been victimized through online social networking sites. In this paper, we focus on one particular form of cyberbullying, namely cyber victimization. It can be prevented by adequate detection of harmful messages. Given the massive volume of information on the web, an intelligent system is needed to detect such cyberbullying and to identify potential risks automatically. In this paper, we present the construction and annotation of a corpus. The fine-grained annotations cover text categories, insults, and abusive words involved in cyberbullying. The dataset contains curse words and abusive words. Identification and notification are then performed. We present proof-of-concept experiments on automatic identification. The fine-grained annotations are used to identify categories of cyberbullying.