Hoax Detection System on Indonesian News Sites Based on Text Classification using SVM and SGD

Analysis and Detection of Hoax Contents in Indonesian News Based on Machine Learning

2019

Hoax news containing incorrect (false) information is now widely consumed on social media. This phenomenon raises doubts about information and causes confusion in the community. In this study, experiments were conducted to select the best algorithm for classifying hoax and non-hoax news, using 251 Indonesian-language articles (100 hoax and 151 non-hoax) with text mining and machine learning approaches. The text preprocessing phase consists of tokenizing, case folding, filtering, stopword removal, stemming, and TF-IDF weighting with combined unigram and bigram features before classification. The Random Forest algorithm achieved the best accuracy in classifying hoax and non-hoax news, 76.47%, compared with the Multilayer Perceptron, Naive Bayes, Support Vector Machine, and Decision Tree algorithms.
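The preprocessing and weighting steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the punctuation stripping, the `tf * log(N / df)` weighting variant, and the sample stopword are all assumptions, and stemming is left out because it needs an Indonesian stemmer.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def preprocess(text, stopwords=frozenset()):
    """Case folding, whitespace tokenizing, and stopword removal.
    Stemming is omitted; an Indonesian stemmer would be applied here."""
    tokens = [t.strip(".,!?;:\"'()") for t in text.lower().split()]
    return [t for t in tokens if t and t not in stopwords]

def tfidf(docs, stopwords=frozenset()):
    """Return one {feature: weight} dict per document.
    Features are unigrams and bigrams combined; weight = tf * log(N / df)."""
    counts = []
    for doc in docs:
        tokens = preprocess(doc, stopwords)
        counts.append(Counter(ngrams(tokens, 1) + ngrams(tokens, 2)))
    n_docs = len(docs)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each feature counted once per document
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append({f: (k / total) * math.log(n_docs / doc_freq[f])
                        for f, k in c.items()})
    return weights

# Illustrative input only; "ini" ("this") stands in for a real stopword list.
sample = tfidf(["Berita ini hoaks", "Berita ini benar"], stopwords={"ini"})
```

A feature that appears in every document, such as "berita" here, gets weight zero, while features unique to one document keep a positive weight. The resulting dicts would then be passed to whichever classifier is being compared.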

The Effect of Information Gain Feature Selection for Hoax Identification in Twitter Using Classification Method Support Vector Machine

2020

Twitter is now a popular medium for news dissemination. News items have elements by which their type can be distinguished; hoaxes, for example, carry elements of panic, worry, and anxiety that can have a significant impact on social, economic, educational, and political fields. Hoaxes need to be stopped as early as possible, before the news goes viral, which calls for a method that identifies and analyzes hoaxes. This research proposes a machine learning approach using a Support Vector Machine (SVM) supported by Information Gain (IG) feature selection, with Term Frequency–Inverse Document Frequency (TF-IDF) for word weighting. The system performs very well, increasing accuracy by 37.51% and reaching an accuracy of 96.55%.
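As a rough illustration of Information Gain feature selection, the sketch below scores each term by how much knowing its presence or absence reduces the entropy of the class labels, then keeps the top-scoring terms. The function names, the binary presence/absence formulation, and the toy data are assumptions; the abstract does not say how the authors computed or thresholded IG.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, labels, term):
    """IG of a term: H(class) minus H(class | term present or absent).
    docs is a list of token sets; labels is the parallel list of classes."""
    classes = sorted(set(labels))

    def class_counts(rows):
        return [sum(1 for label in rows if label == c) for c in classes]

    present = [l for d, l in zip(docs, labels) if term in d]
    absent = [l for d, l in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(present) / n) * entropy(class_counts(present))
                   + (len(absent) / n) * entropy(class_counts(absent)))
    return entropy(class_counts(labels)) - conditional

def select_top_k(docs, labels, k):
    """Return the k terms with the highest IG (ties broken alphabetically)."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]

# Toy data for illustration only.
toy_docs = [{"viral", "sebarkan"}, {"viral", "awas"},
            {"resmi", "laporan"}, {"resmi", "data"}]
toy_labels = ["hoax", "hoax", "real", "real"]
top_terms = select_top_k(toy_docs, toy_labels, 2)
```

In a pipeline like the one described, only the selected terms would be kept before TF-IDF weighting and SVM training.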

Hoax classification and sentiment analysis of Indonesian news using Naive Bayes optimization

TELKOMNIKA Telecommunication Computing Electronics and Control, 2020

The spread of hoax news has increased significantly, especially on social media networks. Hoax news is very dangerous and can provoke readers, so it requires special handling. This research proposes a hoax news detection system that uses searching, snippet, and cosine similarity methods to classify hoax news. This method is proposed because the searching method does not require training data, so it is practical to use and always up to date. In addition, one drawback of existing approaches is that they are not equipped with a sentiment analysis feature. In our system, sentiment analysis is carried out after hoax news is detected. The goal is to extract the true hidden sentiment inside the hoax, whether positive or negative. For sentiment analysis, the Naïve Bayes (NB) method was used, optimized with the Particle Swarm Optimization (PSO) method. In an experiment on 30 hoax news samples that were widely spread on social media networks, hoax news detection reached an average accuracy of 77%, with each news item correctly identified as a hoax with an accuracy between 66% and 91%. In addition, the proposed sentiment analysis method proved to have better performance than the previous sentiment analysis method.
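The cosine similarity step can be sketched as follows: a news text and each search-result snippet are turned into term-count vectors, and the snippet with the highest cosine score is taken as the closest match. This is only an assumed illustration; the function names and sample strings are hypothetical, the snippets would come from a search engine in practice, and the abstract does not state how similarity scores are turned into a hoax decision.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts after lowercasing and whitespace splitting."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(news, snippets):
    """Return (index, score) of the snippet most similar to the news text."""
    query = term_vector(news)
    scores = [cosine_similarity(query, term_vector(s)) for s in snippets]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical snippets standing in for search results.
index, score = best_match(
    "vaksin menyebabkan magnet",
    ["cuaca hari ini cerah", "klaim vaksin menyebabkan magnet tidak benar"],
)
```

Because no model is trained, the comparison can run against fresh search results each time, which matches the abstract's point that the approach needs no training data.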

A Survey on Fake News Detection using Support Vector Model

International Journal of Scientific Research in Science, Engineering and Technology, 2022

With the rapid growth of social media and easy Internet access for end users in many regions, the use of this technology has also brought many challenges. The spread of fake news in various fields has recently become a major one. Fake news is being spread in significant numbers for business as well as political reasons, and the problem has become frequent in the online world. People and their views are easily affected by this type of news because of its fabricated wording, and it has enormous effects on the offline community in various sectors. This makes it an interesting topic for research. Significant research has been conducted on detecting fake news in English texts and other languages, but there is room to improve the work for other languages as well as for multiple sources. Various algorithms, such as SVM and other supervised algorithms, can help classify fake news. As the sentiment score is also a major factor in detecting fake news, in our work we apply the SVM algorithm with TF/IDF, multiple languages (Language Conversation), etc.

Hoax Detection on Indonesian Tweets using Naïve Bayes Classifier with TF-IDF

Journal of Information System Research (JOSH)

Twitter is one of the most popular social media platforms in the world today. Indonesia has the fifth-largest number of Twitter users in the world, and they are active in expressing themselves and getting information through tweets. A hoax is a lie created as if it were true. Hoaxes are also often spread via tweets. The spread of hoaxes is extremely dangerous because it can cause social discord and even misunderstanding. Therefore, hoaxes must be resisted. This study aims to build a system to detect hoaxes in Indonesian tweets. The objective of this research is to identify Indonesian hoax tweets by using the Naïve Bayes classifier with Term Frequency Inverse Document Frequency (TF-IDF). This study collects and annotates tweets from hoax tweet posts sent by a user account. This study also applies several text preprocessing techniques to prepare the datasets. To provide the best hoax prediction model, this work splits the datasets into training and testing datasets. There are four experime...
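A Naïve Bayes classifier over weighted term features can be sketched as below. This is an assumed, simplified multinomial variant with Laplace smoothing, not the study's code: the function names and toy data are hypothetical, and the toy example uses raw counts as weights where TF-IDF values could be substituted.

```python
import math
from collections import defaultdict

def train_nb(feature_dicts, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with additive smoothing.
    feature_dicts holds {term: weight} per document (counts or TF-IDF)."""
    vocab = {t for feats in feature_dicts for t in feats}
    class_docs = defaultdict(int)
    term_mass = defaultdict(lambda: defaultdict(float))
    for feats, label in zip(feature_dicts, labels):
        class_docs[label] += 1
        for term, weight in feats.items():
            term_mass[label][term] += weight
    model = {}
    for label, n in class_docs.items():
        total = sum(term_mass[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n / len(labels)),
            "log_like": {t: math.log((term_mass[label].get(t, 0.0) + alpha) / total)
                         for t in vocab},
        }
    return model

def predict_nb(model, feats):
    """Return the class with the highest log posterior; unseen terms are skipped."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            w * params["log_like"][t]
            for t, w in feats.items() if t in params["log_like"])
    return max(sorted(model), key=score)

# Toy training data for illustration only.
toy_train = [{"gratis": 2, "sebarkan": 1}, {"sebarkan": 1, "viral": 1},
             {"laporan": 1, "resmi": 1}, {"resmi": 2, "data": 1}]
toy_labels = ["hoax", "hoax", "real", "real"]
toy_model = train_nb(toy_train, toy_labels)
```

Calling `predict_nb(toy_model, {"sebarkan": 1, "gratis": 1})` on this toy model returns `"hoax"`, since both terms carry more smoothed mass in the hoax class.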


Related papers