afrida helen - Academia.edu

Papers by afrida helen

Klasifikasi Sentimen Judul Berita Pemberitaan COVID-19 Tahun 2021 pada Media DetikHealth (Sentiment Classification of 2021 COVID-19 News Headlines on DetikHealth)

Journal of Information Engineering and Educational Technology

This study is applied linguistics research that combines linguistics and computational science and focuses on natural language processing (NLP). The phenomenon examined is the sentiment of COVID-19 news headlines published by the DetikHealth media outlet during 2021, so the study aims to classify the sentiment of those headlines. Data were collected using the filter feature provided by the outlet, and data analysis was carried out in two main stages: text preprocessing and sentiment classification. The algorithm implemented is MultinomialNB, a variant of the naïve Bayes classifier. The study obtained a sentiment prediction accuracy of 72.5%. In addition, experiments that skipped one or all of the data preprocessing stages affected the classifier's accuracy. The decrease in accuracy most ...
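
As a rough illustration of the two stages named in this abstract (text preprocessing, then multinomial naïve Bayes classification), a minimal sketch in plain Python follows. It is not the authors' implementation: the lowercase-and-split preprocessing, the Laplace smoothing parameter `alpha`, the function names, and the toy English headlines are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model on whitespace-tokenized text.

    Preprocessing here is only lowercasing and splitting (an assumption);
    alpha is the Laplace smoothing constant.
    """
    class_counts = Counter(labels)          # documents per class
    word_counts = defaultdict(Counter)      # token counts per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(docs),
    }


def predict(model, text):
    """Return the class with the highest log posterior for the given text."""
    # Tokens never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    best_label, best_score = None, -math.inf
    for label, n in model["class_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + model["alpha"] * len(model["vocab"])
        score = math.log(n / model["n_docs"])  # log prior
        for t in tokens:
            score += math.log((counts[t] + model["alpha"]) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy headlines, invented for illustration only.
toy_docs = [
    "vaccine rollout brings hope",
    "recovery rate improves",
    "cases surge again",
    "death toll rises",
]
toy_labels = ["positive", "positive", "negative", "negative"]
toy_model = train_multinomial_nb(toy_docs, toy_labels)
print(predict(toy_model, "hope as recovery improves"))
```

Skipping or changing the preprocessing step in `train_multinomial_nb` changes the vocabulary and counts, which is one way the accuracy effect described in the abstract could arise.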

Naïve bayes and maximum entropy comparison for translated novel’s genre classification

Journal of Physics: Conference Series, 2021

Over the last two decades, translated novels have become popular products in the literary community. Readers tend to favor certain genres depending on their age. A reader usually has to finish a novel before they can determine its genre. In some cases the genre given in the description differs from the novel’s actual content, which upsets readers and spoils the reading experience. This research classifies novel genres automatically. Naïve Bayes is the method chosen for classification, and its results are then compared with those of another algorithm, Maximum Entropy. Each method labels the data according to existing classes. The data were taken from 12 translated novels comprising 3746 lines. The data were divided into three genre classes: “Action-Fantasy” with about 1293 lines, “Modern-Slice-of-Life” with 1203 lines,...

Aspect and Opinion Word Extraction on Opinion Sentences in Bahasa Indonesia using Rule Based Generated from Regular Expression

2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE)

Extracting aspects and opinion words is important in aspect-based sentiment analysis (ABSA). Generally, aspect extraction is conducted first, and the paired opinion words are then determined from the aspects. This two-step process allows errors to occur. This study proposes an approach for obtaining the appropriate pairs of aspects and opinion words in opinion sentences using a rule-based method generated from regular expressions. The strength of this approach is that it obtains pairs of aspects and opinion words simultaneously. The approach comprises three steps: (1) selection of aspect-based opinion sentences in user reviews, (2) extraction of candidate aspects and opinion words in aspect-based opinion sentences, and (3) aspect categorization. Valid aspects are determined by aspect extraction. Evaluation of aspect categorization shows a precision of 0.82, a recall of 0.70, and an F-measure of 0.75.

Integrating Psychophysiology within Clinical Practice: A Pilot Cross-sectional Study on Prodromal Symptoms of Schizophrenia, Emotion Regulation, and Personality Functioning

Clinical Neuropsychiatry, 2021

Objective: To investigate the association between prodromal symptoms of schizophrenia, autonomic activity, and personality functioning. Method: 10 adolescents underwent semi-structured interviews assessing prodromal symptoms of schizophrenia and personality functioning. Cardiac activity was recorded at baseline, during the clinical interviews, and at recovery to assess concurrent changes in autonomic functioning. Results: During the assessment of prodromal symptoms of schizophrenia, participants showed increased sympathetic activation compared to the recovery condition, and reduced vagal activation compared to the assessment of interpersonal functioning. Conclusions: The findings highlight the importance of integrating autonomic assessment into clinical psychiatric and psychological practice.

Comparison of Adolescent Vaccination Data Accuracy by Urban Village in DKI Jakarta Province in July 2021 Using Several Data Mining Methods

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

The Coronavirus disease outbreak reached Indonesia in 2020. For a year and a half, various efforts have been made to reduce the number of deaths caused by the pandemic. One of the government's efforts is providing vaccinations for the community, especially for adolescents. This is one way to encourage people to get vaccinated, and it also makes it easier for the government and the system to process vaccination data, especially for youth vaccination. The purpose of this study is to determine the accuracy of the data on adolescents vaccinated in DKI Jakarta province in July 2021 using several data mining methods. Of the three data mining methods used in this study, the JRip method produces the highest accuracy, 100%.

The Effect of Educational Background on High Jobs and Income

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

People today tend to assume that educational background is a strong enough consideration for job opportunities, placement, and income. Most companies likewise assume that a good educational background automatically yields the best results for company performance. This rests on the assumption that worker efficiency is determined by educational background, which in turn boosts the profit the company gains from its human resources. However, measuring how strongly educational background influences jobs and income requires a test that can map the results accurately as percentages. Machine learning is an approach within AI (Artificial Intelligence) that is often used to help solve human problems or perform automation. Machine learning requires data to learn from and then classifies the data much as humans distinguish objects. The test above is therefore carried out with a machine learning tool, namely KNIME. Three algorithms available in KNIME can address the problem: Decision Tree, Logistic Regression, and Random Forest. The algorithm with the highest accuracy is Random Forest, at 84.9%. The authors read this accuracy as showing that the effect of educational background on employment and high income is 84.9%, and conclude that the effect is very high. However, high employment and income are not determined by educational background alone; they are still influenced 25.1% by other factors, such as productivity and work experience.

Twitter’s Hate Speech Multi-label Classification Using Bidirectional Long Short-term Memory (BiLSTM) Method

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Because social media is among the most popular products of technology, people find it easier to express their opinions. Anyone can voice an opinion freely there. Unfortunately, this convenience has also backfired: the easier it is to convey an opinion, the easier it is to express hate speech. This has become the dark side of social media. Hate speech exposes us to many dangers, such as violence, social conflict, and even homicide. Preventing the dangers that hate speech may cause is therefore a priority. This research attempts to address those dangers by using multi-label text classification with the Bidirectional Long Short-term Memory (BiLSTM) method to predict hate speech. The classifier labelled every tweet in a dataset crawled from Twitter with 12 hate speech labels. From this experiment, we obtained the best hyperparameter values, which achieved 82.31% accuracy, 83.41% precision, 87.28% recall, and an 85.30% F1-score.
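
The F1-score reported here is the harmonic mean of precision and recall, so the three figures can be checked against each other. The short sketch below does that arithmetic; the function name `f1_score` is chosen for this example and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.8341, 0.8728)
print(f"{f1:.4f}")  # prints 0.8530
```

The result agrees with the reported 85.30%, so the stated precision, recall, and F1-score are mutually consistent.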

Search System for Translation of Al-Qur'an Verses in Indonesian using BM25 and Semantic Query Expansion

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Al-Qur'an is a source of life guidance for Muslims. A digital version of the Qur'an is already available on the Android platform. The Indonesian Ministry of Religion provides an official application, called the Ministry of Religion's Qur'an, which includes a translation search system. However, adjustments are needed so that Indonesian users can find verse translations ranked in a definite order by their relevance to the keywords entered. This study aims to develop a search system for the Indonesian-language translations of Qur'an verses that already exist in the Ministry of Religion's Qur'an application. The study uses the BM25 algorithm with Word2vec as the Semantic Query Expansion (SQE) method. A total of 6236 translated documents from the application are used to build the search system model. Hyperparameter tests are carried out to find the optimal model. The resulting SQE hyperparameter values include a window of 7 and a query expansion term count of 1. For BM25, the optimal condition is obtained when k1 is 1.8 and b is 0.85. The search system was evaluated using Mean Average Precision (MAP) and compared with the search system previously available in the Ministry of Religion's Qur'an. The MAP score increased with the proposed method, from 0.53718 to 0.66556.
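
To make the roles of k1 and b concrete, here is a minimal sketch of BM25 scoring in plain Python, using the k1 = 1.8 and b = 0.85 values reported above as defaults. Several things are assumptions for illustration: the whitespace tokenization, the particular IDF formula (a common non-negative variant; the paper may use another), the function name, and the placeholder documents. The Word2vec-based query expansion step is not shown.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.8, b=0.85):
    """Score each document against the query with Okapi BM25.

    k1 controls term-frequency saturation; b controls how strongly
    scores are normalized by document length.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Assumed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Placeholder documents, unrelated to the paper's data.
sample_docs = [
    "water quality report",
    "news about water and rain",
    "rain rain and more rain",
]
sample_scores = bm25_scores("rain", sample_docs)
ranking = sorted(range(len(sample_docs)), key=lambda i: sample_scores[i], reverse=True)
print(ranking)
```

Sorting documents by these scores gives the relevance-ordered result list that the abstract says the original application lacked.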

Classification of Water Potability Using Machine Learning Algorithms

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Clean water is one of the basic needs of everyday life. Recently, ongoing processes have been shown to affect water quality, making water less suitable for use. To address this problem, the research uses machine learning models. The Decision Tree and Naïve Bayes algorithms are used to support the assessment of drinking water quality, and the performance of the two is compared in this work. K-fold cross-validation is used to evaluate the machine learning models. The decision tree algorithm gives the best results, with an accuracy of 97.23%.
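
The K-fold evaluation mentioned in this abstract splits the data into k parts and uses each part once as the test set while training on the rest. The sketch below shows only the index bookkeeping, under the assumption of contiguous, unshuffled folds; the function name is hypothetical, and the abstract does not state the value of k or whether the data were shuffled.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k folds.

    Folds are contiguous and unshuffled (an assumption); when n_samples is
    not divisible by k, the first folds receive one extra sample each.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


for train_idx, test_idx in k_fold_indices(10, 3):
    print(len(train_idx), test_idx)
```

A model would be trained and scored once per fold, and the reported accuracy is typically the average over the folds.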

Preprocessing Application for Car Insurance Claim Classification Model

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Using data has become a necessity for prediction and decision-making in various industry sectors; one example is car insurance data on claims made by vehicle owners. Insurance companies can later analyze such data in relation to car insurance claims, from the perspective of both the car owner and the condition of the car. This paper discusses preprocessing of auto insurance claim data from America, with the aim of building a data model that can then be used to assess the accuracy of classification on the processed data. The results of data processing and classification can help car insurance companies decide on policies or problems accurately and measurably. The data used in this paper still contain missing values. Data cleaning is therefore performed by cleaning, filtering, and combining the data. The study used the Car Insurance Claim Data dataset downloaded from Kaggle's website. The results showed that the JRIP algorithm had the best accuracy: 83.09 percent before preprocessing the dataset and 83.14 percent after preprocessing was applied, in the 10-fold cross-validation test mode. With the increased accuracy, the data can be put to better use, for example for forecasting or as a reference for the company when acting on matters related to the data.

Comparative Study of J48 Decision Tree Classification Algorithm, Random Tree, and Random Forest on In-Vehicle CouponRecommendation Data

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Coupons are one of the media used to increase sales and invite customers to repurchase products. The effectiveness of coupon distribution, especially of coupons for restaurants and bars, can be investigated by collecting data through an in-vehicle survey. The data can then be analyzed using classification techniques from data mining. This paper presents a classification of the in-vehicle coupon recommendation problem to determine the coupon acceptance decision using the J48, Random Tree, and Random Forest decision tree classification algorithms. The dataset consists of 23 attributes, including the class attribute Y, which indicates whether the customer accepted the coupon. The performance of the three algorithms is evaluated to determine the best classification algorithm by looking at accuracy, time to build the model, and other variables that appear in the classification experiment. The results reveal that the Random Tree algorithm takes the least time (0.28 seconds) and has the lowest accuracy (67.38%). The J48 algorithm is more accurate than Random Tree (72.79%) but takes significantly longer (0.36 seconds). The Random Forest technique has the best accuracy (77.0%), but its model creation time is substantially longer than that of the Random Tree and J48 algorithms (10.89 seconds).

The Safety of Carfilzomib Therapy in Patients on Long-Term Hemodialysis and Multiple Myeloma

Saudi Journal of Kidney Diseases and Transplantation, 2021

Mining of Community Participatory in the Government of Surabaya Berlian

Diskominfo Surabaya, as a government agency, has received a growing volume of community participation input for improving governmental services, with 698, 2717, 4176 and 4298 participation records in 2011, 2012, 2013 and 2014, respectively. Diskominfo Surabaya has set itself the challenging target of responding within 24 hours. Because assessing the degree of participation and categorizing the participation groups is a complex task, the agency has had difficulty meeting the target. In this research, we present a new system for measuring the sentiment degree of community participation. The system provides 5 functions: (1) Data Collection, (2) Data Preprocessing, (3) Text Mining, (4) Sentiment Analysis and (5) Validation. We propose a rule-based technique for sentiment analysis in opinion mining that detects 8 important parts: (1) Verb, (2) Adjective, (3) Preposition, (4) Noun, (5) Adverb, (6) Symbol, (7) Phrase, and (8) Complimentary. For a...

Indonesian news auto summarization in infrastructure development topic using 5W+1H consideration

2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), 2017

With an average reading speed of 200–500 words per minute, a person takes at least 2 to 3 minutes to read and understand one news article in online media. An online outlet can publish many news updates within a few minutes, and reading the full contents of all of them is time-consuming. Reading a summary that represents the main idea of the news can save time. This study considers the 5W + 1H elements when generating news summaries because these elements are important in a news article. A single news article is retrieved from an online media page by a scanning and grabbing process, then sanitized, segmented, and tokenized to break it into sentences and words. Each sentence is classified, in a multi-label fashion, as containing one or more 5W + 1H elements (What, Who, Where, When, Why and/or How) or none, using previously built training data. Sentences containing 5W + 1H elements are selected as summary sentences. Testing of summary results shows the average precision...

Hybrid Analytic Hierarchy Process Algoritma Genetika Untuk Permasalahan Pengambilan Keputusan Melibatkan Kelompok Pengambil Keputusan (Hybrid Analytic Hierarchy Process and Genetic Algorithm for Decision-Making Problems Involving a Group of Decision Makers)

The Analytic Hierarchy Process (AHP) is a decision-making model for problems with many alternatives and many criteria. The decision is made by a person regarded as an expert, who supplies perceived values as weights in a pairwise comparison matrix. An expert here is someone who truly understands the problem posed, experiences its consequences, or has a stake in it. The output of AHP is the best value of the weights attached to each available alternative. In a decision-making problem that involves a group of experts or decision makers, the best weights must be determined, in the sense that they yield the best decision. To produce the best weights, a Genetic Algorithm (GA) is applied to the combination of weight values given by each decision maker...

Assessment for the renewal and management of stormwater drainage facilities in residential areas of Enugu city, Nigeria

The objectives of this paper are to assess the quality of artificial drainage facilities in residential premises of Enugu city and to proffer appropriate renewal strategies for managing them. An appraisal technique that employs a checklist of seven simple classifications of specified qualities is used to achieve this aim. In this approach, percentage penalty points (Pp) are assigned to drainage facilities with observable defects. A total of 366 drains in 20 residential areas of the city are appraised. The poorest drainage condition is observed in the Ogui Urban area, with the highest penalty of 88%, followed by Abakpa with an 87.5% penalty. Only three layouts (Ekulu, G.R.A., and Independence Layout) have less than 25% penalty points, which is rated as good drainage condition. None of the residential areas in the city has excellent drainage facilities, as none scored less than a 10% penalty. A more integrative legislation on urban land use planning and management to protect the dra...

Performansi Neuro Fuzzy untuk Peramalan Data Time Series (Neuro-Fuzzy Performance for Time Series Data Forecasting)

ANFIS (Adaptive Neuro Fuzzy Inference System) is a neural network method that is functionally equivalent to a fuzzy inference system. In ANFIS, the neural network's learning process over a set of data pairs serves to update the parameters of the fuzzy inference system. The ANFIS method uses the error backpropagation algorithm, which has several advantages, both in terms of convergence and in terms of its local minima, which are highly sensitive to updates of the ANFIS parameters. The method is applied to time series forecasting for 4 types of data: stationary (sunspot data), random (stock data), non-stationary (airline), and seasonal (electrical load). Training with ANFIS yields perfect results, with the training error able to reach 0 (zero). ANFIS gives very good results for forecasting stock data, with an MSE of 2.27 at time lag 320. The forecasting results for the sunspot data and the electrical load data have results ...

Emotional Context Detection on Conversation Text with Deep Learning Method Using Long Short-Term Memory and Attention Networks

2021 9th International Conference on Information and Communication Technology (ICoICT), 2021

Text conversation is an interesting research topic in Natural Language Processing. One text conversation task is identifying the emotions of the people involved in the conversation. Conversations on social media and messaging channels such as Twitter, Instagram, short message service, and WhatsApp often involve emotion. Comments are sometimes impulsive sentences that can stimulate emotions. Expressing emotions through text is rarely done and can feel uncomfortable. However, with the development of natural language technology, emotions can be expressed in text using specific symbols, called emojis. Many emojis can express emotions. This research proposes using the emoji symbol as a character feature. We introduce the Emoji2Vec method and Long Short-term Memory with Attention. The Attention used has a complex topology. We compared the results of this study with a baseline model, and the proposed method performs better than the baseline.

Semantic Information Retrival for Scientific Experimental Papers with Knowlege based Feature Extraction

INOVTEK Polbeng - Seri Informatika, 2019

Abstract: Demand for information retrieval over scientific papers has grown over time. Researchers have difficulty searching for information in experimental scientific papers because retrieval engines rely on text-mining-based feature extraction over the entire text, whereas experimental papers have specific contents that call for different treatment in feature extraction. In this paper, we propose a new system for information retrieval on experimental scientific papers. The system consists of 4 main functions: (1) specific content-based feature extraction, (2) classification model, (3) context-based subspace selection, and (4) context-dependent similarity measurement. In feature extraction, the system extracts feature categories from experimental scientific papers using specific content-based features: data, problem, method, and result. In context-based subspace selection, the system reduces dimensionality by selecting the subspace for the context chosen by the user. To obtain the final search results, we measure context similarity by building a context-based metric between the dataset and the paper. To demonstrate the applicability of the proposed system, we tested 77 papers in the dataset with Leave-One-Out validation and several classification algorithms (Nearest Neighbour, Naive Bayes, Support Vector Machine, and Decision Tree), which achieved on average a precision of 66.65% and an accuracy of 76.18%. We also ran a similarity measurement experiment, supplying a query paper and the desired content (data, result, method, and problem) as the user-given context; the proposed system achieved an accuracy of 79.17%.

Keywords: Scientific experimental paper, Context-based subspace selection, Context-dependent similarity measurement.

Penentuan Aspek Implisit dengan Ekstraksi Knowledge Berbasis Rule pada Ulasan Bahasa Indonesia (Determination of Implicit Aspects with Rule Based Knowledge Extraction in Indonesian Reviews)

Jurnal Nasional Teknik Elektro dan Teknologi Informasi, 2020

Determining implicit aspects is an important task in analyzing opinion sentences. This study proposes a new approach for developing rule-based knowledge by forming relations between opinion words and aspect categories. The relations are obtained from a combination of rules based on Opinion Word Similarity (OWS). Rule-based knowledge extraction is evaluated by finding the frequency and confidence threshold values that produce the best precision, recall, and f-measure. The knowledge extraction consists of two phases: a training phase and a testing phase. The training phase extracts the rule-based knowledge. The testing phase obtains the implicit aspects of opinion sentences by referring to that rule-based knowledge. To extract rule-based knowledge from user reviews, it is necessary to identify opinion sentences with explicit aspects and get pairs of aspects and opinion words with rules generated from regular expr...

Research paper thumbnail of Klasifikasi Sentimen Judul Berita Pemberitaan COVID-19 Tahun 2021 pada Media DetikHealth

Journal of Information Engineering and Educational Technology

Penelitian ini merupakan penelitian linguistik terapan yang mengombinasikan linguistik dan ilmu k... more Penelitian ini merupakan penelitian linguistik terapan yang mengombinasikan linguistik dan ilmu komputasi dan berfokus di bidang natural language processing (NLP). Fenomena yang dikaji adalah klasifikasi sentimen pada judul berita pemberitaan COVID-19 di media DetikHealth selama tahun 2021 sehingga orientasi penelitian ini adalah mengklasifikasikan sentimen pada fenomena tersebut. Pengumpulan data dilaksanakan dengan memanfaatkan fitur saring yang disediakan media tersebut dan analisis data dilakukan dalam dua tahap besar, yaitu text preprocessing dan klasifikasi sentimen. Algoritma yang diimplementasikan dalam penelitian ini adalah algoritma MultinomialNB yang merupakan bagian dari naïve bayes classifier. Hasil dari penelitian ini adalah diperolehnya tingkat akurasi prediksi sentimen sebesar 72.5%. Selain itu, uji coba dengan tanpa melakukan salah satu atau keseluruhan tahapan preprocessing data memberikan dampak terhadap tingkat akurasi mesin. Penurunan tingkat akurasi paling meno...

Research paper thumbnail of Naïve bayes and maximum entropy comparison for translated novel’s genre classification

Journal of Physics: Conference Series, 2021

In the last two decades, novel translation had become one of the popular products among the liter... more In the last two decades, novel translation had become one of the popular products among the literature community. People had favorited some genre based on their ages. The reader needs to finish reading until the end first before they could determine what genre a novel should have. There were some cases where the genre written in the description differs from the real novel’s content, which made readers felt upset and had not pleasant reading experience. This research is going to do classification for the novel’s genre automatically. Naïve Bayes is the method chosen for classification, later the result of Naïve Bayes classification is going to be compared with another algorithm, which is Maximum Entropy algorithm. Each method would apply algorithms to label the data based on an existing class. The data origin was taken from 12 translated novel that has 3746 lines. Data was portioned into three genre classes, “Action-Fantasy” for about 1293 lines, “Modern-Slice-of-Life” for 1203 lines,...

Research paper thumbnail of Aspect and Opinion Word Extraction on Opinion Sentences in Bahasa Indonesia using Rule Based Generated from Regular Expression

2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE)

Extracting aspects and opinion words is important in ABSA study. Generally, aspect extraction was... more Extracting aspects and opinion words is important in ABSA study. Generally, aspect extraction was conducted first then determining a pair of opinion words from the aspects. The process allows errors to occur. This study proposes an approach to obtain the appropriate pair of aspect and opinion words in opinion sentences using rule-based method generated from regular expressions. The strength of this process is that it enables to obtain a pair of aspects and opinion words simultaneously. Three methods are developed this approach (1) selection of aspect-based opinion sentences in user reviews, (2) extraction of candidate aspects and opinion words in aspect-based opinion sentences and (3) aspect categorization. This study enables to extract pairs of aspects and opinion words using rule-based aspects generated from regular expression. Determining valid aspects uses aspect extraction. Evaluation of aspect categorization shows the value of precision of 0.82, recall of 0.70 and f-measure of 0.75.

Integrating Psychophysiology within Clinical Practice: A Pilot Cross-sectional Study on Prodromal Symptoms of Schizophrenia, Emotion Regulation, and Personality Functioning

Clinical Neuropsychiatry, 2021

Objective: To investigate the association between prodromal symptoms of schizophrenia, autonomic activity, and personality functioning. Method: 10 adolescents underwent semi-structured interviews assessing prodromal symptoms of schizophrenia and personality functioning. Cardiac activity was recorded at baseline, during the clinical interviews, and at recovery to assess concurrent changes in autonomic functioning. Results: During the assessment of prodromal symptoms of schizophrenia, participants showed increased sympathetic activation compared to the recovery condition, and reduced vagal activation compared to the assessment of interpersonal functioning. Conclusions: The findings highlight the importance of integrating autonomic assessment into clinical psychiatric and psychological practice.

Comparison of Adolescent Vaccination Data Accuracy by Urban Village in DKI Jakarta Province in July 2021 Using Several Data Mining Methods

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

In 2020, the Coronavirus disease outbreak began to enter Indonesia. For a year and a half, various efforts have been made to reduce the number of deaths caused by the pandemic. One of the government's efforts is providing vaccinations for the community, especially for adolescents. This is one way to attract people's interest in vaccination and also to make it easier for the government and the system to process vaccination data, especially for youth vaccination. The purpose of this study is to determine the accuracy of the data on adolescents who were vaccinated in DKI Jakarta province in July 2021 using several data mining methods. Of the three data mining methods used in this study, the JRip method produces the highest accuracy, which is 100%.

The Effect of Educational Background on High Jobs and Income

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

In this era, people tend to assume that educational background is a strong enough consideration for job opportunities, placements, and income. In fact, most companies nowadays also perceive that a good educational background will automatically bring out the maximum results for the company's performance. This rests on the assumption that worker efficiency is determined by that background, which in turn boosts the company's profit from human resources. However, measuring the level of influence of educational background on jobs and income requires a test that can map the percentage results accurately. Machine learning is an approach within AI (Artificial Intelligence) that is often used to help humans solve problems or to perform automation. Machine learning requires data to learn from, and then classifies the data based on how humans distinguish objects. Therefore, the test above is carried out with a machine learning tool, namely KNIME. Three algorithms available in KNIME can address the problem: Decision Tree, Logistic Regression, and Random Forest. The algorithm with the highest accuracy is Random Forest, at 84.9%. The accuracy result is taken to show that the effect of educational background on employment and high income is 84.9%, so the effect of educational background is indeed very high. However, it is not the only determinant of high employment and income, which are still influenced 25.1% by other factors such as productivity, work experience, etc.

Twitter’s Hate Speech Multi-label Classification Using Bidirectional Long Short-term Memory (BiLSTM) Method

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Since social media is one of the most popular products of technology, people find it easier to express their opinions. Anyone is able to voice an opinion freely there. Unfortunately, this convenience has also become a boomerang: the easier it is to convey any opinion, the easier it is to express hate speech. This has become the dark side of social media. Hate speech exposes us to many dangers, such as violence, social conflict, and even homicide. Therefore, preventing the dangers that might occur because of hate speech is one of the first things we need to do. This research was done as an attempt to address the dangers that hate speech could cause. Our attempt uses multi-label text classification to predict hate speech with the Bidirectional Long Short-term Memory (BiLSTM) method. The multi-label text classification labelled every tweet in a dataset crawled from Twitter with 12 labels about hate speech. From this experiment, we obtained the best hyperparameter values, which achieved 82.31% accuracy, 83.41% precision, 87.28% recall, and an 85.30% F1-score.
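The reported metrics can be checked for internal consistency with a short sketch. It assumes the F1-score is the harmonic mean of the reported overall precision and recall; the abstract does not say how scores were averaged across the 12 labels, so this is an illustration rather than the authors' evaluation procedure. The function name `f1_score` is hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(83.41, 87.28)
print(f"F1 = {f1:.2f}%")
```

Under that assumption the result agrees with the reported 85.30% to two decimal places.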

Search System for Translation of Al-Qur'an Verses in Indonesian using BM25 and Semantic Query Expansion

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Al-Qur'an is a source of life guidance for Muslims. A digital version of the Qur'an is already available on the Android platform. The Indonesian Ministry of Religion provides an official application called the Ministry of Religion's Qur'an, which includes a translation search system. However, adjustments are needed so that Indonesian users can search for verse translations and receive results ranked by their relevance to the keywords entered. This study aims to develop a search system for the Indonesian-language translations of Qur'an verses that already exist in the Ministry of Religion's Qur'an application. The study uses the BM25 algorithm with Word2vec as the Semantic Query Expansion (SQE) method. The 6236 translated documents in the application are used to build the search system's model. Hyperparameter tests are carried out to find the best-performing model. The resulting SQE hyperparameter values include a window of 7 and 1 query expansion term. For BM25, the best results are obtained when k1 is 1.8 and b is 0.85. The search system was evaluated using Mean Average Precision (MAP) and compared with the search system previously available in the Ministry of Religion's Qur'an. The MAP score increased with the proposed method, from 0.53718 to 0.66556.
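To illustrate the ranking step, here is a minimal BM25 sketch using the reported k1 = 1.8 and b = 0.85. Several parts are assumptions: the abstract does not state which IDF form was used, so a common smoothed, non-negative variant is shown; the toy documents are hypothetical English phrases, not data from the application; and the Word2vec query expansion step is omitted.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.8, b=0.85):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant (an assumption, not confirmed by the paper).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturation (k1) and length normalization (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus for illustration only.
docs = [
    "patience and prayer".split(),
    "charity for the poor".split(),
    "prayer at night and prayer at dawn".split(),
]
scores = bm25_scores("prayer".split(), docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

A larger b penalizes long documents more strongly, while a larger k1 lets repeated occurrences of a query term keep adding to the score for longer before saturating.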

Classification of Water Potability Using Machine Learning Algorithms

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Clean water is one of the basic needs of everyday life. Recently, an ongoing process has been shown to affect water quality, making water less suitable for use. To address this problem, this research uses machine learning models. The Decision Tree and Naïve Bayes algorithms are used to support the assessment of drinking water quality, and the performance of the two is compared in this work. K-fold cross-validation is used to evaluate the machine learning models. The decision tree algorithm gives the best results in the tested configuration, with an accuracy of 97.23%.

Preprocessing Application for Car Insurance Claim Classification Model

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

The use of data has become a necessity for prediction and decision making in various industry sectors; one example is car insurance data on claims made by vehicle owners. Insurance companies can later analyze these data with respect to both the car owner and the condition of the car. This paper discusses data preprocessing for auto insurance claim data from America, with the aim of building a data model that can then be used to assess the classification accuracy on the processed data. The results of data processing and classification can help car insurance companies decide on a policy or address a problem accurately and measurably. The data used in this paper still contain missing values. Therefore, the data are cleaned, filtered, and combined. The study used the Car Insurance Claim Data dataset downloaded from Kaggle's website. The results showed that the JRIP algorithm had the best accuracy, 83.09 percent before preprocessing and 83.14 percent after preprocessing, in the 10-fold cross-validation test mode. With the increased accuracy, the data can be better used as an example for forecasting or as a reference for companies acting on the data.

Comparative Study of J48 Decision Tree Classification Algorithm, Random Tree, and Random Forest on In-Vehicle Coupon Recommendation Data

2021 International Conference on Artificial Intelligence and Big Data Analytics, 2021

Coupons are one of the media used to increase sales and invite customers to repurchase products. A study of the effectiveness of coupon distribution, especially coupons for restaurants and bars, can be carried out by collecting data through an in-vehicle survey. The data can then be analyzed using classification techniques in data mining. This paper presents a classification of the in-vehicle coupon recommendation problem to determine the coupon acceptance decision using the J48, Random Tree, and Random Forest decision tree classification algorithms. The dataset consists of 23 attributes, including the class attribute Y, which indicates whether the customer accepted the coupon. The performance of the three algorithms is evaluated to determine the best classification algorithm by looking at accuracy, time to build the model, and other variables that appear in the classification experiment. The results reveal that the Random Tree algorithm takes the least time (0.28 seconds) and has the lowest accuracy (67.38%). The J48 algorithm is more accurate than Random Tree (72.79%) but takes longer (0.36 seconds). The Random Forest technique has the best accuracy (77.0%), but its model-building time is substantially longer than that of Random Tree and J48 (10.89 seconds).

The Safety of Carfilzomib Therapy in Patients on Long-Term Hemodialysis and Multiple Myeloma

Saudi Journal of Kidney Diseases and Transplantation, 2021

Mining of Community Participatory in the Government of Surabaya Berlian

Diskominfo Surabaya, as a government agency, has received a large amount of community participatory data for the improvement of governmental services, growing to 698, 2717, 4176 and 4298 entries in 2011, 2012, 2013 and 2014 respectively. Diskominfo Surabaya has set itself the challenging target of responding within 24 hours. Due to the complexity of assessing the degree of participation and categorizing it into groups, the agency has had difficulty meeting the target. In this research, we present a new system for measuring the sentiment degree of community participatory data. Our system provides 5 functions: (1) Data Collection, (2) Data Preprocessing, (3) Text Mining, (4) Sentiment Analysis and (5) Validation. We propose a rule-based technique for sentiment analysis in opinion mining that detects 8 important parts: (1) Verb, (2) Adjective, (3) Preposition, (4) Noun, (5) Adverb, (6) Symbol, (7) Phrase, and (8) Complimentary. For a...

Indonesian news auto summarization in infrastructure development topic using 5W+1H consideration

2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), 2017

With an average reading speed of 200–500 words per minute, a person needs at least 2 to 3 minutes to read and understand one news article in online media. An online media outlet can publish many news updates within a few minutes, and reading the full contents of all of them is time-consuming. Reading a summary that represents the main idea of the news can save time. This study considers the 5W + 1H elements in generating news summaries because these elements are important in a news article. A single news article is retrieved from an online media page through a scanning and grabbing process, then sanitized, segmented, and tokenized to break the news into sentences and words. Each sentence is classified with multiple labels according to whether it contains 5W + 1H elements (What, Who, Where, When, Why and/or How) or none of them, using training data that has been built. Sentences containing 5W + 1H elements are selected as summary sentences. Testing of summary results shows the average precision...

Hybrid Analytic Hierarchy Process Algoritma Genetika Untuk Permasalahan Pengambilan Keputusan Melibatkan Kelompok Pengambil Keputusan (Hybrid Analytic Hierarchy Process and Genetic Algorithm for Decision-Making Problems Involving a Group of Decision Makers)

Analytic Hierarchy Process (AHP) is a decision-making model for decision problems with many alternatives and many criteria. The decision is made by a person regarded as an expert, who supplies perception values as weights in a pairwise comparison matrix. An expert here means someone who truly understands the problem at hand, feels its consequences, or has a stake in it. The result of AHP is the best value of the weights for each available alternative. In a decision problem that involves a group of experts or a group of decision makers, the best weights need to be determined, in the sense that they produce the best decision. To produce the best weights, a Genetic Algorithm (GA) will be applied to the combination of weight values given by each decision maker...
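As a rough illustration of how a pairwise comparison matrix yields weights, the sketch below uses the row geometric mean method, a common approximation of the AHP priority vector. It is not the hybrid AHP and Genetic Algorithm procedure the abstract proposes, and the 3x3 matrix and the name `ahp_weights` are hypothetical.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(matrix)
    # Geometric mean of each row of the pairwise comparison matrix.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    # Normalize so that the weights sum to 1.
    return [g / total for g in geo_means]


# Hypothetical reciprocal matrix: entry [i][j] says how strongly
# alternative i is preferred over alternative j (and [j][i] = 1 / [i][j]).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
print([round(w, 3) for w in weights])
```

With several decision makers, each would supply a matrix of this kind, which is where the abstract's search for the best combined weights comes in.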

Assessment for the renewal and management of stormwater drainage facilities in residential areas of Enugu city, Nigeria

The objectives of this paper are to assess the quality of artificial drainage facilities in residential premises of Enugu city and to propose appropriate renewal strategies for managing them. An appraisal technique that employs a checklist of seven simple classifications of specified qualities is used to achieve this aim. In this approach, percentage penalty points (Pp) are assigned to drainage facilities with observable defects. A total of 366 drains in 20 residential areas of the city are appraised. The poorest drainage condition is observed in Ogui Urban area, with the highest penalty of 88%, followed by Abakpa with a penalty of 87.5%. Only three layouts (Ekulu, G.R.A., and Independence Layout) have less than 25% penalty points, which is rated as good drainage condition. None of the residential areas in the city has excellent drainage facilities, as none scored less than 10% penalty. A more integrative legislation on urban land use planning and management to protect the dra...

Performansi Neuro Fuzzy untuk Peramalan Data Time Series (Neuro Fuzzy Performance for Time Series Data Forecasting)

ANFIS (Adaptive Neuro Fuzzy Inference System) is a neural network method that functions in the same way as a fuzzy inference system. In ANFIS, the neural network's learning process on a set of data pairs serves to update the parameters of the fuzzy inference system. The ANFIS method uses the error backpropagation algorithm, which has several advantages, both in terms of convergence and in terms of its local minima, which are very sensitive to adjustments of the ANFIS parameters. The method is applied to time series forecasting for 4 types of data: stationary (sunspot data), random (stock data), non-stationary (airline), and seasonal (electrical load). Training with ANFIS gives perfect results, with the training error able to reach 0 (zero). ANFIS performs very well in forecasting the stock data, with an MSE of 2.27 at time lag 320. The forecasts for the sunspot data and the electrical load data have results ...

Emotional Context Detection on Conversation Text with Deep Learning Method Using Long Short-Term Memory and Attention Networks

2021 9th International Conference on Information and Communication Technology (ICoICT), 2021

Text conversation is an interesting research topic in Natural Language Processing. One text conversation task is to identify the emotions of the people involved in the conversation. Conversations on social media such as Twitter, Instagram, short message service, WhatsApp, and so on often involve emotion. Comments are sometimes impulsive sentences that can stir emotions. Expressing emotions in text is rarely done and feels uncomfortable. However, with the development of natural language technology, emotions can be expressed in text using specific symbols, which we call emojis. Many emojis can express emotions. This research proposes the emoji symbol as a character feature. We introduce the Emoji2Vec method and Long Short-term Memory with Attention. The Attention mechanism used has a complex topology. We compared the results of this study with a baseline model, and the proposed method performs better than the baseline.

Semantic Information Retrival for Scientific Experimental Papers with Knowlege based Feature Extraction

INOVTEK Polbeng - Seri Informatika, 2019

Abstract-Over time, demand for information retrieval in scientific papers has increased. Researchers have difficulty searching for information in experimental scientific papers because information retrieval engines are limited by text mining-based feature extraction over the entire text, whereas experimental scientific papers have specific contents that call for different treatment in feature extraction. In this paper, we propose a new system for information retrieval on experimental scientific papers. The system consists of 4 main functions: (1) Specific content-based feature extraction, (2) Classification model, (3) Context-based subspace selection, and (4) Context-dependent similarity measurement. In feature extraction, our system extracts feature categories in experimental scientific papers using specific content-based features, which are data, problem, method and result. To assess the applicability of the proposed system, we tested 77 papers in the dataset using Leave-One-Out validation with several classification algorithms (Nearest Neighbour, Naive Bayes, Support Vector Machine and Decision Tree), which on average achieved a precision of 66.65% and an accuracy of 76.18%. We also ran an experiment on similarity measurement, in which the proposed system achieved an accuracy of 79.17%. Keywords-Scientific experimental paper, Context-base subspace selection, Context-dependent similarity measurement. Intisari-Over time, demand for information retrieval in scientific papers has also increased. Existing information retrieval engines are limited in the search process by text mining-based feature extraction over the entire text, whereas experimental scientific papers have specific content.
In this paper we propose a system for information retrieval on experimental scientific papers. The system consists of 4 functions: (1) Content-based feature extraction, (2) Classification model, (3) Context-based subspace selection, and (4) Context-dependent similarity measurement. In Context-based Subspace Selection, the system performs dimensionality reduction by selecting the subspace according to the context chosen by the user. To obtain the final search results, we measure context similarity by building a context-dependent metric from the dataset to the paper. To assess the applicability of the proposed system, we tested 77 papers in the dataset using Leave-One-Out validation with several classification algorithms (Nearest Neighbor, Naive Bayes, Support Vector Machine, and Decision Tree), which on average achieved a precision of 66.65% and an accuracy of 76.18%. We also ran an experiment on similarity measurement by supplying a query paper and the desired content (data, result, method, and problem) as the context given by the user. In the similarity measurement experiment, the proposed system achieved an accuracy of 79.17%.

Penentuan Aspek Implisit dengan Ekstraksi Knowledge Berbasis Rule pada Ulasan Bahasa Indonesia (Determination of Implicit Aspects with Rule Based Knowledge Extraction in Indonesian Reviews)

Jurnal Nasional Teknik Elektro dan Teknologi Informasi, 2020

Determining implicit aspects is an important task in analyzing opinion sentences. This study proposes a new approach for developing rule-based knowledge by forming a relation between opinion words and aspect categories. The relation is obtained from a combination of rules based on Opinion Word Similarity (OWS). Rule-based knowledge extraction is evaluated through threshold values of frequency and confidence chosen to produce the best precision, recall, and f-measure values. The knowledge extraction consists of two phases: a training phase and a testing phase. The training phase is the process of extracting rule-based knowledge. The testing phase is the process of obtaining the implicit aspects of opinion sentences by referring to the rule-based knowledge. To extract rule-based knowledge from user reviews, it is necessary to identify opinion sentences with explicit aspects and to get pairs of aspects and opinion words with rules generated from regular expr...