Waheeda Almayyan | College of Technological Studies, PAAET

Papers by Waheeda Almayyan

A Data Mining Approach for Filtering out Social Spammers in Large-Scale Twitter Data Collections

International Journal of Artificial Intelligence & Applications

Social networking services, such as Facebook.com and Twitter.com, are fast-growing enterprise platforms that have become a prevalent and essential component of daily life. Due to its popularity, Twitter draws many spammers and other fake accounts that post malicious links and flood legitimate users' accounts with spam messages. Therefore, it is crucial to recognize and screen spam tweets and spam accounts. Spam detection is thus highly needed but remains a difficult challenge. This article applied several bio-inspired optimization algorithms to reduce the feature dimensions in the first stage. Then we used several classification schemes in the second stage to enhance the spam detection rate in three real Twitter data collections. The results showed that the Random Forest and C4.5 classifiers achieved the highest Accuracy, Precision, Recall, and F1-score, even under class imbalance.
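As a reading aid for the four measures named in this abstract, the following is a minimal sketch of how they are computed from confusion-matrix counts, and why accuracy alone can mislead under class imbalance. The function name and the counts are hypothetical illustrations, not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts.

    tp/fp/fn/tn are counts for the positive (spam) class. Ratios with a
    zero denominator are reported as 0.0 by convention in this sketch.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced collection: 10 spam accounts, 90 legitimate ones.
# A classifier that never flags spam still scores high accuracy but zero recall.
never_flags = classification_metrics(tp=0, fp=0, fn=10, tn=90)

# A classifier that finds 8 of the 10 spam accounts with 2 false alarms.
finds_most = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

Comparing the two dictionaries shows why precision, recall, and F1 are reported alongside accuracy when one class is rare.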

Diagnosis of Obesity Level based on Bagging Ensemble Classifier and Feature Selection Methods

International Journal of Artificial Intelligence & Applications

In the current era, the amount of data generated from various device sources and business transactions is rising exponentially, and the current machine learning techniques are not feasible for handling the massive volume of data. Two commonly adopted schemes exist to solve such issues: scaling up the data mining algorithms and data reduction. Scaling the data mining algorithms is not the best way, but data reduction is feasible. There are two approaches to reducing datasets: selecting an optimal subset of features from the initial dataset or eliminating those that contribute less information. Overweight and obesity are increasing worldwide, and forecasting future overweight or obesity could help intervention. Our primary objective is to find the optimal subset of features to diagnose obesity. This article proposes adapting a bagging algorithm based on filter-based feature selection to improve the prediction accuracy of obesity with a minimal number of feature subsets. We utilized seve...

Towards Predicting Software Defects with Clustering Techniques

Social Science Research Network, 2021

Determination of Fetal State from Cardiotocogram using Random Forest and Particle Swarm Optimization

Journal of Convergence Information Technology, 2016

Developing a Machine Learning Model for Detecting Job Burnout During the COVID-19 Pandemic Among Front-line Workers in Kuwait

During the COVID-19 pandemic, front-line personnel worldwide experienced tremendous psychological stress compared with the general population. High mental stress may lead to job burnout. This paper starts with gathering a job burnout dataset from medical staff and police officers working in Kuwait during the COVID-19 pandemic using a web-based Arabic version of the Maslach Burnout Inventory questionnaire. The gathered dataset shows elevated burnout rates among the front-line personnel dealing with COVID-19 patients. The paper utilizes machine learning techniques including AnDE Bayesian, JChaid* decision tree, SVM margin-based, ForestPA decision forest, and DMLP to predict the presence of job burnout. Then, we present efficient feature subset selection approaches using several metaheuristic methods such as Bat, Cuckoo, PSO, GWO, and CGWO to select the most competent features. Experiments showed that reducin...

Information Fusion in Biometrics: A Case Study in Online Signature

Biometrics is a constantly evolving technology which has been widely used in many official and commercial identification applications. A biometric system is essentially a pattern recognition system which makes a personal identification decision by determining the authenticity of specific physiological or behavioral traits. Despite considerable advances in recent years, there are still challenges in authentication based on a single biometric trait, such as noisy data, restricted degrees of freedom, intra-class variability, non-universality, spoof attacks and undesirable error rates. Some of the restrictions can be lifted by designing a multimodal biometric system. Multimodal biometric systems are those which utilize, or are capable of utilizing, more than one physiological or behavioral characteristic for enrollment in either verification or identification mode. Among the biometric traits, handwritten signature is considered to be the most widely accepted biometric for identity verification...

Biometric-Based Authentication System Using Rough Set Theory

Rough Sets and Current Trends in Computing, 2010

In this paper we have proposed a biometric-based authentication system based on rough set theory. The system employed signatures for authentication purposes. The major functional blocks of the proposed system are presented. Information is extracted as time functions of various dynamic properties of the signatures. We apply our methodology to global features extracted from a 108-user database. Thirty-one features were identified and extracted from each signature. The rough set approach has resulted in a reduced set of nine features that ...

A comparative evaluation of feature level based fusion schemes for multimodal biometric authentication

2011 11th International Conference on Hybrid Intelligent Systems (HIS), 2011

This paper proposes a novel fusion technique using iris-online signature biometrics at feature level space. The biometric features are extracted from the pre-processed images of iris and the dynamics of signatures. We propose different fusion schemes at feature level. In order to reduce the complexity of the fusion scheme, we adopt a binary particle swarm optimization (BPSO) procedure which allows the number of features to be significantly reduced while highlighting the difference between classes. This paper examines how the ...

Iris features extraction using dual-tree complex wavelet transform

2010 International Conference of Soft Computing and Pattern Recognition, 2010

Rough set approach to online signature identification

Digital Signal Processing, 2011

Rough Set Based Global Features for Online Signature Identification

Egyptian Computer Science Journal, 2009

A wet processing arrangement for photosensitive articles has a container for a processing bath. A rack is mountable in the container and includes a set of rollers which define nips for advancing photosensitive articles through the bath. A drive for the rollers is disposed inside the container. Means is provided to eliminate air bubbles which tend to form beneath the nips of the rollers as the rack is immersed in the bath.

Improved Discriminatory Ability using Hybrid Feature Selection via Approach Inspired by Grey Wolf Search and Ensemble Classifier for Medical Datasets

Medical datasets inevitably suffer from redundant and irrelevant attributes, which reduce data mining algorithms' ability and often lead to uninterpretable results. Therefore, the first step in medical diagnosis problems is to reduce dimensionality. This paper presents a computational method that takes advantage of wrapper subset evaluation with a meta-heuristic algorithm in a two-phase process to improve the classification performance with a group of meta-classifiers. The first phase filters the feature domain using the information gain ratio in an attribute evaluation method. The first layer's output serves as an input feature for the second phase, which uses grey wolf optimization to find the optimal feature space. An ensemble-based classifier scheme was built based on C4.5, RandF, and ForestPA classifiers to obtain the final decision. The proposed method was validated on several medical datas...
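The first-phase filter mentioned in this abstract ranks attributes by information gain ratio. The following is a minimal sketch of that measure for a single categorical attribute, assuming the standard definition (information gain divided by split information). The function names and toy data are hypothetical, and the sketch is not the authors' implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    n = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)  # entropy of the feature's own distribution
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data: one attribute that determines the class and one that does not.
labels = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
ratio_informative = gain_ratio(informative, labels)
ratio_uninformative = gain_ratio(uninformative, labels)
```

A filter stage would score every attribute this way and pass only the highest-ranked ones to the second phase.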

Towards Predicting Software Defects with Clustering Techniques

The purpose of software defect prediction is to improve the quality of a software project by building a predictive model to decide whether a software module is or is not fault prone. In recent years, much research on using machine learning techniques for this topic has been performed. Our aim was to evaluate the performance of clustering techniques with feature selection schemes to address the software defect prediction problem. We analysed the National Aeronautics and Space Administration (NASA) dataset benchmarks using three clustering algorithms: (1) Farthest First, (2) X-Means, and (3) self-organizing map (SOM). In order to evaluate different feature selection algorithms, this article presents a comparative analysis involving software defect prediction based on Bat, Cuckoo, Grey Wolf Optimizer (GWO), and particle swarm optimizer (PSO). The results obtained with the proposed clustering models enabled us to build an efficient predictive model with a satisfactory detectio...

Data Mining Approach to Analyze COVID-19 Dataset of Mexican Patients

The pandemic caused by coronavirus (COVID-19) forced governments to choose different health policies to stop the infection and inspired many research groups to work on patient data to understand the virus's behaviour. This research suggests a two-phase prediction system with several learning algorithms to explore the COVID-19 dataset, where Chi-square is employed at the first stage. Cuckoo search and Grey Wolf Optimiser approaches are proposed in the second stage to combine their advantages in selecting the most distinctive features. The proposed classification model is trained and tested with six machine learning algorithms. The proposed model achieved 96.5% accuracy on samples from 95839 patients, including records with incomplete data.

A Hybrid Feature Selection Approach to Improve Parkinson’s disease Mining

Performance analysis of multimodal biometric fusion

Mining Sports Articles using Cuckoo Search and Tabu Search with SMOTE Preprocessing Technique

Sentiment analysis is one of the most popular domains for natural language text classification, crucial for improving information extraction. However, massive data availability is one of the biggest problems for opinion mining due to accuracy considerations. Selecting highly discriminative features from an opinion mining database is still an ongoing research topic. This study presents a two-stage heuristic feature selection method to classify sports articles using Tabu search and Cuckoo search via Levy flight. Levy flight is used to prevent the solution from being trapped at local optima. Comparative results on a benchmark dataset show that our method improves the overall accuracy from 82.6% up to 89.5%.
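To illustrate how a Levy flight helps a search escape local optima, the following is a minimal sketch of drawing Levy-distributed step lengths with Mantegna's algorithm, a common choice in cuckoo search implementations. Whether the paper uses this particular sampler is an assumption, as are the function names and the exponent beta = 1.5.

```python
import math
import random


def mantegna_sigma(beta=1.5):
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    numerator = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    denominator = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (numerator / denominator) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length: u / |v|^(1/beta)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


# Most steps are small (local search), but occasional very large jumps
# move a candidate solution far away from a local optimum.
rng = random.Random(0)
steps = [levy_step(rng=rng) for _ in range(1000)]
```

In a cuckoo search, each step would typically be scaled by a small step-size factor and added to the current candidate solution.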

A Hybrid Two-Step-Model Based on Cuckoo-Search and Grey Wolf Optimiser Ensemble-Based Classifier Approach for Network Intrusion Detection

Clustering Techniques

