Hamada Nayel - Academia.edu

Papers by Hamada Nayel

Misinformation Detection in Arabic Tweets: A Case Study about COVID-19 Vaccination

Benha Journal of Applied Sciences

Misinformation about COVID-19 has overwhelmed our lives due to the heavy use of social media, especially Twitter. The spread of misinformation caused fear and panic, affecting the national economic security of many countries. Vaccination is key to limiting the spread of the COVID-19 pandemic. Researchers have therefore taken up detecting and countering the spread of misinformation as a new challenge. This paper presents a model for misinformation detection in Arabic tweets using Natural Language Processing (NLP) techniques. A machine learning-based system has been developed for COVID-19 vaccination tweets. Term Frequency-Inverse Document Frequency (TF-IDF) has been used as the vector space model for feature extraction, and a Support Vector Machine (SVM) classification algorithm has been used to implement the proposed system. The system has been evaluated, using different metrics, on Arcov-19Vac, a dataset of Arabic tweets related to COVID-19 vaccination. The reported results show that the performance of the model is promising.
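The TF-IDF feature extraction step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `tfidf`, the toy English sentences standing in for tokenized tweets, and the particular weighting (relative term frequency times `log(N / df)`) are all assumptions, and several other TF-IDF variants are in common use. The SVM training step is not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * log(N / document frequency).
    This is one common TF-IDF variant, chosen here for illustration only.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy data standing in for tokenized tweets.
docs = [
    "the vaccine is safe".split(),
    "the vaccine alters dna".split(),
    "the weather is nice".split(),
]
vectors = tfidf(docs)
```

In this sketch a term that occurs in every document (here "the") gets weight zero, while rarer terms get larger weights, which is what makes the resulting vectors useful as classifier input.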

BFCAI at SemEval-2022 Task 6: Multi-Layer Perceptron for Sarcasm Detection in Arabic Texts

Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

NAYEL @LT-EDI-ACL2022: Homophobia/Transphobia Detection for Equality, Diversity, and Inclusion using SVM

Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion

A Comparative Study of Machine Learning Approaches for Rumors Detection in Covid-19 Tweets

2022 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC)

AI Question Bank

Lectures of AI

Biomedical Named Entity Recognition

Biomedical Named Entity Recognition: PhD Synnopsis

Type Systems Based Data Race Detector

Multi-threading is a widely used methodology, and modern software depends essentially on it. Operating systems are a well-known example: a user can write a document, play an audio file, and download a file from the internet at the same time. Each of these tasks is called a thread. A common problem that occurs when implementing multi-threaded programs is a data race. A data race occurs when two threads try to access a shared variable at the same time without proper synchronization. A detector is software that determines whether a program contains a data race. In this paper, we develop a detector that has the form of a type system. We present a type system which discovers data races, and we prove its soundness.
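To make the notion of a data race concrete, here is a small runtime illustration. It is only a sketch of the problem itself and does not reproduce the paper's static, type-system-based detector; the names `SharedCounter` and `run` are hypothetical. Whether the unsynchronized version actually loses updates depends on the interpreter and on scheduling, so no specific result is claimed for it.

```python
import threading


class SharedCounter:
    """A shared variable accessed by several threads."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def unsafe_increment(self):
        # Read-modify-write with no synchronization: two threads may read
        # the same old value, so one of the two updates can be lost.
        self.value += 1

    def safe_increment(self):
        # The lock turns the read-modify-write into a critical section.
        with self.lock:
            self.value += 1


def run(method_name, n_threads=4, n_iters=10_000):
    """Call the named increment method from several threads; return the total."""
    counter = SharedCounter()
    increment = getattr(counter, method_name)

    def worker():
        for _ in range(n_iters):
            increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock, `run("safe_increment")` always returns `n_threads * n_iters`. Without it, the result can be lower, and a run that happens to return the full total does not show that the race is absent, which is one motivation for static detection.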

Mangalore-University@INLI-FIRE-2017: Indian Native Language Identification using Support Vector Machines and Ensemble approach

Forum for Information Retrieval Evaluation, 2017

This paper describes the systems submitted by our team for the Indian Native Language Identification (INLI) task held in conjunction with FIRE 2017. Native Language Identification (NLI) is an important task with applications in areas such as social media analysis, authorship identification, second language acquisition and forensic investigation. We submitted two systems, one using a Support Vector Machine (SVM) and one using an ensemble classifier based on three different classifiers. Both systems represent the comments (data) with a vector space model. They achieved accuracies of 47.60% and 47.30% respectively and secured second rank among all submissions for the task. CCS CONCEPTS • Information systems → Web and social media search; Multilingual and cross-lingual retrieval; • Computing methodologies → Language resources;
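One simple way to combine three classifiers into an ensemble is hard (majority) voting, sketched below. The abstract does not say how the three classifiers were combined, so majority voting is an assumption made purely for illustration, as are the function name `majority_vote` and the language label codes.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority.

    When every classifier disagrees, the label from the earliest classifier
    in the list wins, because Counter orders ties by first occurrence.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three classifiers on three comments.
clf_a = ["HI", "TA", "BE"]
clf_b = ["HI", "TE", "BE"]
clf_c = ["KA", "TA", "MA"]
combined = majority_vote([clf_a, clf_b, clf_c])
```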

DEEP at HASOC2019: A Machine Learning Framework for Hate Speech and Offensive Language Detection

Forum for Information Retrieval Evaluation, 2019

In this paper, we describe the system submitted by our team for the Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC) shared task held at FIRE 2019. Hate speech and offensive language detection has become an important task due to the overwhelming use of social media platforms in our daily lives. The task covers three languages, namely English, German and Hindi. The proposed model uses classical machine learning approaches to create classifiers that label a given post according to the different subtasks.

Mangalore University INLI@FIRE2018: Artificial Neural Network and Ensemble based Models for INLI

Forum for Information Retrieval Evaluation, 2018

This paper describes the systems submitted by the Mangalore University team for the Indian Native Language Identification (INLI) task. Native Language Identification (NLI) has applications such as social media analysis, authorship identification, second language acquisition and forensic investigation. We submitted three systems using an Artificial Neural Network (ANN) model and an ensemble approach. All three submitted systems achieved the same accuracy of 35.30% and secured second rank among all submissions for the task.

NAYEL at SemEval-2020 Task 12: TF/IDF-Based Approach for Automatic Offensive Language Detection in Arabic Tweets

In this paper, we present the system submitted to “SemEval-2020 Task 12”. The proposed system aims to automatically identify offensive language in Arabic tweets. A machine learning-based approach has been used to design our system. We implemented a linear classifier with Stochastic Gradient Descent (SGD) as the optimization algorithm. Our model reported f1-scores of 84.20% and 81.82% on the development set and test set respectively. The best-performing system and the last-ranked system reported f1-scores of 90.17% and 44.51% on the test set respectively.

BENHA@IDAT: Improving Irony Detection in Arabic Tweets using Ensemble Approach

This paper describes the methods and experiments used in the development of our model submitted to the Irony Detection for Arabic Tweets shared task. We submitted three runs based on our model using Support Vector Machine (SVM), linear and ensemble classifiers. A bag-of-words model with a range of n-grams has been used for feature extraction. Our submissions achieved accuracies of 82.1%, 81.6% and 81.1% for the ensemble-based, SVM and linear classifiers respectively.
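A bag-of-words representation over a range of n-grams can be sketched as below. The function name `ngram_counts`, the default range of unigrams plus bigrams, and the English toy sentence are assumptions for illustration; the abstract does not state which n-gram range the submitted runs used.

```python
from collections import Counter


def ngram_counts(tokens, n_min=1, n_max=2):
    """Count every word n-gram with n_min <= n <= n_max in a token list."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the tokens.
        for start in range(len(tokens) - n + 1):
            counts[" ".join(tokens[start:start + n])] += 1
    return counts


# Hypothetical toy input standing in for a tokenized tweet.
features = ngram_counts("what a great great day".split())
```

Each distinct n-gram becomes one feature dimension, so widening the range captures more local word order at the cost of a larger, sparser feature space.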

NAYEL@APDA: Machine Learning Approach for Author Profiling and Deception Detection in Arabic Texts

In this paper, we describe the methods and experiments used in the development of our system for the Author Profiling and Deception Detection in Arabic shared task. There are two tasks: Author Profiling in Arabic Tweets and Deception Detection in Arabic Texts. We submitted three runs for each task. The proposed system depends on classical machine learning approaches, namely a linear classifier, a Support Vector Machine and a Multilayer Perceptron classifier. A bag-of-words model with a range of n-grams has been used for feature extraction. Our submissions for the first task achieved the second, seventh and third ranks. For the second task, one of our submissions outperformed all submissions from other teams.

Partial Redundancy Elimination for Multi-threaded Programs

Multi-threaded programs have many widely used applications, such as operating systems. Analyzing multi-threaded programs differs from analyzing sequential ones: the main difference is that many threads execute at the same time, so the effect of all other running threads must be taken into account. Partial redundancy elimination is among the most powerful compiler optimizations: it performs loop-invariant code motion and common subexpression elimination. We present a type system with an optimization component which performs partial redundancy elimination for multi-threaded programs.
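The effect of partial redundancy elimination can be shown on a small single-threaded example, transformed by hand. This sketch does not reproduce the paper's type system or its treatment of threads; the names `CountingMul`, `before` and `after` are hypothetical, and the multiplication is routed through a counting helper only so that the saved work is observable.

```python
class CountingMul:
    """Multiplication helper that records how often it is called."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return a * b


def before(a, b, flag, mul):
    if flag:
        x = mul(a, b)
    else:
        x = 0
    # Partially redundant: a * b was already computed when flag is true,
    # but not when flag is false.
    y = mul(a, b)
    return x + y


def after(a, b, flag, mul):
    if flag:
        t = mul(a, b)
        x = t
    else:
        x = 0
        # Inserted computation: makes t available on this path as well,
        # so the computation after the branch becomes fully redundant.
        t = mul(a, b)
    y = t  # reuses t instead of recomputing a * b
    return x + y
```

Both versions return the same value, but `after` performs one multiplication on every path, whereas `before` performs two when `flag` is true. No path ends up with more multiplications than it had originally.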

Improving Multi-Word Entity Recognition for Biomedical Texts

ArXiv, 2019

Biomedical Named Entity Recognition (BioNER), which aims at extracting biomedical named entities from a given text, is a crucial step in analyzing biomedical texts. Different supervised machine learning algorithms have been applied to BioNER by various researchers. The main requirement of these approaches is an annotated dataset used for learning the parameters of the machine learning algorithms. Segment Representation (SR) models comprise different tag sets used for representing the annotated data, such as IOB2, IOE2 and IOBES. In this paper, we propose an extension of the IOBES model to improve the performance of BioNER. The proposed SR model, FROBES, improves the representation of multi-word entities. We used a Bidirectional Long Short-Term Memory (BiLSTM) network, an instance of Recurrent Neural Networks (RNN), to design a baseline system for BioNER and evaluated the new SR model on two datasets, the i2b2/VA 2010 challenge dataset and the JNLPBA 2004 shared task dataset. The proposed SR model...
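To illustrate how segment representation models differ, the sketch below converts a tag sequence from IOB2 to IOBES, where `B`, `I`, `E`, `S` and `O` mark the beginning, inside and end of a multi-word entity, a single-word entity, and a non-entity token. The function name `iob2_to_iobes` and the example tags are assumptions for illustration. The proposed FROBES scheme is not sketched because its tag set is not described in this abstract.

```python
def iob2_to_iobes(tags):
    """Convert a well-formed IOB2 tag sequence to IOBES."""
    converted = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            converted.append(("B-" if continues else "S-") + label)
        else:  # prefix == "I"
            converted.append(("I-" if continues else "E-") + label)
    return converted


# Hypothetical tag sequence: a three-word entity and a one-word entity.
iob2 = ["B-protein", "I-protein", "I-protein", "O", "B-DNA", "O"]
iobes = iob2_to_iobes(iob2)
```

Compared with IOB2, the IOBES tags mark the final token of a multi-word entity and distinguish single-word entities explicitly.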

Machine Learning-Based Approach for Arabic Dialect Identification

This paper describes our systems submitted to the Second Nuanced Arabic Dialect Identification Shared Task (NADI 2021). Dialect identification is the task of automatically detecting the source variety of a given text or speech segment. There are four subtasks: two for country-level identification and two for province-level identification. The data in this task covers a total of 100 provinces from all 21 Arab countries and comes from the Twitter domain. The proposed systems depend on five machine learning approaches, namely Complement Naïve Bayes, Support Vector Machine, Decision Tree, Logistic Regression and Random Forest classifiers. The macro-averaged F1 score of the Naïve Bayes classifier exceeded that of all other classifiers on the development and test data.
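The macro-averaged F1 score used for the comparison gives every class equal weight regardless of how many examples it has. A minimal sketch follows; the function name `macro_f1` and the country label codes in the example are assumptions for illustration, not the shared task's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class has
        # no true positives, false positives or false negatives.
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold and predicted country labels for five tweets.
gold = ["EG", "EG", "EG", "SA", "IQ"]
pred = ["EG", "EG", "SA", "SA", "EG"]
score = macro_f1(gold, pred)
```

In this toy case three of five predictions are correct (accuracy 0.6), but the macro-averaged F1 is 4/9, because the class that is never predicted correctly contributes an F1 of zero with the same weight as the others.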
