Hamada Nayel - Academia.edu
Papers by Hamada Nayel
Benha Journal of Applied Sciences
Misinformation about COVID-19 has overwhelmed our lives due to the tremendous usage of social media, especially Twitter. The spread of misinformation caused fear and panic among people, affecting the national economic security of many countries. Vaccination is key to limiting the spread of the COVID-19 pandemic. Researchers have therefore taken up detecting and fighting the spread of misinformation as a new challenge. This paper presents a model for misinformation detection in Arabic tweets using Natural Language Processing (NLP) techniques. A machine learning-based system has been developed for COVID-19 vaccination tweets. Term Frequency-Inverse Document Frequency (TF-IDF) is used as the vector space model for feature extraction, and a Support Vector Machine classifier is used to implement the proposed system. The system was evaluated, using different metrics, on Arcov-19Vac, a dataset of Arabic tweets related to COVID-19 vaccination. The reported results show that the performance of the model is promising.
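As a rough illustration of the TF-IDF weighting mentioned in the abstract above, the sketch below computes the plain `tf * log(N / df)` variant over pre-tokenized documents. The function name `tfidf` and the toy English tokens are assumptions made for this example; the abstract does not say which TF-IDF variant or tokenizer the authors used, and library implementations usually apply smoothing.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses the unsmoothed textbook formula tf * log(N / df); this is an
    assumption for illustration, not the weighting used in the paper.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical toy corpus of tokenized tweets.
docs = [["vaccine", "is", "safe"], ["vaccine", "rumor"], ["rumor", "is", "false"]]
vectors = tfidf(docs)
```

Terms that occur in fewer documents ("safe") receive a higher weight than terms shared across documents ("vaccine"); the resulting sparse vectors are the kind of input a Support Vector Machine classifier would be trained on.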
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
2022 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC)
Multi-threading is an extremely widely used methodology, and modern software depends on it essentially. Operating systems are a well-known example: a user can write a document, play an audio file, and download a file from the internet at the same time. Each of these tasks is called a thread. A common problem in implementing multi-threaded programs is the data race, which occurs when two threads try to access a shared variable at the same time without proper synchronization. A detector is software that determines whether a program contains a data race. In this paper, we develop a detector in the form of a type system that discovers data races, and we prove the soundness of this type system.
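To make the notion of a shared variable and "proper synchronization" concrete, here is a minimal runtime sketch in which several threads update one counter under a lock. It only illustrates the problem the abstract describes; it is not the paper's type system, which detects races statically. The names `run_counter`, `workers` and `increments` are hypothetical.

```python
import threading

def run_counter(workers=4, increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def work():
        nonlocal total
        for _ in range(increments):
            # The read-modify-write on the shared variable is the critical
            # section; without the lock, concurrent updates could be lost.
            with lock:
                total += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

result = run_counter()
```

With the lock in place every increment is applied, so the final value equals `workers * increments`. Removing the `with lock:` line leaves the update unsynchronized, which is the kind of access pattern a data-race detector is meant to flag.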
Forum for Information Retrieval Evaluation, 2017
This paper describes the systems submitted by our team for the Indian Native Language Identification (INLI) task held in conjunction with FIRE 2017. Native Language Identification (NLI) is an important task with applications in areas such as social-media analysis, authorship identification, second language acquisition and forensic investigation. We submitted two systems, one using a Support Vector Machine (SVM) and one using an Ensemble Classifier based on three different classifiers; both represent the comments (data) in a vector space model. The systems achieved accuracies of 47.60% and 47.30% respectively and secured second rank among all submissions for the task. CCS CONCEPTS • Information systems → Web and social media search; Multilingual and cross-lingual retrieval; • Computing methodologies → Language resources;
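The abstract above mentions an ensemble built from three classifiers but does not say how their outputs are combined. One common option, assumed here purely for illustration, is hard majority voting over the predicted labels. The function `majority_vote` and the label codes are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists by hard majority voting.

    predictions: list of equally long label lists, one per classifier.
    On a tie, the label seen first among the classifiers wins.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

# Hypothetical outputs of three classifiers on two comments.
combined = majority_vote([
    ["HI", "TA"],
    ["HI", "TE"],
    ["BE", "TE"],
])
```

Each position in `combined` holds the label that most of the base classifiers agreed on for that comment.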
Forum for Information Retrieval Evaluation, 2019
In this paper, we describe the system submitted by our team for the Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC) shared task held at FIRE 2019. Hate speech and offensive language detection has become an important task due to the overwhelming usage of social media platforms in our daily life. The task covers three languages, namely English, German and Hindi. The proposed model uses classical machine learning approaches to create classifiers that classify a given post according to the different subtasks.
Forum for Information Retrieval Evaluation, 2018
This paper describes the systems submitted by the Mangalore University team for the Indian Native Language Identification (INLI) task. Native Language Identification (NLI) has applications such as social media analysis, authorship identification, second language acquisition and forensic investigation. We submitted three systems using an Artificial Neural Network (ANN) model and an Ensemble approach. All three submitted systems achieved the same accuracy of 35.30% and secured second rank among all submissions for the task.
In this paper, we present the system submitted to “SemEval-2020 Task 12”. The proposed system aims to automatically identify offensive language in Arabic tweets. A machine learning-based approach has been used to design our system. We implemented a linear classifier with Stochastic Gradient Descent (SGD) as the optimization algorithm. Our model reported F1-scores of 84.20% and 81.82% on the development set and test set respectively. The best-performing system and the last-ranked system reported F1-scores of 90.17% and 44.51% on the test set respectively.
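A linear classifier trained with SGD, as named in the abstract above, can be sketched in a few lines. The version below assumes a logistic (log) loss, a fixed learning rate and two hand-made count features; none of these choices come from the paper, which does not state its loss function or features. The names `train_sgd` and `predict` are hypothetical.

```python
import math
import random

def train_sgd(X, y, epochs=200, lr=0.1, seed=0):
    """Fit weights w and bias b of a linear classifier by per-sample SGD.

    Assumes binary labels in {0, 1} and log loss (an illustrative choice).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            z = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y[i]  # gradient of the log loss with respect to z
            w = [wj - lr * g * xj for wj, xj in zip(w, X[i])]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Return 1 if x lies on the positive side of the hyperplane, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

# Hypothetical features: [count of offensive terms, count of neutral terms].
X = [[2, 0], [1, 0], [0, 2], [0, 1]]
y = [1, 1, 0, 0]
w, b = train_sgd(X, y)
```

Each update touches a single training example, which is what distinguishes stochastic from full-batch gradient descent and keeps training cheap on large tweet collections.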
This paper describes the methods and experiments used in the development of our model submitted to the Irony Detection for Arabic Tweets shared task. We submitted three runs based on our model using Support Vector Machine (SVM), Linear and Ensemble classifiers. A Bag-of-Words model with a range of n-grams has been used for feature extraction. Our submissions achieved accuracies of 82.1%, 81.6% and 81.1% for the ensemble-based, SVM and linear classifiers respectively.
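The bag-of-words representation over a range of n-grams can be sketched as follows. The range 1 to 2 (unigrams and bigrams), the function name `ngram_bow` and the English toy tokens are assumptions for this example; the abstract does not state the range actually used.

```python
from collections import Counter

def ngram_bow(tokens, n_min=1, n_max=2):
    """Count every n-gram of length n_min..n_max in a token list."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Hypothetical tokenized tweet.
features = ngram_bow(["what", "a", "great", "day"], 1, 2)
```

The resulting counts ("great", "great day", and so on) form the feature vector passed to the classifiers; including bigrams lets the model see short word sequences that single words would miss.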
In this paper, we describe the methods and experiments used in the development of our system for the Author Profiling and Deception Detection in Arabic shared task. There are two tasks: Author Profiling in Arabic Tweets and Deception Detection in Arabic Texts. We submitted three runs for each task. The proposed system depends on classical machine learning approaches, namely a Linear Classifier, a Support Vector Machine and a Multilayer Perceptron Classifier. A Bag-of-Words model with a range of n-grams has been used for feature extraction. Our submissions for the first task achieved the second, seventh and third ranks. For the second task, one of our submissions outperformed all submissions from other teams.
Multi-threaded programs have many widely used applications, such as operating systems. Analyzing multi-threaded programs differs from analyzing sequential ones: because many threads execute at the same time, the effect of all other running threads must be taken into account. Partial redundancy elimination is among the most powerful compiler optimizations: it performs loop-invariant code motion and common subexpression elimination. We present a type system with an optimization component which performs partial redundancy elimination for multi-threaded programs.
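The effect of partial redundancy elimination is easiest to see on a small sequential example. In the sketch below, `a + b` is computed on only one branch and then again after the branches join, so the second computation is redundant on that path. The two functions are hand-written before/after versions under that assumption; they do not reproduce the paper's type system or its treatment of threads.

```python
def before(a, b, flag):
    """Original code: a + b is partially redundant after the branch."""
    if flag:
        x = a + b  # computed on this path only
    else:
        x = 0
    y = a + b      # recomputed even when the branch above already did it
    return x, y

def after(a, b, flag):
    """After PRE: a + b is evaluated exactly once on every path."""
    if flag:
        t = a + b
        x = t
    else:
        x = 0
        t = a + b  # inserted so that t is available on this path too
    y = t          # reuse instead of recomputing
    return x, y
```

Both versions return the same values for all inputs; the transformed one simply avoids evaluating the expression twice when `flag` is true. In a multi-threaded setting the optimizer must additionally ensure that no other thread can change `a` or `b` between the two uses, which is the difficulty the abstract points to.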
ArXiv, 2019
Biomedical Named Entity Recognition (BioNER) is a crucial step in analyzing biomedical texts; it aims at extracting biomedical named entities from a given text. Different supervised machine learning algorithms have been applied to BioNER by various researchers. The main requirement of these approaches is an annotated dataset used for learning the parameters of the machine learning algorithms. Segment Representation (SR) models comprise different tag sets used for representing the annotated data, such as IOB2, IOE2 and IOBES. In this paper, we propose an extension of the IOBES model to improve the performance of BioNER. The proposed SR model, FROBES, improves the representation of multi-word entities. We used a Bidirectional Long Short-Term Memory (BiLSTM) network, an instance of Recurrent Neural Networks (RNN), to design a baseline system for BioNER and evaluated the new SR model on two datasets, the i2b2/VA 2010 challenge dataset and the JNLPBA 2004 shared task dataset. The proposed SR model...
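To show how two of the segment representations named above relate, the sketch below converts an IOB2 tag sequence into IOBES, where single-token entities get an `S-` tag and the last token of a multi-token entity gets an `E-` tag. The function name `iob2_to_iobes` and the sample tags are illustrative assumptions. The FROBES scheme itself is not defined in the abstract and is not reproduced here.

```python
def iob2_to_iobes(tags):
    """Convert a well-formed IOB2 tag sequence to IOBES."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == "I-" + label  # does the entity extend past this token?
        if prefix == "B":
            out.append(("B-" if continues else "S-") + label)
        else:  # prefix == "I"
            out.append(("I-" if continues else "E-") + label)
    return out

# Hypothetical sentence with one three-token and one single-token entity.
iob2 = ["B-protein", "I-protein", "I-protein", "O", "B-DNA"]
iobes = iob2_to_iobes(iob2)
```

The IOBES tags mark entity boundaries explicitly, which gives a sequence labeler such as a BiLSTM more information about where multi-word entities start and end.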
This paper describes our systems submitted to the Second Nuanced Arabic Dialect Identification Shared Task (NADI 2021). Dialect identification is the task of automatically detecting the source variety of a given text or speech segment. There are four subtasks: two for country-level identification and two for province-level identification. The data in this task covers a total of 100 provinces from all 21 Arab countries and comes from the Twitter domain. The proposed systems depend on five machine-learning approaches, namely Complement Naïve Bayes, Support Vector Machine, Decision Tree, Logistic Regression and Random Forest classifiers. In terms of macro-averaged F1 score, the Naïve Bayes classifier outperformed all other classifiers on the development and test data.
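The macro-averaged F1 score used for the comparison above is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. A minimal sketch, with a hypothetical function name `macro_f1` and made-up country codes as labels:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)

# Hypothetical gold and predicted country labels for four tweets.
score = macro_f1(["EG", "EG", "SA", "MA"], ["EG", "SA", "SA", "MA"])
```

Here the per-class F1 scores are 2/3 (EG), 1 (MA) and 2/3 (SA), giving a macro average of 7/9. This averaging suits dialect identification, where some countries or provinces have far fewer tweets than others.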