Fethi Jarray - Academia.edu

Papers by Fethi Jarray

SiameseBERT: A Bert-Based Siamese Network Enhanced with a Soft Attention Mechanism for Arabic Semantic Textual Similarity

A Sequence-to-Sequence Neural Network for Joint Aspect Term Extraction and Aspect Term Sentiment Classification Tasks

Sentence Transformers and DistilBERT for Arabic Word Sense Induction

Comparative Analysis of Recurrent Neural Network Architectures for Arabic Word Sense Disambiguation

Due to the stochastic nature and complexity of flow, as well as the existence of hydrological uncertainties, predicting streamflow in dam reservoirs, especially in semi-arid and arid areas, is essential for the optimal and timely use of surface water resources. In this research, daily streamflow to the Ermenek hydroelectric dam reservoir in Turkey is simulated using deep recurrent neural network (RNN) architectures, including bidirectional long short-term memory (Bi-LSTM), gated recurrent unit (GRU), long short-term memory (LSTM), and simple recurrent neural networks (simple RNN). For this purpose, daily observed flow data from the period 2012-2018 are used, and all models are coded in Python. Only lagged values of the streamflow time series are used as model inputs. Then, based on the correlation coefficient (CC), mean absolute error (MAE), root mean square error (RMSE), and Nash-Sutcliffe efficiency coefficient (NS), the results of the deep-learning architectures are compared with one another and with an artificial neural network (ANN) with two hidden layers. Results indicate that the deep-learning RNN methods are more accurate than the ANN. Among the deep-learning methods, LSTM has the best accuracy, simulating streamflow to the dam reservoir with 90% accuracy in the training stage and 87% accuracy in the testing stage, whereas the accuracies of the ANN in the training and testing stages are 86% and 85%, respectively. Considering that the Ermenek Dam is used for hydroelectric energy production, modeling inflow as realistically as possible may lead to an increase in energy production and income by optimizing water management. Hence, improvements of a few percentage points can be extremely useful. According to the results, deep-learning RNN methods can be used for estimating streamflow to the Ermenek Dam reservoir due to their accuracy.
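
The four evaluation measures named in the abstract can be sketched in plain Python as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function name `streamflow_metrics` and the sample values in the usage note are hypothetical.

```python
import math


def streamflow_metrics(observed, simulated):
    """Return CC, MAE, RMSE and Nash-Sutcliffe efficiency (NS) for two series.

    Assumes both series have the same non-zero length and are not constant.
    """
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")

    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    errors = [s - o for o, s in zip(observed, simulated)]

    # Mean absolute error and root mean square error.
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation coefficient.
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_sim = sum((s - mean_sim) ** 2 for s in simulated)
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated))
    cc = cov / math.sqrt(ss_obs * ss_sim)

    # Nash-Sutcliffe efficiency: 1 minus squared error over observed variance.
    ns = 1.0 - sum(e * e for e in errors) / ss_obs

    return {"CC": cc, "MAE": mae, "RMSE": rmse, "NS": ns}
```

For example, `streamflow_metrics([1, 2, 3, 4], [2, 3, 4, 5])` describes a simulation with a constant bias of one unit: the correlation stays perfect while NS drops well below 1, which is why the two measures are usually reported together.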

BERT-Based Joint Model for Aspect Term Extraction and Aspect Polarity Detection in Arabic Text

Electronics, Jan 19, 2023

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY

BERT-Based Ensemble Learning Approach for Sentiment Analysis

Communications in computer and information science, 2023

Combining Bert Representation and POS Tagger for Arabic Word Sense Disambiguation

Genetic Algorithm and Latent Semantic Analysis based Documents Summarization Technique

Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management

Large Class Arabic Sign Language Recognition

Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management

Hybridation of genetic algorithms and tabu search for reconstructing convex binary images from discrete orthogonal projections

Le Centre pour la Communication Scientifique Directe - HAL - Université Paris Descartes, 2014

An automatic arabic text summarization system based on genetic algorithms

Procedia Computer Science

Text summarization is the creation of a compressed version of a given document that covers the important information from the original document. The aim of text summarization is to reduce the original text to a shorter text that represents its significant content. There are two approaches to automatic text summarization: extractive and abstractive. Extractive summarization consists of selecting the most significant sentences from the original text. Abstractive summarization consists of composing novel sentences coherent with the original text. In this paper, we present an extractive, single-document approach for an Arabic text summarization system using genetic algorithms.

Combinatorial Approaches for Low-power and Real-time Adaptive Reconfigurable Embedded Systems

Proceedings of the 4th International Conference on Pervasive and Embedded Computing and Communication Systems, 2014

This paper describes an optimisation-oriented approach to dynamic reconfiguration of embedded systems with real-time and power consumption constraints. A reconfiguration scenario is assumed to be any run-time operation allowing the addition, removal, or update of OS tasks to adapt the system to its environment under well-defined conditions. The problem is that any reconfiguration can lead the system to an infeasible state where temporal properties are violated or the energy consumption increases significantly. Two methods, integer programming and simulated annealing, are used to address this problem. The methods have been validated using analysis tools to evaluate the whole contribution.

Reconstruction of binary matrices under adjacency constraints

Electronic Notes in Discrete Mathematics, 2005

We consider a generalization of the classical binary matrix reconstruction problem by considering adjacency constraints between the cells: if a given cell is of value 1 then all its neighbors are of value 0. This problem arises especially in statistical physics. We consider several definitions of neighborhood and for each one we give complexity results, necessary and/or sufficient conditions for
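
The adjacency constraint stated in this abstract (a cell of value 1 forces all its neighbors to 0) can be checked with a short sketch like the one below. The two neighborhood definitions used here, 4-connected and 8-connected, are assumptions chosen for illustration, since the abstract does not list the definitions the paper considers; the names `NEIGHBORHOODS` and `satisfies_adjacency` are hypothetical.

```python
# Hypothetical neighborhood definitions (row offset, column offset).
NEIGHBORHOODS = {
    "4": [(-1, 0), (1, 0), (0, -1), (0, 1)],
    "8": [(-1, -1), (-1, 0), (-1, 1), (0, -1),
          (0, 1), (1, -1), (1, 0), (1, 1)],
}


def satisfies_adjacency(matrix, neighborhood="4"):
    """Return True if no cell of value 1 has a neighbor of value 1."""
    offsets = NEIGHBORHOODS[neighborhood]
    n_rows = len(matrix)
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 1:
                continue
            for di, dj in offsets:
                a, b = i + di, j + dj
                # Only look at neighbors that fall inside the matrix.
                if 0 <= a < n_rows and 0 <= b < len(matrix[a]):
                    if matrix[a][b] == 1:
                        return False
    return True
```

A diagonal pattern such as `[[1, 0], [0, 1]]` passes under the 4-connected definition but fails under the 8-connected one, which illustrates why the choice of neighborhood changes the problem.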

GPT-2 Contextual Data Augmentation for Word Sense Disambiguation

Le Centre pour la Communication Scientifique Directe - HAL - Grenoble Ecole de Management, Oct 20, 2022

A step loss function based SVM classifier for binary classification

Procedia Computer Science, 2018

In this paper, we propose a new cost function, step loss, for support vector machine classifiers based on a deep distinction between the instances. It takes into account the position of the samples relative to the margin. More precisely, we divide the instances into four categories: i) instances correctly classified that lie outside the margin, ii) instances correctly classified that lie within the margin, iii) instances misclassified that lie within the margin, and iv) instances misclassified that lie outside the margin. The step loss assigns a constant cost to each group of instances. It is thus more general than the hard margin cost, which divides the instances into two categories. It is also more robust to outliers than the soft margin because the instances of the fourth group have a constant cost, contrary to the hinge cost, where the misclassified instances have a linear cost. It is more accurate than the ramp loss because it makes a hard distinction between the instances correctly classified within the margin and the instances misclassified within the margin. Theoretically, we prove that the SVM model integrated with the step loss function has the nice property of kernelization.
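
The four-way split described in this abstract can be illustrated with a small sketch. It assumes labels in {-1, +1}, a real-valued classifier score, and a margin of width 1 on each side of the decision boundary; the cost values and the handling of points lying exactly on a boundary are hypothetical choices made for the example, not values taken from the paper.

```python
def step_loss(y, score, costs=(0.0, 0.5, 1.0, 2.0)):
    """Piecewise-constant loss over the four groups named in the abstract.

    y      -- true label, +1 or -1
    score  -- real-valued classifier output f(x)
    costs  -- hypothetical constants for groups (i), (ii), (iii), (iv)
    """
    margin = y * score  # functional margin: positive means correctly classified
    if margin >= 1:
        return costs[0]  # (i) correctly classified, outside the margin
    if margin > 0:
        return costs[1]  # (ii) correctly classified, within the margin
    if margin > -1:
        return costs[2]  # (iii) misclassified, within the margin
    return costs[3]      # (iv) misclassified, outside the margin


def hinge_loss(y, score):
    """Standard hinge loss, shown only for comparison."""
    return max(0.0, 1.0 - y * score)
```

With these assumed constants, a badly misclassified point such as `step_loss(1, -100.0)` costs 2.0, whereas `hinge_loss(1, -100.0)` is 101.0, which is the robustness to outliers that the abstract contrasts with the hinge cost.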

Alternating Loss Correction for Preterm-Birth Prediction from EHR Data with Noisy Labels

In this paper we are interested in the prediction of preterm birth based on diagnosis codes from longitudinal EHR. We formulate the prediction problem as a supervised classification with noisy labels. Our base classifier is a Recurrent Neural Network with an attention mechanism. We assume the availability of a data subset with both noisy and clean labels. For the cohort definition, most of the diagnosis codes on mothers' records related to pregnancy are ambiguous for the definition of full-term and preterm classes. On the other hand, diagnosis codes on babies' records provide fine-grained information on prematurity. Due to data de-identification, the links between mothers and babies are not available. We developed a heuristic based on admission and discharge times to match babies to their mothers and hence enrich mothers' records with additional information on delivery status. The obtained additional dataset from the matching heuristic has noisy labels and was used to le...

A Greedy Algorithm for Reconstructing Binary Matrices with Adjacent 1s

This paper deals with the reconstruction of special cases of binary matrices with adjacent 1s. Each element is horizontally adjacent to at least one other element. The projections are the numbers of elements in each row and column. We give a greedy polynomial-time algorithm to reconstruct such matrices when satisfying only the vertical projection. We also show that the reconstruction is NP-complete when fixing the number of sequences of length two and three per row and column.

Genetic Approach for Arabic Part of Speech Tagging

With the growing number of textual resources available, the ability to understand them becomes critical. An essential first step in understanding these sources is the ability to identify the parts of speech in each sentence. Arabic is a morphologically rich language, which presents a challenge for part-of-speech tagging. In this paper, our goal is to propose, improve, and implement a part-of-speech tagger based on a genetic algorithm. The accuracy obtained with this method is comparable to that of other probabilistic approaches.

Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric

PLOS ONE

We introduce here some general results that can be used for many categories of classifier. Let X be a metric space with a measure µ(x), F = {f : X → R} the set of real functions, or real classifiers, on X, let G : F → R be a functional on F, and let S ⊂ F be a set. Consider the following optimization problem

Programmation discrète pour la reconstruction de matrices convexes

The classical problem of reconstructing convex matrices from orthogonal projections is defined as follows: given two vectors H and V, we seek to reconstruct a binary matrix consistent with these projections [4]. The projection of a line (row or column) gives the number of 1s in that line. We are interested in the problem of reconstructing convex binary matrices from orthogonal projections. This problem consists of finding a convex binary matrix that satisfies the horizontal and vertical projections. A matrix is hv-convex, or simply convex, if the 1s of each line form a single block. Several approaches have been proposed to solve this NP-complete problem [3, 2, 1, 5]. In this work we present a method based on 0-1 mathematical programming.
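
The two definitions used in this abstract, orthogonal projections and hv-convexity, can be made concrete with a short sketch. It only verifies a candidate matrix against given projections H and V; it does not implement the 0-1 programming method of the paper. The function names are hypothetical, and treating a line with no 1s as convex is an assumption.

```python
def projections(matrix):
    """Return (H, V): the number of 1s in each row and in each column."""
    horizontal = [sum(row) for row in matrix]
    vertical = [sum(col) for col in zip(*matrix)]
    return horizontal, vertical


def is_line_convex(line):
    """True if the 1s of the line form one contiguous block (or there are none)."""
    ones = [i for i, value in enumerate(line) if value == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def is_hv_convex(matrix):
    """True if every row and every column is convex."""
    rows_ok = all(is_line_convex(row) for row in matrix)
    cols_ok = all(is_line_convex(col) for col in zip(*matrix))
    return rows_ok and cols_ok


def is_solution(matrix, H, V):
    """True if the matrix is hv-convex and matches both projections."""
    return is_hv_convex(matrix) and projections(matrix) == (list(H), list(V))
```

Checking a candidate is cheap, as the sketch shows; the difficulty of the problem lies in finding a matrix that passes `is_solution` for arbitrary H and V.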
