Fethi Jarray - Academia.edu
Papers by Fethi Jarray
Due to the stochastic nature and complexity of flow, as well as the existence of hydrological uncertainties, predicting streamflow in dam reservoirs, especially in semi-arid and arid areas, is essential for the optimal and timely use of surface water resources. In this research, daily streamflow to the Ermenek hydroelectric dam reservoir in Turkey is simulated using deep recurrent neural network (RNN) architectures, including bidirectional long short-term memory (Bi-LSTM), gated recurrent unit (GRU), long short-term memory (LSTM), and simple recurrent neural networks (simple RNN). For this purpose, daily observed flow data for the period 2012-2018 are used, and all models are coded in Python. Only lagged values of the streamflow time series are used as model inputs. Then, based on the correlation coefficient (CC), mean absolute error (MAE), root mean square error (RMSE), and Nash-Sutcliffe efficiency coefficient (NS), the results of the deep-learning architectures are compared with one another and with an artificial neural network (ANN) with two hidden layers. Results indicate that the deep-learning RNN methods are more accurate than the ANN. Among the deep-learning methods, LSTM has the best accuracy: it simulates streamflow to the dam reservoir with 90% accuracy in the training stage and 87% accuracy in the testing stage, whereas the accuracies of the ANN in the training and testing stages are 86% and 85%, respectively. Considering that the Ermenek Dam is used for hydroelectric energy production, modeling inflow as realistically as possible may lead to an increase in energy production and income by optimizing water management. Hence, improvements of even a few percentage points can be extremely useful. According to the results, deep-learning RNN methods can be used for estimating streamflow to the Ermenek Dam reservoir due to their accuracy.
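The four evaluation criteria named in this abstract (CC, MAE, RMSE, NS) have standard textbook definitions, and a minimal sketch of them may help when comparing simulated and observed flows. The function name `evaluation_metrics` and the sample values are illustrative assumptions, not taken from the paper, and the paper's own implementation may differ.

```python
import math


def evaluation_metrics(observed, simulated):
    """Return CC, MAE, RMSE and NS for two equal-length series.

    Hypothetical helper: uses the standard textbook definitions,
    not code from the paper.
    """
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")

    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    errors = [s - o for o, s in zip(observed, simulated)]

    # Mean absolute error and root mean square error.
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Sums of squared deviations from the means.
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_s = sum((s - mean_s) ** 2 for s in simulated)
    if ss_o == 0 or ss_s == 0:
        raise ValueError("CC and NS are undefined for a constant series")

    # Pearson correlation coefficient.
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    cc = cov / math.sqrt(ss_o * ss_s)

    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    # skill of always predicting the observed mean.
    ns = 1.0 - sum(e * e for e in errors) / ss_o

    return {"CC": cc, "MAE": mae, "RMSE": rmse, "NS": ns}


# Illustrative values only (not data from the study).
metrics = evaluation_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this toy case the simulated series is the observed one shifted by a constant, so CC stays at 1 while NS drops well below 1, which shows why the abstract reports several criteria rather than correlation alone.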
Electronics, Jan 19, 2023
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY
Communications in computer and information science, 2023
Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Le Centre pour la Communication Scientifique Directe - HAL - Université Paris Descartes, 2014
Procedia Computer Science
Text summarization is the creation of a compressed version of a given document that covers the important information in the original. Its aim is to reduce the original text to a shorter text that represents its significant content. There are two approaches to automatic text summarization: extractive and abstractive. Extractive summarization consists of selecting the most significant sentences from the original text. Abstractive summarization consists of composing novel sentences that are coherent with the original text. In this paper, we present an extractive, single-document approach to Arabic text summarization using genetic algorithms.
Proceedings of the 4th International Conference on Pervasive and Embedded Computing and Communication Systems, 2014
This paper describes an optimisation-oriented approach to dynamic reconfiguration of embedded systems with real-time and power consumption constraints. A reconfiguration scenario is assumed to be any run-time operation that adds, removes, or updates OS tasks to adapt the system to its environment under well-defined conditions. The problem is that any reconfiguration can lead the system to an infeasible state in which temporal properties are violated or energy consumption increases significantly. Two methods, integer programming and simulated annealing, are used to address this problem. The methods have been validated using analysis tools to evaluate the whole contribution.
Electronic Notes in Discrete Mathematics, 2005
We consider a generalization of the classical binary matrix reconstruction problem by considering adjacency constraints between the cells: if a given cell has value 1, then all its neighbors have value 0. This problem arises in particular in statistical physics. We consider several definitions of neighborhood, and for each one we give complexity results, necessary and/or sufficient conditions for
Le Centre pour la Communication Scientifique Directe - HAL - Grenoble Ecole de Management, Oct 20, 2022
Procedia Computer Science, 2018
In this paper, we propose a new cost function, step loss, for support vector machine classifiers, based on a finer distinction between the instances. It takes into account the position of the samples relative to the margin. More precisely, we divide the instances into four categories: i) instances correctly classified that lie outside the margin, ii) instances correctly classified that lie within the margin, iii) instances misclassified that lie within the margin, and iv) instances misclassified that lie outside the margin. The step loss assigns a constant cost to each group of instances. It is therefore more general than the hard margin cost, which divides the instances into two categories. It is also more robust to outliers than the soft margin, because the instances of the fourth group have a constant cost, in contrast to the hinge cost, where misclassified instances have a linear cost. It is more accurate than the ramp loss because it draws a hard distinction between instances correctly classified within the margin and instances misclassified within the margin. Theoretically, we prove that the SVM model integrated with the step loss function has the nice property of kernelization.
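The four-category idea in this abstract can be sketched as a piecewise-constant function of the signed margin y·f(x). The abstract does not give the cost values or the treatment of boundary points, so the constants in `costs`, the thresholds at 1, 0 and -1, and the function names below are all assumptions made for illustration; the hinge loss is included only for contrast.

```python
def step_loss(y, score, costs=(0.0, 0.5, 1.5, 2.0)):
    """Piecewise-constant loss over four margin regions.

    y is the label in {-1, +1}; score is the classifier output f(x).
    The cost values and boundary conventions are hypothetical.
    """
    margin = y * score
    if margin >= 1:    # i) correctly classified, outside the margin
        return costs[0]
    if margin >= 0:    # ii) correctly classified, within the margin
        return costs[1]
    if margin > -1:    # iii) misclassified, within the margin
        return costs[2]
    return costs[3]    # iv) misclassified, outside the margin


def hinge_loss(y, score):
    """Standard hinge loss, which grows linearly for misclassified points."""
    return max(0.0, 1.0 - y * score)


# A badly misclassified outlier: hinge cost grows with the distance,
# while the step cost stays at the constant for group iv.
outlier_step = step_loss(-1, 5.0)
outlier_hinge = hinge_loss(-1, 5.0)
```

Under these assumed constants the outlier costs 2.0 under the step loss and 6.0 under the hinge loss, which illustrates the robustness argument made in the abstract.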
In this paper we are interested in the prediction of preterm birth based on diagnosis codes from longitudinal EHR. We formulate the prediction problem as a supervised classification with noisy labels. Our base classifier is a Recurrent Neural Network with an attention mechanism. We assume the availability of a data subset with both noisy and clean labels. For the cohort definition, most of the diagnosis codes on mothers' records related to pregnancy are ambiguous for the definition of full-term and preterm classes. On the other hand, diagnosis codes on babies' records provide fine-grained information on prematurity. Due to data de-identification, the links between mothers and babies are not available. We developed a heuristic based on admission and discharge times to match babies to their mothers and hence enrich mothers' records with additional information on delivery status. The obtained additional dataset from the matching heuristic has noisy labels and was used to le...
This paper deals with the reconstruction of special cases of binary matrices with adjacent 1s, in which each element is horizontally adjacent to at least one other element. The projections are the numbers of elements in each row and column. We give a greedy polynomial-time algorithm to reconstruct such matrices when only the vertical projection must be satisfied. We also show that the reconstruction is NP-complete when the number of sequences of length two and three per row and column is fixed.
With the growing number of textual resources available, the ability to understand them becomes critical. An essential first step in understanding these sources is the ability to identify the parts of speech in each sentence. Arabic is a morphologically rich language, which presents a challenge for part-of-speech tagging. In this paper, our goal is to propose, improve, and implement a part-of-speech tagger based on a genetic algorithm. The accuracy obtained with this method is comparable to that of other probabilistic approaches.
PLOS ONE
We introduce here some general results that can be used for many categories of classifiers. Let X be a metric space with a measure µ(x), let F = {f : X → R} be the set of real-valued functions (real classifiers) on X, let G : F → R be a functional on F, and let S ⊂ F be a set. Consider the following optimization problem
The classical problem of reconstructing convex matrices from orthogonal projections is defined as follows: given two vectors H and V, we seek to reconstruct a binary matrix consistent with these projections [4]. The projection of a line (row or column) gives the number of 1s in that line. We are interested in the problem of reconstructing convex binary matrices from orthogonal projections. This problem consists of finding a convex binary matrix that satisfies the horizontal and vertical projections. A matrix is hv-convex, or simply convex, if the 1s of each line form a single block. Several approaches have been proposed to solve this NP-complete problem [3, 2, 1, 5]. In this work we present a method based on 0-1 mathematical programming.
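The two definitions used in this abstract, orthogonal projections and hv-convexity, can be illustrated with a small checker. This is only a sketch of the definitions, not of the 0-1 programming method the abstract announces; the function names are hypothetical, and treating a line with no 1s as convex is an assumption.

```python
def projections(matrix):
    """Return (H, V): the number of 1s in each row and in each column."""
    h = [sum(row) for row in matrix]
    v = [sum(col) for col in zip(*matrix)]
    return h, v


def _single_block(line):
    """True if the 1s of a row or column are contiguous.

    A line without any 1s is treated as convex (an assumption).
    """
    ones = [i for i, x in enumerate(line) if x == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def is_hv_convex(matrix):
    """True if the 1s of every row and every column form a single block."""
    rows_ok = all(_single_block(row) for row in matrix)
    cols_ok = all(_single_block(col) for col in zip(*matrix))
    return rows_ok and cols_ok


# Illustrative matrix (not from the paper).
example = [
    [0, 1, 1],
    [1, 1, 1],
    [0, 1, 0],
]
H, V = projections(example)
convex = is_hv_convex(example)
```

A candidate solution to the reconstruction problem is a matrix whose `projections` equal the given H and V and for which `is_hv_convex` holds; the difficulty lies in finding such a matrix, not in verifying it.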