Mohamed Biniz | Chouaib Doukkali University

Papers by Mohamed Biniz

Contribution to predicte of PIC50 using algorithms of Deep Learning

DataSet for Arabic Classification

The dataset is a collection of Arabic texts covering the modern Arabic language used in newspaper articles. The text contains alphabetic, numeric and symbolic words. The presence of numeric and symbolic words in this dataset can be used to test the efficiency and robustness of many Arabic text classification and document indexing methods. The dataset consists of 111,728 documents (cf. Table 1) and 319,254,124 words (cf. Table 2) structured in text files, collected from 3 Arabic online newspapers: Assabah [9], Hespress [10] and Akhbarona [11], using a semi-automatic web crawling process. The documents in the dataset are categorized into 5 classes: sport, politic, culture, economy and diverse. The number of documents and words varies from one class to another (cf. Tables 1-2).

Contribution to Arabic Text Classification Using Machine Learning Techniques

Lecture Notes in Business Information Processing, 2021

With the increase of text stored in electronic format, it is no longer possible for humans to understand all the incoming data or even categorize it. An automatic text classification system is needed to classify texts into predefined classes and retrieve information quickly. Text classification can be achieved by machine learning; it requires a set of approaches for vectorization and classification. In the vectorization phase, this work proposes two approaches (BOW and TF-IDF); in the classification phase, the machine learning algorithms used are RL, SVM and ANN. At the end, a comparison study is given.
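The two vectorization approaches named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `bag_of_words` and `tf_idf` are hypothetical, documents are assumed to be non-empty lists of tokens, and the weighting used here (term count divided by document length, times the natural log of N over document frequency) is only one common TF-IDF variant.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each count by tf = count / len(doc) and idf = log(N / df).

    Assumes every document is non-empty (illustrative simplification).
    """
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    weights = []
    for doc, row in zip(docs, counts):
        weights.append(
            [(c / len(doc)) * math.log(n_docs / df[j]) for j, c in enumerate(row)]
        )
    return vocab, weights


if __name__ == "__main__":
    toy_docs = [["goal", "match", "goal"], ["match", "budget"]]
    print(bag_of_words(toy_docs))
    print(tf_idf(toy_docs))
```

A term that occurs in every document (here "match") gets weight zero under this variant, which is the property that distinguishes TF-IDF from raw counts. Either kind of vector could then be passed to a classifier.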

Amazigh part-of-speech tagging with machine learning and deep learning

Indonesian Journal of Electrical Engineering and Computer Science, Dec 1, 2021

Natural language processing (NLP) is a part of artificial intelligence that analyzes, understands, and transforms natural languages with computers in written and spoken settings. Part-of-speech (POS) tagging marks each word according to its grammatical role in the sentence. We find in the literature that POS tagging has been applied to a few languages, in particular French and English. This paper investigates attention-based long short-term memory (LSTM) networks and a simple recurrent neural network (RNN) for Tifinagh POS tagging, compared with conditional random fields (CRF) and a decision tree. The attractiveness of LSTM networks is their strength in modeling long-distance dependencies. The experimental results show that LSTM networks perform better than the RNN, CRF and decision tree, which have comparable performance.

An ontology alignment hybrid method based on decision rules

The International Arab Journal of Information Technology, 2019

In this paper, we propose a hybrid approach based on the extraction of decision rules to refine the alignment results obtained from three alignment strategies. The approach has two phases: a training phase, which uses structural similarity, element similarity, instance-based similarity and the C4.5 algorithm to extract decision rules, and an evaluation phase, which refines the alignment discovered by the three alignment strategies using the extracted decision rules. The approach is compared with the best systems according to the OAEI 2016 benchmark: Framework for Ontology Alignment and Mapping (FOAM), A Dynamic Multistrategy Ontology Alignment Framework (RIMOM), AgreementMakerLight and Yet Another Matcher-Biomedical Ontologies (YAM-BIO). The proposed method gives good results (good recall, precision and F-measure). Experimental results show that the proposed approach is effective.
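The precision, recall and F-measure mentioned in this abstract are the standard measures for comparing a discovered alignment with a reference alignment. The sketch below shows how they are usually computed; the function name `evaluate_alignment` and the representation of an alignment as a set of (source entity, target entity) pairs are assumptions made for illustration, not details taken from the paper.

```python
def evaluate_alignment(found, reference):
    """Return (precision, recall, f_measure) for two sets of correspondences.

    Each correspondence is a (source_entity, target_entity) pair.
    """
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    found = {("Person", "Human"), ("Paper", "Article"), ("Author", "Editor")}
    reference = {("Person", "Human"), ("Paper", "Article"),
                 ("Author", "Writer"), ("Venue", "Conference")}
    print(evaluate_alignment(found, reference))
```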

Unveiling the Performance Insights: Benchmarking Anomaly-Based Intrusion Detection Systems Using Decision Tree Family Algorithms on the CICIDS2017 Dataset

Lecture Notes in Business Information Processing, 2023

Classification of Sentiment Using Optimized Hybrid Deep Learning Model

Computing and Informatics

Realization of an intelligent evaluation system

International Journal of Informatics and Communication Technology (IJ-ICT)

A number of benefits have been reported for computer-based assessments over traditional paper-based exams, in terms of IT support for question development, reduced distribution and test administration costs, and possible automated support for ranking. However, existing computerized assessment systems do not support all kinds of questions, namely open questions that require written solutions. To overcome the limitations of existing systems, the objective of this work is to build an intelligent evaluation system (IES) that responds to the problems identified and adapts to the different types of questions, especially open-ended questions whose answers require sentence writing or programming.

An improved approach to Arabic news classification based on hyperparameter tuning of machine learning algorithms

Journal of Engineering Research

Correcting optical character recognition result via a novel approach

International Journal of Informatics and Communication Technology, Apr 1, 2022

Optical character recognition (OCR) is a recognition system used to recognize the content of a scanned image. This system gives erroneous results, which necessitates post-processing for sentence correction. In this paper, we propose a new method for syntactic and semantic correction of sentences, based on the frequency of two correct words in the sentence and a recursive technique. The approach starts by calculating the frequency of each pair of successive words in the corpora; the words with the greatest frequency build a correction center. We found 98% using our approach when we used the noisy channel. Further, we obtained 96% using the same corpus in the same conditions.
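The first step described in this abstract, counting how often each pair of successive words occurs in a corpus, can be sketched as follows. This is only an illustration of bigram frequency counting under simple assumptions (sentences are already tokenized; the helper names `bigram_counts` and `best_successor` are hypothetical). The correction center and the recursive technique of the paper are not reproduced here.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count every pair of successive words across tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(zip(sentence, sentence[1:]))
    return counts


def best_successor(prev_word, candidates, counts):
    """Pick the candidate that most often follows prev_word in the corpus.

    On ties, the first candidate in the list wins.
    """
    return max(candidates, key=lambda word: counts[(prev_word, word)])


if __name__ == "__main__":
    corpus = [
        ["the", "cat", "sat"],
        ["the", "cat", "ran"],
        ["the", "car", "stopped"],
    ]
    counts = bigram_counts(corpus)
    # An OCR engine unsure between "car" and "cat" after "the"
    # could prefer the more frequent pair.
    print(best_successor("the", ["car", "cat"], counts))
```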

Optimization of Aircraft Operations on Airport Surface

2020 IEEE 6th International Conference on Optimization and Applications (ICOA)

This study presents the development of an airport surface simulator that uses the output data of a first come first served (FCFS) planner to simulate the movement of aircraft on the surface. Different scenarios of conflict situations in airport surface traffic are analyzed and classified. A conflict detection and resolution algorithm is implemented to preserve the recommended separation distance. The simulator has been tested with a deployment case at Mohammed V International Airport involving 70 aircraft. In the absence of conflict detection and resolution, various conflict situations are identified. Once the conflict detection and resolution algorithm is used to manage the traffic, three prioritization strategies are implemented and the number of delayed aircraft and overall delays are compared. From the results of this prioritization based on time or distance remaining, what is more interesting is the choice of the detection algorithm rather than the minimization of the delay for each given situation.

Part-of-Speech Tagging Using Conditional Random Fields and Decision Tree: Amazigh Text Written in Tifinaghe Characters

Advanced Intelligent Systems for Sustainable Development (AI2SD’2020), 2022

A new Approach of Documents Indexing Using subject modelling and Summarization

Journal of Physics: Conference Series, 2021

Document indexing is a field of research in Natural Language Processing (NLP) that has been rapidly evolving for 70 years. It is an operation that focuses on the synthetic representation of a document according to a model in order to facilitate its subsequent use. This work is concerned with document indexing and tries to accelerate the indexing process for large document datasets. Two points are addressed. The first concerns the development of a document indexing system whose operating process is based on three phases, namely pre-processing, weighting, and subject modelling. The second concerns the proposal of a new system that integrates a newly developed automatic summarization subsystem; the goal is to minimize indexing time.

Ontology Matching using BabelNet Dictionary and Word Sense Disambiguation Algorithms

Indonesian Journal of Electrical Engineering and Computer Science, 2017

Ontology matching is a discipline that means two things: first, the process of discovering correspondences between two different ontologies, and second, the result of this process, that is to say the expression of correspondences. This discipline is crucial for solving problems of merging and evolving heterogeneous ontologies in Semantic Web applications. This domain poses several challenges, among them the selection of appropriate similarity measures to discover the correspondences. In this article, we study algorithms that calculate semantic similarity using the Adapted Lesk algorithm, the Wu & Palmer algorithm, the Resnik algorithm, the Leacock and Chodorow algorithm, and similarity flooding between two ontologies, with BabelNet as the reference ontology; we implement them and compare them experimentally. Overall, the most effective methods are Wu & Palmer and Adapted Lesk, which is widely used for Word Sense Disambiguation (WSD) in the field of Automatic Natural ...
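Of the measures listed in this abstract, Wu & Palmer similarity is the easiest to show in a few lines: it scores two concepts as 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is their deepest common ancestor. The sketch below applies that formula to a tiny hand-made taxonomy. The `PARENT` table and the helper names are hypothetical, the root is given depth 1, and BabelNet is not used here.

```python
# Toy taxonomy (an assumption for illustration): child -> parent.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(node):
    """Return the chain of nodes from `node` up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def least_common_subsumer(a, b):
    """Deepest ancestor shared by a and b (a node is its own ancestor)."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("concepts do not share a root")


def wu_palmer(a, b):
    """Wu & Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = least_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    print(wu_palmer("dog", "cat"))
    print(wu_palmer("dog", "car"))
```

Concepts that share a deep ancestor ("dog" and "cat" under "animal") score higher than concepts that only meet at the root ("dog" and "car").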

A comparative Plagiarism Detection System methods between sentences

Journal of Physics: Conference Series, 2021

Since the advent of the World Wide Web, information is easily accessible with a single click. But this progress has drawbacks despite the ease of access to information. Plagiarism is a growing challenge to society, which impacts the academic world, and researchers and students in particular. This work discusses the plagiarism process, types, and detection methodologies. It presents the different plagiarism detection techniques based on syntactic and semantic approaches. The result of this work is a comparative survey of plagiarism detection system methods that identify syntactic and semantic similarities based on a sentence-to-sentence comparison rather than the word-to-word comparison of classical systems, because the similarity between sentences is a complex phenomenon.
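A sentence-to-sentence comparison of the kind surveyed here can be illustrated with a purely lexical baseline: Jaccard overlap between the word sets of two sentences. This is a minimal sketch and not one of the surveyed systems. The function names and the 0.5 threshold are arbitrary assumptions, and semantic similarity (synonyms, paraphrase) is not captured.

```python
import re


def tokens(sentence):
    """Lowercased set of word tokens in a sentence."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(sentence_a, sentence_b):
    """Jaccard similarity of the two sentences' word sets (0.0 if both empty)."""
    ta, tb = tokens(sentence_a), tokens(sentence_b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def flag_similar(suspect_sentences, source_sentences, threshold=0.5):
    """Return (suspect, source, score) triples whose score meets the threshold."""
    flagged = []
    for suspect in suspect_sentences:
        for source in source_sentences:
            score = jaccard(suspect, source)
            if score >= threshold:
                flagged.append((suspect, source, score))
    return flagged


if __name__ == "__main__":
    suspect = ["Plagiarism is a growing challenge to society."]
    source = ["Plagiarism is a growing challenge.", "Indexing reduces storage space."]
    for triple in flag_similar(suspect, source):
        print(triple)
```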

Arabic Text Classification Using Deep Learning Technics

International Journal of Grid and Distributed Computing, 2018

An Approach of Documents Indexing Using Summarization

Advances in Library and Information Science

Document indexing is an active domain that interests many researchers. Generally, it is used in information retrieval systems. Document indexing encompasses a set of approaches that can be applied to index a document using a corpus. This treatment has several advantages, such as accelerating the search process, finding the pertinent content related to a query, reducing storage space, etc. The use of the entire document in the indexing process affects several parameters, such as indexing time, search time, storage space, etc. The focus of this chapter is to improve all the parameters cited above by proposing a new indexing approach. The goal of the proposed approach is to use summarization to minimize the size of documents without affecting their meaning.
