Omar Jamal | National University of Malaysia

Papers by Omar Jamal

A COMBINATION METHOD OF SYNTACTIC AND SEMANTIC APPROACHES FOR CLASSIFYING EXAMINATION QUESTIONS INTO BLOOM'S TAXONOMY COGNITIVE

Omar Jamal Mohamed, 2019

Bloom's taxonomy has been proposed for categorizing examination questions according to the student's cognitive ability. Recently, researchers have tended to use machine learning techniques to classify such questions. However, a notable limitation remains: the ambiguity inherent in the question text. Because questions are short, it is difficult to identify the contextual information of their words, so a single word can carry multiple meanings. This significantly affects classification, especially for verbs that commonly appear in questions, such as 'define' or 'write'. The ambiguity of such verbs can mislead the classification process with respect to Bloom's cognitive levels. Therefore, this study proposes a method that combines semantic and syntactic approaches to overcome this drawback. The semantic approach uses an external knowledge source to retrieve semantic correspondences, whereas the syntactic approach determines the syntactic tag of each term in order to identify the significant verbs and nouns. Finally, three machine learning classifiers are used to classify the questions: Support Vector Machine, J48 and Naïve Bayes. To assess the effectiveness of the proposed combination method, the classifiers were applied both with and without it. The results show that the classifiers using the combination method outperformed the traditional ones, which indicates the value of the proposed semantic and syntactic approaches.
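The combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `POS_LEXICON` and `SYNONYMS` tables, the `/SEM` feature suffix, and the function name `extract_features` are all hypothetical stand-ins for a real part-of-speech tagger and an external knowledge source, neither of which the abstract names.

```python
# Toy stand-ins (assumptions): a real system would use a POS tagger and an
# external knowledge source instead of these hand-written tables.
POS_LEXICON = {
    "define": "VERB", "write": "VERB", "explain": "VERB",
    "term": "NOUN", "program": "NOUN", "algorithm": "NOUN", "recursion": "NOUN",
}
SYNONYMS = {
    "define": ["state", "specify"],
    "write": ["compose", "produce"],
}


def extract_features(question):
    """Return tagged verbs/nouns plus semantic correspondences for the verbs."""
    features = []
    for raw in question.lower().split():
        token = raw.strip(".,?!;:'\"")
        tag = POS_LEXICON.get(token)
        if tag is None:
            continue  # syntactic step: keep only the significant verbs and nouns
        features.append(f"{token}/{tag}")
        if tag == "VERB":
            # semantic step: add related words for the verb as extra features
            features.extend(f"{syn}/SEM" for syn in SYNONYMS.get(token, []))
    return features


print(extract_features("Define the term recursion."))
```

The resulting feature list would then be vectorized and passed to a classifier such as a Support Vector Machine, J48 or Naïve Bayes; that step is omitted here.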

WORD SENSE DISAMBIGUATION BASED ON YAROWSKY APPROACH IN ENGLISH QURANIC INFORMATION RETRIEVAL SYSTEM

Word sense disambiguation (WSD) is the process of eliminating the ambiguity of a word by identifying its exact sense. In natural languages, many words can carry multiple meanings depending on the context, and WSD aims to identify the most accurate sense in such cases. In particular, ambiguity may arise among translated words when translating from one language to another. The Quran, the holy book of approximately a billion Muslims, was originally written in Arabic. When it is translated into English, researchers have observed several semantic issues. These issues stem from the ambiguity of expressions such as 'ليلا ونهارا' and 'يوم الحساب', which are translated as 'day and night' and 'judgment day'. Such ambiguity has to be eliminated by determining the exact sense of the translated word. Several research efforts have sought to disambiguate the senses of the translated Quran. However, identifying an appropriate WSD method for the translated Quran is still a challenging task, owing to the complexity of Arabic morphology. Hence, this study proposes an adaptation of the Yarowsky algorithm as a WSD method for Quranic translation. In addition, it develops an IR prototype based on the proposed adaptation in order to evaluate the method in terms of retrieval effectiveness. The dataset used in this study is a collection of Quranic content. Several pre-processing tasks were performed to eliminate irrelevant data such as stop-words, numbers and punctuation. Subsequently, two lists of senses, together with their contexts, are created for each ambiguous word so that the Yarowsky algorithm can train on this example set. After that, the Yarowsky algorithm constructs a decision list, which gives the sense label of each word. The evaluation uses three IR metrics: Precision, Recall and F-measure. The experimental results show an F-measure of 77%. This result appears weak compared with results of the Yarowsky algorithm applied in the open domain, which is due to the lack of examples that could be extracted from the Quran for both senses. Nevertheless, the result appears competitive for WSD of Quranic translation. Finally, it can be concluded that WSD has a significant impact on the IR system by improving retrieval effectiveness.
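The decision-list step and the evaluation metrics can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the study's adaptation: it builds a single decision list from labelled examples and omits the iterative bootstrapping of the full Yarowsky algorithm. The sense labels `"time"` and `"judgment"`, the smoothing constant `alpha`, the toy contexts, and the counts passed to `f_measure` are all invented for the example and are not taken from the study's data.

```python
import math
from collections import Counter


def build_decision_list(examples, sense_a, sense_b, alpha=0.1):
    """Rank context words by a smoothed log-likelihood ratio between two senses.

    examples: list of (context_words, sense) pairs.
    Returns (score, word, sense) rules, strongest evidence first.
    """
    counts = {sense_a: Counter(), sense_b: Counter()}
    for words, sense in examples:
        counts[sense].update(set(words))
    rules = []
    for word in set(counts[sense_a]) | set(counts[sense_b]):
        a = counts[sense_a][word] + alpha  # alpha avoids division by zero
        b = counts[sense_b][word] + alpha
        rules.append((abs(math.log(a / b)), word, sense_a if a > b else sense_b))
    rules.sort(key=lambda rule: (-rule[0], rule[1]))
    return rules


def classify(rules, context_words, default):
    """Apply the first (strongest) rule whose word occurs in the context."""
    words = set(context_words)
    for _score, word, sense in rules:
        if word in words:
            return sense
    return default


def f_measure(tp, fp, fn):
    """Return (precision, recall, F-measure) from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented toy contexts for an ambiguous word such as "day".
examples = [
    (["night", "sun", "rest"], "time"),
    (["night", "light"], "time"),
    (["judgment", "reckoning", "account"], "judgment"),
    (["judgment", "reward"], "judgment"),
]
rules = build_decision_list(examples, "time", "judgment")
print(classify(rules, ["the", "night", "falls"], default="time"))
print(f_measure(tp=8, fp=2, fn=4))
```

Using only the single strongest matching rule, rather than combining all matching evidence, is what distinguishes a decision list from a classifier such as Naïve Bayes.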
