A Comparative Evaluation of POS Tagging and N-Gram Measures in Arabic Corpus Resources and Tools
The purpose of this evaluation is twofold: to give an overview of the extent to which the large-scale Arabic corpus resources examined meet the criteria of part-of-speech tagging in the corpus design of linguistic data, and to evaluate Arabic corpus analysis tools in terms of natural language processing statistics. The confusion matrix method shows that some Arabic monitor corpora need further development, whereas the International Corpus of Arabic scores highly on the confusion matrices. Nine Arabic corpus analysis tools are evaluated, and the attested reliable statistical outcomes are retrieved in terms of the statistical algorithms for association measures. This is done by relying on one million empirically designated clean Arabic data to evaluate the association measures across the nine tools. The results presented at the end of this article indicate that the limitations could be tackled by evaluating the Arabic monitor Corpus...
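As a rough illustration of the confusion matrix method mentioned in the abstract, the sketch below compares a gold-standard tag sequence with a tagger's output and derives accuracy and per-tag precision and recall. It is a minimal sketch, not the procedure used in the article: the tag labels, the toy data, and the function names `confusion_matrix`, `per_tag_scores`, and `accuracy` are all assumptions introduced here for illustration.

```python
from collections import Counter


def confusion_matrix(gold, predicted):
    """Count (gold_tag, predicted_tag) pairs for two aligned tag sequences."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return Counter(zip(gold, predicted))


def accuracy(matrix):
    """Share of tokens whose predicted tag equals the gold tag."""
    total = sum(matrix.values())
    correct = sum(n for (g, p), n in matrix.items() if g == p)
    return correct / total if total else 0.0


def per_tag_scores(matrix):
    """Return {tag: (precision, recall)} computed from the confusion counts."""
    tags = sorted({tag for pair in matrix for tag in pair})
    scores = {}
    for tag in tags:
        true_pos = matrix[(tag, tag)]
        predicted_as_tag = sum(n for (g, p), n in matrix.items() if p == tag)
        gold_as_tag = sum(n for (g, p), n in matrix.items() if g == tag)
        precision = true_pos / predicted_as_tag if predicted_as_tag else 0.0
        recall = true_pos / gold_as_tag if gold_as_tag else 0.0
        scores[tag] = (precision, recall)
    return scores


# Hypothetical toy data: the tag labels are illustrative, not the tagset
# of any corpus evaluated in the article.
gold = ["NOUN", "VERB", "NOUN", "PREP", "NOUN", "VERB"]
pred = ["NOUN", "VERB", "VERB", "PREP", "NOUN", "NOUN"]

matrix = confusion_matrix(gold, pred)
print("accuracy:", round(accuracy(matrix), 3))
for tag, (precision, recall) in per_tag_scores(matrix).items():
    print(tag, "precision:", round(precision, 3), "recall:", round(recall, 3))
```

Off-diagonal cells such as `("NOUN", "VERB")` show which tags are confused with each other, which is the kind of evidence that can point to a corpus needing further development.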