A New Distributional Semantic Model for Classical Arabic

Classical Arabic forms the basis of Arabic linguistic theory and is the language in which the Holy Quran was revealed. To the best of the authors' knowledge, no previous attempt has been made to build a distributional lexical semantic model for Classical Arabic. In this paper, we present a new association measure, the Refined Dice, for detecting syntagmatic relations between words in a very large corpus of Classical Arabic. We also present an experimental study that evaluates the performance of the proposed measure. The measure showed outstanding results in identifying collocations and significant co-occurrences in this corpus.
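
For orientation, the sketch below scores adjacent word pairs with the classical Dice coefficient, Dice(x, y) = 2 f(x, y) / (f(x) + f(y)), where f denotes a corpus frequency. This is only the standard baseline that the name "Refined Dice" suggests; the refinement itself is not defined in this abstract and is not reproduced here. The function name `dice_scores`, the restriction to adjacent pairs, and the toy token list are all assumptions made for illustration.

```python
from collections import Counter


def dice_scores(tokens):
    """Score each adjacent word pair with the classical Dice coefficient.

    Assumption: co-occurrence means direct adjacency (a bigram). The
    paper's Refined Dice is NOT implemented here; this is the plain
    baseline measure only.
    """
    word_freq = Counter(tokens)                   # f(x): single-word counts
    pair_freq = Counter(zip(tokens, tokens[1:]))  # f(x, y): adjacent-pair counts
    return {
        (x, y): 2 * f_xy / (word_freq[x] + word_freq[y])
        for (x, y), f_xy in pair_freq.items()
    }


# Toy placeholder tokens, not drawn from any real corpus.
toy_tokens = ["a", "b", "a", "b", "c"]
scores = dice_scores(toy_tokens)

# Print pairs from the strongest association to the weakest.
for pair, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(score, 3))
```

A score of 1.0 means the two words never occur apart in the token list, while scores near 0 indicate pairs that co-occur rarely relative to how often each word appears alone.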