Arabic information retrieval perspectives

Development of Arabic evaluations in information retrieval

International Journal of ADVANCED AND APPLIED SCIENCES

The field of information retrieval has grown noticeably over the past decades in response to the expanding use of the internet and users' pressing need to search large amounts of digital information. Given the steady growth of Arabic e-content, effective information retrieval systems must be designed to suit the nature and needs of the Arabic language. This paper sheds light on current developments in the field of Arabic information retrieval, identifies the challenges that delay progress in the field, and proposes recommendations for further research. It uses the descriptive analytical method to examine the state of Arabic studies in information retrieval and the difficulties faced in this area. Specifically, the earlier literature on information retrieval is reviewed by searching the relevant databases and websites.

Arabic Information Retrieval Literature Review

Arabic words are typically derived through a robust system of Arabic roots. According to Sakhr Software Company, there are over 10,000 potential roots, but far fewer are used regularly [4]. This root system has large implications for Arabic IR and presents retrieval challenges. This literature review addresses two issues that are consistently raised in Arabic IR: stemming and stopwords. There are still no standardized methods of stemming or stopword elimination, which highlights the infancy of the field of Arabic information retrieval.
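To make the two issues concrete, here is a minimal Python sketch of stopword removal followed by affix-stripping ("light") stemming. It is an illustration only: the stopword set, the prefix and suffix lists, and the function names `light_stem` and `index_terms` are hypothetical samples chosen for this example, not a standard resource or the method of any paper listed here.

```python
# Illustrative only: these lists are tiny hypothetical samples,
# not a standardized Arabic stopword list or stemmer definition.
STOPWORDS = {"في", "من", "على", "إلى", "عن"}
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "و")  # longest candidates first
SUFFIXES = ("ات", "ون", "ين", "ها", "ة")


def light_stem(word: str) -> str:
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    return word


def index_terms(text: str) -> list[str]:
    """Split on whitespace, drop stopwords, and light-stem what remains."""
    return [light_stem(token) for token in text.split() if token not in STOPWORDS]


if __name__ == "__main__":
    print(index_terms("الكتاب في المكتبة"))
```

Unlike root extraction, this approach never tries to recover the underlying root; it only trims common affixes, so the choice of affix lists directly changes which word forms are conflated.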

Evaluating Arabic Retrieval from English or French Queries: The TREC-2001 Cross-Language Information Retrieval Track

2002

The Cross-language information retrieval track at the 2001 Text Retrieval Conference (TREC-2001) produced the first large information retrieval test collection for Arabic. The collection contains 383,872 Arabic news stories, 25 topic descriptions in Arabic, English and French from which queries can be formed, and manual (ground truth) relevance judgments for a useful subset of the topic-document combinations. This paper describes the way in which the collection was created, explains the evaluation measures that the collection is designed to support, and provides an overview of the results from the first set of experiments with the collection. The results make it possible to draw some inferences regarding the utility of the collection for post hoc evaluations. Character n-grams. As with other languages, overlapping character n ...
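Overlapping character n-grams are a language-independent alternative to stemming as an indexing unit. The sketch below shows how they can be generated; the function name `char_ngrams` and the default `n = 3` are assumptions made for illustration, not settings reported for the TREC-2001 experiments.

```python
def char_ngrams(word: str, n: int = 3) -> list[str]:
    """Return the overlapping character n-grams of a word.

    Words shorter than n are returned whole so that they stay indexable.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(word) < n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


if __name__ == "__main__":
    print(char_ngrams("مكتبة", 3))
```

Two word forms that share a stem will share most of their n-grams, which is why this unit can match morphological variants without any language-specific rules.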

Arabic Studies’ Progress in Information Retrieval

International Journal of Advanced Computer Science and Applications, 2016

The field of information retrieval has witnessed tangible progress over the past decades in response to the expanded usage of the internet and the dire need of users to search for massive amounts of digital information. Given the steady increase of Arabic e-content, excellent information retrieval systems must be devised to suit the nature and requirements of the Arabic language. This paper sheds light on the current progress in the field of Arabic information retrieval, identifies the challenges that hinder the progress of this science, and proposes suggestions for further research. This paper uses the descriptive analytical method to examine the reality of Arabic studies in the field of information retrieval and to study the problems that are being faced in this area. Specifically, the previous literature on information retrieval is reviewed by searching the related databases and websites.

Review on Recent Arabic Information Retrieval Techniques

EAI Endorsed Transactions on Internet of Things

Information retrieval is an important field that aims to provide documents relevant to a user's information need, expressed through a query. Arabic is a challenging language that has recently gained much attention in the information retrieval domain. To overcome the problems related to its complexity, many studies and techniques have been presented, most of them addressing the stemming problem. This paper presents an overview of the Arabic information retrieval process, including various text processing techniques, ranking approaches, evaluation measures, and some important information retrieval models. The paper finally presents some recent related studies and approaches in different Arabic information retrieval fields.

Integration of Arabic to a Cross-Lingual Retrieval Tool: Challenges and Perspectives

The aim of this paper is to summarize briefly the challenges that Arabic poses for Cross-Lingual Information Retrieval, to show the potential of MIMOR [1], a retrieval system that has proved successful for cross-lingual retrieval tasks, and to propose string matching techniques for feature unification instead of stemming techniques.

Arabic Natural Language Processing for Information Retrieval

2004

Human Language Technology has played a big role in implementing information retrieval systems (IRS) for Latin-based languages. Two of the most cited techniques are stemming and truncation. Numerous studies have shown that the inflectional structure of words has a big impact on the retrieval accuracy of these systems. Stemming or truncation is done for two principal reasons: the reduction in index storage required and the increase in performance due to the use of word variants. Several stemming algorithms have been proposed, such as Porter's for English.

Arabic Cross-Language Information Retrieval

ACM Transactions on Asian and Low-Resource Language Information Processing, 2016

Cross-language information retrieval (CLIR) deals with retrieving relevant documents in one language using queries expressed in another language. As CLIR tools rely on translation techniques, they are challenged by the properties of highly derivational and inflectional languages like Arabic. Much work has been done on CLIR for different languages including Arabic. In this article, we introduce the reader to the motivations for solving some problems related to Arabic CLIR approaches. The evaluation of these approaches is discussed starting from the 2001 and 2002 TREC Arabic CLIR tracks, which aim to objectively evaluate CLIR systems. We also study many other research works to highlight the unresolved problems or those that require further investigation. These works are discussed in the light of a deep study of the specificities and the tasks of Arabic information retrieval (IR). Particular attention is given to translation techniques and CLIR resources, which are key issues challenging ...

Performance and Effectiveness Examination of the IQE and AQE with Application on Arabic Content

A literature search shows that information retrieval (IR) systems for Arabic are few compared with those for English. Additionally, IR systems face many problems when used with the Arabic language, including complexity and ambiguity. The performance and effectiveness of interactive query expansion (IQE) and automatic query expansion (AQE) are key to improved IR systems. The performance and effectiveness of IQE compared with AQE have been examined via a series of search experiments on Arabic content. Compared with no query expansion, the experimental results showed that AQE provides enhanced performance and effectiveness, with 54% query improvement and average precision of 42.1. However, the results revealed that IQE provides higher performance and effectiveness than AQE, with 84% query improvement and average precision of 43.4.
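The abstract does not say how its average precision figures were computed. As a point of reference, the following Python sketch implements non-interpolated average precision for a single query, one standard definition of the measure; the function name `average_precision` and the document identifiers are hypothetical.

```python
from typing import Iterable, Sequence


def average_precision(ranked_ids: Sequence[str], relevant_ids: Iterable[str]) -> float:
    """Non-interpolated average precision for one query.

    Precision is sampled at the rank of each relevant document retrieved,
    then divided by the total number of relevant documents, so relevant
    documents that are never retrieved contribute zero.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


if __name__ == "__main__":
    # Relevant documents at ranks 1 and 3: (1/1 + 2/3) / 2
    print(average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"}))
```

Averaging this value over all queries in an experiment gives mean average precision, which is one way a with-expansion run and a no-expansion run could be compared on the same query set.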

Arabic Information Retrieval: A Relevancy Assessment Survey

2016

The paper presents research in Arabic Information Retrieval (IR). It surveys the impact of statistical and morphological analysis of Arabic text on improving Arabic IR relevancy. We investigated the contributions of Stemming, Indexing, Query Expansion, Text Summarization (TS), Text Translation, and Named Entity Recognition (NER) to enhancing the relevancy of Arabic IR. Our survey emphasizes the quantitative relevancy measurements provided in the surveyed publications. The paper shows that researchers have achieved significant enhancements, especially in building accurate stemmers, with accuracy reaching 97%, and in measuring the impact of different indexing strategies. Query expansion and Text Translation showed a positive effect on relevancy. However, other tasks such as NER and TS still need more research to establish their impact on Arabic IR.