Optimizing Information Retrieval Using Evolutionary Algorithms and Fuzzy Inference System

2009

With the rapid growth in the amount of data available in electronic libraries, over the Internet and over enterprise networks, advanced methods of search and information retrieval are in demand. Information retrieval systems, designed for storing, maintaining and searching large-scale sets of unstructured documents, are the subject of intensive investigation. An information retrieval system, a sophisticated application managing underlying documentary databases, is at the core of every search engine, including Internet search services. There is a clear demand for fine-tuning the performance of information retrieval systems. One step in optimizing the information retrieval experience is the deployment of Genetic Algorithms, a widely used subclass of Evolutionary Algorithms that has proved to be a successful optimization tool in many areas. In this paper, we revise and extend genetic approaches that improve information retrieval by optimizing search queries. As a next step in improving search effectiveness and user-friendliness, interaction with the system will draw on fuzzy concepts. Fuzzy technology allows the user to state flexible, smooth and vague search criteria and to retrieve a rich set of relevance-ranked documents, with the aim of supplying the inquirer with more satisfactory answers.
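The idea of optimizing a search query with a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation: the toy document vectors, the relevance feedback set, precision at k as the fitness function, and all function names (score, fitness, evolve) are assumptions made for illustration only.

```python
import random

# Toy collection: each document is a vector of term weights (hypothetical data).
DOCS = [
    [0.9, 0.1, 0.0, 0.3],
    [0.8, 0.0, 0.2, 0.1],
    [0.1, 0.9, 0.7, 0.0],
    [0.0, 0.8, 0.9, 0.2],
    [0.2, 0.1, 0.1, 0.9],
]
RELEVANT = {0, 1}  # documents the user marked as relevant (assumed feedback)


def score(query, doc):
    """Inner-product match between a weighted query and a document."""
    return sum(q * d for q, d in zip(query, doc))


def fitness(query, k=2):
    """Precision at k: share of the top-k ranked documents that are relevant."""
    ranked = sorted(range(len(DOCS)), key=lambda i: score(query, DOCS[i]), reverse=True)
    return sum(1 for i in ranked[:k] if i in RELEVANT) / k


def evolve(generations=30, pop_size=20, mutation_rate=0.2, seed=0):
    """Evolve query weight vectors; returns the fittest query found."""
    rng = random.Random(seed)
    n_terms = len(DOCS[0])
    population = [[rng.random() for _ in range(n_terms)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_terms)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:  # mutate one randomly chosen weight
                child[rng.randrange(n_terms)] = rng.random()
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


best_query = evolve()
```

Each individual is a vector of query term weights, and a query is considered fitter the more relevant documents it places at the top of the ranking. A real system would evaluate fitness against a much larger collection and would likely use a finer-grained measure than precision at k.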

Introduction to the special issue on advanced information retrieval and databases

Journal of Intelligent Information Systems, 2010

Information retrieval (IR) is the process of searching through a collection of data for information relevant to a user's need. Rapid developments in hardware and software technologies have led to the generation of tremendous amounts of data in various digital forms such as text documents, audio/video files and data warehouses. Searching for relevant information in such collections of data becomes more and more challenging. Besides the obvious expectation of fast performance from IR and database systems in general, there is also the challenge of how to use them optimally, in particular, how to formulate queries or questions in order to get the required answers as fast as possible. Besides the equally obvious expectation that the answers should be of satisfactory quality, there is the fundamental challenge of how quality should actually be defined. With respect to both speed and quality, there is also the issue of how to tune the systems and algorithms to match varying users' preferences, varying queries, and varying data types related to various real-life applications.

Intelligent Search of Full-Text Databases

This project applies expert system technology to the task of searching online collections of documents. We are developing an intelligent search intermediary to help end-users locate relevant passages in large full-text databases. Our expert system will automatically reformulate contextual Boolean queries to improve search results and will present retrieved passages in decreasing order of relevance. It differs from other intelligent database functions in two ways: it works with semantically unprocessed text, and the expert system contains a knowledge base of search strategies independent of any particular content domain. The goals for our current project are to demonstrate the feasibility of the approach and to evaluate the effectiveness of the system through a controlled experiment. While the work we report here has limited objectives, the system and techniques are general and can be extended to large, real-world databases.

A new type of information retrieval system

1976

In the period 1964-1968, Peter G. Ossorio [10, 11, 12] developed and tested, on a pilot study basis, a new approach to the problem of automatic document retrieval. Ossorio's studies were entirely successful, as pilot studies, and show the feasibility of using his approach to produce a new kind of retrieval system. These retrieval systems do not operate by word matching. The basic approach is to simulate the judgement of competent human judges of the conceptual content of each document, and the request. This judgement is then used to retrieve those documents with conceptual content most similar to that of the request. Each document is processed only at the time it is added to the data base, in time linear in the number of words in the document that the system recognizes. The retrieval request is in ordinary English. Time for retrieval is linear in the number of documents on file. Documents are retrieved in order of similarity of conceptual content to that of the request. The system works, in certain respects, better on full text documents, providing better descriptions of document content, and more detailed cross-indexing. The new type of system shows a number of interesting features. Among these are: (1) Much better performance than systems using the old techniques; (2) Faithful representation of the judgement of the person(s) whose judgement is being simulated, thus providing the possibility of individualized retrieval systems; (3) Ability to explain to a user why it retrieved certain documents, and not others. With this information, the user can alter his request, or instruct the system to judge things differently; (4) Automatic recognition of requests the system cannot properly handle; (5) Sub-documentary indexing reflecting heterogeneity of material. As is often the case with a new paradigm, Ossorio's work raises at least as many questions as it answers. This paper presents the new approach, and the results of some first explorations in the new field.
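Retrieval in order of similarity of conceptual content can be sketched as follows. The sketch assumes that each document and the request have already been reduced to a vector of ratings against a fixed set of concepts. The abstract does not say which similarity measure is used, so cosine similarity stands in here, and the names and vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two concept-rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(request_vec, index):
    """Score every stored document in one pass, then order by similarity."""
    scored = [(doc_id, cosine(request_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical ratings of three documents against three concepts.
index = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.2, 0.8, 0.1],
    "doc_c": [0.0, 0.2, 0.9],
}
request = [1.0, 0.2, 0.0]  # the request, rated against the same concepts

ranked = retrieve(request, index)
```

Document vectors are computed once, when a document is added, and the scoring step visits each stored document exactly once, which matches the cost profile described in the abstract.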

Fuzzy information retrieval model revisited

Fuzzy Sets and Systems, 2009

A new comprehensive model of information retrieval (IR) based on Zadeh's calculus of linguistic statements is proposed. Its characteristic and novel feature is the capability to take into account both the imprecision and the uncertainty pervading the representation of textual information. It extends earlier IR models based on fuzzy logic in the broad sense. Moreover, some techniques for indexing documents and queries in the framework of this model are proposed. The results of computational experiments on standard document collections are reported.
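For orientation, the classic fuzzy-set retrieval model, one of the earlier approaches of the kind this abstract says it extends, can be sketched in a few lines. This is not the model proposed in the paper: it uses only the standard min, max and complement operators, and the index, membership degrees and query are hypothetical.

```python
def f_and(*degrees):
    """Fuzzy conjunction: minimum of the membership degrees."""
    return min(degrees)


def f_or(*degrees):
    """Fuzzy disjunction: maximum of the membership degrees."""
    return max(degrees)


def f_not(degree):
    """Fuzzy negation: complement of the membership degree."""
    return 1.0 - degree


# Hypothetical index: degree in [0, 1] to which each term describes a document.
INDEX = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.7, "genetic": 0.0},
    "d2": {"fuzzy": 0.4, "retrieval": 0.8, "genetic": 0.6},
    "d3": {"fuzzy": 0.0, "retrieval": 0.3, "genetic": 0.9},
}


def evaluate(doc_id):
    """Degree to which a document matches: fuzzy AND (retrieval OR genetic)."""
    w = INDEX[doc_id]
    return f_and(w["fuzzy"], f_or(w["retrieval"], w["genetic"]))


ranking = sorted(INDEX, key=evaluate, reverse=True)
```

Because every document receives a degree of match rather than a yes or no answer, the result is a ranked list instead of an unordered set.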

Crux and Crushed of the Information Retrieval

The modern world has become highly dependent on the World Wide Web, which now touches every walk of human life. The creation of the global Web has allowed people to share their information and data with each other worldwide. This use and availability of data and information has created trillions of databases. In this complex scenario, searching for a particular piece of information or data accurately is the need of the hour, and it calls for specialist tools. Search engines are one answer, but retrieving meaningful information with them is difficult. To overcome this problem, many modern technologies have been implemented that may retrieve meaningful information intelligently, with Semantic Web technologies playing a major role. In this paper we present the modern techniques used by search engines in intelligent web search.

An intelligent web search framework for performing efficient retrieval of data

Computers and Electrical Engineering, 2016

Numerous search engines are available today to search for and retrieve required information. However, retrieving meaningful and appropriate information that meets the user's requirement is always a challenging task. The foremost aim of any search engine is to provide the information within a short span of time. Since the data available on the World Wide Web is generally heterogeneous and its sources differ from one another, issues pertaining to schema structure and data representation also arise. In such circumstances, to eliminate inconsistencies and to enable seamless integration of multiple data sources while retrieving web data, an efficient web search mechanism that fulfils the customer requirement is always needed. To enable the integration of multiple data sources while performing efficient retrieval of web data, an intelligent web search framework is proposed in this paper.

Performance Enhancement of Information Retrieval via Artificial Intelligence

Big data is an ocean of data, most of it unstructured and growing exponentially, that requires proper strategies and techniques to deal with it. This, however, is not the major concern of data scientists; the major challenges are cost effectiveness, dealing with unstructured data, managing the data effectively, controlling its growth, making raw data useful, and storing it. There has been increasing interest in the application of AI tools to IR in the last few years, in particular in the machine learning paradigm, whose aim is the design of systems able to acquire knowledge automatically by themselves. In light of the need for intelligent processing of big data so as to retrieve information as the business requires, the authors propose a novel architecture. The proposed architecture may help in faster information retrieval with better accuracy and recall.

Intelligent Search on the Internet

2006

The Web has grown from a simple hypertext system for research labs to a ubiquitous information system including virtually all human knowledge, e.g., movies, images, music, documents, etc. Traditional browsing often seems inadequate for locating information that satisfies users' needs. Even search engines, based on the Information Retrieval approach, with their huge indexes show many drawbacks, which force users to sift through long lists of results or reformulate queries several times.

A New Approach of Intelligent Data Retrieval Paradigm

Artificial Intelligence Advances, 2021

What is a real-time agent, how does it remedy ongoing daily frustrations for users, and how does it improve retrieval performance on the World Wide Web? These are the main questions we focus on in this manuscript. In many distributed information retrieval systems, information in agents should be ranked based on a combination of multiple criteria. Linear combination of ranks has been the dominant approach due to its simplicity and effectiveness. In a distributed infrastructure, such a combination scheme requires that the ranks in resources or agents be comparable to each other before they are combined. The main challenge is transforming the raw rank values of different criteria appropriately to make them comparable before any combination. The different ways in which agents are ranked make this strategy difficult. In this research, we demonstrate how to rank Web documents based on resource-provided information and how to combine the ranking schemes of several resources at once. The proposed system was implemented ...
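The linear combination described above, with raw values made comparable before they are combined, can be sketched as follows. Min-max scaling is only one possible transformation and is assumed here for illustration; the criteria names, weights and scores are hypothetical and are not taken from the manuscript.

```python
def min_max(scores):
    """Rescale one criterion's raw scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # all scores equal: the criterion carries no ranking information
        return {doc: 0.0 for doc in scores}
    return {doc: (value - lo) / (hi - lo) for doc, value in scores.items()}


def combine(criteria, weights):
    """Weighted linear combination of the normalized scores of each criterion."""
    normalized = {name: min_max(scores) for name, scores in criteria.items()}
    docs = set().union(*(scores.keys() for scores in criteria.values()))
    return {
        doc: sum(weights[name] * normalized[name].get(doc, 0.0) for name in criteria)
        for doc in docs
    }


# Two hypothetical agents reporting scores on very different scales.
criteria = {
    "text_match": {"a": 12.0, "b": 7.0, "c": 2.0},
    "popularity": {"a": 0.1, "b": 0.9, "c": 0.5},
}
weights = {"text_match": 0.6, "popularity": 0.4}

fused = combine(criteria, weights)
ranking = sorted(fused, key=fused.get, reverse=True)
```

Without the normalization step the text_match scores would dominate the sum simply because they are numerically larger, which is the comparability problem the abstract points to.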