A simple solution for improving the effectiveness of traditional information retrieval systems

Comparison of Semantic and Syntactic Information Retrieval System on the Basis of Precision and Recall

International Journal of Data Engineering (IJDE)

This paper discusses an information retrieval system for local databases. The approach is to search the web both semantically and syntactically. The proposal handles search queries from users who want focused results about a product with specific characteristics. The objective of the work is to find and retrieve accurate information from the available information warehouse, which contains related data sharing common keywords. The system could eventually also be used to access the internet. Accurate information retrieval, that is, achieving both high precision and high recall, is difficult. Semantic and syntactic search engines are therefore compared for information retrieval using two parameters: precision and recall.
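As an illustration of the two parameters named in the abstract above, the following minimal sketch computes precision and recall for a single query using their standard set-based definitions. The function name `precision_recall` and the document identifiers are assumptions made for this example; they do not come from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    # Guard against empty sets so the ratios are always defined.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical result list and relevance judgements for one query.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the three relevant documents were found (recall of about 0.67), which shows why raising one measure often costs the other.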

A SURVEY ON VARIOUS ARCHITECTURES, MODELS AND METHODOLOGIES FOR INFORMATION RETRIEVAL

iaeme

The typical Information Retrieval (IR) model of the search process consists of three essentials: query, documents and search results. A user looking to fulfill an information need has to formulate a query, usually consisting of a small set of keywords summarizing that need. The goal of an IR system is to retrieve documents containing information which might be useful or relevant to the user. Throughout the search process there is a loss of focus, because keyword queries entered by users often do not suitably summarize their complex information needs, and IR systems do not sufficiently interpret the contents of documents, leading to result lists containing irrelevant and redundant information. The short keyword query used as input to the retrieval system can be supplemented with topic categories from structured Web resources. The topic categories can be used as query context to retrieve documents that are not only relevant to the query but also belong to a relevant topic category. Category information is especially useful for the task of entity ranking, where the user is searching for a certain type of entity such as companies or persons. Category information can help to improve the search results by promoting pages that belong to relevant topic categories, or to categories similar to the relevant ones, in the ranking. Users may formulate various queries to describe the same information need. For example, to search for the National Board of Accreditation, the queries “National Board of Accreditation (NBA)” or “NB Accreditation” may be formulated. Directly using individual queries to describe context cannot capture contexts concisely and accurately. Queries may also arise in which “NBA” can be expanded as either “National Basketball Association” or “National Board of Accreditation”. Hence it becomes extremely important to use context-based queries that draw on the user's history and present requirements in that context.
In this paper, an extensive survey has been made of the different architectures, models and methodologies that have been used in IR by various researchers, along with a comparison of results against various performance metrics, also highlighting the need for context-based queries.
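The “NBA” ambiguity described above can be sketched as follows. This is only an illustrative toy, not a method from the survey: the `EXPANSIONS` table, its cue words and the function `expand_query` are all hypothetical, and user history is modelled simply as a list of earlier query strings.

```python
# Hypothetical expansion table: each candidate expansion of an abbreviation
# is paired with cue words that suggest it (an assumption for this sketch).
EXPANSIONS = {
    "NBA": {
        "National Basketball Association": {"basketball", "playoffs", "team", "league"},
        "National Board of Accreditation": {"accreditation", "engineering", "college", "institute"},
    }
}


def expand_query(query, history):
    """Expand ambiguous abbreviations using words from earlier queries."""
    history_terms = {word.lower() for past in history for word in past.split()}
    expanded = []
    for token in query.split():
        candidates = EXPANSIONS.get(token.upper())
        if not candidates:
            expanded.append(token)
            continue
        # Choose the expansion whose cue words overlap most with the history.
        best = max(candidates, key=lambda name: len(candidates[name] & history_terms))
        if candidates[best] & history_terms:
            expanded.append(best)
        else:
            # No contextual evidence: leave the abbreviation untouched.
            expanded.append(token)
    return " ".join(expanded)


history = ["engineering college admission", "accreditation status"]
rewritten = expand_query("NBA ranking", history)
```

With this history the query is rewritten to "National Board of Accreditation ranking"; with a history about basketball the other expansion would be chosen, and with no history the abbreviation is kept as typed.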

Model for semantic processing in information retrieval systems

Processing information with semantic annotation makes it possible to identify the users' search intent and to adjust the results to the context of the information. This research proposes a model for retrieving information with semantic annotation that helps the user recover the most relevant information among everything available on the web. The model comprises three components (Trace-Indexing, Processing and Presentation) that identify the user's information need through the processing, selection and subsequent publication of the retrieved information. The crawling and indexing (Trace-Indexing) component identifies available web sites, extracts information from them and performs semantic annotation by applying different information processing techniques. The Processing component analyzes the user's preferences and processes the submitted query to calculate its similarity to the indexed information. The results are then sorted by relevance so that the Presentation component shows an amount of information that users can assimilate. To validate the proposal, the metrics of precision and completeness were used to demonstrate the quality and relevance of information retrieval with semantic annotation.
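The abstract above does not say how similarity is calculated, so the following sketch substitutes a common baseline: cosine similarity over raw term counts, followed by sorting and truncating the result list. The functions `cosine` and `rank`, the `top_k` parameter and the sample documents are assumptions for illustration and are not the model's actual Processing or Presentation components.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs, top_k=3):
    """Score every document against the query and keep the best top_k."""
    q = Counter(query.lower().split())
    scored = [
        (cosine(q, Counter(text.lower().split())), doc_id)
        for doc_id, text in docs.items()
    ]
    # Highest score first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


# Hypothetical indexed documents.
docs = {
    "a": "semantic annotation of web pages",
    "b": "cooking recipes for pasta",
    "c": "semantic web search and semantic annotation",
}
ranking = rank("semantic annotation", docs)
```

Limiting the output to `top_k` items mirrors the idea of presenting only as much information as a user can assimilate; a semantic system would replace the raw term counts with annotated concepts.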

A survey in traditional information retrieval models

2008 2nd IEEE International Conference on Digital Ecosystems and Technologies, 2008

As a matter of fact, many so-called semantic search algorithms are derived from traditional index-term-based search models. In this paper, we survey the traditional information retrieval models by categorizing them into three main classes and eleven subclasses, and analyse their benefits and issues.

Bridging the Gap: From Traditional Information Retrieval to the Semantic Web

AMCIS 2002 Proceedings, 2002

Web is the nature of information search. The Semantic Web vision reveals a radical departure from the traditional theories of Information Retrieval (IR) upon which current search engine technology is built. Semantic Web researchers are very articulate about how the pillars of the Semantic Web (semantically aware, intelligent agents, ontologies, and markup languages) will revolutionize the way that we interact with information on the web. They are less articulate about how we will get there from here. While it's true that the traditional assumptions of IR (small, static, homogeneous, centrally located, monolingual document collections) don't hold for the Web, it is still important to note the success of search engines built on IR theory. This paper calls attention to the gap between traditional IR and the more visionary Semantic Web research. We describe a preliminary roadmap bridging the two areas, focusing on the concrete contributions and also calling attention to the weak points of both fields.

A Novel Approach for Information Retrieval on the Web

The huge and diverse volume of information available over networks makes it difficult for users to find exactly the information they need. Earlier information retrieval systems mostly use keywords to index and retrieve documents, which may return inaccurate results when the document and the user's query express the same concept with different keywords. To retrieve accurate information from the Web intelligently, current information retrieval approaches need to be improved. Concept-based information retrieval is playing a major role in retrieving meaningful information intelligently. In this paper, we propose a novel concept-based approach using Wikipedia-based Explicit Semantic Analysis and Ontology. The proposed approach first specifies the context of the query words semantically by using an Ontology, which contains Wikipedia-based concepts for query word disambiguation, and then extracts the corresponding concepts for each word based on the given context. As a result, the retrieved documents will be more relevant to the user query and the performance of concept-based information retrieval systems will improve.
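The keyword mismatch described above can be illustrated with a toy comparison. The `CONCEPTS` dictionary below is a hand-written, hypothetical stand-in for a concept resource; it is not Wikipedia-based Explicit Semantic Analysis or an ontology, and the function names are assumptions for this sketch.

```python
# Hypothetical word-to-concept table (a stand-in for a real concept resource).
CONCEPTS = {
    "car": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
    "film": "movie",
    "movie": "movie",
}


def to_concepts(text):
    """Map each word to its concept, or to itself if no concept is known."""
    return {CONCEPTS.get(word, word) for word in text.lower().split()}


def keyword_match(query, doc):
    """True if the query and document share at least one literal word."""
    return bool(set(query.lower().split()) & set(doc.lower().split()))


def concept_match(query, doc):
    """True if the query and document share at least one concept."""
    return bool(to_concepts(query) & to_concepts(doc))


query = "car repair"
doc = "automobile maintenance guide"
literal_hit = keyword_match(query, doc)
concept_hit = concept_match(query, doc)
```

The literal comparison finds no shared word, while the concept comparison matches because "car" and "automobile" map to the same concept.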

Crux and Crushed of the Information Retrieval

Today the world has become highly dependent on the World Wide Web, which now reaches every walk of human life. The creation of the global web has allowed people to share their information and data with each other worldwide. This use and presence of data and information has created trillions of databases. In this complex scenario, searching accurately for a particular piece of information or data is the need of the hour, and it requires specialist tools. Search engines are one answer, but retrieving meaningful information with them is difficult. To overcome this problem, many modern technologies have been implemented that may retrieve meaningful information intelligently. Semantic web technologies are playing a major role. In this paper we present the modern techniques used by search engines in intelligent web search technologies.

A Smart Query Formulation for an Efficient Web Search

2007

Traditional search engines rely on keyword-based matching, recovering the documents which contain occurrences of the input keywords, but entirely ignore the meaning of the data in the retrieved documents. Thus, long lists of page links are returned, but only a handful of pages actually contain references to relevant web resources and meet the needs of users. The need for greater awareness in the interpretation of web data has led to new approaches and methodologies for improving web search and retrieval by taking into account the context of information related to the user query. This work presents an approach for supporting the user in the Web search activity: it interprets the input query and, on the basis of the local knowledge, replies by providing (links to) web pages which are more relevant to the meaning of the input query. The approach combines the intrinsic potential of the agent-based paradigm with the modeling of knowledge through soft computing techniques. The agents encode the semantics of data, by exploiting ontologies, in order to grasp the actual query meaning. The information elicited by the query interpretation represents an add-on, aimed at augmenting the system knowledge, exploited in the discovery of web pages which match the user request.
engines are not efficient in terms of time and bandwidth). On the other hand, reusing the existing large indexes of general-purpose search engines is a solution for retrieving, after a filtering activity, documents from a specific domain (though the response times to the user query are slow too). Similarly, other approaches cluster results for automatic organization of documents into categories (e.g. WiseNut and Vivisimo [34]). Metasearch environments instead implement strategies that apply user queries to several search engines simultaneously.
However, many of these approaches do not consider the semantic relationships existing among terms: query ambiguity and the vocabulary gap remain impediments, confirming that search engine technology is still far from giving the ideal response to a given query.

Performance Comparison between Keyword-based and WQCA-based Information Retrieval System

2018

Today, semantic logics are very important in query understanding for creating successful web search engines. A user might not formalize a query when seeking information, despite knowing what he or she wants. As a result, understanding the nature of the information need behind a query is an important research problem. This system therefore proposes the Web Query Classification Algorithm (WQCA) for an efficient Information Retrieval (IR) system. In the WQCA process, the system first classifies the web queries by their characteristics (taxonomies). Then, it extracts the domain terms from the query. Using a NoSQL graph database, the system classifies each domain term into its relevant categories according to the WQCA algorithm. In the WQCA-based IR process, the system uses the classified query to find the relevant documents from the document collection. Finally, the system compares the performance of keyword-based IR and WQCA-based IR to show the effectiveness of the ...