A simple solution for improving the effectiveness of traditional information retrieval systems
Related papers
International Journal of Data Engineering (IJDE)
This paper discusses an information retrieval system for local databases. The approach is to search the web both semantically and syntactically. The proposal handles search queries from users who are interested in focused results about a product with specific characteristics. The objective of the work is to find and retrieve accurate information from the available information warehouse, which contains related data sharing common keywords. This information retrieval system can eventually also be used for accessing the internet. Accuracy in information retrieval, that is, achieving both high precision and high recall, is difficult. Semantic and syntactic search engines are therefore compared for information retrieval using two parameters: precision and recall.
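The two parameters named in this abstract can be illustrated with a minimal sketch. The document IDs and the result lists for the "syntactic" and "semantic" engines below are hypothetical and are not taken from the paper; only the standard definitions are assumed (precision is the fraction of retrieved documents that are relevant, recall is the fraction of relevant documents that are retrieved).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
relevant = ["d1", "d3", "d5", "d6"]
syntactic = ["d1", "d2", "d3", "d4"]  # made-up keyword-engine output
semantic = ["d1", "d3", "d5"]         # made-up semantic-engine output

print(precision_recall(syntactic, relevant))
print(precision_recall(semantic, relevant))
```

An engine can score well on one measure and poorly on the other, which is why the abstract treats achieving both together as the hard part.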
A SURVEY ON VARIOUS ARCHITECTURES, MODELS AND METHODOLOGIES FOR INFORMATION RETRIEVAL
iaeme
The typical Information Retrieval (IR) model of the search process consists of three essentials: query, documents and search results. A user looking to fulfill an information need has to formulate a query, usually consisting of a small set of keywords summarizing that need. The goal of an IR system is to retrieve documents containing information which might be useful or relevant to the user. Throughout the search process there is a loss of focus, because keyword queries entered by users often do not suitably summarize their complex information needs, and IR systems do not sufficiently interpret the contents of documents, leading to result lists containing irrelevant and redundant information. The short keyword query used as input to the retrieval system can be supplemented with topic categories from structured Web resources. The topic categories can be used as query context to retrieve documents that are not only relevant to the query but also belong to a relevant topic category. Category information is especially useful for the task of entity ranking, where the user is searching for a certain type of entity such as companies or persons. Category information can help to improve the search results by promoting in the ranking those pages that belong to relevant topic categories, or to categories similar to the relevant ones. Users may raise various queries to describe the same information need. For example, to search for the National Board of Accreditation, the queries “National Board of Accreditation (NBA)” or “NB Accreditation” may be formulated. Directly using individual queries to describe context cannot capture contexts concisely and accurately. Queries may also arise where “NBA” can be expanded as either “National Basketball Association” or “National Board of Accreditation”. Hence it becomes extremely important to use context-based queries that draw on the user's history and present requirements in that context.
In this paper, an extensive survey has been made of the different architectures, models and methodologies that have been used in IR by various researchers, along with a comparison of results against various performance metrics, also highlighting the need for context-based queries.
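The idea of promoting pages that belong to relevant topic categories can be sketched as a simple re-ranking step. This is only an illustration under stated assumptions: the function name `rerank`, the additive `boost`, the base scores and the category labels are all hypothetical, and the survey does not prescribe this particular scheme.

```python
def rerank(results, relevant_categories, boost=0.5):
    """Re-rank (doc_id, base_score, categories) tuples.

    Documents sharing at least one category with the query context
    receive an additive boost (a hypothetical scoring choice).
    """
    rescored = []
    for doc_id, score, categories in results:
        if relevant_categories & set(categories):
            score += boost
        rescored.append((doc_id, score))
    # Highest score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Made-up results for the ambiguous query "NBA".
results = [
    ("basketball-league", 0.9, {"sports"}),
    ("accreditation-board", 0.7, {"education"}),
    ("team-rosters", 0.6, {"sports"}),
]

# Assume the user's context points to the "education" category.
ranked = rerank(results, {"education"})
print([doc_id for doc_id, _ in ranked])
```

With the "education" context, the accreditation page moves ahead of the basketball pages even though its keyword score was lower.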
A survey in traditional information retrieval models
2008 2nd IEEE International Conference on Digital Ecosystems and Technologies, 2008
As a matter of fact, many so-called semantic search algorithms are derived from the traditional index-term-based search models. In this paper, we survey the traditional information retrieval models by categorizing them into three main classes and eleven subclasses, and analyse their benefits and issues.
Bridging the Gap: From Traditional Information Retrieval to the Semantic Web
Web is the nature of information search. The Semantic Web vision reveals a radical departure from the traditional theories of Information Retrieval (IR) upon which current search engine technology is built. Semantic Web researchers are very articulate about how the pillars of the Semantic Web (semantically aware, intelligent agents, ontologies, and markup languages) will revolutionize the way that we interact with information on the web. They are less articulate about how we will get there from here. While it's true that the traditional assumptions of IR (small, static, homogeneous, centrally located, monolingual document collections) don't hold for the Web, it is still important to note the success of search engines built on IR theory. This paper calls attention to the gap between traditional IR and the more visionary Semantic Web research. We describe a preliminary roadmap bridging the two areas, focusing on the concrete contributions and also calling attention to the weak points of both fields.
A Novel Approach for Information Retrieval on the Web
The diverse and huge volume of information available over networks makes it difficult for users to find exactly the information they need. Previous information retrieval systems mostly use keywords to index and retrieve documents, which may return inaccurate results when different keywords are used to express the same concept in the documents and in the queries presented by the user. To overcome the problem of retrieving accurate information on the Web intelligently, current information retrieval approaches need to be improved. Concept-based information retrieval is playing a major role in retrieving meaningful information intelligently. In this paper, we propose a novel concept-based approach using Wikipedia-based Explicit Semantic Analysis and an ontology. The proposed approach first specifies the context of the query words semantically by using the ontology, which draws on Wikipedia-based concepts to disambiguate query words, and then extracts the corresponding concepts for each word based on that context. As a result, the retrieved documents will be more relevant to the user query and the performance of concept-based information retrieval systems will improve.
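The keyword mismatch described in this abstract can be made concrete with a toy comparison. The sketch below is not Explicit Semantic Analysis and does not use an ontology; it swaps in a tiny hand-written word-to-concept table (`CONCEPTS`, entirely hypothetical) just to show why matching on concepts can succeed where matching on surface keywords fails.

```python
# Hypothetical toy lexicon mapping surface words to concept labels.
CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "price": "cost",
    "cost": "cost",
}


def keyword_match(query, doc):
    """True if the query and document share at least one literal word."""
    return bool(set(query.lower().split()) & set(doc.lower().split()))


def to_concepts(text):
    """Map each known word in the text to its concept label."""
    return {CONCEPTS[w] for w in text.lower().split() if w in CONCEPTS}


def concept_match(query, doc):
    """True if the query and document share at least one concept."""
    return bool(to_concepts(query) & to_concepts(doc))


query = "car price"
doc = "automobile cost"
print(keyword_match(query, doc))  # no shared words
print(concept_match(query, doc))  # shared concepts
```

A real concept-based system would derive the mapping from a knowledge source and disambiguate each word in context, rather than relying on a fixed table.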
A Smart Query Formulation for an Efficient Web Search
2007
Traditional search engines rely on keyword-based matching, recovering the documents which contain occurrences of the input keywords, but they entirely ignore the meaning of the data in the retrieved documents. Thus, long lists of page links are returned, but only a handful of pages actually contain references to relevant web resources and meet the needs of users. The need for greater awareness in the interpretation of web data has led to new approaches and methodologies for improving web search and retrieval by taking into account the context of information related to the user query. This work presents an approach for supporting the user in the Web search activity: it interprets the input query and, on the basis of the local knowledge, replies by providing (links to) web pages which are more relevant to the meaning of the input query. The approach combines the intrinsic potential of the agent-based paradigm with the modeling of knowledge through soft computing techniques. The agents encode the semantics of data, by exploiting ontologies, in order to grasp the actual query meaning. The information elicited by the query interpretation represents an add-on, aimed at augmenting the system knowledge, which is exploited in the discovery of web pages that match the user request. [...] engines are not efficient in terms of time and bandwidth). On the other hand, reusing the existing large indexes of general purpose search engines is a solution for retrieving, after a filtering activity, documents from a specific domain (though the response times to user queries are slow too). Similarly, other approaches achieve clustering of results for automatic organization of documents into categories (e.g. WiseNut and Vivisimo [34]). Metasearch environments instead implement strategies that apply user queries to several search engines simultaneously.
However, many of these approaches do not consider the semantic relationships existing among terms: query ambiguity and the vocabulary gap remain impediments, confirming that search engine technology is far from giving the ideal response to a given query.
Crux and Crushed of the Information Retrieval
In modern times the world has become highly dependent on the World Wide Web, which has now reached every walk of human life. The creation of the global web has succeeded in allowing people to share their information and data with each other globally. Such use and presence of data and information has created trillions of databases. In this complex scenario, searching for a particular piece of information or data with accuracy is the need of the hour. We need specialist tools. Search engines are one of the answers, but retrieving meaningful information is difficult. To overcome this problem with search engines, many modern technologies have been implemented which may retrieve meaningful information intelligently. Semantic web technologies are playing a major role. In this paper we present the modern techniques used by search engines in intelligent web search technologies.
Performance Comparison between Keyword-based and WQCA-based Information Retrieval System
2018
Today, semantic logics are very important in query understanding for creating successful web search engines. A user might not formalize the query when seeking information, even when knowing what is wanted. As a result, understanding the nature of the information need behind the queries is an important research problem. This system therefore proposes the Web Query Classification Algorithm (WQCA) for an efficient Information Retrieval (IR) system. In the WQCA process, the system first classifies the web queries by characteristic (taxonomy). Then, the system extracts the domain terms from the query. By using a NoSQL graph database, the system classifies each domain term into its relevant categories according to the WQCA algorithm. In the WQCA-based IR process, the system uses the classified query to find the relevant documents from the document collection. Finally, the system compares the performance of keyword-based IR and WQCA-based IR to show the effectiveness of the ...
New Information Retrieval Approach Based on Semantic Indexing by Meaning
Proceedings of the 16th International Conference on Applied Computing 2019, 2019
An Information Retrieval System (IRS) offers a number of tools and techniques which make it possible to locate and visualize the relevant information needed. This information need is expressed by the user in the form of a natural-language query. However, the representation of documents and the query in a traditional IRS leads to a lexical-centered relevance estimation which is, in fact, less efficient than a semantic-focused estimation. As a consequence, documents that are actually relevant are not retrieved if they do not share words with the query, while non-relevant documents which have words in common with the query are retrieved even though at times they do not have the intended meaning. This paper tackles this problem by suggesting a solution at the indexing level of an IRS that allows it to improve its performance. To be more precise, we suggest a new approach to semantic indexing that determines the exact meaning of each term in a document or query through a contextual analysis at the sentence level. In fact, if the system is able to comprehend the need of the user, then it is consequently capable of responding to it. In addition, we suggest a simple method that allows any IR model to be applied to our new index table without changing its original bases, making it faster. In order to validate the proposed approach, the newly created system is evaluated on several collections, namely "TIME", "BBC", "The Guardian" and "BigThink". The experimental results support our hypothesis when compared with traditional IR approaches.