TOP 10 Cited Computer Science & Information Technology Research Articles From 2009 Issue

Automatic Query Expansion for Information Retrieval : A Survey and Problem Definition

2017

An ideal information retrieval (IR) system is expected to retrieve only the relevant documents and ignore irrelevant ones, which improves the throughput of the retrieval system, reduces the time users spend on search engines, and motivates them to continue searching. The IR process consists of locating relevant documents on the basis of a user query, such as keywords. One of the most fundamental research questions in information retrieval is how to operationally define the notion of relevance so that a document can be scored appropriately with respect to a query. The most critical language issue for retrieval effectiveness is the term mismatch problem: indexers and users often do not use the same words. This scenario is called the vocabulary problem. Consequently, users of IR systems spend considerable time and resources satisfying their information needs after querying the system. One solution to this problem is known as query expansion via pseudo relevance feedback w...

Studying query expansion effectiveness

2009

Abstract. Query expansion is an effective technique for improving retrieval performance in ad-hoc retrieval. However, query expansion can also fail, leading to a degradation of retrieval performance. In this paper, we aim to provide a better understanding of query expansion through an empirical study of which factors affect query expansion and how they affect it. We examine how the quality of the query, measured by the first-pass retrieval performance, is related to the effectiveness of query expansion.
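To make the comparison described in this abstract concrete, the following is a minimal sketch of how the effect of expansion could be measured per query. It assumes average precision as the performance measure, which the abstract does not specify; the function names `average_precision` and `expansion_gain` and the document identifiers are hypothetical and chosen only for illustration.

```python
def average_precision(ranking, relevant):
    """Average precision of a ranked list of document ids.

    `ranking` is an ordered list of ids; `relevant` is the set of ids
    judged relevant for the query (both are illustrative inputs).
    """
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


def expansion_gain(first_pass, expanded, relevant):
    """Change in average precision after query expansion.

    Positive values mean expansion helped; negative values mean the
    expanded query degraded retrieval for this query.
    """
    return average_precision(expanded, relevant) - average_precision(first_pass, relevant)
```

Plotting `expansion_gain` against the first-pass `average_precision` over a set of queries is one way to inspect the relationship between query quality and expansion effectiveness that the abstract refers to.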

Web Query Expansion and Refinement using Query-Level Clustering

The objectives of this paper are to open a new dimension in Internet searching and to bring core semantic strategies to the forefront to add value to the search process. Precisely, "the search must be what the user wishes, not what the user types". To understand the intricacies of the search process, we observed that vocabulary contradiction and mismatch during retrieval can account for matches with irrelevant documents. Generally, a term or vocabulary mismatch occurs in a search iteration only if the query terms are not present in the fetched documents. Many techniques have been proposed, such as those from library science, pseudo relevance feedback and, later, semantic indexing; these algorithms tend to meet their stated objectives but do not deal with alternative processes. Hence we propose a technique that takes account of the implications of these pitfalls and devises a new mechanism to address the mismatch problem. By bringing the semantic aspects and the word order of sentences to the core, we arrive at a solution to the sentence or term mismatch problem.
1. Introduction
It is observed that searches conducted on search engines are mostly for learning, entertainment or carrying out business transactions. But many searches have a real purpose and influence important decisions about life, health, major purchases, or the business community's quest for an acquisition target. Although search engines have achieved remarkable success in recent years and have reached new heights in bringing quality results to users, they are still poor at helping people find exactly what they want and need, especially when users do not have a clear idea of what they are actually looking for.
Both conventional and modern search engines simply attempt to find the best match between what users ask for and what is available in their indices. Search engines have not done a good job of assessing exactly what the user wants because they lack knowledge of the context that led the user to generate a poor search query. Besides, the ambiguities of language make it more difficult to understand the exact intent or meaning of the user's query. Searching is an iterative process in which users reach the intended web pages by trial and error, using whichever query methods work best for the issue at hand. It might surprise most people to know that search engines index only a small percentage of the knowledge resources available. This occurs because many web pages are stored behind password-protected sites, pages are dynamically created and disappear once they serve their purpose, and several types of information are in formats that are not usable by search engines. Users search the web for information according to their needs, and their queries are mostly explicit expressions of those needs. The information need in the web search process can be termed intent, and it demands more productive fetching of web pages. Often, the user query contains only a few terms and is not adequate to describe the intended aim. This problem exists because of a lack of domain knowledge or insufficient skill in expressing intent. Moreover, the intent resides primarily in the mind of the user and is thus difficult to observe. Even when the user is obliged to reveal the actual intent, describing it accurately remains a challenging task. Hence, users can reformulate the initial query after seeing the search results, and their understanding becomes more specific as they extract clues from search activities.
Basically, web users' queries fall into three categories: navigational, informational and transactional. A navigational query is used to reach a specific web site or web page when users do not have a clear indication of where it is. Navigational queries can take the user to different web pages that are all related to one another. Informational queries are very specific and demand relevant information about a given topic. Users want to learn or find information that might be scattered across various web pages or sites. Transactional queries are fully interactive and carry out a transaction with a website, such as downloading music, shopping online or playing online games. To make the search process more productive, we need to extract the semantics from the questions that users often pose on the web. The questions can be categorized in many ways: some queries are yes-or-no questions, some seek the reason for a particular thing (why-type questions), a few ask for an opinion on particular things, and some want to know the details of the particular

Enhanced Web document retrieval using automatic query expansion

Journal of The American Society for Information Science and Technology, 2004

The ever growing popularity of the Internet as a source of information, coupled with the accompanying growth in the number of documents made available through the World Wide Web, is leading to an increasing demand for more efficient and accurate information retrieval tools. Numerous techniques have been proposed and tried for improving the effectiveness of searching the World Wide Web for documents relevant to a given topic of interest. The specification of appropriate keywords and phrases by the user is crucial for the successful execution of a query as measured by the relevance of documents retrieved. Lack of users' knowledge on the search topic and their changing information needs often make it difficult for them to find suitable keywords or phrases for a query. This results in searches that fail to cover all likely aspects of the topic of interest. We describe a scheme that attempts to remedy this situation by automatically expanding the user query through the analysis of initially retrieved documents. Experimental results to demonstrate the effectiveness of the query expansion scheme are presented.
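The general idea of expanding a query by analysing the initially retrieved documents can be sketched as follows. This is not the scheme from the paper, whose details are not given in the abstract; it is a minimal illustration that assumes whitespace tokenisation, a simple term-overlap score for the first-pass ranking, and raw term frequency for choosing expansion terms. All names (`tokenize`, `overlap_score`, `expand_query`) and parameters are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenisation (an assumption for this sketch)."""
    return text.lower().split()


def overlap_score(query_terms, doc_terms):
    """First-pass score: how often the query terms occur in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in set(query_terms))


def expand_query(query, docs, top_k=2, n_terms=2, stopwords=frozenset()):
    """Add the most frequent new terms from the top_k first-pass documents."""
    q_terms = tokenize(query)
    # First pass: rank documents against the original query.
    ranked = sorted(docs, key=lambda d: overlap_score(q_terms, tokenize(d)), reverse=True)
    # Collect candidate terms from the top-ranked documents only.
    candidates = Counter()
    for doc in ranked[:top_k]:
        candidates.update(
            t for t in tokenize(doc) if t not in q_terms and t not in stopwords
        )
    new_terms = [term for term, _ in candidates.most_common(n_terms)]
    return q_terms + new_terms
```

For example, with the documents "car engine repair", "car engine maintenance" and "banana bread recipe", the query "car" would pick up "engine" from the two top-ranked documents, so a second retrieval pass could also match documents that mention engines without the word "car".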