Extended Boolean query processing in the generalized vector space model

On extending the vector space model for Boolean query processing

1986

An information retrieval model, named the Generalized Vector Space Model (GVSM), is extended to handle situations where queries are specified as (extended) Boolean expressions. It is shown that this unified model, unlike currently available alternatives, has the advantage of incorporating term correlations into the retrieval process. The query language extension is attractive in the sense that most of the algebraic properties of the strict Boolean language are still preserved. Although the experimental results for extended Boolean retrieval are not always better than the vector processing method, the developments here are significant in facilitating commercially available retrieval systems to benefit from the vector based methods. The proposed scheme is compared to the p-norm model advanced by Salton and coworkers. An important conclusion is that it is desirable to investigate further extensions that can offer the benefits of both proposals.

The Performance of Boolean Retrieval and Vector Space Model in Textual Information Retrieval

CommIT (Communication and Information Technology) Journal, 2017

Boolean Retrieval (BR) and the Vector Space Model (VSM) are very popular methods in information retrieval for creating an inverted index and querying terms. The BR method returns exact matches in textual information retrieval without ranking the results. The VSM method searches and ranks the results. This study empirically compares the two methods. The research utilizes a sample of the corpus data obtained from Reuters. The experimental results show that the times required to produce an inverted index by the two methods are nearly the same. However, a difference exists in querying the index. The results also show that the number of generated indexes, the sizes of the generated files, and the duration of reading and searching an index are proportional to the number of files in the corpus and the file size.
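The contrast between the two methods can be sketched on a toy corpus. The following is a minimal illustration, not the study's implementation: the three documents, the function names, and the TF-IDF weighting with cosine normalization are all assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (the study itself used a Reuters sample).
DOCS = {
    1: "oil prices rise as opec cuts output",
    2: "opec meeting ends without agreement on output",
    3: "gold prices fall while oil prices hold steady",
}

def build_index(docs):
    """Inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.split()).items():
            index[term][doc_id] = tf
    return index

def boolean_and(index, terms):
    """Boolean retrieval: the unranked set of documents containing every term."""
    postings = [set(index.get(t, {})) for t in terms]
    return set.intersection(*postings) if postings else set()

def vsm_rank(index, n_docs, terms):
    """Vector space retrieval: rank by cosine similarity of TF-IDF vectors."""
    idf = {t: math.log(n_docs / len(p)) for t, p in index.items()}
    # Squared length of every document vector, accumulated term by term.
    norms = defaultdict(float)
    for t, postings in index.items():
        for doc_id, tf in postings.items():
            norms[doc_id] += (tf * idf[t]) ** 2
    # Query terms that never occur in the corpus are ignored.
    q_tf = Counter(t for t in terms if t in index)
    q_norm = math.sqrt(sum((tf * idf[t]) ** 2 for t, tf in q_tf.items()))
    scores = defaultdict(float)
    for t, qtf in q_tf.items():
        for doc_id, tf in index[t].items():
            scores[doc_id] += (qtf * idf[t]) * (tf * idf[t])
    ranked = []
    for doc_id, dot in scores.items():
        denom = math.sqrt(norms[doc_id]) * q_norm
        if denom > 0:
            ranked.append((doc_id, dot / denom))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

index = build_index(DOCS)
print(boolean_and(index, ["oil", "prices"]))
print(vsm_rank(index, len(DOCS), ["oil", "prices"]))
```

Both calls return documents 1 and 3 for this query. The Boolean result is a plain set, whereas the vector space result is an ordered list of (document, score) pairs.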

Trends in research on information retrieval — The potential for improvements in conventional Boolean retrieval systems

Information Processing & Management, 1988

Operational retrieval systems are firmly embedded within the pure Boolean framework, and the theoretical model underlying these systems is based on the implicit assumption that documents and user information needs can be precisely and completely characterized by sets of index terms and Boolean search request formulations, respectively. However, this assumption must be considered grossly inaccurate since uncertainty is intrinsic to the document retrieval process. The inability of the standard Boolean model to deal effectively with the inherent fallibility of retrieval decisions is the main reason for a number of serious deficiencies exhibited by present-day operational retrieval systems. This article reviews recent advances in information retrieval research and examines their practical potential for overcoming these deficiencies. The primary source for this review is the subsequent articles that comprise this special issue of Information Processing & Management, although earlier results published elsewhere have also been considered.

A mathematical model of a weighted boolean retrieval system

Information Processing and Management, 1979

The use of weights to denote a query representation and/or the indexing of a document is analysed as a generalization of a Boolean retrieval system. Criteria are given for the functions used to evaluate the relevance of the records to a specific query, including self-consistency. Various mechanisms suggested in the literature for evaluating the relevance of records with regard to a given query are tested and found to be less than satisfactory. A new approach is suggested to avoid some of the perils of a weighted Boolean retrieval system.

Extended Boolean Retrieval Model Using Pnorm and Term Independent Bound Methods

This paper compares two processes for retrieving keywords or information from one or more databases. The first process is the Extended Boolean Retrieval (EBR) model, which returns results from the database. Because implementing the EBR model is costly, we consider a p-norm method for executing it. The p-norm approach preserves the strictness of conjunctions and disjunctions in the EBR model, so that each keeps its own identity at its node. The second process is known as the Adaptive Boolean Retrieval Process (ABRP). Within this text categorization paradigm, it first assigns a value to a specified keyword or piece of information and then starts its search procedure with an index. This value includes factors such as the position and number of appearances of a phrase. Existing work uses these concepts in the bag-of-words approach. In this paper, the EBR model provides an advantage in query reformulation, which yields hundreds or tens of thousands of answers for a given query. Finally, we assess the reported results of these models on queries to demonstrate a better retrieval process in terms of efficiency and correctness with the max score ranking algorithm.
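For orientation, the p-norm scoring of conjunctions and disjunctions can be sketched as follows. This is the standard p-norm formulation with equal query-term weights, not code from this paper. The function names are hypothetical, and each weight is assumed to be a document term weight in [0, 1].

```python
def pnorm_or(weights, p):
    """p-norm score of a disjunction over document term weights in [0, 1]."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)

def pnorm_and(weights, p):
    """p-norm score of a conjunction over document term weights in [0, 1]."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)

doc_weights = [0.8, 0.2]  # hypothetical weights of two query terms in one document
for p in (1, 2, 100):
    print(p, pnorm_or(doc_weights, p), pnorm_and(doc_weights, p))
```

With p = 1 both operators reduce to the average of the weights, so AND and OR are indistinguishable, as in plain vector matching. As p grows, OR approaches the maximum weight and AND approaches the minimum, which recovers strict Boolean behavior.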

An enhanced Boolean retrieval model for efficient searching

Scientific Journal of India

A large amount of information in all domains is available online in the form of hypertext in web pages. People from different domains consult different websites to fetch information according to their needs. It is very difficult to remember the names of the websites for a specific domain that the user wants to search. A search engine is therefore a system that mines information from the World Wide Web and presents it to the user according to the query. The information retrieval system (IRS) behind a search engine arranges web documents systematically and retrieves results according to the user query. In this paper, an efficient Boolean retrieval model is proposed that retrieves results according to the Boolean operations specified within the terms of the search query. The proposed model is also capable of storing large indexes.

Query Expansion Using Augmented Terms in an Extended Boolean Model

Journal of Computing Science and Engineering, 2008

We propose a new query expansion method in the extended Boolean model that improves precision without degrading recall. For improving precision, our method promotes the ranks of documents having more query terms since users typically prefer such documents. The proposed method consists of the following three steps: (1) expanding the query by adding new terms related to each term of the query, (2) further expanding the query by adding augmented terms, which are conjunctions of the terms, (3) assigning a weight on each term so that augmented terms have higher weights than the other terms. We conduct extensive experiments to show the effectiveness of the proposed method. The experimental results show that the proposed method improves precision by up to 102% for the TREC-6 data compared with the existing query expansion method using a thesaurus proposed by Kwon et al. [Kwon et al. 1994].
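The three steps can be sketched as follows. This is a rough illustration, not the authors' implementation: the thesaurus, the weight values, the choice of pairwise conjunctions of the original query terms as augmented terms, and the additive scoring function are all assumptions.

```python
from itertools import combinations

# Hypothetical thesaurus mapping a term to related terms.
THESAURUS = {"car": ["automobile"], "engine": ["motor"]}

def expand_query(terms, thesaurus, base_weight=1.0, augmented_weight=2.0):
    """Return {tuple_of_terms: weight}; a tuple longer than one is a conjunction."""
    weighted = {}
    # Step 1: keep each query term and add its related terms.
    for term in terms:
        weighted[(term,)] = base_weight
        for related in thesaurus.get(term, []):
            weighted[(related,)] = base_weight
    # Step 2: add augmented terms (assumed here: pairs of original terms).
    # Step 3: give augmented terms a higher weight than single terms.
    for pair in combinations(terms, 2):
        weighted[pair] = augmented_weight
    return weighted

def score(doc_terms, weighted_query):
    """Simplified additive score: sum the weights of entries fully present."""
    return sum(w for key, w in weighted_query.items()
               if all(t in doc_terms for t in key))

query = expand_query(["car", "engine"], THESAURUS)
print(score({"car", "engine"}, query))      # both original query terms
print(score({"car", "automobile"}, query))  # one query term plus a synonym
```

In this sketch a document containing both original query terms also matches the augmented term, so it scores higher than a document that matches the same number of single terms without the conjunction.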

A New Approach for Boolean Query Processing in Text Information Retrieval

Advances in Soft Computing

The main objective of an information retrieval system is to be effective in providing a user with relevant information in response to a query. However, especially given the information explosion which has created an enormous volume of information, efficiency issues cannot be ignored. Thus, it is essential in a Boolean query system to be able to quickly process the lists of documents to which the keywords stated in a given query have been assigned/indexed, by merging those lists according to the Boolean logic of the query. A new algorithm, based loosely on concurrent codes, is developed and discussed.
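As a baseline for what merging via Boolean logic involves, here is a minimal sketch of the conventional two-pointer merge over sorted posting lists. It is not the concurrent-code-based algorithm the paper proposes, whose details the abstract does not give. The function names and sample lists are hypothetical.

```python
def intersect(a, b):
    """AND: document ids present in both sorted posting lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    """OR: document ids present in either sorted posting list, no duplicates."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # One list is exhausted; append whatever remains of the other.
    out.extend(a[i:])
    out.extend(b[j:])
    return out

print(intersect([1, 3, 5, 7], [3, 4, 5, 8]))
print(union([1, 3, 5, 7], [3, 4, 5, 8]))
```

Each merge visits every posting at most once, so the cost is linear in the combined length of the two lists.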
