Extended Boolean Operations in Latent Semantic Indexing Search
Related papers
Getting Better Results with Latent Semantic Indexing
2000
The paper presents an overview of some important factors influencing the quality of the results obtained when using Latent Semantic Indexing. The factors are separated into five major groups and analyzed both separately and as a whole. A new class of extended Boolean operations such as OR, AND and NOT (AND-NOT) and their combinations is proposed and evaluated on a corpus of religious and sacred texts.
A New Approach for Boolean Query Processing in Text Information Retrieval
Advances in Soft Computing
The main objective of an information retrieval system is to be effective in providing a user with relevant information in response to a query. However, especially given the information explosion, which has created an enormous volume of information, efficiency issues cannot be ignored. A Boolean query system must therefore be able to quickly process the lists of documents indexed under the keywords of a given query, merging them according to the Boolean logic of the query. A new algorithm, based loosely on concurrent codes, is developed and discussed.
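The list-merging step this abstract refers to can be illustrated with the textbook merge of sorted posting lists. The following is a minimal sketch of that standard technique, not the concurrent-codes algorithm the paper proposes; the function names and the sample index are hypothetical.

```python
def intersect(a, b):
    """AND: doc IDs present in both sorted posting lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def union(a, b):
    """OR: doc IDs present in either sorted posting list, without duplicates."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def difference(a, b):
    """AND-NOT: doc IDs in sorted list a that do not appear in sorted list b."""
    i = j = 0
    out = []
    while i < len(a):
        if j >= len(b) or a[i] < b[j]:
            out.append(a[i])
            i += 1
        elif a[i] == b[j]:
            i += 1
            j += 1
        else:
            j += 1
    return out


# Hypothetical inverted index: term -> sorted list of document IDs.
index = {
    "latent": [1, 3, 5, 8],
    "semantic": [2, 3, 8, 9],
    "boolean": [3, 4, 9],
}

# (latent AND semantic) AND-NOT boolean
result = difference(intersect(index["latent"], index["semantic"]), index["boolean"])
```

Each merge walks both lists once, so the cost is linear in the combined length of the two posting lists.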
Refining Search Queries From Examples Using Boolean Expressions and Latent Semantic Analysis
2004
This paper describes an algorithm whereby an initial, naïve user query to a search engine can be subsequently refined to improve both its recall and precision. This is achieved by manually classifying the documents retrieved by the original query into relevant and irrelevant categories, and then finding additional Boolean terms which successfully discriminate between these categories. Latent semantic analysis is used to weight the choice of these extra search terms to make the resulting queries more intuitive to users.
An enhanced Boolean retrieval model for efficient searching
Scientific Journal of India
A large amount of information from all domains is available online as hypertext in web pages. People from different domains consult different websites to fetch information according to their needs. It is very difficult to remember the names of the websites for the specific domain in which the user wants to search. A search engine is therefore a system that mines information from the World Wide Web and presents it to the user according to the query. The information retrieval system (IRS) behind a search engine arranges the web documents systematically and retrieves results according to the user query. In this paper, an efficient Boolean retrieval model is proposed that retrieves results according to the Boolean operations specified between the terms of the search query. The proposed model is also capable of storing large indexes.
Web personalization using extended Boolean operations with Latent Semantic indexing
2000
The paper discusses the potential of using extended Boolean operations for personalized information delivery on the Internet based on semantic vector representation models. The final goal is the design of an e-commerce portal that tracks users' clickstream activity and purchase history in order to offer them personalized information. The emphasis is put on the introduction of dynamic composite user profiles constructed by means of extended Boolean operations. The basic binary Boolean operations such as OR, AND and NOT (AND-NOT) and their combinations have been introduced and implemented in a variety of ways. An evaluation is presented based on the classic Latent Semantic Indexing method for information retrieval using a text corpus of religious and sacred texts.
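The abstract does not say how the extended OR, AND and AND-NOT operations are defined. As a rough illustration of what a Boolean-style combination over a semantic vector space can look like, the sketch below applies the common fuzzy-logic min/max rules to cosine similarities. This is a stand-in technique, not the paper's definition; the two-dimensional vectors are hypothetical and are assumed to be already projected into the latent space.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sim01(u, v):
    """Cosine similarity clipped to [0, 1] so it can act as a fuzzy truth value."""
    return max(0.0, cosine(u, v))


# Assumed fuzzy-logic combinations; the paper may define these differently.
def score_or(doc, a, b):
    return max(sim01(doc, a), sim01(doc, b))


def score_and(doc, a, b):
    return min(sim01(doc, a), sim01(doc, b))


def score_and_not(doc, a, b):
    return min(sim01(doc, a), 1.0 - sim01(doc, b))


def rank(docs, score, a, b):
    """Document names ordered by descending score for the query 'a <op> b'."""
    return sorted(docs, key=lambda name: score(docs[name], a, b), reverse=True)


# Hypothetical latent-space vectors for two query concepts and three documents.
concept_a = (1.0, 0.0)
concept_b = (0.0, 1.0)
docs = {
    "d1": (1.0, 0.0),  # about concept A only
    "d2": (0.0, 1.0),  # about concept B only
    "d3": (1.0, 1.0),  # about both
}

and_ranking = rank(docs, score_and, concept_a, concept_b)
and_not_ranking = rank(docs, score_and_not, concept_a, concept_b)
```

Under these assumptions, AND favors the document close to both concepts, while AND-NOT favors the document close to the first concept and far from the second. Composite user profiles of the kind the abstract mentions could be built by nesting such scores.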
IJERT-Extended Semantic based Boolean Information Retrieval Algorithm for User-driven Query
International Journal of Engineering Research and Technology (IJERT), 2015
https://www.ijert.org/extended-semantic-based-boolean-information-retrieval-algorithm-for-user-driven-query https://www.ijert.org/research/extended-semantic-based-boolean-information-retrieval-algorithm-for-user-driven-query-IJERTV4IS050514.pdf
Information Retrieval (IR) is essentially a matter of deciding which documents within a large collection satisfy a user's information need. Those documents are called relevant documents, and the documents that are not on the topic specified by the user are said to be non-relevant. An existing SBIR algorithm uses the lexical database WordNet to find synonyms of single-word query terms, on the grounds that the absence of a given term in a document does not necessarily mean that the document is not relevant. In this paper, a new algorithm is proposed which works with compound terms and uses a modified Porter Stemming Algorithm to solve some stemming errors found in the Porter Stemmer Algorithm proposed by M. F. Porter. This will improve recall, as more relevant documents will be retrieved. We propose to involve the user in the search process through interactive feedback on word senses. This will further improve recall by retrieving more user-relevant results.
Efficiency of Boolean Search strings for Information Retrieval
A review of the available literature is a foundational requirement for most research projects. The relevant literature should be searched from multiple sources. Search engines and online bibliographic resource sites are conventionally used to find relevant literature through keyword search; however, free-text search queries receive little automated help. In this paper, the Boolean search string technique is explored in detail, along with an analysis and evaluation of its effectiveness. Search engines such as Google and Google Scholar and online bibliographic sources such as IEEE Xplore, ACM and Science Direct were used to implement the technique. The technique was evaluated on three criteria: the number of documents retrieved, the time taken to retrieve them, and the relevance of the documents to the query or research question. The analysis shows that the Boolean search string technique returns more relevant articles than the free-text query, by at least 77%, and in a shorter time frame. Hence, Boolean search strings are very useful for information retrieval.
The Performance of Boolean Retrieval and Vector Space Model in Textual Information Retrieval
CommIT (Communication and Information Technology) Journal, 2017
Boolean Retrieval (BR) and the Vector Space Model (VSM) are very popular methods in information retrieval for creating an inverted index and querying terms. The BR method returns exact matches without ranking the results, whereas the VSM method both searches and ranks the results. This study empirically compares the two methods, using a sample of the Reuters corpus. The experimental results show that the times required to produce an inverted index by the two methods are nearly the same. However, the methods differ in querying the index. The results also show that the number of generated indexes, the sizes of the generated files, and the duration of reading and searching an index are proportional to the number of files in the corpus and the file size.
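The contrast between exact, unranked Boolean matching and ranked vector-space scoring can be seen on a single inverted index. The sketch below is an illustration only: the three-document corpus is hypothetical (not the Reuters sample used in the study), and the scoring is a simple tf-idf sum without length normalization, which is an assumption rather than the weighting the study used.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: doc ID -> text.
docs = {
    1: "boolean retrieval returns exact matches",
    2: "vector space model ranks matches by score",
    3: "boolean model and vector model compared",
}

# One inverted index serves both methods: term -> {doc ID: term frequency}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term, tf in Counter(text.split()).items():
        index[term][doc_id] = tf


def boolean_and(terms):
    """BR: doc IDs containing every query term, returned unranked (sorted by ID)."""
    sets = [set(index.get(t, {})) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []


def vsm_rank(terms):
    """Simplified VSM: doc IDs ordered by summed tf-idf over the query terms."""
    n = len(docs)
    scores = defaultdict(float)
    for t in terms:
        postings = index.get(t, {})
        if not postings:
            continue
        idf = math.log(n / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by doc ID for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))


query = ["boolean", "model"]
exact = boolean_and(query)   # only documents containing both terms
ranked = vsm_rank(query)     # every document containing at least one term, ordered
```

For this query, the Boolean result contains only the document with both terms, while the ranked result also includes partial matches and places that same document first.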