A novel fuzzy logic-based approach for textual documents indexing

ON AN INTERPRETATION OF KEYWORDS WEIGHTS IN INFORMATION RETRIEVAL: SOME FUZZY LOGIC BASED APPROACHES

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2009

The relevant contributions of fuzzy logic to logical models in information retrieval are studied. Fuzzy logic makes it possible to grasp the graduality of some relevant concepts and to model both the imprecision and the uncertainty inherent in the retrieval process, while remaining within the framework of the broadly meant logical approach. From this perspective we discuss various extensions to the basic Boolean model that are needed to attain such greater expressivity. In particular, we show how the well-known semantics of keyword weights may be recovered in various fuzzy logic based information retrieval models.
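As a rough illustration of what extending the Boolean model with graded keyword weights can look like, the sketch below uses the classical Zadeh operators (min for AND, max for OR, 1 - x for NOT). This is an assumption chosen for illustration, not necessarily one of the models analysed in the paper; the index contents, the query, and all function names are hypothetical.

```python
from typing import Dict

# Hypothetical index: each document is a fuzzy set of index terms,
# i.e. a mapping from term to a membership degree (weight) in [0, 1].
index: Dict[str, Dict[str, float]] = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.4},
    "d2": {"fuzzy": 0.3, "retrieval": 0.8, "boolean": 0.6},
}


def weight(doc: str, term: str) -> float:
    """Degree to which `term` describes `doc`; 0.0 if the term is absent."""
    return index[doc].get(term, 0.0)


def f_and(*degrees: float) -> float:
    """Fuzzy conjunction (assumed here: minimum)."""
    return min(degrees)


def f_or(*degrees: float) -> float:
    """Fuzzy disjunction (assumed here: maximum)."""
    return max(degrees)


def f_not(degree: float) -> float:
    """Fuzzy negation (assumed here: standard complement)."""
    return 1.0 - degree


def score(doc: str) -> float:
    """Matching degree for the query: fuzzy AND (retrieval OR boolean)."""
    return f_and(
        weight(doc, "fuzzy"),
        f_or(weight(doc, "retrieval"), weight(doc, "boolean")),
    )


# Documents are ranked by matching degree instead of being accepted or rejected.
ranking = sorted(index, key=score, reverse=True)
print([(doc, score(doc)) for doc in ranking])
```

With crisp weights restricted to 0 and 1, these operators reduce to ordinary Boolean retrieval, which is why this formulation is usually read as an extension of the Boolean model.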

A new fuzzy logic based information retrieval model

2008

We propose a comprehensive model of information retrieval (IR) based on Zadeh's linguistic statements. Its characteristic feature is a capability to take into account both the imprecision and uncertainty pervading the textual information representation. It extends earlier IR models based on broadly meant fuzzy logic. Moreover, some techniques for obtaining quantitative representations of documents and queries are proposed.

Fuzzy information retrieval model revisited

Fuzzy Sets and Systems, 2009

A new comprehensive model of information retrieval (IR) based on Zadeh's calculus of linguistic statements is proposed. Its characteristic and novel feature is the capability to take into account both the imprecision and uncertainty pervading the textual information representation. It extends earlier IR models based on broadly meant fuzzy logic. Moreover, some techniques for indexing documents and queries in the framework of this model are proposed. The results of the computational experiments on standard document collections are reported.

Fuzzy Logic in Natural Language Processing – A Closer View

Procedia Computer Science, 2018

The natural instinct of human beings is very complex. Understanding this instinct requires a clear dimensional analysis of the knowledge of discourse. Computer systems are now trained to understand how things work in real-world domains for intelligent analysis. This effort, although very progressive, has a limitation: there is an intelligence gap which keeps humans one step above the machine. Fuzzy logic can be used to make a machine understand this intelligence gap in a better way. Fuzzy logic is the science which makes a computer understand and think the way humans do. The aim of this study is twofold: first, to understand fuzzy logic, a computational intelligence technique, for effective decision making; and second, to illustrate through real-world examples the existence of this intelligence gap using well-known natural language processing applications such as Google Search Engine, Google Translator and MIT Start. To the best of our knowledge, there is no work available in the literature which exemplifies this intelligence gap in such a simple manner. The examples are chosen carefully to illustrate and demonstrate the applications of fuzzy logic in natural language processing for every reader.

Rules and fuzzy rules in text: concept, extraction and usage

International Journal of Approximate Reasoning, 2003

Several concepts and techniques have been imported from other disciplines such as Machine Learning and Artificial Intelligence to the field of textual data. In this paper, we focus on the concept of rule and the management of uncertainty in text applications. The different structures considered for the construction of the rules, the extraction of the knowledge base and the applications and usage of these rules are detailed. We include a review of the most relevant works of the different types of rules based on their representation and their application to most of the common tasks of Information Retrieval such as categorization, indexing and classification.

A Fuzzy Approach for Text Mining

International Journal of Mathematical Sciences and Computing, 2015

Document clustering is an integral and important part of text mining. There are two types of clustering: hard clustering and soft clustering. In hard clustering, a data item belongs to only one cluster, whereas in soft clustering a data point may fall into more than one cluster. Soft clustering thus leads to fuzzy clustering, in which each data point is associated with a membership function that expresses the degree to which it belongs to each cluster. Accuracy is desired in information retrieval, and it can be achieved by fuzzy clustering. In the work presented here, a fuzzy approach to text classification is used to assign documents to appropriate clusters using the Fuzzy C-Means (FCM) clustering algorithm. The Enron email dataset is used for the experiments. Using the FCM clustering algorithm, emails are grouped into different clusters. The results obtained are compared with the output produced by the k-means clustering algorithm. The comparative study showed that the fuzzy clusters are more appropriate than the hard clusters.
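The difference between hard and soft assignment described above can be sketched with a small, dependency-free version of the standard Fuzzy C-Means iteration. This is an illustrative sketch only: it runs on made-up 2-D points rather than email vectors, and the function name, the fuzzifier `m=2.0`, the fixed iteration count, and the random initialisation are assumptions, not the setup used in the paper.

```python
import math
import random
from typing import List, Sequence, Tuple


def fuzzy_c_means(
    points: Sequence[Sequence[float]],
    c: int,
    m: float = 2.0,      # fuzzifier; m > 1, larger values give softer clusters
    iters: int = 100,    # fixed number of iterations (no convergence test)
    seed: int = 0,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u: List[List[float]] = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([x / total for x in row])

    centers: List[List[float]] = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / total
                 for k in range(dim)]
            )
        # Membership update from relative distances to all centers.
        for i in range(n):
            d = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            u[i] = [
                1.0 / sum((d[j] / d[k]) ** exponent for k in range(c))
                for j in range(c)
            ]
    return centers, u


# Toy data: two well-separated groups of points (hypothetical).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_c_means(points, c=2)

# A hard (k-means style) labelling can be recovered by taking the
# cluster with the highest membership for each point.
hard_labels = [row.index(max(row)) for row in memberships]
print(hard_labels)
```

Keeping the full membership rows rather than only `hard_labels` is what lets a document count toward several clusters at once.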

An Enhanced Fuzzy Information Retrieval Model Based on Linguistics

2014

The paper proposes a linguistic-based, multi-view fuzzy ontology information retrieval model. It deals with multi-view, linguistic-based queries across multiple domains. The linguistic terms are user-defined and reflect the user's subjective view. The model also proposes a ranking algorithm that ranks the set of relevant documents according to criteria such as their relevance degree, confidence degree, and updating degree.

Current trends in information retrieval systems: review of fuzzy set theory and fuzzy Boolean retrieval models

Journal of Library Services and Technologies, 2020

This paper reviews the concept and goal of Information Retrieval Systems (IRSs). It also explains the synonymous concepts in Information Retrieval (IR), which include such terms as imprecision, vagueness, uncertainty, and inconsistency. Current trends in IRSs are discussed. Fuzzy Set Theory and Fuzzy Retrieval Models are reviewed. The paper also discusses extensions of Fuzzy Boolean Retrieval Models, including fuzzy techniques for documents’ indexing and flexible query languages. Fuzzy associative mechanisms were identified to include: (1) fuzzy pseudothesauri and fuzzy ontologies, which can be used to contextualize the search by expanding the set of index terms of documents; (2) an alternative use of fuzzy pseudothesauri and fuzzy ontologies, which is to expand the query with related terms by taking into account the varying importance of each additional term; and (3) fuzzy clustering techniques, where each document can be placed within several clusters with a given strength of belonging to each cluster, ...

Exploring the use of fuzzy signature for text mining

2010

The classical approaches to the traditional problems of text mining, such as document indexing, document clustering or text classification, represent the text as a bag of words. Words, the units of the representation, are determined by tokenization, using e.g. whitespace and punctuation characters as separators. Bag-of-words methods face problems with non-segmented text, typical of some Asian languages, since tokenization can no longer be applied to determine the representation units. Several solutions have been proposed so far; among them, frequent max substring mining is adopted here because of its language independence and favourable speed and storage requirements. In this paper we present a fuzzy signature based solution using frequent max substrings for non-segmented document representation, and propose how it could be applied to some typical text mining tasks. We show how the flexibility of fuzzy signatures can be exploited for text mining tasks. With the proposed concept, complex decision models in text mining may be constructed more effectively in the future.
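The tokenization problem mentioned above can be seen in a few lines. The sketch below builds a bag of words by splitting on non-word characters, which is one simple assumption about what separator-based tokenization means; the function name and the sample strings are hypothetical, and the frequent max substring method adopted in the paper is not implemented here.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lower-cased tokens, treating any run of non-word characters
    (whitespace, punctuation) as a separator."""
    return Counter(re.findall(r"\w+", text.lower()))


# Segmented text: separators recover the individual words.
english = bag_of_words("Fuzzy logic, fuzzy sets.")
print(english)

# Non-segmented text (a Chinese sentence written without spaces): there are
# no separators, so the whole sentence collapses into a single "word".
chinese = bag_of_words("我喜欢模糊逻辑")
print(len(chinese))
```

Because the second call yields one token for the entire sentence, term counts carry no useful information, which is the motivation for choosing representation units that do not depend on separators.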