Computing with words for text processing: An approach to the text categorization
Related papers
Fuzzy information retrieval model revisited
Fuzzy Sets and Systems, 2009
A new comprehensive model of information retrieval (IR) based on Zadeh's calculus of linguistic statements is proposed. Its characteristic and novel feature is the capability to take into account both the imprecision and uncertainty pervading the textual information representation. It extends earlier IR models based on broadly meant fuzzy logic. Moreover, some techniques for indexing documents and queries in the framework of this model are proposed. The results of the computational experiments on standard document collections are reported.
A new fuzzy logic based information retrieval model
2008
We propose a comprehensive model of information retrieval (IR) based on Zadeh's linguistic statements. Its characteristic feature is a capability to take into account both the imprecision and uncertainty pervading the textual information representation. It extends earlier IR models based on broadly meant fuzzy logic. Moreover, some techniques for obtaining quantitative representations of documents and queries are proposed.
Fuzzy Rules for Document Classification to Improve Information Retrieval
mirlabs.org
In this work, we present a method for generating fuzzy rules from text documents; the rules are used to classify documents and to improve information retrieval. With this method, we address the problem of high dimensionality in text documents for information retrieval. We also present a comparative analysis of the proposed method and well-known machine learning methods for classification. The aim of our work is to develop a mechanism to reduce the high dimensionality of the attribute-value matrix obtained from the documents and, consequently, to scale up the proposed classifier. Experiments were run on different domains in order to validate the proposed approach and compare its results with those obtained with the OneR, K-Nearest Neighbor, C4.5, Multi-variable Naive Bayes, and SVM methods. The experiments and the obtained results showed that this is a promising approach to the dimensionality problem of documents in information retrieval.
A novel fuzzy logic-based approach for textual documents indexing
Indonesian journal of electrical engineering and computer science, 2024
In the evolving landscape of information retrieval and natural language processing, the quest for more effective automatic keyword extraction (AKE) techniques from textual documents has become a pivotal research focus. Existing methodologies, while offering valuable insights, often grapple with the challenges posed by the imprecision and variability inherent in human language. This has led to a growing recognition of the need for innovative approaches to navigating the nuances of textual content more adeptly. In response to this imperative, this paper proposes a novel fuzzy indexing approach designed specifically for the indexing of textual documents. Fuzzy indexing, grounded in the principles of fuzzy logic, provides solutions for handling the inherent uncertainty and imprecision in natural language, especially when confronted with the intricacies of linguistic ambiguity and variability. By leveraging the power of fuzzy logic, we aim to enhance the precision of keyword extraction. This paper unfolds the intricacies of our fuzzy indexing approach, detailing the theoretical methodology. Through empirical evaluation and comparative analysis, we seek to demonstrate the efficacy of our approach in outperforming traditional methods in the context of fuzzy indexing for textual documents.
Linguistic aggregation operators of selection criteria in fuzzy information retrieval
International Journal of Intelligent Systems, 1995
A "softening" of the hard Boolean scheme for information retrieval is presented. In this approach, information retrieval is seen as a multicriteria decision-making activity in which the criteria to be satisfied by the potential solutions, i.e., the archived documents, are the requirements expressed in the query. The retrieval function is then an overall decision function evaluating the degree to which each potential solution satisfies a query consisting of information requirements aggregated by operators. Linguistic quantifiers and a connector dealing with primary and optional criteria are defined and introduced in the query language in order to specify the aggregation criteria of the single query requirements. These criteria make it possible for users to express queries in a simple and self-explanatory manner. In particular, linguistic quantifiers are defined which capture the intrinsic vagueness of information needs.
ON AN INTERPRETATION OF KEYWORDS WEIGHTS IN INFORMATION RETRIEVAL: SOME FUZZY LOGIC BASED APPROACHES
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2009
Relevant contributions of fuzzy logic to the logical models in information retrieval are studied. Fuzzy logic makes it possible to grasp the graduality of some relevant concepts and to model both the imprecision and the uncertainty inherent in the retrieval process, still within the framework of the broadly meant logical approach. In this perspective we discuss various extensions to the basic Boolean model which are needed to attain such greater expressivity. In particular, we show how the well-known semantics of keyword weights may be recovered in various fuzzy logic based information retrieval models.
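As a rough illustration of what a graded extension of the basic Boolean model can look like, the sketch below uses the standard fuzzy set operators (min for AND, max for OR, complement for NOT) over index weights in [0, 1]. It is not taken from the paper above: the documents, weights, query, and all function names are hypothetical, and the paper discusses several richer interpretations of keyword weights than this one.

```python
from typing import Dict

# Hypothetical index: each document maps a term to a weight in [0, 1],
# read here as the degree to which the document is "about" the term.
Doc = Dict[str, float]


def term(doc: Doc, t: str) -> float:
    """Degree to which the document satisfies a single-term request."""
    return doc.get(t, 0.0)


def f_and(*degrees: float) -> float:
    """Fuzzy conjunction (minimum t-norm)."""
    return min(degrees)


def f_or(*degrees: float) -> float:
    """Fuzzy disjunction (maximum t-conorm)."""
    return max(degrees)


def f_not(degree: float) -> float:
    """Standard fuzzy complement."""
    return 1.0 - degree


# Made-up documents and weights, for illustration only.
docs: Dict[str, Doc] = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.4},
    "d2": {"fuzzy": 0.5, "retrieval": 0.6, "boolean": 0.8},
}


def score(doc: Doc) -> float:
    """Evaluate the query: fuzzy AND (retrieval OR boolean)."""
    return f_and(term(doc, "fuzzy"),
                 f_or(term(doc, "retrieval"), term(doc, "boolean")))


# Documents are ranked by their degree of satisfaction, not just filtered.
ranking = sorted(docs, key=lambda name: score(docs[name]), reverse=True)
```

With crisp weights (0 or 1 only) these operators reduce to ordinary Boolean retrieval, which is why such models are usually described as extensions of the Boolean scheme.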
Semi-fuzzy quantifiers for information retrieval
Soft Computing in Web …, 2006
Recent research on fuzzy quantification for information retrieval has proposed the application of semi-fuzzy quantifiers for improving query languages. Fuzzy quantified sentences are useful as they allow additional restrictions to be imposed on the retrieval process, unlike more popular retrieval approaches, which lack the facility to accurately express information needs. For instance, fuzzy quantification supplies a variety of methods for combining query terms, whereas extended Boolean models can only handle extended Boolean-like operators to connect query terms. Although some experiments validating these advantages have been reported in recent works, a comparison against state-of-the-art techniques has not been addressed. In this work we provide empirical evidence on the adequacy of fuzzy quantifiers to enhance information retrieval systems. We show that our fuzzy approach is competitive with respect to models such as the vector-space model with pivoted document-length normalization, which is at the heart of some high-performance web search systems. These empirical results strengthen previous theoretical works that suggested fuzzy quantification as an appropriate technique for modeling information needs. In this respect, we demonstrate here the connection between the retrieval framework based on the concept of semi-fuzzy quantifier and the seminal proposals for modeling linguistic statements through Ordered Weighted Averaging operators (OWA).
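For readers unfamiliar with OWA operators, the following minimal sketch shows Yager's quantifier-guided OWA aggregation, where the weights are derived from a non-decreasing quantifier Q with Q(0) = 0 and Q(1) = 1 as w_i = Q(i/n) - Q((i-1)/n). It is not the semi-fuzzy quantifier framework of the paper above. The breakpoints 0.3 and 0.8 used for "most" are one common textbook choice and are an assumption here, as are all function names and the sample scores.

```python
from typing import Callable, List


def most(r: float) -> float:
    """Assumed piecewise-linear membership for the quantifier 'most'."""
    if r <= 0.3:
        return 0.0
    if r >= 0.8:
        return 1.0
    return (r - 0.3) / 0.5


def owa_weights(n: int, quantifier: Callable[[float], float]) -> List[float]:
    """Derive OWA weights from a quantifier: w_i = Q(i/n) - Q((i-1)/n)."""
    return [quantifier(i / n) - quantifier((i - 1) / n)
            for i in range(1, n + 1)]


def owa(values: List[float], weights: List[float]) -> float:
    """Weights attach to rank positions (largest value first), not to terms."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical degrees to which one document matches four query terms.
term_scores = [0.9, 0.2, 0.7, 0.6]

# Degree to which the document satisfies "most of the query terms".
score = owa(term_scores, owa_weights(len(term_scores), most))
```

Swapping the quantifier changes the behaviour of the aggregation: a quantifier that jumps to 1 immediately ("there exists") yields the maximum, one that stays at 0 until r = 1 ("all") yields the minimum, and "most" sits between the two.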