Computing with words for text processing: An approach to the text categorization (original) (raw)

The use of the computing with words paradigm for the automatic text documents categorization problem is discussed. This specific problem of information retrieval (IR) becomes more and more important, notably in view of a fast proliferation of textual information available on the Internet. The main issues that have to be addressed here are: document representation and classification. The use of fuzzy logic for both problems has already been quite deeply studied though for the latter, i.e., classification, generally not in an IR context. Our approach is based mainly on the classical calculus of linguistically quantified propositions proposed by Zadeh. Moreover, we employ results related to fuzzy (linguistic) queries in IR, notably various interpretations of the weights of query terms. Some preliminary results on widely adopted text corpora are presented.