A New Two-Stage Approach to the Multiaspect Text Categorization

2015, 2015 IEEE Symposium Series on Computational Intelligence

We consider a particular type of text categorization problem, which we refer to as multiaspect classification. It is inspired by a practical scenario of business document management in a company, but it has broader application potential. The distinguishing feature of the problem is the existence of two classification schemes. The first is based on a traditional, static set of text categories, possibly arranged into a hierarchy. The second is based on a dynamic structure of sequences of documents, referred to as cases, identified within each category. While the former problem may be addressed using one of the well-known text categorization (classification) techniques, the latter seems to require a distinct approach, both because the set of cases is not known in advance and because, if a case is interpreted as a classic category, only a limited number of training documents is assumed to be available for it. In the paper, we discuss the problem in more detail and show the applicability of an intuitively appealing two-stage approach to solving this multiaspect text categorization problem.
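To make the two classification schemes concrete, the following is a minimal illustrative sketch and not the method of the paper, whose details the abstract does not give. It assumes bag-of-words vectors with cosine similarity, a nearest-profile rule for the first stage (category), and a similarity threshold for the second stage that either assigns the document to an existing case within that category or opens a new one. The class `TwoStageClassifier`, its methods, and the `new_case_threshold` parameter are all hypothetical names introduced for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words term counts (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TwoStageClassifier:
    """Hypothetical sketch: stage 1 picks a category, stage 2 picks a case."""

    def __init__(self, new_case_threshold=0.2):
        # Assumed rule: below this similarity, a document starts a new case.
        self.threshold = new_case_threshold
        self.category_profiles = {}  # category -> summed term counts
        self.cases = {}              # category -> list of case profiles

    def fit_categories(self, labeled_docs):
        """Build one profile per static category from (text, category) pairs."""
        for text, category in labeled_docs:
            profile = self.category_profiles.setdefault(category, Counter())
            profile.update(bow(text))
            self.cases.setdefault(category, [])

    def classify(self, text):
        """Return (category, case index within that category)."""
        vec = bow(text)
        # Stage 1: the static category with the most similar profile.
        category = max(
            self.category_profiles,
            key=lambda c: cosine(vec, self.category_profiles[c]),
        )
        # Stage 2: the most similar existing case, or a new case.
        cases = self.cases[category]
        scores = [cosine(vec, case) for case in cases]
        if scores and max(scores) >= self.threshold:
            case_id = scores.index(max(scores))
            cases[case_id].update(vec)  # the case grows as documents arrive
        else:
            cases.append(Counter(vec))
            case_id = len(cases) - 1
        return category, case_id
```

In this sketch the set of cases starts empty and grows as documents arrive, which mirrors the point that cases are not known in advance; the threshold value is arbitrary and would need tuning on real data.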
