A Bayesian Classification Approach Using Class-Specific Features for Text Categorization
Related papers
A Novel Fuzzy-Bayesian Classification Method for Automatic Text Categorization
Text categorization labels documents automatically with topics from a predefined set, and a large number of advanced machine learning algorithms have been applied to it. In the proposed system, a fuzzy rule is combined with a Bayesian classification method for automatic text categorization using class-specific features. The proposed method selects a particular feature subset for each class and then applies these class-specific features for classification. To achieve this, Baggenstoss's PDF Projection Theorem is used to reconstruct the PDF in the raw data space from the class-specific PDF in a low-dimensional feature space, and a fuzzy Bayes classification rule is built on it. A noticeable advantage of this method is that most feature selection criteria, such as information gain and maximum discrimination, can be easily incorporated into it. The classification performance of the proposed method is evaluated on different datasets and compared with different feature selection methods. The experimental results illustrate the effectiveness of the proposed method and further indicate its wide applicability in text categorization.
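As a rough illustration of the approach the abstract describes, the sketch below builds a Bayes rule from class-specific feature subsets using the PDF Projection Theorem, p(x|c) ∝ p_z(T_c(x)|c) / p_z(T_c(x)|H0). The diagonal-Gaussian density model, the pooled-data reference hypothesis H0, and the `ClassSpecificBayes` class are illustrative assumptions, not the authors' implementation; the shared raw-data factor p(x|H0) cancels in the argmax, so only the density ratio over each class's own features is needed.

```python
# Minimal sketch (assumptions noted above) of a class-specific-feature
# Bayes rule built via PDF projection: score(c) = log P(c)
#   + log p_z(T_c(x) | c) - log p_z(T_c(x) | H0).
import numpy as np

class ClassSpecificBayes:
    def __init__(self, subsets):
        self.subsets = subsets          # dict: class label -> feature index array

    @staticmethod
    def _fit_gauss(Z):
        return Z.mean(axis=0), Z.var(axis=0) + 1e-6   # diagonal Gaussian, smoothed

    @staticmethod
    def _log_pdf(Z, mu, var):
        return (-0.5 * (np.log(2 * np.pi * var) + (Z - mu) ** 2 / var)).sum(axis=1)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = {c: np.log((y == c).mean()) for c in self.classes_}
        self.class_pdf_ = {}            # p_z(z | c) on class c's own features
        self.ref_pdf_ = {}              # p_z(z | H0): pooled data, same features
        for c in self.classes_:
            idx = self.subsets[c]
            self.class_pdf_[c] = self._fit_gauss(X[y == c][:, idx])
            self.ref_pdf_[c] = self._fit_gauss(X[:, idx])
        return self

    def predict(self, X):
        scores = np.column_stack([
            self.priors_[c]
            + self._log_pdf(X[:, self.subsets[c]], *self.class_pdf_[c])
            - self._log_pdf(X[:, self.subsets[c]], *self.ref_pdf_[c])
            for c in self.classes_
        ])
        return self.classes_[scores.argmax(axis=1)]
```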
A Probabilistic Approach to Feature Selection for Multi-class Text Categorization
Lecture Notes in Computer Science, 2007
In this paper, we propose a probabilistic approach to feature selection for multi-class text categorization. Specifically, we regard the document class and the occurrence of each feature as events, calculate the probability of occurrence of each feature by the theorem of total probability, and use the resulting values as a ranking criterion. Experiments on the Reuters-2000 collection show that the proposed method can yield better performance than information gain and χ², two well-known feature selection methods.
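Read literally, the ranking criterion above is score(t) = Σ_c P(c)·P(t|c) by the theorem of total probability. The sketch below is a minimal rendering of that criterion under the assumption of maximum-likelihood estimates from a binary document-term matrix; the paper's actual estimation details may differ.

```python
# Minimal sketch (an assumption, not the paper's code) of total-probability
# feature ranking: score(t) = sum_c P(c) * P(t | c).
import numpy as np

def total_probability_scores(X, y):
    """X: binary document-term matrix (n_docs, n_terms); y: class labels."""
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / len(y)                      # P(c)
    scores = np.zeros(X.shape[1])
    for c, p_c in zip(classes, priors):
        p_t_given_c = X[y == c].mean(axis=0)      # P(t | c), ML estimate
        scores += p_c * np.asarray(p_t_given_c).ravel()
    # note: with pure ML estimates this equals the corpus-wide P(t);
    # the paper presumably refines the per-class estimates
    return scores

# Usage: keep the k top-ranked terms
# top_k = np.argsort(-total_probability_scores(X, y))[:k]
```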
An Evident Theoretic Feature Selection Approach for Text Categorization
enggjournals.com
With the exponential growth of textual documents available in unstructured form on the Internet, feature selection approaches are increasingly significant in the preprocessing of textual documents for automatic text categorization. Feature selection, which focuses on identifying relevant and informative features, can help reduce the computational cost of processing voluminous amounts of data as well as increase the effectiveness of the subsequent text categorization tasks. In this paper, we propose a new evidence-theoretic feature selection approach for text categorization based on the transferable belief model (TBM). An evaluation of the performance of the proposed approach on benchmark datasets is also presented. We empirically show that our approach outperforms traditional feature selection methods on two standard benchmark datasets.
Efficient Text Categorization using Naïve Bayes Classification
2017
Text classification is the task of automatically sorting a set of documents into categories from a predefined set. It is a data mining technique used to predict group membership for data instances within a given dataset, classifying data into different classes subject to certain constraints. Instead of the conventional feature selection techniques used for text document classification, we present a new model based on probability and the overall class frequency of a term. The Naive Bayesian classifier is based on Bayes' theorem with independence assumptions between predictors. A Naive Bayesian model is easy to build, with no complicated iterative parameter estimation, which makes it particularly useful for large datasets. The paper demonstrates that the new probabilistic interpretation of tf×idf term weighting may lead to a better understanding of statistical ranking mechanisms.
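A minimal, self-contained example of the baseline the abstract builds on, a Naive Bayes classifier over tf×idf term weights, can be written with scikit-learn; the tiny dataset and pipeline below are purely illustrative.

```python
# Minimal sketch (illustrative, not the paper's code): Naive Bayes text
# classification over tf-idf weights with scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["cheap flights to rome", "stock prices fall", "book a hotel room"]
labels = ["travel", "finance", "travel"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(docs, labels)
print(model.predict(["hotel booking deals"]))   # -> ['travel']
```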
Information Gain and Divergence-Based Feature Selection for Machine Learning-Based Text Categorization
Information Processing & Management, 2006
Most previous work on feature selection emphasized only the reduction of the high dimensionality of the feature space. But in cases where many features are highly redundant with each other, we must resort to other means, for example, more complex dependence models such as Bayesian network classifiers. In this paper, we introduce a new information gain and divergence-based feature selection method for statistical machine learning-based text categorization that does not rely on more complex dependence models. Our feature selection method strives to reduce redundancy between features while maintaining information gain in selecting appropriate features for text categorization. Empirical results are given on a number of datasets, showing that our feature selection method is more effective than Koller and Sahami's method [Koller, D., & Sahami, M. (1996). Toward optimal feature selection. In Proceedings of ICML-96, 13th international conference on machine learning], which is one of the greedy feature selection methods, and conventional information gain, which is commonly used in feature selection for text categorization. Moreover, with our feature selection method, conventional machine learning algorithms sometimes improve enough to outperform support vector machines, which are known to give the best classification accuracy.
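The abstract does not spell out how redundancy is scored, so the sketch below is one plausible reading: greedily pick terms with high information gain while rewarding divergence (here, the symmetric KL / Jeffreys divergence between smoothed class-conditional distributions P(c|t)) from terms already chosen. The weighting `lam` and the divergence choice are assumptions.

```python
# Minimal sketch (one plausible reading, not the authors' exact criterion):
# greedy selection by information gain plus a divergence-based redundancy bonus.
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def class_dist(X, y, t, classes):
    """Smoothed P(c | term t present)."""
    has_t = X[:, t] > 0
    counts = np.array([((y == c) & has_t).sum() for c in classes], float)
    return (counts + 1) / (has_t.sum() + len(classes))

def information_gain(X, y, t, classes):
    prior = np.array([(y == c).mean() for c in classes])
    has_t = X[:, t] > 0
    p_t = has_t.mean()
    absent = np.array([((y == c) & ~has_t).sum() for c in classes], float)
    p_c_absent = (absent + 1) / ((~has_t).sum() + len(classes))
    return (entropy(prior) - p_t * entropy(class_dist(X, y, t, classes))
            - (1 - p_t) * entropy(p_c_absent))

def select_terms(X, y, k, lam=0.5):
    classes = np.unique(y)
    n_terms = X.shape[1]
    ig = np.array([information_gain(X, y, t, classes) for t in range(n_terms)])
    dist = [class_dist(X, y, t, classes) for t in range(n_terms)]
    chosen = [int(np.argmax(ig))]
    while len(chosen) < k:
        def score(t):
            # Jeffreys (symmetric KL) divergence to the most similar chosen
            # term: large divergence = low redundancy.
            d = min(((dist[t] - dist[s]) * np.log2(dist[t] / dist[s])).sum()
                    for s in chosen)
            return ig[t] + lam * d
        remaining = [t for t in range(n_terms) if t not in chosen]
        chosen.append(max(remaining, key=score))
    return chosen
```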
2011
In this paper, we compare several aspects of automatic text categorization, including document representation, feature selection, and three classifiers, and apply them to text collections in two languages. Regarding the computational representation of documents, we compare the traditional bag-of-words representation with four alternative representations: bag of multiwords and bag of word prefixes with N characters (for N = 4, 5, and 6). Concerning feature selection, we compare the well-known feature selection metrics Information Gain and Chi-Square with a new one based on third-moment statistics that enhances rare terms. As for the classifiers, we compare the well-known Support Vector Machine and K-Nearest Neighbor classifiers with a classifier based on the Mahalanobis distance. Finally, the study is language-independent and was applied to two document collections, one written in English (Reuters-21578) and the other in Portuguese (Folha de São Paulo).
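The bag-of-word-prefixes representation is simple to reproduce: truncate each word to its first N characters before counting, so that morphological variants collapse onto one feature. A minimal sketch:

```python
# Illustrative sketch of the "bag of word prefixes" representation: variants
# like "categorization" and "categorizes" share the prefix feature "categ".
from collections import Counter

def bag_of_prefixes(text, n=5):
    return Counter(w[:n] for w in text.lower().split())

print(bag_of_prefixes("categorization categorizes categories", n=5))
# Counter({'categ': 3})
```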
A framework of feature selection methods for text categorization
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - ACL-IJCNLP '09, 2009
In text categorization, feature selection (FS) is a strategy that aims at making text classifiers more efficient and accurate. However, when dealing with a new task, it is still difficult to quickly select a suitable method from the many FS methods proposed in previous studies. In this paper, we propose a theoretical framework of FS methods based on two basic measurements: frequency measurement and ratio measurement. Six popular FS methods are then discussed in detail under this framework. Moreover, guided by our theoretical analysis, we propose a novel method called weighed frequency and odds (WFO) that combines the two measurements with trained weights. Experimental results on datasets from both topic-based and sentiment classification tasks show that the new method is robust across different tasks and numbers of selected features.
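The abstract does not give the WFO formula; a commonly cited form, used in the hedged sketch below, is WFO(t,c) = P(t|c)^λ · [log(P(t|c)/P(t|c̄))]^(1−λ) when P(t|c) > P(t|c̄) and 0 otherwise, so λ = 1 recovers a pure frequency measurement and λ = 0 a pure log-odds-ratio measurement. The smoothing constants are assumptions.

```python
# Hedged sketch of WFO scoring (formula as commonly presented, not taken
# from this abstract); returns one score per term for class c.
import numpy as np

def wfo(X, y, c, lam=0.5, eps=1e-6):
    """X: binary doc-term matrix (n_docs, n_terms); y: class labels."""
    in_c = (y == c)
    p_pos = (X[in_c].sum(axis=0) + eps) / (in_c.sum() + 2 * eps)      # P(t|c)
    p_neg = (X[~in_c].sum(axis=0) + eps) / ((~in_c).sum() + 2 * eps)  # P(t|c_bar)
    ratio = np.log(p_pos / p_neg)
    # score only terms that occur more often inside the class than outside it
    return np.where(ratio > 0,
                    p_pos ** lam * np.maximum(ratio, eps) ** (1 - lam),
                    0.0)
```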
Text categorization is the task of automatically sorting a set of documents into categories from a predefined set; that is, it assigns predefined categories to free-text documents. In this paper we propose a two-stage feature selection method for text categorization using information gain, principal component analysis, and a genetic algorithm. In the first stage, every term in the documents is ranked by its importance for classification using the information gain (IG) method. In the second stage, the genetic algorithm (GA) and principal component analysis (PCA) feature selection and feature extraction methods are applied individually to the terms ranked in decreasing order of importance, and dimension reduction is carried out. Thus, terms of low importance are ignored throughout the categorization, and feature selection and extraction are applied only to the most important terms, reducing the computational time and complexity of categorization. To analyze the dimension reduction in our proposed model, experiments are conducted using the k-nearest neighbor (KNN) and C4.5 decision tree algorithms on selected datasets.
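A compressed sketch of the two-stage idea, with scikit-learn's `mutual_info_classif` standing in for information gain and PCA as the second-stage reducer (the GA branch is omitted); the cutoffs `m` and `d` are illustrative:

```python
# Illustrative two-stage reduction: IG-style ranking, then PCA on the
# surviving terms. Assumes a dense feature matrix X and labels y.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.decomposition import PCA

def two_stage_reduce(X, y, m=1000, d=100):
    ig = mutual_info_classif(X, y)          # stage 1: rank terms by information
    top = np.argsort(-ig)[:m]               # keep the m most informative terms
    return PCA(n_components=d).fit_transform(X[:, top])  # stage 2: project to d dims
```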
A novel feature selection algorithm for text categorization
Expert Systems with Applications, 2007
With the development of the web, large numbers of documents are available on the Internet, and digital libraries, news sources, and companies' internal data keep growing. Automatic text categorization is therefore increasingly important for dealing with massive data, but its major problem is the high dimensionality of the feature space. Although many text feature selection methods already exist, we present another approach to improve the performance of text categorization. Our study is based on Gini index theory, and we design a novel Gini index algorithm to reduce the high dimensionality of the feature space. A new Gini index measure function is constructed and adapted to text categorization. Experimental results show that our improved Gini index performs better than other feature selection methods.
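The abstract does not state the measure function; one widely cited improved Gini index for text scores a term as Gini(t) = Σ_c P(t|c)²·P(c|t)², favoring terms whose occurrences concentrate in few classes. The sketch below implements that form as an assumption, not necessarily the paper's exact function.

```python
# Hedged sketch of a Gini-index-style feature score for text:
#   Gini(t) = sum_c P(t|c)^2 * P(c|t)^2
import numpy as np

def gini_scores(X, y, eps=1e-9):
    """X: binary doc-term matrix (n_docs, n_terms); y: class labels."""
    scores = np.zeros(X.shape[1])
    df = X.sum(axis=0) + eps                      # docs containing each term
    for c in np.unique(y):
        df_c = X[y == c].sum(axis=0)              # class-c docs containing term
        p_t_c = df_c / max((y == c).sum(), 1)     # P(t|c)
        p_c_t = df_c / df                         # P(c|t)
        scores += (p_t_c ** 2) * (p_c_t ** 2)
    return scores                                 # rank terms by descending score
```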