Document Classification of Assamese Text Using Naïve Bayes Approach

International Journal of Computer Trends and Technology, 2015

Abstract

Document classification has become an active research area owing to the abundance of documents available in digital form. It can be used to organize data into smaller, meaningful classes. Correctly assigning a document to a particular class remains a major challenge, particularly for Assamese text, as very little work has been done in this field. In this paper we perform document classification using a Naïve Bayes classifier. Among the various classification approaches, Naïve Bayes is a promising document classification model because of its simplicity. The aim of this paper is to assess the performance of Naïve Bayes in document classification. Each document is classified into one of four classes: sports, politics, law, and science. To build and evaluate the classification model, a total of 200 documents is split into two datasets, a training set and a testing set: 60% of the documents are used for training and the remaining 40% for testing. The results are validated using the statistical measures of precision and recall and their combination, the F-measure. The results show that Naïve Bayes is a good classifier.
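The pipeline summarized in the abstract (a 60/40 split, a Naïve Bayes classifier, and evaluation by precision, recall, and F-measure) can be illustrated with a minimal sketch. The abstract does not say which Naïve Bayes variant, tokenization, or smoothing the authors used, so the multinomial model with add-one (Laplace) smoothing below is an assumption, as are the function names `split_60_40`, `train_nb`, `predict_nb`, and `precision_recall_f1`. The toy documents are hypothetical English placeholder tokens, not the paper's Assamese corpus.

```python
import math
from collections import Counter, defaultdict

def split_60_40(n_docs):
    """Return (train_size, test_size) for a 60/40 split, using integer math."""
    train_size = n_docs * 60 // 100
    return train_size, n_docs - train_size

def train_nb(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes model (assumed variant)."""
    class_counts = Counter(labels)        # number of documents per class
    word_counts = defaultdict(Counter)    # per-class token frequencies
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior, with add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def precision_recall_f1(y_true, y_pred, label):
    """Per-class precision, recall, and F-measure (harmonic mean of the two)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical toy corpus: one tokenized document per class.
train_docs = [
    ["match", "goal", "team"],
    ["election", "vote", "party"],
    ["court", "judge", "verdict"],
    ["experiment", "physics", "theory"],
]
train_labels = ["sports", "politics", "law", "science"]
model = train_nb(train_docs, train_labels)
```

With the 200 documents mentioned in the abstract, `split_60_40(200)` gives 120 training and 80 testing documents. Calling `predict_nb(model, ["team", "goal"])` on the toy model selects `"sports"`, since only that class has seen both tokens. Real Assamese text would need its own tokenizer and, most likely, stop-word handling, neither of which the abstract describes.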

moromi gogoi hasn't uploaded this paper.
