Classifying Political Arabic Articles Using Support Vector Machine with Different Feature Extraction (original) (raw)

In the recent years, the number of web logs, and the amount of opinionated data on the World Wide Web, have been grown substantially. The ability to determine the political orientation of an article automatically can be beneficial in many areas from academia to security. However, the sentiment classification of web log posts (political web log posts in particular), is apparently more complex than the sentiment classification of conventional text. In this paper, a supervised machine learning with two feature extraction techniques Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) are used for the classification process. For investigation, SVM with four kernels for supervised machine learning have been employed. Subsequent to testing, the results reveal that the linear with TF achieved the results in accuracy of 91.935% also with TF-IDF achieved the 95.161%. The linear kernel was deemed the most suitable for our model.