Alleviating Classification Problem of Imbalanced Dataset

2013

Abstract

The class imbalance problem occurs when some classes have many more instances than others, i.e. the class distribution is skewed. In such cases, a standard classifier tends to be overwhelmed by the majority class and to ignore the minority class. It is one of the 10 challenging problems in data mining research and pattern recognition. An imbalanced dataset degrades classifier performance because accuracy is skewed towards the majority class. Several techniques have been proposed to solve this problem. This paper aims to improve the true positive rate (detection rate) of the minority class (GDM), which is the class of interest. The study proposes the use of two undersampling techniques reported in the literature. Each undersamples the majority class to balance the dataset before classification. These undersampling schemes were evaluated on three learning algorithms (decision tree, both pruned and unpruned, and RIPPER) using the Matthews Correlation Coefficient (MCC) a...
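To make the idea of balancing by undersampling concrete, here is a minimal Python sketch. The abstract does not name the two undersampling techniques it uses, so plain random undersampling stands in for them here as an assumption. The function names `random_undersample` and `mcc`, the 0/1 label encoding, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
import random


def random_undersample(X, y, majority_label, seed=0):
    """Randomly drop majority-class rows until both groups have equal size.

    Assumption: random undersampling is only an illustrative stand-in;
    the paper's two techniques are not named in the abstract.
    """
    rng = random.Random(seed)
    majority = [i for i, label in enumerate(y) if label == majority_label]
    minority = [i for i, label in enumerate(y) if label != majority_label]
    # Keep as many majority rows as there are minority rows.
    kept = rng.sample(majority, min(len(minority), len(majority)))
    idx = sorted(minority + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


def mcc(y_true, y_pred, positive):
    """Matthews Correlation Coefficient for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 8 majority rows (label 0), 2 minority rows (label 1).
X = list(range(10))
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_undersample(X, y, majority_label=0)
```

On the toy data, a classifier that always predicts the majority class reaches 80% accuracy yet has an MCC of 0.0, which illustrates why the abstract evaluates with MCC rather than accuracy alone.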

Sakinat Tijani -Folorunso hasn't uploaded this paper.
