Software and Libraries for Imbalanced Classification

2018

Abstract

Over the years, researchers in imbalanced classification have proposed a large number of approaches to address this problem. To keep advancing this area of study, it is essential to make these methods available to the research community. Doing so offers a double advantage: (1) the features and capabilities of the algorithms can be analyzed in depth; and (2) any novel proposal can be compared fairly against them. With this in mind, different open source libraries and software packages for imbalanced classification are available, built on different platforms. In this chapter, we compile the most significant ones, focusing on their main characteristics and the methods they include, from standard data mining (DM) to Big Data applications. Our intention is to offer researchers, practitioners and corporations a non-exhaustive list of alternatives for applying diverse algorithms to their problems, in order to achieve the most accurate results with the lowest effort. To present these software tools, this chapter is organized as follows. First, Sect. 14.1 stresses the significance of software implementations for imbalanced classification. Then, Sect. 14.2 introduces the Java tools, i.e. KEEL [2] and WEKA [17]. Next, Sect. 14.3 focuses on different R packages. The "imbalanced-learn" Python toolbox [29], which builds on "scikit-learn" [39], is described in Sect. 14.4. Big Data solutions under Spark [26] are summarized in Sect. 14.5. Finally, Sect. 14.6 provides some concluding remarks.
