Predicting software aging related bugs from imbalanced datasets by using data mining techniques
Software aging bugs are related to the lifespan of the software. Rebooting is one solution to this problem; however, it is time-consuming and causes a loss of resources. These bugs are difficult to detect during the time-limited software testing process. Data mining techniques can be used to predict whether a piece of software has aging-related bugs. The available datasets of software aging bugs present a challenge because they are imbalanced: the number of data points with bugs is very small compared to the number of data points without bugs, yet it is the rare class (bugs) that is important to predict. In this paper, we carried out experiments with a dataset of aging-related bugs found in an open-source project, the MySQL DBMS. Data mining techniques developed for imbalanced datasets were compared with general data mining techniques, using various performance measures. The results suggest that techniques developed for imbalanced datasets are more useful for correctly predicting the data points associated with aging-related bugs. These techniques also performed better than general data mining techniques on the G-mean measure, which is an important performance measure for imbalanced datasets.
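To illustrate why G-mean matters on imbalanced data, the following minimal sketch assumes the common definition of G-mean as the geometric mean of sensitivity (recall on the buggy class) and specificity (recall on the non-buggy class). The abstract does not spell out the formula, and the labels below are invented for illustration; they are not taken from the MySQL dataset. The function names `g_mean` and `accuracy` are hypothetical helpers.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of data points whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class being absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Invented toy labels: 5 buggy data points (1) and 95 clean ones (0).
y_true = [1] * 5 + [0] * 95

# A classifier that always predicts "no bug".
always_clean = [0] * 100

# A classifier that finds 4 of the 5 bugs but raises 10 false alarms.
detector = [1] * 4 + [0] * 1 + [1] * 10 + [0] * 85

for name, y_pred in [("always_clean", always_clean), ("detector", detector)]:
    print(name, round(accuracy(y_true, y_pred), 3), round(g_mean(y_true, y_pred), 3))
```

In this toy setting the always-clean classifier has the higher accuracy, yet its G-mean is zero because it never identifies a bug. The detector loses some accuracy but scores well on G-mean, which is the behavior one wants from a measure that rewards prediction of the rare class.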