Hierarchy-Debug: a scalable statistical technique for fault localization

Statistical bug localisation by supervised clustering of program predicates

International Journal of Information Systems and Change Management, 2018

Since the majority of faults may be revealed as the joint effect of program predicates on one another, this article proposes a new method for localising complex program bugs. The presented approach attempts to identify and select groups of interdependent predicates which together may affect the program failure. To find these groups, we suggest a supervised algorithm based on penalised logistic regression analysis. To provide the failure context, faulty sub-paths are recognised as sequences of fault-relevant predicates. Estimating the grouping effect of program predicates on the failure helps programmers in the multiple-bug setting. Several case studies have been designed to evaluate the proposed approach on well-known test suites. The evaluations show that our method produces more precise results than prior fault localisation techniques.
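The penalised-regression step described above can be sketched in miniature. The snippet below is an illustrative stand-in, not the paper's algorithm: it fits an L2-penalised logistic regression of run outcome on predicate truth values by plain gradient descent, on invented toy data, and checks which coefficient dominates.

```python
import numpy as np

def fit_penalized_logistic(X, y, lam=0.1, lr=0.5, iters=2000):
    """L2-penalised logistic regression via plain gradient descent.

    X: runs x predicates matrix (1 = predicate observed true in that run),
    y: 1 = failing run, 0 = passing run. lam is the ridge penalty weight.
    """
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        prob = 1.0 / (1.0 + np.exp(-(X @ w)))      # predicted failure probability
        grad = X.T @ (prob - y) / n + lam * w      # NLL gradient + penalty term
        w -= lr * grad
    return w

# Toy data (assumed for illustration): predicate 0 fires exactly in the
# failing runs, predicate 1 fires independently of the outcome.
X = np.array([[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0]], float)
y = np.array([1, 1, 1, 0, 0, 0], float)
w = fit_penalized_logistic(X, y)
```

With the penalty shrinking all coefficients, the failure-correlated predicate retains a clearly larger weight than the noise predicate, which is the selection signal such an approach exploits.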

Statistical software debugging: From bug predictors to the main causes of failure

2009

Detecting latent errors is a key challenge in the software testing process. Latent errors are best detected by bug predictors. A bug predictor manifests the effect of a bug on the program execution state. The aim has been to find the smallest reasonable subset of bug predictors that manifests all possible bugs within a program. In this paper, a new algorithm for finding the smallest subset of bug predictors is presented. The algorithm first applies a LASSO method to detect program predicates that have a relatively high effect on the termination status of the program. Then, a ridge regression method is applied to select a subset of the detected predicates as independent representatives of all the program predicates. Program control and data dependency graphs can then be applied to find the causes of the bugs represented by the selected bug predictors. Our proposed approach has been evaluated on two well-known test suites. The experimental results demonstrate the effectiveness and accuracy of the proposed approach.
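The first (LASSO) stage admits a compact sketch. The following is a toy illustration under invented data, not the paper's implementation, and the ridge-regression second stage is omitted. Coordinate descent with soft-thresholding drives the coefficient of an irrelevant predicate to exactly zero, which is what makes LASSO usable for predicate selection.

```python
import numpy as np

def soft_threshold(rho, lam):
    """Shrinkage operator: values inside [-lam, lam] collapse to zero."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam=0.5, iters=50):
    """Coordinate-descent LASSO: minimise 1/2 ||y - Xw||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        for j in range(p):
            # residual with feature j's current contribution added back
            r_j = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r_j
            w[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j])
    return w

# Toy spectra (assumed): predicate 0 is true exactly in the failing runs
# (y = 1); predicate 1 is noise. LASSO should zero out predicate 1.
X = np.array([[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0]], float)
y = np.array([1, 1, 1, 0, 0, 0], float)
weights = lasso_cd(X, y)
```

The surviving nonzero coefficients play the role of the candidate bug predictors that the second, ridge-based stage would then prune to independent representatives.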

FPA-Debug: Effective Statistical Fault Localization Considering Fault-proneness Analysis

2016

The aim is to identify faulty predicates which have a strong effect on program failure. Statistical debugging techniques are amongst the best methods for pinpointing defects within the program source code. However, they have some drawbacks: they require a large number of executions to identify faults, they might be adversely affected by coincidental correctness, and they do not take into consideration the fault-proneness associated with different parts of the program code while constructing behavioral models. Additionally, they do not consider the simultaneous impact of predicates on the program termination status. To deal with these problems, a new fault-proneness-aware approach based on elastic net regression, named FPA-Debug, is proposed in this paper. FPA-Debug employs a clustering-based strategy to alleviate coincidental correctness in fault localization and finds the smallest effective subset of program predicates, known as bug predictors. Moreover, the approach considers fault-pro...

SOBER: statistical model-based bug localization

2005

Automated localization of software bugs is one of the essential issues in debugging aids. Previous studies indicated that the evaluation history of program predicates may disclose important clues about underlying bugs. In this paper, we propose a new statistical model-based approach, called SOBER, which localizes software bugs without any prior knowledge of program semantics. Unlike existing statistical debugging approaches that select predicates correlated with program failures, SOBER models the evaluation patterns of predicates in both correct and incorrect runs and regards a predicate as bug-relevant if its evaluation pattern in incorrect runs differs significantly from that in correct ones. SOBER features a principled quantification of this pattern difference that measures the bug relevance of program predicates.
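SOBER's central notion, evaluation bias, is easy to state: for one run, pi(P) = n_t / (n_t + n_f), the fraction of evaluations in which predicate P was true. The sketch below is a simplified stand-in for SOBER's statistic, assuming invented run data: it compares the bias samples of failing and passing runs with a standardized mean difference rather than the paper's exact quantification.

```python
import math

def evaluation_bias(n_true, n_false):
    """pi(P): fraction of evaluations in one run where predicate P was true."""
    return n_true / (n_true + n_false)

def divergence_score(fail_biases, pass_biases):
    """Standardized difference of mean evaluation bias between failing and
    passing runs -- a simplified stand-in for SOBER's similarity statistic."""
    def mean(xs):
        return sum(xs) / len(xs)
    def var(xs, m):
        return sum((x - m) ** 2 for x in xs) / len(xs)
    mf, mp = mean(fail_biases), mean(pass_biases)
    se = math.sqrt(var(fail_biases, mf) / len(fail_biases)
                   + var(pass_biases, mp) / len(pass_biases)) or 1e-12
    return abs(mf - mp) / se

# Predicate A (assumed data): true far more often in failing runs.
fail_a = [evaluation_bias(9, 1), evaluation_bias(8, 2), evaluation_bias(9, 1)]
pass_a = [evaluation_bias(2, 8), evaluation_bias(1, 9), evaluation_bias(3, 7)]
# Predicate B: behaves identically in both kinds of runs.
fail_b = [evaluation_bias(5, 5), evaluation_bias(6, 4)]
pass_b = [evaluation_bias(5, 5), evaluation_bias(6, 4)]
score_a = divergence_score(fail_a, pass_a)
score_b = divergence_score(fail_b, pass_b)
```

A predicate whose bias distribution shifts between passing and failing runs (A) scores far above one that does not (B), which is exactly the ranking signal SOBER formalizes.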

IJERT-Fine Grained Statistical Debugging for the Identification of Multiple Bugs

International Journal of Engineering Research and Technology (IJERT), 2021

https://www.ijert.org/fine-grained-statistical-debugging-for-the-identification-of-multiple-bugs
https://www.ijert.org/research/fine-grained-statistical-debugging-for-the-identification-of-multiple-bugs-IJERTV10IS050308.pdf

Commercial software ships with undetected bugs despite the combined efforts of programmers, sophisticated bug detection tools and extensive testing. The identification and localization of bugs in software is therefore an essential issue in program debugging. Traditional software debugging is a difficult task that requires a lot of time, effort and a very good understanding of the source code. Given the scale and complexity of the job, automating the process of program debugging is essential, and our approach aims to do so. The earlier proposed approaches, namely statistical debugging and decision trees, were able to identify only the most frequently occurring bugs; they failed to identify masked, simultaneously occurring and infrequently occurring bugs. We propose two approaches: one is decision-tree based and the other uses bi-clustering. Our proposed approaches showed great improvements in terms of purity, misclassification rate and over-splitting, and were able to identify all the bugs present in the software, including the masked and infrequently occurring bugs.

Statistical Debugging Effectiveness as a Fault Localization Approach: Comparative Study

Journal of Software Engineering and Applications, 2016

Fault localization is an important topic in software testing, as it enables developers to pinpoint the fault location in their code. One of the dynamic fault localization techniques is statistical debugging. In this study, two statistical debugging algorithms, SOBER and Cause Isolation, are implemented, and experiments are conducted on five programs coded in Python as an example of a well-known dynamic programming language. Results showed that in programs that contain only a single bug, the two studied statistical debugging algorithms are very effective at localizing the bug. In programs that have more than one bug, the SOBER algorithm has limitations related to nested predicates, rarely observed predicates and complement predicates. Cause Isolation has limitations related to sorting predicates by importance and detecting bugs in predicate conditions. The accuracy of both SOBER and Cause Isolation is affected by program size. A quality comparison showed that the SOBER algorithm requires more code examination than Cause Isolation to discover the bugs.

FLUCCS: using code and change metrics to improve fault localization

Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2017

Fault localisation aims to support the debugging activities of human developers by highlighting the program elements that are suspected to be responsible for the observed failure. Spectrum Based Fault Localisation (SBFL), an existing localisation technique that only relies on the coverage and pass/fail results of executed test cases, has been widely studied but also criticised for the lack of precision and limited effort reduction. To overcome restrictions of techniques based purely on coverage, we extend SBFL with code and change metrics that have been studied in the context of defect prediction, such as size, age and code churn. Using suspiciousness values from existing SBFL formulae and these source code metrics as features, we apply two learn-to-rank techniques, Genetic Programming (GP) and linear rank Support Vector Machines (SVMs). We evaluate our approach with a tenfold cross validation of method level fault localisation, using 210 real world faults from the Defects4J repository. GP with additional source code metrics ranks the faulty method at the top for 106 faults, and within the top five for 173 faults. This is a significant improvement over the state-of-the-art SBFL formulae, the best of which can rank 49 and 127 faults at the top and within the top five, respectively. CCS Concepts: Software and its engineering → Search-based software engineering.
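The feature construction behind this style of learn-to-rank localisation can be sketched as follows. Everything here is illustrative: the method names, metric values and the fixed linear weights (standing in for a model that GP or a rank SVM would actually learn) are invented; only the Ochiai suspiciousness formula is standard.

```python
import math

def ochiai(ef, nf, ep):
    """Ochiai suspiciousness from spectrum counts: ef/nf = failing tests that
    do/don't cover the element, ep = passing tests that cover it."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

# Per-method tuples (name, ef, nf, ep, lines_of_code, churn) -- all invented.
methods = [
    ("Parser.parse", 5, 0, 1, 120, 9),   # covered by every failure, high churn
    ("Lexer.next",   5, 0, 40, 80, 1),   # covered by almost everything
    ("Util.format",  1, 4, 30, 30, 0),
]

# Fixed weights standing in for a learnt linear rank model:
# suspiciousness dominates, churn adds a little, size penalises slightly.
W = (1.0, 0.02, -0.001)

def score(ef, nf, ep, loc, churn):
    return W[0] * ochiai(ef, nf, ep) + W[1] * churn + W[2] * loc

ranking = sorted(methods, key=lambda m: score(*m[1:]), reverse=True)
```

The point of the combination is visible even in this toy: coverage alone cannot separate a method executed by every test from the truly faulty one, but the change metrics tip the ranking.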

An evaluation of similarity coefficients for software fault localization

Proceedings - 12th Pacific Rim International Symposium on Dependable Computing, PRDC 2006, 2006

Automated diagnosis of software faults can improve the efficiency of the debugging process, and is therefore an important technique for the development of dependable software. In this paper we study different similarity coefficients that are applied in the context of a program spectral approach to software fault localization (single programming mistakes). The coefficients studied are taken from the systems diagnosis / automated debugging tools Pinpoint, Tarantula, and AMPLE, and from the molecular biology domain (the Ochiai coefficient). We evaluate these coefficients on the Siemens Suite of benchmark faults, and assess their effectiveness in terms of the position of the actual fault in the probability ranking of fault candidates produced by the diagnosis technique. Our experiments indicate that the Ochiai coefficient consistently outperforms the coefficients currently used by the tools mentioned. In terms of the amount of code that needs to be inspected, this coefficient improves 5% on average over the next best technique, and up to 30% in specific cases.
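The coefficients compared in such studies are all simple functions of the four spectrum counts per program element. Below is a sketch of three of them (Tarantula, Jaccard and Ochiai) evaluated on one invented set of counts; the count values are assumptions for illustration only.

```python
import math

# Spectrum counts for one statement: ef/ep = failing/passing runs that
# executed it, nf/np_ = failing/passing runs that did not.

def tarantula(ef, ep, nf, np_):
    fail_rate = ef / (ef + nf) if ef + nf else 0.0
    pass_rate = ep / (ep + np_) if ep + np_ else 0.0
    total = fail_rate + pass_rate
    return fail_rate / total if total else 0.0

def jaccard(ef, ep, nf, np_):
    denom = ef + nf + ep
    return ef / denom if denom else 0.0

def ochiai(ef, ep, nf, np_):
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

counts = dict(ef=4, ep=2, nf=1, np_=13)
scores = {f.__name__: f(**counts) for f in (tarantula, jaccard, ochiai)}
```

Because the three formulae weight the same four counts differently, they produce different absolute scores for the same statement, and hence can produce different inspection rankings; the paper's finding is that Ochiai's weighting tends to place the actual fault highest.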

Statistical Debugging: A Hypothesis Testing-Based Approach

IEEE Transactions on Software Engineering, 2006

Manual debugging is tedious, as well as costly. The high cost has motivated the development of fault localization techniques, which help developers search for fault locations. In this paper, we propose a new statistical method, called SOBER, which automatically localizes software faults without any prior knowledge of the program semantics. Unlike existing statistical approaches that select predicates correlated with program failures, SOBER models the predicate evaluation in both correct and incorrect executions and regards a predicate as fault-relevant if its evaluation pattern in incorrect executions significantly diverges from that in correct ones. Featuring a rationale similar to that of hypothesis testing, SOBER quantifies the fault relevance of each predicate in a principled way. We systematically evaluate SOBER under the same setting as previous studies. The result clearly demonstrates its effectiveness: SOBER could help developers locate 68 out of the 130 faults in the Siemens suite by examining no more than 10 percent of the code, whereas the Cause Transition approach proposed by Cleve and Zeller [6] and the statistical approach by Liblit et al. [12] locate 34 and 52 faults, respectively. Moreover, the effectiveness of SOBER is also evaluated in an "imperfect world," where the test suite is either inadequate or only partially labeled. The experiments indicate that SOBER could achieve competitive quality under these harsh circumstances. Two case studies with grep 2.2 and bc 1.06 are reported, which shed light on the applicability of SOBER to reasonably large programs.

Empirical Evaluation of Fault Localisation Using Code and Change Metrics

IEEE Transactions on Software Engineering, 2019

Fault localisation aims to reduce the debugging effort of human developers by highlighting the program elements that are suspected to be the root cause of the observed failure. Spectrum Based Fault Localisation (SBFL), a coverage based approach, has been widely studied as a promising localisation technique. Recently, however, it has been argued that SBFL techniques have reached the limit of further improvement. To overcome this limitation, we extend SBFL with code and change metrics that have mainly been studied in defect prediction, such as size, age, and churn. FLUCCS, our learn-to-rank fault localisation technique, employs both existing SBFL formulae and these metrics as input. We investigate the effect of employing code and change metrics for fault localisation using four different learn-to-rank techniques: Genetic Programming, Gaussian Process Modelling, Support Vector Machines, and Random Forests. We evaluate the performance of FLUCCS with 386 real world faults collected from the Defects4J repository. The results show that FLUCCS with code and change metrics places 144 faults at the top and 304 faults within the top ten. This is a significant improvement over the state-of-the-art SBFL formulae, which can locate 65 and 212 faults at the top and within the top ten, respectively. We also investigate the feasibility of cross-project transfer learning for fault localisation. The results show that, while there exist project-specific properties that can be exploited for better localisation per project, ranking models learnt from one project can be applied to others without significant loss of effectiveness.