Localizing State-Dependent Faults Using Associated Sequence Mining

Fault-localization techniques for software systems

ACM SIGSOFT Software Engineering Notes, 2014

Software is a major component of any computer system. To maintain software quality, faults must be localized early. Researchers have applied many different fault-localization methods. Ideally, a fault-localization method detects as many faults as possible using the least resources, but in general it is hard to predict a test suite's fault-localization capability. This paper reviews previous studies related to software fault-localization methods, surveying journal and conference papers on fault localization and the various localization methods proposed in the literature.

Heuristics for automatic localization of software faults

1992

Abstract: Developing effective debugging strategies to guarantee the reliability of software is important. By analyzing the debugging process used by experienced programmers, four distinct tasks are found to be consistently performed: (1) determining statements involved in program failures, (2) selecting suspicious statements that might contain faults, (3) making hypotheses about suspicious faults (variables and locations), and (4) restoring program state to a specific statement for verification.

OSD: A Source Level Bug Localization Technique Incorporating Control Flow and State Information in Object Oriented Program

Bug localization in object-oriented programs has always been an important issue in software engineering. In this paper, I propose a source-level bug localization technique for object-oriented embedded programs. My proposed technique presents the idea of debugging an object-oriented program at the class level, incorporating object state information into the Class Dependence Graph (ClDG). Given a program (containing a buggy statement) and an input that fails while others pass, my approach uses concrete as well as symbolic execution to synthesize passing inputs that differ marginally from the failing input in their control-flow behavior. A comparison of the execution traces of the failing input and the passing input provides the necessary clues to the root cause of the failure. A state-trace difference over the respective nodes of the ClDG is obtained, which leads to detecting the bug in the program.
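The core idea of comparing failing and passing traces can be sketched in a few lines. This is a minimal illustration, not the paper's technique: the trace node names are invented, and real ClDG nodes would carry state information as well.

```python
# Hypothetical sketch: compare control-flow traces of a failing and a
# passing run to narrow down suspicious nodes (node names are invented).
def trace_difference(failing_trace, passing_trace):
    """Return nodes executed only in the failing run, preserving order."""
    passing = set(passing_trace)
    seen = set()
    diff = []
    for node in failing_trace:
        if node not in passing and node not in seen:
            diff.append(node)
            seen.add(node)
    return diff

failing = ["init", "read", "branch_true", "update_state", "fail_check"]
passing = ["init", "read", "branch_false", "fail_check"]
print(trace_difference(failing, passing))  # ['branch_true', 'update_state']
```

The two nodes unique to the failing run are exactly the "clues to the root cause" the abstract describes; the paper's approach additionally diffs the object state recorded at each node.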

W.E. Wong and T.H. Tse (eds.), Handbook of Software Fault Localization: Foundations and Advances

Wiley-IEEE Press, Hoboken, NJ, USA, xiv + 590 pages, 2023


Extended comprehensive study of association measures for fault localization

Journal of Software: Evolution and Process, 2013

Abstract: Spectrum-based fault localization is a promising approach to automatically locate root causes of failures quickly. Two well-known spectrum-based fault localization techniques, Tarantula and Ochiai, measure how likely a program element is a root cause of failures based on profiles of correct and failed program executions. These techniques are conceptually similar to association measures that have been proposed in statistics and data mining and utilized to quantify the relationship strength between two variables of interest (e.g., the use of a medicine and the cure rate of a disease). In this paper, we view fault localization as a measurement of the relationship strength between the execution of program elements and program failures. We investigate the effectiveness of 40 association measures from the literature on locating bugs. Our empirical evaluations involve single-bug and multiple-bug programs. We find there is no best single measure for all cases. Klosgen and Oc...
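The two measures named above are simple functions of a program element's execution spectrum. A minimal sketch, using toy spectrum counts (ef = failing tests covering the element, ep = passing tests covering it, nf/np = failing/passing tests not covering it):

```python
import math

def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def tarantula(ef, ep, nf, np_):
    """Tarantula suspiciousness: normalized failing rate vs passing rate."""
    f, p = ef + nf, ep + np_
    fail_rate = ef / f if f else 0.0
    pass_rate = ep / p if p else 0.0
    total = fail_rate + pass_rate
    return fail_rate / total if total else 0.0

# Toy spectra for three statements: (ef, ep, nf, np).
spectra = {"s1": (3, 1, 0, 4), "s2": (1, 3, 2, 2), "s3": (0, 4, 3, 1)}
ranking = sorted(spectra, key=lambda s: ochiai(*spectra[s]), reverse=True)
print(ranking)  # ['s1', 's2', 's3']
```

Here s1, covered by all three failing tests and only one passing test, is ranked most suspicious; any of the paper's 40 association measures can be substituted for `ochiai` in the sort key.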

DeLLIS: A data mining process for fault localization

2009

Most fault localization methods aim at totally ordering program elements from highly suspicious to innocent. This ignores the structure of the program and creates clusters of program elements where the relations between the elements are lost. We propose a data mining process that computes program element clusters and that also shows dependencies between program elements. Experimentations show that our process gives a comparable number of lines to analyze than the best related methods while providing a richer environment for the analysis. We also show that the method scales up by tuning the statistical indicators of the data mining process.
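One concrete reason total orders lose structure, as the abstract argues, is that elements with identical coverage behavior are indistinguishable to any spectrum measure and naturally form a cluster. The sketch below illustrates that observation; it is not the DeLLIS algorithm itself, which mines richer dependencies between clusters.

```python
from collections import defaultdict

def cluster_by_spectrum(coverage):
    """Group program elements whose coverage vectors are identical.

    coverage: dict mapping element -> tuple of 0/1 flags, one per test.
    Elements always executed together cannot be separated by any
    spectrum-based ranking, so they belong in one cluster.
    """
    clusters = defaultdict(list)
    for element, vector in coverage.items():
        clusters[vector].append(element)
    return [sorted(members) for members in clusters.values()]

cov = {
    "s1": (1, 1, 0),
    "s2": (1, 1, 0),  # same spectrum as s1 -> same cluster
    "s3": (0, 1, 1),
}
print(sorted(cluster_by_spectrum(cov)))  # [['s1', 's2'], ['s3']]
```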

Mining Sequential Patterns of Predicates for Fault Localization and Understanding

2013 IEEE 7th International Conference on Software Security and Reliability - SERE 2013, 2013

Fault localization has been widely recognized as one of the most costly activities in software engineering. Most existing techniques target a single faulty entity as the root cause of a failure. However, these techniques often fail to reveal the context of a failure, which can be valuable for developers and testers to understand and correct faults. Thus, some tentative solutions have been proposed to localize faults as sequences of software entities. However, as far as we know, none of these pioneering works consistently handles execution data in a sequence-oriented way, i.e., they analyze the suspiciousness of software entities separately before or after the construction of a faulty sequence. In this paper, we establish a systematic framework based on sequential-pattern mining to assist fault localization. We model the executions of test cases as sequences of predicates. Our framework outputs sequential patterns which are more likely related to the actual faults based on a 3-stage procedure: a preprocessing stage to prune sequences of predicates, a mining stage to discover candidate sequential patterns based on the revised SPADE mining algorithm, and a ranking stage to obtain the top-K results according to our novel metrics. The obtained sequential patterns of predicates can not only provide information about the locations of faults, but also convey valuable context information for understanding the root causes of software failures. A preliminary experiment on some widely used benchmarks was conducted to evaluate the performance of our framework. The experimental results show that our technique is effective and efficient in revealing causes of failures.
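The sequence-oriented view can be illustrated with a toy version of the mining stage: score candidate predicate patterns by how much more often they occur as subsequences of failing runs than of passing runs. This is a deliberately simplified stand-in for the revised SPADE algorithm and the paper's ranking metrics; the predicate names and the score are invented for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` as an ordered (possibly
    non-contiguous) subsequence."""
    it = iter(sequence)
    return all(p in it for p in pattern)

def support(pattern, sequences):
    """Number of sequences containing the pattern."""
    return sum(is_subsequence(pattern, s) for s in sequences)

failing_runs = [["p1", "p3", "p4"], ["p1", "p2", "p3", "p4"]]
passing_runs = [["p1", "p2"], ["p2", "p4"]]

candidates = [("p1", "p3"), ("p3", "p4"), ("p1", "p2")]
scored = sorted(
    candidates,
    key=lambda c: support(c, failing_runs) - support(c, passing_runs),
    reverse=True,
)
print(scored)
```

Patterns like ("p1", "p3") appear in every failing run and no passing run, so they surface at the top, carrying ordering context that a single-predicate ranking would lose.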

Evaluation of Measures for Statistical Fault Localisation and an Optimising Scheme

Lecture Notes in Computer Science, 2015

Statistical Fault Localisation (SFL) is a widely used method for localising faults in software. SFL gathers coverage details of passed and failed executions over a faulty program and then uses a measure to assign a degree of suspiciousness to each of a chosen set of program entities (statements, predicates, etc.) in that program. The program entities are then inspected by the engineer in descending order of suspiciousness until the bug is found. The effectiveness of this process relies on the quality of the suspiciousness measure. In this paper, we compare 157 measures, 95 of which are new to SFL and borrowed from other branches of science and philosophy. We also present a new measure optimiser Lex_g, which optimises a given measure g according to a criterion of single-bug optimality. An experimental comparison on benchmarks from the Software-artifact Infrastructure Repository (SIR) indicates that many of the new measures perform competitively with the established ones. Furthermore, the large-scale comparison reveals that the new measures Lex_Ochiai and Pattern-Similarity perform best overall.
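One way to picture a Lex_g-style optimiser is as a lexicographic sort: a primary key enforcing the single-bug criterion (under a single-fault assumption, the faulty element is covered by every failing test, so elements covered by more failing tests come first), with the base measure g breaking ties. This is one plausible reading of the scheme for illustration, not the paper's exact definition.

```python
import math

def ochiai(ef, ep, nf, np_):
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def lex_rank(spectra, g):
    """Sketch of a lexicographic ordering: sort primarily by the number
    of failing tests covering the element, then break ties with g."""
    return sorted(
        spectra, key=lambda s: (spectra[s][0], g(*spectra[s])), reverse=True
    )

# Toy spectra (ef, ep, nf, np): s1 and s2 tie on ef = 3.
spectra = {"s1": (3, 2, 0, 3), "s2": (3, 0, 0, 5), "s3": (1, 1, 2, 4)}
print(lex_rank(spectra, ochiai))  # ['s2', 's1', 's3']
```

s2, executed by no passing test, wins the tie against s1 under Ochiai; raw Ochiai alone would already agree here, but the lexicographic wrapper guarantees the single-bug property regardless of which measure g is plugged in.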

An evaluation of similarity coefficients for software fault localization

Proceedings - 12th Pacific Rim International Symposium on Dependable Computing, PRDC 2006, 2006

Automated diagnosis of software faults can improve the efficiency of the debugging process, and is therefore an important technique for the development of dependable software. In this paper we study different similarity coefficients that are applied in the context of a program spectral approach to software fault localization (single programming mistakes). The coefficients studied are taken from the systems diagnosis / automated debugging tools Pinpoint, Tarantula, and AMPLE, and from the molecular biology domain (the Ochiai coefficient). We evaluate these coefficients on the Siemens Suite of benchmark faults, and assess their effectiveness in terms of the position of the actual fault in the probability ranking of fault candidates produced by the diagnosis technique. Our experiments indicate that the Ochiai coefficient consistently outperforms the coefficients currently used by the tools mentioned. In terms of the amount of code that needs to be inspected, this coefficient improves 5% on average over the next best technique, and up to 30% in specific cases.
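The effectiveness criterion used above, the position of the actual fault in the ranking, translates into the familiar "percentage of code inspected" metric. A minimal sketch (statement names invented; real evaluations also handle ties in suspiciousness scores):

```python
def percent_inspected(ranking, faulty_element):
    """Percentage of program elements inspected, in ranking order,
    before the actual fault is reached (inclusive)."""
    position = ranking.index(faulty_element) + 1
    return 100.0 * position / len(ranking)

ranking = ["s7", "s2", "s9", "s1", "s5"]  # most to least suspicious
print(percent_inspected(ranking, "s9"))  # 60.0
```

An improvement of "5% on average" between two coefficients means the better one places the fault, on average, five percentage points earlier in this ranking.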

Empirical Evaluation of Fault Localisation Using Code and Change Metrics

IEEE Transactions on Software Engineering, 2019

Fault localisation aims to reduce the debugging efforts of human developers by highlighting the program elements that are suspected to be the root cause of the observed failure. Spectrum Based Fault Localisation (SBFL), a coverage-based approach, has been widely studied in much research as a promising localisation technique. Recently, however, it has been proven that SBFL techniques have reached the limit of further improvement. To overcome the limitation, we extend SBFL with code and change metrics that have mainly been studied in defect prediction, such as size, age, and churn. FLUCCS, our learn-to-rank fault localisation technique, employs both existing SBFL formulae and these metrics as input. We investigate the effect of employing code and change metrics for fault localisation using four different learn-to-rank techniques: Genetic Programming, Gaussian Process Modelling, Support Vector Machine, and Random Forest. We evaluate the performance of FLUCCS with 386 real-world faults collected from the Defects4J repository. The results show that FLUCCS with code and change metrics places 144 faults at the top and 304 faults within the top ten. This is a significant improvement over the state-of-the-art SBFL formulae, which can locate 65 and 212 faults at the top and within the top ten, respectively. We also investigate the feasibility of cross-project transfer learning of fault localisation. The results show that, while there exist project-specific properties that can be exploited for better localisation per project, ranking models learnt from one project can be applied to others without significant loss of effectiveness.
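The intuition of combining an SBFL score with code and change metrics can be sketched as a weighted feature combination. Everything below is invented for illustration: the method names, the feature values, and the fixed weight vector (FLUCCS itself learns the ranking model with GP, SVM, or Random Forest rather than using hand-set weights).

```python
def score(features, weights):
    """Linear combination of per-method features (an invented stand-in
    for a learnt ranking model)."""
    return sum(weights[k] * features[k] for k in weights)

weights = {"sbfl": 0.6, "churn": 0.3, "age": -0.1}  # invented weights

# Two methods with identical SBFL suspiciousness; change metrics differ.
methods = {
    "Foo.bar": {"sbfl": 0.9, "churn": 0.8, "age": 0.1},
    "Foo.baz": {"sbfl": 0.9, "churn": 0.1, "age": 0.9},
}
ranking = sorted(methods, key=lambda m: score(methods[m], weights), reverse=True)
print(ranking)  # ['Foo.bar', 'Foo.baz']
```

The recently churned method outranks the old, stable one even though SBFL alone cannot tell them apart, which is precisely the signal the change metrics contribute.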