Automated Bug Neighborhood Analysis for Identifying Incomplete Bug Fixes (original) (raw)

An evaluation of two bug pattern tools for Java

Proceedings of the 1st International Conference on Software Testing, Verification and Validation, ICST 2008, 2008

Automated static analysis is a promising technique to detect defects in software. However, although considerable effort has been spent for developing sophisticated detection possibilities, the effectiveness and efficiency has not been treated in equal detail. This paper presents the results of two industrial case studies in which two tools based on bug patterns for Java are applied and evaluated. First, the economic implications of the tools are analysed. It is estimated that only 3-4 potential field defects need to be detected for the tools to be cost-efficient. Second, the capabilities of detecting field defects are investigated. No field defects have been found that could have been detected by the tools. Third, the identification of fault-prone classes based on the results of such tools is investigated and found to be possible. Finally, methodological consequences are derived from the results and experiences in order to improve the use of bug pattern tools in practice.

A Public Unified Bug Dataset for Java

Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering, 2018

Background: Bug datasets have been created and used by many researchers to build bug prediction models. Aims: In this work we collected existing public bug datasets and unified their contents. Method: We considered 5 public datasets which adhered to all of our criteria. We also downloaded the corresponding source code for each system in the datasets and performed their source code analysis to obtain a common set of source code metrics. This way we produced a unified bug dataset at class and file level that is suitable for further research (e.g. to be used in the building of new bug prediction models). Furthermore, we compared the metric definitions and values of the different bug datasets. Results: We found that (i) the same metric abbreviation can have different definitions or metrics calculated in the same way can have different names, (ii) in some cases different tools give different values even if the metric definitions coincide because (iii) one tool works on source code while the other calculates metrics on bytecode, or (iv) in several cases the downloaded source code contained more files which influenced the afferent metric values significantly. Conclusions: Apart from all these imprecisions, we think that having a common metric set can help in building better bug prediction models and deducing more general conclusions. We made the unified dataset publicly available for everyone. By using a public dataset as an input for different bug prediction related investigations, researchers can make their studies reproducible, thus able to be validated and verified.

AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations

2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), 2019

Fix pattern-based patch generation is a promising direction in Automated Program Repair (APR). Notably, it has been demonstrated to produce more acceptable and correct patches than the patches obtained with mutation operators through genetic programming. The performance of pattern-based APR systems, however, depends on the fix ingredients mined from fix changes in development histories. Unfortunately, collecting a reliable set of bug fixes in repositories can be challenging. In this paper, we propose to investigate the possibility in an APR scenario of leveraging code changes that address violations by static bug detection tools. To that end, we build the AVATAR APR system, which exploits fix patterns of static analysis violations as ingredients for patch generation. Evaluated on the Defects4J benchmark, we show that, assuming a perfect localization of faults, AVATAR can generate correct patches to fix 34/39 bugs. We further find that AVATAR yields performance metrics that are comparable to that of the closely-related approaches in the literature. While AVATAR outperforms many of the state-of-theart pattern-based APR systems, it is mostly complementary to current approaches. Overall, our study highlights the relevance of static bug finding tools as indirect contributors of fix ingredients for addressing code defects identified with functional test cases.

Automatic Program Repair of Java Single Bugs using Two-level Mutation Operators

2020

Automatic program repair (APR) is one of the necessary software maintenance tasks because most software systems have errors that need to be fixed. APR techniques are considered as a search problem where the search space includes all potential repair candidates, with the aim of identifying the correct repair code in space. This paper proposes a repair approach that finds the correct repair code for object-oriented program bugs such as Java bugs in the minimized search space using the type of buggy statement and mutation system, MuJava. This approach consists of four main phases. First, program bugs are localized by prioritizing the statements based on their suspiciousness of containing bugs. Second, the mutation system is employed to mutate the program using two-level operators of the mutation system. Third, we extract the fixed candidates that are similar to the buggy statement type within the mutants and after receiving the ordered list of candidate patches, the last phase validate...

Automatically Finding Bugs in Open Source Programs

We consider properties desirable for static analysis tools targeted at finding bugs in the real open source code, and review tools based on various approaches to defect detection. A static analysis tool is described, that includes a framework for flow-sensitive interprocedural dataflow analysis and scales to analysis of large programs. The framework enables implementation of multiple checkers searching for specific bugs, such as null pointer dereference and buffer overflow, abstracting from the checkers details such as alias analysis.

To what extent could we detect field defects? An extended empirical study of false negatives in static bug-finding tools

Automated Software Engineering, 2014

Software defects can cause much loss. Static bug-finding tools are designed to detect and remove software defects and believed to be effective. However, do such tools in fact help prevent actual defects that occur in the field and reported by users? If these tools had been used, would they have detected these field defects, and generated warnings that would direct programmers to fix them? To answer these questions, we perform an empirical study that investigates the effectiveness of five state-of-the-art static bug-finding tools (FindBugs, JLint, PMD, CheckStyle, and JCSC) on hundreds of reported and fixed defects extracted from three open source programs (Lucene, Rhino, and AspectJ). Our study addresses the question: To what extent could field defects be detected by state-of-the-art static bug-finding tools? Different from past studies that are concerned with the numbers of false positives produced by such tools, we address an orthogonal issue on the numbers of false negatives. We

To what extent could we detect field defects? an empirical study of false negatives in static bug finding tools

Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering - ASE 2012, 2012

Software defects can cause much loss. Static bug-finding tools are believed to help detect and remove defects. These tools are designed to find programming errors; but, do they in fact help prevent actual defects that occur in the field and reported by users? If these tools had been used, would they have detected these field defects, and generated warnings that would direct programmers to fix them? To answer these questions, we perform an empirical study that investigates the effectiveness of state-of-the-art static bug finding tools on hundreds of reported and fixed defects extracted from three open source programs: Lucene, Rhino, and As-pectJ. Our study addresses the question: To what extent could field defects be found and detected by state-of-the-art static bug-finding tools? Different from past studies that are concerned with the numbers of false positives produced by such tools, we address an orthogonal issue on the numbers of false negatives. We find that although many field defects could be detected by static bug finding tools, a substantial proportion of defects could not be flagged. We also analyze the types of tool warnings that are more effective in finding field defects and characterize the types of missed defects.

Toward an understanding of bug fix patterns

Empirical Software Engineering, 2009

Twenty-seven automatically extractable bug fix patterns are defined using the syntax components and context of the source code involved in bug fix changes. Bug fix patterns are extracted from the configuration management repositories of seven open source projects, all written in Java (Eclipse, Columba, JEdit, Scarab, ArgoUML, Lucene, and MegaMek). Defined bug fix patterns cover 45.7% to 63.3% of the total bug fix hunk pairs in these projects. The frequency of occurrence of each bug fix pattern is computed across all projects. The most common individual patterns are MC-DAP (method call with different actual parameter values) at 14.9-25.5%, IF-CC (change in if conditional) at 5.6-18.6%, and AS-CE (change of assignment expression) at 6.0-14.2%. A correlation analysis on the extracted pattern instances on the seven projects shows that six have very similar bug fix pattern frequencies. Analysis of if conditional bug fix sub-patterns shows a trend towards increasing conditional complexity in if conditional fixes. Analysis of five developers in the Eclipse projects shows overall consistency with project-level bug fix pattern frequencies, as well as distinct variations among developers in their rates of producing various bug patterns. Overall, data in the paper suggest that developers have difficulty with specific code situations at surprisingly consistent rates. There appear to be broad mechanisms causing the injection of bugs that are largely independent of the type of software being produced.

Common Bug-Fix Patterns: A Large-Scale Observational Study

2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

Background]: There are more bugs in real-world programs than human programmers can realistically address. Several approaches have been proposed to aid debugging. A recent research direction that has been increasingly gaining interest to address the reduction of costs associated with defect repair is automatic program repair. Recent work has shown that some kind of bugs are more suitable for automatic repair techniques. [Aim]: The detection and characterization of common bug-fix patterns in software repositories play an important role in advancing the field of automatic program repair. In this paper, we aim to characterize the occurrence of known bugfix patterns in Java repositories at an unprecedented large scale. [Method]: The study was conducted for Java GitHub projects organized in two distinct data sets: the first one (i.e., Boa data set) contains more than 4 million bug-fix commits from 101,471 projects and the second one (i.e., Defects4J data set) contains 369 real bug fixes from five open-source projects. We used a domain-specific programming language called Boa in the first data set and conducted a manual analysis on the second data set in order to confront the results. [Results]: We characterized the prevalence of the five most common bug-fix patterns (identified in the work of Pan et al.) in those bug fixes. The combined results showed direct evidence that developers often forget to add IF preconditions in the code. Moreover, 76% of bug-fix commits associated with the IF-APC bug-fix pattern are isolated from the other four bug-fix patterns analyzed. [Conclusion]: Targeting on bugs that miss preconditions is a feasible alternative in automatic repair techniques that would produce a relevant payback.