Fabrizio Pastore - Academia.edu
Papers by Fabrizio Pastore
Software developers usually integrate third-party components to build systems providing distinct functionalities, such as graph drawing or persistence capabilities. Developers often use grey-box components: software modules they do not know in detail because they are provided without source code, with incomplete specifications, or both. The lack of source code and specifications makes the integration of such modules difficult and often causes faults that lead to critical failures if not detected at testing time. The lack of such information also complicates fault localization. Existing static analysis and debugging techniques rely on source code or specifications; for this reason, their applicability is often limited when grey-box components are used. Dynamic analysis techniques do not need such information: they monitor component interfaces, and are thus applicable even when this information is missing. Dynamic analysis approaches identify violations of models inferred from data recorded during monitored executions. Unfortunately, these techniques suffer from limitations too: they can identify only specific kinds of faults, their results are often affected by false positives, and they present scalability issues stemming from the huge amount of data collected during training or from the identification of many violations during debugging. This paper presents Behaviour Capture and Test (BCT), a dynamic analysis technique that overcomes the limitations of existing dynamic analysis techniques. BCT uses different kinds of models to localize different types of faults, prunes false positives, incrementally builds models to save disk space, and guides developers when many model violations are identified. The paper reports successful results obtained when applying the technique to injected and real faults: the considered case studies cover both regression faults and faults caused by rare event sequences not stressed at testing time. * Mauro Pezzè is also professor at the University of Lugano, Faculty of Informatics, via Buffi 13, 6900 Lugano (Switzerland).
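As an aside for readers unfamiliar with this style of dynamic analysis, the sketch below illustrates the general infer-then-check workflow BCT belongs to: models are learned from passing executions, and later executions are checked against them. The value-range models over interface calls are a deliberate simplification chosen for illustration, not the actual model set BCT builds.

```python
# Minimal sketch of the infer-then-check workflow used by dynamic analysis
# techniques such as BCT. The value-range invariants are an illustrative
# simplification, not the actual models BCT builds.

class RangeModel:
    """Per-parameter value range observed at a component interface."""
    def __init__(self):
        self.low = float("inf")
        self.high = float("-inf")

    def learn(self, value):
        self.low = min(self.low, value)
        self.high = max(self.high, value)

    def violates(self, value):
        return value < self.low or value > self.high

def infer_models(training_trace):
    """training_trace: iterable of (call_name, value) pairs from passing runs."""
    models = {}
    for call, value in training_trace:
        models.setdefault(call, RangeModel()).learn(value)
    return models

def check_execution(models, trace):
    """Return the anomalous events of a (possibly failing) execution."""
    return [(call, value) for call, value in trace
            if call in models and models[call].violates(value)]

if __name__ == "__main__":
    passing = [("Graph.addNode", 1), ("Graph.addNode", 40), ("Graph.draw", 0)]
    failing = [("Graph.addNode", -5), ("Graph.draw", 0)]
    models = infer_models(passing)
    print(check_execution(models, failing))  # [('Graph.addNode', -5)]
```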
Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 2017
In this paper we present VART, a tool for automatically revealing regression faults missed by regression test suites. Interestingly, VART is not limited to faults causing crashes or exceptions, but can reveal faults that cause the violation of application-specific correctness properties. VART achieves this goal by combining static and dynamic program analysis. CCS CONCEPTS • Software and its engineering → Software verification and validation;
Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020
Debugging Cyber-Physical System models is often challenging, as it requires identifying a potentially long, complex and heterogeneous combination of events that resulted in a violation of the expected behavior of the system. In this paper we present CPSDebug, a tool for supporting designers in the debugging of failures in MATLAB Simulink/Stateflow models. CPSDebug implements a gray-box approach that combines testing, specification mining, and failure analysis to identify the causes of failures and explain their propagation in time and space. The evaluation of the tool, based on multiple usage scenarios and faults and on direct feedback from engineers, shows that CPSDebug can effectively aid engineers during debugging tasks.
2017 IEEE International Conference on Software Testing, Verification and Validation (ICST), 2017
Accurate and up-to-date models describing the behavior of software systems are seldom available in practice. To address this issue, software engineers may use specification mining techniques, which can automatically derive models that capture the behavior of the system under analysis. So far, most specification mining techniques have focused on the functional behavior of systems, with specific emphasis on models that represent the ordering of operations, such as temporal rules and finite state models. Although useful, these models are inherently partial. For instance, they miss the timing behavior, which is extremely relevant for many classes of systems and components, such as shared libraries and user-driven applications. Mining specifications that include both the functional and the timing aspects can improve the applicability of many testing and analysis solutions. This paper addresses this challenge by presenting the Timed k-Tail (TkT) specification mining technique, which can mine timed automata from program traces. Since timed automata can effectively represent the interplay between the functional and the timing behavior of a system, TkT can be exploited in contexts where time-related information is relevant. Our empirical evaluation shows that TkT can efficiently and effectively mine accurate models. The mined models have been used to identify executions with anomalous timing. The evaluation shows that most of the anomalous executions have been correctly identified while producing few false positives.
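To make the timing side of such mined specifications concrete, the sketch below learns per-operation duration bounds from entry/exit events in training traces and flags calls whose duration falls outside those bounds. The flat event format, the min/max bounds, and the tolerance factor are illustrative assumptions; TkT itself mines timed automata, which additionally capture the ordering of operations.

```python
# Illustrative sketch: learn per-operation duration bounds from program traces
# (entry/exit timestamps) and flag calls with anomalous timing. TkT mines full
# timed automata; this simplified model ignores operation ordering.

def durations(trace):
    """trace: list of (event, operation, timestamp); events are 'enter'/'exit'."""
    stack, result = {}, []
    for event, op, ts in trace:
        if event == "enter":
            stack.setdefault(op, []).append(ts)
        else:
            result.append((op, ts - stack[op].pop()))
    return result

def learn_bounds(training_traces):
    bounds = {}
    for trace in training_traces:
        for op, d in durations(trace):
            lo, hi = bounds.get(op, (d, d))
            bounds[op] = (min(lo, d), max(hi, d))
    return bounds

def anomalous_calls(bounds, trace, tolerance=1.5):
    flagged = []
    for op, d in durations(trace):
        lo, hi = bounds.get(op, (0, float("inf")))
        if d < lo / tolerance or d > hi * tolerance:
            flagged.append((op, d))
    return flagged

if __name__ == "__main__":
    training = [[("enter", "load", 0.0), ("exit", "load", 0.2)],
                [("enter", "load", 1.0), ("exit", "load", 1.3)]]
    observed = [("enter", "load", 5.0), ("exit", "load", 7.0)]
    print(anomalous_calls(learn_bounds(training), observed))  # [('load', 2.0)]
```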
2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, 2015
Automatic testing, although useful, is still quite ineffective against faults that do not cause crashes or uncaught exceptions. In the majority of cases automatic tests do not include oracles, and only in some cases do they incorporate assertions; these assertions encode the observed behavior instead of the intended behavior, that is, if the application under test produces a wrong result, the synthesized assertions encode wrong expectations that match the actual behavior of the application. In this paper we present ZoomIn, a technique that extends the fault-revealing capability of test case generation techniques from crash-only faults to faults that require non-trivial oracles to be detected. ZoomIn exploits the knowledge encoded in the manual tests written by developers and the similarity between executions to automatically determine an extremely small set of suspicious assertions that are likely wrong and thus worth manual inspection. Early empirical results show that ZoomIn has been able to detect 50% of the analyzed non-crashing faults in the Apache Commons Math library while requiring the inspection of less than 1.5% of the assertions automatically generated by EvoSuite.
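The execution-similarity intuition can be pictured with a small sketch: assertions synthesized for generated tests whose coverage profile is far from every passing manual test are flagged for inspection. The Jaccard similarity over covered methods and the threshold below are assumptions made for illustration only, not ZoomIn's actual metrics.

```python
# Illustrative sketch of similarity-based assertion filtering in the spirit of
# ZoomIn: flag assertions synthesized for generated tests whose execution is
# dissimilar from all passing manual tests. Jaccard similarity over covered
# methods and the 0.6 threshold are assumptions for illustration only.

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def suspicious_assertions(generated_tests, manual_coverages, threshold=0.6):
    """generated_tests: list of (assertion, covered_methods);
    manual_coverages: list of covered_methods sets for passing manual tests."""
    flagged = []
    for assertion, coverage in generated_tests:
        best = max((jaccard(coverage, m) for m in manual_coverages), default=0.0)
        if best < threshold:
            flagged.append(assertion)
    return flagged

if __name__ == "__main__":
    manual = [{"Fraction.add", "Fraction.reduce"}, {"Fraction.multiply"}]
    generated = [("assertEquals(1, f.add(g).num)", {"Fraction.add", "Fraction.reduce"}),
                 ("assertEquals(0, f.pow(-1).num)", {"Fraction.pow", "Fraction.reciprocal"})]
    print(suspicious_assertions(generated, manual))
    # ['assertEquals(0, f.pow(-1).num)']
```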
Validation of Evolving Software, 2015
In this chapter we present Verification-Aided Regression Testing, a novel extension of regression testing that is significantly less sensitive to the completeness of the validation test suite thanks to the use of model checking. We extend the use of test case executions from conventional direct fault discovery to the generation of behavioral properties specific to the new version by (i) automatically producing properties that are proved to hold for the base version of a program, (ii) automatically identifying and checking on the upgraded program only the properties that, according to the developers’ intention, must be preserved by the upgrade, and (iii) reporting the faults and the corresponding counterexamples that are not revealed by the regression tests. Our empirical study on both open-source and industrial software systems shows that Verification-Aided Regression Testing produces properties that can be extremely beneficial in increasing the effectiveness of regression testing by automatically and promptly detecting faults unnoticed by existing test suites.
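A minimal sketch of the workflow described above, under strong simplifying assumptions: properties are plain predicates evaluated over an explicit set of program states (a stand-in for real model checking and for dynamically mined properties), they are kept only if they hold on the base version, and violations on the upgraded version that the regression suite does not already expose are reported together with a counterexample.

```python
# Illustrative sketch of the Verification-Aided Regression Testing workflow:
# (i) keep only properties that hold on the base version, (ii) re-check them on
# the upgraded version, (iii) report violations not already caught by the
# regression tests. Predicates over explicit states stand in for model checking.

def proved_on_base(properties, base_states):
    """properties: list of (name, predicate); keep those holding on every base state."""
    return [(n, p) for n, p in properties if all(p(s) for s in base_states)]

def verification_aided_regression(properties, base_states, upgraded_states,
                                  regression_failures):
    """regression_failures: names of properties already exposed by failing tests."""
    reports = []
    for name, prop in proved_on_base(properties, base_states):
        counterexamples = [s for s in upgraded_states if not prop(s)]
        if counterexamples and name not in regression_failures:
            reports.append((name, counterexamples[0]))
    return reports

if __name__ == "__main__":
    properties = [("balance_non_negative", lambda s: s["balance"] >= 0),
                  ("fee_applied", lambda s: s["fee"] > 0)]
    base = [{"balance": 10, "fee": 1}, {"balance": 0, "fee": 2}]
    upgraded = [{"balance": -5, "fee": 1}]  # the upgrade introduces a regression
    print(verification_aided_regression(properties, base, upgraded, set()))
    # [('balance_non_negative', {'balance': -5, 'fee': 1})]
```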
2012 34th International Conference on Software Engineering (ICSE), 2012
Most modern Integrated Development Environments are built on plug-in based architectures that can be extended with additional functionalities and plug-ins, according to user needs. However, extending an IDE is still a possibility restricted to developers with deep knowledge of the specific development environment and its architecture. In this paper we present MASH, a tool that eases the programming of Integrated Development Environments. The tool supports the definition of workflows that can be quickly designed to integrate functionalities offered by multiple plug-ins, without requiring any knowledge of the internal architecture of the IDE. Workflows can be easily reshaped every time an analysis must be modified, without producing Java code or deploying components in the IDE. Early results suggest that this approach can effectively facilitate the programming of IDEs.
2013 IEEE Sixth International Conference on Software Testing, Verification and Validation, 2013
Despite the recent advances in test generation, fully automatic software testing remains a dream: ultimately, any generated test input depends on a test oracle that determines correctness, and, except for generic properties such as "the program shall not crash", such oracles require human input in one form or another. Crowdsourcing is a recently popular technique to automate computations that cannot be performed by machines, but only by humans. A problem is split into small chunks that are then solved by a crowd of users on the Internet. In this paper we investigate whether it is possible to exploit crowdsourcing to solve the oracle problem: we produce tasks asking users to evaluate CrowdOracles, assertions that reflect the current behavior of the program. If the crowd determines that an assertion does not match the behavior described in the code documentation, then a bug has been found. Our experiments demonstrate that CrowdOracles are a viable solution to automate the oracle problem, yet taming the crowd to get useful results is a difficult task.
2013 IEEE Sixth International Conference on Software Testing, Verification and Validation, 2013
Several debugging techniques can be used to automatically identify the code fragments or the runtime events likely responsible for a failure. These techniques are useful, but can help reduce the debugging effort only to a given extent. In fact, even when these techniques are successful, software developers still have to invest a lot of effort in understanding if and why something detected as suspicious is really wrong. In this paper we present the tool implementing the AVA technique. Compared to other approaches dedicated to automatic debugging, AVA not only automatically identifies the events likely responsible for a failure, but also generates an explanation of why these events have been considered suspicious. This explanation can be used by developers to quickly discard imprecise outputs and work more effectively on the relevant anomalies.
2008 19th International Symposium on Software Reliability Engineering (ISSRE), 2008
Log files are commonly inspected by system administrators and developers to detect suspicious behaviors and diagnose failure causes. Since the size of log files grows quickly, making manual analysis impractical, several automatic techniques have been proposed to analyze log files. Unfortunately, the accuracy and effectiveness of these techniques are often limited by the unstructured nature of logged messages and the variety of data that can be logged. This paper presents a technique to automatically analyze log files and retrieve important information to identify failure causes. The technique automatically identifies dependencies between events and values in logs corresponding to legal executions, generates models of legal behaviors, and compares log files collected during failing executions with the generated models to detect anomalous event sequences that are presented to users. Experimental results show the effectiveness of the technique in supporting developers and testers in identifying failure causes.
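The pipeline summarized above, learning legal event sequences from logs of passing runs and flagging anomalous sequences in failing runs, can be sketched as follows. Abstracting log lines into templates by masking numbers and modeling legal behavior as a set of allowed event transitions are illustrative simplifications of the actual technique.

```python
# Illustrative sketch of log-based anomaly detection: abstract log lines into
# event templates, learn the transitions seen in passing executions, and flag
# unseen transitions in a failing execution. Masking digits and using bigram
# transitions are simplifications of the actual technique.
import re

def template(line):
    """Abstract a raw log line by masking numeric values."""
    return re.sub(r"\d+", "<NUM>", line.strip())

def learn_transitions(passing_logs):
    allowed = set()
    for log in passing_logs:
        events = [template(l) for l in log]
        allowed.update(zip(events, events[1:]))
    return allowed

def anomalous_transitions(allowed, failing_log):
    events = [template(l) for l in failing_log]
    return [pair for pair in zip(events, events[1:]) if pair not in allowed]

if __name__ == "__main__":
    passing = [["open session 41", "query took 12 ms", "close session 41"]]
    failing = ["open session 77", "close session 77", "query took 9 ms"]
    for before, after in anomalous_transitions(learn_transitions(passing), failing):
        print(f"unexpected: '{before}' -> '{after}'")
```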
2009 IEEE 31st International Conference on Software Engineering, 2009
Classic fault localization techniques can automatically provide information about the suspicious code blocks that are likely responsible for observed failures. This information is useful, but not sufficient to completely understand the causes of failing executions, which still require further (time-consuming) investigation to be precisely identified. A useful and comprehensive source of information is frequently given by the set of unexpected events that have been observed during failures. Sequences of unexpected events are usually simple to interpret, and testers can guess the expected correct sequences of events from the faulty sequences. In this paper, we present a tool that automatically identifies anomalous events that likely caused failures, filters the possible false positives, and presents the resulting data by building views that show chains of cause-effect relations, i.e., views that show when anomalous events are caused by other anomalous events. The use of the technique to investigate a fault in the Tomcat application server is also presented in the paper.
Proceedings of the 2014 International Symposium on Software Testing and Analysis - ISSTA 2014, 2014
Motivation • Regression testing is an integral part of many software development processes • Given an upgrade of a software system, does it satisfy a validation test suite passed by the base version of the software?
International Symposium on Software Testing and Analysis, 2009
Dynamic analysis techniques have been extensively adopted to discover causes of observed failures. In particular, anomaly detection techniques can infer behavioral models from observed legal executions and compare failing executions with the inferred models to automatically identify the likely anomalous events that caused observed failures. Unfortunately, the output of these techniques is limited to a set of independent suspicious
Heterogeneity, mobility, complexity and new application domains raise new software reliability issues that cannot be addressed cost-effectively with classic software engineering approaches alone. Self-healing systems can successfully address these problems, thus increasing software reliability while reducing maintenance costs. Self-healing systems must be able to automatically identify runtime failures, locate faults, and find a way to bring the system back to an acceptable behavior. This paper discusses the challenges underlying the construction of self-healing systems, with a particular focus on functional failures, and presents a set of techniques to build software systems that can automatically heal such failures. It introduces techniques to automatically derive assertions to effectively detect functional failures, locate the faults underlying the failures, and identify sequences of actions alternative to the failing sequence to bring the system back to an acceptable behavior.
2017 IEEE International Conference on Software Testing, Verification and Validation (ICST)
In the context of use-case centric development and requirements-driven testing, this paper addresses the problem of automatically deriving system test cases to verify timing requirements. Inspired by engineering practice in an automotive software development context, we rely on an analyzable form of use case specifications and augment such functional descriptions with timed automata capturing timing requirements, following a methodology that aims to minimize modeling overhead. We automate the generation of executable test cases using a test strategy based on maximizing test suite diversity and building on the UPPAAL model checker. Initial empirical results based on an industrial case study provide evidence of the effectiveness of the approach. This paper has been accepted for publication in the proceedings of the 10th IEEE International Conference on Software Testing, Verification and Validation (ICST 2017), IEEE.
Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings
We present MASS, a mutation analysis tool for embedded software in cyber-physical systems (CPS). We target space CPS (e.g., satellites) and other CPS with similar characteristics (e.g., UAVs). Mutation analysis measures the quality of test suites in terms of the percentage of detected artificial faults. Many mutation analysis tools are available, but they are inapplicable to CPS because of scalability and accuracy challenges. To overcome such limitations, MASS implements a set of optimization techniques that enable the applicability of mutation analysis and address scalability and accuracy in the CPS context. MASS has been successfully evaluated in a large study involving embedded software systems provided by industry partners; the study includes an on-board software system managing a microsatellite currently on orbit, a set of libraries used in deployed cubesats, and a mathematical library provided by the European Space Agency. A demo video of MASS is available at https://www.youtube.com/watch?v=gC1x9cU0-tU. CCS CONCEPTS • Software and its engineering → Software verification and validation.
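For readers new to mutation analysis, the sketch below shows the metric MASS relies on: mutants are generated by injecting small artificial faults, and the mutation score is the fraction of mutants killed by the test suite. The single string-level operator (replacing '+' with '-') and the test interface are assumptions for illustration and omit the scalability and accuracy optimizations MASS implements for CPS.

```python
# Minimal sketch of mutation analysis: generate mutants with one illustrative
# operator (replace '+' with '-'), run the test suite against each mutant, and
# report the mutation score. This ignores compilation, equivalent mutants, and
# the optimizations tools like MASS implement for CPS software.

def generate_mutants(source):
    """Yield one mutant per occurrence of '+' in the source string."""
    for i, ch in enumerate(source):
        if ch == "+":
            yield source[:i] + "-" + source[i + 1:]

def mutation_score(source, test_suite):
    """test_suite: callable taking a source string, returning True if all tests pass."""
    mutants = list(generate_mutants(source))
    killed = sum(1 for m in mutants if not test_suite(m))
    return killed / len(mutants) if mutants else 1.0

if __name__ == "__main__":
    source = "def add(a, b):\n    return a + b\n"

    def tests(src):
        env = {}
        exec(src, env)          # load the (possibly mutated) function
        return env["add"](2, 3) == 5

    print(mutation_score(source, tests))  # 1.0: the single mutant is killed
```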
ACM Transactions on Software Engineering and Methodology
Apps’ pervasive role in our society has led to the definition of test automation approaches to ensure their dependability. However, state-of-the-art approaches tend to generate large numbers of test inputs and are unlikely to achieve more than 50% method coverage. In this article, we propose a strategy to achieve significantly higher coverage of the code affected by updates with a much smaller number of test inputs, thus alleviating the test oracle problem. More specifically, we present ATUA, a model-based approach that synthesizes App models with static analysis, integrates a dynamically refined state abstraction function and combines complementary testing strategies, including (1) coverage of the model structure, (2) coverage of the App code, (3) random exploration, and (4) coverage of dependencies identified through information retrieval. Its model-based strategy enables ATUA to generate a small set of inputs that exercise only the code affected by the updates. In turn, this makes co...
IEEE Transactions on Software Engineering, 2021
On-board embedded software developed for spaceflight systems (space software) must adhere to stringent software quality assurance procedures. For example, verification and validation activities are typically performed and assessed by third-party organizations. To further minimize the risk of human mistakes, space agencies, such as the European Space Agency (ESA), are looking for automated solutions for the assessment of software testing activities, which play a crucial role in this context. Though space software is our focus here, it should be noted that such software shares the above considerations, to a large extent, with embedded software in many other types of cyber-physical systems. Over the years, mutation analysis has been shown to be a promising solution for the automated assessment of test suites; it consists of measuring the quality of a test suite in terms of the percentage of injected faults leading to a test failure. A number of optimization techniques, addressing scalability and accuracy problems, have been proposed to facilitate the industrial adoption of mutation analysis. However, to date, two major problems prevent space agencies from enforcing mutation analysis in space software development. First, there is uncertainty regarding the feasibility of applying mutation analysis optimization techniques in their context. Second, most of the existing techniques either can break the real-time requirements common in embedded software or cannot be applied when the software is tested in Software Validation Facilities, including CPU emulators and sensor simulators. In this paper, we enhance mutation analysis optimization techniques to enable their applicability to embedded software and propose a pipeline that successfully integrates them to address scalability and accuracy issues in this context, as described above. Further, we report on the largest study involving embedded software systems in the mutation analysis literature. Our research is part of a research project funded by ESA ESTEC involving private companies (GomSpace Luxembourg and LuxSpace) in the space sector. These industry partners provided the case studies reported in this paper; they include an on-board software system managing a microsatellite currently on orbit, a set of libraries used in deployed cubesats, and a mathematical library certified by ESA.
This repository provides the data used for the experiments of the paper "Supporting DNN Safety Analysis and Retraining through Heatmap-based Unsupervised Learning" by Hazem Fahmy, Fabrizio Pastore, Mojtaba Bagherzadeh, and Lionel Briand, appearing in IEEE Transactions on Reliability (doi: 10.1109/TR.2021.3074750). Deep neural networks (DNNs) are increasingly important in safety-critical systems, for example in their perception layer to analyze images. Unfortunately, there is a lack of methods to ensure the functional safety of DNN-based components. We observe three major challenges with existing practices regarding DNNs in safety-critical systems: (1) scenarios that are underrepresented in the test set may lead to serious safety violation risks, but may, however, remain unnoticed; (2) characterizing such high-risk scenarios is critical for safety analysis; (3) retraining DNNs to address these risks is poorly supported when causes of violations are difficult to determine. T...
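As a rough illustration of the heatmap-based grouping idea mentioned in the paper title, the toy sketch below clusters error-inducing images by the distance between their (flattened) relevance heatmaps, so that each group can be inspected as a candidate root cause. The distance metric, the threshold-based grouping, and the data layout are assumptions for illustration; the actual approach relies on proper hierarchical clustering over DNN-internal heatmaps.

```python
# Illustrative sketch of heatmap-based grouping of error-inducing images:
# treat each heatmap as a flat vector, use Euclidean distance, and group
# images whose heatmaps are close. This toy threshold-based grouping stands
# in for real hierarchical clustering over neuron-relevance heatmaps.

def distance(h1, h2):
    return sum((a - b) ** 2 for a, b in zip(h1, h2)) ** 0.5

def group_by_heatmap(heatmaps, threshold=1.0):
    """heatmaps: dict image_id -> flat list of relevance scores."""
    clusters = []
    for image_id, heat in heatmaps.items():
        for cluster in clusters:
            representative = heatmaps[cluster[0]]
            if distance(heat, representative) <= threshold:
                cluster.append(image_id)
                break
        else:
            clusters.append([image_id])
    return clusters

if __name__ == "__main__":
    failing = {"img1": [0.9, 0.1, 0.0], "img2": [0.8, 0.2, 0.0],
               "img3": [0.0, 0.1, 0.9]}
    print(group_by_heatmap(failing, threshold=0.5))
    # [['img1', 'img2'], ['img3']] -- two root-cause groups to inspect
```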