Per Runeson | Lund University

Conference Papers by Per Runeson

A replicated study on duplicate detection: Using Apache Lucene to search among Android defects

Duplicate detection is a fundamental part of issue management. Systems able to predict whether a new defect report will be closed as a duplicate may decrease costs by limiting rework and collecting related pieces of information. Previous work relies on the textual content of the defect reports, often assuming that better results are obtained if the title is weighted as more important than the description. We conduct a conceptual replication of a well-cited study conducted at Sony Ericsson, using Apache Lucene for searching in the public Android defect repository. In line with the original study, we explore how varying the weighting of the title and the description affects the accuracy. Our work shows the potential of using Lucene as a scalable solution for duplicate detection. Also, we show that Lucene obtains the best results when the defect report title is weighted three times higher than the description, a bigger difference than has been previously acknowledged.

Navigating Information Overload Caused by Automated Testing – A Clustering Approach in Multi-Branch Development

Background. Test automation is a widely used technique to increase the efficiency of software testing. However, executing more test cases increases the effort required to analyze test results. At Qlik, automated tests run nightly for up to 20 development branches, each containing thousands of test cases, resulting in information overload. Aim. We therefore develop a tool that supports the analysis of test results. Method. We create NIOCAT, a tool that clusters similar test case failures, to help the analyst identify underlying causes. To evaluate the tool, experiments on manually created subsets of failed test cases representing different use cases are conducted, and a focus group meeting is held with test analysts at Qlik. Results. The case study shows that NIOCAT creates accurate clusters, in line with analyses performed by human analysts. Further, the potential time savings of our approach are confirmed by the participants in the focus group. Conclusions. NIOCAT provides a feasible complement to current automated testing practices at Qlik by reducing information overload.

A Qualitative Survey of Regression Testing Practices

Regression testing practices in industry have to be better understood, both for the industry itself and for the research community. Method: We conducted a qualitative industry survey by i) running a focus group meeting with 15 industry participants and ii) validating the outcome in an online questionnaire with 32 respondents. Results: Regression testing needs and practices vary greatly between and within organizations and at different stages of a project. The importance and challenges of automation are clear from the survey. Conclusions: Most of the findings are general testing issues and are not specific to regression testing. Challenges and good practices relate to test automation and testability issues.

Software Testing in Open Innovation: An Exploratory Case Study of the Acceptance Test Harness for Jenkins

Open Innovation (OI) has gained significant attention since the term was introduced in 2003. However, little is known about whether general software testing processes are well suited for OI. An exploratory case study on the Acceptance Test Harness (ATH) is conducted to investigate OI testing activities of Jenkins. As far as the research methodology is concerned, we extracted the change log data of ATH, followed by five interviews with key contributors in the development of ATH. The findings of the study are threefold. First, it highlights the key stakeholders involved in the development of ATH. Second, the study compares the ATH testing activities with the ISO/IEC/IEEE testing process and presents a tailored process for software testing in OI. Finally, the study underlines some key challenges that software-intensive organizations face while working with testing in OI.

A replicated study on duplicate detection: Using Apache Lucene to search among Android defects

Proc. of the 8th International Symposium on Empirical Software Engineering and Measurement, Sep 18, 2014

Context: Duplicate detection is a fundamental part of issue management. Systems able to predict whether a new defect report will be closed as a duplicate may decrease costs by limiting rework and collecting related pieces of information. Previous work relies on the textual content of the defect reports, often assuming that better results are obtained if the title is weighted as more important than the description. Method: We conduct a conceptual replication of a well-cited study conducted at Sony Ericsson, using Apache Lucene for searching in the public Android defect repository. In line with the original study, we explore how varying the weighting of the title and the description affects the accuracy. Results and conclusions: Our work shows the potential of using Lucene as a scalable solution for duplicate detection. Also, we show that Lucene obtains the best results when the defect report title is weighted three times higher than the description, a bigger difference than has been previously acknowledged.

Evaluation of Traceability Recovery in Context: A Taxonomy for Information Retrieval Tools

Background: Development of complex, software-intensive systems generates large amounts of information. Several researchers have developed tools implementing information retrieval (IR) approaches to suggest traceability links among artifacts. Aim: We explore the consequences of the fact that a majority of the evaluations of such tools have been focused on benchmarking of mere tool output. Method: To illustrate this issue, we have adapted a framework of general IR evaluations to a context taxonomy specifically for IR-based traceability recovery. Furthermore, we evaluate a previously proposed experimental framework by conducting a study using two publicly available tools on two datasets originating from development of embedded software systems. Results: Our study shows that even though both datasets contain software artifacts from embedded development, the characteristics of the two datasets differ considerably, and consequently so do the traceability outcomes. Conclusions: To enable replications and secondary studies, we suggest that datasets should be thoroughly characterized in future studies on traceability recovery, especially when they cannot be disclosed. Also, while we conclude that the experimental framework provides useful support, we argue that our proposed context taxonomy is a useful complement. Finally, we discuss how empirical evidence of the feasibility of IR-based traceability recovery can be strengthened in future research.

Analysing Networks of Issue Reports

Proc. of the 17th European Conference on Software Maintenance and Reengineering, Mar 6, 2013

Completely analyzed and closed issue reports in software development projects, particularly in the development of safety-critical systems, often carry important information about issue-related change locations. These locations may be in the source code, as well as traces to test cases affected by the issue, and related design and requirements documents. In order to help developers analyze new issues, knowledge about issue clones and duplicates, as well as other relations between the new issue and existing issue reports, would be useful. This paper analyses, in an exploratory study, issue reports contained in two Issue Management Systems (IMS) containing approximately 20,000 issue reports. The purpose of the analysis is to gain a better understanding of relationships between issue reports in IMSs. We found that link-mining explicit references can reveal complex networks of issue reports. Furthermore, we found that textual similarity analysis might have the potential to complement the explicitly signaled links by recommending additional relations. In line with work in other fields, links between software artifacts have the potential to improve search and navigation in large software engineering projects.

Supporting Regression Test Scoping with Visual Analytics

Proceedings of the 7th International Conference on Software Testing, Verification and Validation, Mar 31, 2014

Background: Test managers have to repeatedly select test cases for test activities during evolution of large software systems. Researchers have widely studied automated test scoping, but have not fully investigated decision support with human interaction. We previously proposed the introduction of visual analytics for this purpose. Aim: In this empirical study we investigate how to design such decision support. Method: We explored the use of visual analytics using heat maps of historical test data for test scoping support by letting test managers evaluate prototype visualizations in three focus groups with a total of nine industrial test experts. Results: All test managers in the study found the visual analytics useful for supporting test planning. However, our results show that different tasks and contexts require different types of visualizations. Conclusion: Important properties for test planning support are the ability to overview testing from different perspectives, the ability to filter and zoom to compare subsets of the testing with respect to various attributes, and the ability to manipulate the subset under analysis by selecting and deselecting test cases. Our results may be used to support the introduction of visual test analytics in practice.

IR in Software Traceability: From a Bird's Eye View

Proc. of the 7th International Symposium on Empirical Software Engineering and Measurement

Several researchers have proposed creating after-the-fact structure among software artifacts using trace recovery based on Information Retrieval (IR) approaches. Due to significant variation points in previous studies, results are not easily aggregated. We provide an initial overview picture of the outcome of previous evaluations. Based on a systematic mapping study, we perform a synthesis of published research. Our results show that there is no empirical evidence that any IR model outperforms another model consistently. We also display a strong dependency between the Precision and Recall (P-R) values and the input datasets. Finally, our mapping of P-R values on the possible output space highlights the difficulty of recovering accurate trace links using naïve cut-off strategies. Thus, our work presents empirical evidence that confirms several previous claims on IR-based trace recovery and stresses the need for empirical evaluations beyond the basic P-R "race".

Papers by Per Runeson

Analyzing Networks of Issue Reports

Software Engineers' Information Seeking Behavior in Change Impact Analysis - An Interview Study

arXiv (Cornell University), Mar 6, 2017

Analysing Networks of Issue Reports

Completely analyzed and closed issue reports in software development projects, particularly in the development of safety-critical systems, often carry important information about issue-related change locations. These locations may be in the source code, as well as traces to test cases affected by the issue, and related design and requirements documents. In order to help developers analyze new issues, knowledge about issue clones and duplicates, as well as other relations between the new issue and existing issue reports, would be useful. This paper analyses, in an exploratory study, issue reports contained in two Issue Management Systems (IMS) containing approximately 20,000 issue reports. The purpose of the analysis is to gain a better understanding of relationships between issue reports in IMSs. We found that link-mining explicit references can reveal complex networks of issue reports. Furthermore, we found that textual similarity analysis might have the potential to complement the explicitly signaled links by recommending additional relations. In line with work in other fields, links between software artifacts have the potential to improve search and navigation in large software engineering projects.

Changes, Evolution, and Bugs

Springer eBooks, Dec 20, 2013

Open Tools for Software Engineering

A replicated study on duplicate detection

It is More Blessed to Give than to Receive - Open Software Tools Enable Open Innovation

Open Tools for Software Engineering

Proceedings of the Evaluation and Assessment on Software Engineering

Automated Controlled Experimentation on Software by Evolutionary Bandit Optimization

Search Based Software Engineering, 2017

Controlled experiments, also called A/B tests or split tests, are used in software engineering to improve products by evaluating variants with user data. By parameterizing software systems, multivariate experiments can be performed automatically and at large scale. In this way, controlled experimentation is formulated as an optimization problem. Using genetic algorithms for automated experimentation requires repetitions to evaluate a variant, since the fitness function is noisy. We propose to combine genetic algorithms with bandit optimization to optimize where repetitions are evaluated, instead of uniform sampling. We set up a simulation environment that allows us to evaluate the solution, and see that it leads to increased fitness, population diversity, and rewards, compared to only genetic algorithms.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of A replicated study on duplicate detection: Using Apache Lucene to search among Android defects

Duplicate detection is a fundamental part of issue management. Systems able to predict whether a ... more Duplicate detection is a fundamental part of issue management. Systems able to predict whether a new defect report will be closed as a duplicate, may decrease costs by limiting rework and collecting related pieces of information. Previous work relies on the textual content of the defect reports, often assuming that better results are obtained if the title is weighted as more important than the descrip- tion. We conduct a conceptual replication of a well-cited study conducted at Sony Ericsson, using Apache Lucene for searching in the public Android defect repository. In line with the original study, we explore how varying the weighting of the title and the description affects the accuracy. Our work shows the poten- tial of using Lucene as a scalable solution for duplicate detection. Also, we show that Lucene obtains the best results the when the defect report title is weighted three times higher than the description, a bigger difference than has been previously acknowledged.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Navigating Information Overload Caused by Automated Testing – A Clustering Approach in Multi-Branch Development

Background. Test automation is a widely used technique to increase the efficiency of software tes... more Background. Test automation is a widely used technique to increase the efficiency of software testing. However, executing more test cases increases the effort required to analyze test results. At Qlik, automated tests run nightly for up to 20 development branches, each containing thousands of test cases, resulting in information overload. Aim. We therefore develop a tool that supports the analy- sis of test results. Method. We create NIOCAT, a tool that clusters similar test case fail- ures, to help the analyst identify underlying causes. To evaluate the tool, experiments on manually created subsets of failed test cases representing different use cases are conducted, and a focus group meeting is held with test analysts at Qlik. Results. The case study shows that NIOCAT creates accurate clusters, in line with analyses performed by human analysts. Further, the potential time-savings of our approach is confirmed by the participants in the focus group. Conclusions. NIOCAT provides a feasible complement to current au-tomated testing practices at Qlik by reducing information overload.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of A Qualitative Survey of Regression Testing Practices

Regression testing practices in industry have to be better understood, both for the industry itse... more Regression testing practices in industry have to be better understood, both for the industry itself and for the research community. Method : We conducted a qualitative industry survey by i) running a focus group meeting with 15 industry participants and ii) validating the outcome in an on line questionnaire with 32 respondents. Results: Regression testing needs and practices vary greatly between and within organizations and at different stages of a project. The importance and challenges of automation is clear from the survey. Conclusions: Most of the findings are general testing issues and are not specific to regression testing. Challenges and good practices relate to test automation and testability issues.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Software Testing in Open Innovation: An Exploratory Case Study of the Acceptance Test Harness for Jenkins

Open Innovation (OI) has gained significant attention since the term was introduced in 2003. Howe... more Open Innovation (OI) has gained significant attention since the term was introduced in 2003. However, little is known whether general software testing processes are well suited for OI. An exploratory case study on the Acceptance Test Harness (ATH) is conducted to investigate OI testing activities of Jenkins. As far as the research methodology is concerned, we extracted the change log data of ATH followed by five interviews with key contributors in the development of ATH. The findings of the study are threefold. First, it highlights the key stakeholders involved in the development of ATH. Second, the study compares the ATH testing activities with ISO/IEC/IEEE testing process and presents a tailored process for software testing in OI. Finally, the study underlines some key challenges that software intensive organizations face while working with the testing in OI.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of A replicated study on duplicate detection: Using Apache Lucene to search among Android defects

Proc. of the 8th International Symposium on Empirical Software Engineering and Measurement, Sep 18, 2014

Context: Duplicate detection is a fundamental part of issue management. Systems able to predict w... more Context: Duplicate detection is a fundamental part of issue management. Systems able to predict whether a new defect report will be closed as a duplicate, may decrease costs by limiting rework and collecting related pieces of information. Previous work relies on the textual content of the defect reports, often assuming that better results are obtained if the title is weighted as more important than the description. Method: We conduct a conceptual replication of a well-cited study conducted at Sony Ericsson, using
Apache Lucene for searching in the public Android defect repository. In line with the original study, we explore how varying the weighting of the title and the description affects the accuracy. Results and conclusions: Our work shows the potential of using Lucene as a scalable solution for duplicate detection. Also, we show that Lucene obtains the best results the when the defect report title is weighted three times higher than the description, a bigger differencethan has been previously acknowledged.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Evaluation of Traceability Recovery in Context: A Taxonomy for Information Retrieval Tools

Background: Development of complex, software intensive systems generates large amounts of inform... more Background: Development of complex, software intensive
systems generates large amounts of information. Several
researchers have developed tools implementing information
retrieval (IR) approaches to suggest traceability links among
artifacts. Aim: We explore the consequences of the fact that
a majority of the evaluations of such tools have been focused
on benchmarking of mere tool output. Method: To illustrate this
issue, we have adapted a framework of general IR evaluations to a context taxonomy specifically for IR-based traceability recovery. Furthermore, we evaluate a previously proposed experimental framework by conducting a study using two publicly available tools on two datasets originating from development of embedded software systems. Results: Our study shows that even though both datasets contain software artifacts from embedded development, the characteristics of the two datasets differ considerably, and consequently the traceability outcomes. Conclusions: To enable replications and secondary studies, we suggest that datasets should be thoroughly characterized in future studies on traceability
recovery, especially when they can not be disclosed. Also, while
we conclude that the experimental framework provides useful
support, we argue that our proposed context taxonomy is a useful complement. Finally, we discuss how empirical evidence of the feasibility of IR-based traceability recovery can be strengthened in future research.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Analysing Networks of Issue Reports

Proc. of the 17th European Conference on Software Maintenance and Reengineering, Mar 6, 2013

Completely analyzed and closed issue reports in software development projects, particularly in th... more Completely analyzed and closed issue reports in software development projects, particularly in the development of safety-critical systems, often carry important information about issue-related change locations. These locations may be in the source code, as well as traces to test cases affected by the issue, and related design and requirements documents. In order to help developers analyze new issues, knowledge about issue clones and duplicates, as well as other relations between the new issue and existing issue reports would be useful. This paper analyses, in an exploratory study, issue reports contained in two Issue Management Systems (IMS) containing approximately 20.000 issue reports. The purpose of the analysis is to gain a better understanding of relationships between issue reports
in IMSs. We found that link-mining explicit references can reveal complex networks of issue reports. Furthermore, we found that textual similarity analysis might have the potential to complement the explicitly signaled links by recommending additional relations. In line with work in other fields, links between software artifacts have a potential to improve search and navigation in large software engineering projects.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Supporting Regression Test Scoping with Visual Analytics

Proceedings of the 7th International Conference on Software Testing, Verification and Validation, Mar 31, 2014

Background: Test managers have to repeatedly select test cases for test activities during evoluti... more Background: Test managers have to repeatedly select test cases for test activities during evolution of large software systems. Researchers have widely studied automated test scoping, but have not fully investigated decision support with human interaction. We previously proposed the introduction of visual analytics for this purpose. Aim: In this empirical study we investigate how to design such decision support. Method: We
explored the use of visual analytics using heat maps of historical
test data for test scoping support by letting test managers
evaluate prototype visualizations in three focus groups with in
total nine industrial test experts. Results: All test managers in
the study found the visual analytics useful for supporting test
planning. However, our results show that different tasks and
contexts require different types of visualizations. Conclusion:
Important properties for test planning support are: ability to
overview testing from different perspectives, ability to filter and
zoom to compare subsets of the testing with respect to various
attributes and the ability to manipulate the subset under analysis
by selecting and deselecting test cases. Our results may be used
to support the introduction of visual test analytics in practice.

Bookmarks Related papers MentionsView impact

Research paper thumbnail of IR in Software Traceability: From a Bird's Eye View

Proc. of the 7th International Symposium on Empirical Software Engineering and Measurement

Several researchers have proposed creating after-the-fact structure among software artifacts usin... more Several researchers have proposed creating after-the-fact structure among software artifacts using trace recovery based on Information Retrieval (IR) approaches. Due to significant variation points in previous studies, results are not easily aggregated. We provide an initial overview picture of the outcome of previous evaluations. Based on a systematic mapping study, we perform a synthesis of published research. Our results show that there are no empirical evidence that any IR model outperforms another model consistently. We also display a strong dependency between the P-R values and the input datasets. Finally, our mapping of Precision and Recall (P-R) values on the possible output space highlights the difficulty of recovering accurate trace links using naïve cut-off strategies. Thus, our work presents empirical evidence that confirms several previous claims on IR-based trace recovery and stresses the needs for empirical evaluations beyond the basic P-R "race".

Bookmarks Related papers MentionsView impact

Research paper thumbnail of Navigating Information Overload Caused by Automated Testing – A Clustering Approach in Multi-Branch Development

Background. Test automation is a widely used technique to increase the efficiency of software tes... more Background. Test automation is a widely used technique
to increase the efficiency of software testing. However,
executing more test cases increases the effort required to analyze test results. At Qlik, automated tests run nightly for up to 20 development branches, each containing thousands of test cases, resulting in information overload. Aim. We therefore develop a tool that supports the analysis of test results. Method. We create NIOCAT, a tool that clusters similar test case failures, to help the analyst identify underlying causes. To evaluate the tool, experiments on manually created subsets of failed test cases representing different use cases are conducted, and a focus group meeting is held with test analysts at Qlik. Results. The case study shows that NIOCAT creates accurate clusters, in line with analyses performed by human analysts. Further, the potential time-savings of our approach is confirmed by the participants in the focus group. Conclusions. NIOCAT provides a feasible complement to current automated testing practices at Qlik by reducing information overload.

Analyzing Networks of Issue Reports

Software Engineers' Information Seeking Behavior in Change Impact Analysis - An Interview Study

arXiv (Cornell University), Mar 6, 2017

Analysing Networks of Issue Reports

Completely analyzed and closed issue reports in software development projects, particularly in the development of safety-critical systems, often carry important information about issue-related change locations. These locations may be in the source code, as well as traces to test cases affected by the issue, and related design and requirements documents. In order to help developers analyze new issues, knowledge about issue clones and duplicates, as well as other relations between the new issue and existing issue reports would be useful. This paper analyses, in an exploratory study, issue reports contained in two Issue Management Systems (IMS) containing approximately 20,000 issue reports. The purpose of the analysis is to gain a better understanding of relationships between issue reports in IMSs. We found that link-mining explicit references can reveal complex networks of issue reports. Furthermore, we found that textual similarity analysis might have the potential to complement the explicitly signaled links by recommending additional relations. In line with work in other fields, links between software artifacts have a potential to improve search and navigation in large software engineering projects.

Changes, Evolution, and Bugs

Springer eBooks, Dec 20, 2013

Open Tools for Software Engineering

A replicated study on duplicate detection

It is More Blessed to Give than to Receive - Open Software Tools Enable Open Innovation

Open Tools for Software Engineering

Proceedings of the Evaluation and Assessment on Software Engineering

Automated Controlled Experimentation on Software by Evolutionary Bandit Optimization

Search Based Software Engineering, 2017

Controlled experiments, also called A/B tests or split tests, are used in software engineering to improve products by evaluating variants with user data. By parameterizing software systems, multivariate experiments can be performed automatically and at large scale; in this way, controlled experimentation is formulated as an optimization problem. Using genetic algorithms for automated experimentation requires repetitions to evaluate a variant, since the fitness function is noisy. We propose combining genetic algorithms with bandit optimization to choose where repetitions are evaluated, instead of sampling uniformly. We set up a simulation environment that allows us to evaluate the solution, and see that it leads to increased fitness, population diversity, and rewards, compared to using genetic algorithms alone.

A Machine Learning Approach for Semi-Automated Search and Selection in Literature Studies

Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering, 2017

Background. Search and selection of primary studies in Systematic Literature Reviews (SLR) is labour intensive, and hard to replicate and update. Aims. We explore a machine learning approach to support semi-automated search and selection in SLRs to address these weaknesses. Method. We 1) train a classifier on an initial set of papers, 2) extend this set of papers by automated search and snowballing, 3) have the researcher validate the top paper, selected by the classifier, and 4) update the set of papers and iterate the process until a stopping criterion is met. Results. We demonstrate with a proof-of-concept tool that the proposed automated search and selection approach generates valid search strings and that the performance for subsets of primary studies can reduce the manual work by half. Conclusions. The approach is promising and the demonstrated advantages include cost savings and replicability. The next steps include further tool development and evaluating the approach on a complete SLR.
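
The iterative method in this abstract can be sketched roughly as follows. This is not the authors' tool. A toy word-overlap scorer stands in for the classifier, the candidate pool is fixed (so the automated search and snowballing of step 2 is left out), and the stopping criterion of two consecutive rejections is an assumption. All names and titles are hypothetical.

```python
from collections import Counter


def train(included):
    """Toy stand-in for a classifier: word counts over included titles."""
    return Counter(w for title in included for w in title.lower().split())


def score(model, title):
    """Higher when the title shares more words with the included papers."""
    return sum(model[w] for w in set(title.lower().split()))


def select(seed, candidates, is_relevant, max_rejections=2):
    """Iterate: retrain, propose the top paper, let the researcher decide."""
    included, pool, rejections = list(seed), list(candidates), 0
    while pool and rejections < max_rejections:  # assumed stopping criterion
        model = train(included)                  # step 1: (re)train
        top = max(pool, key=lambda t: score(model, t))  # top-ranked paper
        pool.remove(top)
        if is_relevant(top):                     # step 3: researcher validates
            included.append(top)                 # step 4: update the set
            rejections = 0
        else:
            rejections += 1
    return included


seed = ["trace recovery using information retrieval"]
candidates = [
    "information retrieval for trace links",
    "agile team motivation survey",
    "cooking with cast iron",
    "trace link recovery evaluation",
]
# The researcher's judgement is simulated by a simple keyword check.
result = select(seed, candidates, is_relevant=lambda t: "trace" in t)
print(result)
```

Retraining after every accepted paper lets each validated decision influence which paper is proposed next.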

A case study of industry–academia communication in a joint software engineering research project

Journal of Software: Evolution and Process, 2021

Empirical software engineering research relies on good communication with industrial partners. Conducting joint research both requires and contributes to bridging the communication gap between industry and academia (IA) in software engineering. This study aims to explore communication between the two parties in such a setting. To better understand what facilitates good IA communication and what project outcomes such communication promotes, we performed a case study, in the context of a long‐term IA joint project, followed by a validating survey among practitioners and researchers with experience of working in similar settings. We identified five facilitators of IA communication and nine project outcomes related to this communication. The facilitators concern the relevance of the research, practitioners' attitude and involvement in research, frequency of communication and longevity of the collaboration. The project outcomes promoted by this communication include, for researchers,...

Guiding the selection of research methodology in industry–academia collaboration in software engineering

Information and Software Technology, 2021

Background: The literature concerning research methodologies and methods has increased in software engineering in the last decade. However, there is limited guidance on selecting an appropriate research methodology for a given research study or project. Objective: Based on a selection of research methodologies suitable for software engineering research in collaboration between industry and academia, we present, discuss and compare the methodologies, aiming to provide guidance on which research methodology to choose in a given situation to ensure successful industry–academia collaboration in research. Method: Three research methodologies were chosen for two main reasons. Design Science and Action Research were selected for their usage in software engineering. We also chose a model emanating from software engineering, i.e., the Technology Transfer Model. An overview of each methodology is provided. It is followed by a discussion and an illustration concerning their use in industry–academia collaborative research. The three methodologies are then compared using a set of criteria as a basis for our guidance. Results: The discussion and comparison of the three research methodologies revealed general similarities and distinct differences. All three research methodologies are easily mapped to the general research process describe–solve–practice, while the main driver behind the formulation of the research methodologies is different. Thus, we provide guidance on selecting a research methodology given the primary research objective for a given research study or project in collaboration between industry and academia. Conclusions: We observe that the three research methodologies have different main objectives and differ in some characteristics, although still having a lot in common. We conclude that it is vital to make an informed decision concerning which research methodology to use. The presentation and comparison aim to guide the selection of an appropriate research methodology when conducting research in collaboration between industry and academia.

How Companies Use OSS Tools Ecosystems for Open Innovation

IT Professional, Nov 1, 2019

Unit Verification Effects on Reused Components in Sequential Project Releases

2017 43rd Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2017

Background. The effects of different practices on fault distributions in evolving complex software systems are not fully understood. Software reuse and unit verification are practices used to improve system reliability by minimising the number of late faults. Reused software benefits from already being verified while unit verification aims to find faults early. Aims. We want to study effects of software reuse and unit verification on future modifications, fault densities of software units, and fault distributions. Method. We applied statistical analysis to a sample of 520 units that were reused and modified within four sequential projects from one product line in the telecommunication domain. Results. In reused units, the results of unit verification are correlated to a smaller degree of modifications and decreased fault densities. Conclusion. Unit verification in complex systems may improve system evolution in terms of smaller modifications and decrease of fault densities. The unit veri...

Plug-in software engineering case studies

Proceedings of the 4th International Workshop on Conducting Empirical Studies in Industry, 2016

Empirical software engineering is a growing research area. Industrial experience gathered by systematic empirical case studies is extremely important for further evolution of the software engineering discipline. Scientific theory cannot provide effective means for software industry without fundamental understanding of the evolutionary development of complex software systems. However, there are certain limitations in performing observational quantitative case studies in real software engineering environments, and in enabling their replication. In this paper, we propose a framework that would allow plug-in case studies for industries, aiming to overcome obstacles of engagement and wide replications of industrial empirical studies.

Automated bug assignment: Ensemble-based machine learning in large scale industrial contexts

Empirical Software Engineering, Sep 10, 2015

A Scenario Distribution Model for Effective and Efficient Testing of Autonomous Driving Systems

37th IEEE/ACM International Conference on Automated Software Engineering

Optimization of anomaly detection in a microservice system through continuous feedback from development

Proceedings of the 10th IEEE/ACM International Workshop on Software Engineering for Systems-of-Systems and Software Ecosystems

Public Sector Platforms going Open

Proceedings of the 16th International Symposium on Open Collaboration

Supporting Change Impact Analysis Using a Recommendation System: An Industrial Case Study in a Safety-Critical Context

Change Impact Analysis (CIA) during software evolution of safety-critical systems is a labor-intensive task. Several authors have proposed tool support for CIA, but very few tools were evaluated in industry. We present a case study on ImpRec, a Recommendation System for Software Engineering (RSSE), tailored for CIA at a process automation company. ImpRec builds on assisted tracing, using information retrieval solutions and mining software repositories to recommend development artifacts potentially impacted when resolving incoming issue reports. In contrast to the majority of tools for automated CIA, ImpRec explicitly targets development artifacts that are not source code. We evaluate ImpRec in a two-phase study. First, we measure the correctness of ImpRec's recommendations by a simulation based on 12 years' worth of issue reports in the company. Second, we assess the utility of working with ImpRec by deploying the RSSE in two development teams on different continents. The results suggest that ImpRec presents about 40% of the true impact among the top-10 recommendations. Furthermore, user log analysis indicates that ImpRec can support CIA in industry, and developers acknowledge the value of ImpRec in interviews. In conclusion, our findings show the potential of reusing traceability associated with developers' past activities in an RSSE.

Automated Bug Assignment: Ensemble-based Machine Learning in Large Scale Industrial Contexts

Bug report assignment is an important part of software maintenance. In particular, incorrect assignments of bug reports to development teams can be very expensive in large software development projects. Several studies propose automating bug assignment techniques using machine learning in open source software contexts, but no study exists for large-scale proprietary projects in industry. The goal of this study is to evaluate automated bug assignment techniques that are based on machine learning classification. In particular, we study the state-of-the-art ensemble learner Stacked Generalization (SG) that combines several classifiers. We collect more than 50,000 bug reports from five development projects from two companies in different domains. We implement automated bug assignment and evaluate the performance in a set of controlled experiments. We show that SG scales to large-scale industrial application and that it outperforms the use of individual classifiers for bug assignment, reaching prediction accuracies from 50% to 89% when large training sets are used. In addition, we show how old training data can decrease the prediction accuracy of bug assignment. We advise industry to use SG for bug assignment in proprietary contexts, using at least 2,000 bug reports for training. Finally, we highlight the importance of not solely relying on results from cross-validation when evaluating automated bug assignment.

Challenges and practices in aligning requirements with verification and validation: A case study of six companies

Empirical Software Engineering, 2014

Weak alignment of requirements engineering (RE) with verification and validation (VV) may lead to problems in delivering the required products in time with the right quality. For example, weak communication of requirements changes to testers may result in lack of verification of new requirements and incorrect verification of old invalid requirements, leading to software quality problems, wasted effort and delays. However, despite the serious implications of weak alignment, research and practice both tend to focus on one or the other of RE or VV rather than on the alignment of the two. We have performed a multi-unit case study to gain insight into issues around aligning RE and VV by interviewing 30 practitioners from 6 software developing companies, involving 10 researchers in a flexible research process for case studies. The results describe current industry challenges and practices in aligning RE with VV, ranging from quality of the individual RE and VV activities, through tracing and tools, to change control and sharing a common understanding at strategy, goal and design level. The study identified that human aspects are central, i.e. cooperation and communication, and that requirements engineering practices are a critical basis for alignment. Further, the size of an organisation and its motivation for applying alignment practices, e.g. external enforcement of traceability, are variation factors that play a key role in achieving alignment. Our results provide a strategic roadmap for practitioners' improvement work to address alignment challenges. Furthermore, the study provides a foundation for continued research to improve the alignment of RE with VV.

Recovering from a Decade: A Systematic Mapping of Information Retrieval Approaches to Software Traceability

Empirical Software Engineering, 2014

Engineers in large-scale software development have to manage large amounts of information, spread across many artifacts. Several researchers have proposed expressing retrieval of trace links among artifacts, i.e. trace recovery, as an Information Retrieval (IR) problem. The objective of this study is to produce a map of work on IR-based trace recovery, with a particular focus on previous evaluations and strength of evidence. We conducted a systematic mapping of IR-based trace recovery. Of the 79 publications classified, a majority applied algebraic IR models. While a set of studies on students indicate that IR-based trace recovery tools support certain work tasks, most previous studies do not go beyond reporting precision and recall of candidate trace links from evaluations using datasets containing fewer than 500 artifacts. Our review identified a need for industrial case studies. Furthermore, we conclude that the overall quality of reporting should be improved regarding both context and tool details, measures reported, and use of IR terminology. Finally, based on our empirical findings, we present suggestions on how to advance research on IR-based trace recovery.
