Support Mechanisms to Conduct Empirical Studies in Software Engineering

Alex Borges*, Waldemar Ferreira*, Emanoel Barreiros*, Adauto Almeida*, Liliane Fonseca*, Eudis Teixeira*, Diogo Silva*, Aline Alencar*, Sergio Soares

*Informatics Center (CIn), Federal University of Pernambuco {anbj, wpfn, efsb, ataf, lss4, eot, dvss, aaac, scbs}@cin.ufpe.br

Abstract

Context: Empirical studies are gaining recognition in the Software Engineering (SE) research community. In order to foster empirical research, it is essential to understand the environments, guidelines, processes, and other mechanisms available to support these studies in SE. Goal: To identify the mechanisms used to support the empirical strategies adopted by research published in the major Empirical Software Engineering (ESE) scientific venues. Method: We performed a systematic mapping study that included all full papers published at EASE, ESEM, and ESEJ since their first editions. A total of 898 studies were selected. Results: We provide the full list of identified support mechanisms and the strategies that use them. The most commonly used mechanisms were two sets of guidelines, one for secondary studies and another for experiments. The most reported empirical strategies are experiments and case studies. Conclusions: The use of empirical methods in SE has increased over the years, but many studies neither apply these methods nor use mechanisms to guide their research. Therefore, the list of support mechanisms, together with where and how they were applied, is a major asset to the SE community. Such an asset can foster empirical studies by aiding the choice of which strategies and mechanisms to use in a given research effort. We also identified new perspectives and gaps that can foster the development of resources to aid empirical studies.

Categories and Subject Descriptors

A.0.1 [Cross-computing Tools and Techniques]: Empirical Studies.

General Terms

Measurement, Experimentation, Verification.

Keywords

Empirical Software Engineering, Systematic Mapping Study, EASE, ESEM, ESEJ, Empirical Strategies, Support Mechanisms.

1. INTRODUCTION

In recent years, researchers have been emphasizing the importance of using empirical methods to evaluate research results in SE. These methods provide consistent and systematic approaches to evaluate phenomena (technologies, processes, models, etc.), as well as to identify problems and propose solutions in SE [1]. Many initiatives arose to propose resources to support conducting empirical studies [3,4,5]. However, experiments in this area are still limited, which hinders its progress as a science and delays the adoption of new technologies in the software industry [2, 6]. To improve research quality and to increase the use of empirical strategies in SE, it is necessary to understand the research designs and methods available, and the mechanisms used to aid researchers.

This scenario motivated us to investigate which mechanisms support empirical studies in SE. We focused our investigation on the most well-known venues of the ESE community: the international conferences on Evaluation and Assessment in Software Engineering (EASE) and on Empirical Software Engineering and Measurement (ESEM), and the international journal on Empirical Software Engineering (ESEJ). People involved in publishing and/or peer-reviewing in these venues are well-established researchers in the ESE field. Moreover, studies published in these venues can reflect a considerable spectrum of the ESE community, providing a rich knowledge base of this area.

We carried out a systematic mapping study (SMS) to identify which mechanisms have been used to support empirical studies in EASE, ESEM, and ESEJ. Our research protocol was based on the guidelines of Kitchenham et al. [7]. We collected 898 papers, including primary, secondary, and tertiary studies. We also categorized the empirical research strategies employed. Among our results, we observed that the most used support mechanisms are related to experiments [3], case studies [9], and systematic reviews [7]. The most reported empirical strategies are experiments and case studies.

The main contribution of this research is a list of the support mechanisms available to aid empirical studies, and of the contexts in which they are applied. Such a list may serve as a reference for SE researchers when deciding which empirical strategies and support mechanisms to use in a specific research effort. This is a valuable asset, mainly for newcomers and less experienced researchers. We also identified new perspectives and gaps that can foster the development of mechanisms to aid empirical studies.

2. METHOD

We conducted an SMS to identify methodologies, processes, guidelines, and tools used to support empirical research in SE. We followed the guidelines of Kitchenham et al. [7]. The complete protocol is available elsewhere (http://bit.ly/1ptiKTU).
The study addresses the following research questions:

RQ 1: Which support mechanisms are used to conduct the empirical studies published in EASE, ESEM, and ESEJ?
RQ 2: Which empirical strategies are most used in the research published in EASE, ESEM, and ESEJ?

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ESEM’14, September 18-19, 2014, Torino, Italy.
Copyright 2014 ACM 978-1-4503-2774-9/14/09…$15.00
http://dx.doi.org/10.1145/2652524.2652572

2.1 Search Strategy

Since our SMS exclusively analyzes the EASE, ESEM, and ESEJ publications, no search string or automatic search was necessary; the search process involved only manual searches. All studies from these venues, from their first editions until 2013, were obtained.

Considering that only manual search was done, most of the articles found were considered relevant to the research. Despite that, three exclusion criteria were adopted: (1) short papers, (2) all non-technical research studies (tutorial, keynote, industrial presentation, etc.), and (3) duplicate papers.
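The three exclusion criteria can be read as a simple filter over the collected papers. The following is a minimal sketch, not the procedure the authors actually ran: the `Paper` record, its field names, the `kind` labels, and the choice to detect duplicates by normalized title are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical record layout; field names and "kind" labels are assumptions.
@dataclass(frozen=True)
class Paper:
    title: str
    venue: str
    year: int
    kind: str  # e.g. "full", "short", "tutorial", "keynote"

# Criterion (2): kinds treated as non-technical (illustrative list).
NON_TECHNICAL = {"tutorial", "keynote", "industrial presentation"}

def select(papers):
    """Drop short papers, non-technical items, and duplicates, in input order."""
    seen_titles = set()
    kept = []
    for paper in papers:
        if paper.kind == "short" or paper.kind in NON_TECHNICAL:
            continue  # criteria (1) and (2)
        key = paper.title.strip().lower()  # assumed duplicate test
        if key in seen_titles:
            continue  # criterion (3)
        seen_titles.add(key)
        kept.append(paper)
    return kept
```

In practice a duplicate check would likely also need to compare authors and venue; the title-only key keeps the sketch short.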

2.2 Data Extraction

The data extraction process involved eight researchers, four PhD and four MSc students, divided into pairs so that each paper was read by at least two researchers. Before the actual data extraction, we performed an extraction pilot to calibrate the extraction instrument and to avoid misunderstandings among the participants.

During the data extraction, papers were analyzed considering abstract, introduction, methodology, results, and conclusion. In some cases, a meticulous reading of the full text was necessary. We extracted the information exactly as the authors stated it in the paper. Any conflicts were discussed and resolved internally by the pairs. If there was no consensus, they were discussed with all participants in general meetings. The data were recorded in a spreadsheet in which each column represents a piece of information to be extracted from the studies, as detailed in Table 1.
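To illustrate how conflicts within a pair could be surfaced, the sketch below compares the two researchers' records for one paper and lists the fields on which they disagree. It assumes each record is a dictionary keyed by spreadsheet column; the function name and the column names are hypothetical.

```python
def conflicting_fields(first, second):
    """Return the sorted column names on which two extraction records differ."""
    columns = set(first) | set(second)
    # A column missing from one record counts as a disagreement.
    return sorted(c for c in columns if first.get(c) != second.get(c))

# Usage: the two extractors agree on the mechanism but not on the strategy.
reader_a = {"strategy": "Experiment", "mechanism": "[3]"}
reader_b = {"strategy": "Case Study", "mechanism": "[3]"}
print(conflicting_fields(reader_a, reader_b))  # ['strategy']
```

An empty result means the pair agrees; a non-empty one marks the paper for discussion within the pair and, if needed, in a general meeting.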

Table 1. Data extraction instrument

Information	Description
General Information	Title; Authors; Institution; Publication Year.
Support Mechanism	Bibliographical Reference; Mechanism Type (Methodologies, Processes, Guidelines, Tools, Techniques, Good Practices, or General Mechanism); Mechanism Domain.
Empirical Strategy	Empirical Strategy Type: Experiment, Case Study, Survey, Ethnography, Action Research, Systematic Literature Study, Mixed Methods, Others, or Not Identified.

We consider a mechanism to be any resource that supports empirical strategies, such as tools, methodologies, processes, and guidelines. We also include any resources used to analyze the study results (qualitatively or quantitatively) or used only to guide the study validation. However, we do not count mechanisms that target a specific domain other than ESE. For instance, suppose a study performs a case study in an agile project and uses one guideline to support the case study and another to support the agile method. We extracted only the case study guideline as a mechanism, since the guideline for the agile method is not specific to the ESE domain.

A fundamental remark is that all extracted pieces of information have to correspond strictly to the authors' words in the paper. We adopted this policy in order to avoid subjectivity and to make the replication and verification of our study easier. Therefore, the type of a mechanism is defined based only on the paper's content. If the authors do not specify the mechanism's type, it is classified as "General Mechanism". The same approach was followed for the empirical strategy classification. If the authors do not state the empirical strategy applied or do not perform an empirical study, the study is classified as Not Identified. In general, these papers are theoretical studies or perform a dataset analysis.

The empirical strategy classification adopted is the one provided by Easterbrook et al. [8]. We classify systematic reviews, systematic mappings, and tertiary studies as Systematic Literature Studies. In addition, all studies that adopt more than one empirical strategy were classified as Mixed Methods [8]. Studies that specify an empirical strategy that does not match any of the strategies presented in Table 1 are classified as Others, for instance, focus group, cross validation, and qualitative study.
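The classification rules above can be summarized as a small decision procedure. The following sketch is one possible reading, not the authors' instrument: the exact label strings and the decision to merge literature-study variants before counting strategies are assumptions.

```python
# Strategies named in Table 1 (label spellings are assumptions).
BASE = {"Experiment", "Case Study", "Survey", "Ethnography", "Action Research"}
# Variants grouped under one category, as described in the text.
LITERATURE = {"Systematic Review", "Systematic Mapping", "Tertiary Study"}
SLS = "Systematic Literature Study"

def classify(stated):
    """Map the strategies stated by a paper's authors to one category."""
    normalized = {SLS if s in LITERATURE else s for s in stated}
    if not normalized:
        return "Not Identified"  # no strategy stated
    if len(normalized) > 1:
        return "Mixed Methods"   # more than one strategy adopted
    (only,) = normalized
    if only in BASE or only == SLS:
        return only
    return "Others"              # stated, but not one of the listed strategies
```

For example, `classify(["Survey", "Case Study"])` yields `"Mixed Methods"`, while `classify(["Focus Group"])` yields `"Others"`.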

Finally, two researchers were responsible for integrating the spreadsheets from all teams. The result was a single spreadsheet with all data extracted from the studies included in this SMS.

2.3 Data Analysis

The data collected from the studies were organized in charts and tables, which allow better visualization. Since the amount of extracted data is large, we developed a tool to automate reading the data from the spreadsheets and to organize the results by counting and graph plotting. The source code of our tool is available online (http://bit.ly/1ijEAZE).
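The counting step of such a tool can be sketched in a few lines. This is not the authors' tool: the CSV layout, the `mechanisms` column name, the semicolon separator, and the sample rows are assumptions chosen only to show the idea of counting how many studies cite each mechanism.

```python
import csv
import io
from collections import Counter

def count_mechanisms(csv_text):
    """Count how many studies cite each mechanism ID (one row per study)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A set ensures a mechanism is counted once per study.
        ids = {m.strip() for m in row["mechanisms"].split(";") if m.strip()}
        counts.update(ids)
    return counts

# Made-up sample rows in the assumed layout.
SAMPLE = """study,strategy,mechanisms
S1,Experiment,SM39;SM45
S2,Case Study,SM36
S3,Experiment,SM39
S4,Not Identified,
"""

print(count_mechanisms(SAMPLE).most_common(1))  # [('SM39', 2)]
```

Rows with an empty `mechanisms` cell, such as S4, contribute nothing to the counts, which mirrors studies that cite no support mechanism.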

3. RESULTS AND DISCUSSIONS

Unlike other SMSs, we cannot detail each article collected, since our study included 898 articles. Due to space constraints, we show only some of the evidence for each research question. Further results are available elsewhere (http://bit.ly/1ptiKTU).

3.1 General Information

We gathered the studies from the conference websites and from several search engines (IEEE, ACM, Springer Link, ScienceDirect, etc.). When a paper was not available, we contacted the authors. However, even with these efforts, we could not find 44 papers. After the collection process, we excluded non-technical papers, short papers, and duplicate papers, resulting in 898 papers: 202 from EASE, 374 from ESEM, and 322 from ESEJ.
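As a quick consistency check, the per-venue counts reported above can be summed and turned into shares of the total. The snippet uses only numbers stated in the text; the variable names are illustrative.

```python
# Per-venue counts as reported in the text.
PER_VENUE = {"EASE": 202, "ESEM": 374, "ESEJ": 322}

total = sum(PER_VENUE.values())
shares = {venue: 100 * n / total for venue, n in PER_VENUE.items()}

print(total)  # 898
for venue, share in shares.items():
    print(f"{venue}: {share:.1f}% of the selected papers")
```

The sum matches the 898 selected papers, and ESEM accounts for the largest share of the set.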

Figure 1 presents the number of full papers gathered by year. Since its first edition, the number of papers published at EASE has oscillated smoothly as it grew, with a mean of 12 papers per year. ESEM, on the other hand, has had a larger number of publications since its first editions. In the last eight years, the number of ESEJ publications has been high and almost constant.

Figure 1. Distribution of full papers by year

We identified 1,972 authors who had at least one study published in these venues. Barbara Kitchenham, Emilia Mendes, and Claes Wohlin are the most active authors; they published 32, 30, and 30 studies, respectively. We also made a geographic analysis of our data. The country that contributes most to these venues is the United States of America, with 187 publications, followed by the United Kingdom, with 138 publications. Other countries with an important role as contributors are Sweden (82), Germany (78), Norway (71), Italy (67), and Australia (61). The institutions that contributed the most published studies are Lund University (40), Keele University (37), and Maryland University (37).

3.2 Mechanisms to Support Empirical Studies

We identified 412 support mechanisms. Each mechanism received a unique identifier; for instance, SM01 stands for Support Mechanism 01. Due to space constraints, we show only the most used mechanisms. The complete list of mechanisms, organized by ID, is available online (http://bit.ly/1ptiKTU).

Table 2 identifies the support mechanisms most used by the collected empirical studies. The first column presents their ID, the second column shows the mechanisms references, the third shows the count of studies that cited each mechanism, and the fourth which domain each mechanism aims to support.

Table 2. Most used support mechanisms

Mechanism ID	Reference	Number of Citations	Domain
SM39	[3]	88	Experiment
SM36	[9]	51	Case Study
SM45	[11]	32	Experiment
SM99	[7]	31	Systematic Study
SM09	[10]	25	Experiment Goals (GQM)
SM129	[2]	23	Experiment
SM17	[16]	21	Quantitative and Qualitative Approaches
SM77	[17]	21	Qualitative Data Analysis
SM68	[13]	19	Qualitative Data Analysis
SM69	[12]	19	Systematic Literature Study
SM29	[15]	19	Experiment
SM67	[14]	17	Grounded Theory

The most used support mechanism was SM39, cited by 88 studies. It is a guideline for experiment planning and execution, and it also covers threats to research validity. Other frequently cited guidelines for experiments were SM45 (32 studies) and SM129 (23 studies). We also identified a web-based framework to support SE experiment activities (SM98). It is important to mention that 32% of the experiment studies (95 studies) do not cite any mechanism to support their empirical process.

Cited by 51 studies, the second most used support mechanism was SM36. It presents methods to aid case study research, supporting research design, evidence collection, and evidence analysis. Ten other case studies cited SM15, nine studies cited SM82, and seven cited SM113. All these mechanisms are guidelines for conducting case study research. In spite of this, 110 studies (51% of case studies) do not cite any support mechanism.

Besides experiment and case study, another empirical strategy that has many support mechanisms is the systematic literature study. Cited by 31 studies, SM99 was the fourth most used support mechanism overall (Table 2). It is a set of guidelines used to plan and guide systematic literature reviews and systematic mappings. Another frequently cited guideline for systematic studies was SM69 (19 studies). All systematic studies use at least one support mechanism.

We also identified some mechanisms to support surveys. SM80, the most used (nine studies), consists of a set of principles to plan and conduct a survey. SM23 and SM52 are survey handbooks. SM141 (Lime Survey http://limesurvey.org) and SM283 (Survey Monkey https://pt.surveymonkey.com) are web tools to create survey questionnaires and perform data analysis. Other mechanisms to support surveys can be found in the complete list.

For ethnography and action research, we found few references. SM48 presents principles and SM156 presents a step-by-step procedure to perform ethnography. Neither is specific to SE, but both can be applied to it. Two guidelines to conduct action research were found: SM160 and SM162. To the best of our knowledge, there is no guideline for these strategies that is specific to SE, which could be characterized as an open issue in ESE.

Other identified support mechanisms do not address a specific empirical strategy:

Statistical Data Analysis: SM03 and SM16 present principles and techniques to perform statistical data analysis. The ANOVA variance analysis model (SM75) was also used in some studies for this same purpose. Cohen (SM65) presents another widely used statistical model for data analysis. S-PLUS (SM27) is a statistical tool that allows manipulating experiment data, performing statistical analysis, and creating graphs. Another tool found was R (SM137), an environment for statistical computing.
Qualitative Data Analysis: besides the main mechanisms listed in Table 2 (SM67, SM68, and SM77), we identified other mechanisms to support qualitative data analysis. SM85 provides guidelines on data analysis and synthesis. SM325 recommends steps for thematic synthesis in SE.
Replication: SM12 presents a replication approach to ESE research. It was used by three studies, two surveys and one experiment. SM176 and SM218 were applied in the replication of survey studies. SM305 presents some good practices and SM295 is a framework for the replication of SE experiments.
Validity of the Research: SM39 is the mechanism most used to address threats to validity (23 citations for this goal).

Perhaps the most remarkable result concerns the studies that do not cite any mechanism to support their empirical strategies: 412 studies (45% of the total). This number is slowly decreasing: over the last four years, the rate of studies citing no support mechanism averaged 35%. This lack can limit research and hinder improvement in the ESE area. We believe that one way to address this issue is for researchers to build a catalog with recommendations of support mechanisms.

3.3 Empirical Methods Applied

Figure 2 presents the distribution of empirical strategies among the studies published in EASE, ESEM, and ESEJ. Experiment is the most commonly adopted empirical strategy, with 288 studies, 32% of the total. Case study is the second most adopted strategy, with 218 studies (24%). The third most frequently reported strategy is survey, with 57 studies (7%). Survey is also the strategy that appears most often in combination with other empirical strategies.

Systematic literature studies form a group of 42 studies, of which 32 are systematic literature reviews, eight are systematic mapping studies, and just two are tertiary studies. The number of studies of this kind shows a distinctive evolution, rising steadily since the last decade, so we consider this strategy a trend.

Figure 2. Empirical strategies distribution

For the studies classified as Others (108 studies), the authors described their studies as: empirical study, focus group, empirical investigation, qualitative study, qualitative research, quantitative analyses, empirical analysis, grounded theory, empirical validation, correlational study, and cross validation. Thirty studies were classified as Mixed Methods. The most used combination was survey and case study, with nine studies. The strategy most frequently adopted in mixed methods is survey, appearing in 83% of mixed methods studies (25 occurrences), followed by case study, with 18 occurrences.
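The occurrence counts and percentages for mixed methods studies follow from counting, for each strategy, the number of combinations in which it appears. The sketch below shows that counting on a made-up list of combinations and then reproduces the reported percentage from the figures stated in the text; the function names and the sample combinations are illustrative.

```python
from collections import Counter

def strategy_occurrences(combinations):
    """Count in how many mixed methods studies each strategy appears."""
    counts = Counter()
    for combo in combinations:
        counts.update(set(combo))  # once per study, even if listed twice
    return counts

def percent(part, whole):
    """Share of part in whole, rounded to the nearest integer percentage."""
    return round(100 * part / whole)

# Made-up combinations, only to show the counting.
example = [("Survey", "Case Study"), ("Survey", "Experiment")]
print(strategy_occurrences(example)["Survey"])  # 2

# Figures stated in the text: 30 mixed methods studies.
print(percent(25, 30))  # 83 (survey)
print(percent(18, 30))  # 60 (case study)
```

Because a study combines at least two strategies, the occurrence counts across strategies sum to more than the number of studies.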

It is important to mention that we identified a high number of Not Identified studies: 16% (145 studies). These studies are present in almost all editions of the venues. The number of such studies is not increasing; rather, it is slowly decreasing. In spite of all efforts to evolve ESE, we consider these results a sign of misunderstandings about the usage of ESE methodologies, which could be mitigated by increasing knowledge about the empirical strategies and their mechanisms.

4. STUDY LIMITATIONS

An important limitation concerns the studies that were not available. Most of them were published in early EASE editions; for ESEM and ESEJ, we had fewer problems of this kind. However, we believe this gap does not invalidate our conclusions, since such studies correspond to only 4.8% of the candidate studies for our SMS (and some of them could have been excluded anyway).

Because our study set is larger than those of other systematic mappings, one possible threat to this work is inaccuracy in the data extraction. To mitigate this threat, the entire extraction process was performed by pairs of researchers, as described in Section 2.2, and all disagreements were resolved collectively in the research group. In addition, we performed an extraction pilot to avoid misunderstandings among the participants. Since the amount of extracted information is also large, we developed a tool to automatically consume and analyze it (Section 2.3). Moreover, we conducted a review to evaluate whether the information presented by the tool is accurate. In particular, two researchers manually extracted part of the information presented in this paper and compared it with the output of the tool. No disagreements were found.

5. CONCLUDING REMARKS

Empirical methods allow evaluating research results and conducting studies with greater scientific value, thus contributing to the advancement of SE. The goal of this systematic mapping was to investigate the mechanisms used to support empirical methods in studies published by EASE, ESEM, and ESEJ, representing the ESE community, and also the methods applied.

Among our findings, we highlight: (i) the most used support mechanisms are Wohlin et al. [3], for experiments, and Yin [9], for case studies; (ii) experiments and case studies are the most adopted empirical strategies in the ESE community; (iii) systematic literature studies are a trend; and (iv) support mechanisms for systematic studies [7,12] are well established.

The list of support mechanisms found, together with where and how they were applied, is a major contribution of this work to the SE community interested in empirical studies. It can help mainly newcomers and less experienced researchers to choose strategies and support mechanisms to conduct research, and thus further popularize the use of empirical studies.

We believe that our results can reflect a significant spectrum of the ESE community. However, in order to improve the accuracy of our results, we intend to perform a similar study considering more general conferences, comparing our results with those of other SE communities. A more detailed analysis of the mechanisms found could also be a good contribution to the topic.

6. ACKNOWLEDGMENTS

This work was partially supported by the National Institute of Science and Technology for Software Engineering (INES), funded by CNPq and FACEPE, grants 573964/2008-4 and APQ-1037-1.03/08. Sergio is partially supported by CNPq grants 304581/2013-5 and 471381/2012-8.

7. REFERENCES

[1] D. I. Sjoberg, et al. The Future of Empirical Methods in Software Engineering Research. ICSE 2007. IEEE, Washington, 358-378.
[2] N. Juristo and A. M. Moreno. Basics of software engineering experimentation. Springer Publishing Company, Incorporated, 2010.
[3] C. Wohlin, et al. Experimentation in software engineering. Springer Publishing Company, Incorporated, 2012.
[4] V. R. Basili, et al. Experimentation in software engineering. Software Engineering, IEEE Transactions on, (7):733-743, 1986.
[5] G. H. Travassos, et al. An environment to support large scale experimentation in software engineering. In ICECCS, IEEE, 2008.
[6] W. F. Tichy. Should computer scientists experiment more? Computer, 31(5): 32-40, 1998.
[7] B. Kitchenham and S. Charters. Guidelines for performing Systematic Literature Reviews in Software Engineering. Technical Report. Keele University and University of Durham, 2007.
[8] S. Easterbrook, et al. Selecting empirical methods for software engineering research. In Guide to Advanced ESE, Springer, 2008.
[9] R. K. Yin. Case study research: Design and methods. Sage, 2009.
[10] V. R. Basili, G. Caldiera, and H. D. Rombach. The goal question metric approach. Encyclopedia of software engineering, 2:528-532, 1994.
[11] B. A. Kitchenham, et al. Preliminary guidelines for empirical research in software engineering. IEEE TSE, 28(8):721-734, 2002.
[12] B. A. Kitchenham. Procedures for performing systematic reviews. Technical Report. Keele University at Staffordshire, 2004.
[13] J. Corbin and A. Strauss. Basics of qualitative research: Techniques and procedures for developing grounded theory. Sage, 2008.
[14] B. G. Glaser and A. L. Strauss. The discovery of grounded theory: Strategies for qualitative research. Transaction Books, 2009.
[15] V. R. Basili, et al. Building knowledge through families of experiments. IEEE TSE, 25(4):456-473, 1999.
[16] C. Robson. Real world research: A resource for social scientists and practitioner-researchers, volume 2. Blackwell Oxford, 2002.
[17] C. B. Seaman. Qualitative methods in empirical studies of software engineering. IEEE TSE, 25(4):557-572, 1999.