Newcomers Withdrawal in Open Source Software Projects: Analysis of Hadoop Common Project
Related papers
2012
Collective production communities, such as open source projects, depend on volunteer collaboration and require newcomers for their continuity. Newcomers face difficulties and obstacles when they start contributing, which results in high withdrawal and a consequently low retention rate. This paper presents an analysis of newcomer withdrawal, checking whether dropout is influenced by the lack of an answer, by the politeness and helpfulness of answers, and by the author of the answer. We collected five years of data from the developers' mailing list and from task manager (Jira) discussions of the Hadoop Common project. We observed the users' communication, identifying newcomers and classifying the content of questions and answers. In this study, less than 20% of newcomers became long-term contributors. There is evidence that withdrawal is influenced by who responds and by the type of response received. However, the lack of an answer was not shown to be a factor that influences newcomer withdrawal in the project.
Prediction of Developer Participation in Issues of Open Source Projects
2012
Developers of distributed open source projects use management and issue tracking tools to communicate. These tools produce a large volume of unstructured information that makes issue triage difficult and increases developers' overhead. This problem is common to online communities based on volunteer participation. This paper shows the importance of the content of comments in an open source project for building a classifier that predicts a developer's participation in an issue. To design this prediction model, we used two machine learning algorithms, Naive Bayes and J48. We used data from three Apache Hadoop subprojects to evaluate the algorithms. By applying our approach to the most active developers of these subprojects, we achieved an accuracy ranging from 79% to 96%. The results indicate that the content of comments in issues of open source projects is a relevant factor for building a classifier of issues for developers.
Keywords: content analysis; prediction model; issue tracking classifier; machine learning.
Students' Engagement in Open Source Projects
Proceedings of the 31st Brazilian Symposium on Software Engineering - SBES'17
Several open source software (OSS) communities promote and participate in initiatives such as summers of code to foster contributions and attract new developers. However, little is known about how successful these initiatives are. As a case study, we analyzed Google Summer of Code (GSoC), a three-month program that fosters students' participation in OSS projects. We found that 82% of the studied OSS projects merged at least one student commit into the codebase. When only newcomers are considered, ~54% of OSS projects merged at least one commit. We also found that ~23% of newcomers started contributing to GSoC projects before knowing they would be accepted. We found no statistically significant difference in retention time after GSoC between newcomers and students with prior participation in the projects, except for the 2015 edition. Using survival analysis, we found that ~40% of students kept contributing for longer than a month, while ~15% contributed for longer than a year. OSS communities can use our results to balance the trade-offs involved in joining this kind of program and to set expectations about how much contribution to expect and for how long students engage.
Identification and mitigation of risks in distributed development of software projects
Sistemas & Gestão, 2016
Markets are increasingly globalized, which affects many areas and gives rise to new forms of competition and cooperation that cross national borders. In this context, software development has also been affected, driving many organizations to try developing software at facilities located in several countries. This gave rise to distributed software development (DSD). DSD presents several risk factors, such as geographic dispersion and communication problems. Although organizations that adopt DSD face a number of difficulties, it was found that, with risk mitigation measures, the advantages of DSD outweigh the difficulties, and its use makes it possible to reduce costs and increase organizational competitiveness. A literature review and a case study were carried out, in which the most common DSD risks and the ways to address them were identified.
Definitions for an Approach to Innovative Software Project Management
Information Systems and Technology Management 2, 2019
Information resource management. 2. Management information systems. 3. Information technology. I. Machado, William Kaspchak. II. Series. CDD 658.4. Prepared by Maurício Amormino Júnior, CRB6/2422. The content of the articles and their data, in form, correctness, and reliability, are the sole responsibility of the authors. 2019. Downloading and sharing the work is permitted provided that credit is given to the authors, but it may not be altered in any way or used for commercial purposes. www.atenaeditora.com.br Information Systems and Technology Management 2, Chapter 1
Padrões de socialização de novatos em projetos de software livre (Socialization Patterns of Newcomers in Free Software Projects)
Open source software projects depend on volunteer collaboration and require a continuous influx of newcomers for their continuity. However, newcomers face difficulties and obstacles when they start contributing. Using an iterative method based on mining software repositories and social network analysis, we aim to detect socialization patterns of newcomers in open source software projects. As the research subject, we use the Apache project Hadoop Common. We analysed messages and issues through December 2012. The results indicate that most newcomers stay in the project for only a few months, and that the few persistent newcomers employ just one interaction method and interact mostly with experienced developers. Because of the small number of fruitful interactions, we could not detect further socialization patterns.
Collaboration in Open Source Development as a Strategy for the Software Industry
This article aims to identify the knowledge management actions used in free software development. It identifies differences between “free software” and “open source” in three dimensions: gratuity, freedom of use, and developer’s strategy. The research method uses the results of earlier research on the Brazilian scenario and compares them with a reference sample obtained in the capital of Santa Catarina state. The research instrument is a pre-tested questionnaire with forty-six questions, of which eight were selected to analyze issues related to knowledge management. The results show that the sample is close to the national average and that the environment is conducive to knowledge sharing among communities of practice, which can be operationalized through discussion groups, knowledge codification, and the use of knowledge repositories, among other knowledge management practices.