Exploring the Ecosystem of Software Developers on GitHub and Other Platforms

Social coding in GitHub: transparency and collaboration in an open software repository

2012

Social applications on the web let users track and follow the activities of a large number of others regardless of location or affiliation. There is a potential for this transparency to radically improve collaboration and learning in complex knowledge-based activities. Based on a series of in-depth interviews with central and peripheral GitHub users, we examined the value of transparency for large-scale distributed collaborations and communities of practice.

Open Source-Style Collaborative Development Practices in Commercial Projects Using GitHub

Researchers are currently drawn to study projects hosted on GitHub due to its popularity, the ease of obtaining data, and its distinctive built-in social features. GitHub has been found to create a transparent development environment, which, together with a pull request-based workflow, provides a lightweight mechanism for committing, reviewing and managing code changes. These features impact how GitHub is used and the benefits it provides to teams' development and collaboration. While most of the evidence we have is from GitHub's use in open source software (OSS) projects, GitHub is also used in an increasing number of commercial projects. It is unknown how GitHub supports these projects given that GitHub's workflow model does not intuitively fit the commercial development way of working. In this paper, we report findings from an online survey and interviews with GitHub users on how GitHub is used for collaboration in commercial projects. We found that many commercial projects adopted practices that are more typical of OSS projects including reduced communication, more independent work, and self-organization. We discuss how GitHub's transparency and popular workflow can promote open collaboration, allowing organizations to increase code reuse and promote knowledge sharing across their teams.

SAGH: A Social Analysis tool for GitHub

2012

In this paper, we model the open source software development community as a heterogeneous social network of projects and developers. In particular, we work with the novel dataset of the GitHub developer community and provide a toolkit for extracting and analyzing a social network from the raw GitHub archives. We propose a developer recommendation system as one interesting application of our comprehensive toolkit. We also propose, implement, and compare different graph algorithms for computing similarity as possible solutions to the recommendation problem. Finally, we provide a generalized experimental framework in which the different algorithms can be compared against different desired input metrics thereby creating an adaptable recommendation tool.
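As a rough illustration of the similarity-based recommendation idea in this abstract, the sketch below treats developers and projects as a two-mode (developer-to-project) relation and ranks developers by how much their project sets overlap. The abstract does not name the algorithms the authors compared, so the use of Jaccard similarity, the function names, and the toy data are all assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_developers(
    contributions: Dict[str, Set[str]], target: str, top_k: int = 3
) -> List[Tuple[str, float]]:
    """Rank other developers by project-set similarity to `target`.

    `contributions` maps a developer name to the set of projects they
    contributed to (a hypothetical stand-in for data extracted from
    GitHub archives).
    """
    target_projects = contributions[target]
    scores = [
        (dev, jaccard(target_projects, projects))
        for dev, projects in contributions.items()
        if dev != target
    ]
    # Highest similarity first; break ties alphabetically for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_k]


# Hypothetical toy data, not taken from the paper.
contributions = {
    "alice": {"repo-a", "repo-b", "repo-c"},
    "bob": {"repo-b", "repo-c"},
    "carol": {"repo-c", "repo-d"},
    "dave": {"repo-e"},
}

if __name__ == "__main__":
    for dev, score in recommend_developers(contributions, "alice", top_k=2):
        print(f"{dev}: {score:.2f}")
```

Swapping `jaccard` for another similarity function (for example, one based on paths in the developer-project graph) would be one way to compare alternative algorithms under the same interface.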

A topology of groups: What GitHub can tell us about online collaboration

Technological Forecasting and Social Change, 2020

In this work, we study the collaboration patterns of open source software projects on GitHub by analyzing the pull request submissions and acceptances of repositories. We develop a group typology based on the structural properties of the corresponding directed graphs, and analyze how the topology is connected to the repository's collective identity, hierarchy, productivity, popularity, resilience and stability. These analyses indicate significant differences between group types and thereby provide valuable insights on how to effectively organize collaborative software development. Identifying the mechanisms that underlie self-organized collaboration on digital platforms is important not just to better understand open source software development but also all other decentralized and digital work environments, a setting widely regarded as a key feature of the future workplace.

An Overview on GITHUB

International Journal for Research in Applied Science and Engineering Technology, 2019

GitHub Inc. is a web-based hosting service built on Git, used mostly for source code. In addition to offering Git's distributed version control and SCM (Source Code Management) functionality, it provides its own features. For every project, it offers access control and features such as bug tracking, feature requests, task management, and wikis. GitHub offers plans for both private repositories and free accounts, the latter mostly used for hosting open-source software projects. According to a report for June 2018, it is the largest host of source code, with 28 million users and 57 million repositories, including 28 million public repositories. GitHub is both a version control and a collaboration platform for software development. It is generally delivered under a SaaS (Software-as-a-Service) business model. Launched in 2008, it is built on Git, the open-source source code management system created by Linus Torvalds to develop software at a quicker pace.

A topological analysis of communication channels for knowledge sharing in contemporary GitHub projects

Journal of Systems and Software, 2019

With over 28 million developers, the success of the GitHub collaborative platform is highlighted through an abundance of communication channels among contemporary software projects. Knowledge is broken into two forms and its sharing (through communication channels) can be described as externalization or combination by the SECI model. Such platforms have revolutionized the way developers work, introducing new channels to share knowledge in the form of pull requests, issues and wikis. It is unclear how these channels capture and share knowledge. In this research, our goal is to analyze these communication channels in GitHub. First, using the SECI model, we are able to map how knowledge is shared through the communication channels. Then in a large-scale topology analysis of seven library package projects (i.e., involving over 70 thousand projects), we extracted insights of the different communication channels within GitHub. Using two research questions, we explored the evolution of the channels and adoption of channels by both popular and unpopular library package projects. Results show that (i) contemporary GitHub projects tend to adopt multiple communication channels, (ii) communication channels change over time and (iii) communication channels are used both to capture new knowledge (i.e., externalization) and to update existing knowledge (i.e., combination).

G-Repo: a Tool to Support MSR Studies on GitHub

2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)

GitHub currently hosts more than 100 million public repositories. This has made it very popular to conduct Mining Software Repositories (MSR) studies. Researchers have been exploiting the information stored in GitHub (e.g., commits, pull requests, or issues) to investigate both developer- and project-related aspects. GitHub provides the REST API to make queries without cloning repositories. In this tool-demo paper, we highlight some issues we noticed when conducting an MSR study on GitHub by using the REST API and present G-Repo: a tool developed to support researchers in tackling these issues and to ease the creation of datasets for MSR studies. Also, we provide a manually-annotated dataset with information about the kind and the (spoken) languages of 1,500 repositories hosted on GitHub. A video showing the functioning of G-Repo is available at: https://youtu.be/mb9CIALBFZk.
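To make the idea of querying GitHub without cloning repositories more concrete, the following sketch builds a request URL for the REST API's repository search endpoint and extracts a few fields from a response-shaped dictionary. It is not G-Repo's implementation, which the abstract does not describe; the helper names and the sample payload are hypothetical, and the sketch deliberately performs no network request.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode

# Repository search endpoint of the GitHub REST API.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(language: str, min_stars: int, per_page: int = 50, page: int = 1) -> str:
    """Build a repository search URL using the `language:` and `stars:` qualifiers."""
    query = f"language:{language} stars:>={min_stars}"
    params = {"q": query, "per_page": per_page, "page": page}
    return f"{SEARCH_URL}?{urlencode(params)}"


def summarize_items(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep only a few fields from each repository in a search response."""
    return [
        {
            "full_name": item["full_name"],
            "stars": item["stargazers_count"],
            "language": item.get("language"),
        }
        for item in payload.get("items", [])
    ]


# Hypothetical payload with the same shape as a search response
# (the repository names and counts are made up).
sample_payload = {
    "total_count": 2,
    "items": [
        {"full_name": "example/alpha", "stargazers_count": 120, "language": "Python"},
        {"full_name": "example/beta", "stargazers_count": 300, "language": "Python"},
    ],
}

if __name__ == "__main__":
    print(build_search_url("python", 100))
    print(summarize_items(sample_payload))
```

In a real study the URL would be fetched with an HTTP client and an access token, and results would be paged through; rate limits and result caps are among the practical issues such tooling has to handle.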

GitHub Marketplace for Practitioners and Researchers to Date: A Systematic Analysis of the Knowledge Mobilization Gap in Open Source Software Automation

arXiv (Cornell University), 2022

Marketplaces for distributing software products and services have been gaining popularity. GitHub, which is best known for its version control functionality through Git, launched its own marketplace in 2017. GitHub Marketplace hosts third-party apps and actions to automate workflows in software teams. Currently, this marketplace hosts 440 Apps and 7,878 Actions across 32 different categories. Overall, 419 third-party developers released their apps on this platform, which 111 distinct customers adopted. The popularity and accessibility of GitHub projects have made this platform and the projects hosted on it one of the most frequent subjects for experimentation in software engineering research. A simple Google Scholar search shows that 24,100 research papers have discussed GitHub within the Software Engineering field since 2017, but none have looked into the marketplace. The GitHub Marketplace provides a unique source of information on the tools used by practitioners in the Open Source Software (OSS) ecosystem for automating their projects' workflows. In this study, we (i) mine and provide a descriptive overview of the GitHub Marketplace, (ii) perform a systematic mapping of research studies in automation for open source software, and (iii) compare the state of the art with the state of the practice on the automation tools. We conclude the paper by discussing the potential of GitHub Marketplace for knowledge mobilization and collaboration within the field. This is the first study on the GitHub Marketplace in the field.

The appropriation of GitHub for curation

2017

GitHub is a widely used online collaborative software development environment. In this paper, we describe curation projects as a new category of GitHub project that collects, evaluates, and preserves resources for software developers. We investigate: (1) what motivates software developers to curate resources; (2) why curation has occurred on GitHub; (3) how curated resources are used by software developers; and (4) how the GitHub platform could better support these practices. We conduct in-depth interviews with 16 software developers, each of whom hosts curation projects on GitHub. Our results suggest that the motivators that inspire software developers to curate resources on GitHub are similar to those that motivate them to participate in the development of open source projects. Convenient tools (e.g., Markdown syntax and the Git version control system) and the opportunity to address professional needs of a large number of peers attract developers to engage in curation projects on GitHub. Benefits of curating on GitHub include learning opportunities, support for development work, and professional interaction. However, curation is limited by GitHub's document structure, format, and a lack of key features, such as search. In light of this, we propose design possibilities to encourage and improve appropriations of GitHub for curation.

GitHub Discussions: An exploratory study of early adoption

Empirical Software Engineering, 2021

Discussions is a new feature of GitHub for asking questions or discussing topics outside of specific Issues or Pull Requests. Before being available to all projects in December 2020, it had been tested on selected open source software projects. To understand how developers use this novel feature, how they perceive it, and how it impacts the development processes, we conducted a mixed-methods study based on early adopters of GitHub Discussions from January until July 2020. We found that: (1) errors, unexpected behavior, and code reviews are prevalent discussion categories; (2) there is a positive relationship between project member involvement and discussion frequency; (3) developers consider GitHub Discussions useful but face the problem of topic duplication between Discussions and Issues; (4) Discussions play a crucial role in advancing the development of projects; and (5) positive sentiment in Discussions is more frequent than in Stack Overflow posts. Our findings are a first step...