SAGH: A Social Analysis tool for GitHub
Related papers
GitHub open source project recommendation system
ArXiv, 2016
Hosting platforms for software projects can form collaborative social networks and a prime example of this is GitHub which is arguably the most popular platform of this kind. An open source project recommendation system could be a major feature for a platform like GitHub, enabling its users to find relevant projects in a fast and simple manner. We perform network analysis on a constructed graph based on GitHub data and present a recommendation system that uses link prediction.
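As a rough illustration of recommendation by link prediction, the sketch below scores the repositories a user has not yet starred by counting two-hop paths through other users who share a starred repository (a common-neighbour style score). The abstract does not say which graph construction or scoring function the authors use, so the star data, the function name `recommend_repos`, and the scoring rule are all assumptions made for illustration.

```python
from collections import defaultdict


def recommend_repos(stars, user, top_n=3):
    """Rank repositories `user` has not starred by counting
    user -> repo -> other user -> repo paths (assumed scoring rule)."""
    # Invert the mapping so we can look up who starred each repository.
    repo_to_users = defaultdict(set)
    for u, repos in stars.items():
        for r in repos:
            repo_to_users[r].add(u)

    own = stars.get(user, set())
    scores = defaultdict(int)
    for r in own:
        for other in repo_to_users[r]:
            if other == user:
                continue
            # Every repo of a co-starring user that is new to `user`
            # gains one point per connecting path.
            for candidate in stars[other]:
                if candidate not in own:
                    scores[candidate] += 1

    # Sort by score (descending), then by name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical toy data: user -> set of starred repositories.
stars = {
    "alice": {"numpy", "pandas"},
    "bob": {"numpy", "pandas", "scipy"},
    "carol": {"numpy", "scipy", "flask"},
    "dave": {"flask", "django"},
}

print(recommend_repos(stars, "alice"))
```

In this toy data, "scipy" is reachable from alice through three paths and "flask" through one, so "scipy" is ranked first. Other link prediction scores (for example Jaccard or Adamic-Adar) could be swapped in at the point where the path count is incremented.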
Recommending Relevant Open Source Projects on GitHub using a Collaborative-Filtering Technique
International Journal of Open Source Software and Processes, 2015
GitHub is now an essential tool for developers around the world; it serves as a social network in which they share their open source projects with others in the form of repositories. This paper presents and discusses the design and implementation of a new recommender system for GitHub repositories based on a collaborative-filtering approach, which can help developers search for the right solutions when building projects. With millions of developers sharing a large number of projects on GitHub, such a recommender system can reduce search time and make search results more relevant. The authors evaluate their system by conducting a set of experiments on a real data set using several well-known metrics and k-fold cross-validation. The results of these experiments are very promising; the authors found that their recommender system can rea...
From Periphery to Core: A Temporal Analysis of GitHub Contributors’ Collaboration Network
Collaboration in a Data-Rich World
Open-source projects in GitHub exhibit rich temporal dynamics, and diverse contributors' social interactions further intensify this process. In this paper, we analyze temporal patterns associated with Open Source Software (OSS) projects and how the contributor's notoriety grows and fades over time in a core-periphery structure. In order to explore the temporal dynamics of GitHub communities we formulate a time series clustering model using both Social Network Analysis (SNA) and technical metrics. By applying an adaptive time frame incremental approach to clustering, we locate contributors in different temporal networks. We demonstrate our approach on five long-lived OSS projects involving more than 700 contributors and find that there are three main temporal shapes of attention when contributors shift from periphery to core. Our analyses provide insights into common temporal patterns of the growing OSS communities on GitHub and broaden the understanding of the dynamics and motivation of open source contributors. Keywords: Collaboration · Core-periphery · Socio-technical relationships · SNA. Recent studies have shown that only a small portion of contributors leads an OSS project, making a large proportion of the technical contributions [3-8]. For instance,
Applied Sciences
Software collaboration platforms where millions of developers from diverse locations can contribute to the common open source projects have recently become popular. On these platforms, various information is obtained from developer activities that can then be used as developer metrics to solve a variety of challenges. In this study, we proposed new developer metrics extracted from the issue, commit, and pull request activities of developers on GitHub. We created developer metrics from the individual activities and combined certain activities according to some common traits. To evaluate these metrics, we created an item-based project recommendation system. In order to validate this system, we calculated the similarity score using two methods and assessed top-n hit scores using two different approaches. The results for all scores with these methods indicated that the most successful metrics were binary_issue_related, issue_commented, binary_pr_related, and issue_opened. To verify our ...
Link Recommendation in collaborative coding platforms like GitHub using biased random walks Group 35
2013
In this paper, we address the classical problem of link prediction in social networks. Link prediction is the problem of identifying interactions between the nodes in the near future given the network snapshot at a point in time. Whereas the problem is well-studied in traditional social networks and co-authorship graphs, the concept of collaborative programming is fairly new and unexplored. There are several applications of link recommendation in this kind of a setting, like recommending potential collaborators on GitHub, making smarter hiring decisions, among others. To formalize, given the snapshot of a network at time t and a node u, we predict the nodes with whom u is likely to collaborate at time t′ > t, on an online collaborative coding platform like GitHub.
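The formalization above (given a snapshot and a node u, rank the nodes u is likely to collaborate with later) can be sketched with a random walk with restart whose steps are biased by edge weight. The abstract does not describe how the authors bias their walks, so the weighting scheme (edge weight as a stand-in for past collaboration strength), the restart probability, and the function names here are assumptions. The walk is computed by deterministic power iteration rather than sampling.

```python
def random_walk_with_restart(graph, source, restart=0.15, iterations=50):
    """Approximate visit probabilities of a walk that starts at `source`,
    follows edges in proportion to their weights, and jumps back to
    `source` with probability `restart` at every step.

    `graph` maps each node to a dict {neighbour: weight}; every neighbour
    must also appear as a key of `graph`."""
    nodes = list(graph)
    prob = {n: 0.0 for n in nodes}
    prob[source] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            total = sum(graph[n].values())
            if total == 0:
                # Dangling node: send its walking mass back to the source.
                nxt[source] += (1 - restart) * prob[n]
                continue
            for m, w in graph[n].items():
                nxt[m] += (1 - restart) * prob[n] * w / total
        # Restart mass from every node returns to the source.
        nxt[source] += restart * sum(prob.values())
        prob = nxt
    return prob


def recommend_collaborators(graph, u, top_n=3):
    """Rank nodes that are not yet linked to `u` by their visit probability."""
    scores = random_walk_with_restart(graph, u)
    candidates = [
        (n, p) for n, p in scores.items() if n != u and n not in graph[u]
    ]
    candidates.sort(key=lambda kv: (-kv[1], kv[0]))
    return candidates[:top_n]


# Hypothetical snapshot at time t: weights stand for past joint activity.
graph = {
    "u": {"a": 3, "b": 1},
    "a": {"u": 3, "c": 2},
    "b": {"u": 1, "d": 1},
    "c": {"a": 2},
    "d": {"b": 1},
}

for node, score in recommend_collaborators(graph, "u"):
    print(node, round(score, 3))
```

Because the walk leaves u towards "a" three times as often as towards "b", the node "c" (behind "a") is ranked above "d" (behind "b") in this toy graph.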
Analysis of Intercrossed Open-Source Software Repositories Data in GitHub
The use of public software repositories is becoming increasingly common as a strategy for reducing costs and streamlining software development processes. However, given the large amount of software available in such repositories, selecting the components and applications of higher quality, or those that best meet a product's quality requirements, has become demanding in both time and effort. In this paper we present a tool to support the analysis and selection of components and software applications in public repositories, particularly GitHub.
Discovering and Studying Collaboration Networks in Software Repositories
Collaboration is important to software development processes and collaboration networks help us understand its structure and patterns. A common problem, however, is that these networks are not known and need to be discovered. In this work we study collaboration networks of five projects using an existing method that mines these networks from version control systems. The method is based on Recommender System techniques and finds similar developers by analyzing commits that are made to common files. These similarities are then used to automatically construct the network and it is visualized using a force directed graph layout algorithm. Two of the studied projects come from industry and are closed source while the other three are open source. In each study we learn some of the project's collaboration form and organization. We also were able to find various aspects of these projects that were previously not known.
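The idea of linking developers who commit to common files can be sketched as follows. The abstract does not name the similarity measure used by the method, so Jaccard similarity over each developer's set of touched files, the 0.25 threshold, the commit records, and the function name `collaboration_edges` are all assumptions chosen for illustration.

```python
from itertools import combinations


def collaboration_edges(commits, threshold=0.25):
    """Build weighted edges between developers whose sets of committed
    files overlap, using Jaccard similarity (an assumed measure)."""
    # Collect the set of files each developer has committed to.
    files_by_dev = {}
    for dev, path in commits:
        files_by_dev.setdefault(dev, set()).add(path)

    edges = []
    for a, b in combinations(sorted(files_by_dev), 2):
        shared = files_by_dev[a] & files_by_dev[b]
        union = files_by_dev[a] | files_by_dev[b]
        similarity = len(shared) / len(union)
        if similarity >= threshold:
            edges.append((a, b, round(similarity, 2)))
    return edges


# Hypothetical (developer, file path) pairs mined from a version control log.
commits = [
    ("ana", "core/db.py"), ("ana", "core/api.py"),
    ("ben", "core/api.py"), ("ben", "core/db.py"), ("ben", "docs/index.md"),
    ("cho", "docs/index.md"), ("cho", "docs/faq.md"),
]

print(collaboration_edges(commits))
```

The resulting edge list could then be passed to any force-directed layout tool for visualization; that step is left out here.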
Open Source Software Recommendations Using Github
Digital Libraries for Open Knowledge, 2018
The focus of this work is on providing open source software recommendations using the GitHub API. Specifically, we propose a hybrid method that considers the programming languages, topics and README documents that appear in the users' repositories. To demonstrate our approach, we implement a proof of concept that provides recommendations.
Social Network Analysis of Developers' and Users' Mailing Lists of Some Free Open Source Software
2015 IEEE International Congress on Big Data, 2015
As reported by Kevin Crowston and co-authors in a recent paper, free open source software is a very important social phenomenon that involves nearly one million programmers, a myriad of software development firms, millions of users, and its financial impact is huge since for instance the cost of recreating available free software is estimated in tens of billions of euros. Free open source software projects generally have one mailing list for developers and another one for users. This large number of mailing lists changes constantly and shows a great variety with respect to membership and topics covered. This makes them very difficult to monitor. One way of overcoming this Big Data Challenge is to identify some easily computable global indicators that can be used for instance to detect important events. We illustrate this approach here by making a social network analysis and comparison of developers' and users' mailing lists of four free open source software projects: CentOS,...
GitEvolve: Predicting the Evolution of GitHub Repositories
2020
Software development is becoming increasingly open and collaborative with the advent of platforms such as GitHub. Given its crucial role, there is a need to better understand and model the dynamics of GitHub as a social platform. Previous work has mostly considered the dynamics of traditional social networking sites like Twitter and Facebook. We propose GitEvolve, a system to predict the evolution of GitHub repositories and the different ways by which users interact with them. To this end, we develop an end-to-end multi-task sequential deep neural network that given some seed events, simultaneously predicts which user-group is next going to interact with a given repository, what the type of the interaction is, and when it happens. To facilitate learning, we use graph based representation learning to encode relationship between repositories. We map users to groups by modelling common interests to better predict popularity and to generalize to unseen users during inference. We introdu...