GiLA: GitHub label analyzer

Exploring the Characteristics of Issue-Related Behaviors in GitHub Using Visualization Techniques

IEEE Access

Feedback from software users, such as bug reports, is vital in the management of software projects. In GitHub, the feedback is typically expressed as new issues. Through filing issue reports, users may help identify and fix bugs, document software code, and enhance software quality via feature requests. In this paper, we aim to investigate some characteristics of issues to facilitate issue management and software management. We investigate the degree of importance of issue-related behaviors in popular projects to assess the importance of issues in GitHub and analyze the effectiveness of issue labeling for issue handling. Then, we explore the patterns of issue commits over time in popular projects based on visual analysis and obtain the following results: we find that issue-related behaviors play important roles in GitHub. We also find that the time distribution of issue commits follows a three-period development model, which approximately corresponds to the project life cycle. These results may provide new knowledge about issues that can help managers manage and allocate project resources more effectively and even reduce software failures.

Index terms: Open-source software community, project development model, visual analysis, issue commit, software management.

Visualizing GitHub Issues

2021

The rise of distributed version control systems, such as git, and platforms built on top of it, such as GitHub, has triggered a change in how software is developed. Most notably, state-of-the-art practice foresees the use of pull requests and issues, enriched by means to enable discussions among the involved people. Platforms like GitHub and GitLab have thus turned into comprehensive and cohesive modern software development environments, also offering additional mechanisms, such as code review tools and a transversal support for continuous integration and deployment. However, the plethora of concepts, mechanisms, and their interconnections are stored and presented in textual form, which makes the understanding of the underlying evolutionary processes difficult. We introduce the notion of an issue tale, a visual narrative of the events and actors revolving around any GitHub issue, and present an approach, implemented as an interactive visual analytics tool, to depict and analyze the relevant information pertaining to issue tales. We illustrate our approach and its implementation on several open-source software systems.

Visualisation of GitHub’s Public Data

With software development becoming a very popular hobby and career for many, the range of technologies and programming languages that are available today is very wide and diverse; each language being crafted for certain problem areas. The goal of this project is to utilise the public data available in a large online source control system for creating a visualisation which can provide insight into the inter-connectivity of programming languages used in open-source projects. We build a graph model of the relationship between programming languages used in projects on GitHub (our online source control system of choice) where the connection between two programming languages depends on the number of users who can program in both of them. Using a force directed graph layout we create an interactive visualisation of the graph of programming languages. Our visualisation is available as a Web application at http://language-connectivity.herokuapp.com.
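The graph model described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `language_graph`, the input shape (a mapping from user name to the languages that user works in), and the sample users are all assumptions made for the example. Each edge weight counts the users who program in both languages.

```python
from collections import Counter
from itertools import combinations


def language_graph(user_languages):
    """Return edge weights keyed by an alphabetically ordered language pair.

    The weight of an edge is the number of users who use both languages.
    `user_languages` is a hypothetical input: {user: [language, ...]}.
    """
    edges = Counter()
    for languages in user_languages.values():
        # De-duplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(languages)), 2):
            edges[(a, b)] += 1
    return edges


# Made-up sample data, for illustration only.
users = {
    "alice": ["Python", "JavaScript"],
    "bob": ["Python", "JavaScript", "Go"],
    "carol": ["Go", "Python"],
}
graph = language_graph(users)
for (a, b), weight in sorted(graph.items()):
    print(f"{a} -- {b}: {weight} shared users")
```

The resulting weighted edge list is the kind of input a force-directed layout would consume, with heavier edges pulling the corresponding language nodes closer together.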

Maispion: A Tool for Analysing and Visualising Open Source Software Developer Communities

We present Maispion, a tool for analysing software developer communities. The tool, developed in Smalltalk, mines mailing lists and version repositories, and provides visualisations that offer insights into the ecosystem of open source software (OSS) development. We show how Maispion can analyse the history of medium to large OSS communities by applying our tool to three well-known open source projects: Moose, Drupal and Python.

QuerTCI: A Tool Integrating GitHub Issue Querying with Comment Classification

2022

Empirical Software Engineering (ESE) researchers study (open-source) project issues and the comments and threads within to discover, among other things, challenges developers face when incorporating new technologies, platforms, and programming language constructs. However, such threads accumulate, becoming unwieldy and hindering any insight researchers may gain. While existing approaches alleviate this burden by classifying issue thread comments, there is a gap between searching popular open-source software repositories (e.g., those on GitHub) for issues containing particular keywords and feeding the results into a classification model. This paper demonstrates a research infrastructure tool called QuerTCI that bridges this gap by integrating the GitHub issue comment search API with the classification models found in existing approaches. Using queries, ESE researchers can retrieve GitHub issues containing particular keywords, e.g., those related to a specific programming language construct, and, subsequently, classify the discussions occurring in those issues. We hope that ESE researchers can use our tool to uncover challenges related to particular technologies using specific keywords through popular open-source repositories more seamlessly than previously possible. A tool demonstration video may be found at: https://youtu.be/fADKSxn0QUk.

CCS concepts: Software and its engineering → Software libraries and repositories.

Binoculars: Comprehending Open Source Projects through Graphs

IFIP Advances in Information and Communication Technology, 2012

Comprehending Open Source Software (OSS) projects requires dealing with a huge amount of historical information stored in heterogeneous repositories, such as source code versioning systems, bug tracking systems, mailing lists, and revision history logs. In this paper, we present Binoculars, a prototype tool which aims to provide a platform for graph-based visualization and exploration of OSS projects. We describe the issues that need to be addressed in the design and implementation of a graph-based tool and distill lessons learned as guidelines for future work.

Predicting issue types on GitHub

Science of Computer Programming, 2021

Software maintenance and evolution involve activities that are critical to the success of software projects. To support such activities and keep code up-to-date and error-free, software communities make use of issue trackers, i.e., tools for signaling, handling, and addressing the issues occurring in software systems. However, in popular projects, tens or hundreds of issue reports are submitted daily. In this context, identifying the type of each submitted report (e.g., bug report or feature request) would facilitate the management and prioritization of the issues to address. To support issue handling activities, in this paper, we propose Ticket Tagger, a GitHub app that analyzes the issue title and description through machine learning techniques to automatically recognize the types of reports submitted on GitHub and assign labels to each issue accordingly. We empirically evaluated the tool's prediction performance on about 30,000 GitHub issues. Our results show that Ticket Tagger can identify the correct labels to assign to GitHub issues with reasonably high effectiveness. Considering these results and the fact that the tool is designed to be easily integrated in the GitHub issue management process, Ticket Tagger is a useful solution for developers.

Exploring the use of labels to categorize issues in Open-Source Software projects

2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2015

Reporting bugs, asking for new features and, in general, giving any kind of feedback is a common way to contribute to an Open-Source Software (OSS) project. This feedback is generally reported in the form of new issues for the project, managed by so-called issue trackers. One of the features provided by most issue trackers is the possibility to define a set of labels/tags to classify the issues and, at least in theory, facilitate their management. Nevertheless, there is little empirical evidence to confirm that taking the time to categorize new issues indeed has a beneficial impact on the project evolution. In this paper we analyze a population of more than three million GitHub projects and give some insights on how labels are used in them. Our preliminary results reveal that, even if the label mechanism is scarcely used, using labels favors the resolution of issues. Our analysis also suggests that not all projects use labels in the same way (e.g., for some, labels are only a way to prioritize the project, while others use them to signal their temporal evolution as they move along in the development workflow). Further research is needed to precisely characterize these label "families" and learn more about the ideal application scenarios for each of them.

Visualizing social interaction in open source software projects

2007

Open source software projects such as Apache and Mozilla present an opportunity for information visualization. Since these projects typically require collaboration between developers located far apart, the amount of electronic communication between them is large. Our goal is to apply information visualization techniques to assist software engineering scientists and project managers with analyzing the data. We present a visualization technique that provides an intuitive, time-series, interactive summary view of the social groups that form, evolve and vanish during the entire lifetime of the project. This visualization helps software engineering researchers understand the organization, structure, and evolution of the communication and collaboration activities of a large, complex software project.

GiveMeLabeledIssues: An Open Source Issue Recommendation System

arXiv (Cornell University), 2023

Developers often struggle to navigate an Open Source Software (OSS) project's issue-tracking system and find a suitable task. Proper issue labeling can aid task selection, but current tools are limited to classifying the issues according to their type (e.g., bug, question, good first issue, feature, etc.). In contrast, this paper presents a tool (GiveMeLabeledIssues) that mines project repositories and labels issues based on the skills required to solve them. We leverage the domain of the APIs involved in the solution (e.g., User Interface (UI), Test, Databases (DB), etc.) as a proxy for the required skills. GiveMeLabeledIssues facilitates matching developers' skills to tasks, reducing the burden on project maintainers. The tool obtained a precision of 83.9% when predicting the API domains involved in the issues. The replication package contains instructions on executing the tool and including new projects.