Yegor Bugayenko - Academia.edu
Papers by Yegor Bugayenko
arXiv (Cornell University), Mar 22, 2024
In traditional management, tasks are typically assigned to individuals, with each worker taking full responsibility for the success or failure of a task. In contrast, modern Agile, Lean, and eXtreme Programming practices advocate for shared responsibility, where an entire group is accountable for the outcome of a project or task. Despite numerous studies in other domains, the preferences of programmers have not been thoroughly analyzed. To address this gap, we conducted a survey featuring seven situational questions and collected the opinions of 120 software development practitioners. Our findings reveal that programmers prefer tasks to be assigned to them on an individual basis and appreciate taking personal responsibility for failures, as well as receiving individual rewards for successes. Understanding these preferences is crucial for project managers aiming to optimize team dynamics and ensure the successful completion of software projects.
arXiv (Cornell University), Mar 16, 2024
arXiv (Cornell University), Mar 13, 2024
Even though numerous researchers require stable datasets along with source code and basic metrics calculated on them, neither GitHub nor any other code hosting platform provides such a resource. Consequently, each researcher must download their own data, compute the necessary metrics, and then publish the dataset somewhere to ensure it remains accessible indefinitely. Our CAM (which stands for "Classes and Metrics") project addresses this need. It is open-source software capable of cloning Java repositories from GitHub, filtering out unnecessary files, parsing Java classes, and computing metrics such as Cyclomatic Complexity, Halstead Effort and Volume, C&K metrics, Maintainability Metrics, LCOM5 and HND, as well as some Git-based metrics. At least once a year, we execute the entire script, a process that requires a minimum of ten days on a very powerful server, to generate a new dataset. Subsequently, we publish it on Amazon S3, thereby ensuring its availability as a reference for researchers. The latest archive of 2.2Gb, which we published on the 2nd of March, 2024, includes 532K Java classes with 48 metrics for each class.
arXiv (Cornell University), Nov 26, 2021
Object-oriented programming (OOP) is one of the most popular paradigms used for building software systems. However, despite its industrial and academic popularity, OOP is still missing a formal apparatus similar to λ-calculus, which functional programming is based on. There have been a number of attempts to formalize OOP, but none of them has managed to cover all the features available in modern OO programming languages, such as C++ or Java. We have made yet another attempt and created φ-calculus. We also created EOLANG (also called EO), an experimental programming language based on φ-calculus.
Class cohesion is a measure of the degree to which a class's inner elements, like methods and attributes, are bound or related to one another. Over thirty different formulas have been proposed to calculate the metric. None of them is explicitly designed to deal with constructors any differently than with regular methods; they simply treat them as identical entities. However, as many object-oriented theorists say, constructors play a very specific role in the object life cycle. In the scope of this empirical research, five different formulas were implemented in two ways: including constructors and excluding them. Then both sets of formulas were applied to the same set of 1000 mid-size open source Java projects. The results obtained demonstrated how much of a distortion constructors were bringing into metric calculations.
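To make the effect of constructors on a cohesion formula concrete, here is a minimal sketch. It uses the pairwise LCOM formulation of Chidamber and Kemerer, chosen only as a familiar example; the abstract does not name the five formulas the study used, and the class layout, the `<init>` constructor name, and the helper names below are hypothetical.

```python
from itertools import combinations


def lcom(methods):
    """Pairwise LCOM: (pairs sharing no attribute) - (pairs sharing one), floored at 0.

    `methods` maps a method name to the set of attribute names it uses.
    """
    disjoint = shared = 0
    for attrs_a, attrs_b in combinations(methods.values(), 2):
        if attrs_a & attrs_b:
            shared += 1
        else:
            disjoint += 1
    return max(disjoint - shared, 0)


def without_constructors(methods, ctor_names=("<init>",)):
    """Return a copy of `methods` with constructor entries dropped (names are assumed)."""
    return {name: attrs for name, attrs in methods.items() if name not in ctor_names}


# Hypothetical class: the constructor touches every attribute,
# while each getter touches exactly one.
example = {
    "<init>": {"x", "y", "z"},
    "getX": {"x"},
    "getY": {"y"},
    "getZ": {"z"},
}

print(lcom(example))                        # constructor included: 0
print(lcom(without_constructors(example)))  # constructor excluded: 3
```

In this toy class the constructor shares an attribute with every getter, so counting it hides the fact that the getters have nothing in common with each other.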
arXiv (Cornell University), Jun 6, 2022
We introduce a taxonomy of objects for the EO programming language. This taxonomy is designed with a few principles in mind: non-redundancy, simplicity, and so on. The taxonomy is supposed to be used as a navigation map by EO programmers. It may also be helpful as a guideline for designers of other object-oriented languages or libraries for them.
Communications of The ACM, Jun 25, 2018
Springer eBooks, 2009
There are many methods of software cost estimation (COCOMO, function point analysis, three-point estimate, use case points, class points, XP user stories, SLOC prediction and others), each with its advantages and drawbacks. One common problem with all methods is the necessity to estimate the whole requirements specification, item by item. In the end, either this process is expensive or the numbers are inaccurate. This paper presents a method of software cost estimation using a limited number of functional requirements, called Scope Champions. Estimators produce more detailed and grounded numbers that are used in a final estimation formula. The method reduces the costs of estimating and increases accuracy.
Quality of Code is an important and critical health indicator of any software development project. However, due to the complexity and ambiguity of calculating this indicator, it is rarely used in commercial contracts. As programmers are much more motivated by the delivery of functionality than by the quality of the code beneath it, they often produce low-quality code, which leads to post-delivery and maintenance problems. The proposed mechanism eliminates this lack of attention to Quality of Code. The results achieved after the implementation of the mechanism are more motivated programmers, higher project sponsor confidence and a predictable Quality of Code.
Communications of The ACM, Aug 22, 2018
Procedia Computer Science, 2020
Communications of The ACM, Oct 22, 2020
The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we'll publish selected posts or excerpts. Follow us on Twitter at http://twitter.com/blogCACM. http://cacm.acm.org/blogs/blog-cacm. David Patterson wants to boost industry submissions to conferences, while Yegor Bugayenko suggests productivity should govern coders' pay when they work from home.
IEEE Access, 2023
Context: Within the domain of managing software development teams, effective task prioritization is a critical responsibility that should not be underestimated, particularly for larger organizations with significant backlogs. Current approaches primarily rely on predicting task priority without considering information about other tasks, potentially resulting in inaccurate priority predictions. Objective: This paper presents the benefits of considering the entire backlog when prioritizing tasks. Method: We employ an iterative approach using particle swarm optimization to optimize a linear model with various preprocessing methods to determine the optimal model for task prioritization within a backlog. Results: The findings of our study demonstrate the usefulness of constructing a task prioritization model based on complete information from the backlog. Conclusion: The method proposed in our study can serve as a valuable resource for future researchers and can also facilitate the development of new tools to aid IT management teams.
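As a rough illustration of optimizing a linear model with particle swarm optimization, the sketch below fits the weights of a linear priority score to a toy backlog. Everything specific here is an assumption made for the example rather than a detail from the paper: the three task features, the target priorities, the mean-squared-error loss computed over the whole backlog, and the swarm parameters.

```python
import random

# Hypothetical backlog: (features, known priority) per task.
# The three features could stand for age, size and severity; all values are invented.
BACKLOG = [
    ([0.9, 0.2, 1.0], 0.95),
    ([0.1, 0.8, 0.3], 0.30),
    ([0.5, 0.5, 0.6], 0.60),
    ([0.3, 0.1, 0.9], 0.70),
    ([0.7, 0.9, 0.1], 0.35),
]


def mse(weights):
    """Mean squared error of the linear score over the entire backlog."""
    total = 0.0
    for features, target in BACKLOG:
        score = sum(w * x for w, x in zip(weights, features))
        total += (score - target) ** 2
    return total / len(BACKLOG)


def pso_minimize(loss, dim, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic global-best PSO; returns (best position, best loss, loss history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal best positions
    best_val = [loss(p) for p in pos]       # personal best losses
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    history = [g_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
        history.append(g_val)
    return g_pos, g_val, history


weights, error, history = pso_minimize(mse, dim=3)
print("weights:", [round(w, 3) for w in weights], "error:", error)
```

Because the loss sums over every task, each candidate weight vector is judged against the backlog as a whole rather than against one task at a time. A rank-based loss would arguably be closer to prioritization, but it is left out to keep the sketch short.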
Social Science Research Network, 2022
arXiv (Cornell University), Dec 17, 2021
C++, Java, C#, Python, Ruby, and JavaScript are the most powerful object-oriented programming languages, if language power is defined as the number of features available to a programmer. EO, on the other hand, is an object-oriented programming language with a reduced set of features: it has nothing but objects and mechanisms of their composition and decoration. We are trying to answer the following research question: "Which known features are possible to implement using only objects?"
A new metric was introduced to calculate the distance between actively modified files in a source code repository and files that are rarely modified and may be considered abandoned or even dead. It was empirically demonstrated that larger repositories have larger values of the introduced metric. The metric may be used for earlier detection of code maintenance anomalies and for helping software developers decide whether to split a repository into smaller ones in order to prevent maintainability issues. CCS Concepts: • Software and its engineering → Maintaining software.
Communications of The ACM, Oct 24, 2019