Software effort estimation accuracy prediction of machine learning techniques: A systematic performance evaluation

A Preliminary Performance Evaluation of Machine Learning Algorithms for Software Effort Estimation

2017

Accurate software effort estimation is vital to software project management. It is the process of predicting the effort, in terms of cost and time, required to develop a software product. Traditionally, researchers have used off-the-shelf empirical models such as COCOMO, or developed methods based on statistical approaches such as regression and analogy, but these methods exhibit a number of shortfalls. Predicting effort at early stages is particularly difficult because very little information is available. To improve estimation accuracy, an alternative is to use machine learning (ML) techniques, and many researchers have proposed a plethora of ML-based models. This paper aims to systematically analyze various machine learning models, considering traits such as the type of machine learning method used, the estimation accuracy gained with that method, the dataset used, and its comparison with empirical models. Although researchers have started exploring Mach...

Comparative Analysis on Prediction of Software Effort Estimation Using Machine Learning Techniques

Effort Estimation (EE) is a technique for finding the total effort required to predict the accuracy of a model. It is a significant task in software application development practice. To obtain accurate estimates, numerous predictive models have been developed in recent times. The estimate prepared during the early stage of model development is inaccurate because requirements at that time are not very clear, but as the model progresses, the accuracy of the estimate increases. Accurate estimation is therefore essential for every software application development project. Here, the Linear Regression (LR), Multilayer Perceptron (MLP), and Random Forest (RF) algorithms are implemented using the WEKA toolkit, and the results show that Linear Regression achieves better estimation accuracy than Multilayer Perceptron and Random Forest.

Predicting Software Effort Estimation Using Machine Learning Techniques

2018

In software engineering, estimation plays a vital role in software development, affecting its cost and required effort and consequently influencing the overall success of software development. The error margin in expert-based, analogy-based, and algorithm-based methods, including COCOMO, Function Point Analysis, and Use-Case-Points, is quite significant, which exposes software projects to the danger of delays and budget overruns. To obtain better estimates, we propose an alternative method based on data mining of historical data. This paper suggests performing this prediction using three machine learning techniques applied to a preprocessed COCOMO NASA benchmark dataset covering 93 projects: Naïve Bayes, Logistic Regression, and Random Forests. The generated models were tested using five-fold cross-validation and were evaluated using Classification Accuracy, Precision, Recall, and AUC. The estimation results were then compared to the COCOMO estimates. All the applied techniques achieved better results than the COCOMO model they were compared against. However, the best performance was obtained using Naïve Bayes and Random Forests. Although Naïve Bayes outperformed the other two techniques in its ROC curve and Recall score, Random Forests has a better Confusion Matrix and scored better in both the Classification Accuracy and Precision measures. The results of this work confirm the validity of data mining in general, and of the applied techniques in particular, for software estimation.
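The evaluation protocol described in this abstract (five-fold cross-validation scored with accuracy, precision, and recall) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function names `k_fold_indices` and `binary_metrics` are hypothetical, the folds are contiguous and unshuffled, and the labels are assumed to be binary (the abstract does not say how effort was discretized into classes).

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold.
    No shuffling is done here; that is a simplifying assumption.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for labels in {0, 1} (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# With 93 projects and 5 folds, each test fold holds 18 or 19 projects.
folds = k_fold_indices(93, 5)
fold_sizes = [len(test) for _, test in folds]
```

In practice, each fold's training indices would be used to fit a classifier and its test indices to score it, with the per-fold metrics averaged at the end.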

Software Effort Estimation using Machine Learning Technique

International Journal of Advanced Computer Science and Applications, 2023

Software engineering effort estimation plays a significant role in managing project cost, quality, and time when creating software. Researchers have paid close attention to software estimation during the past few decades, and a great amount of work has been done using a variety of machine learning techniques and algorithms. To evaluate predictions more effectively, this study recommends various machine learning algorithms for estimation, including k-nearest neighbor regression, support vector regression, and decision trees. These methods are now used by the software development industry for software estimation, with the goal of overcoming the limitations of parametric and conventional estimation techniques and advancing projects. Our dataset, which was created by a software company called Edusoft Consulted LTD, was used to assess the effectiveness of the established method. Three commonly used performance evaluation measures, mean absolute error (MAE), mean squared error (MSE), and R-squared, form the basis of this assessment. Comparative experimental results demonstrate that decision trees perform better at predicting effort than the other techniques.
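For readers unfamiliar with the three measures named in this abstract, a minimal sketch using their standard textbook definitions follows. The function names and the sample effort values are hypothetical and are not taken from the study's dataset.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error: average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the actual values are not all identical (SS_tot > 0).
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual_effort = [10.0, 20.0, 30.0]
predicted_effort = [12.0, 18.0, 33.0]
```

Lower MAE and MSE indicate better predictions, while R-squared is better the closer it is to 1.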

Systematic literature review of machine learning based software development effort estimation models

Context: Software development effort estimation (SDEE) is the process of predicting the effort required to develop a software system. In order to improve estimation accuracy, many researchers have proposed machine learning (ML) based SDEE models (ML models) since the 1990s. However, there has been no attempt to analyze the empirical evidence on ML models in a systematic way. Objective: This research aims to systematically analyze ML models from four aspects: type of ML technique, estimation accuracy, model comparison, and estimation context. Method: We performed a systematic literature review of empirical studies on ML models published in the last two decades (1991–2010). Results: We have identified 84 primary studies relevant to the objective of this research. After investigating these studies, we found that eight types of ML techniques have been employed in SDEE models. Overall, the estimation accuracy of these ML models is close to the acceptable level and is better than that of non-ML models. Furthermore, different ML models have different strengths and weaknesses and thus favor different estimation contexts. Conclusion: ML models are promising in the field of SDEE. However, the application of ML models in industry is still limited, so more effort and incentives are needed to facilitate their application. To this end, based on the findings of this review, we provide recommendations for researchers as well as guidelines for practitioners.

Adoption of Machine Learning Techniques in Software Effort Estimation: An Overview

IOP Conference Series: Materials Science and Engineering, 2019

Effort estimation is currently in significant demand. More data needs to be collected, and stakeholders require effective and efficient software for processing it, which makes hardware and software development costs increase steeply. This is especially true in large industry: as software projects become bigger and more complex, the complexity of estimation increases continuously. Effort estimation is part of the software engineering economic study of how to manage limited resources so that a project can meet its target goal within a specified schedule, budget, and scope. It is necessary to develop or adopt a useful software development process in executing a software development project by acting as a key constraint to the project. The accuracy of estimation is the main evaluation criterion in every study. Recently, machine learning techniques have become widely used in many effort estimation problems bu...

Effort Estimation Methods in Software Development Using Machine Learning Algorithms

2016

Estimating the effort for proposed software is one of the most essential activities in project management. Proper effort estimation is desirable in order to avoid project failures and is a practice adopted by developers at the very beginning of the software development life cycle. Estimating effort and schedule with higher accuracy is a challenge that attracts the attention of researchers as well as practitioners. Predicting the effort required to develop software to a certain level of accuracy is a difficult assignment for a manager or system analyst when the requirements are not clearly identified. Effort estimation helps project managers determine the time and effort required for the successful completion of the project. To help an organization develop quality products within a planned time frame, appropriate software effort estimation is a primary requirement. For measuring th...

Improving Estimation Accuracy Prediction of Software Development Effort: A Proposed Ensemble Model

2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE), 2020

Software effort estimation is an essential feature of software engineering for effectively planning, controlling, and delivering successful software projects. Both overestimation and underestimation are key challenges for future software development. Failure to acknowledge effort estimation accuracy may lead to customer disappointment and inaccurate estimates, and hence contribute to either a poor software development process or project failure. The main aim of this research is to optimize the estimation accuracy prediction of software development effort to support software development firms and practitioners. In this paper, we propose an ensemble software effort estimation model based on Use Case Points (UCP), expert judgment, and Case-Based Reasoning (CBR) techniques. This research is conducted through a primary study (a multi-case study involving software companies) to build the ensemble model. The estimation accuracy prediction of the proposed model will be evaluated by selecting projects from the primary studies as cases and applying a quantitative approach using industrial experts, archival data about estimates, and evaluation metrics. The proposed model produced at the end of this research will be used by software development firms and practitioners as an instrument to estimate the effort required to develop new software projects at an earlier stage.

Software Effort Estimation Using Machine Learning Methods

… international symposium on …, 2007

In software engineering, the main aim is to develop projects that produce the desired results within limited schedule and budget. The most important factor affecting the budget of a project is the effort. Therefore, estimating effort is crucial because hiring people more than ...

Software Effort Prediction Using Ensemble Learning Methods

Journal of Software Engineering and Applications, 2020

Software Cost Estimation (SCE) is an essential requirement in producing software these days. Genuinely accurate estimation requires considering the cost and effort factors in delivering software, using algorithmic methods or Ensemble Learning Methods (ELMs). Effort is estimated in terms of individual months and length. Overestimation as well as underestimation of effort can adversely affect software development. Hence, it is the responsibility of software development managers to estimate the cost using the best possible techniques. The predominant cost for any product is the expense of figuring effort. Consequently, effort estimation is pivotal, and there is a constant need to improve its accuracy. Fortunately, several effort estimation models are available; however, it is difficult to determine which model is more accurate on which dataset. Hence, we use ensemble learning (bagging) with the base learners Linear Regression, SMOReg, MLP, Random Forest, REPTree, and M5Rule. We also applied feature selection to examine the effect of the BestFit and Genetic Algorithm feature selection algorithms. The dataset, known as China, comprises 499 projects. The results show that the Mean Magnitude of Relative Error (MMRE) of Bagging M5Rule with Genetic Algorithm feature selection is 10%, which makes it better than the other algorithms.
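To make the two central ideas of this abstract concrete, the sketch below shows bagging (fit a base learner on bootstrap resamples and average the predictions) and the usual definition of MMRE. It is a simplified illustration under stated assumptions, not the paper's WEKA setup: the base learner is a one-variable least-squares line rather than M5Rule, and the names `fit_line`, `bagging_fit`, and `mmre`, as well as the sample data, are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence

Model = Callable[[float], float]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Least-squares line through (xs, ys); stands in for a real base learner."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x equal): fall back to predicting the mean.
        return lambda x: y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def bagging_fit(xs: Sequence[float], ys: Sequence[float],
                n_models: int = 25, seed: int = 0) -> Model:
    """Train n_models base learners on bootstrap resamples; average their outputs."""
    rng = random.Random(seed)
    n = len(xs)
    models: List[Model] = []
    for _ in range(n_models):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Magnitude of Relative Error: mean of |actual - predicted| / actual.

    Assumes every actual value is strictly positive.
    """
    return mean(abs(a - p) / a for a, p in zip(actual, predicted))


# Hypothetical data: project size vs. effort following effort = 2 * size + 1.
sizes = [float(s) for s in range(1, 11)]
efforts = [2.0 * s + 1.0 for s in sizes]
ensemble = bagging_fit(sizes, efforts)
```

An MMRE of 0.10 corresponds to the 10% figure quoted in the abstract, meaning predictions deviate from actual effort by 10% on average.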