Risk Analysis in Software Cost Estimation: A Simulation-Based Approach

Risk analysis and cost estimation are two important aspects of project planning that can make or break a project. Both tasks are difficult and painstaking, yet the project’s success depends heavily on them. As documented by Frederick P. Brooks Jr. in his legendary book “The Mythical Man-Month,” planning, scheduling, and estimation have been central to software engineering since its early days in the 1970s. This paper presents a simulation-based approach to estimating the cost schedule of a software development project. The results show that simulation is expedient as well as efficient in terms of time, effort, and cost, and that it provides pragmatic results.


Introduction
For almost every project, the budget is of utmost importance, and completing the project successfully requires a correct estimate of it. But a correct estimate is hard to come by before the project actually begins, because of inherent uncertainties. Estimation at an early stage of a project is as difficult as driving a car blindfolded while taking directions from someone else (Wood 2002). Fred Brooks' observations in (Brooks 1975) stemmed from almost a decade of experience as the project lead of IBM's OS/360. He tried to rescue the behind-schedule project by adding manpower. Later, he conceded that the decision had proved counter-productive and had delayed the project further. These experiences with OS/360 led him to write the book "The Mythical Man-Month" (Brooks 1975).
However, temporal and budgetary schedules are troubled not only by poor estimation but also by inherent cost risks. According to Sharif and Basri (2011), risk can be defined as "the likelihood of suffering loss which impacts the project in terms of poor quality, cost overrun, schedule slippage, etc." Cost risks can thus disturb the budget and result in cost overrun. It is therefore prudent to analyse the cost risks before beginning the actual project.
The purpose of risk analysis is to forecast what may happen if a certain set of actions takes place (Mansoorzadeh & Yusof 2011). Accurate risk analysis and cost estimation together can therefore increase the chances of in-budget project completion, and simulation can be of great help here. As described in Xing-xia and Jian-wen (2009), "simulation is the practice of imitating the true behaviour of a system into a model based on some realistic conditions". This paper uses Monte Carlo computation, which relies on random numbers to address the uncertainty inherent in an otherwise deterministic problem. It simulates a system by generating random numbers from some probability distribution and then analyses all the results collectively to forecast the estimates and unknown future risks (Hira 2001).
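The Monte Carlo idea described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the triangular distribution, the three cost items, and all rupee figures are assumptions chosen only to show how repeated random sampling turns uncertain inputs into a distribution of total cost.

```python
import random
import statistics


def simulate_total_cost(items, runs=1000, seed=42):
    """Return one simulated total cost per run.

    items: list of (low, mode, high) cost triples, one per cost item.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(runs):
        # random.triangular takes its arguments as (low, high, mode).
        total = sum(rng.triangular(low, high, mode) for low, mode, high in items)
        totals.append(total)
    return totals


# Hypothetical cost items in rupees: (optimistic, most likely, pessimistic).
items = [(900, 1000, 1400), (1800, 2000, 2600), (450, 500, 800)]

totals = simulate_total_cost(items)
mean_cost = statistics.mean(totals)
p90_cost = sorted(totals)[int(0.9 * len(totals))]  # rough 90th percentile
print(f"mean = {mean_cost:.0f} rupees, 90th percentile = {p90_cost:.0f} rupees")
```

Analysing all runs together, rather than any single run, is what yields the forecast: the mean suggests a central estimate, while an upper percentile indicates how much contingency the uncertainty might call for.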

Review of Literature
Effort estimation is an early but important activity in software projects. The size of the software, usually measured in KLOC, is the most popular input for estimating software effort. The authors in (Moataz, Irfan & Jarallah 2013) have proposed a framework for software effort prediction that uses UML models created early in the SDLC. Uncertainty in software size is accounted for by a probability density function. Effort estimation is an important part of the planning and management of a software project, yet, more often than not, software engineering practice does not give principles such as cost estimation due importance in the lifecycle of a software artifact. In Trendowicz (2013), the author has presented the CoBRA model for effort estimation in a software project. This method takes into account the artifact parameters and human judgment. Its scope extends well beyond effort estimation, and it helps in managing project-related risks as well. Trendowicz (2013) has recorded numerous real-world cases in which CoBRA has been put to use in varied software project planning situations.
Santos and Bela (2013) have investigated the role of factors such as cost, quality, schedule, and scope in software development projects and have established that, irrespective of the nature of an engineering project (traditional engineering or software engineering), these process management variables are critical to decision-making. In software development projects in particular, the planning and execution phases depend on the creativity of individuals with key portfolios. The authors have found that the cost, quality, schedule, and scope of critical tasks in software development projects need to be monitored and controlled effectively. They have also reported a toolkit based on data mining techniques that finds correlations between the most relevant parameters of the human resources, the project in hand, and the organisation, e.g. experience of individuals, complexity of the project, and maturity of organisational practices. This toolkit has been developed to help management plan, monitor, and control a software development project.
Uzzafer (2017) has identified a research gap: estimated costs can be represented as percentiles of probability distributions, but these estimates may not be sub-additive. Uzzafer (2017) has reported a model that takes probability distributions as its input and produces sub-additive cost estimates as its output. The model is based on expectations, in contrast with the traditional approach in which percentiles of probability distributions determine the estimated costs. It works on the premise that the estimated costs of components add up to a good estimate of the overall project cost. Conversely, the estimated cost of a project can be decomposed into sub-estimates that represent the estimated costs of the project components.
Kumar and Yadav (2015) have underlined the importance of risk management for the success of software development projects. Every SDLC phase poses potential risks, since it involves people, technology, money, time, and hardware and software resources. A whole lot of risk factors affect the software development process, but Kumar and Yadav (2015) have focused on working out how the outcome of a software development project correlates with these risk factors. They have proposed a probabilistic model based on a Bayesian Belief Network to assess the risk in the software process.
Whereas many researchers have considered software cost estimation with plausible precision to be unachievable, there are others who believe that it can be achieved using established principles of mathematics and management. Briciu, Filip and Indries (2016) have addressed the problem that rapid changes in software process, principles, and practices pose for software cost estimation, and have reported a genetic algorithm-based model for estimating the cost of software products.

Research Article
Vol. 12 No.6 (2021), 2176-2183

The distinct nature of the software process and its product, and the failure of hardware cost and risk estimation tools in the software industry, have long led researchers to work on risk management in the software industry. Software engineering as an engineering discipline aims at the identification, assessment, alleviation, and monitoring of risk, among other objectives. Further, the risk profile of large software systems is quite different from that of small-scale systems. Maruf, Ghazia & Urooj (2018) have underlined this difference and present a detailed comparison of risk management models for software projects.

Network Representation of Software Project
The network of a website development project is considered here, and simulation is applied to it for risk analysis and cost estimation. As shown in Figure 1, a PERT (Program Evaluation and Review Technique) network representation is used, in which the circles represent nodes and the numbers inside the circles are the node numbers (Sharma 2009). Each arc/edge between two nodes represents an activity, and the number written alongside it in bold is the activity number. The dark circles represent the starting and finishing nodes of the project.
Table 1.
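An activity-on-arc network of this kind maps naturally onto a list of edges. The Python sketch below is only an illustration: the five-activity network is hypothetical and is not the network of Figure 1, and the names Activity, start_and_finish_nodes, and all_paths are introduced here for the example.

```python
from collections import namedtuple

# An activity is an arc from a tail node to a head node.
Activity = namedtuple("Activity", ["number", "tail", "head"])

# Hypothetical five-activity network (NOT the network of Figure 1).
activities = [
    Activity(1, 1, 2),
    Activity(2, 1, 3),
    Activity(3, 2, 4),
    Activity(4, 3, 4),
    Activity(5, 4, 5),
]


def start_and_finish_nodes(activities):
    """Start nodes have no incoming arc; finish nodes have no outgoing arc."""
    tails = {a.tail for a in activities}
    heads = {a.head for a in activities}
    return sorted(tails - heads), sorted(heads - tails)


def all_paths(activities, start, finish):
    """Enumerate every start-to-finish path as a list of activity numbers."""
    outgoing = {}
    for act in activities:
        outgoing.setdefault(act.tail, []).append(act)
    paths = []

    def walk(node, path):
        if node == finish:
            paths.append(path)
            return
        for act in outgoing.get(node, []):
            walk(act.head, path + [act.number])

    walk(start, [])
    return paths


starts, finishes = start_and_finish_nodes(activities)
paths = all_paths(activities, starts[0], finishes[0])
print(starts, finishes, paths)
```

Enumerating all paths is affordable only for small networks; for larger ones a single forward pass over the nodes is the usual way to find the critical path.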

Simulator Design
A simulator, SIM_RACE (Simulator for Risk Analysis and Cost Estimation), has been designed that simulates the cost of the various coordinating activities in the website development project to address the uncertainty present in it. The simulator has been written in C on an Intel Core i5 processor under the Windows operating system. The algorithm of the simulator is given below:
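The sketch below is not the SIM_RACE algorithm, whose listing is not available in this text, and it is written in Python rather than C. It is a minimal illustration of how a simulator of this kind could work, under several assumptions: activity costs follow a triangular distribution, PCC denotes the project completion cost, the PCC of a run is the largest accumulated cost along any start-to-finish path (as the pairing of PCC with a critical path in the results suggests), and every arc runs from a lower to a higher node number. The network and its rupee figures are hypothetical.

```python
import random

# Hypothetical network: activity number -> (tail node, head node,
# (low, mode, high) cost in rupees). Not the network used in the paper.
ACTIVITIES = {
    1: (1, 2, (900, 1000, 1400)),
    2: (1, 3, (1800, 2000, 2600)),
    3: (2, 4, (1500, 1700, 2500)),
    4: (3, 4, (450, 500, 800)),
    5: (4, 5, (700, 800, 1000)),
}


def forward_pass(costs):
    """Return (PCC, critical path) for one set of sampled activity costs.

    Assumes arcs run from lower to higher node numbers, so visiting the
    activities in order of tail node is a valid topological order.
    """
    first = min(tail for tail, _, _ in ACTIVITIES.values())
    last = max(head for _, head, _ in ACTIVITIES.values())
    best = {first: 0.0}  # largest accumulated cost reaching each node
    via = {}             # node -> activity that produced that cost
    for number in sorted(ACTIVITIES, key=lambda n: ACTIVITIES[n][0]):
        tail, head, _ = ACTIVITIES[number]
        candidate = best[tail] + costs[number]
        if candidate > best.get(head, float("-inf")):
            best[head] = candidate
            via[head] = number
    # Walk back from the finish node to recover the critical path.
    path, node = [], last
    while node != first:
        number = via[node]
        path.append(number)
        node = ACTIVITIES[number][0]
    return best[last], path[::-1]


def simulate(runs=1000, seed=1):
    """Run the Monte Carlo loop; return a list of (PCC, critical path)."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        # random.triangular takes its arguments as (low, high, mode).
        costs = {n: rng.triangular(low, high, mode)
                 for n, (_, _, (low, mode, high)) in ACTIVITIES.items()}
        results.append(forward_pass(costs))
    return results


results = simulate()
print("first run:", results[0])
```

Each run yields one PCC value and one critical path, which is the raw material for a frequency distribution of PCC and for per-activity risk indices.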

Results and Discussion
The output obtained after running the simulator for 1000 runs is presented here using tables and graphs. Table 2 shows the results obtained after the first simulation run, which spans over the forward [...]. In each simulation run, the PCC and the critical path are determined, which give an estimate of the total cost to complete the project and the set of critical activities that may cause a cost overrun. Table 3 presents the values of these parameters in different simulation runs.
Table 3. PCC and critical paths in different runs.
Using the PCC values from the 1000 simulation runs, the frequency of PCC values is calculated, as shown in Graph 1. The graph shows that the interval of 21000.0-22000.0 rupees occurred the maximum number of times, 319 times (31.9% of the runs), which means that the budget is most likely to fall within this interval. Using this information, decisions regarding the project budget can be made appropriately.
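A frequency count of this kind can be sketched in a few lines of Python. The six PCC values below are hypothetical and serve only to show the binning into 1000-rupee intervals; with the paper's data, the modal interval would be 21000.0-22000.0 rupees with a share of 319/1000 = 0.319.

```python
from collections import Counter


def pcc_frequency(pcc_values, width=1000.0):
    """Count how many PCC values fall in each [k*width, (k+1)*width) interval."""
    counts = Counter(int(value // width) for value in pcc_values)
    return {(k * width, (k + 1) * width): counts[k] for k in sorted(counts)}


def modal_interval(frequency):
    """Return the interval that occurs most often."""
    return max(frequency, key=frequency.get)


# Hypothetical PCC values in rupees (not the paper's results).
sample = [20500.0, 21250.0, 21900.0, 21000.0, 22400.0, 23100.0]

freq = pcc_frequency(sample)
interval = modal_interval(freq)
share = freq[interval] / len(sample)
print(interval, freq[interval], share)
```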

Next, Graph 2 shows the risk indices of the coordinating activities of the project. The risk index of an activity is calculated by dividing the number of times the activity becomes critical by the total number of simulation runs (Deo 2003).

Graph 2. Risk indices of activities.
Activities with risk indices in the range 0.8-1.0 are more critical than the others. According to the graph, therefore, activities 1, 2, 3, 4, 7, 10, 17, 26, 28, and 31 are the most critical activities and need to be handled with more care.
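The risk index and the 0.8 threshold can be expressed as a short Python sketch. The five critical paths below are hypothetical stand-ins for the per-run output of a simulator, and the function names are introduced here for illustration only.

```python
from collections import Counter


def risk_indices(critical_paths, activity_numbers):
    """Risk index = (runs in which the activity was critical) / (total runs)."""
    runs = len(critical_paths)
    counts = Counter()
    for path in critical_paths:
        counts.update(set(path))  # count each activity at most once per run
    return {a: counts[a] / runs for a in activity_numbers}


def most_critical(indices, threshold=0.8):
    """Activities whose risk index lies in the range threshold..1.0."""
    return sorted(a for a, index in indices.items() if index >= threshold)


# Hypothetical critical paths from five simulation runs.
critical_paths = [[1, 3, 5], [1, 3, 5], [2, 4, 5], [1, 3, 5], [1, 3, 5]]

indices = risk_indices(critical_paths, range(1, 6))
critical = most_critical(indices)
print(indices, critical)
```

An activity that lies on every critical path has a risk index of 1.0, while one that is never critical has an index of 0.0.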
As already stated, correct estimation is essential for a project, and the designed simulator can help attain it quite accurately, provided its results are analysed carefully. Analysis of the results shows that the values of run number 791 were repeated the maximum number of times, and its results can therefore be used as the cost estimates. The results of run number 791 are shown in Table 4.

Conclusion
Risk analysis and cost estimation, if done correctly, ensure the success of a project. By using the described approach, a project manager can answer questions such as "how much will it cost", "what should be the estimates", "what are the critical activities", and "what if" before the project even begins. The designed simulator has provided quality results through Monte Carlo simulation using less time, effort, and cost. The results obtained are neither pessimistic nor optimistic but realistic, and that increases the chances of in-budget project completion. And that is, after all, the ultimate goal of any project!