Effort estimation’s inaccurate past and the way forward
Almost since people started building software systems, effort estimation has been a hot topic for researchers.
Effort estimation models are necessarily driven by the available data (the Putnam model is one of the few whose theory is based on more than arm waving). General information about source code can often be obtained (e.g., size in lines of code), and before packaged software and open source, software with roughly the same functionality was being implemented in lots of organizations.
Estimation models based on source code characteristics proliferated, e.g., COCOMO. What these models overlooked was human variability in implementing the same functionality (a standard deviation that is 25% of the actual size is going to introduce a lot of uncertainty into any effort estimate). They also took for granted the more obvious assumption that effort is closely tied to source code characteristics.
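To get a feel for what a 25% standard deviation means, here is a minimal sketch. It assumes, purely for illustration, that the size of independent implementations of the same functionality is normally distributed around some expected size; the function name and the 1,000-line figure are made up for the example.

```python
from statistics import NormalDist


def size_interval(expected_size, rel_sd=0.25, coverage=0.95):
    """Range of implementation sizes covering `coverage` of outcomes.

    Assumes (for illustration only) that size is normally distributed,
    with a standard deviation that is `rel_sd` times the expected size.
    """
    dist = NormalDist(mu=expected_size, sigma=rel_sd * expected_size)
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Hypothetical system with an expected size of 1,000 lines.
low, high = size_interval(1000)
print(f"95% of implementations fall between {low:.0f} and {high:.0f} lines")
```

Under these assumptions the interval runs from roughly half to one and a half times the expected size, and any effort estimate derived from size inherits at least that much spread.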
The advent of high-tech, clueless button-pushing machine learning created a resurgence of new effort estimation models; actually they are estimation adjustment models, because they require an initial estimate as one of the input variables. Creating a machine-learned model requires a list of estimated/actual values, along with any other available information, to build a mapping function.
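What an "estimation adjustment model" amounts to can be sketched with the simplest possible mapping function: a least-squares fit of log(actual) against log(estimate). This is a stand-in chosen for brevity, not any particular published model, and the history values below are invented for illustration.

```python
import math


def fit_adjustment(estimates, actuals):
    """Least-squares fit of log(actual) = intercept + slope * log(estimate)."""
    xs = [math.log(e) for e in estimates]
    ys = [math.log(a) for a in actuals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def adjust(estimate, intercept, slope):
    """Map an initial estimate to an adjusted estimate using the fitted model."""
    return math.exp(intercept + slope * math.log(estimate))


# Invented estimate/actual pairs (hours); not taken from any real dataset.
history_estimates = [2, 4, 8, 16]
history_actuals = [3, 5, 12, 20]

intercept, slope = fit_adjustment(history_estimates, history_actuals)
print(f"adjusted estimate for a 10 hour task: {adjust(10, intercept, slope):.1f}")
```

Note that the model cannot produce anything without the initial estimate as input; it only corrects whatever a person has already supplied.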
The sparseness of the data to learn from (at most a few hundred observations of half-a-dozen measured variables, and usually fewer) has not prevented a stream of puffed-up publications making all kinds of unfounded claims.
Until a few years ago the available public estimation data did not include any information about who made the estimate. Once estimation data contained the information needed to distinguish the different people making estimates, the uncertainty introduced by human variability was revealed (some consistently underestimating, others consistently overestimating, with a 25% difference between two estimators being common, and a factor of two difference between some pairs of estimators).
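Once each estimate carries an estimator ID, checking for consistent under- or overestimation is straightforward. The sketch below uses the median actual/estimate ratio per estimator as the bias measure; that choice, the record layout, and the data are assumptions made for illustration, not the analysis applied to any real dataset.

```python
from collections import defaultdict
from statistics import median


def estimator_bias(records):
    """Median actual/estimate ratio per estimator.

    `records` is an iterable of (estimator_id, estimate, actual) tuples.
    A ratio above 1 means the estimator tends to underestimate,
    below 1 that they tend to overestimate.
    """
    ratios = defaultdict(list)
    for estimator_id, estimate, actual in records:
        ratios[estimator_id].append(actual / estimate)
    return {who: median(values) for who, values in ratios.items()}


# Invented records (hours) for two hypothetical estimators.
records = [
    ("A", 2, 3), ("A", 4, 6), ("A", 1, 1.5),
    ("B", 2, 1.5), ("B", 4, 3), ("B", 8, 6),
]

bias = estimator_bias(records)
for who, ratio in sorted(bias.items()):
    print(f"estimator {who}: median actual/estimate = {ratio:.2f}")
```

In this made-up example the two estimators differ by a factor of two for the same amount of actual work, which a model fitted to the pooled data would see only as noise.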
How much accuracy is it realistic to expect with effort estimates?
At the moment we don’t have enough information on the software development process to be able to create a realistic model; without a realistic model of the development process, it’s a waste of time complaining about the availability of information to feed into a model.
I think a project simulation model is the only technique capable of creating a good enough model for use in industry; something like Abdel-Hamid’s tour de force PhD thesis (he also ignores my emails).
We are still in the early stages of finding out the components that need to be fitted together to build a model of software development, e.g., round numbers.
Even if all attempts to build such a model fail, there will be payback from a better understanding of the development process.
> factor of two difference between some pairs of estimators
I’m willing to bet there is that sort of variability between estimations by a single estimator in different contexts.
There are huge pressures and very perverse incentives at play within the industry. Thus looking at an estimation without looking at the context in which it was made is meaningless.
One case I personally saw years ago…
A project manager, wanting to be seen as having a “Can Do” attitude, formally wrote up and disciplined a technical lead for failing to estimate a project at less than some arbitrary deadline.
Unsurprisingly, the technical lead left at the earliest opportunity, and the project manager claimed success for the project on his CV while it was still in progress, then left for a better-paying job.
@John Carter
I find it amazing that people continue to be surprised that estimates given in response to competitive quotes turn out to be less than the actual. In a competitive market this behavior is to be expected.
Yes, context is all-important (another reason why the NASA data I linked to is fairly worthless, well, apart from generating academic papers).
The SiP data (the only estimation data I know of where an anonymous ID labels those making the estimate) is for short (a few hours) agile tasks within the same company, over a period of 10 years.
Over the past three decades, I have become convinced that internal politics plays as much of a rôle as external competition. (I always asked the engineers who would do the work to make the estimate. After a few rounds, and reading Humphrey’s PSP tome, they became reasonably accurate, allowing me to stand my ground against upper management.)