The Shape of Code


Impact of developer uncertainty on estimating probabilities

August 3, 2025 Derek Jones

For over 50 years, it has been known that people tend to overestimate the likelihood of uncommon events/items occurring, and underestimate the likelihood of common events/items. This behavior has been replicated in many experiments and is sometimes listed as a so-called cognitive bias.

Cognitive bias has become the term used to describe the situation where the human response to a problem (in an experiment) fails to match the response produced by the mathematical model that researchers believe produces the best output for this kind of problem. The possibility that the mathematical models do not reflect the reality of the contexts in which people have to solve the problems (outside of psychology experiments) goes against the grain of the idealised world in which many researchers work.

When models take into account the messiness of the real world, the responses are a closer match to the patterns seen in human responses, without requiring any biases.

The 2014 paper "Surprisingly Rational: Probability theory plus noise explains biases in judgment" by F. Costello and P. Watts (shorter paper) showed that including noise in a probability estimation model produces behavior that follows the human behavior patterns seen in practice.

If a developer is asked to estimate the probability that a particular event, $E$, occurs, they may not have all the information needed to make an accurate estimate. They may fail to take into account some $E$s, and incorrectly include other kinds of events as being $E$s. This noise, $N$, introduces a pattern into the developer estimate:

$D_E = (1-N) P_E + N (1-P_E) = (1-2N) P_E + N$

where: $D_E$ is the developer's estimated probability of event $E$ occurring, $P_E$ is the actual probability of the event, and $N$ is the probability that noise produces an incorrect classification of an event as $E$ or not $E$ (for simplicity, the impact of noise is assumed to be the same for both cases).

The plot below shows actual event probability against developer estimated probability for various values of $N$, with a red line showing that at $N=0$, the developer estimate matches reality (code):

Developer estimated event probability against actual probability for various noise probabilities.

The effect of noise is to increase probability estimates for events whose actual probability is less than 0.5, and to decrease the estimate when the actual probability is greater than 0.5. All estimates move towards 0.5.
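The pull towards 0.5 can be checked numerically. The following is a minimal sketch, not the code behind the plot: the function names `developer_estimate` and `simulated_estimate` are illustrative, and the simulation assumes that each event is independently misclassified with the same probability $N$.

```python
import random

def developer_estimate(p_actual, noise):
    """Closed form from the text: D_E = (1 - 2N) * P_E + N."""
    return (1 - 2 * noise) * p_actual + noise

def simulated_estimate(p_actual, noise, n_events=100_000, seed=42):
    """Fraction of events classified as E, when each classification
    is flipped (E <-> not E) with probability `noise`."""
    rng = random.Random(seed)
    classified_as_e = 0
    for _ in range(n_events):
        is_e = rng.random() < p_actual   # what actually happened
        if rng.random() < noise:         # noise flips the classification
            is_e = not is_e
        classified_as_e += is_e
    return classified_as_e / n_events

# Closed-form estimates for a rare, an even, and a common event.
for p in (0.1, 0.5, 0.9):
    for n in (0.0, 0.1, 0.3):
        print(f"P_E={p:.1f}  N={n:.1f}  D_E={developer_estimate(p, n):.2f}")

# One simulated case, for comparison with the closed form.
print(f"simulated D_E for P_E=0.1, N=0.2: {simulated_estimate(0.1, 0.2):.3f}")
```

With $N=0$ the estimate equals the actual probability; as $N$ grows, the estimate for the rare event rises and the estimate for the common event falls, while an actual probability of 0.5 is left unchanged.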

What other estimation behaviors does this noise model predict?

If there are two events, say $A$ and $B$, then the noise model (and probability theory) specifies that the following relationship holds:

$P(A) + P(B) = P(A \land B) + P(A \lor B)$

where: $P(\ldots)$ denotes the probability of its argument.

The experimental results show that this relationship does hold, i.e., the noise model is consistent with the experimental results.
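Why the relationship survives the noise can be seen with a short derivation. This sketch assumes that the same noise probability $N$ applies to all four estimates, and writes $D(X) = (1-2N) P(X) + N$ for the developer's estimate of any event $X$ (the notation $D(X)$ is introduced here for illustration):

```latex
\begin{aligned}
D(A) + D(B) - D(A \land B) - D(A \lor B)
  &= (1-2N)\,\bigl[P(A) + P(B) - P(A \land B) - P(A \lor B)\bigr] + 2N - 2N \\
  &= (1-2N) \cdot 0 \\
  &= 0
\end{aligned}
```

Each estimate is individually shifted towards 0.5, but the two $+N$ terms on each side cancel, so on average the combination of estimates still satisfies the identity.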

This noise model can be used to explain the conjunction fallacy, i.e., Tversky & Kahneman's famous 1980s "Linda is a bank teller" experiment.
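One reading of how noise could produce this fallacy is that the estimates for $A$ and for $A \land B$ are made by separate noisy passes over the same remembered events, so random variation sometimes puts the conjunction ahead. The sketch below illustrates that reading only. The memory contents, the function names, and the assumption of independent noisy reads are all hypothetical choices made for illustration, not details taken from the paper.

```python
import random

# Hypothetical memory of 100 past events; flags are (is_A, is_B).
# 10 events are A, and 8 of those are also B, so P(A & B) < P(A).
MEMORY = [(True, True)] * 8 + [(True, False)] * 2 + [(False, False)] * 90

def noisy_count(flags, noise, rng):
    """Count True flags, misreading each flag with probability `noise`."""
    count = 0
    for flag in flags:
        if rng.random() < noise:
            flag = not flag
        count += flag
    return count

def fallacy_rate(noise, trials=2000, seed=1):
    """Fraction of trials in which the estimate for A & B exceeds that for A."""
    rng = random.Random(seed)
    is_a = [a for a, _ in MEMORY]
    is_a_and_b = [a and b for a, b in MEMORY]
    fallacies = 0
    for _ in range(trials):
        est_a = noisy_count(is_a, noise, rng)              # one noisy pass for A
        est_a_and_b = noisy_count(is_a_and_b, noise, rng)  # a separate pass for A & B
        if est_a_and_b > est_a:
            fallacies += 1
    return fallacies / trials

for noise in (0.0, 0.1, 0.2):
    print(f"N={noise:.1f}  fallacy rate={fallacy_rate(noise):.2f}")
```

Without noise the conjunction is never rated above its constituent. With noise, a sizeable fraction of trials rate $A \land B$ as more likely than $A$, even though the average estimates still respect probability theory.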

What predictions does the noise model make about the estimated probability of experiencing $x$ ($1<x$) occurrences of the event $E$ in a sequence of assorted events (the previous analysis deals with the case $x=1$)?

An estimation model that takes account of noise gives the equation:

$D_S = \frac{P_S - N}{1 - 2N}$

where: $D_S$ is the developer's estimated probability of experiencing $x$ $E$s in a sequence of a given length, and $P_S$ is the actual probability of there being $x$ $E$s.

The plot below shows actual event probability against developer estimated probability for various values of $N$, with a red line showing that at $N=0$, the developer estimate matches reality (code):

Developer estimated probability of x events in a sequence against actual probability for various noise probabilities.

This predicted behavior, which is the opposite of the case where $x=1$, follows the same pattern seen in experiments, i.e., actual probabilities less than 0.5 are decreased (towards zero), while actual probabilities greater than 0.5 are increased (towards one).
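The movement towards the extremes can be tabulated directly from the equation for $D_S$, which is algebraically the inverse of the single-event mapping $(1-2N) P + N$. This is a minimal sketch: the function name `sequence_estimate` is illustrative, and clipping the result to the range 0 to 1 is an assumption added here, because the raw formula goes negative when $P_S < N$ and exceeds one when $P_S > 1-N$.

```python
def sequence_estimate(p_actual, noise):
    """D_S = (P_S - N) / (1 - 2N), clipped to [0, 1].

    The clipping is an assumption of this sketch, not part of the equation.
    """
    if not 0 <= noise < 0.5:
        raise ValueError("noise must be in [0, 0.5)")
    raw = (p_actual - noise) / (1 - 2 * noise)
    return min(1.0, max(0.0, raw))

for p in (0.05, 0.3, 0.5, 0.7, 0.95):
    row = "  ".join(f"N={n:.1f}: {sequence_estimate(p, n):.2f}"
                    for n in (0.0, 0.1, 0.2))
    print(f"P_S={p:.2f}  {row}")
```

For example, with $N=0.1$ an actual probability of 0.3 becomes an estimate of 0.25, and 0.7 becomes 0.75, while 0.5 is again left unchanged.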

There have been replications and further analysis of the predictions made by this model, along with alternative models that incorporate noise.

To summarise:

When estimating the probability of a single event/item occurring, noise/uncertainty will cause the estimated probability to be closer to 50/50 than the actual probability.

When estimating the probability of multiple events/items occurring, noise/uncertainty will cause the estimated probability to move towards the extremes, i.e., zero and one.

Categories: Uncategorized Tags: experiment, fallacy, human behavior, probability, signal/noise, uncertainty
