The 520’th post
This is the 520’th post on this blog, which will be 10 years old tomorrow. Regular readers may have noticed an increase in the rate of posting over the last few months; at the start of this month I needed to write 10 posts to hit my one-post-a-week target (which has depleted the list of things I keep meaning to write about).
What has happened in the last 10 years?
- I no longer visit libraries, which are becoming coffee shops+wifi hot-spots where people who have librarian in their job title hot desk; books, they are around here somewhere. I used to regularly visit libraries, particularly while working on my C book. So far, no library visits have been needed for the writing of my evidence-based software engineering book,
- many old manuals, reports, books and magazines became freely available for download, via sites like the Internet Archive, Bitsavers and the Defense Technical Information Center; for second-hand books there is AbeBooks. Sites like ResearchGate, Semantic Scholar and Google Scholar are fantastic sources for more recent work; for new books there is Amazon,
- Github became the place to make source code+stuff available,
- researchers in software engineering started to become interested in evidence-based research. In the UK the CREST Open Workshops were a fantastic series of events; I went to about a third of them, and there were often a couple of gold nuggets per event (a change of funding means running future events will require a lot more work),
- smartphones became the last, next, major software consumer ecosystem (capturing a large percentage of the world’s population means there is no room left for something bigger), and the cloud started on its path to being 99% of the commercial software ecosystem,
- Python joined the short-list to become the world’s primary programming language (assuming that people still run programs outside of the browser). The decline of Perl became very obvious, and work on adding new features to Cobol stopped (work on adding features to Fortran is still going strong),
- known faults are now being automatically fixed by modifying the source code (using genetic programming). This has yet to move out of research, but we all know where it’s going,
- whole program optimization of systems containing millions of lines of code became a viable option for commercial developers (a topic of late night discussion for compiler writers in the 1980s, and perhaps earlier decades, when having more than 64K of memory was treated as nirvana),
- after 20 years of being the only major open source compiler tool-chain, gcc got some serious competition. I originally predicted that llvm would disappear, failing to recognize that Apple were supporting it for licensing reasons,
- the death throes of Moore’s law went from subtle to “isn’t it dead yet?”
I probably missed several major events hiding in plain sight, either because I am too close to them or blinkered.
What did not happen in the last 10 years?
- No major new languages. These require major new hardware ecosystems; in the smartphone market Android used Java and iOS made use of existing languages. There was the usual selection of fashion/vanity driven wannabes, e.g., Julia, Rust, and Go. The R language started to get noticed, but it has been around since 1995, and Python looks set to eventually kill it off,
- no accident killing 100+ people has been attributed to faults in software. Until this happens, software engineering has a dead bodies problem,
- the creation of new software did not slow down from its break-neck speed,
- in the first few years of this blog I used to make yearly predictions, which did not happen (most of the time).
Now I can relax for 9.5 years, before scurrying to complete 1,040 posts, i.e., the rate of posting will now resume its previous, more sedate, pace.
Where are the dead bodies?
The possibility of faults in software causing death or serious injury is often talked about, and in some cases large amounts of money are invested in work to reduce the possibility of these events occurring (or at least in doing things that will support the view that a company took reasonable precautions, should a case end up in court). The Therac-25 accidents are an often quoted example of a software fault that directly resulted in deaths. These accidents occurred over a 19-month period in the mid-1980s and are believed to have resulted in the death of six people. I don’t wish to disrespect the memory of the people who died, but six people 20 years ago; is that it? That is fewer than the number of people killed every day (around 10) in traffic accidents in the UK.
If faults in software really do have a non-trivial impact on human safety then we would expect this fact to be reflected in accident statistics. After searching the accident statistics for the UK I cannot find any accidents whose cause is directly attributed to software. If there are people who have died as a direct result of faults in software, the death rate has not yet reached the minimum level needed to be recorded as such (or are these deaths ‘hidden’ away in ones and twos within other causes?).
The US National Transportation Safety Board carries out a thorough investigation of all US aviation accidents. Searching the Aviation Accident Database on the query “software” between the dates 1 Jan 2000 and 9 Aug 2005 returns 44 matches. Reading these 44 reports I did not find any accident attributed to a software-related issue.
If faults in software are not killing or seriously injuring many people why is so much effort invested in reducing the probability of these events occurring? The following are some of the possibilities:
- The investment actually made is small, but it is talked up.
- The investment is made for economic reasons (e.g., more reliable products are likely to reduce support costs) and increased ‘safety’ is a side effect.
- In situations where there is a likelihood of death or serious injury the procedures and reliability of non-software items are sufficient to short-circuit the effects of any life threatening faults that may exist in the software used (at least until the fault can be corrected).
As any developer knows, replicating faulty behavior in software can be very difficult, if not impossible. It may be that software faults are not given as the root cause of death or serious injury because the necessary proof is not available. Or perhaps software faults have yet to be the root cause of such events on any non-trivial scale.
Existing practice affects what people are willing to put up with. Many users of Microsoft Windows now accept that it is necessary to reboot the computer they are using on a daily, or even hourly, basis. Users of cars accept that the tool they are using can result in serious injuries or even death (usually rating nothing more than a story in the local town newspaper). Will there be a public hue and cry once software faults start to be recorded as a primary factor in accidental death or serious injury? As this paper shows, it can take a lot of dead bodies before existing practices are changed.
The lack of dead bodies attributed to a software root cause suggests that it is still very early days for the field of high integrity software development.
This material was originally written in 2005 and appeared in an earlier blog of mine which I did not keep up.