My 2023 in software engineering
In a 2009 post, I predicted that Chinese and Indian developers would become a major influence in the next decade. This year, it was very noticeable that many of the authors of papers at major conferences had Asian names. I would say that, on average, papers by authors with Asian names were better than those by authors with non-Asian names.
While LLMs dominated the software news this year, the lead time for research projects and conference submission deadlines meant that few of the papers accepted at this year’s top-ranked conferences were LLM-based, e.g., around 5% at ICSE. I expect there will be a much higher percentage of LLM-based papers in 2024, which I think will be a disaster for software engineering research, at least in the short term. From what I have seen and read, much of LLM-based software engineering is driven by fashion and/or a desire to gain experience that leads to a job in AI. Discovering something useful about software development takes a back seat (the currently fashionable topic, butterfly collecting, at least produces potentially useful datasets). I do think that LLMs are going to be very useful for analyzing text data, e.g., named entity recognition.
London-based, software-related meetups have come back to life. I go to around 1-2 a week, and the regular good ones include: Internet of Things, Extreme Tuesday Club, London Prompt Engineers, and London R. On the academic front, I have started attending the software reliability seminars at Imperial, while funding constraints mean that the excellent Crest Open Workshops are down to two a year. There were a handful of hackathons this year, and I got to go to one of them, an LLM hackathon.
Not usually software specific: Newspeak House hosts a variety of events that are often attended by many developers and those associated with the rationalist community. I attend maybe 2–3 events a month.
What did I learn/discover about software engineering this year?
- A small team estimation dataset showed the same kinds of patterns seen in larger teams,
- more cost/benefit analysis of software engineering activities here and here,
- data on Cobol source is very rare, and I found some,
- programs often continue to work very well in the presence of serious coding mistakes; I discovered some conditions where this occurs (to be continued next year),
- yet more debunking of software folklore: Optimal function length, and Hardware/Software cost ratio,
- I fell down the rabbit hole of early computers’ performance and their benchmarks.
The evidence-based software engineering Discord channel ticks over (invitation), with sporadic interesting exchanges.
> “which I think will be a disaster for software engineering research”
Research with external validity has been increasingly encouraged by reviewers, and is even used as an evaluation criterion at conferences. Ground research/internal validity does not seem to be very exciting for software engineering researchers, let alone replications. With LLMs I think this will only get worse.
> “London Prompt Engineers”
What do you think of the term “prompt engineering”? Couldn’t we have come up with a better name for it? hahaha