Software engineering research problems having worthwhile benefits
Which software engineering research problems are likely to yield good-enough solutions that provide worthwhile benefits to professional software developers?
I can think of two (hopefully there are more):
- what is the lifecycle of software? For instance, the expected time-span of the active use of its various components, and the evolution of its dependency ecosystem,
- a model of the main processes involved in a software development project.
Solving problems requires data, and I think it is practical to collect the data needed to solve these two problems; some is already available: application lifetime data, and detailed project data (a lot more is needed).
Once a good-enough solution is available, its practical application needs to provide a worthwhile benefit to the customer (when I was in the optimizing compiler business, I found that many customers were not interested in more compact code unless the executable was at least 10% smaller; this was in the era when computer memory was often measured in kilobytes).
Investment decisions require information about what is likely to happen in the future, and an understanding of common software lifecycles is needed. The fact that most source code has a brief existence (a few years), and is rarely modified by somebody other than the original author, has obvious implications for investment decisions intended to reduce future maintenance costs.
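One way to make "expected time-span of active use" concrete, while still counting applications that are in use today rather than only those already retired, is a Kaplan-Meier survival estimate. The following is a minimal sketch, not an analysis of real data: the function name `kaplan_meier` and the six lifetimes are made up for illustration.

```python
from collections import Counter


def kaplan_meier(durations, retired):
    """Kaplan-Meier estimate of the fraction of applications still in use.

    durations: years each application has been observed in use.
    retired:   True if the application was seen to be retired at that time,
               False if it was still in use when observation stopped
               (a right-censored observation).
    Returns a list of (time, survival) pairs, one per observed retirement time.
    """
    retirements = Counter(t for t, r in zip(durations, retired) if r)
    exits = Counter(durations)      # retirements plus censored observations
    at_risk = len(durations)        # applications still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = retirements.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # everything leaving at time t drops out
    return curve


# Hypothetical data: six applications, three of them still in use.
durations = [1, 2, 2, 3, 5, 5]
retired = [True, True, False, True, False, False]

curve = kaplan_meier(durations, retired)
for t, s in curve:
    print(t, f"{s:.3f}")
# prints:
# 1 0.833
# 2 0.667
# 3 0.444
```

Looking only at the three retired applications would suggest that everything is dead by year 3; keeping the still-in-use applications as censored observations gives an estimated 44% survival at that point.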
Running a software development project requires an understanding of the processes involved. This knowledge is currently acquired by working on projects managed by people who have successfully done it before. A good-enough model is not going to replace the need for previous experience (some amount of experience is always going to be needed), but it will provide an effective way of understanding what is going on. There are probably lots of different good-enough ways of running a project, and I’m not expecting there to be a one-true-way of optimally running a project.
Perhaps the defining characteristic of the solution to both of these problems is lots of replication data.
Applications are developed in many ecosystems, and there are likely to be variations between the lifecycles that occur in different ecosystems. Researchers tend to focus on GitHub because it is easily accessible, which is no good when replications from many ecosystems are needed (an analysis of GitHub source lifetime has been done).
Projects come in various shapes and sizes, and a good-enough model needs to handle all the combinations that regularly occur. Project-level data is not really present on GitHub, so researchers need to get out from behind their computers and visit real companies.
Given the payback time-frame for software engineering research, there are problems which are not cost-effective to attempt to answer. Suggestions for other software engineering problems likely to be worthwhile trying to solve are welcome.
> what is the lifecycle of software? For instance, the expected time-span of the active use of its various components, and the evolution of its dependency ecosystem
Oooh. I can see the headline now…
“100% of Russian Roulette players interviewed survived… Risks Exaggerated!”
The trouble with this question is …
* Most software shouldn’t have been written; it didn’t have a viable economic reason to exist.
* Lots of software was so badly written, it died or was surpassed.
* Some software is “Good Enough” to be of value unchanged for decades, but something is seriously wrong because it isn’t upgraded for decades (e.g., splint, graphviz…).
* Some software hit upon such a rich vein of economic value that, no matter how atrociously it was written, no matter how bad the processes or tools used… that software, and its processes and tools, come to epitomize “Best Practice” for a generation. Because “Look How Rich they Are”.
Facebook is my current best example of this… some of their tools and practices are awful… but when you have that much money and that many warm bodies to throw at your problems… everything looks shiny. Except if you try to scale it to a small project.
@John Carter
You are right about the lack of a viable economic reason to exist (I give a great example at the start of the Projects chapter of my Evidence-based software engineering book).
All your other points are also very true.
Following through the logic of your points, we should be teaching students how to minimize their investment in the software they produce (and not teach them that they should write software like it’s going to last forever).
It could be said that many developers already work like this, out of accident, laziness, incompetence, etc, rather than a thoughtful analysis of the factors involved.
Graphviz is a great package that I once used a lot. Isn’t it the case that the software does what it was designed to do well, and does not need any major changes?
By far most of the programs I work on, or use, have already had lifespans of over 20 years and counting…
Graphviz is fantastic, but in using it over the years I have personally spotted quite a few gaps, and last time I looked none were going to be easy to fill.
Sadly I have seen a lot of companies write software on the basis you describe, which is a self-fulfilling prophecy…
It’s an important research problem, but a very very hard question to answer well. Success has many fathers, but failure…
In principle, according to modern best practice, every company does a project retrospective… so it’s just a matter of collecting those.
…in practice, they tend to be “very company confidential / sensitive”.
Partly because every retro I have attended or heard report back on…
* the problems on the project mostly arose from people not, or no longer, “in the room”.
* the problems listed had a large overlap with the problems listed in the last retrospective.
* the problems created by those “in the room” usually were not regarded as such by their creators… So unless some vigorous “clearing the air” was in the offing, in the best English manner, the problems were set aside and swept under the carpet.
As I often say, to understand software engineering, it is probably better to read Frans de Waal than Tom DeMarco.