August 25, 2017 Derek Jones No comments

In the mid-70s the US Department of Defense decided it could save lots of money by getting all its contractors to write code in the same programming language. In February 1980 a language was chosen, Ada, but by the end of the decade the DoD had snatched defeat from the jaws of victory; what happened?

I think microcomputers are what happened; these created whole new market ecosystems, some of which were much larger than the ecosystems that mainframes and minicomputers sold into.

These new ecosystems sucked up nearly all the available software developer mind-share; the DoD went from being a major employer of developers to a niche player. Developers did not want a job using Ada because they thought that being type-cast as Ada programmers would overly restrict their future job opportunities; Ada was perceived as a DoD only language (actually there was so little Ada code in the DoD, that only by counting new project starts could it get any serious ranking).

Lots of people were blindsided by the rapid rise (to world domination) of microcomputers. Compilers could profitably be sold (in some cases) for thousands or hundreds of dollars/pounds because the markets were large enough for this to be economically viable. In the DoD ecosystems, compilers had to be sold for thousands or hundreds of thousands of dollars/pounds because the markets were small. Micros were everywhere and being programmed in languages other than Ada; cheap Ada compilers arrived after today’s popular languages had taken off. There is no guarantee that cheap compilers would have made Ada a success, but they would have ensured the language was a serious contender in the popularity stakes.

By the start of the 1990s, Ada supporters were reduced to funding studies to produce glowing reports of the advantages of Ada compared to C/C++ and how Ada had many more compilers, tools and training than C++. Even the 1991 mandate “… where cost-effective, all Department of Defense software shall be written in the programming language Ada, in the absence of special exemption by an official designated by the Secretary of Defense.” failed to have an impact and was withdrawn in 1997.

The Ada mandate was cancelled as the rise of the Internet created even bigger markets, which attracted developer mind-share towards even newer languages, further reducing the comparative size of the Ada niche.

Astute readers will notice that I have not said anything about the technical merits of Ada, compared to other languages. Like all languages, Ada has its fanbois; these are essentially much older versions of the millennial fanbois of the latest web languages (e.g., Go and Rust). There is virtually no experimental evidence that any feature of any language is best/worse than any feature in any other language (a few experiments showing weak support for stronger typing). To its credit, the DoD did fund a few studies, but these used small samples (there was not yet enough Ada usage to make larger sample possible) that were suspiciously glowing in their support of Ada.

Categories: Uncategorized Tags: Ada, ecosystems

We hereby retract the content of this paper

August 17, 2017 Derek Jones No comments

Yesterday I came across a paper in software engineering that had been retracted, the first time I had encountered such a paper (I had previously written about how software engineering is great discipline for an academic fraudster).

Having an example of the wording used by the IEEE to describe a retracted paper (i.e., “this paper has been found to be in violation of IEEE’s Publication Principles”), I could search for more. I get 24,400 hits listed when “software” is included in the search, but clicking through the pages there are just 71 actual results.

A search of Retraction Watch using “software engineering” returns nine hits, none of which appear related to a software paper.

I was beginning to think that no software engineering papers had been retracted, now I know of one and if I am really interested the required search terms are now known.

Categories: Uncategorized Tags: retraction

Two approaches to arguing the 1969 IBM antitrust case

August 16, 2017 Derek Jones No comments

My search for software engineering data has made me a regular customer of second-hand book sellers; a recent acquisition is: “Big Blue: IBM’s use and abuse of power” by Richard DeLamarter, which contains lots of interesting sales and configuration data for IBM mainframes from the first half of the 1960s.

DeLamarter’s case, that IBM systematically abused its dominant market position, looked very convincing to me, but I saw references to work by Franklin Fisher (and others) that, it was claimed, contained arguments for IBM’s position. Keen to find more data and hear alternative interpretations of the data, I bought “Folded, Spindled, and Mutilated” by Fisher, McGowan and Greenwood (by far the cheaper of the several books that have written on the subject).

The title of the book, Folded, Spindled, and Mutilated, is an apt description of the arguments contained in the book (which is also almost completely devoid of data). Fisher et al obviously recognized the hopelessness of arguing IBM’s case and spend their time giving general introductions to various antitrust topics, arguing minor points or throwing up various smoke-screens.

An example of the contrasting approaches is calculation of market share. In order to calculate market share, the market has to be defined. DeLamarter uses figures from internal IBM memos (top management were obsessed with maintaining market share) and quote IBM lawyers’ advice to management on phrases to use (e.g., ‘Use the term market leadership, … avoid using phrasing such as “containment of competitive threats” and substitute instead “maintain position of leadership.”‘); Fisher et al arm wave at length and conclude that the appropriate market is the entire US electronic data processing industry (the more inclusive the market used, the lower the overall share that IBM will have; using this definition IBM’s market share drops from 93% in 1952 to 43% in 1972 and there is a full page graph showing this decline), the existence of IBM management memos is not mentioned.

Why do academics risk damaging their reputation by arguing these hopeless cases (I have seen it done in other contexts)? Part of the answer is a fat pay check, but also many academics’ consider consulting for industry akin to supping with the devil (so they get a free pass on any nonsense sprouted when “just doing it for the money”).

Categories: Uncategorized Tags: antitrust, IBM

Books similar to my empirical software engineering book

August 7, 2017 Derek Jones No comments

I am sometimes asked which other books are similar to the Empirical Software Engineering book I am working on.

In spirit, the most similar book is “Software Project Dynamics” by Abdel-Hamid and Madnick, based on Abdel-Hamid’s PhD thesis. The thesis/book sets out to create an integrated model of software development projects, using system dynamics (the model can be ‘run’ to produce outputs from inputs, assuming the necessary software is available).

Building a model of the software development process requires figuring out the behavior of all the important factors, and Abdel-Hamid does a thorough job of enumerating the important factors and tracking down the available empirical work (in the 1980s). The system dynamics model, written in Dynamo, appears in an appendix (I have not been able to locate any current implementation).

In the 1980s, I would have agreed with Abdel-Hamid that it was possible to build a reasonably accurate model of software development projects. Thirty years later, I have tracked down a lot more empirical work and know a more about how software projects work. All this has taught me is that I don’t know enough to be able to build a model of software development projects; but I still think it is possible, one day.

There have been other attempts to build models of major aspects of software development projects (all using system dynamics), including Madachy’s PhD and later book “Software Process Dynamics”, and Buettner’s PhD (no book, yet???).

There are other books that include some combination of the words empirical, software and engineering in their title. On the whole, these are collections of edited papers, whose chapters are written by researchers promoting their latest work; there is even one that aims to teach students how to do empirical work.

Dag Sjøberg has done some interesting empirical work and is currently working on an empirical book, this should be worth a look.

“R in Action” by Kabacoff is the closest to the statistical material, but at a more general level. “The R Book” by Crawley is the R book I would recommend, but it is not at all like the material I have written.

Categories: Uncategorized Tags: book, models, software development

Signed-magnitude: The integer representation of choice for IoT?

July 28, 2017 Derek Jones No comments

What is the best representation to use for integer values in a binary computer? I’m guessing that most people think two’s complement is the answer, because this is the representation that all the computers they know about use (the Univac 1100/2200 series uses one’s complement; I don’t know of any systems currently in use that make use of signed magnitude, pointers welcome).

The C Standard allows implementations to support two’s complement, one’s complement and signed magnitude (the Univac 1100/2200 series has a C compiler). Is it time for the C Standard to drop support for one’s complement and signed magnitude?.

Why did two’s complement ‘win’ the integer representation battle and what are the chances that hardware vendors are likely to want to use a different representation in the future?

The advantage of two’s complement over the other representations is that the same hardware circuits can be used to perform arithmetic on unsigned and signed integer values. Not a big issue these days, but a major selling point back when chip real-estate was limited.

I can think of one market where signed magnitude is the ‘best representation’, extremely low power devices, such as those that extract power from the radio waves permeating the environment, or from the vibrations people generate as they move around.

Most of the power consumed by digital devices occurs when a bit flips from zero to one, or from one to zero. An application that spends most of its time processing signals that vary around zero (i.e., can have positive and negative values) will experience many bit flips, using a two’s complement representation, when the value changes from positive to negative, or vice-versa, e.g., from 0000000000000001 to 0000000000000000 to 1111111111111111; in signed magnitude a change of sign generates one extra bit-flip, e.g., 0000000000000001 to 0000000000000000 to 1000000000000001.

Simulations show around 30% few transitions for signed magnitude compared with two’s complement, for certain kinds of problems.

Signed magnitude would appear to be the integer representation of choice for some Internet-of-Things solutions.

Categories: Uncategorized Tags: IoT, representation, signed-magnitude

Software systems are the product of cognitive capitalism

July 19, 2017 Derek Jones No comments

Economics obviously has a significant impact on the production of software systems; it is the second chapter of my empirical software engineering book (humans, who are the primary influencers, are the first chapter; technically the Introduction is the first chapter, but you know what I mean).

I have never been happy with the chapter title “Economics”; it does not capture the spirit of what I want to talk about. Yes, a lot of the technical details covered are to be found in economics related books and courses, but how do these technical details fit into a grand scheme?

I was recently reading the slim volume “Dead Man Working” by Cederström and Fleming and the phrase cognitive capitalism jumped out at me; here was a term that fitted the ideas I had been trying to articulate. It took a couple of days before I took the plunge and changed the chapter title. In the current draft pdf little else has changed in the ex-Economics chapter (e.g., a bit of a rewrite of the first few lines), but now there is a coherent concept to mold the material around.

Categories: Uncategorized Tags: capitalism, cognitive

Ecosystems chapter added to “Empirical software engineering using R”

July 17, 2017 Derek Jones No comments

The Ecosystems chapter of my Empirical software engineering book has been added to the draft pdf (download here).

I don’t seem to be able to get away from rewriting everything, despite working on the software engineering material for many years. Fortunately the sparsity of the data keeps me in check, but I keep finding new and interesting data (not a lot, but enough to slow me down).

There is still a lot of work to be done on the ecosystems chapter, not least integrating all the data I have been promised. The basic threads are there, they just need filling out (assuming the promised data sets arrive).

I did not get any time to integrate in the developer and economics data received since those draft chapters were released; there has been some minor reorganization.

As always, if you know of any interesting software engineering data, please tell me.

I’m looking to rerun the workshop on analyzing software engineering data. If anybody has a venue in central London, that holds 30 or so people+projector, and is willing to make it available at no charge for a series of free workshops over several Saturdays, please get in touch.

Projects chapter next.

Categories: Uncategorized Tags: book, ecosystems, R

2017 in the programming language standards’ world

July 12, 2017 Derek Jones No comments

Yesterday I was at the British Standards Institution for a meeting of IST/5, the committee responsible for programming languages.

The amount of management control over those wanting to get to the meeting room, from outside the building, has increased. There is now a sensor activated sliding door between the car-park and side-walk from the rear of the building to the front, and there are now two receptions; the ground floor reception gets visitors a pass to the first floor, where a pass to the fifth floor is obtained from another reception (I was totally confused by being told to go to the first floor, which housed the canteen last time I was there, and still does, the second reception is perched just inside the automatic barriers to the canteen {these barriers are also new; the food is reasonable, but not free}).

Visitors are supposed to show proof that they are attending a meeting, such as a meeting calling notice or an agenda. I have always managed to look sufficiently important/knowledgeable/harmless to get in without showing any such documents. I was asked to show them this time, perhaps my image is slipping, but my obvious bafflement at the new setup rescued me.

Why does BSI do this? My theory is that it’s all about image, BSI is the UK’s standard setting body and as such has to be seen to follow these standards. There is probably some security standard for rules to follow to prevent people sneaking into buildings. It could be argued that the name British Standards is enough to put anybody off wanting to enter the building in the first place, but this does not sound like a good rationale for BSI to give. Instead, we have lots of sliding doors/gates, multiple receptions (I suspect this has more to do with a building management cat fight over reception costs), lifts with no buttons ‘inside’ for selecting floors, and proof of reasons to be in the building.

There are also new chairs in the open spaces. The chairs have very high backs and side-baffles that surround the head area, excellent for having secret conversations and in-tune with all the security. These open areas are an image of what people in the 1970s thought the future would look like (BSI is a traditional organization after all).

So what happened in the meeting?

Cobol standard’s work becomes even more dead. PL22.4, the US Cobol group is no more (there were insufficient people willing to pay membership fees, so the group was closed down).

People are continuing to work on Fortran (still the language of choice for supercomputer Apps), Ada (some new people have started attending meetings and support for @ is still being fought over), C, Internationalization (all about character sets these days). Unprompted somebody pointed out that the UK C++ panel seemed to be attracting lots of people from the financial industry (I was very professional and did not relay my theory that it’s all about bored consultants wanting an outlet for their creative urges).

SC22, the ISO committee responsible for programming languages, is meeting at BSI next month, and our chairman asked if any of us planned to attend. The chair’s response, to my request to sell the meeting to us, was that his vocabulary was not up to the task; a two-day management meeting (no technical discussions permitted at this level) on programming languages is that exciting (and they are setting up a special reception so that visitors don’t have to go to the first floor to get a pass to attend a meeting on the ground floor).

Categories: Uncategorized Tags: Ada, BSI, C, Cobol, Fortran, ISO, language standard, SC22

Information on computers from the 1970s and earlier

July 7, 2017 Derek Jones No comments

A collection of links to sources of hardware and software related information from the 1970s and earlier.

Computers and Automation, a monthly journal published between 1954 and 1978, by far and away the best source of detailed information from this period. The June issue contained an extensive computer directory and buyers guide, including a census of installed computers. The collected census for 1962-1974 must rank in the top ten of pdf files that need to be reliably converted to text.

Computer characteristics quarterly, the title says it all; the stories about the weird and wonderful computers that used to be on sale really are true. Only a couple of issues available online at the moment.

Bitsavers huge collection of scanned computer manuals. The directory listing of computer companies is a resource in its own right.

DTIC (Defense Technical Information Center). A treasure trove of work sponsored by the US military from the time of Rome and late.

Ed Thelan’s computer history: note his contains material that can be hard to find via the main page, e.g., the BRL 1961 report.

“Inventory of Automatic data processing equipment in the Federal Government”: There are all sorts of interesting documents lurking in pdfs waiting to be found by the right search query.

Books

“Software Reliability” by Thayer, Lipow and Nelson is now available online.

“The Economics of Computers” by William F. Sharpe contains lots of analysis and data on computer purchase/leasing and usage/performace details from the mid-1960s.

“Data processing technology and economics” by Montgomery Phister is still only available in dead tree form (and uses up a substantial amount of tree).

“Handbook of Automation Computation and Control Volume 2”

“Foundations of computer programming in Britain, 1945-55”, M. Campbell-Kelly’s PhD thesis (freely downloadable from the British Library; registration required).

Reports

Computers in Spaceflight The NASA Experience covers computers used in spacecraft up to the mid 1980s.

History of NSA General-Purpose Electronic Digital Computers (written in 1964, declassified in 2004).

Missing in Action

“A Study of Technological Innovation: The Evolution of Digital Computers”, Kenneth Knight’s PhD thesis at Carnegie Institute of Technology, published in 1963. Given Knight’s later work, this will probably be a very interesting read.

“Computer Survey”, compiled by Mr Peddar, was a quarterly list of computers installed in the UK. It relied on readers (paper) mailing in details of computers in use. There are a handful of references and that’s all I can find.

What have I missed? Suggests and links very welcome.

Categories: Uncategorized Tags: books, industry data

Increase your citation count, send me your data!

June 30, 2017 Derek Jones No comments

I regularly email people asking for a copy of the data used in a paper they wrote. In around 32% of cases I don’t get any reply, around 12% promise to send the data when they are less busy (a few say they are just too busy) and every now and again people ask why I want the data.

After 6-12 months, I email again saying that I am still interested in their data; a few have replied with apologies and the data.

I need a new strategy to motivate people to spend some time tracking down their data and sending it to me; there are now over 200 data-sets possibly lost forever!

I think those motivated by the greater good will have already responded. It is time to appeal to baser instincts, e.g., self-interest. The currency of academic life is paper citations, which translate into status in the community, which translate into greater likelihood of grant proposals being accepted (i.e., money to do what they want to do).

Sending data gets researchers one citation in my book (I am ruthless about not citing a paper if I don’t get any data).

My current argument is that once their data is publicly available (and advertised in my book) lots of other researchers will use it and more citation to their work will follow; they also get an exclusive, I only use one data-set for each topic (actually data is hard to get hold of, so the exclusivity offer is spin).

To back up my advertising claims, I point out that influential people are writing about my book and it’s all over social media. If you want me to add you to the list of influential people, send me a link to what you have written (I have no shame).

If you write about my book, please talk about the data and that researchers who make their data public are the only ones who deserve to fund and may citations rain down on them.

That is the carrot approach, how can I apply some stick to motivate people?

I could point out that if they don’t send me their data their work is doomed to obscurity, because I will use somebody else’s (skipping over the minor detail of data being hard to find). Research has found that people are less willing to share their data if the strength of the evidence is weak; calling out somebody like that is do-or-die.

If you write about my book, please talk about the data and point out that researchers who don’t make their data public have something to hide and should not be funded.

Since the start of 2017, researchers in the UK receiving government research grants are required to archive their data and make it available. This is good for future researchers, but not a lot of use for me now.

What do readers think? Ideas and suggestions welcome.

Categories: Uncategorized Tags: data availability, motivation

Newer Entries Older Entries

The Shape of Code

Microcomputers ‘killed’ Ada

We hereby retract the content of this paper

Two approaches to arguing the 1969 IBM antitrust case

Books similar to my empirical software engineering book

Signed-magnitude: The integer representation of choice for IoT?

Software systems are the product of cognitive capitalism

Ecosystems chapter added to “Empirical software engineering using R”

2017 in the programming language standards’ world

Information on computers from the 1970s and earlier

Increase your citation count, send me your data!

Recent Posts

Recent Comments

Archives

Meta