The fatal programming language research mistake
There is a fatal mistake often made by those involved in academic programming language research and a recent blog post (by an academic) asking if programming language research has a future has spurred me into writing about this mistake.
As an aside, I would agree with much of what the academic (Cristina (Crista) Videira Lopes) says about many popular modern programming languages being hacked together by kids who did not know much, if anything, about language design. However, this post is not a lament about the poor design quality of the languages commonly used in the commercial world; it is about the most common fatal mistake academics make when researching programming languages and a suggestion about how they can avoid making this mistake. What really endeared me to Crista was her critique of academic claims of language ‘betterness’ being completely unfounded (i.e., not being based on any empirical research).
The most common fatal mistake made by researchers in programming language design is to invent a new language. Creating an implementation for any language is a big undertaking and a new language has the added hurdles of convincing developers it is worth learning, providing the learning/reference materials and porting to multiple platforms. Researchers spend nearly all their time creating an implementation and a small percentage of their time actively researching the ‘new idea’.
The attraction of designing a new language is that it is regarded as a ‘sexy’ activity and, the first (and usually only) time around, the work needed to create an implementation does not look that great.
If a researcher really does feel that their idea is so revolutionary it is worth creating a whole new language for and they want me, and others, to start using it, then they need to make sure they can answer yes to the following questions:
- Have you, or your students, created an implementation of the language that provides reasonable diagnostics, executes programs at an acceptable rate and is available free of charge on the operating systems I use for software development?
- Is sufficient documentation available for me to learn the language and act as a reference manual once I become more expert?
- For the next five years will you, or your students, be providing, free of charge, prompt bug fixes to errors in your implementation?
- Will you and your students spend the time necessary to build an active user community for your language?
- For the next five years will you, or your students knowledgeable in the language, provide prompt support (via an email group or bulletin board) to user queries?
Some new languages from academia have managed to answer yes to these questions (Haskell, R and OCaml spring to mind, but only R looks like it will have any significant industrial take-up).
In practice most new languages fail to get past fragile implementations only ever used by their designer, with minimal new research to show for all the effort that went into them.
What programming language researchers need to do, at least if they want people outside of their own department to pay any attention to their ideas, is to experiment by adding functionality to an existing language. Using an existing language as a base has the following advantages:
- modifying an existing implementation is significantly less work than creating a new one,
- having to address all of the features present in real world languages will help weed out poor designs that only look good on paper (I continue to be amazed that people can be involved in programming language research without knowing any language very well),
- documentation for most of the language already exists,
- an extension is more likely to attract early adopters; developers tend to treat existing language+extensions as being a much smaller jump than a new language.
Programming language research is something of a fashion industry and I can well imagine people objecting to having to deal with a messy existing language. Well yes, the real world is a messy place and if a new design idea cannot handle that it deserves to be lost to posterity.
One cannot blame students for being idealistic and thinking they can create a language that will take over the world. It is the members of staff who should be ridiculed for being either naive or intellectually shallow.
While I agree with many of your points, I feel I need to take issue with your requirement for these things to be “free of charge”. Yes, free languages are more easily picked up, but programmers are willing to pay for their tools.
@Uli Kusterer
Yes, people are willing to pay for tools and a programming language is a tool. The world is full of different programming languages and making an implementation freely available eliminates a significant barrier to people being willing to try out that language.
I know of research systems containing a newly created language that have been developed and sold. What people are buying with these systems is a solution to some application domain problem, with the domain specific language being one component of a much larger package (i.e., the language is almost incidental).
Wouldn’t Python be a counterexample of your post? Originally designed and written by an academic, very popular and with a huge number of libraries for practical applications?
@Luis
Python was created as a “hobby language” rather than as a vehicle for research into programming languages. One of the big influences Guido van Rossum cites for Python is SETL, which would be a good counterexample to my post.
Good points, but some of the arguments are disputable.
Brendan Eich claims he designed and created a working JavaScript implementation in ten days and, in the 18 years that have passed since, has continued to meet your criteria. Clearly, the effort of creating the language in the first place is negligible in comparison.
And many researchers have shifted their activities towards extending existing languages, or at least interoperating with them, a long time ago – think of the C# ecosystem.
@Reinier Post
I think I could design a language and implement it in 10 days; I have built working parsers for languages in a day, provided they are close enough to C syntax and tokenization that I can reuse code I have previously written and am familiar with. Any language built in 10 days will have the characteristics of the existing code reused to build it, will probably be only an outline of its later self and be somewhat buggy and slow (I don’t know if early releases of JavaScript were like that).
I wish those researchers who extend an existing language would do so by modifying an existing implementation. Oh so frequently they create a new implementation, which invariably handles the extension well but not the rest of the original language (rendering their extension unusable by developers who also want to use the original language).
@Derek-Jones If you’re trying to say that the Tower of Babel phenomenon appears to be an inherent property of language development, I agree. (And you don’t even need a God to explain it.)