The first commercially available (claimed) verified compiler
Yesterday, I read a paper containing a new claim by some of those involved with CompCert (yes, they of soap powder advertising fame): “CompCert is the first commercially available optimizing compiler that is formally verified, using machine assisted mathematical proofs, to be exempt from miscompilation”.
First commercially available; really? Surely there are earlier claims of verified compilers being commercial availability. Note, I’m saying claims; bits of the CompCert compiler have involved mathematical proofs (i.e., code generation), so I’m considering earlier claims having at least the level of intellectual honesty used in some CompCert papers (a very low bar).
What does commercially available mean? The CompCert system is open source (but is not free software), so I guess it’s commercially available via free downloading licensing from AbsInt (the paper does not define the term).
Computational Logic, Inc is the name that springs to mind, when thinking of commercial and formal verification. They were active from 1983 to 1997, and published some very interesting technical reports about their work (sadly there are gaps in the archive). One project was A Mechanically Verified Code Generator (in 1989) and their Gypsy system (a Pascal-like language+IDE) provided an environment for doing proofs of programs (I cannot find any reports online). Piton was a high-level assembler and there was a mechanically verified implementation (in 1988).
There is the Danish work on the formal specification of the code generators for their Ada compiler (while there was a formal specification of the Ada semantics in VDM, code generators tend to be much simpler beasts, i.e., a lot less work is needed in formal verification). The paper I have is: “Retargeting and rehosting the DDC Ada compiler system: A case study – the Honeywell DPS 6” by Clemmensen, from 1986 (cannot find an online copy). This Ada compiler was used by various hardware manufacturers, so it was definitely commercially available for (lots of) money.
Are then there any earlier verified compilers with a commercial connection? There is A PRACTICAL FORMAL SEMANTIC DEFINITION AND VERIFICATION SYSTEM FOR TYPED LISP, from 1976, which has “… has proved a number of interesting, non-trivial theorems including the total correctness of an algorithm which sorts by successive merging, the total correctness of the McCarthy-Painter compiler for expressions, …” (which sounds like a code generator, or part of one, to me).
Francis Morris’s thesis, from 1972, proves the correctness of compilers for three languages (each language contained a single feature) and discusses how these features may be combined into a more “realistic” language. No mention of commercial availability, but I cannot see the demand being that great.
The definition of PL/1 was written in VDM, a formal language. PL/1 is a huge language and there were lots of subsets. Were there any claims of formal verification of a subset compiler for PL/1? I have had little contact with the PL/1 world, so am not in a good position to know. Anybody?
Over to you dear reader. Are there any earlier claims of verified compilers and commercial availability?
I’m not sure of the exactly how much of the University of Toronto “Holt Group” (R.C.Holt and J.R.Cordy) work made it into commercial products. Their early work was on Euclid and Concurrent Euclid, which were aimed at producing formally verifiable, provably correct, systems software. I don’t know if there was a “commercial” Concurrent Euclid compiler, though it was supported within the University of Toronto for systems programming by students. Concurrent Euclid more or less was, as I understand it, the basis for Turing (which was initially called New Euclid), and indeed a Turing compiler was made commercially available (by Holt Software Associates, used by high schools in Ontario and other universities). I don’t know if attempts were made to verify their Turing compiler, though there was a formal definition written for the language (the Dec. 1988 CACM article describing Turing mentions that they though the formal definition would allow them to someday prove the correctness of an implementation). I did know one of the people who was a contract employee working on the compiler maintenance, but all I remember is she used a regression test suite when fixing reported bugs.
I found this article which only mentions one thesis paper from UofT (which may not be about Turing) and I’m sure there should be several more: https://dl.acm.org/citation.cfm?id=966235
On a bit of a side-track I was lead to the following claims by the Northeastern University’s College of Computer and Information Science. I don’t know though if any of their work has made it into any commercially available compiler or other tool.
* CCIS researchers showed how to scale the decades-old proof method known as logical relations to realistic languages—with features like memory allocation and mutation, objects, and concurrency—leading to wide use of the technique, e.g., for verifying fine-grained concurrent data structures, correctness of compiler transformations, and soundness of advanced type systems that guarantee information hiding or differential privacy.
* CCIS developed the first proof architecture for verifying compilers in the presence of inter-anguage linking of compiled code. To date, it is the only such architecture applicable to compilers for typed high-level (non-C- like) languages.
https://www.ccis.northeastern.edu/research-area/formal-methods/
@Greg A. Woods
My only knowledge of Euclid is from a couple of papers I vaguely recall and reading its Wikipedia article, before writing this reply. The software analysis techniques during the 1970s were not up to analyzing code that modified global variables (it introduced too much complexity and memory requirements); I’m guessing that this is the reason for Euclid not supporting assignment to globals, and the other features look like they are designed to stop people writing code that 1970’s tools could not process (or only with great difficulty).
There have been several languages called Turing. Other than their existence, I know nothing about them.