Secret instruction sets about to make a comeback
Now that it has happened, Apple’s launch of a new processor, the A4, seems such an obvious thing to do. As the manufacturer of proprietary products Apple wants to have complete control over the software that customers run on those products. Using a processor whose instruction set and electrical signals are not publicly available goes a very long way to ensuring that somebody else’s BIOS/OS/drivers/etc do not replace Apple’s or that the distributed software is not ‘usefully’ patched (Apple have yet to reveal their intentions on publishing instruction set details; this is an off-the-cuff prediction on my part).
Why are Apple funding the development of the LLVM C/C++ compiler? Because it enables them to write a back-end for the A4 without revealing the instruction set (using gcc for this purpose would require that the source be distributed, revealing the instruction set). So much for my prediction that Apple will stop funding LLVM.
The landscape of compute-intensive CPUs used to be populated with a wide variety of offerings from many manufacturers. The very high price of riding the crest of Moore’s law is one of the reasons that Intel’s x86 acquired such a huge chunk of this market; one by one, processor companies decided not to make the investment needed to keep their products competitive. Now that the applicability of Moore’s law is drawing to an end, the treadmill of continued processor performance upgrades is starting to fade away. Apple are not looking at a processor upgrade cycle, one they need to keep up with to be competitive, that stretches very far into the future.
Isn’t keeping an instruction set secret commercial suicide? The way to sell hardware is to make sure that lots of software runs on it, and software developers want instruction set details (well ok, a small number do, but I’m sure lots of others get a warm fuzzy feeling knowing that the information is available should they need it). This is very much a last century view. The world is awash with an enormous variety of software, including a lot of very useful open source, and there is a lot more information available about the set of applications many customers want to use most of the time. Other than existing practice there is little reason why a manufacturer of a proprietary product, that is not a processor, needs to release instruction set details.
In the early 1980s I was at the technical launch of the Inmos Transputer and was told that the instruction set would not be published because the company did not want to be tied down to backwards compatibility. Perish the thought that customers would write assembler because the Inmos-supplied compilers produced poor quality code or that a third party compiler would be good enough to gain a significant market share. In marketing ability Inmos was the polar opposite of Apple. For a while HP were not very forthcoming on the instruction set behind their PA-RISC processor.
Will the A4 instruction set remain secret? On the whole, perhaps it can. The software based approaches to reverse engineering an instruction set require access to a compiler that generates it. For instance, changing a plus to a minus in one expression and looking for the small change in the generated executable; figuring out which status flags are set under which conditions is harder, but given time it can be done. Why would Apple release executable compilers if it wants to control the software that runs on the A4?
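The plus-to-minus trick can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two byte strings are invented stand-ins for the executables produced from the two versions of the source (they are not real A4, or any other, encodings), and diff_bytes is a hypothetical helper name. A real comparison would also have to cope with images of different lengths and with unrelated differences such as embedded timestamps.

```python
from typing import List, Tuple


def diff_bytes(a: bytes, b: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, byte_in_a, byte_in_b) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles images of equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Invented data: pretend these are the executables generated from
# "x = a + b" and "x = a - b". The values are made up for illustration.
build_plus = bytes.fromhex("10203040a1005060")
build_minus = bytes.fromhex("10203040a2005060")

for offset, old, new in diff_bytes(build_plus, build_minus):
    # Each differing byte is a candidate location for the add/subtract opcode.
    print(f"offset {offset:#06x}: {old:#04x} -> {new:#04x}")
```

Repeating this for one small source change at a time gradually builds up a table of which bit patterns correspond to which operations, which is why access to a compiler matters so much.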
If the instruction set were cracked where would a developer go to obtain a compiler targeting the A4?
Given the FSF’s history with Apple I would not be surprised if there was a fatwa against support for a proprietary instruction set in gcc. I suspect Apple would frown very heavily on such support ever being in the standard LLVM distribution. I could see Apple’s lawyers sending letters to anybody making available a compiler that generated A4 code.
In the past manufacturers have tried to keep processor instruction sets secret and found that commercial reality required them to make this information freely available. Perhaps in the long term, like Moore’s law, the publication of detailed instruction set information may just be a passing phase.
Most commentators think the A4 is an ARM system-on-chip. There’s unlikely to be anything proprietary about the CPU instruction set. However, the GPU is more likely to be kept proprietary, but this is also true for mainstream x86 desktops and laptops.
There are advantages for Apple in taking an existing design and tweaking it for their own requirements (e.g., lower cost, proven design). The A4 instruction set does not need to be completely unknown to meet a lock-out objective. The semiconductor company purchased by Apple a couple of years ago, PA Semi, used to specialise in low power POWER/PowerPC designs and I would think that this is a more likely choice if they go down the tweak-an-existing-processor route.
This seems unlikely. The A4 will be an ARM Cortex plus peripherals. There are too many iPhone OS developers out there for any specialised instruction set to remain secret.
I don’t believe that the ‘many eyes’ theory of fault finding (or instruction reverse engineering) holds much water. Most people don’t have the skill, inclination or time to get involved.
Over time information on any secret instruction set will leak out. It only takes one skilled person with the inclination and time, or somebody to leak a confidential document. But once sufficient information is available, who will write the code generator? Again, given time I’m sure one or more people will write one. However, Apple’s lawyers will ensure that getting hold of this compiler will be hard (assuming Apple want to keep the instruction set secret), so bugs will probably take ages to be fixed and the quality will probably be low.
No commercial company will touch a ‘black market’ compiler; each will have to strike a deal with Apple to port their application, which means Apple maintains control of the money making (the whole purpose of the exercise).