The Met Office ‘climategate’ Perl code
In response to the Climategate goings-on, the UK Meteorological Office has released a subset of its land surface climate station records and some code to process it. The code consists of 397 lines of Perl (station_gridder.perl and pretty printer and kind of implies more than one person doing the editing. And why are some variable names capitalized and others not (the names in subroutine read_station are all lower case, while the names in the surrounding subroutines are mostly upper case)? More than one author is the simplest answer.
One Perl usage caught my eye: the construct unless is rarely used and often recommended against. Without a lot more code being available for analysis there are no obvious conclusions to draw from this usage, apart from it being an indicator of somebody who knows Perl well. Most mainstream languages do not support this construct, so developers have to use a ‘positive’ construct containing a negated condition rather than a ‘negative’ construct containing a positive condition.
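To make the distinction concrete, here is a minimal sketch of the same test written three ways. It is my own illustration and is not taken from the Met Office code; the subroutine names and the 'missing' return value are hypothetical.

```perl
use strict;
use warnings;

# 'Negative' construct containing a positive condition.
sub with_unless {
    my ($value) = @_;
    unless (defined $value) {
        return 'missing';
    }
    return $value;
}

# 'Positive' construct containing a negated condition; this is
# the form most mainstream languages oblige developers to write.
sub with_if_not {
    my ($value) = @_;
    if (!defined $value) {
        return 'missing';
    }
    return $value;
}

# Statement-modifier form of unless, common in idiomatic Perl.
sub with_modifier {
    my ($value) = @_;
    return 'missing' unless defined $value;
    return $value;
}
```

All three subroutines behave identically; the only difference is which way round the reader has to hold the condition in their head.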
Given that it was written in perl, it should have been rewritten as a one-liner.
The people who wrote these scripts may be “fluent in Perl” (whatever that means), but they are certainly no masters of the language, nor are they plugged into the general Perl community. There is no usage of CPAN modules (many of which would have made this code more comprehensible). There are several subtle bugs in the code and the code is unnecessarily imperative. While it may not be “Fortran-accented” (and I agree), it is definitely “C-accented”. (I would say “Java-accented”, but someone will argue that there aren’t any objects.)
If I had to guess, this code was written by 3-5 developers who are conversant in several languages and prefer Java or C++ over most other languages (likely they have a large Java or .Net application to maintain and this is a side project). Someone decided that since this was manipulating text, Perl would be the best choice and everyone else just went along. The code has undergone at least 3 internal bugfix/upgrade cycles and possibly as many as 6+.
And, finally, this code isn’t being released to the OSS community – it’s being released as a political maneuver.
(Oh – Damian’s anti-unless screed is almost completely disregarded by the Perl community at large, at least as judged by CPAN authors.)
I think the code is comprehensible as written and somebody who knows a language other than Perl could probably figure out what is going on. CPAN does contain lots of useful modules, but when writing relatively small amounts of code, unless the developer happens to be familiar with a module that solves the problem at hand, it is generally less effort to write the required code. Not using CPAN modules also makes life simpler for non-Perl users who will not know about CPAN (here is one).
The code does have a strong imperative feel to it. I don’t see this as being necessary or unnecessary, but then I am always less than impressed by the cryptic code much beloved of Perl aficionados. I see no C or Java specific ‘accent’, but then 400 lines of code is the briefest of conversations.
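As an illustration of the difference in feel, here is a sketch of one small task written both ways. It is not taken from the released code: the subroutine names, the sample readings and the -99 missing-value marker are all assumptions made for the example. The sum function comes from List::Util, which ships with Perl.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical marker for a missing reading (an assumption for this sketch).
use constant MISSING => -99;

# Imperative style: explicit index loop with running totals.
sub mean_imperative {
    my @values = @_;
    my $total  = 0;
    my $count  = 0;
    for (my $i = 0; $i < scalar(@values); $i++) {
        if ($values[$i] != MISSING) {
            $total += $values[$i];
            $count++;
        }
    }
    return $count > 0 ? $total / $count : undef;
}

# Terser style: grep drops the missing readings, sum adds up the rest.
sub mean_terse {
    my @valid = grep { $_ != MISSING } @_;
    return scalar(@valid) ? sum(@valid) / scalar(@valid) : undef;
}
```

Both return the mean of the valid readings, or undef when there are none. Which version is easier to follow depends largely on whether the reader already knows grep and List::Util.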
Unless you have inside information, the numbers in your second paragraph might just as well have been delivered by Santa Claus.
The release of the data is obviously driven by a public relations agenda and the code might just have been tagged on as an afterthought. Perhaps a couple of Met Office web site developers had some spare time on their hands and decided to put something together.