Wednesday, March 10, 2010

An Olympic honour for Alan Turing

Over at The Guardian I write:

Last year I led a campaign to obtain an apology for the mistreatment of the British mathematician Alan Turing. Turing's prosecution for homosexuality led to the death of a true genius at the age of only 41 in 1954. On 10 September last year, Gordon Brown issued an apology that recognised Turing's stature as one of the greatest Britons. But Britain has a final opportunity to unapologetically recognise Alan Turing in two years' time, at the 2012 Olympics.

Read the rest here.

Tuesday, March 09, 2010

Did Monbiot try to understand climate science?

In The Guardian's Comment is Free section there's an article by George Monbiot called The trouble with trusting complex science which argues that:

The detail of modern science is incomprehensible to almost everyone, which means that we have to take what scientists say on trust.

He does this in the context of climate change science. I wonder if he actually tried to read the key paper that describes why we know that the global temperature is increasing. The paper is Uncertainty estimates in regional and global observed temperature changes: a new dataset from 1850. Go on, read it. I dare you.

The critical thing you need to understand in order to follow that paper is... how to calculate an average. That's a GCSE-level maths subject; here's a quick page to revise that in case you've forgotten how to average.

Because, you see, the entire process described in that paper involves the following steps:

1. Get temperature data (i.e. thermometer readings) at different places around the world for many, many years
2. Work out the average temperature at each location by averaging the values between 1961 and 1990 on a monthly basis. So you end up knowing things like the average January temperature at Heathrow.
3. Now go back and work out how much the temperature for any given month and year deviates from the average: all that means is subtract the average temperature from the observed temperature for the same month. Now you know how 'different' the temperature is. This is called the anomaly. If it's getting hotter the anomalies will get bigger.
4. Divide the globe up into squares 5 degrees on each side. Find all the thermometers inside each square, find their anomalies for each month and year. Average them to get an average anomaly for that square.
5. Take all the squares in the northern hemisphere, average their anomalies for each month and year. Draw a graph showing the temperature changing. Repeat for the southern hemisphere.
6. Now take the northern and southern hemisphere temperatures for each month and year and average them to get a global temperature anomaly chart.
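
For anyone who'd rather read code than prose, the six steps above can be sketched in a few lines of Python. This is only an illustration of the averaging described in this post, not the code behind the paper: the function names, the (station, year, month, temperature) tuple layout and the locations dictionary are assumptions made for the example, and it ignores refinements such as any area weighting of squares and the uncertainty estimates.

```python
from collections import defaultdict
from statistics import mean


def monthly_normals(readings, start=1961, end=1990):
    """Step 2: average each station's readings per calendar month over the base period.

    readings is a list of (station, year, month, temperature) tuples.
    """
    buckets = defaultdict(list)
    for station, year, month, temp in readings:
        if start <= year <= end:
            buckets[(station, month)].append(temp)
    return {key: mean(vals) for key, vals in buckets.items()}


def anomalies(readings, normals):
    """Step 3: subtract the station's average for that month from each reading."""
    return [
        (station, year, month, temp - normals[(station, month)])
        for station, year, month, temp in readings
        if (station, month) in normals
    ]


def grid_cell(lat, lon, size=5):
    """Step 4: (row, column) of the size-by-size degree square containing a point."""
    rows, cols = 180 // size, 360 // size
    row = min(int((lat + 90) // size), rows - 1)   # keep lat = 90 in the top row
    col = min(int((lon + 180) // size), cols - 1)  # keep lon = 180 in the last column
    return (row, col)


def gridded_anomalies(anoms, locations, size=5):
    """Step 4: average the anomalies of all stations in each square, per month and year.

    locations maps a station name to its (latitude, longitude).
    """
    cells = defaultdict(list)
    for station, year, month, anomaly in anoms:
        lat, lon = locations[station]
        cells[(year, month, grid_cell(lat, lon, size))].append(anomaly)
    return {key: mean(vals) for key, vals in cells.items()}


def global_anomaly(gridded, year, month, size=5):
    """Steps 5 and 6: average each hemisphere's squares, then average the hemispheres."""
    equator_row = 90 // size
    north, south = [], []
    for (y, m, (row, _col)), anomaly in gridded.items():
        if (y, m) == (year, month):
            (north if row >= equator_row else south).append(anomaly)
    hemispheres = [mean(h) for h in (north, south) if h]
    return mean(hemispheres) if hemispheres else None
```

Feed it a list of thermometer readings and station locations, call the functions in order, and global_anomaly gives one point on the global chart for the month and year you ask for.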

Child's play? Yes.

I'll admit that the rest of the paper has some harder concepts (standard deviation, anyone?). But I'll wager that the real reason that people don't understand science is not that it's too hard to understand, but that they aren't motivated.

Yes, there are parts of science that require a lot of knowledge, but covering your eyes and not trying to understand is likely where many people go wrong.

Or to put it Monbiot's way:

My heart rebels against this project: I would rather be pelting scientists with eggs than trying to understand their datasets.

Wednesday, March 03, 2010

A welcome bunch of amateurs

Here's me writing in The Guardian's Comment is Free section:

We're all the children of amateurs: amateur parents. There's no government department that will certify you as a parent (thankfully), nor a university department where you get your PhD in being a daddy, nor a professional body ready to strike you off for not following mothering standards. But any parent who's held a newborn child in their arms has unconsciously taken the amateur's oath: "I may not be a professional, but I'm going to do whatever it takes to act like one."

You can read the rest here.

Monday, March 01, 2010

The reason I managed to find errors in the Met Office data and code

It turns out the reason is rather simple. In the evidence being given today to the Parliamentary Science and Technology Committee the Met Office says of their quality control procedures:

Manual inspection, including real-time quality control using GIS software; quality control described in literature for the various regional studies.

Contrast that with what they say about NOAA's procedures for the same sort of data:

A long series of automatic quality control tests based on both statistics and physics (e.g., outlier tests, identical values two months in row, etc.)

Given the amount of data it's unsurprising that 'manual inspection' isn't enough.
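
To give a flavour of what automatic tests of that kind might look like, here's a minimal Python sketch of the two examples named in the quote: an outlier test and a check for identical values two months in a row. It is not NOAA's or the Met Office's code; the function names and the three-standard-deviation threshold are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def outliers(values, threshold=3.0):
    """Return the indices of values more than `threshold` standard deviations from the mean."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]


def repeated_months(values):
    """Return the indices of values that exactly repeat the previous month's value."""
    return [i for i in range(1, len(values)) if values[i] == values[i - 1]]
```

Run over every station's monthly series, checks like these flag suspicious readings for a human to look at, which scales rather better than inspecting everything by eye.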

Update: There's a lovely bit of evidence from Professor Darrel Ince about software quality that I wholeheartedly agree with.

Thursday, February 25, 2010

Something a bit confusing from UEA/CRU

UEA and CRU have issued a document that they have submitted to the Parliamentary Select Committee on Science and Technology, which is looking into the taking of emails and documents from CRU. The document can be found here.

In it there are two interesting paragraphs concerning software:

3.4.7 CRU has been accused of the effective, if not deliberate, falsification of findings through deployment of “substandard” computer programs and documentation. But the criticized computer programs were not used to produce CRUTEM3 data, nor were they written for third-party users. They were written for/by researchers who understand their limitations and who inspect intermediate results to identify and solve errors.

3.4.8 The different computer program used to produce the CRUTEM3 dataset has now been released by the MOHC with the support of CRU.

It's 3.4.8 that's surprising. I assume that they are referring to the code released by the Met Office on this page (MOHC = Met Office Hadley Centre). On that page they say (my emphasis):

station_gridder.perl takes the station data files and makes gridded fields in the same way as used in CRUTEM3. The gridded fields are output in the ASCII format used for distributing CRUTEM3.

My reading of "in the same way as" has always been that this code is not the actual code that they used for CRUTEM3 but something written to operate in the same manner. In which case 3.4.8 is either incorrect, or referring to some other code that I can't lay my hands on.

Has anyone seen any other CRUTEM3 code released by the Met Office?

More information

Looking into this a bit further there's a description of the CRUTEM3 data format on the CRU site here. Here's what it says:

for year = 1850 to endyear
for month = 1 to 12 (or less in endyear)
format(2i6) year, month
for row = 1 to 36 (85-90N,80-85N,75-70N,...75-80S,80-85S,85-90S)
format(72(e10.3,1x)) 180W-175W,175W-170W,...,175-180E

The interesting thing in that description is the format command. That is an IDL command (and not a Perl command). The first one pads the year and month to 6 characters each; the second one outputs a row of 72 values, each 10 characters wide in exponent format with three characters after the decimal point (the 1x gives a single space of separation).
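
To make the layout concrete, here's a small Python sketch that writes lines with the same column widths. It only approximates the format statements quoted above: the function names are made up for the example, and Python's exponent notation puts one digit before the decimal point and uses a lowercase e, so the digits will not look exactly as they do in the real CRUTEM3 files.

```python
def header_line(year, month):
    """Analogue of format(2i6): two integers, each right-justified in 6 columns."""
    return f"{year:6d}{month:6d}"


def row_line(values):
    """Analogue of format(72(e10.3,1x)): each value takes 10 columns in exponent
    notation with 3 digits after the decimal point, followed by a single space."""
    return "".join(f"{value:10.3e} " for value in values)


# One month of an empty grid: a header line, then 36 rows of 72 values.
lines = [header_line(1850, 1)] + [row_line([0.0] * 72) for _ in range(36)]
```

Each row comes out 792 characters long: 72 values at 10 characters each plus 72 separating spaces.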

The other oddness is that the NetCDF files that are available for download were not produced by Perl, they were produced by XConv (specifically, version 1.90 on Mon Feb 22 18:26:48 GMT 2010). And I've tested XConv and it can't read the output of the Perl program supplied by the Met Office.

It's not definitive, but all that points to the Perl programs released by the Met Office not being the actual programs used to produce CRUTEM3. Which leads me back to my original question: has anyone seen any other CRUTEM3 code released by the Met Office?

PS I think the Perl code released by the Met Office was likely written by Philip Brohan (he's the lead author on the CRUTEM3 paper); the style is very, very similar to this code. Given that he's written a lot of Perl code, perhaps I'm simply wrong and the Perl code released by the Met Office is the actual CRUTEM3 generating code.

Update: The confusion was cleared up by Phil Jones of CRU talking to the Parliamentary committee. He stated that CRU has not released their code for generating CRUTEM3 because it is written in Fortran. The code released by the Met Office (the Perl code) is their version that produces the same result.

Here's the relevant exchange (my transcript):

Graham Stringer MP: So have you now released the code, the actual code used for CRUTEM3?

Professor Jones: Uh, the Met Office has. They have released their version.

Stringer: Well, have you released your version?

Jones: We haven't released our version. But it produces exactly the same result.

Stringer: So you haven't released your version?

Jones: We haven't released our version, but I can assure you...

Stringer: But it's different.

Jones: It's different because the Met Office version is written in a computer language called Perl and they wrote it independently of us and ours is written in Fortran.

It's worth noting that, although I said above that the format command is present in IDL, it's also present in Fortran, which jibes with Professor Jones' statement.

Later the same day Graham Stringer asked a panel about scientific software and here's part of the response from Professor Julia Slingo representing the Met Office:

Slingo: I mean, around the UEA issue, of course, we did put the code out. Um, at Christmas time. Before Christmas, to, along with the data. Because, we, I felt very strongly that we needed to have the code out there so that it could be checked.

(The rest of her answer doesn't concern CRUTEM3. It was a discussion of code used for climate modeling; I'm going to ignore what she said as it seems to have little bearing on the code I've looked at).

Wednesday, February 24, 2010

The station errors in CRUTEM3 and HadCRUT3 are incorrect

I'm told by a BBC journalist that the Met Office has said through its press office that the errors pointed out by Ilya Goz and me have been confirmed. The station errors are being incorrectly calculated (almost certainly because of a bug in the software) and the Met Office is rechecking all the error data.

I haven't heard directly from the Met Office yet; apparently the Met Office is waiting to write to me when they have rechecked their entire dataset.

The outcome is likely to be a small reduction in the error bars surrounding the temperature trend. The trend itself should stay the same, but the uncertainty about the trend will be slightly less.

Tuesday, February 16, 2010

The magic of sub-editors

In the print version of my Times article today there's been significant cutting to get it to fit into the space available. This is the magic work of sub-editors.

Here's the full text of the article with the words that remained in the sub-edited version (which appeared in the paper):

The history of science is filled with stories of amateur scientists who made significant contributions. In 1937 the American amateur astronomer Grote Reber built a pioneering dish-shaped radio telescope in his back garden and produced the first radio map of the sky. And in the 19th century the existence of dominant and recessive genes was described by a priest, Gregor Mendel, after years of experimentation with pea plants.

But with the advent of powerful home computers, even the humble amateur like myself can make a contribution.

Using my laptop and my knowledge of computer programming I accidentally uncovered errors in temperature data released by the Met Office that form part of the vital records used to show that the climate is changing. Although the errors don’t change the basic message of global warming, they do illustrate how open access to data means that many hands make light work of replicating and checking the work of professional scientists.

After e-mails and documents were taken from the Climatic Research Unit at the University of East Anglia late last year, the Met Office decided to release global thermometer readings stretching back to 1850 that they use to show the rise in land temperatures. These records hadn’t been freely available to the public before, although graphs drawn using them had.


Apart from seeing Al Gore’s film An Inconvenient Truth I’d paid little attention to the science of global warming until the e-mail leaks from UEA last year.

I trusted the news stories about the work of the IPCC, but I thought it would be a fun hobby project to write a program to read the Met Office records on global temperature readings and draw the sort of graphs (a graph) that show(ing) how it’s hotter now than ever before.

Since my training is in mathematics and computing I thought it best to write self-checking code: I’m unfamiliar with the science of climate change (climate science) and so having my program perform internal checks for consistency was vital to making sure I didn’t make a mistake.

To my surprise the program complained about average temperatures in Australia and New Zealand. At first I assumed I’d made a mistake in the code and used (having checked the results with) a pocket calculator to double check the calculations.

The result was unequivocal: something was wrong with the average temperature data in Oceania. And I also stumbled upon other small errors in calculations.

About a week after I’d told the Met Office about these problems I received a response confirming that I was correct: a problem in the process of updating Met Office records had caused the wrong average temperatures to be reported. Last month the Met Office updated their public temperature records to include my corrections.
