The UK Select Committee on Science and Technology has had its hearing today and there is a video link up here. Jones is about 1 hour into the proceedings. Takes a while to load.
I just wondered what the two people from the Global Warming Policy Foundation were doing there. They seemed very eager on the issue of openness, at least until they were asked about their funding.
The part about "hide the decline" has now been officially laid to rest in a governmental inquiry. The divergence problem was there all along. What a surprise.
If anybody has problems displaying the video like me, here is the direct link. You can open the video in your own player. I'm using VLC from videolan.org:
Hans, I thought the barely controlled pleasure at getting someone "grilled" for his scientific behaviour deserves some comment. Obviously there are a number of people who think one should go a step further than just some "grilling": http://epw.senate.gov/public/index.cfm?FuseAction=Minority.PressReleases&ContentRecord_id=fb6d4083-802a-23ad-46e8-c5c098e22aa1&Region_id=&Issue_id=
So besides commenting on the "grilling", what can be commented on, if not exactly the "grilling"?
No problem. I'll put it on my blog. Actually it's true, I like cheap jokes.
Georg - in most cases it is possible to express views in a manner, which is not considered inflammatory by opponents. As we know, we have lots of opponents (of whatever type) on this blog - and that is why we run the blog! - so that we all should avoid such "inflammatory" posting. The issue is: "sustainable use of the resource Klimazwiebel". -- Thanks for your understanding -- Hans.
Hans, again: the word "grilling" is obviously not mine. So can it be commented on or not? If someone takes pleasure in having "grilled" someone who declared two weeks ago that he was thinking of suicide, do you think this is the appropriate language for commenting on this hearing? So either Reiner refrains from this sort of polemics or you allow commenting on it. Having it both ways is not very consistent, in particular for the resource Klimazwiebel.
Lord Lawson... what exactly is he doing there? The "hockey stick", which was vindicated again and again, is called "largely fraudulent". When asked about "hiding the decline", and whether he agrees that the problem is that it "didn't appear in the footnotes or in the literature" (20:20 onwards), he just said "yes I do". Which is rather uninformed, since Briffa 1998, Nature 391, states exactly that. And a year before the e-mail. The title of the paper? "Reduced sensitivity of recent tree-growth to temperature at high northern latitudes". No, it wasn't in the footnotes. It was in a paper's title. In Nature. Or in the IPCC AR3 WG1, Chapter 2.3.2.1 Palaeoclimate proxy indicators: "Non-climatic growth trends must be removed from the tree-ring chronology" and "There is evidence, for example, that high latitude tree-ring density variations have changed in their response to temperature in recent decades".
So Lord Lawson sits there and makes plain false statements to a parliamentary inquiry. Why? Doesn't he know better? Then what is he doing there in the first place?
Georg - what I meant is - make a statement which is clear to everybody so that people understand what you mean. Do not hide it in easily mis-understandable "jokes".
Hans it cuts a bit the beauty of exchanging opinions, but ok:
I think Mr Jones gave information on his scientific work to a committee. I think the word "grilling" expresses Reiner's preconceived concept and prejudice of what this committee is actually about and what it might conclude. I think the word "grilling" is not helpful for a rational discussion on the issue of climate sciences. I personally think of Guantanamo or Joseph McCarthy when I hear "someone getting grilled" and therefore I think it is not appropriate. You should choose your words better.
When looking through the list of submissions to the committee, I am really puzzled where all the "friends" of CRU are. My impression is that only very few of the many coworkers of Phil Jones have considered it useful to post a statement of support. Or am I missing something here? -- Hans
@Flin You are right. If "they" don't like the results, send them to prison. Completely new possibilities for "future review processes".
Why is there actually no comment here on suing climate scientists for their supposed "misbehaviour"?
I would just like to put this in a normal perspective. The only thing I would be really embarrassed about in Phil Jones's behaviour and work would be if his results had serious flaws or were manipulated. Absolutely NOTHING indicates that.
For the rest, Jones is neither my neighbour nor my friend, but I wonder if Hans or Reiner really think that he merits being trashed by the British yellow press? I had expected a little less enthusiasm about the way things are going.
Hola Edu, Lomborg had a commission investigating his activities? An American senator was suggesting throwing him into prison?
But in any case, when someone feels sad about Jones and Lomborg being treated as they were treated, why then write that they are "getting grilled" with such malice?
@Edu "did I ?" Of course not. I am speaking all the time of the headline of this post. And actually my problem is rather the inconsistency. If Reiner chooses an obviously polemical headline like "Jones getting grilled", one should allow polemical replies. If Hans does not like polemics, why not then write in the headline: "Here is the video on the CRU commission".
re 8 _Flin_ Hockey stick vindicated again and again? Yes, by the same group of people more or less (Mann, Ammann, etc). Discard Wegman and McIntyre if you want (personally I tend to put more stock in McI than Mann on the hockey stick issue). Even Phil Jones states in the Guardian interview that the temperatures in the MWP may have been as high as today's! And regarding the statistical analysis used in the "vindicated" papers: Mann has yet to release his interesting statistical methodology. As far as I'm concerned the only question that remains is whether the MWP was global or just NH; not whether there was a significant MWP, even if the hockey stick papers deny it.
Correction: Phil Jones didn't say that temps in the MWP may have been as high as today's. He used the word "if"; my mistake. He did acknowledge the existence of a MWP however. That leaves me with Wegman and McIntyre. And the fact that we are still waiting for Mann to release his code and statistical methods.
Sorry, Henk, but Mann's code and statistical methods are fully available, in particular for his latest range of papers.
@Hans von Storch: I think that many of Jones' "friends" are scared as hell of being the next in line to be attacked. With the exception of Tim Osborn, I guess the CRU people didn't think it appropriate to react. I'd love to have seen Briffa, though, especially now that McIntyre attacked him again, but put in a graph that does not support his conclusion...
Oh dear, I didn't realize people were being kept awake all night because of that word. As Werner points out, it is not polemical at all. Maybe Georg is not familiar with its use.
Georg - maybe my English is simply too limited. "Grilling" for me means a serious investigation - by asking tough questions. Certainly not a pleasure but entirely legitimate, given the significance of Phil Jones' results. Also, many reviews of your scientific submissions amount to this type of "grilling". On the other hand, given the language used by Phil in his e-mails, I would not expect him to be too sensitive - in the same way as I am not too sensitive about attributes given to me.
I agree in the wider scheme of things it probably doesn't matter all that much, but defending the indefensible is counter-productive and promotes mistrust. It de-values all the solid evidence.
And "hide the decline" meant exactly that: "let's not draw too much attention to the divergence problem". It refers to a graph on the front cover of a WMO publication released in 1999. Look at the green line here:
Georg, something else. I am one of the few people who actively made a submission to the UK commission; I guess you did not. Others were also as "bequem" (comfortable) as you were, saying nothing except on a blog. (Before that, we did so in Nature.) - But note that Myles Allen and I made an assertion only on the thermometer-based temperature series, not on the proxy work, and particularly not on the 2000-year reconstruction done together with Mike Mann, which is widely considered questionable in the community. Bradley's big cross on the slate some years ago at a Swiss summer school is remembered by quite a few.
Georg As regards the 'yellow press', here is Fred Pearce who comments in the Guardian today. The headline reads: Phil Jones survives MPs' grilling over climate emails. Commons committee tiptoed round embattled scientist and sidestepped crucial questions
So Pearce seems to think Jones only got a 'light grilling', indicating perhaps that the grilling should have been more intense?
Here is a little taster:
Jones did his best to persuade the Commons science and technology committee that all was well in the house of climate science. If they didn't quite believe him, they didn't have the heart to press the point. The man has had three months of hell, after all.
Jones's general defence was that anything people didn't like – the strong-arm tactics to silence critics, the cold-shouldering of freedom of information requests, the economy with data sharing – were all "standard practice" among climate scientists. "Maybe it should be, but it's not."
And he seemed to be right. The most startling observation came when he was asked how often scientists reviewing his papers for probity before publication asked to see details of his raw data, methodology and computer codes. "They've never asked," he said.
Read the whole comment and make a judgement yourself here
Hockeystick "vindicated" - that's what Mike Mann and his friends are claiming, and repeating. But others, independent people do not see it like that. They consider the original work (MBH) as questionable. We also remember vividly the moving target-procedure employed in earlier times, when the algorithm was changed from paper to paper, without proper documentation. Eduardo can tell the story. Nowadays, Mann's products are seen as just one proposal for past temperature development, among others. And this is ok. When TAR made it THE one, it created the problem.
Either history is right, or Mann's hockey stick is right. It can't be both. We all know that global temperatures were not flat from 1000 to 1900. The fact that the IPCC slobbered all over his hockey stick and made it the centrefold of the TAR, without questioning it, reinforces the notion they are acting on an agenda and not scientifically.
I haven't had time to look at Jones being "grilled". But scanning the other blogs, it appears he claims it's standard practice to hide data. Well, that pretty much proves that climate science is broken, corrupt and in need of a major overhaul, doesn't it? I can't think of any other science where hiding data is "standard practice", except maybe finance. Either Jones goes, or the CRU's reputation goes. Not a tough choice.
What I found staggering to read (I haven't actually seen the "grilling") is the quote from Jones that it was not ’standard practice’ in climate science to release data and methodology for scientific findings so that other scientists could check and challenge the research. Also that the scientific journals that published his papers never asked for the data. Is this normal practice in science, or only in climate science? Funny thing is that in 2002 Jones did share the data with McIntyre. Apparently, after he saw what McIntyre did with the data, he decided to invent new rules.
IoP, RSC and RSS (reading their assessments) also think that models, codes and data should be in the public domain. And why not? We're not talking about nuclear/rocket science, it's temperature fercrissakes...
My reaction yesterday was a bit too harsh and mixed things that do not belong together.
I hadn't heard of this commission before I read this posting. I had, however, heard of Senator Inhofe asking for 11 climate researchers to be sued for their general involvement in climate research and for what they wrote in mails that they at least didn't consider public. I am really missing notices like this one here on the Klimazwiebel (as, by the way, any contributions to climate sciences besides one of Edu's, but that's just for the record). It is McCarthyism, and though right now he is still a minority, it might become a much bigger problem than the question of whether the data of the met station of Ulan Bator should be made public or not. Hans, could you provide me with a link to the commission's "submission" section? I doubt that I can say or do anything relevant, being "bequem" (comfortable) or not.
1) There are non-public data, whether you like it or not. I can only invite you to try to get the complete data sets of an arbitrary German station entering into Jones's computations from the DWD. The DWD makes money with these data, and when I was still at the Max-Planck I had to pay 200 Marks for the data from Heidelberg. Just one station.
2) There are exercises in climate sciences which cannot be reviewed in the way you might have in mind. Satellite data sets (again partly non-public) are huge, and Jones's data set is quite large too. Even for someone working in the same domain it will take weeks if not months to check each step. The scientific review process as a whole is therefore twofold. A) Obviously internal plausibility, citations, etc. by the actual reviewer. B) Subsequently others with similar (though not identical) data sets and different reasonable methods get or don't get similar results. That's in fact the really important step, not what has been done until the actual publication. So what the Guardian and/or you have in mind is a) not practicable and b) not needed.
3) In the end it's a question of trust. If Jones's data set, or surface temperatures in general, are only accepted when McIntyre agrees with the smallest and most minor subjective choice, that's the end of climate sciences, and this is of course intended by many. As a friend of mine put it: if anything is a lie you can no longer prove or disprove anything. Once you divide by zero everything becomes true.
Hans 27 If I understand you correctly, you made a submission to the UK Select Committee on Science and Technology? How does this work? Did they ask for contributions?
The hockey stick is now under political scrutiny (the UK committee), and in case senator Inhofe really sues climate scientists, under legal scrutiny. 'McCarthyism' seems to be an omnipresent accusation. You, Hans, once complained about McCarthyism - gatekeepers from influential journals who prevent critical or skeptical contributions; now, with Inhofe (and maybe even the Jones grilling), there is again McCarthyism in the air. Anyway, is there any hope that politics and law will correct what science itself obviously did not achieve - the control of regular scientific work? Or will the mess accumulate?
I am not sure about other branches of science. I will explain the situation in climate science as far as I know it.
Not all climate data are in the public domain. The principle of the U.S. federal government that data obtained by the government shall be in the public domain is not common in other nations. But a large part of the data needed for global climate research is openly available, at least for non-commercial applications, thanks to various international collaborations.
There are data centers whose job is to provide data to users. The data center to be named first for both modern climatology and paleoclimatology is the National Climatic Data Center, a part of NOAA of the USA. Their data are in the public domain unless otherwise specified by those who provide the data to them. But they may charge a fee for their service. Recently more data are on the Internet, so the occasions where we must pay a fee have become rarer.
In the following I limit the subject to modern climate data.
CRU is not a data center but just a research institution. They released their data products (gridded data), and maybe that was their duty by contract. However, providing the observational records at stations (their raw material) to other users is not their job, but the data centers' job. So I think it reasonable for Jones to say that McIntyre et al. should get the data from NCDC.
Actually, when researchers obtain data from data centers, they often have to make conversions between different formats, and some quality checks from the researchers' viewpoint, before using them for the substantial analysis.
I am a scientist who uses data from various sources. I write ad hoc program codes to do such minor tasks every time I need to use something new to me. But I do not usually write documentation for those codes. If I need to share them with my collaborators, I write some informal documents to be supplemented by conversations. (I feel I should do this for my future self as well.) Writing full users' manuals understandable by outsiders is a much harder task. If the equivalent of British IOC determines that I must release the codes, I will be obliged to spend more time writing their documentation than writing the codes themselves.
When I report the results in scientific papers, I explain my methodology. It means that I explain the substantial parts of the process of my analysis in words or mathematical formulas. Probably I do not explain trivial parts of the process such as format conversion.
Other scientists who want to repeat my analysis probably need to create their own program codes. I think it has been normal practice of our scientific community.
If there is some motive for me to be sympathetic to them, I give them my codes and also take my time to explain them. I do not mind if they are going to scientifically refute me. But if they are going to morally discredit me, I would not help them.
Note that the program codes I mention are short ones, usually about one hundred lines long each.
The source codes of full climate models are tens of thousands of lines long. My knowledge about their situation is too fragmentary to talk about.
Dear Kooiti Masuda, You wrote: "If there is some motive for me to be sympathetic to them, I give them my codes and also take my time to explain them. I do not mind if they are going to scientifically refute me. But if they are going to morally discredit me, I would not help them."
I have to think about the validity of this statement. You are assuming you are moral and that the critical party is not. Seems like a strange way of scientific cooperation. Actually, if your work is sound and done well, there is nothing to fear. Of course if one deliberately fudged the data, or is very insecure about the methodology he employed, then he might choose not to show his work to a critical party. Indeed he'll probably only agree to show it to his morally equivalent pals.
Why does there seem to be some kind of paranoia among CRU scientists - thinking others are only out to destroy them? Do you believe that Mr McIntyre is only interested in morally discrediting Mr Jones? What do you base this on?
If CRU's work is good, then put it to the test. If not, then it is necessary to destroy it. Otherwise science becomes a folly.
I'm not convinced by your defence of Mr Jones and the CRU.
@MASUDAsan: Many thanks for your extensive explanation which gives me a good insight in your procedure.
However: "Other scientists who want to repeat my analysis probably need to create their own program codes. I think it has been normal practice of our scientific community." Why is that? Isn't it important for peer reviewing to know how you calculated the results? How can it be falsified otherwise?
What also struck me is that apparently Jones was never asked for the data by his reviewers.
BTW McIntyre described what happened when he was a reviewer:
"When I was a reviewer of Wahl and Ammann, I asked for verification r2 statistics; they refused and Schneider terminated me as a reviewer. When I was a reviewer of Mann et al (submitted to Clim Chg 2004), I asked for supporting data and code; Schneider said that no one had ever asked for such things in 28 years of editing and it would require a policy change by the editorial board. Jones and Santer were on the editorial board and the matter is discussed in a number of early 2004 Climategate letters (which Jones “confidentially” sent to Mann.)"
@itisi69 (apologies in advance for the length of my comment) It is indeed common practice in science to NOT ask for data when reviewing a paper. As a scientist myself (but in another field than climate science), I review about 10 papers a year (and get asked for many more). On top of that I have to make sure my PhD students and postdocs write proper papers. I also have to teach. I also have to write grant proposals. I also have some additional administrative tasks. I go to various scientific meetings and presentations. I spend on average more than 50 hours each and every week on these tasks. While I do this with all the love of my heart, I am simply incapable, as a reviewer, of redoing the analysis of the provided data. If I use the same 'code' as provided by the authors, I would have to check the code in detail first. If it happens to be written in a 'language' I do not understand, I have to find someone else to check it. Alternatively, I somehow need the authors to write it in a format I do understand. The raw data can actually also be wrong, but in many cases it is not possible to see that. In my area of research, analytical methods are very common. I can't see in the raw data that instead of the claimed 700 microliter, 800 microliter was used. I'd have to repeat the experiment (meaning I need the same material and instrumentation) to see that problem. Especially with more advanced methods this may be totally impossible (both economics and time are issues there).
In short: even *with* the data provided, very few reviewers would be able to use that raw data to check the results of the authors within the time-frame allowed for a review, within the time-frame our employers would allow for this type of work, and most certainly within the time-frame I am able to spend (mentally) on reviewing papers. Reviewers look at the methodology as described, whether the results make sense (graph and discussion must fit), whether prior work is properly referenced and whether the authors explain any discrepancies, whether it is novel, etc. But reviewing is not auditing. The scientific audit is others using the same procedures (or novel procedures) and applying them to new data (or the old data). Do they find the same (/similar)?
There are some areas where certain 'data' must be submitted to databases, but these areas are actually limited. For example, much crystallographic data needs to be submitted to a database upon submission of a manuscript. That doesn't mean the reviewers even check the data, though! Recently, some Chinese researchers were caught having submitted the crystal data of one compound as many different compounds, for example. It was discovered because someone else had made one of those compounds, and did not get the same result. Discovering this fraud required *independent* work, not an audit of the data.
In contentious situations, we need an arbiter or a moderator to continue conversation. Otherwise we had better avoid contacts in order not to escalate contention.
Marco, "Discovering this fraud required 'independent' work, not an audit of the data."
So what's your point? Scientists don't have time to properly and rigorously review papers, and thus should just skim over them and accept them on blind faith? Or papers should be subjected to "independent" work, but not to audits? Independent work will eventually out the errors?
I think for an issue as important as climate science, it has to be checked inside out - with microscopes, audits, reviews and independent work.
I think if Mann, Jones, etc. had reacted cooperatively when first asked to provide the data and not combatively and aggressively, and had not attempted to arrogantly belittle those who had legitimate questions, that is if they had behaved semi-professionally, they would not have had to endure the scrutiny and suspicion that naturally ensued and brought their house down. Yes, they were hiding things - the e-mails confirm this, and a lot more.
Surely when one starts calling fellow scientific colleagues names and starts to publicly question their credentials, he is neither going to dampen their suspicions nor make them into friends. If you live by the sword, then you also die by it.
Itsi @37 The McI quote raises an interesting question. Normally, researchers would not do what he did (Marco's valid point about time constraints aside) because they would hamper their careers. So you need to get someone with nothing to lose in order to get a thorough job done.
Is it normal for a scientist to express glee over the death of a critical colleague?
Again, who, other than perhaps Masuda san, who has yet to answer the question, believes Mr McIntyre was interested in morally discrediting Mr Jones?
It amazes me that scientists may have reached this plateau. I'm only asking questions based on statements made by others. Unless scientists return to professionalism in the field, the situation is only going to get a lot worse.
Inhofe's announcements are empty gestures. If there is breach of the law, investigations and prosecution would have started already. Inhofe cannot produce this. He plays to the gallery. And as a byproduct, he makes other people jump up and down. ;-)
@P Gosselin: I'm not asking for blind faith. On the other hand, some level of faith in other people *is* warranted. Independent work is much better to both weed out errors AND to strengthen results. The second issue should not be forgotten.
And putting a microscope on data may be fine (note: it really doesn't matter that much for AGW what the pre-1900 reconstruction exactly looks like), but why not put a microscope on the microscope? We've already seen audits by people like Watts, D'Aleo and Smith turning out to be wrong. The auditor audited. Which then is audited and audited again, all on blogs, with cheering crowds all the way. McIntyre has also been audited a few times, and plenty of people are, to put it mildly, not impressed with several of his analyses. Minor issues are blown out of proportion, or (as in the case of Yamal) the analysis is outright wrong.
Note also that Jones *had* been forthcoming. But he quickly found out what McIntyre was doing: trying to find a comma wrong, and blow it up as much as possible. Sure, he may not have directly set out to do so, but his cheering crowd has been pretty good at doing it for him. Scientists are not happy when they are shown wrong, but at the very least they like to be able to respond to criticism before it is thrown into the world, especially with all the accompanying invectives as has happened to Mann and Jones.
Take also the experience of Briffa. Briffa was so kind as to send McIntyre's request for data through to his Russian colleagues (McIntyre knew who had the data, they were acknowledged in the paper). The Russians gave McIntyre the data. What did McIntyre do? He kept on demanding Briffa send him the data! Any normal human being would get seriously unwilling to cooperate with someone who is clearly never satisfied, and whips up a rather unpleasant atmosphere when he actually does get his hands on the data.
The land-based temperature record is another one of those areas where you really should wonder why people are getting so upset. 95% of all the raw data HADCRU uses is freely available. HADCRU can already be checked using that data. There are also independent analyses, such as those from NASA (GISTEMP), NOAA (NCDC), and the JMA, which reproduce, using other methodology, the same result.
In essence, giving McIntyre (or others) the data would thus not have mattered for the science. Unfortunately, we also know it *would* have mattered for the scientist, as even the slightest error would be pounced upon as some kind of huge crime, and all that after Jones would have spent masses of time to find all the data and the code and whatnot to satisfy McIntyre in the first place. I thus understand the reluctance of so many climate scientists to just give their data to people they do not consider very ethical nor friends of science.
It is true that reviewing papers generally doesn't mean going through all the raw data, code and methods, as it would be too time consuming and so on, agree with Marco here.
BUT, refusing to release data and code to a reviewer who wants to do a thorough analysis is very wrong and unscientific.
Marco, I don't expect to reconcile differences here. But the matter indeed goes far beyond slight errors and commas. The e-mails couldn't be clearer. You are mischaracterising Mr McIntyre. He is not being emboldened by his “cheering fans”. Rather it is more his cheering fans being emboldened by his profound and disturbing findings. You make it sound like Mr McIntyre gets his kicks by being a troublemaker. Nothing is further from the truth. Climategate, Hockey Stick and IPCC would not be in complete turmoil if we were talking about only a few mistakes. Unfortunately, it comes down to some serious problems. It's institutionalised, systematic, chronic wrongdoing and general unprofessionalism vis-à-vis legitimate inquiry that have been uncovered. Whitewashing, downplaying and continued attacks on critics will only make the situation far worse. Much of the science community is already turned off by this behaviour, e.g. see the latest IoP statement. Like it or not, it's a growing tide.
Marco / 43. - we know from the e-mails that Phil did not want to share the data with somebody who wants "to find errors" (I once asked Phil for confirmation - and got a positive response from Phil). Even if that may be understandable, it is simply wrong. Our scientific adversaries are allowed to want to find errors.
By the way, when I met Phil the last time, in October 2009, I urged him to solve the problem (by sharing the data right away; by starting an initiative to persuade the data producers to allow distribution) - and, as we know by now, by admitting that some data have been lost. He did not want to do so, and one month later the mails were leaked.
On the other hand, I have to admit that most data of earlier studies of mine are lost.
@Marco @Reiner and other scientists So basically it comes down to the fact that "we dont have enough time to check all the data and codes and believe the scientists on their blue eyes"?
What about when someone does have time to go through the data/codes? What's so bad about that? (all or not unintended) errors can occur and can be adressed to. To me this excuse has no value. Would you believe Fleischmann and Pons on face value now?
Wasn't the scrutinizing of the data that revealed that Mann's Hockey Stick was flawed as proved by Wegman? And that Mann used data upside down?
"What did McIntyre do? He kept on demanding Briffa send him the data! Any normal human being would get seriously unwilling to cooperate with someone who is clearly never satisfied, and whips up a rather unpleasant atmosphere when he actually does get his hands on the data." Sorry, but this is absolute bollocks. As the Climategate emails clearly show, McIntyre was up to something and there was a lot of nervousness about how to reply. The rather unpleasant atmosphere is created by those who don't want to issue the data. I'm asking over and over again: if there's nothing to hide, give it to them! There's no better way to shut these pesky skeptics up than to show them in the open that nothing is wrong with the data.
I just can't believe my ears, and apparently, looking at the MPs this afternoon, I'm not the only one.
@all: I think there was a discussion before about releasing data. If I remember correctly, there was quite some agreement that it is the right thing to do to provide the data.
I would be very interested in hearing from you, which kind of data you would happily provide and where the boundaries of this are.
Can scientists be asked to release all of their findings in the form of data and code (in other fields called intellectual property) for the use of the public? Or is it necessary that all scientific code becomes Open Source, so that the code can be improved and reviewed? What about security risks (imagine someone wants to steal, let's say, your emails)? Where are the limitations? Time limitations, e.g. 3 years?
@Reiner "Inhofe's announcements are empty gestures. If there is breach of the law, investigations and prosecution would have started already. Inhofe cannot produce this. He plays to the gallery. And as a byproduct, he makes other people jump up and down"
On the other hand I don't know how he could start if not like this. I suppose McCarthy didn't get started with: "And now everyone who has a grandfather in the Communist Party loses his employment".
If one cannot even take an American senator seriously, who else? The Pope?
This is a very elastic sentence. What does it specifically mean? That all other reconstructions lie within its error range? Or that the year 1998 was the warmest in the millennium? Or that the temperature evolution in the past was flat? Or that the reconstruction method was correct?
There are still methodological aspects that are not clear, for instance how the uncertainty ranges were calculated for the smoothed reconstructions. This issue also emerges in the CRU emails, as Mann did not want the data from his statistical analysis (the regression residuals) to be made public, although later they were made public. Another open question is that so far all reconstruction methods have been shown to underestimate past variations - you will probably hear other opinions on this, that of Mann and co-workers and that of the rest of the groups. So it is likely that if a period warmer than today (say around 2000) had indeed occurred in the past, the reconstructions would not have picked it up. This is also related to the divergence problem displayed by some tree-ring proxies. I think that the state of the art is now well summarized by one sentence of the NAS report on millennial reconstructions: back to 1600 it is very likely that it was cooler (by how much?). Before that everything is much more speculative - fewer proxies, more uncertainty, etc.
I think Jolliffe was also quite right when he wrote that the evidence for AGW does not rest on the hockey-stick. Actually, the hockey-stick would be a quite minor component of the AGW body. However, it has indeed been bloated in the past as one of the cornerstones, and now we pay the price for that. In my view the HS says much more about the working of climate science in the last decade and related issues than about climate science itself.
It cannot be denied that Mann was very defensive against everyone trying to check or criticize his work, even before McIntyre appeared on stage. This is featured in the CRU mails, for instance in those related to the Esper et al paper in Science. I think this stance has not been helpful for the whole community, and again we all now pay the price.
This doesn't mean that everything is 'blue sky' in the other camp. I think the interest in getting the science right is overshadowed by a drive to oppose the perceived Team, whereby the Team is more or less almost every climate scientist. We can only speculate what would have happened if the first approaches of McIntyre had been met more openly.
Review process. It is indeed impossible to review a paper in depth, checking all calculations and code. Many papers use intermediate data, for instance data from simulations performed by other groups, etc. To review a paper means to check that the relevant literature has been considered, that there are no logical fallacies, that it is clearly written, and that it provides all the information needed to understand what has been done. Many journals set a tight time schedule, 3 or 4 weeks, and reviewing a paper is done on a voluntary basis. The real self-correcting process occurs later, when over the subsequent years other teams try to reproduce the results or find contradicting results. That's why I think it is misleading to highlight the findings of any recent paper as the final proof of anything. The results of a paper are more or less established after, say, 3 or 4 years or even longer. Even more so in climate science, where we are dealing with additional uncertainties due to the impossibility of conducting experiments. So any claim that a paper debunks the whole of climate science, or that the IPCC underestimated climate change, is really not serious. This is a slow, plodding business.
It is indeed disturbing that some journals do not like to publish comments or corrections. For instance, according to the PNAS guidelines, comments to papers are only allowed within 3 months after publication and can be 500 words long. For Science the limit is 6 months. I think there shouldn't be a time limit to report any errors found in a publication.
Data sharing. There are two aspects that have to be considered. One is the need to help the replicability of results by explaining data, code, and calculations. The other is that some kinds of data are very burdensome to obtain and can be used for different studies. I think there is a gap here. Journals should set up some mechanism by which data are shared just for the purpose of checking a paper, and not to initiate new studies if the owner of the data does not agree. Perhaps a kind of data broker.
It is also true that it has been very difficult to obtain observational data for every one of us, not only for McIntyre, and Georg is right to explain that the staunchest guardians of data are the national weather services. This situation may be slowly changing, but in the early 2000s and late 90s it was much more difficult.
In my whole career I was asked only once by a reviewer to provide data so that he could check my calculations.
Werner / 37. -- "If I understand you correctly, you made a submission to the UK Select Committee on Science and Technology? How does this work? Did they ask for contributions?"
- No, we somehow knew (ok, Myles knew) that there was a call for comments; so we submitted our extended nature-text to a given web-page, and that was it. No special invitation. -- Hans
re#23 Thanks Marco. Tricky thing the blogosphere. Have to make a mental note: "Do not quote from others (as authoritative) until original papers/data have been checked"
re#24 Reiner My watch says 1:38 pm right now
re #38 Marco, I fully agree with your assessment. Was going to write something to that effect as well. What kind of incentive is there for overworked scientists to go through other people's original data with a fine-tooth comb?
In a way this situation in climate science is unique. Is there another branch in science that gets so seriously scrutinized by outsiders with a good scientific background ( and some with a lot less understanding)? If only there was a serious "medicalaudit.com" ! As it is there are plenty of papers without statistical merit. What if all the original data were audited? Would love to see that.
In the UK, the Economic and Social Research Council has the following policy in place:
As an ESRC award holder, under the ESRC Data Policy, your contract requires you to offer all research data generated from your award for deposit with the UK Data Archive. This applies to all kinds of ESRC awards - individual research grants, fellowships, Research Centres investments and awards under Research Programmes.
The offering data process is administered by the ESDS, which you must contact within three months of completing your ESRC grant to offer data for archiving. The UK Data Archive, a service provider for the ESDS, is where research data are processed, physically stored and disseminated.
I have looked for the equivalent for the physical sciences (EPSRC) and engineering but so far have not found anything.
In a way this situation in climate science is unique. Is there another branch in science that gets so seriously scrutinized by outsiders with a good scientific background ( and some with a lot less understanding)?
This applies to scientific controversies. Remember, climate science under the vision of the IPCC was supposed to provide a consensus view, not a controversy. We have had open controversy only recently.
During 'normal science' in most cases papers will not be scrutinized and experiments not repeated, for obvious reasons. It is not in the self-interest of a researcher to do that, as you could just prove someone else right. You would waste your time and resources.
A controversy changes this incentive structure. If you are part of one 'camp' you may want to do many things to disprove the 'other side'. Often findings that support the other side are not reported. We need to find robust institutional structures to create maximum transparency. Controversies provide such a mechanism but at a high cost.
medicalaudit or epidemiologyaudit would definitely be called for.
As a matter of fact, the reason I began to look into climate science issues was the striking similarities on the PR side - "the science is settled". And in the case of epidemiology, you don't need a statistics degree to find major flaws in the peer-reviewed literature.
A good blog used to be http://junkfoodscience.blogspot.com
However, the lady running it stopped posting in Fall 2009
It's safe to say that, at least on this forum, there's consensus between (climate) scientists that scrutinizing data is not common in the peer review process, in fact very rare. Fair enough.
Buttt.... when global policy depends on this science, with trillions $$$ at stake, whole economies designed based on the results, billions of investments, mitigation, adaptation, the full monty: don't you think it's imperative, no, it's a matter of life and death, that the data is scrupulously scrutinized?
Again, sorry to beat a dead horse, the Hockey Stick: this graph was the epicentre of the first IPCC report, the Mother of all Anthropogenic Warmings. And look what happened... Mann and Jones seem now to "hint" at a global MWP.
Don't you all agree that in this particular case a less cavalier attitude is desirable?
Reiner 58 There is another factor introducing bias in medical studies, and that is money. Pharmaceutical studies are twice as likely to show positive results as purely academic studies. Interestingly, the statistics in the drug company studies are often quite good. One way to fool us, though, is using the comparator drug in relatively lower doses and bingo, the study drug is more effective. Drug companies have left out original data sets; a notorious example was one of the antidepressants. Another wonderful way in which to get results is: start off with a good number of covariates. You are likely to find at least one significantly changed with p < 0.05. Then don't report your initial covariates that turn out to have no significant change. The most interesting part is how new drugs get presented to the docs. Y axes are blown up like a balloon to make small changes look impressive. Usually presented nowadays by a beautiful lady. When you ask her "what is the number needed to treat" (the number of patient years needed to prevent one death or one serious event) the routine answer is "I will have to look that up". Below is a study in the BMJ. It didn't even look at original data; that would be a whole different story. http://www.bmj.com/cgi/content/full/326/7400/1167
As most climate datasets have a horrible signal-to-noise ratio, results depend strongly on adjustment techniques. Multitudes of grey variants of each time series are circulating. To point to "the" data elsewhere, as e.g. Briffa and Jones did, is no guarantee that "that" version is identical to "the" one used in the publication. Reference to original data residing in a common public domain database would resolve most of the issues of irreproducible science.
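To put a rough number on the covariate trick described in the comment above, here is a minimal sketch. It assumes the tests are independent and that none of the covariates has any real effect; the function name and the test counts are hypothetical, chosen only for illustration.

```python
def chance_of_spurious_hit(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_tests independent tests
    gives p < alpha when there is no real effect in any of them."""
    # Each test stays "non-significant" with probability (1 - alpha);
    # all of them do so with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Hypothetical numbers of covariates examined in one study.
    for num_tests in (1, 5, 10, 20):
        print(num_tests, round(chance_of_spurious_hit(num_tests), 3))
```

Under these assumptions, the chance of at least one "significant" covariate grows quickly with the number examined, which is why leaving the non-significant ones unreported gives a misleading picture.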
Addendum to my previous posting (which I see as #63) with comments to Hans Erren (#62):
Compilation of global surface air temperature since 1850 will surely be a subject with the special condition to preserve all the input data and program codes.
(Perhaps this suggestion overlaps with the initiative of the UK Met. Office announced already. Excuse me, I have not checked what they say yet.)
Restricted data will deliberately be excluded from the input to the particular project. Logically, the software that comprises the data stream towards the product (not only those codes produced in the project) should be all open source. I am not sure whether this condition can be actually enforced.
Also, I do not think we can make the adjustment of historical data fully objective. We sometimes need expert decisions. So, I think the achievable level of reproducibility will be like this: If we tentatively accept the same set of expert decisions, we can reproduce the same results. We can choose an alternative set and get somewhat different results. We can trace the source of difference.
@Reiner #57: Note that there is an organisation specifically dedicated to archiving the data, and help researchers archive said data. And I'm pretty sure they have the same caveat as other data repositories: you can get exceptions for data that is going to be used in future publications, and exceptions for data that may be used commercially(!).
Having reviewed 100s of papers and having been a journal editor for almost 10 years, let me say the following about reviewing and access.
In most cases, you can write a perfectly sound review without having checked every detail. This is obvious if the paper goes wrong at some higher level.
As a referee, you put a paper through a series of validity checks. Is it original and relevant? Are the methods and data sound? Once the referee is satisfied that the main conclusion is robust, there is no need to check every detail.
There are papers, however, where the surprising conclusion hangs on a badly documented detail. In such cases, a referee would typically ask for better documentation instead of trying to reproduce the result herself.
Note that there are areas of science where the raw data are evaluated thoroughly. Drug and chemical safety studies are performed to a standard (GLP) with auditing, and all the raw data, conclusions and methods of analysis are audited.
It is absolutely routine for epidemiology studies to be audited and cross-checked by other groups - frequently revealing substantial differences in analysis. There are difficulties - you have to anonymise data - but when it matters, you have to be able to check the data and the conclusions.
It is more difficult to analyse lab notebooks - for obvious reasons - but when it comes to analysing data, the codes for analysis and the output, there is little reason for not making this freely available.
itisi69 They would not be able to call themselves scientists if they decided it any other way. I read the RSC statement and they are right on in every point. Paranoia and secrecy have no place in science. Donna calls em out! http://nofrakkingconsensus.blogspot.com/2010/03/battle-for-soul-of-science.html
re 67 anonymous New drugs have to pass studies of pharmacokinetics, toxicology and mutagenicity, and sometimes teratology, before being tested on humans. This is the laboratory part and has to comply with GLP (Good Laboratory Practice). The phase 3 clinical studies are where the problems creep in. They are phenomenally expensive. These are the studies the BMJ article reviews.
@itisi69: Funnily enough, the Institute of Physics is not willing to tell who are on its subgroup for Energy, in other words, the person(s) who have written their submission.
So much for openness...
And note that the RSS also puts constraints on data sharing, and that the RSC has a nice sneer towards the people who are attacking climate science and scientists, and are not held to the same scrutiny (Watts comes to mind, but also McIntyre).
I'm surprised that the HARRY_READ_ME file hasn't been mentioned. IMHO, there may be two reasons why some scientists at CRU may not have supplied data and code as requested.
1. They were unable to do so. The data was in such a mess that the original data was essentially unavailable because they did not keep intermediate files and could not find the programs that produced the intermediate files. See the comments by Francis Turner, a computer professional. There are other computer professionals who are less kind than Turner.
2. They wanted to hide code because it indicated fudging to get their results. See this excerpt of some of the programs used by CRU.
Before the email and other documents were made public, I would have accepted the results of papers by CRU on faith. But now I won't accept the results until I see those results duplicated. I'm not saying that Mann and others cannot produce the data and code for their results; I'm just saying I have lost faith in CRU.
@everyone who tries to compare Climate Science with other fields of science:
I think the idea of "other fields of science do things different" is a false argument. It should not matter how science is conducted in other fields when we're examining Climate Science. Climate Science stands or falls on its own.
The facts are, as demonstrated, there was intentional fudging of results, intentional withholding of data/codes for the express purpose of preventing others from falsifying results, deliberate truncation of a series of data in a highly publicized graph to hide an inconvenient decline, and a number of non-scientific behaviors engaged in. This does not prove that the results achieved by Mann, Jones, Briffa, etc, are necessarily wrong, but certainly weakens the confidence of any subsequent science that relied on their work, which appears to be the bulk of climate science at this moment. Ethics matter. When people ignore them, they lose credibility. It is much worse to be seen after the fact as attempting to prevent others from being able to verify your work, than to have someone discredit a piece of it but keep your dignity through the process by honestly addressing any mistakes that are uncovered in your work.
I personally would love it if I had a competent statistician point out flaws in my own application of statistics in my work - I'd probably invite him to help me on my next paper to ensure that the result was much higher caliber than I could have achieved on my own. But then I'm not Mann or Jones.
As someone who, in the latter part of his career, participated in feasibility studies for major projects (of the order of $500m to $2 billion capex) I am astounded by the attitude of scientists and particularly climate scientists to what I would describe as "due diligence".
Before the financiers (usually but not always banks) will accept that a feasibility study (FS) is sound and thus suitable for financing, they require a detailed due diligence exercise to be undertaken by a specialist independent review company. That review can cost between $500k and $1 million.
The reviewer takes every material aspect of the FS, and asks for the underlying data, reports, calculations, and specialist engineers check the statements for soundness.
If the independent engineer is not satisfied (which is usually the case, at least on some aspect or another) those responsible for the FS must address the issue, and do further work until the independent engineer is satisfied.
Also, in the commercial world, I was frequently involved in the preparation of Prospectuses to raise equity funds through a public issue on a major stock exchange. In each case, the lawyers advising the company maintain a detailed due diligence file in which is recorded the underlying or supporting information for EVERY material statement made in the Prospectus. It is recognised that the Prospectus might become subject to later legal action; the hard-copy ringbound due diligence files constitute the evidence that those involved followed appropriate processes.
It is indeed staggering to observe the much more casual approach of Mann, CRU, IPCC, especially when not $billions, but many $trillions are involved.
@ anonymous 75 In the light of the current economic crisis, I am not sure how reliable economic feasibility studies really are. And the equation that Mann's hockey stick is worth many trillions of dollars is polemical and oversimplified. The relation between science and politics is not one of a magic bullet.
@Raz0rama: You'd be delighted with a statistician helping you? Sure, we all would!
But how would you react to a statistician redoing your work and throwing all the mistakes out into the wide world on a blog, whereupon the readers of his blog are not too shy to cry "fraud"! "Manipulation"! and other 'nice' qualifications?
How would you react when the statistician tries to enter *your* area of expertise, makes claims that are at odds with those of the general community, and then claims he has falsified what you did?
@anonymous #75: Despite all the claims of billions of dollars/euros flowing towards climate science, there is hardly a penny for the kind of review you propose. However, there *are* others who repeat results, often using different methods. For example, however much people complain about parts of CRUTEM not being openly accessible, its results have already been repeated by others using different methodology (GISTEMP, NCDC, JMA). In science, such repeatability is worth many times more than just checking that someone did all his calculations correctly. It means the result is robust to different methodology.
Comparing the work of feasibility studies (FS) with that of individual scientists, like Jones, amounts to a category error. The comparator would be the IPCC (and its necessary reform) and this is where much of the attention is focussing now. Apart from this, I am not convinced that FS would fix climate science (or climate policy). The relation between the two is not direct and the 'level of soundness' of scientific results tells you nothing about the policy choices. What is needed is a better institutional quality control in science (peer review problems, data accessibility, etc., see other blog post about model uncertainty) and wide political discussion about practical political steps to address climate change. The IPCC has given the impression that both processes could be folded into one, which is presumably the reason why you (and others on this blog) point to feasibility studies.
Anon75; I've always wondered that too. But then again you're talking about the real world, not the Twilight Zone called Climate Science.
Jones' CRU got 22 million in funds, yet they have only a 3-person staff, no time to respond to 60 FOI requests, shoddy code and lost important data. Imagine this happening in the real world...
@itisi69: the number that has been thrown around is at the very least lying by omission. The supposed 22 million (13 million pounds) has gone mostly to the *university* for *non-research* activities. For example, 6.6 million pounds was used to set up the Institute for Connective Environmental Research. The money went into the *building* of the Institute... Another 2.7 million pound was for Tyndall phase 2, led by the University of Oxford (Tyndall Centre), with Jones just one of many applicants. The same goes for several other applications (Interestingly, Jones, or the UEA for that matter, isn't even mentioned as a member of the EMULATE consortium).
Even if we're generous, I guess we are talking about 3 million pounds over a 15-year period. That really is not that much, and most certainly does not sustain a large research group. Just for comparative purposes: my group with 5 permanent staff members has at least twice that amount available, and that's without the considerable co-funding of our university (PhD grants, for example), and our research is not all that expensive: we often get considerable amounts of materials for free. In the last 3 years about 1 million(!) in commercial value from one company alone, for example.
Thanks Marco for the explanation. I knew you, knowing everything, would have an explanation for the Jones grants.
Bean counting aside, my point is that Jones et al have insufficient staff, lose data, write lousy climate software etc. So tell me, what's the point of "building" institutes like Tyndall for 2.7 million and, not to forget, ICER for 6.6 million(!) when the back office is incapable of doing the most essential work, has a Director who works in a disorganized fashion, admitting that he ‘did not do a thorough job’ of keeping track of his own records, and has only one secretary for its 13 staff who is also acting as a part-time receptionist? This kind of behavior would be absolutely unacceptable in the normal economy.
The Jones Jar: http://spreadsheets.google.com/ccc?key=0Ah4XLQCleuUYdFIxMnhMNnlXb2JQcDZUendjUXpWWUE&hl=en
@itisi69: Unlike many others, I actually check the sources and look at the background. Anyone with half a brain would be able to do it, but even many with a whole brain don't.
Regarding the one secretary for its 13 staff, also acting as a receptionist: my Department has 5 secretaries for a total workforce of 140. This may be unacceptable in the normal economy, but unfortunately it is the daily reality of universities (most likely all over the world). The assignment of universities is to teach and research. While we, as scientists and educators, recognise the importance of secretaries, the funding agencies (including the government) do not.
Even the EU is hesitant to give money for secretarial help to big EU projects. As a result, I have seen scientists doing loads of things they simply are not educated for. If you're so upset about this, contact your government and tell them to give universities more money for administration. At present, most universities get less and less basic funding, and need to get more and more funding through applications. Put a secretary in that funding application, and your chances of getting money rapidly go down. Besides that, it is a non-permanent position, meaning you are constantly shifting secretaries if you even get money.
Note that Tyndall was a research consortium. (Z)ICER was a building. You can't do anything without a building, including educating students (which still is the official main assignment of universities).
There are actually two meanings of “peer review”. The first one refers to the fact that scientists (should) critically observe the work of their colleagues. The second refers to a formalized practice of editorial boards and funding agencies.
The formalized peer review of a paper does not mean that the presented findings are necessarily true. It is philosophy of science 101 that final proof is impossible. Scientific knowledge can only be falsified (well, the falsification theory has problems too, but it describes the current practice well). Consequently, corroboration means that findings withstand REPEATED attempts at falsification.
Formalized peer review is a first short cut to weed out contributions that violate the norms, theories, and methods of a field. The real discussion (among peers and increasingly non-peers) starts AFTER publication.
The falsification approach also implies that a scientist who wrote papers that were disproved later does not get morally discounted. Therefore the number of publications and citations that creates his or her track record does not consider whether contributions were falsified later. Scientists get morally discounted for misbehavior (violating the ethos and norms of a discipline) and the sanctions are severe. It is of course a problem if such norms have value or political dimensions. That happens more often than one might think and it needs to be addressed urgently.
Classical peer review does not check how research results were produced (context of discovery). And it is not supposed to check on the moral integrity of the researcher! It relies entirely on the text. Formal peer review is basically a consistency check against existing papers (and not replication). The classical scientific paper should be a manual for replication. This was not a problem in small-scale science, where all peers of a field work in similar laboratories, have the same training, and where research is cheap etc.
The troubles started with modern large-scale research. The ability to replicate disappeared overnight. Money, machinery, and manpower were available only to a few. For this reason, the demand for data publication became prevalent. Especially in the US, the requirements to put data in the public domain are pretty high as long as the data is entirely generated by taxpayers' money. It is different in Europe, where rules on data distribution are still in a nascent state. In addition, European weather services are mostly commercial institutions that sell products to customers (the defenders of the free market have to consider this conflict of the market with scientific principles).
But even if a reviewer has access to data, he or she would still lack money, machinery, and manpower. And of course, the motivation for replication attempts from the inside is quite low (maybe funding agencies should set aside a certain percentage for replication and reduce the pressure to produce spectacular new results).
I agree, many climate researchers did not take these structural problems seriously enough. They stuck to the classical view of science (because it provided legitimization in the policy process) and handled the occurring problems arbitrarily.
Of course, Jones and his team are probably not able to handle all data requests. However, they did not create defensible guidelines. What we need are standardized procedures on which scientists and the interested public can agree without discouraging scientific advancement and creativity (including replication and falsification attempts).
@83 wrote Of course, Jones and his team are probably not able to handle all data requests.
I don't understand why they couldn't handle the data requests. I assume the data requests were only for the data needed for replication of the results for their paper. It should not have been difficult at the time they made the final run of their program to get scripts, data and programs in one place and create a tarball (archive of scripts, data, programs). A year or two later it might have been difficult without a versioning system. It was just bad software practice, IMHO, not to be able to reproduce the data.
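The "tarball at the time of the final run" idea in the comment above could be sketched roughly as follows. This is only an illustration, not a description of what CRU did or could have done; the function name, the directory layout and the manifest file name are hypothetical, and it assumes the scripts, data and programs for one paper all sit under a single directory.

```python
import hashlib
import io
import tarfile
from pathlib import Path


def archive_run(source_dir: str, archive_path: str) -> dict:
    """Bundle every file under source_dir into a gzipped tarball,
    together with a manifest of SHA-256 checksums.

    archive_path should lie outside source_dir so that the archive
    does not end up inside a later archive of the same directory.
    """
    root = Path(source_dir)
    # Checksum each file so a later reader can verify nothing changed.
    checksums = {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
    manifest = "".join(f"{digest}  {name}\n" for name, digest in checksums.items())
    manifest_bytes = manifest.encode("utf-8")

    with tarfile.open(archive_path, "w:gz") as tar:
        for name in checksums:
            tar.add(str(root / name), arcname=name)
        # Write the manifest straight into the archive from memory.
        info = tarfile.TarInfo("MANIFEST.sha256")
        info.size = len(manifest_bytes)
        tar.addfile(info, io.BytesIO(manifest_bytes))
    return checksums
```

A call such as `archive_run("paper_final_run", "paper_final_run.tar.gz")` (both names made up) would then give one file that can be handed over on request. It does not replace a versioning system, which would still be needed to track how the files changed between runs.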
@Anonymous: It will depend on what "data" you are referring to. There was a FOI request for all communications regarding AR4: plenty of searching required to fulfill that. MANY requests for specific agreements between NMSs and CRU on data sharing policy (Problem: many were oral and/or really old). There were also requests for raw data, where the problem was related to confidentiality and CRU not holding the actual raw data. You'd have to extract that data from elsewhere, although the FOI request should actually be denied (data not held). Yet other data requests *were* upheld, only to be followed by the next request (for 'help'), and the next, and the next, and the next.
With the exception of some data requests for the paleoclimatic reconstructions, none of the requests were for replication of results from certain papers. And those requests were upheld! (regardless of McIntyre's complaints about not getting the data from Briffa: he actually got the data from Hantemirov, the actual owner of the data).
Marco is correct, some FOI requests might require lots of time. However IMHO, replication of results should not take very much time if scientists thought ahead and followed good software practices.
By the way, according to Climate Audit it seems that sometimes the confidentiality arguments were bogus. If that is correct, it does not reflect well on some scientists.
I urge you to read the documents from the SMHI: First says: "No, don't publish our data". Then Phil Jones and Acton testify. Three days AFTER the Swedes say "well, OK, publish anyway, but with disclaimer!".
Those that refer to the links to raw data provided by the SMHI should read the license agreement. Especially paragraphs 3.2 and 4.1. The latter cannot be more specific: "you can NOT redistribute our data". Period!
Marco, I have a different interpretation of the exchange of documents.
There are two different files in question, the Swedish raw data file and CRU's homogenized file.
A letter written by John Hirst of the MetOffice on Nov. 30, 2009 forwards a request by Phil Jones for permission to release homogenized Swedish data (see bottom of this document).
In reply, the Swedish authorities did not want the data to be released on a UK web site because it would differ from the raw data they hold. The requested data will be available for non-commercial purposes on a web site they are developing.
In a letter from the Swedish authorities on March 4, 2010, the Swedish authorities said their letter to the MetOffice may have been misinterpreted. They said "It has never been our intention to withhold any data but we feel that it is paramount that data that has undergone, for instance, homogenisation by anyone other than SMHI is not presented as SMHI data. We see no problem with publication of the data set together with a reference stating that the data included in the dataset is based on observations made by SMHI but it has undergone processing made by your research unit. We would also prefer a link to SMHI or to our web site where the original data can be obtained." (my emphasis).
I interpreted the above to mean that the Swedish authorities never opposed the use of their data for non-commercial uses. They request that the original raw data not be confused with homogenized data. If my interpretation is correct then there was never any difficulty in sharing the data, at least as far as the Swedish data sets were concerned.
I was unable to read the Swedish License agreement, but one of the letters mentioned the raw data was available for non-commercial uses. Of course the license agreement could be in conflict with my reading of the letters.
@klee12: the license agreement is indeed in conflict with your reading. It sadly is only available in Swedish, but I think Hans von Storch (and Tobias W?) can both affirm whether my translations below are accurate:
"3.2 The Licensee owns no right to use the data or products provided under this agreement for commercial purposes and not for development or production of meteorological, hydrological and oceanographic value added-value services. The licensee does not own nor is authorized to redistribute, sell, assign or otherwise transfer data products or documentation without further processing to third parties unless the parties have received written permission from SMHI." and "4.1 The Licensee does not own the right to disclose, send onwards, link to or in any other way spread the contents of the data and/or products that has been received in accordance with this agreement to a third part."
The SMHI is not the only organisation to have such license agreements. Apparently, the excuse they gave to a Swedish MoP was "everyone does it like this"...
I can't seem to find the text that you've translated, do you have a direct link? The point in this matter, the way I see it, is that the CRU hadn't had any intention before of trying to make their data "transparent". All the rest in this particular instance really is a "tempest in someone's teapot", as far as I'm concerned.
Say what you will of Phil Jones, but it was there in black and white: if it's data homogenized by anyone other than us, then no, you can't, and we will ourselves be making the raw data available, thank you very much! If the CRU doesn't have anything but "value added" data, then they were not allowed to publish, and even if they did, SMHI would be doing it themselves. For once - move along, there's nothing to see here!!!
There seems to be a conflict between the letters I referenced and the terms of the License, which I assume is to use the raw data. Here is my attempt to resolve the conflict.
My interpretation of Section 3.2 is that the Licensee may use the raw data for non-commercial purposes. The licensee cannot transfer the data unless the licensee performs "further processing" or unless they obtain permission from SMHI. That seems to me to be a double negative meaning (to me) that they can transfer processed or homogenized data.
My interpretation of Section 4.1 is as follows. In order to enforce the restriction of 3.2 on the use of the raw data, SMHI wants everyone to go to their site and agree to the License terms. Then they can get the data. Thus you cannot disseminate the raw data, since the person you disseminated the data to may not have seen the License and might inadvertently use the raw data for commercial purposes.
If that is the intent of the License agreement, and if that policy has been in effect for at least five years, then I think nothing would have prevented Prof. Jones from providing data from the Swedish site to whoever asks for it. All he has to do is to provide the homogenized data on his web site and then tell the enquirer where to get the raw data.
Another point is that the letters I referenced were apparently written after November 2009, and there were FOI requests since, I think, 2006. So there seems to have been no hurry to resolve the issue.
@Tobias W. The license information is here: http://data.smhi.se/met/climate/time_series/html/essential20.html
@klee12: regarding the letters: CRU already provided *their own* data, but not the raw data. In essence, they are not the data depository, and the SMHI in essence agrees to that point (or rather, enforces that point: you are not to be the depository). That it took a long time for CRU to ask whether they could be should be seen in the larger perspective: attempts to get Jones to do all the hard work, with the sole desire to find any flaw, regardless of magnitude, and then pounce on it in public.
It's like the police asking any person to find anything and everything that might incriminate him. And if he once had a car with a broken lamp, made a picture of that, hang him in public for a crime. I think most, if not all, people would tell the police to do their own investigative work.
The translation is spot on, is that you doing the work yourself, or has google translator become that accurate:-)???
About your metaphor of "investigative work", I think you're a bit off, because if it is not possible to get the information needed to investigate, then something is wrong. That isn't necessarily the CRU's fault though, as is apparent in this case...
@Tobias W: I know a bit of Swedish (being nearly fluent in Danish), although I did use translations provided by others (Max Andersson for the second one). I just corrected the mistakes :-)
Regarding issue 2: I was trying to be a bit provocative (not necessarily towards you!), pointing to Jones' reply to Warwick Hughes. Hughes *could* have done everything Jones did: contact the NMSs, ask for their data (which may have meant payment) and then redo the analysis. It is now slowly getting a bit easier, with some NMSs putting free data on their websites. But note the time period of the freely available data... it isn't very up-to-date, and lacks old data!
@marco wrote: "attempts to get Jones to do all the hard work, with the sole desire to find any flaw, regardless of magnitude, and then pounce on it in public." I guess this is the crux of the dispute re Jones.
From my point of view, reproducibility of results is the bedrock of science. Phil Jones has an obligation as a scientist to provide data and code to others so that others can reproduce the results. Only that and nothing more. No user's manual, no special documentation of code. There is a saying in software science that the code is the documentation. Programmers like to program and will fix bugs, but they don't like to do documentation, so often a bug is fixed but the documentation is not changed to reflect the change. Having documentation within the code helps but is not necessary. I have ported many programs without much documentation. And what happens when the results of the author and the person trying to reproduce the results differ? Whose code is correct?
The program needs data, so the data must be provided. I don't see why homogenized data files cannot be provided with meta data (i.e. the WMO numbers of stations, location of weather stations, etc) and a pointer to where the raw data resides. A copy of the data seems important since over time the homogenized data may change.
Before submitting a paper, the author of the paper should run all programs from scratch, creating all intermediate files. This can be viewed as a final check to see that you have not introduced errors when trying to fix other errors, a common occurrence. If everything works, the author can archive all programs and the data (preferably with version numbers).
That does not seem onerous. If the author is asked to do more, to provide data unrelated to the reproduction of the results of his/her paper that would be onerous. The skeptics say all they want is the program, data, and meta data to reproduce the results; the other side seems to say the skeptics are asking for much more. Who is right?
In my post #73 on this thread I suggested, based on the HARRY_README file, that Prof. Jones did not honor requests for data because (1) he couldn't because the data files were in a chaotic state and (2) there are indications that in some code files there was fudging. That still seems reasonable.
Openness and reproducibility in climate science are good things. But these should be designed proactively. Accusing scientists retroactively does not make progress. Also it is a pity that many people consider these concepts in the frame created by Steve McIntyre based on his limited experience in statistical analysis of climate data and personal communications. More social-scientific analyses of the activities of climate scientists are needed. Perhaps we need perspectives of such persons as Paul Edwards (see http://pne.people.si.umich.edu/ ).
In post 96, Kooiti Masuda wrote: "Openness and reproducibility in climate science are good things. But these should be designed proactively. Accusing scientists retroactively does not make progress."
But reproducibility of results is a bedrock of science. What are we to do if we can't reproduce a result? Accept it on faith? If any discipline says results should be accepted even if not reproducible then it's not really science IMHO. It is fine to advance a hypothesis without evidence, but if one puts forth empirical results in support of one's hypothesis, that evidence should be reproducible.
klee12/97 - Speculation should certainly be allowed in scientific publication - in the good old days, when hardly anybody spoke about global warming, the then chief-editor of Monthly Weather Review Chester Newton purportedly used the rule that "25% of speculation is allowed in a paper, as long as it is labelled as such".
Allow me some advertisement - these good old times are described in some of the interviews I prepared with people who are now in their late 70s or 80s (or sadly have passed away), among them Harry van Loon - see my Interview web-page.
Some computer scientists expect that they can devise a way to enhance reproducibility of scientific works by recording the work flow of scientists (as far as they work on a computer system). An attempt (already finished) is introduced at http://www.earthsystemcurator.org/projects/workflow.shtml . I cannot tell whether this shows a direction which climate research institutions should take or not.
The same project (Earth System Curator) has a social scientist, Paul Edwards, as a member, and they want him to analyze how climate modelers work. (But I have not yet heard his specific remarks.)
(Incidentally, I learned about these when I came to Hamburg last October, at the "GO-ESSP" meeting held at ZMAW.)
Kooiti MASUDA/98 refers to a way to enhance reproducibility of scientific works by recording the work flow of scientists (as far as they work on a computer system). I looked at the site referenced.
The software system seems cumbersome for reproducing results that are submitted to journals, and I can't see any mention of keeping track of or archiving versions. Software engineering is simple: it says think ahead, anticipate problems and take steps today to avoid wasting a lot of time later. I know that I can get away with sloppiness for small projects. I know from experience that I can't on large projects. Here's what I would do for large projects.
I would put programs and data in my own directory, organized by subdirectories of course. Disk space is cheap; your time isn't. With your own copy of the files you don't have to worry about access or unexpected changes in the contents. Every month I would archive them (create a tarball). The purpose of the monthly archive is that if a disaster happened, such as accidentally erasing a subdirectory (this has happened to me), I could recover from a previous archive. I lose at most one month's work. Also, if I wanted to go back to a previous version, I could go back to last month's or last year's version. The time stamp of the archive serves as a version number for my private archive.
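The monthly archive with a time stamp as version number could look roughly like the sketch below. It is only an illustration in Python; the names `archive_name` and `snapshot` and the backup directory are assumptions, not anything the commenter described:

```python
import tarfile
from datetime import date
from pathlib import Path


def archive_name(project_name, when):
    """Build an archive file name whose date stamp doubles as the version number."""
    return f"{project_name}-{when.isoformat()}.tar.gz"


def snapshot(project_dir, backup_dir, when=None):
    """Write a dated tarball of project_dir into backup_dir and return its path."""
    project = Path(project_dir)
    backups = Path(backup_dir)
    backups.mkdir(parents=True, exist_ok=True)
    target = backups / archive_name(project.name, when or date.today())
    with tarfile.open(target, "w:gz") as tar:
        tar.add(project, arcname=project.name)
    return target
```

Because ISO dates sort alphabetically in chronological order, a plain directory listing of the backup folder already shows the version history. The backup directory should live outside the project directory.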
In the directory of my programs I create a subdirectory called Save. I might copy files to this directory before modifying a file. If I goof up and want to undo the changes I have a day old version of the file.
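The Save-directory habit can be reduced to a one-call helper. This is a hedged sketch in Python; the function name `save_before_edit` is hypothetical, and only the directory name "Save" comes from the comment above:

```python
import shutil
from pathlib import Path


def save_before_edit(file_path):
    """Copy a file into a 'Save' subdirectory next to it before it gets modified."""
    src = Path(file_path)
    save_dir = src.parent / "Save"
    save_dir.mkdir(exist_ok=True)
    # copy2 also preserves the modification time, which shows how old the copy is
    return Path(shutil.copy2(src, save_dir / src.name))
```

Note that each call overwrites the previous saved copy of the same file, which matches the "day old version" described above but is not a full history.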
I would use scripts or a makefile to automate compiling and running programs. This is a big time saver.
When I am ready to publish results I would do the following, in this order:
1. Archive everything.
2. Delete unnecessary files.
3. Get fresh copies of all data files.
4. Assign a version number to every file. The version should be inserted into the files as a comment if possible.
5. Run all programs, including compilation of the programs, using a script or a Makefile. No author intervention is allowed, since someone trying to reproduce your results may not be able to intervene in the same way.
6. Examine output.
7. If some data files are proprietary, delete those files and add a README file that tells others where the data file was obtained and the date the data file was fetched. Add information about special programs or libraries.
8. Archive the directory.
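Step 5 of the list above (run everything from a script, with no manual intervention) might be sketched as follows. This is an illustrative Python driver, not the commenter's own tooling; `run_pipeline` and any commands passed to it are hypothetical:

```python
import subprocess


def run_pipeline(steps, cwd=None):
    """Run each command in order and stop at the first one that fails.

    steps is a list of argument lists, e.g. [["make", "clean"], ["make", "all"]].
    """
    for cmd in steps:
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a broken step cannot be silently skipped.
        subprocess.run(cmd, cwd=cwd, check=True)
    return len(steps)
```

If the whole list runs cleanly on fresh copies of the data, a third party with the same archive should be able to repeat it without asking the author anything. A Makefile achieves the same thing; the driver above is just the smallest scripted equivalent.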
In Open Source projects they use an elaborate versioning system (CVS). That process was designed for cases (1) when you had many different authors throughout the world working on the same project and you didn't want inconsistencies to arise and (2) you needed different versions for different machines. (2) should not arise in scientific programming. Keep it simple; don't use CVS unless it's needed.
A small part of the versioning system might be useful in maintaining a temperature file. The data might change frequently for various reasons. There is a program that can record the changes between two text files in a diff file, and another program that can apply the diff file to create the second file from the first. Therefore it is easy to maintain a base file, apply the first set of changes to get the 2nd version, and apply the changes to the 2nd version to get the third version. Scientists might want to look at the changes rather than the files. I don't think you need this unless you're in charge of archiving data.
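The comment above presumably refers to the Unix `diff` and `patch` programs. The same base-file-plus-changes idea can be sketched in Python with the standard `difflib` module. The delta format below is an ad hoc one invented for this illustration (it is not the unified diff format), and the function names are hypothetical:

```python
from difflib import SequenceMatcher


def make_delta(old_lines, new_lines):
    """Record only what changed, as (start, end, replacement_lines) against old_lines."""
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        (i1, i2, new_lines[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_delta(old_lines, delta):
    """Rebuild the newer version from the older one plus its delta."""
    result = list(old_lines)
    # Apply from the end of the file backwards so earlier indices stay valid.
    for start, end, replacement in reversed(delta):
        result[start:end] = replacement
    return result
```

Storing the base file plus one delta per revision lets anyone rebuild any version by applying the deltas in order, and the deltas themselves show which records were revised.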
I am sure what happened at CRU wouldn't have happened if they did something similar to what I am proposing.
There was a live blog at the Guardian as well, for people who prefer to read.
I just wondered what the two people from the Global Warming Policy Foundation were doing there. They seemed very eager on the issue of openness, at least until they were asked about their funding.
The part about "hide the decline" has now been officially laid to rest in a governmental inquiry. The divergence problem was there all along. What a surprise.
If anybody has problems displaying the video like me, here is the direct link. You can open the video in your own player. I'm using VLC from videolan.org:
http://www.twofourdigital.net/UKParliament/Archive/0000011672.wmv.asx
This comment has been removed by a blog administrator.
Georg / 3 - your comment had nothing to do with the issue. No problem if you have such shallow jokes on your own blog but not here. Deleted. - Hans
Hans
I thought the hardly controlled pleasure of getting someone "grilled" for his scientific behaviour deserves some comment.
Obviously there are a number of people who think one should go a step further than just some "grilling":
http://epw.senate.gov/public/index.cfm?FuseAction=Minority.PressReleases&ContentRecord_id=fb6d4083-802a-23ad-46e8-c5c098e22aa1&Region_id=&Issue_id=
So besides commenting on the "grilling", what can be commented on if not exactly the "grilling"?
No problem. I'll put it on my blog.
Actually it's true I like cheap jokes.
Georg - in most cases it is possible to express views in a manner, which is not considered inflammatory by opponents. As we know, we have lots of opponents (of whatever type) on this blog - and that is why we run the blog! - so that we all should avoid such "inflammatory" posting. The issue is: "sustainable use of the resource Klimazwiebel". -- Thanks for your understanding -- Hans.
Hans
again, the word "grilling" is obviously not from me. So can it be commented on or not?
If someone takes pleasure in having "grilled" someone who declared two weeks ago that he was thinking of suicide, do you think this is the appropriate language for commenting on this hearing?
So either Reiner refrains from this sort of polemics or you allow commenting on it. Having both at once is not very consistent, in particular for the resource Klimazwiebel.
Lord Lawson... what exactly is he doing there? The "hockey stick", which was vindicated again and again, is called "largely fraudulent". When asked about "hiding the decline", and whether he agrees that the problem is that it "didn't appear in the footnotes or in the literature" (20:20 onwards), he just said "yes I do". Which is rather uninformed, since Briffa 1998, Nature 391, states exactly that, and a year before the e-mail. The title of the paper? "Reduced sensitivity of recent tree-growth to temperature at high northern latitudes". No, it wasn't in the footnotes. It was in a paper's title. In Nature. Or in the IPCC AR3 WG1, Chapter 2.3.2.1 Palaeoclimate proxy indicators: "Non-climatic growth trends must be removed from the tree-ring chronology" and "There is evidence, for example, that high latitude tree-ring density variations have changed in their response to temperature in recent decades".
So Lord Lawson sits there and makes plain false statements to a parliamentary inquiry. Why? Doesn't he know better? Then what is he doing there in the first place?
Georg, if you find the term "grilling" inappropriate, then say so: "I think the term 'grilling' is inappropriate, misleading ..." whatever you want.
Georg - what I meant is - make a statement which is clear to everybody so that people understand what you mean. Do not hide it in easily mis-understandable "jokes".
Hans
it cuts a bit the beauty of exchanging opinions, but ok:
I think Mr Jones gave information on his scientific work to a committee. I think the word "grilling" expresses Reiner's preconceived concept and prejudice of what this committee is actually about and what it might conclude. I think the word "grilling" is not helpful for having a rational discussion on the issue of climate sciences.
I personally think of Guantanamo or Joseph McCarthy when I hear "someone getting grilled" and therefore I think it is not appropriate. You should choose your words better.
Is this better?
Perfect, Georg.
@Georg: Funny that you should bring up McCarthy. Wasn't the first time in the last week.
Although I think that a UK parliamentary "grilling" is probably rather comfortable compared to what Sen. Inhofe has in mind for 17 climate scientists.
When looking through the list of submissions to the committee, I am really puzzled where all the "friends" of CRU are. My impression is that only very few of the many coworkers of Phil Jones have considered it useful to post a statement of support. Or am I missing something here? -- Hans
@Flin
You are right.
If "they" don't like the results, send them to prison. Completely new possibilities for "future review processes".
Why is there actually no comment here on suing climate scientists for their supposed "misbehaviour"?
I would just like to put this in a normal perspective. The only thing about Phil Jones' behaviour and work that would really embarrass me would be if his results had serious flaws or were manipulated. Absolutely NOTHING indicates that.
For the rest, Jones is neither my neighbour nor my friend, but I wonder if Hans or Reiner really think that he merits being trashed by the British yellow press? I had expected a little less enthusiasm about the way things are going.
@ 15
Hola Georg,
I do really feel sad for Phil Jones, as I did for Bjorn Lomborg when he was being attacked. And Lomborg didn't do anything to block his opponents.
Hola Edu
Lomborg had a commission investigating his activities? An American senator was suggesting throwing him into prison?
But in any case, when someone feels sad about Jones and Lomborg being treated as they were treated, why then write that they are "getting grilled" with such malice?
@ 17
Yes, Lomborg was investigated and accused of dishonesty by a commission under the Ministry of Science
http://en.wikipedia.org/wiki/Bj%C3%B8rn_Lomborg
'why then writing they are "getting grilled" with such malice?'
did I ?
@Edu
"did I ?"
Of course not.
I am all the time speaking of the headline of this post.
And actually my problem is rather inconsistency. If Reiner chooses an obviously polemical headline like "Jones getting grilled", one should allow polemical replies. If Hans does not like polemics, why not then write in the headline: "Here is the video on the CRU commission"?
re 8 _Flin_
Hockeystick vindicated again and again? Yes by the same group of people more or less (Mann, Aman, etc). Discard Wegman and McIntyre if you want (personally I tend to put more stock in McI than Mann in the hockey stick issue). Even Phil Jones states in the Guardian interview that the temperatures in the MWP may have been as high as today's!
And regarding the statistical analysis used in the "vindicated" papers: Mann has yet to release his interesting statistical methodology.
As far as I'm concerned the only question remains whether the MWP was global or just NH; not whether there was a significant MWP even if the hockey stick papers deny it.
For German speakers, help from dict.leo.org:
to grill - in die Mangel nehmen [to put through the wringer]
to grill someone - jmd auf den Zahn fühlen [to sound someone out]
(Phil Jones gets a grilling: P.J. wird in die Mangel genommen - immer noch besser als gegrillt werden, oder? [still better than being grilled, isn't it?])
re 8, 20, _Flin_
Correction: Phil Jones didn't say that temps in the MWP may have been as high as today's. He used the word "if"; my mistake. He did acknowledge the existence of a MWP however.
Leaves me with Wegman and McIntyre. And the fact we are still waiting for Mann to release his code and statistical methods.
Sorry, Henk, but Mann's code and statistical methods are fully available, in particular for his latest range of papers.
@Hans von Storch:
I think that many of Jones' "friends" are scared as hell of being the next in line to be attacked. With the exception of Tim Osborn, I guess the CRU people didn't think it appropriate to react.
I'd love to have seen Briffa, though, especially now that McIntyre attacked him again, but put in a graph that does not support his conclusion...
Oh dear, didn't realize people were being kept awake all night because of that word. Like Werner points out, it is not polemical at all. Maybe Georg is not familiar with its use.
Georg - maybe my English is simply too limited. "Grilling" for me means a serious investigation - by asking tough questions. Certainly not a pleasure but entirely legitimate, given the significance of Phil Jones' results. Also, many reviews of your scientific submissions amount to this type of "grilling".
On the other hand, given the language used by Phil in his e-mails, I would not expect him to be too sensitive - in the same way as I am not too sensitive about attributes given to me.
@_Flin_
"The "hockey stick" which was vindicated again and again,"
Not according to Ian Jolliffe.
http://tamino.wordpress.com/2008/08/10/open-thread-5-2/#comment-21873
I agree in the wider scheme of things it probably doesn't matter all that much, but defending the indefensible is counter-productive and promotes mistrust. It de-values all the solid evidence.
And "hide the decline" meant exactly that: "let's not draw too much attention to the divergence problem". It refers to a graph on the front cover of a WMO publication released in 1999. Look at the green line here:
http://www.wmo.ch/pages/prog/wcp/wcdmp/statemnt/wmo913.pdf
To pretend that no-one has done anything wrong is silly. However, the vitriol aimed at him I find distasteful.
Georg, something else. I am one of the few people who actively made a submission to the UK commission; I guess you did not. Others were just as "bequem" (comfortable) as you were, saying nothing except on a blog. (Before, we did that in Nature.)
- But note that Myles Allen and I made an assertion only on the thermometer-based temperature series, not on the proxy work, and particularly not on the 2000-year reconstruction done together with Mike Mann, which is widely considered questionable in the community. Bradley's big cross on the slate some years ago at a Swiss summer school is remembered by quite a few.
Georg
As regards the 'yellow press', here is Fred Pearce who comments in the Guardian today. The headline reads:
Phil Jones survives MPs' grilling over climate emails.
Commons committee tiptoed round embattled scientist and sidestepped crucial questions
So Pearce seems to think Jones only got a 'light grilling', indicating perhaps that the grilling should have been more intense?
Here is a little taster:
Jones did his best to persuade the Commons science and technology committee that all was well in the house of climate science. If they didn't quite believe him, they didn't have the heart to press the point. The man has had three months of hell, after all.
Jones's general defence was that anything people didn't like – the strong-arm tactics to silence critics, the cold-shouldering of freedom of information requests, the economy with data sharing – were all "standard practice" among climate scientists. "Maybe it should be, but it's not."
And he seemed to be right. The most startling observation came when he was asked how often scientists reviewing his papers for probity before publication asked to see details of his raw data, methodology and computer codes. "They've never asked," he said.
Read the whole comment and make a judgement yourself here
Hockeystick "vindicated" - that's what Mike Mann and his friends are claiming, and repeating. But others, independent people do not see it like that. They consider the original work (MBH) as questionable. We also remember vividly the moving target-procedure employed in earlier times, when the algorithm was changed from paper to paper, without proper documentation. Eduardo can tell the story.
Nowadays, Mann's products are seen as just one proposal for past temperature development, among others. And this is ok. When TAR made it THE one, it created the problem.
Either history is right, or Mann's hockey stick is right. It can't be both. We all know that global temperatures were not flat from 1000 to 1900. The fact that the IPCC slobbered all over his hockey stick and made it the centrefold of the TAR, without questioning it, reinforces the notion they are acting on an agenda and not scientifically.
I haven't had time to look at Jones being "grilled". But scanning the other blogs, it appears he claims it's standard practice to hide data. Well, that pretty much proves that climate science is broken, corrupt and in need of a major overhaul, doesn't it? I can't think of any other science where hiding data is "standard practice", except maybe finance.
Either Jones goes, or the CRU's reputation goes. Not a tough choice.
What I found staggering to read (haven't actually seen the "grilling") is the quote from Jones that it was not ’standard practice’ in climate science to release data and methodology for scientific findings so that other scientists could check and challenge the research. Also that scientific journals that published his papers never asked for the data. Is this normal practice in science, or only in climate science? Funny thing is that in 2002 Jones did share the data with McIntyre. Apparently after he saw what McIntyre did with the data he decided to invent new rules.
IoP, RSC and RSS (reading their assessments) also think that models, codes and data should be in the public domain. And why not? We're not talking about nuclear/rocket science, it's temperature fercrissakes...
Hans, Reiner,
my reaction yesterday was a bit too harsh and mixed things that do not belong together.
I hadn't heard of this commission before I read this posting. I had, however, heard of Senator Inhofe asking for 11 climate researchers to be sued for their general involvement in climate research and for what they wrote in mails they at least didn't consider public. I am really missing notices like this one here on the Klimazwiebel (as, by the way, any contributions to climate sciences besides one of Edu's, but that's just for the record). It is McCarthyism, and though right now he is still a minority, it might become a much bigger problem than the question of whether the data of the met station of Ulan Bator should be made public or not.
Hans, could you provide me with a link to the commission's "submission" section? I doubt that I can say or do anything relevant, being bequem or not.
Reiner
three points you are probably not aware of.
1) There are non-public data whether you like it or not. I can only invite you to try to get from the DWD the complete data sets of an arbitrary German station entering into Jones' computations. The DWD makes money with these data, and even at the Max-Planck I had to pay 200 Marks for the data from Heidelberg. Just one station.
2) There are exercises in climate sciences which cannot be reviewed in the way you might have in mind. Satellite data sets (again partly non-public) are huge, and Jones' data set is quite large as well. Even for someone working in the same domain it will take weeks if not months to check each step. The scientific review process as a whole is therefore twofold: A) obviously, internal plausibility, citations, etc. are checked by the actual reviewer; B) subsequently, others with similar (though not identical) data sets and different reasonable methods get or don't get similar results. That is in fact the really important step, not what has been done up to the actual publication.
So what the Guardian and/or you have in mind is a) not practicable and b) not needed.
3) In the end it's a question of trust. If Jones' data set, or surface temperatures in general, are only accepted when McIntyre agrees with every smallest and most minor subjective choice, that's the end of climate sciences, and this is of course intended by many. As a friend of mine put it: if anything is a lie, you can no longer prove or disprove anything. Once you divide by zero, everything becomes true.
Hans 27
If I understand you correctly, you made a submission to the UK Select Committee on Science and Technology? How does this work? Did they ask for contributions?
The hockey stick is now under political scrutiny (the UK committee), and in case senator Inhofe really sues climate scientists, under legal scrutiny.
'McCarthyism' seems to be an omnipresent accusation. You, Hans, once complained about McCarthyism - gatekeepers from influential journals who prevent critical or skeptical contributions; now, with Inhofe (and maybe even the Jones grilling), there is again McCarthyism in the air.
Anyway, is there any hope that politics and law will correct what science itself obviously did not achieve - the control of regular scientific work? Or will the mess accumulate?
@itisi69
I am not sure about other branches of science. I will explain the situation in climate science as far as I know it.
Not all climate data are in the public domain. The principle of the U.S. federal government that data obtained by the government shall be in the public domain is not shared by other nations. But a large part of the data needed for global climate research is openly available, at least for non-commercial applications, thanks to various international collaborations.
There are data centers whose job is to provide data to users. The data center to be named first for both modern climatology and paleoclimatology is the National Climatic Data Center, a part of NOAA in the USA. Their data are in the public domain unless otherwise specified by those who provide the data to them. But they may charge a fee for their service. Recently more data have become available on the Internet, so the occasions where we must pay a fee have become rarer.
In the following I limit the subject to modern climate data.
CRU is not a data center but just a research institution. They released their data products (gridded data), and maybe that was their duty under their contracts. However, providing the observational records at stations (their raw material) to other users is not their job, but the data centers' job. So I think it was reasonable for Jones to say that McIntyre et al. should get the data from NCDC.
Actually, when researchers obtain data from data centers, they often have to make conversions between different formats, and some quality checks from the researchers' viewpoint, before using them for the substantial analysis.
I am a scientist who uses data from various sources. I write ad hoc program codes to do such minor tasks every time I need to use something new to me. But I do not usually write documentation for those codes. If I need to share them with my collaborators, I write some informal documents to be supplemented by conversations. (I feel I should do this for my future self as well.) Writing full users' manuals understandable by outsiders is a much harder task. If the equivalent of the British ICO determines that I must release the codes, I will be obliged to spend more time writing their documentation than I spent writing the codes themselves.
When I report the results in scientific papers, I explain my methodology. It means that I explain the substantial parts of the process of my analysis in words or mathematical formulas. Probably I do not explain trivial parts of the process such as format conversion.
Other scientists who want to repeat my analysis probably need to create their own program codes. I think this has been the normal practice of our scientific community.
If there is some motive for me to be sympathetic to them, I give them my codes and also take my time to explain them. I do not mind if they are going to scientifically refute me. But if they are going to morally discredit me, I would not help them.
Note that the program codes I mention are short ones, usually about one hundred lines long each.
The source codes of full climate models are tens of thousands of lines long. My knowledge about their situation is too fragmentary to talk about.
Dear Kooiti Masuda,
You wrote:
"If there is some motive for me to be sympathetic to them, I give them my codes and also take my time to explain them. I do not mind if they are going to scientifically refute me. But if they are going to morally discredit me, I would not help them."
I have to think about the validity of this statement. You are assuming you are moral and that the critical party is not. Seems like a strange way of scientific cooperation.
Actually, if your work is sound and done well, there is nothing to fear. Of course if one deliberately fudged the data, or is very insecure about the methodology he employed, then he might choose not to show his work to a critical party. Indeed he'll probably only agree to show it to his morally equivalent pals.
Why does there seem to be some kind of paranoia among CRU scientists - thinking others are only out to destroy them? Do you believe that Mr McIntyre is only interested in morally discrediting Mr Jones? What do you base this on?
If CRU's work is good, then put it to the test. If not, then it is necessary to destroy it. Otherwise science becomes a folly.
I'm not convinced by your defence of Mr Jones and the CRU.
@MASUDAsan:
Many thanks for your extensive explanation which gives me a good insight in your procedure.
However: "Other scientists who want to repeat my analysis probably need to create their own program codes. I think it has been normal practice of our scientific community."
Why is that? Isn't it important for peer reviewing to know how you calculated the results? How can it be falsified otherwise?
What also struck me is that apparently Jones was never asked for the data by his reviewers.
BTW McIntyre described what happened when he was a reviewer:
"When I was a reviewer of Wahl and Ammann, I asked for verification r2 statistics; they refused and Schneider terminated me as a reviewer. When I was a reviewer of Mann et al (submitted to Clim Chg 2004), I asked for supporting data and code; Schneider said that no one had ever asked for such things in 28 years of editing and it would require a policy change by the editorial board. Jones and Santer were on the editorial board and the matter is discussed in a number of early 2004 Climategate letters (which Jones “confidentially” sent to Mann.)"
Cheers
@itisi69 (apologies in advance for the length of my comment)
It is indeed common practice in science to NOT ask for data upon reviewing a paper. As a scientist myself (but in another field than climate science), I review about 10 papers a year (and get asked for many more). On top of that I have to make sure my PhD students and postdocs write proper papers. I also have to teach. I also have to write grant proposals. I also have some additional administrative tasks. I go to various scientific meetings and presentations. I spend on average more than 50 hours each and every week on these tasks. While I do this with all the love of my heart, I simply am incapable, as a reviewer, of redoing the analysis of the provided data. If I use the same 'code' as provided by the authors, I would have to check the code in detail first. If it happens to be written in a 'language' I do not understand, I have to find someone else to check it. Alternatively, somehow I need the authors to write it in a format I do understand. The raw data can actually also be wrong, but in many cases it is not possible to see that. In my area of research, analytical methods are very common. I can't see in the raw data that instead of the claimed 700 microliter, 800 microliter was used. I'd have to repeat the experiment (meaning I need the same material and instrumentation) to see that problem. Especially with more advanced methods this may be totally impossible (both economics and time are issues there).
In short: even *with* the data provided, very few reviewers would be able to use that raw data to check the results of the authors within the time-frame allowed for a review, within the time-frame our employers would allow for this type of work, and most certainly within the time-frame I am able to spend (mentally) on reviewing papers. Reviewers look at the methodology as described, whether the results make sense (graph and discussion must fit), whether prior work is properly referenced and whether the authors explain any discrepancies, whether it is novel, etc. etc. etc. But reviewing is not auditing. The scientific audit is others using the same procedures (or novel procedures) and apply it to new data (or the old data). Do they find the same (/similar)?
There are some areas where certain 'data' must be submitted to databases, but these areas are actually limited. For example, many crystallographic data needs to be submitted to a database upon submission of a manuscript. That doesn't mean the reviewers even check the data, though! Recently, some Chinese were caught having submitted the crystal data of one compound as many different compounds, for example. It was discovered because someone else had made one of those compounds, and did not get the same result. Discovering this fraud required *independent* work, not an audit of the data.
@P. Gosselin:
In contentious situations, we need an arbiter or a moderator to continue conversation. Otherwise we had better avoid contacts in order not to escalate contention.
Marco,
"Discovering this fraud required 'independent' work, not an audit of the data."
So what's your point?
Scientists don't have time to properly and rigorously review papers, and thus should just skim over them and accept them on blind faith?
Or papers should be subjected to "independent" work, but not to audits? Independent work will eventually out the errors?
I think for an issue as important as climate science, it has to be checked inside out - with microscopes, audits, reviews and independent work.
I think if Mann, Jones, etc. had reacted cooperatively when first asked to provide the data and not combatively and aggressively, and had not attempted to arrogantly belittle those who had legitimate questions, that is if they had behaved semi-professionally, they would not have had to endure the scrutiny and suspicion that naturally ensued and brought their house down. Yes, they were hiding things - the e-mails confirm this, and a lot more.
Surely when one starts calling fellow scientific colleagues names and starts to publicly question their credentials, he is neither going to dampen their suspicions nor make them into friends. If you live by the sword, then you also die by it.
Itsi @37
The McI quote raises an interesting question. Normally, researchers would not do what he did (Marco's valid point about time constraints aside) because they would hamper their careers. So you need to get someone with nothing to lose in order to get a thorough job done.
Is it normal for a scientist to express glee over the death of a critical colleague?
Again, who, other than perhaps Masuda san, who has yet to answer the question, believes Mr McIntyre was interested in morally discrediting Mr Jones?
It amazes me that scientists may have reached this plateau. I'm only asking questions based on statements made by others.
Unless scientists return to professionalism in the field, the situation is only going to get a lot worse.
Werner 34 -- McCarthyism:
Inhofe's announcements are empty gestures. If there is breach of the law, investigations and prosecution would have started already. Inhofe cannot produce this. He plays to the gallery. And as a byproduct, he makes other people jump up and down. ;-)
@P Gosselin:
I'm not asking for blind faith. On the other hand, some level of faith in other people *is* warranted. Independent work is much better to both weed out errors AND to strengthen results. The second issue should not be forgotten.
And putting a microscope on data may be fine (note: it really doesn't matter that much for AGW what the pre-1900 reconstruction exactly looks like), but why not put a microscope on the microscope? We've already seen audits by people like Watts, D'Aleo and Smith turning out to be wrong. The auditor audited. Which then is audited and audited again, all on blogs, with cheering crowds all the way. McIntyre has also been audited a few times, and plenty of people are, to put it mildly, not impressed with several of his analyses. Minor issues are blown out of proportion, or (as in the case of Yamal) the analysis is outright wrong.
Note also that Jones *had* been forthcoming. But he quickly found out what McIntyre was doing: trying to find a comma wrong, and blow it up as much as possible. Sure, he may not have directly set out to do so, but his cheering crowd has been pretty good at doing it for him. Scientists are not happy when they are shown wrong, but at the very least they like to be able to respond to criticism before it is thrown into the world, especially with all the accompanying invectives as has happened to Mann and Jones.
Take also the experience of Briffa.
Briffa was so kind as to send McIntyre's request for data through to his Russian colleagues (McIntyre knew who had the data, they were acknowledged in the paper). The Russians gave McIntyre the data. What did McIntyre do? He kept on demanding Briffa send him the data! Any normal human being would get seriously unwilling to cooperate with someone who is clearly never satisfied, and whips up a rather unpleasant atmosphere when he actually does get his hands on the data.
The land-based temperature record is another one of those areas where you really should wonder why people are getting so upset. 95% of all the raw data HADCRU uses is freely available. HADCRU can already be checked using that data. There are also independent analyses, such as those from NASA (GISTEMP), NOAA (NCDC), and the JMA, which reproduce, using other methodology, the same result.
In essence, giving McIntyre (or others) the data would thus not have mattered for the science. Unfortunately, we also know it *would* have mattered for the scientist, as even the slightest error would be pounced upon as some kind of huge crime, and all that after Jones would have spent masses of time to find all the data and the code and whatnot to satisfy McIntyre in the first place. I thus understand the reluctance of so many climate scientists to just give their data to people they consider neither very ethical nor friends of science.
It is true that reviewing papers generally doesn't mean going through all the raw data, code and methods, as it would be too time consuming and so on, agree with Marco here.
BUT, refusing to release data and code to a reviewer who wants to do a thorough analysis is very wrong and unscientific.
Marco,
I don't expect to reconcile differences here. But the matter indeed goes far beyond slight errors and commas. The e-mails couldn't be clearer.
You are mischaracterising Mr McIntyre. He is not being emboldened by his “cheering fans”. Rather it is more his cheering fans being emboldened by his profound and disturbing findings.
You make it sound like Mr McIntyre gets his kicks by being a troublemaker. Nothing is further from the truth.
Climategate, Hockey Stick and IPCC would not be in complete turmoil if we were talking about only a few mistakes. Unfortunately, it gets down to some serious problems. It's institutionalised systematic chronic wrongdoing and general unprofessionalism vis a vis legitimate inquiry that have been uncovered. Whitewashing, downplaying and continued attacks on critics will only make the situation get far worse. Much of the science community is already turned off by this behaviour, e.g. see the latest IoP statement. Like it or not, it's a growing tide.
Marco / 43. - we know from the e-mails that Phil did not want to share the data with somebody who wants "to find errors" (I once asked Phil for confirmation - and got a positive response from Phil). Even if that may be understandable, it is simply wrong. Our scientific adversaries are allowed to want to find errors.
By the way, when I met Phil the last time, in October 2009, I urged him to solve the problem (by sharing the data right away; by starting an initiative to persuade the data producers to allow distribution) - and, as we know by now, by admitting that some data have been lost. He did not want to do so, and one month later the mails were leaked.
On the other hand, I have to admit that most data of earlier studies of mine are lost.
@Marco
@Reiner
and other scientists
So basically it comes down to the fact that "we don't have enough time to check all the data and codes, so we simply take the scientists at their word"?
What about when someone does have time to go through the data/codes? What's so bad about that? Errors (whether intended or not) can occur and can be addressed. To me this excuse has no value. Would you believe Fleischmann and Pons at face value now?
Wasn't it the scrutinizing of the data that revealed that Mann's Hockey Stick was flawed, as proved by Wegman? And that Mann used data upside down?
"What did McIntyre do? He kept on demanding Briffa send him the data! Any normal human being would get seriously unwilling to cooperate with someone who is clearly never satisfied, and whips up a rather unpleasant atmosphere when he actually does get his hands on the data."
Sorry, but this is absolute bollocks. As the Climategate emails clearly show, McIntyre was up to something and there was a lot of nervousness about how to reply. The rather unpleasant atmosphere is created by those who don't want to issue the data. I'm asking over and over again: if there's nothing to hide, give it to them! There's no better way to shut these pesky skeptics up than to show them in the open that nothing is wrong with the data.
I just can't believe my ears, and apparently, looking at the MPs this afternoon, I'm not the only one.
@all: I think there was a discussion before about releasing data. If I remember correctly, there was quite some agreement that it is the right thing to do to provide the data.
I would be very interested in hearing from you which kind of data you would happily provide and where the boundaries of this are.
Can scientists be asked to release all of their findings in the form of data and code (in other fields called intellectual property) for the use of the public? Or is it necessary that all scientific code becomes Open Source, so that the code can be improved and reviewed? What about security risks (imagine someone wants to steal, let's say, your emails)? Where are the limitations? Time limitations, e.g. 3 years?
@Reiner
"Inhofe's announcements are empty gestures. If there is breach of the law, investigations and prosecution would have started already. Inhofe cannot produce this. He plays to the gallery. And as a byproduct, he makes other people jump up and down"
On the other hand I don't know how he could start if not like this. I suppose McCarthy didn't get started with: "And now everyone who has a grandfather in the communist party loses his employment".
If one can't even take an American senator seriously, who else? The pope?
@6
'Hockey stick vindicated..'
This is a very elastic sentence. What does this specifically mean? That all other reconstructions lie within its error range? Or that the year 1998 was the warmest in the millennium? Or that the temperature evolution in the past was flat? Or that the reconstruction method was correct?
There are still methodological aspects that are not clear, for instance how the uncertainty ranges were calculated for the smoothed reconstructions. This issue also emerges in the CRU emails, as Mann did not want the data from his statistical analysis (the regression residuals) to be made public, although later they were made public. Another open question is that so far all reconstruction methods have been shown to underestimate the past variations - you will probably hear other opinions on this, that of Mann and co-workers and that of the rest of the groups. So it is likely that if a warmer period than today (say around 2000) had indeed occurred in the past, the reconstructions would not have picked it up. This is also related to the divergence problem displayed by some tree-ring proxies. I think that the state of the art now is well summarized by one sentence of the NAS report on millennial reconstructions: back to 1600 it is very likely that it was cooler (by how much?). Before that everything is much more speculative - fewer proxies, more uncertainty, etc.
I think Jolliffe was also quite right when he wrote that the evidence for AGW does not rest on the hockey-stick. Actually, the hockey-stick would be a quite minor component of the AGW body. However, it has indeed been bloated in the past as one of the cornerstones, and now we pay the price for that. In my view the HS says much more about the working of climate science in the last decade and related issues than about climate science itself.
It cannot be denied that Mann was very defensive against everyone trying to check or criticize his work, also before the appearance of McIntyre on stage. This is featured in the CRU mails, for instance in those related to the Esper et al paper in Science. I think this stance has not been helpful for the whole community, and again we all pay now the price.
This doesn't mean that everything is 'blue sky' in the other camp. I think the interest in getting the science right is overshadowed by a drive to oppose the perceived Team, whereby the Team is more or less almost every climate scientist. We can only speculate what would have happened if the first approaches of McIntyre had been met more openly.
Review process.
It is indeed impossible to review a paper in depth, checking all calculations and code. Many papers use intermediate data, for instance data from simulations performed by other groups, etc. To review a paper means to check that the relevant literature has been considered, that there are no logical fallacies, that it is clearly written, that it provides all the information needed to understand what has been done. Many journals set a tight time schedule, 3 or 4 weeks, and reviewing a paper is done on a voluntary basis. The real self-correcting process occurs later, when over the subsequent years other teams try to reproduce the results or find other contradicting results. That's why I think it is misleading to highlight any findings of any recent paper as the final proof of anything. The results of a paper are more or less established after, say, 3 or 4 years or even longer. Even more so in climate science, where we are dealing with additional uncertainties due to the impossibility of conducting experiments. So any claim that a paper debunks the whole of climate science or that the IPCC underestimated climate change is really not serious. This is a slow, plodding business.
It is indeed disturbing that some journals do not like to publish comments or corrections. For instance, according to the PNAS guidelines, comments to papers are only allowed within 3 months after publication and can be 500 words long. For Science the limit is 6 months. I think there shouldn't be a time limit to report any errors found in a publication.
Data sharing.
There are two aspects that have to be considered. One is the need to help in the replicability of results by explaining data, code, and calculations. The other is that some kinds of data are very burdensome to obtain and can be used for different studies. I think there is a gap here. Journals should set up some mechanism by which data are shared just for the purpose of checking a paper, and not to initiate new studies if the owner of the data does not agree. Perhaps a kind of data broker.
It is also true that it has been very difficult to obtain observational data for every one of us, not only for McIntyre, and Georg is right to explain that the staunchest guardians of data are the national weather services. This situation may be slowly changing, but in the early 2000s and late 90s it was much more difficult.
In my whole career I was asked only once by a reviewer to provide data so that he could check my calculations.
Werner / 37. -- "If I understand you correctly, you made s submission to the UK Select Committee on Science and Technology? How does this work? Did they ask for contributions?"
- No, we somehow knew (ok, Myles knew) that there was a call for comments; so we submitted our extended nature-text to a given web-page, and that was it. No special invitation. -- Hans
re#23
Thanks Marco. Tricky thing the blogosphere. Have to make a mental note: "Do not quote from others (as authoritative) until original papers/data have been checked"
re#24 Reiner
My watch says 1:38 pm right now
re #38
Marco, I fully agree with your assessment. Was going to write something to that effect as well.
What kind of incentive is there for overworked scientists to go with a comb over other people's original data?
In a way this situation in climate science is unique. Is there another branch in science that gets so seriously scrutinized by outsiders with a good scientific background (and some with a lot less understanding)?
If only there was a serious "medicalaudit.com"! As it is there are plenty of papers without statistical merit. What if all the original data were audited?
Would love to see that.
Data sharing
In the UK, the Economic and Social Research Council has the following policy in place:
As an ESRC award holder, under the ESRC Data Policy, your contract requires you to offer all research data generated from your award for deposit with the UK Data Archive. This applies to all kinds of ESRC awards - individual research grants, fellowships, Research Centres investments and awards under Research Programmes.
The offering data process is administered by the ESDS, which you must contact within three months of completing your ESRC grant to offer data for archiving. The UK Data Archive, a service provider for the ESDS, is where research data are processed, physically stored and disseminated.
I have looked for the equivalent for the physical sciences (EPSRC) and engineering but so far have not found anything.
Henk 56
In a way this situation in climate science is unique. Is there another branch in science that gets so seriously scrutinized by outsiders with a good scientific background ( and some with a lot less understanding)?
This applies to scientific controversies. Remember, climate science under the vision of the IPCC was supposed to provide a consensus view, not a controversy. We only have open controversy since recently.
During 'normal science' in most cases papers will not be scrutinized and experiments not repeated for obvious reasons. It is not in the self interest of a researcher to do that as you could just prove someone else right. You would waste your time and resources.
A controversy changes this incentive structure. If you are part of one 'camp' you may want to do many things to disprove the 'other side'. Often findings that support the other side are not reported. We need to find robust institutional structures to create maximum transparency. Controversies provide such a mechanism but at a high cost.
@Henk 56
medicalaudit or epidemiologyaudit would definitely be called for.
As a matter of fact, the reason I began to look into climate science issues was the striking similarities on the PR side - "the science is settled".
And in the case of epidemiology, you don't need a statistics degree to find major flaws in the peer-reviewed literature.
A good blog used to be http://junkfoodscience.blogspot.com
However, the lady running it stopped posting in Fall 2009
It's safe to say that, at least on this forum, there's consensus between (climate) scientists that scrutinizing data is not common in the peer review process, in fact very rare. Fair enough.
But... when global policy is depending on this science, with trillions $$$ at stake, whole economies designed based on the results, billions of investments, mitigation, adaptation, the full monty. Don't you think it's imperative, no, it's a matter of life and death that the data is scrupulously scrutinized?
Again, sorry to beat a dead horse, the Hockey Stick: this graph was the epicentre of the third IPCC report, the Mother of all Anthropogenic Warmings. And look what happened... Mann and Jones seem now to "hint" at a global MWP.
Don't you all agree that in this particular case a less cavalier attitude is desirable?
Dr.von Storch lost data? Pfui Herr Doktor! ;)
Reiner 58
There is another factor introducing bias in medical studies, and that is money. Pharmaceutical studies are twice as likely to show positive results as purely academic studies.
Interestingly the statistics in the drug company studies are often quite good. One way to fool us though is using the comparator drug in relatively lower doses and bingo, the study drug is more effective. Drug companies have left out original data sets; a notorious example was one of the antidepressants. Another wonderful way in which to get results is: start off with a good number of co-variables. You are likely to find at least one significantly changed with p < 0.05.
Then don't report your initial co-variables that turn out to have no significant change.
The most interesting part is how new drugs get presented to the docs. Y axes are blown up like a balloon to make small changes look impressive. Usually presented nowadays by a beautiful lady. When you ask her "what is the number needed to treat" (the number of patient years needed to prevent one death or one serious event) the routine answer is "I will have to look that up".
Below a study in the BMJ. Didn't even look at original data; that would be a whole different story.
http://www.bmj.com/cgi/content/full/326/7400/1167
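As a back-of-the-envelope check on the co-variable point above: if a study tests n co-variables that in truth show no effect, and the tests are assumed to be independent (a simplifying assumption; real co-variables are usually correlated), the chance of at least one p < 0.05 by luck alone is 1 - 0.95^n. A minimal Python sketch, with a function name made up for illustration:

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n_tests independent tests of a true
    null hypothesis comes out 'significant' at level alpha.

    Assumes independence between tests, which is an idealisation.
    """
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    # Probability that every test stays above alpha is (1 - alpha) ** n_tests;
    # the complement is the chance of at least one spurious 'hit'.
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, round(prob_at_least_one_false_positive(n), 3))
```

Under these assumptions, with 20 co-variables the chance of a spurious "significant" finding is already well above one half, which is why reporting only the co-variable that came out significant is misleading.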
As most climate datasets have a horrible signal to noise ratio, results depend strongly on adjusting techniques. Of each timeseries multitudes of grey variants are circling. To point to "the" data elsewhere, as e.g. Briffa and Jones did, is no guarantee that "that" version is identical to "the" one used in the publication. Reference to original data residing in a common public domain database would resolve most of the issues of irreproducible science.
Addendum to my previous posting (which I see as #63) with comments to Hans Erren (#62):
Compilation of global surface air temperature since 1850 will surely be a subject with the special condition to preserve all the input data and program codes.
(Perhaps this suggestion overlaps with the initiative of the UK Met. Office announced already. Excuse me, I have not checked what they say yet.)
Restricted data will deliberately be excluded from the input to the particular project. Logically, the software that comprises the data stream towards the product (not only those codes produced in the project) should be all open source. I am not sure whether this condition can be actually enforced.
Also, I do not think we can make the adjustment of historical data fully objective. We sometimes need expert decisions. So, I think the achievable level of reproducibility will be like this: If we tentatively accept the same set of expert decisions, we can reproduce the same results. We can choose an alternative set and get somewhat different results. We can trace the source of difference.
@Reiner #57:
Note that there is an organisation specifically dedicated to archiving the data and helping researchers archive said data. And I'm pretty sure they have the same caveat as other data repositories: you can get exceptions for data that is going to be used in future publications, and exceptions for data that may be used commercially(!).
Having reviewed 100s of papers and having been a journal editor for almost 10 years, let me say the following about reviewing and access.
In most cases, you can write a perfectly round review without having checked every detail. This is obvious if the paper goes wrong at some higher level.
As a referee, you put a paper through a series of validity checks. Is it original and relevant? Are the methods and data sound? Once the referee is satisfied that the main conclusion is robust, there is no need to check every detail.
There are papers, however, where the surprising conclusion hangs on a badly documented detail. In such cases, a referee would typically ask for better documentation instead of trying to reproduce the result herself.
Note that there are areas of science where the raw data are evaluated thoroughly. Drug and chemical safety studies are performed to a standard (GLP) with auditing, and all the raw data, conclusions and methods of analysis are audited.
It is absolutely routine for epidemiology studies to be audited and cross-checked by other groups, frequently revealing substantial differences in analysis. There are difficulties (you have to anonymise data), but when it matters, you have to be able to check the data and the conclusions.
It is more difficult to analyse lab notebooks, for obvious reasons, but when it comes to the data, the codes for analysis and the output, there is little reason for not making this freely available, which is what is at issue here.
per
Well, maybe science is seeing a change of mind re datasharing lately due to Climategate?
The following institutes are in favor of sharing the data:
Institute of Physics
Royal Society of Chemistry
Royal Statistical Society
itisi69
They would not be able to call themselves scientists if they decided it any other way. I read the RSC statement and they are right on in every point.
Paranoia and secrecy have no place in science.
Donna calls em out!
http://nofrakkingconsensus.blogspot.com/2010/03/battle-for-soul-of-science.html
re 67 anonymous
New drugs have to pass studies of pharmacokinetics, toxicology and mutagenicity, and sometimes teratology, before being tested on humans. This is the laboratory part and has to comply with GLP (good laboratory practice). The phase 3 clinical studies are where the problems creep in. They are phenomenally expensive. These are the studies the BMJ article reviews.
Hans @55 and others:
Here is the link to all submissions made to the select committee:
http://www.publications.parliament.uk/pa/cm200910/cmselect/cmsctech/memo/climatedata/contents.htm
@itisi69:
Funnily enough, the Institute of Physics is not willing to say who is on its subgroup for Energy, in other words, which person(s) wrote their submission.
So much for openness...
And note that the RSS also puts constraints on data sharing, and that the RSC has a nice sneer towards the people who are attacking climate science and scientists and are not held to the same scrutiny (Watts comes to mind, but also McIntyre).
Hello,
I'm surprised that the HARRY_READ_ME file hasn't been mentioned. IMHO, there may be 2 reasons why some scientists at CRU may not have supplied data and code as requested.
1. They were unable to do so. The data was in such a mess that the original data was essentially unavailable because they did not keep intermediate files and could not find the programs that produced the intermediate files. See the comments by Francis Turner, a computer professional. There are other computer professionals who are less kind than Turner.
2. They wanted to hide code because it indicated fudging to get their results. See this excerpt of some programs used by CRU.
Before the email and other documents were made public, I would have accepted the results of papers by CRU on faith. But now I won't accept the results until I see those results duplicated. I'm not saying that Mann and others cannot produce the data and code for their results; I'm just saying I have lost faith in CRU.
klee12
@everyone who tries to compare Climate Science with other fields of science:
I think the idea of "other fields of science do things differently" is a false argument. It should not matter how science is conducted in other fields when we're examining Climate Science. Climate Science stands or falls on its own.
The facts are, as demonstrated, there was intentional fudging of results, intentional withholding of data/codes for the express purpose of preventing others from falsifying results, deliberate truncation of a series of data in a highly publicized graph to hide an inconvenient decline, and a number of non-scientific behaviors engaged in. This does not prove that the results achieved by Mann, Jones, Briffa, etc, are necessarily wrong, but certainly weakens the confidence of any subsequent science that relied on their work, which appears to be the bulk of climate science at this moment. Ethics matter. When people ignore them, they lose credibility. It is much worse to be seen after the fact as attempting to prevent others from being able to verify your work, than to have someone discredit a piece of it but keep your dignity through the process by honestly addressing any mistakes that are uncovered in your work.
I personally would love it if I had a competent statistician point out flaws in my own application of statistics in my work - I'd probably invite him to help me on my next paper to ensure that the result was much higher caliber than I could have achieved on my own. But then I'm not Mann or Jones.
As someone who, in the latter part of his career, participated in feasibility studies for major projects (of the order of $500m to $2 billion capex), I am astounded by the attitude of scientists, and particularly climate scientists, to what I would describe as "due diligence".
Before the financiers (usually but not always banks) will accept that a feasibility study is sound and thus suitable for financing, they require a detailed due diligence exercise to be undertaken by a specialist independent review company. That review can cost between $500k and $1 million.
The reviewer takes every material aspect of the FS, and asks for the underlying data, reports, calculations, and specialist engineers check the statements for soundness.
If the independent engineer is not satisfied (which is usually the case, at least on some aspect or another) those responsible for the FS must address the issue, and do further work until the independent engineer is satisfied.
Also, in the commercial world, I was frequently involved in the preparation of Prospectuses to raise equity funds through a public issue on a major stock exchange. In each case, the lawyers advising the company maintain a detailed due diligence file in which is recorded the underlying or supporting information for EVERY material statement made in the Prospectus. It is recognised that the Prospectus might become subject to later legal action; the hard-copy ringbound due diligence files constitute the evidence that those involved followed appropriate processes.
It is indeed staggering to observe the much more casual approach of Mann, CRU, IPCC, especially when not $billions, but many $trillions are involved.
@ anonymous 75
In the light of the current economic crisis, I am not sure how reliable economic feasibility studies really are.
And the equation that Mann's hockey stick is worth many trillions of dollars is polemical and under-complex. The relation between science and politics is not one of a magic bullet.
@Raz0rama:
You'd be delighted with a statistician helping you? Sure, we all would!
But how would you react to a statistician redoing your work and throwing all mistakes into the wide world on a blog, upon which the readers of his blog are not too shy to cry "fraud"! "Manipulation"! and other 'nice' qualifications?
How would you react when the statistician tries to enter *your* area of expertise, makes claims that are at odds with those of the general community, and then claims he has falsified what you did?
@anonymous #75:
Despite all the claims of billions of dollars/euros flowing towards climate science, there is hardly a penny for the kind of review you propose. However, there *are* others who repeat results, often using different methods. For example, however much people complain about parts of CRUTEM not being openly accessible, its results are already repeated by others using different methodology (GISTEMP, NCDC, JMA). In science, such repeatability is worth many times more than just checking that someone did all his calculations correctly. It means the result is robust to different methodology.
Anon @75
Comparing the work of feasibility studies (FS) with that of individual scientists, like Jones, amounts to a category error. The comparator would be the IPCC (and its necessary reform) and this is where much of the attention is focussing now.
Apart from this, I am not convinced that FS would fix climate science (or climate policy). The relation between the two is not direct and the 'level of soundness' of scientific results tells you nothing about the policy choices.
What is needed is a better institutional quality control in science (peer review problems, data accessibility, etc., see other blog post about model uncertainty) and wide political discussion about practical political steps to address climate change.
The IPCC has given the impression that both processes could be folded into one, which is presumably the reason why you (and others on this blog) point to feasibility studies.
Anon75; I've always wondered that too. But then again you're talking about the real world, not the Twilight Zone called Climate Science.
Jones' CRU got 22Mio in funds, yet they have only a 3-person staff, no time to respond to 60 FOI requests, shoddy codes, and they lose important data. Imagine if this happened in the real world...
@itisi69:
The number that has been thrown around is at the very least lying by omission. The supposed 22 million (13 million pounds) has gone mostly to the *university* for *non-research* activities.
For example, 6.6 million pounds was used to set up the Institute for Connective Environmental Research. The money went into the *building* of the Institute...
Another 2.7 million pounds was for Tyndall phase 2, led by the University of Oxford (Tyndall Centre), with Jones just one of many applicants. The same goes for several other applications (interestingly, Jones, or the UEA for that matter, isn't even mentioned as a member of the EMULATE consortium).
Even if we're generous, I guess we are talking about 3 million pounds over a 15-year period. That really is not that much, and most certainly does not sustain a large research group. Just for comparative purposes: my group with 5 permanent staff members has at least twice that amount available, and that's without the considerable co-funding of our university (PhD grants, for example). And our research is not all that expensive: we often get considerable amounts of materials for free. In the last 3 years about 1 million(!) in commercial value from one company alone, for example.
Thanks Marco for the explanation. I knew you, knowing everything, would have an explanation for the Jones grants.
Bean counting aside, my point is that Jones et al. have insufficient staff, lose data, write lousy climate software etc. So tell me, what's the point of "building" institutes like Tyndall for 2,7Mio and not to forget ICER for 6.6Mio(!) when the back office is incapable of doing the most essential work, has a Director who works in disorganized fashion, admitting that he ‘did not do a thorough job’ of keeping track of his own records, and has only one secretary for its 13 staff who’s also acting as a part-time receptionist? This kind of behavior would be absolutely unacceptable in the normal economy.
The Jones Jar:
http://spreadsheets.google.com/ccc?key=0Ah4XLQCleuUYdFIxMnhMNnlXb2JQcDZUendjUXpWWUE&hl=en
@itisi69:
Unlike many others, I actually check the sources and look at the background. Anyone with half a brain would be able to do it, but even many with a whole brain don't.
Regarding the one secretary for its 13 staff, also acting as a receptionist: my Department has 5 secretaries on a total workforce of 140. This may be unacceptable in the normal economy, but unfortunately the daily reality of the universities (and most likely all over the world). The assignment of universities is to teach and research. While we, as scientists and educators, recognise the importance of secretaries, the funding agencies (including the government) do not.
Even the EU is hesitant to give money for secretarial help to big EU projects. As a result, I have seen scientists doing loads of things they simply are not educated for. If you're so upset about this, contact your government and tell them to give universities more money for administration. At present, most universities get less and less basic funding, and need to get more and more funding through applications. Put a secretary in that funding application, and your chances of getting money rapidly go down. Besides that, it is a non-permanent position, meaning you are constantly shifting secretaries if you even get money.
Note that Tyndall was a research consortium. (Z)ICER was a building. You can't do anything without a building, including educating students (which still is the official main assignment of universities).
@35, 38, 53, and 66
Thank you for clarifying what peer review means.
There are actually two meanings of “peer review”. The first one refers to the fact that scientists (should) critically observe the work of their colleagues. The second refers to a formalized practice of editorial boards and funding agencies.
The formalized peer review of a paper does not mean that the presented findings would necessarily be true. It is philosophy of science 101 that final proof is impossible. Scientific knowledge can only be falsified (well, the falsification theory has problems too, but it describes the current practice well). Consequently, corroboration means that findings withstand REPEATED attempts at falsification.
Formalized peer review is a first short cut to weed out contributions that violate the norms, theories, and methods of a field. The real discussion (among peers and increasingly non-peers) starts AFTER publication.
The falsification approach also implies that a scientist who wrote papers that were disproved later does not get morally discounted. Therefore the number of publications and citations that creates his or her track record does not consider whether contributions were falsified later. Scientists get morally discounted for misbehavior (violating the ethos and norms of a discipline) and the sanctions are severe. It is of course a problem if such norms have value or political dimensions. That happens more often than one might think and it needs to be addressed urgently.
Classical peer review does not check how research results were produced (context of discovery). And it is not supposed to check on the moral integrity of the researcher! It relies entirely on the text. Formal peer review is basically a consistency check against existing papers (and not replication). The classical scientific paper should be a manual for replication. This was not a problem in little science, where all peers of a field work in similar laboratories, have the same training, and where research is cheap etc.
The troubles started with modern large-scale research. The ability to replicate disappeared overnight. Money, machinery, and manpower were available only to a few. For this reason, the demand for data publication became prevalent. Especially in the US, the requirements to put data in the public domain are pretty high as long as the data is entirely generated with taxpayers' money. It is different in Europe, where rules of data distribution are still in a nascent state. In addition, European weather services are mostly commercial institutions that sell products to customers (the defenders of the free market have to consider this conflict of the market with scientific principles).
But even if a reviewer has access to data, he or she would still lack money, machinery, and manpower. And of course, the motivation for replication attempts from the inside is quite low (maybe funding agencies should set aside a certain percentage for replication and reduce the pressure to produce spectacular new results).
I agree, many climate researchers did not take these structural problems seriously enough. They stuck to the classical view of science (because it provided legitimization in the policy process) and handled the occurring problems arbitrarily.
Of course, Jones and his team are probably not able to handle all data requests. However, they did not create defensible guidelines. What we need are standardized procedures on which scientists and the interested public can agree, without discouraging scientific advancement and creativity (including replication and falsification attempts).
@83 wrote
Of course, Jones and his team are probably not able to handle all data requests.
I don't understand why they couldn't handle the data requests. I assume the data requests were only for the data needed for replication of the results for their paper. It should not have been difficult at the time they made the final run of their program to get scripts, data and programs in one place and create a tarball (archive of scripts, data, programs). A year or two later it might have been difficult without a versioning system. It was just bad software practice, IMHO, not to be able to reproduce the data.
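The kind of tarball described above could be produced along these lines. This is only a sketch under assumptions: it supposes everything needed for the final run (scripts, data, programs) lives under one directory, and the function names and the MANIFEST.sha256 file name are invented here for illustration, not anything CRU used.

```python
import hashlib
import tarfile
from pathlib import Path
from typing import List

MANIFEST_NAME = "MANIFEST.sha256"  # hypothetical file name


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_run(project_dir: Path, tarball: Path) -> List[str]:
    """Write a checksum manifest into project_dir, then pack it as .tar.gz.

    The tarball path should lie outside project_dir so the archive
    does not try to include itself.
    """
    files = sorted(
        p for p in project_dir.rglob("*")
        if p.is_file() and p.name != MANIFEST_NAME
    )
    lines = [
        f"{sha256_of(p)}  {p.relative_to(project_dir).as_posix()}" for p in files
    ]
    (project_dir / MANIFEST_NAME).write_text("\n".join(lines) + "\n")
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(project_dir, arcname=project_dir.name)
    return lines
```

Calling something like `archive_run(Path("paper_run"), Path("paper_run.tar.gz"))` at the time of the final run would freeze exactly what produced the published numbers; the manifest lets anyone check later that the files are unchanged.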
klee12
@Anonymous:
It will depend on what "data" you are referring to. There was a FOI request for all communications regarding AR4: plenty of searching required to fulfill that. MANY requests for specific agreements between NMSs and CRU on data sharing policy (problem: many were oral and/or really old). There were also requests for raw data, where the problem was related to confidentiality and CRU not holding the actual raw data. You'd have to extract that data from elsewhere, although the FOI request should actually be denied (data not held). Yet other data requests *were* upheld, only to be followed by the next request (for 'help'), and the next, and the next, and the next.
With the exception of some data requests for the paleoclimatic reconstructions, none of the requests were for replication of results from certain papers. And those requests were upheld! (regardless of McIntyre's complaints about not getting the data from Briffa: he actually got the data from Hantemirov, the actual owner of the data).
Marco wrote <>
Marco is correct, some FOI requests might require lots of time. However, IMHO, replication of results should not take very much time if scientists thought ahead and followed good software practices.
By the way, according to Climate Audit it seems that sometimes the confidentiality arguments were bogus. If that is correct, it does not reflect well on some scientists.
klee12
@klee12:
I urge you to read the documents from the SMHI:
First says: "No, don't publish our data". Then Phil Jones and Acton testify. Three days AFTER the Swedes say "well, OK, publish anyway, but with disclaimer!".
Those that refer to the links to raw data provided by the SMHI should read the license agreement. Especially paragraphs 3.2 and 4.1. The latter cannot be more specific: "you can NOT redistribute our data". Period!
Marco, I have a different interpretation of the exchange of documents.
There are two different files in question, the Swedish raw data file and CRU's homogenized file.
A letter written by John Hirst of the MetOffice on Nov 30, 2009 forwards a request by Phil Jones for permission to release homogenized Swedish data (see bottom of this document).
In reply, the Swedish authorities did not want the data to be released on a UK web site because it would differ from the raw data they hold. The requested data will be available for non-commercial purposes on a web site they are developing.
In a letter from the Swedish authorities on March 4, 2010, the Swedish authorities said their letter to the MetOffice may have been misinterpreted. They said "It has never been our intention to withhold any data but we feel that it is paramount that data that has undergone, for instance, homogenisation by anyone other than SMHI is not presented as SMHI data. We see no problem with publication of the data set together with a reference stating that the data included in the dataset is based on observations made by SMHI but it has undergone processing made by your research unit. We would also prefer a link to SMHI or to our web site where the original data can be obtained." (my emphasis).
I interpreted the above to mean that the Swedish authorities never opposed the use of their data for non-commercial uses. They request that the original raw data not be confused with homogenized data. If my interpretation is correct, then there was never any difficulty in sharing the data, at least as far as the Swedish data sets were concerned.
I was unable to read the Swedish License agreement, but one of the letters mentioned the raw data was available for non-commercial uses. Of course the license agreement could be in conflict of my reading of the letters.
klee12
@klee12:
The license agreement is indeed in conflict with your reading. It sadly is only available in Swedish, but I think Hans von Storch (and Tobias W?) can both affirm whether my translations below are accurate:
"3.2 The Licensee owns no right to use the data or products provided under this agreement for commercial purposes and not for development or production of meteorological, hydrological and oceanographic value-added services. The licensee does not own nor is authorized to redistribute, sell, assign or otherwise transfer data products or documentation without further processing to third parties unless the parties have received written permission from SMHI."
and
"4.1 The Licensee does not own the right to disclose, send onwards, link to or in any other way spread the contents of the data and/or products that has been received in accordance with this agreement to a third party."
The SMHI is not the only organisation to have such license agreements. Apparently, the excuse they gave to a Swedish MoP was "everyone does it like this"...
marco:
I can't seem to find the text that you've translated, do you have a direct link? The point in this matter, the way I see it, is that the CRU hadn't had any intention before of trying to make their data "transparent". All the rest in this particular instance really is a "tempest in someone's teapot", as far as I'm concerned.
Say what you will of Phil Jones, but it was there in black and white: if it's data homogenized by anyone else than us, then no you can't, and we will ourselves be making the raw data available, thank you very much! If the CRU doesn't have anything but "value added" data, then they were not allowed to publish, and even if they did, SMHI would be doing it themselves. For once, move along, there's nothing to see here!!!
@marco:
There seems to be a conflict between the letters I referenced and the terms of the License, which I assume covers use of the raw data. Here is my attempt to resolve the conflict.
My interpretation of Section 3.2 is that the Licensee may use the raw data for non-commercial purposes. The licensee cannot transfer the data unless the licensee performs "further processing" or unless they obtain permission from SMHI. That seems to me to be a double negative, meaning (to me) that they can transfer processed or homogenized data.
My interpretation of Section 4.1 is as follows. In order to enforce the restriction of 3.2 on the use of the raw data, SMHI wants everyone to go to their site and agree to the License terms. Then they can get the data. Thus you cannot disseminate the raw data, since the person you disseminated the data to may not have seen the License and may inadvertently use the raw data for commercial purposes.
If that is the intent of the License agreement, and if that policy has been in effect for at least five years, then I think nothing would have prevented Prof. Jones from providing data from the Swedish site to whoever asks for it. All he has to do is to provide the homogenized data on his web site and then tell the enquirer where to get the raw data.
Another point is that the letters I referenced were apparently written after November 2009, and there have been FOI requests since, I think, 2006. So there seems to have been no hurry to resolve the issue.
klee12
@Tobias W.
The license information is here:
http://data.smhi.se/met/climate/time_series/html/essential20.html
@klee12:
regarding the letters: CRU already provided *their own* data, but not the raw data. In essence, they are not the data depository, and the SMHI in essence agrees to that point (or rather, enforces that point: you are not to be the depository). That it took a long time for CRU to ask whether they could be should be seen in the larger perspective: attempts to get Jones to do all the hard work, with the sole desire to find any flaw, regardless of magnitude, and then pounce on it in public.
It's like the police asking any person to find anything and everything that might incriminate him. And if he once had a car with a broken lamp, made a picture of that, hang him in public for a crime. I think most, if not all, people would tell the police to do their own investigative work.
marco:
The translation is spot on. Is that you doing the work yourself, or has Google Translate become that accurate :-)???
About your metaphor of "investigative work", I think you're a bit off, because if it is not possible to get the information needed to investigate, then something is wrong. That isn't necessarily the CRU's fault though, which is apparent in this case...
@Tobias W:
I know a bit of Swedish (being nearly fluent in Danish), although I did use translations provided by others (Max Andersson for the second one). I just corrected the mistakes :-)
Regarding issue 2: I was trying to be a bit provocative (not necessarily towards you!), pointing to Jones' reply to Warwick Hughes. Hughes *could* have done everything Jones did: contact the NMSs, ask for their data (which may have meant payment) and then redo the analysis. It is now slowly getting a bit easier, with some NMSs putting free data on their websites. But note the time period of the freely available data... it isn't very up-to-date, and lacks old data!
@marco wrote attempts to get Jones to do all the hard work, with the sole desire to find any flaw, regardless of magnitude, and then pounce on it in public. I guess this is the crux of dispute re Jones.
From my point of view, reproducibility of results is the bedrock of science. Phil Jones has an obligation as a scientist to provide data and code to others so that others can reproduce the results. Only that and nothing more. No user's manual, no special documentation of code. There is a saying in software science that the code is the documentation. Programmers like to program and will fix bugs, but they don't like to do documentation, so often a bug is fixed but the documentation is not changed to reflect the change. Having documentation within the code helps but is not necessary. I have ported many programs without much documentation. And what happens when the results of the author and the person trying to reproduce the results differ? Whose code is correct?
The program needs data, so the data must be provided. I don't see why homogenized data files cannot be provided with metadata (i.e. the WMO numbers of stations, location of weather stations, etc.) and a pointer to where the raw data resides. A copy of the data seems important since over time the homogenized data may change.
Before submitting a paper to a journal, the author of the paper should run all programs from scratch, creating all intermediate files. This can be viewed as a final check to see that you have not introduced errors when trying to fix other errors, a common occurrence. If everything works, the author can archive all programs and the data (preferably with version numbers).
That does not seem onerous. If the author is asked to do more, to provide data unrelated to the reproduction of the results of his/her paper that would be onerous. The skeptics say all they want is the program, data, and meta data to reproduce the results; the other side seems to say the skeptics are asking for much more. Who is right?
In my post #73 on this thread I suggested, based on the HARRY_READ_ME file, that Prof. Jones did not honor requests for data because (1) he couldn't, because the data files were in a chaotic state, and (2) there are indications in some code files that there was fudging. That still seems reasonable.
klee12
Openness and reproducibility in climate science are good things. But these should be designed proactively. Accusing scientists retroactively does not make progress. Also, it is a pity that many people consider these concepts in the frame created by Steve McIntyre based on his limited experience in statistical analysis of climate data and personal communications. More social-scientific analyses of the activities of climate scientists are needed. Perhaps we need perspectives of such persons as Paul Edwards (see http://pne.people.si.umich.edu/ ).
In post 96, Kooiti Masuda wrote Openness and reproducibility in climate science are good things. But these should be designed proactively. Accusing scientists retroactively does not make progress.
But reproducibility of results is a bedrock of science. What are we to do if we can't reproduce a result? Accept it on faith? If any discipline says results should be accepted even if not reproducible, then it's not really science IMHO. It is fine to advance a hypothesis without evidence, but if one puts forth empirical results in support of one's hypothesis, that evidence should be reproducible.
klee12
klee12/97 - Speculation should certainly be allowed in scientific publication - in the good old days, when hardly anybody spoke about global warming, the then chief-editor of Monthly Weather Review Chester Newton purportedly used the rule that "25% of speculation is allowed in a paper, as long as it is labelled as such".
Allow me some advertisement: these good old times are described in some of the interviews I prepared with people who are now in their late 70s or 80s (or sadly have passed away), among them Harry van Loon. See my Interview web-page.
Some computer scientists expect that they can devise a way to enhance reproducibility of scientific works by recording the work flow of scientists (as far as they work on a computer system). An attempt (already finished) is introduced at http://www.earthsystemcurator.org/projects/workflow.shtml . I cannot tell whether this shows a direction which climate research institutions should take or not.
The same project (Earth System Curator) has a social scientist, Paul Edwards, as a member, and they want him to analyze how climate modelers work. (But I have not yet heard his specific remarks.)
(Incidentally, I learned about these when I came to Hamburg last October, for the "GO-ESSP" meeting held at ZMAW.)
Kooiti MASUDA/98 refers to a way to enhance reproducibility of scientific works by recording the work flow of scientists (as far as they work on a computer system). I looked at the site referenced.
The software system seems cumbersome for reproducing results that are submitted to journals, and I can't see any mention of keeping track of or archiving versions. Software engineering is simple: it says think ahead, anticipate problems and take steps today to avoid wasting a lot of time later. I know that I can get away with sloppiness for small projects. I know from experience that I can't on large projects. Here's what I would do for large projects.
I would put programs and data in my own directory, organized by subdirectories of course. Disk space is cheap; your time isn't. With your own copy of the files you don't have to worry about access or unexpected changes in the contents. Every month I would archive them (create a tarball). The purpose of the monthly archive is that if a disaster happened, such as accidentally erasing a subdirectory (it has happened to me), I could recover from a previous archive. I lose at most one month's work. Also, if I wanted to go back to a previous version, I could go back to last month's or last year's version. The time stamp of the archive serves as a version number for my private archive.
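A minimal Python sketch of the monthly tarball described above. The function name `archive_project` and the file naming scheme are assumptions made for illustration; any tool that produces a dated tarball would serve equally well.

```python
import tarfile
import time
from pathlib import Path


def archive_project(project_dir, archive_dir, stamp=None):
    """Write a gzip-compressed tarball of project_dir into archive_dir.

    archive_dir should live outside project_dir, otherwise the archive
    would try to include itself. The name and layout are illustrative.
    """
    project = Path(project_dir)
    out_dir = Path(archive_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # The time stamp doubles as the version number of the private archive.
    stamp = stamp or time.strftime("%Y-%m-%d")
    target = out_dir / f"{project.name}-{stamp}.tar.gz"
    with tarfile.open(target, "w:gz") as tar:
        # arcname keeps paths inside the tarball relative to the project.
        tar.add(project, arcname=project.name)
    return target
```

Run once a month (by hand or from a scheduler), this bounds the loss from an accidental deletion to one month of work.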
In the directory of my programs I create a subdirectory called Save. I might copy files to this directory before modifying a file. If I goof up and want to undo the changes I have a day old version of the file.
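The Save subdirectory habit can be sketched in a few lines of Python. The helper name `save_before_edit` is hypothetical; only the directory name Save comes from the description above.

```python
import shutil
from pathlib import Path


def save_before_edit(path, save_name="Save"):
    """Copy a file into a Save subdirectory next to it before editing."""
    src = Path(path)
    save_dir = src.parent / save_name
    save_dir.mkdir(exist_ok=True)
    # copy2 also preserves the modification time of the original file.
    return Path(shutil.copy2(src, save_dir / src.name))
```

Note that this keeps only one saved copy per file name: saving again overwrites the earlier copy, which matches the idea of a quick undo rather than a full history.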
I would use scripts or a makefile to automate compiling and running programs. This is a big time saver.
When I am ready to publish results I would do the following, in this order:
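As one possible form of such a script, here is a small Python driver that runs a fixed list of commands in order and stops at the first failure. The function name `run_steps` and the commands in the usage comment are placeholders, not anything from an actual project.

```python
import subprocess


def run_steps(steps):
    """Run each command in order; stop at the first one that fails.

    Each step is a list such as ["make", "all"]. Returns the number of
    steps that completed.
    """
    done = 0
    for cmd in steps:
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError on a nonzero exit status,
        # so a failed compilation never silently feeds a later step.
        subprocess.run(cmd, check=True)
        done += 1
    return done


# Hypothetical usage:
#   run_steps([["make", "all"], ["./analysis", "input.dat"]])
```

Because every command is written down in the script, nothing depends on remembering what was typed by hand.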
1. Archive everything
2. Delete unnecessary files
3. Get fresh copies of all data files
4. Assign a version number to every file. The version should be inserted into the files as a comment if possible.
5. Run all programs, including compilation of the programs using a script or a Makefile. No author intervention is allowed since someone trying to reproduce your results may not be able to.
6. Examine output.
7. If some data files are proprietary delete those files and add a README file that tells others where the data file was obtained and the date the data file was fetched. Add information about special programs or libraries.
8. Archive the directory.
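Step 7 of the list above could be sketched in Python as follows. The function name `describe_removed_data`, the README wording, and the shape of the `sources` mapping are all assumptions made for illustration.

```python
from pathlib import Path


def describe_removed_data(data_dir, sources):
    """Delete proprietary data files and record their origin in a README.

    sources maps a file name to a pair (where it was obtained, date fetched).
    """
    data = Path(data_dir)
    lines = ["Proprietary data files removed from this archive:", ""]
    for name, (origin, fetched) in sorted(sources.items()):
        target = data / name
        if target.exists():
            target.unlink()
        lines.append(f"{name}: obtained from {origin}, fetched {fetched}")
    readme = data / "README"
    readme.write_text("\n".join(lines) + "\n")
    return readme
```

Someone trying to reproduce the results then knows which files are missing, where to request them, and which dated copy was used.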
In Open Source projects they use an elaborate versioning system (CVS). That process was designed for cases (1) when you had many different authors throughout the world working on the same project and you didn't want inconsistencies to arise and (2) you needed different versions for different machines. (2) should not arise in scientific programming. Keep it simple; don't use CVS unless it's needed.
A small part of the versioning system might be useful in maintaining a temperature file. The data might change frequently for various reasons. There is a program that can record the changes between two text files in a diff file, and another program that can apply the diff file to create the second file from the first. Therefore it is easy to maintain a base file, apply the first set of changes to get the 2nd version, and apply the changes to the 2nd version to get the third version. Scientists might want to look at the changes rather than the files. I don't think you need this unless you're in charge of archiving data.
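The two programs meant here are presumably the Unix tools diff and patch. The same base-plus-changes idea can be sketched in Python with the standard `difflib` module. The delta format below (a list of replaced line ranges) and the names `make_delta` and `apply_delta` are inventions for illustration, not the patch file format.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Record only the changes needed to turn old_lines into new_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Replace old_lines[i1:i2] with these new lines (possibly none).
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer version from the older one plus its delta."""
    result = []
    pos = 0
    for i1, i2, replacement in delta:
        result.extend(old_lines[pos:i1])  # unchanged lines before the edit
        result.extend(replacement)        # lines that were changed or added
        pos = i2                          # skip the old lines being replaced
    result.extend(old_lines[pos:])
    return result
```

Keeping the base file plus one delta per revision lets any version be rebuilt by applying the deltas in order, and the deltas themselves show exactly which lines changed.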
I am sure what happened at CRU wouldn't have happened if they did something similar to what I am proposing.
klee12