Die Klimazwiebel: Statistical significance of climate change ensembles?

Saturday, February 9, 2013

Statistical significance of climate change ensembles?

The issue of statistical significance of an ensemble of climate change simulations has been examined in a paper by Hans von Storch and Francis Zwiers: Testing ensembles of climate change scenarios for "statistical significance", now published in Climatic Change 117 (2013): 1-9, DOI: 10.1007/s10584-012-0551-0. The article was published as open access, and a copy can be downloaded from academia.edu. Earlier on-line versions were available.

The applicability of the concept of "statistical significance" is discussed for the case of expected future climate change. The concept is sometimes used sloppily, creating the impression that scenarios are representative of the future (rather than merely possible futures), which is hardly the case. Indeed, apart from a very limited case, the concept is almost never applicable. One reason is that the population of valid scenarios cannot be described, so that a random variable cannot be defined.

When applying the concept of "statistical significance", a (null) hypothesis must be formulated that features a well-defined random variable. Significance then means that the hypothesis may be rejected at a pre-defined risk of erroneous rejection (the significance level), a statement that exploits the random structure of the considered random variable (under the null hypothesis). For further details refer to the textbook "Statistical Analysis in Climate Research" by Hans von Storch and Francis Zwiers.
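
To make the mechanics concrete, here is a minimal sketch of a one-sided sign test, assuming (purely for illustration) that each of n simulations is an independent draw that shows a decrease with probability 0.5 under the null hypothesis. The function name and the example counts are hypothetical and not taken from the paper. The independence assumption is exactly what is usually not justified for ensembles of climate change scenarios.

```python
from math import comb


def sign_test_p_value(k: int, n: int) -> float:
    """One-sided p-value P(X >= k) for X ~ Binomial(n, 0.5).

    Assumption: the n outcomes are independent and, under the null
    hypothesis, each shows a decrease with probability 0.5.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    # Sum the binomial probabilities of k, k+1, ..., n decreases.
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


# Hypothetical example: 10 of 11 simulations show a decrease.
p = sign_test_p_value(10, 11)
print(f"{p:.4f}")  # 0.0059
```

The small p-value is only meaningful if the 11 simulations really are a random sample from a well-defined population. Without that, the number has no interpretation as a rejection risk.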

In the article, different cases of random variables, or populations of events conceptualized as random, are considered:

a) “All climate scenarios”—in which case, we would have to determine what is meant by “all”. This presumably means climate scenarios produced with all conceivable models, all conceivable emission scenarios, and all conceivable downscaling approaches, plus an understanding of how the available 11 scenarios were selected from that broad population. Moreover, this would need qualification. All emission scenarios that are deemed valid by contemporary economists? Or all followers of a certain school of thought? All climate models? Note that in the IPCC Special Report on Emissions Scenarios probabilities or relative likelihoods were not assigned to the various scenarios that were described.

There is likely no way to make an assertion about “all climate scenarios”, because that set is simply not definable. Nevertheless, there have been attempts to quantify uncertainty from models, forcing scenarios and downscaling, for example, using complex hierarchical Bayesian models, and serious thought has been given to the basis for the interpretation of statistics calculated from multimodel ensembles.

b) “Climate scenarios based on a specific emissions scenario”—in which case we would want to make statements that are specific to individual emission scenarios. This is the approach that was used by the Intergovernmental Panel on Climate Change (IPCC) in its 4th Assessment Report, in which it assessed a likely range for global projections of future change under each of several different emissions scenarios. The IPCC was also confronted by the question of the interpretation of the available ensemble of models, and thus the assessments of global projections included an aspect of expert judgement that drew on physical understanding as well as the spread of available ensembles of climate change simulations.

c) “Climate scenarios based on a specific emission scenario and produced with a restricted class of models”—this is closer to being a tractable problem if the class of models is sufficiently restricted, albeit still a very large problem. For example, one might consider models that share the same code, but in which parameter settings have been varied systematically using a well designed sampling scheme, such as is used in the climateprediction.net-project.

d) “All available climate scenarios”—in which case this is no longer a problem of statistical inference because the population is completely known. Therefore we can simply state: summer rainfall decreases in most locations in all n = 11 cases, with some exceptions at a few locations in m = 3 or four cases. No attempt is made to quantify statistically how additional, so far unknown scenarios, would play out. Nevertheless, the lessons learned from the responses to forcing simulated by a limited collection of physically based climate models under a limited range of forcing scenarios, together with our understanding of physics and how changes in the composition of the atmosphere affect the planet’s energy balance, should provide considerable insight about the general nature of the responses that can reasonably be expected from other physically based climate models and under other forcing scenarios.
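
A purely descriptive summary of this kind can be sketched as follows. The data are invented toy values (4 scenarios, 3 locations), and the function name is hypothetical. The point is only that the statement is a count over the available scenarios, with no inference about unseen ones.

```python
def summarize_sign_agreement(changes):
    """Count, per location, how many scenarios show a rainfall decrease.

    changes[s][g] is the simulated change in scenario s at location g.
    Returns the per-location counts and the number of locations at
    which every available scenario shows a decrease.
    """
    n_scenarios = len(changes)
    n_locations = len(changes[0])
    decrease_counts = [
        sum(1 for s in range(n_scenarios) if changes[s][g] < 0)
        for g in range(n_locations)
    ]
    unanimous = sum(1 for c in decrease_counts if c == n_scenarios)
    return decrease_counts, unanimous


# Hypothetical toy data, not taken from any real simulation.
changes = [
    [-1.2, -0.4, 0.3],
    [-0.8, -0.1, -0.2],
    [-1.5, -0.6, 0.1],
    [-0.9, -0.3, -0.5],
]
counts, unanimous = summarize_sign_agreement(changes)
print(counts)  # [4, 4, 2]
print(f"decrease in all {len(changes)} scenarios at {unanimous} of {len(counts)} locations")
```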

7 comments:

hvw, February 12, 2013 at 11:05 AM
True enough, albeit the presentation appears a bit limited due to a strictly "frequentist" or even "Fisherian" point of view.
Hans von Storch, February 12, 2013 at 9:25 PM
Would the analysis be different if we adopted a Bayesian viewpoint? We would still need to define random variables, and ways to sample the population from all over the event space?
hvw, February 14, 2013 at 12:01 PM
I believe your paper would read differently and convey a somewhat different message. Of course, the lack of information you point out doesn't magically go away by adopting a Bayesian point of view.

First of all, I would argue that one should discard "significance tests" in our context altogether. You are well aware of the criticisms of that concept that have been published for 50 years or so, as you likely are, as a teacher, of the common misconceptions. From personal communication I am led to believe that the vast majority of scientists agree, and only the reviewers are holding back the abandonment of this relic ;). This is Klimazwiebel and we care for proper, "honest" communication of uncertainties to the statistically less-educated. So shouldn't it be judged deliberate misinformation to talk about statistical significance, which we can be sure will be taken for something completely different by the general public? (And which, in its technical meaning, more often than not has no relevance to the actual question.) Look at blog arguments about anything climate related: almost any argument involving "but this is statistically (not) significant" is totally meaningless.

Regarding your case a), not having read a paper that attempts to quantify uncertainty from models, forcing scenarios ..., I would consider that nonsense, a priori, because if we reasonably could assign probabilities to emission scenarios we would not talk about scenarios and projections but about boundary conditions and predictions. I guess you agree?

Your cases b) and c) are where the action is, currently, I believe. I am far from wanting (or being qualified) to reenact the Frequentist vs. Bayesian debate. It appears, however, that for wringing high-dimensional information from multiple GCMs/RCMs and possibly incorporating "soft" expert knowledge, adopting a Bayesian conceptualization is the way to go. Currently the state of the art is obviously a mess, but a pure frequentist has no choice but to just give up. A nice feature of Bayesian approaches is also that assumptions are out in the open for discussion, as they are mostly contained in the choice of the prior, as opposed to classical strategies, where assumptions are frequently borrowed deep within and easily overlooked.

hvw, February 14, 2013 at 1:17 PM
grmptzcx
"borrowed" -> "buried"
Anonymous, February 17, 2013 at 8:26 PM
Hans von Storch,

You asked for feedback. Yes, I have read your paper too.

I think this is about communicating scientific results to the public. Scientists surely always have the limits of models and of their statements in the back of their minds. For me as a layperson it was in any case interesting and instructive; I almost felt caught out over my own sometimes sloppy wording. For your colleagues in science it is, as I said, probably more a matter of communication.

Your suggestion
"Using n scenarios constructed with the models A, B, ..., emissions scenarios S1, S2,..., and so on, we find rainfall amounts decrease in most grid boxes for all scenarios, and that in the remaining few grid boxes, they decrease in most (72,3%) but not all scenarios."
sounds reasonable.

The harder question is how a science journalist should word ensemble forecasts. For a journalist the suggestion would surely be too unwieldy, and once again the question arises of how to strike the right balance between exactness and necessary simplification.

Perhaps "The majority of the models used show, for northern Germany, ..."?

I found an interesting blog post on this topic by T. Edwards: http://allmodelsarewrong.com/many-dimensions-to-life-and-science/

One more thought: precipitation is surely considerably harder to model than temperature. The toughest nut, in my opinion, is that the model results are not independent of one another, whereas independence was assumed in your example of the sign test with B(n; 0.5). In the extreme case, the minority statements of a few models are even the best ones, simply because those models represent precipitation better.

PS:
Why did I keep silent before? Well, my first impression was "interesting", "nice". Agreement and praise from a layperson like me seem rather meaningless to me; normally I prefer to ask about what I have not understood. But perhaps some praise now after all, since this is about communication: the paper was very well written and very easy to understand, even for laypeople like me.

Best regards,
Andreas

Günter Heß, February 17, 2013 at 9:09 PM
Dear Mr. von Storch,

how do I download it without a Facebook account?

Regards,
Günter Heß
Hans von Storch, February 17, 2013 at 9:51 PM
Sorry, a misunderstanding: I did not mean to demand a reaction here; that would be quite inappropriate. But I was curious why there was no reaction on the one hand, yet clear interest in the manuscript on the other.
So no offense meant.

Thanks to Andreas for the kind words; and "all available model simulations point to increased precipitation / 70% of the available ..." or something similar would certainly be an appropriate wording.
Thanks to Günter Heß. I believe you can also get access with a Google account; if that does not work at all, please email me and I will then, as an exception, provide another download location.