tag:blogger.com,1999:blog-8216971263350849959.post2954223123988730563..comments2023-08-07T16:41:49.660+02:00Comments on Die Klimazwiebel: Trendy trendseduardohttp://www.blogger.com/profile/17725131974182980651noreply@blogger.comBlogger14125tag:blogger.com,1999:blog-8216971263350849959.post-55880375147481228812012-07-17T12:16:40.734+02:002012-07-17T12:16:40.734+02:00HvS, #6
Another trick is to add complexity in the ...HvS, #6<br /><i>Another trick is to add complexity in the statistical method, in the hope that more simple minded people would not understand such methods and trust that the complexity would add reliability of the result. In general this is not the case. </i><br /><br />Such evil tricks might actually happen, and I am certainly among the "simple minded people" who do not understand purposely obfuscated methods (but I am not so simple minded as to automatically assume they work). But obfuscated methods that do not work will have no impact in the long run. What worries me more is the opposite, and you actually point to a nice example of this attitude: A valid statistical criticism (<a href="http://www.sciencemag.org/content/317/5846/1866.3.abstract" rel="nofollow">Schmith et al. (2007)</a>) of a <a href="http://www.sciencemag.org/content/315/5810/368.abstract" rel="nofollow">study that is flawed by an oversimplistic statistical approach</a> is <a href="http://www.sciencemag.org/content/317/5846/1866.4.abstract" rel="nofollow">hand-wavingly discarded</a>.hvwnoreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-66116564529159427272012-07-17T11:52:05.359+02:002012-07-17T11:52:05.359+02:00OBothe, eduardo
thanks for the pointers and exten...OBothe, eduardo<br /><br />Thanks for the pointers and extensive comments. Actually, I find Tingley et al. (2012) quite informative. They give a nice overview of what is out there, from a Bayesian perspective, which makes it conceptually simpler. The paper is not at all rigorous or deep, but that makes it a very accessible, easy read.<br /><br />Some article-comment-response exchanges are illuminating on a meta-level and nicely illustrate how people with a classical (frequentist) background can misunderstand those who try to advance a modern (Bayesian) approach (Christiansen's (2012) LOC, Tingley's (2012) comment and Christiansen's reply).<br /><br />The take-away for me is:<br />1) Damn hard problem<br /><br />2) The state of the art, performing at least as well as everything else in the majority of evaluations and currently practically applicable, is <a href="http://web.gps.caltech.edu/~tapio/imputation/" rel="nofollow">RegEM (Schneider 2001)</a><br /><br />3) This by no means implies that RegEM is "good enough", on the contrary.<br /><br />4) The way forward is Bayesian Hierarchical Models (BHMs), because only this framework allows for a clear inclusion of all information available (e.g. the spatiotemporal covariance structure of both predictand and predictor) and proper uncertainty propagation. Most importantly, and referring to eduardo's last paragraph, BHMs can easily (well, <i>conceptually</i> easily) incorporate detailed, complicated (aka realistic) process-level models, and this to me seems the proper way to constrain the uncertainty of proxy reconstructions. 
This is in contrast to simulating uncertainty assessment through GCM-driven pseudo-proxies, which are constructed so that they behave in a way that is doable in the chosen statistical framework (aka linear) but don't incorporate (and possibly even contradict) the empirical process knowledge available.<br /><br />5) To actually do 4) you need Bayesian statisticians, paleo-people, climatologists, a good scientific programmer and a huge cluster, all working nicely together. But I am pretty sure that projects in that direction are underway.hvwnoreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-51508897816517273542012-07-15T00:57:56.885+02:002012-07-15T00:57:56.885+02:00@3 hvv,
well, that link was included at some poin...@3 hvv,<br /><br />well, that link was included at some point later. I remember posting a comment there to make the realclimate readers aware of the existence of the response, but it was 'moderated'. Anyway, this is not really important now, after the years that have passed.<br /><br />With the perspective of these few years, and independent of the issue of detrending, the problem of the underestimation of the variance has been confirmed by many other studies. A nice review was written by Smerdon just a few months ago (linked in comment 5), which, if I remember correctly, does not include the recent application of Bayesian methods. From the results that we are getting in other projects, still unpublished, I would say that the Bayesian Hierarchical methods still suffer from this underestimation. A previous attempt with Bayesian methods, including not only proxy information but also information about the external forcing, was published by <a href="http://www.springerlink.com/content/wj52683504977744/" rel="nofollow">Lee and others</a>.<br /><br />As Hans explained before, this can be a fundamental property of a large family of statistical models. <br /><br />I would mention that other methods, based on local calibration of one proxy record with one instrumental temperature record using inverse regression (also known as classical calibration: the predictor is the instrumental variable, the predictand is the proxy), show promising results. Bo Christiansen blogged about this <a href="http://klimazwiebel.blogspot.de/2010/05/guest-post-by-bo-christiansen-on.html" rel="nofollow">here in the Klimazwiebel</a> some time ago.<br /><br />Leaving the low-frequency signal in would be, in my opinion, justified if we were completely sure that the proxy is reacting to climate and we just wished to calibrate the proxy as accurately as possible. Unfortunately, this is not the case. 
There are many proxy records around, though by no means all, that simply do not contain any climate signal. They have sometimes been interpreted as a temperature signal, then later as a precipitation signal, later still as a mixed signal that flips in certain periods, etc. In other cases, for instance stalagmites, records from the same cave look quite different, and the experts here claim that you need a very developed mechanistic knowledge of the proxy to identify the best locations within a single cave. In the Mann et al. (1998) study, some precipitation records were clearly wrongly interpreted as containing a temperature signal, something that was much criticized in the paleo community at that time.<br /><br />I would say that most of us still have a lot to learn from professionals, but they don't show much interest, with some exceptions. One initiative was the <a href="http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=16&ved=0CGYQFjAFOAo&url=http%3A%2F%2Fwww.pages-igbp.org%2Fdownload%2Fdocs%2FBayesian_2011-2(78-79).pdf&ei=0eoBUImuDa7S4QS6zfX2Bw&usg=AFQjCNEoKMNloDNtuA3Ha2tVJYcmGJICjg&sig2=V3Z-EAnjgpg7C_zRoTM2aw" rel="nofollow">workshop organized last year</a>.eduardohttps://www.blogger.com/profile/17725131974182980651noreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-51776384942329237892012-07-14T23:24:32.518+02:002012-07-14T23:24:32.518+02:00@ 2
Wallacer,
the issue with the trends is actual...@ 2<br /><br />Wallacer,<br />the issue with the trends is actually a step prior to the 'correlation is not causation' meme. I would rather describe it as 'common trends are not correlation'. All series with a long-term trend appear correlated, but when tested properly that correlation is not statistically significant. The number of degrees of freedom is much less than the number of time steps.eduardohttps://www.blogger.com/profile/17725131974182980651noreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-648301922960412972012-07-14T23:03:28.479+02:002012-07-14T23:03:28.479+02:00@ Eduardo
Thanks for your interesting contributio...@ Eduardo<br /><br />Thanks for your interesting contribution. I wish you had told us more about your encounter with E. Wahl.<br /><br />PS:<br />I've just read an excellent article and learned a lot about the MWP and LIA and their relevance for climate projections.<br />http://www.st-andrews.ac.uk/~rjsw/all%20pdfs/Franketal2010.pdf<br />Thanks!<br /><br />AndreasAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-20898588006451960572012-07-14T22:55:05.479+02:002012-07-14T22:55:05.479+02:00@ hvw
what is the state-of-the art for this probl...@ hvw<br /><br /><em>what is the state-of-the art for this problem?</em><br /><br />Maybe ensemble reconstructions, as described here:<br />http://www.climate.unibe.ch/~joos/papers/frank10nat.pdf<br /><br />The idea is that when you have no chance of finding the "best" reconstruction, you can still obtain valuable information about uncertainties by using several methods and creating an ensemble of reconstructions.<br /><br />AndreasAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-84698603611601167902012-07-13T18:57:49.195+02:002012-07-13T18:57:49.195+02:00Forgot two points:
a) the link between f and p, pr...Forgot two points:<br />a) the link between p and f, proxy and geophysical data, may not be stationary.<br />b) Correlations in this business are often about 0.7 or less, corresponding to 1/2 or less of the variance being explained; that is, 1/2 or more of the variance remains "unexplained".Hans von Storchhttps://www.blogger.com/profile/08778028673130006646noreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-42583556145116891132012-07-13T16:16:28.478+02:002012-07-13T16:16:28.478+02:00O Bothe,
let us assume the p_t is a series of pro...O Bothe,<br /><br />let us assume that p_t is a series of proxy data, and f_t the geophysical variable of interest. Let us further assume that p_t and f_t are stationary random variables, which with respect to p_t is a nontrivial assumption (without it, statistical analysis makes little sense; one can weaken this assumption by going to quasi-stationarity or other complex constellations, but I have never seen this done).<br /><br />When building a statistical link, you assume that you learn something from the joint variability of the pairs (p_t, f_t). To do so, you must have several, or even better many, samples of (p_t, f_t). Also, you should know how often a new pair tells you something NEW about the joint generating process. That is, how often is (p_{t+1}, f_{t+1}) essentially the same constellation that was already described by (p_t, f_t)? In particular, you do not want to see unrelated trends in both variables. Unfortunately, statistics can hardly tell you whether the trends are related or not, unless you resort to the difference-stationary time series analysis methods known from econometrics (Schmith, T., S. Johansen, and P. Thejll, 2007: Comment on “A Semi-Empirical Approach to Projecting Future Sea-Level Rise”, Science, 10.1126/science.1143286). Thus, what matters is the number of independent sample pairs (the degrees of freedom); the assumptions about the sampling process are a key element (whenever statements about the reality of a link between the variations are made).<br /><br />Now, let's write p=p*+p' and f=f*+f', with p* being the archive for variations in f, and f* the archive for variations in p. [Symmetry here, because both forward and inverted regression are in use.] In the case of a forward regression, it would be p = alpha f + random error, so that p* = alpha f, with alpha = Cov(p,f)/Var(f), and Var(p*) = alpha^2 Var(f) = Cov(p,f)^2/Var(f) = Corr(p,f)^2 Var(p). Since the correlation is in all practical situations less than 1 in magnitude, we find Var(p*) < Var(p). 
[The same holds the other way around.] Regardless of whether we use forward or inverted regression, we have Var(p*) ≠ Var(p) and Var(f*) ≠ Var(f). This is obvious, because we have the nonzero contributions p' and f', which are part of p and f but which do not leave traces on the other variable: Cov(p',f) = 0 and Cov(f',p) = 0. (I hope my calculations are complete.)<br /><br />With statistical analysis, 100% of the variance of f, or of p, cannot be recovered by screening p (or f). Some part of the original variability is lost, and lost for good, unless one could recover f' (or p'), which very likely is not just noise. The same applies when more sophisticated links are established, such as neural nets or whatever (methods which in general need many more samples to arrive at reasonably small estimation errors; please check).<br /><br />An often used trick is to employ "inflation", that is, to merely multiply the *-series by a suitable factor so that Var(p*) = Var(p). This implicitly assumes that p' = 0, or Corr(p,f)=1, which is obviously an invalid assumption. All proxies contain variations that are not related to whatever we want to take them as representative of (temperature, precipitation, etc.) but to other influences, such as local environmental changes, ranging from bug contamination to landslides.<br /><br />Another trick is to add complexity in the statistical method, in the hope that more simple minded people would not understand such methods and trust that the complexity would add reliability of the result. In general this is not the case. <br /><br />In short: the problem that proxy reconstructions tell us only part of what happened is an intrinsic property of the approach and cannot be overcome by statistical analysis alone. A possible solution may be process-based modeling using proxy data to constrain the dynamical modeling (cf. data assimilation), but on the other hand: what is lost, is lost. 
Proxies do not tell past states, but part of past states and variations.Hans von Storchhttps://www.blogger.com/profile/08778028673130006646noreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-80521128843812367042012-07-13T14:20:54.170+02:002012-07-13T14:20:54.170+02:00two things.
First: I'll second hvw's que...two things. <br /><br />First: I'll second hvw's question whether there is a recent review. I should know about them, but the only thing that comes to mind is Jason Smerdon's WIREs Climate Change paper on pseudo-proxy work (see <a href="http://dx.doi.org/10.1002/wcc.149" rel="nofollow">here</a> or <a href="http://www.ldeo.columbia.edu/~jsmerdon/Site/Publications.html" rel="nofollow">here</a>). There is the editorial by <a href="http://www.springerlink.com/content/35072300uj442329/" rel="nofollow">Hughes and Ammann</a> and there is Tingley et al.'s <a href="www.sciencedirect.com/science/article/pii/S0277379112000248" rel="nofollow">"Piecing together the past: statistical insights into paleoclimatic reconstructions"</a>, but a "complete" review of the methodologies? Some insights may come from blog posts (lucia, SMcI, JeffID etc.) or the discussions surrounding Bo Christiansen's publications of recent years. <br /><br />Tingley's <a href="ftp://ftp.ncdc.noaa.gov/pub/data/paleo/softlib/barcast/readme-barcast.txt" rel="nofollow"> BARCAST </a> may be more powerful than the regression-based methods. Which leads back again to Smerdon's publication page. <br /><br />My second point: consensus about dissent. Yes, but it's kind of astonishing how quickly the scientific discussion becomes infested by emotions. One only has to look at the most recent spectacle. 
Interestingly, it runs along trenches quite similar to those in Eduardo's description above.OBothenoreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-62731546130281307562012-07-13T12:27:26.161+02:002012-07-13T12:27:26.161+02:00The original questioner (see eduardo's text) wro...The original questioner (see eduardo's text) wrote to me in reaction to eduardo's post:<br /><br />"<i>Many thanks!<br /><br />Why can't this discussion take place without tricks and disinformation???<br /><br />I read many comments by scientists on the topic and often come across the fiercest and often unobjective disputes, which I can only explain by the fact that no unambiguous scientific results exist and that much of it is therefore personal opinion and interpretation. And for this we taxpayers spend tens of billions (CO2 tax, the EEG renewable energy law, etc. etc.)…</i>"<br /><br />I think this should give us all pause: we should try to name the common consensus, and the dissent. That is, the agreement on what we disagree about.Hans von Storchhttps://www.blogger.com/profile/08778028673130006646noreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-78246541828051681992012-07-12T15:25:52.316+02:002012-07-12T15:25:52.316+02:00Eduardo,
will no enter the question of why realcl...Eduardo,<br /><i> will no enter the question of why realclimate linked to Comment published in Science by Wahl et al. (2006) and not to our response, both published side by side.</i><br /><br />In fact, the realclimate article has a link to your response to Wahl et al. in Science, 2006. The reader you referred to just did not read properly.<br /><br />However, I find the realclimate article a bit unfairly slanted, because it blames you for not responding to the critique before the critique was published, though you must have been aware of it anyway. RC seems to justify this extraordinary demand by the assumption that this critique of your paper invalidates its conclusion. Judging from your response, however, you see this differently, and from that perspective there isn't any need, by any standard, to reply to an inconsequential little oversight.<br /><br />I also find your press-release example (unemployment) a bit misleading. It simply illustrates the high-school knowledge of <a href="http://xkcd.com/552/" rel="nofollow"> correlation not implying causation</a>. However, to extrapolate temperature from proxies we need not only causation but also a good idea about the nature of this causation. This, in principle, cannot come from the time series themselves. Fortunately, there is a huge body of knowledge concerning the physical, biological and chemical processes that relate various proxies to temperature (as opposed to the literature pertaining to unemployment as a function of temperature). So your argument for removing low-frequency variation (the linear trend) from the time series before regression, i.e. to avoid inflation of the validation measure due to correlation "by chance", seems misguided. The validation helps to select the best model among candidates; its value in validating the underlying assumptions is very limited indeed. 
Intuitively, it seems right to me to leave the low-frequency signal in the calibration if the low-frequency variation is what we are interested in. Conversely, removing the trend seems to rely on the assumption of scale invariance of the temperature-proxy relationship. <br /><br /><br />What I would like to know: 14 years after pretty much the very first attempt at a multiproxy reconstruction (MBH98), to which your article referred, and six years after the exchange described above, what is the state-of-the art for this problem? The linear regression methods, direct or inverse, with PCA or not, and particularly the results of Bürger & Cubasch (2005), tell me that the statistical methodology taken into consideration at the time was just very much ad hoc and pedestrian. Do we have more powerful methods today and better guidance about which method works in which case? Wouldn't this, for example, be a poster-child problem for Bayesian approaches? Any recommendation for a recent review paper?hvwnoreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-89311351030774457552012-07-11T20:36:26.009+02:002012-07-11T20:36:26.009+02:00Eduardo
An interesting warning re. the very often ...Eduardo<br />An interesting warning re. the very often neglected "Correlation is not causation" meme.<br /><br />I was really shocked to read H. v. Storch and you described as "skeptics". It sadly reminds me of when everyone to the right of the Party was labeled a "fascist" ...<br /><br />Will you publish an entry about "your" latest article? It's making a bit of noise ...wllacerhttps://www.blogger.com/profile/00784957308949919126noreply@blogger.comtag:blogger.com,1999:blog-8216971263350849959.post-17783109811996296202012-07-11T12:36:13.420+02:002012-07-11T12:36:13.420+02:00Thanks Eduardo, that is interesting.
For now, jus...Thanks Eduardo, that is interesting.<br /><br />For now, just a broken link report:<br />The last link to Geophys. Res. Lett. needs to be<br /><br />http://www.agu.org/pubs/crossref/2005/2005GL024155.shtmlhvwnoreply@blogger.com
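
Two statistical points made in the thread can be checked numerically: Hans von Storch's relation that forward regression retains only the fraction Corr(p,f)^2 of Var(p), and eduardo's remark that series sharing nothing but a long-term trend appear correlated. The following is a minimal sketch on synthetic data, not anything the commenters used; the series length, the slope, the noise levels and all function names are assumptions chosen for illustration.

```python
import random

def mean(x):
    return sum(x) / len(x)

def cov(x, y):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def corr(x, y):
    """Pearson correlation coefficient."""
    return cov(x, y) / (cov(x, x) * cov(y, y)) ** 0.5

def detrend(x):
    """Subtract the least-squares linear trend (and the mean) from x."""
    t = list(range(len(x)))
    slope = cov(t, x) / cov(t, t)
    mt, mx = mean(t), mean(x)
    return [xi - mx - slope * (ti - mt) for ti, xi in zip(t, x)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 500

# 1) Variance lost in forward regression: p = alpha * f + p'
f = [rng.gauss(0, 1) for _ in range(n)]            # synthetic "geophysical" series
p = [0.7 * fi + rng.gauss(0, 0.7) for fi in f]     # hypothetical proxy: signal + noise
alpha = cov(p, f) / cov(f, f)
p_star = [alpha * fi for fi in f]                  # the part of p explained by f
explained = cov(p_star, p_star) / cov(p, p)        # algebraically corr(p, f) ** 2

# 2) Two series that share only a linear trend (independent noise)
a = [0.01 * t + rng.gauss(0, 1) for t in range(n)]
b = [0.01 * t + rng.gauss(0, 1) for t in range(n)]
r_raw = corr(a, b)
r_detrended = corr(detrend(a), detrend(b))

print(f"explained variance fraction: {explained:.2f}")
print(f"correlation with trend: {r_raw:.2f}, after detrending: {r_detrended:.2f}")
```

Rescaling `p_star` so that its variance matches that of `p` (the "inflation" discussed in the thread) would push the variance ratio to 1 without recovering anything about the residual p'.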