One motivation for the SOS4R plugin is reproducible research – and what data could be more appropriate than data on the question of climate change? I am certainly not saying that all problems (“Climategate”) could be solved by making analysis simpler just because data is easily accessible… but hopefully some!

My two use cases (excerpt from my application), which I will base on temperature data, are:

First, a researcher wants to use temporally ordered univariate data to create a forecast based on an ARIMA model. She uses the package forecast to achieve her goal.

Second, a point pattern analysis shall be performed. The user wants to analyse covariate effects of spatially distributed data. She uses the package spatstat. An analysis from the workshop paper “Analysing spatial point patterns in R” by Adrian Baddeley is carried out.

Naturally, I am not a climate researcher, but nevertheless I’d like to see whether I can make my own temperature model of the earth based on publicly available data and Open Source software.

I did an advanced search (i.e. googled) for climate data sources and found the really useful website RealClimate: Data Sources. My requirements for the data are: a single well-defined data format (so I can parse it automatically), a rather long time period (the last hundred years), temperature data included, and worldwide coverage.

I found the following interesting sources, but I will not use them (each comes with a very short explanation why). Overall the data sets are well documented, e.g. regarding the procedures applied to ensure data quality, and reasonable subsets are available to limit download sizes. I am sure there is more data out there, and I am grateful for comments! I simply stopped looking once I thought I had found one that fits my needs :-).

And the dataset I did choose is:

My next steps are parsing this data (using Java, with the code of course published on this website) and converting it to an OGC Observations & Measurements (O&M) representation that I can load into an OGC SOS.