The sector-specific impact activities to be undertaken in the SPECS and EUPORIAS projects require a reduced number of variables (typically at the surface) from different data sources (mainly seasonal forecasts, reanalyses, and observations). The SPECS-EUPORIAS Data Server has been established by the Santander Meteorology Group (UC-CSIC) as part of the data management activities in these projects to provide a single point of access to these impact-relevant variables, gathered from existing datasets. The data portal is based on a THREDDS data server providing metadata and data access using OPeNDAP and other remote data access protocols. Moreover, a user-friendly R package has been developed for exploring and remotely accessing subsets of data, thus reducing the burden of data access in these activities. This package will also be a key component of other R-based tasks in the projects, including the validation and downscaling packages to be developed within SPECS and the sector-specific calibration and modelling tools to be developed in EUPORIAS.

This trac/wiki page provides an up-to-date description of the SPECS-EUPORIAS Data Server, including information on the available datasets and the documentation and code of the R data access package. The page is currently under construction, but a first tutorial describing the basic functioning and a first version of the R package (an R function) are already available:

Dataset catalog: ?

R code: loadSystem4.R

Tutorial: PDF file

Contents (under development):

  1. Data Server
  2. R Package for Data Access
  3. Other interfaces for Data Access

Introduction and Motivation

The impact activities on seasonal timescales involved in the SPECS and EUPORIAS projects require the use of different data sources (mainly seasonal forecasts, reanalyses, and observations). These activities include the calibration, downscaling, and modelling of sector-specific indices in agriculture, energy, health, etc., building on meteorological information. Typically, these activities require only a reduced subset of variables, either at the surface (precipitation, temperatures, mean sea level pressure, etc.) or on a small number of vertical levels (circulation and thermodynamic drivers at, e.g., 850, 500, and 200 hPa). The SPECS-EUPORIAS Data Portal has been established by the Santander Meteorology Group (UC-CSIC) to gather the relevant information from existing datasets and to provide a single, homogenized point of access to data for the SPECS and EUPORIAS partners (in particular for impact users).

The SPECS-EUPORIAS Data Portal is based on a THREDDS Data Server (TDS) providing metadata and data access using OPeNDAP and other remote data access protocols. Moreover, since the R language has been adopted for some key tasks in these projects (including the development of comprehensive validation and statistical downscaling packages), a user-friendly R package has been developed to explore and access the data portal. This package can be used in R programs to remotely access subsets of data, thus reducing the burden of data access (versions for Python and Matlab are also available on request). The package will be continuously updated (check the documentation URL above for updates) as part of the data management activities, to build a data bridge for impact users and for the R developments planned in these projects.

This document briefly describes the current state of the data portal, which has initially focused on data from ECMWF's System4 seasonal model, as agreed in the downscaling parallel session of the kick-off meeting.

The THREDDS Data Server

The SPECS-EUPORIAS Data Portal is based on a password-protected THREDDS data server (TDS) providing metadata and data access to a set of georeferenced atmospheric variables using OPeNDAP and other remote data access protocols. Variable names, units, and additional metadata follow the CF convention. The variables are spatial grids based on multidimensional arrays of indexed values, following Unidata's _Coordinate convention [1, 2].
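
As an illustration of what an OPeNDAP subset request looks like, the following Python sketch builds a DAP2 constraint expression (one [start:stride:stop] hyperslab per dimension, with an inclusive stop) and appends it to a dataset URL. The base URL and the variable name "tas" are hypothetical placeholders, not actual entries of the data portal, and the helper functions are made up for this example.

```python
def dap2_hyperslab(start, stop, stride=1):
    """Return one DAP2 hyperslab such as [11:12:359] (stop is inclusive)."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("expected 0 <= start <= stop and stride >= 1")
    return "[{}:{}:{}]".format(start, stride, stop)


def dap2_url(base_url, variable, ranges):
    """Build the URL of an ASCII response for a subset of one variable.

    ranges holds one (start, stop, stride) tuple per dimension.
    """
    constraint = variable + "".join(dap2_hyperslab(*r) for r in ranges)
    return base_url + ".ascii?" + constraint


# ASSUMPTION: hypothetical server path and variable name, for illustration only.
BASE = "http://example.org/thredds/dodsC/some/dataset.ncml"
url = dap2_url(BASE, "tas", [(0, 0, 1), (11, 359, 12), (31, 61, 1)])
print(url)
```

Clients such as the R package or the netCDF-Java library normally build these requests internally, so the constraint syntax only matters when querying the server by hand, for instance from a browser.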

Typically, the data portal will include information at daily resolution, but monthly aggregated values may also be provided in some cases due to data limitations (in particular, Météo-France and the Met Office have agreed to provide monthly mean hindcasts for use by the SPECS and EUPORIAS partners). In general, the available data will be typical surface variables (e.g. precipitation and near-surface temperature), although several variables on pressure levels (e.g. geopotential and temperature) will also be stored for the statistical downscaling activities.

The data gathering activities have initially focused on the ECMWF System4 seasonal model. The Meteorological Archival and Retrieval System (MARS) is the main repository of meteorological data at ECMWF (European Centre for Medium-Range Weather Forecasts). It contains terabytes of operational and research data, as well as data from special projects [3]. The large amount of information stored and the inherent complexity of data access, download, and post-processing are a first obstacle to the flexible use of these datasets by a large number of partners. To overcome this issue, a reduced subset of surface variables [4] (precipitation, temperatures, and mean sea level pressure) has been downloaded from MARS (as a collection of GRIB-1 files) at 0.75° spatial resolution and made available through the SPECS-EUPORIAS data portal. The downloaded data has been exposed as three different virtual datasets using TDS:

Data gathering activities will next move to the CFS version 2 hindcast, developed at the Environmental Modeling Center at NCEP, and also to reanalysis and observational datasets.
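
As a quick consistency check on the System4 grid, a regular global grid at 0.75 degrees that includes both poles has 241 latitudes and 480 longitudes, which matches the lat and lon dimension sizes printed in the Octave and Matlab sessions further down this page. The Python sketch below reproduces this arithmetic and maps a pair of grid indices to coordinates. The orientation of the axes (latitudes from north to south starting at 90, longitudes eastwards starting at 0) is an assumption and should be checked against the coordinate variables of the dataset; the function names are hypothetical.

```python
RESOLUTION = 0.75  # degrees, as stated in the text


def global_grid_shape(resolution):
    """Number of points of a regular global grid that includes both poles."""
    n_lat = round(180 / resolution) + 1  # -90 ... 90, both ends included
    n_lon = round(360 / resolution)      # 0 ... 360, end excluded
    return n_lat, n_lon


def index_to_lat_lon(i_lat, i_lon, resolution):
    """Map zero-based grid indices to (lat, lon) in degrees.

    ASSUMPTION: latitudes run north to south starting at 90 and
    longitudes run eastwards starting at 0.
    """
    lat = 90.0 - i_lat * resolution
    lon = i_lon * resolution
    if lon > 180.0:
        lon -= 360.0  # express as a longitude in (-180, 180]
    return lat, lon


print(global_grid_shape(RESOLUTION))          # (241, 480)
print(index_to_lat_lon(66, 475, RESOLUTION))  # (40.5, -3.75)
```

Under these assumptions, the grid point with lat index 66 and lon index 475 used in the sessions below lies at 40.5 degrees north and 3.75 degrees west.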

Accessing the Data Portal via R

Accessing the Data Portal via Web

Accessing the Data Portal using Octave

This section needs revision and integration with the Matlab example section.

>> ver
GNU Octave Version 3.6.1
GNU Octave License: GNU General Public License
Operating System: unknown
>> urlwrite('','netcdfAll-4.3.jar')
>> javaaddpath('./netcdfAll-4.3.jar');
>> javaMethod('setGlobalCredentialsProvider','',javaObject('','username','password'));
>> ncfile = javaMethod('openDataset','ucar.nc2.dataset.NetcdfDataset','');
>> v = ncfile.findVariable('Minimum_temperature_at_2_metres_since_last_24_hours_surface');
>> disp(v.getDimensions.toString)
[   member = 15;,    run = 360;,    time = 215;,    lat = 241;,    lon = 480;]
>> d = v.read('0,11:359:12,31:61,66,475');
>> tmp = javaObject('org.octave.Matrix',d.reduce.copyToNDJavaArray);
>> oldFlag = java_convert_matrix (1);
>> octaveMatrix = tmp.ident(tmp);
[ (30 by 31) array of double ]
>> disp(squeeze(mean(octaveMatrix,2))')
 Columns 1 through 13:

   270.79   273.29   271.57   271.04   271.83   272.49   271.48   268.59   271.53   273.82   270.99   274.24   270.99

 Columns 14 through 26:

   271.56   273.99   270.51   272.45   270.66   271.31   272.77   273.44   271.85   273.40   274.16   269.98   271.30

 Columns 27 through 30:

   273.12   271.27   272.29   270.47
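
The index string '0,11:359:12,31:61,66,475' in the session above selects one range per dimension (member, run, time, lat, lon). The Python sketch below was written only for this page and is not part of netCDF-Java. It assumes the usual start:end:stride reading with inclusive ends and computes the shape of the selection; dropping the singleton dimensions gives the 30 by 31 array reported above. The function names are hypothetical.

```python
def parse_section_spec(spec):
    """Parse a string such as '0,11:359:12,31:61' into (start, end, stride) tuples.

    Ends are inclusive, as in Fortran 90 array sections.
    """
    ranges = []
    for part in spec.split(","):
        fields = [int(x) for x in part.split(":")]
        if len(fields) == 1:    # single index
            start = end = fields[0]
            stride = 1
        elif len(fields) == 2:  # start:end
            start, end = fields
            stride = 1
        elif len(fields) == 3:  # start:end:stride
            start, end, stride = fields
        else:
            raise ValueError("bad range: " + part)
        ranges.append((start, end, stride))
    return ranges


def section_shape(spec):
    """Number of selected indices along each dimension."""
    return [(end - start) // stride + 1
            for start, end, stride in parse_section_spec(spec)]


shape = section_shape("0,11:359:12,31:61,66,475")
print(shape)    # [1, 30, 31, 1, 1]

# Equivalent of reduce(): drop the dimensions of length one.
reduced = [n for n in shape if n > 1]
print(reduced)  # [30, 31]
```

In other words, the request picks the first member, every twelfth run starting at index 11 (30 runs in total), 31 consecutive time steps, and a single grid point.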

Accessing the Data Portal using Matlab

This section needs revision.

>> ver
MATLAB Version (R2009a)
MATLAB License Number: 161051
Operating System: Microsoft Windows Vista Version 6.1 (Build 7601: Service Pack 1)
Java VM Version: Java 1.6.0_04-b12 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode
>> javaaddpath('');
>> %javaaddpath('');
>> import* %this will download the netcdfAll-4.3.jar
>> HTTPSession.setGlobalCredentialsProvider(HTTPBasicProvider('username','password'));
>> import ucar.nc2.*;
>> import ucar.nc2.dataset.*;
>> ncfile = NetcdfDataset.openDataset('');
>> v = ncfile.findVariable('Minimum_temperature_at_2_metres_since_last_24_hours_surface');
>> disp(v.getDimensions)
[   member = 15;,    run = 360;,    time = 215;,    lat = 241;,    lon = 480;]
>> data = v.read('0,11:359:12,31:61,66,475').copyToNDJavaArray();
>> disp(squeeze(mean(data,3)))
  Columns 1 through 13

  270.7917  273.2944  271.5666  271.0371  271.8275  272.4928  271.4809  268.5912  271.5313  273.8216  270.9940  274.2363  270.9933

  Columns 14 through 26

  271.5612  273.9899  270.5076  272.4505  270.6556  271.3118  272.7720  273.4359  271.8502  273.3965  274.1638  269.9825  271.3017

  Columns 27 through 30

  273.1195  271.2730  272.2915  270.4669
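
The Octave session averages over dimension 2 of the reduced 30 by 31 matrix, whereas the Matlab session averages over dimension 3 of the unreduced array; in both cases this is the time dimension, so both sessions yield one mean value per selected run. The following Python sketch illustrates the equivalence on a small stand-in array with made-up values (3 runs by 4 time steps instead of 30 by 31); none of these numbers come from the data portal.

```python
def mean(values):
    return sum(values) / len(values)


# Made-up stand-in for the reduced data: 3 runs x 4 time steps.
reduced = [
    [270.0, 271.0, 272.0, 273.0],
    [268.0, 270.0, 272.0, 274.0],
    [271.5, 271.5, 272.5, 272.5],
]

# Same values with the singleton member, lat and lon axes kept:
# shape 1 x 3 x 4 x 1 x 1, indexed as full[member][run][time][lat][lon].
full = [[[[[v]] for v in row] for row in reduced]]

# Octave session: mean over the time axis of the reduced (run x time) matrix.
octave_style = [mean(row) for row in reduced]

# Matlab session: mean over the third axis (time) of the full array,
# followed by dropping the singleton axes.
matlab_style = [mean([cell[0][0] for cell in run]) for run in full[0]]

print(octave_style)  # [271.5, 271.0, 272.0]
print(matlab_style)  # [271.5, 271.0, 272.0]
```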

Example of Data Analysis with R


  1. ?
  2. ?
  3. ?
  4. ?