Changes between Version 5 and Version 6 of udg/ecoms/RPackage


Timestamp:
May 22, 2013 1:09:36 PM (8 years ago)
Author:
juaco
Comment:

--

  • udg/ecoms/RPackage

    v5 v6  
    3636== Dictionary ==
    3737
    38 The dictionary is a table that defines the conversion between the variables of the model and the standard variables defined in the Vocabulary. The dictionary is a comma-sepparated text file (csv), that is identified with the same name than the dataset, and the extension ''.dic''. In addition, it should be stored in the same directory that the dataset. The creation of the dictionary must be made by the user 'by hand', because it requires knowledge about the characteristics of the data stored in the dataset. The columns of the dictionary are next described:
     38The dictionary is a table whose aim is twofold:
     39 1. First, the dictionary translates variables, as idiosyncratically defined in each particular dataset, into the standard variables defined in the vocabulary, with their corresponding nomenclature and units. It does so by mapping the name of the variable as encoded in the dataset (`short_name`) to the name of the standard variable as defined in the vocabulary (`identifier`), and by applying a `scale` factor and an `offset` to the native variable so that it matches the standard units.
     40 2. Second, the dictionary provides metadata that is often not explicitly declared in the datasets, regarding the ''time'' aggregation of the data (often referred to as the ''cell method''). This includes the field `time_step`, which is merely informative and describes the time interval between two consecutive values, and the fields `lower_time_bound` and `upper_time_bound`, which are the values that must be added to each verification time to unequivocally delimit the time span encompassed by each value.
     41       
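As a rough illustration of how the time bounds delimit the span of each value, the following Python sketch adds `lower_time_bound` and `upper_time_bound` to a verification time. It is not part of the R package: the function name `time_span` is hypothetical, and the sketch assumes that the bounds are expressed in hours, which the dictionary description does not state explicitly.

```python
from datetime import datetime, timedelta


def time_span(verification_time, lower_time_bound, upper_time_bound):
    """Return the (start, end) interval represented by one value.

    Assumption: both bounds are given in hours (hypothetical choice,
    suggested only by the 0/24 bounds used with a 24h time step).
    """
    start = verification_time + timedelta(hours=lower_time_bound)
    end = verification_time + timedelta(hours=upper_time_bound)
    return start, end


t = datetime(2013, 5, 22, 0, 0)

# A 24h aggregated value (bounds 0 and 24) covers the following day.
daily = time_span(t, 0, 24)

# An instantaneous value (bounds 0 and 0) collapses to a single instant.
instant = time_span(t, 0, 0)
```

Under this assumption, bounds of 0 and 0 describe an instantaneous value, whereas bounds of 0 and 24 describe a value aggregated over the 24 hours that follow the verification time.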
     42The dictionary is a comma-separated text file (csv) that, by default, has the same name as the dataset with the extension ''.dic'' and is stored in the same directory as the dataset, although a different name and location can be used if they are specified in the loading functions. The dictionary must be created ''"by hand"'' by the user, because it requires some ''a priori'' knowledge about the characteristics of the data stored in the dataset, which can be partly obtained from the output of the function `dataInventory`. The columns of the dictionary are described next:
    3943 
    4044 * `identifier`: this is the name of the standard variable, as defined in the vocabulary
     
    4650 * `offset`: constant added to the original variable for unit conversion (e.g.: offset = -273.15 for conversion from Kelvin to Celsius)
    4751 * `scale`: scale factor applied to the original variable for unit conversion (e.g.: scale = 1000 for conversion from m to mm)
      52 * `deaccum`: logical flag (0 = FALSE, 1 = TRUE) indicating whether the variable should be de-accumulated at each time step. Typically applied to precipitation in some forecast datasets.
     53
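The effect of the `offset`, `scale` and `deaccum` columns can be sketched in a few lines of Python. This is only an illustration, not the code used by the R loading functions: the function name `to_standard_units` is hypothetical, and the sketch assumes that the scale factor is applied before the offset, that de-accumulation happens before the unit conversion, and that the first accumulated value is kept unchanged.

```python
def to_standard_units(values, scale=1.0, offset=0.0, deaccum=False):
    """Convert native values to standard units (illustrative only).

    Assumptions (not confirmed by the package documentation):
    - de-accumulation is done first, as the difference between
      consecutive values, keeping the first value as it is;
    - the conversion is then value * scale + offset.
    """
    if deaccum:
        values = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    return [v * scale + offset for v in values]


# Temperature in Kelvin converted to Celsius (offset = -273.15, scale = 1).
celsius = to_standard_units([273.15, 283.15], scale=1.0, offset=-273.15)

# Accumulated precipitation in m, de-accumulated and converted to mm.
precip_mm = to_standard_units([0.0, 0.002, 0.005], scale=1000.0, deaccum=True)
```

Because every entry shown on this page has either `scale = 1` or `offset = 0`, the order in which scale and offset are applied does not change the result for those entries.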
      54The following example shows the dictionary constructed for the 15-member seasonal forecast of the ECMWF's System4 model:
    4855
    4956{{{
    50   identifier short_name time_step lower_time_bound upper_time_bound aggr_fun  offset scale
    51 1         ta        air        6h                0                0     none -273.15  1.00
    52 2         zg        hgt        6h                0                0     none    0.00  1.00
    53 3        hur       rhum        6h                0                0     none    0.00  0.01
    54 4        hus       shum        6h                0                0     none    0.00  1.00
    55 5        psl        slp        6h                0                0     none    0.00  1.00
    56 6         ua       uwnd        6h                0                0     none    0.00  1.00
    57 7         va       vwnd        6h                0                0     none    0.00  1.00
     57identifier,short_name,time_step,lower_time_bound,upper_time_bound,aggr_fun,offset,scale,deaccum
     58tasmax,Maximum_temperature_at_2_metres_since_last_24_hours_surface,24h,0,24,max,-273.15,1,0
     59tasmin,Minimum_temperature_at_2_metres_since_last_24_hours_surface,24h,0,24,min,-273.15,1,0
     60tas,Mean_temperature_at_2_metres_since_last_24_hours_surface,24h,0,24,mean,-273.15,1,0
     61pr,Total_precipitation_surface,24h,0,24,sum,0,1000,1
     62psl,Mean_sea_level_pressure_surface,6h,0,0,none,0,1,0
    5863}}}
    5964
     
    6166Note that the column names matter, but their relative order does not, because the `loadData` and `loadObservations` R functions perform the conversion of each variable to the standard format by looking up the corresponding values by column name.
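To show why looking up values by column name makes the column order irrelevant, here is a minimal Python sketch that reads a dictionary with the standard library `csv.DictReader`. It does not reproduce how `loadData` or `loadObservations` are implemented: the helper `lookup` is hypothetical, and the dictionary text is a shortened copy of the example above.

```python
import csv
import io

# Two rows taken from the example dictionary above.
DICTIONARY_TEXT = """\
identifier,short_name,time_step,lower_time_bound,upper_time_bound,aggr_fun,offset,scale,deaccum
tas,Mean_temperature_at_2_metres_since_last_24_hours_surface,24h,0,24,mean,-273.15,1,0
pr,Total_precipitation_surface,24h,0,24,sum,0,1000,1
"""

# The same "pr" entry with a subset of columns in a different order.
REORDERED_TEXT = """\
scale,offset,deaccum,short_name,identifier
1000,0,1,Total_precipitation_surface,pr
"""


def lookup(dictionary_text, identifier):
    """Return the conversion settings of a standard variable.

    Fields are read by column name, so the column order in the
    file does not matter (hypothetical helper, for illustration).
    """
    reader = csv.DictReader(io.StringIO(dictionary_text))
    for row in reader:
        if row["identifier"] == identifier:
            return {
                "short_name": row["short_name"],
                "offset": float(row["offset"]),
                "scale": float(row["scale"]),
                "deaccum": row["deaccum"] == "1",
            }
    raise KeyError(identifier)


pr_entry = lookup(DICTIONARY_TEXT, "pr")
pr_entry_reordered = lookup(REORDERED_TEXT, "pr")
```

Both calls return the same settings for `pr`, even though the columns appear in a different order in each text.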
    6267
    63 
    64 == Forecast Datasets ==
    65 
    66 There is a special case in which the use of the dictionary is not needed. Because of the large size of the forecast databases, it is assumed that these data will never be locally stored, but accessed through the SPECS-EUPORIAS THREDDS Data Server. For this reason, for instance the variables of the System4 model, loaded via the `loadSystem4` function, will always be automatically transformed by the function in order to match the units defined in the vocabulary.
    67 
    68 
    69 
    7068