# Changes between Version 6 and Version 7 of EcomsUdg/RPackage/Functions

Timestamp:
Apr 29, 2013 2:37:44 PM (9 years ago)

'''__1. dataInventory.R__'''

Prior to data analysis, a common need is to get an overview of all available data and their structure (variables, dimensions, units, geographical extent, time span ...). The function dataInventory.R is intended to perform this task, returning a list of metadata components summarizing the main characteristics of the selected dataset. Note that this function provides an overview of the data as they are stored in the original data files. The characteristics of the data may change once they are loaded with any of the data access functions (e.g., loadSystem4.R); for instance, after data transformation, temperature may be provided in ºC instead of the originally stored K. The function is called in the following way:

{{{
> dataInventory(dataset, print.summary = TRUE)
}}}

 * dataset: a character string indicating the full path to the virtual dataset (a ncml file). This can be either a path containing the directory and name of the file, or an appropriate URL in case the dataset is remotely accessed (e.g., via the [https://www.meteo.unican.es/trac/meteo/wiki/SpecsEuporias/DataServer/THREDDS SPECS-EUPORIAS THREDDS]).
 * print.summary: logical flag indicating whether a summary table is printed on screen, in addition to the output list. Defaults to TRUE.

The output of the function consists of a list of variable length, depending on the number of variables contained in the dataset, following this structure:

 * Name of the variable
   * Description: Description of the variable
   * Name: Character string. Long name of the variable
   * DataType: Character string indicating the data type (e.g. float ...)
   * Units: Character string indicating the units of the variable
   * Shape: A vector of ''n'' integers, where ''n'' is the number of dimensions, specifying the length of each dimension
   * Dimensions: A list of length ''n'', containing the following information for each of the ''n'' dimensions:
     * Type: Character vector indicating the type of dimension (e.g. Time, Lon, Pressure ...)
     * Units: Character vector indicating the units of the dimension axis
     * Values: A vector containing all the dimension values. This is a vector of class POSIXlt for time dimensions, and numeric otherwise.

'''__2. loadSystem4.R__'''

The ''SPECS-EUPORIAS Data Portal'' can be remotely accessed from R via the [mtl:browser:MLToolbox/trunk/MLToolbox_experiments/antonio/system4/r/loadSystem4.R loadSystem4.R] function. Note that this function is part of a more comprehensive R package currently under development. Given a few simple arguments defining the subset, the function automatically takes care of locating the correct indices for data sub-setting across the different variable dimensions. In addition, instead of retrieving a NetCDF file that needs to be opened and read, the requested data are loaded directly into the current R working session, following a particular structure described below, ready for data analysis and/or representation.
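
To give an idea of what locating indices for sub-setting involves, the following minimal sketch builds a small mock list by hand that mimics the inventory structure described for dataInventory.R, and then derives the positions along the latitude and longitude axes that fall within a requested window. Everything here is an assumption made for illustration: the variable name `tas`, the dimension names, all values, and the helper `indexRange` are hypothetical, and this is not the actual implementation of loadSystem4.R.

```r
# Hypothetical, hand-built mock of one entry of an inventory-style list.
# It only imitates the documented fields; it is NOT real dataInventory output.
inv <- list(
  tas = list(
    Description = "Mock near-surface air temperature",
    Name = "Mock temperature",
    DataType = "float",
    Units = "K",
    Shape = c(3L, 5L, 4L),
    Dimensions = list(
      time = list(Type = "Time", Units = "days since 1981-01-01",
                  Values = as.POSIXlt(c("1981-01-01", "1981-01-02",
                                        "1981-01-03"), tz = "UTC")),
      lat = list(Type = "Lat", Units = "degrees_north",
                 Values = seq(35, 45, by = 2.5)),
      lon = list(Type = "Lon", Units = "degrees_east",
                 Values = seq(-10, 5, by = 5))
    )
  )
)

# Hypothetical helper: positions of the axis values lying inside 'limits'
indexRange <- function(values, limits) {
  which(values >= min(limits) & values <= max(limits))
}

dims <- inv$tas$Dimensions

# Shape should agree with the number of values along each dimension
dimLengths <- unname(vapply(dims, function(d) length(d$Values), integer(1)))
stopifnot(identical(dimLengths, inv$tas$Shape))

# Indices for a window of 37 to 43 degrees north and -6 to 1 degrees east
latInd <- indexRange(dims$lat$Values, c(37, 43))
lonInd <- indexRange(dims$lon$Values, c(-6, 1))

# Apply the indices to a mock data array ordered as time x lat x lon
x <- array(seq_len(prod(inv$tas$Shape)), dim = inv$tas$Shape)
sub <- x[, latInd, lonInd, drop = FALSE]
print(dim(sub))
```

With a real dataset, the dimension values would come from the inventory of the ncml file rather than from a hand-built list, and the ordering of the dimensions would have to be checked against the Shape and Dimensions components before indexing.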