# Changes between Version 8 and Version 9 of udg/ecoms/RPackage/examples/bias

Timestamp:
May 13, 2016 2:27:10 PM
--

aggr.d = "min", aggr.m = "mean")
}}}

Some information messages will appear on-screen indicating the steps:

{{{
## [2016-05-13 13:41:56] Defining homogeneization parameters for variable "tasmin"
## [2016-05-13 13:41:56] Opening dataset...
## [2016-05-13 13:41:56] The dataset was successfuly opened
## [2016-05-13 13:41:56] Defining geo-location parameters
## [2016-05-13 13:41:56] Defining initialization time parameters
## NOTE: Daily aggregation will be computed from 6-hourly data
## NOTE: Daily data will be monthly aggregated
## [2016-05-13 13:41:58] Retrieving data subset ...
## [2016-05-13 13:47:30] Done
}}}

Note the difference in size between the daily-aggregated data of [https://meteo.unican.es/trac/wiki/udg/ecoms/RPackage/examples/continentalSelection the previous example] (35.1 Mb) and the new monthly-aggregated data (1.2 Mb). This is particularly important if our goal is to perform a validation of a large (possibly global) domain, which can be easily handled on a monthly basis but will cause memory problems using daily/sub-daily data.
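The scale of this reduction is easy to reproduce with plain R arrays, independently of any actual dataset (the dimensions below are invented for illustration only; double-precision storage is assumed):

{{{#!text/R
# Hypothetical field: 1 member x time steps x 100 x 100 grid boxes, ~10 years
daily   <- array(0, dim = c(1, 3650, 100, 100))  # daily time steps
monthly <- array(0, dim = c(1,  120, 100, 100))  # monthly time steps
print(object.size(daily),   units = "Mb")  # ~278 Mb
print(object.size(monthly), units = "Mb")  # ~9 Mb
}}}

The ratio between the two sizes is simply the ratio of time steps (roughly 30), which is why monthly aggregation makes large domains tractable in memory.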
{{{#!text/R
print(object.size(ex3.cfs), units = "Mb")
## 1.2 Mb
}}}

years = 1991:2000, aggr.m = "mean")
}}}

{{{
## [2016-05-13 13:48:39] Defining homogeneization parameters for variable "tasmin"
## [2016-05-13 13:48:39] Opening dataset...
## [2016-05-13 13:48:39] The dataset was successfuly opened
## [2016-05-13 13:48:39] Defining geo-location parameters
## [2016-05-13 13:48:39] Defining time selection parameters
## NOTE: Daily data will be monthly aggregated
## [2016-05-13 13:48:39] Retrieving data subset ...
## [2016-05-13 13:49:08] Done
}}}

[[Image(image-20150515-143320.png)]]

Note that WFDEI provides data for land areas only, and its spatial resolution (0.5º) is finer than that of CFS (1º). In order to compare both datasets, it is first necessary to put them on the same grid (i.e., to interpolate).
We use nearest-neighbour interpolation to this aim, using the downscaleR function interpGrid in combination with the getGrid method, which recovers the parameters defining the grid of a dataset so they can be passed to the interpolator:

{{{#!text/R
obs.regridded <- downscaleR::interpGrid(grid = ex3.obs,
                                        new.coordinates = getGrid(ex3.cfs),
                                        method = "nearest")
## [2016-05-13 13:51:37] Calculating nearest neighbors...
## [2016-05-13 13:51:37] Performing nearest interpolation... may take a while
## [2016-05-13 13:51:37] Done
## Warning messages:
## 1: In downscaleR::interpGrid(grid = ex3.obs, new.coordinates = getGrid(ex2),  :
##   The new longitudes are outside the data extent
## 2: In downscaleR::interpGrid(grid = ex3.obs, new.coordinates = getGrid(ex2),  :
##   The new latitudes are outside the data extent
}}}

Note the warnings reminding us that the extent of the input grid is wider than that of CFS. However, in this case we can safely ignore these warnings, since all the land areas we are interested in lie within the CFS domain.

[[Image(image-20160513-135405.png)]]

After regridding, both model data and observations are on the same grid.
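For intuition on what the interpolator does, nearest-neighbour regridding simply assigns to each target coordinate the value at the closest source coordinate. A minimal one-dimensional sketch in base R (the grids and field below are invented; interpGrid handles the full two-dimensional case internally):

{{{#!text/R
src.lon <- seq(-10, 10, by = 0.5)   # fine source grid (0.5 degrees)
src.val <- sin(src.lon)             # dummy field defined on the source grid
new.lon <- seq(-10, 10, by = 1)     # coarser target grid (1 degree)
# For each target point, pick the value at the nearest source point
nn <- sapply(new.lon, function(x) which.min(abs(src.lon - x)))
regridded <- src.val[nn]
length(regridded)  # 21, one value per target longitude
}}}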
Now we can compute the bias. In order to compute the mean bias, we need to calculate the climatologies of both the observations and the predictions. This can be easily accomplished using the function climatology in downscaleR. By default, it computes the mean along the time dimension, but it is flexible and any other function can be defined by the user. In addition, the logical flag by.member allows computing the climatology of each member separately, rather than of the ensemble mean:

{{{#!text/R
obs.clim <- climatology(obs.regridded)
## [2016-05-13 14:19:15] - Computing climatology...
## [2016-05-13 14:19:16] - Done.
pred.clim <- climatology(ex3.cfs, by.member = TRUE)
## [2016-05-13 14:19:31] - Computing climatology...
## [2016-05-13 14:19:31] - Done.
}}}

Now the biases can be computed, member by member, by simply subtracting the observed climatology from each member climatology.
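Conceptually, the default climatology is just the mean over the time dimension of the data array. A base-R sketch on a dummy array (the member x time x lat x lon dimension ordering is assumed here purely for illustration):

{{{#!text/R
dat <- array(runif(2 * 12 * 5 * 5), dim = c(2, 12, 5, 5))  # member x time x lat x lon
# Collapse the time dimension (the 2nd one), keeping one map per member
clim <- apply(dat, MARGIN = c(1, 3, 4), FUN = mean, na.rm = TRUE)
dim(clim)  # 2 5 5: the time dimension has been reduced to its mean
}}}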
In this example, we create a new data array of the same dimensions as the predictions, include it in a new multimember grid object with the same characteristics as ex3.cfs, and plot it:

{{{#!text/R
n.members <- dim(pred.clim$Data)[1]
arr <- pred.clim$Data
for (i in 1:n.members) {
      arr[i,,,] <- pred.clim$Data[i,,,] - obs.clim$Data[1,,,]
}
bias.grid <- ex3.cfs
bias.grid$Data <- arr
plotMeanGrid(bias.grid, multi.member = TRUE)
}}}

[[Image(image-20160513-142608.png)]]

**NOTE**: a more elaborate example computing the bias, as well as other verification metrics, is provided in [../verification this EXAMPLE]