Aim of the demo

The aim of this example is to obtain a bias corrected time series of the NCEP/NCAR Reanalisys 1 in a particular location for the period 2001-2010 building upon the observed and simulated time series for the period 1991-2000, using the different bias correction techniques included in downscaleR.

All the required data for the first part of the practice are included in the built-in datasets in the downscaleR library, although note that NCEP/NCAR Reanalisys 1 data can be also accessed via the ECOMS User Data Gateway using the loadECOMS interface of the ecomsUDG.Raccess package.

The example will be done on the Navacerrada weather station, close to Madrid - Spain, and for the NCEP/NCAR Reanalysis 1, but this demo could be applied for several locations or gridded datasets, and for several members of a seasonal forecasting model.

Key inputs

Demo

First of all, we must to install or load the downscaleR R-package:

# If we have not installed the "devtools" library we should do it:
# install.packages("devtools")
library(devtools)
## Warning: package 'devtools' was built under R version 3.1.1
## WARNING: Rtools is required to build R packages, but is not currently installed.
## 
## Please download and install Rtools 3.1 from http://cran.r-project.org/bin/windows/Rtools/ and then run find_rtools().
## 
## Attaching package: 'devtools'
## 
## The following objects are masked from 'package:utils':
## 
##     ?, help
## 
## The following object is masked from 'package:base':
## 
##     system.file
# Installing the downscaleR R-package
devtools::install_github(c("SantanderMetGroup/downscaleR.java@stable","SantanderMetGroup/downscaleR@stable"))
## Installing github repo downscaleR.java/stable from SantanderMetGroup
## Downloading stable.zip from https://github.com/SantanderMetGroup/downscaleR.java/archive/stable.zip
## Installing package from C:\Users\sixto\AppData\Local\Temp\RtmpE39Rbe/stable.zip
## Installing downscaleR.java
## "C:/PROGRA~1/R/R-31~1.0/bin/x64/R" --vanilla CMD INSTALL  \
##   "C:\Users\sixto\AppData\Local\Temp\RtmpE39Rbe\devtools185c24805d7e\downscaleR.java-stable"  \
##   --library="C:/Users/sixto/Documents/R/win-library/3.1"  \
##   --install-tests 
## 
## Installing github repo downscaleR/stable from SantanderMetGroup
## Downloading stable.zip from https://github.com/SantanderMetGroup/downscaleR/archive/stable.zip
## Installing package from C:\Users\sixto\AppData\Local\Temp\RtmpE39Rbe/stable.zip
## Installing downscaleR
## "C:/PROGRA~1/R/R-31~1.0/bin/x64/R" --vanilla CMD INSTALL  \
##   "C:\Users\sixto\AppData\Local\Temp\RtmpE39Rbe\devtools185c42d35a69\downscaleR-stable"  \
##   --library="C:/Users/sixto/Documents/R/win-library/3.1"  \
##   --install-tests
library(downscaleR)
## Loading required package: rJava
## Loading required package: downscaleR.java
## NetCDF Java Library v4.3.22 (27 May 2014) loaded and ready
## Loading required package: maps
## downscaleR version 0.4-1 (06-Sep-2014) is loaded

In this example we will use the observations and simulations included by default in the downscaleR R-package: the GCOS Surface Network - GSN over the Iberian Peninsula and a subset created for this R-package of the NCEP/NCAR Reanalysis 1.

# Where is the dataset GSN located?
gsn.data.dir <- file.path(find.package("downscaleR"), "datasets/observations/GSN_Iberia")
# What variables includes this dataset?
# stationInfo provides a quick overview of the dataset:
stationInfo(gsn.data.dir)
## [2014-09-11 09:15:24] Doing inventory ...
## [2014-09-11 09:15:24] Done.

plot of chunk unnamed-chunk-3

##     stationID longitude latitude altitude
## 1 SP000008027    -2.039    43.31      251
## 2 SP000008181     2.070    41.29        4
## 3 SP000008202    -5.498    40.96      790
## 4 SP000008215    -4.010    40.78     1894
## 5 SP000008280    -1.863    38.95      704
## 6 SP000008410    -4.846    37.84       90
##                  location WMO_Id Koppen.class
## 1 SAN SEBASTIAN - IGUELDO   8027          Cfb
## 2    BARCELONA/AEROPUERTO   8181          Csa
## 3    SALAMANCA AEROPUERTO   8202          BSk
## 4             NAVACERRADA   8215          Csb
## 5     ALBACETE LOS LLANOS   8280          BSk
## 6      CORDOBA AEROPUERTO   8410          Csa

We load the observations.

obs <- loadStationData(dataset = gsn.data.dir, var="tmean", lonLim = c(-10,5), latLim = c(34,44), season=c(12,1,2), years = 1991:2000)
## [2014-09-11 09:15:25] Loading data ...
## [2014-09-11 09:15:25] Retrieving metadata ...
## [2014-09-11 09:15:25] Done.

To apply the bias correction methods we should have the same variable in the observational and the model datasets. Then, if we want to calibrate the mean temperature (tmean) we look for the corresponding variable in the model, 2T in our case. The downscaleR vocabulary and the inventory of the NCEP/NCAR reanalysis could help us to identify the proper variable.

# data(vocabulary)
# print(vocabulary) 
ncep.data.dir <- file.path(find.package("downscaleR"), "datasets/reanalysis/Iberia_NCEP/Iberia_NCEP.ncml")
ncep.inv <- dataInventory(ncep.data.dir)
## [2014-09-11 09:15:25] Doing inventory ...
## [2014-09-11 09:15:26] Retrieving info for 'Z' (5 vars remaining)
## [2014-09-11 09:15:26] Retrieving info for 'T' (4 vars remaining)
## [2014-09-11 09:15:27] Retrieving info for 'Q' (3 vars remaining)
## [2014-09-11 09:15:27] Retrieving info for '2T' (2 vars remaining)
## [2014-09-11 09:15:27] Retrieving info for 'SLP' (1 vars remaining)
## [2014-09-11 09:15:27] Retrieving info for 'pr' (0 vars remaining)
## [2014-09-11 09:15:27] Done.
# str(ncep.inv)
# We load the data in the control period to adjust the method:
prd <- loadGridData(dataset = ncep.data.dir, var = "tas", lonLim = c(-10,5), latLim= c(34,44), season=c(12,1,2), years = 1991:2000)
## [2014-09-11 09:15:27] Defining homogeneization parameters for variable "tas"
## [2014-09-11 09:15:27] Defining geo-location parameters
## [2014-09-11 09:15:27] Defining time selection parameters
## [2014-09-11 09:15:28] Retrieving data subset ...
## [2014-09-11 09:15:28] Done
# We should interpolate the observations to the grid of model: we use the method "nearest" and the getGrid function to ensure spatial consistency:
obs <- interpGridData(obs, new.grid = getGrid(prd), method = "nearest")
## Warning: The new longitudes are outside the data extent
## Warning: The new latitudes are outside the data extent
## [2014-09-11 09:15:28] Performing nearest interpolation... may take a while
## [2014-09-11 09:15:28] Done

Once both the observations and model data are loaded and defined in the same locations, we can apply one of the bias correction methodologies available in downscaleR. To analyze properly the effect of the different methods we apply them to adjust the same prd object used as control simulation

prd1 <- biasCorrection (obs, prd, prd, method = "unbiasing")# bias

and we plot the time series of one point:

row <- 3
col <- 4
# Time Series
par(mfrow = c(1,1))
plot(obs$Data[,row,col], type = "l", col = "black")
lines(prd$Data[,row,col],  col = "red")
lines(prd1$Data[,row,col],  col = "blue")

plot of chunk unnamed-chunk-7

Note that, in the case of the unbiasing, the effect of the correction (blue line) is only a translation of the predicted (red line) series according with the difference between both means, the observed and the predicted. The black line in the figure is the observed series. Other useful graphic is the quantile-quantile plot (type ?qqplot) qhich consists in draw the rank ordered series observed and predicted in a scatter-plot.

# Note that the observations could have missing data
indNaN <-which(!is.na(obs$Data[,row,col]))
#  scatter-plots
par(mfrow = c(1,1))
qqplot(sort(obs$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), sort(prd$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), plot.it = TRUE, xlab = "Observations", ylab = "Prediction", col = "red")
points(sort(obs$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), sort(prd1$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), type = "p", col = "blue", asp = 1)

plot of chunk unnamed-chunk-8

Finally, once we know how the bias correction method work, we could apply them to unknown conditions,

obsSim <- loadStationData(dataset = gsn.data.dir, var="tmean", lonLim = c(-10,5), latLim = c(34,44), season=c(12,1,2), years = 2001:2010)
## [2014-09-11 09:15:29] Loading data ...
## [2014-09-11 09:15:29] Retrieving metadata ...
## [2014-09-11 09:15:29] Done.
obsSim <- interpGridData(obsSim, new.grid = getGrid(prd), method = "nearest")
## Warning: The new longitudes are outside the data extent
## Warning: The new latitudes are outside the data extent
## [2014-09-11 09:15:29] Performing nearest interpolation... may take a while
## [2014-09-11 09:15:29] Done
# We apply in unknown conditions:
sim <- loadGridData(dataset = ncep.data.dir, var = "tas", lonLim = c(-10,5), latLim= c(34,44), season=c(12,1,2), years = 2001:2010)
## [2014-09-11 09:15:29] Defining homogeneization parameters for variable "tas"
## [2014-09-11 09:15:29] Defining geo-location parameters
## [2014-09-11 09:15:29] Defining time selection parameters
## [2014-09-11 09:15:30] Retrieving data subset ...
## [2014-09-11 09:15:30] Done
sim1 <- biasCorrection (obs, prd, sim, method = "unbiasing")# bias
# We show the resulting as time Series
par(mfrow = c(1,1))
plot(obsSim$Data[,row,col], type = "l", col = "black")
lines(sim$Data[,row,col], col = "red")
lines(sim1$Data[,row,col],  col = "blue")

plot of chunk unnamed-chunk-9

#  scatter-plots
indNaN <-which(!is.na(obsSim$Data[,row,col]))
par(mfrow = c(1,1))
qqplot(sort(obsSim$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), sort(sim$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), plot.it = TRUE, xlab = "Observations", ylab = "Simulation", col = "red")
points(sort(obsSim$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), sort(sim1$Data[indNaN,row,col], decreasing = FALSE, na.last = NA), type = "p", col = "blue", asp = 1)

plot of chunk unnamed-chunk-9