Title: | Assessing Skill of Climate Forecasts on Seasonal-to-Decadal Timescales |
Description: | Exploits dynamical seasonal forecasts in order to provide information relevant to stakeholders at the seasonal timescale. The package contains process-based methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain tailored products. This package was developed in the context of the 'ERA4CS' project 'MEDSCOPE' and the 'H2020 S2S4E' project and includes contributions from the 'ArticXchange' project funded by 'EU-PolarNet 2'. 'Pérez-Zanón et al. (2022) <doi:10.5194/gmd-15-6115-2022>'. 'Doblas-Reyes et al. (2005) <doi:10.1111/j.1600-0870.2005.00104.x>'. 'Mishra et al. (2018) <doi:10.1007/s00382-018-4404-z>'. 'Sanchez-Garcia et al. (2019) <doi:10.5194/asr-16-165-2019>'. 'Straus et al. (2007) <doi:10.1175/JCLI4070.1>'. 'Terzago et al. (2018) <doi:10.5194/nhess-18-2825-2018>'. 'Torralba et al. (2017) <doi:10.1175/JAMC-D-16-0204.1>'. 'D'Onofrio et al. (2014) <doi:10.1175/JHM-D-13-096.1>'. 'Verfaillie et al. (2017) <doi:10.5194/gmd-10-4257-2017>'. 'Van Schaeybroeck et al. (2019) <doi:10.1016/B978-0-12-812372-0.00010-8>'. 'Yiou et al. (2013) <doi:10.1007/s00382-012-1626-3>'. |
Authors: | Nuria Perez-Zanon [aut] , Louis-Philippe Caron [aut] , Carmen Alvarez-Castro [aut] , Lauriane Batte [aut], Carlos Delgado [aut], Jost von Hardenberg [aut] , Llorenç LLedo [aut], Nicolau Manubens [aut], Lluís Palma [aut], Eroteida Sanchez-Garcia [aut], Bert van Schaeybroeck [aut], Veronica Torralba [aut], Deborah Verfaillie [aut], Eva Rifà [ctb], Filippo Cali Quaglia [ctb], Maria M. Chaves-Montero [ctb], Chihchung Chou [ctb], Nicola Cortesi [ctb], Susanna Corti [ctb], Paolo Davini [ctb], Gildas Dayon [ctb], Marta Dominguez [ctb], Federico Fabiano [ctb], Ignazio Giuntoli [ctb], Raul Marcos [ctb], Paola Marson [ctb], Niti Mishra [ctb], Jesus Peña [ctb], Francesc Roura-Adserias [ctb], Silvia Terzago [ctb], Danila Volpi [ctb], An-Chi Ho [ctb], Victoria Agudetse [cre], BSC-CNS [cph] |
Maintainer: | Victoria Agudetse <[email protected]> |
License: | GPL-3 |
Version: | 5.2.0 |
Built: | 2024-12-21 06:46:14 UTC |
Source: | CRAN |
This function computes a quantile mapping based on weather types for experiment data (typically a hindcast) onto reference observations ('obs'), typically provided by reanalysis data.
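As a conceptual illustration of what quantile mapping conditioned on weather types amounts to (a minimal sketch, not the CSTools internals; the helper name qq_map_by_wt is hypothetical), each weather type gets its own quantile-quantile transfer function from the experiment distribution onto the observed distribution:

# Hypothetical helper: one quantile-quantile mapping per weather type
qq_map_by_wt <- function(exp, wt_exp, obs, wt_obs, n_quant = 100) {
  corrected <- exp
  probs <- seq(0, 1, length.out = n_quant)
  for (wt in unique(wt_exp)) {
    e <- exp[wt_exp == wt]
    o <- obs[wt_obs == wt]
    qe <- quantile(e, probs, na.rm = TRUE)  # experiment quantiles for this type
    qo <- quantile(o, probs, na.rm = TRUE)  # observed quantiles for this type
    # Map each experiment value from the experiment CDF onto the observed one
    corrected[wt_exp == wt] <- approx(qe, qo, xout = e, rule = 2)$y
  }
  corrected
}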
AdamontQQCorr(exp, wt_exp, obs, wt_obs,
  corrdims = c("member", "sdate", "ftime"), londim = "lon", latdim = "lat",
  regrid = FALSE, NN = NULL)
exp |
Array with named dimensions (such as the $data array of an object of class 's2dv_cube') containing the experiment data. |
wt_exp |
Corresponding weather types (same dimensions as 'exp' without the lat/lon dimensions). |
obs |
Array with named dimensions containing the reference data (can also be the $data array of an object of class 's2dv_cube'), |
wt_obs |
Corresponding weather types (same dimensions as 'obs' without the lat/lon dimensions). |
corrdims |
List of dimensions in 'exp' along which the quantile mapping correction is applied (default: c('member', 'sdate', 'ftime')). |
londim |
Character name of the longitude dimension in 'exp' and 'obs'. |
latdim |
Character name of the latitude dimension in 'exp' and 'obs'. |
regrid |
(optional) Boolean indicating whether .NearestNeighbors regridding is needed. |
NN |
(optional, required if regrid = TRUE) List (output from .NearestNeighbors) mapping the (nlat, nlon) grid of 'exp' onto the (nlat_o, nlon_o) grid of 'obs'. |
An array (such as the $data array from an object of class 's2dv_cube') with named dimensions, containing the experiment data on the lat/lon grid of the 'obs' array, corrected by quantile mapping depending on the weather types 'wt_exp'.
Paola Marson, [email protected] for PROSNOW version
Lauriane Batté, [email protected] for CSTools adaptation
wt_exp <- c(1, 1, 2, 3, 3, 2, 2, 1, 1, 2, 2, 3)
dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
wt_obs <- c(3, 3, 1, 2, 2, 2, 2, 1, 3, 1, 1, 2)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
exp <- 1 : c(1 * 1 * 4 * 3 * 4 * 4)
dim(exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, lat = 4, lon = 4)
obs <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4)
dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3, lat = 4, lon = 4)
exp_corr <- AdamontQQCorr(exp = exp, wt_exp = wt_exp, obs = obs,
                          wt_obs = wt_obs,
                          corrdims = c('dataset', 'member', 'sdate', 'ftime'))
This function performs a downscaling using analogs. To compute the analogs, the function searches for days with large scale conditions similar to those of the fields to be downscaled to the local scale. The large scale and the local scale regions are defined by the user. The large scale is usually given by atmospheric circulation, such as sea level pressure or geopotential height (Yiou et al, 2013), but the function gives the possibility to use another field. The local scale will usually be given by precipitation or temperature fields, but it might be another variable. The Analogs function will find the best analogs based on three criteria: (1) minimum Euclidean distance in the large scale pattern (i.e. SLP); (2) minimum Euclidean distance in the large scale pattern (i.e. SLP) and minimum Euclidean distance in the local scale pattern (i.e. SLP); (3) minimum Euclidean distance in the large scale pattern (i.e. SLP), minimum distance in the local scale pattern (i.e. SLP) and highest correlation in the local variable to downscale (i.e. precipitation). The search of analogs must be done in the longest dataset possible. This is important since it is necessary to have a good representation of the possible states of the field in the past, and therefore, to get better analogs. Once the search of the analogs is complete, and in order to use the three criteria, the user can select a number of analogs, using parameter 'nAnalogs', to restrict the selection of the best analogs to a small number of possibilities, the best ones. This function has no constraints of specific regions, variables to downscale, or data to be used (seasonal forecast data, climate projections data, reanalysis data). The regridding into a finer scale is done by interpolation with CST_Start. Then, this interpolation is corrected by selecting the analogs in the large and local scales based on the observations. The function is an adapted version of the method of Yiou et al 2013. A minimal sketch of criterion (1) is shown after this paragraph.
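A minimal sketch of criterion (1), the 'Large_dist' selection, on a small synthetic observational archive (illustrative only, not the CSTools internals):

# Pick the observed day whose large scale field is closest (Euclidean
# distance) to the target experiment field
obs_slp <- rnorm(10 * 4 * 5)
dim(obs_slp) <- c(time = 10, lat = 4, lon = 5)
target <- rnorm(4 * 5)
dim(target) <- c(lat = 4, lon = 5)
dist <- apply(obs_slp, 1, function(day) sqrt(sum((day - target)^2)))
best_analog_day <- which.min(dist)  # index of the best 'Large_dist' analog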
Analogs(expL, obsL, time_obsL, time_expL = NULL, lonL = NULL, latL = NULL,
  expVar = NULL, obsVar = NULL, sdate_dim = "sdate", criteria = "Large_dist",
  excludeTime = NULL, lonVar = NULL, latVar = NULL, region = NULL,
  nAnalogs = NULL, AnalogsInfo = FALSE, ncores = NULL)
expL |
An array of N named dimensions containing the experimental field on the large scale for which the analog is aimed. This field is used in all the criteria. If parameter 'expVar' is not provided, the function will return the expL analog. The element 'data' in the 's2dv_cube' object must have, at least, latitudinal and longitudinal dimensions. The object is expected to be already subset for the desired large scale region. Latitudinal dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'. |
obsL |
An array of N named dimensions containing the observational field on the large scale. The element 'data' in the 's2dv_cube' object must have the same latitudinal and longitudinal dimensions as parameter 'expL' and a single temporal dimension with the maximum number of available observations. |
time_obsL |
A character string indicating the date of the observations in the format "dd-mm-yyyy". Reference time to search for analogs. It must have a time dimension. |
time_expL |
An array of N named dimensions (coinciding with time dimensions in expL) of character string(s) indicating the date(s) of the experiment in the format "dd-mm-yyyy". Time(s) to find the analogs. If it is not a scalar, it must have named dimensions. |
lonL |
A vector containing the longitude of parameter 'expL'. |
latL |
A vector containing the latitude of parameter 'expL'. |
expVar |
An array of N named dimensions containing the experimental field on the local scale, usually a different variable from the parameter 'expL'. If it is not NULL (by default, NULL), the returned field by this function will be the analog of parameter 'expVar'. |
obsVar |
An array of N named dimensions containing the field of the same variable as the one passed in parameter 'expVar' for the same region. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
criteria |
A character string indicating the criteria to be used for the selection of analogs: 'Large_dist', 'Local_dist' or 'Local_cor', corresponding to the three criteria described above. By default, 'Large_dist'. |
excludeTime |
An array of N named dimensions (coinciding with time dimensions in expL) of character string(s) indicating the date(s) of the observations in the format "dd/mm/yyyy" to be excluded during the search of analogs. It can be NULL but if expL is not a forecast (time_expL contained in time_obsL), by default time_expL will be removed during the search of analogs. |
lonVar |
A vector containing the longitude of parameter 'expVar'. |
latVar |
A vector containing the latitude of parameter 'expVar'. |
region |
A vector of length four indicating the minimum longitude, the maximum longitude, the minimum latitude and the maximum latitude. |
nAnalogs |
Number of analogs to be selected to apply the criteria 'Local_dist' or 'Local_cor'. This is not necessarily the number of analogs that the user gets, but the number of events with minimum distance within which the search of the best analog is performed. The default value for the 'Large_dist' criterion is 1; for the 'Local_dist' and 'Local_cor' criteria it must be greater than 1 in order to match with the first criterion. If nAnalogs is NULL for 'Local_dist' and 'Local_cor', the default value will be set to the length of 'time_obsL'. If AnalogsInfo is FALSE, the function returns just the best analog. |
AnalogsInfo |
A logical value. If it is TRUE it returns a list with two elements: 1) the downscaled field and 2) the AnalogsInfo, which contains: a) the number of the best analogs, b) the corresponding value of the metric used in the selected criterion (distance values for 'Large_dist' and 'Local_dist', correlation values for 'Local_cor'), and c) the dates of the analogs. The analogs are listed in decreasing order; the first one is the best analog (i.e. if the selected criterion is 'Local_cor' the best analog will be the one with the highest correlation, while for the 'Large_dist' criterion the best analog will be the day with the minimum Euclidean distance). Set to FALSE to get a single analog, the best analog, for instance for downscaling. |
ncores |
The number of cores to use in parallel computation. |
An array with the downscaled values of the best analogs for the selected criterion. If 'AnalogsInfo' is set to TRUE, it returns a list with an array of the downscaled field and the analogs information in elements 'analogs', 'metric' and 'dates'.
M. Carmen Alvarez-Castro, [email protected]
Maria M. Chaves-Montero, [email protected]
Veronica Torralba, [email protected]
Nuria Perez-Zanon [email protected]
Yiou, P., T. Salameh, P. Drobinski, L. Menut, R. Vautard, and M. Vrac, 2013: Ensemble reconstruction of the atmospheric column from surface pressure using analogues. Clim. Dyn., 41, 1419-1437.
# Example 1: Downscaling using criteria 'Large_dist' and a single variable:
expSLP <- rnorm(1:20)
dim(expSLP) <- c(lat = 4, lon = 5)
obsSLP <- c(rnorm(1:180), expSLP * 1.2)
dim(obsSLP) <- c(time = 10, lat = 4, lon = 5)
time_obsSLP <- paste(rep("01", 10), rep("01", 10), 1994 : 2003, sep = "-")
dim(time_obsSLP) <- c(time = 10)
downscale_field <- Analogs(expL = expSLP, obsL = obsSLP,
                           time_obsL = time_obsSLP, time_expL = "01-01-1994")

# Example 2: Downscaling using criteria 'Large_dist' and 2 variables:
obs.pr <- c(rnorm(1:200) * 0.001)
dim(obs.pr) <- dim(obsSLP)
downscale_field <- Analogs(expL = expSLP, obsL = obsSLP, obsVar = obs.pr,
                           time_obsL = time_obsSLP, time_expL = "01-01-1994")

# Example 3: Downscaling using criteria 'Local_dist' and 2 variables:
# analogs of local scale using criteria 2
region = c(lonmin = -1, lonmax = 2, latmin = 30, latmax = 33)
Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
                       obsVar = obs.pr, criteria = "Local_dist",
                       lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5),
                       region = region, time_expL = "01-10-2000",
                       nAnalogs = 10, AnalogsInfo = TRUE)

# Example 4: Downscaling using criteria 'Local_cor' and 2 variables:
exp.pr <- c(rnorm(1:20) * 0.001)
dim(exp.pr) <- dim(expSLP)
Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
                          obsVar = obs.pr, expVar = exp.pr,
                          criteria = "Local_cor", lonL = seq(-1, 5, 1.5),
                          time_expL = "01-10-2000", latL = seq(30, 35, 1.5),
                          lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5),
                          nAnalogs = 8, region = region, AnalogsInfo = FALSE)

# Example 5: List of best analogs using the three criteria 'Large_dist',
# 'Local_dist' and 'Local_cor':
Large_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
                       criteria = "Large_dist", time_expL = "01-10-2000",
                       nAnalogs = 7, AnalogsInfo = TRUE)
Local_scale <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
                       time_expL = "01-10-2000", criteria = "Local_dist",
                       lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5),
                       nAnalogs = 7, region = region, AnalogsInfo = TRUE)
Local_scalecor <- Analogs(expL = expSLP, obsL = obsSLP, time_obsL = time_obsSLP,
                          obsVar = obsSLP, expVar = expSLP,
                          time_expL = "01-10-2000", criteria = "Local_cor",
                          lonL = seq(-1, 5, 1.5), latL = seq(30, 35, 1.5),
                          lonVar = seq(-1, 5, 1.5), latVar = seq(30, 35, 1.5),
                          nAnalogs = 7, region = region, AnalogsInfo = TRUE)
This function converts data loaded using the Start function from the startR package, or the Load function from s2dv, into an 's2dv_cube' object.
as.s2dv_cube(object, remove_attrs_coords = FALSE, remove_null = FALSE)
object |
An object of class 'startR_array' generated from function Start from the startR package, or a list output from function Load from the s2dv package. |
remove_attrs_coords |
A logical value indicating whether to remove the attributes of the coordinates (TRUE) or not (FALSE). The default value is FALSE. |
remove_null |
Optional. A logical value indicating whether to remove the elements that are NULL (TRUE) or not (FALSE) of the output object. It is only used when the object is an output from function Load. The default value is FALSE. |
The function returns an 's2dv_cube' object to be easily used with functions with the prefix 'CST' from the CSTools and CSIndicators packages.
The object is mainly a list with the following elements (a minimal sketch of this structure follows the list):
'data', array with named dimensions;
'dims', named vector of the data dimensions;
'coords', list of named vectors with the coordinates corresponding to the dimensions of the data parameter;
'attrs', named list with elements:
'Dates', array with named temporal dimensions of class 'POSIXct' from time values in the data;
'Variable', has the following components:
'varName', character vector of the short variable name. It is usually specified in the parameter 'var' from the functions Start and Load;
'metadata', named list of elements with variable metadata. They can be from coordinates variables (e.g. longitude) or main variables (e.g. 'var');
'Datasets', character strings indicating the names of the datasets;
'source_files', a vector of character strings with complete paths to all the found files involved in loading the data;
'when', a time stamp of the date issued by the Start() or Load() call to obtain the data;
'load_parameters', it contains the components used in the arguments to load the data from Start() or Load() functions.
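A minimal sketch of the resulting structure, with placeholder values (the field names follow the list above; the data values are illustrative only):

# Skeleton of an 's2dv_cube' built by hand for illustration
dat <- rnorm(2 * 3)
dim(dat) <- c(sdate = 2, time = 3)
cube <- list(
  data = dat,
  dims = c(sdate = 2, time = 3),
  coords = list(sdate = c('20170101', '20180101'), time = 1:3),
  attrs = list(
    Dates = as.POSIXct(c('2017-01-01', '2018-01-01'), tz = 'UTC'),
    Variable = list(varName = 'tas', metadata = list()),
    Datasets = 'system5c3s',
    source_files = character(0),
    when = Sys.time(),
    load_parameters = list()
  )
)
class(cube) <- 's2dv_cube'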
Perez-Zanon Nuria, [email protected]
Nicolau Manubens, [email protected]
See also: s2dv_cube, CST_Start, Start and CST_Load.
## Not run:
# Example 1: convert an object from startR::Start function to 's2dv_cube'
library(startR)
repos <- '/esarchive/exp/ecmwf/system5_m1/monthly_mean/$var$_f6h/$var$_$sdate$.nc'
data <- Start(dat = repos,
              var = 'tas',
              sdate = c('20170101', '20180101'),
              ensemble = indices(1:5),
              time = 'all',
              latitude = indices(1:5),
              longitude = indices(1:5),
              return_vars = list(latitude = 'dat', longitude = 'dat',
                                 time = 'sdate'),
              retrieve = TRUE)
data <- as.s2dv_cube(data)

# Example 2: convert an object from s2dv::Load function to 's2dv_cube'
startDates <- c('20001101', '20011101', '20021101',
                '20031101', '20041101', '20051101')
data <- Load(var = 'tas', exp = 'system5c3s',
             nmember = 2, sdates = startDates,
             leadtimemax = 3, latmin = 10, latmax = 30,
             lonmin = -10, lonmax = 10, output = 'lonlat')
data <- as.s2dv_cube(data)
## End(Not run)
This function implements the computation to obtain the weighted ensemble means for SFSs using a normalized weights array.
BEI_EMWeighting(var_exp, aweights, time_dim_name = "time", memb_dim = "member")
var_exp |
Variable (e.g. precipitation, temperature, NAO index) array from a SFS with at least dimensions (time, member) for a spatially aggregated variable or dimensions (time, member, lat, lon) for a spatial variable, with 'time' as the temporal dimension by default. |
aweights |
Normalized weights array with at least dimensions (time, member), with 'time' as the temporal dimension by default. |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
memb_dim |
A character string indicating the name of the member dimension, by default 'member'. |
BEI_EMWeighting() returns an array with at least one or three dimensions, depending on whether the variable is a spatially aggregated variable (e.g. NAO index) (time) or a spatial variable (e.g. precipitation or temperature) (time, lat, lon), containing the ensemble means computed with weighted members. A minimal sketch of the underlying computation follows.
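For a spatially aggregated variable, the computation reduces, at each time step, to a weighted sum over members; a minimal sketch (assuming the weights are already normalized per time step):

var_exp <- 1 : (2 * 4)
dim(var_exp) <- c(time = 2, member = 4)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
# Weighted ensemble mean per time step: sum over members of w[t, m] * x[t, m]
em <- rowSums(var_exp * aweights)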
Eroteida Sanchez-Garcia - AEMET, [email protected]
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
# Example 1
var_exp <- 1 : (2 * 3 * 4)
dim(var_exp) <- c(time = 2, dataset = 3, member = 4)
aweights <- runif(24, min = 0.001, max = 0.999)
dim(aweights) <- c(time = 2, dataset = 3, member = 4)
res <- BEI_EMWeighting(var_exp, aweights)

# Example 2
var_exp <- 1 : (2 * 4 * 2 * 3)
dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
res <- BEI_EMWeighting(var_exp, aweights)
This function implements the computation to obtain the index Probability Density Functions (PDFs) (e.g. NAO index) obtained by combining the index PDFs of two Seasonal Forecast Systems (SFSs): the Best Index estimation (see Sanchez-Garcia, E. et al (2019), doi:10.5194/asr-16-165-2019 for more details about the methodology applied to estimate the Best Index).
BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1 = NULL,
  index_fcst2 = NULL, method_BC = "none", time_dim_name = "time",
  na.rm = FALSE)
index_obs |
Index (e.g. NAO index) array from an observational database or reanalysis with at least a temporal dimension (by default 'time'), which must be greater than 2. |
index_hind1 |
Index (e.g. NAO index) array from a SFS (named SFS1) with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be greater than 2. The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, containing the two parameters of a normal distribution (mean and sd) representing the ensemble of a SFS. It is not possible to have the dimensions 'member' and 'statistic' at the same time. A sketch of the (time, statistic) format is shown after this parameter list. |
index_hind2 |
Index (e.g. NAO index) array from a SFS (named SFS2) with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be greater than 2. The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, containing the two parameters of a normal distribution (mean and sd) representing the ensemble of a SFS. It is not possible to have the dimensions 'member' and 'statistic' together. |
index_fcst1 |
(optional, default = NULL) Index (e.g. NAO index) array from the forecast of SFS1 with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be equal to 1: the forecast year target. The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, containing the two parameters of a normal distribution (mean and sd) representing the ensemble of a SFS. It is not possible to have the dimensions 'member' and 'statistic' together. |
index_fcst2 |
(optional, default = NULL) Index (e.g. NAO index) array from the forecast of SFS2 with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be equal to 1: the forecast year target. The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, containing the two parameters of a normal distribution (mean and sd) representing the ensemble of a SFS. It is not possible to have the dimensions 'member' and 'statistic' together. |
method_BC |
A character vector of maximum length 2 indicating the bias correction methodology to be applied to each SFS. If it is 'none', or any of its elements is 'none', the bias correction won't be applied. The available methods developed are "ME" (a bias correction scheme based on the mean error or bias between observations and predictions to correct the predicted values) and "LMEV" (a bias correction scheme based on a linear model using the ensemble variance of the index as predictor). (See Sanchez-Garcia, E. et al (2019), doi:10.5194/asr-16-165-2019 for more details.) |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
na.rm |
Logical (default = FALSE). Should missing values be removed? |
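A minimal sketch of how a (time, member) index array can be summarized into the (time, statistic) format described above (mean first, standard deviation second); this only illustrates the expected input shape, it is not the package's own conversion:

index_hind <- rnorm(5 * 4, mean = 0.2, sd = 3)
dim(index_hind) <- c(time = 5, member = 4)
# statistic 1 = ensemble mean, statistic 2 = ensemble standard deviation
stats <- cbind(apply(index_hind, 1, mean), apply(index_hind, 1, sd))
dim(stats) <- c(time = 5, statistic = 2)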
BEI_PDFBest() returns an array with the parameters that characterize the PDFs, with at least a temporal dimension, by default 'time', and a dimension 'statistic' equal to 2. The first statistic is the parameter 'mean' of the PDF for the best estimation combining the two SFSs' PDFs. The second statistic is the parameter 'standard deviation' of the PDF for the best estimation combining the two SFSs' PDFs. If index_fcst1 and/or index_fcst2 are NULL, it returns the values for the hindcast period. Otherwise, it returns the values for a forecast year.
Eroteida Sanchez-Garcia - AEMET, [email protected]
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
# Example 1 for the BEI_PDFBest function
index_obs <- rnorm(10, sd = 3)
dim(index_obs) <- c(time = 5, season = 2)
index_hind1 <- rnorm(40, mean = 0.2, sd = 3)
dim(index_hind1) <- c(time = 5, member = 4, season = 2)
index_hind2 <- rnorm(60, mean = -0.5, sd = 4)
dim(index_hind2) <- c(time = 5, member = 6, season = 2)
index_fcst1 <- rnorm(16, mean = 0.2, sd = 3)
dim(index_fcst1) <- c(time = 1, member = 8, season = 2)
index_fcst2 <- rnorm(18, mean = -0.5, sd = 4)
dim(index_fcst2) <- c(time = 1, member = 9, season = 2)
method_BC <- 'ME'
res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1,
                   index_fcst2, method_BC)

# Example 2 for the BEI_PDFBest function
index_obs <- rnorm(10, sd = 3)
dim(index_obs) <- c(time = 5, season = 2)
index_hind1 <- rnorm(40, mean = 0.2, sd = 3)
dim(index_hind1) <- c(time = 5, member = 4, season = 2)
index_hind2 <- rnorm(60, mean = -0.5, sd = 4)
dim(index_hind2) <- c(time = 5, member = 6, season = 2)
index_fcst1 <- rnorm(16, mean = 0.2, sd = 3)
dim(index_fcst1) <- c(time = 1, member = 8, season = 2)
index_fcst2 <- rnorm(18, mean = -0.5, sd = 4)
dim(index_fcst2) <- c(time = 1, member = 9, season = 2)
method_BC <- c('LMEV', 'ME')
res <- BEI_PDFBest(index_obs, index_hind1, index_hind2, index_fcst1,
                   index_fcst2, method_BC)
This function implements the computation to obtain the tercile probabilities for a weighted variable for SFSs using a normalized weights array.
BEI_ProbsWeighting(var_exp, aweights, terciles, time_dim_name = "time",
  memb_dim = "member")
var_exp |
Variable (e.g. precipitation, temperature, NAO index) array from a SFS with at least dimensions (time, member) for a spatially aggregated variable or dimensions (time, member, lat, lon) for a spatial variable, with 'time' as the temporal dimension by default. |
aweights |
Normalized weights array with at least dimensions (time, member), with 'time' as the temporal dimension by default. |
terciles |
A numeric array with at least one dimension 'tercil' equal to 2; the first element is the lower tercile for a hindcast period, and the second element is the upper tercile. |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
memb_dim |
A character string indicating the name of the member dimension, by default 'member'. |
BEI_ProbsWeighting() returns an array with at least two or four dimensions, depending on whether the variable is a spatially aggregated variable (e.g. NAO index) (time, tercil) or a spatial variable (e.g. precipitation or temperature) (time, tercil, lat, lon), containing the tercile probabilities computed with weighted members. The first element is the lower tercile category, the second is the normal category and the third is the upper tercile category. One plausible reading of this computation is sketched below.
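One plausible reading of 'probabilities computed with weighted members' (a sketch under that assumption, for a single time step and grid point, with normalized weights):

x <- c(1, 3, 6, 9)          # member values at one time step
w <- c(0.2, 0.1, 0.3, 0.4)  # normalized member weights
terciles <- c(2.5, 5)       # lower and upper tercile thresholds
p_below <- sum(w[x <= terciles[1]])
p_normal <- sum(w[x > terciles[1] & x <= terciles[2]])
p_above <- sum(w[x > terciles[2]])
c(p_below, p_normal, p_above)  # sums to 1 for normalized weights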
Eroteida Sanchez-Garcia - AEMET, [email protected]
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
# Example 1
var_exp <- 1 : (2 * 4)
dim(var_exp) <- c(time = 2, member = 4)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
terciles <- c(2.5, 5)
dim(terciles) <- c(tercil = 2)
res <- BEI_ProbsWeighting(var_exp, aweights, terciles)

# Example 2
var_exp <- rnorm(48, 50, 9)
dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
terciles <- rep(c(48, 50), 2 * 3)
dim(terciles) <- c(tercil = 2, lat = 2, lon = 3)
res <- BEI_ProbsWeighting(var_exp, aweights, terciles)
This function implements the computation to obtain the terciles for a weighted variable for SFSs using a normalized weights array.
BEI_TercilesWeighting(var_exp, aweights, time_dim_name = "time",
  memb_dim = "member")
var_exp |
Variable (e.g. precipitation, temperature, NAO index) array from a SFS with at least dimensions (time, member) for a spatially aggregated variable or dimensions (time, member, lat, lon) for a spatial variable, with 'time' as the temporal dimension by default. |
aweights |
Normalized weights array with at least dimensions (time, member), with 'time' as the temporal dimension by default. |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
memb_dim |
A character string indicating the name of the member dimension, by default 'member'. |
BEI_TercilesWeighting() returns an array with at least one dimension, depending on whether the variable is a spatially aggregated variable (e.g. NAO index) (tercil) or a spatial variable (e.g. precipitation or temperature) (tercil, lat, lon), containing the terciles computed with weighted members. The first element is the lower tercile, the second is the upper tercile. A minimal sketch of a weighted tercile computation follows.
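One way to obtain terciles from weighted members is through the weighted empirical CDF; a minimal sketch (the helper weighted_quantile is hypothetical, not the package's implementation):

weighted_quantile <- function(x, w, prob) {
  ord <- order(x)
  cw <- cumsum(w[ord]) / sum(w)  # weighted empirical CDF
  x[ord][which(cw >= prob)[1]]
}
x <- c(1, 3, 6, 9)
w <- c(0.2, 0.1, 0.3, 0.4)
c(lower = weighted_quantile(x, w, 1/3), upper = weighted_quantile(x, w, 2/3))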
Eroteida Sanchez-Garcia - AEMET, [email protected]
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
# Example 1
var_exp <- 1 : (2 * 4)
dim(var_exp) <- c(time = 2, member = 4)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
res <- BEI_TercilesWeighting(var_exp, aweights)

# Example 2
var_exp <- rnorm(48, 50, 9)
dim(var_exp) <- c(time = 2, member = 4, lat = 2, lon = 3)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3)
dim(aweights) <- c(time = 2, member = 4)
res <- BEI_TercilesWeighting(var_exp, aweights)
This function implements the computation to obtain the normalized weights for each member of each Seasonal Forecast System (SFS) or dataset using the Probability Density Functions (PDFs) indicated by the parameter 'pdf_weight' (for instance the Best Index estimation obtained using the 'PDFBest' function). The weight of each member is proportional to the probability of its index calculated with the PDF 'pdf_weight'.
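Since the weight of each member is proportional to the probability of its index under the Gaussian PDF, the per-time-step computation can be sketched as follows (illustrative values):

index <- c(-0.5, 0.1, 0.8, 1.5)    # member index values at one time step
pdf_par <- c(mean = 0.2, sd = 1.1) # the two 'statistic' values of the PDF
w <- dnorm(index, mean = pdf_par['mean'], sd = pdf_par['sd'])
aweights <- w / sum(w)             # normalized weights, summing to 1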
BEI_Weights(index_weight, pdf_weight, time_dim_name = "time")
index_weight |
Index (e.g. NAO index) array, from a dataset of SFSs for a period of years, with at least the dimension 'member'. Additional dimensions, for instance a temporal dimension such as 'time', must have the same length in both parameters, 'index_weight' and 'pdf_weight'. |
pdf_weight |
Statistics array defining a Gaussian PDF with at least the dimension 'statistic'. The first statistic is the parameter 'mean' of the PDF and the second statistic is the parameter 'standard deviation' of the PDF. |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
BEI_Weights() returns a normalized weights array with the same dimensions as index_weight.
Eroteida Sanchez-Garcia - AEMET, [email protected]
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
# Example for the BEI_Weights function
index_weight <- 1 : (10 * 3 * 5 * 1)
dim(index_weight) <- c(sdate = 10, dataset = 3, member = 5, season = 1)
pdf_weight <- 1 : (10 * 3 * 2 * 1)
dim(pdf_weight) <- c(sdate = 10, dataset = 3, statistic = 2, season = 1)
res <- BEI_Weights(index_weight, pdf_weight)
dim(res)
# sdate dataset member season
#    10       3      5      1
This function applies the simple bias adjustment technique described in Torralba et al. (2017). The adjusted forecasts have an equivalent standard deviation and mean to that of the reference dataset.
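A per-grid-point sketch of a simple mean-and-variance adjustment in the spirit of this description (the corrected values match the reference mean and standard deviation); this illustrates the idea only and ignores details such as cross-validation and the ensemble structure handled by BiasCorrection:

exp_vals <- rnorm(50, mean = 2, sd = 3)  # pooled hindcast values
obs_vals <- rnorm(10, mean = 0, sd = 1)  # reference observations
corrected <- (exp_vals - mean(exp_vals)) * sd(obs_vals) / sd(exp_vals) +
  mean(obs_vals)
c(mean(corrected), sd(corrected))  # now equal to the observed mean and sd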
BiasCorrection(exp, obs, exp_cor = NULL, na.rm = FALSE, memb_dim = "member",
  sdate_dim = "sdate", dat_dim = NULL, ncores = NULL)
exp |
A multidimensional array with named dimensions containing the seasonal forecast experiment data with at least time and member dimensions. |
obs |
A multidimensional array with named dimensions containing the observed data with at least time dimension. |
exp_cor |
A multidimensional array with named dimensions containing the seasonal forecast experiment to be corrected with at least time and member dimension. If it is NULL, the 'exp' forecast will be corrected. If there is only one corrected dataset, it should not have dataset dimension. If there is a corresponding corrected dataset for each 'exp' forecast, the dataset dimension must have the same length as in 'exp'. The default value is NULL. |
na.rm |
A logical value indicating whether missing values should be stripped before the computation proceeds, by default it is set to FALSE. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
dat_dim |
A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
ncores |
An integer that indicates the number of cores for parallel computations using multiApply function. The default value is NULL. |
An array containing the bias-corrected forecasts with the dimensions nexp, nobs and the same dimensions as in the 'exp' object. nexp is the number of experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' is provided, the returned array will have the same dimensions as 'exp_cor'.
Verónica Torralba, [email protected]
Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. Davis (2017). Seasonal climate prediction: a new source of information for the management of wind energy resources. Journal of Applied Meteorology and Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, EUPORIAS, NEWA, RESILIENCE, SPECS)
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, time = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, time = 5, lat = 6, lon = 7)
a <- BiasCorrection(exp = mod1, obs = obs1)
Five types of member-by-member bias correction can be performed. The "bias" method corrects the bias only, while the "evmos" method applies a variance inflation technique to ensure the correction of the bias and the correspondence of variance between forecast and observation (Van Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods "mse_min" and "crps_min" correct the bias, the overall forecast variance and the ensemble spread as described in Doblas-Reyes et al. (2005) and Van Schaeybroeck and Vannitsem (2015), respectively. While the "mse_min" method minimizes a constrained mean-squared error using three parameters, the "crps_min" method features four parameters and minimizes the Continuous Ranked Probability Score (CRPS). The "rpc-based" method adjusts the forecast variance ensuring that the ratio of predictable components (RPC) is equal to one, as in Eade et al. (2014). Both in-sample and out-of-sample (leave-one-out cross-validation) calibration are possible. A minimal sketch of the simplest method, "bias", follows.
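As a minimal illustration of the simplest of these methods, "bias" amounts to removing the mean forecast-minus-observation error from every member (a sketch, not the package's cross-validated implementation):

fc <- matrix(rnorm(3 * 10, mean = 1), nrow = 3)  # members x start dates
obs <- rnorm(10)                                 # one observation per start date
bias <- mean(fc) - mean(obs)                     # mean error over all pairs
fc_cal <- fc - bias                              # bias-corrected members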
Calibration(exp, obs, exp_cor = NULL, cal.method = "mse_min",
  eval.method = "leave-one-out", multi.model = FALSE, na.fill = TRUE,
  na.rm = TRUE, apply_to = NULL, alpha = NULL, memb_dim = "member",
  sdate_dim = "sdate", dat_dim = NULL, ncores = NULL)
exp |
A multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal hindcast experiment data. The hindcast is used to calibrate the forecast in case the forecast is provided; if not, the same hindcast will be calibrated instead. |
obs |
A multidimensional array with named dimensions (at least 'sdate') containing the observed data. |
exp_cor |
An optional multidimensional array with named dimensions (at least 'sdate' and 'member') containing the seasonal forecast experiment data. If the forecast is provided, it will be calibrated using the hindcast and observations; if not, the hindcast will be calibrated instead. If there is only one corrected dataset, it should not have dataset dimension. If there is a corresponding corrected dataset for each 'exp' forecast, the dataset dimension must have the same length as in 'exp'. The default value is NULL. |
cal.method |
A character string indicating the calibration method used, which can be either 'bias', 'evmos', 'mse_min', 'crps_min' or 'rpc-based'. The default value is 'mse_min'. |
eval.method |
A character string indicating the sampling method used, which can be either 'in-sample' or 'leave-one-out'. The default value is 'leave-one-out'. |
multi.model |
A boolean that is used only for the 'mse_min' method. If multi-model ensembles or ensembles of different sizes are used, it must be set to TRUE. The default value is FALSE. |
na.fill |
A boolean that indicates what happens in case calibration is not possible or will yield unreliable results. This happens when three or fewer forecast-observation pairs are available to perform the training phase of the calibration. By default na.fill is set to TRUE, so NA values will be returned; if set to FALSE, the uncorrected data will be returned. |
na.rm |
A boolean that indicates whether to remove the NA values or not. The default value is TRUE. See Details for how it interacts with na.fill. |
apply_to |
A character string that indicates whether to apply the calibration to all the forecasts ('all') or only to those where the correlation between the ensemble mean and the observations is statistically significant ('sign'). Only useful if cal.method == 'rpc-based'. |
alpha |
A numeric value indicating the significance level for the correlation test. Only useful if cal.method == 'rpc-based' & apply_to == 'sign'. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
dat_dim |
A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
ncores |
An integer that indicates the number of cores for parallel computation using multiApply function. The default value is NULL (one core). |
Both the na.fill and na.rm parameters can be used to indicate how the function has to handle the NA values. The na.fill parameter checks whether there are more than three forecast-observation pairs to perform the computation. In case there are three or fewer pairs, the computation is not carried out, and the value returned by the function depends on the value of this parameter (either NA if na.fill == TRUE or the uncorrected value if na.fill == FALSE). On the other hand, na.rm is used to indicate to the function whether to remove the missing values during the computation of the parameters needed to perform the calibration.
An array containing the calibrated forecasts with the dimensions nexp, nobs and the same dimensions as in the 'exp' array. nexp is the number of experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' is provided, the returned array will have the same dimensions as 'exp_cor'.
Verónica Torralba, [email protected]
Bert Van Schaeybroeck, [email protected]
Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the success of multi-model ensembles in seasonal forecasting-II calibration and combination. Tellus A. 2005;57:234-252. doi:10.1111/j.1600-0870.2005.00104.x
Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate predictions underestimate the predictability of the real world? Geophysical Research Letters, 41(15), 5620-5628. doi:10.1002/2014GL061146
Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing through linear regression. Nonlinear Processes in Geophysics, 18(2), 147. doi:10.5194/npg-18-147-2011
Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. doi:10.1002/qj.2397
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
a <- Calibration(exp = mod1, obs = obs1)
This function converts a multi-model ensemble forecast into a categorical forecast by giving the probability for each category. Different methods are available to combine the different ensemble forecasting models into probabilistic categorical forecasts.
See details in ?CST_CategoricalEnsCombination
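As a conceptual sketch of turning an ensemble into categorical probabilities (equally-sized categories from quantiles, then member counting; illustrative only, the actual combination methods weight the models as described in the references below):

amt.cat <- 3
fc_members <- c(0.3, 1.2, 2.5, 0.8, 3.1, 1.9)  # ensemble at one time step
clim <- rnorm(100)                             # reference sample for category bounds
breaks <- quantile(clim, probs = seq(0, 1, length.out = amt.cat + 1))
breaks[1] <- -Inf; breaks[amt.cat + 1] <- Inf  # guard against out-of-range members
cat_of <- cut(fc_members, breaks = breaks, labels = FALSE)
# Probability of each category = fraction of members falling in it
probs <- tabulate(cat_of, nbins = amt.cat) / length(fc_members)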
CategoricalEnsCombination(fc, obs, cat.method, eval.method, amt.cat, ...)
fc |
A multi-dimensional array with named dimensions containing the seasonal forecast experiment data in the element named $data. |
obs |
A multidimensional array with named dimensions containing the observed data in the element named $data. |
cat.method |
Method used to produce the categorical forecast, which can be either 'pool', 'comb', 'mmw' or 'obs' (see details in ?CST_CategoricalEnsCombination). |
eval.method |
The sampling method used, which can be either 'in-sample' or 'leave-one-out'. The default value is 'leave-one-out'. |
amt.cat |
The number of categories. Equally-sized quantiles will be calculated based on the number of categories. |
... |
Other parameters to be passed on to the calibration procedure. |
An array containing the categorical forecasts in the element called $data. The first two dimensions of the returned object are named dataset and member and are both of size one. An additional dimension named category is introduced and is of size amt.cat.
Bert Van Schaeybroeck, [email protected]
Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Monthly Weather Review, 130(7), 1792-1811.
Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744.
Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290).
This function searches for analogs in a reference dataset for experiment data, based on corresponding weather types. The experiment data is typically a hindcast; observations are typically provided by reanalysis data.
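Conceptually, the analog search restricts candidate observed dates to those sharing the experiment day's weather type and then ranks them by the chosen criterion; a minimal sketch for one experiment value and one grid point (the scores are placeholders standing in for e.g. pattern correlation):

wt_target <- 2                       # weather type of the experiment day
wt_obs <- c(3, 2, 1, 2, 2, 1)        # weather types of the observed days
obs_vals <- c(10, 12, 7, 11, 13, 8)  # observed values at one grid point
candidates <- which(wt_obs == wt_target)
score <- runif(length(candidates))   # placeholder ranking criterion
nanalogs <- 2
analog_vals <- obs_vals[candidates[order(score, decreasing = TRUE)][1:nanalogs]]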
CST_AdamontAnalog(exp, obs, wt_exp, wt_obs, nanalogs, method = "pattcorr",
  thres = NULL, search_obsdims = c("member", "sdate", "ftime"),
  londim = "lon", latdim = "lat")

AdamontAnalog(exp, obs, wt_exp, wt_obs, nanalogs = 5, method = "pattcorr",
  thres = NULL, search_obsdims = c("member", "sdate", "ftime"),
  londim = "lon", latdim = "lat")
exp |
A multidimensional array with named dimensions containing the experiment data. |
obs |
A multidimensional array with named dimensions containing the reference data. Note that the lat/lon dimensions need to be the same as in 'exp'. |
wt_exp |
Corresponding weather types (same dimensions as 'exp' without the lat/lon dimensions). |
wt_obs |
Corresponding weather types (same dimensions as 'obs' without the lat/lon dimensions). |
nanalogs |
Integer defining the number of analog values to return (default: 5). |
method |
A character string indicating the method used for analog definition. It can be 'pattcorr' (pattern correlation), 'rain1' (rain occurrence consistency) or 'rain01' (rain occurrence/non-occurrence consistency). |
thres |
Real number indicating the threshold used to define rain occurrence/non-occurrence for the 'rain1' and 'rain01' methods. |
search_obsdims |
List of dimensions in 'obs' along which analogs are searched for. |
londim |
Name of longitude dimension. |
latdim |
Name of latitude dimension. |
analog_vals: For CST_AdamontAnalog, an object of class s2dv_cube containing nanalogs analog values for each value of the exp input data. For AdamontAnalog, an array containing the nanalogs analog values.
Paola Marson, [email protected] for PROSNOW version
Lauriane Batté, [email protected] for CSTools adaptation
wt_exp <- sample(1:3, 15 * 6 * 3, replace = TRUE)
dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3)
wt_obs <- sample(1:3, 6 * 3, replace = TRUE)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3)
exp <- NULL
exp$data <- 1 : c(1 * 15 * 6 * 3 * 8 * 8)
dim(exp$data) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3,
                   lat = 8, lon = 8)
class(exp) <- 's2dv_cube'
obs <- NULL
obs$data <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8)
dim(obs$data) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3,
                   lat = 8, lon = 8)
class(obs) <- 's2dv_cube'
analog_vals <- CST_AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp,
                                 wt_obs = wt_obs, nanalogs = 2)

wt_exp <- sample(1:3, 15 * 6 * 3, replace = TRUE)
dim(wt_exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3)
wt_obs <- sample(1:3, 6 * 3, replace = TRUE)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3)
exp <- 1 : c(1 * 15 * 6 * 3 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 15, sdate = 6, ftime = 3, lat = 8, lon = 8)
obs <- 101 : c(100 + 1 * 1 * 6 * 3 * 8 * 8)
dim(obs) <- c(dataset = 1, member = 1, sdate = 6, ftime = 3, lat = 8, lon = 8)
analog_vals <- AdamontAnalog(exp = exp, obs = obs, wt_exp = wt_exp,
                             wt_obs = wt_obs, nanalogs = 2)
This function computes a quantile mapping based on weather types for experiment data (typically a hindcast) onto reference observations ('obs'), typically provided by reanalysis data.
CST_AdamontQQCorr(exp, wt_exp, obs, wt_obs,
  corrdims = c("member", "sdate", "ftime"), londim = "lon", latdim = "lat")
exp |
Experiment data, an object of class 's2dv_cube'. |
wt_exp |
Corresponding weather types (same dimensions as exp$data without the lat/lon dimensions). |
obs |
Reference data, also an object of class 's2dv_cube'. |
wt_obs |
Corresponding weather types (same dimensions as obs$data without the lat/lon dimensions). |
corrdims |
List of dimensions in 'exp' along which the quantile mapping correction is applied (default: c('member', 'sdate', 'ftime')). |
londim |
Character name of the longitude dimension in 'exp' and 'obs'. |
latdim |
Character name of the latitude dimension in 'exp' and 'obs'. |
An object of class s2dv_cube containing the experiment data on the lat/lon grid of the 'obs' input data, corrected by quantile mapping depending on the weather types 'wt_exp'.
Lauriane Batté, [email protected]
Paola Marson, [email protected]
Gildas Dayon, [email protected]
wt_exp <- c(1, 1, 2, 3, 3, 2, 2, 1, 1, 2, 2, 3)
dim(wt_exp) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
wt_obs <- c(3, 3, 1, 2, 2, 2, 2, 1, 3, 1, 1, 2)
dim(wt_obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3)
exp <- NULL
exp$data <- 1 : c(1 * 1 * 4 * 3 * 4 * 4)
dim(exp$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
                   lat = 4, lon = 4)
class(exp) <- 's2dv_cube'
obs <- NULL
obs$data <- 101 : c(100 + 1 * 1 * 4 * 3 * 4 * 4)
dim(obs$data) <- c(dataset = 1, member = 1, sdate = 4, ftime = 3,
                   lat = 4, lon = 4)
class(obs) <- 's2dv_cube'
exp_corr <- CST_AdamontQQCorr(exp = exp, wt_exp = wt_exp, obs = obs,
                              wt_obs = wt_obs,
                              corrdims = c('dataset', 'member', 'sdate', 'ftime'))
This function performs a downscaling using analogs. To compute the analogs, the function searches for days with large scale conditions similar to those of the fields to be downscaled to a local scale. The large scale and the local scale regions are defined by the user. The large scale is usually given by atmospheric circulation, such as sea level pressure or geopotential height (Yiou et al, 2013), but the function gives the possibility to use another field. The local scale will usually be given by precipitation or temperature fields, but it might be another variable. The analogs function will find the best analogs based on the minimum Euclidean distance in the large scale pattern (i.e. SLP).
The search of analogs must be done in the longest dataset possible. This is important since it is necessary to have a good representation of the possible states of the field in the past, and therefore, to get better analogs. This function has no constraints of specific regions, variables to downscale, or data to be used (seasonal forecast data, climate projections data, reanalysis data). The regridding into a finer scale is done by interpolation with CST_Start. Then, this interpolation is corrected by selecting the analogs in the large and local scales based on the observations. The function is an adapted version of the method of Yiou et al 2013. For an advanced search of analogs (multiple analogs, different criteria, further information from the metrics and date of the selected analogs) use the 'Analogs' function within the 'CSTools' package.
CST_Analogs( expL, obsL, expVar = NULL, obsVar = NULL, sdate_dim = "sdate", region = NULL, criteria = "Large_dist", excludeTime = NULL, time_expL = NULL, time_obsL = NULL, nAnalogs = NULL, AnalogsInfo = FALSE, ncores = NULL )
expL |
An 's2dv_cube' object containing the experimental field on the large scale for which the analog is aimed. This field is used in all the criteria. If parameter 'expVar' is not provided, the function will return the expL analog. The element 'data' in the 's2dv_cube' object must have, at least, latitudinal and longitudinal dimensions. The object is expected to be already subset for the desired large-scale region. Latitudinal dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal dimension accepted names: 'lon', 'lons', 'longitude', 'x', 'i', 'nav_lon'. |
obsL |
An 's2dv_cube' object containing the observational field on the large scale. The element 'data' in the 's2dv_cube' object must have the same latitudinal and longitudinal dimensions as parameter 'expL' and a temporal dimension with the maximum number of available observations. |
expVar |
An 's2dv_cube' object containing the experimental field on the local scale, usually a different variable from parameter 'expL'. If it is not NULL (NULL by default), the field returned by this function will be the analog of parameter 'expVar'. |
obsVar |
An 's2dv_cube' containing the field of the same variable as that passed in parameter 'expVar', for the same region. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
region |
A vector of length four indicating the minimum longitude, the maximum longitude, the minimum latitude and the maximum latitude. |
criteria |
A character string indicating the criteria to be used for the selection of analogs:
The criterion 'Large_dist' is recommended for CST_Analogs; for advanced use of the criteria 'Local_dist' and 'Local_cor', use the 'Analogs' function. |
excludeTime |
An array of N named dimensions (coinciding with the time dimensions in expL) of character string(s) indicating the date(s) of the observations in the format "dd/mm/yyyy" to be excluded during the search of analogs. It can be NULL, but if expL is not a forecast (time_expL contained in time_obsL), by default time_expL will be removed during the search of analogs. |
time_expL |
A character string indicating the date of the experiment in the same format as time_obsL (i.e. "yyyy-mm-dd"). By default it is NULL and dates are taken from element |
time_obsL |
A character string indicating the dates of the observations in the format "yyyy-mm-dd". By default it is NULL and dates are taken from element |
nAnalogs |
Number of analogs to be selected to apply the criteria 'Local_dist' or 'Local_cor'. This is not necessarily the number of analogs that the user gets, but the number of minimum-distance events among which the search for the best analog is performed. The default value for the 'Large_dist' criterion is 1; for the 'Local_dist' and 'Local_cor' criteria it must be greater than 1 in order to match the first criterion. If nAnalogs is NULL for 'Local_dist' and 'Local_cor', the default value is set to the length of 'time_obsL'. If AnalogsInfo is FALSE, the function returns just the best analog. |
AnalogsInfo |
A logical value. TRUE to get a list with two elements: 1) the downscaled field and 2) the AnalogsInfo, which contains a) the number of the best analogs, b) the corresponding value of the metric used in the selected criterion (distance values for Large_dist and Local_dist, correlation values for Local_cor), and c) the dates of the analogs. The analogs are listed in decreasing order; the first one is the best analog (i.e. if the selected criterion is Local_cor, the best analog is the one with the highest correlation, while for the Large_dist criterion the best analog is the day with the minimum Euclidean distance). Set to FALSE to get a single analog, the best one, for instance for downscaling. |
ncores |
The number of cores to use in parallel computation. |
An 's2dv_cube' object containing an array with the downscaled values of the best analogs in element 'data'. If 'AnalogsInfo' is TRUE, 'data' is a list with an array of the downscaled fields and the analogs information in elements 'analogs', 'metric' and 'dates'.
M. Carmen Alvarez-Castro, [email protected]
Maria M. Chaves-Montero, [email protected]
Veronica Torralba, [email protected]
Nuria Perez-Zanon [email protected]
Yiou, P., T. Salameh, P. Drobinski, L. Menut, R. Vautard, and M. Vrac, 2013: Ensemble reconstruction of the atmospheric column from surface pressure using analogues. Clim. Dyn., 41, 1419-1437.
expL <- rnorm(1:200)
dim(expL) <- c(member = 10, lat = 4, lon = 5)
obsL <- c(rnorm(1:180), expL[1, , ] * 1.2)
dim(obsL) <- c(time = 10, lat = 4, lon = 5)
time_obsL <- as.POSIXct(paste(rep("01", 10), rep("01", 10), 1994:2003,
                              sep = "-"), format = "%d-%m-%Y")
dim(time_obsL) <- c(time = 10)
time_expL <- time_obsL[1]
dim(time_expL) <- c(time = 1)
lon <- seq(-1, 5, 1.5)
lat <- seq(30, 35, 1.5)
coords <- list(lon = lon, lat = lat)
attrs_expL <- list(Dates = time_expL)
attrs_obsL <- list(Dates = time_obsL)
expL <- list(data = expL, coords = coords, attrs = attrs_expL)
obsL <- list(data = obsL, coords = coords, attrs = attrs_obsL)
class(expL) <- 's2dv_cube'
class(obsL) <- 's2dv_cube'
region <- c(min(lon), max(lon), min(lat), max(lat))
downscaled_field <- CST_Analogs(expL = expL, obsL = obsL, region = region)
This function downscales low-resolution precipitation data (e.g. from seasonal forecast models) through the association with an observational high-resolution (HR) dataset (the AEMET 5 km gridded dataset of daily precipitation; Peral et al., 2017) and a collection of predictors and past synoptic situations similar to the day to be estimated. The method uses three domains:
Peninsular Spain and Balearic Islands domain (5 km resolution): HR precipitation and the downscaling result domain.
Synoptic domain (low resolution, e.g. 1.5º x 1.5º): it should be centered over the Iberian Peninsula and cover enough extension to detect as many synoptic situations as possible.
Extended domain (low resolution, e.g. 1.5º x 1.5º): it should have the same resolution as synoptic domain. It is used for SLP Seasonal Forecast Models.
CST_AnalogsPredictors( exp, slp, obs, lon, lat, slp_lon, slp_lat, var_name, hr_obs, tdates, ddates, restrain, dim_name_longitude = "lon", dim_name_latitude = "lat", dim_name_time = "time" )
exp |
List of arrays with seasonal forecast data for the downscaling period. The list has to contain the model atmospheric variables (instantaneous 12h data) that must be identified by the names in parentheses. For precipitation:
For temperature:
The arrays must have at least three dimensions with names 'lon', 'lat' and 'time'. (lon = gridpoints of longitude, lat = gridpoints of latitude, time = number of downscaling days) Seasonal forecast variables must have the same resolution and domain as reanalysis variables ('obs' parameter, below). |
slp |
Array with the atmospheric seasonal forecast model sea level pressure (instantaneous 12h data) that must be identified as 'slp' (hPa). It has the same resolution as the 'exp' and 'obs' parameters but with an extended domain. This domain contains extra degrees (mostly in the northern and western parts) compared to the synoptic domain. The array must have at least three dimensions with names 'lon', 'lat' and 'time'. |
obs |
List of arrays with reanalysis data for the training period. The list has to contain the reanalysis atmospheric variables (instantaneous 12h data) that must be identified by the names in parentheses. For precipitation:
For maximum and minimum temperature:
The arrays must have at least three dimensions with names 'lon', 'lat' and 'time'. |
lon |
Vector of the synoptic longitudes (from -180º to 180º). The vector must go from west to east. The same as for the training function. |
lat |
Vector of the synoptic latitude. The vector must go from north to south. The same as for the training function. |
slp_lon |
Vector of the extended longitudes (from -180º to 180º). The vector must go from west to east. The same as for the training function. |
slp_lat |
Vector of the extended latitude. The vector must go from north to south. The same as for the training function. |
var_name |
Variable name to downscale. There are two options: 'prec' for precipitation and 'temp' for maximum and minimum temperature. |
hr_obs |
Local path of the HR observational files (maestro and pcp/tmx-tmn). For precipitation and temperature they can be downloaded from https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2. The maestro file (maestro_red_hr_SPAIN.txt) contains the gridpoint (nptos), longitude (lon), latitude (lat) and altitude (alt) in columns (vector structure). The data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5 km resolution Spanish daily data (precipitation or maximum and minimum temperature from January 1951 to June 2020). See the README file for more information. IMPORTANT: the HR observational period must be the same as for the reanalysis variables. It is assumed that the training period is shorter than the HR original one (1951-2019), so a new ASCII file with the new period and the same structure as the original must be created, specifying the training dates in the name (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231' period). |
tdates |
Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 19810101-20181231). |
ddates |
Downscaling period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 20191001-20200331). |
restrain |
Output (list of matrices) obtained from the 'training_analogs' function. For precipitation, the 'restrain' object must contain the variables um, vm, nger, gu92, gv92, gu52, gv52, neni, vdmin, vref, ccm, lab_pred and cor_pred. For maximum and minimum temperature, the 'restrain' object must contain um, vm, insol, neni, vdmin and vref. See 'AnalogsPred_train.R' for more information. |
dim_name_longitude |
A character string indicating the name of the longitude dimension, by default 'lon'. |
dim_name_latitude |
A character string indicating the name of the latitude dimension, by default 'lat'. |
dim_name_time |
A character string indicating the name of the time dimension, by default 'time'. |
Matrix with seasonal forecast precipitation (mm) or maximum and minimum temperature (tenths of ºC) on a 5 km x 5 km regular grid over peninsular Spain and the Balearic Islands. The resulting matrices have two dimensions ('ddates' x 'nptos'; ddates = number of downscaling days, nptos = number of 'hr_obs' gridpoints).
Marta Dominguez Alonso - AEMET, [email protected]
Nuria Perez-Zanon - BSC, [email protected]
This function computes the anomalies relative to a climatology computed along the selected dimension (usually starting dates or forecast time), allowing the application (or not) of cross-validated climatologies. The computation is carried out independently for experimental and observational data products.
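As a minimal sketch of what the cross-validation option does (toy series at one grid point; all names illustrative), with cross = TRUE the climatology for each start date is computed leaving that start date out:

x <- rnorm(10)                                              # toy values along 'sdate'
clim_loo <- sapply(seq_along(x), function(i) mean(x[-i]))   # leave-one-out climatology
anom_cross <- x - clim_loo                                  # cross-validated anomalies
anom_plain <- x - mean(x)                                   # cross = FALSE equivalent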
CST_Anomaly( exp = NULL, obs = NULL, dim_anom = "sdate", cross = FALSE, memb_dim = "member", memb = TRUE, dat_dim = c("dataset", "member"), filter_span = NULL, ftime_dim = "ftime", ncores = NULL )
exp |
An object of class |
obs |
An object of class |
dim_anom |
A character string indicating the name of the dimension along which the climatology will be computed. The default value is 'sdate'. |
cross |
A logical value indicating whether cross-validation should be applied or not. Default = FALSE. |
memb_dim |
A character string indicating the name of the member dimension. It must be one dimension in 'exp' and 'obs'. If there is no member dimension, set NULL. The default value is 'member'. |
memb |
A logical value indicating whether to subtract the climatology based on the individual members (TRUE) or the ensemble mean over all members (FALSE) when calculating the anomalies. The default value is TRUE. |
dat_dim |
A character vector indicating the name of the dataset and member dimensions. If there is no dataset dimension, it can be NULL. The default value is "c('dataset', 'member')". |
filter_span |
A numeric value indicating the degree of smoothing. This
option is only available if parameter |
ftime_dim |
A character string indicating the name of the temporal dimension where the smoothing with 'filter_span' will be applied. It cannot be NULL if 'filter_span' is provided. The default value is 'ftime'. |
ncores |
An integer indicating the number of cores to use for parallel computation. The default value is NULL. It will be used only when 'filter_span' is not NULL. |
A list with two S3 objects, 'exp' and 'obs', of the class 's2dv_cube', containing experimental and date-corresponding observational anomalies, respectively. These 's2dv_cube's can be ingested by other functions in CSTools.
Perez-Zanon Nuria, [email protected]
Pena Jesus, [email protected]
Ano_CrossValid, Clim and CST_Start
mod <- 1 : (2 * 3 * 4 * 5 * 6 * 7)
dim(mod) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lon = lon, lat = lat)
exp <- list(data = mod, coords = coords)
obs <- list(data = obs, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
anom <- CST_Anomaly(exp = exp, obs = obs, cross = FALSE, memb = TRUE)
Function to apply weights to an 's2dv_cube' object. It can return a weighted ensemble mean (deterministic output) or the tercile probabilities (probabilistic output) for Seasonal Forecast Systems (SFSs).
CST_BEI_Weighting( var_exp, aweights, terciles = NULL, type = "ensembleMean", time_dim_name = "time", memb_dim = "member" )
var_exp |
An object of the class 's2dv_cube' containing the variable
(e.g. precipitation, temperature, NAO index) array.
The var_exp object is expected to have an element named |
aweights |
Normalized weights array with at least the dimensions (time, member), where 'time' is the default temporal dimension. When the 'aweights' parameter has any other dimension (e.g. 'lat') that is also present in 'var_exp', the two must be equal. |
terciles |
A numeric array with at least one dimension 'tercil' of length 2; the first element is the lower tercile for a hindcast period and the second element is the upper tercile. By default it is NULL and the terciles are computed from the var_exp data. |
type |
A character string indicating the type of output. If 'type' = 'probs', the function returns, in the element 'data' of the 'var_exp' parameter, an array with at least two or four dimensions, depending on whether the variable is spatially aggregated (e.g. the NAO index; dimensions (time, tercil)) or spatial (e.g. precipitation or temperature; dimensions (time, tercil, lat, lon)), containing the tercile probabilities computed with weighted members. The first tercile is the lower tercile, the second is the normal tercile and the third is the upper tercile. If 'type' = 'ensembleMean', the function returns, in the element 'data' of the 'var_exp' parameter, an array with at least one or three dimensions, depending on whether the variable is spatially aggregated (dimension (time)) or spatial (dimensions (time, lat, lon)), containing the ensemble means computed with weighted members. |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
memb_dim |
A character string indicating the name of the member dimension, by default 'member'. |
CST_BEI_Weighting() returns a CSTools object (i.e., of the class 's2dv_cube'). This object has at least an element named $data with at least a temporal dimension (and the dimension 'tercil' when the output is tercile probabilities), containing the ensemble means computed with weighted members or the tercile probabilities.
Eroteida Sanchez-Garcia - AEMET, [email protected]
Regionally improved seasonal forecast of precipitation through Best estimation of winter NAO, Sanchez-Garcia, E. et al., Adv. Sci. Res., 16, 165-174, 2019, doi:10.5194/asr-16-165-2019
var_exp <- 1 : (2 * 4 * 3 * 2)
dim(var_exp) <- c(time = 2, member = 4, lat = 3, lon = 2)
aweights <- c(0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.4, 0.3,
              0.1, 0.2, 0.4, 0.4, 0.1, 0.2, 0.4, 0.2)
dim(aweights) <- c(time = 2, member = 4, dataset = 2)
var_exp <- list(data = var_exp)
class(var_exp) <- 's2dv_cube'
res_CST <- CST_BEI_Weighting(var_exp, aweights)
This function applies the simple bias adjustment technique described in Torralba et al. (2017). The adjusted forecasts have a standard deviation and mean equivalent to those of the reference dataset.
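A minimal sketch of the mean-and-variance adjustment idea behind this technique (not the exact CSTools implementation, which operates member-by-member along the start-date dimension; all names are illustrative):

simple_bias_adjust <- function(fc, obs) {
  # rescale so the adjusted forecast has the mean and sd of the reference
  (fc - mean(fc)) * (sd(obs) / sd(fc)) + mean(obs)
}
fc <- rnorm(20, mean = 2, sd = 3)
obs <- rnorm(20)
fc_adj <- simple_bias_adjust(fc, obs)
c(mean(fc_adj), sd(fc_adj))   # equal to mean(obs) and sd(obs)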
CST_BiasCorrection( exp, obs, exp_cor = NULL, na.rm = FALSE, memb_dim = "member", sdate_dim = "sdate", dat_dim = NULL, ncores = NULL )
exp |
An object of class |
obs |
An object of class |
exp_cor |
An object of class |
na.rm |
A logical value indicating whether missing values should be stripped before the computation proceeds, by default it is set to FALSE. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
dat_dim |
A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
ncores |
An integer that indicates the number of cores for parallel computations using multiApply function. The default value is NULL. |
An object of class s2dv_cube containing the bias-corrected forecasts with the dimensions nexp, nobs and the same dimensions as in the 'exp' object. nexp is the number of experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' is provided, the returned array has the same dimensions as 'exp_cor'.
Verónica Torralba, [email protected]
Torralba, V., F.J. Doblas-Reyes, D. MacLeod, I. Christel and M. Davis (2017). Seasonal climate prediction: a new source of information for the management of wind energy resources. Journal of Applied Meteorology and Climatology, 56, 1231-1247, doi:10.1175/JAMC-D-16-0204.1. (CLIM4ENERGY, EUPORIAS, NEWA, RESILIENCE, SPECS)
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, time = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, time = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_BiasCorrection(exp = exp, obs = obs)
Five types of member-by-member bias correction can be performed. The "bias" method corrects the bias only; the "evmos" method applies a variance inflation technique to ensure the correction of the bias and the correspondence of variance between forecast and observation (Van Schaeybroeck and Vannitsem, 2011). The ensemble calibration methods "mse_min" and "crps_min" correct the bias, the overall forecast variance and the ensemble spread, as described in Doblas-Reyes et al. (2005) and Van Schaeybroeck and Vannitsem (2015), respectively. While the "mse_min" method minimizes a constrained mean-squared error using three parameters, the "crps_min" method features four parameters and minimizes the Continuous Ranked Probability Score (CRPS). The "rpc-based" method adjusts the forecast variance ensuring that the ratio of predictable components (RPC) is equal to one, as in Eade et al. (2014). It is equivalent to the function Calibration but for objects of class s2dv_cube. A sketch of the simplest method combined with leave-one-out evaluation is given below.
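For intuition, here is a hedged sketch of the simplest option, cal.method = "bias", combined with eval.method = "leave-one-out": each start date is corrected with a bias estimated from all the other start dates. The toy vectors are illustrative only, not part of the package API.

fc  <- rnorm(20, mean = 2)    # toy single-member hindcast along 'sdate'
obs <- rnorm(20)
cal <- sapply(seq_along(fc), function(i) {
  fc[i] + mean(obs[-i]) - mean(fc[-i])   # train the bias on the remaining start dates
})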
CST_Calibration( exp, obs, exp_cor = NULL, cal.method = "mse_min", eval.method = "leave-one-out", multi.model = FALSE, na.fill = TRUE, na.rm = TRUE, apply_to = NULL, alpha = NULL, memb_dim = "member", sdate_dim = "sdate", dat_dim = NULL, ncores = NULL )
exp |
An object of class |
obs |
An object of class |
exp_cor |
An optional object of class |
cal.method |
A character string indicating the calibration method used,
can be either |
eval.method |
A character string indicating the sampling method used, it
can be either |
multi.model |
A boolean that is used only for the |
na.fill |
A boolean that indicates what happens in case calibration is not possible or will yield unreliable results. This happens when three or fewer forecast-observation pairs are available to perform the training phase of the calibration. By default |
na.rm |
A boolean that indicates whether to remove the NA values or not.
The default value is |
apply_to |
A character string that indicates whether to apply the
calibration to all the forecast ( |
alpha |
A numeric value indicating the significance level for the
correlation test. Only useful if |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
dat_dim |
A character string indicating the name of dataset dimension. The length of this dimension can be different between 'exp' and 'obs'. The default value is NULL. |
ncores |
An integer that indicates the number of cores for parallel computations using the multiApply function. The default value is NULL. |
Both the na.fill and na.rm parameters can be used to indicate how the function has to handle the NA values. The na.fill parameter checks whether there are more than three forecast-observation pairs to perform the computation. In case there are three or fewer pairs, the computation is not carried out, and the value returned by the function depends on the value of this parameter (either NA if na.fill == TRUE or the uncorrected value if na.fill == FALSE). On the other hand, na.rm is used to indicate whether the function should remove the missing values during the computation of the parameters needed to perform the calibration.
An object of class s2dv_cube containing the calibrated forecasts in the element data, with the dimensions nexp, nobs and the same dimensions as in the 'exp' object. nexp is the number of experiments (i.e., 'dat_dim' in exp), and nobs is the number of observations (i.e., 'dat_dim' in obs). If dat_dim is NULL, nexp and nobs are omitted. If 'exp_cor' is provided, the returned array has the same dimensions as 'exp_cor'.
Verónica Torralba, [email protected]
Bert Van Schaeybroeck, [email protected]
Doblas-Reyes F.J, Hagedorn R, Palmer T.N. The rationale behind the success of multi-model ensembles in seasonal forecasting-II calibration and combination. Tellus A. 2005;57:234-252. doi:10.1111/j.1600-0870.2005.00104.x
Eade, R., Smith, D., Scaife, A., Wallace, E., Dunstone, N., Hermanson, L., & Robinson, N. (2014). Do seasonal-to-decadal climate predictions underestimate the predictability of the real world? Geophysical Research Letters, 41(15), 5620-5628. doi:10.1002/2014GL061146
Van Schaeybroeck, B., & Vannitsem, S. (2011). Post-processing through linear regression. Nonlinear Processes in Geophysics, 18(2), 147. doi:10.5194/npg-18-147-2011
Van Schaeybroeck, B., & Vannitsem, S. (2015). Ensemble post-processing using member-by-member approaches: theoretical aspects. Quarterly Journal of the Royal Meteorological Society, 141(688), 807-818. doi:10.1002/qj.2397
# Example 1:
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_Calibration(exp = exp, obs = obs, cal.method = "mse_min",
                     eval.method = "in-sample")

# Example 2:
mod1 <- 1 : (1 * 3 * 4 * 5 * 6 * 7)
mod2 <- 1 : (1 * 3 * 1 * 5 * 6 * 7)
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
dim(mod2) <- c(dataset = 1, member = 3, sdate = 1, ftime = 5, lat = 6, lon = 7)
obs1 <- 1 : (1 * 1 * 4 * 5 * 6 * 7)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod1, coords = coords)
obs <- list(data = obs1, coords = coords)
exp_cor <- list(data = mod2, lat = lat, lon = lon)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
attr(exp_cor, 'class') <- 's2dv_cube'
a <- CST_Calibration(exp = exp, obs = obs, exp_cor = exp_cor,
                     cal.method = "evmos")
This function converts a multi-model ensemble forecast into a categorical forecast by giving the probability for each category. Different methods are available to combine the different ensemble forecasting models into probabilistic categorical forecasts.
Motivation: Beyond the short range, the unpredictable component of weather predictions becomes substantial due to the chaotic nature of the Earth system. Therefore, predictions can mostly be skillful when used in a probabilistic sense. In practice this is done using ensemble forecasts. It is then common to convert the ensemble forecasts to occurrence probabilities for different categories. These categories are typically taken as terciles from climatological distributions. For instance, for temperature there is a cold, a normal and a warm class. Commonly, multiple ensemble forecasting systems are available, but some models may be more competitive than others for the variable, region and user need under consideration. Therefore, when calculating the category probabilities, the ensemble members of the different forecasting systems may be weighted differently. Such weighting is typically done by comparison of the ensemble forecasts with observations.
Description of the tool: The tool considers all forecasts (all members from all forecasting systems) and converts them into occurrence probabilities of different categories. The number of categories can be changed; the categories are taken as the climatological quantiles (e.g. terciles) extracted from the observational data. The methods available to combine the ensemble forecasting models into probabilistic categorical forecasts are: 1) ensemble pooling, where all ensemble members of all ensemble systems are weighted equally, 2) model combination, where each model system is weighted equally, and 3) model weighting. The model weighting method is described in Rajagopalan et al. (2002), Robertson et al. (2004) and Van Schaeybroeck and Vannitsem (2019). More specifically, this method uses different weights for the occurrence probability predicted by the available models and by a climatological model, and optimizes the weights by minimizing the ignorance score. Finally, the function can also be used to categorize the observations in the categorical quantiles. A minimal sketch of the pooling option follows.
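A minimal sketch of the simplest combination, cat.method = "pool" with terciles (amt.cat = 3): all members are pooled and the category probabilities are the fractions of members falling in each climatological tercile. All names and data are illustrative.

obs_clim <- rnorm(30)                             # toy climatological sample
terc <- quantile(obs_clim, c(1/3, 2/3))           # tercile boundaries from observations
members <- rnorm(10)                              # pooled members from all systems
probs <- c(below  = mean(members <= terc[1]),
           normal = mean(members > terc[1] & members <= terc[2]),
           above  = mean(members > terc[2]))      # occurrence probabilities (sum to 1)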
CST_CategoricalEnsCombination( exp, obs, cat.method = "pool", eval.method = "leave-one-out", amt.cat = 3, ... )
exp |
An object of class |
obs |
An object of class |
cat.method |
Method used to produce the categorical forecast, can be
either |
eval.method |
The sampling method used; it can be either |
amt.cat |
The number of categories. Equally-sized quantiles will be calculated based on the number of categories. |
... |
Other parameters to be passed on to the calibration procedure. |
An object of class s2dv_cube
containing the categorical
forecasts in the element called $data
. The first two dimensions of the
returned object are named dataset and member and are both of size one. An
additional dimension named category is introduced and is of size amt.cat.
Bert Van Schaeybroeck, [email protected]
Rajagopalan, B., Lall, U., & Zebiak, S. E. (2002). Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Monthly Weather Review, 130(7), 1792-1811.
Robertson, A. W., Lall, U., Zebiak, S. E., & Goddard, L. (2004). Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Monthly Weather Review, 132(12), 2732-2744.
Van Schaeybroeck, B., & Vannitsem, S. (2019). Postprocessing of Long-Range Forecasts. In Statistical Postprocessing of Ensemble Forecasts (pp. 267-290).
mod1 <- 1 : (2 * 2 * 4 * 5 * 2 * 2)
dim(mod1) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2)
mod1[2, 1, , , , ] <- NA
datasets <- c("MF", "UKMO")
obs1 <- 1 : (1 * 1 * 4 * 5 * 2 * 2)
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
attrs <- list(Datasets = datasets)
exp <- list(data = mod1, coords = coords, attrs = attrs)
obs <- list(data = obs1, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_CategoricalEnsCombination(exp = exp, obs = obs, amt.cat = 3,
                                   cat.method = "mmw")
Change the names of the dimensions specified in 'original_names' to the names in 'new_names'. The coordinate names and the dimensions of any attributes are also modified accordingly.
CST_ChangeDimNames(data, original_names, new_names)
data |
An object of class |
original_names |
A single character string or a vector indicating the dimensions to be renamed. |
new_names |
A single character string or a vector indicating the new dimension names, in the same order as the dimensions in 'original_names'. |
An object of class s2dv_cube
with similar data, coordinates and
attributes as the data
input, but with modified dimension names.
Agudetse Roures Victoria, [email protected]
# Example with sample data:
# Check original dimensions and coordinates
lonlat_temp$exp$dims
names(lonlat_temp$exp$coords)
dim(lonlat_temp$exp$attrs$Dates)
# Change 'dataset' to 'dat' and 'ftime' to 'time'
exp <- CST_ChangeDimNames(lonlat_temp$exp,
                          original_names = c("dataset", "ftime"),
                          new_names = c("dat", "time"))
# Check new dimensions and coordinates
exp$dims
names(exp$coords)
dim(exp$attrs$Dates)
This function performs a bias correction conditioned by the dynamical properties of the dataset. It internally uses the function 'Predictability' to divide into terciles the two dynamical proxies computed with 'CST_ProxiesAttractor'. A bias correction between the model and the observations is performed using the division into terciles of the local dimension 'dim' and of the inverse of the persistence 'theta'. For instance, model values with lower 'dim' will be corrected with observed values with lower 'dim', and likewise for 'theta'. The function gives two options of bias correction: one for 'dim' and/or one for 'theta'. A sketch of the building block is given below.
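As a hedged illustration of the building block, the sketch below applies an empirical quantile mapping to the subset of model values falling in one tercile of the proxy, using the observed values in the same tercile as reference. The data and the helper name qmap_emp are illustrative; the actual function relies on the methods listed in 'method'.

qmap_emp <- function(x, ref, probs = seq(0.05, 0.95, 0.05)) {
  # map the empirical quantiles of x onto the quantiles of ref
  approx(quantile(x, probs), quantile(ref, probs), xout = x, rule = 2)$y
}
exp_low_dim <- rnorm(40, mean = 1)   # model values in the lowest 'dim' tercile
obs_low_dim <- rnorm(40)             # observed values in the lowest 'dim' tercile
corrected <- qmap_emp(exp_low_dim, obs_low_dim)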
CST_DynBiasCorrection( exp, obs, method = "QUANT", wetday = FALSE, proxy = "dim", quanti, ncores = NULL )
exp |
An s2dv_cube object with the experiment data. |
obs |
An s2dv_cube object with the reference data. |
method |
A character string indicating the method to apply bias correction among the following: "PTF", "RQUANT", "QUANT", "SSPLIN". |
wetday |
Logical indicating whether to perform wet day correction or not OR a numeric threshold below which all values are set to zero (by default is set to 'FALSE'). |
proxy |
A character string indicating the proxy for local dimension 'dim' or inverse of persistence 'theta' to apply the dynamical conditioned bias correction method. |
quanti |
A number lower than 1 indicating the quantile to perform the computation of local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
An 's2dv_cube' object with a bias correction performed conditioned by the local dimension 'dim' or the inverse of the persistence 'theta'.
Carmen Alvarez-Castro, [email protected]
Maria M. Chaves-Montero, [email protected]
Veronica Torralba, [email protected]
Davide Faranda, [email protected]
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8
Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
expL <- rnorm(1:2000)
dim(expL) <- c(time = 100, lat = 4, lon = 5)
obsL <- c(rnorm(1:1980), expL[1, , ] * 1.2)
dim(obsL) <- c(time = 100, lat = 4, lon = 5)
time_obsL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1920:2019,
                              sep = "-"), format = "%d-%m-%Y")
time_expL <- as.POSIXct(paste(rep("01", 100), rep("01", 100), 1920:2019,
                              sep = "-"), format = "%d-%m-%Y")
lon <- seq(-1, 5, 1.5)
lat <- seq(30, 35, 1.5)
# qm = 0.98 is too high for this short dataset; it may not meet the
# requirement, in which case a lower qm (e.g. qm = 0.60) should be selected
expL <- s2dv_cube(data = expL, coords = list(lon = lon, lat = lat),
                  Dates = time_expL)
obsL <- s2dv_cube(data = obsL, coords = list(lon = lon, lat = lat),
                  Dates = time_obsL)
# to use DynBiasCorrection
dynbias1 <- DynBiasCorrection(exp = expL$data, obs = obsL$data,
                              proxy = "dim", quanti = 0.6)
# to use CST_DynBiasCorrection
dynbias2 <- CST_DynBiasCorrection(exp = expL, obs = obsL,
                                  proxy = "dim", quanti = 0.6)
This function performs a clustering on members/starting dates and returns a number of scenarios, with representative members for each of them. The clustering is performed in a reduced EOF space.
Motivation: Ensemble forecasts give a probabilistic insight of average weather conditions on extended timescales, i.e. from sub-seasonal to seasonal and beyond. With large ensembles, it is often an advantage to be able to group members according to similar characteristics and to select the most representative member for each cluster. This can be useful to characterize the most probable forecast scenarios in a multi-model (or single model) ensemble prediction. This approach, applied at a regional level, can also be used to identify the subset of ensemble members that best represent the full range of possible solutions for downscaling applications. The choice of the ensemble members is made flexible in order to meet the requirements of specific (regional) climate information products, to be tailored for different regions and user needs.
Description of the tool: EnsClustering is a cluster analysis tool for ensemble predictions, based on the k-means algorithm. The aim is to group ensemble members according to similar characteristics and to select the most representative member for each cluster. The user chooses which feature of the data is used to group the ensemble members by clustering: the time mean, the standard deviation, or a certain percentile (e.g., the 75th; see 'time_moment' and 'time_percentile') over the time period. For each ensemble member this value is computed at each grid point, obtaining N lat-lon maps, where N is the number of ensemble members. The anomaly is computed by subtracting the ensemble mean of these maps from each of the single maps. The anomaly is therefore computed with respect to the ensemble members (and not with respect to time), and the Empirical Orthogonal Function (EOF) analysis is applied to these anomaly maps. Regarding the EOF analysis, the user can choose either how many Principal Components (PCs) to retain or the percentage of explained variance to keep. After reducing dimensionality via EOF analysis, k-means analysis is applied using the desired subset of PCs.
The major final outputs are the classification in clusters, i.e. which member belongs to which cluster (in k-means analysis the number k of clusters needs to be defined prior to the analysis), and the most representative member for each cluster, which is the member closest to the cluster centroid. Other outputs refer to the statistics of the clustering in the PC space: the minimum and maximum distance between a member in a cluster and the cluster centroid (i.e. the closest and the furthest member), and the intra-cluster standard deviation for each cluster (i.e. how compact the cluster is). A compact sketch of the pipeline is given below.
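A compact base-R sketch of the pipeline under simplifying assumptions (members as rows of a matrix, fixed number of PCs and clusters; toy data, illustrative names):

fields <- matrix(rnorm(20 * 50), nrow = 20)   # 20 members x 50 grid points
anom <- sweep(fields, 2, colMeans(fields))    # anomaly w.r.t. the ensemble mean
pcs <- prcomp(anom)$x[, 1:4]                  # EOF reduction: keep the first 4 PCs
km <- kmeans(pcs, centers = 3)                # k-means in the reduced PC space
repr <- sapply(1:3, function(k) {             # representative: member closest to centroid
  idx <- which(km$cluster == k)
  cen <- km$centers[k, ]
  idx[which.min(colSums((t(pcs[idx, , drop = FALSE]) - cen)^2))]
})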
CST_EnsClustering( exp, time_moment = "mean", numclus = NULL, lon_lim = NULL, lat_lim = NULL, variance_explained = 80, numpcs = NULL, time_dim = NULL, time_percentile = 90, cluster_dim = "member", verbose = F )
exp |
An object of the class 's2dv_cube', containing the variables to be analysed. The element 'data' in the 's2dv_cube' object must have, at least, spatial and temporal dimensions. Latitudinal dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal dimension accepted names: 'lon', 'lons','longitude', 'x', 'i', 'nav_lon'. |
time_moment |
Decides the moment to be applied to the time dimension. Can be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc' (a selected percentile on time). If 'perc' the keyword 'time_percentile' is also used. |
numclus |
Number of clusters (scenarios) to be calculated. If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8. |
lon_lim |
List with the two longitude margins in 'c(-180,180)' format. |
lat_lim |
List with the two latitude margins. |
variance_explained |
Variance (percentage) to be explained by the set of EOFs. Defaults to 80. Not used if numpcs is specified. |
numpcs |
Number of EOFs retained in the analysis (optional). |
time_dim |
String or character array with name(s) of dimension(s) over which to compute statistics. If omitted c("ftime", "sdate", "time") are searched in this order. |
time_percentile |
Set the percentile in time you want to analyse (used when 'time_moment' = "perc"). |
cluster_dim |
Dimension along which to cluster. Typically "member" or "sdate". This can also be a list like c("member", "sdate"). |
verbose |
Logical for verbose output |
A list with elements $cluster (cluster assigned to each member), $freq (relative frequency of each cluster), $closest_member (representative member for each cluster), $repr_field (list of fields for each representative member), $composites (list of mean fields for each cluster), $lon (selected longitudes of the output fields) and $lat (selected latitudes of the output fields).
Federico Fabiano - ISAC-CNR, [email protected]
Ignazio Giuntoli - ISAC-CNR, [email protected]
Danila Volpi - ISAC-CNR, [email protected]
Paolo Davini - ISAC-CNR, [email protected]
Jost von Hardenberg - ISAC-CNR, [email protected]
dat_exp <- array(abs(rnorm(1152)) * 275, dim = c(dataset = 1, member = 4,
                 sdate = 6, ftime = 3, lat = 4, lon = 4))
lon <- seq(0, 3)
lat <- seq(48, 45)
coords <- list(lon = lon, lat = lat)
exp <- list(data = dat_exp, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
res <- CST_EnsClustering(exp = exp, numclus = 3, cluster_dim = c("sdate"))
Insert an extra dimension into an array at position 'posdim' with length
'lendim'. The array in data
repeats along the new dimension.
The dimensions, coordinates and attributes are modified accordingly.
CST_InsertDim(data, posdim, lendim, name, values = NULL)
data |
An object of class |
posdim |
An integer indicating the position of the new dimension. |
lendim |
An integer indicating the length of the new dimension. |
name |
A character string indicating the name for the new dimension. |
values |
A vector containing the values of the new dimension and any relevant attributes. If NULL, a sequence of integers from 1 to lendim will be added. |
An object of class s2dv_cube
with similar data, coordinates and
attributes as the data
input, but with an additional dimension.
Agudetse Roures Victoria, [email protected]
# Example with sample data:
# Check original dimensions and coordinates
lonlat_temp$exp$dims
names(lonlat_temp$exp$coords)
# Add 'variable' dimension
exp <- CST_InsertDim(lonlat_temp$exp, posdim = 2, lendim = 1,
                     name = "variable", values = c("tas"))
# Check new dimensions and coordinates
exp$dims
exp$coords$variable
This function aggregates, subsets and retrieves sub-seasonal, seasonal, decadal or climate projection data from NetCDF files in a local file system or on remote OPeNDAP servers, and arranges it for easy application of the CSTools functions.
CST_Load(...)
... |
Parameters that are automatically forwarded to the 's2dv::Load' function. See details in '?s2dv::Load'. |
It receives any number of parameters ('...') that are automatically forwarded to the 's2dv::Load' function. See details in '?s2dv::Load'.
It is recommended to use this function in combination with the zeallot '%<-%' operator, to assign the two returned 's2dv_cube' objects to two separate variables, as in the example below.
A list with one or two S3 objects, named 'exp' and 'obs', of the class 's2dv_cube', containing experimental and date-corresponding observational data, respectively. These 's2dv_cube's can be ingested by other functions in CSTools. If the parameter 'exp' in the call to 'CST_Load' is set to 'NULL', then only the 'obs' component is returned, and vice versa.
Nicolau Manubens, [email protected]
## Not run: 
library(zeallot)
startDates <- c('20001101', '20011101', '20021101',
                '20031101', '20041101', '20051101')
c(exp, obs) %<-% CST_Load(var = 'tas', exp = 'system5c3s', obs = 'era5',
                          nmember = 15, sdates = startDates,
                          leadtimemax = 3, latmin = 27, latmax = 48,
                          lonmin = -12, lonmax = 40, output = 'lonlat',
                          nprocs = 1)
## End(Not run)
This function merges two dimensions of the array data
in a
's2dv_cube' object into one. The user can select the dimensions to merge and
provide the final name of the dimension. The user can select to remove NA
values or keep them.
CST_MergeDims( data, merge_dims = c("ftime", "monthly"), rename_dim = NULL, na.rm = FALSE )
data |
An 's2dv_cube' object |
merge_dims |
A character vector indicating the names of the dimensions to merge. |
rename_dim |
A character string indicating the name of the output
dimension. If left at NULL, the first dimension name provided in parameter
|
na.rm |
A logical indicating if the NA values should be removed or not. |
Nuria Perez-Zanon, [email protected]
data <- 1 : c(2 * 3 * 4 * 5 * 6 * 7)
dim(data) <- c(time = 7, lat = 2, lon = 3, monthly = 4, member = 6,
               dataset = 5, var = 1)
data[2, , , , , , ] <- NA
data[c(3, 27)] <- NA
data <- list(data = data)
class(data) <- 's2dv_cube'
new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly'))
new_data <- CST_MergeDims(data, merge_dims = c('lon', 'lat'),
                          rename_dim = 'grid')
new_data <- CST_MergeDims(data, merge_dims = c('time', 'monthly'),
                          na.rm = TRUE)
This function performs EOF analysis over multiple variables, accepting as input a list of CSTools objects. It is based on Singular Value Decomposition. For each field the EOFs are computed and the corresponding PCs are standardized (unit variance, zero mean); the minimum number of principal components needed to reach the user-defined variance is retained. The function weights the input data by the square root of the cosine of latitude. A minimal sketch of one field's decomposition follows.
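A minimal sketch of one field's decomposition as described (latitude-cosine square-root weighting, SVD, standardized PCs). The data layout (time in rows, flattened lat-lon in columns, latitude varying fastest) and the 0.6 variance target are assumptions of this toy example, not the package's internal layout.

lat <- seq(0, 25, 5); nlon <- 8; nt <- 20
field <- matrix(rnorm(nt * length(lat) * nlon), nrow = nt)   # time x (lat*lon)
w <- rep(sqrt(cos(lat * pi / 180)), times = nlon)            # sqrt(cos(lat)) weights
anom <- scale(field, scale = FALSE)                          # remove the time mean
sv <- svd(sweep(anom, 2, w, `*`))                            # SVD of weighted anomalies
pcs <- scale(sv$u)                                           # PCs: zero mean, unit variance
explained <- cumsum(sv$d^2) / sum(sv$d^2)                    # cumulative variance fraction
neof <- which(explained >= 0.6)[1]                           # minimum EOFs to reach minvar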
CST_MultiEOF( datalist, lon_dim = "lon", lat_dim = "lat", time_dim = "ftime", sdate_dim = "sdate", var_dim = "var", neof_max = 40, neof_composed = 5, minvar = 0.6, lon_lim = NULL, lat_lim = NULL, ncores = NULL )
datalist |
A list of objects of the class 's2dv_cube', containing the
variables to be analysed. Each data object in the list is expected to have
an element named |
lon_dim |
A character string indicating the name of the longitudinal dimension. By default, it is set to 'lon'. |
lat_dim |
A character string indicating the name of the latitudinal dimension. By default, it is set to 'lat'. |
time_dim |
A character string indicating the name of the temporal dimension. By default, it is set to 'ftime'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
var_dim |
A character string indicating the name of the variable dimension. By default, it is set to 'var'. |
neof_max |
Maximum number of single EOFs considered in the first decomposition. |
neof_composed |
Number of composed EOFs to return in output. |
minvar |
Minimum variance fraction to be explained in first decomposition. |
lon_lim |
Vector with longitudinal range limits for the EOF calculation for all input variables. |
lat_lim |
Vector with latitudinal range limits for the EOF calculation for all input variables. |
ncores |
An integer indicating the number of cores to use for parallel computation. The default value is NULL. |
A list containing:
coeff |
An 's2dv_cube' with the data element being an array of principal components with dimensions 'time_dim', 'sdate_dim', 'eof', and the remaining dimensions of 'data' except 'lon_dim' and 'lat_dim'. |
variance |
An 's2dv_cube' with the data element being an array of explained variances with dimensions 'eof' and the rest of the dimensions of 'data' except 'time_dim', 'sdate_dim', 'lon_dim' and 'lat_dim'. |
eof_pattern |
An 's2dv_cube' with the data element being an array of EOF patterns obtained by regression with dimensions: 'eof' and the rest of the dimensions of 'data' except 'time_dim' and 'sdate_dim'. |
mask |
An 's2dv_cube' with the data element being an array of the mask with dimensions ('lon_dim', 'lat_dim', rest of the dimensions of 'data' except 'time_dim'). It is made from 'data', 1 for the positions that 'data' has value and NA for the positions that 'data' has NA. It is used to replace NAs with 0s for EOF calculation and mask the result with NAs again after the calculation. |
coordinates |
Longitudinal and latitudinal coordinates vectors. |
Jost von Hardenberg - ISAC-CNR, [email protected]
Paolo Davini - ISAC-CNR, [email protected]
seq <- 1 : (2 * 3 * 4 * 5 * 6 * 8)
mod1 <- sin(0.7 + seq)^2 + cos(seq^2 * 1.22)
dim(mod1) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8)
mod2 <- sin(seq * 2)^3 + cos(seq^2)
dim(mod2) <- c(dataset = 2, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 8)
lon <- seq(0, 35, 5)
lat <- seq(0, 25, 5)
exp1 <- list(data = mod1, coords = list(lat = lat, lon = lon))
exp2 <- list(data = mod2, coords = list(lat = lat, lon = lon))
attr(exp1, 'class') <- 's2dv_cube'
attr(exp2, 'class') <- 's2dv_cube'
d <- as.POSIXct(c("2017/01/01", "2017/01/02", "2017/01/03", "2017/01/04",
                  "2017/01/05", "2018/01/01", "2018/01/02", "2018/01/03",
                  "2018/01/04", "2018/01/05", "2019/01/01", "2019/01/02",
                  "2019/01/03", "2019/01/04", "2019/01/05", "2020/01/01",
                  "2020/01/02", "2020/01/03", "2020/01/04", "2020/01/05"))
exp1$attrs$Dates <- d
exp2$attrs$Dates <- d
cal <- CST_MultiEOF(datalist = list(exp1, exp2), neof_composed = 2)
This function calculates the correlation (Anomaly Correlation Coefficient; ACC), the root mean square error (RMS) and the root mean square error skill score (RMSSS) of individual anomaly models and of the multi-model mean (if desired) with respect to the observations. A hedged sketch of the RMSSS is given below.
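For reference, a minimal sketch of an RMSSS as a skill score of the RMS error against a climatological (zero-anomaly) reference; the exact conventions used by s2dv may differ, and the function name is illustrative.

rmsss <- function(exp_anom, obs_anom) {
  rms_exp <- sqrt(mean((exp_anom - obs_anom)^2))  # error of the forecast anomalies
  rms_ref <- sqrt(mean(obs_anom^2))               # error of a zero-anomaly reference
  1 - rms_exp / rms_ref                           # 1 = perfect, 0 = no skill over climatology
}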
CST_MultiMetric( exp, obs, metric = "correlation", multimodel = TRUE, time_dim = "ftime", memb_dim = "member", sdate_dim = "sdate" )
exp |
An object of class 's2dv_cube'. |
obs |
An object of class 's2dv_cube'. |
metric |
A character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms', 'rmsss' or 'rpss'. If 'rpss' is chosen, the tercile probabilities are evaluated. |
multimodel |
A logical value indicating whether a Multi-Model Mean should be computed. |
time_dim |
Name of the temporal dimension where a mean will be applied. It can be NULL; the default value is 'ftime'. |
memb_dim |
Name of the member dimension. It can be NULL; the default value is 'member'. |
sdate_dim |
Name of the start date dimension or a dimension name identifying the different forecasts. It can be NULL; the default value is 'sdate'. |
An object of class s2dv_cube containing the statistics of the selected metric in the element $data, which is a list of arrays: one for the requested metric and others for statistics about its significance. The arrays have two dataset dimensions equal to the 'dataset' dimension in the exp$data and obs$data inputs. If multimodel is TRUE, the first position in the first 'nexp' dimension corresponds to the Multi-Model Mean.
Mishra Niti, [email protected]
Perez-Zanon Nuria, [email protected]
Mishra, N., Prodhomme, C., & Guemas, V. (2019). Multi-model skill assessment of seasonal temperature and precipitation forecasts over Europe. Climate Dynamics, 52, 4207-4225. doi:10.1007/s00382-018-4404-z
mod <- rnorm(2 * 2 * 4 * 5 * 2 * 2)
dim(mod) <- c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2)
obs <- rnorm(1 * 1 * 4 * 5 * 2 * 2)
dim(obs) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp <- list(data = mod, coords = coords)
obs <- list(data = obs, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
attr(obs, 'class') <- 's2dv_cube'
a <- CST_MultiMetric(exp = exp, obs = obs)
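As a complementary illustration (not part of the original examples), the same objects can be used with a different metric; this hedged sketch reuses the 'exp' and 'obs' cubes created above:

# Hedged sketch: with metric = 'rpss' the tercile probabilities are evaluated
b <- CST_MultiMetric(exp = exp, obs = obs, metric = 'rpss', multimodel = FALSE)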
This function calculates the RMSE from multiple variables, as the mean of each variable's RMSE scaled by its observed standard deviation. Variables can be weighted based on their relative importance (defined by the user).
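A sketch of the computation implied by the description above (this weighted form is an assumption based on the wording, not a formula quoted from the package), where \(\mathrm{RMSE}_v\) is the RMSE of variable \(v\), \(\sigma_v\) its observed standard deviation and \(w_v\) its weight:

\[ \mathrm{MVRMSE} = \frac{\sum_{v=1}^{V} w_v \, \mathrm{RMSE}_v / \sigma_v}{\sum_{v=1}^{V} w_v} \]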
CST_MultivarRMSE( exp, obs, weight = NULL, memb_dim = "member", dat_dim = "dataset", sdate_dim = "sdate", ftime_dim = "ftime" )
exp |
A list of objects, one for each variable, of class 's2dv_cube'. |
obs |
A list of objects, one for each variable (in the same order as the input in 'exp'), of class 's2dv_cube'. |
weight |
(optional) A vector of weight values to assign to each variable. If no weights are defined, a value of 1 is assigned to every variable. |
memb_dim |
A character string indicating the name of the member dimension. It must be one dimension in 'exp' and 'obs'. The default value is 'member'. |
dat_dim |
A character string indicating the name of the dataset dimension. It must be one dimension in 'exp' and 'obs'. If there is no dataset dimension, it can be NULL. The default value is 'dataset'. |
sdate_dim |
A character string indicating the name of the start date dimension. It must be one dimension in 'exp' and 'obs'. The default value is 'sdate'. |
ftime_dim |
A character string indicating the name of the forecast time dimension. It must be one dimension in 'exp' and 'obs'. The default value is 'ftime'. |
An object of class s2dv_cube containing the RMSE in the element $data, which is an array with two dataset dimensions equal to the 'dataset' dimension in the exp$data and obs$data inputs: an array with dimensions c(number of exp, number of obs, 1 (the multivariate RMSE value), number of lat, number of lon).
Deborah Verfaillie, [email protected]
# Example with 2 variables
mod1 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7))
mod2 <- abs(rnorm(1 * 3 * 4 * 5 * 6 * 7))
dim(mod1) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
dim(mod2) <- c(dataset = 1, member = 3, sdate = 4, ftime = 5, lat = 6, lon = 7)
obs1 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7))
obs2 <- abs(rnorm(1 * 1 * 4 * 5 * 6 * 7))
dim(obs1) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
dim(obs2) <- c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 6, lon = 7)
lon <- seq(0, 30, 5)
lat <- seq(0, 25, 5)
coords <- list(lat = lat, lon = lon)
exp1 <- list(data = mod1, coords = coords,
             attrs = list(Datasets = "EXP1", source_files = "file1",
                          Variable = list(varName = 'pre')))
exp2 <- list(data = mod2, coords = coords,
             attrs = list(Datasets = "EXP2", source_files = "file2",
                          Variable = list(varName = 'tas')))
obs1 <- list(data = obs1, coords = coords,
             attrs = list(Datasets = "OBS1", source_files = "file1",
                          Variable = list(varName = 'pre')))
obs2 <- list(data = obs2, coords = coords,
             attrs = list(Datasets = "OBS2", source_files = "file2",
                          Variable = list(varName = 'tas')))
attr(exp1, 'class') <- 's2dv_cube'
attr(exp2, 'class') <- 's2dv_cube'
attr(obs1, 'class') <- 's2dv_cube'
attr(obs2, 'class') <- 's2dv_cube'
anom1 <- CST_Anomaly(exp1, obs1, cross = TRUE, memb = TRUE)
anom2 <- CST_Anomaly(exp2, obs2, cross = TRUE, memb = TRUE)
ano_exp <- list(anom1$exp, anom2$exp)
ano_obs <- list(anom1$obs, anom2$obs)
a <- CST_MultivarRMSE(exp = ano_exp, obs = ano_obs, weight = c(1, 2))
This function computes two dynamical proxies of the attractor: the local dimension (d) and the inverse of the persistence (theta) for an 's2dv_cube' object. These two parameters will be used as a condition for the computation of dynamical scores to measure predictability and to compute bias correction conditioned by the dynamics with the function DynBiasCorrection. The function is based on the MATLAB code ([email protected]) used in the references listed below.
CST_ProxiesAttractor(data, quanti, ncores = NULL)
data |
An s2dv_cube object with the data to create the attractor. It must be a matrix with the time steps in nrow and the grid points in ncol (dat(time, grids)). |
quanti |
A number lower than 1 indicating the quantile to perform the computation of local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
A list of two elements: the local dimension 'dim' and the inverse of the persistence 'theta'.
Carmen Alvarez-Castro, [email protected]
Maria M. Chaves-Montero, [email protected]
Veronica Torralba, [email protected]
Davide Faranda, [email protected]
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8
Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
# Example 1: Computing the attractor using simple s2dv data
obs <- rnorm(2 * 3 * 4 * 8 * 8)
dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = obs, coords = coords)
class(data) <- "s2dv_cube"
attractor <- CST_ProxiesAttractor(data = data, quanti = 0.6)
This function is a wrapper of fitQmap and doQmap from package 'qmap' to be applied on the object of class 's2dv_cube'. The quantile mapping adjustment between an experiment, typically a hindcast, and observation is applied to the experiment itself or to a provided forecast.
CST_QuantileMapping( exp, obs, exp_cor = NULL, sdate_dim = "sdate", memb_dim = "member", window_dim = NULL, method = "QUANT", na.rm = FALSE, ncores = NULL, ... )
exp |
An object of class 's2dv_cube'. |
obs |
An object of class 's2dv_cube'. |
exp_cor |
An object of class 's2dv_cube' to which the quantile mapping correction will be applied. If it is not provided, the correction is applied to 'exp' using cross-validation. |
sdate_dim |
A character string indicating the dimension name in which cross-validation would be applied when exp_cor is not provided. 'sdate' by default. |
memb_dim |
A character string indicating the dimension name where ensemble members are stored in the experimental arrays. It can be NULL if there is no ensemble member dimension. It is set as 'member' by default. |
window_dim |
A character string indicating the dimension name where samples have been stored. It can be NULL (default) in case all samples are used. |
method |
A character string indicating the method to be used: 'PTF', 'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile mapping 'QUANT' is used. |
na.rm |
A logical value indicating if missing values should be removed (FALSE by default). |
ncores |
An integer indicating the number of cores for parallel computation using the multiApply function. The default value is NULL (1). |
... |
Additional parameters to be used by the chosen method. See the qmap package for details. |
An object of class s2dv_cube
containing the experimental data
after applying the quantile mapping correction.
Nuria Perez-Zanon, [email protected]
# Use synthetic data
exp <- NULL
exp$data <- 1 : c(1 * 3 * 5 * 4 * 3 * 2)
dim(exp$data) <- c(dataset = 1, member = 3, sdate = 5, ftime = 4, lat = 3, lon = 2)
class(exp) <- 's2dv_cube'
obs <- NULL
obs$data <- 101 : c(100 + 1 * 1 * 5 * 4 * 3 * 2)
dim(obs$data) <- c(dataset = 1, member = 1, sdate = 5, ftime = 4, lat = 3, lon = 2)
class(obs) <- 's2dv_cube'
res <- CST_QuantileMapping(exp, obs)
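A further hedged sketch (not from the original examples): when an independent forecast is provided through 'exp_cor', the correction is applied to it instead of cross-validating 'exp'. The 'exp_cor' object below is hypothetical:

exp_cor <- NULL
exp_cor$data <- 201 : c(200 + 1 * 3 * 1 * 4 * 3 * 2)
dim(exp_cor$data) <- c(dataset = 1, member = 3, sdate = 1, ftime = 4,
                       lat = 3, lon = 2)
class(exp_cor) <- 's2dv_cube'
res_fc <- CST_QuantileMapping(exp, obs, exp_cor = exp_cor)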
This function implements the RainFARM stochastic precipitation downscaling method and accepts a CSTools object (an object of the class 's2dv_cube' as provided by 'CST_Load') as input. It is adapted for climate downscaling and includes orographic correction as described in Terzago et al. (2018).
CST_RainFARM( data, weights = 1, slope = 0, nf, kmin = 1, nens = 1, fglob = FALSE, fsmooth = TRUE, nprocs = 1, time_dim = NULL, verbose = FALSE, drop_realization_dim = FALSE )
data |
An object of the class 's2dv_cube' as returned by 'CST_Load', containing the spatial precipitation fields to downscale. The data object is expected to have an element named '$data' with at least two spatial dimensions named 'lon' and 'lat'. |
weights |
Matrix with climatological weights which can be obtained using the 'CST_RFWeights' function (default: 'weights = 1', i.e. no weights). |
slope |
Prescribed spectral slope. The default is 'slope = 0', meaning that the slope is determined automatically from the input data. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
kmin |
First wavenumber for spectral slope (default: 'kmin = 1'). |
nens |
Number of ensemble members to produce (default: 'nens = 1'). |
fglob |
Logical to conserve global precipitation over the domain (default: FALSE). |
fsmooth |
Logical to conserve precipitation with a smoothing kernel (default: TRUE). |
nprocs |
The number of parallel processes to spawn for parallel computation on multiple cores (default: 1). |
time_dim |
String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed as an average over all elements belonging to those dimensions. If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen. |
verbose |
Logical for verbose output (default: FALSE). |
drop_realization_dim |
Logical to remove the "realization" stochastic ensemble dimension, needed for saving data through function CST_SaveData (default: FALSE) with the following behaviour if set to TRUE: 1) if nens == 1: the dimension is dropped; 2) if nens > 1 and a "member" dimension exists: the "realization" and "member" dimensions are compacted (multiplied) and the resulting dimension is named "member"; 3) if nens > 1 and a "member" dimension does not exist: the "realization" dimension is renamed to "member". |
When parameters 'slope' and 'weights' present a seasonal dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See the hedged sketch after Example 1 below, where weights and slope vary with the 'sdate' dimension.
CST_RainFARM() returns a downscaled CSTools object (i.e., of the class 's2dv_cube'). If nens > 1, an additional dimension named "realization" is added to the $data array after the "member" dimension (unless drop_realization_dim = TRUE is specified). The ordering of the remaining dimensions in the $data element of the input object is maintained.
Jost von Hardenberg - ISAC-CNR, [email protected]
Terzago, S. et al. (2018). NHESS, 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018; D'Onofrio et al. (2014), Journal of Hydrometeorology, 15, 830-843; Rebora et al. (2006), JHM, 7, 724.
# Example 1: using CST_RainFARM for a CSTools object
nf <- 8   # Choose a downscaling by factor 8
exp <- 1 : (2 * 3 * 4 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = exp, coords = coords)
class(data) <- 's2dv_cube'
# Create a test array of weights
ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
res <- CST_RainFARM(data, nf = nf, weights = ww, nens = 3, time_dim = 'ftime')
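The following hedged sketch illustrates the seasonal-dependency case referenced in the details above (the original Example 2 did not survive extraction), assuming 'weights' and 'slope' may carry an 'sdate' dimension matching the input data:

# Hedged sketch (Example 2): weights and slope varying with 'sdate'
ww_sd <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf, sdate = 3))
slo <- array(c(1.5, 1.6, 1.7), dim = c(sdate = 3))
res2 <- CST_RainFARM(data, nf = nf, weights = ww_sd, slope = slo,
                     nens = 2, time_dim = 'ftime')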
This function performs the matching between a field of anomalies and a set of maps which will be used as a reference. The anomalies will be assigned to the reference map for which the minimum Euclidean distance (method = 'distance') or the highest spatial correlation (method = 'ACC') is obtained.
CST_RegimesAssign( data, ref_maps, method = "distance", composite = FALSE, memb = FALSE, ncores = NULL )
data |
An 's2dv_cube' object. |
ref_maps |
An 's2dv_cube' object as the output of CST_WeatherRegimes. |
method |
Whether the matching will be performed in terms of minimum distance (default = 'distance') or the maximum spatial correlation (method = 'ACC') between the maps. |
composite |
A logical parameter indicating if the composite maps are computed or not (default = FALSE). |
memb |
A logical value indicating whether to compute composites for separate members (default FALSE) or as a unique ensemble (TRUE). This option is only available when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'. |
ncores |
The number of multicore threads to use for parallel computation. |
A list with two elements: $data (a 's2dv_cube' object containing the composites cluster = 1,..,K for case (*1) or only k = 1 for any specific cluster, i.e., case (*2); only when composite = TRUE) and $statistics, which includes $pvalue (an array with the same structure as $data containing the p-value of the composites obtained through a t-test that accounts for the serial dependence of the data; only when composite = TRUE), $cluster (an array with the same dimensions as data, except latitude and longitude which are removed, indicating the reference map to which each point is allocated) and $frequency (a vector of integers, from k = 1 to the number of reference maps, indicating the percentage of assignations corresponding to each map).
Verónica Torralba - BSC, [email protected]
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/
data <- array(abs(rnorm(1280, 282.7, 6.4)),
              dim = c(dataset = 2, member = 2, sdate = 3, ftime = 3,
                      lat = 4, lon = 4))
coords <- list(lon = seq(0, 3), lat = seq(47, 44))
exp <- list(data = data, coords = coords)
class(exp) <- 's2dv_cube'
regimes <- CST_WeatherRegimes(data = exp, EOFs = FALSE, ncenters = 4)
res1 <- CST_RegimesAssign(data = exp, ref_maps = regimes, composite = FALSE)
This function computes spatial spectral slopes from a CSTools object (of the class 's2dv_cube'), to be used for the RainFARM stochastic precipitation downscaling method.
CST_RFSlope(data, kmin = 1, time_dim = NULL, ncores = NULL)
data |
An object of the class 's2dv_cube', containing the spatial precipitation fields to downscale. The data object is expected to have an element named '$data' with at least two spatial dimensions named 'lon' and 'lat'. |
kmin |
First wavenumber for spectral slope (default 'kmin = 1'). |
time_dim |
String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed as an average over all elements belonging to those dimensions. If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen. |
ncores |
An integer that indicates the number of cores for parallel computations using the multiApply function. The default value is one. |
CST_RFSlope() returns spectral slopes using the RainFARM convention (the logarithmic slope of k*|A(k)|^2 where A(k) are the spectral amplitudes). The returned array has the same dimensions as the exp element of the input object, minus the dimensions specified by lon_dim, lat_dim and time_dim.
Jost von Hardenberg - ISAC-CNR, [email protected]
exp <- 1 : (2 * 3 * 4 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = exp, coords = coords)
class(data) <- 's2dv_cube'
slopes <- CST_RFSlope(data)
This function implements a simple lapse rate correction of a temperature field (an object of class 's2dv_cube' as provided by 'CST_Load') as input. The input lon grid must be increasing (but can be modulo 360). The input lat grid can be irregularly spaced (e.g. a Gaussian grid). The output grid can be irregularly spaced in lon and/or lat.
CST_RFTemp( data, oro, xlim = NULL, ylim = NULL, lapse = 6.5, lon_dim = "lon", lat_dim = "lat", time_dim = NULL, nolapse = FALSE, verbose = FALSE, compute_delta = FALSE, method = "bilinear", delta = NULL )
data |
An object of the class 's2dv_cube' as returned by 'CST_Load', containing the temperature fields to downscale. The data object is expected to have an element named '$data' with at least two spatial dimensions named 'lon' and 'lat'. |
oro |
An object of the class 's2dv_cube' as returned by 'CST_Load', containing fine scale orography (in meters). The destination downscaling area must be contained in the orography field. |
xlim |
Vector with longitude bounds for downscaling; the full input field is downscaled if 'xlim' and 'ylim' are not specified. |
ylim |
Vector with latitude bounds for downscaling |
lapse |
Float with environmental lapse rate |
lon_dim |
String with name of longitude dimension |
lat_dim |
String with name of latitude dimension |
time_dim |
A vector of character strings indicating the name of the temporal dimension. By default, it is set to NULL and "ftime", "sdate" and "time" are considered as temporal dimensions. |
nolapse |
Logical; if TRUE, 'oro' is interpreted as a fine-scale climatology and used directly for bias correction. |
verbose |
Logical indicating whether to print diagnostic output. |
compute_delta |
Logical; if TRUE, only a delta to be used for out-of-sample forecasts is returned, as an object of the class 's2dv_cube' containing a delta. Activates 'nolapse = TRUE'. |
method |
String indicating the method used for interpolation: "nearest" (nearest neighbours followed by smoothing with a circular uniform weights kernel) or "bilinear" (bilinear interpolation). The two methods provide similar results, but "nearest" is slightly better provided that the fine-scale grid is correctly centered as a subdivision of the large-scale grid. |
delta |
An object of the class 's2dv_cube', containing a delta to be applied to the downscaled input data. Activates 'nolapse = TRUE'. The grid of this object must coincide with that of the required output. |
CST_RFTemp() returns a downscaled CSTools object (i.e., of the class 's2dv_cube').
Jost von Hardenberg - ISAC-CNR, [email protected]
Method described in ERA4CS MEDSCOPE milestone M3.2: High-quality climate prediction data available to WP4 here: https://www.medscope-project.eu/the-project/deliverables-reports/ and in H2020 ECOPOTENTIAL Deliverable No. 8.1: High resolution (1-10 km) climate, land use and ocean change scenarios available here: https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS
# Generate simple synthetic data and downscale by factor 4
t <- rnorm(7 * 6 * 2 * 3 * 4) * 10 + 273.15 + 10
dim(t) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 6, lon = 7)
lon <- seq(3, 9, 1)
lat <- seq(42, 47, 1)
coords <- list(lat = lat, lon = lon)
exp <- list(data = t, coords = coords)
attr(exp, 'class') <- 's2dv_cube'
o <- runif(29 * 29) * 3000
dim(o) <- c(lats = 29, lons = 29)
lon <- seq(3, 10, 0.25)
lat <- seq(41, 48, 0.25)
coords <- list(lat = lat, lon = lon)
oro <- list(data = o, coords = coords)
attr(oro, 'class') <- 's2dv_cube'
res <- CST_RFTemp(data = exp, oro = oro, xlim = c(4, 8), ylim = c(43, 46),
                  lapse = 6.5, time_dim = 'ftime',
                  lon_dim = 'lon', lat_dim = 'lat')
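A hedged sketch of the two-step use of 'compute_delta' and 'delta' described in the arguments above, reusing the 'exp' and 'oro' objects from the example:

# First compute a delta climatology (this activates 'nolapse = TRUE')
delta <- CST_RFTemp(data = exp, oro = oro, xlim = c(4, 8), ylim = c(43, 46),
                    time_dim = 'ftime', nolapse = TRUE, compute_delta = TRUE)
# Then apply the stored delta to downscale (e.g. out-of-sample) input data
res2 <- CST_RFTemp(data = exp, oro = oro, xlim = c(4, 8), ylim = c(43, 46),
                   time_dim = 'ftime', delta = delta)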
Compute climatological ("orographic") weights from a fine-scale precipitation climatology file.
CST_RFWeights( climfile, nf, lon, lat, varname = NULL, fsmooth = TRUE, lonname = "lon", latname = "lat", ncores = NULL )
climfile |
Filename of a fine-scale precipitation climatology. The file is expected to be in NetCDF format and should contain at least one precipitation field. If several fields at different times are provided, a climatology is derived by time averaging. Suitable climatology files could be for example a fine-scale precipitation climatology from a high-resolution regional climate model (see e.g. Terzago et al. 2018), a local high-resolution gridded climatology from observations, or a reconstruction such as those which can be downloaded from the WORLDCLIM (https://www.worldclim.org) or CHELSA (https://chelsa-climate.org/) websites. The latter data will need to be converted to NetCDF format before being used (see for example the GDAL tools, https://gdal.org/). It could also be an 's2dv_cube' object. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
lon |
Vector of longitudes. |
lat |
Vector of latitudes. The number of longitudes and latitudes is expected to be even and the same. If not, the function will perform a subsetting to ensure this condition. |
varname |
Name of the variable to be read from 'climfile'. |
fsmooth |
Logical to use smooth conservation (default) or large-scale box-average conservation. |
lonname |
A character string indicating the name of the longitudinal dimension set as 'lon' by default. |
latname |
A character string indicating the name of the latitudinal dimension set as 'lat' by default. |
ncores |
An integer that indicates the number of cores for parallel computations using the multiApply function. The default value is one. |
An object of class 's2dv_cube' containing in matrix 'data' the weights with dimensions (lon, lat).
Jost von Hardenberg - ISAC-CNR, [email protected]
Terzago, S., Palazzi, E., & von Hardenberg, J. (2018). Stochastic downscaling of precipitation in complex orography: A simple method to reproduce a realistic fine-scale climatology. Natural Hazards and Earth System Sciences, 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018.
# Create weights to be used with the CST_RainFARM() or RainFARM() functions
# using external random data in the form of an 's2dv_cube'.
obs <- rnorm(2 * 3 * 4 * 8 * 8)
dim(obs) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
coords <- list(lon = lon, lat = lat)
data <- list(data = obs, coords = coords)
class(data) <- "s2dv_cube"
res <- CST_RFWeights(climfile = data, nf = 3, lon, lat,
                     lonname = 'lon', latname = 'lat', fsmooth = TRUE)
This function allows dividing and saving an object of class 's2dv_cube' into NetCDF files, allowing the saved data to be reloaded using the CST_Start or CST_Load functions. It also allows saving any 's2dv_cube' object that follows the NetCDF attribute conventions.
CST_SaveExp( data, destination = "./", startdates = NULL, sdate_dim = "sdate", ftime_dim = "time", memb_dim = "member", dat_dim = "dataset", var_dim = "var", drop_dims = NULL, single_file = FALSE, extra_string = NULL, global_attrs = NULL, units_hours_since = FALSE )
data |
An object of class 's2dv_cube'. |
destination |
A character string containing the directory name in which to save the data. A NetCDF file for each start date is saved into the folder tree: 'destination/Dataset/variable/'. By default the function saves the data into the working directory. |
startdates |
A vector of dates that will be used for the filenames when saving the data in multiple files (single_file = FALSE). It must be a vector of the same length as the start date dimension of data. It must be a vector of class 'Dates', 'POSIXct' or character. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. It can be NULL if there is no start date dimension. |
ftime_dim |
A character string indicating the name of the forecast time dimension. If 'Dates' are used, it can't be NULL. If there is no forecast time dimension, 'Dates' will be set to NULL and will not be used. By default, it is set to 'time'. |
memb_dim |
A character string indicating the name of the member dimension. It can be NULL if there is no member dimension. By default, it is set to 'member'. |
dat_dim |
A character string indicating the name of dataset dimension. It can be NULL if there is no dataset dimension. By default, it is set to 'dataset'. |
var_dim |
A character string indicating the name of variable dimension. It can be NULL if there is no variable dimension. By default, it is set to 'var'. |
drop_dims |
(optional) A vector of character strings indicating the dimension names of length 1 that need to be dropped so that they do not appear in the NetCDF file. Only dimensions that are not used in the computation can be dropped. The dimensions used in the computation are the ones specified in: sdate_dim, ftime_dim, dat_dim, var_dim and memb_dim. It is NULL by default. |
single_file |
A logical value indicating whether the whole object is saved in a single file (TRUE) or in multiple files (FALSE). When it is FALSE, the array is separated for datasets, variable and start date. When there are no specified time dimensions, the data will be saved in a single file by default. The output file name when 'single_file' is TRUE is a character string containing: '<var>_<first_sdate>_<last_sdate>.nc'; when it is FALSE, it is '<var>_<sdate>.nc'. It is FALSE by default. |
extra_string |
(Optional) A character string to be included as part of the file name, for instance, to identify member or realization. When single_file is TRUE, the 'extra_string' will substitute all the default file name; when single_file is FALSE, the 'extra_string' will be added in the file name as: '<var>_<extra_string>_<sdate>.nc'. It is NULL by default. |
global_attrs |
(Optional) A list with elements containing the global attributes to be saved in the NetCDF. |
units_hours_since |
(Optional) A logical value only available for the case: 'Dates' have forecast time and start date dimension, 'single_file' is TRUE and 'time_bounds' are not used. When it is TRUE, it saves the forecast time with units of 'hours since'; if it is FALSE, the time units will be a number of time steps with its corresponding frequency (e.g. n days, n months or n hours). It is FALSE by default. |
Multiple or single NetCDF files containing the data array.
single_file is TRUE |
All data is saved in a single file located in the specified destination path with the following name (by default): '<variable_name>_<first_sdate>_<last_sdate>.nc'. Multiple variables are saved separately in the same file. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. If 'units_hours_since' is TRUE, the forecast time units will be 'hours since <each start date>'. If 'units_hours_since' is FALSE, the forecast time units are extracted from the frequency of the time steps (hours, days, months); if no frequency is found, the units will be 'hours since'. When the time units are 'hours since', the time steps are assumed to be equally spaced. |
single_file is FALSE |
The data array is subset and stored into multiple files. Each file contains the data subset for each start date, variable and dataset. Files with different variables and datasets are stored in separated directories within the following directory tree: 'destination/Dataset/variable/'. The name of each file will be by default: '<variable_name>_<sdate>.nc'. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. The forecast time units will be 'hours since <each start date>'. |
Perez-Zanon Nuria, [email protected]
Start, as.s2dv_cube and s2dv_cube
## Not run: 
data <- lonlat_temp_st$exp
CST_SaveExp(data = data, ftime_dim = 'ftime', var_dim = 'var',
            dat_dim = 'dataset', sdate_dim = 'sdate')
## End(Not run)
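A hedged variant of the example above (not from the original docs), saving the whole object in a single NetCDF file as described for 'single_file':

## Not run: 
CST_SaveExp(data = data, ftime_dim = 'ftime', var_dim = 'var',
            dat_dim = 'dataset', sdate_dim = 'sdate',
            single_file = TRUE, units_hours_since = TRUE)
## End(Not run)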
This function splits a dimension in two. The user can select the dimension to split and provide indices indicating how to split that dimension, or dates and the expected frequency (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division.
CST_SplitDim( data, split_dim = "time", indices = NULL, freq = "monthly", new_dim_name = NULL, insert_ftime = NULL, ftime_dim = "time", sdate_dim = "sdate", return_indices = FALSE )
data |
A 's2dv_cube' object. |
split_dim |
A character string indicating the name of the dimension to split. It is set as 'time' by default. |
indices |
A vector of numeric indices or dates. If left at NULL, the dates provided in the s2dv_cube object (element Dates) will be used. |
freq |
A character string indicating the frequency: by 'day', 'month' and 'year', or 'monthly' (by default). 'month' identifies months between 1 and 12 independently of the year they belong to, while 'monthly' differentiates months from different years. |
new_dim_name |
A character string indicating the name of the new dimension. |
insert_ftime |
An integer indicating the number of time steps to add at the beginning of the time series. |
ftime_dim |
A character string indicating the name of the forecast time dimension. It is set as 'time' by default. |
sdate_dim |
A character string indicating the name of the start date dimension. It is set as 'sdate' by default. |
return_indices |
A logical value that if it is TRUE, the indices used in splitting the dimension will be returned. It is FALSE by default. |
Parameter 'insert_ftime' has been included for the case of using daily data that requires splitting the temporal dimension by months (or similar) when the first lead time does not correspond to the 1st day of the month. In this case, 'insert_ftime' can be used to get a final output correctly organized. E.g.: lead time 1 is the 2nd of November and the input time series extends to the 31st of December. When requiring the split by month with insert_ftime = 1, the 'monthly' dimension of length two will indicate the month (position 1 for November and position 2 for December), and the 'time' dimension will be of length 31. For November, positions 1 and 31 will be NAs, while positions 2 to 30 will be filled with the data provided. This allows selecting days correctly through the time dimension.
Nuria Perez-Zanon, [email protected]
data <- 1 : 20
dim(data) <- c(time = 10, lat = 2)
data <- list(data = data)
class(data) <- 's2dv_cube'
indices <- c(rep(1, 5), rep(2, 5))
new_data <- CST_SplitDim(data, indices = indices)
time <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 4), "days"),
          seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 4), "days"),
          seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 2), "days"))
data <- list(data = data$data, Dates = time)
class(data) <- 's2dv_cube'
new_data <- CST_SplitDim(data, indices = time)
new_data <- CST_SplitDim(data, indices = time, freq = 'day')
new_data <- CST_SplitDim(data, indices = time, freq = 'month')
new_data <- CST_SplitDim(data, indices = time, freq = 'year')
This function aggregates, subsets and retrieves sub-seasonal, seasonal, decadal or climate projection data from NetCDF files in a local file system and arranges it for easy application of the CSTools functions. It calls the function Start from startR, which is an R package started at BSC with the aim to develop a tool that allows the user to automatically process large multidimensional distributed data sets. Then, the output is transformed into an 's2dv_cube' object.
CST_Start(...)
... |
Parameters that are automatically forwarded to the 'startR::Start' function. See details in '?startR::Start'. |
It receives any number of parameters ('...') that are automatically forwarded to the 'startR::Start' function. See details in '?startR::Start'. The auxiliary functions used to define dimensions need to be called within the startR namespace (e.g. startR::indices(), startR::values(), startR::Sort(), startR::CircularSort(), startR::CDORemapper(), ...).
## Not run: 
sdates <- c('20101101', '20111101', '20121101')
latmin <- 44
latmax <- 47
lonmin <- 6
lonmax <- 9
data <- CST_Start(dat = path, var = 'prlr',
                  ensemble = indices(1:6),
                  sdate = sdates, time = 121:151,
                  latitude = values(list(latmin, latmax)),
                  longitude = values(list(lonmin, lonmax)),
                  synonims = list(longitude = c('lon', 'longitude'),
                                  latitude = c('lat', 'latitude')),
                  return_vars = list(time = 'sdate',
                                     longitude = NULL, latitude = NULL),
                  retrieve = FALSE)
## End(Not run)
This function allows subsetting (i.e. slicing, taking a chunk of) the data inside an object of class s2dv_cube and modifies the dimensions, coordinates and attributes accordingly, removing any variables, time steps and spatial coordinates that are dropped when subsetting. It ensures that the information inside the s2dv_cube remains coherent with the data it contains. As in the function Subset from the ClimProjDiags package, the dimensions to subset along can be specified via the parameter 'along', either with integer indices or by their name. There are additional ways to adjust which dimensions are dropped in the resulting object: either to drop all, to drop none, to drop only the ones that have been sliced, or to drop only the ones that have not been sliced. The 'load_parameters' and 'when' attributes of the original cube are preserved. The 'source_files' attribute is subset along the 'var_dim' and 'dat_dim' dimensions.
CST_Subset(x, along, indices, drop = FALSE, var_dim = NULL, dat_dim = NULL)
x |
An object of class 's2dv_cube'. |
along |
A vector with references to the dimensions to take the subset from: either integers or dimension names. |
indices |
A list of indices to take from each dimension specified in 'along'. If a single dimension is specified in 'along', it can be directly provided as an integer or a vector. |
drop |
Whether to drop all the dimensions of length 1 in the resulting array, none, only those that are specified in 'along', or only those that are not specified in 'along'. The possible values are: 'all' or TRUE, 'none' or FALSE, 'selected', and 'non-selected'. The default value is FALSE. |
var_dim |
A character string indicating the name of the variable dimension. The default value is NULL. |
dat_dim |
A character string indicating the name of dataset dimension. The default value is NULL. |
An object of class s2dv_cube with similar data, coordinates and attributes as the 'x' input, but with trimmed or dropped dimensions.
Agudetse Roures Victoria, [email protected]
# Example with sample data:
# Check original dimensions and coordinates
lonlat_temp$exp$dims
names(lonlat_temp$exp$coords)
# Subset the s2dv_cube
exp_subset <- CST_Subset(lonlat_temp$exp,
                         along = c("lat", "lon"),
                         indices = list(1:10, 1:10),
                         drop = 'non-selected')
# Check new dimensions and coordinates
exp_subset$dims
names(exp_subset$coords)
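A hedged additional sketch (assuming the 'lonlat_temp_st' sample data with its 'dataset' and 'var' dimensions): passing 'dat_dim' and 'var_dim' lets the 'source_files' attribute be subset consistently:

exp_dat1 <- CST_Subset(lonlat_temp_st$exp, along = c("dataset", "var"),
                       indices = list(1, 1),
                       dat_dim = "dataset", var_dim = "var")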
This function computes the weather regimes from a cluster analysis. It is applied on the array data in a 's2dv_cube' object. The dimensionality of this object can also be reduced by using the PCs obtained from the application of the EOF analysis to filter the dataset. The cluster analysis can be performed with the traditional k-means or with the methods included in hclust (stats package).
CST_WeatherRegimes( data, ncenters = NULL, EOFs = TRUE, neofs = 30, varThreshold = NULL, method = "kmeans", iter.max = 100, nstart = 30, ncores = NULL )
data |
An 's2dv_cube' object. |
ncenters |
Number of clusters to be calculated with the clustering function. |
EOFs |
Whether to compute the EOFs (default = 'TRUE') or not (FALSE) to filter the data. |
neofs |
Number of modes to be kept (default = 30). |
varThreshold |
Value with the percentage of variance to be explained by the PCs. Only sufficient PCs to explain this much variance will be used in the clustering. |
method |
Different options to estimate the clusters. The most traditional approach is the k-means analysis (default = 'kmeans') but the function also supports the different methods included in hclust. These methods are: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details about these methods see the hclust function documentation included in the stats package. |
iter.max |
Parameter to select the maximum number of iterations allowed (Only if method='kmeans' is selected). |
nstart |
Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected). |
ncores |
The number of multicore threads to use for parallel computation. |
A list with two elements: $data (a 's2dv_cube' object containing the composites cluster = 1,..,K for case (*1) or only k = 1 for any specific cluster, i.e., case (*2)) and $statistics, which includes $pvalue (an array with the same structure as $data containing the p-value of the composites obtained through a t-test that accounts for the serial dependence), $cluster (a matrix or vector with integers, from 1 to k, indicating the cluster to which each time step is allocated), $persistence (percentage of days in a month/season before a cluster is replaced by a new one; only if method = 'kmeans' has been selected) and $frequency (percentage of days in a month/season belonging to each cluster; only if method = 'kmeans' has been selected).
Verónica Torralba - BSC, [email protected]
Cortesi, N., V., Torralba, N., González-Reviriego, A., Soret, and F.J., Doblas-Reyes (2019). Characterization of European wind speed variability using weather regimes. Climate Dynamics,53, 4961–4976, doi:10.1007/s00382-019-04839-5.
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/.
data <- array(abs(rnorm(1280, 283.7, 6)),
              dim = c(dataset = 2, member = 2, sdate = 3, ftime = 3,
                      lat = 4, lon = 4))
coords <- list(lon = seq(0, 3), lat = seq(47, 44))
obs <- list(data = data, coords = coords)
class(obs) <- 's2dv_cube'
res1 <- CST_WeatherRegimes(data = obs, EOFs = FALSE, ncenters = 4)
res2 <- CST_WeatherRegimes(data = obs, EOFs = TRUE, ncenters = 3)
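A hedged sketch (not from the original examples) using one of the hierarchical clustering methods listed in the 'method' argument instead of k-means:

res3 <- CST_WeatherRegimes(data = obs, EOFs = FALSE, ncenters = 4,
                           method = "ward.D")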
This function performs a bias correction conditioned by the dynamical properties of the dataset. It uses the function 'CST_Predictability' to divide into terciles the two dynamical proxies computed with 'CST_ProxiesAttractor'. A bias correction between the model and the observations is performed using the division into terciles of the local dimension 'dim' and the inverse of the persistence 'theta'. For instance, model values with lower 'dim' will be corrected with observed values with lower 'dim', and the same for 'theta'. The function gives two options of bias correction: one for 'dim' and/or one for 'theta'.
DynBiasCorrection( exp, obs, method = "QUANT", wetday = FALSE, proxy = "dim", quanti, ncores = NULL )
exp |
A multidimensional array with named dimensions with the experiment data. |
obs |
A multidimensional array with named dimensions with the observation data. |
method |
A character string indicating the method to apply bias correction among these ones: "PTF", "RQUANT", "QUANT", "SSPLIN". |
wetday |
Logical indicating whether to perform wet-day correction or not, OR a numeric threshold below which all values are set to zero (FALSE by default). |
proxy |
A character string indicating the proxy for local dimension 'dim' or inverse of persistence 'theta' to apply the dynamical conditioned bias correction method. |
quanti |
A number lower than 1 indicating the quantile to perform the computation of local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
A multidimensional array with named dimensions with a bias correction performed conditioned by local dimension 'dim' or inverse of persistence 'theta'.
Carmen Alvarez-Castro, [email protected]
Maria M. Chaves-Montero, [email protected]
Veronica Torralba, [email protected]
Davide Faranda, [email protected]
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8
Faranda, D., Gabriele Messori and Pascal Yiou. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
expL <- rnorm(2000)
dim(expL) <- c(time = 100, lat = 4, lon = 5)
obsL <- c(rnorm(1980), expL[1, , ] * 1.2)
dim(obsL) <- c(time = 100, lat = 4, lon = 5)
dynbias <- DynBiasCorrection(exp = expL, obs = obsL, method = 'QUANT',
                             proxy = "dim", quanti = 0.6)
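A hedged variant of the example above, conditioning the correction on the inverse of the persistence instead of the local dimension:

dynbias_theta <- DynBiasCorrection(exp = expL, obs = obsL, method = 'QUANT',
                                   proxy = "theta", quanti = 0.6)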
This function performs a clustering on members/starting dates and returns a number of scenarios, with representative members for each of them. The clustering is performed in a reduced EOF space.
EnsClustering( data, lat, lon, time_moment = "mean", numclus = NULL, lon_lim = NULL, lat_lim = NULL, variance_explained = 80, numpcs = NULL, time_percentile = 90, time_dim = NULL, cluster_dim = "member", verbose = T )
data |
A matrix of dimensions 'dataset member sdate ftime lat lon' containing the variables to be analysed. Latitudinal dimension accepted names: 'lat', 'lats', 'latitude', 'y', 'j', 'nav_lat'. Longitudinal dimension accepted names: 'lon', 'lons','longitude', 'x', 'i', 'nav_lon'. |
lat |
Vector of latitudes. |
lon |
Vector of longitudes. |
time_moment |
Decides the moment to be applied to the time dimension. Can be either 'mean' (time mean), 'sd' (standard deviation along time) or 'perc' (a selected percentile on time). If 'perc' the keyword 'time_percentile' is also used. |
numclus |
Number of clusters (scenarios) to be calculated. If set to NULL the number of ensemble members divided by 10 is used, with a minimum of 2 and a maximum of 8. |
lon_lim |
List with the two longitude margins in 'c(-180,180)' format. |
lat_lim |
List with the two latitude margins. |
variance_explained |
Variance (percentage) to be explained by the set of EOFs. Defaults to 80. Not used if numpcs is specified. |
numpcs |
Number of EOFs retained in the analysis (optional). |
time_percentile |
Sets the percentile in time to analyse (used for time_moment = "perc"). |
time_dim |
String or character array with name(s) of dimension(s) over which to compute statistics. If omitted c("ftime", "sdate", "time") are searched in this order. |
cluster_dim |
Dimension along which to cluster. Typically "member" or "sdate". This can also be a list like c("member", "sdate"). |
verbose |
Logical for verbose output |
A list with elements $cluster (cluster assigned for each member), $freq (relative frequency of each cluster), $closest_member (representative member for each cluster), $repr_field (list of fields for each representative member), $composites (list of mean fields for each cluster), $lon (selected longitudes of output fields) and $lat (selected latitudes of output fields).
Federico Fabiano - ISAC-CNR, [email protected]
Ignazio Giuntoli - ISAC-CNR, [email protected]
Danila Volpi - ISAC-CNR, [email protected]
Paolo Davini - ISAC-CNR, [email protected]
Jost von Hardenberg - ISAC-CNR, [email protected]
exp <- array(abs(rnorm(1152)) * 275,
             dim = c(dataset = 1, member = 4, sdate = 6, ftime = 3,
                     lat = 4, lon = 4))
lon <- seq(0, 3)
lat <- seq(48, 45)
res <- EnsClustering(exp, lat = lat, lon = lon, numclus = 2,
                     cluster_dim = c("member", "dataset", "sdate"))
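A hedged variant of the example above, clustering on a time percentile rather than the time mean:

res2 <- EnsClustering(exp, lat = lat, lon = lon, numclus = 2,
                      time_moment = "perc", time_percentile = 90,
                      cluster_dim = "member")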
This sample data set contains a small cutout of gridded seasonal precipitation forecast data from the Copernicus Climate Change ECMWF-System 5 forecast system, to be used to demonstrate downscaling. Specifically, for the 'pr' (precipitation) variable, for the first 6 forecast ensemble members, daily values, for all 31 days in March following the forecast starting dates in November of years 2010 to 2012, for a small 4x4 pixel cutout in a region in the North-Western Italian Alps (44N-47N, 6E-9E). The data resolution is 1 degree.
The 'CST_Load' call used to generate the data set in the infrastructure of the Marconi machine at CINECA is shown next, working on files which were extracted from forecast data available in the MEDSCOPE internal archive.
library(CSTools)
infile <- list(path = paste0('/esarchive/exp/ecmwf/system5c3s/daily_mean/',
                             '$VAR_NAME$_s0-24h/$VAR_NAME$_$START_DATE$.nc'))
lonlat_prec <- CST_Load('prlr', exp = list(infile), obs = NULL,
                        sdates = c('20101101', '20111101', '20121101'),
                        leadtimemin = 121, leadtimemax = 151,
                        latmin = 44, latmax = 47,
                        lonmin = 6, lonmax = 9,
                        nmember = 6,
                        storefreq = "daily", sampleperiod = 1,
                        output = "lonlat")
Jost von Hardenberg [email protected]
An-Chi Ho [email protected]
This sample data set contains a small cutout of gridded seasonal precipitation forecast data from the Copernicus Climate Change ECMWF-System 5 forecast system, to be used to demonstrate downscaling. Specifically, for the 'pr' (precipitation) variable, for the first 6 forecast ensemble members, daily values, for all 31 days in March following the forecast starting dates in November of years 2010 to 2012, for a small 4x4 pixel cutout in a region in the North-Western Italian Alps (44N-47N, 6E-9E). The data resolution is 1 degree.
The 'CST_Start' call used to generate the data set in the infrastructure of the Marconi machine at CINECA is shown next, working on files which were extracted from forecast data available in the MEDSCOPE internal archive.
path <- paste0('/esarchive/exp/ecmwf/system5c3s/daily_mean/',
               '$var$_s0-24h/$var$_$sdate$.nc')
sdates = c('20101101', '20111101', '20121101')
latmin <- 44
latmax <- 47
lonmin <- 6
lonmax <- 9
lonlat_prec_st <- CST_Start(dataset = path, var = 'prlr',
                            member = startR::indices(1:6),
                            sdate = sdates, ftime = 121:151,
                            lat = startR::values(list(latmin, latmax)),
                            lat_reorder = startR::Sort(decreasing = TRUE),
                            lon = startR::values(list(lonmin, lonmax)),
                            lon_reorder = startR::CircularSort(0, 360),
                            synonims = list(lon = c('lon', 'longitude'),
                                            lat = c('lat', 'latitude'),
                                            ftime = c('time', 'ftime'),
                                            member = c('member', 'ensemble')),
                            return_vars = list(ftime = 'sdate',
                                               lon = NULL, lat = NULL),
                            retrieve = TRUE)
Jost von Hardenberg [email protected]
An-Chi Ho [email protected]
This sample data set contains gridded seasonal forecast and corresponding observational data from the Copernicus Climate Change ECMWF-System 5 forecast system, and from the Copernicus Climate Change ERA-5 reconstruction. Specifically, for the 'tas' (2-meter temperature) variable, for the 15 first forecast ensemble members, monthly averaged, for the 3 first forecast time steps (lead months 1 to 4) of the November start dates of 2000 to 2005, for the Mediterranean region (27N-48N, 12W-40E). The data was generated on (or interpolated onto, for the reconstruction) a rectangular regular grid of size 360 by 181.
It is recommended to use the data set as follows:
require(zeallot)
c(exp, obs) %<-% lonlat_temp
The 'CST_Load' call used to generate the data set in the infrastructure of the Earth Sciences Department of the Barcelona Supercomputing Center is shown next. Note that 'CST_Load' internally calls 's2dv::Load', which would require a configuration file (not provided here) expressing the distribution of the 'system5c3s' and 'era5' NetCDF files in the file system.
library(CSTools)
require(zeallot)
startDates <- c('20001101', '20011101', '20021101',
                '20031101', '20041101', '20051101')
lonlat_temp <- CST_Load(var = 'tas',
                        exp = 'system5c3s', obs = 'era5',
                        nmember = 15, sdates = startDates,
                        leadtimemax = 3,
                        latmin = 27, latmax = 48,
                        lonmin = -12, lonmax = 40,
                        output = 'lonlat', nprocs = 1)
Nicolau Manubens [email protected]
This sample data set contains gridded seasonal forecast and corresponding observational data from the Copernicus Climate Change ECMWF-System 5 forecast system, and from the Copernicus Climate Change ERA-5 reconstruction. Specifically, for the 'tas' (2-meter temperature) variable, for the 15 first forecast ensemble members, monthly averaged, for the 3 first forecast time steps (lead months 1 to 4) of the November start dates of 2000 to 2005, for the Mediterranean region (27N-48N, 12W-40E). The data was generated on (or interpolated onto, for the reconstruction) a rectangular regular grid of size 360 by 181.
The 'CST_Start' call used to generate the data set in the infrastructure of the Earth Sciences Department of the Barcelona Supercomputing Center is shown next. Note that 'CST_Start' internally calls 'startR::Start' and then uses 'as.s2dv_cube' that converts the 'startR_array' into 's2dv_cube'.
lonlat_temp_st <- NULL
repos_exp <- paste0('/esarchive/exp/ecmwf/system5c3s/monthly_mean/',
                    '$var$_f6h/$var$_$sdate$.nc')
sdates <- sapply(2000:2005, function(x) paste0(x, '1101'))
lonmax <- 40
lonmin <- -12
latmax <- 48
latmin <- 27
lonlat_temp_st$exp <- CST_Start(dataset = repos_exp,
                                var = 'tas',
                                member = startR::indices(1:15),
                                sdate = sdates,
                                ftime = startR::indices(1:3),
                                lat = startR::values(list(latmin, latmax)),
                                lat_reorder = startR::Sort(decreasing = TRUE),
                                lon = startR::values(list(lonmin, lonmax)),
                                lon_reorder = startR::CircularSort(0, 360),
                                synonims = list(lon = c('lon', 'longitude'),
                                                lat = c('lat', 'latitude'),
                                                member = c('member', 'ensemble'),
                                                ftime = c('ftime', 'time')),
                                return_vars = list(lat = NULL, lon = NULL,
                                                   ftime = 'sdate'),
                                retrieve = TRUE)
dates <- c(paste0(2000, c(11, 12)), paste0(2001, c('01', 11, 12)),
           paste0(2002, c('01', 11, 12)), paste0(2003, c('01', 11, 12)),
           paste0(2004, c('01', 11, 12)), paste0(2005, c('01', 11, 12)), 200601)
dates <- sapply(dates, function(x) {paste0(x, '01')})
dates <- as.POSIXct(dates, format = '%Y%m%d', tz = 'UTC')
dim(dates) <- c(ftime = 3, sdate = 6)
dates <- t(dates)
names(dim(dates)) <- c('sdate', 'ftime')
path.obs <- '/esarchive/recon/ecmwf/era5/monthly_mean/$var$_f1h-r1440x721cds/$var$_$date$.nc'
lonlat_temp_st$obs <- CST_Start(dataset = path.obs,
                                var = 'tas',
                                date = unique(format(dates, '%Y%m')),
                                ftime = startR::values(dates),
                                ftime_across = 'date',
                                ftime_var = 'ftime',
                                merge_across_dims = TRUE,
                                split_multiselected_dims = TRUE,
                                lat = startR::values(list(latmin, latmax)),
                                lat_reorder = startR::Sort(decreasing = TRUE),
                                lon = startR::values(list(lonmin, lonmax)),
                                lon_reorder = startR::CircularSort(0, 360),
                                synonims = list(lon = c('lon', 'longitude'),
                                                lat = c('lat', 'latitude'),
                                                ftime = c('ftime', 'time')),
                                transform = startR::CDORemapper,
                                transform_extra_cells = 2,
                                transform_params = list(grid = 'r360x181',
                                                        method = 'conservative'),
                                transform_vars = c('lat', 'lon'),
                                return_vars = list(lon = NULL, lat = NULL,
                                                   ftime = 'date'),
                                retrieve = TRUE)
library(lubridate)
dates_exp <- lonlat_temp_st$exp$attrs$Dates
lonlat_temp_st$exp$attrs$Dates <- floor_date(ymd_hms(dates_exp), unit = "months")
dim(lonlat_temp_st$exp$attrs$Dates) <- dim(dates_exp)
dates_obs <- lonlat_temp_st$obs$attrs$Dates
lonlat_temp_st$obs$attrs$Dates <- floor_date(ymd_hms(dates_obs), unit = "months")
dim(lonlat_temp_st$obs$attrs$Dates) <- dim(dates_obs)
Nicolau Manubens [email protected]
This function merges two dimensions of an array into one. The user can select the dimensions to merge and provide the final name of the dimension. The user can select to remove NA values or keep them.
MergeDims( data, merge_dims = c("time", "monthly"), rename_dim = NULL, na.rm = FALSE )
data |
An n-dimensional array with named dimensions |
merge_dims |
A character vector indicating the names of the dimensions to merge. |
rename_dim |
A character string indicating the name of the output dimension. If left at NULL, the first dimension name provided in parameter 'merge_dims' will be used. |
na.rm |
A logical indicating if the NA values should be removed or not. |
Nuria Perez-Zanon, [email protected]
data <- 1 : 20
dim(data) <- c(time = 10, lat = 2)
new_data <- MergeDims(data, merge_dims = c('time', 'lat'))
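A minimal sketch of the na.rm option (the values and the expected effect, stated in the comments, are illustrative):

data2 <- c(1:20, NA, NA)
dim(data2) <- c(time = 11, lat = 2)
# Merge 'time' and 'lat' into a single 'time' dimension, dropping NA values
merged <- MergeDims(data2, merge_dims = c('time', 'lat'),
                    rename_dim = 'time', na.rm = TRUE)
dim(merged)  # the merged dimension should no longer include the NA positions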
This function performs EOF analysis over multiple variables, accepting as input an array with a dimension "var" for each variable to analyse. It is based on Singular Value Decomposition. For each field the EOFs are computed and the corresponding PCs are standardized (unit variance, zero mean); the minimum number of principal components needed to reach the user-defined variance is retained. The function weights the input data by the square root of the cosine of latitude.
MultiEOF( data, lon, lat, dates, time = NULL, lon_dim = "lon", lat_dim = "lat", time_dim = "ftime", sdate_dim = "sdate", var_dim = "var", neof_max = 40, neof_composed = 5, minvar = 0.6, lon_lim = NULL, lat_lim = NULL, ncores = NULL )
data |
A multidimensional array with dimension 'var'. |
lon |
Vector of longitudes. |
lat |
Vector of latitudes. |
dates |
Vector or matrix of dates in POSIXct format. |
time |
Deprecated parameter; it has been substituted by 'dates' and will be removed in the next release. |
lon_dim |
A character string indicating the name of the longitudinal dimension. By default, it is set to 'lon'. |
lat_dim |
A character string indicating the name of the latitudinal dimension. By default, it is set to 'lat'. |
time_dim |
A character string indicating the name of the temporal dimension. By default, it is set to 'ftime'. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. |
var_dim |
A character string indicating the name of the variable dimension. By default, it is set to 'var'. |
neof_max |
Maximum number of single eofs considered in the first decomposition. |
neof_composed |
Number of composed eofs to return in output. |
minvar |
Minimum variance fraction to be explained in first decomposition. |
lon_lim |
Vector with longitudinal range limits for the calculation for all input variables. |
lat_lim |
Vector with latitudinal range limits for the calculation for all input variables. |
ncores |
An integer indicating the number of cores to use for parallel computation. The default value is NULL. |
A list containing:
coeff |
An array of principal components with dimensions 'time_dim', 'sdate_dim', number of eof, rest of the dimensions of 'data' except 'lon_dim' and 'lat_dim'. |
variance |
An array of explained variances with dimensions 'eof' and the rest of the dimensions of 'data' except 'time_dim', 'sdate_dim', 'lon_dim' and 'lat_dim'. |
eof_pattern |
An array of EOF patterns obtained by regression with dimensions: 'eof' and the rest of the dimensions of 'data' except 'time_dim' and 'sdate_dim'. |
mask |
An array of the mask with dimensions ('lon_dim', 'lat_dim', rest of the dimensions of 'data' except 'time_dim'). It is made from 'data', 1 for the positions that 'data' has value and NA for the positions that 'data' has NA. It is used to replace NAs with 0s for EOF calculation and mask the result with NAs again after the calculation. |
coordinates |
Longitudinal and latitudinal coordinates vectors. |
Jost von Hardenberg - ISAC-CNR, [email protected]
Paolo Davini - ISAC-CNR, [email protected]
exp <- array(runif(1280) * 280,
             dim = c(dataset = 2, member = 2, sdate = 3, ftime = 3,
                     lat = 4, lon = 4, var = 1))
lon <- seq(0, 3)
lat <- seq(47, 44)
dates <- c("2000-11-01", "2000-12-01", "2001-01-01", "2001-11-01", "2001-12-01",
           "2002-01-01", "2002-11-01", "2002-12-01", "2003-01-01")
Dates <- as.POSIXct(dates, format = "%Y-%m-%d")
dim(Dates) <- c(ftime = 3, sdate = 3)
cal <- MultiEOF(data = exp, lon = lon, lat = lat, dates = Dates)
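As a brief follow-on, the returned elements named in the output description can be inspected directly:

# Inspect the MultiEOF output computed above
dim(cal$coeff)        # principal components
dim(cal$eof_pattern)  # EOF patterns obtained by regression
str(cal$variance)     # explained variance fractions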
This function calculates the correlation (Anomaly Correlation Coefficient; ACC), root mean square error (RMS) and root mean square error skill score (RMSSS) of individual anomaly models and of the multi-model mean (if desired) with the observations, on arrays with named dimensions.
MultiMetric( exp, obs, metric = "correlation", multimodel = TRUE, time_dim = "ftime", memb_dim = "member", sdate_dim = "sdate" )
exp |
A multidimensional array with named dimensions. |
obs |
A multidimensional array with named dimensions. |
metric |
A character string giving the metric for computing the maximum skill. This must be one of the strings 'correlation', 'rms' or 'rmsss'. |
multimodel |
A logical value indicating whether a Multi-Model Mean should be computed. |
time_dim |
Name of the temporal dimension where a mean will be applied. It can be NULL, the default value is 'ftime'. |
memb_dim |
Name of the member dimension. It can be NULL, the default value is 'member'. |
sdate_dim |
Name of the start date dimension or a dimension name identifying the different forecasts. It can be NULL, the default value is 'sdate'. |
A list of arrays containing the statistics of the selected metric in the element $data, which is a list of arrays: one for the requested metric and others for statistics about its significance. The arrays have two dataset dimensions equal to the 'dataset' dimension in the exp$data and obs$data inputs. If multimodel is TRUE, the greatest position in the first dimension corresponds to the Multi-Model Mean.
Mishra Niti, [email protected]
Perez-Zanon Nuria, [email protected]
Mishra, N., Prodhomme, C., & Guemas, V. (2018). Multi-model skill assessment of seasonal temperature and precipitation forecasts over Europe. Climate Dynamics. doi:10.1007/s00382-018-4404-z
exp <- array(rnorm(2*2*4*5*2*2),
             dim = c(dataset = 2, member = 2, sdate = 4, ftime = 5, lat = 2, lon = 2))
obs <- array(rnorm(1*1*4*5*2*2),
             dim = c(dataset = 1, member = 1, sdate = 4, ftime = 5, lat = 2, lon = 2))
res <- MultiMetric(exp = exp, obs = obs)
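A variant of the call above, requesting the RMS metric without the multi-model mean (a sketch using the documented 'metric' and 'multimodel' parameters):

res_rms <- MultiMetric(exp = exp, obs = obs, metric = 'rms',
                       multimodel = FALSE)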
This function implements the computation to obtain the index PDFs (e.g. NAO index) to improve the index estimate from SFSs for a hindcast period.
PDFIndexHind( index_hind, index_obs, method = "ME", time_dim_name = "time", na.rm = FALSE )
index_hind |
Index (e.g. NAO index) array from SFSs with at least two dimensions (time, member) or (time, statistic). The temporal dimension, by default 'time', must be greater than 2. The dimension 'member' must be greater than 1. The dimension 'statistic' must be equal to 2, containing the two parameters of a normal distribution (mean and sd) representing the ensemble of a SFS. It is not possible to have the dimensions 'member' and 'statistic' together. |
index_obs |
Index (e.g. NAO index) array from an observational database or reanalysis with at least a temporal dimension (by default 'time'), which must be greater than 2. |
method |
A character string indicating which methodology is applied to compute the PDFs. One of "ME" (default) or "LMEV". |
time_dim_name |
A character string indicating the name of the temporal dimension, by default 'time'. |
na.rm |
Logical (default = FALSE). Should missing values be removed? |
An array with at least two dimensions (time, statistic = 4). The first statistic is the parameter 'mean' of the PDF without bias correction. The second statistic is the parameter 'standard deviation' of the PDF without bias correction. The third statistic is the parameter 'mean' of the PDF with bias correction. The fourth statistic is the parameter 'standard deviation' of the PDF with bias correction.
Eroteida Sanchez-Garcia - AEMET, [email protected]
Sanchez-Garcia, E. et al. (2019). Regionally improved seasonal forecast of precipitation through best estimation of winter NAO. Adv. Sci. Res., 16, 165-174. doi:10.5194/asr-16-165-2019
# Example for the PDFIndexHind function
# Example 1
index_obs <- 1 : (5 * 3)
dim(index_obs) <- c(time = 5, season = 3)
index_hind <- 1 : (5 * 4 * 3)
dim(index_hind) <- c(time = 5, member = 4, season = 3)
res <- PDFIndexHind(index_hind, index_obs)
dim(res)
# time statistic season
#    5         4      3
# Example 2
index_obs <- 1 : (5 * 3)
dim(index_obs) <- c(time = 5, season = 3)
index_hind <- 1 : (5 * 2 * 3)
dim(index_hind) <- c(time = 5, statistic = 2, season = 3)
res <- PDFIndexHind(index_hind, index_obs)
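The same synthetic data can also be processed with the alternative 'LMEV' methodology (a sketch based only on the documented 'method' parameter):

res_lmev <- PDFIndexHind(index_hind, index_obs, method = 'LMEV')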
Plots a number of two-dimensional matrices with (longitude, latitude) dimensions on a single map with the cylindrical equidistant latitude and longitude projection.
PlotCombinedMap( maps, lon, lat, map_select_fun, display_range, map_dim = "map", brks = NULL, cols = NULL, bar_limits = NULL, triangle_ends = c(F, F), col_inf = NULL, col_sup = NULL, col_unknown_map = "white", mask = NULL, col_mask = "grey", dots = NULL, bar_titles = NULL, legend_scale = 1, cex_bar_titles = 1.5, plot_margin = NULL, bar_extra_margin = c(2, 0, 2, 0), fileout = NULL, width = 8, height = 5, size_units = "in", res = 100, drawleg = T, return_leg = FALSE, ... )
maps |
List of matrices to plot, each with (longitude, latitude) dimensions, or 3-dimensional array with the dimensions (longitude, latitude, map). Dimension names are required. |
lon |
Vector of longitudes. Must match the length of the corresponding dimension in 'maps'. |
lat |
Vector of latitudes. Must match the length of the corresponding dimension in 'maps'. |
map_select_fun |
Function that selects, for each grid point, which value to take among all the provided maps. This function receives as input a vector of values for the same grid point for all the provided maps, and must return a single selected value (not its index!) or NA. For example, the max() function. |
display_range |
Range of values to be displayed for all the maps. This must be a numeric vector c(range min, range max). The values in the parameter 'maps' can go beyond the limits specified in this range. If the selected value for a given grid point (according to 'map_select_fun') falls outside the range, it will be coloured with 'col_unknown_map'. |
map_dim |
Optional name for the dimension of 'maps' along which the multiple maps are arranged. Only applies when 'maps' is provided as a 3-dimensional array. Takes the value 'map' by default. |
brks |
Colour levels to be sent to PlotEquiMap. This parameter is optional and adjusted automatically by the function. |
cols |
List of vectors of colours to be sent to PlotEquiMap for the colour bar of each map. This parameter is optional and adjusted automatically by the function (up to 5 maps). The colours provided for each colour bar will be automatically interpolated to match the number of breaks. Each item in this list can be named, and the name will be used as title for the corresponding colour bar (equivalent to the parameter 'bar_titles'). |
bar_limits |
Parameter from s2dv::ColorBar. Vector of two numeric values with the extremes of the range of values represented in the colour bar. If 'var_limits' go beyond this interval, the drawing of triangle extremes is triggered at the corresponding sides, painted in 'col_inf' and 'col_sup'. Either of them can be set as NA and will then take as value the corresponding extreme in 'var_limits' (hence a triangle end won't be triggered for these sides). Takes as default the extremes of 'brks' if available, else the same values as 'var_limits'. |
triangle_ends |
Parameter from s2dv::ColorBar. Vector of two logical elements, indicating whether to force the drawing of triangle ends at each of the extremes of the colour bar. This choice is automatically made from the provided 'brks', 'bar_limits', 'var_limits', 'col_inf' and 'col_sup', but the behaviour can be manually forced to draw or not to draw the triangle ends with this parameter. If 'cols' is provided, 'col_inf' and 'col_sup' will take priority over 'triangle_ends' when deciding whether to draw the triangle ends or not. |
col_inf |
Parameter from s2dv::ColorBar. Colour to fill the inferior triangle end with. Useful if specifying colours manually with parameter 'cols', to specify the colour and to trigger the drawing of the lower extreme triangle, or if 'cols' is not specified, to replace the colour automatically generated by ColorBar(). |
col_sup |
Parameter from s2dv::ColorBar. Colour to fill the superior triangle end with. Useful if specifying colours manually with parameter 'cols', to specify the colour and to trigger the drawing of the upper extreme triangle, or if 'cols' is not specified, to replace the colour automatically generated by ColorBar(). |
col_unknown_map |
Colour to use to paint the grid cells for which no map can be selected according to 'map_select_fun' or for those values that go beyond 'display_range'. Takes the value 'white' by default. |
mask |
Optional numeric array with dimensions (latitude, longitude), with values in the range [0, 1], indicating the opacity of the mask over each grid point. Cells with a 0 will result in no mask, whereas cells with a 1 will result in a totally opaque superimposed pixel coloured in 'col_mask'. |
col_mask |
Colour to be used for the superimposed mask (if specified in 'mask'). Takes the value 'grey' by default. |
dots |
Array of same dimensions as 'var' or with dimensions c(n, dim(var)), where n is the number of dot/symbol layers to add to the plot. A value of TRUE at a grid cell will draw a dot/symbol on the corresponding square of the plot. By default all layers provided in 'dots' are plotted with dots, but a symbol can be specified for each of the layers via the parameter 'dot_symbol'. |
bar_titles |
Optional vector of character strings providing the titles to be shown on top of each of the colour bars. |
legend_scale |
Scale factor for the size of the colour bar labels. Takes 1 by default. |
cex_bar_titles |
Scale factor for the sizes of the bar titles. Takes 1.5 by default. |
plot_margin |
Numeric vector of length 4 for the margin sizes in the following order: bottom, left, top, and right. If not specified, use the default of par("mar"), c(5.1, 4.1, 4.1, 2.1). Used as 'margin_scale' in s2dv::PlotEquiMap. |
bar_extra_margin |
Parameter from s2dv::ColorBar. Extra margins to be added around the colour bar, in the format c(y1, x1, y2, x2). The units are margin lines. Takes rep(0, 4) by default. |
fileout |
File where to save the plot. If not specified (default) a graphics device will pop up. Extensions allowed: eps/ps, jpeg, png, pdf, bmp and tiff |
width |
File width, in the units specified in the parameter size_units (inches by default). Takes 8 by default. |
height |
File height, in the units specified in the parameter size_units (inches by default). Takes 5 by default. |
size_units |
Units of the size of the device (file or window) to plot in. Inches ('in') by default. See ?Devices and the creator function of the corresponding device. |
res |
Resolution of the device (file or window) to plot in. See ?Devices and the creator function of the corresponding device. |
drawleg |
Where to draw the common colour bar: TRUE, FALSE, or a character string indicating the side of the plot where it should be drawn. |
return_leg |
A logical value indicating if the color bars information should be returned by the function. If TRUE, the function doesn't plot the color bars but still creates the layout with color bar areas, and the arguments for GradientCatsColorBar() or ColorBar() will be returned. It is convenient for users to adjust the color bars manually. The default is FALSE, the color bars will be plotted directly. |
... |
Additional parameters to be passed on to PlotEquiMap. |
Nicolau Manubens, [email protected]
Veronica Torralba, [email protected]
PlotCombinedMap
and PlotEquiMap
# Simple example
x <- array(1:(20 * 10), dim = c(lat = 10, lon = 20)) / 200
a <- x * 0.6
b <- (1 - x) * 0.6
c <- 1 - (a + b)
lons <- seq(0, 359.5, length = 20)
lats <- seq(-89.5, 89.5, length = 10)
## Not run:
PlotCombinedMap(list(a, b, c), lons, lats,
                toptitle = 'Maximum map',
                map_select_fun = max,
                display_range = c(0, 1),
                bar_titles = paste('% of belonging to', c('a', 'b', 'c')),
                brks = 20, width = 12, height = 10)
## End(Not run)
Lon <- c(0:40, 350:359)
Lat <- 51:26
data <- rnorm(51 * 26 * 3)
dim(data) <- c(map = 3, lon = 51, lat = 26)
mask <- sample(c(0, 1), replace = TRUE, size = 51 * 26)
dim(mask) <- c(lat = 26, lon = 51)
## Not run:
PlotCombinedMap(data, lon = Lon, lat = Lat, map_select_fun = max,
                display_range = range(data), mask = mask,
                width = 14, height = 10)
## End(Not run)
This function plots the probability distribution function of several ensemble forecasts. Separate panels are used to plot forecasts valid or initialized at different times, from different models, or at different locations. Probabilities for tercile categories are computed, plotted in colors and annotated. An asterisk marks the tercile with the highest probability. Probabilities for extreme categories (above P90 and below P10) can also be included as hatched areas. Individual ensemble members can be plotted as jittered points. The observed value is optionally shown as a diamond.
PlotForecastPDF( fcst, tercile.limits, extreme.limits = NULL, obs = NULL, plotfile = NULL, title = "Set a title", var.name = "Varname (units)", fcst.names = NULL, add.ensmemb = c("above", "below", "no"), color.set = c("ggplot", "s2s4e", "hydro", "vitigeoss"), memb_dim = "member" )
fcst |
A dataframe or array containing all the ensemble members for each forecast. If it is an array, one of its named dimensions must be the member dimension indicated by 'memb_dim'; if it is a dataframe, each column corresponds to one forecast. |
tercile.limits |
An array or vector with P33 and P66 values that define the tercile categories for each panel. Use an array of dimensions (nforecasts,2) to define different terciles for each forecast panel, or a vector with two elements to reuse the same tercile limits for all forecast panels. |
extreme.limits |
(optional) An array or vector with P10 and P90 values that define the extreme categories for each panel. Use an array of dimensions (nforecasts,2) to define different extreme limits for each forecast panel, or a vector with two elements to reuse the same extreme limits for all forecast panels. (Default: extreme categories are not shown). |
obs |
(optional) A vector providing the observed values for each forecast panel or a single value that will be reused for all forecast panels. (Default: observation is not shown). |
plotfile |
(optional) A filename (pdf, png...) where the plot will be saved. (Default: the plot is not saved). |
title |
A string with the plot title. |
var.name |
A string with the variable name and units. |
fcst.names |
(optional) An array of strings with the titles of each individual forecast. |
add.ensmemb |
Whether to add the ensemble members 'above' (default) or 'below' the pdf, or not to add them at all ('no'). |
color.set |
A selection of predefined color sets: 'ggplot' (default), 's2s4e', 'hydro' or 'vitigeoss'. |
memb_dim |
A character string indicating the name of the member dimension. |
A ggplot object containing the plot.
Llorenç Lledó [email protected]
fcsts <- data.frame(fcst1 = rnorm(10), fcst2 = rnorm(10, 0.5, 1.2))
PlotForecastPDF(fcsts, c(-1, 1))
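A sketch extending the example with extreme categories and observed values (all limits and values are illustrative):

PlotForecastPDF(fcsts, tercile.limits = c(-1, 1),
                extreme.limits = c(-1.5, 1.5),
                obs = c(0.2, 0.7),
                var.name = 'Anomaly (K)')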
This function receives as main input (via the parameter probs) a collection of longitude-latitude maps, each containing the probabilities (from 0 to 1) of the different grid cells of belonging to a category. As many categories as maps provided as inputs are understood to exist. The maps of probabilities must be provided on a common rectangular regular grid, and a vector with the longitudes and a vector with the latitudes of the grid must be provided. The input maps can be provided in two forms, either as a list of multiple two-dimensional arrays (one for each category) or as a three-dimensional array, where one of the dimensions corresponds to the different categories.
PlotMostLikelyQuantileMap( probs, lon, lat, cat_dim = "bin", bar_titles = NULL, col_unknown_cat = "white", drawleg = T, ... )
probs |
A list of bi-dimensional arrays with the named dimensions 'latitude' (or 'lat') and 'longitude' (or 'lon'), with equal size and in the same order, or a single tri-dimensional array with an additional dimension (e.g. 'bin') for the different categories. The arrays must contain probability values between 0 and 1, and the probabilities for all categories of a grid cell should not exceed 1 when added. |
lon |
A numeric vector with the longitudes of the map grid, in the same order as the values along the corresponding dimension in 'probs'. |
lat |
A numeric vector with the latitudes of the map grid, in the same order as the values along the corresponding dimension in 'probs'. |
cat_dim |
The name of the dimension along which the different categories are stored in 'probs'. Takes the value 'bin' by default. |
bar_titles |
Vector of character strings with the names to be drawn on top of the color bar for each of the categories. As many titles as categories provided in 'probs' must be provided. |
col_unknown_cat |
Character string with a colour representation of the colour to be used to paint the cells for which no category can be clearly assigned. Takes the value 'white' by default. |
drawleg |
Where to draw the common colour bar: TRUE, FALSE, or a character string indicating the side of the plot where it should be drawn. |
... |
Additional parameters to be sent to PlotCombinedMap and PlotEquiMap. |
Veronica Torralba, [email protected], Nicolau Manubens, [email protected]
PlotCombinedMap
and PlotEquiMap
# Simple example
x <- array(1:(20 * 10), dim = c(lat = 10, lon = 20)) / 200
a <- x * 0.6
b <- (1 - x) * 0.6
c <- 1 - (a + b)
lons <- seq(0, 359.5, length = 20)
lats <- seq(-89.5, 89.5, length = 10)
## Not run:
PlotMostLikelyQuantileMap(list(a, b, c), lons, lats,
                          toptitle = 'Most likely tercile map',
                          bar_titles = paste('% of belonging to', c('a', 'b', 'c')),
                          brks = 20, width = 10, height = 8)
## End(Not run)
# More complex example
n_lons <- 40
n_lats <- 20
n_timesteps <- 100
n_bins <- 4
# 1. Generation of sample data
lons <- seq(0, 359.5, length = n_lons)
lats <- seq(-89.5, 89.5, length = n_lats)
# This function builds a 3-D gaussian at a specified point in the map.
make_gaussian <- function(lon, sd_lon, lat, sd_lat) {
  w <- outer(lons, lats, function(x, y) dnorm(x, lon, sd_lon) * dnorm(y, lat, sd_lat))
  min_w <- min(w)
  w <- w - min_w
  w <- w / max(w)
  w <- t(w)
  names(dim(w)) <- c('lat', 'lon')
  w
}
# This function generates random time series (with values ranging 1 to 5)
# according to 2 input weights.
gen_data <- function(w1, w2, n) {
  r <- sample(1:5, n, prob = c(.05, .9 * w1, .05, .05, .9 * w2), replace = TRUE)
  r <- r + runif(n, -0.5, 0.5)
  dim(r) <- c(time = n)
  r
}
# We build two 3-D gaussians.
w1 <- make_gaussian(120, 80, 20, 30)
w2 <- make_gaussian(260, 60, -10, 40)
# We generate sample data (with dimensions time, lat, lon) according
# to the generated gaussians
sample_data <- multiApply::Apply(list(w1, w2), NULL, gen_data, n = n_timesteps)$output1
# 2. Binning sample data
prob_thresholds <- 1:n_bins / n_bins
prob_thresholds <- prob_thresholds[1:(n_bins - 1)]
thresholds <- quantile(sample_data, prob_thresholds)
binning <- function(x, thresholds) {
  n_samples <- length(x)
  n_bins <- length(thresholds) + 1
  thresholds <- c(thresholds, max(x))
  result <- 1:n_bins
  lower_threshold <- min(x) - 1
  for (i in 1:n_bins) {
    result[i] <- sum(x > lower_threshold & x <= thresholds[i]) / n_samples
    lower_threshold <- thresholds[i]
  }
  dim(result) <- c(bin = n_bins)
  result
}
bins <- multiApply::Apply(sample_data, 'time', binning, thresholds)$output1
# 3. Plotting most likely quantile/bin
## Not run:
PlotMostLikelyQuantileMap(bins, lons, lats,
                          toptitle = 'Most likely quantile map',
                          bar_titles = paste('% of belonging to', letters[1:n_bins]),
                          mask = 1 - (w1 + w2 / max(c(w1, w2))),
                          brks = 20, width = 10, height = 8)
## End(Not run)
This function plots two gaussian probability density functions and the optimal linear estimation (OLE) obtained as a result of combining them.
PlotPDFsOLE( pdf_1, pdf_2, nsigma = 3, legendPos = "bottom", legendSize = 1, plotfile = NULL, width = 30, height = 15, units = "cm", dpi = 300 )
pdf_1 |
A numeric array with a dimension named 'statistic', containing two parameters: 'mean' and 'standard deviation' of the first gaussian pdf to combine. |
pdf_2 |
A numeric array with a dimension named 'statistic', containing two parameters: 'mean' and 'standard deviation' of the second gaussian pdf to combine. |
nsigma |
(optional) A numeric value for setting the limits of X axis. (Default nsigma = 3). |
legendPos |
(optional) A character value for setting the position of the legend ("bottom", "top", "right" or "left"). (Default: 'bottom'). |
legendSize |
(optional) A numeric value for setting the size of the legend text. (Default 1.0). |
plotfile |
(optional) A filename where the plot will be saved. (Default: the plot is not saved). |
width |
(optional) A numeric value indicating the plot width in units ("in", "cm", or "mm"). (Default width = 30). |
height |
(optional) A numeric value indicating the plot height. (Default height = 15). |
units |
(optional) A character value indicating the plot size unit. (Default units = 'cm'). |
dpi |
(optional) A numeric value indicating the plot resolution. (Default dpi = 300). |
PlotPDFsOLE() returns a ggplot object containing the plot.
Eroteida Sanchez-Garcia - AEMET, [email protected]
# Example 1
pdf_1 <- c(1.1, 0.6)
attr(pdf_1, "name") <- "NAO1"
dim(pdf_1) <- c(statistic = 2)
pdf_2 <- c(1, 0.5)
attr(pdf_2, "name") <- "NAO2"
dim(pdf_2) <- c(statistic = 2)
PlotPDFsOLE(pdf_1, pdf_2)
This function converts a 3-d numerical data array into a coloured grid with triangles. It is useful for a slide or article to present tabular results as colors instead of numbers. This can be used to compare the outputs of two or four categories (e.g. modes of variability, clusters, or forecast systems).
PlotTriangles4Categories( data, brks = NULL, cols = NULL, toptitle = NULL, sig_data = NULL, pch_sig = 18, col_sig = "black", cex_sig = 1, xlab = TRUE, ylab = TRUE, xlabels = NULL, xtitle = NULL, ylabels = NULL, ytitle = NULL, legend = TRUE, lab_legend = NULL, cex_leg = 1, col_leg = "black", cex_axis = 1.5, mar = c(5, 4, 0, 0), fileout = NULL, size_units = "px", res = 100, figure.width = 1, ... )
data |
Array with three named dimensions: 'dimx', 'dimy', 'dimcat', containing the values to be displayed in a coloured image with triangles. |
brks |
A vector of the color bar intervals. The length must be one more than the parameter 'cols'. Use ColorBar() to generate default values. |
cols |
A vector of valid colour identifiers for color bar. The length must be one less than the parameter 'brks'. Use ColorBar() to generate default values. |
toptitle |
A string of the title of the grid. Set NULL as default. |
sig_data |
Logical array with the same dimensions as 'data' to add layers to the plot. A value of TRUE at a grid cell will draw a dot/symbol on the corresponding triangle of the plot. Set NULL as default. |
pch_sig |
Symbol to be used to represent sig_data. Takes 18 (diamond) by default. See 'pch' in par() for additional accepted options. |
col_sig |
Colour of the symbol to represent sig_data. |
cex_sig |
Parameter to increase/reduce the size of the symbols used to represent sig_data. |
xlab |
A logical value (TRUE by default) indicating whether x-axis labels should be plotted. |
ylab |
A logical value (TRUE by default) indicating whether y-axis labels should be plotted. |
xlabels |
A vector of labels for the x-axis. Its length must equal the 'dimx' length of parameter 'data'. By default, the sequence from 1 to the 'dimx' length is used. |
xtitle |
A string of title of the x-axis. Set NULL as default. |
ylabels |
A vector of labels for the y-axis. Its length must equal the 'dimy' length of parameter 'data'. By default, the sequence from 1 to the 'dimy' length is used. |
ytitle |
A string of title of the y-axis. Set NULL as default. |
legend |
A logical value to decide to draw the color bar legend or not. Set TRUE as default. |
lab_legend |
A vector of labels indicating what is represented in each category (i.e. triangle). By default, the sequence from 1 to the number of categories (2 or 4) is used. |
cex_leg |
A number to indicate the increase/reduction of the size of the labels in 'lab_legend'. |
col_leg |
Color of the legend (triangles). |
cex_axis |
A number to indicate the increase/reduction of the axis labels. |
mar |
A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot. |
fileout |
A string of full directory path and file name indicating where to save the plot. If not specified (default), a graphics device will pop up. |
size_units |
A string indicating the units of the size of the device (file or window) to plot in. Set 'px' as default. See ?Devices and the creator function of the corresponding device. |
res |
A positive number indicating resolution of the device (file or window) to plot in. See ?Devices and the creator function of the corresponding device. |
figure.width |
A numeric value to control the width of the plot. |
... |
The additional parameters to be passed to function ColorBar() in s2dv for color legend creation. |
A figure in popup window by default, or saved to the specified path.
History:
1.0 - 2020-10 (V.Torralba, [email protected]) - Original code
# Example with random data
arr1 <- array(runif(n = 4 * 5 * 4, min = -1, max = 1), dim = c(4, 5, 4))
names(dim(arr1)) <- c('dimx', 'dimy', 'dimcat')
arr2 <- array(TRUE, dim = dim(arr1))
arr2[which(arr1 < 0.3)] <- FALSE
PlotTriangles4Categories(data = arr1,
                         cols = c('white', '#fef0d9', '#fdd49e', '#fdbb84', '#fc8d59'),
                         brks = c(-1, 0, 0.1, 0.2, 0.3, 0.4),
                         lab_legend = c('NAO+', 'BL', 'AR', 'NAO-'),
                         xtitle = "Target month", ytitle = "Lead time",
                         xlabels = c("Jan", "Feb", "Mar", "Apr"))
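A sketch that additionally overlays the significance layer 'arr2' built above via the 'sig_data' parameter:

PlotTriangles4Categories(data = arr1, sig_data = arr2,
                         cols = c('white', '#fef0d9', '#fdd49e', '#fdbb84', '#fc8d59'),
                         brks = c(-1, 0, 0.1, 0.2, 0.3, 0.4),
                         lab_legend = c('NAO+', 'BL', 'AR', 'NAO-'),
                         xtitle = "Target month", ytitle = "Lead time",
                         xlabels = c("Jan", "Feb", "Mar", "Apr"))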
This function plots the observed weekly means and climatology of timeseries data using the ggplot package. It compares the weekly climatology in a specified period (reference period) to the observed conditions during the target period analyzed in the case study.
PlotWeeklyClim( data, first_date, ref_period, last_date = NULL, data_years = NULL, time_dim = "time", sdate_dim = "sdate", ylim = NULL, title = NULL, subtitle = NULL, ytitle = NULL, legend = TRUE, palette = "Blues", fileout = NULL, device = NULL, width = 8, height = 6, units = "in", dpi = 300 )
data |
A multidimensional array with named dimensions with at least sdate and time dimensions containing observed daily data. It can also be a dataframe with computed percentiles as input for ggplot. If it's a dataframe, it must contain the following column names: 'week', 'clim', 'p10', 'p90', 'p33', 'p66', 'week_mean', 'day' and 'data'. |
first_date |
The first date of the observed values of timeseries. It can be of class 'Date', 'POSIXct' or a character string in the format 'yyyy-mm-dd'. If parameter 'data_years' is not provided, it must be a date included in the reference period. |
ref_period |
A vector of numeric values indicating the years of the reference period. If parameter 'data_years' is not specified, it must be of the same length of dimension 'sdate_dim' of parameter 'data'. |
last_date |
Optional parameter indicating the last date of the target period of the daily timeseries. It can be of class 'Date', 'POSIXct' or a character string in the format 'yyyy-mm-dd'. If it is NULL, the last date of the daily timeseries will be set as the last date of 'data'. As the data is plotted by weeks, only full groups of 7 days will be plotted. If the last date of the timeseries is not a multiple of 7 days, the last week will not be plotted. |
data_years |
A vector of numeric values indicating the years of the data. It must be of the same length of dimension 'sdate_dim' of parameter 'data'. It is optional, if not specified, all the years will be used as the target period. |
time_dim |
A character string indicating the daily time dimension name. The default value is 'time'. |
sdate_dim |
A character string indicating the start year dimension name. The default value is 'sdate'. |
ylim |
A numeric vector of length two providing limits of the scale. Use NA to refer to the existing minimum or maximum. For more information, see 'ggplot2' documentation of 'scale_y_continuous' parameter. |
title |
The text for the top title of the plot. It is NULL by default. |
subtitle |
The text for the subtitle of the plot. It is NULL by default. |
ytitle |
Character string to be drawn as y-axis title. It is NULL by default. |
legend |
A logical value indicating whether a legend should be included in the plot. If it is TRUE or NA, the legend will be included. If it is FALSE, the legend will not be included. It is TRUE by default. |
palette |
A palette name from the RColorBrewer package. The default value is 'Blues'. |
fileout |
A character string indicating the file name where to save the plot. If not specified (default) a graphics device will pop up. |
device |
A character string indicating the device to use. Can either be a device function (e.g. png), or one of "eps", "ps", "tex" (pictex), "pdf", "jpeg", "tiff", "png", "bmp", "svg" or "wmf" (windows only). |
width |
A numeric value of the plot width in units ("in", "cm", "mm", or "px"). It is set to 8 by default. |
height |
A numeric value of the plot height in units ("in", "cm", "mm", or "px"). It is set to 6 by default. |
units |
Units of the size of the device (file or window) to plot in. Inches ('in') by default. |
dpi |
A numeric value of the plot resolution. It is set to 300 by default. |
A ggplot object containing the plot.
data <- array(rnorm(49*20*3, 274), dim = c(time = 49, sdate = 20, member = 3))
PlotWeeklyClim(data = data, first_date = '2002-08-09',
               last_date = '2002-09-15', ref_period = 2010:2019,
               data_years = 2000:2019, time_dim = 'time', sdate_dim = 'sdate',
               title = "Observed weekly means and climatology",
               subtitle = "Target years: 2010 to 2019",
               ytitle = paste0('tas', " (", "deg.C", ")"))
This function divides into terciles the two dynamical proxies computed with CST_ProxiesAttractor or ProxiesAttractor. These terciles will be used to measure the predictability of the system in dyn_scores. When the local dimension 'dim' is small and the inverse of persistence 'theta' is small, the predictability is high, and vice versa.
Predictability(dim, theta, ncores = NULL)
dim |
An array of N named dimensions containing the local dimension as the output of CST_ProxiesAttractor or ProxiesAttractor. |
theta |
An array of N named dimensions containing the inverse of the persistence 'theta' as the output of CST_ProxiesAttractor or ProxiesAttractor. |
ncores |
The number of cores to use in parallel computation. |
A list of length 3:
'pred.dim', a list of two lists 'qdim' and 'pos.d'. The 'qdim' list contains values of local dimension 'dim' divided by terciles: d1: lower tercile (high predictability), d2: middle tercile, d3: higher tercile (low predictability). The 'pos.d' list contains the position of each tercile in parameter 'dim'.

'pred.theta', a list of two lists 'qtheta' and 'pos.t'. The 'qtheta' list contains values of the inverse of persistence 'theta' divided by terciles: th1: lower tercile (high predictability), th2: middle tercile, th3: higher tercile (low predictability). The 'pos.t' list contains the position of each tercile in parameter 'theta'.

'dyn_scores', values from 0 to 1. A dyn_score of 1 indicates the highest predictability.
Carmen Alvarez-Castro, [email protected]
Maria M. Chaves-Montero, [email protected]
Veronica Torralba, [email protected]
Davide Faranda, [email protected]
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8

Faranda, D., Messori, G., and Yiou, P. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
# Creating an example of matrix dat(time, grids):
m <- matrix(rnorm(2000) * 10, nrow = 50, ncol = 40)
names(dim(m)) <- c('time', 'grid')
# imposing a threshold
quanti <- 0.90
# computing dyn_scores from parameters dim and theta of the attractor
attractor <- ProxiesAttractor(data = m, quanti = 0.60)
predyn <- Predictability(dim = attractor$dim, theta = attractor$theta)
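A short follow-on sketch inspecting the returned elements (element names as listed in the output description; 'dyn_scores' is assumed to be the name of the third element):

names(predyn)               # expected: "pred.dim" "pred.theta" "dyn_scores"
summary(predyn$dyn_scores)  # values from 0 to 1; 1 indicates the highest predictability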
This is an S3 method of the generic 'print' for the class 's2dv_cube'. When 'print' is called on an 's2dv_cube' object, this method displays the content of the object in a clear and informative way.
## S3 method for class 's2dv_cube' print(x, ...)
x |
An 's2dv_cube' object. |
... |
Additional arguments of print function. |
The object will be displayed following 's2dv_cube' class conventions. The top-level elements are: 'Data', a multidimensional array containing the object's data; 'Dimensions', the dimensions of the array; 'Coordinates', the array coordinates that match its dimensions (explicit coordinates have an asterisk (*) at the beginning while index coordinates do not); and 'Attributes', which contains all the metadata of the object. For more information about the 's2dv_cube', see the s2dv_cube() and as.s2dv_cube() functions.
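A minimal usage sketch, assuming the lonlat_temp_st sample data set documented above is available:

library(CSTools)
class(lonlat_temp_st$exp)  # "s2dv_cube"
print(lonlat_temp_st$exp)  # displays Data, Dimensions, Coordinates and Attributes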
This function computes two dynamical proxies of the attractor: the local dimension (d) and the inverse of the persistence (theta). These two parameters will be used as a condition for the computation of dynamical scores to measure predictability and to compute bias correction conditioned by the dynamics with the function DynBiasCorrection. Function based on the MATLAB code ([email protected]) used in the references below.
ProxiesAttractor(data, quanti, ncores = NULL)
data |
A multidimensional array with named dimensions to create the attractor. It requires a temporal dimension named 'time' and spatial dimensions called 'lat' and 'lon', or 'latitude' and 'longitude' or 'grid'. |
quanti |
A number lower than 1 indicating the quantile to perform the computation of local dimension and theta. |
ncores |
The number of cores to use in parallel computation. |
A list with elements 'dim', the local dimension, and 'theta', the inverse of the persistence.
Carmen Alvarez-Castro, [email protected]
Maria M. Chaves-Montero, [email protected]
Veronica Torralba, [email protected]
Davide Faranda, [email protected]
Faranda, D., Alvarez-Castro, M.C., Messori, G., Rodriguez, D., and Yiou, P. (2019). The hammam effect or how a warm ocean enhances large scale atmospheric predictability. Nature Communications, 10(1), 1316. doi:10.1038/s41467-019-09305-8

Faranda, D., Messori, G., and Yiou, P. (2017). Dynamical proxies of North Atlantic predictability and extremes. Scientific Reports, 7, 41278.
# Example 1: Computing the attractor using simple data
# Creating an example of matrix data(time, grids):
mat <- array(rnorm(36 * 40), c(time = 36, grid = 40))
qm <- 0.90  # imposing a threshold
Attractor <- ProxiesAttractor(data = mat, quanti = qm)
# to plot the result
time <- c(1:length(Attractor$theta))
plot(time, Attractor$dim, xlab = 'time', ylab = 'd',
     main = 'local dimension', type = 'l')
This function is a wrapper of fitQmap and doQmap from package 'qmap' to be applied on multi-dimensional arrays. The quantile mapping adjustment between an experiment, typically a hindcast, and observation is applied to the experiment itself or to a provided forecast.
QuantileMapping( exp, obs, exp_cor = NULL, sdate_dim = "sdate", memb_dim = "member", window_dim = NULL, method = "QUANT", na.rm = FALSE, ncores = NULL, ... )
exp |
A multidimensional array with named dimensions containing the hindcast. |
obs |
A multidimensional array with named dimensions containing the reference dataset. |
exp_cor |
A multidimensional array with named dimensions in which the quantile mapping correction should be applied. If it is not specified, the correction is applied on object 'exp'. |
sdate_dim |
A character string indicating the dimension name in which cross-validation would be applied when exp_cor is not provided. 'sdate' by default. |
memb_dim |
A character string indicating the dimension name where ensemble members are stored in the experimental arrays. It can be NULL if there is no ensemble member dimension. It is set as 'member' by default. |
window_dim |
A character string indicating the dimension name where samples have been stored. It can be NULL (default) in case all samples are used. |
method |
A character string indicating the method to be used: 'PTF', 'DIST', 'RQUANT', 'QUANT', 'SSPLIN'. By default, the empirical quantile mapping 'QUANT' is used. |
na.rm |
A logical value indicating if missing values should be removed (FALSE by default). |
ncores |
An integer indicating the number of cores for parallel computation using multiApply function. The default value is NULL (1). |
... |
Additional parameters to be used by the chosen method. See the qmap package for details. |
An array containing the experimental data after applying the quantile mapping correction.
Nuria Perez-Zanon, [email protected]
# Use synthetic data
exp <- 1 : c(1 * 3 * 5 * 4 * 3 * 2)
dim(exp) <- c(dataset = 1, member = 3, sdate = 5, ftime = 4, lat = 3, lon = 2)
obs <- 101 : c(100 + 1 * 1 * 5 * 4 * 3 * 2)
dim(obs) <- c(dataset = 1, member = 1, sdate = 5, ftime = 4, lat = 3, lon = 2)
res <- QuantileMapping(exp, obs)
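For the forecast-correction use case via 'exp_cor', a sketch reusing the synthetic arrays above (the single start date is illustrative):

# A forecast with one start date, corrected with the hindcast quantiles
exp_cor <- 201 : c(200 + 1 * 3 * 1 * 4 * 3 * 2)
dim(exp_cor) <- c(dataset = 1, member = 3, sdate = 1, ftime = 4, lat = 3, lon = 2)
res_fcst <- QuantileMapping(exp, obs, exp_cor = exp_cor)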
This function implements the RainFARM stochastic precipitation downscaling method and accepts as input an array with named dimensions ("lon", "lat") and one or more dimensions (such as "ftime", "sdate" or "time") over which to average automatically determined spectral slopes. Adapted for climate downscaling and including orographic correction. References: Terzago, S. et al. (2018). NHESS 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018; D'Onofrio et al. (2014), J. of Hydrometeorology 15, 830-843; Rebora et al. (2006), JHM 7, 724.
RainFARM( data, lon, lat, nf, weights = 1, nens = 1, slope = 0, kmin = 1, fglob = FALSE, fsmooth = TRUE, nprocs = 1, time_dim = NULL, lon_dim = "lon", lat_dim = "lat", drop_realization_dim = FALSE, verbose = FALSE )
data |
Precipitation array to downscale. The input array is expected to have at least two dimensions named "lon" and "lat" by default (these default names can be changed with the 'lon_dim' and 'lat_dim' parameters). |
lon |
Vector or array of longitudes. |
lat |
Vector or array of latitudes. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
weights |
Multi-dimensional array with climatological weights, which can be obtained using the CST_RFWeights function. |
nens |
Number of ensemble members to produce (default: 1). |
slope |
Prescribed spectral slope. The default is 0, meaning that the slope is determined automatically from the input data. |
kmin |
First wavenumber for spectral slope (default: 1). |
fglob |
Logical to conserve global precipitation over the domain (default: FALSE). |
fsmooth |
Logical to conserve precipitation with a smoothing kernel (default: TRUE). |
nprocs |
The number of parallel processes to spawn for parallel computation on multiple cores (default: 1). |
time_dim |
String or character array with name(s) of time dimension(s) (e.g. "ftime", "sdate", "time" ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed over all elements belonging to those dimensions. If omitted one of c("ftime", "sdate", "time") is searched and the first one with more than one element is chosen. |
lon_dim |
Name of lon dimension ("lon" by default). |
lat_dim |
Name of lat dimension ("lat" by default). |
drop_realization_dim |
Logical to remove the "realization" stochastic ensemble dimension (default: FALSE), with the following behaviour if set to TRUE: 1) if nens == 1, the dimension is dropped; 2) if nens > 1 and a "member" dimension exists, the "realization" and "member" dimensions are compacted (multiplied) and the resulting dimension is named "member"; 3) if nens > 1 and no "member" dimension exists, the "realization" dimension is renamed to "member". |
verbose |
Logical for verbose output (default: FALSE). |
When parameters 'slope' and 'weights' present a seasonality dependency, a dimension name should match between these parameters and the input data in parameter 'data'. See the sketch after the example below, where weights and slope vary with the 'sdate' dimension.
RainFARM() returns a list containing the fine-scale longitudes, latitudes and the sequence of nens downscaled fields. If nens > 1 an additional dimension named "realization" is added to the output array after the "member" dimension (if it exists and unless drop_realization_dim = TRUE is specified). The ordering of the remaining dimensions in the exp element of the input object is maintained.
Jost von Hardenberg - ISAC-CNR, [email protected]
# Example for the 'reduced' RainFARM function
nf <- 8   # Choose a downscaling by factor 8
exp <- 1:(2 * 3 * 4 * 8 * 8)
dim(exp) <- c(dataset = 1, member = 2, sdate = 3, ftime = 4, lat = 8, lon = 8)
lon <- seq(10, 13.5, 0.5)
lat <- seq(40, 43.5, 0.5)
# Create a test array of weights
ww <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf))
res <- RainFARM(data = exp, lon = lon, lat = lat, nf = nf,
                weights = ww, nens = 3, time_dim = 'ftime')
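A second example, referenced in the note above, sketches the seasonal case: 'slope' and 'weights' carry an 'sdate' dimension whose name matches the input data. It reuses the objects from the first example; the slope values are illustrative.

# Example 2: 'slope' and 'weights' varying along the 'sdate' dimension
slope_seas <- array(c(1.5, 1.7, 1.9), dim = c(sdate = 3))
ww_seas <- array(1., dim = c(lon = 8 * nf, lat = 8 * nf, sdate = 3))
res2 <- RainFARM(data = exp, lon = lon, lat = lat, nf = nf,
                 weights = ww_seas, slope = slope_seas, nens = 3,
                 time_dim = 'ftime')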
This function performs the matching between a field of anomalies and a set of maps to be used as a reference. The anomalies are assigned to the reference map for which the minimum Euclidean distance (method = 'distance') or the highest spatial correlation (method = 'ACC') is obtained.
RegimesAssign(
  data,
  ref_maps,
  lat,
  method = "distance",
  composite = FALSE,
  memb = FALSE,
  ncores = NULL
)
data |
An array containing anomalies with named dimensions: dataset, member, sdate, ftime, lat and lon. |
ref_maps |
Array with 3-dimensions ('lon', 'lat', 'cluster') containing the maps/clusters that will be used as a reference for the matching. |
lat |
A vector of latitudes corresponding to the positions provided in data and ref_maps. |
method |
Whether the matching will be performed in terms of minimum distance (default = 'distance') or the maximum spatial correlation (method = 'ACC') between the maps. |
composite |
A logical parameter indicating if the composite maps are computed or not (default = FALSE). |
memb |
A logical value indicating whether to compute composites for separate members (default FALSE) or as a unique ensemble (TRUE). This option is only available when parameter 'composite' is set to TRUE and the data object has a dimension named 'member'. |
ncores |
The number of multicore threads to use for parallel computation. |
A list with elements $composite (3-d array (lon, lat, k) containing the composites k = 1, ..., K for case (*1) or only k = 1 for any specific cluster, i.e., case (*2); returned only if composite = TRUE), $pvalue (array with the same structure as $composite containing the p-value of the composites obtained through a t-test that accounts for the serial dependence of the data; returned only if composite = TRUE), $cluster (array with the same dimensions as data, except latitude and longitude, which are removed, indicating the ref_maps to which each point is allocated) and $frequency (a vector with one element per reference map, k = 1, ..., K, indicating the percentage of assignations corresponding to each map).
Verónica Torralba - BSC, [email protected]
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/
data <- array(abs(rnorm(1280, 282.7, 6.4)),
              dim = c(dataset = 2, member = 2, sdate = 3, ftime = 3,
                      lat = 4, lon = 4))
regimes <- WeatherRegime(data = data, lat = seq(47, 44),
                         EOFs = FALSE, ncenters = 4)$composite
res1 <- RegimesAssign(data = data, ref_maps = drop(regimes),
                      lat = seq(47, 44), composite = FALSE)
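A variant of the same call, matching by spatial correlation and returning the composites (both options are documented in the parameters above):

# Match by spatial correlation and compute composites over the whole ensemble
res2 <- RegimesAssign(data = data, ref_maps = drop(regimes), lat = seq(47, 44),
                      method = 'ACC', composite = TRUE, memb = TRUE)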
Compute climatological ("orographic") weights from a fine-scale precipitation climatology file.
RF_Weights(
  zclim,
  latin,
  lonin,
  nf,
  lat,
  lon,
  fsmooth = TRUE,
  lonname = "lon",
  latname = "lat",
  ncores = NULL
)
zclim |
A multi-dimensional array with named dimensions containing at least one precipitation field with spatial dimensions. |
latin |
A vector indicating the latitudinal coordinates corresponding to
the zclim parameter. |
lonin |
A vector indicating the longitudinal coordinates corresponding to
the zclim parameter. |
nf |
Refinement factor for downscaling (the output resolution is increased by this factor). |
lat |
Vector of latitudes. The number of longitudes and latitudes is expected to be even and the same. If not, the function will subset the grids to ensure this condition. |
lon |
Vector of longitudes. |
fsmooth |
Logical to use smooth conservation (default) or large-scale box-average conservation. |
lonname |
A character string indicating the name of the longitudinal dimension set as 'lon' by default. |
latname |
A character string indicating the name of the latitudinal dimension set as 'lat' by default. |
ncores |
An integer that indicates the number of cores for parallel computation using the multiApply function. The default value is one. |
An object of class 's2dv_cube' containing the weights in its data element, with dimensions (lon, lat).
Jost von Hardenberg - ISAC-CNR, [email protected]
Terzago, S., Palazzi, E., & von Hardenberg, J. (2018). Stochastic downscaling of precipitation in complex orography: A simple method to reproduce a realistic fine-scale climatology. Natural Hazards and Earth System Sciences, 18(11), 2825-2840. doi:10.5194/nhess-18-2825-2018.
a <- array(1:2500, c(lat = 50, lon = 50))
res <- RF_Weights(a, seq(0.1, 5, 0.1), seq(0.1, 5, 0.1),
                  nf = 5, lat = 1:5, lon = 1:5)
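These weights are meant to be passed to the weights argument of RainFARM. A minimal sketch of that chaining, reusing res from above; per the Value section the weights sit in the data element of the returned 's2dv_cube':

# Downscale a synthetic precipitation field on the same 5x5 coarse grid,
# applying the orographic weights computed above (taken from res$data)
pr_coarse <- array(abs(rnorm(5 * 5 * 8)), dim = c(lon = 5, lat = 5, ftime = 8))
res_ds <- RainFARM(data = pr_coarse, lon = 1:5, lat = 1:5, nf = 5,
                   weights = res$data, nens = 2, time_dim = 'ftime')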
This function computes spatial spectral slopes from an array, to be used for the RainFARM stochastic precipitation downscaling method.
RFSlope(
  data,
  kmin = 1,
  time_dim = NULL,
  lon_dim = "lon",
  lat_dim = "lat",
  ncores = NULL
)
data |
Array containing the spatial precipitation fields to downscale.
The input array is expected to have at least two dimensions named "lon" and
"lat" by default (these default names can be changed with the |
kmin |
First wavenumber for spectral slope (default: 1). |
time_dim |
String or character array with name(s) of dimension(s) (e.g. "ftime", "sdate", "member", ...) over which to compute spectral slopes. If a character array of dimension names is provided, the spectral slopes will be computed as an average over all elements belonging to those dimensions. If omitted, the first dimension among c("ftime", "sdate", "time") with more than one element is used. |
lon_dim |
Name of lon dimension ("lon" by default). |
lat_dim |
Name of lat dimension ("lat" by default). |
ncores |
An integer that indicates the number of cores for parallel computation using the multiApply function. The default value is one. |
RFSlope() returns spectral slopes using the RainFARM convention (the logarithmic slope of k*|A(k)|^2, where A(k) are the spectral amplitudes). The returned array has the same dimensions as the input array, minus the dimensions specified by lon_dim, lat_dim and time_dim.
Jost von Hardenberg - ISAC-CNR, [email protected]
# Example for the 'reduced' RFSlope function
# Create a test array with dimension 8x8, 20 timesteps,
# 3 starting dates and 4 ensemble members.
pr <- 1:(4 * 3 * 8 * 8 * 20)
dim(pr) <- c(ensemble = 4, sdate = 3, lon = 8, lat = 8, ftime = 20)
# Compute the spectral slopes ignoring the wavenumber
# corresponding to the largest scale (the box)
slopes <- RFSlope(pr, kmin = 2, time_dim = 'ftime')
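Since the slopes follow the RainFARM convention, they can be fed back to the slope argument of RainFARM instead of relying on its automatic estimation. A minimal sketch reusing pr from above; here the per-ensemble, per-sdate slopes are simply averaged into a single prescribed value, and the lon/lat values are illustrative:

# Prescribe a single spectral slope estimated from the data
res_sl <- RainFARM(data = pr, lon = seq(10, 13.5, 0.5), lat = seq(40, 43.5, 0.5),
                   nf = 4, slope = mean(slopes), nens = 1, time_dim = 'ftime')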
This function implements a simple lapse rate correction of a temperature field (a multidimensional array). The input lon grid must be increasing (but can be modulo 360). The input lat grid can be irregularly spaced (e.g. a Gaussian grid). The output grid can be irregularly spaced in lon and/or lat.
RFTemp(
  data,
  lon,
  lat,
  oro,
  lonoro,
  latoro,
  xlim = NULL,
  ylim = NULL,
  lapse = 6.5,
  lon_dim = "lon",
  lat_dim = "lat",
  time_dim = NULL,
  nolapse = FALSE,
  verbose = FALSE,
  compute_delta = FALSE,
  method = "bilinear",
  delta = NULL
)
data |
Temperature array to downscale. The input array is expected to
have at least two dimensions named "lon" and "lat" by default (these default
names can be changed with the lon_dim and lat_dim parameters). |
lon |
Vector or array of longitudes. |
lat |
Vector or array of latitudes. |
oro |
Array containing fine-scale orography (in m). The destination downscaling area must be contained in the orography field. |
lonoro |
Vector or array of longitudes corresponding to the fine orography. |
latoro |
Vector or array of latitudes corresponding to the fine orography. |
xlim |
Vector with longitude bounds for downscaling; the full input field is downscaled if 'xlim' and 'ylim' are not specified. |
ylim |
Vector with latitude bounds for downscaling. |
lapse |
Float with environmental lapse rate (in K/km; default: 6.5). |
lon_dim |
String with name of longitude dimension. |
lat_dim |
String with name of latitude dimension. |
time_dim |
A character string, or vector of strings, indicating the name(s) of the temporal dimension(s). By default it is set to NULL, in which case "ftime", "sdate" and "time" are considered temporal dimensions. |
nolapse |
Logical, if true 'oro' is interpreted as a fine-scale climatology and used directly for bias correction. |
verbose |
Logical indicating whether to print diagnostic output. |
compute_delta |
Logical; if TRUE, only a delta to be used for out-of-sample forecasts is returned. |
method |
String indicating the method used for interpolation: "nearest" (nearest neighbours followed by smoothing with a circular uniform-weights kernel) or "bilinear" (bilinear interpolation). The two methods provide similar results, but "nearest" is slightly better provided that the fine-scale grid is correctly centered as a subdivision of the large-scale grid. |
delta |
Matrix containing a delta to be applied to the downscaled input data. The grid of this matrix is supposed to be the same as that of the required output field. |
CST_RFTemp() returns a downscaled CSTools object.
RFTemp() returns a list containing the fine-scale longitudes, latitudes and the downscaled fields.
Jost von Hardenberg - ISAC-CNR, [email protected]
Method described in ERA4CS MEDSCOPE milestone M3.2: High-quality climate prediction data available to WP4 here: https://www.medscope-project.eu/the-project/deliverables-reports/ and in H2020 ECOPOTENTIAL Deliverable No. 8.1: High resolution (1-10 km) climate, land use and ocean change scenarios here: https://ec.europa.eu/research/participants/documents/downloadPublic?documentIds=080166e5b6cd2324&appId=PPGMS.
# Generate simple synthetic data and downscale by factor 4
t <- rnorm(7 * 6 * 4 * 3) * 10 + 273.15 + 10
dim(t) <- c(sdate = 3, ftime = 4, lat = 6, lon = 7)
lon <- seq(3, 9, 1)
lat <- seq(42, 47, 1)
o <- runif(29 * 29) * 3000
dim(o) <- c(lat = 29, lon = 29)
lono <- seq(3, 10, 0.25)
lato <- seq(41, 48, 0.25)
res <- RFTemp(t, lon, lat, o, lono, lato, xlim = c(4, 8), ylim = c(43, 46),
              lapse = 6.5, time_dim = 'ftime')
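The compute_delta/delta pair supports a two-step out-of-sample workflow: derive a correction from a reference period, then apply it to new data. A hedged sketch reusing the objects above; the element name used to extract the delta (res_delta$data) is an assumption, so check the actual return structure:

# Step 1: derive the downscaling delta from the reference data
res_delta <- RFTemp(t, lon, lat, o, lono, lato, xlim = c(4, 8), ylim = c(43, 46),
                    lapse = 6.5, time_dim = 'ftime', compute_delta = TRUE)
# Step 2: apply the stored delta to new (out-of-sample) temperatures
t_new <- rnorm(7 * 6 * 4 * 3) * 10 + 273.15 + 10
dim(t_new) <- c(sdate = 3, ftime = 4, lat = 6, lon = 7)
res_new <- RFTemp(t_new, lon, lat, o, lono, lato, xlim = c(4, 8), ylim = c(43, 46),
                  lapse = 6.5, time_dim = 'ftime',
                  delta = res_delta$data)  # '$data' element name is an assumption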
This function allows to create an 's2dv_cube' object by passing information through its parameters. It is needed if the data have not been loaded using CST_Start or have been transformed with other methods. An 's2dv_cube' object has many different components, including metadata. This function allows to create 's2dv_cube' objects even if not all elements are defined; for each expected missing parameter, a warning message will be returned.
s2dv_cube(
  data,
  coords = NULL,
  varName = NULL,
  metadata = NULL,
  Datasets = NULL,
  Dates = NULL,
  when = NULL,
  source_files = NULL,
  ...
)
data |
A multidimensional array with named dimensions, typically with dimensions: dataset, member, sdate, time, lat and lon. |
coords |
A list of named vectors with the coordinates corresponding to the dimensions of the data parameter. If any coordinate has dimensions, they will be set as NULL. If any coordinate is not provided, it is set as an index vector with the values from 1 to the length of the corresponding dimension. The attribute 'indices' indicates whether the coordinate is an index vector (TRUE) or not (FALSE). |
varName |
A character string indicating the abbreviation of the variable name. |
metadata |
A named list where each element is a variable containing the corresponding information. The information can be contained in a list of lists for each variable. |
Datasets |
Character strings indicating the names of the dataset. If there are multiple datasets, it can be a vector of their names or a list of lists with additional information. |
Dates |
An array of dates of class 'POSIXct' with named time dimensions. |
when |
A time stamp of the date when the data has been loaded. This parameter is also found in Load() and Start() functions output. |
source_files |
A vector of character strings with complete paths to all the found files involved in loading the data. |
... |
Additional elements to be added to the object. They will be stored at the end of the 'attrs' element. Multiple elements are accepted. |
The function returns an object of class 's2dv_cube' with the following
elements in the structure:
'data', array with named dimensions;
'dims', named vector of the data dimensions;
'coords', list of named vectors with the coordinates corresponding to the dimensions of the data parameter;
'attrs', named list with elements:
'Dates', array with named temporal dimensions of class 'POSIXct' from time values in the data;
'Variable', has the following components:
'varName', with the short name of the loaded variable as specified in the parameter 'varName';
'metadata', named list of elements with variable metadata. They can be from coordinates variables (e.g. longitude) or main variables (e.g. 'var');
'Datasets', character strings indicating the names of the dataset;
'source_files', a vector of character strings with complete paths to all the found files involved in loading the data;
'when', a time stamp of the date issued by the Start() or Load() call to obtain the data;
'load_parameters', it contains the components used in the arguments to load the data from Start() or Load() functions.
Perez-Zanon Nuria, [email protected]
exp_original <- 1:100
dim(exp_original) <- c(lat = 2, time = 10, lon = 5)
exp1 <- s2dv_cube(data = exp_original)
class(exp1)
coords <- list(lon = seq(-10, 10, 5), lat = c(45, 50))
exp2 <- s2dv_cube(data = exp_original, coords = coords)
class(exp2)
metadata <- list(tas = list(level = '2m'))
exp3 <- s2dv_cube(data = exp_original, coords = coords, varName = 'tas',
                  metadata = metadata)
class(exp3)
Dates <- as.POSIXct(paste0(rep("01", 10), rep("01", 10), 1990:1999),
                    format = "%d%m%Y")
dim(Dates) <- c(time = 10)
exp4 <- s2dv_cube(data = exp_original, coords = coords, varName = 'tas',
                  metadata = metadata, Dates = Dates)
class(exp4)
exp5 <- s2dv_cube(data = exp_original, coords = coords, varName = 'tas',
                  metadata = metadata, Dates = Dates,
                  when = "2019-10-23 19:15:29 CET")
class(exp5)
exp6 <- s2dv_cube(data = exp_original, coords = coords, varName = 'tas',
                  metadata = metadata, Dates = Dates,
                  when = "2019-10-23 19:15:29 CET",
                  source_files = c("/path/to/file1.nc", "/path/to/file2.nc"))
class(exp6)
exp7 <- s2dv_cube(data = exp_original, coords = coords, varName = 'tas',
                  metadata = metadata, Dates = Dates,
                  when = "2019-10-23 19:15:29 CET",
                  source_files = c("/path/to/file1.nc", "/path/to/file2.nc"),
                  Datasets = list(
                    exp1 = list(InitializationsDates = list(Member_1 = "01011990",
                                                            Members = "Member_1"))))
class(exp7)
dim(exp_original) <- c(dataset = 1, member = 1, time = 10, lat = 2, lon = 5)
exp8 <- s2dv_cube(data = exp_original, coords = coords, varName = 'tas',
                  metadata = metadata, Dates = Dates, original_dates = Dates)
class(exp8)
This function allows to save a data array with metadata into a NetCDF file, allowing to reload the saved data using the Start function from the startR package. If the original 's2dv_cube' object has been created from CST_Load(), then it can be reloaded with Load().
SaveExp(
  data,
  destination = "./",
  coords = NULL,
  Dates = NULL,
  time_bounds = NULL,
  startdates = NULL,
  varname = NULL,
  metadata = NULL,
  Datasets = NULL,
  sdate_dim = "sdate",
  ftime_dim = "time",
  memb_dim = "member",
  dat_dim = "dataset",
  var_dim = "var",
  drop_dims = NULL,
  single_file = FALSE,
  extra_string = NULL,
  global_attrs = NULL,
  units_hours_since = FALSE
)
data |
A multi-dimensional array with named dimensions. |
destination |
A character string indicating the path where to store the NetCDF files. |
coords |
A named list with elements of the coordinates corresponding to the dimensions of the data parameter. The names and length of each element must correspond to the names of the dimensions. If any coordinate is not provided, it is set as an index vector with the values from 1 to the length of the corresponding dimension. |
Dates |
A named array of dates with the corresponding sdate and forecast time dimensions. If there is no sdate_dim, it can be set to NULL. It must have the ftime_dim dimension. |
time_bounds |
(Optional) A list of two arrays of dates containing the lower (first array) and the upper (second array) time bounds corresponding to Dates. Each array must have the same dimensions as Dates. If 'Dates' parameter is NULL, 'time_bounds' are not used. It is NULL by default. |
startdates |
A vector of dates that will be used for the filenames
when saving the data in multiple files (single_file = FALSE). It must be a
vector of the same length as the start date dimension of data. It must be a
vector of class 'Date', 'POSIXct' or character. |
varname |
A character string indicating the name of the variable to be saved. |
metadata |
A named list where each element is a variable containing the corresponding information. The information must be contained in a list of lists for each variable. |
Datasets |
A vector of character string indicating the names of the datasets. |
sdate_dim |
A character string indicating the name of the start date dimension. By default, it is set to 'sdate'. It can be NULL if there is no start date dimension. |
ftime_dim |
A character string indicating the name of the forecast time dimension. By default, it is set to 'time'. It can be NULL if there is no forecast time dimension. |
memb_dim |
A character string indicating the name of the member dimension. By default, it is set to 'member'. It can be NULL if there is no member dimension. |
dat_dim |
A character string indicating the name of dataset dimension. By default, it is set to 'dataset'. It can be NULL if there is no dataset dimension. |
var_dim |
A character string indicating the name of variable dimension. By default, it is set to 'var'. It can be NULL if there is no variable dimension. |
drop_dims |
(Optional) A vector of character strings indicating the names of dimensions of length 1 that should be dropped so that they do not appear in the NetCDF file. Only dimensions that are not used in the computation can be dropped. The dimensions used in the computation are the ones specified in: sdate_dim, ftime_dim, dat_dim, var_dim and memb_dim. It is NULL by default. |
single_file |
A logical value indicating whether the whole object is saved in a single file (TRUE) or in multiple files (FALSE). When it is FALSE, the array is separated by dataset, variable and start date. When there are no specified time dimensions, the data will be saved in a single file by default. The output file name when 'single_file' is TRUE is a character string containing: '<var>_<first_sdate>_<last_sdate>.nc'; when it is FALSE, it is '<var>_<sdate>.nc'. It is FALSE by default. |
extra_string |
(Optional) A character string to be included as part of the file name, for instance, to identify the member or realization. When single_file is TRUE, the 'extra_string' will replace the whole default file name; when single_file is FALSE, the 'extra_string' will be added to the file name as: '<var>_<extra_string>_<sdate>.nc'. It is NULL by default. |
global_attrs |
(Optional) A list with elements containing the global attributes to be saved in the NetCDF. |
units_hours_since |
(Optional) A logical value only available for the case: Dates have forecast time and start date dimension, single_file is TRUE and 'time_bounds' is NULL. When it is TRUE, it saves the forecast time with units of 'hours since'; if it is FALSE, the time units will be a number of time steps with its corresponding frequency (e.g. n days, n months or n hours). It is FALSE by default. |
Multiple or single NetCDF files containing the data array.
single_file is TRUE |
All data is saved in a single file located in the specified destination path with the following name (by default): '<variable_name>_<first_sdate>_<last_sdate>.nc'. Multiple variables are saved separately in the same file. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. If 'units_hours_since' is TRUE, the forecast time units will be 'hours since <each start date>'. If 'units_hours_since' is FALSE, the forecast time units are extracted from the frequency of the time steps (hours, days, months); if no frequency is found, the units will be 'hours since'. When the time units are 'hours since', the time steps are assumed to be equally spaced. |
single_file is FALSE |
The data array is subset and stored into multiple files. Each file contains the data subset for each start date, variable and dataset. Files with different variables and datasets are stored in separated directories within the following directory tree: 'destination/Dataset/variable/'. The name of each file will be by default: '<variable_name>_<sdate>.nc'. The forecast time units are calculated from each start date (if sdate_dim is not NULL) or from the time step. The forecast time units will be 'hours since <each start date>'. |
Perez-Zanon Nuria, [email protected]
## Not run: 
data <- lonlat_temp_st$exp$data
lon <- lonlat_temp_st$exp$coords$lon
lat <- lonlat_temp_st$exp$coords$lat
coords <- list(lon = lon, lat = lat)
Datasets <- lonlat_temp_st$exp$attrs$Datasets
varname <- 'tas'
Dates <- lonlat_temp_st$exp$attrs$Dates
metadata <- lonlat_temp_st$exp$attrs$Variable$metadata
SaveExp(data = data, coords = coords, Datasets = Datasets, varname = varname,
        Dates = Dates, metadata = metadata, single_file = TRUE,
        ftime_dim = 'ftime', var_dim = 'var', dat_dim = 'dataset')
## End(Not run)
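For comparison, a sketch of the multiple-file layout: with single_file = FALSE the array is split per dataset, variable and start date into 'destination/Dataset/variable/<var>_<sdate>.nc':

## Not run: 
# Same objects as above, saved as one NetCDF file per start date
SaveExp(data = data, coords = coords, Datasets = Datasets, varname = varname,
        Dates = Dates, metadata = metadata, single_file = FALSE,
        ftime_dim = 'ftime', var_dim = 'var', dat_dim = 'dataset')
## End(Not run)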
This function splits a dimension in two. The user can select the dimension to split and provide indices indicating how to split that dimension, or dates and the expected frequency (monthly or by day, month and year). The user can also provide a numeric frequency indicating the length of each division.
SplitDim(
  data,
  split_dim = "time",
  indices,
  freq = "monthly",
  new_dim_name = NULL,
  dates = NULL,
  return_indices = FALSE
)
data |
An n-dimensional array with named dimensions. |
split_dim |
A character string indicating the name of the dimension to split. |
indices |
A vector of numeric indices or dates. |
freq |
A character string indicating the frequency: by 'day', 'month' and 'year' or 'monthly' (by default). 'month' identifies months between 1 and 12 independently of the year they belong to, while 'monthly' differentiates months from different years. Parameter 'freq' can also be numeric, indicating the length in which to subset the dimension. |
new_dim_name |
A character string indicating the name of the new dimension. |
dates |
An optional parameter containing an array of dates of class 'POSIXct' with the corresponding time dimensions of 'data'. It is NULL by default. |
return_indices |
A logical value; if TRUE, the indices used to split the dimension are also returned. It is FALSE by default. |
Nuria Perez-Zanon, [email protected]
data <- 1:20
dim(data) <- c(time = 10, lat = 2)
indices <- c(rep(1, 5), rep(2, 5))
new_data <- SplitDim(data, indices = indices)
time <- c(seq(ISOdate(1903, 1, 1), ISOdate(1903, 1, 4), "days"),
          seq(ISOdate(1903, 2, 1), ISOdate(1903, 2, 4), "days"),
          seq(ISOdate(1904, 1, 1), ISOdate(1904, 1, 2), "days"))
new_data <- SplitDim(data, indices = time)
new_data <- SplitDim(data, indices = time, freq = 'day')
new_data <- SplitDim(data, indices = time, freq = 'month')
new_data <- SplitDim(data, indices = time, freq = 'year')
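The numeric form of 'freq' splits the dimension into consecutive chunks of fixed length. A minimal sketch reusing data from above; that 'indices' may be omitted when a numeric 'freq' is given is an assumption here:

# Split the 'time' dimension (length 10) into two chunks of length 5;
# 'new_dim_name' names the resulting dimension
new_data <- SplitDim(data, split_dim = 'time', freq = 5, new_dim_name = 'chunk')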
This function characterises the synoptic situations in a past period based on low-resolution reanalysis data (e.g., ERA-Interim 1.5º x 1.5º) and an observational high-resolution (HR) dataset (AEMET 5 km gridded daily precipitation and maximum and minimum temperature) (Peral et al., 2017). The method uses three domains:
peninsular Spain and Balearic Islands domain (5 km resolution): HR domain
synoptic domain (low resolution): it should be centered over the Iberian Peninsula and cover enough extension to detect as many synoptic situations as possible.
extended domain (low resolution): it is an extension of the synoptic domain. It is used for 'slp_ext' parameter (see 'slp_lon' and 'slp_lat' below).
training_analogs(
  pred,
  slp_ext,
  lon,
  lat,
  slp_lon,
  slp_lat,
  var,
  HR_path,
  tdates
)
pred |
List of matrix reanalysis data in a synoptic domain. The list has to contain reanalysis atmospheric variables (instantaneous 12h data) that must be identified by name. For precipitation:
For maximum and minimum temperature:
All matrices must have [time, gridpoint] dimensions (time = number of training days, gridpoint = number of synoptic gridpoints). |
slp_ext |
Matrix with atmospheric reanalysis sea level pressure (instantaneous 12h data) (hPa). It has the same resolution as the 'pred' parameter but with an extended domain. This domain contains extra degrees (mostly in the north and west) compared to the synoptic domain. The matrix must have [time, gridpoint] dimensions (time = number of training days, gridpoint = number of extended gridpoints). |
lon |
Vector of the synoptic longitudes (from -180º to 180º). The vector must go from west to east. |
lat |
Vector of the synoptic latitude. The vector must go from north to south. |
slp_lon |
Vector of the extended longitudes (from -180º to 180º). The vector must go from west to east. |
slp_lat |
Vector of the extended latitude. The vector must go from north to south. |
var |
Variable name to downscale. There are two options: 'prec' for precipitation and 'temp' for maximum and minimum temperature. |
HR_path |
Local path of the HR observational files (maestro and pcp/tmx-tmn). Precipitation and temperature data can be downloaded from: https://www.aemet.es/en/serviciosclimaticos/cambio_climat/datos_diarios?w=2. The maestro file (maestro_red_hr_SPAIN.txt) has gridpoint (nptos), longitude (lon), latitude (lat) and altitude (alt) in columns (vector structure). The data file (pcp/tmx/tmn_red_SPAIN_1951-201903.txt) includes 5 km resolution Spanish daily data (precipitation or maximum and minimum temperature from January 1951 to June 2020). See the README file for more information. IMPORTANT: the HR observational period must be the same as for the reanalysis variables. It is assumed that the training period is smaller than the HR original one (1951-2020), so a new ASCII file must be created with the new period and the same structure as the original, specifying the training dates ('tdates' parameter) in the name (e.g. 'pcp_red_SPAIN_19810101-19961231.txt' for the '19810101-19961231' period). |
tdates |
Training period dates in format YYYYMMDD(start)-YYYYMMDD(end) (e.g. 19810101-19961231). |
A matrix list (e.g. restrain) resulting from the characterisation of the past synoptic situations and the significant predictors needed to downscale seasonal forecast variables. For precipitation the output includes:
'um': u component of geostrophic wind over the whole period (numeric matrix with [time, gridpoint] dimensions).
'vm': v component of geostrophic wind over the whole period (numeric matrix with [time, gridpoint] dimensions).
'nger': number of synoptic situations (integer).
'gu92': u component of geostrophic wind for each synoptic situation (numeric matrix with [nger,gridpoint] dimensions).
'gv92': v component of geostrophic wind for each synoptic situation (numeric matrix with [nger, gridpoint] dimensions).
'gu52': u component of wind at 500 hPa for each synoptic situation (numeric matrix with [nger, gridpoint] dimensions).
'gv52': v component of wind at 500 hPa for each synoptic situation (numeric matrix with [nger, gridpoint] dimensions).
'neni': number of reference centers where predictors are calculated (integer).
'vdmin': minimum distances between each HR gridpoint and the four nearest synoptic gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints).
'vref': four nearest synoptic gridpoints to each HR gridpoint (integer matrix with [nptos, 4] dimensions).
'ccm': multiple correlation coefficients (numeric matrix with [nger, nptos] dimensions). Indices:
'lab_pred': numeric labels of selected predictors (integer matrix with [nger,nptos,11,1] dimensions).
'cor_pred': partial correlation of selected predictors (numeric matrix with [nger,nptos,11,2] dimensions).
For maximum and minimum temperature the output includes:
'um': u component of geostrophic wind over the whole training period (numeric matrix with [time, gridpoint] dimensions).
'vm': v component of geostrophic wind over the whole training period (numeric matrix with [time, gridpoint] dimensions).
'insol': insolation over the whole training period (numeric vector with [time] dimension).
'neni': number of reference centers where predictors are calculated (integer).
'vdmin': minimum distances between each HR gridpoint and the four nearest synoptic gridpoints (numeric matrix with [nptos,4] dimensions) (nptos = number of HR gridpoints).
'vref': four nearest synoptic gridpoints to each HR gridpoint (integer matrix with [nptos,4] dimensions).
The output can be used directly as an argument to the 'CST_AnalogsPredictors' function (e.g. resdowns <- CST_AnalogsPredictors(..., restrain)).
Marta Dominguez Alonso - AEMET, [email protected]
Nuria Perez-Zanon - BSC, [email protected]
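No runnable example ships with this function because it requires the local AEMET HR observational files. A hedged sketch of a precipitation training call; all paths, object names and grid values are hypothetical:

## Not run: 
# 'pred_list' and 'slp_ext_matrix' are hypothetical [time, gridpoint] inputs
restrain <- training_analogs(pred = pred_list,
                             slp_ext = slp_ext_matrix,
                             lon = seq(-11.25, 6.75, 1.5),      # west to east
                             lat = seq(46.5, 33.0, -1.5),       # north to south
                             slp_lon = seq(-18.75, 11.25, 1.5),
                             slp_lat = seq(52.5, 28.5, -1.5),
                             var = 'prec',
                             HR_path = '/path/to/HR_files/',
                             tdates = '19810101-19961231')
## End(Not run)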
This function computes weather regimes from a cluster analysis. It can be applied over a dataset with dimensions c(year/month, month/day, lon, lat), or to the PCs obtained from an EOF analysis used to filter the dataset. The cluster analysis can be performed with the traditional k-means or with the methods included in hclust (stats package).
WeatherRegime(
  data,
  ncenters = NULL,
  EOFs = TRUE,
  neofs = 30,
  varThreshold = NULL,
  lon = NULL,
  lat = NULL,
  method = "kmeans",
  iter.max = 100,
  nstart = 30,
  ncores = NULL
)
data |
An array containing anomalies with named dimensions with at least start date 'sdate', forecast time 'ftime', latitude 'lat' and longitude 'lon'. |
ncenters |
Number of clusters to be calculated with the clustering function. |
EOFs |
Whether to compute the EOFs (default = TRUE) or not (FALSE) to filter the data. |
neofs |
Number of modes to be kept, only if EOFs = TRUE (default = 30). |
varThreshold |
Value with the percentage of variance to be explained by the PCs. Only sufficient PCs to explain this much variance will be used in the clustering. |
lon |
Vector of longitudes. |
lat |
Vector of latitudes. |
method |
Different options to estimate the clusters. The most traditional approach is the k-means analysis (default = 'kmeans'), but the function also supports the methods included in hclust. These methods are: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). For more details about these methods see the hclust function documentation included in the stats package. |
iter.max |
Parameter to select the maximum number of iterations allowed (Only if method = 'kmeans' is selected). |
nstart |
Parameter for the cluster analysis determining how many random sets to choose (Only if method='kmeans' is selected). |
ncores |
The number of multicore threads to use for parallel computation. |
A list with elements $composite (array with at least 3 dimensions ('lat', 'lon', 'cluster') containing the composites k = 1, ..., K for case (*1) or only k = 1 for any specific cluster, i.e., case (*2)), $pvalue (array with at least 3 dimensions ('lat', 'lon', 'cluster') with the p-value of the composites obtained through a t-test that accounts for the serial dependence of the data, with the same structure as $composite), $cluster (a matrix or vector with integers (from 1:k) indicating the cluster to which each time step is allocated), $persistence (percentage of days in a month/season before a cluster is replaced by a new one; only if method = 'kmeans') and $frequency (percentage of days in a month/season belonging to each cluster; only if method = 'kmeans').
Verónica Torralba - BSC, [email protected]
Cortesi, N., V. Torralba, N. González-Reviriego, A. Soret, and F. J. Doblas-Reyes (2019). Characterization of European wind speed variability using weather regimes. Climate Dynamics, 53, 4961–4976, doi:10.1007/s00382-019-04839-5.
Torralba, V. (2019) Seasonal climate prediction for the wind energy sector: methods and tools for the development of a climate service. Thesis. Available online: https://eprints.ucm.es/56841/
data <- array(abs(rnorm(1280, 283.7, 6)),
              dim = c(dataset = 2, member = 2, sdate = 3, ftime = 3,
                      lat = 4, lon = 4))
lat <- seq(47, 44)
res <- WeatherRegime(data = data, lat = lat, EOFs = FALSE, ncenters = 4)
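A sketch of the same analysis with hierarchical clustering instead of k-means; note that the $persistence and $frequency elements are returned only for method = 'kmeans':

# Hierarchical clustering with Ward's method instead of k-means
res_h <- WeatherRegime(data = data, lat = lat, EOFs = FALSE,
                       ncenters = 4, method = "ward.D2")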