| Title: | Conditional Copula Model for Crop Yield Forecasting |
|---|---|
| Description: | Provides functions to model and forecast crop yields using a spatial temporal conditional copula approach. The package incorporates extreme weather covariates and Bayesian Structural Time Series models to analyze crop yield dependencies across multiple regions. Includes tools for fitting, simulating, and visualizing results. This method build upon established R packages, including 'Hofert' 'et' 'al'. (2025) <doi:10.32614/CRAN.package.copula>, 'Scott' (2024) <doi:10.32614/CRAN.package.bsts>, and 'Stephenson' 'et' 'al'. (2024) <doi:10.32614/CRAN.package.evd>. |
| Authors: | Marie Michaelides [aut], Mélina Mailhot [aut], Yongkun Li [aut, cre] |
| Maintainer: | Yongkun Li <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-05-20 06:55:36 UTC |
| Source: | https://github.com/cran/STCCGEV |
Computes the Clayton copula dependence parameter based on Kendall's tau.
clayton.theta(tau)clayton.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Clayton copula parameter.
A list containing supported copula types.
copula_listcopula_list
A list of copula types.
"Gaussian" "Clayton" "Frank" "Gumbel" "Joe"
Contains crop yields and climate indices data of 24 CD regions in Ontario from 1950 to 2022
cropyields_covariatescropyields_covariates
A data frame with 1752 rows and 38 variables:
chr: year from 1950-2022
num: 1-4
chr: Southern, Western, Central, Eastern Ontario
num
chr: 24 subregions
chr
num: latitude
num: longitude
num: wheat crop yield per census division, in bushel/acre
num: Annual maximum number of consecutive days with daily precipitation below 1mm (unit = days)
num: Annual cooling degree days above 18C (unit = degree_days)
num: Annual number of days with a diurnal freeze-thaw cycle : tmax > 0 degc and tmin <= -1 degc
num: First day of year with temperature below 0 degc for at least 1 days
num: Annual number of days with minimum daily temperature below 0C
num: Annual number of days with maximum daily temperature below 0 degC
num: The annual number of dry periods of 6 days and more, during which the maximal precipitation on a window of 6 days is under 1.0 mm
num: Annual total precipitation (unit = mm)
num: Annual number of days with daily precipitation over 1.0 mm/day
num: Annual number of days with daily precipitation over 10.0 mm/day
num: Annual number of days with daily precipitation over 20.0 mm/day
num: Annual maximum 1-day total precipitation (unit = mm)
num: Annual maximum 5-day total precipitation (unit = mm)
num: Annual mean of daily mean temperatures (unit = C degrees)
num: Annual mean of daily minimum temperatures (unit = C degrees)
num: Annual minimum of daily minimum temperatures (unit = C degrees)
num: Annual number of days where daily minimum temperature is below -15 degC
num: Annual number of days where daily minimum temperature is below -25 degC
num: Annual number of tropical nights : defined as days with minimum daily temperature above 18 degc
num: Annual number of tropical nights : defined as days with minimum daily temperature above 20 degc
num: Annual number of tropical nights : defined as days with minimum daily temperature above 22 degc
num: Annual minimum of daily maximum temperature (unit = C degrees)
num: Annual mean of daily maximum temperature (unit = C degrees)
num: Annual number of days where daily maximum temperature exceeds 25 degC
num: Annual number of days where daily maximum temperature exceeds 27 degC
num: Annual number of days where daily maximum temperature exceeds 29 degC
num: Annual number of days where daily maximum temperature exceeds 30 degC
num: Annual number of days where daily maximum temperature exceeds 32 degC
ClimateData.ca
Computes the time-varying correlation parameter (rho) for a Gaussian copula.
dynamic.rho(params, lagged_rho, X_t)dynamic.rho(params, lagged_rho, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_rho |
Numeric, the previous rho value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Gaussian copula correlation.
Computes the Clayton copula parameter dynamically based on lagged values and covariates.
dynamic.theta.clayton(params, lagged_theta, X_t)dynamic.theta.clayton(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Clayton copula parameter.
Computes the Frank copula parameter dynamically based on lagged values and covariates.
dynamic.theta.frank(params, lagged_theta, X_t)dynamic.theta.frank(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Frank copula parameter.
Computes the Gumbel copula parameter dynamically based on lagged values and covariates.
dynamic.theta.gumbel(params, lagged_theta, X_t)dynamic.theta.gumbel(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Gumbel copula parameter.
Computes the Joe copula parameter dynamically based on lagged values and covariates.
dynamic.theta.joe(params, lagged_theta, X_t)dynamic.theta.joe(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Joe copula parameter.
Fits a BSTS model for a time series y, given a vector or matrix of covariates z.
fit_bsts(y, z, lags = 0, MCMC.iter = 5000)fit_bsts(y, z, lags = 0, MCMC.iter = 5000)
y |
A numeric vector (time series response variable). |
z |
A numeric vector or matrix (covariates). |
lags |
Integer, number of lags for the autoregressive component. |
MCMC.iter |
Integer, number of MCMC iterations. |
A fitted BSTS model.
Computes the Frank copula dependence parameter based on Kendall's tau.
frank.theta(tau)frank.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Frank copula parameter.
Computes the Gumbel-Hougaard copula dependence parameter based on Kendall's tau.
GH.theta(tau)GH.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Gumbel copula parameter.
Initial Parameters for 2D Pseudo-Loglikelihood Estimation
init_params_fullinit_params_full
A numeric vector of length where:
Baseline autoregressive coefficient.
Parameter controlling variance.
Coefficients related to external factors.
AR(1) coefficient for GEV.
Std dev of innovations for AR(1) process for GEV.
GEV scale parameter for GEV.
GEV shape parameter for GEV.
Initial Parameters for 2D Pseudo-Loglikelihood-Generalized Estimation
init_params_full_Ginit_params_full_G
A numeric vector of length , structured as follows:
Baseline autoregressive coefficient.
Parameter controlling variance.
Coefficients related to external factors.
For each climate variable in each region, the following parameters are included:
mean(z), sd(z), sd(z), xi_gev for each region and variable.
Initial Parameters for 2D Pseudo-Loglikelihood Estimation without GEV models for covariates
init_params_noGEVinit_params_noGEV
A numeric vector of length where:
Baseline autoregressive coefficient.
Parameter controlling variance.
Coefficients related to external factors.
Computes the Joe copula dependence parameter based on Kendall's tau.
joe.theta(tau)joe.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Joe copula parameter.
Computes the log-likelihood for a time-varying copula model combined with Generalized Extreme Value (GEV) margins.
log_likelihood_Generalized(params, U, Z, X, copula)log_likelihood_Generalized(params, U, Z, X, copula)
params |
Numeric vector of model parameters, including copula parameters (omega, alpha, gamma) and GEV distribution parameters. |
U |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
X |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
Numeric, negative log-likelihood value.
test_ll <- log_likelihood_Generalized(init_params_full_G,uu, zz_train,xx_train,"Gaussian")test_ll <- log_likelihood_Generalized(init_params_full_G,uu, zz_train,xx_train,"Gaussian")
Computes the negative log-likelihood of a 2-dimensional copula-GEV model, incorporating dynamic Generalized Extreme Value (GEV) parameters and a time-varying copula structure.
log_likelihood_generalized_2d(params, u1, u2, X_t, z1, z2, copula)log_likelihood_generalized_2d(params, u1, u2, X_t, z1, z2, copula)
params |
Numeric vector, model parameters including copula and GEV parameters. |
u1 |
Numeric vector (length |
u2 |
Numeric vector (length |
X_t |
Numeric matrix ( |
z1 |
Numeric matrix ( |
z2 |
Numeric matrix ( |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
The negative log-likelihood value for optimization.
test_ll_2d <-log_likelihood_generalized_2d(init_params_full, uu[,1], uu[,2], xx_train, zz_train[,1,], zz_train[,2,], "Gaussian")test_ll_2d <-log_likelihood_generalized_2d(init_params_full, uu[,1], uu[,2], xx_train, zz_train[,1,], zz_train[,2,], "Gaussian")
Computes the log-likelihood for a time-varying copula model.
log_likelihood_noGEV(params, U, Z, X, copula)log_likelihood_noGEV(params, U, Z, X, copula)
params |
Numeric vector of model parameters, including copula parameters (omega, alpha, gamma). |
U |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
X |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
Numeric, negative log-likelihood value.
test_ll_noGEV <- log_likelihood_noGEV(init_params_noGEV,uu, zz_train,x_train,"Gaussian")test_ll_noGEV <- log_likelihood_noGEV(init_params_noGEV,uu, zz_train,x_train,"Gaussian")
list containing Dufferin and Wellington
medoid_namesmedoid_names
An object of class list of length 2.
Creates a plot of observed data, forecasted values, and confidence intervals.
plot_forecast( forecast, data_train, data_test, time, quant_high, quant_low, observed_col, forecast_col, title )plot_forecast( forecast, data_train, data_test, time, quant_high, quant_low, observed_col, forecast_col, title )
forecast |
A matrix of BSTS forecast samples. |
data_train |
Numeric vector, training data. |
data_test |
Numeric vector, test data. |
time |
Numeric vector, representing time indices. |
quant_high |
Numeric, upper quantile for confidence interval. |
quant_low |
Numeric, lower quantile for confidence interval. |
observed_col |
Character, color for observed data. |
forecast_col |
Character, color for forecasted data. |
title |
Character, title of the plot. |
A ggplot2 object.
Generates a time series plot comparing the forecasts from two models along with observed data.
plot_forecast_compare( forecast1, forecast2, data_train, data_test, time, quant_high, quant_low, col1, title )plot_forecast_compare( forecast1, forecast2, data_train, data_test, time, quant_high, quant_low, col1, title )
forecast1 |
Numeric matrix, forecasted values from the first model (columns: time points). |
forecast2 |
Numeric matrix, forecasted values from the second model (columns: time points). |
data_train |
Numeric vector, training data used for modeling. |
data_test |
Numeric vector, actual test data for evaluation. |
time |
Numeric vector, representing the time points corresponding to the data. |
quant_high |
Numeric, upper quantile (e.g., 0.9) for confidence interval. |
quant_low |
Numeric, lower quantile (e.g., 0.1) for confidence interval. |
col1 |
Character, color for observed data lines. |
title |
Character, title for the plot. |
A ggplot2 object showing the forecast comparison.
A Special Case of simulation_generalized in 2 Dimensions
simul_fun_generalized_2d( nsim, n_train, n_test, copula, init_params, fn, u1, u2, z1_train, z2_train, X_t, y1_test, y2_test, BSTS_1, BSTS_2 )simul_fun_generalized_2d( nsim, n_train, n_test, copula, init_params, fn, u1, u2, z1_train, z2_train, X_t, y1_test, y2_test, BSTS_1, BSTS_2 )
nsim |
Integer, number of simulation replications. |
n_train |
Integer, number of training observations. |
n_test |
Integer, number of test observations. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
init_params |
Numeric vector, initial parameter values for optimization. |
fn |
Function, log-likelihood function for parameter estimation. |
u1 |
Numeric vector (n_train), first pseudo-observation for the copula. |
u2 |
Numeric vector (n_train), second pseudo-observation for the copula. |
z1_train |
Numeric matrix (n_train x M), observed data for the first margin. |
z2_train |
Numeric matrix (n_train x M), observed data for the second margin. |
X_t |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
y1_test |
Numeric vector (n_test), true future values for the first response variable. |
y2_test |
Numeric vector (n_test), true future values for the second response variable. |
BSTS_1 |
Fitted BSTS model for the first response variable. |
BSTS_2 |
Fitted BSTS model for the second response variable. |
A list containing:
theta_simulated |
Simulated copula parameters across replications. |
y1_simulated |
Simulated values for the first response variable. |
y2_simulated |
Simulated values for the second response variable. |
MSE |
Mean squared error for each simulation run. |
optim_results |
Results from the optimization process. |
This function simulates multivariate crop yield data using a time-varying copula combined with Bayesian Structural Time Series (BSTS) models without GEV covariates for comparision.
simul.fun.noGEV( nsim = 100, n_train, n_test, copula, init_params, fn, U_train, Z_train, Z_test, X_train, X_test, Y_test, BSTS_list )simul.fun.noGEV( nsim = 100, n_train, n_test, copula, init_params, fn, U_train, Z_train, Z_test, X_train, X_test, Y_test, BSTS_list )
nsim |
Integer, number of simulation replications. |
n_train |
Integer, number of training observations. |
n_test |
Integer, number of test observations. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
init_params |
Numeric vector, initial parameter values for optimization. |
fn |
Function, log-likelihood function for parameter estimation. |
U_train |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z_train |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
Z_test |
Numeric array (n_test x D x M), observed data for each margin and sub-feature. |
X_train |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
X_test |
Numeric matrix (n_test x M), risk factors for the dynamic copula parameter. |
Y_test |
Numeric matrix (n_test x D), true future values for MSE calculation. |
BSTS_list |
List of length D, each element is a BSTS model for a different margin. |
A list containing:
optim_results |
Results from the optimization process. |
theta_sim |
Simulated copula parameters across replications. |
Y_sim |
Simulated final BSTS-based forecasts. |
MSE |
Mean squared error for each simulation run. |
This function simulates multivariate crop yield data using a time-varying copula combined with Generalized Extreme Value (GEV) margins and Bayesian Structural Time Series (BSTS) models.
simulation_generalized( nsim = 100, n_train, n_test, copula, init_params, fn, U_train, Z_train, X, Y_test, BSTS_list )simulation_generalized( nsim = 100, n_train, n_test, copula, init_params, fn, U_train, Z_train, X, Y_test, BSTS_list )
nsim |
Integer, number of simulation replications. |
n_train |
Integer, number of training observations. |
n_test |
Integer, number of test observations. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
init_params |
Numeric vector, initial parameter values for optimization. |
fn |
Function, log-likelihood function for parameter estimation. |
U_train |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z_train |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
X |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
Y_test |
Numeric matrix (n_test x D), true future values for MSE calculation. |
BSTS_list |
List of length D, each element is a BSTS model for a different margin. |
A list containing:
optim_results |
Results from the optimization process. |
theta_sim |
Simulated copula parameters across replications. |
Y_sim |
Simulated final BSTS-based forecasts. |
MSE |
Mean squared error for each simulation run. |
1950-2022
time_alltime_all
An object of class character of length 73.
2004-2022
time_testtime_test
An object of class character of length 19.
1950-2003
time_traintime_train
An object of class character of length 54.
Pseudo-Observations of BSTS Residuals for Crop Yield Forecasting
uuuu
A matrix with dimensions :
Number of time points used in the training set.
Number of regions analyzed (Dufferin, Wellington).
Derived from residuals of BSTS models fitted to crop yield data.
Maximized Covariates Matrix for Crop Yield Forecasting
xx_allxx_all
A three-dimensional array with dimensions :
Number of time points used in the training set.
Number of selected climate covariates used for modeling (cdd,frost_days,rx1day, tg_mean, txgt_25).
Derived from historical climate data from ClimateData.ca.
Maximized Covariates Matrix for Crop Yield Forecasting
xx_testxx_test
A three-dimensional array with dimensions :
Number of time points used in the testing set.
Number of selected climate covariates used for modeling (cdd,frost_days,rx1day, tg_mean, txgt_25).
Derived from historical climate data from ClimateData.ca.
Maximized Covariates Matrix for Crop Yield Forecasting
xx_trainxx_train
A three-dimensional array with dimensions :
Number of time points used in the training set.
Number of selected climate covariates used for modeling (cdd,frost_days,rx1day, tg_mean, txgt_25).
Derived from historical climate data from ClimateData.ca.
Crop Yield Data
yy_allyy_all
A matrix with dimensions :
Number of time points used in the test set.
Number of regions analyzed (Dufferin, Wellington).
Historical crop yield records from ClimateData.ca.
Crop Yield Data for Testing in BSTS Models
yy_testyy_test
A matrix with dimensions :
Number of time points used in the test set.
Number of regions analyzed (Dufferin, Wellington).
Historical crop yield records from ClimateData.ca.
Crop Yield Data for Training in BSTS Models
yy_trainyy_train
A matrix with dimensions :
Number of time points used in the train set.
Number of regions analyzed (Dufferin, Wellington).
Historical crop yield records from ClimateData.ca.
Standardized Covariates Array for Crop Yield Forecasting
zz_allzz_all
A three-dimensional array with dimensions :
Number of time points used in the training set.
Number of regions analyzed (Dufferin, Wellington).
Number of selected climate covariates used for modeling (cdd,frost_days,rx1day, tg_mean, txgt_25).
Derived from historical climate data.
Standardized Covariates Array for Crop Yield Forecasting
zz_testzz_test
A three-dimensional array with dimensions :
Number of time points used in the testing set.
Number of regions analyzed (Dufferin, Wellington).
Number of selected climate covariates used for modeling (cdd,frost_days,rx1day, tg_mean, txgt_25).
Derived from historical climate data.
Standardized Covariates Array for Crop Yield Forecasting
zz_trainzz_train
A three-dimensional array with dimensions :
Number of time points used in the training set.
Number of regions analyzed (Dufferin, Wellington).
Number of selected climate covariates used for modeling (cdd,frost_days,rx1day, tg_mean, txgt_25).
Derived from historical climate data from ClimateData.ca.