| Title: | Spatio-Temporal Crop Yield Prediction |
|---|---|
| Description: | Provides crop yield and meteorological data for Ontario, Canada. Includes functions for fitting and predicting data using spatio-temporal models, as well as tools for visualizing the results. The package builds upon existing R packages, including 'copula' (Hofert et al., 2025) <doi:10.32614/CRAN.package.copula>, and 'bsts' (Scott, 2024) <doi:10.32614/CRAN.package.bsts>. |
| Authors: | Marie Michaelides [aut], Mélina Mailhot [aut], Yongkun Li [aut, cre] |
| Maintainer: | Yongkun Li <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-05-11 09:29:43 UTC |
| Source: | https://github.com/cran/STCYP |
Computes the Clayton copula dependence parameter based on Kendall's tau.
clayton.theta(tau)clayton.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Clayton copula parameter.
clayton.theta(mean(cor(cbind(u[[1]], u[[2]], u[[3]]), method = "kendall")))clayton.theta(mean(cor(cbind(u[[1]], u[[2]], u[[3]]), method = "kendall")))
A list containing supported copula types.
copula_listcopula_list
A list of copula types.
"Gaussian" "Clayton" "Frank" "Gumbel" "Joe"
Real crop yield and meteorological data of 24 regions for Ontario, Canada from 1950 to 2022 and anticipated data from 2023 to 2100.
datadata
A data frame with 1752 rows and 27 variables:
chr: year from 1950-2022
chr: 24 subregions
num: latitude
num: longitude
num: wheat crop yield per census division, in bushel/acre
num: Annual maximum number of consecutive days with daily precipitation below 1mm (unit = days)
num: Annual cooling degree days above 18C (unit = degree_days)
num: Annual number of days with a diurnal freeze-thaw cycle : tmax > 0 degc and tmin <= -1 degc
num: First day of year with temperature below 0 degc for at least 1 days
num: Annual number of days with minimum daily temperature below 0C
num: Annual number of days with maximum daily temperature below 0 degC
num: The annual number of dry periods of 6 days and more, during which the maximal precipitation on a window of 6 days is under 1.0 mm
num: Annual total precipitation (unit = mm)
num: Annual number of days with daily precipitation over 1.0 mm/day
num: Annual number of days with daily precipitation over 10.0 mm/day
num: Annual number of days with daily precipitation over 20.0 mm/day
num: Annual maximum 1-day total precipitation (unit = mm)
num: Annual maximum 5-day total precipitation (unit = mm)
num: Annual mean of daily mean temperatures (unit = C degrees)
num: Annual mean of daily minimum temperatures (unit = C degrees)
num: Annual minimum of daily minimum temperatures (unit = C degrees)
num: Annual number of tropical nights : defined as days with minimum daily temperature above 18 degc
num: Annual minimum of daily maximum temperature (unit = C degrees)
num: Annual mean of daily maximum temperature (unit = C degrees)
num: Annual number of days where daily maximum temperature exceeds 25 degC
num: Annual number of days where daily maximum temperature exceeds 27 degC
num: Annual number of days where daily maximum temperature exceeds 29 degC
ClimateData.ca
Selected data from year 1950 to 2022 and covariates including txgt27, tr18, cddcold, txgt29, and tnmean for case study.
dtdt
A data frame with 1752 rows and 10 variables:
chr: year from 1950-2022
chr: 24 subregions
num: latitude
num: longitude
num: wheat crop yield per census division, in bushel/acre
num: Annual cooling degree days above 18C (unit = degree_days)
num: Annual mean of daily minimum temperatures (unit = C degrees)
num: Annual number of tropical nights : defined as days with minimum daily temperature above 18 degc
num: Annual number of days where daily maximum temperature exceeds 27 degC
num: Annual number of days where daily maximum temperature exceeds 29 degC
ClimateData.ca
Computes the time-varying correlation parameter (rho) for a Gaussian copula.
dynamic.rho(params, lagged_rho, X_t)dynamic.rho(params, lagged_rho, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_rho |
Numeric, the previous rho value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Gaussian copula correlation.
Computes the Clayton copula parameter dynamically based on lagged values and covariates.
dynamic.theta.clayton(params, lagged_theta, X_t)dynamic.theta.clayton(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Clayton copula parameter.
Computes the Frank copula parameter dynamically based on lagged values and covariates.
dynamic.theta.frank(params, lagged_theta, X_t)dynamic.theta.frank(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Frank copula parameter.
Computes the Gumbel copula parameter dynamically based on lagged values and covariates.
dynamic.theta.gumbel(params, lagged_theta, X_t)dynamic.theta.gumbel(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Gumbel copula parameter.
Computes the Joe copula parameter dynamically based on lagged values and covariates.
dynamic.theta.joe(params, lagged_theta, X_t)dynamic.theta.joe(params, lagged_theta, X_t)
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Numeric, estimated dynamic Joe copula parameter.
Fits a BSTS model for a time series y, given a vector or matrix of covariates z.
fit_bsts(y, z, lags = 0, MCMC.iter = 5000)fit_bsts(y, z, lags = 0, MCMC.iter = 5000)
y |
A numeric vector (time series response variable). |
z |
A numeric vector or matrix (covariates). |
lags |
Integer, number of lags for the autoregressive component. |
MCMC.iter |
Integer, number of MCMC iterations. |
A fitted BSTS model.
Computes the Frank copula dependence parameter based on Kendall's tau.
frank.theta(tau)frank.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Frank copula parameter.
Computes the Gumbel-Hougaard copula dependence parameter based on Kendall's tau.
GH.theta(tau)GH.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Gumbel copula parameter.
GH.theta(mean(cor(cbind(u[[1]], u[[2]], u[[3]]), method = "kendall")))GH.theta(mean(cor(cbind(u[[1]], u[[2]], u[[3]]), method = "kendall")))
Initial Parameters for 3D Pseudo-Loglikelihood Estimation
init_params_fullinit_params_full
A numeric vector of length where:
Baseline autoregressive coefficient.
Parameter controlling variance.
Coefficients related to external factors.
Computes the Joe copula dependence parameter based on Kendall's tau.
joe.theta(tau)joe.theta(tau)
tau |
Numeric, Kendall's tau correlation coefficient. |
Numeric, estimated Joe copula parameter.
joe.theta(mean(cor(cbind(u[[1]], u[[2]], u[[3]]), method = "kendall")))joe.theta(mean(cor(cbind(u[[1]], u[[2]], u[[3]]), method = "kendall")))
Computes the negative log-likelihood of a 3-dimensional copula model with a time-varying copula structure.
log_likelihood_noGEV_3d(params, u1, u2, u3, X_t, z1, z2, z3, copula)log_likelihood_noGEV_3d(params, u1, u2, u3, X_t, z1, z2, z3, copula)
params |
Numeric vector, model parameters. |
u1 |
Numeric vector (length |
u2 |
Numeric vector (length |
u3 |
Numeric vector (length |
X_t |
Numeric matrix ( |
z1 |
Numeric matrix ( |
z2 |
Numeric matrix ( |
z3 |
Numeric matrix ( |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
The negative log-likelihood value for optimization.
test_ll_3d <- log_likelihood_noGEV_3d(init_params_full, u[[1]], u[[2]], u[[3]], (z_train[[1]] + z_train[[2]] + z_train[[3]])/3, z_train[[1]], z_train[[2]], z_train[[3]], "Gaussian")test_ll_3d <- log_likelihood_noGEV_3d(init_params_full, u[[1]], u[[2]], u[[3]], (z_train[[1]] + z_train[[2]] + z_train[[3]])/3, z_train[[1]], z_train[[2]], z_train[[3]], "Gaussian")
list containing Chatham-Kent, Lambton, and Wellington
medoid_namesmedoid_names
An object of class character of length 3.
Creates a plot of observed data, forecasted values, and confidence intervals.
plot_forecast( forecast, data_train, data_test, time, quant_high, quant_low, observed_col, forecast_col, title )plot_forecast( forecast, data_train, data_test, time, quant_high, quant_low, observed_col, forecast_col, title )
forecast |
A matrix of BSTS forecast samples. |
data_train |
Numeric vector, training data. |
data_test |
Numeric vector, test data. |
time |
Numeric vector, representing time indices. |
quant_high |
Numeric, upper quantile for confidence interval. |
quant_low |
Numeric, lower quantile for confidence interval. |
observed_col |
Character, color for observed data. |
forecast_col |
Character, color for forecasted data. |
title |
Character, title of the plot. |
A ggplot2 object.
Generates a time series plot comparing the forecasts from two models along with observed data.
plot_forecast_compare( forecast1, forecast2, data_train, data_test, time, quant_high, quant_low, col1, title )plot_forecast_compare( forecast1, forecast2, data_train, data_test, time, quant_high, quant_low, col1, title )
forecast1 |
Numeric matrix, forecasted values from the first model (columns: time points). |
forecast2 |
Numeric matrix, forecasted values from the second model (columns: time points). |
data_train |
Numeric vector, training data used for modeling. |
data_test |
Numeric vector, actual test data for evaluation. |
time |
Numeric vector, representing the time points corresponding to the data. |
quant_high |
Numeric, upper quantile (e.g., 0.9) for confidence interval. |
quant_low |
Numeric, lower quantile (e.g., 0.1) for confidence interval. |
col1 |
Character, color for observed data lines. |
title |
Character, title for the plot. |
A ggplot2 object showing the forecast comparison.
Function to optimize the full pseudo-loglikelihood and perform new forecasts
simul_fun_noGEV_3d( nsim = 100, n_train, n_test, copula, init_params, fn, u1, u2, u3, z1_train, z2_train, z3_train, z1_test, z2_test, z3_test, X_t, y1_test, y2_test, y3_test, BSTS_1, BSTS_2, BSTS_3 )simul_fun_noGEV_3d( nsim = 100, n_train, n_test, copula, init_params, fn, u1, u2, u3, z1_train, z2_train, z3_train, z1_test, z2_test, z3_test, X_t, y1_test, y2_test, y3_test, BSTS_1, BSTS_2, BSTS_3 )
nsim |
Integer, number of simulation replications. |
n_train |
Integer, number of training observations. |
n_test |
Integer, number of test observations. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
init_params |
Numeric vector, initial parameter values for optimization. |
fn |
Function, log-likelihood function for parameter estimation. |
u1 |
Numeric vector (n_train), first pseudo-observation for the copula. |
u2 |
Numeric vector (n_train), second pseudo-observation for the copula. |
u3 |
Numeric vector (n_train), third pseudo-observation for the copula. |
z1_train |
Numeric matrix (n_train x M), observed data for the first margin and sub-feature. |
z2_train |
Numeric matrix (n_train x M), observed data for the second margin and sub-feature. |
z3_train |
Numeric matrix (n_train x M), observed data for the third margin and sub-feature. |
z1_test |
Numeric matrix (n_test x M), true future data for the first margin and sub-feature. |
z2_test |
Numeric matrix (n_test x M), true future data for the second margin and sub-feature. |
z3_test |
Numeric matrix (n_test x M), true future data for the third margin and sub-feature. |
X_t |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
y1_test |
Numeric vector (n_test), true future values for the first response variable. |
y2_test |
Numeric vector (n_test), true future values for the second response variable. |
y3_test |
Numeric vector (n_test), true future values for the third response variable. |
BSTS_1 |
Fitted BSTS model for the first response variable. |
BSTS_2 |
Fitted BSTS model for the second response variable. |
BSTS_3 |
Fitted BSTS model for the third response variable. |
A list containing:
theta_simulated |
Simulated copula parameters across replications. |
y1_simulated |
Simulated values for the first response variable. |
y2_simulated |
Simulated values for the second response variable. |
y3_simulated |
Simulated values for the third response variable. |
MSE |
Mean squared error for each simulation run. |
optim_results |
Results from the optimization process. |
2004-2022
time_testtime_test
An object of class character of length 19.
1950-2003
time_traintime_train
An object of class character of length 54.
Pseudo-Observations of BSTS Residuals for Crop Yield Forecasting
uu
A matrix with dimensions :
Number of time points used in the training set.
Number of regions analyzed (Chatham-Kent, Lambton,Wellington).
Derived from residuals of BSTS models fitted to crop yield data.
Crop Yield Data for Testing in BSTS Models
y_testy_test
A matrix with dimensions :
Number of time points used in the test set.
Number of regions analyzed (Chatham-Kent, Lambton,Wellington).
Historical crop yield records from ClimateData.ca.
Training crop-yield data used for BSTS models.
y_trainy_train
A numeric matrix with n_train rows and D columns:
n_train)Number of time points in the training set.
D)Regions analyzed (Chatham-Kent, Lambton, Wellington).
ClimateData.ca (processed)
Standardized climate covariates used to forecast with the BSTS models (test).
z_testz_test
A numeric array with dimensions n_test × D × M:
n_testNumber of test time points.
DRegions (Chatham-Kent, Lambton, Wellington).
MNumber of covariates (cddcold, tr18, txgt27, tnmean, txgt29).
ClimateData.ca (processed)
Standardized climate covariates used to fit the BSTS models (training).
z_trainz_train
A numeric array with dimensions n_train × D × M:
n_trainNumber of training time points.
DRegions (Chatham-Kent, Lambton, Wellington).
MNumber of covariates (cddcold, tr18, txgt27, tnmean, txgt29).
ClimateData.ca (processed)