Temperature and salinity data from Argo floats in the North Atlantic Ocean at three layers of depth: surface (less than 50 meters), midlayer (between 475525 meters) and deep (975 to 1025 meters) during 2003.
argo_floats_atlantic_2003
An object of class data.frame
with 6978 rows and 11 columns.
Sahu and Challenor (2008) @format A data frame with 6978 rows and 11 columns:
Longitude of the argo float
Latitude of the argo float
Cumulative day of the year in 2003
Day within each month in 2003
Month in 2003
Temperature recorded by the Argo float in degree Celsius
Salinity in practical salinity units
Centered and scaled values of longitude at each depth
Centered and scaled values of latitude at each depth
Centered and scaled values of longitude times latitude at each depth
Sahu SK, Challenor P (2008). “A spacetime model for joint modeling of ocean temperature and salinity levels as measured by Argo floats.” Environmetrics, 19, 509528.
head(argo_floats_atlantic_2003)
# Data for the surface layer
surface < argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==1, ]
# Data for the midlayer
midlayer < argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==2, ]
# Data at the deep ocean
deep < argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==3, ]
Bayesian regression model fitting for areal and areal spatiotemporal data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.
Bcartime(
formula,
data,
family,
link = NULL,
trials = NULL,
offsetcol = NULL,
formula.omega = NULL,
scol = NULL,
tcol = NULL,
package = "CARBayes",
model = "glm",
AR = 1,
W = NULL,
adj.graph = NULL,
residtype = "response",
interaction = TRUE,
Z = NULL,
W.binary = NULL,
changepoint = NULL,
knots = NULL,
validrows = NULL,
prior.mean.delta = NULL,
prior.mean.beta = NULL,
prior.var.beta = NULL,
prior.mean.gamma = NULL,
prior.var.gamma = NULL,
prior.sigma2 = NULL,
prior.tau2 = c(2, 1),
prior.delta = NULL,
prior.var.delta = NULL,
prior.lambda = NULL,
prior.nu2 = c(2, 1),
epsilon = 0,
G = NULL,
ind.area = NULL,
trends = NULL,
rho.T = NULL,
rho.S = NULL,
rho = NULL,
rho.slo = NULL,
rho.int = NULL,
MALA = FALSE,
N = 2000,
burn.in = 1000,
thin = 10,
rseed = 44,
Nchains = 4,
verbose = TRUE,
plotit = TRUE
)
formula 
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the regression model to be fitted. 
data 
The data frame for which the model formula is to be fitted. The data frame should be in long format having one row for each location and time combination. The data frame must be ordered by time within each site, and should optionally have a column, named s.index, providing the site indices. Thus the data, with n sites and T times within each site, should be organized in the order: (s1, t1), (s1, t2), ... (s1, T), ... (sn, t1), ... (sn, T). The data frame should also contain two columns giving the coordinates of the locations for spatio temporal model fitting. 
family 
One of either "gaussian", "binomial","poisson" or "zip", which respectively specify a Gaussian, binomial likelihood model with the logistic link function, a Poisson likelihood model with a log link function, or a zeroinflated Poisson model with a log link function. 
link 
The link function to use for INLA based model fitting. This is ignored for the CARBayes and CARBayesST models. 
trials 
Only used if family="binomial". Either the name (character) or number of the column in the supplied data frame containing the total number of trials The program will try to access data[, trials] to get the binomial denominators. 
offsetcol 
Only used in INLA based modeling. The column name or number in the data frame that should be used as the offset. 
formula.omega 
A onesided formula object with no response variable (left side of the "~") needed, specifying the covariates in the logistic regression model for modelling the probability of an observation being a structural zero. Each covariate (or an offset) needs to be a vector of length K*1. Only required for zeroinflated Poisson models. 
scol 
Either the name (character) or number of the column in the supplied data frame identifying the spatial units. The program will try to access data[, scol] to identify the spatial units. If this is omitted, no spatial modeling will be performed. 
tcol 
Like the 
package 
Which package is to be used in model fitting? Currently available packages are:

model 
The specific spatio temporal model to be fitted.
If the package is "INLA" then the model argument should be a vector with two elements
giving the spatial model as the first component. The alternatives for the
spatial model are: "bym", "bym2", "besag", "besag2", "besagproper", "besagproper2", "iid"
and "none". The temporal model as the second second component can be one of
"iid", "rw1", "rw2", ar1" or "none".
In case the model component is "none" then no spatial/temporal random effects
will be fitted. No temporal random effects will be fitted in case 
AR 
The order of the autoregressive time series process that must be either 1 or 2. 
W 
A nonnegative K by K neighborhood matrix (where K is the number of spatial units).
Typically a binary specification is used, where the jkth element equals one if areas (j, k)
are spatially close (e.g. share a common border) and is zero otherwise.
The matrix can be nonbinary, but each row must contain at least one nonzero entry.
This argument may not need to be specified if 
adj.graph 
Adjacency graph which may be specified instead of the adjacency matrix
matrix. This argument is used if 
residtype 
Residual type, either "response" or "pearson", in GLM fitting with the packages CARBayes and CARBayesST. Default is "response" type observed minus fitted. The other option "pearson" is for Pearson residuals in GLM. For INLA based model fitting only the default response residuals are calculated. 
interaction 
TRUE or FALSE indicating whether the spatiotemporal interaction random effects should be included. Defaults to TRUE unless family="gaussian" in which case interactions are not allowed. 
Z 
A list, where each element is a K by K matrix of nonnegative dissimilarity metrics. 
W.binary 
Logical, should the estimated neighbourhood matrix have only binary (0,1) values. 
changepoint 
A scalar indicating the position of the change point should one of the change point trend functions be included in the trends vector, i.e. if "CP" or "CT" is specified. 
knots 
A scalar indicating the number of knots to use should one of the monotonic cubic splines trend functions be included in the trends vector, i.e. if "MD" or "MI" is specified. 
validrows 
A vector providing the rows of the data frame which should be used for validation. The default NULL value instructs that validation will not be performed. 
prior.mean.delta 
A vector of prior means for the regression parameters delta (Gaussian priors are assumed) for the zero probability logistic regression component of the model. Defaults to a vector of zeros. 
prior.mean.beta 
A vector of prior means for the regression parameters beta (Gaussian priors are assumed). Defaults to a vector of zeros. 
prior.var.beta 
A vector of prior variances for the regression parameters beta (Gaussian priors are assumed). Defaults to a vector with values 100,000. 
prior.mean.gamma 
A vector of prior means for the temporal trend parameters (Gaussian priors are assumed). Defaults to a vector of zeros. 
prior.var.gamma 
A vector of prior variances for the temporal trend parameters (Gaussian priors are assumed). Defaults to a vector with values 100,000. 
prior.sigma2 
The prior shape and scale in the form of c(shape, scale) for an InverseGamma(shape, scale) prior for sigma2. Defaults to c(1, 0.01). 
prior.tau2 
The prior shape and scale in the form of c(shape, scale) for an InverseGamma(shape, scale) prior for tau2. Defaults to c(1, 0.01). 
prior.delta 
The prior maximum for the cluster smoothing parameter delta. Defaults to 10. 
prior.var.delta 
A vector of prior variances for the regression parameters delta (Gaussian priors are assumed) for the zero probability logistic regression component of the model. Defaults to a vector with values 100000. 
prior.lambda 
A vector of prior samples sizes for the Dirichlet prior controlling the probabilities that each trend function is chosen. The vector should be the same length as the trends vector and defaults to a vector of ones. 
prior.nu2 
The prior shape and scale in the form of c(shape, scale) for an InverseGamma(shape, scale) prior for nu2. Defaults to c(1, 0.01) and only used if family="Gaussian". 
epsilon 
Diagonal ridge parameter to add to the random effects prior precision matrix, only required when rho = 1, and the prior precision is improper. Defaults to 0. Only used for adaptive model fitting in CARBayesST. 
G 
The maximum number of distinct intercept terms (groups) to allow in the localised model. 
ind.area 
A vector of integers the same length as the number of data points (individuals) giving which spatial unit (nunmbered from 1 to K to align with the rows of the W matrix) each individual belongs to. 
trends 
A vector containing the temporal trend functions to include in the model, which include: constant ("Constant""); linear decreasing ("LD"); linear increasing ("LI"); Known change point, where the trend can increase towards the change point before subsequently decreasing ("CP"); or decrease towards the change point before subsequently increasing ("CT"); and monotonic cubic splines which are decreasing ("MD") or increasing ("MI"). At least two trends have to be selected, with the constant trend always included. To avoid identifiability problems only one of "LI" or "MI" can be included at a given time (similarily for "LD" and "MD"). 
rho.T 
The value in the interval [0, 1] that the temporal dependence parameter rho.T is fixed at if it should not be estimated. If this argument is NULL then rho.T is estimated in the model. 
rho.S 
The value in the interval [0, 1] that the spatial dependence parameter rho.S is fixed at if it should not be estimated. If this argument is NULL then rho.S is estimated in the model. 
rho 
The value in the interval [0, 1] that the spatial dependence parameter rho is fixed at if it should not be estimated. If this argument is NULL then rho is estimated in the model. Setting rho=1, reduces the random effects prior to the intrinsic CAR model but does require epsilon>0. 
rho.slo 
The value in the interval [0, 1] that the spatial dependence parameter for the slope of the linear time trend, rho.slo, is fixed at if it should not be estimated. If this argument is NULL then rho.slo is estimated in the model. 
rho.int 
The value in the interval [0, 1] that the spatial dependence parameter for the intercept of the linear time trend, rho.int, is fixed at if it should not be estimated. If this argument is NULL then rho.int is estimated in the model. 
MALA 
Logical, should the function use Metropolis adjusted Langevin algorithm (MALA) updates (TRUE) or simple random walk (FALSE, default) updates for the regression parameters and random effects. 
N 
MCMC sample size. 
burn.in 
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. 
thin 
The level of thinning to apply to the MCMC samples to reduce their temporal autocorrelation. Defaults to 1 (no thinning). 
rseed 
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. 
Nchains 
The number of parallel Markov chains to be used in the Metropolis coupled Markov chain Monte Carlo (MCMCMC) simulations. Defaults to 4. 
verbose 
Logical scalar value: whether to print various estimates and statistics. 
plotit 
Logical scalar value: whether to plot the predictions against the observed values. 
A list containing:
params  A table of parameter estimates
fit  The fitted model object.
fitteds  A vector of fitted values.
mchoice  Calculated model choice statistics if those have been
requested by the input argument mchoice=T
. Not all model fits will contain
all the model choice statistics.
residuals  A vector of residual values.
stats  The four validation statistics: rmse, mae, crps and coverage. This is present only if model validation has been performed.
yobs_preds  A data frame containing the validation rows of the model fitting data frame. The last five columns of this data frame contains the validation prediction summaries: mean, sd, median, and 95% prediction interval. This is present only if model validation has been performed.
valpreds  A matrix containing the MCMC samples of the validation predictions. The dimension of this matrix is the number of validations times the number of retained MCMC samples. This is present only if model validation has been performed.
validationplots  Present only if validation has been performed.
Contains three validation plots with or without segment and
an ordinary plot. See obs_v_pred_plot
for more.
sn  The number of areal units used in fitting.
tn  The number of time points used in fitting.
formula  The input formula for the regression part of the model.
scale.transform  It is there for compatibility with Bsptime
output.
package  The name of the package used for model fitting.
model  The name of the fitted model.
call  The command used to call the model fitting function.
computation.time  Computation time required to run the model fitting.
GómezRubio V (2020).
Bayesian inference with INLA.
Boca Raton:Chapman and Hall/CRC,.
Lee D (2021).
“CARBayes version 5.2.3: An R Package for Spatial Areal Unit Modelling with Conditional Autoregressive Priors.”
University of Glasgow.
https://cran.rproject.org/package=CARBayes.
Lee D, Rushworth A, Napier G (2018).
“SpatioTemporal Areal Unit Modeling in R with Conditional Autoregressive Priors Using the CARBayesST Package.”
Journal of Statistical Software, 84(9), 10.18637/jss.v084.i09.
Sahu SK (2022).
Bayesian Modeling of Spatio Temporal Data with R, 1st edition.
Chapman and Hall, Boca Raton.
https://doi.org/10.1201/9780429318443.
# Set the validation row numbers
vs < sample(nrow(engtotals), 5)
# Total number of iterations
N < 60
# Number of burn in iterations
burn.in < 10
# Number of thinning iterations
thin < 1
# Set the model formula for binomial distribution based modeling
f1 < noofhighweeks ~ jsa + log10(houseprice) + log(popdensity) + sqrt(no2)
## Independent error logistic regression
M1 < Bcartime(formula = f1, data = engtotals, family = "binomial",
trials = "nweek", N = N, burn.in = burn.in, thin = thin,
verbose = TRUE)
summary(M1)
# Leroux model
M1.leroux < Bcartime(formula = f1, data = engtotals, scol = "spaceid",
model = "leroux", W = Weng, family = "binomial", trials = "nweek",
N = N, burn.in = burn.in, thin = thin)
summary(M1.leroux)
# BYM model
M1.bym < Bcartime(formula = f1, data = engtotals, scol = "spaceid",
model = "bym", W = Weng, family = "binomial", trials = "nweek",
N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M1.bym)
# Validation for the Leroux model
M1.leroux.v < Bcartime(formula = f1, data = engtotals, scol = "spaceid",
model = "leroux", W = Weng, family = "binomial", trials = "nweek",
validrows = vs, N = N, burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M1.leroux.v)
## Poisson Distribution based models ####################################
# Model formula
f2 < covid ~ offset(logEdeaths) + jsa + log10(houseprice) + log(popdensity) +
sqrt(no2)
# Independent error Poisson regression
M2 < Bcartime(formula = f2, data = engtotals, family = "poisson", N = N,
burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M2)
## Poisson regression with Leroux Model
M2.leroux < Bcartime(formula = f2, data = engtotals, scol = "spaceid",
model = "leroux", family = "poisson", W = Weng, N = N, burn.in = burn.in,
thin = thin, verbose = FALSE)
summary(M2.leroux)
# Poisson regression with BYM Model
M2.bym < Bcartime(formula = f2, data = engtotals, scol = "spaceid",
model = "bym", family = "poisson", W = Weng, N = N, burn.in = burn.in,
thin = thin)
summary(M2.bym)
## Gaussian distribution based models ###############
f3 < sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
# Independent error model
M3 < Bcartime(formula = f3, data = engtotals, family = "gaussian", N = N,
burn.in = burn.in, thin = thin, verbose = FALSE)
summary(M3)
# Leroux model
M3.leroux < Bcartime(formula = f3, data = engtotals, scol = "spaceid",
model = "leroux", family = "gaussian", W = Weng, N = N, burn.in = burn.in,
thin = thin, verbose = FALSE)
summary(M3.leroux)
## Validation
M3.leroux.v < Bcartime(formula = f3, data = engtotals, scol = "spaceid",
model = "leroux", family = "gaussian", W = Weng, N = N, burn.in = burn.in,
thin = thin, validrows = vs, verbose = FALSE)
summary(M3.leroux.v)
## Spatiotemporal modeling ##################################################
head(engdeaths)
dim(engdeaths)
colnames(engdeaths)
vs < sample(nrow(engdeaths), 5)
## Binomial distribution
engdeaths$nweek < rep(1, nrow(engdeaths))
f1 < highdeathsmr ~ jsa + log10(houseprice) + log(popdensity)
M1st_linear < Bcartime(formula = f1, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", trials = "nweek", W = Weng, model = "linear",
family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in,
thin = thin, verbose = TRUE)
summary(M1st_linear)
M1st_sepspat < Bcartime(formula = f1, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", trials = "nweek", W = Weng, model = "sepspatial",
family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in,
thin = thin, verbose = FALSE)
summary(M1st_sepspat)
M1st_ar < Bcartime(formula = f1, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", trials = "nweek", W = Weng, model = "ar", AR = 1,
family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in,
thin = thin, verbose = FALSE)
summary(M1st_ar)
# Model validation
M1st_ar.v < Bcartime(formula = f1, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", trials = "nweek", W = Weng, model = "ar", AR = 1,
family = "binomial", package = "CARBayesST", N = N, burn.in = burn.in,
thin = thin, validrows = vs, verbose = FALSE)
summary(M1st_ar.v)
## Spatio temporal Poisson models###################################
colnames(engdeaths)
f2 < covid ~ offset(logEdeaths) + jsa + log10(houseprice) + log(popdensity) +
n0
M2st_linear < Bcartime(formula = f2, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "linear", family = "poisson",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
verbose = FALSE)
summary(M2st_linear)
M2st_anova < Bcartime(formula = f2, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "anova", family = "poisson",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
verbose = FALSE)
summary(M2st_anova)
M2st_anova_nointer < Bcartime(formula = f2, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "anova", interaction = FALSE,
family = "poisson", package = "CARBayesST", N = N, burn.in = burn.in,
thin = thin, verbose = FALSE)
summary(M2st_anova_nointer)
M2st_sepspat < Bcartime(formula = f2, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "sepspatial", family = "poisson",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
verbose = FALSE)
summary(M2st_sepspat)
M2st_ar < Bcartime(formula = f2, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "ar", AR = 1, family = "poisson",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
verbose = FALSE)
summary(M2st_ar)
M2st_ar.v < Bcartime(formula = f2, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "ar", family = "poisson",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
validrows = vs, verbose = FALSE)
M2st_anova.v < Bcartime(formula = f2, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "anova", family = "poisson",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
validrows = vs, verbose = FALSE)
summary(M2st_ar.v)
summary(M2st_anova.v)
## Spatiotemporal Normal models ###############################
colnames(engdeaths)
f3 < sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
M3st_linear < Bcartime(formula = f3, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "linear", family = "gaussian",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
verbose = FALSE)
summary(M3st_linear)
M3st_anova < Bcartime(formula = f3, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "anova", family = "gaussian",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
verbose = FALSE)
summary(M3st_anova)
M3st_anova_nointer < Bcartime(formula = f3, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "anova", interaction = FALSE,
family = "gaussian", package = "CARBayesST", N = N, burn.in = burn.in,
thin = thin, verbose = FALSE)
summary(M3st_anova_nointer)
M3st_ar < Bcartime(formula = f3, data = engdeaths, scol = "spaceid",
tcol = "Weeknumber", W = Weng, model = "ar", AR = 2, family = "gaussian",
package = "CARBayesST", N = N, burn.in = burn.in, thin = thin,
verbose = FALSE)
summary(M3st_ar)
# Execute the following examples if INLA is available
if (require(INLA)) {
N < 55
burn.in < 5
thin < 1
vs < sample(nrow(engtotals), 5)
# Spatial Binomial GLM
f1 < noofhighweeks ~ jsa + log10(houseprice) + log(popdensity) + sqrt(no2)
M1.inla.bym < Bcartime(data = engtotals, formula = f1,
W = Weng, scol = "spaceid", model = c("bym"), package = "inla",
family = "binomial", link="logit", trials = "nweek", N = N, burn.in = burn.in,
thin = thin)
summary(M1.inla.bym)
## Spatial only Poisson
f2inla < covid ~ jsa + log10(houseprice) + log(popdensity) +
sqrt(no2)
M2.inla.bym < Bcartime(data = engtotals, formula = f2inla,
W = Weng, scol = "spaceid", offsetcol = "logEdeaths",
model = c("bym"), package = "inla", link = "log",
family = "poisson", N = N, burn.in = burn.in, thin = thin)
summary(M2.inla.bym)
## Normal models
f3 < sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
## Fit BYM with iid random effect
M3.inla.bym < Bcartime(data = engtotals, formula = f3,
W = Weng, scol = "spaceid", model = c("bym"), package = "inla",
family = "gaussian", N = N, burn.in = burn.in, thin = thin)
summary(M3.inla.bym)
# validation
M3.inla.bym.v < Bcartime(data = engtotals, formula = f3,
W = Weng, scol = "spaceid", model = c("bym"), package = "inla",
family = "gaussian", validrows = vs, N = N, burn.in = burn.in,
thin = thin)
summary(M3.inla.bym.v)
###### Spatiotemporal INLA models
f1 < highdeathsmr ~ jsa + log10(houseprice) + log(popdensity)
nweek < rep(1, nrow(engdeaths))
engdeaths$nweek < rep(1, nrow(engdeaths))
## INLA Binomial
model < c("bym", "ar1")
M1st_inla.bym < Bcartime(data = engdeaths, formula = f1,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
model = model, trials = "nweek", family = "binomial", link="logit",
package = "inla", N = N, burn.in = burn.in, thin = thin)
summary(M1st_inla.bym)
M1st_inla_v < Bcartime(data = engdeaths, formula = f1,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
offsetcol = "logEdeaths", model = model, trials = "nweek",
family = "binomial", link="logit", package = "inla", validrow = vs,
N = N, burn.in = burn.in, thin = thin)
summary(M1st_inla_v)
model < c("bym", "none")
M1st_inla.bym.none < Bcartime(data = engdeaths, formula = f1,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
model = model, trials = "nweek", family = "binomial", link="logit",
package = "inla", N = N, burn.in = burn.in, thin = thin)
summary(M1st_inla.bym.none)
model < c("bym")
M1st_inla.bym.none < Bcartime(data = engdeaths, formula = f1,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
model = model, trials = "nweek", family = "binomial", link="logit",
package = "inla", N = N, burn.in = burn.in, thin = thin)
summary(M1st_inla.bym.none)
## Poisson models
f2inla < covid ~ jsa + log10(houseprice) + log(popdensity) +
n0 + n1 + n2 + n3
model < c("bym", "ar1")
M2stinla < Bcartime(data = engdeaths, formula = f2inla,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
offsetcol = "logEdeaths", model = model, link = "log",
family = "poisson", package = "inla", N = N, burn.in = burn.in,
thin = thin)
summary(M2stinla)
M2stinla.v < Bcartime(data = engdeaths, formula = f2inla,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
offsetcol = "logEdeaths", model = model, link = "log",
family = "poisson", package = "inla", validrows = vs,
N = N, burn.in = burn.in, thin = thin)
summary(M2stinla.v)
## Normal models
f3 < sqrt(no2) ~ jsa + log10(houseprice) + log(popdensity)
model < c("bym", "iid")
M3inla.bym.iid < Bcartime(data = engdeaths, formula = f3,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
model = model, family = "gaussian", package = "inla",
validrows = vs, N = N, burn.in = burn.in, thin = thin)
summary(M3inla.bym.iid)
model < c("bym", "ar1")
M3inla.bym.ar1 < Bcartime(data = engdeaths, formula = f3,
W = Weng, scol = "spaceid", tcol = "Weeknumber",
model = model, family = "gaussian", package = "inla",
validrows = vs, N = N, burn.in = burn.in, thin = thin)
summary(M3inla.bym.ar1)
}
Cauchy prior simulation example.
BCauchy(
method = "exact",
true.theta = 1,
n = 25,
N = 10000,
rseed = 44,
tuning.sd = 1
)
method 
Which method or package to use. Possibilities are:

true.theta 
True value of theta with a default value of 5. 
n 
Data sample size; defaults to 100. 
N 
is the number of Monte Carlo samples. 
rseed 
is the random number seed for drawing data samples. 
tuning.sd 
is the standard deviation of the proposal increment distribution for the random walk sampler. 
A list containing the estimated posterior mean, ybar (the data mean) and the values of the numerator and the denominator integrals The routine simulates n observations from N(theta, 1). Mean of the simulated data values are returned as ybar.
BCauchy(true.theta = 1, n=25)
BCauchy(true.theta = 5, n=100)
BCauchy(method="importance", true.theta = 1, n=25)
BCauchy(method="importance", true.theta = 1, n=25, N=20000)
BCauchy(method="rejection", true.theta = 1, n=25)
BCauchy(method="independence", true.theta = 1, n=25)
BCauchy(method="randomwalk", true.theta = 1, n=25, tuning.sd =1)
Model choice criteria calculation for univariate normal model for both known and unknown sigma^2
Bmchoice(
case = "Exact.sigma2.known",
y = ydata,
mu0 = mean(y),
sigma2 = 22,
kprior = 1,
prior.M = 1,
prior.sigma2 = c(2, 1),
N = 10000,
rseed = 44
)
case 
One of the three cases:

y 
A vector of data values. Default is 28 ydata values from the package bmstdr 
mu0 
The value of the prior mean if kprior=0. Default is the data mean. 
sigma2 
Value of the known data variance; defaults to sample variance of the data. This is ignored in the third case when sigma2 is assumed to be unknown. 
kprior 
A scalar providing how many data standard deviation the prior mean is from the data mean. Default value is 0. 
prior.M 
Prior sample size, defaults to 10^(4). 
prior.sigma2 
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. 
N 
The number of samples to generate. 
rseed 
The random number seed. Defaults to 44 to reproduce the results in the book Sahu (2022). 
A list containing the exact values of pdic, dic, pdicalt, dicalt, pwaic1, waic1, pwaic2, waic2, gof, penalty and pmcc. Also prints out the posterior mean and variance. @references Sahu SK (2022). Bayesian Modeling of Spatio Temporal Data with R, 1st edition. Chapman and Hall, Boca Raton. https://doi.org/10.1201/9780429318443.
Bmchoice()
b1 < Bmchoice(case="Exact.sigma2.known")
b2 < Bmchoice(case="MC.sigma2.known")
d1 < Bmchoice(case="MC.sigma2.unknown")
d2 < Bmchoice(y=rt(100, df=8), kprior=1, prior.M=1)
Model fitting and validation for spatiotemporal data from moving sensors in time.
Bmoving_sptime(
formula,
data,
coordtype,
coords,
prior.sigma2 = c(2, 1),
prior.tau2 = c(2, 1),
prior.phi = NULL,
prior.phi.param = NULL,
scale.transform = "NONE",
ad.delta = 0.8,
t.depth = 12,
s.size = 0.01,
N = 2500,
burn.in = 1000,
no.chains = 1,
validrows = 10,
predspace = FALSE,
newdata = NULL,
mchoice = TRUE,
plotit = FALSE,
rseed = 44,
verbose = TRUE,
knots.coords = NULL,
g_size = 5
)
formula 
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. 
data 
The data frame for which the model formula is to be fitted. The data frame should be in long format having one row for each location and time combination. The data frame must be ordered by time within each site, and should optionally have a column, named s.index, providing the site indices. Thus the data, with n sites and T times within each site, should be organized in the order: (s1, t1), (s1, t2), ... (s1, T), ... (sn, t1), ... (sn, T). The data frame should also contain two columns giving the coordinates of the locations for spatio temporal model fitting. 
coordtype 
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. 
coords 
A vector of size 2 giving the column numbers of the data frame which contain the coordinates of the data locations. Here the supplied data frame must contain a column named 'time' which should indicate the time index of the data row. The values in the column 'time' should be positive integers starting from 1. 
prior.sigma2 
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. 
prior.tau2 
Shape and scale parameter value for the gamma prior on tau^2, the nugget effect. 
prior.phi 
Specifies the prior distribution for 
prior.phi.param 
Lower and upper limits of the uniform prior distribution for
phi the spatial decay parameter. For the default uniform distribution the values correspond
to an effective range that is between 1% and 100% of the maximum distance
between the data locations. For the Gamma distribution the default values are 2 and 1
and for the Cauchy distribution the default values are 0, 1 which specifies
a halfCauchy distribution in 
scale.transform 
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. Default value is "NONE". 
ad.delta 
Adaptive delta controlling the behavior of Stan during fitting. 
t.depth 
Maximum allowed tree depth in the fitting process of Stan. 
s.size 
step size in the fitting process of Stan. 
N 
MCMC sample size. 
burn.in 
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. 
no.chains 
Number of parallel chains to run in Stan. 
validrows 
Either a number of randomly selected data rows to validate or a vector giving the row numbers of the data set for validation. 
predspace 
A 01 flag indicating whether spatial predictions are to be made. 
newdata 
A new data frame with the same column structure as the model fitting data set. 
mchoice 
Logical scalar value: whether model choice statistics should be calculated. 
plotit 
Logical scalar value: whether to plot the predictions against the observed values. 
rseed 
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. 
verbose 
Logical scalar value: whether to print various estimates and statistics. 
knots.coords 
Only relevant for GPP models fitted by either spTimer or spTDyn. Optional two column matrix of UTMX and UTMY coordinates of the knots in kilometers. It is preferable to specify the g_size parameter instead. 
g_size 
Only relevant for GPP models fitted by either spTimer or spTDyn. The grid size c(m, n) for the knots for the GPP model. A square grid is assumed if this is passed on as a scalar. This does not need to be given if knots.coords is given instead. 
A list containing:
params  A table of parameter estimates
fit  The fitted model object.
datatostan  A list containing all the information sent to the rstan package.
prior.phi.param  This contains the values of
the hyperparameters of the prior distribution for the spatial
decay parameter $phi$
.
prior.phi  This contains the name of of the prior
distribution for the spatial decay parameter $phi$
.
validationplots  Present only if validation has been performed.
Contains three validation plots with or without segment and
an ordinary plot. See obs_v_pred_plot
for more.
fitteds  A vector of fitted values.
residuals  A vector of residual values.
package  The name of the package used for model fitting. This is always stan for this function.
model  The name of the fitted model.
call  The command used to call the model fitting function.
formula  The input formula for the regression part of the model.
scale.transform  The transformation adopted by the input argument with the same name.
sn  The number of data locations used in fitting.
tn  The number of time points used in fitting for each location.
computation.time  Computation time required to run the model fitting.
Bsptime
for spatiotemporal model fitting.
deep < argo_floats_atlantic_2003[argo_floats_atlantic_2003$depth==3, ]
deep$x2inter < deep$xinter*deep$xinter
deep$month < factor(deep$month)
deep$lat2 < (deep$lat)^2
deep$sin1 < round(sin(deep$time*2*pi/365), 4)
deep$cos1 < round(cos(deep$time*2*pi/365), 4)
deep$sin2 < round(sin(deep$time*4*pi/365), 4)
deep$cos2 < round(cos(deep$time*4*pi/365), 4)
deep[, c( "xlat2", "xsin1", "xcos1", "xsin2", "xcos2")] <
scale(deep[,c("lat2", "sin1", "cos1", "sin2", "cos2")])
f2 < temp ~ xlon + xlat + xlat2+ xinter + x2inter
M2 < Bmoving_sptime(formula=f2, data = deep, coordtype="lonlat",
coords = 1:2, N=11, burn.in=6, validrows =NULL, mchoice = FALSE)
summary(M2)
plot(M2)
names(M2)
# Testing for smaller data sets with different data pattern
d2 < deep[1:25, ]
d2$time < 1:25
# Now there is no missing times
M1 < Bmoving_sptime(formula=f2, data = d2, coordtype="lonlat", coords = 1:2,
N=11, burn.in=6, mchoice = FALSE)
summary(M1)
d2[26, ] < d2[25, ]
# With multiple observation at the same location and time
M1 < Bmoving_sptime(formula=f2, data = d2, coordtype="lonlat", coords = 1:2,
N=11, burn.in=6, mchoice = FALSE)
summary(M1)
d2[27, ] < d2[24, ]
d2[27, 3] < 25
# With previous location resampled
M1 < Bmoving_sptime(formula=f2, data = d2, coordtype="lonlat", coords = 1:2,
N=11, burn.in=6, mchoice = FALSE)
summary(M1)
Calculates and plots the variogram cloud and an estimated variogram.
bmstdr_variogram(
formula = yo3 ~ utmx + utmy,
coordtype = "utm",
data = nyspatial,
nbins = 30
)
formula 
Its a formula argument for the response and the coordinates. 
coordtype 
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. 
data 
A data frame containing the response and the coordinates 
nbins 
Number of bins for the variogram. Default is 30. 
A list containing:
cloud  A data frame containing the variogram cloud. This contains pairs of all the data locations, distance between the locations and the variogram value for the pair.
variogram A data frame containing the variogram values in each bin.
cloudplot A ggplot2 object of the plot of the variogram cloud.
variogramplot A ggplot2 object of the plot of the binned variogram values.
a < bmstdr_variogram(data=nyspatial, formula = yo3~utmx + utmy,
coordtype="utm", nb=50)
names(a)
if (require(ggpubr)) ggarrange(a$cloudplot, a$variogramplot, nrow=1, ncol=2)
N(theta, sigma2): Using different methods.
Bnormal(
package = "exact",
y = ydata,
mu0 = mean(y),
kprior = 0,
prior.M = 1e04,
prior.sigma2 = c(0, 0),
N = 2000,
burn.in = 1000,
rseed = 44
)
package 
Which package (or method) to use. Possibilities are:

y 
A vector of data values. Default is 28 ydata values from the package bmstdr 
mu0 
The value of the prior mean if kprior=0. Default is the data mean. 
kprior 
A scalar providing how many data standard deviation the prior mean is from the data mean. Default value is 0. 
prior.M 
Prior sample size, defaults to 10^(4).#' 
prior.sigma2 
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. 
N 
is the number of Gibbs sampling iterations 
burn.in 
is the number of initial iterations to discard before making inference. 
rseed 
is the random number seed defaults to 44. 
A list containing the exact posterior means and variances of theta and sigma2
# Find the posterior mean and variance using `exact' methods  no sampling
# or approximation
Bnormal(kprior = 1, prior.M = 1, prior.sigma2 = c(2, 1))
# Use default noninformative prior
Bnormal(mu0 = 0)
# Start creating table
y < ydata
mu0 < mean(y)
kprior < 1
prior.M < 1
prior.sigma2 < c(2, 1)
N < 10000
eresults < Bnormal(package = "exact", y = y, mu0 = mu0, kprior = kprior,
prior.M = prior.M, prior.sigma2 = prior.sigma2)
eresults
# Run Gibbs sampling
samps < Bnormal(package = "RGibbs", y = y, mu0 = mu0, kprior = kprior,
prior.M = prior.M, prior.sigma2 = prior.sigma2, N = N)
gres < list(mean_theta = mean(samps[, 1]), var_theta = var(samps[, 1]),
mean_sigma2 = mean(samps[, 2]), var_sigma2 = var(samps[, 2]))
glow < list(theta_low = quantile(samps[, 1], probs = 0.025), var_theta = NA,
sigma2_low = quantile(samps[, 2], probs = 0.025), var_sigma2 = NA)
gupp < list(theta_low = quantile(samps[, 1], probs = 0.975), var_theta = NA,
sigma2_low = quantile(samps[, 2], probs = 0.975), var_sigma2 = NA)
a < rbind(unlist(eresults), unlist(gres), unlist(glow), unlist(gupp))
cvsamp < sqrt(samps[, 2])/samps[, 1]
cv < c(NA, mean(cvsamp), quantile(cvsamp, probs = c(0.025, 0.975)))
u < data.frame(a, cv)
rownames(u) < c("Exact", "Estimate", "2.5%", "97.5%")
print(u)
# End create table
##
# Compute using the model fitted by Stan
u < Bnormal(package = "stan", kprior = 1, prior.M = 1, prior.sigma = c(2, 1),
N = 2000, burn.in = 1000)
print(u)
###
# Compute using INLA
if (require(INLA)) {
u < Bnormal(package = "inla", kprior = 1, prior.M = 1, prior.sigma = c(2,
1), N = 1000)
print(u)
}
Bayesian regression model fitting for point referenced spatial data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.
Bspatial(
formula,
data,
package = "none",
model = "lm",
coordtype = NULL,
coords = NULL,
validrows = NULL,
scale.transform = "NONE",
prior.beta0 = 0,
prior.M = 1e04,
prior.sigma2 = c(2, 1),
prior.tau2 = c(2, 0.1),
phi = NULL,
prior.phi.param = NULL,
prior.range = c(1, 0.5),
prior.sigma = c(1, 0.005),
offset = c(10, 140),
max.edge = c(50, 1000),
cov.model = "exponential",
N = 5000,
burn.in = 1000,
rseed = 44,
n.report = 500,
no.chains = 1,
ad.delta = 0.99,
s.size = 0.01,
t.depth = 15,
verbose = TRUE,
plotit = TRUE,
mchoice = FALSE,
...
)
formula 
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. 
data 
The data frame for which the model formula is to be fitted.
If a spatial model is to be fitted then the data frame should contain
two columns containing the locations of the coordinates. See the 
package 
Which package is to be used in model fitting? Currently available packages are:
Further details and more examples are provided in Chapter 6 of the book Sahu (2022). 
model 
Only used when the package has been chosen to be "none". It can take one of two values: either "lm" or "spat". The "lm" option is for an independent error regression model while the "spat" option fits a spatial model without any nugget effect. 
coordtype 
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. 
coords 
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. 
validrows 
A vector of site indices which should be used for validation. This function does not allow some sites to be used for both fitting and validation. The remaining observations will be used for model fitting. The default NULL value instructs that validation will not be performed. 
scale.transform 
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. 
prior.beta0 
A scalar value or a vector providing the prior mean for beta parameters. 
prior.M 
Prior precision value (or matrix) for beta. Defaults to a diagonal matrix with diagonal values 10^(4). 
prior.sigma2 
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. 
prior.tau2 
Shape and scale parameter value for the gamma prior on tau^2, the nugget effect. 
phi 
The spatial decay parameter for the exponential covariance function. Only
used if the package is Stan or the model is a spatial model "spat" without nugget effect when the

prior.phi.param 
Lower and upper limits of the uniform prior distribution for

prior.range 
A length 2 vector, with (range0, Prange) specifying
that 
prior.sigma 
A length 2 vector, with (sigma0, Psigma) specifying
that 
offset 
Only used in INLA based modeling. Offset parameter. See documentation for 
max.edge 
Only used in INLA based modeling. See documentation for 
cov.model 
Only relevant for the spBayes package. Default is the exponential model.
See the documentation for 
N 
MCMC sample size. Default value 5000. 
burn.in 
How many initial iterations to discard. Default value 1000. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. 
rseed 
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. 
n.report 
How many times to report in MCMC progress. This is used only when the package is spBayes. 
no.chains 
Number of parallel chains to run in Stan. 
ad.delta 
Adaptive delta controlling the behavior of Stan during fitting. 
s.size 
step size in the fitting process of Stan. 
t.depth 
Maximum allowed tree depth in the fitting process of Stan. 
verbose 
Logical scalar value: whether to print various estimates and statistics. 
plotit 
Logical scalar value: whether to plot the predictions against the observed values. 
mchoice 
Logical scalar value: whether model choice statistics should be calculated. 
... 
Any additional arguments that may be passed on to the fitting package. 
A list containing:
params  A table of parameter estimates
fit  The fitted model object. This is present only if a named
package, e.g. spBayes
has been used.
max.d  Maximum distance between data locations.
This is in unit of kilometers unless the coordtype
argument
is set as plain
.
fitteds  A vector of fitted values.
mchoice  Calculated model choice statistics if those have been
requested by the input argument mchoice=TRUE
. Not all model fits will contain
all the model choice statistics.
stats  The four validation statistics: rmse, mae, crps and coverage. This is present only if model validation has been performed.
yobs_preds  A data frame containing the validation rows of the model fitting data frame. The last five columns of this data frame contains the validation prediction summaries: mean, sd, median, and 95% prediction interval. This is present only if model validation has been performed.
valpreds  A matrix containing the MCMC samples of the validation predictions. The dimension of this matrix is the number of validations times the number of retained MCMC samples. This is present only if model validation has been performed.
validationplots  Present only if validation has been performed.
Contains three validation plots with or without segment and
an ordinary plot. See obs_v_pred_plot
for more.
residuals  A vector of residual values.
sn  The number of data locations used in fitting.
tn Defaults to 1. Used for plotting purposes.
phi  If present this contains the fixed value of
the spatial decay parameter $phi$
used to fit the model.
prior.phi.param  If present this contains the values of the hyperparameters
of the prior distribution for the spatial decay parameter $phi$
.
prior.range  Present only if the INLA
package has been used
in model fitting. This contains the values of the hyperparameters
of the prior distribution for the range.
logliks  A list containing the loglikelihood values used in calculation of the model choice statistics if those have been requested in the first place.
formula  The input formula for the regression part of the model.
scale.transform  The transformation adopted by the input argument with the same name.
package  The name of the package used for model fitting.
model  The name of the fitted model.
call  The command used to call the model fitting function.
computation.time  Computation time required to run the model fitting.
Bsptime
for Bayesian spatiotemporal model fitting.
N < 10
burn.in < 5
n.report < 2
a < Bspatial(formula = mpg ~ wt, data = mtcars, package = "none", model = "lm",
N = N)
summary(a)
plot(a)
print(a)
b < Bspatial(formula = mpg ~ disp + wt + qsec + drat, data = mtcars,
validrows = c(8, 11, 12, 14, 18, 21, 24, 28), N = N)
#' print(b)
summary(b)
## Illustration with the nyspatial data set
head(nyspatial)
## Linear regression model fitting
M1 < Bspatial(formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial,
mchoice = TRUE, N = N)
print(M1)
plot(M1)
a < residuals(M1)
summary(M1)
## Spatial model fitting
M2 < Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp +
xrh, data = nyspatial, coordtype = "utm", coords = 4:5, phi = 0.4,
mchoice = TRUE, N = N)
names(M2)
print(M2)
plot(M2)
a < residuals(M2)
summary(M2)
## Fit model 2 on the square root scale
M2root < Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, scale.transform = "SQRT")
summary(M2root)
## Spatial model fitting using spBayes
M3 < Bspatial(package = "spBayes", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, prior.phi = c(0.005,
2), mchoice = TRUE, N = N, burn.in = burn.in, n.report = n.report)
summary(M3)
# Spatial model fitting using stan (with a small number of iterations)
M4 < Bspatial(package = "stan", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, phi = 0.4, N = N,
burn.in = burn.in, mchoice = TRUE)
summary(M4)
## K fold crossvalidation for M2 only
set.seed(44)
x < runif(n = 28)
u < order(x)
# Here are the four folds
s1 < u[1:7]
s2 < u[8:14]
s3 < u[15:21]
s4 < u[22:28]
summary((1:28)  sort(c(s1, s2, s3, s4))) ## check
v1 < Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s1,
phi = 0.4, N = N)
v2 < Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s2,
phi = 0.4, N = N)
v3 < Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s3,
phi = 0.4, N = N)
v4 < Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s4,
phi = 0.4, N = N)
M2.val.table < cbind(unlist(v1$stats), unlist(v2$stats), unlist(v3$stats),
unlist(v4$stats))
dimnames(M2.val.table)[[2]] < paste("Fold", 1:4, sep = "")
round(M2.val.table, 3)
## Model validation
s < c(1, 5, 10)
M1.v < Bspatial(model = "lm", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s, N = N,
burn.in = burn.in)
M2.v < Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s, phi = 0.4,
N = N, burn.in = burn.in)
M3.v < Bspatial(package = "spBayes", formula = yo3 ~ xmaxtemp + xwdsp +
xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s,
prior.phi = c(0.005, 2), n.report = 2, N = N, burn.in = burn.in)
# Collect all the results
Mall.table < cbind(unlist(M1.v$stats), unlist(M2.v$stats), unlist(M3.v$stats))
colnames(Mall.table) < paste("M", c(1:3), sep = "")
round(Mall.table, 3)
if (require(INLA) & require(inlabru)) {
N < 10
burn.in < 5
# Spatial model fitting using INLA
M5 < Bspatial(package = "inla", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
data = nyspatial, coordtype = "utm", coords = 4:5, mchoice = TRUE,
N = N, burn.in = burn.in)
summary(M5)
}
Bayesian regression model fitting for point referenced spatiotemporal data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.
Bsptime(
formula,
data,
package = "none",
model = "GP",
coordtype = NULL,
coords = NULL,
validrows = NULL,
scale.transform = "NONE",
prior.beta0 = 0,
prior.M = 1e04,
prior.sigma2 = c(2, 1),
prior.tau2 = c(2, 0.1),
prior.sigma.eta = c(2, 0.001),
phi.s = NULL,
phi.t = NULL,
prior.phi = "Gamm",
prior.phi.param = NULL,
phi.tuning = NULL,
phi.npoints = NULL,
prior.range = c(1, 0.5),
prior.sigma = c(1, 0.005),
offset = c(10, 140),
max.edge = c(50, 1000),
rhotp = 0,
time.data = NULL,
truncation.para = list(at = 0, lambda = 2),
newcoords = NULL,
newdata = NULL,
annual.aggrn = "NONE",
cov.model = "exponential",
g_size = NULL,
knots.coords = NULL,
tol.dist = 0.005,
N = 2000,
burn.in = 1000,
rseed = 44,
n.report = 2,
no.chains = 1,
ad.delta = 0.8,
t.depth = 15,
s.size = 0.01,
verbose = FALSE,
plotit = TRUE,
mchoice = FALSE,
...
)
formula 
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. 
data 
The data frame for which the model formula is to be fitted. The data frame should be in long format having one row for each location and time combination. The data frame must be ordered by time within each site, and should optionally have a column, named s.index, providing the site indices. Thus the data, with n sites and T times within each site, should be organized in the order: (s1, t1), (s1, t2), ... (s1, T), ... (sn, t1), ... (sn, T). The data frame should also contain two columns giving the coordinates of the locations for spatio temporal model fitting. 
package 
Which package is to be used in model fitting? Currently available packages are:
Further details and more examples are provided in Chapters 79 of the book Sahu (2022). 
model 
The model to be fitted. This argument is passed to the fitting package. In case the package is none, then it can be either "lm" or "separable". The "lm" option is for an independent error regression model while the other option fits a separable model without any nugget effect. The separable model fitting method cannot handle missing data. All missing data points in the response variable will be replaced by the grand mean of the available observations. 
coordtype 
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. 
coords 
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. 
validrows 
A vector of row numbers of the supplied data frame which should be used for validation. When the model is "separable" this argument must include all the time points for the sites to be validated. Otherwise, the user is allowed to select the row numbers of the data frame validation as they wish. The default NULL value instructs that validation will not be performed.

scale.transform 
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. Default value is "NONE". 
prior.beta0 
A scalar value or a vector providing the prior mean for beta parameters. 
prior.M 
Prior precision value (or matrix) for beta. Defaults to a diagonal matrix with diagonal values 10^(4). 
prior.sigma2 
Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision. 
prior.tau2 
Shape and scale parameter value for the gamma prior on tau^2, the nugget effect. 
prior.sigma.eta 
Shape and scale parameter value for the inverse gamma prior distribution for sigma^2 eta; only used in the spBayes package. 
phi.s 
Only used if the model is "separable". The value of the fixed spatial decay parameter for the exponential covariance function. If this is not provided then a value is chosen which corresponds to an effective range which is the maximum distance between the data locations. 
phi.t 
Only used if the model is "separable". The fixed decay parameter for the exponential covariance function in the temporal domain. If this is not provided then a value is chosen which corresponds to an effective temporal range which is the maximum time of the data set. 
prior.phi 
Specifies the prior distribution for 
prior.phi.param 
Lower and upper limits of the uniform prior distribution for
phi the spatial decay parameter. For the default uniform distribution the values correspond
to an effective range that is between 1% and 100% of the maximum distance
between the data locations. For the Gamma distribution the default values are 2 and 1
and for the Cauchy distribution the default values are 0, 1 which specifies
a halfCauchy distribution in 
phi.tuning 
Only relevant for spTimer and spTDyn models. Tuning parameter fo sampling phi. See the help file for spT.Gibbs 
phi.npoints 
Only relevant for spTimer and spTDyn models. Number of points for the discrete uniform prior distribution on phi. See the help file for spT.Gibbs 
prior.range 
A length 2 vector, with (range0, Prange) specifying
that 
prior.sigma 
A length 2 vector, with (sigma0, Psigma) specifying
that 
offset 
Only used in INLA based modeling. Offset parameter. See documentation for 
max.edge 
Only used in INLA based modeling. See documentation for 
rhotp 
Only relevant for models fitted by spTDyn. Initial value for the rho parameters in the temporal dynamic model. The default is rhotp=0 for which parameters are sampled from the full conditional distribution via MCMC with initial value 0. If rhotp=1,parameters are not sampled and fixed at value 1. 
time.data 
Defining the segments of the timeseries set up using the function 
truncation.para 
Provides truncation parameter lambda and truncation point "at" using list. Only used with the spTimer package for a truncated model. 
newcoords 
The locations of the prediction sites in similar format to the 
newdata 
The covariate values at the prediction sites specified by 
annual.aggrn 
This provides the options for calculating annual summary statistics by aggregating different time segments (e.g., annual mean). Currently implemented options are: "NONE", "ave" and "an4th", where "ave" = annual average, "an4th"= annual 4th highest. Only applicable if spT.time inputs more than one segment and when fit and predict are done simultaneously. Only used in the spTimer package. 
cov.model 
Model for the covariance function. Only relevant for the spBayes, spTimer and the spTDyn packages. Default is the exponential model.
See the documentation for 
g_size 
Only relevant for GPP models fitted by either spTimer or spTDyn. The grid size c(m, n) for the knots for the GPP model. A square grid is assumed if this is passed on as a scalar. This does not need to be given if knots.coords is given instead. 
knots.coords 
Only relevant for GPP models fitted by either spTimer or spTDyn. Optional two column matrix of UTMX and UTMY coordinates of the knots in kilometers. It is preferable to specify the g_size parameter instead. 
tol.dist 
Minimum separation distance between any two locations out of those specified by coords, knots.coords and pred.coords. The default is 0.005. The program will exit if the minimum distance is less than the nonzero specified value. This will ensure nonsingularity of the covariance matrices. 
N 
MCMC sample size. 
burn.in 
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. 
rseed 
Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results. 
n.report 
How many times to report in MCMC progress. This is only used when the package is spBayes or spTimer. 
no.chains 
Number of parallel chains to run in Stan. 
ad.delta 
Adaptive delta controlling the behavior of Stan during fitting. 
t.depth 
Maximum allowed tree depth in the fitting process of Stan. 
s.size 
step size in the fitting process of Stan. 
verbose 
Logical scalar value: whether to print various estimates and statistics. 
plotit 
Logical scalar value: whether to plot the predictions against the observed values. 
mchoice 
Logical scalar value: whether model choice statistics should be calculated. 
... 
Any additional arguments that may be passed on to the fitting package. 
A list containing:
params  A table of parameter estimates
fit  The fitted model object. This is present only if a
named package, e.g. spTimer
has been used.
max.d  Maximum distance between data locations.
This is in unit of kilometers unless the coordtype
argument
is set as plain
.
fitteds  A vector of fitted values.
mchoice  Calculated model choice statistics if
those have been requested by the input argument mchoice=TRUE
.
Not all model fits will contain all the model choice statistics.
stats  The four validation statistics: rmse, mae, crps and coverage. This is present only if model validation has been performed.
yobs_preds  A data frame containing the validation rows of the model fitting data frame. The last five columns of this data frame contains the validation prediction summaries: mean, sd, median, and 95% prediction interval. This is present only if model validation has been performed.
valpreds  A matrix containing the MCMC samples of the validation predictions. The dimension of this matrix is the number of validations times the number of retained MCMC samples. This is present only if model validation has been performed.
validationplots  Present only if validation has been performed.
Contains three validation plots with or without segment and
an ordinary plot. See obs_v_pred_plot
for more.
residuals  A vector of residual values.
sn  The number of data locations used in fitting.
tn  The number of time points used in fitting.
phi.s, phi.t  Adopted value of the spatial and temporal decay parameters if those were fixed during model fitting.
prior.phi  If present this contains the name of
the prior distribution for
the spatial decay parameter $phi$
used to fit the
model.
prior.phi.param  If present this contains the values of
the hyperparameters of the prior distribution for the spatial
decay parameter $phi$
.
prior.range  Present only if the INLA
package
has been used in model fitting. This contains the values of
the hyperparameters of the prior distribution for the range.
logliks  A list containing the loglikelihood values used in calculation of the model choice statistics if those have been requested in the first place.
knots.coords  The locations of the knots if the model has been fitted using the GPP method.
formula  The input formula for the regression part of the model.
scale.transform  The transformation adopted by the input argument with the same name.
package  The name of the package used for model fitting.
model  The name of the fitted model.
call  The command used to call the model fitting function.
computation.time  Computation time required to run the model fitting.
Sahu SK (2022). Bayesian Modeling of Spatio Temporal Data with R, 1st edition. Chapman and Hall, Boca Raton. https://doi.org/10.1201/9780429318443.
# Set the total number of iterations
N < 45
# Set the total number of burnin iterations
burn.in < 5
# How many times to report progress
n.report < 2
# Model formula used in most model fitting
f2 < y8hrmax ~ xmaxtemp + xwdsp + xrh
# Check out the data set
head(nysptime)
## Fit linear regression model
M1 < Bsptime(model = "lm", data = nysptime, formula = f2,
scale.transform = "SQRT", N = N, burn.in = burn.in, mchoice = TRUE)
names(M1)
plot(M1)
print(M1)
summary(M1)
a < residuals(M1, numbers = list(sn = 28, tn = 62))
M2 < Bsptime(model = "separable", data = nysptime, formula = f2,
coordtype = "utm", coords = 4:5, mchoice = TRUE, scale.transform = "SQRT",
N = N, burn.in = burn.in)
names(M2)
plot(M2)
print(M2)
summary(M2)
b < residuals(M2)
# Spatiotemporal model fitting and validation
valids < c(8, 11)
vrows < which(nysptime$s.index %in% valids)
## Fit separable spatiotemporal model
M2.1 < Bsptime(model = "separable", formula = f2, data = nysptime,
validrows = vrows, coordtype = "utm", coords = 4:5, phi.s = 0.005,
phi.t = 0.05, scale.transform = "SQRT", N = N)
summary(M2.1)
plot(M2.1)
# Use spTimer to fit independent GP model
M3 < Bsptime(package = "spTimer", formula = f2, data = nysptime,
coordtype = "utm", coords = 4:5, scale.transform = "SQRT", mchoice = TRUE,
N = N, burn.in = burn.in, n.report = 2)
summary(M3)
valids < c(1, 5, 10)
validt < sort(sample(1:62, size = 31))
vrows < getvalidrows(sn = 28, tn = 62, valids = valids, validt = validt)
ymat < matrix(nysptime$y8hrmax, byrow = TRUE, ncol = 62)
yholdout < ymat[valids, validt]
# Perform validation
M31 < Bsptime(package = "spTimer", formula = f2, data = nysptime,
coordtype = "utm", coords = 4:5, validrows = vrows, model = "GP",
scale.transform = "NONE", N = N, burn.in = burn.in, n.report = 2)
summary(M31)
modfit < M31$fit
## Extract the fits for the validation sites
fitall < data.frame(modfit$fitted)
head(fitall)
tn < 62
fitall$s.index < rep(1:28, each = tn)
library(spTimer)
vdat < spT.subset(data = nysptime, var.name = c("s.index"), s = valids)
fitvalid < spT.subset(data = fitall, var.name = c("s.index"), s = valids)
head(fitvalid)
fitvalid$low < fitvalid$Mean  1.96 * fitvalid$SD
fitvalid$up < fitvalid$Mean + 1.96 * fitvalid$SD
fitvalid$yobs < sqrt(vdat$y8hrmax)
fitvalid$yobs < vdat$y8hrmax
yobs < matrix(fitvalid$yobs, byrow = TRUE, ncol = tn)
y.valids.low < matrix(fitvalid$low, byrow = TRUE, ncol = tn)
y.valids.med < matrix(fitvalid$Mean, byrow = TRUE, ncol = tn)
y.valids.up < matrix(fitvalid$up, byrow = TRUE, ncol = tn)
library(ggplot2)
p1 < fig11.13.plot(yobs[1, ], y.valids.low[1, ], y.valids.med[1, ],
y.valids.up[1, ], misst = validt)
p1 < p1 + ggtitle("Validation for Site 1")
p1
p2 < fig11.13.plot(yobs[2, ], y.valids.low[2, ], y.valids.med[2, ],
y.valids.up[2, ], misst = validt)
p2 < p2 + ggtitle("Validation for Site 5")
p2
p3 < fig11.13.plot(yobs[3, ], y.valids.low[3, ], y.valids.med[3, ],
y.valids.up[3, ], misst = validt)
p3 < p3 + ggtitle("Validation for Site 10")
p3
## Independent marginal GP model fitting using rstan
M4 < Bsptime(package = "stan", formula = f2, data = nysptime,
coordtype = "utm", coords = 4:5, N = N, burn.in = burn.in,
verbose = FALSE)
summary(M4)
# Spatiotemporal hierarchical autoregressive modeling useing spTimer
M5 < Bsptime(package = "spTimer", model = "AR", formula = f2, data = nysptime,
coordtype = "utm", coords = 4:5, scale.transform = "SQRT", mchoice = TRUE,
n.report = n.report, N = N, burn.in = burn.in)
summary(M5)
a < residuals(M5)
## Spatiotemporal dynamic model fitting using spTDyn
library(spTDyn)
f3 < y8hrmax ~ xmaxtemp + sp(xmaxtemp) + tp(xwdsp) + xrh
M7 < Bsptime(package = "sptDyn", model = "GP", formula = f3, data = nysptime,
coordtype = "utm", coords = 4:5, scale.transform = "SQRT", mchoice = TRUE,
N = N, burn.in = burn.in, n.report = n.report)
summary(M7)
# Dynamic Model fitting using spBayes
M8 < Bsptime(package = "spBayes", formula = f2, data = nysptime,
prior.sigma2 = c(2, 25), prior.tau2 = c(2, 25), prior.sigma.eta = c(2,
0.001), coordtype = "utm", coords = 4:5, scale.transform = "SQRT",
N = N, burn.in = burn.in, n.report = n.report)
summary(M8)
## Gussian Predictive Process based model fitting using spTimer
M9 < Bsptime(package = "spTimer", model = "GPP", g_size = 5, formula = f2,
data = nysptime, coordtype = "utm", coords = 4:5, scale.transform = "SQRT",
N = N, burn.in = burn.in, n.report = n.report)
summary(M9)
# This INLA run may take a long time
if (require(INLA) & require(inlabru)) {
f2 < y8hrmax ~ xmaxtemp + xwdsp + xrh
M6 < Bsptime(package = "inla", model = "AR", formula = f2, data = nysptime,
coordtype = "utm", coords = 4:5, scale.transform = "SQRT",
offset = c(100, 200), max.edge = c(500, 10000),
mchoice = TRUE, plotit=TRUE)
# Takes a minute
summary(M6)
}
Calculate DIC function. Has two arguments: (1) log full likelihood at thetahat and (2) vector of loglikelihood at the theta samples Calculate the DIC criteria values
calculate_dic(loglikatthetahat, logliks)
loglikatthetahat 
Log of the likelihood function at theta hat (Bayes). It is a scalar value. 
logliks 
A vector of log likelihood values at the theta samples 
a list containing four values pdic, pdicalt, dic and dicalt
Calculates the four validation statistics: RMSE, MAE, CRPS and coverage given the observed values and MCMC iterates.
calculate_validation_statistics(yval, yits, level = 95, summarystat = "mean")
yval 
A vector containing n observed values of the response variable. 
yits 
A n by N matrix of predictive samples from the n observations contained in yval. 
level 
The nominal coverage level, defaults to 95%. 
summarystat 
Summary statistics to use to calculate the validation predictions from the samples. It should be a function like mean or median which can be calculated by R. The default is mean. 
A list giving the rmse, mae, crps and coverage.
set.seed(4)
vrows < sample(nrow(nysptime), 100)
M1 < Bsptime(model="lm", formula=y8hrmax~xmaxtemp+xwdsp+xrh, data=nysptime,
validrows=vrows, scale.transform = "SQRT")
valstats < calculate_validation_statistics(M1$yobs_preds$y8hrmax,
yits=t(M1$valpreds))
unlist(valstats)
Calculate WAIC function. Has the sole argument component wise likelihood at thetahat samples. v must be a matrix Calculate the WAIC criteria values
calculate_waic(v)
v 
must be a N (MCMC) by n (data) sample size matrix of the loglikelihood evaluations 
a list containing four values p_waic, p_waic alt, waic and waic_alt
Prints and returns the estimates of the coefficients
## S3 method for class 'bmstdr'
coef(object, digits = 3, ...)
object 
A bmstdr model fit object. 
digits 
How many significant digits after the decimal to print, defaults to 3. 
... 
Any other additional arguments 
Coefficients are returned as a data frame preserving the names of the covariates
The color palette used to draw maps to illustrate the package bmstdr, see Sahu (2022) It has the values in order: dodgerblue4, dodgerblue2, firebrick2, firebrick4 and purple.
colpalette
An object of class character
of length 5.
Sahu SK (2022). Bayesian Modeling of Spatio Temporal Data with R, 1st edition. Chapman and Hall, Boca Raton. https://doi.org/10.1201/9780429318443.
colpalette
Number of weekly Covid19 deaths and cases in the 313 local Local Authority Districts, Counties and Unitary Authorities (LADCUA) in England during the 20 peaks in the first peak from March 13 to July 31, 2020.
engdeaths
An object of class data.frame
with 6260 rows and 24 columns.
Sahu and Böhning (2021). @format A data frame with 6260 rows and 24 columns:
Areacode identifier of the 313 Local Authority Districts, Counties and Unitary Authorities (LADCUA)
A numeric column identifying the map area needed for plotting
A numeric variable taking value 1 to 313 identifying the LADCUA's
Identifies one of the 9 English regions
Population number in mid2019
Percentage of the working age population receiving jobseekers allowance during January 2020
Median house price in March 2020
Population density in mid2019
Estimated average value of NO2 at the centroid of the LADCUA
Number of Covid19 deaths within 28 days of a positive test
Number deaths
Number of cases
Log of the standardized case morbidity during the current week
Log of the standardized case morbidity during the week before
Log of the standardized case morbidity during the second week before
Log of the standardized case morbidity during the third week before
Log of the standardized case morbidity during the fourth week before
Expected number of Covid19 deaths. See Sahu and Bohning (2021) for methodology.
Expected number of cases.
Log of the column Edeaths
Log of the column Ecases
A binary (01) random variable taking the value 1 if the SMR of Covid19 death is higher than 1
Sahu SK, Böhning D (2021). “Bayesian spatiotemporal joint disease mapping of Covid19 cases and deaths in local authorities of England.” Spatial Statistics. doi:10.1016/j.spasta.2021.100519.
colnames(engdeaths)
dim(engdeaths)
summary(engdeaths[, 11:24])
Total number of weekly Covid19 deaths and cases in the 313 local Local Authority Districts, Counties and Unitary Authorities (LADCUA) in England during the first peak from March 13 to July 31, 2020.
engtotals
An object of class data.frame
with 313 rows and 19 columns.
Sahu and Böhning (2021). @format A data frame with 313 rows and 19 columns:
Areacode identifier of the 313 Local Authority Districts, Counties and Unitary Authorities (LADCUA)
A numeric column identifying the map area needed for plotting
A numeric variable taking value 1 to 313 identifying the LADCUA's
Identifies one of the 9 English regions
Population number in mid2019
Percentage of the working age population receiving jobseekers allowance during January 2020
Median house price in March 2020
Population density in mid2019
Start date of the week
Week numbers 11 to 30
Estimated average value of NO2 at the centroid of the LADCUA in that week
Number of Covid19 deaths within 28 days of a positive test
Total number deaths
Total number of cases
Expected number of Covid19 deaths. See Sahu and Bohning (2021) for methodology.
Expected number of cases.
Log of the column Edeaths
Log of the column Ecases
Standaridised morbidity rate for the number of cases,
noofcases
/Ecases
Number of weeks during March 13 to July 31, 2020. All values are 20.
Number of weeks out of 20 when the casesmr
was greater than 1
Sahu SK, Böhning D (2021). “Bayesian spatiotemporal joint disease mapping of Covid19 cases and deaths in local authorities of England.” Spatial Statistics. doi:10.1016/j.spasta.2021.100519.
colnames(engtotals)
dim(engtotals)
summary(engtotals[, 5:14])
Draws a time series (ribbon) plot by combining fitted and predicted values
fig11.13.plot(yobs, ylow, ymed, yup, misst)
yobs 
A vector of the observed values 
ylow 
A vector of the lower limits of the fitted or predicted values 
ymed 
A vector of fitted or predicted values 
yup 
A vector of the upper limits of the fitted or predicted values 
misst 
An integer vector which contains the indices of the predicted values 
A ribbon plot, ggplot2 object, which shows observed values in red color and open circle, predicted values in blue color and filled circle.
Extract fitted values from bmstdr objects.
## S3 method for class 'bmstdr'
fitted(object, ...)
object 
A bmstdr model fit object. 
... 
Any other additional arguments. 
Returns a vector of fitted values.
This function is used to delete values outside the state of New York
fnc.delete.map.XYZ(xyz)
xyz 
A list containing the x values, y values and interpolated z values at each x and y pair. 
Returns the input but with NA placed in z values corresponding to the locations whose xy coordinates are outside the land boundary of the USA.
Obtains parameter estimates from MCMC samples
get_parameter_estimates(samps, level = 95)
samps 
A matrix of N by p samples for the p parameters 
level 
Desired confidence level  defaults to 95%. 
A data frame containing four columns: mean, sd, low (er limit), and up (per limit) for the p parameters.
samps < matrix(rnorm(10000), ncol= 10 )
dim(samps)
a < get_parameter_estimates(samps)
a
b < get_parameter_estimates(samps, level=98)
b
Obtains suitable validation summary statistics from MCMC samples obtained for validation.
get_validation_summaries(samps, level = 95)
samps 
A matrix of N by p samples for the p parameters 
level 
Desired confidence level  defaults to 95%. 
A data frame containing five columns: meanpred, medianpred, sdpred, low (er limit), and up (per limit) for the p parameters.
set.seed(4)
vrows < sample(nrow(nysptime), 100)
M1 < Bsptime(model="lm", formula=y8hrmax~xmaxtemp+xwdsp+xrh, data=nysptime,
validrows=vrows, scale.transform = "SQRT")
samps< M1$valpreds
valsums < get_validation_summaries(samps)
head(valsums)
Returns a vector of row numbers for validation.
getvalidrows(sn, tn, valids, validt = NULL, allt = FALSE)
sn 
The total number of spatial locations. 
tn 
The total number of time points in each location. 
valids 
A vector of site numbers in (1:sn) to be used for validation. 
validt 
A vector of time points in (1:tn) to be used for validation. 
allt 
Whether all the time points should be used for validation. 
Integer vector providing the row numbers of the data frame for validation.
Output of this function is suitable as the argument validrows
for the
bmstdr
model fitting functions Bsptime, Bcartime
.
{
# To validate at site numbers 1, 5, and 10 at 31 randomly selected
# time points for the nysptime data set we issue the following commands
set.seed(44)
vt < sample(62, 31)
vrows < getvalidrows(sn=28, tn=62, valids=c(1, 5, 10), validt=vt)
# To validate at sites 1 and 2 at all time points
vrows < getvalidrows(sn=28, tn=62, valids=c(1, 2), allt=TRUE)
}
Values of three covariates for 100 grid locations in New York averaged over the 62 days during the months of July and August, 2006.
gridnyspatial
An object of class data.frame
with 100 rows and 8 columns.
See the NYgrid data set in spTimer package, Bakar and Sahu (2015). Each data row is the mean of the available covariate values at the 100 grid locations during the months of July and August in 2006.
@format A data frame with 100 rows and 8 columns:
site index (1 to 28)
Longitude of the site
Latitude of the site
UTM Xcoordinate of the site
UTM Ycoordinate of the site
Average maximum temperature (degree Celsius) at the site over 62 days in July and August, 2006
Average wind speed (nautical mile per hour) over 62 days in July and August, 2006
Average relative humidity over 62 days in July and August, 2006
Bakar KS, Sahu SK (2015). “spTimer: SpatioTemporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 15487660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
summary(gridnyspatial)
Values of three covariates for 100 grid locations in New York for the 62 days during the months of July and August, 2006.
Values of three covariates for 100 grid locations in New York for the 62 days during the months of July and August, 2006.
gridnysptime
gridnysptime
An object of class data.frame
with 6200 rows and 11 columns.
An object of class data.frame
with 6200 rows and 11 columns.
It is the same data set as NYgrid data set in the spTimer package, Bakar and Sahu (2015). with two added columns providing the UTM X and Y coordinates. Each data row is for a particular grid site and day as detailed below.
@format A data frame with 6200 rows and 11 columns:
site index (1 to 100)
Longitude of the site
Latitude of the site
UTM Xcoordinate of the site
UTM Ycoordinate of the site
This is 2006 for all the rows
Month taking values 7 for July and 8 for August
Day taking values 1 to 31
Maximum temperature (degree Celsius)
wind speed (nautical mile per hour)
Relative humidity
It is the same data set as NYgrid data set in the spTimer package,
with two added columns providing the UTM X and Y coordinates. Each data row is for a particular grid site and day as detailed below.
@format A data frame with 6200 rows and 11 columns:
site index (1 to 100)
Longitude of the site
Latitude of the site
UTM Xcoordinate of the site
UTM Ycoordinate of the site
This is 2006 for all the rows
Month taking values 7 for July and 8 for August
Day taking values 1 to 31
Maximum temperature (degree Celsius)
wind speed (nautical mile per hour)
Relative humidity
Bakar KS, Sahu SK (2015). “spTimer: SpatioTemporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 15487660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
Bakar KS, Sahu SK (2015). “spTimer: SpatioTemporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 15487660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
head(gridnysptime)
summary(gridnysptime[, 9:11])
summary(gridnysptime[, 9:11])
Calculate the hit and false alarm rates
hitandfalsealarm(thresh, yobs, ypred)
thresh 
Threshold value 
yobs 
A vector of observations, may include missing values 
ypred 
Predictions 
A list containing the calculated hit and false alarm rates
Is it a bmstdr model fitted object?
is.bmstdr(x)
x 
Any R object. 
A TRUE/FALSE logical output.
Banerjee, Carlin and Gelfand (2015) Mat'ern covariance function
materncov(d, phi, nu)
d 
is the distance 
phi 
is the rate of decay 
nu 
is the smoothness parameter 
Returns the Mat'ern covariance for distance object d
Banerjee et al Mat'ern covariance function
maternfun(d, phi, nu)
d 
is the distance 
phi 
is the rate of decay 
nu 
is the smoothness parameter 
Returns the Mat'ern correlation function for distance object d
Average ozone concentration values and three covariates from 28 sites in New York.
nyspatial
An object of class data.frame
with 28 rows and 9 columns.
See the NYdata set in spTimer package, Bakar and Sahu (2015). Each data row is the mean of the available daily 8hour maximum average ozone concentrations in parts per billion (ppb) at each of the 28 sites. The daily values are for the months of July and August in 2006. @format A data frame with 28 rows and 9 columns:
site index (1 to 28)
Longitude of the site
Latitude of the site
UTM Xcoordinate of the site
UTM Ycoordinate of the site
Average ozone concentration value (ppb) at the site over 62 days in July and August, 2006
Average maximum temperature (degree Celsius) at the site over 62 days in July and August, 2006
Average wind speed (nautical mile per hour) over 62 days in July and August, 2006
Average relative humidity over 62 days in July and August, 2006
Bakar KS, Sahu SK (2015). “spTimer: SpatioTemporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 15487660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
head(nyspatial)
summary(nyspatial)
pairs(nyspatial[, 6:9])
Daily 8hour maximum ozone concentration values and three covariates from 28 sites in New York for the 62 days during the months of July and August, 2006.
nysptime
An object of class data.frame
with 1736 rows and 12 columns.
It is the same as the NYdata set in the spTimer package, Bakar and Sahu (2015), with two added columns providing the UTM X and Y coordinates. Each data row is for a particular site and a day as detailed below.
@format A data frame with 1736 rows and 12 columns:
site index (1 to 28)
Longitude of the site
Latitude of the site
UTM Xcoordinate of the site
UTM Ycoordinate of the site
This is 2006 for all the rows
Month taking values 7 for July and 8 for August
Day taking values 1 to 31
Daily 8hour maximum ozone concentration value
Maximum temperature (degree Celsius)
wind speed (nautical mile per hour)
Relative humidity
Bakar KS, Sahu SK (2015). “spTimer: SpatioTemporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 15487660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
head(nysptime)
summary(nysptime[, 9:12])
Observed against predicted plot
obs_v_pred_plot(
yobs,
predsums,
segments = TRUE,
summarystat = "median",
plotit = TRUE
)
yobs 
A vector containing the actual observations 
predsums 
A data frame containing predictive summary
statistics with the same number of rows as the length of the vector yobs.
The data frame must have columns named as meanpred, medianpred, sd, low and up.
Ideally this argument should be the output of the command

segments 
Logical: whether to draw line segments for the prediction intervals. 
summarystat 
Can take one of two values "median" (default) or "mean" indicating which one to use for the plot. 
plotit 
Logical scalar value: whether to plot the predictions against the observed values. 
Draws a plot only after removing the missing observations. It also returns a list of two ggplot2
objects: (i) a plot with intervals drawn pwithseg
and (ii) a plot without the segments drawn:
pwithoutseg
and (iii) a simple plot not showing the range of the prediction intervals.
set.seed(4)
vrows < sample(nrow(nysptime), 100)
M1 < Bsptime(model="lm", formula=y8hrmax~xmaxtemp+xwdsp+xrh, data=nysptime,
validrows=vrows, scale.transform = "SQRT")
psums < get_validation_summaries(M1$valpreds)
oplots < obs_v_pred_plot(yobs=M1$yobs_preds$y8hrmax, predsum=psums)
names(oplots)
plot(oplots$pwithoutseg)
plot(oplots$pwithseg)
Grid search method for choosing phi Calculates the validation statistics using the spatial model with a given range of values of the decay parameter phi.
phichoice_sp(
formula,
data,
coordtype,
coords,
phis,
scale.transform,
s,
N,
burn.in,
verbose = TRUE,
...
)
formula 
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. 
data 
The data frame for which the model formula is to be fitted. If a spatial model is to be fitted then the data frame should contain two columns containing the locations of the coordinates. See the coords argument below. 
coordtype 
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. 
coords 
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. 
phis 
A vector values of phi 
scale.transform 
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. 
s 
A vector giving the validation sites 
N 
MCMC sample size. Default value 5000. 
burn.in 
How many initial iterations to discard. Default value 1000. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. 
verbose 
Logical. Should it print progress? 
... 
Any additional parameter that may be passed to 
A data frame giving the phi values and the corresponding validation statistics
a < phichoice_sp(formula=yo3~xmaxtemp+xwdsp+xrh, data=nyspatial,
coordtype="utm", coords=4:5,
phis=seq(from=0.1, to=1, by=0.4), scale.transform="NONE",
s=c(8,11,12,14,18,21,24,28), N=20, burn.in=10, verbose=TRUE)
Calculates the validation statistics using the spatial model with a given range of values of the decay parameter phi.
phichoicep(
formula,
data,
coordtype,
coords,
scale.transform,
phis,
phit,
valids,
N,
burn.in,
verbose = TRUE
)
formula 
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. 
data 
The data frame for which the model formula is to be fitted. If a spatial model is to be fitted then the data frame should contain two columns containing the locations of the coordinates. See the coords argument below. 
coordtype 
Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM. 
coords 
A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations. 
scale.transform 
Transformation of the response variable. It can take three values: SQRT, LOG or NONE. 
phis 
A vector values of phi for spatial decay 
phit 
A vector values of phi for temporal decay 
valids 
A vector giving the validation sites 
N 
MCMC sample size. 
burn.in 
How many initial iterations to discard. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan. 
verbose 
Logical. Should progress be printed? 
A data frame giving the phi values and the corresponding validation statistics
a < phichoicep(formula=y8hrmax ~ xmaxtemp+xwdsp+xrh, data=nysptime,
coordtype="utm", coords=4:5, scale.transform = "SQRT",
phis=c(0.001, 0.005, 0.025, 0.125, 0.625), phit=c(0.05, 0.25, 1.25, 6.25),
valids=c(8,11,12,14,18,21,24,28), N=20, burn.in=10, verbose=TRUE)
Plot method for bmstdr objects.
## S3 method for class 'bmstdr'
plot(x, segments = TRUE, ...)
x 
A bmstdr model fit object. 
segments 
TRUE or FALSE. It decides whether to draw the prediction intervals as line segments. 
... 
Any other additional arguments. 
It plots the observed values on the original scale against the predictions and the 95% prediction intervals if validation has been performed. It then plots the residuals against fitted values. It then applies plotting method to the model fitted object as returned by the chosen named package. There is no return value.
Provides basic information regarding the fitted model.
## S3 method for class 'bmstdr'
print(x, digits = 3, ...)
x 
A bmstdr model fit object. 
digits 
How many significant digits after the decimal to print, defaults to 3. 
... 
Any other additional arguments 
No return value, called for side effects.
Extract residuals from a bmstdr fitted object.
## S3 method for class 'bmstdr'
residuals(object, numbers = NULL, ...)
object 
A bmstdr model fit object. 
numbers 
a list with two components: sn=number of spatial locations tn=number of time points. Residuals will be assumed to follow the arrangement of the data frame  sorted by space and then time within space. 
... 
Any other additional arguments. 
Returns a vector of residuals. If appropriate, it draws a time series plot of residuals. Otherwise, it draws a plot of residuals against observation numbers.
Provides basic summaries of model fitting.
## S3 method for class 'bmstdr'
summary(object, digits = 3, ...)
object 
A bmstdr model fit object. 
digits 
How many significant digits after the decimal to print, defaults to 3. 
... 
Any other additional arguments 
No return value, called for side effects.
Prints the terms
## S3 method for class 'bmstdr'
terms(x, ...)
x 
A bmstdr model fit object. 
... 
Any other additional arguments 
Terms in the model formula
A 313 by 313 proximity matrix for the 313 LADCUAS in England. Each entry is either 0 or 1 and is 1 if the corresponding row and column LADCUAs share a common boundary.
Weng
An object of class matrix
(inherits from array
) with 313 rows and 313 columns.
Sahu and Böhning (2021).
Sahu SK, Böhning D (2021). “Bayesian spatiotemporal joint disease mapping of Covid19 cases and deaths in local authorities of England.” Spatial Statistics. doi:10.1016/j.spasta.2021.100519.
dim(Weng)
summary(apply(Weng, 1, sum))
Average air pollution values from 28 sites in New York.
ydata
ydata
A vector with 28 real values.
An object of class numeric
of length 28.
This is obtained by calculating sitewise averages of the NYdata set in the 'spTimer' package Bakar and Sahu (2015). Each data point is the mean of the available daily 8hour maximum average ozone concentrations in parts per billion (ppb) at each of the 28 sites. The daily values are for the month of July and August in 2006.
Bakar KS, Sahu SK (2015). “spTimer: SpatioTemporal Bayesian Modeling Using R.” Journal of Statistical Software, Articles, 63(15), 1–32. ISSN 15487660, doi:10.18637/jss.v063.i15, https://www.jstatsoft.org/v063/i15.
summary(ydata)