| Title: | Small Area Estimation Hierarchical Bayes for Spatial Beta Model |
|---|---|
| Description: | Provides several functions and datasets for area-level Small Area Estimation using the Hierarchical Bayesian (HB) method. Model-based estimators are designed for variables of interest that follow a Beta distribution. The package supports spatial structures under the Simultaneous Autoregressive (SAR) and Leroux Conditional Autoregressive (CAR) models, accommodating survey design effect (DEFF) adjustments. The 'rjags' package is employed to obtain parameter estimates via Gibbs Sampling. For references, see Rao and Molina (2015) <doi:10.1002/9781118735855>, Kubacki and Jedrzejczak (2016) <doi:10.59170/stattrans-2016-022>, Leroux et al. (2000) <doi:10.1007/978-1-4612-1284-3_4>, and Chung and Datta (2020) <https://www.census.gov/content/dam/Census/library/working-papers/2020/adrm/RRS2020-07.pdf>. |
| Authors: | Boby Iwan [aut, cre], Cucu Sumarni [aut] |
| Maintainer: | Boby Iwan <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.0 |
| Built: | 2026-07-02 21:49:57 UTC |
| Source: | https://github.com/cran/saeHB.Spatial.Beta |
A binary adjacency matrix (B) generated from a 6x6 regular grid using Queen contiguity.
This matrix is mathematically suitable for the Leroux Conditional Autoregressive (CAR) model.

    data(adjacency_mat)

A 36 x 36 numeric matrix. The elements take a value of 1 if two areas share a common border (are neighbors), and 0 otherwise. All diagonal elements are 0.
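As a rough illustration of how such a matrix can be constructed, the base R sketch below builds a binary Queen-contiguity matrix for a 6x6 regular grid. The helper name `queen_adjacency` and the row-by-row numbering of grid cells are assumptions made for this example; the packaged `adjacency_mat` may order its areas differently.

```r
# Build a binary Queen-contiguity adjacency matrix for an nr x nc regular grid.
# ASSUMPTION: cells are numbered row by row (hypothetical ordering).
queen_adjacency <- function(nr = 6, nc = 6) {
  cells <- expand.grid(col = seq_len(nc), row = seq_len(nr))
  d_row <- abs(outer(cells$row, cells$row, "-"))
  d_col <- abs(outer(cells$col, cells$col, "-"))
  # A Chebyshev distance of exactly 1 means two cells share an edge or a corner
  (pmax(d_row, d_col) == 1) * 1
}

B <- queen_adjacency()
dim(B)               # 36 x 36
isSymmetric(B)       # neighbor relations are mutual
all(diag(B) == 0)    # no area is its own neighbor
table(rowSums(B))    # corner, edge, and interior cells have 3, 5, and 8 neighbors
```

Corner cells have 3 neighbors, edge cells 5, and interior cells 8, which is a quick sanity check for any Queen-contiguity matrix on a regular grid.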
This function produces small area estimates under the spatial Leroux CAR model. It applies to a variable of interest (y) that is assumed to follow a Beta distribution, so the data must lie in the range 0 < y < 1.

    beta_lerouxcar(
      formula, proxmat, data,
      iter.update = 3, iter.mcmc = 2000, thin = 1, burn.in = 1000,
      chains = 2, n.adapt = 1000, coef = NULL, var.coef = NULL,
      tau.v = 1, seed = 123, quiet = FALSE, plot = TRUE, keep.fit = FALSE
    )

| Argument | Description |
|---|---|
| formula | Formula that describes the fitted model. |
| proxmat | Spatial proximity matrix, e.g. the binary adjacency matrix `adjacency_mat`. |
| data | The data frame. |
| iter.update | Number of updates performed during Gibbs sampling. Default is `3`. |
| iter.mcmc | Total number of MCMC iterations per chain. Default is `2000`. |
| thin | Thinning rate for MCMC sampling. Must be a positive integer. Default is `1`. |
| burn.in | Number of burn-in iterations discarded from each MCMC chain. Default is `1000`. |
| chains | Number of parallel MCMC chains. Default is `2`. |
| n.adapt | Number of iterations used for the adaptation phase in JAGS. Default is `1000`. |
| coef | Optional vector containing the mean of the prior distribution of the regression model coefficients. |
| var.coef | Optional vector containing the variances of the prior distribution of the regression model coefficients. |
| tau.v | Initial value or shape for the random effect precision. Default is `1`. |
| seed | An integer seed for the random number generator to ensure reproducibility. Default is `123`. |
| quiet | Logical. Default is `FALSE`. |
| plot | Logical. Default is `TRUE`. |
| keep.fit | Logical. Default is `FALSE`. |
This function returns a list with the following objects:
est: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the small area means estimated using the Hierarchical Bayesian method.
randeff: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effects.
refvar: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effect variances.
coefficient: A data frame containing the posterior mean estimates, posterior standard deviations, 95% credible intervals, Rhat convergence diagnostics, and effective sample sizes (ESS) for the regression coefficients, the spatial autoregressive parameter, and the global precision parameter.

    # Load dataset and proximity matrix
    data(databeta)
    data(adjacency_mat)

    # Fit the Spatial Beta-Leroux CAR model
    result <- beta_lerouxcar(
      formula = y ~ x1 + x2,
      proxmat = adjacency_mat,
      data = databeta
    )

    # View the estimation results
    # 1. Small Area Estimates
    result$est
    # 2. Estimated area-specific random effects
    result$randeff
    # 3. Estimated variance of the random effects
    result$refvar
    # 4. Estimated regression coefficients, spatial, and precision parameters
    result$coefficient

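To give some intuition for why the Leroux CAR model takes a binary adjacency matrix, the sketch below builds the precision structure usually associated with Leroux et al. (2000), Q(rho) = rho (D - B) + (1 - rho) I, where B is the adjacency matrix and D is the diagonal matrix of neighbor counts. This is a minimal illustration in base R; the helper `leroux_precision` and the 4-area chain are hypothetical, and the manual does not state how the package parameterizes the model internally.

```r
# Leroux CAR precision structure: Q(rho) = rho * (D - B) + (1 - rho) * I
# B: binary adjacency matrix, D: diagonal matrix of neighbor counts.
leroux_precision <- function(B, rho) {
  D <- diag(rowSums(B))
  rho * (D - B) + (1 - rho) * diag(nrow(B))
}

# Hypothetical 4-area chain: 1 - 2 - 3 - 4
B <- matrix(0, 4, 4)
B[cbind(1:3, 2:4)] <- 1
B <- B + t(B)

Q0 <- leroux_precision(B, rho = 0)    # identity matrix: independent random effects
Q7 <- leroux_precision(B, rho = 0.7)  # intermediate spatial dependence

# Q is positive definite whenever 0 <= rho < 1
min(eigen(Q7, symmetric = TRUE)$values) > 0
```

With rho = 0 the structure reduces to independent random effects, and as rho approaches 1 it approaches the intrinsic CAR structure D - B.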
This function produces small area estimates under the non-spatial model. It applies to a variable of interest (y) that is assumed to follow a Beta distribution, so the data must lie in the range 0 < y < 1.

    beta_nonspatial(
      formula, data,
      iter.update = 3, iter.mcmc = 2000, thin = 1, burn.in = 1000,
      chains = 2, n.adapt = 1000, coef = NULL, var.coef = NULL,
      tau.v = 1, seed = 123, quiet = FALSE, plot = TRUE, keep.fit = FALSE
    )

| Argument | Description |
|---|---|
| formula | Formula that describes the fitted model. |
| data | The data frame. |
| iter.update | Number of updates performed during Gibbs sampling. Default is `3`. |
| iter.mcmc | Total number of MCMC iterations per chain. Default is `2000`. |
| thin | Thinning rate for MCMC sampling. Must be a positive integer. Default is `1`. |
| burn.in | Number of burn-in iterations discarded from each MCMC chain. Default is `1000`. |
| chains | Number of parallel MCMC chains. Default is `2`. |
| n.adapt | Number of iterations used for the adaptation phase in JAGS. Default is `1000`. |
| coef | Optional vector containing the mean of the prior distribution of the regression model coefficients. |
| var.coef | Optional vector containing the variances of the prior distribution of the regression model coefficients. |
| tau.v | Initial value or shape for the random effect precision. Default is `1`. |
| seed | An integer seed for the random number generator to ensure reproducibility. Default is `123`. |
| quiet | Logical. Default is `FALSE`. |
| plot | Logical. Default is `TRUE`. |
| keep.fit | Logical. Default is `FALSE`. |
This function returns a list with the following objects:
est: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the small area means estimated using the Hierarchical Bayesian method.
randeff: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effects.
refvar: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the global random effect variance.
coefficient: A data frame containing the posterior mean estimates, posterior standard deviations, 95% credible intervals, Rhat convergence diagnostics, and effective sample sizes (ESS) for the regression coefficients and the global precision parameter.

    # Load dataset
    data(databeta)

    # Fit the Non-Spatial Beta model
    result <- beta_nonspatial(
      formula = y ~ x1 + x2,
      data = databeta
    )

    # View the estimation results
    # 1. Small Area Estimates
    result$est
    # 2. Estimated area-specific random effects
    result$randeff
    # 3. Estimated global variance of the random effects
    result$refvar
    # 4. Estimated regression coefficients and precision parameter
    result$coefficient

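The returned data frames report posterior means, standard deviations, 95% credible intervals, and Rhat values. The sketch below shows, for simulated draws only, how such summaries can be computed in base R. The function `summarise_draws`, the simulated draws, and the use of the classic Gelman-Rubin form of Rhat are assumptions for illustration; the manual does not state which diagnostic implementation the package relies on.

```r
set.seed(42)
# Hypothetical posterior draws of one parameter: 1000 retained iterations x 2 chains
draws <- matrix(rnorm(2000, mean = 0.4, sd = 0.05), nrow = 1000, ncol = 2)

summarise_draws <- function(draws) {
  n <- nrow(draws)
  pooled <- as.vector(draws)
  W <- mean(apply(draws, 2, var))   # average within-chain variance
  B <- n * var(colMeans(draws))     # between-chain variance
  # Classic Gelman-Rubin potential scale reduction factor (ASSUMPTION)
  rhat <- sqrt(((n - 1) / n * W + B / n) / W)
  c(mean = mean(pooled), sd = sd(pooled),
    quantile(pooled, c(0.025, 0.975)), rhat = rhat)
}

summarise_draws(draws)
```

Rhat values close to 1 suggest that the chains agree with each other; values well above 1 indicate that more iterations or a longer burn-in may be needed.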
This function produces small area estimates under the spatial SAR model. It applies to a variable of interest (y) that is assumed to follow a Beta distribution, so the data must lie in the range 0 < y < 1.

    beta_sar(
      formula, proxmat, data,
      iter.update = 3, iter.mcmc = 2000, thin = 1, burn.in = 1000,
      chains = 2, n.adapt = 1000, coef = NULL, var.coef = NULL,
      tau.u = 1, seed = 123, quiet = FALSE, plot = TRUE, keep.fit = FALSE
    )

| Argument | Description |
|---|---|
| formula | Formula that describes the fitted model. |
| proxmat | Spatial proximity matrix, e.g. the row-standardized matrix `weight_mat`. |
| data | The data frame. |
| iter.update | Number of updates performed during Gibbs sampling. Default is `3`. |
| iter.mcmc | Total number of MCMC iterations per chain. Default is `2000`. |
| thin | Thinning rate for MCMC sampling. Must be a positive integer. Default is `1`. |
| burn.in | Number of burn-in iterations discarded from each MCMC chain. Default is `1000`. |
| chains | Number of parallel MCMC chains. Default is `2`. |
| n.adapt | Number of iterations used for the adaptation phase in JAGS. Default is `1000`. |
| coef | Optional vector containing the mean of the prior distribution of the regression model coefficients. |
| var.coef | Optional vector containing the variances of the prior distribution of the regression model coefficients. |
| tau.u | Initial value or shape for the random effect precision. Default is `1`. |
| seed | An integer seed for the random number generator to ensure reproducibility. Default is `123`. |
| quiet | Logical. Default is `FALSE`. |
| plot | Logical. Default is `TRUE`. |
| keep.fit | Logical. Default is `FALSE`. |
This function returns a list with the following objects:
est: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the small area means estimated using the Hierarchical Bayesian method.
randeff: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effects.
refvar: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effect variances.
coefficient: A data frame containing the posterior mean estimates, posterior standard deviations, 95% credible intervals, Rhat convergence diagnostics, and effective sample sizes (ESS) for the regression coefficients, the spatial autoregressive parameter, and the global precision parameter.

    # Load dataset and proximity matrix
    data(databeta)
    data(weight_mat)

    # Fit the Spatial Beta-SAR model
    result <- beta_sar(
      formula = y ~ x1 + x2,
      proxmat = weight_mat,
      data = databeta
    )

    # View the estimation results
    # 1. Small Area Estimates
    result$est
    # 2. Estimated area-specific random effects
    result$randeff
    # 3. Estimated variance of the random effects
    result$refvar
    # 4. Estimated regression coefficients, spatial, and precision parameters
    result$coefficient

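For intuition about the SAR structure, the sketch below computes the covariance implied by the standard SAR construction u = (I - rho W)^{-1} e with independent errors of precision tau, which gives the precision matrix Q = tau (I - rho W)' (I - rho W). The 3x3 grid, the values of `rho` and `tau_u`, and the variable names are hypothetical choices for illustration; the package's internal JAGS model is not shown in this manual.

```r
# Hypothetical 3 x 3 grid with Queen contiguity, row-standardized
g <- expand.grid(col = 1:3, row = 1:3)
B <- (pmax(abs(outer(g$row, g$row, "-")), abs(outer(g$col, g$col, "-"))) == 1) * 1
W <- B / rowSums(B)

rho   <- 0.5   # hypothetical spatial autoregressive parameter
tau_u <- 1     # hypothetical precision of the independent errors

A     <- diag(nrow(W)) - rho * W
Q     <- tau_u * crossprod(A)   # precision matrix: tau * t(A) %*% A
Sigma <- solve(Q)               # implied covariance of the SAR random effects

# Unlike the independent case, the variances differ by area
round(diag(Sigma), 3)
```

Because the implied variances depend on each area's position in the neighborhood structure, the random effect variances are area-specific rather than a single global value.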
This function produces small area estimates under the spatial Leroux CAR model with design effect (DEFF) adjustment. It applies to a variable of interest (y) that is assumed to follow a Beta distribution, so the data must lie in the range 0 < y < 1.

    betadeff_lerouxcar(
      formula, deff, n_i, proxmat, data,
      iter.update = 3, iter.mcmc = 2000, thin = 1, burn.in = 1000,
      chains = 2, n.adapt = 1000, coef = NULL, var.coef = NULL,
      tau.v = 1, seed = 123, quiet = FALSE, plot = TRUE, keep.fit = FALSE
    )

| Argument | Description |
|---|---|
| formula | Formula that describes the fitted model. |
| deff | String specifying the name of the design effect variable in the data frame. |
| n_i | String specifying the name of the sample size variable in the data frame. |
| proxmat | Spatial proximity matrix, e.g. the binary adjacency matrix `adjacency_mat`. |
| data | The data frame. |
| iter.update | Number of updates performed during Gibbs sampling. Default is `3`. |
| iter.mcmc | Total number of MCMC iterations per chain. Default is `2000`. |
| thin | Thinning rate for MCMC sampling. Must be a positive integer. Default is `1`. |
| burn.in | Number of burn-in iterations discarded from each MCMC chain. Default is `1000`. |
| chains | Number of parallel MCMC chains. Default is `2`. |
| n.adapt | Number of iterations used for the adaptation phase in JAGS. Default is `1000`. |
| coef | Optional vector containing the mean of the prior distribution of the regression model coefficients. |
| var.coef | Optional vector containing the variances of the prior distribution of the regression model coefficients. |
| tau.v | Initial value or shape for the random effect precision. Default is `1`. |
| seed | An integer seed for the random number generator to ensure reproducibility. Default is `123`. |
| quiet | Logical. Default is `FALSE`. |
| plot | Logical. Default is `TRUE`. |
| keep.fit | Logical. Default is `FALSE`. |
This function returns a list with the following objects:
est: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the small area means estimated using the Hierarchical Bayesian method.
randeff: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effects.
refvar: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effect variances.
coefficient: A data frame containing the posterior mean estimates, posterior standard deviations, 95% credible intervals, Rhat convergence diagnostics, and effective sample sizes (ESS) for the regression coefficients and the spatial autoregressive parameter.

    # Load dataset and proximity matrix
    data(databeta)
    data(adjacency_mat)

    # Fit the Spatial Beta-Leroux CAR model with Design Effect
    result <- betadeff_lerouxcar(
      formula = y ~ x1 + x2,
      deff = "deff",
      n_i = "n_i",
      proxmat = adjacency_mat,
      data = databeta
    )

    # View the estimation results
    # 1. Small Area Estimates
    result$est
    # 2. Estimated area-specific random effects
    result$randeff
    # 3. Estimated variance of the random effects
    result$refvar
    # 4. Estimated regression coefficients and spatial parameter
    result$coefficient

This function produces small area estimates under the non-spatial model with design effect (DEFF) adjustment. It applies to a variable of interest (y) that is assumed to follow a Beta distribution, so the data must lie in the range 0 < y < 1.

    betadeff_nonspatial(
      formula, deff, n_i, data,
      iter.update = 3, iter.mcmc = 2000, thin = 1, burn.in = 1000,
      chains = 2, n.adapt = 1000, coef = NULL, var.coef = NULL,
      tau.v = 1, seed = 123, quiet = FALSE, plot = TRUE, keep.fit = FALSE
    )

| Argument | Description |
|---|---|
| formula | Formula that describes the fitted model. |
| deff | String specifying the name of the design effect variable in the data frame. |
| n_i | String specifying the name of the sample size variable in the data frame. |
| data | The data frame. |
| iter.update | Number of updates performed during Gibbs sampling. Default is `3`. |
| iter.mcmc | Total number of MCMC iterations per chain. Default is `2000`. |
| thin | Thinning rate for MCMC sampling. Must be a positive integer. Default is `1`. |
| burn.in | Number of burn-in iterations discarded from each MCMC chain. Default is `1000`. |
| chains | Number of parallel MCMC chains. Default is `2`. |
| n.adapt | Number of iterations used for the adaptation phase in JAGS. Default is `1000`. |
| coef | Optional vector containing the mean of the prior distribution of the regression model coefficients. |
| var.coef | Optional vector containing the variances of the prior distribution of the regression model coefficients. |
| tau.v | Initial value or shape for the random effect precision. Default is `1`. |
| seed | An integer seed for the random number generator to ensure reproducibility. Default is `123`. |
| quiet | Logical. Default is `FALSE`. |
| plot | Logical. Default is `TRUE`. |
| keep.fit | Logical. Default is `FALSE`. |
This function returns a list with the following objects:
est: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the small area means estimated using the Hierarchical Bayesian method.
randeff: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effects.
refvar: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the global random effect variance.
coefficient: A data frame containing the posterior mean estimates, posterior standard deviations, 95% credible intervals, Rhat convergence diagnostics, and effective sample sizes (ESS) for the regression coefficients.

    # Load dataset
    data(databeta)

    # Fit the Non-Spatial Beta model with Design Effect
    result <- betadeff_nonspatial(
      formula = y ~ x1 + x2,
      deff = "deff",
      n_i = "n_i",
      data = databeta
    )

    # View the estimation results
    # 1. Small Area Estimates
    result$est
    # 2. Estimated area-specific random effects
    result$randeff
    # 3. Estimated global variance of the random effects
    result$refvar
    # 4. Estimated regression coefficients
    result$coefficient

This function produces small area estimates under the spatial SAR model with design effect (DEFF) adjustment. It applies to a variable of interest (y) that is assumed to follow a Beta distribution, so the data must lie in the range 0 < y < 1.

    betadeff_sar(
      formula, deff, n_i, proxmat, data,
      iter.update = 3, iter.mcmc = 2000, thin = 1, burn.in = 1000,
      chains = 2, n.adapt = 1000, coef = NULL, var.coef = NULL,
      tau.u = 1, seed = 123, quiet = FALSE, plot = TRUE, keep.fit = FALSE
    )

| Argument | Description |
|---|---|
| formula | Formula that describes the fitted model. |
| deff | String specifying the name of the design effect variable in the data frame. |
| n_i | String specifying the name of the sample size variable in the data frame. |
| proxmat | Spatial proximity matrix, e.g. the row-standardized matrix `weight_mat`. |
| data | The data frame. |
| iter.update | Number of updates performed during Gibbs sampling. Default is `3`. |
| iter.mcmc | Total number of MCMC iterations per chain. Default is `2000`. |
| thin | Thinning rate for MCMC sampling. Must be a positive integer. Default is `1`. |
| burn.in | Number of burn-in iterations discarded from each MCMC chain. Default is `1000`. |
| chains | Number of parallel MCMC chains. Default is `2`. |
| n.adapt | Number of iterations used for the adaptation phase in JAGS. Default is `1000`. |
| coef | Optional vector containing the mean of the prior distribution of the regression model coefficients. |
| var.coef | Optional vector containing the variances of the prior distribution of the regression model coefficients. |
| tau.u | Initial value or shape for the random effect precision. Default is `1`. |
| seed | An integer seed for the random number generator to ensure reproducibility. Default is `123`. |
| quiet | Logical. Default is `FALSE`. |
| plot | Logical. Default is `TRUE`. |
| keep.fit | Logical. Default is `FALSE`. |
This function returns a list with the following objects:
est: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the small area means estimated using the Hierarchical Bayesian method.
randeff: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effects.
refvar: A data frame containing the posterior mean estimates, posterior standard deviations, and 95% credible intervals of the area-specific random effect variances.
coefficient: A data frame containing the posterior mean estimates, posterior standard deviations, 95% credible intervals, Rhat convergence diagnostics, and effective sample sizes (ESS) for the regression coefficients and the spatial autoregressive parameter.

    # Load dataset and proximity matrix
    data(databeta)
    data(weight_mat)

    # Fit the Spatial Beta-SAR model with Design Effect
    result <- betadeff_sar(
      formula = y ~ x1 + x2,
      deff = "deff",
      n_i = "n_i",
      proxmat = weight_mat,
      data = databeta
    )

    # View the estimation results
    # 1. Small Area Estimates
    result$est
    # 2. Estimated area-specific random effects
    result$randeff
    # 3. Estimated variance of the random effects
    result$refvar
    # 4. Estimated regression coefficients and spatial parameter
    result$coefficient

This function constructs spatial weights matrices (W) for spatial modeling. It supports contiguity, distance-based, and kernel-based weights, and provides a fallback mechanism that automatically connects isolated areas (islands).

    build_w(
      data, coords = NULL,
      method = c("contiguity", "distance", "kernel"),
      contiguity = c("queen", "rook", "bishop"),
      distance = c("knn", "inverse_distance", "exponential"),
      k = 2, dmax = NULL, power = 1, alpha = 1, epsilon = 1e-12,
      kernel = c("uniform", "gaussian", "triangular", "epanechnikov", "quartic"),
      bandwidth = NULL, lonlat = TRUE, style = "W", zero.policy = TRUE,
      fallback = c("knn", "distance", "none"),
      fallback_k = 2, fallback_dmax = NULL,
      output = c("all", "matrix", "listw", "nb")
    )

| Argument | Description |
|---|---|
| data | An |
| coords | An |
| method | A string indicating the spatial weight construction method. Options are `"contiguity"`, `"distance"`, and `"kernel"`. |
| contiguity | A string indicating the contiguity type. Options are `"queen"`, `"rook"`, and `"bishop"`. |
| distance | A string indicating the distance-based type. Options are `"knn"`, `"inverse_distance"`, and `"exponential"`. |
| k | An integer specifying the number of nearest neighbors for KNN methods. Default is `2`. |
| dmax | A numeric specifying the maximum distance threshold for distance-based neighbors. The unit depends on |
| power | A numeric specifying the decay power for inverse distance weights. Default is `1`. |
| alpha | A numeric specifying the decay parameter for exponential distance weights. Default is `1`. |
| epsilon | A small numeric value to prevent division by zero in inverse distance calculation. Default is `1e-12`. |
| kernel | A string indicating the type of spatial kernel. Options are `"uniform"`, `"gaussian"`, `"triangular"`, `"epanechnikov"`, and `"quartic"`. |
| bandwidth | A numeric specifying the bandwidth ( |
| lonlat | Logical; if `TRUE`, spherical (great-circle) distances are calculated for the distance and kernel methods. Default is `TRUE`. |
| style | A character string specifying the spatial weights coding scheme. Default is `"W"`. |
| zero.policy | Logical. Default is `TRUE`. |
| fallback | A string indicating the fallback method for isolated areas (without neighbors) when using contiguity. Options are `"knn"`, `"distance"`, and `"none"`. |
| fallback_k | An integer specifying the number of neighbors for the fallback method. Default is `2`. |
| fallback_dmax | A numeric specifying the maximum distance for the fallback method. |
| output | A string specifying the format of the output. Options are `"all"`, `"matrix"`, `"listw"`, and `"nb"`. |
The function supports the following spatial weight construction methods:
Contiguity: Queen, Rook, and Bishop.
Distance-based: K-Nearest Neighbors (KNN), Inverse Distance, and Exponential.
Kernel-based: Uniform, Gaussian, Triangular, Epanechnikov, and Quartic.
For distance and kernel methods, if lonlat = TRUE, spherical (great-circle) distances are calculated. For the kernel method specifically, distances are internally converted to kilometers.
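To make the distance and kernel ideas concrete, the base R sketch below computes great-circle distances in kilometers with the haversine formula and turns them into row-standardized Gaussian kernel weights. The helpers `haversine_km` and `gaussian_kernel_weights`, the Earth radius of 6371 km, the kernel form exp(-0.5 (d / h)^2), and the four coordinates are all assumptions for illustration; `build_w` may use a different distance routine or kernel scaling.

```r
# Great-circle (haversine) distance matrix in kilometers.
# ASSUMPTION: spherical Earth with a mean radius of 6371 km.
haversine_km <- function(lon, lat, radius = 6371) {
  lam <- lon * pi / 180
  phi <- lat * pi / 180
  dphi <- outer(phi, phi, "-")
  dlam <- outer(lam, lam, "-")
  a <- sin(dphi / 2)^2 + outer(cos(phi), cos(phi)) * sin(dlam / 2)^2
  2 * radius * asin(pmin(sqrt(a), 1))
}

# Gaussian kernel weights, zero on the diagonal, rows rescaled to sum to 1.
# ASSUMPTION: kernel form exp(-0.5 * (d / bandwidth)^2).
gaussian_kernel_weights <- function(D, bandwidth) {
  K <- exp(-0.5 * (D / bandwidth)^2)
  diag(K) <- 0
  K / rowSums(K)
}

# Hypothetical coordinates for four areas
lon <- c(100, 100, 101, 103)
lat <- c(0, 1, 1, 2)

D <- haversine_km(lon, lat)
W <- gaussian_kernel_weights(D, bandwidth = 500)
round(D, 1)
round(W, 3)
```

Closer areas receive larger weights, and the bandwidth controls how quickly the weights decay with distance.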
Depending on the output argument, this function returns:
"matrix": A spatial weights matrix.
"listw": A listw object compatible with spdep functions.
"nb": An nb (neighborhood) object.
"all": A list containing W (matrix), listw, nb, info (method details), and diag (diagnostic metrics for isolates and fallback).

    # Generate random longitude and latitude coordinates for 10 areas
    set.seed(123)
    lon <- runif(10, min = 100, max = 140)
    lat <- runif(10, min = -10, max = 10)
    coords <- cbind(lon, lat)

    # 1. Build KNN distance-based weights (k = 2) using spherical distance
    W_knn <- build_w(
      data = NULL,
      coords = coords,
      method = "distance",
      distance = "knn",
      k = 2,
      lonlat = TRUE,
      output = "matrix"
    )

    # View the first few rows of the matrix
    head(W_knn)

    # 2. Build Gaussian kernel weights using a 500 km bandwidth
    W_kernel <- build_w(
      data = NULL,
      coords = coords,
      method = "kernel",
      kernel = "gaussian",
      bandwidth = 500,
      lonlat = TRUE,
      output = "matrix"
    )

    # View the first few rows of the matrix
    head(W_kernel)

A synthetic dataset generated for testing and tutorial purposes of the saeHB.Spatial.Beta package.
The data is generated under a Spatial Simultaneous Autoregressive (SAR) process with a Beta distribution,
accommodating survey design effects (DEFF).
The data are generated by the following steps:
1. Generate auxiliary variables x1 and x2 from Normal distributions.
2. Generate sample sizes n_i and survey design effects deff, then calculate the precision parameter for each area.
3. Generate spatial random effects under the SAR model. First, generate independent normal errors $e$. Then, calculate the spatial random effects $u = (I - \rho W)^{-1} e$, where $I$ is an identity matrix, $W$ is the row-standardized proximity matrix (weight_mat), and the spatial autoregressive parameter $\rho$ is set to 0.70.
4. Calculate the true mean proportions using fixed regression coefficients.
5. Generate the response variable y from the Beta distribution. Values are strictly bounded between 0 and 1.
6. Combine the area ID domain, response variable y, auxiliary variables x1 and x2, sample size n_i, and design effect deff into a data frame called databeta.
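The steps above can be sketched in base R as follows. Only the 6x6 Queen grid, the row-standardized W, and the spatial parameter of 0.70 come from the description; the sample size and design effect ranges, the error standard deviation, the regression coefficients, the logit link, the precision formula `n_i / deff - 1`, and the mean-precision Beta parameterization are assumptions chosen for illustration and are not the values used to create `databeta`.

```r
set.seed(1)

# 6 x 6 regular grid, Queen contiguity, row-standardized weights
g <- expand.grid(col = 1:6, row = 1:6)
B <- (pmax(abs(outer(g$row, g$row, "-")), abs(outer(g$col, g$col, "-"))) == 1) * 1
W <- B / rowSums(B)
m <- nrow(W)

# Step 1: auxiliary variables
x1 <- rnorm(m)
x2 <- rnorm(m)

# Step 2: sample sizes, design effects, and area-level precision
n_i  <- sample(30:100, m, replace = TRUE)  # hypothetical range
deff <- runif(m, min = 1, max = 2)         # hypothetical range
phi  <- n_i / deff - 1                     # ASSUMPTION: effective sample size minus 1

# Step 3: SAR random effects, u = (I - rho W)^{-1} e
rho <- 0.70
e   <- rnorm(m, sd = 0.3)                  # hypothetical error sd
u   <- solve(diag(m) - rho * W, e)

# Step 4: true mean proportions (ASSUMPTION: logit link, hypothetical coefficients)
beta <- c(0.5, 0.3, -0.2)
mu   <- plogis(beta[1] + beta[2] * x1 + beta[3] * x2 + u)

# Step 5: Beta response in mean-precision form
y <- rbeta(m, shape1 = mu * phi, shape2 = (1 - mu) * phi)

# Step 6: combine into a data frame
sim <- data.frame(domain = seq_len(m), y, x1, x2, n_i, deff)
head(sim)
```

Under this parameterization a larger sample size or a smaller design effect increases the precision, so the simulated y varies less around its mean in well-sampled areas.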

    data(databeta)

A data frame with 36 rows and 6 columns:
domain: Area ID/name
y: Direct estimates of the proportion/variable of interest (0 < y < 1)
x1: Auxiliary variable 1 (Normal distribution)
x2: Auxiliary variable 2 (Normal distribution)
n_i: Sample size for each area
deff: Survey design effect for each area
A synthetic dataset identical to databeta, except that it contains 5 missing values (NA) in the variable of interest (y) to demonstrate the prediction capability of the models for non-sampled areas.

    data(databeta_na)

A data frame with 36 rows and 6 columns:
domain: Area ID/name
y: Direct estimates of the proportion/variable of interest (0 < y < 1). Contains NA values.
x1: Auxiliary variable 1
x2: Auxiliary variable 2
n_i: Sample size for each area
deff: Survey design effect for each area
This function provides a convenient wrapper to perform Moran's I test for spatial autocorrelation on a numeric vector. It seamlessly handles missing values (NA) by subsetting both the numeric vector and the spatial weights list simultaneously.

    moran_test(
      x, listw,
      alternative = c("greater", "less", "two.sided"),
      mc = FALSE, nsim = 999, zero.policy = TRUE, na.rm = TRUE
    )

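| Argument | Description |
|---|---|
| x | A numeric vector of the variable of interest (e.g., residuals, random effects, or raw data). |
| listw | A `listw` spatial weights object. |
| alternative | A character string specifying the alternative hypothesis. Must be one of `"greater"`, `"less"`, or `"two.sided"`. |
| mc | Logical; if `TRUE`, the Monte Carlo permutation approach is used instead of the analytical approach. Default is `FALSE`. |
| nsim | An integer specifying the number of permutations if `mc = TRUE`. Default is `999`. |
| zero.policy | Logical. Default is `TRUE`. |
| na.rm | Logical; if `TRUE`, missing values (NA) are removed by subsetting both `x` and `listw`. Default is `TRUE`. |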
This function supports two approaches to testing the significance of Moran's I:
1. Analytical Approach (Randomization - Default)
When mc = FALSE, the function uses the analytical approach (specifically, the assumption of randomization). It computes the theoretical expectation and variance of Moran's I under the null hypothesis of no spatial autocorrelation. This method assumes that the observed values could have occurred in any spatial location with equal probability.
When to use: Use this approach when your dataset is relatively large and follows standard statistical assumptions. It is computationally fast and provides reliable asymptotic p-values when the number of areas is large.
2. Monte Carlo Permutation Approach (mc = TRUE)
When mc = TRUE, the function calculates the p-value empirically. It randomly permutes (shuffles) the observed values x across the spatial units nsim times. For each permutation, it calculates a pseudo-Moran's I. The final p-value is the proportion of simulated Moran's I values that are as extreme as or more extreme than the observed Moran's I.
When to use: Use this approach when your dataset has a relatively small number of areas or when you want to avoid relying on asymptotic theory. Because it computes the p-value empirically without assuming a specific theoretical distribution for the Moran's I statistic, the Monte Carlo approach is highly robust and is widely recommended for evaluating MCMC outputs.
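The permutation logic described above can be sketched in base R. The functions `morans_i` and `moran_perm`, the simulated gradient data, and the choice of counting the observed value among the permutations when forming the one-sided p-value are assumptions for illustration; this is not the package's or spdep's implementation.

```r
# Moran's I for a numeric vector x and a spatial weights matrix W
morans_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# One-sided ("greater") Monte Carlo permutation test
moran_perm <- function(x, W, nsim = 199) {
  observed  <- morans_i(x, W)
  simulated <- replicate(nsim, morans_i(sample(x), W))
  # The observed value is counted as one of the nsim + 1 arrangements
  p_value <- (sum(simulated >= observed) + 1) / (nsim + 1)
  list(statistic = observed, p.value = p_value)
}

# Hypothetical data: a north-south gradient on a 6 x 6 Queen grid
set.seed(123)
g <- expand.grid(col = 1:6, row = 1:6)
B <- (pmax(abs(outer(g$row, g$row, "-")), abs(outer(g$col, g$col, "-"))) == 1) * 1
W <- B / rowSums(B)
x <- g$row + rnorm(36, sd = 0.1)

res <- moran_perm(x, W)
res$statistic
res$p.value
```

The smallest attainable p-value is 1 / (nsim + 1), which is why a larger `nsim` is needed when finer p-value resolution matters.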
A list with class htest containing the following components:
statistic: The value of the standard deviate of Moran's I.
p.value: The p-value of the test.
estimate: The value of the observed Moran's I, its expectation, and variance.
method: A character string indicating the type of test performed.
data.name: A character string giving the name(s) of the data.

    # Load datasets
    data(databeta)
    data(weight_mat)

    # Convert the spatial weights matrix to a 'listw' object
    W_listw <- spdep::mat2listw(weight_mat, style = "W", zero.policy = TRUE)

    # Perform Moran's I test (analytical approach)
    moran_test(x = databeta$y, listw = W_listw)

    # Perform Moran's I test (Monte Carlo permutation approach)
    moran_test(x = databeta$y, listw = W_listw, mc = TRUE, nsim = 99)

    # Handle missing values automatically (na.rm = TRUE is the default)
    y_with_na <- databeta$y
    y_with_na[c(2, 5)] <- NA
    moran_test(x = y_with_na, listw = W_listw, na.rm = TRUE)

A row-standardized proximity matrix (W) generated from a 6x6 regular grid using Queen contiguity.
This matrix is mathematically suitable for the Spatial Simultaneous Autoregressive (SAR) model and Moran's I test.

    data(weight_mat)

A 36 x 36 numeric matrix. The values are numbers in the interval [0,1] representing the proximity of the row and column areas. The sum of the values in each row is exactly 1.
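Row standardization itself is a one-line operation. The sketch below, under the assumption of a 6x6 Queen grid numbered row by row, divides each row of a binary adjacency matrix by its neighbor count; the object names are hypothetical and the result is not guaranteed to match the area ordering of the packaged `weight_mat`.

```r
# Binary Queen-contiguity matrix for a 6 x 6 grid (hypothetical row-by-row ordering)
g <- expand.grid(col = 1:6, row = 1:6)
B <- (pmax(abs(outer(g$row, g$row, "-")), abs(outer(g$col, g$col, "-"))) == 1) * 1

# Row standardization: each neighbor of area i receives weight 1 / (number of neighbors of i)
W <- B / rowSums(B)

range(W)            # all values lie in [0, 1]
range(rowSums(W))   # every row sums to 1
isSymmetric(W)      # generally FALSE: W is not symmetric even though B is
```

The loss of symmetry is expected, because a corner cell with 3 neighbors gives each of them weight 1/3, while an interior neighbor with 8 neighbors gives the corner cell only 1/8.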