Title: | Geographically Optimal Similarity |
---|---|
Description: | Understanding spatial association is essential for spatial statistical inference, including factor exploration and spatial prediction. The geographically optimal similarity (GOS) model is an effective method for spatial prediction, as described in Yongze Song (2022) <doi:10.1007/s11004-022-10036-8>. GOS was developed based on the geographical similarity principle, as described in Axing Zhu (2018) <doi:10.1080/19475683.2018.1534890>. GOS offers more accurate spatial prediction with fewer samples and markedly reduced prediction uncertainty. |
Authors: | Yongze Song [aut, cph] , Wenbo Lv [aut, cre] |
Maintainer: | Wenbo Lv <[email protected]> |
License: | GPL-3 |
Version: | 3.7 |
Built: | 2024-12-17 07:03:19 UTC |
Source: | CRAN |
Computationally optimized function for the geographically optimal similarity (GOS) model
gos(formula, data = NULL, newdata = NULL, kappa = 0.25, cores = 1)
formula |
A formula of GOS model. |
data |
A |
newdata |
A |
kappa |
(optional) A numeric value giving the proportion (for example, 0.25 for 25%) of observation locations
with high similarity to a prediction location. |
cores |
(optional) Positive integer. If cores > 1, a |
A tibble made up of predictions and uncertainties.
pred
GOS model prediction results
uncertainty90
uncertainty under 0.9 quantile
uncertainty95
uncertainty under 0.95 quantile
uncertainty99
uncertainty under 0.99 quantile
uncertainty99.5
uncertainty under 0.995 quantile
uncertainty99.9
uncertainty under 0.999 quantile
uncertainty100
uncertainty under 1 quantile
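To make the roles of `kappa` and the uncertainty columns more concrete, the following base R sketch predicts one location from similarity scores. It rests on several assumptions: the similarities are already computed (the vector `sim` is made up), the prediction is a similarity-weighted mean over the observations whose similarity is at or above the (1 - kappa) quantile, and each uncertainty is one minus a quantile of the retained similarities. The helper name `gos_sketch` is hypothetical, and this is not claimed to be the package's internal code.

```r
# Hypothetical helper: illustrative only, not the package implementation.
gos_sketch <- function(y, sim, kappa = 0.25, probs = c(0.9, 0.95, 0.99)) {
  stopifnot(length(y) == length(sim), kappa > 0, kappa <= 1)
  # Keep the observations whose similarity falls in the top `kappa` share.
  keep <- which(sim >= quantile(sim, 1 - kappa))
  w <- sim[keep]
  # Similarity-weighted mean of the retained observations.
  pred <- sum(y[keep] * w) / sum(w)
  # Assumed uncertainty definition: 1 minus a quantile of retained similarities.
  unc <- 1 - quantile(w, probs, names = FALSE)
  names(unc) <- paste0("uncertainty", probs * 100)
  c(pred = pred, unc)
}

y   <- c(10, 12, 11, 30, 28, 9, 13, 31)                   # made-up observed values
sim <- c(0.90, 0.80, 0.85, 0.10, 0.20, 0.95, 0.70, 0.05)  # made-up similarities
res <- gos_sketch(y, sim, kappa = 0.25)
print(res)
```

Under these assumptions, a smaller `kappa` restricts the weighted mean to fewer, more similar observations, and an uncertainty near 0 means the retained observations are all highly similar to the prediction location.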
Song, Y. (2022). Geographically Optimal Similarity. Mathematical Geosciences. doi: 10.1007/s11004-022-10036-8.
data("zn")
# log-transformation
hist(zn$Zn)
zn$Zn <- log(zn$Zn)
hist(zn$Zn)
# remove outliers
k <- removeoutlier(zn$Zn, coef = 2.5)
dt <- zn[-k,]
# split data for validation: 70% training; 30% testing
split <- sample(1:nrow(dt), round(nrow(dt)*0.7))
train <- dt[split,]
test <- dt[-split,]
system.time({
  g1 <- gos(Zn ~ Slope + Water + NDVI + SOC + pH + Road + Mine,
            data = train, newdata = test, kappa = 0.25, cores = 1)
})
test$pred <- g1$pred
plot(test$Zn, test$pred)
cor(test$Zn, test$pred)
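Correlation alone does not show the size of the prediction errors. As a small complement to the example above, this sketch computes RMSE and MAE for hold-out predictions in base R. The helper names `rmse` and `mae` and the numbers are made up for illustration; they are not part of the package.

```r
# Hypothetical helpers for hold-out validation; not part of the package.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
mae  <- function(obs, pred) mean(abs(obs - pred))

obs  <- c(4.1, 4.5, 3.9, 5.0)  # made-up observed values
pred <- c(4.0, 4.7, 4.2, 4.6)  # made-up predictions

# Error magnitude (RMSE, MAE) alongside the linear association (r).
metrics <- c(rmse = rmse(obs, pred), mae = mae(obs, pred), r = cor(obs, pred))
print(metrics)
```

With the objects from the example above, the same helpers could be called as `rmse(test$Zn, test$pred)`.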
Computationally optimized function for determining the best kappa parameter for the optimal similarity
gos_bestkappa( formula, data = NULL, kappa = seq(0.05, 1, 0.05), nrepeat = 10, nsplit = 0.5, cores = 1 )
formula |
A formula of GOS model. |
data |
A |
kappa |
(optional) A numeric vector of candidate kappa values, each giving the proportion of observation locations
with high similarity to a prediction location. |
nrepeat |
(optional) A numeric value giving the number of cross-validation repetitions.
The default value is 10. |
nsplit |
(optional) The sample training set segmentation ratio, which in |
cores |
(optional) Positive integer. If cores > 1, a |
A list.
bestkappa
the best kappa value
cvrmse
all RMSE values calculated during cross-validation
cvmean
the average RMSE for each kappa value across the cross-validation runs
plot
a plot of how RMSE changes with kappa
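The returned elements can be read as the outcome of a repeated random-split validation. The base R sketch below shows one way such a search could work, assuming each repetition draws a fresh training subset of share `nsplit`, scores every candidate kappa by RMSE on the remainder, and picks the kappa with the lowest mean RMSE. Both `toy_predict` and `best_kappa_sketch` are hypothetical stand-ins with a made-up one-covariate similarity; they are not the package's code.

```r
# Hypothetical stand-in for a GOS-style predictor (illustration only).
toy_predict <- function(train, test, kappa) {
  sapply(seq_len(nrow(test)), function(i) {
    # Assumed similarity: Gaussian kernel on a single covariate `x`.
    sim <- exp(-(train$x - test$x[i])^2 / (2 * sd(train$x)^2))
    keep <- which(sim >= quantile(sim, 1 - kappa))
    sum(train$y[keep] * sim[keep]) / sum(sim[keep])
  })
}

# Hypothetical search over candidate kappa values by repeated random splits.
best_kappa_sketch <- function(data, kappa, nrepeat = 10, nsplit = 0.5) {
  cvrmse <- matrix(NA_real_, nrow = nrepeat, ncol = length(kappa),
                   dimnames = list(NULL, as.character(kappa)))
  for (r in seq_len(nrepeat)) {
    idx <- sample(nrow(data), round(nrow(data) * nsplit))
    train <- data[idx, ]
    test <- data[-idx, ]
    for (j in seq_along(kappa)) {
      pred <- toy_predict(train, test, kappa[j])
      cvrmse[r, j] <- sqrt(mean((test$y - pred)^2))
    }
  }
  cvmean <- colMeans(cvrmse)  # average RMSE per candidate kappa
  list(bestkappa = kappa[which.min(cvmean)], cvrmse = cvrmse, cvmean = cvmean)
}

set.seed(1)
x <- runif(200)
toy <- data.frame(x = x, y = sin(2 * pi * x) + rnorm(200, sd = 0.1))  # made-up data
out <- best_kappa_sketch(toy, kappa = c(0.05, 0.25, 1), nrepeat = 3)
print(out$bestkappa)
print(out$cvmean)
```

Because the splits are random, results vary between runs unless a seed is set, which is also why averaging over several repetitions gives a steadier choice than a single split.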
Song, Y. (2022). Geographically Optimal Similarity. Mathematical Geosciences. doi: 10.1007/s11004-022-10036-8.
data("zn")
# log-transformation
hist(zn$Zn)
zn$Zn <- log(zn$Zn)
hist(zn$Zn)
# remove outliers
k <- removeoutlier(zn$Zn, coef = 2.5)
dt <- zn[-k,]
# determine the best kappa
system.time({
  b1 <- gos_bestkappa(Zn ~ Slope + Water + NDVI + SOC + pH + Road + Mine,
                      data = dt, kappa = c(0.01, 0.1, 1), nrepeat = 1, cores = 1)
})
b1$bestkappa
b1$plot
Spatial grid data of explanatory variables.
grid
grid: A tibble of gridded trace element explanatory variables with 13132 rows and 12 variables, where the first column is ID.
Yongze Song [email protected]
Function for removing outliers.
removeoutlier(x, coef = 2.5)
x |
A vector of a variable |
coef |
A multiplier of the standard deviation. Default is 2.5. |
Location of outliers in the vector
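As an illustration of a standard-deviation rule of this kind, the base R sketch below flags values that lie more than `coef` standard deviations from the mean and returns their positions. The function name `outlier_idx` and the data are made up, and the exact rule is an assumption; the package function may differ in detail.

```r
# Hypothetical re-implementation for illustration; the package function may differ.
outlier_idx <- function(x, coef = 2.5) {
  which(abs(x - mean(x, na.rm = TRUE)) > coef * sd(x, na.rm = TRUE))
}

x <- c(rep(c(9, 10, 11), 10), 100)  # made-up values with one extreme entry
k <- outlier_idx(x)
print(k)

# Guard against an empty index: x[-integer(0)] would drop every element.
x_clean <- if (length(k) > 0) x[-k] else x
print(length(x_clean))
```

The guard matters whenever negative indexing is applied to the result: if no outliers are found, indexing with an empty negative vector returns an empty object rather than the original data.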
data("zn")
# log-transformation
hist(zn$Zn)
zn$Zn <- log(zn$Zn)
hist(zn$Zn)
# remove outliers
k <- removeoutlier(zn$Zn, coef = 2.5)
k
Spatial datasets of trace element Zn.
zn
zn: A tibble of trace element Zn with 894 rows and 12 variables
Yongze Song [email protected]