Title: | Blind Source Separation for Multivariate Spatial Data |
---|---|
Description: | Blind source separation for multivariate spatial data based on simultaneous/joint diagonalization of (robust) local covariance matrices. This package is an implementation of the methods described in Bachoc, Genton, Nordhausen, Ruiz-Gazen and Virta (2020) <doi:10.1093/biomet/asz079>. |
Authors: | Christoph Muehlmann [aut] , Mika Sipil<e4> [aut] , Klaus Nordhausen [aut, cre] , Sara Taskinen [aut] , Joni Virta [aut] |
Maintainer: | Klaus Nordhausen <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.14-0 |
Built: | 2024-12-12 07:07:57 UTC |
Source: | CRAN |
Blind source separation for multivariate spatial data based on simultaneous/joint diagonalization of local covariance matrices. This package is an implementation of the methods described in Nordhausen, Oja, Filzmoser and Reimann (2015) <doi:10.1007/s11004-014-9559-5>, Bachoc, Genton, Nordhausen, Ruiz-Gazen and Virta (2020) <doi:10.1093/biomet/asz079> and Muehlmann, Bachoc and Nordhausen (2022) <doi:10.1016/j.spasta.2021.100574> as well as some related methods.
Package: | SpatialBSS |
Type: | Package |
Version: | 0.14-0 |
Date: | 2023-07-20 |
License: | GPL (>= 2) |
This package provides functions to solve the Blind Source Separation problem for multivariate spatial data. These methods are designed to work with random fields that are observed on irregular locations. Moreover, the random field is assumed to show weak second order stationarity. The main functions of this package are:
sbss
This function derives a set of local scatter matrices that are based on spatial kernel functions, where the spatial kernel functions can be chosen. Then this set of local covariance matrices as well as the sample covariance matrix are simultaneously/jointly diagonalized. Local covariance matrices as well as local difference matrices are implemented.
sbss_asymp
, sbss_boot
These functions test for white noise components in the estimated latent field estimated by the sbss
function based on asymptotic results or bootstrap inference principles.
snss_sd
, snss_jd
and snss_sjd
These functions estimate the latent random field assuming a spatial non-stationary source separation model. This is done by splitting the domain into a number of sub-domains and diagonalizing the corresponding covariance and/or local covariance matrices for each sub-domain.
robsbss
Uses robust estimates of local covariance matrices to solve the SBSS problem.
Joint diagonalization is computed with the frjd
(fast real joint diagonalization) algorithm from the package JADE
.
The random field can be either a pair of numeric matrices giving the coordinates and field values or an object of class SpatialPointsDataFrame
or sf
.
Christoph Muehlmann, Mika Sipila, Klaus Nordhausen, Sara Taskinen, Joni Virta
Maintainer: Klaus Nordhausen [email protected]
Muehlmann, C., Filzmoser, P. and Nordhausen, K. (2021), Spatial Blind Source Separation in the Presence of a Drift, Submitted for publication. Preprint available at https://arxiv.org/abs/2108.13813.
Bachoc, F., Genton, M. G, Nordhausen, K., Ruiz-Gazen, A. and Virta, J. (2020), Spatial Blind Source Separation, Biometrika, 107, 627-646, doi:10.1093/biomet/asz079.
Nordhausen, K., Oja, H., Filzmoser, P., Reimann, C. (2015), Blind Source Separation for Spatial Compositional Data, Mathematical Geosciences 47, 753-770, doi:10.1007/s11004-014-9559-5.
Muehlmann, C., Bachoc, F. and Nordhausen, K. (2022), Blind Source Separation for Non-Stationary Random Fields, Spatial Statistics, 47, 100574, doi:10.1016/j.spasta.2021.100574.
Muehlmann, C., Bachoc, F., Nordhausen, K. and Yi, M. (2022), Test of the Latent Dimension of a Spatial Blind Source Separation Model, to appear in Statistica Sinica, doi:10.5705/ss.202021.0326.
Extracts the estimated unmixing matrix of an object of class 'sbss'
.
## S3 method for class 'sbss' coef(object, ...)
## S3 method for class 'sbss' coef(object, ...)
object |
object of class |
... |
further arguments to be passed to or from methods. |
Returns the estimated unmixing matrix of an object of class 'sbss'
as a numeric matrix.
Generates synthetic global outliers and contaminates a given p-variate random field
gen_glob_outl(x, alpha = 0.05, h = 10, random_sign = FALSE)
gen_glob_outl(x, alpha = 0.05, h = 10, random_sign = FALSE)
x |
a numeric matrix of dimension |
alpha |
a numerical value between 0 and 1 giving the proportion of observations to contaminate. |
h |
a numerical constant to determine how large the contaminated outliers are, see details. |
random_sign |
logical. If |
gen_glob_outl
generates outliers for a given field by selecting randomly round(alpha * n)
observations to be the outliers and contaminating them by setting
, where the elements
of vector
are determined by the parameter
random_sign
. If random_sign = TRUE
, is either
or
with
. If
random_sign = FALSE
, for all
,
. The parameter
alpha
determines the contamination rate and the parameter
h
determines the size of the outliers.
gen_glob_outl
returns a data.frame
containing the contaminated fields as first columns. The column
contains a logical indicator whether the observation is outlier or not.
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # Generate 10 % global outliers to data, with size h=15. field_cont <- gen_glob_outl(field, alpha = 0.1, h = 15) # Generate 5 % global outliers to data, with size h = 10 and random sign. field_cont2 <- gen_glob_outl(field, alpha = 0.05, h = 10, random_sign = TRUE) }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # Generate 10 % global outliers to data, with size h=15. field_cont <- gen_glob_outl(field, alpha = 0.1, h = 15) # Generate 5 % global outliers to data, with size h = 10 and random sign. field_cont2 <- gen_glob_outl(field, alpha = 0.05, h = 10, random_sign = TRUE) }
Generates synthetic local outliers and contaminates a given p-variate random field by swapping observations based on the first principal component score.
gen_loc_outl(x, coords, alpha = 0.05, neighborhood_type = c("radius", "fixed_n"), radius = NULL, neighborhood_size = NULL, swap_order = c("regular", "reverse", "random"))
gen_loc_outl(x, coords, alpha = 0.05, neighborhood_type = c("radius", "fixed_n"), radius = NULL, neighborhood_size = NULL, swap_order = c("regular", "reverse", "random"))
x |
a numeric matrix of dimension |
coords |
a numeric matrix or data frame with dimension |
alpha |
a numeric value between 0 and 1 determining the proportion of the contaminated observations. |
neighborhood_type |
a string determining the type of neighborhood. If |
radius |
a positive numeric value defining the size of the radius when the |
neighborhood_size |
a positive integer defining the number of points in each neighborhood when the |
swap_order |
a string to determine which swap order is used. Either |
gen_loc_outl
generates local outliers by swapping the most extreme and the least extreme observations based on the first principal component score under the condition that at most one outliers lies in each neighborhood. For each location , the neighborhood
is defined based on the parameter
neighborhood_type
. When neighborhood_type
is 'radius'
, the neighborhood contains all locations
for which the Euclidean norm
, where
is determined by the parameter
radius
. When neighborhood_type
is 'fixed_n'
, the neighborhood contains
nearest locations of
, where
is determined by the parameter
neighborhood_size
. For more details see Ernst & Haesbroeck, (2017).
After calculating the neighborhoods, the local outliers are generated following Ernst & Haesbroeck, (2017) and Harris et al. (2014) using the steps:
Sort the observations from highest to lowest by their principle component analysis (PCA) scores of the first component (PC-1).
Set to be
rounded to nearest integer and select the set of local outlier points
by finding
observations with the highest PC-1 values and
observations with the lowest PC-1 values under the condition that for all
it holds that
.
Form sets , which contains
observations with the largest PC-1 values of outlier points
and
, which contains
observations with the smallest PC-1 values of outlier points
. Generate the local outliers by swapping
with
,
. The parameter
swap_order
defines how the sets and
are ordered.
If the parameter swap_order
is 'regular'
, and
are sorted by PC-1 score from smallest to largest.
If the parameter
swap_order
is 'reverse'
, is sorted from largest to smallest and
from smallest to largest.
If the parameter
swap_order
is 'random'
, and
are in random order.
gen_loc_outl
returns a data.frame
containing the contaminated fields as first columns. The column
contains a logical indicator whether the observation is an outlier or not.
This function is a modified version of code originally provided by M. Ernst and G. Haesbroeck.
Ernst, M., & Haesbroeck, G. (2017). Comparison of local outlier detection techniques in spatial multivariate data. Data Mining and Knowledge Discovery, 31 , 371-399. doi:10.1007/s10618-016-0471-0
Harris, P., Brunsdon, C., Charlton, M., Juggins, S., & Clarke, A. (2014). Multivariate spatial outlier detection using robust geographically weighted methods. Mathematical Geosciences, 46 , 1-31. doi:10.1007/s11004-013-9491-0
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # Generate 5 % local outliers to data using radius neighborhoods # and regular swap_order. field_cont <- gen_loc_outl(field, coords, alpha = 0.05, neighborhood_type = "radius", radius = 0.5, swap_order = "regular") # Generate 10 % local outliers to data using fixed_n neighborhoods # and reverse swap_order. field_cont2 <- gen_loc_outl(field, coords, alpha = 0.1, neighborhood_type = "fixed_n", neighborhood_size = 10, swap_order = "reverse") }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # Generate 5 % local outliers to data using radius neighborhoods # and regular swap_order. field_cont <- gen_loc_outl(field, coords, alpha = 0.05, neighborhood_type = "radius", radius = 0.5, swap_order = "regular") # Generate 10 % local outliers to data using fixed_n neighborhoods # and reverse swap_order. field_cont2 <- gen_loc_outl(field, coords, alpha = 0.1, neighborhood_type = "fixed_n", neighborhood_size = 10, swap_order = "reverse") }
local_covariance_matrix
computes local covariance matrices for a random field based on a given set of spatial kernel matrices.
local_covariance_matrix(x, kernel_list, lcov = c('lcov', 'ldiff', 'lcov_norm'), center = TRUE)
local_covariance_matrix(x, kernel_list, lcov = c('lcov', 'ldiff', 'lcov_norm'), center = TRUE)
x |
a numeric matrix of dimension |
kernel_list |
a list with spatial kernel matrices of dimension |
lcov |
a string indicating which type of local covariance matrix to use. Either |
center |
logical. If |
Two versions of local covariance matrices are implemented, the argument lcov
determines which version is used:
'lcov'
:
'ldiff'
:
'lcov_norm'
:
with
Where correspond to the pairwise distances between coordinates,
are the
p
random field values at location ,
is the sample mean vector, and the kernel function
determines the locality. The choice
'lcov_norm'
is useful when testing for the actual signal dimension of the latent field, see sbss_asymp
and sbss_boot
. The function local_covariance_matrix
computes local covariance matrices for a given random field and given spatial kernel matrices, the type of computed local covariance matrices is determined by the argument 'lcov'
. If the argument center
equals FALSE
then the centering in the above formula for is not carried out. See also
spatial_kernel_matrix
for details.
local_covariance_matrix
returns a list of equal length as the argument kernel_list
. Each list entry is a numeric matrix of dimension c(p, p)
corresponding to a local covariance matrix. The list has the attribute 'lcov'
which equals the function argument lcov
.
Muehlmann, C., Filzmoser, P. and Nordhausen, K. (2021), Spatial Blind Source Separation in the Presence of a Drift, Submitted for publication. Preprint available at https://arxiv.org/abs/2108.13813.
Bachoc, F., Genton, M. G, Nordhausen, K., Ruiz-Gazen, A. and Virta, J. (2020), Spatial Blind Source Separation, Biometrika, 107, 627-646, doi:10.1093/biomet/asz079.
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # computing two ring kernel matrices and corresponding local covariance matrices kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) loc_cov_ring <- local_covariance_matrix(x = field, kernel_list = ring_kernel_list) # computing two ring kernel matrices and corresponding local difference matrices kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) loc_cov_ring <- local_covariance_matrix(x = field, kernel_list = ring_kernel_list, lcov = 'ldiff') # computing three ball kernel matrices and corresponding local covariance matrices kernel_params_ball <- c(0.5, 1, 2) ball_kernel_list <- spatial_kernel_matrix(coords, 'ball', kernel_params_ball) loc_cov_ball <- local_covariance_matrix(x = field, kernel_list = ball_kernel_list) # computing three gauss kernel matrices and corresponding local covariance matrices kernel_params_gauss <- c(0.5, 1, 2) gauss_kernel_list <- spatial_kernel_matrix(coords, 'gauss', kernel_params_gauss) loc_cov_gauss <- local_covariance_matrix(x = field, kernel_list = gauss_kernel_list) }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # computing two ring kernel matrices and corresponding local covariance matrices kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) loc_cov_ring <- local_covariance_matrix(x = field, kernel_list = ring_kernel_list) # computing two ring kernel matrices and corresponding local difference matrices kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) loc_cov_ring <- local_covariance_matrix(x = field, kernel_list = ring_kernel_list, lcov = 'ldiff') # computing three ball kernel matrices and corresponding local covariance matrices kernel_params_ball <- c(0.5, 1, 2) ball_kernel_list <- spatial_kernel_matrix(coords, 'ball', kernel_params_ball) loc_cov_ball <- local_covariance_matrix(x = field, kernel_list = ball_kernel_list) # computing three gauss kernel matrices and corresponding local covariance matrices kernel_params_gauss <- c(0.5, 1, 2) gauss_kernel_list <- spatial_kernel_matrix(coords, 'gauss', kernel_params_gauss) loc_cov_gauss <- local_covariance_matrix(x = field, kernel_list = gauss_kernel_list) }
local_gss_covariance_matrix
computes generalized local sign covariance matrices for a random field based on a given set of spatial kernel matrices.
local_gss_covariance_matrix(x, kernel_list, lcov = c('norm', 'winsor', 'qwinsor'), center = TRUE)
local_gss_covariance_matrix(x, kernel_list, lcov = c('norm', 'winsor', 'qwinsor'), center = TRUE)
x |
a numeric matrix of dimension |
kernel_list |
a list with spatial kernel matrices of dimension |
lcov |
a string indicating which type of robust local covariance matrix to use. Either |
center |
logical. If |
Generalized local sign matrices are determined by radial functions , where
and
is Hettmansperger Randles location estimator (Hettmansperger & Randles, 2002), and kernel functions
, where
. Generalized local sign covariance (gLSCM) matrix is then calculated as
with
Three radial functions (Raymaekers & Rousseeuw, 2019) are implemented, the parameter
lcov
defines which is used:
'norm'
:
'winsor'
:
'qwinsor'
:
The cutoff is defined as
, where
is
th order statistic of
and
. If the argument
center
equals FALSE
then the centering in the above formula for is not carried out. See also
spatial_kernel_matrix
for details.
local_gss_covariance_matrix
returns a list with two entries:
cov_sp_list |
List of equal length as the argument |
weights |
numeric vector of |
Hettmansperger, T. P., & Randles, R. H. (2002). A practical affine equivariant multivariate median. Biometrika, 89 , 851-860. doi:10.1093/biomet/89.4.851.
Raymaekers, J., & Rousseeuw, P. (2019). A generalized spatial sign covariance matrix. Journal of Multivariate Analysis, 171 , 94-111. doi:10.1016/j.jmva.2018.11.010.
Sipila, M., Muehlmann, C. Nordhausen, K. & Taskinen, S. (2022). Robust second order stationary spatial blind source separation using generalized sign matrices. Manuscript.
spatial_kernel_matrix
, robsbss
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # computing two ring kernel matrices and corresponding # robust local covariance matrices using 'norm' radial function: kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) loc_cov_ring <- local_gss_covariance_matrix(x = field, kernel_list = ring_kernel_list, lcov = 'norm') # computing three ball kernel matrices and corresponding # robust local covariance matrices using 'winsor' radial function: kernel_params_ball <- c(0.5, 1, 2) ball_kernel_list <- spatial_kernel_matrix(coords, 'ball', kernel_params_ball) loc_cov_ball <- local_gss_covariance_matrix(x = field, kernel_list = ball_kernel_list, lcov = 'winsor') # computing three gauss kernel matrices and corresponding # robust local covariance matrices using 'qwinsor' radial function: kernel_params_gauss <- c(0.5, 1, 2) gauss_kernel_list <- spatial_kernel_matrix(coords, 'gauss', kernel_params_gauss) loc_cov_gauss <- local_gss_covariance_matrix(x = field, kernel_list = gauss_kernel_list, lcov = 'qwinsor') }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # computing two ring kernel matrices and corresponding # robust local covariance matrices using 'norm' radial function: kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) loc_cov_ring <- local_gss_covariance_matrix(x = field, kernel_list = ring_kernel_list, lcov = 'norm') # computing three ball kernel matrices and corresponding # robust local covariance matrices using 'winsor' radial function: kernel_params_ball <- c(0.5, 1, 2) ball_kernel_list <- spatial_kernel_matrix(coords, 'ball', kernel_params_ball) loc_cov_ball <- local_gss_covariance_matrix(x = field, kernel_list = ball_kernel_list, lcov = 'winsor') # computing three gauss kernel matrices and corresponding # robust local covariance matrices using 'qwinsor' radial function: kernel_params_gauss <- c(0.5, 1, 2) gauss_kernel_list <- spatial_kernel_matrix(coords, 'gauss', kernel_params_gauss) loc_cov_gauss <- local_gss_covariance_matrix(x = field, kernel_list = gauss_kernel_list, lcov = 'qwinsor') }
plot.sbss
is an interface to the standard plot method for the class of the estimated source random field.
## S3 method for class 'sbss' plot(x, which = 1:ncol(x$s), ...)
## S3 method for class 'sbss' plot(x, which = 1:ncol(x$s), ...)
x |
object of class |
which |
a numeric vector indicating which components of the latent field should be plotted. |
... |
further arguments to the plot method of |
This method calls the corresponding plot method of class(x$s)
. Either spplot
for class(x$s)
is SpatialPointsDataFrame
or plot.sf
for class(x$s)
is sf
. If x$s
is a matrix then it is internally cast to SpatialPointsDataFrame
and spplot
is used for plotting. Arguments to the corresponding plot functions can be given through ...
.
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # compute ring kernel matrices kernel_parameters <- c(0, 1, 1, 2, 2, 3) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_parameters) # apply sbss SpatialPointsDataFrame object field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(field)) res_sp <- sbss(field_sp, kernel_list = ring_kernel_list) # plot with SpatialPointsDataFrame object plot(res_sp) # plot with SpatialPointsDataFrame object # and additional arguments for spplot function plot(res_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply sbss with sf object if (!requireNamespace('sf', quietly = TRUE)) { message('Please install the package sf to run the example code.') } else { field_sf <- sf::st_as_sf(data.frame(coords = coords, field), coords = c(1,2)) res_sf <- sbss(x = field_sf, kernel_list = ring_kernel_list) # plot with sf object plot(res_sf) # plot with sf object # and additional arguments for plot.sf function plot(res_sf, axes = TRUE, key.pos = 4) } }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # compute ring kernel matrices kernel_parameters <- c(0, 1, 1, 2, 2, 3) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_parameters) # apply sbss SpatialPointsDataFrame object field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(field)) res_sp <- sbss(field_sp, kernel_list = ring_kernel_list) # plot with SpatialPointsDataFrame object plot(res_sp) # plot with SpatialPointsDataFrame object # and additional arguments for spplot function plot(res_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply sbss with sf object if (!requireNamespace('sf', quietly = TRUE)) { message('Please install the package sf to run the example code.') } else { field_sf <- sf::st_as_sf(data.frame(coords = coords, field), coords = c(1,2)) res_sf <- sbss(x = field_sf, kernel_list = ring_kernel_list) # plot with sf object plot(res_sf) # plot with sf object # and additional arguments for plot.sf function plot(res_sf, axes = TRUE, key.pos = 4) } }
predict.sbss
predicts the estimated source random field on a grid with Inverse Distance Weighting (IDW) and plots these predictions.
## S3 method for class 'sbss' predict(object, p = 2, n_grid = 50, which = 1:ncol(object$s), ...)
## S3 method for class 'sbss' predict(object, p = 2, n_grid = 50, which = 1:ncol(object$s), ...)
object |
object of class |
p |
numeric. The positive power parameter for IDW. Default is 2. |
n_grid |
numeric. Each dimension of the spatial domain is divided by this integer to derive a grid for IDW predictions. Default is 50. |
which |
a numeric vector indicating which components of the latent field should be predicted. |
... |
further arguments to the plot method of |
IDW predictions are made on a grid. The side lengths of the rectangular shaped grid cells are derived by the differences of the rounded maximum and minimum values divided by the n_grid
argument for each column of object$coords
. Hence, the grid contains a total of n_grid ^ 2
points. The power parameter of the IDW predictions is given by p
(default: 2).
The predictions are plotted with the corresponding plot method of class(x$s)
. Either spplot
for class(x$s)
is SpatialPointsDataFrame
or plot.sf
for class(x$s)
is sf
. If x$s
is a matrix then it is internally cast to SpatialPointsDataFrame
and spplot
is used for plotting. Arguments to the corresponding plot functions can be given through ...
as it is done by the method plot.sbss
.
The return is dependent on the class of the latent field in the 'sbss'
object.
If class(object$s)
is a matrix then a list with the following entries is returned:
vals_pred_idw |
a matrix of dimension |
coords_pred_idw |
a matrix of dimension |
If class(object$s)
is SpatialPointsDataFrame
or sf
then the predicted values and their coordinates are returned as an object of the corresponding class.
The return is invisible.
sbss
, plot.sbss
, spplot
, plot.sf
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # apply sbss with three ring kernels kernel_borders <- c(0, 1, 1, 2, 2, 4) res_sbss <- sbss(field, coords, 'ring', kernel_borders) # predict latent fields on grid with default settings predict(res_sbss) # predict latent fields on grid with custom plotting settings predict(res_sbss, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on a 60x60 grid predict(res_sbss, n_grid = 60, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields with a higher IDW power parameter predict(res_sbss, p = 10, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields and save the predictions predict_list <- predict(res_sbss, p = 5, colorkey = TRUE, as.table = TRUE, cex = 1) }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # apply sbss with three ring kernels kernel_borders <- c(0, 1, 1, 2, 2, 4) res_sbss <- sbss(field, coords, 'ring', kernel_borders) # predict latent fields on grid with default settings predict(res_sbss) # predict latent fields on grid with custom plotting settings predict(res_sbss, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on a 60x60 grid predict(res_sbss, n_grid = 60, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields with a higher IDW power parameter predict(res_sbss, p = 10, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields and save the predictions predict_list <- predict(res_sbss, p = 5, colorkey = TRUE, as.table = TRUE, cex = 1) }
Prints the estimated unmixing matrix and the diagonalized local covariance matrices for an object of class 'sbss'
.
## S3 method for class 'sbss' print(x, ...)
## S3 method for class 'sbss' print(x, ...)
x |
object of class |
... |
additional arguments for the method |
robsbss
is a robust variant of sbss
. It estimates the unmixing matrix assuming a spatial blind source separation model by jointly diagonalizing the Hettmansperger-Randles scatter matrix and one/many generalized local sign covariance matrices. These local generalized sign covariance matrices are determined by spatial kernel functions and radial functions. Three types of such kernel functions and three types of radial functions are supported.
robsbss(x, ...) ## Default S3 method: robsbss(x, coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, lcov = c('norm', 'winsor', 'qwinsor'), ordered = TRUE, kernel_list = NULL, ...) ## S3 method for class 'SpatialPointsDataFrame' robsbss(x, ...) ## S3 method for class 'sf' robsbss(x, ...)
robsbss(x, ...) ## Default S3 method: robsbss(x, coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, lcov = c('norm', 'winsor', 'qwinsor'), ordered = TRUE, kernel_list = NULL, ...) ## S3 method for class 'SpatialPointsDataFrame' robsbss(x, ...) ## S3 method for class 'sf' robsbss(x, ...)
x |
either a numeric matrix of dimension |
coords |
a numeric matrix of dimension |
kernel_type |
a string indicating which kernel function to use. Either |
kernel_parameters |
a numeric vector that gives the parameters for the kernel function. At least length of one for |
lcov |
a string indicating which radial function or type of robust local covariance matrix to use. Either |
ordered |
logical. If |
kernel_list |
a list of spatial kernel matrices with dimension |
... |
further arguments for the fast real joint diagonalization algorithm that jointly diagonalizes the local covariance matrices. See details and |
robsbss
is a robust variant of sbss
which uses Hettmansperger-Randles (HR) location and scatter estimates (Hettmansperger & Randles, 2002) for whitening (see white_data
for details) and jointly diagonalizes HR scatter matrix and generalized local sign matrices to estimate the unmixing matrix. The generalized local sign matrices are determined by radial functions , where
and
is HR location estimator, and kernel functions
, where
. Generalized local sign covariance (gLSCM) matrix is then calculated as
with
Three radial functions (Raymaekers & Rousseeuw, 2019) are implemented, the parameter
lcov
defines which is used:
'norm'
:
'winsor'
:
'qwinsor'
:
The cutoff is defined as
, where
is
th order statistic of
and
.
In addition, three kernel functions
are implemented, the parameter
kernel_type
defines which is used:
'ring'
: parameters are inner radius and outer radius
, with
, and
:
'ball'
: parameter is the radius , with
:
'gauss'
: Gaussian function where 95% of the mass is inside the parameter , with
:
The argument kernel_type
determines the used kernel function as presented above, the argument kernel_parameters
gives the corresponding parameters for the kernel function. Specifically, if kernel_type
equals 'ball'
or 'gauss'
then kernel_parameters
is a numeric vector where each entry corresponds to one parameter. Hence, length(kernel_parameters)
local covariance matrices are used. Whereas, if kernel_type
equals 'ring'
, then kernel_parameters
must be a numeric vector of even length where subsequently the inner and outer radii must be given (informally: c(r_in1, r_out1, r_in2, r_out2, ...)
). In that case length(kernel_parameters) / 2
local covariance matrices are used.
robsbss
calls spatial_kernel_matrix
internally to compute a list of c(n,n)
kernel matrices based on the parameters given, where each entry of those matrices corresponds to . Alternatively, such a list of kernel matrices can be given directly to the function
robsbss
via the kernel_list
argument. This is useful when robsbss
is called numerous times with the same coordinates/kernel functions as the computation of the kernel matrices is then done only once prior the actual robsbss
calls. For details see also spatial_kernel_matrix
.
If more than one generalized local sign covariance matrix is used robsbss
jointly diagonalizes these matrices with the function frjd
. ...
provides arguments for frjd
, useful arguments might be:
eps
: tolerance for convergence.
maxiter
: maximum number of iterations.
robsbss
returns a list of class 'sbss'
with the following entries:
s |
object of |
coords |
coordinates of the observations. Is |
w |
estimated unmixing matrix. |
weights |
numeric vector of |
w_inv |
inverse of the estimated unmixing matrix. |
pevals |
(pseudo-)eigenvalues for each latent field entry. |
d |
matrix of stacked (jointly) diagonalized local covariance matrices with dimension |
diags |
matrix of dimension |
x_mu |
robustly estimated columnmeans of |
cov_inv_sqrt |
square root of the inverse sample covariance matrix of |
Hettmansperger, T. P., & Randles, R. H. (2002). A practical affine equivariant multivariate median. Biometrika, 89 , 851-860. doi:10.1093/biomet/89.4.851.
Raymaekers, J., & Rousseeuw, P. (2019). A generalized spatial sign covariance matrix. Journal of Multivariate Analysis, 171 , 94-111. doi:10.1016/j.jmva.2018.11.010.
Sipila, M., Muehlmann, C. Nordhausen, K. & Taskinen, S. (2022). Robust second order stationary spatial blind source separation using generalized sign matrices. Manuscript.
spatial_kernel_matrix
, local_gss_covariance_matrix
, sp
, sf
, frjd
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # Generate 5 % local outliers to data field_cont <- gen_loc_outl(field, coords, radius = 2, swap_order = "regular")[,1:3] X <- as.matrix(field_cont) # apply sbss with three ring kernels kernel_parameters <- c(0, 1, 1, 2, 2, 3) robsbss_result <- robsbss(X, coords, kernel_type = 'ring', kernel_parameters = kernel_parameters) # print object print(robsbss_result) # plot latent field plot(robsbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(robsbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(robsbss_result) }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) # Generate 5 % local outliers to data field_cont <- gen_loc_outl(field, coords, radius = 2, swap_order = "regular")[,1:3] X <- as.matrix(field_cont) # apply sbss with three ring kernels kernel_parameters <- c(0, 1, 1, 2, 2, 3) robsbss_result <- robsbss(X, coords, kernel_type = 'ring', kernel_parameters = kernel_parameters) # print object print(robsbss_result) # plot latent field plot(robsbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(robsbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(robsbss_result) }
sbss
estimates the unmixing matrix assuming a spatial blind source separation model by simultaneous/jointly diagonalizing the covariance matrix and one/many local covariance matrices. These local covariance matrices are determined by spatial kernel functions. Three types of such kernel functions are supported.
sbss(x, ...) ## Default S3 method: sbss(x, coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, lcov = c('lcov', 'ldiff', 'lcov_norm'), ordered = TRUE, kernel_list = NULL, rob_whitening = FALSE, ...) ## S3 method for class 'SpatialPointsDataFrame' sbss(x, ...) ## S3 method for class 'sf' sbss(x, ...)
sbss(x, ...) ## Default S3 method: sbss(x, coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, lcov = c('lcov', 'ldiff', 'lcov_norm'), ordered = TRUE, kernel_list = NULL, rob_whitening = FALSE, ...) ## S3 method for class 'SpatialPointsDataFrame' sbss(x, ...) ## S3 method for class 'sf' sbss(x, ...)
x |
either a numeric matrix of dimension |
coords |
a numeric matrix of dimension |
kernel_type |
a string indicating which kernel function to use. Either |
kernel_parameters |
a numeric vector that gives the parameters for the kernel function. At least length of one for |
lcov |
a string indicating which type of local covariance matrix to use. Either |
ordered |
logical. If |
kernel_list |
a list of spatial kernel matrices with dimension |
rob_whitening |
logical. If |
... |
further arguments for the fast real joint diagonalization algorithm that jointly diagonalizes the local covariance matrices. See details and |
Three versions of local covariance matrices are implemented, the argument lcov
determines which version is used:
'lcov'
:
'ldiff'
:
'lcov_norm'
:
with
Where correspond to the pairwise distances between coordinates,
are the
p
random field values at location ,
is the sample mean vector, and the kernel function
determines the locality. The choice
'lcov_norm'
is useful when testing for the actual signal dimension of the latent field, see sbss_asymp
and sbss_boot
. LDiff matrices are supposed to be more robust when the random field shows a smooth trend. The following kernel functions are implemented and chosen with the argument kernel_type
:
'ring'
: parameters are inner radius and outer radius
, with
, and
:
'ball'
: parameter is the radius , with
:
'gauss'
: Gaussian function where 95% of the mass is inside the parameter , with
:
The argument kernel_type
determines the used kernel function as presented above, the argument kernel_parameters
gives the corresponding parameters for the kernel function. Specifically, if kernel_type
equals 'ball'
or 'gauss'
then kernel_parameters
is a numeric vector where each entry corresponds to one parameter. Hence, length(kernel_parameters)
local covariance matrices are used. Whereas, if kernel_type
equals 'ring'
, then kernel_parameters
must be a numeric vector of even length where subsequently the inner and outer radii must be given (informally: c(r_in1, r_out1, r_in2, r_out2, ...)
). In that case length(kernel_parameters) / 2
local covariance matrices are used.
Internally, sbss
calls spatial_kernel_matrix
to compute a list of c(n,n)
kernel matrices based on the parameters given, where each entry of those matrices corresponds to . Alternatively, such a list of kernel matrices can be given directly to the function
sbss
via the kernel_list
argument. This is useful when sbss
is called numerous times with the same coordinates/kernel functions as the computation of the kernel matrices is then done only once prior the actual sbss
calls. For details see also spatial_kernel_matrix
.
rob_whitening
determines which scatter is used for the whitening step. If TRUE
, whitening is carried out with respect to the scatter matrix defined by the lcov
argument, where the kernel function is given by the argument kernel_type
and the parameters correspond to the first occuring in the argument kernel_parameters
. Therefore, at least two different kernel parameters need to be given. Note that only matrices are positive definite, hence whitening with
'lcov'
is likely to produce an error. If the argument is FALSE
, whitening is carried out with respect to the usual sample covariance matrix. sbss
internally calls white_data
.
If more than one local covariance matrix is used sbss
jointly diagonalizes these matrices with the function frjd
. ...
provides arguments for frjd
, useful arguments might be:
eps
: tolerance for convergence.
maxiter
: maximum number of iterations.
sbss
returns a list of class 'sbss'
with the following entries:
s |
object of |
coords |
coordinates of the observations. Is |
w |
estimated unmixing matrix. |
w_inv |
inverse of the estimated unmixing matrix. |
pevals |
(pseudo-)eigenvalues for each latent field entry. |
d |
matrix of stacked (jointly) diagonalized local covariance matrices with dimension |
diags |
matrix of dimension |
x_mu |
columnmeans of |
cov_inv_sqrt |
square root of the inverse sample covariance matrix of |
Muehlmann, C., Filzmoser, P. and Nordhausen, K. (2021), Spatial Blind Source Separation in the Presence of a Drift, Submitted for publication. Preprint available at https://arxiv.org/abs/2108.13813.
Bachoc, F., Genton, M. G, Nordhausen, K., Ruiz-Gazen, A. and Virta, J. (2020), Spatial Blind Source Separation, Biometrika, 107, 627-646, doi:10.1093/biomet/asz079.
Nordhausen, K., Oja, H., Filzmoser, P., Reimann, C. (2015), Blind Source Separation for Spatial Compositional Data, Mathematical Geosciences 47, 753-770, doi:10.1007/s11004-014-9559-5.
Muehlmann, C., Bachoc, F., Nordhausen, K. and Yi, M. (2022), Test of the Latent Dimension of a Spatial Blind Source Separation Model, to appear in Statistica Sinica, doi:10.5705/ss.202021.0326.v
spatial_kernel_matrix
, local_covariance_matrix
, sp
, sf
, frjd
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # apply sbss with three ring kernels kernel_parameters <- c(0, 1, 1, 2, 2, 3) sbss_result <- sbss(field, coords, kernel_type = 'ring', kernel_parameters = kernel_parameters) # print object print(sbss_result) # plot latent field plot(sbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(sbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(sbss_result) # apply the same sbss with a kernel list kernel_list <- spatial_kernel_matrix(coords, kernel_type = 'ring', kernel_parameters) sbss_result_k <- sbss(field, kernel_list = kernel_list) # apply sbss with three ring kernels and local difference matrices sbss_result_ldiff <- sbss(field, coords, kernel_type = 'ring', kernel_parameters = kernel_parameters, lcov = 'ldiff') }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- as.matrix(cbind(field_1, field_2, field_3)) # apply sbss with three ring kernels kernel_parameters <- c(0, 1, 1, 2, 2, 3) sbss_result <- sbss(field, coords, kernel_type = 'ring', kernel_parameters = kernel_parameters) # print object print(sbss_result) # plot latent field plot(sbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(sbss_result, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(sbss_result) # apply the same sbss with a kernel list kernel_list <- spatial_kernel_matrix(coords, kernel_type = 'ring', kernel_parameters) sbss_result_k <- sbss(field, kernel_list = kernel_list) # apply sbss with three ring kernels and local difference matrices sbss_result_ldiff <- sbss(field, coords, kernel_type = 'ring', kernel_parameters = kernel_parameters, lcov = 'ldiff') }
sbss_asymp
uses asymptotic theory for the spatial blind source separation (SBSS) methodology to test if the last entries of the latent random field are white noise assuming that the
-variate observed random field follows a SBSS model.
sbss_asymp(x, ...) ## Default S3 method: sbss_asymp(x, coords, q, kernel_parameters, kernel_list = NULL, ...) ## S3 method for class 'SpatialPointsDataFrame' sbss_asymp(x, ...) ## S3 method for class 'sf' sbss_asymp(x, ...)
sbss_asymp(x, ...) ## Default S3 method: sbss_asymp(x, coords, q, kernel_parameters, kernel_list = NULL, ...) ## S3 method for class 'SpatialPointsDataFrame' sbss_asymp(x, ...) ## S3 method for class 'sf' sbss_asymp(x, ...)
x |
either a numeric matrix of dimension |
coords |
a numeric matrix of dimension |
q |
an integer between |
kernel_parameters |
a numeric vector that gives the parameters for the ring kernel function. At least length of two, see details. |
kernel_list |
a list of spatial kernel matrices with dimension |
... |
further arguments for the fast real joint diagonalization algorithm that jointly diagonalizes the local covariance matrices. See details and |
This function uses the SBSS methodology in conjunction with local covariance matrices based on ring kernel functions to estimate the -variate latent random field
, where
is the whitened version of the data and
is the estimated unmixing matrix. The considered (adapted) local covariance matrices write as
with
Where correspond to the pairwise distances between coordinates,
are the
p
random field values at location (which is the i-th row of the argument
x
and the location corresponds to the i-th row of the argument coords
) and is the sample mean vector. The function argument
kernel_parameters
determines the parameters of the used ring kernel functions or alternatively a list of kernel matrices can be given with the argument kernel_list
, see sbss
for details.
The null hypothesis specified with the argument q
states that the last components of the estimated latent field are white noise. The method orders the components of the latent field by the order of the decreasing sums of squares of the corresponding (pseudo-)eigenvalues of the local covariance matrices produced by the joint diagonalization algorithm (or the eigendecomposition if only one local covariance matrix is used). Under the null the lower right
block matrices of the jointly diagonalized local covariance matrices equal zero matrices. Therefore, the sum of their squared norms
is used as test statistic.
This function conducts the hypothesis test using the asymptotic null distribution of , a chi-squared distribution with
degrees of freedom (
is the number jointly diagonalized local covariance matrices).
If more than one local covariance matrix is used sbss_asymp
jointly diagonalizes these matrices with the function frjd
. ...
provides arguments for frjd
, useful arguments might be:
eps
: tolerance for convergence.
maxiter
: maximum number of iterations.
sbss_asymp
returns a list of class 'sbss_test'
inheriting from the classes 'htest'
and 'sbss'
with the following entries:
alternative |
a string containing the alternative hypothesis. |
method |
a string which indicates which test methods was used. |
data.name |
a string specifying the name of the used data. |
statistic |
the value of the test statistic. |
parameters |
degrees of freedom for the asymptotic chi-squared distribution of the test statistic under the null hypothesis. |
p.value |
the p-value of the test. |
s |
object of |
coords |
coordinates of the observations. Is |
w |
estimated unmixing matrix. |
w_inv |
inverse of the estimated unmixing matrix. |
d |
matrix of stacked (jointly) diagonalized local covariance matrices with dimension |
x_mu |
columnmeans of |
cov_inv_sqrt |
square root of the inverse sample covariance matrix of |
Muehlmann, C., Bachoc, F., Nordhausen, K. and Yi, M. (2022), Test of the Latent Dimension of a Spatial Blind Source Separation Model, to appear in Statistica Sinica, doi:10.5705/ss.202021.0326.
sbss
, spatial_kernel_matrix
, local_covariance_matrix
, sp
,
sf
, frjd
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- rnorm(n) field_4 <- rnorm(n) latent_field <- cbind(as.matrix(cbind(field_1, field_2)), field_3, field_4) mixing_matrix <- matrix(rnorm(16), 4, 4) observed_field <- latent_field %*% t(mixing_matrix) # apply the asymptotic test for a hypothetical latent white noise dimension of q # q can lie between 0 and 3 in this case # using one ring kernel function and the null hypothesis q = 1 asymp_res_1 <- sbss_asymp(observed_field, coords, q = 1, kernel_parameters = c(0, 1)) # using two ring kernel functions and the null hypothesis q = 3 asymp_res_2 <- sbss_asymp(observed_field, coords, q = 3, kernel_parameters = c(0, 1, 1, 2)) # the result is of class sbss_test which is inherited from htest and sbss # print object (print method for an object of class htest) print(asymp_res_1) print(asymp_res_2) # plot latent field (plot method for an object of class sbss) plot(asymp_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid (predict method for an object of class sbss) predict(asymp_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix (coef method for an object of class sbss) w_unmix <- coef(asymp_res_1) }
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- rnorm(n) field_4 <- rnorm(n) latent_field <- cbind(as.matrix(cbind(field_1, field_2)), field_3, field_4) mixing_matrix <- matrix(rnorm(16), 4, 4) observed_field <- latent_field %*% t(mixing_matrix) # apply the asymptotic test for a hypothetical latent white noise dimension of q # q can lie between 0 and 3 in this case # using one ring kernel function and the null hypothesis q = 1 asymp_res_1 <- sbss_asymp(observed_field, coords, q = 1, kernel_parameters = c(0, 1)) # using two ring kernel functions and the null hypothesis q = 3 asymp_res_2 <- sbss_asymp(observed_field, coords, q = 3, kernel_parameters = c(0, 1, 1, 2)) # the result is of class sbss_test which is inherited from htest and sbss # print object (print method for an object of class htest) print(asymp_res_1) print(asymp_res_2) # plot latent field (plot method for an object of class sbss) plot(asymp_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid (predict method for an object of class sbss) predict(asymp_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix (coef method for an object of class sbss) w_unmix <- coef(asymp_res_1) }
sbss_boot
uses bootstrap tests for the spatial blind source separation (SBSS) methodology to test if the last entries of the latent random field are white noise assuming that the
-variate observed random field follows a SBSS model.
sbss_boot(x, ...) ## Default S3 method: sbss_boot(x, coords, q, kernel_parameters, boot_method = c('permute', 'parametric'), n_boot = 200, kernel_list = NULL, ...) ## S3 method for class 'SpatialPointsDataFrame' sbss_boot(x, ...) ## S3 method for class 'sf' sbss_boot(x, ...)
sbss_boot(x, ...) ## Default S3 method: sbss_boot(x, coords, q, kernel_parameters, boot_method = c('permute', 'parametric'), n_boot = 200, kernel_list = NULL, ...) ## S3 method for class 'SpatialPointsDataFrame' sbss_boot(x, ...) ## S3 method for class 'sf' sbss_boot(x, ...)
x |
either a numeric matrix of dimension |
coords |
a numeric matrix of dimension |
q |
an integer between |
kernel_parameters |
a numeric vector that gives the parameters for the ring kernel function. At least length of two, see details. |
boot_method |
a string indicating which bootstrap strategy is used, see details. Either |
n_boot |
positive integer specifying the number of bootstrap samples. Default is |
kernel_list |
a list of spatial kernel matrices with dimension |
... |
further arguments for the fast real joint diagonalization algorithm that jointly diagonalizes the local covariance matrices. See details and |
This function uses the SBSS methodology in conjunction with local covariance matrices based on ring kernel functions to estimate the -variate latent random field
, where
is the whitened version of the data and
is the estimated unmixing matrix. The considered (adapted) local covariance matrices write as
with
Where correspond to the pairwise distances between coordinates,
are the
p
random field values at location (which is the i-th row of the argument
x
and the location corresponds to the i-th row of the argument coords
) and is the sample mean vector. The function argument
kernel_parameters
determines the parameters of the used ring kernel functions or alternatively a list of kernel matrices can be given with the argument kernel_list
, see sbss
for details.
The null hypothesis specified with the argument q
states that the last components of the estimated latent field are white noise. The method orders the components of the latent field by the order of the decreasing sums of squares of the corresponding (pseudo-)eigenvalues of the local covariance matrices produced by the joint diagonalization algorithm (or the eigendecomposition if only one local covariance matrix is used). Under the null the lower right
block matrices of the jointly diagonalized local covariance matrices equal zero matrices. Therefore, the sum of their squared norms
is used as test statistic for the bootstrap based inference methods described below.
Compute the test statistic based on the original data
.
The estimated latent field (its dimension is
c(n,p)
) is split into the signal part (first q
columns) and the white noise part (last p - q
columns).
Replace the noise part by a bootstrap sample drawn based on one of the two strategies described below.
Recombine the signal part and resampled noise part by concatenating the columns leading to and back-transform it by
.
Compute the test statistic based on
.
Repeat Step 2 - 5 for a total amount of n_boot
times (default is 200
) and the p-value of the bootstrap test is computed by
The argument boot_method
(default is "permute"
) specifies the used resample strategy. The two following strategies are implemented:
boot_method = "permute"
:
This strategy is non-parametric. It draws each bootstrap sample from the vector of all observed hypothetical white noise observations.
boot_method = "parametric"
:
This is parametric. Each bootstrap sample is drawn independently and identically from the standard normal distribution.
If more than one local covariance matrix is used sbss_boot
jointly diagonalizes these matrices with the function frjd
. ...
provides arguments for frjd
, useful arguments might be:
eps
: tolerance for convergence.
maxiter
: maximum number of iterations.
sbss_boot
returns a list of class 'sbss_test'
inheriting from the classes 'htest'
and 'sbss'
with the following entries:
alternative |
a string containing the alternative hypothesis. |
method |
a string which indicates which test methods was used. |
data.name |
a string specifying the name of the used data. |
statistic |
the value of the test statistic. |
parameters |
a integer specifying the number of generated bootstrap samples (the value of the argument |
p.value |
the p-value of the test. |
s |
object of |
coords |
coordinates of the observations. Is |
w |
estimated unmixing matrix. |
w_inv |
inverse of the estimated unmixing matrix. |
d |
matrix of stacked (jointly) diagonalized local covariance matrices with dimension |
x_mu |
columnmeans of |
cov_inv_sqrt |
square root of the inverse sample covariance matrix of |
Muehlmann, C., Bachoc, F., Nordhausen, K. and Yi, M. (2022), Test of the Latent Dimension of a Spatial Blind Source Separation Model, to appear in Statistica Sinica, doi:10.5705/ss.202021.0326.
sbss
, spatial_kernel_matrix
, local_covariance_matrix
, sp
,
sf
, frjd
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- rnorm(n) field_4 <- rnorm(n) latent_field <- cbind(as.matrix(cbind(field_1, field_2)), field_3, field_4) mixing_matrix <- matrix(rnorm(16), 4, 4) observed_field <- latent_field %*% t(mixing_matrix) # apply the bootstrap tests for a hypothetical latent white noise dimension of q # q can lie between 0 and 3 in this case # using one ring kernel function with the permute strategy # and the null hypothesis q = 1 boot_res_1 <- sbss_boot(observed_field, coords, q = 1, kernel_parameters = c(0, 1), boot_method = 'permute', n_boot = 100) # using two one ring kernel function with the parametric strategy # and the null hypothesis q = 3 boot_res_2 <- sbss_boot(observed_field, coords, q = 3, kernel_parameters = c(0, 1, 1, 2), boot_method = 'parametric', n_boot = 100) # the result is of class sbss_test which is inherited from htest and sbss # print object (print method for an object of class htest) print(boot_res_1) print(boot_res_2) # plot latent field (plot method for an object of class sbss) plot(boot_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid (predict method for an object of class sbss) predict(boot_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix (coef method for an object of class sbss) w_unmix <- coef(boot_res_1) }
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- rnorm(n) field_4 <- rnorm(n) latent_field <- cbind(as.matrix(cbind(field_1, field_2)), field_3, field_4) mixing_matrix <- matrix(rnorm(16), 4, 4) observed_field <- latent_field %*% t(mixing_matrix) # apply the bootstrap tests for a hypothetical latent white noise dimension of q # q can lie between 0 and 3 in this case # using one ring kernel function with the permute strategy # and the null hypothesis q = 1 boot_res_1 <- sbss_boot(observed_field, coords, q = 1, kernel_parameters = c(0, 1), boot_method = 'permute', n_boot = 100) # using two one ring kernel function with the parametric strategy # and the null hypothesis q = 3 boot_res_2 <- sbss_boot(observed_field, coords, q = 3, kernel_parameters = c(0, 1, 1, 2), boot_method = 'parametric', n_boot = 100) # the result is of class sbss_test which is inherited from htest and sbss # print object (print method for an object of class htest) print(boot_res_1) print(boot_res_2) # plot latent field (plot method for an object of class sbss) plot(boot_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid (predict method for an object of class sbss) predict(boot_res_1, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix (coef method for an object of class sbss) w_unmix <- coef(boot_res_1) }
snss_jd
estimates the unmixing matrix assuming a spatial non-stationary source separation model implying non-constant covariance by jointly diagonalizing at least two covariance matrices computed for corresponding different sub-domains.
snss_jd(x, ...) ## Default S3 method: snss_jd(x, coords, n_block, ordered = TRUE, ...) ## S3 method for class 'list' snss_jd(x, coords, ordered = TRUE, ...) ## S3 method for class 'SpatialPointsDataFrame' snss_jd(x, ...) ## S3 method for class 'sf' snss_jd(x, ...)
snss_jd(x, ...) ## Default S3 method: snss_jd(x, coords, n_block, ordered = TRUE, ...) ## S3 method for class 'list' snss_jd(x, coords, ordered = TRUE, ...) ## S3 method for class 'SpatialPointsDataFrame' snss_jd(x, ...) ## S3 method for class 'sf' snss_jd(x, ...)
x |
either a numeric matrix of dimension |
coords |
a numeric matrix of dimension |
n_block |
an integer defining the subdivision of the domain. See details. |
ordered |
logical. If |
... |
further arguments for the fast real joint diagonalization algorithm that jointly diagonalizes the sub-domain covariance matrices. See details and |
This function assumes that the random field is formed by
where is the deterministic
mixing matrix,
is the
-dimensional location vector,
is the observable
-variate random field given by the argument
x
, are the spatial locations given by the argument
coords
and is the latent
-variate random field assumed to consist of uncorrelated entries that have zero mean but non-constant variances. This function aims to recover
by
where is the
unmixing matrix and
is the sample mean. The function does this by splitting the given spatial domain into
n_block^2
equally sized rectangular sub-domains and jointly diagonalizing the corresponding covariance matrices for all sub-domains.
Alternatively the domain subdivision can be defined by providing lists of length K
for the arguments x
and coords
where the first list entries correspond to the values and coordinates of the first sub-domain and the second entries to the values and coordinates of the second sub-domain, etc..
snss_jd
jointly diagonalizes the covariance matrices for each sub-domain with the function frjd
. ...
provides arguments for frjd
, useful arguments might be:
eps
: tolerance for convergence.
maxiter
: maximum number of iterations.
Similarly as sbss
the function snss_jd
returns a list of class 'snss'
and 'sbss'
with the following entries:
s |
object of |
coords |
coordinates of the observations. Only given if |
w |
estimated unmixing matrix. |
w_inv |
inverse of the estimated unmixing matrix. |
d |
matrix of stacked (jointly) diagonalized sub-domain covariance matrices with dimension |
x_mu |
columnmeans of |
cov_inv_sqrt |
square root of the inverse sample covariance matrix of |
Muehlmann, C., Bachoc, F. and Nordhausen, K. (2022), Blind Source Separation for Non-Stationary Random Fields, Spatial Statistics, 47, 100574, doi:10.1016/j.spasta.2021.100574.
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) # simulate random field field_1 <- rnorm(n) field_2 <- 2 * sin(pi / 20 * coords[, 1]) * rnorm(n) field_3 <- rnorm(n) * (coords[, 1] < 10) + rnorm(n, 0, 3) * (coords[, 1] >= 10) latent_field <- cbind(field_1, field_2, field_3) mixing_matrix <- matrix(rnorm(9), 3, 3) observed_field <- latent_field observed_field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(observed_field)) sp::spplot(observed_field_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply snss_jd with 4 sub-domains res_4 <- snss_jd(observed_field, coords, n_block = 2) JADE::MD(W.hat = coef(res_4), A = mixing_matrix) # apply snss_jd with 9 sub-domains res_9 <- snss_jd(observed_field, coords, n_block = 3) JADE::MD(W.hat = coef(res_9), A = mixing_matrix) cor(res_9$s, latent_field) # print object print(res_4) # plot latent field plot(res_4, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(res_4, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(res_4) # apply snss_jd with SpatialPointsDataFrame object res_4_sp <- snss_jd(observed_field_sp, n_block = 2) # apply with list arguments # first axis split by 5 # second axis split by 10 # results in 4 sub-domains flag_x <- coords[, 1] < 5 flag_y <- coords[, 2] < 10 coords_list <- list(coords[flag_x & flag_y, ], coords[!flag_x & flag_y, ], coords[flag_x & !flag_y, ], coords[!flag_x & !flag_y, ]) field_list <- list(observed_field[flag_x & flag_y, ], observed_field[!flag_x & flag_y, ], observed_field[flag_x & !flag_y, ], observed_field[!flag_x & !flag_y, ]) plot(coords, col = 1) points(coords_list[[2]], col = 2) points(coords_list[[3]], col = 3) points(coords_list[[4]], col = 4) res_list <- snss_jd(x = field_list, coords = coords_list) plot(res_list, colorkey = TRUE, as.table = TRUE, cex = 1) JADE::MD(W.hat = coef(res_list), A = mixing_matrix)
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) # simulate random field field_1 <- rnorm(n) field_2 <- 2 * sin(pi / 20 * coords[, 1]) * rnorm(n) field_3 <- rnorm(n) * (coords[, 1] < 10) + rnorm(n, 0, 3) * (coords[, 1] >= 10) latent_field <- cbind(field_1, field_2, field_3) mixing_matrix <- matrix(rnorm(9), 3, 3) observed_field <- latent_field observed_field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(observed_field)) sp::spplot(observed_field_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply snss_jd with 4 sub-domains res_4 <- snss_jd(observed_field, coords, n_block = 2) JADE::MD(W.hat = coef(res_4), A = mixing_matrix) # apply snss_jd with 9 sub-domains res_9 <- snss_jd(observed_field, coords, n_block = 3) JADE::MD(W.hat = coef(res_9), A = mixing_matrix) cor(res_9$s, latent_field) # print object print(res_4) # plot latent field plot(res_4, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(res_4, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(res_4) # apply snss_jd with SpatialPointsDataFrame object res_4_sp <- snss_jd(observed_field_sp, n_block = 2) # apply with list arguments # first axis split by 5 # second axis split by 10 # results in 4 sub-domains flag_x <- coords[, 1] < 5 flag_y <- coords[, 2] < 10 coords_list <- list(coords[flag_x & flag_y, ], coords[!flag_x & flag_y, ], coords[flag_x & !flag_y, ], coords[!flag_x & !flag_y, ]) field_list <- list(observed_field[flag_x & flag_y, ], observed_field[!flag_x & flag_y, ], observed_field[flag_x & !flag_y, ], observed_field[!flag_x & !flag_y, ]) plot(coords, col = 1) points(coords_list[[2]], col = 2) points(coords_list[[3]], col = 3) points(coords_list[[4]], col = 4) res_list <- snss_jd(x = field_list, coords = coords_list) plot(res_list, colorkey = TRUE, as.table = TRUE, cex = 1) JADE::MD(W.hat = coef(res_list), A = mixing_matrix)
snss_sd
estimates the unmixing matrix assuming a spatial non-stationary source separation model implying non-constant covariance by simultaneously diagonalizing two covariance matrices computed for two corresponding different sub-domains.
snss_sd(x, ...) ## Default S3 method: snss_sd(x, coords, direction = c('x', 'y'), ordered = TRUE, ...) ## S3 method for class 'list' snss_sd(x, coords, ordered = TRUE, ...) ## S3 method for class 'SpatialPointsDataFrame' snss_sd(x, ...) ## S3 method for class 'sf' snss_sd(x, ...)
snss_sd(x, ...) ## Default S3 method: snss_sd(x, coords, direction = c('x', 'y'), ordered = TRUE, ...) ## S3 method for class 'list' snss_sd(x, coords, ordered = TRUE, ...) ## S3 method for class 'SpatialPointsDataFrame' snss_sd(x, ...) ## S3 method for class 'sf' snss_sd(x, ...)
x |
either a numeric matrix of dimension |
coords |
a numeric matrix of dimension |
direction |
a string indicating on which coordinate axis the domain is halved. Either |
ordered |
logical. If |
... |
further arguments to be passed to or from methods. |
This function assumes that the random field is formed by
where is the deterministic
mixing matrix,
is the
-dimensional location vector,
is the observable
-variate random field given by the argument
x
, are the spatial locations given by the argument
coords
and is the latent
-variate random field assumed to consist of uncorrelated entries that have zero mean but non-constant variances. This function aims to recover
by
where is the
unmixing matrix and
is the sample mean. The function does this by splitting the given spatial domain in half according to the first coordinate (argument
direction
equals 'x'
) or the second coodinate (argument direction
equals 'y'
) and simultaneously diagonalizing the sample covariance matrices for each of the two sub-domains.
Alternatively the domain subdivison can be defined by providing lists of length two for the arguments x
and coords
where the first list entries correspond to the values and coordinates of the first sub-domain and the second entries to the values and coordinates of the second sub-domain.
Similarly as sbss
the function snss_sd
returns a list of class 'snss'
and 'sbss'
with the following entries:
s |
object of |
coords |
coordinates of the observations. Only given if |
w |
estimated unmixing matrix. |
w_inv |
inverse of the estimated unmixing matrix. |
d |
diagonal matrix containing the eigenvalues of the eigendecomposition. |
x_mu |
columnmeans of |
cov_inv_sqrt |
square root of the inverse sample covariance matrix for the first sub-domain. |
Muehlmann, C., Bachoc, F. and Nordhausen, K. (2022), Blind Source Separation for Non-Stationary Random Fields, Spatial Statistics, 47, 100574, doi:10.1016/j.spasta.2021.100574.
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) # simulate random field field_1 <- rnorm(n) field_2 <- 2 * sin(pi / 20 * coords[, 1]) * rnorm(n) field_3 <- rnorm(n) * (coords[, 1] < 10) + rnorm(n, 0, 3) * (coords[, 1] >= 10) latent_field <- cbind(field_1, field_2, field_3) mixing_matrix <- matrix(rnorm(9), 3, 3) observed_field <- latent_field %*% t(mixing_matrix) observed_field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(observed_field)) sp::spplot(observed_field_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply snss_sd with split in x res_x <- snss_sd(observed_field, coords, direction = 'x') JADE::MD(W.hat = coef(res_x), A = mixing_matrix) # apply snss_sd with split in y # should be much worse as field shows only variation in x res_y <- snss_sd(observed_field, coords, direction = 'y') JADE::MD(W.hat = coef(res_y), A = mixing_matrix) # print object print(res_x) # plot latent field plot(res_x, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(res_x, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(res_x) # apply snss_sd with SpatialPointsDataFrame object res_x_sp <- snss_sd(observed_field_sp, direction = 'x') # apply with list arguments # first axis split by 5 flag_coords <- coords[, 1] < 5 coords_list <- list(coords[flag_coords, ], coords[!flag_coords, ]) field_list <- list(observed_field[flag_coords, ], observed_field[!flag_coords, ]) plot(coords, col = flag_coords + 1) res_list <- snss_sd(x = field_list, coords = coords_list) plot(res_list, colorkey = TRUE, as.table = TRUE, cex = 1) JADE::MD(W.hat = coef(res_list), A = mixing_matrix)
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) # simulate random field field_1 <- rnorm(n) field_2 <- 2 * sin(pi / 20 * coords[, 1]) * rnorm(n) field_3 <- rnorm(n) * (coords[, 1] < 10) + rnorm(n, 0, 3) * (coords[, 1] >= 10) latent_field <- cbind(field_1, field_2, field_3) mixing_matrix <- matrix(rnorm(9), 3, 3) observed_field <- latent_field %*% t(mixing_matrix) observed_field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(observed_field)) sp::spplot(observed_field_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply snss_sd with split in x res_x <- snss_sd(observed_field, coords, direction = 'x') JADE::MD(W.hat = coef(res_x), A = mixing_matrix) # apply snss_sd with split in y # should be much worse as field shows only variation in x res_y <- snss_sd(observed_field, coords, direction = 'y') JADE::MD(W.hat = coef(res_y), A = mixing_matrix) # print object print(res_x) # plot latent field plot(res_x, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(res_x, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(res_x) # apply snss_sd with SpatialPointsDataFrame object res_x_sp <- snss_sd(observed_field_sp, direction = 'x') # apply with list arguments # first axis split by 5 flag_coords <- coords[, 1] < 5 coords_list <- list(coords[flag_coords, ], coords[!flag_coords, ]) field_list <- list(observed_field[flag_coords, ], observed_field[!flag_coords, ]) plot(coords, col = flag_coords + 1) res_list <- snss_sd(x = field_list, coords = coords_list) plot(res_list, colorkey = TRUE, as.table = TRUE, cex = 1) JADE::MD(W.hat = coef(res_list), A = mixing_matrix)
snss_sjd
estimates the unmixing matrix assuming a spatial non-stationary source separation model implying non-constant (spatial) covariance by jointly diagonalizing several covariance and/or spatial covariance matrices computed for a subdivision of the spatial domain into at least two sub-domains.
snss_sjd(x, ...) ## Default S3 method: snss_sjd(x, coords, n_block, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, with_cov = TRUE, lcov = c('lcov', 'ldiff', 'lcov_norm'), ordered = TRUE, ...) ## S3 method for class 'list' snss_sjd(x, coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, with_cov = TRUE, lcov = c('lcov', 'ldiff', 'lcov_norm'), ordered = TRUE, ...) ## S3 method for class 'SpatialPointsDataFrame' snss_sjd(x, ...) ## S3 method for class 'sf' snss_sjd(x, ...)
snss_sjd(x, ...) ## Default S3 method: snss_sjd(x, coords, n_block, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, with_cov = TRUE, lcov = c('lcov', 'ldiff', 'lcov_norm'), ordered = TRUE, ...) ## S3 method for class 'list' snss_sjd(x, coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters, with_cov = TRUE, lcov = c('lcov', 'ldiff', 'lcov_norm'), ordered = TRUE, ...) ## S3 method for class 'SpatialPointsDataFrame' snss_sjd(x, ...) ## S3 method for class 'sf' snss_sjd(x, ...)
x |
either a numeric matrix of dimension |
coords |
a numeric matrix of dimension |
n_block |
either be an integer defining the subdivision of the domain, |
kernel_type |
a string indicating which kernel function to use. Either |
kernel_parameters |
a numeric vector that gives the parameters for the kernel function. At least length of one for |
with_cov |
logical. If |
lcov |
a string indicating which type of local covariance matrix to use. Either |
ordered |
logical. If |
... |
further arguments for the fast real joint diagonalization algorithm that jointly diagonalizes the sub-domain covariance matrices. See details and |
This function assumes that the random field is formed by
where is the deterministic
mixing matrix,
is the
-dimensional location vector,
is the observable
-variate random field given by the argument
x
, are the spatial locations given by the argument
coords
and is the latent
-variate random field assumed to consist of uncorrelated entries that have zero mean but non-constant (spatial) second order dependence. This function aims to recover
by
where is the
unmixing matrix and
is the sample mean. The function does this by splitting the given spatial domain into
n_block^2
equally sized rectangular sub-domains and jointly diagonalizing the corresponding spatial covariance matrices for all sub-domains. If the argument with_cov
equals TRUE
(default) then additionally also the sample covariance matrices for each sub-domain are included in the joint diagonalization procedure.
The arguments kernel_type
, kernel_parameters
and lcov
determine which spatial kernel functions and which type of local covariance matrices are used for each sub-domain. The usage is equal to the function sbss
.
Alternatively the domain subdivision can be defined by providing lists of length K
for the arguments x
and coords
where the first list entries correspond to the values and coordinates of the first sub-domain and the second entries to the values and coordinates of the second sub-domain, etc.. The argument n_block
might be 'x'
or 'y'
indicating a split across the x or y coordinates similar as done by the function snss_sd
.
snss_sjd
jointly diagonalizes the covariance matrices for each sub-domain with the function frjd
. ...
provides arguments for frjd
, useful arguments might be:
eps
: tolerance for convergence.
maxiter
: maximum number of iterations.
Similarly as sbss
the function snss_jd
returns a list of class 'snss'
and 'sbss'
with the following entries:
s |
object of |
coords |
coordinates of the observations. Only given if |
w |
estimated unmixing matrix. |
w_inv |
inverse of the estimated unmixing matrix. |
d |
matrix of stacked (jointly) diagonalized sub-domain covariance and/or local covariance matrices. |
x_mu |
columnmeans of |
cov_inv_sqrt |
square root of the inverse sample covariance matrix of |
Muehlmann, C., Bachoc, F. and Nordhausen, K. (2022), Blind Source Separation for Non-Stationary Random Fields, Spatial Statistics, 47, 100574, doi:10.1016/j.spasta.2021.100574.
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) # simulate random field field_1 <- rnorm(n) field_2 <- 2 * sin(pi / 20 * coords[, 1]) * rnorm(n) field_3 <- rnorm(n) * (coords[, 1] < 10) + rnorm(n, 0, 3) * (coords[, 1] >= 10) latent_field <- cbind(field_1, field_2, field_3) mixing_matrix <- matrix(rnorm(9), 3, 3) observed_field <- latent_field observed_field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(observed_field)) sp::spplot(observed_field_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply snss_sjd with 4 sub-domains # one ring kernel per sub-domain # without covariances res_4_ball <- snss_sjd(observed_field, coords, n_block = 2, kernel_type = 'ball', kernel_parameters = c(0, 2), with_cov = TRUE) JADE::MD(W.hat = coef(res_4_ball), A = mixing_matrix) # apply snss_sjd with split across y # one ring kernel per sub-domain # without covariances # should not work as field does not show spatial dependence res_4_ring <- snss_sjd(observed_field, coords, n_block = 'y', kernel_type = 'ring', kernel_parameters = c(0, 2), with_cov = FALSE) JADE::MD(W.hat = coef(res_4_ring), A = mixing_matrix) # print object print(res_4_ball) # plot latent field plot(res_4_ball, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(res_4_ball, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(res_4_ball) # apply snss_jd with SpatialPointsDataFrame object res_4_ball_sp <- snss_sjd(observed_field_sp, n_block = 2, kernel_type = 'ball', kernel_parameters = c(0, 2), with_cov = TRUE) # apply with list arguments # first axis split by 5 # second axis split by 10 # results in 4 sub-domains flag_x <- coords[, 1] < 5 flag_y <- coords[, 2] < 10 coords_list <- list(coords[flag_x & flag_y, ], coords[!flag_x & flag_y, ], coords[flag_x & !flag_y, ], coords[!flag_x & !flag_y, ]) field_list <- list(observed_field[flag_x & flag_y, ], observed_field[!flag_x & flag_y, ], observed_field[flag_x & !flag_y, ], observed_field[!flag_x & !flag_y, ]) plot(coords, col = 1) points(coords_list[[2]], col = 2) points(coords_list[[3]], col = 3) points(coords_list[[4]], col = 4) res_list <- snss_sjd(x = field_list, coords = coords_list, kernel_type = 'ring', kernel_parameters = c(0, 2)) plot(res_list, colorkey = TRUE, as.table = TRUE, cex = 1) JADE::MD(W.hat = coef(res_list), A = mixing_matrix)
# simulate coordinates n <- 1000 coords <- runif(n * 2) * 20 dim(coords) <- c(n, 2) # simulate random field field_1 <- rnorm(n) field_2 <- 2 * sin(pi / 20 * coords[, 1]) * rnorm(n) field_3 <- rnorm(n) * (coords[, 1] < 10) + rnorm(n, 0, 3) * (coords[, 1] >= 10) latent_field <- cbind(field_1, field_2, field_3) mixing_matrix <- matrix(rnorm(9), 3, 3) observed_field <- latent_field observed_field_sp <- sp::SpatialPointsDataFrame(coords = coords, data = data.frame(observed_field)) sp::spplot(observed_field_sp, colorkey = TRUE, as.table = TRUE, cex = 1) # apply snss_sjd with 4 sub-domains # one ring kernel per sub-domain # without covariances res_4_ball <- snss_sjd(observed_field, coords, n_block = 2, kernel_type = 'ball', kernel_parameters = c(0, 2), with_cov = TRUE) JADE::MD(W.hat = coef(res_4_ball), A = mixing_matrix) # apply snss_sjd with split across y # one ring kernel per sub-domain # without covariances # should not work as field does not show spatial dependence res_4_ring <- snss_sjd(observed_field, coords, n_block = 'y', kernel_type = 'ring', kernel_parameters = c(0, 2), with_cov = FALSE) JADE::MD(W.hat = coef(res_4_ring), A = mixing_matrix) # print object print(res_4_ball) # plot latent field plot(res_4_ball, colorkey = TRUE, as.table = TRUE, cex = 1) # predict latent fields on grid predict(res_4_ball, colorkey = TRUE, as.table = TRUE, cex = 1) # unmixing matrix w_unmix <- coef(res_4_ball) # apply snss_jd with SpatialPointsDataFrame object res_4_ball_sp <- snss_sjd(observed_field_sp, n_block = 2, kernel_type = 'ball', kernel_parameters = c(0, 2), with_cov = TRUE) # apply with list arguments # first axis split by 5 # second axis split by 10 # results in 4 sub-domains flag_x <- coords[, 1] < 5 flag_y <- coords[, 2] < 10 coords_list <- list(coords[flag_x & flag_y, ], coords[!flag_x & flag_y, ], coords[flag_x & !flag_y, ], coords[!flag_x & !flag_y, ]) field_list <- list(observed_field[flag_x & flag_y, ], observed_field[!flag_x & flag_y, ], observed_field[flag_x & !flag_y, ], observed_field[!flag_x & !flag_y, ]) plot(coords, col = 1) points(coords_list[[2]], col = 2) points(coords_list[[3]], col = 3) points(coords_list[[4]], col = 4) res_list <- snss_sjd(x = field_list, coords = coords_list, kernel_type = 'ring', kernel_parameters = c(0, 2)) plot(res_list, colorkey = TRUE, as.table = TRUE, cex = 1) JADE::MD(W.hat = coef(res_list), A = mixing_matrix)
spatial_kernel_matrix
computes spatial kernel matrices for a given kernel function with its parameters and a set of coordinates.
spatial_kernel_matrix(coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters)
spatial_kernel_matrix(coords, kernel_type = c('ring', 'ball', 'gauss'), kernel_parameters)
coords |
a numeric matrix of dimension |
kernel_type |
a character string indicating which kernel function to use. Either |
kernel_parameters |
a numeric vector that gives the parameters for the kernel function. At least length of one for |
Two versions of local covariance matrices can be defined:
'lcov'
:
'ldiff'
:
'lcov_norm'
:
with
Where correspond to the pairwise distances between coordinates,
are the
p
random field values at location ,
is the sample mean vector, and the kernel function
determines the locality. The function
spatial_kernel_matrix
computes a list of c(n,n)
matrices where each entry of these matrices correspond to the spatial kernel function evaluated at the distance between two points, mathematically the entry ij of each kernel matrix is . The following kernel functions are implemented and chosen with the argument
kernel_type
:
'ring'
: parameters are inner radius and outer radius
, with
, and
:
'ball'
: parameter is the radius , with
:
'gauss'
: Gaussian function where 95% of the mass is inside the parameter , with
:
The argument kernel_type
determines the used kernel function as presented above, the argument kernel_parameters
gives the corresponding parameters for the kernel function. Specifically, if kernel_type
equals 'ball'
or 'gauss'
then kernel_parameters
is a numeric vector where each entry corresponds to one parameter. Hence, length(kernel_parameters)
spatial kernel matrices of type kernel_type
are computed. Whereas, if kernel_type
equals 'ring'
, then kernel_parameters
must be a numeric vector of even length where subsequently the inner and outer radii must be given (informally: c(r_i1, r_o1, r_i2, r_o2, ...)
). In that case length(kernel_parameters) / 2
spatial kernel matrices of type 'ring'
are computed.
The output of this function can be used with the function sbss
to avoid unnecessary computation of kernel matrices when sbss
is called multiple times with the same coordinate/kernel function setting. Additionally, the output can be used with the function local_covariance_matrix
to actually compute local covariance matrices as defined above based on a given set of spatial kernel matrices.
spatial_kernel_matrix
returns a list with length of length(kernel_parameters)
(for 'ball'
and 'gauss'
kernel functions) or length(kernel_parameters) / 2
(for 'ring'
kernel function) containing numeric matrices of dimension c(n,n)
corresponding to the spatial kernel matrices.
Muehlmann, C., Filzmoser, P. and Nordhausen, K. (2021), Spatial Blind Source Separation in the Presence of a Drift, Submitted for publication. Preprint available at https://arxiv.org/abs/2108.13813.
Bachoc, F., Genton, M. G, Nordhausen, K., Ruiz-Gazen, A. and Virta, J. (2020), Spatial Blind Source Separation, Biometrika, 107, 627-646, doi:10.1093/biomet/asz079.
# simulate a set of coordinates coords <- rnorm(100 * 2) dim(coords) <- c(100, 2) # computing two ring kernel matrices kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) # computing three ball kernel matrices kernel_params_ball <- c(0.5, 1, 2) ball_kernel_list <- spatial_kernel_matrix(coords, 'ball', kernel_params_ball) # computing three gauss kernel matrices kernel_params_gauss <- c(0.5, 1, 2) gauss_kernel_list <- spatial_kernel_matrix(coords, 'gauss', kernel_params_gauss)
# simulate a set of coordinates coords <- rnorm(100 * 2) dim(coords) <- c(100, 2) # computing two ring kernel matrices kernel_params_ring <- c(0, 0.5, 0.5, 2) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) # computing three ball kernel matrices kernel_params_ball <- c(0.5, 1, 2) ball_kernel_list <- spatial_kernel_matrix(coords, 'ball', kernel_params_ball) # computing three gauss kernel matrices kernel_params_gauss <- c(0.5, 1, 2) gauss_kernel_list <- spatial_kernel_matrix(coords, 'gauss', kernel_params_gauss)
white_data
whites the data with respect to the sample covariance matrix, or different spatial scatter matrices.
white_data(x, whitening = c("standard", "rob", "hr"), lcov = c('lcov', 'ldiff', 'lcov_norm'), kernel_mat = numeric(0))
white_data(x, whitening = c("standard", "rob", "hr"), lcov = c('lcov', 'ldiff', 'lcov_norm'), kernel_mat = numeric(0))
x |
a numeric matrix of dimension |
whitening |
a string indicating the whitening method. If |
lcov |
a string indicating which type of local covariance matrix is used for whitening, when the whitening method |
kernel_mat |
a spatial kernel matrix with dimension |
The inverse square root of a positive definite matrix with eigenvalue decomposition
is defined as
.
white_data
whitens the data by where
is a location functional of
and the matrix
is a scatter functional. If the argument
whitening
is 'standard'
, is the sample covariance matrix and
is a vector of column means of
. If the argument
whitening
is 'hr'
, the Hettmansperger-Randles location and scatter estimates (Hettmansperger & Randles, 2002) are used as location functional and scatter functional
. The Hettmansperger-Randles location and scatter estimates are robust variants of sample mean and covariance matrices, that are used for whitening in
robsbss
. If the argument whitening
is 'rob'
, the argument lcov
determines the scatter functional to be one of the following local scatter matrices:
'lcov'
:
'ldiff'
:
'lcov_norm'
:
with
where correspond to the pairwise distances between coordinates,
are the
p
random field values at location ,
is the sample mean vector, and the kernel function
determines the locality. The choice
'lcov_norm'
is useful when testing for the actual signal dimension of the latent field, see sbss_asymp
and sbss_boot
. See also sbss
for details.
Note that are usually not positive definite, therefore in that case the matrix cannot be inverted and an error is produced. Whitening with
matrices might be favorable in the presence of spatially uncorrelated noise, and whitening with
might be favorable when a non-constant smooth drift is present in the data.
The argument kernel_mat
is a matrix of dimension c(n,n)
where each entry corresponds to the spatial kernel function evaluated at the distance between two sample locations, mathematically the entry ij of each kernel matrix is . This matrix is usually computed with the function
spatial_kernel_matrix
.
white_data
returns a list with the following entries:
mu |
a numeric vector of length |
x_0 |
a numeric matrix of dimension |
x_w |
a numeric matrix of dimension |
s |
a numeric matrix of dimension |
s_inv_sqrt |
a numeric matrix of dimension |
s_sqrt |
a numeric matrix of dimension |
Muehlmann, C., Filzmoser, P. and Nordhausen, K. (2021), Spatial Blind Source Separation in the Presence of a Drift, Submitted for publication. Preprint available at https://arxiv.org/abs/2108.13813.
Bachoc, F., Genton, M. G, Nordhausen, K., Ruiz-Gazen, A. and Virta, J. (2020), Spatial Blind Source Separation, Biometrika, 107, 627-646, doi:10.1093/biomet/asz079.
Hettmansperger, T. P., & Randles, R. H. (2002). A practical affine equivariant multivariate median. Biometrika, 89 , 851-860. doi:10.1093/biomet/89.4.851.
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) X <- as.matrix(field) # white the data with the usual sample covariance x_w_1 <- white_data(X) # white the data with a ldiff matrix and ring kernel kernel_params_ring <- c(0, 1) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) x_w_2 <- white_data(field, whitening = 'rob', lcov = 'ldiff', kernel_mat = ring_kernel_list[[1]]) # Generate 5 % of global outliers to data field_cont <- gen_glob_outl(field)[,1:3] X <- as.matrix(field_cont) # white the data using Hettmansperger-Randles location and scatter estimates x_w_3 <- white_data(X, whitening = 'hr') }
# simulate coordinates coords <- runif(1000 * 2) * 20 dim(coords) <- c(1000, 2) coords_df <- as.data.frame(coords) names(coords_df) <- c("x", "y") # simulate random field if (!requireNamespace('gstat', quietly = TRUE)) { message('Please install the package gstat to run the example code.') } else { library(gstat) model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20) model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), nmax = 20) model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20) field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1 field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1 field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1 field <- cbind(field_1, field_2, field_3) X <- as.matrix(field) # white the data with the usual sample covariance x_w_1 <- white_data(X) # white the data with a ldiff matrix and ring kernel kernel_params_ring <- c(0, 1) ring_kernel_list <- spatial_kernel_matrix(coords, 'ring', kernel_params_ring) x_w_2 <- white_data(field, whitening = 'rob', lcov = 'ldiff', kernel_mat = ring_kernel_list[[1]]) # Generate 5 % of global outliers to data field_cont <- gen_glob_outl(field)[,1:3] X <- as.matrix(field_cont) # white the data using Hettmansperger-Randles location and scatter estimates x_w_3 <- white_data(X, whitening = 'hr') }