Title: | Uniformity Tests on the Circle, Sphere, and Hypersphere |
---|---|
Description: | Implementation of uniformity tests on the circle and (hyper)sphere. The main function of the package is unif_test(), which conveniently collects more than 35 tests for assessing uniformity on S^{p-1} = {x in R^p : ||x|| = 1}, p >= 2. The test statistics are implemented in the unif_stat() function, which allows computing several statistics for different samples within a single call, thus facilitating Monte Carlo experiments. Furthermore, the unif_stat_MC() function allows parallelizing them in a simple way. The asymptotic null distributions of the statistics are available through the function unif_stat_distr(). The core of 'sphunif' is coded in C++ by relying on the 'Rcpp' package. The package also provides several novel datasets and gives the replicability for the data applications/simulations in García-Portugués et al. (2021) <doi:10.1007/978-3-030-69944-4_12>, García-Portugués et al. (2023) <doi:10.3150/21-BEJ1454>, García-Portugués et al. (2024) <doi:10.48550/arXiv.2108.09874>, and Fernández-de-Marcos and García-Portugués (2024) <doi:10.48550/arXiv.2405.13531>. |
Authors: | Eduardo García-Portugués [aut, cre] , Thomas Verdebout [aut] , Alberto Fernández-de-Marcos [ctb], Paula Navarro [ctb] |
Maintainer: | Eduardo García-Portugués <[email protected]> |
License: | GPL-3 |
Version: | 1.4.0 |
Built: | 2024-11-21 06:47:38 UTC |
Source: | CRAN |
sphunif
: Uniformity Tests on the Circle, Sphere, and
HypersphereImplementation of uniformity tests on the circle and
(hyper)sphere. The main function of the package is unif_test
,
which conveniently collects more than 35 tests for assessing uniformity on
,
. The test statistics are
implemented in the
unif_stat
function, which allows computing
several statistics for different samples within a single call, thus
facilitating Monte Carlo experiments. Furthermore, the
unif_stat_MC
function allows parallelizing them in
a simple way. The asymptotic null distributions of the statistics are
available through the function unif_stat_distr
. The core of
sphunif-package
is coded in C++ by relying on the
Rcpp-package
. The package also provides several
novel datasets and gives the replicability for the data applications/
simulations in García-Portugués et al. (2021)
<doi:10.1007/978-3-030-69944-4_12>, García-Portugués et al. (2023)
<doi:10.3150/21-BEJ1454>, García-Portugués et al. (2024)
<doi:10.48550/arXiv.2108.09874>, and Fernández-de-Marcos and
García-Portugués (2024) <doi:10.48550/arXiv.405.13531>.
Eduardo García-Portugués and Thomas Verdebout.
Fernández-de-Marcos, A. and García-Portugués, E. (2024) A stereographic test of spherical uniformity. arXiv:2405.13531. doi:10.48550/arXiv.2405.13531.
García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.
García-Portugués, E., Navarro-Esteban, P., Cuesta-Albertos, J. A. (2023) On a projection-based class of uniformity tests on the hypersphere. Bernoulli, 29(1):181–204. doi:10.3150/21-BEJ1454.
García-Portugués, E., Navarro-Esteban, P., and Cuesta-Albertos, J. A. (2021). A Cramér–von Mises test of uniformity on the hypersphere. In Balzano, S., Porzio, G. C., Salvatore, R., Vistocco, D., and Vichi, M. (Eds.), Statistical Learning and Modeling in Data Analysis, Studies in Classification, Data Analysis and Knowledge Organization, pp. 107–116. Springer, Cham. doi:10.1007/978-3-030-69944-4_12.
García-Portugués, E., Paindaveine, D., and Verdebout, T. (2024). On a class of Sobolev tests for symmetry of directions, their detection thresholds, and asymptotic powers. arXiv:2108.09874v2. doi:10.48550/arXiv.2108.09874.
Computation of
where ,
, and
is the surface area of
.
is the proportion of surface area
of
covered by the intersection of two hyperspherical caps
centered at
and
and with
common solid angle
.
A_theta_x(theta, x, p, N = 160L, as_matrix = TRUE)
A_theta_x(theta, x, p, N = 160L, as_matrix = TRUE)
theta |
vector with values in |
x |
vector with values in |
p |
integer giving the dimension of the ambient space |
N |
number of points used in the
Gauss-Legendre quadrature. Defaults to
|
as_matrix |
return a matrix with the values of |
See García-Portugués et al. (2023) for more details about the
function.
A matrix of size c(length(theta), length(x))
containing the
evaluation of if
as_matrix = TRUE
. Otherwise,
a vector of size c(length(theta)
if theta
and x
equal
in size.
García-Portugués, E., Navarro-Esteban, P., Cuesta-Albertos, J. A. (2023) On a projection-based class of uniformity tests on the hypersphere. Bernoulli, 29(1):181–204. doi:10.3150/21-BEJ1454.
# Plot A_x(theta) for several dimensions and x's A_lines <- function(x, th = seq(0, pi, l = 200)) { plot(th, A_theta_x(theta = th, x = x, p = 2), type = "l", col = 1, ylim = c(0, 1.25), main = paste("x =", x), ylab = expression(A[x](theta)), xlab = expression(theta), axes = FALSE) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() abline(h = c(0, 1), lty = 2) lines(th, A_theta_x(theta = th, x = x, p = 3), col = 2) lines(th, A_theta_x(theta = th, x = x, p = 4), col = 3) lines(th, A_theta_x(theta = th, x = x, p = 5), col = 4) legend("top", lwd = 2, legend = paste("p =", 2:5), col = 1:4, cex = 0.75, horiz = TRUE) } old_par <- par(mfrow = c(2, 3)) A_lines(x = -0.75) A_lines(x = -0.25) A_lines(x = 0) A_lines(x = 0.25) A_lines(x = 0.5) A_lines(x = 0.75) par(old_par) # As surface of (theta, x) for several dimensions A_surf <- function(p, x = seq(-1, 1, l = 201), th = seq(0, pi, l = 201)) { col <- c("white", viridisLite::viridis(20)) breaks <- c(-1, seq(1e-15, 1, l = 21)) A <- A_theta_x(theta = th, x = x, p = p) image(th, x, A, main = paste("p =", p), col = col, breaks = breaks, xlab = expression(theta), axes = FALSE) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() contour(th, x, A, levels = breaks, add = TRUE) } old_par <- par(mfrow = c(2, 2)) A_surf(p = 2) A_surf(p = 3) A_surf(p = 4) A_surf(p = 5) par(old_par) # No matrix return th <- seq(0, pi, l = 5) x <- seq(-1, 1, l = 5) diag(A_theta_x(theta = th, x = x, p = 2)) A_theta_x(theta = th, x = x, p = 2, as_matrix = FALSE)
# Plot A_x(theta) for several dimensions and x's A_lines <- function(x, th = seq(0, pi, l = 200)) { plot(th, A_theta_x(theta = th, x = x, p = 2), type = "l", col = 1, ylim = c(0, 1.25), main = paste("x =", x), ylab = expression(A[x](theta)), xlab = expression(theta), axes = FALSE) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() abline(h = c(0, 1), lty = 2) lines(th, A_theta_x(theta = th, x = x, p = 3), col = 2) lines(th, A_theta_x(theta = th, x = x, p = 4), col = 3) lines(th, A_theta_x(theta = th, x = x, p = 5), col = 4) legend("top", lwd = 2, legend = paste("p =", 2:5), col = 1:4, cex = 0.75, horiz = TRUE) } old_par <- par(mfrow = c(2, 3)) A_lines(x = -0.75) A_lines(x = -0.25) A_lines(x = 0) A_lines(x = 0.25) A_lines(x = 0.5) A_lines(x = 0.75) par(old_par) # As surface of (theta, x) for several dimensions A_surf <- function(p, x = seq(-1, 1, l = 201), th = seq(0, pi, l = 201)) { col <- c("white", viridisLite::viridis(20)) breaks <- c(-1, seq(1e-15, 1, l = 21)) A <- A_theta_x(theta = th, x = x, p = p) image(th, x, A, main = paste("p =", p), col = col, breaks = breaks, xlab = expression(theta), axes = FALSE) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() contour(th, x, A, levels = breaks, add = TRUE) } old_par <- par(mfrow = c(2, 2)) A_surf(p = 2) A_surf(p = 3) A_surf(p = 4) A_surf(p = 5) par(old_par) # No matrix return th <- seq(0, pi, l = 5) x <- seq(-1, 1, l = 5) diag(A_theta_x(theta = th, x = x, p = 2)) A_theta_x(theta = th, x = x, p = 2, as_matrix = FALSE)
Transforms the angles in
into the Cartesian coordinates
of , and vice versa.
angles_to_sphere(theta) sphere_to_angles(x)
angles_to_sphere(theta) sphere_to_angles(x)
theta |
matrix of size |
x |
matrix of size |
For angles_to_sphere
, the matrix x
. For
sphere_to_angles
, the matrix theta
.
# Check changes of coordinates sphere_to_angles(angles_to_sphere(c(pi / 2, 0, pi))) sphere_to_angles(angles_to_sphere(rbind(c(pi / 2, 0, pi), c(pi, pi / 2, 0)))) angles_to_sphere(sphere_to_angles(c(0, sqrt(0.5), sqrt(0.1), sqrt(0.4)))) angles_to_sphere(sphere_to_angles( rbind(c(0, sqrt(0.5), sqrt(0.1), sqrt(0.4)), c(0, sqrt(0.5), sqrt(0.5), 0), c(0, 1, 0, 0), c(0, 0, 0, -1), c(0, 0, 1, 0)))) # Circle sphere_to_angles(angles_to_sphere(0)) sphere_to_angles(angles_to_sphere(cbind(0:3))) angles_to_sphere(cbind(sphere_to_angles(rbind(c(0, 1), c(1, 0))))) angles_to_sphere(cbind(sphere_to_angles(rbind(c(0, 1)))))
# Check changes of coordinates sphere_to_angles(angles_to_sphere(c(pi / 2, 0, pi))) sphere_to_angles(angles_to_sphere(rbind(c(pi / 2, 0, pi), c(pi, pi / 2, 0)))) angles_to_sphere(sphere_to_angles(c(0, sqrt(0.5), sqrt(0.1), sqrt(0.4)))) angles_to_sphere(sphere_to_angles( rbind(c(0, sqrt(0.5), sqrt(0.1), sqrt(0.4)), c(0, sqrt(0.5), sqrt(0.5), 0), c(0, 1, 0, 0), c(0, 0, 0, -1), c(0, 0, 1, 0)))) # Circle sphere_to_angles(angles_to_sphere(0)) sphere_to_angles(angles_to_sphere(cbind(0:3))) angles_to_sphere(cbind(sphere_to_angles(rbind(c(0, 1), c(1, 0))))) angles_to_sphere(cbind(sphere_to_angles(rbind(c(0, 1)))))
Listing of the tests implemented in the sphunif
package.
avail_cir_tests avail_sph_tests
avail_cir_tests avail_sph_tests
An object of class character
of length 33.
An object of class character
of length 18.
A character vector whose elements are valid inputs for the
type
argument in unif_test
, unif_stat
,
unif_stat_distr
, and unif_stat_MC
.
avail_cir_tests
provides the available circular tests and
avail_sph_tests
the (hyper)spherical tests.
# Circular tests avail_cir_tests # Spherical tests avail_sph_tests
# Circular tests avail_cir_tests # Spherical tests avail_sph_tests
Transformation between a matrix Theta
containing
M
circular samples of size n
on and an array
X
containing the associated Cartesian coordinates on
.
Theta_to_X(Theta) X_to_Theta(X)
Theta_to_X(Theta) X_to_Theta(X)
Theta |
a matrix of size |
X |
an array of size |
Theta_to_X
: the corresponding X
.
X_to_Theta
: the corresponding Theta
.
# Sample Theta <- r_unif_cir(n = 10, M = 2) X <- r_unif_sph(n = 10, p = 2, M = 2) # Check equality sum(abs(X - Theta_to_X(X_to_Theta(X)))) sum(abs(Theta - X_to_Theta(Theta_to_X(Theta))))
# Sample Theta <- r_unif_cir(n = 10, M = 2) X <- r_unif_sph(n = 10, p = 2, M = 2) # Check equality sum(abs(X - Theta_to_X(X_to_Theta(X)))) sum(abs(Theta - X_to_Theta(Theta_to_X(Theta))))
Computation of the circular gaps of an angular sample
on
, defined as
where
cir_gaps(Theta, sorted = FALSE)
cir_gaps(Theta, sorted = FALSE)
Theta |
a matrix of size |
sorted |
are the columns of |
A matrix of size c(n, M)
containing the n
circular
gaps for each of the M
circular samples.
Be careful on avoiding the next bad usages of cir_gaps
, which will
produce spurious results:
The entries of Theta
are not in .
Theta
is not sorted increasingly when
data_sorted = TRUE
.
Theta <- cbind(c(pi, 0, 3 * pi / 2), c(0, 3 * pi / 2, pi), c(5, 3, 1)) cir_gaps(Theta)
Theta <- cbind(c(pi, 0, 3 * pi / 2), c(0, 3 * pi / 2, pi), c(5, 3, 1)) cir_gaps(Theta)
Low-level implementation of several statistics for assessing
circular uniformity on or, equivalently,
.
cir_stat_Kuiper(Theta, sorted = FALSE, KS = FALSE, Stephens = FALSE) cir_stat_Watson(Theta, sorted = FALSE, CvM = FALSE, Stephens = FALSE) cir_stat_Watson_1976(Theta, sorted = FALSE, minus = FALSE) cir_stat_Range(Theta, sorted = FALSE, gaps_in_Theta = FALSE, max_gap = TRUE) cir_stat_Rao(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Greenwood(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Log_gaps(Theta, sorted = FALSE, gaps_in_Theta = FALSE, abs_val = TRUE) cir_stat_Vacancy(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Max_uncover(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Num_uncover(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE, minus_val = TRUE) cir_stat_Gini(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Gini_squared(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Ajne(Theta, Psi_in_Theta = FALSE) cir_stat_Rothman(Theta, Psi_in_Theta = FALSE, t = 1/3) cir_stat_Hodges_Ajne(Theta, asymp_std = FALSE, sorted = FALSE, use_Cressie = TRUE) cir_stat_Cressie(Theta, t = 1/3, sorted = FALSE) cir_stat_FG01(Theta, sorted = FALSE) cir_stat_Rayleigh(Theta, m = 1L) cir_stat_Bingham(Theta) cir_stat_Hermans_Rasson(Theta, Psi_in_Theta = FALSE) cir_stat_Gine_Gn(Theta, Psi_in_Theta = FALSE) cir_stat_Gine_Fn(Theta, Psi_in_Theta = FALSE) cir_stat_Pycke(Theta, Psi_in_Theta = FALSE) cir_stat_Pycke_q(Theta, Psi_in_Theta = FALSE, q = 0.5) cir_stat_Bakshaev(Theta, Psi_in_Theta = FALSE) cir_stat_Riesz(Theta, Psi_in_Theta = FALSE, s = 1) cir_stat_PCvM(Theta, Psi_in_Theta = FALSE) cir_stat_PRt(Theta, Psi_in_Theta = FALSE, t = 1/3) cir_stat_PAD(Theta, Psi_in_Theta = FALSE, AD = FALSE, sorted = FALSE) cir_stat_Poisson(Theta, Psi_in_Theta = FALSE, rho = 0.5) cir_stat_Softmax(Theta, Psi_in_Theta = FALSE, kappa = 1) cir_stat_CCF09(Theta, dirs, K_CCF09 = 25L, original = FALSE)
cir_stat_Kuiper(Theta, sorted = FALSE, KS = FALSE, Stephens = FALSE) cir_stat_Watson(Theta, sorted = FALSE, CvM = FALSE, Stephens = FALSE) cir_stat_Watson_1976(Theta, sorted = FALSE, minus = FALSE) cir_stat_Range(Theta, sorted = FALSE, gaps_in_Theta = FALSE, max_gap = TRUE) cir_stat_Rao(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Greenwood(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Log_gaps(Theta, sorted = FALSE, gaps_in_Theta = FALSE, abs_val = TRUE) cir_stat_Vacancy(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Max_uncover(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Num_uncover(Theta, a = 2 * pi, sorted = FALSE, gaps_in_Theta = FALSE, minus_val = TRUE) cir_stat_Gini(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Gini_squared(Theta, sorted = FALSE, gaps_in_Theta = FALSE) cir_stat_Ajne(Theta, Psi_in_Theta = FALSE) cir_stat_Rothman(Theta, Psi_in_Theta = FALSE, t = 1/3) cir_stat_Hodges_Ajne(Theta, asymp_std = FALSE, sorted = FALSE, use_Cressie = TRUE) cir_stat_Cressie(Theta, t = 1/3, sorted = FALSE) cir_stat_FG01(Theta, sorted = FALSE) cir_stat_Rayleigh(Theta, m = 1L) cir_stat_Bingham(Theta) cir_stat_Hermans_Rasson(Theta, Psi_in_Theta = FALSE) cir_stat_Gine_Gn(Theta, Psi_in_Theta = FALSE) cir_stat_Gine_Fn(Theta, Psi_in_Theta = FALSE) cir_stat_Pycke(Theta, Psi_in_Theta = FALSE) cir_stat_Pycke_q(Theta, Psi_in_Theta = FALSE, q = 0.5) cir_stat_Bakshaev(Theta, Psi_in_Theta = FALSE) cir_stat_Riesz(Theta, Psi_in_Theta = FALSE, s = 1) cir_stat_PCvM(Theta, Psi_in_Theta = FALSE) cir_stat_PRt(Theta, Psi_in_Theta = FALSE, t = 1/3) cir_stat_PAD(Theta, Psi_in_Theta = FALSE, AD = FALSE, sorted = FALSE) cir_stat_Poisson(Theta, Psi_in_Theta = FALSE, rho = 0.5) cir_stat_Softmax(Theta, Psi_in_Theta = FALSE, kappa = 1) cir_stat_CCF09(Theta, dirs, K_CCF09 = 25L, original = FALSE)
Theta |
a matrix of size |
sorted |
are the columns of |
KS |
compute the Kolmogorov-Smirnov statistic (which is
not invariant under origin shifts) instead of the Kuiper statistic?
Defaults to |
Stephens |
compute Stephens (1970) modification so that the null distribution of the is less dependent on the sample size? The modification does not alter the test decision. |
CvM |
compute the Cramér-von Mises statistic (which is not
invariant under origin shifts) instead of the Watson statistic? Defaults to
|
minus |
compute the invariant |
gaps_in_Theta |
does |
max_gap |
compute the maximum gap for the range statistic? If
|
abs_val |
return the absolute value of the Darling's log gaps
statistic? If |
a |
either:
|
minus_val |
return the negative value of the (standardized) number of
uncovered spacings? If |
Psi_in_Theta |
does |
t |
|
asymp_std |
normalize the Hodges-Ajne statistic in terms of its
asymptotic distribution? Defaults to |
use_Cressie |
compute the Hodges-Ajne statistic as a particular case
of the Cressie statistic? Defaults to |
m |
integer |
q |
|
s |
|
AD |
compute the Anderson-Darling statistic (which is not
invariant under origin shifts) instead of the Projected Anderson-Darling
statistic? Defaults to |
rho |
|
kappa |
|
dirs |
a matrix of size |
K_CCF09 |
integer giving the truncation of the series present in the
asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to
|
original |
return the CCF09 statistic as originally defined?
If |
Descriptions and references for most of the statistics are available in García-Portugués and Verdebout (2018).
The statistics cir_stat_PCvM
and cir_stat_PRt
are provided
for the sake of completion, but they equal the more efficiently-implemented
statistics 2 * cir_stat_Watson
and cir_stat_Rothman
,
respectively.
A matrix of size c(M, 1)
containing the statistics for each
of the M
samples.
Be careful on avoiding the next bad usages of the functions, which will produce spurious results:
The entries of Theta
are not in .
Theta
does not contain the circular gaps when
gaps_in_Theta = TRUE
.
Theta
is not sorted increasingly when
data_sorted = TRUE
.
Theta
does not contain
Psi_mat(array(Theta, dim = c(n, 1, M)))
when
Psi_in_Theta = TRUE
.
The directions in dirs
do not have unit norm.
García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.
## Sample uniform circular data M <- 2 n <- 100 set.seed(987202226) Theta <- r_unif_cir(n = n, M = M) ## Tests based on the empirical cumulative distribution function # Kuiper cir_stat_Kuiper(Theta) cir_stat_Kuiper(Theta, Stephens = TRUE) # Watson cir_stat_Watson(Theta) cir_stat_Watson(Theta, Stephens = TRUE) # Watson (1976) cir_stat_Watson_1976(Theta) ## Partition-based tests # Ajne Theta_array <- Theta dim(Theta_array) <- c(nrow(Theta), 1, ncol(Theta)) Psi <- Psi_mat(Theta_array) cir_stat_Ajne(Theta) cir_stat_Ajne(Psi, Psi_in_Theta = TRUE) # Rothman cir_stat_Rothman(Theta, t = 0.5) cir_stat_Rothman(Theta) cir_stat_Rothman(Psi, Psi_in_Theta = TRUE) # Hodges-Ajne cir_stat_Hodges_Ajne(Theta) cir_stat_Hodges_Ajne(Theta, use_Cressie = FALSE) # Cressie cir_stat_Cressie(Theta, t = 0.5) cir_stat_Cressie(Theta) # FG01 cir_stat_FG01(Theta) ## Spacings-based tests # Range cir_stat_Range(Theta) # Rao cir_stat_Rao(Theta) # Greenwood cir_stat_Greenwood(Theta) # Log gaps cir_stat_Log_gaps(Theta) # Vacancy cir_stat_Vacancy(Theta) # Maximum uncovered spacing cir_stat_Max_uncover(Theta) # Number of uncovered spacings cir_stat_Num_uncover(Theta) # Gini mean difference cir_stat_Gini(Theta) # Gini mean squared difference cir_stat_Gini_squared(Theta) ## Sobolev tests # Rayleigh cir_stat_Rayleigh(Theta) cir_stat_Rayleigh(Theta, m = 2) # Bingham cir_stat_Bingham(Theta) # Hermans-Rasson cir_stat_Hermans_Rasson(Theta) cir_stat_Hermans_Rasson(Psi, Psi_in_Theta = TRUE) # Gine Fn cir_stat_Gine_Fn(Theta) cir_stat_Gine_Fn(Psi, Psi_in_Theta = TRUE) # Gine Gn cir_stat_Gine_Gn(Theta) cir_stat_Gine_Gn(Psi, Psi_in_Theta = TRUE) # Pycke cir_stat_Pycke(Theta) cir_stat_Pycke(Psi, Psi_in_Theta = TRUE) # Pycke q cir_stat_Pycke_q(Theta) cir_stat_Pycke_q(Psi, Psi_in_Theta = TRUE) # Bakshaev cir_stat_Bakshaev(Theta) cir_stat_Bakshaev(Psi, Psi_in_Theta = TRUE) # Riesz cir_stat_Riesz(Theta, s = 1) cir_stat_Riesz(Psi, Psi_in_Theta = TRUE, s = 1) # Projected Cramér-von Mises cir_stat_PCvM(Theta) cir_stat_PCvM(Psi, Psi_in_Theta = TRUE) # Projected Rothman cir_stat_PRt(Theta, t = 0.5) cir_stat_PRt(Theta) cir_stat_PRt(Psi, Psi_in_Theta = TRUE) # Projected Anderson-Darling cir_stat_PAD(Theta) cir_stat_PAD(Psi, Psi_in_Theta = TRUE) ## Other tests # CCF09 dirs <- r_unif_sph(n = 3, p = 2, M = 1)[, , 1] cir_stat_CCF09(Theta, dirs = dirs) ## Connection of Kuiper and Watson statistics with KS and CvM, respectively # Rotate sample for KS and CvM alpha <- seq(0, 2 * pi, l = 1e4) KS_alpha <- sapply(alpha, function(a) { cir_stat_Kuiper((Theta[, 2, drop = FALSE] + a) %% (2 * pi), KS = TRUE) }) CvM_alpha <- sapply(alpha, function(a) { cir_stat_Watson((Theta[, 2, drop = FALSE] + a) %% (2 * pi), CvM = TRUE) }) AD_alpha <- sapply(alpha, function(a) { cir_stat_PAD((Theta[, 2, drop = FALSE] + a) %% (2 * pi), AD = TRUE) }) # Kuiper is the maximum rotated KS plot(alpha, KS_alpha, type = "l") abline(h = cir_stat_Kuiper(Theta[, 2, drop = FALSE]), col = 2) points(alpha[which.max(KS_alpha)], max(KS_alpha), col = 2, pch = 16) # Watson is the minimum rotated CvM plot(alpha, CvM_alpha, type = "l") abline(h = cir_stat_Watson(Theta[, 2, drop = FALSE]), col = 2) points(alpha[which.min(CvM_alpha)], min(CvM_alpha), col = 2, pch = 16) # Anderson-Darling is the average rotated AD? plot(alpha, AD_alpha, type = "l") abline(h = cir_stat_PAD(Theta[, 2, drop = FALSE]), col = 2) abline(h = mean(AD_alpha), col = 3)
## Sample uniform circular data M <- 2 n <- 100 set.seed(987202226) Theta <- r_unif_cir(n = n, M = M) ## Tests based on the empirical cumulative distribution function # Kuiper cir_stat_Kuiper(Theta) cir_stat_Kuiper(Theta, Stephens = TRUE) # Watson cir_stat_Watson(Theta) cir_stat_Watson(Theta, Stephens = TRUE) # Watson (1976) cir_stat_Watson_1976(Theta) ## Partition-based tests # Ajne Theta_array <- Theta dim(Theta_array) <- c(nrow(Theta), 1, ncol(Theta)) Psi <- Psi_mat(Theta_array) cir_stat_Ajne(Theta) cir_stat_Ajne(Psi, Psi_in_Theta = TRUE) # Rothman cir_stat_Rothman(Theta, t = 0.5) cir_stat_Rothman(Theta) cir_stat_Rothman(Psi, Psi_in_Theta = TRUE) # Hodges-Ajne cir_stat_Hodges_Ajne(Theta) cir_stat_Hodges_Ajne(Theta, use_Cressie = FALSE) # Cressie cir_stat_Cressie(Theta, t = 0.5) cir_stat_Cressie(Theta) # FG01 cir_stat_FG01(Theta) ## Spacings-based tests # Range cir_stat_Range(Theta) # Rao cir_stat_Rao(Theta) # Greenwood cir_stat_Greenwood(Theta) # Log gaps cir_stat_Log_gaps(Theta) # Vacancy cir_stat_Vacancy(Theta) # Maximum uncovered spacing cir_stat_Max_uncover(Theta) # Number of uncovered spacings cir_stat_Num_uncover(Theta) # Gini mean difference cir_stat_Gini(Theta) # Gini mean squared difference cir_stat_Gini_squared(Theta) ## Sobolev tests # Rayleigh cir_stat_Rayleigh(Theta) cir_stat_Rayleigh(Theta, m = 2) # Bingham cir_stat_Bingham(Theta) # Hermans-Rasson cir_stat_Hermans_Rasson(Theta) cir_stat_Hermans_Rasson(Psi, Psi_in_Theta = TRUE) # Gine Fn cir_stat_Gine_Fn(Theta) cir_stat_Gine_Fn(Psi, Psi_in_Theta = TRUE) # Gine Gn cir_stat_Gine_Gn(Theta) cir_stat_Gine_Gn(Psi, Psi_in_Theta = TRUE) # Pycke cir_stat_Pycke(Theta) cir_stat_Pycke(Psi, Psi_in_Theta = TRUE) # Pycke q cir_stat_Pycke_q(Theta) cir_stat_Pycke_q(Psi, Psi_in_Theta = TRUE) # Bakshaev cir_stat_Bakshaev(Theta) cir_stat_Bakshaev(Psi, Psi_in_Theta = TRUE) # Riesz cir_stat_Riesz(Theta, s = 1) cir_stat_Riesz(Psi, Psi_in_Theta = TRUE, s = 1) # Projected Cramér-von Mises cir_stat_PCvM(Theta) cir_stat_PCvM(Psi, Psi_in_Theta = TRUE) # Projected Rothman cir_stat_PRt(Theta, t = 0.5) cir_stat_PRt(Theta) cir_stat_PRt(Psi, Psi_in_Theta = TRUE) # Projected Anderson-Darling cir_stat_PAD(Theta) cir_stat_PAD(Psi, Psi_in_Theta = TRUE) ## Other tests # CCF09 dirs <- r_unif_sph(n = 3, p = 2, M = 1)[, , 1] cir_stat_CCF09(Theta, dirs = dirs) ## Connection of Kuiper and Watson statistics with KS and CvM, respectively # Rotate sample for KS and CvM alpha <- seq(0, 2 * pi, l = 1e4) KS_alpha <- sapply(alpha, function(a) { cir_stat_Kuiper((Theta[, 2, drop = FALSE] + a) %% (2 * pi), KS = TRUE) }) CvM_alpha <- sapply(alpha, function(a) { cir_stat_Watson((Theta[, 2, drop = FALSE] + a) %% (2 * pi), CvM = TRUE) }) AD_alpha <- sapply(alpha, function(a) { cir_stat_PAD((Theta[, 2, drop = FALSE] + a) %% (2 * pi), AD = TRUE) }) # Kuiper is the maximum rotated KS plot(alpha, KS_alpha, type = "l") abline(h = cir_stat_Kuiper(Theta[, 2, drop = FALSE]), col = 2) points(alpha[which.max(KS_alpha)], max(KS_alpha), col = 2, pch = 16) # Watson is the minimum rotated CvM plot(alpha, CvM_alpha, type = "l") abline(h = cir_stat_Watson(Theta[, 2, drop = FALSE]), col = 2) points(alpha[which.min(CvM_alpha)], min(CvM_alpha), col = 2, pch = 16) # Anderson-Darling is the average rotated AD? plot(alpha, AD_alpha, type = "l") abline(h = cir_stat_PAD(Theta[, 2, drop = FALSE]), col = 2) abline(h = mean(AD_alpha), col = 3)
Comet orbits data from the
JPL Small-Body Database Search Engine. The normal vector of a comet orbit
represents is a vector on .
comets
comets
A data frame with 3798 rows and 13 variables:
database ID.
object primary SPK-ID.
full name/designation following the IUA naming convention.
object primary designation.
flag indicating if the record is a comet fragment.
diameter from equivalent sphere (in km).
inclination; the orbit's plane angle with respect to the
ecliptic plane, in radians in .
longitude of the ascending node; the counterclockwise angle from
the vector pointing to the First Point of Aries and that pointing to
the ascending node (the intersection between orbit and ecliptic plane), in
radians in . (Both vectors are heliocentric and within
the ecliptic plane.)
sidereal orbital period (in years).
orbit classification. A factor with levels given below.
eccentricity of the orbit.
semi-major axis of the orbit (in AU).
argument of perihelion; the (shortest) angle between the vector
pointing to the ascending node and that pointing to the perihelion
(nearest orbit point to the Sun), in radians in . (Both
vectors are heliocentric and within the orbit's plane.)
Date
of the first and
last recorded observations used in the orbit fit.
flag indicating if the comet was considered in the data application in Cuesta-Albertos et al. (2009); see details below.
The normal vector to the ecliptic plane of the comet with inclination
and longitude of the ascending node
is
A prograde comet has positive , negative
represents a retrograde comet.
class
has the following levels:
COM
: comet orbit not matching any defined orbit class.
CTc
: Chiron-type comet, as defined by Levison and Duncan
(T_Jupiter > 3; a > a_Jupiter).
ETc
: Encke-type comet, as defined by Levison and Duncan
(T_Jupiter > 3; a < a_Jupiter).
HTC
: Halley-type comet, classical definition (20y < P < 200y).
HYP
: comets on hyperbolic orbits.
JFc
: Jupiter-family comet, as defined by Levison and Duncan
(2 < T_Jupiter < 3).
JFC
: Jupiter-family comet, classical definition (P < 20y).
PAR
: comets on parabolic orbits.
Hyperbolic and parabolic comets are not periodic; only elliptical comets are periodic.
The ccf09
variable gives the observations considered in
Cuesta-Albertos et al. (2009) after fetching in the database in 2007-12-14
for the comets such that !(class %in% c("HYP", "PAR")) &
per_y >= 200
. Due to the dynamic nature of the data, more comets were added
to the database since 2007 and also some past records were updated.
The script performing the data preprocessing is available at
comets.R
. The data was retrieved on 2022-05-28. A previous version
of this dataset based on the old NASA's JPL Database (accessed on
2020-05-07) is available at
comets-old.rda
and was obtained with
comets-old.R
.
https://ssd.jpl.nasa.gov/tools/sbdb_query.html
Cuesta-Albertos, J. A., Cuevas, A., Fraiman, R. (2009) On projection-based tests for directional and compositional data. Statistics and Computing, 19:367–380. doi:10.1007/s11222-008-9098-3
# Load data data("comets") # Add normal vectors comets$normal <- cbind(sin(comets$i) * sin(comets$om), -sin(comets$i) * cos(comets$om), cos(comets$i)) # Tests to be performed type_tests <- c("PCvM", "PAD", "PRt") # Excluding the C/1882 R1-X (Great September comet) records with X = B, C, D comets_ccf09 <- comets[comets$ccf09, ][-c(13:15), ] # Sample size nrow(comets_ccf09) # Tests for the data in Cuesta-Albertos et al. (2009) tests_ccf09 <- unif_test(data = comets_ccf09$normal, type = type_tests, p_value = "asymp") tests_ccf09
# Load data data("comets") # Add normal vectors comets$normal <- cbind(sin(comets$i) * sin(comets$om), -sin(comets$i) * cos(comets$om), cos(comets$i)) # Tests to be performed type_tests <- c("PCvM", "PAD", "PRt") # Excluding the C/1882 R1-X (Great September comet) records with X = B, C, D comets_ccf09 <- comets[comets$ccf09, ][-c(13:15), ] # Sample size nrow(comets_ccf09) # Tests for the data in Cuesta-Albertos et al. (2009) tests_ccf09 <- unif_test(data = comets_ccf09$normal, type = type_tests, p_value = "asymp") tests_ccf09
Named craters of the Solar System by the Gazetteer of Planetary Nomenclature of the International Astronomical Union (IUA).
craters
craters
A data frame with 5235 rows and 7 variables:
database ID.
name of the crater.
name of the celestial body. A factor with 43 levels,
such as "Moon"
, "Venus"
, or "Europa"
.
type of celestial body. A factor with 3 levels:
"Planet"
, "Moon"
, "Dwarf planet"
, or
"Asteroid"
.
diameter of the crater (in km).
longitude angle of the
crater center.
latitude angle of the
crater center.
"Craters" are understood in the Gazetteer of Planetary Nomenclature as roughly circular depressions resulting from impact or volcanic activity (the geological origin is unspecified).
Be aware that the dataset only contains named craters by the IUA.
Therefore, there is likely a high uniform bias on the distribution
of craters. Presumably the naming process attempts to cover the planet in
a somehow uniform fashion (distant craters are more likely to be named than
neighboring craters). Also, there are substantially more craters in the
listed bodies than those named by the IUA. See venus
and
rhea
for more detailed and specific crater datasets.
The angles are such their associated planetocentric
coordinates are:
with denoting the north pole.
The script performing the data preprocessing is available at
craters.R
. The data was retrieved on 2020-05-31.
https://planetarynames.wr.usgs.gov/AdvancedSearch
# Load data data("craters") # Add Cartesian coordinates craters$X <- cbind(cos(craters$theta) * cos(craters$phi), sin(craters$theta) * cos(craters$phi), sin(craters$phi)) # Tests to be performed type_tests <- c("PCvM", "PAD", "PRt") # Tests for Venus and Rhea unif_test(data = craters$X[craters$target == "Venus", ], type = type_tests, p_value = "asymp") unif_test(data = craters$X[craters$target == "Rhea", ], type = type_tests, p_value = "asymp")
# Load data data("craters") # Add Cartesian coordinates craters$X <- cbind(cos(craters$theta) * cos(craters$phi), sin(craters$theta) * cos(craters$phi), sin(craters$phi)) # Tests to be performed type_tests <- c("PCvM", "PAD", "PRt") # Tests for Venus and Rhea unif_test(data = craters$X[craters$target == "Venus", ], type = type_tests, p_value = "asymp") unif_test(data = craters$X[craters$target == "Rhea", ], type = type_tests, p_value = "asymp")
Numerical computation of the distribution function and
the quantile function
for an angular function
in a tangent-normal decomposition.
results from the inversion of
for , where
is a normalizing constant and
is the surface area of
.
F_from_f(f, p, Gauss = TRUE, N = 320, K = 1000, tol = 1e-06, ...) F_inv_from_f(f, p, Gauss = TRUE, N = 320, K = 1000, tol = 1e-06, ...)
F_from_f(f, p, Gauss = TRUE, N = 320, K = 1000, tol = 1e-06, ...) F_inv_from_f(f, p, Gauss = TRUE, N = 320, K = 1000, tol = 1e-06, ...)
f |
angular function defined on |
p |
integer giving the dimension of the ambient space |
Gauss |
use a Gauss–Legendre quadrature
rule to integrate |
N |
number of points used in the Gauss–Legendre quadrature. Defaults
to |
K |
number of equispaced points on |
tol |
tolerance passed to |
... |
further parameters passed to |
The normalizing constant is such that
. It does not
need to be part of
f
as it is computed internally.
Interpolation is performed by a monotone cubic spline. Gauss = TRUE
yields more accurate results, at expenses of a heavier computation.
If f
yields negative values, these are silently truncated to zero.
A splinefun
object ready to evaluate or
, as specified.
f <- function(x) rep(1, length(x)) plot(F_from_f(f = f, p = 4, Gauss = TRUE), ylab = "F(x)", xlim = c(-1, 1)) plot(F_from_f(f = f, p = 4, Gauss = FALSE), col = 2, add = TRUE, xlim = c(-1, 1)) curve(p_proj_unif(x = x, p = 4), col = 3, add = TRUE, n = 300) plot(F_inv_from_f(f = f, p = 4, Gauss = TRUE), ylab = "F^{-1}(x)") plot(F_inv_from_f(f = f, p = 4, Gauss = FALSE), col = 2, add = TRUE) curve(q_proj_unif(u = x, p = 4), col = 3, add = TRUE, n = 300)
f <- function(x) rep(1, length(x)) plot(F_from_f(f = f, p = 4, Gauss = TRUE), ylab = "F(x)", xlim = c(-1, 1)) plot(F_from_f(f = f, p = 4, Gauss = FALSE), col = 2, add = TRUE, xlim = c(-1, 1)) curve(p_proj_unif(x = x, p = 4), col = 3, add = TRUE, n = 300) plot(F_inv_from_f(f = f, p = 4, Gauss = TRUE), ylab = "F^{-1}(x)") plot(F_inv_from_f(f = f, p = 4, Gauss = FALSE), col = 2, add = TRUE) curve(q_proj_unif(u = x, p = 4), col = 3, add = TRUE, n = 300)
Convenience for computing the nodes and weights
of the Gauss–Legendre quadrature formula
in
:
.
Gauss_Legen_nodes(a = -1, b = 1, N = 40L) Gauss_Legen_weights(a = -1, b = 1, N = 40L)
Gauss_Legen_nodes(a = -1, b = 1, N = 40L) Gauss_Legen_weights(a = -1, b = 1, N = 40L)
a , b
|
scalars giving the interval |
N |
number of points used in the Gauss–Legendre quadrature. The
following choices are supported: |
For functions, Gauss–Legendre quadrature
can be very efficient. It is exact for polynomials up to degree
.
The nodes and weights up to were retrieved from
NIST and have
precision.
For
onwards, the nodes and weights were computed with the
gauss.quad
function from the
statmod
package (Smyth, 1998), and have precision.
A matrix of size c(N, 1)
with the nodes
(
Gauss_Legen_nodes
) or the corresponding weights
(
Gauss_Legen_weights
).
NIST Digital Library of Mathematical Functions. Release 1.0.20 of 2018-09-15. F. W. J. Olver, A. B. Olde Daalhuis, D. W. Lozier, B. I. Schneider, R. F. Boisvert, C. W. Clark, B. R. Miller, and B. V. Saunders, eds. https://dlmf.nist.gov/
Smyth, G. K. (1998). Numerical integration. In: Encyclopedia of Biostatistics, P. Armitage and T. Colton (eds.), Wiley, London, pp. 3088-3095.
## Integration of a smooth function in (1, 10) # Weights and nodes for integrating x_k <- Gauss_Legen_nodes(a = 1, b = 10, N = 40) w_k <- Gauss_Legen_weights(a = 1, b = 10, N = 40) # Check quadrature f <- function(x) sin(x) * x^2 - log(x + 1) integrate(f, lower = 1, upper = 10, rel.tol = 1e-12) sum(w_k * f(x_k)) # Exact for polynomials up to degree 2 * N - 1 f <- function(x) (((x + 0.5) / 1e3)^5 - ((x - 0.5)/ 5)^4 + ((x - 0.25) / 10)^2 + 1)^20 sum(w_k * f(x_k)) integrate(f, lower = -1, upper = 1, rel.tol = 1e-12) ## Integration on (0, pi) # Weights and nodes for integrating th_k <- Gauss_Legen_nodes(a = 0, b = pi, N = 40) w_k <- Gauss_Legen_weights(a = 0, b = pi, N = 40) # Check quadrature p <- 4 psi <- function(th) -sin(th / 2) w <- function(th) sin(th)^(p - 2) integrate(function(th) psi(th) * w(th), lower = 0, upper = pi, rel.tol = 1e-12) sum(w_k * psi(th_k) * w(th_k)) # Integral with Gegenbauer polynomial k <- 3 C_k <- function(th) drop(Gegen_polyn(theta = th, k = k, p = p)) integrate(function(th) psi(th) * C_k(th) * w(th), lower = 0, upper = pi, rel.tol = 1e-12) th_k <- drop(Gauss_Legen_nodes(a = 0, b = pi, N = 80)) w_k <- drop(Gauss_Legen_weights(a = 0, b = pi, N = 80)) sum(w_k * psi(th_k) * C_k(th_k) * w(th_k))
## Integration of a smooth function in (1, 10) # Weights and nodes for integrating x_k <- Gauss_Legen_nodes(a = 1, b = 10, N = 40) w_k <- Gauss_Legen_weights(a = 1, b = 10, N = 40) # Check quadrature f <- function(x) sin(x) * x^2 - log(x + 1) integrate(f, lower = 1, upper = 10, rel.tol = 1e-12) sum(w_k * f(x_k)) # Exact for polynomials up to degree 2 * N - 1 f <- function(x) (((x + 0.5) / 1e3)^5 - ((x - 0.5)/ 5)^4 + ((x - 0.25) / 10)^2 + 1)^20 sum(w_k * f(x_k)) integrate(f, lower = -1, upper = 1, rel.tol = 1e-12) ## Integration on (0, pi) # Weights and nodes for integrating th_k <- Gauss_Legen_nodes(a = 0, b = pi, N = 40) w_k <- Gauss_Legen_weights(a = 0, b = pi, N = 40) # Check quadrature p <- 4 psi <- function(th) -sin(th / 2) w <- function(th) sin(th)^(p - 2) integrate(function(th) psi(th) * w(th), lower = 0, upper = pi, rel.tol = 1e-12) sum(w_k * psi(th_k) * w(th_k)) # Integral with Gegenbauer polynomial k <- 3 C_k <- function(th) drop(Gegen_polyn(theta = th, k = k, p = p)) integrate(function(th) psi(th) * C_k(th) * w(th), lower = 0, upper = pi, rel.tol = 1e-12) th_k <- drop(Gauss_Legen_nodes(a = 0, b = pi, N = 80)) w_k <- drop(Gauss_Legen_weights(a = 0, b = pi, N = 80)) sum(w_k * psi(th_k) * C_k(th_k) * w(th_k))
The Gegenbauer polynomials
form a family of orthogonal polynomials on the interval
with respect to the weight function
,
for
,
. They usually appear
when dealing with functions defined on
with index
.
The Gegenbauer polynomials are somehow simpler to evaluate for
, with
. This simplifies
also the connection with the Chebyshev polynomials
, which admit
the explicit expression
. The Chebyshev polynomials
appear as the limit of the Gegenbauer polynomials
(divided by
) when
goes to
, so they
can be regarded as the extension by continuity of
to the case
.
For a reasonably smooth function
defined on
,
provided that the coefficients
are finite, where the normalizing constants are
The (squared) "Gegenbauer norm" of is
The previous expansion can be generalized for a 2-dimensional function
defined on
:
with coefficients
The (squared) "Gegenbauer norm" of is
Gegen_polyn(theta, k, p) Gegen_coefs(k, p, psi, Gauss = TRUE, N = 320, normalize = TRUE, only_const = FALSE, tol = 1e-06, ...) Gegen_series(theta, coefs, k, p, normalize = TRUE) Gegen_norm(coefs, k, p, normalize = TRUE, cumulative = FALSE) Gegen_polyn_2d(theta_1, theta_2, k, m, p) Gegen_coefs_2d(k, m, p, psi, Gauss = TRUE, N = 320, normalize = TRUE, only_const = FALSE, tol = 1e-06, ...) Gegen_series_2d(theta_1, theta_2, coefs, k, m, p, normalize = TRUE) Gegen_norm_2d(coefs, k, m, p, normalize = TRUE)
Gegen_polyn(theta, k, p) Gegen_coefs(k, p, psi, Gauss = TRUE, N = 320, normalize = TRUE, only_const = FALSE, tol = 1e-06, ...) Gegen_series(theta, coefs, k, p, normalize = TRUE) Gegen_norm(coefs, k, p, normalize = TRUE, cumulative = FALSE) Gegen_polyn_2d(theta_1, theta_2, k, m, p) Gegen_coefs_2d(k, m, p, psi, Gauss = TRUE, N = 320, normalize = TRUE, only_const = FALSE, tol = 1e-06, ...) Gegen_series_2d(theta_1, theta_2, coefs, k, m, p, normalize = TRUE) Gegen_norm_2d(coefs, k, m, p, normalize = TRUE)
theta , theta_1 , theta_2
|
vectors with values in |
k , m
|
vectors with the orders of the Gegenbauer polynomials. Must
be integers larger or equal than |
p |
integer giving the dimension of the ambient space |
psi |
function defined in |
Gauss |
use a Gauss–Legendre quadrature rule of |
N |
number of points used in the
Gauss–Legendre quadrature for computing the Gegenbauer coefficients.
Defaults to |
normalize |
consider normalized coefficients (divided by
|
only_const |
return only the normalizing constants |
tol |
tolerance passed to |
... |
further arguments to be passed to |
coefs |
for |
cumulative |
return the cumulative norm for increasing truncation of
the series? Defaults to |
The Gegen_polyn
function is a wrapper to the functions
gegenpoly_n
and
gegenpoly_array
in the
gsl-package
, which they interface the functions
defined in the header file gsl_sf_gegenbauer.h
(documented
here) of the
GNU Scientific Library.
Note that the function Gegen_polyn
computes the regular
unnormalized Gegenbauer polynomials.
For the case , the Chebyshev polynomials are considered.
Gegen_polyn
: a matrix of size
c(length(theta), length(k))
containing the evaluation of the
length(k)
Gegenbauer polynomials at theta
.
Gegen_coefs
: a vector of size length(k)
containing
the coefficients .
Gegen_series
: the evaluation of the truncated series
expansion, a vector of size length(theta)
.
Gegen_norm
: the Gegenbauer norm of the truncated series,
a scalar if cumulative = FALSE
, otherwise a vector of size
length(k)
.
Gegen_polyn_2d
: a 4-dimensional array of size
c(length(theta_1), length(theta_2), length(k), length(m))
containing the evaluation of the length(k) * length(m)
2-dimensional Gegenbauer polynomials at the bivariate grid
spanned by theta_1
and theta_2
.
Gegen_coefs_2d
: a matrix of size
c(length(k), length(m))
containing the coefficients
.
Gegen_series_2d
: the evaluation of the truncated series
expansion, a matrix of size c(length(theta_1), length(theta_2))
.
Gegen_norm_2d
: the 2-dimensional Gegenbauer norm of the
truncated series, a scalar.
Galassi, M., Davies, J., Theiler, J., Gough, B., Jungman, G., Alken, P., Booth, M., and Rossi, F. (2009) GNU Scientific Library Reference Manual. Network Theory Ltd. http://www.gnu.org/software/gsl/
NIST Digital Library of Mathematical Functions. Release 1.0.20 of 2018-09-15. F. W. J. Olver, A. B. Olde Daalhuis, D. W. Lozier, B. I. Schneider, R. F. Boisvert, C. W. Clark, B. R. Miller, and B. V. Saunders, eds. https://dlmf.nist.gov/
## Representation of Gegenbauer polynomials (Chebyshev polynomials for p = 2) th <- seq(0, pi, l = 500) k <- 0:3 old_par <- par(mfrow = c(2, 2)) for (p in 2:5) { matplot(th, t(Gegen_polyn(theta = th, k = k, p = p)), lty = 1, type = "l", main = substitute(p == d, list(d = p)), axes = FALSE, xlab = expression(theta), ylab = "") axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() mtext(text = expression({C[k]^{p/2 - 1}}(cos(theta))), side = 2, line = 2, cex = 0.75) legend("bottomleft", legend = paste("k =", k), lwd = 2, col = seq_along(k)) } par(old_par) ## Coefficients and series in p = 2 # Function in [0, pi] to be projected in Chebyshev polynomials psi <- function(th) -sin(th / 2) # Coefficients p <- 2 k <- 0:4 (coefs <- Gegen_coefs(k = k, p = p, psi = psi)) # Series plot(th, psi(th), type = "l", axes = FALSE, xlab = expression(theta), ylab = "", ylim = c(-1.25, 0)) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() col <- viridisLite::viridis(length(coefs)) for (i in seq_along(coefs)) { lines(th, Gegen_series(theta = th, coefs = coefs[1:(i + 1)], k = 0:i, p = p), col = col[i]) } lines(th, psi(th), lwd = 2) ## Coefficients and series in p = 3 # Function in [0, pi] to be projected in Gegenbauer polynomials psi <- function(th) tan(th / 3) # Coefficients p <- 3 k <- 0:10 (coefs <- Gegen_coefs(k = k, p = p, psi = psi)) # Series plot(th, psi(th), type = "l", axes = FALSE, xlab = expression(theta), ylab = "", ylim = c(0, 2)) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() col <- viridisLite::viridis(length(coefs)) for (i in seq_along(coefs)) { lines(th, Gegen_series(theta = th, coefs = coefs[1:(i + 1)], k = 0:i, p = p), col = col[i]) } lines(th, psi(th), lwd = 2) ## Surface representation # Surface in [0, pi]^2 to be projected in Gegenbauer polynomials p <- 3 psi <- function(th_1, th_2) A_theta_x(theta = th_1, x = cos(th_2), p = p, as_matrix = TRUE) # Coefficients k <- 0:20 m <- 0:10 coefs <- Gegen_coefs_2d(k = k, m = m, p = p, psi = psi) # Series th <- seq(0, pi, l = 100) col <- viridisLite::viridis(20) old_par <- par(mfrow = c(2, 2)) image(th, th, A_theta_x(theta = th, x = cos(th), p = p), axes = FALSE, col = col, zlim = c(0, 1), xlab = expression(theta[1]), ylab = expression(theta[2]), main = "Original") axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) box() for(K in c(5, 10, 20)) { A <- Gegen_series_2d(theta_1 = th, theta_2 = th, coefs = coefs[1:(K + 1), ], k = 0:K, m = m, p = p) image(th, th, A, axes = FALSE, col = col, zlim = c(0, 1), xlab = expression(theta[1]), ylab = expression(theta[2]), main = paste(K, "x", m[length(m)], "coefficients")) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) box() } par(old_par)
## Representation of Gegenbauer polynomials (Chebyshev polynomials for p = 2) th <- seq(0, pi, l = 500) k <- 0:3 old_par <- par(mfrow = c(2, 2)) for (p in 2:5) { matplot(th, t(Gegen_polyn(theta = th, k = k, p = p)), lty = 1, type = "l", main = substitute(p == d, list(d = p)), axes = FALSE, xlab = expression(theta), ylab = "") axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() mtext(text = expression({C[k]^{p/2 - 1}}(cos(theta))), side = 2, line = 2, cex = 0.75) legend("bottomleft", legend = paste("k =", k), lwd = 2, col = seq_along(k)) } par(old_par) ## Coefficients and series in p = 2 # Function in [0, pi] to be projected in Chebyshev polynomials psi <- function(th) -sin(th / 2) # Coefficients p <- 2 k <- 0:4 (coefs <- Gegen_coefs(k = k, p = p, psi = psi)) # Series plot(th, psi(th), type = "l", axes = FALSE, xlab = expression(theta), ylab = "", ylim = c(-1.25, 0)) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() col <- viridisLite::viridis(length(coefs)) for (i in seq_along(coefs)) { lines(th, Gegen_series(theta = th, coefs = coefs[1:(i + 1)], k = 0:i, p = p), col = col[i]) } lines(th, psi(th), lwd = 2) ## Coefficients and series in p = 3 # Function in [0, pi] to be projected in Gegenbauer polynomials psi <- function(th) tan(th / 3) # Coefficients p <- 3 k <- 0:10 (coefs <- Gegen_coefs(k = k, p = p, psi = psi)) # Series plot(th, psi(th), type = "l", axes = FALSE, xlab = expression(theta), ylab = "", ylim = c(0, 2)) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() col <- viridisLite::viridis(length(coefs)) for (i in seq_along(coefs)) { lines(th, Gegen_series(theta = th, coefs = coefs[1:(i + 1)], k = 0:i, p = p), col = col[i]) } lines(th, psi(th), lwd = 2) ## Surface representation # Surface in [0, pi]^2 to be projected in Gegenbauer polynomials p <- 3 psi <- function(th_1, th_2) A_theta_x(theta = th_1, x = cos(th_2), p = p, as_matrix = TRUE) # Coefficients k <- 0:20 m <- 0:10 coefs <- Gegen_coefs_2d(k = k, m = m, p = p, psi = psi) # Series th <- seq(0, pi, l = 100) col <- viridisLite::viridis(20) old_par <- par(mfrow = c(2, 2)) image(th, th, A_theta_x(theta = th, x = cos(th), p = p), axes = FALSE, col = col, zlim = c(0, 1), xlab = expression(theta[1]), ylab = expression(theta[2]), main = "Original") axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) box() for(K in c(5, 10, 20)) { A <- Gegen_series_2d(theta_1 = th, theta_2 = th, coefs = coefs[1:(K + 1), ], k = 0:K, m = m, p = p) image(th, th, A, axes = FALSE, col = col, zlim = c(0, 1), xlab = expression(theta[1]), ylab = expression(theta[2]), main = paste(K, "x", m[length(m)], "coefficients")) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) box() } par(old_par)
Computation of a certain explicit representation of
(hyper)spherical harmonics on
,
. Details are available in
García-Portugués et al. (2024).
g_i_k(x, i = 1, k = 1, m = NULL, show_m = FALSE)
g_i_k(x, i = 1, k = 1, m = NULL, show_m = FALSE)
x |
locations in |
i , k
|
alternative indexing to refer to the |
m |
(hyper)spherical harmonic index, as used in Proposition 3.1. The
index is computed internally from |
show_m |
flag to print |
The implementation uses Proposition 3.1 in García-Portugués et al. (2024),
which adapts Theorem 1.5.1 in Dai and Xu (2013) with the correction of
typos in the normalizing constant and in the definition of
the function
of the latter theorem.
A vector of size nrow(x)
.
Dai, F. and Xu, Y. (2013). Approximation Theory and Harmonic Analysis on Spheres and Balls. Springer, New York. doi:10.1007/978-1-4614-6660-4
García-Portugués, E., Paindaveine, D., and Verdebout, T. (2024). On a class of Sobolev tests for symmetry of directions, their detection thresholds, and asymptotic powers. arXiv:2108.09874v2. doi:10.48550/arXiv.2108.09874
n <- 3e3 old_par <- par(mfrow = c(2, 3)) k <- 2 for (i in 1:d_p_k(p = 3, k = k)) { X <- r_unif_sph(n = n, p = 3, M = 1)[, , 1] col <- rainbow(n)[rank(g_i_k(x = X, k = k, i = i, show_m = TRUE))] scatterplot3d::scatterplot3d(X[, 1], X[, 2], X[, 3], color = col, axis = FALSE, pch = 19) } for (k in 0:5) { X <- r_unif_sph(n = n, p = 3, M = 1)[, , 1] col <- rainbow(n)[rank(g_i_k(x = X, k = k, i = 1, show_m = TRUE))] scatterplot3d::scatterplot3d(X[, 1], X[, 2], X[, 3], color = col, axis = FALSE, pch = 19) } par(old_par)
n <- 3e3 old_par <- par(mfrow = c(2, 3)) k <- 2 for (i in 1:d_p_k(p = 3, k = k)) { X <- r_unif_sph(n = n, p = 3, M = 1)[, , 1] col <- rainbow(n)[rank(g_i_k(x = X, k = k, i = i, show_m = TRUE))] scatterplot3d::scatterplot3d(X[, 1], X[, 2], X[, 3], color = col, axis = FALSE, pch = 19) } for (k in 0:5) { X <- r_unif_sph(n = n, p = 3, M = 1)[, , 1] col <- rainbow(n)[rank(g_i_k(x = X, k = k, i = 1, show_m = TRUE))] scatterplot3d::scatterplot3d(X[, 1], X[, 2], X[, 3], color = col, axis = FALSE, pch = 19) } par(old_par)
Monte Carlo approximation of the integral
of a function defined on the (hyper)sphere
,
.
int_sph_MC(f, p, M = 10000, cores = 1, chunks = ceiling(M/1000), seeds = NULL, ...)
int_sph_MC(f, p, M = 10000, cores = 1, chunks = ceiling(M/1000), seeds = NULL, ...)
f |
function to be integrated. Its first argument must be the
(hyper)sphere position. Must be vectorized and return a vector of size
|
p |
integer giving the dimension of the ambient space |
M |
number of Monte Carlo samples. Defaults to |
cores |
number of cores to perform the integration. Defaults to
|
chunks |
number of chunks to split the |
seeds |
if provided, a vector of size |
... |
optional arguments to be passed to |
It is possible to have a progress bar if int_sph_MC
is wrapped with
progressr::with_progress
or if
progressr::handlers(global = TRUE)
is invoked (once) by the user.
See the examples below. The progress bar is updated with the number of
finished chunks.
A scalar with the approximate integral.
## Sequential simulation # Vectorized functions to be integrated x1 <- function(x) x[, 1] quad <- function(x, a = 0) a + rowSums(x^4) # Approximate \int_{S^{p-1}} x_1 dx = 0 int_sph_MC(f = x1, p = 3, M = 1e4, chunks = 2) # Approximate \int_{S^{p-1}} (a + \sum_i x_i^4) dx int_sph_MC(f = quad, p = 2, M = 1e4, a = 0, chunks = 2) # Compare with Gauss--Legendre integration on S^2 th_k <- Gauss_Legen_nodes(a = 0, b = 2 * pi, N = 40) w_k <- Gauss_Legen_weights(a = 0, b = 2 * pi, N = 40) sum(w_k * quad(cbind(cos(th_k), sin(th_k)), a = 1)) ## Parallel simulation with a progress bar # Define a progress bar require(progress) require(progressr) handlers(handler_progress( format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:", ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"), clear = FALSE)) # Call int_sph_MC() within with_progress() with_progress(int_sph_MC(f = x1, p = 3, cores = 2, M = 1e5, chunks = 100)) # Instead of using with_progress() each time, it is more practical to run # handlers(global = TRUE) # once to activate progress bars in your R session
## Sequential simulation # Vectorized functions to be integrated x1 <- function(x) x[, 1] quad <- function(x, a = 0) a + rowSums(x^4) # Approximate \int_{S^{p-1}} x_1 dx = 0 int_sph_MC(f = x1, p = 3, M = 1e4, chunks = 2) # Approximate \int_{S^{p-1}} (a + \sum_i x_i^4) dx int_sph_MC(f = quad, p = 2, M = 1e4, a = 0, chunks = 2) # Compare with Gauss--Legendre integration on S^2 th_k <- Gauss_Legen_nodes(a = 0, b = 2 * pi, N = 40) w_k <- Gauss_Legen_weights(a = 0, b = 2 * pi, N = 40) sum(w_k * quad(cbind(cos(th_k), sin(th_k)), a = 1)) ## Parallel simulation with a progress bar # Define a progress bar require(progress) require(progressr) handlers(handler_progress( format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:", ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"), clear = FALSE)) # Call int_sph_MC() within with_progress() with_progress(int_sph_MC(f = x1, p = 3, cores = 2, M = 1e5, chunks = 100)) # Instead of using with_progress() each time, it is more practical to run # handlers(global = TRUE) # once to activate progress bars in your R session
Density and random generation for local projected alternatives to uniformity with densities
where
is the angular function controlling the local alternative in a
Gegenbauer series, ,
is a direction on
, and
is the surface area of
. The sequence
is typically such that
for the Gegenbauer coefficients
of the kernel function of a Sobolev statistic (see the
transformation between the coefficients
and
).
Also, automatic truncation of the series
according to the proportion of "Gegenbauer norm"
explained.
f_locdev(z, p, uk) con_f(f, p, N = 320) d_locdev(x, mu, f, kappa) r_locdev(n, mu, f, kappa, F_inv = NULL, ...) cutoff_locdev(p, K_max = 10000, thre = 0.001, type, Rothman_t = 1/3, Pycke_q = 0.5, verbose = FALSE, Gauss = TRUE, N = 320, tol = 1e-06)
f_locdev(z, p, uk) con_f(f, p, N = 320) d_locdev(x, mu, f, kappa) r_locdev(n, mu, f, kappa, F_inv = NULL, ...) cutoff_locdev(p, K_max = 10000, thre = 0.001, type, Rothman_t = 1/3, Pycke_q = 0.5, verbose = FALSE, Gauss = TRUE, N = 320, tol = 1e-06)
z |
projected evaluation points for |
p |
integer giving the dimension of the ambient space |
uk |
coefficients |
f |
angular function defined on |
N |
number of points used in the
Gauss–Legendre quadrature for computing the Gegenbauer coefficients.
Defaults to |
x |
locations in |
mu |
a unit norm vector of size |
kappa |
the strength of the local alternative, between |
n |
sample size, a positive integer. |
F_inv |
quantile function associated to |
... |
further parameters passed to |
K_max |
integer giving the truncation of the series. Defaults to
|
thre |
proportion of norm not explained by the first terms of the
truncated series. Defaults to |
type |
name of the Sobolev statistic, using the naming from
|
Rothman_t |
|
Pycke_q |
|
verbose |
output information about the truncation ( |
Gauss |
use a Gauss–Legendre quadrature rule of |
tol |
tolerance passed to |
See the definitions of local alternatives in Prentice (1978) and in García-Portugués et al. (2023).
The truncation of is done to the first
K_max
terms and then up to the index such that the first terms
leave unexplained the proportion thre
of the norm of the whole series.
Setting thre = 0
truncates to K_max
terms exactly. If the
series only contains odd or even non-zero terms, then only K_max / 2
addends are effectively taken into account in the first truncation.
f_locdev
: angular function evaluated at x
, a vector.
con_f
: normalizing constant of
, a scalar.
d_locdev
: density function evaluated at x
, a vector.
r_locdev
: a matrix of size c(n, p)
containing a random
sample from the density .
cutoff_locdev
: vector of coefficients
automatically truncated according to
K_max
and thre
(see details).
García-Portugués, E., Navarro-Esteban, P., Cuesta-Albertos, J. A. (2023) On a projection-based class of uniformity tests on the hypersphere. Bernoulli, 29(1):181–204. doi:10.3150/21-BEJ1454.
Prentice, M. J. (1978). On invariant tests of uniformity for directions and orientations. The Annals of Statistics, 6(1):169–176. doi:10.1214/aos/1176344075
## Local alternatives diagnostics loc_alt_diagnostic <- function(p, type, thre = 1e-3, K_max = 1e3) { # Coefficients of the alternative uk <- cutoff_locdev(K_max = K_max, p = p, type = type, thre = thre, N = 640) old_par <- par(mfrow = c(2, 2)) # Construction of f z <- seq(-1, 1, l = 1e3) f <- function(z) f_locdev(z = z, p = p, uk = uk) plot(z, f(z), type = "l", xlab = expression(z), ylab = expression(f(z)), main = paste0("Local alternative f, ", type, ", p = ", p), log = "y") # Projected density on [-1, 1] f_proj <- function(z) rotasym::w_p(p = p - 1) * f(z) * (1 - z^2)^((p - 3) / 2) plot(z, f_proj(z), type = "l", xlab = expression(z), ylab = expression(omega[p - 1] * f(z) * (1 - z^2)^{(p - 3) / 2}), main = paste0("Projected density, ", type, ", p = ", p), log = "y", sub = paste("Integral:", round(con_f(f = f, p = p), 4))) # Quantile function for projected density mu <- c(rep(0, p - 1), 1) F_inv <- F_inv_from_f(f = f, p = p, K = 5e2) plot(F_inv, xlab = expression(x), ylab = expression(F^{-1}*(x)), main = paste0("Quantile function, ", type, ", p = ", p)) # Sample from the alternative and plot the projected sample n <- 5e4 samp <- r_locdev(n = n, mu = mu, f = f, kappa = 1, F_inv = F_inv) plot(z, f_proj(z), col = 2, type = "l", main = paste0("Simulated projected data, ", type, ", p = ", p), ylim = c(0, 1.75)) hist(samp %*% mu, freq = FALSE, breaks = seq(-1, 1, l = 50), add = TRUE) par(old_par) } ## Local alternatives for the PCvM test loc_alt_diagnostic(p = 2, type = "PCvM") loc_alt_diagnostic(p = 3, type = "PCvM") loc_alt_diagnostic(p = 4, type = "PCvM") loc_alt_diagnostic(p = 5, type = "PCvM") loc_alt_diagnostic(p = 11, type = "PCvM") ## Local alternatives for the PAD test loc_alt_diagnostic(p = 2, type = "PAD") loc_alt_diagnostic(p = 3, type = "PAD") loc_alt_diagnostic(p = 4, type = "PAD") loc_alt_diagnostic(p = 5, type = "PAD") loc_alt_diagnostic(p = 11, type = "PAD") ## Local alternatives for the PRt test loc_alt_diagnostic(p = 2, type = "PRt") loc_alt_diagnostic(p = 3, type = "PRt") loc_alt_diagnostic(p = 4, type = "PRt") loc_alt_diagnostic(p = 5, type = "PRt") loc_alt_diagnostic(p = 11, type = "PRt")
## Local alternatives diagnostics loc_alt_diagnostic <- function(p, type, thre = 1e-3, K_max = 1e3) { # Coefficients of the alternative uk <- cutoff_locdev(K_max = K_max, p = p, type = type, thre = thre, N = 640) old_par <- par(mfrow = c(2, 2)) # Construction of f z <- seq(-1, 1, l = 1e3) f <- function(z) f_locdev(z = z, p = p, uk = uk) plot(z, f(z), type = "l", xlab = expression(z), ylab = expression(f(z)), main = paste0("Local alternative f, ", type, ", p = ", p), log = "y") # Projected density on [-1, 1] f_proj <- function(z) rotasym::w_p(p = p - 1) * f(z) * (1 - z^2)^((p - 3) / 2) plot(z, f_proj(z), type = "l", xlab = expression(z), ylab = expression(omega[p - 1] * f(z) * (1 - z^2)^{(p - 3) / 2}), main = paste0("Projected density, ", type, ", p = ", p), log = "y", sub = paste("Integral:", round(con_f(f = f, p = p), 4))) # Quantile function for projected density mu <- c(rep(0, p - 1), 1) F_inv <- F_inv_from_f(f = f, p = p, K = 5e2) plot(F_inv, xlab = expression(x), ylab = expression(F^{-1}*(x)), main = paste0("Quantile function, ", type, ", p = ", p)) # Sample from the alternative and plot the projected sample n <- 5e4 samp <- r_locdev(n = n, mu = mu, f = f, kappa = 1, F_inv = F_inv) plot(z, f_proj(z), col = 2, type = "l", main = paste0("Simulated projected data, ", type, ", p = ", p), ylim = c(0, 1.75)) hist(samp %*% mu, freq = FALSE, breaks = seq(-1, 1, l = 50), add = TRUE) par(old_par) } ## Local alternatives for the PCvM test loc_alt_diagnostic(p = 2, type = "PCvM") loc_alt_diagnostic(p = 3, type = "PCvM") loc_alt_diagnostic(p = 4, type = "PCvM") loc_alt_diagnostic(p = 5, type = "PCvM") loc_alt_diagnostic(p = 11, type = "PCvM") ## Local alternatives for the PAD test loc_alt_diagnostic(p = 2, type = "PAD") loc_alt_diagnostic(p = 3, type = "PAD") loc_alt_diagnostic(p = 4, type = "PAD") loc_alt_diagnostic(p = 5, type = "PAD") loc_alt_diagnostic(p = 11, type = "PAD") ## Local alternatives for the PRt test loc_alt_diagnostic(p = 2, type = "PRt") loc_alt_diagnostic(p = 3, type = "PRt") loc_alt_diagnostic(p = 4, type = "PRt") loc_alt_diagnostic(p = 5, type = "PRt") loc_alt_diagnostic(p = 11, type = "PRt")
Computation of the asymptotic null distributions of circular uniformity statistics.
p_Kolmogorov(x, K_Kolmogorov = 25L, alternating = TRUE) d_Kolmogorov(x, K_Kolmogorov = 25L, alternating = TRUE) p_cir_stat_Ajne(x, K_Ajne = 15L) d_cir_stat_Ajne(x, K_Ajne = 15L) p_cir_stat_Bingham(x) d_cir_stat_Bingham(x) p_cir_stat_Greenwood(x) d_cir_stat_Greenwood(x) p_cir_stat_Gini(x) d_cir_stat_Gini(x) p_cir_stat_Gini_squared(x) d_cir_stat_Gini_squared(x) p_cir_stat_Hodges_Ajne2(x, n, asymp_std = FALSE) p_cir_stat_Hodges_Ajne(x, n, exact = TRUE, asymp_std = FALSE) d_cir_stat_Hodges_Ajne(x, n, exact = TRUE, asymp_std = FALSE) p_cir_stat_Kuiper(x, n, K_Kuiper = 12L, second_term = TRUE, Stephens = FALSE) d_cir_stat_Kuiper(x, n, K_Kuiper = 12L, second_term = TRUE, Stephens = FALSE) p_cir_stat_Log_gaps(x, abs_val = TRUE) d_cir_stat_Log_gaps(x, abs_val = TRUE) p_cir_stat_Max_uncover(x) d_cir_stat_Max_uncover(x) p_cir_stat_Num_uncover(x) d_cir_stat_Num_uncover(x) p_cir_stat_Pycke(x) d_cir_stat_Pycke(x) p_cir_stat_Vacancy(x) d_cir_stat_Vacancy(x) p_cir_stat_Watson(x, n = 0L, K_Watson = 25L, Stephens = FALSE) d_cir_stat_Watson(x, n = 0L, K_Watson = 25L, Stephens = FALSE) p_cir_stat_Watson_1976(x, K_Watson_1976 = 8L, N = 40L) d_cir_stat_Watson_1976(x, K_Watson_1976 = 8L) p_cir_stat_Range(x, n, max_gap = TRUE) d_cir_stat_Range(x, n, max_gap = TRUE) p_cir_stat_Rao(x) d_cir_stat_Rao(x) p_cir_stat_Rayleigh(x) d_cir_stat_Rayleigh(x) p_cir_stat_Bakshaev(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Bakshaev(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Gine_Fn(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Gine_Fn(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Gine_Gn(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Gine_Gn(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Hermans_Rasson(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Hermans_Rasson(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_PAD(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_PAD(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_PCvM(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_PCvM(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_PRt(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_PRt(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Poisson(x, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Poisson(x, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Pycke_q(x, q = 0.5, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Pycke_q(x, q = 0.5, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Rothman(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Rothman(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Riesz(x, s = 1, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Riesz(x, s = 1, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Sobolev(x, vk2 = c(0, 0, 1), method = "I", ...) d_cir_stat_Sobolev(x, vk2 = c(0, 0, 1), method = "I", ...) p_cir_stat_Softmax(x, kappa = 1, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Softmax(x, kappa = 1, K_max = 1000, thre = 0, method = "I", ...)
p_Kolmogorov(x, K_Kolmogorov = 25L, alternating = TRUE) d_Kolmogorov(x, K_Kolmogorov = 25L, alternating = TRUE) p_cir_stat_Ajne(x, K_Ajne = 15L) d_cir_stat_Ajne(x, K_Ajne = 15L) p_cir_stat_Bingham(x) d_cir_stat_Bingham(x) p_cir_stat_Greenwood(x) d_cir_stat_Greenwood(x) p_cir_stat_Gini(x) d_cir_stat_Gini(x) p_cir_stat_Gini_squared(x) d_cir_stat_Gini_squared(x) p_cir_stat_Hodges_Ajne2(x, n, asymp_std = FALSE) p_cir_stat_Hodges_Ajne(x, n, exact = TRUE, asymp_std = FALSE) d_cir_stat_Hodges_Ajne(x, n, exact = TRUE, asymp_std = FALSE) p_cir_stat_Kuiper(x, n, K_Kuiper = 12L, second_term = TRUE, Stephens = FALSE) d_cir_stat_Kuiper(x, n, K_Kuiper = 12L, second_term = TRUE, Stephens = FALSE) p_cir_stat_Log_gaps(x, abs_val = TRUE) d_cir_stat_Log_gaps(x, abs_val = TRUE) p_cir_stat_Max_uncover(x) d_cir_stat_Max_uncover(x) p_cir_stat_Num_uncover(x) d_cir_stat_Num_uncover(x) p_cir_stat_Pycke(x) d_cir_stat_Pycke(x) p_cir_stat_Vacancy(x) d_cir_stat_Vacancy(x) p_cir_stat_Watson(x, n = 0L, K_Watson = 25L, Stephens = FALSE) d_cir_stat_Watson(x, n = 0L, K_Watson = 25L, Stephens = FALSE) p_cir_stat_Watson_1976(x, K_Watson_1976 = 8L, N = 40L) d_cir_stat_Watson_1976(x, K_Watson_1976 = 8L) p_cir_stat_Range(x, n, max_gap = TRUE) d_cir_stat_Range(x, n, max_gap = TRUE) p_cir_stat_Rao(x) d_cir_stat_Rao(x) p_cir_stat_Rayleigh(x) d_cir_stat_Rayleigh(x) p_cir_stat_Bakshaev(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Bakshaev(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Gine_Fn(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Gine_Fn(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Gine_Gn(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Gine_Gn(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Hermans_Rasson(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Hermans_Rasson(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_PAD(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_PAD(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_PCvM(x, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_PCvM(x, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_PRt(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_PRt(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Poisson(x, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Poisson(x, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Pycke_q(x, q = 0.5, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Pycke_q(x, q = 0.5, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Rothman(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Rothman(x, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Riesz(x, s = 1, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Riesz(x, s = 1, K_max = 1000, thre = 0, method = "I", ...) p_cir_stat_Sobolev(x, vk2 = c(0, 0, 1), method = "I", ...) d_cir_stat_Sobolev(x, vk2 = c(0, 0, 1), method = "I", ...) p_cir_stat_Softmax(x, kappa = 1, K_max = 1000, thre = 0, method = "I", ...) d_cir_stat_Softmax(x, kappa = 1, K_max = 1000, thre = 0, method = "I", ...)
x |
a vector of size |
K_Kolmogorov , K_Kuiper , K_Watson , K_Watson_1976 , K_Ajne
|
integer giving
the truncation of the series present in the null asymptotic distributions.
For the Kolmogorov-Smirnov-related series defaults to |
alternating |
use the alternating series expansion for the distribution
of the Kolmogorov-Smirnov statistic? Defaults to |
n |
sample size employed for computing the statistic. |
asymp_std |
compute the distribution associated to the normalized
Hodges-Ajne statistic? Defaults to |
exact |
use the exact distribution for the Hodges-Ajne statistic?
Defaults to |
second_term |
use the second-order series expansion for the
distribution of the Kuiper statistic? Defaults to |
Stephens |
compute Stephens (1970) modification so that the null distribution of the is less dependent on the sample size? The modification does not alter the test decision. |
abs_val |
compute the distribution associated to the absolute value of
the Darling's log gaps statistic? Defaults to |
N |
number of points used in the
Gauss-Legendre quadrature. Defaults to |
max_gap |
compute the distribution associated to the maximum gap for
the range statistic? Defaults to |
K_max |
integer giving the truncation of the series that compute the
asymptotic p-value of a Sobolev test. Defaults to |
thre |
error threshold for the tail probability given by the
the first terms of the truncated series of a Sobolev test. Defaults to
|
method |
method for approximating the density, distribution, or
quantile function of the weighted sum of chi squared random variables. Must
be |
... |
further parameters passed to |
t |
|
rho |
|
q |
|
s |
|
vk2 |
weights for the finite Sobolev test. A non-negative vector or
matrix. Defaults to |
kappa |
|
Descriptions and references for most of the tests are available in García-Portugués and Verdebout (2018).
A matrix of size c(nx, 1)
with the evaluation of the
distribution or density function at x
.
García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.
# Ajne curve(d_cir_stat_Ajne(x), to = 1.5, n = 2e2, ylim = c(0, 4)) curve(p_cir_stat_Ajne(x), n = 2e2, col = 2, add = TRUE) # Bakshaev curve(d_cir_stat_Bakshaev(x, method = "HBE"), to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Bakshaev(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Bingham curve(d_cir_stat_Bingham(x), to = 12, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Bingham(x), n = 2e2, col = 2, add = TRUE) # Greenwood curve(d_cir_stat_Greenwood(x), from = -6, to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Greenwood(x), n = 2e2, col = 2, add = TRUE) # Hermans-Rasson curve(p_cir_stat_Hermans_Rasson(x, method = "HBE"), to = 10, n = 2e2, ylim = c(0, 1)) curve(d_cir_stat_Hermans_Rasson(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Hodges-Ajne plot(25:45, d_cir_stat_Hodges_Ajne(cbind(25:45), n = 50), type = "h", lwd = 2, ylim = c(0, 1)) lines(25:45, p_cir_stat_Hodges_Ajne(cbind(25:45), n = 50), type = "s", col = 2) # Kolmogorov-Smirnov curve(d_Kolmogorov(x), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_Kolmogorov(x), n = 2e2, col = 2, add = TRUE) # Kuiper curve(d_cir_stat_Kuiper(x, n = 50), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Kuiper(x, n = 50), n = 2e2, col = 2, add = TRUE) # Kuiper and Watson with Stephens modification curve(d_cir_stat_Kuiper(x, n = 8, Stephens = TRUE), to = 2.5, n = 2e2, ylim = c(0, 10)) curve(d_cir_stat_Watson(x, n = 8, Stephens = TRUE), n = 2e2, lty = 2, add = TRUE) n <- c(10, 20, 30, 40, 50, 100, 500) col <- rainbow(length(n)) for (i in seq_along(n)) { curve(d_cir_stat_Kuiper(x, n = n[i], Stephens = TRUE), n = 2e2, col = col[i], add = TRUE) curve(d_cir_stat_Watson(x, n = n[i], Stephens = TRUE), n = 2e2, col = col[i], lty = 2, add = TRUE) } # Maximum uncovered spacing curve(d_cir_stat_Max_uncover(x), from = -3, to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Max_uncover(x), n = 2e2, col = 2, add = TRUE) # Number of uncovered spacing curve(d_cir_stat_Num_uncover(x), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Num_uncover(x), n = 2e2, col = 2, add = TRUE) # Log gaps curve(d_cir_stat_Log_gaps(x), from = -1, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Log_gaps(x), n = 2e2, col = 2, add = TRUE) # Gine Fn curve(d_cir_stat_Gine_Fn(x, method = "HBE"), to = 2.5, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Gine_Fn(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Gine Gn curve(d_cir_stat_Gine_Gn(x, method = "HBE"), to = 2.5, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Gine_Gn(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Gini mean difference curve(d_cir_stat_Gini(x), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Gini(x), n = 2e2, col = 2, add = TRUE) # Gini mean squared difference curve(d_cir_stat_Gini_squared(x), from = -10, to = 10, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Gini_squared(x), n = 2e2, col = 2, add = TRUE) # PAD curve(d_cir_stat_PAD(x, method = "HBE"), to = 3, n = 2e2, ylim = c(0, 1.5)) curve(p_cir_stat_PAD(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # PCvM curve(d_cir_stat_PCvM(x, method = "HBE"), to = 4, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_PCvM(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # PRt curve(d_cir_stat_PRt(x, method = "HBE"), n = 2e2, ylim = c(0, 5)) curve(p_cir_stat_PRt(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Poisson curve(d_cir_stat_Poisson(x, method = "HBE"), from = -1, to = 5, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Poisson(x, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Pycke curve(d_cir_stat_Pycke(x), from = -5, to = 10, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Pycke(x), n = 2e2, col = 2, add = TRUE) # Pycke q curve(d_cir_stat_Pycke_q(x, method = "HBE"), to = 15, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Pycke_q(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Range curve(d_cir_stat_Range(x, n = 50), to = 2, n = 2e2, ylim = c(0, 4)) curve(p_cir_stat_Range(x, n = 50), n = 2e2, col = 2, add = TRUE) # Rao curve(d_cir_stat_Rao(x), from = -6, to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Rao(x), n = 2e2, col = 2, add = TRUE) # Rayleigh curve(d_cir_stat_Rayleigh(x), to = 12, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Rayleigh(x), n = 2e2, col = 2, add = TRUE) # Riesz curve(d_cir_stat_Riesz(x, method = "HBE"), to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Riesz(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Rothman curve(d_cir_stat_Rothman(x, method = "HBE"), n = 2e2, ylim = c(0, 5)) curve(p_cir_stat_Rothman(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Vacancy curve(d_cir_stat_Vacancy(x), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Vacancy(x), n = 2e2, col = 2, add = TRUE) # Watson curve(d_cir_stat_Watson(x), to = 0.5, n = 2e2, ylim = c(0, 15)) curve(p_cir_stat_Watson(x), n = 2e2, col = 2, add = TRUE) # Watson (1976) curve(d_cir_stat_Watson_1976(x), to = 1.5, n = 2e2, ylim = c(0, 3)) curve(p_cir_stat_Watson_1976(x), n = 2e2, col = 2, add = TRUE) # Softmax curve(d_cir_stat_Softmax(x, method = "HBE"), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Softmax(x, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Sobolev vk2 <- c(0.5, 0) curve(d_cir_stat_Sobolev(x = x, vk2 = vk2), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Sobolev(x = x, vk2 = vk2), n = 2e2, col = 2, add = TRUE)
# Ajne curve(d_cir_stat_Ajne(x), to = 1.5, n = 2e2, ylim = c(0, 4)) curve(p_cir_stat_Ajne(x), n = 2e2, col = 2, add = TRUE) # Bakshaev curve(d_cir_stat_Bakshaev(x, method = "HBE"), to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Bakshaev(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Bingham curve(d_cir_stat_Bingham(x), to = 12, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Bingham(x), n = 2e2, col = 2, add = TRUE) # Greenwood curve(d_cir_stat_Greenwood(x), from = -6, to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Greenwood(x), n = 2e2, col = 2, add = TRUE) # Hermans-Rasson curve(p_cir_stat_Hermans_Rasson(x, method = "HBE"), to = 10, n = 2e2, ylim = c(0, 1)) curve(d_cir_stat_Hermans_Rasson(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Hodges-Ajne plot(25:45, d_cir_stat_Hodges_Ajne(cbind(25:45), n = 50), type = "h", lwd = 2, ylim = c(0, 1)) lines(25:45, p_cir_stat_Hodges_Ajne(cbind(25:45), n = 50), type = "s", col = 2) # Kolmogorov-Smirnov curve(d_Kolmogorov(x), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_Kolmogorov(x), n = 2e2, col = 2, add = TRUE) # Kuiper curve(d_cir_stat_Kuiper(x, n = 50), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Kuiper(x, n = 50), n = 2e2, col = 2, add = TRUE) # Kuiper and Watson with Stephens modification curve(d_cir_stat_Kuiper(x, n = 8, Stephens = TRUE), to = 2.5, n = 2e2, ylim = c(0, 10)) curve(d_cir_stat_Watson(x, n = 8, Stephens = TRUE), n = 2e2, lty = 2, add = TRUE) n <- c(10, 20, 30, 40, 50, 100, 500) col <- rainbow(length(n)) for (i in seq_along(n)) { curve(d_cir_stat_Kuiper(x, n = n[i], Stephens = TRUE), n = 2e2, col = col[i], add = TRUE) curve(d_cir_stat_Watson(x, n = n[i], Stephens = TRUE), n = 2e2, col = col[i], lty = 2, add = TRUE) } # Maximum uncovered spacing curve(d_cir_stat_Max_uncover(x), from = -3, to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Max_uncover(x), n = 2e2, col = 2, add = TRUE) # Number of uncovered spacing curve(d_cir_stat_Num_uncover(x), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Num_uncover(x), n = 2e2, col = 2, add = TRUE) # Log gaps curve(d_cir_stat_Log_gaps(x), from = -1, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Log_gaps(x), n = 2e2, col = 2, add = TRUE) # Gine Fn curve(d_cir_stat_Gine_Fn(x, method = "HBE"), to = 2.5, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Gine_Fn(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Gine Gn curve(d_cir_stat_Gine_Gn(x, method = "HBE"), to = 2.5, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Gine_Gn(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Gini mean difference curve(d_cir_stat_Gini(x), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Gini(x), n = 2e2, col = 2, add = TRUE) # Gini mean squared difference curve(d_cir_stat_Gini_squared(x), from = -10, to = 10, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Gini_squared(x), n = 2e2, col = 2, add = TRUE) # PAD curve(d_cir_stat_PAD(x, method = "HBE"), to = 3, n = 2e2, ylim = c(0, 1.5)) curve(p_cir_stat_PAD(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # PCvM curve(d_cir_stat_PCvM(x, method = "HBE"), to = 4, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_PCvM(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # PRt curve(d_cir_stat_PRt(x, method = "HBE"), n = 2e2, ylim = c(0, 5)) curve(p_cir_stat_PRt(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Poisson curve(d_cir_stat_Poisson(x, method = "HBE"), from = -1, to = 5, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Poisson(x, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Pycke curve(d_cir_stat_Pycke(x), from = -5, to = 10, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Pycke(x), n = 2e2, col = 2, add = TRUE) # Pycke q curve(d_cir_stat_Pycke_q(x, method = "HBE"), to = 15, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Pycke_q(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Range curve(d_cir_stat_Range(x, n = 50), to = 2, n = 2e2, ylim = c(0, 4)) curve(p_cir_stat_Range(x, n = 50), n = 2e2, col = 2, add = TRUE) # Rao curve(d_cir_stat_Rao(x), from = -6, to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Rao(x), n = 2e2, col = 2, add = TRUE) # Rayleigh curve(d_cir_stat_Rayleigh(x), to = 12, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Rayleigh(x), n = 2e2, col = 2, add = TRUE) # Riesz curve(d_cir_stat_Riesz(x, method = "HBE"), to = 6, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Riesz(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Rothman curve(d_cir_stat_Rothman(x, method = "HBE"), n = 2e2, ylim = c(0, 5)) curve(p_cir_stat_Rothman(x, method = "HBE"), n = 2e2, add = TRUE, col = 2) # Vacancy curve(d_cir_stat_Vacancy(x), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_cir_stat_Vacancy(x), n = 2e2, col = 2, add = TRUE) # Watson curve(d_cir_stat_Watson(x), to = 0.5, n = 2e2, ylim = c(0, 15)) curve(p_cir_stat_Watson(x), n = 2e2, col = 2, add = TRUE) # Watson (1976) curve(d_cir_stat_Watson_1976(x), to = 1.5, n = 2e2, ylim = c(0, 3)) curve(p_cir_stat_Watson_1976(x), n = 2e2, col = 2, add = TRUE) # Softmax curve(d_cir_stat_Softmax(x, method = "HBE"), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Softmax(x, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Sobolev vk2 <- c(0.5, 0) curve(d_cir_stat_Sobolev(x = x, vk2 = vk2), to = 3, n = 2e2, ylim = c(0, 2)) curve(p_cir_stat_Sobolev(x = x, vk2 = vk2), n = 2e2, col = 2, add = TRUE)
Computation of the asymptotic null distributions of spherical uniformity statistics.
p_sph_stat_Bingham(x, p) d_sph_stat_Bingham(x, p) p_sph_stat_CJ12(x, regime = 1L, beta = 0) d_sph_stat_CJ12(x, regime = 3L, beta = 0) p_sph_stat_Rayleigh(x, p) d_sph_stat_Rayleigh(x, p) p_sph_stat_Rayleigh_HD(x, p) d_sph_stat_Rayleigh_HD(x, p) p_sph_stat_Ajne(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Ajne(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Bakshaev(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Bakshaev(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Gine_Fn(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Gine_Fn(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Gine_Gn(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Gine_Gn(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_PAD(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_PAD(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_PCvM(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_PCvM(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Poisson(x, p, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Poisson(x, p, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_PRt(x, p, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_PRt(x, p, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Riesz(x, p, s = 1, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Riesz(x, p, s = 1, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Sobolev(x, p, vk2 = c(0, 0, 1), method = "I", ...) d_sph_stat_Sobolev(x, p, vk2 = c(0, 0, 1), method = "I", ...) p_sph_stat_Softmax(x, p, kappa = 1, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Softmax(x, p, kappa = 1, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Stereo(x, p, a = 0, K_max = 1000, method = "I", ...) d_sph_stat_Stereo(x, p, a = 0, K_max = 1000, method = "I", ...)
p_sph_stat_Bingham(x, p) d_sph_stat_Bingham(x, p) p_sph_stat_CJ12(x, regime = 1L, beta = 0) d_sph_stat_CJ12(x, regime = 3L, beta = 0) p_sph_stat_Rayleigh(x, p) d_sph_stat_Rayleigh(x, p) p_sph_stat_Rayleigh_HD(x, p) d_sph_stat_Rayleigh_HD(x, p) p_sph_stat_Ajne(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Ajne(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Bakshaev(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Bakshaev(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Gine_Fn(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Gine_Fn(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Gine_Gn(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Gine_Gn(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_PAD(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_PAD(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_PCvM(x, p, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_PCvM(x, p, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Poisson(x, p, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Poisson(x, p, rho = 0.5, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_PRt(x, p, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_PRt(x, p, t = 1/3, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Riesz(x, p, s = 1, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Riesz(x, p, s = 1, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Sobolev(x, p, vk2 = c(0, 0, 1), method = "I", ...) d_sph_stat_Sobolev(x, p, vk2 = c(0, 0, 1), method = "I", ...) p_sph_stat_Softmax(x, p, kappa = 1, K_max = 1000, thre = 0, method = "I", ...) d_sph_stat_Softmax(x, p, kappa = 1, K_max = 1000, thre = 0, method = "I", ...) p_sph_stat_Stereo(x, p, a = 0, K_max = 1000, method = "I", ...) d_sph_stat_Stereo(x, p, a = 0, K_max = 1000, method = "I", ...)
x |
a vector of size |
p |
integer giving the dimension of the ambient space |
regime |
type of asymptotic regime for the CJ12 test, either |
beta |
|
K_max |
integer giving the truncation of the series that compute the
asymptotic p-value of a Sobolev test. Defaults to |
thre |
error threshold for the tail probability given by the
the first terms of the truncated series of a Sobolev test. Defaults to
|
method |
method for approximating the density, distribution, or
quantile function of the weighted sum of chi squared random variables. Must
be |
... |
further parameters passed to |
rho |
|
t |
|
s |
|
vk2 |
weights for the finite Sobolev test. A non-negative vector or
matrix. Defaults to |
kappa |
|
a |
either:
|
Descriptions and references on most of the asymptotic distributions are available in García-Portugués and Verdebout (2018).
r_sph_stat_*
: a matrix of size c(n, 1)
containing
the sample.
p_sph_stat_*
, d_sph_stat_*
: a matrix of size
c(nx, 1)
with the evaluation of the distribution or density
functions at x
.
# Ajne curve(d_sph_stat_Ajne(x, p = 3, method = "HBE"), n = 2e2, ylim = c(0, 4)) curve(p_sph_stat_Ajne(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Bakshaev curve(d_sph_stat_Bakshaev(x, p = 3, method = "HBE"), to = 5, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Bakshaev(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Bingham curve(d_sph_stat_Bingham(x, p = 3), to = 20, n = 2e2, ylim = c(0, 1)) curve(p_sph_stat_Bingham(x, p = 3), n = 2e2, col = 2, add = TRUE) # CJ12 curve(d_sph_stat_CJ12(x, regime = 1), from = -10, to = 10, n = 2e2, ylim = c(0, 1)) curve(d_sph_stat_CJ12(x, regime = 2, beta = 0.1), n = 2e2, col = 2, add = TRUE) curve(d_sph_stat_CJ12(x, regime = 3), n = 2e2, col = 3, add = TRUE) curve(p_sph_stat_CJ12(x, regime = 1), n = 2e2, col = 1, add = TRUE) curve(p_sph_stat_CJ12(x, regime = 2, beta = 0.1), n = 2e2, col = 2, add = TRUE) curve(p_sph_stat_CJ12(x, regime = 3), col = 3, add = TRUE) # Gine Fn curve(d_sph_stat_Gine_Fn(x, p = 3, method = "HBE"), to = 2, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Gine_Fn(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Gine Gn curve(d_sph_stat_Gine_Gn(x, p = 3, method = "HBE"), to = 1.5, n = 2e2, ylim = c(0, 2.5)) curve(p_sph_stat_Gine_Gn(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # PAD curve(d_sph_stat_PAD(x, p = 3, method = "HBE"), to = 3, n = 2e2, ylim = c(0, 1.5)) curve(p_sph_stat_PAD(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # PCvM curve(d_sph_stat_PCvM(x, p = 3, method = "HBE"), to = 0.6, n = 2e2, ylim = c(0, 7)) curve(p_sph_stat_PCvM(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Poisson curve(d_sph_stat_Poisson(x, p = 3, method = "HBE"), to = 2, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Poisson(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # PRt curve(d_sph_stat_PRt(x, p = 3, method = "HBE"), n = 2e2, ylim = c(0, 5)) curve(p_sph_stat_PRt(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Rayleigh curve(d_sph_stat_Rayleigh(x, p = 3), to = 15, n = 2e2, ylim = c(0, 1)) curve(p_sph_stat_Rayleigh(x, p = 3), n = 2e2, col = 2, add = TRUE) # HD-standardized Rayleigh curve(d_sph_stat_Rayleigh_HD(x, p = 3), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_sph_stat_Rayleigh_HD(x, p = 3), n = 2e2, col = 2, add = TRUE) # Riesz curve(d_sph_stat_Riesz(x, p = 3, method = "HBE"), n = 2e2, from = 0, to = 5, ylim = c(0, 2)) curve(p_sph_stat_Riesz(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Sobolev x <- seq(-1, 5, by = 0.05) vk2 <- diag(rep(0.3, 2)) matplot(x, d_sph_stat_Sobolev(x = x, vk2 = vk2, p = 3), type = "l", ylim = c(0, 1), lty = 1) matlines(x, p_sph_stat_Sobolev(x = x, vk2 = vk2, p = 3), lty = 1) matlines(x, d_sph_stat_Sobolev(x = x, vk2 = vk2 + 0.01, p = 3), lty = 2) matlines(x, p_sph_stat_Sobolev(x = x, vk2 = vk2 + 0.01, p = 3), lty = 2) # Softmax curve(d_sph_stat_Softmax(x, p = 3, method = "HBE"), to = 2, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Softmax(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Stereo curve(d_sph_stat_Stereo(x, p = 4, method = "HBE"), from=-5,to = 10, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Stereo(x, p = 4, method = "HBE"), n = 2e2, col = 2, add = TRUE)
# Ajne curve(d_sph_stat_Ajne(x, p = 3, method = "HBE"), n = 2e2, ylim = c(0, 4)) curve(p_sph_stat_Ajne(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Bakshaev curve(d_sph_stat_Bakshaev(x, p = 3, method = "HBE"), to = 5, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Bakshaev(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Bingham curve(d_sph_stat_Bingham(x, p = 3), to = 20, n = 2e2, ylim = c(0, 1)) curve(p_sph_stat_Bingham(x, p = 3), n = 2e2, col = 2, add = TRUE) # CJ12 curve(d_sph_stat_CJ12(x, regime = 1), from = -10, to = 10, n = 2e2, ylim = c(0, 1)) curve(d_sph_stat_CJ12(x, regime = 2, beta = 0.1), n = 2e2, col = 2, add = TRUE) curve(d_sph_stat_CJ12(x, regime = 3), n = 2e2, col = 3, add = TRUE) curve(p_sph_stat_CJ12(x, regime = 1), n = 2e2, col = 1, add = TRUE) curve(p_sph_stat_CJ12(x, regime = 2, beta = 0.1), n = 2e2, col = 2, add = TRUE) curve(p_sph_stat_CJ12(x, regime = 3), col = 3, add = TRUE) # Gine Fn curve(d_sph_stat_Gine_Fn(x, p = 3, method = "HBE"), to = 2, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Gine_Fn(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Gine Gn curve(d_sph_stat_Gine_Gn(x, p = 3, method = "HBE"), to = 1.5, n = 2e2, ylim = c(0, 2.5)) curve(p_sph_stat_Gine_Gn(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # PAD curve(d_sph_stat_PAD(x, p = 3, method = "HBE"), to = 3, n = 2e2, ylim = c(0, 1.5)) curve(p_sph_stat_PAD(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # PCvM curve(d_sph_stat_PCvM(x, p = 3, method = "HBE"), to = 0.6, n = 2e2, ylim = c(0, 7)) curve(p_sph_stat_PCvM(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Poisson curve(d_sph_stat_Poisson(x, p = 3, method = "HBE"), to = 2, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Poisson(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # PRt curve(d_sph_stat_PRt(x, p = 3, method = "HBE"), n = 2e2, ylim = c(0, 5)) curve(p_sph_stat_PRt(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Rayleigh curve(d_sph_stat_Rayleigh(x, p = 3), to = 15, n = 2e2, ylim = c(0, 1)) curve(p_sph_stat_Rayleigh(x, p = 3), n = 2e2, col = 2, add = TRUE) # HD-standardized Rayleigh curve(d_sph_stat_Rayleigh_HD(x, p = 3), from = -4, to = 4, n = 2e2, ylim = c(0, 1)) curve(p_sph_stat_Rayleigh_HD(x, p = 3), n = 2e2, col = 2, add = TRUE) # Riesz curve(d_sph_stat_Riesz(x, p = 3, method = "HBE"), n = 2e2, from = 0, to = 5, ylim = c(0, 2)) curve(p_sph_stat_Riesz(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Sobolev x <- seq(-1, 5, by = 0.05) vk2 <- diag(rep(0.3, 2)) matplot(x, d_sph_stat_Sobolev(x = x, vk2 = vk2, p = 3), type = "l", ylim = c(0, 1), lty = 1) matlines(x, p_sph_stat_Sobolev(x = x, vk2 = vk2, p = 3), lty = 1) matlines(x, d_sph_stat_Sobolev(x = x, vk2 = vk2 + 0.01, p = 3), lty = 2) matlines(x, p_sph_stat_Sobolev(x = x, vk2 = vk2 + 0.01, p = 3), lty = 2) # Softmax curve(d_sph_stat_Softmax(x, p = 3, method = "HBE"), to = 2, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Softmax(x, p = 3, method = "HBE"), n = 2e2, col = 2, add = TRUE) # Stereo curve(d_sph_stat_Stereo(x, p = 4, method = "HBE"), from=-5,to = 10, n = 2e2, ylim = c(0, 2)) curve(p_sph_stat_Stereo(x, p = 4, method = "HBE"), n = 2e2, col = 2, add = TRUE)
Planet orbits data from the
JPL Keplerian Elements for Approximate Positions of the Major Planets.
The normal vector of a planet orbit represents is a vector on .
planets
planets
A data frame with 9 rows and 3 variables:
names of the planets and Pluto.
inclination; the orbit's plane angle with respect to the
ecliptic plane, in radians in .
longitude of the ascending node; the counterclockwise angle from
the vector pointing to the First Point of Aries and that pointing to
the ascending node (the intersection between orbit and ecliptic plane), in
radians in . (Both vectors are heliocentric and within
the ecliptic plane.)
The normal vector to the ecliptic plane of the planet with inclination
and longitude of the ascending node
is
The script performing the data preprocessing is available at
planets.R
. The data was retrieved on 2020-05-16.
Table 2a in https://ssd.jpl.nasa.gov/planets/approx_pos.html
# Load data data("planets") # Add normal vectors planets$normal <- cbind(sin(planets$i) * sin(planets$om), -sin(planets$i) * cos(planets$om), cos(planets$i)) # Tests to be performed type_tests <- c("PCvM", "PAD", "PRt") # Tests with Pluto unif_test(data = planets$normal, type = type_tests, p_value = "MC") # Tests without Pluto unif_test(data = planets$normal[-9, ], type = type_tests, p_value = "MC")
# Load data data("planets") # Add normal vectors planets$normal <- cbind(sin(planets$i) * sin(planets$om), -sin(planets$i) * cos(planets$om), cos(planets$i)) # Tests to be performed type_tests <- c("PCvM", "PAD", "PRt") # Tests with Pluto unif_test(data = planets$normal, type = type_tests, p_value = "MC") # Tests without Pluto unif_test(data = planets$normal[-9, ], type = type_tests, p_value = "MC")
Computation of the kernels
where is the proportion of area surface of
covered by the
intersection of two hyperspherical caps with common solid
angle
and centers separated by
an angle
,
is the distribution function
of the projected spherical uniform distribution,
and
is a measure on
.
Also, computation of the Gegenbauer coefficients of
:
These coefficients can also be computed via
for a certain function . They serve to define
projected alternatives to uniformity.
psi_Pn(theta, q, type, Rothman_t = 1/3, tilde = FALSE, psi_Gauss = TRUE, psi_N = 320, tol = 1e-06) Gegen_coefs_Pn(k, p, type, Rothman_t = 1/3, Gauss = TRUE, N = 320, tol = 1e-06, verbose = FALSE) akx(x, p, k, sqr = FALSE) f_locdev_Pn(p, type, K = 1000, N = 320, K_max = 10000, thre = 0.001, Rothman_t = 1/3, verbose = FALSE)
psi_Pn(theta, q, type, Rothman_t = 1/3, tilde = FALSE, psi_Gauss = TRUE, psi_N = 320, tol = 1e-06) Gegen_coefs_Pn(k, p, type, Rothman_t = 1/3, Gauss = TRUE, N = 320, tol = 1e-06, verbose = FALSE) akx(x, p, k, sqr = FALSE) f_locdev_Pn(p, type, K = 1000, N = 320, K_max = 10000, thre = 0.001, Rothman_t = 1/3, verbose = FALSE)
theta |
vector with values in |
q |
integer giving the dimension of the sphere |
type |
type of projected-ecdf test statistic. Must be either
|
Rothman_t |
|
tilde |
include the constant and bias term? Defaults to |
psi_Gauss |
use a Gauss–Legendre quadrature
rule with |
psi_N |
number of points used in the Gauss–Legendre quadrature for
computing the kernel function. Defaults to |
tol |
tolerance passed to |
k |
vector with the index of coefficients. |
p |
integer giving the dimension of the ambient space |
Gauss |
use a Gauss–Legendre quadrature rule of |
N |
number of points used in the
Gauss–Legendre quadrature for computing the Gegenbauer coefficients.
Defaults to |
verbose |
flag to print informative messages. Defaults to |
x |
evaluation points for |
sqr |
return the signed square root of |
K |
number of equispaced points on |
K_max |
integer giving the truncation of the series. Defaults to
|
thre |
proportion of norm not explained by the first terms of the
truncated series. Defaults to |
The evaluation of and
depends on the type of
projected-ecdf statistic:
PCvM: closed-form expressions for and
with
, numerical integration required for
.
PAD: closed-form expressions for and
,
numerical integration required for
with
and
with
and
.
PRt: closed-form expressions for and
for any
.
See García-Portugués et al. (2023) for more details.
psi_Pn
: a vector of size length(theta)
with the
evaluation of .
Gegen_coefs_Pn
: a vector of size length(k)
containing
the coefficients .
akx
: a matrix of size c(length(x), length(k))
containing the coefficients .
f_locdev_Pn
: the projected alternative as a function
ready to be evaluated.
Eduardo García-Portugués and Paula Navarro-Esteban.
García-Portugués, E., Navarro-Esteban, P., Cuesta-Albertos, J. A. (2023) On a projection-based class of uniformity tests on the hypersphere. Bernoulli, 29(1):181–204. doi:10.3150/21-BEJ1454.
# Kernels in the projected-ecdf test statistics k <- 0:10 coefs <- list() (coefs$PCvM <- t(sapply(2:5, function(p) Gegen_coefs_Pn(k = k, p = p, type = "PCvM")))) (coefs$PAD <- t(sapply(2:5, function(p) Gegen_coefs_Pn(k = k, p = p, type = "PAD")))) (coefs$PRt <- t(sapply(2:5, function(p) Gegen_coefs_Pn(k = k, p = p, type = "PRt")))) # Gegenbauer expansion th <- seq(0, pi, length.out = 501)[-501] old_par <- par(mfrow = c(3, 4)) for (type in c("PCvM", "PAD", "PRt")) { for (p in 2:5) { plot(th, psi_Pn(theta = th, q = p - 1, type = type), type = "l", main = paste0(type, ", p = ", p), xlab = expression(theta), ylab = expression(psi(theta)), axes = FALSE, ylim = c(-1.5, 1)) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() lines(th, Gegen_series(theta = th, coefs = coefs[[type]][p - 1, ], k = k, p = p), col = 2) } } par(old_par) # Analytical coefficients vs. numerical integration test_coef <- function(type, p, k = 0:20) { plot(k, log1p(abs(Gegen_coefs_Pn(k = k, p = p, type = type))), ylab = "Coefficients", main = paste0(type, ", p = ", p)) points(k, log1p(abs(Gegen_coefs(k = k, p = p, psi = psi_Pn, type = type, q = p - 1))), col = 2) legend("topright", legend = c("log(1 + Gegen_coefs_Pn))", "log(1 + Gegen_coefs(psi_Pn))"), lwd = 2, col = 1:2) } # PCvM statistic old_par <- par(mfrow = c(2, 2)) for (p in 2:5) test_coef(type = "PCvM", p = p) par(old_par) # PAD statistic old_par <- par(mfrow = c(2, 2)) for (p in 2:5) test_coef(type = "PAD", p = p) par(old_par) # PRt statistic old_par <- par(mfrow = c(2, 2)) for (p in 2:5) test_coef(type = "PRt", p = p) par(old_par) # akx akx(x = seq(-1, 1, l = 5), k = 1:4, p = 2) akx(x = 0, k = 1:4, p = 3) # PRt alternative to uniformity z <- seq(-1, 1, l = 1e3) p <- c(2:5, 10, 15, 17) col <- viridisLite::viridis(length(p)) plot(z, f_locdev_Pn(p = p[1], type = "PRt")(z), type = "s", col = col[1], ylim = c(0, 0.6), ylab = expression(f[Rt](z))) for (k in 2:length(p)) { lines(z, f_locdev_Pn(p = p[k], type = "PRt")(z), type = "s", col = col[k]) } legend("topleft", legend = paste("p =", p), col = col, lwd = 2)
# Kernels in the projected-ecdf test statistics k <- 0:10 coefs <- list() (coefs$PCvM <- t(sapply(2:5, function(p) Gegen_coefs_Pn(k = k, p = p, type = "PCvM")))) (coefs$PAD <- t(sapply(2:5, function(p) Gegen_coefs_Pn(k = k, p = p, type = "PAD")))) (coefs$PRt <- t(sapply(2:5, function(p) Gegen_coefs_Pn(k = k, p = p, type = "PRt")))) # Gegenbauer expansion th <- seq(0, pi, length.out = 501)[-501] old_par <- par(mfrow = c(3, 4)) for (type in c("PCvM", "PAD", "PRt")) { for (p in 2:5) { plot(th, psi_Pn(theta = th, q = p - 1, type = type), type = "l", main = paste0(type, ", p = ", p), xlab = expression(theta), ylab = expression(psi(theta)), axes = FALSE, ylim = c(-1.5, 1)) axis(1, at = c(0, pi / 4, pi / 2, 3 * pi / 4, pi), labels = expression(0, pi / 4, pi / 2, 3 * pi / 4, pi)) axis(2); box() lines(th, Gegen_series(theta = th, coefs = coefs[[type]][p - 1, ], k = k, p = p), col = 2) } } par(old_par) # Analytical coefficients vs. numerical integration test_coef <- function(type, p, k = 0:20) { plot(k, log1p(abs(Gegen_coefs_Pn(k = k, p = p, type = type))), ylab = "Coefficients", main = paste0(type, ", p = ", p)) points(k, log1p(abs(Gegen_coefs(k = k, p = p, psi = psi_Pn, type = type, q = p - 1))), col = 2) legend("topright", legend = c("log(1 + Gegen_coefs_Pn))", "log(1 + Gegen_coefs(psi_Pn))"), lwd = 2, col = 1:2) } # PCvM statistic old_par <- par(mfrow = c(2, 2)) for (p in 2:5) test_coef(type = "PCvM", p = p) par(old_par) # PAD statistic old_par <- par(mfrow = c(2, 2)) for (p in 2:5) test_coef(type = "PAD", p = p) par(old_par) # PRt statistic old_par <- par(mfrow = c(2, 2)) for (p in 2:5) test_coef(type = "PRt", p = p) par(old_par) # akx akx(x = seq(-1, 1, l = 5), k = 1:4, p = 2) akx(x = 0, k = 1:4, p = 3) # PRt alternative to uniformity z <- seq(-1, 1, l = 1e3) p <- c(2:5, 10, 15, 17) col <- viridisLite::viridis(length(p)) plot(z, f_locdev_Pn(p = p[1], type = "PRt")(z), type = "s", col = col[1], ylim = c(0, 0.6), ylab = expression(f[Rt](z))) for (k in 2:length(p)) { lines(z, f_locdev_Pn(p = p[k], type = "PRt")(z), type = "s", col = col[k]) } legend("topleft", legend = paste("p =", p), col = col, lwd = 2)
Density, distribution, and quantile functions of the
projection of the spherical uniform random variable on an arbitrary
direction, that is, the random variable
, where
is uniformly distributed on the (hyper)sphere
,
, and
is an
arbitrary projection direction. Note that the distribution is
invariant to the choice of
. Also,
efficient simulation of
.
d_proj_unif(x, p, log = FALSE) p_proj_unif(x, p, log = FALSE) q_proj_unif(u, p) r_proj_unif(n, p)
d_proj_unif(x, p, log = FALSE) p_proj_unif(x, p, log = FALSE) q_proj_unif(u, p) r_proj_unif(n, p)
x |
a vector of size |
p |
integer giving the dimension of the ambient space |
log |
compute the logarithm of the density or distribution? |
u |
vector of probabilities. |
n |
sample size. |
A matrix of size c(nx, 1)
with the evaluation of the
density, distribution, or quantile function at x
or u
.
For r_proj_unif
, a random vector of size n
.
Eduardo García-Portugués and Paula Navarro-Esteban.
# Density function curve(d_proj_unif(x, p = 2), from = -2, to = 2, n = 2e2, ylim = c(0, 2)) curve(d_proj_unif(x, p = 3), n = 2e2, col = 2, add = TRUE) curve(d_proj_unif(x, p = 4), n = 2e2, col = 3, add = TRUE) curve(d_proj_unif(x, p = 5), n = 2e2, col = 4, add = TRUE) curve(d_proj_unif(x, p = 6), n = 2e2, col = 5, add = TRUE) # Distribution function curve(p_proj_unif(x, p = 2), from = -2, to = 2, n = 2e2, ylim = c(0, 1)) curve(p_proj_unif(x, p = 3), n = 2e2, col = 2, add = TRUE) curve(p_proj_unif(x, p = 4), n = 2e2, col = 3, add = TRUE) curve(p_proj_unif(x, p = 5), n = 2e2, col = 4, add = TRUE) curve(p_proj_unif(x, p = 6), n = 2e2, col = 5, add = TRUE) # Quantile function curve(q_proj_unif(u = x, p = 2), from = 0, to = 1, n = 2e2, ylim = c(-1, 1)) curve(q_proj_unif(u = x, p = 3), n = 2e2, col = 2, add = TRUE) curve(q_proj_unif(u = x, p = 4), n = 2e2, col = 3, add = TRUE) curve(q_proj_unif(u = x, p = 5), n = 2e2, col = 4, add = TRUE) curve(q_proj_unif(u = x, p = 6), n = 2e2, col = 5, add = TRUE) # Sampling hist(r_proj_unif(n = 1e4, p = 4), freq = FALSE, breaks = 50) curve(d_proj_unif(x, p = 4), n = 2e2, col = 3, add = TRUE)
# Density function curve(d_proj_unif(x, p = 2), from = -2, to = 2, n = 2e2, ylim = c(0, 2)) curve(d_proj_unif(x, p = 3), n = 2e2, col = 2, add = TRUE) curve(d_proj_unif(x, p = 4), n = 2e2, col = 3, add = TRUE) curve(d_proj_unif(x, p = 5), n = 2e2, col = 4, add = TRUE) curve(d_proj_unif(x, p = 6), n = 2e2, col = 5, add = TRUE) # Distribution function curve(p_proj_unif(x, p = 2), from = -2, to = 2, n = 2e2, ylim = c(0, 1)) curve(p_proj_unif(x, p = 3), n = 2e2, col = 2, add = TRUE) curve(p_proj_unif(x, p = 4), n = 2e2, col = 3, add = TRUE) curve(p_proj_unif(x, p = 5), n = 2e2, col = 4, add = TRUE) curve(p_proj_unif(x, p = 6), n = 2e2, col = 5, add = TRUE) # Quantile function curve(q_proj_unif(u = x, p = 2), from = 0, to = 1, n = 2e2, ylim = c(-1, 1)) curve(q_proj_unif(u = x, p = 3), n = 2e2, col = 2, add = TRUE) curve(q_proj_unif(u = x, p = 4), n = 2e2, col = 3, add = TRUE) curve(q_proj_unif(u = x, p = 5), n = 2e2, col = 4, add = TRUE) curve(q_proj_unif(u = x, p = 6), n = 2e2, col = 5, add = TRUE) # Sampling hist(r_proj_unif(n = 1e4, p = 4), freq = FALSE, breaks = 50) curve(d_proj_unif(x, p = 4), n = 2e2, col = 3, add = TRUE)
Efficient computation of the shortest angles matrix
, defined as
for a sample ,
.
For a circular sample ,
can be expressed as
Psi_mat(data, ind_tri = 0L, use_ind_tri = FALSE, scalar_prod = FALSE, angles_diff = FALSE) upper_tri_ind(n)
Psi_mat(data, ind_tri = 0L, use_ind_tri = FALSE, scalar_prod = FALSE, angles_diff = FALSE) upper_tri_ind(n)
data |
an array of size |
ind_tri |
if |
use_ind_tri |
use the already computed vector index |
scalar_prod |
return the scalar products
|
angles_diff |
return the (unwrapped) angles difference
|
n |
sample size, used to determine the index vector that gives the
upper triangular part of |
Psi_mat
: a matrix of size
c(n * (n - 1) / 2, M)
containing, for each column, the vector
half of for each of the
M
samples.
upper_tri_ind
: a matrix of size n * (n - 1) / 2
containing the 0-based linear indexes for extracting the upper triangular
matrix of a matrix of size c(n, n)
, diagonal excluded, assuming
column-major order.
Be careful on avoiding the next bad usages of Psi_mat
, which will
produce spurious results:
The directions in data
do not have unit norm when
Cartesian coordinates are employed.
The entries of data
are not in when
polar coordinates are employed.
ind_tri
is a vector of size n * (n - 1) / 2
that
does not contain the indexes produced by upper_tri_ind(n)
.
# Shortest angles n <- 5 X <- r_unif_sph(n = n, p = 2, M = 2) Theta <- X_to_Theta(X) dim(Theta) <- c(n, 1, 2) Psi_mat(X) Psi_mat(Theta) # Precompute ind_tri ind_tri <- upper_tri_ind(n) Psi_mat(X, ind_tri = ind_tri, use_ind_tri = TRUE) # Compare with R A <- acos(tcrossprod(X[, , 1])) ind <- upper.tri(A) A[ind] # Reconstruct matrix Psi_vec <- Psi_mat(Theta[, , 1, drop = FALSE]) Psi <- matrix(0, nrow = n, ncol = n) Psi[upper.tri(Psi)] <- Psi_vec Psi <- Psi + t(Psi)
# Shortest angles n <- 5 X <- r_unif_sph(n = n, p = 2, M = 2) Theta <- X_to_Theta(X) dim(Theta) <- c(n, 1, 2) Psi_mat(X) Psi_mat(Theta) # Precompute ind_tri ind_tri <- upper_tri_ind(n) Psi_mat(X, ind_tri = ind_tri, use_ind_tri = TRUE) # Compare with R A <- acos(tcrossprod(X[, , 1])) ind <- upper.tri(A) A[ind] # Reconstruct matrix Psi_vec <- Psi_mat(Theta[, , 1, drop = FALSE]) Psi <- matrix(0, nrow = n, ncol = n) Psi[upper.tri(Psi)] <- Psi_vec Psi <- Psi + t(Psi)
Simple simulation of prespecified non-uniform spherical distributions: von Mises–Fisher (vMF), Mixture of vMF (MvMF), Angular Central Gaussian (ACG), Small Circle (SC), Watson (W), Cauchy-like (C), Mixture of Cauchy-like (MC), or Uniform distribution with Antipodal-Dependent observations (UAD).
r_alt(n, p, M = 1, alt = "vMF", mu = c(rep(0, p - 1), 1), kappa = 1, nu = 0.5, F_inv = NULL, K = 1000, axial_mix = TRUE)
r_alt(n, p, M = 1, alt = "vMF", mu = c(rep(0, p - 1), 1), kappa = 1, nu = 0.5, F_inv = NULL, K = 1000, axial_mix = TRUE)
n |
sample size. |
p |
integer giving the dimension of the ambient space |
M |
number of samples of size |
alt |
alternative, must be |
mu |
location parameter for |
kappa |
non-negative parameter measuring the strength of the deviation
with respect to uniformity (obtained with |
nu |
projection along |
F_inv |
quantile function returned by |
K |
number of equispaced points on |
axial_mix |
use a mixture of von Mises–Fisher or Cauchy-like that is
axial (i.e., symmetrically distributed about the origin)? Defaults to
|
The parameter kappa
is used as in the following
distributions:
"vMF"
: von Mises–Fisher distribution with concentration
and directional mean
.
"MvMF"
: equally-weighted mixture of von Mises–Fisher
distributions with common concentration
and directional means
if
axial_mix = TRUE
. If axial_mix = FALSE
, then only means
with positive signs are considered.
"ACG"
: Angular Central Gaussian distribution with diagonal
shape matrix with diagonal given by
"SC"
: Small Circle distribution with axis mean
and concentration
about the projection
along the mean,
.
"W"
: Watson distribution with axis mean
and concentration
. The Watson distribution is a particular
case of the Bingham distribution.
"C"
: Cauchy-like distribution with directional mode
and concentration
. The circular Wrapped Cauchy
distribution is a particular case of this Cauchy-like distribution.
"MC"
: equally-weighted mixture of Cauchy-like
distributions with common concentration
and directional means
if
axial_mix = TRUE
. If axial_mix = FALSE
, then only means
with positive signs are considered.
The alternative "UAD"
generates a sample formed by
observations drawn uniformly on
and the remaining observations drawn from a uniform spherical cap
distribution of angle
about each of the
observations (see
unif_cap
). Hence,
kappa = 0
corresponds to a spherical cap covering the whole sphere and
kappa = pi
is a one-point degenerate spherical cap.
Much faster sampling for "SC"
, "W"
, "C"
, and "MC"
is achieved providing F_inv
; see examples.
An array of size c(n, p, M)
with M
random
samples of size n
of non-uniformly-generated directions on
.
## Simulation with p = 2 p <- 2 n <- 50 kappa <- 20 nu <- 0.5 angle <- pi / 10 rho <- ((2 * kappa + 1) - sqrt(4 * kappa + 1)) / (2 * kappa) F_inv_SC_2 <- F_inv_from_f(f = function(z) exp(-kappa * (z - nu)^2), p = 2) F_inv_W_2 <- F_inv_from_f(f = function(z) exp(kappa * z^2), p = 2) F_inv_C_2 <- F_inv_from_f(f = function(z) (1 - rho^2) / (1 + rho^2 - 2 * rho * z)^(p / 2), p = 2) x1 <- r_alt(n = n, p = p, alt = "vMF", kappa = kappa)[, , 1] x2 <- r_alt(n = n, p = p, alt = "C", F_inv = F_inv_C_2)[, , 1] x3 <- r_alt(n = n, p = p, alt = "SC", F_inv = F_inv_SC_2)[, , 1] x4 <- r_alt(n = n, p = p, alt = "ACG", kappa = kappa)[, , 1] x5 <- r_alt(n = n, p = p, alt = "W", F_inv = F_inv_W_2)[, , 1] x6 <- r_alt(n = n, p = p, alt = "MvMF", kappa = kappa)[, , 1] x7 <- r_alt(n = n, p = p, alt = "MC", kappa = kappa)[, , 1] x8 <- r_alt(n = n, p = p, alt = "UAD", kappa = 1 - angle)[, , 1] r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization plot(r * x1, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), col = 1) points(r * x2, pch = 16, col = 2) points(r * x3, pch = 16, col = 3) plot(r * x4, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), col = 1) points(r * x5, pch = 16, col = 2) points(r * x6, pch = 16, col = 3) points(r * x7, pch = 16, col = 4) col <- rep(rainbow(n / 2), 2) plot(r * x8, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), col = col) for (i in seq(1, n, by = 2)) lines((r * x8)[i + 0:1, ], col = col[i]) ## Simulation with p = 3 n <- 50 p <- 3 kappa <- 20 angle <- pi / 10 nu <- 0.5 rho <- ((2 * kappa + 1) - sqrt(4 * kappa + 1)) / (2 * kappa) F_inv_SC_3 <- F_inv_from_f(f = function(z) exp(-kappa * (z - nu)^2), p = 3) F_inv_W_3 <- F_inv_from_f(f = function(z) exp(kappa * z^2), p = 3) F_inv_C_3 <- F_inv_from_f(f = function(z) (1 - rho^2) / (1 + rho^2 - 2 * rho * z)^(p / 2), p = 3) x1 <- r_alt(n = n, p = p, alt = "vMF", kappa = kappa)[, , 1] x2 <- r_alt(n = n, p = p, alt = "C", F_inv = F_inv_C_3)[, , 1] x3 <- r_alt(n = n, p = p, alt = "SC", F_inv = F_inv_SC_3)[, , 1] x4 <- r_alt(n = n, p = p, alt = "ACG", kappa = kappa)[, , 1] x5 <- r_alt(n = n, p = p, alt = "W", F_inv = F_inv_W_3)[, , 1] x6 <- r_alt(n = n, p = p, alt = "MvMF", kappa = kappa)[, , 1] x7 <- r_alt(n = n, p = p, alt = "MC", kappa = kappa)[, , 1] x8 <- r_alt(n = n, p = p, alt = "UAD", kappa = 1 - angle)[, , 1] s3d <- scatterplot3d::scatterplot3d(x1, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), zlim = c(-1.1, 1.1)) s3d$points3d(x2, pch = 16, col = 2) s3d$points3d(x3, pch = 16, col = 3) s3d <- scatterplot3d::scatterplot3d(x4, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), zlim = c(-1.1, 1.1)) s3d$points3d(x5, pch = 16, col = 2) s3d$points3d(x6, pch = 16, col = 3) s3d$points3d(x7, pch = 16, col = 4) col <- rep(rainbow(n / 2), 2) s3d <- scatterplot3d::scatterplot3d(x8, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), zlim = c(-1.1, 1.1), color = col) for (i in seq(1, n, by = 2)) s3d$points3d(x8[i + 0:1, ], col = col[i], type = "l")
## Simulation with p = 2 p <- 2 n <- 50 kappa <- 20 nu <- 0.5 angle <- pi / 10 rho <- ((2 * kappa + 1) - sqrt(4 * kappa + 1)) / (2 * kappa) F_inv_SC_2 <- F_inv_from_f(f = function(z) exp(-kappa * (z - nu)^2), p = 2) F_inv_W_2 <- F_inv_from_f(f = function(z) exp(kappa * z^2), p = 2) F_inv_C_2 <- F_inv_from_f(f = function(z) (1 - rho^2) / (1 + rho^2 - 2 * rho * z)^(p / 2), p = 2) x1 <- r_alt(n = n, p = p, alt = "vMF", kappa = kappa)[, , 1] x2 <- r_alt(n = n, p = p, alt = "C", F_inv = F_inv_C_2)[, , 1] x3 <- r_alt(n = n, p = p, alt = "SC", F_inv = F_inv_SC_2)[, , 1] x4 <- r_alt(n = n, p = p, alt = "ACG", kappa = kappa)[, , 1] x5 <- r_alt(n = n, p = p, alt = "W", F_inv = F_inv_W_2)[, , 1] x6 <- r_alt(n = n, p = p, alt = "MvMF", kappa = kappa)[, , 1] x7 <- r_alt(n = n, p = p, alt = "MC", kappa = kappa)[, , 1] x8 <- r_alt(n = n, p = p, alt = "UAD", kappa = 1 - angle)[, , 1] r <- runif(n, 0.95, 1.05) # Radius perturbation to improve visualization plot(r * x1, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), col = 1) points(r * x2, pch = 16, col = 2) points(r * x3, pch = 16, col = 3) plot(r * x4, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), col = 1) points(r * x5, pch = 16, col = 2) points(r * x6, pch = 16, col = 3) points(r * x7, pch = 16, col = 4) col <- rep(rainbow(n / 2), 2) plot(r * x8, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), col = col) for (i in seq(1, n, by = 2)) lines((r * x8)[i + 0:1, ], col = col[i]) ## Simulation with p = 3 n <- 50 p <- 3 kappa <- 20 angle <- pi / 10 nu <- 0.5 rho <- ((2 * kappa + 1) - sqrt(4 * kappa + 1)) / (2 * kappa) F_inv_SC_3 <- F_inv_from_f(f = function(z) exp(-kappa * (z - nu)^2), p = 3) F_inv_W_3 <- F_inv_from_f(f = function(z) exp(kappa * z^2), p = 3) F_inv_C_3 <- F_inv_from_f(f = function(z) (1 - rho^2) / (1 + rho^2 - 2 * rho * z)^(p / 2), p = 3) x1 <- r_alt(n = n, p = p, alt = "vMF", kappa = kappa)[, , 1] x2 <- r_alt(n = n, p = p, alt = "C", F_inv = F_inv_C_3)[, , 1] x3 <- r_alt(n = n, p = p, alt = "SC", F_inv = F_inv_SC_3)[, , 1] x4 <- r_alt(n = n, p = p, alt = "ACG", kappa = kappa)[, , 1] x5 <- r_alt(n = n, p = p, alt = "W", F_inv = F_inv_W_3)[, , 1] x6 <- r_alt(n = n, p = p, alt = "MvMF", kappa = kappa)[, , 1] x7 <- r_alt(n = n, p = p, alt = "MC", kappa = kappa)[, , 1] x8 <- r_alt(n = n, p = p, alt = "UAD", kappa = 1 - angle)[, , 1] s3d <- scatterplot3d::scatterplot3d(x1, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), zlim = c(-1.1, 1.1)) s3d$points3d(x2, pch = 16, col = 2) s3d$points3d(x3, pch = 16, col = 3) s3d <- scatterplot3d::scatterplot3d(x4, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), zlim = c(-1.1, 1.1)) s3d$points3d(x5, pch = 16, col = 2) s3d$points3d(x6, pch = 16, col = 3) s3d$points3d(x7, pch = 16, col = 4) col <- rep(rainbow(n / 2), 2) s3d <- scatterplot3d::scatterplot3d(x8, pch = 16, xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1), zlim = c(-1.1, 1.1), color = col) for (i in seq(1, n, by = 2)) s3d$points3d(x8[i + 0:1, ], col = col[i], type = "l")
Simulation of the uniform distribution on and
,
.
r_unif_cir(n, M = 1L, sorted = FALSE) r_unif_sph(n, p, M = 1L)
r_unif_cir(n, M = 1L, sorted = FALSE) r_unif_sph(n, p, M = 1L)
n |
sample size. |
M |
number of samples of size |
sorted |
return each circular sample sorted? Defaults to |
p |
integer giving the dimension of the ambient space |
r_unif_cir
: a matrix of size c(n, M)
with
M
random samples of size n
of uniformly-generated circular
data on .
r_unif_sph
: an array of size c(n, p, M)
with
M
random samples of size n
of uniformly-generated
directions on .
# A sample on [0, 2*pi) n <- 5 r_unif_cir(n = n) # A sample on S^1 p <- 2 samp <- r_unif_sph(n = n, p = p) samp rowSums(samp^2) # A sample on S^2 p <- 3 samp <- r_unif_sph(n = n, p = p) samp rowSums(samp^2)
# A sample on [0, 2*pi) n <- 5 r_unif_cir(n = n) # A sample on S^1 p <- 2 samp <- r_unif_sph(n = n, p = p) samp rowSums(samp^2) # A sample on S^2 p <- 3 samp <- r_unif_sph(n = n, p = p) samp rowSums(samp^2)
Craters on Rhea from Hirata (2016).
rhea
rhea
A data frame with 3596 rows and 4 variables:
name of the crater (if named).
diameter of the crater (in km).
longitude angle of the
crater center.
latitude angle of the
crater center.
The angles are such their associated planetocentric
coordinates are:
with denoting the north pole.
The script performing the data preprocessing is available at
rhea.R
.
Hirata, N. (2016) Differential impact cratering of Saturn's satellites by heliocentric impactors. Journal of Geophysical Research: Planets, 121:111–117. doi:10.1002/2015JE004940
# Load data data("rhea") # Add Cartesian coordinates rhea$X <- cbind(cos(rhea$theta) * cos(rhea$phi), sin(rhea$theta) * cos(rhea$phi), sin(rhea$phi)) # Tests unif_test(data = rhea$X[rhea$diam > 15 & rhea$diam < 20, ], type = c("PCvM", "PAD", "PRt"), p_value = "asymp")
# Load data data("rhea") # Add Cartesian coordinates rhea$X <- cbind(cos(rhea$theta) * cos(rhea$phi), sin(rhea$theta) * cos(rhea$phi), sin(rhea$phi)) # Tests unif_test(data = rhea$X[rhea$diam > 15 & rhea$diam < 20, ], type = c("PCvM", "PAD", "PRt"), p_value = "asymp")
Approximated density, distribution, and quantile functions for
the asymptotic null distributions of Sobolev statistics of uniformity
on . These asymptotic distributions are infinite
weighted sums of (central) chi squared random variables:
where
is the dimension of the space of eigenfunctions of the Laplacian on
,
, associated to the
-th
eigenvalue,
.
d_p_k(p, k, log = FALSE) weights_dfs_Sobolev(p, K_max = 1000, thre = 0.001, type, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), log = FALSE, verbose = TRUE, Gauss = TRUE, N = 320, tol = 1e-06, force_positive = TRUE, x_tail = NULL) d_Sobolev(x, p, type, method = c("I", "SW", "HBE")[1], K_max = 1000, thre = 0.001, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), ncps = 0, verbose = TRUE, N = 320, x_tail = NULL, ...) p_Sobolev(x, p, type, method = c("I", "SW", "HBE", "MC")[1], K_max = 1000, thre = 0.001, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), ncps = 0, verbose = TRUE, N = 320, x_tail = NULL, ...) q_Sobolev(u, p, type, method = c("I", "SW", "HBE", "MC")[1], K_max = 1000, thre = 0.001, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), ncps = 0, verbose = TRUE, N = 320, x_tail = NULL, ...)
d_p_k(p, k, log = FALSE) weights_dfs_Sobolev(p, K_max = 1000, thre = 0.001, type, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), log = FALSE, verbose = TRUE, Gauss = TRUE, N = 320, tol = 1e-06, force_positive = TRUE, x_tail = NULL) d_Sobolev(x, p, type, method = c("I", "SW", "HBE")[1], K_max = 1000, thre = 0.001, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), ncps = 0, verbose = TRUE, N = 320, x_tail = NULL, ...) p_Sobolev(x, p, type, method = c("I", "SW", "HBE", "MC")[1], K_max = 1000, thre = 0.001, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), ncps = 0, verbose = TRUE, N = 320, x_tail = NULL, ...) q_Sobolev(u, p, type, method = c("I", "SW", "HBE", "MC")[1], K_max = 1000, thre = 0.001, Rothman_t = 1/3, Pycke_q = 0.5, Riesz_s = 1, Poisson_rho = 0.5, Softmax_kappa = 1, Stereo_a = 0, Sobolev_vk2 = c(0, 0, 1), ncps = 0, verbose = TRUE, N = 320, x_tail = NULL, ...)
p |
integer giving the dimension of the ambient space |
k |
sequence of integer indexes. |
log |
compute the logarithm of |
K_max |
integer giving the truncation of the series that compute the
asymptotic p-value of a Sobolev test. Defaults to |
thre |
error threshold for the tail probability given by the
the first terms of the truncated series of a Sobolev test. Defaults to
|
type |
name of the Sobolev statistic, using the naming from
|
Rothman_t |
|
Pycke_q |
|
Riesz_s |
|
Poisson_rho |
|
Softmax_kappa |
|
Stereo_a |
|
Sobolev_vk2 |
weights for the finite Sobolev test. A non-negative
vector or matrix. Defaults to |
verbose |
output information about the truncation? Defaults to
|
Gauss |
use a Gauss–Legendre quadrature rule of |
N |
number of points used in the
Gauss–Legendre quadrature for computing the Gegenbauer coefficients.
Defaults to |
tol |
tolerance passed to |
force_positive |
set negative weights to zero? Defaults to |
x_tail |
scalar evaluation point for determining the upper tail
probability. If |
x |
vector of quantiles. |
method |
method for approximating the density, distribution, or
quantile function of the weighted sum of chi squared random variables. Must
be |
ncps |
non-centrality parameters. Either |
... |
further parameters passed to |
u |
vector of probabilities. |
The truncation of is
done to the first
K_max
terms and then up to the index such that
the first terms explain the tail probability at the x_tail
with
an absolute error smaller than thre
(see details in
cutoff_wschisq
). This automatic truncation takes place when
calling *_Sobolev
. Setting thre = 0
truncates to K_max
terms exactly. If the series only contains odd or even non-zero terms, then
only K_max / 2
addends are effectively taken into account
in the first truncation.
d_p_k
: a vector of size length(k)
with the
evaluation of .
weights_dfs_Sobolev
: a list with entries weights
and
dfs
, automatically truncated according to K_max
and
thre
(see details).
d_Sobolev
: density function evaluated at x
, a vector.
p_Sobolev
: distribution function evaluated at x
,
a vector.
q_Sobolev
: quantile function evaluated at u
, a vector.
Eduardo García-Portugués and Paula Navarro-Esteban.
# Circular-specific statistics curve(p_Sobolev(x = x, p = 2, type = "Watson", method = "HBE"), n = 2e2, ylab = "Distribution", main = "Watson") curve(p_Sobolev(x = x, p = 2, type = "Rothman", method = "HBE"), n = 2e2, ylab = "Distribution", main = "Rothman") curve(p_Sobolev(x = x, p = 2, type = "Pycke_q", method = "HBE"), to = 10, n = 2e2, ylab = "Distribution", main = "Pycke_q") curve(p_Sobolev(x = x, p = 2, type = "Hermans_Rasson", method = "HBE"), to = 10, n = 2e2, ylab = "Distribution", main = "Hermans_Rasson") # Statistics for arbitrary dimensions test_statistic <- function(type, to = 1, pmax = 5, M = 1e3, ...) { col <- viridisLite::viridis(pmax - 1) curve(p_Sobolev(x = x, p = 2, type = type, method = "MC", M = M, ...), to = to, n = 2e2, col = col[pmax - 1], ylab = "Distribution", main = type, ylim = c(0, 1)) for (p in 3:pmax) { curve(p_Sobolev(x = x, p = p, type = type, method = "MC", M = M, ...), add = TRUE, n = 2e2, col = col[pmax - p + 1]) } legend("bottomright", legend = paste("p =", 2:pmax), col = rev(col), lwd = 2) } # Ajne test_statistic(type = "Ajne") # Gine_Gn test_statistic(type = "Gine_Gn", to = 1.5) # Gine_Fn test_statistic(type = "Gine_Fn", to = 2) # Bakshaev test_statistic(type = "Bakshaev", to = 3) # Riesz test_statistic(type = "Riesz", Riesz_s = 0.5, to = 3) # PCvM test_statistic(type = "PCvM", to = 0.6) # PAD test_statistic(type = "PAD", to = 3) # PRt test_statistic(type = "PRt", Rothman_t = 0.5) # Quantiles p <- c(2, 3, 4, 11) t(sapply(p, function(p) q_Sobolev(u = c(0.10, 0.05, 0.01), p = p, type = "PCvM"))) t(sapply(p, function(p) q_Sobolev(u = c(0.10, 0.05, 0.01), p = p, type = "PAD"))) t(sapply(p, function(p) q_Sobolev(u = c(0.10, 0.05, 0.01), p = p, type = "PRt"))) # Series truncation for thre = 1e-5 sapply(p, function(p) length(weights_dfs_Sobolev(p = p, type = "PCvM")$dfs)) sapply(p, function(p) length(weights_dfs_Sobolev(p = p, type = "PRt")$dfs)) sapply(p, function(p) length(weights_dfs_Sobolev(p = p, type = "PAD")$dfs))
# Circular-specific statistics curve(p_Sobolev(x = x, p = 2, type = "Watson", method = "HBE"), n = 2e2, ylab = "Distribution", main = "Watson") curve(p_Sobolev(x = x, p = 2, type = "Rothman", method = "HBE"), n = 2e2, ylab = "Distribution", main = "Rothman") curve(p_Sobolev(x = x, p = 2, type = "Pycke_q", method = "HBE"), to = 10, n = 2e2, ylab = "Distribution", main = "Pycke_q") curve(p_Sobolev(x = x, p = 2, type = "Hermans_Rasson", method = "HBE"), to = 10, n = 2e2, ylab = "Distribution", main = "Hermans_Rasson") # Statistics for arbitrary dimensions test_statistic <- function(type, to = 1, pmax = 5, M = 1e3, ...) { col <- viridisLite::viridis(pmax - 1) curve(p_Sobolev(x = x, p = 2, type = type, method = "MC", M = M, ...), to = to, n = 2e2, col = col[pmax - 1], ylab = "Distribution", main = type, ylim = c(0, 1)) for (p in 3:pmax) { curve(p_Sobolev(x = x, p = p, type = type, method = "MC", M = M, ...), add = TRUE, n = 2e2, col = col[pmax - p + 1]) } legend("bottomright", legend = paste("p =", 2:pmax), col = rev(col), lwd = 2) } # Ajne test_statistic(type = "Ajne") # Gine_Gn test_statistic(type = "Gine_Gn", to = 1.5) # Gine_Fn test_statistic(type = "Gine_Fn", to = 2) # Bakshaev test_statistic(type = "Bakshaev", to = 3) # Riesz test_statistic(type = "Riesz", Riesz_s = 0.5, to = 3) # PCvM test_statistic(type = "PCvM", to = 0.6) # PAD test_statistic(type = "PAD", to = 3) # PRt test_statistic(type = "PRt", Rothman_t = 0.5) # Quantiles p <- c(2, 3, 4, 11) t(sapply(p, function(p) q_Sobolev(u = c(0.10, 0.05, 0.01), p = p, type = "PCvM"))) t(sapply(p, function(p) q_Sobolev(u = c(0.10, 0.05, 0.01), p = p, type = "PAD"))) t(sapply(p, function(p) q_Sobolev(u = c(0.10, 0.05, 0.01), p = p, type = "PRt"))) # Series truncation for thre = 1e-5 sapply(p, function(p) length(weights_dfs_Sobolev(p = p, type = "PCvM")$dfs)) sapply(p, function(p) length(weights_dfs_Sobolev(p = p, type = "PRt")$dfs)) sapply(p, function(p) length(weights_dfs_Sobolev(p = p, type = "PAD")$dfs))
Given a Sobolev statistic
for a sample ,
, three important sequences
are related to
.
Gegenbauer coefficients of
(see, e.g., the projected-ecdf statistics), given
by
Weights of the
asymptotic distribution of the Sobolev statistic,
, given by
Gegenbauer coefficients of the
local projected alternative associated to
,
given by
For , the factor
is replaced by
.
bk_to_vk2(bk, p, log = FALSE) bk_to_uk(bk, p, signs = 1) vk2_to_bk(vk2, p, log = FALSE) vk2_to_uk(vk2, p, signs = 1) uk_to_vk2(uk, p) uk_to_bk(uk, p)
bk_to_vk2(bk, p, log = FALSE) bk_to_uk(bk, p, signs = 1) vk2_to_bk(vk2, p, log = FALSE) vk2_to_uk(vk2, p, signs = 1) uk_to_vk2(uk, p) uk_to_bk(uk, p)
bk |
coefficients |
p |
integer giving the dimension of the ambient space |
log |
do operations in log scale (log-in, log-out)? Defaults to
|
signs |
signs of the coefficients |
vk2 |
squared coefficients |
uk |
coefficients |
See more details in Prentice (1978) and García-Portugués et al. (2023). The
adequate signs of uk
for the "PRt"
Rothman test
can be retrieved with akx
and sqr = TRUE
, see the
examples.
The corresponding vectors of coefficients vk2
, bk
, or
uk
, depending on the call.
García-Portugués, E., Navarro-Esteban, P., Cuesta-Albertos, J. A. (2023) On a projection-based class of uniformity tests on the hypersphere. Bernoulli, 29(1):181–204. doi:10.3150/21-BEJ1454.
Prentice, M. J. (1978). On invariant tests of uniformity for directions and orientations. The Annals of Statistics, 6(1):169–176. doi:10.1214/aos/1176344075
# bk, vk2, and uk for the PCvM test in p = 3 (bk <- Gegen_coefs_Pn(k = 1:5, type = "PCvM", p = 3)) (vk2 <- bk_to_vk2(bk = bk, p = 3)) (uk <- bk_to_uk(bk = bk, p = 3)) # vk2 is the same as weights_dfs_Sobolev(K_max = 10, thre = 0, p = 3, type = "PCvM")$weights # bk and uk for the Rothman test in p = 3, with adequate signs t <- 1 / 3 (bk <- Gegen_coefs_Pn(k = 1:5, type = "PRt", p = 3, Rothman_t = t)) (ak <- akx(x = drop(q_proj_unif(t, p = 3)), p = 3, k = 1:5, sqr = TRUE)) (uk <- bk_to_uk(bk = bk, p = 3, signs = ak))
# bk, vk2, and uk for the PCvM test in p = 3 (bk <- Gegen_coefs_Pn(k = 1:5, type = "PCvM", p = 3)) (vk2 <- bk_to_vk2(bk = bk, p = 3)) (uk <- bk_to_uk(bk = bk, p = 3)) # vk2 is the same as weights_dfs_Sobolev(K_max = 10, thre = 0, p = 3, type = "PCvM")$weights # bk and uk for the Rothman test in p = 3, with adequate signs t <- 1 / 3 (bk <- Gegen_coefs_Pn(k = 1:5, type = "PRt", p = 3, Rothman_t = t)) (ak <- akx(x = drop(q_proj_unif(t, p = 3)), p = 3, k = 1:5, sqr = TRUE)) (uk <- bk_to_uk(bk = bk, p = 3, signs = ak))
Low-level implementation of several statistics for assessing
uniformity on the (hyper)sphere
,
.
sph_stat_Rayleigh(X) sph_stat_Bingham(X) sph_stat_Ajne(X, Psi_in_X = FALSE) sph_stat_Gine_Gn(X, Psi_in_X = FALSE, p = 0L) sph_stat_Gine_Fn(X, Psi_in_X = FALSE, p = 0L) sph_stat_Pycke(X, Psi_in_X = FALSE, p = 0L) sph_stat_Bakshaev(X, Psi_in_X = FALSE, p = 0L) sph_stat_Riesz(X, Psi_in_X = FALSE, p = 0L, s = 1) sph_stat_PCvM(X, Psi_in_X = FALSE, p = 0L, N = 160L, L = 1000L) sph_stat_PRt(X, Psi_in_X = FALSE, p = 0L, t = 1/3, N = 160L, L = 1000L) sph_stat_PAD(X, Psi_in_X = FALSE, p = 0L, N = 160L, L = 1000L) sph_stat_Poisson(X, Psi_in_X = FALSE, p = 0L, rho = 0.5) sph_stat_Softmax(X, Psi_in_X = FALSE, p = 0L, kappa = 1) sph_stat_Stereo(X, Psi_in_X = FALSE, p = 0L, a = 0) sph_stat_CCF09(X, dirs, K_CCF09 = 25L, original = FALSE) sph_stat_Rayleigh_HD(X) sph_stat_CJ12(X, Psi_in_X = FALSE, p = 0L, regime = 3L)
sph_stat_Rayleigh(X) sph_stat_Bingham(X) sph_stat_Ajne(X, Psi_in_X = FALSE) sph_stat_Gine_Gn(X, Psi_in_X = FALSE, p = 0L) sph_stat_Gine_Fn(X, Psi_in_X = FALSE, p = 0L) sph_stat_Pycke(X, Psi_in_X = FALSE, p = 0L) sph_stat_Bakshaev(X, Psi_in_X = FALSE, p = 0L) sph_stat_Riesz(X, Psi_in_X = FALSE, p = 0L, s = 1) sph_stat_PCvM(X, Psi_in_X = FALSE, p = 0L, N = 160L, L = 1000L) sph_stat_PRt(X, Psi_in_X = FALSE, p = 0L, t = 1/3, N = 160L, L = 1000L) sph_stat_PAD(X, Psi_in_X = FALSE, p = 0L, N = 160L, L = 1000L) sph_stat_Poisson(X, Psi_in_X = FALSE, p = 0L, rho = 0.5) sph_stat_Softmax(X, Psi_in_X = FALSE, p = 0L, kappa = 1) sph_stat_Stereo(X, Psi_in_X = FALSE, p = 0L, a = 0) sph_stat_CCF09(X, dirs, K_CCF09 = 25L, original = FALSE) sph_stat_Rayleigh_HD(X) sph_stat_CJ12(X, Psi_in_X = FALSE, p = 0L, regime = 3L)
X |
an array of size |
Psi_in_X |
does |
p |
integer giving the dimension of the ambient space |
s |
|
N |
number of points used in the
Gauss-Legendre quadrature. Defaults to
|
L |
number of discretization points to interpolate angular functions
that require evaluating an integral. Defaults to |
t |
|
rho |
|
kappa |
|
a |
either:
|
dirs |
a matrix of size |
K_CCF09 |
integer giving the truncation of the series present in the
asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to
|
original |
return the CCF09 statistic as originally defined?
If |
regime |
type of asymptotic regime for the CJ12 test, either |
Detailed descriptions and references of the statistics are available in García-Portugués and Verdebout (2018).
The Pycke and CJ12 statistics employ the scalar products matrix,
rather than the shortest angles matrix, when Psi_in_X = TRUE
. This
matrix is obtained by setting scalar_prod = TRUE
in
Psi_mat
.
A matrix of size c(M, 1)
containing the statistics for each
of the M
samples.
Be careful on avoiding the next bad usages of the functions, which will produce spurious results:
The directions in X
do not have unit norm.
X
does not contain Psi_mat(X)
when
X_in_Theta = TRUE
.
The parameter p
does not match with the dimension of
.
Not passing the scalar products matrix to sph_stat_CJ12
when Psi_in_X = TRUE
.
The directions in dirs
do not have unit norm.
García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.
## Sample uniform spherical data M <- 2 n <- 100 p <- 3 set.seed(123456789) X <- r_unif_sph(n = n, p = p, M = M) ## Sobolev tests # Rayleigh sph_stat_Rayleigh(X) # Bingham sph_stat_Bingham(X) # Ajne Psi <- Psi_mat(X) dim(Psi) <- c(dim(Psi), 1) sph_stat_Ajne(X) sph_stat_Ajne(Psi, Psi_in_X = TRUE) # Gine Gn sph_stat_Gine_Gn(X) sph_stat_Gine_Gn(Psi, Psi_in_X = TRUE, p = p) # Gine Fn sph_stat_Gine_Fn(X) sph_stat_Gine_Fn(Psi, Psi_in_X = TRUE, p = p) # Pycke sph_stat_Pycke(X) sph_stat_Pycke(Psi, Psi_in_X = TRUE, p = p) # Bakshaev sph_stat_Bakshaev(X) sph_stat_Bakshaev(Psi, Psi_in_X = TRUE, p = p) # Riesz sph_stat_Riesz(X, s = 1) sph_stat_Riesz(Psi, Psi_in_X = TRUE, p = p, s = 1) # Projected Cramér-von Mises sph_stat_PCvM(X) sph_stat_PCvM(Psi, Psi_in_X = TRUE, p = p) # Projected Rothman sph_stat_PRt(X) sph_stat_PRt(Psi, Psi_in_X = TRUE, p = p) # Projected Anderson-Darling sph_stat_PAD(X) sph_stat_PAD(Psi, Psi_in_X = TRUE, p = p) ## Other tests # CCF09 dirs <- r_unif_sph(n = 3, p = p, M = 1)[, , 1] sph_stat_CCF09(X, dirs = dirs) ## High-dimensional tests # Rayleigh HD-Standardized sph_stat_Rayleigh_HD(X) # CJ12 sph_stat_CJ12(X, regime = 1) sph_stat_CJ12(Psi, regime = 1, Psi_in_X = TRUE, p = p) sph_stat_CJ12(X, regime = 2) sph_stat_CJ12(Psi, regime = 2, Psi_in_X = TRUE, p = p) sph_stat_CJ12(X, regime = 3) sph_stat_CJ12(Psi, regime = 3, Psi_in_X = TRUE, p = p)
## Sample uniform spherical data M <- 2 n <- 100 p <- 3 set.seed(123456789) X <- r_unif_sph(n = n, p = p, M = M) ## Sobolev tests # Rayleigh sph_stat_Rayleigh(X) # Bingham sph_stat_Bingham(X) # Ajne Psi <- Psi_mat(X) dim(Psi) <- c(dim(Psi), 1) sph_stat_Ajne(X) sph_stat_Ajne(Psi, Psi_in_X = TRUE) # Gine Gn sph_stat_Gine_Gn(X) sph_stat_Gine_Gn(Psi, Psi_in_X = TRUE, p = p) # Gine Fn sph_stat_Gine_Fn(X) sph_stat_Gine_Fn(Psi, Psi_in_X = TRUE, p = p) # Pycke sph_stat_Pycke(X) sph_stat_Pycke(Psi, Psi_in_X = TRUE, p = p) # Bakshaev sph_stat_Bakshaev(X) sph_stat_Bakshaev(Psi, Psi_in_X = TRUE, p = p) # Riesz sph_stat_Riesz(X, s = 1) sph_stat_Riesz(Psi, Psi_in_X = TRUE, p = p, s = 1) # Projected Cramér-von Mises sph_stat_PCvM(X) sph_stat_PCvM(Psi, Psi_in_X = TRUE, p = p) # Projected Rothman sph_stat_PRt(X) sph_stat_PRt(Psi, Psi_in_X = TRUE, p = p) # Projected Anderson-Darling sph_stat_PAD(X) sph_stat_PAD(Psi, Psi_in_X = TRUE, p = p) ## Other tests # CCF09 dirs <- r_unif_sph(n = 3, p = p, M = 1)[, , 1] sph_stat_CCF09(X, dirs = dirs) ## High-dimensional tests # Rayleigh HD-Standardized sph_stat_Rayleigh_HD(X) # CJ12 sph_stat_CJ12(X, regime = 1) sph_stat_CJ12(Psi, regime = 1, Psi_in_X = TRUE, p = p) sph_stat_CJ12(X, regime = 2) sph_stat_CJ12(Psi, regime = 2, Psi_in_X = TRUE, p = p) sph_stat_CJ12(X, regime = 3) sph_stat_CJ12(Psi, regime = 3, Psi_in_X = TRUE, p = p)
Computes the finite Sobolev statistic
for a sequence of non-negative weights. For
, the Gegenbauer polynomials are replaced by Chebyshev ones.
sph_stat_Sobolev(X, Psi_in_X = FALSE, p = 0, vk2 = c(0, 0, 1)) cir_stat_Sobolev(Theta, Psi_in_Theta = FALSE, vk2 = c(0, 0, 1))
sph_stat_Sobolev(X, Psi_in_X = FALSE, p = 0, vk2 = c(0, 0, 1)) cir_stat_Sobolev(Theta, Psi_in_Theta = FALSE, vk2 = c(0, 0, 1))
X |
an array of size |
Psi_in_X |
does |
p |
integer giving the dimension of the ambient space |
vk2 |
weights for the finite Sobolev test. A non-negative vector or
matrix. Defaults to |
Theta |
a matrix of size |
Psi_in_Theta |
does |
A matrix of size c(M, ncol(vk2))
containing the statistics for
each of the M
samples.
Density, simulation, and associated functions for a uniform
distribution within a spherical cap of angle about
a direction
on
,
. The
density at
is given by
where
is the projected radius of the spherical cap about
,
is the surface area of
, and
is the projected uniform distribution (see
p_proj_unif
).
The angular function of the uniform cap distribution is
. The associated
projected density is
.
d_unif_cap(x, mu, angle = pi/10) c_unif_cap(p, angle = pi/10) r_unif_cap(n, mu, angle = pi/10) p_proj_unif_cap(x, p, angle = pi/10) q_proj_unif_cap(u, p, angle = pi/10) d_proj_unif_cap(x, p, angle = pi/10, scaled = TRUE) r_proj_unif_cap(n, p, angle = pi/10)
d_unif_cap(x, mu, angle = pi/10) c_unif_cap(p, angle = pi/10) r_unif_cap(n, mu, angle = pi/10) p_proj_unif_cap(x, p, angle = pi/10) q_proj_unif_cap(u, p, angle = pi/10) d_proj_unif_cap(x, p, angle = pi/10, scaled = TRUE) r_proj_unif_cap(n, p, angle = pi/10)
x |
locations to evaluate the density or distribution. For
|
mu |
the directional mean |
angle |
angle |
p |
integer giving the dimension of the ambient space |
n |
sample size, a positive integer. |
u |
vector of probabilities. |
scaled |
whether to scale the angular function by the normalizing
constant. Defaults to |
Depending on the function:
d_unif_cap
: a vector of length nx
or 1
with the
evaluated density at x
.
r_unif_cap
: a matrix of size c(n, p)
with the random
sample.
c_unif_cap
: the normalizing constant.
p_proj_unif_cap
: a vector of length x
with the
evaluated distribution function at x
.
q_proj_unif_cap
: a vector of length u
with the
evaluated quantile function at u
.
d_proj_unif_cap
: a vector of size nx
with the
evaluated angular function.
r_proj_unif_cap
: a vector of length n
containing
simulated values from the cosines density associated to the angular
function.
Alberto Fernández-de-Marcos and Eduardo García-Portugués.
# Simulation and density evaluation for p = 2 mu <- c(0, 1) angle <- pi / 5 n <- 1e2 x <- r_unif_cap(n = n, mu = mu, angle = angle) col <- viridisLite::viridis(n) r_noise <- runif(n, 0.95, 1.05) # Perturbation to improve visualization color <- col[rank(d_unif_cap(x = x, mu = mu, angle = angle))] plot(r_noise * x, pch = 16, col = color, xlim = c(-1, 1), ylim = c(-1, 1)) # Simulation and density evaluation for p = 3 mu <- c(0, 0, 1) angle <- pi / 5 x <- r_unif_cap(n = n, mu = mu, angle = angle) color <- col[rank(d_unif_cap(x = x, mu = mu, angle = angle))] scatterplot3d::scatterplot3d(x, size = 5, xlim = c(-1, 1), ylim = c(-1, 1), zlim = c(-1, 1), color = color) # Simulated data from the cosines density n <- 1e3 p <- 3 angle <- pi / 3 hist(r_proj_unif_cap(n = n, p = p, angle = angle), breaks = seq(cos(angle), 1, l = 10), probability = TRUE, main = "Simulated data from proj_unif_cap", xlab = "t", xlim = c(-1, 1)) t <- seq(-1, 1, by = 0.01) lines(t, d_proj_unif_cap(x = t, p = p, angle = angle), col = "red")
# Simulation and density evaluation for p = 2 mu <- c(0, 1) angle <- pi / 5 n <- 1e2 x <- r_unif_cap(n = n, mu = mu, angle = angle) col <- viridisLite::viridis(n) r_noise <- runif(n, 0.95, 1.05) # Perturbation to improve visualization color <- col[rank(d_unif_cap(x = x, mu = mu, angle = angle))] plot(r_noise * x, pch = 16, col = color, xlim = c(-1, 1), ylim = c(-1, 1)) # Simulation and density evaluation for p = 3 mu <- c(0, 0, 1) angle <- pi / 5 x <- r_unif_cap(n = n, mu = mu, angle = angle) color <- col[rank(d_unif_cap(x = x, mu = mu, angle = angle))] scatterplot3d::scatterplot3d(x, size = 5, xlim = c(-1, 1), ylim = c(-1, 1), zlim = c(-1, 1), color = color) # Simulated data from the cosines density n <- 1e3 p <- 3 angle <- pi / 3 hist(r_proj_unif_cap(n = n, p = p, angle = angle), breaks = seq(cos(angle), 1, l = 10), probability = TRUE, main = "Simulated data from proj_unif_cap", xlab = "t", xlim = c(-1, 1)) t <- seq(-1, 1, by = 0.01) lines(t, d_proj_unif_cap(x = t, p = p, angle = angle), col = "red")
Implementation of several statistics for assessing uniformity
on the (hyper)sphere
,
, for a sample
.
unif_stat
receives a (several) sample(s) of directions in
Cartesian coordinates, except for the circular case () in
which the sample(s) can be angles
.
unif_stat
allows to compute several statistics to several samples
within a single call, facilitating thus Monte Carlo experiments.
unif_stat(data, type = "all", data_sorted = FALSE, CCF09_dirs = NULL, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_CCF09 = 25, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0)
unif_stat(data, type = "all", data_sorted = FALSE, CCF09_dirs = NULL, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_CCF09 = 25, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0)
data |
sample to compute the test statistic. An array of size
|
type |
type of test to be applied. A character vector containing any of
the following types of tests, depending on the dimension
If |
data_sorted |
is the circular data sorted? If |
CCF09_dirs |
a matrix of size |
CJ12_reg |
type of asymptotic regime for CJ12 test, either |
cov_a |
|
Cressie_t |
|
K_CCF09 |
integer giving the truncation of the series present in the
asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to
|
Poisson_rho |
|
Pycke_q |
|
Rayleigh_m |
integer |
Riesz_s |
|
Rothman_t |
|
Sobolev_vk2 |
weights for the finite Sobolev test. A non-negative
vector or matrix. Defaults to |
Softmax_kappa |
|
Stereo_a |
|
Except for CCF09_dirs
, K_CCF09
, and CJ12_reg
, all the
test-specific parameters are vectorized.
Descriptions and references for most of the statistics are available in García-Portugués and Verdebout (2018).
A data frame of size c(M, length(type))
, with column names
given by type
, that contains the values of the test statistics.
García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.
## Circular data # Sample n <- 10 M <- 2 Theta <- r_unif_cir(n = n, M = M) # Matrix unif_stat(data = Theta, type = "all") # Array unif_stat(data = array(Theta, dim = c(n, 1, M)), type = "all") # Vector unif_stat(data = Theta[, 1], type = "all") ## Spherical data # Circular sample in Cartesian coordinates n <- 10 M <- 2 X <- array(dim = c(n, 2, M)) for (i in 1:M) X[, , i] <- cbind(cos(Theta[, i]), sin(Theta[, i])) # Array unif_stat(data = X, type = "all") # High-dimensional data X <- r_unif_sph(n = n, p = 3, M = M) unif_stat(data = X, type = "all") ## Specific arguments # Rothman unif_stat(data = Theta, type = "Rothman", Rothman_t = 0.5) # CCF09 unif_stat(data = X, type = "CCF09", CCF09_dirs = X[, , 1]) unif_stat(data = X, type = "CCF09", CCF09_dirs = X[, , 1], K_CCF09 = 1) # CJ12 unif_stat(data = X, type = "CJ12", CJ12_reg = 3) unif_stat(data = X, type = "CJ12", CJ12_reg = 1)
## Circular data # Sample n <- 10 M <- 2 Theta <- r_unif_cir(n = n, M = M) # Matrix unif_stat(data = Theta, type = "all") # Array unif_stat(data = array(Theta, dim = c(n, 1, M)), type = "all") # Vector unif_stat(data = Theta[, 1], type = "all") ## Spherical data # Circular sample in Cartesian coordinates n <- 10 M <- 2 X <- array(dim = c(n, 2, M)) for (i in 1:M) X[, , i] <- cbind(cos(Theta[, i]), sin(Theta[, i])) # Array unif_stat(data = X, type = "all") # High-dimensional data X <- r_unif_sph(n = n, p = 3, M = M) unif_stat(data = X, type = "all") ## Specific arguments # Rothman unif_stat(data = Theta, type = "Rothman", Rothman_t = 0.5) # CCF09 unif_stat(data = X, type = "CCF09", CCF09_dirs = X[, , 1]) unif_stat(data = X, type = "CCF09", CCF09_dirs = X[, , 1], K_CCF09 = 1) # CJ12 unif_stat(data = X, type = "CJ12", CJ12_reg = 3) unif_stat(data = X, type = "CJ12", CJ12_reg = 1)
Approximate computation of the null distributions of several
statistics for assessing uniformity on the (hyper)sphere
,
. The approximation is done either by
means of the asymptotic distribution or by Monte Carlo.
unif_stat_distr(x, type, p, n, approx = "asymp", M = 10000, stats_MC = NULL, K_max = 10000, method = "I", Stephens = FALSE, CCF09_dirs = NULL, CJ12_beta = 0, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_Ajne = 500, K_CCF09 = 25, K_Kuiper = 25, K_Watson = 25, K_Watson_1976 = 5, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0, ...)
unif_stat_distr(x, type, p, n, approx = "asymp", M = 10000, stats_MC = NULL, K_max = 10000, method = "I", Stephens = FALSE, CCF09_dirs = NULL, CJ12_beta = 0, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_Ajne = 500, K_CCF09 = 25, K_Kuiper = 25, K_Watson = 25, K_Watson_1976 = 5, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0, ...)
x |
evaluation points for the null distribution(s). Either a vector of
size |
type |
type of test to be applied. A character vector containing any of
the following types of tests, depending on the dimension
If |
p |
integer giving the dimension of the ambient space |
n |
sample size employed for computing the statistic. |
approx |
type of approximation to the null distribution, either
|
M |
number of Monte Carlo replications for approximating the null
distribution when |
stats_MC |
a data frame of size |
K_max |
integer giving the truncation of the series that compute the
asymptotic p-value of a Sobolev test. Defaults to |
method |
method for approximating the density, distribution, or
quantile function of the weighted sum of chi squared random variables. Must
be |
Stephens |
compute Stephens (1970) modification so that the null distribution of the is less dependent on the sample size? The modification does not alter the test decision. |
CCF09_dirs |
a matrix of size |
CJ12_beta |
|
CJ12_reg |
type of asymptotic regime for CJ12 test, either |
cov_a |
|
Cressie_t |
|
K_CCF09 |
integer giving the truncation of the series present in the
asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to
|
K_Kuiper , K_Watson , K_Watson_1976 , K_Ajne
|
integer giving the truncation
of the series present in the null asymptotic distributions. For the
Kolmogorov-Smirnov-related series defaults to |
Poisson_rho |
|
Pycke_q |
|
Rayleigh_m |
integer |
Riesz_s |
|
Rothman_t |
|
Sobolev_vk2 |
weights for the finite Sobolev test. A non-negative
vector or matrix. Defaults to |
Softmax_kappa |
|
Stereo_a |
|
... |
if |
When approx = "asymp"
, statistics that do not have an implemented or
known asymptotic are omitted, and a warning is generated.
For Sobolev tests, K_max = 1e4
produces probabilities uniformly
accurate with three digits for the "PCvM"
, "PAD"
, and
"PRt"
tests, for dimensions . With
K_max = 5e4
,
these probabilities are uniformly accurate in the fourth digit. With
K_max = 1e3
, only two-digit uniform accuracy is obtained. Uniform
accuracy deteriorates when increases, e.g., a digit accuracy is lost
when
.
Descriptions and references on most of the asymptotic distributions are available in García-Portugués and Verdebout (2018).
A data frame of size c(nx, length(type))
, with column names
given by type
, that contains the values of the null distributions of
the statistics evaluated at x
.
García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.
## Asymptotic distribution # Circular statistics x <- seq(0, 1, l = 5) unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10) unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10) unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10, K_Ajne = 5) # All circular statistics unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, K_max = 1e3) # Spherical statistics unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10) unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10, M = 100) # All spherical statistics unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10, K_max = 1e3) ## Monte Carlo distribution # Circular statistics x <- seq(0, 5, l = 10) unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, approx = "MC") unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10, approx = "MC") unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10, approx = "MC") # Spherical statistics unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10, approx = "MC") unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10, approx = "MC") unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10, approx = "MC") ## Specific arguments # Rothman unif_stat_distr(x = x, type = "Rothman", p = 2, n = 10, Rothman_t = 0.5, approx = "MC") # CCF09 dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1] x <- seq(0, 1, l = 10) unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC", CCF09_dirs = dirs) unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC") # CJ12 unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 3) unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 2, CJ12_beta = 0.01) unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 1) ## Sobolev x <- seq(0, 1, l = 10) vk2 <- diag(1, nrow = 3) unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100, Sobolev_vk2 = vk2) sapply(1:3, function(i) unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100, Sobolev_vk2 = vk2[i, ])$Sobolev) sapply(1:3, function(i) unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100, Sobolev_vk2 = vk2[i, ], M = 1e3)$Sobolev) unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100, Sobolev_vk2 = vk2, M = 1e3)
## Asymptotic distribution # Circular statistics x <- seq(0, 1, l = 5) unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10) unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10) unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10, K_Ajne = 5) # All circular statistics unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, K_max = 1e3) # Spherical statistics unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10) unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10, M = 100) # All spherical statistics unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10, K_max = 1e3) ## Monte Carlo distribution # Circular statistics x <- seq(0, 5, l = 10) unif_stat_distr(x = x, type = avail_cir_tests, p = 2, n = 10, approx = "MC") unif_stat_distr(x = x, type = "Kuiper", p = 2, n = 10, approx = "MC") unif_stat_distr(x = x, type = c("Ajne", "Kuiper"), p = 2, n = 10, approx = "MC") # Spherical statistics unif_stat_distr(x = x, type = avail_sph_tests, p = 3, n = 10, approx = "MC") unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10, approx = "MC") unif_stat_distr(x = cbind(x, x + 1), type = c("Rayleigh", "Bingham"), p = 3, n = 10, approx = "MC") ## Specific arguments # Rothman unif_stat_distr(x = x, type = "Rothman", p = 2, n = 10, Rothman_t = 0.5, approx = "MC") # CCF09 dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1] x <- seq(0, 1, l = 10) unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC", CCF09_dirs = dirs) unif_stat_distr(x = x, type = "CCF09", p = 3, n = 10, approx = "MC") # CJ12 unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 3) unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 2, CJ12_beta = 0.01) unif_stat_distr(x = x, type = "CJ12", p = 3, n = 100, CJ12_reg = 1) ## Sobolev x <- seq(0, 1, l = 10) vk2 <- diag(1, nrow = 3) unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100, Sobolev_vk2 = vk2) sapply(1:3, function(i) unif_stat_distr(x = x, type = "Sobolev", approx = "asymp", p = 3, n = 100, Sobolev_vk2 = vk2[i, ])$Sobolev) sapply(1:3, function(i) unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100, Sobolev_vk2 = vk2[i, ], M = 1e3)$Sobolev) unif_stat_distr(x = x, type = "Sobolev", approx = "MC", p = 3, n = 100, Sobolev_vk2 = vk2, M = 1e3)
Utility for performing Monte Carlo simulation of several
statistics for assessing uniformity on the (hyper)sphere
,
.
unif_stat_MC
provides a convenient wrapper for parallel
evaluation of unif_stat
, the estimation of critical values under the
null distribution, and the computation of empirical powers under the
alternative.
unif_stat_MC(n, type = "all", p, M = 10000, r_H1 = NULL, crit_val = NULL, alpha = c(0.1, 0.05, 0.01), return_stats = TRUE, stats_sorted = FALSE, chunks = ceiling((n * M)/1e+05), cores = 1, seeds = NULL, CCF09_dirs = NULL, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_CCF09 = 25, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0, ...)
unif_stat_MC(n, type = "all", p, M = 10000, r_H1 = NULL, crit_val = NULL, alpha = c(0.1, 0.05, 0.01), return_stats = TRUE, stats_sorted = FALSE, chunks = ceiling((n * M)/1e+05), cores = 1, seeds = NULL, CCF09_dirs = NULL, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_CCF09 = 25, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0, ...)
n |
sample size. |
type |
type of test to be applied. A character vector containing any of
the following types of tests, depending on the dimension
If |
p |
integer giving the dimension of the ambient space |
M |
number of Monte Carlo replications. Defaults to |
r_H1 |
if provided, the computation of empirical powers is
carried out for the alternative hypothesis sampled with |
crit_val |
if provided, must be the critical values as returned by
|
alpha |
vector with significance levels. Defaults to
|
return_stats |
return the Monte Carlo statistics? If only the critical
values or powers are desired, |
stats_sorted |
sort the returned Monte Carlo statistics? If
|
chunks |
number of chunks to split the |
cores |
number of cores to perform the simulation. Defaults to |
seeds |
if provided, a vector of size |
CCF09_dirs |
a matrix of size |
CJ12_reg |
type of asymptotic regime for CJ12 test, either |
cov_a |
|
Cressie_t |
|
K_CCF09 |
integer giving the truncation of the series present in the
asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to
|
Poisson_rho |
|
Pycke_q |
|
Rayleigh_m |
integer |
Riesz_s |
|
Rothman_t |
|
Sobolev_vk2 |
weights for the finite Sobolev test. A non-negative
vector or matrix. Defaults to |
Softmax_kappa |
|
Stereo_a |
|
... |
optional arguments to be passed to the |
It is possible to have a progress bar if unif_stat_MC
is wrapped with
progressr::with_progress
or if
progressr::handlers(global = TRUE)
is invoked (once) by the user.
See the examples below. The progress bar is updated with the number of
finished chunks.
All the tests reject for large values of the test statistic
(max_gap = TRUE
is assumed for the Range test), so the critical
values for the significance levels alpha
correspond to the
alpha
-upper quantiles of the null distribution of the test statistic.
The Monte Carlo simulation for the CCF09 test is made conditionally
on the choice of CCF09_dirs
. That is, all the Monte Carlo statistics
share the same random directions.
Except for CCF09_dirs
, K_CCF09
, and CJ12_reg
, all the
test-specific parameters are vectorized.
A list with the following entries:
crit_val_MC
: a data frame of size
c(length(alpha), length(type))
, with column names given by
type
and rows corresponding to the significance levels alpha
,
that contains the estimated critical values of the tests.
power_MC
: a data frame of size
c(nrow(crit_val), length(type))
, with column names given by
type
and rows corresponding to the significance levels of
crit_val
, that contains the empirical powers of the tests. NA
if crit_val = NULL
.
stats_MC
: a data frame of size c(M, length(type))
, with
column names given by type
, that contains the Monte Carlo
statistics.
## Critical values # Single statistic, specific alpha cir <- unif_stat_MC(n = 10, M = 1e2, type = "Ajne", p = 2, alpha = 0.15) summary(cir$stats_MC) cir$crit_val_MC # All circular statistics cir <- unif_stat_MC(n = 10, M = 1e2, p = 2) head(cir$stats_MC) cir$crit_val_MC # All spherical statistics sph <- unif_stat_MC(n = 10, M = 1e2, p = 3) head(sph$stats_MC) sph$crit_val_MC ## Using a progress bar # Define a progress bar require(progress) require(progressr) handlers(handler_progress( format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:", ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"), clear = FALSE)) # Call unif_stat_MC() within with_progress() with_progress(unif_stat_MC(n = 10, M = 1e2, p = 3, chunks = 10)) # With several cores with_progress(unif_stat_MC(n = 10, M = 1e2, p = 3, chunks = 10, cores = 2)) # Instead of using with_progress() each time, it is more practical to run # handlers(global = TRUE) # once to activate progress bars in your R session ## Power computation # Single statistic cir_pow <- unif_stat_MC(n = 10, M = 1e2, type = "Ajne", p = 2, crit_val = cir$crit_val_MC) cir_pow$crit_val_MC cir_pow$power_MC # All circular statistics cir_pow <- unif_stat_MC(n = 10, M = 1e2, p = 2, crit_val = cir$crit_val_MC) cir_pow$crit_val_MC cir_pow$power_MC # All spherical statistics sph_pow <- unif_stat_MC(n = 10, M = 1e2, p = 3, crit_val = sph$crit_val_MC) sph_pow$crit_val_MC sph_pow$power_MC ## Custom r_H1 # Circular r_H1 <- function(n, p, M, l = 0.05) { stopifnot(p == 2) Theta_to_X(matrix(runif(n * M, 0, (2 - l) * pi), n, M)) } dirs <- r_unif_sph(n = 5, p = 2, M = 1)[, , 1] cir <- unif_stat_MC(n = 50, M = 1e2, p = 2, CCF09_dirs = dirs) cir_pow <- unif_stat_MC(n = 50, M = 1e2, p = 2, r_H1 = r_H1, l = 0.10, crit_val = cir$crit_val_MC, CCF09_dirs = dirs) cir_pow$crit_val_MC cir_pow$power_MC # Spherical r_H1 <- function(n, p, M, l = 0.5) { samp <- array(dim = c(n, p, M)) for (j in 1:M) { samp[, , j] <- mvtnorm::rmvnorm(n = n, mean = c(l, rep(0, p - 1)), sigma = diag(rep(1, p))) samp[, , j] <- samp[, , j] / sqrt(rowSums(samp[, , j]^2)) } return(samp) } dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1] sph <- unif_stat_MC(n = 50, M = 1e2, p = 3, CCF09_dirs = dirs) sph_pow <- unif_stat_MC(n = 50, M = 1e2, p = 3, r_H1 = r_H1, l = 0.5, crit_val = sph$crit_val_MC, CCF09_dirs = dirs) sph_pow$power_MC ## Pre-built r_H1 # Circular dirs <- r_unif_sph(n = 5, p = 2, M = 1)[, , 1] cir_pow <- unif_stat_MC(n = 50, M = 1e2, p = 2, r_H1 = r_alt, alt = "vMF", kappa = 1, crit_val = cir$crit_val_MC, CCF09_dirs = dirs) cir_pow$power_MC # Spherical dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1] sph_pow <- unif_stat_MC(n = 50, M = 1e2, p = 3, r_H1 = r_alt, alt = "vMF", kappa = 1, crit_val = sph$crit_val_MC, CCF09_dirs = dirs) sph_pow$power_MC
## Critical values # Single statistic, specific alpha cir <- unif_stat_MC(n = 10, M = 1e2, type = "Ajne", p = 2, alpha = 0.15) summary(cir$stats_MC) cir$crit_val_MC # All circular statistics cir <- unif_stat_MC(n = 10, M = 1e2, p = 2) head(cir$stats_MC) cir$crit_val_MC # All spherical statistics sph <- unif_stat_MC(n = 10, M = 1e2, p = 3) head(sph$stats_MC) sph$crit_val_MC ## Using a progress bar # Define a progress bar require(progress) require(progressr) handlers(handler_progress( format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:", ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"), clear = FALSE)) # Call unif_stat_MC() within with_progress() with_progress(unif_stat_MC(n = 10, M = 1e2, p = 3, chunks = 10)) # With several cores with_progress(unif_stat_MC(n = 10, M = 1e2, p = 3, chunks = 10, cores = 2)) # Instead of using with_progress() each time, it is more practical to run # handlers(global = TRUE) # once to activate progress bars in your R session ## Power computation # Single statistic cir_pow <- unif_stat_MC(n = 10, M = 1e2, type = "Ajne", p = 2, crit_val = cir$crit_val_MC) cir_pow$crit_val_MC cir_pow$power_MC # All circular statistics cir_pow <- unif_stat_MC(n = 10, M = 1e2, p = 2, crit_val = cir$crit_val_MC) cir_pow$crit_val_MC cir_pow$power_MC # All spherical statistics sph_pow <- unif_stat_MC(n = 10, M = 1e2, p = 3, crit_val = sph$crit_val_MC) sph_pow$crit_val_MC sph_pow$power_MC ## Custom r_H1 # Circular r_H1 <- function(n, p, M, l = 0.05) { stopifnot(p == 2) Theta_to_X(matrix(runif(n * M, 0, (2 - l) * pi), n, M)) } dirs <- r_unif_sph(n = 5, p = 2, M = 1)[, , 1] cir <- unif_stat_MC(n = 50, M = 1e2, p = 2, CCF09_dirs = dirs) cir_pow <- unif_stat_MC(n = 50, M = 1e2, p = 2, r_H1 = r_H1, l = 0.10, crit_val = cir$crit_val_MC, CCF09_dirs = dirs) cir_pow$crit_val_MC cir_pow$power_MC # Spherical r_H1 <- function(n, p, M, l = 0.5) { samp <- array(dim = c(n, p, M)) for (j in 1:M) { samp[, , j] <- mvtnorm::rmvnorm(n = n, mean = c(l, rep(0, p - 1)), sigma = diag(rep(1, p))) samp[, , j] <- samp[, , j] / sqrt(rowSums(samp[, , j]^2)) } return(samp) } dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1] sph <- unif_stat_MC(n = 50, M = 1e2, p = 3, CCF09_dirs = dirs) sph_pow <- unif_stat_MC(n = 50, M = 1e2, p = 3, r_H1 = r_H1, l = 0.5, crit_val = sph$crit_val_MC, CCF09_dirs = dirs) sph_pow$power_MC ## Pre-built r_H1 # Circular dirs <- r_unif_sph(n = 5, p = 2, M = 1)[, , 1] cir_pow <- unif_stat_MC(n = 50, M = 1e2, p = 2, r_H1 = r_alt, alt = "vMF", kappa = 1, crit_val = cir$crit_val_MC, CCF09_dirs = dirs) cir_pow$power_MC # Spherical dirs <- r_unif_sph(n = 5, p = 3, M = 1)[, , 1] sph_pow <- unif_stat_MC(n = 50, M = 1e2, p = 3, r_H1 = r_alt, alt = "vMF", kappa = 1, crit_val = sph$crit_val_MC, CCF09_dirs = dirs) sph_pow$power_MC
Implementation of several uniformity tests on the (hyper)sphere
,
, with calibration either in
terms of their asymptotic/exact distributions, if available, or Monte Carlo.
unif_test
receives a sample of directions
in
Cartesian coordinates, except for the circular case (
) in
which the sample can be represented in terms of angles
.
unif_test
allows to perform several tests within a single call,
facilitating thus the exploration of a dataset by applying several tests.
unif_test(data, type = "all", p_value = "asymp", alpha = c(0.1, 0.05, 0.01), M = 10000, stats_MC = NULL, crit_val = NULL, data_sorted = FALSE, K_max = 10000, method = "I", CCF09_dirs = NULL, CJ12_beta = 0, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_CCF09 = 25, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0, ...)
unif_test(data, type = "all", p_value = "asymp", alpha = c(0.1, 0.05, 0.01), M = 10000, stats_MC = NULL, crit_val = NULL, data_sorted = FALSE, K_max = 10000, method = "I", CCF09_dirs = NULL, CJ12_beta = 0, CJ12_reg = 3, cov_a = 2 * pi, Cressie_t = 1/3, K_CCF09 = 25, Poisson_rho = 0.5, Pycke_q = 0.5, Rayleigh_m = 1, Riesz_s = 1, Rothman_t = 1/3, Sobolev_vk2 = c(0, 0, 1), Softmax_kappa = 1, Stereo_a = 0, ...)
data |
sample to perform the test. A matrix of size |
type |
type of test to be applied. A character vector containing any of
the following types of tests, depending on the dimension
If |
p_value |
type of |
alpha |
vector with significance levels. Defaults to
|
M |
number of Monte Carlo replications for approximating the null
distribution when |
stats_MC |
a data frame of size |
crit_val |
table with critical values for the tests, to be used if
|
data_sorted |
is the circular data sorted? If |
K_max |
integer giving the truncation of the series that compute the
asymptotic p-value of a Sobolev test. Defaults to |
method |
method for approximating the density, distribution, or
quantile function of the weighted sum of chi squared random variables. Must
be |
CCF09_dirs |
a matrix of size |
CJ12_beta |
|
CJ12_reg |
type of asymptotic regime for CJ12 test, either |
cov_a |
|
Cressie_t |
|
K_CCF09 |
integer giving the truncation of the series present in the
asymptotic distribution of the Kolmogorov-Smirnov statistic. Defaults to
|
Poisson_rho |
|
Pycke_q |
|
Rayleigh_m |
integer |
Riesz_s |
|
Rothman_t |
|
Sobolev_vk2 |
weights for the finite Sobolev test. A non-negative
vector or matrix. Defaults to |
Softmax_kappa |
|
Stereo_a |
|
... |
If |
All the tests reject for large values of the test statistic, so the critical
values for the significance levels alpha
correspond to the
alpha
-upper quantiles of the null distribution of the test statistic.
When p_value = "asymp"
, tests that do not have an implemented or
known asymptotic are omitted, and a warning is generated.
When p_value = "MC"
, it is possible to have a progress bar indicating
the Monte Carlo simulation progress if unif_test
is wrapped with
progressr::with_progress
or if
progressr::handlers(global = TRUE)
is invoked (once) by the user.
See the examples below. The progress bar is updated with the number of
finished chunks.
All the statistics are continuous random variables except the
Hodges–Ajne statistic ("Hodges_Ajne"
), the Cressie statistic
("Cressie"
), and the number of (different) uncovered spacings
("Num_uncover"
). These three statistics are discrete random variables.
The Monte Carlo calibration for the CCF09 test is made conditionally
on the choice of CCF09_dirs
. That is, all the Monte
Carlo statistics share the same random directions.
Except for CCF09_dirs
, K_CCF09
, and CJ12_reg
, all the
test-specific parameters are vectorized.
Descriptions and references for most of the tests are available in García-Portugués and Verdebout (2018).
If only a single test is performed, a list with class
htest
containing the following components:
statistic
: the value of the test statistic.
p.value
: the p-value of the test. If
p_value = "crit_val"
, an NA
.
alternative
: a character string describing the alternative
hypothesis.
method
: a character string indicating what type of test was
performed.
data.name
: a character string giving the name of the data.
reject
: the rejection decision for the levels of significance
alpha
.
crit_val
: a vector with the critical values for the
significance levels alpha
used with p_value = "MC"
or
p_value = "asymp"
.
param
: parameter(s) used in the test (if any).
If several tests are performed, a type
-named list with
entries for each test given by the above list.
García-Portugués, E. and Verdebout, T. (2018) An overview of uniformity tests on the hypersphere. arXiv:1804.00286. doi:10.48550/arXiv.1804.00286.
## Asymptotic distribution # Circular data n <- 10 samp_cir <- r_unif_cir(n = n) # Matrix unif_test(data = samp_cir, type = "Ajne", p_value = "asymp") # Vector unif_test(data = samp_cir[, 1], type = "Ajne", p_value = "asymp") # Array unif_test(data = array(samp_cir, dim = c(n, 1, 1)), type = "Ajne", p_value = "asymp") # Several tests unif_test(data = samp_cir, type = avail_cir_tests, p_value = "asymp") # Spherical data n <- 10 samp_sph <- r_unif_sph(n = n, p = 3) # Array unif_test(data = samp_sph, type = "Bingham", p_value = "asymp") # Matrix unif_test(data = samp_sph[, , 1], type = "Bingham", p_value = "asymp") # Several tests unif_test(data = samp_sph, type = avail_sph_tests, p_value = "asymp") ## Monte Carlo # Circular data unif_test(data = samp_cir, type = "Ajne", p_value = "MC") unif_test(data = samp_cir, type = avail_cir_tests, p_value = "MC") # Spherical data unif_test(data = samp_sph, type = "Bingham", p_value = "MC") unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC") # Caching stats_MC stats_MC_cir <- unif_stat_MC(n = nrow(samp_cir), p = 2)$stats_MC stats_MC_sph <- unif_stat_MC(n = nrow(samp_sph), p = 3)$stats_MC unif_test(data = samp_cir, type = avail_cir_tests, p_value = "MC", stats_MC = stats_MC_cir) unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC", stats_MC = stats_MC_sph) ## Critical values # Circular data unif_test(data = samp_cir, type = avail_cir_tests, p_value = "crit_val") # Spherical data unif_test(data = samp_sph, type = avail_sph_tests, p_value = "crit_val") # Caching crit_val crit_val_cir <- unif_stat_MC(n = n, p = 2)$crit_val_MC crit_val_sph <- unif_stat_MC(n = n, p = 3)$crit_val_MC unif_test(data = samp_cir, type = avail_cir_tests, p_value = "crit_val", crit_val = crit_val_cir) unif_test(data = samp_sph, type = avail_sph_tests, p_value = "crit_val", crit_val = crit_val_sph) ## Specific arguments # Rothman unif_test(data = samp_cir, type = "Rothman", Rothman_t = 0.5) # CCF09 unif_test(data = samp_sph, type = "CCF09", p_value = "MC", CCF09_dirs = samp_sph[1:2, , 1]) unif_test(data = samp_sph, type = "CCF09", p_value = "MC", CCF09_dirs = samp_sph[3:4, , 1]) ## Using a progress bar when p_value = "MC" # Define a progress bar require(progress) require(progressr) handlers(handler_progress( format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:", ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"), clear = FALSE)) # Call unif_test() within with_progress() with_progress( unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC", chunks = 10, M = 1e3) ) # With several cores with_progress( unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC", cores = 2, chunks = 10, M = 1e3) ) # Instead of using with_progress() each time, it is more practical to run # handlers(global = TRUE) # once to activate progress bars in your R session
## Asymptotic distribution # Circular data n <- 10 samp_cir <- r_unif_cir(n = n) # Matrix unif_test(data = samp_cir, type = "Ajne", p_value = "asymp") # Vector unif_test(data = samp_cir[, 1], type = "Ajne", p_value = "asymp") # Array unif_test(data = array(samp_cir, dim = c(n, 1, 1)), type = "Ajne", p_value = "asymp") # Several tests unif_test(data = samp_cir, type = avail_cir_tests, p_value = "asymp") # Spherical data n <- 10 samp_sph <- r_unif_sph(n = n, p = 3) # Array unif_test(data = samp_sph, type = "Bingham", p_value = "asymp") # Matrix unif_test(data = samp_sph[, , 1], type = "Bingham", p_value = "asymp") # Several tests unif_test(data = samp_sph, type = avail_sph_tests, p_value = "asymp") ## Monte Carlo # Circular data unif_test(data = samp_cir, type = "Ajne", p_value = "MC") unif_test(data = samp_cir, type = avail_cir_tests, p_value = "MC") # Spherical data unif_test(data = samp_sph, type = "Bingham", p_value = "MC") unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC") # Caching stats_MC stats_MC_cir <- unif_stat_MC(n = nrow(samp_cir), p = 2)$stats_MC stats_MC_sph <- unif_stat_MC(n = nrow(samp_sph), p = 3)$stats_MC unif_test(data = samp_cir, type = avail_cir_tests, p_value = "MC", stats_MC = stats_MC_cir) unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC", stats_MC = stats_MC_sph) ## Critical values # Circular data unif_test(data = samp_cir, type = avail_cir_tests, p_value = "crit_val") # Spherical data unif_test(data = samp_sph, type = avail_sph_tests, p_value = "crit_val") # Caching crit_val crit_val_cir <- unif_stat_MC(n = n, p = 2)$crit_val_MC crit_val_sph <- unif_stat_MC(n = n, p = 3)$crit_val_MC unif_test(data = samp_cir, type = avail_cir_tests, p_value = "crit_val", crit_val = crit_val_cir) unif_test(data = samp_sph, type = avail_sph_tests, p_value = "crit_val", crit_val = crit_val_sph) ## Specific arguments # Rothman unif_test(data = samp_cir, type = "Rothman", Rothman_t = 0.5) # CCF09 unif_test(data = samp_sph, type = "CCF09", p_value = "MC", CCF09_dirs = samp_sph[1:2, , 1]) unif_test(data = samp_sph, type = "CCF09", p_value = "MC", CCF09_dirs = samp_sph[3:4, , 1]) ## Using a progress bar when p_value = "MC" # Define a progress bar require(progress) require(progressr) handlers(handler_progress( format = paste("(:spin) [:bar] :percent Iter: :current/:total Rate:", ":tick_rate iter/sec ETA: :eta Elapsed: :elapsedfull"), clear = FALSE)) # Call unif_test() within with_progress() with_progress( unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC", chunks = 10, M = 1e3) ) # With several cores with_progress( unif_test(data = samp_sph, type = avail_sph_tests, p_value = "MC", cores = 2, chunks = 10, M = 1e3) ) # Instead of using with_progress() each time, it is more practical to run # handlers(global = TRUE) # once to activate progress bars in your R session
Craters on Venus from the USGS Astrogeology Science Center.
venus
venus
A data frame with 967 rows and 4 variables:
name of the crater (if named).
diameter of the crater (in km).
longitude angle of the
crater center.
latitude angle of the
crater center.
The angles are such their associated planetocentric
coordinates are:
with denoting the north pole.
The script performing the data preprocessing is available at
venus.R
.
https://astrogeology.usgs.gov/search/map/Venus/venuscraters
# Load data data("venus") # Add Cartesian coordinates venus$X <- cbind(cos(venus$theta) * cos(venus$phi), sin(venus$theta) * cos(venus$phi), sin(venus$phi)) # Tests unif_test(data = venus$X, type = c("PCvM", "PAD", "PRt"), p_value = "asymp")
# Load data data("venus") # Add Cartesian coordinates venus$X <- cbind(cos(venus$theta) * cos(venus$phi), sin(venus$theta) * cos(venus$phi), sin(venus$phi)) # Tests unif_test(data = venus$X, type = c("PCvM", "PAD", "PRt"), p_value = "asymp")
Approximated density, distribution, and quantile functions for weighted sums of non-central chi squared random variables:
where are positive weights,
are positive degrees of freedom, and
are non-negative non-centrality parameters. Also, simulation of
.
d_wschisq(x, weights, dfs, ncps = 0, method = c("I", "SW", "HBE")[1], exact_chisq = TRUE, imhof_epsabs = 1e-06, imhof_epsrel = 1e-06, imhof_limit = 10000, grad_method = "simple", grad_method.args = list(eps = 1e-07)) p_wschisq(x, weights, dfs, ncps = 0, method = c("I", "SW", "HBE", "MC")[1], exact_chisq = TRUE, imhof_epsabs = 1e-06, imhof_epsrel = 1e-06, imhof_limit = 10000, M = 10000, MC_sample = NULL) q_wschisq(u, weights, dfs, ncps = 0, method = c("I", "SW", "HBE", "MC")[1], exact_chisq = TRUE, imhof_epsabs = 1e-06, imhof_epsrel = 1e-06, imhof_limit = 10000, nlm_gradtol = 1e-06, nlm_iterlim = 1000, M = 10000, MC_sample = NULL) r_wschisq(n, weights, dfs, ncps = 0) cutoff_wschisq(thre = 1e-04, weights, dfs, ncps = 0, log = FALSE, x_tail = NULL)
d_wschisq(x, weights, dfs, ncps = 0, method = c("I", "SW", "HBE")[1], exact_chisq = TRUE, imhof_epsabs = 1e-06, imhof_epsrel = 1e-06, imhof_limit = 10000, grad_method = "simple", grad_method.args = list(eps = 1e-07)) p_wschisq(x, weights, dfs, ncps = 0, method = c("I", "SW", "HBE", "MC")[1], exact_chisq = TRUE, imhof_epsabs = 1e-06, imhof_epsrel = 1e-06, imhof_limit = 10000, M = 10000, MC_sample = NULL) q_wschisq(u, weights, dfs, ncps = 0, method = c("I", "SW", "HBE", "MC")[1], exact_chisq = TRUE, imhof_epsabs = 1e-06, imhof_epsrel = 1e-06, imhof_limit = 10000, nlm_gradtol = 1e-06, nlm_iterlim = 1000, M = 10000, MC_sample = NULL) r_wschisq(n, weights, dfs, ncps = 0) cutoff_wschisq(thre = 1e-04, weights, dfs, ncps = 0, log = FALSE, x_tail = NULL)
x |
vector of quantiles. |
weights |
vector with the positive weights of the sum. Must have the
same length as |
dfs |
vector with the positive degrees of freedom of the chi squared
random variables. Must have the same length as |
ncps |
non-centrality parameters. Either |
method |
method for approximating the density, distribution, or
quantile function. Must be |
exact_chisq |
if |
imhof_epsabs , imhof_epsrel , imhof_limit
|
precision parameters passed to
|
grad_method , grad_method.args
|
numerical differentiation parameters
passed to |
M |
number of Monte Carlo samples for approximating the distribution if
|
MC_sample |
if provided, it is employed when |
u |
vector of probabilities. |
nlm_gradtol , nlm_iterlim
|
convergence control parameters passed to
|
n |
sample size. |
thre |
vector with the error thresholds of the tail probability and
mean/variance explained by the first terms of the series. Defaults to
|
log |
are |
x_tail |
scalar evaluation point for determining the upper tail
probability. If |
Four methods are implemented for approximating the distribution of a weighted sum of chi squared random variables:
"I"
: Imhof's approximation (Imhof, 1961) for the evaluation
of the distribution function. If this method is selected, the function
is simply a wrapper to imhof
from the
CompQuadForm
package (Duchesne and Lafaye De Micheaux, 2010).
"SW"
: Satterthwaite–Welch (Satterthwaite, 1946; Welch, 1938)
approximation, consisting in matching the first two moments of
with a gamma distribution.
"HBE"
: Hall–Buckley–Eagleson (Hall, 1983; Buckley and
Eagleson, 1988) approximation, consisting in matching the first
three moments of with a gamma distribution.
"MC"
: Monte Carlo approximation using the empirical
cumulative distribution function with M
simulated samples.
The Imhof method is exact up to the prescribed numerical accuracy. It is also the most time-consuming method. The density and quantile functions for this approximation are obtained by numerical differentiation and inversion, respectively, of the approximated distribution.
For the methods based on gamma matching, the GammaDist
density, distribution, and quantile functions are invoked. The
Hall–Buckley–Eagleson approximation tends to overperform the
Satterthwaite–Welch approximation.
The Monte Carlo method is relatively inaccurate and slow, but serves as an
unbiased reference of the true distribution function. The inversion of the
empirical cumulative distribution is done by quantile
.
An empirical comparison of these and other approximation methods is given in Bodenham and Adams (2016).
cutoff_wschisq
removes NA
s/NaN
s in weights
or
dfs
with a message. The threshold thre
ensures that the
tail probability of the truncated and whole series differ less than
thre
at x_tail
, or that thre
is the proportion of the
mean/variance of the whole series that is not retained. The (upper)
tail probabilities for evaluating truncation are computed using the
Hall–Buckley–Eagleson approximation at x_tail
.
d_wschisq
: density function evaluated at x
, a vector.
p_wschisq
: distribution function evaluated at x
,
a vector.
q_wschisq
: quantile function evaluated at u
, a vector.
r_wschisq
: a vector of size n
containing a random
sample.
cutoff_wschisq
: a data frame with the indexes up to which the
truncated series explains the tail probability with absolute error
thre
, or the proportion of the mean/variance of the whole series
that is not explained by the truncated series.
Eduardo García-Portugués and Paula Navarro-Esteban.
Bodenham, D. A. and Adams, N. M. (2016). A comparison of efficient approximations for a weighted sum of chi-squared random variables. Statistics and Computing, 26(4):917–928. doi:10.1007/s11222-015-9583-4
Buckley, M. J. and Eagleson, G. K. (1988). An approximation to the distribution of quadratic forms in normal random variables. Australian Journal of Statistics, 30(1):150–159. doi:10.1111/j.1467-842X.1988.tb00471.x
Duchesne, P. and Lafaye De Micheaux, P. (2010) Computing the distribution of quadratic forms: Further comparisons between the Liu–Tang–Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4):858–862. doi:10.1016/j.csda.2009.11.025
Hall, P. (1983). Chi squared approximations to the distribution of a sum of independent random variables. Annals of Probability, 11(4):1028–1036. doi:10.1214/aop/1176993451
Imhof, J. P. (1961). Computing the distribution of quadratic forms in normal variables. Biometrika, 48(3/4):419–426. doi:10.2307/2332763
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2(6):110–114. doi:10.2307/3002019
Welch, B. L. (1938). The significance of the difference between two means when the population variances are unequal. Biometrika, 29(3/4):350–362. doi:10.2307/2332010
# Plotting functions for the examples add_approx_dens <- function(x, dfs, weights, ncps) { lines(x, d_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "SW", exact_chisq = FALSE), col = 3) lines(x, d_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "HBE", exact_chisq = FALSE), col = 4) lines(x, d_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "I", exact_chisq = TRUE), col = 2) legend("topright", legend = c("True", "SW", "HBE", "I"), lwd = 2, col = c(1, 3:4, 2)) } add_approx_distr <- function(x, dfs, weights, ncps, ...) { lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "SW", exact_chisq = FALSE), col = 3) lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "HBE", exact_chisq = FALSE), col = 4) lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "MC", exact_chisq = FALSE), col = 5, type = "s") lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "I", exact_chisq = TRUE), col = 2) legend("bottomright", legend = c("True", "SW", "HBE", "MC", "I"), lwd = 2, col = c(1, 3:5, 2)) } add_approx_quant <- function(u, dfs, weights, ncps, ...) { lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "SW", exact_chisq = FALSE), col = 3) lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "HBE", exact_chisq = FALSE), col = 4) lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "MC", exact_chisq = FALSE), col = 5, type = "s") lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "I", exact_chisq = TRUE), col = 2) legend("topleft", legend = c("True", "SW", "HBE", "MC", "I"), lwd = 2, col = c(1, 3:5, 2)) } # Validation plots for density, distribution, and quantile functions u <- seq(0.01, 0.99, l = 100) old_par <- par(mfrow = c(1, 3)) # Case 1: 1 * ChiSq_3(0) + 1 * ChiSq_3(0) = ChiSq_6(0) weights <- c(1, 1) dfs <- c(3, 3) ncps <- 0 x <- seq(-1, 30, l = 100) main <- expression(1 * chi[3]^2 * (0) + 1 * chi[3]^2 * (0)) plot(x, dchisq(x, df = 6), type = "l", main = main, ylab = "Density") add_approx_dens(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(x, pchisq(x, df = 6), type = "l", main = main, ylab = "Distribution") add_approx_distr(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(u, qchisq(u, df = 6), type = "l", main = main, ylab = "Quantile") add_approx_quant(u = u, weights = weights, dfs = dfs, ncps = ncps) # Case 2: 2 * ChiSq_3(1) + 1 * ChiSq_6(0.5) + 0.5 * ChiSq_12(0.25) weights <- c(2, 1, 0.5) dfs <- c(3, 6, 12) ncps <- c(1, 0.5, 0.25) x <- seq(0, 70, l = 100) main <- expression(2 * chi[3]^2 * (1)+ 1 * chi[6]^2 * (0.5) + 0.5 * chi[12]^2 * (0.25)) samp <- r_wschisq(n = 1e4, weights = weights, dfs = dfs, ncps = ncps) hist(samp, breaks = 50, freq = FALSE, main = main, ylab = "Density", xlim = range(x), xlab = "x"); box() add_approx_dens(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(x, ecdf(samp)(x), main = main, ylab = "Distribution", type = "s") add_approx_distr(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(u, quantile(samp, probs = u), type = "s", main = main, ylab = "Quantile") add_approx_quant(u = u, weights = weights, dfs = dfs, ncps = ncps) # Case 3: \sum_{k = 1}^K k^(-3) * ChiSq_{5k}(1 / k^2) K <- 1e2 weights<- 1 / (1:K)^3 dfs <- 5 * 1:K ncps <- 1 / (1:K)^2 x <- seq(0, 25, l = 100) main <- substitute(sum(k^(-3) * chi[5 * k]^2 * (1 / k^2), k == 1, K), list(K = K)) samp <- r_wschisq(n = 1e4, weights = weights, dfs = dfs, ncps = ncps) hist(samp, breaks = 50, freq = FALSE, main = main, ylab = "Density", xlim = range(x), xlab = "x"); box() add_approx_dens(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(x, ecdf(samp)(x), main = main, ylab = "Distribution", type = "s") add_approx_distr(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(u, quantile(samp, probs = u), type = "s", main = main, ylab = "Quantile") add_approx_quant(u = u, weights = weights, dfs = dfs, ncps = ncps) par(old_par) # Cutoffs for infinite series of the last example K <- 1e7 log_weights<- -3 * log(1:K) log_dfs <- log(5) + log(1:K) (cutoff <- cutoff_wschisq(thre = 10^(-(1:4)), weights = log_weights, dfs = log_dfs, log = TRUE)) # Approximation x <- seq(0, 25, l = 100) l <- length(cutoff$mean) main <- expression(sum(k^(-3) * chi[5 * k]^2, k == 1, K)) col <- viridisLite::viridis(l) plot(x, d_wschisq(x, weights = exp(log_weights[1:cutoff$mean[l]]), dfs = exp(log_dfs[1:cutoff$mean[l]])), type = "l", ylab = "Density", col = col[l], lwd = 3) for(i in rev(seq_along(cutoff$mean)[-l])) { lines(x, d_wschisq(x, weights = exp(log_weights[1:cutoff$mean[i]]), dfs = exp(log_dfs[1:cutoff$mean[i]])), col = col[i]) } legend("topright", legend = paste0(rownames(cutoff), " (", cutoff$mean, ")"), lwd = 2, col = col)
# Plotting functions for the examples add_approx_dens <- function(x, dfs, weights, ncps) { lines(x, d_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "SW", exact_chisq = FALSE), col = 3) lines(x, d_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "HBE", exact_chisq = FALSE), col = 4) lines(x, d_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "I", exact_chisq = TRUE), col = 2) legend("topright", legend = c("True", "SW", "HBE", "I"), lwd = 2, col = c(1, 3:4, 2)) } add_approx_distr <- function(x, dfs, weights, ncps, ...) { lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "SW", exact_chisq = FALSE), col = 3) lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "HBE", exact_chisq = FALSE), col = 4) lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "MC", exact_chisq = FALSE), col = 5, type = "s") lines(x, p_wschisq(x, weights = weights, dfs = dfs, ncps = ncps, method = "I", exact_chisq = TRUE), col = 2) legend("bottomright", legend = c("True", "SW", "HBE", "MC", "I"), lwd = 2, col = c(1, 3:5, 2)) } add_approx_quant <- function(u, dfs, weights, ncps, ...) { lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "SW", exact_chisq = FALSE), col = 3) lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "HBE", exact_chisq = FALSE), col = 4) lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "MC", exact_chisq = FALSE), col = 5, type = "s") lines(u, q_wschisq(u, weights = weights, dfs = dfs, ncps = ncps, method = "I", exact_chisq = TRUE), col = 2) legend("topleft", legend = c("True", "SW", "HBE", "MC", "I"), lwd = 2, col = c(1, 3:5, 2)) } # Validation plots for density, distribution, and quantile functions u <- seq(0.01, 0.99, l = 100) old_par <- par(mfrow = c(1, 3)) # Case 1: 1 * ChiSq_3(0) + 1 * ChiSq_3(0) = ChiSq_6(0) weights <- c(1, 1) dfs <- c(3, 3) ncps <- 0 x <- seq(-1, 30, l = 100) main <- expression(1 * chi[3]^2 * (0) + 1 * chi[3]^2 * (0)) plot(x, dchisq(x, df = 6), type = "l", main = main, ylab = "Density") add_approx_dens(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(x, pchisq(x, df = 6), type = "l", main = main, ylab = "Distribution") add_approx_distr(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(u, qchisq(u, df = 6), type = "l", main = main, ylab = "Quantile") add_approx_quant(u = u, weights = weights, dfs = dfs, ncps = ncps) # Case 2: 2 * ChiSq_3(1) + 1 * ChiSq_6(0.5) + 0.5 * ChiSq_12(0.25) weights <- c(2, 1, 0.5) dfs <- c(3, 6, 12) ncps <- c(1, 0.5, 0.25) x <- seq(0, 70, l = 100) main <- expression(2 * chi[3]^2 * (1)+ 1 * chi[6]^2 * (0.5) + 0.5 * chi[12]^2 * (0.25)) samp <- r_wschisq(n = 1e4, weights = weights, dfs = dfs, ncps = ncps) hist(samp, breaks = 50, freq = FALSE, main = main, ylab = "Density", xlim = range(x), xlab = "x"); box() add_approx_dens(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(x, ecdf(samp)(x), main = main, ylab = "Distribution", type = "s") add_approx_distr(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(u, quantile(samp, probs = u), type = "s", main = main, ylab = "Quantile") add_approx_quant(u = u, weights = weights, dfs = dfs, ncps = ncps) # Case 3: \sum_{k = 1}^K k^(-3) * ChiSq_{5k}(1 / k^2) K <- 1e2 weights<- 1 / (1:K)^3 dfs <- 5 * 1:K ncps <- 1 / (1:K)^2 x <- seq(0, 25, l = 100) main <- substitute(sum(k^(-3) * chi[5 * k]^2 * (1 / k^2), k == 1, K), list(K = K)) samp <- r_wschisq(n = 1e4, weights = weights, dfs = dfs, ncps = ncps) hist(samp, breaks = 50, freq = FALSE, main = main, ylab = "Density", xlim = range(x), xlab = "x"); box() add_approx_dens(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(x, ecdf(samp)(x), main = main, ylab = "Distribution", type = "s") add_approx_distr(x = x, weights = weights, dfs = dfs, ncps = ncps) plot(u, quantile(samp, probs = u), type = "s", main = main, ylab = "Quantile") add_approx_quant(u = u, weights = weights, dfs = dfs, ncps = ncps) par(old_par) # Cutoffs for infinite series of the last example K <- 1e7 log_weights<- -3 * log(1:K) log_dfs <- log(5) + log(1:K) (cutoff <- cutoff_wschisq(thre = 10^(-(1:4)), weights = log_weights, dfs = log_dfs, log = TRUE)) # Approximation x <- seq(0, 25, l = 100) l <- length(cutoff$mean) main <- expression(sum(k^(-3) * chi[5 * k]^2, k == 1, K)) col <- viridisLite::viridis(l) plot(x, d_wschisq(x, weights = exp(log_weights[1:cutoff$mean[l]]), dfs = exp(log_dfs[1:cutoff$mean[l]])), type = "l", ylab = "Density", col = col[l], lwd = 3) for(i in rev(seq_along(cutoff$mean)[-l])) { lines(x, d_wschisq(x, weights = exp(log_weights[1:cutoff$mean[i]]), dfs = exp(log_dfs[1:cutoff$mean[i]])), col = col[i]) } legend("topright", legend = paste0(rownames(cutoff), " (", cutoff$mean, ")"), lwd = 2, col = col)