Title: | A Collection of Functions for Directional Data Analysis |
---|---|
Description: | A collection of functions for directional data (including massive data, with millions of observations) analysis. Hypothesis testing, discriminant and regression analysis, MLE of distributions and more are included. The standard textbook for such data is the "Directional Statistics" by Mardia, K. V. and Jupp, P. E. (2000). Other references include: a) Paine J.P., Preston S.P., Tsagris M. and Wood A.T.A. (2018). "An elliptically symmetric angular Gaussian distribution". Statistics and Computing 28(3): 689-697. <doi:10.1007/s11222-017-9756-4>. b) Tsagris M. and Alenazi A. (2019). "Comparison of discriminant analysis methods on the sphere". Communications in Statistics: Case Studies, Data Analysis and Applications 5(4):467--491. <doi:10.1080/23737484.2019.1684854>. c) Paine J.P., Preston S.P., Tsagris M. and Wood A.T.A. (2020). "Spherical regression models with general covariates and anisotropic errors". Statistics and Computing 30(1): 153--165. <doi:10.1007/s11222-019-09872-2>. d) Tsagris M. and Alenazi A. (2024). "An investigation of hypothesis testing procedures for circular and spherical mean vectors". Communications in Statistics-Simulation and Computation, 53(3): 1387--1408. <doi:10.1080/03610918.2022.2045499>. e) Yu Z. and Huang X. (2024). A new parameterization for elliptically symmetric angular Gaussian distributions of arbitrary dimension. Electronic Journal of Statistics, 18(1): 301--334. <doi:10.1214/23-EJS2210>. f) Tsagris M. and Alzeley O. (2024). "Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling". Australian & New Zealand Journal of Statistics (Accepted for publication). <doi:10.48550/arXiv.2302.02468>. g) Tsagris M., Papastamoulis P. and Kato S. (2024). "Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution". <doi:10.48550/arXiv.2409.03292>. |
Authors: | Michail Tsagris [aut, cre], Giorgos Athineou [aut], Christos Adam [aut], Zehao Yu [aut], Anamul Sajib [ctb], Eli Amson [ctb], Micah J. Waldstein [ctb], Panagiotis Papastamoulis [ctb] |
Maintainer: | Michail Tsagris <[email protected]> |
License: | GPL (>= 2) |
Version: | 7.0 |
Built: | 2024-12-05 12:03:40 UTC |
Source: | CRAN |
Circular-linear regression, spherical-spherical regression, spherical regression, discriminant analysis, ANOVA for circular and (hyper-)spherical data, tests for eaquality of conentration parameters, maximum likelihood estimation of the parameters of many distributions, random values generation from various distributions, contour plots and many more functions are included.
Package: | Directional |
Type: | Package |
Version: | 7.0 |
Date: | 2024-12-04 |
License: | GPL-2 |
Michail Tsagris [email protected].
Acknowledgments:
Professor Andy Wood and Dr Simon Preston from the university of Nottingham are highly appreciated for being my supervisors during my post-doc in directional data analysis.
Dr Georgios Pappas (former postDoc at the university of Nottingham) helped me construct the contour plots of the von Mises-Fisher and the Kent distribution.
Dr Christopher Fallaize and Dr Theo Kypraios from the university of Nottingham have provided a function for simulating from the Bingham distribution using rejection sampling. So any questions regarding this function should be addressed to them.
Dr Kwang-Rae Kim (post-doc at the university of Nottingham) answered some of my questions.
Giorgos Borboudakis (PhD student at the university of Crete) pointed out to me a not so clear message in the algorithm of generating random values from the von Mises-Fisher distribution.
Panagiotis (pronounced Panayiotis) Tzirakis (master student at the department of computer science in Heraklion during the 2013-2015 seasons) showed me how to perform parallel computing in R and he is greatly acknowledged and appreciated not only from me but from all the readers of this document. He also helped me with the vectorization of some contour plot functions.
Professor John Kent from the university of Leeds is acknowledged for clarifying one thing with the ovalness parameter in his distribution.
Phillip Paine (postdoc at the university of Nottingham) spotted that the function rfb
is rather slow and he suggested me to change it. The function has changed now and this is also due to Joshua Davis (from Carleton College, Northfield, MN) who spotted that mistakes could occur, due a vector not being a matrix.
Professor Kurt Hornik from the Vienna university of economics and business is greatly acknowledged for his patience and contast help with this (and not only) R package.
Manos Papadakis is also acknowledged for his programming tips and for his assistance with the "htest" class object.
Dr Mojgan Golzy spotted a mistake in the function desag
and Michail is very happy for that.
Lisette de Jonge-Hoekstra from the University of Groningen found a wrong sentence in the help file of function spml.reg
which is now deleted.
Peter Harremoes from the Copenhagen Business College spotted a mistake in the confidence interval of the function circ.summary
which has now been corrected.
Dr Gregory Emvalomatis from the University of Crete helped me understand better the EM algorithm for mixture models and I fixed a bug in the function mixvmf.mle
.
Kinley Russell, PhD student at the Johns Hopkins University School of Medicine, suggested that I include bootstrap ANOVA functions.
Sia Ahmadi found a mistake in the function conc.test
which has now been corrected.
If you want more information on many of these algorithms see Chapters 9 and 10 in the following document. https://www.researchgate.net/publication/324363311_Multivariate_data_analysis_in_R
Michail Tsagris [email protected], Giorgos Athineou <[email protected]>, Christos Adam [email protected], Zehao Yu <[email protected]>, Anamul Sajib <[email protected]>, Eli Amson <[email protected]>, Micah J. Waldstein <[email protected]> and Panagiotis Papastamoulis <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley and Sons.
(Hyper-)spherical regression using the rotational symmetric distributions.
vmfreg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) spcauchy.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) pkbd.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-6) pkbd.reg2(y, x, con = TRUE, xnew = NULL, tol = 1e-6)
vmfreg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) spcauchy.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) pkbd.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-6) pkbd.reg2(y, x, con = TRUE, xnew = NULL, tol = 1e-6)
y |
A matrix with any number of columns containing the (unit vector) (hyper-)spherical data. |
x |
The predictor variable(s), they can be continnuous, (hyper-)spherical, categorical or a mix of them. |
con |
Do you want the constant term in the regression? |
xnew |
If you have new data use it, otherwise leave it NULL. |
tol |
A tolerance value to decide when to stop the successive optimaizations. |
The second parametrization of the projected normal and of the von Mises-Fisher regression (Paine et al., 2020) is applied. The same is true for the SIPC distribution. For more information see the paper by Paine et al. (2020). The difference from vmf.reg
is that the latter is designed for the sphere only, whereas this function works in the hyper-sphere also.
As for the spcauchy.reg() and pkbd.reg() they are based upon the spherical Cauchy (Kato and McCullagh, 2020) and the Poisson kernel-based (Golzy and Markatou, 2020) distributions. These two use Newton-Raphson, but the pkbd.reg2() uses the optim
. We have noticed some numerical issues with the pkbd.reg() when the dimensionalities of the variables are large and this is why we also provide the (much slower) pkbd.reg2() function.
A list including:
runtime |
The runtime of the regression. |
iters |
The number of iterations required until convergence of the Newton-Raphson algorithm. |
loglik |
The log-likelihood of the regression model. |
fit |
This is a measure of fit of the estimated values, defined as |
beta |
The beta coefficients. |
seb |
The standard error of the beta coefficients. |
ki |
The norm of the fitted values. In the von Mises-Fisher regression this is the concentration parameter of each observation. This is returned if the argument "xnew" is NULL. |
g2 |
The norm of the fitted values. In the spherical Cauchy and the PKBD regression this is the concentration parameter of each observation. This is returned if the argument "xnew" is NULL. |
est |
The fitted values of xnew if "xnew" is NULL. If it is not NULL, the fitted values for the "xnew" you supplied will be returned. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
P. J. Paine, S. P. Preston, M. Tsagris and Andrew T. A. Wood (2020). Spherical regression models with general covariates and anisotropic errors. Statistics and Computing, 30(1): 153–165. https://link.springer.com/content/pdf/10.1007
Kato S. and McCullagh P. (2020). Some properties of a Cauchy family on the sphere derived from the Mobius transformations. Bernoulli, 26(4): 3224–3248.
Golzy M. and Markatou M. (2020). Poisson kernel-based clustering on the sphere: convergence properties, identifiability, and a method of sampling. Journal of Computational and Graphical Statistics, 29(4): 758–770.
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
y <- rvmf(150, rnorm(5), 5) a <- vmfreg(y, iris[, 1]) b <- spcauchy.reg(y, iris)
y <- rvmf(150, rnorm(5), 5) a <- vmfreg(y, iris[, 1]) b <- spcauchy.reg(y, iris)
A test for testing the equality of the concentration parameter among g samples, where g >= 2 for ciruclar data. It is a tangential approach.
tang.conc(u, ina, rads = FALSE)
tang.conc(u, ina, rads = FALSE)
u |
A numeric vector containing the values of all samples. |
ina |
A numerical variable or factor indicating the groups of each value. |
rads |
If the data are in radians this should be TRUE and FALSE otherwise. |
This test works for circular data.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons. Fisher, N. I. (1995). Statistical analysis of circular data. Cambridge University Press.
embed.circaov, hcf.circaov, lr.circaov, het.circaov, conc.test
x <- rvonmises(100, 2.4, 15) ina <- rep(1:4,each = 25) tang.conc(x, ina, rads = TRUE)
x <- rvonmises(100, 2.4, 15) ina <- rep(1:4,each = 25) tang.conc(x, ina, rads = TRUE)
Angular central Gaussian random values simulation.
racg(n, sigma)
racg(n, sigma)
n |
The sample size, a numerical value. |
sigma |
The covariance matrix in |
The algorithm uses univariate normal random values and transforms them to multivariate via a spectral decomposition. The vectors are then scaled to have unit length.
A matrix with the simulated data.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tyler D. E. (1987). Statistical analysis for the angular central Gaussian distribution on the sphere. Biometrika 74(3): 579–589.
s <- cov( iris[, 1:4] ) x <- racg(100, s) Directional::acg.mle(x) Directional::vmf.mle(x) ## the concentration parameter, kappa, is very low, close to zero, as expected.
s <- cov( iris[, 1:4] ) x <- racg(100, s) Directional::acg.mle(x) Directional::vmf.mle(x) ## the concentration parameter, kappa, is very low, close to zero, as expected.
Analysis of variance for (hyper-)spherical data.
hcf.aov(x, ina, fc = TRUE) hclr.aov(x, ina) lr.aov(x, ina) embed.aov(x, ina) het.aov(x, ina)
hcf.aov(x, ina, fc = TRUE) hclr.aov(x, ina) lr.aov(x, ina) embed.aov(x, ina) het.aov(x, ina)
x |
A matrix with the data in Euclidean coordinates, i.e. unit vectors. |
ina |
A numerical variable or a factor indicating the group of each vector. |
fc |
A boolean that indicates whether a corrected F test should be used or not. |
The high concentration (hcf.aov), high concentration log-likelihood ratio (hclr.aov), log-likelihood ratio (lr.aov), embedding approach (embed.aov) or the non equal concentration parameters approach (het.aov) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degree(s) of freedom of the test. |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1): 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
hcf.boot, hcfboot, hclr.circaov,
x <- rvmf(60, rnorm(3), 15) ina <- rep(1:3, each = 20) hcf.aov(x, ina) hcf.aov(x, ina, fc = FALSE) lr.aov(x, ina) embed.aov(x, ina) het.aov(x, ina)
x <- rvmf(60, rnorm(3), 15) ina <- rep(1:3, each = 20) hcf.aov(x, ina) hcf.aov(x, ina, fc = FALSE) lr.aov(x, ina) embed.aov(x, ina) het.aov(x, ina)
Analysis of variance for circular data.
hcf.circaov(u, ina, rads = FALSE) hclr.circaov(u, ina, rads = FALSE) lr.circaov(u, ina, rads = FALSE) het.circaov(u, ina, rads = FALSE) embed.circaov(u, ina, rads = FALSE)
hcf.circaov(u, ina, rads = FALSE) hclr.circaov(u, ina, rads = FALSE) lr.circaov(u, ina, rads = FALSE) het.circaov(u, ina, rads = FALSE) embed.circaov(u, ina, rads = FALSE)
u |
A numeric vector containing the data. |
ina |
A numerical or factor variable indicating the group of each value. |
rads |
If the data are in radians, this should be TRUE and FALSE otherwise. |
The high concentration (hcf.circaov), high concentration likelihood ratio (hclr.aov), log-likelihood ratio (lr.circaov), embedding approach (embed.circaov) or the non equal concentration parameters approach (het.circaov) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degree(s) of freedom of the test. |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1): 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
hclr.aov, hcfcirc.boot, hcfcircboot
x <- rvonmises(100, 2.4, 15) ina <- rep(1:4,each = 25) hcf.circaov(x, ina, rads = TRUE) lr.circaov(x, ina, rads = TRUE) het.circaov(x, ina, rads = TRUE) embed.circaov(x, ina, rads = TRUE) hclr.circaov(x, ina, rads = TRUE)
x <- rvonmises(100, 2.4, 15) ina <- rep(1:4,each = 25) hcf.circaov(x, ina, rads = TRUE) lr.circaov(x, ina, rads = TRUE) het.circaov(x, ina, rads = TRUE) embed.circaov(x, ina, rads = TRUE) hclr.circaov(x, ina, rads = TRUE)
BIC to choose the number of components in a model based clustering using mixtures of rotationally symmetric distributions
bic.mixvmf(x, G = 5, n.start = , tol = 1e-6, maxiters = 500) bic.mixspcauchy(x, G = 5, n.start = 5, tol = 1e-6, maxiters = 500) bic.mixpkbd(x, G = 5, n.start = 5, tol = 1e-6, maxiters = 500)
bic.mixvmf(x, G = 5, n.start = , tol = 1e-6, maxiters = 500) bic.mixspcauchy(x, G = 5, n.start = 5, tol = 1e-6, maxiters = 500) bic.mixpkbd(x, G = 5, n.start = 5, tol = 1e-6, maxiters = 500)
x |
A matrix containing directional data. |
G |
The maximum number of clusters to be tested. Default value is 5. |
n.start |
The number of random starts to try. See also R's built-in function |
tol |
The tolerance value to terminate the EM algorithm. |
maxiters |
The maximum number of iterations to perform. |
The function computes the BIC (and ICL) to decide on the optimal number of clusters when using mixtures of von Mises-Fisher, mixtures of spherical Cauchy or mixtures of Poisson kernel-based distributions.
A plot of the BIC values and a list including:
bic |
The BIC values for all the models tested. |
icl |
The ICL values for all the models tested. |
runtime |
The run time of the algorithm. A numeric vector. The first element is the user time, the second element is the system time and the third element is the elapsed time. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Hornik, K. and Grun, B. (2014). movMF: An R package for fitting mixtures of von Mises-Fisher distributions. Journal of Statistical Software, 58(10): 1–31.
Biernacki C., Celeux G. and Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(7): 719–725.
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292
mixvmf.mle, rmixvmf, mixvmf.contour
x <- as.matrix( iris[, 1:4] ) x <- x / sqrt( rowSums(x^2) ) bic.mixvmf(x)
x <- as.matrix( iris[, 1:4] ) x <- x / sqrt( rowSums(x^2) ) bic.mixvmf(x)
Bootstrap 2-sample mean test for (hyper-)spherical data.
hcf.boot(x1, x2, fc = TRUE, B = 999) lr.boot(x1, x2, B = 999) hclr.boot(x1, x2, B = 999) embed.boot(x1, x2, B = 999) het.boot(x1, x2, B = 999)
hcf.boot(x1, x2, fc = TRUE, B = 999) lr.boot(x1, x2, B = 999) hclr.boot(x1, x2, B = 999) embed.boot(x1, x2, B = 999) het.boot(x1, x2, B = 999)
x1 |
A matrix with the data in Euclidean coordinates, i.e. unit vectors. |
x2 |
A matrix with the data in Euclidean coordinates, i.e. unit vectors. |
fc |
A boolean that indicates whether a corrected F test should be used or not. |
B |
The number of bootstraps to perform. |
The high concentration (hcf.boot), log-likelihood ratio (lr.boot), high concentration log-likelihood ratio (hclr.boot), embedding approach (embed.boot) or the non equal concentration parameters approach (het.boot) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. Since these are bootstrap based tests this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1): 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
x <- rvmf(60, rnorm(3), 15) ina <- rep(1:2, each = 30) x1 <- x[ina == 1, ] x2 <- x[ina == 2, ] hcf.boot(x1, x2) lr.boot(x1, x2) het.boot(x1, x2)
x <- rvmf(60, rnorm(3), 15) ina <- rep(1:2, each = 30) x1 <- x[ina == 1, ] x2 <- x[ina == 2, ] hcf.boot(x1, x2) lr.boot(x1, x2) het.boot(x1, x2)
Bootstrap 2-sample mean test for circular data.
hcfcirc.boot(u1, u2, rads = TRUE, B = 999) lrcirc.boot(u1, u2, rads = TRUE, B = 999) hclrcirc.boot(u1, u2, rads = TRUE, B = 999) embedcirc.boot(u1, u2, rads = TRUE, B = 999) hetcirc.boot(u1, u2, rads = TRUE, B = 999)
hcfcirc.boot(u1, u2, rads = TRUE, B = 999) lrcirc.boot(u1, u2, rads = TRUE, B = 999) hclrcirc.boot(u1, u2, rads = TRUE, B = 999) embedcirc.boot(u1, u2, rads = TRUE, B = 999) hetcirc.boot(u1, u2, rads = TRUE, B = 999)
u1 |
A numeric vector containing the data of the first sample. |
u2 |
A numeric vector containing the data of the first sample. |
rads |
If the data are in radians, this should be TRUE and FALSE otherwise. |
B |
The number of bootstraps to perform. |
The high concentration (hcfcirc.boot), the log-likelihood ratio test (lrcirc.boot), high concentration log-likelihood ratio (hclrcirc.boot), embedding approach (embedcirc.boot), or the non equal concentration parameters approach (hetcirc.boot) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. Since these are bootstrap based tests this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1): 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
hcf.circaov, hcfcircboot, het.aov
u1 <- rvonmises(20, 2.4, 5) u2 <- rvonmises(20, 2.4, 10) hcfcirc.boot(u1, u2)
u1 <- rvonmises(20, 2.4, 5) u2 <- rvonmises(20, 2.4, 10) hcfcirc.boot(u1, u2)
Bootstrap ANOVA for (hyper-)spherical data.
hcfboot(x, ina, B = 999) hetboot(x, ina, B = 999)
hcfboot(x, ina, B = 999) hetboot(x, ina, B = 999)
x |
A matrix with the combined data (from all groups) in Euclidean coordinates, i.e. unit vectors. |
ina |
The grouping variables. A factor or a numerical vector specifying the groups to which each observation belongs to. |
B |
The number of bootstraps to perform. |
The high concentration (hcfboot), or the non equal concentration parameters approach (hetboot) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. Since these are bootstrap based tests this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1): 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
x <- rvmf(60, rnorm(3), 10) ina <- rep(1:3, each = 20) hcfboot(x, ina)
x <- rvmf(60, rnorm(3), 10) ina <- rep(1:3, each = 20) hcfboot(x, ina)
Bootstrap ANOVA for circular data.
hcfcircboot(u, ina, rads = TRUE, B = 999) hetcircboot(u, ina, rads = TRUE, B = 999)
hcfcircboot(u, ina, rads = TRUE, B = 999) hetcircboot(u, ina, rads = TRUE, B = 999)
u |
A numeric vector containing the data of all groups. |
ina |
The grouping variables. A factor or a numerical vector specifying the groups to which each observation belongs to. |
rads |
If the data are in radians, this should be TRUE and FALSE otherwise. |
B |
The number of bootstraps to perform. |
The high concentration (hcfcircboot), or the non equal concentration parameters approach (hetcircboot) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. Since these are bootstrap based tests this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1): 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
u1 <- rvonmises(20, 2.4, 5) u2 <- rvonmises(20, 2.4, 10) hcfcirc.boot(u1, u2)
u1 <- rvonmises(20, 2.4, 5) u2 <- rvonmises(20, 2.4, 10) hcfcirc.boot(u1, u2)
It plots the log probability trace of matrix Fisher distribution which should close to the maximum value of the logarithm of matrix Fisher distribution, if samples are correctly generated.
visual.check(x, Fa)
visual.check(x, Fa)
x |
The simulated data. An array with at least 2 3x3 matrices. |
Fa |
An arbitrary 3x3 matrix represents the parameter matrix of this distribution. |
For a given parameter matrix Fa, maximum value of the logarithm of matrix Fisher distribution is calculated via
the form of singular value decomposition of which is
. Multiply the last
column of
by
and replace small eigenvalue, say,
by
if
.
A plot which shows log probability trace of matrix Fisher distribution. The values are also returned.
Anamul Sajib.
R implementation and documentation: Anamul Sajib <[email protected]>.
Habeck M. (2009). Generation of three-dimensional random rotations in fitting and matching problems. Computational Statistics, 24(4):719–731.
Fa <- matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3) / 10 x <- rmatrixfisher(1000, Fa) a <- visual.check(x, Fa)
Fa <- matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3) / 10 x <- rmatrixfisher(1000, Fa) a <- visual.check(x, Fa)
Circular correlations between two circular variables.
circ.cors1(theta, phi, rads = FALSE) circ.cors2(theta, phi, rads = FALSE)
circ.cors1(theta, phi, rads = FALSE) circ.cors2(theta, phi, rads = FALSE)
theta |
The first cirular variable expressed in radians, not degrees. |
phi |
The other cirular variable. In the case of "circ.cors1" this is a matrix with many circular variables. In either case, the values must be in radians, not degrees. |
rads |
If the data are expressed in rads, then this should be TRUE. If the data are in degrees, then this is FALSE. |
Correlation for circular variables using the cosinus and sinus formula of Jammaladaka and SenGupta (1988).
A matrix with two columns, the correlations and the p-values.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Jammalamadaka, R. S. and Sengupta, A. (2001). Topics in circular statistics. World Scientific.
Jammalamadaka, S. R. and Sarma, Y. R. (1988). A correlation coefficient for angular variables. Statistical Theory and Data Analysis, 2:349–364.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
y <- runif(50, 0, 2 * pi) x <- matrix(runif(50 * 10, 0, 2 * pi), ncol = 10) circ.cors1(y, x, rads = TRUE)
y <- runif(50, 0, 2 * pi) x <- matrix(runif(50 * 10, 0, 2 * pi), ncol = 10) circ.cors1(y, x, rads = TRUE)
Circular correlations between two circular variables.
circ.cor1(theta, phi, rads = FALSE) circ.cor2(theta, phi, rads = FALSE)
circ.cor1(theta, phi, rads = FALSE) circ.cor2(theta, phi, rads = FALSE)
theta |
The first cirular variable. |
phi |
The other cirular variable. |
rads |
If the data are expressed in rads, then this should be TRUE. If the data are in degrees, then this is FALSE. |
circ.cor1: Correlation for circular variables using the cosinus and sinus formula of Jammaladaka and SenGupta (1988).
circ.cor2: Correlation for circular variables using the cosinus and sinus formula of Mardia and Jupp (2000).
A vector including:
rho |
The value of the correlation coefficient. |
p-value |
The p-value of the zero correlation hypothesis testing. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Jammalamadaka, R. S. and Sengupta, A. (2001). Topics in circular statistics. World Scientific.
Jammalamadaka, S. R. and Sarma, Y. R. (1988) . A correlation coefficient for angular variables. Statistical Theory and Data Analysis, 2:349–364.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
circlin.cor, circ.cor2, spml.reg
y <- runif(50, 0, 2 * pi) x <- runif(50, 0, 2 * pi) circ.cor1(x, y, rads = TRUE) circ.cor2(x, y, rads = TRUE)
y <- runif(50, 0, 2 * pi) x <- runif(50, 0, 2 * pi) circ.cor1(x, y, rads = TRUE) circ.cor2(x, y, rads = TRUE)
Circular distance correlation between two circular variables.
circ.dcor(theta, phi, rads = FALSE)
circ.dcor(theta, phi, rads = FALSE)
theta |
The first cirular variable. |
phi |
The other cirular variable. |
rads |
If the data are expressed in rads, then this should be TRUE. If the data are in degrees, then this is FALSE. |
The angular data are transformed to their Euclidean coordinates and then the distance correlation is computed.
A list including:
dcov |
The distance covariance. |
dvarX |
The distance variance of x. |
dvarY |
The distance variance of Y. |
dcor |
The distance correlation. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
G.J. Szekely, M.L. Rizzo and N. K. Bakirov (2007). Measuring and Testing Independence by Correlation of Distances. Annals of Statistics, 35(6):2769-2794.
circlin.cor, circ.cor2, spher.dcor
y <- runif(50, 0, 2 * pi) x <- runif(50, 0, 2 * pi) circ.dcor(x, y, rads = TRUE)
y <- runif(50, 0, 2 * pi) x <- runif(50, 0, 2 * pi) circ.dcor(x, y, rads = TRUE)
Regression with circular dependent variable and Euclidean or categorical independent variables.
spml.reg(y, x, rads = TRUE, xnew = NULL, seb = FALSE, tol = 1e-07) circpurka.reg(y, x, rads = TRUE, xnew = NULL) cipc.reg(y, x, rads = TRUE, xnew = NULL, tol = 1e-06) gcpc.reg(y, x, rads = TRUE, reps = 20, xnew = NULL)
spml.reg(y, x, rads = TRUE, xnew = NULL, seb = FALSE, tol = 1e-07) circpurka.reg(y, x, rads = TRUE, xnew = NULL) cipc.reg(y, x, rads = TRUE, xnew = NULL, tol = 1e-06) gcpc.reg(y, x, rads = TRUE, reps = 20, xnew = NULL)
y |
The dependent variable, a numerical vector, it can be in radians or degrees. |
x |
The independent variable(s). Can be Euclidean or categorical (factor variables). |
rads |
If the dependent variable is expressed in rads, this should be TRUE and FALSE otherwise. |
reps |
How many starting values shall the algortihm use? By default it uses 20 different starting values. |
xnew |
The new values of some independent variable(s) whose circular values you want to predict. Can be Euclidean or categorical. If they are categorical, the user must provide them as dummy variables. It does not accept factor variables. If you have no new x values, leave it NULL (default). |
seb |
a boolean variable. If TRUE, the standard error of the coefficients will be be returned. Set to FALSE in case of simulation studies or in other cases such as a forward regression setting for example. In these cases, it can save some time. |
tol |
The tolerance value to terminate the Newton-Raphson algorithm. |
For the spml.reg(), the Newton-Raphson algorithm is fitted in this regression as described in Presnell et al. (1998). For the cipc.reg(), the Newton-Raphson algorithm is fitted in this regression as described in Tsagris and Alenazy (2023). Note that the cipc.reg() is the same as the wrapped Cauchy regression. For the circpurka.reg() the optim() function is employed. For the gcpc.reg() the optim() and the optimise() functions are being used.
A list including:
runtime |
The runtime of the procedure. |
iters |
The number of iterations required until convergence of the Newton-Raphson algorithm. |
beta |
The regression coefficients. |
seb |
The standard errors of the coefficients. |
loglik |
The value of the maximised log-likelihood. |
est |
The fitted values expressed in radians if the obsereved data are in radians and in degrees otherwise. If xnew is not NULL, i.e. if you have new x values, then the predicted values of y will be returned. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Presnell B., Morrison S. P. and Littell Ramon C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068-1077.
Purkayastha S. (1991). A Rotationally Symmetric Directional Distribution: Obtained through Maximum Likelihood Characterization. The Indian Journal of Statistics, Series A, 53(1): 70-83
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
circlin.cor, circ.cor1, circ.cor2, spher.cor, spher.reg
x <- rnorm(100) z <- cbind(3 + 2 * x, 1 -3 * x) y <- cbind( rnorm(100,z[ ,1], 1), rnorm(100, z[ ,2], 1) ) y <- y / sqrt( rowSums(y^2) ) y <- ( atan( y[, 2] / y[, 1] ) + pi * I(y[, 1] < 0) ) %% (2 * pi) a <- spml.reg(y, x, rads = TRUE, xnew = x) b <- cipc.reg(y, x, rads = TRUE, xnew = x)
x <- rnorm(100) z <- cbind(3 + 2 * x, 1 -3 * x) y <- cbind( rnorm(100,z[ ,1], 1), rnorm(100, z[ ,2], 1) ) y <- y / sqrt( rowSums(y^2) ) y <- ( atan( y[, 2] / y[, 1] ) + pi * I(y[, 1] < 0) ) %% (2 * pi) a <- spml.reg(y, x, rads = TRUE, xnew = x) b <- cipc.reg(y, x, rads = TRUE, xnew = x)
It calculates the squared correlation between a circular and one or more linear variables.
circlin.cor(theta, x, rads = FALSE)
circlin.cor(theta, x, rads = FALSE)
theta |
The circular variable. |
x |
The linear variable or a matrix containing many linear variables. |
rads |
If the circualr variable is in rads, this should be TRUE and FALSE otherwise. |
The squared correlation between a circular and one or more linear variables is calculated.
A matrix with as many rows as linear variables including:
R-squared |
The value of the squared correlation. |
p-value |
The p-value of the zero correlation hypothesis testing. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
circ.cor1, circ.cor2, spml.reg
phi <- rvonmises(50, 2, 20, rads = TRUE) x <- 2 * phi + rnorm(50) y <- matrix(rnorm(50 * 5), ncol = 5) circlin.cor(phi, x, rads = TRUE) circlin.cor(phi, y, rads = TRUE)
phi <- rvonmises(50, 2, 20, rads = TRUE) x <- 2 * phi + rnorm(50) y <- matrix(rnorm(50 * 5), ncol = 5) circlin.cor(phi, x, rads = TRUE) circlin.cor(phi, y, rads = TRUE)
Column-wise MLE of the angular Gaussian and the von Mises Fisher distributions.
colspml.mle(x ,tol = 1e-07, maxiters = 100, parallel = FALSE) colvm.mle(x, tol = 1e-07)
colspml.mle(x ,tol = 1e-07, maxiters = 100, parallel = FALSE) colvm.mle(x, tol = 1e-07)
x |
A numerical matrix with data. Each column refers to a different vector of observations of the same distribution. The values of for Lognormal must be greater than zero, for the logitnormal they must by percentages, exluding 0 and 1, whereas for the Borel distribution the x must contain integer values greater than 1. |
tol |
The tolerance value to terminate the Newton-Raphson algorithm. |
maxiters |
The maximum number of iterations that can take place in each regression. |
parallel |
Do you want this to be executed in parallel or not. The parallel takes place in C++, and the number of threads is defined by each system's availiable cores. |
For each column, spml.mle function is applied that fits the angular Gaussian distribution estimates its parameters and computes the maximum log-likelihood.
A matrix with four columns. The first two are the mean vector, then the parameter, and the fourth
column contains maximum log-likelihood.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Presnell Brett, Morrison Scott P. and Littell Ramon C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
x <- matrix( runif(100 * 10), ncol = 10) a <- colspml.mle(x) b <- colvm.mle(x) x <- NULL
x <- matrix( runif(100 * 10), ncol = 10) a <- colspml.mle(x) b <- colvm.mle(x) x <- NULL
Column-wise uniformity tests for circular data.
colwatsons(u, rads = FALSE)
colwatsons(u, rads = FALSE)
u |
A numeric matrix containing the circular data which are expressed in radians. Each column is a different sample. |
rads |
A boolean variable. If the data are in radians, put this TRUE. If the data are expressed in degrees make this FALSE. |
These tests are used to test the hypothesis that the data come from a circular uniform distribution.
A matrix with two columns, the value of the test statistic and its associated p-value.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Jammalamadaka S. Rao and SenGupta A. (2001). Topics in Circular Statistics, pg. 156–157.
x <- matrix( rvonmises(n = 50 * 10, m = 2, k = 0), ncol = 10 ) res<-colwatsons(x) x <- NULL
x <- matrix( rvonmises(n = 50 * 10, m = 2, k = 0), ncol = 10 ) res<-colwatsons(x) x <- NULL
The contour plot (on the plane) of the spherical ESAG and Kent distributions is produced.
esag.contour(mu, gam, lat, long) kent.contour(k, b)
esag.contour(mu, gam, lat, long) kent.contour(k, b)
k |
The concentration parameter. |
b |
The ovalness parameter. It has to be less than k/2 in order for the distribution to be unimodal. Otherwise it is bimodal. |
mu |
The mean vector the ESAG distribution, a vector in |
gam |
The two gamma parameters of the ESAG distribution. |
lat |
A positive number determing the range of degrees to move left and right from the latitude center. See the example to better understand this argument. |
long |
A positive number determing the range of degrees to move up and down from the longitude center. See the example to better understand this argument. |
The goal of this function is for the user to see how the Kent or the SAG distribution looks like.
A plot containing the contours of the distribution.
Michail Tsagris and Christos Adam.
R implementation and documentation: Michail Tsagris [email protected] and Christos Adam [email protected].
Kent John (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society, Series B, 44(1): 71–80.
Paine P.J., Preston S.P., Tsagris M. and Wood A.T.A. (2018). An Elliptically Symmetric Angular Gaussian Distribution. Statistics and Computing, 28(3):689–697.
vmf.contour, vmf.kerncontour, spher.esag.contour
kent.contour(10, 4) mu <- colMeans( as.matrix( iris[,1:3] ) ) gam <- c(1,0.5) esag.contour(mu, gam, 50, 50) esag.contour(mu, gam, 30, 40)
kent.contour(10, 4) mu <- colMeans( as.matrix( iris[,1:3] ) ) gam <- c(1,0.5) esag.contour(mu, gam, 50, 50) esag.contour(mu, gam, 30, 40)
The contour plot (on the sphere) of a mixture of von Mises-Fisher distributions is produced.
spher.mixvmf.contour(probs, mu, k, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
spher.mixvmf.contour(probs, mu, k, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
probs |
This is avector with the mixing probability of each group. |
mu |
A matrix with the mean direction of each group. |
k |
A vector with the concentration parameter of each group. |
bgcol |
The color of the surface of the sphere. |
dat |
If you have you want to plot supply them here. This has to be a numerical matrix with three columns, i.e. unit vectors. |
col |
If you supplied data then choose the color of the points. If you did not choose a color, the points will appear in red. |
lat |
A positive number determing the range of degrees to move left and right from the latitude center. See the example to better understand this argument. |
long |
A positive number determing the range of degrees to move up and down from the longitude center. See the example to better understand this argument. |
The goal of this function is for the user to see how the mixtures of von Mises-Fisher look like.
A plot containing the contours of the mixture distribution.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Kurt Hornik and Bettina Grun (2014). movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions http://cran.r-project.org/web/packages/movMF/vignettes/movMF.pdf
Mardia K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Sra S. (2012). A short note on parameter approximation for von Mises-Fisher distributions:
and a fast implementation of . Computational Statistics, 27(1): 177–190.
spher.esag.contour, spher.vmf.contour, mixvmf.mle
k <- runif(3, 4, 20) probs <- c(0.2, 0.5, 0.3) mu <- matrix(rnorm(9, 0, 0.5), ncol = 3) mu <- mu / sqrt( rowSums(mu^2) ) ## the lat and long are decreased to 10. Increase them back to 50 to ## see the difference spher.mixvmf.contour(probs, mu, k, lat = 10, long = 10)
k <- runif(3, 4, 20) probs <- c(0.2, 0.5, 0.3) mu <- matrix(rnorm(9, 0, 0.5), ncol = 3) mu <- mu / sqrt( rowSums(mu^2) ) ## the lat and long are decreased to 10. Increase them back to 50 to ## see the difference spher.mixvmf.contour(probs, mu, k, lat = 10, long = 10)
The contour plot (on the sphere) of some spherical rotationally symmetric distributions is produced.
spher.vmf.contour(mu, k, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.purka.contour(theta, a, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.spcauchy.contour(mu, rho, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.pkbd.contour(mu, rho, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
spher.vmf.contour(mu, k, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.purka.contour(theta, a, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.spcauchy.contour(mu, rho, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.pkbd.contour(mu, rho, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
mu |
The mean or the median direction, depending on the distribution, a unit vector. |
theta |
The mean direction (unit vector) of the Purkayastha distribution. |
k |
The concentration parameter ( |
a |
The concentration parameter ( |
rho |
The concentration parameter ( |
bgcol |
The color of the surface of the sphere. |
dat |
If you have you want to plot supply them here. This has to be a numerical matrix with three columns, i.e. unit vectors. |
col |
If you supplied data then choose the color of the points. If you did not choose a color, the points will appear in red. |
lat |
A positive number determing the range of degrees to move left and right from the latitude center. See the example to better understand this argument. |
long |
A positive number determing the range of degrees to move up and down from the longitude center. See the example to better understand this argument. |
The goal of this function is for the user to see how the von Mises-Fisher, the Purkayastha, the spherical Cauchy or the Poisson kernel-based distribution looks like.
A plot containing the contours of the distribution.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
Mardia K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Sra S. (2012). A short note on parameter approximation for von Mises-Fisher distributions:
and a fast implementation of . Computational Statistics, 27(1): 177–190.
Purkayastha S. (1991). A Rotationally Symmetric Directional Distribution: Obtained through Maximum Likelihood Characterization. The Indian Journal of Statistics, Series A, 53(1): 70–83.
Cabrera J. and Watson G. S. (1990). On a spherical median related distribution. Communications in Statistics-Theory and Methods, 19(6): 1973–1986.
Kato S. and McCullagh P. (2020). Some properties of a Cauchy family on the sphere derived from the Mobius transformations. Bernoulli, 26(4): 3224–3248. https://arxiv.org/pdf/1510.07679.pdf
Golzy M. and Markatou M. (2020). Poisson kernel-based clustering on the sphere: convergence properties, identifiability, and a method of sampling. Journal of Computational and Graphical Statistics, 29(4): 758–770.
Sablica L., Hornik K. and Leydold J. (2023). Efficient sampling from the PKBD distribution. Electronic Journal of Statistics, 17(2): 2180–2209.
spher.esag.contour, spher.mixvmf.contour, kent.contour
mu <- colMeans( as.matrix( iris[, 1:3] ) ) mu <- mu / sqrt( sum(mu^2) ) ## the lat and long are decreased to 30. Increase them back to 50 to ## see the difference spher.spcauchy.contour(mu, 0.7, lat = 30, long = 30)
mu <- colMeans( as.matrix( iris[, 1:3] ) ) mu <- mu / sqrt( sum(mu^2) ) ## the lat and long are decreased to 30. Increase them back to 50 to ## see the difference spher.spcauchy.contour(mu, 0.7, lat = 30, long = 30)
The contour plot (on the sphere) of the ESAG and Kent distributions is produced.
spher.esag.contour(mu, gam, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.kent.contour(G, param, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
spher.esag.contour(mu, gam, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50) spher.kent.contour(G, param, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
mu |
The mean vector the ESAG distribution, a vector in |
gam |
The two gamma parameters of the ESAG distribution. |
G |
For the Kent distribution, a 3 x 3 matrix whose first column is the mean direction. The second and third columns are the major and minor axes respectively. |
param |
For the Kent distribution a vector with the concentration |
bgcol |
The color of the surface of the sphere. |
dat |
If you have you want to plot supply them here. This has to be a numerical matrix with three columns, i.e. unit vectors. |
col |
If you supplied data then choose the color of the points. If you did not choose a color, the points will appear in red. |
lat |
A positive number determing the range of degrees to move left and right from the latitude center. See the example to better understand this argument. |
long |
A positive number determing the range of degrees to move up and down from the longitude center. See the example to better understand this argument. |
The goal of this function is for the user to see how the ESAG or the Kent distribution looks like.
A plot containing the contours of the distribution.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Kent John (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society, Series B, 44(1): 71–80.
Paine P.J., Preston S.P., Tsagris M. and Wood A.T.A. (2018). An Elliptically Symmetric Angular Gaussian Distribution. Statistics and Computing, 28(3):689–697.
esag.contour, spher.purka.contour, kent.contour
mu <- colMeans( as.matrix( iris[, 1:3] ) ) gam <- c(1 ,0.5) ## the lat and long are decreased to 30. Increase them back to 50 to ## see the difference spher.esag.contour(mu, gam, lat = 30, long = 30)
mu <- colMeans( as.matrix( iris[, 1:3] ) ) gam <- c(1 ,0.5) ## the lat and long are decreased to 30. Increase them back to 50 to ## see the difference spher.esag.contour(mu, gam, lat = 30, long = 30)
The contour plot (on the sphere) of the SESPC distribution is produced.
spher.sespc.contour(mu, theta, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
spher.sespc.contour(mu, theta, bgcol = "snow", dat = NULL, col = NULL, lat = 50, long = 50)
mu |
The mean vector the SESPC distribution, a vector in |
theta |
The two |
bgcol |
The color of the surface of the sphere. |
dat |
If you have you want to plot supply them here. This has to be a numerical matrix with three columns, i.e. unit vectors. |
col |
If you supplied data then choose the color of the points. If you did not choose a color, the points will appear in red. |
lat |
A positive number determing the range of degrees to move left and right from the latitude center. See the example to better understand this argument. |
long |
A positive number determing the range of degrees to move up and down from the longitude center. See the example to better understand this argument. |
The goal of this function is for the user to see how the SESPC distribution looks like.
A plot containing the contours of the distribution.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
spher.esag.contour, spher.spcauchy.contour
mu <- colMeans( as.matrix( iris[, 1:3] ) ) theta <- c(1 ,0.5) ## the lat and long are decreased to 30. Increase them back to 50 to ## see the difference spher.sespc.contour(mu, theta, lat = 30, long = 30)
mu <- colMeans( as.matrix( iris[, 1:3] ) ) theta <- c(1 ,0.5) ## the lat and long are decreased to 30. Increase them back to 50 to ## see the difference spher.sespc.contour(mu, theta, lat = 30, long = 30)
Contour lines are produced of mixture model for spherical data only.
mixvmf.contour(u, mod)
mixvmf.contour(u, mod)
u |
A two column matrix. The first column is the longitude and the second is the latitude. |
mod |
This is mix.vmf object, actually it is a list. Run a mixture model and save it as mod for example, mod = mix.vmf(x, 3). |
The contour plot is displayed with latitude and longitude in the axes. No Lambert projection is used here. This works for spherical data only which are given as longitude and latitude.
A plot including: The points and the contour lines.
Michail Tsagris and Christos Adam.
R implementation and documentation: Michail Tsagris [email protected] and Christos Adam [email protected].
Kurt Hornik and Bettina Grun (2014). movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions http://cran.r-project.org/web/packages/movMF/vignettes/movMF.pdf
vmf.kerncontour, vmf.contour, mixvmf.mle
k <- runif(2, 4, 20) prob <- c(0.4, 0.6) mu <- matrix( rnorm(6), ncol = 3 ) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, prob, mu, k)$x mod <- mixvmf.mle(x, 2) y <- euclid.inv(x) mixvmf.contour(y, mod)
k <- runif(2, 4, 20) prob <- c(0.4, 0.6) mu <- matrix( rnorm(6), ncol = 3 ) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, prob, mu, k)$x mod <- mixvmf.mle(x, 2) y <- euclid.inv(x) mixvmf.contour(y, mod)
Contour plot of spherical data using a von Mises-Fisher kernel density estimate.
vmf.kerncontour(u, thumb = "none", den.ret = FALSE, full = FALSE, ngrid = 100)
vmf.kerncontour(u, thumb = "none", den.ret = FALSE, full = FALSE, ngrid = 100)
u |
A two column matrix. The first coolumn is the latitude and the second is the longitude. |
thumb |
This is either 'none' (defualt), or 'rot' for the rule of thumb suggested by Garcia-Portugues (2013).
If it is "none" it is estimated via cross validation, with the fast function |
den.ret |
If FALSE (default), plots the contours of the density along with the individual points. If TRUE, will instead return a list with the Longitudes, Latitudes and Densities. Look at the 'value' section for details. |
full |
If FALSE (default), uses the range of positions from 'u' to calculate and optionally plot densities. If TRUE, calculates densities covering the entire sphere. |
ngrid |
Sets the resolution of the density calculation. |
It calculates the contour plot using a von Mises-Fisher kernel for spherical data only.
The contour lines of the data. If "den.ret" was set to TRUE a list including:
lat |
The latitude values. |
long |
The longitude values. |
h |
The optimal bandwidth. |
den |
The kernel density estimate contour points. |
Michail Tsagris, Micah J. Waldstein and Christos Adam.
R implementation and documentation: Michail Tsagris [email protected], Micah J. Waldstein [email protected] and Christos Adam [email protected].
Garcia Portugues, E. (2013). Exact risk improvement of bandwidth selectors for kernel density estimation with directional data. Electronic Journal of Statistics, 7, 1655–1685.
vmf.kde, vmfkde.tune, vmf.contour
x <- rvmf(100, rnorm(3), 15) x <- euclid.inv(x) vmf.kerncontour(x, "rot")
x <- rvmf(100, rnorm(3), 15) x <- euclid.inv(x) vmf.kerncontour(x, "rot")
Contour plots of some rotationally symmetric distributions.
vmf.contour(k) spcauchy.contour(mu, rho, lat = 50, long = 50) purka.contour(theta, a, lat = 50, long = 50) pkbd.contour(mu, rho, lat = 50, long = 50)
vmf.contour(k) spcauchy.contour(mu, rho, lat = 50, long = 50) purka.contour(theta, a, lat = 50, long = 50) pkbd.contour(mu, rho, lat = 50, long = 50)
k |
The concentration parameter. |
mu |
The mean direction (unit vector) of the von Mises-Fisher, the IAG, the spherical Cauchy distribution, or the Poisson kernel-based distribution. |
rho |
The |
theta |
The median direction for the Purkayastha distribution, a unit vector. |
a |
The concentration parameter of the Purkayastha distribution. |
lat |
A positive number determing the range of degrees to move left and right from the latitude center. See the example to better understand this argument. |
long |
A positive number determing the range of degrees to move up and down from the longitude center. See the example to better understand this argument. |
The user specifies the concentration parameter only and not the mean direction or data. This is for illustration purposes only. The graph of the von Mises-Fisher distribution will always contain circles, as this distribution is the analogue of a bivariate normal in two dimensions with a zero covariance.
A contour plot of the distribution.
Michail Tsagris and Christos Adam.
R implementation and documentation: Michail Tsagris [email protected] and Christos Adam [email protected].
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Kato S. and McCullagh P. (2020). Some properties of a Cauchy family on the sphere derived from the Mobius transformations. Bernoulli, 26(4): 3224–3248. https://arxiv.org/pdf/1510.07679.pdf
Purkayastha S. (1991). A Rotationally Symmetric Directional Distribution: Obtained through Maximum Likelihood Characterization. The Indian Journal of Statistics, Series A, 53(1): 70–83
Cabrera J. and Watson G. S. (1990). On a spherical median related distribution. Communications in Statistics-Theory and Methods, 19(6): 1973–1986.
Golzy M. and Markatou M. (2020). Poisson kernel-based clustering on the sphere: convergence properties, identifiability, and a method of sampling. Journal of Computational and Graphical Statistics, 29(4): 758–770.
Sablica L., Hornik K. and Leydold J. (2023). Efficient sampling from the PKBD distribution. Electronic Journal of Statistics, 17(2): 2180–2209.
rvmf, vmf.mle, vmf.kerncontour, kent.contour, sphereplot
vmf.contour(5) mu <- colMeans( as.matrix( iris[,1:3] ) ) mu <- mu / sqrt( sum(mu^2) ) spcauchy.contour(mu, 0.7, 30, 30) spcauchy.contour(mu, 0.7, 60, 60)
vmf.contour(5) mu <- colMeans( as.matrix( iris[,1:3] ) ) mu <- mu / sqrt( sum(mu^2) ) spcauchy.contour(mu, 0.7, 30, 30) spcauchy.contour(mu, 0.7, 60, 60)
Conversion of cosines to azimuth and plunge.
cosap(x,y,z)
cosap(x,y,z)
x |
x component of cosine. |
y |
y component of cosine. |
z |
z component of cosine. |
Orientation: x>0 is 'eastward', y>0 is 'southward', and z>0 is 'downward'.
A list including:
A |
The azimuth |
P |
The plunge |
Eli Amson.
R implementation and documentation: Eli Amson <[email protected]>.
Amson E, Arnold P, Van Heteren AH, Cannoville A, Nyakatura JA. Trabecular architecture in the forelimb epiphyses of extant xenarthrans (Mammalia). Frontiers in Zoology.
cosap(-0.505, 0.510, -0.696)
cosap(-0.505, 0.510, -0.696)
It returns an unsigned unite quaternion in (the four-dimensional sphere) from a
rotation matrix on SO(3).
rot2quat(X)
rot2quat(X)
X |
A rotation matrix in SO(3). |
Firstly construct a system of linear equations by equating the corresponding components of the theoretical rotation matrix proposed by Prentice (1986), and given a rotation matrix. Finally, the system of linear equations are solved by following the tricks mentioned in second reference here in order to achieve numerical accuracy to get quaternion values.
A unsigned unite quaternion.
Anamul Sajib.
R implementation and documentation: Anamul Sajib <[email protected]>.
Prentice,M. J. (1986). Orientation statistics without parametric assumptions.Journal of the Royal Statistical Society. Series B: Methodological 48(2). //http://www.euclideanspace.com/maths/geometry/rotations/conversions/matrixToQuaternion/index.htm
quat2rot, rotation, Arotation \ link{rot.matrix}
x <- rnorm(4) x <- x/sqrt( sum(x^2) ) ## an unit quaternion in R4 ## R <- quat2rot(x) R x rot2quat(R) ## sign is not exact as you can see
x <- rnorm(4) x <- x/sqrt( sum(x^2) ) ## an unit quaternion in R4 ## R <- quat2rot(x) R x rot2quat(R) ## sign is not exact as you can see
It forms a (3 x 3) rotation matrix on SO(3) from an unsigned unite quaternion in (the four-dimensional sphere).
quat2rot(x)
quat2rot(x)
x |
An unsigned unit quaternion in |
Given an unsigned unit quaternion in it forms a rotation matrix on SO(3), according to the transformation proposed by Prentice (1986).
A rotation matrix.
Anamul Sajib.
R implementation and documentation: Anamul Sajib <[email protected]>.
Prentice,M. J. (1986). Orientation statistics without parametric assumptions.Journal of the Royal Statistical Society. Series B: Methodological 48(2).
rot2quat, rotation, Arotation rot.matrix
x <- rnorm(4) x <- x/sqrt( sum(x^2) ) x ## an unit quaternion in R4 ## quat2rot(x)
x <- rnorm(4) x <- x/sqrt( sum(x^2) ) x ## an unit quaternion in R4 ## quat2rot(x)
Cross validation for estimating the classification rate.
dirda.cv(x, ina, folds = NULL, nfolds = 10, stratified = FALSE, type = c("vmf", "iag", "esag", "kent", "sc", "pkbd", "purka"), seed = NULL, B = 1000)
dirda.cv(x, ina, folds = NULL, nfolds = 10, stratified = FALSE, type = c("vmf", "iag", "esag", "kent", "sc", "pkbd", "purka"), seed = NULL, B = 1000)
x |
A matrix with the data in Eulcidean coordinates, i.e. unit vectors. The matrix must have three columns, only spherical data are currently supported. |
ina |
A variable indicating the groupings. |
folds |
Do you already have a list with the folds? If not, leave this NULL. |
nfolds |
How many folds to create? |
stratified |
Should the folds be created in a stratified way? i.e. keeping the distribution of the groups similar through all folds? |
seed |
If seed is TRUE, the results will always be the same. |
type |
The type of classifier to use. The avaliable options are "vmf" (von Mises-Fisher distribution), "iag" (IAG distribution), "esag" (ESAG distribution), "kent" (Kent distribution), "sc" and "sc2" (spherical Cauchy distribution), "pkbd" and "pkbd2" (Poisson kernel-based distribution), and "purka" (Purkayastha distribution). The difference between "sc" and "sc2" and between "pkbd" and "pkbd2" is that the first uses the Newton-Raphson algorithm and it is faster, whereas the second uses a hybrid algorithm that does not require the Hessian matrix, but in large dimensions the second will be faster. You can chose any of them or all of them. Note that "kent" works only with spherical data. |
B |
If you used k-NN, should a bootstrap correction of the bias be applied? If yes, 1000 is a good value. |
Cross-validation for the estimation of the performance of a classifier.
The estimated performance of the best classifier is overestimated. After the cross-valdiation
procedure, the predicted values produced by all classifiers are colelcted, from all folds,
in an matrix, where
is the number of samples and
is the number of all
classifiers used. We sample rows (predictions) with replacement from P and denote them as
the in-sample values. The non re-sampled rows are denoted as out-of-sample values. The
performance of each classifier in the insample rows is calculated and the classifier with
the optimal performance is selected, followed by the calculation of performance in the
out-of-sample values. This process is repeated B times and the average performance is returned.
The only computational overhead is with the repetitive resampling and calculation of the performance,
i.e. no model or classifier is fitted nor trained. For more information see Tsamardinos et al. (2018).
The good thing with the function is that you can run any method you want by supplying the
folds yourselves using the command makefolds
. Then suppose you want to run another method. By suppying the same folds you will be able to have comparative results for all methods.
A list including:
perf |
A vector with the estimated performance of each classifier. |
bbc.perf |
The bootstrap bias corrected performance. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292
Tsagris M. and Alenazi A. (2019). Comparison of discriminant analysis methods on the sphere. Communications in Statistics: Case Studies, Data Analysis and Applications, 5(4), 467–491.
Mardia K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Morris J. E. and Laycock P. J. (1974). Discriminant analysis of directional data. Biometrika, 61(2): 335–341.
Tsamardinos I., Greasidou E. and Borboudakis G. (2018). Machince Learning, 107(12): 1895–1922.
x <- rvmf(300, rnorm(3), 10) ina <- sample.int(2, 300, replace = TRUE) dirda.cv(x, ina, B = 1)
x <- rvmf(300, rnorm(3), 10) ina <- sample.int(2, 300, replace = TRUE) dirda.cv(x, ina, B = 1)
Cumulative probability distribution of circular distributions.
pvm(u, m, k, rads = FALSE) pspml(u, mu, rads = FALSE) pwrapcauchy(u, m, rho, rads = FALSE) pcircpurka(u, m, a, rads = FALSE) pcircbeta(u, m, a, b, rads = FALSE) pcardio(u, m, rho, rads = FALSE) pcircexp(u, lambda, rads = FALSE) pcipc(u, omega, g, rads = FALSE) pgcpc(u, omega, g, rho, rads = FALSE) pmmvm(u, m, k, N, rads = FALSE)
pvm(u, m, k, rads = FALSE) pspml(u, mu, rads = FALSE) pwrapcauchy(u, m, rho, rads = FALSE) pcircpurka(u, m, a, rads = FALSE) pcircbeta(u, m, a, b, rads = FALSE) pcardio(u, m, rho, rads = FALSE) pcircexp(u, lambda, rads = FALSE) pcipc(u, omega, g, rads = FALSE) pgcpc(u, omega, g, rho, rads = FALSE) pmmvm(u, m, k, N, rads = FALSE)
u |
A numerical value, either in radians or in degrees. |
m |
The mean direction of the von Mises and the multi-modal von Mises distribution in radians or in degrees. |
mu |
The mean vector, a vector with two values for the "pspml". |
omega |
The location parameter of the CIPC and GCPC distributions. |
g |
The norm of the mean vector for the CIPC and GCPC distributions. |
k |
The concentration parameter, |
lambda |
The |
a |
The |
b |
The |
rho |
The |
N |
The number of modes to consider in the multi-modal von Mises distribution. |
rads |
If the data are in radians, this should be TRUE and FALSE otherwise. |
This value calculates the probability of u being less than some value .
The probability that of u being less than , where u follows a circular distribution.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Arthur Pewsey, Markus Neuhauser, and Graeme D. Ruxton (2013). Circular Statistics in R.
Barnett M. J. and Kingston R. L. (2024). A note on the Hendrickson-Lattman phase probability distribution and its equivalence to the generalized von Mises distribution. Journal of Applied Crystallography, 57(2).
Jammalamadaka S. R. and Kozubowski T. J. (2003). A new family of circular models: The wrapped Laplace distributions. Advances and Applications in Statistics, 3(1): 77–103.
Purkayastha S. (1991). A Rotationally Symmetric Directional Distribution: Obtained through Maximum Likelihood Characterization. The Indian Journal of Statistics, Series A, 53(1): 70–83
Cabrera J. and Watson G. S. (1990). On a spherical median related distribution. Communications in Statistics–Theory and Methods, 19(6): 1973–1986.
Paula F. V., Nascimento A. D., Amaral G. J. and Cordeiro G. M. (2021). Generalized Cardioid distributions for circular data analysis. Stats, 4(3): 634–649.
Zheng Sun (2009). Comparing measures of fit for circular distributions. MSc Thesis, University of Victoria. file:///C:/Users/mtsag/Downloads/zhengsun_master_thesis.pdf
group.gof, dvm, dcircexp,
purka.mle, dcircpurka, dmmvm
pvm(1, 2, 10, rads = TRUE) pmmvm(1, 2, 10, 3, rads = TRUE) pcircexp(c(1, 2), 2, rads = TRUE) pcircpurka(2, 3, 0.3)
pvm(1, 2, 10, rads = TRUE) pmmvm(1, 2, 10, 3, rads = TRUE) pcircexp(c(1, 2), 2, rads = TRUE) pcircpurka(2, 3, 0.3)
Density of a mixture of rotationally symmetric distributions.
dmixvmf(y, probs, mu, k, logden = FALSE) dmixspcauchy(y, probs, mu, rho, logden = FALSE) dmixpkbd(y, probs, mu, rho, logden = FALSE)
dmixvmf(y, probs, mu, k, logden = FALSE) dmixspcauchy(y, probs, mu, rho, logden = FALSE) dmixpkbd(y, probs, mu, rho, logden = FALSE)
y |
A matrix with unit vectors. |
probs |
This is avector with the mixing probability of each group. |
mu |
A matrix with the mean direction of each group. |
k |
A vector with the concentration parameter of each group. |
rho |
A vector with the concentration parameter of each group. |
logden |
If you the logarithm of the density values set this to TRUE. |
The function computes the density for a given mixture of von Mises-Fisher, spherical Cauchy or Poisson kernel-based distributions.
A vector with the (log) density values of y.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Kurt Hornik and Bettina Grun (2014). movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions http://cran.r-project.org/web/packages/movMF/vignettes/movMF.pdf
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
k <- runif(3, 4, 6) probs <- c(0.2, 0.5, 0.3) mu <- matrix(rnorm(9), ncol = 3) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, probs, mu, k)$x b <- dmixvmf(x, probs, mu, k)
k <- runif(3, 4, 6) probs <- c(0.2, 0.5, 0.3) mu <- matrix(rnorm(9), ncol = 3) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, probs, mu, k)$x b <- dmixvmf(x, probs, mu, k)
Density of some (hyper-)spherical distributions.
dvmf(y, mu, k, logden = FALSE ) iagd(y, mu, logden = FALSE) dpurka(y, theta, a, logden = FALSE) dspcauchy(y, mu, rho, logden = FALSE) dpkbd(y, mu, rho, logden = FALSE)
dvmf(y, mu, k, logden = FALSE ) iagd(y, mu, logden = FALSE) dpurka(y, theta, a, logden = FALSE) dspcauchy(y, mu, rho, logden = FALSE) dpkbd(y, mu, rho, logden = FALSE)
y |
A matrix or a vector with the data expressed in Euclidean coordinates, i.e. unit vectors. |
mu |
The mean direction (unit vector) of the von Mises-Fisher, the IAG, the spherical Cauchy distribution, or of the Poisson kernel-based distribution. |
theta |
The mean direction (unit vector) of the Purkayastha distribution. |
k |
The concentration parameter of the von Mises-Fisher distribution. |
a |
The concentration parameter of the Purkayastha distribution. |
rho |
The |
logden |
If you the logarithm of the density values set this to TRUE. |
The density of the von Mises-Fisher, of the IAG, of the Purkayastha, of the spherical Cauchy distribution, or of the Poisson kernel-based distribution is computed.
A vector with the (log) density values of y.
Michail Tsagris and Zehao Yu.
R implementation and documentation: Michail Tsagris [email protected] and Zehao Yu [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Purkayastha S. (1991). A Rotationally Symmetric Directional Distribution: Obtained through Maximum Likelihood Characterization. The Indian Journal of Statistics, Series A, 53(1): 70–83
Cabrera J. and Watson G. S. (1990). On a spherical median related distribution. Communications in Statistics-Theory and Methods, 19(6): 1973–1986.
Kato S. and McCullagh P. (2020). Some properties of a Cauchy family on the sphere derived from the Mobius transformations. Bernoulli, 26(4): 3224–3248. https://arxiv.org/pdf/1510.07679.pdf
Golzy M. and Markatou M. (2020). Poisson kernel-based clustering on the sphere: convergence properties, identifiability, and a method of sampling. Journal of Computational and Graphical Statistics, 29(4): 758–770.
Sablica L., Hornik K. and Leydold J. (2023). Efficient sampling from the PKBD distribution. Electronic Journal of Statistics, 17(2): 2180–2209.
Zehao Yu and Xianzheng Huang (2024). A new parameterization for elliptically symmetric angular Gaussian distributions of arbitrary dimension. Electronic Journal of Statististics, 18(1): 301–334.
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rvmf(1000, m = m, k = 10) dvmf(y, k=10, m)
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rvmf(1000, m = m, k = 10) dvmf(y, k=10, m)
Density of some circular distributions.
dvm(x, m, k, rads = FALSE, logden = FALSE) dspml(x, mu, rads = FALSE, logden = FALSE) dwrapcauchy(x, m, rho, rads = FALSE, logden = FALSE) dcircpurka(x, m, a, rads = FALSE, logden = FALSE) dggvm(x, param, rads = FALSE, logden = FALSE) dcircbeta(x, m, a, b, rads = FALSE, logden = FALSE) dcardio(x, m, rho, rads = FALSE, logden = FALSE) dcircexp(x, lambda, rads = FALSE, logden = FALSE) dcipc(x, omega, g, rads = FALSE, logden = FALSE) dgcpc(x, omega, g, rho, rads = FALSE, logden = FALSE) dmmvm(x, m, k, N, rads = FALSE, logden = FALSE)
dvm(x, m, k, rads = FALSE, logden = FALSE) dspml(x, mu, rads = FALSE, logden = FALSE) dwrapcauchy(x, m, rho, rads = FALSE, logden = FALSE) dcircpurka(x, m, a, rads = FALSE, logden = FALSE) dggvm(x, param, rads = FALSE, logden = FALSE) dcircbeta(x, m, a, b, rads = FALSE, logden = FALSE) dcardio(x, m, rho, rads = FALSE, logden = FALSE) dcircexp(x, lambda, rads = FALSE, logden = FALSE) dcipc(x, omega, g, rads = FALSE, logden = FALSE) dgcpc(x, omega, g, rho, rads = FALSE, logden = FALSE) dmmvm(x, m, k, N, rads = FALSE, logden = FALSE)
x |
A vector with circular data. |
m |
The mean value of the von Mises distribution and of the cardioid, a scalar. This is the median for the circular Purkayastha distribution. |
mu |
The mean vector, a vector with two values for the "spml" and the GCPC. |
omega |
The location parameter of the CIPC and GCPC distributions. |
g |
The norm of the mean vector for the CIPC and GCPC distributions. |
k |
The concentration parameter. |
rho |
For the wrapped Cauchy and Cardioid distributions, this is the |
a |
The |
b |
The |
lambda |
The |
param |
The vector of parameters of the GGVM distribution as returned by the function |
N |
The number of modes to consider in the multi-modal von Mises distribution. |
rads |
If the data are in rads, then this should be TRUE, otherwise FALSE. |
logden |
If you the logarithm of the density values set this to TRUE. |
The density of the von Mises, bivariate projected normal, cardio, circular exponential, wrapped Cauchy, circular Purkayastha, CIPC or GCPC distributions is computed.
A vector with the (log) density values of x.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. https://arxiv.org/pdf/2302.02468.pdf
Presnell B., Morrison S. P. and Littell R. C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
Jammalamadaka S. R. and Kozubowski T. J. (2003). A new family of circular models: The wrapped Laplace distributions. Advances and Applications in Statistics, 3(1): 77–103.
Barnett M. J. and Kingston R. L. (2024). A note on the Hendrickson-Lattman phase probability distribution and its equivalence to the generalized von Mises distribution. Journal of Applied Crystallography, 57(2).
Paula F. V., Nascimento A. D., Amaral G. J. and Cordeiro G. M. (2021). Generalized Cardioid distributions for circular data analysis. Stats, 4(3): 634–649.
Zheng Sun (2009). Comparing measures of fit for circular distributions. MSc Thesis, University of Victoria. file:///C:/Users/mtsag/Downloads/zhengsun_master_thesis.pdf
x <- rvonmises(500, m = 2.5, k = 10, rads = TRUE) mod <- circ.summary(x, rads = TRUE, plot = FALSE) den <- dvm(x, mod$mesos, mod$kappa, rads = TRUE, logden = TRUE ) mod$loglik sum(den)
x <- rvonmises(500, m = 2.5, k = 10, rads = TRUE) mod <- circ.summary(x, rads = TRUE, plot = FALSE) den <- dvm(x, mod$mesos, mod$kappa, rads = TRUE, logden = TRUE ) mod$loglik sum(den)
Density of the SESPC distribution.
dsespc(y, mu, theta, logden = FALSE)
dsespc(y, mu, theta, logden = FALSE)
y |
A matrix or a vector with the data expressed in Euclidean coordinates, i.e. unit vectors. |
mu |
The mean vector the SESPC distribution, a vector in |
theta |
The two |
logden |
If you the logarithm of the density values set this to TRUE. |
The density of the SESPC distribution is computed.
A vector with the (log) density values of y.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rsespc(1000, m, c(1, 1)) mod <- sespc.mle(y) dsespc( y, mod$mu, mod$theta)
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rsespc(1000, m, c(1, 1)) mod <- sespc.mle(y) dsespc( y, mod$mu, mod$theta)
Density of the spherical ESAG and Kent distributions.
desag(y, mu, gam, logden = FALSE) dkent(y, G, param, logden = FALSE) dESAGd(y, mu, gam, logden = FALSE)
desag(y, mu, gam, logden = FALSE) dkent(y, G, param, logden = FALSE) dESAGd(y, mu, gam, logden = FALSE)
y |
A matrix or a vector with the data expressed in Euclidean coordinates, i.e. unit vectors. For the dESAGd it can have any dimension. |
mu |
The mean vector the ESAG distribution. |
gam |
The two |
G |
For the Kent distribution only, a 3 x 3 matrix whose first column is the mean direction. The second and third columns are the major and minor axes respectively. |
param |
For the Kent distribution a vector with the concentration |
logden |
If you the logarithm of the density values set this to TRUE. |
The density of the spherical ESAG or Kent distribution, or of the ESAG distribution in arbitrary dimensions is computed.
A vector with the (log) density values of y.
Michail Tsagris and Zehao Yu.
R implementation and documentation: Michail Tsagris [email protected] and Zehao Yu [email protected].
Zehao Yu and Xianzheng Huang (2024). A new parameterization for elliptically symmetric angular Gaussian distributions of arbitrary dimension. Electronic Journal of Statististics, 18(1): 301–334.
Paine P.J., Preston S.P., Tsagris M. and Wood A.T.A. (2018). An Elliptically Symmetric Angular Gaussian Distribution. Statistics and Computing, 28(3):689–697.
Kent John (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society, Series B, 44(1): 71–80.
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
m <- colMeans( as.matrix( iris[, 1:3] ) ) y <- rkent(1000, k = 10, m = m, b = 4) mod <- kent.mle(y) dkent( y, G = mod$G, param = mod$param )
m <- colMeans( as.matrix( iris[, 1:3] ) ) y <- rkent(1000, k = 10, m = m, b = 4) mod <- kent.mle(y) dkent( y, G = mod$G, param = mod$param )
Density of the Wood bimodal distribution on the sphere.
dwood(y, param, logden = FALSE)
dwood(y, param, logden = FALSE)
y |
A matrix containing two columns. The first one is the latitude and the second is the longitude, both expressed in degrees. |
param |
A vector with the 5 parameters, in the order they are returned by the |
logden |
If you the logarithm of the density values set this to TRUE. |
The density of the spherical Wood distribution is computed.
A vector with the (log) density values of y.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Wood A.T.A. (1982). A bimodal distribution on the sphere. Journal of the Royal Statistical Society, Series C, 31(1): 52–58.
x <- rvmf(100, rnorm(3), 15) x <- euclid.inv(x) mod <- wood.mle(x) d <- dwood(x, mod$info[, 1])
x <- rvmf(100, rnorm(3), 15) x <- euclid.inv(x) mod <- wood.mle(x) d <- dwood(x, mod$info[, 1])
It transforms the data from the spherical coordinates to Euclidean coordinates.
euclid(u)
euclid(u)
u |
A two column matrix or even one single vector, where the first column (or element) is the latitude and the second is the longitude. The order is important. |
It takes the matrix of unit vectors of latitude and longitude and transforms it to unit vectors.
A three column matrix:
U |
The Euclidean coordinates of the latitude and longitude. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
euclid.inv, Arotation, lambert
x <- rvmf(10, rnorm(3), 10) u <- euclid.inv(x) euclid(u) x
x <- rvmf(10, rnorm(3), 10) u <- euclid.inv(x) euclid(u) x
It calculates three euler angles from a
rotation matrix X, where X is defined as
. Here
means a rotation of
radians about the x axis.
rot2eul(X)
rot2eul(X)
X |
A rotation matrix which is defined as a product of three elementary rotations mentioned above.
Here |
Given a rotation matrix X, euler angles are computed by equating each element in X with the corresponding element in the matrix product defined above. This results in nine equations that can be used to find the euler angles.
For a given rotation matrix, there are two eqivalent sets of euler angles.
Anamul Sajib <[email protected]>.
R implementation and documentation: Anamul Sajib <[email protected]>.
Green, P. J. and Mardia, K. V. (2006). Bayesian alignment using hierarchical models, with applications in proteins bioinformatics. Biometrika, 93(2):235–254.
http://www.staff.city.ac.uk/~sbbh653/publications/euler.pdf
# three euler angles theta.12 <- sample( seq(-3, 3, 0.3), 1 ) theta.23 <- sample( seq(-3, 3, 0.3), 1 ) theta.13 <- sample( seq(-1.4, 1.4, 0.3), 1 ) theta.12 ; theta.23 ; theta.13 X <- eul2rot(theta.12, theta.23, theta.13) X ## A rotation matrix e <- rot2eul(X)$v1 theta.12 <- e[3] theta.23 <- e[2] theta.13 <- e[1] theta.12 ; theta.23 ; theta.13
# three euler angles theta.12 <- sample( seq(-3, 3, 0.3), 1 ) theta.23 <- sample( seq(-3, 3, 0.3), 1 ) theta.13 <- sample( seq(-1.4, 1.4, 0.3), 1 ) theta.12 ; theta.23 ; theta.13 X <- eul2rot(theta.12, theta.23, theta.13) X ## A rotation matrix e <- rot2eul(X)$v1 theta.12 <- e[3] theta.23 <- e[2] theta.13 <- e[1] theta.12 ; theta.23 ; theta.13
Forward Backward Early Dropping selection for circular data using the SPML regression.
spml.fbed(y, x, alpha = 0.05, K = 0, backward = FALSE, parallel = FALSE, tol = 1e-07, maxiters = 100)
spml.fbed(y, x, alpha = 0.05, K = 0, backward = FALSE, parallel = FALSE, tol = 1e-07, maxiters = 100)
y |
The response variable, a numeric vector expressed in rads. |
x |
A matrix with continuous independent variables. |
alpha |
The significance threshold value for assessing p-values. Default value is 0.05. |
K |
How many times should the process be repeated? The default value is 0. |
backward |
After the Forward Early Dropping phase, the algorithm proceeds witha the usual Backward Selection phase. The default value is set to TRUE. It is advised to perform this step as maybe some variables are false positives, they were wrongly selected. This is rather experimental now and there could be some mistakes in the indices of the selected variables. Do not use it for now. |
parallel |
If you want the algorithm to run in parallel set this TRUE. |
tol |
The tolerance value to terminate the Newton-Raphson algorithm. |
maxiters |
The maximum number of iterations Newton-Raphson will perform. |
The algorithm is a variation of the usual forward selection. At every step, the most significant variable enters the selected variables set. In addition, only the significant variables stay and are further examined. The non signifcant ones are dropped. This goes until no variable can enter the set. The user has the option to re-do this step 1 or more times (the argument K). In the end, a backward selection is performed to remove falsely selected variables. Note that you may have specified, for example, K=10, but the maximum value FBED used can be 4 for example.
If K is a single number a list including: Note, that the "gam" argument must be the same though.
res |
A matrix with the selected variables and their test statistic. |
info |
A matrix with the number of variables and the number of tests performed (or models fitted) at each round (value of K). This refers to the forward phase only. |
runtime |
The runtime required. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Borboudakis G. and Tsamardinos I. (2019). Forward-backward selection with early dropping. Journal of Machine Learning Research, 20(8): 1–39.
Tsagis M. (2018). Guide on performing feature selection with the R package MXM. https://f1000research.com/articles/7-1505
Presnell Brett, Morrison Scott P. and Littell Ramon C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
x <- matrix( runif(100 * 50, 1, 100), ncol = 50 ) y <- runif(100) a <- spml.fbed(y, x)
x <- matrix( runif(100 * 50, 1, 100), ncol = 50 ) y <- runif(100) a <- spml.fbed(y, x)
Random folds for use in a cross validation are generated. There is the option for stratified splitting as well.
makefolds(ina, nfolds = 10, stratified = TRUE, seed = NULL)
makefolds(ina, nfolds = 10, stratified = TRUE, seed = NULL)
ina |
A variable indicating the groupings. |
nfolds |
The number of folds to produce. |
stratified |
A boolean variable specifying whether stratified random (TRUE) or simple random (FALSE) sampling is to be used when producing the folds. |
seed |
You can specify your own seed number here or leave it NULL. |
I was inspired by the command in the package TunePareto in order to do the stratified version.
A list with nfolds elements where each elements is a fold containing the indices of the data.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
a <- makefolds(iris[, 5], nfolds = 5, stratified = TRUE) table(iris[a[[1]], 5]) ## 10 values from each group
a <- makefolds(iris[, 5], nfolds = 5, stratified = TRUE) table(iris[a[[1]], 5]) ## 10 values from each group
Generation of unit vector(s) with a given angle from a given unit vector.
vec(x, n = 1, deg = 90)
vec(x, n = 1, deg = 90)
x |
A unit vector. If it is not a unit vector it becomes one. |
n |
The number of unit vectors to return. |
deg |
The angle between the given vector and the n vectors to be returned. This must be in degrees and it has to be between 0 and 180 degrees. If the angle is 0, the same unit vector will be returned. If the angle is 180, the same unit vector with the signs changed will be returned. |
The user provides a unit vector and the degrees. The function will return n unit vectors whose angle with the given unit vector equals the degrees given. For example, if you want 10 unit vectors purpendicualr to the x put vec(x, 10, 90).
A list including:
runtime |
The runtime of the procedure. |
crit |
The calculated angle between the given unit vector and each of the generated unit vectors. |
mat |
A matrix with the n unit vectors. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
x <- rnorm(10) x <- x / sqrt( sum(x^2) ) a <- vec(x, 20, 90)
x <- rnorm(10) x <- x / sqrt( sum(x^2) ) a <- vec(x, 20, 90)
Goodness of fit test for grouped data.
group.gof(g, ni, m, k, dist = "vm", rads = FALSE, R = 999, ncores = 1)
group.gof(g, ni, m, k, dist = "vm", rads = FALSE, R = 999, ncores = 1)
g |
A vector with the group points, either in radians or in degrees. |
ni |
The frequency of each or group class. |
m |
The mean direction in radians or in degrees. |
k |
The concentration parameter, |
dist |
The distribution to be tested, it can be either "vm" or "uniform". |
rads |
If the data are in radians, this should be TRUE and FALSE otherwise. |
R |
The number of bootstrap simulations to perform, set to 999 by default. |
ncores |
The number of cores to use. |
When you have grouped data, you can test whether the data come from the von Mises-Fisher distribution or from a uniform distribution.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
Since this is a bootstrap based test, there are no degrees of freedom, hence this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Arthur Pewsey, Markus Neuhauser, and Graeme D. Ruxton (2013). Circular Statistics in R.
x <- rvonmises(100, 2, 10) g <- seq(min(x) - 0.1, max(x) + 0.1, length = 6) ni <- as.vector( table( cut(x, g) ) ) group.gof(g, ni, 2, 10, dist = "vm", rads = TRUE, R = 299, ncores = 1) group.gof(g, ni, 2, 5, dist = "vm", rads = TRUE, R = 299, ncores = 1)
x <- rvonmises(100, 2, 10) g <- seq(min(x) - 0.1, max(x) + 0.1, length = 6) ni <- as.vector( table( cut(x, g) ) ) group.gof(g, ni, 2, 10, dist = "vm", rads = TRUE, R = 299, ncores = 1) group.gof(g, ni, 2, 5, dist = "vm", rads = TRUE, R = 299, ncores = 1)
It generates random rotations in three-dimensional space that follow a probability distribution, matrix Fisher distribution, arising in fitting and matching problem.
habeck.rot(F)
habeck.rot(F)
F |
An arbitrary 3 x 3 matrix represents the parameter matrix of this distribution. |
Firstly rotation matrices X are chosen which are the closest to F, and then parameterized using euler angles. Then a Gibbs sampling algorithm is implemented to generate rotation matrices from the resulting disribution of the euler angles.
A simulated rotation matrix.
Anamul Sajib.
R implementation and documentation: Anamul Sajib <[email protected]>.
Habeck M (2009). Generation of three-dimensional random rotations in fitting and matching problems. Computational Statistics, 24, 719–731.
F <- 10^(-1) * matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3 ) ## Arbitrary F matrix X <- habeck.rot(F) det(X)
F <- 10^(-1) * matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3 ) ## Arbitrary F matrix X <- habeck.rot(F) det(X)
Haversine distance matrix.
haversine.dist(x)
haversine.dist(x)
x |
A a matrix of two columns. The first column is the latitude and the second the longitude. |
The function computes the haversine distance between all observations.
A matrix with the haversine distances between all observations.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
https://en.wikipedia.org/wiki/Haversine_formula
x <- rvmf(10, rnorm(3), 10) x <- euclid.inv(x) haversine.dist(x)
x <- rvmf(10, rnorm(3), 10) x <- euclid.inv(x) haversine.dist(x)
Regression when both the dependent and independent variables are directional data-.
hspher.reg(y, x, xnew = NULL)
hspher.reg(y, x, xnew = NULL)
y |
The dependent variable; a matrix with either two columns, latitude and longitude, either in radians or in degrees. Alternatively it is a matrix with three columns, unit vectors. |
x |
The dependent variable; a matrix with either two columns, latitude and longitude, either in radians or in degrees. Alternatively it is a matrix with three columns, unit vectors. The two matrices must agree in the scale and dimensions. |
xnew |
The new values of some directional independent variable(s) whose directional response values you want to predict. If you have no new x values, leave it NULL (default). |
Spherical regression as proposed by Chang (1986) is implemented. If the estimated rotation matrix has a determinant equal to -1, singular value decomposition is performed and the last unit vector is multiplied by -1.
A list including:
A |
The estimated rotation matrix. |
est |
The fitted values in unit vectors, if the argument xnew is not NULL. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Ted Chang (1986). Spherical Regression. Annals of Statistics, 14(3): 907–924.
spher.cor, spml.reg, spher.reg, sphereplot
mx <- rnorm(5) mx <- mx/sqrt( sum(mx^2) ) my <- rnorm(5) my <- my/sqrt( sum(my^2) ) x <- rvmf(100, mx, 15) A <- rotation(mx, my) y <- x %*% t(A) mod <- hspher.reg(y, x) A mod$A ## exact match, no noise y <- x %*% t(A) y <- y + rvmf(100, colMeans(y), 40) mod <- hspher.reg(y, x) A mod$A ## noise added, more relistic example
mx <- rnorm(5) mx <- mx/sqrt( sum(mx^2) ) my <- rnorm(5) my <- my/sqrt( sum(my^2) ) x <- rvmf(100, mx, 15) A <- rotation(mx, my) y <- x %*% t(A) mod <- hspher.reg(y, x) A mod$A ## exact match, no noise y <- x %*% t(A) y <- y + rvmf(100, colMeans(y), 40) mod <- hspher.reg(y, x) A mod$A ## noise added, more relistic example
The null hypothesis is whether an IAG distribution fits the data well, where the altenrative is that ESAG distribution is more suitable.
iagesag(x, B = 1, tol = 1e-07)
iagesag(x, B = 1, tol = 1e-07)
x |
A numeric matrix with three columns containing the data as unit vectors in Euclidean coordinates. |
B |
The number of bootstrap re-samples. By default is set to 999. If it is equal to 1, no bootstrap is performed and the p-value is obtained throught the asymptotic distribution. |
tol |
The tolerance to accept that the Newton-Raphson algorithm used in the IAG distribution has converged. |
Essentially it is a test of rotational symmetry, whether the two parameters are equal to zero.
This works for spherical data only.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. If bootstrap was employed this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Paine P.J., Preston S.P., Tsagris M. and Wood A.T.A. (2018). An Elliptically Symmetric Angular Gaussian Distribution. Statistics and Computing, 28(3):689–697.
fishkent, iagesag, pc.test, esag.mle, kent.mle,
x <- rvmf(100, rnorm(3), 15) iagesag(x) fishkent(x, B = 1)
x <- rvmf(100, rnorm(3), 15) iagesag(x) fishkent(x, B = 1)
The null hypothesis is whether an SIPC distribution fits the data well, where the altenrative is that SESPC distribution is more suitable.
pc.test(x, B = 1, tol = 1e-06)
pc.test(x, B = 1, tol = 1e-06)
x |
A numeric matrix with three columns containing the data as unit vectors in Euclidean coordinates. |
B |
The number of bootstrap re-samples. By default is set to 999. If it is equal to 1, no bootstrap is performed and the p-value is obtained throught the asymptotic distribution. |
tol |
The tolerance to accept that the Newton-Raphson algorithm used in the IAG distribution has converged. |
Essentially it is a test of rotational symmetry, whether the two parameters are equal to zero. This works for spherical data only.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. If bootstrap was employed this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. https://arxiv.org/pdf/2302.02468.pdf
x <- rvmf(100, rnorm(3), 15) iagesag(x) pc.test(x)
x <- rvmf(100, rnorm(3), 15) iagesag(x) pc.test(x)
The null hypothesis is whether a von Mises-Fisher distribution fits the data well, where the altenrative is that Kent distribution is more suitable.
fishkent(x, B = 999)
fishkent(x, B = 999)
x |
A numeric matrix containing the data as unit vectors in Euclidean coordinates. |
B |
The number of bootstrap re-samples. By default is set to 999. If it is equal to 1, no bootstrap is performed and the p-value is obtained throught the asymptotic distribution. |
Essentially it is a test of rotational symmetry, whether Kent's ovalness parameter (beta) is equal to zero. This works for spherical data only.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. If bootstrap was employed this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Rivest L. P. (1986). Modified Kent's statistics for testing goodness of fit for the Fisher distribution in small concentrated samples. Statistics & Probability Letters, 4(1): 1–4.
iagesag, pc.test, vmf.mle, kent.mle
x <- rvmf(100, rnorm(3), 15) fishkent(x) fishkent(x, B = 1) iagesag(x)
x <- rvmf(100, rnorm(3), 15) fishkent(x) fishkent(x, B = 1) iagesag(x)
Interactive 3D plot of spherical data.
sphereplot(dat, col = NULL, bgcol = "snow")
sphereplot(dat, col = NULL, bgcol = "snow")
dat |
A matrix with three columns, unit-vectors, spherical data. |
col |
If you want the points to appear with different colours put numbers here, otherwise leave it NULL. |
bgcol |
The color of the surface of the sphere. |
An interactive 3D plot of the spherical data will appear.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
x <- rvmf(100, rnorm(3), 5) sphereplot(x)
x <- rvmf(100, rnorm(3), 5) sphereplot(x)
It takes some points from the cartesian coordinates and maps them onto the sphere. The inverse os the Lambert's equal area projection.
lambert.inv(z, mu)
lambert.inv(z, mu)
z |
A two- column matrix containing the Lambert's equal rea projected data. |
mu |
The mean direction of the data on the sphere. |
The data are first mapped on the sphere with mean direction equal to the north pole. Then, they are rotated to have the given mean direction. It is the inverse of the Lambert's equal are projection.
A matrix containing spherical data (unit vectors).
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kent, John T. (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society. Series B (Methodological) 44(1):71–80.
m <- rnorm(3) m <- m / sqrt( sum(m^2) ) x <- rvmf(20, m, 19) mu <- vmf.mle(x)$mu y <- lambert( euclid.inv(x) ) lambert.inv(y, mu) euclid.inv(x)
m <- rnorm(3) m <- m / sqrt( sum(m^2) ) x <- rvmf(20, m, 19) mu <- vmf.mle(x)$mu y <- lambert( euclid.inv(x) ) lambert.inv(y, mu) euclid.inv(x)
It transforms the data from the Euclidan coordinates to latitude dn longitude.
euclid.inv(U)
euclid.inv(U)
U |
A matrix of unit vectors, or even one single unit vector in three dimensions. |
It takes the matrix of unit vectors and back transforms it to latitude and longitude.
A two column matrix:
u |
The first column is the latitude and the second is the longitude, both expressed in degrees. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
x <- rvmf(10, rnorm(3), 10) euclid.inv(x) euclid( euclid.inv(x) ) x
x <- rvmf(10, rnorm(3), 10) euclid.inv(x) euclid( euclid.inv(x) ) x
It classifies new observations to some known groups via the k-NN algorithm.
dirknn(xnew, ina, x, k = 5, mesos = TRUE, parallel = FALSE, rann = FALSE)
dirknn(xnew, ina, x, k = 5, mesos = TRUE, parallel = FALSE, rann = FALSE)
xnew |
The new data whose membership is to be predicted, a numeric matrix with unit vectors. |
ina |
A variable indicating the groups of the data x. |
x |
The data, a numeric matrix with unit vectors. |
k |
The number of nearest neighbours, set to 5 by default. It can also be a vector with many values. |
mesos |
A boolean variable used only in the case of the non standard algorithm (type="NS"). Should the average of the distances be calculated (TRUE) or not (FALSE)? If it is FALSE, the harmonic mean is calculated. |
parallel |
If you want the standard -NN algorithm to take place in parallel set this equal to TRUE. |
rann |
If you have large scale datasets and want a faster k-NN search, you can use kd-trees implemented in the R package "RANN". In this case you must set this argument equal to TRUE. |
The standard algorithm is to keep the k nearest observations and see the groups of these observations. The new observation is allocated to the most frequent seen group. The non standard algorithm is to calculate the classical mean or the harmonic mean of the k nearest observations for each group. The new observation is allocated to the group with the smallest mean distance.
A vector including:
g |
A matrix with the predicted group(s). It has as many columns as the values of k. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alenazi A. (2019). Comparison of discriminant analysis methods on the sphere. Communications in Statistics: Case Studies, Data Analysis and Applications, 5(4), 467–491.
k <- runif(4, 4, 20) prob <- c(0.2, 0.4, 0.3, 0.1) mu <- matrix(rnorm(16), ncol = 4) mu <- mu / sqrt( rowSums(mu^2) ) da <- rmixvmf(200, prob, mu, k) nu <- sample(1:200, 180) x <- da$x[nu, ] ina <- da$id[nu] xx <- da$x[-nu, ] id <- da$id[-nu] a1 <- dirknn(xx, ina, x, k = 5, mesos = TRUE) a2 <- dirknn(xx, ina, x, k = 5, mesos = FALSE) table(id, a1) table(id, a2)
k <- runif(4, 4, 20) prob <- c(0.2, 0.4, 0.3, 0.1) mu <- matrix(rnorm(16), ncol = 4) mu <- mu / sqrt( rowSums(mu^2) ) da <- rmixvmf(200, prob, mu, k) nu <- sample(1:200, 180) x <- da$x[nu, ] ina <- da$id[nu] xx <- da$x[-nu, ] id <- da$id[-nu] a1 <- dirknn(xx, ina, x, k = 5, mesos = TRUE) a2 <- dirknn(xx, ina, x, k = 5, mesos = FALSE) table(id, a1) table(id, a2)
k-NN regression with Euclidean or (hyper-)spherical response and or predictor variables.
knn.reg(xnew, y, x, k = 5, res = "eucl", estim = "arithmetic")
knn.reg(xnew, y, x, k = 5, res = "eucl", estim = "arithmetic")
xnew |
The new data, new predictor variables values. A matrix with either euclidean (univariate or multivariate) or (hyper-)spherical data. If you have a circular response, say u, transform it to a unit vector via (cos(u), sin(u)). If xnew = x, you will get the fitted values. |
y |
The currently available data, the response variables values. A matrix with either euclidean (univariate or multivariate) or (hyper-)spherical data. If you have a circular response, say u, transform it to a unit vector via (cos(u), sin(u)). |
x |
The currently available data, the predictor variables values. A matrix with either euclidean (univariate or multivariate) or (hyper-)spherical data. If you have a circular response, say u, transform it to a unit vector via (cos(u), sin(u)). |
k |
The number of nearest neighbours, set to 5 by default. This can also be a vector with many values. |
res |
The type of the response variable. If it is Euclidean, set this argument equal to "res". If it is a unit vector set it to res="spher". |
estim |
Once the k observations whith the smallest distance are discovered, what should the prediction be? The arithmetic average of the corresponding y values be used estim="arithmetic" or their harmonic average estim="harmonic". |
This function covers a broad range of data, Euclidean and spherical, along with their combinations.
A list with as many elements as the number of values of k. Each element in the list contains a matrix (or a vector in the case of Euclidean data) with the predicted response values.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
knnreg.tune, spher.reg, spml.reg
y <- iris[, 1] x <- as.matrix(iris[, 2:4]) x <- x/ sqrt( rowSums(x^2) ) ## Euclidean response a <- knn.reg(x, y, x, k = 5, res = "eucl", estim = "arithmetic") y <- iris[, 2:4] y <- y / sqrt( rowSums(y^2) ) ## Spherical response x <- iris[, 1] a <- knn.reg(x, y, x, k = 5, res = "spher", estim = "arithmetic")
y <- iris[, 1] x <- as.matrix(iris[, 2:4]) x <- x/ sqrt( rowSums(x^2) ) ## Euclidean response a <- knn.reg(x, y, x, k = 5, res = "eucl", estim = "arithmetic") y <- iris[, 2:4] y <- y / sqrt( rowSums(y^2) ) ## Spherical response x <- iris[, 1] a <- knn.reg(x, y, x, k = 5, res = "spher", estim = "arithmetic")
It calculates the Lambert's equal area projection.
lambert(y)
lambert(y)
y |
A two column matrix with the data. The first column is the altitude and the second is the longitude. |
The spherical data are first rotated so that their mean direction is the north pole and then are projectedt on the plane tagent to the sphere at the north pole.
A two-column matrix with the projected points.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kent, John T. (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society. Series B (Methodological) 44(1):71-80.
x <- rvmf(100, rnorm(3), 20) x <- euclid.inv(x) a <- lambert(x) plot(a)
x <- rvmf(100, rnorm(3), 20) x <- euclid.inv(x) a <- lambert(x) plot(a)
Logarithm of the Kent distribution normalizing constant.
kent.logcon(k, b, j = 100)
kent.logcon(k, b, j = 100)
k |
The conencration parameter, |
b |
The ovalness parameter, |
j |
The number of the terms in the sum to use. By default this is 100. |
It calculates logarithm of the normalising constant of the Kent distribution.
The value of the logarithm of the normalising constant of the Kent distribution.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kent John (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society, Series B, 44(1): 71–80.
kent.logcon(10, 2) fb.saddle( c(0, 10, 0), c(0, -2, 2) )
kent.logcon(10, 2) fb.saddle( c(0, 10, 0), c(0, -2, 2) )
Many regressions with one circular dependent variable and one Euclidean independent variable.
spml.regs(y, x, tol = 1e-07, logged = FALSE, maxiters = 100, parallel = FALSE)
spml.regs(y, x, tol = 1e-07, logged = FALSE, maxiters = 100, parallel = FALSE)
y |
The dependent variable, it can be a numerical vector with data expressed in radians or it can be a matrix with two columns, the cosinus and the sinus of the circular data. The benefit of the matrix is that if the function is to be called multiple times with the same response, there is no need to transform the vector every time into a matrix. |
x |
A matrix with independent variable. |
tol |
The tolerance value to terminatate the Newton-Raphson algorithm. |
logged |
Do you want the logarithm of the p-value be returned? TRUE or FALSE. |
maxiters |
The maximum number of iterations to implement. |
parallel |
Do you want the calculations to take plac ein parallel? The default value if FALSE. |
The Newton-Raphson algorithm is fitted in these regression as described in Presnell et al. (1998). For each colum of x a circual regression model is fitted and the hypothesis testing of no association between y and this variable is performed.
A matrix with two columns, the test statistics and their associated (log) p-values.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Presnell B., Morrison S. P. and Littell R. C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
spml.reg, spml.mle, iag.mle, score.cipc
x <- rnorm(200) z <- cbind(3 + 2 * x, 1 -3 * x) y <- cbind( rnorm(100,z[, 1], 1), rnorm(100, z[, 2], 1) ) y <- y / sqrt( rowSums(y^2) ) x <- matrix( rnorm(100 * 50), ncol = 50 ) a <- Directional::spml.regs(y, x) x <- NULL
x <- rnorm(200) z <- cbind(3 + 2 * x, 1 -3 * x) y <- cbind( rnorm(100,z[, 1], 1), rnorm(100, z[, 2], 1) ) y <- y / sqrt( rowSums(y^2) ) x <- matrix( rnorm(100 * 50), ncol = 50 ) a <- Directional::spml.regs(y, x) x <- NULL
It produces maps of the world and the continents.
asia(title = "Asia", coords = NULL) africa(title = "Africa", coords = NULL) europe(title = "Europe", coords = NULL) north.america(title = "North America", coords = NULL) oceania(title = "Oceania", coords = NULL) south.america(title = "South America", coords = NULL) worldmap(title = "World map", coords = NULL)
asia(title = "Asia", coords = NULL) africa(title = "Africa", coords = NULL) europe(title = "Europe", coords = NULL) north.america(title = "North America", coords = NULL) oceania(title = "Oceania", coords = NULL) south.america(title = "South America", coords = NULL) worldmap(title = "World map", coords = NULL)
title |
A character vector with the title of the map. |
coords |
If you want specific points to appear on the plot give the coordinates as a matrix, where the first column contains the longitude and the second column contains the latitude, in degrees. |
Maps of the world or the continents are produced. This are experimental functions and plot the countries with specific colouring at the moment. More functionalities will be added in the future.
A map of the selected continent or the whole world.
Christos Adam.
R implementation and documentation: Christos Adam [email protected] and Michail Tsagris.
x <- euclid.inv( rvmf(10, rnorm(3), 5) )
x <- euclid.inv( rvmf(10, rnorm(3), 5) )
It performs model based clustering for circualr, spherical and hyper-spherical data assuming rotationally symetric distributions.
mixvmf.mle(x, g, n.start = 5, tol = 1e-6, maxiters = 100) mixspcauchy.mle(x, g, n.start = 5, tol = 1e-6, maxiters = 100) mixpkbd.mle(x, g, n.start = 5, tol = 1e-6, maxiters = 100)
mixvmf.mle(x, g, n.start = 5, tol = 1e-6, maxiters = 100) mixspcauchy.mle(x, g, n.start = 5, tol = 1e-6, maxiters = 100) mixpkbd.mle(x, g, n.start = 5, tol = 1e-6, maxiters = 100)
x |
A matrix with the data expressed as unit vectors. |
g |
The number of groups to fit. It must be greater than or equal to 2. |
n.start |
The number of random starts to try. See also R's built-in function |
tol |
The tolerance value to terminate the EM algorithm. |
maxiters |
The maximum number of iterations to perform. |
The initial step of the algorithm is not based on a spherical k-means, but on simple k-means. The results are comparable to the package movMF for the mixtures of von Mises-Fisher distributions. The other cases are mixtures of spherical Cauchy distributions or mixtures of Poisson kernel-based distributions.
A list including:
param |
A matrix with the mean direction, the concentration parameters and mixing probability of each group. |
loglik |
The value of the maximised log-likelihood. |
pred |
The predicted group of each observation. |
w |
The estimated probabilities of each observation to belong to each cluster. |
iter |
The number of iteration required by the EM algorithm. |
runtime |
The run time of the algorithm. A numeric vector. The first element is the user time, the second element is the system time and the third element is the elapsed time. |
Michail Tsagris and Panagiotis Papastamoulis.
R implementation and documentation: Michail Tsagris [email protected] and Panagiotis Papastamoulis [email protected].
Kurt Hornik and Bettina Grun (2014). movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions http://cran.r-project.org/web/packages/movMF/vignettes/movMF.pdf
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
rmixvmf, bic.mixvmf, mixvmf.contour
k <- runif(4, 4, 6) prob <- c(0.2, 0.4, 0.3, 0.1) mu <- matrix(rnorm(16), ncol = 4) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, prob, mu, k)$x mixvmf.mle(x, 3) mixvmf.mle(x, 4) mixvmf.mle(x, 5)
k <- runif(4, 4, 6) prob <- c(0.2, 0.4, 0.3, 0.1) mu <- matrix(rnorm(16), ncol = 4) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, prob, mu, k)$x mixvmf.mle(x, 3) mixvmf.mle(x, 4) mixvmf.mle(x, 5)
MLE of (hyper-)spherical rotationally symmetric distributions.
vmf.mle(x, fast = FALSE, tol = 1e-07) multivmf.mle(x, ina, tol = 1e-07, ell = FALSE) iag.mle(x, tol = 1e-06) sipc.mle(x, tol = 1e-6) acg.mle(x, tol = 1e-07) spcauchy.mle(x, tol = 1e-06) spcauchy.mle2(x, tol = 1e-06) pkbd.mle(x, tol = 1e-6) pkbd.mle2(x, tol = 1e-6)
vmf.mle(x, fast = FALSE, tol = 1e-07) multivmf.mle(x, ina, tol = 1e-07, ell = FALSE) iag.mle(x, tol = 1e-06) sipc.mle(x, tol = 1e-6) acg.mle(x, tol = 1e-07) spcauchy.mle(x, tol = 1e-06) spcauchy.mle2(x, tol = 1e-06) pkbd.mle(x, tol = 1e-6) pkbd.mle2(x, tol = 1e-6)
x |
A matrix with directional data, i.e. unit vectors. |
fast |
IF you want a faster version, but with fewer information returned, set this equal to TRUE. |
ina |
A numerical vector with discrete numbers starting from 1, i.e. 1, 2, 3, 4,... or a factor variable. Each number denotes a sample or group. If you supply a continuous valued vector the function will obviously provide wrong results. |
ell |
This is for the multivmf.mle only. Do you want the log-likelihood returned? The default value is TRUE. |
tol |
The tolerance value at which to terminate the iterations. |
The vmf.mle() estimates the mean direction and concentration of a fitted von Mises-Fisher distribution.
The von Mises-Fisher distribution for groups of data is also implemented.
The acg.mle() fits the angular central Gaussian distribution. There is a constraint on the estimated covariance matrix; its trace is equal to the number of variables. An iterative algorithm takes place and convergence is guaranteed.
The iag.mle() implements MLE of the spherical projected normal distribution, for spherical and hyper-spherical data.
The spcauchy.mle() is faster than the spcacuhy.mle2() because it employs the Newton-Raphson algortihm. Both functions estimate the parameters of the spherical Cauchy distribution, for any dimension. Despite the name sounds confusing, it is implemented for arbitrary dimensions, not only the sphere. The function employs a combination of the fixed points iteration algorithm and the Brent algorithm.
The pkbd.mle() estimates the parameters of the Poisson kernel-based distribution (PKBD), for any dimension and it is faster than pkbd.mle2() for the same reason with the spcauchy.mle().
The sipc.mle() implements MLE of the spherical independent projected Cauchy distribution, for spherical data only.
For the von Mises-Fisher a list including:
loglik |
The maximum log-likelihood value. |
mu |
The mean direction. |
kappa |
The concentration parameter. |
For the multi von Mises-Fisher a list including:
loglik |
A vector with the maximum log-likelihood values if ell is set to TRUE. Otherwise NULL is returned. |
mi |
A matrix with the group mean directions. |
ki |
A vector with the group concentration parameters. |
For the angular central Gaussian a list including:
iter |
The number if iterations required by the algorithm to converge to the solution. |
cova |
The estimated covariance matrix. |
For the spherical projected normal a list including:
iters |
The number of iteration required by the Newton-Raphson. |
mesi |
A matrix with two rows. The first row is the mean direction and the second is the mean vector. The first comes from the second by normalising to have unit length. |
param |
A vector with the elements, the norm of mean vector, the log-likelihood and the log-likelihood of the spherical uniform distribution. The third value helps in case you want to do a log-likelihood ratio test for uniformity. |
For the spherical Cauchy and the PKBD a list including:
mesos |
The mean in |
mu |
The mean direction. |
gamma |
The norm of the mean in |
rho |
The concetration parameter, this takes values in [0, 1). |
loglik |
The log-likelihood value. |
For the SIPC a list including:
mu |
The mean direction. |
loglik |
The log-likelihood value. |
For the angular central Gaussian a list including:
iter |
The number of iterations performed. |
cova |
The covariance matrix. |
Michail Tsagris and Zehao Yu.
R implementation and documentation: Michail Tsagris [email protected] and Zehao Yu [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Sra S. (2012). A short note on parameter approximation for von Mises-Fisher distributions:
and a fast implementation of . Computational Statistics, 27(1): 177–190.
Tyler D. E. (1987). Statistical analysis for the angular central Gaussian distribution on the sphere. Biometrika 74(3): 579–589.
Paine P.J., Preston S.P., Tsagris M. and Wood A.T.A. (2018). An Elliptically Symmetric Angular Gaussian Distribution. Statistics and Computing, 28: 689–697.
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. https://arxiv.org/pdf/2302.02468.pdf
Kato S. and McCullagh P. (2020). Some properties of a Cauchy family on the sphere derived from the Mobius transformations. Bernoulli, 26(4): 3224–3248. https://arxiv.org/pdf/1510.07679.pdf
Golzy M. and Markatou M. (2020). Poisson kernel-based clustering on the sphere: convergence properties, identifiability, and a method of sampling. Journal of Computational and Graphical Statistics, 29(4): 758–770.
Sablica L., Hornik K. and Leydold J. (2023). Efficient sampling from the PKBD distribution. Electronic Journal of Statistics, 17(2): 2180–2209.
Zehao Yu and Xianzheng Huang (2024). A new parameterization for elliptically symmetric angular Gaussian distributions of arbitrary dimension. Electronic Journal of Statististics, 18(1): 301–334.
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
m <- c(0, 0, 0, 0) s <- cov(iris[, 1:4]) x <- racg(100, s) mod <- acg.mle(x) mod cov2cor(mod$cova) ## estimated covariance matrix turned into a correlation matrix cov2cor(s) ## true covariance matrix turned into a correlation matrix vmf.mle(x) x <- rbind( rvmf(100,rnorm(4), 10), rvmf(100,rnorm(4), 20) ) a <- multivmf.mle(x, rep(1:2, each = 100) )
m <- c(0, 0, 0, 0) s <- cov(iris[, 1:4]) x <- racg(100, s) mod <- acg.mle(x) mod cov2cor(mod$cova) ## estimated covariance matrix turned into a correlation matrix cov2cor(s) ## true covariance matrix turned into a correlation matrix vmf.mle(x) x <- rbind( rvmf(100,rnorm(4), 10), rvmf(100,rnorm(4), 20) ) a <- multivmf.mle(x, rep(1:2, each = 100) )
MLE of some circular distributions.
spml.mle(x, rads = FALSE, tol = 1e-07) wrapcauchy.mle(x, rads = FALSE, tol = 1e-07) circexp.mle(x, rads = FALSE, tol = 1e-06) circbeta.mle(x, rads = FALSE) cardio.mle(x, rads = FALSE) ggvm.mle(phi, rads = FALSE) cipc.mle(x, rads = FALSE, tol = 1e-6) gcpc.mle(x, rads = FALSE) mmvm.mle(x, N, rads = FALSE)
spml.mle(x, rads = FALSE, tol = 1e-07) wrapcauchy.mle(x, rads = FALSE, tol = 1e-07) circexp.mle(x, rads = FALSE, tol = 1e-06) circbeta.mle(x, rads = FALSE) cardio.mle(x, rads = FALSE) ggvm.mle(phi, rads = FALSE) cipc.mle(x, rads = FALSE, tol = 1e-6) gcpc.mle(x, rads = FALSE) mmvm.mle(x, N, rads = FALSE)
x |
A numerical vector with the circular data. They can either be expressed in radians or in degrees. |
phi |
A numerical vector with the circular data. They can either be expressed in radians or in degrees. |
N |
The number of modes to consider in the multi-modal von Mises distribution. |
rads |
If the data are in radians set this to TRUE. |
tol |
The tolerance level to stop the iterative process of finding the MLEs. |
The parameters of the bivariate angular Gaussian (spml.mle), wrapped Cauchy, circular exponential, cardioid, circular beta, geometrically generalised von Mises, CIPC (reparametrised version of the wrapped Cauchy), GCPC (generalisation of the CIPC) and multi-modal von Mises distributions are estimated. For the Wrapped Cauchy, the iterative procedure described by Kent and Tyler (1988) is used. The Newton-Raphson algortihm for the angular Gaussian is described in the regression setting in Presnell et al. (1998). The circular exponential is also known as wrapped exponential distribution.
A list including:
iters |
The iterations required until convergence. |
loglik |
The value of the maximised log-likelihood. |
param |
A vector consisting of the estimates of the two parameters, the mean direction for both distributions
and the concentration parameter |
gamma |
The norm of the mean vector of the angular Gaussian, the CIPC and the GCPC distributions. |
mu |
The mean vector of the angular Gaussian, the CIPC and the GCPC distributions. |
mumu |
In the case of "angular Gaussian distribution this is the mean angle in radians. |
circmu |
In the case of the CIPC and the GCPC this is the mean angle in radians. |
rho |
For the GCPC distribution this is the eigenvalue of the covariance matrix, or the covariance determinant. |
lambda |
The lambda parameter of the circular exponential distribution. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Sra S. (2012). A short note on parameter approximation for von Mises-Fisher distributions:
and a fast implementation of . Computational Statistics, 27(1): 177–190.
Presnell Brett, Morrison Scott P. and Littell Ramon C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
Kent J. and Tyler D. (1988). Maximum likelihood estimation for the wrapped Cauchy distribution. Journal of Applied Statistics, 15(2): 247–254.
Dietrich T. and Richter W. D. (2017). Classes of geometrically generalized von Mises distributions. Sankhya B, 79(1): 21–59.
https://en.wikipedia.org/wiki/Wrapped_exponential_distribution
Jammalamadaka S. R. and Kozubowski T. J. (2003). A new family of circular models: The wrapped Laplace distributions. Advances and Applications in Statistics, 3(1), 77–103.
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
Barnett M. J. and Kingston R. L. (2024). A note on the Hendrickson-Lattman phase probability distribution and its equivalence to the generalized von Mises distribution. Journal of Applied Crystallography, 57(2).
circ.summary, purka.mle, rvonmises, vmf.mle, rvmf
x <- rvonmises(1000, 3, 9) spml.mle(x, rads = TRUE) wrapcauchy.mle(x, rads = TRUE) circexp.mle(x, rads = TRUE) ggvm.mle(x, rads = TRUE)
x <- rvonmises(1000, 3, 9) spml.mle(x, rads = TRUE) wrapcauchy.mle(x, rads = TRUE) circexp.mle(x, rads = TRUE) ggvm.mle(x, rads = TRUE)
MLE of some circular distributions with multiple samples.
multivm.mle(x, ina, tol = 1e-07, ell = FALSE) multispml.mle(x, ina, tol = 1e-07, ell = FALSE)
multivm.mle(x, ina, tol = 1e-07, ell = FALSE) multispml.mle(x, ina, tol = 1e-07, ell = FALSE)
x |
A numerical vector with the circular data. They must be expressed in radians. For the "spml.mle" this can also be a matrix with two columns, the cosinus and the sinus of the circular data. |
ina |
A numerical vector with discrete numbers starting from 1, i.e. 1, 2, 3, 4,... or a factor variable. Each number denotes a sample or group. If you supply a continuous valued vector the function will obviously provide wrong results. |
tol |
The tolerance level to stop the iterative process of finding the MLEs. |
ell |
Do you want the log-likelihood returned? The default value is FALSE. |
The parameters of the von Mises and of the bivariate angular Gaussian distributions are estimated for multiple samples.
A list including:
iters |
The iterations required until convergence. This is returned in the wrapped Cauchy distribution only. |
loglik |
A vector with the value of the maximised log-likelihood for each sample. |
mi |
For the von Mises, this is a vector with the means of each sample. For the angular Gaussian (spml), a matrix with the mean vector of each sample |
ki |
A vector with the concentration parameter of the von Mises distribution at each sample. |
gi |
A vector with the norm of the mean vector of the angular Gaussian distribution at each sample. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Sra S. (2012). A short note on parameter approximation for von Mises-Fisher distributions:
and a fast implementation of . Computational Statistics, 27(1): 177–190.
Presnell Brett, Morrison Scott P. and Littell Ramon C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
Kent J. and Tyler D. (1988). Maximum likelihood estimation for the wrapped Cauchy distribution. Journal of Applied Statistics, 15(2): 247–254.
y <- rcauchy(100, 3, 1) x <- y ina <- rep(1:2, 50) multivm.mle(x, ina) multispml.mle(x, ina)
y <- rcauchy(100, 3, 1) x <- y ina <- rep(1:2, 50) multivm.mle(x, ina) multispml.mle(x, ina)
MLE of the ESAG distribution.
esag.mle(y, full = FALSE, tol = 1e-06) ESAGd.mle(y, full = FALSE)
esag.mle(y, full = FALSE, tol = 1e-06) ESAGd.mle(y, full = FALSE)
y |
A matrix with the data expressed in Euclidean coordinates, i.e. unit vectors. The function esag.mle() works for spherical data, whereas ESAGd.mle() is for spherical and hyper-spherical data. |
full |
If you want some extra information, the inverse of the covariance matrix, the |
tol |
A tolerance value to stop performing successive optimizations. |
MLE of the MLE of the ESAG distributiontribution, on the sphere, is implemented. ESAG stands for Elliptically Symmetric Angular Gaussian and it was suugested by Paine et al. (2018). Unlike the projected normal distribution this is rotationally symmetric and is a competitor of the spherical Kent distribution (which is also elliptically symmetric). ESAG was then generalized to arbitrary dimensions by Yu and Huang (2024).
A list including:
mu |
The mean vector. |
gam |
The |
loglik |
The log-likelihood value. |
vinv |
The inverse of the covariance matrix. It is returned if the argument "full" is TRUE. |
rho |
The |
psi |
The angle of rotation |
lambda |
The |
iag.loglik |
The log-likelihood value of the isotropic angular Gaussian distribution in the esag.mle(). That is, the projected normal distribution which is rotationally symmetric. |
Michail Tsagris and Zehao Yu.
R implementation and documentation: Michail Tsagris [email protected] and Zehao Yu [email protected].
Zehao Yu and Xianzheng Huang (2024). A new parameterization for elliptically symmetric angular Gaussian distributions of arbitrary dimension. Electronic Journal of Statististics, 18(1): 301–334.
Paine P.J., Preston S.P., Tsagris M. and Wood A.T.A. (2018). An Elliptically Symmetric Angular Gaussian Distribution. Statistics and Computing, 28(3):689–697.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
desag, resag, iag.mle, kent.mle, acg.mle, circ.summary, sphereplot
m <- colMeans( as.matrix( iris[, 1:3] ) ) y <- resag(1000, m, c(1, 0.5) ) esag.mle(y) m <- colMeans( as.matrix( iris[, 1:4] ) ) y <- rESAGd(1000, m, c(1, 0.5, -1, 1, -0.5) ) ESAGd.mle(y)
m <- colMeans( as.matrix( iris[, 1:3] ) ) y <- resag(1000, m, c(1, 0.5) ) esag.mle(y) m <- colMeans( as.matrix( iris[, 1:4] ) ) y <- rESAGd(1000, m, c(1, 0.5, -1, 1, -0.5) ) ESAGd.mle(y)
It estimates the concentration and the ovalness parameter of some directional data assuming the Kent distribution. The mean direction and major and minor axes are also estimated.
kent.mle(x)
kent.mle(x)
x |
A matrix containing spherical data in Euclidean coordinates. |
The Kent distribution is fitted to some data and its parameters are estimated.
A list including:
runtime |
The run time of the procedure. |
G |
A 3 x 3 matrix whose first column is the mean direction. The second and third columns are the major and minor axes respectively. |
param |
A vector with the concentration |
logcon |
The logarithm of the normalising constant, using the third type approximation (Kume and Wood, 2005). |
loglik |
The value of the log-likelihood. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kent John (1982). The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society, Series B, 44(1): 71–80.
Kume Alfred and Wood Andrew T.A. (2005). Saddlepoint approximations for the Bingham and Fisher-Bingham normalizing constants. Biometrika, 92(2):465–476
kent.mle, fb.saddle, vmf.mle, wood.mle, sphereplot
x <- rvmf(200, rnorm(3), 15) kent.mle(x) vmf.mle(x) A <- diag( c(-5, 0, 5) ) x <- rfb(200, 15, rnorm(3), A) kent.mle(x) vmf.mle(x)
x <- rvmf(200, rnorm(3), 15) kent.mle(x) vmf.mle(x) A <- diag( c(-5, 0, 5) ) x <- rfb(200, 15, rnorm(3), A) kent.mle(x) vmf.mle(x)
It returns the maximum likelihood estimate of the Matrix Fisher parameter F(3x3).
matrixfisher.mle(X)
matrixfisher.mle(X)
X |
An array containing rotation matrices in SO(3). |
The components of .
Anamul Sajib and Chris Fallaize.
R implementation and documentation: Anamul Sajib <[email protected]> and Chris Fallaize.
Prentice M. J. (1986). Orientation statistics without parametric assumptions. Journal of the Royal Statistical Society. Series B: Methodological 48(2): 214–222.
F <- 10^(-1) * matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3 ) ### An arbitrary F matrix X <- rmatrixfisher(5000, F) matrixfisher.mle(X) svd(F)
F <- 10^(-1) * matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3 ) ### An arbitrary F matrix X <- rmatrixfisher(5000, F) matrixfisher.mle(X) svd(F)
MLE of the Purkayashta distribution.
purka.mle(x, tol = 1e-07)
purka.mle(x, tol = 1e-07)
x |
A numerical vector with data expressed in radians or a matrix with spherical data. |
tol |
The tolerance value to terminate the Brent algorithm. |
MLE of the Purkayastha distribution is performed.
A list including:
theta |
The median direction. |
circtheta |
In case of circular data the circular mean is also returned. |
alpha |
The concentration parameter. |
loglik |
The log-likelihood. |
alpha.sd |
The standard error of the concentration parameter. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Purkayastha S. (1991). A Rotationally Symmetric Directional Distribution: Obtained through Maximum Likelihood Characterization. The Indian Journal of Statistics, Series A, 53(1): 70–83.
Cabrera J. and Watson G. S. (1990). On a spherical median related distribution. Communications in Statistics-Theory and Methods, 19(6): 1973–1986.
x <- cbind( rnorm(100,1,1), rnorm(100, 2, 1) ) x <- x / sqrt(rowSums(x^2)) purka.mle(x)
x <- cbind( rnorm(100,1,1), rnorm(100, 2, 1) ) x <- x / sqrt(rowSums(x^2)) purka.mle(x)
MLE of the SESPC distribution.
sespc.mle(y, full = FALSE, tol = 1e-06)
sespc.mle(y, full = FALSE, tol = 1e-06)
y |
A matrix with the data expressed in Euclidean coordinates, i.e. unit vectors. |
full |
If you want some extra information, the inverse of the covariance matrix, set this equal to TRUE. Otherwise leave it FALSE. |
tol |
A tolerance value to stop performing successive optimizations. |
MLE of the SESPC distribution is implemented. SESPC stands for Spherical Elliptically Symmetric Projected Cauchy and it was suugested by Tsagris and Alzeley (2024). Unlike the spherical independent projected Cauchy distribution this is rotationally symmetric and is a competitor of the spherical ESAG and Kent distributions (which are also ellitpically symmetric).
A list including:
mu |
The mean vector in |
theta |
The two |
loglik |
The log-likelihood value. |
vinv |
The inverse of the covariance matrix. It is returned if the argument "full" is TRUE. |
lambda |
The |
psi |
The angle of rotation |
sipc.loglik |
The log-likelihood value of the isotropic prohected Cuchy distribution, which is rotationally symmetric. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
dsespc, rsespc, sipc.mle, esag.mle, spher.sespc.contour
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rsespc(1000, m, c(1,0.5) ) sespc.mle(y)
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rsespc(1000, m, c(1,0.5) ) sespc.mle(y)
It estimates the parameters of the Wood bimodal distribution.
wood.mle(y)
wood.mle(y)
y |
A matrix containing two columns. The first one is the latitude and the second is the longitude, both expressed in degrees. |
The Wood distribution is fitted to some data and its parameters are estimated. It is a bimodal distribution which contains 5 parameters, just like the Kent distribution.
A list including:
info |
A 5 x 3 matrix containing the 5 parameters, |
modes |
The two axis of the modes of the distribution expressed in degrees. |
unitvectors |
A 3 x 3 matrix with the 3 unit vectors associated with the |
loglik |
The value of the log-likelihood. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Wood A.T.A. (1982). A bimodal distribution on the sphere. Journal of the Royal Statistical Society, Series C, 31(1): 52–58.
kent.mle, esag.mle, vmf.mle, sphereplot
x <- rvmf(100, rnorm(3), 15) x <- euclid.inv(x) wood.mle(x)
x <- rvmf(100, rnorm(3), 15) x <- euclid.inv(x) wood.mle(x)
Naive Bayes classifiers for directional data.
vm.nb(xnew = NULL, x, ina, tol = 1e-07) spml.nb(xnew = NULL, x, ina, tol = 1e-07)
vm.nb(xnew = NULL, x, ina, tol = 1e-07) spml.nb(xnew = NULL, x, ina, tol = 1e-07)
xnew |
A numerical matrix with new predictor variables whose group is to be predicted. Each column refers to an angular variable. |
x |
A numerical matrix with observed predictor variables. Each column refers to an angular variable. |
ina |
A numerical vector with strictly positive numbers, i.e. 1,2,3 indicating the groups of the dataset. Alternatively this can be a factor variable. |
tol |
The tolerance value to terminate the Newton-Raphson algorithm. |
Each column is supposed to contain angular measurements. Thus, for each column a von Mises distribution or an circular angular Gaussian distribution is fitted. The product of the densities is the joint multivariate distribution.
A list including:
mu |
A matrix with the mean vectors expressed in radians. |
mu1 |
A matrix with the first set of mean vectors. |
mu2 |
A matrix with the second set of mean vectors. |
kappa |
A matrix with the kappa parameters for the vonMises distribution or with the norm of the mean vectors for the circular angular Gaussian distribution. |
ni |
The sample size of each group in the dataset. |
est |
The estimated group of the xnew observations. It returns a numerical value back regardless of the target variable being numerical as well or factor. Hence, it is suggested that you do \"as.numeric(ina)\" in order to see what is the predicted class of the new data. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
x <- matrix( runif( 100, 0, 1 ), ncol = 2 ) ina <- rbinom(50, 1, 0.5) + 1 a <- vm.nb(x, x, ina)
x <- matrix( runif( 100, 0, 1 ), ncol = 2 ) ina <- rbinom(50, 1, 0.5) + 1 a <- vm.nb(x, x, ina)
Normalised spatial median for directional data.
nsmedian(x, tol = 1e-07)
nsmedian(x, tol = 1e-07)
x |
A matrix with Euclidean data, continuous variables. |
tol |
A tolerance level to terminate the process. |
The spatial median, using a fixed point iterative algorithm, for Euclidean data is calculated. It is a robust location estimate. Then it is normalised to become a unit vector. Generally speaking this might be a better alternative than then mediandir
.
A vector with the spatial median.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Ducharme G. R. and Milasevic P. (1987). Spatial median and directional data. Biometrika, 74(1), 212-215.
Jyrki Mottonen, Klaus Nordhausen and Hannu Oja (2010). Asymptotic theory of the spatial median. In Nonparametrics and Robustness in Modern Statistical Inference and Time Series Analysis: A Festschrift in honor of Professor Jana Jureckova.
T. Karkkaminen and S. Ayramo (2005). On computation of spatial median for robust data mining. Evolutionary and Deterministic Methods for Design, Optimization and Control with Applications to Industrial and Societal Problems, EUROGEN 2005, R. Schilling, W.Haase, J. Periaux, H. Baier, G. Bugeda (Eds) FLM, Munich. http://users.jyu.fi/~samiayr/pdf/ayramo_eurogen05.pdf
m <- rnorm(3) m <- m / sqrt( sum(m^2) ) x <- rvmf(100, m, 10) nsmedian(x) mediandir(x)
m <- rnorm(3) m <- m / sqrt( sum(m^2) ) x <- rvmf(100, m, 10) nsmedian(x) mediandir(x)
Permutation based 2-sample mean test for (hyper-)spherical data.
hcf.perm(x1, x2, B = 999) lr.perm(x1, x2, B = 999) hclr.perm(x1, x2, B = 999) embed.perm(x1, x2, B = 999) het.perm(x1, x2, B = 999)
hcf.perm(x1, x2, B = 999) lr.perm(x1, x2, B = 999) hclr.perm(x1, x2, B = 999) embed.perm(x1, x2, B = 999) het.perm(x1, x2, B = 999)
x1 |
A matrix with the data in Euclidean coordinates, i.e. unit vectors. |
x2 |
A matrix with the data in Euclidean coordinates, i.e. unit vectors. |
B |
The number of permutations to perform. |
The high concentration (hcf.perm), log-likelihood ratio (lr.perm), high concentration log-likelihood ratio (hclr.perm), embedding approach (embed.perm) or the non equal concentration parameters approach (het.perm) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. Since these are permutation based tests this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1), 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
hcf.boot, hcf.aov, spherconc.test, conc.test
x <- rvmf(60, rnorm(3), 15) ina <- rep(1:2, each = 30) x1 <- x[ina == 1, ] x2 <- x[ina == 2, ] hcf.perm(x1, x2) lr.perm(x1, x2) het.boot(x1, x2)
x <- rvmf(60, rnorm(3), 15) ina <- rep(1:2, each = 30) x1 <- x[ina == 1, ] x2 <- x[ina == 2, ] hcf.perm(x1, x2) lr.perm(x1, x2) het.boot(x1, x2)
Permutation based 2-sample mean test for circular data.
hcfcirc.perm(u1, u2, rads = TRUE, B = 999) hetcirc.perm(u1, u2, rads = TRUE, B = 999) lrcirc.perm(u1, u2, rads = TRUE, B = 999) hclrcirc.perm(u1, u2, rads = TRUE, B = 999) embedcirc.perm(u1, u2, rads = TRUE, B = 999)
hcfcirc.perm(u1, u2, rads = TRUE, B = 999) hetcirc.perm(u1, u2, rads = TRUE, B = 999) lrcirc.perm(u1, u2, rads = TRUE, B = 999) hclrcirc.perm(u1, u2, rads = TRUE, B = 999) embedcirc.perm(u1, u2, rads = TRUE, B = 999)
u1 |
A numeric vector containing the data of the first sample. |
u2 |
A numeric vector containing the data of the first sample. |
rads |
If the data are in radians, this should be TRUE and FALSE otherwise. |
B |
The number of permutations to perform. |
The high concentration (hcfcirc.perm), log-likelihood ratio (lrcirc.perm), high concentration log-likelihood ratio (hclrcirc.perm), embedding approach (embedcirc.perm) or the non equal concentration parameters approach (hetcirc.perm) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. Since these are permutation based tests this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Rumcheva P. and Presnell B. (2017). An improved test of equality of mean directions for the Langevin-von Mises-Fisher distribution. Australian & New Zealand Journal of Statistics, 59(1): 119–135.
Tsagris M. and Alenazi A. (2024). An investigation of hypothesis testing procedures for circular and spherical mean vectors. Communications in Statistics-Simulation and Computation, 53(3): 1387–1408.
u1 <- rvonmises(20, 2.4, 5) u2 <- rvonmises(20, 2.4, 10) hcfcirc.perm(u1, u2) lrcirc.perm(u1, u2)
u1 <- rvonmises(20, 2.4, 5) u2 <- rvonmises(20, 2.4, 10) hcfcirc.perm(u1, u2) lrcirc.perm(u1, u2)
Prediction of a new observation using discriminant analysis based on some distributions.
dirda(xnew, x, ina, type = c("vmf", "iag", "esag", "kent", "sc", "pkbd", "purka") )
dirda(xnew, x, ina, type = c("vmf", "iag", "esag", "kent", "sc", "pkbd", "purka") )
xnew |
The new observation(s) (unit vector(s)) whose group is to be predicted. |
x |
A data matrix with unit vectors, i.e. spherical directional data. |
ina |
A vector indicating the groups of the data y. |
type |
The type of classifier to use. The avaliable options are "vmf" (von Mises-Fisher distribution), "iag" (IAG distribution), "esag" (ESAG distribution), "kent" (Kent distribution), "sc" and "sc2" (spherical Cauchy distribution), "pkbd" and "pkbd2" (Poisson kernel-based distribution), and "purka" (Purkayastha distribution). The difference between "sc" and "sc2" and between "pkbd" and "pkbd2" is that the first uses the Newton-Raphson algorithm and it is faster, whereas the second uses a hybrid algorithm that does not require the Hessian matrix, but in large dimensions the second will be faster. You can chose any of them or all of them. Note that "kent" works only with spherical data. |
Prediction of the class of a new (hyper-)spherical vector assuming some distributions.
A vector with the predicted group of each new observation.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alenazi A. (2019). Comparison of discriminant analysis methods on the sphere. Communications in Statistics: Case Studies, Data Analysis and Applications, 5(4): 467–491.
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
Morris J. E. and Laycock P. J. (1974). Discriminant analysis of directional data. Biometrika, 61(2): 335–341.
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
dirda.cv, vm.nb, dirknn, knn.reg
m1 <- rnorm(3) m2 <- rnorm(3) + 0.5 x <- rbind( rvmf(100, m1, 3), rvmf(80, m2, 5) ) ina <- c( rep(1,100), rep(2, 80) ) xnew <- rbind(rvmf(10, m1, 10), rvmf(10, m2, 5)) id <- rep(1:2, each = 10) g <- dirda(xnew, x, ina, type = "vmf") table(id, g[, 1])
m1 <- rnorm(3) m2 <- rnorm(3) + 0.5 x <- rbind( rvmf(100, m1, 3), rvmf(80, m2, 5) ) ina <- c( rep(1,100), rep(2, 80) ) xnew <- rbind(rvmf(10, m1, 10), rvmf(10, m2, 5)) id <- rep(1:2, each = 10) g <- dirda(xnew, x, ina, type = "vmf") table(id, g[, 1])
Prediction with some naive Bayes classifiers for circular data.
vmnb.pred(xnew, mu, kappa, ni) spmlnb.pred(xnew, mu1, mu2, ni)
vmnb.pred(xnew, mu, kappa, ni) spmlnb.pred(xnew, mu1, mu2, ni)
xnew |
A numerical matrix with new predictor variables whose group is to be predicted. Each column refers to an angular variable. |
mu |
A matrix with the mean vectors expressed in radians. |
mu1 |
A matrix with the first set of mean vectors. |
mu2 |
A matrix with the second set of mean vectors. |
kappa |
A matrix with the kappa parameters for the vonMises distribution or with the norm of the mean vectors for the circular angular Gaussian distribution. |
ni |
The sample size of each group in the dataset. |
Each column is supposed to contain angular measurements. Thus, for each column a von Mises distribution or an circular angular Gaussian distribution is fitted. The product of the densities is the joint multivariate distribution.
A numerical vector with 1, 2, ... denoting the predicted group.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
x <- matrix( runif( 100, 0, 1 ), ncol = 2 ) ina <- rbinom(50, 1, 0.5) + 1 a <- vm.nb(x, x, ina) a2 <- vmnb.pred(x, a$mu, a$kappa, a$ni)
x <- matrix( runif( 100, 0, 1 ), ncol = 2 ) ina <- rbinom(50, 1, 0.5) + 1 a <- vm.nb(x, x, ina) a2 <- vmnb.pred(x, a$mu, a$kappa, a$ni)
It checkes whether the data are uniformly distributed on the circle or the (hyper-)sphere.
ptest(x, B = 100)
ptest(x, B = 100)
x |
A matrix containing the data, unit vectors. |
B |
The number of random uniform projections to use. |
For more details see Cuesta-Albertos, Cuevas and Fraiman (2009).
A list including:
pvalues |
The p-values of the Kolmogorov-Smirnov tests. |
pvalue |
The p-value of the test based on the Benjamini and Heller (2008) procedure. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Cuesta-Albertos J. A., Cuevas A. and Fraiman, R. (2009). On projection-based tests for directional and compositional data. Statistics and Computing, 19: 367–380.
Benjamini Y. and Heller R. (2008). Screening for partial conjunction hypotheses. Biometrics, 64(4): 1215–1222.
x <- rvmf(100, rnorm(5), 1) ## Fisher distribution with low concentration ptest(x)
x <- rvmf(100, rnorm(5), 1) ## Fisher distribution with low concentration ptest(x)
Random sample of matrices in SO(p).
rsop(n, p)
rsop(n, p)
n |
The sample size, the number of matrices you want to generate. |
p |
The dimensionality of the matrices. |
The idea is very simple. Start with a unit vector pointing at the north pole (1,0,...,0). Then generate random numbers from a standard normal and scale them so that they have a unit length. To put it differently, a sample of n values from the uniform distribution on the sphere is generated. Then calculate the rotation matrix required to go from the north pole to each of a generated vector.
If n = 1 one matrix is returned. If n is greater than 1, an array with n matrices inside.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Amaral G.J.A., Dryden I.L. and Wood A.T.A. (2007). Pivotal Bootstrap Methods for k-Sample Problems in Directional Statistics and Shape Analysis. Journal of the American Statistical Association, 102(478): 695–707.
rotation, Arotation, rot.matrix
x1 <- rsop(1, 3) x2 <- rsop(10, 3) x3 <- rsop(100, 10)
x1 <- rsop(1, 3) x2 <- rsop(10, 3) x3 <- rsop(100, 10)
It checkes whether the data are uniformly distributed on the circle or the (hyper-)sphere.
rayleigh(x, modif = TRUE, B = 999)
rayleigh(x, modif = TRUE, B = 999)
x |
A matrix containing the data, unit vectors. |
modif |
If modif is TRUE, the modification as suggested by Jupp (2001) is used. |
B |
If B is greater than 1, bootstap calibation os performed. If it is equal to 1, classical theory is used. |
The Rayleigh test of uniformity is not the best, when there are two antipodal mean directions. In this case it will fail. It is good to test whether there is one mean direction or not. To put it differently, it tests whether the concentration parameter of the Fisher distribution is zero or not.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. If bootstrap was employed this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
Jupp, P. E. (2001). Modifications of the Rayleigh and Bingham tests for uniformity of directions. Journal of Multivariate Analysis, 77(2): 1-20.
Rayleigh, L. (1919). On the problem of random vibrations, and of random flights in one, two, or three dimensions. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 37(220): 321–347.
x <- rvmf(100, rnorm(5), 1) ## Fisher distribution with low concentration rayleigh(x)
x <- rvmf(100, rnorm(5), 1) ## Fisher distribution with low concentration rayleigh(x)
Read a file as a Filebacked Big Matrix.
read.fbm(file, select)
read.fbm(file, select)
file |
The File to read. |
select |
Indices of columns to read (sorted). The length of select will be the number of columns of the resulting FBM. |
The functions read a file as a Filebacked Big Matrix object. For more information see the "bigstatsr" package.
A Filebacked Big Matrix object.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
x <- matrix( runif(50 * 20, 0, 2*pi), ncol = 20 )
x <- matrix( runif(50 * 20, 0, 2*pi), ncol = 20 )
Given a 3 x 3 rotation matrix, the angle and the axis of rotation are calculated.
Arotation(A)
Arotation(A)
A |
A 3 x 3 rotation matrix. |
If the user does not supply a rotation matrix a message will appear.
A list including:
angle |
The angle of rotation expressed in degrees. |
axis |
The axis of rotation. A vector of two components, latitude and longitude, expressed in degrees. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Course webpage of Howard E. Haber. http://scipp.ucsc.edu/~haber/ph216/rotation_12.pdf
Ted Chang (1986). Spherical Regression. Annals of Statistics, 14(3): 907–924.
ksi <- c(25.31, 24.29) theta <- 2.38 A <- rot.matrix(ksi, theta, rads = FALSE) A Arotation(A)
ksi <- c(25.31, 24.29) theta <- 2.38 A <- rot.matrix(ksi, theta, rads = FALSE) A Arotation(A)
It calculates a rotation matrix from a rotation axis and angle of rotation.
rot.matrix(ksi, theta, rads = FALSE)
rot.matrix(ksi, theta, rads = FALSE)
ksi |
The rotation axis, a vector with two elements, the first is the latitude and the second is the longitude. |
theta |
The angle of rotation. |
rads |
If both the ksi and theta are in rads, this should be TRUE. If both the ksi and theta are in degrees, this should be FALSE. |
The function accepts as arguments the rotation axis and the angle of rotation and it calcualtes the requested rotation matrix.
A 3 x 3 rotation matrix.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Course webpage of Howard E. Haber. http://scipp.ucsc.edu/~haber/ph216/rotation_12.pdf
Ted Chang (1986). Spherical Regression. Annals of Statistics, 14(3): 907–924.
ksi <- c(25.31, 24.29) theta <- 2.38 A <- rot.matrix(ksi, theta, rads = FALSE) A Arotation(A)
ksi <- c(25.31, 24.29) theta <- 2.38 A <- rot.matrix(ksi, theta, rads = FALSE) A Arotation(A)
It forms a rotation matrix X on SO(3) by using three Euler angles ,
where X is defined as
.
Here
means a rotation of
radians about the x axis.
eul2rot(theta.12, theta.23, theta.13)
eul2rot(theta.12, theta.23, theta.13)
theta.12 |
An Euler angle, a number which must lie in |
theta.23 |
An Euler angle, a number which must lie in |
theta.13 |
An Euler angle, a number which must lie in |
Given three euler angles a rotation matrix X on SO(3) is formed using the transformation according to Green and Mardia (2006) which is defined above.
A roation matrix.
Anamul Sajib <[email protected]>.
R implementation and documentation: Anamul Sajib <[email protected]>.
Green, P. J. and Mardia, K. V. (2006). Bayesian alignment using hierarchical models, with applications in proteins bioinformatics. Biometrika, 93(2):235–254.
# three euler angles theta.12 <- sample( seq(-3, 3, 0.3), 1 ) theta.23 <- sample( seq(-3, 3, 0.3), 1 ) theta.13 <- sample( seq(-1.4, 1.4, 0.3), 1 ) theta.12 ; theta.23 ; theta.13 X <- eul2rot(theta.12, theta.23, theta.13) X # A rotation matrix det(X) e <- rot2eul(X)$v1 theta.12 <- e[3] theta.23 <- e[2] theta.13 <- e[1] theta.12 ; theta.23 ; theta.13
# three euler angles theta.12 <- sample( seq(-3, 3, 0.3), 1 ) theta.23 <- sample( seq(-3, 3, 0.3), 1 ) theta.13 <- sample( seq(-1.4, 1.4, 0.3), 1 ) theta.12 ; theta.23 ; theta.13 X <- eul2rot(theta.12, theta.23, theta.13) X # A rotation matrix det(X) e <- rot2eul(X)$v1 theta.12 <- e[3] theta.23 <- e[2] theta.13 <- e[1] theta.12 ; theta.23 ; theta.13
A rotation matrix is calculated to rotate a unit vector along the direction of another.
rotation(a, b)
rotation(a, b)
a |
The initial unit vector. |
b |
The target unit vector. |
The function calcualtes a rotation matrix given two vectors. This rotation matrix is the connection between the two spherical only, vectors.
A rotation matrix whose dimension is equal to the length of the unit vectors.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Amaral G.J.A., Dryden I.L. and Wood A.T.A. (2007). Pivotal Bootstrap Methods for k-Sample Problems in Directional Statistics and Shape Analysis. Journal of the American Statistical Association, 102(478): 695–707.
Arotation, rot.matrix, lambert, lambert.inv, rsop
a <- rnorm(3) a <- a/sqrt(sum(a^2)) b <- rnorm(3) b <- b/sqrt(sum(b^2)) A <- rotation(a, b) A a ; b a %*% t(A) a <- rnorm(7) a <- a/sqrt(sum(a^2)) b <- rnorm(7) b <- b/sqrt(sum(b^2)) A <- rotation(a, b) A a ; b a %*% t(A)
a <- rnorm(3) a <- a/sqrt(sum(a^2)) b <- rnorm(3) b <- b/sqrt(sum(b^2)) A <- rotation(a, b) A a ; b a %*% t(A) a <- rnorm(7) a <- a/sqrt(sum(a^2)) b <- rnorm(7) b <- b/sqrt(sum(b^2)) A <- rotation(a, b) A a ; b a %*% t(A)
It calculates the logarithm of the normalising constant of the Fisher-Bingham distribution.
fb.saddle(gam, lam)
fb.saddle(gam, lam)
gam |
A numeric vector containing the parameters of the Fisher part. |
lam |
All the eigenvalues of the Bingham part. Not just the non zero ones. |
It calculate the three approximations given by Kume and Wood (2005) and it uses the Fisher-Bingham parametrization of that paper.
A list including:
first oder |
The first order approximation |
second oder |
The second order approximation |
third oder |
The third order approximation |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kume Alfred and Wood Andrew T.A. (2005). Saddlepoint approximations for the Bingham and Fisher-Bingham normalizing constants. Biometrika, 92(2):465-476
kent.logcon, rfb, kent.mle, rbingham
p <- 3 ; k <- 1 0.5 * p * log(2 * pi) - (p/2 - 1) * log(k) + log( besselI(k, p/2 - 1, expon.scaled = TRUE) ) + k ## normalising constant of the ## von Mises-Fisher distribution fb.saddle( c(0, k, 0), c(0, 0, 0) ) ## saddlepoint approximation ## Normalising constant of the Kent distribution fb.saddle( c(0, 10, 0), c(0, -2, 2) ) kent.logcon(10, 2)
p <- 3 ; k <- 1 0.5 * p * log(2 * pi) - (p/2 - 1) * log(k) + log( besselI(k, p/2 - 1, expon.scaled = TRUE) ) + k ## normalising constant of the ## von Mises-Fisher distribution fb.saddle( c(0, k, 0), c(0, 0, 0) ) ## saddlepoint approximation ## Normalising constant of the Kent distribution fb.saddle( c(0, 10, 0), c(0, -2, 2) ) kent.logcon(10, 2)
Score test for many simple CIPC and SPML regressions.
score.cipc(y, X, rads = TRUE, tol = 1e-06) score.spml(y, X, rads = TRUE, tol = 1e-06)
score.cipc(y, X, rads = TRUE, tol = 1e-06) score.spml(y, X, rads = TRUE, tol = 1e-06)
y |
The dependent variable, a numerical vector, it can be in radians or degrees. |
X |
A matrix with many numerical independent variables. |
rads |
If the dependent variable is expressed in rads, this should be TRUE and FALSE otherwise. |
tol |
The tolerance value to terminate the Newton-Raphson algorithm in the null model (CIPC without covariates). |
The score test uses the first derivative (score function) of the regression log-likelihood and it is asymptotically correct. So, this function requires sample sizes or at least 1,000 observations. The CIPC is basically the Wrapped Cauchy distribution (Tsagris and Alzeley, 2024) and SPML is the bivariate projected normal (Presnell et al., 1998).
A matrix with two columns, the test statistic and its associated p-value.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
Presnell B., Morrison S. P. and Littell R. C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
cipc.reg, spml.reg, cipc.mle, spml.mle,
y <- rcipc(500, omega = 2, g = 5) x <- matrix( rnorm(500 * 10), ncol = 10 ) a <- score.cipc(y, x)
y <- rcipc(500, omega = 2, g = 5) x <- matrix( rnorm(500 * 10), ncol = 10 ) a <- score.cipc(y, x)
It simulates random values from a Bingham distribution with any given symmetric matrix.
rbingham(n, A)
rbingham(n, A)
n |
The sample size. |
A |
A symmetric matrix. |
The eigenvalues are fist calcualted and then Chris Fallaize and Theo Kypraio's code (f.rbing) is used. The resulting simulated data anre then right multiplied by the eigenvectors of the matrix A.
A matrix with the siumlated data.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Kent J. T., Ganeiber A. M. and Mardia K. V. (2018). A new unified approach for the simulation of a wide class of directional distributions. Journal of Computational and Graphical Statistics, 27(2): 291–301.
Kent J.T., Ganeiber A.M. and Mardia K.V. (2013). A new method to simulate the Bingham and related distributions in directional data analysis with applications. http://arxiv.org/pdf/1310.8110v1.pdf
Fallaize C. J. and Kypraios T. (2016). Exact bayesian inference for the Bingham distribution. Statistics and Computing, 26(1): 349–360. http://arxiv.org/pdf/1401.2894v1.pdf
A <- cov(iris[, 1:3]) x <- rbingham(100, A)
A <- cov(iris[, 1:3]) x <- rbingham(100, A)
It simulates random samples (rotation matrices) from a Matrix Fisher distribution with any given parameter matrix, F (3x3).
rmatrixfisher(n, F)
rmatrixfisher(n, F)
n |
the sample size. |
F |
An arbitrary 3x3 matrix. |
Firstly corresponding Bingham parameter A is determined for a given Matrix Fisher parameter F using John Kent et al.'s (2013) algorithm and then Bingham samples for parameter A are generated using rbingham code. Finally convert Bingham samples to Matrix Fisher samples according to the Kent (2013) transformation.
An array with simulated rotation matrices.
Anamul Sajib and Chris Fallaize.
R implementation and documentation: Anamul Sajib <[email protected]> and Chris Fallaize.
Kent J. T., Ganeiber A. M. and Mardia K. V. (2018). A new unified approach for the simulation of a wide class of directional distributions. Journal of Computational and Graphical Statistics, 27(2): 291–301.
Kent J.T., Ganeiber A.M. and Mardia K.V. (2013). A new method to simulate the Bingham and related distributions in directional data analysis with applications. http://arxiv.org/pdf/1310.8110v1.pdf
F <- matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3) / 10 ### An arbitrary F matrix a <- rmatrixfisher(10, F)
F <- matrix( c(85, 11, 41, 78, 39, 60, 43, 64, 48), ncol = 3) / 10 ### An arbitrary F matrix a <- rmatrixfisher(10, F)
It simulates from a Bingham distribution using the code suggested by Kent et al. (2013).
f.rbing(n, lam, fast = FALSE)
f.rbing(n, lam, fast = FALSE)
n |
Sample size. |
lam |
Eigenvalues of the diagonal symmetric matrix of the Bingham distribution. |
fast |
If you want a fast, efficient simulation set this to TRUE. |
The user must have calculated the eigenvalues of the diagonal symmetric matrix of the Bingham distribution. The function accepts the q-1 eigenvalues only. This means, that the user must have subtracted the lowest eigenvalue from the rest and give the non zero ones. The function uses rejection sampling and it was written by Chris Fallaize and Theo Kypraios (University of Nottingham) and kindly offered. Any questions on the code can be addressed to one of the two aforementioned people. It is slightly different than the one Kent et al. (2013) suggests.
A list including:
X |
The simulated data. |
avtry |
The estimate of M in the rejection sampling. The average number of simulated values before a value is accepted. If the argument fast is set to TRUE this information will not appear. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kent J. T., Ganeiber A. M. and Mardia K. V. (2018). A new unified approach for the simulation of a wide class of directional distributions. Journal of Computational and Graphical Statistics, 27(2): 291–301.
Kent J.T., Ganeiber A.M. and Mardia K.V. (2013). A new method to simulate the Bingham and related distributions in directional data analysis with applications. http://arxiv.org/pdf/1310.8110v1.pdf
Fallaize C. J. and Kypraios T. (2016). Exact bayesian inference for the Bingham distribution. Statistics and Computing, 26(1): 349–360. http://arxiv.org/pdf/1401.2894v1.pdf
rfb, rvmf, rbingham, rkent, link{rsop}
x <- f.rbing( 100, c(1, 0.6, 0.1) ) x
x <- f.rbing( 100, c(1, 0.6, 0.1) ) x
The function simulates random values simulation from a given mixture of rotationally symmetric distributions.
rmixvmf(n, probs, mu, k) rmixspcauchy(n, probs, mu, k) rmixpkbd(n, probs, mu, k)
rmixvmf(n, probs, mu, k) rmixspcauchy(n, probs, mu, k) rmixpkbd(n, probs, mu, k)
n |
The sample size. |
probs |
This is avector with the mixing probability of each group. |
mu |
A matrix with the mean direction of each group. |
k |
A vector with the concentration parameter of each group. |
The function simulates random values simulation from a given mixture of von Mises-Fisher, spherical Cauchy or Poisson kernel-based distributions.
A list including:
id |
An indicator of the group of each simulated vector. |
x |
A matrix with the simulated data. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Kurt Hornik and Bettina Grun (2014). movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions http://cran.r-project.org/web/packages/movMF/vignettes/movMF.pdf
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
k <- runif(3, 4, 20) probs <- c(0.2, 0.5, 0.3) mu <- matrix(rnorm(9), ncol = 3) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, probs, mu, k)$x bic.mixvmf(x, 5)
k <- runif(3, 4, 20) probs <- c(0.2, 0.5, 0.3) mu <- matrix(rnorm(9), ncol = 3) mu <- mu / sqrt( rowSums(mu^2) ) x <- rmixvmf(200, probs, mu, k)$x bic.mixvmf(x, 5)
Simulation of random values from a spherical Fisher-Bingham distribution.
rfb(n, k, m, A)
rfb(n, k, m, A)
n |
The sample size. |
k |
The concentraion parameter (Fisher part). It has to be greater than 0. |
m |
The mean direction (Fisher part). |
A |
A symmetric matrix (Bingham part). |
Random values from a spherical Fisher-Bingham distribution are generated. This functions included the option of simulating from a Kent distribution also.
A matrix with the simulated data.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kent J. T., Ganeiber A. M. and Mardia K. V. (2018). A new unified approach for the simulation of a wide class of directional distributions. Journal of Computational and Graphical Statistics, 27(2): 291–301.
Kent J.T., Ganeiber A.M. and Mardia K.V. (2013). A new method to simulate the Bingham and related distributions in directional data analysis with applications. http://arxiv.org/pdf/1310.8110v1.pdf
Fallaize C. J. and Kypraios T. (2016). Exact bayesian inference for the Bingham distribution. Statistics and Computing, 26(1): 349–360. http://arxiv.org/pdf/1401.2894v1.pdf
rbingham, rvmf, rkent, f.rbing
k <- 15 mu <- rnorm(3) mu <- mu / sqrt( sum(mu^2) ) A <- cov(iris[, 1:3]) x <- rfb(50, k, mu, A) vmf.mle(x) ## fits a von Mises-Fisher distribution to the simulated data ## Next we simulate from a Kent distribution A <- diag( c(-5, 0, 5) ) n <- 100 x <- rfb(n, k, mu, A) ## data follow a Kent distribution kent.mle(x) ## fits a Kent distribution vmf.mle(x) ## fits a von Mises-Fisher distribution A <- diag( c(5, 0, -5) ) n <- 100 x <- rfb(n, k, mu, A) ## data follow a Kent distribution kent.mle(x) ## fits a Kent distribution vmf.mle(x) ## fits a von Mises-Fisher distribution
k <- 15 mu <- rnorm(3) mu <- mu / sqrt( sum(mu^2) ) A <- cov(iris[, 1:3]) x <- rfb(50, k, mu, A) vmf.mle(x) ## fits a von Mises-Fisher distribution to the simulated data ## Next we simulate from a Kent distribution A <- diag( c(-5, 0, 5) ) n <- 100 x <- rfb(n, k, mu, A) ## data follow a Kent distribution kent.mle(x) ## fits a Kent distribution vmf.mle(x) ## fits a von Mises-Fisher distribution A <- diag( c(5, 0, -5) ) n <- 100 x <- rfb(n, k, mu, A) ## data follow a Kent distribution kent.mle(x) ## fits a Kent distribution vmf.mle(x) ## fits a von Mises-Fisher distribution
Simulation of random values from a spherical Kent distribution.
rkent(n, k, m, b)
rkent(n, k, m, b)
n |
The sample size. |
k |
The concentraion parameter |
m |
The mean direction (Fisher part). |
b |
The ovalness parameter, |
Random values from a Kent distribution on the sphere are generated. The function generates from a spherical Kent distribution using rfb
with an arbitrary mean direction and then rotates the data to have the desired mean direction.
A matrix with the simulated data.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Kent J. T., Ganeiber A. M. and Mardia K. V. (2018). A new unified approach for the simulation of a wide class of directional distributions. Journal of Computational and Graphical Statistics, 27(2): 291–301.
Kent J.T., Ganeiber A.M. and Mardia K.V. (2013). A new method to simulate the Bingham and related distributions in directional data analysis with applications. http://arxiv.org/pdf/1310.8110v1.pdf
k <- 15 mu <- rnorm(3) mu <- mu / sqrt( sum(mu^2) ) A <- diag( c(-5, 0, 5) ) x <- rfb(500, k, mu, A) kent.mle(x) y <- rkent(500, k, mu, A[3, 3]) kent.mle(y)
k <- 15 mu <- rnorm(3) mu <- mu / sqrt( sum(mu^2) ) A <- diag( c(-5, 0, 5) ) x <- rfb(500, k, mu, A) kent.mle(x) y <- rkent(500, k, mu, A[3, 3]) kent.mle(y)
Simulation of random values from rotationally symmetric distributions. The data can be spherical or hyper-spherical.
rvmf(n, mu, k) riag(n, mu) rspcauchy(n, mu, rho) rpkbd(n, mu, rho)
rvmf(n, mu, k) riag(n, mu) rspcauchy(n, mu, rho) rpkbd(n, mu, rho)
n |
The sample size. |
mu |
A unit vector showing the mean direction for the von Mises-Fisher or the spherical Cauchy distribution. The mean vector of the Independent Angular Gaussian distribution does not have to be a unit vector. |
k |
The concentration parameter ( |
rho |
The |
The von Mises-Fisher uses the rejection smapling suggested by Wood (1994). For the Independent Angular Gaussian, values are generated from a multivariate normal distribution with the given mean vector and the identity matrix as the covariance matrix. Then each vector becomes a unit vector. For the spherical Cauchy distribution the algortihm is described in Kato and McCullagh (2020) and for the Poisson kernel-based distribution, it is described in Sablica, Hornik and Leydold (2023).
A matrix with the simulated data.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Wood A.T.A. (1994). Simulation of the von Mises Fisher distribution. Communications in Statistics-Simulation and Computation, 23(1): 157–164.
Dhillon I. S. and Sra S. (2003). Modeling data using directional distributions. Technical Report TR-03-06, Department of Computer Sciences, The University of Texas at Austin. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.75.4122&rep=rep1&type=pdf
Kato S. and McCullagh P. (2020). Some properties of a Cauchy family on the sphere derived from the Mobius transformations. Bernoulli, 26(4): 3224–3248. https://arxiv.org/pdf/1510.07679.pdf
Sablica L., Hornik K. and Leydold J. (2023). Efficient sampling from the PKBD distribution. Electronic Journal of Statistics, 17(2): 2180–2209.
vmf.mle, iag.mle rfb, racg, rvonmises, rmixvmf
m <- rnorm(4) m <- m/sqrt(sum(m^2)) x <- rvmf(100, m, 25) m vmf.mle(x)
m <- rnorm(4) m <- m/sqrt(sum(m^2)) x <- rvmf(100, m, 25) m vmf.mle(x)
Simulation of random values from some circular distributions.
rvonmises(n, m, k, rads = TRUE) rwrapcauchy(n, m, rho, rads = TRUE) rspml(n, mu, rads = TRUE) rcircbeta(n, m, a, b, rads = TRUE) rcircpurka(n, m, a, rads = TRUE) rcircexp(n, lambda, rads = TRUE) rcipc(n, mu = NULL, omega, g, rads = TRUE) rgcpc(n, mu = NULL, omega, g, rho, rads = TRUE)
rvonmises(n, m, k, rads = TRUE) rwrapcauchy(n, m, rho, rads = TRUE) rspml(n, mu, rads = TRUE) rcircbeta(n, m, a, b, rads = TRUE) rcircpurka(n, m, a, rads = TRUE) rcircexp(n, lambda, rads = TRUE) rcipc(n, mu = NULL, omega, g, rads = TRUE) rgcpc(n, mu = NULL, omega, g, rho, rads = TRUE)
n |
The sample size. |
m |
The mean angle expressed in radians or degrees. |
mu |
The mean vector of the SPML, CIPC and GCPC in |
omega |
The location parameter for the CIPC and the GCPC expressed in radians or degrees. |
k |
The concentration parameter of the von Mises distribution.
If k is zero the sample will be generated from the uniform distribution over |
g |
The norm of the mean vector for the CIPC and GCPC, if |
rho |
For the wrapped Cauchy distribution, this is the |
a |
The |
b |
The |
lambda |
The |
rads |
If the mean angle is expressed in radians, this should be TRUE and FALSE otherwise. The simulated data will be expressed in radians or degrees depending on what the mean angle is expressed. |
For the von Mises distribution, the mean direction is transformed to the Euclidean coordinates (i.e. unit vector)
and then the rvmf
function is employed. It uses a rejection smapling as suggested by Andrew Wood in 1994.
We have mentioned the description of the algorithm as we found it in Dhillon and Sra in 2003.
Finally, the data are transformed to radians or degrees.
For the wrapped Cauchy and wrapped exponential distributions the function generates Cauchy or exponential values
and then wrapps them around the circle
. For the circular beta the function has some
extra steps (see Zheng Sun's master thesis).
For the CIPC and GCPC distributions, data are generated from the bivariate Cauchy distribution, normalized to have unit norm and then transformed to angles.
A vector with the simulated data.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Wood A.T.A. (1994). Simulation of the von Mises Fisher distribution. Communications in Statistics-Simulation and Computation, 23(1): 157-164.
Dhillon I.S. and Sra S. (2003). Modeling data using directional distributions. Technical Report TR-03-06, Department of Computer Sciences, The University of Texas at Austin. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.75.4122&rep=rep1&type=pdf
Zheng Sun (2006). Comparing measures of fit for circular distributions. Master thesis, University of Victoria. https://dspace.library.uvic.ca/bitstream/handle/1828/2698/zhengsun_master_thesis.pdf;sequence=1
Lai M. (1994). Some results in the statistical analysis of directional data. Master thesis, University of Hong Kong.
Presnell B., Morrison S.P. and Littell R.C. (1998). Projected multivariate linear models for directional data. Journal of the American Statistical Association, 93(443): 1068–1077.
Purkayastha S. (1991). A Rotationally Symmetric Directional Distribution: Obtained through Maximum Likelihood Characterization. The Indian Journal of Statistics, Series A, 53(1): 70–83
Jammalamadaka S.R. and Kozubowski T.J. (2003). A new family of circular models: The wrapped Laplace distributions. Advances and Applications in Statistics, 3(1): 77–103.
x <- rvonmises(100, 2, 25, rads = TRUE) circ.summary(x, rads = TRUE)
x <- rvonmises(100, 2, 25, rads = TRUE) circ.summary(x, rads = TRUE)
Simulation of random values from the ESAG distribution.
resag(n, mu, gam) rESAGd(n, mu, gam)
resag(n, mu, gam) rESAGd(n, mu, gam)
n |
A number; how many vectors you want to generate. |
mu |
The mean vector the ESAG distribution. |
gam |
The |
A random sample from the ESAG distribution is generated. In case the are zero (or null for the rESAGd), the sample is drawn from the Independent Angular Gaussian (or projected normal). The resag() is designed for the sphere, whereas the rESAGd is designed for the sphere and hyper-sphere.
An matrix with the simulated unit vectors.
Michail Tsagris and Zehao Yu.
R implementation and documentation: Michail Tsagris [email protected] and Zehao Yu [email protected].
Zehao Yu and Xianzheng Huang (2024). A new parameterization for elliptically symmetric angular Gaussian distributions of arbitrary dimension. Electronic Journal of Statististics, 18(1): 301–334.
Paine P.J., Preston S.P., Tsagris M. and Wood A.T.A. (2018). An Elliptically Symmetric Angular Gaussian Distribution. Statistics and Computing, 28(3):689–697.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
esag.mle, desag, spml.mle, acg.mle, circ.summary
m <- colMeans( as.matrix( iris[, 1:3] ) ) y <- resag(1000, m, c(1, 0.5) ) esag.mle(y)
m <- colMeans( as.matrix( iris[, 1:3] ) ) y <- resag(1000, m, c(1, 0.5) ) esag.mle(y)
Simulation of random values from the SESPC distribution
rsespc(n, mu, theta)
rsespc(n, mu, theta)
n |
A number; how many vectors you want to generate. |
mu |
The mean vector the SESPC distribution, a vector in |
theta |
The two |
A random sample from the SESPC distribution is generated. In case the are zero, the sample is drawn from the SIPC (spherical independent projected Cauchy) distribution.
An matrix with the simulated unit vectors.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rsespc(1000, m, c(1, 0.5) ) sespc.mle(y)
m <- colMeans( as.matrix( iris[,1:3] ) ) y <- rsespc(1000, m, c(1, 0.5) ) sespc.mle(y)
Spherical and hyper-spherical distance correlation.
spher.dcor(x, y)
spher.dcor(x, y)
x |
A matrix with directional data, i.e. unit vectors. |
y |
A matrix with directional data, i.e. unit vectors. |
The distance correlation between two spherical or hyper-spherical variables is computed.
A list including:
dcov |
The distance covariance. |
dvarX |
The distance variance of x. |
dvarY |
The distance variance of Y. |
dcor |
The distance correlation. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
G.J. Szekely, M.L. Rizzo and N. K. Bakirov (2007). Measuring and Testing Independence by Correlation of Distances. Annals of Statistics, 35(6):2769-2794.
y <- rvmf(50, rnorm(3), 4) x <- rvmf(50, rnorm(3), 4) spher.dcor(x, y)
y <- rvmf(50, rnorm(3), 4) x <- rvmf(50, rnorm(3), 4) spher.dcor(x, y)
It calculates, very fast, the (hyper-)spherical median of a sample.
mediandir(x) mediandir_2(x)
mediandir(x) mediandir_2(x)
x |
The data, a numeric matrix with unit vectors. |
The "mediandir" employes a fixed poit iterative algorithm stemming from the first derivative (Cabrera and Watson, 1990) to find the median direction as described by Fisher (1985) and Fisher, Lewis and Embleton (1987). In the big samples this is much much faster than "mediandir_2", since the search is based on iterations.
The median direction.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Fisher N. I. (1985). Spherical medians. Journal of the Royal Statistical Society. Series B, 47(2): 342–348.
Fisher N. I., Lewis T. and Embleton B. J. (1987). Statistical analysis of spherical data. Cambridge university press.
Cabrera J. and Watson G. S. (1990). On a spherical median related distribution. Communications in Statistics-Theory and Methods, 19(6): 1973–1986.
m <- rnorm(3) m <- m / sqrt( sum(m^2) ) x <- rvmf(100, m, 10) mediandir(x) mediandir_2(x) nsmedian(x)
m <- rnorm(3) m <- m / sqrt( sum(m^2) ) x <- rvmf(100, m, 10) mediandir(x) mediandir_2(x) nsmedian(x)
Spherical regression using rotationally symmetric distributions.
iag.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) vmf.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) sipc.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06)
iag.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) vmf.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06) sipc.reg(y, x, con = TRUE, xnew = NULL, tol = 1e-06)
y |
A matrix with 3 columns containing the (unit vector) spherical data. |
x |
The predictor variable(s), they can be continnuous, spherical, categorical or a mix of them. |
con |
Do you want the constant term in the regression? |
xnew |
If you have new data use it, otherwise leave it NULL. |
tol |
A tolerance value to decide when to stop the successive optimaizations. |
The second parametrization of the projected normal and of the von Mises-Fisher regression (Paine et al., 2020) is applied. The same is true for the SIPC distribution. For more information see the paper by Paine et al. (2020).
A list including:
loglik |
The log-likelihood of the regression model. |
fit |
This is a measure of fit of the estimated values, defined as |
beta |
The beta coefficients. |
seb |
The standard error of the beta coefficients. |
ki |
The norm of the fitted values. In the von Mises-Fisher regression this is the concentration parameter of each observation. In the projected normal this are the norms of the fitted values before being projected onto the sphere. This is returned if the argument "xnew" is NULL. |
est |
The fitted values of xnew if "xnew" is NULL. If it is not NULL, the fitted values for the "xnew" you supplied will be returned. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
P. J. Paine, S. P. Preston, M. Tsagris and Andrew T. A. Wood (2020). Spherical regression models with general covariates and anisotropic errors. Statistics and Computing, 30(1): 153–165. https://link.springer.com/content/pdf/10.1007
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. https://arxiv.org/pdf/2302.02468.pdf
y <- rvmf(150, rnorm(3), 5) a1 <- iag.reg(y, iris[, 4]) a2 <- iag.reg(y, iris[, 4:5]) b1 <- vmf.reg(y, iris[, 4]) b2 <- vmf.reg(y, iris[, 4:5])
y <- rvmf(150, rnorm(3), 5) a1 <- iag.reg(y, iris[, 4]) a2 <- iag.reg(y, iris[, 4:5]) b1 <- vmf.reg(y, iris[, 4]) b2 <- vmf.reg(y, iris[, 4:5])
Spherical regression using the ESAG distribution.
esag.reg(y, x, con = TRUE, xnew = NULL, lati = 10, longi = 10, tol = 1e-06)
esag.reg(y, x, con = TRUE, xnew = NULL, lati = 10, longi = 10, tol = 1e-06)
y |
A matrix with 3 columns containing the (unit vector) spherical data. |
x |
The predictor variable(s), they can be continnuous, spherical, categorical or a mix of them. |
con |
Do you want the constant term in the regression? |
xnew |
If you have new data use it, otherwise leave it NULL. |
lati |
A positive number determing the range of degrees to move left and right from the latitude center. This number and the next determine the grid of points to search for the Q matrix described in Paine et al. (2020). |
longi |
A positive number determing the range of degrees to move up and down from the longitude center. This number and the previous determine the grid of points to search for the Q matrix described in Paine et al. (2020). |
tol |
A tolerance value to decide when to stop the successive optimizations. |
The second parametrization of the ESAG regression (Paine et al., 2020) is applied.
A list including:
loglik |
The log-likelihood of the regression model. |
param |
A vector with three numbers. A measure of fit of the estimated values, defined as |
gam |
The two |
beta |
The beta coefficients. |
seb |
The standard error of the beta coefficients. |
est |
The fitted values of xnew if "xnew" is NULL. If it is not NULL, the fitted values for the "xnew" you supplied will be returned. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
P. J. Paine, S. P. Preston, M. Tsagris and Andrew T. A. Wood (2020). Spherical regression models with general covariates and anisotropic errors. Statistics and Computing, 30(1): 153–165. https://link.springer.com/content/pdf/10.1007
y <- resag( 25, rnorm(3), c(1, 1) ) ## this is a small example to pass CRAN's check because the default argument values ## of lati and longi require many seconds a <- esag.reg(y, iris[1:25, 4], lati = 2, longi = 2)
y <- resag( 25, rnorm(3), c(1, 1) ) ## this is a small example to pass CRAN's check because the default argument values ## of lati and longi require many seconds a <- esag.reg(y, iris[1:25, 4], lati = 2, longi = 2)
Spherical regression using the SESPC distribution.
sespc.reg(y, x, con = TRUE, xnew = NULL, lati = 10, longi = 10, tol = 1e-06)
sespc.reg(y, x, con = TRUE, xnew = NULL, lati = 10, longi = 10, tol = 1e-06)
y |
A matrix with 3 columns containing the (unit vector) spherical data. |
x |
The predictor variable(s), they can be continnuous, spherical, categorical or a mix of them. |
con |
Do you want the constant term in the regression? |
xnew |
If you have new data use it, otherwise leave it NULL. |
lati |
A positive number determing the range of degrees to move left and right from the latitude center. This number and the next determine the grid of points to search for the Q matrix described in Tsagris and Alzeley (2024). |
longi |
A positive number determing the range of degrees to move up and down from the longitude center. This number and the previous determine the grid of points to search for the Q matrix described in Tsagris and Alzeley (2024). |
tol |
A tolerance value to decide when to stop the successive optimizations. |
Regression based on the SESPC distribution (Tsagris and Alzeley, 2024) is applied.
A list including:
loglik |
The log-likelihood of the regression model. |
param |
A vector with three numbers. A measure of fit of the estimated values, defined as |
theta |
The two |
beta |
The beta coefficients. |
seb |
The standard error of the beta coefficients. |
est |
The fitted values of xnew if "xnew" is NULL. If it is not NULL, the fitted values for the "xnew" you supplied will be returned. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alzeley O. (2024). Circular and spherical projected Cauchy distributions: A Novel Framework for Circular and Directional Data Modeling. Australian & New Zealand Journal of Statistics (accepted for publication). https://arxiv.org/pdf/2302.02468.pdf
y <- rsespc( 150, rnorm(3), c(1, 1) ) ## this is a small example to pass CRAN's check because the default argument values ## of lati and longi require many seconds a <- sespc.reg(y, iris[, 4], lati = 2, longi = 2)
y <- rsespc( 150, rnorm(3), c(1, 1) ) ## this is a small example to pass CRAN's check because the default argument values ## of lati and longi require many seconds a <- sespc.reg(y, iris[, 4], lati = 2, longi = 2)
Correlation between two spherical variables.
spher.cor(x, y)
spher.cor(x, y)
x |
A spherical variable. A matrix with thre columns, each row is a unit vector. |
y |
A spherical variable. A matrix with thre columns, each row is a unit vector. |
A very similar to the classical correlation is calcualted. In addition, a hypothesis test for no correlation is performed. Note, that this is a squared correlation actually, so negative values will never be returned.
A vector including:
R-squared |
The value of the squared correlation. |
p-value |
The p-value of the no correlation hypothesis testing. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kanti V. Mardia and Peter E. Jupp. Directional statistics, pg. 254–255.
spher.reg, vmf.mle, circ.cor1, circ.cor2
x <- rvmf(100, rnorm(3), 10) y <- rvmf(100, rnorm(3), 10) spher.cor(x, y)
x <- rvmf(100, rnorm(3), 10) y <- rvmf(100, rnorm(3), 10) spher.cor(x, y)
Regression when both the dependent and independent variables are spherical.
spher.reg(y, x, rads = FALSE, xnew = NULL)
spher.reg(y, x, rads = FALSE, xnew = NULL)
y |
The dependent variable; a matrix with either two columns, latitude and longitude, either in radians or in degrees. Alternatively it is a matrix with three columns, unit vectors. |
x |
The dependent variable; a matrix with either two columns, latitude and longitude, either in radians or in degrees. Alternatively it is a matrix with three columns, unit vectors. The two matrices must agree in the scale and dimensions. |
rads |
If the data are expressed in latitude and longitude then it matter to know if they are in radians or degrees. If they are in radians, then this should be TRUE and FALSE otherwise. If the previous argument, euclidean, is TRUE, this one does not matter what its value is. |
xnew |
The new values of some spherical independent variable(s) whose spherical response values you want to predict. If you have no new x values, leave it NULL (default). |
Spherical regression as proposed by Chang (1986) is implemented. If the estimated rotation matrix has a determinant equal to -1, singular value decomposition is performed and the third unit vector is multiplied by -1.
A list including:
A |
The estimated rotation matrix. |
est |
The fitted values in unit vectors, if the argument xnew is not NULL. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Ted Chang (1986). Spherical Regression. Annals of Statistics, 14(3): 907–924.
hspher.reg, spher.cor, spml.reg
mx <- rnorm(3) mx <- mx/sqrt( sum(mx^2) ) my <- rnorm(3) my <- my/sqrt( sum(my^2) ) x <- rvmf(100, mx, 15) A <- rotation(mx, my) y <- x %*% t(A) mod <- spher.reg(y, x) A mod$A ## exact match, no noise y <- x %*% t(A) y <- y + rvmf(100, colMeans(y), 40) mod <- spher.reg(y, x) A mod$A ## noise added, more relistic example
mx <- rnorm(3) mx <- mx/sqrt( sum(mx^2) ) my <- rnorm(3) my <- my/sqrt( sum(my^2) ) x <- rvmf(100, mx, 15) A <- rotation(mx, my) y <- x %*% t(A) mod <- spher.reg(y, x) A mod$A ## exact match, no noise y <- x %*% t(A) y <- y + rvmf(100, colMeans(y), 40) mod <- spher.reg(y, x) A mod$A ## noise added, more relistic example
It produces a few summary measures for circular data.
circ.summary(u, rads = FALSE, fast = FALSE, tol = 1e-07, plot = FALSE)
circ.summary(u, rads = FALSE, fast = FALSE, tol = 1e-07, plot = FALSE)
u |
A vector with circular data. |
rads |
If the data are in rads, then this should be TRUE, otherwise FALSE. |
fast |
A boolean variable to do a faster implementation. |
tol |
The tolerance level to stop the Newton-Raphson algorithm for finding kappa. |
plot |
If you want to see the data plotted on a cicrle make this TRUE. |
It returns the circular mean, mean resultant length, variance, standard deviation and concentration parameter. So, basically it returns the estimated values of the parameters of the von Mises distribution.
If fast = FALSE a list including all the following. If fast = TRUE less items are returned.
mesos |
The circular mean direction. |
confint |
The 95% confidence interval for the circular mean direction. |
kappa |
The concentration parameter. |
MRL |
The mean resultant length. |
circvariance |
The circular variance. |
circstd |
The circular standard deviation. |
loglik |
The log-likelihood of the fitted von Mises distribution. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
spml.mle, rvonmises, vm.kde, vmf.mle, group.vm, hcf.circaov
x <- rvonmises(50, 2.5, 15, rads = TRUE) circ.summary(x, rads = TRUE, plot = TRUE)
x <- rvonmises(50, 2.5, 15, rads = TRUE) circ.summary(x, rads = TRUE, plot = TRUE)
It produces a few summary measures for grouped circular data.
group.vm(group, fi, rads = FALSE)
group.vm(group, fi, rads = FALSE)
group |
A matrix denoting the classes. Each row consists of two numbers, the lower and upper points of each class. |
fi |
The frequency of each class of data. |
rads |
If the data are in rads, then this should be TRUE, otherwise FALSE. |
It returns the circular mean, mean resultant length, variance, standard deviation and concentration parameter. So, basically it returns the estimated values of the parameters of the von Mises distribution. The mena resultant length though is group corrected.
A list including:
mesos |
The circular mean direction. |
confint |
The 95% confidence interval for the circular mean direction. |
kappa |
The concentration parameter. |
MRL |
The mean resultant length. |
circvariance |
The circular variance. |
circstd |
The circular standard deviation. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Pewsey Arthur, Markus Neuhauser and Graeme D. Ruxton (2013). Circular statistics in R. Oxford University Press.
Mardia K. V. and Jupp P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
circ.summary, rvonmises, vm.kde
x <- rvonmises(200, 3, 10) a <- circ.summary(x, rads = TRUE, plot = FALSE) group <- seq(min(x) - 0.1, max(x) + 0.1, length = 6) y <- cut(x, breaks = group, length = 6) group <- matrix( c( group[1], rep(group[2:5], each = 2), group[6]), ncol = 2, byrow = TRUE) fi <- as.vector( table(y) ) b <- group.vm(group, fi, rads = TRUE) a b
x <- rvonmises(200, 3, 10) a <- circ.summary(x, rads = TRUE, plot = FALSE) group <- seq(min(x) - 0.1, max(x) + 0.1, length = 6) y <- cut(x, breaks = group, length = 6) group <- matrix( c( group[1], rep(group[2:5], each = 2), group[6]), ncol = 2, byrow = TRUE) fi <- as.vector( table(y) ) b <- group.vm(group, fi, rads = TRUE) a b
A log-likelihood ratio test for testing whether the sample mena direction is equal to some predefined one.
meandir.test(x, mu, B = 999)
meandir.test(x, mu, B = 999)
x |
A matrix with the data, unit vectors. |
mu |
A unit vector with the hypothesized mean direction. |
B |
A number either 1, so no bootstrap calibration is performed or more than 1, so bootstrap calibration is performed. |
The log-likelihood ratio test is employed.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degrees of freedom of the test. If bootstrap was employed this is "NA". |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
mu <- rnorm(5) mu <- mu / sqrt( sum(mu^2) ) x <- rvmf(100, mu, 10) meandir.test(x, mu, 1) meandir.test(x, mu, 499)
mu <- rnorm(5) mu <- mu / sqrt( sum(mu^2) ) x <- rvmf(100, mu, 10) meandir.test(x, mu, 1) meandir.test(x, mu, 499)
This tests the equality of concentration parameters for spherical data only.
spherconc.test(x, ina)
spherconc.test(x, ina)
x |
A matrix with the data in Euclidean coordinates, i.e. unit vectors |
ina |
A variable indicating the groupings of the observations. |
The test is designed for spherical data only.
A list including:
mess |
A message stating the value of the mean resultant and which test statistic was used, U1, U2 or U3. |
res |
A vector containing the test statistic and its p-value. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Kanti V. Mardia and Peter E. Jupp. Directional statistics, pg. 226–227.
het.aov, lr.aov, embed.aov, hcf.aov, conc.test, sphereplot
x <- rvmf(100, rnorm(3), 15) ina <- rep(1:4, each = 25) spherconc.test(x, ina)
x <- rvmf(100, rnorm(3), 15) ina <- rep(1:4, each = 25) spherconc.test(x, ina)
A test for testing the equality of the concentration parameter among g samples, where g >= 2 for ciruclar data.
conc.test(u, ina, rads = FALSE)
conc.test(u, ina, rads = FALSE)
u |
A numeric vector containing the values of all samples. |
ina |
A numerical variable or factor indicating the groups of each value. |
rads |
If the data are in radians this should be TRUE and FALSE otherwise. |
This test works for circular data.
A list including:
mess |
A message informing the use of the test statistic used. |
res |
A numeric vector containing the value of the test statistic and its associated p-value. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Mardia, K. V. and Jupp, P. E. (2000). Directional statistics. Chicester: John Wiley & Sons.
embed.circaov, hcf.circaov, lr.circaov, het.circaov
x <- rvonmises(100, 2.4, 15) ina <- rep(1:4,each = 25) conc.test(x, ina, rads = TRUE)
x <- rvonmises(100, 2.4, 15) ina <- rep(1:4,each = 25) conc.test(x, ina, rads = TRUE)
The k-nearest neighbours using the cosinus distance.
cosnn(xnew, x, k = 5, index = FALSE, rann = FALSE)
cosnn(xnew, x, k = 5, index = FALSE, rann = FALSE)
xnew |
The new data whose k-nearest neighbours are to be found. |
x |
The data, a numeric matrix with unit vectors. |
k |
The number of nearest neighbours, set to 5 by default. It can also be a vector with many values. |
index |
If you want the indices of the closest observations set this equal to TRUE. |
rann |
If you have large scale datasets and want a faster k-NN search, you can use kd-trees implemented in the R package "RANN". In this case you must set this argument equal to TRUE. |
The shortest distances or the indices of the k-nearest neighbours using the cosinus distance are returned.
A matrix with the shortest distance of each xnew from x, if index is FALSE, or the indices of the nearest neighbours of each xnew from x, if index is TRUE.
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alenazi A. (2019). Comparison of discriminant analysis methods on the sphere. Communications in Statistics: Case Studies, Data Analysis and Applications, 5(4): 467–491.
xnew <- rvmf(10, rnorm(3), 5) x <- rvmf(50, rnorm(3), 5) a <- cosnn(xnew, x, k = 5) b <- cosnn(xnew, x, k = 5, index = TRUE)
xnew <- rvmf(10, rnorm(3), 5) x <- rvmf(50, rnorm(3), 5) a <- cosnn(xnew, x, k = 5) b <- cosnn(xnew, x, k = 5, index = TRUE)
Transform unit vectors to angular data.
etoa(x)
etoa(x)
x |
A numerical matrix with directional data, i.e. unit verctors. |
from the Euclidean coordinates the data are mapped to angles, expressed in rads.
A list including:
mu |
A matrix with angles. The number of columns is one less than that of the original matrix. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
https://en.wikipedia.org/wiki/N-sphere#Spherical_coordinates
x <- rvmf(10, rnorm(3), 5) y <- etoa(x)
x <- rvmf(10, rnorm(3), 5) y <- etoa(x)
Tuning of the bandwidth parameter in the von Mises kernel for circular data. Cross validation is used.
vmkde.tune(u, low = 0.1, up = 1, rads = TRUE)
vmkde.tune(u, low = 0.1, up = 1, rads = TRUE)
u |
The data, a numerical vector. |
low |
The lower value of h to search. |
up |
The lower value of h to search. |
rads |
If the data are in radians this should be TRUE and FALSE otherwise. |
Tuning of the bandwidth parameter in the von Mises kernel for circula data via cross validation. The procedure is fast because an optimiser is used.
A vector including two elements:
Optimal h |
The best H found. |
cv |
The value of the maximised pseudo-likelihood. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Taylor C.C. (2008). Automatic bandwidth selection for circular density estimation. Computational Statistics & Data Analysis, 52(7), 3493–3500.
Wand M.P. and Jones M.C. (1994). Kernel smoothing. CrC Press.
u <- rvonmises(100, 2.4, 10, rads = TRUE) vmkde.tune(u)
u <- rvonmises(100, 2.4, 10, rads = TRUE) vmkde.tune(u)
Tuning of the bandwidth parameter in the von Mises-Fisher kernel for (hyper-)spherical data whit cross validation.
vmfkde.tune(x, low = 0.1, up = 1)
vmfkde.tune(x, low = 0.1, up = 1)
x |
A matrix with the data in Euclidean cordinates, i.e. unit vectors. |
low |
The lower value of the bandwdith to search. |
up |
The upper value of the bandwdith to search. |
Fast tuning of the bandwidth parameter in the von Mises-Fisher kernel for (hyper-)spherical data via cross validation.
A vector including two elements:
Optimal h |
The best H found. |
cv |
The value of the maximised pseudo-likelihood. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Garcia P.E. (2013). Exact risk improvement of bandwidth selectors for kernel density estimation with directional data. Electronic Journal of Statistics, 7, 1655–1685.
Wand M.P. and Jones M.C. (1994). Kernel smoothing. Crc Press.
vmf.kde,vmf.kerncontour, vm.kde, vmkde.tune
x <- rvmf(100, rnorm(3), 15) vmfkde.tune(x)
x <- rvmf(100, rnorm(3), 15) vmfkde.tune(x)
It estimates the percentage of correct classification via an m-fold cross validation.
dirknn.tune(ina, x, k = 2:10, mesos = TRUE, nfolds = 10, folds = NULL, parallel = FALSE, stratified = TRUE, seed = NULL, rann = FALSE, graph = FALSE)
dirknn.tune(ina, x, k = 2:10, mesos = TRUE, nfolds = 10, folds = NULL, parallel = FALSE, stratified = TRUE, seed = NULL, rann = FALSE, graph = FALSE)
x |
The data, a numeric matrix with unit vectors. |
ina |
A variable indicating the groups of the data x. |
nfolds |
How many folds to create? |
k |
A vector with the number of nearest neighbours to consider. |
mesos |
A boolean variable used only in the case of the non standard algorithm (type="NS"). Should the average of the distances be calculated (TRUE) or not (FALSE)? If it is FALSE, the harmonic mean is calculated. |
folds |
Do you already have a list with the folds? If not, leave this NULL. |
parallel |
If you want the standard -NN algorithm to take place in parallel set this equal to TRUE. |
stratified |
Should the folds be created in a stratified way? i.e. keeping the distribution of the groups similar through all folds? |
seed |
If seed is TRUE, the results will always be the same. |
rann |
If you have large scale datasets and want a faster k-NN search, you can use kd-trees implemented in the R package "RANN". In this case you must set this argument equal to TRUE. |
graph |
If this is TRUE a graph with the results will appear. |
The standard algorithm is to keep the k nearest observations and see the groups of these observations. The new observation is allocated to the most frequent seen group. The non standard algorithm is to calculate the classical mean or the harmonic mean of the k nearest observations for each group. The new observation is allocated to the group with the smallest mean distance.
We have made an eficient (not very much efficient though) memory allocation. Even if you have hundreds of thousands of observations, the computer will not clush, it will only take longer. Instead of calculate the distance matrix once in the beginning we calcualte the distances of the out-of-sample observations from the rest. If we calculated the distance matrix in the beginning, once, the resulting matrix could have dimensions thousands by thousands. This would not fit into the memory. If you have a few hundres of observations, the runtime is about the same (maybe less, maybe more) as calculating the distance matrix in the first place.
A list including:
per |
The average percent of correct classification across the neighbours. |
percent |
The estimated (optimal) percent of correct classification. |
runtime |
The run time of the algorithm. A numeric vector. The first element is the user time, the second element is the system time and the third element is the elapsed time. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Tsagris M. and Alenazi A. (2019). Comparison of discriminant analysis methods on the sphere. Communications in Statistics: Case Studies, Data Analysis and Applications, 5(4), 467–491.
k <- runif(4, 4, 20) prob <- c(0.2, 0.4, 0.3, 0.1) mu <- matrix(rnorm(16), ncol = 4) mu <- mu / sqrt( rowSums(mu^2) ) da <- rmixvmf(200, prob, mu, k) x <- da$x ina <- da$id dirknn.tune(ina, x, k = 2:6, nfolds = 5, mesos = TRUE) dirknn.tune(ina, x, k = 2:6, nfolds = 10, mesos = TRUE)
k <- runif(4, 4, 20) prob <- c(0.2, 0.4, 0.3, 0.1) mu <- matrix(rnorm(16), ncol = 4) mu <- mu / sqrt( rowSums(mu^2) ) da <- rmixvmf(200, prob, mu, k) x <- da$x ina <- da$id dirknn.tune(ina, x, k = 2:6, nfolds = 5, mesos = TRUE) dirknn.tune(ina, x, k = 2:6, nfolds = 10, mesos = TRUE)
Tuning of the k-NN regression with Euclidean or (hyper-)spherical response and or predictor variables. It estimates the percentage of correct classification via an m-fold cross valdiation. The bias is estimated as well using the algorithm suggested by Tibshirani and Tibshirani (2009) and is subtracted.
knnreg.tune(y, x, nfolds = 10, A = 10, ncores = 1, res = "eucl", estim = "arithmetic", folds = NULL, seed = NULL, graph = FALSE)
knnreg.tune(y, x, nfolds = 10, A = 10, ncores = 1, res = "eucl", estim = "arithmetic", folds = NULL, seed = NULL, graph = FALSE)
y |
The currently available data, the response variables values. A matrix with either euclidean (univariate or multivariate) or (hyper-)spherical data. If you have a circular response, say u, transform it to a unit vector via (cos(u), sin(u)). |
x |
The currently available data, the predictor variables values. A matrix with either euclidean (univariate or multivariate) or (hyper-)spherical data. If you have a circular response, say u, transform it to a unit vector via (cos(u), sin(u)). |
nfolds |
How many folds to create? |
A |
The maximum number of nearest neighbours, set to 10 by default, starting from the 2nd nearest neighbor. |
ncores |
How many cores to use. This is taken into account only when the predictor variables are spherical. |
res |
The type of the response variable. If it is Euclidean, set this argument equal to "res". If it is a unit vector set it to res="spher". |
estim |
Once the k observations whith the smallest distance are discovered, what should the prediction be? The arithmetic average of the corresponding y values be used estim="arithmetic" or their harmonic average estim="harmonic". |
folds |
Do you already have a list with the folds? If not, leave this NULL. |
seed |
You can specify your own seed number here or leave it NULL. |
graph |
If this is TRUE a graph with the results will appear. |
Tuning of the k-NN regression with Euclidean or (hyper-)spherical response and or predictor variables. It estimates the percentage of correct classification via an m-fold cross valdiation. The bias is estimated as well using the algorithm suggested by Tibshirani and Tibshirani (2009) and is subtracted. The sum of squares of prediction is used in the case of Euclidean responses. In the case of spherical responses the is calculated.
A list including:
crit |
The value of the criterion to minimise/maximise for all values of the nearest neighbours. |
best_k |
The best value of the nearest neighbours. |
performance |
The bias corrected optimal value of the criterion, along wit the estimated bias. For the case of Euclidean reponse this will be higher than the crit and for the case or spherical responses it will be lower than crit. |
runtime |
The run time of the algorithm. A numeric vector. The first element is the user time, the second element is the system time and the third element is the elapsed time. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
knn.reg, spher.reg, dirknn.tune
y <- iris[, 1] x <- iris[, 2:4] x <- x/ sqrt( rowSums(x^2) ) ## Euclidean response and spherical predictors knnreg.tune(y, x, A = 5, res = "eucl", estim = "arithmetic") y <- iris[, 1:3] y <- y/ sqrt( rowSums(y^2) ) ## Spherical response and Euclidean predictor x <- iris[, 2] knnreg.tune(y, x, A = 5, res = "spher", estim = "arithmetic")
y <- iris[, 1] x <- iris[, 2:4] x <- x/ sqrt( rowSums(x^2) ) ## Euclidean response and spherical predictors knnreg.tune(y, x, A = 5, res = "eucl", estim = "arithmetic") y <- iris[, 1:3] y <- y/ sqrt( rowSums(y^2) ) ## Spherical response and Euclidean predictor x <- iris[, 2] knnreg.tune(y, x, A = 5, res = "spher", estim = "arithmetic")
Two sample location test for (hyper-)spherical data.
spcauchy2test(y1, y2, B = 1) pkbd2test(y1, y2, B = 1) vmf2test(y1, y2, B = 1) sp2(y1, y2, tol = 1e-6) pk2(y1, y2, tol = 1e-6) vmf2(y1, y2, tol = 1e-6)
spcauchy2test(y1, y2, B = 1) pkbd2test(y1, y2, B = 1) vmf2test(y1, y2, B = 1) sp2(y1, y2, tol = 1e-6) pk2(y1, y2, tol = 1e-6) vmf2(y1, y2, tol = 1e-6)
y1 |
A matrix with the data in Euclidean coordinates, i.e. unit vectors. |
y2 |
A matrix with the data in Euclidean coordinates, i.e. unit vectors. |
B |
The number of bootstraps to perform. |
tol |
The tolerance value at which to terminate the iterations. |
A log-likelihood ratio based test for the equality of two location parameters, assuming that the data in each group follow the spherical Cauchy of the Poisson kernel-based distribution. Bootstrap is also offered.
For the von Mises-Fisher distribution we do the same, but for the mean direction.
The functions sp2() and pk2() estimate the common location of the two groups assuming unequal concentration parameters. These functions are used the compute the log-likelihood under the null hypothesis. So does the function vmf2(), but the mean direction.
The result of the spcauchy2test(), pkbd2test() and vmf2test() functions is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
The degree(s) of freedom of the test. |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
The result of the sp2(), pk2() and vmf2 functions is a list including:
mu |
The common location parameter, for both samples, under the null hypothesis. |
rho1 |
The concentration parameter of the first group, asusming a common location parameter. |
rho2 |
The concentration parameter of the second group, asusming a common location parameter. |
kappa11 |
The concentration parameter (assuming the von Mises-Fisher distribution) of the first group, asusming a common location parameter. |
kappa2 |
The concentration parameter (assuming the von Mises-Fisher distribution) of the second group, asusming a common location parameter. |
loglik |
The log-likelihood of the whole sample, asusming a common location (or mean direction) parameter. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected].
Kato S. and McCullagh P. (2020). Some properties of a Cauchy family on the sphere derived from the Mobius transformations. Bernoulli, 26(4): 3224–3248.
Golzy M. and Markatou M. (2020). Poisson kernel-based clustering on the sphere: convergence properties, identifiability, and a method of sampling. Journal of Computational and Graphical Statistics, 29(4): 758–770.
Tsagris M., Papastamoulis P. and Kato S. (2024). Directional data analysis using the spherical Cauchy and the Poisson kernel-based distribution. https://arxiv.org/pdf/2409.03292.
mu <- rvmf(2, rnorm(5), 3) y1 <- rspcauchy(60, mu[1, ], 0.4) y2 <- rspcauchy(30, mu[2, ], 0.8) spcauchy2test(y1, y2)
mu <- rvmf(2, rnorm(5), 3) y1 <- rspcauchy(60, mu[1, ], 0.4) y2 <- rspcauchy(30, mu[2, ], 0.8) spcauchy2test(y1, y2)
Hypothesis tests of uniformity for circular data.
kuiper(u, rads = FALSE, R = 1) watson(u, rads = FALSE, R = 1)
kuiper(u, rads = FALSE, R = 1) watson(u, rads = FALSE, R = 1)
u |
A numeric vector containing the circular data, which cna be expressed in degrees or radians. |
rads |
A boolean variable. If the data are in radians, put this TRUE. If the data are expressed in degrees make this FALSE. |
R |
If R = 1 the asymptotic p-value will be calcualted. If R is greater than 1 the bootstrap p-value is returned. |
The high concentration (hcf.circaov), log-likelihood ratio (lr.circaov), embedding approach (embed.circaov) or the non equal concentration parameters approach (het.circaov) is used.
This is an "htest"class object. Thus it returns a list including:
statistic |
The test statistic value. |
parameter |
This is usually the degrees of freedom of the test, but here this is "NA" because the asymptotic based p-value is computed in a different way or because bootstrap was employed. |
p.value |
The p-value of the test. |
alternative |
A character with the alternative hypothesis. |
method |
A character with the test used. |
data.name |
A character vector with two elements. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Jammalamadaka, S. Rao and SenGupta, A. (2001). Topics in Circular Statistics, pg. 153–55 (Kuiper's test) and pg. 156–157 (Watson's test).
rayleigh, ptest, vmf.mle, rvonmises
x <- rvonmises(n = 40, m = 2, k = 10) kuiper(x, rads = TRUE) watson(x, rads = TRUE) x <- rvonmises(40, m = 2, k = 0) kuiper(x, rads = TRUE) watson(x, rads = TRUE)
x <- rvonmises(n = 40, m = 2, k = 10) kuiper(x, rads = TRUE) watson(x, rads = TRUE) x <- rvonmises(40, m = 2, k = 0) kuiper(x, rads = TRUE) watson(x, rads = TRUE)
Kernel density estimation of circular data with a von Mises kernel.
vm.kde(u, h, thumb = "none", rads = TRUE)
vm.kde(u, h, thumb = "none", rads = TRUE)
u |
A numeric vector containing the data. |
h |
The bandwidth. |
thumb |
It can be either "none", so the bandwidth the user has set will be used, "tay" for the method of Taylor (2008) or "rot" for the method of Garcia-Portugues (2013). |
rads |
If the data are in radians, this should be TRUE and FALSE otherwise. |
The user has the option to use a bandwidth he/she has found in some way (cross-validation) or estimate it as Taylor (2008) or Garcia-Portugues (2013).
A list including:
h |
The bandwidth. If the user chose one of "tay" or "rot" the estimated bandwidth will be returned. |
f |
The kernel density estimate at the observed points. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou<[email protected]>.
Taylor, C. C. (2008). Automatic bandwidth selection for circular density estimation. Computational Statistics & Data Analysis, 52(7): 3493-3500.
Garcia Portugues, E. (2013). Exact risk improvement of bandwidth selectors for kernel density estimation with directional data. Electronic Journal of Statistics, 7, 1655-1685.
vmkde.tune, vmfkde.tune, vmf.kde
x <- rvonmises(100, 2.4, 10, rads = TRUE) hist(x, freq = FALSE) f1 <- vm.kde(x, h = 0.1, thumb = "rot", rads = TRUE)$f f2 <- vm.kde(x, h = 0.1, thumb = "tay", rads = TRUE)$f h <- vmkde.tune(x)[1] f3 <- vm.kde(x, h = h, thumb = "none", rads = TRUE)$f points(x, f1, col = 1) points(x, f2, col = 2) points(x, f3, col = 3)
x <- rvonmises(100, 2.4, 10, rads = TRUE) hist(x, freq = FALSE) f1 <- vm.kde(x, h = 0.1, thumb = "rot", rads = TRUE)$f f2 <- vm.kde(x, h = 0.1, thumb = "tay", rads = TRUE)$f h <- vmkde.tune(x)[1] f3 <- vm.kde(x, h = h, thumb = "none", rads = TRUE)$f points(x, f1, col = 1) points(x, f2, col = 2) points(x, f3, col = 3)
A von Mises-Fisher kernel is used for the non parametric density estimation.
vmf.kde(x, h, thumb = "none")
vmf.kde(x, h, thumb = "none")
x |
A matrix with unit vectors, i.e. the data being expressed in Euclidean cordinates. |
h |
The bandwidth to be used. |
thumb |
If this is "none", the given bandwidth is used. If it is "rot" the rule of thumb suggested by Garcia-Portugues (2013) is used. |
A von Mises-Fisher kernel is used for the non parametric density estimation.
A list including:
h |
The bandwidth used. |
f |
A vector with the kernel density estimate calculated for each of the data points. |
Michail Tsagris.
R implementation and documentation: Michail Tsagris [email protected] and Giorgos Athineou <[email protected]>.
Garcia Portugues, E. (2013). Exact risk improvement of bandwidth selectors for kernel density estimation with directional data. Electronic Journal of Statistics, 7, 1655–1685.
vmfkde.tune, vm.kde, vmf.mle, vmkde.tune
x <- rvmf(100, rnorm(5), 15) h <- vmfkde.tune(x)[1] f1 <- vmf.kde(x, h = h, thumb = "none") f2 <- vmf.kde(x, h = h, thumb = "rot") f1$h ; f2$h
x <- rvmf(100, rnorm(5), 15) h <- vmfkde.tune(x)[1] f1 <- vmf.kde(x, h = h, thumb = "none") f2 <- vmf.kde(x, h = h, thumb = "rot") f1$h ; f2$h