Title: | The Folding Test of Unimodality |
---|---|
Description: | Implements the folding test of unimodality. Given a dataset X (n samples in d dimensions), the test checks whether the distribution of the data is unimodal or multimodal. This package stems from the following research publication: Alban Siffer, Pierre-Alain Fouque, Alexandre Termier, and Christine Largouët. "Are your data gathered?" In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2210-2218. ACM, 2018. <doi:10.1145/3219819.3219994>. |
Authors: | Alban Siffer [aut, cre], Amossys [cph, fnd] |
Maintainer: | Alban Siffer <[email protected]> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2024-10-25 06:44:59 UTC |
Source: | CRAN |
Computes the folding ratio of the input data
folding.ratio(X)
X |
nxd matrix (n observations, d dimensions) |
the folding ratio
X = matrix(runif(n = 1000, min = 0., max = 1.), ncol = 1)
phi = folding.ratio(X)
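To make the quantity more concrete, here is a minimal base R sketch of a univariate folding ratio. It assumes the definitions from the cited paper (fold the data around a pivot s, then compare the variance of the folded data with the original variance, using s = cov(X, X^2) / (2 var(X)) as the pivot). The helper name `folding_ratio_1d` is hypothetical and this is not the package's implementation.

```r
# Hypothetical helper for the univariate case only (an assumption-based
# sketch, not the code used by folding.ratio).
folding_ratio_1d <- function(x) {
  # Approximate pivot, assuming s = cov(x, x^2) / (2 * var(x))
  s <- cov(x, x^2) / (2 * var(x))
  # Fold the data around the pivot and compare the two variances
  var(abs(x - s)) / var(x)
}

set.seed(42)
x <- runif(10000, min = 0, max = 1)
phi <- folding_ratio_1d(x)
print(phi)
```

Under these assumptions, uniform data on [0, 1] has a theoretical ratio of 1/4 (pivot 1/2, folded variance 1/48, original variance 1/12), so the printed estimate should be close to that value.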
Computes the folding statistics of the input data
folding.statistics(X)
X |
nxd matrix (n observations, d dimensions) |
the folding statistics
library(MASS)
mu = c(0,0)
# symmetric covariance matrix (unit variances, covariance 0.5)
Sigma = matrix(c(1,0.5,0.5,1), ncol = 2)
X = mvrnorm(n = 5000, mu = mu, Sigma = Sigma)
Phi = folding.statistics(X)
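The following base R sketch illustrates how such a statistic could be computed for d-dimensional data and how it separates a unimodal sample from a bimodal one. It assumes the definitions from the cited paper: approximate pivot s = (1/2) Sigma^{-1} cov(X, ||X||^2), folding ratio Var(||X - s||) / Tr(Sigma), and folding statistics (1 + d)^2 times the ratio, with values of at least 1 read as unimodal. The helper name `folding_statistics_sketch` is hypothetical and the package's own code may differ.

```r
# Hypothetical helper (assumption-based sketch, not folding.statistics itself).
folding_statistics_sketch <- function(X) {
  X <- as.matrix(X)
  d <- ncol(X)
  Sigma <- cov(X)                                # d x d sample covariance
  sq_norm <- rowSums(X^2)                        # squared norm of each row
  s <- 0.5 * solve(Sigma, cov(X, sq_norm))       # approximate pivot (d x 1)
  folded <- sqrt(rowSums(sweep(X, 2, as.vector(s))^2))  # distances to pivot
  phi <- var(folded) / sum(diag(Sigma))          # folding ratio
  (1 + d)^2 * phi                                # folding statistics
}

set.seed(1)
n <- 5000
Sigma0 <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
# Correlated Gaussian sample built with base R only
Z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(Sigma0)

X_uni <- Z                                       # one cluster
shift <- rep(c(-4, 4), each = n / 2)
X_bi <- Z + cbind(shift, shift)                  # two well-separated clusters

Phi_uni <- folding_statistics_sketch(X_uni)
Phi_bi <- folding_statistics_sketch(X_bi)
print(c(unimodal = Phi_uni, bimodal = Phi_bi))
```

Under these assumptions the single Gaussian cluster should give a value above 1 and the two-cluster mixture a value below 1.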
Perform the folding test of unimodality
folding.test(X)
X |
nxd matrix (n observations, d dimensions) |
1 if unimodal, 0 if multimodal
library(MASS)
n = 10000
d = 3
mu = c(0,0,0)
Sigma = matrix(c(1,0.5,0.5,0.5,1,0.5,0.5,0.5,1), ncol = d)
X = mvrnorm(n = n, mu = mu, Sigma = Sigma)
m = folding.test(X)
Computes the confidence bound for the significance level p
folding.test.bound(n, d, p)
n |
sample size |
d |
dimension |
p |
significance level (between 0 and 1, the lower, the more significant) |
the confidence bound q (the bounds are 1-q and 1+q)
n = 2000 # number of observations
d = 2 # 2 dimensional data
p = 0.05 # we want the bound at the level 0.05 (classical p-value)
q = folding.test.bound(n,d,p)
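As a rough illustration of how the bound might be read, the sketch below compares a folding statistics value with the interval [1-q, 1+q]. It assumes, following the cited paper, that values of at least 1 indicate unimodality, and that a value outside the interval is significant at the chosen level. The helper name `interpret_folding` and the numeric inputs are hypothetical; in practice `Phi` would come from `folding.statistics` and `q` from `folding.test.bound`.

```r
# Hypothetical helper: read a folding statistics value against a bound q.
interpret_folding <- function(Phi, q) {
  list(
    unimodal = Phi >= 1,             # assumed decision rule: Phi >= 1 is unimodal
    significant = abs(Phi - 1) > q   # outside [1 - q, 1 + q]
  )
}

# Illustrative values only, not outputs of the package
res <- interpret_folding(Phi = 0.41, q = 0.1)
print(res)
```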
Computes the p-value of the folding test
folding.test.pvalue(Phi, n, d)
Phi |
the folding statistics |
n |
sample size |
d |
dimension |
the p-value (the lower, the more significant)
library(MASS)
n = 5000
d = 2
mu = c(0,0)
# symmetric covariance matrix (unit variances, covariance 0.5)
Sigma = matrix(c(1,0.5,0.5,1), ncol = d)
X = mvrnorm(n = n, mu = mu, Sigma = Sigma)
Phi = folding.statistics(X)
p = folding.test.pvalue(Phi,n,d)
Computes the approximate pivot
pivot.approx(X)
X |
nxd matrix (n observations, d dimensions) |
the approximate pivot
library(MASS)
mu = c(0,0)
# symmetric covariance matrix (unit variances, covariance 0.5)
Sigma = matrix(c(1,0.5,0.5,1), ncol = 2)
X = mvrnorm(n = 5000, mu = mu, Sigma = Sigma)
s = pivot.approx(X)
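For intuition about what the pivot represents, the base R sketch below computes an approximate pivot on a shifted Gaussian sample. It assumes the closed form from the cited paper, s = (1/2) Sigma^{-1} cov(X, ||X||^2). The helper name `pivot_sketch` is hypothetical and it is not guaranteed to match `pivot.approx` exactly.

```r
# Hypothetical helper (assumption-based sketch, not pivot.approx itself).
pivot_sketch <- function(X) {
  X <- as.matrix(X)
  # 0.5 * Sigma^{-1} * cov(X, ||X||^2), returned as a plain vector of length d
  as.vector(0.5 * solve(cov(X), cov(X, rowSums(X^2))))
}

set.seed(7)
n <- 5000
mu <- c(3, -2)
Sigma0 <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
# Correlated Gaussian sample centered at mu, built with base R only
Z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(Sigma0)
X <- sweep(Z, 2, mu, FUN = "+")

s <- pivot_sketch(X)
print(s)
```

Under this formula, a symmetric distribution such as a Gaussian has its pivot at its mean, so the estimate should land close to `mu`.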