Title: | Binary Expansion Testing |
---|---|
Description: | Nonparametric detection of nonuniformity and dependence with Binary Expansion Testing (BET). See Kai Zhang (2019) BET on Independence, Journal of the American Statistical Association, 114:528, 1620-1637, <DOI:10.1080/01621459.2018.1537921>, Kai Zhang, Wan Zhang, Zhigen Zhao, Wen Zhou. (2023). BEAUTY Powered BEAST, <doi:10.48550/arXiv.2103.00674> and Wan Zhang, Zhigen Zhao, Michael Baiocchi, Yao Li, Kai Zhang. (2023) SorBET: A Fast and Powerful Algorithm to Test Dependence of Variables, Technical report. |
Authors: | Wan Zhang [aut, cre], Zhigen Zhao [aut], Michael Baiocchi [aut], Kai Zhang [aut] |
Maintainer: | Wan Zhang <[email protected]> |
License: | GPL |
Version: | 0.5.4 |
Built: | 2024-12-09 06:50:39 UTC |
Source: | CRAN |
BEAST
(Binary Expansion Adaptive Symmetry Test) is used for nonparametric detection of nonuniformity or dependence.
BEAST(
  X,
  dep,
  subsample.percent = 1/2,
  B = 100,
  unif.margin = FALSE,
  lambda = NULL,
  index = list(c(1:ncol(X))),
  method = "p",
  num = NULL
)
X |
a matrix to be tested. |
dep |
depth of the binary expansion. |
subsample.percent |
sample size for subsampling, given as a proportion of the full sample. |
B |
number of subsamples drawn. |
unif.margin |
logicals. If |
lambda |
tuning parameter for soft-thresholding, default to be |
index |
a list of indices. If provided, test the independence among two or more groups of variables. For example, index = list(c(1,2), c(3)) tests the independence between columns (1, 2) and column 3 of X. |
method |
If "p", the null distribution is computed with permutations; if "s", with simulations. |
num |
number of permutations if method == "p" (default to be 100), or simulations if method == "s" (default to be 1000). |
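The lambda argument above is described as a tuning parameter for soft-thresholding. As a rough illustration of what soft-thresholding does in general, and not a copy of the package's internal code, the sketch below shrinks each value toward zero by lambda and sets anything smaller in magnitude to zero. The function name soft_threshold and the input values are hypothetical.

```r
# Generic soft-thresholding operator (illustrative; not taken from the BET package).
# Values with |x| <= lambda become 0; the rest move toward 0 by lambda.
soft_threshold <- function(x, lambda) {
  sign(x) * pmax(abs(x) - lambda, 0)
}

stats <- c(-0.30, -0.05, 0.02, 0.25)   # hypothetical statistics
soft_threshold(stats, lambda = 0.1)
```

A larger lambda zeroes out more of the small statistics, so only the stronger signals survive.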
Interaction |
the most frequent interaction among all subsamples. |
BEAST.Statistic |
BEAST statistic. |
Null.Distribution |
simulated null distribution. |
p.value |
simulated p-value. |
## Elapsed times 7.32 secs
## Measured in R 4.0.2, 32 bit, on a processor 3.3 GHz 6-Core Intel Core i5 under MacOS, 2024/9/6
## Not run:
x1 = runif(128)
x2 = runif(128)
y = sin(4*pi*(x1 + x2)) + 0.8*rnorm(128)

## test independence between (x1, x2) and y
BEAST(cbind(x1, x2, y), 3, index = list(c(1,2), c(3)))

## test mutual independence among x1, x2 and y
BEAST(cbind(x1, x2, y), 3, index = list(1, 2, 3))

## test bivariate uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
BEAST(cbind(x1, x2), 3)

## test multivariate uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
x3 = rbeta(128, 2, 4)
BEAST(cbind(x1, x2, x3), 3)
## End(Not run)
The BET
package provides functions for nonparametric detection of nonuniformity and dependence with Binary Expansion Testing (BET).
MaxBET
symm
get.signs
cell.counts
bet.plot
MaxBETs
BEAST
Kai Zhang (2019) BET on Independence, Journal of the American Statistical Association, 114:528, 1620-1637, doi:10.1080/01621459.2018.1537921, Kai Zhang, Zhigen Zhao, and Wen Zhou (2021). BEAUTY Powered BEAST, <arXiv:2103.00674> and Wan Zhang, Zhigen Zhao, Michael Baiocchi, Yao Li, Kai Zhang. SorBET: A Fast and Powerful Algorithm to Test Dependence of Variables. Technical report, 2023.
bet.plot
shows the cross interaction of the strongest asymmetry, which the BET returns with the rejection of independence null.
This function only works for the test on two variables, that is, X
can only have two columns.
At depth $d$, there are $2^{2d} - 1$ nontrivial binary variables in the $\sigma$-field, and $(2^d - 1)^2$ of them are cross interactions, whose positive regions are plotted in white and whose negative regions are plotted in blue.
bet.plot
shows the cross interaction for which the difference between the numbers of observations in the positive and negative regions is largest.
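The counts of interactions for two variables can be checked by brute force. The sketch below is an illustration under one assumption: each interaction is encoded by a pair of integer codes, one per variable, each standing for a d-bit binary index, with code 0 meaning that no binary digit of that variable is used. The function name count_interactions is hypothetical and not part of the package.

```r
# Enumerate interactions of two variables at depth d (illustrative sketch).
# An interaction is "cross" when it uses at least one digit from each variable.
count_interactions <- function(d) {
  idx <- 0:(2^d - 1)                        # integer codes of one variable's d-bit index
  grid <- expand.grid(a = idx, b = idx)     # all pairs of codes
  c(
    nontrivial = sum(grid$a != 0 | grid$b != 0),  # at least one digit used
    cross      = sum(grid$a != 0 & grid$b != 0)   # digits used from both variables
  )
}

count_interactions(3)   # 63 nontrivial interactions, 49 of them cross interactions
```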
## S3 method for class 'plot'
bet(X, dep, unif.margin = FALSE, cex = 0.5, index = list(c(1:ncol(X))), ...)
X |
a matrix with two columns. |
dep |
depth of BET. |
unif.margin |
logicals. If |
cex |
number indicating the amount by which plotting text and symbols should be scaled relative to the default. |
index |
a list of indices. If provided, test the independence among two or more groups of variables. For example, index = list(1, 2) tests the independence between the two columns of X. |
... |
graphical parameters to plot |
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
bet.plot(cbind(X1, X2), 3, index = list(1,2))
cell.counts
returns the number of data points in each cell obtained from the binary expansion.
cell.counts(X, dep, unif.margin = FALSE)
X |
a matrix to be tested. |
dep |
depth of the marginal binary expansions. |
unif.margin |
logicals. If |
The result is a dataframe with $2^{pd}$ rows and 2 columns, where $p$ is the number of columns of X and $d$ is the depth dep. The first column is the binary index, the second column is the number of data points.
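To make the idea of cells concrete, here is a minimal base-R sketch of counting points per cell. It is an illustration, not the package's implementation: it assumes each column is first rank-transformed (an empirical CDF transformation) and then cut into 2^dep equal intervals. The function name cell_counts_sketch, the column names BinaryIndex and Count, and the ":" separator are all hypothetical.

```r
# Illustrative cell counting via marginal binary expansions (not the BET code).
cell_counts_sketch <- function(X, dep) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  # Interval index in 0..(2^dep - 1) for each coordinate, based on ranks
  cell <- apply(X, 2, function(col) {
    r <- rank(col, ties.method = "first")
    floor((r - 1) * 2^dep / n)
  })
  cell <- matrix(cell, nrow = n)
  # dep-bit binary string of an interval index, most significant bit first
  to_bits <- function(k) {
    paste(rev(as.integer(intToBits(k))[seq_len(dep)]), collapse = "")
  }
  label <- function(row) paste(vapply(row, to_bits, character(1)), collapse = ":")
  observed <- apply(cell, 1, label)
  # All 2^(p * dep) possible cells, so that empty cells are reported as 0
  all_idx <- expand.grid(rep(list(0:(2^dep - 1)), p))
  all_labels <- apply(all_idx, 1, label)
  counts <- table(factor(observed, levels = all_labels))
  data.frame(BinaryIndex = all_labels, Count = as.integer(counts),
             stringsAsFactors = FALSE)
}

set.seed(1)
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
head(cell_counts_sketch(cbind(X1, X2), 2))
```

With two columns and depth 2, the sketch returns 16 rows whose counts sum to the sample size.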
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
cell.counts(cbind(X1, X2), 3)
get.signs
returns the signs of colors for each point under all interactions up to depth d in marginal binary expansions for the tests BET and BETs.
get.signs(X, dep, unif.margin = FALSE)
X |
a matrix to be tested. |
dep |
depth of the marginal binary expansions. |
unif.margin |
logicals. If |
The result is a dataframe with rows and
columns, where
is the number of columns of
X
and is the number of rows of
X
. The values of or
stand for the sign of color, while the marginal interactions return
.
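As a rough picture of what a sign per point and per interaction means, the sketch below codes each binary digit of a rank-transformed column as +1 or -1 and multiplies the digits selected by an interaction. This is an assumption-laden illustration, not the package's code: the function name signs_sketch, the rank transformation, and the column ordering are all hypothetical, and the output layout may differ from what get.signs returns.

```r
# Illustrative +1/-1 signs of every interaction up to depth dep (not the BET code).
signs_sketch <- function(X, dep) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  # n x (p * dep) matrix: +1/-1 coding of the first dep binary digits per column
  digits <- do.call(cbind, lapply(seq_len(p), function(j) {
    u <- (rank(X[, j], ties.method = "first") - 0.5) / n   # values in (0, 1)
    sapply(seq_len(dep), function(k) 2 * (floor(u * 2^k) %% 2) - 1)
  }))
  m <- p * dep
  # Each nonempty subset of the m digits defines one interaction
  sapply(seq_len(2^m - 1), function(s) {
    idx <- which((s %/% 2^(0:(m - 1))) %% 2 == 1)
    apply(digits[, idx, drop = FALSE], 1, prod)
  })
}

set.seed(1)
S <- signs_sketch(cbind(runif(64), runif(64)), dep = 2)
dim(S)   # 64 rows, 2^(2 * 2) - 1 = 15 interaction columns
```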
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
get.signs(cbind(X1, X2), 3)
MaxBET
stands for Binary Expansion Testing. It is used for nonparametric detection of nonuniformity or dependence. It can be used to test whether a column vector is [0, 1]-uniformly distributed. It can also be used to detect dependence between the columns of a matrix X, if X has more than one column.
MaxBET(
  X,
  dep,
  unif.margin = FALSE,
  asymptotic = TRUE,
  plot = FALSE,
  index = list(c(1:ncol(X)))
)
X |
a matrix to be tested. When |
dep |
depth of the binary expansion. |
unif.margin |
logicals. If |
asymptotic |
logicals. If |
plot |
logicals. If |
index |
a list of indices. If provided, test the independence among two or more groups of variables. For example, index = list(c(1,2), c(3)) tests the independence between columns (1, 2) and column 3 of X. |
MaxBET
tests the independence or uniformity by considering the maximal magnitude of the symmetry statistics in the $\sigma$-field generated from marginal binary expansions at the depth d.
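The idea of taking the largest symmetry statistic and correcting for the number of interactions can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the package's implementation: it handles only two variables whose margins are already Uniform[0, 1], uses a normal approximation for each symmetry statistic, and applies a Bonferroni factor equal to the number of cross interactions. The function name max_bet_sketch is hypothetical; the output names merely echo the value names listed for MaxBET.

```r
# Illustrative max-type test on cross interactions (assumes known uniform margins).
max_bet_sketch <- function(u1, u2, dep) {
  n <- length(u1)
  # +1/-1 coding of the k-th binary digit of u
  digit <- function(u, k) 2 * (floor(u * 2^k) %% 2) - 1
  # Products over every nonempty subset of the dep digits of one variable
  margin_terms <- function(u) {
    sapply(seq_len(2^dep - 1), function(s) {
      ks <- which((s %/% 2^(0:(dep - 1))) %% 2 == 1)
      Reduce(`*`, lapply(ks, function(k) digit(u, k)))
    })
  }
  A <- margin_terms(u1)
  B <- margin_terms(u2)
  # Symmetry statistic of each cross interaction: sum over points of the sign
  S <- crossprod(A, B)
  z <- max(abs(S)) / sqrt(n)
  list(z.statistic = z,
       p.value.bonf = min(1, 2 * pnorm(-z) * length(S)))
}

set.seed(1)
u <- runif(128)
max_bet_sketch(u, runif(128), dep = 3)   # independent pair
max_bet_sketch(u, u, dep = 3)            # perfectly dependent pair
```

In the perfectly dependent case the largest symmetry statistic equals the sample size, so the adjusted p-value is essentially zero.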
Interaction |
a dataframe with |
Extreme.Asymmetry |
the extreme asymmetry statistics. |
p.value.bonf |
p-value of the test with Bonferroni adjustment. |
z.statistic |
normal approximation of the test statistic. |
## test mutual independence
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
MaxBET(cbind(X1, X2), 3, asymptotic = FALSE, index = list(1,2))

## test independence between (x1, x2) and y
x1 = runif(128)
x2 = runif(128)
y = sin(4*pi*(x1 + x2)) + 0.4*rnorm(128)
MaxBET(cbind(x1, x2, y), 3, index = list(c(1,2), c(3)))

## test uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
x3 = rbeta(128, 2, 4)
MaxBET(cbind(x1, x2, x3), 3)
MaxBETs
is used for nonparametric dependence detection.
Extended from BET, for a chosen maximal depth d.max, MaxBETs does a sequential test at every depth $d$ from 1 up to d.max and avoids overlapping symmetry statistics in different depths. The adjustment is done by multiplying by the number of interactions which are in the $\sigma$-field generated by marginal binary expansions at depth $d$ but not in that at depth $d - 1$.
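The size of the per-depth adjustment can be worked out for a simple case. The sketch below assumes two variables and counts only cross interactions, so that depth d contributes (2^d - 1)^2 - (2^(d-1) - 1)^2 new interactions. Whether the package counts exactly these interactions is an assumption, and the function names new_cross_interactions and adjust_by_depth, as well as the raw p-values, are hypothetical.

```r
# Number of cross interactions of two variables that first appear at depth d
# (illustrative; assumes the adjustment counts cross interactions only).
new_cross_interactions <- function(d) {
  (2^d - 1)^2 - (2^(d - 1) - 1)^2
}

sapply(1:4, new_cross_interactions)   # 1 8 40 176

# Multiply the smallest raw p-value found at each depth by the number of
# interactions that are new at that depth, capping at 1.
adjust_by_depth <- function(p_raw) {
  d <- seq_along(p_raw)
  pmin(1, p_raw * new_cross_interactions(d))
}

adjust_by_depth(c(0.2, 0.01, 0.0001))   # hypothetical raw p-values at depths 1 to 3
```

The counts add up across depths: 1 + 8 + 40 = 49 = (2^3 - 1)^2, so no interaction is counted twice.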
MaxBETs(
  X,
  d.max = 4,
  unif.margin = FALSE,
  asymptotic = TRUE,
  plot = FALSE,
  index = list(c(1:ncol(X)))
)
X |
a matrix to be tested. When |
d.max |
the maximal depth of the binary expansion. |
unif.margin |
logicals. If |
asymptotic |
logicals. If |
plot |
logicals. If |
index |
a list of indices. If provided, test the independence among two or more groups of variables, for example, index = list(c(1,2), c(3)) tests the independence between columns (1, 2) and column 3 of X. |
bet.s.pvalue.bonf |
the overall p-value of the test. |
bet.s.index |
the interaction at which the p-value is minimal. |
bet.s.zstatistic |
normal approximation of the test statistic. |
## test mutual independence
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
MaxBETs(cbind(X1, X2), 3, asymptotic = FALSE, index = list(1,2))

## test independence between (x1, x2) and y
x1 = runif(128)
x2 = runif(128)
y = sin(4*pi*(x1 + x2)) + 0.4*rnorm(128)
MaxBETs(cbind(x1, x2, y), 3, index = list(c(1,2), c(3)))

## test uniformity
x1 = rbeta(128, 2, 4)
x2 = rbeta(128, 2, 4)
x3 = rbeta(128, 2, 4)
MaxBETs(cbind(x1, x2, x3), 3)
This data set collects the galactic coordinates of the 256 brightest stars in the night sky (Perryman et al. 1997). We consider the longitude (x) and sine latitude (y) here.
data(star)
An object of class data.frame with 256 rows and 2 columns.
data(star)
MaxBETs(cbind(star$x.raw, star$y.raw), asymptotic = FALSE, plot = TRUE, index = list(1,2))
symm
returns all the symmetry statistics up to depth d in marginal binary expansions for the tests BET and BETs.
symm(
  X,
  dep,
  unif.margin = FALSE,
  print.sample.size = TRUE
)
X |
a matrix to be tested. |
dep |
depth of the marginal binary expansions. |
unif.margin |
logicals. If |
print.sample.size |
logicals. If |
The result is a dataframe with $p + 2$ columns, where $p$ is the number of columns of X. The first column gives the binary index for all variables, the next $p$ columns display the interactions of the respective variables, and the last column, Statistics, gives the respective symmetry statistic.
v <- runif(128, -pi, pi)
X1 <- cos(v) + 2.5 * rnorm(128, 0, 1/20)
X2 <- sin(v) + 2.5 * rnorm(128, 0, 1/20)
symm(cbind(X1, X2), 3)