Title: | Power-Enhanced (PE) Tests for High-Dimensional Data |
---|---|
Description: | Two-sample power-enhanced mean tests, covariance tests, and simultaneous tests on mean vectors and covariance matrices for high-dimensional data. Methods of these PE tests are presented in Yu, Li, and Xue (2022) <doi:10.1080/01621459.2022.2126781>; Yu, Li, Xue, and Li (2022) <doi:10.1080/01621459.2022.2061354>. |
Authors: | Xiufan Yu [aut, cre], Danning Li [aut], Lingzhou Xue [aut], Runze Li [aut] |
Maintainer: | Xiufan Yu <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.0 |
Built: | 2024-12-03 06:43:16 UTC |
Source: | CRAN |
The package implements several two-sample power-enhanced mean tests, covariance tests, and simultaneous tests on mean vectors and covariance matrices for high-dimensional data.
There are three main functions: covtest, meantest, and simultest.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835. doi:10.1214/09-AOS716
Cai, T. T., Liu, W., and Xia, Y. (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. Journal of the American Statistical Association, 108(501):265–277. doi:10.1080/01621459.2012.758041
Cai, T. T., Liu, W., and Xia, Y. (2014). Two-sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B: Statistical Methodology, 76(2):349–372. doi:10.1111/rssb.12034
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940. doi:10.1214/12-AOS993
Yu, X., Li, D., and Xue, L. (2022). Fisher’s combined probability test for high-dimensional covariance matrices. Journal of the American Statistical Association, (in press):1–14. doi:10.1080/01621459.2022.2126781
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14. doi:10.1080/01621459.2022.2061354
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
covtest(X, Y)
meantest(X, Y)
simultest(X, Y)
This function implements five two-sample covariance tests on high-dimensional
covariance matrices.
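Let $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^p$ be two $p$-dimensional populations with mean vectors $(\mu_1, \mu_2)$ and covariance matrices $(\Sigma_1, \Sigma_2)$, respectively.
The problem of interest is to test the equality of the two covariance matrices:
$$H_{0c}: \Sigma_1 = \Sigma_2.$$
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$. We denote
dataX $= (X_1, \ldots, X_{n_1})^\top \in \mathbb{R}^{n_1 \times p}$ and
dataY $= (Y_1, \ldots, Y_{n_2})^\top \in \mathbb{R}^{n_2 \times p}$.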
covtest(dataX,dataY,method='pe.comp',delta=NULL)
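dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
method | the method type (default = 'pe.comp'); one of 'clx', 'lc', 'pe.cauchy', 'pe.comp', 'pe.fisher', corresponding to covtest.clx, covtest.lc, covtest.pe.cauchy, covtest.pe.comp, and covtest.pe.fisher |
delta | the thresholding value. This is needed only in method='pe.comp'; see covtest.pe.comp for details. The default is NULL. |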
method
the method type
stat
the value of test statistic
pval
the p-value for the test.
Cai, T. T., Liu, W., and Xia, Y. (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. Journal of the American Statistical Association, 108(501):265–277.
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940.
Yu, X., Li, D., and Xue, L. (2022). Fisher’s combined probability test for high-dimensional covariance matrices. Journal of the American Statistical Association, (in press):1–14.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
covtest(X,Y)
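This function implements the two-sample $\ell_\infty$-norm-based
high-dimensional covariance test proposed in Cai, Liu and Xia (2013).
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$. The test statistic is defined as
$$T_{CLX} = \max_{1 \le i, j \le p} \frac{(\hat\sigma_{ij1} - \hat\sigma_{ij2})^2}{\hat\theta_{ij1}/n_1 + \hat\theta_{ij2}/n_2},$$
where $\hat\sigma_{ij1}$ and $\hat\sigma_{ij2}$ are the sample covariances,
and $\hat\theta_{ij1}/n_1 + \hat\theta_{ij2}/n_2$ estimates the variance of $\hat\sigma_{ij1} - \hat\sigma_{ij2}$.
The explicit formulas of $\hat\sigma_{ij1}$, $\hat\sigma_{ij2}$, $\hat\theta_{ij1}$ and $\hat\theta_{ij2}$ can be found
in Section 2 of Cai, Liu and Xia (2013).
With some regularity conditions, under the null hypothesis $H_{0c}: \Sigma_1 = \Sigma_2$,
the test statistic $T_{CLX} - 4\log p + \log\log p$ converges in distribution to
a Gumbel distribution $G_{cov}(x) = \exp\left(-\frac{1}{\sqrt{8\pi}} \exp\left(-\frac{x}{2}\right)\right)$
as $n_1, n_2, p \to \infty$.
The asymptotic $p$-value is obtained by
$$p_{CLX} = 1 - G_{cov}(T_{CLX} - 4\log p + \log\log p).$$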
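As a rough illustration of this last step, the Gumbel-based $p$-value can be computed in base R once the max-type statistic is available. This is only a sketch: the helper name clx_cov_pvalue is hypothetical (it is not part of PEtests), and it assumes the centring $4\log p - \log\log p$ and the limiting cdf $\exp(-\exp(-x/2)/\sqrt{8\pi})$ from Cai, Liu and Xia (2013).

```r
# Hypothetical helper (not exported by PEtests): asymptotic p-value of the
# CLX covariance test, given an already computed max-type statistic `stat`
# and the dimension `p`.
clx_cov_pvalue <- function(stat, p) {
  x <- stat - 4 * log(p) + log(log(p))            # centred statistic
  gumbel_cdf <- exp(-exp(-x / 2) / sqrt(8 * pi))  # assumed limiting cdf
  1 - gumbel_cdf
}

# Larger statistics give smaller p-values
clx_cov_pvalue(stat = 20, p = 500)
clx_cov_pvalue(stat = 30, p = 500)
```

In practice covtest.clx returns both stat and pval, so a helper like this is mainly useful for checking how the centring depends on $p$.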
covtest.clx(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Cai, T. T., Liu, W., and Xia, Y. (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. Journal of the American Statistical Association, 108(501):265–277.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
covtest.clx(X,Y)
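This function implements the two-sample $\ell_2$-norm-based high-dimensional covariance test
proposed by Li and Chen (2012).
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$. The test statistic $T_{LC}$ is defined as
$$T_{LC} = A_{n_1} + B_{n_2} - 2 C_{n_1, n_2},$$
where $A_{n_1}$, $B_{n_2}$, and $C_{n_1, n_2}$ are unbiased estimators for
$\mathrm{tr}(\Sigma_1^2)$, $\mathrm{tr}(\Sigma_2^2)$, and $\mathrm{tr}(\Sigma_1 \Sigma_2)$, respectively.
Under the null hypothesis $H_{0c}: \Sigma_1 = \Sigma_2 = \Sigma$,
the leading variance of $T_{LC}$ is
$\sigma^2_{T_{LC}} = 4\left(\frac{1}{n_1} + \frac{1}{n_2}\right)^2 \mathrm{tr}^2(\Sigma^2)$,
which can be consistently estimated by $\hat\sigma^2_{T_{LC}}$.
The explicit formulas of $A_{n_1}$, $B_{n_2}$, $C_{n_1, n_2}$ and $\hat\sigma^2_{T_{LC}}$ can be found in
Equations (2.1), (2.2) and Theorem 1 of Li and Chen (2012).
With some regularity conditions, under the null hypothesis $H_{0c}: \Sigma_1 = \Sigma_2$,
the test statistic $T_{LC}/\hat\sigma_{T_{LC}}$ converges in distribution to a standard normal distribution
as $n_1, n_2, p \to \infty$.
The asymptotic $p$-value is obtained by
$$p_{LC} = 1 - \Phi(T_{LC}/\hat\sigma_{T_{LC}}),$$
where $\Phi(\cdot)$ is the cdf of the standard normal distribution.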
covtest.lc(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
covtest.lc(X,Y)
This function implements the two-sample PE covariance test via
Cauchy combination.
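Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $p_{LC}$ and $p_{CLX}$ denote the $p$-values associated with
the $\ell_2$-norm-based covariance test (see covtest.lc for details)
and the $\ell_\infty$-norm-based covariance test (see covtest.clx for details), respectively.
The PE covariance test via Cauchy combination is defined as
$$T_{Cauchy} = \frac{1}{2}\tan((0.5 - p_{LC})\pi) + \frac{1}{2}\tan((0.5 - p_{CLX})\pi).$$
It has been proved that with some regularity conditions, under the null hypothesis $H_{0c}: \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $T_{Cauchy}$ asymptotically converges in distribution to a standard Cauchy distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{Cauchy}(T_{Cauchy}),$$
where $F_{Cauchy}(\cdot)$ is the cdf of the standard Cauchy distribution.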
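The combination step itself needs only the two individual $p$-values. A minimal base R sketch, assuming the equal-weight form $\frac{1}{2}\tan((0.5 - p_1)\pi) + \frac{1}{2}\tan((0.5 - p_2)\pi)$; the function name cauchy_combine is hypothetical and not part of PEtests:

```r
# Hypothetical helper: equal-weight Cauchy combination of two p-values.
cauchy_combine <- function(p1, p2) {
  stat <- 0.5 * tan((0.5 - p1) * pi) + 0.5 * tan((0.5 - p2) * pi)
  pval <- pcauchy(stat, lower.tail = FALSE)  # 1 - F_Cauchy(stat)
  list(stat = stat, pval = pval)
}

# A small p-value from either test drives the combined p-value down
cauchy_combine(0.40, 0.001)
```

When both inputs are equal, the combined $p$-value equals that common value, which gives a quick sanity check of the implementation.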
covtest.pe.cauchy(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., and Xue, L. (2022). Fisher’s combined probability test for high-dimensional covariance matrices. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
covtest.pe.cauchy(X,Y)
This function implements the two-sample PE covariance test via the
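construction of the PE component. Let $T_{LC}/\hat\sigma_{T_{LC}}$ denote the
$\ell_2$-norm-based covariance test statistic (see covtest.lc for details).
The PE component is constructed by
$$J_c = \sqrt{p} \sum_{i=1}^{p} \sum_{j=1}^{p} T_{ij} \hat\xi_{ij}^{-1/2} \, \mathcal{I}\left\{ \sqrt{2}\, T_{ij} \hat\xi_{ij}^{-1/2} + 1 > \delta_{cov} \right\},$$
where $\delta_{cov}$ is a threshold for the screening procedure,
recommended to take the value of $\delta_{cov} = 4\log(\log(n_1 + n_2))\log p$.
The explicit forms of $T_{ij}$ and $\hat\xi_{ij}$ can be found in Section 3.2 of Yu et al. (2022).
The PE covariance test statistic is defined as
$$T_{PE} = T_{LC}/\hat\sigma_{T_{LC}} + J_c.$$
With some regularity conditions, under the null hypothesis $H_{0c}: \Sigma_1 = \Sigma_2$,
the test statistic $T_{PE}$ converges in distribution to
a standard normal distribution as $n_1, n_2, p \to \infty$.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - \Phi(T_{PE}),$$
where $\Phi(\cdot)$ is the cdf of the standard normal distribution.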
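The screening step can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's internal code: it assumes the PE component is the thresholded sum $\sqrt{p} \sum T_{ij} \hat\xi_{ij}^{-1/2} \mathcal{I}\{\sqrt{2}\, T_{ij} \hat\xi_{ij}^{-1/2} + 1 > \delta\}$, and the function pe_component with its inputs t_stat and xi_hat (entry-wise statistics and their variance estimates, taken as given) is hypothetical.

```r
# Hypothetical helper: thresholded PE component from entry-wise statistics.
# t_stat : vector of entry-wise statistics (assumed precomputed)
# xi_hat : vector of matching variance estimates (assumed precomputed)
# delta  : screening threshold
# p      : dimension of the data
pe_component <- function(t_stat, xi_hat, delta, p) {
  std <- t_stat / sqrt(xi_hat)         # standardised entries
  keep <- sqrt(2) * std + 1 > delta    # entries that pass the screening
  sqrt(p) * sum(std[keep])             # zero when nothing passes
}

# Only the second entry exceeds the threshold here
pe_component(t_stat = c(0.1, 50), xi_hat = c(1, 1), delta = 10, p = 4)
```

Under the null hypothesis the threshold is meant to screen out every entry, so the component contributes nothing and the limiting distribution of the base statistic is unchanged.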
covtest.pe.comp(dataX,dataY,delta=NULL)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
delta | a scalar; the thresholding value used in the construction of the PE component. If not specified, the function uses a default value $\delta_{cov} = 4\log(\log(n_1 + n_2))\log p$. |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
covtest.pe.comp(X,Y)
This function implements the two-sample PE covariance test via
Fisher's combination.
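Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $p_{LC}$ and $p_{CLX}$ denote the $p$-values associated with
the $\ell_2$-norm-based covariance test (see covtest.lc for details)
and the $\ell_\infty$-norm-based covariance test (see covtest.clx for details), respectively.
The PE covariance test via Fisher's combination is defined as
$$T_{Fisher} = -2\log(p_{LC}) - 2\log(p_{CLX}).$$
It has been proved that with some regularity conditions, under the null hypothesis $H_{0c}: \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $T_{Fisher}$ asymptotically converges in distribution to a $\chi_4^2$ distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{\chi_4^2}(T_{Fisher}),$$
where $F_{\chi_4^2}(\cdot)$ is the cdf of the $\chi_4^2$ distribution.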
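Fisher's combination of two $p$-values is easy to reproduce in base R. The sketch below assumes the statistic $-2\log(p_1) - 2\log(p_2)$ with a chi-squared reference distribution on 4 degrees of freedom; fisher_combine is a hypothetical name, not a PEtests function.

```r
# Hypothetical helper: Fisher's combination of two p-values.
fisher_combine <- function(p1, p2) {
  stat <- -2 * log(p1) - 2 * log(p2)
  pval <- pchisq(stat, df = 4, lower.tail = FALSE)  # 1 - F_chisq4(stat)
  list(stat = stat, pval = pval)
}

# Two moderately small p-values reinforce each other
fisher_combine(0.04, 0.06)
```

Compared with the Cauchy combination, Fisher's rule rewards evidence that is spread across both tests, since the two log terms add.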
covtest.pe.fisher(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., and Xue, L. (2022). Fisher’s combined probability test for high-dimensional covariance matrices. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
covtest.pe.fisher(X,Y)
This function implements five two-sample mean tests on high-dimensional
mean vectors.
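Let $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^p$ be two $p$-dimensional populations with mean vectors $(\mu_1, \mu_2)$ and covariance matrices $(\Sigma_1, \Sigma_2)$, respectively.
The problem of interest is to test the equality of the two
mean vectors of the two populations:
$$H_{0m}: \mu_1 = \mu_2.$$
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$. We denote
dataX $= (X_1, \ldots, X_{n_1})^\top \in \mathbb{R}^{n_1 \times p}$ and
dataY $= (Y_1, \ldots, Y_{n_2})^\top \in \mathbb{R}^{n_2 \times p}$.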
meantest(dataX,dataY,method='pe.comp',delta=NULL)
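dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
method | the method type (default = 'pe.comp'); one of 'clx', 'cq', 'pe.cauchy', 'pe.comp', 'pe.fisher', corresponding to meantest.clx, meantest.cq, meantest.pe.cauchy, meantest.pe.comp, and meantest.pe.fisher |
delta | the thresholding value. This is needed only in method='pe.comp'; see meantest.pe.comp for details. The default is NULL. |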
method
the method type
stat
the value of test statistic
pval
the p-value for the test.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835.
Cai, T. T., Liu, W., and Xia, Y. (2014). Two-sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B: Statistical Methodology, 76(2):349–372.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
meantest(X,Y)
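This function implements the two-sample $\ell_\infty$-norm-based
high-dimensional mean test proposed in Cai, Liu and Xia (2014).
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
The test statistic is defined as
$$M_{CLX} = \frac{n_1 n_2}{n_1 + n_2} \max_{1 \le j \le p} \frac{(\bar X_j - \bar Y_j)^2}{\frac{1}{n_1 + n_2}\left[\sum_{u=1}^{n_1} (X_{uj} - \bar X_j)^2 + \sum_{v=1}^{n_2} (Y_{vj} - \bar Y_j)^2\right]}.$$
With some regularity conditions, under the null hypothesis $H_{0m}: \mu_1 = \mu_2$,
the test statistic $M_{CLX} - 2\log p + \log\log p$ converges in distribution to
a Gumbel distribution $G_{mean}(x) = \exp\left(-\frac{1}{\sqrt{\pi}} \exp\left(-\frac{x}{2}\right)\right)$
as $n_1, n_2, p \to \infty$.
The asymptotic $p$-value is obtained by
$$p_{CLX} = 1 - G_{mean}(M_{CLX} - 2\log p + \log\log p).$$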
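A max-type mean test of this kind can be sketched directly from the data matrices. The code below is an illustration only and may differ from what meantest.clx does internally: it assumes each coordinate is standardised by a pooled variance with divisor $n_1 + n_2$, a centring of $2\log p - \log\log p$, and the limiting cdf $\exp(-\exp(-x/2)/\sqrt{\pi})$. The name clx_mean_test is hypothetical.

```r
# Hypothetical sketch of a max-type two-sample mean test (not PEtests code).
clx_mean_test <- function(X, Y) {
  n1 <- nrow(X); n2 <- nrow(Y); p <- ncol(X)
  diff2 <- (colMeans(X) - colMeans(Y))^2
  # Pooled coordinate-wise variance; the divisor n1 + n2 is an assumption
  ssx <- colSums(sweep(X, 2, colMeans(X))^2)
  ssy <- colSums(sweep(Y, 2, colMeans(Y))^2)
  pooled <- (ssx + ssy) / (n1 + n2)
  stat <- n1 * n2 / (n1 + n2) * max(diff2 / pooled)
  x <- stat - 2 * log(p) + log(log(p))      # centred statistic
  pval <- 1 - exp(-exp(-x / 2) / sqrt(pi))  # assumed Gumbel limit
  list(stat = stat, pval = pval)
}

set.seed(1)
X <- matrix(rnorm(100 * 50), nrow = 100, ncol = 50)
Y <- matrix(rnorm(100 * 50), nrow = 100, ncol = 50)
Y[, 1] <- Y[, 1] + 5   # a single strongly shifted coordinate
clx_mean_test(X, Y)
```

Because the statistic is a maximum over coordinates, one large shift is enough to produce a very small $p$-value, which is the sparse-alternative regime where this test is strongest.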
meantest.clx(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Cai, T. T., Liu, W., and Xia, Y. (2014). Two-sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B: Statistical Methodology, 76(2):349–372.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
meantest.clx(X,Y)
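This function implements the two-sample $\ell_2$-norm-based high-dimensional
mean test proposed by Chen and Qin (2010).
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
The test statistic $M_{CQ}$ is defined as
$$M_{CQ} = \frac{1}{n_1(n_1 - 1)} \sum_{u \neq v}^{n_1} X_u^\top X_v + \frac{1}{n_2(n_2 - 1)} \sum_{u \neq v}^{n_2} Y_u^\top Y_v - \frac{2}{n_1 n_2} \sum_{u=1}^{n_1} \sum_{v=1}^{n_2} X_u^\top Y_v.$$
Under the null hypothesis $H_{0m}: \mu_1 = \mu_2$,
the leading variance of $M_{CQ}$ is
$\sigma^2_{M_{CQ}} = \frac{2}{n_1(n_1 - 1)} \mathrm{tr}(\Sigma_1^2) + \frac{2}{n_2(n_2 - 1)} \mathrm{tr}(\Sigma_2^2) + \frac{4}{n_1 n_2} \mathrm{tr}(\Sigma_1 \Sigma_2)$,
which can be consistently estimated by
$$\hat\sigma^2_{M_{CQ}} = \frac{2}{n_1(n_1 - 1)} \widehat{\mathrm{tr}(\Sigma_1^2)} + \frac{2}{n_2(n_2 - 1)} \widehat{\mathrm{tr}(\Sigma_2^2)} + \frac{4}{n_1 n_2} \widehat{\mathrm{tr}(\Sigma_1 \Sigma_2)}.$$
The explicit formulas of $\widehat{\mathrm{tr}(\Sigma_1^2)}$, $\widehat{\mathrm{tr}(\Sigma_2^2)}$, and $\widehat{\mathrm{tr}(\Sigma_1 \Sigma_2)}$
can be found in Section 3 of Chen and Qin (2010).
With some regularity conditions, under the null hypothesis $H_{0m}: \mu_1 = \mu_2$,
the test statistic $M_{CQ}/\hat\sigma_{M_{CQ}}$ converges in distribution to a standard normal distribution
as $n_1, n_2, p \to \infty$.
The asymptotic $p$-value is obtained by
$$p_{CQ} = 1 - \Phi(M_{CQ}/\hat\sigma_{M_{CQ}}),$$
where $\Phi(\cdot)$ is the cdf of the standard normal distribution.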
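The unstandardised Chen and Qin statistic can be computed without any double loop. The sketch below covers only the numerator $M_{CQ}$ (sums of cross products $X_u^\top X_v$ over $u \neq v$, $Y_u^\top Y_v$ over $u \neq v$, and all $X_u^\top Y_v$), not the variance estimate, which needs the leave-out trace estimators from the paper. The function name cq_statistic is hypothetical.

```r
# Hypothetical helper: numerator of the Chen-Qin two-sample mean statistic.
cq_statistic <- function(X, Y) {
  n1 <- nrow(X); n2 <- nrow(Y)
  sx <- colSums(X); sy <- colSums(Y)
  # sum over u != v of X_u' X_v equals ||sum_u X_u||^2 - sum_u ||X_u||^2
  xx <- (sum(sx^2) - sum(X^2)) / (n1 * (n1 - 1))
  yy <- (sum(sy^2) - sum(Y^2)) / (n2 * (n2 - 1))
  # sum over all u, v of X_u' Y_v equals (sum_u X_u)' (sum_v Y_v)
  xy <- sum(sx * sy) / (n1 * n2)
  xx + yy - 2 * xy
}

set.seed(1)
X <- matrix(rnorm(100 * 500), nrow = 100, ncol = 500)
Y <- matrix(rnorm(100 * 500), nrow = 100, ncol = 500)
cq_statistic(X, Y)   # fluctuates around 0 when the means are equal
```

Dropping the $u = v$ terms is what makes the statistic an unbiased estimator of $\|\mu_1 - \mu_2\|^2$, so it stays centred at zero under the null even when $p$ is much larger than the sample sizes.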
meantest.cq(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
meantest.cq(X,Y)
This function implements the two-sample PE mean test via
Cauchy combination.
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $p_{CQ}$ and $p_{CLX}$ denote the $p$-values associated with
the $\ell_2$-norm-based mean test (see meantest.cq for details)
and the $\ell_\infty$-norm-based mean test (see meantest.clx for details), respectively.
The PE mean test via Cauchy combination is defined as
$$M_{Cauchy} = \frac{1}{2}\tan((0.5 - p_{CQ})\pi) + \frac{1}{2}\tan((0.5 - p_{CLX})\pi).$$
It has been proved that with some regularity conditions, under the null hypothesis $H_{0m}: \mu_1 = \mu_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $M_{Cauchy}$ asymptotically converges in distribution to a standard Cauchy distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{Cauchy}(M_{Cauchy}),$$
where $F_{Cauchy}(\cdot)$ is the cdf of the standard Cauchy distribution.
meantest.pe.cauchy(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835.
Cai, T. T., Liu, W., and Xia, Y. (2014). Two-sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B: Statistical Methodology, 76(2):349–372.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
meantest.pe.cauchy(X,Y)
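This function implements the two-sample PE mean test via the
construction of the PE component. Let $M_{CQ}/\hat\sigma_{M_{CQ}}$ denote the
$\ell_2$-norm-based mean test statistic (see meantest.cq for details).
The PE component is constructed by
$$J_m = \sqrt{p} \sum_{i=1}^{p} M_i \hat\nu_i^{-1/2} \, \mathcal{I}\left\{ \sqrt{2}\, M_i \hat\nu_i^{-1/2} + 1 > \delta_{mean} \right\},$$
where $\delta_{mean}$ is a threshold for the screening procedure,
recommended to take the value of $\delta_{mean} = 2\log(\log(n_1 + n_2))\log p$.
The explicit forms of $M_i$ and $\hat\nu_i$ can be found in Section 3.1 of Yu et al. (2022).
The PE mean test statistic is defined as
$$M_{PE} = M_{CQ}/\hat\sigma_{M_{CQ}} + J_m.$$
With some regularity conditions, under the null hypothesis $H_{0m}: \mu_1 = \mu_2$,
the test statistic $M_{PE}$ converges in distribution to
a standard normal distribution as $n_1, n_2, p \to \infty$.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - \Phi(M_{PE}),$$
where $\Phi(\cdot)$ is the cdf of the standard normal distribution.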
meantest.pe.comp(dataX,dataY,delta=NULL)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
delta | a scalar; the thresholding value used in the construction of the PE component. If not specified, the function uses a default value $\delta_{mean} = 2\log(\log(n_1 + n_2))\log p$. |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
meantest.pe.comp(X,Y)
This function implements the two-sample PE mean test via
Fisher's combination.
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $p_{CQ}$ and $p_{CLX}$ denote the $p$-values associated with
the $\ell_2$-norm-based mean test (see meantest.cq for details)
and the $\ell_\infty$-norm-based mean test (see meantest.clx for details), respectively.
The PE mean test via Fisher's combination is defined as
$$M_{Fisher} = -2\log(p_{CQ}) - 2\log(p_{CLX}).$$
It has been proved that with some regularity conditions, under the null hypothesis $H_{0m}: \mu_1 = \mu_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $M_{Fisher}$ asymptotically converges in distribution to a $\chi_4^2$ distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{\chi_4^2}(M_{Fisher}),$$
where $F_{\chi_4^2}(\cdot)$ is the cdf of the $\chi_4^2$ distribution.
meantest.pe.fisher(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835.
Cai, T. T., Liu, W., and Xia, Y. (2014). Two-sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B: Statistical Methodology, 76(2):349–372.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
meantest.pe.fisher(X,Y)
This function implements six two-sample simultaneous tests
on high-dimensional mean vectors and covariance matrices.
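Let $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^p$ be two $p$-dimensional populations with mean vectors $(\mu_1, \mu_2)$ and covariance matrices $(\Sigma_1, \Sigma_2)$, respectively.
The problem of interest is the simultaneous inference on the equality of
mean vectors and covariance matrices of the two populations:
$$H_0: \mu_1 = \mu_2 \ \text{ and } \ \Sigma_1 = \Sigma_2.$$
Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$. We denote
dataX $= (X_1, \ldots, X_{n_1})^\top \in \mathbb{R}^{n_1 \times p}$ and
dataY $= (Y_1, \ldots, Y_{n_2})^\top \in \mathbb{R}^{n_2 \times p}$.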
simultest(dataX, dataY, method='pe.fisher', delta_mean=NULL, delta_cov=NULL)
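dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
method | the method type (default = 'pe.fisher'); one of 'cauchy', 'chisq', 'fisher', 'pe.cauchy', 'pe.chisq', 'pe.fisher', corresponding to simultest.cauchy, simultest.chisq, simultest.fisher, simultest.pe.cauchy, simultest.pe.chisq, and simultest.pe.fisher |
delta_mean | the thresholding value used in the construction of the PE component for the mean test statistic. It is needed only in PE methods such as method='pe.cauchy', method='pe.chisq', and method='pe.fisher'. The default is NULL. |
delta_cov | the thresholding value used in the construction of the PE component for the covariance test statistic. It is needed only in PE methods such as method='pe.cauchy', method='pe.chisq', and method='pe.fisher'. The default is NULL. |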
method
the method type
stat
the value of test statistic
pval
the p-value for the test.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835.
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940.
Yu, X., Li, D., and Xue, L. (2022). Fisher’s combined probability test for high-dimensional covariance matrices. Journal of the American Statistical Association, (in press):1–14.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
simultest(X,Y)
This function implements the two-sample simultaneous test on high-dimensional
mean vectors and covariance matrices using Cauchy combination.
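Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $p_{CQ}$ and $p_{LC}$ denote the $p$-values associated with
the $\ell_2$-norm-based mean test proposed in Chen and Qin (2010)
(see meantest.cq for details)
and the $\ell_2$-norm-based covariance test proposed in Li and Chen (2012)
(see covtest.lc for details),
respectively. The simultaneous test statistic via Cauchy combination is defined as
$$C_{n_1, n_2} = \frac{1}{2}\tan((0.5 - p_{CQ})\pi) + \frac{1}{2}\tan((0.5 - p_{LC})\pi).$$
It has been proved that with some regularity conditions, under the null hypothesis
$H_0: \mu_1 = \mu_2 \text{ and } \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $C_{n_1, n_2}$ asymptotically converges in distribution to
a standard Cauchy distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{Cauchy}(C_{n_1, n_2}),$$
where $F_{Cauchy}(\cdot)$ is the cdf of the standard Cauchy distribution.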
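The role of asymptotic independence can be checked with a small simulation. The sketch below does not use any data or PEtests function; it simply assumes the two individual $p$-values are independent and uniform on (0, 1), as they would be asymptotically under the null, and confirms that the Cauchy-combined $p$-value then rejects at roughly the nominal 5% rate. All variable names are illustrative.

```r
# Monte Carlo check: with independent Uniform(0, 1) p-values, the
# Cauchy-combined p-value should also be (approximately) uniform.
set.seed(1)
n_rep <- 20000
p_mean <- runif(n_rep)   # stand-in for the mean-test p-values
p_cov <- runif(n_rep)    # stand-in for the covariance-test p-values

stat <- 0.5 * tan((0.5 - p_mean) * pi) + 0.5 * tan((0.5 - p_cov) * pi)
p_comb <- pcauchy(stat, lower.tail = FALSE)

reject_rate <- mean(p_comb < 0.05)
reject_rate   # should be close to 0.05
```

The result follows because each transformed $p$-value is standard Cauchy and an equal-weight average of independent standard Cauchy variables is again standard Cauchy.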
simultest.cauchy(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835.
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
simultest.cauchy(X,Y)
This function implements the two-sample simultaneous test on high-dimensional
mean vectors and covariance matrices using chi-squared approximation.
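Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $M_{CQ}/\hat\sigma_{M_{CQ}}$ denote
the $\ell_2$-norm-based mean test statistic proposed in Chen and Qin (2010)
(see meantest.cq for details),
and let $T_{LC}/\hat\sigma_{T_{LC}}$ denote the $\ell_2$-norm-based covariance test statistic
proposed in Li and Chen (2012) (see covtest.lc for details).
The simultaneous test statistic via chi-squared approximation is defined as
$$S_{n_1, n_2} = M_{CQ}^2/\hat\sigma^2_{M_{CQ}} + T_{LC}^2/\hat\sigma^2_{T_{LC}}.$$
It has been proved that with some regularity conditions, under the null hypothesis
$H_0: \mu_1 = \mu_2 \text{ and } \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $S_{n_1, n_2}$ asymptotically converges in distribution to
a $\chi_2^2$ distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{\chi_2^2}(S_{n_1, n_2}),$$
where $F_{\chi_2^2}(\cdot)$ is the cdf of the $\chi_2^2$ distribution.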
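Given the two standardised statistics, the chi-squared approximation reduces to a sum of squares. A minimal base R sketch, assuming the combined statistic is $z_{mean}^2 + z_{cov}^2$ with a chi-squared reference distribution on 2 degrees of freedom; chisq_combine and its argument names are hypothetical, not part of PEtests.

```r
# Hypothetical helper: chi-squared approximation from two standardised
# statistics that are asymptotically independent standard normals.
chisq_combine <- function(z_mean, z_cov) {
  stat <- z_mean^2 + z_cov^2
  pval <- pchisq(stat, df = 2, lower.tail = FALSE)  # equals exp(-stat / 2)
  list(stat = stat, pval = pval)
}

chisq_combine(z_mean = 2.1, z_cov = 1.7)
```

Squaring makes this combination two-sided in each component, unlike the Fisher and Cauchy combinations, which work with the one-sided $p$-values of the individual tests.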
simultest.chisq(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
simultest.chisq(X,Y)
This function implements the two-sample simultaneous test on high-dimensional
mean vectors and covariance matrices using Fisher's combination.
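Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $p_{CQ}$ and $p_{LC}$ denote the $p$-values associated with
the $\ell_2$-norm-based mean test proposed in Chen and Qin (2010)
(see meantest.cq for details)
and the $\ell_2$-norm-based covariance test proposed in Li and Chen (2012)
(see covtest.lc for details),
respectively.
The simultaneous test statistic via Fisher's combination is defined as
$$F_{n_1, n_2} = -2\log(p_{CQ}) - 2\log(p_{LC}).$$
It has been proved that with some regularity conditions, under the null hypothesis
$H_0: \mu_1 = \mu_2 \text{ and } \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $F_{n_1, n_2}$ asymptotically converges in distribution to
a $\chi_4^2$ distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{\chi_4^2}(F_{n_1, n_2}),$$
where $F_{\chi_4^2}(\cdot)$ is the cdf of the $\chi_4^2$ distribution.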
simultest.fisher(dataX,dataY)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
stat
the value of test statistic
pval
the p-value for the test.
Chen, S. X. and Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Annals of Statistics, 38(2):808–835.
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908–940.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
simultest.fisher(X,Y)
This function implements the two-sample PE simultaneous test on high-dimensional
mean vectors and covariance matrices using Cauchy combination.
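Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $M_{PE}$ and $T_{PE}$ denote
the PE mean test statistic and PE covariance test statistic, respectively
(see meantest.pe.comp and covtest.pe.comp for details).
Let $p_m$ and $p_c$ denote their respective $p$-values.
The PE simultaneous test statistic via Cauchy combination is defined as
$$C_{PE} = \frac{1}{2}\tan((0.5 - p_m)\pi) + \frac{1}{2}\tan((0.5 - p_c)\pi).$$
It has been proved that with some regularity conditions, under the null hypothesis
$H_0: \mu_1 = \mu_2 \text{ and } \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $C_{PE}$ asymptotically converges in distribution to
a standard Cauchy distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{Cauchy}(C_{PE}),$$
where $F_{Cauchy}(\cdot)$ is the cdf of the standard Cauchy distribution.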
simultest.pe.cauchy(dataX,dataY,delta_mean=NULL,delta_cov=NULL)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
delta_mean | a scalar; the thresholding value used in the construction of the PE component for mean test; see meantest.pe.comp for details. |
delta_cov | a scalar; the thresholding value used in the construction of the PE component for covariance test; see covtest.pe.comp for details. |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
simultest.pe.cauchy(X,Y)
This function implements the two-sample PE simultaneous test on
high-dimensional mean vectors and covariance matrices using chi-squared approximation.
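Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $M_{PE}$ and $T_{PE}$ denote
the PE mean test statistic and PE covariance test statistic, respectively
(see meantest.pe.comp and covtest.pe.comp for details).
The PE simultaneous test statistic via chi-squared approximation is defined as
$$S_{PE} = M_{PE}^2 + T_{PE}^2.$$
It has been proved that with some regularity conditions, under the null hypothesis
$H_0: \mu_1 = \mu_2 \text{ and } \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $S_{PE}$ asymptotically converges in distribution to
a $\chi_2^2$ distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{\chi_2^2}(S_{PE}),$$
where $F_{\chi_2^2}(\cdot)$ is the cdf of the $\chi_2^2$ distribution.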
simultest.pe.chisq(dataX,dataY,delta_mean=NULL,delta_cov=NULL)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
delta_mean | a scalar; the thresholding value used in the construction of the PE component for mean test; see meantest.pe.comp for details. |
delta_cov | a scalar; the thresholding value used in the construction of the PE component for covariance test; see covtest.pe.comp for details. |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
simultest.pe.chisq(X,Y)
This function implements the two-sample PE simultaneous test on
high-dimensional mean vectors and covariance matrices using Fisher's combination.
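Suppose $\{X_1, \ldots, X_{n_1}\}$ are i.i.d. copies of $X$, and $\{Y_1, \ldots, Y_{n_2}\}$ are i.i.d. copies of $Y$.
Let $M_{PE}$ and $T_{PE}$ denote
the PE mean test statistic and PE covariance test statistic, respectively
(see meantest.pe.comp and covtest.pe.comp for details).
Let $p_m$ and $p_c$ denote their respective $p$-values.
The PE simultaneous test statistic via Fisher's combination is defined as
$$F_{PE} = -2\log(p_m) - 2\log(p_c).$$
It has been proved that with some regularity conditions, under the null hypothesis
$H_0: \mu_1 = \mu_2 \text{ and } \Sigma_1 = \Sigma_2$,
the two tests are asymptotically independent as $n_1, n_2, p \to \infty$,
and therefore $F_{PE}$ asymptotically converges in distribution to
a $\chi_4^2$ distribution.
The asymptotic $p$-value is obtained by
$$p\text{-value} = 1 - F_{\chi_4^2}(F_{PE}),$$
where $F_{\chi_4^2}(\cdot)$ is the cdf of the $\chi_4^2$ distribution.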
simultest.pe.fisher(dataX,dataY,delta_mean=NULL,delta_cov=NULL)
dataX | an n1 by p data matrix |
dataY | an n2 by p data matrix |
delta_mean | a scalar; the thresholding value used in the construction of the PE component for mean test; see meantest.pe.comp for details. |
delta_cov | a scalar; the thresholding value used in the construction of the PE component for covariance test; see covtest.pe.comp for details. |
stat
the value of test statistic
pval
the p-value for the test.
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1–14.
n1 = 100; n2 = 100; pp = 500
set.seed(1)
X = matrix(rnorm(n1*pp), nrow=n1, ncol=pp)
Y = matrix(rnorm(n2*pp), nrow=n2, ncol=pp)
simultest.pe.fisher(X,Y)