Package 'FarmTest' reference manual

Title:	Factor-Adjusted Robust Multiple Testing
Description:	Performs robust multiple testing for means in the presence of known and unknown latent factors presented in Fan et al.(2019) "FarmTest: Factor-Adjusted Robust Multiple Testing With Approximate False Discovery Control" <doi:10.1080/01621459.2018.1527700>. Implements a series of adaptive Huber methods combined with fast data-drive tuning schemes proposed in Ke et al.(2019) "User-Friendly Covariance Estimation for Heavy-Tailed Distributions" <doi:10.1214/19-STS711> to estimate model parameters and construct test statistics that are robust against heavy-tailed and/or asymmetric error distributions. Extensions to two-sample simultaneous mean comparison problems are also included. As by-products, this package contains functions that compute adaptive Huber mean, covariance and regression estimators that are of independent interest.
Authors:	Xiaoou Pan [aut, cre], Yuan Ke [aut], Wen-Xin Zhou [aut]
Maintainer:	Xiaoou Pan <[email protected]>
License:	GPL-3
Version:	2.2.0
Built:	2025-03-11 06:34:58 UTC
Source:	CRAN

FarmTest: Factor-Adjusted Robust Multiple Testing

Description

FarmTest package performs robust multiple testing for means in the presence of known and unknown latent factors (Fan et al, 2019). It implements a series of adaptive Huber methods combined with fast data-drive tuning schemes (Wang et al, 2020; Ke et al, 2019) to estimate model parameters and construct test statistics that are robust against heavy-tailed and/or assymetric error distributions. Extensions to two-sample simultaneous mean comparison problems are also included. As by-products, this package also contains functions that compute adaptive Huber mean, covariance and regression estimators that are of independent interest.

Details

See its GitHub page https://github.com/XiaoouPan/FarmTest for details.

References

Ahn, S. C. and Horenstein, A. R. (2013). Eigenvalue ratio rest for the number of factors. Econometrica, 81(3) 1203–1227.

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B. Stat. Methodol., 57 289–300.

Bose, K., Fan, J., Ke, Y., Pan, X. and Zhou, W.-X. (2019). FarmTest: An R package for factor-adjusted robust multiple testing, Preprint.

Fan, J., Ke, Y., Sun, Q. and Zhou, W-X. (2019). FarmTest: Factor-adjusted robust multiple testing with approximate false discovery control. J. Amer. Statist. Assoc., 114, 1880-1893.

Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.

Ke, Y., Minsker, S., Ren, Z., Sun, Q. and Zhou, W.-X. (2019). User-friendly covariance estimation for heavy-tailed distributions. Statis. Sci., 34, 454-471.

Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B. Stat. Methodol., 64 479–498.

Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254-265.

Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2020). A new principle for tuning-free Huber regression. Stat. Sin., to appear.

Zhou, W-X., Bose, K., Fan, J. and Liu, H. (2018). A new perspective on robust M-estimation: Finite sample theory and applications to dependence-adjusted multiple testing. Ann. Statist., 46 1904-1931.

Factor-adjusted robust multiple testing

Description

This function conducts factor-adjusted robust multiple testing (FarmTest) for means of multivariate data proposed in Fan et al. (2019) via a tuning-free procedure.

Usage

farm.test(
  X,
  fX = NULL,
  KX = -1,
  Y = NULL,
  fY = NULL,
  KY = -1,
  h0 = NULL,
  alternative = c("two.sided", "less", "greater"),
  alpha = 0.05,
  p.method = c("bootstrap", "normal"),
  nBoot = 500
)
farm.test(
  X,
  fX = NULL,
  KX = -1,
  Y = NULL,
  fY = NULL,
  KY = -1,
  h0 = NULL,
  alternative = c("two.sided", "less", "greater"),
  alpha = 0.05,
  p.method = c("bootstrap", "normal"),
  nBoot = 500
)

Arguments

`X`	An $n$ by $p$ data matrix with each row being a sample.
`fX`	An optional factor matrix with each column being a factor for `X`. The number of rows of `fX` and `X` must be the same.
`KX`	An optional positive number of factors to be estimated for `X` when `fX` is not specified. `KX` cannot exceed the number of columns of `X`. If `KX` is not specified or specified to be negative, it will be estimated internally. If `KX` is specified to be 0, no factor will be adjusted.
`Y`	An optional data matrix used for two-sample FarmTest. The number of columns of `X` and `Y` must be the same.
`fY`	An optional factor matrix for two-sample FarmTest with each column being a factor for `Y`. The number of rows of `fY` and `Y` must be the same.
`KY`	An optional positive number of factors to be estimated for `Y` for two-sample FarmTest when `fY` is not specified. `KY` cannot exceed the number of columns of `Y`. If `KY` is not specified or specified to be negative, it will be estimated internally. If `KY` is specified to be 0, no factor will be adjusted.
`h0`	An optional $p$ -vector of true means, or difference in means for two-sample FarmTest. The default is a zero vector.
`alternative`	An optional character string specifying the alternate hypothesis, must be one of "two.sided" (default), "less" or "greater".
`alpha`	An optional level for controlling the false discovery rate. The value of `alpha` must be between 0 and 1. The default value is 0.05.
`p.method`	An optional character string specifying the method to calculate p-values when `fX` is known or when `KX = 0`, possible options are multiplier bootstrap or normal approximation. It must be one of "bootstrap"(default) or "normal".
`nBoot`	An optional positive integer specifying the size of bootstrap sample, only available when `p.method = "bootstrap"`. The dafault value is 500.

Details

For two-sample FarmTest, means, stdDev, loadings, eigenVal, eigenRatio, nfactors and n will be lists of items for sample X and Y separately.

alternative = "greater" is the alternative that $\mu > \mu_0$ for one-sample test or $\mu_X > \mu_Y$ for two-sample test.

Setting p.method = "bootstrap" for factor-known model will slow down the program, but it will achieve lower empirical FDP than setting p.method = "normal".

Value

An object with S3 class farm.test containing the following items will be returned:

means: Estimated means, a vector with length $p$ .
stdDev: Estimated standard deviations, a vector with length $p$ . It's not available for bootstrap method.
loadings: Estimated factor loadings, a matrix with dimension $p$ by $K$ , where $K$ is the number of factors.
eigenVal: Eigenvalues of estimated covariance matrix, a vector with length $p$ . It's only available when factors fX and fY are not given.
eigenRatio: Ratios of eigenVal to estimate nFactors, a vector with length $min(n, p) / 2$ . It's only available when number of factors KX and KY are not given.
nFactors: Estimated or input number of factors, a positive integer.
tStat: Values of test statistics, a vector with length $p$ . It's not available for bootstrap method.
pValues: P-values of tests, a vector with length $p$ .
pAdjust: Adjusted p-values of tests, a vector with length $p$ .
significant: Boolean values indicating whether each test is significant, with 1 for significant and 0 for non-significant, a vector with length $p$ .
reject: Indices of tests that are rejected. It will show "no hypotheses rejected" if none of the tests are rejected.
type: Indicator of whether factor is known or unknown.
n: Sample size.
p: Data dimension.
h0: Null hypothesis, a vector with length $p$ .
alpha: $\alpha$ value.
alternative: Althernative hypothesis.

References

Ahn, S. C. and Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3) 1203–1227.

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B. Stat. Methodol., 57 289–300.

Fan, J., Ke, Y., Sun, Q. and Zhou, W-X. (2019). FarmTest: Factor-adjusted robust multiple testing with approximate false discovery control. J. Amer. Statist. Assoc., 114, 1880-1893.

Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.

Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B. Stat. Methodol., 64, 479–498.

Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254-265.

Zhou, W-X., Bose, K., Fan, J. and Liu, H. (2018). A new perspective on robust M-estimation: Finite sample theory and applications to dependence-adjusted multiple testing. Ann. Statist., 46 1904-1931.

Examples

n = 20
p = 50
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
# One-sample FarmTest with two sided alternative
output = farm.test(X)
# One-sample FarmTest with one sided alternative
output = farm.test(X, alternative = "less")
# One-sample FarmTest with known factors
output = farm.test(X, fX = fX)

# Two-sample FarmTest
muY = rep(0, p)
muY[1:5] = 4
epsilonY = matrix(rnorm(p * n, 0, 1), nrow = n)
BY = matrix(runif(p * K, -2, 2), nrow = p)
fY = matrix(rnorm(K * n, 0, 1), nrow = n)
Y = rep(1, n) %*% t(muY) + fY %*% t(BY) + epsilonY
output = farm.test(X, Y = Y)
n = 20
p = 50
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
# One-sample FarmTest with two sided alternative
output = farm.test(X)
# One-sample FarmTest with one sided alternative
output = farm.test(X, alternative = "less")
# One-sample FarmTest with known factors
output = farm.test(X, fX = fX)

# Two-sample FarmTest
muY = rep(0, p)
muY[1:5] = 4
epsilonY = matrix(rnorm(p * n, 0, 1), nrow = n)
BY = matrix(runif(p * K, -2, 2), nrow = p)
fY = matrix(rnorm(K * n, 0, 1), nrow = n)
Y = rep(1, n) %*% t(muY) + fY %*% t(BY) + epsilonY
output = farm.test(X, Y = Y)

Tuning-free Huber-type covariance estimation

Description

The function calculates adaptive Huber-type covariance estimator from a data sample, with robustification parameter $\tau$ determined by a tuning-free principle. For the input matrix X, both low-dimension ( $p < n$ ) and high-dimension ( $p > n$ ) are allowed.

Usage

huber.cov(X)
huber.cov(X)

Arguments

`X`	An $n$ by $p$ data matrix.

Value

A $p$ by $p$ Huber-type covariance matrix estimator will be returned.

References

Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.

Ke, Y., Minsker, S., Ren, Z., Sun, Q. and Zhou, W.-X. (2019). User-friendly covariance estimation for heavy-tailed distributions. Statis. Sci., 34, 454-471.

Examples

n = 100
d = 50
X = matrix(rt(n * d, df = 3), n, d) / sqrt(3)
Sigma = huber.cov(X)
n = 100
d = 50
X = matrix(rt(n * d, df = 3), n, d) / sqrt(3)
Sigma = huber.cov(X)

Tuning-free Huber mean estimation

Description

The function calculates adaptive Huber mean estimator from a data sample, with robustification parameter $\tau$ determined by a tuning-free principle.

Usage

huber.mean(X)
huber.mean(X)

Arguments

`X`	An $n$ -dimensional data vector.

Value

A Huber mean estimator will be returned.

References

Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.

Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2020). A New Principle for Tuning-Free Huber Regression. Stat. Sin., to appear.

Examples

n = 10000
X = rt(n, 2) + 2
mu = huber.mean(X)
n = 10000
X = rt(n, 2) + 2
mu = huber.mean(X)

Tuning-free Huber regression

Description

The function conducts Huber regression from a data sample, with robustification parameter $\tau$ determined by a tuning-free principle.

Usage

huber.reg(X, Y, method = c("standard", "adaptive"))
huber.reg(X, Y, method = c("standard", "adaptive"))

Arguments

`X`	An $n$ by $p$ design matrix, where $p < n$ .
`Y`	A continuous response with length $n$ .
`method`	An optional character string specifying the method to calibrate the robustification parameter $\tau$ . Two choices are "standard"(default) and "adaptive". See Wang et al.(2020) for details.

Value

A coefficients estimator with length $p + 1$ will be returned.

References

Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.

Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254-265.

Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2020). A new principle for tuning-free Huber regression. Stat. Sin., to appear.

Examples

n = 200
d = 10
beta = rep(1, d)
X = matrix(rnorm(n * d), n, d)
err = rnorm(n)
Y = 1 + X %*% beta + err
beta.hat = huber.reg(X, Y)
n = 200
d = 10
beta = rep(1, d)
X = matrix(rnorm(n * d), n, d)
err = rnorm(n)
Y = 1 + X %*% beta + err
beta.hat = huber.reg(X, Y)

Plot function of FarmTest

Description

This is the plot function of S3 objects with class "farm.test". It produces the histogram of estimated means.

Usage

## S3 method for class 'farm.test'
plot(x, ...)
## S3 method for class 'farm.test'
plot(x, ...)

Arguments

`x`	A `farm.test` object.
`...`	Further arguments passed to or from other methods.

Details

For two-sample FarmTest, the histogram is based on the difference: estimated means of sample X - estimated means of sample Y.

Value

No variable will be returned, but a histogram of estimated means will be presented.

Examples

n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
plot(output)
n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
plot(output)

Print function of FarmTest

Description

This is the print function of S3 objects with class "farm.test".

Usage

## S3 method for class 'farm.test'
print(x, ...)
## S3 method for class 'farm.test'
print(x, ...)

Arguments

`x`	A `farm.test` object.
`...`	Further arguments passed to or from other methods.

Value

No variable will be returned, but a brief summary of FarmTest will be displayed.

Examples

n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
print(output)
n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
print(output)

Summary function of FarmTest

Description

This is the summary function of S3 objects with class "farm.test".

Usage

## S3 method for class 'farm.test'
summary(object, ...)
## S3 method for class 'farm.test'
summary(object, ...)

Arguments

`object`	A `farm.test` object.
`...`	Further arguments passed to or from other methods.

Details

For two-sample FarmTest, the first column is the difference: estimated means of sample X - estimated means of sample Y.

Value

A data frame including the estimated means, p-values, adjusted p-values and significance for all the features will be presented.

Examples

n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
summary(output)
n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
summary(output)

Package 'FarmTest'

Help Index

FarmTest: Factor-Adjusted Robust Multiple Testing

Description

Details

References

Factor-adjusted robust multiple testing

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Tuning-free Huber-type covariance estimation

Description

Usage

Arguments

Value

References

See Also

Examples

Tuning-free Huber mean estimation

Description

Usage

Arguments

Value

References

See Also

Examples

Tuning-free Huber regression

Description

Usage

Arguments

Value

References

See Also

Examples

Plot function of FarmTest

Description

Usage

Arguments

Details

Value

See Also

Examples

Print function of FarmTest

Description

Usage

Arguments

Value

See Also

Examples

Summary function of FarmTest

Description

Usage

Arguments

Details

Value

See Also

Examples