This package contains functions for association testing in 2x2 tables (i.e., two binary variables). In particular, the scientific setting that motivated the package’s development was testing for associations between diseases and rare genetic variants in case-control studies. When the expected number of subjects possessing a variant is small, standard methods perform poorly (they usually tend to be overly conservative in controlling the Type I error rate).
The two alternative methods implemented in the package are permutation testing and approximate unconditional (AU) testing.
Permutation testing works by computing a test statistic T for the observed data, generating all plausible datasets with the same total number of exposed subjects, then adding up the probabilities of those datasets which give more extreme test statistics than T.
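To make the enumeration concrete, here is a minimal base R sketch of a permutation test built on the score (Pearson chi-square) statistic. It is an illustration, not the package's implementation: the helper names `score_stat` and `perm_score_p` are hypothetical, and the sketch assumes that datasets tying with the observed statistic count as "extreme". With the total number of exposed subjects fixed, the number of exposed cases follows a hypergeometric distribution, so the probability of each candidate dataset comes from `dhyper`.

```r
# Score (Pearson chi-square) statistic for a 2x2 table with
# m0 controls, m1 cases, r0 exposed controls, r1 exposed cases.
score_stat <- function(m0, m1, r0, r1) {
  n <- m0 + m1
  r <- r0 + r1
  if (r == 0 || r == n) return(0)  # no variation in exposure
  expected_r1 <- m1 * r / n
  variance_r1 <- m0 * m1 * r * (n - r) / n^3
  (r1 - expected_r1)^2 / variance_r1
}

# Hypothetical helper: permutation p-value for the score statistic.
perm_score_p <- function(m0, m1, r0, r1) {
  r <- r0 + r1
  t_obs <- score_stat(m0, m1, r0, r1)
  # All feasible counts of exposed cases when r subjects are exposed.
  x <- max(0, r - m0):min(r, m1)
  t_all <- vapply(x, function(k) score_stat(m0, m1, r - k, k), numeric(1))
  # Probability of each dataset given the fixed margins.
  probs <- dhyper(x, m1, m0, r)
  # Small relative tolerance so that ties are not lost to rounding.
  sum(probs[t_all >= t_obs * (1 - 1e-7)])
}

# Small illustrative table: 20 controls (1 exposed), 10 cases (4 exposed).
perm_score_p(m0 = 20, m1 = 10, r0 = 1, r1 = 4)
```

Because only the exposed-case count varies once the margins are fixed, the enumeration is one-dimensional and stays cheap even for large samples.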
The perm.tests function returns p-values from permutation tests based on score, likelihood ratio, Wald (with and without regularization), and Firth statistics.
The following code runs the tests for a dataset containing 5,000 cases (55 with a minor allele of interest) and 15,000 controls (45 with a minor allele of interest):
##      score.p         lr.p       wald.p      wald0.p      firth.p
## 1.362901e-10 3.109880e-10 1.362901e-10 1.365853e-10 1.000000e+00
For comparison purposes, the basic.tests function returns p-values for the standard score, likelihood ratio, Wald, Firth, and Fisher’s exact tests:
##      score.p         lr.p       wald.p      wald0.p      firth.p     fisher.p
## 3.768763e-12 1.524214e-10 7.777712e-11 9.028622e-11 1.325086e-10 1.464917e-10
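For orientation, two of these standard tests have close analogues in base R. The sketch below applies `fisher.test` and an uncorrected `chisq.test` (the Pearson chi-square test, which coincides with the score test for a 2x2 table) to the same counts: 5,000 cases with 55 carriers and 15,000 controls with 45 carriers. These are base R functions rather than the package's basic.tests, so the values are offered only as a point of comparison.

```r
# Rows: case status; columns: minor allele carrier status.
tab <- matrix(c(55, 5000 - 55,
                45, 15000 - 45),
              nrow = 2, byrow = TRUE,
              dimnames = list(status = c("case", "control"),
                              allele = c("carrier", "non-carrier")))

# Fisher's exact test (two-sided).
fisher_p <- fisher.test(tab)$p.value

# Pearson chi-square test without continuity correction.
score_p <- chisq.test(tab, correct = FALSE)$p.value

c(score.p = score_p, fisher.p = fisher_p)
```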
AU testing works by computing a test statistic T for the observed data, generating all plausible datasets with any number of variants, then adding up the probabilities of those datasets which give more extreme test statistics than T.
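The following base R sketch shows one way to carry out this enumeration for the score statistic. It is a brute-force illustration under stated assumptions, not the package's code: the helper names `score_stat` and `au_score_p` are hypothetical, the unknown exposure probability is assumed to be replaced by its pooled estimate under the null, and datasets tying with the observed statistic are counted as extreme. Exposed counts among controls and cases are then treated as independent binomials.

```r
# Vectorised score (Pearson chi-square) statistic for 2x2 tables with
# m0 controls, m1 cases, r0 exposed controls, r1 exposed cases.
score_stat <- function(m0, m1, r0, r1) {
  n <- m0 + m1
  r <- r0 + r1
  variance_r1 <- m0 * m1 * r * (n - r) / n^3
  stat <- (r1 - m1 * r / n)^2 / variance_r1
  ifelse(variance_r1 > 0, stat, 0)  # tables with no exposure variation
}

# Hypothetical helper: approximate unconditional p-value (score statistic).
au_score_p <- function(m0, m1, r0, r1) {
  p_hat <- (r0 + r1) / (m0 + m1)  # pooled exposure estimate under the null
  t_obs <- score_stat(m0, m1, r0, r1)
  # Every possible pair of exposed counts; the total is no longer fixed.
  grid <- expand.grid(x0 = 0:m0, x1 = 0:m1)
  t_all <- score_stat(m0, m1, grid$x0, grid$x1)
  probs <- dbinom(grid$x0, m0, p_hat) * dbinom(grid$x1, m1, p_hat)
  sum(probs[t_all >= t_obs * (1 - 1e-7)])
}

# Small illustrative table: 200 controls (2 exposed), 100 cases (6 exposed).
au_score_p(m0 = 200, m1 = 100, r0 = 2, r1 = 6)
```

The grid has (m0 + 1)(m1 + 1) cells, so this direct version is only practical for small tables.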
The au.tests function returns p-values from AU tests based on score, likelihood ratio, and Wald (with and without regularization) statistics. The au.firth function returns a p-value from the AU Firth test. It was implemented as a separate function due to its increased computational time.
The following code runs the tests for a dataset containing 10,000 cases (60 with a minor allele of interest) and 10,000 controls (45 with a minor allele of interest):
##   score.p      lr.p    wald.p   wald0.p
## 0.1420303 0.1430718 0.1431031 0.1431030
## au.firth.p
##    0.85898
To gain precision or adjust for a confounding variable, it can be of interest to perform a stratified analysis. The perm.test.strat function implements a permutation likelihood ratio test that allows for categorical covariates, and the au.test.strat function implements a similar AU test. Both functions take vectors of controls, cases, controls with the exposure, and cases with the exposure, where the i-th element of each vector is the count for the i-th stratum.
Consider the following example data, with two strata (i.e., a binary covariate):
m0list = c(500, 1250) # controls
m1list = c(150, 100) # cases
r0list = c(60, 20) # exposed controls
r1list = c(25, 5) # exposed cases
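Before running any tests, it can help to look at exposure frequencies within each stratum and after collapsing over strata. The short base R sketch below does only that arithmetic; the object names `by_stratum`, `pooled`, and `case_fraction` are illustrative.

```r
m0list <- c(500, 1250)  # controls per stratum
m1list <- c(150, 100)   # cases per stratum
r0list <- c(60, 20)     # exposed controls per stratum
r1list <- c(25, 5)      # exposed cases per stratum

# Exposure frequency within each stratum, by case status.
by_stratum <- rbind(controls = r0list / m0list,
                    cases    = r1list / m1list)
colnames(by_stratum) <- c("stratum1", "stratum2")

# Exposure frequency after collapsing over the strata.
pooled <- c(controls = sum(r0list) / sum(m0list),
            cases    = sum(r1list) / sum(m1list))

# Proportion of subjects in each stratum who are cases.
case_fraction <- m1list / (m0list + m1list)

round(by_stratum, 3)
round(pooled, 3)
round(case_fraction, 3)
```

The first stratum has both a higher exposure frequency and a higher proportion of cases than the second, which is the pattern under which a collapsed analysis and a stratified analysis can disagree.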
A non-stratified analysis would yield a highly significant result:
##      score.p         lr.p       wald.p      wald0.p      firth.p
## 1.283296e-05 1.758305e-05 1.283296e-05 1.310045e-05 9.999910e-01
##      score.p         lr.p       wald.p      wald0.p
## 7.592631e-06 2.077567e-05 7.893789e-06 8.701288e-06
When adjusting for the covariate, however, the result is much less significant:
##     lrt.p
## 0.0460971
##      lrt.p
## 0.04333194
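As a rough cross-check, the same contrast between crude and adjusted analyses can be reproduced with ordinary logistic regression in base R. Note that this swaps in the standard large-sample likelihood ratio test from `glm`, not the permutation or AU procedures above, so its p-values need not match the package output; the data frame layout and object names are illustrative.

```r
m0list <- c(500, 1250)  # controls per stratum
m1list <- c(150, 100)   # cases per stratum
r0list <- c(60, 20)     # exposed controls per stratum
r1list <- c(25, 5)      # exposed cases per stratum

# One row per stratum-by-exposure cell, with case and control counts.
dat <- data.frame(
  stratum  = factor(rep(1:2, times = 2)),
  exposed  = rep(c(0, 1), each = 2),
  cases    = c(m1list - r1list, r1list),
  controls = c(m0list - r0list, r0list)
)

# Likelihood ratio test for the exposure term: difference in deviance
# between nested models, referred to a chi-square with 1 df.
lrt_p <- function(null_formula, full_formula) {
  null_fit <- glm(null_formula, family = binomial, data = dat)
  full_fit <- glm(full_formula, family = binomial, data = dat)
  pchisq(deviance(null_fit) - deviance(full_fit), df = 1, lower.tail = FALSE)
}

crude_p    <- lrt_p(cbind(cases, controls) ~ 1,
                    cbind(cases, controls) ~ exposed)
adjusted_p <- lrt_p(cbind(cases, controls) ~ stratum,
                    cbind(cases, controls) ~ stratum + exposed)

c(crude = crude_p, adjusted = adjusted_p)
```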