Package 'permutest'

Title: Run Permutation Tests and Construct Associated Confidence Intervals
Description: Implements permutation tests for any test statistic and randomization scheme and constructs associated confidence intervals as described in Glazer and Stark (2024) <doi:10.48550/arXiv.2405.05238>.
Authors: Amanda Glazer [aut, cre]
Maintainer: Amanda Glazer <[email protected]>
License: GPL (>= 3)
Version: 1.0.0
Built: 2024-11-26 06:48:12 UTC
Source: CRAN

Help Index


Adjust p-values for multiple testing

Description

This function takes an array of p-values and returns adjusted p-values using user-inputted FWER or FDR correction method

Usage

adjust_p_value(pvalues, method = "holm-bonferroni")

Arguments

pvalues

Array of p-values

method

The FWER or FDR correction to use, either 'holm-bonferroni', 'bonferroni', or 'benjamini-hochberg'

Value

Adjusted p-values

Examples

adjust_p_value(pvalues = c(.05, .1, .5), method='holm-bonferroni')

Calculate difference in means

Description

This function takes a data frame, and group and outcome column names as input and returns the difference in mean outcome between the two groups

Usage

diff_in_means(df, group_col, outcome_col, treatment_value = NULL)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_col

The name of the column in df that corresponds to the outcome variable

treatment_value

The value of group_col to be considered 'treatment'

Value

The difference in mean outcome between the two groups

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

diff_in_means(df = data,
              group_col = "group",
              outcome_col = "outcome",
              treatment_value = 1)

Calculate difference in medians

Description

This function takes a data frame, and group and outcome column names as input and returns the difference in median outcome between the two groups

Usage

diff_in_medians(df, group_col, outcome_col, treatment_value = NULL)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_col

The name of the column in df that corresponds to the outcome variable

treatment_value

The value of group_col to be considered 'treatment'

Value

The difference in median outcome between the two groups

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

diff_in_medians(df = data,
              group_col = "group",
              outcome_col = "outcome",
              treatment_value = 1)

Fisher combining function

Description

This function takes an array of p-values and returns a combined p-value using fisher's combining function: 2ilog(pi)-2 \sum_i \log(p_i)

Usage

fisher(pvalues)

Arguments

pvalues

Array of p-values

Value

Combined p-value using fisher's method

Examples

fisher(pvalues = c(.05, .1, .5))

Liptak combining function

Description

This function takes an array of p-values and returns a combined p-value using Liptak's combining function: iΦ1(1pi)\sum_i \Phi^{-1}(1-p_i) where Φ\Phi is the CDF of the Normal distribution

Usage

liptak(pvalues)

Arguments

pvalues

Array of p-values

Value

Combined p-value using Liptak's method

Examples

liptak(pvalues = c(.05, .1, .5))

Run NPC

Description

This function takes a data frame and group and outcome column names as input and returns the nonparametric combination of tests (NPC) omnibus p-value

Usage

npc(
  df,
  group_col,
  outcome_cols,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  combn = "fisher",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  seed = NULL
)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_cols

The names of the columns in df that corresponds to the outcome variable

strata_col

The name of the column in df that corresponds to the strata

test_stat

Test statistic function

perm_func

Function to permute group, default is permute_group which randomly permutes group assignment

combn

Combining function method to use, takes values 'fisher', 'tippett', or 'liptak', or a user defined function

shift

Value of shift to apply in one- or two-sample problem

reps

Number of iterations to use when calculating permutation p-value

perm_set

Matrix of permutations to use instead of reps iterations of perm_func

complete_enum

Boolean, whether to calculate P-value under complete enumeration of permutations

seed

An integer seed value

Value

The omnibus p-value

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
out1 = c(0, 1, 0, 0, 1, 1, 1, 0),
out2 = rep(1, 8))
output <- npc(df = data, group_col = "group",
              outcome_cols = c("out1", "out2"), perm_func = permute_group,
              combn = "fisher", reps = 10^4, seed=42)

One-sample permutation test

Description

This function runs a permutation test for the one-sample problem by calling the permutation_test function using the one-sample mean test statistic.

Usage

one_sample(x, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)

Arguments

x

array of data

shift

Value of shift to apply in one-sample problem

alternative

String, two-sided or one-sided (greater or less) p-value

reps

Number of iterations to use when calculating permutation p-value

seed

An integer seed value

Value

The permutation test p-value

Examples

one_sample(x = c(-1, 1, 2), seed = 42)

Calculate the one-sample problem test statistic

Description

This function takes a data frame, and group and outcome column names as input and returns the mean of the product of the outcome and group. This test statistic is used for the one-sample problem.

Usage

one_sample_mean(df, group_col, outcome_col)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_col

The name of the column in df that corresponds to the outcome variable

Value

The one-sample problem test statistic: the mean of the product of the outcome and group

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

one_sample_mean(df = data,
              group_col = "group",
              outcome_col = "outcome")

Calculate one-way anova test statistic

Description

This function takes a data frame, and group and outcome column names as input and returns the one-way anova test statistic

Usage

one_way_anova_stat(df, group_col, outcome_col)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_col

The name of the column in df that corresponds to the outcome variable

Value

The one-way anova test statistic: g=1Gng(XgX)2\sum_{g=1}^G n_g(\overline{X_g} - \overline{X})^2 where gg indexes the groups


Run permutation test

Description

Run permutation test with user inputted data, test statistic, and permutation function

Usage

permutation_test(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  alternative = "two-sided",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  return_test_dist = FALSE,
  return_perm_dist = FALSE,
  seed = NULL
)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_col

The name of the column in df that corresponds to the outcome variable

strata_col

The name of the column in df that corresponds to the strata

test_stat

Test statistic function

perm_func

Function to permute group

alternative

String, two-sided or one-sided (greater or less) p-value; options are 'greater', 'less', or 'two-sided'

shift

Value of shift to apply in one- or two-sample problem

reps

Number of iterations to use when calculating permutation p-value

perm_set

Matrix of group assignments to use instead of reps iterations of perm_func

complete_enum

Boolean, whether to calculate P-value under complete enumeration of permutations

return_test_dist

Boolean, whether to return test statistic distribution under permutations

return_perm_dist

Boolean, whether to return a matrix where each row is the group assignment under that permutation

seed

An integer seed value

Value

p_value: the permutation test p-value

test_stat_dist: array, the distribution of the test statistic under the set of permutations, if return_test_dist is set to TRUE

perm_indices_mat: matrix, each row corresponds to a permutation used in the permutation test calculation

Examples

data <- data.frame(group = c(rep(1, 10), rep(2, 10)), outcome = c(rep(1, 10), rep(1, 10)))

permutation_test(df = data, group_col = "group", outcome_col = "outcome",
test_stat = "diff_in_means", perm_func = permute_group, alternative = "greater",
shift = 0, reps = 10, return_perm_dist = TRUE, return_test_dist = TRUE, seed = 42)

Construct confidence interval by inverting permutation tests

Description

This function constructs a confidence interval by inverting permutation tests and applying the method in Glazer and Stark, 2024.

Usage

permutation_test_ci(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  upper_bracket = NULL,
  lower_bracket = NULL,
  cl = 0.95,
  e = 0.1,
  reps = 10000,
  perm_set = NULL,
  seed = 42
)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_col

The name of the column in df that corresponds to the outcome variable

strata_col

The name of the column in df that corresponds to the strata

test_stat

Test statistic function

perm_func

Function to permute group

upper_bracket

Array with 2 values that bracket upper confidence bound

lower_bracket

Array with 2 values that bracket lower confidence bound

cl

Confidence level, default 0.95

e

Maximum distance from true confidence bound value

reps

Number of iterations to use when calculating permutation p-value

perm_set

Matrix of group assignments to use instead of reps iterations of perm_func

seed

An integer seed value

Value

A list containing the permutation test p-value, and the test statistic distribution if applicable

Examples

x <- c(35.3, 35.9, 37.2, 33.0, 31.9, 33.7, 36.0, 35.0, 33.3, 33.6, 37.9, 35.6, 29.0, 33.7, 35.7)
y <- c(32.5, 34.0, 34.4, 31.8, 35.0, 34.6, 33.5, 33.6, 31.5, 33.8, 34.6)
df <- data.frame(outcome = c(x, y), group = c(rep(1, length(x)), rep(0, length(y))))
permutation_test_ci(df = df, group_col = "group", outcome_col = "outcome", strata_col = NULL,
                    test_stat = "diff_in_means", perm_func = permute_group,
                    upper_bracket = NULL, lower_bracket = NULL,
                    cl = 0.95, e = 0.01, reps = 10^3, seed = 42)

Unstratified group permutation

Description

This function takes a data frame and group column name as input and returns the dataframe with the group column randomly permuted

Usage

permute_group(df, group_col, strata_col = NULL, seed = NULL)

Arguments

df

A data frame

group_col

String, the name of the column in df that corresponds to the group label

strata_col

The name of the column in df that corresponds to the strata, should be NULL for unstratified permutation

seed

An integer seed value

Value

The inputted data frame with the group column randomly shuffled

Examples

data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = NULL, seed = 42)

Sign permutation

Description

This function takes a data frame and group and outcome column name as input and returns the dataframe with the group column replaced with randomly assigned signs

Usage

permute_sign(df, group_col, strata_col = NULL, seed = NULL)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

strata_col

The name of the column in df that corresponds to the strata, should be NULL for this function

seed

An integer seed value

Value

The inputted data frame with the group column replaced with randomly assigned signs

Examples

data <- data.frame(group_label = rep(1, 6), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = NULL, seed = 42)

Stratified group permutation

Description

This function takes a data frame and group and strata column name as input and returns the dataframe with the group column randomly permuted by strata

Usage

strat_permute_group(df, group_col, strata_col, seed = NULL)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

strata_col

The name of the column in df that corresponds to the strata

seed

An integer seed value

Value

The inputted data frame with the group column randomly shuffled by strata

Examples

data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), stratum = c(1, 1, 1, 2, 2, 2), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = "stratum", seed = 42)

Tippett combining function

Description

This function takes an array of p-values and returns a combined p-value using Tippett's combining function: maxi{1pi}\max_i \{1-p_i\}

Usage

tippett(pvalues)

Arguments

pvalues

Array of p-values

Value

Combined p-value using Tippett's method

Examples

tippett(pvalues = c(.05, .1, .5))

Calculate t-test statistic

Description

This function takes a data frame, and group and outcome column names as input and returns the t test statistic

Usage

ttest_stat(df, group_col, outcome_col)

Arguments

df

A data frame

group_col

The name of the column in df that corresponds to the group label

outcome_col

The name of the column in df that corresponds to the outcome variable

Value

The t test statistic


Two-sample permutation test

Description

This function runs a permutation test with difference in means test statistic for the two-sample problem by calling the permutation_test function.

Usage

two_sample(x, y, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)

Arguments

x

array of data for treatment group

y

array of data for control group

shift

Value of shift to apply in two-sample problem

alternative

String, two-sided or one-sided (greater or less) p-value; options are 'greater', 'less', or 'two-sided'

reps

Number of iterations to use when calculating permutation p-value

seed

An integer seed value

Value

The permutation test p-value

Examples

two_sample(x = c(10, 9, 11), y = c(12, 11, 13), alternative = "less", seed = 42)