Package 'permutest' reference manual

Title:	Run Permutation Tests and Construct Associated Confidence Intervals
Description:	Implements permutation tests for any test statistic and randomization scheme and constructs associated confidence intervals as described in Glazer and Stark (2024) <doi:10.48550/arXiv.2405.05238>.
Authors:	Amanda Glazer [aut, cre]
Maintainer:	Amanda Glazer <[email protected]>
License:	GPL (>= 3)
Version:	1.0.0
Built:	2025-02-24 06:57:51 UTC
Source:	CRAN

Adjust p-values for multiple testing

Description

This function takes an array of p-values and returns adjusted p-values using user-inputted FWER or FDR correction method

Usage

adjust_p_value(pvalues, method = "holm-bonferroni")
adjust_p_value(pvalues, method = "holm-bonferroni")

Arguments

`pvalues`	Array of p-values
`method`	The FWER or FDR correction to use, either 'holm-bonferroni', 'bonferroni', or 'benjamini-hochberg'

Value

Adjusted p-values

Examples

adjust_p_value(pvalues = c(.05, .1, .5), method='holm-bonferroni')

adjust_p_value(pvalues = c(.05, .1, .5), method='holm-bonferroni')

Calculate difference in means

Description

This function takes a data frame, and group and outcome column names as input and returns the difference in mean outcome between the two groups

Usage

diff_in_means(df, group_col, outcome_col, treatment_value = NULL)
diff_in_means(df, group_col, outcome_col, treatment_value = NULL)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_col`	The name of the column in df that corresponds to the outcome variable
`treatment_value`	The value of group_col to be considered 'treatment'

Value

The difference in mean outcome between the two groups

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

diff_in_means(df = data,
              group_col = "group",
              outcome_col = "outcome",
              treatment_value = 1)

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

diff_in_means(df = data,
              group_col = "group",
              outcome_col = "outcome",
              treatment_value = 1)

Calculate difference in medians

Description

This function takes a data frame, and group and outcome column names as input and returns the difference in median outcome between the two groups

Usage

diff_in_medians(df, group_col, outcome_col, treatment_value = NULL)
diff_in_medians(df, group_col, outcome_col, treatment_value = NULL)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_col`	The name of the column in df that corresponds to the outcome variable
`treatment_value`	The value of group_col to be considered 'treatment'

Value

The difference in median outcome between the two groups

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

diff_in_medians(df = data,
              group_col = "group",
              outcome_col = "outcome",
              treatment_value = 1)
data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

diff_in_medians(df = data,
              group_col = "group",
              outcome_col = "outcome",
              treatment_value = 1)

Fisher combining function

Description

This function takes an array of p-values and returns a combined p-value using fisher's combining function: $-2 \sum_i \log(p_i)$

Usage

fisher(pvalues)
fisher(pvalues)

Arguments

pvalues

Array of p-values

Value

Combined p-value using fisher's method

Examples

fisher(pvalues = c(.05, .1, .5))

fisher(pvalues = c(.05, .1, .5))

Liptak combining function

Description

This function takes an array of p-values and returns a combined p-value using Liptak's combining function: $\sum_i \Phi^{-1}(1-p_i)$ where $\Phi$ is the CDF of the Normal distribution

Usage

liptak(pvalues)
liptak(pvalues)

Arguments

pvalues

Array of p-values

Value

Combined p-value using Liptak's method

Examples

liptak(pvalues = c(.05, .1, .5))

liptak(pvalues = c(.05, .1, .5))

Run NPC

Description

This function takes a data frame and group and outcome column names as input and returns the nonparametric combination of tests (NPC) omnibus p-value

Usage

npc(
  df,
  group_col,
  outcome_cols,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  combn = "fisher",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  seed = NULL
)
npc(
  df,
  group_col,
  outcome_cols,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  combn = "fisher",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  seed = NULL
)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_cols`	The names of the columns in df that corresponds to the outcome variable
`strata_col`	The name of the column in df that corresponds to the strata
`test_stat`	Test statistic function
`perm_func`	Function to permute group, default is permute_group which randomly permutes group assignment
`combn`	Combining function method to use, takes values 'fisher', 'tippett', or 'liptak', or a user defined function
`shift`	Value of shift to apply in one- or two-sample problem
`reps`	Number of iterations to use when calculating permutation p-value
`perm_set`	Matrix of permutations to use instead of reps iterations of perm_func
`complete_enum`	Boolean, whether to calculate P-value under complete enumeration of permutations
`seed`	An integer seed value

Value

The omnibus p-value

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
out1 = c(0, 1, 0, 0, 1, 1, 1, 0),
out2 = rep(1, 8))
output <- npc(df = data, group_col = "group",
              outcome_cols = c("out1", "out2"), perm_func = permute_group,
              combn = "fisher", reps = 10^4, seed=42)

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
out1 = c(0, 1, 0, 0, 1, 1, 1, 0),
out2 = rep(1, 8))
output <- npc(df = data, group_col = "group",
              outcome_cols = c("out1", "out2"), perm_func = permute_group,
              combn = "fisher", reps = 10^4, seed=42)

One-sample permutation test

Description

This function runs a permutation test for the one-sample problem by calling the permutation_test function using the one-sample mean test statistic.

Usage

one_sample(x, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)
one_sample(x, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)

Arguments

`x`	array of data
`shift`	Value of shift to apply in one-sample problem
`alternative`	String, two-sided or one-sided (greater or less) p-value
`reps`	Number of iterations to use when calculating permutation p-value
`seed`	An integer seed value

Value

The permutation test p-value

Examples

one_sample(x = c(-1, 1, 2), seed = 42)

one_sample(x = c(-1, 1, 2), seed = 42)

Calculate the one-sample problem test statistic

Description

This function takes a data frame, and group and outcome column names as input and returns the mean of the product of the outcome and group. This test statistic is used for the one-sample problem.

Usage

one_sample_mean(df, group_col, outcome_col)
one_sample_mean(df, group_col, outcome_col)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_col`	The name of the column in df that corresponds to the outcome variable

Value

The one-sample problem test statistic: the mean of the product of the outcome and group

Examples

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

one_sample_mean(df = data,
              group_col = "group",
              outcome_col = "outcome")

data <- data.frame(group = c(rep(1, 4), rep(2, 4)),
                   outcome = c(rep(3, 4), rep(5, 4)))

one_sample_mean(df = data,
              group_col = "group",
              outcome_col = "outcome")

Calculate one-way anova test statistic

Description

This function takes a data frame, and group and outcome column names as input and returns the one-way anova test statistic

Usage

one_way_anova_stat(df, group_col, outcome_col)
one_way_anova_stat(df, group_col, outcome_col)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_col`	The name of the column in df that corresponds to the outcome variable

Value

The one-way anova test statistic: $\sum_{g=1}^G n_g(\overline{X_g} - \overline{X})^2$ where $g$ indexes the groups

Run permutation test

Description

Run permutation test with user inputted data, test statistic, and permutation function

Usage

permutation_test(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  alternative = "two-sided",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  return_test_dist = FALSE,
  return_perm_dist = FALSE,
  seed = NULL
)
permutation_test(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  alternative = "two-sided",
  shift = 0,
  reps = 10000,
  perm_set = NULL,
  complete_enum = FALSE,
  return_test_dist = FALSE,
  return_perm_dist = FALSE,
  seed = NULL
)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_col`	The name of the column in df that corresponds to the outcome variable
`strata_col`	The name of the column in df that corresponds to the strata
`test_stat`	Test statistic function
`perm_func`	Function to permute group
`alternative`	String, two-sided or one-sided (greater or less) p-value; options are 'greater', 'less', or 'two-sided'
`shift`	Value of shift to apply in one- or two-sample problem
`reps`	Number of iterations to use when calculating permutation p-value
`perm_set`	Matrix of group assignments to use instead of reps iterations of perm_func
`complete_enum`	Boolean, whether to calculate P-value under complete enumeration of permutations
`return_test_dist`	Boolean, whether to return test statistic distribution under permutations
`return_perm_dist`	Boolean, whether to return a matrix where each row is the group assignment under that permutation
`seed`	An integer seed value

Value

p_value: the permutation test p-value

test_stat_dist: array, the distribution of the test statistic under the set of permutations, if return_test_dist is set to TRUE

perm_indices_mat: matrix, each row corresponds to a permutation used in the permutation test calculation

Examples

data <- data.frame(group = c(rep(1, 10), rep(2, 10)), outcome = c(rep(1, 10), rep(1, 10)))

permutation_test(df = data, group_col = "group", outcome_col = "outcome",
test_stat = "diff_in_means", perm_func = permute_group, alternative = "greater",
shift = 0, reps = 10, return_perm_dist = TRUE, return_test_dist = TRUE, seed = 42)

data <- data.frame(group = c(rep(1, 10), rep(2, 10)), outcome = c(rep(1, 10), rep(1, 10)))

permutation_test(df = data, group_col = "group", outcome_col = "outcome",
test_stat = "diff_in_means", perm_func = permute_group, alternative = "greater",
shift = 0, reps = 10, return_perm_dist = TRUE, return_test_dist = TRUE, seed = 42)

Construct confidence interval by inverting permutation tests

Description

This function constructs a confidence interval by inverting permutation tests and applying the method in Glazer and Stark, 2024.

Usage

permutation_test_ci(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  upper_bracket = NULL,
  lower_bracket = NULL,
  cl = 0.95,
  e = 0.1,
  reps = 10000,
  perm_set = NULL,
  seed = 42
)
permutation_test_ci(
  df,
  group_col,
  outcome_col,
  strata_col = NULL,
  test_stat = "diff_in_means",
  perm_func = permute_group,
  upper_bracket = NULL,
  lower_bracket = NULL,
  cl = 0.95,
  e = 0.1,
  reps = 10000,
  perm_set = NULL,
  seed = 42
)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_col`	The name of the column in df that corresponds to the outcome variable
`strata_col`	The name of the column in df that corresponds to the strata
`test_stat`	Test statistic function
`perm_func`	Function to permute group
`upper_bracket`	Array with 2 values that bracket upper confidence bound
`lower_bracket`	Array with 2 values that bracket lower confidence bound
`cl`	Confidence level, default 0.95
`e`	Maximum distance from true confidence bound value
`reps`	Number of iterations to use when calculating permutation p-value
`perm_set`	Matrix of group assignments to use instead of reps iterations of perm_func
`seed`	An integer seed value

Value

A list containing the permutation test p-value, and the test statistic distribution if applicable

Examples

x <- c(35.3, 35.9, 37.2, 33.0, 31.9, 33.7, 36.0, 35.0, 33.3, 33.6, 37.9, 35.6, 29.0, 33.7, 35.7)
y <- c(32.5, 34.0, 34.4, 31.8, 35.0, 34.6, 33.5, 33.6, 31.5, 33.8, 34.6)
df <- data.frame(outcome = c(x, y), group = c(rep(1, length(x)), rep(0, length(y))))
permutation_test_ci(df = df, group_col = "group", outcome_col = "outcome", strata_col = NULL,
                    test_stat = "diff_in_means", perm_func = permute_group,
                    upper_bracket = NULL, lower_bracket = NULL,
                    cl = 0.95, e = 0.01, reps = 10^3, seed = 42)
x <- c(35.3, 35.9, 37.2, 33.0, 31.9, 33.7, 36.0, 35.0, 33.3, 33.6, 37.9, 35.6, 29.0, 33.7, 35.7)
y <- c(32.5, 34.0, 34.4, 31.8, 35.0, 34.6, 33.5, 33.6, 31.5, 33.8, 34.6)
df <- data.frame(outcome = c(x, y), group = c(rep(1, length(x)), rep(0, length(y))))
permutation_test_ci(df = df, group_col = "group", outcome_col = "outcome", strata_col = NULL,
                    test_stat = "diff_in_means", perm_func = permute_group,
                    upper_bracket = NULL, lower_bracket = NULL,
                    cl = 0.95, e = 0.01, reps = 10^3, seed = 42)

Unstratified group permutation

Description

This function takes a data frame and group column name as input and returns the dataframe with the group column randomly permuted

Usage

permute_group(df, group_col, strata_col = NULL, seed = NULL)
permute_group(df, group_col, strata_col = NULL, seed = NULL)

Arguments

`df`	A data frame
`group_col`	String, the name of the column in df that corresponds to the group label
`strata_col`	The name of the column in df that corresponds to the strata, should be NULL for unstratified permutation
`seed`	An integer seed value

Value

The inputted data frame with the group column randomly shuffled

Examples

data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = NULL, seed = 42)

data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = NULL, seed = 42)

Sign permutation

Description

This function takes a data frame and group and outcome column name as input and returns the dataframe with the group column replaced with randomly assigned signs

Usage

permute_sign(df, group_col, strata_col = NULL, seed = NULL)
permute_sign(df, group_col, strata_col = NULL, seed = NULL)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`strata_col`	The name of the column in df that corresponds to the strata, should be NULL for this function
`seed`	An integer seed value

Value

The inputted data frame with the group column replaced with randomly assigned signs

Examples

data <- data.frame(group_label = rep(1, 6), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = NULL, seed = 42)

data <- data.frame(group_label = rep(1, 6), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = NULL, seed = 42)

Stratified group permutation

Description

This function takes a data frame and group and strata column name as input and returns the dataframe with the group column randomly permuted by strata

Usage

strat_permute_group(df, group_col, strata_col, seed = NULL)
strat_permute_group(df, group_col, strata_col, seed = NULL)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`strata_col`	The name of the column in df that corresponds to the strata
`seed`	An integer seed value

Value

The inputted data frame with the group column randomly shuffled by strata

Examples

data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), stratum = c(1, 1, 1, 2, 2, 2), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = "stratum", seed = 42)

data <- data.frame(group_label = c(1, 2, 2, 1, 2, 1), stratum = c(1, 1, 1, 2, 2, 2), outcome = 1:6)
permute_group(df = data, group_col = "group_label", strata_col = "stratum", seed = 42)

Tippett combining function

Description

This function takes an array of p-values and returns a combined p-value using Tippett's combining function: $\max_i \{1-p_i\}$

Usage

tippett(pvalues)
tippett(pvalues)

Arguments

pvalues

Array of p-values

Value

Combined p-value using Tippett's method

Examples

tippett(pvalues = c(.05, .1, .5))

tippett(pvalues = c(.05, .1, .5))

Calculate t-test statistic

Description

This function takes a data frame, and group and outcome column names as input and returns the t test statistic

Usage

ttest_stat(df, group_col, outcome_col)
ttest_stat(df, group_col, outcome_col)

Arguments

`df`	A data frame
`group_col`	The name of the column in df that corresponds to the group label
`outcome_col`	The name of the column in df that corresponds to the outcome variable

Value

The t test statistic

Two-sample permutation test

Description

This function runs a permutation test with difference in means test statistic for the two-sample problem by calling the permutation_test function.

Usage

two_sample(x, y, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)
two_sample(x, y, shift = 0, alternative = "greater", reps = 10^4, seed = NULL)

Arguments

`x`	array of data for treatment group
`y`	array of data for control group
`shift`	Value of shift to apply in two-sample problem
`alternative`	String, two-sided or one-sided (greater or less) p-value; options are 'greater', 'less', or 'two-sided'
`reps`	Number of iterations to use when calculating permutation p-value
`seed`	An integer seed value

Value

The permutation test p-value

Examples

two_sample(x = c(10, 9, 11), y = c(12, 11, 13), alternative = "less", seed = 42)

two_sample(x = c(10, 9, 11), y = c(12, 11, 13), alternative = "less", seed = 42)

Package 'permutest'

Help Index

Adjust p-values for multiple testing

Description

Usage

Arguments

Value

Examples

Calculate difference in means

Description

Usage

Arguments

Value

Examples

Calculate difference in medians

Description

Usage

Arguments

Value

Examples

Fisher combining function

Description

Usage

Arguments

Value

Examples

Liptak combining function

Description

Usage

Arguments

Value

Examples

Run NPC

Description

Usage

Arguments

Value

Examples

One-sample permutation test

Description

Usage

Arguments

Value

Examples

Calculate the one-sample problem test statistic

Description

Usage

Arguments

Value

Examples

Calculate one-way anova test statistic

Description

Usage

Arguments

Value

Run permutation test

Description

Usage

Arguments

Value

Examples

Construct confidence interval by inverting permutation tests

Description

Usage

Arguments

Value

Examples

Unstratified group permutation

Description

Usage

Arguments

Value

Examples

Sign permutation

Description

Usage

Arguments

Value

Examples

Stratified group permutation