Package 'R2sample' reference manual

Title:	Various Methods for the Two Sample Problem
Description:	The routine twosample_test() in this package runs the two sample test using various test statistics. The p values are found via permutation or large sample theory. The routine twosample_power() calculates the power in various cases, and plot_power() draws the corresponding power graphs.
Authors:	Wolfgang Rolke [aut, cre]
Maintainer:	Wolfgang Rolke <[email protected]>
License:	GPL (>= 2)
Version:	2.2.0
Built:	2024-10-22 07:22:29 UTC
Source:	CRAN

This function finds the p values of several tests based on large sample theory

Description

This function finds the p values of several tests based on large sample theory

Usage

asymptotic_pvalues(x, n, m)

Arguments

`x`	a vector of test statistics
`n`	size of sample 1
`m`	size of sample 2

Value

A vector of p values.
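As an illustration of what "large sample theory" means for a single statistic, the sketch below computes the asymptotic p value of the two-sample Kolmogorov-Smirnov statistic from the limiting Kolmogorov distribution, using base R only. The helper name ks_asymptotic_pvalue and its terms argument are hypothetical and not part of R2sample; which statistics asymptotic_pvalues() actually handles, and how, is not stated in this manual.

```r
# Asymptotic p value of the two-sample Kolmogorov-Smirnov statistic D.
# Hypothetical helper for illustration; not a function of R2sample.
ks_asymptotic_pvalue <- function(D, n, m, terms = 100) {
  z <- sqrt(n * m / (n + m)) * D        # scaled statistic
  if (z < 1e-8) return(1)               # the series is not usable at z = 0
  k <- seq_len(terms)
  # Tail of the Kolmogorov distribution: 2 * sum (-1)^(k-1) exp(-2 k^2 z^2)
  p <- 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * z^2))
  min(max(p, 0), 1)                     # clamp rounding error to [0, 1]
}

set.seed(1)
x <- rnorm(2000)
y <- rnorm(3000, mean = 0.1)
D <- unname(ks.test(x, y)$statistic)    # observed KS statistic
p_sketch <- ks_asymptotic_pvalue(D, length(x), length(y))
p_sketch
```

The truncated series converges quickly for moderate values of the scaled statistic; for values very close to zero more terms would be needed.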

This function runs the chi-square test for continuous or discrete data

Description

This function runs the chi-square test for continuous or discrete data

Usage

chi_power(
  rxy,
  alpha = 0.05,
  B = 1000,
  xparam,
  yparam,
  nbins = c(50, 10),
  minexpcount = 5,
  typeTS
)

Arguments

`rxy`	a function to generate data
`alpha`	=0.05 type I error probability of test
`B`	=1000 number of simulation runs
`xparam`	vector of parameter values
`yparam`	vector of parameter values
`nbins`	=c(50, 10) number of desired bins
`minexpcount`	=5 smallest number of counts required in each bin
`typeTS`	type of problem, continuous/discrete, with/without weights

Value

A matrix of power values

This function draws the power graph, with curves sorted by the mean power and smoothed for easier reading.

Description

This function draws the power graph, with curves sorted by the mean power and smoothed for easier reading.

Usage

plot_power(pwr, xname = " ", title, Smooth = TRUE, span = 0.25)

Arguments

`pwr`	a matrix of power values, usually from the twosample_power command
`xname`	Name of variable on x axis
`title`	(Optional) title of graph
`Smooth`	=TRUE lines are smoothed for easier reading
`span`	=0.25 bandwidth of smoothing method

Value

plt, an object of class ggplot.

Runs the shiny app associated with the R2sample package

Description

Runs the shiny app associated with the R2sample package

Usage

run_shiny()

Value

No return value, called for side effect of opening a shiny app

This function does some rounding to nice numbers

Description

This function does some rounding to nice numbers

Usage

## S3 method for class 'digits'
signif(x, d = 4)

Arguments

`x`	a list of two vectors
`d`	=4 number of digits to round to

Value

A list with rounded vectors

Find the power of various two sample tests using Rcpp and parallel computing.

Description

Find the power of various two sample tests using Rcpp and parallel computing.

Usage

twosample_power(
  f,
  ...,
  TS,
  TSextra,
  alpha = 0.05,
  B = c(1000, 1000),
  nbins = c(50, 10),
  minexpcount = 5,
  UseLargeSample,
  samplingmethod = "independence",
  maxProcessor = 10
)

Arguments

`f`	function to generate a list with data sets x, y and (optional) vals, weights
`...`	additional arguments passed to f, up to 2
`TS`	routine to calculate test statistics for non-chi-square tests
`TSextra`	additional info passed to TS, if necessary
`alpha`	=0.05, the level of the hypothesis test
`B`	=c(1000, 1000), number of simulation runs for power and permutation test.
`nbins`	=c(50,10), number of bins for chi large and chi small.
`minexpcount`	=5 minimum required count for chi square tests
`UseLargeSample`	should p values be found via large sample theory if n,m>10000?
`samplingmethod`	="independence" or "MCMC" in discrete data case
`maxProcessor`	=10, maximum number of cores to use. If maxProcessor=1 no parallel computing is used.

Value

A numeric vector of power values.

Examples

 f=function(mu) list(x=rnorm(25), y=rnorm(25, mu))
 twosample_power(f, mu=c(0,2), B=c(100, 100), maxProcessor = 1)
 f=function(n, p) list(x=table(sample(1:5, size=1000, replace=TRUE)), 
       y=table(sample(1:5, size=n, replace=TRUE, 
       prob=c(1, 1, 1, 1, p))), vals=1:5)
 twosample_power(f, n=c(1000, 2000), p=c(1, 1.5), B=c(100, 100), maxProcessor = 1)
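The two entries of B can be read as two nested simulation loops: an outer loop that generates data sets and an inner loop that finds a permutation p value for each. The base R sketch below illustrates that idea for a single statistic. It is only a rough model of what a power study does, not the package's implementation; the names perm_pvalue, sim_power and abs_mean_diff are hypothetical.

```r
# Permutation p value for one statistic TS(x, y); large values reject.
# Hypothetical helper for illustration; not a function of R2sample.
perm_pvalue <- function(x, y, TS, B = 200) {
  obs <- TS(x, y)
  z <- c(x, y)
  n <- length(x)
  sim <- replicate(B, {
    idx <- sample(length(z), n)         # random relabeling of the pooled data
    TS(z[idx], z[-idx])
  })
  mean(sim >= obs)
}

# Estimated power: fraction of simulated data sets rejected at level alpha.
# B[1] = number of simulated data sets, B[2] = permutations per data set.
sim_power <- function(f, param, TS, alpha = 0.05, B = c(200, 200)) {
  sapply(param, function(p) {
    pvals <- replicate(B[1], {
      dta <- f(p)
      perm_pvalue(dta$x, dta$y, TS, B[2])
    })
    mean(pvals < alpha)
  })
}

set.seed(2)
f <- function(mu) list(x = rnorm(25), y = rnorm(25, mu))
abs_mean_diff <- function(x, y) abs(mean(x) - mean(y))
pwr <- sim_power(f, c(0, 2), abs_mean_diff, B = c(100, 100))
names(pwr) <- c("mu=0", "mu=2")
pwr
```

At mu=0 the null hypothesis is true, so the estimated power should be close to alpha; at mu=2 it should be close to 1.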

This function runs a number of two sample tests using Rcpp and parallel computing.

Description

This function runs a number of two sample tests using Rcpp and parallel computing.

Usage

twosample_test(
  x,
  y,
  vals = NA,
  TS,
  TSextra,
  wx = rep(1, length(x)),
  wy = rep(1, length(y)),
  B = 5000,
  nbins = c(50, 10),
  maxProcessor,
  UseLargeSample,
  samplingmethod = "independence",
  doMethods = "all"
)

Arguments

`x`	a vector of numbers if data is continuous or of counts if data is discrete.
`y`	a vector of numbers if data is continuous or of counts if data is discrete.
`vals`	=NA, a vector of numbers, the values of a discrete random variable. NA if data is continuous.
`TS`	routine to calculate test statistics for non-chi-square tests
`TSextra`	additional info passed to TS, if necessary
`wx`	A numeric vector of weights of x.
`wy`	A numeric vector of weights of y.
`B`	=5000, number of simulation runs for permutation test
`nbins`	=c(50,10), number of bins for chi square tests.
`maxProcessor`	maximum number of cores to use. If missing (the default) no parallel processing is used.
`UseLargeSample`	should p values be found via large sample theory if n,m>10000?
`samplingmethod`	="independence" or "MCMC" for discrete data
`doMethods`	="all" Which methods should be included? If missing all methods are used.

Value

A list of two numeric vectors, the test statistics and the p values.

Examples

 R2sample::twosample_test(rnorm(1000), rt(1000, 4), B=1000)
 myTS=function(x,y) {z=c(mean(x)-mean(y),sd(x)-sd(y));names(z)=c("M","S");z}
 R2sample::twosample_test(rnorm(1000), rt(1000, 4), TS=myTS, B=1000)
 vals=1:5
 x=table(sample(vals, size=100, replace=TRUE))
 y=table(sample(vals, size=100, replace=TRUE, prob=c(1,1,3,1,1)))
 R2sample::twosample_test(x, y, vals)
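To show how a user-supplied TS that returns a named vector can lead to one permutation p value per statistic, here is a minimal base R sketch for continuous data. It assumes that large values of each statistic count against the null hypothesis, so the statistics are absolute differences; the function perm_test and the list element names are hypothetical, and the package's own Rcpp code may differ in detail.

```r
# Permutation test for a vector-valued statistic TS(x, y).
# Hypothetical helper for illustration; not a function of R2sample.
perm_test <- function(x, y, TS, B = 1000) {
  obs <- TS(x, y)                        # named vector of k statistics
  z <- c(x, y)
  n <- length(x)
  sim <- replicate(B, {
    idx <- sample(length(z), n)          # random split of the pooled data
    TS(z[idx], z[-idx])
  })
  sim <- matrix(sim, nrow = length(obs)) # k x B, also when k = 1
  p <- rowMeans(sim >= obs)              # row i is compared with obs[i]
  names(p) <- names(obs)
  list(statistics = obs, p.values = p)
}

set.seed(3)
absTS <- function(x, y) {
  c(M = abs(mean(x) - mean(y)), S = abs(sd(x) - sd(y)))
}
out <- perm_test(rnorm(500), rnorm(500, sd = 1.5), absTS, B = 1000)
out
```

In this example the two samples share a mean but differ in spread, so the p value for S should be very small while the one for M carries no evidence against the null hypothesis.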

This function runs a number of two sample tests using Rcpp and parallel computing and then finds the correct p value for the combined tests.

Description

This function runs a number of two sample tests using Rcpp and parallel computing and then finds the correct p value for the combined tests.

Usage

twosample_test_adjusted_pvalue(
  x,
  y,
  vals = NA,
  TS,
  TSextra,
  wx = rep(1, length(x)),
  wy = rep(1, length(y)),
  B = c(5000, 1000),
  nbins = c(50, 10),
  samplingmethod = "independence",
  doMethods
)

Arguments

`x`	a vector of numbers if data is continuous or of counts if data is discrete.
`y`	a vector of numbers if data is continuous or of counts if data is discrete.
`vals`	=NA, a vector of numbers, the values of a discrete random variable. NA if data is continuous.
`TS`	routine to calculate test statistics for non-chi-square tests
`TSextra`	additional info passed to TS, if necessary
`wx`	A numeric vector of weights of x.
`wy`	A numeric vector of weights of y.
`B`	=c(5000, 1000), number of simulation runs for permutation test
`nbins`	=c(50,10), number of bins for chi square tests.
`samplingmethod`	="independence" or "MCMC" for discrete data
`doMethods`	Which methods should be included?

Value

A list of two numeric vectors, the test statistics and the p values.

Examples

 x=rnorm(100)
 y=rt(200, 4)
 R2sample::twosample_test_adjusted_pvalue(x, y, B=c(500, 500))
 vals=1:5
 x=table(c(1:5, sample(1:5, size=100, replace=TRUE)))-1
 y=table(c(1:5, sample(1:5, size=100, replace=TRUE, prob=c(1,1,3,1,1))))-1
 R2sample::twosample_test_adjusted_pvalue(x, y, vals, B=c(500, 500))
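When several tests are run on the same data and the smallest p value is reported, that p value is too small to be taken at face value. One common correction, sketched below in base R, compares the observed minimum p value with the permutation distribution of the minimum p value. That R2sample uses exactly this scheme is an assumption; the function adjusted_min_pvalue and the element names of its result are hypothetical.

```r
# Min-p adjustment over k statistics using a single set of B permutations.
# Hypothetical helper for illustration; not a function of R2sample.
adjusted_min_pvalue <- function(x, y, TS, B = 500) {
  z <- c(x, y)
  n <- length(x)
  obs <- TS(x, y)
  k <- length(obs)
  sim <- matrix(replicate(B, {
    idx <- sample(length(z), n)
    TS(z[idx], z[-idx])
  }), nrow = k)                               # k x B permuted statistics
  p_obs <- rowMeans(sim >= obs)               # raw p value per statistic
  names(p_obs) <- names(obs)
  # p value of every permuted statistic within its own permutation sample:
  # the fraction of values in the row that are at least as large
  p_sim <- apply(sim, 1, function(s) {
    (B - rank(s, ties.method = "min") + 1) / B
  })                                          # B x k matrix
  min_sim <- apply(p_sim, 1, min)             # min p value per permutation
  list(statistics = obs,
       p.values = p_obs,
       adjusted = mean(min_sim <= min(p_obs)))
}

set.seed(4)
absTS <- function(x, y) {
  c(M = abs(mean(x) - mean(y)), S = abs(sd(x) - sd(y)))
}
res <- adjusted_min_pvalue(rnorm(100), rnorm(200, sd = 2), absTS, B = 500)
res
```

By construction the adjusted p value is never smaller than the smallest raw p value, which is the price paid for choosing the best of several tests after seeing the data.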
