PSIndependenceTest

PSIndependenceTest: Independence Tests for Two-Way, Three-Way and Four-Way Contingency Tables

author: Piotr Sulewski, Pomeranian University

The goal of the package is to put into practice the modular and logarithmic minimum tests for independence in two-way, three-way and four-way contingency tables. Statistic value, cv value and p-value are calculated. This package also includes three table generation functions and six data sets. To read more about the package please see (and cite :)) papers:

Sulewski, P. (2016). Moc testów niezależności w tablicy dwudzielczej większej niż 2×2. Przegląd statystyczny 63(2), 190-210.

Sulewski, P. (2019). The LMS for Testing Independence in Two-way Contingency Tables. Biometrical Letters 56(1), 17-43.

Sulewski, P. (2018). Power Analysis Of Independence Testing for the Three-Way Con-tingency Tables of Small Sizes. Journal of Applied Statistics 45(13), 2481-2498.

Sulewski, P. (2021). Logarithmic Minimum Test for Independence in Three Way Con-tingency Table of Small Sizes. Journal of Statistical Computation and Simulation 91(13), 2780-2799.

Installation

You can install the released version of PSIndependenceTest from CRAN with:

install.packages("PSIndependenceTest")

You can install the development version of PSIndependenceTest from GitHub with:

library("remotes")
remotes::install_github("PiotrSule/PSIndependenceTest")

This package includes four data sets

The first one, table1, consist of 40 observations presented as two-way contingency table 2 x 2. See details: Sulewski, P. (2017). A new test for independence in 2x2 contingency tables. Acta Universitatis Lodziensis. Folia Oeconomica 4(330), 55–75.

The second one, table2, consist of 25 observations described the effect of a treatment for rheumatoid arthritis vs. a placebo presented as two-way contingency table 2 x 3. See details: Sulewski, P. (2019). The LMS for Testing Independence in Two-way Contingency Tables. Biometrical Letters 56 (1), 17-43.

The third one, table3, consist of 695 observations described the frequency of watching videos at home or at friends’ homes for young people between 7 and 15 years of age, cross-classified according to age and sex. Data are presented as three-way contingency table 3 x 3 x 2. See details: Sulewski, P. (2021). Logarithmic Minimum Test for Independence in Three Way Contingency Table of Small Sizes. Journal of Statistical Computation and Simulation 91(13), 2780-2799

The fourth one, table4, consist of 100 observations obtained using the Monte Carlo method when Ho is true, i.e. all probabilities pijt equal 1/24. Data is presented as three-way contingency table 2 x 3 x 4.

The fifth one, table5,provides information on the fate of passengers on the fatal maiden voyage of the ocean liner ‘Titanic’, summarized according to economic status (class), sex, age and survival. Data is presented as four-way contingency table 4 x 2 x 2 x 2. The sample size equals 2201.

The sixth one, table6, consist of 100 observations obtained using the Monte Carlo method when Ho is true, i.e. all probabilities pijtu equal 1/16. Data is presented as four-way contingency table 2 x 2 x 2 x 2.

library(PSIndependenceTest)
dim(table1)
#> [1] 2 2
length(table2)
#> [1] 6

Functions

GenTab2

This function generating the two-way contingency table with the Monte Carlo method

GenTab2(matrix(1/8, nrow = 2, ncol = 4), 50)
#>      [,1] [,2] [,3] [,4]
#> [1,]    8    6   10    7
#> [2,]    3    6    6    4
GenTab2(matrix(1/12, nrow = 4, ncol = 3), 60)
#>      [,1] [,2] [,3]
#> [1,]    6    7    4
#> [2,]    2    1    6
#> [3,]    7    3    5
#> [4,]   10    4    5

Mod2.stat

This function returns the statistic value of the modular independence test in the two-way contingency table.

Mod2.stat(table1)
#> [1] 1.273572
Mod2.stat(GenTab2(matrix(1/9, nrow = 3, ncol = 3), 90))
#> [1] 1.556496

Mod2.cv

This function returns the critical value of the modular independence test in the two-way contingency table.

Mod2.cv(2, 2, 40, 0.05, B = 1e3) 
#> [1] 1.25
Mod2.cv(2, 3, 60, 0.1) 
#> [1] 1.533521

Mod2.pvalue

This function returns the p-value of the modular independence test in the two-way contingency table.

Mod2.pvalue(Mod2.stat(table1), 2, 2, 40, B = 1e3)
#> [1] 0.06093906
Mod2.pvalue(Mod2.stat(table2), 2, 3, 60)
#> [1] 0.6953305

Mod2.test

This function returns the test statistic and p-value of the logarithmic minimum independence test in the two-way contingency table.

Mod2.test(table1, B = 1e3)
#> 
#>  Modular test for independence in two-way contingency table
#> 
#> data:  table1
#> D = 1.2736, p-value = 0.0493
Mod2.test(table2)
#> 
#>  Modular test for independence in two-way contingency table
#> 
#> data:  table2
#> D = 0.61103, p-value = 0.7002

Lms2.stat

This function returns the statistic value of the logarithmic minimum independence test in the two-way contingency table.

Lms2.stat(table1)
#> [1] 1.660665
Lms2.stat(GenTab2(matrix(1/9, nrow = 3, ncol = 3), 90))
#> [1] 1.302853

Lms2.cv

This function returns the critical value of the logarithmic minimum independence test in the two-way contingency table.

Lms2.cv(2, 2, 40, 0.05, B = 1e3) 
#> [1] 1.317246
Lms2.cv(2, 3, 60, 0.1)  
#> [1] 1.609358

Lms2.pvalue

This function returns the p-value of the logarithmic minimum independence test in the two-way contingency table.

Lms2.pvalue(Lms2.stat(table1), 2, 2, 40, B = 1e3)
#> [1] 0.01498501
Lms2.pvalue(Lms2.stat(table2), 2, 3, 60)
#> [1] 0.7039296

Lms2.test

This function returns the test statistic and p-value of the logarithmic minimum independence test in the two-way contingency table.

Lms2.test(table1, B = 1e3)
#> 
#>  Modular test for independence in two-way contingency table
#> 
#> data:  table1
#> D = 1.6607, p-value = 0.0214
Lms2.test(table2)
#> 
#>  Modular test for independence in two-way contingency table
#> 
#> data:  table2
#> D = 0.61437, p-value = 0.7063

GenTab3

This function generating the three-way contingency table with the Monte Carlo method

GenTab3(array(1/12, dim=c(2,2,3)), 60)
#> , , 1
#> 
#>      [,1] [,2]
#> [1,]    9    6
#> [2,]    6    6
#> 
#> , , 2
#> 
#>      [,1] [,2]
#> [1,]    4    2
#> [2,]    6    5
#> 
#> , , 3
#> 
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    3    6
GenTab3(array(1/18, dim=c(2,3,3)), 80)
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    6    4    5
#> [2,]    4    5    6
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    3    6    4
#> [2,]    6    6    1
#> 
#> , , 3
#> 
#>      [,1] [,2] [,3]
#> [1,]    5    2    4
#> [2,]    4    5    4

Mod3.stat

This function returns the statistic value of the modular independence test in the three-way contingency table.

Mod3.stat(table3)
#> [1] 2.641208
Mod3.stat(GenTab3(array(1/12, dim=c(2,2,3)), 120))
#> [1] 2.522944

Mod3.cv

This function returns the critical value of the modular independence test in the three-way contingency table.

Mod3.cv(2, 2, 2, 80, 0.05, B = 1e3) 
#> [1] 2.483619
Mod3.cv(2, 2, 2, 80, 0.1) 
#> [1] 2.15164

Mod3.pvalue

This function returns the p-value of the modular independence test in the three-way contingency table.

Mod3.pvalue(Mod3.stat(table4), 2, 2, 2, 80, B = 1e3)
#> [1] 0.02597403
Mod3.pvalue(Mod3.stat(table4), 2, 2, 2, 80)
#> [1] 0.02649735

Mod3.test

This function returns the test statistic and p-value of the modular independence test in the three-way contingency table.

Mod3.test(table4, B = 1e2)
#> 
#>  Modular test for independence in three-way contingency table
#> 
#> data:  table4
#> D = 2.5886, p-value = 0.0277
Mod3.test(table4, B = 1e3)
#> 
#>  Modular test for independence in three-way contingency table
#> 
#> data:  table4
#> D = 2.5886, p-value = 0.024

Lms3.stat

This function returns the statistic value of the logarithmic minimum independence test in the three-way contingency table.

Lms3.stat(table3)
#> [1] 2.712789
Lms3.stat(GenTab3(array(1/12, dim=c(2,2,3)), 120))
#> [1] 2.707617

Lms3.cv

This function returns the critical value of the logarithmic minimum independence test in the three-way contingency table.

Lms3.cv(2, 2, 2, 80, 0.05, B = 1e2) 
#> [1] 2.595609
Lms3.cv(2, 2, 2, 80, 0.1, B = 1e3) 
#> [1] 2.257343

Lms3.pvalue

This function returns the p-value of the logarithmic minimum independence test in the three-way contingency table.

Lms3.pvalue(Lms3.stat(table4), 2, 2, 2, 80, B = 1e3)
#> [1] 0.04595405
Lms3.pvalue(Lms3.stat(table4), 2, 2, 2, 80)
#> [1] 0.03709629

Lms3.test

This function returns the test statistic and p-value of the logarithmic minimum independence test in the three-way contingency table.

Lms3.test(table4, B = 1e2)
#> 
#>  Logarithmic minimum test for independence in three-way contingency
#>  table
#> 
#> data:  table4
#> D = 2.6637, p-value = 0.0378
Lms3.test(table4, B = 1e3)
#> 
#>  Logarithmic minimum test for independence in three-way contingency
#>  table
#> 
#> data:  table4
#> D = 2.6637, p-value = 0.0409

GenTab4

This function generating the four-way contingency table with the Monte Carlo method.

GenTab4(array(1/16, dim=c(2,2,2,2)), 100)
#> , , 1, 1
#> 
#>      [,1] [,2]
#> [1,]    8    5
#> [2,]    6    7
#> 
#> , , 2, 1
#> 
#>      [,1] [,2]
#> [1,]    6   12
#> [2,]    6    5
#> 
#> , , 1, 2
#> 
#>      [,1] [,2]
#> [1,]    5    8
#> [2,]    5    6
#> 
#> , , 2, 2
#> 
#>      [,1] [,2]
#> [1,]    3    8
#> [2,]    4    6
GenTab4(array(1/36, dim=c(2,3,2,3)), 150)
#> , , 1, 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    8    9    2
#> [2,]    3    2    9
#> 
#> , , 2, 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    3    5    7
#> [2,]    4    3    4
#> 
#> , , 1, 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    2    3    6
#> [2,]    2    4    1
#> 
#> , , 2, 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    5    7    4
#> [2,]    4    4    1
#> 
#> , , 1, 3
#> 
#>      [,1] [,2] [,3]
#> [1,]    2    4    4
#> [2,]    3    6    7
#> 
#> , , 2, 3
#> 
#>      [,1] [,2] [,3]
#> [1,]    3    3    3
#> [2,]    4    6    3

Mod4.stat

This function returns the statistic value of the modular independence test in the four-way contingency table.

Mod4.stat(table5)
#> [1] 46.70092
Mod4.stat(table6)
#> [1] 3.7301

Mod4.cv

This function returns the critical value of the modular independence test in the four-way contingency table.

Mod4.cv(2, 2, 2, 2, 160, 0.05, B = 1e2)
#> [1] 4.376112
Mod4.cv(2, 2, 2, 2, 160, 0.1, B = 1e3)
#> [1] 4.398633

Mod4.pvalue

This function returns the p-value of the modular independence test in the four-way contingency table.

Mod4.pvalue(Mod4.stat(table6), 2, 2, 2, 2, 160, B = 1e2)
#> [1] 0.3267327
Mod4.pvalue(Mod4.stat(table6), 2, 2, 2, 2, 160, B = 1e3)
#> [1] 0.3046953

Mod4.test

This function returns the test statistic and p-value of the modular independence test in the -way contingency table.

Mod4.test(table6, B = 1e2)
#> 
#>  Modular test for independence in four-way contingency table
#> 
#> data:  table6
#> D = 3.7301, p-value = 0.3169
Mod4.test(table6, B = 1e3)
#> 
#>  Modular test for independence in four-way contingency table
#> 
#> data:  table6
#> D = 3.7301, p-value = 0.3096

Lms4.stat

This function returns the statistic value of the logarithmic minimum independence test in the four-way contingency table.

Lms4.stat(table5)
#> [1] 116.0267
Lms4.stat(table6)
#> [1] 4.097161

Lms4.cv

This function returns the critical value of the logarithmic minimum independence test in the four-way contingency table.

Lms4.cv(2, 2, 2, 2, 160, 0.05, B = 1e2)
#> [1] 4.812597
Lms4.cv(2, 2, 2, 2, 160, 0.1, B = 1e3)
#> [1] 4.654836

Lms4.pvalue

This function returns the p-value of the logarithmic minimum independence test in the four-way contingency table.

Lms4.pvalue(Lms4.stat(table6), 2, 2, 2, 2, 160, B = 1e2)
#> [1] 0.2871287
Lms4.pvalue(Lms4.stat(table6), 2, 2, 2, 2, 160, B = 1e3)
#> [1] 0.2297702

Lms4.test

This function returns the test statistic and p-value of the logarithmic minimum independence test in the four-way contingency table.

Lms4.test(table6, B = 1e2)
#> 
#>  Logarithmic minimum test for independence in four-way contingency table
#> 
#> data:  table6
#> D = 4.0972, p-value = 0.2513
Lms4.test(table6, B = 1e3)
#> 
#>  Logarithmic minimum test for independence in four-way contingency table
#> 
#> data:  table6
#> D = 4.0972, p-value = 0.2541