Title: | An Alternative to the Kruskal-Wallis Based on the Kendall Tau Distance |
---|---|
Description: | The Concordance Test is a non-parametric method for testing whether two or more samples originate from the same distribution. It extends the Kendall Tau correlation coefficient when there are only two groups. For details, see Alcaraz J., Anton-Sanchez L., Monge J.F. (2022) The Concordance Test, an Alternative to Kruskal-Wallis Based on the Kendall-tau Distance: An R Package. The R Journal 14, 26–53 <doi:10.32614/RJ-2022-039>. |
Authors: | Javier Alcaraz [aut], Laura Anton-Sanchez [aut, cre], Juan Francisco Monge [aut] |
Maintainer: | Laura Anton-Sanchez <[email protected]> |
License: | GPL-3 |
Version: | 1.0.3 |
Built: | 2024-12-11 06:43:52 UTC |
Source: | CRAN |
This function computes the Concordance coefficient and the Kruskal-Wallis statistic.
CT_Coefficient(Sample_List, H = 0)
Sample_List |
List of numeric data vectors with the elements of each sample. |
H |
0 by default. If set to 1, the Kruskal-Wallis statistic is also calculated and returned. |
The function returns a list with the following elements:
Sample_Sizes
: Numeric vector of sample sizes.
order_elements
: Numeric vector containing the order of the elements.
disorder
: Disorder of the permutation given by order_elements.
Concordance_Coefficient
: 1 - relative disorder of the permutation given by order_elements.
H_Statistic
: Kruskal-Wallis statistic (only if H = 1).
## Example
A <- c(12,13,15,20,23,28,30,32,40,48)
B <- c(29,31,49,52,54)
C <- c(24,26,44)
Sample_List <- list(A, B, C)
CT_Coefficient(Sample_List)
CT_Coefficient(Sample_List, H = 1)

## Example with ties
A <- c(12,13,15,20,24,29,30,32,40,49)
B <- c(29,31,49,52,54)
C <- c(24,26,44)
Sample_List <- list(A, B, C)
CT_Coefficient(Sample_List, H = 1)
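As a cross-check on the H statistic, the classical Kruskal-Wallis formula can be evaluated by hand in base R. The sketch below is not the package's implementation: `kw_statistic` is a hypothetical helper, it applies no tie correction, and it is compared against `stats::kruskal.test` only on the example without ties.

```r
# Hypothetical helper: Kruskal-Wallis H from pooled ranks (no tie correction).
kw_statistic <- function(samples) {
  values <- unlist(samples)
  groups <- rep(seq_along(samples), lengths(samples))
  sizes <- lengths(samples)
  n <- length(values)
  rank_sums <- tapply(rank(values), groups, sum)  # rank sum per sample
  12 / (n * (n + 1)) * sum(rank_sums^2 / sizes) - 3 * (n + 1)
}

A <- c(12,13,15,20,23,28,30,32,40,48)
B <- c(29,31,49,52,54)
C <- c(24,26,44)
Sample_List <- list(A, B, C)

h_manual <- kw_statistic(Sample_List)
h_base <- unname(kruskal.test(Sample_List)$statistic)
all.equal(h_manual, h_base)
```

With tied observations `kruskal.test` applies a tie correction, so the two values need not agree on the second example.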
This function computes the critical values and the p-values of the Concordance and Kruskal-Wallis tests for significance levels of .10, .05 and .01. Critical values and p-values can be obtained exactly or by simulation (default option).
CT_Critical_Values(Sample_Sizes, Num_Sim = 10000, H = 0, verbose = TRUE)
Sample_Sizes |
Numeric vector ( |
Num_Sim |
Number of simulations in order to obtain the probability distribution of the statistics. The default is 10000. If set to 0, the critical values and the p-values are obtained exactly. Otherwise they are obtained by simulation. |
H |
0 by default. If set to 1, the critical values and the p-values of the Kruskal-Wallis test are also calculated and returned. |
verbose |
A logical indicating if some "progress report" of the simulations should be given. The default is TRUE. |
The function returns a list with the following elements:
C_results
: Concordance coefficient results. Critical values and p-values for significance levels of .10, .05 and .01.
H_results
: Kruskal-Wallis results. Critical values and p-values for significance levels of .10, .05 and .01 (only if H = 1).
The computational time in exact calculations increases exponentially with the number of elements and with the number of sets.
Sample_Sizes <- c(3,3,3)
CT_Critical_Values(Sample_Sizes, Num_Sim = 0, H = 1)
CT_Critical_Values(Sample_Sizes, Num_Sim = 1000, H = 1)
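To illustrate what obtaining critical values by simulation can look like, here is a minimal base R sketch for the Kruskal-Wallis statistic only. It assumes that, under the null hypothesis, every assignment of ranks to samples is equally likely; `simulate_kw` is a hypothetical helper, and the package's own simulation scheme and critical-value convention may differ.

```r
set.seed(1)  # reproducible simulation

# Hypothetical helper: simulate H under the null by shuffling the ranks 1..n.
simulate_kw <- function(sizes, num_sim = 1000) {
  n <- sum(sizes)
  groups <- rep(seq_along(sizes), sizes)
  replicate(num_sim, {
    rank_sums <- tapply(sample(n), groups, sum)
    12 / (n * (n + 1)) * sum(rank_sums^2 / sizes) - 3 * (n + 1)
  })
}

h_sim <- simulate_kw(c(3, 3, 3), num_sim = 1000)

# Empirical upper quantiles as approximate critical values
# for significance levels .10, .05 and .01.
critical_values <- quantile(h_sim, probs = c(0.90, 0.95, 0.99))
critical_values
```

`quantile` interpolates between order statistics by default, so the values it reports need not be attainable values of the discrete statistic.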
This function performs the graphical visualization of the density distribution of the Concordance coefficient and the Kruskal-Wallis statistic.
CT_Density_Plot(C_freq = NULL, H_freq = NULL)
C_freq |
Probability distribution of the Concordance coefficient obtained with the function |
H_freq |
Probability distribution of the Kruskal-Wallis statistic obtained with the function |
Sample_Sizes <- c(5,5,5)
Distributions <- CT_Distribution(Sample_Sizes, Num_Sim = 1000, H = 1)
C_freq <- Distributions$C_freq
H_freq <- Distributions$H_freq
CT_Density_Plot(C_freq, H_freq)
This function computes the probability distribution tables of the Concordance coefficient and Kruskal-Wallis statistic. Probability distribution tables can be obtained exactly or by simulation (default option).
CT_Distribution(Sample_Sizes, Num_Sim = 10000, H = 0, verbose = TRUE)
Sample_Sizes |
Numeric vector ( |
Num_Sim |
Number of simulations in order to obtain the probability distribution of the statistics. The default is 10000. If set to 0, the probability distribution tables are obtained exactly. Otherwise they are obtained by simulation. |
H |
0 by default. If set to 1, the probability distribution table of the Kruskal-Wallis statistic is also calculated and returned. |
verbose |
A logical indicating if some "progress report" of the simulations should be given. The default is TRUE. |
The function returns a list with the following elements:
C_freq
: Matrix with the probability distribution of the Concordance coefficient. Each row in the matrix contains the disorder, the value of the coefficient, the frequency and its probability.
H_freq
: Matrix with the probability distribution of the Kruskal-Wallis statistic. Each row in the matrix contains the value of the statistic, the frequency and its probability (only if H = 1).
The computational time in exact calculations increases exponentially with the number of elements and with the number of sets.
Sample_Sizes <- c(5,4)
CT_Distribution(Sample_Sizes, Num_Sim = 0)
CT_Distribution(Sample_Sizes, Num_Sim = 0, H = 1)
CT_Distribution(Sample_Sizes, Num_Sim = 1000)
CT_Distribution(Sample_Sizes, Num_Sim = 1000, H = 1)
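For two samples, an exact table of the Kruskal-Wallis statistic is small enough to build by direct enumeration in base R. The sketch below assumes no ties and equally likely rank assignments. The column names `H`, `Frequency` and `Probability` are illustrative and are not taken from the package's `H_freq` output.

```r
sizes <- c(5, 4)
n <- sum(sizes)

# Each column holds the ranks assigned to the first sample: choose(9, 5) cases.
first_group <- combn(n, sizes[1])

h_values <- apply(first_group, 2, function(idx) {
  rank_sums <- c(sum(idx), n * (n + 1) / 2 - sum(idx))
  12 / (n * (n + 1)) * sum(rank_sums^2 / sizes) - 3 * (n + 1)
})

# Tabulate distinct values (rounded to absorb floating-point noise).
freq <- table(round(h_values, 8))
h_table <- cbind(H = as.numeric(names(freq)),
                 Frequency = as.vector(freq),
                 Probability = as.vector(freq) / length(h_values))
h_table
```

The number of cases grows quickly with the sample sizes and the number of samples, which is consistent with the note above on exact calculations.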
This function performs the hypothesis test for testing whether samples originate from the same distribution.
CT_Hypothesis_Test(Sample_List, Num_Sim = 10000, H = 0, verbose = TRUE)
Sample_List |
List of numeric data vectors with the elements of each sample. |
Num_Sim |
Number of simulations used. The default is 10000. |
H |
0 by default. If set to 1, the Kruskal-Wallis test is also performed and returned. |
verbose |
A logical indicating if some "progress report" of the simulations should be given. The default is TRUE. |
The function returns a list with the following elements:
results
: Table with the statistics and the significance levels.
C_p-value
: Concordance test significance level (p-value).
H_p-value
: Kruskal-Wallis test significance level (p-value) (only if H = 1).
Myles Hollander and Douglas A. Wolfe (1973), Nonparametric Statistical Methods. New York: John Wiley & Sons. Pages 115-120.
## Hollander & Wolfe (1973), 116.
## Mucociliary efficiency from the rate of removal of dust in normal
## subjects, subjects with obstructive airway disease, and subjects
## with asbestosis.
x <- c(2.9, 3.0, 2.5, 2.6, 3.2) # normal subjects
y <- c(3.8, 2.7, 4.0, 2.4) # with obstructive airway disease
z <- c(2.8, 3.4, 3.7, 2.2, 2.0) # with asbestosis
Sample_List <- list(x, y, z)
CT_Hypothesis_Test(Sample_List, Num_Sim = 1000, H = 1)

## Example
A <- c(12,13,15,20,23,28,30,32,40,48)
B <- c(29,31,49,52,54)
C <- c(24,26,44)
Sample_List <- list(A, B, C)
CT_Hypothesis_Test(Sample_List, Num_Sim = 1000, H = 1)

## Example with ties
A <- c(12,13,15,20,24,29,30,32,40,49)
B <- c(29,31,49,52,54)
C <- c(24,26,44)
Sample_List <- list(A, B, C)
CT_Hypothesis_Test(Sample_List, Num_Sim = 1000, H = 1)
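For comparison, base R's `stats::kruskal.test` can be run on the same Hollander & Wolfe data. It uses the chi-squared approximation for the p-value rather than simulation, so its p-value will generally differ somewhat from a simulated one.

```r
x <- c(2.9, 3.0, 2.5, 2.6, 3.2) # normal subjects
y <- c(3.8, 2.7, 4.0, 2.4)      # with obstructive airway disease
z <- c(2.8, 3.4, 3.7, 2.2, 2.0) # with asbestosis

kw <- kruskal.test(list(x, y, z))
kw$statistic  # Kruskal-Wallis chi-squared, about 0.771
kw$p.value    # asymptotic p-value, about 0.68
```

The pooled rank sums are 36, 36 and 33, which gives H = 12/(14·15) · (36²/5 + 36²/4 + 33²/5) − 45 ≈ 0.771.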
This function performs the graphical visualization of the probability distribution of the Concordance coefficient and the Kruskal-Wallis statistic.
CT_Probability_Plot(C_freq = NULL, H_freq = NULL)
C_freq |
Probability distribution of the Concordance coefficient obtained with the function |
H_freq |
Probability distribution of the Kruskal-Wallis statistic obtained with the function |
Sample_Sizes <- c(5,5,5)
Distributions <- CT_Distribution(Sample_Sizes, Num_Sim = 1000, H = 1)
C_freq <- Distributions$C_freq
H_freq <- Distributions$H_freq
CT_Probability_Plot(C_freq)
CT_Probability_Plot(C_freq, H_freq)
This function computes the solution of the Linear Ordering Problem.
LOP(mat_LOP)
mat_LOP |
Preference matrix defining the Linear Ordering Problem. A numeric square matrix for which we want to obtain the permutation of rows/columns that maximizes the sum of the elements above the main diagonal. |
The function returns a list with the following elements:
obj_val
: Optimal value of the solution of the Linear Ordering Problem, i.e., the sum of the elements above the main diagonal under the permutation rows/cols solution.
permutation
: Solution of the Linear Ordering Problem, i.e., the rows/cols permutation.
permutation_matrix
: Optimal permutation matrix of the Linear Ordering Problem.
Martí, R. and Reinelt, G. The Linear Ordering Problem: Exact and Heuristic Methods in Combinatorial Optimization. Springer, first edition 2011.
## Square matrix
##
## | 1 2 2 |
## | 2 3 3 |
## | 3 2 2 |
##
## The optimal permutation of rows/cols is (2,3,1),
## and the solution of the Linear Ordering Problem is 8.
## The permutation matrix of the solution is
## | 0 0 0 |
## | 1 0 1 |
## | 1 0 0 |
mat_LOP <- matrix(c(1,2,3,2,3,2,2,3,2), nrow=3)
LOP(mat_LOP)
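For very small matrices the Linear Ordering Problem can be checked by brute force. The sketch below is not how `LOP` solves the problem (the source does not say). It tries every permutation and keeps the one with the largest sum above the main diagonal. `all_permutations` and `lop_brute_force` are hypothetical helpers, and the returned element names merely mirror `obj_val` and `permutation` from the description above.

```r
# All permutations of a vector, one per row (n! rows: small n only).
all_permutations <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) {
    cbind(v[i], all_permutations(v[-i]))
  }))
}

# Exhaustive search over row/column orderings of a square matrix (n >= 2).
lop_brute_force <- function(m) {
  perms <- all_permutations(seq_len(nrow(m)))
  scores <- apply(perms, 1, function(p) {
    reordered <- m[p, p]                 # same permutation on rows and columns
    sum(reordered[upper.tri(reordered)]) # sum above the main diagonal
  })
  best <- which.max(scores)
  list(obj_val = scores[best], permutation = perms[best, ])
}

mat_LOP <- matrix(c(1,2,3,2,3,2,2,3,2), nrow = 3)
result <- lop_brute_force(mat_LOP)
result$obj_val      # 8
result$permutation  # 2 3 1
```

The six orderings of this matrix score 7, 6, 7, 8, 7 and 7, so (2,3,1) is the unique optimum.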
This function enumerates the possible combinations of n elements where the first element is repeated n1 times, the second element is repeated n2 times, the third n3 times, ...
Permutations_With_Repetition(Sample_Sizes)
Sample_Sizes |
Numeric vector ( |
Returns a matrix where each row contains a permutation.
The number of permutations and the computational time increase exponentially with the number of elements and with the number of sets.
Sample_Sizes <- c(2,2,2)
Permutations_With_Repetition(Sample_Sizes)
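The number of rows to expect is the multinomial coefficient n! / (n1! n2! ...), which can be checked before running a large enumeration. The base R sketch below counts the arrangements and enumerates them recursively. `count_arrangements` and `multiset_permutations` are hypothetical helpers, not package functions, and the row order of the package's output may differ.

```r
# Number of distinct arrangements: n! / (n1! * n2! * ...).
count_arrangements <- function(sizes) {
  factorial(sum(sizes)) / prod(factorial(sizes))
}

# Recursive enumeration: pick the label for the first position,
# then arrange the remaining labels.
multiset_permutations <- function(sizes) {
  if (sum(sizes) == 1) return(matrix(which(sizes == 1), nrow = 1))
  rows <- list()
  for (g in seq_along(sizes)) {
    if (sizes[g] > 0) {
      rest <- sizes
      rest[g] <- rest[g] - 1
      rows[[length(rows) + 1]] <- cbind(g, multiset_permutations(rest),
                                        deparse.level = 0)
    }
  }
  do.call(rbind, rows)
}

Sample_Sizes <- c(2, 2, 2)
count_arrangements(Sample_Sizes)  # 6! / (2! 2! 2!) = 90
perms <- multiset_permutations(Sample_Sizes)
dim(perms)                        # 90 rows, 6 columns
```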