| Title: | Automated Functions for Basic Statistical Tests |
|---|---|
| Description: | Provides simple and intuitive functions for basic statistical analyses. Methods include the t-test (Student 1908 <doi:10.1093/biomet/6.1.1>), the Mann-Whitney U test (Mann and Whitney 1947 <doi:10.1214/aoms/1177730491>), Pearson's correlation (Pearson 1895 <doi:10.1098/rspl.1895.0041>), and analysis of variance (Fisher 1925, <doi:10.1007/978-1-4612-4380-9_5>). Functions are compatible with 'ggplot2' and 'dplyr'. |
| Authors: | Luiz Garcia [aut, cre] (ORCID: <https://orcid.org/0000-0002-9616-0927>) |
| Maintainer: | Luiz Garcia <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.2.15 |
| Built: | 2026-06-03 07:16:38 UTC |
| Source: | https://github.com/cran/autotestR |
autotestR is designed to simplify the execution of the main statistical tests commonly used in the life sciences. It provides user-friendly functions that automatically generate plots and clear explanations, making statistical analysis more accessible for researchers and students.
t test (independent and paired)
Mann–Whitney test (Wilcoxon rank-sum)
Multiple group comparison (t test or Mann–Whitney)
Chi-squared test and Fisher’s exact test
One-way ANOVA with Tukey HSD post hoc test
Kruskal–Wallis test with Dunn post hoc test
Pearson, Spearman, and Kendall correlation tests with automatic plots
Diagnostic function that suggests the most appropriate statistical test
Intuitive plots fully integrated into the functions
library(autotestR)
group1 <- rnorm(30, 10, 2)
group2 <- rnorm(30, 12, 2)
test.t(group1, group2)

var1 <- sample(c("A", "B"), 100, replace = TRUE)
var2 <- sample(c("Yes", "No"), 100, replace = TRUE)
test.chi(var1, var2)

df <- data.frame(
  control = rnorm(30, 10),
  treatment = rnorm(30, 12),
  test1 = rnorm(30, 11),
  test2 = rnorm(30, 15)
)
test.tmulti(df)

g1 <- rnorm(20, 5)
g2 <- rnorm(20, 7)
g3 <- rnorm(20, 6)
test.anova(g1, g2, g3)

x <- rnorm(30)
y <- x + rnorm(30, 0, 1)
test.correlation(x, y)
Automatically identifies whether the input data are numeric or categorical and suggests the most appropriate statistical test.
pre.test(..., alpha = 0.05, help = FALSE, verbose = TRUE)
| Argument | Description |
|---|---|
| ... | Two or more vectors (numeric or categorical), or a data.frame with >= 2 columns. |
| alpha | Significance level. Default = 0.05. |
| help | Logical. If TRUE, shows detailed help. |
| verbose | Logical. If TRUE, prints informative messages. |
Invisible list with normality results, homogeneity or contingency table, and test recommendation
Performs ANOVA (and Tukey HSD) if data meet normality and homogeneity assumptions. Otherwise, automatically recommends Kruskal-Wallis/Dunn.
test.anova( ..., title = "ANOVA/Tukey HSD", xlab = "Group", ylab = "Value", style = c("boxplot", "violin", "mono", "halfeye"), help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| ... | Vectors or a data.frame with >= 2 columns. |
| title | Plot title. |
| xlab | X-axis label. |
| ylab | Y-axis label. |
| style | Aesthetic style of the generated plot. |
| help | If TRUE, shows help. |
| verbose | If TRUE, shows detailed messages. |
An aov object or a recommendation message.
df <- data.frame(
  control = rnorm(30, 10),
  treatment = rnorm(30, 12),
  test = rnorm(30, 11)
)
test.anova(df)
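The decision flow described for test.anova can be approximated in base R. The sketch below is not the package's implementation: Bartlett's test is an assumed stand-in for the homogeneity check, and the names `long`, `normal_ok`, and `homog_ok` are illustrative only.

```r
# Sketch only: base-R approximation of the ANOVA / Kruskal-Wallis decision flow.
# Bartlett's test is an ASSUMED homogeneity check, not confirmed by the package.
set.seed(123)
df <- data.frame(
  control   = rnorm(30, 10),
  treatment = rnorm(30, 12),
  test      = rnorm(30, 11)
)

# Reshape wide columns into long format: one value column, one group factor.
long <- stack(df)
names(long) <- c("value", "group")

alpha <- 0.05

# Normality within each group (Shapiro-Wilk) and homogeneity across groups.
normal_ok <- all(tapply(long$value, long$group,
                        function(v) shapiro.test(v)$p.value > alpha))
homog_ok  <- bartlett.test(value ~ group, data = long)$p.value > alpha

if (normal_ok && homog_ok) {
  # Parametric path: one-way ANOVA followed by Tukey HSD.
  fit    <- aov(value ~ group, data = long)
  result <- list(test = "ANOVA", fit = fit, posthoc = TukeyHSD(fit))
} else {
  # Non-parametric path: Kruskal-Wallis rank-sum test.
  result <- list(test = "Kruskal-Wallis",
                 fit  = kruskal.test(value ~ group, data = long))
}

print(result$test)
```

Which branch runs depends on the simulated data, so the sketch reports the chosen test rather than assuming one.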
Applies the Pearson chi-square test or Fisher's exact test to assess association between two categorical variables.
test.chi( x, y = NULL, title = "Chi-square Test", xlab = NULL, ylab = "Proportion", style = c("stacked", "barplot", "mosaic", "pie"), show_table = TRUE, help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| x | Categorical vector or data frame with two columns (group 1 and group 2). |
| y | Categorical vector (group 2). Required if x is a vector. |
| title | Plot title (string). Default: "Chi-square Test". |
| xlab | X-axis label in the plot (string). Default: NULL (uses variable name). |
| ylab | Y-axis label in the plot (string). Default: "Proportion". |
| style | Plot style generated by the function. |
| show_table | Logical. If TRUE, prints the contingency table to the console. Default: TRUE. |
| help | Logical. If TRUE, displays a detailed explanation of the function. Default: FALSE. |
| verbose | Logical. If TRUE, prints messages about the test and expected frequencies. Default: TRUE. |
Test result and contingency table.
data <- data.frame(
  control = c(rep("healthy", 50), rep("sick", 150)),
  treatment = c(rep("healthy", 100), rep("sick", 100))
)
test.chi(data)
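To illustrate how a choice between the chi-square test and Fisher's exact test might be made from expected frequencies, here is a base-R sketch. The "any expected count below 5" rule is a common convention assumed here; it is not confirmed as the criterion test.chi uses, and the vectors `group` and `status` are illustrative.

```r
# Sketch only: choose between chi-square and Fisher's exact test.
# The "expected count < 5" threshold is an ASSUMED convention.
group  <- rep(c("control", "treatment"), each = 200)
status <- c(rep("healthy", 50),  rep("sick", 150),
            rep("healthy", 100), rep("sick", 100))

# 2 x 2 contingency table of group by status.
tab <- table(group, status)

# Expected frequencies under independence: row total * column total / n.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Fall back to Fisher's exact test when any expected count is small.
res <- if (any(expected < 5)) fisher.test(tab) else chisq.test(tab)

print(tab)
print(res$method)
```

With these counts every expected frequency is at least 75, so the sketch takes the chi-square branch.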
Performs correlation analysis between two numeric variables using Pearson,
Spearman, or Kendall methods. When method = "auto", the function
automatically selects the most appropriate method based on normality,
presence of ties, and proportion of outliers.
test.correlation( x, y = NULL, method = "auto", main = NULL, xlab = NULL, ylab = NULL, style = c("simple", "inference", "structure", "density", "distribution"), plot_normality = FALSE, help = FALSE, verbose = TRUE )
## S3 method for class 'test.correlation'
print(x, ...)
| Argument | Description |
|---|---|
| x | Numeric vector or data frame with exactly two numeric columns. |
| y | Numeric vector (optional if … |
| method | Correlation method: … |
| main | Plot title. If … |
| xlab | Label for the x-axis. If … |
| ylab | Label for the y-axis. If … |
| style | Plot style: "simple", "inference", "structure", "density", or "distribution". |
| plot_normality | Logical. If … |
| help | Logical. If … |
| verbose | Logical. If … |
| ... | Additional arguments passed to other print methods (currently ignored). |
In addition to hypothesis testing, the function provides diagnostic information, effect size interpretation, and publication-ready visualizations.
Method selection in automatic mode follows these rules:
If more than 5% … Kendall's tau is used.
If both variables are normally distributed and no ties are present, Pearson's correlation is used.
Otherwise, Spearman's rank correlation is applied.
Normality is assessed using Shapiro-Wilk (n <= 50), Anderson-Darling (50 < n <= 300), or Kolmogorov-Smirnov (n > 300) tests.
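The sample-size thresholds for the normality test can be written as a small lookup. This is a sketch only: `choose_normality_test()` is a hypothetical helper introduced for illustration, not a function exported by the package.

```r
# Sketch only: map sample size to the normality test named in the text.
# choose_normality_test() is a HYPOTHETICAL helper, not part of autotestR.
choose_normality_test <- function(n) {
  if (n <= 50) {
    "Shapiro-Wilk"
  } else if (n <= 300) {
    "Anderson-Darling"
  } else {
    "Kolmogorov-Smirnov"
  }
}

# Check the boundaries of each range.
sizes  <- c(30, 50, 51, 300, 301)
chosen <- vapply(sizes, choose_normality_test, character(1))
print(data.frame(n = sizes, test = chosen))
```

Note that base R provides `shapiro.test()` and `ks.test()`, while an Anderson-Darling test requires an add-on package.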
An object of class "test.correlation" containing:
call: Matched function call.
data: Input data (if n <= 5000).
n: Sample size.
method: Selected correlation method.
estimate: Correlation coefficient.
effect: Effect size information (r, rho, or tau, and r² when applicable).
conf.int: Confidence interval.
p.value: P-value of the test.
interpretation: Qualitative interpretation of effect size.
diagnostics: Normality, ties, and outlier diagnostics.
decision: Automatic selection rationale.
htest: Underlying stats::cor.test object.
plot: ggplot2 object.
# Pearson (approximately normal data)
set.seed(123)
x <- rnorm(50, sd = 0.1)
y <- x + rnorm(50, sd = 0.1)
test.correlation(x, y)

# Spearman (non-normal data)
set.seed(123)
x <- runif(300)
y <- log(x + 0.1) + rnorm(300, sd = 0.5)
test.correlation(x, y)

# Kendall (ties and outliers)
set.seed(123)
x <- runif(1000, 1, 100)
y <- sin(x) * 30 + rnorm(1000, 0, 10)
x[sample(1:500, 50)] <- 50
y[sample(1:500, 50)] <- 0
x_out <- runif(100, 10, 20)
y_out <- runif(100, 80, 120)
x <- c(x, x_out)
y <- c(y, y_out)
test.correlation(x, y)
Performs Fisher's Exact Test using two categorical vectors or a data frame with two columns, constructing a contingency table and optionally generating graphical visualizations.
test.fisher( x, y = NULL, title = "Fisher's Exact Test", xlab = NULL, ylab = "Proportion", style = c("stacked", "barplot", "mosaic", "pie"), show_table = TRUE, help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| x | Categorical vector or data frame with two columns. |
| y | Categorical vector (required if x is a vector). |
| title | Plot title (string). Default: "Fisher's Exact Test". |
| xlab | X-axis label in the plot (string). Default: NULL (uses variable name). |
| ylab | Y-axis label in the plot (string). Default: "Proportion". |
| style | Plot style. Controls the visualization type. |
| show_table | Logical. If TRUE, prints the contingency table to the console. Default: TRUE. |
| help | Logical. If TRUE, shows a detailed explanation of the function. Default: FALSE. |
| verbose | Logical. If TRUE, prints detailed test messages. Default: TRUE. |
Invisible object containing the Fisher test result.
data <- data.frame(
  control = c("healthy", "healthy", "sick", "sick", "sick"),
  treatment = c("healthy", "healthy", "healthy", "healthy", "sick")
)
test.fisher(data)
Fits a linear model of the form y ~ x * by to evaluate whether the
association between a continuous predictor and an outcome differs across
groups. Optionally produces a publication-ready visualization of
group-specific regression lines.
test.interaction( x, y, by, title = NULL, xlab = NULL, ylab = NULL, plot = TRUE, style = c("clean", "CI", "facet"), conf.level = 0.95, verbose = TRUE, help = FALSE )
## S3 method for class 'test.interaction'
print(x, ...)
| Argument | Description |
|---|---|
| x | Numeric vector representing the continuous predictor. |
| y | Numeric vector representing the continuous outcome. |
| by | Grouping variable defining the interaction. Must be coercible to a factor with at least two levels. |
| title | Optional title for the plot. |
| xlab | Optional x-axis label. |
| ylab | Optional y-axis label. |
| plot | Logical. Should a plot be generated? |
| style | Plot style. One of "clean", "CI", or "facet". |
| conf.level | Confidence level for the interaction interval (default: 0.95). |
| verbose | Logical. If TRUE, prints detailed messages. Default: TRUE. |
| help | Logical. If TRUE, shows a detailed explanation of the function. Default: FALSE. |
| ... | Additional arguments passed to other print methods (currently ignored). |
The interaction coefficient represents the difference in
regression slopes between groups, conditional on the reference level
of by. The sign and magnitude of this coefficient depend on the
chosen reference group.
Confidence intervals are emphasized as the primary inferential quantity.
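The dependence of the interaction coefficient on the reference level can be seen directly with `stats::lm()`. The sketch below fits the same `y ~ x * by` model twice with different reference groups; it illustrates the underlying model only and makes no claim about how test.interaction is implemented internally.

```r
# Sketch only: interaction model fitted with base R, not via autotestR.
set.seed(123)
n        <- 60
marker   <- rnorm(n, 10, 2)
group    <- factor(rep(c("Control", "Treatment"), each = n / 2))
response <- 2 + ifelse(group == "Control", 0.5, 1.2) * marker + rnorm(n, 0, 1)

# Reference level is "Control" (first factor level, alphabetical order).
fit  <- lm(response ~ marker * group)
term <- "marker:groupTreatment"
est  <- coef(fit)[term]                          # slope(Treatment) - slope(Control)
ci   <- confint(fit, parm = term, level = 0.95)  # 95% interval for that difference

# Refit with "Treatment" as the reference level: same model, opposite sign.
group2 <- relevel(group, ref = "Treatment")
fit2   <- lm(response ~ marker * group2)
est2   <- coef(fit2)["marker:group2Control"]     # slope(Control) - slope(Treatment)

print(ci)
print(c(reference_control = unname(est), reference_treatment = unname(est2)))
```

The two estimates have equal magnitude and opposite sign, which is why the reference group should be reported alongside the interval.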
An object of class "test.interaction" containing:
model: the fitted linear model,
interaction: estimated interaction effects with confidence intervals,
plot: a ggplot object (if plot = TRUE).
# Simple example: different trends between groups
set.seed(123)
n <- 60
marker <- rnorm(n, 10, 2)
group <- rep(c("Control", "Treatment"), each = n/2)
# Same intercept, different slopes
response <- 2 + ifelse(group == "Control", 0.5, 1.2) * marker + rnorm(n, 0, 1)
test.interaction(marker, response, group)
Performs the Kruskal-Wallis rank-sum test for comparing three or more independent groups, followed by Dunn's post-hoc test with multiple comparison adjustment.
test.kruskal( ..., title = "Kruskal-Wallis + Dunn", xlab = "Group", ylab = "Value", style = c("boxplot", "violin", "mono", "halfeye"), adjust = c("bonferroni", "holm", "BH"), help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| ... | Numeric vectors representing groups, or a data frame with two or more columns (each column is treated as a group). |
| title | Character. Plot title. |
| xlab | Character. X-axis label. |
| ylab | Character. Y-axis label. |
| style | Character. Plot style. One of "boxplot", "violin", "mono", or "halfeye". |
| adjust | Character. Method for p-value adjustment in Dunn's test. One of "bonferroni", "holm", or "BH". |
| help | Logical. If … |
| verbose | Logical. If … |
This function is a non-parametric alternative to one-way ANOVA and is recommended when normality or homoscedasticity assumptions are violated.
Invisibly returns a list with the following components:
Test type.
Kruskal-Wallis H statistic.
Degrees of freedom.
Global test p-value.
Epsilon-squared effect size.
Bootstrap confidence interval for effect size.
Group means and standard deviations.
Dunn post-hoc results.
Significant pairwise comparisons.
Long-format data used in the analysis.
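Several of the returned quantities can be reproduced approximately in base R. The sketch below computes the H statistic, an epsilon-squared effect size using the common formula H / (N - 1), and a percentile bootstrap interval. Both the formula and the bootstrap scheme (resampling within groups, 200 replicates) are assumptions for illustration; the package may compute these differently, and Dunn's post hoc test is omitted because it is not part of base R.

```r
# Sketch only: Kruskal-Wallis H, epsilon-squared, and a bootstrap interval.
# The effect-size formula and bootstrap scheme are ASSUMPTIONS.
set.seed(123)
n  <- 25
df <- data.frame(
  control    = rexp(n, rate = 1),
  treatment1 = rexp(n, rate = 0.6),
  treatment2 = rgamma(n, shape = 2, scale = 1)
)

# Long format: 'values' holds observations, 'ind' is the group factor.
long <- stack(df)
N    <- nrow(long)

kw   <- kruskal.test(values ~ ind, data = long)
H    <- unname(kw$statistic)
eps2 <- H / (N - 1)   # equals H / ((N^2 - 1) / (N + 1))

# Percentile bootstrap: resample observations within each group.
boot_eps2 <- replicate(200, {
  idx <- unlist(lapply(split(seq_len(N), long$ind),
                       function(i) sample(i, replace = TRUE)))
  b <- long[idx, ]
  unname(kruskal.test(values ~ ind, data = b)$statistic) / (N - 1)
})
ci <- quantile(boot_eps2, c(0.025, 0.975))

print(c(H = H, df = unname(kw$parameter), p = kw$p.value, eps2 = eps2))
print(ci)
```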
set.seed(123)
n <- 25
df <- data.frame(
  control = rexp(n, rate = 1),
  treatment1 = rexp(n, rate = 0.6),
  treatment2 = rgamma(n, shape = 2, scale = 1)
)
test.kruskal(df)
Performs Student's t-test to compare means between two independent groups, with automatic checks for normality and homogeneity of variances. If assumptions are violated, the Mann-Whitney test is automatically applied (without generating a plot).
test.t( ..., title = TRUE, title_text = "t-test", xlab = "Group", ylab = "Value", style = c("boxplot", "violin", "mono", "halfeye"), help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| ... | Two numeric vectors or a data frame with exactly two columns. |
| title | Logical. If TRUE, the plot includes a title. |
| title_text | Plot title (string). Default: "t-test". |
| xlab | X-axis label in the plot (string). Default: "Group". |
| ylab | Y-axis label in the plot (string). Default: "Value". |
| style | Plot aesthetic generated by the function. |
| help | Logical. If TRUE, shows a detailed explanation of the function. Default: FALSE. |
| verbose | Logical. If TRUE, prints detailed messages. Default: TRUE. |
Invisible list with summary, test_result, method, and (optionally) plot.
set.seed(123)
df <- data.frame(
  control = rnorm(30, 10),
  treatment = rnorm(30, 15)
)
test.t(df)
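The fallback from Student's t-test to the Mann-Whitney test can be sketched in base R as follows. The F test from `stats::var.test()` is an assumed stand-in for the homogeneity check; the document does not state which test test.t uses for that step.

```r
# Sketch only: assumption checks followed by t-test or Mann-Whitney.
# var.test() is an ASSUMED homogeneity check, not confirmed for test.t.
set.seed(123)
control   <- rnorm(30, 10)
treatment <- rnorm(30, 15)
alpha     <- 0.05

# Shapiro-Wilk normality check in each group.
normal_ok <- shapiro.test(control)$p.value > alpha &&
             shapiro.test(treatment)$p.value > alpha

# Equality of variances between the two groups.
homog_ok <- var.test(control, treatment)$p.value > alpha

res <- if (normal_ok && homog_ok) {
  t.test(control, treatment, var.equal = TRUE)   # Student's t-test
} else {
  wilcox.test(control, treatment)                # Mann-Whitney fallback
}

print(res$method)
print(res$p.value)
```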
Performs multiple pairwise comparisons using Student's t-test or Mann-Whitney test, with automatic diagnostics, effect sizes, confidence intervals, multiple testing correction and visualization.
test.tmulti( ..., comparisons = NULL, title = "Multiple comparisons (t / MW)", xlab = "", ylab = "Value", style = c("boxplot", "violin", "mono", "halfeye"), p_adjust = c("none", "holm", "BH", "bonferroni"), help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| ... | Numeric vectors or a data.frame with groups in columns. |
| comparisons | List of character vectors specifying pairwise comparisons (e.g. list(c("A","B"), c("B","C"))). If NULL, all pairwise combinations are used. |
| title | Plot title. |
| xlab | X-axis label. |
| ylab | Y-axis label. |
| style | Plot style. One of "boxplot", "violin", "mono", or "halfeye". |
| p_adjust | Method for multiple testing correction. One of "none", "holm", "BH", "bonferroni". |
| help | Logical. If TRUE, prints usage examples. |
| verbose | Logical. If TRUE, prints results and plots. |
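The p_adjust options share their names with methods accepted by `stats::p.adjust()`. Whether test.tmulti calls that function internally is an assumption, but the sketch below shows what each correction does to a small, made-up set of raw p-values.

```r
# Sketch only: effect of each correction on three illustrative p-values.
# The raw p-values are made up; use of p.adjust() by the package is ASSUMED.
p_raw   <- c(0.01, 0.02, 0.03)
methods <- c("none", "holm", "BH", "bonferroni")

# One column per correction method, one row per comparison.
adjusted <- sapply(methods, function(m) p.adjust(p_raw, method = m))
print(adjusted)
```

Bonferroni multiplies each p-value by the number of comparisons, Holm applies a step-down version of that rule, and BH controls the false discovery rate rather than the family-wise error rate.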
Normality is assessed using Shapiro-Wilk tests and homogeneity of variances using Levene's test. If assumptions are met, a pooled-variance t-test is used. Otherwise, the Mann-Whitney test is applied with bootstrap confidence intervals.
Effect sizes:
Cohen's d for t-tests
Rank-biserial correlation for Mann-Whitney
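Both effect sizes can be computed with a few lines of base R. The helpers `cohens_d()` and `rank_biserial()` below are hypothetical names written for illustration; the formulas are the standard pooled-SD Cohen's d and the rank-biserial correlation derived from the Wilcoxon W statistic, and the sign convention used by the package may differ.

```r
# Sketch only: standard effect-size formulas; helper names are HYPOTHETICAL.

# Cohen's d with pooled standard deviation.
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sp
}

# Rank-biserial correlation from the W statistic of wilcox.test().
# W counts the (x, y) pairs in which x exceeds y (ties count one half).
rank_biserial <- function(x, y) {
  W <- unname(wilcox.test(x, y, exact = FALSE)$statistic)
  2 * W / (length(x) * length(y)) - 1
}

d  <- cohens_d(c(1, 2, 3), c(2, 3, 4))
rb <- rank_biserial(c(1, 3, 5, 6), c(7, 8, 9, 12))
print(c(cohens_d = d, rank_biserial = rb))
```

In the second call every value of the first group lies below every value of the second, so the rank-biserial correlation reaches its lower bound of -1 under this sign convention.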
A list with:
A tibble with test results.
A ggplot object.
Long-format data used for plotting.
df <- data.frame(
  control = rnorm(30, 10),
  treatment = rnorm(30, 12),
  test1 = rnorm(30, 11),
  test2 = rnorm(30, 15)
)
test.tmulti(
  df,
  comparisons = list(
    c("control", "treatment"),
    c("treatment", "test1")
  )
)
Performs a paired t-test between two numeric vectors (e.g., before vs after) or between two numeric columns of a data frame. Includes four visualization styles (boxplot, violin, mono, and half-eye).
test.tpaired( ..., title = "Paired t-test", xlab = "", ylab = "Value", style = c("boxplot", "violin", "mono", "halfeye"), connect = TRUE, help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| ... | Two numeric vectors of equal length, or a data frame with exactly two numeric columns. |
| title | Plot title. |
| xlab | X-axis label. |
| ylab | Y-axis label. |
| style | Plot style: "boxplot", "violin", "mono", or "halfeye". |
| connect | Logical. If TRUE, connects paired observations. |
| help | If TRUE, displays detailed help. |
| verbose | If TRUE, prints progress messages. |
An invisible list containing:
Group means and standard deviations
t-test result object (stats::t.test)
Data frame used for plotting
ggplot2 object
before <- c(13, 12, 15, 14)
after <- c(9, 11, 10, 10)
test.tpaired(before, after)
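Since the returned test result is described as a stats::t.test object, the underlying computation can be checked directly. The sketch below uses the same before/after values and shows that a paired t-test is equivalent to a one-sample t-test on the pairwise differences; it does not call the package.

```r
# Sketch only: paired t-test with base R, using the example values above.
before <- c(13, 12, 15, 14)
after  <- c(9, 11, 10, 10)

# Paired test on the two vectors.
paired <- t.test(before, after, paired = TRUE)

# Equivalent one-sample test on the differences (4, 1, 5, 4).
diffs      <- before - after
one_sample <- t.test(diffs, mu = 0)

print(c(mean_diff = mean(diffs),
        t         = unname(paired$statistic),
        df        = unname(paired$parameter),
        p         = paired$p.value))
```

The mean difference is 3.5 with 3 degrees of freedom, and both calls return the same t statistic and p-value.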
Performs the Mann-Whitney (Wilcoxon rank-sum) test for comparing two independent groups, with statistical summary and graphical visualization.
test.u( ..., title = "Mann-Whitney Test", xlab = "Group", ylab = "Value", style = c("boxplot", "violin", "mono", "halfeye"), help = FALSE, verbose = TRUE )
| Argument | Description |
|---|---|
| ... | Two numeric vectors or a data.frame with two numeric columns. |
| title | Plot title. Default: "Mann-Whitney Test". |
| xlab | Label for x-axis. Default: "Group". |
| ylab | Label for y-axis. Default: "Value". |
| style | Plot aesthetic style. |
| help | Logical. If TRUE, prints a detailed explanation. Default: FALSE. |
| verbose | Logical. If TRUE, prints detailed messages. Default: TRUE. |
Invisible list with:
Group-wise statistical summary
Test result (htest object)
ggplot2 visualization object
x <- c(1, 3, 5, 6)
y <- c(7, 8, 9, 12)
data <- data.frame(groupA = x, groupB = y)
test.u(data)