Introduction to statease

Overview

statease is an R package designed to simplify statistical analysis by combining a wide range of tests with automatic plain-English interpretation of results. Whether you are a student, researcher, or educator, statease gives you numbers and meaning in one command.

Installation

# Install from CRAN
install.packages("statease")

# Load the package
library(statease)

Dataset

Throughout this vignette we use a simulated dataset of 90 students examining the effect of teaching methods on exam performance.

set.seed(42)

tutorial_data <- data.frame(
  student_id = 1:90,
  method     = rep(c("Traditional", "Online", "Hybrid"), each = 30),
  gender     = rep(c("Male", "Female"), times = 45),
  exam_score = c(
    round(rnorm(30, mean = 65, sd = 10)),
    round(rnorm(30, mean = 72, sd = 10)),
    round(rnorm(30, mean = 78, sd = 10))
  ),
  pre_test = c(
    round(rnorm(30, mean = 55, sd = 10)),
    round(rnorm(30, mean = 58, sd = 10)),
    round(rnorm(30, mean = 57, sd = 10))
  ),
  age        = round(rnorm(90, mean = 22, sd = 3)),
  passed     = rbinom(90, 1, prob = 0.7)
)

head(tutorial_data)
#>   student_id      method gender exam_score pre_test age passed
#> 1          1 Traditional   Male         79       69  22      0
#> 2          2 Traditional Female         59       50  20      0
#> 3          3 Traditional   Male         69       62  23      1
#> 4          4 Traditional Female         71       69  23      1
#> 5          5 Traditional   Male         69       44  21      1
#> 6          6 Traditional Female         64       46  18      0

1. Descriptive Statistics

The describe() function provides a full summary of a numeric variable including measures of central tendency, spread, and a normality check.

result <- describe(tutorial_data$exam_score,
                   var_name = "Exam Score")
print(result)
#> 
#> -- statease Descriptive Report ----------------------------------
#>   Variable     : Exam Score
#>   N            : 90  |  Missing: 0
#> -----------------------------------------------------------------
#>   Mean         : 72.10
#>   Median       : 74.00
#>   Std Dev      : 11.94
#>   Min          : 38.00  |  Max: 93.00
#>   Q1           : 64.00  |  Q3: 80.00
#>   IQR          : 16.00
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The distribution is approximately symmetric.
#>   Spread shows moderate variability (CV = 16.6%).
#>   Shapiro-Wilk test suggests non-normality (W = 0.958, p = 0.0054).
#> -----------------------------------------------------------------

You can also extract individual values:

result$mean
#> [1] 72.1
result$sd
#> [1] 11.93903
result$skew_label
#> [1] "approximately symmetric"

2. T-Tests

Independent Samples T-Test

Compare exam scores between male and female students:

males   <- tutorial_data$exam_score[tutorial_data$gender == "Male"]
females <- tutorial_data$exam_score[tutorial_data$gender == "Female"]

result <- ttest_interpret(
  males, females,
  var_name = "Exam Score by Gender"
)
print(result)
#> 
#> -- statease T-Test Report ----------------------------------------
#>   Test         : Independent Samples T-Test
#>   Variable     : Exam Score by Gender
#>   Groups       : Group 1: n = 45  |  Group 2: n = 45
#> -----------------------------------------------------------------
#>   t-statistic  : 0.026
#>   df           : 88.0
#>   p-value      : 0.9790
#>   95% CI      : [-4.964, 5.097]
#>   Cohen's d    : 0.006 (negligible effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is not statistically significant (p = 0.979 > alpha 0.05).
#>   Group 1 had a higher mean (72.13 vs 72.07).
#>   Effect size is negligible (d = 0.006).
#>   95% CI: true difference lies between -4.964 and 5.097.
#> -----------------------------------------------------------------

One-Sample T-Test

Test whether the average exam score is significantly different from 70:

result <- ttest_interpret(
  tutorial_data$exam_score,
  mu = 70,
  var_name = "Exam Score"
)
print(result)
#> 
#> -- statease T-Test Report ----------------------------------------
#>   Test         : One-Sample T-Test
#>   Variable     : Exam Score
#>   Groups       : n = 90  |  Hypothesised mean (mu) = 70.00
#> -----------------------------------------------------------------
#>   t-statistic  : 1.669
#>   df           : 89.0
#>   p-value      : 0.0987
#>   95% CI      : [69.599, 74.601]
#>   Cohen's d    : 0.176 (negligible effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is not statistically significant (p = 0.099 > alpha 0.05).
#>   Sample mean was 72.10 vs hypothesised 70.00.
#>   Effect size is negligible (d = 0.176).
#>   95% CI: true difference lies between 69.599 and 74.601.
#> 
#>   WARNING: Shapiro-Wilk suggests x may not be normally distributed.
#> -----------------------------------------------------------------

Paired T-Test

Test whether students improved from pre-test to final exam:

result <- ttest_interpret(
  tutorial_data$exam_score,
  tutorial_data$pre_test,
  paired   = TRUE,
  var_name = "Score Improvement"
)
print(result)
#> 
#> -- statease T-Test Report ----------------------------------------
#>   Test         : Paired Samples T-Test
#>   Variable     : Score Improvement
#>   Groups       : n (pairs) = 90
#> -----------------------------------------------------------------
#>   t-statistic  : 10.296
#>   df           : 89.0
#>   p-value      : 0.0000
#>   95% CI      : [13.262, 19.605]
#>   Cohen's d    : 1.085 (large effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is statistically significant (p = 0.000 < alpha 0.05).
#>   Group 1 had a higher mean (72.10 vs 55.67).
#>   Effect size is large (d = 1.085).
#>   95% CI: true difference lies between 13.262 and 19.605.
#> 
#>   WARNING: Shapiro-Wilk suggests x may not be normally distributed.
#> -----------------------------------------------------------------

3. One-Way ANOVA

Test whether teaching method affects exam scores:

result <- anova_interpret(
  exam_score ~ method,
  data = tutorial_data
)
print(result)
#> 
#> -- statease ANOVA Report -----------------------------------------
#>   Outcome      : exam_score
#>   Group        : method  (3 levels)
#> -----------------------------------------------------------------
#>   Group Means:
#>     Hybrid       : Mean = 79.90  (n = 30)
#>     Online       : Mean = 70.77  (n = 30)
#>     Traditional  : Mean = 65.63  (n = 30)
#> -----------------------------------------------------------------
#>   F-statistic  : 14.267
#>   df           : 2, 87
#>   p-value      : 0.0000
#>   Eta squared  : 0.2470 (large effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The overall ANOVA result is statistically significant (p = 0.0000 < alpha 0.05).
#>   Group differences explain 24.7% of variance
#>   (eta^2 = 0.2470, large effect).
#>   WARNING: Bartlett's test suggests unequal variances (p = 0.0421).
#> 
#> -- Post-Hoc Tukey HSD --------------------------------------------
#>   Online-Hybrid
#>     Mean diff = -9.133  |  p adj = 0.0031  |  [significant]
#>   Traditional-Hybrid
#>     Mean diff = -14.267  |  p adj = 0.0000  |  [significant]
#>   Traditional-Online
#>     Mean diff = -5.133  |  p adj = 0.1456  |  [not significant]
#> -----------------------------------------------------------------
#>   Note: Tukey HSD controls for family-wise error rate.
#> -----------------------------------------------------------------

4. Two-Way ANOVA

Test the effect of both teaching method and gender on exam scores:

result <- anova2_interpret(
  exam_score ~ method * gender,
  data = tutorial_data
)
print(result)
#> 
#>  statease Two-Way ANOVA Report 
#>   Outcome      : exam_score
#>   Factor 1     : method
#>   Factor 2     : gender
#>   N            : 90
#> 
#>   Means by method:
#>     Hybrid          : 79.90
#>     Online          : 70.77
#>     Traditional     : 65.63
#> 
#>   Means by gender:
#>     Female          : 72.07
#>     Male            : 72.13
#> 
#>   Interaction Means:
#>             Female  Male
#> Hybrid       81.27 78.53
#> Online       71.60 69.93
#> Traditional  63.33 67.93
#> 
#>   ANOVA Results:
#>   method               : F = 14.123  df = 2,84  significant (p = 0.0000)  eta^2 = 0.2470 (large)
#>   gender               : F = 0.001  df = 1,84  not significant (p = 0.9761)  eta^2 = 0.0000 (negligible)
#>   Interaction          : F = 1.061  df = 2,84  not significant (p = 0.3506)  eta^2 = 0.0186 (small)
#> 
#>   Interpretation:
#>   Main effect of method is significant (p = 0.0000).
#>   Main effect of gender is not significant (p = 0.9761).
#>   Interaction (method x gender) is not significant (p = 0.3506).
#> 
#> -- Post-Hoc Tukey HSD (method) ------------------------------
#>   Online-Hybrid : diff = -9.133  p adj = 0.0033  [significant]
#>   Traditional-Hybrid : diff = -14.267  p adj = 0.0000  [significant]
#>   Traditional-Online : diff = -5.133  p adj = 0.1485  [not significant]
#> -----------------------------------------------------------------

5. MANOVA

Test the combined effect of teaching method on both exam score and pre-test score simultaneously:

result <- manova_interpret(
  cbind(exam_score, pre_test) ~ method,
  data = tutorial_data
)
print(result)
#> 
#> -- statease MANOVA Report ----------------------------------------
#>   Outcomes     : exam_score, pre_test
#>   Group        : method  (3 levels)
#>   N            : 90
#> -----------------------------------------------------------------
#>   Group Means:
#> 
#>   exam_score:
#>     Hybrid       : 79.90
#>     Online       : 70.77
#>     Traditional  : 65.63
#> 
#>   pre_test:
#>     Hybrid       : 56.87
#>     Online       : 55.37
#>     Traditional  : 54.77
#> -----------------------------------------------------------------
#>   Multivariate Test Results:
#>   Pillai's Trace : 0.2565
#>   Wilks' Lambda  : 0.7435
#>   F-statistic    : 6.400  (df = 4, 174)
#>   p-value        : 0.0001
#>   Effect size    : small (Pillai = 0.2565)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The overall MANOVA result is statistically significant (p = 0.0001 < alpha 0.05).
#>   Pillai's Trace = 0.2565 indicates a small effect.
#> -----------------------------------------------------------------
#>   Follow-Up Univariate ANOVAs:
#> 
#>   exam_score
#>     F = 14.267  (df = 2, 87)  p = 0.0000  [significant]
#> 
#>   pre_test
#>     F = 0.403  (df = 2, 87)  p = 0.6695  [not significant]
#> 
#>   Note: Follow-up ANOVAs identify which outcomes
#>   differ significantly across groups.
#> 
#>   WARNING: 'exam_score' may not be normally distributed (Shapiro-Wilk p = 0.0054).
#> -----------------------------------------------------------------

6. Chi-Square Test

Test whether there is an association between teaching method and pass/fail outcome:

tutorial_data$passed_label <- ifelse(tutorial_data$passed == 1,
                                      "Pass", "Fail")

result <- chisq_interpret(
  tutorial_data$method,
  tutorial_data$passed_label
)
print(result)
#> 
#> -- statease Chi-Square Test Report ------------------------------
#>   N            : 90
#> -----------------------------------------------------------------
#>   Contingency Table (Observed):
#>              y
#> x             Fail Pass
#>   Hybrid         6   24
#>   Online         8   22
#>   Traditional   10   20
#> 
#>   Expected Frequencies:
#>              y
#> x             Fail Pass
#>   Hybrid         8   22
#>   Online         8   22
#>   Traditional    8   22
#> 
#> -----------------------------------------------------------------
#>   Chi-square   : 1.364
#>   df           : 2
#>   p-value      : 0.5057
#>   Cramer's V   : 0.123 (small effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is not statistically significant (p = 0.5057 > alpha 0.05).
#>   There is no significant association between the two variables.
#>   Effect size is small (V = 0.123).
#> -----------------------------------------------------------------

7. Correlation Analysis

Test the relationship between pre-test scores and exam scores:

result <- cor_interpret(
  tutorial_data$pre_test,
  tutorial_data$exam_score,
  var1_name = "Pre-Test Score",
  var2_name = "Exam Score"
)
print(result)
#> 
#> -- statease Correlation Report -----------------------------------
#>   Method       : Pearson Product-Moment Correlation
#>   Variables    : Pre-Test Score & Exam Score
#>   N            : 90  |  Missing: 0
#> -----------------------------------------------------------------
#>   r            : -0.0038
#>   p-value      : 0.9720
#>   95% CI      : [-0.2107, 0.2035]
#>   Strength     : negligible
#>   Direction    : negative (as one variable increases, the other tends to decrease)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The correlation is not statistically significant (p = 0.9720 > alpha 0.05).
#>   The relationship between Pre-Test Score and Exam Score is
#>   negligible and negative (as one variable increases, the other tends to decrease) in direction.
#> 
#>   WARNING: One or both variables may not be normally distributed. Consider using method = 'spearman' instead.
#> -----------------------------------------------------------------

Spearman Correlation

result <- cor_interpret(
  tutorial_data$pre_test,
  tutorial_data$exam_score,
  method    = "spearman",
  var1_name = "Pre-Test Score",
  var2_name = "Exam Score"
)
#> Warning in cor.test.default(x, y, method = method, conf.level = conf.level):
#> cannot compute exact p-value with ties
print(result)
#> 
#> -- statease Correlation Report -----------------------------------
#>   Method       : Spearman Rank Correlation
#>   Variables    : Pre-Test Score & Exam Score
#>   N            : 90  |  Missing: 0
#> -----------------------------------------------------------------
#>   r            : 0.0750
#>   p-value      : 0.4826
#>   Strength     : negligible
#>   Direction    : positive (as one variable increases, the other tends to increase)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The correlation is not statistically significant (p = 0.4826 > alpha 0.05).
#>   The relationship between Pre-Test Score and Exam Score is
#>   negligible and positive (as one variable increases, the other tends to increase) in direction.
#> -----------------------------------------------------------------

8. Simple Linear Regression

Predict exam score from pre-test score:

result <- reg_interpret(
  exam_score ~ pre_test,
  data = tutorial_data
)
print(result)
#> 
#> -- statease Simple Linear Regression Report ---------------------
#>   Outcome      : exam_score
#>   Predictor    : pre_test
#>   N            : 90
#> -----------------------------------------------------------------
#>   Model Equation:
#>   exam_score = 72.369 + -0.005 * pre_test
#> -----------------------------------------------------------------
#>   Coefficients:
#>   Intercept    : 72.369
#>   Slope        : -0.005  (SE = 0.137)
#>   t-statistic  : -0.035
#>   p-value      : 0.9720
#>   95% CI      : [-0.278, 0.268]
#> -----------------------------------------------------------------
#>   Model Fit:
#>   R-squared    : 0.0000
#>   Adj R-squared: -0.0113
#>   F-statistic  : 0.001 (df = 1, 88)  p = 0.9720
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The predictor pre_test is not statistically significant (p = 0.9720 > alpha 0.05).
#>   The slope is negative - as pre_test increases by 1 unit, exam_score decreases by 0.005 units.
#>   R-squared = 0.0000: pre_test explains 0.0% of the
#>   variance in exam_score (negligible effect).
#> 
#>   WARNING: Residuals may not be normally distributed (Shapiro-Wilk p = 0.0056).
#> -----------------------------------------------------------------

9. Multiple Linear Regression

Predict exam score from both pre-test score and age:

result <- mlr_interpret(
  exam_score ~ pre_test + age,
  data = tutorial_data
)
print(result)
#> 
#> -- statease Multiple Linear Regression Report -------------------
#>   Outcome      : exam_score
#>   Predictors   : pre_test, age
#>   N            : 90
#> -----------------------------------------------------------------
#>   Model Equation:
#>   exam_score = 70.019 + 0.001*pre_test + 0.093*age
#> -----------------------------------------------------------------
#>   Overall Model Fit:
#>   R-squared    : 0.0005 (negligible effect)
#>   Adj R-squared: -0.0225
#>   F-statistic  : 0.022 (df = 2, 87)  p = 0.9782
#>   The overall model is not statistically significant (p = 0.9782 > alpha 0.05).
#> -----------------------------------------------------------------
#>   Individual Predictors:
#> 
#>   pre_test
#>     Coefficient  : 0.001  (SE = 0.141)
#>     t-statistic  : 0.008
#>     p-value      : 0.9936  [not significant]
#>     95% CI      : [-0.279, 0.282]
#>     Direction    : positive (b = 0.001)
#> 
#>   age
#>     Coefficient  : 0.093  (SE = 0.447)
#>     t-statistic  : 0.207
#>     p-value      : 0.8366  [not significant]
#>     95% CI      : [-0.796, 0.982]
#>     Direction    : positive (b = 0.093)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The model explains 0.1% of the variance in exam_score
#>   (R-squared = 0.0005, negligible effect).
#>   Adjusted R-squared = -0.0225 accounting for
#>   the number of predictors in the model.
#>   Non-significant predictors: pre_test, age
#> 
#>   WARNING: Residuals may not be normally distributed (Shapiro-Wilk p = 0.0048).
#> -----------------------------------------------------------------

10. Logistic Regression

Predict pass/fail outcome from pre-test score and age:

result <- logistic_interpret(
  passed ~ pre_test + age,
  data = tutorial_data
)
#> Waiting for profiling to be done...
print(result)
#> 
#> -- statease Logistic Regression Report --------------------------
#>   Outcome      : passed
#>   Predictors   : pre_test, age
#>   N            : 90
#> -----------------------------------------------------------------
#>   Overall Model Fit:
#>   Chi-square   : 3.668  (df = 2)  p = 0.1598
#>   Nagelkerke R2: 0.0582 (small effect)
#>   The overall model is not statistically significant (p = 0.1598 > alpha 0.05).
#> -----------------------------------------------------------------
#>   Individual Predictors:
#> 
#>   pre_test
#>     Coefficient  : -0.050  (SE = 0.027)
#>     z-statistic  : -1.842
#>     p-value      : 0.0655  [not significant]
#>     Odds Ratio   : 0.951
#>     95% CI (OR) : [0.899, 1.002]
#>     Interpretation: each unit increase in pre_test decreases the odds by 4.9%.
#> 
#>   age
#>     Coefficient  : -0.006  (SE = 0.086)
#>     z-statistic  : -0.065
#>     p-value      : 0.9482  [not significant]
#>     Odds Ratio   : 0.994
#>     95% CI (OR) : [0.838, 1.180]
#>     Interpretation: each unit increase in age decreases the odds by 0.6%.
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The model is not statistically significant (p = 0.1598 > alpha 0.05).
#>   Nagelkerke R2 = 0.0582 suggests a small amount of
#>   variance in passed is explained by the predictors.
#>   Non-significant predictors: pre_test, age
#> -----------------------------------------------------------------

11. Non-Parametric Tests

Mann-Whitney U Test

Non-parametric alternative to the independent samples t-test:

result <- mannwhitney_interpret(
  males, females,
  var_name = "Exam Score by Gender"
)
print(result)
#> 
#> -- statease Mann-Whitney U Test Report --------------------------
#>   Variable     : Exam Score by Gender
#>   Group 1      : n = 45  |  Median = 74.00
#>   Group 2      : n = 45  |  Median = 75.00
#> -----------------------------------------------------------------
#>   W statistic  : 1026.500
#>   p-value      : 0.9120
#>   95% CI      : [-5.000, 5.000]
#>   Effect size  : 0.012 (negligible)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is not statistically significant (p = 0.9120 > alpha 0.05).
#>   Values in Group 2 appear stochastically greater than values in Group 1. (Reported medians: Group 1 = 74.00, Group 2 = 75.00)
#>   Effect size is negligible (r = 0.012).
#>   Note: Mann-Whitney tests stochastic superiority,
#>   not differences in medians.
#> -----------------------------------------------------------------

Wilcoxon Signed Rank Test

Non-parametric alternative to the paired t-test:

result <- wilcoxon_interpret(
  tutorial_data$exam_score,
  tutorial_data$pre_test,
  var_name = "Score Improvement"
)
print(result)
#> 
#> -- statease Wilcoxon Signed Rank Test Report --------------------
#>   Variable     : Score Improvement
#>   N (pairs)    : 90
#>   Pre Median   : 55.00
#>   Post Median  : 74.00
#> -----------------------------------------------------------------
#>   V statistic  : 3711.500
#>   p-value      : 0.0000
#>   95% CI      : [14.000, 20.000]
#>   Effect size  : 0.737 (large)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is statistically significant (p = 0.0000 < alpha 0.05).
#>   Post-measurement values appear stochastically greater than pre-measurement values. (Reported medians: Post = 74.00, Pre = 55.00)
#>   Effect size is large (r = 0.737).
#>   Note: Wilcoxon test compares groups using ranked values.
#>   A significant result sugegsts one group tends to have larger or smaller observation than the other.
#>   This can be interpreted as evidence of stochastic superiority, but only under typical distribution assumptions.  It does not specifically test differences in medians.
#> -----------------------------------------------------------------

Kruskal-Wallis Test

Non-parametric alternative to one-way ANOVA:

result <- kruskal_interpret(
  exam_score ~ method,
  data = tutorial_data
)
print(result)
#> 
#> -- statease Kruskal-Wallis Test Report --------------------------
#>   Outcome      : exam_score
#>   Group        : method  (3 levels)
#>   N            : 90
#> -----------------------------------------------------------------
#>   Group Medians:
#>     Hybrid       : Median = 81.00  (n = 30)
#>     Online       : Median = 73.50  (n = 30)
#>     Traditional  : Median = 64.00  (n = 30)
#> -----------------------------------------------------------------
#>   H statistic  : 23.052
#>   df           : 2
#>   p-value      : 0.0000
#>   Eta squared  : 0.2420 (large effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is statistically significant (p = 0.0000 < alpha 0.05).
#>   The result is statistically significant (p = 0.0000 < alpha 0.05).
#>   Effect size is large (eta^2 = 0.2420).
#>   Note: Kruskal-Wallis test compares multiple groups using ranked values
#>   not differences in medians.
#>   Medians are reported for descriptive purposes only.
#> 
#> -- Post-Hoc Pairwise Comparisons (Wilcoxon) ---------------------
#>   Hybrid vs Online
#>     p = 0.0006  [significant]
#>   Hybrid vs Traditional
#>     p = 0.0000  [significant]
#>   Online vs Traditional
#>     p = 0.0653  [not significant]
#> 
#>   Note: Pairwise Wilcoxon tests used for post-hoc comparisons.
#> -----------------------------------------------------------------

12. P-Value Interpretation

Interpret any p-value in plain English:

result <- interpret_p(
  0.03,
  context = "teaching method effect on exam scores"
)
print(result)
#> 
#> -- statease P-Value Interpretation ------------------------------
#>   Context      : teaching method effect on exam scores
#>   P-value      : 0.0300
#>   Alpha        : 0.05
#> -----------------------------------------------------------------
#>   Decision     : REJECT the null hypothesis at alpha = 0.05
#>   Evidence     : There is moderate evidence against the null hypothesis.
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is statistically significant. The observed data is unlikely to have occurred by chance if the null hypothesis were true.
#> 
#>   Note: Statistical significance does not imply practical importance. Always consider effect size alongside the p-value.
#> -----------------------------------------------------------------

13. The Master Function

The analyze() function automatically detects the right test based on your input — no need to remember which function to use!

# Descriptive statistics
analyze(x = tutorial_data$exam_score, var_name = "Exam Score")
#> [statease] Single numeric vector -> Running Descriptive Statistics
#> 
#> -- statease Descriptive Report ----------------------------------
#>   Variable     : Exam Score
#>   N            : 90  |  Missing: 0
#> -----------------------------------------------------------------
#>   Mean         : 72.10
#>   Median       : 74.00
#>   Std Dev      : 11.94
#>   Min          : 38.00  |  Max: 93.00
#>   Q1           : 64.00  |  Q3: 80.00
#>   IQR          : 16.00
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The distribution is approximately symmetric.
#>   Spread shows moderate variability (CV = 16.6%).
#>   Shapiro-Wilk test suggests non-normality (W = 0.958, p = 0.0054).
#> -----------------------------------------------------------------

# Auto t-test
analyze(
  x        = males,
  y        = females,
  var_name = "Exam Score by Gender"
)
#> [statease] Two numeric vectors detected -> Running T-Test
#> 
#> -- statease T-Test Report ----------------------------------------
#>   Test         : Independent Samples T-Test
#>   Variable     : Exam Score by Gender
#>   Groups       : Group 1: n = 45  |  Group 2: n = 45
#> -----------------------------------------------------------------
#>   t-statistic  : 0.026
#>   df           : 88.0
#>   p-value      : 0.9790
#>   95% CI      : [-4.964, 5.097]
#>   Cohen's d    : 0.006 (negligible effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is not statistically significant (p = 0.979 > alpha 0.05).
#>   Group 1 had a higher mean (72.13 vs 72.07).
#>   Effect size is negligible (d = 0.006).
#>   95% CI: true difference lies between -4.964 and 5.097.
#> -----------------------------------------------------------------

# Auto ANOVA
analyze(formula = exam_score ~ method, data = tutorial_data)
#> [statease] 3+ groups detected -> Running One-Way ANOVA
#> 
#> -- statease ANOVA Report -----------------------------------------
#>   Outcome      : exam_score
#>   Group        : method  (3 levels)
#> -----------------------------------------------------------------
#>   Group Means:
#>     Hybrid       : Mean = 79.90  (n = 30)
#>     Online       : Mean = 70.77  (n = 30)
#>     Traditional  : Mean = 65.63  (n = 30)
#> -----------------------------------------------------------------
#>   F-statistic  : 14.267
#>   df           : 2, 87
#>   p-value      : 0.0000
#>   Eta squared  : 0.2470 (large effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The overall ANOVA result is statistically significant (p = 0.0000 < alpha 0.05).
#>   Group differences explain 24.7% of variance
#>   (eta^2 = 0.2470, large effect).
#>   WARNING: Bartlett's test suggests unequal variances (p = 0.0421).
#> 
#> -- Post-Hoc Tukey HSD --------------------------------------------
#>   Online-Hybrid
#>     Mean diff = -9.133  |  p adj = 0.0031  |  [significant]
#>   Traditional-Hybrid
#>     Mean diff = -14.267  |  p adj = 0.0000  |  [significant]
#>   Traditional-Online
#>     Mean diff = -5.133  |  p adj = 0.1456  |  [not significant]
#> -----------------------------------------------------------------
#>   Note: Tukey HSD controls for family-wise error rate.
#> -----------------------------------------------------------------

# Auto non-parametric
analyze(
  formula  = exam_score ~ method,
  data     = tutorial_data,
  nonparam = TRUE
)
#> [statease] 3+ groups + nonparam = TRUE -> Running Kruskal-Wallis Test
#> 
#> -- statease Kruskal-Wallis Test Report --------------------------
#>   Outcome      : exam_score
#>   Group        : method  (3 levels)
#>   N            : 90
#> -----------------------------------------------------------------
#>   Group Medians:
#>     Hybrid       : Median = 81.00  (n = 30)
#>     Online       : Median = 73.50  (n = 30)
#>     Traditional  : Median = 64.00  (n = 30)
#> -----------------------------------------------------------------
#>   H statistic  : 23.052
#>   df           : 2
#>   p-value      : 0.0000
#>   Eta squared  : 0.2420 (large effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is statistically significant (p = 0.0000 < alpha 0.05).
#>   The result is statistically significant (p = 0.0000 < alpha 0.05).
#>   Effect size is large (eta^2 = 0.2420).
#>   Note: Kruskal-Wallis test compares multiple groups using ranked values
#>   not differences in medians.
#>   Medians are reported for descriptive purposes only.
#> 
#> -- Post-Hoc Pairwise Comparisons (Wilcoxon) ---------------------
#>   Hybrid vs Online
#>     p = 0.0006  [significant]
#>   Hybrid vs Traditional
#>     p = 0.0000  [significant]
#>   Online vs Traditional
#>     p = 0.0653  [not significant]
#> 
#>   Note: Pairwise Wilcoxon tests used for post-hoc comparisons.
#> -----------------------------------------------------------------

# Auto regression
analyze(formula = exam_score ~ pre_test, data = tutorial_data)
#> [statease] Numeric predictor detected -> Running Simple Linear Regression
#> 
#> -- statease Simple Linear Regression Report ---------------------
#>   Outcome      : exam_score
#>   Predictor    : pre_test
#>   N            : 90
#> -----------------------------------------------------------------
#>   Model Equation:
#>   exam_score = 72.369 + -0.005 * pre_test
#> -----------------------------------------------------------------
#>   Coefficients:
#>   Intercept    : 72.369
#>   Slope        : -0.005  (SE = 0.137)
#>   t-statistic  : -0.035
#>   p-value      : 0.9720
#>   95% CI      : [-0.278, 0.268]
#> -----------------------------------------------------------------
#>   Model Fit:
#>   R-squared    : 0.0000
#>   Adj R-squared: -0.0113
#>   F-statistic  : 0.001 (df = 1, 88)  p = 0.9720
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The predictor pre_test is not statistically significant (p = 0.9720 > alpha 0.05).
#>   The slope is negative - as pre_test increases by 1 unit, exam_score decreases by 0.005 units.
#>   R-squared = 0.0000: pre_test explains 0.0% of the
#>   variance in exam_score (negligible effect).
#> 
#>   WARNING: Residuals may not be normally distributed (Shapiro-Wilk p = 0.0056).
#> -----------------------------------------------------------------

# Auto MANOVA
analyze(
  formula = cbind(exam_score, pre_test) ~ method,
  data    = tutorial_data
)
#> [statease] Multiple outcomes detected -> Running MANOVA
#> 
#> -- statease MANOVA Report ----------------------------------------
#>   Outcomes     : exam_score, pre_test
#>   Group        : method  (3 levels)
#>   N            : 90
#> -----------------------------------------------------------------
#>   Group Means:
#> 
#>   exam_score:
#>     Hybrid       : 79.90
#>     Online       : 70.77
#>     Traditional  : 65.63
#> 
#>   pre_test:
#>     Hybrid       : 56.87
#>     Online       : 55.37
#>     Traditional  : 54.77
#> -----------------------------------------------------------------
#>   Multivariate Test Results:
#>   Pillai's Trace : 0.2565
#>   Wilks' Lambda  : 0.7435
#>   F-statistic    : 6.400  (df = 4, 174)
#>   p-value        : 0.0001
#>   Effect size    : small (Pillai = 0.2565)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The overall MANOVA result is statistically significant (p = 0.0001 < alpha 0.05).
#>   Pillai's Trace = 0.2565 indicates a small effect.
#> -----------------------------------------------------------------
#>   Follow-Up Univariate ANOVAs:
#> 
#>   exam_score
#>     F = 14.267  (df = 2, 87)  p = 0.0000  [significant]
#> 
#>   pre_test
#>     F = 0.403  (df = 2, 87)  p = 0.6695  [not significant]
#> 
#>   Note: Follow-up ANOVAs identify which outcomes
#>   differ significantly across groups.
#> 
#>   WARNING: 'exam_score' may not be normally distributed (Shapiro-Wilk p = 0.0054).
#> -----------------------------------------------------------------

14. Fisher’s Exact Test

Test the association between two categorical variables when sample sizes are small:

result <- fisher_interpret(
  tutorial_data$method,
  tutorial_data$passed_label
)
print(result)
#> 
#> --- statease Fisher's Exact Test Report ---------------
#>   N            : 90
#>   Table size   : 3 x 2
#> 
#>   Contingency Table (Observed):
#>              y
#> x             Fail Pass
#>   Hybrid         6   24
#>   Online         8   22
#>   Traditional   10   20
#> 
#>   Expected Frequencies:
#>              y
#> x             Fail Pass
#>   Hybrid         8   22
#>   Online         8   22
#>   Traditional    8   22
#> 
#> -----------------------------------------------------------------
#>   p-value      : 0.5541
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is not statistically significant (p = 0.5541 > alpha 0.05).
#>   There is insufficient evidence of an association between the two variables.
#> 
#>   WARNING: The contingency table is larger than 2x2. Fisher's Exact Test may require simulation or approximation methods, which can increase computation time and reduce exactness. Interpret results with caution. If simulate.p.value = TRUE was used, explicitly state this in your report.
#> 
#>   NOTE: All expected frequencies exceed 5. A Chi-square test would also be appropriate and may be preferred for larger samples. Consider using chisq_interpret().
#> -----------------------------------------------------------------

15. McNemar’s Test

Test whether there is a significant difference in paired proportions between two related measurements:

pre_pass  <- sample(c("Yes","No"), 90, replace = TRUE, 
                    prob = c(0.4, 0.6))
post_pass <- sample(c("Yes","No"), 90, replace = TRUE, 
                    prob = c(0.7, 0.3))

result <- mcnemar_interpret(pre_pass, post_pass)
print(result)
#> 
#> statease McNemar's Test Report --------------------------------
#>   N            : 90
#>   Table size   : 2 x 2
#>   Discordant   : 49 pairs
#> -----------------------------------------------------------------
#>   Contingency Table:
#>      y
#> x     No Yes
#>   No  14  39
#>   Yes 10  27
#> 
#> -----------------------------------------------------------------
#>   p-value      : 0.0001
#>   Matched OR   : 3.900
#>   95% CI      : [1.947, 7.812]
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is statistically significant (p = 0.0001 < alpha 0.05).
#>   There is evidence of a significant difference in paired proportions between the two measurements.
#>   Matched OR = 3.900: More subjects changed from the first category to the second category than vice versa.
#>   95% CI [1.947, 7.812] excludes 1: Evidence of a significant difference in paired proportions between measurements.
#> 
#>   WARNING: McNemar's Test assumes that observations are paired and independent across pairs. Violation of this assumption may affect the validity of the results.
#> 
#>   NOTE: McNemar's Test requires paired or matched data. Ensure that each row in your data represents the same subject measured twice or a matched pair.
#> -----------------------------------------------------------------

16. Friedman Test

Non-parametric alternative to repeated measures ANOVA:

friedman_data <- data.frame(
  score   = c(23,45,12,67,34,89,56,43,78,90,11,34),
  time    = rep(c("T1","T2","T3"), each = 4),
  subject = rep(1:4, times = 3)
)

result <- friedman_interpret(score ~ time | subject, 
                             data = friedman_data)
#> Warning in friedman_interpret(score ~ time | subject, data = friedman_data):
#> The Friedman Test may have low statistical power with very small sample sizes.
#> Interpret non-significant results with caution.
print(result)
#> 
#> -- statease Friedman Test Report 
#>   Outcome      : score
#>   Time/Group   : time (3 levels)
#>   Subjects     : subject (n = 4)
#> -----------------------------------------------------------------
#>   Group Medians (descriptive only):
#>     T1           : 34.00
#>     T2           : 49.50
#>     T3           : 56.00
#> -----------------------------------------------------------------
#>   Chi-square   : 0.500
#>   df           : 2
#>   p-value      : 0.7788
#>   Kendall's W  : 0.0625 (negligible effect)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   The result is not statistically significant (p = 0.7788 > alpha 0.05).
#>   There is insufficient evidence of a significant difference in ranks across the related groups or repeated measurements.
#> 
#>   NOTE: Medians are reported for descriptive purposes only.
#>   The Friedman Test assesses whether rank distributions
#>   differ across groups and does not directly test for
#>   differences in medians.
#> 
#>   Post-hoc tests not run (overall result not significant).
#> 
#>   WARNING: The Friedman Test assumes that the blocks (subjects) are independent of each other. Violation of this assumption may affect the validity of the results.
#> 
#>   NOTE: Normality assumption appears reasonable. If the assumptions of repeated measures ANOVA are met, consider using repeated measures ANOVA for greater statistical power.
#> -----------------------------------------------------------------

17. Checking Assumptions

Before running a test, check its assumptions automatically:

result <- check_assumptions(
  "ttest",
  x = males,
  y = females
)
print(result)
#> 
#> -- statease Assumption Check Report -------------------------------
#>   Test         : ttest
#> ---------------------------------------------------------------------
#> 
#>   [PASSED] Normality (x)
#>     Shapiro-Wilk test: statistic = 0.950, p = 0.0502. Normality assumption appears satisfied.
#> 
#>   [PASSED] Sample size guidance (x)
#>     n = 45. Sample size appears reasonable.
#> 
#>   [PASSED] Normality (y)
#>     Shapiro-Wilk test: statistic = 0.963, p = 0.1561. Normality assumption appears satisfied.
#> 
#>   [PASSED] Sample size guidance (y)
#>     n = 45. Sample size appears reasonable.
#> 
#>   [PASSED] Homogeneity of variance
#>     Levene's Test: p = 0.8482. Variances appear approximately equal.
#> 
#> ---------------------------------------------------------------------
#>   NOTE: Assumption checks are based on statistical tests and
#>   heuristics. They provide guidance but should not be
#>   interpreted as definitive proof that assumptions are met
#>   or violated.
#> 
#>   NOTE: Failure to reject an assumption test does not prove
#>   that the assumption has been satisfied.
#> 
#>   NOTE: Visual inspection of residual plots is always
#>   recommended alongside formal assumption tests.
#> ---------------------------------------------------------------------

18. Power Analysis

Determine the sample size needed to detect an effect:

result <- power_interpret(
  "ttest.two",
  effect_size = 0.5
)
print(result)
#> 
#> -- statease Power Analysis Report  
#>   Test         : Independent Samples T-Test
#>   Mode         : Calculate required sample size
#> -----------------------------------------------------------------
#>   Effect size  : 0.500 (large)
#>   Alpha        : 0.05
#>   Desired power: 0.80 (80%)
#>   Required n   : 64
#>   Total N      : 128 (2 groups x 64)
#> -----------------------------------------------------------------
#>   Interpretation:
#>   To detect a large effect (effect size = 0.50) with 80% power at alpha = 0.05, you need at least 64 participants per group (128 total for 2 groups).
#> 
#>   NOTE: Power analysis results are estimates based on assumptions about effect size, alpha, and power. Actual results may differ depending on the true effect size in the population.
#>   NOTE: Effect sizes should ideally be based on previous research, pilot studies, or theoretically justified values, not chosen arbitrarily to reduce required sample size.
#>   NOTE: A power of 0.80 is a conventional minimum. In high stakes research such as clinical trials, a higher power of 0.90 or 0.95 is often recommended.
#>   NOTE: Power analysis assumes that the chosen statistical test and its assumptions are appropriate for the data.
#> -----------------------------------------------------------------

19. The Shiny App

For users who prefer a point-and-click interface, statease includes a built-in Shiny app:

statease::run_app()

This launches an interactive web application where you can upload your data, select variables, and run any statease function without writing code.

Summary

Test	Function	analyze() support
Descriptive stats	`describe()`	✔
Independent t-test	`ttest_interpret()`	✔
One-sample t-test	`ttest_interpret()`	✔
Paired t-test	`ttest_interpret()`	✔
One-way ANOVA	`anova_interpret()`	✔
Two-way ANOVA	`anova2_interpret()`	✔
MANOVA	`manova_interpret()`	✔
Chi-square	`chisq_interpret()`	✔
Correlation	`cor_interpret()`	✔
Simple regression	`reg_interpret()`	✔
Multiple regression	`mlr_interpret()`	✔
Logistic regression	`logistic_interpret()`	✔
Mann-Whitney U	`mannwhitney_interpret()`	✔
Wilcoxon Signed Rank	`wilcoxon_interpret()`	✔
Kruskal-Wallis	`kruskal_interpret()`	✔
Fisher’s Exact	`fisher_interpret()`	✔
McNemar’s Test	`mcnemar_interpret()`	✔
Friedman Test	`friedman_interpret()`	✔
Check Assumptions	`check_assumptions()`	✔
Power Analysis	`power_interpret()`	✔
P-value interpretation	`interpret_p()`	-