---
title: "Getting started with baselinr"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with baselinr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(baselinr)
```

## The problem

Every quasi-experimental impact study in education has to answer the same question before anyone looks at outcomes: *were the treatment and comparison groups similar enough at baseline?*

The What Works Clearinghouse (WWC) sets the de facto standard for this in education research:

- a covariate with an absolute standardized mean difference (Hedges' g) of **0.05 or less** satisfies baseline equivalence on its own;
- **greater than 0.05 and up to 0.25**, equivalence holds only if the covariate is statistically adjusted for in the impact model;
- **above 0.25**, the covariate cannot establish equivalence.

`baselinr` computes those effect sizes and categories so the baseline table is not something you assemble by hand for every report.

## A worked example

```{r}
study <- data.frame(
  treat   = c(1, 1, 1, 0, 0, 0),
  pretest = c(5, 6, 7, 4, 5, 6), # continuous -> Hedges' g
  female  = c(1, 0, 1, 0, 0, 1)  # binary -> Cox index
)

baseline_equivalence(study, treatment = "treat")
```

By default, every numeric, logical, and factor column other than the treatment indicator is treated as a covariate. A covariate with exactly two unique values is treated as binary and summarized with the Cox index; other numeric covariates use Hedges' g. Pass `covariates =` to control the set explicitly.

## The building blocks

`baseline_equivalence()` is built from exported helpers you can also call directly.

```{r}
# Standardized mean difference (Hedges' g) for a continuous covariate
hedges_g(study$pretest, study$treat)

# Cox index for a binary covariate
cox_index(study$female, study$treat)

# Classify any effect size(s) into the WWC categories
wwc_classify(c(0.03, 0.12, 0.80))
```

## Visualise and format

A Love plot shows the standardized effect size of each covariate against the WWC thresholds (0.05 and 0.25), coloured by category:

```{r loveplot, eval = requireNamespace("ggplot2", quietly = TRUE), fig.width = 7, fig.height = 2.6}
love_plot(baseline_equivalence(study, treatment = "treat"))
```

For a report-ready table, `gt_baseline()` returns a formatted `gt` table:

```{r eval = FALSE}
gt_baseline(baseline_equivalence(study, treatment = "treat"))
```

## Scope

Continuous covariates use Hedges' g (with the WWC small-sample correction); binary covariates use the WWC Cox index. Collapse the table into an overall verdict with `wwc_summary()`, assess sample loss with `attrition()`, visualise with `love_plot()`, and format with `gt_baseline()`. See `NEWS.md` for the roadmap.
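
To make the two effect sizes and the thresholds concrete, the chunk below recomputes them by hand in base R on the same toy data. It is a sketch of the formulas as published in the WWC Procedures Handbook, not a description of `baselinr`'s internals: the `*_sketch` helpers, the `toy` data frame, and the category labels are hypothetical names introduced here, and the assumption that the small-sample correction is applied to the Cox index as well as to Hedges' g may not match what `cox_index()` does.

```{r}
# Hypothetical helpers for illustration only; none of these are baselinr functions.

# WWC small-sample correction: omega = 1 - 3 / (4N - 9), N = total sample size
small_sample_omega <- function(n_total) 1 - 3 / (4 * n_total - 9)

# Hedges' g: corrected mean difference divided by the pooled standard deviation
hedges_g_sketch <- function(x, treat) {
  x_t <- x[treat == 1]
  x_c <- x[treat == 0]
  n_t <- length(x_t)
  n_c <- length(x_c)
  pooled_sd <- sqrt(
    ((n_t - 1) * var(x_t) + (n_c - 1) * var(x_c)) / (n_t + n_c - 2)
  )
  small_sample_omega(n_t + n_c) * (mean(x_t) - mean(x_c)) / pooled_sd
}

# Cox index: difference in log odds divided by 1.65
# (assumption: the same omega correction is applied; undefined if a
# group proportion is exactly 0 or 1)
cox_index_sketch <- function(x, treat) {
  p_t <- mean(x[treat == 1])
  p_c <- mean(x[treat == 0])
  small_sample_omega(length(x)) * (qlogis(p_t) - qlogis(p_c)) / 1.65
}

# Threshold logic: |es| <= 0.05, 0.05 < |es| <= 0.25, |es| > 0.25
wwc_category_sketch <- function(es) {
  cut(
    abs(es),
    breaks = c(-Inf, 0.05, 0.25, Inf),
    labels = c("satisfied", "adjustment required", "not satisfied")
  )
}

toy <- data.frame(
  treat   = c(1, 1, 1, 0, 0, 0),
  pretest = c(5, 6, 7, 4, 5, 6),
  female  = c(1, 0, 1, 0, 0, 1)
)

g_by_hand   <- hedges_g_sketch(toy$pretest, toy$treat)
cox_by_hand <- cox_index_sketch(toy$female, toy$treat)

g_by_hand
cox_by_hand
wwc_category_sketch(c(g_by_hand, cox_by_hand))
```

With three observations per group the correction factor is 1 - 3/15 = 0.8, and the pretest means differ by exactly one pooled standard deviation, so the hand-computed Hedges' g is 0.8. If the values returned by `hedges_g()` and `cox_index()` differ from these, the package documentation is the authority on which variant of each formula it implements.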