Package: mixedsubjects 1.0.0

Klint Kanopka

mixedsubjects: Causal Inference in Experiments with Mixed-Subjects Designs

Implements seven estimators of the average treatment effect (ATE) in mixed-subjects designs (MSDs), where human-subject data are augmented with predictions from large language models (LLMs). The estimators are Difference-in-Means (DiM), GREG, PPI++, Doubly-Tuned (D-T), Difference-in-Predictions (DiP), DiP++, and Doubly-Tuned DiP (D-T DiP). Provides point estimates, variance estimation via the delta method or the bootstrap, and optimal design selection for allocating a budget between human observations and LLM predictions.
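
As a point of reference for the simplest estimator named above, Difference-in-Means uses only the human outcomes and ignores the LLM predictions. The following is a minimal base-R sketch of the textbook estimator with its usual unequal-variance standard error. It is not the package's msd_dim() implementation, whose arguments are not shown on this page, and the vectors y and d are hypothetical toy data.

```r
# Hypothetical toy data (an assumption for illustration, not from the package).
y <- c(5.1, 6.3, 5.8, 6.9, 4.2, 4.8, 5.0, 4.4)  # observed human outcomes
d <- c(1, 1, 1, 1, 0, 0, 0, 0)                  # 1 = treated, 0 = control

# Split outcomes by treatment arm.
y1 <- y[d == 1]
y0 <- y[d == 0]

# Difference-in-means point estimate of the ATE.
ate_dim <- mean(y1) - mean(y0)

# Standard error: sum of the per-arm sample variances of the mean.
se_dim <- sqrt(var(y1) / length(y1) + var(y0) / length(y0))

cat("DiM estimate:", ate_dim, "\n")
cat("Standard error:", se_dim, "\n")
```

The other six estimators adjust this baseline using LLM predictions, so the sketch is useful mainly as the comparison point when judging whether predictions reduce variance.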

Authors: Austin van Loon [aut], Klint Kanopka [aut, cre], Yuan Huang [ctb]

mixedsubjects_1.0.0.tar.gz
mixedsubjects_1.0.0.tar.gz (r-4.7-any) | mixedsubjects_1.0.0.tar.gz (r-4.6-any)
mixedsubjects_1.0.0.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
mixedsubjects/json (API)

# Install 'mixedsubjects' in R:
install.packages('mixedsubjects', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/klintkanopka/mixedsubjects/issues

Pkgdown/docs site: https://klintkanopka.com

3.00 score | 2 scripts | 12 exports | 0 dependencies

Last updated from: 144fa387a3. Checks: 4 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 117
source / vignettes | OK | 219
linux-release-x86_64 | OK | 129
wasm-release | OK | 94

Exports: bootstrap_variance, compare_variance_methods, estimate_all, msd_data, msd_dim, msd_dip, msd_dip_pp, msd_dt, msd_dt_dip, msd_greg, msd_ppi, optimal_design

Dependencies: none

Comparing Estimators Under Different Data-Generating Processes
Overview | Simulation Engine | Scenario 1: Poor Predictions — DiM Wins | Scenario 2: Negatively Correlated Predictions | Scenario 3: Heterogeneous Prediction Quality — Double-Tuned Wins | Scenario 4: High Common-Mode Error | Scenario 5: Common-Mode Error + Heterogeneous Quality | Scenario 6: Near-Perfect Predictions | Summary Table

Last update: 2026-07-02
Started: 2026-07-02

Introduction to Mixed-Subjects Designs with the mixedsubjects Package
Overview | Why Mixed-Subjects Designs? | What This Package Does | Installation | A Quick Example | Simulating Experimental Data | Creating an MSD Data Object | Estimating the Treatment Effect | Understanding the Data Structure | What Data Do You Need? | Two Types of Predictions | Creating Your Data Object | Flexible Column Names | Formula Interface | The Seven Estimators | 1. Difference-in-Means (DiM) | 2. GREG (Calibration Estimator) | 3. PPI++ (Power-Tuned) | 4. D-T (Doubly-Tuned) | 5. DiP (Difference-in-Predictions) | 6. DiP++ (Power-Tuned DiP) | 7. D-T DiP (Doubly-Tuned DiP) | Comparing All Estimators | Choosing the Right Estimator | Decision Tree | When to Use DiP-Type Estimators | Variance Estimation | Delta-Method (Default) | Bootstrap (Optional) | Optimal Experimental Design | The Design Problem | Using optimal_design() | Interpreting the Results | Comparing Designs | Practical Workflow | Step 1: Pilot Study | Step 2: Estimate Prediction Quality | Step 3: Plan Your Main Study | Step 4: Run the Main Study | Step 5: Analyze and Report | Common Questions | Q: What if my LLM predictions are biased? | Q: How much do predictions need to correlate with outcomes? | Q: Should I use cross-fitting? | Q: What if predictions are worse in one treatment arm? | Q: How many folds should I use for cross-fitting? | Q: Can I use predictions from multiple LLMs? | Technical Details | Assumptions | Variance Formulas | Citation | Session Info

Last update: 2026-07-02
Started: 2026-07-02