| Title: | Model Comparison via the 'InterModel Vigorish' ('IMV') |
|---|---|
| Description: | Computes the 'InterModel Vigorish' ('IMV'), a metric for comparing the predictive accuracy of two models for binary outcomes. The 'IMV' is derived from the expected value of a bettor using one model's predicted probabilities against those of a competing model, and is estimated via k-fold cross-validation. Methods are provided for generalized linear models, mixed-effects models ('lme4'), and item response theory models ('mirt'). See <doi:10.1371/journal.pone.0316491>. |
| Authors: | Ben Domingue [aut, cre], Christian Jackson [ctb] |
| Maintainer: | Ben Domingue <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3 |
| Built: | 2026-05-11 20:49:06 UTC |
| Source: | https://github.com/cran/imv |
S3 generic that computes the InterModel Vigorish (IMV) between a baseline
model m0 and an enhanced model m1 via k-fold cross-validation.
For each fold, both models are refit on the training partition and evaluated on
the held-out partition; the IMV is computed from those out-of-fold predictions.
imv.default provides an escape hatch for unsupported model types via
predict_fn: in that case the original fitted models are used (without
refitting) to obtain predictions on each test fold.
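The refitting procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the random fold assignment, the use of `update()` to refit, and the names `fold` and `oof` are choices made for the example. It stops at the out-of-fold predictions, from which a per-fold IMV would then be computed.

```r
set.seed(1)
n <- 200
df <- data.frame(x = rnorm(n))
df$y <- rbinom(n, 1, plogis(df$x))

m0 <- glm(y ~ 1, data = df, family = binomial)  # baseline model
m1 <- glm(y ~ x, data = df, family = binomial)  # enhanced model

# Assign each row to one of nfold folds at random (assumed scheme).
nfold <- 4
fold <- sample(rep(seq_len(nfold), length.out = n))

oof <- lapply(seq_len(nfold), function(k) {
  train <- df[fold != k, , drop = FALSE]
  test  <- df[fold == k, , drop = FALSE]
  # Refit both models on the training partition only.
  f0 <- update(m0, data = train)
  f1 <- update(m1, data = train)
  # Predict on the held-out partition.
  data.frame(
    fold = k,
    y    = test$y,
    p0   = predict(f0, newdata = test, type = "response"),
    p1   = predict(f1, newdata = test, type = "response")
  )
})
oof <- do.call(rbind, oof)
head(oof)
```

Every row of `df` appears exactly once in `oof`, with predictions from models that never saw that row during fitting.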

    imv(m0, m1, ...)

    ## S3 method for class 'glm'
    imv(m0, m1, data = NULL, nfold = 4, predict_fn = NULL, y = NULL, ...)

    ## Default S3 method:
    imv(m0, m1, data = NULL, nfold = 4, predict_fn = NULL, y = NULL, ...)


| Argument | Description |
|---|---|
| m0 | Baseline model. |
| m1 | Enhanced model. |
| data | Data frame used for cross-validation. May be omitted for model classes that store training data internally (e.g., objects from |
| nfold | Number of cross-validation folds (default 4). |
| predict_fn | Optional function with signature |
| y | Character string naming the binary outcome column in |
| ... | Additional arguments passed to methods. |

A named list with four elements:
Numeric vector of per-fold IMV values (length nfold).
Mean IMV across folds.
Standard deviation of per-fold IMVs.
Named numeric vector of length 2: a 95% interval computed as mean +/- 1.96 * (sd / sqrt(nfold)).
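As a worked example of the interval formula, the summary elements can be reproduced from a vector of per-fold values. The fold values below are made up for illustration, and the element names `lower` and `upper` are assumptions, not the names the package necessarily uses.

```r
# Made-up per-fold IMV values, for illustration only.
folds <- c(0.031, 0.027, 0.040, 0.022)
nfold <- length(folds)

m <- mean(folds)            # mean IMV across folds
s <- sd(folds)              # standard deviation of per-fold IMVs
half <- 1.96 * (s / sqrt(nfold))

# 95% interval: mean +/- 1.96 * (sd / sqrt(nfold))
ci <- c(lower = m - half, upper = m + half)
ci
```

With few folds the interval rests on a standard error estimated from only `nfold` values, so it is best read as a rough guide.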
Domingue, B. W., Rahal, C., Faul, J., Freese, J., Kanopka, K., Rigos, A., Stenhaug, B., & Tripathi, A. S. (2025). The InterModel Vigorish (IMV) as a flexible and portable approach for quantifying predictive accuracy with binary outcomes. PLOS ONE, 20(3), e0316491. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0316491
See also: imv.binary, imv.glmerMod, imv.SingleGroupClass

    ## --- glm ------------------------------------------------------------
    set.seed(1)
    x <- rnorm(100)
    y <- rbinom(100, 1, plogis(x))
    df <- data.frame(x = x, y = y)
    m0 <- glm(y ~ 1, df, family = "binomial")
    m1 <- glm(y ~ x, df, family = "binomial")
    result <- imv(m0, m1, nfold = 2)
    result$mean
    result$ci

    ## --- custom predict_fn (escape hatch for unsupported model types) ---
    pfn <- function(model, newdata) predict(model, newdata, type = "response")
    result <- imv(m0, m1, data = df, y = "y", predict_fn = pfn, nfold = 2)
    result$mean

    ## --- glmer (requires lme4) ------------------------------------------
    if (requireNamespace("lme4", quietly = TRUE)) {
      data(sleepstudy, package = "lme4")
      sleepstudy$slow <- as.integer(sleepstudy$Reaction > 300)
      m0 <- lme4::glmer(slow ~ (1 | Subject), sleepstudy, family = binomial)
      m1 <- lme4::glmer(slow ~ Days + (1 | Subject), sleepstudy, family = binomial)
      imv(m0, m1)
    }

    ## --- mirt (requires mirt) -------------------------------------------
    if (requireNamespace("mirt", quietly = TRUE)) {
      resp <- mirt::expand.table(mirt::LSAT7)
      mod1 <- mirt::mirt(resp, 1, "Rasch", verbose = FALSE)  # 1PL
      mod2 <- mirt::mirt(resp, 1, verbose = FALSE)           # 2PL
      imv(mod1, mod2)
    }

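Computes the InterModel Vigorish (IMV) comparing baseline predictions
p1 to enhanced predictions p2 for binary outcomes y. In this method's
signature, the outcomes y are passed as the first argument (m0) and the
baseline predictions p1 as the second (m1).
A positive value indicates that p2 predicts better than p1; a negative
value indicates the reverse. The comparison is out of sample only when y
and both sets of predictions come from held-out data.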

    ## S3 method for class 'binary'
    imv(m0, m1, p2, sigma = 1e-04, ...)


| Argument | Description |
|---|---|
| m0 | Integer or numeric vector of binary outcomes (0/1), preferably from a held-out test set. |
| m1 | Numeric vector of baseline predicted probabilities (same length as |
| p2 | Numeric vector of enhanced predicted probabilities (same length as |
| sigma | Small positive constant used to clip probabilities away from 0 and 1 to avoid numerical issues. Default |
| ... | Currently unused. Accepted for consistency with the |

A scalar IMV value. Positive values favour p2; negative values
favour p1.
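To make the quantity concrete, here is a minimal base-R sketch of an IMV computation from outcomes and two probability vectors. It assumes the definition in the cited paper as commonly summarized: each model's mean log-likelihood is mapped to the weight w of an equivalent weighted coin solving w log(w) + (1 - w) log(1 - w) = mean log-likelihood, and the IMV is the relative gain (w2 - w1) / w1. The helper names `coin_weight` and `imv_sketch` are hypothetical, and the package's own numerical details may differ.

```r
# Hypothetical helper: weight w in [0.5, 1) of the coin whose expected
# log-likelihood equals the mean log-likelihood `a`.
coin_weight <- function(a) {
  if (a <= -log(2)) return(0.5)          # no better than a fair coin
  f <- function(w) w * log(w) + (1 - w) * log(1 - w) - a
  upper <- 1 - 1e-12
  if (f(upper) <= 0) return(upper)       # essentially perfect prediction
  uniroot(f, lower = 0.5, upper = upper, tol = 1e-10)$root
}

# Hypothetical sketch of the IMV for outcomes y, baseline p1, enhanced p2.
imv_sketch <- function(y, p1, p2, sigma = 1e-4) {
  clip <- function(p) pmin(pmax(p, sigma), 1 - sigma)  # keep away from 0 and 1
  mean_ll <- function(p) mean(y * log(p) + (1 - y) * log(1 - p))
  w1 <- coin_weight(mean_ll(clip(p1)))
  w2 <- coin_weight(mean_ll(clip(p2)))
  (w2 - w1) / w1
}

set.seed(1)
x <- rnorm(1000)
y <- rbinom(length(x), 1, plogis(x))
p_null <- rep(mean(y), length(y))   # prevalence-only baseline
p_true <- plogis(x)                 # data-generating probabilities
imv_sketch(y, p_null, p_true)       # positive: p_true predicts better
```

Swapping the two prediction vectors flips the sign, and identical vectors give exactly zero.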

    set.seed(1)
    x <- rnorm(1000)
    y <- rbinom(length(x), 1, plogis(x))
    df <- data.frame(x = x, y = y)
    m <- glm(y ~ x, df, family = "binomial")
    pr <- predict(m, data.frame(x = x), type = "response")
    imv.binary(y, mean(y), pr)

Computes the InterModel Vigorish (IMV) for binomial mixed-effects models fit
with lme4::glmer() via k-fold cross-validation. Both models are refit
on each training fold; predictions on the held-out fold use
allow.new.levels = TRUE to handle random-effect levels not seen during
training.
Only binomial family models are supported.
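The role of allow.new.levels = TRUE can be seen in a small sketch that holds out whole subjects, so the test rows contain grouping levels the fitted model has never seen. This is an illustration only: the subject-wise split and the names `held_out`, `train`, and `test` are assumptions for the example and do not describe how the method forms its folds.

```r
if (requireNamespace("lme4", quietly = TRUE)) {
  data(sleepstudy, package = "lme4")
  sleepstudy$slow <- as.integer(sleepstudy$Reaction > 300)

  # Hold out three subjects entirely (assumed split, for illustration).
  held_out <- sleepstudy$Subject %in% levels(sleepstudy$Subject)[1:3]
  train <- sleepstudy[!held_out, ]
  test  <- sleepstudy[held_out, ]

  fit <- lme4::glmer(slow ~ Days + (1 | Subject), data = train,
                     family = binomial)

  # Without allow.new.levels = TRUE, predict() stops on unseen Subject
  # levels; with it, unseen levels get a random effect of zero.
  p <- predict(fit, newdata = test, type = "response",
               allow.new.levels = TRUE)
  range(p)
}
```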

    ## S3 method for class 'glmerMod'
    imv(m0, m1, data = NULL, nfold = 4, predict_fn = NULL, y = NULL, ...)


| Argument | Description |
|---|---|
| m0 | A |
| m1 | A |
| data | Optional data frame. If |
| nfold | Number of cross-validation folds. Default 4. |
| predict_fn | Ignored for this method. |
| y | Ignored for this method; the outcome is inferred from the model formula. |
| ... | Currently unused. Accepted for consistency with the generic. |

A named list with four elements:
Numeric vector of per-fold IMV values (length nfold).
Mean IMV across folds.
Standard deviation of per-fold IMVs.
Named numeric vector of length 2: a 95% interval computed as mean +/- 1.96 * (sd / sqrt(nfold)).

    if (requireNamespace("lme4", quietly = TRUE)) {
      data(sleepstudy, package = "lme4")
      sleepstudy$slow <- as.integer(sleepstudy$Reaction > 300)
      m0 <- lme4::glmer(slow ~ (1 | Subject), sleepstudy, family = binomial)
      m1 <- lme4::glmer(slow ~ Days + (1 | Subject), sleepstudy, family = binomial)
      imv(m0, m1)
    }

Computes the InterModel Vigorish (IMV) for item response theory models fit
with the mirt package via response-level k-fold cross-validation.
Fold splits are at the individual response level: each held-out observation is
a single person-by-item pair, and ability is estimated from the remaining
responses for that person.
When a single model is supplied (m1 = NULL), predictions from m0
are compared to item-level prevalence rates (the null model). When two models
are supplied, m0 serves as the baseline and m1 as the enhanced
model.
Only dichotomous response models are supported.
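Response-level splitting can be sketched in base R without fitting any IRT model. The example assigns each observed person-by-item cell to a fold, masks the held-out cells in the training matrix, and computes an item-prevalence baseline for those cells. The toy data, the object names, and the choice to compute prevalence from the training cells only are assumptions for illustration, not a description of the package internals.

```r
set.seed(1)
# Toy dichotomous response matrix: 20 persons by 5 items.
resp <- matrix(rbinom(20 * 5, 1, 0.6), nrow = 20,
               dimnames = list(NULL, paste0("item", 1:5)))

nfold <- 5
observed <- which(!is.na(resp))   # linear indices of observed cells
fold <- sample(rep(seq_len(nfold), length.out = length(observed)))

k <- 1                            # work through a single fold
train <- resp
train[observed[fold == k]] <- NA  # mask the held-out responses

# Each held-out observation is one (person, item) pair.
test_idx <- arrayInd(observed[fold == k], dim(resp))
colnames(test_idx) <- c("person", "item")

# Prevalence baseline: per-item proportion correct among training cells.
prev <- colMeans(train, na.rm = TRUE)
p_null <- prev[test_idx[, "item"]]
head(cbind(test_idx, y = resp[test_idx], p_null = p_null))
```

Because only single cells are masked, each person typically still has other responses in `train` from which ability can be estimated.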

    ## S3 method for class 'SingleGroupClass'
    imv(m0, m1 = NULL, data = NULL, nfold = 5, predict_fn = NULL, y = NULL,
        fscores.options = list(method = "EAP"), whole.matrix = TRUE,
        remove.nonvarying.items = TRUE, remove.allNA.rows = TRUE, ...)


| Argument | Description |
|---|---|
| m0 | A |
| m1 | An optional second |
| data | Not used for mirt models. Accepted for consistency with the generic. |
| nfold | Number of cross-validation folds (default 5). |
| predict_fn | Not used for mirt models. Accepted for consistency with the generic. |
| y | Not used for mirt models. Accepted for consistency with the generic. |
| fscores.options | Named list of additional arguments passed to |
| whole.matrix | Logical (default |
| remove.nonvarying.items | Logical (default |
| remove.allNA.rows | Logical (default |
| ... | Currently unused. |

A named list with four elements:
Numeric vector of per-fold IMV values (length nfold).
Mean IMV across folds.
Standard deviation of per-fold IMVs.
Named numeric vector of length 2: a 95% interval computed as mean +/- 1.96 * (sd / sqrt(nfold)).

    if (requireNamespace("mirt", quietly = TRUE)) {
      set.seed(1)
      resp <- mirt::expand.table(mirt::LSAT7)

      # Single model vs prevalence baseline
      mod1 <- mirt::mirt(resp, 1, "Rasch", verbose = FALSE)
      imv(mod1)

      # Two models
      mod2 <- mirt::mirt(resp, 1, verbose = FALSE)
      imv(mod1, mod2)

      # Priors specified as a variable are handled correctly
      my_prior <- list(a1 = c(0, 1, 0.25, 3))
      mod3 <- mirt::mirt(resp, 1, prior.list = my_prior, verbose = FALSE)
      imv(mod3)
    }

Legacy function. Computes the IMV for a fitted glm model against a
null model (intercept only) via k-fold cross-validation. Both the full and
null models are refit on each training fold and evaluated on the held-out
fold.
For new code, prefer imv(m0, m1) with an explicit null model as
m0.
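One way to build the explicit null model recommended above is `update()` with an intercept-only formula. This is a sketch; the call to `imv()` is guarded so the block also runs where the package is not installed.

```r
set.seed(1)
x <- rnorm(1000)
y <- rbinom(length(x), 1, plogis(x))
df <- data.frame(x = x, y = y)

m  <- glm(y ~ x, df, family = "binomial")  # full model
m0 <- update(m, . ~ 1)                     # intercept-only null model

if (requireNamespace("imv", quietly = TRUE)) {
  res <- imv::imv(m0, m)   # null model as baseline, full model as enhanced
  res$mean
}
```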

    imv0glm(m, nfold = 5)


| Argument | Description |
|---|---|
| m | A |
| nfold | Number of cross-validation folds. Default 5. |

A numeric vector of length nfold containing the per-fold IMV values.

    set.seed(1)
    x <- rnorm(1000)
    y <- rbinom(length(x), 1, plogis(x))
    df <- data.frame(x = x, y = y)
    m <- glm(y ~ x, df, family = "binomial")
    imv0glm(m)

Legacy function. Computes the IMV comparing a full glm model to a
reduced model with var.nm dropped from the formula, via k-fold
cross-validation. Both models are refit on each training fold and evaluated
on the held-out fold.
For new code, prefer constructing both models explicitly and calling
imv(m0, m1).

    imvglm.rmvar(m, nfold = 5, var.nm)


| Argument | Description |
|---|---|
| m | A |
| nfold | Number of cross-validation folds. Default 5. |
| var.nm | Character string naming the variable to remove from the formula. Must match exactly the term as it appears in the original |

A numeric vector of length nfold containing the per-fold IMV values.
The reduced model (without var.nm) serves as the baseline; the full
model serves as the enhanced model. A positive mean indicates that
var.nm improves out-of-sample prediction.
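The explicit two-model form of this comparison can be sketched with `update()`, dropping the term from the full model's formula to obtain the baseline. The name `m_reduced` is chosen for the example, and the `imv()` call is guarded so the block also runs without the package installed.

```r
set.seed(1)
x <- rnorm(1000)
z <- rnorm(1000)
y <- rbinom(length(x), 1, plogis(x))
df <- data.frame(x = x, z = z, y = y)

m         <- glm(y ~ x + z, df, family = "binomial")  # full (enhanced) model
m_reduced <- update(m, . ~ . - z)                     # baseline without z

if (requireNamespace("imv", quietly = TRUE)) {
  res <- imv::imv(m_reduced, m)
  res$mean   # positive would suggest z improves out-of-sample prediction
}
```

Here z is unrelated to y by construction, so no meaningful gain from including it should be expected.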

    set.seed(1)
    x <- rnorm(1000)
    z <- rnorm(1000)
    y <- rbinom(length(x), 1, plogis(x))
    df <- data.frame(x = x, z = z, y = y)
    m <- glm(y ~ x + z, df, family = "binomial")
    imvglm.rmvar(m, var.nm = "z")
