| Title: | Bootstrap Resampling for Multilevel Models |
|---|---|
| Description: | Functions for bootstrapping with multilevel data and models (and mixed-effect models). It implements multiple bootstrap methods under the parametric, residual, and case bootstrap categories, as discussed in Van der Leeden, Meijer, and Busing (2008) <doi:10.1007/978-0-387-73186-5_11> and Carpenter, Goldstein, and Rasbash (2003) <doi:10.1111/1467-9876.00415>. Currently it supports fitted objects from the 'lme4' package. |
| Authors: | Mark Lai [aut, cre, cph] |
| Maintainer: | Mark Lai <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-05-13 14:41:32 UTC |
| Source: | https://github.com/cran/bootmlm |
Run multilevel parametric, residual, and case bootstrap with different options
bootstrap_mer( x, FUN, nsim = 1, type = c("parametric", "residual", "residual_cgr", "residual_trans", "reb", "case"), corrected_trans = FALSE, lv1_resample = FALSE, reb_scale = FALSE, .progress = FALSE, verbose = FALSE, ... )bootstrap_mer( x, FUN, nsim = 1, type = c("parametric", "residual", "residual_cgr", "residual_trans", "reb", "case"), corrected_trans = FALSE, lv1_resample = FALSE, reb_scale = FALSE, .progress = FALSE, verbose = FALSE, ... )
x |
A fitted |
FUN |
A function taking a fitted |
nsim |
A positive integer telling the number of simulations, positive
integer; the bootstrap |
type |
A character string indicating the type of multilevel bootstrap.
Currently, possible values are |
corrected_trans |
Logical indicating whether to use the correct
variance-covariance matrix of the residuals. If |
lv1_resample |
Logical indicating whether to sample with replacement
the level-1 units for each level-2 cluster. Only used for
|
reb_scale |
Logical indicating whether to scale the residuals for the random effect block bootstrap |
.progress |
Logical indicating whether to display progress bar (using
|
verbose |
Logical indicating if progress should print output. |
... |
argument passed to .resid_resample. |
bootstrap_mer performs different bootstrapping methods to fitted
model objects using the lme4 package. Currently, only models fitted
using lmer is supported.
An object of S3 class "boot", compatible with boot
package's boot(). It contains the following components:
t0 |
The original statistic from |
t |
A matrix with |
R |
The value of |
data |
The data used in the original analysis. |
statistic |
The function |
See the documentation in for link[boot]{boot}() for the other
components.
Carpenter, J. R., Goldstein, H., & Rasbash, J. (2003). A novel bootstrap procedure for assessing the relationship between class size and achievement. Journal of the Royal Statistical Society. Series C (Applied Statistics), 52, 431–443. https://doi.org/10.1111/1467-9876.00415
Chambers, R., & Chandra, H. (2013). A random effect block bootstrap for clustered data. Journal of Computational and Graphical Statistics, 22(2), 452–470. https://doi.org/10.1080/10618600.2012.681216
Davison, A. C. and Hinkley, D. V. (1997). Bootstrap methods and their application. Cambridge, UK: Cambridge University Press.
Morris, J. S. (2002). The BLUPs are not "best" when it comes to bootstrapping. Statistics & Probability Letters, 56(4), 425–430. https://doi.org/10.1016/S0167-7152(02)00041-X
Van der Leeden, R., Meijer, E., & Busing, F. M. T. A. (2008). Resampling multilevel models. In J. de Leeuw & E. Meijer (Eds.), Handbook of multilevel Analysis (pp. 401–433). New York, NY: Springer.
boot for single-level bootstrapping,
bootMer for parametric and semi-parametric
bootstrap implemented in lme4, and
boot.ci for getting bootstrap confidence
intervals and plot.boot for plotting the bootstrap
distribution.
library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) mySumm <- function(x) { c(getME(x, "beta"), sigma(x)) } # Covariance preserving residual bootstrap boo01 <- bootstrap_mer(fm01ML, mySumm, type = "residual", nsim = 100) # Plot bootstrap distribution of fixed effect library(boot) plot(boo01, index = 1) # Get confidence interval boot.ci(boo01, index = 2, type = c("norm", "basic", "perc")) # BCa using influence values computed from `empinf_mer` boot.ci(boo01, index = 2, type = "bca", L = empinf_mer(fm01ML, mySumm, 2))library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) mySumm <- function(x) { c(getME(x, "beta"), sigma(x)) } # Covariance preserving residual bootstrap boo01 <- bootstrap_mer(fm01ML, mySumm, type = "residual", nsim = 100) # Plot bootstrap distribution of fixed effect library(boot) plot(boo01, index = 1) # Get confidence interval boot.ci(boo01, index = 2, type = c("norm", "basic", "perc")) # BCa using influence values computed from `empinf_mer` boot.ci(boo01, index = 2, type = "bca", L = empinf_mer(fm01ML, mySumm, 2))
This is a wrapper for getting CIs for multiple parameters after running
bootstrap_mer, to be consistent with a similar method for the
bootMer class.
## S3 method for class 'boot' confint( object, parm, level = 0.95, type = c("norm", "basic", "perc", "bca"), L = NULL, ... )## S3 method for class 'boot' confint( object, parm, level = 0.95, type = c("norm", "basic", "perc", "bca"), L = NULL, ... )
object |
an object returned by |
parm |
a specification of which parameters are to be given confidence intervals as a vector of numbers. If missing, all parameters are considered. |
level |
the confidence level required. |
type |
character indicating the type of intervals required, as described
in |
L |
empirical influence values required for |
... |
additional argument(s) passed to |
A matrix (or vector) with columns giving lower and upper confidence limits for each parameter.
library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) mySumm <- function(x) { c(getME(x, "beta"), sigma(x)) } # residual bootstrap boo_resid <- bootstrap_mer(fm01ML, mySumm, type = "residual", nsim = 100) confint(boo_resid, type = "bca", L = empinf_merm(fm01ML, mySumm))library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) mySumm <- function(x) { c(getME(x, "beta"), sigma(x)) } # residual bootstrap boo_resid <- bootstrap_mer(fm01ML, mySumm, type = "residual", nsim = 100) confint(boo_resid, type = "bca", L = empinf_merm(fm01ML, mySumm))
Creates a deviance function for a fitted model object, using
and as the parameters.
devfun_mer(x) devfun_mer2(x)devfun_mer(x) devfun_mer2(x)
x |
A fitted merMod object from |
The built-in function(s) in lme4 for generating deviance function are not exported and rely on C++ code. This function is mainly used to obtain Hessian and asymptotic covariance matrix of the random effects.
A deviance function with one argument th_sig that takes input
of a numeric vector corresponding to the some estimated values of
and . For devfun_mer2, a function with
one argument theta that profiles out and only takes
input of elements for .
Bates, D., Maechler, M., Bolker, B. M., & Walker, S. C. Fitting linear mixed-effects models using lme4. Retrieved from https://www.jstatsoft.org/article/view/v067i01
Implementation in lme4pureR: https://github.com/lme4/lme4pureR/blob/master/R/pls.R
library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) dd <- devfun_mer(fm01ML) # Asymptotic variance-covariance matrix of (theta, sigma): 2 * solve(numDeriv::hessian(dd, c(fm01ML@theta, sigma(fm01ML))))library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) dd <- devfun_mer(fm01ML) # Asymptotic variance-covariance matrix of (theta, sigma): 2 * solve(numDeriv::hessian(dd, c(fm01ML@theta, sigma(fm01ML))))
This function calculates the empirical influence values for a statistic in a
given fitted model object using the delete- jackknife.
empinf_mer(x, FUN, index = 1) empinf_merm(x, FUN)empinf_mer(x, FUN, index = 1) empinf_merm(x, FUN)
x |
A fitted merMod object from |
FUN |
A function taking a fitted merMod object as input and returning the statistic of interest. |
index |
An integer stating the position of the statistic in the output of
|
empinf_mer computes non-parametric influence function of models
fitted using lmer by deleting one cluster at a time. See
van der Leeden, Meijer, and Busing (2008, pp. 420–422) for more information.
Whereas empinf_mer computes influence values for a specified position
(as specified with the index argument) of the output of FUN,
empinf_merm computes influence values for every element in
FUN(x).
A numeric vector with length equals to number of clusters of
x containing the weighted influence value of each cluster.
Van der Leeden, R., Meijer, E., & Busing, F. M. T. A. (2008). Resampling multilevel models. In J. de Leeuw & E. Meijer (Eds.), Handbook of multilevel Analysis (pp. 401–433). New York, NY: Springer.
library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Define function for intraclass correlation icc <- function(x) 1 / (1 + 1 / getME(x, "theta")^2) empinf_mer(fm01ML, icc) empinf_mer(fm01ML, fixef)library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Define function for intraclass correlation icc <- function(x) 1 / (1 + 1 / getME(x, "theta")^2) empinf_mer(fm01ML, icc) empinf_mer(fm01ML, fixef)
A synthetic two-level dataset with pupils nested within schools, generated to mimic the structure and parameters of the popular dataset from Hox (2010). It can be used to demonstrate multilevel bootstrap methods without depending on external data sources.
pop_synpop_syn
A data frame with 2000 rows and 5 variables:
Pupil identification number within school (integer).
School identification number, 1–100 (integer).
Pupil popularity score on a 0–10 scale (numeric).
Pupil sex: 0 = boy, 1 = girl (integer).
Teacher experience in years (numeric).
The dataset was generated by fitting a two-level random intercept model to the original popular data and simulating new data from the estimated parameters:
Fixed effects: intercept = 3.56, sex = 0.84, texp = 0.093
School-level random intercept SD: 0.69
Residual SD: 0.68
Continuous predictions were rounded to the nearest integer and clamped to the [0, 10] range to match the original Likert-type scale.
Simulated from parameters estimated from the popular dataset in: Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). Routledge. Data available at https://stats.oarc.ucla.edu/stat/stata/examples/mlm_ma_hox/popular.dta.
Compute confidence intervals for the intraclass correlation of a model
fit of class merMod-class.
prof_ci_icc(x, level = 0.95, eps_max = 1e+06)prof_ci_icc(x, level = 0.95, eps_max = 1e+06)
x |
A fitted merMod object from |
level |
Confidence level between 0 and 1. Default is .95. |
eps_max |
The maximum value that the upper limit can be. Default is 1e6. |
This function uses the uniroot function and determine
the lower and upper limit by evaluating the profile deviance (for ML) or the
-2 profile REML criterion. It works by obtaining the interval for
and transforming the two limits.
The resulting interval is bounded by zero for its lower limit.
A named numeric vector with two values showing the lower and upper limit of the intraclass correlation.
library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Profile likelihood 95% CI for the ICC prof_ci_icc(fm01ML) # 90% CI prof_ci_icc(fm01ML, level = 0.90)library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Profile likelihood 95% CI for the ICC prof_ci_icc(fm01ML) # 90% CI prof_ci_icc(fm01ML, level = 0.90)
Score Functions and Case-wise Derivatives
scores_mer(x, level = 2)scores_mer(x, level = 2)
x |
A fitted |
level |
If |
A numeric matrix of score contributions. If level = 2,
rows correspond to level-2 clusters (aggregated); if level = 1,
rows correspond to level-1 observations. Columns correspond to model
parameters in order: fixed effects, variance components, residual variance.
library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Cluster-level (level-2) scores scores_mer(fm01ML, level = 2) # Observation-level (level-1) scores scores_mer(fm01ML, level = 1)library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Cluster-level (level-2) scores scores_mer(fm01ML, level = 2) # Observation-level (level-1) scores scores_mer(fm01ML, level = 1)
Solve for in the linear system
, where is a
symmetric matrix square root of with eigenvalue
decomposition such that .
solve_eigen_sqrt(M, b)solve_eigen_sqrt(M, b)
M |
a symmetric positive definite/semi-definite matrix |
b |
a numeric vector or matrix |
A numeric vector or matrix x satisfying ,
where is the symmetric square root of M.
Asymptotic Covariance Matrix for Cholesky Factor of Random Effects
vcov_theta(x)vcov_theta(x)
x |
A fitted merMod object from |
A symmetric matrix giving the asymptotic covariance matrix of
the Cholesky factor of the random-effects covariance.
library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Asymptotic covariance of the Cholesky factor theta vcov_theta(fm01ML)library(lme4) fm01ML <- lmer(Yield ~ (1 | Batch), Dyestuff, REML = FALSE) # Asymptotic covariance of the Cholesky factor theta vcov_theta(fm01ML)
Return the asymptotic covariance matrix of random effect standard deviations (or variances) for a fitted model object, using the Hessian evaluated at the (restricted) maximum likelihood estimates.
vcov_vc(x, sd_cor = TRUE, print_names = TRUE)vcov_vc(x, sd_cor = TRUE, print_names = TRUE)
x |
A fitted merMod object from |
sd_cor |
Logical indicating whether to return asymptotic covariance
matrix on SD scale (if |
print_names |
Logical, whether to print the names for the covariance matrix. |
Although it's easy to obtain the Hessian for , the relative
Cholesky factor, in lme4, there is no easy way to obtain the Hessian
for the variance components. This function uses devfun_mer() to
obtain the Hessian () of variance components (or standard deviations,
SD), and then obtain the asymptotic covariance matrix as .
A (q + 1) * (q + 1) symmetric matrix of the covariance
matrix of () (if sd_cor = TRUE) or
() (if sd_cor = FALSE), where q is the
the number of estimated random-effects components (excluding ).
For example, for a model with random slope, =
(intercept SD, intercept-slope correlation, slope SD).
vcov.merMod for covariance matrix of fixed
effects, confint.merMod for confidence intervals of all
parameter estimates, and devfun_mer for the underlying
function to produce the deviance function.
library(lme4) data(Orthodont, package = "nlme") fm1 <- lmer(distance ~ age + (age | Subject), data = Orthodont) vc <- VarCorr(fm1) # Standard deviation only print(vc, comp = c("Std.Dev")) # Asymptotic variance-covariance matrix of (tau, sigma): vcov_vc(fm1, sd_cor = TRUE) # Compare with (parametric) bootstrap results : get_sdcor <- function(x) { as.data.frame(lme4::VarCorr(x), order = "lower.tri")[ , "sdcor"] } boo <- bootstrap_mer(fm1, get_sdcor, type = "parametric", nsim = 200L) # There might be failures in some resamples cov(boo$t, use = "complete.obs")library(lme4) data(Orthodont, package = "nlme") fm1 <- lmer(distance ~ age + (age | Subject), data = Orthodont) vc <- VarCorr(fm1) # Standard deviation only print(vc, comp = c("Std.Dev")) # Asymptotic variance-covariance matrix of (tau, sigma): vcov_vc(fm1, sd_cor = TRUE) # Compare with (parametric) bootstrap results : get_sdcor <- function(x) { as.data.frame(lme4::VarCorr(x), order = "lower.tri")[ , "sdcor"] } boo <- bootstrap_mer(fm1, get_sdcor, type = "parametric", nsim = 200L) # There might be failures in some resamples cov(boo$t, use = "complete.obs")