Specialist backends: when to move beyond the default stack

This article explains how tidyILD fits next to specialist R packages for multivariate dynamics, high-dimensional predictors, and full latent-variable / DSEM-style estimands. It is not a tutorial for those packages; it gives a contract, a routing table, and export patterns so you can preprocess in tidyILD and estimate elsewhere without re-deriving time rules or centering.

Contract: what tidyILD owns vs what it does not

tidyILD is responsible for (when you use its pipeline):

  • Encoding person and time, ordering within person, and gap metadata (ild_prepare(), ild_meta()).
  • Within-between decomposition for interpretable mixed models (ild_center(), ild_decomposition()).
  • Spacing-aware lags (ild_lag(), ild_panel_lag_prepare(), ild_check_lags(), ild_crosslag() for single-equation shortcuts).
  • Provenance on analytic objects and diagnostics for models that tidyILD fits (ild_diagnose(), guardrails).

Specialist packages are responsible for (examples below):

  • Joint multivariate time-series models, feedback systems, and time-varying parameters at the system level (e.g. dynamite, lavaan dynamic SEM, multivariate brms).
  • Penalized longitudinal models and variable selection when p >> n (e.g. PGEE and related penalized GEE / regularized approaches).
  • Measurement models and latent structural models beyond the conservative paths wrapped in tidyILD (ctsem, lavaan, blavaan, multivariate brms).

tidyILD does not aim to reimplement those estimators. The recommended pattern is: prepare → center → lag → export a plain data frame → fit with the specialist tool, recording package versions (e.g. sessionInfo() or ild_manifest() alongside your script).
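
To make the center and lag steps of that pattern concrete, here is a minimal base-R sketch of what the resulting columns represent. It is not tidyILD's implementation: the toy data, the `max_gap` threshold, and the rule "set the lag to NA when the previous observation is further away than `max_gap`" are assumptions made for illustration only.

```r
# Toy long-format panel: 2 persons, hourly observations, one missed occasion.
toy <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2),
  time = c(0, 3600, 7200, 14400, 0, 3600, 7200),  # seconds; person 1 skips 10800
  y    = c(2, 4, 6, 8, 1, 1, 4)
)
toy <- toy[order(toy$id, toy$time), ]

# Within-between decomposition: person mean (_bp) and deviation from it (_wp).
toy$y_bp <- ave(toy$y, toy$id, FUN = mean)
toy$y_wp <- toy$y - toy$y_bp

# Lag 1 within person: previous value and previous time stamp.
prev_y    <- ave(toy$y,    toy$id, FUN = function(v) c(NA, head(v, -1)))
prev_time <- ave(toy$time, toy$id, FUN = function(v) c(NA, head(v, -1)))

# Assumed gap rule: keep the lag only if the previous row is at most max_gap away.
max_gap <- 3600
toy$y_lag1 <- ifelse(toy$time - prev_time <= max_gap, prev_y, NA)

print(toy)
```

In this toy panel the fourth row of person 1 receives an NA lag because the preceding observation is two hours old. A naive row-shift would have silently paired those two rows, which is the kind of time rule you avoid re-deriving by lagging before export.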

Decision table

| Scientific situation | Often sufficient in tidyILD | Bridge (preprocess here, then …) | Primary external tools (examples) |
|---|---|---|---|
| Lag predictor → one outcome; AR1/CAR1 residuals | ild_lag, ild_lme, ild_brms, ild_tvem | | |
| Reciprocal or multivariate lags (e.g. stress ↔ mood), joint dynamics | Two separate ild_lme / ild_crosslag fits are not a joint likelihood | Same lags/centering; export data | dynamite; lavaan DSEM; multivariate brms; multivariate ctsem |
| High-dimensional time-varying predictors (p >> n), selection | lme4 / unpenalized mixed models: unstable or non-identified | Screen or penalize outside default ild_lme path | PGEE; regularized GLM/GEE; strong priors in brms |
| Latent constructs + dynamics | Observed-variable pipeline; ild_ctsem() v1 is intentionally narrow | Export; specify full SEM in target package | ctsem, lavaan / blavaan DSEM |
| Nonlinear or non-Gaussian dynamics | ild_tvem (GAM); ild_brms with appropriate families/splines | Same | dynamite; brms; state-space with custom observation model |
| Time-varying coefficients (effect of X on Y evolves) | ild_tvem; interactions with time; random slopes | Full state-space / Bayesian TVP | dynamite; brms hierarchical structure |
| Causal estimands with many time-varying confounders | MSM / IPW tools in tidyILD for supported paths | High-dimensional confounding may need dedicated methods | Doubly robust / DML literature; penalized exposure/confounder models |
| Correlated outcomes, shared residual structure | Separate outcomes ignore cross-outcome correlation | Multivariate formula or dynamic multivariate model | dynamite; mvbind in brms (see vignette("brms-dynamics-recipes", package = "tidyILD")) |

For temporal structure inside tidyILD (lags vs AR vs TVEM vs KFAS vs ctsem), start with vignette("temporal-dynamics-model-choice", package = "tidyILD").

Handoff pattern: export after prepare, center, and lag

Use ild_meta() to recover the person id column name stored in metadata (often still "id" in your data, but the prepared object also uses internal columns such as .ild_id).

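library(tidyILD)
set.seed(1)
# Simulate 8 persons with 10 observations each, then add a second variable x.
d <- ild_simulate(n_id = 8, n_obs_per = 10, seed = 1)
d$x <- rnorm(nrow(d))
# Encode person and time. From here on, `x` is the prepared ILD object
# (it shares its name with the data column x).
x <- ild_prepare(d, id = "id", time = "time")
# Within-between decomposition: adds _wp / _bp columns (e.g. y_wp, y_bp).
x <- ild_center(x, y, x)
# Spacing-aware lag 1 for both variables: adds lag columns such as x_lag1.
x <- ild_lag(x, dplyr::all_of(c("y", "x")), n = 1L, mode = "gap_aware")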

meta <- ild_meta(x)
meta$ild_id
#> [1] "id"
# Plain data frame for any external package:
dat <- as.data.frame(x)
head(dat[, c(meta$ild_id, ".ild_time_num", "y", "y_wp", "y_bp", "x_lag1")])
#>   id .ild_time_num          y        y_wp       y_bp     x_lag1
#> 1  1             0 -0.3385631 -0.22929533 -0.1092678         NA
#> 2  1          3600 -0.6791480 -0.56988020 -0.1092678  0.3700188
#> 3  1          7200  0.3294368  0.43870458 -0.1092678  0.2670988
#> 4  1         10800 -0.1315322 -0.02226439 -0.1092678 -0.5425200
#> 5  1         14400 -0.5370741 -0.42780630 -0.1092678  1.2078678
#> 6  1         18000 -1.2338038 -1.12453595 -0.1092678  1.1604026

Typical columns to retain for dynamic panel estimators:

  • Person identifier: meta$ild_id (and/or .ild_id if present).
  • Time: .ild_time_num and/or your original time column, depending on whether the target software expects continuous time, integer occasion, or clock time.
  • Outcomes and predictors, including _wp / _bp columns if you want to preserve within-between interpretation in the exported file.
  • Lag columns created by ild_lag() or ild_panel_lag_prepare().
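
The column choices listed above can be sketched in base R. The data frame below is a hand-made stand-in for the exported `dat`, so the block runs without tidyILD. The `occasion` and `hours` columns and the `scratch` column are hypothetical names, and treating `.ild_time_num` as seconds is an assumption suggested by the printed values (0, 3600, 7200), not a documented guarantee.

```r
# Stand-in for the exported `dat`; names mirror the printed output above.
dat <- data.frame(
  id            = rep(1:2, each = 3),
  .ild_time_num = c(0, 3600, 7200, 0, 3600, 10800),
  y             = c(0.2, -0.1, 0.5, 1.0, 0.7, 0.9),
  x_lag1        = c(NA, 0.4, 0.3, NA, -0.2, NA),
  scratch       = 1:6  # a column the external model does not need
)
id_col <- "id"  # in the real pipeline: ild_meta(x)$ild_id

# Order within person, then derive an integer occasion index (1, 2, 3, ...).
dat <- dat[order(dat[[id_col]], dat$.ild_time_num), ]
dat$occasion <- ave(dat$.ild_time_num, dat[[id_col]], FUN = seq_along)

# Elapsed hours since each person's first row (assumes time is in seconds).
dat$hours <- ave(dat$.ild_time_num, dat[[id_col]],
                 FUN = function(t) (t - min(t)) / 3600)

# Keep only what the target package needs.
keep    <- c(id_col, "occasion", "hours", "y", "x_lag1")
handoff <- dat[, keep]
print(handoff)
```

The integer index discards spacing: person 2's third row is occasion 3 even though it sits three hours after the first. That is one reason to carry the spacing-aware lag columns (or the numeric time) into the export rather than letting the target software re-lag by row position.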

ild_export_provenance() applies to objects that carry provenance (e.g. after modeling in tidyILD). For external fits, record inputs and script versions in your project manifest.
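
One way to record inputs for an external fit is a small hand-rolled manifest written next to the exported file. The sketch below uses only base R and the bundled tools and utils packages. The file names and manifest fields are hypothetical choices, not a tidyILD format, and the package list uses "stats" and "utils" as placeholders for the packages you actually depend on (for example tidyILD and the specialist backend).

```r
# Stand-in for the exported data frame.
dat <- data.frame(id = rep(1:2, each = 2), y = c(0.1, 0.4, -0.2, 0.3))

# Write the export to a temporary directory (use your project folder in practice).
out_dir  <- tempdir()
csv_path <- file.path(out_dir, "ild_export.csv")
write.csv(dat, csv_path, row.names = FALSE)

# Placeholder package list; replace with the packages your analysis uses.
pkgs <- c("stats", "utils")

manifest <- list(
  exported_at = format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z"),
  file        = basename(csv_path),
  md5         = unname(tools::md5sum(csv_path)),  # checksum of the exact input
  n_rows      = nrow(dat),
  columns     = names(dat),
  r_version   = R.version.string,
  packages    = vapply(pkgs,
                       function(p) as.character(utils::packageVersion(p)),
                       character(1))
)

# Store the manifest as plain R code so it can be read back with dget().
manifest_path <- file.path(out_dir, "ild_export_manifest.R")
dput(manifest, file = manifest_path)
str(dget(manifest_path))
```

Storing the checksum lets you confirm later that a specialist-package fit was run on the same exported file that the manifest describes.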

Code stubs (not evaluated)

The chunks below are illustrative. They are not run during R CMD check because dynamite and PGEE are not required dependencies of tidyILD. Install those packages following their own installation instructions, read their vignettes, and adapt column names to match dat above.

dynamite (multivariate dynamic models)

# library(dynamite)
# After building `dat` from an ILD object (see above):
# - Map id/time/outcome columns to dynamite's expected data layout.
# - Specify channels, lags, and priors per package documentation.
# Example placeholder only (not valid without a real dynamite specification):
# fit_dyn <- dynamite::dynamite(
#   dformula = <your dformula>,
#   data = dat,
#   ...
# )
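
As a slightly more concrete illustration of the placeholder above, the following sketch builds a two-channel formula with reciprocal first-order lags. It is an assumption-laden example, not a recommended specification: the data are simulated noise standing in for `dat`, the column names `time`, `y`, and `x` are hypothetical, and the sampler settings are arbitrary. The block does nothing unless dynamite is installed, and the fit itself is switched off by default because it needs a working Stan backend.

```r
# Stand-in for the exported `dat`: 20 persons x 12 integer occasions.
set.seed(1)
dat <- data.frame(
  id   = rep(1:20, each = 12),
  time = rep(1:12, times = 20),
  y    = rnorm(240),
  x    = rnorm(240)
)

run_fit <- FALSE  # set to TRUE once dynamite and a Stan backend are installed

if (requireNamespace("dynamite", quietly = TRUE)) {
  # Two Gaussian channels; each outcome depends on both first-order lags.
  f <- dynamite::obs(y ~ lag(y) + lag(x), family = "gaussian") +
    dynamite::obs(x ~ lag(x) + lag(y), family = "gaussian")

  if (run_fit) {
    fit_dyn <- dynamite::dynamite(
      dformula = f,
      data     = dat,
      time     = "time",  # name of the time index column
      group    = "id",    # name of the person identifier column
      chains   = 2,
      iter     = 1000
    )
    print(fit_dyn)
  }
}
```

Here the lags are declared inside the model formula, so the raw variables are passed rather than tidyILD's lag columns. Check the dynamite documentation for how it treats gaps in the time index before assuming its lags match the spacing-aware lags you built upstream.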

PGEE (penalized GEE / high-dimensional longitudinal)

# library(PGEE)
# Penalized GEE expects a long data frame with id, repeated outcome, and a matrix
# or formula interface for high-dimensional covariates — see ?PGEE::PGEE.
# Use `dat` from tidyILD after centering/lagging so covariates align with your estimand.
# fit_pgee <- PGEE::PGEE(<formula or design>, data = dat, ...)

lavaan / blavaan (DSEM)

# library(lavaan)
# Dynamic SEM is model-syntax-specific; export `dat` and define your model
# in lavaan's longitudinal / DSEM extensions. tidyILD does not generate lavaan syntax.
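
For orientation only, here is a sketch of one common handoff shape for lavaan: reshaping the long export to wide format and fitting a three-occasion cross-lagged panel model. This is a much simpler model than DSEM and is named here plainly as a stand-in. The simulated data, the `occasion` column, and the model syntax are assumptions for illustration, and the fit runs only if lavaan is installed.

```r
# Stand-in for a long export: 200 persons x 3 occasions of pure noise.
set.seed(1)
n_id <- 200
long <- data.frame(
  id       = rep(seq_len(n_id), each = 3),
  occasion = rep(1:3, times = n_id),
  y        = rnorm(3 * n_id),
  x        = rnorm(3 * n_id)
)

# Long -> wide: one row per person, columns y_1, x_1, y_2, x_2, y_3, x_3.
wide <- reshape(long, idvar = "id", timevar = "occasion",
                v.names = c("y", "x"), direction = "wide", sep = "_")

# Cross-lagged panel model: autoregressive and cross-lagged paths,
# plus residual covariances within occasions 2 and 3.
clpm <- '
  y_2 ~ y_1 + x_1
  x_2 ~ x_1 + y_1
  y_3 ~ y_2 + x_2
  x_3 ~ x_2 + y_2
  y_2 ~~ x_2
  y_3 ~~ x_3
'

if (requireNamespace("lavaan", quietly = TRUE)) {
  fit_clpm <- lavaan::sem(clpm, data = wide)
  est <- lavaan::parameterEstimates(fit_clpm)
  print(est[est$op == "~", c("lhs", "op", "rhs", "est", "se")])
}
```

Wide format indexes occasions by position, so unequal spacing between observations is lost in the reshape. If spacing matters for your estimand, a continuous-time tool such as ctsem may be the better target for the export.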

See also

  • vignette("temporal-dynamics-model-choice", package = "tidyILD")
  • vignette("brms-dynamics-recipes", package = "tidyILD") (including multivariate mvbind sketch)
  • vignette("ctsem-continuous-time-dynamics", package = "tidyILD")
  • vignette("msm-identification-and-recovery", package = "tidyILD") (causal MSM path in tidyILD)

sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ggplot2_4.0.3  dplyr_1.2.1    tidyILD_0.4.1  rmarkdown_2.31
#> 
#> loaded via a namespace (and not attached):
#>  [1] sandwich_3.1-1     utf8_1.2.6         sass_0.4.10        generics_0.1.4    
#>  [5] anytime_0.3.13     lattice_0.22-9     lme4_2.0-1         digest_0.6.39     
#>  [9] magrittr_2.0.5     timechange_0.4.0   evaluate_1.0.5     grid_4.6.0        
#> [13] RColorBrewer_1.1-3 fastmap_1.2.0      jsonlite_2.0.0     Matrix_1.7-5      
#> [17] mgcv_1.9-4         scales_1.4.0       jquerylib_0.1.4    reformulas_0.4.4  
#> [21] Rdpack_2.6.6       cli_3.6.6          rlang_1.2.0        rbibutils_2.4.1   
#> [25] splines_4.6.0      withr_3.0.2        cachem_1.1.0       yaml_2.3.12       
#> [29] otel_0.2.0         tools_4.6.0        nloptr_2.2.1       minqa_1.2.8       
#> [33] tsibble_1.2.0      boot_1.3-32        clubSandwich_0.7.0 buildtools_1.0.0  
#> [37] vctrs_0.7.3        R6_2.6.1           zoo_1.8-15         lubridate_1.9.5   
#> [41] lifecycle_1.0.5    MASS_7.3-65        pkgconfig_2.0.3    pillar_1.11.1     
#> [45] bslib_0.11.0       gtable_0.3.6       glue_1.8.1         Rcpp_1.1.1-1.1    
#> [49] xfun_0.58          tibble_3.3.1       tidyselect_1.2.1   sys_3.4.3         
#> [53] knitr_1.51         farver_2.1.2       htmltools_0.5.9    nlme_3.1-169      
#> [57] labeling_0.4.3     maketools_1.3.2    compiler_4.6.0     S7_0.2.2