The Longitudinal GRMTree (longitudinal_grmtree) extends
the cross-sectional tree-based graded response model to the longitudinal
setting, providing a data-driven framework for detecting
response shift (RS) in patient-reported outcome
measures (PROMs) measured at two time points. Response shift refers to a
change in the meaning of a respondent’s self-evaluation of a construct
over time, arising from a change in internal standards (recalibration),
values (reprioritization), or the definition of the construct
(reconceptualization).
The method operates in two phases:
longitudinal_grmtree): A
constrained two-factor longitudinal GRM (item parameters held equal
across time) is embedded within model-based recursive partitioning. The
tree identifies patient subgroups whose longitudinal measurement model
differs.rs_characterize): Within each
terminal node, the equality constraints are relaxed and a likelihood
ratio test compares the constrained (no-RS) model to an unconstrained
(RS-allowed) model. Items exhibiting RS are identified and classified as
recalibration, reprioritization, or both.This vignette demonstrates:
The package ships with a synthetic longitudinal dataset,
grmtree_long_data, containing responses to the eight MOS-SS
emotional domain items at baseline and one-year follow-up, together with
baseline covariates. The data were simulated from a two-factor
longitudinal GRM in which older patients (age > 61) experience
response shift on two items, so the example produces a real,
interpretable result.
## Load the data
data("grmtree_long_data", package = "grmtree")
## Take a glimpse at the data
glimpse(grmtree_long_data)
#> Rows: 1,500
#> Columns: 24
#> $ MOS_Listen <dbl> 3, 5, 4, 5, 5, 5, 5, 4, 5, 4, 3, 5, 5, 4, 4, 5…
#> $ MOS_Info <dbl> 2, 5, 4, 3, 5, 5, 4, 1, 5, 4, 4, 4, 5, 3, 5, 5…
#> $ MOS_Advice_Crisis <dbl> 4, 5, 2, 3, 5, 5, 4, 3, 5, 4, 3, 4, 5, 4, 4, 5…
#> $ MOS_Confide <dbl> 4, 5, 4, 5, 5, 5, 5, 2, 5, 3, 4, 4, 5, 2, 4, 5…
#> $ MOS_Advice_Want <dbl> 4, 5, 4, 5, 4, 4, 4, 2, 5, 4, 2, 3, 5, 5, 4, 5…
#> $ MOS_Fears <dbl> 5, 5, 5, 5, 5, 5, 4, 3, 5, 3, 2, 4, 5, 3, 4, 5…
#> $ MOS_Personal <dbl> 5, 5, 4, 3, 5, 5, 3, 2, 5, 3, 3, 4, 5, 4, 4, 5…
#> $ MOS_Understand <dbl> 3, 5, 3, 5, 5, 5, 4, 2, 5, 3, 4, 5, 5, 5, 5, 5…
#> $ MOS_Listen_year1 <dbl> 5, 3, 5, 4, 5, 5, 4, 2, 5, 3, 5, 4, 5, 3, 4, 5…
#> $ MOS_Info_year1 <dbl> 4, 3, 5, 4, 5, 4, 5, 5, 5, 3, 5, 4, 4, 3, 5, 5…
#> $ MOS_Advice_Crisis_year1 <dbl> 5, 2, 5, 5, 5, 5, 5, 3, 5, 3, 4, 4, 4, 4, 3, 5…
#> $ MOS_Confide_year1 <dbl> 5, 2, 4, 4, 5, 5, 4, 4, 5, 4, 4, 4, 5, 4, 5, 5…
#> $ MOS_Advice_Want_year1 <dbl> 5, 3, 4, 4, 5, 5, 4, 2, 5, 1, 4, 3, 2, 4, 4, 5…
#> $ MOS_Fears_year1 <dbl> 1, 1, 3, 5, 5, 5, 1, 5, 5, 5, 4, 4, 4, 5, 2, 5…
#> $ MOS_Personal_year1 <dbl> 5, 2, 2, 3, 5, 4, 4, 3, 5, 4, 4, 4, 4, 3, 5, 5…
#> $ MOS_Understand_year1 <dbl> 5, 1, 3, 3, 5, 5, 5, 2, 5, 3, 5, 4, 4, 3, 5, 5…
#> $ sex <fct> Male, Male, Male, Male, Male, Female, Male, Ma…
#> $ age <dbl> 73, 59, 58, 61, 59, 54, 74, 58, 57, 60, 57, 67…
#> $ residency <fct> Urban, Rural, Urban, Urban, Urban, Urban, Urba…
#> $ job <fct> Unemployed, Employed, Employed, Unemployed, Un…
#> $ education <fct> Primary/High school, College/University, Colle…
#> $ ever_smoker <fct> Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, No, Yes…
#> $ comorbidity_count <int> 1, 1, 3, 3, 2, 2, 5, 0, 1, 1, 1, 0, 1, 0, 1, 3…
#> $ bmi <dbl> 29.9, 27.1, 33.3, 34.3, 34.2, 23.6, 14.6, 19.4…The first eight columns are the baseline items, the next eight are
the one-year follow-up items (suffixed _year1), followed by
the covariates.
The prepare_longitudinal_data() helper builds the
wide-format response matrix required by
longitudinal_grmtree(). You supply the baseline item names,
the follow-up item names (in the same order), and any covariates to
carry along.
## Baseline item names
items_t1 <- c("MOS_Listen", "MOS_Info", "MOS_Advice_Crisis", "MOS_Confide",
"MOS_Advice_Want", "MOS_Fears", "MOS_Personal", "MOS_Understand")
## Follow-up item names (same order, with the _year1 suffix)
items_t2 <- paste0(items_t1, "_year1")
## Build the wide-format data
ld <- prepare_longitudinal_data(
data = grmtree_long_data,
items_t1 = items_t1,
items_t2 = items_t2,
covariates = c("sex", "age", "residency", "job",
"education", "comorbidity_count", "ever_smoker")
)
## The response matrix lives in the resp_wide column (16 = 2 x 8 columns)
dim(ld$resp_wide)
#> [1] 1500 16We fit the tree with the wide-format response matrix on the left-hand
side and the candidate partitioning variables on the right. The
minbucket control governs the minimum terminal node size;
it should be large enough for stable estimation of the node-specific
model (roughly 10–25 times the number of free parameters per node).
## Fit the Longitudinal GRMTree
tree <- longitudinal_grmtree(
resp_wide ~ sex + age + residency + job +
education + comorbidity_count + ever_smoker,
data = ld,
n_items = 8,
control = grmtree.control(minbucket = 200, p_adjust = "bonferroni", alpha = 0.05)
)
## Print the tree structure
print(tree)Fitting the full model is computationally intensive, so the chunk above is not evaluated when the vignette is built. The tree splits the sample on age at 61, producing two subgroups. Example output:
Graded Response Model Tree
Model formula:
resp_wide ~ sex + age + residency + job + education + comorbidity_count +
ever_smoker
Fitted party:
[1] root
| [2] age <= 61: n = 597
| ... (constrained item parameters, means, covariance) ...
| [3] age > 61: n = 903
| ... (constrained item parameters, means, covariance) ...
Number of inner nodes: 1
Number of terminal nodes: 2
Number of parameters per node: 3
Objective function (negative log-likelihood): 26051.77
The tree identifies a younger subgroup (age <= 61, n = 597) and an older subgroup (age > 61, n = 903) whose constrained longitudinal measurement models differ. Note that the split itself does not establish response shift; it identifies where the measurement model is heterogeneous. Response shift is tested in Phase 2.
rs_characterize() performs the Phase 2 test within each
terminal node. It reports an omnibus RS test (constrained vs
unconstrained) per node, and, where the omnibus test is significant, an
item-level test that classifies each flagged item. Two levels of
multiple-comparison control are available: global_p_adjust
across nodes (the “in which subgroups?” question) and
p_adjust within nodes (the “which items?” question).
## Phase 2: characterize response shift
rs <- rs_characterize(
tree,
p_adjust = "fdr", # item-level correction within nodes
global_p_adjust = "bonferroni", # omnibus correction across nodes
rs_threshold = 0.3 # RS-type classification threshold
)
## View the results
print(rs)Example output:
=== Response Shift Characterization ===
--- Omnibus RS Test (Constrained vs Unconstrained) ---
Node 2 (n = 597):
LRT: chi2 = 36.791, df = 40, p = 6.155e-01, p_adj = 1.000e+00
RS detected: FALSE
Node 3 (n = 903):
LRT: chi2 = 449.498, df = 40, p = 1.067e-70, p_adj = 2.133e-70
RS detected: TRUE
--- Item-Level RS Testing ---
Node 3:
Item 2: chi2 = 50.670, df = 5, p_adj = 0.000000 Reprioritization ***
Item 5: chi2 = 19.626, df = 5, p_adj = 0.003915 Significant (small effect) ***
Item 6: chi2 = 383.014, df = 5, p_adj = 0.000000 Recalibration ***
The younger subgroup (Node 2) shows no response shift. The older subgroup (Node 3) shows significant response shift, localized to the Info item (reprioritization; a discrimination change) and the Fears item (recalibration; a threshold change). This is exactly the pattern built into the synthetic data, and it illustrates how the Longitudinal GRMTree pinpoints which subgroup and which items exhibit response shift without pre-specifying the subgroups.
The returned object stores the omnibus and item-level results as data frames, and the raw constrained/unconstrained parameter matrices for each node.
Two visualization helpers summarize the Phase 2 results.
plot_rs_tree() annotates each terminal node of the tree
with its omnibus result and an item-level RS colour bar;
plot_rs_heatmap() gives a standalone item-by-node
heatmap.
## Short item labels for display
mos_labels <- c("Listen", "Info", "Crisis", "Confide",
"Advice", "Fears", "Personal", "Understand")
## Annotated tree with RS results in the terminal nodes
plot_rs_tree(tree, rs, item_labels = mos_labels)
## Standalone item-level RS heatmap
plot_rs_heatmap(rs, item_labels = mos_labels)
## Heatmap with chi-squared values shown in the cells
plot_rs_heatmap(rs, item_labels = mos_labels, show_chi2 = TRUE)The threshold region plot of the tree itself is also available. Because the constrained model enforces equal parameters across time, the plot shows only the eight unique items rather than all sixteen columns.
A family of extraction functions returns node-specific quantities.
Each returns n_items rows per node (the unique T1 items),
not 2 * n_items.
## Threshold parameters (8 items per node)
threshpar_longitudinal_grmtree(tree)
## Discrimination parameters (from the a1 column)
discrpar_longitudinal_grmtree(tree)
## Combined item parameters (discrimination + thresholds)
itempar_longitudinal_grmtree(tree)
## Latent trait summary per node (mu_T2, variance, T1-T2 correlation)
latentpar_longitudinal_grmtree(tree)Factor scores are returned per node, with two columns
(Theta_T1 and Theta_T2) because the model
estimates a latent trait at each time point.
## Factor scores per terminal node
scores <- fscores_longitudinal_grmtree(tree)
head(scores[["3"]]) # Theta_T1 and Theta_T2 for the older subgroupFinally, generate_node_scores_dataset() merges node
membership and factor scores back onto the original data frame, which is
convenient for downstream analyses (for example, relating latent change
to clinical outcomes).
If longitudinal_grmtree() returns a single terminal node
(the root), no covariate moderates the measurement model. This does
not imply the absence of response shift:
rs_characterize() still runs on the single root node and
can detect uniform response shift affecting the whole sample. In other
words, the tree localizes heterogeneity in RS, while Phase 2
tests for RS itself.
This vignette demonstrated the two-phase Longitudinal GRMTree
workflow: preparing wide-format data, fitting the tree to identify
subgroups (longitudinal_grmtree), characterizing response
shift within those subgroups (rs_characterize), and
visualizing and extracting the results. The method identifies both
which subgroups and which items exhibit response shift
in a single data-driven analysis, without requiring the subgroups to be
specified in advance.
For the cross-sectional tree-based graded response model, see the “Getting Started with the grmtree Package” vignette.