---
title: "Introduction to GMLTM"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to GMLTM}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Overview

The **GMLTM** package implements three Bayesian Item Response Theory models for cognitive diagnosis:

- **LLTM**: Linear Logistic Test Model (Fischer, 1973)
- **MLTM-D**: Multicomponent Latent Trait Model for Diagnosis (Embretson & Yang, 2013)
- **GMLTM-D**: Generalized Multicomponent Latent Trait Model for Diagnosis (Ramírez et al., 2024)

All models decompose item difficulty into cognitive operations (rules), making them suitable for studying the cognitive components underlying test performance. Estimation uses Hamiltonian Monte Carlo via `rstan`.

## Installation

```{r, eval=FALSE}
install.packages("GMLTM")
```

## The Q matrix and components

All models require a **Q matrix** (items × rules) indicating which cognitive rules each item involves. The MLTM-D and GMLTM-D additionally require a `components` list grouping rules into higher-level dimensions.

```{r, eval=FALSE}
# Example Q matrix: 5 items, 3 rules
Q <- matrix(c(1,0,1,
              0,1,1,
              1,1,0,
              1,0,0,
              0,1,0), nrow = 5, byrow = TRUE)

# Group rules into 2 components
components <- list(comp1 = c(1, 2), comp2 = 3)
```

## Basic usage: LLTM

```{r, eval=FALSE}
library(GMLTM)
data(analogy)
Q <- matrix(...)
# define your Q matrix (items x rules)
fit <- LLTM(analogy, Q, iters = 2000, iter_warmup = 500, chains = 2)

fit$EAP$eta       # rule difficulty estimates
fit$EAP$beta      # item difficulty estimates
reliability(fit)  # marginal reliability
ppchecks(fit)     # posterior predictive check plot
```

## Basic usage: GMLTM-D

```{r, eval=FALSE}
components <- list(global = c(1, 2, 3), local = c(4, 5))

fit <- GMLTM(analogy, Q, components, iters = 2000, iter_warmup = 500, chains = 2)

fit$EAP$eta       # rule difficulties per component
fit$EAP$alpha     # item discriminations per component
fit$EAP$guessing  # item guessing parameters
reliability(fit)  # marginal reliability per component
```

## Customising prior distributions

All model functions accept a `priors` argument. Only the elements to change need to be specified; unspecified elements retain their defaults.

```{r, eval=FALSE}
# More diffuse prior on rule difficulties
fit_diffuse <- LLTM(analogy, Q, priors = list(eta = list(sigma = 3)))

# Moderately informative prior for guessing in GMLTM
fit_gmltm <- GMLTM(analogy, Q, components,
                   priors = list(c = list(shape1 = 2, shape2 = 10)))

# Uniform prior for guessing (least informative)
fit_uniform_c <- GMLTM(analogy, Q, components,
                       priors = list(c = list(shape1 = 1, shape2 = 1)))
```

### Available priors per model

| Parameter | Distribution | Models |
|-----------|-------------|--------|
| `theta` (ability) | Normal(mu, sigma) | All |
| `eta` (rule difficulty) | Normal(mu, sigma) | All |
| `alpha` (discrimination) | half-Normal(sigma) | MLTM, GMLTM |
| `c` (guessing) | Beta(shape1, shape2) | GMLTM |

## Model comparison with LOO-CV

```{r, eval=FALSE}
# Fit two models with different priors
fit1 <- GMLTM(analogy, Q, components, iters = 2000, iter_warmup = 500)
fit2 <- GMLTM(analogy, Q, components,
              priors = list(c = list(shape1 = 1, shape2 = 1)),
              iters = 2000, iter_warmup = 500)

# Compare with LOO
result <- compute_model_validation(list(fit1, fit2))
print(result$Summary)
```

## Customizing prior distributions

All models in GMLTM support user-defined prior distributions via the `priors` argument. This is useful for prior sensitivity analysis, that is, refitting models with different priors to assess how much the posteriors change.

### How priors work in GMLTM

Priors are passed as a **named list** where each element corresponds to a model parameter. You only need to specify the parameters you want to change; unspecified ones keep their defaults.

The general structure is:

```{r, eval=FALSE}
priors = list(
  parameter_name = list(hyperparameter1 = value, hyperparameter2 = value)
)
```

### LLTM priors

The LLTM has two parameters with customizable priors:

| Parameter | Distribution | Hyperparameters | Default |
|-----------|-------------|-----------------|---------|
| `theta` (ability) | Normal | `mu`, `sigma` | `mu=0, sigma=1` |
| `eta` (rule difficulty) | Normal | `mu`, `sigma` | `mu=0, sigma=1` |

```{r, eval=FALSE}
library(GMLTM)
data(analogy)

# Default priors (weakly informative)
fit_default <- LLTM(analogy, Q, iters = 2000, iter_warmup = 500, chains = 2)

# More diffuse prior on rule difficulties (eta)
fit_diffuse <- LLTM(analogy, Q, iters = 2000, iter_warmup = 500, chains = 2,
                    priors = list(eta = list(mu = 0, sigma = 3)))

# Informative prior centered on positive difficulty
fit_informed <- LLTM(analogy, Q, iters = 2000, iter_warmup = 500, chains = 2,
                     priors = list(eta = list(mu = 1, sigma = 0.5)))
```

### MLTM priors

The MLTM adds discrimination parameters (`alpha`):

| Parameter | Distribution | Hyperparameters | Default |
|-----------|-------------|-----------------|---------|
| `theta` (ability) | Normal | `mu`, `sigma` | `mu=0, sigma=1` |
| `eta` (rule difficulty) | Normal | `mu`, `sigma` | `mu=0, sigma=1` |
| `alpha` (discrimination) | Half-Normal | `sigma` | `sigma=1` |

Note: `alpha` uses a **half-Normal** prior (truncated at 0) to enforce positive discrimination. Only `sigma` is meaningful; `mu` is ignored.
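
To get a feel for what a half-Normal prior on `alpha` implies, the sketch below draws from one in base R by taking absolute values of Normal(0, sigma) draws. It only illustrates the distribution and is not the package's internal Stan code; the names `sigma_alpha`, `alpha_draws`, and `theoretical_mean` are made up for this example.

```{r, eval=FALSE}
set.seed(123)

# Assumed scale, matching the documented default sigma = 1
sigma_alpha <- 1

# A Normal(0, sigma) truncated at 0 has the same distribution as |Normal(0, sigma)|
alpha_draws <- abs(rnorm(1e5, mean = 0, sd = sigma_alpha))

# The mean of a half-Normal(sigma) is sigma * sqrt(2 / pi)
theoretical_mean <- sigma_alpha * sqrt(2 / pi)
c(simulated = mean(alpha_draws), theoretical = theoretical_mean)

# Central 95% prior interval implied for alpha
quantile(alpha_draws, c(0.025, 0.975))
```

Re-running the sketch with a larger `sigma_alpha` (for example 2) shows how much prior mass a wider setting places on large discriminations.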
```{r, eval=FALSE}
components <- list(global = c(1, 2, 3), local = c(4, 5))

# Wider prior for discrimination
fit_mltm <- MLTM(analogy, Q, components,
                 iters = 2000, iter_warmup = 500, chains = 2,
                 priors = list(
                   theta = list(mu = 0, sigma = 1),
                   alpha = list(sigma = 2)
                 ))
```

### GMLTM-D priors

The GMLTM-D adds a guessing parameter (`c`):

| Parameter | Distribution | Hyperparameters | Default |
|-----------|-------------|-----------------|---------|
| `theta` (ability) | Normal | `mu`, `sigma` | `mu=0, sigma=1` |
| `eta` (rule difficulty) | Normal | `mu`, `sigma` | `mu=0, sigma=1` |
| `alpha` (discrimination) | Half-Normal | `sigma` | `sigma=1` |
| `c` (guessing) | Beta | `shape1`, `shape2` | `shape1=3, shape2=20` |

The default Beta(3, 20) prior for guessing has mean 3/23 ≈ 0.13 and places about 85% of its mass below 0.20, consistent with typical multiple-choice guessing rates.

```{r, eval=FALSE}
# Default: Beta(3,20) -- conservative guessing prior
fit_gmltm <- GMLTM(analogy, Q, components,
                   iters = 2000, iter_warmup = 500, chains = 2)

# Uniform prior for guessing -- no prior assumption
fit_uniform_c <- GMLTM(analogy, Q, components,
                       iters = 2000, iter_warmup = 500, chains = 2,
                       priors = list(c = list(shape1 = 1, shape2 = 1)))

# Moderately informative prior
fit_moderate_c <- GMLTM(analogy, Q, components,
                        iters = 2000, iter_warmup = 500, chains = 2,
                        priors = list(c = list(shape1 = 2, shape2 = 10)))
```

### Prior sensitivity analysis

A good practice is to refit the model with at least two different prior specifications and compare the posterior means:

```{r, eval=FALSE}
# Conservative priors
fit_conservative <- LLTM(analogy, Q, chains = 2, iters = 2000,
                         priors = list(eta = list(sigma = 1)))

# Diffuse priors
fit_diffuse <- LLTM(analogy, Q, chains = 2, iters = 2000,
                    priors = list(eta = list(sigma = 5)))

# Compare eta estimates
cbind(
  conservative = fit_conservative$EAP$eta,
  diffuse      = fit_diffuse$EAP$eta
)
```

If the estimates are similar, your results are robust to prior choice.
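
One rough way to put a number on "similar" is to look at the largest absolute difference and the correlation between the two sets of estimates. The sketch below uses made-up vectors in place of `fit_conservative$EAP$eta` and `fit_diffuse$EAP$eta` so that it runs without fitting anything. The values and the helper name `compare_estimates` are purely illustrative, it assumes the estimates are plain numeric vectors, and what counts as a "large" difference remains a judgment call.

```{r, eval=FALSE}
# Hypothetical helper: summarise agreement between two estimate vectors
compare_estimates <- function(a, b) {
  stopifnot(length(a) == length(b))
  c(max_abs_diff = max(abs(a - b)),
    correlation  = cor(a, b))
}

# Made-up values for illustration only (not output from GMLTM)
eta_conservative <- c(-0.50, 0.10, 0.80, 1.20, -0.30)
eta_diffuse      <- c(-0.55, 0.15, 0.90, 1.30, -0.35)

agreement <- compare_estimates(eta_conservative, eta_diffuse)
agreement
```
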
Large differences indicate the data are not very informative and results should be interpreted with caution.

### GMLTM default priors

The default priors in `GMLTM()` are weakly informative and suitable for most applications:

| Parameter | Distribution | Default |
|-----------|-------------|---------|
| `theta` (ability) | Normal | `mu=0, sigma=1` |
| `eta` (rule difficulty) | Normal | `mu=0, sigma=1` |
| `alpha` (discrimination) | Half-Normal | `sigma=1` |
| `c` (guessing) | Beta | `shape1=3, shape2=20` |

To replicate moderately diffuse priors (formerly GMLTM1):

```{r, eval=FALSE}
fit_gmltm1_style <- GMLTM(analogy, Q, components,
                          priors = list(
                            theta = list(mu = 0, sigma = 2),
                            eta   = list(mu = 0, sigma = 2),
                            c     = list(shape1 = 2, shape2 = 5)
                          ))
```

To replicate diffuse priors (formerly GMLTM2):

```{r, eval=FALSE}
fit_gmltm2_style <- GMLTM(analogy, Q, components,
                          priors = list(
                            theta = list(mu = 0, sigma = 5),
                            eta   = list(mu = 0, sigma = 5),
                            c     = list(shape1 = 1, shape2 = 1)
                          ))
```

## References

Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. *Acta Psychologica*, 37(6), 359--374.

Embretson, S. E., & Yang, X. (2013). A multicomponent latent trait model for diagnosis. *Psychometrika*, 78, 14--36.

Ramírez, E. S., Jiménez, M., Franco, V. R., & Alvarado, J. M. (2024). Delving into the complexity of analogical reasoning: A detailed exploration with the Generalized Multicomponent Latent Trait Model for Diagnosis. *Journal of Intelligence*, 12, 67. https://doi.org/10.3390/jintelligence12070067