Title: | Multidimensional Latent Class Item Response Theory Models |
---|---|
Description: | Framework for the Item Response Theory analysis of dichotomous and ordinal polytomous outcomes under the assumption of multidimensionality and discreteness of the latent traits. The fitting algorithms allow for missing responses and for different item parameterizations and are based on the Expectation-Maximization paradigm. Individual covariates affecting the class weights may be included in the new version (since 2.1). |
Authors: | Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT) |
Maintainer: | Francesco Bartolucci <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.11 |
Built: | 2024-12-05 06:50:28 UTC |
Source: | CRAN |
This package provides a flexible framework for the Item Response Theory (IRT) analysis of dichotomous and ordinal polytomous outcomes under the assumption of multidimensionality and discreteness of latent traits (abilities). Each ability level identifies a latent class of subjects. The fitting algorithms are based on the Expectation-Maximization (EM) paradigm and allow for missing responses and for different item parameterizations. The package also allows for the inclusion of individual covariates affecting the class weights.
Package: | MultiLCIRT |
Type: | Package |
Version: | 2.11 |
Date: | 2017-05-19 |
License: | GPL (>= 2) |
Function est_multi_poly performs the parameter estimation of the following IRT models, allowing for one or more latent traits:
- Binary responses: Rasch model, 2-Parameter Logistic (2PL) model;
- Ordinal polytomous responses: Samejima's Graded Response Model (GRM) and constrained versions with fixed discrimination parameters and/or additive decomposition of difficulty parameters (rating scale parameterization); Muraki's Generalized Partial Credit Model and constrained versions with fixed discrimination parameters and/or additive decomposition of difficulty parameters, such as the Partial Credit Model and the Rating Scale Model.
The basic input arguments for est_multi_poly are the person-item matrix of available response configurations and the corresponding frequencies, the number of latent classes, the type of link function, the specification of constraints on the discriminating and difficulty item parameters, and the allocation of items to the latent traits. Missing responses are coded with NA, and units and items without responses are automatically removed.
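To make the difference between the two link functions concrete, the following base R sketch computes the category probabilities of a single ordinal item given an ability level. It does not call the package. The function `item_probs` and the parameter values are hypothetical, and the logit definitions in the comments are an assumed, standard parameterization rather than one read from the package source.

```r
# Category probabilities for one ordinal item with categories 0, ..., l-1.
# Assumed parameterization (illustrative, not taken from the package source):
#   global logits: log[P(Y >= y) / P(Y < y)]    = gamma * (theta - beta[y])
#   local logits:  log[P(Y = y) / P(Y = y - 1)] = gamma * (theta - beta[y])
# for y = 1, ..., l-1.
item_probs <- function(theta, beta, gamma = 1, link = c("global", "local")) {
  link <- match.arg(link)
  eta <- gamma * (theta - beta)        # the l-1 logits
  if (link == "global") {
    upper <- c(1, plogis(eta), 0)      # P(Y >= y) for y = 0, ..., l
    p <- -diff(upper)                  # P(Y = y) = P(Y >= y) - P(Y >= y + 1)
  } else {
    p <- exp(cumsum(c(0, eta)))        # unnormalized P(Y = y)
    p <- p / sum(p)
  }
  p
}

beta <- c(-1, 0, 1.5)  # hypothetical thresholds (increasing, as the global link needs)
p_grm <- item_probs(theta = 0.5, beta = beta, gamma = 1.2, link = "global")
p_pcm <- item_probs(theta = 0.5, beta = beta, gamma = 1, link = "local")

# With a single threshold (dichotomous item) the two links coincide
p_bin_g <- item_probs(theta = 0.3, beta = 0.1, link = "global")
p_bin_l <- item_probs(theta = 0.3, beta = 0.1, link = "local")
```

Under these assumptions, `gamma = 1` for every item corresponds to the Rasch-type constraint (disc = 0), while a free `gamma` corresponds to disc = 1.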
Function test_dim performs a likelihood ratio test to choose the optimal number of latent traits (or dimensions) by comparing nested models that differ in the number of latent traits, with all other elements held equal (i.e., number of latent classes, type of link function, constraints on item parameters). The basic input arguments for test_dim are similar to those for est_multi_poly.
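The likelihood ratio comparison described above can be sketched in base R from two maximized log-likelihoods and the corresponding numbers of free parameters. The helper `lr_test` and the numeric values are hypothetical. The sketch assumes the usual asymptotic chi-squared reference distribution and makes no claim about the names of the objects returned by test_dim.

```r
# Likelihood ratio test between two nested models.
# lk_*: maximized log-likelihoods; np_*: numbers of free parameters.
lr_test <- function(lk_restricted, np_restricted, lk_general, np_general) {
  dev <- 2 * (lk_general - lk_restricted)   # deviance between the models
  df <- np_general - np_restricted          # difference in free parameters
  list(dev = dev, df = df,
       pvalue = pchisq(dev, df, lower.tail = FALSE))
}

# Hypothetical values: unidimensional (restricted) vs. bidimensional (general)
res <- lr_test(lk_restricted = -3200.4, np_restricted = 15,
               lk_general = -3195.1, np_general = 17)
res$pvalue
```

A small p-value suggests rejecting the model with fewer dimensions in favour of the more general one.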
Function class_item performs a hierarchical clustering of items based on a specified LC IRT model. The basic input arguments are given by the number of latent classes, the type of model, and the constraints on the item parameters (only for polytomous responses). An allocation of items to the different latent traits is obtained depending on the cut-point of the resulting dendrogram.
Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT)
Maintainer: Francesco Bartolucci <[email protected]>
Bartolucci, F. (2007), A class of multidimensional IRT models for testing unidimensionality and clustering items, Psychometrika, 72, 141-157.
Bacci, S., Bartolucci, F. and Gnaldi, M. (2014), A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses, Communications in Statistics - Theory and Methods, 43, 787-800.
Bartolucci, F., Bacci, S. and Gnaldi, M. (2014), MultiLCIRT: An R package for multidimensional latent class item response models, Computational Statistics and Data Analysis, 71, 971-985.
## Estimation of different Multidimensional LC IRT models with binary
## responses
# Aggregate data
data(naep)
X = as.matrix(naep)
out = aggr_data(X)
S = out$data_dis
yv = out$freq
# Define matrix to allocate each item on one dimension
multi1 = rbind(c(1,2,9,10),c(3,5,8,11),c(4,6,7,12))
# Three-dimensional LC Rasch model with 4 latent classes
# less severe tolerance level to check convergence (to be modified)
out1 = est_multi_poly(S,yv,k=4,start=0,link=1,multi=multi1,tol=10^-6)
Given a matrix of configurations (covariates and responses) unit-by-unit, this function finds the corresponding matrix of distinct configurations and the corresponding vector of frequencies (it does not work properly with missing data).
aggr_data(data, disp=FALSE, fort=FALSE)
data |
matrix of covariate and unit-by-unit response configurations |
disp |
to display partial results |
fort |
to use fortran routines when possible |
data_dis |
matrix of distinct configurations |
freq |
vector of corresponding frequencies |
label |
the index of each provided response configuration among the distinct ones |
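As an illustration of the three returned objects, the following base R sketch performs the same kind of aggregation on a small matrix. The function `aggregate_rows` is a hypothetical stand-in written for this example, not the package's implementation, and the order in which aggr_data lists the distinct configurations may differ from the order of first appearance used here.

```r
# Collapse a unit-by-unit matrix into distinct rows, their frequencies, and
# the index of the distinct row matching each original row.
aggregate_rows <- function(data) {
  data <- as.matrix(data)
  key <- apply(data, 1, paste, collapse = "_")  # one string per row
  first <- !duplicated(key)                     # first occurrence of each row
  data_dis <- data[first, , drop = FALSE]
  label <- match(key, key[first])
  freq <- tabulate(label, nbins = nrow(data_dis))
  list(data_dis = data_dis, freq = freq, label = label)
}

X <- rbind(c(1, 0), c(0, 1), c(1, 0), c(1, 1), c(0, 1), c(1, 0))
out <- aggregate_rows(X)
out$data_dis  # distinct configurations
out$freq      # their frequencies
out$label     # position of each original row among the distinct ones
```

The original matrix can be recovered as `out$data_dis[out$label, ]`, which is a convenient check that no configuration was lost.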
Francesco Bartolucci - University of Perugia (IT)
# draw a matrix of random responses and find distinct responses
X = matrix(sample(5,100,replace=TRUE),50,2)
out = aggr_data(X)
# find the distinct responses and the corresponding vector of frequencies
# for naep data
data(naep)
X = as.matrix(naep)
out = aggr_data(X)
length(out$freq)
It performs a hierarchical classification of a set of test items on the basis of the responses provided by a sample of subjects. The classification is based on a sequence of likelihood ratio tests between pairs of multidimensional models suitably formulated.
class_item(S, yv, k, link = 1, disc = 0, difl = 0, fort = FALSE, disp = FALSE, tol = 10^-10)
S |
matrix of all response sequences observed at least once in the sample and listed row-by-row (use 999 for missing response) |
yv |
vector of the frequencies of every response configuration in S |
k |
number of ability levels (or latent classes) |
link |
type of link function (1 = global logits, 2 = local logits); with global logits the Graded Response model results; with local logits the Partial Credit results (with dichotomous responses, global logits is the same as using local logits resulting in the Rasch or the 2PL model depending on the value assigned to disc) |
disc |
indicator of constraints on the discriminating indices (0 = all equal to one, 1 = free) |
difl |
indicator of constraints on the difficulty levels (0 = free, 1 = rating scale parametrization) |
fort |
to use fortran routines when possible |
disp |
to display the likelihood evolution step by step |
tol |
tolerance level for convergence |
merge |
input for the dendrogram represented by the |
height |
input for the dendrogram represented by the |
lk |
maximum log-likelihood of the model resulting from each aggregation |
np |
number of free parameters of the model resulting from each aggregation |
lk0 |
maximum log-likelihood of the latent class model |
np0 |
number of free parameters of the latent class model |
groups |
list of groups resulting (step-by-step) from the hierarchical clustering |
dend |
hclust object to represent the dendrogram |
call |
command used to call the function |
Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT)
Bartolucci, F. (2007), A class of multidimensional IRT models for testing unidimensionality and clustering items, Psychometrika, 72, 141-157.
Bacci, S., Bartolucci, F. and Gnaldi, M. (2012), A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses, Technical report, http://arxiv.org/abs/1201.4667.
## Not run:
## Model-based hierarchical classification of items from simulated data
# Setup
r = 6 # number of items
n = 1000 # sample size
bev = rep(0,r)
k = r/2
multi = rbind(1:(r/2),(r/2+1):r)
L = chol(matrix(c(1,0.6,0.6,1),2,2))
data = matrix(0,n,r)
model = 1
# Create data
Th = matrix(rnorm(2*n),n,2)
for(i in 1:n) for(j in 1:r){
  if(j<=r/2){
    pc = exp(Th[i,1]-bev[j]); pc = pc/(1+pc)
  }else{
    pc = exp(Th[i,2]-bev[j]); pc = pc/(1+pc)
  }
  data[i,j] = runif(1)<pc
}
# Aggregate data
out = aggr_data(data)
S = out$data_dis
yv = out$freq
# Create dendrogram for items classification, by assuming k=3 latent
# classes and a Rasch parameterization
out = class_item(S,yv,k=3,link=1)
summary(out)
plot(out$dend)
## End(Not run)

## Not run:
## Model-based hierarchical classification of NAEP items
# Aggregate data
data(naep)
X = as.matrix(naep)
out = aggr_data(X)
S = out$data_dis
yv = out$freq
# Create dendrogram for items classification, by assuming k=4 latent
# classes and a Rasch parameterization
out = class_item(S,yv,k=4,link=1)
summary(out)
plot(out$dend)
## End(Not run)
Given different outputs provided by est_multi_poly, the function compares the fitted models and provides a unified table.
compare_models(out1, out2, out3=NULL, out4=NULL, out5=NULL, nested=FALSE)
out1 |
output from the 1st fitting |
out2 |
output from the 2nd fitting |
out3 |
output from the 3rd fitting |
out4 |
output from the 4th fitting |
out5 |
output from the 5th fitting |
nested |
to compare each model with the first in terms of LR test |
table |
table summarizing the comparison between the models |
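A comparison table of this kind can be sketched in base R from the maximized log-likelihood (lk) and the number of free parameters (np) of each fit. The values below are hypothetical, the column layout is not the one produced by compare_models, and taking `n` as the total sample size in the BIC penalty is an assumption of this sketch.

```r
# Hypothetical fits: maximized log-likelihoods and numbers of free parameters
fits <- data.frame(
  model = c("Rasch", "2PL"),
  lk = c(-3200.4, -3195.1),
  np = c(15, 26)
)
n <- 1500  # assumed total sample size

# Standard definitions of the information criteria
fits$aic <- -2 * fits$lk + 2 * fits$np
fits$bic <- -2 * fits$lk + log(n) * fits$np

fits
best_aic <- fits$model[which.min(fits$aic)]
```

Lower values of either criterion indicate a better balance between fit and parsimony. Here the gain in log-likelihood of the second model does not offset its 11 additional parameters.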
Francesco Bartolucci - University of Perugia (IT)
It fits marginal regression models to datasets consisting of a categorical response and one or more covariates by a Fisher-scoring algorithm; this is an internal function.
est_multi_glob(Y, X, model, ind = 1:nrow(Y), be = NULL, Dis = NULL, dis = NULL, disp=FALSE, only_sc = FALSE, Int = NULL, der_single = FALSE)
Y |
matrix of response configurations |
X |
array of all distinct covariate configurations |
model |
type of logit (g = global, l = local, m = multinomial) |
ind |
vector to link responses to covariates |
be |
initial vector of regression coefficients |
Dis |
matrix for inequality constraints on be |
dis |
vector for inequality constraints on be |
disp |
to display partial output |
only_sc |
to exit giving only the score |
Int |
matrix of the fixed intercepts |
der_single |
to require single derivatives |
be |
estimated vector of regression coefficients |
lk |
log-likelihood at convergence |
Pdis |
matrix of the probabilities for each distinct covariate configuration |
P |
matrix of the probabilities for each covariate configuration |
sc |
score |
Sc |
single derivative (if der_single=TRUE) |
Francesco Bartolucci - University of Perugia (IT)
Colombi, R. and Forcina, A. (2001), Marginal regression models for the analysis of positive association of ordinal response variables, Biometrika, 88, 1007-1019.
Glonek, G. F. V. and McCullagh, P. (1995), Multivariate logistic models, Journal of the Royal Statistical Society, Series B, 57, 533-546.
The function performs maximum likelihood estimation of the parameters of the IRT models assuming a discrete distribution for the ability. Every ability level corresponds to a latent class of subjects in the reference population. Maximum likelihood estimation is based on the Expectation-Maximization algorithm.
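To illustrate the Expectation-Maximization idea in the simplest setting, the following base R sketch fits a plain latent class model to binary items, with no link function, no covariates, and no missing responses. The function `em_lc`, its starting values, and the simulated data are assumptions made for this example. It is not the algorithm implemented in est_multi_poly, which also handles IRT parameterizations, covariates, and missing data.

```r
# Minimal EM for a latent class model with binary items.
# Y: n x J numeric matrix of 0/1 responses; k: number of latent classes.
em_lc <- function(Y, k, tol = 1e-8, max_iter = 1000) {
  n <- nrow(Y); J <- ncol(Y)
  piv <- rep(1 / k, k)                                 # class weights
  # Deterministic start: spread success probabilities around the item means
  Phi <- outer(colMeans(Y), seq(0.5, 1.5, length.out = k))
  Phi[] <- pmin(pmax(Phi, 0.05), 0.95)                 # J x k, P(Y_j = 1 | class)
  lk_old <- -Inf
  lkv <- numeric(0)
  for (it in seq_len(max_iter)) {
    # E-step: posterior class probabilities for every unit
    logf <- Y %*% log(Phi) + (1 - Y) %*% log(1 - Phi)  # n x k
    joint <- exp(logf) * rep(piv, each = n)
    marg <- rowSums(joint)
    Pp <- joint / marg
    lk <- sum(log(marg))                               # observed log-likelihood
    lkv <- c(lkv, lk)
    # M-step: update class weights and conditional response probabilities
    piv <- colMeans(Pp)
    Phi <- t(Y) %*% Pp / rep(colSums(Pp), each = J)
    Phi[] <- pmin(pmax(Phi, 1e-10), 1 - 1e-10)
    # Stop on the relative difference between consecutive log-likelihoods
    if (abs(lk - lk_old) / abs(lk) < tol) break
    lk_old <- lk
  }
  list(piv = piv, Phi = Phi, Pp = Pp, lk = lk, lkv = lkv, it = it)
}

# Simulated data: two classes with success probabilities 0.2 and 0.8
set.seed(1)
n <- 500
cls <- sample(1:2, n, replace = TRUE)
la <- c(0.2, 0.8)
Y <- matrix(as.numeric(runif(n * 5) < la[cls]), n, 5)
out <- em_lc(Y, k = 2)
round(out$piv, 3)
round(out$Phi, 3)
```

The names `piv`, `Phi`, `Pp`, `lk`, and `lkv` mirror the output names documented for est_multi_poly only to ease comparison. The log-likelihood trace `lkv` should be non-decreasing, which is a quick sanity check for any EM implementation.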
est_multi_poly(S, yv = rep(1,ns), k, X = NULL, start = 0, link = 0, disc = 0, difl = 0, multi = NULL, piv = NULL, Phi = NULL, gac = NULL, De = NULL, fort = FALSE, tol = 10^-10, disp = FALSE, output = FALSE, out_se = FALSE, glob = FALSE)
S |
matrix of all response sequences observed at least once in the sample and listed row-by-row (use NA for missing response) |
yv |
vector of the frequencies of every response configuration in S |
k |
number of ability levels (or latent classes) |
X |
matrix of covariates that affect the weights |
start |
method of initialization of the algorithm (0 = deterministic, 1 = random, 2 = arguments given as input) |
link |
type of link function (0 = no link function, 1 = global logits, 2 = local logits); with no link function the Latent Class model results; with global logits the Graded Response model results; with local logits the Partial Credit results (with dichotomous responses, global logits is the same as using local logits resulting in the Rasch or the 2PL model depending on the value assigned to disc) |
disc |
indicator of constraints on the discriminating indices (0 = all equal to one, 1 = free) |
difl |
indicator of constraints on the difficulty levels (0 = free, 1 = rating scale parameterization) |
multi |
matrix with a number of rows equal to the number of dimensions and elements in each row equal to the indices of the items measuring the dimension corresponding to that row |
piv |
initial value of the vector of weights of the latent classes (if start=2) |
Phi |
initial value of the matrix of the conditional response probabilities (if start=2) |
gac |
initial value of the complete vector of discriminating indices (if start=2) |
De |
initial value of regression coefficients for the covariates (if start=2) |
fort |
to use fortran routines when possible |
tol |
tolerance level for checking convergence of the algorithm as relative difference between consecutive log-likelihoods |
disp |
to display the likelihood evolution step by step |
output |
to return additional outputs (Phi,Pp,Piv) |
out_se |
to return standard errors |
glob |
to use global logits in the covariates |
piv |
estimated vector of weights of the latent classes (average of the weights in case of model with covariates) |
Th |
estimated matrix of ability levels for each dimension and latent class |
Bec |
estimated vector of difficulty levels for every item (split in two vectors if difl=1) |
gac |
estimated vector of discriminating indices for every item (with all elements equal to 1 with Rasch parametrization) |
fv |
vector indicating the reference item chosen for each latent dimension |
Phi |
array of the conditional response probabilities for every item and latent class |
De |
matrix of regression coefficients for the multinomial logit model on the class weights |
Piv |
matrix of the weights for every response configuration (if output=TRUE) |
Pp |
matrix of the posterior probabilities for each response configuration and latent class (if output=TRUE) |
lk |
log-likelihood at convergence of the EM algorithm |
np |
number of free parameters |
aic |
Akaike Information Criterion index |
bic |
Bayesian Information Criterion index |
ent |
Entropy index to measure the separation of classes |
lkv |
Vector to trace the log-likelihood evolution across iterations (if output=TRUE) |
seDe |
Standard errors for De (if output=TRUE) |
separ |
Standard errors for vector of parameters containing Th and Be (if out_se=TRUE) |
sega |
Standard errors for vector of discrimination indices (if out_se=TRUE) |
Vn |
Estimated variance-covariance matrix for all parameter estimates (if output=TRUE) |
Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT)
Bartolucci, F. (2007), A class of multidimensional IRT models for testing unidimensionality and clustering items, Psychometrika, 72, 141-157.
Bacci, S., Bartolucci, F. and Gnaldi, M. (2014), A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses, Communications in Statistics - Theory and Methods, 43, 787-800.
## Estimation of different Multidimensional LC IRT models with binary
# responses
# Aggregate data
data(naep)
X = as.matrix(naep)
out = aggr_data(X)
S = out$data_dis
yv = out$freq
# Define matrix to allocate each item to one dimension
multi1 = rbind(c(1,2,9,10),c(3,5,8,11),c(4,6,7,12))
# Three-dimensional Rasch model with 3 latent classes
# the tolerance level has been raised to increase the speed (to be reset
# to a smaller value)
out1 = est_multi_poly(S,yv,k=3,start=0,link=1,multi=multi1,tol=10^-6)

## Not run:
# Three-dimensional 2PL model with 3 latent classes
out2 = est_multi_poly(S,yv,k=3,start=0,link=1,disc=1,multi=multi1)
## End(Not run)

## Not run:
## Estimation of different Multidimensional LC IRT models with ordinal
# responses
# Aggregate data
data(hads)
X = as.matrix(hads)
out = aggr_data(X)
S = out$data_dis
yv = out$freq
# Define matrix to allocate each item to one dimension
multi1 = rbind(c(2,6,7,8,10,11,12),c(1,3,4,5,9,13,14))
# Bidimensional LC Graded Response Model with 3 latent classes
# (free discriminating and free difficulty parameters)
out1 = est_multi_poly(S,yv,k=3,start=0,link=1,disc=1,multi=multi1)
# Bidimensional LC Partial Credit Model with 3 latent classes
# (constrained discrimination and free difficulty parameters)
out2 = est_multi_poly(S,yv,k=3,start=0,link=2,multi=multi1)
# Bidimensional LC Rating Scale Model with 3 latent classes
# (constrained discrimination and constrained difficulty parameters)
out3 = est_multi_poly(S,yv,k=3,start=0,link=2,difl=1,multi=multi1)
## End(Not run)

## Not run:
## Estimation of LC model with covariates
# generate covariates
be = c(0,1,-1)
X = matrix(rnorm(2000),1000,2)
# linear predictor: intercept and covariates weighted by be
u = cbind(1,X)%*%be
p = exp(u)/(1+exp(u))
c = 1+(runif(1000)<p)
Y = matrix(0,1000,5)
la = c(0.3,0.7)
for(i in 1:1000) Y[i,] = runif(5)<la[c[i]]
# fit the model with k=2 and k=3 classes
out1 = est_multi_poly(Y,k=2,X=X)
out2 = est_multi_poly(Y,k=3,X=X)
# fit model with k=2 and k=3 classes in fortran
out3 = est_multi_poly(Y,k=2,X=X,fort=TRUE)
out4 = est_multi_poly(Y,k=3,X=X,fort=TRUE)
## End(Not run)
The function performs maximum likelihood estimation of the parameters of the IRT models assuming a discrete distribution for the ability and a discrete distribution for the latent variable at cluster level. Every ability level corresponds to a latent class of subjects in the reference population. Maximum likelihood estimation is based on the Expectation-Maximization algorithm.
est_multi_poly_clust(S, kU, kV, W = NULL, X = NULL, clust, start = 0, link = 0, disc = 0, difl = 0, multi = 1:J, piv = NULL, Phi = NULL, gac = NULL, DeU = NULL, DeV = NULL, fort = FALSE, tol = 10^-10, disp = FALSE, output = FALSE)
S |
matrix of all response sequences observed at least once in the sample and listed row-by-row (use NA for missing response) |
kU |
number of support points (or latent classes at cluster level) |
kV |
number of ability levels (or latent classes at individual level) |
W |
matrix of covariates that affect the weights at cluster level |
X |
matrix of covariates that affect the weights at individual level |
clust |
vector of cluster indicator for each unit |
start |
method of initialization of the algorithm (0 = deterministic, 1 = random, 2 = arguments given as input) |
link |
type of link function (0 = no link function, 1 = global logits, 2 = local logits); with no link function the Latent Class model results; with global logits the Graded Response model results; with local logits the Partial Credit results (with dichotomous responses, global logits is the same as using local logits resulting in the Rasch or the 2PL model depending on the value assigned to disc) |
disc |
indicator of constraints on the discriminating indices (0 = all equal to one, 1 = free) |
difl |
indicator of constraints on the difficulty levels (0 = free, 1 = rating scale parameterization) |
multi |
matrix with a number of rows equal to the number of dimensions and elements in each row equal to the indices of the items measuring the dimension corresponding to that row |
piv |
initial value of the vector of weights of the latent classes (if start=2) |
Phi |
initial value of the matrix of the conditional response probabilities (if start=2) |
gac |
initial value of the complete vector of discriminating indices (if start=2) |
DeU |
initial value of regression coefficients for the covariates in W (if start=2) |
DeV |
initial value of regression coefficients for the covariates in X (if start=2) |
fort |
to use fortran routines when possible |
tol |
tolerance level for checking convergence of the algorithm as relative difference between consecutive log-likelihoods |
disp |
to display the likelihood evolution step by step |
output |
to return additional outputs (Phi,Pp,Piv) |
piv |
estimated vector of weights of the latent classes (average of the weights in case of model with covariates) |
Th |
estimated matrix of ability levels for each dimension and latent class |
Bec |
estimated vector of difficulty levels for every item (split in two vectors if difl=1) |
gac |
estimated vector of discriminating indices for every item (with all elements equal to 1 with Rasch parametrization) |
fv |
vector indicating the reference item chosen for each latent dimension |
Phi |
array of the conditional response probabilities for every item and latent class |
De |
matrix of regression coefficients for the multinomial logit model on the class weights |
Piv |
matrix of the weights for every response configuration (if output=TRUE) |
Pp |
matrix of the posterior probabilities for each response configuration and latent class (if output=TRUE) |
lk |
log-likelihood at convergence of the EM algorithm |
np |
number of free parameters |
aic |
Akaike Information Criterion index |
bic |
Bayesian Information Criterion index |
ent |
Entropy index to measure the separation of classes |
lkv |
Vector to trace the log-likelihood evolution across iterations (if output=TRUE) |
seDe |
Standard errors for De (if output=TRUE) |
separ |
Standard errors for vector of parameters containing Th and Be (if output=TRUE) |
sega |
Standard errors for vector of discrimination indices (if output=TRUE) |
Vn |
Estimated variance-covariance matrix for all parameter estimates (if output=TRUE) |
Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT)
Bartolucci, F. (2007), A class of multidimensional IRT models for testing unidimensionality and clustering items, Psychometrika, 72, 141-157.
Bacci, S., Bartolucci, F. and Gnaldi, M. (2014), A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses, Communications in Statistics - Theory and Methods, 43, 787-800.
## Not run:
# generate covariate at cluster level
nclust = 200
W = matrix(round(rnorm(nclust)*2,0)/2,nclust,1)
la = exp(W)/(1+exp(W))
U = 1+1*(runif(nclust)<la)
clust = NULL
for(h in 1:nclust){
  nh = round(runif(1,5,20))
  clust = c(clust,h*rep(1,nh))
}
n = length(clust)
# generate covariates
DeV = rbind(c(1.75,1.5),c(-0.25,-1.5),c(-0.5,-1),c(0.5,1))
X = matrix(round(rnorm(2*n)*2,0)/2,n,2)
Piv = cbind(0,cbind(U[clust]==1,U[clust]==2,X)%*%DeV)
Piv = exp(Piv)*(1/rowSums(exp(Piv)))
V = rep(0,n)
for(i in 1:n) V[i] = which(rmultinom(1,1,Piv[i,])==1)
# generate responses
la = c(0.2,0.5,0.8)
Y = matrix(0,n,10)
for(i in 1:n) Y[i,] = runif(10)<la[V[i]]
# fit the model with kU=2 classes at cluster level and kV=3 classes at
# individual level
out1 = est_multi_poly_clust(Y,kU=2,kV=3,W=W,X=X,clust=clust)
out2 = est_multi_poly_clust(Y,kU=2,kV=3,W=W,X=X,clust=clust,disp=TRUE,
                            output=TRUE)
out3 = est_multi_poly_clust(Y,kU=2,kV=3,W=W,X=X,clust=clust,disp=TRUE,
                            output=TRUE,start=2,Phi=out2$Phi,gac=out2$gac,
                            DeU=out2$DeU,DeV=out2$DeV)
# Rasch
out4 = est_multi_poly_clust(Y,kU=2,kV=3,W=W,X=X,clust=clust,link=1,
                            disp=TRUE,output=TRUE)
out5 = est_multi_poly_clust(Y,kU=2,kV=3,W=W,X=X,clust=clust,link=1,
                            disc=1,disp=TRUE,output=TRUE)
## End(Not run)
This data set contains the responses of 201 oncological patients to 14 ordinal polytomous items that measure anxiety (7 items) and depression (7 items), according to the Hospital Anxiety and Depression Scale questionnaire.
data(hads)
A data frame with 201 observations on 14 items:
item1
measure of depression
item2
measure of anxiety
item3
measure of depression
item4
measure of depression
item5
measure of depression
item6
measure of anxiety
item7
measure of anxiety
item8
measure of anxiety
item9
measure of depression
item10
measure of anxiety
item11
measure of anxiety
item12
measure of anxiety
item13
measure of depression
item14
measure of depression
All items have 4 response categories: the minimum value 0 corresponds to a low level of anxiety or depression, whereas the maximum value 3 corresponds to a high level of anxiety or depression.
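The item list above determines how items are allocated to the two dimensions. The following base R sketch derives the allocation matrix from that list. The vector `dimension` simply transcribes the item descriptions, and the name `multi` is chosen to match the argument of est_multi_poly.

```r
# Dimension measured by each of the 14 items, in item order (item1, ..., item14)
dimension <- c("depression", "anxiety", "depression", "depression",
               "depression", "anxiety", "anxiety", "anxiety",
               "depression", "anxiety", "anxiety", "anxiety",
               "depression", "depression")

# One row per dimension, containing the indices of the items measuring it
multi <- rbind(which(dimension == "anxiety"),
               which(dimension == "depression"))
rownames(multi) <- c("anxiety", "depression")
multi
```

The resulting rows, (2, 6, 7, 8, 10, 11, 12) for anxiety and (1, 3, 4, 5, 9, 13, 14) for depression, coincide with the matrix `multi1` used in the ordinal-response examples of est_multi_poly.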
Zigmond, A. and Snaith, R. (1983), The hospital anxiety and depression scale, Acta Psychiatrica Scandinavica, 67, 361-370.
data(hads)
## maybe str(hads)
str(hads)
Function used within est_multi_glob to invert marginal logits and fit the marginal regression model; this is an internal function.
inv_glob(eta, type = "g", der = F)
eta |
vector of logits |
type |
type of logit (l = local-logits, g = global-logits) |
der |
indicator that the derivative of the canonical parameters with respect to the vector of marginal logits is required (F = not required, T = required) |
p |
vector of probabilities |
D |
derivative of the canonical parameters with respect to the vector of marginal logits (if der = T) |
Francesco Bartolucci - University of Perugia (IT)
Colombi, R. and Forcina, A. (2001), Marginal regression models for the analysis of positive association of ordinal response variables, Biometrika, 88, 1007-1019.
Glonek, G. F. V. and McCullagh, P. (1995), Multivariate logistic models, Journal of the Royal Statistical Society, Series B, 57, 533-546.
Function used within est_multi_poly to compute observed log-likelihood and score.
lk_obs_score(par_comp, lde, lpar, lga, S, R, yv, k, rm, l, J, fv, link, disc, indga, glob, refitem, miss, ltype, XXdis, Xlabel, ZZ0, fort)
par_comp |
complete vector of parameters |
lde |
length of de |
lpar |
length of par |
lga |
length of ga |
S |
matrix of responses |
R |
matrix of observed responses indicator |
yv |
vector of frequencies |
k |
number of latent classes |
rm |
number of dimensions |
l |
number of response categories |
J |
number of items |
fv |
indicator of constrained parameters |
link |
link function |
disc |
presence of discrimination parameter |
indga |
indicator of gamma parameters |
glob |
indicator of global parametrization for the covariates |
refitem |
vector of reference items |
miss |
indicator of presence of missing responses |
ltype |
type of logit |
XXdis |
array of covariates |
Xlabel |
indicator for covariate configuration |
ZZ0 |
design matrix |
fort |
to use fortran |
lk |
log-likelihood function |
sc |
score vector |
Francesco Bartolucci - University of Perugia (IT)
Function used within est_multi_poly_clust to compute observed log-likelihood and score.
lk_obs_score_clust(par_comp, lde1, lde2, lpar, lga, S, R, kU, kV, rm, l, J, fv, link, disc, indga, refitem, miss, ltype, WWdis, Wlabel, XXdis, Xlabel, ZZ0, clust, fort)
par_comp |
complete vector of parameters |
lde1 |
length of de |
lde2 |
length of de |
lpar |
length of par |
lga |
length of ga |
S |
matrix of responses |
R |
matrix of observed responses indicator |
kU |
number of latent classes at cluster level |
kV |
number of latent classes at individual level |
rm |
number of dimensions |
l |
number of response categories |
J |
number of items |
fv |
indicator of constrained parameters |
link |
link function |
disc |
presence of discrimination parameter |
indga |
indicator of gamma parameters |
refitem |
vector of reference items |
miss |
indicator of presence of missing responses |
ltype |
type of logit |
WWdis |
array of covariates at cluster level |
Wlabel |
indicator for covariate configuration at cluster level |
XXdis |
array of covariates at individual level |
Xlabel |
indicator for covariate configuration at individual level |
ZZ0 |
design matrix |
clust |
vector of cluster indicator for each unit |
fort |
to use fortran |
lk |
log-likelihood function |
sc |
score vector |
Francesco Bartolucci - University of Perugia (IT)
Provides the matrices used to compute a vector of generalized logits from a vector of probabilities according to the formula Co*log(Ma*p); this is an internal function.
matr_glob(l, type = "g")
l |
number of response categories |
type |
type of logit (l = local-logits, g = global-logits) |
Co |
matrix of contrasts |
Ma |
marginalization matrix |
Francesco Bartolucci - University of Perugia (IT)
Colombi, R. and Forcina, A. (2001), Marginal regression models for the analysis of positive association of ordinal response variables, Biometrika, 88, 1007-1019.
Glonek, G. F. V. and McCullagh, P. (1995), Multivariate logistic models, Journal of the Royal Statistical Society, Series B, 57, 533-546.
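The formula Co*log(Ma*p) can be made concrete with a small base R sketch. The function `glob_matrices_sketch` below is a hypothetical illustration, not the package's `matr_glob`; it assumes that p has l elements, that Ma stacks the "lower" and "upper" marginal sums, and that Co takes the difference of their logarithms.

```r
# Hypothetical sketch (not the package's matr_glob): contrast and
# marginalization matrices such that eta = Co %*% log(Ma %*% p).
glob_matrices_sketch <- function(l, type = "g") {
  I <- diag(l - 1)
  Co <- cbind(-I, I)   # log(upper part) - log(lower part)
  if (type == "g") {
    # Global logits: row j of the top block sums the first j probabilities,
    # row j of the bottom block sums the remaining ones.
    lower <- cbind(1 * lower.tri(I, diag = TRUE), 0)
    Ma <- rbind(lower, 1 - lower)
  } else {
    # Local logits: the top block picks p[j], the bottom block picks p[j + 1].
    Ma <- rbind(cbind(I, 0), cbind(0, I))
  }
  list(Co = Co, Ma = Ma)
}

p <- c(0.2, 0.3, 0.5)
mg <- glob_matrices_sketch(3, "g")
ml <- glob_matrices_sketch(3, "l")
eta_g <- as.vector(mg$Co %*% log(mg$Ma %*% p))  # log(0.8/0.2), log(0.5/0.5)
eta_l <- as.vector(ml$Co %*% log(ml$Ma %*% p))  # log(0.3/0.2), log(0.5/0.3)
```

Writing the logits in this matrix form keeps the same code path for both logit types; only Ma changes.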
This dataset contains the responses of a sample of 1510 examinees to 12 binary items on Mathematics. It has been extracted from a larger dataset collected in 1996 by the Educational Testing Service within the National Assessment of Educational Progress (NAEP) project.
data(naep)
A data frame with 1510 observations on the following 12 items:
Item1
round to thousand place
Item2
write fraction that represents shaded region
Item3
multiply two negative integers
Item4
reason about sample space (number correct)
Item5
find amount of restaurant tip
Item6
identify representative sample
Item7
read dials on a meter
Item8
find (x, y) solution of linear equation
Item9
translate words to symbols
Item10
find number of diagonals in polygon from a vertex
Item11
find perimeter (quadrilateral)
Item12
reason about betweenness
Bartolucci, F. and Forcina, A. (2005), Likelihood inference on the underlying structure of IRT models, Psychometrika, 70, 31-43.
Bartolucci, F. (2007), A class of multidimensional IRT models for testing unidimensionality and clustering items, Psychometrika, 72, 141-157.
data(naep)
## maybe str(naep)
str(naep)
Writes the output from class_item in a readable form.
## S3 method for class 'class_item'
print(x, ...)
x |
output from class_item |
... |
further arguments passed to or from other methods |
Francesco Bartolucci - University of Perugia (IT)
Writes the output from est_multi_poly in a readable form.
## S3 method for class 'est_multi_poly'
print(x, ...)
x |
output from est_multi_poly |
... |
further arguments passed to or from other methods |
Francesco Bartolucci - University of Perugia (IT)
Writes the output from est_multi_poly_clust in a readable form.
## S3 method for class 'est_multi_poly_clust'
print(x, ...)
x |
output from est_multi_poly_clust |
... |
further arguments passed to or from other methods |
Francesco Bartolucci - University of Perugia (IT)
Writes the output from test_dim in a readable form.
## S3 method for class 'test_dim'
print(x, ...)
x |
output from test_dim |
... |
further arguments passed to or from other methods |
Francesco Bartolucci - University of Perugia (IT)
Provides the matrix of probabilities under different parametrizations.
prob_multi_glob(X, model, be, ind=(1:dim(X)[3]))
X |
array of all distinct covariate configurations |
model |
type of logit (g = global, l = local, m = multinomial) |
be |
initial vector of regression coefficients |
ind |
vector to link responses to covariates |
Pdis |
matrix of distinct probability vectors |
P |
matrix of the probabilities for each covariate configuration |
Francesco Bartolucci - University of Perugia (IT)
Colombi, R. and Forcina, A. (2001), Marginal regression models for the analysis of positive association of ordinal response variables, Biometrika, 88, 1007-1019.
Glonek, G. F. V. and McCullagh, P. (1995), Multivariate logistic models, Journal of the Royal Statistical Society, Series B, 57, 533-546.
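To show how such a probability matrix could be built from covariates, here is a minimal base R sketch for the multinomial logit case only. It is not the package's `prob_multi_glob`: the function name is hypothetical, and it assumes that `X[, , i]` is the (l - 1) x ncov design matrix of the i-th distinct covariate configuration, so that the logits of that configuration are `X[, , i] %*% be` with the first category as reference.

```r
# Hypothetical sketch (not the package's prob_multi_glob), multinomial logits.
# Assumption: X[, , i] is the design matrix of the i-th distinct configuration.
prob_multinomial_sketch <- function(X, be, ind = seq_len(dim(X)[3])) {
  ndis <- dim(X)[3]
  l <- dim(X)[1] + 1
  Pdis <- matrix(0, ndis, l)
  for (i in seq_len(ndis)) {
    Xi <- matrix(X[, , i], nrow = dim(X)[1])  # keep matrix shape
    eta <- as.vector(Xi %*% be)               # logits versus first category
    w <- exp(c(0, eta))
    Pdis[i, ] <- w / sum(w)
  }
  # P repeats the distinct rows according to the linking vector ind
  list(Pdis = Pdis, P = Pdis[ind, , drop = FALSE])
}

# Two configurations, three categories: intercepts plus one binary covariate
X <- array(0, c(2, 4, 2))
X[, , 1] <- cbind(diag(2), 0 * diag(2))  # covariate = 0
X[, , 2] <- cbind(diag(2), 1 * diag(2))  # covariate = 1
be <- c(0.5, -0.5, 1, 0)
out <- prob_multinomial_sketch(X, be, ind = c(1, 2, 2, 1))
```

Computing probabilities once per distinct configuration and expanding them through `ind` avoids repeating the same calculation for units that share covariate values.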
Searches for the global maximum of the log-likelihood for each value in a given vector of candidate numbers of latent classes.
search.model(S, yv = rep(1,ns), kv, X = NULL, link = 0, disc = 0, difl = 0, multi = 1:J, fort = FALSE, tol = 10^-10, nrep = 2, glob = FALSE, disp=FALSE)
S |
matrix of all response sequences observed at least once in the sample and listed row-by-row (use 999 for missing response) |
yv |
vector of the frequencies of every response configuration in S |
kv |
vector of the possible numbers of latent classes |
X |
matrix of covariates that affect the weights |
link |
type of link function (1 = global logits, 2 = local logits); with global logits the Graded Response model results; with local logits the Partial Credit model results (with dichotomous responses, global logits is the same as using local logits resulting in the Rasch or the 2PL model depending on the value assigned to disc) |
disc |
indicator of constraints on the discriminating indices (0 = all equal to one, 1 = free) |
difl |
indicator of constraints on the difficulty levels (0 = free, 1 = rating scale parametrization) |
multi |
matrix with a number of rows equal to the number of dimensions and elements in each row equal to the indices of the items measuring the dimension corresponding to that row |
fort |
to use fortran routines when possible |
tol |
tolerance level for checking convergence of the algorithm as relative difference between consecutive log-likelihoods |
nrep |
number of repetitions of each random initialization |
glob |
to use global logits in the covariates |
disp |
to display partial output |
out.single |
output of each single model (as from est_multi_poly) for each k in kv |
bicv |
value of BIC index for each k in kv |
lkv |
value of log-likelihood for each k in kv |
Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT)
Bartolucci, F. (2007), A class of multidimensional IRT models for testing unidimensionality and clustering items, Psychometrika, 72, 141-157.
Bacci, S., Bartolucci, F. and Gnaldi, M. (2012), A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses, Technical report, http://arxiv.org/abs/1201.4667.
## Not run:
## Search Multidimensional LC IRT models for binary responses
# Aggregate data
data(naep)
X = as.matrix(naep)
out = aggr_data(X)
S = out$data_dis
yv = out$freq
# Define matrix to allocate each item on one dimension
multi1 = rbind(c(1,2,9,10),c(3,5,8,11),c(4,6,7,12))
out2 = search.model(S, yv = yv, kv=c(1:4),multi=multi1)
## End(Not run)
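The returned vectors lkv and bicv are typically used to choose the number of latent classes. The sketch below shows that step in base R using the standard definition BIC = -2 * log-likelihood + (number of parameters) * log(sample size). All numbers are made up for illustration and are not results from the naep data; the vector `npv` of parameter counts is an assumed input.

```r
# Hypothetical fit summaries for k = 1, ..., 4 (illustrative numbers only,
# not output of search.model)
kv  <- 1:4
lkv <- c(-1050.2, -1003.7, -995.1, -993.8)  # maximized log-likelihoods
npv <- c(12, 14, 16, 18)                    # numbers of free parameters
n   <- 500                                  # sample size

# Standard BIC: smaller is better
bicv <- -2 * lkv + npv * log(n)
k_best <- kv[which.min(bicv)]
```

With these illustrative values the log-likelihood keeps increasing with k, while the BIC penalty makes an intermediate number of classes preferable.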
Given a matrix of support points X and a corresponding vector of probabilities piv, it computes the mean for each dimension, the variance-covariance matrix, the correlation matrix, the Spearman correlation matrix, and the standardized matrix Y.
standard.matrix(X,piv)
X |
matrix of support points for the distribution included row by row |
piv |
vector of probabilities with the same number of elements as the rows of X |
mu |
vector of the means |
V |
variance-covariance matrix |
si2 |
vector of the variances |
si |
vector of standard deviations |
Cor |
Bravais-Pearson correlation matrix |
Sper |
Spearman correlation matrix |
Y |
matrix of standardized support points |
Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT)
## Example of standardization of a randomly generated distribution
X = matrix(rnorm(100),20,5)
piv = runif(20); piv = piv/sum(piv)
out = standard.matrix(X,piv)
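For readers who want to see the underlying arithmetic, the following base R sketch computes the weighted moments of a discrete distribution with support points in the rows of X and probabilities piv. It is a hypothetical re-implementation for illustration (`standardize_sketch` is not part of the package) and it omits the Spearman correlation matrix.

```r
# Hypothetical sketch (not the package's standard.matrix): moments of a
# discrete distribution with support points X (rows) and probabilities piv.
standardize_sketch <- function(X, piv) {
  mu <- as.vector(t(X) %*% piv)   # weighted means, one per dimension
  Xc <- sweep(X, 2, mu)           # centered support points
  V <- t(Xc) %*% (piv * Xc)       # weighted variance-covariance matrix
  si2 <- diag(V)                  # variances
  si <- sqrt(si2)                 # standard deviations
  Cor <- V / outer(si, si)        # Bravais-Pearson correlation matrix
  Y <- sweep(Xc, 2, si, "/")      # standardized support points
  list(mu = mu, V = V, si2 = si2, si = si, Cor = Cor, Y = Y)
}

set.seed(1)
X <- matrix(rnorm(100), 20, 5)
piv <- runif(20); piv <- piv / sum(piv)
out <- standardize_sketch(X, piv)
```

By construction, the standardized support points in `out$Y` have weighted mean 0 and weighted variance 1 in every dimension.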
Writes the output from class_item in a readable form.
## S3 method for class 'class_item'
summary(object, ...)
object |
output from class_item |
... |
further arguments passed to or from other methods |
table |
summary of all the results |
Francesco Bartolucci - University of Perugia (IT)
Writes the output from est_multi_poly in a readable form.
## S3 method for class 'est_multi_poly'
summary(object, ...)
object |
output from est_multi_poly |
... |
further arguments passed to or from other methods |
Francesco Bartolucci - University of Perugia (IT)
Writes the output from est_multi_poly_clust in a readable form.
## S3 method for class 'est_multi_poly_clust'
summary(object, ...)
object |
output from est_multi_poly_clust |
... |
further arguments passed to or from other methods |
Francesco Bartolucci - University of Perugia (IT)
Writes the output from test_dim in a readable form.
## S3 method for class 'test_dim'
summary(object, ...)
object |
output from test_dim |
... |
further arguments passed to or from other methods |
table |
summary of all the results |
Francesco Bartolucci - University of Perugia (IT)
The function tests a certain multidimensional model (restricted model) against a larger multidimensional model based on a higher number of dimensions. A typical example is testing a unidimensional model (and then the hypothesis of unidimensionality) against a bidimensional model. Both models are estimated by est_multi_poly.
test_dim(S, yv, k, link = 1, disc = 0, difl = 0, multi0 = 1:J, multi1, tol = 10^-10, disp = FALSE)
S |
matrix of all response sequences observed at least once in the sample and listed row-by-row (use 999 for missing response) |
yv |
vector of the frequencies of every response configuration in S |
k |
number of ability levels (or latent classes) |
link |
type of link function (1 = global logits, 2 = local logits); with global logits the Graded Response model results; with local logits the Partial Credit model results (with dichotomous responses, global logits is the same as using local logits resulting in the Rasch or the 2PL model depending on the value assigned to disc) |
disc |
indicator of constraints on the discriminating indices (0 = all equal to one, 1 = free) |
difl |
indicator of constraints on the difficulty levels (0 = free, 1 = rating scale parametrization) |
multi0 |
matrix specifying the multidimensional structure of the restricted model |
multi1 |
matrix specifying the multidimensional structure of the larger model |
tol |
tolerance level for checking convergence of the algorithm as relative difference between consecutive log-likelihoods |
disp |
to display intermediate output |
out0 |
output for the restricted model obtained from est_multi_poly |
out1 |
output for the larger model obtained from est_multi_poly |
dev |
likelihood ratio statistic |
df |
number of degrees of freedom of the test |
pv |
p-value for the test |
call |
command used to call the function |
Francesco Bartolucci, Silvia Bacci, Michela Gnaldi - University of Perugia (IT)
Bartolucci, F. (2007), A class of multidimensional IRT models for testing unidimensionality and clustering items, Psychometrika, 72, 141-157.
Bacci, S., Bartolucci, F. and Gnaldi, M. (2012), A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses, Technical report, http://arxiv.org/abs/1201.4667.
## Computation of the LR statistic testing unidimensionality on HADS data
# Aggregate data
data(hads)
X = as.matrix(hads)
out = aggr_data(X)
S = out$data_dis
yv = out$freq
# Define matrix to allocate each item on one dimension
multi1 = rbind(c(2,6,7,8,10,11,12),c(1,3,4,5,9,13,14))
# Compare unidimensional vs bidimensional Graded Response models with free
# discrimination and free difficulty parameters
# with less severe tolerance level (to be increased)
out = test_dim(S,yv,k=3,link=1,disc=1,multi1=multi1,tol=5*10^-4)
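The quantities dev, df and pv follow the usual likelihood ratio construction for nested models. The base R sketch below shows that computation under the assumption that the degrees of freedom equal the difference in the number of free parameters; the helper `lr_test_sketch` and the numeric inputs are hypothetical and are not results from the HADS data.

```r
# Hypothetical helper (not part of the package): likelihood ratio test of a
# restricted model (lk0, np0) against a larger nested model (lk1, np1).
lr_test_sketch <- function(lk0, np0, lk1, np1) {
  dev <- 2 * (lk1 - lk0)                      # likelihood ratio statistic
  df <- np1 - np0                             # assumed degrees of freedom
  pv <- pchisq(dev, df, lower.tail = FALSE)   # chi-squared p-value
  list(dev = dev, df = df, pv = pv)
}

# Illustrative numbers only (not output of test_dim)
res <- lr_test_sketch(lk0 = -3200.4, np0 = 59, lk1 = -3190.1, np1 = 60)
```

A small p-value leads to rejecting the restricted model, for instance rejecting unidimensionality in favor of the bidimensional structure given by multi1.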