Title: | Response Probability Functions |
---|---|
Description: | Factor out logic and math common to Item Factor Analysis fitting, diagnostics, and analysis. It is envisioned as core support code suitable for more specialized IRT packages to build upon. Complete access to optimized C functions is made available with R_RegisterCCallable(). This software is described in Pritikin & Falk (2020) <doi:10.1177/0146621620929431>. |
Authors: | Joshua Pritikin [cre, aut], Jonathan Weeks [ctb], Li Cai [ctb], Carrie Houts [ctb], Phil Chalmers [ctb], Michael D. Hunter [ctb], Carl F. Falk [ctb] |
Maintainer: | Joshua Pritikin <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.14 |
Built: | 2024-12-16 07:06:49 UTC |
Source: | CRAN |
Factor out logic and math common to Item Factor Analysis fitting, diagnostics, and analysis. It is envisioned as core support code suitable for more specialized IFA packages to build upon.
This package provides optimized, low-level functions to map parameters to response probabilities for dichotomous (1PL, 2PL and 3PL; rpf.drm) and polytomous (graded response, rpf.grm; partial credit/generalized partial credit, via the nominal model; and nominal, rpf.nrm) items.
Item model parameters are passed around as a numeric vector. A 1D matrix is also acceptable. Regardless of model, parameters are always ordered as follows: discrimination/slope ("a"), difficulty/intercept ("b"), and pseudo guessing/upper-bound ("g"/"u"). If person ability ranges from negative to positive then probabilities are output from incorrect to correct. That is, a low ability person (e.g., ability = -2) will be more likely to get an item incorrect than correct. For example, a dichotomous model that returns [.25, .75] indicates a probability of .25 for incorrect and .75 for correct. A polytomous model will have the most incorrect probability at index 1 and the most correct probability at the maximum index.
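As a minimal illustration of the output ordering described above, the following base R sketch evaluates a two-outcome logistic item in slope-intercept form. The helper `drm_prob` and its argument names are hypothetical; the code does not call the package's C routines and only shows that probabilities run from incorrect to correct.

```r
# Hypothetical helper (not part of the package): a two-outcome logistic
# item in slope-intercept form, logit = slope * theta + intercept.
drm_prob <- function(theta, slope, intercept) {
  p_correct <- plogis(slope * theta + intercept)
  # Probabilities are ordered from incorrect to correct.
  c(incorrect = 1 - p_correct, correct = p_correct)
}

low  <- drm_prob(theta = -2, slope = 1, intercept = 0)
high <- drm_prob(theta =  2, slope = 1, intercept = 0)
print(round(low, 3))
print(round(high, 3))
```

A low-ability person (theta = -2) gets most of the probability mass on the first (incorrect) entry, and a high-ability person on the last (correct) entry.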
All models are always in the logistic metric. To obtain normal ogive discrimination parameters, divide slope parameters by rpf.ogive. Item models are estimated in slope-intercept form. Input/output matrices are arranged in the way most convenient for low-level processing in C. The maximum absolute logit is 35 because f(x) := 1-exp(x) loses accuracy around f(-35) and equals 1 at f(-38) due to the limited accuracy of double precision floating point.
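The two numerical points above can be checked in base R. This is a sketch under one stated assumption: the value of rpf.ogive is not given in this text, so the conventional logistic-to-normal scaling constant 1.702 is used in its place.

```r
# Assumption: the conventional scaling constant stands in for rpf.ogive.
ogive <- 1.702

x <- seq(-6, 6, by = 0.01)
# Largest gap between the normal ogive and the rescaled logistic curve.
max_gap <- max(abs(pnorm(x) - plogis(ogive * x)))
print(max_gap)

# Dividing a logistic slope by the constant gives the normal ogive slope.
logistic_slope <- 1.702
normal_slope <- logistic_slope / ogive

# Double precision: 1 - exp(x) is still distinguishable from 1 at
# x = -35 but is exactly 1 at x = -38.
f <- function(x) 1 - exp(x)
print(f(-35) == 1)
print(f(-38) == 1)
```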
This package could also accrete functions to support plotting (but not the actual plot functions).
Pritikin, J. N., Hunter, M. D., & Boker, S. M. (2015). Modular open-source software for Item Factor Analysis. Educational and Psychological Measurement, 75(3), 458-474.
Thissen, D. and Steinberg, L. (1986). A taxonomy of item response models. Psychometrika 51(4), 567-577.
See rpf.rparam to create item parameters.
When “minItemsPerScore” is passed, EAP scores will be computed from the data and stored. Scores are required for some diagnostic tests. See discussion of “minItemsPerScore” in EAPscores.
as.IFAgroup( mxModel, data = NULL, container = NULL, ..., minItemsPerScore = NULL )
mxModel |
MxModel object |
data |
observed data (otherwise the data will be taken from the mxModel) |
container |
an MxModel in which to search for the latent distribution matrices |
... |
Not used. Forces remaining arguments to be specified by name. |
minItemsPerScore |
minimum number of items required to compute a score (also see description) |
a group with item parameters and latent distribution
A model, or group within a model, is represented as a named list.
spec |
list of response model objects |
param |
numeric matrix of item parameters |
free |
logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean |
numeric vector giving the mean of the latent distribution |
cov |
numeric matrix giving the covariance of the latent distribution |
data |
data.frame containing observed item responses, and optionally, weights and frequencies |
score |
factor scores with response patterns in rows |
weightColumn |
name of the data column containing the numeric row weights (optional) |
freqColumn |
name of the data column containing the integral row frequencies (optional) |
qwidth |
width of the quadrature expressed in Z units |
qpoints |
number of quadrature points |
minItemsPerScore |
minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
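The quadrature example above can be reproduced in base R. The point grid follows directly from the text; the normalized standard normal weights are an illustrative assumption, since this section does not say how the package weights the points.

```r
qwidth <- 2
qpoints <- 5

# Equally spaced points from -qwidth to +qwidth.
pts <- seq(-qwidth, qwidth, length.out = qpoints)
print(pts)

# Assumption for illustration: standard normal density at each point,
# normalized so that the weights sum to 1.
prior <- dnorm(pts)
prior <- prior / sum(prior)
print(round(prior, 3))
```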
If a reference column is given then only rows that are not missing on the reference column are considered. Otherwise all rows are considered.
bestToOmit(grp, omit, ref = NULL)
grp |
a list containing the model and data. See the details section. |
omit |
the maximum number of items to omit |
ref |
the reference column (optional) |
A model, or group within a model, is represented as a named list.
spec |
list of response model objects |
param |
numeric matrix of item parameters |
free |
logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean |
numeric vector giving the mean of the latent distribution |
cov |
numeric matrix giving the covariance of the latent distribution |
data |
data.frame containing observed item responses, and optionally, weights and frequencies |
score |
factor scores with response patterns in rows |
weightColumn |
name of the data column containing the numeric row weights (optional) |
freqColumn |
name of the data column containing the integral row frequencies (optional) |
qwidth |
width of the quadrature expressed in Z units |
qpoints |
number of quadrature points |
minItemsPerScore |
minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Other scoring: EAPscores(), itemOutcomeBySumScore(), observedSumScore(), omitItems(), omitMostMissing(), sumScoreEAP()
Item Factor Analysis makes two assumptions: (1) that the latent distribution is reasonably approximated by the multivariate Normal and (2) that items are conditionally independent. This test examines the second assumption. The presence of locally dependent items can inflate the precision of estimates causing a test to seem more accurate than it really is.
ChenThissen1997( grp, ..., data = NULL, inames = NULL, qwidth = 6, qpoints = 49, method = "pearson", .twotier = TRUE, .parallel = TRUE )
grp |
a list containing the model and data. See the details section. |
... |
Not used. Forces remaining arguments to be specified by name. |
data |
|
inames |
a subset of items to examine |
qwidth |
|
qpoints |
|
method |
method to use to calculate P values. The default is the Pearson X^2 statistic. Use "lr" for the similar likelihood ratio statistic. |
.twotier |
whether to enable the two-tier optimization |
.parallel |
whether to take advantage of multiple CPUs (default TRUE) |
Statistically significant entries suggest that the item pair has local dependence. Since log(.01)=-4.6, an absolute magnitude of 5 is a reasonable cut-off. Positive entries indicate that the two item residuals are more correlated than expected. These items may share an unaccounted-for latent dimension. Consider a redesign of the items or the use of testlets for scoring. Negative entries indicate that the two item residuals are less correlated than expected.
a list with raw, pval and detail. The pval matrix is a lower triangular matrix of log P values with the sign determined by relative association between the observed and expected tables (see ordinal.gamma)
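To make the cut-off concrete, the sketch below applies the rule of thumb to a made-up matrix of signed log P values. The matrix `pval` here is hypothetical data laid out like the lower triangular result described above; it is not output from ChenThissen1997.

```r
# log(.01) is about -4.6, which motivates a cut-off of 5 in absolute value.
print(log(0.01))

# Hypothetical lower triangular matrix of signed log P values for 3 items.
items <- paste0("i", 1:3)
pval <- matrix(NA_real_, 3, 3, dimnames = list(items, items))
pval[2, 1] <- -1.2
pval[3, 1] <- 6.8   # residuals more correlated than expected
pval[3, 2] <- -5.4  # residuals less correlated than expected

# Item pairs whose absolute entry exceeds the cut-off.
flagged <- which(abs(pval) > 5, arr.ind = TRUE)
print(flagged)
```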
A model, or group within a model, is represented as a named list.
spec |
list of response model objects |
param |
numeric matrix of item parameters |
free |
logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean |
numeric vector giving the mean of the latent distribution |
cov |
numeric matrix giving the covariance of the latent distribution |
data |
data.frame containing observed item responses, and optionally, weights and frequencies |
score |
factor scores with response patterns in rows |
weightColumn |
name of the data column containing the numeric row weights (optional) |
freqColumn |
name of the data column containing the integral row frequencies (optional) |
qwidth |
width of the quadrature expressed in Z units |
qpoints |
number of quadrature points |
minItemsPerScore |
minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Chen, W.-H. & Thissen, D. (1997). Local dependence indexes for item pairs using Item Response Theory. Journal of Educational and Behavioral Statistics, 22(3), 265-289.
Thissen, D., Steinberg, L., & Mooney, J. A. (1989). Trace lines for testlets: A use of multiple-categorical-response models. Journal of Educational Measurement, 26(3), 247–260.
Wainer, H. & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24(3), 185–201.
Other diagnostic: SitemFit1(), SitemFit(), multinomialFit(), rpf.1dim.fit(), sumScoreEAPTest()
The base class for 1 dimensional response probability functions.
Unidimensional dichotomous item models (1PL, 2PL, and 3PL).
Unidimensional generalized partial credit monotonic polynomial.
This class contains methods common to both the generalized partial credit model and the graded response model.
The unidimensional graded response item model.
Unidimensional graded response monotonic polynomial.
Unidimensional logistic function of a monotonic polynomial.
Item specifications should not be modified after creation.
The base class for multi-dimensional response probability functions.
Multidimensional dichotomous item models (M1PL, M2PL, and M3PL).
This class contains methods common to both the generalized partial credit model and the graded response model.
The multidimensional graded response item model.
The multiple-choice response item model (both unidimensional and multidimensional models have the same parameterization).
The nominal response item model (both unidimensional and multidimensional models have the same parameterization).
Collapse small sample size categorical frequency counts
collapseCategoricalCells(observed, expected, minExpected = 1)
observed |
the observed frequency table |
expected |
the expected frequency table |
minExpected |
the minimum expected cell frequency |
Pearson's X^2 test requires some minimum frequency per cell to avoid an inflated false positive rate. This function will merge cells with the lowest frequency counts until all the counts are above the minimum threshold. Cells that have been merged are filled with NAs. The resulting tables and the number of merged cells are returned.
O = matrix(c(7,31,42,20,0), 1,5)
E = matrix(c(3,39,50,8,0), 1,5)
collapseCategoricalCells(O,E,9)
Compress a data frame into unique rows and frequency counts.
compressDataFrame(tabdata, freqColName = "freq", .asNumeric = FALSE)
tabdata |
An object of class |
freqColName |
Column name to contain the frequencies |
.asNumeric |
logical. Whether to cast the frequencies to the numeric type |
Returns a compressed data frame
df <- as.data.frame(matrix(c(sample.int(2, 30, replace=TRUE)), 10, 3))
compressDataFrame(df)
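Conceptually, compression counts how often each unique row occurs, and expansion repeats each unique row by its count. The base R sketch below illustrates that idea on made-up data; it is not the package's implementation, and the row order of its result may differ from what compressDataFrame returns.

```r
# Made-up responses: three identical rows and one distinct row.
df <- data.frame(i1 = c(1, 1, 2, 1), i2 = c(2, 2, 1, 2))

# Compress: count how often each unique row occurs.
df$freq <- 1L
compressed <- aggregate(freq ~ i1 + i2, data = df, FUN = sum)
print(compressed)

# Expand: repeat each unique row according to its frequency.
idx <- rep(seq_len(nrow(compressed)), compressed$freq)
expanded <- compressed[idx, c("i1", "i2")]
rownames(expanded) <- NULL
print(expanded)
```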
crosstabTest(ob, ex, trials)
ob |
observed table |
ex |
expected table |
trials |
number of Monte-Carlo trials |
If you have missing data then you must specify minItemsPerScore. This option will set scores to NA when there are too few items to make an accurate score estimate. If you are using the scores as point estimates without considering the standard error then you should set minItemsPerScore as high as you can tolerate. This will increase the amount of missing data but scores will be more accurate. If you are carefully considering the standard errors of the scores then you can set minItemsPerScore to 1. This will mimic the behavior of most other IFA software wherein scores are estimated if there is at least 1 non-NA item for the score. However, it may make more sense to set minItemsPerScore to 0. When set to 0, all NA rows are scored to the prior distribution.
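The effect of the threshold can be previewed in base R before scoring. This is only a sketch on made-up data: it counts non-missing items per row and marks which rows would meet a given minItemsPerScore, without computing any scores.

```r
# Made-up responses with missing values.
responses <- data.frame(
  i1 = c(1, NA, NA),
  i2 = c(2, 1, NA),
  i3 = c(1, NA, NA)
)
minItemsPerScore <- 2L

# Number of non-missing items in each row.
n_observed <- rowSums(!is.na(responses))

# Rows below the threshold would have their scores set to NA.
scorable <- n_observed >= minItemsPerScore
print(data.frame(n_observed, scorable))
```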
EAPscores(grp, ..., compressed = FALSE)
grp |
a list containing the model and data. See the details section. |
... |
Not used. Forces remaining arguments to be specified by name. |
compressed |
output one score per observed data row even when freqColumn is set (default FALSE) |
Output is not affected by the presence of a weightColumn.
A model, or group within a model, is represented as a named list.
spec |
list of response model objects |
param |
numeric matrix of item parameters |
free |
logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean |
numeric vector giving the mean of the latent distribution |
cov |
numeric matrix giving the covariance of the latent distribution |
data |
data.frame containing observed item responses, and optionally, weights and frequencies |
score |
factor scores with response patterns in rows |
weightColumn |
name of the data column containing the numeric row weights (optional) |
freqColumn |
name of the data column containing the integral row frequencies (optional) |
qwidth |
width of the quadrature expressed in Z units |
qpoints |
number of quadrature points |
minItemsPerScore |
minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Other scoring: bestToOmit(), itemOutcomeBySumScore(), observedSumScore(), omitItems(), omitMostMissing(), sumScoreEAP()
spec <- list()
spec[1:3] <- list(rpf.grm(outcomes=3))
param <- sapply(spec, rpf.rparam)
data <- rpf.sample(5, spec, param)
colnames(param) <- colnames(data)
grp <- list(spec=spec, param=param, data=data, minItemsPerScore=1L)
EAPscores(grp)
Expand a summary table of unique response patterns to a full sized data-set.
expandDataFrame(tabdata, freqName = NULL)
tabdata |
An object of class |
freqName |
Column name containing the frequencies |
Returns a data frame with all the response patterns
Based on code by Phil Chalmers [email protected]
data(LSAT7)
expandDataFrame(LSAT7, freqName="freq")
Convert factor loadings to response function slopes
fromFactorLoading(loading, ogive = rpf.ogive)
loading |
a matrix with items in the rows and factors in the columns |
ogive |
the ogive constant (default rpf.ogive) |
a slope matrix with items in the columns and factors in the rows
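A common textbook relation between a factor loading and a logistic slope is slope = D * loading / sqrt(1 - sum of squared loadings for the item). The sketch below assumes that relation and D = 1.702; the helpers `loading_to_slope` and `slope_to_loading` are hypothetical names, not the package's functions, and are shown only to illustrate the matrix layouts and the round trip.

```r
D <- 1.702  # assumed value of the ogive constant

# Items in rows, factors in columns (layout of the `loading` argument).
loading <- rbind(i1 = c(f1 = 0.5, f2 = 0.3),
                 i2 = c(f1 = 0.7, f2 = 0.2))

loading_to_slope <- function(loading, D) {
  uniqueness <- 1 - rowSums(loading^2)
  # Transpose so that items end up in columns and factors in rows.
  t(D * loading / sqrt(uniqueness))
}

slope_to_loading <- function(slope, D) {
  s <- t(slope) / D            # back to items in rows
  s / sqrt(1 + rowSums(s^2))
}

slope <- loading_to_slope(loading, D)
back <- slope_to_loading(slope, D)
print(round(slope, 3))
```

Under these assumptions the second helper inverts the first, so `back` reproduces `loading`.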
Other factor model equivalence: fromFactorThreshold(), toFactorLoading(), toFactorThreshold()
Convert factor thresholds to response function intercepts
fromFactorThreshold(threshold, loading, ogive = rpf.ogive)
threshold |
a matrix with items in the columns and thresholds in the rows |
loading |
a matrix with items in the rows and factors in the columns |
ogive |
the ogive constant (default rpf.ogive) |
an item intercept matrix with items in the columns and intercepts in the rows
Other factor model equivalence: fromFactorLoading(), toFactorLoading(), toFactorThreshold()
Produce an item outcome by observed sum-score table
itemOutcomeBySumScore(grp, mask, interest)
grp |
a list containing the model and data. See the details section. |
mask |
a vector of logicals indicating which items to include |
interest |
index or name of the item of interest |
A model, or group within a model, is represented as a named list.
spec |
list of response model objects |
param |
numeric matrix of item parameters |
free |
logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean |
numeric vector giving the mean of the latent distribution |
cov |
numeric matrix giving the covariance of the latent distribution |
data |
data.frame containing observed item responses, and optionally, weights and frequencies |
score |
factor scores with response patterns in rows |
weightColumn |
name of the data column containing the numeric row weights (optional) |
freqColumn |
name of the data column containing the integral row frequencies (optional) |
qwidth |
width of the quadrature expressed in Z units |
qpoints |
number of quadrature points |
minItemsPerScore |
minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Other scoring: EAPscores(), bestToOmit(), observedSumScore(), omitItems(), omitMostMissing(), sumScoreEAP()
set.seed(1)
spec <- list()
spec[1:3] <- rpf.grm(outcomes=3)
param <- sapply(spec, rpf.rparam)
data <- rpf.sample(5, spec, param)
colnames(param) <- colnames(data)
grp <- list(spec=spec, param=param, data=data)
itemOutcomeBySumScore(grp, c(FALSE,TRUE,TRUE), 1L)
These data from Wright & Stone (1979, p. 31) were fit with Winsteps 3.73 using a 1PL model (slope fixed to 1).
Wright, B. D. & Stone, M. H. (1979). Best Test Design: Rasch Measurement. Univ of Chicago Social Research.
data(kct)
The logit function is a standard transformation from [0,1] (such as a probability) to the real number line. This function is exactly the same as qlogis.
logit(p, location = 0, scale = 1, lower.tail = TRUE, log.p = FALSE)
p |
a number between 0 and 1 |
location |
see qlogis |
scale |
see qlogis |
lower.tail |
see qlogis |
log.p |
see qlogis |
qlogis, plogis
logit(.5) # 0
logit(.25) # -1.098
logit(0) # -Inf
Data from Thissen (1982); contains 5 dichotomously scored items obtained from the Law School Admissions Test, section 6.
Phil Chalmers [email protected]
Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186.
data(LSAT6)
Data from Bock & Lieberman (1970); contains 5 dichotomously scored items obtained from the Law School Admissions Test, section 7.
Phil Chalmers [email protected]
Bock, R. D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35(2), 179-197.
data(LSAT7)
For degrees of freedom, we use the number of observed statistics (incorrect) instead of the number of possible response patterns (correct) (see Bock, Gibbons, & Muraki, 1988, p. 265). This is not a huge problem because this test becomes poorly calibrated when the multinomial table is sparse. For more accurate p-values, you can conduct a Monte-Carlo simulation study (see examples).
multinomialFit( grp, independenceGrp, ..., method = "lr", log = TRUE, .twotier = TRUE )
grp |
a list containing the model and data. See the details section. |
independenceGrp |
the independence group |
... |
Not used. Forces remaining arguments to be specified by name. |
method |
lr (default) or pearson |
log |
whether to report p-value in log units |
.twotier |
whether to use the two-tier optimization (default TRUE) |
Rows with missing data are ignored.
The full information test is described in Bartholomew & Tzamourani (1999, Section 3).
For CFI and TLI, you must provide an independenceGrp.
A model, or group within a model, is represented as a named list.
spec |
list of response model objects |
param |
numeric matrix of item parameters |
free |
logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean |
numeric vector giving the mean of the latent distribution |
cov |
numeric matrix giving the covariance of the latent distribution |
data |
data.frame containing observed item responses, and optionally, weights and frequencies |
score |
factor scores with response patterns in rows |
weightColumn |
name of the data column containing the numeric row weights (optional) |
freqColumn |
name of the data column containing the integral row frequencies (optional) |
qwidth |
width of the quadrature expressed in Z units |
qpoints |
number of quadrature points |
minItemsPerScore |
minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Bartholomew, D. J., & Tzamourani, P. (1999). The goodness-of-fit of latent trait models in attitude measurement. Sociological Methods and Research, 27(4), 525-546.
Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12(3), 261-280.
Other diagnostic: ChenThissen1997(), SitemFit1(), SitemFit(), rpf.1dim.fit(), sumScoreEAPTest()
# Create an example IFA group
grp <- list(spec=list())
grp$spec[1:10] <- rpf.grm()
grp$param <- sapply(grp$spec, rpf.rparam)
colnames(grp$param) <- paste("i", 1:10, sep="")
grp$mean <- 0
grp$cov <- diag(1)
grp$uniqueFree <- sum(grp$param != 0)
grp$data <- rpf.sample(1000, grp=grp)

# Monte-Carlo simulation study
mcReps <- 3 # increase this to 10,000 or so
stat <- rep(NA, mcReps)
for (rx in 1:mcReps) {
  t1 <- grp
  t1$data <- rpf.sample(grp=grp)
  stat[rx] <- multinomialFit(t1)$statistic
}
sum(multinomialFit(grp)$statistic > stat)/mcReps # better p-value
When summary=TRUE, tabulation uses row frequency multiplied by row weight.
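The weighting rule can be sketched in base R on made-up data. Two assumptions are made purely for illustration: responses are coded from 0, and the sum score is the plain row sum of the item columns. The column names `freq` and `weight` are also hypothetical stand-ins for the columns named by freqColumn and weightColumn.

```r
# Made-up item responses coded 0..2, with row frequencies and weights.
dat <- data.frame(i1 = c(0, 1, 2, 1),
                  i2 = c(0, 1, 2, 1),
                  freq = c(2L, 1L, 3L, 1L),
                  weight = c(1, 0.5, 1, 2))

# Assumed sum score: the row sum over the item columns.
score <- rowSums(dat[, c("i1", "i2")])

# Each row contributes its frequency multiplied by its weight.
tab <- tapply(dat$freq * dat$weight, score, sum)
print(tab)
```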
observedSumScore(grp, ..., mask, summary = TRUE)
grp |
a list containing the model and data. See the details section. |
... |
Not used. Forces remaining arguments to be specified by name. |
mask |
a vector of logicals indicating which items to include |
summary |
whether to return a summary (default) or per-row scores |
A model, or group within a model, is represented as a named list.
spec |
list of response model objects |
param |
numeric matrix of item parameters |
free |
logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean |
numeric vector giving the mean of the latent distribution |
cov |
numeric matrix giving the covariance of the latent distribution |
data |
data.frame containing observed item responses, and optionally, weights and frequencies |
score |
factor scores with response patterns in rows |
weightColumn |
name of the data column containing the numeric row weights (optional) |
freqColumn |
name of the data column containing the integral row frequencies (optional) |
qwidth |
width of the quadrature expressed in Z units |
qpoints |
number of quadrature points |
minItemsPerScore |
minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and the order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Other scoring: EAPscores(), bestToOmit(), itemOutcomeBySumScore(), omitItems(), omitMostMissing(), sumScoreEAP()
spec <- list()
spec[1:3] <- rpf.grm(outcomes=3)
param <- sapply(spec, rpf.rparam)
data <- rpf.sample(5, spec, param)
colnames(param) <- colnames(data)
grp <- list(spec=spec, param=param, data=data)
observedSumScore(grp)
Omit the given items
omitItems(grp, excol)
grp |
a list containing the model and data. See the details section. |
excol |
vector of column names to omit |
A model, or group within a model, is represented as a named list.
list of response model objects
numeric matrix of item parameters
logical matrix of indicating which parameters are free (TRUE) or fixed (FALSE)
numeric vector giving the mean of the latent distribution
numeric matrix giving the covariance of the latent distribution
data.frame containing observed item responses, and optionally, weights and frequencies
factors scores with response patterns in rows
name of the data column containing the numeric row weights (optional)
name of the data column containing the integral row frequencies (optional)
width of the quadrature expressed in Z units
number of quadrature points
minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
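The equally spaced quadrature described in the group format can be illustrated in base R. This is only a sketch: `quad_points` is a hypothetical helper, not an rpf function, and the normalized standard normal weights are an assumption about how such a grid is commonly used, not a description of the package internals.

```r
# Hypothetical helper (not part of rpf): equally spaced points from
# -qwidth to +qwidth, as in the qwidth=2, qpoints=5 example.
quad_points <- function(qwidth, qpoints) {
  seq(-qwidth, qwidth, length.out = qpoints)
}

pts <- quad_points(qwidth = 2, qpoints = 5)

# Assumed weighting: standard normal density, normalized to sum to 1.
wts <- dnorm(pts) / sum(dnorm(pts))

print(pts)
print(round(wts, 3))
```

Keeping the grid inside the group (rather than passing it separately) means every scoring function sees the same points.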
Other scoring: EAPscores(), bestToOmit(), itemOutcomeBySumScore(), observedSumScore(), omitMostMissing(), sumScoreEAP()
Items with no missing data are never omitted, regardless of the number of items requested.
omitMostMissing(grp, omit)
grp | a list containing the model and data. See the details section. |
omit | the maximum number of items to omit |
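The selection rule (never omit complete items, omit at most `omit` items) can be sketched in base R. The helper below is hypothetical and works on a plain data.frame; it is not the rpf implementation, and how the package breaks ties is not specified here.

```r
# Hypothetical helper (not the rpf implementation): name up to `omit`
# columns with the most missing values, skipping complete columns.
most_missing_sketch <- function(data, omit) {
  n_missing <- colSums(is.na(data))
  candidates <- n_missing[n_missing > 0]
  ranked <- names(sort(candidates, decreasing = TRUE))
  head(ranked, omit)
}

toy <- data.frame(i1 = c(1, NA, NA, 2),
                  i2 = c(1, 2, 1, 2),
                  i3 = c(NA, 1, 2, 2))

# Asking for 3 items still returns only the 2 that have missing data.
most_missing_sketch(toy, omit = 3)
```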
A model, or group within a model, is represented as a named list.
spec | list of response model objects |
param | numeric matrix of item parameters |
free | logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean | numeric vector giving the mean of the latent distribution |
cov | numeric matrix giving the covariance of the latent distribution |
data | data.frame containing observed item responses, and optionally, weights and frequencies |
score | factor scores with response patterns in rows |
weightColumn | name of the data column containing the numeric row weights (optional) |
freqColumn | name of the data column containing the integral row frequencies (optional) |
qwidth | width of the quadrature expressed in Z units |
qpoints | number of quadrature points |
minItemsPerScore | minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Other scoring: EAPscores(), bestToOmit(), itemOutcomeBySumScore(), observedSumScore(), omitItems(), sumScoreEAP()
Completely order all rows in a data.frame.
orderCompletely(observed)
observed | a data.frame holding ordered factors in every column |
the sorted order of the rows
df <- as.data.frame(matrix(c(sample.int(2, 30, replace=TRUE)), 10, 3))
mask <- matrix(c(sample.int(3, 30, replace=TRUE)), 10, 3) == 1
df[mask] <- NA
df[orderCompletely(df),]
Compute the ordinal gamma association statistic
ordinal.gamma(mat)
mat | a cross tabulation matrix |
Agresti, A. (1990). Categorical data analysis. New York: Wiley.
# Example data from Agresti (1990, p. 21)
jobsat <- matrix(c(20,22,13,7,24,38,28,18,80,104,81,54,82,125,113,92), nrow=4, ncol=4)
ordinal.gamma(jobsat)
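For readers who want to see what the statistic measures, the Goodman-Kruskal gamma can be computed directly from concordant and discordant pairs. The function below is a plain base R sketch with a hypothetical name; it is not the package's optimized code and is not guaranteed to match ordinal.gamma in edge cases such as empty tables.

```r
# Hypothetical helper: gamma = (C - D) / (C + D), where C counts
# concordant pairs and D counts discordant pairs in a cross tabulation.
gamma_sketch <- function(mat) {
  nr <- nrow(mat)
  nc <- ncol(mat)
  conc <- 0
  disc <- 0
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      below <- seq_len(nr) > i
      right <- seq_len(nc) > j
      left  <- seq_len(nc) < j
      # cells below and to the right are concordant with cell (i, j)
      conc <- conc + mat[i, j] * sum(mat[below, right])
      # cells below and to the left are discordant with cell (i, j)
      disc <- disc + mat[i, j] * sum(mat[below, left])
    }
  }
  (conc - disc) / (conc + disc)
}

gamma_sketch(diag(3))          # perfect positive association
gamma_sketch(diag(3)[, 3:1])   # perfect negative association
```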
This test is an alternative to Pearson's X^2 goodness-of-fit test. In contrast to Pearson's X^2, no ad hoc cell collapsing is needed to avoid an inflated false positive rate in situations of sparse cell frequencies. The statistic rapidly converges to the Monte-Carlo estimate as the number of draws increases.
ptw2011.gof.test(observed, expected)
observed | observed matrix |
expected | expected matrix |
The P value indicating whether the two tables come from the same distribution. For example, a significant result (P < alpha level) rejects the hypothesis that the two matrices are from the same distribution.
Perkins, W., Tygert, M., & Ward, R. (2011). Computing the confidence levels for a root-mean-square test of goodness-of-fit. Applied Mathematics and Computation, 217(22), 9072-9084.
draws <- 17
observed <- matrix(c(.294, .176, .118, .411), nrow=2) * draws
expected <- matrix(c(.235, .235, .176, .353), nrow=2) * draws
ptw2011.gof.test(observed, expected)  # not significant
This was last updated in 2017 and may no longer work.
read.flexmirt(fname)
fname | file name |
Load the item parameters from a flexMIRT PRM file.
a list of groups as described in the details
A model, or group within a model, is represented as a named list.
spec | list of response model objects |
param | numeric matrix of item parameters |
free | logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean | numeric vector giving the mean of the latent distribution |
cov | numeric matrix giving the covariance of the latent distribution |
data | data.frame containing observed item responses, and optionally, weights and frequencies |
score | factor scores with response patterns in rows |
weightColumn | name of the data column containing the numeric row weights (optional) |
freqColumn | name of the data column containing the integral row frequencies (optional) |
qwidth | width of the quadrature expressed in Z units |
qpoints | number of quadrature points |
minItemsPerScore | minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Note: These statistics are only appropriate if all discrimination parameters are fixed equal and items are conditionally independent (see ChenThissen1997). A best effort is made to cope with missing data.
rpf.1dim.fit(spec, params, responses, scores, margin, group = NULL, wh.exact = TRUE)
spec | |
params | |
responses | |
scores | |
margin | for people 1, for items 2 |
group | spec, params, data, and scores can be provided in a list instead of as arguments |
wh.exact | whether to use the exact Wilson-Hilferty transformation |
Exact distributional properties of these statistics are unknown (Masters & Wright, 1997, p. 112). For details on the calculation, refer to Wright & Masters (1982, p. 100).
The Wilson-Hilferty transformation is biased for fewer than 25 items. Consider wh.exact=FALSE for fewer than 25 items.
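As background, the textbook Wilson-Hilferty (1931) cube-root approximation for a chi-square variate is sketched below in base R. This is the classic form only; the variant that rpf.1dim.fit applies to fit mean squares, and what wh.exact toggles, may differ, so `wilson_hilferty_z` should be read as a hypothetical illustration rather than the package's computation.

```r
# Classic Wilson-Hilferty cube-root approximation: maps a chi-square
# variate x with df degrees of freedom to an approximate z-score.
wilson_hilferty_z <- function(x, df) {
  ((x / df)^(1/3) - (1 - 2 / (9 * df))) / sqrt(2 / (9 * df))
}

# Compare the approximate and exact upper-tail probabilities.
x <- 30
df <- 18
approx_p <- pnorm(wilson_hilferty_z(x, df), lower.tail = FALSE)
exact_p  <- pchisq(x, df, lower.tail = FALSE)
c(approx = approx_p, exact = exact_p)
```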
A model, or group within a model, is represented as a named list.
spec | list of response model objects |
param | numeric matrix of item parameters |
free | logical matrix indicating which parameters are free (TRUE) or fixed (FALSE) |
mean | numeric vector giving the mean of the latent distribution |
cov | numeric matrix giving the covariance of the latent distribution |
data | data.frame containing observed item responses, and optionally, weights and frequencies |
score | factor scores with response patterns in rows |
weightColumn | name of the data column containing the numeric row weights (optional) |
freqColumn | name of the data column containing the integral row frequencies (optional) |
qwidth | width of the quadrature expressed in Z units |
qpoints | number of quadrature points |
minItemsPerScore | minimum number of non-missing items when estimating factor scores |
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Masters, G. N. & Wright, B. D. (1997). The Partial Credit Model. In W. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 101-121). Springer.
Wilson, E. B., & Hilferty, M. M. (1931). The distribution of chi-square. Proceedings of the National Academy of Sciences of the United States of America, 17, 684-688.
Wright, B. D. & Masters, G. N. (1982). Rating Scale Analysis. Chicago: Mesa Press.
Other diagnostic: ChenThissen1997(), SitemFit1(), SitemFit(), multinomialFit(), sumScoreEAPTest()
data(kct)
responses <- kct.people[,paste("V",2:19, sep="")]
rownames(responses) <- kct.people$NAME
colnames(responses) <- kct.items$NAME
scores <- kct.people$MEASURE
params <- cbind(1, kct.items$MEASURE, logit(0), logit(1))
rownames(params) <- kct.items$NAME
items <- list()
items[1:18] <- rpf.drm()
params[,2] <- -params[,2]
rpf.1dim.fit(items, t(params), responses, scores, 2, wh.exact=TRUE)
Popular central moments include 2 (variance) and 4 (kurtosis).
rpf.1dim.moment(spec, params, scores, m)
spec | list of item models |
params | data frame of item parameters, 1 per row |
scores | model derived person scores |
m | which moment |
moment matrix
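To make "central moment" concrete, the sketch below computes the m-th central moment of a single dichotomous response with success probability p. The helper name is hypothetical and the calculation covers only the 0/1 case; it is not the package's routine, which handles whole person-by-item matrices.

```r
# Hypothetical helper: m-th central moment of a 0/1 item response
# with success probability p, i.e. E[(X - p)^m].
central_moment_sketch <- function(p, m) {
  (1 - p) * (0 - p)^m + p * (1 - p)^m
}

p <- plogis(0.5)              # probability when ability minus difficulty is 0.5
central_moment_sketch(p, 2)   # second central moment, equals p * (1 - p)
central_moment_sketch(p, 4)   # fourth central moment
```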
Calculate residuals
rpf.1dim.residual(spec, params, responses, scores)
spec | list of item models |
params | data frame of item parameters, 1 per row |
responses | persons in rows and items in columns |
scores | model derived person scores |
residuals
Calculate standardized residuals
rpf.1dim.stdresidual(spec, params, responses, scores)
spec | list of item models |
params | data frame of item parameters, 1 per row |
responses | persons in rows and items in columns |
scores | model derived person scores |
standardized residuals
Evaluate the partial derivatives of the log likelihood with respect to each parameter at where with weight.
rpf.dLL(m, param, where, weight)
m | item model |
param | item parameters |
where | location in the latent space |
weight | per outcome weights (typically derived by observation) |
It is not easy to write an example for this function. To evaluate the derivative, you need to sum the derivatives across a quadrature. You also need response outcome weights at each quadrature point. It is not anticipated that this function will be often used in R code. It's mainly to expose a C-level function for occasional debugging.
first and second order partial derivatives of the log likelihood evaluated at where. For p parameters, the first p values are the first derivative and the next p(p+1)/2 columns are the lower triangle of the second derivative.
The numDeriv package.
For slope vector a, intercept c, pseudo-guessing parameter g, upper bound u, and latent ability vector theta, the response probability function is
$$\mathrm{P}(\mathrm{pick}=1 \mid a,c,g,u,\theta) = g + (u-g)\,\frac{1}{1+\exp(-(a\theta + c))}$$
rpf.drm(factors = 1, multidimensional = TRUE, poor = FALSE)
factors | the number of factors |
multidimensional | whether to use a multidimensional model. Defaults to TRUE. |
poor | if TRUE, use the traditional parameterization of the 1d model instead of the slope-intercept parameterization |
The pseudo-guessing and upper bound parameters are specified in logit units (see logit).
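A minimal base R sketch of a four-parameter logistic curve in slope-intercept form follows, assuming the curve is g + (u - g) * logistic(a * theta + c) with g and u mapped from logit units to probabilities. `drm_prob_sketch` and its argument names are hypothetical; the package's own rpf.prob should be used for real work.

```r
# Hypothetical helper (not the rpf implementation): four-parameter
# logistic curve in slope-intercept form. g and u are supplied in
# logit units and mapped to probabilities with plogis().
drm_prob_sketch <- function(slope, intercept, g, u, theta) {
  lower <- plogis(g)                      # pseudo-guessing asymptote
  upper <- plogis(u)                      # upper-bound asymptote
  p_correct <- lower + (upper - lower) * plogis(slope * theta + intercept)
  cbind(incorrect = 1 - p_correct, correct = p_correct)
}

# With g = logit(0) and u = logit(1) the curve reduces to the 2PL.
res <- drm_prob_sketch(slope = 1, intercept = 0, g = -Inf, u = Inf,
                       theta = c(-2, 0, 2))
res
```

The two columns follow the incorrect-then-correct ordering used throughout this manual.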
For discussion on the choice of priors see Cai, Yang, and Hansen (2011, p. 246).
an item model
Cai, L., Yang, J. S., & Hansen, M. (2011). Generalized Full-Information Item Bifactor Analysis. Psychological Methods, 16(3), 221-248.
Other response model: rpf.gpcmp(), rpf.grmp(), rpf.grm(), rpf.lmp(), rpf.mcm(), rpf.nrm()
spec <- rpf.drm()
rpf.prob(spec, rpf.rparam(spec), 0)
Evaluate the partial derivatives of the response probability with respect to ability. See rpf.info for an application.
rpf.dTheta(m, param, where, dir)
m | item model |
param | item parameters |
where | location in the latent distribution |
dir | if more than 1 factor, a basis vector |
This model is a polytomous model proposed by Falk & Cai (2016) and is based on the generalized partial credit model (Muraki, 1992).
rpf.gpcmp(outcomes = 2, q = 0, multidimensional = FALSE)
outcomes | The number of possible response categories. |
q | a non-negative integer that controls the order of the polynomial (2q+1) with a default of q=0 (1st order polynomial = generalized partial credit model). |
multidimensional | whether to use a multidimensional model. Defaults to FALSE. |
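The GPC-MP replaces the linear predictor part of the generalized partial credit model with a monotonic polynomial, $m(\theta;\omega,\alpha,\tau)$. The response function for category k (k = 0, ..., K-1, where K is the number of outcomes) is:
$$\mathrm{P}(\mathrm{pick}=k \mid \omega,\xi,\alpha,\tau,\theta) = \frac{\exp\left(\sum_{v=1}^{k} \left(\xi_v + m(\theta;\omega,\alpha,\tau)\right)\right)}{\sum_{u=0}^{K-1} \exp\left(\sum_{v=1}^{u} \left(\xi_v + m(\theta;\omega,\alpha,\tau)\right)\right)}$$
where an empty sum is zero and $\alpha$ and $\tau$ are vectors of length q. The GPC-MP uses the same parameterization for the polynomial as described for the logistic function of a monotonic polynomial (LMP). See also rpf.lmp.
The order of the polynomial is always odd and is controlled by the user-specified non-negative integer, q. The model contains 1+(outcomes-1)+2*q parameters, which are used as input to the rpf.prob function in the following order:
$\omega$ - natural log of the slope of the item model when q=0,
$\xi$ - an (outcomes-1)-length vector of intercept parameters, and
$\alpha$ and $\tau$ - two parameters that control bends in the polynomial. These latter parameters are repeated in the same order for models with q>0.
For example, a q=2 polynomial with 3 categories will have an item parameter vector of: $\omega, \xi_1, \xi_2, \alpha_1, \tau_1, \alpha_2, \tau_2$.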
Note that the GPC-MP reduces to the LMP when the number of categories is 2, and the GPC-MP reduces to the generalized partial credit model when the order of the polynomial is 1 (i.e., q=0).
an item model
Falk, C. F., & Cai, L. (2016). Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika, 81, 434-460. doi:10.1007/s11336-014-9428-7
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159–176.
Other response model: rpf.drm(), rpf.grmp(), rpf.grm(), rpf.lmp(), rpf.mcm(), rpf.nrm()
spec <- rpf.gpcmp(5,2) # 5-category, 5th order polynomial (q=2)
theta <- seq(-3,3,.1)
p <- rpf.prob(spec, c(1.02,3.48,2.5,-.25,-1.64,.89,-8.7,-.74,-8.99), theta)
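For outcomes k in 0 to K, slope vector a, intercept vector c, and latent ability vector theta, the response probability function is
$$\begin{aligned}
\mathrm{P}(\mathrm{pick}=0 \mid a,c,\theta) &= 1 - \frac{1}{1+\exp(-(a\theta + c_1))} \\
\mathrm{P}(\mathrm{pick}=k \mid a,c,\theta) &= \frac{1}{1+\exp(-(a\theta + c_k))} - \frac{1}{1+\exp(-(a\theta + c_{k+1}))} \\
\mathrm{P}(\mathrm{pick}=K \mid a,c,\theta) &= \frac{1}{1+\exp(-(a\theta + c_K))}
\end{aligned}$$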
rpf.grm(outcomes = 2, factors = 1, multidimensional = TRUE)
outcomes | The number of choices available |
factors | the number of factors |
multidimensional | whether to use a multidimensional model. Defaults to TRUE. |
The graded response model was designed for an item with a series of dependent parts where a higher score implies that easier parts of the item were surmounted. If there is any chance your polytomous item has independent parts then consider rpf.nrm.
If your categories cannot cross then the graded response model provides a little more information than the nominal model. Stronger a priori assumptions provide more power at the cost of flexibility.
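The "series of dependent parts" idea can be sketched in base R: each boundary curve gives the probability of reaching at least category k, and category probabilities are differences of adjacent boundaries. `grm_prob_sketch` is a hypothetical helper for one person and one factor, assuming logistic boundaries of the form logistic(a * theta + c_k) with decreasing intercepts; it is not the package's implementation.

```r
# Hypothetical helper (not the rpf implementation): category
# probabilities for one person from K boundary curves.
grm_prob_sketch <- function(slope, intercepts, theta) {
  # P(pick >= k) for k = 1..K; intercepts are assumed to be decreasing
  boundary <- plogis(slope * theta + intercepts)
  # Differences of adjacent boundaries, padded with 1 and 0
  -diff(c(1, boundary, 0))
}

p <- grm_prob_sketch(slope = 1.2, intercepts = c(1, 0, -1), theta = 0.3)
p
sum(p)
```

Because the boundaries are ordered, every difference is non-negative, which is the sense in which the categories "cannot cross".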
an item model
Other response model: rpf.drm(), rpf.gpcmp(), rpf.grmp(), rpf.lmp(), rpf.mcm(), rpf.nrm()
spec <- rpf.grm()
rpf.prob(spec, rpf.rparam(spec), 0)
The GR-MP model replaces the linear predictor of the graded response model (Samejima, 1969, 1972) with a monotonic polynomial (Falk, conditionally accepted).
rpf.grmp(outcomes = 2, q = 0, multidimensional = FALSE)
outcomes | The number of possible response categories. When equal to 2, the model reduces to the logistic function of a monotonic polynomial (LMP). |
q | a non-negative integer that controls the order of the polynomial (2q+1) with a default of q=0 (1st order polynomial = graded response model). |
multidimensional | whether to use a multidimensional model. Defaults to FALSE. |
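Given its relationship to the graded response model, the GR-MP is constructed in an analogous way. For outcomes k in 0 to K:
$$\begin{aligned}
\mathrm{P}(\mathrm{pick}=0 \mid \lambda,\xi,\alpha,\tau,\theta) &= 1 - \frac{1}{1+\exp(-(\xi_1 + m(\theta;\lambda,\alpha,\tau)))} \\
\mathrm{P}(\mathrm{pick}=k \mid \lambda,\xi,\alpha,\tau,\theta) &= \frac{1}{1+\exp(-(\xi_k + m(\theta;\lambda,\alpha,\tau)))} - \frac{1}{1+\exp(-(\xi_{k+1} + m(\theta;\lambda,\alpha,\tau)))} \\
\mathrm{P}(\mathrm{pick}=K \mid \lambda,\xi,\alpha,\tau,\theta) &= \frac{1}{1+\exp(-(\xi_K + m(\theta;\lambda,\alpha,\tau)))}
\end{aligned}$$
The order of the polynomial is always odd and is controlled by the user-specified non-negative integer, q. The model contains 1+(outcomes-1)+2*q parameters, which are used as input to the rpf.prob or rpf.dTheta functions in the following order:
$\lambda$ - slope of the item model when q=0,
$\xi$ - an (outcomes-1)-length vector of intercept parameters, and
$\alpha$ and $\tau$ - two parameters that control bends in the polynomial. These latter parameters are repeated in the same order for models with q>0.
For example, a q=2 polynomial with 3 categories will have an item parameter vector of: $\lambda, \xi_1, \xi_2, \alpha_1, \tau_1, \alpha_2, \tau_2$.
As with other monotonic polynomial-based item models (e.g., rpf.lmp), the polynomial looks like the following:
$$m(\theta;\lambda,\alpha,\tau) = b_1\theta + b_2\theta^2 + \dots + b_{2q+1}\theta^{2q+1}$$
However, the coefficients, b, are not directly estimated, but are a function of the item parameters, and the parameterization of the GR-MP is different from that currently appearing for the logistic function of a monotonic polynomial (LMP; rpf.lmp) and monotonic polynomial generalized partial credit (GPC-MP; rpf.gpcmp) models. In particular, the polynomial is parameterized such that boundary discrimination functions for the GR-MP will be all monotonically increasing or decreasing for any given item. This allows the possibility of items that load either negatively or positively on the latent trait, as is common with reverse-worded items in non-cognitive tests (e.g., personality). The derivative $m'(\theta)$ is parameterized in the following way:
$$m'(\theta) = \begin{cases} \lambda & \text{if } q = 0 \\ \lambda \prod_{u=1}^{q} \left(1 - 2\alpha_u\theta + (\alpha_u^2 + \exp(\tau_u))\theta^2\right) & \text{otherwise} \end{cases}$$
Note that the only difference between the GR-MP and these other models is that $\lambda$ is not re-parameterized and may take on negative values. When $\lambda$ is negative, it is analogous to having a negative loading or a monotonically decreasing function.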
an item model
Falk, C. F. (conditionally accepted). The monotonic polynomial graded response model: Implementation and a comparative study. Applied Psychological Measurement.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometric Monographs, 17.
Samejima, F. (1972). A general model of free-response data. Psychometric Monographs, 18.
Other response model: rpf.drm(), rpf.gpcmp(), rpf.grm(), rpf.lmp(), rpf.mcm(), rpf.nrm()
spec <- rpf.grmp(5,2) # 5-category, 5th order polynomial (q=2)
theta <- seq(-3,3,.1)
p <- rpf.prob(spec, c(2.77,2,1,0,-1,.89,-8.7,-.74,-8.99), theta)
This is an internal function and should not be used.
rpf.id_of(name)
name | name of the item model (string) |
the integer ID assigned to the given model
Map an item model, item parameters, and person trait score into an information vector
rpf.info(ii, ii.p, where, basis = 1)
ii | an item model |
ii.p | item parameters |
where | the location in the latent distribution |
basis | if more than 1 factor, a positive basis vector |
Fisher information
Dodd, B. G., De Ayala, R. J., & Koch, W. R. (1995). Computerized adaptive testing with polytomous items. Applied Psychological Measurement, 19(1), 5-22.
i1 <- rpf.drm()
i1.p <- c(.6,1,.1,.95)
theta <- seq(0,3,.05)
plot(theta, rpf.info(i1, i1.p, t(theta)), type="l")
This model is a dichotomous response model originally proposed by Liang (2007) and is implemented using the parameterization by Falk & Cai (2016).
rpf.lmp(q = 0, multidimensional = FALSE)
q | a non-negative integer that controls the order of the polynomial (2q+1) with a default of q=0 (1st order polynomial = 2PL). |
multidimensional | whether to use a multidimensional model. Defaults to FALSE. |
The LMP model replaces the linear predictor part of the two-parameter logistic function with a monotonic polynomial, $m(\theta;\omega,\alpha,\tau)$,
$$\mathrm{P}(\mathrm{pick}=1 \mid \omega,\xi,\alpha,\tau,\theta) = \frac{1}{1+\exp(-(\xi + m(\theta;\omega,\alpha,\tau)))}$$
where $\alpha$ and $\tau$ are vectors of length q.
The order of the polynomial is always odd and is controlled by the user-specified non-negative integer, q. The model contains 2+2*q parameters, which are used in conjunction with the rpf.prob or rpf.dTheta function in the following order:
$\omega$ - the natural log of the slope of the item model when q=0,
$\xi$ - the intercept, and
$\alpha$ and $\tau$ - two parameters that control bends in the polynomial. These latter parameters are repeated in the same order for models with q>0.
For example, a q=2 polynomial will have an item parameter vector of: $\omega, \xi, \alpha_1, \tau_1, \alpha_2, \tau_2$.
In general, the polynomial looks like the following:
$$m(\theta;\omega,\alpha,\tau) = b_1\theta + b_2\theta^2 + \dots + b_{2q+1}\theta^{2q+1}$$
However, the coefficients, b, are not directly estimated, but are a function of the item parameters. In particular, the derivative $m'(\theta)$ is parameterized in the following way:
$$m'(\theta) = \begin{cases} \exp(\omega) & \text{if } q = 0 \\ \exp(\omega) \prod_{u=1}^{q} \left(1 - 2\alpha_u\theta + (\alpha_u^2 + \exp(\tau_u))\theta^2\right) & \text{otherwise} \end{cases}$$
See Falk & Cai (2016) for more details as to how the polynomial is constructed. At the lowest order polynomial (q=0) the model reduces to the two-parameter logistic (2PL) model. However, parameterization of the slope parameter, $\omega$, is currently different from the 2PL (i.e., slope = exp($\omega$)). This parameterization ensures that the response function is always monotonically increasing without requiring constrained optimization.
For an alternative parameterization that releases constraints on $\omega$, allowing for monotonically decreasing functions, see rpf.grmp. And for polytomous items, see both rpf.grmp and rpf.gpcmp.
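The construction can be sketched in base R under one stated assumption: that the derivative of the polynomial is exp(omega) multiplied by one positive quadratic factor, 1 - 2*alpha*theta + (alpha^2 + exp(tau))*theta^2, per (alpha, tau) pair, as in Falk & Cai (2016). The helpers `poly_mult`, `lmp_coef_sketch`, and `lmp_prob_sketch` are hypothetical and are not guaranteed to reproduce rpf.prob output.

```r
# Multiply two polynomials given as coefficient vectors
# (increasing powers, starting at theta^0).
poly_mult <- function(p, q) {
  out <- numeric(length(p) + length(q) - 1)
  for (i in seq_along(p)) {
    idx <- i + seq_along(q) - 1
    out[idx] <- out[idx] + p[i] * q
  }
  out
}

# Coefficients b_1..b_(2q+1) of m(theta), obtained by expanding the
# assumed derivative and integrating term by term (no constant term).
lmp_coef_sketch <- function(omega, alpha = numeric(0), tau = numeric(0)) {
  d <- exp(omega)
  for (u in seq_along(alpha)) {
    d <- poly_mult(d, c(1, -2 * alpha[u], alpha[u]^2 + exp(tau[u])))
  }
  d / seq_along(d)
}

# Probability of the higher category: logistic(xi + m(theta)).
lmp_prob_sketch <- function(theta, omega, xi,
                            alpha = numeric(0), tau = numeric(0)) {
  b <- lmp_coef_sketch(omega, alpha, tau)
  m <- vapply(theta, function(t) sum(b * t^seq_along(b)), numeric(1))
  plogis(xi + m)
}

theta <- seq(-3, 3, .1)
p3 <- lmp_prob_sketch(theta, omega = -.11, xi = .37, alpha = .24, tau = -.21)
range(p3)
```

Each quadratic factor has a negative discriminant, so the derivative stays positive and the curve is increasing for any real parameter values; this is why no constrained optimization is needed.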
an item model
Falk, C. F., & Cai, L. (2016). Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika, 81, 434-460. doi:10.1007/s11336-014-9428-7
Liang (2007). A semi-parametric approach to estimating item response functions. Unpublished doctoral dissertation, Department of Psychology, The Ohio State University.
Other response model: rpf.drm(), rpf.gpcmp(), rpf.grmp(), rpf.grm(), rpf.mcm(), rpf.nrm()
spec <- rpf.lmp(1) # 3rd order polynomial
theta <- seq(-3,3,.1)
p <- rpf.prob(spec, c(-.11,.37,.24,-.21), theta)
spec <- rpf.lmp(2) # 5th order polynomial
p <- rpf.prob(spec, c(.69,.71,-.5,-8.48,.52,-3.32), theta)
Note that in general, exp(rpf.logprob(..)) != rpf.prob(..) because the range of logits is much wider than the range of probabilities due to limitations of floating point numerical precision.
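The floating point issue can be seen without the package. The base R sketch below evaluates a logistic probability and its logarithm at an extreme logit; `log_logistic` is a hypothetical helper based on a standard identity and is not how rpf.logprob is necessarily implemented.

```r
# Numerically stable log of the logistic function (a standard identity,
# not taken from rpf): log(1 / (1 + exp(-x))) = min(x, 0) - log1p(exp(-|x|)).
log_logistic <- function(x) pmin(x, 0) - log1p(exp(-abs(x)))

x <- -800                      # an extreme logit
p  <- 1 / (1 + exp(-x))        # probability scale: underflows to 0
lp <- log_logistic(x)          # log scale: still finite

p
lp
log(p)                         # -Inf, the information is already lost
```

Working on the log scale keeps a usable value where the probability scale has collapsed to zero, so the two functions cannot be treated as exact inverses of each other.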
rpf.logprob(m, param, theta)
m | an item model |
param | item parameters |
theta | the trait score(s) |
a vector of log probabilities. For dichotomous items, log probabilities are returned in the order incorrect, correct. Although redundant, both incorrect and correct values are returned in the dichotomous case for API consistency with polytomous item models.
i1 <- rpf.drm()
i1.p <- rpf.rparam(i1)
rpf.logprob(i1, c(i1.p), -1)      # low trait score
rpf.logprob(i1, c(i1.p), c(0,1))  # average and high trait score
rpf.mcm(outcomes = 2, numChoices = 5, factors = 1)
outcomes | the number of possible outcomes |
numChoices | the number of choices available |
factors | the number of factors |
This function instantiates a multiple-choice response model.
an item model
Jonathan Weeks <[email protected]>
Other response model: rpf.drm(), rpf.gpcmp(), rpf.grmp(), rpf.grm(), rpf.lmp(), rpf.nrm()
This is a point estimate of the mean difficulty of items that do not offer easily interpretable parameters such as the Generalized PCM. Since the information curve may not be unimodal, this function integrates across the latent space.
rpf.mean.info(spec, param, grain = 0.1)
spec | list of item specs |
param | list or matrix of item parameters |
grain | the step size for numerical integration (optional) |
rpf.mean.info1(spec, iparam, grain = 0.1)
spec | an item spec |
iparam | an item parameter vector |
grain | the step size for numerical integration (optional) |
Create a similar item specification with the given number of factors
rpf.modify(m, factors)
m | item model |
factors | the number of factors/dimensions |
s1 <- rpf.grm(factors=3)
rpf.rparam(s1)
s2 <- rpf.modify(s1, 1)
rpf.rparam(s2)
This function instantiates a nominal response model.
rpf.nrm(outcomes = 3, factors = 1, T.a = "trend", T.c = "trend")
outcomes | The number of choices available |
factors | the number of factors |
T.a | the T matrix for slope parameters |
T.c | the T matrix for intercept parameters |
The transformation matrices T.a and T.c are chosen by the analyst and not estimated. The T matrices must be invertible square matrices of size outcomes-1. As a shortcut, either T matrix can be specified as "trend" for a Fourier basis or as "id" for an identity basis. The response probability function is
$$\mathrm{P}(\mathrm{pick}=k \mid s,a_k,c_k,\theta) = C\,\exp(s\theta a_k + c_k)$$
where s is the overall slope; $a_k$ and $c_k$ are the result of multiplying two vectors of free parameters $\alpha$ and $\gamma$ by fixed matrices $T_a$ and $T_c$, respectively; $a_0$ and $c_0$ are fixed to 0 for identification; and $C$ is a normalizing factor to ensure that $\sum_k \mathrm{P}(\mathrm{pick}=k) = 1$.
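A minimal base R sketch of a nominal model of this kind follows, assuming category logits of the form s * theta * a_k + c_k with the first category fixed at zero and a softmax normalization. `nrm_prob_sketch` is hypothetical, uses identity T matrices by default, and is not the package's implementation (in particular, it does not build the "trend" basis).

```r
# Hypothetical helper (not the rpf implementation): nominal model
# probabilities for one person on one factor.
nrm_prob_sketch <- function(s, alpha, gamma, theta,
                            T_a = diag(length(alpha)),
                            T_c = diag(length(gamma))) {
  a_k <- c(0, T_a %*% alpha)   # first category fixed to 0
  c_k <- c(0, T_c %*% gamma)   # first category fixed to 0
  z <- s * theta * a_k + c_k
  z <- z - max(z)              # guard against overflow in exp()
  exp(z) / sum(exp(z))         # normalize so probabilities sum to 1
}

p <- nrm_prob_sketch(s = 1, alpha = c(1, 2), gamma = c(0.5, -0.5), theta = 0.7)
p
```

Because the T matrices are fixed, changing them re-expresses the same family of curves in a different basis rather than adding free parameters.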
an item model
Thissen, D., Cai, L., & Bock, R. D. (2010). The Nominal Categories Item Response Model. In M. L. Nering & R. Ostini (Eds.), Handbook of Polytomous Item Response Theory Models (pp. 43–75). Routledge.
Other response model: rpf.drm(), rpf.gpcmp(), rpf.grmp(), rpf.grm(), rpf.lmp(), rpf.mcm()
spec <- rpf.nrm()
rpf.prob(spec, rpf.rparam(spec), 0)
# typical parameterization for the Generalized Partial Credit Model
gpcm <- function(outcomes) rpf.nrm(outcomes, T.c=lower.tri(diag(outcomes-1),TRUE) * -1)
spec <- gpcm(4)
rpf.prob(spec, rpf.rparam(spec), 0)
Length of the item parameter vector
rpf.numParam(m)
m | item model |
rpf.numParam(rpf.grm(outcomes=3))
rpf.numParam(rpf.nrm(outcomes=3))
Length of the item model vector
rpf.numSpec(m)
m | item model |
rpf.numSpec(rpf.grm(outcomes=3))
rpf.numSpec(rpf.nrm(outcomes=3))
The ogive constant can be multiplied by the discrimination parameter to obtain a response curve very similar to the Normal cumulative distribution function (Haley, 1952; Molenaar, 1974). Recently, Savalei (2006) proposed a new constant of 1.749 based on Kullback-Leibler information.
rpf.ogive
An object of class numeric of length 1.
In recent years, the logistic has grown in favor, and therefore, this package does not offer any special support for this transformation (Baker & Kim, 2004, pp. 14-18).
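How close the scaled logistic comes to the normal curve can be checked in base R. The sketch assumes the conventional constant 1.702; that value is an assumption for illustration and is not read from rpf.ogive here.

```r
# Assumed scaling constant; 1.702 is the conventional value and is
# not read from the package here.
D <- 1.702
x <- seq(-4, 4, by = 0.01)

# Largest absolute gap between the scaled logistic and the normal CDF
max_gap <- max(abs(plogis(D * x) - pnorm(x)))
max_gap
```

The gap stays below 0.01 over the whole grid, which is why the two metrics are often treated as interchangeable after rescaling the slope.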
Camilli, G. (1994). Teacher's corner: Origin of the scaling constant d=1.7 in Item Response Theory. Journal of Educational and Behavioral Statistics, 19(3), 293-295.
Baker & Kim (2004). Item Response Theory: Parameter Estimation Techniques. Marcel Dekker, Inc.
Haley, D. C. (1952). Estimation of the dosage mortality relationship when the dose is subject to error (Technical Report No. 15). Stanford University Applied Mathematics and Statistics Laboratory, Stanford, CA.
Molenaar, W. (1974). De logistische en de normale kromme [The logistic and the normal curve]. Nederlands Tijdschrift voor de Psychologie 29, 415-420.
Savalei, V. (2006). Logistic approximation to the normal: The KL rationale. Psychometrika, 71(4), 763–767.
Retrieve a description of the given parameter
rpf.paramInfo(m, num = NULL)
m | item model |
num | vector of parameters (defaults to all) |
a list containing the type, upper bound, and lower bound
rpf.paramInfo(rpf.drm())
This function is known by many names in the literature. When plotted against latent trait, it is often called a traceline, item characteristic curve, or item response function. Sometimes the word 'category' or 'outcome' is used in place of 'item'. For example, 'item response function' might become 'category response function'. All these terms refer to the same thing.
rpf.prob(m, param, theta)
m | an item model |
param | item parameters |
theta | the trait score(s) |
a vector of probabilities. For dichotomous items, probabilities are returned in the order incorrect, correct. Although redundant, both incorrect and correct probabilities are returned in the dichotomous case for API consistency with polytomous item models.
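The output ordering can be illustrated without the package. The following base-R sketch defines a hypothetical helper, prob_2pl_sketch, for a two-parameter logistic item written as a * (theta - b); it is not the rpf implementation, and its parameterization and return shape are assumptions made only for illustration.

```r
# Hypothetical helper: two-parameter logistic item, NOT the rpf implementation.
# Returns one column per trait score, rows ordered incorrect, correct.
prob_2pl_sketch <- function(a, b, theta) {
  p_correct <- plogis(a * (theta - b))
  rbind(incorrect = 1 - p_correct, correct = p_correct)
}

p <- prob_2pl_sketch(a = 1.2, b = 0, theta = c(-1, 0, 1))
p
colSums(p)   # the two probabilities for each trait score sum to 1
```

Returning both rows, although one determines the other, keeps the dichotomous case shaped like a polytomous one with two outcomes.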
    i1 <- rpf.drm()
    i1.p <- rpf.rparam(i1)
    rpf.prob(i1, c(i1.p), -1)      # low trait score
    rpf.prob(i1, c(i1.p), c(0,1))  # average and high trait score
Adjust item parameters for changes in mean and covariance of the latent distribution.
rpf.rescale(m, param, mean, cov)
m | item model |
param | item parameters |
mean | vector of means |
cov | covariance matrix |
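The idea behind the adjustment can be shown for a single slope-intercept logistic item. The sketch below is base R and is not the package's implementation; the helper names and the assumption that the latent trait is transformed as theta = m + s * theta_new are illustrative.

```r
# Logit of a slope-intercept item: a * theta + c
logit_si <- function(a, c, theta) a * theta + c

# If theta = m + s * theta_new, the same logit is obtained on the new scale
# with slope a * s and intercept c + a * m.
rescale_si <- function(a, c, m, s) list(a = a * s, c = c + a * m)

a <- 1.5; c0 <- -0.4; m <- 0.3; s <- 2
adj <- rescale_si(a, c0, m, s)
theta <- 0.8
p_old <- plogis(logit_si(a, c0, theta))
p_new <- plogis(logit_si(adj$a, adj$c, (theta - m) / s))
all.equal(p_old, p_new)
```

The adjusted parameters evaluated at the transformed trait score reproduce the original probability, which is the property a rescaling of this kind is meant to preserve.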
    spec <- rpf.grm()
    p1 <- rpf.rparam(spec)
    testPoint <- rnorm(1)
    move <- rnorm(1)
    cov <- as.matrix(rlnorm(1))
    Icov <- solve(cov)
    padj <- rpf.rescale(spec, p1, move, cov)
    pr1 <- rpf.prob(spec, padj, (testPoint-move) %*% Icov)
    pr2 <- rpf.prob(spec, p1, testPoint)
    # the two sets of probabilities should agree up to numerical error
    abs(pr1 - pr2) < 1e-9
This function generates random item parameters. The version argument is available if you are writing a test that depends on reproducible random parameters (using set.seed).
rpf.rparam(m, version = 2L)
m | an item model |
version | the version of random parameters |
item parameters
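A minimal sketch of reproducible parameter generation, assuming the rpf package is installed and that rpf.rparam draws from R's standard random number generator (as the reference to set.seed suggests). The seed value 42 is arbitrary.

```r
library(rpf)

spec <- rpf.drm()

set.seed(42)
p1 <- rpf.rparam(spec)

set.seed(42)
p2 <- rpf.rparam(spec)

identical(p1, p2)   # compares the two draws made under the same seed
```

Pinning the version argument explicitly in a test may additionally guard against a future change in the default.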
    i1 <- rpf.drm()
    rpf.rparam(i1)
Returns a random sample of response patterns given a list of item models and parameters. If grp is given then theta, items, params, mean, and cov can be omitted.
rpf.sample( theta, items, params, ..., prefix = "i", mean = NULL, cov = NULL, mcar = 0, grp = NULL )
theta | either a vector (for 1 dimension) or a matrix (for >1 dimension) of person abilities or the number of response patterns to generate randomly |
items | a list of item models |
params | a list or matrix of item parameters. If omitted, random item parameters are generated for each item model. |
... | Not used. Forces remaining arguments to be specified by name. |
prefix | Column names are taken from param or items. If no column names are available, some will be generated using the given prefix. |
mean | mean vector of latent distribution (optional) |
cov | covariance matrix of latent distribution (optional) |
mcar | proportion of generated data to set to NA (missing completely at random) |
grp | a list containing the model and data. See the details section. |
Returns a data frame of response patterns
A model, or group within a model, is represented as a named list.
spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
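The equally spaced quadrature grid can be reproduced in base R. The helper below, quad_points, is hypothetical and assumes the points run from -qwidth to qwidth inclusive, as the qwidth=2, qpoints=5 example suggests; it is not the package's internal routine.

```r
# Hypothetical helper illustrating equally spaced quadrature points.
quad_points <- function(qwidth, qpoints) {
  seq(-qwidth, qwidth, length.out = qpoints)
}

quad_points(qwidth = 2, qpoints = 5)            # -2 -1  0  1  2
length(quad_points(qwidth = 6, qpoints = 49))   # 49 points spanning -6 to 6
```

An odd number of points places one point exactly at zero, the mean of the standard normal distribution.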
    # 1 dimensional items
    i1 <- rpf.drm()
    i1.p <- rpf.rparam(i1)
    i2 <- rpf.nrm(outcomes=3)
    i2.p <- rpf.rparam(i2)
    rpf.sample(5, list(i1,i2), list(i1.p, i2.p))
These data are from Wright & Masters (1982, p. 18).
All items were fit to a 3 category Partial Credit Model (PCM) using Ministep 3.75.0.
Wright, B. D. & Masters, G. N. (1982). Rating Scale Analysis. Chicago: Mesa Press.
data(science)
Runs SitemFit1 for every item and accumulates the results.
SitemFit( grp, ..., method = "pearson", log = TRUE, qwidth = 6, qpoints = 49L, alt = FALSE, omit = 0L, .twotier = TRUE, .parallel = TRUE )
grp | a list containing the model and data. See the details section. |
... | Not used. Forces remaining arguments to be specified by name. |
method | whether to use a pearson or rms test |
log | whether to return p-values in log units |
qwidth | |
qpoints | |
alt | whether to include the item of interest in the denominator |
omit | number of items to omit (a single number) or a list whose length equals the number of items |
.twotier | whether to enable the two-tier optimization |
.parallel | whether to take advantage of multiple CPUs (default TRUE) |
a list of output from SitemFit1
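Because log defaults to TRUE, reported p-values need to be exponentiated before they are compared with a conventional significance level. The base-R sketch below uses made-up numbers (a chi-squared statistic of 45.2 on 12 degrees of freedom) and assumes natural-log units; it does not reproduce the structure of the SitemFit output.

```r
# Illustrative values only, not output from SitemFit.
stat <- 45.2
df <- 12

# Upper-tail p-value computed directly in natural-log units
log_p <- pchisq(stat, df, lower.tail = FALSE, log.p = TRUE)

# Back-transform to an ordinary probability
p <- exp(log_p)
c(log_p = log_p, p = p)
```

Working in log units avoids underflow when a misfitting item yields an extremely small p-value.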
A model, or group within a model, is represented as a named list.
spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Other diagnostic: ChenThissen1997(), SitemFit1(), multinomialFit(), rpf.1dim.fit(), sumScoreEAPTest()
    grp <- list(spec=list())
    grp$spec[1:20] <- list(rpf.grm())
    grp$param <- sapply(grp$spec, rpf.rparam)
    colnames(grp$param) <- paste("i", 1:20, sep="")
    grp$mean <- 0
    grp$cov <- diag(1)
    grp$free <- grp$param != 0
    grp$data <- rpf.sample(500, grp=grp)
    SitemFit(grp)
Implements the Kang & Chen (2007) polytomous extension to the S statistic of Orlando & Thissen (2000). Rows with missing data are ignored, but see the omit option.
SitemFit1( grp, item, free = 0, ..., method = "pearson", log = TRUE, qwidth = 6, qpoints = 49L, alt = FALSE, omit = 0L, .twotier = TRUE )
grp | a list containing the model and data. See the details section. |
item | the item of interest |
free | the number of free parameters involved in estimating the item (to adjust the df) |
... | Not used. Forces remaining arguments to be specified by name. |
method | whether to use a pearson or rms test |
log | whether to return p-values in log units |
qwidth | |
qpoints | |
alt | whether to include the item of interest in the denominator |
omit | number of items to omit or a character vector with the names of the items to omit when calculating the observed and expected sum-score tables |
.twotier | whether to enable the two-tier optimization |
This statistic is good at finding a small number of misfitting items among a large number of well fitting items. However, be aware that misfitting items can cause other items to misfit.
Observed tables cannot be computed when data is missing. Therefore, you can optionally omit items with the greatest number of responses missing relative to the item of interest.
Pearson is slightly more powerful than RMS in most cases I examined.
Setting alt to TRUE causes the tables to match published articles. However, the default setting of FALSE probably provides slightly more power when there are fewer than 10 items.
The name of the test, "S", probably stands for sum-score.
A model, or group within a model, is represented as a named list.
spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Kang, T. and Chen, T. T. (2007). An investigation of the performance of the generalized S-Chisq item-fit index for polytomous IRT models. ACT Research Report Series.
Orlando, M. and Thissen, D. (2000). Likelihood-Based Item-Fit Indices for Dichotomous Item Response Theory Models. Applied Psychological Measurement, 24(1), 50-64.
Other diagnostic: ChenThissen1997(), SitemFit(), multinomialFit(), rpf.1dim.fit(), sumScoreEAPTest()
In addition, the freqColumn and weightColumn are reset to NULL.
stripData(grp)
grp | a list containing the model and data. See the details section. |
The same group without associated data.
A model, or group within a model, is represented as a named list.
spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
    spec <- list()
    spec[1:3] <- list(rpf.grm(outcomes=3))
    param <- sapply(spec, rpf.rparam)
    data <- rpf.sample(5, spec, param)
    colnames(param) <- colnames(data)
    grp <- list(spec=spec, param=param, data=data, minItemsPerScore=1L)
    grp$score <- EAPscores(grp)
    str(grp)
    grp <- stripData(grp)
    str(grp)
Observed tables cannot be computed when data is missing. Therefore, you can optionally omit items with the greatest number of responses missing when conducting the distribution test.
sumScoreEAP(grp, ..., qwidth = 6, qpoints = 49L, .twotier = TRUE)
grp | a list containing the model and data. See the details section. |
... | Not used. Forces remaining arguments to be specified by name. |
qwidth | DEPRECATED |
qpoints | DEPRECATED |
.twotier | whether to enable the two-tier optimization |
When a two-tier covariance structure is detected, EAP scores are only reported for primary factors. It is possible to compute EAP scores for specific factors, but it is not clear why this would be useful because they are conditional on the specific factor sum scores. Moreover, the algorithm to compute them efficiently has not been published yet (as of Jun 2014).
A model, or group within a model, is represented as a named list.
spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Other scoring: EAPscores(), bestToOmit(), itemOutcomeBySumScore(), observedSumScore(), omitItems(), omitMostMissing()
    # see Thissen, Pommerich, Billeaud, & Williams (1995, Table 2)
    spec <- list()
    spec[1:3] <- list(rpf.grm(outcomes=4))
    param <- matrix(c(1.87, .65, 1.97, 3.14,
                      2.66, .12, 1.57, 2.69,
                      1.24, .08, 2.03, 4.3), nrow=4)
    # fix parameterization
    param <- apply(param, 2, function(p) c(p[1], p[2:4] * -p[1]))
    grp <- list(spec=spec, mean=0, cov=matrix(1,1,1), param=param)
    sumScoreEAP(grp)
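The "fix parameterization" step in the example can be unpacked for the first item. The sketch below is base R and assumes that the published table lists a slope a followed by thresholds b_k, so that a * (theta - b_k) equals a * theta + c_k with intercept c_k = -a * b_k; that reading of the table is an assumption inferred from the example code.

```r
# First item of the example: slope a followed by three thresholds b_k.
p <- c(1.87, 0.65, 1.97, 3.14)
a <- p[1]
b <- p[2:4]

intercepts <- -a * b            # c_k = -a * b_k
converted <- c(a, intercepts)   # same result as c(p[1], p[2:4] * -p[1])
converted

# The two forms give the same logit at any trait score.
theta <- 0.5
all.equal(a * (theta - b), a * theta + intercepts)
```

Applying the same function to every column with apply, as the example does, converts all three items at once.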
Conduct the sum-score EAP distribution test
sumScoreEAPTest(grp, ..., qwidth = 6, qpoints = 49L, .twotier = TRUE)
grp | a list containing the model and data. See the details section. |
... | Not used. Forces remaining arguments to be specified by name. |
qwidth | DEPRECATED |
qpoints | DEPRECATED |
.twotier | whether to enable the two-tier optimization |
A model, or group within a model, is represented as a named list.
spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.
Li, Z., & Cai, L. (2018). Summed Score Likelihood-Based Indices for Testing Latent Variable Distribution Fit in Item Response Theory. Educational and Psychological Measurement, 78(5), 857-886.
Other diagnostic: ChenThissen1997(), SitemFit1(), SitemFit(), multinomialFit(), rpf.1dim.fit()
Like tabulate but entire rows are the unit of tabulation. This function does not sort the data.frame; it must already be sorted.
tabulateRows(observed)
observed | a sorted data.frame holding ordered factors in every column |
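Why sorting matters can be seen in a base-R analogue that counts runs of identical adjacent rows. The helper count_sorted_rows is hypothetical and is not the package's implementation; whether tabulateRows returns its counts in exactly this form is an assumption.

```r
# Hypothetical base-R analogue: count runs of identical adjacent rows.
# Like the documented function, it assumes the rows are already sorted.
count_sorted_rows <- function(df) {
  key <- do.call(paste, c(df, sep = "\r"))  # one string per row
  rle(key)$lengths
}

df <- data.frame(a = c(1, 1, 1, 2), b = c(1, 1, 2, 2))
count_sorted_rows(df)   # 2 1 1
```

If identical rows were not adjacent, a run-length approach like this would report them as separate patterns, which is why the input must be sorted first.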
    df <- as.data.frame(matrix(c(sample.int(2, 30, replace=TRUE)), 10, 3))
    df <- df[orderCompletely(df),]
    tabulateRows(df)
All slopes are divided by the ogive constant. Then the following transformation is applied to the slope matrix: $\frac{\mathrm{slope}}{\left[1 + \mathrm{rowSums}(\mathrm{slope}^2)\right]^{1/2}}$
toFactorLoading(slope, ogive = rpf.ogive)
slope | a matrix with items in the columns and slopes in the rows |
ogive | the ogive constant (default rpf.ogive) |
a factor loading matrix with items in the rows and factors in the columns
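A base-R sketch of the usual slope-to-loading conversion, loading = a / sqrt(1 + sum of squared a) with a = slope / ogive. The helper loading_sketch and the default constant 1.702 are assumptions for illustration; this is not the package source, and the package's result should be checked against toFactorLoading itself.

```r
# Hypothetical helper, NOT the package source.
# slope: matrix with slopes (one row per factor) and items in the columns.
loading_sketch <- function(slope, ogive = 1.702) {
  a <- t(as.matrix(slope)) / ogive   # items in rows, normal-ogive metric
  a / sqrt(1 + rowSums(a^2))         # divide each item's row by its norm term
}

# Two factors (rows) by two items (columns), filled column by column
slope <- matrix(c(1.702, 0,
                  1.702, 1.702), nrow = 2)
L <- loading_sketch(slope)
L   # items in rows, factors in columns
```

The first item loads only on the first factor (1 / sqrt(2)); the second loads equally on both factors (1 / sqrt(3) each).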
Other factor model equivalence: fromFactorLoading(), fromFactorThreshold(), toFactorThreshold()
Convert response function intercepts to factor thresholds
toFactorThreshold(intercept, slope, ogive = rpf.ogive)
intercept | a matrix with items in the columns and intercepts in the rows |
slope | a matrix with items in the columns and slopes in the rows |
ogive | the ogive constant (default rpf.ogive) |
a factor threshold matrix with items in the columns and factor thresholds in the rows
Other factor model equivalence: fromFactorLoading(), fromFactorThreshold(), toFactorLoading()
This was last updated in 2017 and may no longer work.
write.flexmirt(groups, file = NULL, fileEncoding = "")
groups | a list of groups each with items and latent parameters |
file | the destination file name |
fileEncoding | how to encode the text file (optional) |
Formats item parameters in the way that flexMIRT expects to read them.
NOTE: Support for the graded response model may not be complete.
A model, or group within a model, is represented as a named list.
spec: list of response model objects
param: numeric matrix of item parameters
free: logical matrix indicating which parameters are free (TRUE) or fixed (FALSE)
mean: numeric vector giving the mean of the latent distribution
cov: numeric matrix giving the covariance of the latent distribution
data: data.frame containing observed item responses, and optionally, weights and frequencies
score: factor scores with response patterns in rows
weightColumn: name of the data column containing the numeric row weights (optional)
freqColumn: name of the data column containing the integral row frequencies (optional)
qwidth: width of the quadrature expressed in Z units
qpoints: number of quadrature points
minItemsPerScore: minimum number of non-missing items when estimating factor scores
The param matrix stores item parameters by column. If a column has more rows than are required to fully specify a model then the extra rows are ignored. The order of the items in spec and order of columns in param are assumed to match. All items should have the same number of latent dimensions. Loadings on latent dimensions are given in the first few rows and can be named by setting rownames. Item names are assigned by param colnames.
Currently only a multivariate normal distribution is available, parameterized by the mean and cov. If mean and cov are not specified then a standard normal distribution is assumed. The quadrature consists of equally spaced points. For example, qwidth=2 and qpoints=5 would produce points -2, -1, 0, 1, and 2. The quadrature specification is part of the group and not passed as extra arguments for the sake of consistency. As currently implemented, OpenMx uses EAP scores to estimate latent distribution parameters. By default, the exact same EAP scores should be produced by EAPscores.