Title: | Testing in Conditional Likelihood Context |
---|---|
Description: | An implementation of hypothesis testing in an extended Rasch modeling framework, including sample size planning procedures and power computations. Provides 4 statistical tests, i.e., gradient test (GR), likelihood ratio test (LR), Rao score or Lagrange multiplier test (RS), and Wald test, for testing a number of hypotheses referring to the Rasch model (RM), linear logistic test model (LLTM), rating scale model (RSM), and partial credit model (PCM). Three types of functions for power and sample size computations are provided. Firstly, functions to compute the sample size given a user-specified (predetermined) deviation from the hypothesis to be tested, the level alpha, and the power of the test. Secondly, functions to evaluate the power of the tests given a user-specified (predetermined) deviation from the hypothesis to be tested, the level alpha of the test, and the sample size. Thirdly, functions to evaluate the so-called post hoc power of the tests. This is the power of the tests given the observed deviation of the data from the hypothesis to be tested and a user-specified level alpha of the test. Power and sample size computations are based on a Monte Carlo simulation approach, which is computationally very efficient. The variance of the random error in computing power and sample size arising from the simulation approach is analytically derived by using the delta method. Draxler, C., & Alexandrowicz, R. W. (2015), <doi:10.1007/s11336-015-9472-y>. |
Authors: | Clemens Draxler [aut, cre], Andreas Kurz [aut] |
Maintainer: | Clemens Draxler <[email protected]> |
License: | GPL-2 |
Version: | 0.2.1 |
Built: | 2024-12-16 06:59:48 UTC |
Source: | CRAN |
Computes gradient (GR), likelihood ratio (LR), Rao score (RS) and Wald (W) test statistics for hypotheses on parameters expressing change between two time points.
change_test(X)
X |
Data matrix containing the responses of n persons to 2k binary items. Columns 1 to k contain the responses to k items at time point 1, and columns (k+1) to 2k the responses to the same k items at time point 2. |
Assume that all items are presented twice (at 2 time points) to the same persons. The data matrix X has n rows (number of persons) and 2k columns considered as virtual items. Assume a constant shift of the difficulty of each item between the 2 time points, represented by a single shift parameter, which is the only parameter of interest. The hypothesis tested is that the shift parameter equals 0 against the two-sided alternative that it is not equal to 0.
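As a small illustration of the required data layout (toy data, the object names are illustrative and not from the package), the 2k-column matrix can be assembled from two k-column response matrices recorded at the two time points:

set.seed(1)
responses_t1 <- matrix(rbinom(15, 1, 0.6), nrow = 5)  # columns 1 to k: time point 1
responses_t2 <- matrix(rbinom(15, 1, 0.5), nrow = 5)  # columns (k+1) to 2k: same items at time point 2
X <- cbind(responses_t1, responses_t2)                # n x 2k matrix as expected by change_test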
A list of test statistics, degrees of freedom, and p-values.
test |
A numeric vector of gradient (GR), likelihood ratio (LR), Rao score (RS), and Wald test statistics. |
df |
Degrees of freedom. |
pvalue |
A vector of corresponding p-values. |
call |
The matched call. |
Fischer, G. H. (1995). The Linear Logistic Test Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, Recent Developments, and Applications (pp. 131-155). New York: Springer.
Fischer, G. H. (1983). Logistic Latent Trait Models with Linear Constraints. Psychometrika, 48(1), 3-26.
invar_test and LLTM_test.
## Not run: 
# Numerical example with 400 persons and 4 items
# presented twice, thus 8 virtual items
# Data y generated under the assumption that shift parameter equals 0
# (no change from time point 1 to 2)

# design matrix W used only for example data generation
# (not used for estimating in change_test function)
W <- rbind(c(1,0,0,0,0), c(0,1,0,0,0), c(0,0,1,0,0), c(0,0,0,1,0),
           c(1,0,0,0,1), c(0,1,0,0,1), c(0,0,1,0,1), c(0,0,0,1,1))

# eta parameters, first 4 are nuisance, i.e., easiness parameters of the 4 items
# at time point 1, last one is the shift parameter
eta <- c(-2,-1,1,2,0)

y <- eRm::sim.rasch(persons = rnorm(400), items = colSums(eta * t(W)))

res <- change_test(X = y)

res$test    # test statistics
res$df      # degrees of freedom
res$pvalue  # p-values

## End(Not run)
Computes gradient (GR), likelihood ratio (LR), Rao score (RS), and Wald (W) test statistics for the hypothesis of equal item discriminations against the alternative that at least one item discriminates differently (only for binary data).
discr_test(X)
X |
Data matrix. |
The tests are based on the following model suggested in Draxler, Kurz, Gürer, and Nolte (2024):

$\mathrm{logit}\, E(Y) = \theta + \beta_i + \delta_i \, r$,

where $E(Y)$ is the expected value of a binary response (of a person to an item), $r$ is the person score, i.e., the number of correct responses of that person when responding to $k$ items, $\theta$ is the respective person parameter, and $\beta_i$ and $\delta_i$ are two parameters referring to the respective item $i$. The parameter $\beta_i$ represents a baseline, i.e., the easiness or attractiveness of the respective item in person score group $r$. The parameter $\delta_i$ denotes the constant change of the attractiveness of that item between successive person score groups. Thus, the model assumes a linear effect of the person score $r$ on the logit of the probability of a correct response.

The four test statistics are derived from a conditional likelihood function in which the $\theta$ parameters are eliminated by conditioning on the observed person scores. The hypothesis to be tested is formally given by setting all $\delta_i$ parameters equal to 0. The alternative assumes that at least one $\delta_i$ parameter is not equal to 0.
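A hedged sketch of the response probability implied by this model, using the notation above; the helper name and parameter values are illustrative and not taken from the package internals:

# theta: person parameter, beta_i: baseline easiness of item i,
# delta_i: constant change of item i across successive person score groups
p_correct <- function(theta, beta_i, delta_i, r) {
  plogis(theta + beta_i + delta_i * r)  # inverse logit of the linear predictor
}
# Under the hypothesis delta_i = 0 the attractiveness of the item does not
# change across person score groups:
p_correct(theta = 0.3, beta_i = -0.5, delta_i = 0, r = 3)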
A list of test statistics, degrees of freedom, and p-values.
test |
A numeric vector of gradient (GR), likelihood ratio (LR), Rao score (RS), and Wald test statistics. |
df |
A numeric vector of corresponding degrees of freedom. |
pvalue |
A vector of corresponding p-values. |
call |
The matched call. |
Draxler, C., Kurz, A., Gürer, C., & Nolte, J. P. (2024). An improved inferential procedure to evaluate item discriminations in a conditional maximum likelihood framework. Manuscript submitted for publication.
invar_test, change_test, and LLTM_test.
## Not run: 
##### Dataset PISA Mathematics data.pisaMath {sirt} #####
library(sirt)
data(data.pisaMath)

y <- data.pisaMath$data[, grep(names(data.pisaMath$data), pattern = "M")]

res <- discr_test(X = y)

# $test
#     GR     LR     RS      W
# 72.430 73.032 76.725 73.470
#
# $df
# GR LR RS  W
# 10 10 10 10
#
# $pvalue
#        GR        LR        RS         W
# "< 0.001" "< 0.001" "< 0.001" "< 0.001"
#
# $call
# discr_test(X = y)

## End(Not run)
Computes gradient (GR), likelihood ratio (LR), Rao score (RS), and Wald (W) test statistics for the hypothesis of equality of item parameters between two groups of persons against the two-sided alternative that at least one item parameter differs between the two groups.
invar_test(X, splitcr = "median", model = "RM")
X |
Data matrix. |
splitcr |
Split criterion which is either "mean", "median", or a numeric vector x containing zeros and ones indicating group membership of the persons. |
model |
RM, PCM, or RSM, i.e., Rasch model, partial credit model, or rating scale model. |
Note that items are excluded from the computation of GR, LR, and W due to inappropriate response patterns within subgroups, and from the computation of RS due to inappropriate response patterns in the total data. If the model is identified from the total data but not from one or both subgroups, only RS will be computed. If the model is not identified from the total data, no test statistic is computable.
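A brief usage sketch of the splitcr argument (data simulated as in the Examples below; the character options and the external grouping vector follow the description above):

y <- eRm::sim.rasch(persons = rnorm(400), c(0, -3, -2, -1, 0, 1, 2, 3))
x <- rep(c(1, 0), each = 200)                                  # external binary covariate
res_median <- invar_test(y, splitcr = "median", model = "RM")  # split by the "median" criterion
res_x      <- invar_test(y, splitcr = x, model = "RM")         # user-defined grouping via x
res_median$pvalue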
A list of test statistics, degrees of freedom, and p-values.
test |
A numeric vector of gradient (GR), likelihood ratio (LR), Rao score (RS), and Wald test statistics. |
df |
A numeric vector of corresponding degrees of freedom. |
pvalue |
A vector of corresponding p-values. |
deleted_items |
A list with numeric vectors of item numbers that were excluded before computing corresponding test statistics. |
call |
The matched call. |
Draxler, C. (2010). Sample Size Determination for Rasch Model Tests. Psychometrika, 75(4), 708–724.
Draxler, C., & Alexandrowicz, R. W. (2015). Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika, 80(4), 897–919.
Draxler, C., Kurz, A., & Lemonte, A. J. (2020). The Gradient Test and its Finite Sample Size Properties in a Conditional Maximum Likelihood and Psychometric Modeling Context. Communications in Statistics-Simulation and Computation, 1-19.
Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 69–95). New York: Springer.
Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of Fit for Polytomous Rasch Models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 325-352). New York: Springer.
change_test and LLTM_test.
## Not run: 
##### Rasch Model #####
y <- eRm::sim.rasch(persons = rnorm(400), c(0,-3,-2,-1,0,1,2,3))
x <- c(rep(1,200), rep(0,200))

res <- invar_test(y, splitcr = x, model = "RM")

res$test          # test statistics
res$df            # degrees of freedom
res$pvalue        # p-values
res$deleted_items # excluded items

# $test
#     GR     LR     RS      W
# 14.492 14.083 13.678 12.972
#
# $df
# GR LR RS  W
#  7  7  7  7
#
# $pvalue
#      GR      LR      RS       W
# "0.043" "0.050" "0.057" "0.073"
#
# $deleted_items
# $deleted_items$GR
# [1] "none"
#
# $deleted_items$LR
# [1] "none"
#
# $deleted_items$RS
# [1] "none"
#
# $deleted_items$W
# [1] "none"
#
# $call
# invar_test(X = y, splitcr = x, model = "RM")

## End(Not run)
Computes gradient (GR), likelihood ratio (LR), Rao score (RS), and Wald (W) test statistics for hypotheses defined by linear restrictions on the parameter space of the item parameters of the RM.
LLTM_test(X, W)
X |
Data matrix. |
W |
Design matrix of LLTM. |
The RM item parameters are assumed to be linear in the LLTM parameters. The coefficients of these linear functions are specified by a design matrix W. In this context, the LLTM is considered a more parsimonious model than the RM. The LLTM parameters can be interpreted as the difficulties of certain cognitive operations needed to respond correctly to psychological test items, and the item parameters of the RM are assumed to be linear combinations of these cognitive operation difficulties, with the combinations defined in the design matrix W.
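A minimal sketch of this linear restriction, with the design matrix and eta values taken from the Examples below:

W   <- rbind(c(1, 0), c(0, 1), c(1, 1), c(2, 1))  # 4 items, 2 basic parameters
eta <- c(-0.5, 1)                                 # LLTM basic parameters
b   <- as.vector(W %*% eta)                       # implied RM item parameters
b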
A list of test statistics, degrees of freedom, and p-values.
test |
A numeric vector of gradient (GR), likelihood ratio (LR), Rao score (RS), and Wald test statistics. |
df |
Degrees of freedom. |
pvalue |
A vector of corresponding p-values. |
call |
The matched call. |
Fischer, G. H. (1995). The Linear Logistic Test Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, Recent Developments, and Applications (pp. 131-155). New York: Springer.
Fischer, G. H. (1983). Logistic Latent Trait Models with Linear Constraints. Psychometrika, 48(1), 3-26.
change_test and invar_test.
## Not run: 
# Numerical example assuming no deviation from linear restriction

# design matrix W defining linear restriction
W <- rbind(c(1,0), c(0,1), c(1,1), c(2,1))

# assumed eta parameters of LLTM for data generation
eta <- c(-0.5, 1)

# assumed vector of item parameters of RM
b <- colSums(eta * t(W))

y <- eRm::sim.rasch(persons = rnorm(400), items = b - b[1]) # sum0 = FALSE

res <- LLTM_test(X = y, W = W)

res$test    # test statistics
res$df      # degrees of freedom
res$pvalue  # p-values

## End(Not run)
Returns post hoc power of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test given data and the probability of error of the first kind $\alpha$. The hypothesis to be tested states that the shift parameter quantifying the constant change for all items between time points 1 and 2 equals 0. The alternative states that the shift parameter is not equal to 0. It is assumed that the same items are presented at both time points. See function change_test.
post_hocChange(data, alpha = 0.05)
data |
Data matrix as required for function change_test. |
alpha |
Probability of error of first kind. |
The power of the tests (Wald, LR, score, and gradient) is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral $\chi^2$ distributions with $df = 1$ and noncentrality parameter $\lambda$. In case of evaluating the post hoc power, $\lambda$ is assumed to be given by the observed value of the test statistic. Given the probability of the error of the first kind $\alpha$, the post hoc power of the tests can be determined from the corresponding noncentral $\chi^2$ distribution. More details about the distributions of the test statistics and the relationship between $\lambda$, power, and sample size can be found in Draxler and Alexandrowicz (2015).

In particular, let $c_\alpha$ be the $(1 - \alpha)$ quantile of the central $\chi^2$ distribution with $df = 1$. Then,

power $= 1 - F_{df, \lambda}(c_\alpha)$,

where $F_{df, \lambda}$ is the cumulative distribution function of the noncentral $\chi^2$ distribution with $df = 1$ and $\lambda$ equal to the observed value of the test statistic.
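A hedged base-R illustration of this relationship (the helper name is hypothetical; df = 1 and the noncentrality is set to the observed statistic, here the Wald statistic 9.822 from the Examples below):

post_hoc_power <- function(stat, alpha = 0.05, df = 1) {
  crit <- qchisq(1 - alpha, df = df)     # central chi-square quantile c_alpha
  1 - pchisq(crit, df = df, ncp = stat)  # tail probability of the noncentral chi-square
}
post_hoc_power(stat = 9.822)             # approx. 0.88, cf. the W power in the Examples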
A list of results.
test |
A numeric vector of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test statistics. |
power |
Post hoc power value for each test. |
observed deviation |
CML estimate of shift parameter expressing observed deviation from hypothesis to be tested. |
person score distribution |
Relative frequencies of person scores. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
call |
The matched call. |
Draxler, C., & Alexandrowicz, R. W. (2015). Sample size determination within the scope of conditional maximum likelihood estimation with special focus on testing the Rasch model. Psychometrika, 80(4), 897-919.
Fischer, G. H. (1995). The Linear Logistic Test Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, Recent Developments, and Applications (pp. 131-155). New York: Springer.
Fischer, G. H. (1983). Logistic Latent Trait Models with Linear Constraints. Psychometrika, 48(1), 3-26.
sa_sizeChange and powerChange.
## Not run: 
# Numerical example with 150 persons and 4 items
# presented twice, thus 8 virtual items
# Data y generated under the assumption that shift parameter equals 0.5
# (change from time point 1 to 2)

# design matrix W used only for example data generation
# (not used for estimating in change_test function)
W <- rbind(c(1,0,0,0,0), c(0,1,0,0,0), c(0,0,1,0,0), c(0,0,0,1,0),
           c(1,0,0,0,1), c(0,1,0,0,1), c(0,0,1,0,1), c(0,0,0,1,1))

# eta parameter vector, first 4 are nuisance, i.e., item parameters at time point 1
# (easiness parameters of the 4 items at time point 1),
# last one is the shift parameter
eta <- c(-2,-1,1,2,0.5)

y <- eRm::sim.rasch(persons = rnorm(150), items = colSums(-eta * t(W)))

res <- post_hocChange(data = y, alpha = 0.05)

# > res
# $test
#      W     LR     RS     GR
#  9.822 10.021  9.955 10.088
#
# $power
#     W    LR    RS    GR
# 0.880 0.886 0.884 0.888
#
# $`observed deviation (estimate of shift parameter)`
# [1] 0.504
#
# $`person score distribution`
#
#     1     2     3     4     5     6     7
# 0.047 0.047 0.236 0.277 0.236 0.108 0.047
#
# $`degrees of freedom`
# [1] 1
#
# $`noncentrality parameter`
#      W     LR     RS     GR
#  9.822 10.021  9.955 10.088
#
# $call
# post_hocChange(alpha = 0.05, data = y)

## End(Not run)
Returns post hoc power of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test given data and the probability of error of the first kind $\alpha$.
The hypothesis to be tested assumes equal item-category parameters of the partial
credit model between two predetermined groups of persons. The alternative states that
at least one of the parameters differs between the two groups.
post_hocPCM(data, x, alpha = 0.05)
data |
Data matrix with item responses (in ordered categories starting from 0). |
x |
A numeric vector of length equal to number of persons that contains zeros and ones indicating group membership of the persons. |
alpha |
Probability of error of first kind. |
The power of the tests (Wald, LR, score, and gradient) is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral $\chi^2$ distributions with $df$ equal to the number of free item-category parameters in the partial credit model and noncentrality parameter $\lambda$. In case of evaluating the post hoc power, $\lambda$ is assumed to be given by the observed value of the test statistic. Given the probability of the error of the first kind $\alpha$, the post hoc power of the tests can be determined from the corresponding noncentral $\chi^2$ distribution. More details about the distributions of the test statistics and the relationship between $\lambda$, power, and sample size can be found in Draxler and Alexandrowicz (2015).

In particular, let $c_\alpha$ be the $(1 - \alpha)$ quantile of the central $\chi^2$ distribution with $df$ equal to the number of free item-category parameters. Then,

power $= 1 - F_{df, \lambda}(c_\alpha)$,

where $F_{df, \lambda}$ is the cumulative distribution function of the noncentral $\chi^2$ distribution with $df$ equal to the number of free item-category parameters and $\lambda$ equal to the observed value of the test statistic.
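A hedged sketch of the degrees of freedom and the resulting post hoc power; the helper is hypothetical and assumes responses coded 0, ..., m_i per item with one item-category parameter fixed for identification (numbers refer to the Examples below):

df_pcm <- function(X) sum(apply(X, 2, max)) - 1          # free item-category parameters (assumption, see above)
# Post hoc power, e.g., for the LR statistic 11.818 with df = 7:
1 - pchisq(qchisq(0.95, df = 7), df = 7, ncp = 11.818)   # approx. 0.70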
A list of results.
test |
A numeric vector of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test statistics. |
power |
Post hoc power value for each test. |
observed global deviation |
Observed global deviation from hypothesis to be tested represented by a single number. It is obtained by dividing the test statistic by the informative sample size. The latter does not include persons with minimum or maximum person score. |
observed local deviation |
CML estimates of free item-category parameters in both groups of persons representing observed deviation from hypothesis to be tested locally per item and response category. |
person score distribution in group 1 |
Relative frequencies of person scores in group 1. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
person score distribution in group 2 |
Relative frequencies of person scores in group 2. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
call |
The matched call. |
Draxler, C. (2010). Sample Size Determination for Rasch Model Tests. Psychometrika, 75(4), 708–724.
Draxler, C., & Alexandrowicz, R. W. (2015). Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika, 80(4), 897–919.
Draxler, C., Kurz, A., & Lemonte, A. J. (2020). The Gradient Test and its Finite Sample Size Properties in a Conditional Maximum Likelihood and Psychometric Modeling Context. Communications in Statistics-Simulation and Computation, 1-19.
Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 69–95). New York: Springer.
Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of Fit for Polytomous Rasch Models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 325-352). New York: Springer.
sa_sizePCM and powerPCM.
## Not run: 
# Numerical example for post hoc power analysis for PCM

y <- eRm::pcmdat2
n <- nrow(y)                      # sample size
x <- c(rep(0, n/2), rep(1, n/2))  # binary covariate

res <- post_hocPCM(data = y, x = x, alpha = 0.05)

# > res
# $test
#      W     LR     RS     GR
# 11.395 11.818 11.628 11.978
#
# $power
#     W    LR    RS    GR
# 0.683 0.702 0.694 0.709
#
# $`observed global deviation`
#     W    LR    RS    GR
# 0.045 0.046 0.045 0.047
#
# $`observed local deviation`
#        I1-C2 I2-C1 I2-C2  I3-C1  I3-C2  I4-C1  I4-C2
# group1 2.556 0.503 2.573 -2.573 -2.160 -1.272 -0.683
# group2 2.246 0.878 3.135 -1.852 -0.824 -0.494  0.941
#
# $`person score distribution in group 1`
#
#     1     2     3     4     5     6     7
# 0.016 0.097 0.137 0.347 0.121 0.169 0.113
#
# $`person score distribution in group 2`
#
#     1     2     3     4     5     6     7
# 0.015 0.083 0.136 0.280 0.152 0.227 0.106
#
# $`degrees of freedom`
# [1] 7
#
# $`noncentrality parameter`
#      W     LR     RS     GR
# 11.395 11.818 11.628 11.978
#
# $call
# post_hocPCM(alpha = 0.05, data = y, x = x)

## End(Not run)
Returns post hoc power of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test given data and the probability of error of the first kind $\alpha$.
The hypothesis to be tested assumes equal item parameters between two predetermined groups
of persons. The alternative states that at least one of the parameters differs between the two
groups.
post_hocRM(data, x, alpha = 0.05)
data |
Binary data matrix. |
x |
A numeric vector of length equal to number of persons containing zeros and ones indicating group membership of the persons. |
alpha |
Probability of error of first kind. |
The power of the tests (Wald, LR, score, and gradient) is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral $\chi^2$ distributions with $df$ equal to the number of items minus 1 and noncentrality parameter $\lambda$. In case of evaluating the post hoc power, $\lambda$ is assumed to be given by the observed value of the test statistic. Given the probability of the error of the first kind $\alpha$, the post hoc power of the tests can be determined from the corresponding noncentral $\chi^2$ distribution. More details about the distributions of the test statistics and the relationship between $\lambda$, power, and sample size can be found in Draxler and Alexandrowicz (2015).

In particular, let $c_\alpha$ be the $(1 - \alpha)$ quantile of the central $\chi^2$ distribution with $df$ equal to the number of items minus 1. Then,

power $= 1 - F_{df, \lambda}(c_\alpha)$,

where $F_{df, \lambda}$ is the cumulative distribution function of the noncentral $\chi^2$ distribution with $df$ equal to the number of items minus 1 and $\lambda$ equal to the observed value of the test statistic.
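A hedged arithmetic check of the observed global deviation, using the Wald statistic from the Examples below and assuming that all 100 persons in eRm::raschdat1 are informative (consistent with the output shown there):

T_W    <- 29.241   # observed Wald statistic
n_info <- 100      # informative sample size (assumption, see above)
T_W / n_info       # observed global deviation, approx. 0.292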
A list of results.
test |
A numeric vector of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test statistics. |
power |
Post hoc power value for each test. |
global deviation |
Observed global deviation from hypothesis to be tested represented by a single number. It is obtained by dividing the test statistic by the informative sample size. The latter does not include persons with minimum or maximum person score. |
local deviation |
CML estimates of free item parameters in both groups of persons (first item parameter set to 0 in both groups) representing observed deviation from hypothesis to be tested locally per item. |
person score distribution in group 1 |
Relative frequencies of person scores in group 1. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
person score distribution in group 2 |
Relative frequencies of person scores in group 2. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
call |
The matched call. |
Draxler, C. (2010). Sample Size Determination for Rasch Model Tests. Psychometrika, 75(4), 708–724.
Draxler, C., & Alexandrowicz, R. W. (2015). Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika, 80(4), 897–919.
Draxler, C., Kurz, A., & Lemonte, A. J. (2020). The Gradient Test and its Finite Sample Size Properties in a Conditional Maximum Likelihood and Psychometric Modeling Context. Communications in Statistics-Simulation and Computation, 1-19.
Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 69–95). New York: Springer.
Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of Fit for Polytomous Rasch Models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 325-352). New York: Springer.
## Not run: 
# Numerical example for post hoc power analysis for Rasch Model

y <- eRm::raschdat1
n <- nrow(y)                      # sample size
x <- c(rep(0, n/2), rep(1, n/2))  # binary covariate

res <- post_hocRM(data = y, x = x, alpha = 0.05)

# > res
# $test
#      W     LR     RS     GR
# 29.241 29.981 29.937 30.238
#
# $power
#     W    LR    RS    GR
# 0.890 0.900 0.899 0.903
#
# $`observed global deviation`
#     W    LR    RS    GR
# 0.292 0.300 0.299 0.302
#
# $`observed local deviation`
#           I2    I3    I4    I5    I6    I7    I8    I9   I10   I11
# group1 1.039 0.693 2.790 2.404 1.129 1.039 0.864 1.039 2.790 2.244
# group2 2.006 0.945 2.006 3.157 1.834 0.690 0.822 1.061 2.689 2.260
#          I12   I13   I14   I15   I16   I17   I18   I19   I20   I21
# group1 1.412 3.777 3.038 1.315 2.244 1.039 1.221 2.404 0.608 0.608
# group2 0.945 2.962 4.009 1.171 2.175 1.472 2.091 2.344 1.275 0.690
#          I22   I23   I24   I25   I26   I27   I28   I29   I30
# group1 0.438 0.608 1.617 3.038 0.438 1.617 2.100 2.583 0.864
# group2 0.822 1.275 1.565 2.175 0.207 1.746 1.746 2.260 0.822
#
# $`person score distribution in group 1`
#
#    1    2    3    4    5    6    7    8    9   10   11   12   13
# 0.02 0.02 0.02 0.06 0.02 0.10 0.10 0.06 0.10 0.12 0.08 0.12 0.12
#   14   15   16   17   18   19   20   21   22   23   24   25   26
# 0.06 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
#   27   28   29
# 0.00 0.00 0.00
#
# $`person score distribution in group 2`
#
#    1    2    3    4    5    6    7    8    9   10   11   12   13
# 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
#   14   15   16   17   18   19   20   21   22   23   24   25   26
# 0.08 0.12 0.10 0.16 0.06 0.04 0.10 0.12 0.08 0.02 0.02 0.02 0.08
#   27   28   29
# 0.00 0.00 0.00
#
# $`degrees of freedom`
# [1] 29
#
# $`noncentrality parameter`
#      W     LR     RS     GR
# 29.241 29.981 29.937 30.238
#
# $call
# post_hocRM(alpha = 0.05, data = y, x = x)

## End(Not run)
Returns power of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test given the probability of error of the first kind $\alpha$, the sample size, and a deviation from the hypothesis to be tested. The latter states that the shift parameter quantifying the constant change for all items between time points 1 and 2 equals 0. The alternative states that the shift parameter is not equal to 0. It is assumed that the same items are presented at both time points. See function change_test.
powerChange(n_total, eta, alpha = 0.05, persons = rnorm(10^6))
n_total |
Total sample size for which power shall be determined. |
eta |
A vector of eta parameters of the LLTM. The last element represents the constant change or shift for all items between time points 1 and 2. The other elements of the vector are the item parameters at time point 1. A choice of the eta parameters constitutes a scenario of deviation from the hypothesis of no change. |
alpha |
Probability of the error of first kind. |
persons |
A vector of person parameters (drawn from a specified distribution). By default, 10^6 person parameters are drawn at random from the standard normal distribution. |
In general, the power of the tests is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral $\chi^2$ distributions with $df = 1$ and noncentrality parameter $\lambda$. The latter depends on a scenario of deviation from the hypothesis to be tested and a specified sample size. Given the probability of the error of the first kind $\alpha$, the power of the tests can be determined from the corresponding noncentral $\chi^2$ distribution. More details about the distributions of the test statistics and the relationship between $\lambda$, power, and sample size can be found in Draxler and Alexandrowicz (2015).
As regards the concept of sample size a distinction between informative and total sample size has to be made since the power of the tests depends only on the informative sample size. In the conditional maximum likelihood context, the responses of persons with minimum or maximum person score are completely uninformative. They do not contribute to the value of the test statistic. Thus, the informative sample size does not include these persons. The total sample size is composed of all persons.
In particular, the determination of $\lambda$ and the power of the tests, respectively, is based on a simple Monte Carlo approach. Data (responses of a large number of persons to a number of items presented at two time points) are generated given a user-specified scenario of a deviation from the hypothesis to be tested. The hypothesis to be tested assumes no change between time points 1 and 2. A scenario of a deviation is given by a choice of the item parameters at time point 1 and the shift parameter, i.e., the LLTM eta parameters, as well as the person parameters (to be drawn randomly from a specified distribution). The shift parameter represents a constant change of all item parameters from time point 1 to time point 2. A test statistic $T$ (Wald, LR, score, or gradient) is computed from the simulated data. The observed value $t$ of the test statistic is then divided by the informative sample size $n$ observed in the simulated data. This yields the so-called global deviation $\delta = t / n$, i.e., the chosen scenario of a deviation from the hypothesis to be tested being represented by a single number. The power of the tests can be determined given a user-specified total sample size denoted by n_total. The noncentrality parameter $\lambda$ can then be expressed by

$\lambda = \delta \, q \, n_{total}$,

where $q$ is the proportion of informative persons in the simulated data, i.e., $n$ divided by the total number of persons in the simulated data. Let $c_\alpha$ be the $(1 - \alpha)$ quantile of the central $\chi^2$ distribution with $df = 1$. Then,

power $= 1 - F_{df, \lambda}(c_\alpha)$,

where $F_{df, \lambda}$ is the cumulative distribution function of the noncentral $\chi^2$ distribution with $df = 1$ and $\lambda = \delta \, q \, n_{total}$. Thereby, it is assumed that the sample of size $n_{total}$ is composed of a frequency distribution of person scores that is proportional to the observed distribution of person scores in the simulated data.
Note that in this approach the data have to be generated only once. There are no replications needed. Thus, the procedure is computationally not very time-consuming.
Since $\delta$ is determined from the value of the test statistic observed in the simulated data, it has to be treated as a realized value of a random variable $\Delta$. The same holds true for $\lambda$ as well as the power of the tests. Thus, the power is a realized value of a random variable that shall be denoted by $P$. Consequently, the (realized) value of the power of the tests need not be equal to the exact power that follows from the user-specified $n_{total}$, $\alpha$, and the chosen item parameters and shift parameter used for the simulation of the data. If the CML estimates of these parameters computed from the simulated data are close to the predetermined parameters, the power of the tests will be close to the exact value. This will generally be the case if the number of person parameters used for simulating the data is large, e.g., $10^5$ or even $10^6$ persons. In such cases, the possible random error of the computation procedure based on the simulated data may not be of practical relevance any more. That is why a large number (of persons for the simulation process) is generally recommended.
For theoretical reasons, the random error involved in computing the power of the tests can be pretty well approximated. A suitable approach is the well-known delta method. Basically, it is a Taylor polynomial of first order, i.e., a linear approximation of a function. According to it, the variance of a function of a random variable can be linearly approximated by multiplying the variance of this random variable with the square of the first derivative of the respective function. In the present problem, the variance of the test statistic $T$ is (approximately) given by the variance of a noncentral $\chi^2$ distribution. Thus,

$Var(T) = 2 \, (df + 2 \lambda)$,

with $df = 1$ and $\lambda = \delta \, n$. Since the global deviation $\delta = t / n$, it follows for the variance of the corresponding random variable $\Delta$ that

$Var(\Delta) = Var(T) / n^2$.

The power of the tests is a function of $\Delta$ which is given by

$P = 1 - F_{df, \lambda}(c_\alpha)$,

where $\lambda = \Delta \, q \, n_{total}$ and $df = 1$. Then, by the delta method one obtains (for the variance of $P$)

$Var(P) = Var(\Delta) \, (\partial P / \partial \Delta)^2$,

where $\partial P / \partial \Delta$ is the derivative of $P$ with respect to $\Delta$. This derivative is determined numerically and evaluated at $\delta$ using the package numDeriv. The square root of $Var(P)$ is then used to quantify the random error of the suggested Monte Carlo computation procedure. It is called Monte Carlo error of power.
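A hedged sketch of these computations with purely illustrative inputs; T_obs, n_info, and q are placeholders for quantities read off the simulated data, and the variance formula follows the approximation stated above:

alpha   <- 0.05
df      <- 1                      # one shift parameter under test
T_obs   <- 72000                  # test statistic from the simulated data (illustrative)
n_info  <- 9e5                    # informative persons in the simulated data (illustrative)
q       <- 0.9                    # proportion of informative persons (illustrative)
n_total <- 150                    # planned total sample size

delta   <- T_obs / n_info                 # global deviation
crit    <- qchisq(1 - alpha, df)          # central quantile c_alpha
pw_fun  <- function(d) 1 - pchisq(crit, df, ncp = d * q * n_total)
power   <- pw_fun(delta)                  # approx. 0.9 for these inputs

# Delta-method sketch of the Monte Carlo error of power:
var_T     <- 2 * (df + 2 * delta * n_info)  # variance of the noncentral chi-square
var_delta <- var_T / n_info^2               # variance of the global deviation
dPd       <- numDeriv::grad(pw_fun, delta)  # numerical derivative of the power function
mc_error  <- sqrt(var_delta * dPd^2)        # approx. 0.002 for these inputs
c(power = power, mc_error = mc_error)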
A list of results.
power |
Power value for each test. |
MC error of power |
Monte Carlo error of power computation for each test. |
deviation |
Shift parameter estimated from the simulated data representing the constant shift of item parameters between time points 1 and 2. |
person score distribution |
Relative frequencies of person scores observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
call |
The matched call. |
Draxler, C., & Alexandrowicz, R. W. (2015). Sample size determination within the scope of conditional maximum likelihood estimation with special focus on testing the Rasch model. Psychometrika, 80(4), 897-919.
Fischer, G. H. (1995). The Linear Logistic Test Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, Recent Developments, and Applications (pp. 131-155). New York: Springer.
Fischer, G. H. (1983). Logistic Latent Trait Models with Linear Constraints. Psychometrika, 48(1), 3-26.
sa_sizeChange and post_hocChange.
## Not run: 
# Numerical example: 4 items presented twice, thus 8 virtual items

# eta parameters, first 4 are nuisance
# (easiness parameters of the 4 items at time point 1),
# last one is the shift parameter
eta <- c(-2,-1,1,2,0.5)

res <- powerChange(n_total = 150, eta = eta, persons = rnorm(10^6))

# > res
# $power
#     W    LR    RS    GR
# 0.905 0.910 0.908 0.911
#
# $`MC error of power`
#     W    LR    RS    GR
# 0.002 0.002 0.002 0.002
#
# $`deviation (estimate of shift parameter)`
# [1] 0.499
#
# $`person score distribution`
#
#     1     2     3     4     5     6     7
# 0.034 0.093 0.181 0.249 0.228 0.147 0.068
#
# $`degrees of freedom`
# [1] 1
#
# $`noncentrality parameter`
#      W     LR     RS     GR
# 10.692 10.877 10.815 10.939
#
# $call
# powerChange(alpha = 0.05, n_total = 150, eta = eta, persons = rnorm(10^6))

## End(Not run)
Returns power of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test given the probability of error of the first kind $\alpha$, the sample size, and a deviation from the hypothesis to be tested.
The hypothesis to be tested assumes equal item-category parameters of the
partial credit model between two predetermined groups of persons. The alternative
states that at least one of the parameters differs between the two groups.
powerPCM(n_total, local_dev, alpha = 0.05, persons1 = rnorm(10^6), persons2 = rnorm(10^6))
n_total |
Total sample size for which power shall be determined. |
local_dev |
A list consisting of two lists. One list refers to group 1, the other to group 2. Each of the two lists contains a numeric vector per item, i.e., each list contains as many vectors as items. Each vector contains the free item-cat. parameters of the respective item. The number of free item-cat. parameters per item equals the number of categories of the item minus 1. |
alpha |
Probability of error of first kind. |
persons1 |
A vector of person parameters in group 1 (drawn from a specified distribution). By default, 10^6 person parameters are drawn at random from the standard normal distribution. |
persons2 |
A vector of person parameters in group 2 (drawn from a specified distribution). By default, 10^6 person parameters are drawn at random from the standard normal distribution. |
In general, the power of the tests is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral $\chi^2$ distributions with $df$ equal to the number of free item-category parameters and noncentrality parameter $\lambda$. The latter depends on a scenario of deviation from the hypothesis to be tested and a specified sample size. Given the probability of the error of the first kind $\alpha$, the power of the tests can be determined from the corresponding noncentral $\chi^2$ distribution. More details about the distributions of the test statistics and the relationship between $\lambda$, power, and sample size can be found in Draxler and Alexandrowicz (2015).
As regards the concept of sample size a distinction between informative and total sample size has to be made since the power of the tests depends only on the informative sample size. In the conditional maximum likelihood context, the responses of persons with minimum or maximum person score are completely uninformative. They do not contribute to the value of the test statistic. Thus, the informative sample size does not include these persons. The total sample size is composed of all persons.
In particular, the determination of $\lambda$ and the power of the tests, respectively, is based on a simple Monte Carlo approach. Data (responses of a large number of persons to a number of items) are generated given a user-specified scenario of a deviation from the hypothesis to be tested. A scenario of a deviation is given by a choice of the item-category parameters and the person parameters (to be drawn randomly from a specified distribution) for each of the two groups. Such a scenario may be called local deviation since deviations can be specified locally for each item-category. The relative group sizes are determined by the choice of the number of person parameters for each of the two groups. For instance, by default $10^6$ person parameters are selected randomly for each group. In this case, it is implicitly assumed that the two groups of persons are of equal size. The user can specify the relative group sizes by choosing the length of the arguments persons1 and persons2 appropriately (see the sketch after this paragraph). Note that the relative group sizes do have an impact on power and sample size of the tests. The next step is to compute a test statistic $T$ (Wald, LR, score, or gradient) from the simulated data. The observed value $t$ of the test statistic is then divided by the informative sample size $n$ observed in the simulated data. This yields the so-called global deviation $\delta = t / n$, i.e., the chosen scenario of a deviation from the hypothesis to be tested being represented by a single number. The power of the tests can be determined given a user-specified total sample size denoted by n_total. The noncentrality parameter $\lambda$ can then be expressed by

$\lambda = \delta \, q \, n_{total}$,

where $q$ is the proportion of informative persons in the simulated data, i.e., $n$ divided by the total number of persons in the simulated data. Let $c_\alpha$ be the $(1 - \alpha)$ quantile of the central $\chi^2$ distribution with $df$ equal to the number of free item-category parameters. Then,

power $= 1 - F_{df, \lambda}(c_\alpha)$,

where $F_{df, \lambda}$ is the cumulative distribution function of the noncentral $\chi^2$ distribution with $df$ equal to the number of free item-category parameters and $\lambda = \delta \, q \, n_{total}$. Thereby, it is assumed that the sample of size $n_{total}$ is composed of a frequency distribution of person scores that is proportional to the observed distribution of person scores in the simulated data. The same holds true in respect of the relative group sizes, i.e., the relative frequencies of the two person groups in a sample of size $n_{total}$ are assumed to be equal to the relative frequencies of the two groups in the simulated data.
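A hedged usage sketch of unequal relative group sizes (about 2:1) chosen via the lengths of persons1 and persons2; smaller simulated samples than the default are used purely to keep the illustration light, and the local_dev object is taken from the Examples below:

local_dev <- list(
  list(c(0, 0), c(-1, 0), c(0, 0), c(1, 0), c(1,  0.5)),
  list(c(0, 0), c(-1, 0), c(0, 0), c(1, 0), c(0, -0.5))
)
res <- powerPCM(n_total = 300, local_dev = local_dev, alpha = 0.05,
                persons1 = rnorm(2 * 10^5),  # group 1 simulated twice as large as group 2
                persons2 = rnorm(10^5))
res$power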
Note that in this approach the data have to be generated only once. There are no replications needed. Thus, the procedure is computationally not very time-consuming.
Since $\delta$ is determined from the value of the test statistic observed in the simulated data, it has to be treated as a realized value of a random variable $\Delta$. The same holds true for $\lambda$ as well as the power of the tests. Thus, the power is a realized value of a random variable that shall be denoted by $P$. Consequently, the (realized) value of the power of the tests need not be equal to the exact power that follows from the user-specified $n_{total}$, $\alpha$, and the chosen item-category parameters used for the simulation of the data. If the CML estimates of these parameters computed from the simulated data are close to the predetermined parameters, the power of the tests will be close to the exact value. This will generally be the case if the number of person parameters used for simulating the data is large, e.g., $10^5$ or even $10^6$ persons. In such cases, the possible random error of the computation procedure based on the simulated data may not be of practical relevance any more. That is why a large number (of persons for the simulation process) is generally recommended.
For theoretical reasons, the random error involved in computing the power of the tests can be pretty well approximated. A suitable approach is the well-known delta method. Basically, it is a Taylor polynomial of first order, i.e., a linear approximation of a function. According to it, the variance of a function of a random variable can be linearly approximated by multiplying the variance of this random variable with the square of the first derivative of the respective function. In the present problem, the variance of the test statistic $T$ is (approximately) given by the variance of a noncentral $\chi^2$ distribution with $df$ equal to the number of free item-category parameters and noncentrality parameter $\lambda$. Thus,

$Var(T) = 2 \, (df + 2 \lambda)$,

with $\lambda = \delta \, n$. Since the global deviation $\delta = t / n$, it follows for the variance of the corresponding random variable $\Delta$ that

$Var(\Delta) = Var(T) / n^2$.

The power of the tests is a function of $\Delta$ which is given by

$P = 1 - F_{df, \lambda}(c_\alpha)$,

where $\lambda = \Delta \, q \, n_{total}$ and $df$ is equal to the number of free item-category parameters. Then, by the delta method one obtains (for the variance of $P$)

$Var(P) = Var(\Delta) \, (\partial P / \partial \Delta)^2$,

where $\partial P / \partial \Delta$ is the derivative of $P$ with respect to $\Delta$. This derivative is determined numerically and evaluated at $\delta$ using the package numDeriv. The square root of $Var(P)$ is then used to quantify the random error of the suggested Monte Carlo computation procedure. It is called Monte Carlo error of power.
A list of results.
power |
Power value for each test. |
MC error of power |
Monte Carlo error of power computation for each test. |
global deviation |
Global deviation computed from simulated data for each test. See Details. |
local deviation |
CML estimates of free item-category parameters in both groups of persons obtained from the simulated data expressing a deviation from the hypothesis to be tested locally per item and response category. |
person score distribution in group 1 |
Relative frequencies of person scores in group 1 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
person score distribution in group 2 |
Relative frequencies of person scores in group 2 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
call |
The matched call. |
Draxler, C. (2010). Sample Size Determination for Rasch Model Tests. Psychometrika, 75(4), 708–724.
Draxler, C., & Alexandrowicz, R. W. (2015). Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika, 80(4), 897–919.
Draxler, C., Kurz, A., & Lemonte, A. J. (2020). The Gradient Test and its Finite Sample Size Properties in a Conditional Maximum Likelihood and Psychometric Modeling Context. Communications in Statistics-Simulation and Computation, 1-19.
Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 69–95). New York: Springer.
Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of Fit for Polytomous Rasch Models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 325-352). New York: Springer.
sa_sizePCM and post_hocPCM.
## Not run: 
# Numerical example
# free item-category parameters for group 1 and 2 with 5 items, with 3 categories each

local_dev <- list(
  list(c(0, 0), c(-1, 0), c(0, 0), c(1, 0), c(1,  0.5)),
  list(c(0, 0), c(-1, 0), c(0, 0), c(1, 0), c(0, -0.5))
)

res <- powerPCM(n_total = 200, local_dev = local_dev)

# > res
# $power
#     W    LR    RS    GR
# 0.863 0.885 0.876 0.892
#
# $`MC error of power`
#     W    LR    RS    GR
# 0.002 0.002 0.002 0.002
#
# $`global deviation`
#     W    LR    RS    GR
# 0.102 0.107 0.105 0.109
#
# $`local deviation`
#         I1-C2  I2-C1  I2-C2  I3-C1  I3-C2 I4-C1 I4-C2  I5-C1  I5-C2
# group1  0.002 -0.997 -0.993  0.006  0.012 1.002 1.007  1.006  1.508
# group2 -0.007 -1.005 -1.007 -0.006 -0.009 0.993 0.984 -0.006 -0.510
#
# $`person score distribution in group 1`
#
#     1     2     3     4     5     6     7     8     9
# 0.112 0.130 0.131 0.129 0.122 0.114 0.101 0.091 0.070
#
# $`person score distribution in group 2`
#
#     1     2     3     4     5     6     7     8     9
# 0.091 0.108 0.117 0.122 0.122 0.121 0.115 0.110 0.093
#
# $`degrees of freedom`
# [1] 9
#
# $`noncentrality parameter`
#      W     LR     RS     GR
# 18.003 19.024 18.596 19.403
#
# $call
# powerPCM(alpha = 0.05, n_total = 200, persons1 = rnorm(10^6),
#          persons2 = rnorm(10^6), local_dev = local_dev)

## End(Not run)
Returns power of Wald (W), likelihood ratio (LR), Rao score (RS), and gradient (GR) test given the probability of error of the first kind $\alpha$, the sample size, and a deviation from the hypothesis to be tested.
The latter assumes equality of the item parameters in the Rasch model
between two predetermined groups of persons. The alternative states that at least
one of the parameters differs between the two groups.
powerRM(n_total, local_dev, alpha = 0.05, persons1 = rnorm(10^6), persons2 = rnorm(10^6))
n_total |
Total sample size for which power shall be determined. |
local_dev |
A list of two vectors containing item parameters for the two person groups representing a deviation from the hypothesis to be tested locally per item. |
alpha |
Probability of error of first kind. |
persons1 |
A vector of person parameters in group 1 (drawn from a specified distribution). By default, 10^6 person parameters are drawn at random from the standard normal distribution. |
persons2 |
A vector of person parameters in group 2 (drawn from a specified distribution). By default, 10^6 person parameters are drawn at random from the standard normal distribution. |
In general, the power of the tests is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral $\chi^2$ distributions with $df$ equal to the number of items minus 1 and noncentrality parameter $\lambda$. The latter depends on a scenario of deviation from the hypothesis to be tested and a specified sample size. Given the probability of the error of the first kind $\alpha$, the power of the tests can be determined from the corresponding noncentral $\chi^2$ distribution. More details about the distributions of the test statistics and the relationship between $\lambda$, power, and sample size can be found in Draxler and Alexandrowicz (2015).
As regards the concept of sample size a distinction between informative and total sample size has to be made since the power of the tests depends only on the informative sample size. In the conditional maximum likelihood context, the responses of persons with minimum or maximum person score are completely uninformative. They do not contribute to the value of the test statistic. Thus, the informative sample size does not include these persons. The total sample size is composed of all persons.
In particular, the determination of lambda and the power of the tests, respectively, is based on a simple Monte Carlo approach. Data (responses of a large number of persons to a number of items) are generated given a user-specified scenario of a deviation from the hypothesis to be tested. A scenario of a deviation is given by a choice of the item parameters and the person parameters (to be drawn randomly from a specified distribution) for each of the two groups. Such a scenario may be called local deviation since deviations can be specified locally for each item. The relative group sizes are determined by the choice of the number of person parameters for each of the two groups. For instance, by default 10^6 person parameters are selected randomly for each group. In this case, it is implicitly assumed that the two groups of persons are of equal size. The user can specify the relative group sizes by choosing the lengths of the arguments persons1 and persons2 appropriately. Note that the relative group sizes do have an impact on power and sample size of the tests. The next step is to compute a test statistic T (Wald, LR, score, or gradient) from the simulated data. The observed value t of the test statistic is then divided by the informative sample size observed in the simulated data. This yields the so-called global deviation e, i.e., the chosen scenario of a deviation from the hypothesis to be tested being represented by a single number. The power of the tests can be determined given a user-specified total sample size denoted by n_total. The noncentrality parameter can then be expressed by lambda = n_total * (1 - p) * e, where n_total denotes the total number of persons and 1 - p is the proportion of informative persons in the sim. data. Let q_alpha be the (1 - alpha) quantile of the central chi-square distribution with df equal to the number of items minus 1. Then, the power is given by 1 - F_{df, lambda}(q_alpha), where F_{df, lambda} is the cumulative distribution function of the noncentral chi-square distribution with df equal to the number of items minus 1 and noncentrality parameter lambda = n_total * (1 - p) * e. Thereby, it is assumed that n_total is composed of a frequency distribution of person scores that is proportional to the observed distribution of person scores in the simulated data. The same holds true in respect of the relative group sizes, i.e., the relative frequencies of the two person groups in a sample of size n_total are assumed to be equal to the relative frequencies of the two groups in the simulated data.
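The relation between the global deviation, the noncentrality parameter, and the power can be illustrated with a few lines of base R. This is only a rough sketch under the notation above, not the internal code of powerRM; the values of e and p used in the call are hypothetical placeholders.

# Sketch (not the internal code of powerRM): power from a global deviation e,
# given df = number of items - 1, a user-specified n_total, and the
# proportion p of uninformative persons observed in the simulated data.
power_from_deviation <- function(e, n_total, p, df, alpha = 0.05) {
  lambda <- n_total * (1 - p) * e             # noncentrality parameter
  q_alpha <- qchisq(1 - alpha, df = df)       # (1 - alpha) quantile of the central chi-square
  1 - pchisq(q_alpha, df = df, ncp = lambda)  # power = 1 - F_{df, lambda}(q_alpha)
}
# hypothetical numbers: 5 items (df = 4), e = 0.12, 5 percent uninformative persons
power_from_deviation(e = 0.12, n_total = 130, p = 0.05, df = 4)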
Note that in this approach the data have to be generated only once. There are no replications needed. Thus, the procedure is computationally not very time-consuming.
Since e is determined from the value of the test statistic observed in the simulated data it has to be treated as a realized value of a random variable E. The same holds true for lambda as well as for the power of the tests. Thus, the power is a realized value of a random variable that shall be denoted by P. Consequently, the (realized) value of the power of the tests need not be equal to the exact power that follows from the user-specified n_total, alpha, and the chosen item parameters used for the simulation of the data. If the CML estimates of these parameters computed from the simulated data are close to the predetermined parameters, the power of the tests will be close to the exact value. This will generally be the case if the number of person parameters used for simulating the data is large, e.g., 10^5 or even 10^6 persons. In such cases, the possible random error of the computation procedure based on the sim. data may not be of practical relevance any more. That is why a large number (of persons for the simulation process) is generally recommended.
For theoretical reasons, the random error involved in computing the power of the tests can be pretty well approximated. A suitable approach is the well-known delta method. Basically, it is a Taylor polynomial of first order, i.e., a linear approximation of a function. According to it, the variance of a function of a random variable can be linearly approximated by multiplying the variance of this random variable with the square of the first derivative of the respective function. In the present problem, the variance of the test statistic T is (approximately) given by the variance of a noncentral chi-square distribution with df equal to the number of free item parameters and noncentrality parameter lambda. Thus, Var(T) = 2 * (df + 2 * lambda), with lambda = n_sim * e, where n_sim denotes the informative sample size observed in the simulated data. Since the global deviation e = t / n_sim, it follows for the variance of the corresponding random variable E that Var(E) = 2 * (df + 2 * lambda) / n_sim^2. The power of the tests is a function of E which is given by P = 1 - F_{df, lambda}(q_alpha), where lambda = n_total * (1 - p) * E and df is equal to the number of free item parameters. Then, by the delta method one obtains (for the variance of P) Var(P) = Var(E) * (dP/dE)^2, where dP/dE is the derivative of P with respect to E. This derivative is determined numerically and evaluated at e using the package numDeriv. The square root of Var(P) is then used to quantify the random error of the suggested Monte Carlo computation procedure. It is called Monte Carlo error of power.
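The delta-method computation of the Monte Carlo error of power can be sketched as follows; this is a hedged illustration of the steps just described, not the package's implementation. The quantities e, n_sim, n_total, p, df, and alpha are as defined above, and numDeriv::grad() is used for the numerical derivative.

# Sketch of the Monte Carlo error of power (notation as in the text above).
mc_error_power <- function(e, n_sim, n_total, p, df, alpha = 0.05) {
  lambda_sim <- n_sim * e                      # noncentrality implied by the simulated data
  var_T <- 2 * (df + 2 * lambda_sim)           # variance of the noncentral chi-square
  var_E <- var_T / n_sim^2                     # Var(E) = Var(T) / n_sim^2
  pow <- function(x) {                         # power as a function of the global deviation
    1 - pchisq(qchisq(1 - alpha, df), df, ncp = n_total * (1 - p) * x)
  }
  dP_dE <- numDeriv::grad(pow, e)              # derivative dP/dE evaluated at e
  sqrt(var_E * dP_dE^2)                        # Monte Carlo error of power
}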
A list of results.
power |
Power value for each test. |
MC error of power |
Monte Carlo error of power computation for each test. |
global deviation |
Global deviation computed from simulated data for each test. See Details. |
local deviation |
CML estimates of item parameters in both groups of persons obtained from the simulated data expressing a deviation from the hypothesis to be tested locally per item. |
person score distribution in group 1 |
Relative frequencies of person scores in group 1 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
person score distribution in group 2 |
Relative frequencies of person scores in group 2 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the power of the tests. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
call |
The matched call. |
Draxler, C. (2010). Sample Size Determination for Rasch Model Tests. Psychometrika, 75(4), 708–724.
Draxler, C., & Alexandrowicz, R. W. (2015). Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika, 80(4), 897–919.
Draxler, C., Kurz, A., & Lemonte, A. J. (2020). The Gradient Test and its Finite Sample Size Properties in a Conditional Maximum Likelihood and Psychometric Modeling Context. Communications in Statistics-Simulation and Computation, 1-19.
Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 69–95). New York: Springer.
Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of Fit for Polytomous Rasch Models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 325-352). New York: Springer.
sa_sizeRM, and post_hocRM.
## Not run: # Numerical example res <- powerRM(n_total = 130, local_dev = list( c(0, -0.5, 0, 0.5, 1) , c(0, 0.5, 0, -0.5, 1))) # > res # $power # W LR RS GR # 0.824 0.840 0.835 0.845 # # $`MC error of power` # W LR RS GR # 0.002 0.002 0.002 0.002 # # $`global deviation` # W LR RS GR # 0.118 0.122 0.121 0.124 # # $`local deviation` # Item2 Item3 Item4 Item5 # group1 -0.499 0.005 0.500 1.001 # group2 0.501 0.003 -0.499 1.003 # # $`person score distribution in group 1` # # 1 2 3 4 # 0.249 0.295 0.269 0.187 # # $`person score distribution in group 2` # # 1 2 3 4 # 0.249 0.295 0.270 0.186 # # $`degrees of freedom` # [1] 4 # # $`noncentrality parameter` # W LR RS GR # 12.619 13.098 12.937 13.264 # # $call # powerRM(n_total = 130, local_dev = list(c(0, -0.5, 0, 0.5, 1), # c(0, 0.5, 0, -0.5, 1))) ## End(Not run)
Returns sample size for Wald (W), likelihood ratio (LR), Rao score (RS)
and gradient (GR) test given probabilities of errors of first and second kinds alpha and beta as well as a deviation from the hypothesis to be tested. The hypothesis to be tested states that
the shift parameter quantifying the constant change for all items between time points 1 and 2
equals 0. The alternative states that the shift parameter is not equal to 0. It is assumed that the same
items are presented at both time points. See function change_test.
sa_sizeChange(eta, alpha = 0.05, beta = 0.05, persons = rnorm(10^6))
eta |
A vector of eta parameters of the LLTM. The last element represents the constant change or shift for all items between time points 1 and 2. The other elements of the vector are the item parameters at time point 1. A choice of the eta parameters constitutes a scenario of deviation from the hypothesis of no change. |
alpha |
Probability of error of first kind. |
beta |
Probability of error of second kind. |
persons |
A vector of person parameters (drawn from a specified distribution). By default, 10^6 parameters are drawn at random from the standard normal distribution. |
In general, the sample size is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral chi-square distributions with df = 1 and noncentrality parameter lambda. The latter is, inter alia, a function of the sample size. Hence, the sample size can be determined from the condition lambda = lambda_0, where lambda_0 is a predetermined constant which depends on the probabilities of the errors of the first and second kinds alpha and beta (or power). More details about the distributions of the test statistics and the relationship between lambda, power, and sample size can be found in Draxler and Alexandrowicz (2015).
In particular, the determination of lambda and the sample size, respectively, is based on a simple Monte Carlo approach. As regards the concept of sample size a distinction between informative and total sample size has to be made. In the conditional maximum likelihood context, the responses of persons with minimum or maximum person score are completely uninformative. They do not contribute to the value of the test statistic. Thus, the informative sample size does not include these persons. The total sample size is composed of all persons. The Monte Carlo approach used in the present problem to determine lambda and informative (and total) sample size can briefly be described as follows. Data (responses of a large number of persons to a number of items presented at two time points) are generated given a user-specified scenario of a deviation from the hypothesis to be tested. The hypothesis to be tested assumes no change between time points 1 and 2. A scenario of a deviation is given by a choice of the item parameters at time point 1 and the shift parameter, i.e., the LLTM eta parameters, as well as the person parameters (to be drawn randomly from a specified distribution). The shift parameter represents a constant change of all item parameters from time point 1 to time point 2. A test statistic T (Wald, LR, score, or gradient) is computed from the simulated data. The observed value t of the test statistic is then divided by the informative sample size observed in the simulated data. This yields the so-called global deviation e, i.e., the chosen scenario of a deviation from the hypothesis to be tested being represented by a single number. Let the informative sample size sought be denoted by n (thus, this is not the informative sample size observed in the sim. data). The noncentrality parameter lambda can be expressed by the product lambda = n * e. Then, it follows from the condition lambda = lambda_0 that n * e = lambda_0 and n = lambda_0 / e. Note that the sample of size n is assumed to be composed only of persons with informative person scores, where the relative frequency distribution of these informative scores is considered to be equal to the observed relative frequency distribution of the informative scores in the simulated data. The total sample size is then obtained from the relation n_total = n / (1 - p), where p is the proportion or relative frequency of persons observed in the simulated data with a minimum or maximum score. Basing the tests given a level alpha on an informative sample of size n, the probability of rejecting the hypothesis to be tested will be at least 1 - beta if the true global deviation is greater than or equal to e.
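As an illustration of these relations (again only a sketch under the notation above, not the code of sa_sizeChange), lambda_0 can be obtained as the noncentrality parameter at which the power of a chi-square test with df = 1 equals 1 - beta; the sample sizes then follow from n = lambda_0 / e and n_total = n / (1 - p). The values of e and p in the call below are hypothetical.

# Sketch: informative and total sample size from a global deviation e (df = 1).
sample_size_from_deviation <- function(e, p, df = 1, alpha = 0.05, beta = 0.05) {
  q_alpha <- qchisq(1 - alpha, df = df)        # critical value of the test
  # lambda_0: noncentrality parameter at which the power equals 1 - beta
  lambda_0 <- uniroot(function(l) (1 - pchisq(q_alpha, df = df, ncp = l)) - (1 - beta),
                      interval = c(0, 10^3))$root
  n_inf <- lambda_0 / e                        # informative sample size n
  n_tot <- n_inf / (1 - p)                     # total sample size
  c(informative = ceiling(n_inf), total = ceiling(n_tot))
}
sample_size_from_deviation(e = 0.073, p = 0.03) # hypothetical e and p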
Note that in this approach the data have to be generated only once. There are no replications needed. Thus, the procedure is computationally not very time-consuming.
Since e is determined from the value of the test statistic observed in the simulated data it has to be treated as a realized value of a random variable E. Consequently, n is also a realization of a random variable N. Thus, the (realized) value n need not be equal to the exact value of the informative sample size that follows from the user-specified (predetermined) alpha, beta, and scenario of a deviation from the hypothesis to be tested, i.e., the selected item parameters and shift parameter used for the simulation of the data. If the CML estimates of these parameters computed from the simulated data are close to the predetermined parameters, n will be close to the exact value. This will generally be the case if the number of person parameters used for simulating the data is large, e.g., 10^5 or even 10^6 persons. In such cases, the possible random error of the computation procedure of n based on the sim. data may not be of practical relevance any more. That is why a large number (of persons for the simulation process) is generally recommended.
For theoretical reasons, the random error involved in computing n can be pretty well approximated. A suitable approach is the well-known delta method. Basically, it is a Taylor polynomial of first order, i.e., a linear approximation of a function. According to it, the variance of a function of a random variable can be linearly approximated by multiplying the variance of this random variable with the square of the first derivative of the respective function. In the present problem, the variance of the test statistic T is (approximately) given by the variance of a noncentral chi-square distribution. Thus, Var(T) = 2 * (df + 2 * lambda), with df = 1 and lambda = n_sim * e, where n_sim denotes the informative sample size observed in the simulated data. Since the global deviation e = t / n_sim, it follows for the variance of the corresponding random variable E that Var(E) = 2 * (df + 2 * lambda) / n_sim^2. Since n = lambda_0 / e, one obtains by the delta method (for the variance of the corresponding random variable N) Var(N) = Var(E) * (lambda_0 / e^2)^2, where -lambda_0 / e^2 is the derivative of lambda_0 / e with respect to e. The square root of Var(N) is then used to quantify the random error of the suggested Monte Carlo computation procedure. It is called Monte Carlo error of informative sample size.
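The corresponding Monte Carlo error of the informative sample size can be sketched in the same spirit. This is a hedged illustration using the notation above, not the package's implementation; n_sim denotes the informative sample size observed in the simulated data and lambda_0 the constant defined earlier.

# Sketch: Monte Carlo error of the informative sample size (df = 1 for the change hypothesis).
mc_error_sample_size <- function(e, n_sim, lambda_0, df = 1) {
  lambda_sim <- n_sim * e                       # noncentrality implied by the simulated data
  var_E <- 2 * (df + 2 * lambda_sim) / n_sim^2  # Var(E) as derived above
  dN_dE <- -lambda_0 / e^2                      # derivative of lambda_0 / e with respect to e
  sqrt(var_E * dN_dE^2)                         # Monte Carlo error of informative sample size
}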
A list of results.
informative sample size |
Informative sample size for each test, omitting persons with minimum and maximum score. |
MC error of sample size |
Monte Carlo error of sample size computation for each test. |
deviation |
Shift parameter estimated from the simulated data representing the constant shift of item parameters between time points 1 and 2. |
person score distribution |
Relative frequencies of person scores observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the sample size. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
total sample size |
Total sample size for each test. See Details. |
call |
The matched call. |
Draxler, C., & Alexandrowicz, R. W. (2015). Sample size determination within the scope of conditional maximum likelihood estimation with special focus on testing the Rasch model. Psychometrika, 80(4), 897-919.
Fischer, G. H. (1995). The Linear Logistic Test Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, Recent Developments, and Applications (pp. 131-155). New York: Springer.
Fischer, G. H. (1983). Logistic Latent Trait Models with Linear Constraints. Psychometrika, 48(1), 3-26.
powerChange, and post_hocChange.
## Not run: # Numerical example 4 items presented twice, thus 8 virtual items # eta Parameter, first 4 are nuisance # (easiness parameters of the 4 items at time point 1), # last one is the shift parameter eta <- c(-2,-1,1,2,0.5) res <- sa_sizeChange(eta = eta) # > res # $`informative sample size` # W LR RS GR # 177 174 175 173 # # $`MC error of sample size` # W LR RS GR # 1.321 1.287 1.299 1.276 # # $`deviation (estimate of shift parameter)` # [1] 0.501 # # $`person score distribution` # # 1 2 3 4 5 6 7 # 0.034 0.094 0.181 0.249 0.227 0.147 0.068 # # $`degrees of freedom` # [1] 1 # # $`noncentrality parameter` # [1] 12.995 # # $`total sample size` # W LR RS GR # 182 179 180 178 # # $call # sa_sizeChange(alpha = 0.05, beta = 0.05, eta = eta, persons = rnorm(10^6)) ## End(Not run)
Returns sample size for Wald (W), likelihood ratio (LR), Rao score (RS)
and gradient (GR) test given probabilities of errors of first and second kinds alpha and beta as well as a deviation from the hypothesis to be tested. The hypothesis to be tested
assumes equal item-category parameters in the partial credit model between two predetermined groups of persons.
The alternative assumes that at least one parameter differs between the two groups.
sa_sizePCM( local_dev, alpha = 0.05, beta = 0.05, persons1 = rnorm(10^6), persons2 = rnorm(10^6) )
local_dev |
A list consisting of two lists. One list refers to group 1, the other to group 2. Each of the two lists contains a numerical vector per item, i.e., each list contains as many vectors as items. Each vector contains the free item-cat. parameters of the respective item. The number of free item-cat. parameters per item equals the number of categories of the item minus 1. |
alpha |
Probability of the error of first kind. |
beta |
Probability of the error of second kind. |
persons1 |
A vector of person parameters for group 1 (drawn from a specified distribution). By default, 10^6 parameters are drawn at random from the standard normal distribution. |
persons2 |
A vector of person parameters for group 2 (drawn from a specified distribution). By default, 10^6 parameters are drawn at random from the standard normal distribution. |
In general, the sample size is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral chi-square distributions with df equal to the number of free item-category parameters in the partial credit model and noncentrality parameter lambda. The latter is, inter alia, a function of the sample size. Hence, the sample size can be determined from the condition lambda = lambda_0, where lambda_0 is a predetermined constant which depends on the probabilities of the errors of the first and second kinds alpha and beta (or power). More details about the distributions of the test statistics and the relationship between lambda, power, and sample size can be found in Draxler and Alexandrowicz (2015).
In particular, the determination of lambda and the sample size, respectively, is based on a simple Monte Carlo approach. As regards the concept of sample size a distinction between informative and total sample size has to be made. In the conditional maximum likelihood context, the responses of persons with minimum or maximum person score are completely uninformative. They do not contribute to the value of the test statistic. Thus, the informative sample size does not include these persons. The total sample size is composed of all persons. The Monte Carlo approach used in the present problem to determine lambda and informative (and total) sample size can briefly be described as follows. Data (responses of a large number of persons to a number of items) are generated given a user-specified scenario of a deviation from the hypothesis to be tested. The hypothesis to be tested assumes equal item-category parameters between the two groups of persons. A scenario of a deviation is given by a choice of the item-cat. parameters and the person parameters (to be drawn randomly from a specified distribution) for each of the two groups. Such a scenario may be called local deviation since deviations can be specified locally for each item-category. The relative group sizes are determined by the choice of the number of person parameters for each of the two groups. For instance, by default 10^6 person parameters are selected randomly for each group. In this case, it is implicitly assumed that the two groups of persons are of equal size. The user can specify the relative group sizes by choosing the lengths of the arguments persons1 and persons2 appropriately. Note that the relative group sizes do have an impact on power and sample size of the tests. The next step is to compute a test statistic T (Wald, LR, score, or gradient) from the simulated data. The observed value t of the test statistic is then divided by the informative sample size observed in the simulated data. This yields the so-called global deviation e, i.e., the chosen scenario of a deviation from the hypothesis to be tested being represented by a single number. Let the informative sample size sought be denoted by n (thus, this is not the informative sample size observed in the sim. data). The noncentrality parameter lambda can be expressed by the product lambda = n * e. Then, it follows from the condition lambda = lambda_0 that n * e = lambda_0 and n = lambda_0 / e. Note that the sample of size n is assumed to be composed only of persons with informative person scores in both groups, where the relative frequency distribution of these informative scores in each of both groups is considered to be equal to the observed relative frequency distribution of informative scores in each of both groups in the simulated data. Note also that the relative sizes of the two person groups are assumed to be equal to the relative sizes of the two groups in the simulated data. By default, the two groups are equal-sized in the simulated data, i.e., one yields 10^6 persons (with informative scores) in each of the two groups. The total sample size n_total is obtained from the relation n_total = n / (1 - p), where p is the proportion or relative frequency of persons observed in the simulated data with a minimum or maximum score. Basing the tests given a level alpha on an informative sample of size n, the probability of rejecting the hypothesis to be tested will be at least 1 - beta if the true global deviation is greater than or equal to e.
Note that in this approach the data have to be generated only once. There are no replications needed. Thus, the procedure is computationally not very time-consuming.
Since e is determined from the value of the test statistic observed in the simulated data it has to be treated as a realization of a random variable E. Consequently, n is also a realization of a random variable N. Thus, the (realized) value n need not be equal to the exact value of the informative sample size that follows from the user-specified (predetermined) alpha, beta, and scenario of a deviation from the hypothesis to be tested, i.e., the selected item-category parameters used for the simulation of the data. If the CML estimates of these parameters computed from the simulated data are close to the predetermined parameters, n will be close to the exact value. This will generally be the case if the number of person parameters used for simulating the data, i.e., the lengths of the vectors persons1 and persons2, is large, e.g., 10^5 or even 10^6 persons. In such cases, the possible random error of the computation procedure of n based on the sim. data may not be of practical relevance any more. That is why a large number (of persons for the simulation process) is generally recommended.
For theoretical reasons, the random error involved in computing n can be pretty well approximated. A suitable approach is the well-known delta method. Basically, it is a Taylor polynomial of first order, i.e., a linear approximation of a function. According to it, the variance of a function of a random variable can be linearly approximated by multiplying the variance of this random variable with the square of the first derivative of the respective function. In the present problem, the variance of the test statistic T is (approximately) given by the variance of a noncentral chi-square distribution. Thus, Var(T) = 2 * (df + 2 * lambda), with df equal to the number of free item-category parameters and lambda = n_sim * e, where n_sim denotes the informative sample size observed in the simulated data. Since the global deviation e = t / n_sim, it follows for the variance of the corresponding random variable E that Var(E) = 2 * (df + 2 * lambda) / n_sim^2. Since n = lambda_0 / e, one obtains by the delta method (for the variance of the corresponding random variable N) Var(N) = Var(E) * (lambda_0 / e^2)^2, where -lambda_0 / e^2 is the derivative of lambda_0 / e with respect to e. The square root of Var(N) is then used to quantify the random error of the suggested Monte Carlo computation procedure. It is called Monte Carlo error of informative sample size.
A list of results.
informative sample size |
Informative sample size for each test, omitting persons with minimum and maximum score. |
MC error of sample size |
Monte Carlo error of informative sample size for each test. |
global deviation |
Global deviation computed from simulated data. See Details. |
local deviation |
CML estimates of free item-category parameters in both groups of persons obtained from the simulated data expressing a deviation from the hypothesis to be tested locally per item and response category. |
person score distribution in group 1 |
Relative frequencies of person scores in group 1 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the sample size. |
person score distribution in group 2 |
Relative frequencies of person scores in group 2 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the sample size. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
total sample size in group 1 |
Total sample size in group 1 for each test. See Details. |
total sample size in group 2 |
Total sample size in group 2 for each test. See Details. |
call |
The matched call. |
Draxler, C. (2010). Sample Size Determination for Rasch Model Tests. Psychometrika, 75(4), 708–724.
Draxler, C., & Alexandrowicz, R. W. (2015). Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika, 80(4), 897–919.
Draxler, C., Kurz, A., & Lemonte, A. J. (2020). The Gradient Test and its Finite Sample Size Properties in a Conditional Maximum Likelihood and Psychometric Modeling Context. Communications in Statistics-Simulation and Computation, 1-19.
Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 69–95). New York: Springer.
Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of Fit for Polytomous Rasch Models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 325-352). New York: Springer.
powerPCM, and post_hocPCM.
## Not run: ##### Sample size of PCM Model ##### # free item-category parameters for group 1 and 2 with 5 items, with 3 categories each local_dev <- list ( list(c( 0, 0), c( -1, 0), c( 0, 0), c( 1, 0), c( 1, 0.5)) , list(c( 0, 0), c( -1, 0), c( 0, 0), c( 1, 0), c( 0, -0.5)) ) res <- sa_sizePCM(alpha = 0.05, beta = 0.05, persons1 = rnorm(10^6), persons2 = rnorm(10^6), local_dev = local_dev) # > res # $`informative sample size` # W LR RS GR # 234 222 227 217 # # $`MC error of sample size` # W LR RS GR # 1.105 1.018 1.053 0.988 # # $`global deviation` # W LR RS GR # 0.101 0.107 0.104 0.109 # # $`local deviation` # I1-C2 I2-C1 I2-C2 I3-C1 I3-C2 I4-C1 I4-C2 I5-C1 I5-C2 # group1 -0.001 -1.000 -1.001 -0.003 -0.011 0.997 0.998 0.996 1.492 # group2 0.001 -0.998 -0.996 -0.007 -0.007 0.991 1.001 0.004 -0.499 # # $`person score distribution in group 1` # # 1 2 3 4 5 6 7 8 9 # 0.111 0.130 0.133 0.129 0.122 0.114 0.101 0.091 0.070 # # $`person score distribution in group 2` # # 1 2 3 4 5 6 7 8 9 # 0.090 0.109 0.117 0.121 0.121 0.121 0.116 0.111 0.093 # # $`degrees of freedom` # [1] 9 # # $`noncentrality parameter` # [1] 23.589 # # $`total sample size in group 1` # W LR RS GR # 132 125 128 123 # # $`total sample size in group 2` # W LR RS GR # 133 126 129 123 # # $call # sa_sizePCM(alpha = 0.05, beta = 0.05, persons1 = rnorm(10^6), # persons2 = rnorm(10^6), local_dev = local_dev) ## End(Not run)
Returns sample size for Wald (W), likelihood ratio (LR), Rao score (RS)
and gradient (GR) test given probabilities of errors of first and second kinds alpha and beta as well as a deviation from the hypothesis to be tested. The hypothesis to be
tested assumes equal item parameters between two predetermined groups of persons. The alternative assumes
that at least one parameter differs between the two groups.
sa_sizeRM( local_dev, alpha = 0.05, beta = 0.05, persons1 = rnorm(10^6), persons2 = rnorm(10^6) )
local_dev |
A list consisting of two vectors containing item parameters for the two person groups representing a deviation from the hypothesis to be tested locally per item. |
alpha |
Probability of the error of first kind. |
beta |
Probability of the error of second kind. |
persons1 |
A vector of person parameters for group 1 (drawn from a specified distribution). By default, 10^6 parameters are drawn at random from the standard normal distribution. |
persons2 |
A vector of person parameters for group 2 (drawn from a specified distribution). By default, 10^6 parameters are drawn at random from the standard normal distribution. |
In general, the sample size is determined from the assumption that the approximate distributions of the four test statistics are from the family of noncentral chi-square distributions with df equal to the number of items minus 1 and noncentrality parameter lambda. The latter is, inter alia, a function of the sample size. Hence, the sample size can be determined from the condition lambda = lambda_0, where lambda_0 is a predetermined constant which depends on the probabilities of the errors of the first and second kinds alpha and beta (or power). More details about the distributions of the test statistics and the relationship between lambda, power, and sample size can be found in Draxler and Alexandrowicz (2015).
In particular, the determination of lambda and the sample size, respectively, is based on a simple Monte Carlo approach. As regards the concept of sample size a distinction between informative and total sample size has to be made. In the conditional maximum likelihood context, the responses of persons with minimum or maximum person score are completely uninformative. They do not contribute to the value of the test statistic. Thus, the informative sample size does not include these persons. The total sample size is composed of all persons. The Monte Carlo approach used in the present problem to determine lambda and informative (and total) sample size can briefly be described as follows. Data (responses of a large number of persons to a number of items) are generated given a user-specified scenario of a deviation from the hypothesis to be tested. The hypothesis to be tested assumes equal item parameters between the two groups of persons. A scenario of a deviation is given by a choice of the item parameters and the person parameters (to be drawn randomly from a specified distribution) for each of the two groups. Such a scenario may be called local deviation since deviations can be specified locally for each item. The relative group sizes are determined by the choice of the number of person parameters for each of the two groups. For instance, by default 10^6 person parameters are selected randomly for each group. In this case, it is implicitly assumed that the two groups of persons are of equal size. The user can specify the relative group sizes by choosing the lengths of the arguments persons1 and persons2 appropriately. Note that the relative group sizes do have an impact on power and sample size of the tests. The next step is to compute a test statistic T (Wald, LR, score, or gradient) from the simulated data. The observed value t of the test statistic is then divided by the informative sample size observed in the simulated data. This yields the so-called global deviation e, i.e., the chosen scenario of a deviation from the hypothesis to be tested being represented by a single number. Let the informative sample size sought be denoted by n (thus, this is not the informative sample size observed in the sim. data). The noncentrality parameter lambda can be expressed by the product lambda = n * e. Then, it follows from the condition lambda = lambda_0 that n * e = lambda_0 and n = lambda_0 / e. Note that the sample of size n is assumed to be composed only of persons with informative person scores in both groups, where the relative frequency distribution of these informative scores in each of both groups is considered to be equal to the observed relative frequency distribution of informative scores in each of both groups in the simulated data. Note also that the relative sizes of the two person groups are assumed to be equal to the relative sizes of the two groups in the simulated data. By default, the two groups are equal-sized in the simulated data, i.e., one yields 10^6 persons (with informative scores) in each of the two groups. The total sample size n_total is obtained from the relation n_total = n / (1 - p), where p is the proportion or relative frequency of persons observed in the simulated data with a minimum or maximum score. Basing the tests given a level alpha on an informative sample of size n, the probability of rejecting the hypothesis to be tested will be at least 1 - beta if the true global deviation is greater than or equal to e.
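Continuing the note above on relative group sizes: unequal groups can be requested simply by passing person parameter vectors of different lengths. The following call is a hypothetical illustration (group 1 assumed twice as large as group 2), reusing the local_dev values from the Examples below.

## Not run: 
# Hypothetical call with group 1 assumed twice as large as group 2
res <- sa_sizeRM(local_dev = list(c(0, -0.5, 0, 0.5, 1), c(0, 0.5, 0, -0.5, 1)),
                 persons1 = rnorm(2 * 10^6), persons2 = rnorm(10^6))
## End(Not run)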
Note that in this approach the data have to be generated only once. There are no replications needed. Thus, the procedure is computationally not very time-consuming.
Since e is determined from the value of the test statistic observed in the simulated data it has to be treated as a realization of a random variable E. Consequently, n is also a realization of a random variable N. Thus, the (realized) value n need not be equal to the exact value of the informative sample size that follows from the user-specified (predetermined) alpha, beta, and scenario of a deviation from the hypothesis to be tested, i.e., the selected item parameters used for the simulation of the data. If the CML estimates of these parameters computed from the simulated data are close to the predetermined parameters, n will be close to the exact value. This will generally be the case if the number of person parameters used for simulating the data, i.e., the lengths of the vectors persons1 and persons2, is large, e.g., 10^5 or even 10^6 persons. In such cases, the possible random error of the computation procedure of n based on the sim. data may not be of practical relevance any more. That is why a large number (of persons for the simulation process) is generally recommended.
For theoretical reasons, the random error involved in computing n can be pretty well approximated. A suitable approach is the well-known delta method. Basically, it is a Taylor polynomial of first order, i.e., a linear approximation of a function. According to it, the variance of a function of a random variable can be linearly approximated by multiplying the variance of this random variable with the square of the first derivative of the respective function. In the present problem, the variance of the test statistic T is (approximately) given by the variance of a noncentral chi-square distribution. Thus, Var(T) = 2 * (df + 2 * lambda), with df equal to the number of items minus 1 and lambda = n_sim * e, where n_sim denotes the informative sample size observed in the simulated data. Since the global deviation e = t / n_sim, it follows for the variance of the corresponding random variable E that Var(E) = 2 * (df + 2 * lambda) / n_sim^2. Since n = lambda_0 / e, one obtains by the delta method (for the variance of the corresponding random variable N) Var(N) = Var(E) * (lambda_0 / e^2)^2, where -lambda_0 / e^2 is the derivative of lambda_0 / e with respect to e. The square root of Var(N) is then used to quantify the random error of the suggested Monte Carlo computation procedure. It is called Monte Carlo error of informative sample size.
A list of results.
informative sample size |
Informative sample size for each test, omitting persons with minimum and maximum score. |
MC error of sample size |
Monte Carlo error of informative sample size for each test. |
global deviation |
Global deviation computed from simulated data. See Details. |
local deviation |
CML estimates of free item parameters in both groups obtained from the simulated data. The first item parameter is set to 0 in both groups. |
person score distribution in group 1 |
Relative frequencies of person scores in group 1 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the sample size. |
person score distribution in group 2 |
Relative frequencies of person scores in group 2 observed in simulated data. Uninformative scores, i.e., minimum and maximum score, are omitted. Note that the person score distribution does also have an influence on the sample size. |
degrees of freedom |
Degrees of freedom |
noncentrality parameter |
Noncentrality parameter |
total sample size in group 1 |
Total sample size in group 1 for each test. See Details. |
total sample size in group 2 |
Total sample size in group 2 for each test. See Details. |
call |
The matched call. |
Draxler, C. (2010). Sample Size Determination for Rasch Model Tests. Psychometrika, 75(4), 708–724.
Draxler, C., & Alexandrowicz, R. W. (2015). Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model. Psychometrika, 80(4), 897–919.
Draxler, C., Kurz, A., & Lemonte, A. J. (2020). The Gradient Test and its Finite Sample Size Properties in a Conditional Maximum Likelihood and Psychometric Modeling Context. Communications in Statistics-Simulation and Computation, 1-19.
Glas, C. A. W., & Verhelst, N. D. (1995a). Testing the Rasch Model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 69–95). New York: Springer.
Glas, C. A. W., & Verhelst, N. D. (1995b). Tests of Fit for Polytomous Rasch Models. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch Models: Foundations, Recent Developments, and Applications (pp. 325-352). New York: Springer.
powerRM, and post_hocRM.
## Not run: ##### Sample size of Rasch Model ##### res <- sa_sizeRM(local_dev = list( c(0, -0.5, 0, 0.5, 1) , c(0, 0.5, 0, -0.5, 1))) # > res # $`informative sample size` # W LR RS GR # 159 153 155 151 # # $`MC error of sample size` # W LR RS GR # 0.721 0.682 0.695 0.670 # # $`global deviation` # W LR RS GR # 0.117 0.122 0.120 0.123 # # $`local deviation` # Item2 Item3 Item4 Item5 # group1 -0.502 -0.005 0.497 1.001 # group2 0.495 -0.006 -0.501 0.994 # # $`person score distribution in group 1` # # 1 2 3 4 # 0.249 0.295 0.268 0.188 # # $`person score distribution in group 2` # # 1 2 3 4 # 0.249 0.295 0.270 0.187 # # $`degrees of freedom` # [1] 4 # # $`noncentrality parameter` # [1] 18.572 # # $`total sample size in group 1` # W LR RS GR # 97 93 94 92 # # $`total sample size in group 2` # W LR RS GR # 97 93 94 92 # # $call # sa_sizeRM(local_dev = list(c(0, -0.5, 0, 0.5, 1), # c(0, 0.5, 0, -0.5, 1))) ## End(Not run)
Uses the function hessian() from the numDeriv package to compute (numerically approximate) the Hessian matrix evaluated at arbitrary values of the item easiness parameters.
tcl_hessian(X, eta, W, model = "RM")
X |
data matrix. |
eta |
numeric vector of item easiness parameters. |
W |
design matrix. |
model |
Character string specifying the model: "RM", "PCM", "RSM", or "LLTM". |
Hessian matrix evaluated at eta
Gilbert, P., Gilbert, M. P., & Varadhan, R. (2016). numDeriv: Accurate Numerical Derivatives. R package version 2016.8-1.1. url: https://CRAN.R-project.org/package=numDeriv
## Not run: # Rasch model with beta_1 restricted to 0 y <- eRm::raschdat1 res <- eRm::RM(X = y, sum0 = FALSE) mat <- tcl_hessian(X = y, eta = res$etapar, model = "RM") ## End(Not run)
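As a possible follow-up to the example above (an assumption, not something stated in this documentation): if the returned matrix is the Hessian of the conditional log-likelihood evaluated at the CML estimates, it is negative definite there, and approximate standard errors could be read off from the inverse of its negative.

## Not run: 
# Continuing the example above; the sign convention is an assumption.
se <- sqrt(diag(solve(-mat)))   # approximate standard errors of the eta estimates
## End(Not run)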
Uses the function jacobian() from the numDeriv package to compute (numerically approximate) the score function (first-order partial derivatives of the conditional log-likelihood function) evaluated at arbitrary values of the item easiness parameters.
tcl_scorefun(X, eta, W, model = "RM")
X |
data matrix. |
eta |
numeric vector of item easiness parameters. |
W |
design matrix. |
model |
Character string specifying the model: "RM", "PCM", "RSM", or "LLTM". |
Score function evaluated at eta
Gilbert, P., Gilbert, M. P., & Varadhan, R. (2016). numDeriv: Accurate Numerical Derivatives. R package version 2016.8-1.1. url: https://CRAN.R-project.org/package=numDeriv
## Not run: # Rasch model with beta_1 restricted to 0 y <- eRm::raschdat1 res <- eRm::RM(X = y, sum0 = FALSE) scorefun <- tcl_scorefun(X = y, eta = res$etapar, model = "RM") ## End(Not run)