---
title: "autoRasch: The Semi-automated Rasch Analysis"
# output:
#   html_vignette:
#     keep_md: true
#   pdf_document: default
# output: 
#   pdf_document:
#     latex_engine: pdflatex
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{autoRasch: The Semi-automated Rasch Analysis}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{}
# install.packages("remotes")
# remotes::install_github("fwijayanto/autoRasch", build_manual = TRUE, build_vignettes = TRUE)
```

```{r}
library(autoRasch)
library(doParallel)
```


### Computing the criteron score (IPOQ-LL/IPOQ-LL-DIF)

Utilizing the generalized partial credit model (GPCM) and the generalized partial credit model with DIF (GPCM-DIF), we develop a score as a criterion to judge the quality of an itemset within an original survey, called the In-plus-out-of-questionnaire log-likelihood (IPOQ-LL) and In-plus-out-of-questionnaire log-likelihood with DIF (IPOQ-LL-DIF), respectively.

For example, we have a 9-item original survey and we want to examine how good to estimate persons' abilities using only item~7~, item~8~, and item~9~. To compute the IPOQ-LL score we simply run

```{r}
grMap <- matrix(c(rep(0,50),rep(1,50)),ncol = 1, dimnames = list(c(1:100),c("cov")))
ipoqlldif_score <- autoRasch::compute_score(shortDIF, incl_set = c(1:4), type = "ipoqlldif", groups_map = grMap)
summary(ipoqlldif_score)
```

Furthermore, to compute multiple IPOQ-LL scores of several itemsets simultanously, we simply use

```{r}
ipoqll_scores <- compute_scores(shortDIF, incl_sets = rbind(c(1:3),c(2:4)), type = "ipoqll", cores = 2)
ipoqll_scores[,1:7]
```


### Semi-automated Rasch analysis by searching the maximum of the (IPOQ-LL/IPOQ-LL-DIF) score

The IPOQ-LL obtains by totalling the IQ-LL and OQ-LL. Changing `type = ipoqlldif` means the IPOQ-LL-DIF score is computed, by considering the DIF effects, instead of the IPOQ-LL. This log-likelihood is a score for model comparison, which means that there are more items combinations to be compared in order to obtain the maximum. Hence, we conduct the semi-automated Rasch analysis using the IPOQ-LL score by running

```{r}
setting <- autoRaschOptions()
setting$isHessian <- FALSE
stepwise_res <- stepwise_search(shortDIF, incl_set = c(1:4), cores = 2,
                               groups_map = grMap, method = "fast",
                               criterion = "ipoqlldif", isTraced = TRUE)
```

This `stepwise_search()` aims to search the maximum IPOQ-LL score over all items combinations possible. This maximum score correspond to the "best" itemset according to the semi-automated Rasch analysis. Therefore, to speed up the search, we implements parallelization in every step of the stepwise selection search. If `isTracked = TRUE` the function prints the combination of items which returns the highest IPOQ-LL score at every step.

Obtaining the analysis result, we could plot

```{r}
plot_search(stepwise_res, type="l")
```

The plot show the highest IPOQ-LL scores in every possible number of items in the itemsets. The numbers in the plot represent the item(s) which are removed (and added) to obtained the plotted scores, compared to the previous step. For instance, starting with full items, the highest IPOQ-LL score for itemset consisting with 8 items is obtained by removing item~1~. Subsequently, the highest IPOQ-LL score for itemset consisting with 7 items is obtained by removing item~2~.