---
title: "Between-Within"
package: mmrm
bibliography: '`r system.file("REFERENCES.bib", package = "mmrm")`'
csl: '`r system.file("jss.csl", package = "mmrm")`'
output:
  rmarkdown::html_vignette:
    toc: true
vignette: |
  %\VignetteIndexEntry{Between-Within}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  chunk_output_type: console
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(mmrm)
```

For determining the degrees of freedom (DF) required for the testing of fixed effects, one option is to use the "between-within" method, originally proposed by @schluchter1990small as a small-sample adjustment.

## General definition

Using this method, the DF are determined by the grouping level at which the term is estimated. Generally, assuming $G$ levels of grouping:

$$DF_g = N_g - (N_{g-1} + p_g), \quad g = 1, \dots, G+1$$

where $N_g$ is the number of groups at the $g$-th grouping level and $p_g$ is the number of parameters estimated at that level.

$N_0=1$ if the model includes an intercept term and $N_0=0$ otherwise. Note however that the DF for the intercept term itself (when it is included) are calculated at the $G+1$ level, i.e. for the intercept we use $DF_{G+1}$ degrees of freedom.

We note that general contrasts $C\beta$ have not been considered in the literature so far. Here we therefore use a pragmatic approach and define that for a general contrast matrix $C$ we take the minimum DF across the involved coefficients as the DF.

## MMRM special case

In our case of an MMRM (with only fixed effect terms), there is only a single grouping level (subject), so $G=1$. This means there are 3 potential "levels" of parameters (@galecki2013linear):

* Level 0: The intercept term, assuming the model has been fitted with one.
  - We use $DF_2$ degrees of freedom as defined below.
* Level 1: Effects that change between subjects, but not across observations within subjects.
  - These are the "between parameters".
  - The corresponding degrees of freedom are $DF_1 = N_1 - (N_0 + p_1)$.
  - In words this can be read as:\
    "Between" DF = "number of subjects" - ("1 if intercept otherwise 0" + "number of between parameters").
* Level 2: Effects that change within subjects.
  - These are the "within parameters".
  - The corresponding degrees of freedom are $DF_2 = N_2 - (N_1 + p_2)$.
  - In words this can be read as:\
    "Within" DF = "number of observations" - ("number of subjects" + "number of within parameters").

## Example

Let's look at a concrete example and what the "between-within" degrees of freedom method gives as results:

```{r}
fit <- mmrm(
  formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID),
  data = fev_data,
  control = mmrm_control(method = "Between-Within")
)
summary(fit)
```

Let's try to calculate the degrees of freedom manually now.

In `fev_data` there are 197 subjects with at least one non-missing `FEV1` observation, and 537 non-missing observations in total. Therefore we obtain the following numbers of groups $N_g$ at the levels $g=1,2$:

* $N_1 = 197$
* $N_2 = 537$

And we note that $N_0 = 1$ because we use an intercept term.

Now let's look at the design matrix:

```{r}
head(model.matrix(fit), 1)
```

Leaving the intercept term aside, we therefore have the following number of parameters for the corresponding effects:

* `RACE`: 2
* `SEX`: 1
* `ARMCD`: 1
* `AVISIT`: 3
* `ARMCD:AVISIT`: 3

In the model above, `RACE`, `SEX` and `ARMCD` are between-subjects effects and belong to level 1; they do not vary within subject across the repeated observations. On the other hand, `AVISIT` is a within-subject effect; it represents study visit, so naturally its value changes over repeated observations for each subject. Similarly, the interaction of `ARMCD` and `AVISIT` also belongs to level 2.
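The between/within classification above can also be checked mechanically. The following is a minimal sketch on toy data rather than `fev_data`; `varies_within()` is a hypothetical helper written for this illustration and is not part of `mmrm`. It treats a variable as a within-subject effect if it takes more than one distinct value for at least one subject.

```{r}
# Hypothetical helper (not part of mmrm): TRUE if `var` takes more than one
# distinct value within at least one subject, i.e. it varies within subjects.
varies_within <- function(data, var, id) {
  n_values <- tapply(data[[var]], data[[id]], function(x) length(unique(x)))
  any(n_values > 1)
}

# Toy data: 3 subjects with 2 visits each (illustrative values only).
toy <- data.frame(
  id = rep(c("a", "b", "c"), each = 2),
  arm = rep(c("PBO", "TRT", "PBO"), each = 2),
  visit = rep(c("VIS1", "VIS2"), times = 3)
)

# Classify each covariate: FALSE means "between", TRUE means "within".
is_within <- vapply(
  c("arm", "visit"),
  function(v) varies_within(toy, v, id = "id"),
  logical(1)
)
is_within
```

Applied to `fev_data` with `id = "USUBJID"`, the same check would be expected to flag `AVISIT` as the only within-subject variable among the covariates of the model above. An interaction term is treated as a within-subject effect as soon as one of its components varies within subjects.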
Therefore we obtain the following numbers of parameters $p_g$ at the levels $g=1,2$:

* $p_1 = 2 + 1 + 1 = 4$
* $p_2 = 3 + 3 = 6$

This gives the following degrees of freedom $DF_g$ at the levels $g=1,2$:

* $DF_1 = N_1 - (N_0 + p_1) = 197 - (1 + 4) = 192$
* $DF_2 = N_2 - (N_1 + p_2) = 537 - (197 + 6) = 334$

These degrees of freedom match those displayed in the summary table above: the between parameters use $DF_1 = 192$, while the within parameters and the intercept use $DF_2 = 334$.

## Differences compared to SAS

The implementation described above is not identical to that of SAS. Differences include:

* In SAS, when using an unstructured covariance matrix, all effects are assigned the between-subjects degrees of freedom.
* In SAS, the within-subjects degrees of freedom are affected by the number of subjects in which the effect takes different values.
* In SAS, if there are multiple within-subject effects containing classification variables, the within-subject degrees of freedom are partitioned into components corresponding to the subject-by-effect interactions.
* In SAS, the final effect you list in the `CONTRAST`/`ESTIMATE` statement is used to define the DF for general contrasts.

Code contributions for adding the SAS version of between-within degrees of freedom to the `mmrm` package are welcome!

## References