--- title: "Estimating Models with Interactions" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Estimating Models with Interactions} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, include=FALSE, message=FALSE, warning=FALSE} knitr::opts_chunk$set( collapse = TRUE, warning = FALSE, message = FALSE, fig.retina = 3, comment = "#>" ) ``` # Interactions with continuous variables To add interactions between covariates in your model, you can add additional arguments in the `pars` vector in the `logitr()` function separated by the `*` symbol. For example, let's say we want to interact `price` with `feat` in the following model: ```{r} library("logitr") model <- logitr( data = yogurt, outcome = 'choice', obsID = 'obsID', pars = c('price', 'feat', 'brand') ) ``` To do so, I could add `"price*feat"` to the `pars` vector: ```{r} model_price_feat <- logitr( data = yogurt, outcome = 'choice', obsID = 'obsID', pars = c('price', 'feat', 'brand', 'price*feat') ) ``` The model now has an estimated coefficient for the `price*feat` effect: ```{r} summary(model_price_feat) ``` # Interactions with discrete variables In the above example, both `price` and `feat` were continuous variables, so only a single interaction coefficient was needed. In the case of interacting _discrete_ variables, multiple interactions coefficients will be estimated according to the number of levels in the discrete attribute. For example, the interaction of `price` with `brand` will require three new interactions - one for each level of the `brand` variable except the first reference level: ```{r} model_price_brand <- logitr( data = yogurt, outcome = 'choice', obsID = 'obsID', pars = c('price', 'feat', 'brand', 'price*brand') ) ``` The model now has three estimated coefficients for the `price*brand` effect: ```{r} summary(model_price_brand) ``` # Interactions with individual-specific variables If you want to include interactions with individual-specific variables (for example, to assess the difference in an effect between groups of respondents), you should **not** include the individual-specific variable interactions using `*` in `pars`. This is because interactions inside `pars` automatically generate the interaction coefficient as well as coefficients for each covariate. For example, if you had a `group` variable that determined whether individuals belongs to group `A` or group `B`, including `price*group` in `pars` would create coefficients for `price`, `groupA`, and `price:groupA`, but the `groupA` coefficient would be unidentified. In this case, you should only include `price` and `price:groupA` in the model. For now, the only way to handle this situation is to manually create dummy-coded interaction variables to include in the model. To illustrate how one might do this, consider if the `yogurt` data frame had two groups of individuals: `A` and `B`. For simple illustration, I'll define these groups arbitrarily based on whether or not the `obsID` is even or odd: ```{r} # Create group A dummies yogurt$groupA <- ifelse(yogurt$obsID %% 2 == 0, 1, 0) ``` An interaction between the `group` variable and `price` can be included in the model by first manually creating a `price_groupA` interaction variable and then including it in `pars`: ```{r} # Create dummy coefficients for group interaction with price yogurt$price_groupA <- yogurt$price*yogurt$groupA model_price_group <- logitr( data = yogurt, outcome = 'choice', obsID = 'obsID', pars = c('price', 'feat', 'brand', 'price_groupA') ) ``` The model now has attribute coefficients for `price`, `feat`, and `brand` as well as an interaction between the `group` and `price`: ```{r} summary(model_price_group) ``` # Interactions in mixed logit models Suppose I want to include an interaction between two variables and I also want one of those variables to be modeled as normally distributed across the population. The example below illustrates this cases, where a `price*feat` interaction is specified and the `feat` parameter is modeled as normally distributed by setting `randPars = c(feat = "n")`: ```{r} model_price_feat_mxl <- logitr( data = yogurt, outcome = 'choice', obsID = 'obsID', pars = c('price', 'feat', 'brand', 'price*feat'), randPars = c(feat = "n") ) ``` In this case, the `price*feat` interaction parameter is interpreted as a difference in the `feat_mu` parameter and price; that is, it an interaction in the _mean_ `feat` parameter and `price`: ```{r} summary(model_price_feat_mxl) ```