---
title: "Estimating Models with Interactions"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Estimating Models with Interactions}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r setup, include=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  warning = FALSE,
  message = FALSE,
  fig.retina = 3,
  comment = "#>"
)
```

# Interactions with continuous variables

To add an interaction between covariates in your model, include an extra term in the `pars` vector of the `logitr()` function, with the names of the interacting variables separated by the `*` symbol. For example, let's say we want to interact `price` with `feat` in the following model:

```{r}
library("logitr")

model <- logitr(
  data    = yogurt,
  outcome = 'choice',
  obsID   = 'obsID',
  pars    = c('price', 'feat', 'brand')
)
```

To do so, I could add `"price*feat"` to the `pars` vector:

```{r}
model_price_feat <- logitr(
  data    = yogurt,
  outcome = 'choice',
  obsID   = 'obsID',
  pars    = c('price', 'feat', 'brand', 'price*feat')
)
```

The model now has an estimated coefficient for the `price*feat` effect:

```{r}
summary(model_price_feat)
```
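
As a rough illustration of what the `price*feat` term adds, the sketch below expands the same interaction with base R's `model.matrix()`. It only mirrors the idea: it assumes that `logitr()` expands `*` much like R's formula interface does, and `X` is just a name used here for illustration.

```{r}
library("logitr")

# Expand the interaction with base R's formula interface (illustrative only;
# logitr's internal handling is assumed to be analogous)
X <- model.matrix(~ price * feat, data = yogurt)
head(X)

# The interaction column is the element-wise product of the two covariates
all.equal(unname(X[, "price:feat"]), yogurt$price * yogurt$feat)
```

The `price:feat` column is simply the product of the two covariates, so its coefficient captures how the effect of one changes with the level of the other.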

# Interactions with discrete variables

In the above example, both `price` and `feat` were numeric variables and were therefore treated as continuous, so only a single interaction coefficient was needed.

When interacting with a _discrete_ variable, multiple interaction coefficients will be estimated according to the number of levels in the discrete attribute. For example, the interaction of `price` with `brand` will require three new interaction coefficients: one for each level of the `brand` variable except the first (reference) level.

```{r}
model_price_brand <- logitr(
  data    = yogurt,
  outcome = 'choice',
  obsID   = 'obsID',
  pars    = c('price', 'feat', 'brand', 'price*brand')
)
```

The model now has three estimated coefficients for the `price*brand` effect:

```{r}
summary(model_price_brand)
```
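
To make the count of interaction terms concrete, here is a small sketch that again uses base R's `model.matrix()` as a stand-in (an assumption; it is not necessarily how `logitr()` builds its design matrix). The names `brand`, `X`, and `interaction_cols` are introduced only for this illustration.

```{r}
library("logitr")

# Levels of the discrete attribute; the first one acts as the reference level
brand <- as.factor(yogurt$brand)
levels(brand)

# Illustrative expansion using base R (assumed to be analogous to logitr)
X <- model.matrix(~ price * brand, data = yogurt)

# One interaction column per non-reference level of brand
interaction_cols <- grep("^price:brand", colnames(X), value = TRUE)
interaction_cols
length(interaction_cols) == nlevels(brand) - 1
```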

# Interactions with individual-specific variables

If you want to include interactions with individual-specific variables (for example, to assess the difference in an effect between groups of respondents), you should **not** specify them using `*` in `pars`. This is because an interaction specified inside `pars` automatically generates the interaction coefficient as well as a coefficient for each covariate on its own.

For example, if you had a `group` variable that determined whether an individual belongs to group `A` or group `B`, including `price*group` in `pars` would create coefficients for `price`, `groupA`, and `price:groupA`, but the `groupA` coefficient would be unidentified. In this case, you should only include `price` and `price:groupA` in the model. For now, the only way to handle this situation is to manually create dummy-coded interaction variables to include in the model.
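
One way to see why a coefficient such as `groupA` cannot be identified: an individual-specific variable takes the same value for every alternative in a choice observation, so it shifts all utilities equally and drops out of the logit choice probabilities. Here is a minimal sketch with made-up utilities (`v`, `shift`, and `logit_probs()` are hypothetical names, not part of logitr):

```{r}
# Made-up observed utilities for three alternatives in one choice observation
v <- c(1.2, 0.4, -0.3)

# Standard logit choice probabilities
logit_probs <- function(v) exp(v) / sum(exp(v))

# Hypothetical contribution of an individual-specific term, which is
# identical for every alternative
shift <- 0.75

# The probabilities are unchanged, so the data carry no information about it
all.equal(logit_probs(v), logit_probs(v + shift))
```

By contrast, the product of `price` and a group dummy varies across alternatives because `price` does, which is why the interaction itself can be estimated.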

To illustrate how one might do this, suppose the `yogurt` data frame had two groups of individuals: `A` and `B`. For simple illustration, I'll define these groups arbitrarily based on whether the `obsID` is even or odd:

```{r}
# Create group A dummies
yogurt$groupA <- ifelse(yogurt$obsID %% 2 == 0, 1, 0)
```

An interaction between the `group` variable and `price` can be included in the model by first manually creating a `price_groupA` interaction variable and then including it in `pars`:

```{r}
# Create the interaction variable between the group A dummy and price
yogurt$price_groupA <- yogurt$price * yogurt$groupA

model_price_group <- logitr(
  data    = yogurt,
  outcome = 'choice',
  obsID   = 'obsID',
  pars    = c('price', 'feat', 'brand', 'price_groupA')
)
```

The model now has attribute coefficients for `price`, `feat`, and `brand` as well as a coefficient for the interaction between `group` and `price`:

```{r}
summary(model_price_group)
```
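
Given this coding, the `price` coefficient is the price effect for group `B` (the reference group) and `price_groupA` is the difference for group `A`. Here is a short sketch of recovering each group's price effect, assuming the `model_price_group` object estimated above is still in the session and that `coef()` returns a vector named after the entries in `pars`:

```{r}
b <- coef(model_price_group)

# Price effect for group B (groupA == 0)
price_effect_B <- unname(b["price"])

# Price effect for group A (groupA == 1): base effect plus the interaction
price_effect_A <- unname(b["price"] + b["price_groupA"])

c(groupA = price_effect_A, groupB = price_effect_B)
```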

# Interactions in mixed logit models

Suppose I want to include an interaction between two variables and I also want one of those variables to be modeled as normally distributed across the population. The example below illustrates this case, where a `price*feat` interaction is specified and the `feat` parameter is modeled as normally distributed by setting `randPars = c(feat = "n")`:

```{r}
model_price_feat_mxl <- logitr(
  data    = yogurt,
  outcome = 'choice',
  obsID   = 'obsID',
  pars    = c('price', 'feat', 'brand', 'price*feat'),
  randPars = c(feat = "n")
)
```

In this case, the `price*feat` interaction parameter is interpreted relative to the `feat_mu` parameter; that is, it is an interaction between the _mean_ `feat` parameter and `price`:

```{r}
summary(model_price_feat_mxl)
```
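
Under the usual mixed logit setup (an assumption here, since this vignette does not write out the utility model), this interpretation can be sketched as follows. The symbols are introduced only for illustration: $\mu_{feat}$ and $\sigma_{feat}$ denote the mean and standard deviation of the `feat` parameter, $\gamma$ the `price*feat` coefficient, and $\eta_n$ a standard normal draw for individual $n$.

$$
(\mu_{feat} + \sigma_{feat}\,\eta_n)\,\mathrm{feat} + \gamma\,(\mathrm{price} \times \mathrm{feat})
= \left[(\mu_{feat} + \gamma\,\mathrm{price}) + \sigma_{feat}\,\eta_n\right]\mathrm{feat}
$$

Read this way, the interaction shifts the mean of the `feat` effect by $\gamma \times \mathrm{price}$, while the standard deviation is unaffected.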