---
title: "Callback: getting started"
author: Emmanuel Duguet
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{getting-started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
We will use the `callback` package to analyze hiring discrimination based on gender and origin in France. The experiment was conducted in 2009 for the jobs of software developers (Petit et al., 2013). The data is available in the data frame `inter1`. Let's examine its contents.
```{r setup}
library(callback)
data(inter1)
str(inter1)
```
The first variable, `offer`, is very important. It indicates the job offer identification. It is important because, in order to test discrimination, the workers must candidate on the same job offer. This is the `cluster` parameter of the `callback()` function. With `cluster = "offer"` we are sure that all the computations will be paired, which means that we will always compare the candidates on the very same job offer. This is essential to produce meaningful results since otherwise the difference of answers could come from the differences of recruiters and not from the differences in gender or origin.

The second important variables are the ones that define the candidates. Here, there are two variables : `gender` and `origin`. These are factors and the reference levels of these factors implicitly define the reference candidate. Which one? By convention, it is the candidate that is the least susceptible of being discriminated against. Here, the reference candidate would be male because male candidates should not be discriminated against because of their gender, and from a French origin because French origin candidates should not be discriminated against in the French labor market because they have a French origin. In practice, we will check that this candidate really had the highest callback rate. We can find the reference levels of our two factors by looking at the first level given by the `levels()` function.
```{r}
levels(inter1$gender)
levels(inter1$origin)
```

There are two genders : "Man" (reference) and "Woman". There are four origins : French (F, reference), Moroccan (M), Senegalese (S) and Vietnamese (V). You do not need to aggregate the two candidates' variables gender and origin to use `callback()`, it will do it for you. 

The last element we need is, obviously, the outcome of the job hiring application. It is given by the `callback` variable. It is a Boolean variable, TRUE when the recruiter gives a non negative callback, and FALSE otherwise.

We can know launch the `callback()` function, which prepares the data for statistical analysis. Here we need to choose the `comp` parameter. Indeed, we realize that there are 8 candidates so that 8 x 7/2 = 28 comparisons are possible. This is a large number and this is why `callback()` performs the statistical analysis according to the reference candidate by default with `comp = "ref"`.  This reduces our analysis to 7 comparisons.  You could get the 28 comparisons by choosing `comp = "all"` instead. 
```{r}
m <- callback(data = inter1, cluster = "offer", candid = c("gender","origin"), callback = "callback")
```
The `m` object contains the formatted data needed for the analysis. Using `print()` gives the mains characteristics of the experiment :
```{r}
print(m)
```
We find that the experiment is standard, in the sense that all the candidates were sent to all the applications. Notice that this is not needed to use `callback`, it will work fine if there are less candidates. However, when more than one candidate of the same type are send to a test, the most favorable answer is kept (the "max" rule). The reader is informed that there are other ways to deal with this issue.

We can also take a look at the global callback rates of the candidates, by entering :

```{r}
print(stat_glob(m))
```

and get a graphical representation with :
```{r,fig.width=7,fig.height=4.4}
graph(stat_glob(m))
```

It is possible to change the definition of the confidence intervals, the confidence level and the color of the plot. If you prefer the Clopper-Pearson definition, a 90% confidence interval and a "steelblue3" color enter :
```{r,fig.width=7,fig.height=4.4}
s <- stat_glob(m,level=0.9)
print(s,method="cp")
graph(s,method="cp",col="steelblue3")
```

When all the candidates are sent to all the tests, the previous figures may be used to measure discrimination. However, when there is a rotation of the candidates so that only a part of them is sent on each test, it could not be the case. For this reason, we prefer to use matched statistics, which only compares candidates that have been sent to the same tests.

In order to get the result of the discrimination tests, we will use the `stat_count` function. It can be saved into an object for further exports, or printed. The following instruction:
```{r}
s <- stat_count(m)
```
does not produce any printed output, but saves an object `stat_count` into `s`. We can get the statistics with:
```{r}
print(s)
```

The callback counts describe the results of the paired experiments. The first column defines the comparison under the form "candidate 1 vs candidate 2". Here "Man.F vs Woman.F" means that we compare French origin men and women. Out of 310 tests, 113 got at least one callback. The men got 86 callbacks and the women 70. The difference, called net discrimination, equals 16 callbacks. We can go further in the details thanks to the next columns. Out of 310 tests, neither candidate was called back in 197 of the job offers, 43 called only men, 27 called only women and 43 called both. Discrimination only occurs when a single candidate is called back. The net discrimination is thus 43-27=16. The corresponding line percentages are available with \code{s$props}.

```{r}
s$props
```

Now, we can pass to the proportions analysis. We can save the output or print it, like in the previous example. Printing is the default. There are three ways to compute proportions in discrimination studies. First, you can divide the number of callbacks by the number of tests. We call it "matched callback rates" given by the function `stat_mcr()`. Second, you can restrict your analysis to the tests which got at least one callback. We call it "total callback shares", given by the function `stat_tcs()`. Last you can divide by the number of tests where only one candidate has been called back. We call it "exclusive callback shares", given by the function `stat_ecs()`. With the first convention, we get:
```{r}
stat_mcr(m)
```


The printing output includes three tests: the Fisher exact independence test between the candidate type and the callback variable the chi-squared test for the equality of the candidates' callback rates, and the asymptotic Student test for the equality of the candidates' callback rates. A code indicates the significance of the difference, with the same convention as in the `lm()` function. We find that all the differences are significant at the 5% level, except for the two French origin candidates, whatever the test used. The associated graphical representation is obtained by:

```{r,fig.width=7,fig.height=4.4}
graph(stat_mcr(m))
```
The colors can be changed with the option `col` and the definition of the confidence intervals with the option `method`.

There is a second graphical representation, that shows the confidence interval of both candidates. However, the reader must be warned that this representation can be misleading for the following reason. The crossing of the confidence intervals does not imply the equality of the proportions. The only correct representation is the previous one, given by default. To get a comparaison of the confidence interval, enter:
```{r,fig.width=7,fig.height=4.4}
graph(stat_mcr(m),dif=FALSE)
```

The statistics for the other conventions, total callback share and exclusive callback shares, can be obtained by changing the function name to `stat_tcs()` and `stat_ecs()` respectively. The graphical representations for the differences are also similar. The only difference is that, it is possible to have a representation of the total or exclusive callback shares. For the total callback shares, we have:

For the total callback shares, we get:
```{r,fig.width=7,fig.height=4.4}
graph(stat_tcs(m),dif=FALSE)

```

and for the exclusive callback shares, we get :

```{r,fig.width=7,fig.height=4.4}
graph(stat_ecs(m),dif=FALSE)

```