---
title: "BLE_Categorical"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{BLE_Categorical}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(BayesSampling)
```


# Application of the BLE to categorical data  

### (From Section 4 of the "[Gonçalves, Moura and Migon: Bayes linear estimation for finite population with emphasis on categorical data](https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201400111886)") 

When the population can be divided into mutually exclusive categories, we can calculate the Bayes Linear Estimator for the proportion of individuals in each category with the _BLE_Categorical()_ function, which receives the following parameters:  

* $y_s$ - $k$-vector with the sample proportion of each category;
* $n$ - sample size;
* $N$ - total size of the population;
* $m$ - $k$-vector with the prior proportion of each category. If _NULL_, the sample proportion of each category will be used (non-informative prior);
* $rho$ - matrix with the prior correlation coefficients between two different units within categories. It must be a symmetric square matrix of dimension $k$ (or $k-1$). If _NULL_, a non-informative prior will be used (see below).
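
The constraints listed above can be checked before calling the function. The following is a minimal sketch in base R, not part of the package: `check_ble_inputs()` is a hypothetical helper, and the requirement that the proportions sum to 1 is an assumption based on the categories being exclusive.

```r
# Hypothetical helper (not part of BayesSampling): basic sanity checks on the
# arguments described above. Returns a character vector of problems found
# (empty when every check passes).
check_ble_inputs <- function(ys, n, N, m = NULL, rho = NULL, tol = 1e-8) {
  k <- length(ys)
  problems <- character(0)

  # Assumption: proportions over exclusive categories are non-negative and sum to 1
  if (any(ys < 0) || abs(sum(ys) - 1) > tol) {
    problems <- c(problems, "ys must be non-negative and sum to 1")
  }
  if (n <= 0 || n > N) {
    problems <- c(problems, "n must satisfy 0 < n <= N")
  }
  if (!is.null(m)) {
    if (length(m) != k) {
      problems <- c(problems, "m must have the same length as ys")
    } else if (any(m < 0) || abs(sum(m) - 1) > tol) {
      problems <- c(problems, "m must be non-negative and sum to 1")
    }
  }
  if (!is.null(rho)) {
    if (!is.matrix(rho) || nrow(rho) != ncol(rho)) {
      problems <- c(problems, "rho must be a square matrix")
    } else if (!(nrow(rho) %in% c(k, k - 1))) {
      problems <- c(problems, "rho must have dimension k or k - 1")
    } else if (!isSymmetric(unname(rho))) {
      problems <- c(problems, "rho must be symmetric")
    }
  }
  problems
}

# Inputs of the two-category example: rho is 1 x 1, that is, dimension k - 1
check_ble_inputs(c(0.2614, 0.7386), 153, 15288, c(0.7, 0.3), matrix(0.1, 1))
```

An empty result means none of these checks failed; it does not guarantee that the function will run without warnings.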


### Vague Prior Distribution

Letting $\rho_{ii} \to 1$, that is, assuming prior ignorance, the resulting point estimate will be the same as the one seen in the design-based context for categorical data.

This can be achieved with the _BLE_Categorical()_ function by omitting the prior proportions, the parameter _rho_, or both, that is:

* $m =$ _NULL_ - sample proportions in each category will be used
* $rho =$ _NULL_ - $\rho_{ii} \to 1$ and $\rho_{ij} = 0, i \neq j$
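
As a sketch of the fully vague case, both arguments can be left as _NULL_ in the same call. The data below are illustrative values reused from the examples in this vignette, and the `requireNamespace()` guard is only there so the snippet does nothing when the package is not installed.

```r
# Illustrative data: three exclusive categories
ys <- c(0.2, 0.5, 0.3)
n <- 100
N <- 10000

# Both m and rho omitted: sample proportions and non-informative correlations
# are used as the prior, as described above
if (requireNamespace("BayesSampling", quietly = TRUE)) {
  Estimator <- BayesSampling::BLE_Categorical(ys, n, N, m = NULL, rho = NULL)
  print(Estimator$est.prop)
}
```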


### _R_ and _Vs_ Matrices

If the calculation of matrices _R_ and _Vs_ results in non-positive definite matrices, a warning will be displayed. In general this does not affect the proportion estimates, but it can make the associated variances incorrect or inconsistent. In that case, it is suggested to review the prior correlation coefficients (parameter _rho_).
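
When reviewing _rho_, one possible first diagnostic is to look at its eigenvalues. The sketch below uses only base R; `is_positive_definite()` is a hypothetical helper and does not reproduce the check that the function applies internally to _R_ and _Vs_, so a positive definite _rho_ is not a guarantee that no warning will appear.

```r
# Hypothetical helper (not part of BayesSampling): TRUE when a symmetric
# matrix has only strictly positive eigenvalues (up to a small tolerance).
is_positive_definite <- function(mat, tol = 1e-8) {
  eigenvalues <- eigen(mat, symmetric = TRUE, only.values = TRUE)$values
  all(eigenvalues > tol)
}

# Symmetric matrix used in the three-category example of this vignette
rho_ok <- matrix(c(0.4, 0.1, 0.1,
                   0.1, 0.2, 0.1,
                   0.1, 0.1, 0.6), 3, 3)

# Illustrative symmetric matrix with one negative eigenvalue
rho_bad <- matrix(c(0.1, 0.9,
                    0.9, 0.1), 2, 2)

is_positive_definite(rho_ok)   # TRUE
is_positive_definite(rho_bad)  # FALSE
```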



### Examples

1. Example presented in the [article](https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201400111886) cited above (2 categories)


```{r ex 1, message=TRUE, warning=TRUE}
ys <- c(0.2614, 0.7386)
n <- 153
N <- 15288
m <- c(0.7, 0.3)
rho <- matrix(0.1, 1)
Estimator <- BLE_Categorical(ys, n, N, m, rho)

Estimator$est.prop
Estimator$Vest.prop
```

Below we can see that the greater the correlation coefficient, the closer our estimate gets to the sample proportions.

```{r ex 1.2, message=TRUE, warning=TRUE}
ys <- c(0.2614, 0.7386)
n <- 153
N <- 15288
m <- c(0.7, 0.3)
rho <- matrix(0.5, 1)
Estimator <- BLE_Categorical(ys, n, N, m, rho)

Estimator$est.prop
Estimator$Vest.prop
```



2. Example from the help page (3 categories)

```{r ex 2, message=TRUE, warning=TRUE}
ys <- c(0.2, 0.5, 0.3)
n <- 100
N <- 10000
m <- c(0.4, 0.1, 0.5)
mat <- c(0.4, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.6)
rho <- matrix(mat, 3, 3)

Estimator <- BLE_Categorical(ys, n, N, m, rho)

Estimator$est.prop
Estimator$Vest.prop
```

The same example, but with no prior correlation coefficients supplied (non-informative prior):

```{r ex 2.2, message=TRUE, warning=TRUE}
ys <- c(0.2, 0.5, 0.3)
n <- 100
N <- 10000
m <- c(0.4, 0.1, 0.5)

Estimator <- BLE_Categorical(ys, n, N, m, rho = NULL)

Estimator$est.prop
```