---
title: "BLE_Categorical"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{BLE_Categorical}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(BayesSampling)
```

# Application of the BLE to categorical data

### (From Section 4 of the "[Gonçalves, Moura and Migon: Bayes linear estimation for finite population with emphasis on categorical data](https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201400111886)")

When the population can be divided into mutually exclusive categories, we can calculate the Bayes Linear Estimator for the proportion of individuals in each category with the _BLE_Categorical()_ function, which receives the following parameters:

* $y_s$ - $k$-vector of sample proportions for each category;
* $n$ - sample size;
* $N$ - total size of the population;
* $m$ - $k$-vector with the prior proportion of each category. If _NULL_, the sample proportion of each category will be used (non-informative prior);
* $rho$ - matrix with the prior correlation coefficients between two different units within categories. It must be a symmetric square matrix of dimension $k$ (or $k-1$). If _NULL_, a non-informative prior will be used (see below).

### Vague Prior Distribution

Letting $\rho_{ii} \to 1$, that is, assuming prior ignorance, the resulting point estimate will be the same as the one seen in the design-based context for categorical data.\
This can be achieved with the _BLE_Categorical()_ function by omitting the prior proportions, the parameter _rho_, or both, that is:

* $m =$ _NULL_ - sample proportions in each category will be used
* $rho =$ _NULL_ - $\rho_{ii} \to 1$ and $\rho_{ij} = 0, i \neq j$

### _R_ and _Vs_ Matrices

If the calculation of matrices _R_ and _Vs_ results in non-positive definite matrices, a warning will be displayed.
In general, this does not produce incorrect or inconsistent results for the proportion estimate, but it can for its associated variance. It is suggested to review the prior correlation coefficients (parameter _rho_).

### Examples

1. Example presented in the mentioned [article](https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201400111886) (2 categories)

```{r ex 1, message=TRUE, warning=TRUE}
ys <- c(0.2614, 0.7386)   # sample proportions
n <- 153                  # sample size
N <- 15288                # population size
m <- c(0.7, 0.3)          # prior proportions
rho <- matrix(0.1, 1)     # 1 x 1 matrix: dimension k - 1 for k = 2 categories

Estimator <- BLE_Categorical(ys, n, N, m, rho)

Estimator$est.prop
Estimator$Vest.prop
```

Below we can see that the greater the correlation coefficient, the closer our estimate gets to the sample proportions.

```{r ex 1.2, message=TRUE, warning=TRUE}
ys <- c(0.2614, 0.7386)
n <- 153
N <- 15288
m <- c(0.7, 0.3)
rho <- matrix(0.5, 1)     # larger prior correlation than in the previous chunk

Estimator <- BLE_Categorical(ys, n, N, m, rho)

Estimator$est.prop
Estimator$Vest.prop
```

2. Example from the help page (3 categories)

```{r ex 2, message=TRUE, warning=TRUE}
ys <- c(0.2, 0.5, 0.3)
n <- 100
N <- 10000
m <- c(0.4, 0.1, 0.5)
mat <- c(0.4, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.6)
rho <- matrix(mat, 3, 3)  # symmetric 3 x 3 matrix (dimension k)

Estimator <- BLE_Categorical(ys, n, N, m, rho)

Estimator$est.prop
Estimator$Vest.prop
```

The same example, but with no prior correlation coefficients supplied (non-informative prior):

```{r ex 2.2, message=TRUE, warning=TRUE}
ys <- c(0.2, 0.5, 0.3)
n <- 100
N <- 10000
m <- c(0.4, 0.1, 0.5)

Estimator <- BLE_Categorical(ys, n, N, m, rho = NULL)

Estimator$est.prop
```
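
When the warning about _R_ and _Vs_ appears and _rho_ needs to be reviewed, a first pass over the matrix could look like the sketch below. The helper `check_rho()` is hypothetical and not part of BayesSampling. It only screens _rho_ against the requirements stated in the parameter list (square, dimension $k$ or $k-1$, symmetric, entries that are valid correlation coefficients). The smallest eigenvalue is reported as a heuristic diagnostic only; it is an assumption that this is informative here, and it does not reproduce the package's own check on _R_ and _Vs_.

```{r rho check, message=TRUE, warning=TRUE}
# Hypothetical helper (NOT part of BayesSampling): screens a candidate rho
# against the requirements stated for the `rho` parameter.
check_rho <- function(rho, k) {
  stopifnot(is.matrix(rho), is.numeric(rho))
  symmetric <- isSymmetric(rho)  # FALSE for non-square matrices
  list(
    dim_ok    = nrow(rho) == ncol(rho) && nrow(rho) %in% c(k, k - 1),
    symmetric = symmetric,
    in_range  = all(abs(rho) <= 1),
    # Heuristic diagnostic only (an assumption, not the package's own check):
    # smallest eigenvalue of rho, NA when rho is not symmetric.
    min_eigenvalue = if (symmetric) {
      min(eigen(rho, symmetric = TRUE, only.values = TRUE)$values)
    } else {
      NA_real_
    }
  )
}

# rho from the 3-category example
mat <- c(0.4, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.6)
rho <- matrix(mat, 3, 3)
str(check_rho(rho, k = 3))
```

If any of the logical entries is `FALSE`, the matrix does not meet the documented requirements for _rho_ and should be corrected before calling _BLE_Categorical()_.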