Title: | Small Area Estimation for Rao and Yu Model |
---|---|
Description: | Functions to calculate EBLUPs (Empirical Best Linear Unbiased Predictor) estimators and their MSEs (Mean Squared Errors). Estimators are based on an area-level linear mixed model introduced by Rao and Yu (1994) <doi:10.2307/3315407>. The REML (Residual Maximum Likelihood) method is used for fitting the model. |
Authors: | Cabello E. [aut], Esteban M.D. [aut], Morales Domingo [aut], Perez Agustin [aut, cre] |
Maintainer: | Perez Agustin <[email protected]> |
License: | GPL-2 |
Version: | 2.0 |
Built: | 2025-03-09 07:00:57 UTC |
Source: | CRAN |
A complete set of functions to calculate several eblups estimators and its mean square errors. All estimators are based in area-level linear mixed model introduced by Rao and Yu in 1994 (see documentation). Saery package are developed to fit the model with REML method.
Package: | saery |
Type: | Package |
Version: | 2.0 |
Date: | 2025-02-06 |
License: | GPL-2 |
The main functions of the saery package are fit.saery
and eblup.saery
.
The function fit.saery
is used to fit the correct model for three options. eblup.saery
calculates the eblup and mse for the model.
Esteban Cabello Garcia, Maria Dolores Esteban Lefler, Domingo Morales Gonzalez, Agustin Perez Martin
Maintainer: Perez Agustin <[email protected]>
Rao, J.N.K., Yu, M., 1994. Small area estimation by combining time series and cross sectional data. Canadian Journal of Statistics 22, 511-528.
Esteban, M.D., Morales, D., Perez, A., Santamaria, L., 2012. Small area estimation of poverty proportions under area-level time models. Computational Statistics and Data Analysis, 56 (10), pp. 2840-2855.
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) output.fit.ar1 <- fit.saery(X, ydi, D, md, sigma2edi, "AR", 0.9) output.fit.ar1 #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ar1 <- eblup.saery(X, ydi, D, md, sigma2edi, "a", plot = TRUE, B = 2) eblup.output.ar1
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) output.fit.ar1 <- fit.saery(X, ydi, D, md, sigma2edi, "AR", 0.9) output.fit.ar1 #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ar1 <- eblup.saery(X, ydi, D, md, sigma2edi, "a", plot = TRUE, B = 2) eblup.output.ar1
A simulated dataset created by the authors in order to check the correct operation of the saery package
data(datos)
data(datos)
A simulated data frame with 2000 observations on the following 6 variables.
Area
a numeric vector with the area (domain) of the data
Period
a numeric vector with the period (subdomain) of the data
ydi
a numeric vector with the direct estimator of the indicator of interest for area (domain)
ones
a numeric vector. This is only needed to include the intercept parameter in the model
xdi
a numeric vector containing the aggregated (population) values of an auxiliary variable
sigma2edi
a numeric vector with the known variance of the error term
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) output.fit.ar1 <- fit.saery(X, ydi, D, md, sigma2edi, "AR", 0.9) output.fit.ar1 #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ar1 <- eblup.saery(X, ydi, D, md, sigma2edi, "a", plot = TRUE, B = 2) eblup.output.ar1
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) output.fit.ar1 <- fit.saery(X, ydi, D, md, sigma2edi, "AR", 0.9) output.fit.ar1 #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ar1 <- eblup.saery(X, ydi, D, md, sigma2edi, "a", plot = TRUE, B = 2) eblup.output.ar1
eblup.saery
calculate the eblup and mse for a model.
The function eblup.saery
calculate the eblup and mse for a model. Is recomended that the model was previusly checked by fit.saery
eblup.saery(X, ydi, D, md, sigma2edi, model = c("INDEP", "AR1", "MA1"), plot = FALSE, type = "I", B = 100) eblup.saery.AR1(X, ydi, D, md, sigma2edi, plot, type, B) eblup.saery.MA1(X, ydi, D, md, sigma2edi, plot, type, B) eblup.saery.indep(X, ydi, D, md, sigma2edi, plot)
eblup.saery(X, ydi, D, md, sigma2edi, model = c("INDEP", "AR1", "MA1"), plot = FALSE, type = "I", B = 100) eblup.saery.AR1(X, ydi, D, md, sigma2edi, plot, type, B) eblup.saery.MA1(X, ydi, D, md, sigma2edi, plot, type, B) eblup.saery.indep(X, ydi, D, md, sigma2edi, plot)
X |
a numeric vector or data frame containing the aggregated (population) values of p auxiliary variables. A ones columns must be agregated to calculate the intercept parameter |
ydi |
a numeric vector with the direct estimator of the indicator of interest for area (domain) |
D |
a numeric vector with the number of areas (domain) of the data |
md |
a numeric vector with the number of periods (subdomains) for each area of the data |
sigma2edi |
a numeric vector with the known variance of the error term |
model |
Three diferents types of model must be fit. For an indepent model |
plot |
logical specifying if a set of plot be returned. |
type |
three types of mse can be calculated for |
B |
the number of bootstrap samples to be generated and fitted for types |
A data frame with the eblups and its mse be returned. A set of plots be displayed if plot = TRUE
Maria Dolores Esteban Lefler, Domingo Morales Gonzalez, Agustin Perez Martin
Rao, J.N.K., Yu, M., 1994. Small area estimation by combining time series and cross sectional data. Canadian Journal of Statistics 22, 511-528.
Esteban, M.D., Morales, D., Perez, A., Santamaria, L., 2012. Small area estimation of poverty proportions under area-level time models. Computational Statistics and Data Analysis, 56 (10), pp. 2840-2855.
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ar1 <- eblup.saery(X, ydi, D, md, sigma2edi, model = "a", plot = TRUE, B = 2) eblup.output.ar1 #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ma1 <- eblup.saery(X, ydi, D, md, sigma2edi, model = "ma", plot = FALSE, type = "II", B = 2) eblup.output.ma1 eblup.output.indep <- eblup.saery(X, ydi, D, md, sigma2edi, model = "i", plot = TRUE) eblup.output.indep
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ar1 <- eblup.saery(X, ydi, D, md, sigma2edi, model = "a", plot = TRUE, B = 2) eblup.output.ar1 #For computational reasons B is too low. We recomend to increase up to 100 eblup.output.ma1 <- eblup.saery(X, ydi, D, md, sigma2edi, model = "ma", plot = FALSE, type = "II", B = 2) eblup.output.ma1 eblup.output.indep <- eblup.saery(X, ydi, D, md, sigma2edi, model = "i", plot = TRUE) eblup.output.indep
fit.saery
is used to fit the correct model for three options.
The function fit.saery
fits the model for three options. This function and eblup.saery
use the REML method to fit the model.
fit.saery(X, ydi, D, md, sigma2edi, model = c("INDEP", "AR1", "MA1"), conf.level = 0.95) fit.saery.AR1(X, ydi, D, md, sigma2edi, conf.level) fit.saery.MA1(X, ydi, D, md, sigma2edi, conf.level) fit.saery.indep(X, ydi, D, md, sigma2edi, conf.level)
fit.saery(X, ydi, D, md, sigma2edi, model = c("INDEP", "AR1", "MA1"), conf.level = 0.95) fit.saery.AR1(X, ydi, D, md, sigma2edi, conf.level) fit.saery.MA1(X, ydi, D, md, sigma2edi, conf.level) fit.saery.indep(X, ydi, D, md, sigma2edi, conf.level)
X |
a numeric vector or data frame containing the aggregated (population) values of p auxiliary variables. A ones columns must be agregated to calculate the intercept parameter |
ydi |
a numeric vector with the direct estimator of the indicator of interest for area (domain) |
D |
a numeric vector with the number of areas (domain) of the data |
md |
a numeric vector with the number of periods (subdomains) for each area of the data |
sigma2edi |
a numeric vector with the known variance of the error term |
model |
Three diferents types of model must be fit. For an indepent model |
conf.level |
a value under 1 for the confidence level for the confidence intervals returned by the function |
A list with the fitted parameters of the model are returned. Caonfidence intervals, p-values, the Fisher Scoring matrix and the number of iterations of the model are also returned.
Maria Dolores Esteban Lefler, Domingo Morales Gonzalez, Agustin Perez Martin
Rao, J.N.K., Yu, M., 1994. Small area estimation by combining time series and cross sectional data. Canadian Journal of Statistics 22, 511-528.
Esteban, M.D., Morales, D., Perez, A., Santamaria, L., 2012. Small area estimation of poverty proportions under area-level time models. Computational Statistics and Data Analysis, 56 (10), pp. 2840-2855.
eblup.saery
, ~~~
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) output.fit.ar1 <- fit.saery(X, ydi, D, md, sigma2edi, "AR", 0.9) output.fit.ar1 output.fit.ma1 <- fit.saery(X, ydi, D, md, sigma2edi, "MA", 0.9) output.fit.ma1 output.fit.indep <- fit.saery(X, ydi, D, md, sigma2edi, "indep", 0.9) output.fit.indep
sigma2edi <- datos[,6] X <- as.matrix(datos[,5]) ydi <- datos[,3] D <- length(unique(datos[,1])) md <- rep(length(unique(datos[,2])), D) output.fit.ar1 <- fit.saery(X, ydi, D, md, sigma2edi, "AR", 0.9) output.fit.ar1 output.fit.ma1 <- fit.saery(X, ydi, D, md, sigma2edi, "MA", 0.9) output.fit.ma1 output.fit.indep <- fit.saery(X, ydi, D, md, sigma2edi, "indep", 0.9) output.fit.indep