Title: | Semiparametric Model-Assisted Estimation in Finite Populations |
---|---|
Description: | A framework for fitting semiparametric regression estimators of the total of a finite population when the variable of interest is asymmetrically distributed. The main references for this package are Sarndal C.E., Swensson B., and Wretman J. (2003, ISBN: 978-0-387-40620-6), "Model Assisted Survey Sampling", Springer-Verlag; Cardozo C.A., Paula G.A., and Vanegas L.H. (2022), "Generalized log-gamma additive partial linear models with P-spline smoothing", Statistical Papers; and Cardozo C.A. and Alonso-Malaver C.E. (2022), "Semi-parametric model assisted estimation in finite populations", in preparation. |
Authors: | Carlos Alberto Cardozo Delgado [aut, cre, cph], Carlos E. Alonso-Malaver [aut] |
Maintainer: | Carlos Alberto Cardozo Delgado <[email protected]> |
License: | GPL-3 |
Version: | 0.1.3 |
Built: | 2024-11-26 06:43:18 UTC |
Source: | CRAN |
sreg_ber
is used to estimate the total parameter of a finite population generated from a semi-parametric generalized gamma population under a Bernoulli sampling design.
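As a rough illustration of the model-assisted idea in Sarndal et al. (2003), the sketch below computes a difference-type regression estimator of a total: the sum of fitted values over the whole population plus the inverse-probability-weighted sum of sample residuals. It runs on simulated data and uses an ordinary `lm()` fit as a stand-in working model, so it is not the semiparametric generalized gamma fit that `sreg_ber` performs. The names `pop`, `s`, `work`, and `t_reg` are hypothetical.

```r
set.seed(1)

# Hypothetical finite population with a right-skewed response
N <- 500
pop <- data.frame(x = runif(N, 1, 10))
pop$y <- 3 + 2 * pop$x + rgamma(N, shape = 2, scale = 1)

# Bernoulli sampling with a common inclusion probability of 0.2
pi_k <- rep(0.2, N)
in_sample <- runif(N) < pi_k
s <- pop[in_sample, ]

# Stand-in working model (assumption: plain weighted least squares)
work <- lm(y ~ x, data = s, weights = 1 / pi_k[in_sample])
y_hat_pop <- predict(work, newdata = pop)

# Horvitz-Thompson estimator and model-assisted (difference) estimator
t_ht  <- sum(s$y / pi_k[in_sample])
t_reg <- sum(y_hat_pop) +
  sum((s$y - y_hat_pop[in_sample]) / pi_k[in_sample])

c(true = sum(pop$y), ht = t_ht, reg = t_reg)
```

When the working model explains the response well, the residual term is small and the model-assisted estimate is typically closer to the true total than the Horvitz-Thompson estimate.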
sreg_ber(location_formula, scale_formula, data, pi, ...)
location_formula |
a symbolic description of the systematic component of the location model to be fitted. |
scale_formula |
a symbolic description of the systematic component of the scale model to be fitted. |
data |
a data frame or list containing the variables in the model. |
pi |
numeric, the first-order inclusion probability common to all units. Default value is 0.5. |
... |
further parameters accepted by caret and survey functions. |
sampling_design
is the name of the sampling design used in the estimation process.
N
is the population size.
n
is the random sample size used in the estimation process.
first_order_probabilities
vector of the first order probabilities used in the estimation process.
sample
is the random sample used in the estimation process.
estimated_total_y_sreg
is the SREG estimate of the total parameter of the finite population.
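The output element `n` is described as a random sample size because, under a Bernoulli design, each unit is selected independently with the same probability. A minimal base R sketch of that behaviour follows. The population size and the variable names (`pi_incl`, `selected`, `sizes`) are hypothetical, and the code does not call the package.

```r
set.seed(42)
N <- 1000        # hypothetical population size
pi_incl <- 0.5   # common first-order inclusion probability

# One Bernoulli sample: independent selection with probability pi_incl
selected <- runif(N) < pi_incl
n <- sum(selected)   # realized sample size, random from draw to draw

# Over repeated draws the sample size is Binomial(N, pi_incl)
sizes <- replicate(2000, sum(runif(N) < pi_incl))
c(realized_n = n, mean_n = mean(sizes), expected_n = N * pi_incl)
```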
Carlos Alberto Cardozo Delgado <[email protected]>
Cardozo C.A., Alonso C. (2021). Semi-parametric model assisted estimation in finite populations. In preparation.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
Sarndal C.E., Swensson B., and Wretman J. (2003). Model Assisted Survey Sampling. Springer-Verlag.
# This example uses the data set 'apipop' from the survey package.
library(sregsurvey)
library(survey)
library(magrittr)
library(dplyr)
library(gamlss)
data(api)
# Keep high schools with a non-missing 'full' value.
Apipop <- filter(apipop, !is.na(full))
Apipop <- filter(Apipop, stype == 'H')
Apipop <- Apipop %>% dplyr::select(api00, grad.sch, full)
fit <- sreg_ber(api00 ~ pb(grad.sch), scale_formula = ~ full - 1, data = Apipop, pi = 0.2)
fit
# The true population total is
true_total <- sum(Apipop$api00)
# The absolute relative error, in percent, is
round(abs((fit$estimated_total_y_sreg - true_total)/true_total), 3) * 100
sreg_pips
is used to estimate the total parameter of a finite population generated from a semi-parametric generalized gamma population under a proportional-to-size without-replacement sampling design.
sreg_pips(location_formula, scale_formula, data, x, n, ...)
location_formula |
a symbolic description of the systematic component of the location model to be fitted. |
scale_formula |
a symbolic description of the systematic component of the scale model to be fitted. |
data |
a data frame or list containing the variables in the model. |
x |
vector, an auxiliary variable used to calculate the inclusion probability of each unit. |
n |
numeric, sample size. |
... |
further parameters accepted by caret and survey functions. |
sampling_design
is the name of the sampling design used in the estimation process.
N
is the population size.
n
is the sample size used in the estimation process.
first_order_probabilities
vector of the first order probabilities used in the estimation process.
sample
is the random sample used in the estimation process.
estimated_total_y_sreg
is the SREG estimate of the total parameter of the finite population.
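For a proportional-to-size design, the first-order inclusion probabilities are usually taken proportional to the auxiliary variable, `n * x / sum(x)`, with any unit whose value would reach 1 selected with certainty and the rest rescaled. The sketch below implements that textbook rule in base R. It is an assumption that `sreg_pips` computes its probabilities in the same way, and `pips_probs` is a hypothetical helper, not a package function.

```r
# Inclusion probabilities proportional to a positive size measure x,
# capped at 1 (hypothetical helper, not part of sregsurvey)
pips_probs <- function(x, n) {
  stopifnot(all(x > 0), n <= length(x))
  pik <- numeric(length(x))
  certain <- rep(FALSE, length(x))
  repeat {
    m <- n - sum(certain)   # sample size left for non-certain units
    pik[!certain] <- m * x[!certain] / sum(x[!certain])
    new_certain <- !certain & pik >= 1
    if (!any(new_certain)) break
    certain <- certain | new_certain
    pik[certain] <- 1
  }
  pik
}

x <- c(12, 30, 7, 51, 20, 80)   # hypothetical size measure
pik <- pips_probs(x, n = 3)
round(pik, 3)
sum(pik)   # equals the requested sample size
```

In this example the largest unit would receive a raw value of 1.2, so it is included with certainty and the remaining probabilities are recomputed for a sample of size 2.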
Carlos Alberto Cardozo Delgado <[email protected]>
Cardozo C.A., Alonso C. (2021). Semi-parametric model assisted estimation in finite populations. In preparation.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
Sarndal C.E., Swensson B., and Wretman J. (2003). Model Assisted Survey Sampling. Springer-Verlag.
library(sregsurvey)
library(survey)
library(dplyr)
library(gamlss)
data(api)
# Keep high schools with a non-missing 'full' value.
Apipop <- filter(apipop, !is.na(full))
Apipop <- filter(Apipop, stype == 'H')
Apipop <- Apipop %>% dplyr::select(api00, grad.sch, full, api99)
n <- ceiling(0.2 * nrow(Apipop))
aux_var <- Apipop %>% dplyr::select(api99)
fit <- sreg_pips(api00 ~ pb(grad.sch), scale_formula = ~ full - 1, data = Apipop, x = aux_var, n = n)
fit
# The true population total is
true_total <- sum(Apipop$api00)
# The absolute relative error, in percent, is
round(abs((fit$estimated_total_y_sreg - true_total)/true_total), 3) * 100
sreg_poisson
is used to estimate the total parameter of a finite population generated from a semi-parametric generalized gamma population under a Poisson sampling design.
sreg_poisson(location_formula, scale_formula, data, pis, ...)
location_formula |
a symbolic description of the systematic component of the location model to be fitted. |
scale_formula |
a symbolic description of the systematic component of the scale model to be fitted. |
data |
a data frame or list containing the variables in the model. |
pis |
numeric vector of first-order inclusion probabilities. Default value is 0.1 for each element. |
... |
further parameters accepted by caret and survey functions. |
sampling_design
is the name of the sampling design used in the estimation process.
N
is the population size.
n
is the random sample size used in the estimation process.
first_order_probabilities
vector of the first order probabilities used in the estimation process.
sample
is the random sample used in the estimation process.
estimated_total_y_sreg
is the SREG estimate of the total parameter of the finite population.
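Poisson sampling generalizes the Bernoulli design by letting every unit have its own inclusion probability, which is why `pis` is a vector and the realized sample size is random. The following base R sketch draws one Poisson sample and computes a Horvitz-Thompson estimate of the total. The simulated population and the names `pis`, `selected`, and `t_ht` are hypothetical, and the package itself is not called.

```r
set.seed(7)
N <- 800
x <- runif(N, 20, 200)                    # hypothetical size measure
y <- 10 + 0.5 * x + rnorm(N, sd = 5)      # hypothetical study variable

# Unequal inclusion probabilities with an expected sample size of 160
pis <- 160 * x / sum(x)
stopifnot(all(pis > 0 & pis <= 1))

# Poisson sampling: independent selection, unit k with probability pis[k]
selected <- runif(N) < pis

# Horvitz-Thompson estimator of the population total
t_ht <- sum(y[selected] / pis[selected])

c(expected_n = sum(pis), realized_n = sum(selected),
  true_total = sum(y), ht_estimate = t_ht)
```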
Carlos Alberto Cardozo Delgado <[email protected]>
Cardozo C.A., Alonso C. (2021). Semi-parametric model assisted estimation in finite populations. In preparation.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
Sarndal C.E., Swensson B., and Wretman J. (2003). Model Assisted Survey Sampling. Springer-Verlag.
library(sregsurvey)
library(survey)
library(dplyr)
library(gamlss)
data(api)
# Keep high schools with a non-missing 'full' value.
Apipop <- filter(apipop, !is.na(full))
Apipop <- filter(Apipop, stype == 'H')
Apipop <- Apipop %>% dplyr::select(api00, grad.sch, full)
fit <- sreg_poisson(api00 ~ pb(grad.sch), scale_formula = ~ full - 1, data = Apipop)
fit
# The true population total is
true_total <- sum(Apipop$api00)
# The absolute relative error, in percent, is
round(abs((fit$estimated_total_y_sreg - true_total)/true_total), 3) * 100
sreg_srswr
is used to estimate the total parameter of a finite population generated from a semi-parametric generalized gamma population under a simple random sampling without-replacement design.
sreg_srswr(location_formula, scale_formula, data, fraction, format = "COMPLETE", ...)
location_formula |
a symbolic description of the systematic component of the location model to be fitted. |
scale_formula |
a symbolic description of the systematic component of the scale model to be fitted. |
data |
a data frame or list containing the variables in the model. |
fraction |
numeric, the sample size expressed as a fraction of the population size. Default value is 0.2. |
format |
character, the type of summary to report, 'SIMPLE' or 'COMPLETE'. Default value is 'COMPLETE'. |
... |
further parameters accepted by caret and survey functions. |
sampling_design
is the name of the sampling design used in the estimation process.
N
is the population size.
n
is the fixed sample size used in the estimation process.
first_order_probabilities
vector of the first order probabilities used in the estimation process.
sample
is the random sample used in the estimation process.
estimated_total_y_sreg
is the SREG estimate of the total parameter of the finite population.
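Under simple random sampling without replacement the sample size is fixed and every unit has the same first-order inclusion probability `n / N`. The sketch below shows how a sampling fraction translates into these quantities in base R. The rounding rule `ceiling(fraction * N)` is an assumption about how `fraction` is converted to a sample size, and the population is simulated.

```r
set.seed(3)
N <- 755            # hypothetical population size
fraction <- 0.25

# Assumption: the sample size is the fraction of N, rounded up
n <- ceiling(fraction * N)

# Draw n distinct units and record the common inclusion probability
idx <- sample.int(N, size = n, replace = FALSE)
pik <- rep(n / N, N)

# Expansion (Horvitz-Thompson) estimate of a total on simulated data
y <- rgamma(N, shape = 3, scale = 200)
t_ht <- (N / n) * sum(y[idx])

c(n = n, pik = pik[1], true_total = sum(y), ht_estimate = t_ht)
```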
Carlos Alberto Cardozo Delgado <[email protected]>
Cardozo C.A., Alonso C. (2021). Semi-parametric model assisted estimation in finite populations. In preparation.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
Sarndal C.E., Swensson B., and Wretman J. (2003). Model Assisted Survey Sampling. Springer-Verlag.
library(sregsurvey)
library(survey)
library(dplyr)
library(gamlss)
data(api)
# Keep high schools with a non-missing 'full' value.
Apipop <- filter(apipop, !is.na(full))
Apipop <- filter(Apipop, stype == 'H')
Apipop <- Apipop %>% dplyr::select(api00, grad.sch, full)
fit <- sreg_srswr(api00 ~ pb(grad.sch), scale_formula = ~ full - 1, data = Apipop, fraction = 0.25)
# The true population total is
true_total <- sum(Apipop$api00)
# The absolute relative error, in percent, is
round(abs((fit$estimated_total_y_sreg - true_total)/true_total), 3) * 100
sreg_stsi
is used to estimate the total parameter of a finite population generated from a semi-parametric generalized gamma population under a stratified sampling design with simple random sampling without replacement in each stratum.
sreg_stsi(location_formula, scale_formula, stratum, data, n, ss_sizes, allocation_type = "PA", aux_x, ...)
location_formula |
a symbolic description of the systematic component of the location model to be fitted. |
scale_formula |
a symbolic description of the systematic component of the scale model to be fitted. |
stratum |
vector, the stratum of each unit in the population. |
data |
a data frame or list containing the variables in the model. |
n |
integer, represents a fixed sample size. |
ss_sizes |
vector, the sample size in each stratum. |
allocation_type |
character, there are two choices: proportional allocation, 'PA', and x-optimal allocation, 'XOA'. The default is 'PA'. See Sarndal et al. (2003). |
aux_x |
vector, an auxiliary variable used to calculate the stratum sample sizes by the x-optimal allocation method, Sarndal et al. (2003). This argument is valid only when allocation_type is equal to 'XOA'. |
... |
further parameters accepted by caret and survey functions. |
sampling_design
is the name of the sampling design used in the estimation process.
N
is the population size.
H
is the number of strata.
Ns
is the population strata sizes.
allocation_type
is the method used to calculate sample strata sizes.
global_n
is the global sample size used in the estimation process.
first_order_probabilities
vector of the first order probabilities used in the estimation process.
sample
is the random sample used in the estimation process.
estimated_total_y_sreg
is the SREG estimate of the total parameter of the finite population.
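The two allocation rules named in `allocation_type` can be sketched in base R. Proportional allocation splits `n` in proportion to the stratum sizes, and x-optimal allocation (Sarndal et al., 2003) splits it in proportion to the stratum size times the within-stratum standard deviation of the auxiliary variable. The data below are simulated, the names `Nh`, `Sh`, `n_pa`, and `n_xoa` are hypothetical, and how `sreg_stsi` rounds the results to integers is not covered here.

```r
set.seed(11)

# Hypothetical population: three strata with different spread in aux_x
stratum <- rep(c("E", "M", "H"), times = c(600, 250, 150))
aux_x <- c(rnorm(600, 50, 5), rnorm(250, 60, 15), rnorm(150, 80, 30))
n <- 100   # global sample size

# Stratum sizes and within-stratum standard deviations of aux_x
Nh <- sapply(split(aux_x, stratum), length)
Sh <- sapply(split(aux_x, stratum), sd)

# Proportional allocation: n_h = n * N_h / N
n_pa <- n * Nh / sum(Nh)

# x-optimal allocation: n_h = n * N_h * S_h / sum(N_h * S_h)
n_xoa <- n * Nh * Sh / sum(Nh * Sh)

round(rbind(PA = n_pa, XOA = n_xoa), 1)
```

Compared with proportional allocation, the x-optimal rule shifts sample size toward strata where the auxiliary variable is more variable, here stratum "H".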
Carlos Alberto Cardozo Delgado <[email protected]>
Cardozo C.A., Alonso C. (2021). Semi-parametric model assisted estimation in finite populations. In preparation.
Cardozo C.A., Paula G., and Vanegas L. (2022). Generalized log-gamma additive partial linear models with P-spline smoothing. Statistical Papers.
Sarndal C.E., Swensson B., and Wretman J. (2003). Model Assisted Survey Sampling. Springer-Verlag.
library(sregsurvey)
library(survey)
library(dplyr)
library(magrittr)
library(gamlss)
data(api)
# Keep schools with a non-missing 'full' value.
Apipop <- filter(apipop, !is.na(full))
Apipop <- Apipop %>% dplyr::select(api00, grad.sch, full, stype)
dim(Apipop)
fit <- sreg_stsi(api00 ~ pb(grad.sch), scale_formula = ~ full - 1, n = 400, stratum = 'stype', data = Apipop)
fit
# The true population total is
true_total <- sum(Apipop$api00)
# The absolute relative error, in percent, is
round(abs((fit$estimated_total_y_sreg - true_total)/true_total), 3) * 100