Title: | Bayesian Beta Regression |
---|---|
Description: | Provides a class of Bayesian beta regression models for the analysis of continuous data whose support is an unknown finite interval. The response variable is modeled using a four-parameter beta distribution with the mean or mode parameter depending linearly on covariates through a link function. When the response support is known to be (0,1), the above class of models reduces to traditional (0,1) supported beta regression models. Model choice is carried out via the logarithm of the pseudo marginal likelihood (LPML), the deviance information criterion (DIC), and the Watanabe-Akaike information criterion (WAIC). See Zhou and Huang (2022) <doi:10.1016/j.csda.2021.107345>. |
Authors: | Haiming Zhou [aut, cre, cph], Xianzheng Huang [aut] |
Maintainer: | Haiming Zhou <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2024-12-15 07:35:57 UTC |
Source: | CRAN |
This function fits Bayesian beta regression models. The response distribution can be either the beta distribution with support on (0,1) or the four-parameter beta distribution with an unknown finite support. The logarithm of the pseudo marginal likelihood (LPML), the deviance information criterion (DIC), and the Watanabe-Akaike information criterion (WAIC) are provided for model comparison.
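The relationship between the two response distributions can be sketched in base R. This is a minimal illustration, not the package's internal code: the helper `dbeta4` is a hypothetical name, and the mean-precision shapes `mu * phi` and `(1 - mu) * phi` are a common parameterization assumed here for illustration only.

```r
# Density of a beta distribution rescaled from (0, 1) to (theta1, theta2).
# dbeta4 is a hypothetical helper, not a function exported by betaBayes.
dbeta4 <- function(y, shape1, shape2, theta1 = 0, theta2 = 1) {
  stopifnot(theta1 < theta2)
  width <- theta2 - theta1
  dbeta((y - theta1) / width, shape1, shape2) / width
}

# Illustrative values only: mean mu on the unit scale and precision phi
# (an assumed mean-precision parameterization).
mu <- 0.3
phi <- 20
theta1 <- -0.5
theta2 <- 2

# The rescaled density integrates to 1 over its support (theta1, theta2).
total <- integrate(dbeta4, lower = theta1, upper = theta2,
                   shape1 = mu * phi, shape2 = (1 - mu) * phi,
                   theta1 = theta1, theta2 = theta2)$value

# With theta1 = 0 and theta2 = 1 it coincides with the standard beta density.
same <- all.equal(dbeta4(0.4, 2, 5), dbeta(0.4, 2, 5))
```

Fixing the boundaries at 0 and 1 recovers the usual two-parameter beta, which is the sense in which the (0,1) model is a special case.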
beta4reg(formula, data, na.action, link="logit", model = "mode", mcmc=list(nburn=3000, nsave=2000, nskip=0, ndisplay=500), prior=NULL, start=NULL, Xpred=NULL)
formula |
a formula expression of the form |
data |
a data frame in which to interpret the variables named in the |
na.action |
a missing-data filter function, applied to the |
link |
a character string for the link function. Choices include |
model |
a character string for the regression type. The options include |
mcmc |
a list giving the MCMC parameters. The list must include the following elements: |
prior |
a list giving the prior information. The function itself provides all default priors. The following components can be specified here: |
start |
a list giving the starting values of the parameters. The function itself provides all default choices. The following components can be specified here: |
Xpred |
A new design matrix at which estimates of the response model or mean are required. The default is the design matrix returned by the argument |
This class of objects is returned by the beta4reg function to represent a fitted Bayesian beta regression model. Objects of this class have methods for the functions print and summary.
The beta4reg object is a list containing the following components:
modelname |
the name of the fitted model |
terms |
the |
link |
the link function used |
model |
the model fitted: mean or mode |
coefficients |
a named vector of coefficients. The last two elements are the estimates of theta1 and theta2 involved in the support of the four-parameter beta distribution. |
call |
the matched call |
prior |
the list of hyperparameters used in all priors. |
start |
the list of starting values used for all parameters. |
mcmc |
the list of MCMC parameters used |
n |
the number of row observations used in fitting the model |
p |
the number of columns in the model matrix |
y |
the response observations |
X |
the n by (p+1) original design matrix |
beta |
the (p+1) by nsave matrix of posterior samples for the coefficients in the |
theta |
the 2 by nsave matrix of posterior samples for theta1 and theta2 involved in the support. |
phi |
the vector of posterior samples for the precision parameter. |
cpo |
the length n vector of the stabilized estimate of CPO; used for calculating LPML |
pD |
the effective number of parameters involved in DIC |
DIC |
the deviance information criterion (DIC) |
pW |
the effective number of parameters involved in WAIC |
WAIC |
the Watanabe-Akaike information criterion (WAIC) |
ratetheta |
the acceptance rate in the posterior sampling of theta vector involved in the support |
ratebeta |
the acceptance rate in the posterior sampling of beta coefficient vector |
ratephi |
the acceptance rate in the posterior sampling of precision parameter |
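The `cpo` component is what LPML is built from. As a minimal sketch, assuming the standard definition of LPML as the sum of the log CPO values, the calculation looks like the following. The vectors here are random stand-ins for the `cpo` component of two fitted objects, not output from the package.

```r
set.seed(1)

# Stand-ins for the cpo components of two fitted models (hypothetical values);
# each is a length-n vector of positive CPO estimates.
cpo_model_a <- runif(25, min = 0.5, max = 3)
cpo_model_b <- runif(25, min = 0.2, max = 2)

# LPML is the sum of the log CPO values.
lpml <- c(a = sum(log(cpo_model_a)), b = sum(log(cpo_model_b)))

# A larger LPML indicates better predictive performance.
preferred <- names(which.max(lpml))
```

With a real fit, the same expression would be applied to the object's `cpo` component.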
Applying the summary function to the object returns a new object with the following additional components:
coeff |
A table that presents the posterior summaries for the regression coefficients |
bounds |
A table that presents the posterior summaries for the support boundaries theta1 and theta2 |
phivar |
A table that presents the posterior summaries for the precision phi. |
Haiming Zhou and Xianzheng Huang
Zhou, H. and Huang, X. (2022). Bayesian beta regression for bounded responses with unknown supports. Computational Statistics & Data Analysis, 167, 107345.
library(betaBayes)
library(betareg)

## Data from Ferrari and Cribari-Neto (2004)
data("GasolineYield", package = "betareg")
data("FoodExpenditure", package = "betareg")

## four-parameter beta mean regression
mcmc = list(nburn = 2000, nsave = 1000, nskip = 4, ndisplay = 1000)
# Note larger nburn, nsave and nskip should be used in practice.
prior = list(th1a0 = 0, th2b0 = 1)
# here the natural bound (0,1) is used to specify the prior

# GasolineYield
set.seed(100)
gy_res1 <- beta4reg(yield ~ batch + temp, data = GasolineYield,
                    link = "logit", model = "mean", mcmc = mcmc, prior = prior)
(gy_sfit1 <- summary(gy_res1))
cox.snell.beta4reg(gy_res1) # Cox-Snell plot

# FoodExpenditure
set.seed(100)
fe_res1 <- beta4reg(I(food/income) ~ income + persons, data = FoodExpenditure,
                    link = "logit", model = "mean", mcmc = mcmc, prior = prior)
(fe_sfit1 <- summary(fe_res1))
cox.snell.beta4reg(fe_res1) # Cox-Snell plot

## two-parameter beta mean regression with support (0,1)
mcmc = list(nburn = 2000, nsave = 1000, nskip = 4, ndisplay = 1000)
# Note larger nburn, nsave and nskip should be used in practice.
prior = list(th1a0 = 0, th1b0 = 0, th2a0 = 1, th2b0 = 1)
# this setting forces the support to be (0,1)

# GasolineYield
set.seed(100)
gy_res2 <- beta4reg(yield ~ batch + temp, data = GasolineYield,
                    link = "logit", model = "mean", mcmc = mcmc, prior = prior)
(gy_sfit2 <- summary(gy_res2))
cox.snell.beta4reg(gy_res2) # Cox-Snell plot

# FoodExpenditure
set.seed(100)
fe_res2 <- beta4reg(I(food/income) ~ income + persons, data = FoodExpenditure,
                    link = "logit", model = "mean", mcmc = mcmc, prior = prior)
(fe_sfit2 <- summary(fe_res2))
cox.snell.beta4reg(fe_res2) # Cox-Snell plot
A county-level COVID-19 dataset for the US. It is of interest to examine the association between several county-level characteristics and the cumulative numbers of confirmed cases and deaths. County-level characteristics are based on the 2018 ACS 5-year estimates.
data(covid)
FIPS: | FIPS county code |
PopE: | total population |
MaleP: | percentage of people who are male |
WhiteP: | percentage of people who are white |
BlackP: | percentage of people who are black or African American |
Age65plusP: | percentage of people who are 65 years and over |
PovertyP: | percentage of people whose income in the past 12 months is below the poverty level |
RUCC_2013: | 2013 Rural Urban Continuum Code, with a higher value indicating a more rural county |
State: | two-letter state abbreviation code |
deaths: | cumulative number of deaths as of October 13, 2020 |
cases: | cumulative number of confirmed cases as of October 13, 2020 |
data(covid)
head(covid)
This function provides the Cox-Snell diagnostic plot for fitted Bayesian beta regression models.
cox.snell.beta4reg(x, ncurves = 10, CI = 0.95, PLOT = TRUE)
x |
an object obtained from the function |
ncurves |
the number of posterior draws. |
CI |
the probability level of the point-wise credible intervals. |
PLOT |
a logical value indicating whether the Cox-Snell residuals will be plotted. |
The function returns the plot (if PLOT = TRUE) and a list with the following components:
tgrid |
the x-axis values with length, say |
Hhat |
the |
Hhatlow |
the |
Hhatup |
the |
H |
the |
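The idea behind a Cox-Snell diagnostic can be sketched in base R. This is a generic illustration under stated assumptions, not the package's implementation: it uses simulated data, a known beta distribution in place of a fitted model, and a Nelson-Aalen estimate for uncensored data. All object names below are illustrative.

```r
set.seed(42)

# Simulated responses from a known beta distribution, standing in for data
# that a fitted model describes correctly (an assumption of this sketch).
n <- 200
shape1 <- 2
shape2 <- 5
y <- rbeta(n, shape1, shape2)

# Cox-Snell residuals: minus the log of the model survival function at y.
# Under a correct model they behave like a sample from Exp(1).
r <- -log(pbeta(y, shape1, shape2, lower.tail = FALSE))

# Nelson-Aalen estimate of the cumulative hazard of the residuals
# (no censoring; continuous data, so ties are not expected).
r_sorted <- sort(r)
H_hat <- cumsum(1 / (n:1))

# For a well-fitting model the step function tracks the 45-degree line.
if (interactive()) {
  plot(r_sorted, H_hat, type = "s",
       xlab = "Cox-Snell residual", ylab = "Estimated cumulative hazard")
  abline(0, 1, lty = 2)
}
```

Systematic departures of the estimated cumulative hazard from the 45-degree line suggest lack of fit.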
Haiming Zhou and Xianzheng Huang
Posterior predicted response values based on a fitted beta4reg model object.
## S3 method for class 'beta4reg'
predict(object, newx, ...)
object |
an object obtained from the function |
newx |
an m by p matrix at which predictions are required. If not specified, the original design matrix will be used. |
... |
further arguments passed to or from other methods. |
The function returns an m by nsave matrix of posterior samples for response predictions at newx.
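Because the result has one row per prediction point and one column per saved draw, row-wise summaries give point predictions and intervals. The sketch below uses a random stand-in matrix in place of real `predict` output; the names `pred` and `pred_summary` are illustrative.

```r
set.seed(7)

# Stand-in for the m by nsave matrix returned by predict (hypothetical draws).
m <- 4
nsave <- 1000
pred <- matrix(rbeta(m * nsave, 2, 5), nrow = m, ncol = nsave)

# Posterior predictive mean for each of the m rows of newx.
pred_mean <- rowMeans(pred)

# Equal-tailed 95% predictive interval for each row;
# apply() returns a 2 by m matrix, so it is transposed to m by 2.
pred_int <- t(apply(pred, 1, quantile, probs = c(0.025, 0.975)))

pred_summary <- cbind(mean = pred_mean,
                      lower = pred_int[, 1],
                      upper = pred_int[, 2])
```

Each row of `pred_summary` then corresponds to one row of `newx`.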
Haiming Zhou and Xianzheng Huang