| Title: | Regression Models for Bounded Continuous and Discrete Responses |
|---|---|
| Description: | Functions to fit regression models for bounded continuous and discrete responses. In case of bounded continuous responses (e.g., proportions and rates), available models are the flexible beta (Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018) <doi:10.1214/17-BA1079>), the variance-inflated beta (Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020) <doi:10.1177/1471082X18821213>), the beta (Ferrari, S.L.P., Cribari-Neto, F. (2004) <doi:10.1080/0266476042000214501>), and their augmented versions to handle the presence of zero/one values (Di Brisco, A. M., Migliorati, S. (2020) <doi:10.1002/sim.8406>) are implemented. In case of bounded discrete responses (e.g., bounded counts, such as the number of successes in n trials), available models are the flexible beta-binomial (Ascari, R., Migliorati, S. (2021) <doi:10.1002/sim.9005>), the beta-binomial, and the binomial are implemented. Inference is dealt with a Bayesian approach based on the Hamiltonian Monte Carlo (HMC) algorithm (Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B. (2014) <doi:10.1201/b16018>). Besides, functions to compute residuals, posterior predictives, goodness of fit measures, convergence diagnostics, and graphical representations are provided. |
| Authors: | Roberto Ascari [aut, cre], Agnese M. Di Brisco [aut], Sonia Migliorati [aut], Andrea Ongaro [aut] |
| Maintainer: | Roberto Ascari <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.4.2 |
| Built: | 2026-05-11 10:20:01 UTC |
| Source: | https://github.com/cran/FlexReg |
Count/Percentage of chromosome aberrations in atomic bombs survivors.
A data.frame containing 1039 observations on the following 5 variables:
y.percthe percentage of cells with chromosomal abnormalities.
ythe number of cells with chromosomal abnormalities.
nthe number of analyzed cells. It is fixed to 100 for all the survivors.
dosea quantitative measure of the radiation exposure level, expressed in rads.
bomba factor, indicating which bomb the subject survived (H = Hiroshima, N = Nagasaki).
The data have been originally analyzed by Otake and Prentice (1984) and successively by Ascari and Migliorati (2021).
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Otake, M., Prentice, R.L. (1984). The analysis of chromosomally aberrant cells based on beta-binomial distribution. Radiat Res. 98, 456–470.
Count/Percentage of eggs parasitized by female parasitoids.
A data.frame containing 70 observations on the following 5 variables:
y.percthe percentage of parasitized eggs.
ythe number of parasitized eggs.
nthe maximum number of eggs that female parasitoids could parasitized. It is fixed to 128 for all the observations.
femalesthe number of female parasitoids.
females_stdthe standardized version of females.
The data have been originally analyzed by Demétrio et al (2014) and successively by Ascari and Migliorati (2021). Data come from a completely randomized experiment with 10 replicates for each specification of number of females.
Demétrio et al., (2014). Models for overdispersed data in entomology.
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Demétrio, C.G.B., Hinde, J., Moral, R.A. (2014). Models for overdispersed data in entomology. Ecological Modelling Applied to Entomology. Entomology in Focus Switzerland: Springer International Publishing, 219–259.
This dataset is a subset from the 2016 Survey on Household Income and Wealth data, a statistical survey conducted by the Bank of Italy. The statistical units are the households and the head of the household is conventionally selected as the major income earner.
A data.frame containing 568 observations on the following 8 variables:
NCompthe number of household members.
Sexthe sex of the head of the household.
Agethe age of the head of the household.
NEarnersthe number of household income earners.
Areaa factor indicating the geographical area where the household is located.
Citizenshipa factor indicating the citizenship of the head of household.
Incomethe net disposable income.
Consumptionthe propensity to consume, defined as the percentage of Income that is spent rather than saved.
Full data are available on the website of the Bank of Italy. Consumption is computed as the ratio between the amount of consumption' over the net disposable income'.
Bank of Italy, Survey on Household Income and Wealth, 2016.
Survey description.
The function returns some diagnostic measures to check for convergence to the equilibrium distribution of the Markov Chain(s). Moreover, it prints the number (and percentage) of iterations that ended with a divergence and that saturated the max treedepth, and the E-BFMI values for each chain for which E-BFMI is less than 0.2.
convergence.diag( model, diagnostics = "all", pars = NULL, additional.args = list() ) ## S3 method for class 'convergence.diag.flexreg' print(x, ...)convergence.diag( model, diagnostics = "all", pars = NULL, additional.args = list() ) ## S3 method for class 'convergence.diag.flexreg' print(x, ...)
model |
an object of class |
diagnostics |
an optional character vector of diagnostics names. The default is to compute |
pars |
an optional character vector of parameter names. If |
additional.args |
a list containing additional arguments (see details) |
x |
an object of class |
... |
additional arguments. Currently not used. |
"Rhat" returns the potential scale reduction factor on split chains. An R-hat greater than 1 is indicative of a bad mix of the chains. At convergence R-hat has to be less than 1.05. See rstan::Rhat for further details.
"geweke" returns the z-scores, one for each parameter, for a test of equality between the means of the first 10\
"raftery" returns the estimate of the "dependence factor" . Values of greater than 5 may indicate a strong autocorrelation.
Additional parameters such as the quantile to be estimated (q), the desired margin of error of the estimate (r), and the probability (s) of obtaining an estimate between and can be passed as a list in the additional.args argument. See coda::raftery.diag for further details.
"heidel" returns the p-values, one for each parameter, referred to a convergence test where the null hypothesis is that the sampled values come from a stationary distribution. It is possible to set the target value for ratio of halfwidth to sample mean (eps) and the significance level of the test (pvalue) into the additional.args argument. See coda::heidel.diag for further details.
"gelman" returns the estimate of the potential scale reduction factor and the upper confidence limit. At least two chains are needed to compute the Gelman and Rubin's convergence diagnostic. Additional parameters such as the confidence level (confidence), a logical flag indicating whether variables should be transformed (transform),
a logical flag indicating whether only the second half of the series should be used in the computation (autoburnin), and a logical flag indicating whether the multivariate potential scale reduction factor should be calculated for multivariate chains (multivariate) can be passed as a list in the additional.args argument. See coda::gelman.diag for further details.
A print from check_hmc_diagnostics function and a list of convergence diagnostics.
Brooks, SP., Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434-455.
Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bayesian Statistics 4 (ed JM Bernado, JO Berger, AP Dawid and AFM Smith). Clarendon Press, Oxford, UK.
Heidelberger P., Welch P.D. (1981). A spectral method for confidence interval generation and run length control in simulations. Comm. ACM. 24, 233-245.
Raftery, A.E. and Lewis, S.M. (1992). One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science, 7, 493-497.
Stan Development Team (2020). RStan: the R interface to Stan. R package version 2.19.3. https://mc-stan.org
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB") convergence.diag(FB, diagnostics = c("Rhat", "geweke"), pars = "beta") ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB") convergence.diag(FB, diagnostics = c("Rhat", "geweke"), pars = "beta") ## End(Not run)
The function produces some convergence plots from the Monte Carlo draws.
convergence.plot( model, file = "convergence-output.pdf", plotfun = "all", pars = NULL, point_est = "median", prob = 0.5, prob_outer = 0.9, lags = 10, warmup = F, width = 7, height = 7 )convergence.plot( model, file = "convergence-output.pdf", plotfun = "all", pars = NULL, point_est = "median", prob = 0.5, prob_outer = 0.9, lags = 10, warmup = F, width = 7, height = 7 )
model |
an object of class |
file |
a character string giving the name of the file (including the extension .pdf) containing the convergence plots. If |
plotfun |
an optional character vector of diagnostics plots. The default is to compute |
pars |
an optional character vector of parameter names. If |
point_est |
an optional character to specify the point estimate to be shown between |
prob |
the probability mass to be included in the inner interval (for |
prob_outer |
the probability mass to be included in the outer interval of the |
lags |
the number of lags to be shown in the |
warmup |
a logical scalar indicating whether to include the warmup draws or not (default). |
width, height
|
the width and height of the graphics region of each plot in inches. The default values are 7. |
"density" returns a density plot for each parameter in pars computed from the posterior draws. See bayesplot::mcmc_areas for further details.
"trace" returns a trace plot for each parameter in pars computed from the posterior draws. See bayesplot::mcmc_trace for further details.
"intervals" returns a plot of uncertainty interval for each parameter in pars computed from the posterior draws. See bayesplot::mcmc_intervals for further details.
"rate" returns a plot for each parameter in pars with the number of iterations on the x-axis and the Monte Carlo mean until iteration i-th on the y-axis.
"rhat" returns a plot with the Rhat values for each parameter in pars. See bayesplot::mcmc_rhat for further details.
"acf" returns the autocorrelation plots (one for each parameter in pars). See bayesplot::mcmc_acf for further details.
Moreover, the convergence plots can be further customized using the ggplot2 package.
A .pdf file with one plot per page.
Brooks, SP., Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434-455.
Stan Development Team (2020). RStan: the R interface to Stan. R package version 2.19.3. https://mc-stan.org
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB") convergence.plot(FB, file = "Convergence_plot_Output.pdf", pars = "beta") ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB") convergence.plot(FB, file = "Convergence_plot_Output.pdf", pars = "beta") ## End(Not run)
The function draws a curve corresponding to the probability density/mass function of the specified distribution (beta, flexible beta, variance-inflated beta, binomial, beta-binomial, or flexible beta-binomial). For beta, flexible beta, and variance-inflated beta, it also allows to include the representation of the probability of augmentation in zero and/or one values.
curve.density( type = NULL, size = NULL, mu = NULL, theta = NULL, phi = NULL, p = NULL, w = NULL, k = NULL, q0 = NULL, q1 = NULL, ... )curve.density( type = NULL, size = NULL, mu = NULL, theta = NULL, phi = NULL, p = NULL, w = NULL, k = NULL, q0 = NULL, q1 = NULL, ... )
type |
a character specifying the distribution type to be plotted ( |
size |
the total number of trials (to be specified only if |
mu |
the mean parameter of the distribution. It must lie in (0, 1). |
theta |
the overdispersion parameter (to be specified only if |
phi |
the precision parameter (if |
p |
the mixing weight (to be specified only if |
w |
the normalized distance among component means of the FB and FBB distributions (to be specified only if |
k |
the extent of the variance inflation (to be specified only if |
q0 |
the probability of augmentation in zero (to be specified only if |
q1 |
the probability of augmentation in one (to be specified only if |
... |
additional arguments of |
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Ferrari, S.L.P., and Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
## Not run: curve.density("Beta", mu=.5, phi=20) curve.density("Beta", mu=.5, phi=20, q1 = .3) curve.density("FB", mu=.5, phi=20, p=.4, w=.8) curve.density("FB", mu=.5, phi=20, p=.4, w=.8, q0= .1) curve.density("VIB", mu=.5, phi=20, p=.9, k=.8, col=3) curve.density("VIB", mu=.5, phi=20, p=.9, k=.8, col=3, q0=.1, q1=.3) curve.density("Bin", size=10, mu=.7) curve.density("BetaBin", size=10, mu=.7, phi=10) curve.density("FBB", size=10, mu=.7, phi=10, p=.2,w=.7) ## End(Not run)## Not run: curve.density("Beta", mu=.5, phi=20) curve.density("Beta", mu=.5, phi=20, q1 = .3) curve.density("FB", mu=.5, phi=20, p=.4, w=.8) curve.density("FB", mu=.5, phi=20, p=.4, w=.8, q0= .1) curve.density("VIB", mu=.5, phi=20, p=.9, k=.8, col=3) curve.density("VIB", mu=.5, phi=20, p=.9, k=.8, col=3, q0=.1, q1=.3) curve.density("Bin", size=10, mu=.7) curve.density("BetaBin", size=10, mu=.7, phi=10) curve.density("FBB", size=10, mu=.7, phi=10, p=.2,w=.7) ## End(Not run)
Density function, distribution function, quantile function, and random generation for the (augmented) beta distribution with the mean-precision parameterization.
dBeta(x, mu, phi, q0 = NULL, q1 = NULL, log = FALSE) qBeta(prob, mu, phi, q0 = NULL, q1 = NULL, log.prob = FALSE) pBeta(q, mu, phi, q0 = NULL, q1 = NULL, log.prob = FALSE) rBeta(n, mu, phi, q0 = NULL, q1 = NULL)dBeta(x, mu, phi, q0 = NULL, q1 = NULL, log = FALSE) qBeta(prob, mu, phi, q0 = NULL, q1 = NULL, log.prob = FALSE) pBeta(q, mu, phi, q0 = NULL, q1 = NULL, log.prob = FALSE) rBeta(n, mu, phi, q0 = NULL, q1 = NULL)
x, q
|
a vector of quantiles. |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
log |
logical; if TRUE, densities are returned on log-scale. |
prob |
a vector of probabilities. |
log.prob |
logical; if TRUE, probabilities |
n |
the number of values to generate. If |
The beta distribution has density
for , where identifies the mean and is the precision parameter.
The augmented beta distribution has density
, if
, if
, if
where identifies the augmentation in zero, identifies the augmentation in one,
and .
The function dBeta returns a vector with the same length as x containing the density values.
The function pBeta returns a vector with the same length as q containing the values of the distribution function.
The function qBeta returns a vector with the same length as prob containing the quantiles.
The function rBeta returns a vector of length n containing the generated random values.
Ferrari, S.L.P., Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
dBeta(x = c(.5,.7,.8), mu = .3, phi = 20) dBeta(x = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2) dBeta(x = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2, q1= .1) qBeta(prob = c(.5,.7,.8), mu = .3, phi = 20) qBeta(prob = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2) qBeta(prob = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2, q1= .1) pBeta(q = c(.5,.7,.8), mu = .3, phi = 20) pBeta(q = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2) pBeta(q = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2, q1= .1) rBeta(n = 100, mu = .5, phi = 30) rBeta(n = 100, mu = .5, phi = 30, q0 = .2, q1 = .1)dBeta(x = c(.5,.7,.8), mu = .3, phi = 20) dBeta(x = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2) dBeta(x = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2, q1= .1) qBeta(prob = c(.5,.7,.8), mu = .3, phi = 20) qBeta(prob = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2) qBeta(prob = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2, q1= .1) pBeta(q = c(.5,.7,.8), mu = .3, phi = 20) pBeta(q = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2) pBeta(q = c(.5,.7,.8), mu = .3, phi = 20, q0 = .2, q1= .1) rBeta(n = 100, mu = .5, phi = 30) rBeta(n = 100, mu = .5, phi = 30, q0 = .2, q1 = .1)
Mass function, distribution function, quantile function, and random generation for the beta-binomial distribution.
dBetaBin(x, size, mu, theta = NULL, phi = NULL, log = FALSE) qBetaBin(prob, size, mu, theta = NULL, phi = NULL, log.prob = FALSE) pBetaBin(q, size, mu, theta = NULL, phi = NULL, log.prob = FALSE) rBetaBin(n, size = NULL, mu = NULL, theta = NULL, phi = NULL)dBetaBin(x, size, mu, theta = NULL, phi = NULL, log = FALSE) qBetaBin(prob, size, mu, theta = NULL, phi = NULL, log.prob = FALSE) pBetaBin(q, size, mu, theta = NULL, phi = NULL, log.prob = FALSE) rBetaBin(n, size = NULL, mu = NULL, theta = NULL, phi = NULL)
x, q
|
a vector of quantiles. |
size |
the total number of trials. |
mu |
the mean parameter. It must lie in (0, 1). |
theta |
the overdispersion parameter. It must lie in (0, 1). |
phi |
the precision parameter, an alternative way to specify the overdispersion parameter |
log |
logical; if TRUE, probabilities are returned on log-scale. |
prob |
a vector of probabilities. |
log.prob |
logical; if TRUE, probabilities |
n |
the number of values to generate. If |
The beta-binomial distribution has probability mass function
for , where identifies the mean and is the precision parameter.
The function dBetaBin returns a vector with the same length as x containing the probability mass values.
The function pBetaBin returns a vector with the same length as q containing the values of the distribution function.
The function qBetaBin returns a vector with the same length as prob containing the quantiles.
The function rBetaBin returns a vector of length n containing the generated random values.
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
dBetaBin(x = 5, size = 10, mu = .3, phi = 10) dBetaBin(x = 5, size = 10, mu = .3, theta = 1/(10+1)) qBetaBin(prob = .5, size = 10, mu = .3, phi = 10) qBetaBin(prob = .5, size = 10, mu = .3, theta = 1/(10+1)) pBetaBin(q = 5, size = 10, mu = .3, phi = 10) pBetaBin(q = 5, size = 10, mu = .3, theta = 1/(10+1)) rBetaBin(n = 100, size = 40, mu = .5, theta = .4) rBetaBin(n = 100, size = 40, mu = .5, phi = 1.5)dBetaBin(x = 5, size = 10, mu = .3, phi = 10) dBetaBin(x = 5, size = 10, mu = .3, theta = 1/(10+1)) qBetaBin(prob = .5, size = 10, mu = .3, phi = 10) qBetaBin(prob = .5, size = 10, mu = .3, theta = 1/(10+1)) pBetaBin(q = 5, size = 10, mu = .3, phi = 10) pBetaBin(q = 5, size = 10, mu = .3, theta = 1/(10+1)) rBetaBin(n = 100, size = 40, mu = .5, theta = .4) rBetaBin(n = 100, size = 40, mu = .5, phi = 1.5)
Density function, distribution function, quantile function, and random generation for the (augmented) flexible beta distribution.
dFB(x, mu, phi, p, w, q0 = NULL, q1 = NULL, log = FALSE) qFB(prob, mu, phi, p, w, q0 = NULL, q1 = NULL, log.prob = FALSE) pFB(q, mu, phi, p, w, q0 = NULL, q1 = NULL, log.prob = FALSE) rFB(n, mu, phi, p, w, q0 = NULL, q1 = NULL)dFB(x, mu, phi, p, w, q0 = NULL, q1 = NULL, log = FALSE) qFB(prob, mu, phi, p, w, q0 = NULL, q1 = NULL, log.prob = FALSE) pFB(q, mu, phi, p, w, q0 = NULL, q1 = NULL, log.prob = FALSE) rFB(n, mu, phi, p, w, q0 = NULL, q1 = NULL)
x, q
|
a vector of quantiles. |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
p |
the mixing weight. It must lie in (0, 1). |
w |
the normalized distance among component means. It must lie in (0, 1). |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
log |
logical; if TRUE, densities are returned on log-scale. |
prob |
a vector of probabilities. |
log.prob |
logical; if TRUE, probabilities |
n |
the number of values to generate. If |
The FB distribution is a special mixture of two beta distributions with probability density function
for , where is the beta density with a mean-precision parameterization.
Moreover, is the overall mean, is a precision parameter,
is the mixing weight, is the normalized distance between component means, and
and are the means of the first and second component of the mixture, respectively.
The augmented FB distribution has density
, if
, if
, if
where identifies the augmentation in zero, identifies the augmentation in one,
and .
The function dFB returns a vector with the same length as x containing the density values.
The function pFB returns a vector with the same length as q containing the values of the distribution function.
The function qFB returns a vector with the same length as prob containing the quantiles.
The function rFB returns a vector of length n containing the generated random values.
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5) dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2) dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2, q1 = .1) qFB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5) qFB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2) qFB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2, q1 = .1) pFB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5) pFB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2) pFB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2, q1 = .1) rFB(n = 100, mu = .5, phi = 30,p = .3, w = .6) rFB(n = 100, mu = .5, phi = 30,p = .3, w = .6, q0 = .2, q1 = .1)dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5) dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2) dFB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2, q1 = .1) qFB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5) qFB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2) qFB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2, q1 = .1) pFB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5) pFB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2) pFB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, w = .5, q0 = .2, q1 = .1) rFB(n = 100, mu = .5, phi = 30,p = .3, w = .6) rFB(n = 100, mu = .5, phi = 30,p = .3, w = .6, q0 = .2, q1 = .1)
Mass function, distribution function, quantile function, and random generation for the flexible beta-binomial distribution.
dFBB(x, size, mu, theta = NULL, phi = NULL, p, w, log = FALSE) qFBB(prob, size, mu, theta = NULL, phi = NULL, p, w, log.prob = FALSE) pFBB(q, size, mu, theta = NULL, phi = NULL, p, w, log.prob = FALSE) rFBB(n, size = NULL, mu, theta = NULL, phi = NULL, p, w)dFBB(x, size, mu, theta = NULL, phi = NULL, p, w, log = FALSE) qFBB(prob, size, mu, theta = NULL, phi = NULL, p, w, log.prob = FALSE) pFBB(q, size, mu, theta = NULL, phi = NULL, p, w, log.prob = FALSE) rFBB(n, size = NULL, mu, theta = NULL, phi = NULL, p, w)
x, q
|
a vector of quantiles. |
size |
the total number of trials. |
mu |
the mean parameter. It must lie in (0, 1). |
theta |
the overdispersion parameter. It must lie in (0, 1). |
phi |
the precision parameter, an alternative way to specify the overdispersion parameter |
p |
the mixing weight. It must lie in (0, 1). |
w |
the normalized distance among component means. It must lie in (0, 1). |
log |
logical; if TRUE, probabilities are returned on log-scale. |
prob |
a vector of probabilities. |
log.prob |
logical; if TRUE, probabilities |
n |
the number of values to generate. If |
The FBB distribution is a special mixture of two beta-binomial distributions with probability mass function
for , where is the beta-binomial distribution with a mean-precision parameterization.
Moreover, is a precision parameter, is the mixing weight, is the overall mean,
is the normalized distance between component means, and
and are the scaled means of the first and second component of the mixture, respectively.
The function dFBB returns a vector with the same length as x containing the probability mass values.
The function pFBB returns a vector with the same length as q containing the values of the distribution function.
The function qFBB returns a vector with the same length as prob containing the quantiles.
The function rFBB returns a vector of length n containing the generated random values.
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
dFBB(x = c(5,7,8), size=10, mu = .3, phi = 20, p = .5, w = .5) dFBB(x = c(5,7,8), size=10, mu = .3, theta = 1/(20+1), p = .5, w = .5) qFBB(prob = .5, size=10, mu = .3, phi = 20, p = .5, w = .5) qFBB(prob = .5, size=10, mu = .3, theta = 1/(20+1), p = .5, w = .5) pFBB(q = c(5,7,8), size=10, mu = .3, phi = 20, p = .5, w = .5) pFBB(q = c(5,7,8), size=10, mu = .3, theta = 1/(20+1), p = .5, w = .5) rFBB(n = 100, size = 40, mu = .5, phi = 5, p = .3, w = .6) rFBB(n = 100, size = 40, mu = .5, theta = 1/(5+1), p = .3, w = .6)dFBB(x = c(5,7,8), size=10, mu = .3, phi = 20, p = .5, w = .5) dFBB(x = c(5,7,8), size=10, mu = .3, theta = 1/(20+1), p = .5, w = .5) qFBB(prob = .5, size=10, mu = .3, phi = 20, p = .5, w = .5) qFBB(prob = .5, size=10, mu = .3, theta = 1/(20+1), p = .5, w = .5) pFBB(q = c(5,7,8), size=10, mu = .3, phi = 20, p = .5, w = .5) pFBB(q = c(5,7,8), size=10, mu = .3, theta = 1/(20+1), p = .5, w = .5) rFBB(n = 100, size = 40, mu = .5, phi = 5, p = .3, w = .6) rFBB(n = 100, size = 40, mu = .5, theta = 1/(5+1), p = .3, w = .6)
Density function, distribution function, quantile function, and random generation for the (augmented) variance-inflated beta distribution.
dVIB(x, mu, phi, p, k, q0 = NULL, q1 = NULL, log = FALSE) qVIB(prob, mu, phi, p, k, q0 = NULL, q1 = NULL, log.prob = FALSE) pVIB(q, mu, phi, p, k, q0 = NULL, q1 = NULL, log.prob = FALSE) rVIB(n, mu, phi, p, k, q0 = NULL, q1 = NULL)dVIB(x, mu, phi, p, k, q0 = NULL, q1 = NULL, log = FALSE) qVIB(prob, mu, phi, p, k, q0 = NULL, q1 = NULL, log.prob = FALSE) pVIB(q, mu, phi, p, k, q0 = NULL, q1 = NULL, log.prob = FALSE) rVIB(n, mu, phi, p, k, q0 = NULL, q1 = NULL)
x, q
|
a vector of quantiles. |
mu |
the mean parameter. It must lie in (0, 1). |
phi |
the precision parameter. It must be a real positive value. |
p |
the mixing weight. It must lie in (0, 1). |
k |
the extent of the variance inflation. It must lie in (0, 1). |
q0 |
the probability of augmentation in zero. It must lie in (0, 1). In case of no augmentation, it is |
q1 |
the probability of augmentation in one. It must lie in (0, 1). In case of no augmentation, it is |
log |
logical; if TRUE, densities are returned on log-scale. |
prob |
a vector of probabilities. |
log.prob |
logical; if TRUE, probabilities |
n |
the number of values to generate. If |
The VIB distribution is a special mixture of two beta distributions with probability density function
for , where is the beta density with a mean-precision parameterization.
Moreover, is the mixing weight, represents the overall (as well as mixture component)
mean, is a precision parameter, and determines the extent of the variance inflation.
The augmented VIB distribution has density
, if
, if
, if
where identifies the augmentation in zero, identifies the augmentation in one,
and .
The function dVIB returns a vector with the same length as x containing the density values.
The function pVIB returns a vector with the same length as q containing the values of the distribution function.
The function qVIB returns a vector with the same length as prob containing the quantiles.
The function rVIB returns a vector of length n containing the generated random values.
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5) dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q1 = .1) dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q0 = .2, q1 = .1) qVIB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5) qVIB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q1 = .1) qVIB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q0 = .2, q1 = .1) pVIB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5) pVIB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q1 = .1) pVIB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q0 = .2, q1 = .1) rVIB(n = 100, mu = .5, phi = 30, p = .3, k = .6) rVIB(n = 100, mu = .5, phi = 30, p = .3, k = .6, q0 = .2, q1 = .1)dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5) dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q1 = .1) dVIB(x = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q0 = .2, q1 = .1) qVIB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5) qVIB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q1 = .1) qVIB(prob = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q0 = .2, q1 = .1) pVIB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5) pVIB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q1 = .1) pVIB(q = c(.5,.7,.8), mu = .3, phi = 20, p = .5, k= .5, q0 = .2, q1 = .1) rVIB(n = 100, mu = .5, phi = 30, p = .3, k = .6) rVIB(n = 100, mu = .5, phi = 30, p = .3, k = .6, q0 = .2, q1 = .1)
Data from the Italian general election held on 4 March 2018.
A data.frame containing 232 observations on the following 13 variables:
NVotesthe number of valid votes.
FIthe percentage of votes got by Forza Italia' party.} \item{\code{FDI}}{the percentage of votes got by Fratelli d'Italia' party.
LEGAthe percentage of votes got by Lega' party.} \item{\code{LEU}}{the percentage of votes got by Liberi e Uguali' party.
M5Sthe percentage of votes got by Movimento 5 Stelle' party.} \item{\code{PD}}{the percentage of votes got by Partito Democratico' party.
Otherthe percentage of votes got by other parties, including blank ballots.
AgeIndthe age index, defined as the ratio of the number of elderly persons (aged 65 and over) to the number of young persons (from 0 to 14), divided by 10.
PopDensthe number of inhabitants per square km.
ERthe employment rate, defined as the ratio of the number of employed persons (aged 15-64) to the number of persons (aged 15-64).
Illiteracythe illiteracy rate, defined as the ratio of the number of persons without a qualification (aged 15 and over) to the total number of persons aged 15 and over.
Foreignthe number of foreigners per 1000 inhabitants.
Data are collected on the 232 electoral districts into which the Italian territory is organized. Distribution of votes for Aosta constituency is not available. Distributions of votes are available on the Italian Ministry of Interior’s webpage, whereas constituencies information have been obtained from 2011 Italian Census. The count of votes got by each party can be derived by multiplying the percentage of votes and the number of valid votes.
Italian Ministry of Interior’s webpage: https://www.interno.gov.it/it/speciali/2018-elections.
The function fits some flexible regression models for bounded continuous responses (e.g., proportions and rates) via a Bayesian approach to inference based on Hamiltonian Monte Carlo algorithm.
Available regression models are the flexible beta regression model (type = "FB", default), the variance inflated beta (type = "VIB"), the beta (type = "Beta"), as well as their augmented versions.
flexreg( formula, zero.formula = NULL, one.formula = NULL, data, type = "FB", link.mu = "logit", prior.beta = "normal", hyperparam.beta = NULL, prior.omega0 = "normal", hyperparam.omega0 = NULL, prior.omega1 = "normal", hyperparam.omega1 = NULL, hyperparam.p = NULL, hyperparam.w = NULL, hyperparam.k = NULL, link.phi = NULL, prior.phi = NULL, hyperparam.phi = NULL, prior.psi = NULL, hyperparam.psi = NULL, n.chain = 1, n.iter = 5000, warmup.perc = 0.5, thin = 1, verbose = TRUE, ... )flexreg( formula, zero.formula = NULL, one.formula = NULL, data, type = "FB", link.mu = "logit", prior.beta = "normal", hyperparam.beta = NULL, prior.omega0 = "normal", hyperparam.omega0 = NULL, prior.omega1 = "normal", hyperparam.omega1 = NULL, hyperparam.p = NULL, hyperparam.w = NULL, hyperparam.k = NULL, link.phi = NULL, prior.phi = NULL, hyperparam.phi = NULL, prior.psi = NULL, hyperparam.psi = NULL, n.chain = 1, n.iter = 5000, warmup.perc = 0.5, thin = 1, verbose = TRUE, ... )
formula |
an object of class " |
zero.formula |
an object of class " |
one.formula |
an object of class " |
data |
an optional |
type |
a character specifying the type of regression model. Current options are |
link.mu |
a character specifying the link function for the mean model (mu). Currently, |
prior.beta |
a character specifying the prior distribution for the regression coefficients of the mean model, |
hyperparam.beta |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
prior.omega0 |
a character specifying the prior distribution for the regression coefficients of the augmented model in zero, |
hyperparam.omega0 |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
prior.omega1 |
a character specifying the prior distribution for the regression coefficients of the augmented model in one, |
hyperparam.omega1 |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
hyperparam.p |
a vector of length 2 with positive elements specifying the hyperparameters for the beta prior distribution of |
hyperparam.w |
a vector of length 2 with positive elements specifying the hyperparameters for the beta prior distribution of |
hyperparam.k |
a vector of length 2 with positive elements specifying the hyperparameters for the beta prior distribution of |
link.phi |
a character specifying the link function for the precision model (phi). Currently, |
prior.phi |
a character specifying the prior distribution for precision parameter |
hyperparam.phi |
a positive numeric (vector of length 1) specifying the hyperprior parameter for the prior distribution of |
prior.psi |
a character specifying the prior distribution for the regression coefficients of the precision model |
hyperparam.psi |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
n.chain |
a positive integer specifying the number of Markov chains. The default is 1. |
n.iter |
a positive integer specifying the number of iterations for each chain (including warm-up). The default is 5000. |
warmup.perc |
the percentage of iterations per chain to discard. |
thin |
a positive integer specifying the period for saving samples. The default is 1. |
verbose |
a logical (with default |
... |
additional arguments from |
Let Y be a continuous bounded random variable whose distribution can be specified in the type argument and be the mean of Y.
The flexreg function links the parameter to a linear predictor through a function specified in link.mu:
where is the vector of regression coefficients for the mean model.
The prior distribution and the related hyperparameter of can be specified in prior.beta and hyperparam.beta, respectively.
By default, the precision parameter is assumed to be constant.
The prior distribution and the related hyperparameter of can be specified in prior.phi and hyperparam.phi.
It is possible to extend the model by linking to an additional (possibly overlapping) set of covariates through a proper link
function specified in the link.phi argument:
where is the vector of regression coefficients for the precision model.
The prior distribution and the related hyperparameter of can be specified in prior.psi and hyperparam.psi.
In the function flexreg, the regression model for the mean and, where appropriate, for the precision parameter can be specified in the
formula argument with a formula of type y ~ x1 + x2 | z1 + z2 where covariates on the left of "|" are included in the regression model
for the mean, whereas covariates on the right of "|" are included in the regression model for the precision.
If the second part is omitted, i.e., y ~ x1 + x2, the precision is assumed constant for each observation.
In presence of zero values in the response, one has to link the parameter , i.e., the probability of augmentation in zero, to an additional (possibly overlapping) set of covariates through a logit link function:
where is the vector of regression coefficients for the augmented model in zero.
The prior distribution and the related hyperparameter of can be specified in prior.omega0 and hyperparam.omega0.
In presence of one values in the response, one has to link the parameter , i.e., the probability of augmentation in one, to an additional (possibly overlapping) set of covariates through a logit link function:
where is the vector of regression coefficients for the augmented model in one.
The prior distribution and the related hyperparameter of can be specified in prior.omega1 and hyperparam.omega1.
If both the augmented models in zero and one are specified, the link function is a bivariate logit.
In flexreg function, the augmented models in zero and/or one can be specified in the
zero.formula and/or one.formula arguments with a formula of type ~ x.
Left hand side in zero.formula and one.formula can be omitted; if specified, they have to be the same as left hand side in formula.
The flexreg function returns an object of class `flexreg`, i.e. a list with the following elements:
call |
the function call. |
type |
the type of regression model. |
formula |
the overall formula. |
aug |
a character specifing the absence of the augmentation ( |
link.mu |
a character specifing the link function in the mean model. |
link.phi |
a character specifing the link function in the precision model. |
model |
a list of objects of class |
response |
the response variable, assuming values in (0, 1). |
design.X |
the design matrix for the mean model. |
design.Z |
the design matrix for the precision model (if defined). |
design.X0 |
the design matrix for the augmented model in zero (if defined). |
design.X1 |
the design matrix for the augmented model in one (if defined). |
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Ferrari, S.L.P., Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815. doi:10.1080/0266476042000214501
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018) A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type="FB") # Regression model with one augmentation: AFB1 <- flexreg(accuracy ~ dyslexia | iq + dyslexia + iq:dyslexia, one.formula = ~ iq + dyslexia, data = Reading, type="FB") ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type="FB") # Regression model with one augmentation: AFB1 <- flexreg(accuracy ~ dyslexia | iq + dyslexia + iq:dyslexia, one.formula = ~ iq + dyslexia, data = Reading, type="FB") ## End(Not run)
The function fits some flexible regression models for bounded discrete responses via a Bayesian approach to inference based on Hamiltonian Monte Carlo algorithm.
Available regression models are the flexible beta-binomial (type = "FBB", default), the beta-binomial (type = "BetaBin"), and the binomial one (type = "Bin").
flexreg_binom( formula, data, type = "FBB", n, link.mu = "logit", prior.beta = "normal", hyperparam.beta = 100, hyperparam.theta = NULL, hyperparam.p = NULL, hyperparam.w = NULL, link.theta = NULL, prior.psi = NULL, hyperparam.psi = NULL, n.chain = 1, n.iter = 5000, warmup.perc = 0.5, thin = 1, verbose = TRUE, ... )flexreg_binom( formula, data, type = "FBB", n, link.mu = "logit", prior.beta = "normal", hyperparam.beta = 100, hyperparam.theta = NULL, hyperparam.p = NULL, hyperparam.w = NULL, link.theta = NULL, prior.psi = NULL, hyperparam.psi = NULL, n.chain = 1, n.iter = 5000, warmup.perc = 0.5, thin = 1, verbose = TRUE, ... )
formula |
an object of class " |
data |
an optional |
type |
a character specifying the type of regression model. Current options are |
n |
a character specifying the name of the variable containing the total number of trials. |
link.mu |
a character specifying the link function for the mean model. Currently, |
prior.beta |
a character specifying the prior distribution for the regression coefficients of the mean model, |
hyperparam.beta |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
hyperparam.theta |
a vector of length 2 with positive elements specifying the hyperparameters for the beta prior distribution of |
hyperparam.p |
a vector of length 2 with positive elements specifying the hyperparameters for the beta prior distribution of |
hyperparam.w |
a vector of length 2 with positive elements specifying the hyperparameters for the beta prior distribution of |
link.theta |
a character specifying the link function for the overdispersion model. Currently, |
prior.psi |
a character specifying the prior distribution for the regression coefficients of the overdispersion model, |
hyperparam.psi |
a positive numeric (vector of length 1) specifying the hyperprior scale parameter for the prior distribution of |
n.chain |
a positive integer specifying the number of Markov chains. The default is 1. |
n.iter |
a positive integer specifying the number of iterations for each chain (including warm-up). The default is 5000. |
warmup.perc |
the percentage of iterations per chain to discard. |
thin |
a positive integer specifying the period for saving samples. The default is 1. |
verbose |
a logical (with default |
... |
additional arguments from |
Let Y be a random variable whose distribution can be specified in the type argument and be the mean of Y/n.
The flexreg_binom function links the parameter to a linear predictor through a function specified in link.mu:
where is the vector of regression coefficients for the mean model.
The prior distribution and the related hyperparameter of can be specified in prior.beta and hyperparam.beta.
By default, link.theta="identity", meaning that the overdispersion parameter is assumed to be constant.
In that case, the prior distribution for is a beta with shape hyperparameters and that can be specified in hyper.theta.a and hyper.theta.b.
If not specified, , otherwise if only one hyperparameter is specified, the other is set equal.
It is possible to extend the model by linking to an additional (possibly overlapping) set of covariates through a proper link
function specified in the link.theta argument:
where is the vector of regression coefficients for the overdispersion model.
The prior distribution and the related hyperparameter of can be specified in prior.psi and hyperparam.psi.
In flexreg_binom, the regression model for the mean and, where appropriate, for the overdispersion parameter can be specified in the
formula argument with a formula of type y ~ x1 + x2 | z1 + z2 where covariates on the left of "|" are included in the regression model
for the mean, whereas covariates on the right of "|" are included in the regression model for the overdispersion.
If the second part is omitted, i.e., y ~ x1 + x2, the overdispersion is assumed constant for each observation.
The flexreg_binom function returns an object of class `flexreg`, i.e. a list with the following elements:
call |
the function call. |
type |
the type of regression model. |
formula |
the original formula. |
link.mu |
a character specifing the link function in the mean model. |
link.theta |
a character specifing the link function in the overdispersion model. |
model |
a list containing an object of class |
response |
the response variable, assuming values in (0, 1). |
design.X |
the design matrix for the mean model. |
design.Z |
the design matrix for the overdispersion model (if defined). |
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
## Not run: data(Bacteria) fbb <- flexreg_binom(y ~ females, n = "n", data = Bacteria, type = "FBB") ## End(Not run)## Not run: data(Bacteria) fbb <- flexreg_binom(y ~ females, n = "n", data = Bacteria, type = "FBB") ## End(Not run)
Method for extracting design matrix from fitted
regression model objects of class `flexreg`.
model_frame(object, ...)model_frame(object, ...)
object |
an object of class |
... |
additional arguments. Currently not used. |
The method returns a list containing all the design matrices involved
in the regression models, namely `X` (regression on the mean), `Z`
(regression on the precision or overdispersion parameters), `X0` (regression
on the augmentation in zero probability), and/or `X1` (regression
on the augmentation in one probability).
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter = 1000) model.frame(FB) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter = 1000) model.frame(FB) ## End(Not run)
Method for extracting design matrices from fitted
regression model objects of class `flexreg`.
## S3 method for class 'flexreg' model.matrix(object, ...)## S3 method for class 'flexreg' model.matrix(object, ...)
object |
an object of class |
... |
additional arguments. Currently not used. |
The method returns a list containing all the design matrices involved
in the regression models, namely `X` (regression on the mean), `Z`
(regression on the precision or overdispersion parameters), `X0` (regression
on the augmentation in zero probability), and/or `X1` (regression
on the augmentation in one probability).
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter = 1000) model.matrix(FB) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter = 1000) model.matrix(FB) ## End(Not run)
flexreg ObjectsMethod for plotting regression curves for the mean from fitted regression model objects of class `flexreg`.
## S3 method for class 'flexreg' plot( x, name.x, additional.cov.default = NA, smooth = TRUE, cluster = FALSE, type = "response", ... )## S3 method for class 'flexreg' plot( x, name.x, additional.cov.default = NA, smooth = TRUE, cluster = FALSE, type = "response", ... )
x |
an object of class |
name.x |
a character containing the name of the covariate from the mean model to be plotted on the x-axis of the scatterplot. |
additional.cov.default |
a list of additional covariates from the mean model and their value to be set as default. |
smooth |
a logical value indicating wheater the curves should be smooth ( |
cluster |
logical. If the model is |
type |
a vector of characters indicating the regression curves to be plotted. Available options are |
... |
additional arguments. Currently not used. |
The function produces a scatterplot of the covariate from the mean model specified in name.x and y or y/n if the response is bounded continuous or discrete, respectively. Any other variable specified in the mean model must be set to a default through the additional.cov.default argument.
The argument type = "response" plots the conditional mean curve (i.e., ), whereas the argument type = "response.aug", available only for augmented models,
plots the augmented mean curve.
If the regression model is of "FB" or "FBB" type and cluster = TRUE, then the function returns two additional curves corresponding to the component means, i.e., and .
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq + dyslexia, data = Reading) plot(FB, name.x="iq", additional.cov.default = list("dyslexia"=1)) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq + dyslexia, data = Reading) plot(FB, name.x="iq", additional.cov.default = list("dyslexia"=1)) ## End(Not run)
`flexreg_postpred` objectsMethod for an object of class `flexreg_postpred` containing the simulated posterior predictive distribution, usually the result of posterior_predict function. The plot shows the posterior predictive interval for each statistical unit.
Additionally, the mean of the posterior predictives and the values of the observed response (either or for bounded continuous or discrete responses, respectively) can be added.
## S3 method for class 'flexreg_postpred' plot(x, prob = 0.9, p_mean = F, response = NULL, ...)## S3 method for class 'flexreg_postpred' plot(x, prob = 0.9, p_mean = F, response = NULL, ...)
x |
an object of class |
prob |
the interval probability for the posterior predictives (default is 0.9). |
p_mean |
a logical value indicating whether the posterior predictives' mean should be plotted. |
response |
a numerical vector containing the response (either |
... |
additional arguments. Currently not used. |
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter=1000) pp <- posterior_predict(FB) plot(pp) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter=1000) pp <- posterior_predict(FB) plot(pp) ## End(Not run)
`flexreg` objectsThe function takes an object of class `flexreg` and generates values from the posterior predictive distribution.
## S3 method for class 'flexreg' posterior_predict(model, newdata = NULL, n.new = NULL)## S3 method for class 'flexreg' posterior_predict(model, newdata = NULL, n.new = NULL)
model |
an object of class |
newdata |
an optional |
n.new |
an optional vector containing the total number of trials with which to predict. It must be specified if |
The function generates values from the posterior predictive distribution, which is the distribution of a future outcome given the observed data.
The posterior predictive distribution is computed for in case of bounded continuous responses and
for in case of bounded discrete responses.
An object of class `flexreg_postpred` containing a matrix with the simulated posterior predictions. Each column refers to a statistical unit to predict.
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S., Ongaro, A. (2020). Robustness against outliers: A new variance inflated regression model for proportions. Statistical Modelling, 20(3), 274–309. doi:10.1177/1471082X18821213
Gelman, A., Carlin, J. B., Stern, H. S., Rubin, D. B. (2014). Bayesian Data Analysis, 3th edition. Chapman and Hall/CRC. doi:10.1201/b16018
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter=1000) pp <- posterior_predict(FB) plot(pp) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter=1000) pp <- posterior_predict(FB) plot(pp) ## End(Not run)
`flexreg` ObjectsMethod that computes various types of predictions from objects of class `flexreg`.
## S3 method for class 'flexreg' predict( object, newdata = NULL, n.new = NULL, cluster = FALSE, type = "response", estimate = "mean", q = NULL, ... )## S3 method for class 'flexreg' predict( object, newdata = NULL, n.new = NULL, cluster = FALSE, type = "response", estimate = "mean", q = NULL, ... )
object |
an object of class |
newdata |
an optional |
n.new |
an optional vector containing the total number of trials with which to predict. It must be specified if |
cluster |
a logical (with default |
type |
a character indicating the type of prediction. Available options are: |
estimate |
a character indicating the type of estimate. Available options are |
q |
if |
... |
additional arguments. Currently not used. |
The predict method computes various types of predictions from objects of class `flexreg`.
If type = "response", the function returns the marginal mean, i.e., .
In case of models for continuous bounded responses with augmentation, the function returns also the overall mean
and the probabilities of augmentation and/or .
If type = "variance", the function returns in case of no augmentation and
in case of augmentation.
If cluster = TRUE, for FB and FBB models, the function returns the cluster means ( and ) when type = "response" and the cluster variances when type = "variance".
The option type = "overdispersion" is available only for beta-binomial and flexible beta-binomial models and returns the fitted overdispersion.
The function returns a data.frame of different dimensions depending on the type of prediction.
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data=Reading, type="FB") predict(FB, type="response", cluster=TRUE) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data=Reading, type="FB") predict(FB, type="response", cluster=TRUE) ## End(Not run)
Print Methods for flexreg Objects
## S3 method for class 'flexreg' print(x, ...)## S3 method for class 'flexreg' print(x, ...)
x |
an object of class |
... |
additional arguments. Currently not used. |
flexreg ObjectsBayesian version of R-squared for flexible regression models for bounded continuous and discrete responses.
R2_bayes(model)R2_bayes(model)
model |
an object (or a list of objects) of class |
The function provides a Bayesian version of the R-squared measure, defined as the variance of the predicted values divided by itself plus the expected variance of the errors.
A list with the same length as the number of objects of class `flexreg` passed in the model argument.
Each element of the list contains a vector of Bayesian R-squared values with the same length as the Markov Chain(s) after warmup.
Gelman, A., Goodrich, B., Gabry, J., Vehtari, A. (2019). R-squared for Bayesian Regression Models, The American Statistician, 73:3, 307–309. doi: 10.1080/00031305.2018.1549100
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB", n.iter=1000) hist(R2_bayes(FB)[[1]]) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type = "FB", n.iter=1000) hist(R2_bayes(FB)[[1]]) ## End(Not run)
Data for assessing the contribution of non-verbal IQ to children's reading skills in dyslexic and non-dyslexic children.
A data.frame containing 44 observations on 4 variables.
accuracya reading score.
accuracy.adjthe adjusted reading score: the observed 1's (perfect reading scores) are substituted with 0.99.
dyslexiaa factor indicating wheter the child is dyslexic.
iqa standardized quantitative measure of the children's non verbal abilities.
The data were originally analyzed by Pammer and Kevan (2004) and successively used by Smithson and Verkuilen (2006) and by Migliorati et al. (2018).
Cribari-Neto, F., Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24.
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
Smithson, M., Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression with Beta-Distributed Dependent Variables. Psychological Methods, 11(7), 54–71.
Method that computes various types of residuals from objects of class `flexreg`. If the model type is FB or FBB and cluster = TRUE, the method returns also residuals with respect to cluster means.
## S3 method for class 'flexreg' residuals( object, type = "raw", cluster = FALSE, estimate = "mean", q = NULL, ... )## S3 method for class 'flexreg' residuals( object, type = "raw", cluster = FALSE, estimate = "mean", q = NULL, ... )
object |
an object of class |
type |
a character indicating type of residuals ( |
cluster |
logical. If the model is |
estimate |
a character indicating the type of estimate: |
q |
if |
... |
additional arguments. Currently not used. |
The residuals method computes raw and standardized residuals from objects of class `flexreg`.
Raw residuals are defined as for bounded continuous responses or as for bounded discrete responses.
Values and are the observed
responses which are specified on the left-hand side of formula in the
flexreg and flexreg_binom functions, respectively.
Moreover, is the predicted value, the result of
the predict function with type = "response".
Standardized residuals are defined as where
is the variance of the response evaluated at the posterior means
–by default, otherwise evaluated at the posterior quantiles of order q– of the parameters.
If the model is "FB" or "FBB", type = "raw", and cluster = TRUE, the cluster raw residuals are computed as
the difference between the observed response/relative response and the cluster means, i.e.,
and .
If the model is "FB" or "FBB", type = "standardized" and cluster = TRUE, the cluster standardized residuals are computed as the
cluster raw residuals divided by the square root of the cluster variances.
Cluster residuals, either raw or standardized, can be used for classification purpose. Indeed, with cluster = TRUE the residuals method returns also a column named
"label" assigning values 1 or 2 to observations depending on whether they are classified in cluster 1 (if the corresponding cluster residual is smaller) or in cluster 2.
The method returns an array with as many rows as the number of observations in the sample. If cluster = FALSE, the array has only one column containing either the raw or standardized residuals.
If cluster = TRUE, the array has four columns: the first column contains the raw or standardized residuals, the second and third columns contain the cluster residuals,
and the fourth column contains the classification labels (see Details).
Ascari, R., Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895–3914. doi:10.1002/sim.9005
Di Brisco, A. M., Migliorati, S. (2020). A new mixed-effects mixture model for constrained longitudinal data. Statistics in Medicine, 39(2), 129–145. doi:10.1002/sim.8406
Migliorati, S., Di Brisco, A. M., Ongaro, A. (2018). A New Regression Model for Bounded Responses. Bayesian Analysis, 13(3), 845–872. doi:10.1214/17-BA1079
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data=Reading, type="FB") residuals(FB, type="raw", cluster=TRUE) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data=Reading, type="FB") residuals(FB, type="raw", cluster=TRUE) ## End(Not run)
Data for assessing the dependency between stress and anxiety in nonclinical women in Townsville, Queensland, Australia.
A data.frame containing 166 observations on the following 2 variables:
stressdefined as rate.
anxietydefined as rate.
Both variables are rates obtained as linear transformations from the Depression Anxiety Stress Scales which range from 0 to 42 (Lovibond & Lovibond, 1995). Additional details can be found in Example 2 from Smithson and Verkuilen (2006).
Example 2 from Smithson and Verkuilen (2006).
Lovibond, P. F., Lovibond, S. H. (1995). The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behaviour research and therapy, 33(3), 335–343.
Smithson, M., Verkuilen, J. (2006). A Better Lemon Squeezer? Maximum-Likelihood Regression with Beta-Distributed Dependent Variables. Psychological Methods, 11(7), 54–71.
`flexreg` ObjectsMethods for extracting information from fitted regression model objects of class `flexreg`.
## S3 method for class 'flexreg' summary(object, ..., digits = 4) ## S3 method for class 'summary.flexreg' print(x, ...) ## S3 method for class 'flexreg' coef(object, ...)## S3 method for class 'flexreg' summary(object, ..., digits = 4) ## S3 method for class 'summary.flexreg' print(x, ...) ## S3 method for class 'flexreg' coef(object, ...)
object |
an object of class |
... |
additional arguments. Currently not used. |
digits |
an integer indicating the number of decimal places. Default equal to 4. |
x |
an object of class |
The summary.flexreg method summarizes the results of flexreg and flexreg_binom functions, adding also information from the functions
residuals.flexreg and WAIC. The summary.flexreg method returns an object of class `summary.flexreg` containing the relevant summary statistics which can subsequently be
printed using the associated print.summary.flexreg method.
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter = 1000) summary(FB) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter = 1000) summary(FB) ## End(Not run)
`flexreg_postpred` objectsSummary method for an object of class `flexreg_postpred`, containing the simulated posterior predictive distribution.
## S3 method for class 'flexreg_postpred' summary(object, ...)## S3 method for class 'flexreg_postpred' summary(object, ...)
object |
an object of class |
... |
additional arguments. |
The function summary.flexreg_postpred returns an array with the statistical units by row. The number of rows of the array is equal to the number of columns of the object of class `flexreg_postpred` that is given to the function.
By column there are some synthesis values that are the minimum, the first quartile, the media, the mean, the third quartile, and the maximum.
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter=1000) pp <- posterior_predict(FB) summary(pp) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, n.iter=1000) pp <- posterior_predict(FB) summary(pp) ## End(Not run)
The function computes widely applicable information criterion (WAIC) and efficient approximate leave-one-out cross-validation (LOO) from fitted regression model objects of class `flexreg`.
WAIC(model, ...) ## S3 method for class 'WAIC.flexreg' print(x, ...)WAIC(model, ...) ## S3 method for class 'WAIC.flexreg' print(x, ...)
model |
an object (or a list of objects) of class |
... |
additional arguments. |
x |
an object of class |
This function takes advantage of the loo package to compute the widely applicable information criterion (WAIC) and leave-one-out cross-validation (LOO) for objects of class `flexreg`.
If a list of two or more objects of class `flexreg` is provided, the function returns the difference in their expected predictive accuracy (see loo_compare for further details).
A named list with components from loo and waic.
Vehtari, A., Gelman, A., Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s11222-016-9696-4
## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type="FB", n.iter=1000) WAIC(FB) ## End(Not run)## Not run: data("Reading") FB <- flexreg(accuracy.adj ~ iq, data = Reading, type="FB", n.iter=1000) WAIC(FB) ## End(Not run)