Title: | Gamma Shape Mixture |
---|---|
Description: | Implementation of a Bayesian approach for estimating a mixture of gamma distributions in which the mixing occurs over the shape parameter. This family provides a flexible and novel approach for modeling heavy-tailed distributions, it is computationally efficient, and it only requires to specify a prior distribution for a single parameter. |
Authors: | Sergio Venturini |
Maintainer: | Sergio Venturini <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.3.2 |
Built: | 2024-12-17 06:30:12 UTC |
Source: | CRAN |
This package implements a Bayesian approach for estimation of a mixture of gamma distributions in which the mixing occurs over the shape parameter. This family provides a flexible and novel approach for modeling heavy-tailed distributions, it is computationally efficient, and it only requires to specify a prior distribution for a single parameter. See Venturini et al. (2008).
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
Utility function for plotting a Gamma Shape Mixture Model density.
allcurves.q(post, perc)
allcurves.q(post, perc)
post |
matrix containing of a mixture's density posterior draws. |
perc |
percentile, a value that satisfies 0 < perc < 1. |
This is a utility function used to generate the credibility bands for a Gamma Shape Mixture density within plot
.
Sergio Venturini [email protected]
This function provides the inferential algorithm to estimate a mixture of gamma distributions in which the mixing occurs over the shape parameter. It implements the collapsing approach for the GSM model, as discussed in Venturini et al. (2008).
estim.gsm(y, J, G = 100, M = 600, a, b, alpha, init = list(rep(1 / J, J), NA, rep(1, N)))
estim.gsm(y, J, G = 100, M = 600, a, b, alpha, init = list(rep(1 / J, J), NA, rep(1, N)))
y |
vector of data. |
J |
number of mixture components. |
G |
number of points where to evaluate the GSM density. |
M |
number of MCMC runs. |
a |
hyperparameter of the rate parameter prior distribution. |
b |
hyperparameter of the rate parameter prior distribution. |
alpha |
hyperparameter of the mixture's weights prior distribution. |
init |
initialization values. |
Suggestions on how to choose J
, a
and b
are provided in Venturini et al. (2008). In that work the alpha
vector is always set at (1/J
,...,1/J
), but here one is free to choose the value of the generic element of alpha
.
estim.gsm
returns an object of class "gsm", which is a list with the following components:
fdens |
matrix containing the posterior draws for the mixture's density. |
theta |
vector containing the posterior draws for the mixture's rate parameter. |
weight |
matrix containing the posterior draws for the mixture's weights. |
label |
matrix containing the posterior draws for the mixture's labels. |
data |
vector of data. |
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
estim.gsm_theta
,
summary-methods
,
plot-methods
.
## Not run: set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 100 mcmcsim <- 500 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) summary(gsm.out, plot = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, start = (burnin + 1)) ## End(Not run)
## Not run: set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 100 mcmcsim <- 500 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) summary(gsm.out, plot = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, start = (burnin + 1)) ## End(Not run)
This function provides the inferential algorithm to estimate a mixture of gamma distributions in which the mixing occurs over the shape parameter. It implements the standard approach for the GSM model, as discussed in Venturini et al. (2008).
estim.gsm_theta(y, J, G = 100, M = 600, a, b, alpha, init = list(rep(1 / J, J), J / max(y), rep(1, N)))
estim.gsm_theta(y, J, G = 100, M = 600, a, b, alpha, init = list(rep(1 / J, J), J / max(y), rep(1, N)))
y |
vector of data. |
J |
number of mixture components. |
G |
number of points where to evaluate the GSM density. |
M |
number of MCMC runs. |
a |
hyperparameter of the rate parameter prior distribution. |
b |
hyperparameter of the rate parameter prior distribution. |
alpha |
hyperparameter of the mixture's weights prior distribution. |
init |
initialization values. |
Suggestions on how to choose J
, a
and b
are provided in Venturini et al. (2008). In that work the alpha
vector is always set at (1/J
,...,1/J
), but here one is free to choose the value of the generic element of alpha
.
estim.gsm_theta
returns an object of class "gsm", which is a list with the following components:
fdens |
matrix containing the posterior draws for the mixture's density. |
theta |
vector containing the posterior draws for the mixture's rate parameter. |
weight |
matrix containing the posterior draws for the mixture's weights. |
label |
matrix containing the posterior draws for the mixture's labels. |
data |
vector of data. |
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
estim.gsm
,
summary-methods
,
plot-methods
.
## Not run: set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 100 mcmcsim <- 500 J <- 250 gsm.out <- estim.gsm_theta(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) summary(gsm.out, plot = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, start = (burnin + 1)) ## End(Not run)
## Not run: set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 100 mcmcsim <- 500 J <- 250 gsm.out <- estim.gsm_theta(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) summary(gsm.out, plot = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, start = (burnin + 1)) ## End(Not run)
This class encapsulates results of a Gamma Shape Mixture estimation procedure.
Objects can be created by calls of the form new("gsm", fdens, theta, weight, data)
, but most often as the result of a call to estim.gsm
or estim.gsm_theta
.
fdens
:Object of class "matrix"
; posterior draws from the MCMC simulation algorithm of the Gamma Shape Mixture density.
theta
:Object of class "numeric"
; posterior draws from the MCMC simulation algorithm of the Gamma Shape Mixture scale parameter.
weight
:Object of class "matrix"
; posterior draws from the MCMC simulation algorithm of the Gamma Shape Mixture weights.
label
:Object of class "matrix"
; posterior draws from the MCMC simulation algorithm of the Gamma Shape Mixture lables.
data
:Object of class "numeric"
; original data.
signature(x = "gsm", y = "missing")
: Plot Gamma Shape Mixture estimate.
signature(object = "gsm")
: Estimate of the Gamma Shape Mixture upper tail.
signature(object = "gsm")
: Generate object summary.
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
estim.gsm
,
summary-methods
,
plot-methods
,
predict-methods
,
summary-methods
.
Function evaluations for a Gamma Shape Mixture Model.
dgsm(x, weight, rateparam) pgsm(q, weight, rateparam, lower.t = TRUE) rgsm(n, weight, rateparam) qgsm(p, x = NULL, weight, rateparam, alpha = .05, br = c(0, 1000), lower.t = TRUE)
dgsm(x, weight, rateparam) pgsm(q, weight, rateparam, lower.t = TRUE) rgsm(n, weight, rateparam) qgsm(p, x = NULL, weight, rateparam, alpha = .05, br = c(0, 1000), lower.t = TRUE)
x , q
|
vector of quantiles. |
n |
number of observations. |
p |
vector of probabilities. |
weight |
vector of mixture weights. |
rateparam |
reciprocal of the shape parameter, as in |
alpha |
outside the interval (alpha, 1 - alpha) the quantiles are found by searching for the root of F(x) - p = 0. |
br |
a vector containing the end-points of the interval to be searched for the root. |
lower.t |
logical; if TRUE (default), probabilities are P[X <= x] otherwise, P[X > x]. |
The parametrisation implemented in this function is described in Venturini et al. (2008).
dgsm
gives the density, pgsm
gives the distribution function, qgsm
gives the quantile function, and rgsm
generates random deviates.
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
dgamma
,
pgamma
,
rgamma
,
uniroot
.
plot
method for class "gsm". This function plots the output of a Gamma Shape Mixture estimation procedure.
## S4 method for signature 'gsm,missing' plot(x, ndens = 5, xlab = "x", ylab = "density", nbin = 10, histogram = FALSE, bands = FALSE, confid = .95, start = 1, ...)
## S4 method for signature 'gsm,missing' plot(x, ndens = 5, xlab = "x", ylab = "density", nbin = 10, histogram = FALSE, bands = FALSE, confid = .95, start = 1, ...)
x |
object of class "gsm"; a list returned by the |
ndens |
number of simulated density curves to plot. |
xlab |
a title for the x axis. |
ylab |
a title for the y axis. |
nbin |
number of bins for the histogram. |
histogram |
logical; if TRUE the histogram is plotted on the figure. |
bands |
logical; if TRUE the 95% credibility bands are overimposed on the density graph. |
confid |
confidence level for the pointwise credibility bands around the density estimate. |
start |
MCMC run to start from. |
... |
further arguments passed to or from other methods. |
To produce a standard histogram with the estimated density curve superimposed on it, simply set ndens
to 0 and histogram
to TRUE
.
List with the following components:
xval |
horizontal coordinates. |
yval |
vertical coordinates (pointwise density posterior means). |
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
estim.gsm
,
estim.gsm_theta
,
summary-methods
,
predict-methods
.
set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 5 mcmcsim <- 10 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) par(mfrow = c(3, 2)) plot(gsm.out) plot(gsm.out, ndens = 0, nbin = 20, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, bands = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 5, nbin = 20, histogram = TRUE, bands = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, bands = TRUE, start = (burnin + 1))
set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 5 mcmcsim <- 10 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) par(mfrow = c(3, 2)) plot(gsm.out) plot(gsm.out, ndens = 0, nbin = 20, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, histogram = TRUE, bands = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 5, nbin = 20, histogram = TRUE, bands = TRUE, start = (burnin + 1)) plot(gsm.out, ndens = 0, nbin = 20, bands = TRUE, start = (burnin + 1))
predict
method for class "gsm". This function allows to estimate the tail probability of a Gamma Shape Mixture Model using the output of the estim.gsm
or estim.gsm_theta
procedures.
## S4 method for signature 'gsm' predict(object, thresh, start = 1, ...)
## S4 method for signature 'gsm' predict(object, thresh, start = 1, ...)
object |
object of class "gsm"; a list returned by the |
thresh |
threshold value. |
start |
MCMC run to start from. |
... |
further arguments passed to or from other methods. |
The tail probability is estimated by applying the standard Rao-Blackwellized estimator on the Gibbs sampling realizations obtained through the estim.gsm
or estim.gsm_theta
procedures.
A numerical vector containing the posterior draws for the tail probability exceeding the value of thresh
.
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
estim.gsm
,
estim.gsm_theta
,
predict-methods
,
plot-methods
.
set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 5 mcmcsim <- 10 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) thresh <- c(0.1, 0.5, 0.75, 1, 2) tail.prob.est <- tail.prob.true <- rep(NA, length(thresh)) for (i in 1:length(thresh)){ tail.prob.est[i] <- mean(predict(gsm.out, thresh[i])) tail.prob.true[i] <- sum(y > thresh[i])/length(y) } qqplot(tail.prob.true, tail.prob.est, main = "Q-Q plot of true vs. estimated tail probability") abline(0, 1, lty = 2)
set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 5 mcmcsim <- 10 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) thresh <- c(0.1, 0.5, 0.75, 1, 2) tail.prob.est <- tail.prob.true <- rep(NA, length(thresh)) for (i in 1:length(thresh)){ tail.prob.est[i] <- mean(predict(gsm.out, thresh[i])) tail.prob.true[i] <- sum(y > thresh[i])/length(y) } qqplot(tail.prob.true, tail.prob.est, main = "Q-Q plot of true vs. estimated tail probability") abline(0, 1, lty = 2)
summary
method for class "gsm". This function allows to summarize the output of a Gamma Shape Mixture estimate procedure like estim.gsm
or estim.gsm_theta
.
## S4 method for signature 'gsm' summary(object, plot = FALSE, start = 1, ...)
## S4 method for signature 'gsm' summary(object, plot = FALSE, start = 1, ...)
object |
object of class "gsm"; a list returned by the |
plot |
logical; if TRUE produces a bar plot of the mixture weights posterior means. |
start |
MCMC run to start from. |
... |
further arguments passed to or from other methods. |
The function summary
computes and returns a list of summary statistics of the fitted gamma shape mixture given in object
, in particular
theta |
summary index of the theta parameter posterior draws. |
weight posterior means |
vector of the mixture weights posterior means. |
Sergio Venturini [email protected]
Venturini, S., Dominici, F. and Parmigiani, G. (2008), "Gamma shape mixtures for heavy-tailed distributions". Annals of Applied Statistics, Volume 2, Number 2, 756–776. http://projecteuclid.org/euclid.aoas/1215118537
estim.gsm
,
estim.gsm_theta
,
plot-methods
,
predict-methods
.
set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 5 mcmcsim <- 10 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) summary(gsm.out, TRUE, start = (burnin + 1))
set.seed(2040) y <- rgsm(500, c(.1, .3, .4, .2), 1) burnin <- 5 mcmcsim <- 10 J <- 250 gsm.out <- estim.gsm(y, J, 300, burnin + mcmcsim, 6500, 340, 1/J) summary(gsm.out, TRUE, start = (burnin + 1))