Title: | Induced Priors in Bayesian Mixture Models |
---|---|
Description: | Computes implicitly induced quantities from prior/hyperparameter specifications of three Mixtures of Finite Mixtures models: Dirichlet Process Mixtures (DPMs; Escobar and West (1995) <doi:10.1080/01621459.1995.10476550>), Static Mixtures of Finite Mixtures (Static MFMs; Miller and Harrison (2018) <doi:10.1080/01621459.2016.1255636>), and Dynamic Mixtures of Finite Mixtures (Dynamic MFMs; Frühwirth-Schnatter, Malsiner-Walli and Grün (2020) <arXiv:2005.09918>). For methodological details, please refer to Greve, Grün, Malsiner-Walli and Frühwirth-Schnatter (2020) <arXiv:2012.12337> as well as the package vignette. |
Authors: | Jan Greve [aut, cre], Bettina Grün [ctb], Gertraud Malsiner-Walli [ctb], Sylvia Frühwirth-Schnatter [ctb] |
Maintainer: | Jan Greve <[email protected]> |
License: | GPL-2 |
Version: | 1.0.0 |
Built: | 2024-10-31 06:51:07 UTC |
Source: | CRAN |
Evaluates the probability density function of the beta-negative-binomial (BNB) distribution with a mean parameter and two shape parameters.
dbnb(x, mu, a, b, log = FALSE)
x |
vector of quantiles. |
mu |
mean parameter. |
a |
1st shape parameter. |
b |
2nd shape parameter. |
log |
logical; if TRUE, density values p are given as log(p). |
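The BNB distribution has density
$$f(x) = \frac{\Gamma(\mu + x)\, B(\mu + a,\, x + b)}{\Gamma(\mu)\, \Gamma(x + 1)\, B(a, b)},$$
where $\mu$ is the mean parameter and $a$ and $b$ are the first and second shape parameters.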
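As a rough cross-check of the parameterisation, the following base-R sketch evaluates a BNB probability mass function on the log scale. It assumes the standard beta-negative-binomial form, with `mu` playing the role of the negative-binomial size parameter. `dbnb_sketch` is a hypothetical helper written for illustration and is not the package's `dbnb()`.

```r
## Hypothetical stand-in for dbnb(); assumes the standard BNB form
##   f(x) = Gamma(mu + x) * B(mu + a, x + b) /
##          (Gamma(mu) * Gamma(x + 1) * B(a, b))
dbnb_sketch <- function(x, mu, a, b, log = FALSE) {
  lp <- lgamma(mu + x) + lbeta(mu + a, x + b) -
    lgamma(mu) - lgamma(x + 1) - lbeta(a, b)
  if (log) lp else exp(lp)
}

## Under this assumed form the mean is mu * b / (a - 1),
## so BNB(1, 4, 3) has mean 1
x <- 0:10000
p <- dbnb_sketch(x, mu = 1, a = 4, b = 3)
total_mass <- sum(p)    # numerically close to 1
mean_bnb <- sum(x * p)  # numerically close to 1
```

Working on the log scale with `lgamma()` and `lbeta()` avoids overflow for large `x`, which matters when the function is evaluated on a long grid of K - 1 values.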
Numeric vector of density values.
Frühwirth-Schnatter, S., Malsiner-Walli, G., and Grün, B. (2020) Generalized mixtures of finite mixtures and telescoping sampling. https://arxiv.org/abs/2005.09918
## Similar to other d+DISTRIBUTION_NAME functions such as dnorm, it
## evaluates the density of a distribution (in this case the BNB
## distribution) at point x
##
## Let's try with the density of x = 1 for BNB(1,4,3)
x <- 1
dbnb(x, mu = 1, a = 4, b = 3)
## The primary use of this function is in the closures returned from
## fipp() or nClusters() as a prior on K-1
pmf <- nClusters(Kplus = 1:15, N = 100, type = "static",
                 gamma = 1, maxK = 150)
## Now evaluate above when K-1 ~ BNB(1,4,3)
pmf(priorK = dbnb, priorKparams = list(mu = 1, a = 4, b = 3))
## Compare the result with the case when K-1 ~ Pois(1)
pmf(priorK = dpois, priorKparams = list(lambda = 1))
## Although both BNB(1,4,3) and Pois(1) have 1 as their mean, the former
## has a fatter right-hand tail. We see that it is reflected in the
## induced prior on K+ as well
fipp is a closure which returns a function that computes moments of a
user-specified functional over the induced prior partitions. The returned
function requires the prior distribution of the number of mixture components
and its parameters (see examples for details). Optionally, it also accepts the
number of moments to be evaluated (currently only up to 2 are implemented) and
a flag that controls whether the first two moments are reported as
mean/variance (the default) or as 1st/2nd moments.
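The two output conventions are linked by the usual identity Var(X) = E[X^2] - (E[X])^2. A minimal sketch of that conversion follows; `moments_to_meanvar` is a hypothetical helper name and is not part of the package.

```r
## Hypothetical helper: convert the first two raw moments to mean/variance
moments_to_meanvar <- function(m1, m2) {
  c(mean = m1, var = m2 - m1^2)
}

## Toy check with a Poisson(2) variable: E[X] = 2 and E[X^2] = 2 + 2^2 = 6
res <- moments_to_meanvar(m1 = 2, m2 = 6)
```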
fipp( lfunc, Kplus, N, type = c("DPM", "static", "dynamic"), alpha = NULL, gamma = NULL, maxK = NULL, log = FALSE )
lfunc |
the logarithm of the additive symmetric functional to be evaluated over the prior partition. The function should accept only one argument N_j (= the number of observations in a cluster of the partition). |
Kplus |
a numeric value that represents the number of filled clusters in the data |
N |
the number of observations in the data |
type |
the type of model considered. Three models (static/dynamic MFMs and DPM) are supported. |
alpha, gamma |
hyperparameters for the Dirichlet prior. For static MFM, gamma should be specified, while alpha should be specified for all other models (that is, for dynamic MFM and DPM). |
maxK |
the maximum number of K (= the number of mixture components) considered. Only needed for static/dynamic MFMs. |
log |
logical, indicating whether the probability should be logged or not |
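To make the role of `lfunc` concrete, the sketch below evaluates an additive functional on one fixed partition. It assumes that "additive" means the per-cluster terms are summed on the natural scale. The cluster sizes are made up for illustration, and nothing here reproduces the averaging over partitions that `fipp()` performs.

```r
## Log of the indicator "cluster is a singleton":
## log(TRUE) = 0 and log(FALSE) = -Inf
lfunc <- function(n) log(n == 1)

## Hypothetical partition of N = 100 observations into K+ = 5 clusters
sizes <- c(1, 1, 3, 45, 50)

## Additive symmetric functional: exponentiate the logged terms and sum
## them over clusters; the result does not depend on the cluster order
n_singletons <- sum(exp(lfunc(sizes)))
```

Supplying the functional on the log scale lets indicator-type functionals be written with `-Inf` for excluded cluster sizes.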
fipp returns a function which takes two required arguments (required only for static/dynamic MFMs) and two optional arguments:
priorK |
a function with support on the positive integers. The function serves as a prior on K (default = NULL, which is for the DPM). |
priorKparams |
a named list of prior parameters for the function supplied in argument priorK (default = NULL, which is for the DPM). |
order |
maximum number of moments to be evaluated by the function (default = 2). |
replace2ndwvar |
replace the 2nd moment with the variance (default = TRUE). |
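The calling convention (fix the model settings once, then pass a prior function together with a named parameter list) can be imitated in base R with `do.call()`. The sketch below mirrors only that convention. `make_evaluator` is a hypothetical name, and it does not reproduce any computation done by `fipp()`.

```r
## Hypothetical closure mimicking the calling convention only
make_evaluator <- function(maxK) {
  K <- seq_len(maxK)
  function(priorK, priorKparams) {
    ## evaluate the prior at K - 1 = 0, ..., maxK - 1,
    ## passing the named parameters on to the prior function
    do.call(priorK, c(list(K - 1), priorKparams))
  }
}

prior_on_K <- make_evaluator(maxK = 30)
w_pois <- prior_on_K(dpois, list(lambda = 1))
w_geom <- prior_on_K(dgeom, list(prob = 0.1))
```

Because the settings are captured in the closure, different priors can be compared without repeating them.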
Greve, J., Grün, B., Malsiner-Walli, G., and Frühwirth-Schnatter, S. (2020) Spying on the Prior of the Number of Data Clusters and the Partition Distribution in Bayesian Cluster Analysis. https://arxiv.org/abs/2012.12337
Escobar, M. D., and West, M. (1995) Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association 90 (430), Taylor & Francis: 577-88. https://www.tandfonline.com/doi/abs/10.1080/01621459.1995.10476550
Miller, J. W., and Harrison, M. T. (2018) Mixture Models with a Prior on the Number of Components. Journal of the American Statistical Association 113 (521), Taylor & Francis: 340-56. https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1255636
Frühwirth-Schnatter, S., Malsiner-Walli, G., and Grün, B. (2020) Generalized mixtures of finite mixtures and telescoping sampling. https://arxiv.org/abs/2005.09918
## Determine mean/variance of the number of singleton clusters for dynamic
## MFM model conditional on K+ = 5, alpha = 1 with a sample size N = 100.
## We assume that K will be smaller than 30 by setting maxK = 30, please
## increase this value for more realistic analysis.
##
## First create the function singletons():
singletons <- fipp(lfunc = function(n) log(n==1), Kplus = 5, N = 100,
                   type = "dynamic", alpha = 1, maxK = 30)
## Then evaluate it using a Geom(0.1) prior:
singletons(dgeom, list(prob = 0.1))
## Try a different prior, the Poisson prior Pois(1):
singletons(dpois, list(lambda = 1))
## If mean is the only thing you are interested in, try the following:
singletons(dpois, list(lambda = 1), order = 1)
## Also, if you want 1st/2nd moments instead of mean/variance, try:
singletons(dpois, list(lambda = 1), replace2ndwvar = FALSE)
nClusters
is a closure that returns a function which computes a table
of probability masses for specified K+s. Arguments needed for the returned
function to evaluate are: prior distribution of the number of mixture
components and its parameters (see examples for details).
nClusters( Kplus, N, type = c("DPM", "static", "dynamic"), alpha = NULL, gamma = NULL, maxK = NULL, log = FALSE )
Kplus |
a numeric value or vector. All values must be positive integers (that is 1,2,...). It specifies the range of the number of data clusters the user wants to evaluate the prior probabilities on. |
N |
the number of observations in data |
type |
the type of model considered. Three models (static/dynamic MFMs and DPM) are supported. |
alpha, gamma |
hyperparameters for the symmetric Dirichlet prior. For static MFM, gamma should be specified, while alpha should be specified for all other models (that is, dynamic MFM and DPM). |
maxK |
the maximum number of K (= the number of mixture components) considered. Only needed for static/dynamic MFMs. |
log |
logical, indicating whether the returned probability should be logged or not |
nClusters returns a function which takes two arguments:
priorK |
a function with support on the positive integers. The function serves as a prior on K (default = NULL, which is for the DPM). |
priorKparams |
a named list of prior parameters for the function supplied in argument priorK (default = NULL, which is for the DPM). |
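For intuition about what an induced prior on K+ looks like, the DPM case has a well-known textbook form: under the Chinese restaurant process, the n-th observation opens a new cluster with probability alpha / (alpha + n - 1). The sketch below builds the distribution of K+ with that recursion. `crp_kplus_pmf` is a hypothetical helper, and this is not claimed to be how `nClusters()` computes its result.

```r
## Textbook recursion for the number of clusters under a Dirichlet process
## (Chinese restaurant process); shown for intuition only
crp_kplus_pmf <- function(N, alpha) {
  p <- c(1, rep(0, N - 1))      # after one observation: K+ = 1 with prob. 1
  for (n in seq_len(N)[-1]) {   # add observations 2, ..., N
    open_new <- alpha / (alpha + n - 1)
    ## stay at k clusters, or move from k - 1 to k by opening a new one
    p <- p * (1 - open_new) + c(0, p[-N]) * open_new
  }
  p                             # p[k] = P(K+ = k | N, alpha)
}

pmf_dpm <- crp_kplus_pmf(N = 100, alpha = 1)
expected_kplus <- sum(seq_along(pmf_dpm) * pmf_dpm)
```

With alpha = 1 the expected number of clusters equals the harmonic sum 1 + 1/2 + ... + 1/N, which gives a quick sanity check on the output.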
Greve, J., Grün, B., Malsiner-Walli, G., and Frühwirth-Schnatter, S. (2020) Spying on the Prior of the Number of Data Clusters and the Partition Distribution in Bayesian Cluster Analysis. https://arxiv.org/abs/2012.12337
Escobar, M. D., and West, M. (1995) Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association 90 (430), Taylor & Francis: 577-88. https://www.tandfonline.com/doi/abs/10.1080/01621459.1995.10476550
Miller, J. W., and Harrison, M. T. (2018) Mixture Models with a Prior on the Number of Components. Journal of the American Statistical Association 113 (521), Taylor & Francis: 340-56. https://www.tandfonline.com/doi/full/10.1080/01621459.2016.1255636
Frühwirth-Schnatter, S., Malsiner-Walli, G., and Grün, B. (2020) Generalized mixtures of finite mixtures and telescoping sampling. https://arxiv.org/abs/2005.09918
## first, create the function pmf() for the dynamic MFM
## with N = 100, K+ evaluated between 1 and 15 with alpha = 1,
## we assume that K will be smaller than 30 by setting maxK = 30,
## please increase this value for more realistic analysis.
pmf <- nClusters(Kplus = 1:15, N = 100, type = "dynamic",
                 alpha = 1, maxK = 30)
## then, specify the prior for K so that the pmf can be evaluated
## between K+ = 1 and K+ = 15
pmf(dgeom, list(prob = 0.1))
## we can also compare this result with a different prior setting
pmf(dpois, list(lambda = 1))