Title: | Tools for Nonparametric Martingale Posterior Sampling |
---|---|
Description: | Performs Bayesian nonparametric density estimation using Martingale posterior distributions including the Copula Resampling (CopRe) algorithm. Also included are a Gibbs sampler for the marginal Gibbs-type mixture model and an extension to include full uncertainty quantification via a predictive sequence resampling (SeqRe) algorithm. The CopRe and SeqRe samplers generate random nonparametric distributions as output, leading to complete nonparametric inference on posterior summaries. Routines for calculating arbitrary functionals from the sampled distributions are included as well as an important algorithm for finding the number and location of modes, which can then be used to estimate the clusters in the data using, for example, k-means. Implements work developed in Moya B., Walker S. G. (2022). <doi:10.48550/arxiv.2206.08418>, Fong, E., Holmes, C., Walker, S. G. (2021) <doi:10.48550/arxiv.2103.15671>, and Escobar M. D., West, M. (1995) <doi:10.1080/01621459.1995.10476550>. |
Authors: | Blake Moya [cre, aut], The University of Texas at Austin [cph, fnd] |
Maintainer: | Blake Moya <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.2.1 |
Built: | 2024-10-31 19:57:57 UTC |
Source: | CRAN |
Fong, E., Holmes, C., Walker, S. G. (2021). Martingale Posterior Distributions. arXiv. doi:10.48550/arxiv.2103.15671
Moya, B., Walker, S. G. (2022). Uncertainty Quantification and the Marginal MDP Model. arXiv. doi:10.48550/arxiv.2206.08418
Escobar, M. D., West, M. (1995). Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association. doi:10.1080/01621459.1995.10476550
A function that samples predictive distributions for univariate continuous data using an exchangeable predictive extension.
## S3 method for class 'seqre_result'
obj[[i]]

seqre(obj, inc = 1000, eps = 0.001, max_it = 100)
obj | A |
i | A numeric vector of sample indices. |
inc | A positive integer increment value for the number of predictive samples to take each convergence check. |
eps | An error value which determines the convergence approximation. |
max_it | A positive integer maximum number of iterations before halting. |
A seqre_result object, or a list of two seqre_result objects if keep_marg is TRUE.
[[: Subset method for seqre_result objects.
Grid evaluation of copre_result and seqre_result objects
## S3 method for class 'grideval_result'
obj$name

## S3 method for class 'grideval_result'
obj[[i]]

grideval(obj, grd = NULL, func = "density", nthreads = 1)

## S3 method for class 'copre_result'
grideval(obj, grd = NULL, func = "density", nthreads = 1)

## S3 method for class 'seqre_result'
grideval(obj, grd = NULL, func = "density", nthreads = 1)
obj | A |
name | The name of the attribute to access (i.e. |
i | A numeric vector of sample indices. |
grd | For |
func | Either 'distribution', 'density', or 'gradient'. |
nthreads | The number of parallel threads to launch with OpenMP. |
A grideval_result object, which is a matrix with dimension [k, m] of evaluated sample functions, with the following attributes:
func: The evaluated function.
grid: The grid points on which each of the k rows was evaluated.
args: A copy of the args entry from obj.
grideval(copre_result): Grid evaluation method for copre_result objects.
grideval(seqre_result): Grid evaluation method for seqre_result objects.
$: Attribute access method for grideval_result objects.
[[: Subset method for grideval_result objects.
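The return structure described above (a [k, m] matrix carrying func and grid attributes) lends itself to simple pointwise summaries. The base R sketch below builds a mock matrix of that shape and computes a pointwise mean with a 95% band. It does not call the package: the mock data and the names dens, mean_dens, and band are illustrative assumptions, and the package's own plotting code may summarise samples differently.

```r
# Mock stand-in for a grideval_result (assumption: NOT produced by the package).
# Rows are k sampled densities, columns are m grid points.
set.seed(1)
k <- 5
m <- 201
grid <- seq(-4, 4, length.out = m)
means <- rnorm(k, sd = 0.3)
dens <- t(sapply(means, function(mu) dnorm(grid, mean = mu)))
attr(dens, "func") <- "density"
attr(dens, "grid") <- grid

# Pointwise mean over the k samples and a pointwise 95% band.
mean_dens <- colMeans(dens)
band <- apply(dens, 2, quantile, probs = c(0.025, 0.975))
```

Here band is a 2 x m matrix whose rows hold the lower and upper pointwise quantiles, which is the kind of quantity a confint argument of 0.95 would plausibly correspond to.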
Extracts the antimodes from a copre_result or seqre_result object.
antimodes(obj, mean = FALSE, grd = NULL, idx = FALSE)
obj | A |
mean | A logical value indicating whether to extract the antimodes of the mean density or of each individual sampled density. |
grd | For |
idx | A logical value indicating whether to also return the index within |
A matrix of antimode values in the support of the copre_result density.
Create a CopRe Result ggplot
autoplot.copre_result(x, ..., func = "density", confint = NULL)
x | A |
... | Additional arguments discarded from |
func | Either 'distribution', 'density', or 'gradient'. |
confint | A decimal value indicating the confidence interval width (e.g. 0.95 for a 95% confidence interval). Defaults to |
A ggplot object.
Create a ggplot of a grideval_result Object
autoplot.grideval_result(x, ..., confint = NULL)
x | A |
... | Additional arguments discarded from |
confint | A decimal value indicating the confidence interval width (e.g. 0.95 for a 95% confidence interval). Defaults to |
A ggplot object.
Create a SeqRe Result ggplot
autoplot.seqre_result(x, ..., func = "density", confint = NULL)
x | A |
... | Additional arguments discarded from |
func | Either 'distribution', 'density', or 'gradient'. |
confint | A decimal value indicating the confidence interval width (e.g. 0.95 for a 95% confidence interval). Defaults to |
A ggplot object.
A structure for wrapping base measures as in Escobar and West (1995).
base_measure(idx, dim, pars, hpars, eval)
idx | A unique index for the base measure. |
dim | A dimension for the support of the base measure. |
pars | A list of parameters used to generate mixture components. |
hpars | A list of hyperparameters used to generate |
eval | An evaluation function taking |
A base_measure object for use in the sequence resampling scheme for mixtures.
Escobar, M. D., West, M. (1995). Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association. doi:10.1080/01621459.1995.10476550
A function that samples predictive distributions for univariate continuous data using the bivariate Gaussian copula.
copre(
  data,
  N,
  k,
  rho = 0.91,
  grd_res = 1000,
  nthreads = parallel::detectCores(),
  gpu = FALSE,
  gpu_path = NULL,
  gpu_odir = NULL,
  gpu_seed = 1234
)
data | The data from which to sample predictive distributions. |
N | The number of unobserved data points to resample for each chain. |
k | The number of predictive distributions to sample. |
rho | A scalar concentration parameter. |
grd_res | The number of points on which to evaluate the predictive distribution. |
nthreads | The number of threads to call for parallel execution. |
gpu | A logical value indicating whether or not to use the CUDA implementation of the algorithm. |
gpu_path | The path to the CUDA implementation source code. |
gpu_odir | A directory to output the compiled CUDA code. |
gpu_seed | A seed for the CUDA random variates. |
A copre_result object, whose underlying structure is a list which contains the following components:
Fong, E., Holmes, C., Walker, S. G. (2021). Martingale Posterior Distributions. arXiv. doi:10.48550/arxiv.2103.15671
res_cop <- copre(rnorm(50), 10, 10, nthreads = 1)
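To give a feel for what resampling N unobserved points on a grid involves, the following base R sketch applies the bivariate Gaussian copula predictive update from Fong, Holmes, and Walker (2021) along one forward chain. It is a minimal illustration and not the package's implementation: the helper names cop_cdf, cop_dens, and resample_chain are hypothetical, the starting predictive is simply taken to be standard normal, and rho is treated here as the copula correlation.

```r
# Conditional Gaussian copula CDF H_rho(u, v) and copula density c_rho(u, v).
cop_cdf <- function(u, v, rho) {
  pnorm((qnorm(u) - rho * qnorm(v)) / sqrt(1 - rho^2))
}
cop_dens <- function(u, v, rho) {
  s <- sqrt(1 - rho^2)
  dnorm((qnorm(u) - rho * qnorm(v)) / s) / (s * dnorm(qnorm(u)))
}

# One forward chain: P and p are the predictive CDF and density on a fixed
# grid after n observations; N further points are resampled.
resample_chain <- function(P, p, n, N, rho) {
  for (i in (n + 1):(n + N)) {
    alpha <- (2 - 1 / i) / (i + 1)  # update weight sequence from the paper
    v <- runif(1)                   # CDF value of a draw from the current predictive
    p <- p * (1 - alpha + alpha * cop_dens(P, v, rho))  # uses the old P
    P <- (1 - alpha) * P + alpha * cop_cdf(P, v, rho)
  }
  list(P = P, p = p)
}

set.seed(42)
grid <- seq(-5, 5, length.out = 501)
out <- resample_chain(pnorm(grid), dnorm(grid), n = 50, N = 200, rho = 0.91)
```

Repeating the chain k times from the same starting predictive would yield k random distributions, which matches the roles the N and k arguments are described as playing.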
Obtain Functionals from a CopRe Result
functional(obj, f, ..., mean = FALSE)
obj | A |
f | A list of functions. |
... | Additional arguments passed to |
mean | A logical value indicating whether or not to obtain the functional from the pointwise mean of the sampled distributions or from each individually. |
The integral over the copre_result grid of the functions in the list multiplied by the density of each sample distribution in obj.
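The value described here is an integral of each function against each sampled density over the grid. A minimal base R sketch of that computation follows, assuming a trapezoidal rule and a plain matrix of densities. The helpers trapz and functional_sketch are hypothetical names, and the package may use a different quadrature.

```r
# Trapezoidal rule for values y observed at grid points x.
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Integrate every function in the list f against every row (density) of dens.
functional_sketch <- function(dens, grid, f) {
  sapply(f, function(fn) {
    apply(dens, 1, function(d) trapz(grid, fn(grid) * d))
  })
}

grid <- seq(-12, 12, length.out = 2401)
dens <- rbind(dnorm(grid, 0, 1), dnorm(grid, 1, 2))
out <- functional_sketch(
  dens, grid,
  list(mean = function(x) x, second = function(x) x^2)
)
```

The result has one row per sampled density and one column per function, so out[, "mean"] holds the first raw moment of each density.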
Normal-Inverse-Gamma Base Measure for Location-Scale Normal Mixture Models.
G_normls(mu = 0, tau = 1, s = 1, S = 1, a = NULL, A = NULL, w = NULL, W = NULL)
mu | The mean parameter. |
tau | The variance scaling parameter. |
s | The primary shape parameter for the Inverse-Gamma component. |
S | The secondary shape parameter for the Inverse-Gamma component. |
a | The prior mean parameter for |
A | The prior variance for |
w | The prior primary shape parameter for |
W | The prior secondary shape parameter for |
A base_measure object for use in the sequence resampling scheme for mixtures.
A function that samples marginal mixture densities via a marginal Gibbs sampler.
gibbsmix(data, k, b_msr, s_msr, burn = 1000, thin = 150)
data | The data from which to sample predictive distributions. |
k | The number of predictive samples to draw. |
b_msr | A |
s_msr | A |
burn | The number of initial sampling iterations to discard, will be truncated if a non-integer. |
thin | The number of sampling iterations to discard between records, will be truncated if a non-integer. |
A seqre_result object.
seqre(), seq_measure(), base_measure()
Length
## S3 method for class 'grideval_result'
length(x)
x | A |
The number of samples k in obj.
Extracts the modes from a copre_result or seqre_result object.
modes(obj, mean = FALSE, grd = NULL, idx = FALSE, anti = FALSE)

## S3 method for class 'seqre_result'
modes(obj, mean = FALSE, grd = NULL, idx = FALSE, anti = FALSE)

## S3 method for class 'grideval_result'
modes(obj, mean = FALSE, grd = NULL, idx = FALSE, anti = FALSE)

n_modes(obj, mean = FALSE, grd = NULL, anti = FALSE)
obj | A |
mean | A logical value indicating whether to count the modes of the mean density or of each individual sampled density. |
grd | For |
idx | A logical value indicating whether to also return the index within |
anti | A logical value indicating whether to extract true modes or anti-modes (i.e. local minima of the density function). |
A matrix of mode values in the support of the copre_result density.
modes(seqre_result): Mode-counting method for seqre_result objects.
modes(grideval_result): Mode-counting method for grideval_result objects.
n_modes(): Counts the modes from a copre_result or seqre_result object.
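As an illustration of how modes and anti-modes can be located for a density evaluated on a grid, the base R sketch below looks for sign changes in the first differences. This is one common approach and is only assumed to resemble what the package does. The helper find_modes is a hypothetical name, and its anti flag mirrors the anti argument described above.

```r
# Indices of interior local maxima of dens (local minima when anti = TRUE).
find_modes <- function(dens, anti = FALSE) {
  s <- sign(diff(dens))
  if (anti) s <- -s
  # A slope change from positive to non-positive marks a peak at the next index.
  which(diff(s) < 0) + 1
}

grid <- seq(-6, 6, length.out = 1201)
dens <- 0.5 * dnorm(grid, -2) + 0.5 * dnorm(grid, 2)

mode_locs <- grid[find_modes(dens)]
antimode_locs <- grid[find_modes(dens, anti = TRUE)]
```

For this two-component mixture, mode_locs has two entries close to -2 and 2, and antimode_locs has a single entry close to 0. The mode locations could then seed a clustering step such as k-means, as the package description suggests.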
Obtain Moments from a CopRe or SeqRe Result
moment(obj, mom, cntrl = TRUE, grd = NULL)

## S3 method for class 'seqre_result'
moment(obj, mom, cntrl = TRUE, grd = NULL)

## S3 method for class 'grideval_result'
moment(obj, mom, cntrl = TRUE, grd = NULL)
obj | A |
mom | A numeric scalar indicating the moment to calculate. |
cntrl | A logical value indicating whether the moment should be central or not. Defaults to |
grd | A numeric vector of grid values on which the density function samples in |
A vector of moment values for each sampled distribution in obj.
moment(seqre_result): Moment calculation method for seqre_result objects.
moment(grideval_result): Moment calculation method for grideval_result objects.
Create a CopRe Result Plot
## S3 method for class 'copre_result'
plot(x, ..., func = "density", confint = NULL, use_ggplot = TRUE)
x | A |
... | Additional arguments discarded from |
func | Either 'distribution', 'density', or 'gradient'. |
confint | A decimal value indicating the confidence interval width (e.g. 0.95 for a 95% confidence interval). Defaults to |
use_ggplot | A logical value indicating whether to use |
None.
Create a Plot of a grideval_result Object
## S3 method for class 'grideval_result'
plot(x, ..., confint = NULL, use_ggplot = TRUE)
x | A |
... | Additional arguments discarded from |
confint | A decimal value indicating the confidence interval width (e.g. 0.95 for a 95% confidence interval). Defaults to |
use_ggplot | A logical value indicating whether to use |
A ggplot object if ggplot2 is used, else none.
Create a SeqRe Result Plot
## S3 method for class 'seqre_result'
plot(x, ..., func = "density", confint = NULL, use_ggplot = TRUE)
x | A |
... | Additional arguments discarded from |
func | Either 'distribution', 'density', or 'gradient'. |
confint | A decimal value indicating the confidence interval width (e.g. 0.95 for a 95% confidence interval). Defaults to |
use_ggplot | A logical value indicating whether to use |
None.
Register autoplot methods to ggplot2
register_autoplot_s3_methods()
None
https://github.com/tidyverse/hms/blob/master/R/zzz.R
register_s3_method(pkg, generic, class, fun = NULL)
pkg | Package name. |
generic | Generic function name. |
class | Class name. |
fun | Optional custom function name. |
None
Sequence Measure for Species Sampling Models
seq_measure(idx, pars, hpars, Pn, Po)
idx | A unique index for the sequence measure. |
pars | A list of parameters used in |
hpars | A list of hyperparameters used to generate |
Pn | A function on a sequence length |
Po | A function on a sequence length |
A seq_measure object for use in the exchangeable sequence resampling scheme for mixtures.
Dirichlet Sequence Measure.
Sq_dirichlet(alpha = 1, c = NULL, C = NULL)
alpha | The concentration parameter for the Dirichlet process. Must be greater than 0. |
c | The prior primary shape parameter for |
C | The prior secondary shape parameter for |
A seq_measure object for use in the exchangeable sequence resampling scheme for mixtures.
Collapsed Gnedin Process Sequence Measure.
Sq_gnedin0(gamma)
gamma | The gamma parameter for the Gnedin process with xi set to 0. Bounded to |
A seq_measure object for use in the exchangeable sequence resampling scheme for mixtures.
Pitman-Yor Sequence Measure.
Sq_pitmanyor(d, alpha = 1, m = 1L)
d | The discount parameter for the Pitman-Yor process. Must be less than 1. |
alpha | The concentration parameter for the Pitman-Yor process. Must be greater than - |
m | A positive integer used to set |
A seq_measure object for use in the exchangeable sequence resampling scheme for mixtures.
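For intuition about what a sequence measure encodes, the base R sketch below computes the standard Pitman-Yor predictive probabilities of joining an existing cluster or opening a new one, given the current cluster sizes. Setting d = 0 recovers the Dirichlet process case. The helper py_weights is a hypothetical name, the sketch covers only 0 <= d < 1, and how the package's Pn and Po functions are parameterised is not shown in this manual, so the correspondence is an assumption.

```r
# Predictive weights for the next draw under a Pitman-Yor process.
# sizes: current cluster sizes; d: discount; alpha: concentration.
py_weights <- function(sizes, d, alpha) {
  stopifnot(d >= 0, d < 1, alpha > -d)
  n <- sum(sizes)
  k <- length(sizes)
  c(
    old = (sizes - d) / (alpha + n),     # join each existing cluster
    new = (alpha + d * k) / (alpha + n)  # open a new cluster
  )
}

w_py <- py_weights(c(3, 1, 1), d = 0.5, alpha = 1)
w_dp <- py_weights(c(3, 1, 1), d = 0, alpha = 1)
```

The weights sum to one in both cases, and a positive discount shifts mass from existing clusters toward the new-cluster probability.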