| Title: | Beta Kernel Process Modeling |
|---|---|
| Description: | Implements the Beta Kernel Process (BKP) for nonparametric modeling of covariate-dependent binomial probabilities, and the Dirichlet Kernel Process (DKP) for categorical or multinomial response data. Scalable global-local approximations are provided through TwinBKP and TwinDKP, using twinning-selected global subsets and local nearest-neighbour updates. Functions are included for model fitting, predictive inference with uncertainty quantification, posterior simulation, and visualization in one- and two-dimensional input spaces. Gaussian, Matern 5/2, Matern 3/2, and Wendland kernels are supported, with hyperparameters selected by multi-start derivative-free optimization. For more details, see Zhao, Qing, and Xu (2025) <doi:10.48550/arXiv.2508.10447>. |
| Authors: | Jiangyan Zhao [cre, aut], Kunhai Qing [aut], Jin Xu [aut] |
| Maintainer: | Jiangyan Zhao <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.3.0 |
| Built: | 2026-07-02 21:12:55 UTC |
| Source: | https://github.com/cran/BKP |
The BKP package provides tools for Bayesian nonparametric modeling of binary/binomial and categorical/multinomial response data using the Beta Kernel Process (BKP), the Dirichlet Kernel Process (DKP), and their scalable global-local approximations, TwinBKP and TwinDKP. These methods estimate latent probability surfaces by combining kernel-based smoothing with conjugate posterior updates.
The package supports model fitting, posterior prediction with uncertainty quantification, posterior simulation, and visualization in one- and two-dimensional input spaces. It also provides flexible prior specification, multiple kernel choices, hyperparameter tuning, and twinning-based global-local computation for scalable BKP and DKP modeling.
Core functionality is organized as follows:
Use fit_BKP and fit_DKP for full BKP and DKP
models, and fit_TwinBKP and fit_TwinDKP for
scalable global-local approximations using twinning-selected global
subsets and local nearest-neighbour updates.
Use predict with fitted BKP, DKP,
TwinBKP, or TwinDKP objects for posterior inference at new
input locations, including posterior means, variances, and credible
intervals. Decision labels, when needed, can be obtained from posterior
means or method-specific outputs.
Use simulate with fitted model objects to generate posterior
draws of latent success probabilities or class-probability vectors.
Use plot with fitted model objects to visualize predictions and
associated uncertainty in one- and two-dimensional input spaces. For
inputs with more than two dimensions, users can select one or two
dimensions to display via the dims argument. TwinBKP and TwinDKP
plots can optionally highlight the selected global subset.
Use summary and print to summarize or display fitted
model objects and associated results.
Use fitted, parameter, and quantile to
extract fitted posterior means, model parameters, and posterior
quantiles.
Use kernel_matrix to construct kernel matrices,
get_prior to construct BKP or DKP prior parameters, and
loss_fun to evaluate leave-one-out tuning losses.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
Vakayil A, Joseph VR (2024). A Global-Local Approximation Framework for Large-Scale Gaussian Process Modeling. Technometrics, 66(2), 295–305. https://doi.org/10.1080/00401706.2023.2296451.
Rolland P, Kavis A, Immer A, Singla A, Cevher V (2019). Efficient learning of smooth probability functions from Bernoulli tests with guarantees. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, volume 97 of Proceedings of Machine Learning Research, pp. 5459–5467. PMLR. https://proceedings.mlr.press/v97/rolland19a.html.
MacKenzie CA, Trafalis TB, Barker K (2014). A Bayesian Beta Kernel Model for Binary Classification and Online Learning Problems. Statistical Analysis and Data Mining: The ASA Data Science Journal, 7(6), 434–449. https://doi.org/10.1002/sam.11241.
Goetschalckx R, Poupart P, Hoey J (2011). Continuous Correlated Beta Processes. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, IJCAI-11, p. 1269–1274. AAAI Press. https://www.ijcai.org/Proceedings/11/Papers/215.pdf.
Fit a Beta Kernel Process (BKP) model for binary or binomial response data. The model estimates a covariate-dependent success probability surface by combining kernel-based smoothing with conjugate Beta posterior updates.
fit_BKP( X, y, m, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = mean(y/m), kernel = c("gaussian", "matern52", "matern32", "wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, theta = NULL, isotropic = TRUE, n_threads = 1, ess = c("none", "shepard") )fit_BKP( X, y, m, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = mean(y/m), kernel = c("gaussian", "matern52", "matern32", "wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, theta = NULL, isotropic = TRUE, n_threads = 1, ess = c("none", "shepard") )
X |
A numeric input matrix or data frame of size |
y |
A numeric vector of observed success counts of length |
m |
A numeric vector of total binomial trial counts of length |
Xbounds |
Optional |
prior |
Type of prior specification. Options are
|
r0 |
Positive scalar prior precision used when |
p0 |
Prior mean used when |
kernel |
Kernel function used for local weighting. Options are
|
loss |
Leave-one-out loss used for kernel hyperparameter tuning. Options
are |
n_multi_start |
Number of initial points used in multi-start
derivative-free optimization of the kernel lengthscale parameters. If
|
theta |
Optional positive kernel lengthscale parameter. When
|
isotropic |
Logical. If |
n_threads |
Number of OpenMP threads used for multi-start optimization.
Default is |
ess |
Effective-sample-size calibration for the kernel-weighted data
contribution. Use |
Inputs are normalized to using Xbounds. For a
location , BKP computes kernel weights
between and the training
inputs. These weights are used to update a local Beta prior with
kernel-weighted binomial counts:
If theta = NULL, the kernel lengthscale parameters are selected by
leave-one-out cross-validation using the specified loss. Optimization
is performed over log-transformed lengthscales using a multi-start
derivative-free search. If theta is supplied, optimization is skipped
and the model is fitted using the supplied lengthscale parameter.
If ess = "shepard", only the kernel-weighted data contribution is
rescaled to match the local effective-sample-size target; the prior
parameters are not rescaled. Shepard calibration requires unique training
input locations on the normalized input scale.
The returned object stores posterior parameters evaluated at the training
inputs. Posterior inference at new inputs is performed by
predict.BKP.
A list of class "BKP" containing the fitted model, with the
following components:
theta_optOptimized or user-specified kernel lengthscale parameter(s).
kernelKernel function used.
isotropicLogical flag indicating whether a shared lengthscale or per-dimension lengthscales were used.
lossLoss function used for hyperparameter tuning.
loss_minLoss value at the selected or user-specified lengthscale parameter(s).
XOriginal input matrix.
XnormInput matrix normalized to .
XboundsNormalization bounds for each input dimension.
yObserved success counts, stored as a one-column matrix.
mObserved binomial trial counts, stored as a one-column matrix.
priorPrior specification used.
r0Prior precision parameter.
p0Prior mean used when prior = "fixed".
alpha0Prior Beta shape parameters evaluated at
the training inputs.
beta0Prior Beta shape parameters evaluated at
the training inputs.
alpha_nPosterior Beta shape parameters
evaluated at the training inputs.
beta_nPosterior Beta shape parameters evaluated
at the training inputs.
essEffective-sample-size calibration method used.
ess_infoESS calibration diagnostics, or NULL when
ess = "none".
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_DKP for Dirichlet Kernel Process modeling of
categorical or multinomial responses; fit_TwinBKP for the
scalable global-local TwinBKP approximation; predict.BKP,
plot.BKP, simulate.BKP, and
summary.BKP for downstream methods.
#-------------------------- 1D Example --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) print(model1) #-------------------------- 2D Example --------------------------- # Define 2D latent function and probability transformation true_pi_fun <- function(X) { if(is.null(nrow(X))) X <- matrix(X, nrow=1) m <- 8.6928 s <- 2.4269 x1 <- 4*X[,1]- 2 x2 <- 4*X[,2]- 2 a <- 1 + (x1 + x2 + 1)^2 * (19- 14*x1 + 3*x1^2- 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1- 3*x2)^2 * (18- 32*x1 + 12*x1^2 + 48*x2- 36*x1*x2 + 27*x2^2) f <- log(a*b) f <- (f- m)/s return(pnorm(f)) # Transform to probability } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.08) print(model2)#-------------------------- 1D Example --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) print(model1) #-------------------------- 2D Example --------------------------- # Define 2D latent function and probability transformation true_pi_fun <- function(X) { if(is.null(nrow(X))) X <- matrix(X, nrow=1) m <- 8.6928 s <- 2.4269 x1 <- 4*X[,1]- 2 x2 <- 4*X[,2]- 2 a <- 1 + (x1 + x2 + 1)^2 * (19- 14*x1 + 3*x1^2- 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1- 3*x2)^2 * (18- 32*x1 + 12*x1^2 + 48*x2- 36*x1*x2 + 27*x2^2) f <- log(a*b) f <- (f- m)/s return(pnorm(f)) # Transform to probability } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.08) print(model2)
Fit a Dirichlet Kernel Process (DKP) model for categorical or multinomial count response data. The model estimates covariate-dependent class-probability surfaces by combining kernel-based smoothing with conjugate Dirichlet posterior updates.
fit_DKP( X, Y, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = NULL, kernel = c("gaussian", "matern52", "matern32", "wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, theta = NULL, isotropic = TRUE, n_threads = 1, ess = c("none", "shepard") )fit_DKP( X, Y, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = NULL, kernel = c("gaussian", "matern52", "matern32", "wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, theta = NULL, isotropic = TRUE, n_threads = 1, ess = c("none", "shepard") )
X |
A numeric input matrix or data frame of size |
Y |
A numeric matrix or data frame of observed multinomial counts, with
dimension |
Xbounds |
Optional |
prior |
Type of prior specification. Options are
|
r0 |
Positive scalar prior precision used when |
p0 |
Prior class-probability vector used when |
kernel |
Kernel function used for local weighting. Options are
|
loss |
Leave-one-out loss used for kernel hyperparameter tuning. Options
are |
n_multi_start |
Number of initial points used in multi-start
derivative-free optimization of the kernel lengthscale parameters. If
|
theta |
Optional positive kernel lengthscale parameter. When
|
isotropic |
Logical. If |
n_threads |
Number of OpenMP threads used for multi-start optimization.
Default is |
ess |
Effective-sample-size calibration for the kernel-weighted
class-count contribution. Use |
Inputs are normalized to using Xbounds. For a
location , DKP computes kernel weights
between and the training
inputs. These weights are used to update a local Dirichlet prior with
kernel-weighted multinomial counts:
Equivalently,
The posterior class-probability vector at follows a
Dirichlet distribution with concentration vector
.
If theta = NULL, the kernel lengthscale parameters are selected by
leave-one-out cross-validation using the specified loss. Optimization
is performed over log-transformed lengthscales using a multi-start
derivative-free search. If theta is supplied, optimization is skipped
and the model is fitted using the supplied lengthscale parameter.
If ess = "shepard", only the kernel-weighted class-count
contribution is rescaled to match the local effective-sample-size target;
the prior parameters are not rescaled. Shepard calibration requires unique
training input locations on the normalized input scale.
The returned object stores posterior Dirichlet concentration parameters
evaluated at the training inputs. Posterior inference at new inputs is
performed by predict.DKP.
A list of class "DKP" containing the fitted model, with the
following components:
theta_optOptimized or user-specified kernel lengthscale parameter(s).
kernelKernel function used.
isotropicLogical flag indicating whether a shared lengthscale or per-dimension lengthscales were used.
lossLoss function used for hyperparameter tuning.
loss_minLoss value at the selected or user-specified lengthscale parameter(s).
XOriginal input matrix.
XnormInput matrix normalized to .
XboundsNormalization bounds for each input dimension.
YObserved multinomial count matrix.
priorPrior specification used.
r0Prior precision parameter.
p0Prior class-probability vector used when
prior = "fixed".
alpha0Prior Dirichlet concentration parameters evaluated at the training inputs.
alpha_nPosterior Dirichlet concentration parameters evaluated at the training inputs.
essEffective-sample-size calibration method used.
ess_infoESS calibration diagnostics, or NULL when
ess = "none".
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP for Beta Kernel Process modeling of binary or
binomial responses; fit_TwinDKP for the scalable global-local
TwinDKP approximation; predict.DKP,
plot.DKP, simulate.DKP, and
summary.DKP for downstream methods.
#-------------------------- 1D Example --------------------------- set.seed(123) # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) print(model1) #-------------------------- 2D Example --------------------------- # Define latent function and transform to 3-class probabilities true_pi_fun <- function(X) { if (is.null(nrow(X))) X <- matrix(X, nrow = 1) m <- 8.6928; s <- 2.4269 x1 <- 4 * X[,1] - 2 x2 <- 4 * X[,2] - 2 a <- 1 + (x1 + x2 + 1)^2 * (19 - 14*x1 + 3*x1^2 - 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1 - 3*x2)^2 * (18 - 32*x1 + 12*x1^2 + 48*x2 - 36*x1*x2 + 27*x2^2) f <- (log(a*b)- m)/s p1 <- pnorm(f) # Transform to probability p2 <- sin(pi * X[,1]) * sin(pi * X[,2]) return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.08) print(model2)#-------------------------- 1D Example --------------------------- set.seed(123) # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) print(model1) #-------------------------- 2D Example --------------------------- # Define latent function and transform to 3-class probabilities true_pi_fun <- function(X) { if (is.null(nrow(X))) X <- matrix(X, nrow = 1) m <- 8.6928; s <- 2.4269 x1 <- 4 * X[,1] - 2 x2 <- 4 * X[,2] - 2 a <- 1 + (x1 + x2 + 1)^2 * (19 - 14*x1 + 3*x1^2 - 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1 - 3*x2)^2 * (18 - 32*x1 + 12*x1^2 + 48*x2 - 36*x1*x2 + 27*x2^2) f <- (log(a*b)- m)/s p1 <- pnorm(f) # Transform to probability p2 <- sin(pi * X[,1]) * sin(pi * X[,2]) return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.08) print(model2)
Fit a Twin Beta Kernel Process (TwinBKP) model for binary or
binomial response data. TwinBKP is a scalable global-local approximation to
the full fit_BKP model. It uses a twinning-selected global
subset to capture the broad input-response structure and local
nearest-neighbour updates to refine posterior inference near each target
location.
fit_TwinBKP( X, y, m, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = mean(y/m), global_kernel = c("gaussian", "matern52", "matern32", "wendland"), local_kernel = c("wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, isotropic = TRUE, n_threads = 1, theta_g = NULL, theta_l = NULL, g = NULL, l = NULL, twins = 5, store_kernel = FALSE )fit_TwinBKP( X, y, m, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = mean(y/m), global_kernel = c("gaussian", "matern52", "matern32", "wendland"), local_kernel = c("wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, isotropic = TRUE, n_threads = 1, theta_g = NULL, theta_l = NULL, g = NULL, l = NULL, twins = 5, store_kernel = FALSE )
X |
A numeric input matrix or data frame of size |
y |
A numeric vector of observed success counts of length |
m |
A numeric vector of total binomial trial counts of length |
Xbounds |
Optional |
prior |
Type of prior specification. Options are
|
r0 |
Positive scalar prior precision used when |
p0 |
Prior mean used when |
global_kernel |
Kernel function used for the global subset contribution.
Options are |
local_kernel |
Kernel function used for the local-neighbour
contribution. Currently only |
loss |
Leave-one-out loss used for kernel hyperparameter tuning. Options
are |
n_multi_start |
Number of initial points used in multi-start
optimization of the global kernel lengthscale parameter(s). If
|
isotropic |
Logical. If |
n_threads |
Number of OpenMP threads used for global-subset
hyperparameter optimization when |
theta_g |
Optional positive global kernel lengthscale parameter. When
|
theta_l |
Optional positive scalar specifying the local Wendland kernel
range. If |
g |
Target global subset size. If |
l |
Number of local non-global neighbours used at each training or
prediction location. If |
twins |
Number of Twinning runs used to identify the global subset.
Larger values may improve the selected global subset at additional
computational cost. Default is |
store_kernel |
Logical. If |
TwinBKP first normalizes the input matrix to . The
global subset is selected by twin_select_global_rcpp() using the
augmented representation cbind(Xnorm, y / m), so the selected points
represent both the normalized input distribution and the empirical response
surface.
Given the selected global subset , TwinBKP uses the union of
and a location-specific set of l nearest non-global
neighbours for posterior aggregation. The global contribution uses
global_kernel and theta_g; the local contribution uses the
compactly supported local_kernel and theta_l. Local
neighbours are found with a kd-tree over the non-global training points via
nanoflann.
If theta_g = NULL, the global lengthscale parameter is selected by
leave-one-out cross-validation on the selected global subset, using the
specified loss. If theta_g is supplied, global tuning is
skipped and the supplied value is used.
By default, TwinBKP aggregates posterior pseudo-counts row-wise and does
not store dense kernel matrices. Fitting posterior
aggregation costs . Prediction at new input points
costs for fixed input dimension. When
store_kernel = TRUE, dense diagnostic matrices are additionally
stored and memory use increases to .
Effective-sample-size calibration is currently available for full BKP and DKP models. TwinBKP uses the uncalibrated global-local posterior update to preserve the intended scalable approximation.
A list of class "TwinBKP" containing the fitted TwinBKP model,
with the following components:
theta_optAlias for theta_g, retained for consistency
with fit_BKP.
theta_gOptimized or user-specified global kernel lengthscale parameter(s).
theta_lLocal kernel range parameter used for the nearest-neighbour local component.
kernelAlias for global_kernel, retained for
consistency with fit_BKP.
global_kernelKernel function used for the global subset contribution.
local_kernelKernel function used for the local-neighbour contribution.
isotropicLogical flag indicating whether the global kernel uses one shared lengthscale or per-dimension lengthscales.
lossLoss function used for global lengthscale tuning.
loss_minLoss value at the selected or user-specified global lengthscale parameter(s).
XOriginal training input matrix.
XnormTraining input matrix normalized to .
XboundsNormalization bounds for each input dimension.
yObserved success counts, stored as a one-column matrix.
mObserved binomial trial counts, stored as a one-column matrix.
priorPrior specification used in the TwinBKP posterior update.
r0Prior precision parameter.
p0Prior mean used when prior = "fixed".
alpha0Prior Beta shape parameters evaluated at
the training inputs.
beta0Prior Beta shape parameters evaluated at
the training inputs.
alpha_nPosterior Beta shape parameters
evaluated at the training inputs.
beta_nPosterior Beta shape parameters evaluated
at the training inputs.
global_indicesOne-based indices of the selected global subset.
controlA list of fitting controls and realized Twinning
settings, including g_target, g, l, r,
twins, u1, leaf_size, n_multi_start,
n_threads, and store_kernel.
diagnosticsA list of diagnostic objects. It contains
twin_info from the C++ Twinning routine and, when
store_kernel = TRUE, dense matrices K, K_global,
and K_local. When store_kernel = FALSE, these kernel
matrices are NULL.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
Vakayil A, Joseph VR (2022). Data Twinning. Statistical Analysis and Data Mining: The ASA Data Science Journal, 15(5), 598–610. https://doi.org/10.1002/sam.11574.
Vakayil A, Joseph VR (2024). A Global-Local Approximation Framework for Large-Scale Gaussian Process Modeling. Technometrics, 66(2), 295–305. https://doi.org/10.1080/00401706.2023.2296451.
Blanco JL, PK Rai (2014). nanoflann: a C++ header-only fork of FLANN, a library for nearest neighbor (NN) with kd-trees. https://github.com/jlblancoc/nanoflann
fit_BKP for the full BKP model,
fit_DKP for multinomial responses,
fit_TwinDKP for the multinomial TwinDKP analogue, and
predict.TwinBKP, plot.TwinBKP,
simulate.TwinBKP, and summary.TwinBKP for
downstream methods.
#-------------------------- 1D Example --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 200 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_TwinBKP( X, y, m, Xbounds = Xbounds, theta_g = 0.04, g = 20, twins = 1, n_threads = 1 ) #-------------------------- 2D Example --------------------------- # Define 2D latent function and probability transformation true_pi_fun <- function(X) { if(is.null(nrow(X))) X <- matrix(X, nrow=1) m <- 8.6928 s <- 2.4269 x1 <- 4*X[,1]- 2 x2 <- 4*X[,2]- 2 a <- 1 + (x1 + x2 + 1)^2 * (19- 14*x1 + 3*x1^2- 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1- 3*x2)^2 * (18- 32*x1 + 12*x1^2 + 48*x2- 36*x1*x2 + 27*x2^2) f <- log(a*b) f <- (f- m)/s return(pnorm(f)) # Transform to probability } n <- 200 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_TwinBKP( X, y, m, Xbounds = Xbounds, theta_g = 0.08, g = 20, twins = 1, n_threads = 1 )#-------------------------- 1D Example --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 200 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_TwinBKP( X, y, m, Xbounds = Xbounds, theta_g = 0.04, g = 20, twins = 1, n_threads = 1 ) #-------------------------- 2D Example --------------------------- # Define 2D latent function and probability transformation true_pi_fun <- function(X) { if(is.null(nrow(X))) X <- matrix(X, nrow=1) m <- 8.6928 s <- 2.4269 x1 <- 4*X[,1]- 2 x2 <- 4*X[,2]- 2 a <- 1 + (x1 + x2 + 1)^2 * (19- 14*x1 + 3*x1^2- 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1- 3*x2)^2 * (18- 32*x1 + 12*x1^2 + 48*x2- 36*x1*x2 + 27*x2^2) f <- log(a*b) f <- (f- m)/s return(pnorm(f)) # Transform to probability } n <- 200 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_TwinBKP( X, y, m, Xbounds = Xbounds, theta_g = 0.08, g = 20, twins = 1, n_threads = 1 )
Fit a Twin Dirichlet Kernel Process (TwinDKP) model for
categorical or multinomial count response data. TwinDKP is a scalable
global-local approximation to the full fit_DKP model. It uses
a twinning-selected global subset to capture the broad input-response
structure and local nearest-neighbour updates to refine posterior inference
near each target location.
fit_TwinDKP( X, Y, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = NULL, global_kernel = c("gaussian", "matern52", "matern32", "wendland"), local_kernel = c("wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, isotropic = TRUE, n_threads = 1, theta_g = NULL, theta_l = NULL, g = NULL, l = NULL, twins = 5, store_kernel = FALSE )fit_TwinDKP( X, Y, Xbounds = NULL, prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = NULL, global_kernel = c("gaussian", "matern52", "matern32", "wendland"), local_kernel = c("wendland"), loss = c("brier", "log_loss"), n_multi_start = NULL, isotropic = TRUE, n_threads = 1, theta_g = NULL, theta_l = NULL, g = NULL, l = NULL, twins = 5, store_kernel = FALSE )
X |
A numeric input matrix or data frame of size |
Y |
A numeric matrix or data frame of observed multinomial counts, with
dimension |
Xbounds |
Optional |
prior |
Type of prior specification. Options are
|
r0 |
Positive scalar prior precision used when |
p0 |
Prior class-probability vector used when |
global_kernel |
Kernel function used for the global subset contribution.
Options are |
local_kernel |
Kernel function used for the local-neighbour
contribution. Currently only |
loss |
Leave-one-out loss used for kernel hyperparameter tuning. Options
are |
n_multi_start |
Number of initial points used in multi-start
optimization of the global kernel lengthscale parameter(s). If
|
isotropic |
Logical. If |
n_threads |
Number of OpenMP threads used for global-subset
hyperparameter optimization when |
theta_g |
Optional positive global kernel lengthscale parameter. When
|
theta_l |
Optional positive scalar specifying the local Wendland kernel
range. If |
g |
Target global subset size. If |
l |
Number of local non-global neighbours used at each training or
prediction location. If |
twins |
Number of Twinning runs used to identify the global subset.
Larger values may improve the selected global subset at additional
computational cost. Default is |
store_kernel |
Logical. If |
TwinDKP first normalizes the input matrix to . The
global subset is selected by twin_select_global_rcpp() using the
augmented representation cbind(Xnorm, Y / rowSums(Y)), so the
selected points represent both the normalized input distribution and the
empirical class-probability surface.
Given the selected global subset , TwinDKP uses the union of
and a location-specific set of l nearest non-global
neighbours for posterior aggregation. The global contribution uses
global_kernel and theta_g; the local contribution uses the
compactly supported local_kernel and theta_l. Local
neighbours are found with a kd-tree over the non-global training points via
nanoflann.
If theta_g = NULL, the global lengthscale parameter is selected by
leave-one-out cross-validation on the selected global subset, using the
specified loss. If theta_g is supplied, global tuning is
skipped and the supplied value is used.
By default, TwinDKP aggregates posterior pseudo-counts row-wise and does
not store dense kernel matrices. Fitting posterior
aggregation costs . Prediction at new input points
costs for fixed input dimension. When
store_kernel = TRUE, dense diagnostic matrices are additionally
stored and memory use increases to .
Effective-sample-size calibration is currently available for full BKP and DKP models. TwinDKP uses the uncalibrated global-local posterior update to preserve the intended scalable approximation.
A list of class "TwinDKP" containing the fitted TwinDKP model,
with the following components:
theta_optAlias for theta_g, retained for consistency
with fit_DKP.
theta_gOptimized or user-specified global kernel lengthscale parameter(s).
theta_lLocal kernel range parameter used for the nearest-neighbour local component.
kernelAlias for global_kernel, retained for
consistency with fit_DKP.
global_kernelKernel function used for the global subset contribution.
local_kernelKernel function used for the local-neighbour contribution.
isotropicLogical flag indicating whether the global kernel uses one shared lengthscale or per-dimension lengthscales.
lossLoss function used for global lengthscale tuning.
loss_minLoss value at the selected or user-specified global lengthscale parameter(s).
XOriginal training input matrix.
XnormTraining input matrix normalized to .
XboundsNormalization bounds for each input dimension.
YObserved multinomial count matrix.
priorPrior specification used in the TwinDKP posterior update.
r0Prior precision parameter.
p0Prior class-probability vector used when
prior = "fixed".
alpha0Prior Dirichlet concentration parameters evaluated at the training inputs.
alpha_nPosterior Dirichlet concentration parameters evaluated at the training inputs.
probPosterior mean class-probability matrix, computed as
normalized rows of alpha_n.
global_indicesOne-based indices of the selected global subset.
controlA list of fitting controls and realized Twinning
settings, including g_target, g, l, r,
twins, u1, leaf_size, n_multi_start,
n_threads, and store_kernel.
diagnosticsA list of diagnostic objects. It contains
twin_info from the C++ Twinning routine and, when
store_kernel = TRUE, dense matrices K, K_global,
and K_local. When store_kernel = FALSE, these kernel
matrices are NULL.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
Vakayil A, Joseph VR (2022). Data Twinning. Statistical Analysis and Data Mining: The ASA Data Science Journal, 15(5), 598–610. https://doi.org/10.1002/sam.11574.
Vakayil A, Joseph VR (2024). A Global-Local Approximation Framework for Large-Scale Gaussian Process Modeling. Technometrics, 66(2), 295–305. https://doi.org/10.1080/00401706.2023.2296451.
Blanco JL, PK Rai (2014). nanoflann: a C++ header-only fork of FLANN, a library for nearest neighbor (NN) with kd-trees. https://github.com/jlblancoc/nanoflann
fit_DKP for the full DKP model,
fit_TwinBKP for the binomial TwinBKP analogue,
fit_BKP for binary or binomial responses, and
predict.TwinDKP, plot.TwinDKP,
simulate.TwinDKP, and summary.TwinDKP for
downstream methods.
#-------------------------- 1D Example --------------------------- set.seed(123) # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 200 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_TwinDKP( X, Y, Xbounds = Xbounds, theta_g = 0.04, g = 20, twins = 1, n_threads = 1 ) print(model1) #-------------------------- 2D Example --------------------------- # Define latent function and transform to 3-class probabilities true_pi_fun <- function(X) { if (is.null(nrow(X))) X <- matrix(X, nrow = 1) m <- 8.6928; s <- 2.4269 x1 <- 4 * X[,1] - 2 x2 <- 4 * X[,2] - 2 a <- 1 + (x1 + x2 + 1)^2 * (19 - 14*x1 + 3*x1^2 - 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1 - 3*x2)^2 * (18 - 32*x1 + 12*x1^2 + 48*x2 - 36*x1*x2 + 27*x2^2) f <- (log(a*b)- m)/s p1 <- pnorm(f) # Transform to probability p2 <- sin(pi * X[,1]) * sin(pi * X[,2]) return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 200 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_TwinDKP( X, Y, Xbounds = Xbounds, theta_g = 0.08, g = 20, twins = 1, n_threads = 1 ) print(model2)#-------------------------- 1D Example --------------------------- set.seed(123) # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 200 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_TwinDKP( X, Y, Xbounds = Xbounds, theta_g = 0.04, g = 20, twins = 1, n_threads = 1 ) print(model1) #-------------------------- 2D Example --------------------------- # Define latent function and transform to 3-class probabilities true_pi_fun <- function(X) { if (is.null(nrow(X))) X <- matrix(X, nrow = 1) m <- 8.6928; s <- 2.4269 x1 <- 4 * X[,1] - 2 x2 <- 4 * X[,2] - 2 a <- 1 + (x1 + x2 + 1)^2 * (19 - 14*x1 + 3*x1^2 - 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1 - 3*x2)^2 * (18 - 32*x1 + 12*x1^2 + 48*x2 - 36*x1*x2 + 27*x2^2) f <- (log(a*b)- m)/s p1 <- pnorm(f) # Transform to probability p2 <- sin(pi * X[,1]) * sin(pi * X[,2]) return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 200 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_TwinDKP( X, Y, Xbounds = Xbounds, theta_g = 0.08, g = 20, twins = 1, n_threads = 1 ) print(model2)
Extract posterior fitted values from fitted BKP,
DKP, TwinBKP, or TwinDKP model objects. For
BKP and TwinBKP objects, this returns the posterior mean
success probability at each training input. For DKP and
TwinDKP objects, this returns the posterior mean class-probability
vector at each training input.
## S3 method for class 'BKP' fitted(object, ...) ## S3 method for class 'DKP' fitted(object, ...) ## S3 method for class 'TwinBKP' fitted(object, ...) ## S3 method for class 'TwinDKP' fitted(object, ...)## S3 method for class 'BKP' fitted(object, ...) ## S3 method for class 'DKP' fitted(object, ...) ## S3 method for class 'TwinBKP' fitted(object, ...) ## S3 method for class 'TwinDKP' fitted(object, ...)
object |
A fitted model object of class |
... |
Additional arguments. Currently unused. |
The fitted() method extracts posterior means already stored
in the fitted model object. For BKP and TwinBKP, the fitted
value at a training input is computed from the corresponding Beta posterior
shape parameters. For DKP and TwinDKP, fitted values are
obtained by normalizing the corresponding Dirichlet posterior concentration
parameters across classes.
For BKP and TwinBKP objects, a numeric vector
containing posterior mean success probabilities at the training inputs.
For DKP and TwinDKP objects, a numeric matrix containing
posterior mean class probabilities at the training inputs, with one row per
training input and one column per class.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model
fitting; predict for posterior prediction at new input locations.
# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Extract fitted values fitted(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Extract fitted values fitted(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Extract fitted values fitted(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Extract fitted values fitted(model) ## End(Not run)# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Extract fitted values fitted(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Extract fitted values fitted(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Extract fitted values fitted(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Extract fitted values fitted(model) ## End(Not run)
Construct prior shape parameters for Beta Kernel Process
(BKP) and Dirichlet Kernel Process (DKP) models. The function supports
prior = "noninformative", "fixed", and
"adaptive" prior specifications.
get_prior( prior = c("noninformative", "fixed", "adaptive"), model = c("BKP", "DKP"), r0 = 2, p0 = NULL, y = NULL, m = NULL, Y = NULL, K = NULL )get_prior( prior = c("noninformative", "fixed", "adaptive"), model = c("BKP", "DKP"), r0 = 2, p0 = NULL, y = NULL, m = NULL, Y = NULL, K = NULL )
prior |
Type of prior specification. Options are
|
model |
A character string specifying the model type: |
r0 |
Positive scalar prior precision used when |
p0 |
For BKP, a scalar in |
y |
A numeric vector of observed success counts of length |
m |
A numeric vector of total binomial trial counts of length |
Y |
A numeric matrix of observed class counts ( |
K |
Optional precomputed kernel matrix, typically obtained from
|
The noninformative prior sets all Beta or Dirichlet shape parameters to 1.
The fixed prior uses a common prior distribution at every evaluated
location. For BKP, this is a Beta prior with shape parameters
r0 * p0 and r0 * (1 - p0). For DKP, this is a Dirichlet
prior with concentration vector r0 * p0.
The adaptive prior estimates location-specific prior means from the observed
data. BKP uses kernel smoothing of the observed proportions y / m,
while DKP uses kernel smoothing of the row-normalized class counts in
Y. In both cases, the adaptive prior precision is scaled by the
local kernel mass.
If model = "BKP": a list with
alpha0Vector of prior alpha parameters for the Beta
distribution, length n.
beta0Vector of prior beta parameters for the Beta
distribution, length n.
If model = "DKP": a matrix alpha0 of prior Dirichlet
parameters at each input location (n × q).
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP for fitting Beta Kernel Process models,
fit_DKP for fitting Dirichlet Kernel Process models,
predict.BKP and predict.DKP for making
predictions, kernel_matrix for computing kernel matrices used
in prior construction.
# -------------------------- BKP --------------------------- set.seed(123) n <- 10 X <- matrix(runif(n * 2), ncol = 2) y <- rbinom(n, size = 5, prob = 0.6) m <- rep(5, n) K <- kernel_matrix(X, theta = 0.2, kernel = "gaussian") # Adaptive BKP prior evaluated at the rows of K prior_bkp <- get_prior( model = "BKP", prior = "adaptive", r0 = 2, y = y, m = m, K = K ) # Fixed BKP prior; with K = NULL, output length is inferred from y/m prior_bkp_fixed <- get_prior( model = "BKP", prior = "fixed", r0 = 2, p0 = 0.5, y = y, m = m ) # -------------------------- DKP --------------------------- n <- 15 q <- 3 X <- matrix(runif(n * 2), ncol = 2) true_pi <- t(apply(X, 1, function(x) { raw <- c( exp(-sum((x - 0.2)^2)), exp(-sum((x - 0.5)^2)), exp(-sum((x - 0.8)^2)) ) raw / sum(raw) })) m <- sample(10:20, n, replace = TRUE) Y <- t(sapply(seq_len(n), function(i) { rmultinom(1, size = m[i], prob = true_pi[i, ]) })) K <- kernel_matrix(X, theta = 0.2, kernel = "gaussian") # Adaptive DKP prior evaluated at the rows of K prior_dkp <- get_prior( model = "DKP", prior = "adaptive", r0 = 2, Y = Y, K = K ) # Fixed DKP prior; with K = NULL, output rows are inferred from Y prior_dkp_fixed <- get_prior( model = "DKP", prior = "fixed", r0 = 2, p0 = rep(1 / q, q), Y = Y )# -------------------------- BKP --------------------------- set.seed(123) n <- 10 X <- matrix(runif(n * 2), ncol = 2) y <- rbinom(n, size = 5, prob = 0.6) m <- rep(5, n) K <- kernel_matrix(X, theta = 0.2, kernel = "gaussian") # Adaptive BKP prior evaluated at the rows of K prior_bkp <- get_prior( model = "BKP", prior = "adaptive", r0 = 2, y = y, m = m, K = K ) # Fixed BKP prior; with K = NULL, output length is inferred from y/m prior_bkp_fixed <- get_prior( model = "BKP", prior = "fixed", r0 = 2, p0 = 0.5, y = y, m = m ) # -------------------------- DKP --------------------------- n <- 15 q <- 3 X <- matrix(runif(n * 2), ncol = 2) true_pi <- t(apply(X, 1, function(x) { raw <- c( exp(-sum((x - 0.2)^2)), exp(-sum((x - 0.5)^2)), exp(-sum((x - 0.8)^2)) ) raw / sum(raw) })) m <- sample(10:20, n, replace = TRUE) Y <- t(sapply(seq_len(n), function(i) { rmultinom(1, size = m[i], prob = true_pi[i, ]) })) K <- kernel_matrix(X, theta = 0.2, kernel = "gaussian") # Adaptive DKP prior evaluated at the rows of K prior_dkp <- get_prior( model = "DKP", prior = "adaptive", r0 = 2, Y = Y, K = K ) # Fixed DKP prior; with K = NULL, output rows are inferred from Y prior_dkp_fixed <- get_prior( model = "DKP", prior = "fixed", r0 = 2, p0 = rep(1 / q, q), Y = Y )
Computes the kernel matrix between two sets of input locations using a specified kernel function. Supports both isotropic and anisotropic lengthscales. Available kernels include the Gaussian, Matérn 5/2, Matérn 3/2, and Wendland compactly supported kernel.
kernel_matrix( X, Xprime = NULL, theta = 0.1, kernel = c("gaussian", "matern52", "matern32", "wendland"), isotropic = TRUE )kernel_matrix( X, Xprime = NULL, theta = 0.1, kernel = c("gaussian", "matern52", "matern32", "wendland"), isotropic = TRUE )
X |
A numeric matrix (or vector) of input locations with shape |
Xprime |
An optional numeric matrix of input locations with shape |
theta |
A positive numeric value or vector specifying the kernel
lengthscale(s). If |
kernel |
A character string specifying the kernel function. Must be one
of |
isotropic |
Logical. If |
Let and denote two input
points. The scaled Euclidean distance is
For isotropic kernels, is a scalar shared by all
input dimensions. For anisotropic kernels, is a
vector of dimension-specific lengthscales.
The available kernels are
Gaussian:
Matérn 5/2:
Matérn 3/2:
Wendland:
The returned matrix has one row for each row of X and one column for
each row of Xprime. If Xprime = NULL, the function returns the
symmetric kernel matrix for X.
A numeric matrix of size , where each element
gives the kernel similarity between input and
.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. MIT Press. https://gaussianprocess.org/gpml/.
Wendland, H. (1995). Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Advances in Computational Mathematics, 4(1), 389–396. https://doi.org/10.1007/BF02123482.
set.seed(123) X <- matrix(runif(20), ncol = 2) # Compute kernel matrices for all implemented kernels kernels <- c("gaussian", "matern52", "matern32", "wendland") K_list <- lapply(kernels, function(k) { kernel_matrix(X, theta = 0.2, kernel = k) }) names(K_list) <- kernels # Inspect part of the Gaussian and Wendland kernel matrices K_list$gaussian[1:3, 1:3] K_list$wendland[1:3, 1:3] # Anisotropic lengthscales K_aniso <- kernel_matrix( X, theta = c(0.1, 0.3), kernel = "matern52", isotropic = FALSE ) # Cross-kernel matrix between two input sets Xprime <- matrix(runif(10), ncol = 2) K_cross <- kernel_matrix(X, Xprime, theta = 0.2, kernel = "gaussian") dim(K_cross)set.seed(123) X <- matrix(runif(20), ncol = 2) # Compute kernel matrices for all implemented kernels kernels <- c("gaussian", "matern52", "matern32", "wendland") K_list <- lapply(kernels, function(k) { kernel_matrix(X, theta = 0.2, kernel = k) }) names(K_list) <- kernels # Inspect part of the Gaussian and Wendland kernel matrices K_list$gaussian[1:3, 1:3] K_list$wendland[1:3, 1:3] # Anisotropic lengthscales K_aniso <- kernel_matrix( X, theta = c(0.1, 0.3), kernel = "matern52", isotropic = FALSE ) # Cross-kernel matrix between two input sets Xprime <- matrix(runif(10), ncol = 2) K_cross <- kernel_matrix(X, Xprime, theta = 0.2, kernel = "gaussian") dim(K_cross)
Computes the leave-one-out cross-validation loss used for
kernel hyperparameter tuning in BKP and DKP models. The input
gamma represents log10-transformed kernel lengthscale parameters,
with theta = 10^gamma. The function supports Brier score and
log-loss under noninformative, fixed, and adaptive prior specifications.
loss_fun( gamma, Xnorm, y = NULL, m = NULL, Y = NULL, model = c("BKP", "DKP"), prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = NULL, loss = c("brier", "log_loss"), kernel = c("gaussian", "matern52", "matern32", "wendland"), isotropic = TRUE, ess = c("none", "shepard") )loss_fun( gamma, Xnorm, y = NULL, m = NULL, Y = NULL, model = c("BKP", "DKP"), prior = c("noninformative", "fixed", "adaptive"), r0 = 2, p0 = NULL, loss = c("brier", "log_loss"), kernel = c("gaussian", "matern52", "matern32", "wendland"), isotropic = TRUE, ess = c("none", "shepard") )
gamma |
A numeric vector of log10-transformed kernel hyperparameters. |
Xnorm |
A numeric matrix of normalized input features ( |
y |
A numeric vector of observed success counts of length |
m |
A numeric vector of total binomial trial counts of length |
Y |
A numeric matrix of observed class counts ( |
model |
A character string specifying the model type: |
prior |
Type of prior specification. Options are
|
r0 |
Positive scalar prior precision used when |
p0 |
For BKP, a scalar in |
loss |
Leave-one-out loss used for kernel hyperparameter tuning. Options
are |
kernel |
Kernel function used for local weighting. Options are
|
isotropic |
Logical. If |
ess |
Effective-sample-size calibration for leave-one-out data
contributions. Use |
The kernel lengthscale parameter is represented on the log10 scale:
theta = 10^gamma. The function first computes the kernel matrix
using kernel_matrix and then sets its diagonal entries to
zero, so that each fitted value is computed with the corresponding
observation excluded. This implements leave-one-out cross-validation
without repeatedly refitting the model.
For model = "BKP", the loss compares the leave-one-out posterior
mean success probability with the empirical proportion y / m. For
model = "DKP", the loss compares the leave-one-out posterior mean
class-probability vector with the empirical class-proportion vector
Y / rowSums(Y).
If ess = "shepard", the same data-precision rescaling used in model
fitting is applied in leave-one-out form. This option requires unique
training input locations on the normalized input scale.
A single numeric value giving the average leave-one-out loss to be minimized. For BKP, the Brier score and log-loss are averaged over observations. For DKP, class-wise losses are summed within each observation and then averaged over observations.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP for fitting BKP models, fit_DKP
for fitting DKP models, get_prior for constructing prior
parameters, kernel_matrix for computing kernel matrices.
# -------------------------- BKP --------------------------- set.seed(123) n <- 10 Xnorm <- matrix(runif(2 * n), ncol = 2) m <- rep(10, n) pi_true <- runif(n) y <- rbinom(n, size = m, prob = pi_true) loss_bkp <- loss_fun(gamma = log10(0.2), Xnorm = Xnorm, y = y, m = m) # -------------------------- DKP --------------------------- n <- 10 q <- 3 Xnorm <- matrix(runif(2 * n), ncol = 2) m <- rep(10, n) p0 <- rep(1 / q, q) Y <- matrix(rmultinom(n, size = 10, prob = rep(1/q, q)), nrow = n, byrow = TRUE) loss_dkp <- loss_fun(gamma = log10(0.2), Xnorm = Xnorm, Y = Y, model = "DKP")# -------------------------- BKP --------------------------- set.seed(123) n <- 10 Xnorm <- matrix(runif(2 * n), ncol = 2) m <- rep(10, n) pi_true <- runif(n) y <- rbinom(n, size = m, prob = pi_true) loss_bkp <- loss_fun(gamma = log10(0.2), Xnorm = Xnorm, y = y, m = m) # -------------------------- DKP --------------------------- n <- 10 q <- 3 Xnorm <- matrix(runif(2 * n), ncol = 2) m <- rep(10, n) p0 <- rep(1 / q, q) Y <- matrix(rmultinom(n, size = 10, prob = rep(1/q, q)), nrow = n, byrow = TRUE) loss_dkp <- loss_fun(gamma = log10(0.2), Xnorm = Xnorm, Y = Y, model = "DKP")
Extract key fitted parameters from BKP, DKP,
TwinBKP, and TwinDKP model objects. For full BKP and DKP
models, this includes the fitted kernel lengthscale parameter(s) and
posterior shape or concentration parameters. For TwinBKP and TwinDKP models,
the returned list also includes global-local approximation parameters,
selected global subset indices, and fitting controls.
parameter(object, ...) ## S3 method for class 'BKP' parameter(object, ...) ## S3 method for class 'DKP' parameter(object, ...) ## S3 method for class 'TwinBKP' parameter(object, ...) ## S3 method for class 'TwinDKP' parameter(object, ...)parameter(object, ...) ## S3 method for class 'BKP' parameter(object, ...) ## S3 method for class 'DKP' parameter(object, ...) ## S3 method for class 'TwinBKP' parameter(object, ...) ## S3 method for class 'TwinDKP' parameter(object, ...)
object |
A fitted model object of class |
... |
Additional arguments. Currently unused. |
A named list containing fitted model parameters. Common entries are:
theta: Fitted kernel lengthscale parameter(s). For TwinBKP and
TwinDKP, this is an alias for the global lengthscale theta_g.
alpha_n: Posterior Beta shape parameters for BKP
and TwinBKP, or posterior Dirichlet concentration parameters for DKP and
TwinDKP.
beta_n: Posterior Beta shape parameters, returned
for BKP and TwinBKP objects.
For TwinBKP and TwinDKP objects, the returned list also includes
theta_g, theta_l, global_kernel, local_kernel,
global_indices, and control.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model fitting;
fitted for posterior mean extraction.
# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Extract posterior parameters parameter(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Extract posterior and kernel parameters parameter(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Extract model parameters parameter(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Extract posterior and kernel parameters parameter(model) ## End(Not run)# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Extract posterior parameters parameter(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Extract posterior and kernel parameters parameter(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Extract model parameters parameter(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Extract posterior and kernel parameters parameter(model) ## End(Not run)
Visualize fitted BKP, DKP, TwinBKP, and
TwinDKP model objects according to the selected input dimension(s).
For one-dimensional plots, the methods show posterior mean probability
curves, credible intervals, and observed proportions or class frequencies.
For two-dimensional plots, the methods generate posterior summary surfaces
over a prediction grid. For higher-dimensional inputs, users must specify
one or two dimensions through dims.
## S3 method for class 'BKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), ... ) ## S3 method for class 'DKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), ... ) ## S3 method for class 'TwinBKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), show_global = TRUE, ... ) ## S3 method for class 'TwinDKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), show_global = TRUE, ... )## S3 method for class 'BKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), ... ) ## S3 method for class 'DKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), ... ) ## S3 method for class 'TwinBKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), show_global = TRUE, ... ) ## S3 method for class 'TwinDKP' plot( x, only_mean = FALSE, n_grid = 80, dims = NULL, engine = c("base", "ggplot"), show_global = TRUE, ... )
x |
A fitted model object of class |
only_mean |
Logical. If |
n_grid |
Positive integer specifying the number of grid points per
dimension for constructing the prediction grid. Larger values produce
smoother and more detailed surfaces, but increase computation time. Default
is |
dims |
Integer vector indicating which input dimensions to plot. Must
have length 1 (for 1D) or 2 (for 2D). If |
engine |
Character string specifying the plotting backend.
Either |
... |
Additional arguments passed to internal plotting routines (currently unused). |
show_global |
Logical. For |
The plotting behavior depends on the dimensionality of the input covariates:
1D inputs:
For BKP (binary/binomial data), the function plots the posterior mean curve with a shaded 95% credible interval, overlaid with the observed proportions ().
For DKP (categorical/multinomial data), it plots one curve per class, each with a shaded credible interval and the observed class frequencies.
For classification tasks, an optional curve of the maximum posterior class probability can be displayed to visualize decision confidence.
2D inputs:
For BKP, DKP, TwinBKP, and TwinDKP models, the function generates contour plots over a 2D prediction grid.
Users can choose to plot only the predictive mean surface (only_mean = TRUE) or a set of four summary plots (only_mean = FALSE):
Predictive mean
97.5th percentile (upper bound of 95% credible interval)
Predictive variance
2.5th percentile (lower bound of 95% credible interval)
For DKP, these surfaces are generated separately for each class.
For classification tasks, predictive class probabilities can also be visualized as the maximum posterior probability surface.
Input dimensions greater than 2:
The function does not automatically support visualization and will terminate with an error.
Users must specify which dimensions to visualize via the dims argument (length 1 or 2).
For TwinBKP and TwinDKP objects, the plotting behavior is
analogous to their full-model counterparts, with optional highlighting of
the selected global subset.
Invisibly returns NULL. The function is called for its side
effect of producing plots.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model fitting;
predict.BKP, predict.DKP,
predict.TwinBKP, and predict.TwinDKP for
generating predictions from fitted models.
#-------------------------- BKP and TwinBKP --------------------------- #-------------------------- 1D Example --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow=1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Plot results plot(model1) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model1 <- fit_TwinBKP(X, y, m, Xbounds = Xbounds) # Plot results plot(model1) ## End(Not run) #-------------------------- 2D Example --------------------------- # Define 2D latent function and probability transformation true_pi_fun <- function(X) { if(is.null(nrow(X))) X <- matrix(X, nrow=1) m <- 8.6928 s <- 2.4269 x1 <- 4*X[,1]- 2 x2 <- 4*X[,2]- 2 a <- 1 + (x1 + x2 + 1)^2 * (19- 14*x1 + 3*x1^2- 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1- 3*x2)^2 * (18- 32*x1 + 12*x1^2 + 48*x2- 36*x1*x2 + 27*x2^2) f <- log(a*b) f <- (f- m)/s return(pnorm(f)) # Transform to probability } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.08) # Plot results plot(model2) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model2 <- fit_TwinBKP(X, y, m, Xbounds = Xbounds) # Plot results plot(model2) ## End(Not run) #-------------------------- DKP and TwinDKP --------------------------- #-------------------------- 1D Example --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Plot results plot(model1) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model1 <- fit_TwinDKP(X, Y, Xbounds = Xbounds) # Plot results plot(model1) ## End(Not run) #-------------------------- 2D Example --------------------------- # Define latent function and transform to 3-class probabilities true_pi_fun <- function(X) { if (is.null(nrow(X))) X <- matrix(X, nrow = 1) m <- 8.6928; s <- 2.4269 x1 <- 4 * X[,1] - 2 x2 <- 4 * X[,2] - 2 a <- 1 + (x1 + x2 + 1)^2 * (19 - 14*x1 + 3*x1^2 - 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1 - 3*x2)^2 * (18 - 32*x1 + 12*x1^2 + 48*x2 - 36*x1*x2 + 27*x2^2) f <- (log(a*b)- m)/s p1 <- pnorm(f) # Transform to probability p2 <- sin(pi * X[,1]) * sin(pi * X[,2]) return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.08) # Plot results plot(model2) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model2 <- fit_TwinDKP(X, Y, Xbounds = Xbounds) # Plot results plot(model2) ## End(Not run)#-------------------------- BKP and TwinBKP --------------------------- #-------------------------- 1D Example --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow=1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Plot results plot(model1) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model1 <- fit_TwinBKP(X, y, m, Xbounds = Xbounds) # Plot results plot(model1) ## End(Not run) #-------------------------- 2D Example --------------------------- # Define 2D latent function and probability transformation true_pi_fun <- function(X) { if(is.null(nrow(X))) X <- matrix(X, nrow=1) m <- 8.6928 s <- 2.4269 x1 <- 4*X[,1]- 2 x2 <- 4*X[,2]- 2 a <- 1 + (x1 + x2 + 1)^2 * (19- 14*x1 + 3*x1^2- 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1- 3*x2)^2 * (18- 32*x1 + 12*x1^2 + 48*x2- 36*x1*x2 + 27*x2^2) f <- log(a*b) f <- (f- m)/s return(pnorm(f)) # Transform to probability } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.08) # Plot results plot(model2) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model2 <- fit_TwinBKP(X, y, m, Xbounds = Xbounds) # Plot results plot(model2) ## End(Not run) #-------------------------- DKP and TwinDKP --------------------------- #-------------------------- 1D Example --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model1 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Plot results plot(model1) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model1 <- fit_TwinDKP(X, Y, Xbounds = Xbounds) # Plot results plot(model1) ## End(Not run) #-------------------------- 2D Example --------------------------- # Define latent function and transform to 3-class probabilities true_pi_fun <- function(X) { if (is.null(nrow(X))) X <- matrix(X, nrow = 1) m <- 8.6928; s <- 2.4269 x1 <- 4 * X[,1] - 2 x2 <- 4 * X[,2] - 2 a <- 1 + (x1 + x2 + 1)^2 * (19 - 14*x1 + 3*x1^2 - 14*x2 + 6*x1*x2 + 3*x2^2) b <- 30 + (2*x1 - 3*x2)^2 * (18 - 32*x1 + 12*x1^2 + 48*x2 - 36*x1*x2 + 27*x2^2) f <- (log(a*b)- m)/s p1 <- pnorm(f) # Transform to probability p2 <- sin(pi * X[,1]) * sin(pi * X[,2]) return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 100 Xbounds <- matrix(c(0, 0, 1, 1), nrow = 2) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model2 <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.08) # Plot results plot(model2) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model2 <- fit_TwinDKP(X, Y, Xbounds = Xbounds) # Plot results plot(model2) ## End(Not run)
Generate posterior summaries from fitted BKP,
DKP, TwinBKP, and TwinDKP model objects at training or
new input locations. For BKP and TwinBKP models, summaries can be returned
either for the latent success probability or for a future success count
under the Beta-Binomial posterior predictive distribution. For DKP and
TwinDKP models, summaries can be returned either for the latent class
probability vector or for future class counts using marginal Beta-Binomial
posterior predictive distributions.
## S3 method for class 'BKP' predict( object, Xnew = NULL, CI_level = 0.95, threshold = 0.5, type = c("probability", "count"), Mnew = NULL, ... ) ## S3 method for class 'DKP' predict( object, Xnew = NULL, CI_level = 0.95, type = c("probability", "count"), Mnew = NULL, ... ) ## S3 method for class 'TwinBKP' predict( object, Xnew = NULL, CI_level = 0.95, threshold = 0.5, type = c("probability", "count"), Mnew = NULL, ... ) ## S3 method for class 'TwinDKP' predict( object, Xnew = NULL, CI_level = 0.95, type = c("probability", "count"), Mnew = NULL, ... )## S3 method for class 'BKP' predict( object, Xnew = NULL, CI_level = 0.95, threshold = 0.5, type = c("probability", "count"), Mnew = NULL, ... ) ## S3 method for class 'DKP' predict( object, Xnew = NULL, CI_level = 0.95, type = c("probability", "count"), Mnew = NULL, ... ) ## S3 method for class 'TwinBKP' predict( object, Xnew = NULL, CI_level = 0.95, threshold = 0.5, type = c("probability", "count"), Mnew = NULL, ... ) ## S3 method for class 'TwinDKP' predict( object, Xnew = NULL, CI_level = 0.95, type = c("probability", "count"), Mnew = NULL, ... )
object |
A fitted model object of class |
Xnew |
A numeric vector or matrix of new input locations at which to
generate predictions. If |
CI_level |
Numeric between 0 and 1 specifying the credible level for
posterior intervals (default |
threshold |
Numeric between 0 and 1 specifying the classification
threshold for binary predictions based on posterior mean, used for
BKP and TwinBKP models. Default is |
type |
Character string specifying the prediction target. The default
|
Mnew |
Positive integer trial size used when |
... |
Additional arguments passed to generic |
A list containing posterior or posterior predictive summaries:
XThe original training input locations.
XnewThe input locations at which predictions are returned.
If Xnew = NULL, predictions are returned at the training input
locations.
alpha_nPosterior shape or concentration parameters:
For BKP and TwinBKP, a vector of posterior Beta
shape parameters.
For DKP and TwinDKP, a matrix of posterior
Dirichlet concentration parameters, with rows corresponding to
prediction locations and columns corresponding to classes.
beta_nPosterior Beta shape parameters, returned
for BKP and TwinBKP objects.
meanMean of the prediction target:
For BKP and TwinBKP with
type = "probability", the posterior mean of the latent success
probability.
For BKP and TwinBKP with type = "count",
the posterior predictive mean of a future success count.
For DKP and TwinDKP with
type = "probability", a matrix of posterior mean class
probabilities.
For DKP and TwinDKP with type = "count",
a matrix of marginal posterior predictive mean class counts.
varianceVariance of the prediction target:
For BKP and TwinBKP with
type = "probability", the posterior variance of the latent
success probability.
For BKP and TwinBKP with type = "count",
the posterior predictive variance of a future success count.
For DKP and TwinDKP with
type = "probability", a matrix of marginal posterior variances
for class probabilities.
For DKP and TwinDKP with type = "count",
a matrix of marginal posterior predictive variances for class counts.
lowerLower bound of the credible or predictive interval:
For BKP and TwinBKP with
type = "probability", the lower Beta posterior quantile for
the latent success probability.
For BKP and TwinBKP with type = "count",
the lower Beta-Binomial posterior predictive quantile for a future
success count.
For DKP and TwinDKP with
type = "probability", a matrix of lower marginal posterior
quantiles for class probabilities.
For DKP and TwinDKP with type = "count",
a matrix of lower marginal Beta-Binomial posterior predictive quantiles
for class counts.
upperUpper bound of the credible or predictive interval:
For BKP and TwinBKP with
type = "probability", the upper Beta posterior quantile for
the latent success probability.
For BKP and TwinBKP with type = "count",
the upper Beta-Binomial posterior predictive quantile for a future
success count.
For DKP and TwinDKP with
type = "probability", a matrix of upper marginal posterior
quantiles for class probabilities.
For DKP and TwinDKP with type = "count",
a matrix of upper marginal Beta-Binomial posterior predictive quantiles
for class counts.
classPredicted class label, returned only when applicable:
For BKP and TwinBKP, a binary class label based on
the posterior mean and threshold, returned only for binary
classification data with m = 1 and
type = "probability".
For DKP and TwinDKP, the class with the largest
posterior mean probability, returned only for one-hot classification
data and type = "probability".
thresholdThe classification threshold, returned for
BKP and TwinBKP classification predictions.
CI_levelThe specified credible interval level.
typeThe prediction target, either "probability" or
"count".
MnewThe trial sizes used for count prediction, returned only
when type = "count".
essEffective-sample-size calibration method inherited from
the fitted model, returned for BKP and DKP objects.
ess_infoESS diagnostics for BKP and DKP
predictions when available. TwinBKP and TwinDKP use
uncalibrated global-local posterior updates and do not return ESS
diagnostics.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model fitting;
plot.BKP, plot.DKP,
plot.TwinBKP, and plot.TwinDKP for visualization.
#-------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow=1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Prediction on training data predict(model) # Prediction on new data Xnew <- matrix(seq(-2, 2, length = 10), ncol = 1) #new data points predict(model, Xnew = Xnew) # Posterior predictive summaries for future success counts Mnew <- sample(100, nrow(Xnew), replace = TRUE) predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Prediction on training data predict(model) # Prediction on new data predict(model, Xnew = Xnew) # Posterior predictive summaries for future success counts predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## End(Not run) #-------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Prediction on training data predict(model) # Prediction on new data Xnew = matrix(seq(-2, 2, length = 10), ncol = 1) #new data points predict(model, Xnew) # Posterior predictive summaries for future success counts Mnew <- sample(100, nrow(Xnew), replace = TRUE) predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Prediction on training data predict(model) # Prediction on new data predict(model, Xnew) # Posterior predictive summaries for future success counts Mnew <- sample(100, nrow(Xnew), replace = TRUE) predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## End(Not run)#-------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow=1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Prediction on training data predict(model) # Prediction on new data Xnew <- matrix(seq(-2, 2, length = 10), ncol = 1) #new data points predict(model, Xnew = Xnew) # Posterior predictive summaries for future success counts Mnew <- sample(100, nrow(Xnew), replace = TRUE) predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Prediction on training data predict(model) # Prediction on new data predict(model, Xnew = Xnew) # Posterior predictive summaries for future success counts predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## End(Not run) #-------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Prediction on training data predict(model) # Prediction on new data Xnew = matrix(seq(-2, 2, length = 10), ncol = 1) #new data points predict(model, Xnew) # Posterior predictive summaries for future success counts Mnew <- sample(100, nrow(Xnew), replace = TRUE) predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Prediction on training data predict(model) # Prediction on new data predict(model, Xnew) # Posterior predictive summaries for future success counts Mnew <- sample(100, nrow(Xnew), replace = TRUE) predict(model, Xnew = Xnew, type = "count", Mnew = Mnew) ## End(Not run)
Provides formatted console output for fitted BKP,
DKP, TwinBKP, and TwinDKP model objects, their
summaries, predictions, and simulations. The following specialized methods
are supported:
print.BKP, print.DKP, print.TwinBKP, and
print.TwinDKP display fitted model objects.
print.summary_BKP, print.summary_DKP,
print.summary_TwinBKP, and print.summary_TwinDKP
display model summaries.
print.predict_BKP, print.predict_DKP,
print.predict_TwinBKP, and print.predict_TwinDKP
display posterior predictive results.
print.simulate_BKP, print.simulate_DKP,
print.simulate_TwinBKP, and print.simulate_TwinDKP
display posterior simulations.
## S3 method for class 'BKP' print(x, ...) ## S3 method for class 'summary_BKP' print(x, ...) ## S3 method for class 'predict_BKP' print(x, ...) ## S3 method for class 'simulate_BKP' print(x, ...) ## S3 method for class 'DKP' print(x, ...) ## S3 method for class 'summary_DKP' print(x, ...) ## S3 method for class 'predict_DKP' print(x, ...) ## S3 method for class 'simulate_DKP' print(x, ...) ## S3 method for class 'TwinBKP' print(x, ...) ## S3 method for class 'predict_TwinBKP' print(x, ...) ## S3 method for class 'summary_TwinBKP' print(x, ...) ## S3 method for class 'simulate_TwinBKP' print(x, ...) ## S3 method for class 'TwinDKP' print(x, ...) ## S3 method for class 'predict_TwinDKP' print(x, ...) ## S3 method for class 'summary_TwinDKP' print(x, ...) ## S3 method for class 'simulate_TwinDKP' print(x, ...)## S3 method for class 'BKP' print(x, ...) ## S3 method for class 'summary_BKP' print(x, ...) ## S3 method for class 'predict_BKP' print(x, ...) ## S3 method for class 'simulate_BKP' print(x, ...) ## S3 method for class 'DKP' print(x, ...) ## S3 method for class 'summary_DKP' print(x, ...) ## S3 method for class 'predict_DKP' print(x, ...) ## S3 method for class 'simulate_DKP' print(x, ...) ## S3 method for class 'TwinBKP' print(x, ...) ## S3 method for class 'predict_TwinBKP' print(x, ...) ## S3 method for class 'summary_TwinBKP' print(x, ...) ## S3 method for class 'simulate_TwinBKP' print(x, ...) ## S3 method for class 'TwinDKP' print(x, ...) ## S3 method for class 'predict_TwinDKP' print(x, ...) ## S3 method for class 'summary_TwinDKP' print(x, ...) ## S3 method for class 'simulate_TwinDKP' print(x, ...)
x |
An object of class |
... |
Additional arguments passed to the generic |
Invisibly returns the input object. Called for the side effect of printing human-readable summaries to the console.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model fitting;
summary.BKP, summary.DKP,
summary.TwinBKP, and summary.TwinDKP for model
summaries; predict.BKP, predict.DKP,
predict.TwinBKP, and predict.TwinDKP for
posterior prediction; simulate.BKP,
simulate.DKP, simulate.TwinBKP, and
simulate.TwinDKP for posterior simulations.
#-------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## End(Not run) #-------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## End(Not run)#-------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## End(Not run) #-------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) print(model) # fitted object print(summary(model)) # summary print(predict(model)) # predictions print(simulate(model, nsim = 3)) # posterior simulations ## End(Not run)
Compute posterior quantiles from fitted BKP,
DKP, TwinBKP, and TwinDKP model objects. For
BKP and TwinBKP objects, this returns posterior quantiles of
the latent success probability. For DKP and TwinDKP objects,
this returns marginal posterior quantiles for each class probability.
## S3 method for class 'BKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...) ## S3 method for class 'DKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...) ## S3 method for class 'TwinBKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...) ## S3 method for class 'TwinDKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...)## S3 method for class 'BKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...) ## S3 method for class 'DKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...) ## S3 method for class 'TwinBKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...) ## S3 method for class 'TwinDKP' quantile(x, probs = c(0.025, 0.5, 0.975), ...)
x |
A fitted model object of class |
probs |
Numeric vector of probabilities specifying which posterior
quantiles to return. Defaults to |
... |
Additional arguments (currently unused). |
For BKP and TwinBKP models, posterior quantiles are
computed from the corresponding Beta posterior for the latent success
probability. For DKP and TwinDKP models, marginal posterior
quantiles for each class probability are computed from the Beta marginal
distributions of the posterior Dirichlet distribution. These are marginal
class-wise quantiles, not joint Dirichlet credible regions.
For BKP and TwinBKP: a numeric vector if
length(probs) = 1, or a numeric matrix if
length(probs) > 1, of posterior quantiles. Rows correspond to
observations, and columns correspond to the requested probabilities.
For DKP and TwinDKP: a numeric matrix if
length(probs) = 1, or a three-dimensional array if
length(probs) > 1, of marginal posterior quantiles for class
probabilities. Dimensions correspond to observations classes
probabilities.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model fitting;
predict.BKP, predict.DKP,
predict.TwinBKP, and predict.TwinDKP for
posterior prediction.
# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Extract posterior quantiles quantile(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Extract posterior quantiles quantile(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Extract posterior quantiles quantile(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Extract posterior quantiles quantile(model) ## End(Not run)# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Extract posterior quantiles quantile(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Extract posterior quantiles quantile(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Extract posterior quantiles quantile(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Extract posterior quantiles quantile(model) ## End(Not run)
Generate random posterior draws from fitted BKP,
DKP, TwinBKP, and TwinDKP model objects at training
or new input locations.
For BKP and TwinBKP models, posterior samples are generated
from Beta distributions for latent success probabilities. Optionally,
binary class labels can be derived by applying a user-specified
classification threshold.
For DKP and TwinDKP models, posterior samples are generated
from Dirichlet distributions for latent class-probability vectors. If the
training responses are single-label, that is, one-hot encoded, class labels
are additionally assigned using the maximum posterior simulated probability.
## S3 method for class 'BKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, threshold = NULL, ...) ## S3 method for class 'DKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, ...) ## S3 method for class 'TwinBKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, threshold = NULL, ...) ## S3 method for class 'TwinDKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, ...)## S3 method for class 'BKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, threshold = NULL, ...) ## S3 method for class 'DKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, ...) ## S3 method for class 'TwinBKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, threshold = NULL, ...) ## S3 method for class 'TwinDKP' simulate(object, nsim = 1, seed = NULL, Xnew = NULL, ...)
object |
A fitted model object of class |
nsim |
Number of posterior samples to generate. The default is
|
seed |
Optional integer seed for reproducibility. |
Xnew |
A numeric vector or matrix of new input locations at which
posterior simulations are generated. If |
threshold |
Classification threshold for binary decisions, used for
|
... |
Additional arguments passed to the corresponding |
A list containing posterior simulations:
samplesFor BKP and TwinBKP, a numeric matrix with one row per
simulation location and one column per posterior draw. Each entry is a
simulated latent success probability.
For DKP and TwinDKP, a numeric array with dimensions
simulation locations classes posterior draws.
Each slice contains simulated latent class probabilities from the
posterior Dirichlet distribution.
meanPosterior mean at the simulation locations. For BKP and
TwinBKP, this is a numeric vector of success-probability means.
For DKP and TwinDKP, this is a matrix of class-probability
means.
classSimulated class labels when applicable. For BKP and
TwinBKP, this is returned when threshold is supplied. For
DKP and TwinDKP, this is returned when the training
response is one-hot encoded.
XThe original training input locations.
XnewThe new input locations used for simulation. If
NULL, simulations are returned at the training input locations.
thresholdThe binary classification threshold, returned for
BKP and TwinBKP simulations when supplied.
essEffective-sample-size calibration method inherited from
the fitted model, returned for BKP and DKP objects.
ess_infoESS diagnostics for BKP and DKP
simulations when available. TwinBKP and TwinDKP do not
support ESS calibration and do not return this component.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model fitting;
predict.BKP, predict.DKP,
predict.TwinBKP, and predict.TwinDKP for
posterior prediction.
# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Simulate 5 posterior draws of success probabilities Xnew <- matrix(seq(-2, 2, length.out = 5), ncol = 1) simulate(model, Xnew = Xnew, nsim = 5) # Simulate binary classifications (threshold = 0.5) simulate(model, Xnew = Xnew, nsim = 5, threshold = 0.5) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Simulate 5 posterior draws of success probabilities simulate(model, Xnew = Xnew, nsim = 5) # Simulate binary classifications (threshold = 0.5) simulate(model, Xnew = Xnew, nsim = 5, threshold = 0.5) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Simulate 5 draws from posterior Dirichlet distributions at new point Xnew <- matrix(seq(-2, 2, length.out = 5), ncol = 1) simulate(model, Xnew = Xnew, nsim = 5) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Simulate 5 draws from posterior Dirichlet distributions at new point simulate(model, Xnew = Xnew, nsim = 5) ## End(Not run)# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2,2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) # Simulate 5 posterior draws of success probabilities Xnew <- matrix(seq(-2, 2, length.out = 5), ncol = 1) simulate(model, Xnew = Xnew, nsim = 5) # Simulate binary classifications (threshold = 0.5) simulate(model, Xnew = Xnew, nsim = 5, threshold = 0.5) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) # Simulate 5 posterior draws of success probabilities simulate(model, Xnew = Xnew, nsim = 5) # Simulate binary classifications (threshold = 0.5) simulate(model, Xnew = Xnew, nsim = 5, threshold = 0.5) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) # Simulate 5 draws from posterior Dirichlet distributions at new point Xnew <- matrix(seq(-2, 2, length.out = 5), ncol = 1) simulate(model, Xnew = Xnew, nsim = 5) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) # Simulate 5 draws from posterior Dirichlet distributions at new point simulate(model, Xnew = Xnew, nsim = 5) ## End(Not run)
Provides a structured summary of fitted BKP,
DKP, TwinBKP, and TwinDKP model objects. The summary
reports model dimensions, kernel settings, optimized hyperparameters,
prior specification, loss information, approximation settings when
applicable, and posterior summaries at the training input locations.
## S3 method for class 'BKP' summary(object, ...) ## S3 method for class 'DKP' summary(object, ...) ## S3 method for class 'TwinBKP' summary(object, ...) ## S3 method for class 'TwinDKP' summary(object, ...)## S3 method for class 'BKP' summary(object, ...) ## S3 method for class 'DKP' summary(object, ...) ## S3 method for class 'TwinBKP' summary(object, ...) ## S3 method for class 'TwinDKP' summary(object, ...)
object |
A fitted model object of class |
... |
Additional arguments passed to the generic |
A list containing model configuration, prior information, kernel hyperparameters, loss information, and posterior summaries at the training inputs.
For BKP and TwinBKP, posterior summaries are returned as
vectors of posterior means and variances for the latent success
probabilities.
For DKP and TwinDKP, posterior summaries are returned as
matrices of marginal posterior means and variances for class probabilities.
For TwinBKP and TwinDKP, the list additionally includes
global-local approximation fields such as global_kernel,
local_kernel, theta_g, theta_l,
global_size, global_target, local_size, and
twins.
Zhao J, Qing K, Xu J (2025). BKP: An R Package for Beta Kernel Process Modeling. arXiv. https://doi.org/10.48550/arXiv.2508.10447.
fit_BKP, fit_DKP,
fit_TwinBKP, and fit_TwinDKP for model fitting;
predict.BKP, predict.DKP,
predict.TwinBKP, and predict.TwinDKP for
posterior prediction.
# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) summary(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) summary(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) summary(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) summary(model) ## End(Not run)# -------------------------- BKP and TwinBKP --------------------------- set.seed(123) # Define true success probability function true_pi_fun <- function(x) { (1 + exp(-x^2) * cos(10 * (1 - exp(-x)) / (1 + exp(-x)))) / 2 } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit BKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_BKP(X, y, m, Xbounds = Xbounds, theta = 0.04) summary(model) ## Not run: # Larger TwinBKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(100, n, replace = TRUE) y <- rbinom(n, size = m, prob = true_pi) # Fit TwinBKP model using the default global lengthscale tuning model <- fit_TwinBKP( X, y, m, Xbounds = Xbounds ) summary(model) ## End(Not run) # -------------------------- DKP and TwinDKP --------------------------- # Define true class probability function (3-class) true_pi_fun <- function(X) { p1 <- 1/(1+exp(-3*X)) p2 <- (1 + exp(-X^2) * cos(10 * (1 - exp(-X)) / (1 + exp(-X)))) / 2 return(matrix(c(p1/2, p2/2, 1 - (p1+p2)/2), nrow = length(p1))) } n <- 30 Xbounds <- matrix(c(-2, 2), nrow = 1) X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit DKP model # A fixed theta is used here only to keep the example fast and reproducible. # In practice, omit theta to select it by leave-one-out cross-validation. model <- fit_DKP(X, Y, Xbounds = Xbounds, theta = 0.04) summary(model) ## Not run: # Larger TwinDKP example n <- 1000 X <- tgp::lhs(n = n, rect = Xbounds) true_pi <- true_pi_fun(X) m <- sample(150, n, replace = TRUE) # Generate multinomial responses Y <- t(sapply(1:n, function(i) rmultinom(1, size = m[i], prob = true_pi[i, ]))) # Fit TwinDKP model using the default global lengthscale tuning model <- fit_TwinDKP( X, Y, Xbounds = Xbounds ) summary(model) ## End(Not run)