| Title: | Implements Pseudo-R2D2 Prior for Ordinal Regression |
|---|---|
| Description: | Implements the pseudo-R2D2 prior for ordinal regression from the paper "Pseudo-R2D2 prior for high-dimensional ordinal regression" by Yanchenko (2025) <doi:10.1007/s11222-025-10667-x>. In particular, it provides code to evaluate the probability distribution function for the cut-points, compute the log-likelihood, calculate the hyper-parameters for the global variance parameter, find the distribution of McFadden's coefficient-of-determination, and fit the model in 'rstan'. Please cite the paper if you use these codes. |
| Authors: | Eric Yanchenko [aut, cre] |
| Maintainer: | Eric Yanchenko <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.2 |
| Built: | 2026-05-17 08:13:50 UTC |
| Source: | https://github.com/cran/R2D2ordinal |
This function computes the value of the probability density function for the cut-points. The distribution is induced by a Dirichlet distribution on the prior probabilities of the response.
dcut(tau, W, alpha, log = FALSE)dcut(tau, W, alpha, log = FALSE)
tau |
cut-points |
W |
global variance |
alpha |
concentration parameters for prior probabilities of Y |
log |
logical; if TRUE, returns log pdf |
value of pdf at tau
tau = c(-Inf, -1, 1, Inf) # set cut points W = 1 # set value of global variance alpha = c(1,1,1) #concentration parameters dcut(tau, W, alpha, log=FALSE)tau = c(-Inf, -1, 1, Inf) # set cut points W = 1 # set value of global variance alpha = c(1,1,1) #concentration parameters dcut(tau, W, alpha, log=FALSE)
This function finds the optimal GIG parameters for the prior on W which induces a beta prior distribution on McFadden's R2.
find_param( a, b, n, K, alpha = rep(2, K), nsims = 1000, nreps = 5, no_cores = 10 )find_param( a, b, n, K, alpha = rep(2, K), nsims = 1000, nreps = 5, no_cores = 10 )
a |
hyper-parameter of prior for R2 ~ Beta(a,b) |
b |
hyper-parameter of prior for R2 ~ Beta(a,b) |
n |
number of observations |
K |
number of response categories |
alpha |
prior hyper-parameters for prior Dirichlet distribution on response probabilities |
nsims |
number of times to simulate data |
nreps |
number of times to run the algorithm (default = 5) |
no_cores |
number of cores to parallelize data-generation process |
Optimal GIG parameters
a = 1 b = 5 n = 100 K = 3 find_param(a, b, n, K, no_cores=1)a = 1 b = 5 n = 100 K = 3 find_param(a, b, n, K, no_cores=1)
This function evaluates the log-likelihood of the response for a given value of the parameters.
llike(Y, W, tau)llike(Y, W, tau)
Y |
ordinal response |
W |
global variance |
tau |
cut-points |
value of log-likelihood at Y, W and tau
set.seed(1234) K = 3 # number of response categories Y = sample(1:K, 10, replace=TRUE) # generate responses W = 1 tau = c(-Inf, -1, 1, Inf) # set parameter values llike(Y, W, tau)set.seed(1234) K = 3 # number of response categories Y = sample(1:K, 10, replace=TRUE) # generate responses W = 1 tau = c(-Inf, -1, 1, Inf) # set parameter values llike(Y, W, tau)
This function carries out a Bayesian ordinal regression model in Stan using the proposed psuedo-R2D2 prior
ord_r2d2( x, y, K, a = 1, b = 10, hyper = NULL, alpha = rep(2, K), xi0 = 1, nsims = 1000, nreps = 5, no_cores = 10, progress = FALSE, ... )ord_r2d2( x, y, K, a = 1, b = 10, hyper = NULL, alpha = rep(2, K), xi0 = 1, nsims = 1000, nreps = 5, no_cores = 10, progress = FALSE, ... )
x |
covariate matrix |
y |
response variables |
K |
number of response categories |
a |
hyper-parameter of prior for R2 ~ Beta(a,b) |
b |
hyper-parameter of prior for R2 ~ Beta(a,b) |
hyper |
hyper-parameters for W prior |
alpha |
prior hyper-parameters for prior Dirichlet distribution on response probabilities |
xi0 |
prior hyper-parameters for prior Dirichlet distribution of variance allocation parameters phi |
nsims |
number of times to simulate data |
nreps |
number of times to run the algorithm (default = 5) |
no_cores |
number of cores to parallelize data-generation process |
progress |
logical. if TRUE, shows the progress bars from the posterior sampling. |
... |
optional hyper-parameters for Stan fitting |
Stan model fit
# X are covariates, Y are responses, K is number of response categories # This example will yield low R2 values as the response are independent of the covariates. set.seed(1234) n = 100 p = 5 X = matrix(rnorm(n*p), nrow = n, ncol=p) K = 3 Y = sample(1:K, 100, replace=TRUE) a = 1 b = 5 # Pre-computed hyperparameters fit <- ord_r2d2(X, Y, K, hyper=c(0.002, 0.989, 1.013), no_cores=1) out <- rstan::extract(fit) # Plot histogram of posterior W hist(out$W, xlab="W")# X are covariates, Y are responses, K is number of response categories # This example will yield low R2 values as the response are independent of the covariates. set.seed(1234) n = 100 p = 5 X = matrix(rnorm(n*p), nrow = n, ncol=p) K = 3 Y = sample(1:K, 100, replace=TRUE) a = 1 b = 5 # Pre-computed hyperparameters fit <- ord_r2d2(X, Y, K, hyper=c(0.002, 0.989, 1.013), no_cores=1) out <- rstan::extract(fit) # Plot histogram of posterior W hist(out$W, xlab="W")
This function finds the posterior distribution of McFadden's R2 given the posterior samples from a Stan model fit
r2_mc(Y, out)r2_mc(Y, out)
Y |
ordinal response |
out |
posterior samples from R2D2 model fit in Stan |
Posterior samples from McFadden's R2
# Obtain output from ord_r2d2() model fit set.seed(1234) # X are covariates, Y are responses, K is number of response categories # This example will yield low R2 values as the response are independent of the covariates. n = 100 p = 5 X = matrix(rnorm(n*p), nrow = n, ncol=p) K = 3 Y = sample(1:K, 100, replace=TRUE) a = 1 b = 5 # Pre-computed hyperparameters fit <- ord_r2d2(X, Y, K, hyper=c(0.002, 0.989, 1.013), no_cores=1) out <- rstan::extract(fit) # Plot histogram of posterior R2 hist(r2_mc(Y, out), xlab="R2")# Obtain output from ord_r2d2() model fit set.seed(1234) # X are covariates, Y are responses, K is number of response categories # This example will yield low R2 values as the response are independent of the covariates. n = 100 p = 5 X = matrix(rnorm(n*p), nrow = n, ncol=p) K = 3 Y = sample(1:K, 100, replace=TRUE) a = 1 b = 5 # Pre-computed hyperparameters fit <- ord_r2d2(X, Y, K, hyper=c(0.002, 0.989, 1.013), no_cores=1) out <- rstan::extract(fit) # Plot histogram of posterior R2 hist(r2_mc(Y, out), xlab="R2")