Title: | ACSS, Corresponding ACSS, and GLP Algorithm |
---|---|
Description: | Allows the user to run the Adaptive Correlated Spike and Slab (ACSS) algorithm, the corresponding INdependent Spike and Slab (INSS) algorithm, and the Giannone, Lenza and Primiceri (GLP) algorithm with adaptive burn-in. All three algorithms are designed to fit high-dimensional data sets with either a sparse structure, or a dense structure with small contributions from all predictors. The state-of-the-art GLP algorithm is described in Giannone, D., Lenza, M., & Primiceri, G. E. (2021, ISBN:978-92-899-4542-4) "Economic predictions with big data: The illusion of sparsity". The two new algorithms, ACSS and INSS, and a discussion of their performance can be found in Yang, Z., Khare, K., & Michailidis, G. (2024, preprint) "Bayesian methodology for adaptive sparsity and shrinkage in regression". |
Authors: | Ziqian Yang [cre, aut], Kshitij Khare [aut], George Michailidis [aut] |
Maintainer: | Ziqian Yang <[email protected]> |
License: | GPL-3 |
Version: | 0.0.1.4 |
Built: | 2024-11-02 06:31:59 UTC |
Source: | CRAN |
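All three fitting functions share the same interface: they take a response vector Y and a design matrix X and return the majority-voting estimate of beta together with the Gibbs samples. A minimal workflow sketch, assuming the package is loaded; hyperparameter values are the documented defaults, and nmc is kept small here purely for illustration:

set.seed(1);
n = 50; p = 100;
X = matrix(rnorm(n * p), n, p);
beta = rep(0.1, p); ## dense signal with small coefficients
Y = c(X %*% beta + rnorm(n));
fit_acss = ACSS_gs(Y, X, a = 1, b = 1, c = 1, s = sqrt(p), nmc = 100);
fit_inss = INSS_gs(Y, X, a = 1, b = 1, c = 1, s = sqrt(p), nmc = 100);
fit_glp = GLP_gs(Y, X, a = 1, b = 1, A = 1, B = 1, nmc = 100);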
Adaptive Correlated Spike and Slab (ACSS) algorithm with/without adaptive burn-in Gibbs sampler. See the paper by Yang, Z., Khare, K., & Michailidis, G. (2024) for details.
ACSS_gs( Y, X, a = 1, b = 1, c = 1, s, Max_burnin = 10, nmc = 5000, adaptive_burn_in = TRUE )
Y | A vector. |
X | A matrix. |
a | Shape parameter for the marginal of q; default = 1. |
b | Shape parameter for the marginal of q; default = 1. |
c | Shape parameter for the marginal of lambda^2; a larger c introduces more shrinkage and stronger correlation; default = 1. |
s | Scale (inverse) parameter for the marginal of lambda^2; a larger s introduces more shrinkage; default = sqrt(p). |
Max_burnin | Maximum burn-in (in blocks of 100 steps) for the adaptive burn-in Gibbs sampler. The minimum value is 10, corresponding to 1000 hard burn-in steps. Default = 10. |
nmc | Number of MCMC samples. Default = 5000. |
adaptive_burn_in | Logical. If TRUE, use the adaptive burn-in Gibbs sampler; if FALSE, use a fixed burn-in with burn-in = Max_burnin. Default = TRUE. |
A list with betahat: the predicted beta from majority voting, and Gibbs_res: the nmc samples (5000 by default) of beta, q and lambda^2 from the Gibbs sampler.
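The Gibbs samples in Gibbs_res can be summarized directly. A minimal sketch, assuming mod is a fitted object as in the example below, with the coefficient samples stored in Gibbs_res$betamat (one row per coefficient, as accessed in that example):

post_mean = rowMeans(mod$Gibbs_res$betamat); ## posterior mean of each beta_j
post_ci = apply(mod$Gibbs_res$betamat, 1, quantile, probs = c(0.025, 0.975)); ## 95% credible intervals
mean(mod$Gibbs_res$q); ## posterior mean of the model density q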
## A toy example is given below to save time. The full example can be run to get better results
## by using X instead of X[, 1:30] and let nmc=5000 (default).
n = 30; p = 2 * n;
beta1 = rep(0.1, p);
beta2 = c(rep(0.2, p / 2), rep(0, p / 2));
beta3 = c(rep(0.15, 3 * p / 4), rep(0, ceiling(p / 4)));
beta4 = c(rep(1, p / 4), rep(0, ceiling(3 * p / 4)));
beta5 = c(rep(3, ceiling(p / 20)), rep(0, 19 * p / 20));
betas = list(beta1, beta3, beta2, beta4, beta5);
set.seed(123);
X = matrix(rnorm(n * p), n, p);
Y = c(X %*% betas[[1]] + rnorm(n));
## A toy example with p=30, total Gibbs steps=1100, takes ~0.6s
system.time({mod = ACSS_gs(Y, X[, 1:30], 1, 1, 1, sqrt(p), nmc = 100);})
mod$beta; ## estimated beta after the Majority voting
hist(mod$Gibbs_res$betamat[1, ]); ## histogram of the beta_1
hist(mod$Gibbs_res$q); ## histogram of the q
hist(log(mod$Gibbs_res$lambdasq)); ## histogram of the log(lambda^2)
plot(mod$Gibbs_res$q); ## trace plot of the q
## joint posterior of model density and shrinkage
plot(log(mod$Gibbs_res$q / (1 - mod$Gibbs_res$q)), -log(mod$Gibbs_res$lambdasq),
     xlab = "logit(q)", ylab = "-log(lambda^2)",
     main = "Joint Posterior of Model Density and Shrinkage");
A list containing the five data sets used in the paper by Giannone, Lenza, and Primiceri (2021): Macro1, Macro2, Micro1, Micro2, and Finance1.
Econ_data
## 'Econ_data' A list of five lists, one per data set
A list with a vector and a data frame, 'Y' and 'X'. 'Y' is the response vector, with 659 observations. 'X' is a data frame containing all predictors, with n=659, p=130. It has the structure of time-series data.
A list with a vector and a data frame, 'Y' and 'X'. 'Y' is the response vector, with 90 observations. 'X' is a data frame containing all predictors, with n=90, p=69. It has the structure of cross-sectional data.
A list with a vector and a data frame, 'Y' and 'X'. 'Y' is the response vector, with 576 observations. 'X' is a data frame containing all predictors, with n=576, p=285. It has the structure of panel data with 48 units at 12 time points.
A list with a vector and a data frame, 'Y' and 'X'. 'Y' is the response vector, with 312 observations. 'X' is a data frame containing all predictors, with n=312, p=138. It has the structure of panel data with 12 units at 26 time points.
A list with a vector and a data frame, 'Y' and 'X'. 'Y' is the response vector, with 68 observations. 'X' is a data frame containing all predictors, with n=68, p=16. It has the structure of time-series data.
<https://www.econometricsociety.org/publications/econometrica/2021/09/01/economic-predictions-big-data-illusion-sparsity/supp/17842_Data_and_Programs.zip>
<https://research.stlouisfed.org/econ/mccracken/fred-databases/>
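A minimal sketch of fitting one of the bundled data sets, assuming each element of Econ_data is a list with components 'Y' and 'X' named after the data sets listed above (Finance1 is assumed here to be the n=68, p=16 time-series set):

data(Econ_data);
Y = Econ_data$Finance1$Y;
X = as.matrix(Econ_data$Finance1$X); ## predictors are stored as a data frame
mod = ACSS_gs(Y, X, a = 1, b = 1, c = 1, s = sqrt(ncol(X)), nmc = 100);
mean(mod$Gibbs_res$q); ## posterior mean of the model density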
Giannone, Lenza and Primiceri (GLP) algorithm with/without adaptive burn-in Gibbs sampler. See the papers by Giannone, D., Lenza, M., & Primiceri, G. E. (2021) and Yang, Z., Khare, K., & Michailidis, G. (2024) for details.
Most of the code is from https://github.com/bfava/IllusionOfIllusion, with our modifications to add the adaptive burn-in Gibbs sampler and some bug fixes.
GLP_gs( Y, X, a = 1, b = 1, A = 1, B = 1, Max_burnin = 10, nmc = 5000, adaptive_burn_in = TRUE )
Y | A vector. |
X | A matrix. |
a | Shape parameter for the marginal of q; default = 1. |
b | Shape parameter for the marginal of q; default = 1. |
A | Shape parameter for the marginal of R^2; default = 1. |
B | Shape parameter for the marginal of R^2; default = 1. |
Max_burnin | Maximum burn-in (in blocks of 100 steps) for the adaptive burn-in Gibbs sampler. The minimum value is 10, corresponding to 1000 hard burn-in steps. Default = 10. |
nmc | Number of MCMC samples. Default = 5000. |
adaptive_burn_in | Logical. If TRUE, use the adaptive burn-in Gibbs sampler; if FALSE, use a fixed burn-in with burn-in = Max_burnin. Default = TRUE. |
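The adaptive burn-in can be switched off when a fixed burn-in length is preferred. A minimal sketch, assuming (as the argument descriptions above suggest) that Max_burnin is counted in blocks of 100 steps, and with Y and X as in the example below:

## fixed burn-in of 20 * 100 = 2000 steps instead of the adaptive rule
mod_fixed = GLP_gs(Y, X, a = 1, b = 1, A = 1, B = 1, Max_burnin = 20, nmc = 1000, adaptive_burn_in = FALSE);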
A list with betahat: the predicted beta from majority voting, and Gibbs_res: the nmc samples (5000 by default) of beta, q and lambda^2 from the Gibbs sampler.
## A toy example is given below to save your time, which will still take ~10s.
## The full example can be run to get BETTER results, which will take more than 80s,
## by using X instead of X[, 1:30] and let nmc=5000 (default).
n = 30; p = 2 * n;
beta1 = rep(0.1, p);
beta2 = c(rep(0.2, p / 2), rep(0, p / 2));
beta3 = c(rep(0.15, 3 * p / 4), rep(0, ceiling(p / 4)));
beta4 = c(rep(1, p / 4), rep(0, ceiling(3 * p / 4)));
beta5 = c(rep(3, ceiling(p / 20)), rep(0, 19 * p / 20));
betas = list(beta1, beta3, beta2, beta4, beta5);
set.seed(123);
X = matrix(rnorm(n * p), n, p);
Y = c(X %*% betas[[1]] + rnorm(n));
## A toy example with p=30, total Gibbs steps=1100
system.time({mod = GLP_gs(Y, X[, 1:30], 1, 1, 1, 1, nmc = 100);})
mod$beta; ## estimated beta after the Majority voting
hist(mod$Gibbs_res$betamat[1, ]); ## histogram of the beta_1
hist(mod$Gibbs_res$q); ## histogram of the q
hist(log(mod$Gibbs_res$lambdasq)); ## histogram of the log(lambda^2)
plot(mod$Gibbs_res$q); ## trace plot of the q
## joint posterior of model density and shrinkage
plot(log(mod$Gibbs_res$q / (1 - mod$Gibbs_res$q)), -log(mod$Gibbs_res$lambdasq),
     xlab = "logit(q)", ylab = "-log(lambda^2)",
     main = "Joint Posterior of Model Density and Shrinkage");
INdependent Spike and Slab (INSS) algorithm with/without adaptive burn-in Gibbs sampler. See the paper by Yang, Z., Khare, K., & Michailidis, G. (2024) for details.
INSS_gs( Y, X, a = 1, b = 1, c = 1, s, Max_burnin = 10, nmc = 5000, adaptive_burn_in = TRUE )
Y | A vector. |
X | A matrix. |
a | Shape parameter for the marginal of q; default = 1. |
b | Shape parameter for the marginal of q; default = 1. |
c | Shape parameter for the marginal of lambda^2; a larger c introduces more shrinkage and stronger correlation; default = 1. |
s | Scale (inverse) parameter for the marginal of lambda^2; a larger s introduces more shrinkage; default = sqrt(p). |
Max_burnin | Maximum burn-in (in blocks of 100 steps) for the adaptive burn-in Gibbs sampler. The minimum value is 10, corresponding to 1000 hard burn-in steps. Default = 10. |
nmc | Number of MCMC samples. Default = 5000. |
adaptive_burn_in | Logical. If TRUE, use the adaptive burn-in Gibbs sampler; if FALSE, use a fixed burn-in with burn-in = Max_burnin. Default = TRUE. |
A list with betahat: the predicted beta from majority voting, and Gibbs_res: the nmc samples (5000 by default) of beta, q and lambda^2 from the Gibbs sampler.
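Since ACSS_gs and INSS_gs share the same interface and output, their majority-voting estimates can be compared directly on the same data. A minimal sketch, using Y, X and p as in the example below and the beta component accessed there (whether the majority-voting estimate contains exact zeros is an assumption here):

mod_acss = ACSS_gs(Y, X[, 1:30], 1, 1, 1, sqrt(p), nmc = 100);
mod_inss = INSS_gs(Y, X[, 1:30], 1, 1, 1, sqrt(p), nmc = 100);
table(acss = mod_acss$beta != 0, inss = mod_inss$beta != 0); ## agreement of selected predictors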
## A toy example is given below to save time. The full example can be run to get better results
## by using X instead of X[, 1:30] and let nmc=5000 (default).
n = 30; p = 2 * n;
beta1 = rep(0.1, p);
beta2 = c(rep(0.2, p / 2), rep(0, p / 2));
beta3 = c(rep(0.15, 3 * p / 4), rep(0, ceiling(p / 4)));
beta4 = c(rep(1, p / 4), rep(0, ceiling(3 * p / 4)));
beta5 = c(rep(3, ceiling(p / 20)), rep(0, 19 * p / 20));
betas = list(beta1, beta3, beta2, beta4, beta5);
set.seed(123);
X = matrix(rnorm(n * p), n, p);
Y = c(X %*% betas[[1]] + rnorm(n));
## A toy example with p=30, total Gibbs steps=1100, takes ~0.6s
system.time({mod = INSS_gs(Y, X[, 1:30], 1, 1, 1, sqrt(p), nmc = 100);})
mod$beta; ## estimated beta after the Majority voting
hist(mod$Gibbs_res$betamat[1, ]); ## histogram of the beta_1
hist(mod$Gibbs_res$q); ## histogram of the q
hist(log(mod$Gibbs_res$lambdasq)); ## histogram of the log(lambda^2)
plot(mod$Gibbs_res$q); ## trace plot of the q
## joint posterior of model density and shrinkage
plot(log(mod$Gibbs_res$q / (1 - mod$Gibbs_res$q)), -log(mod$Gibbs_res$lambdasq),
     xlab = "logit(q)", ylab = "-log(lambda^2)",
     main = "Joint Posterior of Model Density and Shrinkage");