Title: | Spike-and-Slab Group Lasso for Group-Regularized Generalized Linear Models |
---|---|
Description: | Fits group-regularized generalized linear models (GLMs) using the spike-and-slab group lasso (SSGL) prior introduced by Bai et al. (2022) <doi:10.1080/01621459.2020.1765784> and extended to GLMs by Bai (2023) <arXiv:2007.07021>. This package supports fitting the SSGL model for the following GLMs with group sparsity: Gaussian linear regression, binary logistic regression, Poisson regression, negative binomial regression, and gamma regression. Stand-alone functions for group-regularized negative binomial regression and group-regularized gamma regression are also available, with the option of employing the group lasso penalty of Yuan and Lin (2006) <doi:10.1111/j.1467-9868.2005.00532.x>, the group minimax concave penalty (MCP) of Breheny and Huang <doi:10.1007/s11222-013-9424-2>, or the group smoothly clipped absolute deviation (SCAD) penalty of Breheny and Huang (2015) <doi:10.1007/s11222-013-9424-2>. |
Authors: | Ray Bai |
Maintainer: | Ray Bai <[email protected]> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2025-01-19 07:00:00 UTC |
Source: | CRAN |
This function implements -fold cross-validation for group-regularized gamma regression with a known shape parameter
and the log link. The cross-validation error (CVE) and cross-validation standard error (CVSE) are computed using the deviance for gamma regression.
For a description of group-regularized gamma regression, see the description for the gamma_grpreg
function. Our implementation is based on the least squares approximation approach of Wang and Leng (2007), and hence, the function does not allow the total number of covariates to be greater than
sample size, where
is the number of folds.
Note that the gamma_grpreg
function also returns the generalized information criterion (GIC) of Fan and Tang (2013) for each regularization parameter in lambda
, and the GIC can also be used for model selection instead of cross-validation.
cv_gamma_grpreg(Y, X, groups, gamma_shape=1, penalty=c("gLASSO","gSCAD","gMCP"), n_folds=10, group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4)
cv_gamma_grpreg(Y, X, groups, gamma_shape=1, penalty=c("gLASSO","gSCAD","gMCP"), n_folds=10, group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4)
Y |
|
X |
|
groups |
|
gamma_shape |
known shape parameter |
penalty |
group regularization method to use on the groups of regression coefficients. The options are |
n_folds |
number of folds |
group_weights |
group-specific, nonnegative weights for the penalty. Default is to use the square roots of the group sizes. |
taper |
tapering term |
n_lambda |
number of regularization parameters |
lambda |
grid of |
max_iter |
maximum number of iterations in the algorithm. Default is |
tol |
convergence threshold for algorithm. Default is |
The function returns a list containing the following components:
lambda |
|
cve |
|
cvse |
|
lambda_min |
The value in |
min_index |
The index of |
Breheny, P. and Huang, J. (2015). "Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors." Statistics and Computing, 25:173-187.
Fan, Y. and Tang, C. Y. (2013). "Tuning parameter selection in high-dimensional penalized likelihood." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75:531-552.
Wang, H. and Leng, C. (2007). "Unified LASSO estimation by least squares approximation." Journal of the American Statistical Association, 102:1039-1048.
Yuan, M. and Lin, Y. (2006). "Model selection and estimation in regression with grouped variables." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49-67.
## Generate data set.seed(12345) X = matrix(runif(100*11), nrow=100) n = dim(X)[1] groups = c(1,1,1,2,2,2,3,3,4,5,5) beta_true = c(-1,1,1,0,0,0,0,0,0,1.5,-1.5) ## Generate responses from gamma regression with known shape parameter 1 eta = crossprod(t(X), beta_true) shape = 1 Y = rgamma(n, rate=shape/exp(eta), shape=shape) ## 10-fold cross-validation for group-regularized gamma regression ## with the group LASSO penalty gamma_cv = cv_gamma_grpreg(Y, X, groups, penalty="gLASSO") ## Plot cross-validation curve plot(gamma_cv$lambda, gamma_cv$cve, type="l", xlab="lambda", ylab="CVE") ## lambda which minimizes mean CVE gamma_cv$lambda_min ## index of lambda_min in lambda gamma_cv$min_index
## Generate data set.seed(12345) X = matrix(runif(100*11), nrow=100) n = dim(X)[1] groups = c(1,1,1,2,2,2,3,3,4,5,5) beta_true = c(-1,1,1,0,0,0,0,0,0,1.5,-1.5) ## Generate responses from gamma regression with known shape parameter 1 eta = crossprod(t(X), beta_true) shape = 1 Y = rgamma(n, rate=shape/exp(eta), shape=shape) ## 10-fold cross-validation for group-regularized gamma regression ## with the group LASSO penalty gamma_cv = cv_gamma_grpreg(Y, X, groups, penalty="gLASSO") ## Plot cross-validation curve plot(gamma_cv$lambda, gamma_cv$cve, type="l", xlab="lambda", ylab="CVE") ## lambda which minimizes mean CVE gamma_cv$lambda_min ## index of lambda_min in lambda gamma_cv$min_index
This function implements -fold cross-validation for group-regularized negative binomial regression with a known size parameter
and the log link. The cross-validation error (CVE) and cross-validation standard error (CVSE) are computed using the deviance for negative binomial regression.
For a description of group-regularized negative binomial regression, see the description for the nb_grpreg
function. Our implementation is based on the least squares approximation approach of Wang and Leng (2007), and hence, the function does not allow the total number of covariates to be greater than
sample size, where
is the number of folds.
Note that the nb_grpreg
function also returns the generalized information criterion (GIC) of Fan and Tang (2013) for each regularization parameter in lambda
, and the GIC can also be used for model selection instead of cross-validation.
cv_nb_grpreg(Y, X, groups, nb_size=1, penalty=c("gLASSO","gSCAD","gMCP"), n_folds=10, group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4)
cv_nb_grpreg(Y, X, groups, nb_size=1, penalty=c("gLASSO","gSCAD","gMCP"), n_folds=10, group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4)
Y |
|
X |
|
groups |
|
nb_size |
known size parameter |
penalty |
group regularization method to use on the groups of regression coefficients. The options are |
n_folds |
number of folds |
group_weights |
group-specific, nonnegative weights for the penalty. Default is to use the square roots of the group sizes. |
taper |
tapering term |
n_lambda |
number of regularization parameters |
lambda |
grid of |
max_iter |
maximum number of iterations in the algorithm. Default is |
tol |
convergence threshold for algorithm. Default is |
The function returns a list containing the following components:
lambda |
|
cve |
|
cvse |
|
lambda_min |
The value in |
min_index |
The index of |
Breheny, P. and Huang, J. (2015). "Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors." Statistics and Computing, 25:173-187.
Fan, Y. and Tang, C. Y. (2013). "Tuning parameter selection in high dimensional penalized likelihood." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75:531-552.
Wang, H. and Leng, C. (2007). "Unified LASSO estimation by least squares approximation." Journal of the American Statistical Association, 102:1039-1048.
Yuan, M. and Lin, Y. (2006). "Model selection and estimation in regression with grouped variables." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49-67.
## Generate data set.seed(1234) X = matrix(runif(100*14), nrow=100) n = dim(X)[1] groups = c(1,1,1,2,2,2,2,3,3,4,5,5,6,6) beta_true = c(-1,1,1,0,0,0,0,-1,1,0,0,0,-1.5,1.5) ## Generate count responses from negative binomial regression eta = crossprod(t(X), beta_true) Y = rnbinom(n, size=1, mu=exp(eta)) ## 10-fold cross-validation for group-regularized negative binomial ## regression with the group MCP penalty nb_cv = cv_nb_grpreg(Y, X, groups, penalty="gMCP") ## Plot cross-validation curve plot(nb_cv$lambda, nb_cv$cve, type="l", xlab="lambda", ylab="CVE") ## lambda which minimizes mean CVE nb_cv$lambda_min ## index of lambda_min in lambda nb_cv$min_index
## Generate data set.seed(1234) X = matrix(runif(100*14), nrow=100) n = dim(X)[1] groups = c(1,1,1,2,2,2,2,3,3,4,5,5,6,6) beta_true = c(-1,1,1,0,0,0,0,-1,1,0,0,0,-1.5,1.5) ## Generate count responses from negative binomial regression eta = crossprod(t(X), beta_true) Y = rnbinom(n, size=1, mu=exp(eta)) ## 10-fold cross-validation for group-regularized negative binomial ## regression with the group MCP penalty nb_cv = cv_nb_grpreg(Y, X, groups, penalty="gMCP") ## Plot cross-validation curve plot(nb_cv$lambda, nb_cv$cve, type="l", xlab="lambda", ylab="CVE") ## lambda which minimizes mean CVE nb_cv$lambda_min ## index of lambda_min in lambda nb_cv$min_index
This function implements -fold cross-validation for group-regularized GLMs with the spike-and-slab group lasso (SSGL) penalty of Bai et al. (2022) and Bai (2023). The identity link function is used for Gaussian regression, the logit link is used for binomial regression, and the log link is used for Poisson, negative binomial, and gamma regression.
Although one can choose lambda0
from cross-validation with this function, it can be very time-consuming to do so if the number of groups and/or the number of total covariantes
is moderate to large. It is strongly recommended that the user simply run the
SSGL
function on the training dataset and select the final model according to the lambda0
that minimizes the generalized information criterion (GIC). See description of the SSGL
function for more details.
cv_SSGL(Y, X, groups, family=c("gaussian","binomial","poisson","negativebinomial","gamma"), nb_size=1, gamma_shape=1, group_weights, n_folds=5, n_lambda0=25, lambda0, lambda1=1, a=1, b=dim(X)[2], max_iter=100, tol=1e-6, print_fold=TRUE)
cv_SSGL(Y, X, groups, family=c("gaussian","binomial","poisson","negativebinomial","gamma"), nb_size=1, gamma_shape=1, group_weights, n_folds=5, n_lambda0=25, lambda0, lambda1=1, a=1, b=dim(X)[2], max_iter=100, tol=1e-6, print_fold=TRUE)
Y |
|
X |
|
groups |
|
family |
exponential dispersion family of the response variables. Allows for |
nb_size |
known size parameter |
gamma_shape |
known shape parameter |
group_weights |
group-specific, nonnegative weights for the penalty. Default is to use the square roots of the group sizes. |
n_folds |
number of folds |
n_lambda0 |
number of spike hyperparameters |
lambda0 |
grid of |
lambda1 |
slab hyperparameter |
a |
shape hyperparameter for the |
b |
shape hyperparameter for the |
max_iter |
maximum number of iterations in the algorithm. Default is |
tol |
convergence threshold for algorithm. Default is |
print_fold |
Boolean variable for whether or not to print the current fold in the algorithm. Default is |
The function returns a list containing the following components:
lambda0 |
|
cve |
|
cvse |
|
lambda0_min |
The value in |
min_index |
The index of |
Bai, R. (2023). "Bayesian group regularization in generalized linear models with a continuous spike-and-slab prior." arXiv pre-print arXiv:2007.07021.
Bai, R., Moran, G. E., Antonelli, J. L., Chen, Y., and Boland, M.R. (2022). "Spike-and-slab group lassos for grouped regression and sparse generalized additive models." Journal of the American Statistical Association, 117:184-197.
## Generate data set.seed(12345) X = matrix(runif(50*6), nrow=50) n = dim(X)[1] groups = c(1,1,1,2,2,2) beta_true = c(-2,1,1.5,0,0,0) ## Generate responses from Gaussian distribution Y = crossprod(t(X), beta_true) + rnorm(n) ## K-fold cross-validation ## NOTE: If you do not specify lambda0, the function will automatically choose a suitable grid. ssgl_mods = cv_SSGL(Y, X, groups, family="gaussian", lambda0=seq(from=16,to=4,by=-4)) ## Plot cross-validation curve plot(ssgl_mods$lambda0, ssgl_mods$cve, type="l", xlab="lambda0", ylab="CVE") ## lambda which minimizes mean CVE ssgl_mods$lambda0_min ssgl_mods$min_index ## Example with Poisson regression ## Generate count responses eta = crossprod(t(X), beta_true) Y = rpois(n,exp(eta)) ## K-fold cross-validation ## NOTE: If you do not specify lambda0, the program will automatically choose a suitable grid. ssgl_poisson_mods = cv_SSGL(Y, X, groups, family="poisson", lambda0=seq(from=20,to=2,by=-4)) ## Plot cross-validation curve plot(ssgl_poisson_mods$lambda0, ssgl_poisson_mods$cve, type="l", xlab="lambda0", ylab="CVE") ## lambda which minimizes mean CVE ssgl_poisson_mods$lambda0_min ssgl_poisson_mods$min_index
## Generate data set.seed(12345) X = matrix(runif(50*6), nrow=50) n = dim(X)[1] groups = c(1,1,1,2,2,2) beta_true = c(-2,1,1.5,0,0,0) ## Generate responses from Gaussian distribution Y = crossprod(t(X), beta_true) + rnorm(n) ## K-fold cross-validation ## NOTE: If you do not specify lambda0, the function will automatically choose a suitable grid. ssgl_mods = cv_SSGL(Y, X, groups, family="gaussian", lambda0=seq(from=16,to=4,by=-4)) ## Plot cross-validation curve plot(ssgl_mods$lambda0, ssgl_mods$cve, type="l", xlab="lambda0", ylab="CVE") ## lambda which minimizes mean CVE ssgl_mods$lambda0_min ssgl_mods$min_index ## Example with Poisson regression ## Generate count responses eta = crossprod(t(X), beta_true) Y = rpois(n,exp(eta)) ## K-fold cross-validation ## NOTE: If you do not specify lambda0, the program will automatically choose a suitable grid. ssgl_poisson_mods = cv_SSGL(Y, X, groups, family="poisson", lambda0=seq(from=20,to=2,by=-4)) ## Plot cross-validation curve plot(ssgl_poisson_mods$lambda0, ssgl_poisson_mods$cve, type="l", xlab="lambda0", ylab="CVE") ## lambda which minimizes mean CVE ssgl_poisson_mods$lambda0_min ssgl_poisson_mods$min_index
This function implements group-regularized gamma regression with a known shape parameter and the log link. In gamma regression, we assume that
, where
Then , and we relate
to a set of
covariates
through the log link,
If the covariates in each are grouped according to known groups
, then this function can estimate some of the
groups of coefficients as all zero, depending on the amount of regularization. Our implementation for regularized gamma regression is based on the least squares approximation approach of Wang and Leng (2007), and hence, the function does not allow the total number of covariates
to be greater than sample size.
In addition, this function has the option of returning the generalized information criterion (GIC) of Fan and Tang (2013) for each regularization parameter in the grid lambda
. The GIC can be used for model selection and serves as a useful alternative to cross-validation.
gamma_grpreg(Y, X, groups, X_test, gamma_shape=1, penalty=c("gLASSO","gSCAD","gMCP"), group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4, return_GIC=TRUE)
gamma_grpreg(Y, X, groups, X_test, gamma_shape=1, penalty=c("gLASSO","gSCAD","gMCP"), group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4, return_GIC=TRUE)
Y |
|
X |
|
groups |
|
X_test |
|
gamma_shape |
known shape parameter |
penalty |
group regularization method to use on the groups of regression coefficients. The options are |
group_weights |
group-specific, nonnegative weights for the penalty. Default is to use the square roots of the group sizes. |
taper |
tapering term |
n_lambda |
number of regularization parameters |
lambda |
grid of |
max_iter |
maximum number of iterations in the algorithm. Default is |
tol |
convergence threshold for algorithm. Default is |
return_GIC |
Boolean variable for whether or not to return the GIC. Default is |
The function returns a list containing the following components:
lambda |
|
beta |
|
beta0 |
|
classifications |
|
Y_pred |
|
GIC |
|
lambda_min |
The value in |
min_index |
The index of |
Breheny, P. and Huang, J. (2015). "Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors." Statistics and Computing, 25:173-187.
Fan, Y. and Tang, C. Y. (2013). "Tuning parameter selection in high dimensional penalized likelihood." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75:531-552.
Wang, H. and Leng, C. (2007). "Unified LASSO estimation by least squares approximation." Journal of the American Statistical Association, 102:1039-1048.
Yuan, M. and Lin, Y. (2006). "Model selection and estimation in regression with grouped variables." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49-67.
## Generate data set.seed(1234) X = matrix(runif(100*11), nrow=100) n = dim(X)[1] groups = c(1,1,1,2,2,2,3,3,4,5,5) beta_true = c(-1,1,1,0,0,0,0,0,0,1.5,-1.5) ## Generate responses from gamma regression with known shape parameter 1 eta = crossprod(t(X), beta_true) shape = 1 Y = rgamma(n, rate=shape/exp(eta), shape=shape) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*11), nrow=n_test) ## Fit gamma regression models with the group SCAD penalty gamma_mod = gamma_grpreg(Y, X, groups, X_test, penalty="gSCAD") ## Tuning parameters used to fit models gamma_mod$lambda ## Predicted n_test-dimensional vectors mu=E(Y_test) based on test data, X_test. ## The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' gamma_mod$Y_pred ## Classifications of the 5 groups. The kth column of 'classifications' # corresponds to the kth entry in 'lambda.' gamma_mod$classifications ## Plot lambda vs. GIC plot(gamma_mod$lambda, gamma_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC gamma_mod$lambda_min gamma_mod$min_index gamma_mod$classifications[, gamma_mod$min_index] gamma_mod$beta[, gamma_mod$min_index]
## Generate data set.seed(1234) X = matrix(runif(100*11), nrow=100) n = dim(X)[1] groups = c(1,1,1,2,2,2,3,3,4,5,5) beta_true = c(-1,1,1,0,0,0,0,0,0,1.5,-1.5) ## Generate responses from gamma regression with known shape parameter 1 eta = crossprod(t(X), beta_true) shape = 1 Y = rgamma(n, rate=shape/exp(eta), shape=shape) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*11), nrow=n_test) ## Fit gamma regression models with the group SCAD penalty gamma_mod = gamma_grpreg(Y, X, groups, X_test, penalty="gSCAD") ## Tuning parameters used to fit models gamma_mod$lambda ## Predicted n_test-dimensional vectors mu=E(Y_test) based on test data, X_test. ## The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' gamma_mod$Y_pred ## Classifications of the 5 groups. The kth column of 'classifications' # corresponds to the kth entry in 'lambda.' gamma_mod$classifications ## Plot lambda vs. GIC plot(gamma_mod$lambda, gamma_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC gamma_mod$lambda_min gamma_mod$min_index gamma_mod$classifications[, gamma_mod$min_index] gamma_mod$beta[, gamma_mod$min_index]
This function implements group-regularized negative binomial regression with a known size parameter and the log link. In negative binomial regression, we assume that
, where
Then , and we relate
to a set of
covariates
through the log link,
If the covariates in each are grouped according to known groups
, then this function can estimate some of the
groups of coefficients as all zero, depending on the amount of regularization. Our implementation for regularized negative binomial regression is based on the least squares approximation approach of Wang and Leng (2007), and hence, the function does not allow the total number of covariates
to be greater than sample size.
In addition, this function has the option of returning the generalized information criterion (GIC) of Fan and Tang (2013) for each regularization parameter in the grid lambda
. The GIC can be used for model selection and serves as a useful alternative to cross-validation.
nb_grpreg(Y, X, groups, X_test, nb_size=1, penalty=c("gLASSO","gSCAD","gMCP"), group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4, return_GIC=TRUE)
nb_grpreg(Y, X, groups, X_test, nb_size=1, penalty=c("gLASSO","gSCAD","gMCP"), group_weights, taper, n_lambda=100, lambda, max_iter=10000, tol=1e-4, return_GIC=TRUE)
Y |
|
X |
|
groups |
|
X_test |
|
nb_size |
known size parameter |
penalty |
group regularization method to use on the groups of regression coefficients. The options are |
group_weights |
group-specific, nonnegative weights for the penalty. Default is to use the square roots of the group sizes. |
taper |
tapering term |
n_lambda |
number of regularization parameters |
lambda |
grid of |
max_iter |
maximum number of iterations in the algorithm. Default is |
tol |
convergence threshold for algorithm. Default is |
return_GIC |
Boolean variable for whether or not to return the GIC. Default is |
The function returns a list containing the following components:
lambda |
|
beta |
|
beta0 |
|
classifications |
|
Y_pred |
|
GIC |
|
lambda_min |
The value in |
min_index |
The index of |
Breheny, P. and Huang, J. (2015). "Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors." Statistics and Computing, 25:173-187.
Fan, Y. and Tang, C. Y. (2013). "Tuning parameter selection in high dimensional penalized likelihood." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75:531-552.
Wang, H. and Leng, C. (2007). "Unified LASSO estimation by least squares approximation." Journal of the American Statistical Association, 102:1039-1048.
Yuan, M. and Lin, Y. (2006). "Model selection and estimation in regression with grouped variables." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49-67.
## Generate data set.seed(1234) X = matrix(runif(100*15), nrow=100) n = dim(X)[1] groups = c("A","A","A","A","B","B","B","B","C","C","D","D","E","E","E") groups = as.factor(groups) beta_true = c(-1.5,1.5,-1.5,1.5,0,0,0,0,0,0,2,-2,0,0,0) ## Generate count responses from negative binomial regression eta = crossprod(t(X), beta_true) Y = rnbinom(n,size=1, mu=exp(eta)) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*15), nrow=n_test) ## Fit negative binomial regression models with the group MCP penalty nb_mod = nb_grpreg(Y, X, groups, X_test, penalty="gMCP") ## Tuning parameters used to fit models nb_mod$lambda # Predicted n_test-dimensional vectors mu=E(Y_test) based on test data, X_test. # The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' nb_mod$Y_pred # Classifications of the 8 groups. The kth column of 'classifications' # corresponds to the kth entry in lambda. nb_mod$classifications ## Plot lambda vs. GIC plot(nb_mod$lambda, nb_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC nb_mod$lambda_min nb_mod$min_index nb_mod$classifications[, nb_mod$min_index] nb_mod$beta[, nb_mod$min_index]
## Generate data set.seed(1234) X = matrix(runif(100*15), nrow=100) n = dim(X)[1] groups = c("A","A","A","A","B","B","B","B","C","C","D","D","E","E","E") groups = as.factor(groups) beta_true = c(-1.5,1.5,-1.5,1.5,0,0,0,0,0,0,2,-2,0,0,0) ## Generate count responses from negative binomial regression eta = crossprod(t(X), beta_true) Y = rnbinom(n,size=1, mu=exp(eta)) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*15), nrow=n_test) ## Fit negative binomial regression models with the group MCP penalty nb_mod = nb_grpreg(Y, X, groups, X_test, penalty="gMCP") ## Tuning parameters used to fit models nb_mod$lambda # Predicted n_test-dimensional vectors mu=E(Y_test) based on test data, X_test. # The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' nb_mod$Y_pred # Classifications of the 8 groups. The kth column of 'classifications' # corresponds to the kth entry in lambda. nb_mod$classifications ## Plot lambda vs. GIC plot(nb_mod$lambda, nb_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC nb_mod$lambda_min nb_mod$min_index nb_mod$classifications[, nb_mod$min_index] nb_mod$beta[, nb_mod$min_index]
This is a function to implement group-regularized GLMs with the spike-and-slab group lasso (SSGL) penalty of Bai et al. (2022) and Bai (2023). The identity link function is used for Gaussian regression, the logit link is used for binomial regression, and the log link is used for Poisson, negative binomial, and gamma regression. If the covariates in each are grouped according to known groups
, then this function can estimate some of the
groups of coefficients as all zero, depending on the amount of regularization.
In addition, this function has the option of returning the generalized information criterion (GIC) of Fan and Tang (2013) for each regularization parameter in the grid lambda0
. The GIC can be used for model selection and serves as a useful alternative to cross-validation. The formula for the GIC and a given is
where is the deviance computed with the estimate of
beta
based on spike hyperparameter ,
is the number of nonzero elements in the estimated
beta
, and is a sequence that diverges at a suitable rate relative to
. As recommended by Fan and Tang (2013), we set
.
SSGL(Y, X, groups, family=c("gaussian","binomial","poisson","negativebinomial","gamma"), X_test, nb_size=1, gamma_shape=1, group_weights, n_lambda0=25, lambda0, lambda1=1, a=1, b=dim(X)[2], max_iter=100, tol = 1e-6, return_GIC=TRUE, print_lambda0=TRUE)
SSGL(Y, X, groups, family=c("gaussian","binomial","poisson","negativebinomial","gamma"), X_test, nb_size=1, gamma_shape=1, group_weights, n_lambda0=25, lambda0, lambda1=1, a=1, b=dim(X)[2], max_iter=100, tol = 1e-6, return_GIC=TRUE, print_lambda0=TRUE)
Y |
|
X |
|
groups |
|
family |
exponential dispersion family of the response variables. Allows for |
X_test |
|
nb_size |
known size parameter |
gamma_shape |
known shape parameter |
group_weights |
group-specific, nonnegative weights for the penalty. Default is to use the square roots of the group sizes. |
n_lambda0 |
number of spike hyperparameters |
lambda0 |
grid of |
lambda1 |
slab hyperparameter |
a |
shape hyperparameter for the |
b |
shape hyperparameter for the |
max_iter |
maximum number of iterations in the algorithm. Default is |
tol |
convergence threshold for algorithm. Default is |
return_GIC |
Boolean variable for whether or not to return the GIC. Default is |
print_lambda0 |
Boolean variable for whether or not to print the current value in |
The function returns a list containing the following components:
lambda0 |
|
beta |
|
beta0 |
|
classifications |
|
Y_pred |
|
GIC |
|
lambda0_min |
The value in |
min_index |
The index of |
Bai, R. (2023). "Bayesian group regularization in generalized linear models with a continuous spike-and-slab prior." arXiv pre-print arXiv:2007.07021.
Bai, R., Moran, G. E., Antonelli, J. L., Chen, Y., and Boland, M.R. (2022). "Spike-and-slab group lassos for grouped regression and sparse generalized additive models." Journal of the American Statistical Association, 117:184-197.
Fan, Y. and Tang, C. Y. (2013). "Tuning parameter selection in high dimensional penalized likelihood." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75:531-552.
## Generate data set.seed(12345) X = matrix(runif(100*10), nrow=100) n = dim(X)[1] groups = c("A","A","A","B","B","B","C","C","D","D") groups = as.factor(groups) beta_true = c(-2.5,1.5,1.5,0,0,0,2,-2,0,0) ## Generate responses from Gaussian distribution Y = crossprod(t(X), beta_true) + rnorm(n) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*10), nrow=n_test) ## Fit SSGL model with 10 spike hyperparameters ## NOTE: If you do not specify lambda0, the program will automatically choose a suitable grid. SSGL_mod = SSGL(Y, X, groups, family="gaussian", X_test, lambda0=seq(from=50,to=5,by=-5)) ## Regression coefficient estimates SSGL_mod$beta ## Predicted n_test-dimensional vectors mu=E(Y.test) based on test data, X_test. ## The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' SSGL_mod$Y_pred ## Classifications of the 8 groups. The kth column of 'classifications' ## corresponds to the kth entry in 'lambda.' SSGL_mod$classifications ## Plot lambda vs. GIC plot(SSGL_mod$lambda0, SSGL_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC SSGL_mod$lambda0_min SSGL_mod$min_index SSGL_mod$classifications[, SSGL_mod$min_index] SSGL_mod$beta[, SSGL_mod$min_index] ## Example with binary logistic regression set.seed(12345) X = matrix(runif(100*8), nrow=100) n = dim(X)[1] groups = c("A","A","A","B","B","B","C","C") groups = as.factor(groups) beta_true = c(-2.5,1.5,1.5,0,0,0,2,-2) ## Generate binary responses eta = crossprod(t(X), beta_true) Y = rbinom(n, size=1, prob=1/(1+exp(-eta))) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*8), nrow=n_test) ## Fit SSGL logistic regression model with 10 spike hyperparameters ## NOTE: If you do not specify lambda0, the program will automatically choose a suitable grid. SSGL_logistic_mod = SSGL(Y, X, groups, family="binomial", X_test, lambda0=seq(from=10,to=1,by=-1.5)) ## Regression coefficient estimates SSGL_logistic_mod$beta ## Predicted n_test-dimensional vectors mu=E(Y_test) based on test data, X_test. ## The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' SSGL_logistic_mod$Y_pred ## Classifications of the 8 groups. The kth column of 'classifications' ## corresponds to the kth entry in 'lambda.' SSGL_logistic_mod$classifications ## Plot lambda vs. GIC plot(SSGL_logistic_mod$lambda0, SSGL_logistic_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC SSGL_logistic_mod$lambda0_min SSGL_logistic_mod$min_index SSGL_logistic_mod$classifications[, SSGL_logistic_mod$min_index] SSGL_logistic_mod$beta[, SSGL_logistic_mod$min_index]
## Generate data set.seed(12345) X = matrix(runif(100*10), nrow=100) n = dim(X)[1] groups = c("A","A","A","B","B","B","C","C","D","D") groups = as.factor(groups) beta_true = c(-2.5,1.5,1.5,0,0,0,2,-2,0,0) ## Generate responses from Gaussian distribution Y = crossprod(t(X), beta_true) + rnorm(n) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*10), nrow=n_test) ## Fit SSGL model with 10 spike hyperparameters ## NOTE: If you do not specify lambda0, the program will automatically choose a suitable grid. SSGL_mod = SSGL(Y, X, groups, family="gaussian", X_test, lambda0=seq(from=50,to=5,by=-5)) ## Regression coefficient estimates SSGL_mod$beta ## Predicted n_test-dimensional vectors mu=E(Y.test) based on test data, X_test. ## The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' SSGL_mod$Y_pred ## Classifications of the 8 groups. The kth column of 'classifications' ## corresponds to the kth entry in 'lambda.' SSGL_mod$classifications ## Plot lambda vs. GIC plot(SSGL_mod$lambda0, SSGL_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC SSGL_mod$lambda0_min SSGL_mod$min_index SSGL_mod$classifications[, SSGL_mod$min_index] SSGL_mod$beta[, SSGL_mod$min_index] ## Example with binary logistic regression set.seed(12345) X = matrix(runif(100*8), nrow=100) n = dim(X)[1] groups = c("A","A","A","B","B","B","C","C") groups = as.factor(groups) beta_true = c(-2.5,1.5,1.5,0,0,0,2,-2) ## Generate binary responses eta = crossprod(t(X), beta_true) Y = rbinom(n, size=1, prob=1/(1+exp(-eta))) ## Generate test data n_test = 50 X_test = matrix(runif(n_test*8), nrow=n_test) ## Fit SSGL logistic regression model with 10 spike hyperparameters ## NOTE: If you do not specify lambda0, the program will automatically choose a suitable grid. SSGL_logistic_mod = SSGL(Y, X, groups, family="binomial", X_test, lambda0=seq(from=10,to=1,by=-1.5)) ## Regression coefficient estimates SSGL_logistic_mod$beta ## Predicted n_test-dimensional vectors mu=E(Y_test) based on test data, X_test. ## The kth column of 'Y_pred' corresponds to the kth entry in 'lambda.' SSGL_logistic_mod$Y_pred ## Classifications of the 8 groups. The kth column of 'classifications' ## corresponds to the kth entry in 'lambda.' SSGL_logistic_mod$classifications ## Plot lambda vs. GIC plot(SSGL_logistic_mod$lambda0, SSGL_logistic_mod$GIC, type='l') ## Model selection with the lambda that minimizes GIC SSGL_logistic_mod$lambda0_min SSGL_logistic_mod$min_index SSGL_logistic_mod$classifications[, SSGL_logistic_mod$min_index] SSGL_logistic_mod$beta[, SSGL_logistic_mod$min_index]