Title: | Group Subset Selection |
---|---|
Description: | Provides tools for sparse regression modelling with grouped predictors using the group subset selection penalty. Uses coordinate descent and local search algorithms to rapidly deliver near optimal estimates. The group subset penalty can be combined with a group lasso or ridge penalty for added shrinkage. Linear and logistic regression are supported, as are overlapping groups. |
Authors: | Ryan Thompson [aut, cre] |
Maintainer: | Ryan Thompson |
License: | GPL-3 |
Version: | 1.3.2 |
Built: | 2024-12-28 06:40:58 UTC |
Source: | CRAN |
Extracts coefficients for specified values of the tuning parameters.
## S3 method for class 'cv.grpsel'
coef(object, lambda = "lambda.min", gamma = "gamma.min", ...)
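object |
an object of class cv.grpsel |
lambda |
the value of lambda at which to extract coefficients; the default "lambda.min" is the value selected by cross-validation |
gamma |
the value of gamma at which to extract coefficients; the default "gamma.min" is the value selected by cross-validation |
... |
any other arguments |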
A matrix of coefficients.
Ryan Thompson
Extracts coefficients for specified values of the tuning parameters.
## S3 method for class 'grpsel'
coef(object, lambda = NULL, gamma = NULL, ...)
object |
an object of class grpsel |
lambda |
the value of lambda at which to extract coefficients |
gamma |
the value of gamma at which to extract coefficients |
... |
any other arguments |
A matrix of coefficients.
Ryan Thompson
Fits the regularisation surface for a regression model with a group subset selection penalty and then cross-validates this surface.
cv.grpsel(
  x,
  y,
  group = seq_len(ncol(x)),
  penalty = c("grSubset", "grSubset+grLasso", "grSubset+Ridge"),
  loss = c("square", "logistic"),
  lambda = NULL,
  gamma = NULL,
  nfold = 10,
  folds = NULL,
  cv.loss = NULL,
  cluster = NULL,
  interpolate = TRUE,
  ...
)
x |
a predictor matrix |
y |
a response vector |
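group |
a vector of length ncol(x) identifying the group that each predictor belongs to; by default, each predictor forms its own group |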
penalty |
the type of penalty to apply; one of 'grSubset', 'grSubset+grLasso', or 'grSubset+Ridge' |
loss |
the type of loss function to use; 'square' for linear regression or 'logistic' for logistic regression |
lambda |
an optional list of decreasing sequences of group subset selection parameters; the list should contain a vector for each value of gamma |
gamma |
an optional decreasing sequence of group lasso or ridge parameters |
nfold |
the number of cross-validation folds |
folds |
an optional vector of length nrow(x) identifying the fold that each observation belongs to |
cv.loss |
an optional cross-validation loss function to use; should accept a vector of predicted values and a vector of actual values |
cluster |
an optional cluster for running cross-validation in parallel; must be set up using parallel::makeCluster |
interpolate |
a logical indicating whether to interpolate the lambda sequence when fitting to the cross-validation folds |
... |
any other arguments for grpsel() |
When loss='logistic', stratified cross-validation is used to balance the folds. When fitting to the cross-validation folds, interpolate=TRUE cross-validates the midpoints between consecutive lambda values rather than the original lambda sequence. This new sequence retains the same set of solutions on the full data, but often leads to superior cross-validation performance.
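The midpoint construction and the fold balancing described above can be sketched in base R. This is an illustration only, not the package's internal code: the helper names `midpoints` and `stratified_folds` and the example values are hypothetical, and the fold assignment shown is just one simple way to spread the classes evenly across folds.

```r
# Hypothetical helper (not part of grpsel): midpoints between consecutive
# values of a decreasing lambda sequence.
midpoints <- function(lambda) {
  (head(lambda, -1) + tail(lambda, -1)) / 2
}

lambda <- c(1, 0.5, 0.25, 0.125)  # made-up decreasing sequence
lambda.mid <- midpoints(lambda)    # one fewer value than lambda

# Hypothetical helper: assign fold labels separately within each class so
# that every fold receives a similar share of each class.
stratified_folds <- function(y, nfold) {
  folds <- integer(length(y))
  for (cl in unique(y)) {
    idx <- which(y == cl)
    folds[idx] <- sample(rep_len(seq_len(nfold), length(idx)))
  }
  folds
}

set.seed(1)
y <- rep(c(0, 1), times = c(70, 30))
folds <- stratified_folds(y, nfold = 10)
table(folds, y)  # each fold holds 7 zeros and 3 ones
```

A vector built this way has the same shape as the `folds` argument: one fold label per observation.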
An object of class cv.grpsel; a list with the following components:
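cv.mean |
a list of vectors containing cross-validation means per value of lambda; one vector per value of gamma |
cv.sd |
a list of vectors containing cross-validation standard errors per value of lambda; one vector per value of gamma |
lambda |
a list of vectors containing the values of lambda used in the fit; one vector per value of gamma |
gamma |
a vector containing the values of gamma used in the fit |
lambda.min |
the value of lambda minimising cv.mean |
gamma.min |
the value of gamma minimising cv.mean |
fit |
the fit from running grpsel() on the full data |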
Ryan Thompson
# Grouped data
set.seed(123)
n <- 100
p <- 10
g <- 5
group <- rep(1:g, each = p / g)
beta <- numeric(p)
beta[which(group %in% 1:2)] <- 1
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n, x %*% beta)
newx <- matrix(rnorm(p), ncol = p)

# Group subset selection
fit <- cv.grpsel(x, y, group)
plot(fit)
coef(fit)
predict(fit, newx)

# Parallel cross-validation
cl <- parallel::makeCluster(2)
fit <- cv.grpsel(x, y, group, cluster = cl)
parallel::stopCluster(cl)
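As a further illustration, a custom `cv.loss` might look like the sketch below. The function name `mae` is hypothetical; it takes predicted values first and actual values second, following the argument description. The call to `cv.grpsel` is guarded so the snippet still runs when grpsel is not installed.

```r
# Hypothetical custom loss: mean absolute error between predicted and
# actual values.
mae <- function(pred, actual) mean(abs(actual - pred))

mae(c(1, 2, 3), c(1, 4, 0))  # (0 + 2 + 3) / 3

# Cross-validate with the custom loss, only if grpsel is available.
if (requireNamespace("grpsel", quietly = TRUE)) {
  set.seed(123)
  n <- 100
  p <- 10
  g <- 5
  group <- rep(1:g, each = p / g)
  beta <- numeric(p)
  beta[group %in% 1:2] <- 1
  x <- matrix(rnorm(n * p), n, p)
  y <- rnorm(n, x %*% beta)
  fit <- grpsel::cv.grpsel(x, y, group, cv.loss = mae)
  coef(fit)
}
```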
Fits the regularisation surface for a regression model with a group subset selection penalty. The group subset penalty can be combined with either a group lasso or ridge penalty for shrinkage. The group subset parameter is lambda and the group lasso/ridge parameter is gamma.
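For orientation, the penalised problem can be sketched as follows. This is an assumed formulation based on the description above and is not quoted from the package: $L$ denotes the loss, $\beta_k$ the coefficients of group $k$, and the weights $w_k$ and $v_k$ stand in for the `lambda.factor` and `gamma.factor` penalty factors, which are assumed here to enter multiplicatively.

```latex
\min_{\beta} \; L(\beta)
  + \lambda \sum_{k=1}^{g} w_k \, 1\!\left(\lVert \beta_k \rVert_2 \neq 0\right)
  + \gamma \sum_{k=1}^{g} v_k \, \lVert \beta_k \rVert_2^{q},
\qquad
q =
\begin{cases}
  1 & \text{group lasso shrinkage} \\
  2 & \text{ridge shrinkage}
\end{cases}
```

Under this reading, the first penalty counts the active groups and the second shrinks the coefficients of the groups that remain active; with penalty = 'grSubset' only the first penalty is present.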
grpsel(
  x,
  y,
  group = seq_len(ncol(x)),
  penalty = c("grSubset", "grSubset+grLasso", "grSubset+Ridge"),
  loss = c("square", "logistic"),
  local.search = FALSE,
  orthogonalise = FALSE,
  nlambda = 100,
  lambda.step = 0.99,
  lambda = NULL,
  lambda.factor = NULL,
  ngamma = 10,
  gamma.max = 100,
  gamma.min = 1e-04,
  gamma = NULL,
  gamma.factor = NULL,
  pmax = ncol(x),
  gmax = length(unique(group)),
  eps = 1e-04,
  max.cd.iter = 10000,
  max.ls.iter = 100,
  active.set = TRUE,
  active.set.count = 3,
  sort = TRUE,
  screen = 500,
  warn = TRUE
)
x |
a predictor matrix |
y |
a response vector |
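group |
a vector of length ncol(x) identifying the group that each predictor belongs to; by default, each predictor forms its own group |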
penalty |
the type of penalty to apply; one of 'grSubset', 'grSubset+grLasso', or 'grSubset+Ridge' |
loss |
the type of loss function to use; 'square' for linear regression or 'logistic' for logistic regression |
local.search |
a logical indicating whether to perform local search after coordinate descent; typically leads to higher quality solutions |
orthogonalise |
a logical indicating whether to orthogonalise within groups |
nlambda |
the number of group subset selection parameters to evaluate when lambda is not supplied |
lambda.step |
the step size taken when computing the lambda sequence |
lambda |
an optional list of decreasing sequences of group subset selection parameters; the list should contain a vector for each value of gamma |
lambda.factor |
a vector of penalty factors applied to the group subset selection penalty; equal to the group sizes by default |
ngamma |
the number of group lasso or ridge parameters to evaluate when gamma is not supplied |
gamma.max |
the maximum value for gamma |
gamma.min |
the minimum value for gamma |
gamma |
an optional decreasing sequence of group lasso or ridge parameters |
gamma.factor |
a vector of penalty factors applied to the shrinkage penalty; by default, equal to the square root of the group sizes when penalty = 'grSubset+grLasso' |
pmax |
the maximum number of predictors ever allowed to be active; ignored if lambda is supplied |
gmax |
the maximum number of groups ever allowed to be active; ignored if lambda is supplied |
eps |
the convergence tolerance; convergence is declared when the relative maximum difference in consecutive coefficients is less than eps |
max.cd.iter |
the maximum number of coordinate descent iterations allowed per value of lambda and gamma |
max.ls.iter |
the maximum number of local search iterations allowed per value of lambda and gamma |
active.set |
a logical indicating whether to use active set updates; typically lowers the run time |
active.set.count |
the number of consecutive coordinate descent iterations in which a subset should appear before running active set updates |
sort |
a logical indicating whether to sort the coordinates before running coordinate descent; required for gradient screening; typically leads to higher quality solutions |
screen |
the number of groups to keep after gradient screening; smaller values typically lower the run time |
warn |
a logical indicating whether to print a warning if the algorithms fail to converge |
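The default penalty factors described above depend only on the group sizes, which can be computed directly from the group vector. The following base R sketch uses a made-up `group` vector with unequal group sizes; the variable names are illustrative and the code does not call the package.

```r
# Made-up group vector: 6 predictors in groups of size 3, 2 and 1.
group <- c(1, 1, 1, 2, 2, 3)

# Number of predictors in each group.
group.size <- as.vector(table(group))

# Factors matching the defaults described for lambda.factor (group sizes)
# and for gamma.factor (square root of the group sizes).
lambda.factor <- group.size
gamma.factor <- sqrt(group.size)

lambda.factor
gamma.factor
```

Vectors of this shape, with one entry per group, could be adjusted and passed explicitly to weight some groups more or less heavily.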
For linear regression (loss='square'), the response and predictors are centred about zero and scaled to unit l2-norm. For logistic regression (loss='logistic'), only the predictors are centred and scaled, and an intercept is fit during the course of the algorithm.
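The standardisation described above can be illustrated in base R. This is a sketch of the transformation only, with a hypothetical helper name `standardise`; it is not the package's internal routine, which may differ in detail.

```r
# Hypothetical helper: centre each column about zero, then scale it to
# unit l2-norm.
standardise <- function(z) {
  z <- as.matrix(z)
  z <- sweep(z, 2, colMeans(z))          # subtract column means
  sweep(z, 2, sqrt(colSums(z^2)), "/")   # divide by column l2-norms
}

set.seed(123)
x <- matrix(rnorm(100 * 3, mean = 5, sd = 2), 100, 3)
x.std <- standardise(x)

colMeans(x.std)   # numerically zero
colSums(x.std^2)  # all equal to 1
```

Note that this differs from `scale()`, which by default divides by the sample standard deviation rather than the l2-norm.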
An object of class grpsel; a list with the following components:
beta |
a list of matrices whose columns contain fitted coefficients for a given value of lambda; one matrix per value of gamma |
gamma |
a vector containing the values of gamma used in the fit |
lambda |
a list of vectors containing the values of lambda used in the fit; one vector per value of gamma |
np |
a list of vectors containing the number of active predictors per value of lambda; one vector per value of gamma |
ng |
a list of vectors containing the number of active groups per value of lambda; one vector per value of gamma |
iter.cd |
a list of vectors containing the number of coordinate descent iterations per value of lambda; one vector per value of gamma |
iter.ls |
a list of vectors containing the number of local search iterations per value of lambda; one vector per value of gamma |
loss |
a list of vectors containing the evaluated loss function per value of lambda; one vector per value of gamma |
Ryan Thompson
Thompson, R. and Vahid, F. (2024). 'Group selection and shrinkage: Structured sparsity for semiparametric additive models'. Journal of Computational and Graphical Statistics 33.4, pp. 1286–1297.
# Grouped data
set.seed(123)
n <- 100
p <- 10
g <- 5
group <- rep(1:g, each = p / g)
beta <- numeric(p)
beta[which(group %in% 1:2)] <- 1
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n, x %*% beta)
newx <- matrix(rnorm(p), ncol = p)

# Group subset selection
fit <- grpsel(x, y, group)
plot(fit)
coef(fit, lambda = 0.05)
predict(fit, newx, lambda = 0.05)

# Group subset selection with group lasso shrinkage
fit <- grpsel(x, y, group, penalty = 'grSubset+grLasso')
plot(fit, gamma = 0.05)
coef(fit, lambda = 0.05, gamma = 0.1)
predict(fit, newx, lambda = 0.05, gamma = 0.1)

# Group subset selection with ridge shrinkage
fit <- grpsel(x, y, group, penalty = 'grSubset+Ridge')
plot(fit, gamma = 0.05)
coef(fit, lambda = 0.05, gamma = 0.1)
predict(fit, newx, lambda = 0.05, gamma = 0.1)
Plot the cross-validation results from group subset selection for a specified value of gamma.
## S3 method for class 'cv.grpsel'
plot(x, gamma = "gamma.min", ...)
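x |
an object of class cv.grpsel |
gamma |
the value of gamma at which to plot the cross-validation results; the default "gamma.min" is the value selected by cross-validation |
... |
any other arguments |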
A plot of the cross-validation results.
Ryan Thompson
Plot the coefficient profiles from group subset selection for a specified value of gamma.
## S3 method for class 'grpsel'
plot(x, gamma = 0, ...)
x |
an object of class grpsel |
gamma |
the value of gamma at which to plot the coefficient profiles |
... |
any other arguments |
A plot of the coefficient profiles.
Ryan Thompson
Generate predictions for new data using specified values of the tuning parameters.
## S3 method for class 'cv.grpsel'
predict(object, x.new, lambda = "lambda.min", gamma = "gamma.min", ...)
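object |
an object of class cv.grpsel |
x.new |
a matrix of new values for the predictors |
lambda |
the value of lambda at which to generate predictions; the default "lambda.min" is the value selected by cross-validation |
gamma |
the value of gamma at which to generate predictions; the default "gamma.min" is the value selected by cross-validation |
... |
any other arguments |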
A matrix of predictions.
Ryan Thompson
Generate predictions for new data using specified values of the tuning parameters.
## S3 method for class 'grpsel'
predict(object, x.new, lambda = NULL, gamma = NULL, ...)
object |
an object of class grpsel |
x.new |
a matrix of new values for the predictors |
lambda |
the value of lambda at which to generate predictions |
gamma |
the value of gamma at which to generate predictions |
... |
any other arguments |
A matrix of predictions.
Ryan Thompson