Title: | Sparse Group Lasso |
---|---|
Description: | Efficient implementation of sparse group lasso with optional bound constraints on the coefficients; see <doi:10.18637/jss.v110.i06>. It supports the use of a sparse design matrix as well as returning coefficient estimates in a sparse matrix. Furthermore, it correctly calculates the degrees of freedom to allow for information criteria rather than cross-validation with very large data. Finally, the interface to compiled code avoids unnecessary copies and allows for the use of long integers. |
Authors: | Daniel J. McDonald [aut, cre], Xiaoxuan Liang [aut], Anibal Solón Heinsfeld [aut], Aaron Cohen [aut], Yi Yang [ctb], Hui Zou [ctb], Jerome Friedman [ctb], Trevor Hastie [ctb], Rob Tibshirani [ctb], Balasubramanian Narasimhan [ctb], Kenneth Tay [ctb], Noah Simon [ctb], Junyang Qian [ctb], James Yang [ctb] |
Maintainer: | Daniel J. McDonald <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.1 |
Built: | 2024-12-23 06:24:55 UTC |
Source: | CRAN |
Extract coefficients from a cv.sparsegl object.
This function extracts coefficients from a cross-validated sparsegl() model, using the stored "sparsegl.fit" object and the optimal value chosen for lambda.
## S3 method for class 'cv.sparsegl'
coef(object, s = c("lambda.1se", "lambda.min"), ...)
object |
Fitted |
s |
Value(s) of the penalty parameter |
... |
Not used. |
The coefficients at the requested value(s) for lambda.
See also cv.sparsegl() and predict.cv.sparsegl().
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
fit1 <- sparsegl(X, y, group = groups)
cv_fit <- cv.sparsegl(X, y, groups)
coef(cv_fit, s = c(0.02, 0.03))
Extract coefficients from a sparsegl object.
Computes the coefficients at the requested value(s) for lambda from a sparsegl() object.
## S3 method for class 'sparsegl'
coef(object, s = NULL, ...)
object |
Fitted |
s |
Value(s) of the penalty parameter |
... |
Not used. |
s is the new vector of lambda values at which predictions are requested. If s is not in the lambda sequence used for fitting the model, the coef function will use linear interpolation to make predictions. The new values are interpolated using a fraction of coefficients from both left and right lambda indices.
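The interpolation described above can be sketched in base R. This is only an illustration under stated assumptions: `interp_coef()` is a hypothetical helper (not part of sparsegl), the coefficient path is a small hand-made matrix, and requests outside the fitted range are clamped to its endpoints, which may differ from what the package does.

```r
# Hypothetical helper, not part of sparsegl: linearly interpolate a
# coefficient path (columns of `beta`, one per lambda) at new values `s`.
interp_coef <- function(lambda, beta, s) {
  stopifnot(length(lambda) == ncol(beta), length(lambda) >= 2)
  ord <- order(lambda)
  lambda <- lambda[ord]
  beta <- beta[, ord, drop = FALSE]
  # Assumption of this sketch: clamp requests to the fitted lambda range.
  s <- pmin(pmax(s, lambda[1]), lambda[length(lambda)])
  # Index of the left neighbour, kept below the last position.
  left <- findInterval(s, lambda, all.inside = TRUE)
  right <- left + 1L
  # Fraction of the way from the left neighbour to the right neighbour.
  frac <- (s - lambda[left]) / (lambda[right] - lambda[left])
  sweep(beta[, left, drop = FALSE], 2, 1 - frac, `*`) +
    sweep(beta[, right, drop = FALSE], 2, frac, `*`)
}

lambda <- c(0.04, 0.02, 0.01)
beta <- cbind(c(0, 1), c(1, 2), c(3, 2)) # column j belongs to lambda[j]
res <- interp_coef(lambda, beta, s = c(0.03, 0.02))
```

Here the first requested value lies halfway between two fitted lambda values, so each interpolated coefficient is the average of its two neighbours; the second value is on the grid and is returned unchanged.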
The coefficients at the requested values for lambda.
See also sparsegl() and predict.sparsegl().
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
fit1 <- sparsegl(X, y, group = groups)
coef(fit1, s = c(0.02, 0.03))
Cross-validation for a sparsegl object.
Performs k-fold cross-validation for sparsegl(). This function is largely similar to glmnet::cv.glmnet().
cv.sparsegl(
  x,
  y,
  group = NULL,
  family = c("gaussian", "binomial"),
  lambda = NULL,
  pred.loss = c("default", "mse", "deviance", "mae", "misclass"),
  nfolds = 10,
  foldid = NULL,
  weights = NULL,
  offset = NULL,
  ...
)
x |
Double. A matrix of predictors, of dimension
|
y |
Double/Integer/Factor. The response variable.
Quantitative for |
group |
Integer. A vector of consecutive integers describing the grouping of the coefficients (see example below). |
family |
Character or function. Specifies the generalized linear model to use. Valid options are:
For any other type, a valid |
lambda |
A user supplied |
pred.loss |
Loss to use for cross-validation error. Valid options are:
|
nfolds |
Number of folds - default is 10. Although |
foldid |
An optional vector of values between 1 and |
weights |
Double vector. Optional observation weights. These can
only be used with a |
offset |
Double vector. Optional offset (constant predictor without a
corresponding coefficient). These can only be used with a
|
... |
Additional arguments to |
The function runs sparsegl() nfolds + 1 times; the first to get the lambda sequence, and then the remainder to compute the fit with each of the folds omitted. The average error and standard error over the folds are computed.
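The fold mechanics described above can be sketched in base R. This is a minimal illustration, not the package's implementation: ordinary least squares (`lm.fit()`) stands in for sparsegl() so the block runs without the package, only a single model is evaluated rather than a whole lambda path, and the standard error uses the plain unweighted formula, which may differ from the package's exact calculation.

```r
set.seed(1)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n)
y <- drop(X %*% c(2, -1, 0, 0, 1)) + rnorm(n)

nfolds <- 10
# Random fold assignment with values between 1 and nfolds.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Stand-in fitter (ordinary least squares); sparsegl() would be used here.
fold_err <- vapply(seq_len(nfolds), function(k) {
  test <- foldid == k
  fit <- lm.fit(cbind(1, X[!test, , drop = FALSE]), y[!test])
  pred <- cbind(1, X[test, , drop = FALSE]) %*% fit$coefficients
  mean((y[test] - pred)^2) # mean squared error on the held-out fold
}, numeric(1))

cvm <- mean(fold_err)               # average error over the folds
cvsd <- sd(fold_err) / sqrt(nfolds) # standard error over the folds
```

With a lambda path, `fold_err` would become a matrix with one column per lambda value, and `cvm` and `cvsd` would be computed column-wise.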
An object of class "cv.sparsegl" is returned, which is a list with components describing the cross-validation error.
lambda |
The values of |
cvm |
The mean cross-validated error - a vector of
length |
cvsd |
Estimate of standard error of |
cvupper |
Upper curve = |
cvlower |
Lower curve = |
name |
A text string indicating type of measure (for plotting purposes). |
nnzero |
The number of non-zero coefficients for each |
active_grps |
The number of active groups for each |
sparsegl.fit |
A fitted |
lambda.min |
The optimal value of |
lambda.1se |
The largest value of |
call |
The function call. |
Liang, X., Cohen, A., Sólon Heinsfeld, A., Pestilli, F., and McDonald, D.J. 2024. sparsegl: An R Package for Estimating Sparse Group Lasso. Journal of Statistical Software, Vol. 110(6): 1–23. doi:10.18637/jss.v110.i06.
See also sparsegl(), as well as plot(), predict(), and coef() methods for "cv.sparsegl" objects.
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
cv_fit <- cv.sparsegl(X, y, groups)
This function uses the degrees of freedom to calculate various information criteria, based on the "unknown variance" version of the likelihood. It is only implemented for Gaussian regression. The constant is ignored (as in stats::extractAIC()).
estimate_risk(object, x, type = c("AIC", "BIC", "GCV"), approx_df = FALSE)
object |
fitted object from a call to |
x |
Matrix. The matrix of predictors used to estimate
the |
type |
one or more of AIC, BIC, or GCV. |
approx_df |
the |
A data.frame with as many rows as object$lambda. It contains columns lambda, df, and the requested risk types.
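For orientation, one common form of these criteria under the "unknown variance" Gaussian likelihood can be written in base R. This is a sketch under assumptions: `risk_sketch()` is a hypothetical helper, the scaling follows stats::extractAIC() for linear models, and the package's own scaling of each criterion may differ by a monotone transformation.

```r
# Hypothetical helper: information criteria from the residual sum of
# squares `rss`, degrees of freedom `df`, and sample size `n`.
risk_sketch <- function(rss, df, n) {
  # Unknown-variance Gaussian likelihood term, additive constants dropped.
  fit_term <- n * log(rss / n)
  data.frame(
    df = df,
    AIC = fit_term + 2 * df,
    BIC = fit_term + log(n) * df,
    GCV = (rss / n) / (1 - df / n)^2
  )
}

set.seed(1)
n <- 50
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x)
# Two estimated coefficients (intercept and slope), so df = 2.
out <- risk_sketch(rss = sum(resid(fit)^2), df = 2, n = n)
```

For an ordinary linear model, the AIC and BIC columns agree with `stats::extractAIC(fit)` and `stats::extractAIC(fit, k = log(n))`. For a sparse group lasso fit, `df` would instead be the estimated degrees of freedom at each lambda.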
Liang, X., Cohen, A., Sólon Heinsfeld, A., Pestilli, F., and McDonald, D.J. 2024. sparsegl: An R Package for Estimating Sparse Group Lasso. Journal of Statistical Software, Vol. 110(6): 1–23. doi:10.18637/jss.v110.i06.
Vaiter S, Deledalle C, Peyré G, Fadili J, Dossal C. (2012). The Degrees of Freedom of the Group Lasso for a General Design. https://arxiv.org/abs/1212.6478.
See also the sparsegl() method.
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
fit1 <- sparsegl(X, y, group = groups)
estimate_risk(fit1, type = "AIC", approx_df = TRUE)
This function may be used to create potentially valid starting values for calling sparsegl() with a stats::family() object. It is not typically necessary to call this function (as it is used internally to create some), but in some cases, especially with custom generalized linear models, it may improve performance.
make_irls_warmup(nobs, nvars, b0 = 0, beta = double(nvars), r = double(nobs))
nobs |
Number of observations in the response (or rows in |
nvars |
Number of columns in |
b0 |
Scalar. Initial value for the intercept. |
beta |
Vector. Initial values for the coefficients. Must be length
|
r |
Vector. Initial values for the deviance residuals. Must be length
|
Occasionally, the irls fitting routine may fail with an admonition to create valid starting values.
List of class irlsspgl_warmup
Plot cross-validation curves from a cv.sparsegl object.
Plots the cross-validation curve, and upper and lower standard deviation curves, as a function of the lambda values used.
## S3 method for class 'cv.sparsegl'
plot(x, log_axis = c("xy", "x", "y", "none"), sign.lambda = 1, ...)
x |
Fitted |
log_axis |
Apply log scaling to the requested axes. |
sign.lambda |
Either plot against |
... |
Not used. |
A ggplot2::ggplot() plot is produced. Additional user modifications may be added as desired.
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
cv_fit <- cv.sparsegl(X, y, groups)
plot(cv_fit)
Plot coefficient profiles from a sparsegl object.
Produces a coefficient profile plot of a fitted sparsegl() object. The result is a ggplot2::ggplot(). Additional user modifications can be added as desired.
## S3 method for class 'sparsegl'
plot(
  x,
  y_axis = c("coef", "group"),
  x_axis = c("lambda", "penalty"),
  add_legend = n_legend_values < 20,
  ...
)
x |
Fitted |
y_axis |
Variable on the y-axis. Either the coefficients (default) or the group norm. |
x_axis |
Variable on the x-axis. Either the (log)-lambda sequence (default) or the value of the penalty. In the second case, the penalty is scaled by its maximum along the path. |
add_legend |
Show the legend. Often, with many groups/predictors, this can become overwhelming. The default produces a legend if the number of groups/predictors is less than 20. |
... |
Not used. |
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
fit1 <- sparsegl(X, y, group = groups)
plot(fit1, y_axis = "coef", x_axis = "penalty")
Make predictions from a cv.sparsegl object.
This function makes predictions from a cross-validated cv.sparsegl() object, using the stored sparsegl.fit object, and the value chosen for lambda.
## S3 method for class 'cv.sparsegl'
predict(
  object,
  newx,
  s = c("lambda.1se", "lambda.min"),
  type = c("link", "response", "coefficients", "nonzero", "class"),
  ...
)
object |
Fitted |
newx |
Matrix of new values for |
s |
Value(s) of the penalty parameter |
type |
Type of prediction required. Type |
... |
Not used. |
A matrix or vector of predicted values.
See also cv.sparsegl() and coef.cv.sparsegl().
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
fit1 <- sparsegl(X, y, group = groups)
cv_fit <- cv.sparsegl(X, y, groups)
predict(cv_fit, newx = X[50:60, ], s = "lambda.min")
Make predictions from a sparsegl object.
Similar to other predict methods, this function produces fitted values and class labels from a fitted sparsegl object.
## S3 method for class 'sparsegl'
predict(
  object,
  newx,
  s = NULL,
  type = c("link", "response", "coefficients", "nonzero", "class"),
  ...
)
object |
Fitted |
newx |
Matrix of new values for |
s |
Value(s) of the penalty parameter |
type |
Type of prediction required. Type |
... |
Not used. |
s is the new vector of lambda values at which predictions are requested. If s is not in the lambda sequence used for fitting the model, the coef function will use linear interpolation to make predictions. The new values are interpolated using a fraction of coefficients from both left and right lambda indices.
The object returned depends on type.
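How the returned object can depend on type is sketched below for a binomial model. This is an assumption-laden illustration rather than the package's code: `predict_sketch()` is a hypothetical helper, it follows the usual generalized linear model conventions (linear predictor for "link", inverse logit for "response", thresholding at zero for "class"), and the package may label classes differently.

```r
# Hypothetical helper following the usual GLM conventions for a
# binomial fit with intercept `b0` and coefficient vector `beta`.
predict_sketch <- function(newx, b0, beta,
                           type = c("link", "response", "class")) {
  type <- match.arg(type)
  eta <- drop(b0 + newx %*% beta) # linear predictor
  switch(type,
    link = eta,
    response = plogis(eta),     # inverse logit gives fitted probabilities
    class = as.integer(eta > 0) # equivalent to probability > 0.5
  )
}

newx <- rbind(c(1, 0), c(0, 1), c(-1, -1))
b0 <- 0.5
beta <- c(1, -2)
link <- predict_sketch(newx, b0, beta, type = "link")
prob <- predict_sketch(newx, b0, beta, type = "response")
cls <- predict_sketch(newx, b0, beta, type = "class")
```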
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
fit1 <- sparsegl(X, y, group = groups)
predict(fit1, newx = X[10, ], s = fit1$lambda[3:5])
Fits regularization paths for sparse group-lasso penalized learning problems at a sequence of regularization parameters lambda.
Note that the objective function for least squares is
Users can also tweak the penalty by choosing a different penalty factor.
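The least-squares objective mentioned above is commonly written as follows. This is a sketch of the standard sparse group lasso form, not a formula copied from the package: the symbols are assumptions, with $\alpha$ standing for asparse, $w_g$ for the entries of pf_group (by default the square root of the group size), and the per-coefficient weights pf_sparse left at their default of 1.

```latex
\min_{\beta_0,\,\beta}\;
  \frac{1}{2n}\,\bigl\lVert y - \beta_0 \mathbf{1} - X\beta \bigr\rVert_2^2
  \;+\; \lambda \sum_{g}
  \Bigl[ (1-\alpha)\, w_g \,\lVert \beta_g \rVert_2
         \;+\; \alpha\, \lVert \beta_g \rVert_1 \Bigr]
```

Setting $\alpha = 1$ reduces the penalty to the lasso, and $\alpha = 0$ reduces it to the group lasso.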
sparsegl(
  x,
  y,
  group = NULL,
  family = c("gaussian", "binomial"),
  nlambda = 100,
  lambda.factor = ifelse(nobs < nvars, 0.01, 1e-04),
  lambda = NULL,
  pf_group = sqrt(bs),
  pf_sparse = rep(1, nvars),
  intercept = TRUE,
  asparse = 0.05,
  standardize = TRUE,
  lower_bnd = -Inf,
  upper_bnd = Inf,
  weights = NULL,
  offset = NULL,
  warm = NULL,
  trace_it = 0,
  dfmax = as.integer(max(group)) + 1L,
  pmax = min(dfmax * 1.2, as.integer(max(group))),
  eps = 1e-08,
  maxit = 3e+06
)
x |
Double. A matrix of predictors, of dimension
|
y |
Double/Integer/Factor. The response variable.
Quantitative for |
group |
Integer. A vector of consecutive integers describing the grouping of the coefficients (see example below). |
family |
Character or function. Specifies the generalized linear model to use. Valid options are:
For any other type, a valid |
nlambda |
The number of |
lambda.factor |
A multiplicative factor for the minimal lambda in the
|
lambda |
A user supplied |
pf_group |
Penalty factor on the groups, a vector of the same
length as the total number of groups. Separate penalty weights can be applied
to each group of |
pf_sparse |
Penalty factor on l1-norm, a vector of the same length as the
total number of columns in |
intercept |
Whether to include intercept in the model. Default is TRUE. |
asparse |
The relative weight to put on the |
standardize |
Logical flag for variable standardization (scaling) prior to fitting the model. Default is TRUE. |
lower_bnd |
Lower bound for coefficient values, a vector of length 1
or of length the number of groups. Must be non-positive numbers only.
Default value for each entry is |
upper_bnd |
Upper bound for coefficient values, a vector of length 1
or of length the number of groups. Must be non-negative numbers only.
Default value for each entry is |
weights |
Double vector. Optional observation weights. These can
only be used with a |
offset |
Double vector. Optional offset (constant predictor without a
corresponding coefficient). These can only be used with a
|
warm |
List created with |
trace_it |
Scalar integer. Larger values print more output during
the irls loop. Typical values are |
dfmax |
Limit the maximum number of groups in the model. Default is no limit. |
pmax |
Limit the maximum number of groups ever to be nonzero. For example once a group enters the model, no matter how many times it exits or re-enters model through the path, it will be counted only once. |
eps |
Convergence termination tolerance. Default value is |
maxit |
Maximum number of outer-loop iterations allowed at fixed lambda
value. Default is |
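To illustrate how nlambda and lambda.factor could interact, the sketch below builds a decreasing, log-spaced grid from a largest value down to lambda.factor times that value. This assumes a glmnet-style construction, which is not confirmed here for sparsegl; `lambda_grid()` and `lambda_max` are hypothetical names, and in practice the largest value would be derived from the data.

```r
# Hypothetical helper: log-spaced grid from lambda_max down to
# lambda_max * lambda.factor (a glmnet-style assumption).
lambda_grid <- function(lambda_max, nlambda = 100, lambda.factor = 1e-04) {
  exp(seq(log(lambda_max), log(lambda_max * lambda.factor),
          length.out = nlambda))
}

nobs <- 100
nvars <- 20
# Same rule as the default in the usage above.
lambda.factor <- ifelse(nobs < nvars, 0.01, 1e-04)
lam <- lambda_grid(lambda_max = 2, nlambda = 100,
                   lambda.factor = lambda.factor)
```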
An object with S3 class "sparsegl". Among the list components:
call: The call that produced this object.
b0: Intercept sequence of length length(lambda).
beta: A p x length(lambda) sparse matrix of coefficients.
df: The number of features with nonzero coefficients for each value of lambda.
dim: Dimension of coefficient matrix.
lambda: The actual sequence of lambda values used.
npasses: Total number of iterations summed over all lambda values.
jerr: Error flag, for warnings and errors, 0 if no error.
group: A vector of consecutive integers describing the grouping of the coefficients.
nobs: The number of observations used to estimate the model.
If sparsegl() was called with a stats::family() method, this may also contain information about the deviance and the family used in fitting.
Liang, X., Cohen, A., Sólon Heinsfeld, A., Pestilli, F., and McDonald, D.J. 2024. sparsegl: An R Package for Estimating Sparse Group Lasso. Journal of Statistical Software, Vol. 110(6): 1–23. doi:10.18637/jss.v110.i06.
See also cv.sparsegl() and the plot(), predict(), and coef() methods for "sparsegl" objects.
n <- 100
p <- 20
X <- matrix(rnorm(n * p), nrow = n)
eps <- rnorm(n)
beta_star <- c(rep(5, 5), c(5, -5, 2, 0, 0), rep(-5, 5), rep(0, (p - 15)))
y <- X %*% beta_star + eps
groups <- rep(1:(p / 5), each = 5)
fit <- sparsegl(X, y, group = groups)
yp <- rpois(n, abs(X %*% beta_star))
fit_pois <- sparsegl(X, yp, group = groups, family = poisson())
A dataset containing a measurement of "trust" in experts along with other metrics collected through the Delphi Group at Carnegie Mellon University U.S. COVID-19 Trends and Impact Survey, in partnership with Facebook. This particular dataset is created from one of the public contingency tables, specifically, the breakdown by state, age, gender, and race/ethnicity published on 05 February 2022.
trust_experts
A data.frame with 9759 rows and 8 columns:
trust_experts: Real-valued. This is the average of pct_trust_covid_info_* where * is each of doctors, experts, cdc, and govt_health.
period: Factor. Start date of data collection period. There are 13 monthly periods.
region: Factor. State abbreviation.
age: Factor. Self-reported age bucket.
gender: Factor. Self-reported gender.
raceethnicity: Factor. Self-reported race or ethnicity.
cli: Real-valued. This is the wcli indicator measuring the percent of circulating Covid-like illness in a particular region. See the Delphi Epidata API for a complete description.
hh_cmnty_cli: Real-valued. This is the whh_cmnty_cli indicator measuring the percent of people reporting illness in their local community and household.
The U.S. COVID-19 Trends and Impact Survey.
The paper describing the survey:
Joshua A. Salomon, Alex Reinhart, Alyssa Bilinski, Eu Jing Chua, Wichada La Motte-Kerr, Minttu M. Rönn, Marissa Reitsma, Katherine Ann Morris, Sarah LaRocca, Tamar Farag, Frauke Kreuter, Roni Rosenfeld, and Ryan J. Tibshirani (2021). "The US COVID-19 Trends and Impact Survey: Continuous real-time measurement of COVID-19 symptoms, risks, protective behaviors, testing, and vaccination", Proceedings of the National Academy of Sciences 118 (51) e2111454118. doi:10.1073/pnas.2111454118.
The Public Delphi US CTIS Documentation
## Not run:
library(splines)
library(dplyr)
library(magrittr)
df <- 10
trust_experts <- trust_experts %>%
  mutate(across(
    where(is.factor),
    ~ set_attr(.x, "contrasts", contr.sum(nlevels(.x), FALSE, TRUE))
  ))
x <- Matrix::sparse.model.matrix(
  ~ 0 + region + age + gender + raceethnicity + period +
    bs(cli, df = df) + bs(hh_cmnty_cli, df = df),
  data = trust_experts, drop.unused.levels = TRUE
)
gr <- sapply(trust_experts, function(x) ifelse(is.factor(x), nlevels(x), NA))
gr <- rep(seq(ncol(trust_experts) - 1), times = c(gr[!is.na(gr)], df, df))
fit <- cv.sparsegl(x, trust_experts$trust_experts, gr)
## End(Not run)
Calculate different norms of vectors with or without grouping structures.
zero_norm(x)
one_norm(x)
two_norm(x)
grouped_zero_norm(x, gr)
grouped_one_norm(x, gr)
grouped_two_norm(x, gr)
grouped_sp_norm(x, gr, asparse)
gr_one_norm(x, gr)
gr_two_norm(x, gr)
sp_group_norm(x, gr, asparse = 0.05)
x |
A numeric vector. |
gr |
An integer (or factor) vector of the same length as x. |
asparse |
Scalar. The weight to put on the l1 norm when calculating the group norm. |
A numeric scalar or vector.
zero_norm(): l0-norm (number of nonzero entries).
one_norm(): l1-norm (absolute-value norm).
two_norm(): l2-norm (Euclidean norm).
grouped_zero_norm(): A vector of group-wise l0-norms.
grouped_one_norm(): A vector of group-wise l1-norms.
grouped_two_norm(): A vector of group-wise l2-norms.
grouped_sp_norm(): A vector with one entry per unique value of gr, consisting of the asparse convex combination of the l1 and l2-norm for each group.
gr_one_norm(): The l1-norm of a vector (a scalar).
gr_two_norm(): The sum of the group-wise l2-norms of a vector (a scalar).
sp_group_norm(): The sum of the asparse convex combination of the group-wise l1 and l2-norms (a scalar).
x <- c(rep(-1, 5), rep(0, 5), rep(1, 5))
gr <- c(rep(1, 5), rep(2, 5), rep(3, 5))
asparse <- 0.05
grouped_sp_norm(x, gr, asparse)
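The group-wise quantities described above can be reproduced by hand in base R. This is a sketch for illustration: `l1()`, `l2()`, and `group_sp()` are hypothetical re-implementations rather than the package's functions, and they assume the convex combination places weight asparse on the l1-norm and 1 - asparse on the l2-norm, as the argument description states.

```r
# Hypothetical re-implementations; the package exports its own versions.
l1 <- function(x) sum(abs(x))
l2 <- function(x) sqrt(sum(x^2))

# Per-group convex combination: asparse * l1 + (1 - asparse) * l2.
group_sp <- function(x, gr, asparse) {
  g1 <- tapply(x, gr, l1)
  g2 <- tapply(x, gr, l2)
  as.numeric(asparse * g1 + (1 - asparse) * g2)
}

x <- c(rep(-1, 5), rep(0, 5), rep(1, 5))
gr <- c(rep(1, 5), rep(2, 5), rep(3, 5))
per_group <- group_sp(x, gr, asparse = 0.05) # one value per group
total <- sum(per_group)                      # scalar analogue
```

The first and third groups each contain five entries of magnitude one, so both contribute 0.05 * 5 + 0.95 * sqrt(5), while the all-zero second group contributes nothing.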