Title:  Lasso and ElasticNet Regularized Generalized Linear Models 

Description:  Extremely efficient procedures for fitting the entire lasso or elasticnet regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multipleresponse Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the builtin families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a pathwise fashion, as described in the papers cited. 
Authors:  Jerome Friedman [aut], Trevor Hastie [aut, cre], Rob Tibshirani [aut], Balasubramanian Narasimhan [aut], Kenneth Tay [aut], Noah Simon [aut], Junyang Qian [ctb], James Yang [aut] 
Maintainer:  Trevor Hastie <[email protected]> 
License:  GPL2 
Version:  4.18 
Built:  20240404 08:37:52 UTC 
Source:  CRAN 
This package fits lasso and elasticnet model paths for regression, logistic and multinomial regression using coordinate descent. The algorithm is extremely fast, and exploits sparsity in the input x matrix where it exists. A variety of predictions can be made from the fitted models.
Package:  glmnet 
Type:  Package 
Version:  1.0 
Date:  20080514 
License:  What license is it under? 
Very simple to use. Accepts x,y
data for regression models, and
produces the regularization path over a grid of values for the tuning
parameter lambda
. Only 5 functions: glmnet
predict.glmnet
plot.glmnet
print.glmnet
coef.glmnet
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Maintainer:
Trevor Hastie [email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008)
Regularization Paths for Generalized Linear Models via Coordinate
Descent (2010), Journal of Statistical Software, Vol. 33(1), 122,
doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011)
Regularization Paths for Cox's Proportional
Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol.
39(5), 113,
doi:10.18637/jss.v039.i05.
Tibshirani,Robert, Bien, J., Friedman, J., Hastie, T.,Simon, N.,Taylor, J. and
Tibshirani, Ryan. (2012) Strong Rules for Discarding Predictors in
Lassotype Problems, JRSSB, Vol. 74(2), 245266,
https://arxiv.org/abs/1011.2234.
Hastie, T., Tibshirani, Robert and Tibshirani, Ryan (2020) Best Subset,
Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons,
Statist. Sc. Vol. 35(4), 579592,
https://arxiv.org/abs/1707.08692.
Glmnet webpage with four vignettes: https://glmnet.stanford.edu.
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
g2 = sample(1:2, 100, replace = TRUE)
g4 = sample(1:4, 100, replace = TRUE)
fit1 = glmnet(x, y)
predict(fit1, newx = x[1:5, ], s = c(0.01, 0.005))
predict(fit1, type = "coef")
plot(fit1, xvar = "lambda")
fit2 = glmnet(x, g2, family = "binomial")
predict(fit2, type = "response", newx = x[2:5, ])
predict(fit2, type = "nonzero")
fit3 = glmnet(x, g4, family = "multinomial")
predict(fit3, newx = x[1:3, ], type = "response", s = 0.01)
Given a test set, produce summary performance measures for the glmnet model(s)
assess.glmnet(
object,
newx = NULL,
newy,
weights = NULL,
family = c("gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian"),
...
)
confusion.glmnet(
object,
newx = NULL,
newy,
family = c("binomial", "multinomial"),
...
)
roc.glmnet(object, newx = NULL, newy, ...)
object 
Fitted 
newx 
If predictions are to made, these are the 'x' values. Required
for 
newy 
required argument for all functions; the new response values 
weights 
For observation weights for the test observations 
family 
The family of the model, in case predictions are passed in as 'object' 
... 
additional arguments to 
assess.glmnet
produces all the different performance measures
provided by cv.glmnet
for each of the families. A single vector, or a
matrix of predictions can be provided, or fitted model objects or CV
objects. In the case when the predictions are still to be made, the
...
arguments allow, for example, 'offsets' and other prediction
parameters such as values for 'gamma' for 'relaxed' fits. roc.glmnet
produces for a single vector a two column matrix with columns TPR and FPR
(true positive rate and false positive rate). This object can be plotted to
produce an ROC curve. If more than one predictions are called for, then a
list of such matrices is produced. confusion.glmnet
produces a
confusion matrix tabulating the classification results. Again, a single
table or a list, with a print method.
assess.glmnet
produces a list of vectors of measures.
roc.glmnet
a list of 'roc' twocolumn matrices, and
confusion.glmnet
a list of tables. If a single prediction is
provided, or predictions are made from a CV object, the latter two drop the
list status and produce a single matrix or table.
Trevor Hastie and Rob Tibshirani
Maintainer: Trevor Hastie
[email protected]
cv.glmnet
, glmnet.measures
and vignette("relax",package="glmnet")
data(QuickStartExample)
x < QuickStartExample$x; y < QuickStartExample$y
set.seed(11)
train = sample(seq(length(y)),70,replace=FALSE)
fit1 = glmnet(x[train,], y[train])
assess.glmnet(fit1, newx = x[train,], newy = y[train])
preds = predict(fit1, newx = x[train, ], s = c(1, 0.25))
assess.glmnet(preds, newy = y[train], family = "gaussian")
fit1c = cv.glmnet(x, y, keep = TRUE)
fit1a = assess.glmnet(fit1c$fit.preval, newy=y,family="gaussian")
plot(fit1c$lambda, log="x",fit1a$mae,xlab="Log Lambda",ylab="Mean Absolute Error")
abline(v=fit1c$lambda.min, lty=2, col="red")
data(BinomialExample)
x < BinomialExample$x; y < BinomialExample$y
fit2 = glmnet(x[train,], y[train], family = "binomial")
assess.glmnet(fit2,newx = x[train,], newy=y[train], s=0.1)
plot(roc.glmnet(fit2, newx = x[train,], newy=y[train])[[10]])
fit2c = cv.glmnet(x, y, family = "binomial", keep=TRUE)
idmin = match(fit2c$lambda.min, fit2c$lambda)
plot(roc.glmnet(fit2c$fit.preval, newy = y)[[idmin]])
data(MultinomialExample)
x < MultinomialExample$x; y < MultinomialExample$y
set.seed(103)
train = sample(seq(length(y)),100,replace=FALSE)
fit3 = glmnet(x[train,], y[train], family = "multinomial")
confusion.glmnet(fit3, newx = x[train, ], newy = y[train], s = 0.01)
fit3c = cv.glmnet(x, y, family = "multinomial", type.measure="class", keep=TRUE)
idmin = match(fit3c$lambda.min, fit3c$lambda)
confusion.glmnet(fit3c$fit.preval, newy = y, family="multinomial")[[idmin]]
Simple simulated data, used to demonstrate the features of glmnet
Data objects used to demonstrate features in the glmnet vignette
These datasets are artificial, and are used to test out some of the features of glmnet.
data(QuickStartExample)
x < QuickStartExample$x; y < QuickStartExample$y
glmnet(x, y)
glmnet
Fit a generalized linear model as in glmnet
but unpenalized. This
allows all the features of glmnet
such as sparse x, bounds on
coefficients, offsets, and so on.
bigGlm(x, ..., path = FALSE)
x 
input matrix 
... 
Most other arguments to glmnet that make sense 
path 
Since 
This is essentially the same as fitting a "glmnet" model with a single value
lambda=0
, but it avoids some edge cases. CAVEAT: If the user tries a
problem with N smaller than or close to p for some models, it is likely to
fail (and maybe not gracefully!) If so, use the path=TRUE
argument.
It returns an object of class "bigGlm" that inherits from class
"glmnet". That means it can be predicted from, coefficients extracted via
coef
. It has its own print method.
Trevor Hastie
Maintainer: Trevor Hastie
[email protected]
print
, predict
, and coef
methods.
# Gaussian
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit1 = bigGlm(x, y)
print(fit1)
fit2=bigGlm(x,y>0,family="binomial")
print(fit2)
fit2p=bigGlm(x,y>0,family="binomial",path=TRUE)
print(fit2p)
Randomly generated data for binomial regression example.
data(BinomialExample)
List containing the following elements:
100 by 30 matrix of numeric values.
Numeric vector of length 100 containing 44 zeros and 56 ones.
Computes Harrel's C index for predictions from a "coxnet"
object.
Cindex(pred, y, weights = rep(1, nrow(y)))
pred 
Predictions from a 
y 
a survival response object  a matrix with two columns "time" and "status"; see documentation for "glmnet" 
weights 
optional observation weights 
Computes the concordance index, taking into account censoring.
Trevor Hastie [email protected]
Harrel Jr, F. E. and Lee, K. L. and Mark, D. B. (1996) Tutorial in biostatistics: multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing error, Statistics in Medicine, 15, pages 361–387.
cv.glmnet
set.seed(10101)
N = 1000
p = 30
nzc = p/3
x = matrix(rnorm(N * p), N, p)
beta = rnorm(nzc)
fx = x[, seq(nzc)] %*% beta/3
hx = exp(fx)
ty = rexp(N, hx)
tcens = rbinom(n = N, prob = 0.3, size = 1) # censoring indicator
y = cbind(time = ty, status = 1  tcens) # y=Surv(ty,1tcens) with library(survival)
fit = glmnet(x, y, family = "cox")
pred = predict(fit, newx = x)
apply(pred, 2, Cindex, y=y)
cv.glmnet(x, y, family = "cox", type.measure = "C")
Similar to other predict methods, this functions predicts fitted values,
logits, coefficients and more from a fitted "glmnet"
object.
## S3 method for class 'glmnet'
coef(object, s = NULL, exact = FALSE, ...)
## S3 method for class 'glmnet'
predict(
object,
newx,
s = NULL,
type = c("link", "response", "coefficients", "nonzero", "class"),
exact = FALSE,
newoffset,
...
)
## S3 method for class 'relaxed'
predict(
object,
newx,
s = NULL,
gamma = 1,
type = c("link", "response", "coefficients", "nonzero", "class"),
exact = FALSE,
newoffset,
...
)
object 
Fitted 
s 
Value(s) of the penalty parameter 
exact 
This argument is relevant only when predictions are made at
values of 
... 
This is the mechanism for passing arguments like 
newx 
Matrix of new values for 
type 
Type of prediction required. Type 
newoffset 
If an offset is used in the fit, then one must be supplied
for making predictions (except for 
gamma 
Single value of 
The shape of the objects returned are different for "multinomial"
objects. This function actually calls NextMethod()
, and the
appropriate predict method is invoked for each of the three model types.
coef(...)
is equivalent to predict(type="coefficients",...)
The object returned depends on type.
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Maintainer:
Trevor Hastie [email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008)
Regularization Paths for Generalized Linear Models via Coordinate
Descent (2010), Journal of Statistical Software, Vol. 33(1), 122,
doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011)
Regularization Paths for Cox's Proportional
Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol.
39(5), 113,
doi:10.18637/jss.v039.i05.
Glmnet webpage with four vignettes, https://glmnet.stanford.edu.
glmnet
, and print
, and coef
methods, and
cv.glmnet
.
x=matrix(rnorm(100*20),100,20)
y=rnorm(100)
g2=sample(1:2,100,replace=TRUE)
g4=sample(1:4,100,replace=TRUE)
fit1=glmnet(x,y)
predict(fit1,newx=x[1:5,],s=c(0.01,0.005))
predict(fit1,type="coef")
fit2=glmnet(x,g2,family="binomial")
predict(fit2,type="response",newx=x[2:5,])
predict(fit2,type="nonzero")
fit3=glmnet(x,g4,family="multinomial")
predict(fit3,newx=x[1:3,],type="response",s=0.01)
Returns the elastic net objective function value for Cox regression model.
cox_obj_function(y, pred, weights, lambda, alpha, coefficients, vp)
y 
Survival response variable, must be a 
pred 
Model's predictions for 
weights 
Observation weights. 
lambda 
A single value for the 
alpha 
The elasticnet mixing parameter, with 
coefficients 
The model's coefficients. 
vp 
Penalty factors for each of the coefficients. 
Fit a Cox regression model via penalized maximum likelihood for a single value of lambda. Can deal with (start, stop] data and strata, as well as sparse design matrices.
cox.fit(
x,
y,
weights,
lambda,
alpha = 1,
offset = rep(0, nobs),
thresh = 1e10,
maxit = 1e+05,
penalty.factor = rep(1, nvars),
exclude = c(),
lower.limits = Inf,
upper.limits = Inf,
warm = NULL,
from.cox.path = FALSE,
save.fit = FALSE,
trace.it = 0
)
x 
Input matrix, of dimension 
y 
Survival response variable, must be a Surv or stratifySurv object. 
weights 
Observation weights. 
lambda 
A single value for the 
alpha 
See glmnet help file 
offset 
See glmnet help file 
thresh 
Convergence threshold for coordinate descent. Each inner
coordinatedescent loop continues until the maximum change in the objective
after any coefficient update is less than thresh times the null deviance.
Default value is 
maxit 
Maximum number of passes over the data; default is 
penalty.factor 
See glmnet help file 
exclude 
See glmnet help file 
lower.limits 
See glmnet help file 
upper.limits 
See glmnet help file 
warm 
Either a 
from.cox.path 
Was 
save.fit 
Return the warm start object? Default is FALSE. 
trace.it 
Controls how much information is printed to screen. If

WARNING: Users should not call cox.fit
directly. Higherlevel
functions in this package call cox.fit
as a subroutine. If a
warm start object is provided, some of the other arguments in the function
may be overriden.
cox.fit
solves the elastic net problem for a single, userspecified
value of lambda. cox.fit
works for Cox regression models, including
(start, stop] data and strata. It solves the problem using iteratively
reweighted least squares (IRLS). For each IRLS iteration, cox.fit
makes a quadratic (Newton) approximation of the loglikelihood, then calls
elnet.fit
to minimize the resulting approximation.
In terms of standardization: cox.fit
does not standardize x
and weights
. penalty.factor
is standardized so that they sum
up to nvars
.
An object with class "coxnet", "glmnetfit" and "glmnet". The list returned contains more keys than that of a "glmnet" object.
a0 
Intercept value, 
beta 
A 
df 
The number of nonzero coefficients. 
dim 
Dimension of coefficient matrix. 
lambda 
Lambda value used. 
dev.ratio 
The fraction of (null) deviance explained. The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat  loglike), where loglike_sat is the loglikelihood for the saturated model (a model with a free parameter per observation). Hence dev.ratio=1dev/nulldev. 
nulldev 
Null deviance (per observation). This is defined to be 2*(loglike_sat loglike(Null)). The null model refers to the 0 model. 
npasses 
Total passes over the data. 
jerr 
Error flag, for warnings and errors (largely for internal debugging). 
offset 
A logical variable indicating whether an offset was included in the model. 
call 
The call that produced this object. 
nobs 
Number of observations. 
warm_fit 
If 
family 
Family used for the model, always "cox". 
converged 
A logical variable: was the algorithm judged to have converged? 
boundary 
A logical variable: is the fitted value on the boundary of the attainable values? 
obj_function 
Objective function value at the solution. 
Fit a Cox regression model via penalized maximum likelihood for a path of lambda values. Can deal with (start, stop] data and strata, as well as sparse design matrices.
cox.path(
x,
y,
weights = NULL,
offset = NULL,
alpha = 1,
nlambda = 100,
lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e04),
lambda = NULL,
standardize = TRUE,
thresh = 1e10,
exclude = NULL,
penalty.factor = rep(1, nvars),
lower.limits = Inf,
upper.limits = Inf,
maxit = 1e+05,
trace.it = 0,
...
)
x 
See glmnet help file 
y 
Survival response variable, must be a 
weights 
See glmnet help file 
offset 
See glmnet help file 
alpha 
See glmnet help file 
nlambda 
See glmnet help file 
lambda.min.ratio 
See glmnet help file 
lambda 
See glmnet help file 
standardize 
See glmnet help file 
thresh 
Convergence threshold for coordinate descent. Each inner
coordinatedescent loop continues until the maximum change in the objective
after any coefficient update is less than thresh times the null deviance.
Default value is 
exclude 
See glmnet help file 
penalty.factor 
See glmnet help file 
lower.limits 
See glmnet help file 
upper.limits 
See glmnet help file 
maxit 
See glmnet help file 
trace.it 
Controls how much information is printed to screen. Default is

... 
Other arguments passed from glmnet (not used right now). 
Sometimes the sequence is truncated before nlambda
values of lambda
have been used. This happens when cox.path
detects that the
decrease in deviance is marginal (i.e. we are near a saturated fit).
An object of class "coxnet" and "glmnet".
a0 
Intercept value, 
beta 
A 
df 
The number of nonzero coefficients for each value of lambda. 
dim 
Dimension of coefficient matrix. 
lambda 
The actual sequence of lambda values used. When alpha=0, the largest lambda reported does not quite give the zero coefficients reported (lambda=inf would in principle). Instead, the largest lambda for alpha=0.001 is used, and the sequence of lambda values is derived from this. 
dev.ratio 
The fraction of (null) deviance explained. The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat  loglike), where loglike_sat is the loglikelihood for the saturated model (a model with a free parameter per observation). Hence dev.ratio=1dev/nulldev. 
nulldev 
Null deviance (per observation). This is defined to be 2*(loglike_sat loglike(Null)). The null model refers to the 0 model. 
npasses 
Total passes over the data summed over all lambda values. 
jerr 
Error flag, for warnings and errors (largely for internal debugging). 
offset 
A logical variable indicating whether an offset was included in the model. 
call 
The call that produced this object. 
nobs 
Number of observations. 
set.seed(2)
nobs < 100; nvars < 15
xvec < rnorm(nobs * nvars)
xvec[sample.int(nobs * nvars, size = 0.4 * nobs * nvars)] < 0
x < matrix(xvec, nrow = nobs)
beta < rnorm(nvars / 3)
fx < x[, seq(nvars / 3)] %*% beta / 3
ty < rexp(nobs, exp(fx))
tcens < rbinom(n = nobs, prob = 0.3, size = 1)
jsurv < survival::Surv(ty, tcens)
fit1 < glmnet:::cox.path(x, jsurv)
# works with sparse x matrix
x_sparse < Matrix::Matrix(x, sparse = TRUE)
fit2 < glmnet:::cox.path(x_sparse, jsurv)
# example with (start, stop] data
set.seed(2)
start_time < runif(100, min = 0, max = 5)
stop_time < start_time + runif(100, min = 0.1, max = 3)
status < rbinom(n = nobs, prob = 0.3, size = 1)
jsurv_ss < survival::Surv(start_time, stop_time, status)
fit3 < glmnet:::cox.path(x, jsurv_ss)
# example with strata
jsurv_ss2 < stratifySurv(jsurv_ss, rep(1:2, each = 50))
fit4 < glmnet:::cox.path(x, jsurv_ss2)
Randomly generated data for Cox regression example.
data(CoxExample)
List containing the following elements:
1,000 by 30 matrix of numeric values.
1,000 by 2 matrix with column names "time" and "status". The first column consists of positive numbers representing time to event, while the second column represents the status indicator (0=rightcensored, 1=observed).
Compute the gradient of the log partial likelihood at a particular fit for Cox model.
coxgrad(eta, y, w, std.weights = TRUE, diag.hessian = FALSE)
eta 
Fit vector (usually from glmnet at a particular lambda). 
y 
Survival response variable, must be a 
w 
Observation weights (default is all equal to 1). 
std.weights 
If TRUE (default), observation weights are standardized to sum to 1. 
diag.hessian 
If 
Compute a gradient vector at the fitted vector for the log partial likelihood.
This is like a residual vector, and useful for manual screening of
predictors for glmnet
in applications where p
is very large
(as in GWAS). Uses the Breslow approach to ties.
This function is essentially a wrapper: it checks whether the response provided is rightcensored or (start, stop] survival data, and calls the appropriate internal routine.
A single gradient vector the same length as eta
. If
diag.hessian=TRUE
, the diagonal of the Hessian is
included as an attribute "diag_hessian".
coxnet.deviance
set.seed(1)
eta < rnorm(10)
time < runif(10, min = 1, max = 10)
d < ifelse(rnorm(10) > 0, 1, 0)
y < survival::Surv(time, d)
coxgrad(eta, y)
# return diagonal of Hessian as well
coxgrad(eta, y, diag.hessian = TRUE)
# example with (start, stop] data
y2 < survival::Surv(time, time + runif(10), d)
coxgrad(eta, y2)
# example with strata
y2 < stratifySurv(y, rep(1:2, length.out = 10))
coxgrad(eta, y2)
Compute the deviance (2 log partial likelihood) for Cox model.
coxnet.deviance(
pred = NULL,
y,
x = NULL,
offset = NULL,
weights = NULL,
std.weights = TRUE,
beta = NULL
)
pred 
Fit vector or matrix (usually from glmnet at a particular lambda or a sequence of lambdas). 
y 
Survival response variable, must be a 
x 
Optional 
offset 
Optional offset vector. 
weights 
Observation weights (default is all equal to 1). 
std.weights 
If TRUE (default), observation weights are standardized to sum to 1. 
beta 
Optional coefficient vector/matrix, to be supplied if

Computes the deviance for a single set of predictions, or for a matrix
of predictions. The user can either supply the predictions
directly through the pred
option, or by supplying the x
matrix
and beta
coefficients. Uses the Breslow approach to ties.
The function first checks if pred
is passed: if so, it is used as
the predictions. If pred
is not passed but x
and beta
are passed, then these values are used to compute the predictions. If
neither x
nor beta
are passed, then the predictions are all
taken to be 0.
coxnet.deviance()
is a wrapper: it calls the appropriate internal
routine based on whether the response is rightcensored data or
(start, stop] survival data.
A vector of deviances, one for each column of predictions.
coxgrad
set.seed(1)
eta < rnorm(10)
time < runif(10, min = 1, max = 10)
d < ifelse(rnorm(10) > 0, 1, 0)
y < survival::Surv(time, d)
coxnet.deviance(pred = eta, y = y)
# if pred not provided, it is set to zero vector
coxnet.deviance(y = y)
# example with x and beta
x < matrix(rnorm(10 * 3), nrow = 10)
beta < matrix(1:3, ncol = 1)
coxnet.deviance(y = y, x = x, beta = beta)
# example with (start, stop] data
y2 < survival::Surv(time, time + runif(10), d)
coxnet.deviance(pred = eta, y = y2)
# example with strata
y2 < stratifySurv(y, rep(1:2, length.out = 10))
coxnet.deviance(pred = eta, y = y2)
Does kfold crossvalidation for glmnet, produces a plot, and returns a
value for lambda
(and gamma
if relax=TRUE
)
cv.glmnet(
x,
y,
weights = NULL,
offset = NULL,
lambda = NULL,
type.measure = c("default", "mse", "deviance", "class", "auc", "mae", "C"),
nfolds = 10,
foldid = NULL,
alignment = c("lambda", "fraction"),
grouped = TRUE,
keep = FALSE,
parallel = FALSE,
gamma = c(0, 0.25, 0.5, 0.75, 1),
relax = FALSE,
trace.it = 0,
...
)
x 

y 
response 
weights 
Observation weights; defaults to 1 per observation 
offset 
Offset vector (matrix) as in 
lambda 
Optional usersupplied lambda sequence; default is

type.measure 
loss to use for crossvalidation. Currently five
options, not all available for all models. The default is

nfolds 
number of folds  default is 10. Although 
foldid 
an optional vector of values between 1 and 
alignment 
This is an experimental argument, designed to fix the
problems users were having with CV, with possible values 
grouped 
This is an experimental argument, with default 
keep 
If 
parallel 
If 
gamma 
The values of the parameter for mixing the relaxed fit with the
regularized fit, between 0 and 1; default is 
relax 
If 
trace.it 
If 
... 
Other arguments that can be passed to 
The function runs glmnet
nfolds
+1 times; the first to get the
lambda
sequence, and then the remainder to compute the fit with each
of the folds omitted. The error is accumulated, and the average error and
standard deviation over the folds is computed. Note that cv.glmnet
does NOT search for values for alpha
. A specific value should be
supplied, else alpha=1
is assumed by default. If users would like to
crossvalidate alpha
as well, they should call cv.glmnet
with
a precomputed vector foldid
, and then use this same fold vector in
separate calls to cv.glmnet
with different values of alpha
.
Note also that the results of cv.glmnet
are random, since the folds
are selected at random. Users can reduce this randomness by running
cv.glmnet
many times, and averaging the error curves.
If relax=TRUE
then the values of gamma
are used to mix the
fits. If $\eta$
is the fit for lasso/elastic net, and $\eta_R$
is
the relaxed fit (with unpenalized coefficients), then a relaxed fit mixed by
$\gamma$
is
$\eta(\gamma)=(1\gamma)\eta_R+\gamma\eta.$
There is
practically no extra cost for having a lot of values for gamma
.
However, 5 seems sufficient for most purposes. CV then selects both
gamma
and lambda
.
an object of class "cv.glmnet"
is returned, which is a list
with the ingredients of the crossvalidation fit. If the object was created
with relax=TRUE
then this class has a prefix class of
"cv.relaxed"
.
lambda 
the values of 
cvm 
The mean crossvalidated error  a vector of length

cvsd 
estimate of standard error of

cvup 
upper curve = 
cvlo 
lower
curve = 
nzero 
number of nonzero coefficients at
each 
name 
a text string indicating type of measure (for plotting purposes). 
glmnet.fit 
a fitted glmnet object for the full data. 
lambda.min 
value of 
lambda.1se 
largest value of 
fit.preval 
if

foldid 
if 
index 
a one column matrix with the indices of 
relaxed 
if 
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Noah Simon
helped develop the 'coxnet' function.
Jeffrey Wong and B. Narasimhan
helped with the parallel option
Maintainer: Trevor Hastie
[email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008)
Regularization Paths for Generalized Linear Models via Coordinate
Descent (2010), Journal of Statistical Software, Vol. 33(1), 122,
doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011)
Regularization Paths for Cox's Proportional
Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol.
39(5), 113,
doi:10.18637/jss.v039.i05.
glmnet
and plot
, predict
, and coef
methods for "cv.glmnet"
and "cv.relaxed"
objects.
set.seed(1010)
n = 1000
p = 100
nzc = trunc(p/10)
x = matrix(rnorm(n * p), n, p)
beta = rnorm(nzc)
fx = x[, seq(nzc)] %*% beta
eps = rnorm(n) * 5
y = drop(fx + eps)
px = exp(fx)
px = px/(1 + px)
ly = rbinom(n = length(px), prob = px, size = 1)
set.seed(1011)
cvob1 = cv.glmnet(x, y)
plot(cvob1)
coef(cvob1)
predict(cvob1, newx = x[1:5, ], s = "lambda.min")
title("Gaussian Family", line = 2.5)
set.seed(1011)
cvob1a = cv.glmnet(x, y, type.measure = "mae")
plot(cvob1a)
title("Gaussian Family", line = 2.5)
set.seed(1011)
par(mfrow = c(2, 2), mar = c(4.5, 4.5, 4, 1))
cvob2 = cv.glmnet(x, ly, family = "binomial")
plot(cvob2)
title("Binomial Family", line = 2.5)
frame()
set.seed(1011)
cvob3 = cv.glmnet(x, ly, family = "binomial", type.measure = "class")
plot(cvob3)
title("Binomial Family", line = 2.5)
## Not run:
cvob1r = cv.glmnet(x, y, relax = TRUE)
plot(cvob1r)
predict(cvob1r, newx = x[, 1:5])
set.seed(1011)
cvob3a = cv.glmnet(x, ly, family = "binomial", type.measure = "auc")
plot(cvob3a)
title("Binomial Family", line = 2.5)
set.seed(1011)
mu = exp(fx/10)
y = rpois(n, mu)
cvob4 = cv.glmnet(x, y, family = "poisson")
plot(cvob4)
title("Poisson Family", line = 2.5)
# Multinomial
n = 500
p = 30
nzc = trunc(p/10)
x = matrix(rnorm(n * p), n, p)
beta3 = matrix(rnorm(30), 10, 3)
beta3 = rbind(beta3, matrix(0, p  10, 3))
f3 = x %*% beta3
p3 = exp(f3)
p3 = p3/apply(p3, 1, sum)
g3 = glmnet:::rmult(p3)
set.seed(10101)
cvfit = cv.glmnet(x, g3, family = "multinomial")
plot(cvfit)
title("Multinomial Family", line = 2.5)
# Cox
beta = rnorm(nzc)
fx = x[, seq(nzc)] %*% beta/3
hx = exp(fx)
ty = rexp(n, hx)
tcens = rbinom(n = n, prob = 0.3, size = 1) # censoring indicator
y = cbind(time = ty, status = 1  tcens) # y=Surv(ty,1tcens) with library(survival)
foldid = sample(rep(seq(10), length = n))
fit1_cv = cv.glmnet(x, y, family = "cox", foldid = foldid)
plot(fit1_cv)
title("Cox Family", line = 2.5)
# Parallel
require(doMC)
registerDoMC(cores = 4)
x = matrix(rnorm(1e+05 * 100), 1e+05, 100)
y = rnorm(1e+05)
system.time(cv.glmnet(x, y))
system.time(cv.glmnet(x, y, parallel = TRUE))
## End(Not run)
Returns the elastic net deviance value.
dev_function(y, mu, weights, family)
y 
Quantitative response variable. 
mu 
Model's predictions for 
weights 
Observation weights. 
family 
A description of the error distribution and link function to be used in the model. This is the result of a call to a family function. 
Compute the deviance sequence from the glmnet object
## S3 method for class 'glmnet'
deviance(object, ...)
object 
fitted glmnet object 
... 
additional print arguments 
A glmnet object has components dev.ratio
and nulldev
. The
former is the fraction of (null) deviance explained. The deviance
calculations incorporate weights if present in the model. The deviance is
defined to be 2*(loglike_sat  loglike), where loglike_sat is the
loglikelihood for the saturated model (a model with a free parameter per
observation). Null deviance is defined to be 2*(loglike_sat
loglike(Null)); The NULL model refers to the intercept model, except for
the Cox, where it is the 0 model. Hence dev.ratio=1deviance/nulldev, and
this deviance
method returns (1dev.ratio)*nulldev.
(1dev.ratio)*nulldev
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Maintainer:
Trevor Hastie [email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008) Regularization Paths for Generalized Linear Models via Coordinate Descent
glmnet
, predict
, print
, and coef
methods.
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit1 = glmnet(x, y)
deviance(fit1)
Solves the weighted least squares (WLS) problem for a single lambda value. Internal function that users should not call directly.
elnet.fit(
x,
y,
weights,
lambda,
alpha = 1,
intercept = TRUE,
thresh = 1e07,
maxit = 1e+05,
penalty.factor = rep(1, nvars),
exclude = c(),
lower.limits = Inf,
upper.limits = Inf,
warm = NULL,
from.glmnet.fit = FALSE,
save.fit = FALSE
)
x 
Input matrix, of dimension 
y 
Quantitative response variable. 
weights 
Observation weights. 
lambda 
A single value for the 
alpha 
The elasticnet mixing parameter, with

intercept 
Should intercept be fitted (default=TRUE) or set to zero (FALSE)? 
thresh 
Convergence threshold for coordinate descent. Each inner
coordinatedescent loop continues until the maximum change in the objective
after any coefficient update is less than thresh times the null deviance.
Default value is 
maxit 
Maximum number of passes over the data; default is 
penalty.factor 
Separate penalty factors can be applied to each
coefficient. This is a number that multiplies 
exclude 
Indices of variables to be excluded from the model. Default is none. Equivalent to an infinite penalty factor. 
lower.limits 
Vector of lower limits for each coefficient; default

upper.limits 
Vector of upper limits for each coefficient; default

warm 
Either a 
from.glmnet.fit 
Was 
save.fit 
Return the warm start object? Default is FALSE. 
WARNING: Users should not call elnet.fit
directly. Higherlevel functions
in this package call elnet.fit
as a subroutine. If a warm start object
is provided, some of the other arguments in the function may be overriden.
elnet.fit
is essentially a wrapper around a C++ subroutine which
minimizes
$1/2 \sum w_i (y_i  X_i^T \beta)^2 + \sum \lambda \gamma_j
[(1\alpha)/2 \beta^2+\alpha\beta],$
over $\beta$
, where $\gamma_j$
is the relative penalty factor on the
jth variable. If intercept = TRUE
, then the term in the first sum is
$w_i (y_i  \beta_0  X_i^T \beta)^2$
, and we are minimizing over both
$\beta_0$
and $\beta$
.
None of the inputs are standardized except for penalty.factor
, which
is standardized so that they sum up to nvars
.
An object with class "glmnetfit" and "glmnet". The list returned has
the same keys as that of a glmnet
object, except that it might have an
additional warm_fit
key.
a0 
Intercept value. 
beta 
A 
df 
The number of nonzero coefficients. 
dim 
Dimension of coefficient matrix. 
lambda 
Lambda value used. 
dev.ratio 
The fraction of (null) deviance explained. The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat  loglike), where loglike_sat is the loglikelihood for the saturated model (a model with a free parameter per observation). Hence dev.ratio=1dev/nulldev. 
nulldev 
Null deviance (per observation). This is defined to be 2*(loglike_sat loglike(Null)). The null model refers to the intercept model. 
npasses 
Total passes over the data. 
jerr 
Error flag, for warnings and errors (largely for internal debugging). 
offset 
Always FALSE, since offsets do not appear in the WLS problem. Included for compability with glmnet output. 
call 
The call that produced this object. 
nobs 
Number of observations. 
warm_fit 
If 
Helps to find ties in death times of data.
fid(x, index)
x 
Sorted vector of death times. 
index 
Vector of indices for the death times. 
A list with two arguments.
index_first 
A vector of indices for the first observation at each death time as they appear in the sorted list. 
index_ties 
If there are no ties at all, this is NULL. If not, this is a list with length equal to the number of unique times with ties. For each time with ties, index_ties gives the indices of the observations with a death at that time. 
# Example with no ties
glmnet:::fid(c(1, 4, 5, 6), 1:5)
# Example with ties
glmnet:::fid(c(1, 1, 1, 2, 3, 3, 4, 4, 4), 1:9)
Return the lambda max value for Cox regression model, used for computing initial lambda values. For internal use only.
get_cox_lambda_max(
x,
y,
alpha,
weights = rep(1, nrow(x)),
offset = rep(0, nrow(x)),
exclude = c(),
vp = rep(1, ncol(x))
)
x 
Input matrix, of dimension 
y 
Survival response variable, must be a 
alpha 
The elasticnet mixing parameter, with 
weights 
Observation weights. 
offset 
Offset for the model. Default is a zero vector of length

exclude 
Indices of variables to be excluded from the model. 
vp 
Separate penalty factors can be applied to each coefficient. 
This function is called by cox.path
for the value of lambda max.
When x
is not sparse, it is expected to already by centered and scaled.
When x
is sparse, the function will get its attributes xm
and
xs
for its centering and scaling factors. The value of
lambda_max
changes depending on whether x
is centered and
scaled or not, so we need xm
and xs
to get the correct value.
Given x, coefficients and intercept, return linear predictions. Wrapper that works with both regular and sparse x. Only works for single set of coefficients and intercept.
get_eta(x, beta, a0)
x 
Input matrix, of dimension 
beta 
Feature coefficients. 
a0 
Intercept. 
Return the null deviance, starting mu and lambda max values for initialization. For internal use only.
get_start(
x,
y,
weights,
family,
intercept,
is.offset,
offset,
exclude,
vp,
alpha
)
x 
Input matrix, of dimension 
y 
Quantitative response variable. 
weights 
Observation weights. 
family 
A description of the error distribution and link function to be
used in the model. This is the result of a call to a family function.
(See 
intercept 
Does the model we are fitting have an intercept term or not? 
is.offset 
Is the model being fit with an offset or not? 
offset 
Offset for the model. If 
exclude 
Indices of variables to be excluded from the model. 
vp 
Separate penalty factors can be applied to each coefficient. 
alpha 
The elasticnet mixing parameter, with 
This function is called by glmnet.path
for null deviance, starting mu
and lambda max values. It is also called by glmnet.fit
when used
without warmstart, but they only use the null deviance and starting mu values.
When x
is not sparse, it is expected to already by centered and scaled.
When x
is sparse, the function will get its attributes xm
and
xs
for its centering and scaling factors.
Note that whether x
is centered & scaled or not, the values of mu
and nulldev
don't change. However, the value of lambda_max
does
change, and we need xm
and xs
to get the correct value.
Fit a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso or elasticnet penalty at a grid of values for the regularization parameter lambda. Can deal with all shapes of data, including very large sparse data matrices. Fits linear, logistic and multinomial, poisson, and Cox regression models.
glmnet(
x,
y,
family = c("gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian"),
weights = NULL,
offset = NULL,
alpha = 1,
nlambda = 100,
lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e04),
lambda = NULL,
standardize = TRUE,
intercept = TRUE,
thresh = 1e07,
dfmax = nvars + 1,
pmax = min(dfmax * 2 + 20, nvars),
exclude = NULL,
penalty.factor = rep(1, nvars),
lower.limits = Inf,
upper.limits = Inf,
maxit = 1e+05,
type.gaussian = ifelse(nvars < 500, "covariance", "naive"),
type.logistic = c("Newton", "modified.Newton"),
standardize.response = FALSE,
type.multinomial = c("ungrouped", "grouped"),
relax = FALSE,
trace.it = 0,
...
)
relax.glmnet(fit, x, ..., maxp = n  3, path = FALSE, check.args = TRUE)
x 
input matrix, of dimension nobs x nvars; each row is an observation
vector. Can be in sparse matrix format (inherit from class

y 
response variable. Quantitative for 
family 
Either a character string representing
one of the builtin families, or else a 
weights 
observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation 
offset 
A vector of length 
alpha 
The elasticnet mixing parameter, with

nlambda 
The number of 
lambda.min.ratio 
Smallest value for 
lambda 
A user supplied 
standardize 
Logical flag for x variable standardization, prior to
fitting the model sequence. The coefficients are always returned on the
original scale. Default is 
intercept 
Should intercept(s) be fitted (default=TRUE) or set to zero (FALSE) 
thresh 
Convergence threshold for coordinate descent. Each inner
coordinatedescent loop continues until the maximum change in the objective
after any coefficient update is less than 
dfmax 
Limit the maximum number of variables in the model. Useful for
very large 
pmax 
Limit the maximum number of variables ever to be nonzero 
exclude 
Indices of variables to be excluded from the model. Default
is none. Equivalent to an infinite penalty factor for the variables excluded (next item).
Users can supply instead an 
penalty.factor 
Separate penalty factors can be applied to each
coefficient. This is a number that multiplies 
lower.limits 
Vector of lower limits for each coefficient; default

upper.limits 
Vector of upper limits for each coefficient; default

maxit 
Maximum number of passes over the data for all lambda values; default is 10^5. 
type.gaussian 
Two algorithm types are supported for (only)

type.logistic 
If 
standardize.response 
This is for the 
type.multinomial 
If 
relax 
If 
trace.it 
If 
... 
Additional argument used in 
fit 
For 
maxp 
a limit on how many relaxed coefficients are allowed. Default is 'n3', where 'n' is the sample size. This may not be sufficient for nongaussian familes, in which case users should supply a smaller value. This argument can be supplied directly to 'glmnet'. 
path 
Since 
check.args 
Should 
The sequence of models implied by lambda
is fit by coordinate
descent. For family="gaussian"
this is the lasso sequence if
alpha=1
, else it is the elasticnet sequence.
The objective function for "gaussian"
is
$1/2 RSS/nobs +
\lambda*penalty,$
and for the other models it is
$loglik/nobs +
\lambda*penalty.$
Note also that for "gaussian"
, glmnet
standardizes y to have unit variance (using 1/n rather than 1/(n1) formula)
before computing its lambda sequence (and then unstandardizes the resulting
coefficients); if you wish to reproduce/compare results with other software,
best to supply a standardized y. The coefficients for any predictor
variables with zero variance are set to zero for all values of lambda.
family
optionFrom version 4.0 onwards, glmnet supports both the original builtin families,
as well as any family object as used by stats:glm()
.
This opens the door to a wide variety of additional models. For example
family=binomial(link=cloglog)
or family=negative.binomial(theta=1.5)
(from the MASS library).
Note that the code runs faster for the builtin families.
The built in families are specifed via a character string. For all families,
the object produced is a lasso or elasticnet regularization path for fitting the
generalized linear regression paths, by maximizing the appropriate penalized
loglikelihood (partial likelihood for the "cox" model). Sometimes the
sequence is truncated before nlambda
values of lambda
have
been used, because of instabilities in the inverse link functions near a
saturated fit. glmnet(...,family="binomial")
fits a traditional
logistic regression model for the logodds.
glmnet(...,family="multinomial")
fits a symmetric multinomial model,
where each class is represented by a linear model (on the logscale). The
penalties take care of redundancies. A twoclass "multinomial"
model
will produce the same fit as the corresponding "binomial"
model,
except the pair of coefficient matrices will be equal in magnitude and
opposite in sign, and half the "binomial"
values.
Two useful additional families are the family="mgaussian"
family and
the type.multinomial="grouped"
option for multinomial fitting. The
former allows a multiresponse gaussian model to be fit, using a "group
lasso" penalty on the coefficients for each variable. Tying the responses
together like this is called "multitask" learning in some domains. The
grouped multinomial allows the same penalty for the
family="multinomial"
model, which is also multiresponsed. For both
of these the penalty on the coefficient vector for variable j is
$(1\alpha)/2\beta_j_2^2+\alpha\beta_j_2.$
When alpha=1
this is a grouplasso penalty, and otherwise it mixes with quadratic just
like elasticnet. A small detail in the Cox model: if death times are tied
with censored times, we assume the censored times occurred just
before the death times in computing the Breslow approximation; if
users prefer the usual convention of after, they can add a small
number to all censoring times to achieve this effect.
family="cox"
For Cox models, the response should preferably be a Surv
object,
created by the Surv()
function in survival package. For
rightcensored data, this object should have type "right", and for
(start, stop] data, it should have type "counting". To fit stratified Cox
models, strata should be added to the response via the stratifySurv()
function before passing the response to glmnet()
. (For backward
compatibility, rightcensored data can also be passed as a
twocolumn matrix with columns named 'time' and 'status'. The
latter is a binary variable, with '1' indicating death, and '0' indicating
right censored.)
relax
optionIf relax=TRUE
a duplicate sequence of models is produced, where each active set in the
elasticnet path is refit without regularization. The result of this is a
matching "glmnet"
object which is stored on the original object in a
component named "relaxed"
, and is part of the glmnet output.
Generally users will not call relax.glmnet
directly, unless the
original 'glmnet' object took a long time to fit. But if they do, they must
supply the fit, and all the original arguments used to create that fit. They
can limit the length of the relaxed path via 'maxp'.
An object with S3 class "glmnet","*"
, where "*"
is
"elnet"
, "lognet"
, "multnet"
, "fishnet"
(poisson), "coxnet"
or "mrelnet"
for the various types of
models. If the model was created with relax=TRUE
then this class has
a prefix class of "relaxed"
.
call 
the call that produced this object 
a0 
Intercept sequence of length 
beta 
For 
lambda 
The actual sequence of 
dev.ratio 
The
fraction of (null) deviance explained (for 
nulldev 
Null deviance (per observation). This is defined to be 2*(loglike_sat loglike(Null)); The NULL model refers to the intercept model, except for the Cox, where it is the 0 model. 
df 
The number of
nonzero coefficients for each value of 
dfmat 
For 
dim 
dimension of coefficient matrix (ices) 
nobs 
number of observations 
npasses 
total passes over the data summed over all lambda values 
offset 
a logical variable indicating whether an offset was included in the model 
jerr 
error flag, for warnings and errors (largely for internal debugging). 
relaxed 
If 
Jerome Friedman, Trevor Hastie, Balasubramanian Narasimhan, Noah
Simon, Kenneth Tay and Rob Tibshirani
Maintainer: Trevor Hastie
[email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008)
Regularization Paths for Generalized Linear Models via Coordinate
Descent (2010), Journal of Statistical Software, Vol. 33(1), 122,
doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011)
Regularization Paths for Cox's Proportional
Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol.
39(5), 113,
doi:10.18637/jss.v039.i05.
Tibshirani,Robert, Bien, J., Friedman, J., Hastie, T.,Simon, N.,Taylor, J. and
Tibshirani, Ryan. (2012) Strong Rules for Discarding Predictors in
Lassotype Problems, JRSSB, Vol. 74(2), 245266,
https://arxiv.org/abs/1011.2234.
Hastie, T., Tibshirani, Robert and Tibshirani, Ryan (2020) Best Subset,
Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons,
Statist. Sc. Vol. 35(4), 579592,
https://arxiv.org/abs/1707.08692.
Glmnet webpage with four vignettes: https://glmnet.stanford.edu.
print
, predict
, coef
and plot
methods,
and the cv.glmnet
function.
# Gaussian
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit1 = glmnet(x, y)
print(fit1)
coef(fit1, s = 0.01) # extract coefficients at a single value of lambda
predict(fit1, newx = x[1:10, ], s = c(0.01, 0.005)) # make predictions
# Relaxed
fit1r = glmnet(x, y, relax = TRUE) # can be used with any model
# multivariate gaussian
y = matrix(rnorm(100 * 3), 100, 3)
fit1m = glmnet(x, y, family = "mgaussian")
plot(fit1m, type.coef = "2norm")
# binomial
g2 = sample(c(0,1), 100, replace = TRUE)
fit2 = glmnet(x, g2, family = "binomial")
fit2n = glmnet(x, g2, family = binomial(link=cloglog))
fit2r = glmnet(x,g2, family = "binomial", relax=TRUE)
fit2rp = glmnet(x,g2, family = "binomial", relax=TRUE, path=TRUE)
# multinomial
g4 = sample(1:4, 100, replace = TRUE)
fit3 = glmnet(x, g4, family = "multinomial")
fit3a = glmnet(x, g4, family = "multinomial", type.multinomial = "grouped")
# poisson
N = 500
p = 20
nzc = 5
x = matrix(rnorm(N * p), N, p)
beta = rnorm(nzc)
f = x[, seq(nzc)] %*% beta
mu = exp(f)
y = rpois(N, mu)
fit = glmnet(x, y, family = "poisson")
plot(fit)
pfit = predict(fit, x, s = 0.001, type = "response")
plot(pfit, y)
# Cox
set.seed(10101)
N = 1000
p = 30
nzc = p/3
x = matrix(rnorm(N * p), N, p)
beta = rnorm(nzc)
fx = x[, seq(nzc)] %*% beta/3
hx = exp(fx)
ty = rexp(N, hx)
tcens = rbinom(n = N, prob = 0.3, size = 1) # censoring indicator
y = cbind(time = ty, status = 1  tcens) # y=Surv(ty,1tcens) with library(survival)
fit = glmnet(x, y, family = "cox")
plot(fit)
# Cox example with (start, stop] data
set.seed(2)
nobs < 100; nvars < 15
xvec < rnorm(nobs * nvars)
xvec[sample.int(nobs * nvars, size = 0.4 * nobs * nvars)] < 0
x < matrix(xvec, nrow = nobs)
start_time < runif(100, min = 0, max = 5)
stop_time < start_time + runif(100, min = 0.1, max = 3)
status < rbinom(n = nobs, prob = 0.3, size = 1)
jsurv_ss < survival::Surv(start_time, stop_time, status)
fit < glmnet(x, jsurv_ss, family = "cox")
# Cox example with strata
jsurv_ss2 < stratifySurv(jsurv_ss, rep(1:2, each = 50))
fit < glmnet(x, jsurv_ss2, family = "cox")
# Sparse
n = 10000
p = 200
nzc = trunc(p/10)
x = matrix(rnorm(n * p), n, p)
iz = sample(1:(n * p), size = n * p * 0.85, replace = FALSE)
x[iz] = 0
sx = Matrix(x, sparse = TRUE)
inherits(sx, "sparseMatrix") #confirm that it is sparse
beta = rnorm(nzc)
fx = x[, seq(nzc)] %*% beta
eps = rnorm(n)
y = fx + eps
px = exp(fx)
px = px/(1 + px)
ly = rbinom(n = length(px), prob = px, size = 1)
system.time(fit1 < glmnet(sx, y))
system.time(fit2n < glmnet(x, y))
View and/or change the factory default parameters in glmnet
glmnet.control(
fdev = 1e05,
devmax = 0.999,
eps = 1e06,
big = 9.9e+35,
mnlam = 5,
pmin = 1e09,
exmx = 250,
prec = 1e10,
mxit = 100,
itrace = 0,
epsnr = 1e06,
mxitnr = 25,
factory = FALSE
)
fdev 
minimum fractional change in deviance for stopping path; factory default = 1.0e5 
devmax 
maximum fraction of explained deviance for stopping path; factory default = 0.999 
eps 
minimum value of lambda.min.ratio (see glmnet); factory default= 1.0e6 
big 
large floating point number; factory default = 9.9e35. Inf in definition of upper.limit is set to big 
mnlam 
minimum number of path points (lambda values) allowed; factory default = 5 
pmin 
minimum probability for any class. factory default = 1.0e9. Note that this implies a pmax of 1pmin. 
exmx 
maximum allowed exponent. factory default = 250.0 
prec 
convergence threshold for multi response bounds adjustment solution. factory default = 1.0e10 
mxit 
maximum iterations for multiresponse bounds adjustment solution. factory default = 100 
itrace 
If 1 then progress bar is displayed when running 
epsnr 
convergence threshold for 
mxitnr 
maximum iterations for the IRLS loop in 
factory 
If 
If called with no arguments, glmnet.control()
returns a list with the
current settings of these parameters. Any arguments included in the call
sets those parameters to the new values, and then silently returns. The
values set are persistent for the duration of the R session.
A list with named elements as in the argument list
Jerome Friedman, Kenneth Tay, Trevor Hastie
Maintainer: Trevor Hastie
[email protected]
glmnet
glmnet.control(fdev = 0) #continue along path even though not much changes
glmnet.control() # view current settings
glmnet.control(factory = TRUE) # reset all the parameters to their default
Fit a generalized linear model via penalized maximum likelihood for a single value of lambda. Can deal with any GLM family.
glmnet.fit(
x,
y,
weights,
lambda,
alpha = 1,
offset = rep(0, nobs),
family = gaussian(),
intercept = TRUE,
thresh = 1e10,
maxit = 1e+05,
penalty.factor = rep(1, nvars),
exclude = c(),
lower.limits = Inf,
upper.limits = Inf,
warm = NULL,
from.glmnet.path = FALSE,
save.fit = FALSE,
trace.it = 0
)
x 
Input matrix, of dimension 
y 
Quantitative response variable. 
weights 
Observation weights. 
lambda 
A single value for the 
alpha 
The elasticnet mixing parameter, with

offset 
A vector of length 
family 
A description of the error distribution and link function to be
used in the model. This is the result of a call to a family function. Default
is 
intercept 
Should intercept be fitted (default=TRUE) or set to zero (FALSE)? 
thresh 
Convergence threshold for coordinate descent. Each inner
coordinatedescent loop continues until the maximum change in the objective
after any coefficient update is less than thresh times the null deviance.
Default value is 
maxit 
Maximum number of passes over the data; default is 
penalty.factor 
Separate penalty factors can be applied to each
coefficient. This is a number that multiplies 
exclude 
Indices of variables to be excluded from the model. Default is none. Equivalent to an infinite penalty factor. 
lower.limits 
Vector of lower limits for each coefficient; default

upper.limits 
Vector of upper limits for each coefficient; default

warm 
Either a 
from.glmnet.path 
Was 
save.fit 
Return the warm start object? Default is FALSE. 
trace.it 
Controls how much information is printed to screen. If

WARNING: Users should not call glmnet.fit
directly. Higherlevel functions
in this package call glmnet.fit
as a subroutine. If a warm start object
is provided, some of the other arguments in the function may be overriden.
glmnet.fit
solves the elastic net problem for a single, userspecified
value of lambda. glmnet.fit
works for any GLM family. It solves the
problem using iteratively reweighted least squares (IRLS). For each IRLS
iteration, glmnet.fit
makes a quadratic (Newton) approximation of the
loglikelihood, then calls elnet.fit
to minimize the resulting
approximation.
In terms of standardization: glmnet.fit
does not standardize x
and weights
. penalty.factor
is standardized so that they sum up
to nvars
.
An object with class "glmnetfit" and "glmnet". The list returned contains more keys than that of a "glmnet" object.
a0 
Intercept value. 
beta 
A 
df 
The number of nonzero coefficients. 
dim 
Dimension of coefficient matrix. 
lambda 
Lambda value used. 
dev.ratio 
The fraction of (null) deviance explained. The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat  loglike), where loglike_sat is the loglikelihood for the saturated model (a model with a free parameter per observation). Hence dev.ratio=1dev/nulldev. 
nulldev 
Null deviance (per observation). This is defined to be 2*(loglike_sat loglike(Null)). The null model refers to the intercept model. 
npasses 
Total passes over the data. 
jerr 
Error flag, for warnings and errors (largely for internal debugging). 
offset 
A logical variable indicating whether an offset was included in the model. 
call 
The call that produced this object. 
nobs 
Number of observations. 
warm_fit 
If 
family 
Family used for the model. 
converged 
A logical variable: was the algorithm judged to have converged? 
boundary 
A logical variable: is the fitted value on the boundary of the attainable values? 
obj_function 
Objective function value at the solution. 
Produces a list of names of measures
glmnet.measures(
family = c("all", "gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian",
"GLM")
)
family 
If a "glmnet" family is supplied, a list of the names of measures available for that family are produced. Default is "all", in which case the names of measures for all families are produced. 
Try it and see. A very simple function to provide information
Trevor Hastie
Maintainer: Trevor Hastie
[email protected]
cv.glmnet
and assess.glmnet
.
Fit a generalized linear model via penalized maximum likelihood for a path of lambda values. Can deal with any GLM family.
glmnet.path(
x,
y,
weights = NULL,
lambda = NULL,
nlambda = 100,
lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e04),
alpha = 1,
offset = NULL,
family = gaussian(),
standardize = TRUE,
intercept = TRUE,
thresh = 1e10,
maxit = 1e+05,
penalty.factor = rep(1, nvars),
exclude = integer(0),
lower.limits = Inf,
upper.limits = Inf,
trace.it = 0
)
x 
Input matrix, of dimension 
y 
Quantitative response variable. 
weights 
Observation weights. Default is 1 for each observation. 
lambda 
A user supplied lambda sequence. Typical usage is to have the
program compute its own lambda sequence based on 
nlambda 
The number of lambda values, default is 100. 
lambda.min.ratio 
Smallest value for lambda as a fraction of lambda.max,
the (data derived) entry value (i.e. the smallest value for which all
coefficients are zero). The default depends on the sample size 
alpha 
The elasticnet mixing parameter, with

offset 
A vector of length 
family 
A description of the error distribution and link function to be
used in the model. This is the result of a call to a family function. Default
is 
standardize 
Logical flag for x variable standardization, prior to
fitting the model sequence. The coefficients are always returned on the
original scale. Default is 
intercept 
Should intercept be fitted (default=TRUE) or set to zero (FALSE)? 
thresh 
Convergence threshold for coordinate descent. Each inner
coordinatedescent loop continues until the maximum change in the objective
after any coefficient update is less than thresh times the null deviance.
Default value is 
maxit 
Maximum number of passes over the data; default is 
penalty.factor 
Separate penalty factors can be applied to each
coefficient. This is a number that multiplies 
exclude 
Indices of variables to be excluded from the model. Default is none. Equivalent to an infinite penalty factor. 
lower.limits 
Vector of lower limits for each coefficient; default

upper.limits 
Vector of upper limits for each coefficient; default

trace.it 
Controls how much information is printed to screen. Default is

glmnet.path
solves the elastic net problem for a path of lambda values.
It generalizes glmnet::glmnet
in that it works for any GLM family.
Sometimes the sequence is truncated before nlambda
values of lambda
have been used. This happens when glmnet.path
detects that the decrease
in deviance is marginal (i.e. we are near a saturated fit).
An object with class "glmnetfit" and "glmnet".
a0 
Intercept sequence of length 
beta 
A 
df 
The number of nonzero coefficients for each value of lambda. 
dim 
Dimension of coefficient matrix. 
lambda 
The actual sequence of lambda values used. When alpha=0, the largest lambda reported does not quite give the zero coefficients reported (lambda=inf would in principle). Instead, the largest lambda for alpha=0.001 is used, and the sequence of lambda values is derived from this. 
dev.ratio 
The fraction of (null) deviance explained. The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat  loglike), where loglike_sat is the loglikelihood for the saturated model (a model with a free parameter per observation). Hence dev.ratio=1dev/nulldev. 
nulldev 
Null deviance (per observation). This is defined to be 2*(loglike_sat loglike(Null)). The null model refers to the intercept model. 
npasses 
Total passes over the data summed over all lambda values. 
jerr 
Error flag, for warnings and errors (largely for internal debugging). 
offset 
A logical variable indicating whether an offset was included in the model. 
call 
The call that produced this object. 
family 
Family used for the model. 
nobs 
Number of observations. 
set.seed(1)
x < matrix(rnorm(100 * 20), nrow = 100)
y < ifelse(rnorm(100) > 0, 1, 0)
# binomial with probit link
fit1 < glmnet:::glmnet.path(x, y, family = binomial(link = "probit"))
Converts a data frame to a data matrix suitable for input to glmnet
.
Factors are converted to dummy matrices via "onehot" encoding. Options deal
with missing values and sparsity.
makeX(train, test = NULL, na.impute = FALSE, sparse = FALSE, ...)
train 
Required argument. A dataframe consisting of vectors, matrices and factors 
test 
Optional argument. A dataframe matching 'train' for use as testing data 
na.impute 
Logical, default 
sparse 
Logical, default 
... 
additional arguments, currently unused 
The main function is to convert factors to dummy matrices via "onehot"
encoding. Having the 'train' and 'test' data present is useful if some
factor levels are missing in either. Since a factor with k levels leads to a
submatrix with 1/k entries zero, with large k the sparse=TRUE
option
can be helpful; a large matrix will be returned, but stored in sparse matrix
format. Finally, the function can deal with missing data. The current
version has the option to replace missing observations with the mean from
the training data. For dummy submatrices, these are the mean proportions at
each level.
If only 'train' was provided, the function returns a matrix 'x'. If missing values were imputed, this matrix has an attribute containing its column means (before imputation). If 'test' was provided as well, a list with two components is returned: 'x' and 'xtest'.
Trevor Hastie
Maintainer: Trevor Hastie [email protected]
glmnet
set.seed(101)
### Single data frame
X = matrix(rnorm(20), 10, 2)
X3 = sample(letters[1:3], 10, replace = TRUE)
X4 = sample(LETTERS[1:3], 10, replace = TRUE)
df = data.frame(X, X3, X4)
makeX(df)
makeX(df, sparse = TRUE)
### Single data freame with missing values
Xn = X
Xn[3, 1] = NA
Xn[5, 2] = NA
X3n = X3
X3n[6] = NA
X4n = X4
X4n[9] = NA
dfn = data.frame(Xn, X3n, X4n)
makeX(dfn)
makeX(dfn, sparse = TRUE)
makeX(dfn, na.impute = TRUE)
makeX(dfn, na.impute = TRUE, sparse = TRUE)
### Test data as well
X = matrix(rnorm(10), 5, 2)
X3 = sample(letters[1:3], 5, replace = TRUE)
X4 = sample(LETTERS[1:3], 5, replace = TRUE)
dft = data.frame(X, X3, X4)
makeX(df, dft)
makeX(df, dft, sparse = TRUE)
### Missing data in test as well
Xn = X
Xn[3, 1] = NA
Xn[5, 2] = NA
X3n = X3
X3n[1] = NA
X4n = X4
X4n[2] = NA
dftn = data.frame(Xn, X3n, X4n)
makeX(dfn, dftn)
makeX(dfn, dftn, sparse = TRUE)
makeX(dfn, dftn, na.impute = TRUE)
makeX(dfn, dftn, sparse = TRUE, na.impute = TRUE)
Randomly generated data for multiresponse Gaussian regression example.
data(MultiGaussianExample)
List containing the following elements:
100 by 20 matrix of numeric values.
100 by 4 matrix of numeric values, each column representing one response vector.
Randomly generated data for multinomial regression example.
data(MultinomialExample)
List containing the following elements:
500 by 30 matrix of numeric values.
Numeric vector of length 500 containing 142 ones, 174 twos and 184 threes.
This function constructs the coxph call needed to run the "hack" of coxph with 0 iterations. It's a separate function as we have to deal with function options like strata, offset and observation weights.
mycoxph(object, s, ...)
object 
A class 
s 
The value of the penalty parameter lambda at which the survival curve is required. 
... 
The same ... that was passed to survfit.coxnet. 
This function amends the function arguments passed to survfit.coxnet via ... if new data was passed to survfit.coxnet. It's a separate function as we have to deal with function options like newstrata and newoffset.
mycoxpred(object, s, ...)
object 
A class 
s 
The response for the fitted model. 
... 
The same ... that was passed to survfit.coxnet. 
Missing entries in any given column of the matrix are replaced by the column means or the values in a supplied vector.
na.replace(x, m = rowSums(x, na.rm = TRUE))
x 
A matrix with potentially missing values, and also potentially in sparse matrix format (i.e. inherits from "sparseMatrix") 
m 
Optional argument. A vector of values used to replace the missing entries, columnwise. If missing, the column means of 'x' are used 
This is a simple imputation scheme. This function is called by makeX
if the na.impute=TRUE
option is used, but of course can be used on
its own. If 'x' is sparse, the result is sparse, and the replacements are
done so as to maintain sparsity.
A version of 'x' is returned with the missing values replaced.
Trevor Hastie
Maintainer: Trevor Hastie [email protected]
makeX
and glmnet
set.seed(101)
### Single data frame
X = matrix(rnorm(20), 10, 2)
X[3, 1] = NA
X[5, 2] = NA
X3 = sample(letters[1:3], 10, replace = TRUE)
X3[6] = NA
X4 = sample(LETTERS[1:3], 10, replace = TRUE)
X4[9] = NA
dfn = data.frame(X, X3, X4)
x = makeX(dfn)
m = rowSums(x, na.rm = TRUE)
na.replace(x, m)
x = makeX(dfn, sparse = TRUE)
na.replace(x, m)
Returns the elastic net objective function value.
obj_function(y, mu, weights, family, lambda, alpha, coefficients, vp)
y 
Quantitative response variable. 
mu 
Model's predictions for 
weights 
Observation weights. 
family 
A description of the error distribution and link function to be used in the model. This is the result of a call to a family function. 
lambda 
A single value for the 
alpha 
The elasticnet mixing parameter, with 
coefficients 
The model's coefficients (excluding intercept). 
vp 
Penalty factors for each of the coefficients. 
Returns the elastic net penalty value without the lambda
factor.
pen_function(coefficients, alpha = 1, vp = 1)
coefficients 
The model's coefficients (excluding intercept). 
alpha 
The elasticnet mixing parameter, with 
vp 
Penalty factors for each of the coefficients. 
The penalty is defined as
$(1\alpha)/2 \sum vp_j \beta_j^2 + \alpha \sum vp_j \beta.$
Note the omission of the multiplicative lambda
factor.
Plots the crossvalidation curve, and upper and lower standard deviation
curves, as a function of the lambda
values used. If the object has
class "cv.relaxed"
a different plot is produced, showing both
lambda
and gamma
## S3 method for class 'cv.glmnet'
plot(x, sign.lambda = 1, ...)
## S3 method for class 'cv.relaxed'
plot(x, se.bands = TRUE, ...)
x 
fitted 
sign.lambda 
Either plot against 
... 
Other graphical parameters to plot 
se.bands 
Should shading be produced to show standarderror bands;
default is 
A plot is produced, and nothing is returned.
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Maintainer:
Trevor Hastie [email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008) Regularization Paths for Generalized Linear Models via Coordinate Descent
glmnet
and cv.glmnet
.
set.seed(1010)
n = 1000
p = 100
nzc = trunc(p/10)
x = matrix(rnorm(n * p), n, p)
beta = rnorm(nzc)
fx = (x[, seq(nzc)] %*% beta)
eps = rnorm(n) * 5
y = drop(fx + eps)
px = exp(fx)
px = px/(1 + px)
ly = rbinom(n = length(px), prob = px, size = 1)
cvob1 = cv.glmnet(x, y)
plot(cvob1)
title("Gaussian Family", line = 2.5)
cvob1r = cv.glmnet(x, y, relax = TRUE)
plot(cvob1r)
frame()
set.seed(1011)
par(mfrow = c(2, 2), mar = c(4.5, 4.5, 4, 1))
cvob2 = cv.glmnet(x, ly, family = "binomial")
plot(cvob2)
title("Binomial Family", line = 2.5)
## set.seed(1011)
## cvob3 = cv.glmnet(x, ly, family = "binomial", type = "class")
## plot(cvob3)
## title("Binomial Family", line = 2.5)
Produces a coefficient profile plot of the coefficient paths for a fitted
"glmnet"
object.
## S3 method for class 'glmnet'
plot(x, xvar = c("norm", "lambda", "dev"), label = FALSE, ...)
## S3 method for class 'mrelnet'
plot(
x,
xvar = c("norm", "lambda", "dev"),
label = FALSE,
type.coef = c("coef", "2norm"),
...
)
## S3 method for class 'multnet'
plot(
x,
xvar = c("norm", "lambda", "dev"),
label = FALSE,
type.coef = c("coef", "2norm"),
...
)
## S3 method for class 'relaxed'
plot(x, xvar = c("lambda", "dev"), label = FALSE, gamma = 1, ...)
x 
fitted 
xvar 
What is on the Xaxis. 
label 
If 
... 
Other graphical parameters to plot 
type.coef 
If 
gamma 
Value of the mixing parameter for a "relaxed" fit 
A coefficient profile plot is produced. If x
is a multinomial model,
a coefficient plot is produced for each class.
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Maintainer:
Trevor Hastie [email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008) Regularization Paths for Generalized Linear Models via Coordinate Descent
glmnet
, and print
, predict
and coef
methods.
x=matrix(rnorm(100*20),100,20)
y=rnorm(100)
g2=sample(1:2,100,replace=TRUE)
g4=sample(1:4,100,replace=TRUE)
fit1=glmnet(x,y)
plot(fit1)
plot(fit1,xvar="lambda",label=TRUE)
fit3=glmnet(x,g4,family="multinomial")
plot(fit3,pch=19)
Randomly generated data for Poisson regression example.
data(PoissonExample)
List containing the following elements:
500 by 20 matrix of numeric values.
Numeric vector of length 500 consisting of nonnegative integers.
This function makes predictions from a crossvalidated glmnet model, using
the stored "glmnet.fit"
object, and the optimal value chosen for
lambda
(and gamma
for a 'relaxed' fit.
## S3 method for class 'cv.glmnet'
predict(object, newx, s = c("lambda.1se", "lambda.min"), ...)
## S3 method for class 'cv.relaxed'
predict(
object,
newx,
s = c("lambda.1se", "lambda.min"),
gamma = c("gamma.1se", "gamma.min"),
...
)
object 
Fitted 
newx 
Matrix of new values for 
s 
Value(s) of the penalty parameter 
... 
Not used. Other arguments to predict. 
gamma 
Value (single) of 'gamma' at which predictions are to be made 
This function makes it easier to use the results of crossvalidation to make a prediction.
The object returned depends on the ... argument which is passed
on to the predict
method for glmnet
objects.
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Maintainer:
Trevor Hastie [email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008)
Regularization Paths for Generalized Linear Models via Coordinate
Descent (2010), Journal of Statistical Software, Vol. 33(1), 122,
doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011)
Regularization Paths for Cox's Proportional
Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol.
39(5), 113,
doi:10.18637/jss.v039.i05.
Hastie, T., Tibshirani, Robert and Tibshirani, Ryan (2020) Best Subset,
Forward Stepwise or Lasso? Analysis and Recommendations Based on Extensive Comparisons,
Statist. Sc. Vol. 35(4), 579592,
https://arxiv.org/abs/1707.08692.
Glmnet webpage with four vignettes, https://glmnet.stanford.edu.
glmnet
, and print
, and coef
methods, and
cv.glmnet
.
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
cv.fit = cv.glmnet(x, y)
predict(cv.fit, newx = x[1:5, ])
coef(cv.fit)
coef(cv.fit, s = "lambda.min")
predict(cv.fit, newx = x[1:5, ], s = c(0.001, 0.002))
cv.fitr = cv.glmnet(x, y, relax = TRUE)
predict(cv.fit, newx = x[1:5, ])
coef(cv.fit)
coef(cv.fit, s = "lambda.min", gamma = "gamma.min")
predict(cv.fit, newx = x[1:5, ], s = c(0.001, 0.002), gamma = "gamma.min")
glmnetfit
fit objectGives fitted values, linear predictors, coefficients and number of nonzero
coefficients from a fitted glmnetfit
object.
## S3 method for class 'glmnetfit'
predict(
object,
newx,
s = NULL,
type = c("link", "response", "coefficients", "nonzero"),
exact = FALSE,
newoffset,
...
)
object 
Fitted "glmnetfit" object. 
newx 
Matrix of new values for 
s 
Value(s) of the penalty parameter lambda at which predictions are required. Default is the entire sequence used to create the model. 
type 
Type of prediction required. Type "link" gives the linear predictors (eta scale); Type "response" gives the fitted values (mu scale). Type "coefficients" computes the coefficients at the requested values for s. Type "nonzero" returns a list of the indices of the nonzero coefficients for each value of s. 
exact 
This argument is relevant only when predictions are made at values
of 
newoffset 
If an offset is used in the fit, then one must be supplied for making predictions (except for type="coefficients" or type="nonzero"). 
... 
This is the mechanism for passing arguments like 
The object returned depends on type.
Print a summary of the results of crossvalidation for a glmnet model.
## S3 method for class 'cv.glmnet'
print(x, digits = max(3, getOption("digits")  3), ...)
x 
fitted 'cv.glmnet' object 
digits 
significant digits in printout 
... 
additional print arguments 
A summary of the crossvalidated fit is produced, slightly different for a
'cv.relaxed' object than for a 'cv.glmnet' object. Note that a 'cv.relaxed'
object inherits from class 'cv.glmnet', so by directly invoking
print.cv.glmnet(object)
will print the summary as if
relax=TRUE
had not been used.
Jerome Friedman, Trevor Hastie and Rob Tibshirani
Maintainer:
Trevor Hastie [email protected]
Friedman, J., Hastie, T. and Tibshirani, R. (2008)
Regularization Paths for Generalized Linear Models via Coordinate
Descent
https://arxiv.org/abs/1707.08692
Hastie, T.,
Tibshirani, Robert, Tibshirani, Ryan (2019) Extended Comparisons of
Best Subset Selection, Forward Stepwise Selection, and the Lasso
glmnet
, predict
and coef
methods.
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit1 = cv.glmnet(x, y)
print(fit1)
fit1r = cv.glmnet(x, y, relax = TRUE)
print(fit1r)
## print.cv.glmnet(fit1r) ## CHECK WITH TREVOR
Print a summary of the glmnet path at each step along the path.
## S3 method for class 'glmnet'
print(x, digits = max(3, getOption("digits")  3), ...)
x 
fitted glmnet object 
digits 
significant digits in printout 
... 
additional print arguments 
The call that produced the object x
is printed, followed by a
threecolumn matrix with columns Df
, %Dev
and Lambda
.
The Df
column is the number of nonzero coefficients (Df is a
reasonable name only for lasso fits). %Dev
is the percent deviance
explained (relative to the null deviance). In the case of a 'relaxed' fit,
an additional column is inserted, %Dev R
which gives the percent
deviance explained by the relaxed model. For a "bigGlm" model, a simpler
summary is printed.
The matrix above is silently returned
Friedman, J., Hastie, T. and Tibshirani, R. (2008). Regularization Paths for Generalized Linear Models via Coordinate Descent
glmnet
, predict
and coef
methods.
x = matrix(rnorm(100 * 20), 100, 20)
y = rnorm(100)
fit1 = glmnet(x, y)
print(fit1)
Randomly generated data for Gaussian regression example.
data(QuickStartExample)
List containing the following elements:
100 by 20 matrix of numeric values.
Numeric vector of length 100.
Internal function to make the response y passed to glmnet suitable for coxnet (i.e. glmnet with family = "cox"). Sanity checks are performed here too.
response.coxnet(y)
y 
Response variable. Either a class "Surv" object or a twocolumn matrix with columns named 'time' and 'status'. 
If y is a class "Surv" object, this function returns y with no changes. If y is a twocolumn matrix with columns named 'time' and 'status', it is converted into a "Surv" object.
A class "Surv" object.
Generate multinomial samples
rmult(p)
p 
matrix of probabilities, with number of columns the number of classes 
Simple function that calls the rmultinom
function. It generates a class label
for each row of its input matrix of class probabilities.
a vector of class memberships
Trevor Hastie
Maintainer: Trevor Hastie [email protected]
Randomly generated data for Gaussian regression example with the design matrix x being in sparse matrix format.
data(SparseExample)
List containing the following elements:
100 by 20 matrix of numeric values. x is in sparse matrix format, having class "dgCMatrix".
Numeric vector of length 100.
Helper function to add strata as an attribute to a Surv object. The
output of this function can be used as the response in glmnet()
for fitting stratified Cox models.
stratifySurv(y, strata = rep(1, length(y)))
y 
A Surv object. 
strata 
A vector of length equal to the number of observations in y, indicating strata membership. Default is all belong to same strata. 
When fitting a stratified Cox model with glmnet()
, strata should
be added to a Surv
response with this helper function. Note that
it is not sufficient to add strata as an attribute to the Surv
response manually: if the result does not have class stratifySurv
,
subsetting of the response will not work properly.
An object of class stratifySurv
(in addition to all the
classes y
belonged to).
y < survival::Surv(1:10, rep(0:1, length.out = 10))
strata < rep(1:3, length.out = 10)
y2 < stratifySurv(y, strata) # returns stratifySurv object
Computes the predicted survivor function for a Cox proportional hazards model with elastic net penalty.
## S3 method for class 'coxnet'
survfit(formula, s = NULL, ...)
formula 
A class 
s 
Value(s) of the penalty parameter lambda at which the survival
curve is required. Default is the entire sequence used to create the model.
However, it is recommended that 
... 
This is the mechanism for passing additional arguments like (i) x= and y= for the x and y used to fit the model, (ii) weights= and offset= when the model was fit with these options, (iii) arguments for new data (newx, newoffset, newstrata), and (iv) arguments to be passed to survfit.coxph(). 
To be consistent with other functions in glmnet
, if s
is not specified, survival curves are returned for the entire lambda
sequence. This is not recommended usage: it is best to call
survfit.coxnet
with a single value of the penalty parameter
for the s
option.
If s
is a single value, an object of class "survfitcox"
and "survfit" containing one or more survival curves. Otherwise, a list
of such objects, one element for each value in s
.
Methods defined for survfit objects are print, summary and plot.
set.seed(2)
nobs < 100; nvars < 15
xvec < rnorm(nobs * nvars)
xvec[sample.int(nobs * nvars, size = 0.4 * nobs * nvars)] < 0
x < matrix(xvec, nrow = nobs)
beta < rnorm(nvars / 3)
fx < x[, seq(nvars / 3)] %*% beta / 3
ty < rexp(nobs, exp(fx))
tcens < rbinom(n = nobs, prob = 0.3, size = 1)
y < survival::Surv(ty, tcens)
fit1 < glmnet(x, y, family = "cox")
# survfit object for Cox model where lambda = 0.1
sf1 < survival::survfit(fit1, s = 0.1, x = x, y = y)
plot(sf1)
# example with new data
sf2 < survival::survfit(fit1, s = 0.1, x = x, y = y, newx = x[1:3, ])
plot(sf2)
# example with strata
y2 < stratifySurv(y, rep(1:2, length.out = nobs))
fit2 < glmnet(x, y2, family = "cox")
sf3 < survival::survfit(fit2, s = 0.1, x = x, y = y2)
sf4 < survival::survfit(fit2, s = 0.1, x = x, y = y2,
newx = x[1:3, ], newstrata = c(1, 1, 1))
Computes the predicted survivor function for a Cox proportional hazards model with elastic net penalty from a crossvalidated glmnet model.
## S3 method for class 'cv.glmnet'
survfit(formula, s = c("lambda.1se", "lambda.min"), ...)
formula 
A class 
s 
Value(s) of the penalty parameter lambda at which predictions are required. Default is the value s="lambda.1se" stored on the CV object. Alternatively s="lambda.min" can be used. If s is numeric, it is taken as the value(s) of lambda to be used. 
... 
Other arguments to be passed to 
This function makes it easier to use the results of crossvalidation to compute a survival curve.
If s
is a single value, an object of class "survfitcox"
and "survfit" containing one or more survival curves. Otherwise, a list
of such objects, one element for each value in s
.
Methods defined for survfit objects are print, summary and plot.
set.seed(2)
nobs < 100; nvars < 15
xvec < rnorm(nobs * nvars)
x < matrix(xvec, nrow = nobs)
beta < rnorm(nvars / 3)
fx < x[, seq(nvars / 3)] %*% beta / 3
ty < rexp(nobs, exp(fx))
tcens < rbinom(n = nobs, prob = 0.3, size = 1)
y < survival::Surv(ty, tcens)
cvfit < cv.glmnet(x, y, family = "cox")
# default: s = "lambda.1se"
survival::survfit(cvfit, x = x, y = y)
# s = "lambda.min"
survival::survfit(cvfit, s = "lambda.min", x = x, y = y)
Helper function to check if glmnet() should call cox.path().
use.cox.path(x, y)
x 
Design matrix. 
y 
Response variable. 
For family="cox"
, we only call the original coxnet() function if
(i) x is not sparse, (ii) y is rightcensored data, and (iii) we are
not fitting a stratified Cox model. This function also throws an error
if y has a "strata" attribute but is not of type "stratifySurv".
TRUE if cox.path() should be called, FALSE otherwise.
Helper function to compute weighted mean and standard deviation. Deals gracefully whether x is sparse matrix or not.
weighted_mean_sd(x, weights = rep(1, nrow(x)))
x 
Observation matrix. 
weights 
Optional weight vector. 
A list with components.
mean 
vector of weighted means of columns of x 
sd 
vector of weighted standard deviations of columns of x 