Title: | The Induced Smoothed Lasso |
---|---|
Description: | An implementation of the induced smoothing (IS) idea applied to lasso regularization models, allowing estimation and inference on the model coefficients (currently hypothesis testing only). Linear, logistic, Poisson and gamma regressions with several link functions are implemented. The algorithm is described in the original paper; see <doi:10.1177/0962280219842890>, and is discussed in a tutorial <doi:10.13140/RG.2.2.16360.11521>. |
Authors: | Gianluca Sottile [aut, cre], Giovanna Cilluffo [aut, ctb], Vito MR Muggeo [aut, ctb] |
Maintainer: | Gianluca Sottile <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.5.2 |
Built: | 2024-11-04 06:47:45 UTC |
Source: | CRAN |
This package implements an induced smoothed approach for hypothesis testing in lasso regression.
Package: | islasso |
Type: | Package |
Version: | 1.5.2 |
Date: | 2024-01-22 |
License: | GPL-2 |
islasso
is used to fit generalized linear models with an L1 penalty on (some) regression coefficients. Along with point estimates, the main advantage is that it returns the full covariance matrix of the estimates. The resulting standard errors can be used to make inference in the lasso framework. The main function is islasso
and the corresponding fitter function islasso.fit
, and many auxiliary functions are implemented to summarize and visualize results: summary.islasso
, predict.islasso
, logLik.islasso
, deviance.islasso
, residuals.islasso
.
islasso.path
is used to fit a generalized linear model via the induced smoothed lasso. The regularization path is computed for the lasso or elastic-net penalty at a grid of values for the regularization parameter lambda. Along with the coefficient profiles, the main advantage is that it also returns the standard error profiles. The resulting standard errors can be used to make inference in the lasso framework. The main function is islasso.path
and the corresponding fitter function islasso.path.fit
, and many auxiliary functions are implemented to summarize and visualize results: summary.islasso.path
, predict.islasso.path
, logLik.islasso.path
, deviance.islasso.path
, residuals.islasso.path
, coef.islasso.path
, fitted.islasso.path
.
Gianluca Sottile, based on some preliminary functions by Vito Muggeo.
Maintainer: Gianluca Sottile <[email protected]>
Cilluffo, G, Sottile, G, La Grutta, S and Muggeo, VMR (2019). The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression. Statistical Methods in Medical Research, DOI: 10.1177/0962280219842890.
Sottile, G, Cilluffo, G, Muggeo, VMR (2019). The R package islasso: estimation and hypothesis testing in lasso regression. Technical Report on ResearchGate. doi:10.13140/RG.2.2.16360.11521.
set.seed(1)
n <- 100
p <- 30
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
mu <- drop(X %*% coef)
y <- mu + rnorm(n, 0, sigma)

o <- islasso.path(y ~ ., data = data.frame(y = y, X), family = gaussian())
temp <- GoF.islasso.path(o)
lambda <- temp$lambda.min["BIC"]

o <- islasso(y ~ ., data = data.frame(y = y, X), family = gaussian(), lambda = lambda)
o
summary(o, pval = .05)
This function performs a minimization of the AIC/BIC criterion for selecting the tuning parameter in islasso.
aic.islasso(object, method = c("AIC", "BIC", "AICc", "GCV", "GIC"), interval, g = 0, y, X, intercept = FALSE, family = gaussian(), alpha = 1, offset, weights, unpenalized, control = is.control(), trace = TRUE)
object |
a fitted model object of class "islasso". |
method |
the criterion to optimize, AIC, BIC, AICc, GCV, GIC. |
interval |
the lower and upper limits of lambda within which the criterion is optimized; if missing, an evaluation interval is taken from the fitted object. |
g |
a value belonging to the interval [0, 1]. Classical BIC is returned by letting g = 0 (default value), whereas extended BIC corresponds to the case g = 0.5. |
y |
if object is missing, the response vector. |
X |
if object is missing, the design matrix. |
intercept |
if object is missing, a logical value indicating whether an intercept should be included. Default is FALSE. |
family |
if object is missing, a description of the error distribution and link function to be used (a family object). Default is gaussian(). |
alpha |
The elastic-net mixing parameter, with 0 <= alpha <= 1: alpha = 1 gives the lasso penalty and alpha = 0 the ridge penalty. Default is 1. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. |
weights |
observation weights. Default is 1 for each observation. |
unpenalized |
a vector used to specify the unpenalized estimators; unpenalized has to be a vector of logicals. |
control |
a list of parameters for controlling the fitting process (see |
trace |
Should the iterative procedure be printed? TRUE is the default value. |
Minimization of the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) or several other criteria is sometimes employed to select the tuning parameter as an alternative to cross-validation. The model degrees of freedom (not necessarily integers, unlike in the plain lasso) used in all methods are computed as the trace of the hat matrix at convergence.
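As a rough illustration (a sketch, not the package's internal code), criteria of this kind can be assembled from the deviance and the effective degrees of freedom df = trace(H); the eBIC weight g enters through an extra log(p) term:

# Hedged sketch: generic information criteria built from the deviance and
# the effective degrees of freedom df = trace of the hat matrix.
ic <- function(dev, df, n, p, g = 0) {
  c(AIC  = dev + 2 * df,
    BIC  = dev + df * log(n),
    eBIC = dev + df * (log(n) + 2 * g * log(p)),
    AICc = dev + 2 * df * n / (n - df - 1))
}
ic(dev = 120, df = 8.3, n = 100, p = 100, g = 0.5)  # illustrative numbers

aic.islasso performs a search of this kind over lambda and returns the minimizer.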
The optimal lambda value is returned.
Maintainer: Gianluca Sottile <[email protected]>
islasso.fit
, summary.islasso
, residuals.islasso
, logLik.islasso
, predict.islasso
and deviance.islasso
methods.
set.seed(1)
n <- 100
p <- 100
p1 <- 20  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
mu <- drop(X %*% coef)
y <- mu + rnorm(n, 0, sigma)

o <- islasso(y ~ ., data = data.frame(y = y, X))

## Not run:
# use the evaluation interval of the fit
lambda_aic <- aic.islasso(o, method = "AIC")

# overwrites the evaluation interval for lambda
lambda_bic <- aic.islasso(o, interval = c(.1, 30), method = "BIC")

# overwrites the evaluation interval for lambda using the eBIC criterion
lambda_ebic <- aic.islasso(o, interval = c(.1, 30), method = "BIC", g = .5)
## End(Not run)
General linear hypotheses for islasso objects

General linear hypotheses and confidence interval estimation for linear combinations of the regression coefficients in islasso fits.
## S3 method for class 'islasso' anova(object, A, b = NULL, ci, ...)
object |
a fitted model object of class "islasso". |
A |
matrix (or vector) giving linear combinations of coefficients by rows, or a character vector giving the hypothesis in symbolic form (see Details). |
b |
right-hand-side vector for hypothesis, with as many entries as rows in the hypothesis matrix A; can be omitted, in which case it defaults to a vector of zeroes. |
ci |
optionally, a two-column matrix of estimated confidence intervals for the estimated coefficients. |
... |
not used. |
For the islasso regression model with coefficients beta, the null hypothesis is H0: A beta = b, where A and b are a known matrix and vector, respectively.
The hypothesis matrix A can be supplied as a numeric matrix (or vector), the rows of which specify linear combinations of the model coefficients, which are tested equal to the corresponding entries in the right-hand-side vector b, which defaults to a vector of zeroes.
Alternatively, the hypothesis can be specified symbolically as a character vector with one or more elements, each of which gives either a linear combination of coefficients, or a linear equation in the coefficients (i.e., with both a left and right side separated by an equals sign). Components of a linear expression or linear equation can consist of numeric constants, or numeric constants multiplying coefficient names (in which case the number precedes the coefficient, and may be separated from it by spaces or an asterisk); constants of 1 or -1 may be omitted. Spaces are always optional. Components are separated by plus or minus signs. Newlines or tabs in hypotheses will be treated as spaces. See the examples below.
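For instance (a sketch that assumes the fitted object o and the dimension p from the Examples section below), the symbolic form and the explicit (A, b) form specify the same hypothesis:

# Symbolic specification of a single linear hypothesis.
anova(o, A = "X1 + X2 + X3 + X4 + X5 = -7.5")

# Equivalent explicit form: one row of A selects the same linear
# combination of the p coefficients, and b gives the right-hand side.
A <- matrix(c(rep(1, 5), rep(0, p - 5)), nrow = 1)
anova(o, A = A, b = -7.5)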
An object of class "anova.islasso" which contains the estimates, the standard errors, the Wald statistics and corresponding p value of each linear combination and of the restriced model.
The main function of the same name was inspired by the R function previously implemented by Vito MR Muggeo.
Maintainer: Gianluca Sottile <[email protected]>
set.seed(1)
n <- 100
p <- 100
p1 <- 10  # number of nonzero coefficients
coef.true <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.true, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
eta <- drop(X %*% coef)
mu <- eta
y <- mu + rnorm(n, 0, sigma)

o <- islasso(y ~ . - 1, data = data.frame(y = y, X), family = gaussian())

anova(o, A = diag(p), b = coef)
anova(o, A = c("X1 + X2 + X3 + X4 + X5 = -7.5"))
anova(o, A = c("X1 + X2 + X3 + X4 + X5 = 0"))
anova(o, A = c("X6 + X7 + X8 + X9 + X10"), b = 8.75)
anova(o, A = c("X6 + X7 + X8 + X9 + X10"), b = 0)
anova(o, A = c("X1 + X2 + X3 + X4 + X5 = -7.5", "X6 + X7 + X8 + X9 + X10 = 8.75"))
anova(o, A = c("X1 + X2 + X3 + X4 + X5", "X6 + X7 + X8 + X9 + X10"), b = c(-7.5, 8.75))
anova(o, A = c("X1 + X2 + X3 + X4 + X5", "X6 + X7 + X8 + X9 + X10"))
This data set details a microarray experiment for 52 breast cancer patients. The binary variable status
is used to indicate whether or not the patient has died of breast cancer (status = 0
= did not die of breast cancer, status = 1
= died of breast cancer). The other variables contain the amplification or deletion of the considered genes.
Rather than measuring gene expression, this experiment aims to measure gene amplification or deletion, which refers to the number of copies of a particular DNA sequence within the genome. The aim of the experiment is to find out the key genomic factors involved in aggressive and non-aggressive forms of breast cancer.
The experiment was conducted by Dr. John Bartlett and Dr. Caroline Witton in the Division of Cancer Sciences and Molecular Pathology of the University of Glasgow at the city's Royal Infirmary.
data(breast)
Dr. John Bartlett and Dr. Caroline Witton, Division of Cancer Sciences and Molecular Pathology, University of Glasgow, Glasgow Royal Infirmary.
Augugliaro L., Mineo A.M. and Wit E.C. (2013) dgLARS: a differential geometric approach to sparse generalized linear models, Journal of the Royal Statistical Society. Series B., Vol 75(3), 471-498.
Wit E.C. and McClure J. (2004) "Statistics for Microarrays: Design, Analysis and Inference" Chichester: Wiley.
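A minimal analysis sketch follows; it assumes breast is loaded as a data frame whose status column is the binary outcome described above and whose remaining columns are the gene-level measurements (this layout is an assumption, check str(breast) first):

library(islasso)
data(breast)

# Penalized logistic regression of survival status on the gene measurements
# (column layout assumed as described above).
fit <- islasso(status ~ ., data = breast, family = binomial())
summary(fit, pval = 0.05)  # show only coefficients with p-value <= 0.05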
confint method for islasso objects

Computes confidence intervals for the regression coefficients in islasso fitted objects.
## S3 method for class 'islasso' confint(object, parm, level = 0.95, type.ci = "wald", trace = TRUE, ...)
object |
a fitted model object of class "islasso". |
parm |
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
type.ci |
Only Wald-type confidence intervals are currently implemented. With type.ci = "wald", the point estimates and standard errors are used to build the confidence intervals. |
trace |
if TRUE (default) a bar shows the iterations status. |
... |
additional argument(s) for methods. |
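As a rough illustration of what a Wald-type interval is (a sketch, not the method's internal code; it assumes a fitted object o and uses the documented coefficients and se components of an islasso fit):

# 95% Wald interval: estimate +/- z_{0.975} * standard error.
est <- coef(o)
se  <- o$se                 # 'se' component documented in the islasso value list
z   <- qnorm(0.975)
cbind(lower = est - z * se, upper = est + z * se)

The confint method wraps this kind of computation and, as in the example below, the resulting object can also be plotted.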
Maintainer: Gianluca Sottile <[email protected]>
islasso.fit
, summary.islasso
, residuals.islasso
, logLik.islasso
, predict.islasso
and deviance.islasso
methods.
set.seed(1)
n <- 100
p <- 100
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
eta <- drop(X %*% coef)

##### gaussian ######
mu <- eta
y <- mu + rnorm(n, 0, sigma)

o <- islasso(y ~ ., data = data.frame(y = y, X), family = gaussian())
ci <- confint(o, type.ci = "wald", parm = 1:10)
ci
plot(ci)
The diabetes
data frame has 442 rows and 3 columns.
These are the data used in the Efron et al "Least Angle Regression" paper.
This data frame contains the following columns:
x: a matrix with 10 columns
y: a numeric vector
x2: a matrix with 64 columns
The x matrix has been standardized to have unit L2 norm in each column and zero mean. The matrix x2 consists of x plus certain interactions.
https://web.stanford.edu/~hastie/Papers/LARS/LeastAngle_2002.ps
Efron, Hastie, Johnstone and Tibshirani (2003) "Least Angle Regression" (with discussion) Annals of Statistics
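A minimal usage sketch (assuming the data frame follows the layout above, with the matrix column x holding the 10 baseline predictors and y the response, as in the lars package):

library(islasso)
data(diabetes)

# Regularization path on the 10 baseline predictors, then lambda
# selection by information criteria.
fit <- islasso.path(y ~ x, data = diabetes, family = gaussian())
GoF.islasso.path(fit)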
This function extracts the value of the tuning parameter which minimizes the AIC/BIC/AICc/eBIC/GCV/GIC criterion in islasso.path.
GoF.islasso.path(object, plot = TRUE, ...)
object |
a fitted model object of class "islasso.path". |
plot |
a logical flag indicating whether each criterion has to be plotted. |
... |
further arguments passed to or from other methods. |
Minimization of the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) or several other criteria is sometimes employed to select the tuning parameter as an alternative to cross-validation. The model degrees of freedom (not necessarily integers, unlike in the plain lasso) used in all methods are computed as the trace of the hat matrix at convergence.
A list of
gof |
the goodness of fit measures |
minimum |
the position of the optimal lambda values |
lambda.min |
the optimal lambda values |
Maintainer: Gianluca Sottile <[email protected]>
islasso.path
, islasso.path.fit
, coef.islasso.path
, residuals.islasso.path
, summary.islasso.path
, logLik.islasso.path
, fitted.islasso.path
, predict.islasso.path
and deviance.islasso.path
methods.
set.seed(1)
n <- 100
p <- 30
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
mu <- drop(X %*% coef)
y <- mu + rnorm(n, 0, sigma)

o <- islasso.path(y ~ ., data = data.frame(y = y, X))
GoF.islasso.path(o)
Auxiliary function for controlling the islasso model fitting.
is.control(sigma2 = -1, tol = 1E-05, itmax = 1E+3, stand = TRUE, trace = 0, nfolds = 5, seed = NULL, adaptive = FALSE, g = .5, b0 = NULL, V0 = NULL, c = .5)
sigma2 |
optional, the fixed value of the dispersion parameter. If -1 (default), it is estimated from the data. |
tol |
tolerance value to declare convergence, default 1e-5. |
itmax |
maximum number of iterations, default to 1000 |
stand |
if TRUE (default), the covariates are standardized prior to fitting the model. However the coefficients are always returned on the original scale. |
trace |
Should the iterative procedure be printed? 0: no printing, 1 = compact printing, 2 = enlarged printing, 3 = compact printing including Fisher scoring information (only used in glm family). |
nfolds |
the number of folds used when the tuning parameter is selected by cross-validation. Default is 5. |
seed |
optional, the seed to be used to split the dataframe and to perform cross validation. Useful to make reproducible the results. |
adaptive |
experimental, if TRUE the adaptive LASSO is implemented. |
g |
a value belonging to the interval [0, 1]. Classical BIC is returned by letting g = 0 (default value), whereas extended BIC corresponds to the case g = 0.5. |
b0 |
optional, starting values for the regression coefficients. If NULL, the point estimates from a preliminary lasso fit are used. |
V0 |
optional, starting value for the covariance matrix of the estimates. If NULL, the identity matrix is used. |
c |
the weight of the mixture in the induced smoothed lasso; the default is 0.5. |
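For example (a sketch with illustrative argument values; it assumes a response y and predictor matrix X as in the examples elsewhere on this page), the control list is typically built inline and passed to islasso or islasso.path:

# Tighter tolerance, more iterations, compact iteration printing, and a
# fixed seed for the internal cross-validation split (illustrative values).
ctrl <- is.control(tol = 1e-6, itmax = 2000, trace = 1, seed = 123)
fit  <- islasso(y ~ ., data = data.frame(y = y, X), lambda = 2, control = ctrl)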
Maintainer: Gianluca Sottile <[email protected]>
islasso
is used to fit lasso regression models wherein the nonsmooth norm penalty is replaced by a smooth approximation justified under the induced smoothing paradigm. Simple lasso-type or elastic-net penalties are permitted, and linear, logistic, Poisson and Gamma responses are allowed.
islasso(formula, family = gaussian, lambda, alpha = 1, data, weights, subset, offset, unpenalized, contrasts = NULL, control = is.control())
formula |
an object of class “formula” (or one that can be coerced to that class): the ‘usual’ symbolic description of the model to be fitted. |
family |
the assumed response distribution. Gaussian, (quasi) Binomial, (quasi) Poisson, and Gamma are allowed. |
lambda |
Value of the tuning parameter in the objective. If missing, an optimal value is computed internally; alternatively, aic.islasso can be used to select lambda by an information criterion. |
alpha |
The elastic-net mixing parameter, with 0 <= alpha <= 1: alpha = 1 gives the lasso penalty and alpha = 0 the ridge penalty. Default is 1. |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which islasso is called. |
weights |
observation weights. Default is 1 for each observation. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. |
unpenalized |
optional. A vector of integers or characters indicating any covariate (in the formula) with coefficients not to be penalized. The intercept, if included in the model, is always unpenalized. |
contrasts |
an optional list. See the contrasts.arg of model.matrix.default. |
control |
a list of parameters for controlling the fitting process (see is.control). |
islasso
estimates regression models by imposing a lasso-type penalty on some or all regression coefficients. However, the nonsmooth norm penalty is replaced by a smooth approximation justified under the induced smoothing paradigm. The advantage is that reliable standard errors are returned as model output and hypothesis testing on linear combinations of the regression parameters can be carried out straightforwardly via the Wald statistic. Simulation studies provide evidence that the proposed approach controls type-I errors and exhibits good power in different scenarios.
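As a rough sketch of the kind of Wald inference this enables (assuming a fitted object o; the vcov accessor used here is an assumption, and for single coefficients the documented se component can be used instead):

# Wald test of a single linear combination a' beta = b0 (illustrative).
a  <- c(1, -1, rep(0, length(coef(o)) - 2))   # difference of the first two coefficients
b0 <- 0
V  <- vcov(o)                                  # assumed covariance accessor
w  <- drop(crossprod(a, coef(o)) - b0) / sqrt(drop(t(a) %*% V %*% a))
2 * pnorm(-abs(w))                             # two-sided p-value

In practice the anova method documented above performs such tests directly from a symbolic or matrix specification of the hypothesis.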
A list of
coefficients |
a named vector of coefficients |
se |
a named vector of standard errors |
residuals |
the working residuals |
fitted.values |
the fitted values |
rank |
the estimated degrees of freedom |
family |
the family object used |
linear.predictors |
the linear predictors |
deviance |
the family deviance |
aic |
the Akaike Information Criterion |
null.deviance |
the family null deviance |
iter |
the number of iterations of IWLS used |
weights |
the working weights, that is the weights in the final iteration of the IWLS fit |
df.residual |
the residual degrees of freedom |
df.null |
the degrees of freedom of a null model |
converged |
logical. Was the IWLS algorithm judged to have converged? |
model |
if requested (the default), the model frame used. |
call |
the matched call |
formula |
the formula supplied |
terms |
the terms object used |
data |
the data argument. |
offset |
the offset vector used. |
control |
the value of the control argument used |
xlevels |
(where relevant) a record of the levels of the factors used in fitting. |
lambda |
the lambda value used in the islasso algorithm |
alpha |
the elasticnet mixing parameter |
dispersion |
the estimated dispersion parameter |
internal |
internal elements |
contrasts |
(only where relevant) the contrasts used. |
The main function of the same name was inspired by the R function previously implemented by Vito MR Muggeo.
Maintainer: Gianluca Sottile <[email protected]>
Cilluffo, G, Sottile, G, La Grutta, S and Muggeo, VMR (2019). The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression. Statistical Methods in Medical Research, DOI: 10.1177/0962280219842890.
Sottile, G, Cilluffo, G, Muggeo, VMR (2019). The R package islasso: estimation and hypothesis testing in lasso regression. Technical Report on ResearchGate. doi:10.13140/RG.2.2.16360.11521.
islasso.fit
, summary.islasso
, residuals.islasso
, logLik.islasso
, predict.islasso
and deviance.islasso
methods.
set.seed(1)
n <- 100
p <- 100
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(X %*% coef)

##### gaussian ######
mu <- eta
y <- mu + rnorm(n, 0, sigma)

o <- islasso(y ~ ., data = data.frame(y = y, X), family = gaussian())
o
summary(o)
coef(o)
fitted(o)
predict(o, type = "response")
plot(o)
residuals(o)
deviance(o)
AIC(o)
logLik(o)

## Not run:
# for the interaction
o <- islasso(y ~ X1 * X2, data = data.frame(y = y, X), family = gaussian())

##### binomial ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(cbind(1, X) %*% c(-1, coef))
mu <- binomial()$linkinv(eta)
y <- rbinom(n, 100, mu)
y <- cbind(y, 100 - y)

o <- islasso(cbind(y1, y2) ~ ., data = data.frame(y1 = y[,1], y2 = y[,2], X), family = binomial())
summary(o, pval = .05)

##### poisson ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(cbind(1, X) %*% c(1, coef))
mu <- poisson()$linkinv(eta)
y <- rpois(n, mu)

o <- islasso(y ~ ., data = data.frame(y = y, X), family = poisson())
summary(o, pval = .05)

##### Gamma ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(cbind(1, X) %*% c(-1, coef))
mu <- Gamma(link = "log")$linkinv(eta)
shape <- 10
phi <- 1 / shape
y <- rgamma(n, scale = mu / shape, shape = shape)

o <- islasso(y ~ ., data = data.frame(y = y, X), family = Gamma(link = "log"))
summary(o, pval = .05)
## End(Not run)
islasso.path
is used to fit a generalized linear model via the induced smoothed lasso method. The regularization path is computed for the lasso or elastic-net penalty at a grid of values for the regularization parameter lambda. Linear, logistic, Poisson and Gamma regression models are supported.
islasso.path(formula, family = gaussian(), lambda = NULL, nlambda = 100, lambda.min.ratio = ifelse(nobs < nvars, 1E-2, 1E-03), alpha = 1, data, weights, subset, offset, contrasts = NULL, unpenalized, control = is.control())
formula |
an object of class “formula” (or one that can be coerced to that class): the ‘usual’ symbolic description of the model to be fitted. |
family |
the assumed response distribution. Gaussian, (quasi) Binomial, (quasi) Poisson, and Gamma are allowed. |
lambda |
A user supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. Supplying a value of lambda overrides this. |
nlambda |
The number of lambda values - default is 100. |
lambda.min.ratio |
Smallest value for lambda, as a fraction of lambda.max, the (data derived) entry value (i.e. the smallest value for which all coefficients are zero). The default depends on the sample size nobs relative to the number of variables nvars: 1E-2 if nobs < nvars, and 1E-3 otherwise. |
alpha |
The elastic-net mixing parameter, with 0 <= alpha <= 1: alpha = 1 gives the lasso penalty and alpha = 0 the ridge penalty. Default is 1. |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which islasso.path is called. |
weights |
observation weights. Default is 1 for each observation. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. |
contrasts |
an optional list. See the contrasts.arg of model.matrix.default. |
control |
a list of parameters for controlling the fitting process (see is.control). |
unpenalized |
optional. A vector of integers or characters indicating any covariate (in the formula) with coefficients not to be penalized. The intercept, if included in the model, is always unpenalized. |
The sequence of models implied by lambda is fit by the islasso method. islasso
estimates regression models by imposing a lasso-type penalty on some or all regression coefficients. However, the nonsmooth norm penalty is replaced by a smooth approximation justified under the induced smoothing paradigm. The advantage is that reliable standard errors are returned as model output and hypothesis testing on linear combinations of the regression parameters can be carried out straightforwardly via the Wald statistic. Simulation studies provide evidence that the proposed approach controls type-I errors and exhibits good power in different scenarios.
A list of
call |
the matched call. |
Info |
a named matrix containing information about the lambda values, estimated degrees of freedom, estimated dispersion parameters, deviance, log-likelihood, number of iterations and convergence criteria. |
GoF |
a named matrix containing information criteria, i.e., AIC, BIC, AICc, eBIC, GCV, GIC. |
Coef |
a matrix of estimated coefficients, one row for each value of lambda in the path. |
SE |
a matrix of estimated standard errors, one row for each value of lambda in the path. |
Weights |
a matrix of the weights used in the fitting process, one row for each value of lambda in the path. |
Linear.predictors |
a matrix of linear predictors, one row for each value of lambda in the path. |
Fitted.values |
a matrix of fitted values, one row for each value of lambda in the path. |
Residuals |
a matrix of working residuals, one row for each value of lambda in the path. |
Input |
a named list containing several input arguments, i.e., the numbers of observations and predictors, whether an intercept has to be estimated, the model matrix and the response vector, the observation weights, the offset, the family object used, the elastic-net mixing parameter and the vector used to specify the unpenalized estimators. |
control |
the value of the control argument used. |
formula |
the formula supplied. |
model |
if requested (the default), the model frame used. |
terms |
the terms object used. |
data |
the data argument. |
xlevels |
(where relevant) a record of the levels of the factors used in fitting. |
contrasts |
(only where relevant) the contrasts used. |
Maintainer: Gianluca Sottile <[email protected]>
Cilluffo, G, Sottile, G, La Grutta, S and Muggeo, VMR (2019). The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression. Statistical Methods in Medical Research, DOI: 10.1177/0962280219842890.
Sottile, G, Cilluffo, G, Muggeo, VMR (2019). The R package islasso: estimation and hypothesis testing in lasso regression. Technical Report on ResearchGate. doi:10.13140/RG.2.2.16360.11521.
islasso.path.fit
, coef.islasso.path
, summary.islasso.path
, residuals.islasso.path
, GoF.islasso.path
, logLik.islasso.path
, fitted.islasso.path
, predict.islasso.path
and deviance.islasso.path
methods.
set.seed(1)
n <- 100
p <- 30
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(X %*% coef)

##### gaussian ######
mu <- eta
y <- mu + rnorm(n, 0, sigma)

o <- islasso.path(y ~ ., data = data.frame(y = y, X), family = gaussian(), nlambda = 30L)
o
summary(o, lambda = 10)
coef(o, lambda = 10)
fitted(o, lambda = 10)
predict(o, type = "response", lambda = 10)
plot(o, yvar = "coefficients")
residuals(o, lambda = 10)
deviance(o, lambda = 10)
logLik(o, lambda = 10)
GoF.islasso.path(o)

## Not run:
##### binomial ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(cbind(1, X) %*% c(-1, coef))
mu <- binomial()$linkinv(eta)
y <- rbinom(n, 100, mu)
y <- cbind(y, 100 - y)

o <- islasso.path(cbind(y1, y2) ~ ., data = data.frame(y1 = y[,1], y2 = y[,2], X),
                  family = binomial(), nlambda = 30L)
temp <- GoF.islasso.path(o)
summary(o, pval = .05, lambda = temp$lambda.min["BIC"])

##### poisson ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(cbind(1, X) %*% c(1, coef))
mu <- poisson()$linkinv(eta)
y <- rpois(n, mu)

o <- islasso.path(y ~ ., data = data.frame(y = y, X), family = poisson(), nlambda = 30L)
temp <- GoF.islasso.path(o)
summary(o, pval = .05, lambda = temp$lambda.min["BIC"])

##### Gamma ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n*p), n, p)
eta <- drop(cbind(1, X) %*% c(-1, coef))
mu <- Gamma(link = "log")$linkinv(eta)
shape <- 10
phi <- 1 / shape
y <- rgamma(n, scale = mu / shape, shape = shape)

o <- islasso.path(y ~ ., data = data.frame(y = y, X), family = Gamma(link = "log"), nlambda = 30L)
temp <- GoF.islasso.path(o)
summary(o, pval = .05, lambda = temp$lambda.min["BIC"])
## End(Not run)
Diagnostic plots for an induced smoothed lasso model
## S3 method for class 'islasso' plot(x, ...)
x |
an object of class "islasso". |
... |
other graphical parameters for the plot |
The plot on the top left shows the standard deviance residuals against the fitted values. The plot on the top right is a normal QQ plot of the standardized deviance residuals; the red line is the expected line if the standardized residuals are normally distributed, i.e. the line with intercept 0 and slope 1. The bottom two panels relate to the link and variance functions: on the left, the squared standardized Pearson residuals are plotted against the fitted values; on the right, the working vector is plotted against the linear predictor.
Maintainer: Gianluca Sottile <[email protected]>
islasso.fit
, summary.islasso
, residuals.islasso
, logLik.islasso
, predict.islasso
and deviance.islasso
methods.
## Not run:
set.seed(1)
n <- 100
p <- 100
p1 <- 20  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
mu <- drop(X %*% coef)
y <- mu + rnorm(n, 0, sigma)

lambda <- 2
o <- islasso(y ~ ., data = data.frame(y = y, X), family = gaussian(), lambda = lambda)
plot(o)
## End(Not run)
Produces a coefficient profile plot of the coefficient paths for a fitted "islasso.path" object.
## S3 method for class 'islasso.path' plot(x, yvar = c("coefficients", "se", "gradient", "weight", "gof"), gof = c("none", "AIC", "BIC", "AICc", "eBIC", "GCV", "GIC"), label = FALSE, legend = FALSE, ...)
x |
an object of class "islasso.path". |
yvar |
What is on the Y-axis: "coefficients" plots the coefficients against the log-lambda sequence; "se" plots the standard errors; "gradient" plots the gradient; "weight" plots the mixture weight of the islasso method; "gof" plots the chosen goodness-of-fit criterion. |
gof |
the chosen criterion to highlight the active variables. "none" doesn't highlight active variables. |
label |
a logical flag indicating if some labels have to be added. |
legend |
a logical flag indicating if the legend has to be shown. |
... |
other graphical parameters for the plot, i.e., main, xlab, ylab, xlim, ylim, lty, col, lwd, cex.axis, cex.lab, cex.main, gof_lty, gof_col and gof_lwd. The last three parameters are used to modify aspects of the legend, and of the goodness of fit measure used. |
A coefficient profile plot is produced for the induced smoothed lasso path.
## Not run:
set.seed(1)
n <- 100
p <- 30
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
mu <- drop(X %*% coef)
y <- mu + rnorm(n, 0, sigma)

o <- islasso.path(y ~ ., data = data.frame(y = y, X), family = gaussian())

par(mfrow = c(2, 2))
plot(o, yvar = "coefficients", gof = "AICc", label = TRUE)
plot(o, yvar = "se", gof = "AICc")
plot(o, yvar = "gradient", gof = "AICc")
plot(o, yvar = "gof", gof = "AICc")
## End(Not run)
Prediction method for islasso fitted objects
## S3 method for class 'islasso' predict(object, newdata = NULL, type = c("link", "response", "coefficients", "class", "terms"), se.fit = FALSE, ci = NULL, type.ci = "wald", level = .95, terms = NULL, na.action = na.pass, ...)
object |
a fitted object of class "islasso". |
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used. |
type |
the type of prediction required. The default is on the scale of the linear predictors; the alternative "response" is on the scale of the response variable. Thus, for a default binomial model, the default predictions are of log-odds (probabilities on logit scale) and type = "response" gives the predicted probabilities. The "terms" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale. |
se.fit |
logical switch indicating if confidence intervals are required. |
ci |
optionally, a two-column matrix of estimated confidence intervals for the estimated coefficients. |
type.ci |
Only Wald-type confidence intervals are currently implemented. With type.ci = "wald", the point estimates and standard errors are used to build the confidence intervals. |
level |
the confidence level required. |
terms |
with type = "terms" by default all terms are returned. A character vector specifies which terms are to be returned. |
na.action |
function determining what should be done with missing values in newdata. The default is to predict NA. |
... |
further arguments passed to or from other methods. |
An object depending on the type argument
Maintainer: Gianluca Sottile <[email protected]>
islasso.fit
, summary.islasso
, residuals.islasso
, logLik.islasso
, predict.islasso
and deviance.islasso
methods.
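For example (a sketch; the new data frame below is illustrative and assumes a model fitted on predictors named X1, ..., Xp as in the example that follows):

# Predictions for new observations on the response scale; se.fit requests
# confidence intervals (illustrative newdata, column names assumed X1, ..., Xp).
newX <- as.data.frame(matrix(rnorm(5 * p), 5, p))
names(newX) <- paste0("X", seq_len(p))
predict(o, newdata = newX, type = "response", se.fit = TRUE, level = 0.95)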
set.seed(1)
n <- 100
p <- 100
p1 <- 20  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
mu <- drop(X %*% coef)
y <- mu + rnorm(n, 0, sigma)

lambda <- 2
o <- islasso(y ~ ., data = data.frame(y = y, X), lambda = lambda)
predict(o, type = "response")
Prediction method for islasso.path fitted objects
## S3 method for class 'islasso.path' predict(object, newdata, type = c("link", "response", "coefficients", "class"), lambda, ...)
object |
a fitted object of class "islasso.path". |
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used. |
type |
the type of prediction required. The default is on the scale of the linear predictors; the alternative "response" is on the scale of the response variable. Thus, for a default binomial model, the default predictions are of log-odds (probabilities on logit scale) and type = "response" gives the predicted probabilities. |
lambda |
Value(s) of the penalty parameter lambda at which predictions are required. Default is the entire sequence used to create the model. |
... |
further arguments passed to or from other methods. |
An object depending on the type argument
Maintainer: Gianluca Sottile <[email protected]>
islasso.path
, islasso.path.fit
, coef.islasso.path
, residuals.islasso.path
, GoF.islasso.path
, logLik.islasso.path
, fitted.islasso.path
, summary.islasso.path
and deviance.islasso.path
methods.
## Not run:
set.seed(1)
n <- 100
p <- 30
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))

X <- matrix(rnorm(n*p), n, p)
mu <- drop(X %*% coef)
y <- mu + rnorm(n, 0, sigma)

o <- islasso.path(y ~ ., data = data.frame(y = y, X), family = gaussian())
temp <- GoF.islasso.path(o)
predict(o, type = "response", lambda = temp$lambda.min)
## End(Not run)
These data come from a study that examined the correlation between the level of prostate-specific antigen and a number of clinical measures in men who were about to receive a radical prostatectomy. It is a data frame with 97 rows and 9 columns.
data(Prostate)
The data frame has the following components:
lcavol
log(cancer volume)
lweight
log(prostate weight)
age
age
lbph
log(benign prostatic hyperplasia amount)
svi
seminal vesicle invasion
lcp
log(capsular penetration)
gleason
Gleason score
pgg45
percentage Gleason scores 4 or 5
lpsa
log(prostate specific antigen)
Stamey, T.A., Kabalin, J.N., McNeal, J.E., Johnstone, I.M., Freiha, F., Redwine, E.A. and Yang, N. (1989) Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate: II. Radical prostatectomy treated patients, Journal of Urology 141(5), 1076–1083.
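A minimal analysis sketch (hedged: lpsa is taken as the response, which is the conventional choice for this data set):

library(islasso)
data(Prostate)

# Lasso-type fit of log PSA on the clinical measures, with lambda
# selected by minimizing the BIC via aic.islasso.
fit <- islasso(lpsa ~ ., data = Prostate)
lam <- aic.islasso(fit, method = "BIC")
fit <- islasso(lpsa ~ ., data = Prostate, lambda = lam)
summary(fit, pval = 0.05)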
Simulate model matrix and response from a specified distribution.
simulXy(n, p, interc = 0, beta, family = gaussian(), prop = 0.1, lim.b = c(-3, 3), sigma = 1, size = 1, rho = 0, scale = TRUE, seed, X)
n |
number of observations. |
p |
total number of covariates in the model matrix. |
interc |
the model intercept. |
beta |
the vector of p coefficients in the linear predictor. |
family |
a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. Only gaussian, binomial or poisson are allowed. |
prop |
if beta is missing, the proportion of nonzero coefficients out of p. The default is 0.1. |
lim.b |
if beta is missing, the range within which the nonzero coefficients are generated. The default is c(-3, 3). |
sigma |
if family is 'gaussian', the standard deviation of the response. The default is 1. |
size |
if family is 'binomial', the number of trials to build the response vector. The default is 1. |
rho |
correlation value used to define the variance-covariance matrix of the model matrix, i.e., rho^|i-j| for i, j = 1, ..., p and i different from j (see the sketch after this table). The default is 0. |
scale |
Should the columns of the model matrix be scaled? The default is TRUE. |
seed |
optional, the seed to generate the data. |
X |
optional, the model matrix. |
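As referenced in the rho argument above, a sketch of the implied correlation structure (illustrative, not the function's internal code):

# Correlation matrix with entries rho^|i - j| (AR(1)-type structure).
p   <- 5
rho <- 0.5
R   <- outer(seq_len(p), seq_len(p), function(i, j) rho^abs(i - j))
R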
n <- 100
p <- 100
beta <- c(runif(10, -3, 3), rep(0, p - 10))

dat <- simulXy(n, p, beta = beta, seed = 1234)
summary method for islasso fitted objects
## S3 method for class 'islasso' summary(object, pval = 1, which, use.t = FALSE, type.pval = "wald", ...)
object |
a fitted model object of class "islasso". |
pval |
a threshold p-value indicating which coefficients should be printed. If pval = 0.10, for example, only coefficients with p-value less than or equal to 0.10 are printed; the default pval = 1 prints all coefficients. |
which |
a specification of which parameters are to be given p-values. If missing, all parameters are considered. |
use.t |
if TRUE, p-values are computed using the t-distribution with the residual model degrees of freedom; the default (FALSE) uses the standard normal distribution. |
type.pval |
Only Wald-type p-values are currently implemented. With type.pval = "wald" (default), the point estimates and standard errors are used. |
... |
not used |
Maintainer: Gianluca Sottile <[email protected]>
islasso.fit
, summary.islasso
, residuals.islasso
, logLik.islasso
, predict.islasso
and deviance.islasso
methods.
## Not run:
# continues example from ?islasso
summary(o, pval = .1)  # print just the "borderline" significant coefficients
## End(Not run)
summary method for islasso.path fitted objects
## S3 method for class 'islasso.path' summary(object, pval = 1, use.t = FALSE, lambda, ...)
object |
a fitted model object of class "islasso.path". |
pval |
a threshold p-value indicating which coefficients should be printed. If pval = 0.10, for example, only coefficients with p-value less than or equal to 0.10 are printed; the default pval = 1 prints all coefficients. |
use.t |
if TRUE, p-values are computed using the t-distribution with the residual model degrees of freedom; the default (FALSE) uses the standard normal distribution. |
lambda |
Value of the penalty parameter lambda at which the summary is required. |
... |
not used |
Maintainer: Gianluca Sottile <[email protected]>
islasso.path
, islasso.path.fit
, coef.islasso.path
, residuals.islasso.path
, GoF.islasso.path
, logLik.islasso.path
, fitted.islasso.path
, predict.islasso.path
and deviance.islasso.path
methods.
## Not run:
# continues example from ?islasso.path
summary(o, pval = .1, lambda = 5)  # print just the "borderline" significant coefficients
## End(Not run)