Title: | Sparsity by Worst-Case Quadratic Penalties |
---|---|
Description: | Fits classical sparse regression models with efficient active set algorithms by solving quadratic problems as described by Grandvalet, Chiquet and Ambroise (2017) <doi:10.48550/arXiv.1210.2077>. Also provides a few methods for model selection purpose (cross-validation, stability selection). |
Authors: | Julien Chiquet [aut, cre] |
Maintainer: | Julien Chiquet <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2-12 |
Built: | 2024-12-23 06:45:57 UTC |
Source: | CRAN |
Adjust a linear model penalized by a mixture of a (possibly
weighted) -norm (bounding the
magnitude of the parameters) and a (possibly structured)
-norm (ridge-like). The solution path is computed
at a grid of values for the infinity-penalty, fixing the amount of
regularization. See details for the criterion
optimized.
bounded.reg( x, y, lambda1 = NULL, lambda2 = 0.01, penscale = rep(1, p), struct = NULL, intercept = TRUE, normalize = TRUE, naive = FALSE, nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)), min.ratio = ifelse(n <= p, 0.01, 0.001), max.feat = ifelse(lambda2 < 0.01, min(n, p), min(4 * n, p)), control = list(), checkargs = TRUE )
bounded.reg( x, y, lambda1 = NULL, lambda2 = 0.01, penscale = rep(1, p), struct = NULL, intercept = TRUE, normalize = TRUE, naive = FALSE, nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)), min.ratio = ifelse(n <= p, 0.01, 0.001), max.feat = ifelse(lambda2 < 0.01, min(n, p), min(4 * n, p)), control = list(), checkargs = TRUE )
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized os
|
y |
response vector. |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
penscale |
vector with real positive values that weight the infinity norm of each feature. Default set all weights to 1. See details below. |
struct |
matrix structuring the coefficients. Must be at
least positive semidefinite (this is checked internally if the
|
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
naive |
logical; Compute either 'naive' of 'classic' bounded
regression: mimicking the Elastic-net, the vector of parameters is
rescaled by a coefficient |
nlambda1 |
integer that indicates the number of values to put
in the |
min.ratio |
minimal value of infinity-part of the penalty
that will be tried, as a fraction of the maximal |
max.feat |
integer; limits the number of features ever to
enter the model: in our implementation of the bounded regression,
it corresponds to the variables which have left the boundary along
the path. The algorithm stops if this number is exceeded and
|
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
checkargs |
logical; should arguments be checked to
(hopefully) avoid internal crashes? Default is
|
The optimized criterion is
βhatλ1,λ2 = argminβ 1/2 RSS(β) + λ1 | D β |∞ + λ/2 2 βT S β,
where
is a diagonal matrix, whose diagonal terms are provided
as a vector by the
penscale
argument. The
structuring matrix
is provided via the
struct
argument, a positive semidefinite matrix (possibly of class
Matrix
).
Note that the quadratic algorithm for the bounded regression may
become unstable along the path because of singularity of the
underlying problem, e.g. when there are too much correlation or
when the size of the problem is close to or smaller than the
sample size. In such cases, it might be a good idea to switch to
the proximal solver, slower yet more robust. This is the strategy
adopted by the 'bulletproof'
mode, that will send a warning
while switching the method to 'fista'
and keep on
optimizing on the remainder of the path. When bulletproof
is set to FALSE
, the algorithm stops at an early stage of
the path of solutions. Hence, users should be careful when
manipulating the resulting 'quadrupen'
object, as it will
not have the size expected regarding the dimension of the
lambda1
argument.
Singularity of the system can also be avoided with a larger
-regularization, via
lambda2
, or a
"not-too-small" regularization, via
a larger
'min.ratio'
argument.
an object with class quadrupen
, see the
documentation page quadrupen
for details.
See also quadrupen
,
plot,quadrupen-method
and crossval
.
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) diag(Sigma) <- 1 n <- 50 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Infinity norm without/with an additional l2 regularization term ## and with structuring prior labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- "relevant" plot(bounded.reg(x,y,lambda2=0) , label=labels) ## a mess plot(bounded.reg(x,y,lambda2=10), label=labels) ## good guys are at the boundaries
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) diag(Sigma) <- 1 n <- 50 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Infinity norm without/with an additional l2 regularization term ## and with structuring prior labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- "relevant" plot(bounded.reg(x,y,lambda2=0) , label=labels) ## a mess plot(bounded.reg(x,y,lambda2=10), label=labels) ## good guys are at the boundaries
Function that computes K-fold (double) cross-validated error of a
quadrupen
fit. If no lambda2
is provided, simple
cross validation on the lambda1
parameter is performed. If
a vector lambda2
is passed as an argument, double
cross-validation is performed.
crossval( x, y, penalty = c("elastic.net", "bounded.reg"), K = 10, folds = split(sample(1:nrow(x)), rep(1:K, length = nrow(x))), lambda2 = 0.01, verbose = TRUE, mc.cores = 2, ... )
crossval( x, y, penalty = c("elastic.net", "bounded.reg"), K = 10, folds = split(sample(1:nrow(x)), rep(1:K, length = nrow(x))), lambda2 = 0.01, verbose = TRUE, mc.cores = 2, ... )
x |
matrix of features, possibly sparsely encoded (experimental). Do NOT include intercept. |
y |
response vector. |
penalty |
a string for the fitting procedure used for
cross-validation. Either |
K |
integer indicating the number of folds. Default is 10. |
folds |
list of |
lambda2 |
tunes the |
verbose |
logical; indicates if the progression (the current
lambda2) should be displayed. Default is |
mc.cores |
the number of cores to use. The default uses 2 cores. |
... |
additional parameters to overwrite the defaults of the
fitting procedure identified by the |
An object of class "cvpen" for which a plot
method
is available.
If the user runs the fitting method with option
'bulletproof'
set to FALSE
, the algorithm may stop
at an early stage of the path. Early stops are handled internally,
in order to provide results on the same grid of penalty tuned by
. This is done by means of
NA
values, so as mean and standard error are consistently
evaluated. If, while cross-validating, the procedure experiences
too many early stoppings, a warning is sent to the user, in which
case you should reconsider the grid of lambda1
used for the
cross-validation. If bulletproof
is TRUE
(the
default), there is nothing to worry about, except a possible slow
down when any switching to the proximal algorithm is required.
quadrupen
, plot,cvpen-method
and cvpen
.
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variable Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Use fewer lambda1 values by overwritting the default parameters ## and cross-validate over the sequences lambda1 and lambda2 cv.double <- crossval(x,y, lambda2=10^seq(2,-2,len=50), nlambda1=50) ## Rerun simple cross-validation with the appropriate lambda2 cv.10K <- crossval(x,y, lambda2=0.2) ## Try leave one out also cv.loo <- crossval(x,y, K=n, lambda2=0.2) plot(cv.double) plot(cv.10K) plot(cv.loo) ## Performance for selection purpose beta.min.10K <- slot(cv.10K, "beta.min") beta.min.loo <- slot(cv.loo, "beta.min") cat("\nFalse positives with the minimal 10-CV choice: ", sum(sign(beta) != sign(beta.min.10K))) cat("\nFalse positives with the minimal LOO-CV choice: ", sum(sign(beta) != sign(beta.min.loo)))
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variable Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Use fewer lambda1 values by overwritting the default parameters ## and cross-validate over the sequences lambda1 and lambda2 cv.double <- crossval(x,y, lambda2=10^seq(2,-2,len=50), nlambda1=50) ## Rerun simple cross-validation with the appropriate lambda2 cv.10K <- crossval(x,y, lambda2=0.2) ## Try leave one out also cv.loo <- crossval(x,y, K=n, lambda2=0.2) plot(cv.double) plot(cv.10K) plot(cv.loo) ## Performance for selection purpose beta.min.10K <- slot(cv.10K, "beta.min") beta.min.loo <- slot(cv.loo, "beta.min") cat("\nFalse positives with the minimal 10-CV choice: ", sum(sign(beta) != sign(beta.min.10K))) cat("\nFalse positives with the minimal LOO-CV choice: ", sum(sign(beta) != sign(beta.min.loo)))
Class of object returned by a cross-validation performed through
the crossval
method.
lambda1
:vector of
(
or
penalty levels)
for which each cross-validation has been performed.
lambda2
:vector (or scalar) of -penalty levels for
which each cross-validation has been performed.
lambda1.min
:level of that minimizes the
error estimated by cross-validation.
lambda1.1se
:largest level of such as
the cross-validated error is within 1 standard error of the
minimum.
lambda2.min
:level of that minimizes the
error estimated by cross-validation.
cv.error
:a data frame containing the mean
cross-validated error and its associated standard error for each
values of lambda1
and lambda2
.
folds
:list of K
vectors indicating the folds
used for cross-validation.
beta.min
:the vector of parameters obtained by
fitting the problem on the full data set x
and y
with
lambda1.min
and lambda2.min
penalties.
beta.1se
:the vector of parameters obtained by
fitting the problem on the full data set x
and y
with
lambda1.1se
and lambda2.min
penalties.
The specific plot,cvpen-method
method is documented.
See also plot,cvpen-method
and
crossval
.
Adjust a linear model with elastic-net regularization, mixing a
(possibly weighted) -norm (LASSO) and a
(possibly structured)
-norm (ridge-like). The
solution path is computed at a grid of values for the
-penalty, fixing the amount of
regularization. See details for the criterion optimized.
elastic.net( x, y, lambda1 = NULL, lambda2 = 0.01, penscale = rep(1, p), struct = NULL, intercept = TRUE, normalize = TRUE, naive = FALSE, nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)), min.ratio = ifelse(n <= p, 0.01, 1e-04), max.feat = ifelse(lambda2 < 0.01, min(n, p), min(4 * n, p)), beta0 = NULL, control = list(), checkargs = TRUE )
elastic.net( x, y, lambda1 = NULL, lambda2 = 0.01, penscale = rep(1, p), struct = NULL, intercept = TRUE, normalize = TRUE, naive = FALSE, nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)), min.ratio = ifelse(n <= p, 0.01, 1e-04), max.feat = ifelse(lambda2 < 0.01, min(n, p), min(4 * n, p)), beta0 = NULL, control = list(), checkargs = TRUE )
x |
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized os
|
y |
response vector. |
lambda1 |
sequence of decreasing |
lambda2 |
real scalar; tunes the |
penscale |
vector with real positive values that weight the
|
struct |
matrix structuring the coefficients (preferably
sparse). Must be at least positive semidefinite (this is checked
internally if the |
intercept |
logical; indicates if an intercept should be
included in the model. Default is |
normalize |
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
|
naive |
logical; Compute either 'naive' of classic
elastic-net as defined in Zou and Hastie (2006): the vector of
parameters is rescaled by a coefficient |
nlambda1 |
integer that indicates the number of values to put
in the |
min.ratio |
minimal value of |
max.feat |
integer; limits the number of features ever to
enter the model; i.e., non-zero coefficients for the Elastic-net:
the algorithm stops if this number is exceeded and |
beta0 |
a starting point for the vector of parameter. When
|
control |
list of argument controlling low level options of the algorithm –use with care and at your own risk– :
|
checkargs |
logical; should arguments be checked to
(hopefully) avoid internal crashes? Default is
|
The optimized criterion is the following:
βhatλ1,λ2 = argminβ 1/2 RSS(β) + λ1 | D β |1 + λ/2 2 βT S β,
where
is a diagonal matrix, whose diagonal terms are provided
as a vector by the
penscale
argument. The
structuring matrix
is provided via the
struct
argument, a positive semidefinite matrix (possibly of class
Matrix
).
an object with class quadrupen
, see the
documentation page quadrupen
for details.
See also quadrupen
,
plot,quadrupen-method
and crossval
.
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) diag(Sigma) <- 1 n <- 50 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- "relevant" ## Comparing the solution path of the LASSO and the Elastic-net plot(elastic.net(x,y,lambda2=0), label=labels) ## a mess plot(elastic.net(x,y,lambda2=10), label=labels) ## a lot better
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) diag(Sigma) <- 1 n <- 50 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- "relevant" ## Comparing the solution path of the LASSO and the Elastic-net plot(elastic.net(x,y,lambda2=0), label=labels) ## a mess plot(elastic.net(x,y,lambda2=10), label=labels) ## a lot better
quadrupen
modelProduce a plot of the cross validated error of a quadrupen
model.
\S4method{plot}{cvpen}(x, y, log.scale=TRUE, reverse=FALSE, plot=TRUE, main = "Cross-validation error", ...)
\S4method{plot}{cvpen}(x, y, log.scale=TRUE, reverse=FALSE, plot=TRUE, main = "Cross-validation error", ...)
x |
output of a |
y |
used for S4 compatibility. |
log.scale |
logical; indicates if a log-scale should be used
when |
reverse |
logical; should the X-axis by reversed when |
plot |
logical; indicates if the graph should be plotted. Default is |
main |
the main title, with a hopefully appropriate default definition. |
... |
used for S4 compatibility. |
a ggplot2 object which can be plotted via the print
method.
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Use fewer lambda1 values by overwritting the default parameters ## and cross-validate over the sequences lambda1 and lambda2 cv.double <- crossval(x,y, lambda2=10^seq(2,-2,len=50), nlambda1=50) ## Rerun simple cross-validation with the appropriate lambda2 cv.10K <- crossval(x,y, lambda2=.2) ## Try leave one out also cv.loo <- crossval(x,y, K=n, lambda2=0.2) plot(cv.double) plot(cv.10K) plot(cv.loo) ## Performance for selection purpose beta.min.10K <- slot(cv.10K, "beta.min") beta.min.loo <- slot(cv.loo, "beta.min") cat("\nFalse positives with the minimal 10-CV choice: ", sum(sign(beta) != sign(beta.min.10K))) cat("\nFalse positives with the minimal LOO-CV choice: ", sum(sign(beta) != sign(beta.min.loo)))
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Use fewer lambda1 values by overwritting the default parameters ## and cross-validate over the sequences lambda1 and lambda2 cv.double <- crossval(x,y, lambda2=10^seq(2,-2,len=50), nlambda1=50) ## Rerun simple cross-validation with the appropriate lambda2 cv.10K <- crossval(x,y, lambda2=.2) ## Try leave one out also cv.loo <- crossval(x,y, K=n, lambda2=0.2) plot(cv.double) plot(cv.10K) plot(cv.loo) ## Performance for selection purpose beta.min.10K <- slot(cv.10K, "beta.min") beta.min.loo <- slot(cv.loo, "beta.min") cat("\nFalse positives with the minimal 10-CV choice: ", sum(sign(beta) != sign(beta.min.10K))) cat("\nFalse positives with the minimal LOO-CV choice: ", sum(sign(beta) != sign(beta.min.loo)))
Produce a plot of the solution path of a quadrupen
fit.
\S4method{plot}{quadrupen}(x, y, xvar = "lambda", main = paste(slot(x, "penalty")," path", sep=""), log.scale = TRUE, standardize=TRUE, reverse=FALSE, labels = NULL, plot = TRUE, ...)
\S4method{plot}{quadrupen}(x, y, xvar = "lambda", main = paste(slot(x, "penalty")," path", sep=""), log.scale = TRUE, standardize=TRUE, reverse=FALSE, labels = NULL, plot = TRUE, ...)
x |
output of a fitting procedure of the quadrupen
package ( |
y |
used for S4 compatibility. |
xvar |
variable to plot on the X-axis: either |
main |
the main title. Default is set to the model name followed by what is on the Y-axis. |
log.scale |
logical; indicates if a log-scale should be used
when |
standardize |
logical; standardize the coefficients before
plotting (with the norm of the predictor). Default is |
reverse |
logical; should the X-axis be reversed when
|
labels |
vector indicating the names associated to the plotted
variables. When specified, a legend is drawn in order to identify
each variable. Only relevant when the number of predictor is
small. Remind that the intercept does not count. Default is
|
plot |
logical; indicates if the graph should be plotted on
call. Default is |
... |
Not used |
a ggplot2 object which can be plotted via the
print
method.
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) diag(Sigma) <- 1 n <- 50 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Plot the Lasso path plot(elastic.net(x,y, lambda2=0), main="Lasso solution path") ## Plot the Elastic-net path plot(elastic.net(x,y, lambda2=10), xvar = "lambda") ## Plot the Elastic-net path (fraction on X-axis, unstandardized coefficient) plot(elastic.net(x,y, lambda2=10), standardize=FALSE, xvar="fraction") ## Plot the Bounded regression path (fraction on X-axis) plot(bounded.reg(x,y, lambda2=10), xvar="fraction")
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) cor <- 0.75 Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables Sww <- matrix(cor,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) diag(Sigma) <- 1 n <- 50 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Plot the Lasso path plot(elastic.net(x,y, lambda2=0), main="Lasso solution path") ## Plot the Elastic-net path plot(elastic.net(x,y, lambda2=10), xvar = "lambda") ## Plot the Elastic-net path (fraction on X-axis, unstandardized coefficient) plot(elastic.net(x,y, lambda2=10), standardize=FALSE, xvar="fraction") ## Plot the Bounded regression path (fraction on X-axis) plot(bounded.reg(x,y, lambda2=10), xvar="fraction")
stability.path
.Produce a plot of the stability path obtained by stability selection.
\S4method{plot}{stability.path}(x, y, xvar = "lambda", annot=TRUE, main = paste("Stability path for ", slot(x, "penalty")," regularizer", sep=""), log.scale = TRUE, labels = rep("unknown status",p), plot = TRUE, sel.mode = c("rank","PFER"), cutoff=0.75, PFER=2, nvar=floor(n/log(p)), ...)
\S4method{plot}{stability.path}(x, y, xvar = "lambda", annot=TRUE, main = paste("Stability path for ", slot(x, "penalty")," regularizer", sep=""), log.scale = TRUE, labels = rep("unknown status",p), plot = TRUE, sel.mode = c("rank","PFER"), cutoff=0.75, PFER=2, nvar=floor(n/log(p)), ...)
x |
output of a |
y |
used for S4 compatibility. |
xvar |
variable to plot on the X-axis: either |
annot |
logical; should annotation be made on the graph
regarding controlled PFER (only relevant when |
main |
main title. If none given, a somewhat appropriate title is automatically generated. |
log.scale |
logical; indicates if a log-scale should be used
when |
labels |
an optional vector of labels for each variable in the path (e.g., 'relevant'/'irrelevant'). See examples. |
plot |
logical; indicates if the graph should be
plotted. Default is |
sel.mode |
a character string, either |
cutoff |
value of the cutoff probability (only relevant when
|
PFER |
value of the per-family error rate to control (only
relevant when |
nvar |
number of variables selected (only relevant when
|
... |
used for S4 compatibility. |
a list with a ggplot2 object which can be plotted
via the print
method, and a vector of selected variables
corresponding to method of choice ('rank'
or
'PFER'
)
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) Soo <- matrix(0.75,25,25) ## bloc correlation between zero variables Sww <- matrix(0.75,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.2 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Build a vector of label for true nonzeros labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- c("relevant") labels <- factor(labels, ordered=TRUE, levels=c("relevant","irrelevant")) ## Call to stability selection function, 200 subsampling stab <- stability(x,y, subsamples=200, lambda2=1, min.ratio=1e-2) ## Build the plot an recover the selected variable plot(stab, labels=labels) plot(stab, xvar="fraction", labels=labels, sel.mode="PFER", cutoff=0.75, PFER=2)
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) Soo <- matrix(0.75,25,25) ## bloc correlation between zero variables Sww <- matrix(0.75,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.2 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Build a vector of label for true nonzeros labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- c("relevant") labels <- factor(labels, ordered=TRUE, levels=c("relevant","irrelevant")) ## Call to stability selection function, 200 subsampling stab <- stability(x,y, subsamples=200, lambda2=1, min.ratio=1e-2) ## Build the plot an recover the selected variable plot(stab, labels=labels) plot(stab, xvar="fraction", labels=labels, sel.mode="PFER", cutoff=0.75, PFER=2)
Class of object returned by any fitting function of the
quadrupen package (elastic.net
or
bounded.reg
).
coefficients
:Matrix (class "dgCMatrix"
) of
coefficients with respect to the original input. The number of
rows corresponds the length of lambda1
.
active.set
:Matrix (class "dgCMatrix"
, generally
sparse) indicating the 'active' variables, in the sense that they
activate the constraints. For the elastic.net
, it
corresponds to the nonzero variables; for the
bounded.reg
function, it is the set of variables
reaching the boundary along the path of solutions.
intercept
:logical; indicates if an intercept has been included to the model.
mu
:A vector (class "numeric"
)
containing the successive values of the (unpenalized) intercept.
Equals to zero if intercept
has been set to FALSE
.
meanx
:Vector (class "numeric"
) containing
the column means of the predictor matrix.
normx
:Vector (class "numeric"
) containing the
square root of the sum of squares of each column of the design
matrix.
penscale
:Vector "numeric"
with real positive
values that have been used to weight the penalty tuned by
.
penalty
:Object of class "character"
indicating the method used ("elastic-net"
or "bounded
regression"
).
naive
:logical; was the naive
mode on?
lambda1
:Vector (class "numeric"
) of penalty
levels (either or
)
for which the model has eventually been fitted.
lambda2
:Scalar (class "numeric"
) for the
amount of (ridge-like) penalty.
struct
:Object of class "Matrix"
used to
structure the coefficients in the penalty.
control
:Object of class "list"
with low
level options used for optimization.
monitoring
:List (class "list"
) which
contains various indicators dealing with the optimization
process.
residuals
:Matrix of residuals, each column
corresponding to a value of lambda1
.
r.squared
:Vector (class "numeric"
) given the
coefficient of determination as a function of lambda1.
fitted
:Matrix of fitted values, each column
corresponding to a value of lambda1
.
This class comes with the usual predict(object, newx, ...)
,
fitted(object, ...)
, residuals(object, ...)
,
print(object, ...)
, show(object)
and
deviance(object, ...)
generic (undocumented) methods.
A specific plotting method is available and documented
(plot,quadrupen-method
).
See also plot,quadrupen-method
.
Compute the stability path of a (possibly randomized) fitting procedure as introduced by Meinshausen and Buhlmann (2010).
stability( x, y, penalty = c("elastic.net", "bounded.reg"), subsamples = 100, sample.size = floor(n/2), randomize = TRUE, weakness = 0.5, verbose = TRUE, folds = replicate(subsamples, sample(1:nrow(x), sample.size), simplify = FALSE), mc.cores = 2, ... )
stability( x, y, penalty = c("elastic.net", "bounded.reg"), subsamples = 100, sample.size = floor(n/2), randomize = TRUE, weakness = 0.5, verbose = TRUE, folds = replicate(subsamples, sample(1:nrow(x), sample.size), simplify = FALSE), mc.cores = 2, ... )
x |
matrix of features, possibly sparsely encoded (experimental). Do NOT include intercept. |
y |
response vector. |
penalty |
a string for the fitting procedure used for
cross-validation. Either |
subsamples |
integer indicating the number of subsamplings used to estimate the selection probabilities. Default is 100. |
sample.size |
integer indicating the size of each subsamples.
Default is |
randomize |
Should a randomized version of the fitting
procedure by used? Default is |
weakness |
Coefficient used for randomizing. Default is
|
verbose |
logical; indicates if the progression should be
displayed. Default is |
folds |
list with |
mc.cores |
the number of cores to use. The default uses 2 cores. |
... |
additional parameters to overwrite the defaults of the
fitting procedure. See the corresponding documentation
( |
An object of class stability.path
.
When randomized = TRUE
, the penscale
argument
that weights the penalty tuned by is
perturbed (divided) for each subsample by a random variable
uniformly distributed on
[α,1],
where
α is
the weakness parameter.
If the user runs the fitting method with option
'bulletproof'
set to FALSE
, the algorithm may stop
at an early stage of the path. Early stops of the underlying
fitting function are handled internally, in the following way: we
chose to simply skip the results associated with such runs, in
order not to bias the stability selection procedure. If it occurs
too often, a warning is sent to the user, in which case you should
reconsider the grid of lambda1
for stability selection. If
bulletproof
is TRUE
(the default), there is nothing
to worry about, except a possible slow down when any switching to
the proximal algorithm is required.
N. Meinshausen and P. Buhlmann (2010). Stability Selection, JRSS(B).
stability.path
and
plot,stability.path-method
.
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) Soo <- matrix(0.75,25,25) ## bloc correlation between zero variables Sww <- matrix(0.75,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.2 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Build a vector of label for true nonzeros labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- c("relevant") labels <- factor(labels, ordered=TRUE, levels=c("relevant","irrelevant")) ## Call to stability selection function, 200 subsampling stab <- stability(x,y, subsamples=200, lambda2=1, min.ratio=1e-2) ## Recover the selected variables for a given cutoff ## and per-family error rate, without producing any plot stabpath <- plot(stab, cutoff=0.75, PFER=1, plot=FALSE) cat("\nFalse positives for the randomized Elastic-net with stability selection: ", sum(labels[stabpath$selected] != "relevant")) cat("\nDONE.\n")
## Simulating multivariate Gaussian with blockwise correlation ## and piecewise constant vector of parameters beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25)) Soo <- matrix(0.75,25,25) ## bloc correlation between zero variables Sww <- matrix(0.75,10,10) ## bloc correlation between active variables Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.2 diag(Sigma) <- 1 n <- 100 x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma)) y <- 10 + x %*% beta + rnorm(n,0,10) ## Build a vector of label for true nonzeros labels <- rep("irrelevant", length(beta)) labels[beta != 0] <- c("relevant") labels <- factor(labels, ordered=TRUE, levels=c("relevant","irrelevant")) ## Call to stability selection function, 200 subsampling stab <- stability(x,y, subsamples=200, lambda2=1, min.ratio=1e-2) ## Recover the selected variables for a given cutoff ## and per-family error rate, without producing any plot stabpath <- plot(stab, cutoff=0.75, PFER=1, plot=FALSE) cat("\nFalse positives for the randomized Elastic-net with stability selection: ", sum(labels[stabpath$selected] != "relevant")) cat("\nDONE.\n")
Class of object returned by the stability
function, with
methods print
, show
and plot
.
probabilities
: a Matrix
object containing the
estimated probabilities of selection along the path of solutions.
penalty
: Object of class "character"
indicating the penalizer used.
naive
: logical indicating whether rescaling of the
coefficients has been performed regarding the -penalty.
lambda1
: a vector with the levels of the first penalty.
lambda2
: a scalar with the -penalty level.
folds
: a list that contains the folds used for each subsample.
See also plot,stability.path-method
, and
stability
.