Title: | Estimate a Penalized Linear Model using the CDF Penalty Function |
---|---|
Description: | Utilize the CDF penalty function to estimate a penalized linear model. It enables you to display some graphical representations and determine whether the Karush-Kuhn-Tucker conditions are met. For more details about the theory, please refer to Cuntrera, D., Augugliaro, L., & Muggeo, V. M. (2022) <arXiv:2212.08582>. |
Authors: | Daniele Cuntrera [aut, cre], Luigi Augugliaro [aut], Vito M.R. Muggeo [aut] |
Maintainer: | Daniele Cuntrera <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-12-11 06:47:36 UTC |
Source: | CRAN |
Function that takes the resulting values of the estimated model as input, to compute BIC
BIC_calc(X, b.tld, y, n)
BIC_calc(X, b.tld, y, n)
X |
The covariates' matrix |
b.tld |
The estimated sparse-beta |
y |
The response variable |
n |
The number of observation |
Returns the BIC value calculated for a single value of the tuning parameter.
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y) (bic <- BIC_cdfpen(out)) plot(out$lmb, bic, "s")
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y) (bic <- BIC_cdfpen(out)) plot(out$lmb, bic, "s")
Calculates the BIC for all estimated models in a "cdfpen" object
BIC_cdfpen(object)
BIC_cdfpen(object)
object |
Object containing the results. |
Returns a vector containing the BIC values calculated over the entire estimated path
Uses the CDF penalty to estimate a linear model with the maximum penalized likelihood. The path of coefficients is computed for a grid of values for the lambda regularization parameter.
cdfPen(X, y, nu, lmb, nlmb = 100L, e = 1E-3, rho = 2, algorithm = c("lla", "opt"), nstep = 1E+5, eps = 1E-6, eps.lla = 1E-6, nstep.lla = 1E+5)
cdfPen(X, y, nu, lmb, nlmb = 100L, e = 1E-3, rho = 2, algorithm = c("lla", "opt"), nstep = 1E+5, eps = 1E-6, eps.lla = 1E-6, nstep.lla = 1E+5)
X |
Matrix of covariates, each row is a vector of observations. The matrix must not contain the intercept. |
y |
Vector of response variable. |
nu |
Shape parameter of the penalty. It affects the degree of the non-convexity of the penalty. If no value is specified, the smallest value that ensures a single solution will be used. |
lmb |
A user-supplied tuning parameter sequence. |
nlmb |
number of lambda values; 100 is the default value. |
e |
The smallest lambda value, expressed as a percentage of maximum lambda. Default value is .001. |
rho |
Parameter of the optimization algorithm. Default is 2. |
algorithm |
Approximation to be used to obtain the sparse solution. |
nstep |
Maximum number of iterations of the global algorithm. |
eps |
Convergence threshold of the global algorithm. |
eps.lla |
Convergence threshold of the LLA-algorithm (if used). |
nstep.lla |
Maximum number of iterations of the LLA-algorithm (if used). |
We consider a local quadratic approximation of the likelihood to treat the problem as a weighted linear model.
The choice of value assigned to is of fundamental importance: it affects both computational and estimation aspects. It affects the ”degree of non-convexity” of the penalty and determines which of the good and bad properties of convex and non-convex penalties are obtained. Using a high value of
ensures the uniqueness of solution, but the estimates will be biased. Conversely, a small value of
guarantees negligible bias in the estimates. The parameter
has the role of determining the convergence rate of non-null estimates$: the lower the value, the higher the convergence rate. Using lower values of
, the objective function will have local minima.
coefficients |
The coefficients fit matrix. The number of columns is equal to nlmb, and the number of rows is equal to the number of coefficients. |
lmb |
The vector of lambda used. |
e |
The smallest lambda value, expressed as a percentage of maximum lambda. Default value is .001. |
rho |
The parameter of the optimization algorithm used |
nu |
The shape parameters of the penalty used. |
X |
The design matrix. |
y |
The response. |
algorithm |
Approximation used |
Daniele Cuntrera, Luigi Augugliaro, Vito Muggeo
Aggiungere Arxiv
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y)
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y)
These are the fundamental computing algorithms that cdfPen invokes to estimate penalized linear models by varying lambda.
cdfPen.fit(b, b.tld, g, b.rho, H.rho, lmb.rho, nu, algorithm, nstep = 1E+5, eps = 1E-5, eps.lla = 1E-6, nstep.lla = 1E+5)
cdfPen.fit(b, b.tld, g, b.rho, H.rho, lmb.rho, nu, algorithm, nstep = 1E+5, eps = 1E-5, eps.lla = 1E-6, nstep.lla = 1E+5)
b |
Starting values of beta-vector. |
b.tld |
Starting values of sparse beta-vector. |
g |
Starting values of pseudo-variable. |
b.rho |
Ridge solution. |
H.rho |
Second part of ridge solution. |
lmb.rho |
Lambda-rho ratio. |
nu |
Shape parameter of the penalty. It affects the degree of the non-convexity of the penalty. |
algorithm |
Approximation to be used to obtain the sparse solution. |
nstep |
Maximum number of iterations of the global algorithm. |
eps |
Convergence threshold of the global algorithm. |
eps.lla |
Convergence threshold of the LLA-algorithm (if used). |
nstep.lla |
Maximum number of iterations of the LLA-algorithm (if used). |
b |
Estimated beta-vector. |
b.tld |
Estimated sparse beta-vector. |
g |
Final values of pseudo-variable. |
i |
Number of iterations. |
conv |
Convergence check status (0 if converged). |
Daniele Cuntrera, Luigi Augugliaro, Vito Muggeo
Aggiungere Arxiv
Control over Karush-Kuhn-Tucker (Karush, 1939) conditions for the estimates obtained.
check_KKT(obj, intercept = TRUE)
check_KKT(obj, intercept = TRUE)
obj |
Object to be checked. |
intercept |
Is the intercept used in the model? |
grd |
The value of gradient. |
hx |
The value of equality constraint. |
glob |
The global value of derivative (grd + hx). |
test |
Is the condition verified? |
lmb |
The values of lambda used in the model |
Daniele Cuntrera, Luigi Augugliaro, Vito Muggeo
Karush, W. (1939). Minima of functions of several variables with inequalities as side constraints. M. Sc. Dissertation. Dept. of Mathematics, Univ. of Chicago.
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y) KKT <- check_KKT(out) plot(KKT$test)
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y) KKT <- check_KKT(out) plot(KKT$test)
Linearly approximate a part of the objective function to greatly speed up computations.
lla(b.o, lmb.rho, bm_gm, nu, nstep.lla = 100L, eps.lla = 1E-6)
lla(b.o, lmb.rho, bm_gm, nu, nstep.lla = 100L, eps.lla = 1E-6)
b.o |
Vector of sparse-solution. |
lmb.rho |
Lambda-rho ratio. |
bm_gm |
Vector of pseudo-solution |
nu |
Shape parameter of the penalty. |
nstep.lla |
Maximum number of iterations of the LLA-algorithm (if used). |
eps.lla |
Convergence threshhold of the LLA-algorithm (if used). |
The LLA approximation allows the computationally intensive part to be treated as a weighted LASSO (Tibshirani, 1996) problem. In this way the computational effort is significantly less while maintaining satisfactory accuracy of the results. See Zou and Li (2008).
b |
Vector of the estimated sparse-solution. |
Conv |
Convergence check (0 if converged). |
nstep.lla |
Number of iterations done. |
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267–288.
Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. Annals of statistics, 36(4):1509
Plot coefficient profile plot or BIC trend
plot_cdfpen(object, ...)
plot_cdfpen(object, ...)
object |
Object to be plotted. |
... |
Other graphical parameters to plot. |
A graph showing the BIC trend or profile of coefficients is displayed.
No return value
Daniele Cuntrera, Luigi Augugliaro, Vito Muggeo
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y) plot_cdfpen(out) #Coefficients' path ~ lambda plot_cdfpen(out, "l1") #Coefficients' path ~ L1 norm plot_cdfpen(out, "BIC") #BIC ~ lambda
p <- 10 n <- 100 X <- cbind(1, matrix(rnorm(n * p), n , p)) b.s <- c(1, rep(0, p)) b.s[sample(2:p, 3)] <- 1 y <- drop(crossprod(t(X), b.s)) out <- cdfPen(X = X, y = y) plot_cdfpen(out) #Coefficients' path ~ lambda plot_cdfpen(out, "l1") #Coefficients' path ~ L1 norm plot_cdfpen(out, "BIC") #BIC ~ lambda
Function that takes user requests as input, to show the requested graph
plot_path(obj, lmb, coeff, type = c("path", "l1", "BIC"), ...)
plot_path(obj, lmb, coeff, type = c("path", "l1", "BIC"), ...)
obj |
Object to be plotted |
lmb |
lambda values used in the model |
coeff |
the coefficients' matrix |
type |
type of graph to be ploted |
... |
Other characteristics to be added |
No return value
Applies the threshold rule to obtain the vector of sparse estimates
S(bm_gm, db, w)
S(bm_gm, db, w)
bm_gm |
Vector of pseudo-solution. |
db |
Lambda-rho ratio. |
w |
Weights obtained from the penalty function. |
The estimated coefficient