Title: | Penalized Precision Matrix Estimation via ADMM |
---|---|
Description: | Estimates a penalized precision matrix via the alternating direction method of multipliers (ADMM) algorithm. It currently supports a general elastic-net penalty that allows for both ridge and lasso-type penalties as special cases. This package is an alternative to the 'glasso' package. See Boyd et al (2010) <doi:10.1561/2200000016> for details regarding the estimation method. |
Authors: | Matt Galloway [aut, cre] |
Maintainer: | Matt Galloway <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.1 |
Built: | 2024-10-31 06:46:28 UTC |
Source: | CRAN |
Penalized precision matrix estimation using the ADMM algorithm.
Consider the case where \(X_1, \dots, X_n\) are iid \(N_p(\mu, \Sigma)\) and we are tasked with estimating the precision matrix, denoted \(\Omega \equiv \Sigma^{-1}\). This function solves the following optimization problem:

\[ \hat{\Omega} = \arg\min_{\Omega \in S_+^p} \left\{ \mathrm{tr}(S\Omega) - \log\det(\Omega) + \lambda\left[\frac{1-\alpha}{2}\|\Omega\|_F^2 + \alpha\|\Omega\|_1\right] \right\} \]

where \(\lambda > 0\), \(0 \le \alpha \le 1\), \(\|\cdot\|_F\) is the Frobenius norm and we define \(\|\Omega\|_1 = \sum_{i,j} |\Omega_{ij}|\).
This elastic-net penalty is identical to the penalty used in the popular penalized regression package glmnet. Clearly, when \(\alpha = 0\) the elastic-net reduces to a ridge-type penalty and when \(\alpha = 1\) it reduces to a lasso-type penalty.
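For concreteness, the penalty term can be written as a small R function (en_penalty is an illustrative name, not part of the package); note that the diagonal argument documented below controls whether the diagonal elements are actually penalized by ADMMsigma:

# elastic-net penalty as defined above:
# P(Omega) = lam * ( (1 - alpha)/2 * ||Omega||_F^2 + alpha * ||Omega||_1 )
en_penalty <- function(Omega, lam, alpha) {
  frob <- sum(Omega^2)      # squared Frobenius norm
  l1   <- sum(abs(Omega))   # element-wise l1 norm
  lam * ((1 - alpha) / 2 * frob + alpha * l1)
}

Omega <- diag(3)
en_penalty(Omega, lam = 0.1, alpha = 0)   # ridge-type penalty: 0.15
en_penalty(Omega, lam = 0.1, alpha = 1)   # lasso-type penalty: 0.3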
ADMMsigma(X = NULL, S = NULL, nlam = 10, lam.min.ratio = 0.01, lam = NULL,
  alpha = seq(0, 1, 0.2), diagonal = FALSE, path = FALSE, rho = 2, mu = 10,
  tau.inc = 2, tau.dec = 2, crit = c("ADMM", "loglik"), tol.abs = 1e-04,
  tol.rel = 1e-04, maxit = 10000, adjmaxit = NULL, K = 5,
  crit.cv = c("loglik", "penloglik", "AIC", "BIC"), start = c("warm", "cold"),
  cores = 1, trace = c("progress", "print", "none"))
X | option to provide a nxp data matrix. Each row corresponds to a single observation and each column contains n observations of a single feature/variable. |
S | option to provide a pxp sample covariance matrix (denominator n). If argument is NULL and X is provided instead, S will be computed automatically. |
nlam | number of lam tuning parameters for the penalty term generated automatically. Defaults to 10. |
lam.min.ratio | smallest lam value provided as a fraction of the maximum lam value. Defaults to 1e-2. |
lam | option to provide positive tuning parameters for the penalty term. This will cause nlam and lam.min.ratio to be disregarded. If a vector of parameters is provided, they should be in increasing order. Defaults to NULL. |
alpha | elastic net mixing parameter contained in [0, 1]. 0 corresponds to a ridge-type penalty and 1 to a lasso-type penalty. Defaults to the grid seq(0, 1, 0.2). |
diagonal | option to penalize the diagonal elements of the estimated precision matrix (Omega). Defaults to FALSE. |
path | option to return the regularization path. This option should be used with extreme care if the dimension is large. If set to TRUE, cores must be set to 1 and errors and optimal tuning parameters will be based on the full sample. Defaults to FALSE. |
rho | initial step size for the ADMM algorithm. Defaults to 2. |
mu | factor for primal and residual norms in the ADMM algorithm. This will be used to adjust the step size rho after each iteration. Defaults to 10. |
tau.inc | factor by which to increase the step size rho. Defaults to 2. |
tau.dec | factor by which to decrease the step size rho. Defaults to 2. |
crit | criterion for convergence ("ADMM" or "loglik"). Defaults to "ADMM". |
tol.abs | absolute convergence tolerance. Defaults to 1e-4. |
tol.rel | relative convergence tolerance. Defaults to 1e-4. |
maxit | maximum number of iterations. Defaults to 1e4. |
adjmaxit | adjusted maximum number of iterations. During cross validation this option allows the user to adjust the maximum number of iterations after the first lam tuning parameter has converged. This option is intended to be paired with warm starts. Defaults to NULL. |
K | specify the number of folds for cross validation. Defaults to 5. |
crit.cv | cross validation criterion ("loglik", "penloglik", "AIC", or "BIC"). Defaults to "loglik". |
start | specify a "warm" or "cold" start for cross validation. Defaults to "warm". |
cores | option to run CV in parallel. Defaults to cores = 1. |
trace | option to display progress of CV. Choose one of "progress", "print", or "none". Defaults to "progress". |
For details on the implementation of 'ADMMsigma', see the website https://mgallow.github.io/ADMMsigma/articles/Details.html.
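To make the structure of the updates concrete, the following is a minimal R sketch of ADMM iterations for the objective above, with the step size rho held fixed and a fixed number of iterations. The function name admm_sketch is illustrative only; the package's own routine is written in C++, adjusts rho via mu, tau.inc, and tau.dec, and uses the stopping criteria described above, so this sketch should not be read as the package's implementation.

# minimal ADMM sketch for the elastic-net penalized Gaussian likelihood
# (illustrative only: fixed step size, fixed iteration count, diagonal penalized)
admm_sketch <- function(S, lam, alpha, rho = 2, iters = 200) {
  p <- ncol(S)
  Z <- diag(p)              # sparse copy of Omega
  U <- matrix(0, p, p)      # scaled dual variable (Y / rho)
  soft <- function(a, k) sign(a) * pmax(abs(a) - k, 0)
  for (t in seq_len(iters)) {
    # Omega update: minimize tr(S Omega) - log det(Omega) + (rho/2)||Omega - Z + U||_F^2
    # via the eigen-decomposition of rho*(Z - U) - S
    eig <- eigen(rho * (Z - U) - S, symmetric = TRUE)
    w <- (eig$values + sqrt(eig$values^2 + 4 * rho)) / (2 * rho)
    Omega <- eig$vectors %*% diag(w) %*% t(eig$vectors)
    # Z update: element-wise elastic-net proximal operator
    A <- Omega + U
    Z <- soft(rho * A, lam * alpha) / (rho + lam * (1 - alpha))
    # dual update
    U <- U + Omega - Z
  }
  Z
}

# usage with a sample covariance matrix (denominator n):
# S <- cov(X) * (nrow(X) - 1) / nrow(X)
# admm_sketch(S, lam = 0.1, alpha = 1)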
Returns an object of class ADMMsigma which includes the following components:
Call | function call. |
Iterations | number of iterations. |
Tuning | optimal tuning parameters (lam and alpha). |
Lambdas | grid of lambda values for CV. |
Alphas | grid of alpha values for CV. |
maxit | maximum number of iterations. |
Omega | estimated penalized precision matrix. |
Sigma | estimated covariance matrix from the penalized precision matrix (inverse of Omega). |
Path | array containing the solution path. Solutions will be ordered in ascending alpha values for each lambda. |
Z | final sparse update of estimated penalized precision matrix. |
Y | final dual update. |
rho | final step size. |
Loglik | penalized log-likelihood for Omega. |
MIN.error | minimum average cross validation error (cv.crit) for optimal parameters. |
AVG.error | average cross validation error (cv.crit) across all folds. |
CV.error | cross validation errors (cv.crit). |
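The components listed above can be inspected from the returned object in the usual way, for example (a brief illustration; the data here are arbitrary):

set.seed(123)
X <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
fit <- ADMMsigma(X)
fit$Tuning    # optimal lam and alpha selected by CV
fit$Omega     # penalized precision matrix estimate
fit$Sigma     # corresponding covariance matrix estimate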
Matt Galloway [email protected]
Boyd, Stephen, Neal Parikh, Eric Chu, Borja Peleato, Jonathan Eckstein, and others. 2011. 'Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers.' Foundations and Trends in Machine Learning 3 (1). Now Publishers, Inc.: 1-122. https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf
Hu, Yue, Eric C. Chi, and Genevera I. Allen. 2016. 'ADMM Algorithmic Regularization Paths for Sparse Statistical Machine Learning.' Splitting Methods in Communication, Imaging, Science, and Engineering. Springer: 433-459.
Zou, Hui, and Trevor Hastie. 2005. 'Regularization and Variable Selection via the Elastic Net.' Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2). Wiley Online Library: 301-320.
Rothman, Adam. 2017. 'STAT 8931 notes on an algorithm to compute the Lasso-penalized Gaussian likelihood precision matrix estimator.'
# generate data from a sparse matrix
# first compute covariance matrix
S = matrix(0.7, nrow = 5, ncol = 5)
for (i in 1:5){
  for (j in 1:5){
    S[i, j] = S[i, j]^abs(i - j)
  }
}

# generate 100 x 5 matrix with rows drawn from iid N_p(0, S)
set.seed(123)
Z = matrix(rnorm(100*5), nrow = 100, ncol = 5)
out = eigen(S, symmetric = TRUE)
S.sqrt = out$vectors %*% diag(out$values^0.5)
S.sqrt = S.sqrt %*% t(out$vectors)
X = Z %*% S.sqrt

# elastic-net type penalty (use CV for optimal lambda and alpha)
ADMMsigma(X)

# ridge penalty (use CV for optimal lambda)
ADMMsigma(X, alpha = 0)

# lasso penalty (lam = 0.1)
ADMMsigma(X, lam = 0.1, alpha = 1)
Produces a plot for the cross validation errors, if available.
## S3 method for class 'ADMM'
plot(x, type = c("line", "heatmap"), footnote = TRUE, ...)
x | class object ADMM. |
type | produce either 'heatmap' or 'line' graph. |
footnote | option to print footnote of optimal values. Defaults to TRUE. |
... | additional arguments. |
# generate data from a sparse matrix
# first compute covariance matrix
S = matrix(0.7, nrow = 5, ncol = 5)
for (i in 1:5){
  for (j in 1:5){
    S[i, j] = S[i, j]^abs(i - j)
  }
}

# generate 100 x 5 matrix with rows drawn from iid N_p(0, S)
set.seed(123)
Z = matrix(rnorm(100*5), nrow = 100, ncol = 5)
out = eigen(S, symmetric = TRUE)
S.sqrt = out$vectors %*% diag(out$values^0.5)
S.sqrt = S.sqrt %*% t(out$vectors)
X = Z %*% S.sqrt

# produce line graph for ADMMsigma
plot(ADMMsigma(X), type = 'line')

# produce CV heat map for ADMMsigma
plot(ADMMsigma(X), type = 'heatmap')
Produces a heat plot for the cross validation errors, if available.
## S3 method for class 'RIDGE'
plot(x, type = c("heatmap", "line"), footnote = TRUE, ...)
x | class object RIDGE. |
type | produce either 'heatmap' or 'line' graph. |
footnote | option to print footnote of optimal values. Defaults to TRUE. |
... | additional arguments. |
# generate data from a sparse matrix
# first compute covariance matrix
S = matrix(0.7, nrow = 5, ncol = 5)
for (i in 1:5){
  for (j in 1:5){
    S[i, j] = S[i, j]^abs(i - j)
  }
}

# generate 100 x 5 matrix with rows drawn from iid N_p(0, S)
set.seed(123)
Z = matrix(rnorm(100*5), nrow = 100, ncol = 5)
out = eigen(S, symmetric = TRUE)
S.sqrt = out$vectors %*% diag(out$values^0.5)
S.sqrt = S.sqrt %*% t(out$vectors)
X = Z %*% S.sqrt

# produce CV heat map for RIDGEsigma
plot(RIDGEsigma(X, lam = 10^seq(-5, 5, 0.5)))

# produce line graph for RIDGEsigma
plot(RIDGEsigma(X), type = 'line')
Ridge penalized precision matrix estimation via closed-form solution. If one is only interested in the ridge penalty, this function will be faster and provide a more precise estimate than using ADMMsigma.
Consider the case where \(X_1, \dots, X_n\) are iid \(N_p(\mu, \Sigma)\) and we are tasked with estimating the precision matrix, denoted \(\Omega \equiv \Sigma^{-1}\). This function solves the following optimization problem:

\[ \hat{\Omega} = \arg\min_{\Omega \in S_+^p} \left\{ \mathrm{tr}(S\Omega) - \log\det(\Omega) + \lambda\|\Omega\|_F^2 \right\} \]

where \(\lambda > 0\) and \(\|\cdot\|_F\) is the Frobenius norm.
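To illustrate why a closed-form solution exists, the following is a minimal R sketch that solves the stationarity condition of the objective above, S - Omega^{-1} + 2*lambda*Omega = 0, through the eigen-decomposition of S. The function name ridge_sketch is illustrative only, and the sketch assumes the penalty is written exactly as lambda * ||Omega||_F^2 as above; it is not the package's exact implementation.

# closed-form ridge-penalized precision matrix (illustrative sketch)
ridge_sketch <- function(S, lam) {
  eig <- eigen(S, symmetric = TRUE)
  d <- eig$values
  # for each eigenvalue d of S, solve 2*lam*w^2 + d*w - 1 = 0 (positive root)
  w <- (-d + sqrt(d^2 + 8 * lam)) / (4 * lam)
  eig$vectors %*% diag(w) %*% t(eig$vectors)
}

# usage with a sample covariance matrix (denominator n):
# S <- cov(X) * (nrow(X) - 1) / nrow(X)
# ridge_sketch(S, lam = 0.1)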
RIDGEsigma(X = NULL, S = NULL, lam = 10^seq(-2, 2, 0.1), path = FALSE, K = 5,
  cores = 1, trace = c("none", "progress", "print"))
X | option to provide a nxp data matrix. Each row corresponds to a single observation and each column contains n observations of a single feature/variable. |
S | option to provide a pxp sample covariance matrix (denominator n). If argument is NULL and X is provided instead, S will be computed automatically. |
lam | positive tuning parameters for the ridge penalty. If a vector of parameters is provided, they should be in increasing order. Defaults to the grid of values 10^seq(-2, 2, 0.1). |
path | option to return the regularization path. This option should be used with extreme care if the dimension is large. If set to TRUE, cores will be set to 1 and errors and optimal tuning parameters will be based on the full sample. Defaults to FALSE. |
K | specify the number of folds for cross validation. Defaults to 5. |
cores | option to run CV in parallel. Defaults to cores = 1. |
trace | option to display progress of CV. Choose one of "progress", "print", or "none". Defaults to "none". |
Returns an object of class RIDGEsigma which includes the following components:
Lambda | optimal tuning parameter. |
Lambdas | grid of lambda values for CV. |
Omega | estimated penalized precision matrix. |
Sigma | estimated covariance matrix from the penalized precision matrix (inverse of Omega). |
Path | array containing the solution path. Solutions are ordered dense to sparse. |
Gradient | gradient of optimization function (penalized Gaussian likelihood). |
MIN.error | minimum average cross validation error (cv.crit) for optimal parameters. |
AVG.error | average cross validation error (cv.crit) across all folds. |
CV.error | cross validation errors (cv.crit). |
Matt Galloway [email protected]
Rothman, Adam. 2017. 'STAT 8931 notes on an algorithm to compute the Lasso-penalized Gaussian likelihood precision matrix estimator.'
# generate data from a sparse matrix
# first compute covariance matrix
S = matrix(0.7, nrow = 5, ncol = 5)
for (i in 1:5){
  for (j in 1:5){
    S[i, j] = S[i, j]^abs(i - j)
  }
}

# generate 100 x 5 matrix with rows drawn from iid N_p(0, S)
set.seed(123)
Z = matrix(rnorm(100*5), nrow = 100, ncol = 5)
out = eigen(S, symmetric = TRUE)
S.sqrt = out$vectors %*% diag(out$values^0.5)
S.sqrt = S.sqrt %*% t(out$vectors)
X = Z %*% S.sqrt

# ridge penalty, no ADMM
RIDGEsigma(X, lam = 10^seq(-5, 5, 0.5))