Title: Variable Banding of Large Precision Matrices
Description: Implementation of the variable banding procedure for modeling local dependence and estimating precision matrices, introduced in Yu & Bien (2016) and available at <https://arxiv.org/abs/1604.07451>.
Authors: Guo Yu [aut, cre]
Maintainer: Guo Yu <[email protected]>
License: GPL-3
Version: 0.9.0
Built: 2024-11-13 06:36:32 UTC
Source: CRAN
Generate a lower triangular matrix with strict bandwidth. See, e.g., Model 1 in the paper.
ar_gen(p, phi_vec)
Arguments:

p: the dimension of L.
phi_vec: a K-dimensional vector of off-diagonal values.

Value: a p-by-p strictly banded lower triangular matrix.
true_ar <- ar_gen(p = 50, phi_vec = c(0.5, -0.4, 0.1))
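With a phi_vec of length K, every row beyond the K-th has exactly K nonzero subdiagonal entries, so L has strict bandwidth K. A quick way to check this is the following helper, which is purely illustrative and not part of the package:

```r
# Illustrative helper (not exported by varband): compute the
# bandwidth of a lower triangular matrix, i.e. the maximum over
# rows i of the distance from the diagonal to the furthest
# nonzero entry to the left of it.
bandwidth_of <- function(L) {
  p <- nrow(L)
  bw <- 0L
  for (i in seq_len(p)) {
    nz <- which(L[i, seq_len(i - 1)] != 0)   # nonzero entries left of the diagonal
    if (length(nz) > 0) bw <- max(bw, i - min(nz))
  }
  bw
}
```

For the example above, bandwidth_of(true_ar) equals the length of phi_vec as long as all entries of phi_vec are nonzero.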
Generate a model with block-diagonal structure.
block_diag_gen(p)
Arguments:

p: the dimension of L.

Value: a p-by-p lower triangular matrix with block-diagonal structure from the (p/4)-th row to the (3p/4)-th row.
set.seed(123)
true_L_block_diag <- block_diag_gen(p = 50)
Plot the image of a matrix: black, white, and gray stand for positive, zero, and negative entries, respectively.
matimage(Mat, main = NULL)
Arguments:

Mat: a matrix to plot.
main: a plot title.
set.seed(123)
p <- 50
n <- 50
phi <- 0.4
true <- varband_gen(p = p, block = 5)
matimage(true)
Generate n random samples from the multivariate Gaussian distribution N(0, (L^T L)^{-1}).
sample_gen(L, n)
Arguments:

L: the p-by-p inverse Cholesky factor of the true covariance matrix.
n: the number of samples to generate.

Value: an n-by-p matrix, each row of which is a generated random sample.
set.seed(123)
true <- varband_gen(p = 50, block = 5)
x <- sample_gen(L = true, n = 100)
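The distribution N(0, (L^T L)^{-1}) can be sampled by drawing z ~ N(0, I_p) and solving L x = z, since Cov(L^{-1} z) = L^{-1} L^{-T} = (L^T L)^{-1}. A minimal sketch of this idea in base R, not necessarily how sample_gen is implemented internally:

```r
# Draw n samples from N(0, (L^T L)^{-1}) given a lower triangular L
# with positive diagonal; illustrative sketch only.
sample_sketch <- function(L, n) {
  p <- nrow(L)
  Z <- matrix(rnorm(n * p), nrow = p, ncol = n)  # columns z ~ N(0, I_p)
  # forwardsolve exploits the triangular structure of L
  X <- forwardsolve(L, Z)                        # columns x = L^{-1} z
  t(X)                                           # n-by-p, one sample per row
}
```

Using forwardsolve rather than solve keeps the cost of each solve at O(p^2) by exploiting the triangular structure.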
Solves the main optimization problem in Yu & Bien (2016):

min over lower triangular L with positive diagonal of
-2 * sum_{j=1}^p log(L[j, j]) + tr(S L^T L) + lambda * sum_{j=2}^p P(L[j, 1:(j-1)])

where P is either the hierarchical group lasso penalty (the default) or the l1 penalty (lasso = TRUE).
varband(S, lambda, init, w = FALSE, lasso = FALSE)
Arguments:

S: the sample covariance matrix.
lambda: non-negative tuning parameter; controls the sparsity level.
init: initial estimate of L. Default is a closed-form diagonal estimate of L.
w: logical. Should we use the weighted version of the penalty? Default is FALSE.
lasso: logical. Should we use the l1 penalty instead of the hierarchical group lasso penalty? Note that with the l1 penalty we lose the banded structure in the resulting estimate. Default is FALSE.
The problem decomposes into p independent row problems, each of which is solved by an ADMM algorithm; see the paper for more explanation.
Value: the variable banding estimate of L, where L^T L = Omega.
set.seed(123)
n <- 50
true <- varband_gen(p = 50, block = 5)
x <- sample_gen(L = true, n = n)
S <- crossprod(scale(x, center = TRUE, scale = FALSE)) / n
init <- diag(1 / sqrt(diag(S)))
# unweighted estimate
L_unweighted <- varband(S, lambda = 0.1, init, w = FALSE)
# weighted estimate
L_weighted <- varband(S, lambda = 0.1, init, w = TRUE)
# lasso estimate
L_lasso <- varband(S, lambda = 0.1, init, w = TRUE, lasso = TRUE)
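Competing estimates can be compared via the unpenalized negative Gaussian log-likelihood that varband minimizes, -2 * sum_j log(L[j, j]) + tr(S L^T L). A small helper for this, illustrative only and not exported by the package:

```r
# Negative Gaussian log-likelihood (up to additive constants) of a
# lower triangular estimate L given a sample covariance S:
#   -2 * sum_j log(L[j, j]) + tr(S %*% t(L) %*% L)
# Illustrative helper; not a function exported by varband.
neg_loglik <- function(L, S) {
  -2 * sum(log(diag(L))) + sum(diag(S %*% crossprod(L)))
}
```

A smaller value indicates a better unpenalized fit to S, e.g. neg_loglik(L_unweighted, S) versus neg_loglik(L_weighted, S); note this ignores the penalty term.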
Select the tuning parameter by cross-validation according to the likelihood on test data, with and without refitting.
varband_cv(x, w = FALSE, lasso = FALSE, lamlist = NULL, nlam = 60, flmin = 0.01, folds = NULL, nfolds = 5)
Arguments:

x: an n-by-p sample matrix; each row is an observation of the p-dimensional random vector.
w: logical. Should we use the weighted version of the penalty? Default is FALSE.
lasso: logical. Should we use the l1 penalty instead of the hierarchical group lasso penalty? Note that with the l1 penalty we lose the banded structure in the resulting estimate, and the problem becomes CSCS (Convex Sparse Cholesky Selection) introduced in Khare et al. (2016). Default is FALSE.
lamlist: a list of non-negative tuning parameters.
nlam: if lamlist is not provided, create a lamlist of length nlam. Default is 60.
flmin: if lamlist is not provided, create a lamlist with the ratio of the smallest to the largest lambda in the list equal to flmin. Default is 0.01.
folds: folds used in cross-validation.
nfolds: if folds is not provided, create nfolds folds. Default is 5.
Value: a list object containing:

- an nlam-by-nfolds matrix errs of negative Gaussian log-likelihood values on the CV test data sets; errs[i, j] is the negative Gaussian log-likelihood incurred in using lamlist[i] on fold j;
- an nlam-by-nfolds matrix of negative Gaussian log-likelihood values of the refitting;
- the folds used in cross-validation;
- the lambda grid used in cross-validation;
- the index of lamlist minimizing the CV negative Gaussian log-likelihood;
- the index of lamlist minimizing the refitting CV negative Gaussian log-likelihood;
- the selected value of lambda using the one-standard-error rule;
- the selected value of lambda of the refitting process using the one-standard-error rule;
- the estimate of L corresponding to ibest_fit;
- the refitted estimate of L corresponding to ibest_refit.
set.seed(123)
p <- 50
n <- 50
true <- varband_gen(p = p, block = 5)
x <- sample_gen(L = true, n = n)
res_cv <- varband_cv(x = x, w = FALSE, nlam = 40, flmin = 0.03)
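The one-standard-error rule mentioned above picks the largest lambda whose mean CV error is within one standard error of the minimum. A generic sketch of the rule, assuming an errs matrix of shape nlam-by-nfolds with one row per value of lamlist; varband_cv computes this internally and its exact conventions may differ:

```r
# One-standard-error rule sketch (illustrative only).
# errs:    nlam-by-nfolds matrix of CV errors
# lamlist: vector of tuning parameters, one per row of errs
one_se_lambda <- function(errs, lamlist) {
  m  <- rowMeans(errs)                       # mean CV error per lambda
  se <- apply(errs, 1, sd) / sqrt(ncol(errs))  # standard error per lambda
  ibest <- which.min(m)
  # among lambdas within one SE of the minimum, take the largest
  candidates <- which(m <= m[ibest] + se[ibest])
  lamlist[candidates[which.max(lamlist[candidates])]]
}
```

The larger lambda yields a sparser, more conservative estimate at a statistically indistinguishable CV error.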
Generate a lower triangular matrix with variable bandwidth. See, e.g., Models 2 and 3 in the paper.
varband_gen(p, block = 10)
Arguments:

p: the dimension of L.
block: the number of block-diagonal structures in the resulting model; assumed to divide p.

Value: a p-by-p lower triangular matrix with variable bandwidth.
set.seed(123)
# small block size (big number of blocks)
true_small <- varband_gen(p = 50, block = 10)
# large block size (small number of blocks)
true_large <- varband_gen(p = 50, block = 2)
Compute the varband estimates along a path of tuning parameter values.
varband_path(S, w = FALSE, lasso = FALSE, lamlist = NULL, nlam = 60, flmin = 0.01)
Arguments:

S: the sample covariance matrix.
w: logical. Should we use the weighted version of the penalty? Default is FALSE.
lasso: logical. Should we use the l1 penalty instead of the hierarchical group lasso penalty? Note that with the l1 penalty we lose the banded structure in the resulting estimate, and the problem becomes CSCS (Convex Sparse Cholesky Selection) introduced in Khare et al. (2016). Default is FALSE.
lamlist: a list of non-negative tuning parameters.
nlam: if lamlist is not provided, create a lamlist of length nlam. Default is 60.
flmin: if lamlist is not provided, create a lamlist with the ratio of the smallest to the largest lambda in the list equal to flmin. Default is 0.01.
Value: a list object containing:

- an array of dimension (p, p, nlam) of estimates of L;
- the grid of tuning parameter values.
set.seed(123)
n <- 50
true <- varband_gen(p = 50, block = 5)
x <- sample_gen(L = true, n = n)
S <- crossprod(scale(x, center = TRUE, scale = FALSE)) / n
path_res <- varband_path(S = S, w = FALSE, nlam = 40, flmin = 0.03)