Title: | High-Dimensional Lasso Generalized Estimating Equations |
---|---|
Description: | Fits generalized estimating equations with L1 regularization to longitudinal data with high dimensional covariates. Use a efficient iterative composite gradient descent algorithm. |
Authors: | Yaguang Li, Xin Gao, Wei Xu |
Maintainer: | Yaguang Li <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0 |
Built: | 2024-12-09 06:59:49 UTC |
Source: | CRAN |
Does k-fold cross-validation for LassoGEE to select tuning parameter value for longitudinal data with working independence structure.
cv.LassoGEE( X, y, id, family, method = c("CGD", "RWL"), scale.fix, scale.value, fold, lambda.vec, maxiter, tol )
cv.LassoGEE( X, y, id, family, method = c("CGD", "RWL"), scale.fix, scale.value, fold, lambda.vec, maxiter, tol )
X |
A design matrix of dimension |
y |
A response vector of length |
id |
A vector for identifying subjects/clusters. |
family |
A family object: a list of functions and expressions for defining link and variance functions. Families supported here is same as in PGEE which are binomial, gaussian, gamma and poisson. |
method |
The algorithms that are available. |
scale.fix |
A logical variable; if true, the scale parameter is fixed at the value of scale.value. The default value is TRUE. |
scale.value |
If |
fold |
The number of folds used in cross-validation. |
lambda.vec |
A vector of tuning parameters that will be used in the cross-validation. |
maxiter |
The number of iterations that is used in the estimation algorithm.
The default value is |
tol |
The tolerance level that is used in the estimation algorithm.
The default value is |
An object class of cv.LassoGEE.
Li, Y., Gao, X., and Xu, W. (2020). Statistical consistency for
generalized estimating equation with regularization.
LassoGEE
Information Criterion for a fitted LassoGEE object with the AIC, BIC, or GCV criteria.
IC(obj, criterion = c("BIC", "AIC", "GCV", "AICc", "EBIC"))
IC(obj, criterion = c("BIC", "AIC", "GCV", "AICc", "EBIC"))
obj |
A fitted LassoGEE object. |
criterion |
The criterion by which to select the regularization parameter. One of "AIC", "BIC", "GCV", "AICc", or "EBIC"; default is "BIC". |
IC |
The calculated model selection criteria |
Gao, X., and Yi, G. Y. (2013). Simultaneous model selection and estimation for mean and association structures with clustered binary data. Stat, 2(1), 102-118.
This function fits a penalized GEE model to longitudinal
data by I-CGD algorithm or re-weighted least square algorithm.
LassoGEE( X, y, id, family = binomial("probit"), lambda, corstr = "independence", method = c("CGD", "RWL"), beta.ini = NULL, R = NULL, scale.fix = TRUE, scale.value = 1, maxiter = 50, tol = 0.001, silent = TRUE, Mv = NULL, verbose = TRUE )
LassoGEE( X, y, id, family = binomial("probit"), lambda, corstr = "independence", method = c("CGD", "RWL"), beta.ini = NULL, R = NULL, scale.fix = TRUE, scale.value = 1, maxiter = 50, tol = 0.001, silent = TRUE, Mv = NULL, verbose = TRUE )
X |
A design matrix of dimension |
y |
A response vector of length |
id |
A vector for identifying subjects/clusters. |
family |
A family object representing one of the built-in families.
Families supported here are the same as in PGEE, e.g, |
lambda |
A user supplied value for the penalization parameter. |
corstr |
A character string that indicates the correlation structure among
the repeated measurements of a subject. Structures supported in |
method |
The algorithms that are available. |
beta.ini |
User specified initial values for regression parameters.
The default value is |
R |
User specified correlation matrix. The default value is |
scale.fix |
A logical variable. The default value is |
scale.value |
If |
maxiter |
The maximum number of iterations used in the algorithm. The default value is 50. |
tol |
The tolerance level used in the algorithm. The default value is |
silent |
A logical variable; if false, the iteration counts at each iteration of CGD are printed. The default value is TRUE. |
Mv |
If either "stat_M_dep", or "non_stat_M_dep" is specified in corstr, then this assigns a numeric value for Mv. Otherwise, the default value is NULL. |
verbose |
A logical variable; Print the out loop iteration counts. The default value is TRUE. |
A list containing the following components:
betaest |
return final estimation |
beta_all_step |
return estimate in each iteration |
inner.count |
iterative count in each stage |
outer.iter |
iterate number of outer loop |
Li, Y., Gao, X., and Xu, W. (2020). Statistical consistency for
generalized estimating equation with regularization.
cv.LassoGEE
# required R package library(mvtnorm) library(SimCorMultRes) # set.seed(123) p <- 200 s <- ceiling(p^{1/3}) n <- ceiling(10 * s * log(p)) m <- 4 # covariance matrix of p number of continuous covariates X.sigma <- matrix(0, p, p) { for (i in 1:p) X.sigma[i,] <- 0.5^(abs((1:p)-i)) } # generate matrix of covariates X <- as.matrix(rmvnorm(n*m, mean = rep(0,p), X.sigma)) # true regression parameter associated with the covariate bt <- runif(s, 0.05, 0.5) # = rep(1/s,s) beta.true <- c(bt,rep(0,p-s)) # intercept beta_intercepts <- 0 # unstructure tt <- runif(m*m,-1,1) Rtmp <- t(matrix(tt, m,m))%*%matrix(tt, m,m)+diag(1,4) R_tr <- diag(diag(Rtmp)^{-1/2})%*%Rtmp%*%diag(diag(Rtmp)^{-1/2}) diag(R_tr) = round(diag(R_tr)) # library(SimCorMultRes) # simulation of clustered binary responses simulated_binary_dataset <- rbin(clsize = m, intercepts = beta_intercepts, betas = beta.true, xformula = ~X, cor.matrix = R_tr, link = "probit") lambda <- 0.2* s *sqrt(log(p)/n) data = simulated_binary_dataset$simdata y = data$y X = data$X id = data$id ptm <- proc.time() nCGDfit = LassoGEE(X = X, y = y, id = id, family = binomial("probit"), lambda = lambda, corstr = "unstructured") proc.time() - ptm betaest <- nCGDfit$betaest
# required R package library(mvtnorm) library(SimCorMultRes) # set.seed(123) p <- 200 s <- ceiling(p^{1/3}) n <- ceiling(10 * s * log(p)) m <- 4 # covariance matrix of p number of continuous covariates X.sigma <- matrix(0, p, p) { for (i in 1:p) X.sigma[i,] <- 0.5^(abs((1:p)-i)) } # generate matrix of covariates X <- as.matrix(rmvnorm(n*m, mean = rep(0,p), X.sigma)) # true regression parameter associated with the covariate bt <- runif(s, 0.05, 0.5) # = rep(1/s,s) beta.true <- c(bt,rep(0,p-s)) # intercept beta_intercepts <- 0 # unstructure tt <- runif(m*m,-1,1) Rtmp <- t(matrix(tt, m,m))%*%matrix(tt, m,m)+diag(1,4) R_tr <- diag(diag(Rtmp)^{-1/2})%*%Rtmp%*%diag(diag(Rtmp)^{-1/2}) diag(R_tr) = round(diag(R_tr)) # library(SimCorMultRes) # simulation of clustered binary responses simulated_binary_dataset <- rbin(clsize = m, intercepts = beta_intercepts, betas = beta.true, xformula = ~X, cor.matrix = R_tr, link = "probit") lambda <- 0.2* s *sqrt(log(p)/n) data = simulated_binary_dataset$simdata y = data$y X = data$X id = data$id ptm <- proc.time() nCGDfit = LassoGEE(X = X, y = y, id = id, family = binomial("probit"), lambda = lambda, corstr = "unstructured") proc.time() - ptm betaest <- nCGDfit$betaest
Print a summary of the results of cross-validation for a LassoGEE model.
## S3 method for class 'cv.LassoGEE' print(x, digits = NULL, ...)
## S3 method for class 'cv.LassoGEE' print(x, digits = NULL, ...)
x |
fitted 'cv.LassoGEE' object |
digits |
significant digits in printout |
... |
additional print arguments |
A summary of the cross-validated fit is produced. print.cv.LassoGEE(object)
will print the summary for a sequence of lambda
.
Li, Y., Gao, X., and Xu, W. (2020). Statistical consistency for
generalized estimating equation with regularization.
LassoGEE
, and cv.LassoGEE
methods.
Print a summary of the results of a LassoGEE model.
## S3 method for class 'LassoGEE' print(x, digits = NULL, ...)
## S3 method for class 'LassoGEE' print(x, digits = NULL, ...)
x |
fitted 'LassoGEE' object |
digits |
significant digits in printout |
... |
additional print arguments |
A summary of the cross-validated fit is produced. print.cv.LassoGEE(object)
will print the summary includes Working Correlation and Returned Error Value.
Li, Y., Gao, X., and Xu, W. (2020). Statistical consistency for
generalized estimating equation with regularization.
LassoGEE
, and cv.LassoGEE
methods.