Title: | Conditional Graphical LASSO for Gaussian Graphical Models with Censored and Missing Values |
---|---|
Description: | Conditional graphical lasso estimator is an extension of the graphical lasso proposed to estimate the conditional dependence structure of a set of p response variables given q predictors. This package provides suitable extensions developed to study datasets with censored and/or missing values. Standard conditional graphical lasso is available as a special case. Furthermore, the package provides an integrated set of core routines for visualization, analysis, and simulation of datasets with censored and/or missing values drawn from a Gaussian graphical model. Details about the implemented models can be found in Augugliaro et al. (2023) <doi: 10.18637/jss.v105.i01>, Augugliaro et al. (2020b) <doi: 10.1007/s11222-020-09945-7>, Augugliaro et al. (2020a) <doi: 10.1093/biostatistics/kxy043>, Yin et al. (2001) <doi: 10.1214/11-AOAS494> and Stadler et al. (2012) <doi: 10.1007/s11222-010-9219-7>. |
Authors: | Luigi Augugliaro [aut, cre] , Gianluca Sottile [aut] , Ernst C. Wit [aut] , Veronica Vinciotti [aut] |
Maintainer: | Luigi Augugliaro <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.0.7 |
Built: | 2024-12-09 06:35:13 UTC |
Source: | CRAN |
Conditional graphical lasso (cglasso) estimator (Yin and other, 2011) is an extension of the graphical lasso (Yuan and other, 2007) proposed to estimate the conditional dependence structure of a set of p response variables given q predictors. This package provides suitable extensions developed to study datasets with censored and/or missing values (Augugliaro and other, 2020a and Augugliaro and other, 2020b). Standard conditional graphical lasso is available as a special case. Furthermore, the cglasso package provides an integrated set of core routines for visualization, analysis, and simulation of datasets with censored and/or missing values drawn from a Gaussian graphical model.
Package: | cglasso |
Type: | Package |
Version: | 2.0.6 |
Date: | 2023-01-17 |
License: | GPL (>=2) |
Luigi Augugliaro [aut, cre],
Gianluca Sottile [aut]
Ernst C. Wit [aut]
Veronica Vinciotti [aut]
Maintainer: Luigi Augugliaro <[email protected]>
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020a) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020b) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
Friedman, J.H., Hastie, T., and Tibshirani, R. (2008) <doi:10.1093/biostatistics/kxm045>. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441.
Yin, J. and Li, H. (2001) <doi:10.1214/11-AOAS494>. A sparse conditional Gaussian graphical model for analysis of genetical genomics data. The Annals of Applied Statistics 5(4), 2630–2650.
Yuan, M., and Lin, Y. (2007) <doi:10.1093/biomet/asm018>. Model selection and estimation in the Gaussian graphical model. Biometrika 94, 19–35.
‘AIC
’ computes the ‘Akaike Information Criterion’.
## S3 method for class 'cglasso' AIC(object, k = 2, mle, ...)
## S3 method for class 'cglasso' AIC(object, k = 2, mle, ...)
object |
an R object inheriting class ‘ |
k |
the penalty parameter to be used; the default |
mle |
logical. TRUE if the measure of goodness-of-fit should be computed using the maximum likelihood estimates. Default depends on the class of the argument |
... |
further arguments passed to |
‘AIC
’ computes the following measure of goodness-of-fit (Ibrahim and other, 2008):
where is the penalty parameter and
represents the number of unique non-zero parameters in the fitted model.
The values of the Q-function function are computed using QFun
. By default, for an object of class cglasso
these values are computed using the penalized estimates whereas maximum likelihood estimates are used if the object
is of class cggm
(see argument ‘mle
’ in QFun
).
The Akaike Information Criterion (AIC) is returned by letting (default value of the function
AIC
) whereas the ‘Bayesian Information Criterion’ (BIC) is returned by letting , where
is the sample size.
Function AIC
can be passed to the functions select_cglasso
and summary.cglasso
to select and print the optimal fitted model, respectively.
The function plot.GoF
can be used to graphically evaluate the behaviour of the fitted models in terms of goodness-of-fit.
‘AIC
’ return an R object of S3 class “GoF
”, i.e., a named list containing the following components:
value_gof |
a matrix storing the values of the measure of goodness-of-fit used to evaluate the fitted models. |
df |
a matrix storing the number of the estimated non-zero parameters. |
dfB |
a matrix storing the number of estimated non-zero regression coefficients. |
dfTht |
a matrix storing the number of estimated non-zero partial correlation coefficients. |
value |
a matrix storing the values of the Q-function. |
n |
the sample size. |
p |
the number of response variables. |
q |
the number of columns of the design matrix |
lambda |
the |
nlambda |
the number of |
rho |
the |
nrho |
the number of |
type |
a description of the computed measure of goodness-of-fit. |
model |
a description of the fitted model passed through the argument |
Luigi Augugliaro ([email protected])
Ibrahim, J.G., Zhu, H. and Tang, N. (2008) <doi:10.1198/016214508000001057>. Model selection criteria for missing-data problems using the EM algorithm. Journal of the American Statistical Association 103, 1648–1658.
Sakamoto, Y., Ishiguro, M., and Kitagawa, G. (1986). Akaike Information Criterion Statistics. D. Reidel Publishing Company.
Wit, E., Heuvel, E. V. D., & Romeijn, J. W. (2012) <doi:10.1111/j.1467-9574.2012.00530.x>. All models are wrong...?: an introduction to model uncertainty. Statistica Neerlandica, 66(3), 217-236.
BIC.cglasso
, cglasso
, cggm
, QFun
, plot.GoF
and summary.cglasso
set.seed(123) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) AIC(out) out.mle <- cggm(out, lambda.id = 3L, rho.id = 3L) AIC(out.mle)
set.seed(123) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) AIC(out) out.mle <- cggm(out, lambda.id = 3L, rho.id = 3L) AIC(out.mle)
‘BIC
’ computes the Bayesian Information Criterion.
## S3 method for class 'cglasso' BIC(object, g = 0, type, mle, ...)
## S3 method for class 'cglasso' BIC(object, g = 0, type, mle, ...)
object |
an R object inheriting class ‘ |
g |
a value belonging to the interval |
type |
character; if |
mle |
logical. TRUE if the measure of goodness-of-fit should be computed using the maximum likelihood estimates. Default depends on the class of the argument |
... |
further arguments passed to the model-fitting function |
‘BIC
’ computes the Bayesian Information Criterion (BIC) for models fitted by cglasso
or cggm
. As proposed in Ibrahim and other (2008), BIC
computes the measure of goodness-of-fit by replacing the log-likelihood function with the Q-function, that is, the function maximized in the M-Step of the EM-algorithm. The values of the Q-function are computed using QFun
. By default, for an object of class cglasso
these values are computed using the penalized estimates whereas, if the object
has class cggm
, maximum likelihood estimates are used (see argument ‘mle
’ in QFun
).
By default, BIC
computes the standard BIC measure ():
where is the sample size and
represents the number of unique non-zero parameters in the fitted model.
If , the default depends on the number of predictors (
).
If ,
BIC
computes the measure of goodness-of-fit proposed in Foygel and other (2010) (type = "FD"
):
where is a value belonging to the interval
and indexing the measure of goodness-of-fit.
If ,
BIC
computes the measure of goodness-of-fit proposed in Chen and other (2008, 2012) (type = "CC"
):
BIC
can be passed to the functions select_cglasso
and summary.cglasso
to select and print the optimal fitted model, respectively.
The function plot.GoF
can be used to graphically evaluate the behaviour of the fitted models in terms of goodness-of-fit.
‘BIC
’ returns an R object of S3 class “GoF
”, i.e. a named list containing the following components:
value_gof |
a matrix storing the values of the measure of goodness-of-fit used to evaluate the fitted models. |
df |
a matrix storing the number of the estimated non-zero parameters. |
dfB |
a matrix storing the number of estimated non-zero regression coefficients. |
dfTht |
a matrix storing the number of estimated non-zero partial correlation coefficients. |
value |
a matrix storing the values of the Q-function. |
n |
the sample size. |
p |
the number of response variables. |
q |
the number of columns of the design matrix |
lambda |
the |
nlambda |
the number of |
rho |
the |
nrho |
the number of |
type |
a description of the computed measure of goodness-of-fit. |
model |
a description of the fitted model passed through the argument |
Luigi Augugliaro ([email protected])
Foygel, R. and Drton, M. (2010). Extended Bayesian Information Criteria for Gaussian Graphical Models. In: Lafferty, J., Williams, C., Shawe-taylor, J., Zemel, R.s. and Culott, A. (editors), Advances in Neural Information Processing Systems 23. pp. 604–612.
Chen, J. and Chen, Z. (2008) <doi:10.1093/biomet/asn034>. Extended Bayesian information criteria for model selection with large model spaces. Biometrika, Vol. 95(2), pp. 759–771.
Chen, J. and Chen, Z. (2012) <doi:10.5705/ss.2010.216>. Extended BIC for small-n-large-p sparse GLM. Statistica Sinica, Vol. 22, pp. 555–574.
Wit, E., Heuvel, E. V. D., & Romeijn, J. W. (2012) <doi:10.1111/j.1467-9574.2012.00530.x>. All models are wrong...?: an introduction to model uncertainty. Statistica Neerlandica, 66(3), 217-236.
cglasso
, cggm
, AIC.cglasso
, QFun
, plot.GoF
and summary.cglasso
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) BIC(out) # standard BIC measure BIC(out, mle = TRUE, g = 0.5, type = "FD") # eBIC proposed in Foygel and other (2010) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) BIC(out) # standard BIC measure BIC(out, mle = TRUE, g = 0.5, type = "FD") # eBIC proposed in Foygel and other (2010) BIC(out, mle = TRUE, g = 0.5, type = "CC") # eBIC proposed in Chen and other (2008, 2010)
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) BIC(out) # standard BIC measure BIC(out, mle = TRUE, g = 0.5, type = "FD") # eBIC proposed in Foygel and other (2010) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) BIC(out) # standard BIC measure BIC(out, mle = TRUE, g = 0.5, type = "FD") # eBIC proposed in Foygel and other (2010) BIC(out, mle = TRUE, g = 0.5, type = "CC") # eBIC proposed in Chen and other (2008, 2010)
‘cggm
’ is used to perform post-hoc maximum likelihood refitting of a selected conditional graphical lasso model with censored and/or missing values.
cggm(object, GoF = AIC, lambda.id, rho.id, tp.min = 1.0E-6, ntp = 100L, maxit.em = 1.0E+4, thr.em = 1.0E-3, maxit.bcd = 1.0E+5, thr.bcd = 1.0E-4, trace = 0L, ...)
cggm(object, GoF = AIC, lambda.id, rho.id, tp.min = 1.0E-6, ntp = 100L, maxit.em = 1.0E+4, thr.em = 1.0E-3, maxit.bcd = 1.0E+5, thr.bcd = 1.0E-4, trace = 0L, ...)
object |
an R object of S3 class ‘ |
GoF |
a valid goodness-of-fit function, such as |
lambda.id |
an optional integer used to identify a |
rho.id |
an optional integer used to identify a |
tp.min |
the smallest |
ntp |
integer; number of |
maxit.em |
maximum number of iterations of the EM algorithm. Default is |
thr.em |
threshold for the convergence of the EM algorithm. Default value is |
maxit.bcd |
maximum number of iterations of the glasso algorithm. Default is |
thr.bcd |
threshold for the convergence of the glasso algorithm. Default is |
trace |
integer for printing information out as iterations proceed: |
... |
the penalty parameter passed to the goodness-of-fit function (argument ‘ |
The model-fitting function cggm
is used to obtain the maximum likelihood estimates of a censored Gaussian graphical model whose structure was found by penalised cglasso inference. That is, given a fitted cglasso model, identified along the path via a goodness-of-fit function (passed by GoF
) or via the optional arguments lambda.id
and rho.id
), cggm
computes the maximum likelihood estimates of the parameters with zero constraints associated to the given structure.
Maximum likelihood estimates are computed using cglasso
as workhorse function, that is, cggm
fits a sequence of cglasso models, with length equal to ntp
, reducing and
until they are equal to the value ‘
min(tp.min, lambda.min, rho.min)
’, where lambda.min
and rho.min
are the smallest - and
-values stored in
object
. Maximum likelihood estimates are obtained from the last fitted cglasso model.
The model-fitting function cggm
returns an R object inheriting the S3 class ‘cglasso
’, for which all printing and plotting functions, designed for a cglasso
object, can be used (see section ‘See Also
’). Function ShowStructure
can be used to show the structure of the package.
NOTE: if the user tries a problem when the sample size is not large enough, then increase the number of fitted cglasso models (argument ntp
).
cggm
returns an object of S3 class “cggm
”, i.e., a list containing the
following components:
call |
the call that produced this object. |
Yipt |
an array storing the ‘working response matrix’. |
B |
the maximum likelihood estimate of the regression coefficient matrix. |
mu |
the fitted expected values. |
R |
the ‘working residuals’ matrix. |
S |
the ‘working empirical covariance matrix’. |
Sgm |
the maximum likelihood estimate of the covariance matrix. |
Tht |
the maximum likelihood estimate of the precision matrix. |
dfB |
the number of estimated non-zero regression coefficients. Only for internal purpose. |
dfTht |
the number of estimated non-zero (off-diagonal) partial correlation coefficients. Only for internal purpose. |
InfoStructure |
a named list whose elements are used to store the information about the estimated networks. Only for internal purpose. |
nit |
the number of EM steps. |
Z |
the ‘ |
nlambda |
Only for internal purpose. |
lambda |
the |
nrho |
Only for internal purpose. |
rho |
the |
maxit.em |
maximum number of iterations of the EM algorithm. |
thr.em |
threshold for the convergence of the EM algorithm. |
maxit.bcd |
maximum number of iterations of the glasso algorithm. |
thr.bcd |
threshold for the convergence of the glasso algorithm. |
conv |
a description of the error that has occurred. |
subrout |
the name of the Fortran subroutine where the error has occurred (for internal debug only). |
trace |
the integer used for printing information on screen. |
nobs |
the sample size |
nresp |
the number of response variables used to fit the model. |
npred |
the number of predictors used to fit the model. |
Luigi Augugliaro ([email protected])
cglasso
, coef.cglasso
, fitted.cglasso
, residuals.cglasso
, predict.cggm
, impute
, AIC.cglasso
, BIC.cglasso
, summary.cglasso
, to_graph
, plot.cglasso2igraph
and ShowStructure
.
set.seed(123) # Y ~ N(XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.5, probna = 0.05) out <- cglasso(. ~ ., data = Z) # MLE of the censored Gaussian graphical model identified by 'BIC' out.mle <- cggm(out, GoF = BIC) out.mle # accessor functions coef(out.mle, drop = TRUE) fitted(out.mle, drop = TRUE) residuals(out.mle, type = "working", drop = TRUE) impute(out.mle, type = "both") # goodness-of-fit functions AIC(out.mle) BIC(out.mle) summary(out.mle) # network analysis out.graph <- plot(out.mle) out.graph
set.seed(123) # Y ~ N(XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.5, probna = 0.05) out <- cglasso(. ~ ., data = Z) # MLE of the censored Gaussian graphical model identified by 'BIC' out.mle <- cggm(out, GoF = BIC) out.mle # accessor functions coef(out.mle, drop = TRUE) fitted(out.mle, drop = TRUE) residuals(out.mle, type = "working", drop = TRUE) impute(out.mle, type = "both") # goodness-of-fit functions AIC(out.mle) BIC(out.mle) summary(out.mle) # network analysis out.graph <- plot(out.mle) out.graph
‘cglasso
’ fits the conditional graphical lasso model to datasets with censored and/or missing values.
cglasso(formula, data, subset, contrasts = NULL, diagonal = FALSE, weights.B = NULL, weights.Tht = NULL, nlambda, lambda.min.ratio, lambda, nrho, rho.min.ratio, rho, maxit.em = 1.0E+4, thr.em = 1.0E-3, maxit.bcd = 1.0E+5, thr.bcd = 1.0E-4, trace = 0L)
cglasso(formula, data, subset, contrasts = NULL, diagonal = FALSE, weights.B = NULL, weights.Tht = NULL, nlambda, lambda.min.ratio, lambda, nrho, rho.min.ratio, rho, maxit.em = 1.0E+4, thr.em = 1.0E-3, maxit.bcd = 1.0E+5, thr.bcd = 1.0E-4, trace = 0L)
formula |
an object of class ‘ |
data |
an R object of S3 class ‘ |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
contrasts |
an optional list. See the |
diagonal |
logical. Should diagonal entries of the concentration matrix be penalized? Default is ‘ |
weights.B |
an optional |
weights.Tht |
an optional symmetric matrix of non-negative weights used to penalize the partial regression coefficients. This matrix can be used to specify the unpenalized partial correlation coefficients (‘ |
nlambda |
integer. The number of |
lambda.min.ratio |
the smallest |
lambda |
an optional user-supplied decreasing sequence of |
nrho |
integer. The number of |
rho.min.ratio |
the smallest |
rho |
an optional user supplied decreasing sequence of |
maxit.em |
maximum number of iterations of the EM algorithm. Default is |
thr.em |
threshold for the convergence of the EM algorithm. Default value is |
maxit.bcd |
maximum number of iterations of the glasso algorithm. Default is |
thr.bcd |
threshold for the convergence of the glasso algorithm. Default is |
trace |
integer for printing information out as iterations proceed: |
cglasso
is the main model-fitting function and can be used to fit a broad range of extensions of the glasso estimator (Friedman and other, 2008). It is specifically proposed to study datasets with censored and/or missing response values. To help the user, the cglasso
function has been designed to automatically select the most suitable extension by using the information stored in the ‘datacggm
’ object passed through Z
.
Below we sum up the available extenions:
if only left/right-censored are observed in the response matrix Y
(without missing values) and no predictor matrix is stored in Z
, then cglasso
computes the censored glasso estimator (Augugliaro and other, 2020a);
if only left/right-censored are observed in the response matrix Y
(without missing values) and and a predictor matrix X
is stored in Z
, then cglasso
computes the conditional censored glasso estimator (Augugliaro and other, 2020b);
if only missing values are stored in the response matrix Y
(without censored values), then cglasso
computes the missglasso estimator (Stadler and other, 2012);
starting with version 2.0.0, cglasso
can also handle datasets with both missing and censored response values.
See section ‘Examples’ for some example.
The model-fitting function cglasso
returns an R object of S3 class ‘cglasso
’ for which there are available a set of accessor functions, a set of functions designed to evaluate the goodness-of-fit of the fitted models and, finally, a set of functions developed to analyze the selected network. The function ShowStructure
can be used to show the structure of the package.
The accessor functions coef.cglasso
, fitted.cglasso
, residuals.cglasso
, predict.cglasso
and impute
can be used to extract various useful features of the object fitted by cglasso
.
For an R object returned by cglasso
, the functions AIC.cglasso
and BIC.cglasso
can be used to evaluate the goodness-of-fit of the fitted models. Usually, these functions are used together with the function summary.cglasso
, which gives more information about the sequence of fitted models. The plotting function plot.GoF
can be used to graphically identify the optimal pair of the tuning parameters.
Given a pair of the tuning paremeters, the functions to_graph
and plot.cglasso2igraph
can be used to analyze and show the selected network. Finally, the function cggm
can be used to produce post-hoc maximum likelihood refitting of the selected graphical model.
cglasso
returns an object of S3 class “cglasso
”, i.e., a named list containing the
following components:
call |
the call that produced this object. |
Yipt |
an array of dimension ‘ |
B |
an array of dimension ‘ |
mu |
an array of dimension ‘ |
R |
an array of dimension ‘ |
S |
an array of dimension ‘ |
Sgm |
an array of dimension ‘ |
Tht |
an array of dimension ‘ |
dfB |
a matrix of dimension ‘ |
dfTht |
a matrix of dimension ‘ |
InfoStructure |
a named list whose elements contain information about the estimated networks. Only for internal purpose. |
nit |
an array of dimension ‘ |
Z |
the ‘ |
diagonal |
the flag used to specify if the diagonal entries of the precision matrix are penalized. |
weights.B |
the matrix of non-negative weights used for the regression coefficients. |
weights.Tht |
the matrix of non-negative weights used for the precision matrix. |
nlambda |
the number of |
lambda.min.ratio |
the value used to compute the smallest |
lambda |
the sequence of |
nrho |
the number of |
rho.min.ratio |
the value used to compute the smallest |
rho |
the sequence of |
model |
a description of the fitted model. |
maxit.em |
maximum number of iterations of the EM algorithm. |
thr.em |
threshold for the convergence of the EM algorithm. |
maxit.bcd |
maximum number of iterations of the glasso algorithm. |
thr.bcd |
threshold for the convergence of the glasso algorithm. |
conv |
a description of the error that has occurred. |
subrout |
the name of the Fortran subroutine where the error has occurred (for internal debug only). |
trace |
the integer used for printing information on screen. |
nobs |
the sample size |
nresp |
the number of response variables used to fit the model. |
npred |
the number of predictors used to fit the model. |
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020a) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020b) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Friedman, J.H., Hastie, T., and Tibshirani, R. (2008) <doi:10.1093/biostatistics/kxm045>. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441.
Stadler, N. and Buhlmann, P. (2012) <doi:10.1007/s11222-010-9219-7>. Missing values: sparse inverse covariance estimation and an extension to sparse regression. Statistics and Computing 22, 219–235.
datacggm
, coef.cglasso
, fitted.cglasso
, residuals.cglasso
, predict.cglasso
, impute
, AIC.cglasso
, BIC.cglasso
, summary.cglasso
, select_cglasso
, plot.GoF
, to_graph
, plot.cglasso2igraph
, cggm
and ShowStructure
.
set.seed(123) # Model 1: censored glasso estimator (Augugliaro \emph{and other}, 2020a) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out # Model 2: conditional censored glasso estimator (Augugliaro \emph{and other}, 2020b) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out # Model 3: missglasso estimator (Stadler \emph{and other}, 2012) # Y ~ N(b0 + XB, Sigma) and probability of missing-at-random values equal to 0.05 n <- 1000L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probna = 0.05) out <- cglasso(. ~ ., data = Z) out # Model 4: mixed estimator # Y ~ N(b0 + XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 1000L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, X = X, b0 = b0, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) out <- cglasso(. ~ ., data = Z) out
set.seed(123) # Model 1: censored glasso estimator (Augugliaro \emph{and other}, 2020a) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out # Model 2: conditional censored glasso estimator (Augugliaro \emph{and other}, 2020b) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out # Model 3: missglasso estimator (Stadler \emph{and other}, 2012) # Y ~ N(b0 + XB, Sigma) and probability of missing-at-random values equal to 0.05 n <- 1000L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probna = 0.05) out <- cglasso(. ~ ., data = Z) out # Model 4: mixed estimator # Y ~ N(b0 + XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 1000L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, X = X, b0 = b0, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) out <- cglasso(. ~ ., data = Z) out
The accessor function coef
extracts model coefficients from an R object inheriting class ‘cglasso
’.
## S3 method for class 'cglasso' coef(object, type = c("all", "B", "Sigma", "Theta"), lambda.id, rho.id, drop = TRUE, ...)
## S3 method for class 'cglasso' coef(object, type = c("all", "B", "Sigma", "Theta"), lambda.id, rho.id, drop = TRUE, ...)
object |
an R object inheriting class ‘ |
type |
a description of the desired estimates. |
lambda.id |
an optional vector of integers used to specify the |
rho.id |
an optional vector of integers used to specify the |
drop |
logical. Dimensions of the required objects can only be dropped if their extent is one. |
... |
further arguments passed to or from other methods. |
Coefficients extracted from ‘object
’ are returned. By default, a named list storing all the estimated parameters is returned.
Luigi Augugliaro ([email protected])
Model-fitting functions cglasso
, cggm
and the accessor functions fitted.cglasso
, residuals.cglasso
, predict.cglasso
and impute
.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) coef(out, type = "Theta", rho.id = 1:4) coef(out, type = "Theta", rho.id = 3, drop = TRUE) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) coef(out, type = "B", lambda.id = 3, rho.id = 1:4) coef(out, type = "B", lambda.id = 3, rho.id = 3, drop = TRUE)
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) coef(out, type = "Theta", rho.id = 1:4) coef(out, type = "Theta", rho.id = 3, drop = TRUE) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) coef(out, type = "B", lambda.id = 3, rho.id = 1:4) coef(out, type = "B", lambda.id = 3, rho.id = 3, drop = TRUE)
Retrieve column means and column variances of a “datacggm” object.
ColMeans(x) ColVars(x)
ColMeans(x) ColVars(x)
x |
an object of class ‘ |
For an R object x
of class ‘datacggm
’, ColMeans
(ColVars
) retrieves the column means (variances) of the matrices obtained by getMatrix(x, "Y")
and getMatrix(x, "X")
. For the response variables, marginal means and variances are estimated using a EM-algorithm under the assumption that the response variables are marginally normally distributed (see also Details section in
datacggm
). For the numeric predictor variables, marginal means and variances are computed by mean
and var
, whereas, for categorical data, ColMeans
(ColVars
) retrieves the statistical mode and the Gini-Simpson Index, respectively.
ColMeans
(ColVars
) returns a named list with the columns means (variances).
Luigi Augugliaro ([email protected])
datacggm
, rcggm
, qqcnorm
and hist.datacggm
.
set.seed(123) n <- 1000L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) ColMeans(Z) ColVars(Z) n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) ColMeans(Z) ColVars(Z)
set.seed(123) n <- 1000L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) ColMeans(Z) ColVars(Z) n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) ColMeans(Z) ColVars(Z)
‘The datacggm
’ function is used to create a dataset from a conditional Gaussian graphical model with censored and/or missing values.
datacggm(Y, lo = -Inf, up = +Inf, X = NULL, control = list(maxit = 1.0E+4, thr = 1.0E-4))
datacggm(Y, lo = -Inf, up = +Inf, X = NULL, control = list(maxit = 1.0E+4, thr = 1.0E-4))
Y |
a |
lo |
the lower censoring vector; |
up |
the upper censoring vector; |
X |
an optional |
control |
a named list used to pass the arguments to the EM algorithm (see below for more details). The components are:
|
The function ‘datacggm
’ returns an R object of class ‘datacggm
’, that is a named list containing the elements needed to fit a conditional graphical LASSO (cglasso) model to datasets with censored and/or missing values.
A set of specific method functions are developed to decsribe data with censored/missing values. For example, the method function ‘print.datacggm
’ prints out the left and right-censored values using the following rules: a right-censored value is labeled adding the symbol ‘+
’ at the end of the value, whereas the symbol ‘-
’ is used for the left-censored values (see examples below). The summary statistics can be obtained using the method function ‘summary.datacggm
’. The matrices Y
and X
are extracted from a datacggm
object using the function ‘getMatrix
’.
For each column of the matrix ‘Y
’, mean and variance are estimated using a standard EM-algorithm based on the assumption of a Gaussian distribution. ‘maxit
’ and ‘thr
’ are used to set the number of iterations and the threshold for convergence, respectively. Marginal means and variances can be extracted using the accessor functions ‘ColMeans
’ and ‘ColVars
’, respectively. Furthermore, the plotting functions ‘hist.datacggm
’ and ‘qqcnorm
’ can be used to inspect the marginal distribution of each column of the matrix ‘Y
’.
The status indicator matrix, denoted by R
, can be extracted by using the function event
. The entries of this matrix specify the status of an observation using the following code:
‘R[i, j] = 0
’ means that the is inside the open interval
(lo[j], up[j])
;
‘R[i, j] = -1
’ means that the is a left-censored value;
‘R[i, j] = +1
’ means that the is a right-censored value;
‘R[i, j] = +9
’ means that the is a missing value.
See below for the other functions related to an object of class ‘datacggm
’.
‘datacggm
’ returns an R object of S3 class “datacggm
”, that is, a nested named list containing the
following components:
Y |
the |
X |
the |
Info |
|
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020a) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020b) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
Related to the R objects of class “datacggm
” there are the accessor functions, rowNames
, colNames
, getMatrix
, ColMeans
, ColVars
, upper
, lower
, event
, qqcnorm
and the method functions is.datacggm
, dim.datacggm
, summary.datacggm
and hist.datacggm
. The function rcggm
can be used to simulate a dataset from a conditional Gaussian graphical model whereas cglasso
is the model fitting function devoted to the l1-penalized censored Gaussian graphical model.
set.seed(123) # a dataset from a right-censored Gaussian graphical model n <- 100L p <- 3L Y <- matrix(rnorm(n * p), n, p) up <- 1 Y[Y >= up] <- up Z <- datacggm(Y = Y, up = up) Z # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 Y <- matrix(rnorm(n * p), n, p) up <- 1 lo <- -1 Y[Y >= up] <- up Y[Y <= lo] <- lo X <- matrix(rnorm(n * q), n, q) Z <- datacggm(Y = Y, lo = lo, up = up, X = X) Z # a dataset from a conditional censored Gaussian graphical model # and with missing-at-random values n <- 100L p <- 3L q <- 2 Y <- matrix(rnorm(n * p), n, p) NA.id <- matrix(rbinom(n * p, 1L, 0.01), n, p) Y[NA.id == 1L] <- NA up <- 1 lo <- -1 Y[Y >= up] <- up Y[Y <= lo] <- lo X <- matrix(rnorm(n * q), n, q) Z <- datacggm(Y = Y, lo = lo, up = up, X = X) Z
set.seed(123) # a dataset from a right-censored Gaussian graphical model n <- 100L p <- 3L Y <- matrix(rnorm(n * p), n, p) up <- 1 Y[Y >= up] <- up Z <- datacggm(Y = Y, up = up) Z # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 Y <- matrix(rnorm(n * p), n, p) up <- 1 lo <- -1 Y[Y >= up] <- up Y[Y <= lo] <- lo X <- matrix(rnorm(n * q), n, q) Z <- datacggm(Y = Y, lo = lo, up = up, X = X) Z # a dataset from a conditional censored Gaussian graphical model # and with missing-at-random values n <- 100L p <- 3L q <- 2 Y <- matrix(rnorm(n * p), n, p) NA.id <- matrix(rbinom(n * p, 1L, 0.01), n, p) Y[NA.id == 1L] <- NA up <- 1 lo <- -1 Y[Y >= up] <- up Y[Y <= lo] <- lo X <- matrix(rnorm(n * q), n, q) Z <- datacggm(Y = Y, lo = lo, up = up, X = X) Z
Retrieve the dimension of an R object of class “datacggm”, that is, the sample size (), the number of response variables (
) and the number of predictors (
).
## S3 method for class 'datacggm' dim(x)
## S3 method for class 'datacggm' dim(x)
x |
an R object of class ‘ |
For an R object of class ‘datacggm
’, dim
retrieves a list with elements named Y
and X
. The first component is the “dim
” attribute of the matrix Y
whereas the second component is the “dim
” attribute of the matrix X
. If X
is missing, the second component is NULL
.
For an object of class ‘datacggm
’, dim
retrieves a named list with components Y
and X
which are the “dim
” attribute of the two matrices, respectively. If X
is missing then the last component is NULL
.
Luigi Augugliaro ([email protected])
datacggm
, rcggm
,nobs
, nresp
and npred
.
set.seed(123) # a dataset from a censored Gaussian graphical model n <- 100L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) dim(Z) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) dim(Z)
set.seed(123) # a dataset from a censored Gaussian graphical model n <- 100L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) dim(Z) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) dim(Z)
Retrieve or set the dimnames of a “datacggm” object.
## S3 method for class 'datacggm' dimnames(x) ## S3 replacement method for class 'datacggm' dimnames(x) <- value
## S3 method for class 'datacggm' dimnames(x) ## S3 replacement method for class 'datacggm' dimnames(x) <- value
x |
an object of class ‘ |
value |
a nested list with names ‘ |
For an R object ‘x
’ of class ‘datacggm
’, dimnames
retrieves or sets the dimnames
attribute for the matrices Y
and X
(see ‘datacggm
’ for more details). When setting the dimnames
attribute, value$Y
can be NULL
(which is not stored) or a list of length two. In the last case, value$Y
is passed to the method function dimnames
for setting the dimnames
attributes of the matrix retrieved by getMatrix(x, "Y")
. In the same way, value$X
can be used for setting the dimnames
attributes of the matrix retrieved by getMatrix(x, "X")
.
Luigi Augugliaro ([email protected])
datacggm
, rcggm
, getMatrix
, rowNames
and colNames
.
set.seed(123) # a dataset from a censored Gaussian graphical model n <- 100L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) dimnames(Z) dimnames(Z) <- list(Y = list(paste0("i", seq_len(n)), paste("Y", 1:p, sep = ":"))) dimnames(Z) # the same as # dimnames(Z)$Y <- list(paste0("i", seq_len(n)), paste("Y", 1:p, sep = ":")) # dimnames(Z) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) dimnames(Z) dimnames(Z) <- list(Y = list(NULL, paste("Y", 1:p, sep = ":")), X = list(NULL, paste("X", 1:q, sep = ":"))) dimnames(Z) # the same as # dimnames(Z)$Y <- list(NULL, paste("Y", 1:p, sep = ":")) # dimnames(Z)$X <- list(NULL, paste("X", 1:q, sep = ":")) # dimnames(Z)
set.seed(123) # a dataset from a censored Gaussian graphical model n <- 100L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) dimnames(Z) dimnames(Z) <- list(Y = list(paste0("i", seq_len(n)), paste("Y", 1:p, sep = ":"))) dimnames(Z) # the same as # dimnames(Z)$Y <- list(paste0("i", seq_len(n)), paste("Y", 1:p, sep = ":")) # dimnames(Z) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) dimnames(Z) dimnames(Z) <- list(Y = list(NULL, paste("Y", 1:p, sep = ":")), X = list(NULL, paste("X", 1:q, sep = ":"))) dimnames(Z) # the same as # dimnames(Z)$Y <- list(NULL, paste("Y", 1:p, sep = ":")) # dimnames(Z)$X <- list(NULL, paste("X", 1:q, sep = ":")) # dimnames(Z)
datacggm
’ ObjectThe ‘event
’ function retrieves the status indicator matrix from an object of class ‘datacggm
’.
event(x, ordered = FALSE)
event(x, ordered = FALSE)
x |
an object of class ‘ |
ordered |
logical value used to specify if the rows of the status indicator matrix should be ordered according to the patterns of censored/missing values. Default |
The ‘event
’ function is used to retrieve the status indicator matrix, denoted by R
, from an object of class ‘datacggm
’. The entries of the matrix are used to specify the status of the response variable:
‘R[i, j] = 0
’ means that is inside the open interval
(lo[j], up[j])
;
‘R[i, j] = -1
’ means that is a left-censored value;
‘R[i, j] = +1
’ means that is a right-censored value;
‘R[i, j] = +9
’ means that is a missing value.
event
returns a -dimensional matrix.
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
set.seed(123) # Y ~ N(b0 + XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random euqals to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) # status indicator matrix event(Z) # in this case the status indicator matrix is returned with # rows ordered according to the patterns of missing data event(Z, ordered = TRUE)
set.seed(123) # Y ~ N(b0 + XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random euqals to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) # status indicator matrix event(Z) # in this case the status indicator matrix is returned with # rows ordered according to the patterns of missing data event(Z, ordered = TRUE)
Simulated data, used to demonstrate the features of datacggm class. It contains two simulated datasets that mimic the structural characteristics of the two real datasets MKMEP and MM.
MKMEP.Sim
: is a reponse matrix containing 50 samples of 10 genes;
MM.Sim
: is a list containing two components;
Y
: is the response matrix containing 50 measurements of 10 miRNAs
X
: is the predictor matrix containing 50 measurements of 5 variables (four numerical and one categorical)
In both datasets, the response matrices are right-censored with an upper limit of detection fixed to 40.
data("Example")
data("Example")
cglasso
, to_graph
, and the method functions summary
, coef
, plot
, AIC.cglasso
, BIC.cglasso
, MKMEP
and MM
.
data("Example") MM.Sim <- datacggm(Y = MM.Sim$Y, up = 40, X = MM.Sim$X) ColMeans(MM.Sim) ColVars(MM.Sim) summary(MM.Sim) MKMEP.Sim <- datacggm(Y = MKMEP.Sim, up = 40) ColMeans(MKMEP.Sim) ColVars(MKMEP.Sim) summary(MKMEP.Sim)
data("Example") MM.Sim <- datacggm(Y = MM.Sim$Y, up = 40, X = MM.Sim$X) ColMeans(MM.Sim) ColVars(MM.Sim) summary(MM.Sim) MKMEP.Sim <- datacggm(Y = MKMEP.Sim, up = 40) ColMeans(MKMEP.Sim) ColVars(MKMEP.Sim) summary(MKMEP.Sim)
The accessor function fitted
extracts model fitted values from an R object inheriting class ‘cglasso
’.
## S3 method for class 'cglasso' fitted(object, lambda.id, rho.id, drop = TRUE, ...)
## S3 method for class 'cglasso' fitted(object, lambda.id, rho.id, drop = TRUE, ...)
object |
an R object inheriting class ‘ |
lambda.id |
a vector of integers used to specify the |
rho.id |
a vector of integers used to specify the |
drop |
logical. Dimensions can only be dropped if their extent is one. |
... |
further arguments passed to or from other methods. |
Fitted values extracted from ‘object
’ are returned.
Luigi Augugliaro ([email protected])
Model-fitting functions cglasso
, cggm
and the accessor functions coef.cglasso
, residuals.cglasso
, predict.cglasso
and impute
.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) fitted(out, rho.id = 3L, drop = TRUE) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) fitted(out, lambda.id = 3L, rho.id = 3L, drop = TRUE)
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) fitted(out, rho.id = 3L, drop = TRUE) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) fitted(out, lambda.id = 3L, rho.id = 3L, drop = TRUE)
cglasso2igraph
’ Object‘getGraph
’ retrieves graphs from an R object of class ‘cglasso2igraph
’.
getGraph(x, type = c("Gyy", "Gxy", "both"))
getGraph(x, type = c("Gyy", "Gxy", "both"))
x |
an object of class ‘ |
type |
a description of the required graph. Default is ‘ |
getGraph
retrieves an R object of class ‘igraph
’ representing the graph required by the argument type.
Luigi Augugliaro ([email protected])
to_graph
, is.cglasso2igraph
and plot.cglasso2igraph
.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) getGraph(out.graph) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out, lambda.id = 3, rho.id = 3, weighted = TRUE) getGraph(out.graph, type = "Gyy") getGraph(out.graph, type = "Gxy") getGraph(out.graph, type = "both")
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) getGraph(out.graph) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out, lambda.id = 3, rho.id = 3, weighted = TRUE) getGraph(out.graph, type = "Gyy") getGraph(out.graph, type = "Gxy") getGraph(out.graph, type = "both")
Y
’ and ‘X
’ from a ‘datacggm
’ Object‘getMatrix
’ retrieves matrices ‘Y
’ and/or ‘X
’ from an object of class ‘datacggm
’.
getMatrix(x, name = c("Y", "X", "both"), ordered = FALSE)
getMatrix(x, name = c("Y", "X", "both"), ordered = FALSE)
x |
an object of class ‘ |
name |
the name of the required matrix. |
ordered |
logical value used to specify if the required matrix should be retrieved with rows ordered according to the patterns of censored values. See below for some example. |
getMatrix
retrieves the matrix specified by ‘name
’ and with row ordering specified by ‘ordered
’. A named list returned if name
is "both"
.
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
set.seed(123) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) getMatrix(Z, name = "Y") # in the following examples 'Y' and 'X' is returned with rows ordered # according to the patterns of censored data getMatrix(Z, name = "Y", ordered = TRUE) getMatrix(Z, name = "X") getMatrix(Z, name = "X", ordered = TRUE) getMatrix(Z, name = "both") getMatrix(Z, name = "both", ordered = TRUE)
set.seed(123) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) getMatrix(Z, name = "Y") # in the following examples 'Y' and 'X' is returned with rows ordered # according to the patterns of censored data getMatrix(Z, name = "Y", ordered = TRUE) getMatrix(Z, name = "X") getMatrix(Z, name = "X", ordered = TRUE) getMatrix(Z, name = "both") getMatrix(Z, name = "both", ordered = TRUE)
datacggm
ObjectCreates histograms with a normal density as background and two segments corresponding to left and/or right censored values (if any) in a conditional censored Gaussian graphical model.
## S3 method for class 'datacggm' hist(x, breaks = "Sturges", include.lowest = TRUE, right = TRUE, nclass = NULL, which, max.hist = 1L, save.hist = FALSE, grdev = pdf, grdev.arg, polygon.col = adjustcolor("grey", alpha.f = 0.25), polygon.border = NA, segments.lwd = 4L, segments.lty = 2L, segments.col = "gray40", points.pch = c(4L, 1L), points.cex = 1.8, points.col = rep("black", 2L), legend = TRUE, ...)
## S3 method for class 'datacggm' hist(x, breaks = "Sturges", include.lowest = TRUE, right = TRUE, nclass = NULL, which, max.hist = 1L, save.hist = FALSE, grdev = pdf, grdev.arg, polygon.col = adjustcolor("grey", alpha.f = 0.25), polygon.border = NA, segments.lwd = 4L, segments.lty = 2L, segments.col = "gray40", points.pch = c(4L, 1L), points.cex = 1.8, points.col = rep("black", 2L), legend = TRUE, ...)
x |
an object of class ‘ |
breaks |
one of:
See |
include.lowest |
logical; if |
right |
logical; if |
nclass |
numeric (integer). For S(-PLUS) compatibility only, |
which |
a vector of integers used to specify the response variables for which the histogram is required. |
max.hist |
the maximum number of histograms drawn in a single figure. |
save.hist |
a logical variable or a string specifying the path of the directory where plots will be saved. Letting ‘ |
grdev |
the graphics device used to save the required histograms on external files. See ‘ |
grdev.arg |
additional parameters passed to the graphics device specified by ‘ |
polygon.col |
graphical parameter; the colour for filling the area underlying the Gaussian distribution. |
polygon.border |
graphical parameter; the colour of the border of the Gaussian distribution. |
segments.lwd |
graphical parameter; the line width used in plotting the segments associated to the symbols specified by ‘ |
segments.lty |
graphical parameter; the line type used in plotting the segments associated to the symbols specified by ‘ |
segments.col |
graphical parameter; the colour used in plotting the segments associated to the symbols specified by ‘ |
points.pch |
graphical parameter; a pair of integers specifying the symbols used to plot the proportion of censored response variables (default ‘ |
points.cex |
graphical parameter; a numerical value giving the amount by which the symbols, specified by ‘ |
points.col |
graphical parameter; the colours used to plot the symbols specified by ‘ |
legend |
logical; if ‘ |
... |
additional graphical parameters passed to ‘ |
For each response variable, the method function ‘hist.datacggm
’ plots a histogram using only the observed values. Densities, plotted on the y
-axis, are computed in such a way that the sum of all densities plus the proportion of censored values is equal to one.
To evaluate the distributional assumption underlying the censored Gaussian distribution, on each histogram the area underlying the Gaussian density function is also shown in the background (marginal parameters are estimated as described in ‘datacggm
’ and ‘ColMeans
’). Furthermore, on each plot, the proportions of left/right censored values are graphically compared with the corresponding Gaussian tail probabilities. If the assumption about the censoring mechanism is satisfied, then the proportion of censored values and the tail probability are approximately equals to each other.
Gianluca Sottile ([email protected])
datacggm
, rcggm
, ColMeans
, ColVars
and qqcnorm
.
set.seed(123) # a dataset from a right-censored Gaussian graphical model n <- 1000L p <- 10L Y <- matrix(rnorm(n * p), n, p) up <- 1 Y[Y >= up] <- up Z <- datacggm(Y = Y, up = up) hist(Z, max.hist = 4L)
set.seed(123) # a dataset from a right-censored Gaussian graphical model n <- 1000L p <- 10L Y <- matrix(rnorm(n * p), n, p) up <- 1 Y[Y >= up] <- up Z <- datacggm(Y = Y, up = up) hist(Z, max.hist = 4L)
Imputes multivariate missing and censored values.
impute(object, type = c("mar", "censored", "both"), lambda.new, rho.new)
impute(object, type = c("mar", "censored", "both"), lambda.new, rho.new)
object |
an R object inheriting class ‘ |
type |
a description of the imputation required (see section ‘Description’). By default only the missing-at-random values are imputed. |
lambda.new |
value of the tuning parameter |
rho.new |
value of the tuning parameter |
The impute
function returns the response matrix with the imputed missing-at-random values (‘type = "mar"
’), the imputed censored values (‘type = "censored"
’), or both (‘type = "both"
’).
If ‘type = "mar"
’ then, for each row, the missing response values are replaced with the expected values of a multivariate normal distribution conditioned on the observed response values.
If ‘type = "censored"
’ then, for each row, the censored response values are imputed using the expected values of a multivariate truncated normal distribution conditioned on the observed response values.
If ‘type = "both"
’ then missing-at-random and censored values are imputed. In this case the returned matrix corresponds to the working response matrix computed during the E-Step.
The impute
function returns a response matrix with the required imputed data.
Luigi Augugliaro ([email protected])
Model-fitting functions cglasso
, cggm
and the accessor functions coef.cglasso
, fitted.cglasso
, residuals.cglasso
and predict.cglasso
.
set.seed(123) # Y ~ N(0, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) out <- cglasso(. ~ ., data = Z) rho.new <- mean(out$rho) # imputing missing values Y.impute <- impute(out, type = "mar", rho.new = rho.new) # imputing censored values Y.impute <- impute(out, type = "censored", rho.new = rho.new) # imputing missing and censored values Y.impute <- impute(out, type = "both", rho.new = rho.new) # Y ~ N(XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) out <- cglasso(. ~ ., data = Z) lambda.new <- mean(out$lambda) rho.new <- mean(out$rho) # imputing missing values Y.impute <- impute(out, type = "mar", lambda.new = lambda.new, rho.new = rho.new) # imputing censored values Y.impute <- impute(out, type = "censored", lambda.new = lambda.new, rho.new = rho.new) # imputing missing and censored values Y.impute <- impute(out, type = "both", lambda.new = lambda.new, rho.new = rho.new)
set.seed(123) # Y ~ N(0, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) out <- cglasso(. ~ ., data = Z) rho.new <- mean(out$rho) # imputing missing values Y.impute <- impute(out, type = "mar", rho.new = rho.new) # imputing censored values Y.impute <- impute(out, type = "censored", rho.new = rho.new) # imputing missing and censored values Y.impute <- impute(out, type = "both", rho.new = rho.new) # Y ~ N(XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) out <- cglasso(. ~ ., data = Z) lambda.new <- mean(out$lambda) rho.new <- mean(out$rho) # imputing missing values Y.impute <- impute(out, type = "mar", lambda.new = lambda.new, rho.new = rho.new) # imputing censored values Y.impute <- impute(out, type = "censored", lambda.new = lambda.new, rho.new = rho.new) # imputing missing and censored values Y.impute <- impute(out, type = "both", lambda.new = lambda.new, rho.new = rho.new)
is.cglasso2igraph
tests if its argument is an object of class ‘cglasso2igraph’.
is.cglasso2igraph(x)
is.cglasso2igraph(x)
x |
object to be tested. |
Luigi Augugliaro ([email protected])
set.seed(123) n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) is.cglasso2igraph(out.graph )
set.seed(123) n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) is.cglasso2igraph(out.graph )
‘is.datacggm
’ tests if its argument is an object of class ‘datacggm’.
is.datacggm(x)
is.datacggm(x)
x |
object to be tested. |
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020a) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020b) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
set.seed(123) n <- 100L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) is.datacggm(Z) # TRUE
set.seed(123) n <- 100L p <- 3L b0 <- rep(0, p) Z <- rcggm(n = n, b0 = b0, probl = 0.05, probr = 0.05) is.datacggm(Z) # TRUE
Functions ‘lower
’ and ‘upper
’ retrieve the lower and upper censoring values from an object of class “datacggm”.
lower(x) upper(x)
lower(x) upper(x)
x |
an object of class ‘ |
For an R object x
of class ‘datacggm
’, lower
(upper
) retrieves the lower (upper) censoring values of the response variables.
Luigi Augugliaro ([email protected])
set.seed(123) n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probr = 0.05) lower(Z) upper(Z) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05) lower(Z) upper(Z) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) lower(Z) upper(Z)
set.seed(123) n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probr = 0.05) lower(Z) upper(Z) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05) lower(Z) upper(Z) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) lower(Z) upper(Z)
In a study about the formation of blood cells, Psaila and others (2016) have recently identified three distinct sub-populations of cells, which are all derived from hematopoietic stem cells through cell differentiation. One of these sub-populations, denoted by MK-MEP, is a previously unknown, rare population of cells that are bipotent but primarily generate megakaryocytic progeny.
Multiplex RT-qPCR has been used to profile 63 genes and 48 single human MK-MEP cells. RT-qPCR data are typically right-censored with a limit of detection fixed by the manufacturer to 40. Raw data have been mean normalized using the method proposed in Pipelers and others (2017). See Section 5 in Augugliaro and others (2018) for more details.
‘MKMEP
’ is a matrix containing a subset of the data available from Psaila and others (2016).
data("MKMEP")
data("MKMEP")
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
Pipelers, P., Clement, L., Vynck, M., Hellemans, J., Vandesompele, J. and Thas, O. (2017) <doi: 10.1371/journal.pone.0182832>. A unified censored normal regression model for qPCR differential gene expression analysis. PLoS One 12, e0182832.
Psaila, B., Barkas, N., Iskander, D., Roy, A., Anderson, S., Ashley, N., Caputo, V. S., Lichtenberg, J., Loaiza, S., Bodine, D. M. and others. (2016) <doi: 10.1186/s13059-016-0939-7> Single-cell profiling of human megakaryocyte-erythroid progenitors identifies distinct megakaryocyte and erythroid differentiation pathways. Genome Biology 17, 83–102.
cglasso
, to_graph
, and the method functions summary
, coef
, plot
, AIC.cglasso
and BIC.cglasso
.
data("MKMEP") is.matrix(MKMEP) Z <- datacggm(Y = MKMEP, up = 40) str(Z) is.datacggm(Z) print(Z, width = 80L)
data("MKMEP") is.matrix(MKMEP) Z <- datacggm(Y = MKMEP, up = 40) str(Z) is.datacggm(Z) print(Z, width = 80L)
MicroRNAs (miRNAs) are endogenous small non-coding RNAs, approximately 22 nucleotides in length, that play a regulatory role in gene expression by mediating mRNA cleavage or translation expression. Several studies have shown that a deregulation of the miRNAs can cause a disruption in the gene regulation mechanisms of the cell and that this might even lead to cancerous phenotypes.
Gutierrez and others (2010) investigated the expression level of a set of 141 miRNAs, measured by RT-qPCR technology on a sample of 64 patients with multiple myeloma (MM). Patients were selected to represent the most relevant recurrent genetic abnormalities in MM. RT-qPCR data are typically right-censored and, in this study, the upper limit of detection was fixed to 40 cycles.
Formally, MM
is a list with containing a subset of the date studied in Gutierrez and others (2010). Its components are:
Y
: the response matrix containing the cycle-threshold of the measured 49 miRNAs;
X
: the predictor matrix containing the expression levels of 15 endogenous internal reference genes, called housekeeping, and a factor encoding the cytogenetic abnormalities.
data("MM")
data("MM")
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
Gutierrez, N. Sarasquete, M., Misiewicz-Krzeminska, I., Delgado, M. De Las Rivas, J., Ticona, F.V., Ferminan, E., Martin-Jimenez, P., Chillon, C., Risueno, A., Hernandez, J.M., Garcia-Sanz, R., Gonzalez, M. and San Miguel, J.F. (2010) <doi:10.1038/leu.2009.274>. Deregulation of microRNA expression in the different genetic subtypes of multiple myeloma and correlation with gene expression profiling. Leukemia 24(3), 629–637.
cglasso
, to_graph
, and the method functions summary
, coef
, plot
, AIC.cglasso
and BIC.cglasso
.
data("MM") Z <- datacggm(Y = MM$Y, X = MM$X, up = 40) print(Z, width = 80L)
data("MM") Z <- datacggm(Y = MM$Y, X = MM$X, up = 40) print(Z, width = 80L)
Extract the number of observations, response variables and predictors from an object of class datacggm
.
## S3 method for class 'datacggm' nobs(object, ...) ## S3 method for class 'datacggm' nresp(object, ...) ## S3 method for class 'datacggm' npred(object, ...)
## S3 method for class 'datacggm' nobs(object, ...) ## S3 method for class 'datacggm' nresp(object, ...) ## S3 method for class 'datacggm' npred(object, ...)
object |
an R object of class |
... |
further arguments to be passed to methods. |
Luigi Augugliaro ([email protected])
datacggm
, rcggm
and dim.datacggm
.
set.seed(123) n <- 100 p <- 3 q <- 2 b0 <- rep(1, p) X <- matrix(rnorm(n * q), n, q) B <- matrix(rnorm(q * p), q, p) Sigma <- outer(1:p, 1:p, function(i, j) 0.3^abs(i - j)) probl <- 0.05 probr <- 0.05 probna <- 0.05 Z <- rcggm(n = n, b0 = b0, Sigma = Sigma, probl = probl, probr = probr, probna = probna) nobs(Z) nresp(Z) npred(Z) Z <- rcggm(b0 = b0, X = X, B = B, Sigma = Sigma, probl = probl, probr = probr, probna = probna) nobs(Z) nresp(Z) npred(Z)
set.seed(123) n <- 100 p <- 3 q <- 2 b0 <- rep(1, p) X <- matrix(rnorm(n * q), n, q) B <- matrix(rnorm(q * p), q, p) Sigma <- outer(1:p, 1:p, function(i, j) 0.3^abs(i - j)) probl <- 0.05 probr <- 0.05 probna <- 0.05 Z <- rcggm(n = n, b0 = b0, Sigma = Sigma, probl = probl, probr = probr, probna = probna) nobs(Z) nresp(Z) npred(Z) Z <- rcggm(b0 = b0, X = X, B = B, Sigma = Sigma, probl = probl, probr = probr, probna = probna) nobs(Z) nresp(Z) npred(Z)
cggm
’ ObjectThe plot.cggm
produces graphs from an R object of class ‘cggm
’.
## S3 method for class 'cggm' plot(x, type, weighted = FALSE, simplify = TRUE, ...)
## S3 method for class 'cggm' plot(x, type, weighted = FALSE, simplify = TRUE, ...)
x |
an R object of class ‘ |
type |
a description of the required graph. Default depends on the type of fitted model, that is, |
weighted |
logical. Should weighted graphs be created? Default is |
simplify |
logical. Should isolated vertices be removed from the graph? Default is |
... |
additional graphical arguments passed to the functions |
Luigi Augugliaro ([email protected])
cggm
, to_graph
, getGraph
and plot.cglasso2igraph
.
set.seed(123) # Y ~ N(XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.5, probna = 0.05) out <- cglasso(. ~ ., data = Z) out.mle <- cggm(out, lambda.id = 3L, rho.id = 3L) plot(out.mle, type = "Gyy") plot(out.mle, type = "Gxy") plot(out.mle, type = "both")
set.seed(123) # Y ~ N(XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.5, probna = 0.05) out <- cglasso(. ~ ., data = Z) out.mle <- cggm(out, lambda.id = 3L, rho.id = 3L) plot(out.mle, type = "Gyy") plot(out.mle, type = "Gxy") plot(out.mle, type = "both")
cglasso
’ Object‘The plot.cglasso
’ function produces plots to study the coefficient paths of fitted cglasso models.
## S3 method for class 'cglasso' plot(x, what = c("Theta", "diag(Theta)", "b0", "B"), penalty = ifelse(x$nrho >= x$nlambda, "rho", "lambda"), given = NULL, GoF = AIC, add.labels, matplot.arg1, matplot.arg2, labels.arg, abline.arg, mtext.arg, save.plot, grdev = pdf, grdev.arg, digits = 4L, ...)
## S3 method for class 'cglasso' plot(x, what = c("Theta", "diag(Theta)", "b0", "B"), penalty = ifelse(x$nrho >= x$nlambda, "rho", "lambda"), given = NULL, GoF = AIC, add.labels, matplot.arg1, matplot.arg2, labels.arg, abline.arg, mtext.arg, save.plot, grdev = pdf, grdev.arg, digits = 4L, ...)
x |
an R object of class ‘ |
what |
a character or a formula specifying the required plot. If |
penalty |
optional character argument used to specify the tuning parameter needed to plot the conditional coefficient path. Allowed descriptors are ‘ |
given |
an optional vector of integers identifying the conditioning values of the second tuning parameter to be used for the coefficient profile plots of the parameters identified by the input ‘ |
GoF |
a valid goodness-of-fit function, such as |
add.labels |
logical value. Should labels be added to the plot? |
matplot.arg1 |
a named list with the graphical parameters used to plot the paths of the estimates identified by ‘ |
matplot.arg2 |
a named list with the graphical parameters used to plot the paths of the remaining estimates. |
labels.arg |
a named list with the graphical parameters used to plot the labels. |
abline.arg |
a named list with the graphical parameters used to plot the line identifying the optimal tuning parameter value. |
mtext.arg |
a named list with the graphical parameters used to plot the text reported on the third axis. |
save.plot |
a logical variable or a string specifying the path of the directory where plots will be saved. Letting ‘ |
grdev |
the graphics device used to save the required histograms on external files. See ‘ |
grdev.arg |
additional parameters passed to the graphics device specified by ‘ |
digits |
number of digits used to print the value of the second tuning parameter identified by ‘ |
... |
further arguments passed to the chosen goodness-of-fit function, such as ‘ |
The function plot.cglasso
produces the coefficient profile plot. The output depends both on the type of fitted model and on the setting of the three main arguments, what
, penalty
and given
. Below we give more details.
If the model fitting function cglasso
is used to fit an l1-penalized censored Gaussian graphical model (see first part in Section Example), then the user can specify only the main argument ‘what
’, whereas ‘penalty
’ and ‘given
’ can be omitted. If ‘penalty
’ is specified, then it must be equal to “rho
”. The main argument ‘what
’ is used to specify the estimator that is plotted on the -axis. In this case, it must be equal to one of the following descriptors:
“Theta
”: the path of the estimated partial correlation coefficients is returned;
“diag(Theta)
”: the path of the diagonal entries of the estimated precision matrix is returned;
“b0
”: the path of the estimated expected values of the response variables is returned;
If a l1-penalized conditional censored Gaussian graphical model is fitted (see second part in Section Example), then all the main arguments can be used. In this case:
‘what
’ must be equal to one of the following descriptors:
“Theta
”: the path of the off-diagonal entries of the estimated precision matrix is returned;
“diag(Theta)
”: the path of the estimated partial correlation coefficients is returned;
“b0
”: the path of the estimated intercepts is returned;
“B
”: the path of the estimated regression coefficients is returned for each response variable;
‘penalty
’ is used to specify the tuning parameter that is reported on the -axis and it must be equal to one of the following descriptors:
“rho
”: the tuning parameter used to penalize the precision matrix;
“lambda
”: the tuning parameter used to penalize the regression coefficient matrix;
‘given
’ is an optional vector of integers used to specify the values of the second tuning parameter. For instance, letting what = "Theta"
, penalty = "rho"
and given = 1
then plot.cglasso
returns the plot of the estimated partial correlation coefficients versus the -values and given the first value of the used
-values. If
given
is left unspecified then a sequence of conditional profile plot is returned, a plot for each value of the second tuning parameter.
Optionally, the user can specify a conditional profile plot using a model formula with the following template:
‘what ~ penalty | given
’
For instance, the previous plot can be specified using the following model formula
‘Theta ~ rho | 1
’
In this case, the arguments penalty
and given
must be left unspecified whereas what
is used for the model formula (see also examples below).
The optional argument GoF
can be used to identify the non-zero estimates by the output of a goodness-of-fit function, such as AIC.cglasso
and BIC.cglasso
. In this case, a vertical red dashed line is used the identify the optimal value of the tuning parameter reported on the -axis, whereas the path of the identified non-zero estimates is drawn using a solid black line; the remaining paths are drawn using gray dashed lines. This option is disabled if we let ‘
GoF = NULL
’, when all the paths are drawn usign solid black lines.
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020a) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020b) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Friedman, J.H., Hastie, T., and Tibshirani, R. (2008) <doi:10.1093/biostatistics/kxm045>. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441.
Stadler, N. and Buhlmann, P. (2012) <doi:10.1007/s11222-010-9219-7>. Missing values: sparse inverse covariance estimation and an extension to sparse regression. Statistics and Computing 22, 219–235.
cglasso
, AIC.cglasso
and BIC.cglasso
.
set.seed(123) ################################################################################ # Model 1: censored glasso estimator (Augugliaro \emph{and other}, 2020a) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 ################################################################################ n <- 100L p <- 10L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) # coefficient profile plot of the off-diagonal entries of the precision matrix plot(out, what = "Theta") # the same as #plot(out, what = "Theta", penalty = "rho") # the output of a goodness-of-fit function can be used to identify the # non-zero estimates plot(out, what = "Theta", GoF = BIC) # letting 'GoF = NULL' the previous option is disabled plot(out, what = "Theta", GoF = NULL) # coefficient profile plot of the diagonal entries of the precision matrix plot(out, what = "diag(Theta)") # coefficient profile plot of the expected values of the response variables plot(out, what = "b0") ################################################################################ # Model 2: conditional censored glasso estimator (Augugliaro \emph{and other}, 2020b) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 ################################################################################ n <- 100L p <- 10L q <- 5L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) # conditional profile plot of the estimater partial correlation coefficients versus # the used values of the tunin parameter rho and given the first lambda-value plot(out, what = "Theta", penalty = "rho", given = 1L) out$lambda[1L] lambda.id <- c(2, 4) plot(out, what = "Theta", penalty = "rho", given = lambda.id) out$lambda[lambda.id] # in this case a sequence of ten conditional profile plots is returned, tha is a # plot for each lambda-value. plot(out, what = "Theta", penalty = "rho") # optionally, the user can specify the conditional profile plots using the model # formula plot(out, what = Theta ~ rho | lambda.id) lambda.id plot(out, what = Theta ~ rho) # the output of a goodness-of-fit function can be used to identify the # non-zero estimates plot(out, what = Theta ~ rho | 10, GoF = BIC) # letting 'GoF = NULL' the previous option is disabled plot(out, what = Theta ~ rho | 10, GoF = NULL)
set.seed(123) ################################################################################ # Model 1: censored glasso estimator (Augugliaro \emph{and other}, 2020a) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 ################################################################################ n <- 100L p <- 10L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) # coefficient profile plot of the off-diagonal entries of the precision matrix plot(out, what = "Theta") # the same as #plot(out, what = "Theta", penalty = "rho") # the output of a goodness-of-fit function can be used to identify the # non-zero estimates plot(out, what = "Theta", GoF = BIC) # letting 'GoF = NULL' the previous option is disabled plot(out, what = "Theta", GoF = NULL) # coefficient profile plot of the diagonal entries of the precision matrix plot(out, what = "diag(Theta)") # coefficient profile plot of the expected values of the response variables plot(out, what = "b0") ################################################################################ # Model 2: conditional censored glasso estimator (Augugliaro \emph{and other}, 2020b) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 ################################################################################ n <- 100L p <- 10L q <- 5L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) # conditional profile plot of the estimater partial correlation coefficients versus # the used values of the tunin parameter rho and given the first lambda-value plot(out, what = "Theta", penalty = "rho", given = 1L) out$lambda[1L] lambda.id <- c(2, 4) plot(out, what = "Theta", penalty = "rho", given = lambda.id) out$lambda[lambda.id] # in this case a sequence of ten conditional profile plots is returned, tha is a # plot for each lambda-value. plot(out, what = "Theta", penalty = "rho") # optionally, the user can specify the conditional profile plots using the model # formula plot(out, what = Theta ~ rho | lambda.id) lambda.id plot(out, what = Theta ~ rho) # the output of a goodness-of-fit function can be used to identify the # non-zero estimates plot(out, what = Theta ~ rho | 10, GoF = BIC) # letting 'GoF = NULL' the previous option is disabled plot(out, what = Theta ~ rho | 10, GoF = NULL)
plot.cglasso2igraph
produces graphs from an R object of class ‘cglasso2igraph
’.
## S3 method for class 'cglasso2igraph' plot(x, type, ...)
## S3 method for class 'cglasso2igraph' plot(x, type, ...)
x |
an R object of class ‘ |
type |
a description of the required graph. Default is ‘ |
... |
additional graphical arguments passed to the functions |
Luigi Augugliaro ([email protected])
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) plot(out.graph, type = "Gyy") out.graph <- to_graph(out, weighted = TRUE) plot(out.graph, type = "Gyy") # Y ~ N(b0 +XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out, lambda.id = 3, rho.id = 3, weighted = TRUE) plot(out.graph, type = "Gyy") plot(out.graph, type = "Gxy") plot(out.graph, type = "both")
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) plot(out.graph, type = "Gyy") out.graph <- to_graph(out, weighted = TRUE) plot(out.graph, type = "Gyy") # Y ~ N(b0 +XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out, lambda.id = 3, rho.id = 3, weighted = TRUE) plot(out.graph, type = "Gyy") plot(out.graph, type = "Gxy") plot(out.graph, type = "both")
GoF
’ Object‘The plot.GoF
’ function produces plots to study the sequence of fitted models.
## S3 method for class 'GoF' plot(x, add.line = TRUE, arg.line = list(lty = 2L, lwd = 2L, col = "red"), add.text = FALSE, arg.text = list(side = 3L), arg.points = list(pch = 2L), ...)
## S3 method for class 'GoF' plot(x, add.line = TRUE, arg.line = list(lty = 2L, lwd = 2L, col = "red"), add.text = FALSE, arg.text = list(side = 3L), arg.points = list(pch = 2L), ...)
x |
an R object of class ‘ |
add.line |
logical; if ‘ |
arg.line |
a named list of graphical parameters passed to the function |
add.text |
logical; if ‘ |
arg.text |
a list of further parameters passed to the function |
arg.points |
a named list of graphical parameters passed to the function |
... |
additional graphical arguments passed to the functions |
plot.GoF
is the plotting method function of an R object of class ‘GoF
’, that is, the output of a goodness-of-fit function (see AIC.cglasso
, or BIC.cglasso
). This function produces a plot aimed both to evaluate the sequence of fitted models in terms of goodness-of-fit and to identify the optimal values of the tuning parameters.
If a tuning parameter is held fixed, then plot.GoF
produces a plot showing the chosen measure of goodness-of-fit as a function of the remaining tuning parameter. In this case, the optimal value is identified by a vertical dashed line. The degrees-of-freedom of the selected fitted model are also shown.
If the cglasso model is fitted using both a sequence of and
values, then
plot.GoF
produces a contour plot and a triangle is used to identify the optimal pair of the two tuning parameters.
Luigi Augugliaro ([email protected])
cglasso
, AIC.cglasso
, BIC.cglasso
, summary.cglasso
and select_cglasso
.
set.seed(123) n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z, nlambda = 1L) plot(AIC(out)) plot(BIC(out)) out <- cglasso(. ~ ., data = Z, nrho = 1L) plot(AIC(out)) plot(BIC(out)) out <- cglasso(. ~ ., data = Z) plot(AIC(out)) plot(BIC(out))
set.seed(123) n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z, nlambda = 1L) plot(AIC(out)) plot(BIC(out)) out <- cglasso(. ~ ., data = Z, nrho = 1L) plot(AIC(out)) plot(BIC(out)) out <- cglasso(. ~ ., data = Z) plot(AIC(out)) plot(BIC(out))
Obtains predictions from an R object inheriting class ‘cglasso
’.
## S3 method for class 'cglasso' predict(object, type = c("B", "mu", "Sigma", "Theta"), X.new, lambda.new, rho.new, ...) ## S3 method for class 'cggm' predict(object, X.new, ...)
## S3 method for class 'cglasso' predict(object, type = c("B", "mu", "Sigma", "Theta"), X.new, lambda.new, rho.new, ...) ## S3 method for class 'cggm' predict(object, X.new, ...)
object |
an R object inheriting class ‘ |
type |
a description of prediction required. |
X.new |
matrix of new values for |
lambda.new |
value of the tuning parameter |
rho.new |
value of the tuning parameter |
... |
further arguments passed to or from other methods. |
If object
has S3 class ‘cglasso
’, then for a new pair of the tuning parameters and
, the
predict
function can be used to predict the estimate of the regression coefficient matrix (‘type = "B"
’), the estimate of the covariance matrix (‘type = "Sigma"
’) or the estimate of the precision matrix (‘type = "Theta"
’). If X.new
is missing and ‘type = "mu"
’, then the predict
function returns the predicted values using the matrix of predictors X
, otherwise the predicted fitted values are computed using the matrix X.new
.
For a new pair of the tuning parameters and
, the predicted values are computed using a bilinear interpolation.
If the object
has S3 class ‘cggm
’, then the predict
function returns only the predicted fitted values using the argument X.new
.
The matrix of predicted values.
Luigi Augugliaro ([email protected])
Model-fitting function cglasso
and the other accessor functions coef.cglasso
, fitted.cglasso
, residuals.cglasso
and impute
.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) rho.new <- mean(out$rho) Theta.pred <- predict(out, rho.new = rho.new, type = "Theta") Theta.pred # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) rho.new <- mean(out$rho) lambda.new <- mean(out$lambda) Theta.pred <- predict(out, lambda.new = lambda.new, rho.new = rho.new, type = "Theta") Theta.pred
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) rho.new <- mean(out$rho) Theta.pred <- predict(out, rho.new = rho.new, type = "Theta") Theta.pred # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) rho.new <- mean(out$rho) lambda.new <- mean(out$lambda) Theta.pred <- predict(out, lambda.new = lambda.new, rho.new = rho.new, type = "Theta") Theta.pred
‘QFun
’ extracts the values of the Q-function from an R object inheriting class ‘cglasso
’.
QFun(object, mle, verbose = FALSE, ...)
QFun(object, mle, verbose = FALSE, ...)
object |
an R object inheriting class ‘ |
mle |
logical. Should Q-values be computed using the maximum likelihood estimates? Default depends on the class of the argument |
verbose |
logical for printing out a progress bar on the R console. Default is |
... |
further arguments passed to |
‘QFun
’ returns the value of the Q-function, i.e., the value of the function maximised in the M-step of the EM algorithm. The Q-function is defined as follows:
where is the ‘working’ empirical covariance matrix computed during the E-step.
QFun
is used as a workhorse function to compute the measures of goodness-of-fit returned by the functions AIC.cglasso
and BIC.cglasso
.
The function ‘print.QFun
’ is used the improve the readability of the results.
‘QFun
’ returns an R object of S3 class “QFun
”, i.e., a named list containing the following components:
value |
a matrix with the values of the Q-function. |
df |
a matrix with the number of estimated non-zero parameters. |
dfB |
a matrix with the number of estimated non-zero regression coefficients. |
dfTht |
a matrix with the number of estimated non-zero partial correlation coefficients. |
n |
the sample size. |
p |
the number of response variables. |
q |
the number of columns of the design matrix |
lambda |
the |
nlambda |
the number of |
rho |
the |
nrho |
the number of |
model |
a description of the fitted model passed through the argument |
Luigi Augugliaro ([email protected])
AIC.cglasso
, BIC.cglasso
, cglasso
, cggm
, summary.cglasso
, select_cglasso
and to_graph
.
set.seed(123) # Y ~ N(b0+ XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1:p, 1:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.5, probna = 0.05) out <- cglasso(. ~ ., data = Z) QFun(out) out.mle <- cggm(out, lambda.id = 3L, rho.id = 3L) QFun(out.mle)
set.seed(123) # Y ~ N(b0+ XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1:p, 1:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.5, probna = 0.05) out <- cglasso(. ~ ., data = Z) QFun(out) out.mle <- cggm(out, lambda.id = 3L, rho.id = 3L) QFun(out.mle)
datacggm
ObjectCreates a quantile-quantile plot for a censored Gaussian distribution.
qqcnorm(x, which, max.plot = 1L, save.plot = FALSE, grdev = pdf, grdev.arg, main = "Censored Normal Q-Q Plot", xlab = "Theoretical Quantiles", ylab = "Sample Quantiles", plot.it = TRUE, plot.pch = c(2L, 20L), plot.col = c(2L, 1L), plot.cex = c(2L, 1L), abline = FALSE, line.col = "gray50", line.lwd = 1L, line.lty = 2L, ...)
qqcnorm(x, which, max.plot = 1L, save.plot = FALSE, grdev = pdf, grdev.arg, main = "Censored Normal Q-Q Plot", xlab = "Theoretical Quantiles", ylab = "Sample Quantiles", plot.it = TRUE, plot.pch = c(2L, 20L), plot.col = c(2L, 1L), plot.cex = c(2L, 1L), abline = FALSE, line.col = "gray50", line.lwd = 1L, line.lty = 2L, ...)
x |
an object of class ‘ |
which |
a vector of integers used to specify the response variables for which the histogram is required. |
max.plot |
the maximum number of plots drawn in a single figure. |
save.plot |
a logical variable or a string specifying the path of the directory where plots will be saved. Letting ‘ |
grdev |
the graphics device used to save the required histograms on external files. See ‘ |
grdev.arg |
additional parameters passed to the graphics device specified by ‘ |
main , xlab , ylab
|
plot labels. |
plot.it |
logical. Should the result be plotted? |
plot.pch |
a pair of graphical parameters. The first entry specifies the symbol used for plotting the points associated to the censoring values |
plot.col |
a pair of graphical parameters. The first entry specifies the colour used for plotting the points associated to the censoring values |
plot.cex |
a pair of graphical parameters. The first entry specifies the size of the symbol used for plotting the points associated to the censoring values |
abline |
logical. Should the line |
line.col |
graphical parameter. If ‘ |
line.lwd |
graphical parameter. If ‘ |
line.lty |
graphical parameter. If ‘ |
... |
additional graphical parameter passed to ‘ |
‘qqcnorm
’ produces a censored normal QQ plot, that is, a graphical method for comparing the empirical distribution of a given response variable (specified by the argument which
) to the censored Gaussian distribution, which is defined as:
|
if | |
|
if | |
|
if |
|
where and
are the probability density function and the cumulative ditribution of the standard normal distribution, respectively, whereas
and
are the lower and upper censoring values, respectively.
The comparison is done by plotting the empirical quantiles (-coordinate) against the theoretical quantiles (
-coordinate) of the censored Gaussian distribution, which are defined as follows:
|
if | |
|
if | |
|
if |
|
where . If the two distributions are similar, the points will approximately lie on the line
. If the distributions are linearly related, the points will approximately lie on a line, but not necessarily on the line
. In order to evaluate if the proportions of left/right-censored values are similar to the Gaussian tail probabilities, points corresponding to the censored values are plotted using a specific symbol (see argument ‘
plot.pch
’), colour (see argument ‘plot.col
’) and size (see argument ‘plot.cex
’).
Finally, maximum likelihood estimates of the marginal parameters and
are computed as described in ‘
datacggm
’ and can be extracted from an R of class ‘datacggm
’ by using the functions ‘ColMeans
’ and ‘ColVars
’, respectively.
A named list is silently returned. Each element of the list contains a two-columns matrix; first columns (named ‘x
’) contains the theoretical quantiles whereas second columns (named ‘y
’) contains the empirical quantiles.
Gianluca Sottile ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
datacggm
, rcggm
, ColMeans
, ColVars
and hist.datacggm
.
set.seed(123) # a dataset from a right-censored Gaussian graphical model n <- 1000L p <- 10L Y <- matrix(rnorm(n * p), n, p) up <- 1 Y[Y >= up] <- up Z <- datacggm(Y = Y, up = up) qqcnorm(Z, max.plot = 4L) # a dataset from a conditional censored Gaussian graphical model n <- 1000L p <- 10L q <- 2 Y <- matrix(rnorm(n * p), n, p) up <- 1 lo <- -1 Y[Y >= up] <- up Y[Y <= lo] <- lo X <- matrix(rnorm(n * q), n, q) Z <- datacggm(Y = Y, lo = lo, up = up, X = X) qqcnorm(Z, max.plot = 4L)
set.seed(123) # a dataset from a right-censored Gaussian graphical model n <- 1000L p <- 10L Y <- matrix(rnorm(n * p), n, p) up <- 1 Y[Y >= up] <- up Z <- datacggm(Y = Y, up = up) qqcnorm(Z, max.plot = 4L) # a dataset from a conditional censored Gaussian graphical model n <- 1000L p <- 10L q <- 2 Y <- matrix(rnorm(n * p), n, p) up <- 1 lo <- -1 Y[Y >= up] <- up Y[Y <= lo] <- lo X <- matrix(rnorm(n * q), n, q) Z <- datacggm(Y = Y, lo = lo, up = up, X = X) qqcnorm(Z, max.plot = 4L)
‘rcggm
’ function is used to produce one or more samples from a conditional Gaussian graphical model with censored and/or missing values.
rcggm(n, p, b0, X, B, Sigma, probl, probr, probna, ...)
rcggm(n, p, b0, X, B, Sigma, probl, probr, probna, ...)
n |
the number of samples required (optional, see below for a description). |
p |
the number of response variables (optional, see below for a description). |
b0 |
a vector of length |
X |
a matrix of dimension |
B |
a matrix of dimension |
Sigma |
a positive-definite symmetric matrix specifying the covariance matrix of the response variables. Default is the identity matrix (optional, see below for a description). |
probl |
a vector giving the probabilities that the response variables are left-censored. |
probr |
a vector giving the probabilities that the response variables are right-censored. |
probna |
the probability that a response value is missing-at-random. By default ‘ |
... |
further arguments passed to the function ‘ |
‘The rcggm
’ function simulates a dataset from a conditional Gaussian graphical model with censored or missing values and returns an object of class ‘datacggm
’. Censoring values are implicitly specified using arguments probl
and probr
, that is, lo
and up
are computed in such a way that the average probabilities of left and right censoring are equal to probl
and probr
, respectively. Missing-at-random values are simulated using a Bernoulli distribution with probability probna
.
The dataset is simulated through the following steps:
lower and upper censoring values (lo
and up
) are computed according to the arguments probl
and probr
;
The function mvrnorm
is used to simulate one or more samples from the multivariate Gaussian distribution specified by the arguments b0
, X
, B
and Sigma
;
The response values that are outside of the interval [lo, up]
are replaced with the corresponding censoring values;
if probna
is greater than zero, then missing-at-random values are simulated using a Bernoulli distribution with probability probna
.
Model | n |
p |
b0 |
X |
B |
Sigma |
Gaussian distribution |
1 | x |
x |
|
||||
2 | x |
x |
|
||||
3 | x |
x |
|
||||
4 | x |
x |
x |
|
|||
5 |
x |
x |
|
||||
6 |
x |
x |
x |
|
|||
7 | x |
x |
x |
|
|||
8 | x |
x |
x |
x |
|
The previous table sums up the default setting of the multivariate Gaussian distribution used in step 2 (specified arguments are marked by the symbol ‘x
’). See also below for some examples.
rcggm
returns an object of class ‘datacggm
’. See datacggm
for further details.
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020a) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020b) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
set.seed(123) n <- 100 p <- 3 q <- 2 b0 <- rep(1, p) X <- matrix(rnorm(n * q), n, q) B <- matrix(rnorm(q * p), q, p) Sigma <- outer(1:p, 1:p, function(i, j) 0.3^abs(i - j)) probl <- 0.05 probr <- 0.05 probna <- 0.05 # Model 1: Y ~ N(0, I) Z <- rcggm(n = n, p = p, probl = probl, probr = probr, probna = probna) summary(Z) # Model 2: Y ~ N(0, Sigma) Z <- rcggm(n = n, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z) # Model 3: Y ~ N(b0, I) Z <- rcggm(n = n, b0 = b0, probl = probl, probr = probr, probna = probna) summary(Z) # Model 4: Y ~ N(b0, Sigma) Z <- rcggm(n = n, b0 = b0, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z) # Model 5: Y ~ N(XB, I) Z <- rcggm(X = X, B = B, probl = probl, probr = probr, probna = probna) summary(Z) # Model 6: Y ~ N(XB, Sigma) Z <- rcggm(X = X, B = B, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z) # Model 7: Y ~ N(b0 + XB, I) Z <- rcggm(b0 = b0, X = X, B = B, probl = probl, probr = probr, probna = probna) summary(Z) # Model 8: Y ~ N(b0 + XB, Sigma) Z <- rcggm(b0 = b0, X = X, B = B, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z)
set.seed(123) n <- 100 p <- 3 q <- 2 b0 <- rep(1, p) X <- matrix(rnorm(n * q), n, q) B <- matrix(rnorm(q * p), q, p) Sigma <- outer(1:p, 1:p, function(i, j) 0.3^abs(i - j)) probl <- 0.05 probr <- 0.05 probna <- 0.05 # Model 1: Y ~ N(0, I) Z <- rcggm(n = n, p = p, probl = probl, probr = probr, probna = probna) summary(Z) # Model 2: Y ~ N(0, Sigma) Z <- rcggm(n = n, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z) # Model 3: Y ~ N(b0, I) Z <- rcggm(n = n, b0 = b0, probl = probl, probr = probr, probna = probna) summary(Z) # Model 4: Y ~ N(b0, Sigma) Z <- rcggm(n = n, b0 = b0, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z) # Model 5: Y ~ N(XB, I) Z <- rcggm(X = X, B = B, probl = probl, probr = probr, probna = probna) summary(Z) # Model 6: Y ~ N(XB, Sigma) Z <- rcggm(X = X, B = B, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z) # Model 7: Y ~ N(b0 + XB, I) Z <- rcggm(b0 = b0, X = X, B = B, probl = probl, probr = probr, probna = probna) summary(Z) # Model 8: Y ~ N(b0 + XB, Sigma) Z <- rcggm(b0 = b0, X = X, B = B, Sigma = Sigma, probl = probl, probr = probr, probna = probna) summary(Z)
Extracts model residuals from an R object inheriting class ‘cglasso
’.
## S3 method for class 'cglasso' residuals(object, type = c("observed", "working"), lambda.id, rho.id, drop = TRUE, ...)
## S3 method for class 'cglasso' residuals(object, type = c("observed", "working"), lambda.id, rho.id, drop = TRUE, ...)
object |
an R object inheriting class ‘ |
type |
a description of the desired residuals (see section ‘Details’). |
lambda.id |
a vector of integers used to specify the |
rho.id |
a vector of integers used to specify the |
drop |
logical. Dimensions can only be dropped if their extent is one. |
... |
further arguments passed to or from other methods. |
The accessor function ‘residuals
’ returns an array storing the ‘observed’ or ‘working’ residuals, depending on the argument type
.
The ‘observed’ residuals are defined as the difference between the observed response values and the corresponding fitted expected values. For missing and censored response values, the ‘observed’ residuals are set equal to ‘NA
’.
The ‘working’ residuals are obtained as a byproduct of the EM-algorithm used to fit the model. In the E-step, the algorithm computes the ‘working’ response matrix, that is, the response matrix with missing values replaced by the conditional expected values of the multivariate normal distribution and censored values replaced by the conditional expected values of the multivariate truncated normal distribution. The ‘working’ residuals are defined as the difference between ‘working’ responses and fitted expected values.
The accessor function ‘residuals
’ returns an array storing the desired residuals.
Luigi Augugliaro ([email protected])
Model-fitting functions cglasso
, cggm
and the accessor functions coef.cglasso
, fitted.cglasso
, predict.cglasso
and impute
.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) residuals(out, type = "observed", rho.id = 3L, drop = TRUE) residuals(out, type = "working", rho.id = 3L, drop = TRUE) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) residuals(out, type = "observed", lambda.id = 3L, rho.id = 3L, drop = TRUE) residuals(out, type = "working", lambda.id = 3L, rho.id = 3L, drop = TRUE)
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) residuals(out, type = "observed", rho.id = 3L, drop = TRUE) residuals(out, type = "working", rho.id = 3L, drop = TRUE) # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 1000L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) residuals(out, type = "observed", lambda.id = 3L, rho.id = 3L, drop = TRUE) residuals(out, type = "working", lambda.id = 3L, rho.id = 3L, drop = TRUE)
Retrieve or set the row or column names of a “datacggm” object.
rowNames(x) rowNames(x) <- value colNames(x) colNames(x) <- value
rowNames(x) rowNames(x) <- value colNames(x) colNames(x) <- value
x |
an R object of class ‘ |
value |
a named list with elements ‘ |
For an R object of class ‘datacggm
’, rowNames
(colNames
) retrieves or set rownames
(colnames
) for matrices Y
and X
. See below for some examples.
Luigi Augugliaro ([email protected])
datacggm
, rcggm
and the method function dimnames
.
set.seed(123) # a dataset from a censored Gaussian graphical model n <- 100L p <- 3L Z <- rcggm(n = n, p = p, probl = 0.05, probr = 0.05) rowNames(Z) rowNames(Z) <- list(Y = paste0("i", seq_len(n))) # the same as rowNames(Z)$Y <- paste0("i", seq_len(n)) rowNames(Z) colNames(Z) colNames(Z) <- list(Y = paste("Y", 1:p, sep = ":")) # the same as colNames(Z)$Y <- paste("Y", 1:p, sep = ":") colNames(Z) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) rowNames(Z)$Y <- paste0("i", seq_len(n)) rowNames(Z)$X <- paste0("i", seq_len(n)) dimnames(Z) colNames(Z)$Y <- paste("Y", 1:p, sep = ":") colNames(Z)$X <- paste("X", 1:q, sep = ":") dimnames(Z)
set.seed(123) # a dataset from a censored Gaussian graphical model n <- 100L p <- 3L Z <- rcggm(n = n, p = p, probl = 0.05, probr = 0.05) rowNames(Z) rowNames(Z) <- list(Y = paste0("i", seq_len(n))) # the same as rowNames(Z)$Y <- paste0("i", seq_len(n)) rowNames(Z) colNames(Z) colNames(Z) <- list(Y = paste("Y", 1:p, sep = ":")) # the same as colNames(Z)$Y <- paste("Y", 1:p, sep = ":") colNames(Z) # a dataset from a conditional censored Gaussian graphical model n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) Z <- rcggm(n = n, b0 = b0, X = X, B = B, probl = 0.05, probr = 0.05) rowNames(Z)$Y <- paste0("i", seq_len(n)) rowNames(Z)$X <- paste0("i", seq_len(n)) dimnames(Z) colNames(Z)$Y <- paste("Y", 1:p, sep = ":") colNames(Z)$X <- paste("X", 1:q, sep = ":") dimnames(Z)
‘select_cglasso
’ returns the optimal fitted model selected by a chosen measure of goodness-of-fit.
select_cglasso(object, GoF = AIC, ...)
select_cglasso(object, GoF = AIC, ...)
object |
an R object of class |
GoF |
a valid goodness-of-fit function, such as |
... |
further arguments passed to the chosen goodness-of-fit function (argument ‘ |
The function select_cglasso
evaluates the goodness-of-fit of the models fitted by cglasso
and extracts the selected model.
Model evaluation can be made in two ways. The easiest way is to use a valid goodness-of-fit function, such as AIC.cglasso
or BIC.cglasso
. In this case, further arguments are passed to these functions by ‘...’. The second way consists on passing the output of a goodness-of-fit function, that is, an R object of class ‘GoF
’. Usually, this approach is preferable when the computation of the chosen goodness-of-fit measure is time-consuming, such as when the sample size is small relative to the number of parameters and the AIC
or BIC
functions are used to evaluate a long sequence of fitted models. In these cases, we suggest the computation of several measures of goodness-of-fit in a preliminary step and then the use of the select_cglasso
function in a subsequent step to select the optimal fitted model.
‘select_cglasso
’ returns the optimal fitted model.
Gianluca Sottile ([email protected])
cglasso
, AIC.cglasso
and BIC.cglasso
.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z, lambda = 0.01) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC out <- cglasso(. ~ ., data = Z, rho = 0.01) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC out <- cglasso(. ~ ., data = Z) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z, lambda = 0.01) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC out <- cglasso(. ~ ., data = Z, rho = 0.01) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC out <- cglasso(. ~ ., data = Z) select_cglasso(out) # models selection by AIC select_cglasso(out, GoF = BIC) # models selection by BIC select_cglasso(out, GoF = BIC, mle = TRUE, g = 0.5) # models selection by eBIC
ShowStructure
gives information about the package structure.
ShowStructure(module = c("ALL", "DM", "MF", "MS", "NA"), description = TRUE, plot = TRUE)
ShowStructure(module = c("ALL", "DM", "MF", "MS", "NA"), description = TRUE, plot = TRUE)
module |
a description of the module for which the description is required. |
plot |
logical. If |
description |
logical. If |
ShowStructure
silently returns a named list with components:
description |
a named list containing the description of the available functions. |
graph |
an R object of class ‘ |
Luigi Augugliaro ([email protected])
‘summary
’ produces a summary of the models fitted using cglasso
or cggm
.
## S3 method for class 'cglasso' summary(object, GoF = AIC, print.all = TRUE, digits = 3L, ...)
## S3 method for class 'cglasso' summary(object, GoF = AIC, print.all = TRUE, digits = 3L, ...)
object |
an R object inheriting class |
GoF |
a valid goodness-of-fit function, such as |
print.all |
logical. Should all summary statistics be printed? |
digits |
the minimum number of significant digits to be used: see ‘print.default’. |
... |
the penalty parameter passed to the goodness-of-fit function (argument ‘ |
The function summary.cglasso
computes the summary statistics needed to evaluate the goodness-of-fit of the models fitted by cglasso
or cggm
.
Model evaluation can be made in two ways. The easiest way is to use a valid goodness-of-fit function, such as AIC.cglasso
or BIC.cglasso
. In this case, further arguments can be passed to these functions by the argument ‘...’. The second way consists in passing the output of a goodness-of-fit function, that is, an R object of class ‘GoF
’. Usually, this approach is preferable when the computation of the chosen goodness-of-fit measure is time-consuming, such as when the sample size is small relative to the number of parameters and the AIC
or BIC
functions are used to evaluate a sequence of fitted models. In these cases, we suggest the computation of several measures of goodness-of-fit in a preliminary step and then the use of summary.cglasso
in a subsequent step to evaluate the sequence of fitted models.
To improve the readability of the results, the output is divided into two sections.
The first section is structured in a sequence of tables showing, for each combination of the two tuning parameters ‘lambda
’ and ‘rho
’, the number of estimated non-zero regression coefficients (‘df.B
’), the number of estimated non-zero partial correlation coefficients (‘df.Tht
’), the degrees-of-freedom (‘df
’) and the value of the goodness-of-fit measure used to evaluate the fitted models. To help the user with the identification of the optimal fitted model, the last column of each table reports the ranking of the models (where the optimal model is marked with the symbol ‘<-
’).
The second section reports the summary statistics of the selected optimal model.
A named list with the following elements is silently returned:
table |
data.frame containing the summary statistics used to evaluate the sequence of fitted models. |
lambda.id |
position of the optimal |
rho.id |
position of the optimal |
Luigi Augugliaro ([email protected])
cglasso
, cggm
, AIC.cglasso
and BIC.cglasso
.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z, lambda = 0.01) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC out <- cglasso(. ~ ., data = Z, rho = 0.01) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC out <- cglasso(. ~ ., data = Z) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2 b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z, lambda = 0.01) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC out <- cglasso(. ~ ., data = Z, rho = 0.01) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC out <- cglasso(. ~ ., data = Z) summary(out) # models evaluation by AIC summary(out, GoF = BIC) # models evaluation by BIC summary(out, GoF = BIC, mle = TRUE, g = 0.5) # models evaluation by eBIC
datacggm
’The function ‘summary.datacggm
’ produces summaries of an object of class ‘datacggm
’.
## S3 method for class 'datacggm' summary(object, n, quantile.type = 7L, digits = 3L, quote = FALSE, ...)
## S3 method for class 'datacggm' summary(object, n, quantile.type = 7L, digits = 3L, quote = FALSE, ...)
object |
an object of class ‘ |
n |
maximum number of rows printed on video. By default all rows are printed. |
quantile.type |
an integer between 1 and 9 selecting one of the nine quantile algorithms (see also |
digits |
integer, used for number formatting with ‘ |
quote |
logical, indicating whether or not strings should be printed with surrounding quotes. |
... |
further arguments passed to |
The function ‘summary.datacggm
’ extends the results given by ‘summary.matrix()
’ taking into account the specific structure of an object of class ‘datacggm
’. Summary statistics are printed out in two sections: the first section reports the summary statistics for the response matrix ‘Y
’, whereas the second section is devoted to summarising the matrix of predictors ‘X
’.
For each response variable, the first section reports the standard summary statistics computed using only the observed values, that is the entries of the response matrix whose status indicators are equals to zero (see ‘event
’ for more details), together with the information about the lower and upper censoring values (columns named Lower
and Upper
) and the percentage of missing-at-random, left-censored and right-censored response values (columns named NA%
, LC%
and RC%
, respectively).
The second section is divided into two subsections reporting the main summary statistics computed for numeric and categorical predictors, respectively.
The two sections are printed out in such a way that the readability of the summary statistics is improved in a high-dimensional setting, that is, when the number of predictors and response variables exceeds the sample size.
‘summary.dataccgm
’ returns a named list with components
Y |
a matrix with class ‘ |
X.numeric |
a matrix with class ‘ |
X.categorical |
a list containing the summary measures on the categorical predictors. |
Luigi Augugliaro ([email protected])
Augugliaro L., Sottile G., Wit E.C., and Vinciotti V. (2023) <doi:10.18637/jss.v105.i01>. cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values. Journal of Statistical Software 105(1), 1–58.
Augugliaro, L., Sottile, G., and Vinciotti, V. (2020a) <doi:10.1007/s11222-020-09945-7>. The conditional censored graphical lasso estimator. Statistics and Computing 30, 1273–1289.
Augugliaro, L., Abbruzzo, A., and Vinciotti, V. (2020b) <doi:10.1093/biostatistics/kxy043>.
-Penalized censored Gaussian graphical model.
Biostatistics 21, e1–e16.
set.seed(123) # case 1: Y ~ N(0, Sigma) # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random equals to 0.05 n <- 50L p <- 100L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) summary(Z, max = n * p) # case 2: Y ~ N(b0 + XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random equals to 0.05 n <- 50L p <- 100L q <- 100L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) summary(Z) summary(Z, max = n * p)
set.seed(123) # case 1: Y ~ N(0, Sigma) # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random equals to 0.05 n <- 50L p <- 100L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) summary(Z, max = n * p) # case 2: Y ~ N(b0 + XB, Sigma) and # 1. probability of left/right censored values equal to 0.05 # 2. probability of missing-at-random equals to 0.05 n <- 50L p <- 100L q <- 100L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05, probna = 0.05) summary(Z) summary(Z, max = n * p)
‘to_graph
’ returns a named list of graphs using the results of an R object of class ‘cglasso
’ or ‘cggm
’.
to_graph(object, GoF = AIC, lambda.id, rho.id, weighted = FALSE, simplify = TRUE, ...)
to_graph(object, GoF = AIC, lambda.id, rho.id, weighted = FALSE, simplify = TRUE, ...)
object |
an R object inheriting class ‘ |
GoF |
a valid goodness-of-fit function, such as |
lambda.id |
optional. If |
rho.id |
optional. If |
weighted |
logical. Should weighted graphs be created? Default is |
simplify |
logical. Should isolated vertices be removed from the graph? Default is |
... |
further arguments passed to the chosen goodness-of-fit function (argument ‘ |
‘to_graph
’ returns a named list of graphs using the results of an R object of class ‘cglasso
’ or ‘cggm
’.
If object
has class ‘cglasso
’, then the goodness-of-fit function passed through the argument GoF
is used to identify the adjacency matrix (object$InfoStructure$Adj_yy
) describing the undirected edges among the response variables. If the model is fitted using
predictors, then the matrix describing the effects of the predictors onto the response variables (see object$InfoStructure$Adj_xy) is also returned. Finally, these matrices are used to return an undirected and directed graph. Opionally, the user can identify a specific fitted model using the arguments
lambda.id
and rho.id
.
If object
has class ‘cggm
’, then GoF
, lambda.id
and rho.id
can be omitted.
If argument weighted is set equal to ‘TRUE
’, then the estimated precision matrix and, if available, the estimated regression coefficient matrix are used to return weighted graphs. In this case, edges associated with positive estimates are shown using a solid line. Otherwise, a dashed line is used.
‘to_graph
’ returns an R object of S3 class “cglasso2igraph
”, i.e., a named list containing the
following components:
Gyy |
an undirected graph representing the conditional dependence structure among the |
Gxy |
a directed graph representing the effetcs of the |
Each component is an R object of class igraph
.
Luigi Augugliaro ([email protected])
cglasso
, cggm
and plot.cglasso2igraph
. For more details about the object of class ‘igraph
’, the interested reader is referred to the package igraph.
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) out.graph # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out, lambda.id = 3, rho.id = 3, weighted = TRUE) out.graph
set.seed(123) # Y ~ N(0, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out) out.graph # Y ~ N(b0 + XB, Sigma) and probability of left/right censored values equal to 0.05 n <- 100L p <- 3L q <- 2L b0 <- runif(p) B <- matrix(runif(q * p), nrow = q, ncol = p) X <- matrix(rnorm(n * q), nrow = n, ncol = q) rho <- 0.3 Sigma <- outer(1L:p, 1L:p, function(i, j) rho^abs(i - j)) Z <- rcggm(n = n, b0 = b0, X = X, B = B, Sigma = Sigma, probl = 0.05, probr = 0.05) out <- cglasso(. ~ ., data = Z) out.graph <- to_graph(out, lambda.id = 3, rho.id = 3, weighted = TRUE) out.graph