Package 'sglasso'

Title: Lasso Method for RCON(V,E) Models
Description: RCON(V, E) models are a kind of restriction of the Gaussian Graphical Models defined by a set of equality constraints on the entries of the concentration matrix. 'sglasso' package implements the structured graphical lasso (sglasso) estimator proposed in Abbruzzo et al. (2014) for the weighted l1-penalized RCON(V, E) model. Two cyclic coordinate algorithms are implemented to compute the sglasso estimator, i.e. a cyclic coordinate minimization (CCM) and a cyclic coordinate descent (CCD) algorithm.
Authors: Luigi Augugliaro
Maintainer: Luigi Augugliaro <[email protected]>
License: GPL (>= 2)
Version: 1.2.6
Built: 2024-11-28 06:39:34 UTC
Source: CRAN

Help Index


Lasso Method for RCON(V, E) Models

Description

RCON(V, E) models (Hojsgaard, et al., 2008) are a kind of restriction of the Gaussian Graphical Models defined by a set of equality constraints on the entries of the concentration matrix. sglasso package implements the structured graphical lasso (sglasso) estimator proposed in Abbruzzo et al. (2014) for the weighted l1-penalized RCON(V, E) model. Two cyclic coordinate algorithms are implemented to compute the sglasso estimator, i.e. a cyclic coordinate minimization (CCM) and a cyclic coordinate descent (CCD) algorithm.

Details

Package: sglasso
Type: Package
Version: 1.2.6
Date: 2023-12-03
License: GPL (>=2)

Author(s)

Luigi Augugliaro

Maintainer:
Luigi Augugliaro <[email protected]>

References

Abbruzzo, A., Augugliaro, L., Mineo, A. M. and Wit, E. C. (2014) Cyclic coordinate for penalized Gaussian Graphical Models with symmetry restrictions. In Proceeding of COMPSTAT 2014 - 21th International Conference on Computational Statistics, Geneva, August 19-24, 2014.

Hojsgaard, S. and Lauritzen, S. L. (2008) Graphical gaussian models with edge and vertex symmetries. J. Roy. Statist. Soc. Ser. B., Vol. 70(5), 1005–1027.


L1-penalized Factorial Graphical Lasso Model

Description

Fit the weight l1-penlized factorial dynamic Gaussian Graphical Model.

Usage

fglasso(S, model, tp, p, ...)

Arguments

S

the empirical variance/covariance matrix;

model

a list or a matrix used to specify the factorial dynamic Gaussian Graphical Model (see Details);

tp

number of time points;

p

number of random variables observed for each time point;

...

further arguments passed to sglasso.

Details

The factorial dynamic Gaussian Graphical Model (Abbruzzo et al., 2015) is a special kind of RCON(V, E) model (Hojsgaard, et al.,2008) proposed to study dynamic networks. Let Xt=(Xit,,Xit)X_t = (X_{it},\ldots,X_{it})' be a pp-dimensional random variable at time tt. Assuming that X=(X1,,XT)X = (X'_1,\ldots,X'_T) follows a multivariate normal distribution, the concentration matrix KK has the following block structure

K=(K1,1K1,2K1,TK2,1K2,2K2,TKT,1KT,2KT,T),K = \left( \begin{array}{cccc} K_{1,1} & K_{1,2} & \ldots & K_{1,T}\\ K_{2,1} & K_{2,2} & \ldots & K_{2,T}\\ \vdots & \vdots & \ddots & \vdots\\ K_{T,1} & K_{T,2} & \ldots & K_{T,T} \end{array}\right),

where Kt,tK_{t,t} give information about the conditinal independence structure among the pp random variables at time tt, and Kt,t+hK_{t,t + h} give information about the conditional independence structure between XtX_t and Xt+hX_{t + h}. An interpretation of the elements of the submatrices Kt,t+hK_{t,t + h} brings to the notion of natural structure, i.e.,

Kt,t+h=(k1,1t,t+h000k2,2t,t+h000kp,pt,t+h)+(0k1,2t,t+hk1,pt,t+hk2,1t,t+h00kp,1t,t+hkp,2t,t+h0).K_{t,t + h} = \left( \begin{array}{cccc} k_{1,1}^{t,t+h} & 0 & \ldots & 0\\ 0 & k_{2,2}^{t,t+h} & \ldots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \ldots & k_{p,p}^{t,t+h} \end{array}\right) + \left( \begin{array}{cccc} 0 & k_{1,2}^{t,t+h} & \ldots & k_{1,p}^{t,t+h}\\ k_{2,1}^{t,t+h} & 0 & \ldots & 0\\ \vdots & \vdots & \ddots & \vdots\\ k_{p,1}^{t,t+h} & k_{p,2}^{t,t+h} & \ldots & 0 \end{array}\right).

The entries of the first matrix are called self-self conditinal dependences at temporal lag hh and represents the (negative) self-similarity of a given random variable across different time points. The entries of the second matrix are the conditional dependence among the pp random variables. To make the interpretation of the results more relevant and, at the same time, reduce the number of parameters, the authors propose the following equality constraints:

ki,it,t+hk_{i,i}^{t,t+h} effect R code ki,jt,t+hk_{i,j}^{t,t+h} effect R code
i. 00 zero "." iv. 00 zero "."
ii. shs^h costant "c" ii. nhn^h costant "c"
iii. sihs^h_i unit "u" iii. nihn^h_i unit "u"
iv. st,hs^{t,h} time "t" iv. nt,hn^{t,h} time "t"
v. sit,hs^{t,h}_i interaction "ut" v. ni,jt,hn^{t,h}_{i,j} interaction "ut"

Argument model is used to specify the restrinctions previously describted. This argument can be a named list or a matrix with dimension nlag×2nlag\times 2, where nlagtpnlag\le\code{tp}. To gain more insight, suppose that we want to model only the sub-matrices Kt,tK_{t,t} and Kt,t+1K_{t,t+1}, i.e., the sub-matrices corresponding to the temporal lag zero and one. A possible R code is

model.mat <- matrix("", nrow = 2, ncol = 2)
rownames(model.mat) <- c("lag0", "lag1")
colnames(model.mat) <- c("s", "n")
model.mat[1, ] <- c("c", "ut")
model.mat[2, ] <- c("t", ".")

In this example we are modelling the diagonal elements of the sub-matrices Kt,tK_{t,t} with the constant effect while the off-diagonal elements are modelled by the interaction effect. In the same way, the diagonal elements of the sub-matrices Kt,t+1K_{t,t+1} are modelled by the time effect while the remaning elements are equal to zero. The fglasso function passes the matrix model.mat to the internal function fglasso_model2mask, i.e.,

mask <- fglasso_model2mask(model.mat, tp = 3, p = 3)

which returns the mask used in sglasso to fit the specified factorial dynamic Gaussian Graphical model. The same model can be specified by the following named list

model.list <- list(lag0 = c(s = "c", n = "ut"), lag1 = c(s = "t", n = "."))

See the example below for more details.

Value

fglasso returns an obejct with S3 class "sglasso". See the corresponding manual for more details.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

References

Wit, E. C. and Abbruzzo, A.(2015) Dynamic factorial graphical models for dynamic networks. Network Science, Vol. 3(1), 37– 57

Abbruzzo, A., Augugliaro, L., Mineo, A.M. and Wit, E.C. (2014) Cyclic coordinate for penalized Gaussian Graphical Models with symmetry restrictions. In Proceeding of COMPSTAT 2014 - 21th International Conference on Computational Statistics. Geneva, August 19-24, 2014.

Hojsgaard, S. and Lauritzen, S.L. (2008) Graphical gaussian models with edge and vertex symmetries. J. Roy. Statist. Soc. Ser. B., Vol. 70(5), 1005–1027.

See Also

sglasso function.

Examples

#######################
# fglasso solution path
#######################
N <- 50
tp <- 3
p <- 3
X <- matrix(rnorm(N * p * tp), N, tp * p)
S <- crossprod(X) / N
model <- list(lag0 = c(s = "c", n = "ut"), lag1 = c(s = "t", n = "."))
out.fglasso <- fglasso(S = S, model = model, tp = tp, p = p)
out.fglasso

Plotting Sparse Graph

Description

gplot is a generic function for plotting sparse graphs.

Usage

gplot(object, ...)

Arguments

object

fitted sglasso/fglasso object;

...

other parameters passed to gplot.sglasso or gplot.fglasso.

Details

gplot is a generic function used to plot a graph estimated by sglasso or fglasso. See the method function gplot.sglasso or gplot.fglasso for more details about the specific arguments.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

gplot.sglasso and gplot.fglasso method functions.


Plotting Sparse Factorial Dynamic Gaussian Graphical Model

Description

gplot.fglasso shows the sequence of graphs estimated by fglasso.

Usage

## S3 method for class 'fglasso'
gplot(object, rhoid, tp = c(1, 2), sub.tp1, sub.tp2, cex.sub = 1, 
    k = 1.5, layout = layout.circle, ...)

Arguments

object

fitted fglasso object;

rhoid

an integer used to specificy the ρ\rho-value used to fit the fglasso model;

tp

a vector of length equal to two used to specify the time points of the two graphs that will be compared. By default the first two time points are used;

sub.tp1

sub title for the graph estimated at time point tp[1];

sub.tp2

sub title for the graph estimated at time point tp[2];

cex.sub

a numerical value giving the amount by which plotting sub titles should be magnified relateve to the default;

k

value used to specify the distance between the two graphs;

layout

a function or a matrix used to specify the layout of the graphs that will be plotted. By default the layout.circle function is used;

...

further graphical parameters used to plot the graphs. See package igraph for more details.

Details

For a given value of the tuning parameter, specified by the argument rhoid, gplot.fglasso shows the graphs estimated at the time points tp[1] and tp[2]. By convention, the graph associated to the sub matrix Ktp[1],tp[2]K_{tp[1],tp[2]} is represented by a directed graph where a directed edge is drawn by an arrow from a vertex in the first graph pointing forwards a vertex in the second graph.

Value

gplot.fglasso returns a list with components:

graph.tp1

an object with class igraph representing the undirected graph estimated at the time point tp[1];

graph.tp2

an object with class igraph representing the undirected graph estimated at the time point tp[2];

graph.net

an object with class igraph representing the directed graph associated to the submatrix Ktp[1],tp[2]K_{tp[1],tp[2]};

layout

the matrix used to specify the placement of the vertices.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

fglasso function.

Examples

N <- 50
tp <- 3
p <- 3
X <- matrix(rnorm(N * p * tp), N, tp * p)
S <- crossprod(X) / N
model <- list(lag0 = c(s = "c", n = "ut"), lag1 = c(s = "t", n = "t"))
out.fglasso <- fglasso(S = S, model = model, tp = tp, p = p)
gplot(out.fglasso, rhoid = 50, sub.tp1 = "First graph", 
   sub.tp2 = "Second graph")

Plotting Sparse Graphs

Description

gplot.sglasso shows the sequence of graphs estimated by sglasso.

Usage

## S3 method for class 'sglasso'
gplot(object, rhoid, layout = layout.circle, ...)

Arguments

object

fitted sglasso object;

rhoid

vector of integers used to specificy the ρ\rho-values used to fit the sglasso model. By default gplot.sglasso shows the sequence of graphs estimated by sglasso. Only topologically different graphs are plotted;

layout

a function or a matrix used to specify the layout of the graphs that will be plotted. By default the layout.circle function is used;

...

further graphical parameters used to plot the graphs. See package igraph for more details.

Details

gplot.sglasso shows the sequence of topologically different graphs estimated by sglasso. To specify the layout of the graphs, the user can use any layout function available in the R package igraph. The user can also specify the placement of the vertices by a matrix with two columns and the same number of rows as the number of vertices.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

sglasso function.

Examples

N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X)/N
mask <- outer(1:p, 1:p, function(i,j) 0.5^abs(i-j))
mask[1,5] <- mask[1,4] <- mask[2,5] <- NA
mask[5,1] <- mask[4,1] <- mask[5,2] <- NA
out.sglasso_path <- sglasso(S, mask, tol = 1.0e-13)
gplot(out.sglasso_path)
gplot(out.sglasso_path, rhoid = 1:5)

Extract Sparse Structured Precision Matrices

Description

Function Kh computes the sequence of sparse structured precision matrices estimated by sglasso function.

Usage

Kh(object, rho)

Arguments

object

fitted sglasso object;

rho

a subset of the values of the tuning parameter used in sglasso to compute the solution path. By default, the entire sequence of estimated sparse structured precision matrices is returned.

Value

Kh returns a named list containing the sequence of estimated sparse structured precision matrices.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

sglasso function.

Examples

N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X) / N
mask <- outer(1:p, 1:p, function(i, j) 0.5^abs(i - j))
out.sglasso_path <- sglasso(S, mask, nrho = 5, tol = 1.0e-13)
out.sglasso_path
Kh(out.sglasso_path)
rho <- out.sglasso_path$rho[3]
out.sglasso_single <- sglasso(S, mask, nrho = 1, min_rho = rho, 
   tol = 1.0e-13, algorithm = "ccm")
Kh(out.sglasso_single)

Cross-Validated Kullback-Leibler Divergence

Description

Model selection criterion based on the leave-one-out cross-validated Kullback-Leibler divergence.

Usage

klcv(object, X, scale = 1)

Arguments

object

fitted sglasso/fglasso object;

X

the matrix used to compute the empirical variance/covariance matrix. Its dimension is N ×\times p, where p is the number of random variables and N is the samlpe size;

scale

scalar value used to scale the estimated degrees-of-freedom. See below for more details.

Details

klcv function implements the leave-one-out cross-validate Kullback-Leibler divergence criterion proposed in Vujacic et al. (2015). For l1l_1-penalized Gaussian Graphical Models this measure of goodness-of-fit has the following form

klcv(ρ)=(K^(ρ))N+scale2Ngdf(K^(ρ)),klcv(\rho) = -\frac{\ell(\hat K(\rho))}{N} + \frac{\code{scale}}{2N} gdf(\hat K(\rho)),

where K^(ρ)\hat K(\rho) is the glasso estimate of the concentration matrix, (K^(ρ))\ell(\hat K(\rho)) is the corresponding value of the log-likelihood function, scale is a scale factor for the complexity part, i.e. gdf(K^(ρ))gdf(\hat K(\rho)), which is defined as

gdf(K^(ρ))=1N1k=1Nvec{(K^(ρ)1Sk)1ρ}vec[K^(ρ){(SSk)1ρ}K^(ρ)].gdf(\hat K(\rho)) = \frac{1}{N-1}\sum_{k=1}^N vec\{(\hat K(\rho)^{-1} - S_k)\circ 1_\rho\}'vec[\hat K(\rho)\{(S-S_k)\circ 1_\rho\}\hat K(\rho)].

In the previous expression SS is the empirical variance/covariance matrix, Sk=XkXkS_k = X_k X_k', 1ρ1_\rho is a matrix with entries I(k^ij(ρ)0)I(\hat k_{ij}(\rho)\ne 0) and \circ is the Hadamard product operator.

Value

klcv returns an S3 object with calls klcv, i.e. a named list with the following components:

klcv

the vector with the leave-one-out cross-validated Kullback-Leibler divergence;

rho

the rho-values used to compute the leave-one-out cross-validated Kullback-Leibler divergence;

loglik

a vector with the log-likelihood computed for the sequence of weighted l1-penalized RCON(V, E);

gdf

a vector returning the generalized degrees-of-freedom;

scale

the scale value used to define the leave-one-out cross-validated Kullback-Leibler divergence;

min.klcv

minimum value of the leave-one-out cross-validated Kullback-Leibler divergence;

rho.opt

the rho-value corresponding to minimum leave-one-out cross-validated Kullback-Leibler divergence;

rhoid

the index of the rho-value identified by the leave-one-out cross-validated Kullback-Leibler divergence.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

References

Vujacic, I., Abbruzzo, A. and Wit, E. C. (2015) A computationally fast alternative to cross-validation in penalized Gaussian graphical models. J. Stat. Comput. Simul.

See Also

sglasso, loglik functions and plot.klcv method.

Examples

N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X) / N
mask <- outer(1:p, 1:p, function(i,j) 0.5^abs(i-j))
mask[1,5] <- mask[1,4] <- mask[2,5] <- NA
mask[5,1] <- mask[4,1] <- mask[5,2] <- NA
out.sglasso_path <- sglasso(S, mask, tol = 1.0e-13)
out.klcv <- klcv(out.sglasso_path, X)
out.klcv

Extract Log-Likelihood

Description

This function extracts the log-likelihood for the sequence of weighted l1-penalized RCON(V, E) models estimated by sglasso function.

Usage

loglik(object, N = 2)

Arguments

object

a fitted sglasso object;

N

sample size. Default value is 2 to remove the constant term in the log-likelihood function. See below for more details.

Details

Denoted with ψ=(η,θ)\psi = (\eta',\theta')' the parameter vector of the structured concentration matrix K(ψ)K(\psi), the log-likelihood function of the RCON(V, E) model is equal, up to a constant, to the following expression

(ψ)=N2[logdetK(ψ)tr{SK(ψ)}],\ell(\psi) = \frac{N}{2}[\log det K(\psi) - tr\{S K(\psi)\}],

where S=N1i=1NXiXiTS = N^{-1}\sum_{i=1}^NX_iX_i^T, NN is the sample size and XiX_i is the iith observed pp-dimensional vector. Denoted with ψ^=(η^,θ^)\hat\psi = (\hat\eta',\hat\theta')' the sglasso estimates, straightforward algebra shows that

(ψ^)=N2[logdetK(ψ^)p+ρm=1Swmθ^m],\ell(\hat\psi) = \frac{N}{2}[\log det K(\hat\psi) - p + \rho\sum_{m=1}^S w_m|\hat\theta_m|],

where ρ\rho is the tuning parameter and wmw_m are the weights used to define the weighted l1-norm.

Value

loglik returns a vector containing the log-likelihood computed for the sequence of weighted l1-penalized RCON(V, E).

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

summary.sglasso method and sglasso function.

Examples

N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X) / N
mask <- outer(1:p, 1:p, function(i, j) 0.5^abs(i-j))
out.sglasso_path <- sglasso(S, mask, nrho = 5, tol = 1.0e-13)
out.sglasso_path
loglik(out.sglasso_path, N = N)
rho <- out.sglasso_path$rho[3]
out.sglasso_single <- sglasso(S, mask, nrho = 1, min_rho = rho, 
   tol = 1.0e-13, algorithm = "ccm")
loglik(out.sglasso_single, N = N)

Neisseria Data Set

Description

This data set contains the gene expression data from a high-resolution time-course experiment besed on the sequenced Neisseria meningitidis serogroup strain B strain MC58. Specifically, the expression level of ten genes is measured at ten different time points. Each column is standardized to have zero mean and standard deviation equal to one.

Usage

data("neisseria")

Plot Method for Leave-One-Out Cross-Validated Kullback-Leibler Divergence

Description

plot.klcv produces a plot to study the sequence of leave-one-out cross-validated Kullback-Leibler divergences computed by klcv.

Usage

## S3 method for class 'klcv'
plot(x, ...)

Arguments

x

fitted klcv object;

...

other parameters to be passed through the plotting function.

Details

This method function produces a plot showing the sequence of leave-one-out cross-validated Kullback-Leibler as function of the tuning parameter rhorho. The optimal value of the tuning parameter is identified by a vertical dashed line.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

klcv function.

Examples

N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X) / N
mask <- outer(1:p, 1:p, function(i,j) 0.5^abs(i-j))
mask[1,5] <- mask[1,4] <- mask[2,5] <- NA
mask[5,1] <- mask[4,1] <- mask[5,2] <- NA
out.sglasso_path <- sglasso(S, mask, tol = 1.0e-13)
out.klcv <- klcv(out.sglasso_path, X)
plot(out.klcv)

Plot Method for the Weighted l1-Penalized RCON(V, E) Model

Description

plot.sglasso produces two plots to study the sequence of models estimates by sglasso or fglasso.

Usage

## S3 method for class 'sglasso'
plot(x, ...)

Arguments

x

fitted sglasso/fglasso object;

...

other parameters to be passed through the plotting function.

Details

This function produces two different plots. The first one shows the path of the estimated parameters as function of the tuning parameter ρ\rho. In the same way, the second plot shows the path of the weighted scores as function of ρ\rho.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

sglasso function and summary.sglasso method.

Examples

N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X)/N
mask <- outer(1:p, 1:p, function(i,j) 0.5^abs(i-j))
mask[1,5] <- mask[1,4] <- mask[2,5] <- NA
mask[5,1] <- mask[4,1] <- mask[5,2] <- NA
out.sglasso_path <- sglasso(S, mask, tol = 1.0e-13)
plot(out.sglasso_path)

Lasso Method for the RCON(V, E) Models

Description

Fit the weighted l1-penalized RCON(V, E) models using a cyclic coordinate algorithm.

Usage

sglasso(S, mask, w = NULL, flg = NULL, min_rho = 1.0e-02, nrho = 50,  
        nstep = 1.0e+05, algorithm = c("ccd","ccm"), truncate = 1e-05, 
        tol = 1.0e-03)

Arguments

S

the empirical variance/covariance matrix;

mask

a symmetric matrix used to specify the equality constraints on the entries of the concentration matrix. See the example bellow for more details;

w

a vector specifying the weights used to compute the weighted l1-norm of the parameters of the RCON(V, E) model;

flg

a logical vector used to specify if a parameter is penalized, i.e., if flg[i] = TRUE then the i-th parameter is penalized, otherwise (flg[i] = FALSE) the maximum likelihood estimate is computed;

min_rho

last value of the sequence of tuning parameters used to compute the sglasso solution path. If nrho = 1, then min_rho is the value used to compute the sglasso estimate. Default value is 1.0e-02;

nrho

number of tuning parameters used to compute the sglasso solution path. Default is 50;

nstep

nonnegative integer used to specify the maximun number of iterations of the two cyclic coordinate algorithms. Default is 1.0e+05;

algorithm

character by means of to specify the algorithm used to fit the model, i.e., a cyclic coordinate descente (ccd) algorithm or a cyclic coordinate minimization (ccm) algorithm. Default is ccd;

truncate

at convergence all estimates below this value will be set to zero. Default is 1e-05;

tol

value used for convergence. Default value is 1.0e-05.

Details

The RCON(V, E) model (Hojsgaard et al., 2008) is a kind of restriction of the Gaussian Graphical Model defined using a coloured graph to specify a set of equality constraints on the entries of the concentration matrix. Roughly speaking, a coloured graph implies a partition of the vertex set into RR disjoint subsets, called vertex colour classes, and a partition of the edge set into SS disjoint subsets, called edge colour classes. At each vertex/edge colour class is associated a specific colour. If we denote by K=(kij)K = (k_{ij}) the concentration matrix, i.e. the inverse of the variance/covariance matrix Σ\Sigma, the coloured graph implies the following equality constraints:

  1. kii=ηnk_{ii} = \eta_n for any index ii belonging to the nnth vertex colour class;

  2. kij=θmk_{ij} = \theta_m for any pair (i,j)(i,j) belonging to the mmth edge colour class.

Denoted with ψ=(η,θ)\psi = (\eta',\theta')' the (R+S)(R+S)-dimensional parameter vector, the concentration matrix can be defined as

K(ψ)=n=1RηnDn+m=1SθmTm,K(\psi) = \sum_{n=1}^R\eta_nD_n + \sum_{m=1}^S\theta_mT_m,

where DnD_n is a diagonal matrix with entries Diin=1D^n_{ii} = 1 if the index ii belongs to the nnth vertex colour class and zero otherwise. In the same way, TmT_m is a symmetrix matrix with entries Tijm=1T^m_{ij} = 1 if the pair (i,j)(i,j) belongs to the mmth edge colour class. Using the previous specification of the concentration matrix, the structured graphical lasso (sglasso) estimator (Abbruzzo et al., 2014) is defined as

ψ^=argmaxψlogdetK(ψ)tr{Sk(ψ)}ρm=1Swmθm,\hat\psi = \arg\max_{\psi} \log det K(\psi) - tr\{Sk(\psi)\} - \rho\sum_{m=1}^Sw_m|\theta_m|,

where SS is the empirical variance/covariance matrix, ρ\rho is the tuning parameter used to control the ammount of shrinkage and wmw_m are weights used to define the weighted 1\ell_1-norm. By default, the sglasso function sets the weights equal to the cardinality of the edge colour classes.

Value

sglasso returns an obejct with S3 class "sglasso", i.e. a named list containing the following components:

call

the call that produced this object;

nv

number of vertex colour classes;

ne

number of edge colour classes;

theta

the matrix of the sglasso estimates. The first nv rows correspond to the unpenalized parameters while the remaining rows correspond to the weighted l1-penalized parameters;

w

the vector of weights used to define the weighted l1-norm;

df

nrho-dimensional vector of the number of estimated nonzero parameters;

rho

nrho-dimensional vector of the sequence of tuning parameters;

grd

the matrix of the scores;

nstep

nonnegative integer used to specify the maximum number of iterations of the algorithms;

nrho

number of tuning parameters used to compute the sglasso solution path;

algorithm

the algorithm used to fit the model;

truncate

the value used to set to zero the estimated parameters;

tol

a nonnegative value used to define the convergence of the algorithms;

S

the empirical variace/covariance matrix used to compute the sglasso solution path;

mask

the mask used to define the equality constraints on the entries of the concentration matrix;

n

number of interations of the algorithm;

conv

an integer value used to encode the warnings related to the algorihtms. If conv = 0 the convergence has been achieved otherwise the maximum number of iterations has been achieved.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

References

Abbruzzo, A., Augugliaro, L., Mineo, A. M. and Wit, E. C. (2014) Cyclic coordinate for penalized Gaussian Graphical Models with symmetry restrictions. In Proceeding of COMPSTAT 2014 - 21th International Conference on Computational Statistics, Geneva, August 19-24, 2014.

Hojsgaard, S. and Lauritzen, S. L. (2008) Graphical gaussian models with edge and vertex symmetries. J. Roy. Statist. Soc. Ser. B., Vol. 70(5), 1005–1027.

See Also

summary.sglasso, plot.sglasso gplot.sglasso and methods.

The function Kh extracts the estimated sparse structured concentration matrices.

Examples

########################################################
# sglasso solution path
#
## structural zeros:
## there are two ways to specify structural zeros which are 
## related to the kind of mask. If mask is a numeric matrix
## NA is used to identify the structural zero. If mask is a
## character matrix then the structural zeros are specified
## using NA or ".".
N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X) / N
mask <- outer(1:p, 1:p, function(i,j) 0.5^abs(i-j))
mask[1,5] <- mask[1,4] <- mask[2,5] <- NA
mask[5,1] <- mask[4,1] <- mask[5,2] <- NA
mask

out.sglasso_path <- sglasso(S, mask, tol = 1.0e-13)
out.sglasso_path

rho <- out.sglasso_path$rho[20]
out.sglasso <- sglasso(S, mask, nrho = 1, min_rho = rho, tol = 1.0e-13, algorithm = "ccm")
out.sglasso

out.sglasso_path$theta[, 20]
out.sglasso$theta[, 1]

Summarizing sglasso Fits

Description

summary method for class "sglasso".

Usage

## S3 method for class 'sglasso'
summary(object, N, k = c("bic","aic"), 
        digits = max(3, getOption("digits") - 3), ...)

Arguments

object

fitted sglasso object;

N

sample size;

k

character/numeric argument used to specify the 'weight' of the complexity part in the measure of goodness-of-fit used to select the best model (see below for more details). Default is k = "bic";

digits

significant digits in printout;

...

additional print arguments.

Details

summary.sglasso gives us information about the sequence of models estimated by the sglasso estimator. To select the best model, summary method uses a measure of Goodness-of-Fit (GoF) defined as follows:

2(ψ^)+k×df,-2\ell(\hat\psi) + k \times df,

where (ψ^)\ell(\hat\psi) is the log-likelihood of the estimated weighted l1-penalized RCON(V, E) model, dfdf is the number of nonzero estimated parameters and kk is a non-negative value used to weight the complexity part in the measure of goodness-of-fit. By default the summary method computes the BIC criterion to select the best model (k = "bic"). The AIC criterion can be easily computed setting k = "aic". The user can also define other measures of goodness-of-fit specifying k as any non-negative value.

The output of the summary method is divided in two sections. First section shows the call producing the argument object followed by a data.frame. The column named rho shows the sequence of the ρ\rho values used to compute the solution curve, while the column log-lik shows the corresponding values of the log-likelihood function. The remaining columns show the number of estimated non-zero parameters, the values of the GoF and the asscoated ranking of the estimated models. Finally, the second section shows the estimated parameters of the best model identified by the used GoF criterion. Informations about the algorithm and the corresponding convergence are also provided.

Value

A list with components table and theta_gof is silently returned. The table component is the data.frame previously described while the component theta_gof is the vector of the estimated parameters corresponding to the best models identified by the GoF criterion.

Author(s)

Luigi Augugliaro
Maintainer: Luigi Augugliaro [email protected]

See Also

sglasso and loglik functions.

Examples

N <- 100
p <- 5
X <- matrix(rnorm(N * p), N, p)
S <- crossprod(X) / N
mask <- outer(1:p, 1:p, function(i,j) 0.5^abs(i-j))
mask[1,5] <- mask[1,4] <- mask[2,5] <- NA
mask[5,1] <- mask[4,1] <- mask[5,2] <- NA
out.sglasso_path <- sglasso(S, mask, tol = 1.0e-13)
summary(out.sglasso_path, N)
rho <- out.sglasso_path$rho[20]
out.sglasso <- sglasso(S, mask, nrho = 1, min_rho = rho, tol = 1.0e-13)
summary(out.sglasso, N)