Title: | Integrative Differential Network Analysis in Genomics |
---|---|
Description: | Fits covariate dependent partial correlation matrices for integrative models to identify differential networks between two groups. The methods are described in Class et. al., (2018) <doi:10.1093/bioinformatics/btx750> and Ha et. al., (2015) <doi:10.1093/bioinformatics/btv406>. |
Authors: | Caleb A. Class <[email protected]>, Min Jin Ha <[email protected]> |
Maintainer: | Caleb A. Class <[email protected]> |
License: | GPL-2 |
Version: | 1.0.4 |
Built: | 2024-10-14 06:37:47 UTC |
Source: | CRAN |
This packages jointly estimates group specific partial correlations in a multi-level/platform data set.
This packages jointly estimates group specific partial correlations in a multi-level/platform data set, considering the directionality of effects between platforms using a chain graph model.
Min Jin Ha, Caleb Class, Veerabhadran Baladandayuthapani, and Kim-Anh Do
Modified TCGA Breast Cancer data.
data(brca)
data(brca)
Three data frames, with columns as standardized miRNA, gene, and protein expressions. One vector with the two classes of the samples (tumor or normal tissue). Also one iDINGO fit object, obtained by running the example in the idingo manual entry.
This function fit DINGO model and calculates edge-wise differential scores for all pairwise edges among p variables.
dingo(dat,x,rhoarray=NULL,diff.score=T,B=100,verbose=T,cores=1)
dingo(dat,x,rhoarray=NULL,diff.score=T,B=100,verbose=T,cores=1)
dat |
nxp data with colnames as genename |
x |
a length n vector representing a binary covariate |
rhoarray |
a vector representing candidate tuning parameters of glasso for fitting global network model. If it is one value, then we use the value as the tuning parameter. It is set by NULL as default and we select 100 candidate values. |
diff.score |
a logical value. If TRUE, edge-wise differential scores are calculated from bootstrap standard error. Otherwise, we fit Steps 1 and 2 of DINGO model to get group specific GGMs (partial correlations) |
B |
the number of bootstrap samples to calculate differential scores. |
verbose |
if TRUE, lists the procedure |
cores |
the number of cores to run in parallel for bootstrapping, set to 1 as a default. If more cores are specified than the recommended maximum (the number of cores detected minus 1), this value will be replaced by the recommended value. |
genepair |
a p(p-1)/2 x 2 matrix indicating all pairs of genes |
levels.x |
a length 2 vector indicating levels of the binary covariate x, the first element is for group 1 and the second element is for group 2 |
R1 |
a length p(p-1)/2 vector indicating partial correlations for group 1 and the order is corresponding to the order of genepair |
R2 |
a length p(p-1)/2 vector indicating partial correlations for group 2 and the order is corresponding to the order of genepair |
boot.diff |
a p(p-1)/2 x boot.B matrix indicating bootstrapped difference, Fisher's Z transformed R1 - R2. The rows are corresponding to the order of gene pair and the columns are corresponding to the bootstrap samples |
diff.score |
a p(p-1)/2 vector of differential score corresponding to genepair |
p.val |
a p(p-1)/2 vector of corrected p-values corresponding to genepair |
rho |
selected tuning parameter of glasso fit |
P |
p by p matrix of Global component of the DINGO model |
Q |
p by 2 matrix of the coefficient parameter of the local group specific component L(x) of the DINGO model. |
Psi |
p by p diagonal matrix of the noise covariance parameter of the local group specific component L(x) of the DINGO model. |
step.times |
a length 3 vector containing the elapsed time for Step 1, Step 2, and Bootstrap Scoring, respectively. |
Min Jin HA [email protected], Caleb CLASS [email protected]
data(gbm) # Run DINGO (the first column, 'x', contains the group data). # This may take 5-10 minutes. ## Not run: fit <- dingo(gbm[,-1], gbm$x, diff.score = TRUE, B = 100, cores = 2)
data(gbm) # Run DINGO (the first column, 'x', contains the group data). # This may take 5-10 minutes. ## Not run: fit <- dingo(gbm[,-1], gbm$x, diff.score = TRUE, B = 100, cores = 2)
Extended bayesian information criteria for gaussian graphical models
extendedBIC(gamma,omegahat,S,n)
extendedBIC(gamma,omegahat,S,n)
gamma |
a tuning parameter taking a scalar in [0,1] and leading to stronger penalization of large graphs |
omegahat |
a p x p matrix indicating an estimates of precision (inverse covariance) matrix |
S |
a p x p matrix indicating sample covariance matrix |
n |
a scalar indicating sample size |
Extended BIC penalized by the size of graphs
Min Jin Ha <[email protected]>
Foygel, R. and Drton, M. (2010). Extended bayesian information criteria for gaussian graphical models. arXiv preprint arXiv:1011.6640 .
library(glasso) data(gbm) x = gbm[,1] Y = gbm[,-1] # Estimating inverse covariance matrix using GLasso # S = cov(Y) rhoarray = exp(seq(log(0.001),log(1),length=100)) BIC = rep(0,length(rhoarray)) for (rh in 1:length(rhoarray)) { fit.gl1 = glasso(S,rho=rhoarray[rh]) BIC[rh] = extendedBIC(gamma=0,omegahat=fit.gl1$wi,S=S,n=nrow(Y)) } rho = rhoarray[which.min(BIC)] fit.gl2 = glasso(S,rho=rho) Omega = fit.gl2$wi
library(glasso) data(gbm) x = gbm[,1] Y = gbm[,-1] # Estimating inverse covariance matrix using GLasso # S = cov(Y) rhoarray = exp(seq(log(0.001),log(1),length=100)) BIC = rep(0,length(rhoarray)) for (rh in 1:length(rhoarray)) { fit.gl1 = glasso(S,rho=rhoarray[rh]) BIC[rh] = extendedBIC(gamma=0,omegahat=fit.gl1$wi,S=S,n=nrow(Y)) } rho = rhoarray[which.min(BIC)] fit.gl2 = glasso(S,rho=rho) Omega = fit.gl2$wi
Modified TCGA Glioblastoma data.
data(gbm)
data(gbm)
A data frame with first column as a covariate and other columns as standardized gene expressions.
This function fits the covariance regression model by Hoff and Niu (2012) using EM algorithm with the restriction of diagonal matrix for the noise variance
Greg.em(formula, data = NULL, R = 1, tol = 1e-10, itmax = 1000, verbose = F)
Greg.em(formula, data = NULL, R = 1, tol = 1e-10, itmax = 1000, verbose = F)
formula |
an object of class "formula" used in model.frame function |
data |
a data frame used in model.frame function |
R |
rank of the model |
tol |
a stopping criterion |
itmax |
maximum number of iteration |
verbose |
If true, estimation results for each iteration are printed |
A |
MLE of the baseline covariance matrix |
B |
MLE of the regression coefficients |
Min Jin Ha <[email protected]>
Hoff, P. D. and Niu, X. (2012) A covariance regression model. Statistica Sinica, 22, 729-753.
library(glasso) data(gbm) x = gbm[,1] Y = as.matrix(gbm[,-1]) p = ncol(Y) # Estimating inverse covariance matrix using GLasso # S = cov(Y) w.upper = which(upper.tri(S)) rhoarray = exp(seq(log(0.001),log(1),length=100)) BIC = rep(0,length(rhoarray)) for (rh in 1:length(rhoarray)) { fit.gl1 = glasso(S,rho=rhoarray[rh]) BIC[rh] = extendedBIC(gamma=0,omegahat=fit.gl1$wi,S=S,n=nrow(Y)) } rho = rhoarray[which.min(BIC)] fit.gl2 = glasso(S,rho=rho) Omega = fit.gl2$wi # Fitting (Covariance Regression on transformed data) diag.Omega = diag(Omega) P = -Omega/diag.Omega diag(P) = 0 tY = Y mdat = apply(tY,2,mean) sdat = apply(tY,2,sd) std.tY = t((t(tY) - mdat)/sdat) smat = diag(sdat) ## rank 1 covariance regression fit.g = Greg.em(std.tY~x,R=1)
library(glasso) data(gbm) x = gbm[,1] Y = as.matrix(gbm[,-1]) p = ncol(Y) # Estimating inverse covariance matrix using GLasso # S = cov(Y) w.upper = which(upper.tri(S)) rhoarray = exp(seq(log(0.001),log(1),length=100)) BIC = rep(0,length(rhoarray)) for (rh in 1:length(rhoarray)) { fit.gl1 = glasso(S,rho=rhoarray[rh]) BIC[rh] = extendedBIC(gamma=0,omegahat=fit.gl1$wi,S=S,n=nrow(Y)) } rho = rhoarray[which.min(BIC)] fit.gl2 = glasso(S,rho=rho) Omega = fit.gl2$wi # Fitting (Covariance Regression on transformed data) diag.Omega = diag(Omega) P = -Omega/diag.Omega diag(P) = 0 tY = Y mdat = apply(tY,2,mean) sdat = apply(tY,2,sd) std.tY = t((t(tY) - mdat)/sdat) smat = diag(sdat) ## rank 1 covariance regression fit.g = Greg.em(std.tY~x,R=1)
This function fits the iDINGO model and calculates edge-wise differential scores for all pairwise edges among p variables between multiple platforms.
idingo(dat,dat2=NULL,dat3=NULL,x,plats=NULL,rhoarray=NULL, diff.score=T,B=100,verbose=T,cores=1)
idingo(dat,dat2=NULL,dat3=NULL,x,plats=NULL,rhoarray=NULL, diff.score=T,B=100,verbose=T,cores=1)
dat |
nxp dataframe/matrix with colnames as genename |
dat2 |
Second nxp dataframe/matrix with colnames as genename (optional) |
dat3 |
Third nxp dataframe/matrix with colnames as genename (optional) |
x |
a length n vector representing a binary covariate |
plats |
a length 1-3 vector (corresponding to the number of data sets submitted, with names for the platforms/levels of the data, such as "microRNA" or "RNAseq". This is optional, and default names "platN" will be used if names are not provided. |
rhoarray |
a vector representing candidate tuning parameters of glasso for fitting global network model. If it is one value, then we use the value as the tuning parameter. It is set by NULL as default and we select 100 candidate values. |
diff.score |
a logical value. If TRUE, edge-wise differential scores are calculated from bootstrap standard error. Otherwise, we fit Steps 1 and 2 of DINGO model to get group specific GGMs (partial correlations) |
B |
the number of bootstrap samples to calculate differential scores. |
verbose |
if TRUE, lists the procedure |
cores |
the number of cores to run in parallel for bootstrapping, set to 1 as a default. If more cores are specified than the recommended maximum (the number of cores detected minus 1), this value will be replaced by the recommended value. |
genepair |
a p(p-1)/2 x 2 matrix indicating all pairs of genes |
levels.x |
a length 2 vector indicating levels of the binary covariate x, the first element is for group 1 and the second element is for group 2 |
R1 |
a length p(p-1)/2 vector indicating partial correlations for group 1 and the order is corresponding to the order of genepair |
R2 |
a length p(p-1)/2 vector indicating partial correlations for group 2 and the order is corresponding to the order of genepair |
diff.score |
a p(p-1)/2 vector of differential score corresponding to genepair |
p.val |
a p(p-1)/2 vector of corrected p-values corresponding to genepair |
Caleb CLASS [email protected], Min Jin HA [email protected]
data(brca) # Run iDINGO with microRNA, RNA, and protein data. # Generally, we recommend a minimum of 100 bootstraps. ## Not run: fit <- idingo(brca$mirna, dat2 = brca$rna, dat3 = brca$prot, x = brca$class, plats = c("microRNA", "RNA", "Protein"), diff.score = TRUE, B = 20, cores = 2) ## End(Not run)
data(brca) # Run iDINGO with microRNA, RNA, and protein data. # Generally, we recommend a minimum of 100 bootstraps. ## Not run: fit <- idingo(brca$mirna, dat2 = brca$rna, dat3 = brca$prot, x = brca$class, plats = c("microRNA", "RNA", "Protein"), diff.score = TRUE, B = 20, cores = 2) ## End(Not run)
This function plots the differential network from a completed DINGO or iDINGO model.
plotNetwork(fit, threshold=0.05, thresh.type="p.val", layout="circular", legend.pos="left")
plotNetwork(fit, threshold=0.05, thresh.type="p.val", layout="circular", legend.pos="left")
fit |
output from running dingo() or idingo() |
threshold |
a numeric value containing the threshold for which edges will be included in the differential network plot. If 'thresh.type' is 'p.val', all edges with p-values below this threshold will be included in the plot. If 'thresh.type' is 'diff.score', all edges with absolute differential scores above this threshold will be included in the plot. |
thresh.type |
either 'p.val' or 'diff.score', defining which variable is used as threshold for edge inclusion. |
layout |
either 'circular' or one of igraph's supported layouts. If 'circular', dingo networks will be plotted in a circle, and idingo networks will be plotted as a cylinder (with each platform/level as a separate circle). |
legend.pos |
Legend position for multi-platform networks, in c("left", "right"). Legend is not included for single-platform networks. |
visNet |
a network plot, using igraph and visNetwork |
To calculate differential scores and p-values for use in network plot thresholding, diff.score must be set to TRUE in dingo() or idingo().
Caleb CLASS [email protected], Min Jin HA [email protected]
data(brca) # Plot the iDINGO result using a p-value threshold of 0.01. plotNetwork(brca$fit, threshold = 0.01, thresh.type = "p.val")
data(brca) # Plot the iDINGO result using a p-value threshold of 0.01. plotNetwork(brca$fit, threshold = 0.01, thresh.type = "p.val")
scale a square matrix to have unit diagonal elements.
scaledMat(x)
scaledMat(x)
x |
a square matrix with positive diagonal elements |
scaled matrix of x
Min Jin Ha [email protected]
This function calculates standard errors for edge-wise partial correlation differences obtained from DINGO model.
scoring.boot(stddat,z,Omega,A,B,boot.B=100,verbose=T)
scoring.boot(stddat,z,Omega,A,B,boot.B=100,verbose=T)
stddat |
standardized nxp data with colnames as genename |
z |
a length n vector representing a binary covariate |
Omega |
a p x p precision matrix for std dat which implies the global network |
A |
p x p matrix of the MLE for the baseline covariance matrix which is obtained from A value of the Greg.em function. |
B |
p x 2 matrix of the MLE for the regression coefficient which is obtained from B value of the Greg.em function |
boot.B |
a scalar indicating the number of bootstraps |
verbose |
if TRUE, lists the bootstrap replications |
genepair |
a p(p-1)/2 x 2 matrix indicating all pairs of genes |
levels.z |
a length 2 vector indicating levels of the binary covariate z, the first element is for group 1 and the second element is for group 2 |
R1 |
a length p(p-1)/2 vector indicating partial correlations for group 1 and the order is corresponding to the order of genepair |
R2 |
a length p(p-1)/2 vector indicating partial correlations for group 2 and the order is corresponding to the order of genepair |
boot.diff |
a p(p-1)/2 x boot.B matrix indicating bootstrapped difference, Fisher's Z transformed R1 - R2. The rows are corresponding to the order of gene pair and the columns are corresponding to the bootstrap samples |
diff.score |
a p(p-1)/2 vector of differential score corresponding to genepair |
p.val |
a p(p-1)/2 vector of corrected p-values corresponding to genepair |
Min Jin HA [email protected]
This function calculates standard errors for edge-wise partial correlation differences obtained from DINGO model. Bootstrapping is done in parallel using parSapply from the "parallel" library.
scoring.boot.parallel(stddat,z,Omega,A,B,boot.B=100,verbose=T,cores=1)
scoring.boot.parallel(stddat,z,Omega,A,B,boot.B=100,verbose=T,cores=1)
stddat |
standardized nxp data with colnames as genename |
z |
a length n vector representing a binary covariate |
Omega |
a p x p precision matrix for std dat which implies the global network |
A |
p x p matrix of the MLE for the baseline covariance matrix which is obtained from A value of the Greg.em function. |
B |
p x 2 matrix of the MLE for the regression coefficient which is obtained from B value of the Greg.em function |
boot.B |
a scalar indicating the number of bootstraps |
verbose |
if TRUE, lists the bootstrap replications |
cores |
the number of cores to run in parallel for bootstrapping, set to 1 as a default. |
genepair |
a p(p-1)/2 x 2 matrix indicating all pairs of genes |
levels.z |
a length 2 vector indicating levels of the binary covariate z, the first element is for group 1 and the second element is for group 2 |
R1 |
a length p(p-1)/2 vector indicating partial correlations for group 1 and the order is corresponding to the order of genepair |
R2 |
a length p(p-1)/2 vector indicating partial correlations for group 2 and the order is corresponding to the order of genepair |
boot.diff |
a p(p-1)/2 x boot.B matrix indicating bootstrapped difference, Fisher's Z transformed R1 - R2. The rows are corresponding to the order of gene pair and the columns are corresponding to the bootstrap samples |
diff.score |
a p(p-1)/2 vector of differential score corresponding to genepair |
p.val |
a p(p-1)/2 vector of corrected p-values corresponding to genepair |
Min Jin HA [email protected], Caleb CLASS [email protected]
From parameters of DINGO model, group specific covariance matrices are obtained
Sigmax(P = NULL, Q, Psi, x)
Sigmax(P = NULL, Q, Psi, x)
P |
a p x p matrix specifying global component |
Q |
the coefficient parameter matrix of covariance regression model using Greg.em function |
Psi |
the diagonal error variance matrix of covariance regression model using Greg.em function |
x |
a vector specifying group. This must be corresponding to the design matrix of Greg.em function |
group specific precision matrix
Min Jin Ha <[email protected]>
library(glasso) data(gbm) x = gbm[,1] Y = as.matrix(gbm[,-1]) p = ncol(Y) # Estimating inverse covariance matrix using GLasso # S = cov(Y) w.upper = which(upper.tri(S)) rhoarray = exp(seq(log(0.001),log(1),length=100)) BIC = rep(0,length(rhoarray)) for (rh in 1:length(rhoarray)) { fit.gl1 = glasso(S,rho=rhoarray[rh]) BIC[rh] = extendedBIC(gamma=0,omegahat=fit.gl1$wi,S=S,n=nrow(Y)) } rho = rhoarray[which.min(BIC)] fit.gl2 = glasso(S,rho=rho) Omega = fit.gl2$wi # Fitting (Covariance Regression on transformed data) diag.Omega = diag(Omega) P = -Omega/diag.Omega diag(P) = 0 tY = Y mdat = apply(tY,2,mean) sdat = apply(tY,2,sd) std.tY = t((t(tY) - mdat)/sdat) smat = diag(sdat) ## rank 1 covariance regression fit.g = Greg.em(std.tY~x,R=1) ## obtain covariance matrix of Y when x=1 sigmaX1 = Sigmax(Q=fit.g$B,P=P,Psi=fit.g$A,x=c(1,1))
library(glasso) data(gbm) x = gbm[,1] Y = as.matrix(gbm[,-1]) p = ncol(Y) # Estimating inverse covariance matrix using GLasso # S = cov(Y) w.upper = which(upper.tri(S)) rhoarray = exp(seq(log(0.001),log(1),length=100)) BIC = rep(0,length(rhoarray)) for (rh in 1:length(rhoarray)) { fit.gl1 = glasso(S,rho=rhoarray[rh]) BIC[rh] = extendedBIC(gamma=0,omegahat=fit.gl1$wi,S=S,n=nrow(Y)) } rho = rhoarray[which.min(BIC)] fit.gl2 = glasso(S,rho=rho) Omega = fit.gl2$wi # Fitting (Covariance Regression on transformed data) diag.Omega = diag(Omega) P = -Omega/diag.Omega diag(P) = 0 tY = Y mdat = apply(tY,2,mean) sdat = apply(tY,2,sd) std.tY = t((t(tY) - mdat)/sdat) smat = diag(sdat) ## rank 1 covariance regression fit.g = Greg.em(std.tY~x,R=1) ## obtain covariance matrix of Y when x=1 sigmaX1 = Sigmax(Q=fit.g$B,P=P,Psi=fit.g$A,x=c(1,1))
This function calculates the edge-wise partial correlation difference for a single bootstrap.
single.boot(i, z, n, tY.org, P, levels.z, w.upper)
single.boot(i, z, n, tY.org, P, levels.z, w.upper)
i |
iteration number. This is not used within this function, but necessary for parSapply within scoring.boot.parallel function. |
z |
a length n vector representing a binary covariate |
n |
the number of rows in data |
tY.org |
the transformed standardized data |
P |
the global correlation component |
levels.z |
the levels of the covariates |
w.upper |
the upper triangular of Omega |
boot.diff |
the difference for this bootstrap |
Min Jin HA [email protected], Caleb CLASS [email protected]
Fisher's Z-transformation of (partial) correlation.
x |
a vector having entries between -1 and 1 |
Fisher's Z-transformed values
Min Jin HA [email protected]