Title: | Algorithm for Searching the Space of Gaussian Directed Acyclic Graph Models Through Moment Fractional Bayes Factors |
---|---|
Description: | We propose an objective Bayesian algorithm for searching the space of Gaussian directed acyclic graph (DAG) models. The proposed algorithm makes use of moment fractional Bayes factors (MFBF) and is therefore suitable for learning sparse graphs. The algorithm is implemented using Armadillo, an open-source C++ linear algebra library. |
Authors: | Davide Altomare, Guido Consonni and Luca La Rocca |
Maintainer: | Davide Altomare <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2 |
Built: | 2024-11-06 06:14:43 UTC |
Source: | CRAN |
Data on a set of flow cytometry experiments on signaling networks of human immune system cells. The dataset includes p=11 proteins and n=7466 samples.
data(HumanPw)
dataHuman contains the following objects:
Obs: Matrix (7466x11) with the observations.
Perms: List of 5 matrices (1x11), each containing a permutation of the nodes.
TDag: Matrix (11x11) with the adjacency matrix of the known regulatory network.
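The search functions documented in this manual take a correlation matrix and a number of observations rather than raw data. The following is a minimal sketch of how such inputs could be derived from an observation matrix and one node permutation, assuming samples are stored in rows and nodes in columns. The simulated `Obs` and the vector `perm` are stand-ins introduced for illustration, not the packaged data.

```r
# Stand-in data: 200 samples of 4 variables (hypothetical, not dataHuman$Obs)
set.seed(1)
Obs <- matrix(rnorm(200 * 4), nrow = 200, ncol = 4)
perm <- c(3, 1, 4, 2)        # hypothetical node ordering

Obs_perm <- Obs[, perm]      # reorder the columns (nodes) by the permutation
Corr <- cor(Obs_perm)        # q x q sample correlation matrix
nobs <- nrow(Obs_perm)       # number of observations
```

Permuting the columns before calling `cor` gives the same result as permuting the rows and columns of the correlation matrix of the original data.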
Sachs, K., Perez, O., Pe'er, D., Lauffenburger, D., and Nolan, G. (2003). Causal protein-signaling networks derived from multiparameter single-cell data. Science 308, 504-6.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
Data on publishing productivity among academics.
data(PubProd)
dataPub contains the following objects:
Corr: Matrix (7x7) with the correlation matrix of the variables.
nobs: Scalar with the number of observations.
Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, prediction and search (2nd edition). Cambridge, MA: The MIT Press. pages 1-16.
Drton, M. and Perlman, M. D. (2008). A SINful approach to Gaussian graphical model selection. J. Statist. Plann. Inference 138, 1179-1200.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
dataSim100 is a list with the adjacency matrix of a randomly generated DAG with 100 nodes and 100 edges, 10 samples generated from the DAG and 5 permutations of the nodes.
data(SimDag100)
dataSim100 contains the following objects:
Obs: List of 10 matrices (100x100), each containing 100 observations generated from the DAG.
Perms: List of 5 matrices (1x100), each containing a permutation of the nodes.
TDag: Matrix (100x100) with the adjacency matrix of the DAG.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
dataSim200 is a list with the adjacency matrix of a randomly generated DAG with 200 nodes and 100 edges, 10 samples generated from the DAG and 5 permutations of the nodes.
data(SimDag200)
dataSim200 contains the following objects:
Obs: List of 10 matrices (100x200), each containing 100 observations simulated from the DAG.
Perms: List of 5 matrices (1x200), each containing a permutation of the nodes.
TDag: Matrix (200x200) with the adjacency matrix of the DAG.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
dataSim50 is a list with the adjacency matrix of a randomly generated DAG with 50 nodes and 100 edges, 10 samples generated from the DAG and 5 permutations of the nodes.
data(SimDag50)
dataSim50 contains the following objects:
Obs: List of 10 matrices (100x50), each containing 100 observations simulated from the DAG.
Perms: List of 5 matrices (1x50), each containing a permutation of the nodes.
TDag: Matrix (50x50) with the adjacency matrix of the DAG.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
dataSim6 is a list with the adjacency matrix of a randomly generated DAG with 6 nodes and 5 edges and 100 correlation matrices generated from the DAG.
data(SimDag6)
dataSim6 contains the following objects:
Corr: List of 100 matrices (6x6), each containing a correlation matrix generated from the DAG.
TDag: Matrix (6x6) with the adjacency matrix of the DAG.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Data generated from the known regulatory network of human cell signalling data.
data(SimHumanPw)
dataSimHuman contains the following objects:
Obs: List of 100 matrices (100x11), each containing 100 observations simulated from the known regulatory network.
Perms: List of 5 matrices (1x11), each containing a permutation of the nodes.
TDag: Matrix (11x11) with the adjacency matrix of the known regulatory network.
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
Sachs, K., Perez, O., Pe'er, D., Lauffenburger, D., and Nolan, G. (2003). Causal protein-signaling networks derived from multiparameter single-cell data. Science 308, 504-6.
Shojaie, A. and Michailidis, G. (2010). Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. Biometrika 97, 519-538.
Estimate the edge inclusion probabilities for a Gaussian DAG with q nodes from observational data, using the moment fractional Bayes factor approach with global prior.
FBF_GS(Corr, nobs, G_base, h, C, n_tot_mod, n_hpp)
Corr | qxq correlation matrix. |
nobs | Number of observations. |
G_base | Base DAG. |
h | Parameter of the prior. |
C | Constant that keeps the probability of all local moves bounded away from 0 and 1. |
n_tot_mod | Maximum number of different models that the algorithm will visit for each equation. |
n_hpp | Number of highest posterior probability models that the procedure will return. |
An object of class list with:
M_q: Matrix (qxq) with the estimated edge inclusion probabilities.
M_G: Matrix ((q*n_hpp)xq) with the n_hpp highest posterior probability models returned by the procedure.
M_P: Vector (n_hpp) with the posterior probabilities of the n_hpp models in M_G.
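The example for this function reads the top model as `M_G[1:q, 1:q]`, which suggests that the returned models are stacked in consecutive blocks of q rows. Under that assumption, any of the returned models could be pulled out as sketched below. The helper `get_model` and the toy matrix `M_G_toy` are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): extract the k-th model from
# a matrix in which several q x q adjacency matrices are stacked by rows.
get_model <- function(M_G, k, q = ncol(M_G)) {
  rows <- ((k - 1) * q + 1):(k * q)
  M_G[rows, , drop = FALSE]
}

# Toy stand-in for M_G: two stacked 3 x 3 adjacency matrices
G1 <- matrix(c(0, 1, 0,
               0, 0, 1,
               0, 0, 0), 3, 3, byrow = TRUE)
G2 <- matrix(0, 3, 3)
M_G_toy <- rbind(G1, G2)

second <- get_model(M_G_toy, 2)  # the second stacked block
```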
Davide Altomare ([email protected]).
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
data(SimDag6)
Corr=dataSim6$SimCorr[[1]]
nobs=50
q=ncol(Corr)
Gt=dataSim6$TDag
Res_search=FBF_GS(Corr, nobs, matrix(0,q,q), 1, 0.01, 1000, 10)
M_q=Res_search$M_q
M_G=Res_search$M_G
M_P=Res_search$M_P
G_med=M_q
G_med[M_q>=0.5]=1
G_med[M_q<0.5]=0 #median probability DAG
G_high=M_G[1:q,1:q] #Highest Posterior Probability DAG (HPP)
pp_high=M_P[1] #Posterior Probability of the HPP
#Structural Hamming Distance between the true DAG and the median probability DAG
sum(sum(abs(G_med-Gt)))
#Structural Hamming Distance between the true DAG and the highest probability DAG
sum(sum(abs(G_high-Gt)))
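The thresholding and distance steps of the example can be wrapped in two small helpers, sketched here on toy matrices so that they run without the package. The names `median_graph` and `shd` are hypothetical. The distance is the count of differing adjacency entries, which is the quantity the example computes.

```r
# Hypothetical helpers (not part of the package)

# Median probability graph: keep edges with inclusion probability >= 0.5
median_graph <- function(M_q, threshold = 0.5) {
  (M_q >= threshold) * 1
}

# Number of adjacency entries that differ between two graphs,
# the quantity the example calls the structural Hamming distance
shd <- function(G_est, G_true) {
  sum(abs(G_est - G_true))
}

# Toy inclusion probabilities and a toy true DAG on 3 nodes
M_q_toy <- matrix(c(0, 0.9, 0.2,
                    0, 0,   0.6,
                    0, 0,   0), 3, 3, byrow = TRUE)
Gt_toy <- matrix(c(0, 1, 1,
                   0, 0, 1,
                   0, 0, 0), 3, 3, byrow = TRUE)

G_med_toy <- median_graph(M_q_toy)
shd(G_med_toy, Gt_toy)
```

In this toy case the median probability graph misses one edge of the true DAG, so the distance is 1.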
Estimate the edge inclusion probabilities for a directed acyclic graph (DAG) from observational data, using the moment fractional Bayes factor approach with local prior.
FBF_LS(Corr, nobs, G_base, h, C, n_tot_mod)
Corr | qxq correlation matrix. |
nobs | Number of observations. |
G_base | Base DAG. |
h | Parameter of the prior. |
C | Constant that keeps the probability of all local moves bounded away from 0 and 1. |
n_tot_mod | Maximum number of different models that the algorithm will visit for each equation. |
An object of class matrix with the estimated edge inclusion probabilities.
Davide Altomare ([email protected]).
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
data(SimDag6)
Corr=dataSim6$SimCorr[[1]]
nobs=50
q=ncol(Corr)
Gt=dataSim6$TDag
M_q=FBF_LS(Corr, nobs, matrix(0,q,q), 0, 0.01, 1000)
G_med=M_q
G_med[M_q>=0.5]=1
G_med[M_q<0.5]=0 #median probability DAG
#Structural Hamming Distance between the true DAG and the median probability DAG
sum(sum(abs(G_med-Gt)))
Estimate the edge inclusion probabilities for a regression model (Y(q) on Y(q-1),...,Y(1)) with q variables from observational data, using the moment fractional Bayes factor approach.
FBF_RS(Corr, nobs, G_base, h, C, n_tot_mod, n_hpp)
Corr | qxq correlation matrix. |
nobs | Number of observations. |
G_base | Base model. |
h | Parameter of the prior. |
C | Constant that keeps the probability of all local moves bounded away from 0 and 1. |
n_tot_mod | Maximum number of different models that the algorithm will visit for each equation. |
n_hpp | Number of highest posterior probability models that the procedure will return. |
An object of class list with:
M_q: Matrix (qxq) with the estimated edge inclusion probabilities.
M_G: Matrix (n*n_hpp)xq with the n_hpp highest posterior probability models returned by the procedure.
M_P: Vector (n_hpp) with the n_hpp posterior probabilities of the models in M_G.
Davide Altomare ([email protected]).
D. Altomare, G. Consonni and L. La Rocca (2012). Objective Bayesian search of Gaussian directed acyclic graphical models for ordered variables with non-local priors. Article submitted to Biometric Methodology.
data(SimDag6)
Corr=dataSim6$SimCorr[[1]]
nobs=50
q=ncol(Corr)
Gt=dataSim6$TDag
Res_search=FBF_RS(Corr, nobs, matrix(0,1,(q-1)), 1, 0.01, 1000, 10)
M_q=Res_search$M_q
M_G=Res_search$M_G
M_P=Res_search$M_P
Mt=rev(matrix(Gt[1:(q-1),q],1,(q-1))) #True Model
M_med=M_q
M_med[M_q>=0.5]=1
M_med[M_q<0.5]=0 #median probability model
#Structural Hamming Distance between the true DAG and the median probability DAG
sum(sum(abs(M_med-Mt)))
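The example builds the true regression model by taking the first q-1 entries of column q of the true adjacency matrix and reversing them. A plausible reading, consistent with the description of the regression as Y(q) on Y(q-1),...,Y(1), is that the reversal puts the predictors in that order. The toy sketch below illustrates this step only; the edge convention stated in the comment is an assumption inferred from the example, and `Gt_toy` is made-up data.

```r
# Toy true DAG on 4 nodes; entry [i, j] = 1 is read here as an edge i -> j
# (an assumption inferred from the example's use of column q)
Gt_toy <- matrix(0, 4, 4)
Gt_toy[1, 4] <- 1
Gt_toy[2, 3] <- 1
q <- ncol(Gt_toy)

parents_q <- Gt_toy[1:(q - 1), q]  # indicators for Y(1), ..., Y(q-1)
Mt_toy <- rev(parents_q)           # reordered as Y(q-1), ..., Y(1)
```

Here only Y(1) is a predictor of Y(4), so the indicator appears in the last position after the reversal.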