Title: | Methods for Graphical Models and Causal Inference |
---|---|
Description: | Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided. |
Authors: | Markus Kalisch [aut, cre], Alain Hauser [aut], Martin Maechler [aut], Diego Colombo [ctb], Doris Entner [ctb], Patrik Hoyer [ctb], Antti Hyttinen [ctb], Jonas Peters [ctb], Nicoletta Andri [ctb], Emilija Perkovic [ctb], Preetam Nandy [ctb], Philipp Ruetimann [ctb], Daniel Stekhoven [ctb], Manuel Schuerch [ctb], Marco Eigenmann [ctb], Leonard Henckel [ctb], Joris Mooij [ctb] |
Maintainer: | Markus Kalisch <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.7-12 |
Built: | 2024-12-12 07:15:18 UTC |
Source: | CRAN |
Add background knowledge x -> y to an adjacency matrix and complete the orientation rules from Meek (1995).
addBgKnowledge(gInput, x = c(), y = c(), verbose = FALSE, checkInput = TRUE)
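gInput |
graphNEL object or adjacency matrix of type amat.cpdag (see amatType). |
x, y |
node labels of x or y in the adjacency matrix. x and y can be vectors representing several nodes (see below). |
verbose |
If TRUE, detailed output is provided. |
checkInput |
If TRUE, the input adjacency matrix is carefully checked to see if it is a valid graph using function isValidGraph. |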
If the input is a graphNEL object, it will be converted into an adjacency matrix of type amat.cpdag. If x and y are given and if amat[y,x] != 0, this function adds orientation x -> y to the adjacency matrix amat and completes the orientation rules from Meek (1995). If x and y are not specified (or are empty vectors), this function simply completes the orientation rules from Meek (1995). If x and y are vectors of length k, k>1, this function tries to add x[i] -> y[i] to the adjacency matrix amat and complete the orientation rules from Meek (1995) for every i in 1,...,k (see Algorithm 1 in Perkovic et al., 2017).
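To make the single-edge case concrete, the following base-R sketch shows what adding x -> y means in the amat.cpdag coding. The helper orient_edge is hypothetical and not part of pcalg; it only flips the two matrix entries and applies neither Meek's rules nor the consistency check that addBgKnowledge performs.

```r
## Hypothetical helper (not part of pcalg): orient x -> y in an
## adjacency matrix of type amat.cpdag, where amat[y, x] = 1 and
## amat[x, y] = 0 encodes x -> y.
orient_edge <- function(amat, x, y) {
  if (amat[y, x] == 0) {
    ## either there is no edge, or the edge is already oriented y -> x:
    ## leave the matrix unchanged
    return(amat)
  }
  amat[y, x] <- 1  # arrowhead at y
  amat[x, y] <- 0  # tail at x
  amat
}

## a -- b -- c
amat <- matrix(c(0,1,0, 1,0,1, 0,1,0), 3, 3)
colnames(amat) <- rownames(amat) <- letters[1:3]

bg <- orient_edge(amat, x = "b", y = "c")
bg["c", "b"]  # 1: arrowhead at c
bg["b", "c"]  # 0: tail at b
```

The edge a -- b is left untouched here; deciding whether it must be oriented as well is the job of the orientation rules, which this sketch deliberately omits.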
An adjacency matrix of type amat.cpdag of the maximally oriented PDAG with added background knowledge x -> y, or NULL if the background knowledge is not consistent with any DAG represented by the PDAG with the adjacency matrix amat.
Emilija Perkovic and Markus Kalisch
C. Meek (1995). Causal inference and causal explanation with background knowledge. In Proceedings of UAI 1995, 403-410.
E. Perkovic, M. Kalisch and M.H. Maathuis (2017). Interpreting and using CPDAGs with background knowledge. In Proceedings of UAI 2017.
## a -- b -- c
amat <- matrix(c(0,1,0, 1,0,1, 0,1,0), 3,3)
colnames(amat) <- rownames(amat) <- letters[1:3]
## plot(as(t(amat), "graphNEL"))
addBgKnowledge(gInput = amat) ## amat is a valid CPDAG

## b -> c is directed; a -- b is not directed by applying
## Meek's orientation rules
bg1 <- addBgKnowledge(gInput = amat, x = "b", y = "c")
## plot(as(t(bg1), "graphNEL"))

## b -> c and b -> a are directed
bg2 <- addBgKnowledge(gInput = amat, x = c("b","b"), y = c("c","a"))
## plot(as(t(bg2), "graphNEL"))

## c -> b is directed; as a consequence of Meek's orientation rules,
## b -> a is directed as well
bg3 <- addBgKnowledge(gInput = amat, x = "c", y = "b")
## plot(as(t(bg3), "graphNEL"))

amat2 <- matrix(c(0,1,0, 1,0,1, 0,1,0), 3,3)
colnames(amat2) <- rownames(amat2) <- letters[1:3]
## new collider is inconsistent with original CPDAG; thus, NULL is returned
addBgKnowledge(gInput = amat2, x = c("c", "a"), y = c("b", "b"))
This function is a convenience wrapper for the function adjustmentSets from package dagitty.
adjustment(amat, amat.type, x, y, set.type)
amat |
adjacency matrix of type amat.cpdag or amat.pag. |
amat.type |
string specifying the type of graph of the adjacency matrix amat. It can be a DAG (type="dag"), a CPDAG (type="cpdag") or a maximally oriented PDAG (type="pdag") from Meek (1995); then the type of adjacency matrix is assumed to be amat.cpdag. It can also be a MAG (type = "mag") or a PAG (type="pag"); then the type of the adjacency matrix is assumed to be amat.pag. |
x |
(integer) position of variable x in the adjacency matrix. |
y |
(integer) position of variable y in the adjacency matrix. |
set.type |
string specifying the type of adjustment set that should be computed. It can be "minimal", "all" or "canonical"; each option is explained below. |
If set.type is "minimal", then only minimal sufficient adjustment sets are returned. If set.type is "all", all valid adjustment sets are returned. If set.type is "canonical", a single adjustment set is returned that consists of all (possible) ancestors of x and y, minus (possible) descendants of nodes on proper causal paths. This canonical adjustment set is always valid if any valid set exists at all.
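The relation between "all" and "minimal" can be illustrated without calling dagitty. The sketch below is plain base R, not the package's implementation: it takes the six valid sets listed in the example for x = 3, y = 6 and keeps those that have no proper subset among the valid sets. The helper name is_proper_subset is an assumption made for illustration.

```r
## Valid adjustment sets reported in the example for x = 3, y = 6
all_sets <- list(c(2,4), c(1,2,4), c(4,5), c(1,4,5), c(2,4,5), c(1,2,4,5))

## Hypothetical helper: is 'a' a proper subset of 'b'?
is_proper_subset <- function(a, b) length(a) < length(b) && all(a %in% b)

## A set is minimal if no other valid set is a proper subset of it
is_minimal <- vapply(seq_along(all_sets), function(i) {
  !any(vapply(all_sets[-i], is_proper_subset, logical(1), b = all_sets[[i]]))
}, logical(1))

minimal_sets <- all_sets[is_minimal]
minimal_sets  # the sets (2,4) and (4,5)
```

Minimality is defined by set inclusion, not by size; in this example the two notions happen to coincide.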
If adjustment sets exist, a list of length at least one is returned (list elements might be empty vectors if the empty set is an adjustment set). If no adjustment set exists, an empty list is returned.
Emilija Perkovic and Markus Kalisch ([email protected])
E. Perkovic, J. Textor, M. Kalisch and M.H. Maathuis (2015). A Complete Generalized Adjustment Criterion. In Proceedings of UAI 2015.
E. Perkovic, J. Textor, M. Kalisch and M.H. Maathuis (2017). Complete graphical characterization and construction of adjustment sets in Markov equivalence classes of ancestral graphs. To appear in Journal of Machine Learning Research.
B. van der Zander, M. Liskiewicz and J. Textor (2014). Constructing separators and adjustment sets in ancestral graphs. In Proceedings of UAI 2014.
gac for testing if a set satisfies the Generalized Adjustment Criterion.
## Example 4.1 in Perkovic et al. (2015), Example 2 in Perkovic et al. (2017)
mFig1 <- matrix(c(0,1,1,0,0,0, 1,0,1,1,1,0, 0,0,0,0,0,1,
                  0,1,1,0,1,1, 0,1,0,1,0,1, 0,0,0,0,0,0), 6,6)
type <- "cpdag"
x <- 3; y <- 6
## plot(as(t(mFig1), "graphNEL"))

## all
if(requireNamespace("dagitty", quietly = TRUE)) {
  adjustment(amat = mFig1, amat.type = type, x = x, y = y, set.type = "all")
}
## finds adjustment sets: (2,4), (1,2,4), (4,5), (1,4,5), (2,4,5), (1,2,4,5)

## minimal
if(requireNamespace("dagitty", quietly = TRUE)) {
  adjustment(amat = mFig1, amat.type = type, x = x, y = y, set.type = "minimal")
}
## finds adjustment sets: (2,4), (4,5), i.e., the valid sets with the fewest elements

## canonical
if(requireNamespace("dagitty", quietly = TRUE)) {
  adjustment(amat = mFig1, amat.type = type, x = x, y = y, set.type = "canonical")
}
## finds adjustment set: (1,2,4,5)
Estimate an APDAG (a particular PDAG) using the aggregative greedy equivalence search (AGES) algorithm, which uses the solution path of the greedy equivalence search (GES) algorithm of Chickering (2002).
ages(data, lambda_min = 0.5 * log(nrow(data)), labels = NULL, fixedGaps = NULL, adaptive = c("none", "vstructures", "triples"), maxDegree = integer(0), verbose = FALSE, ...)
data |
An n*p matrix (or data frame) containing the observational data. |
lambda_min |
The smallest penalty parameter value used when computing the solution path of GES. |
labels |
Node labels; if NULL the names of the columns of the data matrix (or the names in the data frame) are used. If these are not specified the sequence 1 to p is used. |
fixedGaps |
logical symmetric matrix of dimension p*p. If entry [i, j] is TRUE, the result is guaranteed to have no edge between nodes i and j. |
adaptive |
indicating whether constraints should be adapted to newly detected v-structures or unshielded triples (cf. details). |
maxDegree |
Parameter used to limit the vertex degree of the estimated graph. Valid arguments:
|
verbose |
If TRUE, detailed output is provided. |
... |
Additional arguments for debugging purposes and fine tuning. |
This function tries to add orientations to the essential graph (CPDAG) found by ges (run with lambda = lambda_min). It does so by aggregating several CPDAGs present in the solution path of GES. Conceptually, AGES starts with the essential graph found by GES run with lambda = lambda_min. Then, it checks for further (compatible) orientation information in other essential graphs present in the solution path of GES, i.e., in essential graphs output by GES for larger penalty parameters. By "compatible" we mean that the aggregation process is done such that the final APDAG is still within the Markov equivalence class represented by the essential graph found by GES, in the following sense: an APDAG can always be extended to a DAG without creating new v-structures, and this DAG lies in the Markov equivalence class represented by the essential graph found by GES. The algorithm is explained in detail in Eigenmann, Nandy, and Maathuis (2017).
The arguments fixedGaps and adaptive also work with AGES. However, they have not been studied in Eigenmann, Nandy, and Maathuis (2017).
Using the argument fixedGaps, one can make sure that certain edges will not be present in the resulting essential graph: if the entry [i, j] of the matrix passed to fixedGaps is TRUE, there will be no edge between nodes i and j. The argument adaptive can be used to relax the constraints encoded by fixedGaps according to a modification of GES called ARGES (adaptively restricted greedy equivalence search), which has been presented in Nandy, Hauser and Maathuis (2018):
When adaptive = "vstructures" and the algorithm introduces a new v-structure a -> b <- c in the forward phase, then the edge a - c is removed from the list of fixed gaps, meaning that the insertion of an edge between a and c becomes possible even if it was forbidden by the initial matrix passed to fixedGaps.
When adaptive = "triples" and the algorithm introduces a new unshielded triple in the forward phase (i.e., a subgraph of three nodes a, b and c where a and b as well as b and c are adjacent, but a and c are not), then the edge a - c is removed from the list of fixed gaps.
With one of the adaptive modifications, the successive application of a skeleton estimation method and GES restricted to an estimated skeleton still gives a consistent estimator of the DAG, which is not the case without the adaptive modification.
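As a rough illustration of the bookkeeping involved, the base-R sketch below builds a fixedGaps matrix that forbids one edge and then releases that gap, as would happen for the two outer nodes of a newly detected unshielded triple. The helper release_gap is hypothetical and only mimics the described behaviour; it is not how pcalg implements ARGES internally.

```r
p <- 3
## Symmetric logical matrix: TRUE means "no edge allowed between i and j"
fixedGaps <- matrix(FALSE, p, p)
fixedGaps[1, 3] <- fixedGaps[3, 1] <- TRUE   # forbid an edge between nodes 1 and 3

## Hypothetical helper: remove the pair (i, j) from the list of fixed gaps,
## keeping the matrix symmetric
release_gap <- function(fixedGaps, i, j) {
  fixedGaps[i, j] <- fixedGaps[j, i] <- FALSE
  fixedGaps
}

## Suppose the forward phase created the unshielded triple 1 - 2 - 3:
## the gap between the outer nodes 1 and 3 is released
relaxed <- release_gap(fixedGaps, i = 1, j = 3)
relaxed[1, 3]  # FALSE: an edge between 1 and 3 may now be inserted
```

A matrix such as fixedGaps above is what one would pass to the fixedGaps argument; the relaxation itself is carried out by the algorithm when adaptive is set accordingly.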
For a detailed explanation of the GES function as well as related objects such as essential graphs, we refer to the ges function.
Differences in the arguments with respect to GES: AGES uses data to initialize several scores taken as argument by GES. AGES modifies the forward and backward phases of GES, performing single steps in either direction. For this reason, phase, iterate, and turning are not available arguments.
ages returns a list with the following four components:
essgraph |
An object of class EssGraph containing an estimate of the equivalence class of the underlying DAG. |
repr |
An object of a class derived from ParDAG containing a (random) representative of the estimated equivalence class. |
CPDAGsList |
A list of p*p matrices containing all CPDAGs considered by AGES in the aggregation process. |
lambda |
A vector containing the penalty parameters used to obtain the list of CPDAGs mentioned above. GES returns the same list of CPDAGs when run with this vector of penalty parameters, phase = c("forward", "backward") and iterate = FALSE. |
Marco Eigenmann ([email protected])
D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507–554
M.F. Eigenmann, P. Nandy, and M.H. Maathuis (2017). Structure learning of linear Gaussian structural equation models with weak edges. In Proceedings of UAI 2017
P. Nandy, A. Hauser and M.H. Maathuis (2018). High-dimensional consistency in score-based and hybrid structure learning. Annals of Statistics, to appear.
## Example 1: ages adds correct orientations: Bar --> V6 and Bar --> V8
set.seed(77)
p <- 8
n <- 5000
## true DAG:
vars <- c("Author", "Bar", "Ctrl", "Goal", paste0("V",5:8))
gGtrue <- randomDAG(p, prob = 0.3, V = vars)
data = rmvDAG(n, gGtrue)

## Estimate the aggregated PDAG with ages
ages.fit <- ages(data = data)

## Estimate the essential graph with ges
## We specify the phases in order to have a fair comparison of the algorithms
## Without the phases specified it would be easy to find examples
## where each algorithm outperforms the other
score <- new("GaussL0penObsScore", data)
ges.fit <- ges(score, phase = c("forward","backward"), iterate = FALSE)

## Plots
if (require(Rgraphviz)) {
  par(mfrow=c(1,3))
  plot(ges.fit$essgraph, main="Estimated CPDAG with GES")
  plot(ages.fit$essgraph, main="Estimated APDAG with AGES")
  plot(gGtrue, main="TrueDAG")
}

## Example 2: ages adds correct orientations: Author --> Goal and Author --> V5
set.seed(50)
p <- 9
n <- 5000
## true DAG:
vars <- c("Author", "Bar", "Ctrl", "Goal", paste0("V",5:9))
gGtrue <- randomDAG(p, prob = 0.5, V = vars)
data = rmvDAG(n, gGtrue)

## Estimate the aggregated PDAG with ages
ages.fit <- ages(data = data)

## Estimate the essential graph with ges
## We specify the phases in order to have a fair comparison of the algorithms
## Without the phases specified it would be easy to find examples
## where each algorithm outperforms the other
score <- new("GaussL0penObsScore", data)
ges.fit <- ges(score, phase = c("forward","backward"), iterate = FALSE)

## Plots
if (require(Rgraphviz)) {
  par(mfrow=c(1,3))
  plot(ges.fit$essgraph, main="Estimated CPDAG with GES")
  plot(ages.fit$essgraph, main="Estimated APDAG with AGES")
  plot(gGtrue, main="TrueDAG")
}

## Example 3: ges and ages return the same graph
data(gmG)
data <- gmG8$x

## Estimate the aggregated PDAG with ages
ages.fit <- ages(data = data)

## Estimate the essential graph with ges
score <- new("GaussL0penObsScore", data)
ges.fit <- ges(score)

## Plots
if (require(Rgraphviz)) {
  par(mfrow=c(1,3))
  plot(ges.fit$essgraph, main="Estimated CPDAG with GES")
  plot(ages.fit$essgraph, main="Estimated APDAG with AGES")
  plot(gmG8$g, main="TrueDAG")
}
Two types of adjacency matrices are used in package pcalg: type amat.cpdag for DAGs and CPDAGs and type amat.pag for MAGs and PAGs. The required type of adjacency matrix is documented in the help files of the respective functions or classes. If some functions need more detailed information on the graph type (i.e. DAG or CPDAG; MAG or PAG), this information is passed in a separate argument (see e.g. gac and the examples below).
Note that you get ('extract') such adjacency matrices as (S3) objects of class "amat" via the usual as(., "<class>") coercion:
as(from, "amat")
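from |
an R object of class pcAlgo (as returned by skeleton() or pc()), of class fciAlgo (as returned by fci()), or of class "LINGAM" (as returned by lingam()). |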
Adjacency matrices are integer-valued square matrices with zeros on the diagonal. They can have row and column names; however, most functions work on the (integer) column positions in the adjacency matrix.
Coding for type amat.cpdag:
0: No edge or tail
1: Arrowhead
Note that the edgemark-code refers to the row index (as opposed to adjacency matrices of type mag or pag). E.g.:
amat[a,b] = 0 and amat[b,a] = 1 implies a --> b.
amat[a,b] = 1 and amat[b,a] = 0 implies a <-- b.
amat[a,b] = 0 and amat[b,a] = 0 implies a     b.
amat[a,b] = 1 and amat[b,a] = 1 implies a --- b.
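The four cases of the amat.cpdag coding can be turned into a small decoder. This is a minimal base-R sketch for illustration; the function edge_cpdag is hypothetical and not part of pcalg.

```r
## Hypothetical decoder for adjacency matrices of type amat.cpdag:
## returns the edge between nodes a and b as a string
edge_cpdag <- function(amat, a, b) {
  ab <- amat[a, b]
  ba <- amat[b, a]
  if (ab == 0 && ba == 1) {
    "a --> b"
  } else if (ab == 1 && ba == 0) {
    "a <-- b"
  } else if (ab == 1 && ba == 1) {
    "a --- b"
  } else {
    "no edge"
  }
}

## Toy graph: 1 --> 2, 2 --- 3, no edge between 1 and 3
m <- matrix(0L, 3, 3)
m[2, 1] <- 1L                 # arrowhead at node 2 on the edge 1 --> 2
m[2, 3] <- 1L; m[3, 2] <- 1L  # undirected edge 2 --- 3

edge_cpdag(m, 1, 2)  # "a --> b"
edge_cpdag(m, 2, 3)  # "a --- b"
edge_cpdag(m, 1, 3)  # "no edge"
```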
Coding for type amat.pag:
0: No edge
1: Circle
2: Arrowhead
3: Tail
Note that the edgemark-code refers to the column index (as opposed to adjacency matrices of type dag or cpdag). E.g.:
amat[a,b] = 2 and amat[b,a] = 3 implies a --> b.
amat[a,b] = 3 and amat[b,a] = 2 implies a <-- b.
amat[a,b] = 2 and amat[b,a] = 2 implies a <-> b.
amat[a,b] = 1 and amat[b,a] = 3 implies a --o b.
amat[a,b] = 0 and amat[b,a] = 0 implies a     b.
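For type amat.pag the mark stored in amat[a,b] sits at the b end of the edge. The following base-R sketch decodes a pair of entries accordingly; edge_pag is a hypothetical helper written for this illustration, not a pcalg function.

```r
## Hypothetical decoder for adjacency matrices of type amat.pag.
## amat[a, b] is the edge mark at b, amat[b, a] the edge mark at a
## (1 = circle, 2 = arrowhead, 3 = tail, 0 = no edge).
edge_pag <- function(amat, a, b) {
  if (amat[a, b] == 0 && amat[b, a] == 0) return("no edge")
  left  <- c("o", "<", "-")[amat[b, a]]  # mark drawn at the a end
  right <- c("o", ">", "-")[amat[a, b]]  # mark drawn at the b end
  paste0("a ", left, "-", right, " b")
}

## Toy graph: 1 --> 2 and 2 <-> 3
m <- matrix(0, 3, 3)
m[1, 2] <- 2; m[2, 1] <- 3   # arrowhead at 2, tail at 1
m[2, 3] <- 2; m[3, 2] <- 2   # arrowheads at both ends

edge_pag(m, 1, 2)  # "a --> b"
edge_pag(m, 2, 3)  # "a <-> b"
edge_pag(m, 1, 3)  # "no edge"
```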
E.g. gac for a function which takes an adjacency matrix as input; fciAlgo for a class which has an adjacency matrix in one slot.
getGraph(x) extracts the graph-class object from x, whereas as(*, "amat") gets the corresponding adjacency matrix.
##################################################
## Function gac() takes an adjacency matrix of
## any kind as input. In addition to that, the
## precise type of graph (DAG/CPDAG/MAG/PAG) needs
## to be passed as a different argument
##################################################
## Adjacency matrix of type 'amat.cpdag'
m1 <- matrix(c(0,1,0,1,0,0, 0,0,1,0,1,0, 0,0,0,0,0,1,
               0,0,0,0,0,0, 0,0,0,0,0,0, 0,0,0,0,0,0), 6,6)
## more detailed information on the graph type needed by gac()
gac(m1, x=1,y=3, z=NULL, type = "dag")

## Adjacency matrix of type 'amat.cpdag'
m2 <- matrix(c(0,1,1,0,0,0, 1,0,1,1,1,0, 0,0,0,0,0,1,
               0,1,1,0,1,1, 0,1,0,1,0,1, 0,0,0,0,0,0), 6,6)
## more detailed information on the graph type needed by gac()
gac(m2, x=3, y=6, z=c(2,4), type = "cpdag")

## Adjacency matrix of type 'amat.pag'
m3 <- matrix(c(0,2,0,0, 3,0,3,3, 0,2,0,3, 0,2,2,0), 4,4)
## more detailed information on the graph type needed by gac()
mg3 <- gac(m3, x=2, y=4, z=NULL, type = "mag")
pg3 <- gac(m3, x=2, y=4, z=NULL, type = "pag")

############################################################
## as(*, "amat") returns an adjacency matrix incl. its type
############################################################
## Load predefined data
data(gmG)
n <- nrow (gmG8$x)
V <- colnames(gmG8$x)
## define sufficient statistics
suffStat <- list(C = cor(gmG8$x), n = n)
## estimate CPDAG
skel.fit <- skeleton(suffStat, indepTest = gaussCItest,
                     alpha = 0.01, labels = V)
## Extract the "amat" [and show nicely via 'print()' method]:
as(skel.fit, "amat")

##################################################
## Function fci() returns an adjacency matrix
## of type amat.pag as one slot.
##################################################
set.seed(42)
p <- 7
## generate and draw random DAG :
myDAG <- randomDAG(p, prob = 0.4)
## find skeleton and PAG using the FCI algorithm
suffStat <- list(C = cov2cor(trueCov(myDAG)), n = 10^9)
res <- fci(suffStat, indepTest=gaussCItest,
           alpha = 0.9999, p=p, doPdsep = FALSE)
str(res)
## get the a(djacency) mat(rix) and nicely print() it:
as(res, "amat")

##################################################
## pcAlgo object
##################################################
## Load predefined data
data(gmG)
n <- nrow (gmG8$x)
V <- colnames(gmG8$x)
## define sufficient statistics
suffStat <- list(C = cor(gmG8$x), n = n)
## estimate CPDAG
skel.fit <- skeleton(suffStat, indepTest = gaussCItest,
                     alpha = 0.01, labels = V)
## Extract Adjacency Matrix - and print (via method 'print.amat'):
as(skel.fit, "amat")
pc.fit <- pc(suffStat, indepTest = gaussCItest,
             alpha = 0.01, labels = V)
pc.fit # (using its own print() method 'print.pcAlgo')
as(pc.fit, "amat")
This function first checks if the total causal effect of one variable (x) onto another variable (y) is identifiable via the GBC, and if this is the case it explicitly gives a set of variables that satisfies the GBC with respect to x and y in the given graph.
backdoor(amat, x, y, type = "pag", max.chordal = 10, verbose=FALSE)
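amat |
adjacency matrix of type amat.cpdag or amat.pag. |
x, y |
(integer) position of variable x and y, respectively, in the adjacency matrix. |
type |
string specifying the type of graph of the adjacency matrix amat. It can be a DAG (type="dag") or a CPDAG (type="cpdag"); then the type of the adjacency matrix is assumed to be amat.cpdag. It can also be a MAG (type="mag") or a PAG (type="pag"); then the type of the adjacency matrix is assumed to be amat.pag. |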
max.chordal |
only if |
verbose |
logical; if true, some output is produced during computation. |
This function generalizes Pearl's backdoor criterion (Pearl, 1993), which is defined for directed acyclic graphs (DAGs) with a single intervention and a single outcome variable, to more general types of graphs (CPDAGs, MAGs, and PAGs). These graphs describe Markov equivalence classes of DAGs with and without latent variables but without selection variables. For more details see Maathuis and Colombo (2015).
The motivation to find a set W that satisfies the GBC with respect to x and y in the given graph relies on the result of the generalized backdoor adjustment:
If a set of variables W satisfies the GBC relative to x and y in the given graph, then the causal effect of x on y is identifiable and is given by
$$P(Y \mid do(X = x)) = \int_W P(Y \mid X = x, W) \, P(W) \, dW.$$
This result allows post-intervention densities (those written using Pearl's do-calculus) to be expressed using only observational densities estimated from the data.
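For a discrete adjustment set, backdoor adjustment replaces the integral over W by a sum, P(Y = y | do(X = x)) = sum over w of P(Y = y | X = x, W = w) P(W = w). The short base-R sketch below evaluates this sum for a single binary W. All probabilities are made-up toy numbers chosen for illustration, not estimates from any data set.

```r
## Assumed toy numbers for a binary adjustment variable W
p_w            <- c(w0 = 0.7, w1 = 0.3)  # P(W = w)
p_y_given_x1_w <- c(w0 = 0.2, w1 = 0.6)  # P(Y = 1 | X = 1, W = w)

## Backdoor adjustment: weight the conditional probabilities by P(W = w)
p_y_do_x1 <- sum(p_y_given_x1_w * p_w)
p_y_do_x1  # 0.7 * 0.2 + 0.3 * 0.6 = 0.32
```

Only observational quantities appear on the right-hand side, which is what makes the post-intervention probability estimable from data once a valid set W is known.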
If the input graph is a DAG (type="dag"), this function reduces to Pearl's backdoor criterion for single interventions and single outcome variable, and the parents of x in the DAG satisfy the backdoor criterion unless y is a parent of x.
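The DAG case can be sketched in a few lines of base R. The helpers parents_of and backdoor_dag_sketch are hypothetical; they assume the amat.cpdag coding (amat[x, j] = 1 and amat[j, x] = 0 encodes j -> x) and return NA when y is a parent of x, which is an assumption of this sketch rather than documented behaviour of backdoor().

```r
## Hypothetical helper: parents of node x in a DAG coded as amat.cpdag
parents_of <- function(amat, x) which(amat[x, ] == 1 & amat[, x] == 0)

## Sketch of the DAG case: the parents of x form a backdoor set
## unless y is itself a parent of x
backdoor_dag_sketch <- function(amat, x, y) {
  pa <- parents_of(amat, x)
  if (y %in% pa) NA else unname(pa)
}

## DAG 1 -> 2, 1 -> 3, 2 -> 3
amat <- matrix(0, 3, 3)
amat[2, 1] <- 1   # 1 -> 2
amat[3, 1] <- 1   # 1 -> 3
amat[3, 2] <- 1   # 2 -> 3

backdoor_dag_sketch(amat, x = 2, y = 3)  # 1: node 1 is the only parent of 2
backdoor_dag_sketch(amat, x = 2, y = 1)  # NA: y is a parent of x
```

The sketch covers only this special case; the CPDAG, MAG and PAG cases require the identifiability check described in Maathuis and Colombo (2015).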
If the input graph is a CPDAG C (type="cpdag"), a MAG M (type="mag"), or a PAG P (type="pag") (with both M and P not allowing selection variables), this function first checks if the total causal effect of x on y is identifiable via the GBC (see Maathuis and Colombo, 2015). If the effect is not identifiable in this way, the output is NA. Otherwise, an explicit set W that satisfies the GBC with respect to x and y in the given graph is found.
At this moment this function is not able to work with an RFCI-PAG.
It is important to note that there can be pairs of nodes x and y for which there is no set W that satisfies the GBC, but the total causal effect might be identifiable via some other technique.
For the coding of the adjacency matrix see amatType.
Either NA if the total causal effect is not identifiable via the GBC, or a set if the effect is identifiable via the GBC. Note that if the set W is equal to the empty set, the output is NULL.
Diego Colombo and Markus Kalisch ([email protected])
M.H. Maathuis and D. Colombo (2015). A generalized backdoor criterion. Annals of Statistics 43 1060-1088.
J. Pearl (1993). Comment: Graphical models, causality and intervention. Statistical Science 8, 266–269.
gac for the Generalized Adjustment Criterion (GAC), which is a generalization of GBC; pc for estimating a CPDAG, dag2pag and fci for estimating a PAG, and pag2magAM for estimating a MAG.
#####################################################################
##DAG
#####################################################################
## Simulate the true DAG
suppressWarnings(RNGversion("3.5.0"))
set.seed(123)
p <- 7
myDAG <- randomDAG(p, prob = 0.2) ## true DAG
## Extract the adjacency matrix of the true DAG
true.amat <- (amat <- as(myDAG, "matrix")) != 0 # TRUE/FALSE <==> 1/0
print.table(1*true.amat, zero.=".") # "visualization"
## Compute set satisfying the GBC:
backdoor(true.amat, 5, 7, type="dag")

#####################################################################
##CPDAG
#####################################################################
##################################################
## Example not identifiable
## Maathuis and Colombo (2015), Fig. 3a, p.1072
##################################################
## create the graph
p <- 5
. <- 0
amat <- rbind(c(.,.,1,1,1),
              c(.,.,1,1,1),
              c(.,.,.,1,.),
              c(.,.,.,.,1),
              c(.,.,.,.,.))
colnames(amat) <- rownames(amat) <- as.character(1:5)
V <- as.character(1:5)
edL <- vector("list",length=5)
names(edL) <- V
edL[[1]] <- list(edges=c(3,4,5),weights=c(1,1,1))
edL[[2]] <- list(edges=c(3,4,5),weights=c(1,1,1))
edL[[3]] <- list(edges=4,weights=c(1))
edL[[4]] <- list(edges=5,weights=c(1))
g <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed")
## estimate the true CPDAG
myCPDAG <- dag2cpdag(g)
## Extract the adjacency matrix of the true CPDAG
true.amat <- (as(myCPDAG, "matrix") != 0) # 1/0 <==> TRUE/FALSE
## The effect is not identifiable, in fact:
backdoor(true.amat, 3, 5, type="cpdag")

##################################################
## Example identifiable
## Maathuis and Colombo (2015), Fig. 3b, p.1072
##################################################
## create the graph
p <- 6
amat <- rbind(c(0,0,1,1,0,1),
              c(0,0,1,1,0,1),
              c(0,0,0,0,1,0),
              c(0,0,0,0,1,1),
              c(0,0,0,0,0,0),
              c(0,0,0,0,0,0))
colnames(amat) <- rownames(amat) <- as.character(1:6)
V <- as.character(1:6)
edL <- vector("list",length=6)
names(edL) <- V
edL[[1]] <- list(edges=c(3,4,6),weights=c(1,1,1))
edL[[2]] <- list(edges=c(3,4,6),weights=c(1,1,1))
edL[[3]] <- list(edges=5,weights=c(1))
edL[[4]] <- list(edges=c(5,6),weights=c(1,1))
g <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed")
## estimate the true CPDAG
myCPDAG <- dag2cpdag(g)
## Extract the adjacency matrix of the true CPDAG
true.amat <- as(myCPDAG, "matrix") != 0
## The effect is identifiable and the set satisfying GBC is:
backdoor(true.amat, 6, 3, type="cpdag")

##################################################################
##PAG
##################################################################
##################################################
## Example identifiable
## Maathuis and Colombo (2015), Fig. 5a, p.1075
##################################################
## create the graph
p <- 7
amat <- t(matrix(c(0,0,1,1,0,0,0,
                   0,0,1,1,0,0,0,
                   0,0,0,1,0,1,0,
                   0,0,0,0,0,0,1,
                   0,0,0,0,0,1,1,
                   0,0,0,0,0,0,0,
                   0,0,0,0,0,0,0), 7, 7))
colnames(amat) <- rownames(amat) <- as.character(1:7)
V <- as.character(1:7)
edL <- vector("list",length=7)
names(edL) <- V
edL[[1]] <- list(edges=c(3,4),weights=c(1,1))
edL[[2]] <- list(edges=c(3,4),weights=c(1,1))
edL[[3]] <- list(edges=c(4,6),weights=c(1,1))
edL[[4]] <- list(edges=7,weights=c(1))
edL[[5]] <- list(edges=c(6,7),weights=c(1,1))
g <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed")
L <- 5
## compute the true covariance matrix of g
cov.mat <- trueCov(g)
## transform covariance matrix into a correlation matrix
true.corr <- cov2cor(cov.mat)
suffStat <- list(C=true.corr, n=10^9)
indepTest <- gaussCItest
## estimate the true PAG
true.pag <- dag2pag(suffStat, indepTest, g, L, alpha = 0.9999)@amat
## The effect is identifiable and the backdoor set is:
backdoor(true.pag, 3, 5, type="pag")
This function is DEPRECATED! Use ida instead.
beta.special(dat=NA, x.pos, y.pos, verbose=0, a=0.01, myDAG=NA, myplot=FALSE, perfect=FALSE, method="local", collTest=TRUE, pcObj=NA, all.dags=NA, u2pd="rand")
dat | Data matrix |
x.pos, y.pos | integer column positions of |
verbose | 0=no comments, 2=detail on estimates |
a | Significance level of tests for finding CPDAG |
myDAG | Needed if true correlation matrix shall be computed |
myplot | Plot estimated graph |
perfect | True cor matrix is calculated from myDAG |
method | "local" - local (all combinations of parents in regr.); "global" - all DAGs |
collTest | True - Exclude orientations of undirected edges that introduce a new collider |
pcObj | Fit of PC Algorithm (CPDAG); if this is available, no new fit is done |
all.dags | All DAGs in the format of function allDags; if this is available, no new function call allDags is done |
u2pd | function for converting a UDAG to a PDAG; "rand": |
estimates of intervention effects
Markus Kalisch ([email protected])
pcAlgo, dag2cpdag; beta.special.pcObj for a fast version of beta.special(), using a precomputed pc-object.
This function is DEPRECATED! Use ida or idaFast instead.
beta.special.pcObj(x.pos, y.pos, pcObj, mcov=NA, amat=NA, amatSkel=NA, t.amat=NA)
x.pos | Column of x in dat |
y.pos | Column of y in dat |
pcObj | Precomputed pc-object |
mcov | Covariance that was used in the pc-object fit |
amat, amatSkel, t.amat | Matrices that can be precomputed, if needed (see code for details on how to precompute) |
estimates of intervention effects
Markus Kalisch ([email protected])
pcAlgo, dag2cpdag, beta.special
G² statistic to test for (conditional) independence of binary variables X and Y given the (possibly empty) set of binary variables S.
binCItest() is a wrapper of gSquareBin(), to be easily used in skeleton, pc and fci.
gSquareBin(x, y, S, dm, adaptDF = FALSE, n.min = 10*df, verbose = FALSE)
binCItest (x, y, S, suffStat)
x, y | (integer) position of variable |
S | (integer) positions of zero or more conditioning variables in the adjacency matrix. |
dm | data matrix (with |
adaptDF | logical specifying if the degrees of freedom should be lowered by one for each zero count. The value for the degrees of freedom cannot go below 1. |
n.min | the smallest |
verbose | logical or integer indicating that increased diagnostic output is to be provided. |
suffStat | a |
The statistic is used to test for (conditional) independence of X and Y given a set S (can be NULL). This function is a specialized version of gSquareDis which is for discrete variables with more than two levels.
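As a rough illustration of the kind of test described above, the following base R sketch computes a G² statistic for {0,1}-coded data by summing over the observed configurations of the conditioning variables. It is not the pcalg implementation: the function name g2_binary, the degrees of freedom 2^|S| and the omission of any adaptDF or n.min handling are assumptions of this sketch, so its p-values need not agree with gSquareBin().

```r
## Hypothetical helper (not part of pcalg): G^2 test for binary variables.
## x, y: column positions; S: conditioning columns (may be NULL); dm: 0/1 matrix.
g2_binary <- function(x, y, S = NULL, dm) {
  ## one stratum per observed configuration of the conditioning variables
  strata <- if (length(S) == 0) {
    rep("all", nrow(dm))
  } else {
    apply(dm[, S, drop = FALSE], 1, paste, collapse = "")
  }
  g2 <- 0
  for (s in unique(strata)) {
    idx  <- strata == s
    obs  <- unclass(table(factor(dm[idx, x], levels = 0:1),
                          factor(dm[idx, y], levels = 0:1)))
    expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
    pos  <- obs > 0                      # cells with zero count contribute 0
    g2   <- g2 + 2 * sum(obs[pos] * log(obs[pos] / expd[pos]))
  }
  df <- 2^length(S)                      # assumed: (2-1) * (2-1) * 2^|S|
  pchisq(g2, df, lower.tail = FALSE)     # p-value
}

## deterministic toy data: x and y are perfectly balanced, z copies x
x <- rep(c(0, 0, 1, 1), 25)
y <- rep(c(0, 1, 0, 1), 25)
z <- x
dm <- cbind(x, y, z)
p_indep <- g2_binary(1, 2, NULL, dm)     # x, y: no association
p_dep   <- g2_binary(1, 3, NULL, dm)     # x, z: identical columns
p_cond  <- g2_binary(1, 2, 3, dm)        # x, y given z
```

A large p-value is read as "independence not rejected", which is how a p-value returning test is used inside skeleton, pc or fci.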
The p-value of the test.
Nicoletta Andri and Markus Kalisch ([email protected])
R.E. Neapolitan (2004). Learning Bayesian Networks. Prentice Hall Series in Artificial Intelligence. Chapter 10.3.1
gSquareDis for a (conditional) independence test for discrete variables with more than two levels.
dsepTest, gaussCItest and disCItest for similar functions for a d-separation oracle, a conditional independence test for Gaussian variables and a conditional independence test for discrete variables, respectively.
skeleton, pc or fci which need a testing function such as binCItest.
n <- 100
set.seed(123)
## Simulate *independent* data of {0,1}-variables:
x <- rbinom(n, 1, prob=1/2)
y <- rbinom(n, 1, prob=1/2)
z <- rbinom(n, 1, prob=1/2)
dat <- cbind(x,y,z)
binCItest(1,3,2, list(dm = dat, adaptDF = FALSE)) # 0.36, not signif.
binCItest(1,3,2, list(dm = dat, adaptDF = TRUE )) # the same, here
## Simulate data from a chain of 3 variables: x1 -> x2 -> x3
set.seed(12)
b0 <- 0
b1 <- 1
b2 <- 1
n <- 10000
x1 <- rbinom(n, size=1, prob=1/2) ## = sample(c(0,1), n, replace=TRUE)
## NB: plogis(u) := "expit(u)" := exp(u) / (1 + exp(u))
p2 <- plogis(b0 + b1*x1) ; x2 <- rbinom(n, 1, prob = p2) # {0,1}
## x3 depends on x2 through p3
p3 <- plogis(b0 + b2*x2) ; x3 <- rbinom(n, 1, prob = p3) # {0,1}
ftable(xtabs(~ x1+x2+x3))
dat <- cbind(x1,x2,x3)
## Test marginal and conditional independencies
gSquareBin(3,1,NULL,dat, verbose=TRUE)
gSquareBin(3,1, 2, dat)
gSquareBin(1,3, 2, dat) # the same
gSquareBin(1,3, 2, dat, adaptDF=TRUE, verbose = 2)
For each subset of nbrsA and nbrsC where a and c are conditionally independent, it is checked if b is in the conditioning set.
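The check described above can be sketched in base R as a simple tally: enumerate all subsets of the two neighbor sets, keep those that separate a and c, and count how many of them contain b. This is only an illustration under stated assumptions. The helper names all_subsets and tally_sepsets, the argument name cn (used for node c) and the two toy oracles are hypothetical, and the sketch does not reproduce the decision codes or the majority rule of checkTriple().

```r
## all subsets of a vector v, including the empty set
all_subsets <- function(v) {
  out <- list(integer(0))
  for (k in seq_along(v)) {
    ## combine over positions so that a length-1 'v' is handled correctly
    out <- c(out, combn(length(v), k, FUN = function(i) v[i], simplify = FALSE))
  }
  out
}

## is_indep(x, y, S) is assumed to return TRUE if x and y are judged
## conditionally independent given S (oracle or thresholded test)
tally_sepsets <- function(a, b, cn, nbrsA, nbrsC, is_indep) {
  cand <- unique(c(all_subsets(nbrsA), all_subsets(nbrsC)))
  sep  <- Filter(function(S) is_indep(a, cn, S), cand)
  n_with_b <- sum(vapply(sep, function(S) b %in% S, logical(1)))
  c(n.sep = length(sep), n.with.b = n_with_b)
}

## toy oracles for the triple 1 - 2 - 3
chain_oracle    <- function(x, y, S) 2 %in% S     # 1 -> 2 -> 3
collider_oracle <- function(x, y, S) !(2 %in% S)  # 1 -> 2 <- 3

tri_chain <- tally_sepsets(1, 2, 3, nbrsA = 2, nbrsC = 2, is_indep = chain_oracle)
tri_coll  <- tally_sepsets(1, 2, 3, nbrsA = 2, nbrsC = 2, is_indep = collider_oracle)
```

If b is in none of the separating sets, the triple looks like a v-structure; if b is in all of them, it does not; anything in between is the ambiguous case that the conservative algorithms have to handle.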
checkTriple(a, b, c, nbrsA, nbrsC, sepsetA, sepsetC, suffStat, indepTest, alpha, version.unf = c(NA, NA), maj.rule = FALSE, verbose = FALSE)
a, b, c | (integer) positions in adjacency matrix for nodes |
nbrsA, nbrsC | (integer) position in adjacency matrix for neighbors of |
sepsetA | vector containing |
sepsetC | vector containing |
suffStat | a |
indepTest | |
alpha | significance level of test. |
version.unf | (integer) vector of length two: |
maj.rule | logical indicating that the following majority rule is applied: if |
verbose | Logical asking for detailed output of intermediate steps. |
This function is used in the conservative versions of structure learning algorithms.
decision | Decision on possibly ambiguous triple, an integer code, |
vers | Version (1 or 2) of the ambiguous triple (1=normal ambiguous triple that is |
sepsetA | Updated version of |
sepsetC | Updated version of |
Markus Kalisch ([email protected]) and Diego Colombo.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15, 3741-3782.
##################################################
## Using Gaussian Data
##################################################
## Load predefined data
data(gmG)
n <- nrow (gmG8$x)
V <- colnames(gmG8$x)
## define independence test (partial correlations), and test level
indepTest <- gaussCItest
alpha <- 0.01
## define sufficient statistics
suffStat <- list(C = cor(gmG8$x), n = n)
## estimate CPDAG
pc.fit <- pc(suffStat, indepTest, alpha=alpha, labels = V, verbose = TRUE)
if (require(Rgraphviz)) {
  ## show estimated CPDAG
  par(mfrow=c(1,2))
  plot(pc.fit, main = "Estimated CPDAG")
  plot(gmG8$g, main = "True DAG")
}
a <- 6
b <- 1
c <- 8
checkTriple(a, b, c,
            nbrsA = c(1,5,7),
            nbrsC = c(1,5),
            sepsetA = pc.fit@sepset[[a]][[c]],
            sepsetC = pc.fit@sepset[[c]][[a]],
            suffStat=suffStat, indepTest=indepTest, alpha=alpha,
            version.unf = c(2,2),
            verbose = TRUE) -> ct
str(ct)
## List of 4
## $ decision: int 2
## $ version : int 1
## $ SepsetA : int [1:2] 1 5
## $ SepsetC : int 1
checkTriple(a, b, c,
            nbrsA = c(1,5,7),
            nbrsC = c(1,5),
            sepsetA = pc.fit@sepset[[a]][[c]],
            sepsetC = pc.fit@sepset[[c]][[a]],
            version.unf = c(1,1),
            suffStat=suffStat, indepTest=indepTest, alpha=alpha) -> c2
stopifnot(identical(ct, c2)) ## in this case, 'version.unf' had no effect
Compares the true undirected graph with an estimated undirected graph in terms of True Positive Rate (TPR), False Positive Rate (FPR) and True Discovery Rate (TDR).
compareGraphs(gl, gt)
gl | Estimated graph (graph object) |
gt | True graph (graph object) |
If the input graph is directed, the directions are omitted. Special cases:
If the true graph contains no edges, the tpr is defined to be zero.
Similarly, if the true graph contains no gaps, the fpr is defined to be one.
If there are no edges in the true graph and there are none in the estimated graph, tdr is one. If there are none in the true graph but there are some in the estimated graph, tdr is zero.
A named numeric vector with three numbers:
tpr | True Positive Rate: Number of correctly found edges (in estimated graph) divided by number of true edges (in true graph) |
fpr | False Positive Rate: Number of incorrectly found edges divided by number of true gaps (in true graph) |
tdr | True Discovery Rate: Number of correctly found edges divided by number of found edges (both in estimated graph) |
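The three rates can be reproduced by hand from two adjacency matrices. The base R sketch below follows the definitions and special cases stated above; the function name compare_undirected is hypothetical, and returning NA for the tdr when the estimated graph has no edges but the true graph does is an assumption of this sketch, since that case is not described here.

```r
## est, true: square adjacency matrices over the same nodes (non-zero = edge)
compare_undirected <- function(est, true) {
  ## omit directions: a pair is adjacent if either direction is present
  est  <- (est  != 0) | t(est  != 0)
  true <- (true != 0) | t(true != 0)
  up <- upper.tri(true)                  # count every node pair once
  e <- est[up]
  g <- true[up]
  n_true  <- sum(g)                      # true edges
  n_gaps  <- sum(!g)                     # true gaps
  n_found <- sum(e)                      # estimated edges
  n_hit   <- sum(e & g)                  # correctly found edges
  c(tpr = if (n_true == 0) 0 else n_hit / n_true,
    fpr = if (n_gaps == 0) 1 else sum(e & !g) / n_gaps,
    tdr = if (n_found > 0) n_hit / n_found
          else if (n_true == 0) 1 else NA_real_)  # NA: assumption of this sketch
}

## true graph A-B, B-C, B-D, C-D; the estimate adds the false edge A-C
V <- LETTERS[1:4]
true_amat <- matrix(0, 4, 4, dimnames = list(V, V))
true_amat["A", "B"] <- true_amat["B", "C"] <- 1
true_amat["B", "D"] <- true_amat["C", "D"] <- 1
est_amat <- true_amat
est_amat["A", "C"] <- 1
rates <- compare_undirected(est_amat, true_amat)
```

Here all 4 true edges are found (tpr = 1), 1 of the 2 true gaps is wrongly filled (fpr = 0.5), and 4 of the 5 estimated edges are correct (tdr = 0.8).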
Markus Kalisch ([email protected]) and Martin Maechler
randomDAG for generating a random DAG.
## generate a graph with 4 nodes
V <- LETTERS[1:4]
edL2 <- vector("list", length=4)
names(edL2) <- V
edL2[[1]] <- list(edges= 2)
edL2[[2]] <- list(edges= c(1,3,4))
edL2[[3]] <- list(edges= c(2,4))
edL2[[4]] <- list(edges= c(2,3))
gt <- new("graphNEL", nodes=V, edgeL=edL2, edgemode="undirected")
## change graph
gl <- graph::addEdge("A","C", gt,1)
## compare the two graphs
if (require(Rgraphviz)) {
  par(mfrow=c(2,1))
  plot(gt) ; title("True graph")
  plot(gl) ; title("Estimated graph")
  (cg <- compareGraphs(gl,gt))
}
Using Fisher's z-transformation of the partial correlation, test for zero partial correlation of sets of normally / Gaussian distributed random variables.
The gaussCItest() function, using zStat() to test for (conditional) independence between gaussian random variables, with an interface that can easily be used in skeleton, pc and fci.
condIndFisherZ(x, y, S, C, n, cutoff, verbose= )
zStat (x, y, S, C, n)
gaussCItest (x, y, S, suffStat)
x, y, S | (integer) position of variable |
C | Correlation matrix of nodes |
n | Integer specifying the number of observations (“samples”) used to estimate the correlation matrix |
cutoff | Numeric cutoff for significance level of individual partial correlation tests. Must be set to |
verbose | Logical indicating whether some intermediate output should be shown; currently not used. |
suffStat | A |
For gaussian random variables and after performing Fisher's z-transformation of the partial correlation, the test statistic zStat() is (asymptotically for large enough n) standard normally distributed.
Partial correlation is tested in a two-sided hypothesis test, i.e., basically, condIndFisherZ(*) == (abs(zStat(*)) <= qnorm(1 - alpha/2)), so that TRUE means that zero partial correlation could not be rejected.
In a multivariate normal distribution, zero partial correlation is equivalent to conditional independence.
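To make the test concrete, here is a minimal base R sketch of a Fisher z test for zero partial correlation. It is an illustration rather than the package code: the function name fisher_z is hypothetical, and the partial correlation is obtained by inverting the relevant correlation submatrix instead of the recursive computation used by pcorOrder.

```r
## x, y: variable positions; S: conditioning positions (may be empty);
## C: correlation matrix; n: number of observations
fisher_z <- function(x, y, S, C, n) {
  idx <- c(x, y, S)
  P <- solve(C[idx, idx])                         # inverse correlation submatrix
  r <- -P[1, 2] / sqrt(P[1, 1] * P[2, 2])         # partial correlation of x, y given S
  z <- sqrt(n - length(S) - 3) * 0.5 * log((1 + r) / (1 - r))
  list(r = r, z = z, p.value = 2 * pnorm(abs(z), lower.tail = FALSE))
}

## correlation matrix with cor(1,3) = cor(1,2) * cor(2,3),
## so the partial correlation of 1 and 3 given 2 is zero
C <- matrix(c(1.0, 0.6, 0.3,
              0.6, 1.0, 0.5,
              0.3, 0.5, 1.0), 3, 3)
marg <- fisher_z(1, 3, integer(0), C, n = 100)    # marginal test
cond <- fisher_z(1, 3, 2, C, n = 100)             # test given variable 2

alpha <- 0.05
cutoff <- qnorm(1 - alpha / 2)
indep_marg <- abs(marg$z) <= cutoff               # zero correlation not rejected?
indep_cond <- abs(cond$z) <= cutoff
```

With these numbers the marginal test rejects independence of variables 1 and 3, while the test given variable 2 does not.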
zStat() gives a number which is asymptotically normally distributed under the null hypothesis of correlation 0.
condIndFisherZ() returns a logical indicating whether the “partial correlation of x and y given S is zero” could not be rejected at the given significance level. More intuitively and for multivariate normal data, this means: If TRUE then it seems plausible that x and y are conditionally independent given S. If FALSE then strong evidence was found against this conditional independence statement.
gaussCItest() returns the p-value of the test.
Markus Kalisch ([email protected]) and Martin Maechler
M. Kalisch and P. Buehlmann (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. JMLR 8 613-636.
pcorOrder for computing a partial correlation given the correlation matrix in a recursive way.
dsepTest, disCItest and binCItest for similar functions for a d-separation oracle, a conditional independence test for discrete variables and a conditional independence test for binary variables, respectively.
set.seed(42)
## Generate four independent normal random variables
n <- 20
data <- matrix(rnorm(n*4),n,4)
## Compute corresponding correlation matrix
corMatrix <- cor(data)
## Test, whether variable 1 (col 1) and variable 2 (col 2) are
## independent given variable 3 (col 3) and variable 4 (col 4) on 0.05
## significance level
x <- 1
y <- 2
S <- c(3,4)
n <- 20
alpha <- 0.05
cutoff <- qnorm(1-alpha/2)
(b1 <- condIndFisherZ(x,y,S,corMatrix,n,cutoff))
# -> 1 and 2 seem to be conditionally independent given 3,4
## Now an example with conditional dependence
data <- matrix(rnorm(n*3),n,3)
data[,3] <- 2*data[,1]
corMatrix <- cor(data)
(b2 <- condIndFisherZ(1,3,2,corMatrix,n,cutoff))
# -> 1 and 3 seem to be conditionally dependent given 2
## simulate another dep.case: x -> y -> z
set.seed(29)
x <- rnorm(100)
y <- 3*x + rnorm(100)
z <- 2*y + rnorm(100)
dat <- cbind(x,y,z)
## analyze data
suffStat <- list(C = cor(dat), n = nrow(dat))
gaussCItest(1,3,NULL, suffStat) ## dependent [highly signif.]
gaussCItest(1,3, 2, suffStat) ## independent | S
Computes the correlation graph. This is the graph in which an edge is drawn between node i and node j, if the null hypothesis “Correlation between X_i and X_j is zero” can be rejected at the given significance level alpha.
corGraph(dm, alpha=0.05, Cmethod="pearson")
dm | numeric matrix with rows as samples and columns as variables. |
alpha | significance level for correlation test (numeric) |
Cmethod | a |
Undirected correlation graph, a graph-class object (package graph); getGraph for the “fitted” graph.
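The idea of a correlation graph can be sketched in base R with pairwise calls to cor.test(), returning a logical adjacency matrix instead of a graph object. The function name cor_graph_amat is hypothetical, and the use of cor.test() is an assumption of this sketch; corGraph() may compute its test differently.

```r
## dm: numeric matrix (rows = samples, columns = variables)
cor_graph_amat <- function(dm, alpha = 0.05, method = "pearson") {
  p <- ncol(dm)
  amat <- matrix(FALSE, p, p, dimnames = list(colnames(dm), colnames(dm)))
  for (i in seq_len(p - 1)) {
    for (j in (i + 1):p) {
      pv <- cor.test(dm[, i], dm[, j], method = method)$p.value
      ## draw an edge if "correlation is zero" is rejected at level alpha
      amat[i, j] <- amat[j, i] <- pv < alpha
    }
  }
  amat
}

set.seed(1)
x1 <- rnorm(100)
x2 <- rnorm(100)
mat <- cbind(x1, x2, x3 = x1 + x2)
cg <- cor_graph_amat(mat)
cg        # x3 is linked to both x1 and x2
```

Whether x1 and x2 end up linked depends on the random draw, since a test at level alpha rejects a true null hypothesis with probability alpha.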
Markus Kalisch ([email protected]) and Martin Maechler
## create correlated samples
x1 <- rnorm(100)
x2 <- rnorm(100)
mat <- cbind(x1,x2, x3 = x1+x2)
if (require(Rgraphviz)) {
  ## ``analyze the data''
  (g <- corGraph(mat)) # a 'graphNEL' graph, undirected
  plot(g) # ==> (1) and (2) are each linked to (3)
  ## use different significance level and different method
  (g2 <- corGraph(mat, alpha=0.01, Cmethod="kendall"))
  plot(g2) ## same edges as 'g'
}
Convert a DAG (Directed Acyclic Graph) to a Completed Partially Directed Acyclic Graph (CPDAG).
dag2cpdag(g)
g | an R object of class |
This function converts a DAG into its corresponding (unique) CPDAG as follows. Because every DAG in the Markov equivalence class described by a CPDAG shares the same skeleton and the same v-structures, this function takes the skeleton and the v-structures of the given DAG g. Afterwards it simply uses the 3 orientation rules of the PC algorithm (see references) to orient as many of the remaining undirected edges as possible.
The function is a simple wrapper function for dag2essgraph which is more powerful since it also allows the calculation of the Markov equivalence class in the presence of interventional data.
The output of this function is exactly the same as the one using pc(suffStat, indepTest, alpha, labels) using the true correlation matrix in the function gaussCItest with a large virtual sample size and a large alpha, but it is much faster.
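The first step described above, reading off the skeleton and the v-structures of a DAG, can be sketched in base R as follows. The function name skeleton_and_vstructs is hypothetical, and the sketch assumes that amat[i, j] != 0 encodes the edge i -> j; the orientation rules that complete the CPDAG are not implemented here.

```r
## amat: square adjacency matrix of a DAG, amat[i, j] != 0 read as i -> j (assumption)
skeleton_and_vstructs <- function(amat) {
  A <- amat != 0
  skel <- A | t(A)                        # adjacency without directions
  vstructs <- list()
  for (b in seq_len(nrow(A))) {
    pa <- which(A[, b])                   # parents of b
    if (length(pa) < 2) next
    idx <- combn(length(pa), 2)           # all pairs of parents (by position)
    for (k in seq_len(ncol(idx))) {
      a1 <- pa[idx[1, k]]
      a2 <- pa[idx[2, k]]
      ## v-structure a1 -> b <- a2 requires a1 and a2 to be non-adjacent
      if (!skel[a1, a2]) {
        vstructs[[length(vstructs) + 1]] <- unname(c(a1, b, a2))
      }
    }
  }
  list(skeleton = skel, vstructs = vstructs)
}

V <- LETTERS[1:3]
am_coll <- matrix(0, 3, 3, dimnames = list(V, V))
am_coll["A", "B"] <- am_coll["C", "B"] <- 1      # A -> B <- C
am_chain <- am_coll * 0
am_chain["A", "B"] <- am_chain["B", "C"] <- 1    # A -> B -> C
res_coll  <- skeleton_and_vstructs(am_coll)
res_chain <- skeleton_and_vstructs(am_chain)
```

Both DAGs have the same skeleton, but only the first contains a v-structure, which is why the two are not Markov equivalent and lead to different CPDAGs.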
A graph object containing the CPDAG.
Markus Kalisch ([email protected]) and Alain Hauser([email protected])
C. Meek (1995). Causal inference and causal explanation with background knowledge. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), pp. 403-411. Morgan Kaufmann Publishers, Inc.
P. Spirtes, C. Glymour and R. Scheines (2000) Causation, Prediction, and Search, 2nd edition, The MIT Press.
## A -> B <- C
am1 <- matrix(c(0,1,0, 0,0,0, 0,1,0), 3,3)
colnames(am1) <- rownames(am1) <- LETTERS[1:3]
g1 <- as(t(am1), "graphNEL") ## convert to graph
cpdag1 <- dag2cpdag(g1)
if(requireNamespace("Rgraphviz")) {
  par(mfrow = c(1,2))
  plot(g1)
  plot(cpdag1)
}
## A -> B -> C
am2 <- matrix(c(0,1,0, 0,0,1, 0,0,0), 3,3)
colnames(am2) <- rownames(am2) <- LETTERS[1:3]
g2 <- as(t(am2), "graphNEL") ## convert to graph
cpdag2 <- dag2cpdag(g2)
if(requireNamespace("Rgraphviz")) {
  par(mfrow = c(1,2))
  plot(g2)
  plot(cpdag2)
}
Convert a DAG to an (interventional or observational) essential graph.
dag2essgraph(dag, targets = list(integer(0)))
dag | The DAG whose essential graph has to be calculated. Different representations are possible: |
targets | List of intervention targets with respect to which the essential graph has to be calculated. An observational setting is represented by one single empty target ( |
This function converts a DAG to its corresponding (interventional or observational) essential graph, using the algorithm of Hauser and Bühlmann (2012).
The essential graph is a partially directed graph that represents the (interventional or observational) Markov equivalence class of a DAG. It has the same skeleton as the DAG; a directed edge represents an arrow that has a common orientation in all representatives of the (interventional or observational) Markov equivalence class, whereas an undirected edge represents an arrow that has different orientations in different representatives of the equivalence class. In the observational case, the essential graph is also known as “CPDAG” (Spirtes et al., 2000).
In a purely observational setting (i.e., if targets = list(integer(0))), the function yields the same graph as dag2cpdag.
Depending on the class of dag, the essential graph is returned as an instance of graphNEL, if dag is an instance of graphNEL, or as an instance of EssGraph, if dag is an instance of a class derived from ParDAG.
Alain Hauser ([email protected])
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
P. Spirtes, C.N. Glymour, and R. Scheines (2000). Causation, Prediction, and Search, MIT Press, Cambridge (MA).
p <- 10 # Number of random variables
s <- 0.4 # Sparseness of the DAG
## Generate a random DAG
set.seed(42)
require(graph)
dag <- randomDAG(p, s)
nodes(dag) <- sprintf("V%d", 1:p)
## Calculate observational essential graph
res.obs <- dag2essgraph(dag)
## Different argument classes
res2 <- dag2essgraph(as(dag, "GaussParDAG"))
str(res2)
## Calculate interventional essential graph for intervention targets
## {1} and {3}
res.int <- dag2essgraph(dag, as.list(c(1, 3)))
Convert a DAG with latent variables into its corresponding (unique) Partial Ancestral Graph (PAG).
dag2pag(suffStat, indepTest, graph, L, alpha, rules = rep(TRUE,10), verbose = FALSE)
suffStat | the sufficient statistics, a |
indepTest | a |
graph | a DAG with |
L | array containing the labels of the nodes in the |
alpha | significance level in |
rules | logical vector of length 10 indicating which rules should be used when directing edges. The order of the rules is taken from Zhang (2009). |
verbose | logical; if |
This function converts a DAG (graph object) with latent variables into its corresponding (unique) PAG, an fciAlgo class object, using the ancestor information and conditional independence tests entailed in the true DAG. The output of this function is exactly the same as the one using fci(suffStat, gaussCItest, p, alpha, rules = rep(TRUE, 10)) using the true correlation matrix in gaussCItest() with a large “virtual sample size” and a large alpha, but it is much faster, see the example.
An object of class fciAlgo, containing the estimated graph (in the form of an adjacency matrix with various possible edge marks), the conditioning sets that lead to edge removals (sepset) and several other parameters.
Diego Colombo and Markus Kalisch ([email protected]).
Richardson, T. and Spirtes, P. (2002). Ancestral graph Markov models. Ann. Statist. 30, 962–1030; Theorem 4.2., page 983.
## create the graph
set.seed(78)
g <- randomDAG(10, prob = 0.25)
graph::nodes(g) # "1" "2" ... "10" % FIXME: should be kept in result!
## define nodes 2 and 6 to be latent variables
L <- c(2,6)
## compute the true covariance matrix of g
cov.mat <- trueCov(g)
## transform covariance matrix into a correlation matrix
true.corr <- cov2cor(cov.mat)
## Find PAG
## as dependence "oracle", we use the true correlation matrix in
## gaussCItest() with a large "virtual sample size" and a large alpha:
system.time(
  true.pag <- dag2pag(suffStat = list(C = true.corr, n = 10^9),
                      indepTest = gaussCItest,
                      graph=g, L=L, alpha = 0.9999)
)
### ---- Find PAG using fci-function --------------------------
## From trueCov(g), delete rows and columns belonging to latent variable L
true.cov1 <- cov.mat[-L,-L]
## transform covariance matrix into a correlation matrix
true.corr1 <- cov2cor(true.cov1)
## Find PAG with FCI algorithm
## as dependence "oracle", we use the true correlation matrix in
## gaussCItest() with a large "virtual sample size" and a large alpha:
system.time(
  true.pag1 <- fci(suffStat = list(C = true.corr1, n = 10^9),
                   indepTest = gaussCItest,
                   p = ncol(true.corr1), alpha = 0.9999)
)
## confirm that the outputs are equal
stopifnot(true.pag@amat == true.pag1@amat)
G² statistic to test for (conditional) independence of discrete (each with a finite number of “levels”) variables X and Y given the (possibly empty) set of discrete variables S.
disCItest() is a wrapper of gSquareDis(), to be easily used in skeleton, pc and fci.
gSquareDis(x, y, S, dm, nlev, adaptDF = FALSE, n.min = 10*df, verbose = FALSE)
disCItest (x, y, S, suffStat)
x, y | (integer) position of variable |
S | (integer) positions of zero or more conditioning variables in the adjacency matrix. |
dm | data matrix (rows: samples, columns: variables) with integer entries; the k levels for a given column must be coded by the integers 0,1,...,k-1. (see example) |
nlev | optional vector with numbers of levels for each variable in |
adaptDF | logical specifying if the degrees of freedom should be lowered by one for each zero count. The value for the degrees of freedom cannot go below 1. |
n.min | the smallest |
verbose | logical or integer indicating that increased diagnostic output is to be provided. |
suffStat | a |
The statistic is used to test for (conditional) independence of X and Y given a set S (can be NULL). If only binary variables are involved, gSquareBin is a specialized (a bit more efficient) alternative to gSquareDis().
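The default n.min = 10*df in the usage ties the minimum sample size to the degrees of freedom of the test. As a small worked sketch, the code below assumes the usual degrees-of-freedom formula for a G² test of X and Y given S, (|X| - 1)(|Y| - 1) times the number of level combinations of S; the function name g2_df is hypothetical, and that gSquareDis() uses exactly this formula is an assumption here.

```r
## nlev: number of levels per variable; x, y, S: variable positions
g2_df <- function(x, y, S, nlev) {
  (nlev[x] - 1) * (nlev[y] - 1) * prod(nlev[S])   # prod() of nothing is 1
}

nlev <- c(3, 4, 2)                 # levels of three variables
df <- g2_df(1, 3, 2, nlev)         # (3 - 1) * (2 - 1) * 4 = 8
n_min <- 10 * df                   # default threshold: 80 samples
c(df = df, n.min = n_min)
```

Under this assumption a data set of 100 rows passes the threshold for testing variables 1 and 3 given variable 2, while a sub-sample of 60 rows does not.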
The p-value of the test.
Nicoletta Andri and Markus Kalisch ([email protected]).
R.E. Neapolitan (2004). Learning Bayesian Networks. Prentice Hall Series in Artificial Intelligence. Chapter 10.3.1
gSquareBin for a (conditional) independence test for binary variables.
dsepTest, gaussCItest and binCItest for similar functions for a d-separation oracle, a conditional independence test for gaussian variables and a conditional independence test for binary variables, respectively.
## Simulate data
n <- 100
set.seed(123)
x <- sample(0:2, n, TRUE) ## three levels
y <- sample(0:3, n, TRUE) ## four levels
z <- sample(0:1, n, TRUE) ## two levels
dat <- cbind(x,y,z)

## Analyze data
gSquareDis(1,3, S=2, dat, nlev = c(3,4,2)) # but nlev is optional:
gSquareDis(1,3, S=2, dat, verbose=TRUE, adaptDF=TRUE)
## with too little data, gives a warning (and p-value 1):
gSquareDis(1,3, S=2, dat[1:60,], nlev = c(3,4,2))

suffStat <- list(dm = dat, nlev = c(3,4,2), adaptDF = FALSE)
disCItest(1,3,2,suffStat)
Let x and y be two distinct vertices in a mixed graph G. This function computes D-SEP(x,y,G), which is defined as follows:
A node v is in D-SEP(x,y,G) iff v is not equal to x and there is a collider path between x and v in G such that every vertex on this path is an ancestor of x or y in G.
See p.136 of Spirtes et al. (2000) or Definition 4.1 of Maathuis and Colombo (2015).
dreach(x, y, amat, verbose = FALSE)
x |
First argument of D-SEP, given as the column number of the node in the adjacency matrix. |
y |
Second argument of D-SEP, given as the column number of the
node in the adjacency matrix ( |
amat |
Adjacency matrix of type amat.pag. |
verbose |
Logical specifying whether details should be printed. |
Vector of column positions indicating the nodes in D-SEP(x,y,G).
Diego Colombo and Markus Kalisch ([email protected])
P. Spirtes, C. Glymour and R. Scheines (2000) Causation, Prediction, and Search, 2nd edition, The MIT Press.
M.H. Maathuis and D. Colombo (2015). A generalized back-door criterion. Annals of Statistics 43 1060-1088.
backdoor uses this function; pag2magAM.
This function tests for d-separation of nodes in a DAG.
dsep(a, b, S=NULL, g, john.pairs = NULL)
a |
Label (sic!) of node A |
b |
Label (sic!) of node B |
S |
Labels (sic!) of set of nodes on which it is conditioned, maybe empty |
g |
The Directed Acyclic Graph (object of |
john.pairs |
The shortest path distance matrix for all pairs of
nodes as computed (also by default) in
|
This function checks separation in the moralized graph as explained in Lauritzen (2004).
TRUE if a and b are d-separated by S in G, otherwise FALSE.
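The moralization criterion can be sketched in a few lines of base R. The helper dsep_sketch below is hypothetical and is not how dsep() is implemented: it takes a 0/1 adjacency matrix with A[i, j] = 1 meaning i -> j and node indices instead of labels, restricts the DAG to the ancestors of a, b and S, moralizes it, removes S and checks reachability.

```r
## d-separation via the moralized ancestral graph (illustrative only).
## A[i, j] = 1 encodes the directed edge i -> j.
dsep_sketch <- function(A, a, b, S = integer(0)) {
  ## 1. smallest ancestral set containing a, b and S
  keep <- unique(c(a, b, S))
  repeat {
    parents <- which(rowSums(A[, keep, drop = FALSE]) > 0)
    new <- setdiff(parents, keep)
    if (length(new) == 0) break
    keep <- c(keep, new)
  }
  sub <- A[keep, keep, drop = FALSE]
  ## 2. moralize: drop directions and marry parents of a common child
  M <- (sub + t(sub)) > 0
  for (child in seq_along(keep)) {
    pa <- which(sub[, child] > 0)
    M[pa, pa] <- TRUE
  }
  diag(M) <- FALSE
  ## 3. delete S and test whether b can be reached from a
  alive  <- !(keep %in% S)
  start  <- match(a, keep)
  target <- match(b, keep)
  reached  <- start
  frontier <- start
  while (length(frontier) > 0) {
    nb <- which(colSums(M[frontier, , drop = FALSE]) > 0 & alive)
    frontier <- setdiff(nb, reached)
    reached  <- c(reached, frontier)
  }
  !(target %in% reached)
}

## Toy DAG: 1 -> 3 <- 2, 3 -> 4
A <- matrix(0, 4, 4)
A[1, 3] <- A[2, 3] <- A[3, 4] <- 1

sep_marginal   <- dsep_sketch(A, 1, 2)          # collider 3 blocks the path
sep_given_desc <- dsep_sketch(A, 1, 2, S = 4)   # descendant of the collider opens it
sep_chain      <- dsep_sketch(A, 1, 4, S = 3)   # conditioning on 3 blocks 1 -> 3 -> 4
```

Restricting to the ancestral set first is what keeps an unconditioned collider from creating a spurious connection in the moral graph.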
Markus Kalisch ([email protected])
S.L. Lauritzen (2004), Graphical Models, Oxford University Press, Chapter 3.2.2
dsepTest for a wrapper of this function that can easily be included into skeleton, pc, fci or fciPlus.
dsepAM for a similar function for MAGs.
## generate random DAG
p <- 8
set.seed(45)
myDAG <- randomDAG(p, prob = 0.3)
if (require(Rgraphviz)) {
  plot(myDAG)
}

## Examples for d-separation
dsep("1","7",NULL,myDAG)
dsep("4","5",NULL,myDAG)
dsep("4","5","2",myDAG)
dsep("4","5",c("2","3"),myDAG)

## Examples for d-connection
dsep("1","3",NULL,myDAG)
dsep("1","6","3",myDAG)
dsep("4","5","8",myDAG)
This function tests for d-separation (also known as m-separation) of nodes X and nodes Y given nodes S in a MAG.
dsepAM(X, Y, S = NULL, amat, verbose=FALSE)
X |
Vector of column numbers of nodes |
Y |
Vector of column numbers of nodes |
S |
Vector of column numbers of nodes |
amat |
The Maximal Ancestral Graph encoded as adjacency matrix of type amatType |
verbose |
If true, more detailed output is provided. |
This function checks separation in the moralized graph as explained in Richardson and Spirtes (2002).
TRUE if X and Y are d-separated by S in the MAG encoded by amat, otherwise FALSE.
Markus Kalisch ([email protected]), Joris Mooij
T.S. Richardson and P. Spirtes (2002). Ancestral graph Markov models. Annals of Statistics 30 962-1030.
dsepAMTest for a wrapper of this function that can easily be included into skeleton, fci or fciPlus. dsep for a similar function for DAGs.
# Y-structure MAG
# Encode as adjacency matrix
p <- 4 # total number of variables
V <- c("X1","X2","X3","X4") # variable labels
# amat[i,j] = 0 iff no edge btw i,j
# amat[i,j] = 1 iff i *-o j
# amat[i,j] = 2 iff i *-> j
# amat[i,j] = 3 iff i *-- j
amat <- rbind(c(0,0,2,0),
              c(0,0,2,0),
              c(3,3,0,2),
              c(0,0,3,0))
rownames(amat)<-V
colnames(amat)<-V

## d-separated
cat('X1 d-separated from X2? ', dsepAM(1,2,S=NULL,amat),'\n')
## not d-separated given node 4
cat('X1 d-separated from X2 given X4? ', dsepAM(1,2,S=4,amat),'\n')
## not d-separated by node 3 and 4
cat('X1 d-separated from X2 given X3 and X4? ', dsepAM(1,2,S=c(3,4),amat),'\n')
This function tests for d-separation (also known as m-separation) of node x and node y given nodes S in a MAG.
dsepAMTest() is written to be easily used in skeleton, fci, fciPlus.
dsepAMTest(x, y, S = NULL, suffStat)
x |
Column number of node |
y |
Column number of node |
S |
Vector of column numbers of nodes |
suffStat |
a
|
The function is a wrapper for dsepAM, which checks separation in the moralized graph as explained in Richardson and Spirtes (2002).
Returns 1 if x and y are d-separated by S in the MAG encoded by amat, otherwise 0.
This is analogous to the p-value of an ideal (without sampling error) conditional independence test on any distribution that is faithful to the MAG.
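The following base-R sketch illustrates why a 0/1 return value slots into the indepTest(x, y, S, suffStat) interface like a p-value. Both as_oracle_test and chain_sep are hypothetical helpers written for this illustration; they are not part of pcalg, and the comparison against alpha is only meant to mirror how a p-value is typically thresholded.

```r
## Wrap a logical separation oracle so it has the indepTest signature
## and returns 1 (separated) or 0 (connected). Illustrative helper only.
as_oracle_test <- function(is_separated) {
  function(x, y, S, suffStat) as.numeric(is_separated(x, y, S, suffStat$g))
}

## Toy oracle for the chain 1 -> 2 -> 3:
## nodes 1 and 3 are separated only when node 2 is in the conditioning set.
chain_sep <- function(x, y, S, g) {
  setequal(c(x, y), c(1, 3)) && 2 %in% S
}
toy_test <- as_oracle_test(chain_sep)

alpha <- 0.5
p_sep  <- toy_test(1, 3, S = 2,          suffStat = list(g = NULL))
p_conn <- toy_test(1, 3, S = integer(0), suffStat = list(g = NULL))

## Any alpha strictly between 0 and 1 gives the same decisions:
indep_accepted <- p_sep  >= alpha   # "p-value" 1: independence is accepted
dep_detected   <- p_conn <  alpha   # "p-value" 0: dependence is detected
```

Because the oracle only ever returns 0 or 1, the particular value of alpha is irrelevant as long as it lies strictly between the two, which is why the examples can use a value such as 0.5.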
Markus Kalisch ([email protected]), Joris Mooij
T.S. Richardson and P. Spirtes (2002). Ancestral graph Markov models. Annals of Statistics 30 962-1030.
dsepTest for a similar function for DAGs.
gaussCItest, disCItest and binCItest for similar functions for a conditional independence test for gaussian, discrete and binary variables, respectively.
# Y-structure MAG
# Encode as adjacency matrix
p <- 4 # total number of variables
V <- c("X1","X2","X3","X4") # variable labels
# amat[i,j] = 0 iff no edge btw i,j
# amat[i,j] = 1 iff i *-o j
# amat[i,j] = 2 iff i *-> j
# amat[i,j] = 3 iff i *-- j
amat <- rbind(c(0,0,2,0),
              c(0,0,2,0),
              c(3,3,0,2),
              c(0,0,3,0))
rownames(amat)<-V
colnames(amat)<-V
suffStat<-list(g=amat,verbose=FALSE)

## d-separated
cat('X1 d-separated from X2? ', dsepAMTest(1,2,S=NULL,suffStat),'\n')
## not d-separated given node 4
cat('X1 d-separated from X2 given X4? ', dsepAMTest(1,2,S=4,suffStat),'\n')
## not d-separated by node 3 and 4
cat('X1 d-separated from X2 given X3 and X4? ', dsepAMTest(1,2,S=c(3,4), suffStat),'\n')

# Derive PAG that represents the Markov equivalence class of the MAG with the FCI algorithm
# Make use of d-separation oracle as "independence test"
indepTest <- dsepAMTest
fci.pag <- fci(suffStat,indepTest,alpha = 0.5,labels = V,verbose=FALSE)
cat('True MAG:\n')
print(amat)
cat('PAG output by FCI:\n')
print(fci.pag@amat)
Tests for d-separation of nodes in a DAG. dsepTest() is written to be easily used in skeleton, pc, fci.
dsepTest(x, y, S=NULL, suffStat)
x , y
|
(integer) position of variable |
S |
(integer) positions of zero or more conditioning variables in the adjacency matrix. |
suffStat |
a
|
The function is based on dsep. For details on d-separation see the reference Lauritzen (2004).
If x and y are d-separated by S in DAG G the result is 1, otherwise it is 0. This is analogous to the p-value of an ideal (without sampling error) conditional independence test on any distribution that is faithful to the DAG G.
Markus Kalisch ([email protected])
S.L. Lauritzen (2004), Graphical Models, Oxford University Press.
dsepAMTest for a similar function for MAGs.
gaussCItest, disCItest and binCItest for similar functions for a conditional independence test for gaussian, discrete and binary variables, respectively.
p <- 8
set.seed(45)
myDAG <- randomDAG(p, prob = 0.3)

if (require(Rgraphviz)) {
  ## plot the DAG
  plot(myDAG, main = "randomDAG(8, prob = 0.3)")
}

## define sufficient statistics (d-separation oracle)
suffStat <- list(g = myDAG, jp = RBGL::johnson.all.pairs.sp(myDAG))

dsepTest(1,6, S= NULL, suffStat) ## not d-separated
dsepTest(1,6, S= 3, suffStat) ## not d-separated by node 3
dsepTest(1,6, S= c(3,4),suffStat) ## d-separated by node 3 and 4
"EssGraph"
This class represents an (observational or interventional) essential graph.
An observational or interventional Markov equivalence class of DAGs can be uniquely represented by a partially directed graph, the essential graph. Its edges have the following interpretation:
a directed edge stands for an arrow
that has the same orientation in all representatives of the
Markov equivalence class;
an undirected edge stands for an arrow that is oriented in one
way in some representatives of the equivalence class and in the other way
in other representatives of the equivalence class.
All reference classes extend and inherit methods from "envRefClass".
new("EssGraph", nodes, in.edges, ...)
nodes
Vector of node names; cf. also field .nodes
.
in.edges
A list of length p
consisting of index
vectors indicating the edges pointing into the nodes of the DAG.
.nodes
:Vector of node names; defaults to as.character(1:p)
,
where p
denotes the number of nodes (variables) of the model.
.in.edges
:A list of length p
consisting of index
vectors indicating the edges pointing into the nodes of the DAG.
targets
List of mutually exclusive intervention targets with respect to which Markov equivalence is defined.
score
:Object of class Score
; used
internally for score-based causal inference.
Most class-based methods are only for internal use. Methods of interest for the user are:
repr()
:Yields a representative causal model of the
equivalence class, an object of a class derived from
Score
. Since the representative is not only
characterized by the DAG, but also by appropriate parameters, the field
score
must be assigned for this method to work. The DAG is
drawn at random; note that all representatives are statistically
indistinguishable under a given set of intervention targets.
node.count()
:Yields the number of nodes of the essential graph.
edge.count()
:Yields the number of edges of the essential graph. Note that unoriented edges count as 2, whereas oriented edges count as 1 due to the internal representation.
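The counting convention can be illustrated in base R. The sketch below assumes, as the note on edge.count() suggests, that an undirected edge is stored as a pair of opposite entries in the in-edge lists; the variable names are made up for the illustration and no EssGraph object is created.

```r
## In-edge lists for a 4-node graph with edges 1 -> 2, 2 -> 3 and 3 - 4.
## Assumption: the undirected edge 3 - 4 is stored in both directions.
in.edges <- list(integer(0),   # nothing points into node 1
                 1L,           # 1 -> 2
                 c(2L, 4L),    # 2 -> 3 and one half of 3 - 4
                 3L)           # the other half of 3 - 4

## Count in the style described above: every stored in-edge counts once
stored_count <- sum(lengths(in.edges))

## Separate directed from undirected edges via a 0/1 adjacency matrix
p <- length(in.edges)
A <- matrix(0L, p, p)
for (j in seq_len(p)) A[in.edges[[j]], j] <- 1L
n_undirected <- sum(A & t(A)) / 2
n_directed   <- sum(A) - 2 * n_undirected
```

Here the graph has three distinct edges, but the stored count is four because the undirected edge contributes twice.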
signature(x = "EssGraph", y = "ANY")
: plots the
essential graph. In the plot, undirected and bidirected edges are equivalent.
Alain Hauser ([email protected])
showClass("EssGraph")
Estimate a Partial Ancestral Graph (PAG) from observational data, using the FCI (Fast Causal Inference) algorithm, or from a combination of data from different (e.g., observational and interventional) contexts, using the FCI-JCI (Joint Causal Inference) extension.
fci(suffStat, indepTest, alpha, labels, p, skel.method = c("stable", "original", "stable.fast"), type = c("normal", "anytime", "adaptive"), fixedGaps = NULL, fixedEdges = NULL, NAdelete = TRUE, m.max = Inf, pdsep.max = Inf, rules = rep(TRUE, 10), doPdsep = TRUE, biCC = FALSE, conservative = FALSE, maj.rule = FALSE, numCores = 1, selectionBias = TRUE, jci = c("0","1","12","123"), contextVars = NULL, verbose = FALSE)
suffStat |
sufficient statistics: A named |
indepTest |
a |
alpha |
numeric significance level (in |
labels |
(optional) |
p |
(optional) number of variables (or nodes). May be specified
if |
skel.method |
character string specifying method; the default,
|
type |
character string specifying the version of the FCI
algorithm to be used. By default, it is |
fixedGaps |
|
fixedEdges |
logical matrix of dimension p*p. If entry
|
NAdelete |
If indepTest returns |
m.max |
Maximum size of the conditioning sets that are considered in the conditional independence tests. |
pdsep.max |
Maximum size of Possible-D-SEP for which subsets are
considered as conditioning sets in the conditional independence
tests. If the nodes |
rules |
Logical vector of length 10 indicating which rules should be used when directing edges. The order of the rules is taken from Zhang (2008). |
doPdsep |
If |
biCC |
If |
conservative |
Logical indicating if the unshielded triples should be checked for ambiguity the second time when v-structures are determined. For more information, see details. |
maj.rule |
Logical indicating if the unshielded triples should be checked for ambiguity the second time when v-structures are determined using a majority rule idea, which is less strict than the standard conservative. For more information, see details. |
numCores |
Specifies the number of cores to be used for parallel
estimation of |
selectionBias |
If |
jci |
String specifying the JCI assumptions that are used. It can be one of:
For more information, see Mooij et al. (2020). |
contextVars |
Subset of variable indices {1,...,p} that will be treated as context variables in the JCI extension of FCI. |
verbose |
If true, more detailed output is provided. |
This function is a generalization of the PC algorithm (see pc),
in the sense that it allows arbitrarily many latent and selection variables.
Under the assumption that the data are faithful to a DAG that includes all
latent and selection variables, the FCI algorithm (Fast Causal Inference
algorithm) (Spirtes, Glymour and Scheines, 2000) estimates the Markov
equivalence class of MAGs that describe the conditional independence
relationships between the observed variables. Under the assumption that the
data are σ-faithful to a simple (possibly cyclic) SCM that allows
for latent confounding (but selection bias is absent), the FCI algorithm
estimates the σ-Markov equivalence class of the DMGs (directed
mixed graphs) that describe the causal relations between the observed
variables and the conditional independence relationships in the observed
distribution through σ-separation (Mooij and Claassen, 2020). The
FCI-JCI (Joint Causal Inference) extension allows the algorithm to combine
data from different contexts, for example, observational and different types
of interventional data (Mooij et al., 2020).
FCI estimates a partial ancestral graph (PAG). The PAG
represents a Markov equivalence class of DAGs with latent and selection
variables in the acyclic case (Zhang, 2008), and a Markov equivalence class
of directed graphs with latent variables (but without selection variables)
in the cyclic σ-separation case.
A PAG contains the following types of edges: o-o, o--, o->, -->, <->, ---.
The bidirected edges come from latent confounders,
and the undirected edges come from latent selection variables. The edges have
the following interpretation: (i) there is an edge between x and y
if and only if variables x and y are conditionally dependent
given S for all sets S consisting of all selection variables
and a subset of the observed variables (assuming the Markov property and
faithfulness); (ii) a tail x --* y at x on an edge between x and y means that
x is an ancestor of y or S; (iii) an arrowhead x <-* y at x on an edge between
x and y means that x is not an ancestor of y, nor of S;
(iv) a circle mark x o-* y at x on an edge between x and y means that there
exists both a graph in the Markov equivalence class where x is ancestor of
y or S, and one where x is not ancestor of y, nor of S. For further information
on the interpretation of PAGs see e.g. (Zhang, 2008) and (Mooij and Claassen, 2020).
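As a reading aid for these edge marks, the following base-R sketch turns an adjacency matrix in the coding used in the examples on this page (amat[i,j] = 0: no edge, 1: circle at j, 2: arrowhead at j, 3: tail at j) into edge strings. The helper describe_edges is hypothetical and not part of pcalg.

```r
## Print each edge of a PAG/MAG adjacency matrix with its two end marks.
## amat[i, j] is the mark at node j on the edge between i and j:
##   1 = circle, 2 = arrowhead, 3 = tail (0 = no edge).
describe_edges <- function(amat) {
  left  <- c("o", "<", "-")   # mark drawn at the first node
  right <- c("o", ">", "-")   # mark drawn at the second node
  labs <- colnames(amat)
  if (is.null(labs)) labs <- as.character(seq_len(ncol(amat)))
  out <- character(0)
  for (i in seq_len(nrow(amat) - 1)) {
    for (j in (i + 1):ncol(amat)) {
      if (amat[i, j] != 0) {
        out <- c(out, paste0(labs[i], " ", left[amat[j, i]], "-",
                             right[amat[i, j]], " ", labs[j]))
      }
    }
  }
  out
}

## Y-structure MAG: X1 -> X3 <- X2, X3 -> X4
V <- c("X1", "X2", "X3", "X4")
amat <- rbind(c(0, 0, 2, 0),
              c(0, 0, 2, 0),
              c(3, 3, 0, 2),
              c(0, 0, 3, 0))
dimnames(amat) <- list(V, V)
edges_mag <- describe_edges(amat)

## A PAG with circle marks
pag <- rbind(c(0, 1, 1, 0),
             c(1, 0, 0, 2),
             c(1, 0, 0, 2),
             c(0, 3, 3, 0))
edges_pag <- describe_edges(pag)
```

Reading the matrix column-wise for the mark at the far end and row-wise for the mark at the near end is the part that is easy to get backwards, which is why a small decoder like this can help when inspecting an estimated amat.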
The first part of the FCI algorithm is analogous to the PC algorithm. It
starts with a complete undirected graph and estimates an initial skeleton
using skeleton(*, method="stable") which produces an
initial order-independent skeleton, see skeleton for
more details. All edges of this skeleton are of
the form o-o. Due to the presence of hidden variables, it is no longer
sufficient to consider only subsets of the neighborhoods of nodes x and y
to decide whether the edge x o-o y should be removed.
Therefore, the initial skeleton may contain some superfluous edges.
These edges are removed in the next step of the algorithm which
requires some orientations. Therefore, the v-structures
are determined using the conservative method (see discussion on
conservative below).
After the v-structures have been oriented, Possible-D-SEP sets for each
node in the graph are computed at once. To decide whether edge
x o-o y should be removed, one performs conditional independence
tests of x and y given all possible subsets of Possible-D-SEP(x) and
of Possible-D-SEP(y). The edge is removed if a conditional
independence is found. This produces a fully order-independent final
skeleton as explained in Colombo and Maathuis (2014). Subsequently,
the v-structures are newly determined on the final skeleton (using
information in sepset). Finally, as many as possible undetermined edge
marks (o) are determined using (a subset of) the 10 orientation rules
given by Zhang (2008).
The “Anytime FCI” algorithm was introduced by Spirtes (2001). It
can be viewed as a modification of the FCI algorithm that only performs
conditional independence tests up to and including order m.max when
finding the initial skeleton, using the function skeleton, and
the final skeleton, using the function pdsep. Thus, Anytime FCI
performs fewer conditional independence tests than FCI. To use the
Anytime algorithm, one sets type = "anytime" and needs to
specify m.max, the maximum size of the conditioning sets.
The “Adaptive Anytime FCI” algorithm was introduced by Colombo
et al. (2012). The first part of the algorithm is identical to the normal
FCI described above. But in the second part when the final skeleton is
estimated using the function pdsep, the Adaptive Anytime
FCI algorithm only performs conditional independence tests up to and
including order m.max, where m.max is the maximum size of the
conditioning sets that were considered to determine the initial
skeleton using the function skeleton. Thus, m.max is chosen
adaptively and does not have to be specified by the user.
Conservative versions of FCI, Anytime FCI, and Adaptive Anytime FCI
are computed if conservative = TRUE is specified. After the
final skeleton is computed, all potential
v-structures a-b-c are checked in the following way. We test whether a
and c are independent conditioning on any subset of the neighbors of a
or any subset of the neighbors of c. When a subset makes a and c
conditionally independent, we call it a separating set. If b is in no
such separating set or in all such separating sets, no further action
is taken and the normal version of the FCI, Anytime FCI, or Adaptive
Anytime FCI algorithm is continued. If, however, b is in only some
separating sets, the triple a-b-c is marked ‘ambiguous’. If a is
independent of c given some S in the skeleton (i.e., the edge a-c
dropped out), but a and c remain dependent given all subsets of
neighbors of either a or c, we will call all triples a-b-c
‘unambiguous’. This is because in the FCI algorithm, the true separating set
might be outside the neighborhood of either a or c. An ambiguous
triple is not oriented as a v-structure. Furthermore, no further
orientation rule that needs to know whether a-b-c is a v-structure or
not is applied. Instead of using the conservative version, which is
quite strict towards the v-structures, Colombo and Maathuis (2014)
introduced a less strict version for the v-structures called majority
rule. This adaptation can be called using maj.rule = TRUE. In
this case, the triple a-b-c is marked as ‘ambiguous’ if and only if b
is in exactly 50 percent of such separating sets or no separating set
was found. If b is in less than 50 percent of the separating sets it
is set as a v-structure, and if in more than 50 percent it is set as a
non v-structure (for more details see Colombo and Maathuis,
2014). Colombo and Maathuis (2014) showed that with both these
modifications, the final skeleton and the decisions about the
v-structures of the FCI algorithm are fully order-independent.
Note that the order-dependence issues on the 10 orientation rules are
still present, see Colombo and Maathuis (2014) for more details.
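The two decision rules for an unshielded triple a-b-c can be summarized in a small base-R sketch. The helper classify_triple is hypothetical and only mirrors the verbal description above; it takes the list of separating sets found for a and c and is not the code used inside fci().

```r
## Classify an unshielded triple a-b-c from the separating sets of (a, c).
## sepsets: list of integer vectors; b: index of the middle node.
classify_triple <- function(b, sepsets, rule = c("conservative", "majority")) {
  rule <- match.arg(rule)
  if (length(sepsets) == 0) {
    ## as described above: the conservative rule treats the triple as
    ## unambiguous, the majority rule marks it as ambiguous
    return(if (rule == "conservative") "unambiguous" else "ambiguous")
  }
  ## fraction of separating sets that contain b
  frac <- mean(vapply(sepsets, function(s) b %in% s, logical(1)))
  if (rule == "conservative") {
    if (frac == 0 || frac == 1) "unambiguous" else "ambiguous"
  } else {
    if (frac < 0.5) "v-structure"
    else if (frac > 0.5) "non-v-structure"
    else "ambiguous"
  }
}

## b = 2 appears in one of three separating sets
seps <- list(c(2, 5), 5, integer(0))
res_cons <- classify_triple(2, seps, "conservative")
res_maj  <- classify_triple(2, seps, "majority")

## exactly half of the separating sets contain b
res_tie  <- classify_triple(2, list(2, 5), "majority")
```

The example shows the practical difference: a triple that the conservative rule leaves ambiguous can still be oriented as a v-structure under the majority rule.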
The FCI-JCI extension of FCI was introduced by Mooij et al. (2020). It is an
implementation of the Joint Causal Inference (JCI) framework that
reduces causal discovery from several data sets corresponding to different
contexts (e.g., observational and different interventional settings, or data
measured in different labs or in different countries) to causal discovery
from the pooled data (treated as a single 'observational' data set). Two
types of variables are distinguished in the JCI framework: system variables
(describing aspects of the system in some environment) and context variables
(describing aspects of the environment of the system). Different
assumptions regarding the causal relations between context variables and
system variables can be made. The most common assumption (JCI Assumption 1:
'exogeneity') is that no system variable can affect any context variable.
The second, less common, assumption (JCI Assumption 2: 'complete randomized
context') is that there is no latent confounding between system and context.
The third assumption (JCI Assumption 3: 'generic context model') is in fact
a faithfulness assumption on the context distribution. The FCI-JCI extension
can be used by specifying the subset of the variables that are designated as
context variables with the contextVars argument, and by specifying
the combination of JCI assumptions that is used with the jci argument.
For the default values of these arguments, the JCI extension is not used.
The only difference between the FCI-JCI extension and the standard FCI
algorithm is that the background knowledge about the PAG implied by the
JCI assumptions is exploited at several points in the FCI-JCI algorithm to
enforce adjacencies between context variables (in case of JCI Assumption 3)
and to orient certain edges adjacent to context variables (in case of JCI
Assumptions 1, 2 or 3).
The current JCI framework assumes that no selection bias is present, and
therefore the FCI-JCI extension should be called with
selectionBias = FALSE. For more details on FCI-JCI, and the general
JCI framework, see Mooij et al. (2020).
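The pooling step behind JCI can be sketched as follows. The data, the column name C and the choice of gaussCItest with jci = "1" are assumptions made purely for illustration (a Gaussian test on a binary context indicator is a simplification); the fci() call is guarded so that the data preparation also runs where pcalg is not installed.

```r
## Two hypothetical contexts: observational data and data where X1 is shifted
set.seed(1)
obs <- data.frame(X1 = rnorm(50),           X2 = rnorm(50))
int <- data.frame(X1 = rnorm(50, mean = 2), X2 = rnorm(50))

## Pool the data and record the context in an extra indicator column C
pooled <- rbind(cbind(C = 0, obs), cbind(C = 1, int))

## Column index of the context variable, as expected by contextVars
contextVars <- which(colnames(pooled) == "C")

if (requireNamespace("pcalg", quietly = TRUE)) {
  suffStat <- list(C = cor(pooled), n = nrow(pooled))
  ## JCI Assumption 1 only (exogeneity), no selection bias
  fit <- pcalg::fci(suffStat, indepTest = pcalg::gaussCItest, alpha = 0.01,
                    labels = colnames(pooled), contextVars = contextVars,
                    jci = "1", selectionBias = FALSE)
  print(fit@amat)
}
```

The point of the sketch is the data layout: one row per sample from any context, with the context variables as ordinary columns whose indices are passed in contextVars.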
An object of class fciAlgo (see fciAlgo) containing the estimated graph
(in the form of an adjacency matrix with various possible edge marks),
the conditioning sets that lead to edge removals (sepset) and several other
parameters.
Diego Colombo, Markus Kalisch ([email protected]) and Joris Mooij.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
D. Colombo, M. H. Maathuis, M. Kalisch, T. S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Statist. 40, 294-321.
M. Kalisch, M. Maechler, D. Colombo, M. H. Maathuis, P. Buehlmann (2012). Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software 47(11) 1–26, doi:10.18637/jss.v047.i11.
J. M. Mooij, S. Magliacane, T. Claassen (2020). Joint Causal Inference from Multiple Contexts. Journal of Machine Learning Research 21(99), 1-108.
J. M. Mooij and T. Claassen (2020). Constraint-Based Causal Discovery using Partial Ancestral Graphs in the presence of Cycles. In Proc. of the 36th Conference on Uncertainty in Artificial Intelligence (UAI-20), 1159-1168.
P. Spirtes (2001). An anytime algorithm for causal inference. In Proc. of the Eighth International Workshop on Artificial Intelligence and Statistics 213-221. Morgan Kaufmann, San Francisco.
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition, MIT Press, Cambridge (MA).
P. Spirtes, C. Meek, T.S. Richardson (1999). In: Computation, Causation and Discovery. An algorithm for causal inference in the presence of latent variables and selection bias. Pages 211-252. MIT Press.
T.S. Richardson and P. Spirtes (2002). Ancestral graph Markov models. Annals of Statistics 30 962-1030.
J. Zhang (2008). On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 172 1873-1896.
fciPlus for a more efficient variation of FCI;
skeleton for estimating a skeleton using the PC algorithm;
pc for estimating a CPDAG using the PC algorithm;
pdsep for computing Possible-D-SEP for each node and testing and adapting the graph accordingly;
qreach for a fast way of finding Possible-D-SEP for a given node.
gaussCItest, disCItest, binCItest, dsepTest and dsepAMTest as examples for indepTest.
################################################## ## Example without latent variables ################################################## set.seed(42) p <- 7 ## generate and draw random DAG : myDAG <- randomDAG(p, prob = 0.4) ## find skeleton and PAG using the FCI algorithm suffStat <- list(C = cov2cor(trueCov(myDAG)), n = 10^9) res <- fci(suffStat, indepTest=gaussCItest, alpha = 0.9999, p=p, doPdsep = FALSE) ################################################## ## Example with hidden variables ## Zhang (2008), Fig. 6, p.1882 ################################################## ## create the graph g p <- 4 L <- 1 # '1' is latent V <- c("Ghost", "Max","Urs","Anna","Eva") edL <- setNames(vector("list", length=length(V)), V) edL[[1]] <- list(edges=c(2,4),weights=c(1,1)) edL[[2]] <- list(edges=3,weights=c(1)) edL[[3]] <- list(edges=5,weights=c(1)) edL[[4]] <- list(edges=5,weights=c(1)) g <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed") ## compute the true covariance matrix of g cov.mat <- trueCov(g) ## delete rows and columns belonging to latent variable L true.cov <- cov.mat[-L,-L] ## transform covariance matrix into a correlation matrix true.corr <- cov2cor(true.cov) ## The same, for the following three examples indepTest <- gaussCItest suffStat <- list(C = true.corr, n = 10^9) ## find PAG with FCI algorithm. ## As dependence "oracle", we use the true correlation matrix in ## gaussCItest() with a large "virtual sample size" and a large alpha: normal.pag <- fci(suffStat, indepTest, alpha = 0.9999, labels = V[-L], verbose=TRUE) ## find PAG with Anytime FCI algorithm with m.max = 1 ## This means that only conditioning sets of size 0 and 1 are considered. ## As dependence "oracle", we use the true correlation matrix in the ## function gaussCItest with a large "virtual sample size" and a large ## alpha anytime.pag <- fci(suffStat, indepTest, alpha = 0.9999, labels = V[-L], type = "anytime", m.max = 1, verbose=TRUE) ## find PAG with Adaptive Anytime FCI algorithm. 
## This means that only conditining sets up to size K are considered ## in estimating the final skeleton, where K is the maximal size of a ## conditioning set found while estimating the initial skeleton. ## As dependence "oracle", we use the true correlation matrix in the ## function gaussCItest with a large "virtual sample size" and a large ## alpha adaptive.pag <- fci(suffStat, indepTest, alpha = 0.9999, labels = V[-L], type = "adaptive", verbose=TRUE) ## define PAG given in Zhang (2008), Fig. 6, p.1882 corr.pag <- rbind(c(0,1,1,0), c(1,0,0,2), c(1,0,0,2), c(0,3,3,0)) ## check if estimated and correct PAG are in agreement all(corr.pag == normal.pag @ amat) # TRUE all(corr.pag == anytime.pag @ amat) # FALSE all(corr.pag == adaptive.pag@ amat) # TRUE ij <- rbind(cbind(1:4,1:4), c(2,3), c(3,2)) all(corr.pag[ij] == anytime.pag @ amat[ij]) # TRUE ################################################## ## Joint Causal Inference Example ## Mooij et al. (2020), Fig. 43(a), p. 97 ################################################## # Encode MAG as adjacency matrix p <- 8 # total number of variables V <- c("Ca","Cb","Cc","X0","X1","X2","X3","X4") # 3 context variables, 5 system variables # amat[i,j] = 0 iff no edge btw i,j # amat[i,j] = 1 iff i *-o j # amat[i,j] = 2 iff i *-> j # amat[i,j] = 3 iff i *-- j amat <- rbind(c(0,2,2,2,0,0,0,0), c(2,0,2,0,2,0,0,0), c(2,2,0,0,2,2,0,0), c(3,0,0,0,0,0,2,0), c(0,3,3,0,0,3,0,2), c(0,0,3,0,2,0,0,0), c(0,0,0,3,0,0,0,2), c(0,0,0,0,2,0,3,0)) rownames(amat)<-V colnames(amat)<-V # Make use of d-separation oracle as "independence test" indepTest <- dsepAMTest suffStat<-list(g=amat,verbose=FALSE) # Derive PAG that represents the Markov equivalence class of the MAG with the FCI algorithm # (assuming no selection bias) fci.pag <- fci(suffStat, indepTest, alpha = 0.5, labels = V, verbose=TRUE, selectionBias=FALSE) # Derive PAG with FCI-JCI, the Joint Causal Inference extension of FCI # (assuming no selection bias, and all three JCI assumptions) 
fcijci.pag <- fci(suffStat, indepTest, alpha = 0.5, labels = V, verbose=TRUE, contextVars=c(1,2,3), jci="123", selectionBias=FALSE) # Report results cat('True MAG:\n') print(amat) cat('PAG output by FCI:\n') print(fci.pag@amat) cat('PAG output by FCI-JCI:\n') print(fcijci.pag@amat) # Read off causal features from the FCI PAG cat('Identified absence (-1) and presence (+1) of ancestral causal relations from FCI PAG:\n') print(pag2anc(fci.pag@amat)) cat('Identified absence (-1) and presence (+1) of direct causal relations from FCI PAG:\n') print(pag2edge(fci.pag@amat)) cat('Identified absence (-1) and presence (+1) of pairwise latent confounding from FCI PAG:\n') print(pag2conf(fci.pag@amat)) # Read off causal features from the FCI-JCI PAG cat('Identified absence (-1) and presence (+1) of ancestral causal relations from FCI-JCI PAG:\n') print(pag2anc(fcijci.pag@amat)) cat('Identified absence (-1) and presence (+1) of direct causal relations from FCI-JCI PAG:\n') print(pag2edge(fcijci.pag@amat)) cat('Identified absence (-1) and presence (+1) of pairwise latent confounding from FCI-JCI PAG:\n') print(pag2conf(fcijci.pag@amat))
This class of objects is returned by the functions fci(), rfci(), fciPlus() and dag2pag(), and represents the estimated PAG (and sometimes properties of the algorithm).
Objects of this class have methods for the functions plot, show and summary.
## S4 method for signature 'fciAlgo'
show(object)

## S3 method for class 'fciAlgo'
print(x, amat = FALSE, zero.print = ".", ...)

## S4 method for signature 'fciAlgo'
summary(object, amat = TRUE, zero.print = ".", ...)

## S4 method for signature 'fciAlgo,ANY'
plot(x, y, main = NULL, ...)
x , object
|
a |
amat |
|
zero.print |
string for printing |
y |
(generic |
main |
main title, not yet supported. |
... |
optional further arguments (passed from and to methods). |
The slots call, n, max.ord, n.edgetests, sepset, and pMax are inherited from class "gAlgo"; see there.
In addition, "fciAlgo" has slots
amat: adjacency matrix; for the coding of the adjacency matrix see amatType.
allPdsep: a list; the ith entry of this list contains Possible D-SEP of node number i.
n.edgetestsPDSEP: the number of new conditional independence tests (i.e., tests that were not done in the first part of the algorithm) that were performed while checking subsets of Possible D-SEP.
max.ordPDSEP: an integer; the maximum size of the conditioning sets used in the new conditional independence tests that were performed when checking subsets of Possible D-SEP.
Class "gAlgo".
plot, signature(x = "fciAlgo"): Plot the resulting graph
show, signature(object = "fciAlgo"): Show basic properties of the fitted object
summary, signature(object = "fciAlgo"): Show details of the fitted object
Markus Kalisch and Martin Maechler
fci, fciPlus, etc. (see above); pcAlgo
## look at slots of the class
showClass("fciAlgo")

## Also look at the extensive examples in ?fci , ?fciPlus, etc !

## Not run:
## Suppose, fciObj is an object of class fciAlgo
## access slots by using the @ symbol
fciObj@amat   ## adjacency matrix
fciObj@sepset ## separation sets

## use show, summary and plot method
fciObj ## same as show(fciObj)
show(fciObj)
summary(fciObj)
plot(fciObj)

## End(Not run)
Estimate a Partial Ancestral Graph (PAG) from observational data, using the FCI+ (Fast Causal Inference) algorithm, or from a combination of data from different (e.g., observational and interventional) contexts, using the FCI+-JCI (Joint Causal Inference) extension.
fciPlus(suffStat, indepTest, alpha, labels, p, verbose=TRUE, selectionBias = TRUE, jci = c("0","1","12","123"), contextVars = NULL)
suffStat |
sufficient statistics: A named |
indepTest |
a |
alpha |
numeric significance level (in |
labels |
(optional) |
p |
(optional) number of variables (or nodes). May be specified
if |
selectionBias |
If |
jci |
String specifying the JCI assumptions that are used. It can be one of:
For more information, see Mooij et al. (2020). |
contextVars |
Subset of variable indices {1,...,p} that will be treated as context variables in the JCI extension of FCI+. |
verbose |
logical indicating if progress of the algorithm should be printed. The default is true, which used to be hard coded previously. |
A (possibly much faster) variation of FCI (Fast Causal Inference).
For details, please see the references, and also fci.
An object of class fciAlgo (see fciAlgo) containing the estimated graph (in the form of an adjacency matrix with various possible edge marks), the conditioning sets that led to edge removals (sepset) and several other parameters.
Emilija Perkovic, Markus Kalisch ([email protected]) and Joris Mooij.
T. Claassen, J. Mooij, and T. Heskes (2013). Learning Sparse Causal Models is not NP-hard. In UAI 2013, Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence
fci for estimating a PAG using the FCI algorithm.
##################################################
## Example without latent variables
##################################################

## generate a random DAG ( p = 7 )
set.seed(42)
p <- 7
myDAG <- randomDAG(p, prob = 0.4)

## find PAG using the FCI+ algorithm on "Oracle"
suffStat <- list(C = cov2cor(trueCov(myDAG)), n = 10^9)
m.fci <- fciPlus(suffStat, indepTest=gaussCItest,
                 alpha = 0.9999, p=p)
summary(m.fci)

if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(myDAG)
  plot(m.fci)
}

##################################################
## Joint Causal Inference Example
## Mooij et al. (2020), Fig. 43(a), p. 97
##################################################

# Encode MAG as adjacency matrix
p <- 8 # total number of variables
V <- c("Ca","Cb","Cc","X0","X1","X2","X3","X4") # 3 context variables, 5 system variables
# amat[i,j] = 0 iff no edge btw i,j
# amat[i,j] = 1 iff i *-o j
# amat[i,j] = 2 iff i *-> j
# amat[i,j] = 3 iff i *-- j
amat <- rbind(c(0,2,2,2,0,0,0,0),
              c(2,0,2,0,2,0,0,0),
              c(2,2,0,0,2,2,0,0),
              c(3,0,0,0,0,0,2,0),
              c(0,3,3,0,0,3,0,2),
              c(0,0,3,0,2,0,0,0),
              c(0,0,0,3,0,0,0,2),
              c(0,0,0,0,2,0,3,0))
rownames(amat)<-V
colnames(amat)<-V

# Make use of d-separation oracle as "independence test"
indepTest <- dsepAMTest
suffStat<-list(g=amat,verbose=FALSE)

# Derive PAG that represents the Markov equivalence class of the MAG with the FCI+ algorithm
# (assuming no selection bias)
# fci.pag <- fciPlus(suffStat, indepTest, alpha = 0.5, labels = V,
#                    selectionBias=FALSE,verbose=TRUE)

# Derive PAG with FCI+-JCI, the Joint Causal Inference extension of FCI
# (assuming no selection bias, and all three JCI assumptions)
# fcijci.pag <- fciPlus(suffStat, indepTest, alpha = 0.5, labels = V,
#                       selectionBias=FALSE, contextVars=c(1,2,3), jci="123", verbose=TRUE)

# Report results
# cat('True MAG:\n')
# print(amat)
# cat('PAG output by FCI+:\n')
# print(fci.pag@amat)
# cat('PAG output by FCI+-JCI:\n')
# print(fcijci.pag@amat)

# Read off causal features from the FCI PAG
#cat('Identified absence (-1) and presence (+1) of ancestral causal relations from FCI+ PAG:\n')
#print(pag2anc(fci.pag@amat))
#cat('Identified absence (-1) and presence (+1) of direct causal relations from FCI+ PAG:\n')
#print(pag2edge(fci.pag@amat))
#cat('Identified absence (-1) and presence (+1) of pairwise latent confounding from FCI+ PAG:\n')
#print(pag2conf(fci.pag@amat))

# Read off causal features from the FCI-JCI PAG
#cat('Identified absence (-1) and presence (+1) of ancestral causal relations from FCI+-JCI PAG:\n')
#print(pag2anc(fcijci.pag@amat))
#cat('Identified absence (-1) and presence (+1) of direct causal relations from FCI+-JCI PAG:\n')
#print(pag2edge(fcijci.pag@amat))
#cat('Ident. absence (-1) and presence (+1) of pairwise latent confounding from FCI+-JCI PAG:\n')
#print(pag2conf(fcijci.pag@amat))
Find all unshielded triples in an undirected graph, i.e., the ordered ((x, y, z) with x < z) list of all the triples in the graph.
find.unsh.triple(g, check=TRUE)
g |
adjacency matrix of type amat.cpdag representing the
skeleton; since a skeleton consists only of undirected edges,
|
check |
logical indicating that the symmetry of |
A triple of nodes x, y and z is “unshielded” if all of the following are true:
x and y are connected;
y and z are connected;
x and z are not connected.
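The three conditions above can be checked directly on a symmetric 0/1 adjacency matrix. The following is a minimal base-R sketch, not the package's implementation; `unshielded_triples` is a hypothetical helper name, and the convention of listing each triple once with x < z is an assumption made for this illustration.

```r
## Hypothetical helper (not part of the package): enumerate unshielded
## triples (x, y, z) with x < z from a symmetric 0/1 adjacency matrix.
unshielded_triples <- function(adj) {
  stopifnot(is.matrix(adj), nrow(adj) == ncol(adj), isSymmetric(unname(adj)))
  out <- list()
  for (y in seq_len(nrow(adj))) {
    nb <- which(adj[y, ] != 0)            # neighbours of the middle node y
    if (length(nb) < 2) next              # need two neighbours for a triple
    pairs <- utils::combn(nb, 2)          # each column is a pair with x < z
    for (k in seq_len(ncol(pairs))) {
      x <- pairs[1, k]; z <- pairs[2, k]
      if (adj[x, z] == 0)                 # x and z not connected: unshielded
        out[[length(out) + 1L]] <- c(x, y, z)
    }
  }
  if (length(out)) do.call(cbind, out) else matrix(integer(0), nrow = 3)
}

## Toy skeleton: path 1 - 2 - 3 plus the triangle 3 - 4 - 5
adj <- matrix(0, 5, 5)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(3, 5))
adj[edges] <- 1
adj <- adj + t(adj)
tr <- unshielded_triples(adj)             # one triple per column
```

In this toy graph the triples inside the triangle 3 - 4 - 5 are shielded and therefore not reported, whereas triples centred at node 2 or leaving the triangle through node 3 are.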
unshTripl |
Matrix with 3 rows containing in each column an unshielded triple |
unshVect |
Vector containing the unique number for each column in unshTripl (for internal use only) |
Diego Colombo, Markus Kalisch ([email protected]), and Martin Maechler
data(gmG)
if (require(Rgraphviz)) {
  ## show graph
  plot(gmG$g, main = "True DAG")
}

## prepare skeleton for use in example
g <- wgtMatrix(gmG$g) ## compute weight matrix
g <- 1*(g != 0) # wgts --> 0/1; still lower triangular
print.table(g, zero.print=".")
skel <- g + t(g) ## adjacency matrix of skeleton

## estimate unshielded triples -- there are 13 :
(uTr <- find.unsh.triple(skel))
This function tests if z satisfies the Generalized Adjustment Criterion (GAC) relative to (x,y) in the graph represented by adjacency matrix amat and interpreted as type type (DAG, maximal PDAG, CPDAG, MAG, PAG). If yes, z can be used in covariate adjustment for estimating causal effects of x on y.
gac(amat, x, y, z, type = "pag")
amat |
adjacency matrix of type amat.cpdag or amat.pag |
x , y , z
|
(integer) positions of variables in |
type |
string specifying the type of graph of the adjacency matrix
|
This work is a generalization of the work of Shpitser et al. (2012) (necessary and sufficient criterion in DAGs/ADMGs) and van der Zander et al. (2014) (necessary and sufficient criterion in MAGs). Moreover, it is a generalization of the Generalized Backdoor Criterion (GBC) of Maathuis and Colombo (2013): While GBC is sufficient but not necessary, GAC is both sufficient and necessary for DAGs, CPDAGs, MAGs and PAGs. For more details see Perkovic et al. (2015, 2017a, 2017b).
The motivation to find a set z that satisfies the GAC with respect to (x,y) is the following:
A set of variables z satisfies the GAC relative to (x,y) in the given graph if and only if the causal effect of x on y is identifiable by covariate adjustment and is given by
$$P(Y \mid do(X = x)) = \sum_{Z} P(Y \mid X = x, Z) \cdot P(Z)$$
(for any joint distribution “compatible” with the graph; the formula is for discrete variables with straightforward modifications for continuous variables). This result allows one to write post-intervention densities (those written using Pearl's do-calculus) using only observational densities estimated from the data.
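As a small numeric illustration of covariate adjustment for discrete variables, the sketch below evaluates the standard adjustment formula, P(y | do(x)) = sum over z of P(y | x, z) P(z), on a toy distribution with binary Z, X and Y in which Z confounds X and Y. All probabilities are made up for this example, and the variable names are hypothetical.

```r
## Toy model (made-up numbers): Z -> X, Z -> Y, X -> Y, all binary.
p_z     <- c("0" = 0.5, "1" = 0.5)                   # P(Z = z)
p_x1_z  <- c("0" = 0.2, "1" = 0.8)                   # P(X = 1 | Z = z)
p_y1_xz <- matrix(c(0.1, 0.3,                        # row x = 0
                    0.5, 0.7), 2, 2, byrow = TRUE,   # row x = 1
                  dimnames = list(x = c("0", "1"), z = c("0", "1")))
                                                     # P(Y = 1 | X = x, Z = z)

## Adjustment: average P(Y = 1 | X = 1, Z = z) over the marginal P(Z = z)
adjusted <- sum(p_y1_xz["1", ] * p_z)

## Naive conditioning: average over P(Z = z | X = 1) instead
p_z_given_x1 <- p_x1_z * p_z / sum(p_x1_z * p_z)
naive <- sum(p_y1_xz["1", ] * p_z_given_x1)

c(adjusted = adjusted, naive = naive)
```

The two numbers differ because conditioning on X = 1 shifts the distribution of Z, whereas the adjustment formula weights by the marginal distribution of Z.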
For z to satisfy the GAC relative to (x,y) and the graph, the following three conditions must hold:
(0) The graph is adjustment amenable relative to (x,y).
(1) The intersection of z and the forbidden set (explained in Perkovic et al. (2015, 2017b)) is empty.
(2) All proper definite status non-causal paths in the graph from x to y are blocked by z.
It is important to note that there can be x and y for which there is no set z that satisfies the GAC, but the total causal effect might be identifiable via some technique other than covariate adjustment.
For details on the GAC for DAGs, CPDAGs and PAGs see Perkovic et al. (2015, 2017a). For details on the GAC for MAGs see van der Zander et al. (2014), and for details on the GAC for maximal PDAGs see Perkovic et al. (2017b).
For the coding of the adjacency matrix see amatType. The input matrix can either be of class matrix or of class amat.
A list with three components:
gac |
logical; TRUE if |
res |
logical vector of length three indicating if each of the three conditions (0), (1) and (2) are true |
f |
node positions of nodes in the forbidden set (see Perkovic et al. (2015, 2017b)) |
Emilija Perkovic and Markus Kalisch ([email protected])
E. Perkovic, J. Textor, M. Kalisch and M.H. Maathuis (2015). A Complete Generalized Adjustment Criterion. In Proceedings of UAI 2015.
E. Perkovic, J. Textor, M. Kalisch and M.H. Maathuis (2017a). Complete graphical characterization and construction of adjustment sets in Markov equivalence classes of ancestral graphs. To appear in Journal of Machine Learning Research.
E. Perkovic, M. Kalisch and M.H. Maathuis (2017b). Interpreting and using CPDAGs with background knowledge. In Proceedings of UAI 2017.
I. Shpitser, T. VanderWeele and J.M. Robins (2012). On the validity of covariate adjustment for estimating causal effects. In Proceedings of UAI 2010.
B. van der Zander, M. Liskiewicz and J. Textor (2014). Constructing separators and adjustment sets in ancestral graphs. In Proceedings of UAI 2014.
M.H. Maathuis and D. Colombo (2013). A generalized backdoor criterion. Annals of Statistics 43 1060-1088.
backdoor for the Generalized Backdoor Criterion, pc for estimating a CPDAG and fci and fciPlus for estimating a PAG.
## We reproduce the four examples in Perkovic et al. (2015, 2017a)

##############################
## Example 4.1 in Perkovic et al. (2015), Example 2 in Perkovic et al. (2017a)
##############################
mFig1 <- matrix(c(0,1,1,0,0,0, 1,0,1,1,1,0, 0,0,0,0,0,1,
                  0,1,1,0,1,1, 0,1,0,1,0,1, 0,0,0,0,0,0), 6,6)
type <- "cpdag"
x <- 3; y <- 6
## Z satisfies GAC :
gac(mFig1, x,y, z=c(2,4), type)
gac(mFig1, x,y, z=c(4,5), type)
gac(mFig1, x,y, z=c(4,2,1), type)
gac(mFig1, x,y, z=c(4,5,1), type)
gac(mFig1, x,y, z=c(4,2,5), type)
gac(mFig1, x,y, z=c(4,2,5,1),type)
## Z does not satisfy GAC :
gac(mFig1,x,y, z=2, type)
gac(mFig1,x,y, z=NULL, type)

##############################
## Example 4.2 in Perkovic et al. (2015), Example 3 in Perkovic et al. (2017a)
##############################
mFig3a <- matrix(c(0,1,0,0, 1,0,1,1, 0,1,0,1, 0,1,1,0), 4,4)
mFig3b <- matrix(c(0,2,0,0, 3,0,3,3, 0,2,0,3, 0,2,2,0), 4,4)
mFig3c <- matrix(c(0,3,0,0, 2,0,3,3, 0,2,0,3, 0,2,2,0), 4,4)
type <- "pag"
x <- 2; y <- 4
## Z does not satisfy GAC
gac(mFig3a,x,y, z=NULL, type) ## not amenable rel. to (X,Y)
gac(mFig3b,x,y, z=NULL, type) ## not amenable rel. to (X,Y)
## Z satisfies GAC
gac(mFig3c,x,y, z=NULL, type) ## amenable rel. to (X,Y)

##############################
## Example 4.3 in Perkovic et al. (2015), Example 4 in Perkovic et al. (2017a)
##############################
mFig4a <- matrix(c(0,0,1,0,0,0, 0,0,1,0,0,0, 2,2,0,3,3,2,
                   0,0,2,0,2,2, 0,0,2,1,0,2, 0,0,1,3,3,0), 6,6)
mFig4b <- matrix(c(0,0,1,0,0,0, 0,0,1,0,0,0, 2,2,0,0,3,2,
                   0,0,0,0,2,2, 0,0,2,3,0,2, 0,0,2,3,2,0), 6,6)
type <- "pag"
x <- 3; y <- 4
## both PAGs are amenable rel. to (X,Y)
## Z satisfies GAC in Fig. 4a
gac(mFig4a,x,y, z=6, type)
gac(mFig4a,x,y, z=c(1,6), type)
gac(mFig4a,x,y, z=c(2,6), type)
gac(mFig4a,x,y, z=c(1,2,6), type)
## no Z satisfies GAC in Fig. 4b
gac(mFig4b,x,y, z=NULL, type)
gac(mFig4b,x,y, z=6, type)
gac(mFig4b,x,y, z=c(5,6), type)

##############################
## Example 4.4 in Perkovic et al. (2015), Example 8 in Perkovic et al. (2017a)
##############################
mFig5a <- matrix(c(0,1,0,0,0, 1,0,1,0,0, 0,0,0,0,1, 0,0,1,0,0, 0,0,0,0,0), 5,5)
type <- "cpdag"
x <- c(1,5); y <- 4
## Z satisfies GAC
gac(mFig5a,x,y, z=c(2,3), type)
## Z does not satisfy GAC
gac(mFig5a,x,y, z=2, type)

mFig5b <- matrix(c(0,1,0,0,0,0,0, 2,0,2,3,0,3,0, 0,1,0,0,0,0,0, 0,2,0,0,3,0,0,
                   0,0,0,2,0,2,3, 0,2,0,0,2,0,0, 0,0,0,0,2,0,0), 7,7)
type <- "pag"
x<-c(2,7); y<-6
## Z satisfies GAC
gac(mFig5b,x,y, z=c(4,5), type)
gac(mFig5b,x,y, z=c(4,5,1), type)
gac(mFig5b,x,y, z=c(4,5,3), type)
gac(mFig5b,x,y, z=c(1,3,4,5), type)
## Z does not satisfy GAC
gac(mFig5b,x,y, z=NULL, type)

##############################
## Example 4.7 in Perkovic et al. (2017b)
##############################
mFig3a <- matrix(c(0,1,0,0, 1,0,1,1, 0,1,0,1, 0,1,1,0), 4,4)
mFig3b <- matrix(c(0,1,0,0, 0,0,1,1, 0,0,0,1, 0,0,1,0), 4,4)
mFig3c <- matrix(c(0,0,0,0, 1,0,1,0, 0,1,0,1, 0,1,1,0), 4,4)
type <- "pdag"
x <- 2; y <- 4
## Z does not satisfy GAC
gac(mFig3a,x,y, z=NULL, type) ## not amenable rel. to (X,Y)
gac(mFig3c,x,y, z=NULL, type) ## amenable rel. to (X,Y), but no set can block X <- Y
## Z satisfies GAC
gac(mFig3b,x,y, z=NULL, type) ## amenable rel. to (X,Y)
"gAlgo"
"gAlgo" is a "VIRTUAL" class, the common basis of classes "pcAlgo" and "fciAlgo".
We describe the common slots here; for more see the help pages of the specific classes.
call: a call object: the original function call.
n: an "integer", the sample size used to estimate the graph.
max.ord: an integer, the maximum size of the conditioning set used in the conditional independence tests of (the first part of) the algorithm, in function skeleton.
n.edgetests: the number of conditional independence tests performed by the (first part of the) algorithm.
sepset: a list, the conditioning sets that led to edge deletions. The set that led to the removal of the edge i -- j is saved in either sepset[[i]][[j]] or in sepset[[j]][[i]].
pMax: a numeric square matrix, where the (i, j)th entry contains the maximal p-value of all conditional independence tests for edge i -- j.
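Because the separating set of an edge i -- j may be stored under either index order, code that reads the sepset slot has to check both positions. A minimal sketch in base R follows; `get_sepset` is a hypothetical helper, not a function of the package, and the nested list built here is a toy stand-in for a real sepset slot.

```r
## Hypothetical helper (not part of the package): return the separating
## set recorded for nodes i and j, whichever order it was stored in.
get_sepset <- function(sepset, i, j) {
  s <- sepset[[i]][[j]]
  if (is.null(s)) s <- sepset[[j]][[i]]
  s    # NULL means that no separating set was recorded for this pair
}

## Toy nested list for p = 3 nodes: edge 1 -- 3 was removed given node 2,
## and the set happens to be stored at position [[3]][[1]].
p <- 3
sepset <- lapply(seq_len(p), function(i) vector("list", p))
sepset[[3]][[1]] <- 2L

get_sepset(sepset, 1, 3)   # found via the reversed index order
get_sepset(sepset, 1, 2)   # nothing recorded for this pair
```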
Martin Maechler
showClass("gAlgo")
"GaussL0penIntScore"
This class represents a score for causal inference from jointly interventional and observational Gaussian data; it is used in the causal inference functions gies and simy.
The class implements an $\ell_0$-penalized Gaussian maximum likelihood estimator. The penalization is a constant (specified by the argument lambda in the constructor) times the number of parameters of the DAG model. By default, the constant is chosen as $\lambda = \log(n)/2$, where $n$ is the number of rows of the data matrix, which corresponds to the BIC score.
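The link between a penalization constant of log(n)/2 and BIC can be checked on a single Gaussian regression using only functions from the stats package. This is an illustrative sketch on simulated data; it does not reproduce how the score class counts parameters internally, and the variable names are hypothetical.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 0.8 * x + rnorm(n)
fit <- lm(y ~ x)

ll     <- logLik(fit)               # maximized Gaussian log-likelihood
k      <- attr(ll, "df")            # free parameters (coefficients + variance)
lambda <- 0.5 * log(n)              # penalization constant log(n)/2
pen_ll <- as.numeric(ll) - lambda * k

## BIC = -2 * logLik + log(n) * k, so the penalized log-likelihood is -BIC / 2
all.equal(pen_ll, -BIC(fit) / 2)
```

Maximizing the penalized log-likelihood with this constant is therefore the same as minimizing BIC.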
Class "Score", directly.
All reference classes extend and inherit methods from "envRefClass".
The class GaussL0penIntScore has the same fields as Score. They need not be accessed by the user.
new("GaussL0penIntScore", data = matrix(1, 1, 1), targets = list(integer(0)), target.index = rep(as.integer(1), nrow(data)), lambda = 0.5*log(nrow(data)), intercept = FALSE, use.cpp = TRUE, ...)
data
Data matrix with n rows and p columns. Each row corresponds to one realization, either interventional or observational.
targets
List of mutually exclusive intervention targets that have been used for data generation.
target.index
Vector of length n; the i-th entry specifies the index of the intervention target in targets under which the i-th row of data was measured.
lambda
Penalization constant (cf. details)
intercept
Indicates whether an intercept is allowed in the linear structural equations, or, equivalently, if a mean different from zero is allowed for the observational distribution.
use.cpp
Indicates whether the calculation of the score should be done by the C++ library of the package, which speeds up calculation. This parameter should only be set to FALSE in the case of problems.
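To make the relation between targets and target.index concrete, here is a small base-R sketch with made-up values: three intervention settings and six data rows. The empty target integer(0) is used for observational rows, in line with the constructor default shown above; everything else is an assumption for illustration.

```r
## Three settings: observational (no target), node 2, nodes 1 and 3
targets <- list(integer(0), 2L, c(1L, 3L))

## Six data rows: rows 1-3 observational, rows 4-5 under target 2,
## row 6 under the joint target {1, 3}
target.index <- c(1L, 1L, 1L, 2L, 2L, 3L)

## Intervened variables for each data row
row_targets <- targets[target.index]

## Number of rows measured under each setting
n_per_target <- tabulate(target.index, nbins = length(targets))
```

Each entry of target.index must be a valid position in targets, and its length must equal the number of rows of the data matrix.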
local.score(vertex, parents, ...)
Calculates the local score of a vertex and its parents. Since this score has no obvious interpretation, it is rather for internal use.
global.score.int(edges, ...)
Calculates the global score of a DAG, represented as a list of in-edges: for each vertex in the DAG, this list contains a vector of parents.
global.score(dag, ...)
Calculates the global score of a DAG, represented as an object of a class derived from ParDAG.
local.mle(vertex, parents, ...)
Calculates the local MLE of a vertex and its parents. The result is a vector of parameters encoded as follows:
First element: variance of the Gaussian error term
Second element: intercept
Following elements: regression coefficients; one per parent vertex
global.mle(dag, ...)
Calculates the global MLE of a DAG, represented by an object of a class derived from ParDAG. The result is a list of vectors, one per vertex, each in the same format as the result vector of local.mle.
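The result format described for local.mle (variance, then intercept, then one regression coefficient per parent) can be mimicked for purely observational data with ordinary least squares. The sketch below is a hypothetical stand-in, not the package's implementation: `local_mle_sketch` is an invented name, it always fits an intercept, it ignores interventions, and it assumes the variance is the maximum likelihood estimate (residual sum of squares divided by n).

```r
## Hypothetical stand-in: regress one vertex on its parents and return
## c(error variance, intercept, one coefficient per parent).
local_mle_sketch <- function(data, vertex, parents) {
  y <- data[, vertex]
  X <- cbind(1, data[, parents, drop = FALSE])   # intercept column + parents
  beta <- qr.solve(X, y)                         # least-squares coefficients
  res  <- y - drop(X %*% beta)
  c(mean(res^2), beta)                           # ML variance divides by n
}

## Simulated data: x3 = 1 + 2 * x1 - x2 + noise with standard deviation 0.5
set.seed(2)
n  <- 500
x1 <- rnorm(n); x2 <- rnorm(n)
x3 <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.5)
est <- local_mle_sketch(cbind(x1, x2, x3), vertex = 3, parents = c(1, 2))
```

Here est has four entries: the first is close to 0.25 (the error variance), the second close to 1 (the intercept), and the last two close to 2 and -1 (the coefficients of the two parents).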
Alain Hauser ([email protected])
gies, simy, GaussL0penObsScore, Score
##################################################
## Using Gaussian Data
##################################################
## Load predefined data
data(gmInt)

## Define the score object
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index)

## Score of the true underlying DAG
score$global.score(as(gmInt$g, "GaussParDAG"))

## Score of the DAG that has only one edge from 1 to 2
A <- matrix(0, ncol(gmInt$x), ncol(gmInt$x))
A[1, 2] <- 1
score$global.score(as(A, "GaussParDAG"))
## (Note: this is lower than the score of the true DAG.)
"GaussL0penObsScore"
This class represents a score for causal inference from observational Gaussian data; it is used in the causal inference function ges.
The class implements an $\ell_0$-penalized Gaussian maximum likelihood estimator. The penalization is a constant (specified by the argument lambda in the constructor) times the number of parameters of the DAG model. By default, the constant is chosen as $\log(n)/2$, where $n$ is the number of rows of the data matrix, which corresponds to the BIC score.
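As a sketch of why this choice corresponds to the BIC (the symbols are introduced here for illustration and are not the package's notation): write $\hat\ell(D)$ for the maximized Gaussian log-likelihood of a DAG $D$, $k(D)$ for its number of parameters and $n$ for the sample size. Up to additive constants that an implementation may drop, the penalized score then reads

```latex
S_\lambda(D) = \hat\ell(D) - \lambda\, k(D),
\qquad
\lambda = \tfrac{1}{2}\log n
\;\Longrightarrow\;
S_\lambda(D) = -\tfrac{1}{2}\,\mathrm{BIC}(D),
\quad
\mathrm{BIC}(D) = -2\,\hat\ell(D) + k(D)\,\log n .
```

Maximizing the score with this value of lambda is therefore equivalent to minimizing the BIC.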
Class "Score", directly.
All reference classes extend and inherit methods from "envRefClass".
The class GaussL0penObsScore has the same fields as Score. They need not be accessed by the user.
new("GaussL0penObsScore", data = matrix(1, 1, 1), lambda = 0.5*log(nrow(data)), intercept = TRUE, use.cpp = TRUE, ...)
data
Data matrix with $n$ rows and $p$ columns. Each row corresponds to one observational realization.
lambda
Penalization constant (cf. details)
intercept
Indicates whether an intercept is allowed in the linear structural equations or, equivalently, whether a mean different from zero is allowed for the observational distribution.
use.cpp
Indicates whether the calculation of the score should be done by the C++ library of the package, which speeds up calculation. This parameter should only be set to FALSE in the case of problems.
local.score(vertex, parents, ...)
Calculates the local score of a vertex and its parents. Since this score has no obvious interpretation, it is rather for internal use.
global.score.int(edges, ...)
Calculates the global score of a DAG, represented as a list of in-edges: for each vertex in the DAG, this list contains a vector of parents.
global.score(dag, ...)
Calculates the global score of a DAG, represented as an object of a class derived from ParDAG.
local.mle(vertex, parents, ...)
Calculates the local MLE of a vertex and its parents. The result is a vector of parameters encoded as follows:
First element: variance of the Gaussian error term
Second element: intercept
Following elements: regression coefficients; one per parent vertex
global.mle(dag, ...)
Calculates the global MLE of a DAG, represented by an object of a class derived from ParDAG. The result is a list of vectors, one per vertex, each in the same format as the result vector of local.mle.
Alain Hauser
ges, GaussL0penIntScore, Score
##################################################
## Using Gaussian Data
##################################################
## Load predefined data
data(gmG)

## Define the score object
score <- new("GaussL0penObsScore", gmG$x)

## Score of the true underlying DAG
score$global.score(as(gmG$g, "GaussParDAG"))

## Score of the DAG that has only one edge from 1 to 2
A <- matrix(0, ncol(gmG$x), ncol(gmG$x))
A[1, 2] <- 1
score$global.score(as(A, "GaussParDAG"))
## (Note: this is lower than the score of the true DAG.)
"GaussParDAG"
of Gaussian Causal Models
The "GaussParDAG" class represents a Gaussian causal model.
The class "GaussParDAG" is used to simulate observational and/or interventional data from Gaussian causal models as well as for parameter estimation (maximum-likelihood estimation) for a given DAG structure in the presence of a data set with jointly observational and interventional data.
A Gaussian causal model can be represented as a set of linear structural equations with Gaussian noise variables. Those equations are fully specified by indicating the regression parameters, the intercept and the variance of the noise or error terms. More details can be found e.g. in Kalisch and Bühlmann (2007) or Hauser and Bühlmann (2012).
Class "ParDAG", directly.
All reference classes extend and inherit methods from "envRefClass".
new("GaussParDAG", nodes, in.edges, params)
nodes
Vector of node names; cf. also field .nodes.
in.edges
A list of length p consisting of index vectors indicating the edges pointing into the nodes of the DAG.
params
A list of length p consisting of parameter vectors modeling the conditional distribution of a node given its parents; cf. also field .params for the meaning of the parameters.
.nodes:
Vector of node names; defaults to as.character(1:p), where p denotes the number of nodes (variables) of the model.
.in.edges:
A list of length p consisting of index vectors indicating the edges pointing into the nodes of the DAG. The $i$-th entry lists the indices of the parents of the $i$-th node.
.params:
A list of length p consisting of parameter vectors modeling the conditional distribution of a node given its parents. The $i$-th entry models the conditional (normal) distribution of the $i$-th variable in the model given its parents. It is a vector of length $k + 2$, where $k$ is the number of parents of node $i$; the first entry encodes the error variance of node $i$, the second entry the intercept, and the remaining entries the regression coefficients (see above). In most cases, it is easier to access the parameters via the wrapper functions err.var, intercept and weight.mat.
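To make the encoding of the fields .in.edges and .params concrete, the following base-R sketch draws observational samples from a model given in that format. The function simulate_sketch is a hypothetical helper, not the simulate method of the class; it assumes that the nodes are numbered in a topological order (every parent has a smaller index than its child).

```r
# Hypothetical helper (not a package function): sample from linear structural
# equations given as in.edges (parent indices) and params
# (error variance, intercept, regression coefficients) per node.
simulate_sketch <- function(n, in.edges, params) {
  p <- length(in.edges)
  x <- matrix(0, n, p)
  for (i in seq_len(p)) {
    par <- params[[i]]
    pa <- in.edges[[i]]
    # one coefficient per parent, plus variance and intercept
    stopifnot(length(par) == length(pa) + 2, all(pa < i))
    noise <- rnorm(n, mean = 0, sd = sqrt(par[1]))
    x[, i] <- par[2] + x[, pa, drop = FALSE] %*% par[-(1:2)] + noise
  }
  x
}

# DAG 1 -> 2, 1 -> 3, 2 -> 3
in.edges <- list(integer(0), 1L, c(1L, 2L))
params <- list(c(1, 0),                # node 1: variance 1, intercept 0
               c(0.5, 1, 0.8),         # node 2: variance 0.5, intercept 1, weight 0.8
               c(0.25, 0, -0.3, 0.6))  # node 3: two parents, two weights

set.seed(123)
x <- simulate_sketch(1000, in.edges, params)
colMeans(x)
```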
set.err.var(value):
Sets the error variances. The argument must be a vector of length $p$, where $p$ denotes the number of nodes in the model.
err.var():
Yields the vector of error variances.
intercept():
Yields the vector of intercepts.
set.intercept(value):
Sets the intercepts. The argument must be a vector of length $p$, where $p$ denotes the number of nodes in the model.
weight.mat(target):
Yields the (observational or interventional) weight matrix of the model. The weight matrix is a $p \times p$ matrix whose $i$-th column contains the regression coefficients of the $i$-th structural equation if node $i$ is not intervened (i.e., if $i$ is not contained in the vector target), and is empty otherwise.
cov.mat(target, ivent.var):
Yields the covariance matrix of the observational or an interventional distribution of the causal model. If target has length 0, the covariance matrix of the observational distribution is returned; otherwise target is a vector of the intervened nodes, and ivent.var is a vector of the same length indicating the variances of the intervention levels. Deterministic interventions with fixed intervention levels would correspond to vanishing intervention variances; with non-zero intervention variances, stochastic interventions are considered in which intervention values are realizations of Gaussian variables (Korb et al., 2004).
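The relation between the weight matrix, the error variances and the covariance matrix can be sketched in base R. The function cov_mat_sketch is a hypothetical helper and not the cov.mat method itself; it assumes that an intervention cuts the intervened node off from its parents and replaces its error variance by the intervention variance, in line with the description of weight.mat and cov.mat above.

```r
# Hypothetical helper (not a package function).
# weights[i, j] is the coefficient of node i in the structural equation of
# node j, so column j holds the regression coefficients of node j.
cov_mat_sketch <- function(weights, err_var, target = integer(0),
                           ivent_var = numeric(0)) {
  stopifnot(length(target) == length(ivent_var))
  p <- nrow(weights)
  weights[, target] <- 0        # intervened nodes ignore their parents
  err_var[target] <- ivent_var  # and take the intervention variance
  # X = t(B) X + eps  implies  X = solve(I - t(B)) eps
  a <- solve(diag(p) - t(weights))
  a %*% diag(err_var, nrow = p) %*% t(a)
}

# Two nodes, 1 -> 2 with weight 0.8, error variances 1 and 0.5
B <- matrix(0, 2, 2)
B[1, 2] <- 0.8

cov_obs <- cov_mat_sketch(B, c(1, 0.5))
cov_int <- cov_mat_sketch(B, c(1, 0.5), target = 2L, ivent_var = 2)
```

In this toy model the observational variance of node 2 is 0.8^2 * 1 + 0.5 = 1.14, while under the intervention at node 2 the two variables become uncorrelated.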
The following methods are inherited (from the corresponding class): node.count ("ParDAG"), edge.count ("ParDAG"), simulate ("ParDAG")
Alain Hauser
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
M. Kalisch and P. Bühlmann (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. Journal of Machine Learning Research 8, 613–636.
K.B. Korb, L.R. Hope, A.E. Nicholson, and K. Axnick (2004). Varieties of causal intervention. Proc. of the Pacific Rim International Conference on Artificial Intelligence (PRICAI 2004), 322–331.
set.seed(307)
myDAG <- r.gauss.pardag(p = 5, prob = 0.4)
(wm <- myDAG$weight.mat())
m <- as(myDAG, "matrix") # TRUE/FALSE adjacency matrix
symnum(m)
stopifnot(identical(unname( m ), unname(wm != 0)))
myDAG$err.var()
myDAG$intercept()
myDAG$set.intercept(runif(5, min=3, max=4))
myDAG$intercept()
if (require(Rgraphviz)) plot(myDAG)
Estimate the observational or interventional essential graph representing the Markov equivalence class of a DAG by greedily optimizing a score function in the space of DAGs. In practice, greedy search should always be done in the space of equivalence classes instead of DAGs, giving the functions gies or ges the preference over gds.
gds(score, labels = score$getNodes(), targets = score$getTargets(),
    fixedGaps = NULL, phase = c("forward", "backward", "turning"),
    iterate = length(phase) > 1, turning = TRUE, maxDegree = integer(0),
    verbose = FALSE, ...)
score |
An instance of a class derived from |
labels |
Node labels; by default, they are determined from the scoring object. |
targets |
A |
fixedGaps |
Logical symmetric matrix of dimension p*p. If entry
|
phase |
Character vector listing the phases that should be used; possible
values: |
iterate |
Logical indicating whether the phases listed in the argument
|
turning |
Setting |
maxDegree |
Parameter used to limit the vertex degree of the estimated graph. Valid arguments:
|
verbose |
if |
... |
additional arguments for debugging purposes and fine tuning. |
This function estimates the observational or interventional Markov equivalence class of a DAG based on a data sample with interventional data originating from various interventions and possibly observational data. The intervention targets used for data generation must be specified by the argument targets as a list of (integer) vectors listing the intervened vertices; observational data is specified by an empty set, i.e. a vector of the form integer(0). As an example, if data contains observational samples as well as samples originating from an intervention at vertex 1 and from a joint intervention at vertices 1 and 4, the intervention targets must be specified as list(integer(0), as.integer(1), as.integer(c(1, 4))).
An interventional Markov equivalence class of DAGs can be uniquely represented by a partially directed graph called interventional essential graph. Its edges have the following interpretation:
a directed edge a → b stands for an arrow that has the same orientation in all representatives of the interventional Markov equivalence class;
an undirected edge a – b stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.
Note that when plotting the object, undirected and bidirected edges are equivalent.
Greedy DAG search (GDS) maximizes a score function (typically the BIC, passed to the function via the argument score) of a DAG in three phases, starting from the empty DAG:
In the forward phase, GDS adds single arrows to the DAG as long as this augments the score.
In the backward phase, the algorithm removes arrows from the DAG as long as this augments the score.
In the turning phase, the algorithm reverses arrows of the DAG as long as this augments the score.
The phases that are actually run are specified with the argument phase. If iterate = TRUE, GDS cycles through the specified phases until no augmentation of the score is possible any more. In the end, gds returns the (interventional or observational) essential graph of the last visited DAG.
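The forward phase described above can be sketched in base R for purely observational Gaussian data. This is an illustration only and not the package's implementation: the helpers local_score, creates_cycle and forward_phase are hypothetical, the local score is a BIC-type quantity with additive constants dropped, and interventions as well as the backward and turning phases are left out.

```r
# Hypothetical helper: BIC-type local score of a node given its parents.
local_score <- function(x, node, parents, lambda) {
  n <- nrow(x)
  design <- cbind(1, x[, parents, drop = FALSE])
  rss <- sum(lm.fit(design, x[, node])$residuals^2)
  -0.5 * n * log(rss / n) - lambda * (length(parents) + 1)
}

# TRUE if adding the arrow from -> to would close a directed cycle, i.e. if
# `from` is already reachable from `to` (adj[i, j] == 1 encodes i -> j).
creates_cycle <- function(adj, from, to) {
  reach <- to
  frontier <- to
  while (length(frontier) > 0) {
    children <- which(colSums(adj[frontier, , drop = FALSE]) > 0)
    frontier <- setdiff(children, reach)
    reach <- c(reach, frontier)
  }
  from %in% reach
}

# Forward phase: repeatedly add the single arrow with the largest score gain.
forward_phase <- function(x, lambda = 0.5 * log(nrow(x))) {
  p <- ncol(x)
  adj <- matrix(0, p, p)
  repeat {
    best_gain <- 0
    best <- NULL
    for (from in seq_len(p)) for (to in seq_len(p)) {
      if (from == to || adj[from, to] == 1 || adj[to, from] == 1) next
      if (creates_cycle(adj, from, to)) next
      pa <- which(adj[, to] == 1)
      gain <- local_score(x, to, c(pa, from), lambda) -
        local_score(x, to, pa, lambda)
      if (gain > best_gain) {
        best_gain <- gain
        best <- c(from, to)
      }
    }
    if (is.null(best)) break  # no arrow improves the score any more
    adj[best[1], best[2]] <- 1
  }
  adj
}

set.seed(42)
n <- 500
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.5)
x3 <- rnorm(n)
adj <- forward_phase(cbind(x1, x2, x3))
adj
```

Because the score decomposes into local terms, adding an arrow only changes the local score of the node it points into, which is why the gain is computed from two local scores. The orientation of the arrow between the first two variables is not identifiable from such data; the sketch simply keeps the first best candidate it encounters.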
It is well known that a greedy search in the space of DAGs instead of essential graphs is more prone to get stuck in local optima of the score function and hence expected to yield worse estimation results than GIES (function gies) or GES (function ges) (Chickering, 2002; Hauser and Bühlmann, 2012). The function gds is therefore not of practical use, but can be used to compare causal inference algorithms to an elementary and straightforward approach.
gds returns a list with the following two components:
essgraph |
An object of class |
repr |
An object of a class derived from |
Alain Hauser
D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507–554.
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
## Load predefined data
data(gmInt)

## Define the score (BIC)
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index)

## Estimate the essential graph
gds.fit <- gds(score)

## Plot the estimated essential graph and the true DAG
if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(gds.fit$essgraph, main = "Estimated ess. graph")
  plot(gmInt$g, main = "True DAG")
}
Estimate the observational essential graph representing the Markov equivalence class of a DAG using the greedy equivalence search (GES) algorithm of Chickering (2002).
ges(score, labels = score$getNodes(), fixedGaps = NULL,
    adaptive = c("none", "vstructures", "triples"),
    phase = c("forward", "backward", "turning"),
    iterate = length(phase) > 1, turning = NULL,
    maxDegree = integer(0), verbose = FALSE, ...)
score |
An instance of a class derived from |
labels |
Node labels; by default, they are determined from the scoring object. |
fixedGaps |
logical symmetric matrix of dimension p*p. If entry
|
adaptive |
indicating whether constraints should be adapted to newly detected v-structures or unshielded triples (cf. details). |
phase |
Character vector listing the phases that should be used; possible
values: |
iterate |
Logical indicating whether the phases listed in the argument
|
turning |
Setting |
maxDegree |
Parameter used to limit the vertex degree of the estimated graph. Valid arguments:
|
verbose |
If |
... |
Additional arguments for debugging purposes and fine tuning. |
Under the assumption that the distribution of the observed variables is faithful to a DAG, this function estimates the Markov equivalence class of the DAG. It does not estimate the DAG itself, because this is typically impossible (even with an infinite amount of data): different DAGs (forming a Markov equivalence class) can describe the same conditional independence relationships and be statistically indistinguishable from observational data alone.
All DAGs in an equivalence class have the same skeleton (i.e., the same adjacency information) and the same v-structures (i.e., the same induced subgraphs of the form a → b ← c). However, the direction of some edges may be undetermined, in the sense that they point one way in one DAG in the equivalence class, while they point the other way in another DAG in the equivalence class.
An equivalence class can be uniquely represented by a partially directed graph called (observational) essential graph or CPDAG (completed partially directed acyclic graph). Its edges have the following interpretation:
a directed edge a → b stands for an arrow that has the same orientation in all representatives of the Markov equivalence class;
an undirected edge a – b stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.
Note that when plotting the object, undirected and bidirected edges are equivalent.
GES (greedy equivalence search) is a score-based algorithm that greedily maximizes a score function (typically the BIC, passed to the function via the argument score) in the space of (observational) essential graphs in three phases, starting from the empty graph:
In the forward phase, GES moves through the space of essential graphs in steps that correspond to the addition of a single edge in the space of DAGs; the phase is aborted as soon as the score cannot be augmented any more.
In the backward phase, the algorithm performs moves that correspond to the removal of a single edge in the space of DAGs until the score cannot be augmented any more.
In the turning phase, the algorithm performs moves that correspond to the reversal of a single arrow in the space of DAGs until the score cannot be augmented any more.
If iterate = TRUE, GES cycles through these three phases until no augmentation of the score is possible any more. Note that the turning phase was not part of the original implementation of Chickering (2002), but was introduced by Hauser and Bühlmann (2012) and shown to improve the overall estimation performance. The original algorithm of Chickering (2002) is reproduced with phase = c("forward", "backward") and iterate = FALSE.
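The two configurations mentioned above can be run side by side as follows. This is a usage sketch that relies only on the arguments documented on this page and on the example data gmG8; it assumes that the package pcalg is installed and attached.

```r
library(pcalg)

## Load predefined data (provides gmG and gmG8)
data(gmG)
score <- new("GaussL0penObsScore", gmG8$x)

## Default: forward, backward and turning phases, iterated
fit.default <- ges(score)

## Forward and backward phase only, each run once, as in Chickering (2002)
fit.orig <- ges(score, phase = c("forward", "backward"), iterate = FALSE)

## Both results are lists with the components essgraph and repr
str(fit.orig, max = 2)
```

Whether the two fits differ depends on the data; on small, well-behaved examples they may coincide.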
GES has the same purpose as the PC algorithm (see pc). While the PC algorithm is based on conditional independence tests (requiring the choice of an independence test and a significance level, see pc), the GES algorithm is a score-based method (requiring the choice of a score function) and does not depend on conditional independence tests. Since GES always operates in the space of essential graphs, it returns a valid essential graph (or CPDAG) in any case.
Using the argument fixedGaps, one can make sure that certain edges will not be present in the resulting essential graph: if the entry [i, j] of the matrix passed to fixedGaps is TRUE, there will be no edge between nodes i and j. Using this argument can speed up the execution of GES and allows the user to account for previous knowledge or other constraints. The argument adaptive can be used to relax the constraints encoded by fixedGaps according to a modification of GES called ARGES (adaptively restricted greedy equivalence search) which has been presented in Nandy, Hauser and Maathuis (2015):
When adaptive = "vstructures" and the algorithm introduces a new v-structure a → b ← c in the forward phase, then the edge a – c is removed from the list of fixed gaps, meaning that the insertion of an edge between a and c becomes possible even if it was forbidden by the initial matrix passed to fixedGaps.
When adaptive = "triples" and the algorithm introduces a new unshielded triple in the forward phase (i.e., a subgraph of three nodes a, b and c where a and b as well as b and c are adjacent, but a and c are not), then the edge a – c is removed from the list of fixed gaps.
With one of the adaptive modifications, the successive application of a skeleton estimation method and GES restricted to an estimated skeleton still gives a consistent estimator of the DAG, which is not the case without the adaptive modification.
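The bookkeeping behind fixedGaps and its adaptive relaxation can be sketched in base R. Both helpers are hypothetical and only mimic the idea; they are not functions of the package. The sketch assumes that an estimated skeleton is available as a symmetric 0/1 adjacency matrix.

```r
# Hypothetical helper: forbid every edge that is absent from the skeleton.
# An entry TRUE means "no edge allowed between these two nodes".
gaps_from_skeleton <- function(skel) {
  gaps <- skel == 0
  diag(gaps) <- FALSE  # the diagonal carries no information
  gaps
}

# Hypothetical helper: adaptive relaxation. Once a new v-structure or
# unshielded triple with end points i and j appears, the gap between
# i and j is released (symmetrically).
release_gap <- function(gaps, i, j) {
  gaps[i, j] <- gaps[j, i] <- FALSE
  gaps
}

# Skeleton 1 - 2 - 3: nodes 1 and 3 are not adjacent
skel <- matrix(0, 3, 3)
skel[1, 2] <- skel[2, 1] <- 1
skel[2, 3] <- skel[3, 2] <- 1

gaps <- gaps_from_skeleton(skel)
gaps[1, 3]  # TRUE: an edge between 1 and 3 is forbidden

relaxed <- release_gap(gaps, 1, 3)
relaxed[1, 3]  # FALSE: the edge may now be inserted
```

A matrix such as gaps is what would be passed as fixedGaps, for instance in a call of the form ges(score, fixedGaps = gaps, adaptive = "triples"); the relaxation itself is then carried out internally during the forward phase.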
ges returns a list with the following two components:
essgraph |
An object of class |
repr |
An object of a class derived from |
Alain Hauser
D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507–554.
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
P. Nandy, A. Hauser and M. Maathuis (2015). Understanding consistency in hybrid causal structure learning. arXiv preprint 1507.02608.
P. Spirtes, C.N. Glymour, and R. Scheines (2000). Causation, Prediction, and Search, MIT Press, Cambridge (MA).
## Load predefined data
data(gmG)

## Define the score (BIC)
score <- new("GaussL0penObsScore", gmG8$x)

## Estimate the essential graph
ges.fit <- ges(score)

## Plot the estimated essential graph and the true DAG
if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(ges.fit$essgraph, main = "Estimated CPDAG")
  plot(gmG8$g, main = "True DAG")
  str(ges.fit, max=2)
}
## alternative:
if (require(Matrix)) {
  as(as(ges.fit$essgraph,"graphNEL"),"Matrix")
}
Get the graph-class part or “aspect” of an R object, notably from our pc(), skeleton(), fci(), etc, results.
getGraph(x)
x |
potentially any R object which can be interpreted as a graph (with nodes and edges). |
a graph-class object, i.e., one inheriting from (the virtual) class "graph", package graph.
signature(x = "ANY")
the default method just tries as(x, "graph"), so works when a coerce (S4) method is defined for x.
signature(x = "pcAlgo") and signature(x = "fciAlgo")
extract the graph part explicitly.
signature(x = "matrix")
interpret x as adjacency matrix and return the corresponding "graphAM" object.
For sparseMatrix methods, see the ‘Note’.
For large graphs, it may be attractive to work with sparse matrices from the Matrix package. If desired, you can activate this by
require(Matrix)
setMethod("getGraph", "sparseMatrix", function(x) as(x, "graphNEL"))
setMethod("getGraph", "Matrix", function(x) as(x, "graphAM"))
Martin Maechler
fci, etc.
The graph-class class description in package graph.
A <- rbind(c(0,1,0,0,1),
           c(0,0,0,1,1),
           c(1,0,0,1,0),
           c(1,0,0,0,1),
           c(0,0,0,1,0))
sum(A) # 9
getGraph(A) ## a graph with 5 nodes and 'sum(A)' edges
Given a combination of $k$ elements out of the elements $1, \ldots, n$, the next set of size k in a specified sequence is computed.
getNextSet(n,k,set)
n |
Number of elements to choose from (integer) |
k |
Size of chosen set (integer) |
set |
Previous set in list (numeric vector) |
The initial set is 1:k. Last index varies quickest. Using the dynamic creation of sets reduces the memory demands dramatically for large sets. If complete lists of combination sets have to be produced and memory is no problem, the function combn from package combinat is an alternative.
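The enumeration order described above (start at 1:k, last index varies quickest) is the lexicographic order of combinations. A minimal base-R sketch of one step in that order is given below; next_set is a hypothetical re-implementation for illustration and not the code of getNextSet. In particular, what it returns as nextSet once the last set has been reached is an arbitrary choice made here.

```r
# Hypothetical re-implementation of one enumeration step (illustration only).
next_set <- function(n, k, set) {
  max_vals <- (n - k + 1):n         # largest admissible value per position
  movable <- which(set < max_vals)  # positions that can still be increased
  if (length(movable) == 0)
    return(list(nextSet = set, wasLast = TRUE))
  i <- max(movable)                 # rightmost such position
  # increase position i by one and reset everything to its right
  set[i:k] <- set[i] + seq_len(k - i + 1)
  list(nextSet = set, wasLast = FALSE)
}

# Enumerate all sets of size 3 out of 1:5, keeping only one set in memory
n <- 5
k <- 3
current <- 1:k
all_sets <- list(current)
repeat {
  step <- next_set(n, k, current)
  if (step$wasLast) break
  current <- step$nextSet
  all_sets[[length(all_sets) + 1]] <- current
}
res <- do.call(rbind, all_sets)
res
```

Only the current set has to be stored between steps, which is the memory advantage over producing the complete list at once with combn.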
List with two elements:
nextSet |
Next set in list (numeric vector) |
wasLast |
Logical indicating whether the end of the specified sequence is reached. |
Markus Kalisch and Martin Maechler
This function is used in skeleton.
## start from first set (1,2) and get the next set of size 2 out of 1:5
## notice that r$wasLast is FALSE :
str(r <- getNextSet(5,2,c(1,2)))

## input is the last set; notice that r2$wasLast now is TRUE:
str(r2 <- getNextSet(5,2,c(4,5)))

## Show all sets of size k out of 1:n :
## {if you really want this in practice, use something like combn() !}
n <- 5
k <- 3
currentSet <- 1:k
(res <- rbind(currentSet, deparse.level = 0))
repeat {
  newEl <- getNextSet(n,k,currentSet)
  if (newEl$wasLast)
    break
  ## otherwise continue:
  currentSet <- newEl$nextSet
  res <- rbind(res, currentSet, deparse.level = 0)
}
res
stopifnot(choose(n,k) == nrow(res)) ## must be identical
Estimate the interventional essential graph representing the Markov equivalence class of a DAG using the greedy interventional equivalence search (GIES) algorithm of Hauser and Bühlmann (2012).
gies(score, labels = score$getNodes(), targets = score$getTargets(),
     fixedGaps = NULL, adaptive = c("none", "vstructures", "triples"),
     phase = c("forward", "backward", "turning"),
     iterate = length(phase) > 1, turning = NULL,
     maxDegree = integer(0), verbose = FALSE, ...)
score |
An R object inheriting from |
labels |
Node labels; by default, they are determined from the scoring object. |
targets |
A list of intervention targets (cf. details). A list of vectors, each vector listing the vertices of one intervention target. |
fixedGaps |
Logical symmetric matrix of dimension p*p. If entry
|
adaptive |
indicating whether constraints should be adapted to newly detected v-structures or unshielded triples (cf. details). |
phase |
Character vector listing the phases that should be used; possible
values: |
iterate |
Logical indicating whether the phases listed in the argument
|
turning |
Setting |
maxDegree |
Parameter used to limit the vertex degree of the estimated graph. Possible values:
|
verbose |
If |
... |
Additional arguments for debugging purposes and fine tuning. |
This function estimates the interventional Markov equivalence class of a DAG based on a data sample with interventional data originating from various interventions and possibly observational data. The intervention targets used for data generation must be specified by the argument targets as a list of (integer) vectors listing the intervened vertices; observational data is specified by an empty set, i.e. a vector of the form integer(0). As an example, if data contains observational samples as well as samples originating from an intervention at vertex 1 and from a joint intervention at vertices 1 and 4, the intervention targets must be specified as list(integer(0), as.integer(1), as.integer(c(1, 4))).
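The following base-R sketch shows how such a list of targets can be tied to the rows of a data set. It assumes, as suggested by the component target.index of the example data gmInt, that a vector maps every row to one entry of targets; the six-row layout and the variable names row.targets and intervened are made up for illustration.

```r
# Intervention targets: observational, vertex 1, vertices 1 and 4
targets <- list(integer(0), as.integer(1), as.integer(c(1, 4)))

# Assumed row labels for a data set with 6 rows: index of the target under
# which each row was measured (1 = observational, 2 = {1}, 3 = {1, 4})
target.index <- c(1L, 1L, 2L, 2L, 3L, 3L)

# Intervened vertices for every row
row.targets <- targets[target.index]

# Logical n x p matrix: TRUE where a variable was intervened on in that row
p <- 5
intervened <- t(vapply(row.targets,
                       function(tg) seq_len(p) %in% tg,
                       logical(p)))
intervened
```

Rows 1 and 2 contain no TRUE entry because they are observational; rows 5 and 6 are flagged in columns 1 and 4.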
An interventional Markov equivalence class of DAGs can be uniquely represented by a partially directed graph called interventional essential graph. Its edges have the following interpretation:
a directed edge a → b stands for an arrow that has the same orientation in all representatives of the interventional Markov equivalence class;
an undirected edge a – b stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.
Note that when plotting the object, undirected and bidirected edges are equivalent.
GIES (greedy interventional equivalence search) is a score-based algorithm that greedily maximizes a score function (typically the BIC, passed to the function via the argument score) in the space of interventional essential graphs in three phases, starting from the empty graph:
In the forward phase, GIES moves through the space of interventional essential graphs in steps that correspond to the addition of a single edge in the space of DAGs; the phase is aborted as soon as the score cannot be augmented any more.
In the backward phase, the algorithm performs moves that correspond to the removal of a single edge in the space of DAGs until the score cannot be augmented any more.
In the turning phase, the algorithm performs moves that correspond to the reversal of a single arrow in the space of DAGs until the score cannot be augmented any more.
The phases that are actually run are specified with the argument phase. If iterate = TRUE, GIES cycles through the specified phases until no augmentation of the score is possible any more. GIES is an interventional extension of the GES (greedy equivalence search) algorithm of Chickering (2002) which is limited to observational data and hence operates on the space of observational instead of interventional Markov equivalence classes.
Using the argument fixedGaps, one can make sure that certain edges will not be present in the resulting essential graph: if the entry [i, j] of the matrix passed to fixedGaps is TRUE, there will be no edge between nodes i and j. Using this argument can speed up the execution of GIES and allows the user to account for previous knowledge or other constraints. The argument adaptive can be used to relax the constraints encoded by fixedGaps as follows:
When adaptive = "vstructures" and the algorithm introduces a new v-structure a → b ← c in the forward phase, then the edge a – c is removed from the list of fixed gaps, meaning that the insertion of an edge between a and c becomes possible even if it was forbidden by the initial matrix passed to fixedGaps.
When adaptive = "triples" and the algorithm introduces a new unshielded triple in the forward phase (i.e., a subgraph of three nodes a, b and c where a and b as well as b and c are adjacent, but a and c are not), then the edge a – c is removed from the list of fixed gaps.
These modifications of the forward phase of GIES are inspired by the analogous modifications in the forward phase of GES, which make the successive application of a skeleton estimation method and GES restricted to an estimated skeleton a consistent estimator of the DAG (cf. Nandy, Hauser and Maathuis, 2015).
gies returns a list with the following two components:
essgraph |
An object of class |
repr |
An object of a class derived from |
Alain Hauser
D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507–554.
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
P. Nandy, A. Hauser and M. Maathuis (2015). Understanding consistency in hybrid causal structure learning. arXiv preprint 1507.02608.
## Load predefined data
data(gmInt)

## Define the score (BIC)
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index)

## Estimate the essential graph
gies.fit <- gies(score)

## Plot the estimated essential graph and the true DAG
if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(gies.fit$essgraph, main = "Estimated ess. graph")
  plot(gmInt$g, main = "True DAG")
}
This data set contains a matrix with information on five binary variables (coded as 0/1) and the corresponding DAG model.
data(gmB)
The format is a list of two components
x: Int [1:5000, 1:5] 0 1 1 0 0 1 1 0 1 1 ...
g: Formal class 'graphNEL' [package "graph"] with 6 slots
.. ..@ nodes : chr [1:5] "1" "2" "3" "4" ...
.. ..@ edgeL :List of 5
........
The data was generated using Tetrad in the following way. A random DAG on five nodes was generated; binary variables were assigned to each node; then conditional probability tables corresponding to the structure of the generated DAG were constructed. Finally, 5000 samples were drawn using the conditional probability tables.
data(gmB) ## maybe str(gmB) ; plot(gmB) ...
This data set contains a matrix with information on five discrete variables (levels are coded as numbers) and the corresponding DAG model.
data(gmD)
A list of two components
x: a data.frame with 5 columns X1 .. X5, each coding a discrete variable (aka factor) with integers; Int [1:10000, 1:5] 2 2 1 1 1 2 2 0 2 0 ...
g: Formal class 'graphNEL' [package "graph"] with 6 slots
.. ..@ nodes : chr [1:5] "1" "2" "3" "4" ...
.. ..@ edgeL :List of 5
........
where x is the data matrix and g is the DAG from which the data were generated.
The data was generated using Tetrad in the following way. A random DAG on five nodes was generated; discrete variables were assigned to each node (with 3, 2, 3, 4 and 2 levels); then conditional probability tables corresponding to the structure of the generated DAG were constructed. Finally, 10000 samples were drawn using the conditional probability tables.
data(gmD)
str(gmD, max=1)
if(require("Rgraphviz"))
  plot(gmD$ g, main = "gmD $ g --- the DAG of the gmD (10'000 x 5 discrete data)")
## >>> 1 --> 3 <-- 2 --> 4 --> 5
str(gmD$x)
## The number of unique values of each variable:
sapply(gmD$x, function(v) nlevels(as.factor(v)))
## X1 X2 X3 X4 X5
## 3 2 3 4 2
lapply(gmD$x, table) ## the (marginal) empirical distributions
## $X1
## 0 1 2
## 1933 3059 5008
##
## $X2
## 0 1
## 8008 1992
##
## $X3
## .....
These two data sets each contain a matrix with information on eight Gaussian variables and the corresponding DAG model.
data(gmG)
gmG and gmG8 are each a list of two components:
x: a numeric matrix.
g: a graph, i.e., of formal class "graphNEL" from package graph with 6 slots
.. ..@ nodes : chr [1:8] "1" "2" "3" "4" ...
.. ..@ edgeL :List of 8
........
The data was generated as indicated below. First, a random DAG model was generated; then 5000 samples were drawn from this model. For gmG, the samples come only from “almost” this model: in the previous version, the data generation wgtMatrix had the non-zero weights in reversed order for each node. For gmG8, on the other hand, the correct weights were used in all cases.
The data set is identical to the one generated by
## Used to generate "gmG"
set.seed(40)
p <- 8
n <- 5000
## true DAG:
vars <- c("Author", "Bar", "Ctrl", "Goal", paste0("V",5:8))
gGtrue <- randomDAG(p, prob = 0.3, V = vars)
gmG  <- list(x = rmvDAG(n, gGtrue, back.compatible=TRUE), g = gGtrue)
gmG8 <- list(x = rmvDAG(n, gGtrue), g = gGtrue)
data(gmG)
str(gmG, max=3)
stopifnot(identical(gmG $ g, gmG8 $ g))
if(dev.interactive()) { ## to save time in tests
  round(as(gmG $ g, "Matrix"), 2) # weight ("adjacency") matrix
  if (require(Rgraphviz)) plot(gmG$g)
  pairs(gmG$x, gap = 0,
        panel=function(...) smoothScatter(..., add=TRUE))
}
This data set contains a matrix with information on seven Gaussian variables and the corresponding DAG model.
data(gmI)
The two gmI* objects are each a list of two components: x, a numeric matrix, and g, a DAG, a graph generated by randomDAG.
See gmG for more.
The data was generated as indicated below. First, a random DAG was generated, then samples were drawn from this model (strictly speaking, for gmI7 only).
The data sets are identical to those generated by
## Used to generate "gmI"
set.seed(123)
p <- 7
myDAG <- randomDAG(p, prob = 0.2) ## true DAG
gmI  <- list(x = rmvDAG(10000, myDAG, back.compatible=TRUE), g = myDAG)
gmI7 <- list(x = rmvDAG( 8000, myDAG), g = myDAG)
data(gmI)
str(gmI, max=3)
stopifnot(identical(gmI $ g, gmI7 $ g))
if(dev.interactive()) { ## to save time in tests
  round(as(gmI $ g, "Matrix"), 2) # weight ("adjacency") matrix
  if (require(Rgraphviz)) plot(gmI $ g)
  pairs(gmI$x, gap = 0,
        panel=function(...) smoothScatter(..., add=TRUE))
}
This data set contains a matrix with an ensemble of observational and interventional data from eight Gaussian variables. The corresponding (data generating) DAG model is also stored.
data(gmInt)
The format is a list of four components:
x: Matrix with 5000 rows (one row per measurement) and 8 columns (corresponding to the 8 variables).
targets: List of (mutually exclusive) intervention targets. In this example, the three entries integer(0), 3 and 5 indicate that the data set consists of observational data, interventional data originating from an intervention at vertex 3, and interventional data originating from an intervention at vertex 5.
target.index: Vector with 5000 elements. Each entry maps a row of x to the corresponding intervention target. Example: gmInt$target.index[3322] == 2 means that x[3322, ] was simulated from an intervention at gmInt$targets[[2]], i.e. at vertex 3.
g: Formal class 'graphNEL' [package "graph"] with 6 slots, representing the true DAG from which observational and interventional data was sampled.
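The relation between the list of intervention targets and the per-row index can be illustrated with a toy stand-in in base R. The objects below only mimic the shape of gmInt$targets and gmInt$target.index (10 rows instead of 5000); the helper names `row_targets`, `obs_rows` and `n_per_setting` are chosen for this sketch.

```r
## Toy stand-in for gmInt$targets and gmInt$target.index (10 rows, not 5000)
targets <- list(integer(0), 3, 5)
target.index <- c(rep(1, 6), rep(2, 2), rep(3, 2))

## intervention target of every row: index the list by target.index
row_targets <- targets[target.index]

## rows that are purely observational (empty intervention target)
obs_rows <- which(lengths(row_targets) == 0)

## number of rows per intervention setting
n_per_setting <- table(target.index)
n_per_setting
```

With the real data, the same indexing (`gmInt$targets[gmInt$target.index]`) would give the intervention target for each row of `gmInt$x`.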
The data was generated as indicated below. First, a random DAG model was generated, then 5000 samples were drawn from this model: 3000 observational ones, and 1000 each from an intervention at vertex 3 and 5, respectively (see gmInt$target.index).
The data set is identical to the one generated by
set.seed(40)
p <- 8
n <- 5000
gGtrue <- randomDAG(p, prob = 0.3)
pardag <- as(gGtrue, "GaussParDAG")
pardag$set.err.var(rep(1, p))
targets <- list(integer(0), 3, 5)
target.index <- c(rep(1, 0.6*n), rep(2, n/5), rep(3, n/5))

x1 <- rmvnorm.ivent(0.6*n, pardag)
x2 <- rmvnorm.ivent(n/5, pardag, targets[[2]],
                    matrix(rnorm(n/5, mean = 4, sd = 0.02), ncol = 1))
x3 <- rmvnorm.ivent(n/5, pardag, targets[[3]],
                    matrix(rnorm(n/5, mean = 4, sd = 0.02), ncol = 1))
gmInt <- list(x = rbind(x1, x2, x3),
              targets = targets,
              target.index = target.index,
              g = gGtrue)
data(gmInt)
str(gmInt, max = 3)
pairs(gmInt$x, gap = 0, pch = ".")
This data set contains a matrix with information on four Gaussian variables and the corresponding DAG model, which contains four observed variables and one latent variable.
data(gmL)
The format is a list of 2 components
$ x: num [1:10000, 1:4] 0.924 -0.189 1.016 0.363 0.497 ...
 ..- attr(*, "dimnames")=List of 2
 .. ..$ : NULL
 .. ..$ : chr [1:4] "2" "3" "4" "5"
$ g:Formal class 'graphNEL' [package "graph"] with 6 slots
 .. ..@ nodes : chr [1:5] "1" "2" "3" "4" ...
 .. ..@ edgeL :List of 5
 ........
The data was generated as indicated below. First, a random DAG model was generated with five nodes; then 10000 samples were drawn from this model; finally, variable one was declared to be latent and the corresponding column was deleted from the simulated data set.
## Used to generate "gmL"
set.seed(47)
p <- 5
n <- 10000
gGtrue <- randomDAG(p, prob = 0.3) ## true DAG
myX <- rmvDAG(n, gGtrue)
colnames(myX) <- as.character(1:5)
gmL <- list(x = myX[,-1], g = gGtrue)
data(gmL)
str(gmL, max=3)

## the graph:
gmL$g
graph::nodes(gmL$g) ; str(graph::edges(gmL$g))
if(require("Rgraphviz"))
  plot(gmL$g, main = "gmL $ g -- latent variable example data")

pairs(gmL $x) # the data
ida() estimates the multiset of possible joint total causal effects of variables (X) onto variables (Y) from observational data via adjustment.
ida(x.pos, y.pos, mcov, graphEst, method = c("local","optimal","global"), y.notparent = FALSE, verbose = FALSE, all.dags = NA, type = c("cpdag", "pdag"))
x.pos , x
|
Positions of variables |
y.pos , y
|
Positions of variables |
mcov |
Covariance matrix that was used to estimate |
graphEst |
Estimated CPDAG or PDAG. The CPDAG is typically from |
method |
Character string specifying the method with default
See details below. |
y.notparent |
Logical; for singleton |
verbose |
If TRUE, details on the regressions are printed. |
all.dags |
All DAGs in the equivalence class represented by the CPDAG or PDAG
can be precomputed by |
type |
Type of graph |
It is assumed that we have observational data from a multivariate Gaussian distribution faithful to the true (but unknown) underlying causal DAG (without hidden variables). Under these assumptions, this function estimates the multiset of possible total joint effects of X on Y. Here the total joint effect of $X = (X_1, X_2)$ on Y is defined via Pearl's do-calculus as the vector
$$\bigl(E[Y \mid do(X_1 = x_1 + 1, X_2 = x_2)] - E[Y \mid do(X_1 = x_1, X_2 = x_2)],\; E[Y \mid do(X_1 = x_1, X_2 = x_2 + 1)] - E[Y \mid do(X_1 = x_1, X_2 = x_2)]\bigr),$$
with a similar definition for more than two variables. These values are equal to the partial derivatives (evaluated at $(x_1, x_2)$) of $E[Y \mid do(X_1 = x_1', X_2 = x_2')]$ with respect to $x_1'$ and $x_2'$. Moreover, under the Gaussian assumption, these partial derivatives do not depend on the values at which they are evaluated.
We estimate a set of possible joint total causal effects instead of the unique joint total causal effect, since it is typically impossible to identify the latter when the true underlying causal DAG is unknown (even with an infinite amount of data). Conceptually, the method works as follows. First, we estimate the equivalence class of DAGs that describe the conditional independence relationships in the data, using the function pc (see the help file of this function). For each DAG G in the equivalence class, we apply Pearl's do-calculus to estimate the total causal effect of X on Y. This can be done via a simple linear regression adjusting for a valid adjustment set. For example, if X and Y are singletons and Y is not a parent of X, we can take the regression coefficient of X in the regression lm(Y ~ X + pa(X,G)), where pa(X,G) denotes the parents of X in the DAG G; if Y is a parent of X in G, we can set the estimated causal effect to zero.
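The regression step for a single DAG can be illustrated with simulated data. The sketch below assumes a hypothetical three-node DAG (Z -> X, Z -> Y, X -> Y) with made-up coefficients, so that pa(X,G) = {Z}; it is not taken from the package and does not show how ida() is implemented internally. It also shows that the same coefficient can be read off a covariance matrix alone, which is the kind of input the mcov argument provides.

```r
set.seed(1)
n <- 10000
## Hypothetical DAG G: Z -> X, Z -> Y, X -> Y, so pa(X, G) = {Z}
Z <- rnorm(n)
X <- 0.7 * Z + rnorm(n)
Y <- 0.8 * X + 0.5 * Z + rnorm(n)    # true total effect of X on Y is 0.8

## coefficient of X when adjusting for the parents of X
eff_adj <- unname(coef(lm(Y ~ X + Z))["X"])

## without adjustment the coefficient is confounded by Z
eff_unadj <- unname(coef(lm(Y ~ X))["X"])

## the adjusted coefficient computed from the sample covariance matrix alone
S <- cov(cbind(X = X, Z = Z, Y = Y))
adj <- c("X", "Z")
eff_cov <- unname(solve(S[adj, adj], S[adj, "Y"])[1])

c(adjusted = eff_adj, unadjusted = eff_unadj, from_cov = eff_cov)
```

The adjusted and covariance-based values coincide up to numerical error, while the unadjusted coefficient is biased upwards by the open path through Z.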
If the equivalence class contains k DAGs, this will yield k estimated total causal effects. Since we do not know which DAG is the true causal DAG, we do not know which estimated possible total joint causal effect of X on Y is the correct one. Therefore, we return the entire multiset of k estimated effects (it is a multiset rather than a set because it can contain duplicate values).
One can take summary measures of the multiset. For example, the minimum absolute value provides a lower bound on the size of the true causal effect: If the minimum absolute value of all values in the multiset is larger than one, then we know that the size of the true causal effect (up to sampling error) must be larger than one.
If method="global", the method as described above is carried out, where all DAGs in the equivalence class of the estimated CPDAG or PDAG graphEst are computed using the function pdag2allDags. The parent set for each DAG is then used to estimate the corresponding possible total causal effect. This method is suitable for small graphs (say, up to 10 nodes) and can only be used for singleton X.
If method="local", we only consider all valid possible directions of undirected edges that have X as an endpoint. In the case of a CPDAG, we consider all possible directions of undirected edges that have X as an endpoint, such that no new v-structure is created. Maathuis, Kalisch and Buehlmann (2009) showed that there is at least one DAG in the equivalence class for each such local configuration. Hence, the procedure is truly local in this setting. In the case of a PDAG, we need to verify for all possible directions whether they lead to an amenable max. PDAG if we apply Meek's orientation rules. In this setting the complexity of the "local" method is similar to the "optimal" one and it is not truly local. For details see Section 4.2 in Perkovic, Kalisch and Maathuis (2017).
We estimate the total causal effect of X on Y for each valid configuration as above, using linear regression adjusting for the corresponding possible parents. As we adjust for the same sets as in the "global" method, it follows that the multisets of total causal effects of the two methods have the same unique values. They may, however, have different multiplicities.
Since the parents of X are usually an inefficient valid adjustment set, we provide a third method that uses different adjustment sets.
If method="optimal", we do not determine all DAGs in the equivalence class of the CPDAG or PDAG. Instead, we only direct edges until obtaining an amenable PDAG, which is sufficient for computing the optimal valid adjustment set. Each amenable PDAG can be obtained by orienting the neighborhood of X and then applying Meek's orientation rules, similar to the "local" method for PDAGs. This can be done faster than the "global" method but is slower than the "local" method, especially for CPDAGs. For details see Witte, Henckel, Maathuis and Didelez (2019). For each amenable PDAG the corresponding optimal valid adjustment set is computed. The optimal set is a valid adjustment set irrespective of whether X is a singleton. Hence, as opposed to the other two, this method can be applied to sets X. Sometimes, however, a joint total causal effect cannot be estimated via adjustment. In these cases we recommend use of the pcalg function jointIda.
We then estimate the joint total causal effect of X on Y for each valid configuration with linear regression, adjusting for the possible optimal sets. If the estimated graph is correct, each of these regressions is guaranteed to be more efficient than the corresponding linear regression with any other valid adjustment set (see Henckel, Perkovic and Maathuis (2019) for more details). The estimates are, however, more sensitive to graph estimation errors than the ones obtained with the other two methods. If X is a singleton, the output of this method is a multiset of the same size as the output of the "local" method.
For example, a CPDAG may represent eight DAGs, and the "global" method may produce an estimate of the multiset of possible total effects {1.3, -0.5, 0.7, 1.3, 1.3, -0.5, 0.7, 0.7}. The unique values in this multiset are -0.5, 0.7 and 1.3, and the multiplicities are 2, 3 and 3. The "local" and "optimal" methods, on the other hand, may produce estimates of the multiset {1.3, -0.5, -0.5, 0.7}. The unique values are again -0.5, 0.7 and 1.3, but the multiplicities are now 2, 1 and 1. The fact that the unique values of the multisets for all three methods are identical implies that summary measures of the multiset that only depend on the unique values (such as the minimum absolute value) can be estimated with all three.
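The bookkeeping in this example can be checked with a few lines of base R. The two vectors below are simply the illustrative multisets from the paragraph above, typed in by hand; the variable names are chosen for this sketch.

```r
## Illustrative multisets from the text (not output of ida())
global_est <- c(1.3, -0.5, 0.7, 1.3, 1.3, -0.5, 0.7, 0.7)
local_est  <- c(1.3, -0.5, -0.5, 0.7)

## unique values with their multiplicities
table(global_est)
table(local_est)

## both multisets share the same unique values ...
same_unique <- setequal(global_est, local_est)

## ... so a summary that depends only on the unique values agrees
min_abs_global <- min(abs(global_est))
min_abs_local  <- min(abs(local_est))
```

Summaries that depend on multiplicities (for example the mean or median of the multiset) would in general differ between the methods.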
A list of length |Y| of matrices, each containing the possible joint total causal effect of X on one node in Y.
Markus Kalisch ([email protected]), Emilija Perkovic and Leonard Henckel
M.H. Maathuis, M. Kalisch, P. Buehlmann (2009). Estimating high-dimensional intervention effects from observational data. Annals of Statistics 37, 3133–3164.
M.H. Maathuis, D. Colombo, M. Kalisch, P. Bühlmann (2010). Predicting causal effects in large-scale systems from observational data. Nature Methods 7, 247–248.
C. Meek (1995). Causal inference and causal explanation with background knowledge, In Proceedings of UAI 1995, 403-410.
Markus Kalisch, Martin Maechler, Diego Colombo, Marloes H. Maathuis, Peter Buehlmann (2012). Causal inference using graphical models with the R-package pcalg. Journal of Statistical Software 47(11) 1–26, doi:10.18637/jss.v047.i11.
Pearl (2005). Causality. Models, reasoning and inference. Cambridge University Press, New York.
E. Perkovic, M. Kalisch and M.H. Maathuis (2017). Interpreting and using CPDAGs with background knowledge. In Proceedings of UAI 2017.
L. Henckel, E. Perkovic and M.H. Maathuis (2019). Graphical criteria for efficient total effect estimation via adjustment in causal linear models. Working Paper.
J. Witte, L. Henckel, M.H. Maathuis and V. Didelez (2019). On efficient adjustment in causal graphs. Working Paper.
jointIda for estimating the multiset of possible total joint effects; idaFast for faster estimation of the multiset of possible total causal effects for several target variables.
pc for estimating a CPDAG. addBgKnowledge for obtaining a PDAG from CPDAG and background knowledge.
## Simulate the true DAG
suppressWarnings(RNGversion("3.5.0"))
set.seed(123)
p <- 10
myDAG <- randomDAG(p, prob = 0.2) ## true DAG
myCPDAG <- dag2cpdag(myDAG) ## true CPDAG
myPDAG <- addBgKnowledge(myCPDAG,2,3) ## true PDAG with background knowledge 2 -> 3
covTrue <- trueCov(myDAG) ## true covariance matrix

## simulate Gaussian data from the true DAG
n <- 10000
dat <- rmvDAG(n, myDAG)

## estimate CPDAG and PDAG -- see help(pc)
suffStat <- list(C = cor(dat), n = n)
pc.fit <- pc(suffStat, indepTest = gaussCItest, p=p, alpha = 0.01)
pc.fit.pdag <- addBgKnowledge(pc.fit@graph,2,3)

if (require(Rgraphviz)) {
  ## plot the true and estimated graphs
  par(mfrow = c(1,3))
  plot(myDAG, main = "True DAG")
  plot(pc.fit, main = "Estimated CPDAG")
  plot(pc.fit.pdag, main = "Max. PDAG")
}

## Suppose that we know the true CPDAG and covariance matrix
(l.ida.cpdag <- ida(3,10, covTrue, myCPDAG, method = "local", type = "cpdag"))
(o.ida.cpdag <- ida(3,10, covTrue, myCPDAG, method = "optimal", type = "cpdag"))
## Not run:
(g.ida.cpdag <- ida(3,10, covTrue, myCPDAG, method = "global", type = "cpdag"))
## End(Not run)
## All three methods produce the same unique values.

## Suppose that we know the true PDAG and covariance matrix
(l.ida.pdag <- ida(3,10, covTrue, myPDAG, method = "local", type = "pdag"))
(o.ida.pdag <- ida(3,10, covTrue, myPDAG, method = "optimal", type = "pdag"))
## Not run:
(g.ida.pdag <- ida(3,10, covTrue, myPDAG, method = "global", type = "pdag"))
## End(Not run)
## All three methods produce the same unique values.

## From the true DAG, we can compute the true causal effect of 3 on 10
(ce.3.10 <- causalEffect(myDAG, 10, 3))
## Indeed, this value is contained in the values found by all methods

## When working with data we have to use the estimated CPDAG and
## the sample covariance matrix
(l.ida.est.cpdag <- ida(3,10, cov(dat), pc.fit@graph, method = "local", type = "cpdag"))
(o.ida.est.cpdag <- ida(3,10, cov(dat), pc.fit@graph, method = "optimal", type = "cpdag"))
## Not run:
(g.ida.est.cpdag <- ida(3,10, cov(dat), pc.fit@graph, method = "global", type = "cpdag"))
## End(Not run)
## The unique values of the local and the global method are still identical.
## While not identical, the values of the optimal method are very similar.
## The true causal effect is contained in all three sets, up to a small
## estimation error (0.118 vs. 0.112 with true value 0.114)

## Similarly, when working with data and background knowledge we have to use
## the estimated PDAG and the sample covariance matrix
(l.ida.est.pdag <- ida(3,10, cov(dat), pc.fit.pdag, method = "local", type = "pdag"))
(o.ida.est.pdag <- ida(3,10, cov(dat), pc.fit.pdag, method = "optimal", type = "pdag"))
## Not run:
(g.ida.est.pdag <- ida(3,10, cov(dat), pc.fit.pdag, method = "global", type = "pdag"))
## End(Not run)
## The unique values of the local and the global method are still identical.
## While not necessarily identical, the values of the optimal method will be similar.
## The true causal effect is contained in both sets, up to a small estimation error

## All three can also be applied to sets y.
(l.ida.cpdag.2 <- ida(3,c(6,10), cov(dat), pc.fit@graph, method = "local", type = "cpdag"))
(o.ida.cpdag.2 <- ida(3,c(6,10), cov(dat), pc.fit@graph, method = "optimal", type = "cpdag"))
## Not run:
(g.ida.cpdag.2 <- ida(3,c(6,10), cov(dat), pc.fit@graph, method = "global", type = "cpdag"))
## End(Not run)
## For the methods local and global we recommend use of idaFast in this case
## for better performance.

## Note that only the optimal method can be applied to sets x.
(o.ida.cpdag.2 <- ida(c(2,3),10, cov(dat), pc.fit@graph, method = "optimal", type = "cpdag"))
This function estimates the multiset of possible total causal effects of one variable (x) on several (i.e., a vector of) target variables (y) from observational data.
idaFast() is more efficient than looping over ida. Only method="local" (see ida) is available.
idaFast(x.pos, y.pos.set, mcov, graphEst)
x.pos |
(integer) position of variable |
y.pos.set |
integer vector of positions of the target variables
|
mcov |
covariance matrix that was used to estimate |
graphEst |
estimated CPDAG from the function
|
This function performs ida(x.pos, y.pos, mcov, graphEst, method="local", y.notparent=FALSE, verbose=FALSE) for all values of y.pos in y.pos.set simultaneously, in an efficient way. See (the help about) ida for more details. Note that the option y.notparent = TRUE is not implemented, since it is not clear how to do that efficiently without orienting all edges away from y.pos.set at the same time, which seems not to be desirable. Suggestions are welcome.
Matrix with length(y.pos.set) rows. Row i contains the multiset of estimated possible total causal effects of x on y.pos.set[i]. Note that all multisets in the matrix have the same length, since the parents of x are the same for all elements of y.pos.set.
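Why sharing the parents of x across targets saves work can be sketched with plain linear algebra. The example below is an assumption-laden illustration, not the idaFast() source: the simulated variables, the coefficients and the single assumed parent set `pa.x` are invented, and only one parent configuration (that is, one column of the result matrix) is computed.

```r
set.seed(1)
n <- 5000
## Hypothetical model: Z -> X, Z -> Y1, X -> Y1, X -> Y2
Z  <- rnorm(n)
X  <- 0.7 * Z + rnorm(n)
Y1 <- 0.8 * X + 0.5 * Z + rnorm(n)
Y2 <- -0.3 * X + rnorm(n)
mcov <- cov(cbind(Z, X, Y1, Y2))

x.pos <- 2
y.pos.set <- c(3, 4)
pa.x <- 1                       # assumed parent set of x for this configuration
adj <- c(x.pos, pa.x)

## one linear solve yields the coefficient of x for all targets at once
eff_all <- solve(mcov[adj, adj], mcov[adj, y.pos.set, drop = FALSE])[1, ]

## the same numbers, computed one target at a time
eff_loop <- sapply(y.pos.set,
                   function(y) solve(mcov[adj, adj], mcov[adj, y])[1])

rbind(eff_all, eff_loop)
```

Because the adjustment set depends only on x, the matrix `mcov[adj, adj]` is the same for every target; only the right-hand side changes.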
Markus Kalisch ([email protected])
see the list in ida.
pc for estimating a CPDAG, and ida for estimating the multiset of possible total causal effects from observational data on only one target variable but with many more options (than here in idaFast).
## Simulate the true DAG
set.seed(123)
p <- 7
myDAG <- randomDAG(p, prob = 0.2) ## true DAG
myCPDAG <- dag2cpdag(myDAG) ## true CPDAG
covTrue <- trueCov(myDAG) ## true covariance matrix

## simulate data from the true DAG
n <- 10000
dat <- rmvDAG(n, myDAG)
cov.d <- cov(dat)

## estimate CPDAG (see help on the function "pc")
suffStat <- list(C = cor(dat), n = n)
pc.fit <- pc(suffStat, indepTest = gaussCItest, alpha = 0.01, p=p)

if(require(Rgraphviz)) {
  op <- par(mfrow=c(1,3))
  plot(myDAG, main="true DAG")
  plot(myCPDAG, main="true CPDAG")
  plot(pc.fit@graph, main="pc()-estimated CPDAG")
  par(op)
}

(eff.est1 <- ida(2,5, cov.d, pc.fit@graph))## method = "local" is default
(eff.est2 <- ida(2,6, cov.d, pc.fit@graph))
(eff.est3 <- ida(2,7, cov.d, pc.fit@graph))
## These three computations can be combined in an efficient way
## by using idaFast :
(eff.estF <- idaFast(2, c(5,6,7), cov.d, pc.fit@graph))
Notably, when the Rgraphviz package is not easily available, iplotPC() is an alternative for plotting a "pcAlgo" object, making use of package igraph.
It extracts the adjacency matrix and converts it into an object from package igraph which is then plotted.
iplotPC(pc.fit, labels = NULL)
pc.fit |
an R object of class |
labels |
optional labels for nodes; by default, the labels from
the |
Nothing. As a side effect, the pcAlgo object pc.fit is plotted.
Note that this function does not work on fciAlgo objects, as those need different edge marks.
Markus Kalisch [email protected]
showEdgeList for printing the edge list of a pcAlgo object; showAmat for printing the adjacency matrix of a pcAlgo object.
## Load predefined data
data(gmG)
n <- nrow (gmG8$x)
V <- colnames(gmG8$x)

## define sufficient statistics
suffStat <- list(C = cor(gmG8$x), n = n)
## estimate CPDAG
pc.fit <- pc(suffStat, indepTest = gaussCItest,
             alpha = 0.01, labels = V, verbose = TRUE)

## Edge list
showEdgeList(pc.fit)

## Adjacency matrix
showAmat(pc.fit)

## Plot using package igraph; show estimated CPDAG:
iplotPC(pc.fit)
Check whether the adjacency matrix amat matches the specified type.
isValidGraph(amat, type = c("pdag", "cpdag", "dag"), verbose = FALSE)
amat |
adjacency matrix of type |
type |
string specifying the type of graph of the adjacency matrix amat. It
can be a DAG ( |
verbose |
If TRUE, detailed output on why the graph might not be valid is provided. |
For a given adjacency matrix amat and graph type, this function checks whether the two match.
For type = "dag" we require that amat does NOT contain directed cycles.
For type = "cpdag" we require that amat does NOT contain directed or partially directed cycles. We also require that the undirected part of the CPDAG (represented by amat) is made up of chordal components and that our graph is maximally oriented according to rules from Meek (1995).
For type = "pdag" we require that amat does NOT contain directed cycles. We also require that the PDAG is maximally oriented according to rules from Meek (1995). Additionally, we require that the adjacency matrix amat1 of the CPDAG corresponding to our PDAG (represented by amat) satisfies isValidGraph(amat = amat1, type = "cpdag") == TRUE and that there is no mismatch in the orientations implied by amat and amat1. We obtain amat1 by extracting the skeleton and v-structures from amat and then closing the orientation rules from Meek (1995).
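The "no directed cycles" requirement shared by all three types can be illustrated with a small base R sketch. The helper `has_directed_cycle` is hypothetical and written for this illustration only; it is not part of pcalg and covers just this one condition, not the chordality or Meek-rule checks. It assumes the encoding used in the examples of this help page, where amat[j, i] = 1 together with amat[i, j] = 0 stands for i -> j.

```r
## Hypothetical helper (not from pcalg): TRUE if the directed part of
## 'amat' contains a directed cycle. Assumed encoding: amat[j, i] != 0 and
## amat[i, j] == 0 means i -> j; symmetric entries are undirected edges.
has_directed_cycle <- function(amat) {
  directed <- (t(amat) != 0) & (amat == 0)   # directed[i, j] is TRUE iff i -> j
  remaining <- seq_len(nrow(amat))
  repeat {
    if (length(remaining) == 0) return(FALSE)       # all nodes removed: acyclic
    sub <- directed[remaining, remaining, drop = FALSE]
    sources <- which(colSums(sub) == 0)             # no incoming directed edge
    if (length(sources) == 0) return(TRUE)          # every node has one: cycle
    remaining <- remaining[-sources]
  }
}

## a -> b -> c  (no directed cycle)
chain <- matrix(c(0,1,0, 0,0,1, 0,0,0), 3, 3)
## a -> b -> c -> a  (directed cycle)
cyc <- matrix(c(0,1,0, 0,0,1, 1,0,0), 3, 3)
## a -- b -- c  (undirected only, hence no directed cycle)
undirected <- matrix(c(0,1,0, 1,0,1, 0,1,0), 3, 3)

c(chain = has_directed_cycle(chain),
  cyc = has_directed_cycle(cyc),
  undirected = has_directed_cycle(undirected))
```

The sketch repeatedly removes nodes without incoming directed edges; a non-empty remainder in which every node still has an incoming edge must contain a cycle. Note that passing this check is necessary but not sufficient for validity: the undirected chain has no directed cycle, yet it is not a valid DAG.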
TRUE, if the adjacency matrix amat is of the type specified and FALSE, otherwise.
Emilija Perkovic and Markus Kalisch
C. Meek (1995). Causal inference and causal explanation with background knowledge, In Proceedings of UAI 1995, 403-410.
## a -> b -> c
amat <- matrix(c(0,1,0, 0,0,1, 0,0,0), 3,3)
colnames(amat) <- rownames(amat) <- letters[1:3]
## graph::plot(as(t(amat), "graphNEL"))
isValidGraph(amat = amat, type = "dag") ## is a valid DAG
isValidGraph(amat = amat, type = "cpdag") ## not a valid CPDAG
isValidGraph(amat = amat, type = "pdag") ## is a valid PDAG

## a -- b -- c
amat <- matrix(c(0,1,0, 1,0,1, 0,1,0), 3,3)
colnames(amat) <- rownames(amat) <- letters[1:3]
## plot(as(t(amat), "graphNEL"))
isValidGraph(amat = amat, type = "dag") ## not a valid DAG
isValidGraph(amat = amat, type = "cpdag") ## is a valid CPDAG
isValidGraph(amat = amat, type = "pdag") ## is a valid PDAG

## a -- b -- c -- d -- a
amat <- matrix(c(0,1,0,1, 1,0,1,0, 0,1,0,1, 1,0,1,0), 4,4)
colnames(amat) <- rownames(amat) <- letters[1:4]
## plot(as(t(amat), "graphNEL"))
isValidGraph(amat = amat, type = "dag") ## not a valid DAG
isValidGraph(amat = amat, type = "cpdag") ## not a valid CPDAG
isValidGraph(amat = amat, type = "pdag") ## not a valid PDAG
jointIda() estimates the multiset of possible total joint effects of a set of intervention variables (X) on another variable (Y) from observational data. This is a version of ida that allows multiple simultaneous interventions.
jointIda(x.pos, y.pos, mcov, graphEst = NULL, all.pasets = NULL, technique = c("RRC", "MCD"), type = c("pdag", "cpdag", "dag"))
x.pos |
(integer vector) positions of the intervention variables
|
y.pos |
(integer) position of variable |
mcov |
(estimated) covariance matrix. |
graphEst |
(graphNEL object) Estimated CPDAG or PDAG. The CPDAG is typically from |
all.pasets |
(an optional argument and the default is
|
technique |
character string specifying the technique that will be used to estimate the total joint causal effects (given the parent sets), see details below.
|
type |
Type of graph |
It is assumed that we have observational data that are multivariate
Gaussian and faithful to the true (but unknown) underlying causal DAG
(without hidden variables). Under these assumptions, this function
estimates the multiset of possible total joint effects of X on Y.
Here the total joint effect of $X = (X_1, X_2)$ on $Y$ is defined via Pearl's do-calculus as the vector
$(E[Y \mid do(X_1 = x_1 + 1, X_2 = x_2)] - E[Y \mid do(X_1 = x_1, X_2 = x_2)],\; E[Y \mid do(X_1 = x_1, X_2 = x_2 + 1)] - E[Y \mid do(X_1 = x_1, X_2 = x_2)])$,
with a similar definition for more than two variables. These values
are equal to the partial derivatives (evaluated at $(x_1, x_2)$) of
$E[Y \mid do(X_1 = x_1', X_2 = x_2')]$ with respect to $x_1'$ and $x_2'$.
Moreover, under the Gaussian assumption, these partial
derivatives do not depend on the values at which they are evaluated.
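The definition above can be checked by hand on a toy model. The following is a minimal sketch in base R, assuming a hypothetical linear SEM X1 -> X2 -> Y with an extra edge X1 -> Y, zero-mean noise and made-up edge weights; the names `b.x1.y`, `b.x2.y`, `expected.y.do` and `joint.effect` are illustrative and not part of the package.

```r
## Hypothetical linear SEM (illustration only): X1 -> X2 -> Y and X1 -> Y
b.x1.y <- 0.5   # assumed weight of the edge X1 -> Y
b.x2.y <- 1.2   # assumed weight of the edge X2 -> Y

## E[Y | do(X1 = x1, X2 = x2)]: both variables are set by intervention,
## so the edge X1 -> X2 plays no role and the noise terms average out
expected.y.do <- function(x1, x2) b.x1.y * x1 + b.x2.y * x2

## Total joint effect as defined above: unit increase in one intervention
## variable while the other one is held fixed
joint.effect <- function(x1, x2) {
  base <- expected.y.do(x1, x2)
  c(expected.y.do(x1 + 1, x2) - base,
    expected.y.do(x1, x2 + 1) - base)
}

joint.effect(0, 0)
joint.effect(3, -2)   # same vector: the effect does not depend on (x1, x2)
```

In this linear toy model the joint effect is the vector of the two assumed edge weights into Y, whatever point it is evaluated at.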
We estimate a multiset of possible total joint effects instead of the unique total joint effect, since it is typically impossible to identify the latter when the true underlying causal DAG is unknown (even with an infinite amount of data).
Conceptually, the method
works as follows. First, we estimate the CPDAG or PDAG based on the data.
The CPDAG represents the equivalence class of DAGs and can be estimated
from observational data with the function pc
(see the help file of this function).
The PDAG contains more orientations than the CPDAG and thus, represents a
smaller equivalence class of DAGs, compared to the CPDAG. We can obtain a PDAG if
we have background knowledge of, for example, certain edge orientations of
undirected edges in the CPDAG. We obtain the PDAG by adding these
orientations to the CPDAG using the function addBgKnowledge
(see the help file of this function).
Then we extract a collection of "jointly valid" parent sets of the intervention variables from the estimated CPDAG or PDAG. For each set of jointly valid parent sets we apply RRC (recursive regressions for causal effects) or MCD (modifying the Cholesky decomposition) to estimate the total joint effect of X on Y from the sample covariance matrix (see Section 3 of Nandy et al., 2015).
A matrix representing the multiset containing the estimated
possible total joint effects of X
on Y
. The number of
rows is equal to length(x.pos)
, i.e., each column represents a
vector of possible joint causal effects.
For a single variable X
, jointIda()
estimates the
same quantities as ida()
.
If graphEst
is of type = "cpdag"
, jointIda()
obtains
all.pasets
by using the semi-local approach described in Section 5 in Nandy et al. (2015).
Nandy et al. (2015) show that jointIda()
yields
correct multiplicities of the distinct elements of the resulting multiset (in the sense that it matches
ida()
with method="global"
up to a constant factor).
If graphEst
is of type = "pdag"
, jointIda()
obtains
all.pasets
by using the semi-local approach described in Algorithm 2,
Section 4.2 in Perkovic et al. (2017). For this case, jointIda()
does not necessarily yield
the correct multiplicities of the distinct elements of the resulting multiset (it behaves similarly to
ida()
with method="local"
).
jointIda()
(like idaFast
) also allows direct
computation of the total joint effect of a set of intervention
variables X
on another set of target variables Y
. In
this case, y.pos
must be an integer vector containing positions
of the target variables Y
in the covariance matrix and the
output is a list of matrices that correspond to the variables in
Y
in the same order. This method is slightly more efficient
than looping over jointIda()
with single target variables, if
all.pasets
is not specified.
Preetam Nandy, Emilija Perkovic
P. Nandy, M.H. Maathuis and T.S. Richardson (2017). Estimating the effect of joint interventions from observational data in sparse high-dimensional settings. In Annals of Statistics.
E. Perkovic, M. Kalisch and M.H. Maathuis (2017). Interpreting and using CPDAGs with background knowledge. In Proceedings of UAI 2017.
ida
, the simple version;
pc
for estimating a CPDAG.
## Create a weighted DAG
p <- 6
V <- as.character(1:p)
edL <- list(
  "1" = list(edges=c(3,4), weights=c(1.1,0.3)),
  "2" = list(edges=c(6), weights=c(0.4)),
  "3" = list(edges=c(2,4,6),weights=c(0.6,0.8,0.9)),
  "4" = list(edges=c(2),weights=c(0.5)),
  "5" = list(edges=c(1,4),weights=c(0.2,0.7)),
  "6" = NULL)
myDAG <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed") ## true DAG
myCPDAG <- dag2cpdag(myDAG) ## true CPDAG
myPDAG <- addBgKnowledge(myCPDAG,1,3) ## true PDAG with background knowledge 1 -> 3
covTrue <- trueCov(myDAG) ## true covariance matrix

n <- 1000
## simulate Gaussian data from the true DAG
dat <- if (require("mvtnorm")) {
  set.seed(123)
  rmvnorm(n, mean=rep(0,p), sigma=covTrue)
} else readRDS(system.file(package="pcalg", "external", "N_6_1000.rds"))

## estimate CPDAG and PDAG -- see help(pc), help(addBgKnowledge)
suffStat <- list(C = cor(dat), n = n)
pc.fit <- pc(suffStat, indepTest = gaussCItest, p = p, alpha = 0.01, u2pd="relaxed")
pc.fit.pdag <- addBgKnowledge(pc.fit@graph,1,3)

if (require(Rgraphviz)) {
  ## plot the true and estimated graphs
  par(mfrow = c(1,3))
  plot(myDAG, main = "True DAG")
  plot(pc.fit, main = "Estimated CPDAG")
  plot(pc.fit.pdag, main = "Estimated PDAG")
}

## Suppose that we know the true CPDAG and covariance matrix
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myCPDAG, technique="RRC", type = "cpdag")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myCPDAG, technique="MCD", type = "cpdag")

## Suppose that we know the true PDAG and covariance matrix
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myPDAG, technique="RRC", type = "pdag")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myPDAG, technique="MCD", type = "pdag")

## Instead of knowing the true CPDAG or PDAG, it is enough to know only
## the jointly valid parent sets of the intervention variables
## to use RRC or MCD
## all.jointly.valid.pasets:
ajv.pasets <- list(list(5,c(3,4)),list(integer(0),c(3,4)),list(3,c(3,4)))
jointIda(x.pos=c(1,2), y.pos=6, covTrue, all.pasets=ajv.pasets, technique="RRC")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, all.pasets=ajv.pasets, technique="MCD")

## From the true DAG, we can compute the true total joint effects
## using RRC or MCD
cat("Dim covTrue: ", dim(covTrue),"\n")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myDAG, technique="RRC", type = "dag")
jointIda(x.pos=c(1,2), y.pos=6, covTrue, graphEst=myDAG, technique="MCD", type = "dag")

## When working with data, we have to use the estimated CPDAG or PDAG
## and the sample covariance matrix
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit@graph, technique="RRC", type = "cpdag")
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit@graph, technique="MCD", type = "cpdag")
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit.pdag, technique="RRC", type = "pdag")
jointIda(x.pos=c(1,2), y.pos=6, cov(dat), graphEst=pc.fit.pdag, technique="MCD", type = "pdag")
## RRC and MCD can produce different results when working with data

## jointIda also works when x.pos has length 1 and in the following example
## it gives the same result as ida() (see Note)
##
## When the CPDAG is known
jointIda(x.pos=1, y.pos=6, covTrue, graphEst=myCPDAG, technique="RRC", type = "cpdag")
ida(x.pos=1, y.pos=6, covTrue, graphEst=myCPDAG, method="global", type = "cpdag")

## When the PDAG is known
jointIda(x.pos=1, y.pos=6, covTrue, graphEst=myPDAG, technique="RRC", type = "pdag")
ida(x.pos=1, y.pos=6, covTrue, graphEst=myPDAG, method="global", type = "pdag")

## When the DAG is known
jointIda(x.pos=1, y.pos=6, covTrue, graphEst=myDAG, technique="RRC", type = "dag")
ida(x.pos=1, y.pos=6, covTrue, graphEst=myDAG, method="global")
## Note that causalEffect(myDAG,y=6,x=1) does not give the correct value in this case,
## since this function requires that the variables are in a causal order.
Check if the 3-node path a - b - c is legal.
A 3-node path a - b - c is “legal” iff either b is a collider or a, b, c form a triangle.
legal.path(a, b, c, amat)
a , b , c
|
(integer) positions in adjacency matrix of nodes |
amat |
Adjacency matrix (coding 0,1,2,3 for no edge, circle,
arrowhead, tail; e.g., |
TRUE
if path is legal, otherwise FALSE
.
Prerequisite: a, b, c must be in a path (and this is not checked by legal.path()).
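The rule stated above can be written down in a few lines of base R. The following is a sketch only, not the package's source: `legal_path_sketch` is a hypothetical name, and the code assumes the edge-mark coding described for `amat` (0 = no edge, 1 = circle, 2 = arrowhead, 3 = tail), with `amat[i, j]` read as the mark at node j.

```r
## Hypothetical re-statement of the "legal path" rule (illustration only)
legal_path_sketch <- function(a, b, c, amat) {
  ## a - b - c must at least be a path with distinct end nodes
  if (a == c || amat[a, b] == 0 || amat[b, c] == 0) return(FALSE)
  is.triangle <- amat[a, c] != 0                     # a and c are adjacent
  is.collider <- amat[a, b] == 2 && amat[c, b] == 2  # arrowheads at b from both sides
  is.triangle || is.collider
}

amat <- matrix( c(0,1,1,0,0, 2,0,1,0,0, 2,2,0,2,1,
                  0,0,1,0,0, 0,0,2,0,0), 5,5)
legal_path_sketch(1, 3, 5, amat)  # neither a triangle nor a collider at node 3
legal_path_sketch(1, 2, 3, amat)  # nodes 1 and 3 are adjacent: triangle
legal_path_sketch(2, 3, 4, amat)  # arrowheads at node 3 from both sides: collider
```

Unlike the prerequisite noted above, this sketch returns FALSE when a - b - c is not a path at all; that extra check is a design choice of the illustration.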
Markus Kalisch ([email protected])
amat <- matrix( c(0,1,1,0,0, 2,0,1,0,0, 2,2,0,2,1,
                  0,0,1,0,0, 0,0,2,0,0), 5,5)
legal.path(1,3,5, amat)
legal.path(1,2,3, amat)
legal.path(2,3,4, amat)
Fits a Linear non-Gaussian Acyclic Model (LiNGAM) to the data and returns the corresponding DAG.
For details, see the reference below.
lingam(X, verbose = FALSE)

## For back-compatibility; this is *deprecated*
LINGAM(X, verbose = FALSE)
X |
n x p data matrix (n: sample size, p: number of variables). |
verbose |
logical or integer indicating that increased diagnostic output is to be provided. |
lingam()
returns an R object of (S3) class "LINGAM"
,
basically a list
with components
Bpruned |
a |
stde |
a vector of length |
ci |
a vector of length |
LINGAM()
— deprecated now — returns a list
with components
Adj |
a |
B |
|
Of LINGAM()
and the underlying functionality,
Patrik Hoyer <[email protected]>, Doris Entner
<[email protected]>, Antti Hyttinen <[email protected]>
and Jonas Peters <[email protected]>.
S. Shimizu, P.O. Hoyer, A. Hyvärinen, A. Kerminen (2006) A Linear Non-Gaussian Acyclic Model for Causal Discovery; Journal of Machine Learning Research 7, 2003–2030.
fastICA
from package fastICA is used.
##################################################
## Exp 1
##################################################
set.seed(1234)
n <- 500
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
eps2 <- runif(n) - 0.5

x2 <- 3 + eps2
x1 <- 0.9*x2 + 7 + eps1

#truth: x1 <- x2
trueDAG <- cbind(c(0,1),c(0,0))

X <- cbind(x1,x2)
res <- lingam(X)

cat("true DAG:\n")
show(trueDAG)

cat("estimated DAG:\n")
as(res, "amat")

cat("\n true constants:\n")
show(c(7,3))
cat("estimated constants:\n")
show(res$ci)

cat("\n true (sample) noise standard deviations:\n")
show(c(sd(eps1), sd(eps2)))
cat("estimated noise standard deviations:\n")
show(res$stde)

##################################################
## Exp 2
##################################################
set.seed(123)
n <- 500
eps1 <- sign(rnorm(n)) * sqrt(abs(rnorm(n)))
eps2 <- runif(n) - 0.5
eps3 <- sign(rnorm(n)) * abs(rnorm(n))^(1/3)
eps4 <- rnorm(n)^2

x2 <- eps2
x1 <- 0.9*x2 + eps1
x3 <- 0.8*x2 + eps3
x4 <- -x1 -0.9*x3 + eps4

X <- cbind(x1,x2,x3,x4)

trueDAG <- cbind(x1 = c(0,1,0,0),
                 x2 = c(0,0,0,0),
                 x3 = c(0,1,0,0),
                 x4 = c(1,0,1,0))
## x4 <- x3 <- x2 -> x1 -> x4
## adjacency matrix:
## 0 0 0 1
## 1 0 1 0
## 0 0 0 1
## 0 0 0 0

res1 <- lingam(X, verbose = TRUE) # details on LINGAM
res2 <- lingam(X, verbose = 2)    # details on LINGAM and fastICA
## results are the same, of course:
stopifnot(identical(res1, res2))

cat("true DAG:\n")
show(trueDAG)

cat("estimated DAG:\n")
as(res1, "amat")
In a data set with n measurements of p variables, intervened variables can be specified in two ways:
with a logical intervention matrix of dimension n x p, where the entry [i, j] indicates whether variable j has been intervened in measurement i; or
with a list of (unique) intervention targets and an n-dimensional vector indicating the indices of the intervention targets of the n measurements.
The function mat2targets
converts the first representation to the
second one, the function targets2mat
does the reverse conversion. The
second representation can be used to create scoring objects (see
Score
) and to run causal inference methods based on
interventional data such as gies
or simy
.
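To make the two representations concrete, the following base R sketch converts between them without the package. The helper names `mat_to_targets_sketch` and `targets_to_mat_sketch` are hypothetical, and the ordering of the unique targets (by first occurrence) is an assumption of this illustration, not a documented property of mat2targets.

```r
## Hypothetical helpers illustrating the two representations (not pcalg code)
mat_to_targets_sketch <- function(A) {
  ## intervention target of data point i: the columns that are TRUE in row i
  row.targets <- lapply(seq_len(nrow(A)), function(i) which(A[i, ]))
  ## a string key per target makes it easy to find the unique ones
  keys <- vapply(row.targets, paste, character(1), collapse = ",")
  list(targets      = row.targets[!duplicated(keys)],
       target.index = match(keys, unique(keys)))
}

targets_to_mat_sketch <- function(p, targets, target.index) {
  A <- matrix(FALSE, nrow = length(target.index), ncol = p)
  for (i in seq_along(target.index))
    A[i, targets[[target.index[i]]]] <- TRUE
  A
}

## 10 data points, 5 variables, one intervened variable per data point
p <- 5
n <- 10
A <- matrix(FALSE, nrow = n, ncol = p)
for (i in 1:n) A[i, (i-1) %% p + 1] <- TRUE

res <- mat_to_targets_sketch(A)
str(res)
identical(A, targets_to_mat_sketch(p, res$targets, res$target.index))
```

The list form stores each distinct target once, which is why it is the more compact of the two when many data points share the same intervention.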
mat2targets(A)
targets2mat(p, targets, target.index)
A |
Logical matrix with |
p |
Number of variables |
targets |
List of unique intervention targets |
target.index |
Vector of intervention target indices. The intervention
target of data point |
mat2targets
returns a list with two components:
targets |
A list of unique intervention targets. |
target.index |
A vector of intervention target indices. The intervention
target of data point |
Alain Hauser ([email protected])
## Specify interventions using a matrix
p <- 5
n <- 10
A <- matrix(FALSE, nrow = n, ncol = p)
for (i in 1:n) A[i, (i-1) %% p + 1] <- TRUE

## Generate list of intervention targets and corresponding indices
target.list <- mat2targets(A)
## cat() is needed: sprintf() alone prints nothing inside a for loop
for (i in seq_along(target.list$target.index))
  cat(sprintf("Intervention target of %d-th data point: %d\n",
              i, target.list$targets[[target.list$target.index[i]]]))

## Convert back to matrix representation
all(A == targets2mat(p, target.list$targets, target.list$target.index))
Compute a correlation matrix, possibly by robust methods, applicable also for the case of a large number of variables.
mcor(dm, method = c("standard", "Qn", "QnStable", "ogkScaleTau2", "ogkQn", "shrink"))
dm |
numeric data matrix; rows are observations (“samples”), columns are variables. |
method |
a string; |
The "standard" method invokes a standard correlation estimator. "Qn" invokes a robust, elementwise correlation estimator based on the Qn scale estimate. "QnStable" also uses the Qn scale estimator, but uses an improved way of transforming that into the correlation estimator. "ogkQn" invokes a correlation estimator based on Qn using OGK. "shrink" is only useful when used with pcSelect. An optimal shrinkage parameter is used. Only the correlation between response and covariates is shrunk.
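As an illustration of how a robust scale estimate can be turned into an elementwise correlation estimate, the sketch below uses the classical sum-and-difference identity on standardized variables. It is an assumption that this matches the idea behind the "Qn"-type methods; the exact transformation used by mcor is not stated here. The sketch also swaps in stats::mad() for Qn so that it runs in base R without robustbase, and `robust_cor_sketch` is a hypothetical name.

```r
## Sketch only: robust correlation from a robust scale estimate.
## stats::mad() stands in for Qn so that no extra package is needed.
robust_cor_sketch <- function(x, y, scale.fun = stats::mad) {
  u <- x / scale.fun(x) + y / scale.fun(y)  # sum of standardized variables
  v <- x / scale.fun(x) - y / scale.fun(y)  # difference of standardized variables
  su2 <- scale.fun(u)^2
  sv2 <- scale.fun(v)^2
  (su2 - sv2) / (su2 + sv2)                 # always lies in [-1, 1]
}

set.seed(42)
x <- rnorm(100)
yNoise <- 2 * x + rcauchy(100)   # linear signal with heavy-tailed noise
cor(x, yNoise)                   # classical estimate, sensitive to outliers
robust_cor_sketch(x, yNoise)     # robust estimate based on the MAD
```

Passing another scale function through the hypothetical `scale.fun` argument (for example Qn from robustbase, if installed) gives the corresponding variant of the same construction.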
A correlation matrix estimated by the specified method.
Markus Kalisch [email protected] and Martin Maechler
See those in the help pages for Qn
and
covOGK
from package robustbase.
Qn
and covOGK
from package robustbase.
pcorOrder
for computing partial correlations.
## produce correlated normal random variables
set.seed(42)
x <- rnorm(100)
y <- 2*x + rnorm(100)
## compute correlation of x and y
mcor(cbind(x,y), method="standard")

## repeat but this time with heavy-tailed noise
yNoise <- 2*x + rcauchy(100)
mcor(cbind(x,yNoise), method="standard") ## shows almost no correlation
mcor(cbind(x,yNoise), method="Qn")       ## shows a lot of correlation
mcor(cbind(x,yNoise), method="QnStable") ## shows still much correlation
mcor(cbind(x,yNoise), method="ogkQn")    ## ditto
Given a (observational or interventional) essential graph (or "CPDAG"), find the optimal intervention target that maximizes the number of edges that can be oriented after the intervention.
opt.target(essgraph, max.size, use.node.names = TRUE)
essgraph |
An |
max.size |
Maximum size of the intervention target. Only 1 and the
number of vertices of |
use.node.names |
Indicates if the intervention target should be
returned as a list of node names (if |
This function implements active learning strategies for structure learning from interventional data, one that calculates an optimal single-vertex intervention target, and one that calculates an optimal intervention target of arbitrary size. "Optimal" means the proposed intervention target guarantees the highest number of edges that can be oriented after performing the intervention, assuming the essential graph provided as input is the true essential graph under the currently available interventional data (i.e., neglecting possible estimation errors).
Implementation corresponds to algorithms "OptSingle" and "OptUnb" published in Hauser and Bühlmann (2012).
A character vector of node names (if use.node.names = TRUE
), or an
integer vector of node indices (if use.node.names = FALSE
) indicating
the optimal intervention target.
Alain Hauser ([email protected])
A. Hauser and P. Bühlmann (2012). Two optimal strategies for active learning of causal models from interventions. Proceedings of the 6th European Workshop on Probabilistic Graphical Models (PGM-2012), 123–130
## Load predefined data
data(gmG)

## Define the score (BIC)
score <- new("GaussL0penObsScore", gmG8$x)

## Estimate the essential graph using GES
ges.fit <- ges(score)
essgraph <- ges.fit$essgraph

## Plot the estimated essential graph
if (require(Rgraphviz)) {
  plot(essgraph, main = "Estimated CPDAG")
}
## The CPDAG has 1 unoriented component with 3 edges (Author <-> Bar, Bar <->
## Ctrl, Bar <-> V5)

## Get optimal single-vertex and unbounded intervention target
opt.target(essgraph, max.size = 1)
opt.target(essgraph, max.size = essgraph$node.count())
optAdjSet
computes the optimal valid adjustment set relative to the variables (X
,Y
) in the given graph.
optAdjSet(graphEst,x.pos,y.pos)
graphEst |
graphNEL object or adjacency matrix of type amat.cpdag. |
x.pos , x
|
Positions of variables |
y.pos , y
|
Positions of variables |
Suppose we have data from a linear SEM compatible with a known causal graph G and our aim is to estimate the total joint effect of X on Y. Here the total joint effect of $X = (X_1, X_2)$ on $Y$ is defined via Pearl's do-calculus as the vector $(E[Y \mid do(X_1 = x_1 + 1, X_2 = x_2)] - E[Y \mid do(X_1 = x_1, X_2 = x_2)],\; E[Y \mid do(X_1 = x_1, X_2 = x_2 + 1)] - E[Y \mid do(X_1 = x_1, X_2 = x_2)])$, with a similar definition for more than two variables. These values are equal to the partial derivatives (evaluated at $(x_1, x_2)$) of $E[Y \mid do(X_1 = x_1', X_2 = x_2')]$ with respect to $x_1'$ and $x_2'$. Moreover, under the linearity assumption, these partial derivatives do not depend on the values at which they are evaluated.
It is possible to estimate the total joint effect of X
on Y
with a simple linear regression of the form lm(Y ~ X + Z)
, if and only if the covariate set Z
is a valid adjustment set (see Perkovic et al. (2018)). Often, however, there are multiple such valid adjustment sets, providing total effect estimates with varying accuracies. Suppose that there exists a valid adjustment set relative to (X
,Y
) in causal graph G
, and each node in Y is a descendant of X, then there exists a valid adjustment set which provides the total effect estimate with the smallest asymptotic variance among all valid adjustment sets, which we will refer to as O(X,Y,G)
(Henckel et al., 2019). This function returns this optimal valid adjustment set O(X,Y,G)
.
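The point that different valid adjustment sets give estimates of different accuracy can be seen in a small simulation. The sketch below uses base R only and a hypothetical linear SEM with made-up coefficients: C is a confounder of X and Y, and W is an additional parent of Y that is unrelated to X. Both {C} and {C, W} are valid adjustment sets in this toy graph; the variable names and coefficients are assumptions for illustration.

```r
## Hypothetical linear SEM (illustration only):
## C -> X, C -> Y, W -> Y, X -> Y with true total effect 0.8
set.seed(1)
n <- 5000
C <- rnorm(n)
W <- rnorm(n)
X <- 0.7 * C + rnorm(n)
Y <- 0.8 * X + 1.0 * C + 1.5 * W + rnorm(n)

fit.small <- lm(Y ~ X + C)      # adjust for {C}
fit.large <- lm(Y ~ X + C + W)  # adjust for {C, W}

## coefficient of X and its standard error under both adjustment sets
est <- rbind(
  small = summary(fit.small)$coefficients["X", c("Estimate", "Std. Error")],
  large = summary(fit.large)$coefficients["X", c("Estimate", "Std. Error")])
est
```

Both regressions target the same total effect, but adding the parent W of Y removes residual variation in Y and therefore yields the smaller standard error in this example.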
The restriction that each node in Y
be a descendant of the node set X
is not a real limitation, as the total effect of the node set X
on a non-descendant is always 0. If provided with a node set Y
that does not fulfill this condition this function computes a pruned node set Y2
by removing all nodes from Y
that are not descendants of X
and returns O(X,Y2,G)
instead. The user will be alerted to this and given the pruned set Y2
.
A vector with the positions of the nodes of the optimal set O(X
,Y
,G
).
Leonard Henckel
E. Perković, J. Textor, M. Kalisch and M.H. Maathuis (2018). Complete graphical characterization and construction of adjustment sets in Markov equivalence classes of ancestral graphs. Journal of Machine Learning Research 18(220), 1–62.
L. Henckel, E. Perkovic and M.H. Maathuis (2019). Graphical criteria for efficient total effect estimation via adjustment in causal linear models. Working Paper.
## Simulate a true DAG, its CPDAG and an intermediate max. PDAG
suppressWarnings(RNGversion("3.5.0"))
set.seed(123)
p <- 10
## true DAG
myDAG <- randomDAG(p, prob = 0.3)
## true CPDAG
myCPDAG <- dag2cpdag(myDAG)
## true PDAG with added background knowledge 5 -> 6
myPDAG <- addBgKnowledge(myCPDAG,5,6)

if (require(Rgraphviz)) {
  par(mfrow = c(1,3))
  plot(myDAG)
  plot(myPDAG)
  plot(myCPDAG) ## plot of the graphs
}

## if the CPDAG C is amenable relative to (X,Y),
## the optimal set will be the same for all DAGs
## and any max. PDAGs obtained by adding background knowledge to C
(optAdjSet(myDAG,3,10))
(optAdjSet(myPDAG,3,10))
(optAdjSet(myCPDAG,3,10))

## the optimal adjustment set can also be computed for sets X and Y
(optAdjSet(myDAG,c(3,4),c(9,10)))
(optAdjSet(myPDAG,c(3,4),c(9,10)))
(optAdjSet(myCPDAG,c(3,4),c(9,10)))

## The only restriction is that it requires all nodes in Y to be
## descendants of X.
## However, if a node in Y is a non-descendant of X, the lowest variance
## partial total effect estimate is simply 0.
## Hence, we can proceed with a pruned Y. This function does this automatically!
optAdjSet(myDAG,1,c(3,9))

## Note that for sets X there may be no valid adjustment set even
## if the PDAG is amenable relative to (X,Y).
## Not run: optAdjSet(myPDAG,c(4,9),7)
Constructs a matrix which contains identifiable ancestral and non-ancestral relations in the Markov equivalence class represented by a directed partial ancestral graph.
pag2anc(P,verbose=FALSE)
P |
Adjacency matrix of type amat.pag, which should encode
a directed PAG (i.e., it should not contain any undirected edges of the
form |
verbose |
If true, more detailed output is provided. |
We say that node i
is ancestor of node j
in a directed mixed graph (DMG) iff there
exists a directed path from i
to j
in that graph. If the directed mixed graph
has a causal interpretation (for example, if it is the graph of a simple SCM) then ancestral
relations coincide (generically) with causal relations.
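The notion of ancestor used here (existence of a directed path) can be computed for a single, fully known graph by a transitive closure. The sketch below is base R only and is not what pag2anc does: it handles one given directed graph rather than a whole equivalence class. The name `ancestor_matrix_sketch` and the encoding `D[i, j] != 0` for a directed edge i -> j are assumptions of the illustration, and only paths of length at least one are counted.

```r
## Hypothetical helper (not pcalg code): ancestors in one known directed graph.
## D[i, j] != 0 encodes a directed edge i -> j; bidirected edges are left out.
ancestor_matrix_sketch <- function(D) {
  anc <- D != 0
  ## Warshall's algorithm: successively allow node k as an intermediate node
  for (k in seq_len(nrow(anc)))
    anc <- anc | outer(anc[, k], anc[k, ], "&")
  anc
}

## Toy graph: a -> b -> c, and an isolated node d
D <- matrix(0, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
D["a", "b"] <- 1
D["b", "c"] <- 1
anc <- ancestor_matrix_sketch(D)
anc   # anc[i, j] is TRUE iff there is a directed path from i to j
```

In the toy graph, a is an ancestor of c through b, while d has no ancestral relation to any other node.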
This function implements the sufficient conditions (Propositions 4 and 5) in Mooij and Claassen (2020)
for concluding whether an ancestral relation between two nodes must be present or absent
in all directed mixed graphs in the Markov equivalence class represented by the directed PAG P
.
It applies to both the acyclic case and the cyclic (simple SCM) case, assuming the d-separation or σ-separation Markov property, respectively.
The output is a matrix containing for each ordered pair of nodes whether the presence of an ancestral relation was identified, or the absence, or neither.
It is not known whether these sufficient conditions for identifiability are complete. Hence, zero entries in the result indicate that the sufficient condition gives no conclusion, rather than that the Markov equivalence class represented by the directed PAG necessarily contains DMGs where an ancestral relation is present as well as DMGs where it is absent.
P
should be an adjacency matrix of type amat.pag that contains no undirected
and circle-tail edges.
Matrix A, where entry A[i,j] equals
1 if node i is an identifiable ancestor of node j,
-1 if node i is an identifiable non-ancestor of node j,
0 in case the ancestral relationship between nodes i and j is unknown.
Joris Mooij.
J. M. Mooij and T. Claassen (2020). Constraint-Based Causal Discovery using Partial Ancestral Graphs in the presence of Cycles. In Proc. of the 36th Conference on Uncertainty in Artificial Intelligence (UAI-20), 1159-1168.
##################################################
## Mooij et al. (2020), Fig. 43(a), p. 97
##################################################

# Encode ADMG as adjacency matrix
p <- 8 # total number of variables
V <- c("Ca","Cb","Cc","X0","X1","X2","X3","X4") # 3 context variables, 5 system variables
# amat[i,j] = 0 iff no edge btw i,j
# amat[i,j] = 1 iff i *-o j
# amat[i,j] = 2 iff i *-> j
# amat[i,j] = 3 iff i *-- j
amat <- rbind(c(0,2,2,2,0,0,0,0),
              c(2,0,2,0,2,0,0,0),
              c(2,2,0,0,2,2,0,0),
              c(3,0,0,0,0,0,2,0),
              c(0,3,3,0,0,3,0,2),
              c(0,0,3,0,2,0,0,0),
              c(0,0,0,3,0,0,0,2),
              c(0,0,0,0,2,0,3,0))
rownames(amat)<-V
colnames(amat)<-V

# Make use of d-separation oracle as "independence test"
indepTest <- dsepAMTest
suffStat<-list(g=amat,verbose=FALSE)

# Derive PAG that represents the Markov equivalence class of the ADMG with the FCI algorithm
# (assuming no selection bias)
fci.pag <- fci(suffStat,indepTest,alpha = 0.5,labels = V,verbose=TRUE,selectionBias=FALSE)

# Read off causal features from the FCI PAG
cat('Identified absence (-1) and presence (+1) of ancestral causal relations from FCI PAG:\n')
print(pag2anc(fci.pag@amat))
################################################## ## Mooij et al. (2020), Fig. 43(a), p. 97 ################################################## # Encode ADMG as adjacency matrix p <- 8 # total number of variables V <- c("Ca","Cb","Cc","X0","X1","X2","X3","X4") # 3 context variables, 5 system variables # amat[i,j] = 0 iff no edge btw i,j # amat[i,j] = 1 iff i *-o j # amat[i,j] = 2 iff i *-> j # amat[i,j] = 3 iff i *-- j amat <- rbind(c(0,2,2,2,0,0,0,0), c(2,0,2,0,2,0,0,0), c(2,2,0,0,2,2,0,0), c(3,0,0,0,0,0,2,0), c(0,3,3,0,0,3,0,2), c(0,0,3,0,2,0,0,0), c(0,0,0,3,0,0,0,2), c(0,0,0,0,2,0,3,0)) rownames(amat)<-V colnames(amat)<-V # Make use of d-separation oracle as "independence test" indepTest <- dsepAMTest suffStat<-list(g=amat,verbose=FALSE) # Derive PAG that represents the Markov equivalence class of the ADMG with the FCI algorithm # (assuming no selection bias) fci.pag <- fci(suffStat,indepTest,alpha = 0.5,labels = V,verbose=TRUE,selectionBias=FALSE) # Read off causal features from the FCI PAG cat('Identified absence (-1) and presence (+1) of ancestral causal relations from FCI PAG:\n') print(pag2anc(fci.pag@amat))
Constructs a matrix which contains identifiably unconfounded node pairs in the Markov equivalence class represented by a directed partial ancestral graph.
pag2conf(P)
P |
Adjacency matrix of type amat.pag, which should encode
a directed PAG (i.e., it should not contain any undirected edges of the
form |
We say that nodes i and j are confounded in a directed mixed graph (DMG) iff there exists a bidirected edge i<->j in that graph. If the directed mixed graph has a causal interpretation (for example, if it is the graph of a simple SCM) then the presence of a bidirected edge coincides (generically) with the presence of a confounder, i.e., a latent common cause (relative to the variables in the graph).
This function implements the sufficient condition (Proposition 6) in Mooij and Claassen (2020) for concluding whether two nodes are unconfounded in all directed mixed graphs in the Markov equivalence class represented by the directed PAG P. It applies to both the acyclic case and the cyclic (simple SCM) case, assuming the d-separation resp. σ-separation Markov property.
The output is a (symmetric) matrix containing for each ordered pair of nodes whether the two nodes are identifiably unconfounded.
It is not known whether these sufficient conditions for identifiability are complete. Hence, zero entries in the result indicate that the sufficient condition gives no conclusion, rather than that the Markov equivalence class represented by the directed PAG necessarily contains DMGs where a bidirected edge is present.
P should be an adjacency matrix of type amat.pag that contains no undirected and circle-tail edges.
Matrix A, where entry A[i,j] equals
-1 if nodes i and j are identifiably unconfounded,
0 in case it is unknown whether nodes i and j are confounded or not.
Joris Mooij.
J. M. Mooij and T. Claassen (2020). Constraint-Based Causal Discovery using Partial Ancestral Graphs in the presence of Cycles. In Proc. of the 36th Conference on Uncertainty in Artificial Intelligence (UAI-20), 1159-1168.
##################################################
## Mooij et al. (2020), Fig. 43(a), p. 97
##################################################
# Encode ADMG as adjacency matrix
p <- 8 # total number of variables
V <- c("Ca","Cb","Cc","X0","X1","X2","X3","X4") # 3 context variables, 5 system variables
# amat[i,j] = 0 iff no edge btw i,j
# amat[i,j] = 1 iff i *-o j
# amat[i,j] = 2 iff i *-> j
# amat[i,j] = 3 iff i *-- j
amat <- rbind(c(0,2,2,2,0,0,0,0),
              c(2,0,2,0,2,0,0,0),
              c(2,2,0,0,2,2,0,0),
              c(3,0,0,0,0,0,2,0),
              c(0,3,3,0,0,3,0,2),
              c(0,0,3,0,2,0,0,0),
              c(0,0,0,3,0,0,0,2),
              c(0,0,0,0,2,0,3,0))
rownames(amat) <- V
colnames(amat) <- V
# Make use of d-separation oracle as "independence test"
indepTest <- dsepAMTest
suffStat <- list(g = amat, verbose = FALSE)
# Derive PAG that represents the Markov equivalence class of the ADMG with the FCI algorithm
# (assuming no selection bias)
fci.pag <- fci(suffStat, indepTest, alpha = 0.5, labels = V,
               verbose = TRUE, selectionBias = FALSE)
# Read off causal features from the FCI PAG
cat('Identified absence (-1) and presence (+1) of pairwise latent confounding from FCI PAG:\n')
print(pag2conf(fci.pag@amat))
Constructs a matrix which contains identifiable parental and non-parental relations in the Markov equivalence class represented by a directed partial ancestral graph.
pag2edge(P)
P |
Adjacency matrix of type amat.pag, which should encode
a directed PAG (i.e., it should not contain any undirected edges of the
form |
We say that node i is parent of node j in a directed mixed graph (DMG) iff there exists a directed edge i-->j in that graph. If the directed mixed graph has a causal interpretation (for example, if it is the graph of a simple SCM) then parental relations coincide (generically) with direct causal relations (relative to the variables in the graph).
This function implements the sufficient conditions (Propositions 7 and 8) in Mooij and Claassen (2020) for concluding whether a parental relation between two nodes must be present or absent in all directed mixed graphs in the Markov equivalence class represented by the directed PAG P. It applies to both the acyclic case and the cyclic (simple SCM) case, assuming the d-separation resp. σ-separation Markov property.
The output is a matrix containing for each ordered pair of nodes whether the presence of a parental relation was identified, or the absence, or neither.
It is not known whether these sufficient conditions for identifiability are complete. Hence, zero entries in the result indicate that the sufficient condition gives no conclusion, rather than that the Markov equivalence class represented by the directed PAG necessarily contains DMGs where a parental relation is present as well as DMGs where it is absent.
P should be an adjacency matrix of type amat.pag that contains no undirected and circle-tail edges.
Matrix A, where entry A[i,j] equals
1 if node i is an identifiable parent of node j,
-1 if node i is an identifiable non-parent of node j,
0 in case the parental relationship between nodes i and j is unknown.
Joris Mooij.
J. M. Mooij and T. Claassen (2020). Constraint-Based Causal Discovery using Partial Ancestral Graphs in the presence of Cycles. In Proc. of the 36th Conference on Uncertainty in Artificial Intelligence (UAI-20), 1159-1168.
##################################################
## Mooij et al. (2020), Fig. 43(a), p. 97
##################################################
# Encode ADMG as adjacency matrix
p <- 8 # total number of variables
V <- c("Ca","Cb","Cc","X0","X1","X2","X3","X4") # 3 context variables, 5 system variables
# amat[i,j] = 0 iff no edge btw i,j
# amat[i,j] = 1 iff i *-o j
# amat[i,j] = 2 iff i *-> j
# amat[i,j] = 3 iff i *-- j
amat <- rbind(c(0,2,2,2,0,0,0,0),
              c(2,0,2,0,2,0,0,0),
              c(2,2,0,0,2,2,0,0),
              c(3,0,0,0,0,0,2,0),
              c(0,3,3,0,0,3,0,2),
              c(0,0,3,0,2,0,0,0),
              c(0,0,0,3,0,0,0,2),
              c(0,0,0,0,2,0,3,0))
rownames(amat) <- V
colnames(amat) <- V
# Make use of d-separation oracle as "independence test"
indepTest <- dsepAMTest
suffStat <- list(g = amat, verbose = FALSE)
# Derive PAG that represents the Markov equivalence class of the ADMG with the FCI algorithm
# (assuming no selection bias)
fci.pag <- fci(suffStat, indepTest, alpha = 0.5, labels = V,
               verbose = TRUE, selectionBias = FALSE)
# Read off causal features from the FCI PAG
cat('Identified absence (-1) and presence (+1) of direct causal relations from FCI PAG:\n')
print(pag2edge(fci.pag@amat))
Transform a Partial Ancestral Graph (PAG) into a valid Maximal Ancestral Graph (MAG) that belongs to the Markov equivalence class represented by the given PAG, with no additional edges into node x.
pag2magAM(amat.pag, x, max.chordal = 10, verbose = FALSE)
amat.pag |
Adjacency matrix of type amat.pag |
x |
(integer) position in adjacency matrix of node in the PAG into which no additional edges are oriented. |
max.chordal |
Positive integer: graph paths larger than
|
verbose |
Logical; if true, some output is produced during computation. |
This function converts a PAG (adjacency matrix) to a valid MAG (adjacency matrix) that belongs to the Markov equivalence class represented by the given PAG. Note that we assume that there are no selection variables, meaning that the edges in the PAG can be of the following types: ->, <->, o->, and o-o. In a first step, it uses the Arrowhead Augmentation of Zhang (2006), i.e., any o-> edge is oriented into ->. Afterwards, it orients each chordal component into a valid DAG without orienting any additional edges into x.
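The first step described above, the arrowhead augmentation, can be illustrated on the adjacency matrix directly. The sketch below is base R and is not the code used inside pag2magAM; the helper name augment_arrowheads is hypothetical. It assumes the amat.pag coding from the package examples (amat[i,j] = 1 iff i *-o j, 2 iff i *-> j, 3 iff i *-- j) and does not cover the second step, the orientation of the chordal components.

```r
# Hypothetical helper (not part of pcalg): turn every edge i o-> j into i --> j.
# Assumes amat[i, j] is the mark at node j: 1 = circle, 2 = arrowhead, 3 = tail.
augment_arrowheads <- function(amat) {
  circle_arrow <- amat == 2 & t(amat) == 1  # entry [i, j] is TRUE for i o-> j
  amat[t(circle_arrow)] <- 3                # the circle at i becomes a tail
  amat
}

pag <- matrix(0, 3, 3)
pag[1, 2] <- 2; pag[2, 1] <- 1   # 1 o-> 2
pag[2, 3] <- 1; pag[3, 2] <- 1   # 2 o-o 3
step1 <- augment_arrowheads(pag)
print(step1)
```

After this step the edge between nodes 1 and 2 is directed, while the o-o edge between nodes 2 and 3 is left untouched for the later orientation of the chordal component.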
This function is used in the Generalized Backdoor Criterion backdoor with type="pag"; see Maathuis and Colombo (2015) for details.
The output is an adjacency matrix of type amat.pag representing a valid MAG that belongs to the Markov equivalence class represented by the given PAG.
Diego Colombo, Markus Kalisch and Martin Maechler.
M.H. Maathuis and D. Colombo (2015). A generalized back-door criterion. Annals of Statistics 43 1060-1088.
Zhang, J. (2006). Causal Inference and Reasoning in Causally Insufficient Systems. Ph. D. thesis, Carnegie Mellon University.
## create the graph
set.seed(78)
p <- 12
g <- randomDAG(p, prob = 0.4)
## Compute the true covariance and then correlation matrix of g:
true.corr <- cov2cor(trueCov(g))
## define nodes 2 and 6 to be latent variables
L <- c(2,6)
## Find PAG
## As dependence "oracle", we use the true correlation matrix in
## gaussCItest() with a large "virtual sample size" and a large alpha:
true.pag <- dag2pag(suffStat = list(C = true.corr, n = 10^9),
                    indepTest = gaussCItest,
                    graph = g, L = L, alpha = 0.9999)
## find a valid MAG such that no additional edges are directed into node 4
(amat.mag <- pag2magAM(true.pag@amat, 4)) # -> the adj.matrix of the MAG
"ParDAG"
of Parametric Causal Models
This virtual base class represents a parametric causal model.
The class "ParDAG" serves as a basis for simulating observational and/or interventional data from causal models as well as for parameter estimation (maximum-likelihood estimation) for a given causal model in the presence of a data set with jointly observational and interventional data.
The virtual base class "ParDAG" provides a “skeleton” for all functions related to the aforementioned tasks. In practical cases, a user may always choose an appropriate class derived from ParDAG which represents a specific parametric model class. The base class itself does not represent such a model class.
new("ParDAG", nodes, in.edges, params)
nodes
Vector of node names; cf. also field .nodes
.
in.edges
A list of length p
consisting of index
vectors indicating the edges pointing into the nodes of the DAG.
params
A list of length p
consisting of parameter
vectors modeling the conditional distribution of a node given its
parents; cf. also field .params
.
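As an illustration of the in.edges argument, the list of parent index vectors can be derived from an ordinary adjacency matrix. This is a base-R sketch under the assumption that A[i,j] = 1 encodes the arrow i -> j; the matrix A and the variable names are hypothetical, and no ParDAG object is created here because the class is virtual.

```r
# Hypothetical adjacency matrix of a DAG on 3 nodes: A[i, j] = 1 means i -> j.
A <- matrix(0, 3, 3)
A[1, 2] <- 1; A[1, 3] <- 1; A[2, 3] <- 1

# One index vector per node, holding the nodes with an edge pointing into it
in.edges <- lapply(seq_len(ncol(A)), function(j) which(A[, j] != 0))

# Default node names and the total number of arrows
nodes <- as.character(seq_len(ncol(A)))
edge_total <- sum(lengths(in.edges))
str(in.edges)
```

Node 1 has no parents, so its entry is an empty index vector; the list still has one entry per node.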
.nodes
:Vector of node names; defaults to as.character(1:p)
,
where p
denotes the number of nodes (variables) of the model.
.in.edges
:A list of length p
consisting of index
vectors indicating the edges pointing into the nodes of the DAG.
.params
:A list of length p
consisting of parameter
vectors modeling the conditional distribution of a node given its
parents. The entries of the parameter vectors only get a concrete
meaning in derived classes belonging to specific parametric model classes.
node.count()
:Yields the number of nodes (variables) of the model.
simulate(n, target, int.level)
:Generates (observational or interventional) samples from the parametric causal model. The intervention target to be used is specified by the parameter target; if the target is empty (target = integer(0)), observational samples are generated. int.level indicates the values of the intervened variables; if it is a vector of the same length as target, all samples are drawn from the same intervention levels; if it is a matrix with n rows and as many columns as target has entries, its rows are interpreted as individual intervention levels for each sample.
edge.count()
:Yields the number of edges (arrows) in the DAG.
mle.fit(score)
:Fits the parameters using an appropriate
Score
object.
signature(x = "ParDAG", y = "ANY")
: plots the underlying
DAG of the causal model. Parameters are not visualized.
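The two forms of int.level accepted by simulate(n, target, int.level) can be made concrete with a small base-R sketch. The helper expand_int_level is hypothetical and is not how the package handles the argument internally; it only shows the shapes involved, assuming that the matrix form has one row per sample (n rows).

```r
# Hypothetical helper (not part of pcalg): bring int.level into the form of an
# n x length(target) matrix with one row of intervention levels per sample.
expand_int_level <- function(n, target, int.level) {
  k <- length(target)
  if (is.matrix(int.level)) {
    # individual intervention levels for each sample
    stopifnot(nrow(int.level) == n, ncol(int.level) == k)
    int.level
  } else {
    # one common vector of levels, repeated for every sample
    stopifnot(length(int.level) == k)
    matrix(int.level, nrow = n, ncol = k, byrow = TRUE)
  }
}

common <- expand_int_level(3, target = c(2L, 5L), int.level = c(10, 20))
per_sample_levels <- cbind(1:3, 4:6)
individual <- expand_int_level(3, target = c(2L, 5L), int.level = per_sample_levels)
print(common)
```

In the vector form every row of the result is identical; in the matrix form the input is passed through unchanged.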
Alain Hauser ([email protected])
Estimate the equivalence class of a directed acyclic graph (DAG) from observational data, using the PC-algorithm.
pc(suffStat, indepTest, alpha, labels, p, fixedGaps = NULL, fixedEdges = NULL, NAdelete = TRUE, m.max = Inf, u2pd = c("relaxed", "rand", "retry"), skel.method = c("stable", "original", "stable.fast"), conservative = FALSE, maj.rule = FALSE, solve.confl = FALSE, numCores = 1, verbose = FALSE)
suffStat |
A |
indepTest |
A |
alpha |
significance level (number in |
labels |
(optional) character vector of variable (or
“node”) names. Typically preferred to specifying |
p |
(optional) number of variables (or nodes). May be specified
if |
numCores |
Specifies the number of cores to be used for parallel
estimation of |
verbose |
If |
fixedGaps |
A logical matrix of dimension p*p. If entry
|
fixedEdges |
A logical matrix of dimension p*p. If entry
|
NAdelete |
If indepTest returns |
m.max |
Maximal size of the conditioning sets that are considered in the conditional independence tests. |
u2pd |
String specifying the method for dealing with conflicting information when trying to orient edges (see details below). |
skel.method |
Character string specifying method; the default,
|
conservative |
Logical indicating if the conservative PC is used.
In this case, only option |
maj.rule |
Logical indicating that the triples shall be checked for ambiguity using a majority rule idea, which is less strict than the conservative PC algorithm. For more information, see details. |
solve.confl |
If |
Under the assumption that the distribution of the observed variables is faithful to a DAG, this function estimates the Markov equivalence class of the DAG. We do not estimate the DAG itself, because this is typically impossible (even with an infinite amount of data), since different DAGs can describe the same conditional independence relationships. Since all DAGs in an equivalence class describe the same conditional independence relationships, they are equally valid ways to describe the conditional dependence structure that was given as input.
All DAGs in a Markov equivalence class have the same skeleton (i.e., the same adjacency information) and the same v-structures (see definition below). However, the direction of some edges may be undetermined, in the sense that they point one way in one DAG in the equivalence class, while they point the other way in another DAG in the equivalence class.
A Markov equivalence class can be uniquely represented by a completed partially directed acyclic graph (CPDAG). A CPDAG contains undirected and directed edges. The edges have the following interpretation: (i) there is a (directed or undirected) edge between i and j if and only if variables i and j are conditionally dependent given S for all possible subsets S of the remaining nodes; (ii) a directed edge i -> j means that this directed edge is present in all DAGs in the Markov equivalence class; (iii) an undirected edge i - j means that there is at least one DAG in the Markov equivalence class with edge i -> j and there is at least one DAG in the Markov equivalence class with edge i <- j.
The CPDAG is estimated using the PC algorithm (named after its inventors
Peter Spirtes and Clark Glymour). The skeleton is
estimated by the function skeleton
which uses a modified
version of the original PC algorithm (see Colombo and Maathuis (2014) for
details). The original PC algorithm is known to be
order-dependent, in the sense that the output depends on the order in
which the variables are given. Therefore, Colombo and Maathuis (2014)
proposed a simple modification, called PC-stable, that yields
order-independent adjacencies in the skeleton (see the help file
of this function for details). Subsequently, as many edges as possible
are oriented. This is done in two steps. It is important to note that
if no further actions are taken (see below) these two steps still
remain order-dependent.
The edges are oriented as follows. First, the algorithm considers all triples (a,b,c), where a and b are adjacent, b and c are adjacent, but a and c are not adjacent. For all such triples, we direct both edges towards b (a -> b <- c) if and only if b was not part of the conditioning set that made the edge between a and c drop out. These conditioning sets were saved in sepset. The structure a -> b <- c is called a v-structure.
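The v-structure step can be sketched in a few lines of base R. This is an illustration of the rule as described here, not the package's implementation: the function name orient_v_structures, the 0/1 skeleton matrix, and the nested-list layout of sepset (sepset[[a]][[c]] holding the separating set of a and c, with a < c) are assumptions made for this example.

```r
# Hypothetical sketch (not pcalg code). skel is a symmetric 0/1 matrix.
# In the result, g[i, j] = 1 together with g[j, i] = 0 encodes i -> j, and
# g[i, j] = g[j, i] = 1 encodes a still undirected edge i - j.
orient_v_structures <- function(skel, sepset) {
  g <- skel
  for (b in seq_len(nrow(skel))) {
    nb <- which(skel[b, ] == 1)
    if (length(nb) < 2) next
    pairs <- combn(nb, 2)                 # all pairs (a, cc) of neighbours of b
    for (k in seq_len(ncol(pairs))) {
      a <- pairs[1, k]; cc <- pairs[2, k]
      # unshielded triple a - b - cc with b outside the separating set
      if (skel[a, cc] == 0 && !(b %in% sepset[[a]][[cc]])) {
        g[b, a] <- 0                      # keep only a -> b
        g[b, cc] <- 0                     # keep only cc -> b
      }
    }
  }
  g
}

# Skeleton 1 - 2 - 3, where 1 and 3 were separated by the empty set
skel <- matrix(0, 3, 3)
skel[1, 2] <- skel[2, 1] <- 1
skel[2, 3] <- skel[3, 2] <- 1
sepset <- lapply(1:3, function(i) vector("list", 3))
sepset[[1]][3] <- list(integer(0))
g_collider <- orient_v_structures(skel, sepset)

# Same skeleton, but node 2 is in the separating set: nothing is oriented
sepset[[1]][3] <- list(2L)
g_chain <- orient_v_structures(skel, sepset)
```

In the first case the result encodes 1 -> 2 <- 3; in the second case the skeleton is returned unchanged.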
After determining all v-structures, there may still be undirected edges. It may be possible to direct some of these edges, since one can deduce that one of the two possible directions of the edge is invalid because it introduces a new v-structure or a directed cycle. Such edges are found by repeatedly applying rules R1-R3 of the PC algorithm as given in Algorithm 2 of Kalisch and Bühlmann (2007). The algorithm stops if none of the rules is applicable to the graph.
The conservative PC algorithm (conservative = TRUE) is a slight variation of the PC algorithm (see Ramsey et al. 2006). After the skeleton is computed, all potential v-structures a - b - c are checked in the following way. We test whether a and c are independent conditioning on all subsets of the neighbors of a and all subsets of the neighbors of c. When a subset makes a and c conditionally independent, we call it a separating set. If b is in no such separating set or in all such separating sets, no further action is taken and the usual PC is continued. If, however, b is in only some separating sets, the triple a - b - c is marked as 'ambiguous'. Moreover, if no separating set is found among the neighbors, the triple is also marked as 'ambiguous'. An ambiguous triple is not oriented as a v-structure. Furthermore, no further orientation rule that needs to know whether a - b - c is a v-structure or not is applied. Instead of using the conservative version, which is quite strict towards the v-structures, Colombo and Maathuis (2014) introduced a less strict version for the v-structures called majority rule. This adaptation can be called using maj.rule = TRUE. In this case, the triple a - b - c is marked as 'ambiguous' if and only if b is in exactly 50 percent of such separating sets or no separating set was found. If b is in less than 50 percent of the separating sets it is set as a v-structure, and if in more than 50 percent it is set as a non v-structure (for more details see Colombo and Maathuis, 2014). The usage of either the conservative or the majority rule version resolves the order-dependence issues of the determination of the v-structures.
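The decision rules of the conservative and the majority rule versions can be written down compactly. The following base-R sketch is only an illustration of the rules as stated in this section; the function classify_triple and its arguments are hypothetical and do not correspond to an exported pcalg function. It takes the middle node b of a triple and the list of separating sets found for the two outer nodes.

```r
# Hypothetical sketch (not pcalg code): classify an unshielded triple with
# middle node b, given all separating sets found for its two outer nodes.
classify_triple <- function(b, sep_sets, maj.rule = FALSE) {
  n_sets <- length(sep_sets)
  if (n_sets == 0) return("ambiguous")   # no separating set was found
  n_with_b <- sum(vapply(sep_sets, function(s) b %in% s, logical(1)))
  if (maj.rule) {
    if (2 * n_with_b == n_sets) "ambiguous"          # exactly 50 percent
    else if (2 * n_with_b < n_sets) "v-structure"    # less than 50 percent
    else "non-v-structure"                           # more than 50 percent
  } else {
    if (n_with_b == 0) "v-structure"                 # b in no separating set
    else if (n_with_b == n_sets) "non-v-structure"   # b in all separating sets
    else "ambiguous"                                 # b in only some of them
  }
}

sets <- list(2, 3, 4)   # node 2 is in one of three separating sets
res_conservative <- classify_triple(2, sets)
res_majority <- classify_triple(2, sets, maj.rule = TRUE)
```

The same input is ambiguous under the conservative rule but is treated as a v-structure under the majority rule, which shows in what sense the majority rule is less strict.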
Sampling errors (or hidden variables) can lead to conflicting information about edge directions. For example, one may find that a - b - c and b - c - d should both be directed as v-structures. This gives conflicting information about the edge b - c, since it should be directed as b <- c in v-structure a -> b <- c, while it should be directed as b -> c in v-structure b -> c <- d. With the option solve.confl = FALSE, in such cases, we simply overwrite the directions of the conflicting edge. In the example above this means that we obtain a -> b -> c <- d if a - b - c was visited first, and a -> b <- c <- d if b - c - d was visited first, meaning that the final orientation on the edge depends on the ordering in which the v-structures were considered. With the option solve.confl = TRUE (which is only supported with option u2pd = "relaxed"), we first generate a list of all (unambiguous) v-structures (in the example above a - b - c and b - c - d), and then we simply orient them allowing both directions on the edge b - c, namely we allow the bi-directed edge b <-> c, resolving the order-dependence issues on the edge orientations. We denote bi-directed edges in the adjacency matrix M of the graph as M[b,c] = 2 and M[c,b] = 2. In a similar way, using lists for the candidate edges for each orientation rule and allowing bi-directed edges, the order-dependence issues in the orientation rules can be resolved. Note that bi-directed edges merely represent a conflicting orientation and they should not be interpreted causally. The usage of these lists for the candidate edges and allowing bi-directed edges resolves the order-dependence issues on the orientation of the v-structures and on the orientation rules; see Colombo and Maathuis (2014) for more details.
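The effect of the visiting order can be reproduced with a toy implementation. This base-R sketch is not the package's code: the functions orient_edge and apply_v_structures and the matrix convention (M[i,j] = 1 with M[j,i] = 0 for i -> j, both entries 1 for an undirected edge, both entries 2 for a conflict) are assumptions chosen for this example, with only the value 2 for bi-directed edges taken from the description above.

```r
# Hypothetical sketch (not pcalg code): orient the edge from -> to in M.
orient_edge <- function(M, from, to, solve.confl) {
  already_reversed <- M[to, from] == 1 && M[from, to] == 0
  if (solve.confl && already_reversed) {
    M[from, to] <- 2; M[to, from] <- 2   # record the conflict as bi-directed
  } else if (M[from, to] != 2) {
    M[from, to] <- 1; M[to, from] <- 0   # orient (possibly overwriting)
  }
  M
}

# Orient each triple c(a, b, c) in the given order as a -> b <- c
apply_v_structures <- function(skel, triples, solve.confl = FALSE) {
  M <- skel
  for (v in triples) {
    M <- orient_edge(M, v[1], v[2], solve.confl)
    M <- orient_edge(M, v[3], v[2], solve.confl)
  }
  M
}

# Skeleton 1 - 2 - 3 - 4 with the two conflicting v-structures from the text
skel <- matrix(0, 4, 4)
skel[1, 2] <- skel[2, 1] <- 1
skel[2, 3] <- skel[3, 2] <- 1
skel[3, 4] <- skel[4, 3] <- 1
order1 <- list(c(1, 2, 3), c(2, 3, 4))
order2 <- rev(order1)

overwrite1 <- apply_v_structures(skel, order1)   # 1 -> 2 -> 3 <- 4
overwrite2 <- apply_v_structures(skel, order2)   # 1 -> 2 <- 3 <- 4
resolved1 <- apply_v_structures(skel, order1, solve.confl = TRUE)
resolved2 <- apply_v_structures(skel, order2, solve.confl = TRUE)
```

With overwriting, the two visiting orders give different orientations of the edge between nodes 2 and 3; with conflict resolution both orders give the same matrix, in which that edge is marked with the value 2 in both directions.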
Note that using conservative = TRUE or maj.rule = TRUE together with solve.confl = TRUE produces a fully order-independent output; see Colombo and Maathuis (2014).
Sampling errors, non-faithfulness, or hidden variables can also lead to non-extendable CPDAGs, meaning that there does not exist a DAG that has the same skeleton and v-structures as the graph found by the algorithm. An example of this is an undirected cycle consisting of the edges a - b - c - d and d - a. In this case it is impossible to direct the edges without creating a cycle or a new v-structure. The option u2pd specifies what should be done in such a situation. If the option is set to "relaxed", the algorithm simply outputs the invalid CPDAG. If the option is set to "rand", all direction information is discarded and a random DAG is generated on the skeleton, which is then converted into its CPDAG. If the option is set to "retry", up to 100 combinations of possible directions of the ambiguous edges are tried, and the first combination that results in an extendable CPDAG is chosen. If no valid combination is found, an arbitrary DAG is generated on the skeleton as in the option "rand", and then converted into its CPDAG. Note that the output can also be an invalid CPDAG, in the sense that it cannot arise from the oracle PC algorithm, but be extendable to a DAG, for example a -> b <- c <- d. In this case, u2pd is not used.
Using the function isValidGraph one can check if the final output is indeed a valid CPDAG.
Notes: (1) Throughout, the algorithm works with the column positions of the variables in the adjacency matrix, and not with the names of the variables. (2) When plotting the object, undirected and bidirected edges are equivalent.
An object of class
"pcAlgo"
(see
pcAlgo
) containing an estimate of the equivalence
class of the underlying DAG.
Markus Kalisch ([email protected]), Martin Maechler, and Diego Colombo.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
M. Kalisch, M. Maechler, D. Colombo, M.H. Maathuis and P. Buehlmann (2012). Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software 47(11) 1–26, doi:10.18637/jss.v047.i11.
M. Kalisch and P. Buehlmann (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. JMLR 8 613-636.
J. Ramsey, J. Zhang and P. Spirtes (2006). Adjacency-faithfulness and conservative causal inference. In Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence. AUAI Press, Arlington, VA.
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition. The MIT Press.
skeleton
for estimating a skeleton of a DAG;
udag2pdag
for converting the
skeleton to a CPDAG; gaussCItest
,
disCItest
, binCItest
and
dsepTest
as examples for indepTest
. isValidGraph
for testing whether the output is a valid CPDAG.
##################################################
## Using Gaussian Data
##################################################
## Load predefined data
data(gmG)
n <- nrow(gmG8$x)
V <- colnames(gmG8$x) # labels aka node names

## estimate CPDAG
pc.fit <- pc(suffStat = list(C = cor(gmG8$x), n = n),
             indepTest = gaussCItest, ## indep.test: partial correlations
             alpha = 0.01, labels = V, verbose = TRUE)
if (require(Rgraphviz)) {
  ## show estimated CPDAG
  par(mfrow = c(1,2))
  plot(pc.fit, main = "Estimated CPDAG")
  plot(gmG8$g, main = "True DAG")
}

##################################################
## Using d-separation oracle
##################################################
## define sufficient statistics (d-separation oracle)
suffStat <- list(g = gmG8$g, jp = RBGL::johnson.all.pairs.sp(gmG8$g))
## estimate CPDAG
fit <- pc(suffStat, indepTest = dsepTest, labels = V,
          alpha = 0.01) ## value is irrelevant as dsepTest returns either 0 or 1
if (require(Rgraphviz)) {
  ## show estimated CPDAG
  plot(fit, main = "Estimated CPDAG")
  plot(gmG8$g, main = "True DAG")
}

##################################################
## Using discrete data
##################################################
## Load data
data(gmD)
V <- colnames(gmD$x)
## define sufficient statistics
suffStat <- list(dm = gmD$x, nlev = c(3,2,3,4,2), adaptDF = FALSE)
## estimate CPDAG
pc.D <- pc(suffStat,
           ## independence test: G^2 statistic
           indepTest = disCItest, alpha = 0.01, labels = V, verbose = TRUE)
if (require(Rgraphviz)) {
  ## show estimated CPDAG
  par(mfrow = c(1,2))
  plot(pc.D, main = "Estimated CPDAG")
  plot(gmD$g, main = "True DAG")
}

##################################################
## Using binary data
##################################################
## Load binary data
data(gmB)
V <- colnames(gmB$x)
## estimate CPDAG
pc.B <- pc(suffStat = list(dm = gmB$x, adaptDF = FALSE),
           indepTest = binCItest, alpha = 0.01, labels = V, verbose = TRUE)
pc.B
if (require(Rgraphviz)) {
  ## show estimated CPDAG
  plot(pc.B, main = "Estimated CPDAG")
  plot(gmB$g, main = "True DAG")
}

##################################################
## Detecting ambiguities due to sampling error
##################################################
## Load predefined data
data(gmG)
n <- nrow(gmG8$x)
V <- colnames(gmG8$x) # labels aka node names

## estimate CPDAG
pc.fit <- pc(suffStat = list(C = cor(gmG8$x), n = n),
             indepTest = gaussCItest, ## indep.test: partial correlations
             alpha = 0.01, labels = V, verbose = TRUE)

## due to sampling error, some edges were overwritten:
isValidGraph(as(pc.fit, "amat"), type = "cpdag")

## re-fit with solve.confl = TRUE
pc.fit2 <- pc(suffStat = list(C = cor(gmG8$x), n = n),
              indepTest = gaussCItest, ## indep.test: partial correlations
              alpha = 0.01, labels = V, verbose = TRUE,
              solve.confl = TRUE)

## conflicting edge is V5 - V6
as(pc.fit2, "amat")
The pc.cons.intern()
function is used in pc
and
fci
, notably when
conservative = TRUE
(conservative orientation of v-structures) or
maj.rule = TRUE
(majority rule orientation of v-structures).
pc.cons.intern(sk, suffStat, indepTest, alpha, version.unf = c(NA, NA), maj.rule = FALSE, verbose = FALSE)
sk |
A skeleton object as returned from |
suffStat |
Sufficient statistic: List containing all necessary
elements for the conditional independence decisions in the
function |
indepTest |
Pre-defined function for testing conditional independence. The
function is internally called as |
alpha |
Significance level for the individual conditional independence tests. |
version.unf |
Vector of length two. If |
maj.rule |
Logical indicating if the triples are checked for ambiguity using the majority rule idea, which is less strict than the standard conservative method. |
verbose |
Logical asking for detailed output. |
For any unshielded triple A-B-C, consider all subsets of the neighbors of A and of the neighbors of C, and record all such sets D for which A and C are conditionally independent given D. We call such sets “separating sets”.
If version.unf
[2]==1, the initial separating set found in the
PC/FCI algorithm is added to this set of separating sets.
If version.unf
[2]==2, the initial separating set is not added (as in Tetrad).
In the latter case, if the set of separating sets is empty, then the
triple is marked as ‘ambiguous’ if version.unf
[1]==2, for
example in pc
, or as ‘unambiguous’ if
version.unf
[1]==1, for example in fci
.
Otherwise, there is at least one separating set.
If maj.rule=FALSE
, the conservative PC algorithm is used
(Ramsey et al., 2006): If B is in some but not all separating sets,
the triple is marked as ambiguous. Otherwise it is treated as in the
standard PC algorithm. If maj.rule=TRUE
, the majority rule is
applied (Colombo and Maathuis, 2014): The triple is marked as
‘ambiguous’ if B is in exactly 50 percent of the separating sets. If
it is in less than 50 percent it is marked as a v-structure, and if it
is in more than 50 percent it is marked as a non v-structure.
Note: This function modifies the separating sets for unambiguous triples in the skeleton object (adding or removing B) to ensure that the usual orientation rules later on lead to the correct v-structures/non v-structures.
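The decision rules described above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: the helper name classify_triple and its arguments are hypothetical, the separating sets are assumed to be given as a list of integer vectors, and the empty-list case (governed by version.unf in the text) is simply reported as NA.

```r
## Hypothetical helper: classify an unshielded triple A-B-C from the
## separating sets found for A and C (each set is a vector of node indices).
classify_triple <- function(b, sepsets, maj.rule = FALSE) {
  ## No separating set at all: the text handles this via version.unf.
  if (length(sepsets) == 0) return(NA_character_)
  ## Fraction of separating sets that contain the middle node B
  frac <- mean(vapply(sepsets, function(s) b %in% s, logical(1)))
  if (maj.rule) {
    ## Majority rule: ambiguous only at exactly 50 percent
    if (frac == 0.5) "ambiguous"
    else if (frac < 0.5) "v-structure"
    else "non v-structure"
  } else {
    ## Conservative rule: ambiguous if B is in some but not all sets
    if (frac == 0) "v-structure"
    else if (frac == 1) "non v-structure"
    else "ambiguous"
  }
}

## B = node 2 is in one of three separating sets
seps <- list(2, 4, 5)
classify_triple(2, seps, maj.rule = FALSE)
classify_triple(2, seps, maj.rule = TRUE)
```

With the same input, the conservative rule flags the triple as ambiguous while the majority rule still orients it, which is the sense in which the majority rule is less strict.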
unfTripl |
numeric vector of triples coded as numbers (via
|
vers |
Vector containing the version (1 or 2) of the corresponding triple saved in unfTripl (1 = normal ambiguous triple, i.e., B is in some but not all sepsets; 2 = triple coming from version.unf[1]==2, i.e., A and C are independent given the initial sepset, but no subset of the neighbours of A or of C d-separates them). |
sk |
The updated skeleton-object (separating sets might have been updated). |
Markus Kalisch ([email protected]) and Diego Colombo.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
J. Ramsey, J. Zhang and P. Spirtes (2006). Adjacency-faithfulness and conservative causal inference. In Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence, Arlington, VA. AUAI Press.
Transform the adjacency matrix of type amat.cpdag
or
amat.pag
(for details on coding see amatType
).
pcalg2dagitty(amat, labels, type = "cpdag")
amat |
adjacency matrix of type |
labels |
|
type |
string specifying the type of graph of the adjacency matrix |
For a given adjacency matrix amat of the form amat.cpdag or amat.pag and a specified graph type, this function returns a dagitty object corresponding to the graph structure specified by amat, labels and type.
The resulting object is compatible with the dagitty package.
A dagitty graph (see the dagitty package).
Emilija Perkovic and Markus Kalisch
data(gmG)
n <- nrow(gmG8$x)
V <- colnames(gmG8$x) # labels aka node names
amat <- wgtMatrix(gmG8$g)
amat[amat != 0] <- 1
if(requireNamespace("dagitty", quietly = TRUE)) {
  dagitty_dag1 <- pcalg2dagitty(amat, V, type = "dag")
}
This function is DEPRECATED! Use skeleton
, pc
or
fci
instead.
Use the PC-algorithm to estimate the underlying graph (“skeleton”) or the equivalence class (CPDAG) of a DAG.
pcAlgo(dm = NA, C = NA, n=NA, alpha, corMethod = "standard", verbose=FALSE, directed=FALSE, G=NULL, datatype = "continuous", NAdelete=TRUE, m.max=Inf, u2pd = "rand", psepset=FALSE)
dm |
Data matrix; rows correspond to samples, cols correspond to nodes. |
C |
Correlation matrix; this is an alternative for specifying the data matrix. |
n |
Sample size; this is only needed if the data matrix is not provided. |
alpha |
Significance level for the individual partial correlation tests. |
corMethod |
A character string specifying the method for
(partial) correlation estimation.
"standard", "QnStable", "Qn" or "ogkQn" for standard and robust (based on
the Qn scale estimator without and with OGK) correlation
estimation. For robust estimation, we recommend |
verbose |
0-no output, 1-small output, 2-details; using 1 or 2 makes the function much slower |
directed |
If |
G |
The adjacency matrix of the graph from which the algorithm should start (logical) |
datatype |
Distinguish between discrete and continuous data |
NAdelete |
Delete edge if pval=NA (for discrete data) |
m.max |
Maximal size of conditioning set |
u2pd |
Function used for converting skeleton to cpdag. "rand" (use udag2pdag); "relaxed" (use udag2pdagRelaxed); "retry" (use udag2pdagSpecial) |
psepset |
If true, also possible separation sets are tested. |
An object of class
"pcAlgo"
(see
pcAlgo
) containing an undirected graph
(object of class
"graph"
, see
graph-class
from the package graph)
(without weights) as estimate of the skeleton or the CPDAG of the
underlying DAG.
Markus Kalisch ([email protected]) and Martin Maechler.
P. Spirtes, C. Glymour and R. Scheines (2000) Causation, Prediction, and Search, 2nd edition, The MIT Press.
M. Kalisch and P. Bühlmann (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. Journal of Machine Learning Research 8, 613-636.
This class of objects is returned by the functions
skeleton
and pc
to represent the
(skeleton of an) estimated CPDAG.
Objects of this class have methods for the functions plot, show and
summary.
## S4 method for signature 'pcAlgo,ANY'
plot(x, y, main = NULL, zvalue.lwd = FALSE, lwd.max = 7, labels = NULL, ...)

## S3 method for class 'pcAlgo'
print(x, amat = FALSE, zero.print = ".", ...)

## S4 method for signature 'pcAlgo'
summary(object, amat = TRUE, zero.print = ".", ...)

## S4 method for signature 'pcAlgo'
show(object)
x , object
|
a |
y |
(generic |
main |
main title for the plot (with an automatic default). |
zvalue.lwd |
|
lwd.max |
maximal |
labels |
if non- |
amat |
|
zero.print |
string for printing |
... |
optional further arguments (passed from and to methods). |
Objects are typically created as result from
skeleton()
or pc()
, but can also be
created by calls of the form new("pcAlgo", ...)
.
The slots call
, n
, max.ord
, n.edgetests
,
sepset
, and pMax
are inherited from class
"gAlgo"
, see there.
In addition, "pcAlgo"
has slots
graph
:Object of class "graph-class"
:
the undirected or partially directed graph that was estimated.
zMin
:Deprecated.
Class "gAlgo"
.
signature(x = "pcAlgo")
: Plot the resulting
graph. If argument "zvalue.lwd"
is true, the
line width of an edge reflects zMin
, so that
thicker lines indicate more reliable dependencies. The argument
"lwd.max"
controls the maximum linewidth.
signature(object = "pcAlgo")
: Show basic properties of
the fitted object
signature(object = "pcAlgo")
: Show details of
the fitted object
Markus Kalisch and Martin Maechler
showClass("pcAlgo")

## generate a pcAlgo object
p <- 8
set.seed(45)
myDAG <- randomDAG(p, prob = 0.3)
n <- 10000
d.mat <- rmvDAG(n, myDAG, errDist = "normal")
suffStat <- list(C = cor(d.mat), n = n)
pc.fit <- pc(suffStat, indepTest = gaussCItest, alpha = 0.01, p = p)

## use methods of class pcAlgo
show(pc.fit)
if(require(Rgraphviz))
  plot(pc.fit, main = "Fitted graph")
summary(pc.fit)

## access slots of this object
(g <- pc.fit@graph)
str(ss <- pc.fit@sepset, max=1)
This function computes partial correlations given a correlation matrix using a recursive algorithm.
pcorOrder(i,j, k, C, cut.at = 0.9999999)
i , j
|
(integer) position of variable |
k |
(integer) positions of zero or more conditioning variables in the correlation matrix. |
C |
Correlation matrix (matrix) |
cut.at |
Number slightly smaller than one; if |
The partial correlations are computed using a recursive formula
if the size of the conditioning set is one. For larger conditioning
sets, the pseudoinverse of parts of the correlation matrix is
computed (by pseudoinverse()
from package
corpcor). The pseudoinverse instead of the inverse is used in
order to avoid numerical problems.
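The two computations mentioned above can be compared in a short base R sketch. The function names pcor_rec1 and pcor_inv are hypothetical, and solve() is used here as a stand-in for the pseudoinverse from corpcor, so this shows the idea rather than the package's code.

```r
## First-order recursion: partial correlation of i and j given a single k
pcor_rec1 <- function(i, j, k, C) {
  (C[i, j] - C[i, k] * C[j, k]) /
    sqrt((1 - C[i, k]^2) * (1 - C[j, k]^2))
}

## General case: invert the relevant part of the correlation matrix.
## Assumption: the submatrix is well conditioned, so a plain inverse is
## used instead of the pseudoinverse named in the text.
pcor_inv <- function(i, j, k, C) {
  idx <- c(i, j, k)
  P <- solve(C[idx, idx])
  -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
}

set.seed(1)
X <- matrix(rnorm(200 * 4), 200, 4)
X[, 2] <- X[, 2] + 0.5 * X[, 3]   # introduce some correlation
C <- cor(X)

pcor_rec1(1, 2, 3, C)             # conditioning set of size one
pcor_inv(1, 2, 3, C)              # same quantity via matrix inversion
pcor_inv(1, 2, c(3, 4), C)        # larger conditioning set
```

For a conditioning set of size one both routes give the same number up to rounding; the matrix route also covers larger sets.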
The partial correlation of i and j given the set k.
Markus Kalisch [email protected] and Martin Maechler
condIndFisherZ
for testing zero partial correlation.
## produce uncorrelated normal random variables
mat <- matrix(rnorm(3*20),20,3)
## compute partial correlation of var1 and var2 given var3
pcorOrder(1,2, 3, cor(mat))

## define graphical model, simulate data and compute
## partial correlation with bigger conditional set
genDAG <- randomDAG(20, prob = 0.2)
dat <- rmvDAG(1000, genDAG)
C <- cor(dat)
pcorOrder(2,5, k = c(3,7,8,14,19), C)
The goal is feature selection: Given a response variable y and a data matrix dm, we want to know which variables are “strongly influential” on y. The type of influence is the same as in the PC-Algorithm, i.e., y and x (a column of dm) are associated if they are correlated even when conditioning on any subset of the remaining columns in dm. Therefore, only very strong relations will be found and the result is typically a subset of what other feature selection techniques select. Note that there are also robust correlation methods available which render this method robust.
pcSelect(y, dm, alpha, corMethod = "standard", verbose = FALSE, directed = FALSE)
y |
response vector. |
dm |
data matrix (rows: samples/observations, columns: variables);
|
alpha |
significance level of individual partial correlation tests. |
corMethod |
a string determining the method for correlation
estimation via |
verbose |
Note that such diagnostic output may make the function considerably slower. |
directed |
logical; should the output graph be directed? |
This function basically applies pc
on the data
matrix obtained by joining y
and dm
. Since the output is
not concerned with the edges found within the columns of dm
,
the algorithm is adapted accordingly. Therefore, the runtime is
typically reduced and the ability to deal with large datasets improved
substantially.
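The selection idea can be sketched in base R as follows. This is a simplified illustration in the spirit of the PC-simple algorithm from the reference below, not the code of pcSelect: the function name pc_simple_sketch is hypothetical, a Fisher z-test on partial correlations is assumed as the independence test, and plain matrix inversion is used for the partial correlations.

```r
## Keep column j of dm only if y and x_j stay significantly correlated
## given every subset (of growing size m) of the other remaining columns.
pc_simple_sketch <- function(y, dm, alpha = 0.05) {
  n <- nrow(dm)
  C <- cor(cbind(y, dm))                 # column 1 is the response
  active <- rep(TRUE, ncol(dm))
  pcor <- function(j, k) {               # partial cor of y and x_j given x_k
    idx <- c(1, j + 1, k + 1)
    P <- solve(C[idx, idx])
    -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
  }
  m <- 0
  while (sum(active) - 1 >= m) {
    keep <- active
    for (j in which(active)) {
      others <- setdiff(which(active), j)
      sets <- if (m == 0) list(integer(0))
              else combn(length(others), m, simplify = FALSE)
      for (s in sets) {
        z <- sqrt(n - m - 3) * atanh(pcor(j, others[s]))  # Fisher z
        if (abs(z) <= qnorm(1 - alpha / 2)) {             # not significant
          keep[j] <- FALSE
          break
        }
      }
    }
    active <- keep                       # update only after a full sweep
    m <- m + 1
  }
  active
}

set.seed(1)
n <- 2000
X <- matrix(rnorm(n * 5), n, 5)
y <- 2 * X[, 1] - 2 * X[, 2] + rnorm(n)
which(pc_simple_sketch(y, X, alpha = 0.01))
```

Only edges between y and the columns of dm are ever tested, which is why the adapted procedure is much cheaper than running the full PC algorithm on the joined matrix.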
G |
A |
zMin |
The minimal z-values when testing partial correlations
between |
Markus Kalisch ([email protected]) and Martin Maechler.
Buehlmann, P., Kalisch, M. and Maathuis, M.H. (2010). Variable selection for high-dimensional linear models: partially faithful distributions and the PC-simple algorithm. Biometrika 97, 261–278.
pc
which is the more general version of this function;
pcSelect.presel
which applies pcSelect()
twice.
p <- 10
## generate and draw random DAG :
suppressWarnings(RNGversion("3.5.0"))
set.seed(101)
myDAG <- randomDAG(p, prob = 0.2)
if (require(Rgraphviz)) {
  plot(myDAG, main = "randomDAG(10, prob = 0.2)")
}
## generate 1000 samples of DAG using standard normal error distribution
n <- 1000
d.mat <- rmvDAG(n, myDAG, errDist = "normal")

## let's pretend that the 10th column is the response and the first 9
## columns are explanatory variable. Which of the first 9 variables
## "cause" the tenth variable?
y <- d.mat[,10]
dm <- d.mat[,-10]
(pcS <- pcSelect(d.mat[,10], d.mat[,-10], alpha=0.05))
## You see, that variable 4,5,6 are considered as important
## By inspecting zMin,
with(pcS, zMin[G])
## you can also see that the influence of variable 6
## is most evident from the data (its zMin is 18.64, so quite large - as
## a rule of thumb for judging what is large, you could use quantiles
## of the Standard Normal Distribution)
This function uses pcSelect
to preselect some covariates
and then runs pcSelect
again on the reduced data set.
pcSelect.presel(y, dm, alpha, alphapre, corMethod = "standard", verbose = 0, directed=FALSE)
y |
Response vector. |
dm |
Data matrix (rows: samples, cols: nodes; i.e.,
|
alpha |
Significance level of individual partial correlation tests. |
alphapre |
Significance level for pcSelect in preselection |
corMethod |
"standard" or "Qn" for standard or robust correlation estimation |
verbose |
0-no output, 1-small output, 2-details (using 1 and 2 makes the function very much slower) |
directed |
Logical; should the output graph be directed? |
First, pcSelect
is run using alphapre
. Then,
only the important variables are kept and pcSelect
is run on
them again.
pcs |
Logical vector indicating which column of |
zMin |
The minimal z-values when testing partial correlations
between |
Xnew |
Preselected Variables. |
Philipp Ruetimann
p <- 10
## generate and draw random DAG :
set.seed(101)
myDAG <- randomDAG(p, prob = 0.2)
if(require(Rgraphviz))
  plot(myDAG, main = "randomDAG(10, prob = 0.2)")

## generate 1000 samples of DAG using standard normal error distribution
n <- 1000
d.mat <- rmvDAG(n, myDAG, errDist = "normal")

## let's pretend that the 10th column is the response and the first 9
## columns are explanatory variable. Which of the first 9 variables
## "cause" the tenth variable?
y <- d.mat[,10]
dm <- d.mat[,-10]
res <- pcSelect.presel(d.mat[,10], d.mat[,-10], alpha=0.05, alphapre=0.6)
pdag2allDags
computes all DAGs in the Markov equivalence class
represented by a given partially directed acyclic graph (PDAG).
pdag2allDags(gm, verbose = FALSE)
gm |
adjacency matrix of type amat.cpdag |
verbose |
logical; if true, some output is produced during computation |
All DAGs extending the given PDAG are computed while avoiding new
v-structures and cycles. If no DAG is found, the function returns NULL
.
List with two elements:
dags: |
Matrix; every row corresponds to a DAG; every column
corresponds to an entry in the adjacency matrix of this DAG. Thus, the
adjacency matrix (of type amat.cpdag) contained in the i-th row
of matrix |
nodeNms |
Node labels of the input PDAG. |
Markus Kalisch ([email protected])
## Example 1
gm <- rbind(c(0,1),
            c(1,0))
colnames(gm) <- rownames(gm) <- LETTERS[1:2]
res1 <- pdag2allDags(gm)
## adjacency matrix of first DAG in output
amat1 <- matrix(res1$dags[1,],2,2, byrow = TRUE)
colnames(amat1) <- rownames(amat1) <- res1$nodeNms
amat1 ## A --> B

## Example 2
gm <- rbind(c(0,1,1),
            c(1,0,1),
            c(1,1,0))
colnames(gm) <- rownames(gm) <- LETTERS[1:ncol(gm)]
res2 <- pdag2allDags(gm)
## adjacency matrix of first DAG in output
amat2 <- matrix(res2$dags[1,],3,3, byrow = TRUE)
colnames(amat2) <- rownames(amat2) <- res2$nodeNms
amat2

## Example 3
gm <- rbind(c(0,1,1,0,0),
            c(1,0,0,0,0),
            c(1,0,0,0,0),
            c(0,1,1,0,1),
            c(0,0,0,1,0))
colnames(gm) <- rownames(gm) <- LETTERS[1:ncol(gm)]
res3 <- pdag2allDags(gm)
## adjacency matrix of first DAG in output
amat3 <- matrix(res3$dags[1,],5,5, byrow = TRUE)
colnames(amat3) <- rownames(amat3) <- res3$nodeNms
amat3

if (require(Rgraphviz)) {
  ## for convenience a simple plotting function
  ## for the function output
  plotAllDags <- function(res) {
    require(graph)
    p <- sqrt(ncol(res$dags))
    nDags <- ceiling(sqrt(nrow(res$dags)))
    par(mfrow = c(nDags, nDags))
    for (i in 1:nrow(res$dags)) {
      tmp <- matrix(res$dags[i,],p,p)
      colnames(tmp) <- rownames(tmp) <- res$nodeNms
      plot(as(tmp, "graphNEL"))
    }
  }
  plotAllDags(res1)
  amat1 ## adj.matrix corresponding to the first plot for expl 1
  plotAllDags(res2)
  amat2 ## adj.matrix corresponding to the first plot for expl 2
  plotAllDags(res3)
  amat3 ## adj.matrix corresponding to the first plot for expl 3
}
This function extends a PDAG (Partially Directed Acyclic Graph) to a DAG, if this is possible.
pdag2dag(g, keepVstruct=TRUE)
g |
Input PDAG (graph object) |
keepVstruct |
Logical indicating if the v-structures in g are kept. Otherwise they are ignored and an arbitrary extension is generated. |
Direct undirected edges without creating directed cycles or additional v-structures. The PDAG is consistently extended to a DAG using the algorithm by Dor and Tarsi (1992). If no extension is possible, a DAG corresponding to the skeleton of the PDAG is generated and a warning message is produced.
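The extension step can be illustrated with a compact base R sketch of the Dor and Tarsi idea: repeatedly pick a sink node whose undirected neighbours are adjacent to all its other neighbours, orient the undirected edges into it, and remove it. The function name extend_pdag_sketch is hypothetical, and the matrix coding is an assumption made for this sketch (A[i, j] = 1 with A[j, i] = 0 means i -> j; both entries 1 means i - j), which is not the amat coding of the package. Unlike pdag2dag, the sketch returns NULL when no consistent extension exists.

```r
extend_pdag_sketch <- function(A) {
  out <- A
  alive <- rep(TRUE, nrow(A))            # nodes not yet removed
  while (any(alive)) {
    found <- FALSE
    for (x in which(alive)) {
      others <- setdiff(which(alive), x)
      ## (a) x is a sink: no directed edge x -> y among remaining nodes
      if (any(A[x, others] == 1 & A[others, x] == 0)) next
      und <- others[A[x, others] == 1 & A[others, x] == 1]  # x - y
      adj <- others[A[x, others] == 1 | A[others, x] == 1]  # all neighbours
      ## (b) each undirected neighbour is adjacent to all other neighbours
      ok <- all(vapply(und, function(y) {
        rest <- setdiff(adj, y)
        all(A[y, rest] == 1 | A[rest, y] == 1)
      }, logical(1)))
      if (!ok) next
      out[x, und] <- 0                   # orient y -> x
      alive[x] <- FALSE                  # remove x and continue
      found <- TRUE
      break
    }
    if (!found) return(NULL)             # no consistent extension
  }
  out
}

## undirected chain 1 - 2 - 3
A <- matrix(0, 3, 3)
A[1, 2] <- A[2, 1] <- A[2, 3] <- A[3, 2] <- 1
extend_pdag_sketch(A)
```

Condition (b) is what prevents new v-structures: orienting two non-adjacent neighbours into the same node would create one.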
List with entries
graph |
Contains a consistent DAG extension (graph object), |
success |
Is |
Markus Kalisch [email protected]
D. Dor and M. Tarsi (1992). A simple algorithm to construct a consistent extension of a partially oriented graph. Technical Report R-185, Cognitive Systems Laboratory, UCLA.
p <- 10    # number of random variables
n <- 10000 # number of samples
s <- 0.4   # sparseness of the graph

## generate random data
set.seed(42)
g <- randomDAG(p, prob = s) # generate a random DAG
d <- rmvDAG(n,g)            # generate random samples
gSkel <- pcAlgo(d,alpha=0.05) # estimate of the skeleton
(gPDAG <- udag2pdag(gSkel))
(gDAG <- pdag2dag(gPDAG@graph))
Estimate the final skeleton in the FCI algorithm (Spirtes et al, 2000), as described in Steps 2 and 3 of Algorithm 3.1 in Colombo et al. (2012). The input of this function consists of an initial skeleton that was estimated by the PC algorithm (Step 1 of Algorithm 3.1 in Colombo et al. (2012)).
Given the initial skeleton, all unshielded triples are considered and
oriented as colliders when appropriate. Then, for all nodes x in the
resulting partially directed graph G, Possible-D-SEP(x,G) is computed,
using the function qreach
. Finally, for any edge y-z that is
present in G and that is not flagged as fixed by the fixedEdges
argument, conditional independence between y and z is tested given
all subsets of Possible-D-SEP(y,G) and all subsets of
Possible-D-SEP(z,G). These tests are done at level alpha, using
indepTest
. If the pair of nodes is judged to be independent
given some set S, then S is recorded in sepset(y,z) and sepset(z,y)
and the edge y-z is deleted. Otherwise, the edge remains and there is
no change to sepset.
pdsep(skel, suffStat, indepTest, p, sepset, alpha, pMax, m.max = Inf, pdsep.max = Inf, NAdelete = TRUE, unfVect = NULL, biCC = FALSE, fixedEdges = NULL, verbose = FALSE)
skel |
Graph object returned by |
suffStat |
Sufficient statistic: A list containing all necessary
elements for making conditional independence decisions using
function |
indepTest |
Predefined function for testing conditional independence. The
function is internally called as |
p |
Number of variables. |
sepset |
List of length |
alpha |
Significance level for the individual conditional independence tests. |
pMax |
Matrix with the maximal p-values of conditional
independence tests in a previous call of |
m.max |
Maximum size of the conditioning sets that are considered in the conditional independence tests. |
pdsep.max |
Maximum size of Possible-D-SEP for which subsets are
considered as conditioning sets in the conditional independence
tests. If the nodes |
NAdelete |
If indepTest returns |
unfVect |
Vector containing numbers that encode the unfaithful
triple (as returned by |
biCC |
Logical; if |
fixedEdges |
a logical symmetric matrix of dimension p*p. If entry
|
verbose |
Logical indicating that detailed output is to be provided. |
To make the code more efficient, we only perform tests that were not performed in the estimation of the initial skeleton.
Note that the Possible-D-SEP sets are computed once in the beginning. They are not updated after edge deletions, in order to make sure that the output of the algorithm does not depend on the ordering of the variables (see also Colombo and Maathuis (2014)).
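The search over conditioning sets for a single edge y-z can be sketched as below. This is a hedged illustration only: find_sepset_sketch is a hypothetical name, the candidate set stands for a precomputed Possible-D-SEP set, and it is assumed that the independence test is called as indepTest(x, y, S, suffStat), returns a p-value, and that independence is declared when the p-value is at least alpha. The bookkeeping that skips tests already done for the initial skeleton is left out.

```r
find_sepset_sketch <- function(y, z, cand, indepTest, suffStat, alpha,
                               m.max = Inf) {
  cand <- setdiff(cand, c(y, z))         # never condition on the endpoints
  for (m in 0:min(length(cand), m.max)) {
    idx <- if (m == 0) list(integer(0))
           else combn(length(cand), m, simplify = FALSE)
    for (i in idx) {
      S <- cand[i]
      ## judged independent: S would be stored in sepset and y-z deleted
      if (indepTest(y, z, S, suffStat) >= alpha) return(S)
    }
  }
  NULL                                   # edge y-z stays
}

## toy oracle (assumption): nodes 1 and 2 are separated only by sets
## containing node 3
oracle <- function(x, y, S, suffStat) if (3 %in% S) 1 else 0
find_sepset_sketch(1, 2, cand = c(3, 4), indepTest = oracle,
                   suffStat = NULL, alpha = 0.01)
```

Indexing with combn over positions rather than over the node vector itself avoids the base R pitfall that combn(5, 1) enumerates 1:5 instead of the single node 5.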
A list with the following elements:
G |
Updated adjacency matrix representing the final skeleton |
sepset |
Updated sepsets |
pMax |
Updated matrix containing maximal p-values |
allPdsep |
Possible-D-Sep for each node |
max.ord |
Maximal order of conditioning sets during independence tests |
n.edgetests |
Number of conditional edgetests performed, grouped by the size of the conditioning set. |
Markus Kalisch ([email protected]) and Diego Colombo.
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition. The MIT Press.
D. Colombo, M.H. Maathuis, M. Kalisch and T.S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Annals of Statistics 40, 294–321.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
qreach
to find Possible-D-SEP(x,G);
fci
.
p <- 10
## generate and draw random DAG:
set.seed(44)
myDAG <- randomDAG(p, prob = 0.2)

## generate 10000 samples of DAG using gaussian distribution
library(RBGL)
n <- 10000
d.mat <- rmvDAG(n, myDAG, errDist = "normal")

## estimate skeleton
indepTest <- gaussCItest
suffStat <- list(C = cor(d.mat), n = n)
alpha <- 0.01
skel <- skeleton(suffStat, indepTest, alpha=alpha, p=p)

## prepare input for pdsep
sepset <- skel@sepset
pMax <- skel@pMax

## call pdsep to find Possible-D-Sep and enhance the skeleton
pdsepRes <- pdsep(skel@graph, suffStat, indepTest, p, sepset, alpha,
                  pMax, verbose = TRUE)

## call pdsep with biconnected components to find Possible-D-Sep and enhance the skeleton
pdsepResBicc <- pdsep(skel@graph, suffStat, indepTest, p, sepset, alpha,
                      pMax, biCC= TRUE, verbose = TRUE)
This function is DEPRECATED! Use the plot
method of the
fciAlgo
class instead.
plotAG(amat)
amat |
Adjacency matrix (coding 0,1,2,3 for no edge, circle,
arrowhead, tail; e.g., |
Markus Kalisch ([email protected])
Plots a subgraph for a specified starting node and a given graph. The subgraph consists of those nodes that can be reached from the starting node by passing no more than a specified number of edges.
plotSG(graphObj, y, dist, amat = NA, directed = TRUE, plot = requireNamespace("Rgraphviz"), main = , cex.main = 1.25, font.main = par("font.main"), col.main=par("col.main"), ...)
graphObj |
An R object of class |
y |
(integer) position of the starting node in the adjacency matrix. |
dist |
Distance of nodes included in subgraph from starting node |
amat |
Precomputed adjacency matrix of type amat.cpdag (optional) |
directed |
|
plot |
logical indicating if the subgraph should be plotted (or just returned). Defaults to true when Rgraphviz is installed. |
main |
title to be used, with a sensible default; see
|
cex.main , font.main , col.main
|
optional settings for the
|
... |
optional arguments passed to the |
Commencing at the starting point y
the function looks for the
neighbouring nodes. Beginning with direct parents and children it
will continue hierarchically through the distances to y
. Note
that the neighbourhood does not depend on edge directions. If
directed
is true (as per default), the orientation of the edges
is taken from the initial graph.
For the plotting, the package Rgraphviz must be installed.
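The neighbourhood search described above amounts to a breadth-first search that ignores edge directions. A minimal base R sketch, with the hypothetical helper name nodes_within and a plain 0/1 adjacency matrix as assumed input:

```r
## Return the positions of all nodes reachable from node y by passing
## at most 'dist' edges, regardless of edge direction.
nodes_within <- function(A, y, dist) {
  U <- (A != 0) | (t(A) != 0)            # symmetrize: drop directions
  reached <- y
  frontier <- y
  for (d in seq_len(dist)) {
    nb <- which(colSums(U[frontier, , drop = FALSE]) > 0)
    frontier <- setdiff(nb, reached)     # nodes first seen at distance d
    if (length(frontier) == 0) break
    reached <- c(reached, frontier)
  }
  sort(reached)
}

## chain 1 -> 2 -> 3 -> 4
A <- matrix(0, 4, 4)
A[1, 2] <- A[2, 3] <- A[3, 4] <- 1
nodes_within(A, 2, 1)
nodes_within(A, 2, 2)
```

The subgraph induced by the returned nodes can then be drawn with or without the original edge orientations, which corresponds to the directed argument.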
the desired subgraph is returned; invisibly, i.e., via
invisible
, if plot
is true.
Daniel Stekhoven, then Martin Maechler.
## generate a random DAG:
p <- 10
set.seed(45)
myDAG <- randomDAG(p, prob = 0.3)

if(requireNamespace("Rgraphviz")) {
  ## plot the whole DAG
  plot(myDAG, main = "randomDAG(10, prob = 0.3)")

  op <- par(mfrow = c(3,2))
  ## plot the neighbours of node number 8 up to distance 1
  plotSG(myDAG, 8, 1, directed = TRUE)
  plotSG(myDAG, 8, 1, directed = FALSE)

  ## plot the neighbours of node number 8 up to distance 2
  plotSG(myDAG, 8, 2, directed = TRUE)
  plotSG(myDAG, 8, 2, directed = FALSE)

  ## plot the neighbours of node number 8 up to distance 3
  plotSG(myDAG, 8, 3, directed = TRUE)
  plotSG(myDAG, 8, 3, directed = FALSE)

  ## Note that the layout of the subgraph might be different than in the
  ## original graph, but the graph structure is identical
  par(op)
} else { ## without 'Rgraphviz'
  sg2d <- plotSG(myDAG, 8, 2, directed = TRUE, plot=FALSE)
  sg2u <- plotSG(myDAG, 8, 2, directed = FALSE, plot=FALSE)
}
In a DAG, CPDAG, MAG or PAG, determine which nodes are (possible) ancestors of x, either on definite status paths or on any paths, optionally avoiding given nodes on these paths.
possAn(m, x, y = NULL, possible = TRUE, ds = TRUE, type = c("cpdag", "pdag", "dag", "mag", "pag"))
m |
Adjacency matrix in coding according to type. |
x |
Node positions of starting nodes. |
y |
Node positions of nodes through which a path must not go. |
possible |
If |
ds |
If |
type |
Type of adjacency matrix in |
Not all combinations of the arguments are currently implemented; calling an unimplemented combination raises an error.
Vector of all node positions found as (possible) ancestors of the nodes in x.
Markus Kalisch
## a -- b -> c
amat <- matrix(c(0,1,0, 1,0,1, 0,0,0), 3,3)
colnames(amat) <- rownames(amat) <- letters[1:3]
if (require(Rgraphviz)) plot(as(t(amat), "graphNEL"))

possAn(m = amat, x = 3, possible = TRUE, ds = FALSE, type = "pdag") ## all nodes
possAn(m = amat, x = 3, y = 2, possible = TRUE, ds = FALSE, type = "pdag") ## only node 3
In a DAG, CPDAG, MAG or PAG, determine which nodes are (possible) descendants of x, either on definite status paths or on any paths, optionally avoiding given nodes on those paths.
possDe(m, x, y = NULL, possible = TRUE, ds = TRUE, type = c("cpdag", "pdag", "dag", "mag", "pag"))
m |
Adjacency matrix in coding according to type. |
x |
Node positions of starting nodes. |
y |
Node positions of nodes through which a path must not go. |
possible |
If |
ds |
If |
type |
Type of adjacency matrix in |
Not all combinations of the arguments are currently implemented; calling an unimplemented combination raises an error.
Vector of all node positions found as (possible) descendants of the nodes in x.
Markus Kalisch
## a -> b -- c
amat <- matrix(c(0,1,0, 0,0,1, 0,1,0), 3,3)
colnames(amat) <- rownames(amat) <- letters[1:3]
if (require(Rgraphviz)) plot(as(t(amat), "graphNEL"))

possDe(m = amat, x = 1, possible = TRUE, ds = FALSE, type = "pdag") ## all nodes
possDe(m = amat, x = 1, possible = FALSE, ds = FALSE, type = "pdag") ## only nodes 1 and 2
possDe(m = amat, x = 1, y = 2, possible = TRUE, ds = FALSE, type = "pdag") ## only node 1
This function is DEPRECATED! Use possDe instead.
In a DAG, CPDAG, MAG or PAG determine which nodes are possible descendants of x on definite status paths.
possibleDe(amat, x)
amat |
adjacency matrix of type amat.pag |
x |
(integer) position of node |
A non-endpoint vertex X on a path p in a partial mixed graph is said to be of a definite status if it is either a collider or a definite non-collider on p. The path p is said to be of a definite status if all non-endpoint vertices on the path are of a definite status (see e.g. Maathuis and Colombo (2015), Def. 3.4).
A possible descendant of x can be reached by moving to adjacent nodes of x but never going against an arrowhead.
Vector with possible descendants.
Diego Colombo
M.H. Maathuis and D. Colombo (2015). A generalized back-door criterion. Annals of Statistics 43 1060-1088.
amat <- matrix( c(0,3,0,0,0,0, 2,0,2,0,0,0, 0,3,0,0,0,0, 0,0,0,0,1,0,
                  0,0,0,1,0,1, 0,0,0,0,1,0), 6,6)
colnames(amat) <- rownames(amat) <- letters[1:6]
if(require(Rgraphviz)) {
  plotAG(amat)
}

possibleDe(amat, 1) ## a, b are poss. desc. of a
possibleDe(amat, 4) ## d, e, f are poss. desc. of d
Let G be a graph with the following edge types: o-o, o-> or <->, and let x be a vertex in the graph. Then this function computes Possible-D-SEP(x,G), which is defined as follows:
v is in Possible-D-SEP(x,G) iff there is a path p between x and v in G such that for every subpath <s,t,u> of p, t is a collider on this subpath or <s,t,u> is a triangle in G.
See Spirtes et al (2000) or Definition 3.3 of Colombo et al (2012).
qreach(x, amat, verbose = FALSE)
x |
(integer) position of vertex |
amat |
Adjacency matrix of type amat.pag. |
verbose |
Logical, asking for details on output |
Vector of column positions indicating the nodes in Possible-D-SEP of x.
Markus Kalisch
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition, The MIT Press.
D. Colombo, M.H. Maathuis, M. Kalisch, T.S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Annals of Statistics 40, 294–321.
fci and pdsep, which both use this function.
Generate a random Gaussian causal model. Parameters specifying the connectivity as well as the coefficients and error terms of the corresponding linear structural equation model can be specified. The observational expectation value of the generated model is always 0, meaning that no intercept terms are drawn.
r.gauss.pardag(p, prob, top.sort = FALSE, normalize = FALSE, lbe = 0.1, ube = 1, neg.coef = TRUE, labels = as.character(1:p), lbv = 0.5, ubv = 1)
p |
the number of nodes. |
prob |
probability of connecting a node to another node. |
top.sort |
|
normalize |
|
lbe , ube
|
lower and upper bounds of the absolute values of edge weights. |
neg.coef |
logical indicating whether negative edge weights are also admissible. |
labels |
(optional) character vector of variable (or “node”) names. |
lbv , ubv
|
lower and upper bound on error variances of the noise terms in the structural equations. |
The underlying directed acyclic graph (DAG) is generated by drawing an undirected graph from an Erdős-Rényi model and orienting the edges according to a random topological ordering drawn uniformly from the set of permutations of p variables. This means that any two nodes are connected with (the same) probability prob, and that the connectivity of different pairs of nodes is independent.
A Gaussian causal model can be represented as a set of linear structural equations. The regression coefficients of the model can be represented as "edge weights" of the DAG. Edge weights are drawn uniformly and independently from the interval between lbe and ube; if neg.coef = TRUE, their sign is flipped with probability 0.5. Error variances are drawn uniformly and independently from the interval between lbv and ubv.
If normalize = TRUE, the edge weights and error variances are normalized in the end to ensure that the diagonal elements of the observational covariance matrix are all 1; the procedure used is described in Hauser and Bühlmann (2012). Note that in this case the error variances and edge weights are no longer guaranteed to lie in the specified intervals after normalization.
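The sampling scheme described above can be sketched in base R. This is only an illustrative sketch under stated assumptions, not the code behind r.gauss.pardag: the helper name sketch_gauss_pardag and its return value (a plain list rather than a "GaussParDAG" object) are made up for this example, and the normalize step is left out.

```r
## Illustrative sketch (hypothetical helper, not part of pcalg).
sketch_gauss_pardag <- function(p, prob, lbe = 0.1, ube = 1, neg.coef = TRUE,
                                lbv = 0.5, ubv = 1) {
  stopifnot(p >= 2, prob >= 0, prob <= 1, lbe <= ube, lbv <= ubv)
  npairs <- choose(p, 2)
  ## Erdoes-Renyi skeleton: each pair of nodes is adjacent with probability prob
  present <- runif(npairs) < prob
  ## edge weights: absolute values uniform on [lbe, ube], random sign if allowed
  w <- runif(npairs, min = lbe, max = ube)
  if (neg.coef) w <- w * sample(c(-1, 1), npairs, replace = TRUE)
  ## B0[i, j] != 0 encodes i -> j for nodes indexed by topological position
  B0 <- matrix(0, p, p)
  B0[upper.tri(B0)] <- w * present
  ## random topological ordering: node u is placed at position pos[u]
  pos <- sample.int(p)
  B <- B0[pos, pos]
  list(weight = B,
       err.var = runif(p, min = lbv, max = ubv),
       position = pos)
}

set.seed(1)
m <- sketch_gauss_pardag(p = 6, prob = 0.5)
m$weight   # m$weight[u, v] != 0 means an edge u -> v with that weight
```

Sorting the rows and columns by order(m$position) gives a strictly upper triangular matrix, which is one way to confirm that the sketch always yields an acyclic graph.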
An object of class "GaussParDAG".
Alain Hauser
P. Erdős and A. Rényi (1960). On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5, 17–61.
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
set.seed(307)

## Plot some random DAGs
if (require(Rgraphviz)) {
  ## Topologically sorted random DAG
  myDAG <- r.gauss.pardag(p = 10, prob = 0.2, top.sort = TRUE)
  plot(myDAG)

  ## Unsorted DAG
  myDAG <- r.gauss.pardag(p = 10, prob = 0.2, top.sort = FALSE)
  plot(myDAG)
}

## Without normalization, edge weights and error variances lie within the
## specified borders
set.seed(307)
myDAG <- r.gauss.pardag(p = 10, prob = 0.4,
  lbe = 0.1, ube = 1, lbv = 0.5, ubv = 1.5, neg.coef = FALSE)
B <- myDAG$weight.mat()
V <- myDAG$err.var()
any((B > 0 & B < 0.1) | B > 1)
any(V < 0.5 | V > 1.5)

## After normalization, edge weights and error variances are not necessarily
## within the specified range, but the diagonal of the observational covariance
## matrix consists of ones only
set.seed(308)
myDAG <- r.gauss.pardag(p = 10, prob = 0.4, normalize = TRUE,
  lbe = 0.1, ube = 1, lbv = 0.5, ubv = 1.5, neg.coef = FALSE)
B <- myDAG$weight.mat()
V <- myDAG$err.var()
any((B > 0 & B < 0.1) | B > 1)
any(V < 0.5 | V > 1.5)
diag(myDAG$cov.mat())
Generate random directed acyclic graphs (DAGs) with a fixed expected number of neighbours. Several different methods are provided, each intentionally biased towards certain properties. The methods are based on the analogous *.game functions in the igraph package.
randDAG(n, d, method ="er", par1=NULL, par2=NULL, DAG = TRUE, weighted = TRUE, wFUN = list(runif, min=0.1, max=1))
n |
integer, at least |
d |
a positive number, corresponding to the expected number of neighbours per node, more precisely the expected sum of the in- and out-degree. |
method |
a string, specifying the method used for generating the random graph. See details below. |
par1 , par2
|
optional additional arguments, dependent on the method. See details. |
DAG |
logical, if |
weighted |
logical indicating if edge weights are computed according to |
wFUN |
a |
A (weighted) random graph with n nodes and expected number of neighbours d is constructed. For DAG=TRUE, the graph is oriented to a DAG. There are eight different random graph models provided, each selectable by the parameters method, par1 and par2, with method, a string, taking one of the following values:
regular: Graph where every node has exactly d incident edges. par1 and par2 are not used.
watts: Watts-Strogatz graph that interpolates between the regular (par1->0) and Erdoes-Renyi graph (par1->1). The parameter par1 defaults to 0.5 and has to be in (0,1). par2 is not used.
er: Erdoes-Renyi graph where every edge is present independently. par1 and par2 are not used.
power: A graph with power-law degree distribution with expectation d. par1 and par2 are not used.
bipartite: Bipartite graph with at least par1*n nodes in group 1 and at most (1-par1)*n nodes in group 2. The argument par1 has to be in [0,1] and defaults to 0.5. par2 is not used.
barabasi: A graph with power-law degree distribution and preferential attachment according to parameter par1. It must hold that par1 >= 1 and the default is par1=1. par2 is not used.
geometric: A geometric random graph in dimension par1, where par1 can take values from {2,3,4,5} and defaults to 2. If par2="geo" and weighted=TRUE, then the weights are computed according to the Euclidean distance. There are currently no other options for par2 implemented.
interEr: A graph with par1 islands of Erdoes-Renyi graphs, every pair of those connected by a certain number of edges proportional to par2 (fraction of inter-connectivity). It is required that n/par1 be an integer and that par2, being a fraction, lie between 0 and 1. Defaults are par1=2 and par2=0.25, respectively.
A graph object of class graphNEL.
The output is not topologically sorted (as opposed to the output of randomDAG).
Markus Kalisch and Manuel Schuerch.
These methods are mainly based on the analogous functions in the igraph package.
The package igraph, notably help pages such as sample_k_regular or sample_smallworld; unifDAG from package unifDAG for generating uniform random DAGs; randomDAG, a limited and soon deprecated version of randDAG; rmvDAG for generating multivariate data according to a DAG.
set.seed(38)
dag1 <- randDAG(10, 4, "regular")
dag2 <- randDAG(10, 4, "watts")
dag3 <- randDAG(10, 4, "er")
dag4 <- randDAG(10, 4, "power")
dag5 <- randDAG(10, 4, "bipartite")
dag6 <- randDAG(10, 4, "barabasi")
dag7 <- randDAG(10, 4, "geometric")
dag8 <- randDAG(10, 4, "interEr", par2 = 0.5)

if (require(Rgraphviz)) {
  par(mfrow=c(4,2))
  plot(dag1,main="Regular graph")
  plot(dag2,main="Watts-Strogatz graph")
  plot(dag3,main="Erdoes-Renyi graph")
  plot(dag4,main="Power-law graph")
  plot(dag5,main="Bipartite graph")
  plot(dag6,main="Barabasi graph")
  plot(dag7,main="Geometric random graph")
  plot(dag8,main="Interconnected island graph")
}

set.seed(45)
dag0 <- randDAG(6,3)
dag1 <- randDAG(6,3, weighted=FALSE)
dag2 <- randDAG(6,3, DAG=FALSE)
if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(dag1)
  plot(dag2) ## undirected graph
}
dag0@edgeData ## note the uniform weights between 0.1 and 1
dag1@edgeData ## note the constant weights

wFUN <- function(m,lB,uB) { runif(m,lB,uB) }
dag <- randDAG(6,3,wFUN=list(wFUN,1,4))
dag@edgeData ## note the uniform weights between 1 and 4
Generate a random Directed Acyclic Graph (DAG). The resulting graph is topologically ordered from low to high node numbers.
randomDAG(n, prob, lB = 0.1, uB = 1, V = as.character(1:n))
n |
Number of nodes, |
prob |
Probability of connecting a node to another node with higher topological ordering. |
lB , uB
|
Lower and upper limit of edge weights, chosen uniformly
at random, i.e., by |
V |
|
The n nodes are ordered. Start with the first node. Let the number of nodes with higher order be k. Then, the number of neighbouring nodes is drawn as Bin(k, prob). The neighbours are then drawn without replacement from the nodes with higher order. For each of these neighbours (i.e., for each new edge), a weight is uniformly sampled from lB to uB. This procedure is repeated for the next node in the original ordering and so on.
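The procedure above can be sketched in base R as follows. This is a hedged illustration, not the source of randomDAG: the helper name sketch_random_dag is hypothetical, and it returns a plain weight matrix instead of a "graphNEL" object.

```r
## Illustrative sketch (hypothetical helper, not part of pcalg).
## Returns a weight matrix W where W[i, j] != 0 encodes the edge i -> j.
sketch_random_dag <- function(n, prob, lB = 0.1, uB = 1) {
  stopifnot(n >= 2, prob >= 0, prob <= 1, lB <= uB)
  V <- as.character(seq_len(n))
  W <- matrix(0, n, n, dimnames = list(V, V))
  for (i in seq_len(n - 1)) {
    higher <- (i + 1):n                       # nodes with higher order
    k <- length(higher)
    nNbr <- rbinom(1, size = k, prob = prob)  # number of neighbours ~ Bin(k, prob)
    ## draw the neighbours without replacement; sample.int() avoids the
    ## sample(x, ...) pitfall when 'higher' has length 1
    nbrs <- higher[sample.int(k, nNbr)]
    W[i, nbrs] <- runif(nNbr, min = lB, max = uB)  # one weight per new edge
  }
  W
}

set.seed(101)
W <- sketch_random_dag(n = 6, prob = 0.4)
which(W != 0, arr.ind = TRUE)   # every edge points from a lower to a higher node
```

Because edges only go from lower to higher node numbers, the matrix is strictly upper triangular, which matches the statement that the result is topologically ordered.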
An object of class "graphNEL", see graph-class from package graph, with n named ("1" to "n") nodes and directed edges. The graph is topologically ordered. Each edge has a weight between lB and uB.
Markus Kalisch and Martin Maechler
randDAG for a more elaborate version of this function; rmvDAG for generating data according to a DAG; compareGraphs for comparing the skeleton of a DAG with some other undirected graph (in terms of TPR, FPR and TDR).
set.seed(101)
myDAG <- randomDAG(n = 20, prob= 0.2, lB = 0.1, uB = 1)
if (require(Rgraphviz)) plot(myDAG)
Estimate an RFCI-PAG from observational data, using the RFCI-algorithm.
rfci(suffStat, indepTest, alpha, labels, p, skel.method = c("stable", "original", "stable.fast"), fixedGaps = NULL, fixedEdges = NULL, NAdelete = TRUE, m.max = Inf, rules = rep(TRUE, 10), conservative = FALSE, maj.rule = FALSE, numCores = 1, verbose = FALSE)
suffStat |
Sufficient statistics: List containing all necessary
elements for the conditional independence decisions in the
function |
indepTest |
Predefined function for testing conditional independence. The
function is internally called as |
alpha |
significance level (number in |
labels |
(optional) character vector of variable (or
“node”) names. Typically preferred to specifying |
p |
(optional) number of variables (or nodes). May be specified
if |
skel.method |
Character string specifying method; the default,
|
fixedGaps |
A logical matrix of dimension p*p. If entry
|
fixedEdges |
A logical matrix of dimension p*p. If entry
|
NAdelete |
If indepTest returns |
m.max |
Maximum size of the conditioning sets that are considered in the conditional independence tests. |
rules |
Logical vector of length 10 indicating which rules should be used when directing edges. The order of the rules is taken from Zhang (2009). |
conservative |
Logical indicating if the unshielded triples should be checked for ambiguity after the skeleton has been found, similar to the conservative PC algorithm. |
maj.rule |
Logical indicating if the unshielded triples should be checked for ambiguity after the skeleton has been found using a majority rule idea, which is less strict than the conservative. |
numCores |
Specifies the number of cores to be used for parallel
estimation of |
verbose |
If true, more detailed output is provided. |
This function is rather similar to fci. However, it does not compute any Possible-D-SEP sets and thus does not make tests conditioning on subsets of Possible-D-SEP. This makes RFCI much faster than FCI. The orientation rules for v-structures and rule 4 were modified in order to produce an RFCI-PAG which, in the oracle version, is guaranteed to have the correct ancestral relationships.
The first part of the RFCI algorithm is analogous to the PC and FCI algorithms. It starts with a complete undirected graph and estimates an initial skeleton using the function skeleton, which produces an initial order-independent skeleton; see skeleton for more details. All edges of this skeleton are of the form o-o. Due to the presence of hidden variables, it is no longer sufficient to consider only subsets of the neighborhoods of nodes x and y to decide whether the edge x o-o y should be removed. The FCI algorithm performs independence tests conditioning on subsets of Possible-D-SEP to remove those edges. Since this procedure is computationally infeasible, the RFCI algorithm uses a different approach to remove some of those superfluous edges before orienting the v-structures and the discriminating paths in orientation rule 4.
Before orienting the v-structures, we perform the following additional conditional independence tests. For each unshielded triple a-b-c in the initial skeleton, we check whether both a and b and b and c are conditionally dependent given the separating set of a and c (sepset(a,c)). These conditional dependencies may not have been checked while estimating the initial skeleton, since sepset(a,c) does not need to be a subset of the neighbors of a nor of the neighbors of c. If both conditional dependencies hold and b is not in sepset(a,c), the triple is oriented as a v-structure a->b<-c. On the other hand, if an additional conditional independence relationship is detected, say a is independent of b given sepset(a,c), the edge between a and b is removed from the graph and the set responsible for that is saved in sepset(a,b). The removal of an edge can destroy or create new unshielded triples in the graph. To solve this problem we work with lists (for details see Colombo et al., 2012).
Before orienting discriminating paths, we perform the following additional conditional independence tests. For each triple a <-* b o-* c with a -> c, the algorithm searches for a discriminating path p = <d, . . . , a,b,c> for b of minimal length, and checks that the vertices in every consecutive pair (f1,f2) on p are conditionally dependent given all subsets of sepset(d,c) \ {f1,f2}. If we do not find any conditional independence relationship, the path is oriented as in rule (R4). If one or more conditional independence relationships are found, the corresponding edges are removed and their minimal separating sets are stored.
Conservative RFCI can be computed if the argument conservative is TRUE. After the final skeleton is computed and the additional local tests on all unshielded triples, as described above, have been done, all potential v-structures a-b-c are checked in the following way. We test whether a and c are independent conditioning on any subset of the neighbors of a or any subset of the neighbors of c. When a subset makes a and c conditionally independent, we call it a separating set. If b is in no such separating set or in all such separating sets, no further action is taken and the normal version of the RFCI algorithm is continued. If, however, b is in only some separating sets, the triple a-b-c is marked 'ambiguous'. If a is independent of c given some S in the skeleton (i.e., the edge a-c dropped out), but a and c remain dependent given all subsets of neighbors of either a or c, we will call all triples a-b-c 'unambiguous'. This is because in the RFCI algorithm, the true separating set might be outside the neighborhood of either a or c. An ambiguous triple is not oriented as a v-structure. Furthermore, no further orientation rule that needs to know whether a-b-c is a v-structure or not is applied. Instead of using the conservative version, which is quite strict towards the v-structures, Colombo and Maathuis (2014) introduced a less strict version for the v-structures called majority rule. This adaptation can be called using maj.rule = TRUE. In this case, the triple a-b-c is marked as 'ambiguous' if and only if b is in exactly 50 percent of such separating sets or no separating set was found. If b is in less than 50 percent of the separating sets it is set as a v-structure, and if in more than 50 percent it is set as a non v-structure (for more details see Colombo and Maathuis, 2014).
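The decision logic of the conservative and majority rules described above can be summarised in a few lines of base R. This is a sketch for illustration only, not code from rfci: the helper classify_triple and its labels are hypothetical, and its input is assumed to be a logical vector with one entry per separating set found for (a, c), TRUE where b belongs to that set.

```r
## Illustrative sketch (hypothetical helper, not part of pcalg).
## b_in_sepset: logical vector, one entry per separating set of (a, c) found
## among the neighbourhood subsets; TRUE if b is a member of that set.
classify_triple <- function(b_in_sepset, rule = c("conservative", "majority")) {
  rule <- match.arg(rule)
  n <- length(b_in_sepset)
  k <- sum(b_in_sepset)
  if (n == 0L) {
    ## no separating set found inside the neighbourhoods of a or c
    return(if (rule == "majority") "ambiguous" else "unambiguous")
  }
  if (rule == "conservative") {
    ## b in none or in all separating sets: proceed as in plain RFCI
    if (k == 0L || k == n) "unambiguous" else "ambiguous"
  } else {
    ## majority rule: compare the share of sets containing b with 50 percent
    if (2 * k < n) "v-structure"
    else if (2 * k > n) "non-v-structure"
    else "ambiguous"
  }
}

classify_triple(c(TRUE, FALSE, FALSE), "conservative")
classify_triple(c(TRUE, FALSE, FALSE), "majority")
```

Comparing 2 * k with n on integer counts, rather than a floating-point fraction with 0.5, keeps the "exactly 50 percent" case exact.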
The implementation uses the stabilized skeleton skeleton, which produces an initial order-independent skeleton. The final skeleton and edge orientations can still be order-dependent, see Colombo and Maathuis (2014).
An object of class fciAlgo (see fciAlgo) containing the estimated graph (in the form of an adjacency matrix with various possible edge marks), the conditioning sets that lead to edge removals (sepset) and several other parameters.
Diego Colombo and Markus Kalisch.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15, 3741–3782.
D. Colombo, M.H. Maathuis, M. Kalisch, T.S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Annals of Statistics 40, 294–321.
fci and fciPlus for estimating a PAG using the FCI algorithm; skeleton for estimating an initial skeleton using the RFCI algorithm; pc for estimating a CPDAG using the PC algorithm; gaussCItest, disCItest, binCItest and dsepTest as examples for indepTest.
##################################################
## Example without latent variables
##################################################
set.seed(42)
p <- 7
## generate and draw random DAG :
myDAG <- randomDAG(p, prob = 0.4)

## find skeleton and PAG using the RFCI algorithm
suffStat <- list(C = cov2cor(trueCov(myDAG)), n = 10^9)
indepTest <- gaussCItest
res <- rfci(suffStat, indepTest, alpha = 0.9999, p=p, verbose=TRUE)

##################################################
## Example with hidden variables
## Zhang (2008), Fig. 6, p.1882
##################################################

## create the DAG :
V <- LETTERS[1:5]
edL <- setNames(vector("list", length = 5), V)
edL[[1]] <- list(edges=c(2,4),weights=c(1,1))
edL[[2]] <- list(edges=3,weights=c(1))
edL[[3]] <- list(edges=5,weights=c(1))
edL[[4]] <- list(edges=5,weights=c(1))
## and leave edL[[ 5 ]] empty
g <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed")
if (require(Rgraphviz))
  plot(g)

## define the latent variable
L <- 1

## compute the true covariance matrix of g
cov.mat <- trueCov(g)
## delete rows and columns belonging to latent variable L
true.cov <- cov.mat[-L,-L]
## transform covariance matrix into a correlation matrix
true.corr <- cov2cor(true.cov)

## find PAG with RFCI algorithm
## as dependence "oracle", we use the true correlation matrix in
## gaussCItest() with a large "virtual sample size" and a large alpha :
rfci.pag <- rfci(suffStat = list(C = true.corr, n = 10^9),
                 indepTest = gaussCItest, alpha = 0.9999, labels = V[-L],
                 verbose=TRUE)

## define PAG given in Zhang (2008), Fig. 6, p.1882
corr.pag <- rbind(c(0,1,1,0),
                  c(1,0,0,2),
                  c(1,0,0,2),
                  c(0,3,3,0))
## check that estimated and correct PAG are in agreement:
stopifnot(corr.pag == rfci.pag@amat)
Generate multivariate data with dependency structure specified by a (given) DAG (Directed Acyclic Graph) with nodes corresponding to random variables. The DAG has to be topologically ordered.
rmvDAG(n, dag, errDist = c("normal", "cauchy", "t4", "mix", "mixt3", "mixN100"), mix = 0.1, errMat = NULL, back.compatible = FALSE, use.node.names = !back.compatible)
n |
number of samples that should be drawn. (integer) |
dag |
a graph object describing the DAG; must contain weights for
all the edges. The nodes must be topologically sorted. (For
topological sorting use |
errDist |
string specifying the distribution of each node.
Currently, the options "normal", "t4", "cauchy", "mix", "mixt3" and
"mixN100" are supported. The first
three generate standard normal-, t(df=4)- and cauchy-random
numbers. The options containing the word "mix" create standard
normal random variables with a mix of outliers. The outliers for the
options "mix", "mixt3", "mixN100" are drawn from a standard cauchy,
t(df=3) and N(0,100) distribution, respectively. The fraction of
outliers is determined by the |
mix |
for the |
errMat |
numeric |
back.compatible |
logical indicating if the data generated should
be the same as with pcalg version 1.0-6 and earlier (where
|
use.node.names |
logical indicating if the column names of the
result matrix should equal |
Each node is visited in the topological order. For each node $i$ we generate a vector-valued $X_i$ in the following way: Let $X_1, \ldots, X_k$ denote the values of all neighbours of $i$ with lower order. Let $w_1, \ldots, w_k$ be the weights of the corresponding edges. Furthermore, generate a random vector $E_i$ according to the specified error distribution. Then, the value of $X_i$ is computed as
$$X_i = w_1 X_1 + \cdots + w_k X_k + E_i.$$
If node $i$ has no neighbours with lower order, $X_i = E_i$ is set.
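The recursion above can be sketched in base R. This is an illustration under stated assumptions rather than the rmvDAG implementation: the helper sketch_rmv_dag is hypothetical, the DAG is passed as a weight matrix W with W[j, i] != 0 meaning an edge j -> i, and only standard normal errors (or a user-supplied error matrix) are covered.

```r
## Illustrative sketch (hypothetical helper, not part of pcalg).
## W[j, i] != 0 is the weight of the edge j -> i; nodes are assumed to be
## topologically ordered, so W must be strictly upper triangular.
sketch_rmv_dag <- function(n, W, errMat = NULL) {
  p <- ncol(W)
  stopifnot(nrow(W) == p, all(W[lower.tri(W, diag = TRUE)] == 0))
  if (is.null(errMat))
    errMat <- matrix(rnorm(n * p), n, p)   # standard normal errors E_i
  stopifnot(all(dim(errMat) == c(n, p)))
  X <- errMat                              # nodes without parents: X_i = E_i
  for (i in seq_len(p)[-1]) {              # visit nodes in topological order
    lower <- seq_len(i - 1)
    ## X_i = w_1 X_1 + ... + w_k X_k + E_i; zero weights drop non-neighbours
    X[, i] <- X[, lower, drop = FALSE] %*% W[lower, i] + errMat[, i]
  }
  X
}

## chain 1 -> 2 -> 3 with weights 0.5 and 2
W <- matrix(0, 3, 3)
W[1, 2] <- 0.5
W[2, 3] <- 2
set.seed(1)
X <- sketch_rmv_dag(n = 1000, W)
dim(X)   # one row per sample, one column per node
```

Passing errMat explicitly makes the result deterministic, which is convenient for checking the recursion by hand on a small graph.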
A matrix with the generated data. The columns correspond to the nodes (i.e., random variables) and each of the rows corresponds to a sample.
Markus Kalisch and Martin Maechler.
randomDAG for generating a random DAG; skeleton and pc for estimating the skeleton and the CPDAG of a DAG that corresponds to the data.
## generate random DAG
p <- 20
rDAG <- randomDAG(p, prob = 0.2, lB=0.1, uB=1)

if (require(Rgraphviz)) {
  ## plot the DAG
  plot(rDAG, main = "randomDAG(20, prob = 0.2, ..)")
}

## generate 1000 samples of DAG using standard normal error distribution
n <- 1000
d.normMat <- rmvDAG(n, rDAG, errDist="normal")

## generate 1000 samples of DAG using standard t(df=4) error distribution
d.t4Mat <- rmvDAG(n, rDAG, errDist="t4")

## generate 1000 samples of DAG using standard normal with a cauchy
## mixture of 30 percent
d.mixMat <- rmvDAG(n, rDAG, errDist="mix",mix=0.3)

require(MASS) ## for mvrnorm()
Sigma <- toeplitz(ARMAacf(0.2, lag.max = p - 1))
dim(Sigma)# p x p
## *Correlated* normal error matrix "e_i" (against model assumption)
eMat <- mvrnorm(n, mu = rep(0, p), Sigma = Sigma)
d.CnormMat <- rmvDAG(n, rDAG, errMat = eMat)
Produces one or more samples from the observational or an interventional distribution associated to a Gaussian causal model.
rmvnorm.ivent(n, object, target = integer(0), target.value = numeric(0))
n | Number of samples required. |
object | An instance of |
target | Intervention target: vector of intervened nodes. If the vector is empty, samples from the observational distribution are generated. Otherwise, samples from an interventional distribution are simulated. |
target.value | Values of the intervened variables. If |
If n = 1, a vector of length p is returned, where p denotes the number of nodes of object. Otherwise an n by p matrix is returned with one sample per row.
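To make the notion of an interventional sample concrete, the following base-R sketch draws from a linear Gaussian model by visiting the nodes in topological order and clamping intervened nodes to fixed values. The helper sample_ivent(), the weight matrix B and the unit error variances are assumptions made for this illustration; it is not the implementation behind rmvnorm.ivent().

```r
## Hypothetical helper (not part of pcalg): sample from a linear Gaussian SEM
## X_j = sum_i B[i, j] * X_i + eps_j, with nodes in topological order 1..p.
sample_ivent <- function(n, B, target = integer(0), target.value = numeric(0)) {
  p <- ncol(B)
  X <- matrix(0, n, p)
  for (j in seq_len(p)) {
    k <- match(j, target)
    if (!is.na(k)) {
      ## intervened node: clamp to the intervention value
      X[, j] <- target.value[k]
    } else {
      ## non-intervened node: linear function of its parents plus N(0, 1) noise
      X[, j] <- X %*% B[, j] + rnorm(n)
    }
  }
  ## mimic the documented return shape: vector for n = 1, matrix otherwise
  if (n == 1) drop(X) else X
}

set.seed(1)
B <- matrix(0, 3, 3)
B[1, 2] <- 0.8    # edge 1 -> 2
B[2, 3] <- -0.5   # edge 2 -> 3
obs <- sample_ivent(1000, B)                                # observational
ivt <- sample_ivent(1000, B, target = 2, target.value = 1)  # intervention on node 2
```

Under the intervention, column 2 of ivt is constant, so node 3 no longer inherits any variation from node 1.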
Alain Hauser ([email protected])
set.seed(307)
myDAG <- r.gauss.pardag(5, 0.5)
var(rmvnorm.ivent(n = 1000, myDAG))
myDAG$cov.mat()
var(rmvnorm.ivent(n = 1000, myDAG, target = 1, target.value = 1))
myDAG$cov.mat(target = 1, ivent.var = 0)
This virtual base class represents a score for causal inference; it is used in the causal inference functions ges, gies and simy.
Score-based structure learning algorithms for causal inference such as Greedy Equivalence Search (GES, implemented in the function ges), Greedy Interventional Equivalence Search (GIES, implemented in the function gies) and the dynamic programming approach of Silander and Myllymäki (2006) (implemented in the function simy) try to find the DAG model which maximizes a scoring criterion for a given data set. A widely-used scoring criterion is the Bayesian Information Criterion (BIC).
The virtual class Score is the base class for providing a scoring criterion to the mentioned causal inference algorithms. It does not implement a concrete scoring criterion, but it defines the functions that must be provided by its descendants (cf. methods).
Knowledge of this class is only required if you aim to implement your own scoring criterion. At the moment, it is recommended to use the predefined scoring criteria for multivariate Gaussian data derived from Score, GaussL0penIntScore and GaussL0penObsScore.
The fields of Score are mainly of interest for users who aim at deriving their own class from this virtual base class, i.e., implementing their own score function.
.nodes: Node labels. They are passed to causal inference methods by default to label the nodes of the resulting graph.
decomp: Indicates whether the represented score is decomposable (cf. details). At the moment, only decomposable scores are supported by the implementation of the causal inference algorithms; support for non-decomposable scores is planned.
pp.dat: List representing the preprocessed input data; this is typically a statistic which is sufficient for the calculation of the score.
.pardag.class: Name of the class of the parametric DAG model corresponding to the score. This must name a class derived from ParDAG.
c.fcn: Only used internally; must remain empty for (user specified) classes derived from Score.
new("Score", data = matrix(1, 1, 1), targets = list(integer(0)), target.index = rep(as.integer(1), nrow(data)), nodes = colnames(data), ...)
data: Data matrix with n rows and p columns. Each row corresponds to one realization, either interventional or observational.
targets: List of mutually exclusive intervention targets that have been used for data generation.
target.index: Vector of length n; the i-th entry specifies the index of the intervention target in targets under which the i-th row of data was measured.
nodes: Node labels.
...: Additional parameters used by derived (and non-virtual) classes.
Note that since Score is a virtual class, its methods cannot be called directly, but only on derived classes.
local.score(vertex, parents, ...): For decomposable scores, this function calculates the local score of a vertex and its parents. Must throw an error in derived classes that do not represent a decomposable score.
global.score.int(edges, ...): Calculates the global score of a DAG, represented as a list of in-edges: for each vertex in the DAG, this list contains a vector of parents.
global.score(dag, ...): Calculates the global score of a DAG, represented as an object of a class derived from ParDAG.
local.fit(vertex, parents, ...): Calculates a local model fit of a vertex and its parents, e.g. by MLE. The result is a vector of parameters whose meaning depends on the model class; it matches the convention used in the corresponding causal model (cf. .pardag.class).
global.fit(dag, ...): Calculates the global MLE of a DAG, represented by an object of the class specified by .pardag.class. The result is a list of vectors, one per vertex, each in the same format as the result vector of local.fit.
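The relation between a local and a global score for a decomposable criterion can be sketched in base R as follows. The functions local_bic() and global_bic() are hypothetical names chosen for this illustration, and the BIC-type penalty is one possible choice; this is not the code of GaussL0penObsScore or GaussL0penIntScore.

```r
## Hypothetical sketch (not pcalg code): a decomposable BIC-type score for
## Gaussian data, computed by least-squares regression of a vertex on its parents.
local_bic <- function(data, vertex, parents) {
  n <- nrow(data)
  y <- data[, vertex]
  X <- cbind(rep(1, n), data[, parents, drop = FALSE])  # intercept + parents
  res <- lm.fit(X, y)$residuals
  sigma2 <- sum(res^2) / n                # MLE of the error variance
  ## maximised log-likelihood (up to a constant) minus a BIC penalty
  -n / 2 * log(sigma2) - log(n) / 2 * (length(parents) + 1)
}

## A DAG is given as a list of in-edges: edges[[v]] holds the parents of v.
## Decomposability means the global score is the sum of the local scores.
global_bic <- function(data, edges) {
  sum(vapply(seq_along(edges),
             function(v) local_bic(data, v, edges[[v]]),
             numeric(1)))
}

set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n)
dat <- cbind(x1, x2)
score_edge  <- global_bic(dat, list(integer(0), 1L))          # DAG 1 -> 2
score_empty <- global_bic(dat, list(integer(0), integer(0)))  # empty DAG
```

Because x2 depends strongly on x1, the DAG containing the edge should receive the higher score in this example.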
Alain Hauser ([email protected])
T. Silander and P. Myllymäki (2006). A simple approach for finding the globally optimal Bayesian network structure. Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI 2006), 445–452
ges, gies, simy, GaussL0penIntScore, GaussL0penObsScore
Searches for all ancestors, descendants, anteriors, spouses, neighbors, parents, children or possible descendants of a (set of) node(s) in a DAG, CPDAG, MAG or PAG.
searchAM(amat,x, type = c("an", "de", "ant", "sp", "nb", "pa", "ch", "pde"))
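amat | Adjacency matrix of type amat.pag. |
x | Target node(s), given as (a vector of) column number(s) of the node(s) in the adjacency matrix. |
type | Character string specifying which relation to the target nodes in x is searched for: "an" (ancestors), "de" (descendants), "ant" (anteriors), "sp" (spouses), "nb" (neighbors), "pa" (parents), "ch" (children) or "pde" (possible descendants). For the precise definitions of these concepts, see the references. |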
This function performs a search for nodes related to the set of target nodes x in the way specified by type in adjacency matrix amat of type amat.pag.
Vector of column numbers of the nodes related to x as specified by type.
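For the relations that only involve directly adjacent nodes (parents, children, spouses, neighbors), the result can be read off the edge marks in the adjacency matrix. The base-R sketch below does this with hypothetical helpers (related(), parents_of() and so on are names invented for this illustration); it assumes the coding amat[i,j] = 0 (no edge), 1 (i *-o j), 2 (i *-> j), 3 (i *-- j) and is not the implementation of searchAM().

```r
## Hypothetical helpers (not pcalg code). amat[i, v] is the edge mark at v
## on the edge between i and v: 0 = no edge, 1 = circle, 2 = arrowhead, 3 = tail.
related <- function(amat, x, mark_at_x, mark_at_other) {
  hits <- integer(0)
  for (v in x) {
    hits <- c(hits, which(amat[, v] == mark_at_x & amat[v, ] == mark_at_other))
  }
  sort(unique(unname(hits)))
}
parents_of    <- function(amat, x) related(amat, x, 2, 3)  # i --> x
children_of   <- function(amat, x) related(amat, x, 3, 2)  # x --> i
spouses_of    <- function(amat, x) related(amat, x, 2, 2)  # i <-> x
neighbours_of <- function(amat, x) related(amat, x, 3, 3)  # i --- x

## Small graph: X1 --- X2 --> X3 <-> X4
amat <- matrix(0, 4, 4)
amat[1, 2] <- 3; amat[2, 1] <- 3   # X1 --- X2
amat[2, 3] <- 2; amat[3, 2] <- 3   # X2 --> X3
amat[3, 4] <- 2; amat[4, 3] <- 2   # X3 <-> X4

parents_of(amat, 3)      # column number of X2
children_of(amat, 2)     # column number of X3
spouses_of(amat, 3)      # column number of X4
neighbours_of(amat, 2)   # column number of X1
```

Ancestors, descendants, anteriors and possible descendants additionally require following paths through the graph, which this sketch does not attempt.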
Joris Mooij.
T.S. Richardson and P. Spirtes (2002). Ancestral graph Markov models. Annals of Statistics 30 962-1030.
J. Zhang (2008). Causal Reasoning with Ancestral Graphs. Journal of Machine Learning Research 9 1437-1474.
# Y-structure MAG
# Encode as adjacency matrix
p <- 10 # total number of variables
V <- c("X1","X2","X3","X4","X5","X6","X7","X8","X9","X10") # variable labels
# amat[i,j] = 0 iff no edge btw i,j
# amat[i,j] = 1 iff i *-o j
# amat[i,j] = 2 iff i *-> j
# amat[i,j] = 3 iff i *-- j
amat <- rbind(c(0,3,0,0,0,0,0,0,0,0),
              c(3,0,3,0,0,0,0,0,0,0),
              c(0,3,0,2,0,0,0,0,0,0),
              c(0,0,3,0,2,0,0,0,0,0),
              c(0,0,0,3,0,2,0,2,2,1),
              c(0,0,0,0,3,0,2,0,0,0),
              c(0,0,0,0,0,3,0,0,0,0),
              c(0,0,0,0,2,0,0,0,0,0),
              c(0,0,0,0,1,0,0,0,0,0),
              c(0,0,0,0,1,0,0,0,0,0))
rownames(amat) <- V
colnames(amat) <- V
stopifnot(all.equal(searchAM(amat, 5, type = "an"), c(3,4,5)))          # ancestors of X5
stopifnot(all.equal(searchAM(amat, 5, type = "de"), c(5,6,7)))          # descendants of X5
stopifnot(all.equal(searchAM(amat, 5, type = "ant"), c(1,2,3,4,5)))     # anteriors of X5
stopifnot(all.equal(searchAM(amat, 5, type = "sp"), c(8)))              # spouses of X5
stopifnot(all.equal(searchAM(amat, 2, type = "nb"), c(1,3)))            # neighbors of X2
stopifnot(all.equal(searchAM(amat, c(4,6), type = "pa"), c(3,5)))       # parents of {X4,X6}
stopifnot(all.equal(searchAM(amat, c(3,5), type = "ch"), c(4,6)))       # children of {X3,X5}
stopifnot(all.equal(searchAM(amat, 5, type = "pde"), c(5,6,7,9,10)))    # possible descendants of X5
Compute the Structural Hamming Distance (SHD) between two graphs. In simple terms, this is the number of edge insertions, deletions or flips needed to transform one graph into the other.
shd(g1,g2)
g1 | Graph object |
g2 | Graph object |
The value of the SHD (numeric).
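The counting rule can be illustrated on plain adjacency matrices. The base-R sketch below uses a hypothetical function shd_sketch() and assumes the convention m[i,j] = 1 for an edge i -> j (with both entries set for an undirected edge); it is meant as an illustration of the definition, not as the code of shd().

```r
## Hypothetical sketch (not the pcalg implementation): SHD between two graphs
## given as 0/1 adjacency matrices with m[i, j] = 1 meaning an edge i -> j
## (an undirected edge i - j has m[i, j] = m[j, i] = 1).
shd_sketch <- function(m1, m2) {
  skel1 <- (m1 + t(m1)) > 0          # skeleton of graph 1
  skel2 <- (m2 + t(m2)) > 0          # skeleton of graph 2
  up <- upper.tri(m1)                # look at each node pair once
  ## edges present in exactly one skeleton: one insertion or deletion each
  n_add_del <- sum(skel1[up] != skel2[up])
  ## edges present in both skeletons but with a different orientation
  differs <- (m1 != m2) | t(m1 != m2)
  n_flip <- sum((skel1 & skel2 & differs)[up])
  n_add_del + n_flip
}

g1 <- matrix(0, 3, 3); g1[1, 2] <- 1; g1[2, 3] <- 1   # 1 -> 2 -> 3
g2 <- matrix(0, 3, 3); g2[2, 1] <- 1; g2[1, 3] <- 1   # 2 -> 1 -> 3
shd_sketch(g1, g2)   # one deletion (2 - 3), one insertion (1 - 3), one flip (1 - 2)
```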
Markus Kalisch [email protected] and Martin Maechler
I. Tsamardinos, L.E. Brown and C.F. Aliferis (2006). The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm. Machine Learning 65, 31–78.
## generate two graphs
g1 <- randomDAG(10, prob = 0.2)
g2 <- randomDAG(10, prob = 0.2)
## compute SHD
(shd.val <- shd(g1, g2))
This function is deprecated. Use as(*, "amat") instead!
Show the adjacency matrix of a "pcAlgo" object; this is intended to be an alternative if the Rgraphviz package does not work.
showAmat(object)
object | an R object of class |
The adjacency matrix.
For "fciAlgo" objects, the show method produces a similar result.
Markus Kalisch ([email protected])
showEdgeList for showing the edge list of a pcAlgo object.
iplotPC for plotting a "pcAlgo" object using the package igraph, and also for an example of showAmat().
Show the list of edges (of the graph) of a pcAlgo object; this is intended to be an alternative if Rgraphviz does not work.
showEdgeList(object, labels = NULL)
object | an R object of class |
labels | optional labels for nodes; by default, the labels from the |
none; the purpose is in (the side effect of) printing the edge list.
This does not yet work quite correctly for "fciAlgo" objects.
Markus Kalisch ([email protected])
showAmat for the adjacency matrix of a pcAlgo object.
iplotPC for plotting a pcAlgo object using the package igraph, also for an example of showEdgeList().
Estimate the interventional essential graph representing the Markov equivalence class of a DAG using the dynamic programming (DP) approach of Silander and Myllymäki (2006). This algorithm maximizes a decomposable scoring criterion in exponential runtime.
simy(score, labels = score$getNodes(), targets = score$getTargets(), verbose = FALSE, ...)
score | An instance of a class derived from |
labels | Node labels; by default, they are determined from the scoring object. |
targets | A list of intervention targets (cf. details). A list of vectors, each vector listing the vertices of one intervention target. |
verbose | if |
... | Additional arguments for debugging purposes and fine tuning. |
This function estimates the interventional Markov equivalence class of a DAG based on a data sample with interventional data originating from various interventions and possibly observational data. The intervention targets used for data generation must be specified by the argument targets as a list of (integer) vectors listing the intervened vertices; observational data is specified by an empty set, i.e. a vector of the form integer(0). As an example, if data contains observational samples as well as samples originating from an intervention at vertices 1 and 4, the intervention targets must be specified as list(integer(0), as.integer(1), as.integer(c(1, 4))).
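How a list of targets pairs up with a per-row index can be sketched in base R. The variable row.targets and the way the index is derived are assumptions made for this illustration; the resulting targets and target.index objects have the shape described for the Score constructor.

```r
## Illustration only: 'row.targets' is a made-up description of a data set in
## which each row is labelled with the set of vertices intervened on.
row.targets <- list(integer(0), integer(0), 1L, c(1L, 4L), 1L, integer(0))

## Unique intervention targets, in order of first appearance;
## integer(0) stands for observational data
targets <- unique(row.targets)

## For every row, the position of its target within 'targets'
target.index <- vapply(
  row.targets,
  function(t) which(vapply(targets, identical, logical(1), t))[1],
  integer(1)
)

targets
target.index
```

Here targets has three entries (observational, intervention at vertex 1, joint intervention at vertices 1 and 4), and target.index assigns each of the six rows to one of them.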
An interventional Markov equivalence class of DAGs can be uniquely represented by a partially directed graph called interventional essential graph. Its edges have the following interpretation:
a directed edge a -> b stands for an arrow that has the same orientation in all representatives of the interventional Markov equivalence class;
an undirected edge a – b stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.
Note that when plotting the object, undirected and bidirected edges are equivalent.
The DP approach of Silander and Myllymäki (2006) is a score-based algorithm that is guaranteed to find the optimum of any decomposable scoring criterion. Its CPU and memory consumption grow exponentially with the number of variables in the system, irrespective of the sparseness of the true or estimated DAG. The implementation in the pcalg package is feasible up to approximately 20 variables, depending on the user's computer.
simy returns a list with the following two components:
essgraph | An object of class |
repr | An object of a class derived from |
Alain Hauser ([email protected])
T. Silander and P. Myllymäki (2006). A simple approach for finding the globally optimal Bayesian network structure. Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI 2006), 445–452
##################################################
## Using Gaussian Data
##################################################
## Load predefined data
data(gmInt)

## Define the score (BIC)
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index)

## Estimate the essential graph
simy.fit <- simy(score)
eDAG <- simy.fit$essgraph
as(eDAG, "graph")

## Look at the graph incidence matrix (a "sparseMatrix"):
if (require(Matrix))
  show( as(as(eDAG, "graphNEL"), "Matrix") )

## Plot the estimated essential graph and the true DAG
if (require(Rgraphviz)) {
  par(mfrow = c(1,2))
  plot(eDAG, main = "Estimated ess. graph")
  plot(gmInt$g, main = "True DAG")
}
Estimate the skeleton of a DAG without latent and selection variables using the PC Algorithm or estimate an initial skeleton of a DAG with arbitrarily many latent and selection variables using the FCI and the RFCI algorithms.
If used in the PC algorithm, it estimates the order-independent “PC-stable” ("stable") or original PC ("original") “skeleton” of a directed acyclic graph (DAG) from observational data.
When used in the FCI and RFCI algorithms, this function estimates only an initial order-independent (or PC original) “skeleton”. Because of the presence of latent and selection variables, to find the final skeleton those algorithms need to perform additional tests later on and consequently some edges can be further deleted.
skeleton(suffStat, indepTest, alpha, labels, p, method = c("stable", "original", "stable.fast"), m.max = Inf, fixedGaps = NULL, fixedEdges = NULL, NAdelete = TRUE, numCores = 1, verbose = FALSE)
suffStat | Sufficient statistics: List containing all necessary elements for the conditional independence decisions in the function |
indepTest | Predefined |
alpha | significance level (number in |
labels | (optional) character vector of variable (or “node”) names. Typically preferred to specifying |
p | (optional) number of variables (or nodes). May be specified if |
method | Character string specifying method; the default, |
m.max | Maximal size of the conditioning sets that are considered in the conditional independence tests. |
fixedGaps | logical symmetric matrix of dimension p*p. If entry |
fixedEdges | a logical symmetric matrix of dimension p*p. If entry |
NAdelete | logical needed for the case |
numCores | number of processor cores to use for parallel computation. Only available for |
verbose | if |
Under the assumption that the distribution of the observed variables is faithful to a DAG and that there are no latent and selection variables, this function estimates the skeleton of the DAG. The skeleton of a DAG is the undirected graph resulting from removing all arrowheads from the DAG. Edges in the skeleton of a DAG have the following interpretation:
There is an edge i – j between nodes i and j if and only if variables i and j are conditionally dependent given S for all possible subsets S of the remaining nodes.
On the other hand, if the distribution of the observed variables is faithful to a DAG with arbitrarily many latent and selection variables, skeleton() estimates the initial skeleton of the DAG. Edges in this initial skeleton of a DAG have the following interpretation:
There is an edge i – j if and only if variables i and j are conditionally dependent given S for all possible subsets S of the neighbours of i and the neighbours of j.
The data are not required to follow a specific distribution, but one should make sure that the conditional independence test used in indepTest is appropriate for the data. Pre-programmed versions of indepTest are available for Gaussian data (gaussCItest), discrete data (disCItest), and binary data (see binCItest). Users may also specify their own indepTest function.
The PC algorithm (Spirtes, Glymour and Scheines, 2000) (method = "original") is known to be order-dependent, in the sense that the output may depend on the order in which the variables are given. Therefore, Colombo and Maathuis (2014) proposed a simple modification, called “PC-stable”, which yields order-independent adjacencies in the skeleton, provided by pc() with the new default method = "stable". This stable variant of the algorithm is also available with method = "stable.fast": it runs the algorithm of Colombo and Maathuis (2014) faster than method = "stable" in general, but should be regarded as an experimental option at the moment.
The algorithm starts with a complete undirected graph. In each step, it visits all pairs (i, j) of adjacent nodes in the current graph, and determines based on conditional independence tests whether the edge i – j should be removed. In particular, for each step m (m = 0, 1, ...) of the size of the conditioning sets, the algorithm at first determines the neighbours a(i) of each node i in the graph. Then, the algorithm visits all pairs (i, j) of adjacent nodes in the current graph, and the edge i – j is kept if and only if the null hypothesis
i and j are conditionally independent given S
is rejected at significance level alpha for all subsets S of size m of a(i) and of a(j) (as judged by the function indepTest). For the "stable" method, the neighbourhoods a(i) are kept fixed within each value of m, and this makes the algorithm order-independent. With method "original", the original PC algorithm, the neighbour list would be updated after each edge change.
The algorithm stops when m is larger than the largest neighbourhood size of all nodes, or when m has reached the limit m.max which may be set by the user.
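The search described above can be sketched in a few lines of base R. The functions pcor_pvalue() and skeleton_sketch() are hypothetical names for this illustration; the sketch assumes Gaussian data summarised by a correlation matrix C and sample size n, uses Fisher's z-transform of the partial correlation as the independence test, and keeps the neighbourhoods fixed within each step m as in the "stable" method. It omits fixedGaps, fixedEdges, separating sets and all other features of skeleton().

```r
## Hypothetical sketch of the "stable" skeleton search (not pcalg code).
## C: correlation matrix, n: sample size, alpha: significance level.
pcor_pvalue <- function(i, j, S, C, n) {
  ## partial correlation of i and j given S via the inverse of a submatrix
  P <- solve(C[c(i, j, S), c(i, j, S)])
  r <- -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
  r <- max(min(r, 0.9999999), -0.9999999)
  z <- sqrt(n - length(S) - 3) * 0.5 * log((1 + r) / (1 - r))  # Fisher's z
  2 * pnorm(-abs(z))
}

skeleton_sketch <- function(C, n, alpha, m.max = Inf) {
  p <- ncol(C)
  G <- matrix(TRUE, p, p); diag(G) <- FALSE      # complete undirected graph
  m <- 0
  repeat {
    ## neighbourhoods a(i), kept fixed within step m
    nbrs <- lapply(seq_len(p), function(i) which(G[i, ]))
    if (m > m.max || all(lengths(nbrs) - 1 < m)) break
    for (i in seq_len(p)) for (j in nbrs[[i]]) {
      if (!G[i, j]) next                         # already removed in this step
      cand <- setdiff(nbrs[[i]], j)
      if (length(cand) < m) next
      ## all subsets of size m of a(i) without j
      sets <- if (m == 0) list(integer(0)) else
        combn(length(cand), m, function(k) cand[k], simplify = FALSE)
      for (S in sets) {
        if (pcor_pvalue(i, j, S, C, n) >= alpha) {  # independence not rejected
          G[i, j] <- G[j, i] <- FALSE
          break
        }
      }
    }
    m <- m + 1
  }
  G
}

## Chain 1 -> 2 -> 3 with unit weights and unit error variances
Sigma <- matrix(c(1, 1, 1,  1, 2, 2,  1, 2, 3), 3, 3)
G <- skeleton_sketch(cov2cor(Sigma), n = 1000, alpha = 0.01)
G
```

Since both ordered pairs (i, j) and (j, i) are visited, subsets of a(i) as well as of a(j) are used as conditioning sets. In the chain example the edge between nodes 1 and 3 is removed at m = 1 because the two nodes are independent given node 2.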
Since the FCI (Spirtes, Glymour and Scheines, 2000) and RFCI (Colombo et al., 2012) algorithms are built up from the PC algorithm, they are also order-dependent in the skeleton. Resolving their order-dependence issues in the skeleton is more involved, see Colombo and Maathuis (2014). However, with method = "stable", this function now estimates an initial order-independent skeleton in these algorithms (for additional details on how to make the final skeleton of FCI fully order-independent see fci and Colombo and Maathuis (2014)).
The information in fixedGaps and fixedEdges is used as follows. The gaps given in fixedGaps are introduced in the very beginning of the algorithm by removing the corresponding edges from the complete undirected graph. Pairs in fixedEdges are skipped in all steps of the algorithm, so that these edges remain in the graph.
Note: Throughout, the algorithm works with the column positions of the variables in the adjacency matrix, and not with the names of the variables.
An object of class "pcAlgo" (see pcAlgo) containing an estimate of the skeleton of the underlying DAG, the conditioning sets (sepset) that led to edge removals and several other parameters.
Markus Kalisch ([email protected]), Martin Maechler, Alain Hauser, and Diego Colombo.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
D. Colombo, M. H. Maathuis, M. Kalisch, T. S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Statist. 40, 294-321.
M. Kalisch and P. Buehlmann (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm, JMLR 8 613-636.
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition, MIT Press.
pc for generating a partially directed graph using the PC algorithm; fci for generating a partial ancestral graph using the FCI algorithm; rfci for generating a partial ancestral graph using the RFCI algorithm; udag2pdag for converting the skeleton to a CPDAG.
Further, gaussCItest, disCItest, binCItest and dsepTest as examples for indepTest.
##################################################
## Using Gaussian Data
##################################################
## Load predefined data
data(gmG)
n <- nrow(gmG8$x)
V <- colnames(gmG8$x) # labels aka node names
## estimate Skeleton
skel.fit <- skeleton(suffStat = list(C = cor(gmG8$x), n = n),
                     indepTest = gaussCItest, ## (partial correlations)
                     alpha = 0.01, labels = V, verbose = TRUE)
if (require(Rgraphviz)) {
  ## show estimated Skeleton
  par(mfrow = c(1,2))
  plot(skel.fit, main = "Estimated Skeleton")
  plot(gmG8$g, main = "True DAG")
}

##################################################
## Using d-separation oracle
##################################################
## define sufficient statistics (d-separation oracle)
Ora.stat <- list(g = gmG8$g, jp = RBGL::johnson.all.pairs.sp(gmG8$g))
## estimate Skeleton
fit.Ora <- skeleton(suffStat = Ora.stat, indepTest = dsepTest, labels = V,
                    alpha = 0.01) # <- irrelevant as dsepTest returns either 0 or 1
if (require(Rgraphviz)) {
  ## show estimated Skeleton
  plot(fit.Ora, main = "Estimated Skeleton (d-sep oracle)")
  plot(gmG8$g, main = "True DAG")
}

##################################################
## Using discrete data
##################################################
## Load data
data(gmD)
V <- colnames(gmD$x) # labels aka node names
## define sufficient statistics
suffStat <- list(dm = gmD$x, nlev = c(3,2,3,4,2), adaptDF = FALSE)
## estimate Skeleton
skel.fit <- skeleton(suffStat,
                     indepTest = disCItest, ## (G^2 statistics independence test)
                     alpha = 0.01, labels = V, verbose = TRUE)
if (require(Rgraphviz)) {
  ## show estimated Skeleton
  par(mfrow = c(1,2))
  plot(skel.fit, main = "Estimated Skeleton")
  plot(gmD$g, main = "True DAG")
}

##################################################
## Using binary data
##################################################
## Load binary data
data(gmB)
X <- gmB$x
## estimate Skeleton
skel.fm2 <- skeleton(suffStat = list(dm = X, adaptDF = FALSE),
                     indepTest = binCItest, alpha = 0.01,
                     labels = colnames(X), verbose = TRUE)
if (require(Rgraphviz)) {
  ## show estimated Skeleton
  par(mfrow = c(1,2))
  plot(skel.fm2, main = "Binary Data 'gmB': Estimated Skeleton")
  plot(gmB$g, main = "True DAG")
}
Compute the (true) covariance matrix of a generated DAG.
trueCov(dag, back.compatible = FALSE)
dag | Graph object containing the DAG. |
back.compatible | logical indicating if the data generated should be the same as with pcalg version 1.0-6 and earlier (where |
Covariance matrix.
This function cannot be used to estimate the covariance matrix from an estimated DAG or corresponding data.
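For a linear structural equation model, the covariance matrix implied by a weighted DAG has a closed form. The base-R sketch below computes it under the assumption of independent errors with unit variance; the function true_cov_sketch() and the convention that W[i,j] is the weight of edge i -> j are choices made for this illustration and not the code of trueCov().

```r
## Hypothetical sketch (not pcalg code). W[i, j] is the weight of edge i -> j;
## the model is X = t(W) %*% X + eps with independent N(0, 1) errors.
true_cov_sketch <- function(W) {
  p <- ncol(W)
  A <- solve(diag(p) - t(W))   # X = A %*% eps
  A %*% t(A)                   # Cov(X) = A Cov(eps) A^T with Cov(eps) = I
}

W <- matrix(0, 3, 3)
W[1, 2] <- 1   # X2 = X1 + eps2
W[2, 3] <- 1   # X3 = X2 + eps3
true_cov_sketch(W)
```

In this chain, X1 = eps1, X2 = eps1 + eps2 and X3 = eps1 + eps2 + eps3, so the variances are 1, 2 and 3 and each covariance equals the variance of the earlier node.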
Markus Kalisch
randomDAG for generating a random DAG
set.seed(123)
g <- randomDAG(n = 5, prob = 0.3) ## generate random DAG
if (require(Rgraphviz)) {
  plot(g)
}

## Compute true covariance matrix
trueCov(g)

## For comparison:
## Estimate true covariance matrix after generating data from g
d <- rmvDAG(10000, g)
cov(d)
This function performs the last step of the RFCI algorithm: It transforms a partially oriented graph in which the v-structures have been oriented into an RFCI Partial Ancestral Graph (PAG) (see Colombo et al. (2012)).
While orienting the edges, this function performs some additional conditional independence tests in orientation rule 4 to ensure correctness of the ancestral relationships. As a result of these additional tests, some additional edges can be deleted. The result is the final adjacency matrix indicating also the edge marks and the updated sepsets.
udag2apag(apag, suffStat, indepTest, alpha, sepset, rules = rep(TRUE, 10), unfVect = NULL, verbose = FALSE)
apag | Adjacency matrix of type amat.pag |
suffStat | Sufficient statistics: A |
indepTest | Pre-defined function for testing conditional independence. The function is internally called as |
alpha | Significance level for the individual conditional independence tests. |
sepset | List of length p; each element of the list contains another list of length p. The element |
rules | Logical vector of length 10 with |
unfVect | Vector containing numbers that encode the ambiguous triples (as returned by |
verbose | Logical indicating if detailed output is to be given. |
The partially oriented graph in which the v-structures have been oriented is transformed into an RFCI-PAG using adapted rules of Zhang (2008). This function is similar to udag2pag used to orient the skeleton into a PAG in the FCI algorithm. However, it is slightly more complicated because we perform additional conditional independence tests when applying rule 4, to ensure correctness of the ancestral relationships. As a result, some additional edges can be deleted, see Colombo et al. (2012). Because of these additional tests, we need to give suffStat, indepTest, and alpha as inputs. Since edges can be deleted, the input adjacency matrix apag and the input separating sets sepset can change in this algorithm.
If unfVect = NULL (no ambiguous triples), the orientation rules are applied to each eligible structure until no more edges can be oriented. On the other hand, if one uses conservative or majority rule FCI and ambiguous triples have been found in pc.cons.intern, unfVect contains the numbers of all ambiguous triples in the graph. In this case, the orientation rules take this information into account. For example, if a *-> b o-* c and <a,b,c> is an unambiguous unshielded triple and not a v-structure, then we obtain b -* c (otherwise we would create an additional v-structure). On the other hand, if a *-> b o-* c but <a,b,c> is an ambiguous unshielded triple, then the circle mark at b is not oriented.
Note that the algorithm works with the column positions of the variables in the adjacency matrix and not with the names of the variables.
Note that this function does not resolve possible order-dependence in the application of the orientation rules, see Colombo and Maathuis (2014).
apag | Final adjacency matrix of type amat.pag |
sepset | Updated list of separating sets |
Diego Colombo and Markus Kalisch ([email protected])
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
D. Colombo, M. H. Maathuis, M. Kalisch, T. S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Statist. 40, 294–321.
J. Zhang (2008). On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 172, 1873–1896.
rfci, udag2pag, dag2pag, udag2pdag, udag2pdagSpecial, udag2pdagRelaxed
##################################################
## Example with hidden variables
## Zhang (2008), Fig. 6, p.1882
##################################################

## create the DAG :
amat <- t(matrix(c(0,1,0,0,1, 0,0,1,0,0, 0,0,0,1,0, 0,0,0,0,0, 0,0,0,1,0),5,5))
V <- LETTERS[1:5]
colnames(amat) <- rownames(amat) <- V
edL <- setNames(vector("list", length = 5), V)
edL[[1]] <- list(edges = c(2,4), weights = c(1,1))
edL[[2]] <- list(edges = 3, weights = c(1))
edL[[3]] <- list(edges = 5, weights = c(1))
edL[[4]] <- list(edges = 5, weights = c(1))
## and leave edL[[ 5 ]] empty
g <- new("graphNEL", nodes = V, edgeL = edL, edgemode = "directed")
if (require(Rgraphviz))
  plot(g)

## define the latent variable
L <- 1

## compute the true covariance matrix of g
cov.mat <- trueCov(g)
## delete rows and columns belonging to latent variable L
true.cov <- cov.mat[-L,-L]
## transform covariance matrix into a correlation matrix
true.corr <- cov2cor(true.cov)

n <- 100000
alpha <- 0.01
p <- ncol(true.corr)

if (require("MASS")) {
  ## generate 100000 samples of DAG using standard normal error distribution
  set.seed(289)
  d.mat <- mvrnorm(n, mu = rep(0, p), Sigma = true.cov)

  ## estimate the skeleton of given data
  suffStat <- list(C = cor(d.mat), n = n)
  indepTest <- gaussCItest
  resD <- skeleton(suffStat, indepTest, alpha = alpha, labels = colnames(true.corr))

  ## estimate all ordered unshielded triples
  amat.resD <- as(resD@graph, "matrix")
  print(u.t <- find.unsh.triple(amat.resD)) # four of them

  ## check and orient v-structures
  vstrucs <- rfci.vStruc(suffStat, indepTest, alpha = alpha,
                         sepset = resD@sepset, g.amat = amat.resD,
                         unshTripl = u.t$unshTripl, unshVect = u.t$unshVect,
                         verbose = TRUE)

  ## Estimate the final skeleton and extend it into a PAG
  ## (using all 10 rules, as per default):
  resP <- udag2apag(vstrucs$amat, suffStat, indepTest = indepTest, alpha = alpha,
                    sepset = vstrucs$sepset, verbose = TRUE)
  print(Amat <- resP$graph)
} # only if "MASS" is there
##################################################% -------- ----------- ## Example with hidden variables ## Zhang (2008), Fig. 6, p.1882 ################################################## ## create the DAG : amat <- t(matrix(c(0,1,0,0,1, 0,0,1,0,0, 0,0,0,1,0, 0,0,0,0,0, 0,0,0,1,0),5,5)) V <- LETTERS[1:5] colnames(amat) <- rownames(amat) <- V edL <- setNames(vector("list",length=5), V) edL[[1]] <- list(edges= c(2,4),weights=c(1,1)) edL[[2]] <- list(edges= 3, weights=c(1)) edL[[3]] <- list(edges= 5, weights=c(1)) edL[[4]] <- list(edges= 5, weights=c(1)) ## and leave edL[[ 5 ]] empty g <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed") if (require(Rgraphviz)) plot(g) ## define the latent variable L <- 1 ## compute the true covariance matrix of g cov.mat <- trueCov(g) ## delete rows and columns belonging to latent variable L true.cov <- cov.mat[-L,-L] ## transform covariance matrix into a correlation matrix true.corr <- cov2cor(true.cov) n <- 100000 alpha <- 0.01 p <- ncol(true.corr) if (require("MASS")) { ## generate 100000 samples of DAG using standard normal error distribution set.seed(289) d.mat <- mvrnorm(n, mu = rep(0, p), Sigma = true.cov) ## estimate the skeleton of given data suffStat <- list(C = cor(d.mat), n = n) indepTest <- gaussCItest resD <- skeleton(suffStat, indepTest, alpha = alpha, labels=colnames(true.corr)) ## estimate all ordered unshielded triples amat.resD <- as(resD@graph, "matrix") print(u.t <- find.unsh.triple(amat.resD)) # four of them ## check and orient v-structures vstrucs <- rfci.vStruc(suffStat, indepTest, alpha=alpha, sepset = resD@sepset, g.amat = amat.resD, unshTripl= u.t$unshTripl, unshVect = u.t$unshVect, verbose = TRUE) ## Estimate the final skeleton and extend it into a PAG ## (using all 10 rules, as per default): resP <- udag2apag(vstrucs$amat, suffStat, indepTest=indepTest, alpha=alpha, sepset=vstrucs$sepset, verbose = TRUE) print(Amat <- resP$graph) } # only if "MASS" is there
This function performs the last steps of the FCI algorithm, as it transforms an unoriented final skeleton into a Partial Ancestral Graph (PAG). The final skeleton must have been estimated with pdsep() or fciplus.intern(). The result is an adjacency matrix that also indicates the edge marks.
udag2pag(pag, sepset, rules = rep(TRUE, 10), unfVect = NULL, jci = c("0","1","12","123"), contextVars = NULL, verbose = FALSE, orientCollider = TRUE)
pag |
Adjacency matrix of type amat.pag |
sepset |
List of length p; each element of the list
contains another list of length p. The element
|
rules |
Array of length 10 containing |
unfVect |
Vector containing numbers that encode ambiguous unshielded
triples (as returned by |
verbose |
If |
orientCollider |
if |
jci |
String specifying the JCI assumptions that are used. It can be one of:
|
contextVars |
Subset of variable indices {1,...,p} that will be treated as context variables in the JCI extension. |
The skeleton is transformed into an FCI-PAG using rules by Zhang (2008). When using the JCI extension, additional adjacency and orientation rules incorporate the JCI background knowledge regarding the causal relations of the context variables; for details, see Mooij et al. (2020).
If unfVect = NULL (i.e., one uses standard FCI, or one uses conservative/majority rule FCI but there are no ambiguous triples), then the orientation rules are applied to each eligible structure until no more edges can be oriented. On the other hand, if one uses conservative or majority rule FCI and ambiguous triples have been found in pc.cons.intern, unfVect contains the numbers of all ambiguous triples in the graph. In this case, the orientation rules take this information into account. For example, if a *-> b o-* c and <a,b,c> is an unambiguous unshielded triple and not a v-structure, then we obtain b -* c (otherwise we would create an additional v-structure). On the other hand, if a *-> b o-* c but <a,b,c> is an ambiguous unshielded triple, then the circle mark at b is not oriented.
Note that the algorithm works with the column positions of the adjacency matrix and not with the names of the variables.
Note that this function does not resolve possible order-dependence in the application of the orientation rules, see Colombo and Maathuis (2014).
Adjacency matrix of type amat.pag.
Diego Colombo and Markus Kalisch ([email protected]); JCI extension by Joris Mooij.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15, 3741-3782.
D. Colombo, M. H. Maathuis, M. Kalisch, T. S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Statist. 40, 294–321.
J. M. Mooij, S. Magliacane, T. Claassen (2020). Joint Causal Inference from Multiple Contexts. Journal of Machine Learning Research 21(99), 1–108.
J. Zhang (2008). On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 172, 1873–1896.
fci, fciPlus, udag2apag, dag2pag; further, udag2pdag (incl. udag2pdagSpecial and udag2pdagRelaxed).
##################################################
## Example with hidden variables
## Zhang (2008), Fig. 6, p.1882
##################################################

## draw a DAG with latent variables
## this example is taken from Zhang (2008), Fig. 6, p.1882 (see references)
amat <- t(matrix(c(0,1,0,0,1, 0,0,1,0,0, 0,0,0,1,0, 0,0,0,0,0, 0,0,0,1,0),5,5))
V <- as.character(1:5)
colnames(amat) <- rownames(amat) <- V
edL <- vector("list",length=5)
names(edL) <- V
edL[[1]] <- list(edges= c(2,4),weights=c(1,1))
edL[[2]] <- list(edges= 3, weights=c(1))
edL[[3]] <- list(edges= 5, weights=c(1))
edL[[4]] <- list(edges= 5, weights=c(1))
g <- new("graphNEL", nodes=V, edgeL=edL,edgemode="directed")
if(require("Rgraphviz"))
  plot(g) else print(g)

## define the latent variable
L <- 1

## compute the true covariance matrix of g
cov.mat <- trueCov(g)
## delete rows and columns which belong to L
true.cov <- cov.mat[-L,-L]
## transform it in a correlation matrix
true.corr <- cov2cor(true.cov)

if (require("MASS")) {
  ## generate 100000 samples of DAG using standard normal error distribution
  n <- 100000
  alpha <- 0.01
  set.seed(314)
  d.mat <- mvrnorm(n, mu = rep(0,dim(true.corr)[1]), Sigma = true.cov)

  ## estimate the skeleton of given data
  suffStat <- list(C = cor(d.mat), n = n)
  indepTest <- gaussCItest
  resD <- skeleton(suffStat, indepTest, p=dim(true.corr)[2], alpha = alpha)

  ## estimate v-structures conservatively
  tmp <- pc.cons.intern(resD, suffStat, indepTest, alpha, version.unf = c(1, 1))
  ## tripleList <- tmp$unfTripl
  resD <- tmp$sk

  ## estimate the final skeleton of given data using Possible-D-Sep
  pdsepRes <- pdsep(resD@graph, suffStat, indepTest, p=dim(true.corr)[2],
                    resD@sepset, alpha = alpha, m.max = Inf,
                    pMax = resD@pMax)

  ## extend the skeleton into a PAG using all 10 rules
  resP <- udag2pag(pag = pdsepRes$G, pdsepRes$sepset, rules = rep(TRUE,10),
                   verbose = TRUE)
  colnames(resP) <- rownames(resP) <- as.character(2:5)
  print(resP)
} # only if "MASS" is there
These functions perform the last step in the PC algorithm: they transform an object of the class "pcAlgo" containing a skeleton and corresponding conditional independence information into a completed partially directed acyclic graph (CPDAG). The functions first determine the v-structures, and then apply the three orientation rules as described in Spirtes et al. (2000) and Meek (1995) to orient as many of the remaining edges as possible.
In the oracle version and when all assumptions hold, all three functions yield the same CPDAG. In the sample version, however, the resulting CPDAG may be invalid in the sense that one cannot extend it to a DAG without additional unshielded colliders by orienting the undirected edges. This can happen, for example, due to errors in the conditional independence tests or violations of the faithfulness assumption. The three functions deal with such conflicts in different ways, as described in Details.
udag2pdag       (gInput, verbose)
udag2pdagRelaxed(gInput, verbose, unfVect=NULL, solve.confl=FALSE,
                 orientCollider = TRUE, rules = rep(TRUE, 3))
udag2pdagSpecial(gInput, verbose, n.max=100)
gInput |
|
verbose |
0: No output; 1: Details |
unfVect |
vector containing numbers that encode ambiguous
triples (as returned by |
solve.confl |
if |
n.max |
maximum number of tries for re-orienting doubly visited
edges in |
orientCollider |
if |
rules |
Array of length 3 containing |
udag2pdag: If there are edges that are part of more than one v-structure (e.g., the edge b - c in the v-structures a -> b <- c and b -> c <- d), earlier edge orientations are simply overwritten by later ones. Thus, if a -> b <- c is considered first, the edge b - c is first oriented as b <- c and later overwritten by b -> c. The v-structures are considered in lexicographical ordering.
If the resulting graph is extendable to a DAG without additional v-structures, then the rules of Meek (1995) and Spirtes et al. (2000) are applied to obtain the corresponding CPDAG. Otherwise, the edges are oriented randomly to obtain a DAG that fits on the skeleton, discarding all information about the v-structures. The resulting DAG is then transformed into its CPDAG. Note that the output of udag2pdag is random whenever the initial graph was not extendable.
Although the output of udag2pdag is always extendable, it is not necessarily a valid CPDAG in the sense that it describes a Markov equivalence class of DAGs. For example, two v-structures a -> b <- c and b -> c <- d (considered in this order) would yield the output a -> b -> c <- d. This is extendable to a DAG (it already is a DAG), but it does not describe a Markov equivalence class of DAGs, since the DAG a <- b -> c <- d describes the same conditional independencies.
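The overwriting behaviour described above can be illustrated with a small base-R sketch. It does not call pcalg: the helper `orient_vstruct()` and the 0/1 matrix convention used here (`G[i, j] == 1` together with `G[j, i] == 0` meaning i -> j) are assumptions made for this illustration only and are not the package's internal coding.

```r
## Toy skeleton a - b - c - d.
## Assumed convention: G[i, j] == 1 and G[j, i] == 1 means "undirected",
## G[i, j] == 1 and G[j, i] == 0 means i -> j.
nodes <- c("a", "b", "c", "d")
G <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
G["a", "b"] <- G["b", "a"] <- 1
G["b", "c"] <- G["c", "b"] <- 1
G["c", "d"] <- G["d", "c"] <- 1

## Hypothetical helper: orient x -> z <- y, overwriting earlier orientations.
orient_vstruct <- function(G, x, z, y) {
  G[x, z] <- 1; G[z, x] <- 0
  G[y, z] <- 1; G[z, y] <- 0
  G
}

G <- orient_vstruct(G, "a", "b", "c")  # a -> b <- c
G <- orient_vstruct(G, "b", "c", "d")  # b -> c <- d overwrites b <- c

## List the directed edges that remain.
edges <- which(G == 1 & t(G) == 0, arr.ind = TRUE)
res <- paste(nodes[edges[, "row"]], "->", nodes[edges[, "col"]])
print(res)
```

In this sketch the edge between b and c ends up as b -> c, so the result corresponds to a -> b -> c <- d, and the first v-structure is lost.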
udag2pdagSpecial: If the graph after orienting the v-structures as in udag2pdag is extendable to a DAG without additional v-structures, then the rules of Meek (1995) and Spirtes et al. (2000) are applied to obtain the corresponding CPDAG. Otherwise, the algorithm tries at most n.max different random orderings of the v-structures (hence overwriting orientations in different orders), until it finds one that yields an extendable CPDAG. If this fails, the edges are oriented randomly to obtain a DAG that fits on the skeleton, discarding all information about the v-structures. The resulting DAG is then transformed into its CPDAG. Note that the output of udag2pdagSpecial is random whenever the initial graph was not extendable.
Although the output of udag2pdagSpecial is always extendable, it is not necessarily a valid CPDAG in the sense that it describes a Markov equivalence class of DAGs. For example, two v-structures a -> b <- c and b -> c <- d (considered in this order) would yield the output a -> b -> c <- d. This is extendable to a DAG (it already is a DAG), but it does not describe a Markov equivalence class of DAGs, since the DAG a <- b -> c <- d describes the same conditional independencies.
udag2pdagRelaxed: This is the default version in the PC/RFCI/FCI algorithm. It does not test whether the output is extendable to a DAG without additional v-structures.
If unfVect = NULL (no ambiguous triples), the three orientation rules are applied to each eligible structure until no more edges can be oriented. Otherwise, unfVect contains the numbers of all ambiguous triples in the graph as determined by pc.cons.intern. Then the orientation rules take this information into account. For example, if a -> b - c and <a,b,c> is an unambiguous triple and a non-v-structure, then rule 1 implies b -> c. On the other hand, if a -> b - c but <a,b,c> is an ambiguous triple, then the edge b - c is not oriented.
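As a rough illustration of how ambiguous triples block rule 1, the following base-R sketch applies one pass of the rule to a three-node graph. Everything here is an assumption for illustration: the helper `apply_rule1()`, the 0/1 matrix convention (`G[i, j] == 1` with `G[j, i] == 0` meaning i -> j), and the representation of ambiguous triples as character vectors, which is not the numeric encoding used by unfVect.

```r
nodes <- c("a", "b", "c")
G <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
G["a", "b"] <- 1                    # a -> b
G["b", "c"] <- G["c", "b"] <- 1     # b - c (undirected)

## Hypothetical helper: one pass of rule 1, skipping ambiguous triples.
## 'ambiguous' is a list of character vectors c(x, y, z).
apply_rule1 <- function(G, ambiguous = list()) {
  is_amb <- function(x, y, z)
    any(vapply(ambiguous, function(tr)
      identical(tr, c(x, y, z)) || identical(tr, c(z, y, x)), logical(1)))
  v <- rownames(G)
  for (x in v) for (y in v) for (z in v) {
    if (length(unique(c(x, y, z))) < 3) next
    directed_xy   <- G[x, y] == 1 && G[y, x] == 0
    undirected_yz <- G[y, z] == 1 && G[z, y] == 1
    nonadjacent   <- G[x, z] == 0 && G[z, x] == 0
    if (directed_xy && undirected_yz && nonadjacent && !is_amb(x, y, z))
      G[z, y] <- 0                  # orient y -> z
  }
  G
}

G_unamb <- apply_rule1(G)                                       # b -> c
G_amb   <- apply_rule1(G, ambiguous = list(c("a", "b", "c")))   # b - c kept
```

With no ambiguous triples the edge b - c becomes b -> c; when <a,b,c> is marked as ambiguous, the edge stays undirected.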
If solve.confl = FALSE, earlier edge orientations are overwritten by later ones as in udag2pdag and udag2pdagSpecial.
If solve.confl = TRUE, both the v-structures and the orientation rules work with lists for the candidate edges and allow bi-directed edges if there are conflicting orientations. For example, two v-structures a -> b <- c and b -> c <- d then yield a -> b <-> c <- d. This option can be used to get an order-independent version of the PC algorithm (see Colombo and Maathuis (2014)).
We denote bi-directed edges, for example between two variables i and j, in the adjacency matrix M of the graph as M[i,j]=2 and M[j,i]=2. Such edges should be interpreted as indications of conflicts in the algorithm, for example due to errors in the conditional independence tests or violations of the faithfulness assumption.
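A short base-R sketch of how such conflict marks could be located in an adjacency matrix follows. The matrix `M` is hand-written for illustration and contains only the bi-directed entry described above; in practice one would extract the matrix from the fitted object, which is not shown here.

```r
## Toy matrix with a single conflict: M[b, c] == 2 and M[c, b] == 2.
nodes <- c("a", "b", "c", "d")
M <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
M["b", "c"] <- M["c", "b"] <- 2

## Report each bi-directed pair once (upper triangle only).
conflicts <- which(M == 2 & t(M) == 2 & upper.tri(M), arr.ind = TRUE)
conflict_edges <- paste(nodes[conflicts[, "row"]], "<->",
                        nodes[conflicts[, "col"]])
print(conflict_edges)
```

Any edge reported this way points to conflicting orientation information rather than to a causal claim.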
udag2pdag() and udag2pdagRelaxed(): an oriented "pcAlgo" object.
udag2pdagSpecial: a list with components
An oriented "pcAlgo" object.
Matrix counting the number of orientation attempts per edge
Logical indicating whether the original graph with v-structures is extendable.
Logical indicating whether the final graph with v-structures is extendable
Adjacency matrix of original graph with v-structures (type amat.cpdag) .
Adjacency matrix of final graph with v-structures after changing the ordering in which the v-structures are considered (type amat.cpdag) .
Integer code with values
Original try is extendable;
Reorienting double edge visits helps;
Original try is not extendable; reorienting double visits does not help; result is acyclic, has original v-structures, but perhaps additional v-structures.
Number of orderings of the v-structures until success or n.max.
Markus Kalisch ([email protected])
C. Meek (1995). Causal inference and causal explanation with background knowledge. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), pp. 403-411. Morgan Kaufmann Publishers, Inc.
P. Spirtes, C. Glymour and R. Scheines (2000) Causation, Prediction, and Search, 2nd edition, The MIT Press.
J. Pearl (2000), Causality, Cambridge University Press.
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15, 3741-3782.
pc, pdag2dag, dag2cpdag, udag2pag, udag2apag, dag2pag.
## simulate data
set.seed(123)
p <- 10
myDAG <- randomDAG(p, prob = 0.2)
trueCPDAG <- dag2cpdag(myDAG)
n <- 1000
d.mat <- rmvDAG(n, myDAG, errDist = "normal")

## estimate skeleton
resU <- skeleton(suffStat = list(C = cor(d.mat), n = n),
                 indepTest = gaussCItest, ## (partial correlations)
                 alpha = 0.05, p=p)

## orient edges using three different methods
resD1 <- udag2pdagRelaxed(resU, verbose=0)
resD2 <- udag2pdagSpecial(resU, verbose=0, n.max=100)
resD3 <- udag2pdag       (resU, verbose=0)
Check if the directed edge from x to z in a MAG or in a PAG is visible or not.
visibleEdge(amat, x, z)
amat |
Adjacency matrix of type amat.pag |
x , z
|
(integer) position of variable |
All directed edges in DAGs and CPDAGs are said to be visible. Given a MAG M / PAG P, a directed edge A -> B in M / P is visible if there is a vertex C not adjacent to B, such that there is an edge between C and A that is into A, or there is a collider path between C and A that is into A and every non-endpoint vertex on the path is a parent of B. Otherwise A -> B is said to be invisible. (see Maathuis and Colombo (2015), Def. 3.1)
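The first of the two conditions in this definition can be sketched in base R. The helper `direct_witness()` below is hypothetical and not part of pcalg; it checks only whether some vertex C that is not adjacent to B has an edge into A, and it ignores the collider-path condition, so a FALSE result does not by itself mean that the edge is invisible. The sketch relies on the amat.pag coding as documented in the package's amatType help page (`amat[i, j]` is the mark at j on the edge between i and j: 0 no edge, 1 circle, 2 arrowhead, 3 tail).

```r
## PAG with edges a -> b, b <-> c, b -> d, c -> d (amat.pag coding).
amat <- matrix(c(0,3,0,0, 2,0,2,3, 0,2,0,3, 0,2,2,0), 4, 4)
colnames(amat) <- rownames(amat) <- letters[1:4]

## Hypothetical helper, NOT part of pcalg: first visibility condition only.
direct_witness <- function(amat, a, b) {
  stopifnot(amat[a, b] == 2, amat[b, a] == 3)   # a -> b must be directed
  others <- setdiff(seq_len(nrow(amat)), c(a, b))
  ## C is not adjacent to B, and the edge between C and A is into A
  any(amat[others, b] == 0 & amat[others, a] == 2)
}

w_bd <- direct_witness(amat, 2, 4)  # a -> b is into b, a not adjacent to d
w_ab <- direct_witness(amat, 1, 2)  # every other vertex is adjacent to b
w_cd <- direct_witness(amat, 3, 4)  # no direct witness for c -> d
```

For c -> d the sketch finds no direct witness; under the full definition the edge is still visible through the collider path a -> b <-> c, whose non-endpoint vertex b is a parent of d.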
TRUE if the edge is visible, otherwise FALSE.
Diego Colombo
M.H. Maathuis and D. Colombo (2015). A generalized backdoor criterion. Annals of Statistics 43, 1060-1088.
amat <- matrix(c(0,3,0,0, 2,0,2,3, 0,2,0,3, 0,2,2,0), 4,4)
colnames(amat) <- rownames(amat) <- letters[1:4]
if(require(Rgraphviz)) {
  plotAG(amat)
}

visibleEdge(amat, 3, 4) ## visible
visibleEdge(amat, 2, 4) ## visible
visibleEdge(amat, 1, 2) ## invisible
Given a graph object g, as generated e.g. by randomDAG, return the matrix of its edge weights, the “weight matrix”.
wgtMatrix(g, transpose = TRUE)
g |
|
transpose |
logical indicating if the weight matrix should be
transposed ( |
When generating a DAG (e.g. using randomDAG), a graph object is usually generated and edge weights are usually specified. This function extracts the edge weights and arranges them in a matrix.
If transpose is TRUE (default), M[i,j] is the weight of the edge from j to i. If transpose is FALSE, M[i,j] is the weight of the edge from i to j.
Nowadays, this is a trivial wrapper around as(g, "matrix") using the (coerce) method provided by the graph package.
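The two index conventions can be compared on a hand-written weight matrix. This base-R sketch does not call wgtMatrix(); the matrix `W` and its weights are made up for illustration, and the names `M_default` and `M_plain` are hypothetical.

```r
## Toy DAG with edges 1 -> 2 (weight 0.5) and 1 -> 3 (weight 0.8).
## W[i, j] holds the weight of the edge from i to j.
W <- matrix(0, 3, 3)
W[1, 2] <- 0.5
W[1, 3] <- 0.8

M_default <- t(W)  # convention of transpose = TRUE:  M[i, j] = weight of j -> i
M_plain   <- W     # convention of transpose = FALSE: M[i, j] = weight of i -> j

M_default[2, 1]    # weight of the edge from 1 to 2
M_plain[1, 2]      # the same weight under the other convention
```

Under the default convention, row i of the matrix collects the weights of the edges pointing into node i.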
The weight matrix M.
This function cannot be used to estimate the edge weights in an estimated DAG / CPDAG.
Markus Kalisch
randomDAG for generating a random DAG; rmvDAG for simulating data from a generated DAG.
set.seed(123)
g <- randomDAG(n = 5, prob = 0.3) ## generate random DAG
if(require(Rgraphviz)) {
  plot(g)
}

## edge weights as matrix
wgtMatrix(g)

## for comparison: edge weights in graph object
g@edgeData@data