Title: | Bayesian Network Structure Learning from Data with Missing Values |
---|---|
Description: | Bayesian Network Structure Learning from Data with Missing Values. The package implements the Silander-Myllymaki complete search, the Max-Min Parents-and-Children, the Hill-Climbing, the Max-Min Hill-climbing heuristic searches, and the Structural Expectation-Maximization algorithm. Available scoring functions are BDeu, AIC, BIC. The package also implements methods for generating and using bootstrap samples, imputed data, inference. |
Authors: | Francesco Sambo [aut], Alberto Franzin [aut, cre] |
Maintainer: | Alberto Franzin <[email protected]> |
License: | GPL (>= 2) | file LICENSE |
Version: | 1.0.15 |
Built: | 2025-02-03 06:45:27 UTC |
Source: | CRAN |
InferenceEngine.
Add a list of observations to an InferenceEngine that already has observations, using a list composed of the following two vectors:
observed.vars: vector of observed variables;
observed.vals: vector of values observed for the variables in observed.vars, in the corresponding position.
add.observations(x) <- value
## S4 replacement method for signature 'InferenceEngine'
add.observations(x) <- value
x | an InferenceEngine. |
value | the list of observations of the InferenceEngine. |
In case of multiple observations of the same variable, the last observation is the one used, as the most recent.
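The "last observation wins" rule can be illustrated with a minimal base R sketch. The helper `merge_observations` is hypothetical and is not part of bnstruct; it only mimics the documented behaviour on plain lists.

```r
# Hypothetical helper (not a bnstruct function): merge two observation lists,
# keeping only the most recent value observed for each variable.
merge_observations <- function(old, new) {
  vars <- c(old$observed.vars, new$observed.vars)
  vals <- c(old$observed.vals, new$observed.vals)
  # TRUE only for the last occurrence of each variable name
  keep <- !duplicated(vars, fromLast = TRUE)
  list(observed.vars = vars[keep], observed.vals = vals[keep])
}

old <- list(observed.vars = c("A", "G"), observed.vals = c(1, 2))
new <- list(observed.vars = c("G", "X"), observed.vals = c(1, 1))
merged <- merge_observations(old, new)
print(merged)
```

Here variable G is observed twice, and only the later value is retained.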
Asia dataset.
Wrapper for a loader for the Asia dataset, with only raw data.
asia()
The dataset has 10000 items, no missing data, so no imputation needs to be performed.
a BNDataset containing the Asia dataset.
dataset <- asia()
print(dataset)
Asia dataset.
The Asia dataset contains 10000 complete (no missing data, no latent variables) randomly generated items of the Asia Bayesian Network.
No imputation needs to be performed, so only raw data is present.
a BNDataset with the raw data slot filled.
The data the BNDataset object is built from are located in the files pkg_folder/extdata/asia_10000.header and pkg_folder/extdata/asia_10000.data.
S. Lauritzen, D. Spiegelhalter. Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 50(2):157-224, 1988.
Asia dataset.
Wrapper for a loader for a 2-layers dataset derived from the Asia dataset, with only raw data.
asia_2_layers()
The dataset has 100 items, no missing data, so no imputation needs to be performed.
a BNDataset containing the 2-layers dataset derived from the Asia dataset.
dataset <- asia_2_layers()
print(dataset)
Perform belief propagation for the network of an InferenceEngine, given a set of observations.
In the current version of bnstruct, belief propagation can be computed only over a junction tree.
belief.propagation(ie, observations = NULL, return.potentials = FALSE)
## S4 method for signature 'InferenceEngine'
belief.propagation(ie, observations = NULL, return.potentials = FALSE)
ie | an InferenceEngine object. |
observations | list of observations, consisting of two vectors, observed.vars and observed.vals. |
return.potentials | if TRUE only the potentials are returned, instead of the default |
updated InferenceEngine object.
## Not run:
dataset <- BNDataset("file.header", "file.data")
bn <- BN(dataset)
ie <- InferenceEngine(bn)
ie <- belief.propagation(ie)
# set the evidence, then propagate it
observations(ie) <- list("observed.vars" = c("A","G","X"), "observed.vals" = c(1,2,1))
belief.propagation(ie)
## End(Not run)
BN object contained in an InferenceEngine.
Return a network contained in an InferenceEngine.
bn(x)
## S4 method for signature 'InferenceEngine'
bn(x)
x | an InferenceEngine. |
the BN object contained in an InferenceEngine.
Instantiate a BN object.
## S4 method for signature 'BN'
initialize(.Object, dataset = NULL, ...)
BN(dataset = NULL, ...)
.Object | a BN object. |
dataset | a BNDataset object. |
... | potential further arguments of methods. |
The constructor may be invoked without parameters; in this case an empty network is created, and its slots have to be filled manually by the user. This is usually viable only if the user already knows the network structure.
BN object.
name: name of the network
num.nodes: number of nodes in the network
variables: names of the variables in the network
discreteness: TRUE if variable is discrete, FALSE if variable is continuous
node.sizes: if variable i is discrete, node.sizes[i] contains the cardinality of i; if i is instead continuous, the value is the number of states variable i takes when discretized
cpts: list of conditional probability tables of the network
dag: adjacency matrix of the network
wpdag: weighted partially directed acyclic graph
scoring.func: scoring function used in structure learning (when performed)
struct.algo: algorithm used in structure learning (when performed)
num.time.steps: number of instants in which the network is observed (1, unless it is a Dynamic Bayesian Network)
## Not run:
net.1 <- BN()
dataset <- BNDataset()
dataset <- read.dataset(dataset, "file.header", "file.data")
net.2 <- BN(dataset)
## End(Not run)
BN object contained in an InferenceEngine.
Add an original network to an InferenceEngine.
bn(x) <- value
## S4 replacement method for signature 'InferenceEngine'
bn(x) <- value
x | an InferenceEngine. |
value | the BN object. |
Contains all of the data that can be extracted from a given dataset: raw data, imputed data, raw and imputed data with bootstrap.
BNDataset(data, discreteness, variables = NULL, node.sizes = NULL, ...)
## S4 method for signature 'BNDataset'
initialize(.Object)
.Object | an empty BNDataset. |
data | raw data.frame or path/name of the file containing the raw dataset (see 'Details'). |
discreteness | a vector of booleans indicating if the variables are discrete or continuous ( |
variables | vector of variable names. |
node.sizes | vector of variable cardinalities (for discrete variables) or quantization ranges (for continuous variables). |
... | further arguments for reading a dataset from files (see documentation for read.dataset). |
There are two ways to build a BNDataset: using two files containing the header information and the data, respectively, or manually providing the data table and the related header information (variable names, cardinality and discreteness).
The key pieces of information needed are: 1. the data; 2. the state of the variables (discrete or continuous); 3. the names of the variables; 4. the cardinalities of the variables (if discrete), or the number of levels they have to be quantized into (if continuous). Names and cardinalities/levels can be guessed by looking at the data, but it is strongly advised to provide _all_ of the information, in order to avoid problems later on during the execution.
Data can be provided in the form of a data.frame or matrix, and can contain NAs. By default, NAs are indicated with '?'; to specify a different character for NAs, provide the na.string.symbol parameter.
The values contained in the data have to be numeric (real for continuous variables, integer for discrete ones). The default range of values for a discrete variable X is [1,|X|], with |X| being the cardinality of X. The same applies to the levels of quantization for continuous variables.
If the value ranges for the data are different from the expected ones, it is possible to specify a different starting value (for the whole dataset) with the starts.from parameter. E.g. with starts.from=0 we assume that the values of the variables in the dataset have range [0,|X|-1]. Please keep in mind that the internal representation of bnstruct starts from 1, and the original starting values are then lost.
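The effect of starts.from can be pictured with a small base R sketch. The data below are made up, and the shift shown is only an assumption about what the re-basing amounts to; it is not bnstruct's internal code. Guessing cardinalities from the column maxima is likewise just for illustration.

```r
# Hypothetical data: two discrete variables coded from 0,
# with ranges [0,1] and [0,2] (matrix is filled column by column).
raw <- matrix(c(0, 1, 1, 0,
                2, 0, 1, 2), nrow = 4, ncol = 2)

starts.from <- 0

# Re-base the values so that every variable starts from 1.
shifted <- raw - starts.from + 1

# After the shift, the largest value of each column is a lower bound
# on the cardinality of the corresponding variable.
node.sizes <- apply(shifted, 2, max, na.rm = TRUE)
print(shifted)
print(node.sizes)
```

Once the values are shifted, the original coding can no longer be recovered from the data alone, which is why the starting value is described as lost.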
It is possible to use two files, one for the data and one for the metadata, instead of manually providing all of the information. bnstruct requires the data files to be in the format described below.
The actual data has to be in a text file in tabular format, one tuple per row, with the values for each variable separated by a space or a tab. Values for each variable have to be numbers, starting from 1 in case of discrete variables. Data files can have a first row containing the names of the corresponding variables.
In addition to the data file, a header file containing additional information can also be provided. A header file must consist of three rows of tab-delimited values:
1. the list of names of the variables, in the same order as in the data file;
2. a list of integers representing the cardinality of the variables, in case of discrete variables, or the number of levels each variable has to be quantized into, in case of continuous variables;
3. a list that indicates, for each variable, if the variable is continuous (c or C), and thus has to be quantized before learning, or discrete (d or D).
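As an illustration of the file layout just described, the following base R sketch writes a three-row, tab-delimited header file and a space-separated data file to a temporary directory, then parses the header back. The variable names, values and file names are invented for the example, and the parsing shown is not how bnstruct itself reads the files.

```r
dir <- tempdir()
header.path <- file.path(dir, "example.header")
data.path   <- file.path(dir, "example.data")

# Three header rows: names, cardinalities / quantization levels, c or d flags.
header.rows <- list(
  c("smoke", "age", "cough"),
  c(2, 4, 2),
  c("d", "c", "d")
)
writeLines(vapply(header.rows, paste, character(1), collapse = "\t"), header.path)

# Data: one tuple per row, values separated by a space;
# discrete variables start from 1, the continuous one is real-valued.
data <- matrix(c(1, 2, 1,
                 23.5, 41.0, 67.2,
                 2, 1, 2), nrow = 3, ncol = 3)
write.table(data, data.path, sep = " ", row.names = FALSE, col.names = FALSE)

# Read the header back and split each row on tabs.
header       <- strsplit(readLines(header.path), "\t", fixed = TRUE)
variables    <- header[[1]]
node.sizes   <- as.integer(header[[2]])
discreteness <- tolower(header[[3]]) == "d"
values       <- as.matrix(read.table(data.path))
```

Lower-casing the third row before comparing makes both d and D (and c and C) acceptable, in line with the description above.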
If more advanced options are needed when reading a dataset from files, please refer to the documentation of the read.dataset method. Imputation and bootstrap are also available as separate routines (impute and bootstrap, respectively).
In case of an evolving system to be modeled as a Dynamic Bayesian Network, it is possible to specify only the description of the variables of a single instant; the information will be replicated for all the num.time.steps instants that compose the dataset, where num.time.steps needs to be set as a parameter. In this case, it is assumed that the N variables v1, v2, ..., vN of a single instant appear in the dataset as v1_t1, v2_t1, ..., vN_t1, v1_t2, v2_t2, ..., in this exact order.
The user can however provide information for all the variables in all the instants; otherwise, the names of the variables will be edited to include the instant. In case of an evolving system, the num.variables slot refers anyway to the total number of variables observed in all the instants (the number of columns in the dataset), and not to a single instant.
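The column ordering assumed for a dynamic system can be sketched in base R as follows. The variable names and the number of time steps are hypothetical, and the snippet only reproduces the naming pattern described above, not bnstruct's own renaming code.

```r
# Variables of a single instant (hypothetical names) and number of instants.
base.vars <- c("v1", "v2", "v3")
num.time.steps <- 2

# All variables of instant 1 come first, then all variables of instant 2, ...
all.vars <- paste0(rep(base.vars, times = num.time.steps),
                   "_t",
                   rep(seq_len(num.time.steps), each = length(base.vars)))

# The total number of columns is N times the number of instants.
num.variables <- length(all.vars)
print(all.vars)
```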
BNDataset object.
a BNDataset object.
name: name of the dataset
header.file: name and location of the header file
data.file: name and location of the data file
variables: names of the variables in the network
node.sizes: cardinality of each variable of the network
num.variables: number of variables (columns) in the dataset
discreteness: TRUE if variable is discrete, FALSE if variable is continuous
quantiles: list of vectors containing the quantiles, one vector per variable. Each vector is NULL if the variable is discrete, and contains the quantiles if it is continuous
num.items: number of observations (rows) in the dataset
has.raw.data: TRUE if the dataset contains data read from a file
has.imputed.data: TRUE if the dataset contains imputed data (computed from raw data)
raw.data: matrix containing raw data
imputed.data: matrix containing imputed data
has.boots: dataset has bootstrap samples
boots: list of bootstrap samples
has.imputed.boots: dataset has imputed bootstrap samples
imp.boots: list of imputed bootstrap samples
num.boots: number of bootstrap samples
num.time.steps: number of instants in which the network is observed (1, unless it is a dynamic system)
read.dataset, impute, bootstrap
## Not run:
# create from files
dataset <- BNDataset("file.data", "file.header")
# other way: create from raw dataset and metadata
data <- matrix(c(1:16), nrow = 4, ncol = 4)
dataset <- BNDataset(data = data,
                     discreteness = rep('d',4),
                     variables = c("a", "b", "c", "d"),
                     node.sizes = c(4,8,12,16))
## End(Not run)
Given a BNDataset, return the sample corresponding to the given index.
boot(dataset, index, use.imputed.data = FALSE)
## S4 method for signature 'BNDataset,numeric'
boot(dataset, index, use.imputed.data = FALSE)
dataset | a BNDataset object. |
index | the index of the requested sample. |
use.imputed.data | |
bootstrap
## Not run:
dataset <- BNDataset("file.data", "file.header")
dataset <- bootstrap(dataset, num.boots = 1000)
for (i in 1:num.boots(dataset))
  print(boot(dataset, i))
## End(Not run)
BNDataset.
Return the list of samples computed from raw data of a dataset.
boots(x)
## S4 method for signature 'BNDataset'
boots(x)
x | a BNDataset object. |
the list of bootstrap samples.
has.boots, has.imputed.boots, imp.boots
BNDataset.
Add to a dataset a list of samples from raw data computed using bootstrap.
boots(x) <- value
## S4 replacement method for signature 'BNDataset'
boots(x) <- value
x | a BNDataset object. |
value | the list of bootstrap samples. |
Create a list of num.boots samples of the original dataset.
bootstrap(object, num.boots = 100, seed = 0, imputation = FALSE, k.impute = 10)
## S4 method for signature 'BNDataset'
bootstrap(object, num.boots = 100, seed = 0, imputation = FALSE, k.impute = 10)
object | the BNDataset object. |
num.boots | number of sampled datasets for bootstrap. |
seed | random seed. |
imputation | |
k.impute | number of neighbours to be used; for discrete variables the mode is used, for continuous variables the median value is taken instead (used only if imputation == TRUE). |
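For intuition, a bootstrap sample is commonly obtained by resampling the rows of the data with replacement. The base R sketch below follows that standard scheme on a made-up matrix; it is an assumption about the general technique and not a description of bnstruct's implementation.

```r
set.seed(0)  # fixed seed so that the resampling is reproducible

# Hypothetical raw data: 10 observations of 4 discrete variables.
raw <- matrix(sample(1:3, 40, replace = TRUE), nrow = 10, ncol = 4)

num.boots <- 5

# Each bootstrap sample has as many rows as the original data,
# drawn from its rows with replacement.
boots <- lapply(seq_len(num.boots), function(b) {
  rows <- sample(nrow(raw), nrow(raw), replace = TRUE)
  raw[rows, , drop = FALSE]
})

length(boots)
dim(boots[[1]])
```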
## Not run:
dataset <- BNDataset("file.data", "file.header")
dataset <- bootstrap(dataset, num.boots = 1000)
## End(Not run)
Starting from the adjacency matrix of the directed acyclic graph of the network contained in an InferenceEngine, build a JunctionTree for the network and store it into an InferenceEngine.
build.junction.tree(object, ...)
## S4 method for signature 'InferenceEngine'
build.junction.tree(object, ...)
object | an InferenceEngine. |
... | potential further arguments for methods. |
InferenceEngine
## Not run:
dataset <- BNDataset("file.header", "file.data")
net <- BN(dataset)
# build the engine from the network, so that there is a DAG to start from
eng <- InferenceEngine(net)
eng <- build.junction.tree(eng)
## End(Not run)
Child dataset.
Wrapper for a loader for the Child raw dataset; also perform imputation.
child()
The dataset has 5000 items, with random missing values (no latent variables). The BNDataset object contains the raw dataset and the imputed dataset, with k=10 (see impute for related explanation).
a BNDataset containing the Child dataset.
dataset <- child()
print(dataset)
Child dataset.
The Child dataset contains 5000 randomly generated items with missing data (no latent variables) of the Child Bayesian Network.
Imputation is performed, so both raw and imputed data are present.
a BNDataset with the raw and imputed data slots filled with 5000 items.
The data the BNDataset object is built from are located in the files pkg_folder/extdata/extdata/Child_data_na_5000.header and pkg_folder/extdata/extdata/Child_data_na_5000.data.
D. J. Spiegelhalter, R. G. Cowell (1992). Learning in probabilistic expert systems. In Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 447-466. Clarendon Press, Oxford.
BNDataset to get only complete cases.
Given a BNDataset, return a copy of the original object where the raw.data consists only of the observations that do not contain missing values.
complete(x, complete.vars = seq_len(num.variables(x)))
## S4 method for signature 'BNDataset'
complete(x, complete.vars = seq_len(num.variables(x)))
x | a BNDataset object. |
complete.vars | vector containing the indices of the variables to be considered for the subsetting; variables not included in the vector can still contain NA. |
Non-missingness can be required on a subset of variables (by default, on all variables).
If present, imputed data and bootstrap samples are removed from the new BNDataset: when this method is used *after* impute or bootstrap, the subsetted raw.data would likely no longer correspond to the previously generated imputed.data and bootstrap samples.
a copy of the original BNDataset containing only complete observations.
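The row selection described above corresponds to what base R's complete.cases does on a plain matrix. The sketch below uses a made-up matrix and operates on raw values only; it does not touch any BNDataset slots and is not bnstruct's code.

```r
# Hypothetical raw data with missing values (matrix is filled column by column).
raw <- matrix(c(1, NA, 2, 1,
                2, 2, NA, 1,
                1, 1, 1, NA), nrow = 4, ncol = 3)

# Require non-missingness on all variables: only the first row survives.
keep.all <- complete.cases(raw)
complete.all <- raw[keep.all, , drop = FALSE]

# Require non-missingness on variables 1 and 2 only:
# rows may still contain NA in variable 3.
complete.vars <- c(1, 2)
keep.subset <- complete.cases(raw[, complete.vars, drop = FALSE])
complete.subset <- raw[keep.subset, , drop = FALSE]

print(complete.all)
print(complete.subset)
```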
BN.
Return the list of conditional probability tables of the variables of a BN object.
Each probability table is associated to the corresponding variable, and its dimensions are named according to the variable they represent.
cpts(x)
## S4 method for signature 'BN'
cpts(x)
x | an object. |
Each conditional probability table is represented as a multidimensional array. The ordering of the dimensions of each table is not guaranteed to follow the actual conditional distribution. E.g. the dimensions for the conditional probability P(C|A,B) can be either (C,A,B) or (A,B,C), depending on whether some operations have been performed, or on how the probability table has been computed. Users should not rely on dimension numbers, but should instead select the dimensions using their names.
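A short base R sketch shows why selecting dimensions by name is robust to reordering. The table below is a hand-made P(C|A,B) with invented probabilities, stored as an array with named dimnames; it is not taken from a bnstruct network.

```r
# Hypothetical P(C | A, B), stored with dimension order (C, A, B).
cpt <- array(c(0.9, 0.1, 0.4, 0.6, 0.7, 0.3, 0.2, 0.8),
             dim = c(2, 2, 2),
             dimnames = list(C = c("1", "2"), A = c("1", "2"), B = c("1", "2")))

# The same table with dimension order (A, B, C).
cpt.reordered <- aperm(cpt, c("A", "B", "C"))

# Selecting the margins by name gives the same result for both orderings:
# summing over C for every configuration of the parents (A, B).
sums.1 <- apply(cpt, c("A", "B"), sum)
sums.2 <- apply(cpt.reordered, c("A", "B"), sum)

# Position of a variable, looked up by name rather than hard-coded.
pos.C <- which(names(dimnames(cpt.reordered)) == "C")

print(sums.1)
print(pos.C)
```

Using a numeric margin such as `c(2, 3)` would give the intended result only for the first ordering.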
list of the conditional probability tables of the desired object.
Set the list of conditional probability tables of a BN object.
cpts(x) <- value
## S4 replacement method for signature 'BN'
cpts(x) <- value
x | an object. |
value | list of the conditional probability tables of the object. |
Each conditional probability table is represented as a multidimensional array. To retrieve single dimensions (e.g. to compute marginals), users should provide dimensions names.
Return the adjacency matrix of the directed acyclic graph representing the structure of a network.
dag(x)
## S4 method for signature 'BN'
dag(x)
x | an object. |
matrix containing the adjacency matrix of the directed acyclic graph representing the structure of the object.
Convert the adjacency matrix representing the DAG of a BN into the adjacency matrix representing a CPDAG for the network.
dag.to.cpdag(dag.adj.matrix, layering = NULL, layer.struct = NULL)
dag.adj.matrix | the adjacency matrix representing the DAG of a BN. |
layering | vector containing the layers where each node belongs. |
layer.struct | layer.struct |
the adjacency matrix representing a CPDAG for the network.
## Not run:
net <- learn.network(dataset, layering = layering, layer.struct = layer.struct)
pdag <- dag.to.cpdag(dag(net), layering, layer.struct)
wpdag(net) <- pdag
## End(Not run)
Set the adjacency matrix of the directed acyclic graph representing the structure of a network.
dag(x) <- value
## S4 replacement method for signature 'BN'
dag(x) <- value
x | an object. |
value | matrix containing the adjacency matrix of the directed acyclic graph representing the structure of the object. |
BNDataset.
Return the data filename of a dataset (with the path to its position, as given by the user). The data file may contain a header in the first row, containing the list of names of the variables, in the same order as in the header file. After the header, if present, the file contains a data.frame with the observations, one item per row.
data.file(x)
## S4 method for signature 'BNDataset'
data.file(x)
x | a BNDataset object. |
data filename of the dataset.
BNDataset.
Set the data filename of a dataset (with the path to its position, as given by the user). The data file may contain a header in the first row, containing the list of names of the variables, in the same order as in the header file. After the header, if present, the file contains a data.frame with the observations, one item per row.
data.file(x) <- value
## S4 replacement method for signature 'BNDataset'
data.file(x) <- value
x | a BNDataset object. |
value | data filename. |
Get a vector representing the status of the variables (with their names) of a BN or BNDataset.
Elements of the vector are c if the variable is continuous, and d if the variable is discrete.
discreteness(x)
## S4 method for signature 'BN'
discreteness(x)
## S4 method for signature 'BNDataset'
discreteness(x)
x | an object. |
vector containing, for each variable of the desired object, c if the variable is continuous, and d if the variable is discrete.
Set the list of variable status for the variables in a network or a dataset.
discreteness(x) <- value
## S4 replacement method for signature 'BN'
discreteness(x) <- value
## S4 replacement method for signature 'BNDataset'
discreteness(x) <- value
x | an object. |
value | a vector of elements in {c, d}. |
Given a BN with a WPDAG, count the edges, with their directionality.
edge.dir.wpdag(x, use.node.names = TRUE)
x | the BN object. |
use.node.names | use node names rather than number ( |
a matrix containing the node pairs with the count of the edges between them in the WPDAG.
Learn parameters of a network using the Expectation-Maximization algorithm.
em(x, dataset, threshold = 0.001, max.em.iterations = 10, ess = 1)
## S4 method for signature 'InferenceEngine,BNDataset'
em(x, dataset, threshold = 0.001, max.em.iterations = 10, ess = 1)
x | an InferenceEngine. |
dataset | observed dataset with missing values for the Bayesian Network of x. |
threshold | threshold for convergence, used as stopping criterion. |
max.em.iterations | maximum number of iterations to run in case of no convergence. |
ess | Equivalent Sample Size value. |
a list containing: an InferenceEngine with a new updated network ("InferenceEngine"), and the imputed dataset ("BNDataset").
## Not run:
em(x, dataset)
## End(Not run)
Return an array containing the value that each variable of the network is most likely to take, according to the CPTs. In case of ties, the first value is taken.
get.most.probable.values(x, prev.values = NULL)
## S4 method for signature 'BN'
get.most.probable.values(x, prev.values = NULL)
## S4 method for signature 'InferenceEngine'
get.most.probable.values(x, prev.values = NULL)
x | a BN or InferenceEngine object. |
prev.values | vector of size |
array containing, in each position, the most probable value for the corresponding variable.
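The tie-breaking rule (first value wins) is the same one base R's which.max applies, since it returns the index of the first maximum. The sketch below uses invented probability tables and is only an illustration of that rule, not bnstruct's internal procedure.

```r
# Hypothetical marginal distribution with a tie between values 1 and 2.
marginal <- c(0.4, 0.4, 0.2)
best.marginal <- which.max(marginal)  # index of the first maximum

# Hypothetical P(B | A): one column per value of A.
cpt <- array(c(0.5, 0.5, 0.2, 0.8), dim = c(2, 2),
             dimnames = list(B = c("1", "2"), A = c("1", "2")))

# Most probable value of B for every value of A, selecting A by name.
best.given.A <- apply(cpt, "A", which.max)

print(best.marginal)
print(best.given.A)
```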
## Not run:
# try with a BN object x
get.most.probable.values(x)
# now build an InferenceEngine object
eng <- InferenceEngine(x)
get.most.probable.values(eng)
## End(Not run)
BNDataset has bootstrap samples or not.
Return TRUE if the given dataset contains samples for bootstrap, FALSE otherwise.
has.boots(x)
## S4 method for signature 'BNDataset'
has.boots(x)
x | a BNDataset object. |
TRUE if dataset has bootstrap samples.
has.imputed.boots, boots, imp.boots
BNDataset has bootstrap samples from imputed data or not.
Return TRUE if the given dataset contains samples for bootstrap from the imputed dataset, FALSE otherwise.
has.imputed.boots(x)
## S4 method for signature 'BNDataset'
has.imputed.boots(x)
x | a BNDataset object. |
TRUE if dataset has bootstrap samples from imputed data.
Check whether a BNDataset object actually contains imputed data.
has.imputed.data(x)
## S4 method for signature 'BNDataset'
has.imputed.data(x)
x | a BNDataset object. |
has.raw.data, raw.data, imputed.data
## Not run:
x <- BNDataset()
has.imputed.data(x) # FALSE
x <- read.dataset(x, "file.header", "file.data")
has.imputed.data(x) # FALSE, since read.dataset() actually reads raw data.
x <- impute(x)
has.imputed.data(x) # TRUE
## End(Not run)
Check whether a BNDataset object actually contains raw data.
has.raw.data(x)
## S4 method for signature 'BNDataset'
has.raw.data(x)
x | a BNDataset object. |
has.imputed.data, raw.data, imputed.data
## Not run:
x <- BNDataset()
has.raw.data(x) # FALSE
x <- read.dataset(x, "file.header", "file.data")
has.raw.data(x) # TRUE, since read.dataset() actually reads raw data.
## End(Not run)
BNDataset.
Return the header filename of a dataset (with the path to its position, as given by the user), present if the dataset has been read from a file and not manually inserted. The header file contains three rows:
list of names of the variables, in the same order as in the data file;
list of cardinalities of the variables, if discrete, or levels for quantization if continuous;
list of status of the variables: c for continuous variables, d for discrete ones.
header.file(x)
## S4 method for signature 'BNDataset'
header.file(x)
x | a BNDataset object. |
header filename of the dataset.
BNDataset.
Set the header filename of a dataset (with the path to its position, as given by the user). The header file has to contain three rows:
list of names of the variables, in the same order as in the data file;
list of cardinalities of the variables, if discrete, or levels for quantization if continuous;
list of status of the variables: c for continuous variables, d for discrete ones.
Further rows are ignored.
header.file(x) <- value
## S4 replacement method for signature 'BNDataset'
header.file(x) <- value
x | a BNDataset object. |
value | header filename. |
BNDataset.
Return the list of samples computed from imputed data of a dataset.
imp.boots(x)
## S4 method for signature 'BNDataset'
imp.boots(x)
x | a BNDataset object. |
the list of bootstrap samples from imputed data.
has.boots, has.imputed.boots, boots
BNDataset.
Add to a dataset a list of samples from imputed data computed using bootstrap.
imp.boots(x) <- value
## S4 replacement method for signature 'BNDataset'
imp.boots(x) <- value
x | a BNDataset object. |
value | the list of bootstrap samples from imputed data. |
BNDataset raw data with missing values.
Impute a BNDataset raw data with missing values.
impute(object, k.impute = 10)
## S4 method for signature 'BNDataset'
impute(object, k.impute = 10)
object | the BNDataset object. |
k.impute | number of neighbours to be used; for discrete variables the mode is used, for continuous variables the median value is taken instead. |
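The aggregation step described for k.impute can be sketched in base R: given the values that the nearest neighbours take for a variable, the mode is used if the variable is discrete and the median if it is continuous. The helper `stat_mode`, the neighbour values and the tie-breaking choice (smallest value first) are assumptions made for this illustration; how bnstruct finds the neighbours and breaks ties is not shown here.

```r
# Hypothetical helper: most frequent value of a numeric vector.
# In case of ties, table() orders the values increasingly,
# so which.max() picks the smallest of the tied values.
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[which.max(counts)])
}

# Values taken by k = 5 hypothetical neighbours for the missing entry.
neighbour.vals.discrete   <- c(2, 3, 2, 1, 2)
neighbour.vals.continuous <- c(1.5, 2.0, 9.0, 2.5, 1.0)

imputed.discrete   <- stat_mode(neighbour.vals.discrete)
imputed.continuous <- median(neighbour.vals.continuous)

print(imputed.discrete)
print(imputed.continuous)
```

The median makes the continuous estimate insensitive to the outlying neighbour value 9.0.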
## Not run:
dataset <- BNDataset("file.data", "file.header")
dataset <- impute(dataset)
## End(Not run)
Return imputed data contained in a BNDataset object, if any.
imputed.data(x)
## S4 method for signature 'BNDataset'
imputed.data(x)
x | a BNDataset object. |
has.raw.data, has.imputed.data, raw.data
Insert imputed data in a BNDataset object.
imputed.data(x) <- value
## S4 replacement method for signature 'BNDataset'
imputed.data(x) <- value
x | a BNDataset object. |
value | a matrix of integers containing a dataset. |
has.imputed.data, imputed.data, read.dataset
InferenceEngine class.
Constructor method of the InferenceEngine class.
constructor for InferenceEngine object
## S4 method for signature 'InferenceEngine'
initialize(.Object, ...)
InferenceEngine(bn = NULL, observations = NULL, interventions = NULL, ...)
.Object | an empty InferenceEngine object. |
... | potential further arguments of methods. |
bn | a BN object. |
observations | a list of observations composed of the following two vectors: observed.vars and observed.vals. |
interventions | a list of interventions composed of the following two vectors: intervention.vars and intervention.vals. |
an InferenceEngine object.
InferenceEngine object.
junction.tree: junction tree adjacency matrix.
num.nodes: number of nodes in the junction tree.
cliques: list of cliques composing the nodes of the junction tree.
triangulated.graph: adjacency matrix of the original triangulated graph.
jpts: inferred joint probability tables.
bn: original Bayesian Network (as object of class BN) as provided by the user, or learnt from a dataset. NULL if missing.
updated.bn: Bayesian Network (as object of class BN) as modified by a belief propagation computation. In particular, it will have different conditional probability tables with respect to its original version. NULL if missing.
observed.vars: list of observed variables, by name or number.
observed.vals: list of observed values for the corresponding variables in observed.vars.
intervention.vars: list of manipulated variables, by name or number.
intervention.vals: list of specified values for the corresponding variables in intervention.vars.
## Not run:
dataset <- BNDataset()
dataset <- read.dataset(dataset, "file.header", "file.data")
bn <- BN(dataset)
eng <- InferenceEngine(bn)
# observations: variable names first, then the value observed for each
obs <- list(c("A","G","X"), c(1,2,1))
eng.2 <- InferenceEngine(bn, obs)
## End(Not run)
InferenceEngine
.Return the list of interventions added to an InferenceEngine.
interventions(x) ## S4 method for signature 'InferenceEngine' interventions(x)
x |
an |
Output is a list in the following format:
intervention.vars
vector of manipulated variables;
intervention.vals
vector of values for the variables in intervention.vars
in the corresponding position.
the list of interventions of the InferenceEngine
.
InferenceEngine
.Add a list of interventions to an InferenceEngine, using a list composed by the two following vectors:
intervention.vars
vector of the variables we manipulate;
intervention.vals
vector of values for the variables in intervention.vars
in the corresponding position.
interventions(x) <- value ## S4 replacement method for signature 'InferenceEngine' interventions(x) <- value
x |
an |
value |
the list of interventions of the |
An intervention can be applied only when building an InferenceEngine
.
In case of multiple interventions of the same variable, the last intervention is the one used.
InferenceEngine
.Return the list of joint probability tables for the cliques of the junction tree obtained after belief propagation has been performed.
jpts(x) ## S4 method for signature 'InferenceEngine' jpts(x)
x |
an |
Each joint probability table is represented as a multidimensional array. To retrieve single dimensions (e.g. to compute marginals), users should not rely on dimension numbers, but should instead select the dimensions using their names.
the list of joint probability tables compiled by the InferenceEngine
.
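As a small illustration of selecting dimensions by name, the following base-R sketch builds a joint probability table by hand and marginalises it. The table `jpt` and the variable names `A` and `B` are made up for the example; they are not produced by an actual `InferenceEngine`.

```r
# Hypothetical joint table over two binary variables A and B.
jpt <- array(c(0.1, 0.2, 0.3, 0.4),
             dim = c(2, 2),
             dimnames = list(A = c("1", "2"), B = c("1", "2")))

# Marginal of B: sum over every other dimension, selecting B by name.
marg_B <- apply(jpt, "B", sum)

# The same call still works when the dimensions are stored in another order,
# which is why names are safer than dimension numbers.
jpt_perm <- aperm(jpt, c(2, 1))
marg_B_perm <- apply(jpt_perm, "B", sum)
```

Selecting by name means the result does not depend on the position a variable happens to occupy within a clique.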
InferenceEngine
.Add a list of joint probability tables for the cliques of the junction tree.
jpts(x) <- value ## S4 replacement method for signature 'InferenceEngine' jpts(x) <- value
x |
an |
value |
the list of joint probability tables compiled by the |
Each joint probability table is represented as a multidimensional array. To retrieve single dimensions (e.g. to compute marginals), users should provide dimension names.
InferenceEngine
.Return the list of cliques containing the variables associated to each node of a junction tree.
jt.cliques(x) ## S4 method for signature 'InferenceEngine' jt.cliques(x)
x |
an |
the list of cliques of the junction tree contained in the InferenceEngine
.
InferenceEngine
.Add to the InferenceEngine a list containing the cliques of variables composing the nodes of the junction tree.
jt.cliques(x) <- value ## S4 replacement method for signature 'InferenceEngine' jt.cliques(x) <- value
x |
an |
value |
the list of cliques of the junction tree contained in the |
InferenceEngine
.Return the adjacency matrix representing the junction tree computed for a network.
junction.tree(x) ## S4 method for signature 'InferenceEngine' junction.tree(x)
x |
an |
Rows and columns are named after the (variables in the) cliques that each node of the junction tree represents.
the junction tree contained in the InferenceEngine
.
InferenceEngine
.Set the adjacency matrix of the junction tree computed for a network.
junction.tree(x) <- value ## S4 replacement method for signature 'InferenceEngine' junction.tree(x) <- value
x |
an |
value |
the junction tree to be inserted in the |
Perform imputation of missing data in a data frame using the k-Nearest Neighbour algorithm. For discrete variables we use the mode, for continuous variables the median value is instead taken.
knn.impute( data, k = 10, cat.var = 1:ncol(data), to.impute = 1:nrow(data), using = 1:nrow(data) )
data |
a numerical matrix. |
k |
number of neighbours to be used; for categorical variables the mode of the neighbours is used, for continuous variables the median value is used instead. Default: 10. |
cat.var |
vector containing the indices of the variables to be considered as categorical. Default: all variables. |
to.impute |
vector indicating which rows of the dataset are to be imputed. Default: impute all rows. |
using |
vector indicating which rows of the dataset are to be used to search for neighbours. Default: use all rows. |
imputed matrix.
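The idea of mode-or-median imputation from the nearest neighbours can be sketched in base R as follows. This is not the package's `knn.impute` code: the helper name `knn_impute_sketch` is hypothetical, and the distance (root mean squared difference over the columns observed in both rows) is an assumption made only to keep the example short.

```r
# Hypothetical helper, NOT bnstruct's implementation.
knn_impute_sketch <- function(data, k = 2, cat.var = seq_len(ncol(data))) {
  # most frequent value among the neighbours
  stat_mode <- function(v) {
    tab <- table(v)
    as.numeric(names(tab)[which.max(tab)])
  }
  out <- data
  for (i in which(rowSums(is.na(data)) > 0)) {
    # distance from row i to every other row (assumed metric, see lead-in)
    d <- sapply(seq_len(nrow(data)), function(j) {
      if (j == i) return(Inf)
      shared <- !is.na(data[i, ]) & !is.na(data[j, ])
      if (!any(shared)) return(Inf)
      sqrt(mean((data[i, shared] - data[j, shared])^2))
    })
    for (col in which(is.na(data[i, ]))) {
      # nearest rows that actually have a value in this column
      cand <- order(d)
      cand <- cand[is.finite(d[cand]) & !is.na(data[cand, col])]
      nb <- head(cand, k)
      if (length(nb) == 0) next
      vals <- data[nb, col]
      # mode for categorical columns, median for continuous ones
      out[i, col] <- if (col %in% cat.var) stat_mode(vals) else median(vals)
    }
  }
  out
}

m <- rbind(c(1, 1, 10), c(1, 1, 12), c(1, NA, NA), c(2, 2, 50))
res <- knn_impute_sketch(m, k = 2, cat.var = 1:2)
```

In this toy matrix the third row is closest to the first two rows, so its categorical column takes their mode and its continuous column takes their median.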
Compute the topological ordering of the nodes of a network, in order to divide the network in layers.
layering(x) ## S4 method for signature 'BN' layering(x)
x |
a |
a vector containing layers the nodes can be divided into.
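One way to picture a layering derived from a topological ordering is the sketch below, which assigns layer 1 to nodes without parents and otherwise one more than the deepest parent. The helper `dag_layers` is hypothetical, and the convention that `dag[i, j] != 0` encodes an edge from node `i` to node `j` is an assumption of the example; the package's `layering` method may compute its result differently.

```r
# Hypothetical helper, NOT bnstruct's layering(): layers from an adjacency
# matrix where dag[i, j] != 0 is assumed to mean an edge i -> j.
dag_layers <- function(dag) {
  n <- nrow(dag)
  layer <- rep(NA_integer_, n)
  while (anyNA(layer)) {
    progressed <- FALSE
    for (v in which(is.na(layer))) {
      pa <- which(dag[, v] != 0)
      if (length(pa) == 0) {
        layer[v] <- 1L                      # root node
        progressed <- TRUE
      } else if (!anyNA(layer[pa])) {
        layer[v] <- max(layer[pa]) + 1L     # one below its deepest parent
        progressed <- TRUE
      }
    }
    if (!progressed) stop("graph contains a cycle")
  }
  layer
}

# A -> B, A -> C, B -> C
dag <- matrix(0, 3, 3)
dag[1, 2] <- 1; dag[1, 3] <- 1; dag[2, 3] <- 1
layers <- dag_layers(dag)
```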
## Not run:
dataset <- BNDataset("file.header", "file.data")
x <- BN(dataset)
x <- learn.network(x, dataset)
layering(x)
## End(Not run)
Learn a dynamic network (structure and parameters) of a BN from a BNDataset (see the Details
section).
This method is a wrapper for learn.network
to simplify the learning of a dynamic network.
It provides an automated generation of the layering
required to represent the set of time constraints
encoded in a dynamic network. In this function, it is assumed that the dataset contains the observations for each variable
in all the time steps:
V_1^{t_1}, V_2^{t_1}, ..., V_n^{t_1}, V_1^{t_2}, ..., V_n^{t_k}
.
Variables in time step j
can be parents for any variable in time steps k>=j
, but not for variables in time steps i<j
.
If a layering is provided for a time step, it is valid in each time step, and not throughout the whole dynamic network;
a global layering can however be provided.
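The time constraint described above can be pictured with a short base-R sketch. The sizes `n_vars` and `n_steps` and the matrix `allowed` are hypothetical names introduced for the illustration; the package generates its own layering internally and may represent it differently.

```r
# Hypothetical sizes: 3 variables observed over 2 time steps.
n_vars <- 3
n_steps <- 2

# Columns are ordered V_1^{t_1}, ..., V_n^{t_1}, V_1^{t_2}, ..., V_n^{t_k},
# so the time step of column c is ceiling(c / n_vars).
time_step <- rep(seq_len(n_steps), each = n_vars)

# allowed[p, c] is TRUE when column p may be a parent of column c:
# the parent must not belong to a later time step, and p != c.
allowed <- outer(time_step, time_step, "<=")
diag(allowed) <- FALSE
```

Here `allowed[1, 4]` is `TRUE` (a variable at the first time step may influence one at the second), while `allowed[4, 1]` is `FALSE`.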
learn.dynamic.network(x, ...) ## S4 method for signature 'BN' learn.dynamic.network( x, y = NULL, num.time.steps = num.time.steps(y), algo = "mmhc", scoring.func = "BDeu", initial.network = NULL, alpha = 0.05, ess = 1, bootstrap = FALSE, layering = c(), max.fanin = num.variables(y) - 1, max.fanin.layers = NULL, max.parents = num.variables(y) - 1, max.parents.layers = NULL, layer.struct = NULL, cont.nodes = c(), use.imputed.data = FALSE, use.cpc = TRUE, mandatory.edges = NULL, ... ) ## S4 method for signature 'BNDataset' learn.dynamic.network( x, num.time.steps = num.time.steps(x), algo = "mmhc", scoring.func = "BDeu", initial.network = NULL, alpha = 0.05, ess = 1, bootstrap = FALSE, layering = c(), max.fanin = num.variables(x) - 1, max.fanin.layers = NULL, max.parents = num.variables(x) - 1, max.parents.layers = NULL, layer.struct = NULL, cont.nodes = c(), use.imputed.data = FALSE, use.cpc = TRUE, mandatory.edges = NULL, ... )
x |
can be a |
... |
potential further arguments for methods. |
y |
|
num.time.steps |
the number of time steps to be represented in the dynamic BN. |
algo |
the algorithm to use. Currently, one among |
scoring.func |
the scoring function to use. Currently, one among
|
initial.network |
network structure to be used as starting point for structure search.
Can take different values:
a |
alpha |
confidence threshold (only for |
ess |
Equivalent Sample Size value. |
bootstrap |
|
layering |
vector containing the layers each node belongs to. |
max.fanin |
maximum number of parents for each node (only for |
max.fanin.layers |
matrix of available parents in each layer (only for |
max.parents |
maximum number of parents for each node (for |
max.parents.layers |
matrix of available parents in each layer (only for |
layer.struct |
|
cont.nodes |
vector containing the index of continuous variables. |
use.imputed.data |
|
use.cpc |
(when using |
mandatory.edges |
binary matrix, where a |
The other parameters available are the ones of learn.network
, refer to the documentation of that function
for more details. See also the documentation for learn.structure
and learn.params
for more information.
new BN
object with structure (DAG) and conditional probabilities
as learnt from the given dataset.
learn.network learn.structure learn.params
## Not run:
mydataset <- BNDataset("data.file", "header.file")
net <- learn.dynamic.network(mydataset, num.time.steps=2)
## End(Not run)
Learn a network (structure and parameters) of a BN from a BNDataset (see the Details
section).
learn.network(x, ...) ## S4 method for signature 'BN' learn.network( x, y = NULL, algo = "mmhc", scoring.func = "BDeu", initial.network = NULL, alpha = 0.05, ess = 1, bootstrap = FALSE, layering = c(), max.fanin = num.variables(y) - 1, max.fanin.layers = NULL, max.parents = num.variables(y) - 1, max.parents.layers = NULL, layer.struct = NULL, cont.nodes = c(), use.imputed.data = FALSE, use.cpc = TRUE, mandatory.edges = NULL, ... ) ## S4 method for signature 'BNDataset' learn.network( x, algo = "mmhc", scoring.func = "BDeu", initial.network = NULL, alpha = 0.05, ess = 1, bootstrap = FALSE, layering = c(), max.fanin = num.variables(x) - 1, max.fanin.layers = NULL, max.parents = num.variables(x) - 1, max.parents.layers = NULL, layer.struct = NULL, cont.nodes = c(), use.imputed.data = FALSE, use.cpc = TRUE, mandatory.edges = NULL, ... )
x |
can be a |
... |
potential further arguments for methods. |
y |
|
algo |
the algorithm to use. Currently, one among |
scoring.func |
the scoring function to use. Currently, one among
|
initial.network |
network structure to be used as starting point for structure search.
Can take different values:
a |
alpha |
confidence threshold (only for |
ess |
Equivalent Sample Size value. |
bootstrap |
|
layering |
vector containing the layers each node belongs to. |
max.fanin |
maximum number of parents for each node (only for |
max.fanin.layers |
matrix of available parents in each layer (only for |
max.parents |
maximum number of parents for each node (for |
max.parents.layers |
matrix of available parents in each layer (only for |
layer.struct |
|
cont.nodes |
vector containing the index of continuous variables. |
use.imputed.data |
|
use.cpc |
(when using |
mandatory.edges |
binary matrix, where a |
Learn the structure (the directed acyclic graph) of a BN
object according to a BNDataset
.
We provide five algorithms for learning the structure of the network, that can be chosen with the algo
parameter.
The first one is the Silander-Myllymäki (sm
)
exact search-and-score algorithm, that performs a complete evaluation of the search space in order to discover
the best network; this algorithm may take a very long time, and can be inapplicable when discovering networks
with more than 25–30 nodes. Even for small networks, users are strongly encouraged to provide
meaningful parameters such as the layering of the nodes, or the maximum number of parents – refer to the
documentation in package manual for more details on the method parameters.
The second method is the constraint-based Max-Min Parents-and-Children (mmpc
), that returns the skeleton of the network.
Given the possible presence of loops, due to the non-directionality of the edges discovered, no parameter learning
is possible using this algorithm. Also note that in the case of a very dense network and lots of observations, the statistical evaluation
of the search space may take a long time. Also for this algorithm there are parameters that may need to be tuned,
mainly the confidence threshold of the statistical pruning. Please refer to the rest of this documentation for their explanation.
The third algorithm is another heuristic, the Hill-Climbing (hc
). It can start from the complete space of possibilities
(default) or from a reduced subset of possible edges, using the cpc
argument.
The fourth algorithm (and the default one) is the Max-Min Hill-Climbing heuristic (mmhc
), that performs a statistical
sieving of the search space followed by a greedy evaluation, by combining the MMPC and the HC algorithms.
It is considerably faster than the complete method, at the cost of a (likely)
lower quality. As for MMPC, the computational time depends on the density of the network, the number of observations and
the tuning of the parameters.
The fifth method is the Structural Expectation-Maximization (sem
) algorithm,
for learning a network from a dataset with missing values. It iterates a sequence of Expectation-Maximization (in order to “fill in”
the holes in the dataset) and structure learning from the guessed dataset, until convergence. The structure learning used inside SEM,
due to computational reasons, is MMHC. Convergence of SEM can be controlled with the parameters struct.threshold
and param.threshold
, for the structure and the parameter convergence, respectively.
Search-and-score methods also need a scoring function to compute an estimated measure of each configuration of nodes.
We provide three of the most popular scoring functions, BDeu
(Bayesian-Dirichlet equivalent uniform, default),
AIC
(Akaike Information Criterion) and BIC
(Bayesian Information Criterion). The scoring function
can be chosen using the scoring.func
parameter.
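For intuition on the penalised-likelihood scores, here is a minimal base-R sketch for a single discrete node. It uses common textbook definitions (log-likelihood minus the number of free parameters for AIC, and minus half of log N times that number for BIC); the package's exact scaling and its handling of unobserved parent configurations are not described in this manual and may differ. `node_scores` is a hypothetical helper, and `parents` is assumed to be a single vector giving the parent configuration of each row.

```r
# Hypothetical helper, NOT bnstruct's scoring code.
node_scores <- function(child, parents = NULL) {
  n <- length(child)
  # counts[j, k]: rows with parent configuration j and child value k
  counts <- if (is.null(parents)) {
    matrix(table(child), nrow = 1)
  } else {
    unclass(table(parents, child))
  }
  row_tot <- rowSums(counts)
  ll_terms <- counts * log(counts / row_tot)
  ll <- sum(ll_terms[counts > 0])          # treat 0 * log(0) as 0
  # free parameters: (observed parent configurations) * (child values - 1)
  n_par <- nrow(counts) * (ncol(counts) - 1)
  c(loglik = ll, aic = ll - n_par, bic = ll - 0.5 * log(n) * n_par)
}

s <- node_scores(child = c(1, 1, 2, 2), parents = c(1, 1, 2, 2))
```

In this toy case the child is fully determined by its parent, so the log-likelihood is 0 and the scores consist of the complexity penalty alone.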
Structure learning sets the dag
field of the BN
under study, unless bootstrap or the mmpc
algorithm
are employed. In these cases, given the possible presence of loops, the wpdag
field is set.
In case of missing data, the default behaviour (with no other indication from the user)
is to learn the structure using mmhc
starting from the raw dataset, using only the
available cases with no imputation.
In case of learning from a dataset containing observations of a dynamic system, learn.dynamic.network
will be employed.
Then, the parameters of the network are learnt using MAP (Maximum A Posteriori) estimation (when not using bootstrap or mmpc
).
See documentation for learn.structure
and learn.params
for more information.
new BN
object with structure (DAG) and conditional probabilities
as learnt from the given dataset.
learn.structure learn.params learn.dynamic.network
## Not run:
mydataset <- BNDataset("data.file", "header.file")
# starting from a BN
net <- BN(mydataset)
net <- learn.network(net, mydataset)
# start directly from the dataset
net <- learn.network(mydataset)
## End(Not run)
Learn the parameters of a BN object according to a BNDataset using MAP (Maximum A Posteriori) estimation.
learn.params(bn, dataset, ess = 1, use.imputed.data = F) ## S4 method for signature 'BN,BNDataset' learn.params(bn, dataset, ess = 1, use.imputed.data = FALSE)
bn |
a |
dataset |
a |
ess |
Equivalent Sample Size value. |
use.imputed.data |
use imputed data. |
Parameter learning is not possible in case of networks learnt using the mmpc
algorithm,
or from bootstrap samples, as there may be loops.
new BN
object with conditional probabilities.
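To show how an equivalent sample size enters a parameter estimate, the sketch below computes a Dirichlet-smoothed conditional probability table (the posterior mean under a BDeu-style prior that spreads `ess` uniformly over all cells). Whether `learn.params` uses exactly this formula is an assumption; `smoothed_cpt` is a hypothetical helper and `parent` is assumed to be a single discrete variable.

```r
# Hypothetical helper, NOT bnstruct's learn.params().
smoothed_cpt <- function(child, parent, ess = 1) {
  counts <- unclass(table(parent, child))  # rows: parent value, cols: child value
  q <- nrow(counts)                        # number of parent configurations
  r <- ncol(counts)                        # number of child values
  prior <- ess / (q * r)                   # pseudo-count added to each cell
  (counts + prior) / (rowSums(counts) + ess / q)
}

cpt <- smoothed_cpt(child = c(1, 1, 2, 2, 1), parent = c(1, 1, 2, 2, 2), ess = 1)
```

Each row of `cpt` sums to 1, and a child value never observed for a given parent value still receives a small non-zero probability.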
learn.network
## Not run:
## first create a BN and learn its structure from a dataset
dataset <- BNDataset("file.header", "file.data")
bn <- BN(dataset)
bn <- learn.structure(bn, dataset)
bn <- learn.params(bn, dataset, ess=1)
## End(Not run)
Learn the structure (the directed acyclic graph) of a BN
object according to a BNDataset
.
learn.structure( bn, dataset, algo = "mmhc", scoring.func = "BDeu", initial.network = NULL, alpha = 0.05, ess = 1, bootstrap = FALSE, layering = c(), max.fanin = num.variables(dataset), max.fanin.layers = NULL, max.parents = num.variables(dataset), max.parents.layers = NULL, layer.struct = NULL, cont.nodes = c(), use.imputed.data = FALSE, use.cpc = TRUE, mandatory.edges = NULL, ... ) ## S4 method for signature 'BN,BNDataset' learn.structure( bn, dataset, algo = "mmhc", scoring.func = "BDeu", initial.network = NULL, alpha = 0.05, ess = 1, bootstrap = FALSE, layering = c(), max.fanin = num.variables(dataset) - 1, max.fanin.layers = NULL, max.parents = num.variables(dataset) - 1, max.parents.layers = NULL, layer.struct = NULL, cont.nodes = c(), use.imputed.data = FALSE, use.cpc = TRUE, mandatory.edges = NULL, ... )
bn |
a |
dataset |
a |
algo |
the algorithm to use. Currently, one among |
scoring.func |
the scoring function to use. Currently, one among |
initial.network |
network structure to be used as starting point for structure search.
Can take different values:
a |
alpha |
confidence threshold (only for |
ess |
Equivalent Sample Size value. |
bootstrap |
|
layering |
vector containing the layers each node belongs to (only for |
max.fanin |
maximum number of parents for each node (only for |
max.fanin.layers |
matrix of available parents in each layer (only for |
max.parents |
maximum number of parents for each node (for |
max.parents.layers |
matrix of available parents in each layer (only for |
layer.struct |
|
cont.nodes |
vector containing the index of continuous variables. |
use.imputed.data |
|
use.cpc |
(when using |
mandatory.edges |
binary matrix, where a |
... |
potential further arguments for method. |
We provide five algorithms in order to learn the structure of the network, that can be chosen with the algo
parameter.
The first is the Silander-Myllymäki (sm
)
exact search-and-score algorithm, that performs a complete evaluation of the search space in order to discover
the best network; this algorithm may take a very long time, and can be inapplicable when discovering networks
with more than 25–30 nodes. Even for small networks, users are strongly encouraged to provide
meaningful parameters such as the layering of the nodes, or the maximum number of parents – refer to the
documentation in package manual for more details on the method parameters.
The second method is the constraint-based Max-Min Parents-and-Children (mmpc
), that returns the skeleton of the network.
Given the possible presence of loops, due to the non-directionality of the edges discovered, no parameter learning
is possible using this algorithm. Also note that in the case of a very dense network and lots of observations, the statistical evaluation
of the search space may take a long time. Also for this algorithm there are parameters that may need to be tuned,
mainly the confidence threshold of the statistical pruning. Please refer to the rest of this documentation for their explanation.
The third algorithm is another heuristic, the Hill-Climbing (hc
). It can start from the complete space of possibilities
(default) or from a reduced subset of possible edges, using the cpc
argument.
The fourth algorithm (and the default one) is the Max-Min Hill-Climbing heuristic (mmhc
), that performs a statistical
sieving of the search space followed by a greedy evaluation, by combining the MMPC and the HC algorithms.
It is considerably faster than the complete method, at the cost of a (likely)
lower quality. As for MMPC, the computational time depends on the density of the network, the number of observations and
the tuning of the parameters.
The fifth method is the Structural Expectation-Maximization (sem
) algorithm,
for learning a network from a dataset with missing values. It iterates a sequence of Expectation-Maximization (in order to “fill in”
the holes in the dataset) and structure learning from the guessed dataset, until convergence. The structure learning used inside SEM,
due to computational reasons, is MMHC. Convergence of SEM can be controlled with the parameters struct.threshold
and param.threshold
, for the structure and the parameter convergence, respectively.
Search-and-score methods also need a scoring function to compute an estimated measure of each configuration of nodes.
We provide three of the most popular scoring functions, BDeu
(Bayesian-Dirichlet equivalent uniform, default),
AIC
(Akaike Information Criterion) and BIC
(Bayesian Information Criterion). The scoring function
can be chosen using the scoring.func
parameter.
Structure learning sets the dag
field of the BN
under study, unless bootstrap or the mmpc
algorithm
are employed. In these cases, given the possible presence of loops, the wpdag
field is set.
In case of missing data, the default behaviour (with no other indication from the user)
is to learn the structure using mmhc
starting from the raw dataset.
new BN
object with DAG.
learn.network learn.dynamic.network
## Not run:
dataset <- BNDataset("file.header", "file.data")
bn <- BN(dataset)
# use MMHC
bn <- learn.structure(bn, dataset, alpha=0.05, ess=1, bootstrap=FALSE)
# now use Silander-Myllymaki
layers <- layering(bn)
mfl <- as.matrix(read.table(header=F,
text='0 1 1 1 1
      0 1 1 1 1
      0 0 8 7 7
      0 0 0 14 6
      0 0 0 0 19'))
bn <- learn.structure(bn, dataset, algo='sm', max.fanin=3, cont.nodes=c(),
                      layering=layers, max.fanin.layers=mfl,
                      use.imputed.data=FALSE)
## End(Not run)
Given an InferenceEngine
, it returns a list containing the marginals for the variables
in the network, according to the propagated beliefs.
marginals(x, ...) ## S4 method for signature 'InferenceEngine' marginals(x, ...)
x |
|
... |
potential further arguments of methods. |
a list containing the marginals of each variable, as probability tables.
## Not run:
eng <- InferenceEngine(net)
marginals(eng)
## End(Not run)
Return the name of an object, of class BN
or BNDataset
.
name(x) ## S4 method for signature 'BN' name(x) ## S4 method for signature 'BNDataset' name(x)
x |
an object. |
name of the object.
Set the name
slot of an object of type BN
or BNDataset
.
name(x) <- value ## S4 replacement method for signature 'BN' name(x) <- value ## S4 replacement method for signature 'BNDataset' name(x) <- value
x |
an object. |
value |
the new name of the object. |
Return a list containing the size of the variables of an object. It is the actual cardinality of discrete variables, and the cardinality of the discretized variable for continuous variables.
node.sizes(x) ## S4 method for signature 'BN' node.sizes(x) ## S4 method for signature 'BNDataset' node.sizes(x)
x |
an object. |
vector containing the size of each variable of the desired object.
Set the size of the variables of a BN or BNDataset object. It represents the actual cardinality of discrete variables, and the cardinality of the discretized variable for continuous variables.
node.sizes(x) <- value ## S4 replacement method for signature 'BN' node.sizes(x) <- value ## S4 replacement method for signature 'BNDataset' node.sizes(x) <- value
x |
an object. |
value |
vector containing the size of each variable of the object. |
BNDataset
.Return the number of bootstrap samples computed from a dataset.
num.boots(x) ## S4 method for signature 'BNDataset' num.boots(x)
x |
a |
the number of bootstrap samples.
BNDataset
.Set the length of the list of samples of a dataset computed using bootstrap.
num.boots(x) <- value ## S4 replacement method for signature 'BNDataset' num.boots(x) <- value
x |
a |
value |
the number of bootstrap samples. |
BNDataset
.Return the number of items in a dataset, that is, the number of rows in its data slot.
num.items(x) ## S4 method for signature 'BNDataset' num.items(x)
x |
a |
number of items of the desired dataset.
BNDataset
.Set the number of observed items (rows) in a dataset.
num.items(x) <- value ## S4 replacement method for signature 'BNDataset' num.items(x) <- value
x |
a |
value |
number of items of the desired dataset. |
Return the number of nodes of an object, of class BN
or InferenceEngine
.
num.nodes(x) ## S4 method for signature 'BN' num.nodes(x) ## S4 method for signature 'InferenceEngine' num.nodes(x)
x |
an object. |
number of nodes of the desired object.
Set the number of nodes of an object of type BN
(number of nodes of the network)
or InferenceEngine
(where parameter contains the number of nodes of the junction tree).
num.nodes(x) <- value ## S4 replacement method for signature 'BN' num.nodes(x) <- value ## S4 replacement method for signature 'InferenceEngine' num.nodes(x) <- value
x |
an object. |
value |
the number of nodes in the object. |
BN
or a BNDataset
.Return the number of time steps observed in a dataset.
num.time.steps(x) ## S4 method for signature 'BN' num.time.steps(x) ## S4 method for signature 'BNDataset' num.time.steps(x)
x |
the number of time steps.
BN
or a BNDataset
.Set the number of time steps of a dataset.
num.time.steps(x) <- value ## S4 replacement method for signature 'BN' num.time.steps(x) <- value ## S4 replacement method for signature 'BNDataset' num.time.steps(x) <- value
x |
|
value |
the number of time steps. |
BNDataset
.Return the number of the variables contained in a dataset. This value corresponds to the value
of num.nodes
of a network built upon the same dataset.
num.variables(x) ## S4 method for signature 'BNDataset' num.variables(x) ## S4 method for signature 'BNDataset' num.variables(x)
x |
a |
number of variables of the desired dataset.
BNDataset
.Set the number of variables observed in a dataset.
num.variables(x) <- value ## S4 replacement method for signature 'BNDataset' num.variables(x) <- value
x |
a |
value |
number of variables of the dataset. |
InferenceEngine
.Return the list of observations added to an InferenceEngine.
observations(x) ## S4 method for signature 'InferenceEngine' observations(x)
x | an |
Output is a list in the following format:
observed.vars: vector of observed variables;
observed.vals: vector of values observed for the variables in observed.vars, in the corresponding position.
the list of observations of the InferenceEngine.
Add a list of observations to an InferenceEngine, using a list composed of the following two vectors:
observed.vars: vector of observed variables;
observed.vals: vector of values observed for the variables in observed.vars, in the corresponding position.
observations(x) <- value ## S4 replacement method for signature 'InferenceEngine' observations(x) <- value
x | an |
value | the list of observations of the |
Replaces the previous list of observations, if present. To add evidence rather than replace it, use the add.observations<- method.
In case of multiple observations of the same variable, the last observation is the one used, as the most recent.
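The "last observation wins" rule can be illustrated in base R. This is only a sketch of the documented behaviour, not the package's internal code; `keep_last_observations` is a hypothetical helper name introduced for the example.

```r
# Hypothetical helper (not part of bnstruct): collapse repeated observations
# of the same variable, keeping only the most recent (last) value.
keep_last_observations <- function(obs) {
  vars <- obs$observed.vars
  vals <- obs$observed.vals
  stopifnot(length(vars) == length(vals))
  # duplicated(..., fromLast = TRUE) flags every occurrence except the last
  keep <- !duplicated(vars, fromLast = TRUE)
  list(observed.vars = vars[keep], observed.vals = vals[keep])
}

# "A" is observed twice; only its second value should survive.
obs <- list(observed.vars = c("A", "G", "A"),
            observed.vals = c(1, 2, 3))
clean <- keep_last_observations(obs)
print(clean)
```

With the input above, the first observation of "A" is dropped and the value 3 is kept for it.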
Plot a BN as a picture.
## S3 method for class 'BN' plot( x, method = "default", use.node.names = TRUE, frac = 0.2, max.weight = max(dag(x)), node.size.lab = 14, node.col = rep("white", num.nodes(x)), plot.wpdag = FALSE, ... )
x | a |
method | either |
use.node.names | |
frac | minimum fraction [0,1] of presence of an edge to be plotted (used in case of |
max.weight | maximum possible weight of an edge (used in case of |
node.size.lab | font size for the node labels in the default mode. |
node.col | list of ( |
plot.wpdag | if |
... | potential further arguments when using |
Print a BN, BNDataset or InferenceEngine to stdout.
## S3 method for class 'BN' print(x, ...) ## S3 method for class 'BNDataset' print(x, show.raw.data = FALSE, show.imputed.data = FALSE, ...) ## S3 method for class 'InferenceEngine' print(x, engine = "jt", ...)
x | a |
... | potential other arguments. |
show.raw.data | if |
show.imputed.data | if |
engine | if |
Return the list of quantiles of a BN or a BNDataset. It is set when a discretization needs to be performed.
quantiles(x) ## S4 method for signature 'BN' quantiles(x) ## S4 method for signature 'BNDataset' quantiles(x)
x | a list of vectors. |
Output is a list of num.nodes vectors, one per variable. Each vector is NULL if the corresponding variable is discrete in the original dataset, and contains the cut points for the quantiles if the corresponding variable is continuous.
the list of quantiles of the BN or BNDataset.
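The shape of such a list can be sketched in base R. The data, the `discreteness` and `levels.wanted` vectors, and the use of equal-frequency cut points are all assumptions made for illustration; the cut points stored by the package may be computed differently.

```r
# Illustrative data: column 1 is discrete (values 1..3), column 2 is continuous.
set.seed(1)
data <- cbind(sample(1:3, 50, replace = TRUE), rnorm(50))
discreteness  <- c(TRUE, FALSE)  # TRUE = discrete, FALSE = continuous
levels.wanted <- c(3, 4)         # cardinality / number of quantization levels

# One entry per variable: NULL for discrete variables, cut points otherwise.
# Equal-frequency cut points are an assumption made for this sketch.
quantiles.sketch <- lapply(seq_len(ncol(data)), function(i) {
  if (discreteness[i]) return(NULL)
  probs <- seq(0, 1, length.out = levels.wanted[i] + 1)
  unname(quantile(data[, i], probs = probs))
})
str(quantiles.sketch)
```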
Set the list of quantiles of a BN or a BNDataset.
quantiles(x) <- value ## S4 replacement method for signature 'BN' quantiles(x) <- value ## S4 replacement method for signature 'BNDataset' quantiles(x) <- value
x | |
value | a list of vectors. |
It is used when a discretization needs to be performed.
Return raw data contained in a BNDataset object, if any.
raw.data(x) ## S4 method for signature 'BNDataset' raw.data(x)
x | a |
See also: has.raw.data, has.imputed.data
Insert raw data in a BNDataset object.
raw.data(x) <- value ## S4 replacement method for signature 'BNDataset' raw.data(x) <- value
x | a |
value | a matrix of integers containing a dataset. |
See also: has.raw.data, raw.data, read.dataset
Read a network described in a .bif-formatted file, and build a BN object.
read.bif(x) ## S4 method for signature 'character' read.bif(x)
x | the |
The method relies on a coherent ordering of variable values and parameters in the file.
a BN object.
There are two ways to build a BNDataset: using two files containing, respectively, header information and data, or manually providing the data table and the related header information (variable names, cardinality and discreteness).
read.dataset( object, data.file, header.file, data.with.header = FALSE, na.string.symbol = "?", sep.symbol = "", starts.from = 1, num.time.steps = 1 ) ## S4 method for signature 'BNDataset,character,character' read.dataset( object, data.file, header.file, data.with.header = FALSE, na.string.symbol = "?", sep.symbol = "", starts.from = 1, num.time.steps = 1 )
object | the |
data.file | the |
header.file | the |
data.with.header | |
na.string.symbol | character that denotes |
sep.symbol | separator among values in the dataset. |
starts.from | starting value for entries in the dataset (observed values, default is 1). |
num.time.steps | number of instants composing the observations (1, unless it is a dynamic system). |
The key pieces of information needed are:
1. the data;
2. the state of the variables (discrete or continuous);
3. the names of the variables;
4. the cardinalities of the variables (if discrete), or the number of levels they have to be quantized into (if continuous).
Names and cardinalities/levels can be guessed by looking at the data, but it is strongly advised to provide _all_ of the information, in order to avoid problems later on during the execution.
Data can be provided in the form of a data.frame or a matrix. It can contain NAs. By default, NAs are indicated with '?'; to specify a different character for NAs, provide the na.string.symbol parameter.
The values contained in the data have to be numeric (real for continuous variables, integer for discrete ones).
The default range of values for a discrete variable X is [1,|X|], with |X| being the cardinality of X. The same applies for the levels of quantization for continuous variables.
If the value ranges for the data are different from the expected ones, it is possible to specify a different starting value (for the whole dataset) with the starts.from parameter. E.g. by starts.from=0 we assume that the values of the variables in the dataset have range [0,|X|-1].
Please keep in mind that the internal representation of bnstruct starts from 1, and the original starting values are then lost.
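The effect of a different starting value can be sketched in base R. This only illustrates the convention described above (values re-based so that they start from 1); the matrix `raw` is made-up data, and the arithmetic is an assumption about the intended result, not the package's own code.

```r
# Illustrative data whose discrete values start from 0 instead of 1.
raw <- matrix(c(0, 1, 2,
                1, 0, NA,
                2, 2, 1), nrow = 3, byrow = TRUE)
starts.from <- 0

# Shift every entry so that the smallest admissible value becomes 1.
# NA entries stay NA; the original offset is not recoverable afterwards.
shifted <- raw - starts.from + 1
print(shifted)
```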
It is possible to use two files, one for the data and one for the metadata, instead of manually providing all of the information.
bnstruct requires the data files to be in the format described below.
The actual data has to be in a text file in tabular format, one tuple per row, with the values for each variable separated by a space or a tab. Values for each variable have to be numbers, starting from 1 in case of discrete variables.
Data files can have a first row containing the names of the corresponding variables.
In addition to the data file, a header file containing additional information can also be provided.
A header file has to be composed of three rows of tab-delimited values:
1. a list of the names of the variables, in the same order as in the data file;
2. a list of integers representing the cardinality of the variables, in case of discrete variables, or the number of levels each variable has to be quantized into, in case of continuous variables;
3. a list that indicates, for each variable, if the variable is continuous (c or C), and thus has to be quantized before learning, or discrete (d or D).
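A pair of files in the layout described above can be produced with base R. The file names, variable names and values are hypothetical and chosen only for illustration.

```r
# Hypothetical file names in a temporary directory.
data.file   <- file.path(tempdir(), "example.data")
header.file <- file.path(tempdir(), "example.header")

# Three variables: two discrete (cardinalities 2 and 3) and one continuous
# to be quantized into 4 levels.
header <- c(
  paste(c("A", "B", "C"), collapse = "\t"),   # row 1: variable names
  paste(c(2, 3, 4), collapse = "\t"),         # row 2: cardinalities / levels
  paste(c("d", "d", "c"), collapse = "\t")    # row 3: discrete or continuous
)
writeLines(header, header.file)

# Data: one tuple per row, space-separated, "?" for missing values.
data <- data.frame(A = c(1, 2, 1), B = c(3, NA, 2), C = c(0.7, 1.9, 2.4))
write.table(data, data.file, sep = " ", na = "?",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

cat(readLines(header.file), sep = "\n")
cat(readLines(data.file), sep = "\n")
```

Files written this way would be passed as the data.file and header.file arguments of read.dataset.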
BNDataset
## Not run:
dataset <- BNDataset()
dataset <- read.dataset(dataset, "file.data", "file.header")
## End(Not run)
Read a network described in a .dsc-formatted file, and build a BN object.
read.dsc(x) ## S4 method for signature 'character' read.dsc(x)
x | the |
The method relies on a coherent ordering of variable values and parameters in the file.
a BN object.
Read a network described in a .net-formatted file, and build a BN object.
read.net(x) ## S4 method for signature 'character' read.net(x)
x | the |
The method relies on a coherent ordering of variable values and parameters in the file.
a BN object.
Sample a BNDataset from a network or an inference engine.
sample.dataset(x, n = 100, mar = 0) ## S4 method for signature 'BN' sample.dataset(x, n = 100, mar = 0) ## S4 method for signature 'InferenceEngine' sample.dataset(x, n = 100)
x | a |
n | number of items to sample. |
mar | fraction [0,1] of missing values in the sampled dataset (missing at random), default value is 0 (no missing values). |
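What a given mar fraction means for a sampled table can be sketched in base R. `add_missing` is a hypothetical helper, and removing an exact fraction of entries chosen uniformly at random is an assumption of this sketch; the package may select missing entries differently.

```r
# Hypothetical helper (not part of bnstruct): blank out a fraction `mar`
# of the entries of a matrix, chosen uniformly at random.
add_missing <- function(data, mar = 0) {
  stopifnot(mar >= 0, mar <= 1)
  n.missing <- round(mar * length(data))
  idx <- sample(length(data), n.missing)
  data[idx] <- NA
  data
}

set.seed(42)
complete <- matrix(sample(1:3, 40, replace = TRUE), nrow = 10)
with.na <- add_missing(complete, mar = 0.25)
mean(is.na(with.na))  # observed fraction of missing entries
```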
Sample a row vector of values for a network.
sample.row(x, mar = 0) ## S4 method for signature 'BN' sample.row(x, mar = 0)
x | a |
mar | fraction [0,1] of missing values in the sampled vector (missing at random), default value is 0 (no missing values). |
a vector of values.
Save an image of a Bayesian Network as an .eps file.
save.to.eps(x, filename, ...) ## S4 method for signature 'BN,character' save.to.eps(x, filename, ...)
x | a |
filename | name (with path, if needed) of the file to be created |
... | parameters for the |
## Not run:
save.to.eps(x, "out.eps")
## End(Not run)
Read the scoring function used in the learn.structure method.
Outcome is meaningful only if the structure of a network has been learnt.
scoring.func(x) ## S4 method for signature 'BN' scoring.func(x)
x | the |
the scoring function used.
Set the scoring function used in the learn.structure method.
scoring.func(x) <- value ## S4 replacement method for signature 'BN' scoring.func(x) <- value
x | the |
value | the scoring function used. |
updated BN.
Compute the Structural Hamming Distance between two adjacency matrices, that is, the distance, in terms of edges, between two network structures. The lower the shd, the more similar the two network structures are.
shd(g1, g2)
g1 | first adjacency matrix. |
g2 | second adjacency matrix. |
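An edge-based distance of this kind can be sketched in base R. The convention used here (each pair of nodes whose connection differs counts once, so a missing, extra or reversed edge each adds 1) is one common reading of the Structural Hamming Distance and is an assumption; `shd_sketch` is a hypothetical name, and the package's shd may count differences differently.

```r
# Hypothetical sketch (not the package's shd): count the node pairs whose
# connection differs between two adjacency matrices of the same size.
shd_sketch <- function(g1, g2) {
  stopifnot(is.matrix(g1), is.matrix(g2), all(dim(g1) == dim(g2)))
  a <- (g1 != 0) * 1L
  b <- (g2 != 0) * 1L
  # A pair (i, j) differs if either direction of the edge differs.
  differs <- (a != b) | (t(a) != t(b))
  # Each unordered pair is counted once via the upper triangle.
  sum(differs[upper.tri(differs)])
}

# g1: 1 -> 2, 2 -> 3
g1 <- matrix(0, 3, 3); g1[1, 2] <- 1; g1[2, 3] <- 1
# g2: 2 -> 1 (reversed), 2 -> 3 (same), 1 -> 3 (extra)
g2 <- matrix(0, 3, 3); g2[2, 1] <- 1; g2[2, 3] <- 1; g2[1, 3] <- 1

d <- shd_sketch(g1, g2)
print(d)
```

Under this convention the two example graphs differ on two node pairs: the reversed edge between nodes 1 and 2, and the extra edge between nodes 1 and 3.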
The show method provides a custom appearance for the output that is generated when the name of an instance is given as a command in an R session.
show(object)
object | an object. |
Read the algorithm used in the learn.structure method.
Outcome is meaningful only if the structure of a network has been learnt.
struct.algo(x) ## S4 method for signature 'BN' struct.algo(x)
x | the |
the structure learning algorithm used.
Set the algorithm used in the learn.structure method.
struct.algo(x) <- value ## S4 replacement method for signature 'BN' struct.algo(x) <- value
x | the |
value | the structure learning algorithm used. |
updated BN.
Check if an InferenceEngine actually contains an updated network, so that the original network can be used as a fallback if no belief propagation has been performed.
An InferenceEngine built specifying a set of interventions will contain an updated BN with altered structure and no conditional probability tables (unless they are computed by a belief propagation operation).
test.updated.bn(x) ## S4 method for signature 'InferenceEngine' test.updated.bn(x)
x | an |
TRUE if an updated network is contained in the InferenceEngine, FALSE otherwise.
## Not run:
dataset <- BNDataset("file.header", "file.data")
bn <- BN(dataset)
ie <- InferenceEngine(bn)
test.updated.bn(ie) # FALSE

# vectors of names must be built with c()
observations(ie) <- list("observed.vars"=c("A","G","X"), "observed.vals"=c(1,2,1))
ie <- belief.propagation(ie)
test.updated.bn(ie) # TRUE

interventions <- list("intervention.vars"=c("A","G","X"), "intervention.vals"=c(1,2,1))
ie2 <- InferenceEngine(bn, interventions = interventions)
test.updated.bn(ie2) # TRUE
## End(Not run)
Tune the parameter k of the k-nearest-neighbours (knn) algorithm used in imputation.
tune.knn.impute( data, cat.var = 1:ncol(data), k.min = 1, k.max = 20, frac.miss = 0.1, n.iter = 20, seed = 0 )
data | a numerical matrix. |
cat.var | vector containing the categorical variables |
k.min | minimum value for k |
k.max | maximum value for k |
frac.miss | fraction of missing values to add |
n.iter | number of iterations for each k |
seed | random seed |
matrix of error distributions
Return an updated network contained in an InferenceEngine.
updated.bn(x) ## S4 method for signature 'InferenceEngine' updated.bn(x)
x | an |
the updated BN object contained in an InferenceEngine.
Add an updated network to an InferenceEngine.
updated.bn(x) <- value ## S4 replacement method for signature 'InferenceEngine' updated.bn(x) <- value
x | an |
value | the updated |
Get the list of variables (with their names) of a BN or BNDataset.
variables(x) ## S4 method for signature 'BN' variables(x) ## S4 method for signature 'BNDataset' variables(x)
x | an object. |
vector of the variable names of the desired object.
Set the list of variable names in a BN or BNDataset object.
variables(x) <- value ## S4 replacement method for signature 'BN' variables(x) <- value ## S4 replacement method for signature 'BNDataset' variables(x) <- value
x | an object. |
value | vector containing the variable names of the object. Overwrites |
Return the weighted partially directed acyclic graph of a network, when available (e.g. when bootstrap on dataset is performed).
wpdag(x) ## S4 method for signature 'BN' wpdag(x)
x | an object. |
matrix containing the WPDAG of the object.
Given a BN object with a dag, return a network with its wpdag set as the CPDAG computed starting from the dag.
wpdag.from.dag(x, layering = NULL, layer.struct = NULL) ## S4 method for signature 'BN' wpdag.from.dag(x, layering = NULL, layer.struct = NULL)
x | a |
layering | vector containing the layers each node belongs to. |
layer.struct | |
a BN object with an initialized wpdag.
## Not run:
net <- learn.network(dataset, layering=layering, layer.struct=layer.struct)
wp.net <- wpdag.from.dag(net, layering, layer.struct=layer.struct)
## End(Not run)
Set the weighted partially directed acyclic graph of a network (e.g. in case bootstrap on dataset is performed).
wpdag(x) <- value ## S4 replacement method for signature 'BN' wpdag(x) <- value
x | an object. |
value | matrix containing the WPDAG of the object. |
Write a network on disk, saving it in an XGMML file, for importing it in Cytoscape.
write_xgmml( x, path = "./network", write.wpdag = FALSE, node.col = rep("white", num.nodes(x)), frac = 0.2, max.weight = max(wpdag(x)) ) ## S4 method for signature 'BN' write_xgmml( x, path = "./network", write.wpdag = FALSE, node.col = rep("white", num.nodes(x)), frac = 0.2, max.weight = max(wpdag(x)) )
x | the |
path | file name with relative or absolute path to be written. |
write.wpdag | write the weighted PDAG computed using bootstrap samples or the MMPC structure algorithm, instead of the normal dag (default FALSE). |
node.col | vector of colors for each node of the network (in R colornames). |
frac | minimum fraction [0,1] of presence of an edge to be plotted (used in case of |
max.weight | maximum possible weight of an edge (used in case of |
Write a network on disk, saving it in a .dsc-formatted file.
write.dsc(x, path = "./") ## S4 method for signature 'BN' write.dsc(x, path = "./")
x | the |
path | the relative or absolute path of the directory of the created file. |