Bayesian Network Structure Learning

library(abn)

With this vignette we aim to provide a basic introduction to the structure learning of Bayesian networks with the abn package.

Structure Learning of Bayesian Networks

The structure learning of Bayesian networks is the process of estimating the (in-)dependencies between the variables of the network that results in a directed acyclic graph (DAG) where the nodes represent the variables and the edges represent the dependencies between the variables. Structure learning of Bayesian networks is a challenging problem and there are several algorithms to solve it (see Koller and Friedman (2009) for a comprehensive review).

The abn package currently offers four distinct algorithms for Bayesian network structure learning:

  • mostProbable(): This exact order-based structure learning algorithm identifies the most probable posterior DAG following the method of Koivisto and Sood (2004). For details see the help page of mostProbable().

  • searchHillClimber(): The Hill-climber algorithm is a single move algorithm. At each step, an arc is attempted to be added. The new score is computed and compared to the previous network’s score. It deviates slightly from the original method proposed by Heckerman, Geiger, and Chickering (1995) by utilizing a full cache of all possible local combinations as provided by buildScoreCache(). The algorithm considers all potential single swaps in the DAG at each step, selecting the swap that maximizes the goodness of fit. While multiple arc changes are considered at each step, arc reversal requires two steps. This implementation (in C), which is more intuitive with a pre-computed cache of local scores, is optimized for the abn workflow. For details see the help page of searchHillClimber().

  • searchHeuristic(): This function is a flexible implementation of multiple greedy heuristic algorithms, particularly well adapted to the abn framework. It targets multi-random restarts heuristic algorithms and allows the user to select the number of searches and the maximum number of steps within each search. The function implements three algorithms selected with the parameter algo: hc, tabu, or sa.

    • algo=hc: This alternative implementation of the greedy hill-climbing approach that is fully written in R, unlike searchHillClimber() and mostProbable() which are written in C. It performs a local stepwise optimization, choosing a structural move and computing the score’s change. This function is closer to the MCMCMC algorithm from Madigan, York, and Allard (1995) and Paolo Giudici and Roberto Castello (2003) with a single edge move.

    • algo=tabu: The same algorithm as hc is used, but a list of banned moves is computed. The parameter tabu.memory controls the length of the tabu list. This forces the algorithm to escape the local maximum by choosing some not already used moves.

    • algo=sa: This variant of the heuristic search algorithm is based on simulated annealing described by Metropolis et al. (1953). Some accepted moves could result in a decrease of the network score. The acceptance rate can be monitored with the parameter temperature.

For more information, refer to the help page searchHeuristic().

References

Heckerman, David, Dan Geiger, and David M. Chickering. 1995. “Learning Bayesian Networks: The Combination of Knowledge and Statistical Data.” Machine Learning 20 (3): 197–243. https://doi.org/10.1023/A:1022623210503.
Koivisto, Mikko, and Kismat Sood. 2004. “Exact Bayesian Structure Discovery in Bayesian Networks.” Journal of Machine Learning Research, 25.
Koller, Daphne, and Nir Friedman. 2009. Probabilistic Graphical Models: Principles and Techniques. Adaptive Computation and Machine Learning. Cambridge, MA: MIT Press.
Madigan, David, Jeremy York, and Denis Allard. 1995. “Bayesian Graphical Models for Discrete Data.” International Statistical Review / Revue Internationale de Statistique 63 (2): 215. https://doi.org/10.2307/1403615.
Metropolis, Nicholas, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. 1953. “Equation of State Calculations by Fast Computing Machines.” The Journal of Chemical Physics 21 (6): 1087–92. https://doi.org/10.1063/1.1699114.
Paolo Giudici, and Roberto Castello. 2003. “Improving Markov Chain Monte Carlo Model Search for Data Mining.” Machine Learning 50: 32.