JAGStree: R code that writes ‘JAGS’ code

library(JAGStree)
#> Registered S3 method overwritten by 'mcmcplots':
#>   method        from  
#>   as.mcmc.rjags R2jags

Introduction

In many real-world applications, individuals or subjects appear in multiple data sets, which creates an opportunity to synthesize evidence and use all available data for inference. The relational structure of these data sources often admits a graphical representation, in which nodes represent data sources and edges represent the relationships between them (e.g., directed edges representing sub-groupings or the passage of time). One such representation is a tree (i.e., a cycle-free graph with directed edges). Inference on this structure can be performed using MCMC methods to estimate the posterior distributions of a Bayesian hierarchical model.

This package is geared towards one subset of cases in which this general framework holds (Flynn and Gustafson 2024a).

Making a ‘JAGS’ Model for Trees

In particular, we assume that a number of related data sources exist with a known relational structure described by a tree. We also assume that each node is associated with an integer count, which is multinomially distributed given its parent node, and that each branch is associated with a probability, with the probabilities within a sibling group jointly Dirichlet distributed. Then, given a data frame representing the tree structure, the makeJAGStree() function generates MCMC modeling code for use with ‘JAGS’ to fit a Bayesian hierarchical model.
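The generative assumptions above can be illustrated with a small base-R simulation. This is only a sketch under stated assumptions: the helper rdirichlet1(), the fixed root size of 1000, and the flat Dirichlet(1, 1) parameters are hypothetical choices for illustration, and the code is not the ‘JAGS’ model that makeJAGStree() writes.

```r
# Hypothetical helper (not part of JAGStree): draw one Dirichlet vector
# by normalizing independent gamma draws.
rdirichlet1 <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

set.seed(1)

# Toy tree: root Z has children A and B; A has children C and D.
root_size <- 1000  # assumed root count, fixed here for illustration

# Branch probabilities for the sibling group (A, B), then counts given Z.
p_Z <- rdirichlet1(c(1, 1))
n_Z <- as.vector(rmultinom(1, size = root_size, prob = p_Z))
names(n_Z) <- c("A", "B")

# Branch probabilities for the sibling group (C, D), then counts given A.
p_A <- rdirichlet1(c(1, 1))
n_A <- as.vector(rmultinom(1, size = n_Z[["A"]], prob = p_A))
names(n_A) <- c("C", "D")

# Within each sibling group, the child counts sum to the parent count.
sum(n_Z) == root_size
sum(n_A) == n_Z[["A"]]
```

In the package itself the root count is not fixed but given a prior (lognormal or uniform), and the unknown quantities are estimated by MCMC rather than simulated forward.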

The data frame must contain at least the following two columns:

  • ‘from’ (string, node label)

  • ‘to’ (string, node label)

The ‘from’ and ‘to’ labels encode the tree structure, describing the directed edge between nodes.
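Before generating model code, it can be useful to confirm that an edge list really describes a tree. The following is a minimal sketch of such a check; is_tree_edges() is a hypothetical helper written for this illustration and is not a function exported by JAGStree.

```r
# Hypothetical helper (not part of JAGStree): check that a from/to edge list
# describes a single rooted tree.
is_tree_edges <- function(edges) {
  stopifnot(is.data.frame(edges), all(c("from", "to") %in% names(edges)))
  from <- as.character(edges$from)
  to <- as.character(edges$to)

  # Each node may have at most one parent.
  if (anyDuplicated(to) > 0) return(FALSE)

  # Exactly one node must never appear as a child: the root.
  roots <- setdiff(unique(from), to)
  if (length(roots) != 1) return(FALSE)

  # Every node must be reachable from the root (rules out detached cycles).
  reached <- roots
  frontier <- roots
  while (length(frontier) > 0) {
    frontier <- setdiff(to[from %in% frontier], reached)
    reached <- c(reached, frontier)
  }
  setequal(reached, union(from, to))
}

edges_ok <- data.frame(from = c("Z", "Z", "A", "A"),
                       to = c("A", "B", "C", "D"))
edges_bad <- data.frame(from = c("Z", "Z", "A", "B"),
                        to = c("A", "B", "C", "C"))  # C has two parents

is_tree_edges(edges_ok)   # TRUE
is_tree_edges(edges_bad)  # FALSE
```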

The makeJAGStree() function has three parameters, two of which are required and one of which is optional:

  • data: This required argument must be a data frame with at least two columns: from and to. Together, these columns specify the tree structure. Other columns may also be included; the AutoWMM package can additionally be used to render tree diagrams and, if appropriate, to estimate the root node population size with the weighted multiplier method (Flynn, Gustafson, and Irvine 2024).

  • prior: This optional argument specifies the prior chosen for the root node population size. The default is lognormal, while a uniform prior is also an option.

  • filename: This required argument is used to name the ‘JAGS’ code output file. It must end in .mod or .txt for correct functionality.

The Bayesian model underlying this ‘JAGS’ code is described at length elsewhere (Flynn and Gustafson 2024a), and is based on previously developed methodology (Flynn and Gustafson 2024b). A more extensive application to real-world data can be found in Flynn, Gustafson, and Irvine (2024).

Examples

The functionality of the package is best demonstrated using simple trees. The AutoWMM package can also be used to help render and visualize each of these trees:

data1 <- data.frame("from" = c("Z", "Z", "A", "A"),
                    "to" = c("A", "B", "C", "D"),
                    "Estimate" = c(4, 34, 9, 1),
                    "Total" = c(11, 70, 10, 10),
                    "Count" = c(NA, 500, NA, 50),
                    "Population" = c(FALSE, FALSE, FALSE, FALSE),
                    "Description" = c("First child of the root", "Second child of the root",
                                      "First grandchild", "Second grandchild"))

# optional use of the AutoWMM package to show tree structure
Sys.setenv("RGL_USE_NULL" = TRUE)
library(AutoWMM)
library(DiagrammeR)
tree <- makeTree(data1)
drawTree(tree)

The function can be directly applied to the data, and will produce a .mod or .txt file, depending on which is specified. A uniform prior can also be chosen for the root node, with parameters that will be specified when running ‘JAGS’:

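# default (lognormal) prior on the root node, written to a .mod file
makeJAGStree(data1, filename=file.path(tempdir(), "data1_JAGSscript.mod"))
# uniform prior on the root node, written to a .txt file
makeJAGStree(data1, filename=file.path(tempdir(), "data1_JAGSscript.txt"), prior="uniform")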
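Since the output is a plain text file, the generated ‘JAGS’ code can be read back into R for inspection. The sketch below assumes that JAGStree is installed and that a data frame with only the from and to columns is accepted, as the argument description above suggests; the object names edges and out_file are hypothetical.

```r
# Assumes JAGStree is installed; `edges` and `out_file` are illustrative names.
library(JAGStree)

edges <- data.frame(from = c("Z", "Z", "A", "A"),
                    to = c("A", "B", "C", "D"))

# Write the model code to a temporary file with a supported extension.
out_file <- file.path(tempdir(), "edges_JAGSscript.mod")
makeJAGStree(edges, filename = out_file)

# Confirm the file was written, then print the generated model code.
file.exists(out_file)
writeLines(readLines(out_file))
```

Reviewing the printed model is a simple way to see which node and branch names the model expects before supplying data and prior parameters to ‘JAGS’.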

References

Flynn, M. J., and P. Gustafson. 2024a. “AutoWMM and JAGStree - R Packages for Population Estimation on Relational Tree-Structured Data.” [Manuscript in Preparation] Department of Statistics, University of British Columbia.
Flynn, M. J., and P. Gustafson. 2024b. “Leveraging Relational Evidence: Population Size Estimation on Tree-Structured Data with the Weighted Multiplier Method.” [Manuscript in Preparation] Department of Statistics, University of British Columbia.
Flynn, M. J., P. Gustafson, and M. A. Irvine. 2024. “Estimating the Number of Opioid Overdoses in British Columbia Using Relational Evidence with Tree Structure.” [Manuscript in Preparation] Department of Statistics, University of British Columbia.