Title: | Create Visualisations for BART Models |
---|---|
Description: | Investigating and visualising Bayesian Additive Regression Tree (BART) (Chipman, H. A., George, E. I., & McCulloch, R. E. 2010) <doi:10.1214/09-AOAS285> model fits. We construct conventional plots to analyze a model’s performance and stability as well as create new tree-based plots to analyze variable importance, interaction, and tree structure. We employ Value Suppressing Uncertainty Palettes (VSUP) to construct heatmaps that display variable importance and interactions jointly, using a colour scale to represent posterior uncertainty. Our visualisations are designed to work with the most popular BART R packages available, namely 'BART' (Rodney Sparapani, Charles Spanbauer and Robert McCulloch 2021) <doi:10.18637/jss.v097.i01>, 'dbarts' (Vincent Dorie 2023) <https://CRAN.R-project.org/package=dbarts>, and 'bartMachine' (Adam Kapelner and Justin Bleich 2016) <doi:10.18637/jss.v070.i04>. |
Authors: | Alan Inglis [aut, cre], Andrew Parnell [aut], Catherine Hurley [aut], Claus Wilke [ctb] (Developer of VSUP script) |
Maintainer: | Alan Inglis <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.1 |
Built: | 2024-11-22 06:53:35 UTC |
Source: | CRAN |
Plots the acceptance rate of trees from a BART model.
acceptRate(trees)
trees |
A data frame created by the extractTreeData function. The plot includes a division separating pre- and post-burn-in iterations. |
A ggplot object plot of acceptance rate.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  acceptRate(trees = trees_data)
}
Displays a selection of diagnostic plots for a BART model.
bartClassifDiag( model, data, response, threshold = "Youden", pNorm = FALSE, showInterval = TRUE, combineFactors = FALSE )
model |
a model created from either the BART, dbarts, or bartMachine package. |
data |
A dataframe |
response |
The name of the response for the fit. |
threshold |
A dashed line on some plots to indicate a chosen threshold value. By default, the Youden index is shown. |
pNorm |
Apply pnorm to the y-hat data. |
showInterval |
Logical. If TRUE, show the 5% and 95% quantile intervals. |
combineFactors |
Whether or not to combine dummy variables (if present) in display. |
A selection of diagnostic plots
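The default `threshold = "Youden"` refers to the Youden index, J = sensitivity + specificity - 1, maximised over candidate cut-offs. The base R sketch below illustrates that idea on made-up probabilities. The helper name `youden_threshold`, the toy data, and the `>=` classification rule are assumptions for illustration and are not taken from the package's internals.

```r
# Illustrative helper (not part of bartMan): pick the cut-off that
# maximises Youden's J = sensitivity + specificity - 1.
youden_threshold <- function(prob, truth) {
  cuts <- sort(unique(prob))
  j <- vapply(cuts, function(cut) {
    pred <- prob >= cut                       # classify as positive at or above the cut
    sens <- sum(pred & truth) / sum(truth)    # true positive rate
    spec <- sum(!pred & !truth) / sum(!truth) # true negative rate
    sens + spec - 1
  }, numeric(1))
  cuts[which.max(j)]
}

# Made-up predicted probabilities and true classes
prob  <- c(0.1, 0.2, 0.3, 0.45, 0.5, 0.7, 0.8, 0.9)
truth <- c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)

best_cut <- youden_threshold(prob, truth)
print(best_cut)
```

A numeric value could presumably be supplied to `threshold` instead when a fixed cut-off is preferred; check the function's behaviour before relying on that.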
Displays a selection of diagnostic plots for a BART model.
bartDiag( model, data, response, burnIn = 0, threshold = "Youden", pNorm = FALSE, showInterval = TRUE, combineFactors = FALSE )
model |
A model created from either the BART, dbarts, or bartMachine package. |
data |
A dataframe used to build the model. |
response |
The name of the response for the fit. |
burnIn |
The trace plot shows only iterations above the selected burn-in value. |
threshold |
A dashed line on some plots to indicate a chosen threshold value (classification only). By default, the Youden index is shown. |
pNorm |
Apply pnorm to the y-hat data (classification only). |
showInterval |
Logical. If TRUE, show the 5% and 95% quantile intervals on the ROC and PC curves (classification only). |
combineFactors |
Whether or not to combine dummy variables (if present) in display. |
A selection of diagnostic plots.
# For Regression
# Generate Friedman data
fData <- function(n = 200, sigma = 1.0, seed = 1701, nvar = 5) {
  set.seed(seed)
  x <- matrix(runif(n * nvar), n, nvar)
  colnames(x) <- paste0("x", 1:nvar)
  Ey <- 10 * sin(pi * x[, 1] * x[, 2]) + 20 * (x[, 3] - 0.5)^2 + 10 * x[, 4] + 5 * x[, 5]
  y <- rnorm(n, Ey, sigma)
  data <- as.data.frame(cbind(x, y))
  return(data)
}
f_data <- fData(nvar = 10)
x <- f_data[, 1:10]
y <- f_data$y
# Create dbarts model
library(dbarts)
set.seed(1701)
dbartModel <- bart(x, y, ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
bartDiag(model = dbartModel, response = "y", burnIn = 100, data = f_data)
# For Classification
data(iris)
iris2 <- iris[51:150, ]
iris2$Species <- factor(iris2$Species)
# Create dbarts model
dbartModel <- bart(iris2[, 1:4], iris2[, 5], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
bartDiag(model = dbartModel, data = iris2, response = iris2$Species)
Displays a selection of diagnostic plots for a BART model.
bartRegrDiag(model, response, burnIn = 0, data, combineFactors = FALSE)
model |
A model created from either the BART, dbarts, or bartMachine package. |
response |
The name of the response for the fit. |
burnIn |
The trace plot shows only iterations above the selected burn-in value. |
data |
A dataframe used to build the model. |
combineFactors |
Whether or not to combine dummy variables (if present) in display. |
A selection of diagnostic plots
Reorders a list of tree structures based on the clustering of variables within each tree.
clusterTrees(tree_list)
tree_list |
A list of trees, where each tree is expected to have a 'var' column. |
A list of trees reordered based on the clustering of variables.
This function updates the 'var' column in the 'structure' component of the 'trees' list, replacing dummy variable names derived from factor variables with their original factor variable names.
combineDummy(trees)
trees |
A list containing at least two components: 'data' and 'structure'. 'data' should be a dataframe, and 'structure' a dataframe that includes a 'var' column. |
The function first identifies factor variables in 'trees$data', then checks each entry in 'trees$structure$var' for matches with these factor variables. If a match is found, indicating a dummy variable, the entry is replaced with the original factor variable name.
The modified 'trees' list with updated 'var' column entries in 'trees$structure'.
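The matching step described above can be sketched in base R. This is only an illustration: the helper `combine_dummy_names` is hypothetical, and it assumes that a dummy variable's name begins with the name of the factor it was derived from (for example `Species.setosa`), which may differ between BART packages.

```r
# Toy data with one numeric column and one factor column
toy_data <- data.frame(
  Sepal.Width = c(3.5, 3.0, 2.9),
  Species     = factor(c("setosa", "virginica", "setosa"))
)

# Toy 'var' column: NA marks a terminal node
split_vars <- c("Sepal.Width", "Species.setosa", NA, "Species.virginica")

# Hypothetical helper: replace dummy names with the original factor name,
# assuming dummy names start with the factor name.
combine_dummy_names <- function(vars, data) {
  factor_names <- names(data)[vapply(data, is.factor, logical(1))]
  for (f in factor_names) {
    is_dummy <- !is.na(vars) & startsWith(vars, f)
    vars[is_dummy] <- f
  }
  vars
}

combined <- combine_dummy_names(split_vars, toy_data)
print(combined)
```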
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Create Simple dbarts Model with Dummies
  set.seed(1701)
  dbartModel <- bart(iris[2:5], iris[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = iris)
  combined_trees <- combineDummy(trees = trees_data)
}
Creates a list of all tree attributes for a model created by either the BART, dbarts or bartMachine packages.
extractTreeData(model, data)
model |
Model created from either the BART, dbarts or bartMachine packages. |
data |
a data frame used to build the BART model. |
A list containing the extracted and processed tree data. This list includes:
Tree Data Frame: A data frame containing tree attributes.
Variable Name: The names of the variables used in building the model.
nMCMC: The total number of iterations (posterior draws) after burn-in.
nTree: The total number of trees grown in the sum-of-trees model.
nVar: The total number of covariates used in the model.
The object created by the 'extractTreeData' function encompasses these elements, facilitating detailed analysis and visualisation of BART model components.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
}
This function is internal and is used to compute the color of a stump for the purpose of legend display, based on the mean value relative to specified limits.
get_stump_colour_for_legend(lims, mean_value, palette)
lims |
A numeric vector of length 2 specifying the limits within which the mean value falls. |
mean_value |
The mean value for which the color needs to be determined. |
palette |
A character vector of colors representing the palette from which the color is selected. |
A character string specifying the color corresponding to the mean value.
Populates 'childLeft', 'childRight', and 'parent' columns in the dataset to establish parent-child relationships between nodes based on tree structure.
getChildren(data)
data |
A data frame with tree structure, including 'iteration', 'treeNum', 'node', and 'depth' columns, along with a 'terminal' indicator. |
The modified data frame with 'childLeft', 'childRight', and 'parent' columns added, detailing the tree's parent-child node relationships.
data("tree_data_example")
# Create Terminal Column
tree_data_example <- transform(tree_data_example, terminal = ifelse(is.na(var), TRUE, FALSE))
# Get depths
depthList <- lapply(
  split(tree_data_example, ~treeNum + iteration),
  function(x) cbind(x, depth = node_depth(x) - 1)
)
# Turn into data frame
tree_data_example <- dplyr::bind_rows(depthList, .id = "list_id")
# Add node number sequentially
tree_data_example$node <- with(
  tree_data_example,
  ave(seq_along(iteration), list(iteration, treeNum), FUN = seq_along)
)
# Get children
getChildren(data = tree_data_example)
This function determines which observations from a given dataset fall into which nodes of a tree, based on a tree structure defined in 'treeData'. The treeData object must include 'iteration', 'treeNum', 'var', and 'splitValue' columns.
getObservations(data, treeData)
data |
A data frame used to build BART model. |
treeData |
A data frame representing the tree structure, including the necessary columns 'iteration', 'treeNum', 'var', and 'splitValue'. |
A modified version of 'treeData' that includes two new columns: 'obsNode' and 'noObs'. 'obsNode' lists the observations falling into each node, and 'noObs' provides the count of observations for each node.
data("tree_data_example")
# Create Terminal Column
tree_data_example <- transform(tree_data_example, terminal = ifelse(is.na(var), TRUE, FALSE))
# Create Split Value Column
tree_data_example <- transform(tree_data_example, splitValue = ifelse(terminal == FALSE, value, NA_integer_))
# Get the observations
getObservations(data = input_data, treeData = tree_data_example)
Colourfan guide
guide_colourfan( title = waiver(), title.x.position = "top", title.y.position = "right", title.theme = NULL, title.hjust = 0.5, title.vjust = NULL, label = TRUE, label.theme = NULL, barwidth = NULL, barheight = NULL, nbin = 32, reverse = FALSE, order = 0, available_aes = c("colour", "color", "fill"), ... )
guide_colorfan( title = waiver(), title.x.position = "top", title.y.position = "right", title.theme = NULL, title.hjust = 0.5, title.vjust = NULL, label = TRUE, label.theme = NULL, barwidth = NULL, barheight = NULL, nbin = 32, reverse = FALSE, order = 0, available_aes = c("colour", "color", "fill"), ... )
title |
Title |
title.x.position |
Title x position |
title.y.position |
Title y position |
title.theme |
Title theme |
title.hjust |
Title hjust |
title.vjust |
Title vjust |
label |
Label |
label.theme |
Label theme |
barwidth |
Barwidth |
barheight |
Barheight |
nbin |
Number of bins |
reverse |
Reverse |
order |
order |
available_aes |
Available aesthetics |
... |
Extra parameters |
A 'grob' object representing a color fan. This 'grob' can be added to a grid-based plot or a ggplot2 object to visualize a range of colors in a fan-like structure. Each segment of the fan corresponds to a color specified in the 'colours' parameter, allowing for an intuitive representation of color gradients or palettes.
Small example of Friedman data following the formula:
input_data
A data frame with 10 rows and 6 columns:
Covariate
Covariate
Covariate
Covariate
Covariate
Response
...
A variable selection approach performed by permuting the response.
localProcedure( model, data, response, numRep = 10, numTreesRep = NULL, alpha = 0.5, shift = FALSE )
model |
Model created from either the BART, dbarts or bartMachine packages. |
data |
A data frame containing variables in the model. |
response |
The name of the response for the fit. |
numRep |
The number of replicates to perform for the BART null model's variable inclusion proportions. |
numTreesRep |
The number of trees to be used in the replicates. As suggested by Chipman (2009), a small number of trees (~20) is recommended to force important variables to be used in the model. If NULL, the number of trees from the true model is used. |
alpha |
The cut-off level for the thresholds. |
shift |
Whether to shift the inclusion proportion points by the difference in distance between the quantile and the value of the inclusion proportion point. |
A variable selection plot using the local procedure method.
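One common reading of a "local" threshold is that each variable is compared with a quantile of its own null inclusion proportions across the permuted-response replicates. The base R sketch below illustrates that reading with made-up numbers. It assumes the threshold is the `1 - alpha` quantile, which is an assumption about the procedure rather than something stated here, and every object name is hypothetical.

```r
# Made-up inclusion proportions from the fitted model
observed_ip <- c(x1 = 0.60, x2 = 0.30, x3 = 0.10)

# Made-up null inclusion proportions: one row per replicate (numRep = 5),
# one column per variable
null_ip <- matrix(
  c(0.30, 0.35, 0.35,
    0.35, 0.30, 0.35,
    0.40, 0.30, 0.30,
    0.32, 0.34, 0.34,
    0.33, 0.33, 0.34),
  nrow = 5, byrow = TRUE,
  dimnames = list(NULL, c("x1", "x2", "x3"))
)

alpha <- 0.5

# Assumed rule: threshold is the (1 - alpha) quantile of each variable's null IPs
thresholds <- apply(null_ip, 2, function(col) unname(quantile(col, probs = 1 - alpha)))

# Keep variables whose observed IP exceeds their own threshold
selected <- names(observed_ip)[observed_ip > thresholds]
print(thresholds)
print(selected)
```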
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # 'response' has no default in the usage above, so it is supplied explicitly
  localProcedure(model = dbartModel, data = df, response = "Ozone", numRep = 5, numTreesRep = 5, alpha = 0.5, shift = FALSE)
}
Multi-dimensional Scaling Plot of proximity matrix from a BART model.
mdsBart( trees, data, target, response, plotType = "rows", showGroup = TRUE, level = 0.95 )
trees |
A data frame created by 'extractTreeData' function. |
data |
a dataframe used in building the model. |
target |
A target proximity matrix to |
response |
The name of the response for the fit. |
plotType |
Type of plot to show. One of: 'interactive', showing interactive confidence ellipses; 'point', a point plot showing the average position of an observation; 'rows', displaying observation numbers at their average positions instead of points; 'all', showing all observations (not averaged). |
showGroup |
Logical. Show confidence ellipses. |
level |
The confidence level to show. Default is 95% confidence level. |
For this function, the MDS coordinates are calculated for each iteration. The Procrustes method is then applied to align each set of coordinates to a target set of coordinates. The returned result is a clustered average of each point.
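The alignment step can be illustrated with base R alone. The sketch below computes classical MDS coordinates with `stats::cmdscale`, rotates them by a known angle, and recovers the original configuration with an orthogonal Procrustes rotation obtained from an SVD. The helper `procrustes_rotate` and the toy points are assumptions for illustration and are not the package's implementation, which may also handle translation and scaling.

```r
set.seed(1)

# Toy configuration: 10 points in 2 dimensions
pts <- matrix(rnorm(20), ncol = 2)

# Target coordinates from classical MDS (already centred at the origin)
target <- cmdscale(dist(pts), k = 2)

# A second configuration: the same points rotated by 60 degrees
theta <- pi / 3
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
moved <- target %*% rot

# Hypothetical helper: orthogonal Procrustes rotation of x onto target.
# The rotation Q minimising ||x Q - target|| is U V' from svd(t(x) %*% target).
procrustes_rotate <- function(x, target) {
  s <- svd(crossprod(x, target))
  x %*% (s$u %*% t(s$v))
}

aligned <- procrustes_rotate(moved, target)

# Largest absolute difference from the target after alignment
print(max(abs(aligned - target)))
```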
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1],
    ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10
  )
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  # Create Proximity Matrix
  bmProx <- proximityMatrix(
    trees = trees_data, reorder = TRUE, normalize = TRUE, iter = 1
  )
  # MDS plot
  mdsBart(
    trees = trees_data, data = df, target = bmProx,
    plotType = "interactive", level = 0.25, response = "Ozone"
  )
}
Computes the depth of each node in a given tree data frame, assuming a binary tree structure. Requires the tree data frame to contain a logical column 'terminal' indicating terminal nodes.
node_depth(tree)
tree |
A data frame representing a tree, must contain a 'terminal' column. |
A vector of depths corresponding to each node in the tree.
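As a rough illustration of how depths can be recovered from a 'terminal' indicator alone, the base R sketch below walks a binary tree whose nodes are listed in depth-first (pre-order) order. The pre-order assumption, the helper name `preorder_depth`, and the convention that the root has depth 0 are all assumptions made for this example; the package's own function may use a different ordering or offset.

```r
# Hypothetical helper: depth of each node in a binary tree stored in
# pre-order, given only whether each node is terminal.
preorder_depth <- function(terminal) {
  n <- length(terminal)
  depth <- integer(n)
  pending <- integer(0) # depths of right children still waiting to be visited
  current <- 0L         # depth of the node being visited
  for (i in seq_len(n)) {
    depth[i] <- current
    if (!terminal[i]) {
      # Internal node: its left child comes next, its right child waits
      pending <- c(pending, current + 1L)
      current <- current + 1L
    } else if (length(pending) > 0) {
      # Terminal node: the next node is the most recent waiting right child
      current <- pending[length(pending)]
      pending <- pending[-length(pending)]
    }
  }
  depth
}

# Root -> (internal -> (leaf, leaf), leaf)
terminal <- c(FALSE, FALSE, TRUE, TRUE, TRUE)
depths <- preorder_depth(terminal)
print(depths)
```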
data("tree_data_example")
# Create Terminal Column
tree_data_example <- transform(tree_data_example, terminal = ifelse(is.na(var), TRUE, FALSE))
# Get depths
depthList <- lapply(
  split(tree_data_example, ~treeNum + iteration),
  function(x) cbind(x, depth = node_depth(x) - 1)
)
# Turn into data frame
tree_data_example <- dplyr::bind_rows(depthList, .id = "list_id")
Returns a palette function that turns 'v' (value) and 'u' (uncertainty) (both between 0 and 1) into colors.
pal_vsup( values, unc_levels = 4, max_light = 0.9, max_desat = 0, pow_light = 0.8, pow_desat = 1 )
values |
Color values to be used at minimum uncertainty. Needs to be a vector of length '2^unc_levels'. |
unc_levels |
Number of discrete uncertainty levels. The number of discrete colors at each level doubles. |
max_light |
Maximum amount of lightening |
max_desat |
Maximum amount of desaturation |
pow_light |
Power exponent of lightening |
pow_desat |
Power exponent of desaturation |
A function that takes two parameters, 'v' (value) and 'u' (uncertainty), both expected to be in the range of 0 to 1, and returns a color. This color is determined by the specified 'values' colors at minimum uncertainty, and modified according to the given 'v' and 'u' parameters to represent uncertainty by adjusting lightness and saturation. The resulting function is useful for creating color palettes that can encode both value and uncertainty in visualizations.
A variable selection approach which creates a null model by permuting the response, rebuilding the model, and calculating the inclusion proportion (IP) on the null model. The final result displayed is the original model's IP minus the null IP.
permVimp(model, data, response, numTreesPerm = NULL, plotType = "barplot")
model |
Model created from either the BART, dbarts or bartMachine packages. |
data |
A data frame containing variables in the model. |
response |
The name of the response for the fit. |
numTreesPerm |
The number of trees to be used in the null model. As suggested by Chipman (2009), a small number of trees (~20) is recommended to force important variables to be used in the model. If NULL, the number of trees from the true model is used. |
plotType |
Either a bar plot ('barplot') or a point plot ('point') |
A variable selection plot.
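The quantity being plotted, the original model's inclusion proportion minus the null model's, can be sketched in base R. Here the inclusion proportion is taken to be the share of splitting rules that use each variable, and the split variables are made up rather than extracted from fitted models. The helper `inclusion_proportion` is hypothetical.

```r
# Hypothetical helper: share of splitting rules that use each variable
inclusion_proportion <- function(split_vars, var_names) {
  counts <- table(factor(split_vars, levels = var_names))
  setNames(as.numeric(counts) / sum(counts), var_names)
}

var_names <- c("x1", "x2", "x3")

# Made-up split variables from the original model and from a null model
# fitted to a permuted response
original_splits <- c("x1", "x1", "x2", "x1", "x3", "x2")
null_splits     <- c("x1", "x2", "x3", "x3", "x2", "x1")

ip_original <- inclusion_proportion(original_splits, var_names)
ip_null     <- inclusion_proportion(null_splits, var_names)

# Positive values suggest a variable is used more than under the null
ip_difference <- ip_original - ip_null
print(round(ip_difference, 3))
```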
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1],
    ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10
  )
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  permVimp(model = dbartModel, data = df, response = 'Ozone', numTreesPerm = 2, plotType = 'point')
}
A variable interaction evaluation which creates a null model by permuting the response, rebuilding the model, and calculating the inclusion proportion (IP) of adjacent splits on the null model. The final result displayed is the original model's IP minus the null IP.
permVint(model, data, trees, response, numTreesPerm = NULL, top = NULL)
model |
Model created from either the BART, dbarts or bartMachine packages. |
data |
A data frame containing variables in the model. |
trees |
A data frame created by extractTreeData function. |
response |
The name of the response for the fit. |
numTreesPerm |
The number of trees to be used in the null model. As suggested by Chipman (2009), a small number of trees (~20) is recommended to force important variables to be used in the model. If NULL, the number of trees from the true model is used. |
top |
Display only the top X interactions. |
A variable interaction plot. Note that for a dbarts fit, due to the internal workings of dbarts, the null model is hard-coded to 20 trees, a burn-in of 100, and 1000 iterations. Both a BART and bartMachine null model will extract the identical parameters from the original model.
Plot a proximity matrix
plotProximity( matrix, pal = rev(colorspace::sequential_hcl(palette = "Blues 2", n = 100)), limit = NULL )
matrix |
A matrix of proximities created by the proximityMatrix function |
pal |
A vector of colours to show proximity scores, for use with scale_fill_gradientn. |
limit |
Specifies the fit range for the color map for proximity scores. |
A plot of proximity values.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  # Create Proximity Matrix
  mProx <- proximityMatrix(trees = trees_data, reorder = TRUE, normalize = TRUE, iter = 1)
  # Plot
  plotProximity(matrix = mProx)
}
Plots individual trees.
plotSingleTree(trees, iter = 1, treeNo = 1, plotType = "icicle")
trees |
A data frame created by the extractTreeData function. |
iter |
The MCMC iteration or chain to plot. |
treeNo |
The tree number to plot. |
plotType |
The type of plot to display. Either 'dendrogram' or 'icicle'. |
A plot of an individual tree
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1],
    ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10
  )
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  plotSingleTree(trees = trees_data, iter = 1, treeNo = 1)
}
This function plots trees from a list of tidygraph objects. It allows for various customisations such as fill colour based on node response or value, node size adjustments, and color palettes.
plotTrees( trees, iter = NULL, treeNo = NULL, fillBy = NULL, sizeNodes = FALSE, removeStump = FALSE, selectedVars = NULL, pal = rev(colorRampPalette(c("steelblue", "#f7fcfd", "orange"))(5)), center_Mu = TRUE, cluster = NULL )
trees |
A data frame of trees. |
iter |
An integer specifying the iteration number of trees to be included in the output. If NULL, trees from all iterations are included. |
treeNo |
An integer specifying the number of the tree to include in the output. If NULL, all trees are included. |
fillBy |
A character string specifying the attribute to color nodes by. Options are 'response' for coloring nodes based on their mean response values or 'mu' for coloring nodes based on their predicted value, or NULL for no specific fill attribute. |
sizeNodes |
A logical value indicating whether to adjust node sizes. If TRUE, node sizes are adjusted; if FALSE, all nodes are given the same size. |
removeStump |
A logical value. If TRUE, then stumps are removed from plot. |
selectedVars |
A vector of selected variables to display. Either a character vector of names or the variables column number. |
pal |
A colour palette for node colouring. Palette is used when 'fillBy' is specified for gradient colouring. |
center_Mu |
A logical value indicating whether to center the color scale for the 'mu' attribute around zero. Applicable only when 'fillBy' is set to "mu". |
cluster |
A character string that specifies the criterion for reordering trees in the output. Currently supports "depth" for ordering by the maximum depth of nodes, and "var" for a clustering based on variables. If NULL, no reordering is performed. |
A ggplot object representing the plotted trees with the specified customisations.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1],
    ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10
  )
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  plotTrees(trees = trees_data, fillBy = 'response', sizeNodes = TRUE)
}
This function hides parts of the object from the printout; the hidden parts remain accessible via indexing.
## S3 method for class 'hideHelper1'
print(x, ...)
x |
A data frame of trees |
... |
Extra parameters |
No return value; this function is called for its side effect of printing a formatted summary of the tree data frame. It displays parts of the data frame, such as the tree structure and various counts (like number of MCMC iterations, number of trees, and number of variables), while keeping the complete data accessible via indexing.
Creates a matrix of proximity values.
proximityMatrix(trees, nRows, normalize = TRUE, reorder = TRUE, iter = NULL)
trees |
A list of tree attributes created by 'extractTreeData' function. |
nRows |
Number of rows to consider. |
normalize |
Default is TRUE. Divide the total number of pairs of observations by the number of trees. |
reorder |
Default is TRUE. Whether to sort the matrix so high values are pushed to top left. |
iter |
Which iteration to use. If NULL, the proximity matrix is calculated over all iterations. |
A matrix containing proximity values.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  # Create Proximity Matrix
  mProx <- proximityMatrix(trees = trees_data, reorder = TRUE, normalize = TRUE, iter = 1)
}
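As a rough illustration of what a proximity value measures, the sketch below counts how often two observations fall in the same terminal node and divides by the number of trees. The input matrix `leaf` and the helper `prox_sketch` are hypothetical; this is the usual tree-proximity idea written in base R under that assumption, not the package's own code.

```r
# Hypothetical input: leaf[i, t] is the terminal node of observation i in tree t
leaf <- matrix(c(1, 1, 2,
                 1, 1, 3,
                 2, 1, 3,
                 2, 2, 3), nrow = 4, byrow = TRUE)

prox_sketch <- function(leaf, normalize = TRUE) {
  n <- nrow(leaf)
  prox <- matrix(0, n, n)
  for (t in seq_len(ncol(leaf))) {
    # Add 1 for every pair of observations sharing a terminal node in tree t
    prox <- prox + outer(leaf[, t], leaf[, t], "==")
  }
  # Optionally scale the counts by the number of trees
  if (normalize) prox <- prox / ncol(leaf)
  prox
}

p <- prox_sketch(leaf)
```

With `normalize = TRUE` each entry lies between 0 and 1, and the diagonal is 1 because every observation shares a node with itself in every tree.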
Constructor for bivariate range object
bivariate_range()
An object of class RangeBivariate (inherits from Range, ggproto, gg) of length 2.
Constructor for bivariate scale object
bivariate_scale( aesthetics, palette, name = waiver(), breaks = waiver(), labels = waiver(), limits = NULL, rescaler = scales::rescale, oob = scales::censor, expand = waiver(), na.value = NA_real_, trans = "identity", guide = "none", super = ScaleBivariate, scale_name = "bivariate_scale" )
aesthetics |
The names of the aesthetics that this scale works with. |
palette |
A palette function that when called with a numeric vector with
values between 0 and 1 returns the corresponding output values
(e.g., |
name |
The name of the scale. Used as the axis or legend title. If
|
breaks |
One of:
|
labels |
One of:
|
limits |
Data frame with two columns of length two each defining the limits for the two data dimensions. |
rescaler |
Either one rescaling function applied to both data dimensions or list of two rescaling functions, one for each data dimension. |
oob |
One of:
|
expand |
For position scales, a vector of range expansion constants used to add some
padding around the data to ensure that they are placed some distance
away from the axes. Use the convenience function |
na.value |
Missing values will be replaced with this value. |
trans |
Either one transformation applied to both data dimensions or list of two transformations, one for each data dimension. Transformations can be given as either the name of a transformation object or the object itself. See 'ggplot2::continuous_scale()' for details. |
guide |
A function used to create a guide or its name. See
|
super |
The super class to use for the constructed scale |
scale_name |
The name of the scale that should be used for error messages associated with this scale. |
An object of class ScaleBivariate (inherits from Scale, ggproto, gg) of length 15.
Sort Trees by Maximum Depth
sort_trees_by_depthMax(tree_list)
tree_list |
List of 'tbl_graph' trees. |
Sorted list of 'tbl_graph' trees by decreasing maximum depth.
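The ordering step can be sketched in base R. Here plain data frames with a `depth` column stand in for the 'tbl_graph' trees, which is an assumption made so the example has no dependencies; the tree names `a`, `b` and `c` are hypothetical.

```r
# Hypothetical stand-ins for trees: one data frame of nodes per tree
trees <- list(
  a = data.frame(node = 1:3, depth = c(0, 1, 1)),
  b = data.frame(node = 1:5, depth = c(0, 1, 1, 2, 2)),
  c = data.frame(node = 1,   depth = 0)
)

# Maximum node depth of each tree
max_depth <- vapply(trees, function(tr) max(tr$depth), numeric(1))

# Reorder the list so the deepest tree comes first
sorted <- trees[order(max_depth, decreasing = TRUE)]
names(sorted)
```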
Density plots of the split value for each variable.
splitDensity( trees, data, bandWidth = NULL, panelScale = NULL, scaleFactor = NULL, display = "histogram" )
trees |
A list of tree attributes created using the extractTreeData function. |
data |
Data frame containing variables from the model. |
bandWidth |
Bandwidth used for density calculation. If not provided, is estimated from the data. |
panelScale |
If TRUE, the default, relative scaling is calculated separately for each panel. If FALSE, relative scaling is calculated globally. |
scaleFactor |
A scaling factor to scale the height of the ridgelines relative to the spacing between them. A value of 1 indicates that the maximum point of any ridgeline touches the baseline right above, assuming even spacing between baselines. |
display |
Choose how to display the plot. Either histogram, facet wrap, ridges or display both the split value and density of the predictor by using dataSplit. |
A faceted group of density plots
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  splitDensity(trees = trees_data, data = df, display = 'ridge')
}
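To show what a per-variable split-value density involves, the following base R sketch estimates one kernel density per splitting variable. The `splits` data frame and its values are made up for illustration, and `stats::density()` is used here as a stand-in; how the package computes and draws its densities may differ.

```r
# Hypothetical split values recorded for two variables
splits <- data.frame(
  var   = c("Wind", "Wind", "Wind", "Temp", "Temp", "Temp"),
  value = c(7.4, 9.7, 11.5, 72, 79, 84),
  stringsAsFactors = FALSE
)

# One kernel density estimate per variable; bandwidth estimated from the data
dens <- lapply(split(splits$value, splits$var), density)

# A fixed bandwidth can be supplied instead, analogous to a bandWidth argument
dens_fixed <- lapply(split(splits$value, splits$var), density, bw = 1)
```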
Adds a boolean 'terminal' column to the dataset indicating whether each node is terminal.
terminalFunction(data)
data |
A data frame containing tree structure information with at least 'treeNum', 'iteration', and 'depth' columns. |
The modified data frame with an additional 'terminal' column.
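One way to derive a 'terminal' flag from the three required columns is sketched below. It assumes, and this is only an assumption, that nodes within each tree are listed in depth-first (pre-order) order, so a node is terminal when the next node is not deeper than it. The helper `is_terminal` is hypothetical and may not match the package's logic.

```r
# Hypothetical tree data: tree 1 has five nodes, tree 2 is a stump
tree <- data.frame(
  iteration = 1,
  treeNum   = c(1, 1, 1, 1, 1, 2),
  depth     = c(0, 1, 2, 2, 1, 0)
)

# A node is terminal if the following node (in pre-order) is not deeper;
# the last node of a tree is always terminal
is_terminal <- function(depth) {
  next_depth <- c(depth[-1], -Inf)
  next_depth <= depth
}

# Apply per tree within iteration; ave() returns 0/1, so compare with 1
tree$terminal <- ave(tree$depth, tree$iteration, tree$treeNum,
                     FUN = is_terminal) == 1
tree
```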
Train range for bivariate scale
train_bivariate(new, existing = NULL)
new |
New data on which to train. |
existing |
Existing range |
A tibble containing two columns, 'range1' and 'range2', each representing the trained continuous range based on the new and existing data. This function is used to update or define the scales of a bivariate analysis by considering both new input data and any existing range specifications.
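The range-training idea can be sketched without any dependencies. The helpers `train_range` and `train_bivariate_sketch` are hypothetical, and a base data frame is returned instead of a tibble; the sketch only shows how a range widens as new data are combined with an existing range.

```r
# Combine new values with an existing range and return the updated range
train_range <- function(new, existing = NULL) {
  if (is.null(new)) return(existing)
  range(c(new, existing), na.rm = TRUE)
}

# Train both data dimensions; 'new' is assumed to have two columns
train_bivariate_sketch <- function(new, existing = NULL) {
  data.frame(
    range1 = train_range(new[[1]], existing$range1),
    range2 = train_range(new[[2]], existing$range2)
  )
}

# First batch defines the ranges, second batch extends them
r <- train_bivariate_sketch(data.frame(a = c(2, 5, 3), b = c(0.1, 0.4, 0.2)))
r <- train_bivariate_sketch(data.frame(a = c(1, 4), b = c(0.3, 0.9)), existing = r)
r
```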
Small example of tree data, like that obtained when using 'extractTreeData()' function.
tree_data_example
A data frame with 14 rows and 4 columns representing the structure of trees:
Variable name used for splitting.
The value in a node (i.e., either the split value or leaf value).
Iteration Number.
Tree Number in the iteration.
...
This function takes raw data and a tree structure, then processes it to form a detailed and structured dataframe. The data is transformed to indicate terminal nodes, calculate leaf values, and determine split values. It then assigns labels, calculates node depth, and establishes hierarchical relationships within the tree. Additional metadata about the tree, such as maximum depth, parent and child node relationships, and observation nodes are also included. The final dataframe is organized and enriched with necessary attributes for further analysis.
tree_dataframe(data, trees, response = NULL)
data |
A dataframe containing the raw data used for building the tree. |
trees |
A dataframe representing the initial tree structure, including variables and values for splits. |
response |
Optional character string naming the response variable in your BART model. Including the response will remove it from the list elements 'Variable names' and 'nVar'. |
A list containing a detailed dataframe of the tree structure ('structure') with added information such as node depth, parent and child nodes, and observational data, along with meta-information about the tree like variable names ('varNames'), number of MCMC iterations ('nMCMC'), number of trees ('nTree'), and number of variables ('nVar').
data("input_data")
data("tree_data_example")
my_trees <- tree_dataframe(data = input_data, trees = tree_data_example, response = "y")
Generates a bar plot showing the frequency of different tree structures represented in a list of tree graphs. Optionally, it can filter to show only the top N trees and handle stump trees specially.
treeBarPlot(trees, iter = NULL, topTrees = NULL, removeStump = FALSE)
trees |
A list of tree graphs to display |
iter |
Optional; specifies the iteration to display. |
topTrees |
Optional; the number of top tree structures to display. If NULL, displays all. |
removeStump |
Logical; if TRUE, trees with no edges (stumps) are excluded from the display |
This function processes a list of tree structures to compute the frequency of each unique structure, represented by a bar plot. It has options to exclude stump trees (trees with no edges) and to limit the plot to the top N most frequent structures.
A 'ggplot' object representing the bar plot of tree frequencies.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  plot <- treeBarPlot(trees = trees_data, topTrees = 3, removeStump = TRUE)
}
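The frequency computation behind such a bar plot can be sketched as follows. Each tree is reduced to a text signature (the strings below are invented for illustration), stumps are dropped, and the most frequent structures are kept. This shows the counting idea only and assumes nothing about how the package encodes tree structures.

```r
# Hypothetical structure signatures, one per tree; "stump" marks a tree with no edges
sigs <- c("x1(x2,leaf)", "x1(leaf,leaf)", "stump",
          "x1(leaf,leaf)", "stump", "x1(leaf,leaf)")

remove_stump <- TRUE
top_trees <- 1

# Optionally drop stumps, then count each unique structure
kept <- if (remove_stump) sigs[sigs != "stump"] else sigs
freq <- sort(table(kept), decreasing = TRUE)

# Keep only the most frequent structures
top <- head(freq, top_trees)
top
```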
A plot of tree depth over iterations.
treeDepth(trees)
trees |
A list of tree attributes created using the extractTreeData function. |
A plot of average tree depth over iterations.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  treeDepth(trees = trees_data)
}
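The quantity being plotted can be approximated by hand: take the maximum depth of each tree, then average over the trees in each iteration. The `nodes` data frame below is hypothetical, and reading "average tree depth" as the mean of per-tree maximum depths is an assumption.

```r
# Hypothetical node table: two iterations with two trees each
nodes <- data.frame(
  iteration = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  treeNum   = c(1, 1, 1, 2, 1, 1, 1, 1, 1, 2),
  depth     = c(0, 1, 1, 0, 0, 1, 1, 2, 2, 0)
)

# Maximum depth of each tree within each iteration
max_depth <- aggregate(depth ~ iteration + treeNum, data = nodes, FUN = max)

# Average of those maxima per iteration
avg_depth <- tapply(max_depth$depth, max_depth$iteration, mean)
avg_depth
```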
This function takes a dataframe of trees, which is output from a BART model, and organizes it into a list of tree structures. It allows for filtering based on iteration number, tree number, and optionally reordering based on the maximum depth of nodes or variables.
treeList(trees, iter = NULL, treeNo = NULL)
trees |
A dataframe that contains the tree structures generated by a BART model. Expected columns include iteration, treeNum, parent, node, obsNode, |
iter |
An integer specifying the iteration number of trees to be included in the output. If NULL, trees from all iterations are included. |
treeNo |
An integer specifying the number of the tree to include in the output. If NULL, all trees are included. |
A list of tidygraph objects, each representing the structure of a tree. Each tidygraph object includes node and edge information necessary for visualisation.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  library(ggplot2)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  trees_list <- treeList(trees_data)
}
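The conversion from a flat tree data frame to one structure per tree can be sketched in base R using the `iteration`, `treeNum`, `parent` and `node` columns named above. Plain data frames of edges stand in for the tidygraph objects here, which is a simplification; the example values are invented.

```r
# Hypothetical tree data: tree 1 has a root with two children, tree 2 is a stump
nodes <- data.frame(
  iteration = 1,
  treeNum   = c(1, 1, 1, 2),
  node      = c(1, 2, 3, 1),
  parent    = c(NA, 1, 1, NA)
)

# One data frame of nodes per (iteration, tree) combination
tree_list <- split(nodes, list(nodes$iteration, nodes$treeNum), drop = TRUE)

# Edges are the parent -> node pairs; root nodes have no parent
edge_list <- lapply(tree_list, function(tr) {
  tr[!is.na(tr$parent), c("parent", "node")]
})

# Filtering to one tree mirrors the iter and treeNo arguments
one_tree <- nodes[nodes$iteration == 1 & nodes$treeNum == 1, ]
```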
A plot of number of nodes over iterations.
treeNodes(trees)
trees |
A list of tree attributes created using the extractTreeData function. |
A plot of the number of tree nodes over iterations.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  treeNodes(trees = trees_data)
}
A matrix with nMCMC rows with each variable as a column. Each row represents an MCMC iteration. For each variable, the total count of the number of times that variable is used in a tree is given.
vimpBart(trees, type = "prop")
trees |
A data frame created by 'extractTreeData' function. |
type |
What value to return. Either the raw count 'val', the proportion 'prop', the column means of the proportions 'propMean', or the median of the proportions 'propMedian'. |
A matrix of importance values
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  vimpBart(trees_data, type = 'prop')
}
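The relationship between the four `type` options can be worked through on a small count matrix. The counts below are invented, and the arithmetic is a plausible reading of the option names (row-wise proportions, then column means or medians) rather than a copy of the package's code.

```r
# Hypothetical counts: rows are MCMC iterations, columns are variables
counts <- matrix(c(2, 1, 1,
                   0, 3, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(NULL, c("Solar.R", "Wind", "Temp")))

# 'val': raw counts; 'prop': each row divided by its total
props <- counts / rowSums(counts)

# 'propMean' and 'propMedian': summaries of the proportions per variable
prop_mean   <- colMeans(props)
prop_median <- apply(props, 2, median)
prop_mean
```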
Plot the variable importance for a BART model with the 25% and 75% quantiles.
vimpPlot(trees, type = "prop", plotType = "barplot", metric = "median")
trees |
A data frame created by 'extractTreeData' function. |
type |
What value to return. Either the raw count 'count' or the proportions 'prop' averaged over iterations. |
plotType |
Which type of plot to return. Either a barplot 'barplot' with the quantiles shown as a line, a point plot with the quantiles shown as a gradient 'point', or a letter-value plot 'lvp'. |
metric |
Whether to show the 'mean' or 'median' importance values. Note, this has no effect when using plotType = 'lvp'. |
A plot of variable importance.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  vimpPlot(trees = trees_data, plotType = 'point')
}
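The summaries drawn by such a plot (a central value plus quantiles per variable) can be computed directly from a matrix of per-iteration proportions. The matrix below is simulated purely for illustration, and the variable names `x1` to `x4` are hypothetical.

```r
# Simulated inclusion proportions: 10 iterations by 4 variables, rows sum to 1
set.seed(1)
props <- matrix(runif(40), nrow = 10,
                dimnames = list(NULL, paste0("x", 1:4)))
props <- props / rowSums(props)

# Lower quartile, median and upper quartile for each variable
summ <- t(apply(props, 2, quantile, probs = c(0.25, 0.5, 0.75)))

# The mean is the alternative central value (metric = 'mean')
means <- colMeans(props)
summ
```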
Plot the pair-wise variable interaction inclusion proportions for a BART model with the 25% and 75% quantiles.
vintPlot(trees, plotType = "barplot", top = NULL)
trees |
A data frame created by 'extractTreeData' function. |
plotType |
Which type of plot to return. Either a barplot 'barplot' with the quantiles shown as a line, a point plot with the quantiles shown as a gradient 'point', or a letter-value plot 'lvp'. |
top |
Display only the top X metrics (does not apply to the letter-value plot). |
A plot of variable interactions.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  vintPlot(trees = trees_data, top = 5)
}
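A pair-wise inclusion proportion can be sketched by counting how often two variables appear as parent and child splits and dividing by the total number of such pairs. Treating a parent-child split pair as an "interaction" is an assumption made for this example, and the `edges` data frame is invented.

```r
# Hypothetical parent/child splitting variables taken from tree edges
edges <- data.frame(
  parent = c("Wind", "Temp", "Wind", "Solar.R"),
  child  = c("Temp", "Wind", "Solar.R", "Wind"),
  stringsAsFactors = FALSE
)

# Treat (a, b) and (b, a) as the same pair by sorting the two names
pair <- apply(edges, 1, function(e) paste(sort(e), collapse = ":"))

# Counts per unordered pair and the corresponding proportions
counts <- table(pair)
props  <- counts / sum(counts)
props
```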
Returns a list containing a dataframe of variable importance summaries and a dataframe of variable interaction summaries.
viviBart(trees, out = "vivi")
trees |
A data frame created by 'extractTreeData' function. |
out |
Choose to either output just the variable importance ('vimp'), the variable interaction ('vint'), or both ('vivi') (default). |
A list of dataframes of VIVI summaries.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  viviBart(trees = trees_data, out = 'vivi')
}
Returns a matrix or list of matrices. If type = 'standard', a matrix filled with vivi values is returned. If type = 'vsup', two matrices are returned: one with the actual values and another with uncertainty values. If type = 'quantiles', three matrices are returned, one each for the 25%, 50%, and 75% quantiles.
viviBartMatrix( trees, type = "standard", metric = "propMean", metricError = "CV", reorder = FALSE )
trees |
A data frame created by 'extractTreeData' function. |
type |
Which type of matrix to return. Either 'standard', 'vsup', 'quantiles' |
metric |
Which metric to use to fill the actual values matrix. Either 'propMean' or 'count'. |
metricError |
Which metric to use to fill the uncertainty matrix. Either 'SD', 'CV' or 'SE'. |
reorder |
LOGICAL. If TRUE then the matrix is reordered so high values are pushed to the top left. |
A matrix, or list of matrices, with variable importance on the diagonal and variable interaction on the off-diagonal.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  # VSUP Matrix
  vsupMat <- viviBartMatrix(trees = trees_data, type = 'vsup', metric = 'propMean', metricError = 'CV')
}
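The layout of such a matrix (importance on the diagonal, interactions mirrored off the diagonal) can be assembled by hand. The values in `vimp` and `vint` are invented, and the column names `var1`, `var2` and `value` are hypothetical; this is a sketch of the layout only.

```r
vars <- c("Wind", "Temp", "Solar.R")

# Hypothetical importance values and pair-wise interaction values
vimp <- c(Wind = 0.5, Temp = 0.3, Solar.R = 0.2)
vint <- data.frame(var1 = c("Wind", "Wind"),
                   var2 = c("Temp", "Solar.R"),
                   value = c(0.6, 0.4),
                   stringsAsFactors = FALSE)

# Start from zeros, put importance on the diagonal
mat <- matrix(0, length(vars), length(vars), dimnames = list(vars, vars))
diag(mat) <- vimp[vars]

# Fill both triangles so the interaction part is symmetric
for (k in seq_len(nrow(vint))) {
  mat[vint$var1[k], vint$var2[k]] <- vint$value[k]
  mat[vint$var2[k], vint$var1[k]] <- vint$value[k]
}
mat
```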
Plots a Heatmap showing variable importance on the diagonal and variable interaction on the off-diagonal with uncertainty included.
viviBartPlot( matrix, intPal = NULL, impPal = NULL, intLims = NULL, impLims = NULL, uncIntLims = NULL, uncImpLims = NULL, unc_levels = 4, max_desat = 0.6, pow_desat = 0.2, max_light = 0.6, pow_light = 1, angle = 0, border = FALSE, label = NULL )
matrix |
Matrices, such as that returned by viviBartMatrix, of values to be plotted. |
intPal |
A vector of colours to show interactions, for use with scale_fill_gradientn. Palette number has to be 2^x/2 |
impPal |
A vector of colours to show importance, for use with scale_fill_gradientn. Palette number has to be 2^x/2 |
intLims |
Specifies the fit range for the color map for interaction strength. |
impLims |
Specifies the fit range for the color map for importance. |
uncIntLims |
Specifies the fit range for the color map for interaction strength uncertainties. |
uncImpLims |
Specifies the fit range for the color map for importance uncertainties. |
unc_levels |
The number of uncertainty levels |
max_desat |
The maximum desaturation level. |
pow_desat |
The power of desaturation level. |
max_light |
The maximum light level. |
pow_light |
The power of light level. |
angle |
The angle to rotate the x-axis labels. Defaults to zero. |
border |
Logical. If TRUE then draw a black border around the diagonal elements. |
label |
legend label for the uncertainty measure. |
Either a heatmap, VSUP, or quantile heatmap plot.
if (requireNamespace("dbarts", quietly = TRUE)) {
  # Load the dbarts package to access the bart function
  library(dbarts)
  # Get Data
  df <- na.omit(airquality)
  # Create Simple dbarts Model For Regression:
  set.seed(1701)
  dbartModel <- bart(df[2:6], df[, 1], ntree = 5, keeptrees = TRUE, nskip = 10, ndpost = 10)
  # Tree Data
  trees_data <- extractTreeData(model = dbartModel, data = df)
  # VSUP Matrix
  vsupMat <- viviBartMatrix(trees = trees_data, type = 'vsup', metric = 'propMean', metricError = 'CV')
  # Plot
  viviBartPlot(vsupMat, label = 'CV')
}
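To give a feel for the uncertainty side of the plot, the sketch below computes the three common error measures (SD, SE, CV) for each variable across iterations and then bins the CV into `unc_levels` equal-width levels. The proportions are invented, the formulas are the textbook definitions, and equal-width binning is a simplifying assumption; the actual VSUP palette construction is more involved.

```r
# Hypothetical inclusion proportions: rows are iterations, columns are variables
props <- matrix(c(0.50, 0.30, 0.20,
                  0.40, 0.35, 0.25,
                  0.60, 0.25, 0.15,
                  0.50, 0.30, 0.20),
                ncol = 3, byrow = TRUE,
                dimnames = list(NULL, c("Wind", "Temp", "Solar.R")))

# Standard deviation, standard error and coefficient of variation per variable
sd_val <- apply(props, 2, sd)
se_val <- sd_val / sqrt(nrow(props))
cv_val <- sd_val / colMeans(props)

# Bin the uncertainty into discrete levels (1 = least uncertain)
unc_levels <- 4
breaks <- seq(min(cv_val), max(cv_val), length.out = unc_levels + 1)
level <- cut(cv_val, breaks = breaks, include.lowest = TRUE, labels = FALSE)
level
```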