--- title: "Graphical VAR" author: "Donny Williams" date: "6/04/2020" bibliography: ../inst/REFERENCES.bib output: rmarkdown::html_vignette: toc: yes vignette: > %\VignetteIndexEntry{Graphical VAR} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` # Introduction This vignette describes the implemention for a "graphical VAR" in `BGGM`. In `BGGM`, this is fitted as a multivariate regression. The key innovation is a novel prior distribution for the residual covariance matrix. There are a variety of much cooler names than a *mere* "multivariate regression", including "VAR" (vector autoregressive models) and "TSCGM" (time series chain graphical model). ## R package ``` # need the developmental version if (!requireNamespace("remotes")) { install.packages("remotes") } # install from github remotes::install_github("donaldRwilliams/BGGM") library(BGGM) # for comparsion library(vars) # for plotting library(qgraph) # combine plots library(cowplot) ``` ## Data I use data from the ifit (fit bit) study. The data were gathered over 100 consecutive days on a variety of variables, including the PANAS scale (positive and negative affect) and the number of steps each day. `BGGM` includes a subset of variables for two individuals. ```r # data Y <- subset(ifit, id == 1)[,-1] # first 3 rows head(Y, n = 3) #> interested disinterested excited upset strong stressed steps #> 72 10 50 2 50 16 7805 #> 75 6 75 0 76 0 18248 ># 36 58 38 5 45 1 12139 ``` # Estimation The methods in **BGGM** are organized around Bayesian "estimation" and "hypothesis testing". This is to reach a broader audience, as former is more similar to classical methods (those more familiar to researchers). ## Fit Model With the data in hand, the model is fitted as follows ``` # fit model fit <- var_estimate(Y, beta_sd = 1) ``` Note that `beta_sd` is the prior distribution for the regression coefficients. A smaller value, say, `beta_sd = 0.25`, results in a Bayesian ridge regression. Note also this model, including 5000 draws from the posterior, was estimated in less than 1 second. The results can then be printed ```r # print fit #> BGGM: Bayesian Gaussian Graphical Models #> --- #> Vector Autoregressive Model (VAR) #> --- #> Posterior Samples: 5000 #> Observations (n): 94 #> Nodes (p): 7 #> --- #> Call: #> var_estimate(Y = Y, beta_sd = 10) #> --- #> Partial Correlations: #> #> interested disinterested excited upset strong stressed steps #> interested 0.000 -0.170 0.388 -0.217 0.313 0.268 0.089 #> disinterested -0.170 0.000 -0.172 -0.029 0.094 0.160 -0.078 #> excited 0.388 -0.172 0.000 -0.126 0.500 -0.161 -0.016 #> upset -0.217 -0.029 -0.126 0.000 0.118 0.350 -0.039 #> strong 0.313 0.094 0.500 0.118 0.000 -0.010 0.176 #> stressed 0.268 0.160 -0.161 0.350 -0.010 0.000 -0.038 #> steps 0.089 -0.078 -0.016 -0.039 0.176 -0.038 0.000 #> --- #> Coefficients: #> #> interested disinterested excited upset strong stressed steps #> interested.l1 0.230 -0.009 0.182 -0.102 0.178 0.018 0.113 #> disinterested.l1 -0.051 -0.007 0.056 -0.019 0.049 0.091 -0.023 #> excited.l1 -0.088 -0.196 0.003 0.057 -0.093 0.092 0.106 #> upset.l1 -0.155 0.262 -0.097 0.435 0.057 0.324 -0.091 #> strong.l1 0.026 0.182 0.026 0.048 0.189 -0.073 -0.196 #> stressed.l1 -0.021 -0.014 -0.033 -0.048 -0.079 0.152 0.133 #> steps.l1 -0.157 0.180 -0.211 0.155 -0.092 0.209 0.042 #> --- #> Date: Thu Jun 04 08:54:04 2020 ``` Note that the coefficients are comparable, given each variable has been standardized (e.g., the predictors and the outcome are standardized). `BGGM` does not compute the partial directed correlation (PDC) by default (as in **graphicalVAR**). This is because the standardized effects can readily be tested with the Bayes factor, both across and within each model, whereas this does not seem straightforward for the PDC (which requires a transformation). ### Compare to Classical Here are the estimates from the `vars` package ```r t(round( vars::Bcoef( vars:::VAR(scale(na.omit(Y)), type = "none")), digits = 3) ) #> interested disinterested excited upset strong stressed steps #> interested.l1 0.229 -0.012 0.184 -0.100 0.180 0.015 0.112 #> disinterested.l1 -0.050 -0.006 0.057 -0.019 0.050 0.092 -0.022 #> excited.l1 -0.088 -0.193 0.002 0.056 -0.091 0.093 0.106 #> upset.l1 -0.155 0.260 -0.096 0.436 0.058 0.321 -0.092 #> strong.l1 0.027 0.182 0.025 0.047 0.188 -0.073 -0.192 #> stressed.l1 -0.021 -0.012 -0.033 -0.046 -0.077 0.152 0.133 #> steps.l1 -0.157 0.183 -0.210 0.153 -0.093 0.207 0.041 ``` Recall that the "estimation" methods are similar to, in this case, ordinary least squares. The graphical structure in `BGGM` is determined with credible intervals, which will be quite similar to using confidence intervals. Hence for those researchers unfamiliar with Bayesian methods the "estimation" methods are perhaps a nice place to start. ## Summarize Model The model can also be summarized with ```r print( summary(fit, cred = 0.95), param = "pcor" ) #> BGGM: Bayesian Gaussian Graphical Models #> --- #> Vector Autoregressive Model (VAR) #> --- #> Partial Correlations: #> #> Relation Post.mean Post.sd Cred.lb Cred.ub #> interested--disinterested -0.170 0.108 -0.382 0.044 #> interested--excited 0.388 0.085 0.219 0.546 #> disinterested--excited -0.172 0.104 -0.369 0.049 #> interested--upset -0.217 0.106 -0.417 0.000 #> disinterested--upset -0.029 0.101 -0.239 0.161 #> excited--upset -0.126 0.098 -0.315 0.066 #> interested--strong 0.313 0.090 0.135 0.480 #> disinterested--strong 0.094 0.112 -0.120 0.318 #> excited--strong 0.500 0.078 0.337 0.645 #> upset--strong 0.118 0.109 -0.100 0.325 #> interested--stressed 0.268 0.102 0.058 0.460 #> disinterested--stressed 0.160 0.100 -0.049 0.351 #> excited--stressed -0.161 0.099 -0.358 0.031 #> upset--stressed 0.350 0.091 0.166 0.519 #> strong--stressed -0.010 0.107 -0.212 0.201 #> interested--steps 0.089 0.108 -0.123 0.297 #> disinterested--steps -0.078 0.108 -0.284 0.125 #> excited--steps -0.016 0.100 -0.207 0.182 #> upset--steps -0.039 0.107 -0.245 0.178 #> strong--steps 0.176 0.101 -0.024 0.364 #> stressed--steps -0.038 0.108 -0.236 0.193 #> --- ``` The coefficients can also be printed by changing `param` to either `all` or `beta`, The summary can also be plotted. Here are the coefficients ```r plts <- plot(summary(fit, cred = 0.95)) cowplot::plot_grid( cowplot::plot_grid( plts$beta_plt$interested, plts$beta_plt$disinterested, plts$beta_plt$excited, nrow = 1), cowplot::plot_grid( plts$beta_plt$upset, plts$beta_plt$strong, plts$beta_plt$stressed, nrow = 1 ), nrow = 2) ``` ![](../man/figures/var_coef.png) There is a plot for the partial correlations in the object `plts`. ## Select Graph The graphs are selected with ```r select(fit, cred = 0.95) #> BGGM: Bayesian Gaussian Graphical Models #> --- #> Vector Autoregressive Model (VAR) #> --- #> Posterior Samples: 5000 #> Credible Interval: 95 % #> --- #> Call: #> var_estimate(Y = Y, beta_sd = 10) #> --- #> Partial Correlations: #> #> interested disinterested excited upset strong stressed steps #> interested 0.000 0 0.388 -0.217 0.313 0.268 0 #> disinterested 0.000 0 0.000 0.000 0.000 0.000 0 #> excited 0.388 0 0.000 0.000 0.500 0.000 0 #> upset -0.217 0 0.000 0.000 0.000 0.350 0 #> strong 0.313 0 0.500 0.000 0.000 0.000 0 #> stressed 0.268 0 0.000 0.350 0.000 0.000 0 #> steps 0.000 0 0.000 0.000 0.000 0.000 0 #> --- #> Coefficients: #> #> interested disinterested excited upset strong stressed steps #> interested.l1 0 0.000 0 0.000 0 0.000 0 #> disinterested.l1 0 0.000 0 0.000 0 0.000 0 #> excited.l1 0 0.000 0 0.000 0 0.000 0 #> upset.l1 0 0.262 0 0.435 0 0.324 0 #> strong.l1 0 0.000 0 0.000 0 0.000 0 #> stressed.l1 0 0.000 0 0.000 0 0.000 0 #> steps.l1 0 0.000 0 0.000 0 0.209 0 #> --- ``` # Plot Graph For plotting, I use the **qgraph** package. ```r par(mfrow=c(1,2)) qgraph::qgraph(sel$pcor_weighted_adj, title = "Partials") qgraph::qgraph(sel$beta_weighted_adj, title = "Coefficients") ``` ![](../man/figures/est_net.png) # Predictability Finally, it is also possible to compute predictability, in this case Bayesian $R^2$ ```r r2 <- predictability(fit) # print r2 #> BGGM: Bayesian Gaussian Graphical Models #> --- #> Metric: Bayes R2 #> Type: continuous #> --- #> Estimates: #> #> Node Post.mean Post.sd Cred.lb Cred.ub #> interested 0.144 0.057 0.050 0.271 #> disinterested 0.166 0.061 0.060 0.302 #> excited 0.127 0.054 0.039 0.250 #> upset 0.220 0.070 0.093 0.368 #> strong 0.116 0.051 0.035 0.232 #> stressed 0.227 0.069 0.102 0.373 #> steps 0.105 0.047 0.032 0.210 ``` The object `r2` can also be plotted ```r plot(r2, type = "ridgeline") ``` ![](../man/figures/var_ridgeline.png) # Explore Bayesian (exploratory) testing to come... # Confirm Bayesian (confirmatory) testing to come... # Note