Package 'BayesPostEst'

Title:	Generate Postestimation Quantities for Bayesian MCMC Estimation
Description:	An implementation of functions to generate and plot postestimation quantities after estimating Bayesian regression models using Markov chain Monte Carlo (MCMC). Functionality includes the estimation of the Precision-Recall curves (see Beger, 2016 <doi:10.2139/ssrn.2765419>), the implementation of the observed values method of calculating predicted probabilities by Hanmer and Kalkan (2013) <doi:10.1111/j.1540-5907.2012.00602.x>, the implementation of the average value method of calculating predicted probabilities (see King, Tomz, and Wittenberg, 2000 <doi:10.2307/2669316>), and the generation and plotting of first differences to summarize typical effects across covariates (see Long 1997, ISBN:9780803973749; King, Tomz, and Wittenberg, 2000 <doi:10.2307/2669316>). This package can be used with MCMC output generated by any Bayesian estimation tool including 'JAGS', 'BUGS', 'MCMCpack', and 'Stan'.
Authors:	Johannes Karreth [aut] , Shana Scogin [aut, cre] , Rob Williams [aut] , Andreas Beger [aut] , Myunghee Lee [ctb], Neil Williams [ctb]
Maintainer:	Shana Scogin <[email protected]>
License:	GPL-3
Version:	0.3.2
Built:	2025-02-16 06:56:56 UTC
Source:	CRAN

Help Index

BayesPostEst Overview
Fitted JAGS interactive linear model
Fitted JAGS interactive linear model with categorical moderator
Fitted JAGS logit model
Fitted JAGS probit model
Predicted Probabilities using Bayesian MCMC estimates for the "Average" Case
Coefficient Plots for MCMC Output
First Differences of a Bayesian Logit or Probit model
Marginal Effects Plots for MCMC Output
Predicted Probabilities using Bayesian MCMC estimates for the Average of Observed Cases
LaTeX or HTML regression tables for MCMC Output
ROC and Precision-Recall Curves using Bayesian MCMC estimates generalized
Summarize Bayesian MCMC Output R function for summarizing MCMC output in a regression-style table.
Plot Method for First Differences from MCMC output
ROC and Precision-Recall Curves using Bayesian MCMC estimates
Simulated data for examples
Simulated data for examples
Simulated data for examples

This package currently has nine main functions that can be used to generate and plot postestimation quantities after estimating Bayesian regression models using MCMC. The package combines functions written originally for Johannes Karreth's workshop on Bayesian modeling at the ICPSR Summer program. Currently BayesPostEst focuses mostly on generalized linear regression models for binary outcomes (logistic and probit regression). The vignette for this package has a walk-through of each function in action. Please refer to that to get an overview of all the functions, or visit the documentation for a specific function of your choice. Johannes Karreth's website (http://www.jkarreth.net) also has resources for getting started with Bayesian analysis, fitting models, and presenting results.

Main Functions

mcmcAveProb()
mcmcObsProb()
mcmcFD()
mcmcMargEff()
mcmcRocPrc()
mcmcRocPrcGen()
mcmcTab()
mcmcReg()
plot.mcmcFD()

Fitted JAGS interactive linear model

Description

A fitted JAGS linear model with interaction term generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.

Usage

jags_interactive
jags_interactive

Format

A class "rjags" object created by [R2jags::jags()]

Examples


if (interactive()) {
data("sim_data_interactive")

## formatting the data for jags
datjags <- as.list(sim_data_interactive)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dnorm(mu[i], sigma)  ## Bernoulli distribution of y_i
    
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i] +
      b[4] * X1[i] * X2[i]
    
  }
  
  for(j in 1:4){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
  sigma ~ dexp(1)
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_interactive <- R2jags::jags(data = datjags, inits = inits, 
                                 parameters.to.save = params, n.chains = 2,
                                 n.iter = 2000, n.burnin = 1000,
                                 model.file = model)
                                 
}

if (interactive()) {
data("sim_data_interactive")

## formatting the data for jags
datjags <- as.list(sim_data_interactive)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dnorm(mu[i], sigma)  ## Bernoulli distribution of y_i
    
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i] +
      b[4] * X1[i] * X2[i]
    
  }
  
  for(j in 1:4){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
  sigma ~ dexp(1)
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_interactive <- R2jags::jags(data = datjags, inits = inits, 
                                 parameters.to.save = params, n.chains = 2,
                                 n.iter = 2000, n.burnin = 1000,
                                 model.file = model)
                                 
}

Fitted JAGS interactive linear model with categorical moderator

Description

A fitted JAGS linear model with interaction term generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.

Usage

jags_interactive_cat
jags_interactive_cat

Format

A class "rjags" object created by [R2jags::jags()]

Examples


if (interactive()) {
data("sim_data_interactive_cat")

## formatting the data for jags
datjags <- as.list(sim_data_interactive_cat)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dnorm(mu[i], sigma)  ## Bernoulli distribution of y_i
    
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X3[i] +
      b[4] * X1[i] * X3[i]
    
  }
  
  for(j in 1:4){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
  sigma ~ dexp(1)
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_interactive_cat <- R2jags::jags(data = datjags, inits = inits,
                                     parameters.to.save = params, n.chains = 2,
                                     n.iter = 2000, n.burnin = 1000,
                                     model.file = model)
                                 
}

if (interactive()) {
data("sim_data_interactive_cat")

## formatting the data for jags
datjags <- as.list(sim_data_interactive_cat)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dnorm(mu[i], sigma)  ## Bernoulli distribution of y_i
    
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X3[i] +
      b[4] * X1[i] * X3[i]
    
  }
  
  for(j in 1:4){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
  sigma ~ dexp(1)
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_interactive_cat <- R2jags::jags(data = datjags, inits = inits,
                                     parameters.to.save = params, n.chains = 2,
                                     n.iter = 2000, n.burnin = 1000,
                                     model.file = model)
                                 
}

Fitted JAGS logit model

Description

A fitted JAGS logit model generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.

Usage

jags_logit
jags_logit

Format

A class "rjags" object created by [R2jags::jags()]

Examples


if (interactive()) {
data("sim_data")
  
## formatting the data for jags
datjags <- as.list(sim_data)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {

  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
      logit(p[i]) <- mu[i]    ## Logit link function
      mu[i] <- b[1] +
        b[2] * X1[i] +
        b[3] * X2[i]
  }

  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }

}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_logit <- R2jags::jags(data = datjags, inits = inits,
                         parameters.to.save = params, n.chains = 2, 
                         n.iter = 2000, n.burnin = 1000, model.file = model)

}


if (interactive()) {
data("sim_data")
  
## formatting the data for jags
datjags <- as.list(sim_data)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {

  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
      logit(p[i]) <- mu[i]    ## Logit link function
      mu[i] <- b[1] +
        b[2] * X1[i] +
        b[3] * X2[i]
  }

  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }

}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_logit <- R2jags::jags(data = datjags, inits = inits,
                         parameters.to.save = params, n.chains = 2, 
                         n.iter = 2000, n.burnin = 1000, model.file = model)

}

Fitted JAGS probit model

Description

A fitted JAGS probit model generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.

Usage

jags_probit
jags_probit

Format

A class "rjags" object created by [R2jags::jags()]

Examples


if (interactive()) {
data("sim_data")
  
## formatting the data for jags
datjags <- as.list(sim_data)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {

  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
      probit(p[i]) <- mu[i]    ## Update with probit link function
      mu[i] <- b[1] +
        b[2] * X1[i] +
        b[3] * X2[i]
  }

  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }

}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_probit <- R2jags::jags(data = datjags, inits = inits,
                         parameters.to.save = params, n.chains = 2, 
                         n.iter = 2000, n.burnin = 1000, model.file = model)

}


if (interactive()) {
data("sim_data")
  
## formatting the data for jags
datjags <- as.list(sim_data)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {

  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
      probit(p[i]) <- mu[i]    ## Update with probit link function
      mu[i] <- b[1] +
        b[2] * X1[i] +
        b[3] * X2[i]
  }

  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }

}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
jags_probit <- R2jags::jags(data = datjags, inits = inits,
                         parameters.to.save = params, n.chains = 2, 
                         n.iter = 2000, n.burnin = 1000, model.file = model)

}

Predicted Probabilities using Bayesian MCMC estimates for the "Average" Case

Description

This function calculates predicted probabilities for "average" cases after a Bayesian logit or probit model. As "average" cases, this function calculates the median value of each predictor. For an explanation of predicted probabilities for "average" cases, see e.g. King, Tomz & Wittenberg (2000, American Journal of Political Science 44(2): 347-361).

Usage

mcmcAveProb(
  modelmatrix,
  mcmcout,
  xcol,
  xrange,
  xinterest,
  link = "logit",
  ci = c(0.025, 0.975),
  fullsims = FALSE
)
mcmcAveProb(
  modelmatrix,
  mcmcout,
  xcol,
  xrange,
  xinterest,
  link = "logit",
  ci = c(0.025, 0.975),
  fullsims = FALSE
)

Arguments

`modelmatrix`	model matrix, including intercept (if the intercept is among the parameters estimated in the model). Create with model.matrix(formula, data). Note: the order of columns in the model matrix must correspond to the order of columns in the matrix of posterior draws in the `mcmcout` argument. See the `mcmcout` argument for more.
`mcmcout`	posterior distributions of all logit coefficients, in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed into a matrix using the function as.mcmc() from the coda package for `jags` class objects, as.matrix() from base R for `mcmc`, `mcmc.list`, `stanreg`, and `stanfit` class objects, and `object$sims.matrix` for `bugs` class objects. Note: the order of columns in this matrix must correspond to the order of columns in the model matrix. One can do this by examining the posterior distribution matrix and sorting the variables in the order of this matrix when creating the model matrix. A useful function for sorting column names containing both characters and numbers as you create the matrix of posterior distributions is `mixedsort()` from the gtools package.
`xcol`	column number of the posterior draws (`mcmcout`) and model matrices that corresponds to the explanatory variable for which to calculate associated Pr(y = 1). Note that the columns in these matrices must match.
`xrange`	name of the vector with the range of relevant values of the explanatory variable for which to calculate associated Pr(y = 1).
`xinterest`	semi-optional argument. Name of the explanatory variable for which to calculate associated Pr(y = 1). If `xcol` is supplied, this is not needed. If both are supplied, the function defaults to `xcol` and this argument is ignored.
`link`	type of generalized linear model; a character vector set to `"logit"` (default) or `"probit"`.
`ci`	the bounds of the credible interval. Default is `c(0.025, 0.975)` for the 95% credible interval.
`fullsims`	logical indicator of whether full object (based on all MCMC draws rather than their average) will be returned. Default is `FALSE`. Note: The longer `xrange` is, the larger the full output will be if `TRUE` is selected.

Details

This function calculates predicted probabilities for "average" cases after a Bayesian logit or probit model. For an explanation of predicted probabilities for "average" cases, see e.g. King, Tomz & Wittenberg (2000, American Journal of Political Science 44(2): 347-361)

Value

if fullsims = FALSE (default), a tibble with 4 columns:

x: value of variable of interest, drawn from xrange
median_pp: median predicted Pr(y = 1) when variable of interest is set to x, holding all other predictors to average (median) values
lower_pp: lower bound of credible interval of predicted probability at given x
upper_pp: upper bound of credible interval of predicted probability at given x

if fullsims = TRUE, a tibble with 3 columns:

Iteration: number of the posterior draw
x: value of variable of interest, drawn from xrange
pp: average predicted Pr(y = 1) when variable of interest is set to x, holding all other predictors to average (median) values

References

King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316

Examples



if (interactive()) {
  ## simulating data
  set.seed(123)
  b0 <- 0.2 # true value for the intercept
  b1 <- 0.5 # true value for first beta
  b2 <- 0.7 # true value for second beta
  n <- 500 # sample size
  X1 <- runif(n, -1, 1)
  X2 <- runif(n, -1, 1)
  Z <- b0 + b1 * X1 + b2 * X2
  pr <- 1 / (1 + exp(-Z)) # inv logit function
  Y <- rbinom(n, 1, pr) 
  df <- data.frame(cbind(X1, X2, Y))
  
  ## formatting the data for jags
  datjags <- as.list(df)
  datjags$N <- length(datjags$Y)
  
  ## creating jags model
  model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
library(R2jags)
set.seed(123)
fit <- jags(data = datjags, inits = inits, 
         parameters.to.save = params, n.chains = 2, n.iter = 2000, 
         n.burnin = 1000, model.file = model)

### average value approach
library(coda)
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
X1_sim <- seq(from = min(datjags$X1),
              to = max(datjags$X1), 
              length.out = 10)
ave_prob <- mcmcAveProb(modelmatrix = xmat,
                        mcmcout = mcmc_mat,
                        xrange = X1_sim, 
                        xcol = 2)
}



if (interactive()) {
  ## simulating data
  set.seed(123)
  b0 <- 0.2 # true value for the intercept
  b1 <- 0.5 # true value for first beta
  b2 <- 0.7 # true value for second beta
  n <- 500 # sample size
  X1 <- runif(n, -1, 1)
  X2 <- runif(n, -1, 1)
  Z <- b0 + b1 * X1 + b2 * X2
  pr <- 1 / (1 + exp(-Z)) # inv logit function
  Y <- rbinom(n, 1, pr) 
  df <- data.frame(cbind(X1, X2, Y))
  
  ## formatting the data for jags
  datjags <- as.list(df)
  datjags$N <- length(datjags$Y)
  
  ## creating jags model
  model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
library(R2jags)
set.seed(123)
fit <- jags(data = datjags, inits = inits, 
         parameters.to.save = params, n.chains = 2, n.iter = 2000, 
         n.burnin = 1000, model.file = model)

### average value approach
library(coda)
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
X1_sim <- seq(from = min(datjags$X1),
              to = max(datjags$X1), 
              length.out = 10)
ave_prob <- mcmcAveProb(modelmatrix = xmat,
                        mcmcout = mcmc_mat,
                        xrange = X1_sim, 
                        xcol = 2)
}

Coefficient Plots for MCMC Output

Description

Coefficient plots for MCMC output using ggplot2

Usage

mcmcCoefPlot(
  mod,
  pars = NULL,
  pointest = "mean",
  ci = 0.95,
  hpdi = FALSE,
  sort = FALSE,
  plot = TRUE,
  regex = FALSE
)
mcmcCoefPlot(
  mod,
  pars = NULL,
  pointest = "mean",
  ci = 0.95,
  hpdi = FALSE,
  sort = FALSE,
  plot = TRUE,
  regex = FALSE
)

Arguments

`mod`	Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, rstanarm, and brms.
`pars`	a scalar or vector of the parameters you wish to include in the table. By default, `mcmcCoefPlot` includes all parameters saved in a model object. If a model has lots of samples and lots of saved parameters, not explicitly specifying a limited number of parameters to include via `pars` may take a long time or produce an unreadable plot. `pars` can either be a vector with the specific parameters to be included in the table e.g. `pars = c("beta[1]", "beta[2]", "beta[3]")`, or they can be partial names that will be matched using regular expressions e.g. `pars = "beta"` if `regex = TRUE`. Both of these will include `beta[1]`, `beta[2]`, and `beta[3]` in the plot. If `pars` is left blank, `mcmcCoefPlot` will exclude auxiliary parameters such as `deviance` from JAGS or `lp__` from Stan.
`pointest`	a character indicating whether to use the mean or median for point estimates in the table.
`ci`	a scalar indicating the confidence level of the uncertainty intervals.
`hpdi`	a logical indicating whether to use highest posterior density intervals or equal tailed credible intervals to capture uncertainty; default `FALSE`.
`sort`	logical indicating whether to sort the point estimates to produce a caterpillar or dot plot; default `FALSE`.
`plot`	logical indicating whether to return a `ggplot` object or the underlying tidy DataFrame; default `TRUE`.
`regex`	use regular expression matching with `pars`?

Value

a ggplot object or a tidy DataFrame.

Author(s)

Rob Williams, [email protected]

Examples



if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] +
      b[2] * X1[i] +
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
                    parameters.to.save = params, n.chains = 2, n.iter = 2000,
                    n.burnin = 1000, model.file = model)

## generating coefficient plot with all non-auxiliary parameters
mcmcCoefPlot(fit)
}



if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] +
      b[2] * X1[i] +
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
                    parameters.to.save = params, n.chains = 2, n.iter = 2000,
                    n.burnin = 1000, model.file = model)

## generating coefficient plot with all non-auxiliary parameters
mcmcCoefPlot(fit)
}

First Differences of a Bayesian Logit or Probit model

Description

R function to calculate first differences after a Bayesian logit or probit model. First differences are a method to summarize effects across covariates. This quantity represents the difference in predicted probabilities for each covariate for cases with low and high values of the respective covariate. For each of these differences, all other variables are held constant at their median. For more, see Long (1997, Sage Publications) and King, Tomz, and Wittenberg (2000, American Journal of Political Science 44(2): 347-361).

Usage

mcmcFD(
  modelmatrix,
  mcmcout,
  link = "logit",
  ci = c(0.025, 0.975),
  percentiles = c(0.25, 0.75),
  fullsims = FALSE
)
mcmcFD(
  modelmatrix,
  mcmcout,
  link = "logit",
  ci = c(0.025, 0.975),
  percentiles = c(0.25, 0.75),
  fullsims = FALSE
)

Arguments

`modelmatrix`	model matrix, including intercept (if the intercept is among the parameters estimated in the model). Create with model.matrix(formula, data). Note: the order of columns in the model matrix must correspond to the order of columns in the matrix of posterior draws in the `mcmcout` argument. See the `mcmcout` argument for more.
`mcmcout`	posterior distributions of all logit coefficients, in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed into a matrix using the function as.mcmc() from the coda package for `jags` class objects, as.matrix() from base R for `mcmc`, `mcmc.list`, `stanreg`, and `stanfit` class objects, and `object$sims.matrix` for `bugs` class objects. Note: the order of columns in this matrix must correspond to the order of columns in the model matrix. One can do this by examining the posterior distribution matrix and sorting the variables in the order of this matrix when creating the model matrix. A useful function for sorting column names containing both characters and numbers as you create the matrix of posterior distributions is `mixedsort()` from the gtools package.
`link`	type of generalized linear model; a character vector set to `"logit"` (default) or `"probit"`.
`ci`	the bounds of the credible interval. Default is `c(0.025, 0.975)` for the 95% credible interval.
`percentiles`	values of each predictor for which the difference in Pr(y = 1) is to be calculated. Default is `c(0.25, 0.75)`, which will calculate the difference between Pr(y = 1) for the 25th percentile and 75th percentile of the predictor. For binary predictors, the function automatically calculates the difference between Pr(y = 1) for x = 0 and x = 1.
`fullsims`	logical indicator of whether full object (based on all MCMC draws rather than their average) will be returned. Default is `FALSE`.

Value

An object of class mcmcFD. If fullsims = FALSE (default), a data frame with five columns:

median_fd: median first difference
lower_fd: lower bound of credible interval of the first difference
upper_fd: upper bound of credible interval of the first difference
VarName: name of the variable as found in modelmatrix
VarID: identifier of the variable, based on the order of columns in modelmatrix and mcmcout. Can be adjusted for plotting

If fullsims = TRUE, a matrix with as many columns as predictors in the model. Each row is the first difference for that variable based on one set of posterior draws. Column names are taken from the column names of modelmatrix.

References

King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage Publications

Examples



if (interactive()) {
## simulating data
set.seed(1234)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

## running function with logit
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
object <- mcmcFD(modelmatrix = xmat,
                 mcmcout = mcmc_mat)
object
}



if (interactive()) {
## simulating data
set.seed(1234)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

## running function with logit
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
object <- mcmcFD(modelmatrix = xmat,
                 mcmcout = mcmc_mat)
object
}

Marginal Effects Plots for MCMC Output

Description

Marginal effects plots for MCMC output using ggplot2

Usage

mcmcMargEff(
  mod,
  main,
  int,
  moderator,
  pointest = "mean",
  seq = 100,
  ci = 0.95,
  hpdi = FALSE,
  plot = TRUE,
  xlab = "Moderator",
  ylab = "Marginal Effect"
)
mcmcMargEff(
  mod,
  main,
  int,
  moderator,
  pointest = "mean",
  seq = 100,
  ci = 0.95,
  hpdi = FALSE,
  plot = TRUE,
  xlab = "Moderator",
  ylab = "Marginal Effect"
)

Arguments

`mod`	Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, rstanarm, and brms.
`main`	a character with the name of the parameter of interest in the interaction term.
`int`	a character with the name of the moderating parameter in the interaction term.
`moderator`	a vector of values that the moderating parameter takes on in the data.
`pointest`	a character indicating whether to use the mean or median for point estimates in the plot.
`seq`	a numeric giving the number of moderator values used to generate the marginal effects plot.
`ci`	a scalar indicating the confidence level of the uncertainty intervals.
`hpdi`	a logical indicating whether to use highest posterior density intervals or equal tailed credible intervals to capture uncertainty.
`plot`	logical indicating whether to return a `ggplot` object or the underlying tidy DataFrame. By default, `mcmcMargEff` returns a line and ribbon plot for continuous variables, and a dot and line plot for factor variables and discrete variables with fewer than 25 unique values.
`xlab`	character giving x axis label if `plot = TRUE`, default `"Moderator"`
`ylab`	character giving y axis label if `plot = TRUE`, default `"Marginal Effect"`

Value

a ggplot object or a tidy DataFrame.

Author(s)

Rob Williams, [email protected]

Examples



if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2

## linear model data
Y_linear <- rnorm(n, Z, 1)
df <- data.frame(cbind(X1, X2, Y = Y_linear))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dnorm(mu[i], sigma)  ## Bernoulli distribution of y_i
    
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i] +
      b[4] * X1[i] * X2[i]
    
  }
  
  for(j in 1:4){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
  sigma ~ dexp(1)
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

mcmcMargEff(mod = fit,
            main = 'b[2]',
            int = 'b[4]',
            moderator = sim_data_interactive$X2,
            plot = TRUE)
}



if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2

## linear model data
Y_linear <- rnorm(n, Z, 1)
df <- data.frame(cbind(X1, X2, Y = Y_linear))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dnorm(mu[i], sigma)  ## Bernoulli distribution of y_i
    
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i] +
      b[4] * X1[i] * X2[i]
    
  }
  
  for(j in 1:4){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
  sigma ~ dexp(1)
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

mcmcMargEff(mod = fit,
            main = 'b[2]',
            int = 'b[4]',
            moderator = sim_data_interactive$X2,
            plot = TRUE)
}

Predicted Probabilities using Bayesian MCMC estimates for the Average of Observed Cases

Description

Implements R function to calculate the predicted probabilities for "observed" cases after a Bayesian logit or probit model, following Hanmer & Kalkan (2013) (2013, American Journal of Political Science 57(1): 263-277).

Usage

mcmcObsProb(
  modelmatrix,
  mcmcout,
  xcol,
  xrange,
  xinterest,
  link = "logit",
  ci = c(0.025, 0.975),
  fullsims = FALSE
)
mcmcObsProb(
  modelmatrix,
  mcmcout,
  xcol,
  xrange,
  xinterest,
  link = "logit",
  ci = c(0.025, 0.975),
  fullsims = FALSE
)

Arguments

`modelmatrix`	model matrix, including intercept (if the intercept is among the parameters estimated in the model). Create with model.matrix(formula, data). Note: the order of columns in the model matrix must correspond to the order of columns in the matrix of posterior draws in the `mcmcout` argument. See the `mcmcout` argument for more.
`mcmcout`	posterior distributions of all logit coefficients, in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed into a matrix using the function as.mcmc() from the coda package for `jags` class objects, as.matrix() from base R for `mcmc`, `mcmc.list`, `stanreg`, and `stanfit` class objects, and `object$sims.matrix` for `bugs` class objects. Note: the order of columns in this matrix must correspond to the order of columns in the model matrix. One can do this by examining the posterior distribution matrix and sorting the variables in the order of this matrix when creating the model matrix. A useful function for sorting column names containing both characters and numbers as you create the matrix of posterior distributions is `mixedsort()` from the gtools package.
`xcol`	column number of the posterior draws (`mcmcout`) and model matrices that corresponds to the explanatory variable for which to calculate associated Pr(y = 1). Note that the columns in these matrices must match.
`xrange`	name of the vector with the range of relevant values of the explanatory variable for which to calculate associated Pr(y = 1).
`xinterest`	semi-optional argument. Name of the explanatory variable for which to calculate associated Pr(y = 1). If `xcol` is supplied, this is not needed. If both are supplied, the function defaults to `xcol` and this argument is ignored.
`link`	type of generalized linear model; a character vector set to `"logit"` (default) or `"probit"`.
`ci`	the bounds of the credible interval. Default is `c(0.025, 0.975)` for the 95% credible interval.
`fullsims`	logical indicator of whether full object (based on all MCMC draws rather than their average) will be returned. Default is `FALSE`. Note: The longer `xrange` is, the larger the full output will be if `TRUE` is selected.

Details

This function calculates predicted probabilities for "observed" cases after a Bayesian logit or probit model following Hanmer and Kalkan (2013, American Journal of Political Science 57(1): 263-277)

Value

if fullsims = FALSE (default), a tibble with 4 columns:

x: value of variable of interest, drawn from xrange
median_pp: median predicted Pr(y = 1) when variable of interest is set to x
lower_pp: lower bound of credible interval of predicted probability at given x
upper_pp: upper bound of credible interval of predicted probability at given x

if fullsims = TRUE, a tibble with 3 columns:

Iteration: number of the posterior draw
x: value of variable of interest, drawn from xrange
pp: average predicted Pr(y = 1) of all observed cases when variable of interest is set to x

References

Hanmer, Michael J., & Ozan Kalkan, K. (2013). Behind the curve: Clarifying the best approach to calculating predicted probabilities and marginal effects from limited dependent variable models. American Journal of Political Science, 57(1), 263-277. https://doi.org/10.1111/j.1540-5907.2012.00602.x

Examples



if (interactive()) {
  ## simulating data
  set.seed(12345)
  b0 <- 0.2 # true value for the intercept
  b1 <- 0.5 # true value for first beta
  b2 <- 0.7 # true value for second beta
  n <- 500 # sample size
  X1 <- runif(n, -1, 1)
  X2 <- runif(n, -1, 1)
  Z <- b0 + b1 * X1 + b2 * X2
  pr <- 1 / (1 + exp(-Z)) # inv logit function
  Y <- rbinom(n, 1, pr) 
  df <- data.frame(cbind(X1, X2, Y))
  
  ## formatting the data for jags
  datjags <- as.list(df)
  datjags$N <- length(datjags$Y)
  
  ## creating jags model
  model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
library(R2jags)
set.seed(123)
fit <- jags(data = datjags, inits = inits, 
          parameters.to.save = params, n.chains = 2, n.iter = 2000, 
          n.burnin = 1000, model.file = model)

### observed value approach
library(coda)
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
X1_sim <- seq(from = min(datjags$X1),
              to = max(datjags$X1), 
              length.out = 10)
obs_prob <- mcmcObsProb(modelmatrix = xmat,
                        mcmcout = mcmc_mat,
                        xrange = X1_sim,
                        xcol = 2)
}



if (interactive()) {
  ## simulating data
  set.seed(12345)
  b0 <- 0.2 # true value for the intercept
  b1 <- 0.5 # true value for first beta
  b2 <- 0.7 # true value for second beta
  n <- 500 # sample size
  X1 <- runif(n, -1, 1)
  X2 <- runif(n, -1, 1)
  Z <- b0 + b1 * X1 + b2 * X2
  pr <- 1 / (1 + exp(-Z)) # inv logit function
  Y <- rbinom(n, 1, pr) 
  df <- data.frame(cbind(X1, X2, Y))
  
  ## formatting the data for jags
  datjags <- as.list(df)
  datjags$N <- length(datjags$Y)
  
  ## creating jags model
  model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
library(R2jags)
set.seed(123)
fit <- jags(data = datjags, inits = inits, 
          parameters.to.save = params, n.chains = 2, n.iter = 2000, 
          n.burnin = 1000, model.file = model)

### observed value approach
library(coda)
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
X1_sim <- seq(from = min(datjags$X1),
              to = max(datjags$X1), 
              length.out = 10)
obs_prob <- mcmcObsProb(modelmatrix = xmat,
                        mcmcout = mcmc_mat,
                        xrange = X1_sim,
                        xcol = 2)
}

LaTeX or HTML regression tables for MCMC Output

Description

This function creates LaTeX or HTML regression tables for MCMC Output using the texreg function from the texreg R package.

Usage

mcmcReg(
  mod,
  pars = NULL,
  pointest = "mean",
  ci = 0.95,
  hpdi = FALSE,
  sd = FALSE,
  pr = FALSE,
  coefnames = NULL,
  gof = numeric(0),
  gofnames = character(0),
  format = "latex",
  file,
  regex = FALSE,
  ...
)
mcmcReg(
  mod,
  pars = NULL,
  pointest = "mean",
  ci = 0.95,
  hpdi = FALSE,
  sd = FALSE,
  pr = FALSE,
  coefnames = NULL,
  gof = numeric(0),
  gofnames = character(0),
  format = "latex",
  file,
  regex = FALSE,
  ...
)

Arguments

`mod`	Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, rstanarm, and brms, or a list of model objects of the same class.
`pars`	a scalar or vector of the parameters you wish to include in the table. By default, `mcmcReg` includes all parameters saved in a model object. If a model has lots of samples and lots of saved parameters, not explicitly specifying a limited number of parameters to include via `pars` may take a long time. `pars` can either be a vector with the specific parameters to be included in the table e.g. `pars = c("beta[1]", "beta[2]", "beta[3]")`, or they can be partial names that will be matched using regular expressions e.g. `pars = "beta"` if `regex = TRUE`. Both of these will include `beta[1]`, `beta[2]`, and `beta[3]` in the table. When combining models with different parameters in one table, this argument also accepts a list the length of the number of models.
`pointest`	a character indicating whether to use the mean or median for point estimates in the table.
`ci`	a scalar indicating the confidence level of the uncertainty intervals.
`hpdi`	a logical indicating whether to use highest posterior density intervals instead of equal tailed credible intervals to capture uncertainty (default `FALSE`).
`sd`	a logical indicating whether to report the standard deviation of posterior distributions instead of an uncertainty interval (default `FALSE`). If `TRUE`, overrides `ci`, `hpdi`, and `pr`.
`pr`	a logical indicating whether to report the probability that a coefficient is in the same direction as the point estimate for that coefficient (default `FALSE`). If `TRUE`, overrides `ci` and `hpdi`.
`coefnames`	an optional vector or list of vectors containing parameter names for each model. If there are multiple models, the list must have the same number of elements as there are models, and the vector of names in each list element must match the number of parameters. If not supplied, the function will use the parameter names in the model object(s). Note that this replaces the standard `custom.coef.names` argument in `texreg` because there is no `extract` method for MCMC model objects, and many MCMC model objects do not have unique parameter names.
`gof`	a named list of goodness of fit statistics, or a list of such lists.
`gofnames`	an optional vector or list of vectors containing goodness of fit statistic names for each model. Like `coefnames` in this function (which replaces the `custom.coef.names` argument in `texreg`), `gofnames` replaces the standard `custom.gof.names` argument in `texreg`. If there are multiple models, the list must have the same number of elements as there are models, and the vector of names in each list element must match the number of goodness of fit statistics.
`format`	a character indicating `latex` or `html` output.
`file`	optional file name to write table to file instead of printing to console.
`regex`	use regular expression matching with `pars`?
`...`	optional arguments to `texreg`.

Details

If using custom.coef.map with more than one model, you should rename the parameters in the model objects to ensure that different parameters with the same subscript are not conflated by texreg e.g. beta[1] could represent age in one model and income in another, and texreg would combine the two if you do not rename beta[1] to more informative names in the model objects.

If mod is a brmsfit object or list of brmsfit objects, note that the default brms names for coefficients are b_Intercept and b, so both of these should be included in par if you wish to include the intercept in the table.

Value

A formatted regression table in LaTeX or HTML format.

Author(s)

Rob Williams, [email protected]

Examples



if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {

  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] +
      b[2] * X1[i] +
      b[3] * X2[i]
  }

  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }

}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
                    parameters.to.save = params,
                    n.chains = 2,
                    n.iter = 2000, n.burnin = 1000,
                    model.file = model)

## generating regression table with all parameters
mcmcReg(fit)

## generating regression table with only betas and custom coefficent names
mcmcReg(fit, pars = c('b'), coefnames = c('Variable 1',
                                          'Variable 2',
                                          'Variable 3'),
        regex = TRUE)
## generating regression tables with all betas and custom names
mcmcReg(fit, coefnames = c('Variable 1', 'Variable 2',
                           'Variable 3', 'deviance'))
}



if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {

  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] +
      b[2] * X1[i] +
      b[3] * X2[i]
  }

  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }

}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
                    parameters.to.save = params,
                    n.chains = 2,
                    n.iter = 2000, n.burnin = 1000,
                    model.file = model)

## generating regression table with all parameters
mcmcReg(fit)

## generating regression table with only betas and custom coefficent names
mcmcReg(fit, pars = c('b'), coefnames = c('Variable 1',
                                          'Variable 2',
                                          'Variable 3'),
        regex = TRUE)
## generating regression tables with all betas and custom names
mcmcReg(fit, coefnames = c('Variable 1', 'Variable 2',
                           'Variable 3', 'deviance'))
}

ROC and Precision-Recall Curves using Bayesian MCMC estimates generalized

Description

This function generates ROC and Precision-Recall curves after fitting a Bayesian logit or probit regression. For fast calculation for from an "rjags" object use mcmcRocPrc

Usage

mcmcRocPrcGen(
  modelmatrix,
  mcmcout,
  modelframe,
  curves = FALSE,
  link = "logit",
  fullsims = FALSE
)
mcmcRocPrcGen(
  modelmatrix,
  mcmcout,
  modelframe,
  curves = FALSE,
  link = "logit",
  fullsims = FALSE
)

Arguments

`modelmatrix`	model matrix, including intercept (if the intercept is among the parameters estimated in the model). Create with model.matrix(formula, data). Note: the order of columns in the model matrix must correspond to the order of columns in the matrix of posterior draws in the `mcmcout` argument. See the `mcmcout` argument for more and Beger (2016) for background.
`mcmcout`	posterior distributions of all logit coefficients, in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed into a matrix using the function as.mcmc() from the coda package for `jags` class objects, as.matrix() from base R for `mcmc`, `mcmc.list`, `stanreg`, and `stanfit` class objects, and `object$sims.matrix` for `bugs` class objects. Note: the order of columns in this matrix must correspond to the order of columns in the model matrix. One can do this by examining the posterior distribution matrix and sorting the variables in the order of this matrix when creating the model matrix. A useful function for sorting column names containing both characters and numbers as you create the matrix of posterior distributions is `mixedsort()` from the gtools package.
`modelframe`	model frame in matrix form. Can be created using as.matrix(model.frame(formula, data))
`curves`	logical indicator of whether or not to return values to plot the ROC or Precision-Recall curves. If set to `FALSE` (default), results are returned as a list without the extra values.
`link`	type of generalized linear model; a character vector set to `"logit"` (default) or `"probit"`.
`fullsims`	logical indicator of whether full object (based on all MCMC draws rather than their average) will be returned. Default is `FALSE`. Note: If `TRUE` is chosen, the function takes notably longer to execute.

Details

This function generates ROC and precision-recall curves after fitting a Bayesian logit or probit model.

Value

This function returns a list with 4 elements:

area_under_roc: area under ROC curve (scalar)
area_under_prc: area under precision-recall curve (scalar)
prc_dat: data to plot precision-recall curve (data frame)
roc_dat: data to plot ROC curve (data frame)

References

Beger, Andreas. 2016. “Precision-Recall Curves.” Available at SSRN: https://ssrn.com/Abstract=2765419. http://dx.doi.org/10.2139/ssrn.2765419.

Examples



if (interactive()) {
# simulating data

set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
df <- data.frame(cbind(X1, X2, Y))

# formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

# creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

# processing the data
mm <- model.matrix(Y ~ X1 + X2, data = df)
xframe <- as.matrix(model.frame(Y ~ X1 + X2, data = df))
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xframe)]

# using mcmcRocPrcGen
fit_sum <- mcmcRocPrcGen(modelmatrix = mm,
                      modelframe = xframe,
                      mcmcout = mcmc_mat,
                      curves = TRUE,
                      fullsims = FALSE)
}



if (interactive()) {
# simulating data

set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
df <- data.frame(cbind(X1, X2, Y))

# formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

# creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

# processing the data
mm <- model.matrix(Y ~ X1 + X2, data = df)
xframe <- as.matrix(model.frame(Y ~ X1 + X2, data = df))
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xframe)]

# using mcmcRocPrcGen
fit_sum <- mcmcRocPrcGen(modelmatrix = mm,
                      modelframe = xframe,
                      mcmcout = mcmc_mat,
                      curves = TRUE,
                      fullsims = FALSE)
}

Summarize Bayesian MCMC Output R function for summarizing MCMC output in a regression-style table.

Description

Summarize Bayesian MCMC Output

R function for summarizing MCMC output in a regression-style table.

Usage

mcmcTab(
  sims,
  ci = c(0.025, 0.975),
  pars = NULL,
  Pr = FALSE,
  ROPE = NULL,
  regex = FALSE
)
mcmcTab(
  sims,
  ci = c(0.025, 0.975),
  pars = NULL,
  Pr = FALSE,
  ROPE = NULL,
  regex = FALSE
)

Arguments

`sims`	Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, and rstanarm.
`ci`	desired level for credible intervals; defaults to c(0.025, 0.975).
`pars`	character vector of parameters to be printed; defaults to `NULL` (all parameters are printed). If not `NULL`, the user can either specify the exact names of parameters to be printed (e.g. `c("alpha", "beta1", "beta2")`) or part of a name so that all parameters containing that name will be printed (e.g. `"beta"` will print `beta1`, `beta2`, etc.).
`Pr`	print percent of posterior draws with same sign as median; defaults to `FALSE`.
`ROPE`	defaults to `NULL`. If not `NULL`, a vector of two values defining the region of practical equivalence ("ROPE"); returns % of posterior draws to the left/right of ROPE. For this quantity to be meaningful, all parameters must be on the same scale (e.g. standardized coefficients or first differences). See Kruschke (2013, Journal of Experimental Psychology 143(2): 573-603) for more on the ROPE.
`regex`	use regular expression matching with `pars`?

Value

a data frame containing MCMC summary statistics.

References

Kruschke, John K. 2013. “Bayesian Estimation Supersedes the T-Test.” Journal of Experimental Psychology: General 142 (2): 573–603. https://doi.org/10.1037/a0029146.

Examples



if (interactive()) {
data("jags_logit")

## printing out table
object <- mcmcTab(jags_logit, 
          ci = c(0.025, 0.975), 
          pars = NULL, 
          Pr = FALSE,
          ROPE = NULL)
object
}



if (interactive()) {
data("jags_logit")

## printing out table
object <- mcmcTab(jags_logit, 
          ci = c(0.025, 0.975), 
          pars = NULL, 
          Pr = FALSE,
          ROPE = NULL)
object
}

Plot Method for First Differences from MCMC output

Description

The plot method for first differences generated from MCMC output by mcmcFD. For more on this method, see Long (1997, Sage Publications), and King, Tomz, and Wittenberg (2000, American Journal of Political Science 44(2): 347-361). For a description of this type of plot, see Figure 1 in Karreth (2018, International Interactions 44(3): 463-90).

Usage

## S3 method for class 'mcmcFD'
plot(x, ROPE = NULL, ...)
## S3 method for class 'mcmcFD'
plot(x, ROPE = NULL, ...)

Arguments

`x`	Output generated from `mcmcFD(..., full_sims = TRUE)`.
`ROPE`	defaults to NULL. If not NULL, a numeric vector of length two, defining the Region of Practical Equivalence around 0. See Kruschke (2013, Journal of Experimental Psychology 143(2): 573-603) for more on the ROPE.
`...`	optional arguments to `theme` from `ggplot2`.

Value

a density plot of the differences in probabilities. The plot is made with ggplot2 and can be passed on as an object to customize. Annotated numbers show the percent of posterior draws with the same sign as the median estimate (if ROPE = NULL) or on the same side of the ROPE as the median estimate (if ROPE is specified).

References

Karreth, Johannes. 2018. “The Economic Leverage of International Organizations in Interstate Disputes.” International Interactions 44 (3): 463-90. https://doi.org/10.1080/03050629.2018.1389728.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316.
Kruschke, John K. 2013. “Bayesian Estimation Supersedes the T-Test.” Journal of Experimental Psychology: General 142 (2): 573–603. https://doi.org/10.1037/a0029146.
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage Publications.

Examples



if (interactive()) {
## simulating data
set.seed(1234)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

## preparing data for mcmcFD()
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]

## plotting with mcmcFDplot()
full <- mcmcFD(modelmatrix = xmat,
               mcmcout = mcmc_mat,
               fullsims = TRUE)
plot(full)

}




if (interactive()) {
## simulating data
set.seed(1234)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
df <- data.frame(cbind(X1, X2, Y))

## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)

## creating jags model
model <- function()  {
  
  for(i in 1:N){
    Y[i] ~ dbern(p[i])  ## Bernoulli distribution of y_i
    logit(p[i]) <- mu[i]    ## Logit link function
    mu[i] <- b[1] + 
      b[2] * X1[i] + 
      b[3] * X2[i]
  }
  
  for(j in 1:3){
    b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
  }
  
}

params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)

## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits, 
                    parameters.to.save = params, n.chains = 2, n.iter = 2000, 
                    n.burnin = 1000, model.file = model)

## preparing data for mcmcFD()
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]

## plotting with mcmcFDplot()
full <- mcmcFD(modelmatrix = xmat,
               mcmcout = mcmc_mat,
               fullsims = TRUE)
plot(full)

}

ROC and Precision-Recall Curves using Bayesian MCMC estimates

Description

Generate ROC and Precision-Recall curves after fitting a Bayesian logit or probit regression using rstan::stan(), rstanarm::stan_glm(), R2jags::jags(), R2WinBUGS::bugs(), MCMCpack::MCMClogit(), or other functions that provide samples from a posterior density.

Usage

## S3 method for class 'mcmcRocPrc'
print(x, ...)

## S3 method for class 'mcmcRocPrc'
plot(x, n = 40, alpha = 0.5, ...)

## S3 method for class 'mcmcRocPrc'
as.data.frame(
  x,
  row.names = NULL,
  optional = FALSE,
  what = c("auc", "roc", "prc"),
  ...
)

mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)

## Default S3 method:
mcmcRocPrc(object, curves, fullsims, yvec, ...)

## S3 method for class 'jags'
mcmcRocPrc(
  object,
  curves = FALSE,
  fullsims = FALSE,
  yname,
  xnames,
  posterior_samples,
  ...
)

## S3 method for class 'rjags'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, yname, xnames, ...)

## S3 method for class 'runjags'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, yname, xnames, ...)

## S3 method for class 'stanfit'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, data, xnames, yname, ...)

## S3 method for class 'stanreg'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)

## S3 method for class 'brmsfit'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)

## S3 method for class 'bugs'
mcmcRocPrc(
  object,
  curves = FALSE,
  fullsims = FALSE,
  data,
  xnames,
  yname,
  type = c("logit", "probit"),
  ...
)

## S3 method for class 'mcmc'
mcmcRocPrc(
  object,
  curves = FALSE,
  fullsims = FALSE,
  data,
  xnames,
  yname,
  type = c("logit", "probit"),
  force = FALSE,
  ...
)
## S3 method for class 'mcmcRocPrc'
print(x, ...)

## S3 method for class 'mcmcRocPrc'
plot(x, n = 40, alpha = 0.5, ...)

## S3 method for class 'mcmcRocPrc'
as.data.frame(
  x,
  row.names = NULL,
  optional = FALSE,
  what = c("auc", "roc", "prc"),
  ...
)

mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)

## Default S3 method:
mcmcRocPrc(object, curves, fullsims, yvec, ...)

## S3 method for class 'jags'
mcmcRocPrc(
  object,
  curves = FALSE,
  fullsims = FALSE,
  yname,
  xnames,
  posterior_samples,
  ...
)

## S3 method for class 'rjags'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, yname, xnames, ...)

## S3 method for class 'runjags'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, yname, xnames, ...)

## S3 method for class 'stanfit'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, data, xnames, yname, ...)

## S3 method for class 'stanreg'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)

## S3 method for class 'brmsfit'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)

## S3 method for class 'bugs'
mcmcRocPrc(
  object,
  curves = FALSE,
  fullsims = FALSE,
  data,
  xnames,
  yname,
  type = c("logit", "probit"),
  ...
)

## S3 method for class 'mcmc'
mcmcRocPrc(
  object,
  curves = FALSE,
  fullsims = FALSE,
  data,
  xnames,
  yname,
  type = c("logit", "probit"),
  force = FALSE,
  ...
)

Arguments

`x`	a `mcmcRocPrc()` object
`...`	Used by methods
`n`	plot method: if 'fullsims = TRUE', how many sample curves to draw?
`alpha`	plot method: alpha value for plotting sampled curves; between 0 and 1
`row.names`	see [base::as.data.frame()]
`optional`	see [base::as.data.frame()]
`what`	which information to extract and convert to a data frame?
`object`	A fitted binary choice model, e.g. "rjags" object (see `R2jags::jags()`), or a `⁠[N, iter]⁠` matrix of predicted probabilites.
`curves`	logical indicator of whether or not to return values to plot the ROC or Precision-Recall curves. If set to `FALSE` (default), results are returned as a list without the extra values.
`fullsims`	logical indicator of whether full object (based on all MCMC draws rather than their average) will be returned. Default is `FALSE`. Note: If `TRUE` is chosen, the function takes notably longer to execute.
`yvec`	A `numeric(N)` vector of observed outcomes.
`yname`	(`character(1)`) The name of the dependent variable, should match the variable name in the JAGS data object.
`xnames`	(`base::character()`) A character vector of the independent variable names, should match the corresponding names in the JAGS data object.
`posterior_samples`	a "mcmc" object with the posterior samples
`data`	the data that was used in the 'stan(data = ?, ...)' call
`type`	"logit" or "probit"
`force`	for MCMCpack models, suppress warning if the model does not appear to be a binary choice model?

Details

If only the average AUC-ROC and PR are of interest, setting curves = FALSE and fullsims = FALSE can greatly speed up calculation time. The curve data (curves = TRUE) is needed for plotting. The plot method will always plot both the ROC and PR curves, but the underlying data can easily be extracted from the output for your own plotting; see the documentation of the value returned below.

The default method works with a matrix of predicted probabilities and the vector of observed incomes as input. Other methods accommodate some of the common Bayesian modeling packages like rstan (which returns class "stanfit"), rstanarm ("stanreg"), R2jags ("jags"), R2WinBUGS ("bugs"), and MCMCpack ("mcmc"). Even if a package-specific method is not implemented, the default method can always be used as a fallback by manually calculating the matrix of predicted probabilities for each posterior sample.

Note that MCMCpack returns generic "mcmc" output that is annotated with some additional information as attributes, including the original function call. There is no inherent way to distinguish any other kind of "mcmc" object from one generated by a proper MCMCpack modeling function, but as a basic precaution, mcmcRocPrc() will check the saved call and return an error if the function called was not MCMClogit() or MCMCprobit(). This behavior can be suppressed by setting force = TRUE.

Value

Returns a list with length 2 or 4, depending on the on the "curves" and "fullsims" argument values:

"area_under_roc": numeric(); either length 1 if fullsims = FALSE, or one value for each posterior sample otherwise
"area_under_prc": numeric(); either length 1 if fullsims = FALSE, or one value for each posterior sample otherwise
"prc_dat": only if curves = TRUE; a list with length 1 if fullsims = FALSE, longer otherwise
"roc_dat": only if curves = TRUE; a list with length 1 if fullsims = FALSE, longer otherwise

References

Beger, Andreas. 2016. “Precision-Recall Curves.” Available at doi:10.2139/ssrn.2765419

Examples


if (interactive()) {
# load simulated data and fitted model (see ?sim_data and ?jags_logit)
data("jags_logit")

# using mcmcRocPrc
fit_sum <- mcmcRocPrc(jags_logit,
                      yname = "Y",
                      xnames = c("X1", "X2"),
                      curves = TRUE,
                      fullsims = FALSE)
fit_sum                     
plot(fit_sum)

# Equivalently, we can calculate the matrix of predicted probabilities 
# ourselves; using the example from ?jags_logit:
library(R2jags)

data("sim_data")
yvec <- sim_data$Y
xmat <- sim_data[, c("X1", "X2")]

# add intercept to the X data
xmat <- as.matrix(cbind(Intercept = 1L, xmat))

beta <- as.matrix(as.mcmc(jags_logit))[, c("b[1]", "b[2]", "b[3]")]
pred_mat <- plogis(xmat %*% t(beta)) 

# the matrix of predictions has rows matching the number of rows in the data;
# the column are the predictions for each of the 2,000 posterior samples
nrow(sim_data)
dim(pred_mat)

# now we can call mcmcRocPrc; the default method works with the matrix
# of predictions and vector of outcomes as input
mcmcRocPrc(object = pred_mat, curves = TRUE, fullsims = FALSE, yvec = yvec)
}


if (interactive()) {
# load simulated data and fitted model (see ?sim_data and ?jags_logit)
data("jags_logit")

# using mcmcRocPrc
fit_sum <- mcmcRocPrc(jags_logit,
                      yname = "Y",
                      xnames = c("X1", "X2"),
                      curves = TRUE,
                      fullsims = FALSE)
fit_sum                     
plot(fit_sum)

# Equivalently, we can calculate the matrix of predicted probabilities 
# ourselves; using the example from ?jags_logit:
library(R2jags)

data("sim_data")
yvec <- sim_data$Y
xmat <- sim_data[, c("X1", "X2")]

# add intercept to the X data
xmat <- as.matrix(cbind(Intercept = 1L, xmat))

beta <- as.matrix(as.mcmc(jags_logit))[, c("b[1]", "b[2]", "b[3]")]
pred_mat <- plogis(xmat %*% t(beta)) 

# the matrix of predictions has rows matching the number of rows in the data;
# the column are the predictions for each of the 2,000 posterior samples
nrow(sim_data)
dim(pred_mat)

# now we can call mcmcRocPrc; the default method works with the matrix
# of predictions and vector of outcomes as input
mcmcRocPrc(object = pred_mat, curves = TRUE, fullsims = FALSE, yvec = yvec)
}

Simulated data for examples

Description

Simulated data to fit example models against

Usage

sim_data
sim_data

Format

a data.frame

Examples

## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
sim_data <- data.frame(cbind(X1, X2, Y))
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr) 
sim_data <- data.frame(cbind(X1, X2, Y))

Simulated data for examples

Description

Simulated data to fit example models against

Usage

sim_data_interactive
sim_data_interactive

Format

a data.frame

Examples

set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
b3 <- -0.3 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z_interactive <- b0 + b1 * X1 + b2 * X2 + b3 * (X1 * X2)
Y_interactive <- rnorm(n, Z_interactive, 1)
sim_data_interactive <- data.frame(cbind(X1, X2, Y = Y_interactive))
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
b3 <- -0.3 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z_interactive <- b0 + b1 * X1 + b2 * X2 + b3 * (X1 * X2)
Y_interactive <- rnorm(n, Z_interactive, 1)
sim_data_interactive <- data.frame(cbind(X1, X2, Y = Y_interactive))

Simulated data for examples

Description

Simulated data to fit example models against

Usage

sim_data_interactive_cat
sim_data_interactive_cat

Format

a data.frame

Examples

set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
b3 <- -0.3 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X3 <- rbinom(n, 5, .23)
Z_interactive_cat <- b0 + b1 * X1 + b2 * X3 + b3 * (X1 * X3)
Y_interactive_cat <- rnorm(n, Z_interactive_cat, 1)
sim_data_interactive_cat <- data.frame(cbind(X1, X3, Y = Y_interactive_cat))
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
b3 <- -0.3 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X3 <- rbinom(n, 5, .23)
Z_interactive_cat <- b0 + b1 * X1 + b2 * X3 + b3 * (X1 * X3)
Y_interactive_cat <- rnorm(n, Z_interactive_cat, 1)
sim_data_interactive_cat <- data.frame(cbind(X1, X3, Y = Y_interactive_cat))

Package 'BayesPostEst'

Help Index

BayesPostEst Overview

Description

Main Functions

Fitted JAGS interactive linear model

Description

Usage

Format

Examples

Fitted JAGS interactive linear model with categorical moderator

Description

Usage

Format

Examples

Fitted JAGS logit model

Description

Usage

Format

Examples

Fitted JAGS probit model

Description

Usage

Format

Examples

Predicted Probabilities using Bayesian MCMC estimates for the "Average" Case

Description

Usage

Arguments

Details

Value

References

Examples

Coefficient Plots for MCMC Output

Description

Usage

Arguments

Value

Author(s)

Examples

First Differences of a Bayesian Logit or Probit model

Description

Usage

Arguments

Value

References

Examples

Marginal Effects Plots for MCMC Output

Description

Usage

Arguments

Value

Author(s)

Examples

Predicted Probabilities using Bayesian MCMC estimates for the Average of Observed Cases

Description

Usage

Arguments

Details

Value

References

Examples

LaTeX or HTML regression tables for MCMC Output

Description

Usage

Arguments

Details

Value

Author(s)

Examples

ROC and Precision-Recall Curves using Bayesian MCMC estimates generalized

Description

Usage

Arguments

Details

Value

References

Examples

Summarize Bayesian MCMC Output R function for summarizing MCMC output in a regression-style table.

Description