Borusyak and Hull (2023) develop a new approach to estimating the
causal effects of treatments or instruments that combine multiple
sources of variation according to a known formula. The key challenge in
applying this “formula instrument approach” is specifying the
distribution over counterfactual shocks. The formulaiv
package enables researchers to evaluate the sensitivity of their
estimates to small or large deviations away from an assumed baseline
distribution of shocks. This guide walks through the method using the
China high-speed rail (HSR) application from Borusyak and Hull
(2023).
The first step is to prepare the market access and high-speed rail
data from Borusyak and Hull (2023). Every object is annotated with its
shape (N is the number of observations, S is
the number of counterfactual shocks, J is the number of
controls, and L is the number of lines).
# Set up BH market access data
y <- ma$emp_growth # N x 1 vector
x <- ma$dma0 # N x 1 vector
z <- x # N x 1 vector
controls <- ma[, c("distance_B", "scaled_lat", "scaled_lon")] # N x J dataframe
f <- ma[, paste0("ma_nlink", 1:1999)] - ma$ma2007 # N x S dataframe
pbar <- rep(1 / 1999, 1999) # S x 1 vector
# In the BH market access example, N = 275, S = 1999, J = 3
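To make the shapes above concrete, the following is a minimal sketch with made-up numbers (not the BH data). It shows how an N x S matrix of counterfactual instrument values and an S x 1 weight vector combine into an expected instrument, which Borusyak and Hull (2023) subtract from the realized instrument to recenter it. The names `f_toy`, `pbar_toy`, `z_toy`, and `mu_toy` are hypothetical, and whether formulaiv carries out this step internally in exactly this form is an assumption.

```r
# Toy illustration with made-up numbers (not the BH data)
set.seed(1)
N <- 5  # observations
S <- 4  # counterfactual shock draws

# N x S matrix: instrument value each unit would take under each draw
f_toy <- matrix(rnorm(N * S), nrow = N, ncol = S)

# S x 1 vector: baseline probability of each draw (uniform, like pbar)
pbar_toy <- rep(1 / S, S)

# Realized instrument (assumption: pretend draw 1 is the one that occurred)
z_toy <- f_toy[, 1]

# Expected instrument: probability-weighted average over the draws
mu_toy <- as.numeric(f_toy %*% pbar_toy)  # N x 1 vector

# Recentered instrument: realized value minus its expectation
z_recentered <- z_toy - mu_toy

# With uniform weights the expected instrument is just the row mean
stopifnot(isTRUE(all.equal(mu_toy, rowMeans(f_toy))))
```

With the real data, `f` is a data frame, so it would need `as.matrix(f)` before the matrix product.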
# Set up BH high speed railway data
# Generate dummies of line opening status in 2016
for (i in 1:1999) {
  line[[paste0("open2016_sim", i)]] <- as.integer(
    line[[paste0("year_operate", i)]] <= 2016
  )
}
# Generate probability of line opening across S simulations
g <- line[, paste0("open2016_sim", 1:1999)] # L x S dataframe
qbar <- as.numeric(as.matrix(g) %*% pbar) # L x 1 vector
Then we assess how sensitive formula instrument estimators are to the assumed distribution of the underlying shocks in Borusyak and Hull (2023). We consider two ways of specifying the sensitivity set, intended to capture different ways of measuring deviations, each with and without geographic controls.
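The calls below pass `pbar` to the "joint" specification and pass `g` and `qbar` to the "marginal" specification, so it may help to see what `qbar` measures. The sketch below uses a made-up 3 x 4 matrix of simulated opening years (not the BH data). Each entry of the resulting vector is the `pbar`-weighted share of simulations in which that line is open by 2016. All names ending in `_toy` are hypothetical.

```r
# Toy illustration with made-up numbers (not the BH data)
# Simulated opening year of each line (rows) in each simulation (columns)
year_toy <- matrix(
  c(2012, 2018, 2015, 2020,
    2016, 2016, 2017, 2010,
    2019, 2021, 2018, 2025),
  nrow = 3, byrow = TRUE
)

# L x S matrix of 0/1 indicators: is the line open by 2016 in this simulation?
g_toy <- (year_toy <= 2016) * 1L

# S x 1 vector of simulation weights (uniform, like pbar)
pbar_toy <- rep(1 / ncol(g_toy), ncol(g_toy))

# L x 1 vector: marginal probability that each line is open by 2016
qbar_toy <- as.numeric(g_toy %*% pbar_toy)

print(qbar_toy)
# → 0.50 0.75 0.00
```

Line 1 is open in two of the four simulations, line 2 in three, and line 3 in none, which gives the probabilities 0.5, 0.75, and 0.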
BH_sens_joint_cons_no_controls <- formulaiv(
  y = y,
  x = x,
  z = z,
  f = f,
  eps = seq(1, 20, 0.2),
  cons = list(name = "joint", pbar = pbar)
)$beta
BH_sens_joint_cons_with_controls <- formulaiv(
  y = y,
  x = x,
  z = z,
  f = f,
  eps = seq(1, 20, 0.2),
  cons = list(name = "joint", pbar = pbar),
  controls = controls
)$beta
BH_sens_marginal_cons_no_controls <- formulaiv(
  y = y,
  x = x,
  z = z,
  f = f,
  eps = seq(1, 2.5, 0.25),
  cons = list(name = "marginal", g = g, qbar = qbar)
)$beta
BH_sens_marginal_cons_with_controls <- formulaiv(
  y = y,
  x = x,
  z = z,
  f = f,
  eps = seq(1, 2.5, 0.25),
  cons = list(name = "marginal", g = g, qbar = qbar),
  controls = controls
)$beta
Our sensitivity analysis shows that small changes in the distribution of shocks used in their formula instrument lead to instrumental variable estimates that range from large negative effects to large positive effects.
Borusyak, K. and Hull, P. (2023). “Nonrandom Exposure to Exogenous Shocks.” Econometrica, 91(6), 2155–2185. https://doi.org/10.3982/ECTA19367