| Title: | Model Selection Between TWFE and ETWFE |
|---|---|
| Description: | Estimates both a vanilla two-way fixed effects (TWFE) model and an extended TWFE (ETWFE) model, then selects between them using Cochran's Q test for heterogeneity. When ETWFE wins, reports the heterogeneity fraction (I-squared) and cohort-time estimates with empirical Bayes shrinkage and Bonferroni multiplicity correction. Methods build on Wooldridge (2025) <doi:10.1007/s00181-025-02807-z> and Callaway and Sant'Anna (2021) <doi:10.1016/j.jeconom.2020.12.001>. |
| Authors: | Paul von Hippel [aut, cre] |
| Maintainer: | Paul von Hippel <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.1 |
| Built: | 2026-06-02 19:16:33 UTC |
| Source: | https://github.com/cran/selectTWFE |
County-level panel data on log teen employment and log population in 500 US counties from 2003 to 2007, used to study the effect of state minimum wage increases on teen employment. Treatment is staggered: counties are grouped into cohorts by the year their state first raised the minimum wage above the federal level.
mpdta
A data frame with 2,500 rows and 6 variables:
County FIPS code (unit identifier)
Calendar year (2003–2007)
Year of first minimum wage increase for the county's state; 0 for never-treated counties
Log teen employment (outcome)
Log county population (covariate)
Binary indicator equal to 1 if the county is treated in that year (i.e., year >= first.treat and first.treat > 0)
Callaway, B. and Sant'Anna, P.H.C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200–230. Originally distributed with the did package.
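The rule for the treatment indicator can be checked on a small toy panel. This is a minimal sketch in base R, not code from the package: the `county` column and the three cohort values are hypothetical, and only `year` and `first.treat` are names taken from the description above.

```r
# Toy panel: 3 hypothetical counties observed 2003-2007.
# first.treat = 0 marks a never-treated county.
panel <- expand.grid(year = 2003:2007, county = c(1, 2, 3))
panel$first.treat <- c(0, 2004, 2006)[panel$county]

# A county-year is treated if its cohort is ever treated and the
# year is at or after the year of first treatment.
panel$treat <- as.integer(panel$first.treat > 0 &
                          panel$year >= panel$first.treat)

# Number of treated county-years in each cohort
treated_by_cohort <- tapply(panel$treat, panel$first.treat, sum)
print(treated_by_cohort)
```

The never-treated cohort contributes no treated observations, and later cohorts contribute fewer treated years, which is what makes the design staggered.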
When ETWFE wins, produces an event study plot of cohort-time estimates, optionally shrunk toward the grand mean via empirical Bayes. When TWFE wins, produces a simple ATT plot.
## S3 method for class 'select_twfe'
plot(x, ...)
| x | A select_twfe object |
|---|---|
| ... | Ignored |
Invisibly returns the input x (a select_twfe object).
Called for its side effect of drawing a ggplot to the active
graphics device: an event study plot of cohort-time effects with
Bonferroni-adjusted 95% confidence intervals when ETWFE is selected,
or a single-point ATT plot when TWFE is selected.
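To illustrate what a Bonferroni adjustment does to the interval width, here is a minimal base R sketch. It assumes normal critical values and a family size equal to the number of cohort-time estimates; the package may define either differently. The estimates and standard errors are made-up numbers.

```r
# Hypothetical cohort-time estimates and standard errors
est <- c(-0.02, -0.05, -0.01, -0.08)
se  <- c(0.012, 0.015, 0.020, 0.018)

m     <- length(est)   # number of intervals in the family (assumption)
level <- 0.95

# Unadjusted and Bonferroni-adjusted two-sided critical values
z_unadj <- qnorm(1 - (1 - level) / 2)
z_bonf  <- qnorm(1 - (1 - level) / (2 * m))

# Intervals with joint coverage of at least 95% across all m estimates
ci <- data.frame(
  estimate = est,
  lower    = est - z_bonf * se,
  upper    = est + z_bonf * se
)
print(ci)
```

The adjusted critical value grows with the number of cohort-time cells, so the plotted intervals are wider than pointwise 95% intervals.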
Print method for select_twfe objects
## S3 method for class 'select_twfe'
print(x, digits = 4, ...)
| x | A select_twfe object |
|---|---|
| digits | Number of decimal places for estimates (default 4) |
| ... | Ignored |
Invisibly returns the input x (a select_twfe object).
Called for its side effect of printing a formatted summary of the model
selection result to the console.
Print method for summary.select_twfe objects
## S3 method for class 'summary.select_twfe'
print(x, digits = 4, ...)
| x | A summary.select_twfe object |
|---|---|
| digits | Number of significant digits (default 4) |
| ... | Ignored |
Invisibly returns the input x (a summary.select_twfe
object). Called for its side effect of printing a formatted summary to
the console.
Estimates both a vanilla two-way fixed effects (TWFE) model and an extended TWFE (ETWFE) model, then selects between them using Cochran's Q test for heterogeneity.
When ETWFE wins, reports the heterogeneity fraction (I-squared) and cohort-time estimates with empirical Bayes shrinkage and multiplicity correction.
select_twfe(
  fml,
  tvar,
  gvar,
  data,
  ivar = NULL,
  cgroup = c("notyet", "never"),
  vcov = NULL,
  selection_criterion = c("Q", "estimated_mse"),
  alpha = 0.05,
  shrink_heterogeneity = TRUE,
  ...
)
| fml | A two-sided formula: outcome ~ controls. Use 1 on RHS if no controls. |
|---|---|
| tvar | Time variable (unquoted). |
| gvar | Group/cohort variable (unquoted). Should be 0 or Inf for never-treated. |
| data | A data frame. |
| ivar | Unit ID variable (unquoted). Optional; inferred from gvar FE if NULL. |
| cgroup | Comparison group: "notyet" (default) or "never". |
| vcov | Variance-covariance specification passed to fixest. Recommended: ~unit_id. |
| selection_criterion | Character: "Q" (default) or "estimated_mse". Determines how the model is selected. "Q" uses Cochran's Q test for heterogeneity (selects ETWFE if Q significantly indicates heterogeneity). "estimated_mse" uses bias-corrected estimated mean squared error comparison (legacy option, not recommended; Monte Carlo studies show the Q criterion performs better). |
| alpha | Significance level for Cochran's Q test (default 0.05). Only used when selection_criterion="Q". |
| shrink_heterogeneity | If TRUE (default) and ETWFE wins, apply empirical Bayes shrinkage to cohort-time estimates in plot output. |
| ... | Additional arguments passed to etwfe(). |
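The "Q" criterion can be illustrated with the textbook form of Cochran's Q and the I-squared statistic. The sketch below is base R and is not the package's implementation: it treats the cohort-time estimates as independent, whereas the package may account for their covariance. The helper name `cochran_q` and the input numbers are hypothetical.

```r
# Textbook Cochran's Q with inverse-variance weights (independence assumed)
cochran_q <- function(est, se, alpha = 0.05) {
  w      <- 1 / se^2                      # inverse-variance weights
  pooled <- sum(w * est) / sum(w)         # precision-weighted mean
  Q      <- sum(w * (est - pooled)^2)     # heterogeneity statistic
  df     <- length(est) - 1
  p      <- pchisq(Q, df = df, lower.tail = FALSE)
  # I-squared: share of variation beyond what sampling error explains
  i2     <- if (Q > 0) max(0, (Q - df) / Q) else 0
  list(Q = Q, df = df, p_value = p, i2 = i2,
       selected = if (p < alpha) "etwfe" else "twfe")
}

# Identical effects: no evidence of heterogeneity, so TWFE is kept
homog  <- cochran_q(est = c(-0.05, -0.05, -0.05), se = c(0.01, 0.01, 0.01))

# Widely dispersed effects: Q rejects homogeneity, so ETWFE is chosen
heterog <- cochran_q(est = c(-0.10, 0.00, 0.10), se = c(0.01, 0.01, 0.01))

print(homog$selected)
print(heterog$selected)
```

A smaller alpha makes the rule more conservative about abandoning TWFE, since stronger evidence of heterogeneity is needed before ETWFE is selected.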
A select_twfe object containing:
Character: "etwfe" or "twfe"
Aggregate ATT estimate from ETWFE
Standard error of ETWFE ATT
ATT estimate from naive TWFE
Standard error of TWFE ATT
Estimated bias of TWFE = ATT(TWFE) - ATT(ETWFE)
Var(TWFE) - Var(ETWFE); included for reference
Score-based sandwich estimate of Cov(ATT_TWFE, ATT_ETWFE)
Estimated correlation between the two ATT estimators
Bias-corrected MSE(TWFE) - MSE(ETWFE) when selection_criterion="estimated_mse"
Cochran's Q statistic when selection_criterion="Q"
P-value from Cochran's Q test when selection_criterion="Q"
Heterogeneity fraction (only if ETWFE wins)
Data frame of EB-shrunk cohort-time estimates (if ETWFE wins and shrink_heterogeneity=TRUE). Both the point estimates and the standard errors are shrunk: the SE is the naive EB posterior SD. The original (unshrunk) SE is retained in the se_raw column.
The fitted etwfe model object
The fitted feols TWFE model object
The emfx() aggregated ETWFE results
Logical: whether shrinkage was applied
The criterion used for model selection
Significance level used for Cochran's Q (if applicable)
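As a rough illustration of empirical Bayes shrinkage toward a grand mean, the sketch below uses a standard normal-normal model in base R. It rests on assumptions not confirmed by this manual: a method-of-moments estimate of the between-estimate variance, a precision-weighted grand mean, and a posterior SD of sqrt(B) times the raw SE. The helper `eb_shrink` and every column name except se_raw are hypothetical.

```r
# Normal-normal empirical Bayes shrinkage (illustrative assumptions only)
eb_shrink <- function(est, se) {
  w          <- 1 / se^2
  grand_mean <- sum(w * est) / sum(w)            # precision-weighted mean
  tau2       <- max(0, var(est) - mean(se^2))    # method-of-moments variance
  b          <- tau2 / (tau2 + se^2)             # shrinkage weight in [0, 1)
  data.frame(
    estimate_raw = est,
    estimate     = grand_mean + b * (est - grand_mean),  # pulled toward mean
    se           = sqrt(b) * se,                         # naive posterior SD
    se_raw       = se                                    # unshrunk SE
  )
}

# Hypothetical cohort-time estimates and standard errors
est <- c(-0.10, -0.02, 0.03, -0.06)
se  <- c(0.02, 0.03, 0.05, 0.02)

shrunk <- eb_shrink(est, se)
print(shrunk)
```

Noisier estimates (larger se_raw) receive a smaller weight and move further toward the grand mean, which is why shrinkage tends to tame outlying cohort-time cells in an event study plot.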
Returns a named list of key quantities from the model selection, suitable
for programmatic use. This is distinct from print(), which formats
results for human reading.
## S3 method for class 'select_twfe'
summary(object, ...)
| object | A select_twfe object |
|---|---|
| ... | Ignored |
A list with elements: selected, att_etwfe,
se_etwfe, att_twfe, se_twfe, bias_twfe,
var_diff, mse_diff, Q, Q_pval,
i2 (heterogeneity fraction I-squared; NULL if TWFE wins),
selection_criterion, and Q_significance_level.