Introduction to cdid

Introduction


  • Recent advances in econometrics have focused on the identification, estimation, and inference of treatment effects in staggered designs, i.e. where treatment timing varies across units. Despite the variety of available approaches, none stand out as universally superior, and many overlook key challenges such as estimator efficiency or the frequent occurrence of unbalanced panel data.

    In our recent paper (Bellégo, Benatia, Dortet-Bernadet, Journal of Econometrics, 2024), we address these gaps by extending the well-established method developed by Callaway and Sant’Anna (Journal of Econometrics, 2021). The cdid library is therefore intended to be used in connection with the did library: https://bcallaway11.github.io/did/articles/did-basics.html. Our approach improves efficiency and adapts seamlessly to unbalanced panels, making it particularly valuable for researchers facing data with missing observations.

    This page introduces the cdid R library that implements the methods from our paper, showcasing their relevance for empirical research. For those short on time, here are the key takeaways:

    • cdid offers greater precision (smaller standard errors) than did with balanced panel data, by aggregating smaller ATT parameters efficiently.

    • cdid outperforms did with unbalanced panels, particularly if errors are serially correlated (e.g., in the presence of unit-level fixed effects). Note, however, that units observed only once across all time periods must be dropped before estimation because cdid does not use them.

    • cdid is less prone to bias in cases of attrition, notably if missingness is related to unobservable heterogeneity. For cases where missingness is related to observables, additional results from our paper have yet to be implemented in the library (see the paper).
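
The requirement in the second takeaway, dropping units that are observed only once, can be handled with a few lines of base R before calling the estimator. The sketch below is a minimal illustration on made-up data: the column names id and date mirror the simulated data used later on this page, while toy_panel, n_obs, and keep_ids are hypothetical names chosen for the example.

```r
# Toy long-format panel (hypothetical data): unit 3 is observed only once.
toy_panel <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 4, 4),
  date = c(1, 2, 3, 1, 3, 2, 2, 3),
  Y    = c(0.5, 0.7, 1.9, 1.1, 2.4, 0.3, 0.9, 1.0)
)

# Count the number of observed periods per unit.
n_obs <- table(toy_panel$id)

# Keep only units observed at least twice.
keep_ids <- names(n_obs)[n_obs >= 2]
toy_panel_clean <- toy_panel[as.character(toy_panel$id) %in% keep_ids, ]

print(sort(unique(toy_panel_clean$id)))
```

The same filter can be applied to any long-format dataset before estimation, as long as the unit identifier column is substituted for id.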

Features of the cdid library

What it supports:

  • Handles any form of missing data.

  • Allows for control units comprising only “never treated” or also “not-yet-treated.”

  • Provides two weighting matrix options for GMM-based aggregation of ΔkATT(g, t) into ATT(g, t): identity or 2-step. Small-sample simulations show that:

    • Identity: Best for smaller datasets with more missing data and fewer overidentifying restrictions, like rotating panel surveys.

    • 2-step: Best for larger, balanced datasets with many overidentifying restrictions, or unbalanced datasets with relatively few missing observations.

  • Implements simple propensity scores for treatment assignment where predictor columns X are treated as constant across time periods. For time-varying covariates, users can include additional columns (Xt,Xt+1,etc.) in the dataset.
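
To give some intuition for the two weighting matrix options listed above, here is a small minimum-distance sketch in base R. It is not the code used inside cdid. It assumes a stylized setting with three periods, in which two target effects (relative to period 1) are overidentified by three hypothetical short- and long-difference estimates, and it assumes that the covariance matrix V of those estimates is known. In standard GMM practice the second step uses the inverse of an estimated covariance matrix; all numbers and names below (R, delta_hat, V, aggregate_md, variance_md) are made up for illustration.

```r
# Illustrative minimum-distance aggregation (NOT the cdid internals).
# Targets: theta = (effect in period 2, effect in period 3), relative to period 1.
# Hypothetical building blocks and what each identifies:
#   d12: change between periods 1 and 2  -> theta_1
#   d23: change between periods 2 and 3  -> theta_2 - theta_1
#   d13: change between periods 1 and 3  -> theta_2
R <- matrix(c( 1, 0,
              -1, 1,
               0, 1), ncol = 2, byrow = TRUE)

delta_hat <- c(1.70, -0.24, 1.50)   # hypothetical estimates of (d12, d23, d13)
V <- diag(c(0.02, 0.02, 0.08)^2)    # hypothetical covariance matrix of delta_hat

# Weighted least-squares solution of delta_hat = R %*% theta with weights W.
aggregate_md <- function(delta_hat, R, W) {
  drop(solve(t(R) %*% W %*% R, t(R) %*% W %*% delta_hat))
}

# Sandwich variance of the aggregated estimator for a given W.
variance_md <- function(R, W, V) {
  A <- solve(t(R) %*% W %*% R, t(R) %*% W)
  A %*% V %*% t(A)
}

W_identity <- diag(nrow(R))   # identity weighting
W_optimal  <- solve(V)        # inverse-covariance weighting (second step)

theta_identity <- aggregate_md(delta_hat, R, W_identity)
theta_optimal  <- aggregate_md(delta_hat, R, W_optimal)

se_identity <- sqrt(diag(variance_md(R, W_identity, V)))
se_optimal  <- sqrt(diag(variance_md(R, W_optimal, V)))

print(rbind(theta_identity, theta_optimal))
print(rbind(se_identity, se_optimal))
```

With a known V, the inverse-covariance weights can never do worse than the identity in terms of variance. In practice V must be estimated, and a noisy estimate can offset that gain, which is consistent with the small-sample guidance given in the list above.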

Current limitations:

  • The generalized attrition model from our paper, which allows for dynamic attrition processes (e.g., past outcomes influencing observation status), is not yet implemented.

  • The library currently supports only the IPW estimator, though extensions to doubly robust estimators, as in the did library, are planned.
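
For readers unfamiliar with the IPW estimator mentioned above, the following base-R sketch illustrates the general idea in the simplest two-period, two-group setting. It is a textbook-style illustration on simulated data, not the code used inside cdid or did. The data-generating process, the true effect of 1.5, and all variable names are assumptions made for the example.

```r
# Simulated two-period example (hypothetical data-generating process).
set.seed(42)
n  <- 2000
x  <- rnorm(n)                            # time-constant covariate
d  <- rbinom(n, 1, plogis(0.5 * x))       # treatment probability depends on x
dy <- 1 + 0.5 * x + 1.5 * d + rnorm(n)    # outcome change; true ATT is 1.5

# Step 1: propensity score from a logit of treatment on the covariate.
ps <- fitted(glm(d ~ x, family = binomial()))

# Step 2: normalized weights. Controls are reweighted by ps / (1 - ps)
# so that their covariate distribution resembles that of the treated.
w_treat <- d / mean(d)
w_ctrl  <- ps * (1 - d) / (1 - ps)
w_ctrl  <- w_ctrl / mean(w_ctrl)

# Step 3: weighted difference in outcome changes.
att_ipw   <- mean(w_treat * dy) - mean(w_ctrl * dy)
att_naive <- mean(dy[d == 1]) - mean(dy[d == 0])

print(c(ipw = att_ipw, naive = att_naive))
```

Because the outcome trend depends on x and treated units tend to have larger x, the unweighted comparison mixes the treatment effect with a difference in trends, whereas the reweighted comparison targets the effect on the treated.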

Getting started

Installing the library currently requires remotes because we cannot yet submit the package to CRAN (Happy holidays!). The command is:

remotes::install_github("joelcuerrier/cdid", ref = "main", build_vignettes = TRUE, force = TRUE)

Balanced data

#Load the relevant packages
library(did) #Callaway and Sant'Anna (2021) 
library(cdid) #Bellego, Benatia, and Dortet-Bernadet (2024)
## Registered S3 methods overwritten by 'cdid':
##   method     from
##   print.MP   did 
##   summary.MP did
## 
## Attaching package: 'cdid'
## The following objects are masked from 'package:did':
## 
##     DIDparams, MP
set.seed(123)
#Generate a dataset: 500 units, 8 time periods, with unit fixed-effects. 
# The parameter sigma_alpha controls the unit-specific time-persistent unobserved
# heterogeneity.
data0=fonction_simu_attrition(N = 500,T = 8,theta2_alpha_Gg = 0.5, 
                              lambda1_alpha_St = 0, sigma_alpha = 2, 
                              sigma_epsilon = 0.1, tprob = 0.5)

# The true values of the coefficients are based on time-to-treatment. The treatment
# effect is zero before the treatment, 1.75 in the first treated period, 1.5 in the
# second, 1.25 in the third, 1 in the fourth, 0.75 in the fifth, 0.5 in the sixth,
# etc.

#We keep all observations, so we have a balanced dataset
data0$S <- 1

#Look at the data
head(data0,20)
##     id date          Y           X date_G S
## 496  1    1  1.0375153 -0.50916654      0 1
## 497  2    1  1.6642546  0.90485255      0 1
## 498  3    1  5.4180717  0.10405218      0 1
## 499  4    1  2.3776851 -1.07075107      0 1
## 500  5    1  2.2881996  1.15012013      0 1
## 501  6    1  5.7148449  0.92078829      0 1
## 502  7    1  3.1825990  0.90263073      0 1
## 503  8    1 -0.3141815  1.21615254      0 1
## 504  9    1  0.6587048  1.88246516      0 1
## 505 10    1  1.2679689  1.20559750      0 1
## 506 11    1  4.5992371  0.38356416      0 1
## 507 12    1  2.9192416  0.26520075      0 1
## 508 13    1  2.7608238  0.86819721      0 1
## 509 14    1  2.5006867  1.31001699      0 1
## 510 15    1  0.9133735 -0.03968035      0 1
## 511 16    1  5.7266600  0.81569113      0 1
## 512 17    1  3.0689819  1.96726726      0 1
## 513 18    1 -1.8001064  0.89171991      0 1
## 514 19    1  3.4962817  0.30157933      0 1
## 515 20    1  1.2549496  0.72405483      0 1

The dataset is a data frame with 3960 observations and 6 columns: id to keep track of unit ids, date to keep track of time periods, an outcome variable Y, a predictor X, a treatment date date_G (zero for the control group), and a sampling dummy S used to keep track of which observations are used in the estimation. There are 495 unique ids and 8 time periods.
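
The structure described above (number of units, number of periods, balance, and cohort sizes) can be checked directly in base R. The sketch below does this on a small made-up panel that reuses the column names id, date, date_G, and S; the object toy and its values are hypothetical. Applied to the simulated data of this page, the same balance check corresponds to 495 units times 8 periods, i.e. 3960 rows.

```r
# Hypothetical long-format panel with the same column names as data0.
toy <- expand.grid(id = 1:4, date = 1:3)
toy$date_G <- c(0, 2, 3, 0)[toy$id]   # treatment date per unit, 0 = never treated
toy$S <- 1                            # every observation is kept

n_units   <- length(unique(toy$id))
n_periods <- length(unique(toy$date))

# A panel is balanced when every unit is observed in every period.
is_balanced <- nrow(toy[toy$S == 1, ]) == n_units * n_periods

# Number of units per treatment cohort (one row per unit).
cohort_sizes <- table(toy$date_G[!duplicated(toy$id)])

print(c(units = n_units, periods = n_periods))
print(is_balanced)
print(cohort_sizes)
```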

#run did library on balanced panel
did.results = att_gt(
  yname="Y",
  tname="date",
  idname = "id",
  gname = "date_G",
  xformla = ~X,
  data = data0,
  weightsname = NULL,
  allow_unbalanced_panel = FALSE,
  panel = TRUE,
  control_group = "notyettreated",
  alp = 0.05,
  bstrap = TRUE,
  cband = TRUE,
  biters = 1000,
  clustervars = NULL,
  est_method = "ipw",
  base_period = "varying",
  print_details = FALSE,
  pl = FALSE,
  cores = 1
)

#run cdid with 2step weighting matrix
result_2step = att_gt_cdid(yname="Y", tname="date",
                         idname="id",
                         gname="date_G",
                         xformla=~X,
                         data=data0,
                         control_group="notyettreated",
                         alp=0.05,
                         bstrap=TRUE,
                         biters=1000,
                         clustervars=NULL,
                         cband=TRUE,
                         est_method="2-step",
                         base_period="varying",
                         print_details=FALSE,
                         pl=FALSE,
                         cores=1)

#run cdid with identity weighting matrix
result_id = att_gt_cdid(yname="Y", tname="date",
                        idname="id",
                        gname="date_G",
                        xformla=~X,
                        data=data0,
                        control_group="notyettreated",
                        alp=0.05,
                        bstrap=TRUE,
                        biters=1000,
                        clustervars=NULL,
                        cband=TRUE,
                        est_method="Identity",
                        base_period="varying",
                        print_details=FALSE,
                        pl=FALSE,
                        cores=1)
print(did.results)
## 
## Call:
## att_gt(yname = "Y", tname = "date", idname = "id", gname = "date_G", 
##     xformla = ~X, data = data0, panel = TRUE, allow_unbalanced_panel = FALSE, 
##     control_group = "notyettreated", weightsname = NULL, alp = 0.05, 
##     bstrap = TRUE, cband = TRUE, biters = 1000, clustervars = NULL, 
##     est_method = "ipw", base_period = "varying", print_details = FALSE, 
##     pl = FALSE, cores = 1)
## 
## Bellego C., Benatia D., and V. Dortet-Bernadet (2024). "The Chained Difference-in-Differences." Journal of Econometrics. doi:10.1016/j.jeconom.2023.11.002.
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##      3    2   0.0183     0.0275       -0.0618      0.0984  
##      3    3   1.7586     0.0319        1.6657      1.8515 *
##      3    4   1.4714     0.0235        1.4030      1.5398 *
##      3    5   1.2372     0.0276        1.1567      1.3177 *
##      3    6   0.9510     0.0266        0.8735      1.0285 *
##      3    7   0.7746     0.0255        0.7003      0.8488 *
##      3    8   0.5150     0.0229        0.4481      0.5818 *
##      4    2  -0.0283     0.0405       -0.1463      0.0898  
##      4    3   0.0322     0.0279       -0.0491      0.1136  
##      4    4   1.6992     0.0329        1.6034      1.7950 *
##      4    5   1.4894     0.0338        1.3909      1.5878 *
##      4    6   1.2562     0.0310        1.1658      1.3466 *
##      4    7   1.0153     0.0364        0.9092      1.1213 *
##      4    8   0.7839     0.0310        0.6935      0.8742 *
##      5    2  -0.0164     0.0305       -0.1054      0.0726  
##      5    3  -0.0162     0.0284       -0.0991      0.0666  
##      5    4   0.0219     0.0351       -0.0806      0.1243  
##      5    5   1.7020     0.0291        1.6171      1.7869 *
##      5    6   1.4653     0.0281        1.3834      1.5472 *
##      5    7   1.2536     0.0299        1.1666      1.3407 *
##      5    8   1.0071     0.0432        0.8811      1.1332 *
##      6    2  -0.0147     0.0372       -0.1232      0.0937  
##      6    3   0.0364     0.0351       -0.0660      0.1388  
##      6    4  -0.0364     0.0301       -0.1240      0.0513  
##      6    5  -0.0125     0.0317       -0.1048      0.0799  
##      6    6   1.7606     0.0353        1.6576      1.8636 *
##      6    7   1.5163     0.0330        1.4202      1.6125 *
##      6    8   1.2603     0.0292        1.1752      1.3454 *
##      7    2  -0.0489     0.0267       -0.1266      0.0288  
##      7    3   0.0042     0.0248       -0.0681      0.0765  
##      7    4  -0.0294     0.0212       -0.0911      0.0322  
##      7    5   0.0276     0.0293       -0.0577      0.1130  
##      7    6   0.0022     0.0323       -0.0920      0.0963  
##      7    7   1.7774     0.0284        1.6945      1.8603 *
##      7    8   1.4987     0.0267        1.4209      1.5765 *
##      8    2   0.0106     0.0363       -0.0953      0.1164  
##      8    3  -0.0170     0.0332       -0.1137      0.0797  
##      8    4  -0.0063     0.0361       -0.1114      0.0988  
##      8    5  -0.0059     0.0319       -0.0989      0.0870  
##      8    6   0.0299     0.0296       -0.0563      0.1162  
##      8    7  -0.0034     0.0351       -0.1056      0.0988  
##      8    8   1.7633     0.0412        1.6432      1.8833 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.7097
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Inverse Probability Weighting
# Note that standard errors are smaller for most ATT(g,t) when using cdid
print(result_2step)
## 
## Call:
## att_gt_cdid(yname = "Y", tname = "date", idname = "id", gname = "date_G", 
##     xformla = ~X, data = data0, control_group = "notyettreated", 
##     alp = 0.05, bstrap = TRUE, cband = TRUE, biters = 1000, clustervars = NULL, 
##     est_method = "2-step", base_period = "varying", print_details = FALSE, 
##     pl = FALSE, cores = 1)
## 
## Bellego C., Benatia D., and V. Dortet-Bernadet (2024). "The Chained Difference-in-Differences." Journal of Econometrics. doi:10.1016/j.jeconom.2023.11.002.
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##      3    1  -0.0184     0.0280       -0.1026      0.0657  
##      3    3   1.7588     0.0326        1.6611      1.8565 *
##      3    4   1.4670     0.0246        1.3933      1.5408 *
##      3    5   1.2375     0.0287        1.1515      1.3235 *
##      3    6   0.9509     0.0266        0.8709      1.0308 *
##      3    7   0.7724     0.0263        0.6936      0.8513 *
##      3    8   0.5146     0.0205        0.4531      0.5762 *
##      4    1  -0.0180     0.0148       -0.0624      0.0264  
##      4    2  -0.0083     0.0126       -0.0460      0.0295  
##      4    4   1.7023     0.0247        1.6283      1.7764 *
##      4    5   1.4911     0.0317        1.3960      1.5863 *
##      4    6   1.2708     0.0259        1.1930      1.3486 *
##      4    7   1.0318     0.0299        0.9419      1.1216 *
##      4    8   0.7827     0.0286        0.6969      0.8686 *
##      5    1   0.0016     0.0043       -0.0114      0.0145  
##      5    2   0.0027     0.0029       -0.0061      0.0115  
##      5    3   0.0024     0.0052       -0.0132      0.0181  
##      5    5   1.7127     0.0202        1.6522      1.7733 *
##      5    6   1.4733     0.0205        1.4119      1.5346 *
##      5    7   1.2630     0.0188        1.2066      1.3195 *
##      5    8   1.0113     0.0241        0.9390      1.0837 *
##      6    1   0.0116     0.0046       -0.0021      0.0252  
##      6    2   0.0038     0.0035       -0.0068      0.0144  
##      6    3   0.0214     0.0075       -0.0012      0.0440  
##      6    4   0.0056     0.0023       -0.0013      0.0125  
##      6    6   1.7324     0.0234        1.6622      1.8026 *
##      6    7   1.5131     0.0279        1.4293      1.5968 *
##      6    8   1.2524     0.0205        1.1910      1.3138 *
##      7    1   0.0150     0.0088       -0.0114      0.0413  
##      7    2  -0.0016     0.0019       -0.0074      0.0042  
##      7    3   0.0007     0.0010       -0.0024      0.0038  
##      7    4  -0.0085     0.0057       -0.0254      0.0085  
##      7    5   0.0000     0.0012       -0.0037      0.0037  
##      7    7   1.7792     0.0140        1.7373      1.8211 *
##      7    8   1.4893     0.0149        1.4446      1.5340 *
##      8    1  -0.0007     0.0036       -0.0115      0.0100  
##      8    2   0.0034     0.0035       -0.0072      0.0140  
##      8    3  -0.0024     0.0043       -0.0153      0.0106  
##      8    4  -0.0134     0.0050       -0.0284      0.0016  
##      8    5  -0.0133     0.0063       -0.0322      0.0055  
##      8    6   0.0048     0.0036       -0.0060      0.0157  
##      8    8   1.7780     0.0207        1.7161      1.8400 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.0001117182
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  2-step
print(result_id)
## 
## Call:
## att_gt_cdid(yname = "Y", tname = "date", idname = "id", gname = "date_G", 
##     xformla = ~X, data = data0, control_group = "notyettreated", 
##     alp = 0.05, bstrap = TRUE, cband = TRUE, biters = 1000, clustervars = NULL, 
##     est_method = "Identity", base_period = "varying", print_details = FALSE, 
##     pl = FALSE, cores = 1)
## 
## Bellego C., Benatia D., and V. Dortet-Bernadet (2024). "The Chained Difference-in-Differences." Journal of Econometrics. doi:10.1016/j.jeconom.2023.11.002.
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##      3    1  -0.0161     0.0304       -0.1051      0.0729  
##      3    3   1.7595     0.0329        1.6631      1.8559 *
##      3    4   1.4704     0.0236        1.4013      1.5395 *
##      3    5   1.2359     0.0269        1.1571      1.3146 *
##      3    6   0.9511     0.0293        0.8654      1.0368 *
##      3    7   0.7739     0.0259        0.6980      0.8498 *
##      3    8   0.5147     0.0227        0.4482      0.5812 *
##      4    1  -0.0037     0.0353       -0.1070      0.0996  
##      4    2  -0.0329     0.0289       -0.1175      0.0516  
##      4    4   1.6983     0.0311        1.6073      1.7893 *
##      4    5   1.4881     0.0357        1.3836      1.5926 *
##      4    6   1.2569     0.0294        1.1708      1.3430 *
##      4    7   1.0156     0.0354        0.9118      1.1193 *
##      4    8   0.7842     0.0317        0.6913      0.8771 *
##      5    1   0.0113     0.0279       -0.0706      0.0931  
##      5    2  -0.0062     0.0310       -0.0969      0.0845  
##      5    3  -0.0203     0.0348       -0.1224      0.0817  
##      5    5   1.7020     0.0311        1.6109      1.7931 *
##      5    6   1.4654     0.0264        1.3881      1.5427 *
##      5    7   1.2523     0.0316        1.1598      1.3449 *
##      5    8   1.0046     0.0417        0.8825      1.1267 *
##      6    1   0.0273     0.0337       -0.0713      0.1260  
##      6    2   0.0109     0.0319       -0.0826      0.1044  
##      6    3   0.0486     0.0318       -0.0445      0.1416  
##      6    4   0.0122     0.0306       -0.0774      0.1017  
##      6    6   1.7610     0.0376        1.6510      1.8710 *
##      6    7   1.5164     0.0346        1.4151      1.6178 *
##      6    8   1.2595     0.0297        1.1726      1.3464 *
##      7    1   0.0456     0.0285       -0.0380      0.1292  
##      7    2  -0.0062     0.0242       -0.0771      0.0647  
##      7    3   0.0014     0.0248       -0.0711      0.0740  
##      7    4  -0.0295     0.0259       -0.1054      0.0464  
##      7    5  -0.0026     0.0315       -0.0949      0.0896  
##      7    7   1.7772     0.0305        1.6878      1.8666 *
##      7    8   1.4992     0.0232        1.4311      1.5672 *
##      8    1  -0.0054     0.0353       -0.1086      0.0979  
##      8    2   0.0018     0.0367       -0.1056      0.1092  
##      8    3  -0.0132     0.0347       -0.1149      0.0886  
##      8    4  -0.0221     0.0344       -0.1227      0.0785  
##      8    5  -0.0274     0.0315       -0.1198      0.0650  
##      8    6   0.0034     0.0346       -0.0979      0.1047  
##      8    8   1.7633     0.0405        1.6448      1.8817 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.6624945303
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Identity

Now that we have the ATT(g,t) for all three methods, we can compare aggregate parameters using the following commands

# Aggregation
agg.es.did <- aggte(MP = did.results, type = 'dynamic')
agg.es.2step <- aggte(MP = result_2step, type = 'dynamic')
agg.es.id <- aggte(MP = result_id, type = 'dynamic')
# Print results
agg.es.did
## 
## Call:
## aggte(MP = did.results, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:  
##     ATT    Std. Error     [ 95%  Conf. Int.]  
##  1.1276        0.0119     1.1042       1.151 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult.  Conf. Band]  
##          -6   0.0106     0.0350       -0.0843      0.1054  
##          -5  -0.0326     0.0222       -0.0927      0.0276  
##          -4  -0.0062     0.0190       -0.0576      0.0453  
##          -3  -0.0018     0.0145       -0.0412      0.0376  
##          -2  -0.0059     0.0136       -0.0427      0.0310  
##          -1   0.0097     0.0125       -0.0243      0.0437  
##           0   1.7446     0.0136        1.7079      1.7813 *
##           1   1.4884     0.0128        1.4536      1.5232 *
##           2   1.2512     0.0150        1.2107      1.2916 *
##           3   0.9875     0.0194        0.9351      1.0399 *
##           4   0.7787     0.0201        0.7242      0.8332 *
##           5   0.5150     0.0238        0.4504      0.5795 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Inverse Probability Weighting
#Note that standard errors are smaller, notably so for the 2step estimator
agg.es.2step
## 
## Call:
## aggte(MP = result_2step, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:  
##     ATT    Std. Error     [ 95%  Conf. Int.]  
##  1.1285        0.0107     1.1076      1.1495 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult.  Conf. Band]  
##          -7  -0.0007     0.0034       -0.0101      0.0086  
##          -6   0.0090     0.0041       -0.0022      0.0203  
##          -5   0.0030     0.0020       -0.0026      0.0086  
##          -4  -0.0017     0.0014       -0.0057      0.0022  
##          -3  -0.0027     0.0042       -0.0142      0.0089  
##          -2  -0.0030     0.0065       -0.0210      0.0151  
##           0   1.7442     0.0102        1.7161      1.7723 *
##           1   1.4867     0.0101        1.4588      1.5145 *
##           2   1.2546     0.0119        1.2217      1.2876 *
##           3   0.9939     0.0170        0.9470      1.0409 *
##           4   0.7771     0.0192        0.7241      0.8300 *
##           5   0.5146     0.0226        0.4523      0.5769 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  2-step
agg.es.id
## 
## Call:
## aggte(MP = result_id, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:  
##     ATT    Std. Error     [ 95%  Conf. Int.]  
##  1.1272        0.0118     1.1042      1.1503 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult.  Conf. Band]  
##          -7  -0.0054     0.0328       -0.0980      0.0873  
##          -6   0.0232     0.0249       -0.0473      0.0937  
##          -5   0.0039     0.0170       -0.0441      0.0519  
##          -4   0.0005     0.0150       -0.0419      0.0429  
##          -3  -0.0019     0.0148       -0.0436      0.0399  
##          -2  -0.0093     0.0123       -0.0441      0.0255  
##           0   1.7447     0.0137        1.7061      1.7833 *
##           1   1.4880     0.0129        1.4514      1.5246 *
##           2   1.2505     0.0150        1.2081      1.2928 *
##           3   0.9869     0.0196        0.9315      1.0424 *
##           4   0.7785     0.0198        0.7224      0.8347 *
##           5   0.5147     0.0227        0.4503      0.5790 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Identity
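
The dynamic aggregation used above averages the ATT(g,t) by event time e = t - g. The sketch below reproduces that logic by hand for event times 0 and 1, using the did point estimates printed earlier on this page. It assumes equal weights across groups, which is a simplification: did weights groups by their relative size, so the hand-computed averages are close to, but not identical to, the reported 1.7446 and 1.4884. The object names att_gt_tab and es are hypothetical.

```r
# ATT(g,t) point estimates copied from the did output above (event times 0 and 1).
att_gt_tab <- data.frame(
  group = c(3, 4, 5, 6, 7, 8,  3, 4, 5, 6, 7),
  time  = c(3, 4, 5, 6, 7, 8,  4, 5, 6, 7, 8),
  att   = c(1.7586, 1.6992, 1.7020, 1.7606, 1.7774, 1.7633,
            1.4714, 1.4894, 1.4653, 1.5163, 1.4987)
)

# Event time: number of periods elapsed since the group's treatment date.
att_gt_tab$event_time <- att_gt_tab$time - att_gt_tab$group

# Equal-weight average by event time (assumption: did uses group-size weights).
es <- aggregate(att ~ event_time, data = att_gt_tab, FUN = mean)
print(es)
```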

Unbalanced data

Let us now do the same but with an unbalanced panel dataset.

#Generate a dataset: 500 units, 8 time periods, with unit fixed-effects (alpha)
set.seed(123)
data0=fonction_simu_attrition(N = 500,T = 8,theta2_alpha_Gg = 0.5, 
                              lambda1_alpha_St = 0, sigma_alpha = 2, 
                              sigma_epsilon = 0.1, tprob = 0.5)

#We discard observations based on sampling indicator S
data0 <- data0[data0$S==1,]

#run did (with panel = FALSE, did treats the data as repeated cross sections)
did.results =  att_gt(
  yname="Y",
  tname="date",
  idname = "id",
  gname = "date_G",
  xformla = ~X,
  data = data0,
  weightsname = NULL,
  allow_unbalanced_panel = FALSE,
  panel = FALSE,
  control_group = "notyettreated",
  alp = 0.05,
  bstrap = TRUE,
  cband = TRUE,
  biters = 1000,
  clustervars = NULL,
  est_method = "ipw",
  base_period = "varying",
  print_details = FALSE,
  pl = FALSE,
  cores = 1
)

#run cdid with 2step weighting matrix
result_2step = att_gt_cdid(yname="Y", tname="date",
                         idname="id",
                         gname="date_G",
                         xformla=~X,
                         data=data0,
                         control_group="notyettreated",
                         alp=0.05,
                         bstrap=TRUE,
                         biters=1000,
                         clustervars=NULL,
                         cband=TRUE,
                         est_method="2-step",
                         base_period="varying",
                         print_details=FALSE,
                         pl=FALSE,
                         cores=1)

#run cdid with identity weighting matrix
result_id = att_gt_cdid(yname="Y", tname="date",
                        idname="id",
                        gname="date_G",
                        xformla=~X,
                        data=data0,
                        control_group="notyettreated",
                        alp=0.05,
                        bstrap=TRUE,
                        biters=1000,
                        clustervars=NULL,
                        cband=TRUE,
                        est_method="Identity",
                        base_period="varying",
                        print_details=FALSE,
                        pl=FALSE,
                        cores=1)
#Note the precision gains
print(did.results)
## 
## Call:
## att_gt(yname = "Y", tname = "date", idname = "id", gname = "date_G", 
##     xformla = ~X, data = data0, panel = FALSE, allow_unbalanced_panel = FALSE, 
##     control_group = "notyettreated", weightsname = NULL, alp = 0.05, 
##     bstrap = TRUE, cband = TRUE, biters = 1000, clustervars = NULL, 
##     est_method = "ipw", base_period = "varying", print_details = FALSE, 
##     pl = FALSE, cores = 1)
## 
## Bellego C., Benatia D., and V. Dortet-Bernadet (2024). "The Chained Difference-in-Differences." Journal of Econometrics. doi:10.1016/j.jeconom.2023.11.002.
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##      3    2  -0.2381     0.3802       -1.3972      0.9211  
##      3    3   2.1285     0.3718        0.9952      3.2618 *
##      3    4   1.7048     0.4093        0.4570      2.9526 *
##      3    5   1.8590     0.4304        0.5470      3.1710 *
##      3    6   1.5577     0.4366        0.2270      2.8885 *
##      3    7   1.0118     0.3933       -0.1870      2.2107  
##      3    8   1.1837     0.3939       -0.0171      2.3846  
##      4    2   0.1676     0.4592       -1.2322      1.5674  
##      4    3   0.3427     0.4486       -1.0249      1.7103  
##      4    4   1.6564     0.5137        0.0905      3.2224 *
##      4    5   1.3559     0.4325        0.0376      2.6742 *
##      4    6   1.5466     0.5311       -0.0725      3.1657  
##      4    7   1.3461     0.4887       -0.1436      2.8357  
##      4    8   1.2161     0.3851        0.0420      2.3901 *
##      5    2  -0.1077     0.7223       -2.3095      2.0941  
##      5    3   0.7784     0.7322       -1.4537      3.0105  
##      5    4  -0.3655     0.6700       -2.4080      1.6769  
##      5    5   2.6003     0.6686        0.5621      4.6385 *
##      5    6   1.6121     0.6104       -0.2487      3.4729  
##      5    7   1.8961     0.6918       -0.2126      4.0049  
##      5    8   1.2898     0.6432       -0.6710      3.2505  
##      6    2  -0.2072     0.5333       -1.8328      1.4183  
##      6    3   0.0161     0.5532       -1.6702      1.7024  
##      6    4   0.2298     0.6010       -1.6023      2.0618  
##      6    5   0.0950     0.6550       -1.9017      2.0918  
##      6    6   1.6410     0.5121        0.0798      3.2021 *
##      6    7   1.7781     0.5653        0.0550      3.5013 *
##      6    8   2.0181     0.5634        0.3006      3.7355 *
##      7    2   0.3906     0.7141       -1.7863      2.5674  
##      7    3  -0.0790     0.5789       -1.8437      1.6857  
##      7    4   0.1251     0.6254       -1.7814      2.0317  
##      7    5  -0.2276     0.6918       -2.3365      1.8813  
##      7    6   0.4632     0.5537       -1.2246      2.1510  
##      7    7   1.3587     0.5433       -0.2975      3.0149  
##      7    8   1.7903     0.5054        0.2495      3.3310 *
##      8    2  -0.4526     0.5854       -2.2371      1.3320  
##      8    3   0.3770     0.5910       -1.4247      2.1786  
##      8    4   0.1857     0.5961       -1.6314      2.0028  
##      8    5   0.5065     0.5635       -1.2112      2.2241  
##      8    6  -0.1824     0.5977       -2.0044      1.6396  
##      8    7  -0.2489     0.6269       -2.1600      1.6622  
##      8    8   1.8350     0.6339       -0.0975      3.7674  
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.97337
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Inverse Probability Weighting
print(result_2step)
## 
## Call:
## att_gt_cdid(yname = "Y", tname = "date", idname = "id", gname = "date_G", 
##     xformla = ~X, data = data0, control_group = "notyettreated", 
##     alp = 0.05, bstrap = TRUE, cband = TRUE, biters = 1000, clustervars = NULL, 
##     est_method = "2-step", base_period = "varying", print_details = FALSE, 
##     pl = FALSE, cores = 1)
## 
## Bellego C., Benatia D., and V. Dortet-Bernadet (2024). "The Chained Difference-in-Differences." Journal of Econometrics. doi:10.1016/j.jeconom.2023.11.002.
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##      3    1   0.0679     0.0118        0.0314      0.1044 *
##      3    3   1.7812     0.0235        1.7085      1.8539 *
##      3    4   1.4826     0.0124        1.4443      1.5209 *
##      3    5   1.2726     0.0171        1.2197      1.3256 *
##      3    6   0.9400     0.0166        0.8885      0.9916 *
##      3    7   0.8012     0.0142        0.7573      0.8450 *
##      3    8   0.4987     0.0146        0.4535      0.5439 *
##      4    1  -0.1172     0.0215       -0.1837     -0.0507 *
##      4    2  -0.0725     0.0108       -0.1059     -0.0391 *
##      4    4   1.6902     0.0216        1.6232      1.7572 *
##      4    5   1.3226     0.0182        1.2663      1.3789 *
##      4    6   1.1866     0.0259        1.1063      1.2668 *
##      4    7   0.9646     0.0236        0.8917      1.0376 *
##      4    8   0.6788     0.0237        0.6053      0.7522 *
##      5    1   0.0281     0.0110       -0.0061      0.0623  
##      5    2   0.1029     0.0113        0.0678      0.1380 *
##      5    3   0.0537     0.0123        0.0155      0.0919 *
##      5    5   1.7718     0.0109        1.7381      1.8056 *
##      5    6   1.5217     0.0125        1.4829      1.5605 *
##      5    7   1.2101     0.0119        1.1733      1.2468 *
##      5    8   1.0782     0.0133        1.0369      1.1195 *
##      6    1   0.0000     0.0193       -0.0598      0.0598  
##      6    2  -0.0279     0.0169       -0.0802      0.0244  
##      6    3   0.0961     0.0174        0.0423      0.1500 *
##      6    4  -0.1992     0.0179       -0.2545     -0.1439 *
##      6    6   1.6180     0.0202        1.5555      1.6805 *
##      6    7   1.4415     0.0164        1.3909      1.4922 *
##      6    8   1.1778     0.0189        1.1192      1.2363 *
##      7    1  -0.0588     0.0159       -0.1081     -0.0095 *
##      7    2  -0.0403     0.0137       -0.0827      0.0022  
##      7    3  -0.0196     0.0114       -0.0549      0.0157  
##      7    4  -0.0129     0.0158       -0.0619      0.0361  
##      7    5  -0.0902     0.0185       -0.1475     -0.0329 *
##      7    7   1.7156     0.0189        1.6570      1.7742 *
##      7    8   1.5285     0.0154        1.4809      1.5761 *
##      8    1   0.1377     0.0226        0.0676      0.2077 *
##      8    2   0.0502     0.0159        0.0009      0.0995 *
##      8    3   0.0809     0.0188        0.0226      0.1392 *
##      8    4  -0.0412     0.0184       -0.0982      0.0158  
##      8    5   0.0740     0.0229        0.0030      0.1449 *
##      8    6   0.1242     0.0206        0.0603      0.1881 *
##      8    8   1.8980     0.0274        1.8132      1.9828 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  2-step
print(result_id)
## 
## Call:
## att_gt_cdid(yname = "Y", tname = "date", idname = "id", gname = "date_G", 
##     xformla = ~X, data = data0, control_group = "notyettreated", 
##     alp = 0.05, bstrap = TRUE, cband = TRUE, biters = 1000, clustervars = NULL, 
##     est_method = "Identity", base_period = "varying", print_details = FALSE, 
##     pl = FALSE, cores = 1)
## 
## Bellego C., Benatia D., and V. Dortet-Bernadet (2024). "The Chained Difference-in-Differences." Journal of Econometrics. doi:10.1016/j.jeconom.2023.11.002.
## Group-Time Average Treatment Effects:
##  Group Time ATT(g,t) Std. Error [95% Simult.  Conf. Band]  
##      3    1  -0.0254     0.0328       -0.1211      0.0702  
##      3    3   1.7594     0.0444        1.6298      1.8890 *
##      3    4   1.4888     0.0332        1.3920      1.5856 *
##      3    5   1.2262     0.0281        1.1444      1.3081 *
##      3    6   0.9125     0.0338        0.8140      1.0110 *
##      3    7   0.7831     0.0358        0.6786      0.8877 *
##      3    8   0.5122     0.0302        0.4241      0.6003 *
##      4    1  -0.0239     0.0633       -0.2088      0.1609  
##      4    2  -0.0734     0.0365       -0.1799      0.0332  
##      4    4   1.6734     0.0527        1.5196      1.8272 *
##      4    5   1.4696     0.0498        1.3242      1.6150 *
##      4    6   1.2621     0.0465        1.1265      1.3977 *
##      4    7   0.9711     0.0478        0.8315      1.1107 *
##      4    8   0.7824     0.0435        0.6555      0.9093 *
##      5    1   0.0168     0.0391       -0.0973      0.1308  
##      5    2   0.0581     0.0330       -0.0383      0.1546  
##      5    3   0.0054     0.0311       -0.0853      0.0961  
##      5    5   1.6640     0.0284        1.5811      1.7468 *
##      5    6   1.4700     0.0370        1.3620      1.5780 *
##      5    7   1.2105     0.0294        1.1249      1.2962 *
##      5    8   1.0682     0.0432        0.9423      1.1942 *
##      6    1   0.0302     0.0388       -0.0831      0.1435  
##      6    2   0.0348     0.0379       -0.0759      0.1455  
##      6    3   0.0551     0.0405       -0.0630      0.1731  
##      6    4  -0.0106     0.0424       -0.1343      0.1130  
##      6    6   1.7572     0.0580        1.5880      1.9263 *
##      6    7   1.5040     0.0395        1.3888      1.6192 *
##      6    8   1.2899     0.0379        1.1792      1.4006 *
##      7    1  -0.0031     0.0321       -0.0968      0.0906  
##      7    2  -0.0047     0.0281       -0.0866      0.0772  
##      7    3   0.0041     0.0259       -0.0715      0.0796  
##      7    4  -0.0066     0.0337       -0.1050      0.0918  
##      7    5  -0.0532     0.0389       -0.1668      0.0604  
##      7    7   1.7418     0.0381        1.6306      1.8531 *
##      7    8   1.5147     0.0338        1.4161      1.6133 *
##      8    1   0.0036     0.0549       -0.1566      0.1638  
##      8    2  -0.0277     0.0372       -0.1364      0.0809  
##      8    3   0.0098     0.0512       -0.1397      0.1593  
##      8    4  -0.0422     0.0372       -0.1509      0.0665  
##      8    5  -0.0240     0.0418       -0.1458      0.0979  
##      8    6   0.0017     0.0444       -0.1279      0.1312  
##      8    8   1.7798     0.0479        1.6401      1.9194 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## P-value for pre-test of parallel trends assumption:  0.4719657727
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Identity
# Aggregation: combine the group-time ATTs into event-study (dynamic) effects
# Other aggregation types are available; see aggte() in the did library
agg.es.did <- aggte(MP = did.results, type = 'dynamic')
agg.es.2step <- aggte(MP = result_2step, type = 'dynamic')
agg.es.id <- aggte(MP = result_id, type = 'dynamic')
# Note the precision gains: compare the standard errors across the three outputs
agg.es.did
## 
## Call:
## aggte(MP = did.results, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:  
##     ATT    Std. Error     [ 95%  Conf. Int.]  
##  1.5077        0.1936     1.1283      1.8871 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult.  Conf. Band]  
##          -6  -0.4526     0.5680       -2.0262      1.1211  
##          -5   0.3833     0.4289       -0.8048      1.5714  
##          -4  -0.0339     0.3478       -0.9974      0.9297  
##          -3   0.1529     0.3174       -0.7264      1.0322  
##          -2   0.1255     0.3110       -0.7361      0.9871  
##          -1   0.0064     0.2239       -0.6138      0.6267  
##           0   1.8525     0.2140        1.2596      2.4453 *
##           1   1.6546     0.2268        1.0265      2.2828 *
##           2   1.8346     0.2740        1.0755      2.5937 *
##           3   1.4177     0.2692        0.6719      2.1635 *
##           4   1.1034     0.2774        0.3348      1.8720 *
##           5   1.1837     0.3764        0.1409      2.2265 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Inverse Probability Weighting
agg.es.2step
## 
## Call:
## aggte(MP = result_2step, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:  
##     ATT    Std. Error     [ 95%  Conf. Int.]  
##  1.1078        0.0083     1.0916      1.1241 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult.  Conf. Band]  
##          -7   0.1377     0.0239        0.0702      0.2051 *
##          -6  -0.0030     0.0158       -0.0476      0.0417  
##          -5   0.0138     0.0124       -0.0212      0.0487  
##          -4  -0.0166     0.0082       -0.0397      0.0065  
##          -3   0.0281     0.0116       -0.0048      0.0610  
##          -2  -0.0204     0.0113       -0.0523      0.0114  
##           0   1.7439     0.0107        1.7137      1.7742 *
##           1   1.4572     0.0099        1.4292      1.4852 *
##           2   1.2144     0.0103        1.1853      1.2434 *
##           3   0.9866     0.0128        0.9503      1.0229 *
##           4   0.7462     0.0156        0.7021      0.7904 *
##           5   0.4987     0.0145        0.4578      0.5396 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  2-step
agg.es.id
## 
## Call:
## aggte(MP = result_id, type = "dynamic")
## 
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna.  "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015> 
## 
## 
## Overall summary of ATT's based on event-study/dynamic aggregation:  
##     ATT    Std. Error     [ 95%  Conf. Int.]  
##  1.1233        0.0155     1.0929      1.1536 *
## 
## 
## Dynamic Effects:
##  Event time Estimate Std. Error [95% Simult.  Conf. Band]  
##          -7   0.0036     0.0566       -0.1485      0.1557  
##          -6  -0.0157     0.0246       -0.0818      0.0503  
##          -5   0.0129     0.0231       -0.0492      0.0749  
##          -4   0.0042     0.0185       -0.0455      0.0538  
##          -3   0.0119     0.0198       -0.0411      0.0650  
##          -2  -0.0262     0.0157       -0.0683      0.0159  
##           0   1.7317     0.0185        1.6819      1.7815 *
##           1   1.4897     0.0166        1.4450      1.5344 *
##           2   1.2482     0.0182        1.1992      1.2972 *
##           3   0.9750     0.0256        0.9063      1.0436 *
##           4   0.7828     0.0269        0.7107      0.8549 *
##           5   0.5122     0.0309        0.4292      0.5952 *
## ---
## Signif. codes: `*' confidence band does not cover 0
## 
## Control Group:  Not Yet Treated,  Anticipation Periods:  0
## Estimation Method:  Identity
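To put a number on the precision gains noted above, the short sketch below compares the overall ATT standard errors from the three aggregated summaries. The values are copied by hand from the printed output (0.1936 for did, 0.0083 for cdid 2-step, 0.0155 for cdid Identity). The names `se_overall` and `se_ratio` are illustrative and not part of either library. In a live session the same numbers could presumably be read from the `overall.se` element of each `aggte` result, which is an assumption about the returned objects rather than something shown on this page.

```r
# Overall ATT standard errors, copied from the printed summaries above.
# (Assumption: in a live session these would be agg.es.did$overall.se, etc.)
se_overall <- c(did = 0.1936, cdid_2step = 0.0083, cdid_identity = 0.0155)

# Ratio of the did standard error to each estimator's standard error.
# A ratio above 1 means the estimator is more precise than did here.
se_ratio <- se_overall[["did"]] / se_overall

print(round(se_ratio, 1))
```

These ratios describe this single simulated dataset only. The point estimates also differ across estimators, so the comparison concerns precision and not agreement of the estimates.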