| Title: | Doubly Robust Estimation of Local Average Treatment Effects |
|---|---|
| Description: | Estimates the local average treatment effect (LATE) and the local average treatment effect on the treated (LATT) using observational data with a binary instrument, implementing the complete estimator suite of Sloczynski, Uysal, and Wooldridge: the doubly robust estimators of Sloczynski, Uysal, and Wooldridge (2022) <doi:10.48550/arXiv.2208.01300> -- inverse probability weighted regression adjustment (IPWRA), inverse probability weighting (IPW), augmented inverse probability weighting (AIPW), and regression adjustment (RA) -- and the Abadie-kappa weighting estimators of Sloczynski, Uysal, and Wooldridge (2025) <doi:10.1080/07350015.2024.2332763>. Supports linear, logistic, probit, Poisson, and fractional (fractional-logit and fractional-probit) outcome and treatment models, and instrument propensity scores estimated by maximum likelihood, covariate balancing (CBPS), or inverse probability tilting (IPT). Standard errors are computed jointly for all estimation stages by stacking the moment conditions of every model into a single M-estimation system; weak-instrument-robust Fieller confidence sets, cluster-aware bootstrap inference, design diagnostics, and a doubly robust Hausman-type test of unconfoundedness are included. Estimates and standard errors are validated against the authors' Stata commands 'drlate' (Statistical Software Components S459708) and 'kappalate' (S459257). |
| Authors: | Kailas Venkitasubramanian [aut, cre], S. Derya Uysal [ctb, cph] (Author of the original Stata package 'drlate'), Tymon Sloczynski [ctb, cph] (Author of the original Stata package 'drlate'), Jeffrey M. Wooldridge [ctb, cph] (Author of the original Stata package 'drlate') |
| Maintainer: | Kailas Venkitasubramanian <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3.1 |
| Built: | 2026-06-24 13:24:55 UTC |
| Source: | https://github.com/cran/drlate |
Computes standardized mean differences (SMDs) of the model covariates between the two instrument arms, before and after weighting by the inverse of the estimated instrument propensity score. Well-balanced weighted covariates (conventionally, absolute SMD below 0.1) indicate that the propensity score model is doing its job.
balance(object, ...)

## S3 method for class 'drlate'
balance(object, detail = FALSE, ...)
object |
A fitted |
... |
Currently unused. |
detail |
Logical. If |
The covariate set is the union of the columns of the instrument,
outcome, and treatment model matrices (the intercept is dropped). The
SMD denominator is the unweighted pooled standard deviation
in both columns, so the two columns are
directly comparable. Weighted arm means are Hájek means using the
inverse-propensity weights implied by the fit (for
estimand = "latt", the Z=0 arm uses the ATT odds weights
$\hat{p}(X) / (1 - \hat{p}(X))$, with $\hat{p}(X)$ the estimated instrument propensity score, matching the estimator).
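As a rough illustration of the computation described above, the following base-R sketch computes the unweighted and inverse-propensity-weighted SMD for a single simulated covariate. It is not the package's implementation: the simulated data, the variable names, the helper `hajek_mean()`, and the pooled-SD formula `sqrt((s1^2 + s0^2) / 2)` are all assumptions made for the example.

```r
# Minimal base-R sketch (not the package's code): SMD before and after weighting.
set.seed(1)
n <- 2000
x <- rnorm(n)                                  # one covariate (hypothetical)
z <- rbinom(n, 1, plogis(0.8 * x))             # instrument assignment depends on x

# Instrument propensity score by logit maximum likelihood
p_hat <- fitted(glm(z ~ x, family = binomial()))

# Inverse-propensity weights for each instrument arm
w <- ifelse(z == 1, 1 / p_hat, 1 / (1 - p_hat))

# Hajek (self-normalized) weighted mean
hajek_mean <- function(v, w) sum(w * v) / sum(w)

# Assumed pooled SD: unweighted, and the same denominator in both columns
sd_pooled <- sqrt((var(x[z == 1]) + var(x[z == 0])) / 2)

smd_unweighted <- (mean(x[z == 1]) - mean(x[z == 0])) / sd_pooled
smd_weighted <- (hajek_mean(x[z == 1], w[z == 1]) -
                   hajek_mean(x[z == 0], w[z == 0])) / sd_pooled

round(c(unweighted = smd_unweighted, weighted = smd_weighted), 3)
```

Because the denominator is held fixed, any change between the two values comes from the weighted arm means alone, which is what makes the two columns comparable.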
A data frame with one row per covariate and columns
variable, smd_unweighted, and smd_weighted; with detail = TRUE,
the four additional columns described above.
plot.drlate() with type = "balance" for the love plot.
Tests whether the estimated instrument propensity score balances the
covariates, using the overidentification test of Imai and Ratkovic (2014).
The propensity-score MLE score equations identify the coefficients; the
covariate-balancing (CBPS) moments are the overidentifying restrictions. A
large statistic is evidence that the propensity-score model does not balance
the covariates, which makes the test a misspecification diagnostic. It corresponds to the
Stata latebalance overid postestimation feature.
balance_test(object)
object |
A fitted |
An object of class drlate_balance_test: a list with statistic
(Hansen's J), df, p.value, ivmodel, and n, with a print method.
Imai, K. and Ratkovic, M. (2014). Covariate Balancing Propensity Score. Journal of the Royal Statistical Society B 76(1), 243–263.
balance() for the standardized-mean-difference diagnostics.
fit <- drlate(lwage ~ age + educ, nvstat ~ age + educ, rsncode ~ age + educ,
              data = drlate_sim)
balance_test(fit)
Compares the average of each covariate in the full estimation sample with
its average in the complier subpopulation, the latter computed with the
normalized Abadie kappa weights of kappa_weights(). Because the local
average treatment effect is a causal effect for compliers, knowing how
compliers differ from the population aids interpretation. It corresponds to the
Stata estat compliers postestimation feature.
complier_means(object, vars = NULL)
object |
A fitted |
vars |
Optional character vector selecting a subset of the model covariates. Defaults to all covariates across the three model formulas. |
Covariate values are reported on their original scale.
A data frame with one row per covariate and columns variable,
population_mean, complier_mean, and difference
(complier_mean - population_mean).
fit <- drlate(lwage ~ age + educ, nvstat ~ age + educ, rsncode ~ age + educ,
              data = drlate_sim)
complier_means(fit)
Confidence intervals for drlate fits
## S3 method for class 'drlate'
confint(object, parm, level = 0.95, method = c("default", "fieller"), ...)
object |
A fitted |
parm |
Coefficients to include (names or indices); defaults to all three reported quantities. |
level |
Confidence level. |
method |
|
... |
Currently unused. |
For method = "default", a numeric matrix with one row per requested
coefficient (parm) and two columns holding the lower and upper
confidence limits. The columns are labelled with the corresponding
percentiles (for the default 95% level, "2.5 %" and "97.5 %"). The
limits are Wald intervals from the joint sandwich covariance, or
percentile intervals from the resampling draws when the fit was computed
with vcov = "bootstrap".
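For intuition about the percentile intervals mentioned above, here is a generic base-R sketch of a cluster bootstrap. It is an illustration under stated assumptions, not the package's resampling code: the data, the stand-in statistic `stat()`, and the choice to resample whole clusters with replacement are all hypothetical, and `boot_reps` here is just a local variable.

```r
# Generic cluster-bootstrap sketch (not the package's code).
set.seed(123)

# Hypothetical clustered data: 50 clusters of 20 observations each
g <- rep(1:50, each = 20)
y <- rnorm(50)[g] + rnorm(length(g))         # shared cluster effect plus noise

stat <- function(idx) mean(y[idx])           # stand-in for the estimator

boot_reps <- 199
rows_by_cluster <- split(seq_along(g), g)    # row indices grouped by cluster

draws <- replicate(boot_reps, {
  # Resample whole clusters with replacement, keeping their rows together
  picked <- sample(names(rows_by_cluster), replace = TRUE)
  stat(unlist(rows_by_cluster[picked], use.names = FALSE))
})

# Percentile interval at the 95% level
ci <- quantile(draws, c(0.025, 0.975))
ci
```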
For method = "fieller", an object of class "drlate_fieller": a list
describing the weak-instrument-robust confidence set for the LATE/LATT
ratio (its endpoints and shape, the estimand name, and the confidence
level), with its own print method. Because a Fieller set need not be a
bounded interval, it is returned in this form rather than as a matrix of
endpoints.
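The reason a Fieller set need not be a bounded interval can be seen from a small base-R sketch. The function below is hypothetical (it is not the package's implementation, and `fieller_set()` and its arguments are names invented here): it inverts the test of `num - theta * den = 0`, which reduces to a quadratic inequality in `theta`.

```r
# Sketch of a Fieller confidence set for a ratio num / den (not the package's code).
# v_num, v_den: variances; cov_nd: covariance (0 if treated as uncorrelated).
fieller_set <- function(num, den, v_num, v_den, cov_nd = 0, level = 0.95) {
  crit2 <- qnorm(1 - (1 - level) / 2)^2
  # The set is {theta : a * theta^2 + b * theta + c0 <= 0}
  a  <- den^2 - crit2 * v_den
  b  <- -2 * (num * den - crit2 * cov_nd)
  c0 <- num^2 - crit2 * v_num
  if (a == 0) stop("knife-edge case a == 0 is not handled in this sketch")
  disc <- b^2 - 4 * a * c0
  if (a > 0 && disc >= 0) {
    # Denominator clearly nonzero: the usual bounded interval
    roots <- sort((-b + c(-1, 1) * sqrt(disc)) / (2 * a))
    list(shape = "interval", endpoints = roots)
  } else if (a < 0 && disc > 0) {
    # Weak denominator: (-Inf, lo] together with [hi, Inf)
    roots <- sort((-b + c(-1, 1) * sqrt(disc)) / (2 * a))
    list(shape = "complement of an interval", endpoints = roots)
  } else {
    # No value of theta can be rejected
    list(shape = "whole real line", endpoints = c(-Inf, Inf))
  }
}

strong <- fieller_set(num = 1, den = 2.0, v_num = 0.01, v_den = 0.01)
weak   <- fieller_set(num = 1, den = 0.1, v_num = 0.01, v_den = 0.01)
strong$shape
weak$shape
```

With a precisely estimated denominator the set is an ordinary interval around the ratio; when the denominator is not distinguishable from zero, the set becomes unbounded, which a two-column matrix of endpoints cannot represent.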
Tests whether the treatment is unconfounded given the covariates, using
the comparison proposed by Słoczyński, Uysal, and Wooldridge (2022,
Section 5), building on Donald, Hsu, and Lieli (2014). Under
one-sided noncompliance (nobody takes the treatment without the
instrument: $P(D = 1 \mid Z = 0) = 0$), the LATT identified
through the instrument equals the ATT identified through
unconfoundedness of the treatment, so a significant difference between
the doubly robust LATT estimate (which uses the instrument) and the
doubly robust ATT estimate (which does not) is evidence against
unconfoundedness. Unlike the textbook OLS-vs-IV Hausman test, this
comparison is robust to treatment effect heterogeneity.
dr_hausman(
  outcome,
  treatment,
  instrument,
  data,
  omodel = c("linear", "logit", "poisson"),
  tmodel = c("logit", "linear", "poisson"),
  ivmodel = c("logit", "ipt"),
  weights = NULL,
  cluster = NULL,
  pstolerance = 1e-05,
  subset = NULL
)
outcome |
A formula |
treatment |
A formula |
instrument |
A formula |
data |
A data frame containing all variables. |
omodel |
Outcome model family: |
tmodel |
Treatment model family: |
ivmodel |
Instrument propensity score model for the LATT half:
|
weights |
Optional sampling weights (a numeric vector, or a column
name in |
cluster |
Optional cluster identifier for clustered standard errors
(a vector, or a column name in |
pstolerance |
Overlap tolerance: estimation stops with an error if
any estimated instrument propensity score is below |
subset |
Optional logical or integer vector selecting rows of |
The DR ATT estimator follows the paper's equation (33): a treatment
propensity score $\hat{e}(X)$ is fitted by logit QMLE on the
treatment-equation covariates; the outcome model is fitted on the
untreated sample weighted by the odds $\hat{e}(X) / (1 - \hat{e}(X))$; and
the DR ATT estimate is the treated-sample mean outcome minus the mean
imputed counterfactual. The standard error of the difference comes from
stacking the moment conditions of both estimators (and the difference)
into one M-estimation system, so the covariance between them is
accounted for analytically, which is the analytic option suggested in the paper.
Note that the two halves adjust on their respective formulas: the LATT half's propensity score uses the instrument-equation covariates, while the ATT half's uses the treatment-equation covariates (both share the outcome model). Supply the same covariate set to all three formulas unless you intend them to differ.
An object of class "htest" with the z statistic, p-value, and
the DR LATT, DR ATT, and difference estimates.
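In symbols, the comparison can be sketched as follows. The notation is introduced here only for illustration: $\hat\tau_{\mathrm{LATT}}$ and $\hat\tau_{\mathrm{ATT}}$ denote the two doubly robust estimates and $\hat V$ their joint covariance matrix from the stacked system. The two-sided normal p-value is an assumption about how the z statistic is referred to its null distribution.

```latex
\hat\Delta = \hat\tau_{\mathrm{LATT}} - \hat\tau_{\mathrm{ATT}}, \qquad
\widehat{\operatorname{Var}}(\hat\Delta) = \hat V_{11} + \hat V_{22} - 2\,\hat V_{12}, \qquad
z = \frac{\hat\Delta}{\sqrt{\widehat{\operatorname{Var}}(\hat\Delta)}}, \qquad
p = 2\,\bigl(1 - \Phi(|z|)\bigr).
```

The cross term $\hat V_{12}$ is why the two estimators are stacked: both are computed on the same sample, so treating them as independent would generally misstate the variance of the difference.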
Słoczyński, T., S. D. Uysal, and J. M. Wooldridge (2022). "Doubly Robust Estimation of Local Average Treatment Effects Using Inverse Probability Weighted Regression Adjustment." doi:10.48550/arXiv.2208.01300
Donald, S. G., Y.-C. Hsu, and R. P. Lieli (2014). "Testing the Unconfoundedness Assumption via Inverse Probability Weighted Estimators of (L)ATT." Journal of Business & Economic Statistics 32(3), 395-415.
d <- drlate_sim
d$nvstat[d$rsncode == 0] <- 0L  # impose one-sided noncompliance
dr_hausman(lwage ~ age + educ, nvstat ~ age + educ, rsncode ~ age + educ,
           data = d)
Estimates the local average treatment effect (LATE) or the local average
treatment effect on the treated (LATT) with a binary instrument, following
Słoczyński, Uysal, and Wooldridge (2022). The function is a faithful R port of the Stata
package drlate (SSC S459708): point estimates come from sequential
weighted regressions, and standard errors are computed jointly for the
instrument propensity score, the outcome regression, the treatment
regression, and the causal estimand by stacking all moment conditions
into a single M-estimation system.
drlate(
  outcome,
  treatment,
  instrument,
  data,
  omodel = c("linear", "logit", "probit", "poisson", "flogit", "fprobit"),
  tmodel = c("logit", "probit", "linear", "poisson"),
  ivmodel = c("logit", "cbps", "ipt", "probit"),
  method = c("ipwra", "ipw", "aipw", "ra", "kappa", "kappa0", "kappa10"),
  estimand = c("late", "latt"),
  normalized = TRUE,
  weights = NULL,
  cluster = NULL,
  pstolerance = 1e-05,
  osample = FALSE,
  subset = NULL,
  keep_data = TRUE,
  vcov = c("analytic", "bootstrap"),
  boot_reps = 999L,
  boot_seed = NULL,
  cores = 1L
)
outcome |
A formula |
treatment |
A formula |
instrument |
A formula |
data |
A data frame containing all variables. |
omodel |
Outcome model family: |
tmodel |
Treatment model family: |
ivmodel |
Instrument propensity score model: |
method |
Estimator: |
estimand |
|
normalized |
Logical; use normalized moment conditions (default
|
weights |
Optional sampling weights (a numeric vector, or a column
name in |
cluster |
Optional cluster identifier for clustered standard errors
(a vector, or a column name in |
pstolerance |
Overlap tolerance: estimation stops with an error if
any estimated instrument propensity score is below |
osample |
Logical; if |
subset |
Optional logical or integer vector selecting rows of |
keep_data |
Logical; retain the internal estimation context (model
matrices, fitted propensity scores, weights) on the returned object
(default |
vcov |
|
boot_reps |
Number of bootstrap replications (default 999). |
boot_seed |
Optional seed for reproducible bootstrap draws.
Results are reproducible for a fixed number of |
cores |
Number of CPU cores for the bootstrap (default 1). Values
above 1 use a PSOCK cluster and require the package to be installed
(not merely loaded with |
An object of class "drlate", a list with components including
coefficients (the causal estimate, the numerator effect of Z on Y,
and the denominator effect of Z on D), vcov3 (their variance matrix,
diagonal by construction, as in the Stata package), vcov_full (the
joint variance matrix of all stacked parameters), theta (all stacked
parameter estimates), N, dmeanz1, dmeanz0, and the call.
For method = "kappa10" only the causal estimate is reported
(the estimator is a difference of two ratios, so no single
numerator/denominator pair exists). For "kappa" and "kappa0" the
third coefficient is the mean of the corresponding kappa weight: under
the LATE assumptions it estimates the same complier share as the IPW
first-stage contrast (the population ATE of Z on D), but it is a
different sample statistic and the two can diverge under propensity
score misspecification.
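To make the numerator/denominator structure of `coefficients` concrete, here is a base-R sketch of a normalized IPW ratio on simulated data. It illustrates the general idea only and is not the package's implementation: the data-generating process, the helper `ipw_contrast()`, and all variable names are hypothetical, and the design is deliberately simpler than `drlate_sim`.

```r
# Sketch of a normalized IPW ratio estimate of the LATE (not the package's code).
set.seed(42)
n <- 5000
x <- rnorm(n)
z <- rbinom(n, 1, plogis(0.5 * x))           # instrument, valid only given x
complier <- rbinom(n, 1, 0.6)                # 60% compliers, the rest never-takers
d <- z * complier                            # only encouraged compliers take treatment
y <- 1 + 0.5 * d + 0.3 * x + rnorm(n, sd = 0.5)   # true complier effect: 0.5

# Instrument propensity score by logit maximum likelihood
p_hat <- fitted(glm(z ~ x, family = binomial()))

# Normalized (Hajek) IPW contrast between the Z = 1 and Z = 0 arms
ipw_contrast <- function(v, z, p) {
  sum(z * v / p) / sum(z / p) -
    sum((1 - z) * v / (1 - p)) / sum((1 - z) / (1 - p))
}

numerator   <- ipw_contrast(y, z, p_hat)     # effect of Z on Y
denominator <- ipw_contrast(d, z, p_hat)     # effect of Z on D (complier share)
late_hat    <- numerator / denominator

round(c(late = late_hat, numerator = numerator, denominator = denominator), 3)
```

The three quantities mirror the three reported coefficients: the causal estimate is the ratio of the instrument's effect on the outcome to its effect on the treatment.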
Słoczyński, T., S. D. Uysal, and J. M. Wooldridge (2022). "Doubly Robust Estimation of Local Average Treatment Effects Using Inverse Probability Weighted Regression Adjustment." doi:10.48550/arXiv.2208.01300
Słoczyński, T., S. D. Uysal, and J. M. Wooldridge (2025). "Abadie's Kappa and Weighting Estimators of the Local Average Treatment Effect." Journal of Business & Economic Statistics 43(1), 164–177. doi:10.1080/07350015.2024.2332763
data(drlate_sim)
fit <- drlate(lwage ~ age + educ, nvstat ~ age + educ, rsncode ~ age + educ,
              data = drlate_sim)
summary(fit)
Runs several estimators on the same specification and collects the
causal estimates with their confidence intervals — the sensitivity
comparison applied papers routinely report. Formula restrictions are
handled automatically: method = "ipw" drops the outcome/treatment
covariates and method = "ra" drops the instrument covariates (each
with a message), matching the requirements of those estimators.
drlate_compare(
  outcome,
  treatment,
  instrument,
  data,
  methods = c("ipwra", "ipw", "aipw", "ra"),
  both_norms = FALSE,
  ...
)
outcome |
A formula |
treatment |
A formula |
instrument |
A formula |
data |
A data frame containing all variables. |
methods |
Estimators to run (any of the |
both_norms |
Logical; also run the unnormalized variants of
|
... |
Passed on to |
Because IPW carries no outcome/treatment regressions and RA carries no instrument propensity score, the automatic formula adjustment means the rows do not share a single adjustment specification: differences between the IPW or RA row and the doubly robust rows reflect both the estimator and the reduced specification. Read the comparison as a robustness display, not as a test that isolates estimator choice; the doubly robust rows (IPWRA, AIPW) are the like-for-like pair.
An object of class "drlate_compare": a data frame with columns
method, normalized, estimate, se, ci_lo, ci_hi, with a
print method and a dot-whisker plot method.
cmp <- drlate_compare(lwage ~ age + educ, nvstat ~ age + educ,
                      rsncode ~ age + educ, data = drlate_sim)
cmp
A simulated dataset with a binary instrument, a binary treatment with
two-sided noncompliance, and continuous, positive, and binary outcome
variables, designed to exercise every model family supported by
drlate(). The complier average treatment effect (LATE) used in the
data-generating process is 0.5. The treatment is genuinely endogenous
(compliance type shifts the baseline outcome, so naive OLS is biased
upward) and the instrument is only conditionally valid (its propensity
depends on age and educ, so the raw Wald ratio is biased too).
drlate_sim
A data frame with 2,000 rows and 7 variables:
continuous outcome
positive outcome (for Poisson models), exp(lwage / 2)
binary outcome (for logit models)
binary treatment
binary instrument
continuous covariate
factor covariate with levels hs, college, graduate
Simulated; see data-raw/drlate_sim.R in the package sources.
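Returns the per-observation Abadie kappa weight implied by a fitted
drlate() object,

$$\kappa_i = 1 - \frac{D_i (1 - Z_i)}{1 - \hat{p}(X_i)} - \frac{(1 - D_i) Z_i}{\hat{p}(X_i)},$$

where $\hat{p}(X_i)$ is the estimated instrument propensity score. The kappa
weights identify the complier subpopulation: for any function $g$ of the
data,

$$E[g(Y, D, X) \mid D_1 > D_0] = \frac{E[\kappa \, g(Y, D, X)]}{E[\kappa]}$$

(Abadie 2003). They are the weights used by complier_means() and correspond to
what Stata's estat compliers, genkappa() produces.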
kappa_weights(object, normalize = TRUE)
object |
A fitted |
normalize |
Logical. If |
A numeric vector with one entry per estimation-sample observation.
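The following base-R sketch shows how Abadie's kappa weights can be computed by hand and used to form a complier mean. It is not the package's implementation: the simulated data and variable names are hypothetical, and the normalization shown (rescaling the weights to sum to one) is an assumption made for the example.

```r
# Sketch of Abadie (2003) kappa weights and a complier mean (not the package's code).
set.seed(7)
n <- 4000
x <- rnorm(n)
z <- rbinom(n, 1, plogis(0.5 * x))             # instrument depends on x
# Compliers are more likely at high x, so the complier mean of x exceeds 0
complier <- rbinom(n, 1, plogis(x))
d <- z * complier                              # non-compliers are never-takers

# Instrument propensity score by logit maximum likelihood
p_hat <- fitted(glm(z ~ x, family = binomial()))

# Kappa weight, one value per observation (can be negative)
kappa <- 1 - d * (1 - z) / (1 - p_hat) - (1 - d) * z / p_hat

# Assumed normalization for this sketch: rescale so the weights sum to one
kappa_norm <- kappa / sum(kappa)

complier_mean_x   <- sum(kappa_norm * x)       # kappa-weighted mean of x
population_mean_x <- mean(x)

round(c(population = population_mean_x, complier = complier_mean_x,
        complier_share = mean(kappa)), 3)
```

Individual kappa values are not probabilities, and some are negative; only their weighted averages have a complier interpretation.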
fit <- drlate(lwage ~ age + educ, nvstat ~ age + educ, rsncode ~ age + educ,
              data = drlate_sim)
head(kappa_weights(fit))
Diagnostic plots for drlate fits
## S3 method for class 'drlate'
plot(
  x,
  type = c("overlap", "balance", "balance_density", "weights"),
  bins = 30,
  geom = c("histogram", "density"),
  var = NULL,
  ...
)
x |
A fitted |
type |
One of:
|
bins |
Number of histogram bins for |
geom |
For |
var |
For |
... |
Currently unused. |
A ggplot object.