Package 'drape' reference manual

Title:	Doubly Robust Average Partial Effects
Description:	Doubly robust average partial effect estimation. This implementation contains methods for adding additional smoothness to plug-in regression procedures and for estimating score functions using smoothing splines. Details of the method can be found in Harvey Klyne and Rajen D. Shah (2023) <doi:10.48550/arXiv.2308.09207>.
Authors:	Harvey Klyne [aut, cre, cph]
Maintainer:	Harvey Klyne <[email protected]>
License:	MIT + file LICENSE
Version:	0.0.2
Built:	2024-11-29 04:05:06 UTC
Source:	CRAN

Estimate the score function of the d'th covariate using a polynomial basis.

Description

Computes the score function estimate when rho(X) is assumed to lie within the span of the polynomial basis of X.

Usage

basis_poly(X, d, degree = 2, lambda = NULL)
basis_poly(X, d, degree = 2, lambda = NULL)

Arguments

`X`	matrix of covariates.
`d`	integer index of covariate of interest.
`degree`	maximum degree of polynomial terms.
`lambda`	optional scalar penalty, if "NULL" chosen via cross-validation.

Value

list containing the estimated score function "rho", which takes matrix input and yields a vector of score estimates.

Examples

set.seed(0)
X <- matrix(stats::rnorm(200), ncol=4)
bs <- basis_poly(X=X, d=1, degree=2)
bs$rho(X)
set.seed(0)
X <- matrix(stats::rnorm(200), ncol=4)
bs <- basis_poly(X=X, d=1, degree=2)
bs$rho(X)

Generate simulation data and evaluate estimators, with sample splitting.

Description

Generate simulation data and evaluate estimators, with sample splitting.

Usage

compare(n, ex_setting, f_setting, nfold = 5)
compare(n, ex_setting, f_setting, nfold = 5)

Arguments

`n`	integer number of samples. For "401k" ex_setting this is ignored and the whole data set is used.
`ex_setting`	string "normal", "mixture2", "mixture3", "logistic", "t4", "401k".
`f_setting`	string "plm", "additive", "interaction".
`nfold`	integer number of cross-validation folds.

Value

list containing estimates, standard error estimates, and sample theta (for debugging).

Evaluate estimators by training nuisance functions on training set and evaluating them on test set.

Description

Evaluate estimators by training nuisance functions on training set and evaluating them on test set.

Usage

compare_evaluate(train, test, ex_setting, f_setting, regression, sm_bw_out)
compare_evaluate(train, test, ex_setting, f_setting, regression, sm_bw_out)

Arguments

`train`	list containing vector of responses y and matrix of predictors X = (x,z).
`test`	list containing vector of responses y and matrix of predictors X = (x,z).
`ex_setting`	string "normal", "mixture2", "mixture3", "logistic", "t4", "401k".
`f_setting`	string "plm", "additive", "interaction".
`regression`	Optional fitted regression.
`sm_bw_out`	Output of cv_resmooth.

Value

list containing f, df, and score estimates evaluated on the test set.

Generate simulation data and evaluate OLS estimator.

Description

Generate simulation data and evaluate OLS estimator.

Usage

compare_lm(n, ex_setting, f_setting)
compare_lm(n, ex_setting, f_setting)

Arguments

`n`	integer number of samples. For "401k" ex_setting this is ignored and the whole data set is used.
`ex_setting`	string "normal", "mixture2", "mixture3", "logistic", "t4", "401k".
`f_setting`	string "plm", "additive", "interaction".

Value

list containing estimate, standard error estimate, and sample theta (for debugging).

Generate simulation data and evaluate partially linear estimator.

Description

Generate simulation data and evaluate partially linear estimator.

Usage

compare_partially_linear(n, ex_setting, f_setting)
compare_partially_linear(n, ex_setting, f_setting)

Arguments

`n`	integer number of samples. For "401k" ex_setting this is ignored and the whole data set is used.
`ex_setting`	string "normal", "mixture2", "mixture3", "logistic", "t4", "401k".
`f_setting`	string "plm", "additive", "interaction".

Value

list containing estimate, standard error estimate, and sample theta (for debugging).

Generate simulation data and evaluate Rothenhausler estimator.

Description

Generate simulation data and evaluate Rothenhausler estimator.

Usage

compare_rothenhausler(n, ex_setting, f_setting)
compare_rothenhausler(n, ex_setting, f_setting)

Arguments

`n`	integer number of samples. For "401k" ex_setting this is ignored and the whole data set is used.
`ex_setting`	string "normal", "mixture2", "mixture3", "logistic", "t4", "401k".
`f_setting`	string "plm", "additive", "interaction".

Value

list containing estimate, standard error estimate, and sample theta (for debugging).

K-fold cross-validation for resmoothing bandwidth.

Description

Picks the largest resmoothing bandwidth achieving a cross-validation score within some specified tolerance of the original regression.

Usage

cv_resmooth(
  X,
  y,
  d = 1,
  regression,
  tol = 2,
  prefit = FALSE,
  foldid = NULL,
  bw = exp(seq(-5, 2, 0.2))/(2 * sqrt(3)) * stats::sd(X[, d]),
  nfolds = 5L,
  n_points = 101,
  sd_trim = 5
)
cv_resmooth(
  X,
  y,
  d = 1,
  regression,
  tol = 2,
  prefit = FALSE,
  foldid = NULL,
  bw = exp(seq(-5, 2, 0.2))/(2 * sqrt(3)) * stats::sd(X[, d]),
  nfolds = 5L,
  n_points = 101,
  sd_trim = 5
)

Arguments

`X`	matrix of covariates.
`y`	vector of responses.
`d`	integer index of covariate to be smoothed along.
`regression`	If prefit = FALSE this is a function which takes input data of the form (X,y), and returns a prediction function. This prediction function itself accepts matrix input same width as X, and returns a vector of y-predictions, and optionally a vector of derivative predictions. If prefit = TRUE then this is a list of length nfolds with each entry containing a component "fit" consisting of a prediction function taking matrix input and returning a vector.
`tol`	vector of tolerances controlling the degree of permissible cross-validation error increase. Larger values lead to a larger amount of smoothing being selected.
`prefit`	boolean signifying if the regressions are already fit to the training data for each fold.
`foldid`	optional vector with components in 1:nfolds indicating the folds in which each observation fell. Overwrites nfolds.
`bw`	vector of bandwidths for the Gaussian resmoothing kernel.
`nfolds`	integer number of cross-validation folds.
`n_points`	integer number of gridpoints to be used for convolution.
`sd_trim`	float number of standard deviations at which to trim the Gaussian distribution.

Value

list. Vector "bw" of bandwidths used. Vectors "cv" of cross-validation scores and numeric "cv_unsm" for the cross-validation without any smoothing. Vector "bw_opt_inds" for the indices of the selected bandwidths under various tolerances. Vector "bw_opt" for the corresponding bandwidths.

Examples

X <- matrix(stats::rnorm(200), ncol=2)
y <- X[,1] + sin(X[,2]) + 0.5 * stats::rnorm(nrow(X))
reg <- function(X,y){
    df <- data.frame(y,X)
    colnames(df) <- c("y", "X1", "X2")
    lm1 <- stats::lm(y~X1+sin(X2), data=df)
    fit <- function(newX){
        newdf = data.frame(newX)
        colnames(newdf) <- c("X1", "X2")
        return(as.vector(stats::predict(lm1, newdata=newdf)))}
    return(list("fit"=fit))
}
cv_resmooth(X=X, y=y, d=2, regression=reg, tol = c(0.5, 1, 2))
X <- matrix(stats::rnorm(200), ncol=2)
y <- X[,1] + sin(X[,2]) + 0.5 * stats::rnorm(nrow(X))
reg <- function(X,y){
    df <- data.frame(y,X)
    colnames(df) <- c("y", "X1", "X2")
    lm1 <- stats::lm(y~X1+sin(X2), data=df)
    fit <- function(newX){
        newdf = data.frame(newX)
        colnames(newdf) <- c("X1", "X2")
        return(as.vector(stats::predict(lm1, newdata=newdf)))}
    return(list("fit"=fit))
}
cv_resmooth(X=X, y=y, d=2, regression=reg, tol = c(0.5, 1, 2))

K-fold cross-validation for spline_score.

Description

K-fold cross-validation for spline_score.

Usage

cv_spline_score(x, df = 2:15, nfolds = 5L, tol = 0.001, nmax = NULL)
cv_spline_score(x, df = 2:15, nfolds = 5L, tol = 0.001, nmax = NULL)

Arguments

`x`	vector of datapoints
`df`	vector of smoothing parameters for the non-parametric score estimator, corresponding to the effective degrees of freedom for a smoothing spline.
`nfolds`	integer number of cross-validation folds.
`tol`	numeric tolerance, minimum distance between neighbouring points, to avoid singularities.
`nmax`	if specified, overrides tol as maximal number of unique points.

Value

list of 5 elements: df vector, cv vector of corresponding cross-validation scores, se vector of standard error estimates, df_min cross-validation minimiser, df_1se largest smoothing parameter within CV score within one standard error of df_min.

Examples

set.seed(0)
x <- stats::rt(100, df=4)
cv_spline_score(x)

x <- stats::rlogis(500)
cvspl <- cv_spline_score(x)
cvspl$df_min
set.seed(0)
x <- stats::rt(100, df=4)
cv_spline_score(x)

x <- stats::rlogis(500)
cvspl <- cv_spline_score(x)
cvspl$df_min

Estimate the doubly-robust average partial effect estimate of X on Y, in the presence of Z.

Description

Estimate the doubly-robust average partial effect estimate of X on Y, in the presence of Z.

Usage

drape(
  y,
  x,
  z,
  response_regression,
  predictor_regression,
  resmooth_bw = NULL,
  spline_df = NULL,
  nfolds = 5L,
  foldid = NULL,
  verbose = FALSE
)
drape(
  y,
  x,
  z,
  response_regression,
  predictor_regression,
  resmooth_bw = NULL,
  spline_df = NULL,
  nfolds = 5L,
  foldid = NULL,
  verbose = FALSE
)

Arguments

`y`	vector of responses.
`x`	vector of the predictor of interest.
`z`	matrix of additional predictors.
`response_regression`	function which takes input data of the form (X,y), where X=cbind(x,z), and returns a prediction function f:X -> y and optionally a similar derivative estimation function (in this case no resmoothing is done).
`predictor_regression`	function which takes input data of the form (z,x), and returns a prediction function m:z -> x.
`resmooth_bw`	optional numeric to be used as resmoothing bandwidth, otherwise chosen via cross-validation. Only used if response_regression doesn't predict derivatives.
`spline_df`	optional double, a smoothing parameter for the unconditional spline score estimator, corresponding to the effective degrees of freedom for a smoothing spline. If NULL, chosen via cross-validation.
`nfolds`	integer, number of sample-splits. If set to one, then all data is used for both training and evaluation.
`foldid`	optional vector with components in 1:nfolds indicating the folds in which each observation fell. Overwrites nfolds.
`verbose`	boolean controlling level of information outputted.

Value

list containing the average partial effect estimate and the corresponding standard error estimate. If verbose=TRUE, additionally contains variables used in computations.

Examples

set.seed(0)
data <- simulate_data(200, "normal", "plm")
response_regression <- function(X,y){
    df <- data.frame(y,X)
    colnames(df) <- c("y", paste0("X", 1:10))
    lm1 <- stats::lm(y~X1+sin(X2), data=df)
    fit <- function(newX){
        newdf <- data.frame(newX)
        colnames(newdf) <- paste0("X", 1:10)
        return(as.vector(stats::predict(lm1, newdata=newdf)))}
    return(list("fit"=fit))
}
predictor_regression <- function(z,x){
    df <- data.frame(x,z)
    colnames(df) <- c("x", paste0("Z", 1:9))
    lm1 <- stats::lm(x~Z1+Z2, data=df)
    fit <- function(newz){
        newdf <- data.frame(newz)
        colnames(newdf) <- paste0("Z", 1:9)
        return(as.vector(stats::predict(lm1, newdata=newdf)))}
    return(list("fit"=fit))
}
drape(data$y, data$x, data$z, response_regression, predictor_regression, nfolds=2)
set.seed(0)
data <- simulate_data(200, "normal", "plm")
response_regression <- function(X,y){
    df <- data.frame(y,X)
    colnames(df) <- c("y", paste0("X", 1:10))
    lm1 <- stats::lm(y~X1+sin(X2), data=df)
    fit <- function(newX){
        newdf <- data.frame(newX)
        colnames(newdf) <- paste0("X", 1:10)
        return(as.vector(stats::predict(lm1, newdata=newdf)))}
    return(list("fit"=fit))
}
predictor_regression <- function(z,x){
    df <- data.frame(x,z)
    colnames(df) <- c("x", paste0("Z", 1:9))
    lm1 <- stats::lm(x~Z1+Z2, data=df)
    fit <- function(newz){
        newdf <- data.frame(newz)
        colnames(newdf) <- paste0("Z", 1:9)
        return(as.vector(stats::predict(lm1, newdata=newdf)))}
    return(list("fit"=fit))
}
drape(data$y, data$x, data$z, response_regression, predictor_regression, nfolds=2)

Fit a lasso regression using quadratic polynomial basis, with interactions.

Description

Compute regression function and derivative estimates based on polynomial basis lasso with penalty parameter chosen by cross validation (CV).

Usage

fit_lasso_poly(X, y, degree, lambda = NULL)
fit_lasso_poly(X, y, degree, lambda = NULL)

Arguments

`X`	matrix of covariates.
`y`	vector of responses.
`degree`	maximum degree of polynomial terms.
`lambda`	optional scalar tuning parameter, if "NULL" chosen via cross-validation.

Value

List containing: A function "fit" which takes matrix input of the same width as X, and returns a vector of y-predictions. A scalar "lambda" the tuning parameter.

Fit pre-tuned XGBoost regression for use in simulations.

Description

Fit pre-tuned XGBoost regression for use in simulations.

Usage

fit_xgboost(X, y, params, derivative = FALSE)
fit_xgboost(X, y, params, derivative = FALSE)

Arguments

`X`	matrix of covariates.
`y`	vector of responses.
`params`	XGBoost hyperparameters.
`derivative`	logical determining if numerical difference derivative estimate (wrt the first predictor) should also be returned.

Value

list containing a function "fit" which takes matrix input of the same width as X, and returns a vector of predictions. Optionally the list also contains a function "deriv_fit" for numerical difference derivative estimates.

Compute sums of a Monte Carlo vector for use in resmoothing.

Description

Compute sums of a Monte Carlo vector for use in resmoothing.

Usage

MC_sums(a, n, nMC, nbw)
MC_sums(a, n, nMC, nbw)

Arguments

`a`	vector of length (n x nMC x nbw).
`n`	integer.
`nMC`	integer.
`nbw`	integer.

Value

list with nbw elements. The j'th element of which is a vector of length n, the i'th element being the sum of the (((j-1)n + (i-1)) x nMC + 1) to (((j-1)n + i) x nMC) elements of a inclusive.

Population score function for the symmetric mixture two Gaussian random variables.

Description

Population score function for the symmetric mixture two Gaussian random variables.

Usage

mixture_score(x, sd)
mixture_score(x, sd)

Arguments

`x`	vector of observations.
`sd`	standard deviation of each Gaussian.

Value

vector of length n

Generate a matrix of covariates for use in resmoothing, in which the d'th column of X is translated successively by the Kronecker product of bw and MC_variates.

Description

Generate a matrix of covariates for use in resmoothing, in which the d'th column of X is translated successively by the Kronecker product of bw and MC_variates.

Usage

new_X(X, d, MC_variates, bw)
new_X(X, d, MC_variates, bw)

Arguments

`X`	matrix of covariates.
`d`	integer index of covariate to be smoothed along.
`MC_variates`	vector of standard Gaussian rvs.
`bw`	vector of bandwidths for the Gaussian kernel.

Value

matrix with ncol(X) columns and (nrow(X)length(MC_variates) length(bw)) rows.

Generate pseudo responses as in Ng 1994 to enable univariate score estimation by standard smoothing spline regression.

Description

Pseudo responses should be regarded as a computational tool, not as an estimate of the score itself.

Usage

ng_pseudo_response(x, w = rep(1, length(x)))
ng_pseudo_response(x, w = rep(1, length(x)))

Arguments

`x`	vector of covariates.
`w`	vector of weights.

Value

A vector of score estimates.

Examples

x <- seq(-3,3, length.out=50)
ng_pseudo_response(x)
x <- seq(-3,3, length.out=50)
ng_pseudo_response(x)

Fit a doubly-robust partially linear regression using the DoubleML package and pre-tuned XGBoost regressions, for use in simulations.

Description

Fit a doubly-robust partially linear regression using the DoubleML package and pre-tuned XGBoost regressions, for use in simulations.

Usage

partially_linear(X, y, g_params, m_params)
partially_linear(X, y, g_params, m_params)

Arguments

`X`	matrix of covariates.
`y`	vector of responses.
`g_params`	XGBoost hyperparameters for partially linear regression of y on X.
`m_params`	XGBoost hyperparameters for predictor regression of the first column of X on the others.

Value

List containing the linear parameter estimate and the corresponding standard error estimate.

Resmooth the predictions of a fitted model

Description

Smooth the predictions of a fitted model by convolving them with a Gaussian kernel along the d'th covariate.

Usage

resmooth(
  fit,
  X,
  d = 1,
  bw = exp(seq(-1, 1))/(2 * sqrt(3)) * stats::sd(X[, d]),
  n_points = 101,
  sd_trim = 5
)
resmooth(
  fit,
  X,
  d = 1,
  bw = exp(seq(-1, 1))/(2 * sqrt(3)) * stats::sd(X[, d]),
  n_points = 101,
  sd_trim = 5
)

Arguments

`fit`	a prediction function taking matrix input and returning a vector.
`X`	matrix of covariates.
`d`	integer index of covariate to be smoothed along.
`bw`	vector of bandwidths for the Gaussian kernel.
`n_points`	integer number of gridpoints to be used for convolution.
`sd_trim`	float number of standard deviations at which to trim the Gaussian distribution.

Value

List with the following elements. A list "pred" of the same length as "bw". Each element is a vector of predictions which are smooth with respect to the dth column of X, with smoothness corresponding to the respective element of "bw". A similar list "deriv" of corresponding vectors of first derivatives. Vectors "gridpoints" and "prob_weights" of equally spaced gridpoints and corresponding normal density weights. Vector "bw" of bandwidths used.

Examples

# Single bandwidth
X <- matrix(seq(-2,2,by=0.05))
fit <- function(Y){1*(rowMeans(Y)<0)}
sm <- resmooth(fit=fit, X=X, d=1, bw=0.2)
sm$pred[[1]]

# Multiple bandwidths simultaneously
X <- matrix(stats::rnorm(200), ncol=2)
y <- X[,1] + sin(X[,2]) + 0.5 * stats::rnorm(nrow(X))
df <- data.frame(y,X)
lm1 <- stats::lm(y~X1+sin(X2), data=df)
fit <- function(Y){as.vector(stats::predict(lm1, newdata=data.frame(Y)))}
resmooth(fit=fit, X=X, d=2)
# Single bandwidth
X <- matrix(seq(-2,2,by=0.05))
fit <- function(Y){1*(rowMeans(Y)<0)}
sm <- resmooth(fit=fit, X=X, d=1, bw=0.2)
sm$pred[[1]]

# Multiple bandwidths simultaneously
X <- matrix(stats::rnorm(200), ncol=2)
y <- X[,1] + sin(X[,2]) + 0.5 * stats::rnorm(nrow(X))
df <- data.frame(y,X)
lm1 <- stats::lm(y~X1+sin(X2), data=df)
fit <- function(Y){as.vector(stats::predict(lm1, newdata=data.frame(Y)))}
resmooth(fit=fit, X=X, d=2)

Symmetric mixture two Gaussian random variables.

Description

The resulting distribution is mean zero, variance one. X ~ N(-sqrt(1-sd^2),sd^2) wp 0.5, N(sqrt(1-sd^2),sd^2) wp 0.5.

Usage

rmixture(n, sd)
rmixture(n, sd)

Arguments

`n`	number of observations.
`sd`	standard deviation of each Gaussian.

Value

vector of length n

Generate the modified quadratic basis of Rothenhausler and Yu.

Description

Generate the modified quadratic basis of Rothenhausler and Yu.

Usage

rothenhausler_basis(X)
rothenhausler_basis(X)

Arguments

`X`	matrix of covariates.

Value

List containing the modified basis matrices for regression and derivative estimation.

Estimate the average partial effect of using the debiased lasso method of Rothenhausler and Yu, using pre-tuned lasso penalties, for use in simulations.

Description

Estimate the average partial effect of using the debiased lasso method of Rothenhausler and Yu, using pre-tuned lasso penalties, for use in simulations.

Usage

rothenhausler_yu(X, y, f_lambda, m_lambda)
rothenhausler_yu(X, y, f_lambda, m_lambda)

Arguments

`X`	matrix of covariates.
`y`	vector of responses.
`f_lambda`	lasso penalty for regression of y on X.
`m_lambda`	lasso penalty for predictor regression of the first column of X on the others.

Value

List containing the linear parameter estimate and the corresponding standard error estimate.

Generate simulation data.

Description

If ex_setting = "401k" then 401k data set is used for (X,Z). Otherwise:

$Z ~ N_{9}(0,\Sigma),$

where $\Sigma_{jj} = 1$ , $\Sigma_{jk} = 0.5$ for all j not equal to k.

$X = m(Z) + s(Z)*ex,$

where m and s are step functions of z_1 and z_3 respectively.

$Y = f(X,Z) + N(0,1).$

Usage

simulate_data(n, ex_setting, f_setting)
simulate_data(n, ex_setting, f_setting)

Arguments

`n`	integer number of samples. For "401k" ex_setting this is ignored and the whole data set is used.
`ex_setting`	string "normal", "mixture2", "mixture3", "logistic", "t4", "401k".
`f_setting`	string "plm", "additive", "interaction".

Value

list containing y, x, z. Additionally contains the population nuisance parameters evaluated on the data, and the sample version of the average partial effect.

Examples

simulate_data(100, "normal", "plm")
simulate_data(100, "normal", "plm")

Sort and bin x within a specified tolerance, using hist().

Description

Sort and bin x within a specified tolerance, using hist().

Usage

sort_bin(x, tol = 1e-05, nmax = NULL)
sort_bin(x, tol = 1e-05, nmax = NULL)

Arguments

`x`	vector of covariates.
`tol`	numeric tolerance, minimum distance between neighbouring points, to avoid singularities.
`nmax`	if specified, overrides tol as maximal number of unique points.

Value

list with three elements. x_sort is sorted and binned x, w is a vector of weights corresponding to the frequency of each bin, order is a vector specifying the ordering of x into the binned values sort_x.

Univariate score estimation via the smoothing spline method of Cox 1985 and Ng 1994.

Description

Univariate score estimation via the smoothing spline method of Cox 1985 and Ng 1994.

Usage

spline_score(x, df = 5, tol = 0.001, nmax = NULL)
spline_score(x, df = 5, tol = 0.001, nmax = NULL)

Arguments

`x`	vector of datapoints
`df`	vector of smoothing parameters for the non-parametric score estimator, corresponding to the effective degrees of freedom for a smoothing spline.
`tol`	numeric tolerance, minimum distance between neighbouring points, to avoid singularities.
`nmax`	if specified, overrides tol as maximal number of unique points.

Value

score function "rho" and derivative "drho", which take vector input and yield a vector of score estimates corresponding to each df (in a list if there are multiple df values). Also output the vector "df".

Examples

# Single bandwidth
x <- stats::rlogis(100)
spl <- spline_score(x, df=6)
spl$rho(x)
spl$drho(x)

# Multiple bandwidths simultaneously
x <- stats::rt(n=100, df=4)
spl <- spline_score(x, df=c(2,5,10))
spl$rho(x)
# Single bandwidth
x <- stats::rlogis(100)
spl <- spline_score(x, df=6)
spl$rho(x)
spl$drho(x)

# Multiple bandwidths simultaneously
x <- stats::rt(n=100, df=4)
spl <- spline_score(x, df=c(2,5,10))
spl$rho(x)

Generate n copies of $Z ~ N_{p}(0,\Sigma)$ , where $\Sigma_{jj} = 1$ , $\Sigma_{jk} = \text{corr}$ for all j not equal to k.

Description

Generate n copies of $Z ~ N_{p}(0,\Sigma)$ , where $\Sigma_{jj} = 1$ , $\Sigma_{jk} = \text{corr}$ for all j not equal to k.

Usage

z_correlated_normal(n, p, corr)
z_correlated_normal(n, p, corr)

Arguments

`n`	integer number of samples.
`p`	integer number of dimensions.
`corr`	float correlation in (-1,1).

Value

n by p matrix.

Package 'drape'

Help Index

Estimate the score function of the d'th covariate using a polynomial basis.

Description

Usage

Arguments

Value

Examples

Generate simulation data and evaluate estimators, with sample splitting.

Description

Usage

Arguments

Value

Evaluate estimators by training nuisance functions on training set and evaluating them on test set.

Description

Usage

Arguments

Value

Generate simulation data and evaluate OLS estimator.

Description

Usage

Arguments

Value

Generate simulation data and evaluate partially linear estimator.

Description

Usage

Arguments

Value

Generate simulation data and evaluate Rothenhausler estimator.

Description

Usage

Arguments

Value

K-fold cross-validation for resmoothing bandwidth.

Description

Usage

Arguments

Value

Examples

K-fold cross-validation for spline_score.

Description

Usage

Arguments

Value

Examples

Estimate the doubly-robust average partial effect estimate of X on Y, in the presence of Z.

Description

Usage

Arguments

Value

Examples

Fit a lasso regression using quadratic polynomial basis, with interactions.

Description

Usage

Arguments

Value

Fit pre-tuned XGBoost regression for use in simulations.

Description

Usage

Arguments

Value

Compute sums of a Monte Carlo vector for use in resmoothing.

Description

Usage

Arguments

Value

Population score function for the symmetric mixture two Gaussian random variables.

Description

Usage

Arguments

Value

Generate a matrix of covariates for use in resmoothing, in which the d'th column of X is translated successively by the Kronecker product of bw and MC_variates.

Description

Usage

Arguments

Value

Generate pseudo responses as in Ng 1994 to enable univariate score estimation by standard smoothing spline regression.

Description

Usage

Arguments

Generate n copies of $Z ~ N_{p}(0,\Sigma)$ , where $\Sigma_{jj} = 1$ , $\Sigma_{jk} = \text{corr}$ for all j not equal to k.