Title: | Semiparametric Least Squares Inference for Causal Effects |
---|---|
Description: | Several causal effects are measured using least squares regressions and basis function approximations. Backward and forward selection methods based on different criteria are used to select the basis functions. |
Authors: | Pierre Chausse Developer [aut, cre], Mihai Giurcanu Developer [aut] |
Maintainer: | Pierre Chausse Developer <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.3-1 |
Built: | 2024-12-13 06:35:37 UTC |
Source: | CRAN |
When the information about a model is available, it reconstructs it and returns a valid model object.
## S3 method for class 'slseFit'
as.model(x, ...)

## S3 method for class 'cslseFit'
as.model(x, ...)

## S3 method for class 'cslse'
as.model(x, ...)
x |
An object containing the model to extract. |
... |
Other arguments to pass to other methods. Currently not used. |
The method returns an object of class slseModel or cslseModel.
data(simDat3)
mod <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
fit <- estSLSE(mod)

## Extract the model from a cslseFit object
as.model(fit)

## Extract the model from a cslse object
cs <- causalSLSE(mod)
as.model(cs)
This is the main method to estimate the causal effects using the semiparametric least squares method. It returns an object of class cslse and is registered for objects of class cslseModel, cslseFit and formula.
## S3 method for class 'cslseModel'
causalSLSE(object, selType = c("SLSE", "BLSE", "FLSE"),
           selCrit = c("AIC", "BIC", "PVT"),
           selVcov = c("HC0", "Classical", "HC1", "HC2", "HC3"),
           causal = c("ALL", "ACT", "ACE", "ACN"),
           pvalT = function(p) 1/log(p), vcov. = vcovHC,
           reSelect = FALSE, ...)

## S3 method for class 'cslseFit'
causalSLSE(object, causal = c("ALL", "ACT", "ACE", "ACN"),
           vcov. = vcovHC, ...)

## S3 method for class 'formula'
causalSLSE(object, data, nbasis = function(n) n^0.3, knots,
           selType = c("SLSE", "BLSE", "FLSE"),
           selCrit = c("AIC", "BIC", "PVT"),
           selVcov = c("HC0", "Classical", "HC1", "HC2", "HC3"),
           causal = c("ALL", "ACT", "ACE", "ACN"),
           pvalT = function(p) 1/log(p), vcov. = vcovHC,
           reSelect = FALSE, ...)
object |
An object of class |
data |
A |
nbasis |
A function to determine the number of basis functions. It has to be a function of one argument, the sample size. |
knots |
A list of knots for the treated and nontreated groups. The list must be named using the group names. Each element of the list is also a list of length equal to the number of confounders. The choice for each confounder is NULL for no knots or a numeric vector for specific values. If missing, the knots are automatically generated. |
selType |
The method for selecting the knots. By default (SLSE), all knots from the model are used. |
selCrit |
The criterion to select the knots. |
causal |
Which causal effect should be computed. |
pvalT |
A function to determine the p-value threshold for the significance of the coefficients. It has to be a function of one parameter, which is the average number of knots in the model. This value may differ across treatment groups. |
selVcov |
The type of least squares covariance matrix used to compute the p-values needed for the selection. |
vcov. |
An alternative function to compute the covariance matrix
of the least squares estimators. The default is
|
reSelect |
By default, the stored selections are used. If
|
... |
Additional arguments to pass to |
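The default rules for nbasis and pvalT shown in the usage are ordinary one-argument functions, so their behaviour can be inspected directly. The sketch below only evaluates them in base R; the sample sizes and knot counts are hypothetical, and how the package rounds nbasis(n) to an integer is not stated here, so both roundings are shown as an assumption.

```r
# Default rules quoted from the usage above
nbasis <- function(n) n^0.3       # number of basis functions from the sample size n
pvalT  <- function(p) 1 / log(p)  # p-value threshold from the average number of knots p

# Hypothetical sample sizes and knot counts, for illustration only
n <- c(100, 1000, 10000)
p <- c(3, 5, 10)

nb <- nbasis(n)
th <- pvalT(p)

# The rounding used by the package is not documented here, so both are shown
print(data.frame(n = n, nbasis = nb, floor = floor(nb), ceiling = ceiling(nb)))
print(data.frame(p = p, threshold = th))
```

Both rules are monotone: more observations give more basis functions, and more knots give a stricter p-value threshold. A user-supplied replacement only needs to keep the one-argument form.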
It returns an object of class cslse, which inherits from the class cslseFit. It is a list with the following elements:
treated, nontreated |
They are objects of class |
ACE, ACT, ACN |
Estimates of the average causal effect, the causal effect on the treated and the causal effect on the nontreated. Each of them is a vector of two elements: the estimate and its estimated standard error. All three are included only if the argument |
Also, the object contains the following additional attributes:
treatedVar |
The name of the variable in the dataset that represents the treatment indicator. |
groupInd |
A named vector with the value of the treatment indicator corresponding to each treatment group. |
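To make the three effect measures concrete, here is a minimal base R sketch that assumes the textbook regression-imputation definitions: fit one regression per treatment group, predict both potential outcomes for every observation, and average the difference over everyone (ACE), over the treated (ACT) and over the nontreated (ACN). Plain linear fits stand in for the spline fits, the data are simulated, and no standard errors are computed; the package's own estimators may differ in these details.

```r
set.seed(1)
n <- 500
X <- rnorm(n)
Z <- rbinom(n, 1, plogis(0.5 * X))            # treatment depends on the confounder
Y <- 1 + 2 * Z + X + 0.5 * Z * X + rnorm(n)
dat <- data.frame(Y, Z, X)

# One regression per treatment group (assumption: linear fits replace the spline fits)
fit1 <- lm(Y ~ X, data = dat, subset = Z == 1)
fit0 <- lm(Y ~ X, data = dat, subset = Z == 0)

# Difference between the two predicted outcomes for every observation
d <- predict(fit1, newdata = dat) - predict(fit0, newdata = dat)

effects <- c(ACE = mean(d),
             ACT = mean(d[dat$Z == 1]),
             ACN = mean(d[dat$Z == 0]))
print(effects)
```

Under these definitions the ACE is the average of the ACT and the ACN weighted by the group proportions, which is a quick sanity check on any set of estimates.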
estSLSE for the estimation of the model, slseKnots for the format of knots, and selSLSE and update for the knots selection and to understand how stored selections are used.
data(simDat3)

## A causal SLSE model with the outcome Y
## the treatment indicator Z and the confounders X1, X2 and X1:X2
mod1 <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)

## The causal effects are estimated using the backward method and AIC criterion
## The HC1 type is used for the least squares covariance matrix
fit1 <- causalSLSE(mod1, selType = "BLSE", type = "HC1")

## This is the same for formula objects
fit2 <- causalSLSE(Y ~ Z | ~ X1 * X2, data = simDat3, selType = "BLSE", type = "HC1")
This function creates an object of class cslseModel or slseModel. The former is a list of slseModel objects, one for each treatment group. The model objects contain all specifications about the model being estimated, including the starting knots used to construct the basis functions.
cslseModel(form, data, nbasis = function(n) n^0.3, knots,
           groupInd = c(treated = 1, nontreated = 0))

slseModel(form, data, nbasis = function(n) n^0.3, knots)
form |
A formula for the regression in the case of a simple spline
regression, or two formulas separated by |
data |
A |
nbasis |
A function to determine the number of basis functions. It has to be a function of one argument, the sample size. |
knots |
An optional list of knots. For |
groupInd |
A named vector with the group names and values. By
default, the treatment indicator defined in |
An object of class slseModel is a regression model with a dependent variable and a set of covariates. We assume that the model can be written as $Y = f(X) + \varepsilon$, where $f(X)$ is an unknown function of the covariates that we approximate using linear combinations of basis functions. For now, we only consider local linear splines defined by sets of knots, one for each covariate generated by the model.matrix of form. The knots are automatically determined unless specified by the user.
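As an illustration of approximating an unknown function with a knot-based linear spline, the sketch below builds a piecewise-linear (hinge) basis for a single covariate in base R and fits it with lm. The helper hingeBasis, the knot choice and the simulated data are hypothetical; this is one common way to construct linear splines and is not claimed to match the basis that the package generates.

```r
# Piecewise-linear basis for one covariate x with knots k_1 < ... < k_m:
# column 1 is x itself, column j + 1 is max(x - k_j, 0).
# hingeBasis is a hypothetical helper, not a function of the package.
hingeBasis <- function(x, knots) {
  U <- cbind(x, sapply(knots, function(k) pmax(x - k, 0)))
  colnames(U) <- c("x", paste0("h", seq_along(knots)))
  U
}

set.seed(42)
x <- sort(runif(200, -2, 2))
y <- abs(x) + rnorm(200, sd = 0.1)         # unknown function with a kink at 0
knots <- quantile(x, c(0.25, 0.5, 0.75))   # hypothetical knot choice

U <- hingeBasis(x, knots)
fit <- lm(y ~ U)                           # linear combination of basis functions
print(coef(fit))
```

Each hinge column lets the slope change at its knot, so the fitted curve is continuous and linear between knots.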
An object of class cslseModel is a list of slseModel objects, one for each treatment group. The assignment to a group is determined by a treatment indicator, which is the right-hand side variable in the formula to the left of | specified in form. The formula to the right of | is used by model.matrix to generate a set of confounders for each treatment group. See the example below and the vignette for more details.
The function slseModel creates a semiparametric least squares model, which is an object of class slseModel. It is a list with the following elements:
slseForm |
The formula for the semiparametric estimation. It links the outcome variable with the matrix of basis functions. |
form |
The |
nameY |
The name of the variable representing the outcome |
data |
The dataset passed to the function |
knots |
An object of class |
nameX |
The names of all covariates included in the model. They are
all variables implied by the formula |
nameS |
The prefix added to the covariate names when the basis
functions are generated. By default, it is equal to |
xlevels |
When relevant, a list of levels of the factors included in the model. |
na |
A vector of integers representing the observations omitted
because of missing values. It is |
Note that other elements may be present if a knots selection has been applied to the model.
The function cslseModel creates an object of class cslseModel. It is a list of slseModel objects, one for each treatment group. It also contains the following additional attributes:
treatedVar |
The name of the variable in the dataset that represents the treatment indicator. |
groupInd |
A named vector with the value of the treatment indicator corresponding to each treatment group. |
selSLSE for additional elements that can be included in the slseModel object, estSLSE to see how the model is estimated and slseKnots for the format of the knots.
data(simDat3)

## Y is the outcome and Z the treatment indicator
## The confounders are X1, X2 and X1:X2:
mod1 <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
mod1

## A simple SLSE for a regression model
mod2 <- slseModel(Y ~ X1 * X2, data = simDat3)
mod2

## If the treatment indicator differs from 0-1
data(simDat4)
mod3 <- cslseModel(Y ~ treat | ~ X1 * X2, data = simDat4,
                   groupInd = c(treated = "treat", nontreated = "notreat"))
mod3
cslseModel
or slseModel
Objects
This is the main function to estimate cslseModel or slseModel objects. It generates the basis functions based on the knots specified in the model and estimates it by least squares.
## S3 method for class 'cslseModel'
estSLSE(model, selKnots, ...)

## S3 method for class 'slseModel'
estSLSE(model, selKnots, ...)
model |
A model of class |
selKnots |
An optional list of integers to select the knots from
the list of knots specified by the model. If the model is a
|
... |
Additional arguments to pass to other methods. Currently not used. |
The method for slseModel objects generates the matrix of basis functions implied by the set of knots included in the model and estimates the model by least squares. Let Y be the outcome and U be the matrix of basis functions. Then, the function estimates the model using the code lm(Y~U).
For cslseModel, we could estimate the model using lm(Y~Z+I(U0*(1-Z))+I(U1*Z)), where Z is a binary variable equal to 1 for the treated and 0 for the nontreated, and U0 and U1 are the matrices of basis functions for the nontreated and treated, but the model is estimated separately for each group. Therefore, the function estSLSE.cslseModel calls the function estSLSE.slseModel for each slseModel object included in the cslseModel object.
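The equivalence between a pooled regression with group-specific bases and separate regressions per group can be checked numerically. The base R sketch below uses simulated data and hypothetical basis matrices U0 and U1 (a linear term plus one hinge each); it is an illustration of the argument, not the package's code.

```r
set.seed(123)
n <- 300
X <- runif(n, -1, 1)
Z <- rbinom(n, 1, 0.5)
Y <- ifelse(Z == 1, 1 + 2 * X, -1 + abs(X)) + rnorm(n, sd = 0.2)

# Hypothetical basis matrices: linear term plus one hinge, with group-specific knots
U1 <- cbind(X, pmax(X - 0.2, 0))   # treated
U0 <- cbind(X, pmax(X + 0.3, 0))   # nontreated

# Pooled regression: group dummy plus group-specific bases
pooled <- lm(Y ~ Z + I(U0 * (1 - Z)) + I(U1 * Z))

# Separate regressions, one per treatment group
fitT <- lm(Y ~ U1, subset = Z == 1)
fitN <- lm(Y ~ U0, subset = Z == 0)

# The fitted values coincide group by group
print(all.equal(unname(fitted(pooled)[Z == 1]), unname(fitted(fitT))))
print(all.equal(unname(fitted(pooled)[Z == 0]), unname(fitted(fitN))))
```

Because every regressor in the pooled model is zero for one of the two groups, the least squares problem splits into two independent problems, which is why estimating each group separately gives the same fit.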
It returns an object of class slseFit or cslseFit depending on which method is called. An object of class slseFit is a list with the following elements:
LSE |
This is the least squares estimate of the semiparametric
model. It is an object of class |
model |
An object of class |
An object of class cslseFit is a list of slseFit objects, one for each treatment group. It also contains the following additional attributes:
treatedVar |
The name of the variable in the dataset that represents the treatment indicator. |
groupInd |
A named vector with the value of the treatment indicator corresponding to each treatment group. |
data(simDat3)

## Estimating a causal semiparametric model
mod1 <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
fit1 <- estSLSE(mod1)

## Estimating a semiparametric model
mod2 <- slseModel(Y ~ X1 * X2, data = simDat3)
fit2 <- estSLSE(mod2)
extract Method for cslse Objects
extract method for cslse objects created by the causalSLSE function.
## S4 method for signature 'cslse'
extract(model, include.nobs = TRUE, include.nknots = TRUE,
        include.numcov = TRUE, include.rsquared = TRUE,
        include.adjrs = TRUE, separated.rsquared = FALSE,
        which = c("ALL", "ACE", "ACT", "ACN", "ACE-ACT", "ACE-ACN", "ACT-ACN"),
        ...)
model |
A causal effect object created by
|
include.nobs |
Report the number of observations? |
include.nknots |
Report the total number of knots for each group? |
include.numcov |
Report the total number of covariates (including interactions, if any) per group? |
include.rsquared |
Report the R-squared of the final regression estimation? |
include.adjrs |
Report the adjusted R-squared of the final regression estimation? |
separated.rsquared |
Should we print the R-squared separately for each group? This applies as well to the adjusted R-squared. |
which |
Which causal effect measures should be printed? |
... |
Custom parameters, which are handed over to subroutines. Currently not in use. |
It returns an object of class texreg.
extract Method for slseFit Objects
extract method for slseFit objects created by the estSLSE function.
## S4 method for signature 'slseFit'
extract(model, include.rsquared = TRUE, include.adjrs = TRUE,
        include.nobs = TRUE, include.fstatistic = FALSE,
        include.rmse = FALSE, ...)
model |
A model estimated by |
include.rsquared |
Report the R-squared of the final regression estimation? |
include.adjrs |
Report the adjusted R-squared of the final regression estimation? |
include.nobs |
Report the number of observations? |
include.fstatistic |
Report the F-statistic? |
include.rmse |
Report the RMSE? |
... |
Custom parameters, which are handed over to subroutines. Currently not in use. |
It returns an object of class texreg.
It generates a matrix of basis functions using local linear splines. The number of basis functions and the breaking points are determined by the knot specifications of the slseModel or cslseModel object.
## S3 method for class 'slseModel'
llSplines(object, ...)

## S3 method for class 'cslseModel'
llSplines(object, ...)
object |
A model of class |
... |
Additional arguments to pass to other methods. Currently not used. |
The function returns a matrix of basis functions used to estimate the semiparametric model. It is used directly as a regressor in lm as shown in the example below.
data(simDat3)

## We manually estimate the semiparametric model
mod1 <- slseModel(Y ~ X1 * X2, data = simDat3)
U <- llSplines(mod1)
fit1 <- lm(Y ~ U, data = simDat3)

## We use estSLSE instead (results are identical)
fit2 <- estSLSE(mod1)
This data was collected to evaluate the National Supported Work (NSW) Demonstration project in Lalonde (1986).
data(nsw)
A data frame containing 9 variables.
Treatment assignment
Age
Years of Education
1 if Black, 0 otherwise
1 if Hispanic, 0 otherwise
1 if married, 0 otherwise
1 if no college degree, 0 otherwise
1975 earnings
1978 earnings
The dataset was obtained from the ATE package (see reference).
"NSW Data Files" from Rajeev Dehejia's website. URL: http://users.nber.org/~rdehejia/data/.nswdata2.html
"National Supported Work Evaluation Study, 1975-1979: Public Use Files." from the Interuniversity Consortium for Political and Social Research. URL: http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/7865
Lalonde, R. (1986). "Evaluating the Econometric Evaluations of Training Programs," American Economic Review, 76(4), 604-620.
Dehejia R. and Wahba S. (1999). "Causal Effects in Non-Experimental Studies: Re-Evaluating the Evaluation of Training Programs," JASA 94 (448), 1053-1062.
Asad Haris and Gary Chan (2015). ATE: Inference for Average Treatment Effects using Covariate Balancing. R package version 0.2.0. https://CRAN.R-project.org/package=ATE
For objects of class slseFit, the method plots the predicted outcome with respect to a given covariate. It is the same for objects of class cslseFit, but the predicted outcome is plotted for each treatment group separately.
## S3 method for class 'cslseFit'
plot(x, y, which = y, interval = c("none", "confidence"),
     level = 0.95, fixedCov = list(), vcov. = vcovHC,
     add = FALSE, addToLegend = NULL, addPoints = FALSE,
     FUN = mean, plot = TRUE, graphPar = list(), ...)

## S3 method for class 'slseFit'
plot(x, y, which = y, interval = c("none", "confidence"),
     level = 0.95, fixedCov = NULL, vcov. = vcovHC,
     add = FALSE, addPoints = FALSE, FUN = mean,
     plot = TRUE, graphPar = list(), ...)
x |
Object of class |
y |
alias for |
which |
Which covariate to plot against the outcome variable. It could be an integer or a character. |
interval |
The type of confidence interval. The default is
|
level |
The confidence interval level if included. |
fixedCov |
List of covariates to fix to
specific values for the nontreated and treated groups. By default,
covariates not selected by |
vcov. |
An alternative function to compute the covariance
matrix of the least squares estimates. The default is the
|
add |
Should the plot be added to an existing one? |
addToLegend |
A character string to add to the legend next to treated and control. |
addPoints |
Should we add the scatterplot of the outcome and covariate on the graph? |
FUN |
The function to determine the fixed value for the
covariates not fixed by |
plot |
If set to |
graphPar |
A list of graphical parameters. See Details. |
... |
Additional argument to pass to the |
The default set of parameters can be obtained by running the command causalSLSE:::.initPar(). It returns a list of four elements: treated for the parameters of the lines or points of the treated, nontreated for the parameters of the nontreated, common for the common parameters not specific to a group like the main title or the axis labels, and legend for the legend parameters. The elements treated and nontreated are lists of two elements: points and lines, which are lists of graphical parameters for the scatterplot (when addPoints is TRUE) and the lines. The argument graphPar can be used to modify existing parameters or to add new ones. It must respect the structure of the initial list. See the example below.
To fix covariates to the same values for both groups, fixedCov is just a named list with values for the covariates associated with the names. To fix the covariates to different values for the treated and nontreated, fixedCov is a named list of at least one element with names being either treated, nontreated or both. If only one group is specified, the covariates for the other group are determined by FUN.
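The two shapes of fixedCov described above can be written out as plain lists. In this sketch the covariate name X2 comes from the examples in this manual, the fixed values are arbitrary, and the inner lists of the group-specific form are assumed to follow the same named-list format as the common form. The plot calls are left as comments because they need the causalSLSE package and a fitted object.

```r
# Same value for both groups: fix X2 at 0
fixSame <- list(X2 = 0)

# Different values by group (assumption: each group element is itself a
# named list of covariate values)
fixByGroup <- list(treated = list(X2 = 1), nontreated = list(X2 = -1))

# Only the treated group is fixed; per the text, FUN determines the
# values used for the nontreated
fixTreated <- list(treated = list(X2 = 1))

str(fixByGroup)

# With a fitted object such as 'fit' from the examples (requires causalSLSE):
# plot(fit, "X1", fixedCov = fixSame)
# plot(fit, "X1", fixedCov = fixByGroup)
```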
It returns NULL invisibly if the argument plot is TRUE. Otherwise, it returns a data.frame with the necessary variables to manually create the plot. For slseFit objects, it is a list with the following elements:
Outcome |
The outcome variable from the model. The name of this element is the actual name of the outcome variable in the model. |
which |
The covariate selected by the argument |
fit |
The predicted outcome from the model fit. |
lower, upper |
The lower and upper bounds of the confidence
interval. It is only available when the argument |
Note that all returned variables are ordered with respect to the selected covariates. See the example below.
For cslseFit objects, the above list of variables is returned separately for each treatment group.
data(simDat3)

## For cslse objects
mod <- cslseModel(Y ~ Z | ~ X1 + X2, data = simDat3)
fit <- causalSLSE(mod)
plot(fit, "X1")

## Let's change the type of points for the treated and lines for the
## nontreated
gpar <- list(treated = list(points = list(pch = 24, col = 5)),
             nontreated = list(lines = list(lty = 5, col = "darkgreen")),
             common = list(xlab = "New X", main = "Plot with changed parameters"))
plot(fit, "X1", addPoints = TRUE, graphPar = gpar)

## For slseModel objects:
mod2 <- slseModel(Y ~ X1 + X2, data = simDat3)
fit2 <- estSLSE(mod2)
plot(fit2, "X1", interval = "confidence", addPoints = TRUE)

## The same graph produced manually
p2 <- plot(fit2, "X1", interval = "confidence", plot = FALSE)
plot(p2$X1, p2$Y, pch = 21, main = "Y against X1", xlab = "X1", ylab = "Y")
lines(p2$X1, p2$fit, lwd = 2)
lines(p2$X1, p2$lower, lty = 3, lwd = 2)
lines(p2$X1, p2$upper, lty = 3, lwd = 2)
The method computes the predicted outcome for each group with standard errors and confidence intervals.
## S3 method for class 'cslseFit'
predict(object, interval = c("none", "confidence"),
        se.fit = FALSE, newdata = NULL, level = 0.95,
        vcov. = vcovHC, ...)

## S3 method for class 'slseFit'
predict(object, interval = c("none", "confidence"),
        se.fit = FALSE, newdata = NULL, level = 0.95,
        vcov. = vcovHC, ...)
object |
Object of class |
interval |
If set to |
se.fit |
Should the function return the standard errors of the predicted values? |
level |
The confidence interval level if interval is set to
|
newdata |
A |
vcov. |
An alternative function to compute the covariance
matrix of the least squares estimates. The default is the
|
... |
Additional argument to pass to the |
For slseFit objects, it returns the predicted outcome if se.fit is FALSE or a list of the following two elements otherwise:
fit |
The predicted outcome. |
se.fit |
The standard errors of the predicted outcomes. |
If the argument interval is set to "confidence", the predicted outcome is a matrix with the predicted outcome, and the lower and upper bounds of the confidence intervals.
For objects of class cslseFit, the same is returned for each treatment group in a list. The elements of the list are treated and nontreated (until the package allows for more than one treatment).
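The matrix of predictions and interval bounds can be reproduced by hand from a basis matrix, the coefficient estimates and a covariance matrix. The base R sketch below assumes the classical covariance from vcov and a Student t critical value so that the result can be checked against predict.lm; the methods documented here default to vcovHC, and the critical value they use is not stated, so treat both choices as assumptions. The basis U and the data are hypothetical.

```r
set.seed(7)
n <- 200
x <- runif(n, -2, 2)
y <- sin(x) + rnorm(n, sd = 0.3)
U <- cbind(x, pmax(x, 0))          # hypothetical basis: linear term plus a hinge at 0
fit <- lm(y ~ U)

level <- 0.95
Xmat <- model.matrix(fit)          # design matrix, including the intercept
V <- vcov(fit)                     # classical covariance; an HC estimator could be swapped in
pred <- drop(Xmat %*% coef(fit))
se <- sqrt(rowSums((Xmat %*% V) * Xmat))   # square root of diag(Xmat V Xmat')
crit <- qt(1 - (1 - level) / 2, df = df.residual(fit))

manual <- cbind(fit = pred, lwr = pred - crit * se, upr = pred + crit * se)
builtin <- predict(fit, interval = "confidence", level = level)
print(all.equal(unname(manual), unname(builtin)))
```

Replacing V by a heteroskedasticity-robust estimate changes only the standard errors, which is the role played by the vcov. argument.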
data(simDat3)
mod <- cslseModel(Y ~ Z | ~ X1 + X2, data = simDat3)
fit <- causalSLSE(mod)

## Predicting outcome for all observations
pr <- predict(fit, interval = "confidence")

## Predicting outcome with new data
ndat <- data.frame(X1 = c(-2, 1, 2, 3), X2 = c(-4, -2, 0, 1), Z = c(1, 1, 0, 0))
predict(fit, newdata = ndat)
Print methods for different objects from the package.
## S3 method for class 'cslseModel'
print(x, which = c("Model", "selKnots", "Pvalues"),
      digits = max(3L, getOption("digits") - 3L), ...)

## S3 method for class 'slseModel'
print(x, which = c("Model", "selKnots", "Pvalues"),
      digits = max(3L, getOption("digits") - 3L), ...)

## S3 method for class 'cslse'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

## S3 method for class 'cslseFit'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

## S3 method for class 'slseFit'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

## S3 method for class 'summary.cslse'
print(x, digits = max(3L, getOption("digits") - 3L),
      signif.stars = getOption("show.signif.stars"),
      beta = FALSE, knots = FALSE, ...)

## S3 method for class 'summary.cslseFit'
print(x, groups, digits = max(3L, getOption("digits") - 3L),
      signif.stars = getOption("show.signif.stars"), ...)

## S3 method for class 'summary.slseFit'
print(x, digits = max(3L, getOption("digits") - 3L),
      signif.stars = getOption("show.signif.stars"), ...)

## S3 method for class 'slseKnots'
print(x, header = c("None", "All", "Select"),
      digits = max(3L, getOption("digits") - 3L), ...)
x |
A model or a fit object. |
digits |
The number of digits to print. |
signif.stars |
Should we print the significance stars? |
beta |
Should we print the coefficient matrix of the basis functions? |
knots |
Should we print the set of knots? |
which |
What information should be printed? It prints a summary of the
model when set to |
header |
What description of the object should be printed. |
groups |
The names of the group to be printed. By default, they are all printed. |
... |
Argument for other types of objects. |
All methods return NULL invisibly.
## For cslseModel objects
data(simDat3)
mod1 <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
print(mod1)

## For cslse objects
causal <- causalSLSE(mod1)
print(causal)

## For summary.cslse objects
s <- summary(causal)
print(s)

## For cslseFit objects
fit <- estSLSE(mod1)
print(fit)

## For summary.cslseFit objects
summary(fit)
This is the main function to select the knots in cslseModel or slseModel objects. It returns a model with an optimal set of knots.
## S3 method for class 'cslseModel'
selSLSE(model, selType = c("BLSE", "FLSE"),
        selCrit = c("AIC", "BIC", "PVT"),
        pvalT = function(p) 1/log(p),
        vcovType = c("HC0", "Classical", "HC1", "HC2", "HC3"),
        reSelect = FALSE, ...)

## S3 method for class 'slseModel'
selSLSE(model, selType = c("BLSE", "FLSE"),
        selCrit = c("AIC", "BIC", "PVT"),
        pvalT = function(p) 1/log(p),
        vcovType = c("HC0", "Classical", "HC1", "HC2", "HC3"),
        reSelect = FALSE, ...)
model |
A model of class |
selType |
The selection method: backward ( |
selCrit |
The criterion to select the piecewise polynomial knots. |
pvalT |
A function to determine the p-value threshold for the significance of the coefficients. It has to be a function of one parameter, which is the average number of knots in the model. |
vcovType |
The type of least squares covariance matrix used to compute the p-values needed for the selection. |
reSelect |
By default, the stored selections are used. If
|
... |
Additional arguments to pass to other methods. Currently not used. |
It selects the knots using one of the two methods, FLSE or BLSE, with either the AIC, BIC or a p-value threshold (see the vignette for more details). Any of these selection methods requires several least squares estimations, so the selection is performed only if the method has not been applied yet or reSelect is set to TRUE. This is possible because any new knots selection is saved into the returned model. A model may have more than one selection saved into it. The active knots (the ones used when we estimate the model) are stored in the element knots of the model and the saved selections are stored in the element selections. See below for what is included in this element.
Note that the selections for both criteria AIC and BIC are computed and saved automatically in the returned model when selCrit is set to either "AIC" or "BIC", because it does not require many more operations to select them both once we do it for AIC or BIC. However, the selection is only computed for PVT when selCrit is set to "PVT".
The knots are selected jointly for all treatment groups in
cslseModel objects. However, the active knots and all saved
selections are stored separately for each treatment group. For example,
the active knots for the treated in the cslseModel object
mod are stored in mod$treated$knots. See the vignette for
more details.
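The following sketch shows one way to look at what is stored for a treatment group after a selection. It relies only on calls and element names that appear in this manual (cslseModel, selSLSE, mod$treated$knots and the selections element). The name of the other group is not assumed, so it is read from names(mod).

```r
library(causalSLSE)

data(simDat3)
mod <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
mod <- selSLSE(mod, selType = "BLSE", selCrit = "AIC")

# Names of the treatment groups stored in the model
names(mod)

# Active knots for the treated group (the ones used for estimation)
mod$treated$knots

# Saved selections for the treated group
names(mod$treated$selections)
names(mod$treated$selections$BLSE)
```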
The method returns an object of class slseModel or cslseModel
depending on which object it is applied to. When it does not already
exist, the element selections is added to the slseModel object (or to
each slseModel object in cslseModel objects). The element selections
is a list with one or more of the following elements:
originalKnots |
The original knots as selected initially by
|
FLSE, BLSE |
This is where selections based on the forward (FLSE) and the backward (BLSE) methods are stored. |
Finally, BLSE and FLSE are lists that may contain the following elements:
AIC, BIC |
A list of integer vectors, one for each covariate in the
|
PVT |
Same as the AIC and BIC, but the selection is based on a p-value threshold. |
JAIC, JBIC |
This criteria is for |
Threshold |
The p-value threshold used for the PVT criterion. |
pval |
A list of p-values, one for each original knot. See the vignette for a definition of the p-values. |
slseModel and cslseModel for model objects description and update for
ways of selecting stored selections.
data(simDat3)
mod1 <- cslseModel(Y~Z|~X1*X2, data=simDat3)
mod1 <- selSLSE(mod1, selType="FLSE", selCrit="AIC")

## The following does not require additional computation
## because the selection is stored in mod1
mod1 <- selSLSE(mod1, selType="FLSE", selCrit="BIC")

## But the following does
mod1 <- selSLSE(mod1, selType="BLSE", selCrit="BIC")

## See one selection:
mod1$treated$selections$BLSE$JBIC
This dataset is used in several documentation files to illustrate the different functionalities of the package.
data("simDat1")
A data frame with 300 observations on the following 9 variables.
X
Continuous covariate.
Z
Treatment indicator.
Y
Observed outcome.
Y1
Outcome for the treated (not observed in practice).
Y0
Outcome for the nontreated (not observed in practice).
U01, U02
The true basis functions for the nontreated.
U11, U12
The true basis functions for the treated.
This dataset is used in several documentation files to illustrate the different functionalities of the package.
data("simDat2")
A data frame with 300 observations on the following 11 variables.
X
Continuous covariate.
Z
Treatment indicator.
Y
Observed outcome.
Y1
Outcome for the treated (not observed in practice).
Y0
Outcome for the nontreated (not observed in practice).
U01, U02, U03
The true basis functions for the nontreated.
U11, U12, U13
The true basis functions for the treated.
This dataset is used in several documentation files to illustrate the different functionalities of the package.
data("simDat3")
A data frame with 300 observations on the following 16 variables.
X1
Continuous covariate.
X2
Continuous covariate.
Z
Treatment indicator.
Y
Observed outcome.
Y1
Outcome for the treated (not observed in practice).
Y0
Outcome for the nontreated (not observed in practice).
U01, U02, U03, U04, U05
The true basis functions for the nontreated.
U11, U12, U13, U14, U15
The true basis functions for the treated.
This dataset is used in several documentation files to illustrate the different functionalities of the package.
data("simDat4")
A data frame with 500 observations on the following 7 variables.
Z
Treatment indicator.
Y
Observed outcome.
X1
Continuous covariate with a large proportion of zeros.
X2
A categorical variable with 3 character levels (not expressed as factor).
X3
A categorical variable with 3 numerical levels (not expressed as factor).
X4
A binary variable.
treat
A character variable.
This dataset is used in several documentation files to illustrate the different functionalities of the package.
data("simDat5")
A data frame with 300 observations on the following 16 variables.
X1
Continuous covariate.
X2
Continuous covariate.
Z
Treatment indicator.
Y
Observed outcome.
Y1
Outcome for the treated (not observed in practice).
Y0
Outcome for the nontreated (not observed in practice).
The function creates an object of class slseKnots for
semiparametric least squares models.
slseKnots(form, data, X, nbasis = function(n) n^0.3, knots)
form |
A formula that determines the covariates to include in the regression. It is a regular formula and only the right hand side is considered. |
data |
A |
nbasis |
A function to determine the number of basis functions. It has to be a function of one argument, the sample size. |
knots |
An optional list of knots. Each element of the list (one
for each covariate) is either a vector of numeric knots or it is
|
X |
An optional matrix of covariates. When provided, |
The automatic selection is as follows. It is applied to each
variable created by the model.matrix of form except for
the intercept:
The number of knots is the ceiling of the value returned by the nbasis
function, minus 1. Let p-1 be the number of knots. Then, we
compute the p+1 empirical quantiles of the variable for equally
spaced probabilities going from 0 to 1 and drop the first and last
ones. This is done using the function quantile with type=1. We then
remove the duplicated values and the ones equal to either the min or
the max of the variable. If the number of remaining knots is equal
to 0, the set of knots for this variable is set to NULL.
For manual selection, see the vignette.
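The automatic rule described above can be sketched in base R for a single numeric variable. This is an illustration written from the description only, not the package's internal code. The helper name autoKnots is hypothetical, and missing values are not handled.

```r
# Hypothetical helper (not part of the package): automatic knots for one variable
autoKnots <- function(x, nbasis = function(n) n^0.3) {
  p <- ceiling(nbasis(length(x)))           # p - 1 knots are requested
  if (p <= 1) return(NULL)                  # no interior quantile to compute
  probs <- seq(0, 1, length.out = p + 1)    # p + 1 equally spaced probabilities
  probs <- probs[-c(1, p + 1)]              # drop the first and last ones
  k <- quantile(x, probs = probs, type = 1, names = FALSE)
  k <- unique(k)                            # remove duplicated values
  k <- k[k != min(x) & k != max(x)]         # remove knots equal to the min or max
  if (length(k) == 0) NULL else k
}

# With 300 distinct values, ceiling(300^0.3) = 6, so 5 knots are kept
k1 <- autoKnots(1:300)
length(k1)

# A variable made mostly of zeros: every quantile equals the minimum,
# so no knot survives and the result is NULL
k2 <- autoKnots(c(rep(0, 290), 1:10))
is.null(k2)
```

The second call mimics a covariate with a large proportion of zeros, a case in which the rule leaves the variable without knots.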
It returns an object of class slseKnots. It is a list for which
the length and names are respectively the number of columns and the
column names of the model.matrix of form after
the intercept has been removed. Each element is a numeric vector of
knots, unless the number of knots is set to 0, in which case it is
NULL.
data(simDat3)
k <- slseKnots(Y ~ X1 * X2, data = simDat3)
k

## We can extract the set for one variable
k$X1
The method computes summary statistics for estimated semiparametric and causal models.
## S3 method for class 'cslse'
summary(object, ...)

## S3 method for class 'cslseFit'
summary(object, vcov.=vcovHC, ...)

## S3 method for class 'slseFit'
summary(object, vcov.=vcovHC, ...)
object |
A model estimated by |
vcov. |
A function to compute the standard errors of the
least squares coefficients. The default is
|
... |
Argument for other types of objects |
The function summary.slseFit returns the summary statistics of
the least squares estimation of slseModel objects
estimated by estSLSE. It is an object of class
summary.slseFit, which is a list with the following elements:
model |
The |
lseSum |
An object of class |
The function summary.cslseFit returns an object of class
summary.cslseFit, which is a list of summary.slseFit objects,
one for each treatment group.
The function summary.cslse returns the summary statistics of the
causal effect estimates. It is an object of class summary.cslse,
which is a list with the following elements:
causal |
A matrix with the causal effect estimates in the first column, their standard errors in the second, their t-ratios in the third and p-values in the fourth. |
lse |
A list of |
## A causal semiparametric model with causal effect estimates
data(simDat3)
mod1 <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
CE <- causalSLSE(mod1, selType = "BLSE")
summary(CE)

## A causal semiparametric model
fit <- estSLSE(mod1)
summary(fit)

## A semiparametric model
mod2 <- slseModel(Y ~ X1 * X2, data = simDat3)
fit2 <- estSLSE(mod2)
summary(fit2)
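Since summary.cslse is described as a list with the elements causal and lse, its components can be extracted directly. The sketch below assumes the column layout stated above (estimates, standard errors, t-ratios, p-values) and checks it by recomputing the t-ratios. The object names s and tRatio are only illustrative.

```r
library(causalSLSE)

data(simDat3)
mod1 <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
CE <- causalSLSE(mod1, selType = "BLSE")
s <- summary(CE)

# Matrix of causal effect estimates and their inference statistics
s$causal

# If the layout is as documented, estimate / standard error
# should reproduce the third column (the t-ratios)
tRatio <- s$causal[, 1] / s$causal[, 2]
all.equal(unname(tRatio), unname(s$causal[, 3]))

# Names of the elements stored in lse
names(s$lse)
```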
The method updates an object by modifying its specification. Currently, it is used to change the set of knots by either selecting them manually or by specifying the selection methods.
## S3 method for class 'cslseModel'
update(object, selType, selCrit="AIC", selKnots, ...)

## S3 method for class 'slseModel'
update(object, selType, selCrit="AIC", selKnots, ...)

## S3 method for class 'slseKnots'
update(object, selKnots, ...)
object |
An object to be modified. |
selKnots |
An optional list of integers to select the knots from the original list of knots. If missing, the current knots are kept. |
selType |
The selection method: |
selCrit |
The criterion to select the optimal set of knots. |
... |
Argument for other types of objects. Currently not used. |
The method for slseKnots is explained in the vignette, but it
is mostly used internally. For the model objects, the method can be used
to choose a set of knots already stored in the object. It avoids
having to recompute them, which can be computationally intensive for
large samples. It returns an error message if the type of selection
requested does not exist. In that case, you need to run the
selSLSE method. If the selType argument is set to
"None", the method returns the original model.
It returns an object of the same class, but with a different set of knots.
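To make the idea of selecting knots by integers concrete, here is a base R sketch with plain lists standing in for the package objects. It assumes that each integer vector gives the positions of the knots to keep in the corresponding original knot vector; the exact format expected by selKnots is documented in the vignette. All names and values below are hypothetical.

```r
# Hypothetical stand-ins: plain lists, not objects created by the package
originalKnots <- list(X1 = c(-1.2, -0.4, 0.3, 1.1),
                      X2 = c(0.5, 1.5, 2.5))

# Assumed format: one integer vector of positions per covariate
selKnots <- list(X1 = c(1L, 3L), X2 = 2L)

# Keep, for each covariate, the knots at the selected positions
activeKnots <- Map(function(k, i) k[i], originalKnots, selKnots)
activeKnots
```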
selSLSE for more details on how stored knots can
be selected from a model, slseModel for model
description and slseKnots for the format of knots.
data(simDat3)
mod1 <- cslseModel(Y ~ Z | ~ X1 * X2, data = simDat3)
mod2 <- selSLSE(mod1, "BLSE", "AIC")

## We change the knots to the BLSE-BIC selection
## already stored in the model object
update(mod2, "BLSE", "BIC")

## We recover the original set of knots
update(mod2, "None")