Package 'ivitr'

Title: Estimate IV-Optimal Individualized Treatment Rules
Description: Estimates an IV-optimal individualized treatment rule. An individualized treatment rule is IV-optimal if it minimizes the maximum risk with respect to the putative IV and the set of IV identification assumptions. See <arXiv:2002.02579> for details on the methodology and the theory underpinning it. The function IV_PILE() uses functions from the package 'locClass', which is available from the 'R-Forge' repository at <https://r-forge.r-project.org/projects/locclass/>. Alternatively, install it from within R: 'install.packages("locClass", repos = "http://R-Forge.R-project.org")'.
Authors: Bo Zhang
Maintainer: Bo Zhang <[email protected]>
License: GPL-3
Version: 0.1.0
Built: 2024-10-27 06:23:14 UTC
Source: CRAN

Help Index


Rouse (1995) dataset

Description

The variables in the dataset are as follows:

educ86

Years of education since 1986.

twoyr

Attending a two-year college immediately after high school.

female

Gender: 1 if female and 0 otherwise.

black

Race: 1 if African American and 0 otherwise.

hispanic

Race: 1 if Hispanic and 0 otherwise.

bytest

Test score.

dadsome

Dad's education: some college.

dadcoll

Dad's education: college.

momsome

Mom's education: some college.

momcoll

Mom's education: college.

fincome

Family income.

fincmiss

Missingness indicator for family income.

tuition2

Average state two-year college tuition.

tuition4

Average state four-year college tuition.

dist2yr

Distance to the nearest two-year college.

dist4yr

Distance to the nearest four-year college.

Usage

data(dt_Rouse)

Format

A data frame with 4437 rows and 16 columns.

Source

Rouse (1995).
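
The columns above are typically combined into an IV, a treatment indicator, and an outcome before analysis. The following is a minimal sketch of that step without attach(), so that no variables are left on the search path. The rows of 'toy' are made up for illustration and are not taken from dt_Rouse; only the column names follow the documentation above.

```r
# Hypothetical toy rows that reuse the documented column names
# (values are invented, not taken from dt_Rouse).
toy <- data.frame(
  educ86  = c(12, 16, 14, 17),
  twoyr   = c(1, 0, 1, 0),
  dist2yr = c(5, 20, 3, 10),
  dist4yr = c(12, 8, 3, 25),
  female  = c(1, 0, 1, 0),
  bytest  = c(48.2, 61.5, 52.0, 57.3)
)

# Derive the IV, treatment, and outcome inside with(), so nothing
# is attached to the search path.
dt_toy <- with(toy, data.frame(
  Z = as.integer(dist4yr <= dist2yr),  # 4-year college no farther than 2-year
  female, bytest,                      # observed covariates
  A = as.integer(twoyr == 0),          # attended a 4-year college
  Y = as.integer(educ86 >= 16)         # at least 16 years of education
))
print(dt_toy)
```

The resulting column order (Z, covariates, A, Y) matches the layout expected by the bound estimation functions documented below.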


Estimate the Balke-Pearl bound for each instance in a dataset

Description

estimate_BP_bound estimates the Balke-Pearl bound for each instance in the input dataset with a binary IV, observed covariates, a binary treatment indicator, and a binary outcome.

Usage

estimate_BP_bound(dt, method = "rf", nodesize = 5)

Arguments

dt

A dataframe whose first column is a binary IV 'Z', followed by q columns of observed covariates, followed by a binary treatment indicator 'A', and finally followed by a binary outcome 'Y'. The dataset has q+3 columns in total.

method

A character string indicating the method used to estimate each constituent conditional probability of the Balke-Pearl bound. Set method = 'multinom' to fit a multinomial regression, or method = 'rf' to fit a random forest.

nodesize

Node size to be used in a random forest algorithm if method is set to 'rf'. The default value is set to 5.
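
Because 'dt' is interpreted by column position, a quick layout check before calling the function can catch ordering mistakes. The helper below is a hypothetical sketch and is not part of 'ivitr'; it assumes at least one covariate column (q >= 1) and that binary columns are coded 0/1.

```r
# Hypothetical helper (not part of ivitr): verifies the documented
# column layout Z, X_1, ..., X_q, A, Y and returns q.
check_iv_layout <- function(dt) {
  is_binary <- function(v) all(v %in% c(0, 1))  # NA counts as non-binary
  p <- ncol(dt)
  stopifnot(
    is.data.frame(dt),
    p >= 4,                    # Z, at least one covariate, A, Y (assumes q >= 1)
    is_binary(dt[[1]]),        # first column: IV Z
    is_binary(dt[[p - 1]]),    # second-to-last column: treatment A
    is_binary(dt[[p]])         # last column: outcome Y
  )
  p - 3                        # q, the number of covariate columns
}

# Made-up example input with q = 2 covariates.
toy <- data.frame(
  Z  = c(0, 1, 1, 0),
  x1 = c(0.2, -1.1, 0.5, 1.3),
  x2 = c(1, 0, 0, 1),
  A  = c(0, 1, 0, 1),
  Y  = c(0, 1, 0, 1)
)
q <- check_iv_layout(toy)
```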

Value

The original data frame with two additional columns, L and U. L is the Balke-Pearl lower bound and U is the Balke-Pearl upper bound.
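
One way to read the returned columns is sketched below. It assumes that L and U bound the conditional average treatment effect of A on Y, which is the quantity Balke-Pearl bounds target. The values in 'toy_bounds' are made up, and the 'verdict' labels are illustrative names only.

```r
# Made-up intervals standing in for the L and U columns of the output.
toy_bounds <- data.frame(
  L = c(0.10, -0.40, -0.25, -0.05),
  U = c(0.60, -0.10,  0.30,  0.45)
)

# An interval entirely above 0 suggests a benefit, one entirely below 0
# suggests harm, and one covering 0 leaves the sign undetermined.
toy_bounds$verdict <- with(
  toy_bounds,
  ifelse(L > 0, "benefit", ifelse(U < 0, "harm", "undetermined"))
)
table(toy_bounds$verdict)
```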

Examples

attach(dt_Rouse)
# Construct an IV from the differential distance to two-year versus
# four-year colleges. Z = 1 if the subject lives no farther from
# a 4-year college than from a 2-year college.
Z = (dist4yr <= dist2yr) + 0

# Treatment A = 1 if the subject attends a 4-year college and 0
# otherwise.
A = 1 - twoyr

# Outcome Y = 1 if the subject obtained a bachelor's degree
Y = (educ86 >= 16) + 0

# Prepare the dataset
dt = data.frame(Z, female, black, hispanic, bytest, dadsome,
     dadcoll, momsome, momcoll, fincome, fincmiss, A, Y)

# Calculate the Balke-Pearl bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a random
# forest.
dt_with_BP_bound_rf = estimate_BP_bound(dt, method = 'rf', nodesize = 5)

# Calculate the Balke-Pearl bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a multinomial
# regression.
dt_with_BP_bound_multinom = estimate_BP_bound(dt, method = 'multinom')
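
The constituent probabilities p(Y = y, A = a | Z, X) mentioned in the comments above can be illustrated with a stand-alone multinomial fit. The sketch below assumes the recommended package 'nnet' is installed and uses simulated data; it shows the general idea only and is not necessarily how estimate_BP_bound is implemented internally.

```r
library(nnet)  # assumption: the recommended package 'nnet' is available

set.seed(1)
n <- 200
# Simulated data in the documented layout: Z, one covariate, A, Y.
toy <- data.frame(
  Z  = rbinom(n, 1, 0.5),
  x1 = rnorm(n),
  A  = rbinom(n, 1, 0.5),
  Y  = rbinom(n, 1, 0.5)
)

# Combine (Y, A) into a single four-level response.
toy$YA <- factor(
  paste0("Y", toy$Y, "A", toy$A),
  levels = c("Y0A0", "Y0A1", "Y1A0", "Y1A1")
)

# Multinomial regression of the joint (Y, A) category on Z and X.
fit <- multinom(YA ~ Z + x1, data = toy, trace = FALSE)

# Predicted p(Y = y, A = a | Z = z, X) for every row, at z = 1 and z = 0.
p_z1 <- predict(fit, newdata = transform(toy, Z = 1), type = "probs")
p_z0 <- predict(fit, newdata = transform(toy, Z = 0), type = "probs")
head(p_z1)
```

Each row of p_z1 and p_z0 holds four probabilities that sum to one; bounds of the Balke-Pearl type are functions of these eight quantities.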

Estimate the partial identification bound as in Siddique (2013, JASA) for each instance in a dataset

Description

estimate_Sid_bound estimates the partial identification bound for each instance in the input dataset with a binary IV, observed covariates, a binary treatment indicator, and a binary outcome according to Siddique (2013, JASA).

Usage

estimate_Sid_bound(dt, method = "rf", nodesize = 5)

Arguments

dt

A dataframe whose first column is a binary IV 'Z', followed by q columns of observed covariates, followed by a binary treatment indicator 'A', and finally followed by a binary outcome 'Y'. The dataset has q+3 columns in total.

method

A character string indicating the method used to estimate each constituent conditional probability of the partial identification bound. Set method = 'multinom' to fit a multinomial regression, or method = 'rf' to fit a random forest.

nodesize

Node size to be used in a random forest algorithm if method is set to 'rf'. The default value is set to 5.

Value

The original data frame with two additional columns, L and U. L is the lower bound and U is the upper bound, as in Siddique (2013).
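
Since estimate_BP_bound and estimate_Sid_bound both return columns named L and U, interval widths from the two can be compared side by side. The sketch below uses two made-up data frames, 'bp' and 'sid', in place of real output; the numbers carry no empirical meaning.

```r
# Made-up L and U columns standing in for the output of the two functions.
bp  <- data.frame(L = c(-0.30, -0.10, 0.05), U = c(0.40, 0.55, 0.70))
sid <- data.frame(L = c(-0.20, -0.10, 0.10), U = c(0.35, 0.50, 0.70))

# Width of each interval; narrower intervals are more informative.
widths <- data.frame(bp = bp$U - bp$L, sid = sid$U - sid$L)
colMeans(widths)
```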

Examples

attach(dt_Rouse)
# Construct an IV from the differential distance to two-year versus
# four-year colleges. Z = 1 if the subject lives no farther from
# a 4-year college than from a 2-year college.
Z = (dist4yr <= dist2yr) + 0

# Treatment A = 1 if the subject attends a 4-year college and 0
# otherwise.
A = 1 - twoyr

# Outcome Y = 1 if the subject obtained a bachelor's degree
Y = (educ86 >= 16) + 0

# Prepare the dataset
dt = data.frame(Z, female, black, hispanic, bytest, dadsome,
     dadcoll, momsome, momcoll, fincome, fincmiss, A, Y)

# Calculate the Siddique bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a random
# forest.
dt_with_Sid_bound_rf = estimate_Sid_bound(dt, method = 'rf', nodesize = 5)

# Calculate the Siddique bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a multinomial
# regression.
dt_with_Sid_bound_multinom = estimate_Sid_bound(dt, method = 'multinom')

Estimate an IV-optimal individualized treatment rule

Description

IV_PILE estimates an IV-optimal individualized treatment rule given a dataset with estimated partial identification intervals for each instance.
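
The minimax criterion can be reduced to a weighted classification problem, which is what makes a weighted SVM applicable. The sketch below works this out on made-up intervals, assuming [L, U] bounds the conditional average treatment effect: recommending treatment risks a loss of at most max(-L, 0), and withholding it risks at most max(U, 0). Whether IV_PILE computes exactly these labels and weights internally is an assumption here; the column names 'label' and 'weight' are illustrative.

```r
# Made-up partial identification intervals.
toy <- data.frame(
  L = c(0.10, -0.40, -0.25, -0.05),
  U = c(0.60, -0.10,  0.30,  0.45)
)

# Worst-case loss of each recommendation over all effects in [L, U].
loss_if_withhold <- pmax(toy$U, 0)    # effect may be as large as U > 0
loss_if_treat    <- pmax(-toy$L, 0)   # effect may be as low as L < 0

# Label: the recommendation with the smaller worst-case loss (+1 = treat).
# Weight: how much worse the other recommendation is in the worst case.
toy$label  <- ifelse(loss_if_withhold >= loss_if_treat, 1, -1)
toy$weight <- abs(loss_if_withhold - loss_if_treat)
print(toy)
```

Instances whose interval is nearly symmetric around zero receive a weight close to zero, so they have little influence on the fitted rule.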

Usage

IV_PILE(dt, kernel = "linear", C = 1, sig = 1/(ncol(dt) - 5))

Arguments

dt

A dataframe whose first column is a binary IV 'Z', followed by q columns of observed covariates, a binary treatment indicator 'A', a binary outcome 'Y', lower endpoint of the partial identification interval 'L', and upper endpoint of the partial identification interval 'U'. The dataset has q+5 columns in total.

kernel

The kernel used in the weighted SVM algorithm. The user may choose between 'linear' (linear kernel) and 'radial' (Gaussian RBF kernel).

C

Cost of violating the constraint. This is the parameter C in the Lagrange formulation.

sig

Sigma in the Gaussian RBF kernel. The default is 1 divided by the number of covariates, i.e., 1/q. This parameter is ignored when the linear kernel is used.
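
To illustrate the role of 'sig', the sketch below computes a Gaussian RBF kernel matrix on simulated covariates. It assumes the common parametrization k(x, x') = exp(-sig * ||x - x'||^2); the exact form used by the underlying SVM routine is not stated in this manual. The function 'rbf_kernel' is a hypothetical helper, not part of 'ivitr'.

```r
# Hypothetical helper: Gaussian RBF kernel matrix under the assumed
# parametrization exp(-sig * squared Euclidean distance).
rbf_kernel <- function(X, sig) {
  X <- as.matrix(X)
  sq_dist <- as.matrix(dist(X))^2   # pairwise squared Euclidean distances
  exp(-sig * sq_dist)
}

set.seed(1)
q <- 3                               # number of covariates
X <- matrix(rnorm(5 * q), ncol = q)  # simulated covariate matrix

# Mirrors the documented default: with q + 5 columns in dt,
# 1 / (ncol(dt) - 5) equals 1 / q.
K <- rbf_kernel(X, sig = 1 / q)
round(K, 3)
```

A larger 'sig' makes the kernel decay faster with distance, giving a more local and more flexible decision boundary.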

Value

An object of the type wsvm, inheriting from svm.

Examples

## Not run: 
# It is necessary to install the package locClass in order
# to run the following code.

attach(dt_Rouse)
# Construct an IV from the differential distance to two-year versus
# four-year colleges. Z = 1 if the subject lives no farther from
# a 4-year college than from a 2-year college.
Z = (dist4yr <= dist2yr) + 0

# Treatment A = 1 if the subject attends a 4-year college and 0
# otherwise.
A = 1 - twoyr

# Outcome Y = 1 if the subject obtained a bachelor's degree
Y = (educ86 >= 16) + 0

# Prepare the dataset
dt = data.frame(Z, female, black, hispanic, bytest, dadsome,
     dadcoll, momsome, momcoll, fincome, fincmiss, A, Y)

# Estimate the Balke-Pearl bound by estimating each constituent
# conditional probability p(Y = y, A = a | Z, X) with a multinomial
# regression.
dt_with_BP_bound_multinom = estimate_BP_bound(dt, method = 'multinom')

# Estimate the IV-optimal individualized treatment rule using a
# linear kernel, under the putative IV and the Balke-Pearl bound.
iv_itr_BP_linear = IV_PILE(dt_with_BP_bound_multinom, kernel = 'linear')

## End(Not run)