Package 'sensitivityCalibration'

Title: A Calibrated Sensitivity Analysis for Matched Observational Studies
Description: Implements the calibrated sensitivity analysis approach for matched observational studies. Our sensitivity analysis framework views matched sets as drawn from a super-population. The unmeasured confounder is modeled as a random variable. We combine matching and model-based covariate-adjustment methods to estimate the treatment effect. The hypothesized unmeasured confounder enters the picture as a missing covariate. We adopt a state-of-the-art Expectation Maximization (EM) algorithm to handle this missing covariate problem in generalized linear models (GLMs). As our method also estimates the effect of each observed covariate on the outcome and treatment assignment, we are able to calibrate the unmeasured confounder to observed covariates. Zhang, B., Small, D. S. (2018). <arXiv:1812.00215>.
Authors: Bo Zhang
Maintainer: Bo Zhang <[email protected]>
License: MIT + file LICENSE
Version: 0.0.1
Built: 2025-02-20 07:04:32 UTC
Source: CRAN


Make the dynamic calibration plot.

Description

This is another main function in the package. For a given p and the border of the sensitivity parameters (lambda, delta), a calibration plot is made for each (lambda, delta) pair on the border.

Usage

calibrate_anim(border, q, u, p, degree, xmax, ymax, data_matched)

Arguments

border

Border or frontier of the sensitivity parameters for a fixed p.

q

Number of matched covariates plus treatment.

u

Unmeasured confounder; u = c(1,0) if the unmeasured confounder is assumed to be binary.

p

The probability vector corresponding to u; p = c(0.5, 0.5) if the unmeasured confounder is assumed to be Bernoulli(0.5).

degree

Degrees of freedom of the smoothing spline fit to the boundary.

xmax

Maximum xlim of the plot.

ymax

Maximum ylim of the plot.

data_matched

The matched dataset.

Details

border is the data frame returned by the function find_border. It must contain at least (k+1) distinct lambda/delta pairs in order to fit a smoothing spline with k degrees of freedom, where k is the value passed as degree.
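
The requirement above can be checked before plotting. The following is a minimal sketch in base R, not the package's own code: has_enough_pairs is a hypothetical helper, and the use of stats::smooth.spline with df = k is an assumption about how such a boundary might be smoothed. The border values are copied from the Examples section of this help page.

```r
# Border values copied from the Examples section
lambda_vec <- c(seq(0.1, 1.9, 0.1), 2.2, 2.5, 3, 3.5, 4)
delta_vec <- c(7.31, 5.34, 4.38, 3.76, 3.18, 2.87, 2.55, 2.36, 2.16, 1.99, 1.86,
               1.74, 1.63, 1.54, 1.44, 1.40, 1.31, 1.28, 1.22, 1.08, 0.964,
               0.877, 0.815, 0.750)
border <- data.frame(lambda_vec, delta_vec)

# Hypothetical helper: TRUE if the border holds at least k + 1 distinct pairs
has_enough_pairs <- function(border, k) {
  nrow(unique(border)) >= k + 1
}

enough_10 <- has_enough_pairs(border, 10)  # 24 distinct pairs are available
enough_30 <- has_enough_pairs(border, 30)  # 31 pairs would be required

# Assumption: the boundary is smoothed with stats::smooth.spline and df = k
fit <- smooth.spline(border$lambda_vec, border$delta_vec, df = 10)

# Smoothed delta on the boundary at lambda = 1
delta_at_1 <- predict(fit, x = 1)$y
```

If the check fails, either lower degree or add more lambda values when calling find_border.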

Examples

data(NHANES_blood_lead_small_matched)
attach(NHANES_blood_lead_small_matched)

# Prepare the border
lambda_vec = c(seq(0.1,1.9,0.1), 2.2, 2.5, 3, 3.5, 4)
delta_vec = c(7.31, 5.34, 4.38, 3.76, 3.18, 2.87, 2.55, 2.36, 2.16, 1.99, 1.86,
1.74, 1.63, 1.54, 1.44, 1.40, 1.31, 1.28, 1.22, 1.08, 0.964, 0.877, 0.815, 0.750)
border = data.frame(lambda_vec, delta_vec)

calibrate_anim(border, 9, c(1,0), c(0.5,0.5), 10, 5, 3.5, NHANES_blood_lead_small_matched)

detach(NHANES_blood_lead_small_matched)

Make the calibration plot.

Description

This is the main function in the package. Given a matched dataset and one particular (p, lambda, delta) triple, obtain the corresponding coefficients of the observed covariates and plot them with a legend added. This graph is meant to provide an intuitive interpretation of the magnitude of the sensitivity parameters lambda and delta by contrasting them with the estimated coefficients of the observed covariates.

Usage

calibrate_one(lambda_vec, delta_vec, q, u, p, lambda, delta, label_vec, data_matched)

Arguments

lambda_vec

A vector of lambdas that define the border.

delta_vec

A vector of deltas that define the border.

q

Number of matched covariates plus treatment.

u

Unmeasured confounder; u = c(1,0) if the unmeasured confounder is assumed to be binary.

p

The probability vector corresponding to u; p = c(0.5, 0.5) if the unmeasured confounder is assumed to be Bernoulli(0.5).

lambda

Sensitivity parameter that controls association between U and treatment assignment.

delta

Sensitivity parameter that controls association between U and response.

label_vec

A character vector of length q-1 containing the names of the observed/matched covariates.

data_matched

The matched dataset.

Details

lambda_vec and delta_vec together define the border, for example the two columns of the data frame returned by the function find_border. They must contain at least 7 distinct lambda/delta pairs in order to fit a smoothing spline with 6 degrees of freedom.

lambda and delta are a pair on the border.

label_vec is typically taken to be the column names of the dataset, i.e., the names of the q - 1 observed covariates.

Examples

data(NHANES_blood_lead_small_matched)
attach(NHANES_blood_lead_small_matched)

# Prepare the lambda_vec and delta_vec
lambda_vec = c(seq(0.1,1.9,0.1), 2.2, 2.5, 3, 3.5, 4)
delta_vec = c(7.31, 5.34, 4.38, 3.76, 3.18, 2.87, 2.55, 2.36, 2.16, 1.99, 1.86,
1.74, 1.63, 1.54, 1.44, 1.40, 1.31, 1.28, 1.22, 1.08, 0.964, 0.877, 0.815, 0.750)

calibrate_one(lambda_vec, delta_vec, 9, c(1,0), c(0.5,0.5), 1, 0.492,
colnames(NHANES_blood_lead_small_matched)[1:8], NHANES_blood_lead_small_matched)

detach(NHANES_blood_lead_small_matched)

Construct the 95% confidence interval of the treatment effect given the set of sensitivity parameters.

Description

This is one of the main functions in the package. Given a dataset and sensitivity parameters (p, lambda, delta), the function returns a 95% confidence interval (CI) for the estimated treatment effect.

Usage

CI_block_boot(q, u, p, lambda, delta, data_matched, n_boot = 2000)

Arguments

q

Number of matched covariates plus treatment.

u

Unmeasured confounder; u = c(1,0) if the unmeasured confounder is assumed to be binary.

p

The probability vector corresponding to u; p = c(0.5, 0.5) if the unmeasured confounder is assumed to be Bernoulli(0.5).

lambda

Sensitivity parameter that controls association between U and treatment assignment.

delta

Sensitivity parameter that controls association between U and response.

data_matched

The dataset after matching.

n_boot

Number of bootstrap samples.

Details

If the number of matched covariates is k, then q = k + 1.

If the hypothesized unmeasured confounder is binary, then u = c(1,0) and p = c(p, 1-p).

data_matched should be in the following format: the first (q-1) columns are matched covariates, the qth column is the treatment status, and the (q+1)th column is the response. See the NHANES_blood_lead_small_matched dataset for an example.

Note that the input for this function is a dataset before matching. To run this function, the optmatch package needs to be installed and loaded.
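
The function name and the n_boot argument suggest a block bootstrap. The following is a minimal sketch of that general technique in base R, not the package's implementation. It assumes that whole matched sets are resampled with replacement and that a percentile interval is reported. The toy data, the helper block_boot_ci, and the difference-in-means statistic are all hypothetical stand-ins; the package estimates the treatment effect with its EM-based model instead.

```r
set.seed(1)

# Toy data: 20 matched pairs, one treated (Z = 1) and one control (Z = 0) each
toy <- data.frame(
  Z = rep(c(1, 0), times = 20),
  matches = rep(seq_len(20), each = 2)
)
toy$Y <- 0.5 * toy$Z + rnorm(nrow(toy))

# Hypothetical helper: percentile CI from resampling whole matched sets
block_boot_ci <- function(data, set_col, stat, n_boot = 200, level = 0.95) {
  blocks <- split(data, data[[set_col]])
  est <- replicate(n_boot, {
    # Draw matched sets with replacement and stack them into one data frame
    picked <- sample(length(blocks), replace = TRUE)
    stat(do.call(rbind, blocks[picked]))
  })
  alpha <- 1 - level
  quantile(est, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

# Stand-in statistic: treated-minus-control difference in mean response
diff_means <- function(d) mean(d$Y[d$Z == 1]) - mean(d$Y[d$Z == 0])

ci <- block_boot_ci(toy, "matches", diff_means)
```

Resampling whole sets rather than individual rows keeps treated and control units that were matched together in the same bootstrap replicate.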

Examples

data(NHANES_blood_lead_small_matched)
attach(NHANES_blood_lead_small_matched)

CI_block_boot(9, c(1,0), c(0.5,0.5), 0, 0, NHANES_blood_lead_small_matched, n_boot = 10)

detach(NHANES_blood_lead_small_matched)

Estimate the treatment effect for a matched dataset given the set of sensitivity parameters.

Description

This is the main function in the package. Given a matched dataset and sensitivity parameters (p, lambda, delta), the function runs the EM algorithm by the method of weights and returns the estimated coefficients of the propensity score model and the outcome regression model.

Usage

EM_Algorithm(q, u, p, lambda, delta, data_matched, all_coef = FALSE,
             aug_data = FALSE, tol = 0.0001)

Arguments

q

Number of matched covariates plus treatment.

u

Unmeasured confounder; u = c(1,0) if the unmeasured confounder is assumed to be binary.

p

The probability vector corresponding to u; p = c(0.5, 0.5) if the unmeasured confounder is assumed to be Bernoulli(0.5).

lambda

Sensitivity parameter that controls association between U and treatment assignment.

delta

Sensitivity parameter that controls association between U and response.

data_matched

A matched dataset. See details below.

all_coef

If TRUE, all estimated coefficients are returned; if FALSE, only the estimated treatment effect is returned.

aug_data

If TRUE, the augmented data frame at the time of convergence is returned.

tol

Tolerance for the algorithm convergence.

Details

If the number of matched covariates is k, then q = k + 1.

If the hypothesized unmeasured confounder is binary, then u = c(1,0) and p = c(p, 1-p).

data_matched should be in the following format: the first (q-1) columns are the matched covariates, the qth column is the treatment status, the (q+1)th column is the placeholder column U0 for the unmeasured confounder, the (q+2)th column is the response, and the last column, i.e., the (q+3)th column, is the matched set assignment. We use the fullmatch function in the package optmatch to perform the full matching. See NHANES_blood_lead_small_matched for an example of a matched dataset and the examples section therein for instructions on how to construct such a matched dataset.
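
The column layout described above can be verified before calling the function. The following is a minimal sketch in base R; check_matched_layout is a hypothetical helper that is not part of the package, and the toy data frame only mimics the layout with k = 2 covariates, so q = 3.

```r
# Hypothetical helper: confirm the q + 3 column layout and name each role
check_matched_layout <- function(data_matched, q) {
  stopifnot(is.data.frame(data_matched), ncol(data_matched) == q + 3)
  cols <- names(data_matched)
  list(
    covariates  = cols[seq_len(q - 1)],  # first q - 1 columns
    treatment   = cols[q],               # qth column
    U0          = cols[q + 1],           # placeholder for the confounder
    response    = cols[q + 2],           # outcome
    matched_set = cols[q + 3]            # matched set assignment
  )
}

# Toy matched data with two covariates (k = 2, so q = 3)
toy <- data.frame(
  X1 = c(0.2, -0.1, 0.4, 0.0),
  X2 = c(1, 0, 1, 0),
  Z = c(1, 0, 1, 0),
  U0 = 1,
  Y = c(2.1, 1.7, 2.5, 1.9),
  matches = factor(c("1.1", "1.1", "1.2", "1.2"))
)

roles <- check_matched_layout(toy, q = 3)
```

A data frame with the wrong number of columns stops with an error instead of being silently misread.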

Examples

data(NHANES_blood_lead_small_matched)
attach(NHANES_blood_lead_small_matched)

# Run the EM algorithm assuming no unmeasured confounding, i.e., lambda = delta = 0
EM_Algorithm(9, c(1,0), c(0.5,0.5), 0, 0, NHANES_blood_lead_small_matched)

# Run the EM algorithm assuming the magnitude of the unmeasured confounding is lambda = delta = 1
EM_Algorithm(9, c(1,0), c(0.5,0.5), 1, 1, NHANES_blood_lead_small_matched)

detach(NHANES_blood_lead_small_matched)

Find the lambda-delta boundary for a fixed sensitivity parameter p.

Description

Given the dataset, unmeasured confounder, sensitivity parameter p, and a sequence of lambda values, the function uses binary search to find, for each lambda in lambda_vec, the delta such that the estimated 95% confidence interval for the treatment effect barely covers 0. The function returns a data frame consisting of lambda_vec and the corresponding deltas. See below for an example.

Usage

find_border(q, u, p, lambda_vec, start_value_low, start_value_high,
data_matched, n_boot = 2000, tol = 0.01)

Arguments

q

Number of matched covariates plus treatment.

u

Unmeasured confounder; u = c(1,0) if the unmeasured confounder is assumed to be binary.

p

The probability vector corresponding to u; p = c(0.5, 0.5) if the unmeasured confounder is assumed to be Bernoulli(0.5).

lambda_vec

A sequence of lambda values.

start_value_low

Starting value for the binary search (the lower endpoint).

start_value_high

Starting value for the binary search (the higher endpoint).

data_matched

The dataset after matching.

n_boot

Number of bootstrap samples used to approximate the CI.

tol

Tolerance for the binary search.

Details

start_value_low and start_value_high are user-supplied numbers that bracket the binary search for delta.
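
The role of the two starting values and of tol can be illustrated with a generic bisection. The following is a minimal sketch in base R, not the package's code. It assumes that the lower endpoint of the confidence interval decreases as delta grows, and it replaces the bootstrap interval with a toy function; bisect_delta and toy_ci_lower are hypothetical names.

```r
# Hypothetical helper: bisection for the delta at which the lower CI
# endpoint reaches 0, assuming ci_lower() decreases in delta
bisect_delta <- function(ci_lower, low, high, tol = 0.01) {
  # The starting values must bracket the crossing point
  stopifnot(ci_lower(low) > 0, ci_lower(high) <= 0)
  while (high - low > tol) {
    mid <- (low + high) / 2
    if (ci_lower(mid) > 0) {
      low <- mid    # CI still excludes 0: the border lies above mid
    } else {
      high <- mid   # CI covers 0: the border lies at or below mid
    }
  }
  (low + high) / 2
}

# Toy stand-in for the bootstrap lower endpoint; it crosses 0 at delta = 1.5
toy_ci_lower <- function(delta) 1.5 - delta

delta_hat <- bisect_delta(toy_ci_lower, low = 0, high = 4, tol = 0.01)
```

Each iteration halves the bracket, so a tighter tol costs only a few more interval evaluations, each of which requires a full bootstrap in the real setting.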

Examples

data(NHANES_blood_lead_small_matched)
attach(NHANES_blood_lead_small_matched)

find_border(9, c(1,0), c(0.5,0.5), c(0.5,1,1.5), 0, 4,
NHANES_blood_lead_small_matched, n_boot = 1000)

detach(NHANES_blood_lead_small_matched)

Estimate the maximum delta for fixed sensitivity parameters p and lambda.

Description

Estimate the maximum delta value for a given p and lambda such that the estimated 95% confidence interval for the treatment effect still excludes 0, i.e., the treatment effect remains significant. Note that to run this function, the optmatch package needs to be installed and loaded.

Usage

find_delta(q, u, p, lambda, start_value_low, start_value_high,
data_matched, n_boot = 200, tol = 0.01)

Arguments

q

Number of matched covariates plus treatment.

u

Unmeasured confounder; u = c(1,0) if the unmeasured confounder is assumed to be binary.

p

The probability vector corresponding to u; p = c(0.5, 0.5) if the unmeasured confounder is assumed to be Bernoulli(0.5).

lambda

A lambda value.

start_value_low

Starting value for the binary search (the lower endpoint).

start_value_high

Starting value for the binary search (the higher endpoint).

data_matched

The dataset after matching.

n_boot

Number of bootstrap samples used to approximate the CI.

tol

Tolerance for the binary search.

Details

start_value_low and start_value_high are user supplied numbers to start the binary search.

Examples

data(NHANES_blood_lead_small_matched)
attach(NHANES_blood_lead_small_matched)

find_delta(9, c(1,0), c(0.5,0.5), 1, 1, 3,
NHANES_blood_lead_small_matched, n_boot = 1000)

detach(NHANES_blood_lead_small_matched)

Second-hand smoking and blood lead levels dataset from NHANES III.

Description

A dataset constructed from NHANES III.

Usage

data(NHANES_blood_lead)

Format

A data frame with 4519 observations on the following 10 variables.

COP

treatment, 1 if cotinine level is between 0.563-14.9 ng/ml and 0 otherwise

DMARETHN

1 if white, 0 if others

DMPPIR

Poverty income ratio

HFE1

1 if the house is built before 1974, 0 if after 1974

HFE2

number of rooms in the house

HFHEDUCR

education level of the reference adult

HSAGEIR

age at the time of interview

HSFSIZER

size of the family

HSSEX

1 if male, 0 if female

PBP

blood lead level

Details

We follow Mannino et al. (2003) in constructing a dataset that includes children aged 4-16 years for whom both serum cotinine levels and blood lead levels were measured in the Third National Health and Nutrition Examination Survey (NHANES III), along with the following variables: race/ethnicity, age, sex, poverty income ratio, education level of the reference adult, family size, number of rooms in the house, and year the house was constructed. The biomarker cotinine is a metabolite of nicotine and an indicator of second-hand smoke exposure. Treatment status is 1 if cotinine level is between 0.563-14.9 ng/ml and 0 otherwise. All continuous/ordinal variables are standardized by subtracting the mean and dividing by 2 standard deviations so that they are more comparable to binary covariates (Gelman 2008).
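
The standardization described above can be written in one line. The following is a minimal sketch in base R; standardize_2sd is a hypothetical helper and the ages are made-up values, not NHANES records.

```r
# Hypothetical helper: center at the mean and divide by two standard deviations
standardize_2sd <- function(x) {
  (x - mean(x)) / (2 * sd(x))
}

# Made-up ages for illustration only
age <- c(4, 7, 9, 12, 16)
age_std <- standardize_2sd(age)

# The rescaled variable has mean 0 and standard deviation 0.5
mean_std <- mean(age_std)
sd_std <- sd(age_std)
```

A standard deviation of 0.5 matches that of a binary covariate with equal proportions of 0 and 1, which is what makes the coefficients roughly comparable.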

Source

NHANES III, the Third US National Health and Nutrition Examination Survey.

References

D. M. Mannino, R. Albalak, S. D. Grosse, and J. Repace. Second-hand smoke exposure and blood lead levels in U.S. children. Epidemiology, 14:719-727, 2003.

A. Gelman. Scaling regression inputs by dividing by two standard deviations. Statistics in Medicine, 27:2865-2873, 2008.

Examples

data(NHANES_blood_lead)

A random subset of NHANES_blood_lead data.

Description

A random subset of NHANES_blood_lead data for the purpose of testing.

Usage

data(NHANES_blood_lead_small)

Format

A random sample from the NHANES_blood_lead dataset. It consists of 500 instances and the same 10 variables as the NHANES_blood_lead data.

COP

treatment, 1 if cotinine level is between 0.563-14.9 ng/ml and 0 otherwise

DMARETHN

1 if white, 0 if others

DMPPIR

Poverty income ratio

HFE1

1 if the house is built before 1974, 0 if after 1974

HFE2

number of rooms in the house

HFHEDUCR

education level of the reference adult

HSAGEIR

age at the time of interview

HSFSIZER

size of the family

HSSEX

1 if male, 0 if female

PBP

blood lead level

Details

We take a random sample of 500 observations from the NHANES_blood_lead dataset. This small dataset is primarily for the purpose of testing the algorithm.

Source

NHANES III, the Third US National Health and Nutrition Examination Survey.

References

D. M. Mannino, R. Albalak, S. D. Grosse, and J. Repace. Second-hand smoke exposure and blood lead levels in U.S. children. Epidemiology, 14:719-727, 2003.

A. Gelman. Scaling regression inputs by dividing by two standard deviations. Statistics in Medicine, 27:2865-2873, 2008.

Examples

data(NHANES_blood_lead_small)

NHANES_blood_lead_small data after matching.

Description

NHANES_blood_lead_small data after a full matching using the optmatch package.

Usage

data(NHANES_blood_lead_small_matched)

Format

NHANES_blood_lead_small dataset after a full matching. It consists of 500 instances and the following 12 variables:

COP

treatment, 1 if cotinine level is between 0.563-14.9 ng/ml and 0 otherwise

DMARETHN

1 if white, 0 if others

DMPPIR

Poverty income ratio

HFE1

1 if the house is built before 1974, 0 if after 1974

HFE2

number of rooms in the house

HFHEDUCR

education level of the reference adult

HSAGEIR

age at the time of interview

HSFSIZER

size of the family

HSSEX

1 if male, 0 if female

PBP

blood lead level

U0

placeholder for the hypothesized unmeasured confounder U

matches

matched set assignment

Details

We perform a full matching on the NHANES_blood_lead_small dataset using the optmatch package. The code for constructing this matched dataset from the original dataset is given in the examples section. We add a column U0 as a placeholder for the unmeasured confounder U.

Source

NHANES III, the Third US National Health and Nutrition Examination Survey.

References

D. M. Mannino, R. Albalak, S. D. Grosse, and J. Repace. Second-hand smoke exposure and blood lead levels in U.S. children. Epidemiology, 14:719-727, 2003.

A. Gelman. Scaling regression inputs by dividing by two standard deviations. Statistics in Medicine, 27:2865-2873, 2008.

Examples

## Not run: 
# To run this example, optmatch must be installed
set.seed(1)
library(optmatch)
data(NHANES_blood_lead_small)
attach(NHANES_blood_lead_small)

# Perform a fullmatch
fm = fullmatch(COP ~. , data = NHANES_blood_lead_small[, 1:9], min.controls = 1/4, max.controls = 4)
NHANES_blood_lead_small_matched = cbind(NHANES_blood_lead_small, matches = fm)

# Add a U0 column as a placeholder for the unmeasured confounder
U0 = rep(1, dim(NHANES_blood_lead_small_matched)[1])
NHANES_blood_lead_small_matched = cbind(NHANES_blood_lead_small_matched[,1:9], U0,
NHANES_blood_lead_small_matched[, 10:11])

## End(Not run)