Title: | Prioritize Variables with Joint Variable Importance Plot in Observational Study Design |
---|---|
Description: | In the observational study design stage, matching/weighting methods are conducted. However, when many background variables are present, the decision as to which variables to prioritize for matching/weighting is not trivial. Thus, the joint treatment-outcome variable importance plots are created to guide variable selection. The joint variable importance plots enhance variable comparisons via unadjusted bias curves derived under the omitted variable bias framework. The plots translate variable importance into recommended values for tuning parameters in existing methods. Post-matching and/or weighting plots can also be used to visualize and assess the quality of the observational study design. The method motivation and derivation is presented in "Prioritizing Variables for Observational Study Design using the Joint Variable Importance Plot" by Liao et al. (2024) <doi:10.1080/00031305.2024.2303419>. See the package paper by Liao and Pimentel (2024) <doi:10.21105/joss.06093> for a beginner friendly user introduction. |
Authors: | Lauren D. Liao [aut, cre] , Samuel D. Pimentel [aut] |
Maintainer: | Lauren D. Liao <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2024-11-25 15:18:38 UTC |
Source: | CRAN |
support function to plot bias curves
add_bias_curves(p, ...)
add_bias_curves(p, ...)
p |
plot made with jointVIP object |
... |
encompasses other variables needed |
a joint variable importance plot of class ggplot
with curves
support function to plot variable text labels
add_variable_labels(p, ...)
add_variable_labels(p, ...)
p |
plot made with jointVIP object |
... |
encompasses other variables needed |
a joint variable importance plot of class ggplot
with curves
plot the bootstrap version of the jointVIP object
bootstrap.plot( x, ..., smd = "cross-sample", use_abs = TRUE, plot_title = "Joint Variable Importance Plot", B = 100 )
bootstrap.plot( x, ..., smd = "cross-sample", use_abs = TRUE, plot_title = "Joint Variable Importance Plot", B = 100 )
x |
a jointVIP object |
... |
custom options: |
smd |
specify the standardized mean difference is |
use_abs |
TRUE (default) for absolute measures |
plot_title |
optional string for plot title |
B |
100 (default) for the number of times the bootstrap step wished to run |
a joint variable importance plot of class ggplot
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) # more bootstrap number B would be typically used in real settings # this is just a small example set.seed(1234567891) bootstrap.plot(new_jointVIP, B = 15)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) # more bootstrap number B would be typically used in real settings # this is just a small example set.seed(1234567891) bootstrap.plot(new_jointVIP, B = 15)
A subset of data from the Centers for Disease Control and Prevention 2015 Behavioral Risk Factor Surveillance System (BRFSS) Survey
brfss
brfss
brfss
A data frame with 5,000 rows and 17 columns:
Chronic obstructive pulmonary disease
Smoke
Sex
Weight
Average drinks answers to: during the past 30 days, when you drank, how many drinks did you drink on average?
Race group
Age groups
http://static.lib.virginia.edu/statlab/materials/data/brfss_2015_sample.csv
support function for ceiling function with decimals
ceiling_dec(num, dec_place = 1)
ceiling_dec(num, dec_place = 1)
num |
numeric |
dec_place |
decimal place that is desired ceiling for |
numeric number desired
Check measures Check to see if there is any missing values or variables without any variation or identical rows (only unique rows will be used)
check_measures(measures)
check_measures(measures)
measures |
measures needed for jointVIP |
measures needed for jointVIP
This is creates the jointVIP object & check inputs
create_jointVIP(treatment, outcome, covariates, pilot_df, analysis_df)
create_jointVIP(treatment, outcome, covariates, pilot_df, analysis_df)
treatment |
string denoting the name of the binary treatment variable, containing numeric values: 0 denoting control and 1 denoting treated |
outcome |
string denoting the name of a numeric outcome variable |
covariates |
vector of strings or list denoting column names of interest |
pilot_df |
data.frame of the pilot data; character and factor variables are automatically one-hot encoded |
analysis_df |
data.frame of the analysis data; character and factor variables are automatically one-hot encoded |
a jointVIP object
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df)
This is creates the post_jointVIP object & check inputs
create_post_jointVIP(object, post_analysis_df, wts = NA)
create_post_jointVIP(object, post_analysis_df, wts = NA)
object |
a jointVIP object |
post_analysis_df |
post matched or weighted data.frame |
wts |
user-supplied weights |
a post_jointVIP object (subclass of jointVIP)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data)
support function for floor function with decimals
floor_dec(num, dec_place = 1)
floor_dec(num, dec_place = 1)
num |
numeric |
dec_place |
decimal place that is desired floor for |
numeric number desired
Calculate bootstrapped variation additional tool to help calculate the uncertainty of each variable's bias
get_boot_measures(object, smd = "cross-sample", use_abs = TRUE, B = 100)
get_boot_measures(object, smd = "cross-sample", use_abs = TRUE, B = 100)
object |
jointVIP object |
smd |
calculate standardized mean difference either using |
use_abs |
TRUE (default) for absolute measures |
B |
100 (default) for the number of times the bootstrap step wished to run |
bootstrapped measures needed for bootstrap-jointVIP
Prepare data frame to plot standardized omitted variable bias Marginal standardized mean differences and outcome correlation
get_measures(object, smd = "cross-sample")
get_measures(object, smd = "cross-sample")
object |
jointVIP object |
smd |
calculate standardized mean difference either using |
measures needed for jointVIP
Post-measures data frame to plot post-standardized omitted variable bias
get_post_measures(object, smd = "cross-sample")
get_post_measures(object, smd = "cross-sample")
object |
post_jointVIP object |
smd |
calculate standardized mean difference either using |
measures needed for jointVIP
support function for one-hot encoding
one_hot(df)
one_hot(df)
df |
data.frame object for performing one-hot encoding |
data.frame object with factor variables one-hot encoded for each level
plot the jointVIP object
## S3 method for class 'jointVIP' plot( x, ..., smd = "cross-sample", use_abs = TRUE, plot_title = "Joint Variable Importance Plot" )
## S3 method for class 'jointVIP' plot( x, ..., smd = "cross-sample", use_abs = TRUE, plot_title = "Joint Variable Importance Plot" )
x |
a jointVIP object |
... |
custom options: |
smd |
specify the standardized mean difference is |
use_abs |
TRUE (default) for absolute measures |
plot_title |
optional string for plot title |
a joint variable importance plot of class ggplot
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) plot(new_jointVIP)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) plot(new_jointVIP)
plot the post_jointVIP object this plot uses the same custom options as the jointVIP object
## S3 method for class 'post_jointVIP' plot( x, ..., smd = "cross-sample", use_abs = TRUE, plot_title = "Joint Variable Importance Plot", add_post_labs = TRUE, post_label_cut_bias = 0.005 )
## S3 method for class 'post_jointVIP' plot( x, ..., smd = "cross-sample", use_abs = TRUE, plot_title = "Joint Variable Importance Plot", add_post_labs = TRUE, post_label_cut_bias = 0.005 )
x |
a post_jointVIP object |
... |
custom options: |
smd |
specify the standardized mean difference is |
use_abs |
TRUE (default) for absolute measures |
plot_title |
optional string for plot title |
add_post_labs |
TRUE (default) show post-measure labels |
post_label_cut_bias |
0.005 (default) show cutoff above this number; suppressed if show_post_labs is FALSE |
a post-analysis joint variable importance plot of class ggplot
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data) plot(post_dat_jointVIP)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data) plot(post_dat_jointVIP)
Obtains a print for jointVIP object
## S3 method for class 'jointVIP' print(x, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01)
## S3 method for class 'jointVIP' print(x, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01)
x |
a jointVIP object |
... |
not used |
smd |
specify the standardized mean difference is |
use_abs |
TRUE (default) for absolute measures |
bias_tol |
numeric 0.01 (default) any bias above the absolute bias_tol will be printed |
measures used to create the plot of jointVIP
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) print(new_jointVIP)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) print(new_jointVIP)
Obtains a print for post_jointVIP object
## S3 method for class 'post_jointVIP' print(x, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01)
## S3 method for class 'post_jointVIP' print(x, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01)
x |
a post_jointVIP object |
... |
not used |
smd |
specify the standardized mean difference is |
use_abs |
TRUE (default) for absolute measures |
bias_tol |
numeric 0.01 (default) any bias above the absolute bias_tol will be printed |
measures used to create the plot of jointVIP
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data) print(post_dat_jointVIP)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data) print(post_dat_jointVIP)
Obtains a summary jointVIP object
## S3 method for class 'jointVIP' summary(object, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01)
## S3 method for class 'jointVIP' summary(object, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01)
object |
a jointVIP object |
... |
not used |
smd |
specify the standardized mean difference is |
use_abs |
TRUE (default) for absolute measures |
bias_tol |
numeric 0.01 (default) any bias above the absolute bias_tol will be summarized |
no return value
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) summary(new_jointVIP)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) summary(new_jointVIP)
Obtains a summary post_jointVIP object
## S3 method for class 'post_jointVIP' summary( object, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01, post_bias_tol = 0.005 )
## S3 method for class 'post_jointVIP' summary( object, ..., smd = "cross-sample", use_abs = TRUE, bias_tol = 0.01, post_bias_tol = 0.005 )
object |
a post_jointVIP object |
... |
not used |
smd |
specify the standardized mean difference is |
use_abs |
TRUE (default) for absolute measures |
bias_tol |
numeric 0.01 (default) any bias above the absolute bias_tol will be summarized |
post_bias_tol |
numeric 0.005 (default) any bias above the absolute bias_tol will be summarized |
no return value
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data) summary(post_dat_jointVIP)
data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) # random 20 percent of control as pilot data pilot_sample_num = sample(which(data$trt == 0), length(which(data$trt == 0)) * 0.2) pilot_df = data[pilot_sample_num, ] analysis_df = data[-pilot_sample_num, ] treatment = "trt" outcome = "out" covariates = names(analysis_df)[!names(analysis_df) %in% c(treatment, outcome)] new_jointVIP = create_jointVIP(treatment = treatment, outcome = outcome, covariates = covariates, pilot_df = pilot_df, analysis_df = analysis_df) ## at this step typically you may wish to do matching or weighting ## the results after can be stored as a post_data ## the post_data here is not matched or weighted, only for illustrative purposes post_data <- data.frame(year = rnorm(50, 200, 5), pop = rnorm(50, 1000, 500), gdpPercap = runif(50, 100, 1000), trt = rbinom(50, 1, 0.5), out = rnorm(50, 1, 0.2)) post_dat_jointVIP = create_post_jointVIP(new_jointVIP, post_data) summary(post_dat_jointVIP)