We provide `gen_syn_data` to generate synthetic data for the CausalGPS package.

Input parameters:

- `sample_size`: Number of data samples.
- `seed`: The seed of R’s random number generator.
- `outcome_sd`: Standard deviation used to generate the outcome.
- `gps_spec`: A numerical value (1-7) that indicates the GPS model used to generate synthetic data. See the following section for more details.
- `cova_spec`: A numerical value (1-2) to modify the covariates. See the code for more details.
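
A call with the parameters listed above might look like the sketch below. The argument values are illustrative rather than recommended defaults. The function name and argument names are taken from the list above, and the guard assumes that other releases of the package may export the generator under a different name or signature, in which case the call is skipped.

```r
# Illustrative argument values; the names are the input parameters listed above.
syn_args <- list(
  sample_size = 1000,
  seed = 300,
  outcome_sd = 10,
  gps_spec = 1,
  cova_spec = 1
)

# Assumption: the installed CausalGPS exports gen_syn_data with these arguments.
# The checks below skip the call if that is not the case.
ns_ok <- requireNamespace("CausalGPS", quietly = TRUE) &&
  "gen_syn_data" %in% getNamespaceExports("CausalGPS")

if (ns_ok) {
  gen_fun <- getExportedValue("CausalGPS", "gen_syn_data")
  if (all(names(syn_args) %in% names(formals(gen_fun)))) {
    syn_data <- do.call(gen_fun, syn_args)
    str(syn_data)  # inspect the generated columns
  }
}
```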
We generate six confounders (C1, C2, ..., C6), which include a combination of continuous and categorical variables, and generate W using six specifications of the generalized propensity score model,
$W = 9\{-0.8+ (0.1,0.1,-0.1,0.2,0.1,0.1) \boldsymbol{C}\} + 17 + N(0,5)$
$W = 15\{-0.8+ (0.1,0.1,-0.1,0.2,0.1,0.1) \boldsymbol{C}\} + 22 + T(2)$
$W = 9\{-0.8+ (0.1,0.1,-0.1,0.2,0.1,0.1) \boldsymbol{C}\} + \frac{3}{2} C_3^2 + 15 + N(0,5)$
$W = \frac{49 \exp(\{-0.8+ (0.1,0.1,-0.1,0.2,0.1,0.1) \boldsymbol{C}\})}{1+ \exp(\{-0.8+ (0.1,0.1,-0.1,0.2,0.1,0.1) \boldsymbol{C}\})} -6 + N(0,5)$
$W = \frac{42}{1+ \exp(\{-0.8+ (0.1,0.1,-0.1,0.2,0.1,0.1) \boldsymbol{C}\})} - 18 + N(0,5)$
$W = 7 \log(\{-0.8+ (0.1,0.1,-0.1,0.2,0.1,0.1) \boldsymbol{C}\}) + 13 + N(0,4)$
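
The following base R sketch shows how the first five specifications above could be simulated. It is not the package's implementation. The confounder distributions are assumptions made only for illustration (the text does not specify them), the second argument of N(0, 5) is read as a standard deviation, and `sketch_w` is a hypothetical helper name. The sixth specification is left out because, as written, the argument of the logarithm can be negative.

```r
set.seed(300)
n <- 1000

# Assumed confounder distributions (not specified in the text above).
cf <- cbind(
  matrix(rnorm(n * 4), ncol = 4),   # C1..C4: continuous
  sample(-2:2, n, replace = TRUE),  # C5: categorical
  runif(n, min = -3, max = 3)       # C6: continuous
)
colnames(cf) <- paste0("C", 1:6)

# Shared linear predictor: -0.8 + (0.1, 0.1, -0.1, 0.2, 0.1, 0.1) C
beta <- c(0.1, 0.1, -0.1, 0.2, 0.1, 0.1)
lin <- as.vector(-0.8 + cf %*% beta)

# Hypothetical helper: draw W under GPS specifications 1 to 5.
sketch_w <- function(spec, lin, c3) {
  n <- length(lin)
  switch(as.character(spec),
    "1" = 9 * lin + 17 + rnorm(n, sd = 5),
    "2" = 15 * lin + 22 + rt(n, df = 2),
    "3" = 9 * lin + 1.5 * c3^2 + 15 + rnorm(n, sd = 5),
    "4" = 49 * exp(lin) / (1 + exp(lin)) - 6 + rnorm(n, sd = 5),
    "5" = 42 / (1 + exp(lin)) - 18 + rnorm(n, sd = 5),
    stop("specification not covered by this sketch")
  )
}

w1 <- sketch_w(1, lin, cf[, "C3"])
summary(w1)
```

Computing the linear predictor once and passing it to every specification keeps the five formulas visually close to their written form.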
We generate Y from an outcome model that is assumed to be a cubic function of W, with additive terms for the confounders and interactions between W and the confounders C,
$Y \mid W, \boldsymbol{C} \sim N\{\mu(W, \boldsymbol{C}), sd^2\}$
$\mu(W, \boldsymbol{C}) = -10 - (2,2,3,-1,2,2) \boldsymbol{C} - W(0.1 - 0.1C_1 + 0.1C_4 + 0.1C_5 + 0.1C_3^2) + 0.13^2 W^3$
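
As a rough illustration of the outcome model, the base R sketch below evaluates the mean function and draws Y around it. It reads the last two terms as $0.1C_3^2$ and $0.13^2 W^3$, uses assumed confounder distributions that the text does not specify, and draws W from the first GPS specification with N(0, 5) read as a standard deviation of 5. `outcome_mean` is a hypothetical helper name, not a package function.

```r
set.seed(300)
n <- 500
outcome_sd <- 10  # illustrative value for the outcome standard deviation

# Assumed confounder distributions (illustrative only).
cf <- cbind(
  matrix(rnorm(n * 4), ncol = 4),   # C1..C4
  sample(-2:2, n, replace = TRUE),  # C5
  runif(n, min = -3, max = 3)       # C6
)

# Treatment from the first GPS specification.
beta <- c(0.1, 0.1, -0.1, 0.2, 0.1, 0.1)
w <- as.vector(9 * (-0.8 + cf %*% beta) + 17 + rnorm(n, sd = 5))

# Hypothetical helper: mean of Y given W and C.
outcome_mean <- function(w, cf) {
  gamma <- c(2, 2, 3, -1, 2, 2)
  as.vector(
    -10 - cf %*% gamma -
      w * (0.1 - 0.1 * cf[, 1] + 0.1 * cf[, 4] + 0.1 * cf[, 5] +
             0.1 * cf[, 3]^2) +
      0.13^2 * w^3
  )
}

# Y | W, C ~ N(mu(W, C), sd^2)
y <- rnorm(n, mean = outcome_mean(w, cf), sd = outcome_sd)

# Sanity checks with all confounders set to zero.
zero_cf <- matrix(0, nrow = 1, ncol = 6)
outcome_mean(0, zero_cf)   # -10
outcome_mean(10, zero_cf)  # 5.9
```

With all confounders at zero the mean reduces to $-10 - 0.1W + 0.13^2 W^3$, which makes the cubic dependence on W easy to check by hand.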