Using DevTreatRules

Here, we illustrate the DevTreatRules package by building and evaluating treatment rules based on the example dataset included with the package.

library(DevTreatRules)
head(obsStudyGeneExpressions)
##   no_relapse intervention prognosis clinic age gene_1 gene_2 gene_3 gene_4
## 1          1            0      Good      1  46  0.531  0.071 0.1294  0.427
## 2          0            1      Poor      3  51  0.744  0.711 1.3532  1.939
## 3          1            0      Poor      3  62  1.146  0.498 1.4707  1.456
## 4          1            0      Good      5  51  1.816  1.758 0.2226  0.688
## 5          1            1      Good      4  52  0.403  0.636 0.0933  0.288
## 6          1            1      Poor      6  51  1.797  0.642 0.2618  0.277
##   gene_5 gene_6 gene_7 gene_8 gene_9 gene_10
## 1  0.421  0.402  0.831 0.0136  1.186  1.8831
## 2  0.230  0.228  0.282 1.3413  1.106  0.6606
## 3  0.291  0.543  0.915 0.4839  1.193  0.6233
## 4  0.620  1.571  1.606 1.9986  0.785  1.7684
## 5  0.300  1.276  0.642 1.9003  0.521  0.0126
## 6  1.053  0.939  1.736 0.2348  1.480  1.3908

Split the Dataset

First, we split obsStudyGeneExpressions into independent development/validation/evaluation partitions by calling the SplitData() function

set.seed(123)
example.split <- SplitData(data=obsStudyGeneExpressions, n.sets=3, split.proportions=c(0.5, 0.25, 0.25))
table(example.split$partition)
## 
## development  validation  evaluation 
##        2500        1250        1250

and then forming independent datasets based on the partition variable created above.

library(dplyr)
development.data <- example.split %>% filter(partition == "development")
validation.data <- example.split %>% filter(partition == "validation")
evaluation.data <- example.split %>% filter(partition == "evaluation")

Specify Variable Roles

Suppose we believe the variables prognosis, clinic, and age may have influenced treatment assignment. We would codify this knowledge into DevTreatRules by specifying the argument

names.influencing.treatment=c("prognosis", "clinic", "age")

in functions we will call later in this vignette. Further suppose that we don’t expect prognosis and clinic to be measured on the same scale in independent clinical settings where we would like to apply our estimated rule (so they are not sensible rule inputs). However, we do expect age and the gene expression measurements in gene_1, …, gene_10 to potentially influence treatment response and also to be reliably measured in future settings (so they are sensible rule inputs). We specify this knowledge in DevTreatRules with the argument

names.influencing.rule=c("age", paste0("gene_", 1:10))

On the Development Dataset, Build the Treatment Rule

Although we could directly estimate a single treatment rule on the development dataset with BuildRule() using, for example,

one.rule <- BuildRule(development.data=development.data,
                      study.design="observational",
                      prediction.approach="split.regression",
                      name.outcome="no_relapse",
                      type.outcome="binary",
                      desirable.outcome=TRUE,
                      name.treatment="intervention",
                      names.influencing.treatment=c("prognosis", "clinic", "age"),
                      names.influencing.rule=c("age", paste0("gene_", 1:10)),
                      propensity.method="logistic.regression",
                      rule.method="glm.regression")

this has limited utility because it requires us to specify just one value for the prediction.approach argument (even if we don’t have a priori knowledge about which of the split-regression, OWL framework, and direct-interactions approaches will perform best) and just one regression method for the propensity.method and rule.method arguments (even if we are not sure whether standard logistic regression or lasso/ridge logistic regression will yield a better rule).

Instead, it would be more useful to perform model selection to estimate rules for different combinations of split-regression/OWL framework/direct-interactions and standard/lasso/ridge logistic regression (e.g. by looping over calls to BuildRule()). The model-selection process is automated in CompareRulesOnValidation().
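
For concreteness, here is a minimal sketch of such a loop (the object names are ours, just for illustration); this bookkeeping is exactly what CompareRulesOnValidation() automates:

manual.rules <- list()
for (approach in c("split.regression", "OWL.framework", "direct.interactions")) {
  for (rule.method in c("glm.regression", "lasso", "ridge")) {
    # reuse the BuildRule() call from above, varying only the approach and rule method
    manual.rules[[paste(approach, rule.method, sep="_")]] <-
      BuildRule(development.data=development.data,
                study.design="observational",
                prediction.approach=approach,
                name.outcome="no_relapse",
                type.outcome="binary",
                desirable.outcome=TRUE,
                name.treatment="intervention",
                names.influencing.treatment=c("prognosis", "clinic", "age"),
                names.influencing.rule=c("age", paste0("gene_", 1:10)),
                propensity.method="logistic.regression",
                rule.method=rule.method)
  }
}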

On the Validation Dataset, Perform Model Selection

Here we will perform model selection by calling CompareRulesOnValidation() with the arguments

vec.approaches=c("OWL", "split.regression", "OWL.framework", "direct.interactions")
vec.rule.methods=c("glm.regression", "lasso", "ridge")

which are actually the default values of CompareRulesOnValidation(), and with

vec.propensity.methods="logistic.regression"

Now we perform model selection by calling CompareRulesOnValidation()

set.seed(123)
model.selection <- CompareRulesOnValidation(development.data=development.data,
                validation.data=validation.data,
                study.design.development="observational",
                vec.approaches=c("split.regression", "OWL.framework", "direct.interactions"),
                vec.rule.methods=c("glm.regression", "lasso", "ridge"),
                vec.propensity.methods="logistic.regression",
                name.outcome.development="no_relapse",
                type.outcome.development="binary",
                name.treatment.development="intervention",
                names.influencing.treatment.development=c("prognosis", "clinic", "age"),
                names.influencing.rule.development=c("age", paste0("gene_", 1:10)),
                desirable.outcome.development=TRUE)

We can compare these estimates for the nine treatment rules (three prediction approaches crossed with three rule methods, each using logistic regression to estimate the propensity score) to decide which rules to bring forward to the evaluation dataset. For context, the estimates for the naive “treat.all” and “treat.none” strategies are always provided by CompareRulesOnValidation().

First, for rules estimated with the split-regression approach, we obtain the estimates

model.selection$list.summaries[["split.regression"]] 
##                                                    Positives Negatives
## propensity_logistic.regression_rule_glm.regression       665       585
## propensity_logistic.regression_rule_lasso                694       556
## propensity_logistic.regression_rule_ridge                762       488
## treat.all                                               1250         0
## treat.none                                                 0      1250
##                                                    ATE in Positives
## propensity_logistic.regression_rule_glm.regression          0.11919
## propensity_logistic.regression_rule_lasso                   0.11078
## propensity_logistic.regression_rule_ridge                   0.08535
## treat.all                                                  -0.00375
## treat.none                                                       NA
##                                                    ATE in Negatives      ABR
## propensity_logistic.regression_rule_glm.regression         -0.11361  0.11658
## propensity_logistic.regression_rule_lasso                  -0.14051  0.12400
## propensity_logistic.regression_rule_ridge                  -0.12102  0.09927
## treat.all                                                        NA -0.00375
## treat.none                                                 -0.00375  0.00375

Next, for the OWL framework we have

model.selection$list.summaries[["OWL.framework"]] 
##                                                    Positives Negatives
## propensity_logistic.regression_rule_glm.regression       673       577
## propensity_logistic.regression_rule_lasso               1250         0
## propensity_logistic.regression_rule_ridge               1250         0
## treat.all                                               1250         0
## treat.none                                                 0      1250
##                                                    ATE in Positives
## propensity_logistic.regression_rule_glm.regression          0.12056
## propensity_logistic.regression_rule_lasso                  -0.00375
## propensity_logistic.regression_rule_ridge                  -0.00375
## treat.all                                                  -0.00375
## treat.none                                                       NA
##                                                    ATE in Negatives      ABR
## propensity_logistic.regression_rule_glm.regression         -0.12967  0.12477
## propensity_logistic.regression_rule_lasso                        NA -0.00375
## propensity_logistic.regression_rule_ridge                        NA -0.00375
## treat.all                                                        NA -0.00375
## treat.none                                                 -0.00375  0.00375

and, last, for the direct-interactions approach

model.selection$list.summaries[["direct.interactions"]] 
##                                                    Positives Negatives
## propensity_logistic.regression_rule_glm.regression       673       577
## propensity_logistic.regression_rule_lasso                697       553
## propensity_logistic.regression_rule_ridge               1250         0
## treat.all                                               1250         0
## treat.none                                                 0      1250
##                                                    ATE in Positives
## propensity_logistic.regression_rule_glm.regression          0.12457
## propensity_logistic.regression_rule_lasso                   0.08807
## propensity_logistic.regression_rule_ridge                  -0.00375
## treat.all                                                  -0.00375
## treat.none                                                       NA
##                                                    ATE in Negatives      ABR
## propensity_logistic.regression_rule_glm.regression         -0.12672  0.12556
## propensity_logistic.regression_rule_lasso                  -0.10951  0.09755
## propensity_logistic.regression_rule_ridge                        NA -0.00375
## treat.all                                                        NA -0.00375
## treat.none                                                 -0.00375  0.00375
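
Rather than scanning the three printed tables by eye, we could also rank all nine candidate rules by their estimated ABR. The helper below is just a sketch; it assumes each element of list.summaries is a matrix with the row and column names printed above:

abr.by.rule <- unlist(lapply(names(model.selection$list.summaries), function(approach) {
  summ <- model.selection$list.summaries[[approach]]
  keep <- !(rownames(summ) %in% c("treat.all", "treat.none"))  # drop the naive strategies
  setNames(summ[keep, "ABR"], paste(approach, rownames(summ)[keep], sep=": "))
}))
sort(abr.by.rule, decreasing=TRUE)  # larger estimated ABR is better on the validation set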

Based on the above estimates in the validation set, we decide to select three rules to bring forward to the evaluation set: (1) split-regression with logistic/logistic as the propensity/rule methods,

model.selection$list.summaries$split.regression["propensity_logistic.regression_rule_glm.regression", ]
##        Positives        Negatives ATE in Positives ATE in Negatives 
##          665.000          585.000            0.119           -0.114 
##              ABR 
##            0.117

along with (2) OWL framework with logistic/logistic as the propensity/rule methods

model.selection$list.summaries$OWL.framework["propensity_logistic.regression_rule_glm.regression", ]
##        Positives        Negatives ATE in Positives ATE in Negatives 
##          673.000          577.000            0.121           -0.130 
##              ABR 
##            0.125

and (3) direct-interactions with logistic/lasso as the propensity/rule methods.

model.selection$list.summaries$direct.interactions["propensity_logistic.regression_rule_lasso", ]
##        Positives        Negatives ATE in Positives ATE in Negatives 
##         697.0000         553.0000           0.0881          -0.1095 
##              ABR 
##           0.0976

We can also see the underlying coefficient estimates for these rules with

rule.split <- model.selection$list.rules$split.regression[["propensity_logistic.regression_rule_glm.regression"]]
coef(rule.split$rule.object.control)
## (Intercept)         age      gene_1      gene_2      gene_3      gene_4 
##     0.81702     0.00354    -0.53828    -0.09668     0.02675    -0.06418 
##      gene_5      gene_6      gene_7      gene_8      gene_9     gene_10 
##    -0.24930    -0.02657    -0.20052    -0.02206    -0.06508     0.35033
coef(rule.split$rule.object.treat)
## (Intercept)         age      gene_1      gene_2      gene_3      gene_4 
##    -0.26512    -0.00177     0.47034    -0.02004    -0.12251    -0.14514 
##      gene_5      gene_6      gene_7      gene_8      gene_9     gene_10 
##     0.14135    -0.04188    -0.03558     0.08977     0.11137     0.12449
rule.OWL <- model.selection$list.rules$OWL.framework[["propensity_logistic.regression_rule_glm.regression"]]
coef(rule.OWL$rule.object)
## (Intercept)         age      gene_1      gene_2      gene_3      gene_4 
##    -0.50119    -0.00340     0.46473     0.05083    -0.04349    -0.03906 
##      gene_5      gene_6      gene_7      gene_8      gene_9     gene_10 
##     0.15631     0.00839     0.09120     0.05875     0.08967    -0.10412
rule.DI <- model.selection$list.rules$direct.interactions[["propensity_logistic.regression_rule_lasso"]]
coef(rule.DI$rule.object)
## 13 x 1 sparse Matrix of class "dgCMatrix"
##                        s1
## (Intercept)        .     
## treatment.neg.pos -0.0753
## age                .     
## gene_1             0.1232
## gene_2             .     
## gene_3             .     
## gene_4             .     
## gene_5             .     
## gene_6             .     
## gene_7             .     
## gene_8             .     
## gene_9             .     
## gene_10            .

On the Evaluation Dataset, Evaluate the Selected Rules

Since the validation dataset informed our model selection (i.e. we selected these particular three rules because they appeared best on the validation set), the estimates from the validation set itself are not trustworthy estimates of performance in independent settings. To obtain a trustworthy estimate of the rules’ performance in independent samples drawn from the same population, we turn to the EvaluateRule() function applied to the independent evaluation dataset.

First, we obtain the estimated performance of the rule using split-regression with

set.seed(123)
split.eval <- EvaluateRule(evaluation.data=evaluation.data,
                           BuildRule.object=rule.split,
                           study.design="observational",
                           name.outcome="no_relapse",
                           type.outcome="binary",
                           desirable.outcome=TRUE,
                           name.treatment="intervention",
                           names.influencing.treatment=c("prognosis", "clinic", "age"),
                           names.influencing.rule=c("age", paste0("gene_", 1:10)),
                           propensity.method="logistic.regression",
                           bootstrap.CI=FALSE)
split.eval$summaries
##                n.positives ATE.positives n.negatives ATE.negatives     ABR
## estimated.rule         685        0.1529         565       -0.0540  0.1082
## treat.all             1250        0.0663           0            NA  0.0663
## treat.none               0            NA        1250        0.0663 -0.0663

Next, the rule from the OWL framework yields the following estimates

set.seed(123)
OWL.eval <- EvaluateRule(evaluation.data=evaluation.data,
                              BuildRule.object=rule.OWL,
                              study.design="observational",
                              name.outcome="no_relapse",
                              type.outcome="binary",
                              desirable.outcome=TRUE,
                              name.treatment="intervention",
                              names.influencing.treatment=c("prognosis", "clinic", "age"),
                              names.influencing.rule=c("age", paste0("gene_", 1:10)),
                              propensity.method="logistic.regression",
                              bootstrap.CI=FALSE)
OWL.eval$summaries
##                n.positives ATE.positives n.negatives ATE.negatives     ABR
## estimated.rule         691        0.1511         559       -0.0518  0.1067
## treat.all             1250        0.0663           0            NA  0.0663
## treat.none               0            NA        1250        0.0663 -0.0663

And last, the rule from the direct-interactions approach yields the following estimates

set.seed(123)
DI.eval <- EvaluateRule(evaluation.data=evaluation.data,
                              BuildRule.object=rule.DI,
                              study.design="observational",
                              name.outcome="no_relapse",
                              type.outcome="binary",
                              desirable.outcome=TRUE,
                              name.treatment="intervention",
                              names.influencing.treatment=c("prognosis", "clinic", "age"),
                              names.influencing.rule=c("age", paste0("gene_", 1:10)),
                              propensity.method="logistic.regression",
                              bootstrap.CI=FALSE)
DI.eval$summaries
##                n.positives ATE.positives n.negatives ATE.negatives     ABR
## estimated.rule         698        0.1653         552       -0.0629  0.1201
## treat.all             1250        0.0663           0            NA  0.0663
## treat.none               0            NA        1250        0.0663 -0.0663

We could have also obtained bootstrap-based CIs for the ATE/ABR estimates (in both the validation and evaluation datasets) by specifying the following arguments to CompareRulesOnValidation() and EvaluateRule()

bootstrap.CI=TRUE
bootstrap.CI.replications=1000 # or any other number of replications

but we chose not to compute CIs in this example to minimize run-time.
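
For instance, the evaluation of the split-regression rule above could be repeated with bootstrap CIs as follows (not run here; 1000 replications is only an example):

set.seed(123)
split.eval.CI <- EvaluateRule(evaluation.data=evaluation.data,
                              BuildRule.object=rule.split,
                              study.design="observational",
                              name.outcome="no_relapse",
                              type.outcome="binary",
                              desirable.outcome=TRUE,
                              name.treatment="intervention",
                              names.influencing.treatment=c("prognosis", "clinic", "age"),
                              names.influencing.rule=c("age", paste0("gene_", 1:10)),
                              propensity.method="logistic.regression",
                              bootstrap.CI=TRUE,
                              bootstrap.CI.replications=1000)  # more replications give more stable CIs at the cost of run-time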

References

  • Yingqi Zhao, Donglin Zeng, A. John Rush, and Michael R. Kosorok (2012). Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107:499, 1106–1118.
  • Lu Tian, Ash A. Alizadeh, Andrew J. Gentles, and Robert Tibshirani (2014). A simple method for estimating interactions between a treatment and a large number of covariates. Journal of the American Statistical Association, 109:508, 1517–1532.
  • Shuai Chen, Lu Tian, Tianxi Cai, and Menggang Yu (2017). A general statistical framework for subgroup identification and comparative treatment scoring. Biometrics, 73:4, 1199–1209.
  • Jeremy Roth and Noah Simon (2019). Using propensity scores to develop and evaluate treatment rules with observational data. (Manuscript in progress).
  • Jeremy Roth and Noah Simon (2019). Elucidating outcome-weighted-learning and its comparison to split-regression: direct vs. indirect methods in practice. (Manuscript in progress).