Here, we illustrate the DevTreatRules package by building and evaluating treatment rules based on the example dataset included with the package.
library(DevTreatRules)
head(obsStudyGeneExpressions)
## no_relapse intervention prognosis clinic age gene_1 gene_2 gene_3 gene_4
## 1 1 0 Good 1 46 0.531 0.071 0.1294 0.427
## 2 0 1 Poor 3 51 0.744 0.711 1.3532 1.939
## 3 1 0 Poor 3 62 1.146 0.498 1.4707 1.456
## 4 1 0 Good 5 51 1.816 1.758 0.2226 0.688
## 5 1 1 Good 4 52 0.403 0.636 0.0933 0.288
## 6 1 1 Poor 6 51 1.797 0.642 0.2618 0.277
## gene_5 gene_6 gene_7 gene_8 gene_9 gene_10
## 1 0.421 0.402 0.831 0.0136 1.186 1.8831
## 2 0.230 0.228 0.282 1.3413 1.106 0.6606
## 3 0.291 0.543 0.915 0.4839 1.193 0.6233
## 4 0.620 1.571 1.606 1.9986 0.785 1.7684
## 5 0.300 1.276 0.642 1.9003 0.521 0.0126
## 6 1.053 0.939 1.736 0.2348 1.480 1.3908
First, we split obsStudyGeneExpressions into independent development/validation/evaluation partitions by calling the SplitData() function
set.seed(123)
example.split <- SplitData(data=obsStudyGeneExpressions, n.sets=3, split.proportions=c(0.5, 0.25, 0.25))
table(example.split$partition)
##
## development validation evaluation
## 2500 1250 1250
and then forming independent datasets based on the partition variable created above.
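One way to form these datasets is to subset on the partition labels created by SplitData() (a minimal base-R sketch; the names development.data, validation.data, and evaluation.data are used throughout the rest of this vignette):
# Form the three independent datasets from the partition labels
development.data <- example.split[example.split$partition == "development", ]
validation.data <- example.split[example.split$partition == "validation", ]
evaluation.data <- example.split[example.split$partition == "evaluation", ]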
Suppose we believe the variables prognosis, clinic, and age may have influenced treatment assignment. We would codify this knowledge in DevTreatRules by specifying the argument names.influencing.treatment=c("prognosis", "clinic", "age") in functions we will call later in this vignette.
Further suppose that we don’t expect prognosis and clinic to be measured on the same scale in independent clinical settings where we would like to apply our estimated rule (so they are not sensible rule inputs). However, we do expect the gene expression measurements in gene_1, …, gene_10 (along with age) to potentially influence treatment response and also to be reliably measured in future settings (so they are sensible rule inputs). We specify this knowledge in DevTreatRules with the argument names.influencing.rule=c("age", paste0("gene_", 1:10)).
Although we could directly estimate a single treatment rule on the development dataset with BuildRule() using, for example,
one.rule <- BuildRule(development.data=development.data,
study.design="observational",
prediction.approach="split.regression",
name.outcome="no_relapse",
type.outcome="binary",
desirable.outcome=TRUE,
name.treatment="intervention",
names.influencing.treatment=c("prognosis", "clinic", "age"),
names.influencing.rule=c("age", paste0("gene_", 1:10)),
propensity.method="logistic.regression",
rule.method="glm.regression")
this has limited utility because it required us to specify just one value for the prediction.approach argument (even if we don’t have a priori knowledge about which of the split-regression, OWL framework, and direct-interactions approaches will perform best) and to specify just one regression method for the propensity.method and rule.method arguments (even if we are not sure whether standard logistic regression or lasso/ridge logistic regression will yield a better rule).
Instead, it would be more useful to perform model selection by estimating rules for all combinations of split-regression/OWL framework/direct-interactions and standard/lasso/ridge logistic regression (e.g. by looping over calls to BuildRule(), as sketched below). This model-selection process is automated in CompareRulesOnValidation().
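For intuition, the loop that CompareRulesOnValidation() automates might look roughly like this (an illustrative sketch, not the package's internal implementation):
# Illustrative sketch: estimate one rule per approach/rule-method combination
# on the development data by looping over calls to BuildRule()
candidate.rules <- list()
for (approach in c("split.regression", "OWL.framework", "direct.interactions")) {
  for (method in c("glm.regression", "lasso", "ridge")) {
    candidate.rules[[paste(approach, method, sep="_")]] <-
      BuildRule(development.data=development.data,
                study.design="observational",
                prediction.approach=approach,
                name.outcome="no_relapse",
                type.outcome="binary",
                desirable.outcome=TRUE,
                name.treatment="intervention",
                names.influencing.treatment=c("prognosis", "clinic", "age"),
                names.influencing.rule=c("age", paste0("gene_", 1:10)),
                propensity.method="logistic.regression",
                rule.method=method)
  }
}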
Here we will perform model selection by calling CompareRulesOnValidation() with the arguments
vec.approaches=c("split.regression", "OWL.framework", "direct.interactions")
vec.rule.methods=c("glm.regression", "lasso", "ridge")
which are actually the default values of CompareRulesOnValidation(), and with vec.propensity.methods="logistic.regression" plus arguments describing the development and validation datasets.
Now we perform model selection by calling
CompareRulesOnValidation()
set.seed(123)
model.selection <- CompareRulesOnValidation(development.data=development.data,
validation.data=validation.data,
study.design.development="observational",
vec.approaches=c("split.regression", "OWL.framework", "direct.interactions"),
vec.rule.methods=c("glm.regression", "lasso", "ridge"),
vec.propensity.methods="logistic.regression",
name.outcome.development="no_relapse",
type.outcome.development="binary",
name.treatment.development="intervention",
names.influencing.treatment.development=c("prognosis", "clinic", "age"),
names.influencing.rule.development=c("age", paste0("gene_", 1:10)),
desirable.outcome.development=TRUE)
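The returned object includes the validation-set performance summaries (list.summaries) and the fitted rules (list.rules) for each approach, which are the two components we use below:
# Components of the model-selection output used below
names(model.selection$list.summaries)
names(model.selection$list.rules)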
We can compare these estimates for the nine treatment rules (three approaches crossed with three rule methods, each using logistic regression for the propensity score) to decide which rules to bring forward to the evaluation dataset. For context, the estimates for the naive "treat.all" and "treat.none" strategies are always provided by CompareRulesOnValidation().
First, for rules estimated with the split-regression approach, we obtain the estimates
model.selection$list.summaries[["split.regression"]]
## Positives Negatives
## propensity_logistic.regression_rule_glm.regression 665 585
## propensity_logistic.regression_rule_lasso 694 556
## propensity_logistic.regression_rule_ridge 762 488
## treat.all 1250 0
## treat.none 0 1250
## ATE in Positives
## propensity_logistic.regression_rule_glm.regression 0.11919
## propensity_logistic.regression_rule_lasso 0.11078
## propensity_logistic.regression_rule_ridge 0.08535
## treat.all -0.00375
## treat.none NA
## ATE in Negatives ABR
## propensity_logistic.regression_rule_glm.regression -0.11361 0.11658
## propensity_logistic.regression_rule_lasso -0.14051 0.12400
## propensity_logistic.regression_rule_ridge -0.12102 0.09927
## treat.all NA -0.00375
## treat.none -0.00375 0.00375
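In these summaries, "Positives" and "Negatives" count the subjects the rule does and does not recommend for treatment, "ATE" is the estimated average treatment effect within each group, and "ABR" (the average benefit with rule) is the sample-size-weighted average of the ATE among positives and the negated ATE among negatives, consistent with the displayed values:
# Check: reconstruct the ABR of the first rule above from its components
(665 * 0.11919 + 585 * 0.11361) / 1250 # approximately 0.11658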
Next, for the OWL framework we have
model.selection$list.summaries[["OWL.framework"]]
## Positives Negatives
## propensity_logistic.regression_rule_glm.regression 673 577
## propensity_logistic.regression_rule_lasso 1250 0
## propensity_logistic.regression_rule_ridge 1250 0
## treat.all 1250 0
## treat.none 0 1250
## ATE in Positives
## propensity_logistic.regression_rule_glm.regression 0.12056
## propensity_logistic.regression_rule_lasso -0.00375
## propensity_logistic.regression_rule_ridge -0.00375
## treat.all -0.00375
## treat.none NA
## ATE in Negatives ABR
## propensity_logistic.regression_rule_glm.regression -0.12967 0.12477
## propensity_logistic.regression_rule_lasso NA -0.00375
## propensity_logistic.regression_rule_ridge NA -0.00375
## treat.all NA -0.00375
## treat.none -0.00375 0.00375
and, last, for the direct-interactions approach
model.selection$list.summaries[["direct.interactions"]]
## Positives Negatives
## propensity_logistic.regression_rule_glm.regression 673 577
## propensity_logistic.regression_rule_lasso 697 553
## propensity_logistic.regression_rule_ridge 1250 0
## treat.all 1250 0
## treat.none 0 1250
## ATE in Positives
## propensity_logistic.regression_rule_glm.regression 0.12457
## propensity_logistic.regression_rule_lasso 0.08807
## propensity_logistic.regression_rule_ridge -0.00375
## treat.all -0.00375
## treat.none NA
## ATE in Negatives ABR
## propensity_logistic.regression_rule_glm.regression -0.12672 0.12556
## propensity_logistic.regression_rule_lasso -0.10951 0.09755
## propensity_logistic.regression_rule_ridge NA -0.00375
## treat.all NA -0.00375
## treat.none -0.00375 0.00375
Based on the above estimates in the validation set, we decide to select three rules to bring forward to the evaluation set: (1) split-regression with logistic/logistic as the propensity/rule methods,
model.selection$list.summaries$split.regression["propensity_logistic.regression_rule_glm.regression", ]
## Positives Negatives ATE in Positives ATE in Negatives
## 665.000 585.000 0.119 -0.114
## ABR
## 0.117
along with (2) OWL framework with logistic/logistic as the propensity/rule methods
model.selection$list.summaries$OWL.framework["propensity_logistic.regression_rule_glm.regression", ]
## Positives Negatives ATE in Positives ATE in Negatives
## 673.000 577.000 0.121 -0.130
## ABR
## 0.125
and (3) direct-interactions with logistic/lasso as the propensity/rule methods.
model.selection$list.summaries$direct.interactions["propensity_logistic.regression_rule_lasso", ]
## Positives Negatives ATE in Positives ATE in Negatives
## 697.0000 553.0000 0.0881 -0.1095
## ABR
## 0.0976
We can also see the underlying coefficient estimates for these rules with
rule.split <- model.selection$list.rules$split.regression[["propensity_logistic.regression_rule_glm.regression"]]
coef(rule.split$rule.object.control)
## (Intercept) age gene_1 gene_2 gene_3 gene_4
## 0.81702 0.00354 -0.53828 -0.09668 0.02675 -0.06418
## gene_5 gene_6 gene_7 gene_8 gene_9 gene_10
## -0.24930 -0.02657 -0.20052 -0.02206 -0.06508 0.35033
coef(rule.split$rule.object.treat)
## (Intercept) age gene_1 gene_2 gene_3 gene_4
## -0.26512 -0.00177 0.47034 -0.02004 -0.12251 -0.14514
## gene_5 gene_6 gene_7 gene_8 gene_9 gene_10
## 0.14135 -0.04188 -0.03558 0.08977 0.11137 0.12449
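These two fits drive the split-regression recommendation: treat exactly when the predicted probability of no_relapse is higher under treatment than under control. A sketch of applying the rule manually (assuming rule.object.treat and rule.object.control are ordinary glm fits, as the coefficient output above suggests):
# Manually apply the split-regression rule to the evaluation data (sketch)
p.treat <- predict(rule.split$rule.object.treat, newdata=evaluation.data, type="response")
p.control <- predict(rule.split$rule.object.control, newdata=evaluation.data, type="response")
table(p.treat > p.control) # TRUE = rule recommends treatment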
rule.OWL <- model.selection$list.rules$OWL.framework[["propensity_logistic.regression_rule_glm.regression"]]
coef(rule.OWL$rule.object)
## (Intercept) age gene_1 gene_2 gene_3 gene_4
## -0.50119 -0.00340 0.46473 0.05083 -0.04349 -0.03906
## gene_5 gene_6 gene_7 gene_8 gene_9 gene_10
## 0.15631 0.00839 0.09120 0.05875 0.08967 -0.10412
rule.DI <- model.selection$list.rules$direct.interactions[["propensity_logistic.regression_rule_lasso"]]
coef(rule.DI$rule.object)
## 13 x 1 sparse Matrix of class "dgCMatrix"
## s1
## (Intercept) .
## treatment.neg.pos -0.0753
## age .
## gene_1 0.1232
## gene_2 .
## gene_3 .
## gene_4 .
## gene_5 .
## gene_6 .
## gene_7 .
## gene_8 .
## gene_9 .
## gene_10 .
Since the validation dataset informed our model selection (i.e. we selected these particular three rules because they appeared best on the validation set), the estimates from the validation set itself are not trustworthy estimates of performance in independent settings. To obtain a trustworthy estimate of the rules’ performance in independent samples drawn from the same population, we turn to the EvaluateRule() function applied to the independent evaluation dataset.
First, we obtain the estimated performance of the rule using split-regression with
set.seed(123)
split.eval <- EvaluateRule(evaluation.data=evaluation.data,
BuildRule.object=rule.split,
study.design="observational",
name.outcome="no_relapse",
type.outcome="binary",
desirable.outcome=TRUE,
name.treatment="intervention",
names.influencing.treatment=c("prognosis", "clinic", "age"),
names.influencing.rule=c("age", paste0("gene_", 1:10)),
propensity.method="logistic.regression",
bootstrap.CI=FALSE)
split.eval$summaries
## n.positives ATE.positives n.negatives ATE.negatives ABR
## estimated.rule 685 0.1529 565 -0.0540 0.1082
## treat.all 1250 0.0663 0 NA 0.0663
## treat.none 0 NA 1250 0.0663 -0.0663
Next, the rule from the OWL framework yields the following estimates
set.seed(123)
OWL.eval <- EvaluateRule(evaluation.data=evaluation.data,
BuildRule.object=rule.OWL,
study.design="observational",
name.outcome="no_relapse",
type.outcome="binary",
desirable.outcome=TRUE,
name.treatment="intervention",
names.influencing.treatment=c("prognosis", "clinic", "age"),
names.influencing.rule=c("age", paste0("gene_", 1:10)),
propensity.method="logistic.regression",
bootstrap.CI=FALSE)
OWL.eval$summaries
## n.positives ATE.positives n.negatives ATE.negatives ABR
## estimated.rule 691 0.1511 559 -0.0518 0.1067
## treat.all 1250 0.0663 0 NA 0.0663
## treat.none 0 NA 1250 0.0663 -0.0663
And last, the rule from the direct-interactions approach yields the following estimates
set.seed(123)
DI.eval <- EvaluateRule(evaluation.data=evaluation.data,
BuildRule.object=rule.DI,
study.design="observational",
name.outcome="no_relapse",
type.outcome="binary",
desirable.outcome=TRUE,
name.treatment="intervention",
names.influencing.treatment=c("prognosis", "clinic", "age"),
names.influencing.rule=c("age", paste0("gene_", 1:10)),
propensity.method="logistic.regression",
bootstrap.CI=FALSE)
DI.eval$summaries
## n.positives ATE.positives n.negatives ATE.negatives ABR
## estimated.rule 698 0.1653 552 -0.0629 0.1201
## treat.all 1250 0.0663 0 NA 0.0663
## treat.none 0 NA 1250 0.0663 -0.0663
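For a side-by-side comparison of the three selected rules on the evaluation set, we can stack the estimated-rule rows of the three summaries:
# Stack the estimated-rule row from each evaluation summary
rbind(split.regression=split.eval$summaries["estimated.rule", ],
      OWL.framework=OWL.eval$summaries["estimated.rule", ],
      direct.interactions=DI.eval$summaries["estimated.rule", ])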
We could have also obtained bootstrap-based CIs for the ATE/ABR estimates (in both the validation and evaluation datasets) by specifying bootstrap.CI=TRUE in CompareRulesOnValidation() and EvaluateRule(), but we chose not to compute CIs in this example to minimize run-time.
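For example, re-running the split-regression evaluation with CIs would look like the following (here assuming the bootstrap.CI.replications argument controls the number of bootstrap replications; see the EvaluateRule() documentation):
set.seed(123)
# Same call as before, but requesting bootstrap CIs (slower to run)
split.eval.CI <- EvaluateRule(evaluation.data=evaluation.data,
                              BuildRule.object=rule.split,
                              study.design="observational",
                              name.outcome="no_relapse",
                              type.outcome="binary",
                              desirable.outcome=TRUE,
                              name.treatment="intervention",
                              names.influencing.treatment=c("prognosis", "clinic", "age"),
                              names.influencing.rule=c("age", paste0("gene_", 1:10)),
                              propensity.method="logistic.regression",
                              bootstrap.CI=TRUE,
                              bootstrap.CI.replications=1000)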