Title: | Generalized Propensity Score Cumulative Distribution Function |
---|---|
Description: | Implements the generalized propensity score cumulative distribution function proposed by Greene (2017) <https://digitalcommons.library.tmc.edu/dissertations/AAI10681743/>. A single scalar balancing score is calculated for any generalized propensity score vector with three or more treatments. This balancing score is used for propensity score matching and stratification in outcome analyses when analyzing either ordinal or multinomial treatments. |
Authors: | Derek W. Brown [aut, cre], Thomas J. Greene [aut], Stacia M. DeSantis [aut] |
Maintainer: | Derek W. Brown <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.1 |
Built: | 2024-12-15 07:41:30 UTC |
Source: | CRAN |
GPSCDF takes in a generalized propensity score (GPS) object with length >2 and returns the GPS-CDF balancing score.
GPSCDF(pscores = NULL, data = NULL, trt = NULL, stratify = FALSE, nstrat = 5, optimal = FALSE, greedy = FALSE, ordinal = FALSE, multinomial = FALSE, caliper = NULL)
pscores |
The object containing the treatment ordered generalized propensity scores for each subject. |
data |
An optional data frame to attach the calculated balancing score. The data frame will also be used in stratification and matching. |
trt |
An optional object containing the treatment variable. |
stratify |
Option to produce strata based on the power parameter
( |
nstrat |
An optional parameter for the number of strata to be created
when |
optimal |
Option to perform optimal matching of subjects based on the
power parameter ( |
greedy |
Option to perform greedy matching of subjects based on the power
parameter ( |
ordinal |
Specifies ordinal treatment groups for matching. Subjects are
matched based on the ratio of the squared difference of power parameters for
two subjects, |
multinomial |
Specifies multinomial treatment groups for matching.
Subjects are matched based on the absolute difference of power parameters
for two subjects, |
caliper |
An optional parameter for the caliper value used when
performing greedy matching. Used when |
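To make the matching arguments above more concrete, here is a minimal base-R sketch of greedy 1:1 pairing on a scalar score with a caliper. It uses the absolute difference of scores as the distance, as described for the multinomial setting. The helper `greedy_pairs`, its arguments, and the toy data are hypothetical and are not part of the package; the package's own matching routine may work differently.

```r
# Hypothetical helper (not part of GPSCDF): greedy 1:1 pairing of two groups
# on the absolute difference of a scalar score, subject to a caliper.
greedy_pairs <- function(score, group, caliper) {
  a <- which(group == 1)          # subjects to be matched, in order
  b <- which(group == 2)          # pool of candidate matches
  pairs <- matrix(integer(0), ncol = 2, dimnames = list(NULL, c("a", "b")))
  for (i in a) {
    if (length(b) == 0) break     # no candidates left
    d <- abs(score[i] - score[b]) # absolute difference of scores
    k <- which.min(d)             # closest remaining candidate
    if (d[k] <= caliper) {        # accept only pairs within the caliper
      pairs <- rbind(pairs, c(i, b[k]))
      b <- b[-k]                  # each candidate is used at most once
    }
  }
  pairs
}

# Toy scores: subjects 1-3 are in group 1, subjects 4-6 in group 2.
score <- c(0.10, 0.50, 0.90, 0.12, 0.55, 2.00)
group <- c(1, 1, 1, 2, 2, 2)
matched <- greedy_pairs(score, group, caliper = 0.1)
matched
```

Greedy matching accepts the closest available candidate for each subject in turn, so the result can depend on subject order. Optimal matching instead minimizes the total distance over all pairs.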
The GPSCDF method is used to conduct propensity score matching and stratification for both ordinal and multinomial treatments. The method directly maps any GPS vector (with length >2) to a single scalar value that can be used to produce either average treatment effect (ATE) or average treatment effect among the treated (ATT) estimates. In the setting of K multinomial treatments, the balance achieved under each of the K! orderings of the GPS vector should be assessed to find the optimal ordering (see Examples for more details).
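The K! orderings mentioned above can be enumerated programmatically rather than written out by hand. The following base-R sketch builds every column ordering of a toy GPS matrix and takes row-wise cumulative sums to show how the ordering changes the cumulative distribution of each GPS vector. The helper `permutations` and the toy matrix are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of GPSCDF): all permutations of a vector.
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Toy GPS matrix: each row holds one subject's treatment probabilities.
gps <- rbind(c(0.2, 0.5, 0.3),
             c(0.6, 0.1, 0.3))

# All K! = 6 column orderings for K = 3 treatments.
orders <- permutations(seq_len(ncol(gps)))
gps_orderings <- lapply(orders, function(o) gps[, o, drop = FALSE])

# Row-wise cumulative sums: the CDF of each GPS vector under each ordering.
cdfs <- lapply(gps_orderings, function(g) t(apply(g, 1, cumsum)))
```

Each element of `gps_orderings` could then be supplied as `pscores`, and the resulting balance compared across orderings.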
ppar |
The power parameter scalar balancing score to be used in outcome analyses through stratification or matching. |
data |
The user defined dataset with power parameter (ppar), strata, and/or optimal matching variables attached. |
nstrat |
The number of strata used for stratification. |
strata |
The strata produced based on the calculated
power parameter ( |
optmatch |
The optimal matches produced
based on the calculated power parameter ( |
optdistance |
The average absolute total distance of power parameters
( |
caliper |
The caliper value used for greedy matching. |
grddata |
The user defined dataset with greedy matching variable attached. |
grdmatch |
The greedy matches
produced based on the calculated power parameter ( |
grdydistance |
The average absolute total distance of power parameters
( |
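As an illustration of how strata can be formed from a scalar score, the sketch below cuts a score at its quantiles. This is one common approach. That the package forms its strata this way is an assumption here, and `make_strata` and the simulated score are hypothetical stand-ins.

```r
# Assumption: strata are formed by cutting the score at its quantiles.
# make_strata is a hypothetical helper, not a GPSCDF function.
make_strata <- function(score, nstrat = 5) {
  breaks <- quantile(score, probs = seq(0, 1, length.out = nstrat + 1))
  cut(score, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

set.seed(1)
ppar <- rnorm(100)                     # stand-in for a power-parameter score
strata <- make_strata(ppar, nstrat = 5)
table(strata)                          # number of subjects per stratum
```

With a continuous score, quantile cuts give strata of roughly equal size, which keeps within-stratum comparisons from resting on very few subjects.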
Derek W. Brown, Thomas J. Greene, Stacia M. DeSantis
Greene, TJ. (2017). Utilizing Propensity Score Methods for Ordinal Treatments and Prehospital Trauma Studies. Texas Medical Center Dissertations (via ProQuest).
### Example: Create data example

N<- 100

set.seed(18201) # make sure data is repeatable
Sigma <- matrix(.2,4,4)
diag(Sigma) <- 1
data<-matrix(0, nrow=N, ncol=6,dimnames=list(c(1:N),
  c("Y","trt",paste("X",c(1:4),sep=""))))
data[,3:6]<-matrix(MASS::mvrnorm(N, mu=rep(0, 4), Sigma,
  empirical = FALSE) , nrow=N, ncol = 4)

dat<-as.data.frame(data)

#Create Treatment Variable
tlogits<-matrix(0,nrow=N,ncol=2)
tprobs<-matrix(0,nrow=N,ncol=3)

alphas<-c(0.25, 0.3)
strongbetas<-c(0.7, 0.4)
modbetas<-c(0.2, 0.3)

for(j in 1:2){
  tlogits[,j]<- alphas[j] + strongbetas[j]*dat$X1 + strongbetas[j]*dat$X2+
    modbetas[j]*dat$X3 + modbetas[j]*dat$X4
}

for(j in 1:2){
  tprobs[,j]<- exp(tlogits[,j])/(1 + exp(tlogits[,1]) + exp(tlogits[,2]))
  tprobs[,3]<- 1/(1 + exp(tlogits[,1]) + exp(tlogits[,2]))
}

set.seed(91187)
for(j in 1:N){
  data[j,2]<-sample(c(1:3),size=1,prob=tprobs[j,])
}

#Create Outcome Variable
ylogits<-matrix(0,nrow=N,ncol=1,dimnames=list(c(1:N),c("Logit(P(Y=1))")))
yprobs<-matrix(0,nrow=N,ncol=2,dimnames=list(c(1:N),c("P(Y=0)","P(Y=1)")))

for(j in 1:N){
  ylogits[j,1]<- -1.1 + 0.7*data[j,2] + 0.6*dat$X1[j] + 0.6*dat$X2[j] +
    0.4*dat$X3[j] + 0.4*dat$X4[j]
  yprobs[j,2]<- 1/(1+exp(-ylogits[j,1]))
  yprobs[j,1]<- 1-yprobs[j,2]
}

set.seed(91187)
for(j in 1:N){
  data[j,1]<-sample(c(0,1),size=1,prob=yprobs[j,])
}

dat<-as.data.frame(data)

### Example: Using GPSCDF

#Create the generalized propensity score (GPS) vector using any parametric or
#nonparametric model

glm<- nnet::multinom(as.factor(trt)~ X1+ X2+ X3+ X4, data=dat)
probab<- round(predict(glm, newdata=dat, type="probs"),digits=8)
gps<-cbind(probab[,1],probab[,2],1-probab[,1]-probab[,2])

#Create scalar balancing power parameter
fit<-GPSCDF(pscores=gps)

## Not run:
  fit$ppar

## End(Not run)

#Attach scalar balancing power parameter to user defined data set
fit2<-GPSCDF(pscores=gps, data=dat)

## Not run:
  fit2$ppar
  fit2$data

## End(Not run)

### Example: Ordinal Treatment

#Stratification
fit3<-GPSCDF(pscores=gps, data=dat, stratify=TRUE, nstrat=5)

## Not run:
  fit3$ppar
  fit3$data
  fit3$nstrat
  fit3$strata

  library(survival)
  model1<-survival::clogit(Y~as.factor(trt)+X1+X2+X3+X4+strata(strata),
    data=fit3$data)
  summary(model1)

## End(Not run)

#Optimal Matching
fit4<- GPSCDF(pscores=gps, data=dat, trt=dat$trt, optimal=TRUE, ordinal=TRUE)

## Not run:
  fit4$ppar
  fit4$data
  fit4$optmatch
  fit4$optdistance

  library(survival)
  model2<-survival::clogit(Y~as.factor(trt)+X1+X2+X3+X4+strata(optmatch),
    data=fit4$data)
  summary(model2)

## End(Not run)

#Greedy Matching
fit5<- GPSCDF(pscores=gps, data=dat, trt=dat$trt, greedy=TRUE, ordinal=TRUE)

## Not run:
  fit5$ppar
  fit5$data
  fit5$caliper
  fit5$grddata
  fit5$grdmatch
  fit5$grdydistance

  library(survival)
  model3<-survival::clogit(Y~as.factor(trt)+X1+X2+X3+X4+strata(grdmatch),
    data=fit5$grddata)
  summary(model3)

## End(Not run)

### Example: Multinomial Treatment

#Create all K! orderings of the GPS vector
gps1<-cbind(gps[,1],gps[,2],gps[,3])
gps2<-cbind(gps[,1],gps[,3],gps[,2])
gps3<-cbind(gps[,2],gps[,1],gps[,3])
gps4<-cbind(gps[,2],gps[,3],gps[,1])
gps5<-cbind(gps[,3],gps[,1],gps[,2])
gps6<-cbind(gps[,3],gps[,2],gps[,1])

gpsarry<-array(c(gps1, gps2, gps3, gps4, gps5, gps6), dim=c(N,3,6))

#Create scalar balancing power parameters for each ordering of the GPS vector
fit6<- matrix(0,nrow=N,ncol=6,dimnames=list(c(1:N),c("ppar1","ppar2","ppar3",
  "ppar4","ppar5","ppar6")))

## Not run:
  for(i in 1:6){
    fit6[,i]<-GPSCDF(pscores=gpsarry[,,i])$ppar
  }

  fit6

  #Perform analyses (similar to ordinal examples) using each K! ordering of the
  #GPS vector. Select ordering which achieves optimal covariate balance
  #(i.e. minimal standardized mean difference).

## End(Not run)
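The last example recommends selecting the ordering with minimal standardized mean difference but does not show the calculation. Below is a minimal base-R sketch of one common definition, the difference in group means divided by the pooled standard deviation, for a single covariate and two groups. The helper `smd` and the toy data are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of GPSCDF): standardized mean difference of
# one covariate between two groups, using the pooled standard deviation.
smd <- function(x, group) {
  g <- unique(group)
  stopifnot(length(g) == 2)
  x1 <- x[group == g[1]]
  x2 <- x[group == g[2]]
  (mean(x1) - mean(x2)) / sqrt((var(x1) + var(x2)) / 2)
}

# Toy covariate: group means are 2 and 3, each with variance 1.
x <- c(1, 2, 3, 2, 3, 4)
group <- c(1, 1, 1, 2, 2, 2)
smd_value <- smd(x, group)
smd_value
```

With three or more treatments, the same quantity could be computed for each pair of treatment groups within the matched or stratified data, and the ordering with the smallest absolute values preferred.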