Title: | Threshold Regression |
---|---|
Description: | Fit a threshold regression model based on the first-hitting-time of a boundary by the sample path of a Wiener diffusion process. The threshold regression methodology is well suited to applications involving survival and time-to-event data. |
Authors: | Tao Xiao |
Maintainer: | Tao Xiao <[email protected]> |
License: | GPL-2 |
Version: | 1.0.3 |
Built: | 2024-12-22 06:38:27 UTC |
Source: | CRAN |
Survival of 137 acute leukemia patients treated with bone marrow transplants which are a standard treatment for acute leukemia.
bmt
bmt
time: | time to relapse, death or end of study (in days) |
indicator: | censoring indicator variable - 1 = dead or replapse, 0 = otherwise |
recipient_age: | patient age |
group: | risk categories based on their status at the time of transplantation |
fab: | French-American-British (FAB) classification based on standard morphological criteria |
Klein JP, Moeschberger ML. Survival analysis: techniques for censored and truncated data. 2 edition, 2003 Springer-Verlag New York, Inc.
This function can be used to estimate hazard ratios at selected time points for specified scenarios (based on given categories or value settings of covariates).
hr(object,var,timevalue,scenario) ## S3 method for class 'threg' hr(object,var,timevalue,scenario)
hr(object,var,timevalue,scenario) ## S3 method for class 'threg' hr(object,var,timevalue,scenario)
object |
a threg object. |
var |
specifies the categorical variable for the calculation of hazard ratios. Such categorical variable must be a factor variable that has been used in threg() that returns the threg object. |
timevalue |
specifies a value of time at which the hazard ratios are calculated. A vector is allowed. |
scenario |
specifies a scenario where the hazard ratios are calculated. |
#load the data "lkr" data("lkr") #Transform the "treatment2" variable into factor variable "f.treatment2" . lkr$f.treatment2=factor(lkr$treatment2) #fit the threshold regression model on the factor variable "f.treatment2", fit<-threg(Surv(weeks, relapse)~ f.treatment2|f.treatment2,data = lkr) fit #calculate the hazard ratio of the drug B group v.s. the standard group at #week 5 (this hazard ratio is calculated as 2.08) hr.threg(fit,var=f.treatment2,timevalue=5) #calculate the hazard ratio of the drug B group v.s. the standard group at #week 20 (this hazard ratio is calculated as 0.12) hr.threg(fit,var=f.treatment2,timevalue=20) #As a comparison, fit the Cox proportion hazards model on "f.treatment2", #and the Cox model gives a constant hazard ratio, 0.73. summary(coxph(Surv(weeks, relapse) ~ f.treatment2, data = lkr)) #load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) fit #Calculate the hazard ratio for #"f.group" for the specified scenario that "the patient age is 18 years old and #the FAB classification is 0", at the time ``500 days''. hr.threg(fit,var=f.group,timevalue=500,scenario=recipient_age(18)+f.fab1(0))
#load the data "lkr" data("lkr") #Transform the "treatment2" variable into factor variable "f.treatment2" . lkr$f.treatment2=factor(lkr$treatment2) #fit the threshold regression model on the factor variable "f.treatment2", fit<-threg(Surv(weeks, relapse)~ f.treatment2|f.treatment2,data = lkr) fit #calculate the hazard ratio of the drug B group v.s. the standard group at #week 5 (this hazard ratio is calculated as 2.08) hr.threg(fit,var=f.treatment2,timevalue=5) #calculate the hazard ratio of the drug B group v.s. the standard group at #week 20 (this hazard ratio is calculated as 0.12) hr.threg(fit,var=f.treatment2,timevalue=20) #As a comparison, fit the Cox proportion hazards model on "f.treatment2", #and the Cox model gives a constant hazard ratio, 0.73. summary(coxph(Surv(weeks, relapse) ~ f.treatment2, data = lkr)) #load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) fit #Calculate the hazard ratio for #"f.group" for the specified scenario that "the patient age is 18 years old and #the FAB classification is 0", at the time ``500 days''. hr.threg(fit,var=f.group,timevalue=500,scenario=recipient_age(18)+f.fab1(0))
Data of a leukemia remission study. There are 42 patients in the dataset, who were monitored for whether they relapsed and for how long (in weeks) they remained in remission.
lkr
lkr
weeks: | the time that a patient was remained in remission (in weeks) |
relapse: | whether the patient was relapsed - 1 = yes, 0 = no |
treatment1: | 1=drug A, 0 = standard drug for treatment1 |
treatment2: | 1=drug B, 0 = standard drug for treatment2 |
wbc3cat: | white blood cell count, recorded in three categories - 1 = normal, 2= moderate, 3 = high |
Garrett JM. sbe14: Odds Ratios and Confidence Intervals for Logistic Regression Models with Effect Modification. Stata Technical Bulletin, 36, 15-22. Reprinted in Stata Technical Bulletin Reprints, vol. 6, pp. 104-114, 1997 College Station, TX: Stata Press
This function can be used to display the graphs of the estimated survival, hazard or density functions at different levels of a factor predictor variable which has been included in the threshold regression by threg() function.
## S3 method for class 'threg' plot(x,var,scenario,graph,nolegend=0,nocolor=0,...)
## S3 method for class 'threg' plot(x,var,scenario,graph,nolegend=0,nocolor=0,...)
x |
a threg object. |
var |
specifies the categorical variable for each level of which the curves are plotted. Such categorical variable must be a factor variable that has been used in threg() that returns the threg object. |
scenario |
specifies a scenario where the predicted plots are based on. |
graph |
specifies the type of curves to be generated. The “hz” option is to plot hazard function curves, the “sv” option is to plot survival function curves, and the “ds” option is to plot density function curves. |
nolegend |
set the “nolegend” argument to 1 if users do not want the “threg” package to generate legends for the picture. Note that even if “nolegend” is set to 1, users can still generate legends by themselves after the picture is generated, by using the “legend” function in R. |
nocolor |
set the “nocolor” argument to 1 if users want to depict all curves in black. |
... |
for future methods |
#load the data "lkr" data("lkr") #Transform the "treatment2" variable into factor variable "f.treatment2" . lkr$f.treatment2=factor(lkr$treatment2) #fit the threshold regression model on the factor variable "f.treatment2", fit<-threg(Surv(weeks, relapse)~ f.treatment2|f.treatment2,data = lkr) fit #generate the predicted survival curves for the drug B group and #the standard group. plot(fit,var=f.treatment2,graph=sv,nolegend=1,nocolor=1) legend(20, 1, c("Standard","Drug B"), lty = 1:2) #load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) fit #fit the same model as above, but additionally overlay curves of survival functions #corresponding to different levels of "f.group'. plot.threg(fit,var=f.group,scenario=recipient_age(18)+f.fab1(0),graph=sv,nocolor=1) #fit the same model as above, but additionally overlay curves of hazard functions #corresponding to different levels of "f.group'. plot.threg(fit,var=f.group,scenario=recipient_age(18)+f.fab1(0),graph=hz,nocolor=1) #fit the same model as above, but additionally overlay curves of probability density #functions corresponding to different levels of "f.group'. plot.threg(fit,var=f.group,scenario=recipient_age(18)+f.fab1(0),graph=ds,nocolor=1)
#load the data "lkr" data("lkr") #Transform the "treatment2" variable into factor variable "f.treatment2" . lkr$f.treatment2=factor(lkr$treatment2) #fit the threshold regression model on the factor variable "f.treatment2", fit<-threg(Surv(weeks, relapse)~ f.treatment2|f.treatment2,data = lkr) fit #generate the predicted survival curves for the drug B group and #the standard group. plot(fit,var=f.treatment2,graph=sv,nolegend=1,nocolor=1) legend(20, 1, c("Standard","Drug B"), lty = 1:2) #load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) fit #fit the same model as above, but additionally overlay curves of survival functions #corresponding to different levels of "f.group'. plot.threg(fit,var=f.group,scenario=recipient_age(18)+f.fab1(0),graph=sv,nocolor=1) #fit the same model as above, but additionally overlay curves of hazard functions #corresponding to different levels of "f.group'. plot.threg(fit,var=f.group,scenario=recipient_age(18)+f.fab1(0),graph=hz,nocolor=1) #fit the same model as above, but additionally overlay curves of probability density #functions corresponding to different levels of "f.group'. plot.threg(fit,var=f.group,scenario=recipient_age(18)+f.fab1(0),graph=ds,nocolor=1)
This function can be used to predict the initial health status value , the drift value of the health process
, the probability density function of the survival time
, the survival function
and the hazard function
for a specified scenario and time value. The scenario specified here is similar to those in hr.threg and plot.threg functions. The only difference is that we need to provide the scenario values for all variables in the model, while for hr.threg and plot.threg functions we do not need to provide scenario values of the dummy variables expanded from the factor variable specified in the var argument.
## S3 method for class 'threg' predict(object,timevalue,scenario,...)
## S3 method for class 'threg' predict(object,timevalue,scenario,...)
object |
a threg object. |
timevalue |
specifies the desired time value at which the predicted values are to be calculated. Vector is allowed for this argument. If this argument is omitted, then the predicted values for the study time of all subjects would be calculated. |
scenario |
specifies the values of all predictors. If this argument is omitted, then the predicted values at a specified time value for all subjects would be calculated, and in this case the covariate values for each subject are used as their corresponding scenario values. |
... |
for future methods |
#load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) #predict lny0, y0, mu, f, S and h for the specified scenario that "the patient age is #18 years old, the FAB classification is 0 and the risk category is 3", at the #time ``2000 days'' predict.threg(fit,timevalue=2000,scenario=recipient_age(18)+f.fab1(0)+f.group2(0)+f.group3(1))
#load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) #predict lny0, y0, mu, f, S and h for the specified scenario that "the patient age is #18 years old, the FAB classification is 0 and the risk category is 3", at the #time ``2000 days'' predict.threg(fit,timevalue=2000,scenario=recipient_age(18)+f.fab1(0)+f.group2(0)+f.group3(1))
Produces a printed summary of a fitted threg model
## S3 method for class 'threg' print(x, digits=max(options()$digits - 4, 3), ...)
## S3 method for class 'threg' print(x, digits=max(options()$digits - 4, 3), ...)
x |
the result of a call to |
digits |
significant digits to print |
... |
For future methods |
This function can be used to fit a threshold regressio model based on the first-hitting-time of a boundary by the sample path of a Wiener diffusion process. It uses maximum likelihood estimation method for calculating regression coefficient estimates, asymptotic standard errors and p-values.
threg(formula, data)
threg(formula, data)
formula |
a formula object, with the response on the left of a ~ operator, and the independent variables on the right. The response must be a survival object as returned by the Surv function. On the right of the ~ operator, a | operator must be used: on the left of the | operator, users specify independent variables that will be used in the linear regression function for |
data |
input dataset. Such dataset must be a survival dataset including at least the survival time variable and censoring variable. For the censoring variable, 1 should be used to indicate the subjects with failure observed, and 0 should be used to indicate the subjects that are right censored. The dataset can also include other independent variables that will be used in the threshold regression model. |
Threshold regression is a recently developed regression methodology to analyze time to event data. For a review of this regression model, see Lee and Whitmore (2006, 2010). A unique feature of threshold regression is that the event time is considered as the time when an underlying stochastic process first hits a boundary threshold. In the context of survival data, for example, the event can be death. The death time of an individual is considered as the time when his/her latent health status decreases to the zero boundary.
In the threg
package, a Wiener process is used to model the latent health status process. An event is observed when
reaches
for the first time. Three parameters of the Wiener process are involved:
,
and
. Parameter
, called the drift of the Wiener process, is the rate per unit time at which the level of the sample path is changing. The sample path approaches the threshold if
. Parameter
is the initial value of the process and is taken as positive. Parameter
represents the variability per unit time of the process (Lee and Whitmore 2006). The first hitting time (FHT) of a Wiener process with
,
and
is an inverse Gaussian distribution with probability density function (p.d.f):
where . The p.d.f. is proper if
. The cumulative distribution function of the FHT is:
where is the cumulative distribution function of the standard normal distribution. Note that if
, the Wiener process may never hit the boundary at zero and hence there is a probability that the FHT is
, that is,
.
Since the health status process is usually latent (i.e., unobserved), an arbitrary unit can be used to measure such a process. Hence the variance parameter of the process is set to 1 in the
threg
package to fix the measurement unit of the process. Then we can regress the other two process parameters, and
on covariate data. We assume that
and
are linear in regression coefficients.
Suppose that the covariate vector is , where
are covariates and the leading 1 in
allows for a constant term in the regression relationship. Then
can be linked to the covariates with the following regression form:
and can be linked to the covariates with the following regression form:
Vectors and
are regression coefficients for
and
, respectively, with
and
. Note that researchers can set some elements in
or
to zero if they feel the corresponding covariates are not important in predicting
or
. For example, if covariate
in the vector
is considered not important to predict
, we can remove the
term by setting
to zero.
Tao Xiao
Maintainer: Tao Xiao <[email protected]>
Lee, M-L. T., Whitmore, G. A. (2010) Proportional hazards and threshold regression: their theoretical and practical connections., Lifetime Data Analysis 16, 2: 196-214.
Lee, M-L. T., Whitmore, G. A. (2006) Threshold regression for survival analysis: modeling event times by a stochastic process, Statistical Science 21: 501-513.
Klein, J. P., Moeschberger, M. L. (2003) Survival Analysis: Techniques for Censored and Truncated Data. 2 edition. Springer-Verlag New York, Inc.
Garrett, J. M. (1997) Odds ratios and confidence intervals for logistic regression models with effect modification. Stata Technical Bulletin, 36, 1522. Reprinted in Stata Technical Bulletin Reprints, vol. 6, pp. 104-114. College Station, TX: Stata Press.
Xiao, T., Whitmore, G. A., He, Xin, Lee, M-L. T. (2015) The R Package threg to Implement Threshold Regression Models. Journal of Statistical Software, 66(8), 1-16. URL http://www.jstatsoft.org/v66/i08/.
#load the data "lkr" data("lkr") #Transform the "treatment2" variable into factor variable "f.treatment2" . lkr$f.treatment2=factor(lkr$treatment2) #fit the threshold regression model on the factor variable "f.treatment2", fit<-threg(Surv(weeks, relapse)~ f.treatment2|f.treatment2,data = lkr) fit #load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) fit
#load the data "lkr" data("lkr") #Transform the "treatment2" variable into factor variable "f.treatment2" . lkr$f.treatment2=factor(lkr$treatment2) #fit the threshold regression model on the factor variable "f.treatment2", fit<-threg(Surv(weeks, relapse)~ f.treatment2|f.treatment2,data = lkr) fit #load the data "bmt" data("bmt") #Transform the "group" and "fab" variables into factor variables #"f.group" and "f.fab". bmt$f.group=factor(bmt$group) bmt$f.fab=factor(bmt$fab) #fit a threshold regression model on the "bmt" dataset, by using "recipient_age" and #"f.fab" as the predictors for ln(y0), and "f.group" and "f.fab" as predictors for mu. fit<-threg(Surv(time, indicator)~ recipient_age+f.fab|f.group+f.fab, data = bmt) fit