Package 'SemiCompRisks'

Title: Hierarchical Models for Parametric and Semi-Parametric Analyses of Semi-Competing Risks Data
Description: Hierarchical multistate models are considered to perform the analysis of independent/clustered semi-competing risks data. The package allows to choose the specification for model components from a range of options giving users substantial flexibility, including: accelerated failure time or proportional hazards regression models; parametric or non-parametric specifications for baseline survival functions and cluster-specific random effects distribution; a Markov or semi-Markov specification for terminal event following non-terminal event. While estimation is mainly performed within the Bayesian paradigm, the package also provides the maximum likelihood estimation approach for several parametric models. The package also includes functions for univariate survival analysis as complementary analysis tools.
Authors: Kyu Ha Lee, Catherine Lee, Danilo Alvares, and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>
License: GPL (>= 2)
Version: 3.4
Built: 2024-11-20 06:51:27 UTC
Source: CRAN

Help Index


Algorithms for fitting parametric and semi-parametric models to semi-competing risks data / univariate survival data.

Description

The package provides functions to perform the analysis of semi-competing risks or univariate survival data with either hazard regression (HReg) models or accelerated failure time (AFT) models. The framework is flexible in the sense that:
1) it can handle cluster-correlated or independent data;
2) the option to choose between parametric (Weibull) and semi-parametric (mixture of piecewise exponential) specification for baseline hazard function(s) is available;
3) for clustered data, the option to choose between parametric (multivariate Normal for semicompeting risks data, Normal for univariate survival data) and semi-parametric (Dirichlet process mixture) specification for random effects distribution is available;
4) for semi-competing risks data, the option to choose between Makov and semi-Makov model is available.

Details

The package includes following functions:

BayesID_HReg Bayesian analysis of semi-competing risks data using HReg models
BayesID_AFT Bayesian analysis of semi-competing risks data using AFT models
BayesSurv_HReg Bayesian analysis of univariate survival data using HReg models
BayesSurv_AFT Bayesian analysis of univariate survival data using AFT models
FreqID_HReg Frequentist analysis of semi-competing risks data using HReg models
FreqSurv_HReg Frequentist analysis of univariate survival data using HReg models
initiate.startValues_HReg Initiating starting values for Bayesian estimations with HReg models
initiate.startValues_AFT Initiating starting values for Bayesian estimations with AFT models
simID Simulating semi-competing risks data under Weibull/Weibull-MVN model
simSurv Simulating survival data under Weibull/Weibull-Normal model
Package: SemiCompRisks
Type: Package
Version: 3.4
Date: 2021-2-2
License: GPL (>= 2)
LazyLoad: yes

Author(s)

Kyu Ha Lee, Catherine Lee, Danilo Alvares, and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2015), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, 64, 2, 253-273.

Lee, K. H., Dominici, F., Schrag, D., and Haneuse, S. (2016), Hierarchical models for semicompeting risks data with application to quality of end-of-life care for pancreatic cancer, Journal of the American Statistical Association, 111, 515, 1075-1095.

Lee, K. H., Rondeau, V., and Haneuse, S. (2017), Accelerated failure time models for semicompeting risks data in the presence of complex censoring, Biometrics, 73, 4, 1401-1412.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.


The function to implement Bayesian parametric and semi-parametric analyses for semi-competing risks data in the context of accelerated failure time (AFT) models.

Description

Independent semi-competing risks data can be analyzed using AFT models that have a hierarchical structure. The proposed models can accomodate left-truncated and/or interval-censored data. An efficient computational algorithm that gives users the flexibility to adopt either a fully parametric (log-Normal) or a semi-parametric (Dirichlet process mixture) model specification is developed.

Usage

BayesID_AFT(Formula, data, model = "LN", hyperParams, startValues,
mcmcParams, na.action = "na.fail", subset=NULL, path=NULL)

Arguments

Formula

a Formula object, with the outcomes on the left of a \sim, and covariates on the right. It is of the form, left truncation time | interval- (or right-) censored time for non-terminal event | interval-(or right-) censored time for terminal event \sim covariates for h1h_1 | covariates for h2h_2 | covariates for h3h_3: i.e., LL | y1Ly_{1L}+y1Uy_{1U} | y2Ly_{2L}+y2Uy_{2U} ~ x1x_1 | x2x_2 | x3x_3.

data

a data.frame in which to interpret the variables named in Formula.

model

The specification of baseline survival distribution: "LN" or "DPM".

hyperParams

a list containing lists or vectors for hyperparameter values in hierarchical models. Components include, theta (a numeric vector for hyperparameter in the prior of subject-specific frailty variance component), LN (a list containing numeric vectors for log-Normal hyperparameters: LN.ab1, LN.ab2, LN.ab3), DPM (a list containing numeric vectors for DPM hyperparameters: DPM.mu1, DPM.mu2, DPM.mu3, DPM.sigSq1, DPM.sigSq2, DPM.sigSq3, DPM.ab1, DPM.ab2, DPM.ab3, Tau.ab1, Tau.ab2, Tau.ab3). See Details and Examples below.

startValues

a list containing vectors of starting values for model parameters. It can be specified as the object returned by the function initiate.startValues_AFT.

mcmcParams

a list containing variables required for MCMC sampling. Components include, run (a list containing numeric values for setting for the overall run: numReps, total number of scans; thin, extent of thinning; burninPerc, the proportion of burn-in). storage (a list containing numeric values for storing posterior samples for subject- and cluster-specific random effects: nGam_save, the number of γ\gamma to be stored; nY1_save, the number of y1y1 to be stored; nY2_save, the number of y2y2 to be stored; nY1.NA_save, the number of y1.NAy1.NA to be stored). tuning (a list containing numeric values relevant to tuning parameters for specific updates in Metropolis-Hastings (MH) algorithm: betag.prop.var, the variance of proposal density for βg\beta_g; mug.prop.var, the variance of proposal density for μg\mu_{g}; zetag.prop.var, the variance of proposal density for 1/σg21/\sigma_g^2; gamma.prop.var, the variance of proposal density for γ\gamma). See Details and Examples below.

na.action

how NAs are treated. See model.frame.

subset

a specification of the rows to be used: defaults to all rows. See model.frame.

path

the name of directory where the results are saved.

Details

We view the semi-competing risks data as arising from an underlying illness-death model system in which individuals may undergo one or more of three transitions: 1) from some initial condition to non-terminal event, 2) from some initial condition to terminal event, 3) from non-terminal event to terminal event. Let Ti1T_{i1}, Ti2T_{i2} denote time to non-terminal and terminal event from subject i=1,...,ni=1,...,n. We propose to directly model the times of the events via the following AFT model specification:

log(Ti1)=xi1β1+γi+ϵi1,Ti1>0,\log(T_{i1}) = x_{i1}^\top\beta_1 + \gamma_i + \epsilon_{i1}, T_{i1} > 0,

log(Ti2)=xi2β2+γi+ϵi2,Ti2>0,\log(T_{i2}) = x_{i2}^\top\beta_2 + \gamma_i + \epsilon_{i2}, T_{i2} > 0,

log(Ti2Ti1)=xi3β3+γi+ϵi3,Ti2>Ti1,\log(T_{i2} - T_{i1}) = x_{i3}^\top\beta_3 + \gamma_i + \epsilon_{i3}, T_{i2} > T_{i1},

where xigx_{ig} is a vector of transition-specific covariates, βg\beta_g is a corresponding vector of transition-specific regression parameters and ϵig\epsilon_{ig} is a transition-specific random variable whose distribution determines that of the corresponding transition time, g{1,2,3}g \in \{1,2,3\}. γi\gamma_i is a study participant-specific random effect that induces positive dependence between the two event times, thereby performing a role analogous to that performed by frailties in models for the hazard function. Let LiL_{i} denote the time at study entry (i.e. the left-truncation time). Furthermore, suppose that study participant ii was observed at follow-up times {ci1,,cimi}\{c_{i1},\ldots, c_{im_i}\} and let cic_i^* denote the time to the end of study or to administrative right-censoring. Considering interval-censoring for both events, the times to non-terminal and terminal event for the ithi^{th} study participant satisfy cijTi1<cij+1c_{ij}\leq T_{i1}< c_{ij+1} for some jj and cikTi2<cik+1c_{ik}\leq T_{i2}< c_{ik+1} for some kk, respectively. Then the observed outcomes for the ithi^{th} study participant can be succinctly denoted by {cij,cij+1,cik,cik+1,Li}\{c_{ij}, c_{ij+1}, c_{ik}, c_{ik+1}, L_{i}\}.

For the Bayesian semi-parametric analysis, we proceed by adopting independent DPM of normal distributions for each ϵig\epsilon_{ig}. More precisely, ϵig\epsilon_{ig} is taken to be an independent draw from a mixture of MgM_g normal distributions with means and variances (μgr\mu_{gr}, σgr2\sigma_{gr}^2), for r{1,,Mg}r \in \{1,\ldots,M_g\}. Since the class-specific (μgr,σgr2)(\mu_{gr}, \sigma_{gr}^2) are not known, they are taken to be draws from some common distribution, Gg0G_{g0}, often referred to as the centering distribution. Furthermore, since the ‘true’ class membership for any given study participant is not known, we let pgrp_{gr} denote the probability of belonging to the rthr^{th} class for transition gg and pgp_g = (pg1,,pgMg)(p_{g1}, \ldots, p_{gM_g}) the collection of such probabilities. Note, pgp_g is defined at the level of the population (i.e. is not study participant-specific) and its components add up to 1.0. In the absence of prior knowledge regarding the distribution of class memberships for the nn individuals across the MgM_g classes, pgp_g is assumed to follow a conjugate symmetric Dirichlet(τg/Mg,,τg/Mg)(\tau_g/M_g,\ldots,\tau_g/M_g) distribution, where τg\tau_g is referred to as the precision parameter. The finite mixture distribution can then be succinctly represented as:

ϵigriNormal(μri,σri2),\epsilon_{ig} | r_{i} \sim Normal(\mu_{r_{i}}, \sigma_{r_{i}}^2),

(μgr,σgr2)Gg0,  for r=1,,Mg,(\mu_{gr}, \sigma_{gr}^2) \sim G_{g0}, ~~for~ r=1,\ldots,M_g,

ripgDiscrete(ripg1,,pgMg),r_{i}| p_g \sim Discrete(r_{i} | p_{g1},\ldots,p_{gM_g}),

pgDirichlet(τg/Mg,,τg/Mg).p_g \sim Dirichlet(\tau_g/M_g, \ldots, \tau_g/M_g).

Letting MgM_g approach infinity, this specification is referred to as a DPM of normal distributions. In our proposed framework, we specify a Gamma(aτga_{\tau_g}, bτgb_{\tau_g}) hyperprior for τg\tau_g. For regression parameters, we adopt non-informative flat priors on the real line. For γ\gamma={γ1,,γn}\{\gamma_1, \ldots, \gamma_n\}, we assume that each γi\gamma_i is an independent random draw from a Normal(0, θ\theta) distribution. In the absence of prior knowledge on the variance component θ\theta, we adopt a conjugate inverse-Gamma hyperprior, IG(aθa_\theta, bθb_\theta). Finally, We take the Gg0G_{g0} as a normal distribution centered at μg0\mu_{g0} with a variance σg02\sigma_{g0}^2 for μgr\mu_{gr} and an IG(aσga_{\sigma_g}, bσgb_{\sigma_g}) for σgr2\sigma_{gr}^2.

For the Bayesian parametric analysis, we build on the log-Normal formulation and take the ϵig\epsilon_{ig} to follow independent Normal(μg\mu_g, σg2\sigma_g^2) distributions, gg=1,2,3. For location parameters {μ1,μ2,μ3}\{\mu_1, \mu_2, \mu_3\}, we adopt non-informative flat priors on the real line. For {σ12,σ22,σ32}\{\sigma_1^2, \sigma_2^2, \sigma_3^2\}, we adopt independent inverse Gamma distributions, denoted IG(aσga_{\sigma g}, bσgb_{\sigma g}). For βg\beta_g, γ\gamma, and θ\theta, we adopt the same priors as those adopted for the DPM model.

Value

BayesID_AFT returns an object of class Bayes_AFT.

Note

The posterior samples of γ\gamma are saved separately in working directory/path. For a dataset with large nn, nGam_save should be carefully specified considering the system memory and the storage capacity.

Author(s)

Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Rondeau, V., and Haneuse, S. (2017), Accelerated failure time models for semicompeting risks data in the presence of complex censoring, Biometrics, 73, 4, 1401-1412.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

initiate.startValues_AFT, print.Bayes_AFT, summary.Bayes_AFT, predict.Bayes_AFT

Examples

## Not run: 
# loading a data set
data(scrData)
scrData$y1L <- scrData$y1U <- scrData[,1]
scrData$y1U[which(scrData[,2] == 0)] <- Inf
scrData$y2L <- scrData$y2U <- scrData[,3]
scrData$y2U[which(scrData[,4] == 0)] <- Inf
scrData$LT <- rep(0, dim(scrData)[1])

form <- Formula(LT | y1L + y1U | y2L + y2U  ~ x1 + x2 + x3 | x1 + x2 | x1 + x2)

#####################
## Hyperparameters ##
#####################

## Subject-specific random effects variance component
##
theta.ab <- c(0.5, 0.05)

## log-Normal model
##
LN.ab1 <- c(0.3, 0.3)
LN.ab2 <- c(0.3, 0.3)
LN.ab3 <- c(0.3, 0.3)

## DPM model
##
DPM.mu1 <- log(12)
DPM.mu2 <- log(12)
DPM.mu3 <- log(12)

DPM.sigSq1 <- 100
DPM.sigSq2 <- 100
DPM.sigSq3 <- 100

DPM.ab1 <-  c(2, 1)
DPM.ab2 <-  c(2, 1)
DPM.ab3 <-  c(2, 1)

Tau.ab1 <- c(1.5, 0.0125)
Tau.ab2 <- c(1.5, 0.0125)
Tau.ab3 <- c(1.5, 0.0125)

##
hyperParams <- list(theta=theta.ab,
LN=list(LN.ab1=LN.ab1, LN.ab2=LN.ab2, LN.ab3=LN.ab3),
DPM=list(DPM.mu1=DPM.mu1, DPM.mu2=DPM.mu2, DPM.mu3=DPM.mu3, DPM.sigSq1=DPM.sigSq1,
DPM.sigSq2=DPM.sigSq2, DPM.sigSq3=DPM.sigSq3, DPM.ab1=DPM.ab1, DPM.ab2=DPM.ab2,
DPM.ab3=DPM.ab3, Tau.ab1=Tau.ab1, Tau.ab2=Tau.ab2, Tau.ab3=Tau.ab3))

###################
## MCMC SETTINGS ##
###################

## Setting for the overall run
##
numReps    <- 300
thin       <- 3
burninPerc <- 0.5

## Setting for storage
##
nGam_save <- 10
nY1_save <- 10
nY2_save <- 10
nY1.NA_save <- 10

## Tuning parameters for specific updates
##
##  - those common to all models
betag.prop.var	<- c(0.01,0.01,0.01)
mug.prop.var	<- c(0.1,0.1,0.1)
zetag.prop.var	<- c(0.1,0.1,0.1)
gamma.prop.var	<- 0.01

##
mcmcParams	<- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
storage=list(nGam_save=nGam_save, nY1_save=nY1_save, nY2_save=nY2_save, nY1.NA_save=nY1.NA_save),
tuning=list(betag.prop.var=betag.prop.var, mug.prop.var=mug.prop.var,
zetag.prop.var=zetag.prop.var, gamma.prop.var=gamma.prop.var))

#################################################################
## Analysis of Independent Semi-competing risks data ############
#################################################################

###############
## logNormal ##
###############

##
myModel <- "LN"
myPath  <- "Output/01-Results-LN/"

startValues      <- initiate.startValues_AFT(form, scrData, model=myModel, nChain=2)

##
fit_LN <- BayesID_AFT(form, scrData, model=myModel, hyperParams,
startValues, mcmcParams, path=myPath)

fit_LN
summ.fit_LN <- summary(fit_LN); names(summ.fit_LN)
summ.fit_LN
pred_LN <- predict(fit_LN, time = seq(0, 35, 1), tseq=seq(from=0, to=30, by=5))
plot(pred_LN, plot.est="Haz")
plot(pred_LN, plot.est="Surv")

#########
## DPM ##
#########

##
myModel <- "DPM"
myPath  <- "Output/02-Results-DPM/"

startValues      <- initiate.startValues_AFT(form, scrData, model=myModel, nChain=2)

##
fit_DPM <- BayesID_AFT(form, scrData, model=myModel, hyperParams,
startValues, mcmcParams, path=myPath)

fit_DPM
summ.fit_DPM <- summary(fit_DPM); names(summ.fit_DPM)
summ.fit_DPM
pred_DPM <- predict(fit_DPM, time = seq(0, 35, 1), tseq=seq(from=0, to=30, by=5))
plot(pred_DPM, plot.est="Haz")
plot(pred_DPM, plot.est="Surv")

## End(Not run)

The function to implement Bayesian parametric and semi-parametric analyses for semi-competing risks data in the context of hazard regression (HReg) models.

Description

Independent/cluster-correlated semi-competing risks data can be analyzed using HReg models that have a hierarchical structure. The priors for baseline hazard functions can be specified by either parametric (Weibull) model or non-parametric mixture of piecewise exponential models (PEM). The option to choose between parametric multivariate normal (MVN) and non-parametric Dirichlet process mixture of multivariate normals (DPM) is available for the prior of cluster-specific random effects distribution. The conditional hazard function for time to the terminal event given time to non-terminal event can be modeled based on either Markov (it does not depend on the timing of the non-terminal event) or semi-Markov assumption (it does depend on the timing).

Usage

BayesID_HReg(Formula, data, id=NULL, model=c("semi-Markov", "Weibull"),
hyperParams, startValues, mcmcParams, na.action = "na.fail", subset=NULL, path=NULL)

Arguments

Formula

a Formula object, with the outcome on the left of a \sim, and covariates on the right. It is of the form, time to non-terminal event + corresponding censoring indicator | time to terminal event + corresponding censoring indicator \sim covariates for h1h_1 | covariates for h2h_2 | covariates for h3h_3: i.e., y1y_1+δ1\delta_1 | y2y_2+δ2\delta_2 ~ x1x_1 | x2x_2 | x3x_3.

data

a data.frame in which to interpret the variables named in Formula.

id

a vector of cluster information for n subjects. The cluster membership must be consecutive positive integers, 1:J1:J.

model

a character vector that specifies the type of components in a model. The first element is for the assumption on h3h_3: "semi-Markov" or "Markov". The second element is for the specification of baseline hazard functions: "Weibull" or "PEM". The third element needs to be set only for clustered semi-competing risks data and is for the specification of cluster-specific random effects distribution: "MVN" or "DPM".

hyperParams

a list containing lists or vectors for hyperparameter values in hierarchical models. Components include, theta (a numeric vector for hyperparameter in the prior of subject-specific frailty variance component), WB (a list containing numeric vectors for Weibull hyperparameters: WB.ab1, WB.ab2, WB.ab3, WB.cd1, WB.cd2, WB.cd3), PEM (a list containing numeric vectors for PEM hyperparameters: PEM.ab1, PEM.ab2, PEM.ab3, PEM.alpha1, PEM.alpha2, PEM.alpha3). Models for clustered data require additional components, MVN (a list containing numeric vectors for MVN hyperparameters: Psi_v, rho_v), DPM (a list containing numeric vectors for DPM hyperparameters: Psi0, rho0, aTau, bTau). See Details and Examples below.

startValues

a list containing vectors of starting values for model parameters. It can be specified as the object returned by the function initiate.startValues_HReg.

mcmcParams

a list containing variables required for MCMC sampling. Components include, run (a list containing numeric values for setting for the overall run: numReps, total number of scans; thin, extent of thinning; burninPerc, the proportion of burn-in). storage (a list containing numeric values for storing posterior samples for subject- and cluster-specific random effects: nGam_save, the number of γ\gamma to be stored; storeV, a vector of three logical values to determine whether all the posterior samples of V1V_1, V2V_2, V3V_3 are to be stored). tuning (a list containing numeric values relevant to tuning parameters for specific updates in Metropolis-Hastings-Green (MHG) algorithm: mhProp_theta_var, the variance of proposal density for θ\theta; mhProp_Vg_var, the variance of proposal density for VgV_g in DPM models; mhProp_alphag_var, the variance of proposal density for αg\alpha_g in Weibull models; Cg, a vector of three proportions that determine the sum of probabilities of choosing the birth and the death moves in PEM models. The sum of the three elements should not exceed 0.6; delPertg, the perturbation parameters in the birth update in PEM models. The values must be between 0 and 0.5; If rj.scheme=1, the birth update will draw the proposal time split from 1:smax1:s_{max}. If rj.scheme=2, the birth update will draw the proposal time split from uniquely ordered failure times in the data. Only required for PEM models; Kg_max, the maximum number of splits allowed at each iteration in MHG algorithm for PEM models; time_lambda1, time_lambda2, time_lambda3 - time points at which the log-hazard functions are calculated for predict.Bayes_HReg, Only required for PEM models). See Details and Examples below.

na.action

how NAs are treated. See model.frame.

subset

a specification of the rows to be used: defaults to all rows. See model.frame.

path

the name of directory where the results are saved.

Details

We view the semi-competing risks data as arising from an underlying illness-death model system in which individuals may undergo one or more of three transitions: 1) from some initial condition to non-terminal event, 2) from some initial condition to terminal event, 3) from non-terminal event to terminal event. Let tji1t_{ji1}, tji2t_{ji2} denote time to non-terminal and terminal event from subject i=1,...,nji=1,...,n_j in cluster j=1,...,Jj=1,...,J. The system of transitions is modeled via the specification of three hazard functions:

h1(tji1γji,xji1,Vj1)=γjih01(tji1)exp(xji1β1+Vj1),tji1>0,h_1(t_{ji1} | \gamma_{ji}, x_{ji1}, V_{j1}) = \gamma_{ji} h_{01}(t_{ji1})\exp(x_{ji1}^{\top}\beta_1 +V_{j1}), t_{ji1}>0,

h2(tji2γji,xji2,Vj2)=γjih02(tji2)exp(xji2β2+Vj2),tji2>0,h_2(t_{ji2} | \gamma_{ji}, x_{ji2}, V_{j2}) = \gamma_{ji} h_{02}(t_{ji2})\exp(x_{ji2}^{\top}\beta_2 +V_{j2}), t_{ji2}>0,

h3(tji2tji1,γji,xji3,Vj3)=γjih03(tji2)exp(xji3β3+Vj3),0<tji1<tji2,h_3(t_{ji2} | t_{ji1}, \gamma_{ji}, x_{ji3}, V_{j3}) = \gamma_{ji} h_{03}(t_{ji2})\exp(x_{ji3}^{\top}\beta_3 +V_{j3}), 0<t_{ji1}<t_{ji2},

where γji\gamma_{ji} is a subject-specific frailty and VjV_j=(Vj1V_{j1}, Vj2V_{j2}, Vj3V_{j3}) is a vector of cluster-specific random effects (each specific to one of the three possible transitions), taken to be independent of xji1x_{ji1}, xji2x_{ji2}, and xji3x_{ji3}. For g{1,2,3}g \in \{1,2,3\}, h0gh_{0g} is an unspecified baseline hazard function and βg\beta_g is a vector of pgp_g log-hazard ratio regression parameters. The h03h_{03} is assumed to be Markov with respect to t1t_1. We refer to the model specified by three conditional hazard functions as the Markov model. An alternative specification is to model the risk of terminal event following non-terminal event as a function of the sojourn time. Specifically, retaining h1h_1 and h2h_2 as above, we consider modeling h3h_3 as follows:

h3(tji2tji1,γji,xji3,Vj3)=γjih03(tji2tji1)exp(xji3β3+Vj3),0<tji1<tji2.h_3(t_{ji2} | t_{ji1}, \gamma_{ji}, x_{ji3}, V_{j3}) = \gamma_{ji} h_{03}(t_{ji2}-t_{ji1})\exp(x_{ji3}^{\top}\beta_3 +V_{j3}), 0<t_{ji1}<t_{ji2}.

We refer to this alternative model as the semi-Markov model.
For parametric MVN prior specification for a vector of cluster-specific random effects, we assume VjV_j arise as i.i.d. draws from a mean 0 MVN distribution with variance-covariance matrix ΣV\Sigma_V. The diagonal elements of the 3×33\times 3 matrix ΣV\Sigma_V characterize variation across clusters in risk for non-terminal, terminal and terminal following non-terminal event, respectively, which is not explained by covariates included in the linear predictors. Specifically, the priors can be written as follows:

VjMVN(0,ΣV),V_j \sim MVN(0, \Sigma_V),

ΣVinverseWishart(Ψv,ρv).\Sigma_V \sim inverse-Wishart(\Psi_v, \rho_v).

For DPM prior specification for VjV_j, we consider non-parametric Dirichlet process mixture of MVN distributions: the VjV_j's are draws from a finite mixture of M multivariate Normal distributions, each with their own mean vector and variance-covariance matrix, (μm\mu_m, Σm\Sigma_m) for m=1,...,Mm=1,...,M. Let mj{1,...,M}m_j\in\{1,...,M\} denote the specific component to which the jjth cluster belongs. Since the class-specific (μm\mu_m, Σm\Sigma_m) are unknown they are taken to be draws from some distribution, G0G_0, often referred to as the centering distribution. Furthermore, since the true class memberships are not known, we denote the probability that the jjth cluster belongs to any given class by the vector p=(p1,...,pM)p=(p_1,..., p_M) whose components add up to 1.0. In the absence of prior knowledge regarding the distribution of class memberships for the JJ clusters across the MM classes, a natural prior for pp is the conjugate symmetric Dirichlet(τ/M,...,τ/M)Dirichlet(\tau/M,...,\tau/M) distribution; the hyperparameter, τ\tau, is often referred to as a the precision parameter. The prior can be represented as follows (MM goes to infinity):

VjmjMVN(μmj,Σmj),V_j | m_j \sim MVN(\mu_{m_j}, \Sigma_{m_j}),

(μm,Σm)G0,  for m=1,...,M,(\mu_m, \Sigma_m) \sim G_{0},~~ for ~m=1,...,M,

mjpDiscrete(mjp1,...,pM),m_j | p \sim Discrete(m_j| p_1,...,p_M),

pDirichlet(τ/M,...,τ/M),p \sim Dirichlet(\tau/M,...,\tau/M),

where G0G_0 is taken to be a multivariate Normal/inverse-Wishart (NIW) distribution for which the probability density function is the following product:

fNIW(μ,ΣΨ0,ρ0)=fMVN(μ0,Σ)×finvWishart(ΣΨ0,ρ0).f_{NIW}(\mu, \Sigma | \Psi_0, \rho_0) = f_{MVN}(\mu | 0, \Sigma) \times f_{inv-Wishart}(\Sigma | \Psi_0, \rho_0).

We consider Gamma(aτ,bτ)Gamma(a_{\tau}, b_{\tau}) as the prior for concentration parameter τ\tau.

For non-parametric PEM prior specification for baseline hazard functions, let sg,maxs_{g,\max} denote the largest observed event time for each transition g{1,2,3}g \in \{1,2,3\}. Then, consider the finite partition of the relevant time axis into Kg+1K_{g} + 1 disjoint intervals: 0<sg,1<sg,2<...<sg,Kg+1=sg,max0<s_{g,1}<s_{g,2}<...<s_{g, K_g+1} = s_{g, \max}. For notational convenience, let Ig,k=(sg,k1,sg,k]I_{g,k}=(s_{g, k-1}, s_{g, k}] denote the kthk^{th} partition. For given a partition, sg=(sg,1,,sg,Kg+1)s_g = (s_{g,1}, \dots, s_{g, K_g + 1}), we assume the log-baseline hazard functions is piecewise constant:

λ0g(t)=logh0g(t)=k=1Kg+1λg,kI(tIg,k),\lambda_{0g}(t)=\log h_{0g}(t) = \sum_{k=1}^{K_g + 1} \lambda_{g,k} I(t\in I_{g,k}),

where I()I(\cdot) is the indicator function and sg,00s_{g,0} \equiv 0. Note, this specification is general in that the partitions of the time axes differ across the three hazard functions. our prior choices are, for g{1,2,3}g\in\{1,2,3\}:

λgKg,μλg,σλg2MVNKg+1(μλg1,σλg2Σλg),\lambda_g | K_g, \mu_{\lambda_g}, \sigma_{\lambda_g}^2 \sim MVN_{K_g+1}(\mu_{\lambda_g}1, \sigma_{\lambda_g}^2\Sigma_{\lambda_g}),

KgPoisson(αg),K_g \sim Poisson(\alpha_g),

π(sgKg)(2Kg+1)!k=1Kg+1(sg,ksg,k1)(sg,Kg+1)(2Kg+1),\pi(s_g | K_g) \propto \frac{(2K_g+1)! \prod_{k=1}^{K_g+1}(s_{g,k}-s_{g,k-1})}{(s_{g,K_g+1})^{(2K_g+1)}},

π(μλg)1,\pi(\mu_{\lambda_g}) \propto 1,

σλg2Gamma(ag,bg).\sigma_{\lambda_g}^{-2} \sim Gamma(a_g, b_g).

Note that KgK_g and sgs_g are treated as random and the priors for KgK_g and sgs_g jointly form a time-homogeneous Poisson process prior for the partition. The number of time splits and their positions are therefore updated within our computational scheme using reversible jump MCMC.

For parametric Weibull prior specification for baseline hazard functions, h0g(t)=αgκgtαg1h_{0g}(t) = \alpha_g \kappa_g t^{\alpha_g-1}. In our Bayesian framework, our prior choices are, for g{1,2,3}g\in\{1,2,3\}:

π(αg)Gamma(ag,bg),\pi(\alpha_g) \sim Gamma(a_g, b_g),

π(κg)Gamma(cg,dg).\pi(\kappa_g) \sim Gamma(c_g, d_g).

Our prior choice for remaining model parameters in all of four models (Weibull-MVN, Weibull-DPM, PEM-MVN, PEM-DPM) is given as follows:

π(βg)1,\pi(\beta_g) \propto 1,

γjiθGamma(θ1,θ1),\gamma_{ji}|\theta \sim Gamma(\theta^{-1}, \theta^{-1}),

θ1Gamma(ψ,ω).\theta^{-1} \sim Gamma(\psi, \omega).

We provide a detailed description of the hierarchical models for cluster-correlated semi-competing risks data. The models for independent semi-competing risks data can be obtained by removing cluster-specific random effects, VjV_j, and its corresponding prior specification from the description given above.

Value

BayesID_HReg returns an object of class Bayes_HReg.

Note

The posterior samples of γ\gamma and VgV_g are saved separately in working directory/path. For a dataset with large nn, nGam_save should be carefully specified considering the system memory and the storage capacity.

Author(s)

Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2015), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, 64, 2, 253-273.

Lee, K. H., Dominici, F., Schrag, D., and Haneuse, S. (2016), Hierarchical models for semicompeting risks data with application to quality of end-of-life care for pancreatic cancer, Journal of the American Statistical Association, 111, 515, 1075-1095.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

initiate.startValues_HReg, print.Bayes_HReg, summary.Bayes_HReg, predict.Bayes_HReg

Examples

## Not run:    
# loading a data set
data(scrData)
id=scrData$cluster

form <- Formula(time1 + event1 | time2 + event2 ~ x1 + x2 + x3 | x1 + x2 | x1 + x2)

#####################
## Hyperparameters ##
#####################

## Subject-specific frailty variance component
##  - prior parameters for 1/theta
##
theta.ab <- c(0.7, 0.7)

## Weibull baseline hazard function: alphas, kappas
##
WB.ab1 <- c(0.5, 0.01) # prior parameters for alpha1
WB.ab2 <- c(0.5, 0.01) # prior parameters for alpha2
WB.ab3 <- c(0.5, 0.01) # prior parameters for alpha3
##
WB.cd1 <- c(0.5, 0.05) # prior parameters for kappa1
WB.cd2 <- c(0.5, 0.05) # prior parameters for kappa2
WB.cd3 <- c(0.5, 0.05) # prior parameters for kappa3

## PEM baseline hazard function
##
PEM.ab1 <- c(0.7, 0.7) # prior parameters for 1/sigma_1^2
PEM.ab2 <- c(0.7, 0.7) # prior parameters for 1/sigma_2^2
PEM.ab3 <- c(0.7, 0.7) # prior parameters for 1/sigma_3^2
##
PEM.alpha1 <- 10 # prior parameters for K1
PEM.alpha2 <- 10 # prior parameters for K2
PEM.alpha3 <- 10 # prior parameters for K3

## MVN cluster-specific random effects
##
Psi_v <- diag(1, 3)
rho_v <- 100

## DPM cluster-specific random effects
##
Psi0  <- diag(1, 3)
rho0  <- 10
aTau  <- 1.5
bTau  <- 0.0125

##
hyperParams <- list(theta=theta.ab,
                WB=list(WB.ab1=WB.ab1, WB.ab2=WB.ab2, WB.ab3=WB.ab3,
                       WB.cd1=WB.cd1, WB.cd2=WB.cd2, WB.cd3=WB.cd3),
                   PEM=list(PEM.ab1=PEM.ab1, PEM.ab2=PEM.ab2, PEM.ab3=PEM.ab3,
                       PEM.alpha1=PEM.alpha1, PEM.alpha2=PEM.alpha2, PEM.alpha3=PEM.alpha3),
                   MVN=list(Psi_v=Psi_v, rho_v=rho_v),
                   DPM=list(Psi0=Psi0, rho0=rho0, aTau=aTau, bTau=bTau))
                    
###################
## MCMC SETTINGS ##
###################

## Setting for the overall run
##
numReps    <- 2000
thin       <- 10
burninPerc <- 0.5

## Settings for storage
##
nGam_save <- 0
storeV    <- rep(TRUE, 3)

## Tuning parameters for specific updates
##
##  - those common to all models
mhProp_theta_var  <- 0.05
mhProp_Vg_var     <- c(0.05, 0.05, 0.05)
##
## - those specific to the Weibull specification of the baseline hazard functions
mhProp_alphag_var <- c(0.01, 0.01, 0.01)
##
## - those specific to the PEM specification of the baseline hazard functions
Cg        <- c(0.2, 0.2, 0.2)
delPertg  <- c(0.5, 0.5, 0.5)
rj.scheme <- 1
Kg_max    <- c(50, 50, 50)
sg_max    <- c(max(scrData$time1[scrData$event1 == 1]),
               max(scrData$time2[scrData$event1 == 0 & scrData$event2 == 1]),
               max(scrData$time2[scrData$event1 == 1 & scrData$event2 == 1]))

time_lambda1 <- seq(1, sg_max[1], 1)
time_lambda2 <- seq(1, sg_max[2], 1)
time_lambda3 <- seq(1, sg_max[3], 1)               

##
mcmc.WB  <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
                storage=list(nGam_save=nGam_save, storeV=storeV),
                tuning=list(mhProp_theta_var=mhProp_theta_var,
                mhProp_Vg_var=mhProp_Vg_var, mhProp_alphag_var=mhProp_alphag_var))

##
mcmc.PEM <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
                storage=list(nGam_save=nGam_save, storeV=storeV),
                tuning=list(mhProp_theta_var=mhProp_theta_var,
                mhProp_Vg_var=mhProp_Vg_var, Cg=Cg, delPertg=delPertg,
                rj.scheme=rj.scheme, Kg_max=Kg_max,
                time_lambda1=time_lambda1, time_lambda2=time_lambda2,
                time_lambda3=time_lambda3))
    
#####################
## Starting Values ##
#####################

##
Sigma_V <- diag(0.1, 3)
Sigma_V[1,2] <- Sigma_V[2,1] <- -0.05
Sigma_V[1,3] <- Sigma_V[3,1] <- -0.06
Sigma_V[2,3] <- Sigma_V[3,2] <- 0.07

#################################################################
## Analysis of Independent Semi-Competing Risks Data ############
#################################################################

#############
## WEIBULL ##
#############

##
myModel <- c("semi-Markov", "Weibull")
myPath  <- "Output/01-Results-WB/"

startValues      <- initiate.startValues_HReg(form, scrData, model=myModel, nChain=2)

##
fit_WB <- BayesID_HReg(form, scrData, id=NULL, model=myModel,
                hyperParams, startValues, mcmc.WB, path=myPath)
				
fit_WB
summ.fit_WB <- summary(fit_WB); names(summ.fit_WB)
summ.fit_WB
pred_WB <- predict(fit_WB, tseq=seq(from=0, to=30, by=5))
plot(pred_WB, plot.est="Haz")
plot(pred_WB, plot.est="Surv")

#########
## PEM ##
#########

##						
myModel <- c("semi-Markov", "PEM")
myPath  <- "Output/02-Results-PEM/"

startValues      <- initiate.startValues_HReg(form, scrData, model=myModel, nChain=2)

##
fit_PEM <- BayesID_HReg(form, scrData, id=NULL, model=myModel,
                 hyperParams, startValues, mcmc.PEM, path=myPath)
				
fit_PEM
summ.fit_PEM <- summary(fit_PEM); names(summ.fit_PEM)
summ.fit_PEM
pred_PEM <- predict(fit_PEM)
plot(pred_PEM, plot.est="Haz")
plot(pred_PEM, plot.est="Surv")
					
#################################################################
## Analysis of Correlated Semi-Competing Risks Data #############
#################################################################

#################
## WEIBULL-MVN ##
#################

##
myModel <- c("semi-Markov", "Weibull", "MVN")
myPath  <- "Output/03-Results-WB_MVN/"

startValues      <- initiate.startValues_HReg(form, scrData, model=myModel, id, nChain=2)

##
fit_WB_MVN <- BayesID_HReg(form, scrData, id, model=myModel,
                    hyperParams, startValues, mcmc.WB, path=myPath)
                    
fit_WB_MVN
summ.fit_WB_MVN <- summary(fit_WB_MVN); names(summ.fit_WB_MVN)
summ.fit_WB_MVN
pred_WB_MVN <- predict(fit_WB_MVN, tseq=seq(from=0, to=30, by=5))
plot(pred_WB_MVN, plot.est="Haz")
plot(pred_WB_MVN, plot.est="Surv")


#################
## WEIBULL-DPM ##
#################

##
myModel <- c("semi-Markov", "Weibull", "DPM")
myPath  <- "Output/04-Results-WB_DPM/"

startValues      <- initiate.startValues_HReg(form, scrData, model=myModel, id, nChain=2)

##
fit_WB_DPM <- BayesID_HReg(form, scrData, id, model=myModel,
                    hyperParams, startValues, mcmc.WB, path=myPath)

fit_WB_DPM
summ.fit_WB_DPM <- summary(fit_WB_DPM); names(summ.fit_WB_DPM)
summ.fit_WB_DPM
pred_WB_DPM <- predict(fit_WB_MVN, tseq=seq(from=0, to=30, by=5))
plot(pred_WB_DPM, plot.est="Haz")
plot(pred_WB_DPM, plot.est="Surv")

#############
## PEM-MVN ##
#############

##
myModel <- c("semi-Markov", "PEM", "MVN")
myPath  <- "Output/05-Results-PEM_MVN/"

startValues      <- initiate.startValues_HReg(form, scrData, model=myModel, id, nChain=2)

##
fit_PEM_MVN <- BayesID_HReg(form, scrData, id, model=myModel,
                    hyperParams, startValues, mcmc.PEM, path=myPath)
                    
fit_PEM_MVN
summ.fit_PEM_MVN <- summary(fit_PEM_MVN); names(summ.fit_PEM_MVN)
summ.fit_PEM_MVN
pred_PEM_MVN <- predict(fit_PEM_MVN)
plot(pred_PEM_MVN, plot.est="Haz")
plot(pred_PEM_MVN, plot.est="Surv")

#############
## PEM-DPM ##
#############

##
myModel <- c("semi-Markov", "PEM", "DPM")
myPath  <- "Output/06-Results-PEM_DPM/"

startValues      <- initiate.startValues_HReg(form, scrData, model=myModel, id, nChain=2)
                    
##
fit_PEM_DPM <- BayesID_HReg(form, scrData, id, model=myModel,
                    hyperParams, startValues, mcmc.PEM, path=myPath)
                    
fit_PEM_DPM
summ.fit_PEM_DPM <- summary(fit_PEM_DPM); names(summ.fit_PEM_DPM)
summ.fit_PEM_DPM
pred_PEM_DPM <- predict(fit_PEM_DPM)
plot(pred_PEM_DPM, plot.est="Haz")
plot(pred_PEM_DPM, plot.est="Surv")
                    

## End(Not run)

The function to implement Bayesian parametric and semi-parametric analyses for univariate survival data in the context of accelerated failure time (AFT) models.

Description

Independent univariate survival data can be analyzed using AFT models that have a hierarchical structure. The proposed models can accomodate left-truncated and/or interval-censored data. An efficient computational algorithm that gives users the flexibility to adopt either a fully parametric (log-Normal) or a semi-parametric (Dirichlet process mixture) model specification is developed.

Usage

BayesSurv_AFT(Formula, data, model = "LN", hyperParams, startValues,
                mcmcParams, na.action = "na.fail", subset=NULL, path=NULL)

Arguments

Formula

a Formula object, with the outcomes on the left of a \sim, and covariates on the right. It is of the form, left truncation time | interval- (or right-) censored time to event \sim covariates : i.e., LL | yLy_{L}+yUy_{U} ~ xx.

data

a data.frame in which to interpret the variables named in Formula.

model

The specification of baseline survival distribution: "LN" or "DPM".

hyperParams

a list containing lists or vectors for hyperparameter values in hierarchical models. Components include, LN (a list containing numeric vectors for log-Normal hyperparameters: LN.ab), DPM (a list containing numeric vectors for DPM hyperparameters: DPM.mu, DPM.sigSq, DPM.ab, Tau.ab). See Details and Examples below.

startValues

a list containing vectors of starting values for model parameters. It can be specified as the object returned by the function initiate.startValues_AFT.

mcmcParams

a list containing variables required for MCMC sampling. Components include, run (a list containing numeric values for setting for the overall run: numReps, total number of scans; thin, extent of thinning; burninPerc, the proportion of burn-in). tuning (a list containing numeric values relevant to tuning parameters for specific updates in Metropolis-Hastings (MH) algorithm: beta.prop.var, the variance of proposal density for β\beta; mu.prop.var, the variance of proposal density for μ\mu; zeta.prop.var, the variance of proposal density for 1/σ21/\sigma^2).

na.action

how NAs are treated. See model.frame.

subset

a specification of the rows to be used: defaults to all rows. See model.frame.

path

the name of directory where the results are saved.

Details

The function BayesSurv_AFT implements Bayesian semi-parametric (DPM) and parametric (log-Normal) models to univariate time-to-event data in the presence of left-truncation and/or interval-censoring. Consider a univariate AFT model that relates the covariate xix_i to survival time TiT_i for the ithi^{\textrm{th}} subject:

log(Ti)=xiβ+ϵi,\log(T_i) = x_i^{\top}\beta + \epsilon_i,

where ϵi\epsilon_i is a random variable whose distribution determines that of TiT_i and β\beta is a vector of regression parameters. Considering the interval censoring, the time to the event for the ithi^{\textrm{th}} subject satisfies cijTi<cij+1c_{ij}\leq T_i <c_{ij+1}. Let LiL_i denote the left-truncation time. For the Bayesian parametric analysis, we take ϵi\epsilon_i to follow the Normal(μ\mu, σ2\sigma^2) distribution for ϵi\epsilon_i. The following prior distributions are adopted for the model parameters:

π(β,μ)1,\pi(\beta, \mu) \propto 1,

σ2Inverse-Gamma(aσ,bσ).\sigma^2 \sim \textrm{Inverse-Gamma}(a_{\sigma}, b_{\sigma}).

For the Bayesian semi-parametric analysis, we assume that ϵi\epsilon_i is taken as draws from the DPM of normal distributions:

ϵDPM(G0,τ).\epsilon\sim DPM(G_0, \tau).

We refer readers to print.Bayes_AFT for a detailed illustration of DPM specification. We adopt a non-informative flat prior on the real line for the regression parameters β\beta and a Gamma(aτa_{\tau}, bτb_{\tau}) hyperprior for the precision parameter τ\tau.

Value

BayesSurv_AFT returns an object of class Bayes_AFT.

Author(s)

Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Rondeau, V., and Haneuse, S. (2017), Accelerated failure time models for semicompeting risks data in the presence of complex censoring, Biometrics, 73, 4, 1401-1412.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

initiate.startValues_AFT, print.Bayes_AFT, summary.Bayes_AFT, predict.Bayes_AFT

Examples

## Not run: 
# loading a data set
data(survData)
survData$yL <- survData$yU <- survData[,1]
survData$yU[which(survData[,2] == 0)] <- Inf
survData$LT <- rep(0, dim(survData)[1])

form <- Formula(LT | yL + yU ~ cov1 + cov2)

#####################
## Hyperparameters ##
#####################

## log-Normal model
##
LN.ab <- c(0.3, 0.3)

## DPM model
##
DPM.mu <- log(12)
DPM.sigSq <- 100
DPM.ab <-  c(2, 1)
Tau.ab <- c(1.5, 0.0125)

##
hyperParams <- list(LN=list(LN.ab=LN.ab),
DPM=list(DPM.mu=DPM.mu, DPM.sigSq=DPM.sigSq, DPM.ab=DPM.ab, Tau.ab=Tau.ab))

###################
## MCMC SETTINGS ##
###################

## Setting for the overall run
##
numReps    <- 100
thin       <- 1
burninPerc <- 0.5

## Tuning parameters for specific updates
##
##  - those common to all models
beta.prop.var	<- 0.01
mu.prop.var	<- 0.1
zeta.prop.var	<- 0.1

##
mcmcParams	<- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
tuning=list(beta.prop.var=beta.prop.var, mu.prop.var=mu.prop.var,
zeta.prop.var=zeta.prop.var))

################################################################
## Analysis of Independent univariate survival data ############
################################################################

###############
## logNormal ##
###############

##
myModel <- "LN"
myPath  <- "Output/01-Results-LN/"

startValues      <- initiate.startValues_AFT(form, survData, model=myModel, nChain=2)

##
fit_LN <- BayesSurv_AFT(form, survData, model=myModel, hyperParams,
startValues, mcmcParams, path=myPath)

fit_LN
summ.fit_LN <- summary(fit_LN); names(summ.fit_LN)
summ.fit_LN
pred_LN <- predict(fit_LN, time = seq(0, 35, 1), tseq=seq(from=0, to=30, by=5))
plot(pred_LN, plot.est="Haz")
plot(pred_LN, plot.est="Surv")

#########
## DPM ##
#########

##
myModel <- "DPM"
myPath  <- "Output/02-Results-DPM/"

startValues      <- initiate.startValues_AFT(form, survData, model=myModel, nChain=2)

##
fit_DPM <- BayesSurv_AFT(form, survData, model=myModel, hyperParams,
startValues, mcmcParams, path=myPath)

fit_DPM
summ.fit_DPM <- summary(fit_DPM); names(summ.fit_DPM)
summ.fit_DPM
pred_DPM <- predict(fit_DPM, time = seq(0, 35, 1), tseq=seq(from=0, to=30, by=5))
plot(pred_DPM, plot.est="Haz")
plot(pred_DPM, plot.est="Surv")

## End(Not run)

The function to implement Bayesian parametric and semi-parametric regression analyses for univariate time-to-event data in the context of hazard regression (HReg) models.

Description

Independent/cluster-correlated univariate right-censored survival data can be analyzed using hierarchical models. The prior for the baseline hazard function can be specified by either parametric (Weibull) model or non-parametric mixture of piecewise exponential models (PEM).

Usage

BayesSurv_HReg(Formula, data, id=NULL, model="Weibull", hyperParams,
        startValues, mcmcParams, na.action = "na.fail", subset=NULL, path=NULL)

Arguments

Formula

a Formula object, with the outcome on the left of a \sim, and covariates on the right. It is of the form, time to event + censoring indicator \sim covariates: i.e., yy+δ\delta ~ xx.

data

a data.frame in which to interpret the variables named in Formula.

id

a vector of cluster information for n subjects. The cluster membership must be consecutive positive integers, 1:J1:J.

model

a character vector that specifies the type of components in a model. The first element is for the specification of baseline hazard functions: "Weibull" or "PEM". The second element needs to be set only for clustered data and is for the specification of cluster-specific random effects distribution: "Normal" or "DPM".

hyperParams

a list containing lists or vectors for hyperparameter values in hierarchical models. Components include, WB (a list containing a numeric vector for Weibull hyperparameters: WB.ab, WB.cd), PEM (a list containing numeric vectors for PEM hyperparameters: PEM.ab, PEM.alpha). Models for clustered data require additional components, Normal (a list containing a numeric vector for hyperparameters in a Normal prior: Normal.ab), DPM (a list containing numeric vectors for DPM hyperparameters: DPM.ab, aTau, bTau). See Details and Examples below.

startValues

a list containing vectors of starting values for model parameters. It can be specified as the object returned by the function initiate.startValues_HReg.

mcmcParams

a list containing variables required for MCMC sampling. Components include, run (a list containing numeric values for setting for the overall run: numReps, total number of scans; thin, extent of thinning; burninPerc, the proportion of burn-in). storage (a list containing numeric values for storing posterior samples for cluster-specific random effects: storeV, a logical value to determine whether all the posterior samples of VV are to be stored.) tuning (a list containing numeric values relevant to tuning parameters for specific updates in Metropolis-Hastings-Green (MHG) algorithm: mhProp_V_var, the variance of proposal density for VV in DPM models; mhProp_alpha_var, the variance of proposal density for α\alpha in Weibull models; C, a numeric value for the proportion that determines the sum of probabilities of choosing the birth and the death moves in PEM models. The value should not exceed 0.8; delPert, the perturbation parameter in the birth update in PEM models. The values must be between 0 and 0.5; If rj.scheme=1, the birth update will draw the proposal time split from 1:smax1:s_{max}. If rj.scheme=2, the birth update will draw the proposal time split from uniquely ordered failure times in the data. Only required for PEM models; K_max, the maximum number of splits allowed at each iteration in MHG algorithm for PEM models; time_lambda - time points at which the log-hazard function is calculated for predict.Bayes_HReg, Only required for PEM models). See Details and Examples below.

na.action

how NAs are treated. See model.frame.

subset

a specification of the rows to be used: defaults to all rows. See model.frame.

path

the name of directory where the results are saved.

Details

The function BayesSurv_HReg implements Bayesian semi-parametric (piecewise exponential mixture) and parametric (Weibull) models to univariate time-to-event data. Let tjit_{ji} denote time to event of interest from subject i=1,...,nji=1,...,n_j in cluster j=1,...,Jj=1,...,J. The covariates xjix_{ji} are incorporated via Cox proportional hazards model:

h(tjixji)=h0(tji)exp(xjiβ+Vj),tji>0,h(t_{ji} | x_{ji}) = h_{0}(t_{ji})\exp(x_{ji}^{\top}\beta + V_{j}), t_{ji}>0,

where h0h_0 is an unspecified baseline hazard function and β\beta is a vector of pp log-hazard ratio regression parameters. VjV_j's are cluster-specific random effects. For parametric Normal prior specification for a vector of cluster-specific random effects, we assume VV arise as i.i.d. draws from a mean 0 Normal distribution with variance σ2\sigma^2. Specifically, the priors can be written as follows:

VjNormal(0,σ2),V_j \sim Normal(0, \sigma^2),

ζ=1/σ2Gamma(aN,bN).\zeta=1/\sigma^2 \sim Gamma(a_{N}, b_{N}).

For DPM prior specification for VjV_j, we consider non-parametric Dirichlet process mixture of Normal distributions: the VjV_j's' are draws from a finite mixture of M Normal distributions, each with their own mean and variance, (μm\mu_m, σm2\sigma_m^2) for m=1,...,Mm=1,...,M. Let mj{1,...,M}m_j\in\{1,...,M\} denote the specific component to which the jjth cluster belongs. Since the class-specific (μm\mu_m, σm2\sigma_m^2) are not known they are taken to be draws from some distribution, G0G_0, often referred to as the centering distribution. Furthermore, since the true class memberships are unknown, we denote the probability that the jjth cluster belongs to any given class by the vector p=(p1,...,pM)p=(p_1,..., p_M) whose components add up to 1.0. In the absence of prior knowledge regarding the distribution of class memberships for the JJ clusters across the MM classes, a natural prior for pp is the conjugate symmetric Dirichlet(τ/M,...,τ/M)Dirichlet(\tau/M,...,\tau/M) distribution; the hyperparameter, τ\tau, is often referred to as a the precision parameter. The prior can be represented as follows (MM goes to infinity):

VjmjNormal(μmj,σmj2),V_j | m_j \sim Normal(\mu_{m_j}, \sigma_{m_j}^2),

(μm,σm2)G0,  for m=1,...,M,(\mu_m, \sigma_m^2) \sim G_{0},~~ for ~m=1,...,M,

mjpDiscrete(mjp1,...,pM),m_j | p \sim Discrete(m_j| p_1,...,p_M),

pDirichlet(τ/M,...,τ/M),p \sim Dirichlet(\tau/M,...,\tau/M),

where G0G_0 is taken to be a multivariate Normal/inverse-Gamma (NIG) distribution for which the probability density function is the following product:

fNIG(μ,σ2μ0,ζ0,a0,b0)=fNormal(μ0,1/ζ02)×fGamma(ζ=1/σ2a0,b0).f_{NIG}(\mu, \sigma^2 | \mu_0, \zeta_0, a_0, b_0) = f_{Normal}(\mu | 0, 1/\zeta_0^2) \times f_{Gamma}(\zeta=1/\sigma^2 | a_0, b_0).

In addition, we use Gamma(aτ,bτ)Gamma(a_{\tau}, b_{\tau}) as the hyperprior for τ\tau.

For non-parametric prior specification (PEM) for baseline hazard function, let smaxs_{\max} denote the largest observed event time. Then, consider the finite partition of the relevant time axis into K+1K + 1 disjoint intervals: 0<s1<s2<...<sK+1=smax0<s_1<s_2<...<s_{K+1} = s_{\max}. For notational convenience, let Ik=(sk1,sk]I_k=(s_{k-1}, s_k] denote the kthk^{th} partition. For given a partition, s=(s1,,sK+1)s = (s_1, \dots, s_{K + 1}), we assume the log-baseline hazard functions is piecewise constant:

λ0(t)=logh0(t)=k=1K+1λkI(tIk),\lambda_{0}(t)=\log h_{0}(t) = \sum_{k=1}^{K + 1} \lambda_{k} I(t\in I_{k}),

where I()I(\cdot) is the indicator function and s00s_0 \equiv 0. In our proposed Bayesian framework, our prior choices are:

π(β)1,\pi(\beta) \propto 1,

λK,μλ,σλ2MVNK+1(μλ1,σλ2Σλ),\lambda | K, \mu_{\lambda}, \sigma_{\lambda}^2 \sim MVN_{K+1}(\mu_{\lambda}1, \sigma_{\lambda}^2\Sigma_{\lambda}),

KPoisson(α),K \sim Poisson(\alpha),

π(sK)(2K+1)!k=1K+1(sksk1)(sK+1)(2K+1),\pi(s | K) \propto \frac{(2K+1)! \prod_{k=1}^{K+1}(s_k-s_{k-1})}{(s_{K+1})^{(2K+1)}},

π(μλ)1,\pi(\mu_{\lambda}) \propto 1,

σλ2Gamma(a,b).\sigma_{\lambda}^{-2} \sim Gamma(a, b).

Note that KK and ss are treated as random and the priors for KK and ss jointly form a time-homogeneous Poisson process prior for the partition. The number of time splits and their positions are therefore updated within our computational scheme using reversible jump MCMC.

For parametric Weibull prior specification for baseline hazard function, h0(t)=ακtα1h_{0}(t) = \alpha \kappa t^{\alpha-1}. In our Bayesian framework, our prior choices are:

π(β)1,\pi(\beta) \propto 1,

π(α)Gamma(a,b),\pi(\alpha) \sim Gamma(a, b),

π(κ)Gamma(c,d).\pi(\kappa) \sim Gamma(c, d).

We provide a detailed description of the hierarchical models for cluster-correlated univariate survival data. The models for independent data can be obtained by removing cluster-specific random effects, VjV_j, and its corresponding prior specification from the description given above.

Value

BayesSurv_HReg returns an object of class Bayes_HReg.

Note

The posterior samples of VgV_g are saved separately in working directory/path.

Author(s)

Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2015), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, 64, 2, 253-273.

Lee, K. H., Dominici, F., Schrag, D., and Haneuse, S. (2016), Hierarchical models for semicompeting risks data with application to quality of end-of-life care for pancreatic cancer, Journal of the American Statistical Association, 111, 515, 1075-1095.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

initiate.startValues_HReg, print.Bayes_HReg, summary.Bayes_HReg, predict.Bayes_HReg

Examples

## Not run: 		
# loading a data set	
data(survData)
id=survData$cluster

form <- Formula(time + event ~ cov1 + cov2)

#####################
## Hyperparameters ##
#####################

## Weibull baseline hazard function: alpha1, kappa1
##
WB.ab <- c(0.5, 0.01) # prior parameters for alpha
##
WB.cd <- c(0.5, 0.05) # prior parameters for kappa

## PEM baseline hazard function: 
##
PEM.ab <- c(0.7, 0.7) # prior parameters for 1/sigma^2
##
PEM.alpha <- 10 # prior parameters for K

## Normal cluster-specific random effects
##
Normal.ab 	<- c(0.5, 0.01) 		# prior for zeta

## DPM cluster-specific random effects
##
DPM.ab <- c(0.5, 0.01)
aTau  <- 1.5
bTau  <- 0.0125

##
hyperParams <- list(WB=list(WB.ab=WB.ab, WB.cd=WB.cd),
                    PEM=list(PEM.ab=PEM.ab, PEM.alpha=PEM.alpha),
                    Normal=list(Normal.ab=Normal.ab),
                    DPM=list(DPM.ab=DPM.ab, aTau=aTau, bTau=bTau))
                    
###################
## MCMC SETTINGS ##
###################

## Setting for the overall run
##
numReps    <- 2000
thin       <- 10
burninPerc <- 0.5

## Settings for storage
##
storeV    <- TRUE

## Tuning parameters for specific updates
##
##  - those common to all models
mhProp_V_var     <- 0.05
##
## - those specific to the Weibull specification of the baseline hazard functions
mhProp_alpha_var <- 0.01
##
## - those specific to the PEM specification of the baseline hazard functions
C        <- 0.2
delPert  <- 0.5
rj.scheme <- 1
K_max    <- 50
s_max    <- max(survData$time[survData$event == 1])
time_lambda <- seq(1, s_max, 1)

##
mcmc.WB  <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
                    storage=list(storeV=storeV),
                    tuning=list(mhProp_alpha_var=mhProp_alpha_var, mhProp_V_var=mhProp_V_var))
##
mcmc.PEM <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
                    storage=list(storeV=storeV),
                    tuning=list(mhProp_V_var=mhProp_V_var, C=C, delPert=delPert,
                    rj.scheme=rj.scheme, K_max=K_max, time_lambda=time_lambda))

################################################################
## Analysis of Independent Univariate Survival Data ############
################################################################

#############
## WEIBULL ##
#############

##
myModel <- "Weibull"
myPath  <- "Output/01-Results-WB/"

startValues      <- initiate.startValues_HReg(form, survData, model=myModel, nChain=2)

##
fit_WB <- BayesSurv_HReg(form, survData, id=NULL, model=myModel, 
                  hyperParams, startValues, mcmc.WB, path=myPath)
                  
fit_WB
summ.fit_WB <- summary(fit_WB); names(summ.fit_WB)
summ.fit_WB
pred_WB <- predict(fit_WB, tseq=seq(from=0, to=30, by=5))
plot(pred_WB, plot.est="Haz")
plot(pred_WB, plot.est="Surv")

#########
## PEM ##
#########
                
##
myModel <- "PEM"
myPath  <- "Output/02-Results-PEM/"

startValues      <- initiate.startValues_HReg(form, survData, model=myModel, nChain=2)

##
fit_PEM <- BayesSurv_HReg(form, survData, id=NULL, model=myModel,
                   hyperParams, startValues, mcmc.PEM, path=myPath)
                   
fit_PEM
summ.fit_PEM <- summary(fit_PEM); names(summ.fit_PEM)
summ.fit_PEM
pred_PEM <- predict(fit_PEM)
plot(pred_PEM, plot.est="Haz")
plot(pred_PEM, plot.est="Surv")

###############################################################
## Analysis of Correlated Univariate Survival Data ############
###############################################################

####################
## WEIBULL-NORMAL ##
####################

##
myModel <- c("Weibull", "Normal")
myPath  <- "Output/03-Results-WB_Normal/"

startValues      <- initiate.startValues_HReg(form, survData, model=myModel, id, nChain=2)

##
fit_WB_N <- BayesSurv_HReg(form, survData, id, model=myModel,
                        hyperParams, startValues, mcmc.WB, path=myPath)
                        
fit_WB_N
summ.fit_WB_N <- summary(fit_WB_N); names(summ.fit_WB_N)
summ.fit_WB_N
pred_WB_N <- predict(fit_WB_N, tseq=seq(from=0, to=30, by=5))
plot(pred_WB_N, plot.est="Haz")
plot(pred_WB_N, plot.est="Surv")

#################
## WEIBULL-DPM ##
#################

##
myModel <- c("Weibull", "DPM")
myPath  <- "Output/04-Results-WB_DPM/"

startValues      <- initiate.startValues_HReg(form, survData, model=myModel, id, nChain=2)

##
fit_WB_DPM <- BayesSurv_HReg(form, survData, id, model=myModel,
                        hyperParams, startValues, mcmc.WB, path=myPath)

fit_WB_DPM
summ.fit_WB_DPM <- summary(fit_WB_DPM); names(summ.fit_WB_DPM)
summ.fit_WB_DPM
pred_WB_DPM <- predict(fit_WB_DPM, tseq=seq(from=0, to=30, by=5))
plot(pred_WB_DPM, plot.est="Haz")
plot(pred_WB_DPM, plot.est="Surv")

################
## PEM-NORMAL ##
################

##
myModel <- c("PEM", "Normal")
myPath  <- "Output/05-Results-PEM_Normal/"

startValues      <- initiate.startValues_HReg(form, survData, model=myModel, id, nChain=2)

##
fit_PEM_N <- BayesSurv_HReg(form, survData, id, model=myModel,
                            hyperParams, startValues, mcmc.PEM, path=myPath)

fit_PEM_N
summ.fit_PEM_N <- summary(fit_PEM_N); names(summ.fit_PEM_N)
summ.fit_PEM_N
pred_PEM_N <- predict(fit_PEM_N)
plot(pred_PEM_N, plot.est="Haz")
plot(pred_PEM_N, plot.est="Surv")

#############
## PEM-DPM ##
#############

##
myModel <- c("PEM", "DPM")
myPath  <- "Output/06-Results-PEM_DPM/"

startValues      <- initiate.startValues_HReg(form, survData, model=myModel, id, nChain=2)

##
fit_PEM_DPM <- BayesSurv_HReg(form, survData, id, model=myModel,
                        hyperParams, startValues, mcmc.PEM, path=myPath)
                        
fit_PEM_DPM
summ.fit_PEM_DPM <- summary(fit_PEM_DPM); names(summ.fit_PEM_DPM)
summ.fit_PEM_DPM
pred_PEM_DPM <- predict(fit_PEM_DPM)
plot(pred_PEM_DPM, plot.est="Haz")
plot(pred_PEM_DPM, plot.est="Surv")

## End(Not run)

Data on 137 Bone Marrow Transplant Patients

Description

Data on 137 Bone Marrow Transplant Patients

Usage

data(BMT)

Format

a data frame with 137 observations on the following 22 variables.

g

disease group; 1-ALL, 2-AML low-risk, 3-high-risk

T1

time (in days) to death or on study time

T2

disease-Free survival time (time to relapse, death or end of study)

delta1

death indicator; 1-Dead, 0-Alive

delta2

relapse indicator; 1-Relapsed, 0-Disease-Free

delta3

disease-Free survival indicator; 1-Dead or relapsed, 0-Alive disease-free

TA

time (in days) to acute graft-versus-host disease

deltaA

acute graft-versus-host disease indicator; 1-Developed acute graft-versus-host disease, 0-Never developed acute graft-versus-host disease

TC

time (in days) to chronic graft-versus-host disease

deltaC

chronic graft-versus-host disease indicator; 1-Developed chronic graft-versus-host disease, 0-Never developed chronic graft-versus-host disease

TP

time (in days) to return of platelets to normal levels

deltaP

platelet recovery indicator; 1-Platelets returned to normal levels, 0-Platelets never returned to normal levels

Z1

patient age in years

Z2

donor age in years

Z3

patient sex: 1-Male, 2-Female

Z4

donor sex: 1-Male, 2-Female

Z5

patient CMV status: 1-CMV positive, 0-CMV negative

Z6

donor CMV status: 1-CMV positive, 0-CMV negative

Z7

waiting time to transplant in days

Z8

FAB: 1-FAB Grade 4 or 5 and AML, 0-Otherwise

Z9

hospital: 1-The Ohio State University, 2-Alfred, 3-St. Vincent, 4-Hahnemann

Z10

MTX used as a graft-versus-host-prophylactic; 1-Yes, 0-No

Source

1. R package "KMsurv".
2. Klein, J. P. and Moeschberger M. L. (2005). Survival Analysis: Techniques for Censored and Truncated Data.

References

Klein, J. P. and Moeschberger M. L. (2005). Survival Analysis: Techniques for Censored and Truncated Data.

Examples

data(BMT)

Center for International Blood and Bone Marrow Transplant Research (CIBMTR) data

Description

We provide a dataset with five covariates from a study of acute graft-versus-host (GVHD) disease with 9651 patients who underwent first allogeneic hematopoietic cell transplant. We also provide an algorithm to simulate semi-competing risks outcome data.

Usage

data("CIBMTR")

Format

A data frame with 9651 observations on the following 5 variables.

sexP

patient sex: M-Male, F-Female

ageP

patient age: LessThan10, 10to19, 20to29, 30to39, 40to49, 50to59, 60plus

dType

disease type: AML-Acute Myeloid Leukemia, ALL-Acute Lymphoblastic Leukemia, CML-Chronic Myeloid Leukemia, MDS-Myelodysplastic Syndrome

dStatus

disease stage: Early-early, Int-intermediate, Adv-advanced

donorGrp

human leukocyte antigen compatibility: HLA_Id_Sib-identical sibling, 8_8-8/8, 7_8-7/8

Details

See Examples below for an algorithm to simulate semi-competing risks outcome data.

Source

Center for International Blood and Bone Marrow Transplant Research

References

Lee, C., Lee, S.J., Haneuse, S. (2017+). Time-to-event analysis when the event is defined on a finite time interval. under review.

See Also

CIBMTR_Params

Examples

data(CIBMTR_Params)
data(CIBMTR)

## CREATING DUMMY VARIABLES ##

# Sex (M: reference)
CIBMTR$sexP <- as.numeric(CIBMTR$sexP)-1

# Age (LessThan10: reference)
CIBMTR$ageP20to29 <- as.numeric(CIBMTR$ageP=="20to29")
CIBMTR$ageP30to39 <- as.numeric(CIBMTR$ageP=="30to39")
CIBMTR$ageP40to49 <- as.numeric(CIBMTR$ageP=="40to49")
CIBMTR$ageP50to59 <- as.numeric(CIBMTR$ageP=="50to59")
CIBMTR$ageP60plus <- as.numeric(CIBMTR$ageP=="60plus")

# Disease type (AML: reference)
CIBMTR$dTypeALL <- as.numeric(CIBMTR$dType=="ALL")
CIBMTR$dTypeCML <- as.numeric(CIBMTR$dType=="CML")
CIBMTR$dTypeMDS <- as.numeric(CIBMTR$dType=="MDS")

# Disease status (Early: reference)
CIBMTR$dStatusInt <- as.numeric(CIBMTR$dStatus=="Int")
CIBMTR$dStatusAdv <- as.numeric(CIBMTR$dStatus=="Adv")

# HLA compatibility (HLA_Id_Sib: reference)
CIBMTR$donorGrp8_8 <- as.numeric(CIBMTR$donorGrp=="8_8")
CIBMTR$donorGrp7_8 <- as.numeric(CIBMTR$donorGrp=="7_8")

# Covariate matrix
x <- CIBMTR[,c("sexP","ageP20to29","ageP30to39","ageP40to49","ageP50to59","ageP60plus",
"dTypeALL","dTypeCML","dTypeMDS","dStatusInt","dStatusAdv","donorGrp8_8","donorGrp7_8")]

# Set the parameter values
beta1 <- CIBMTR_Params$beta1.true
beta2 <- CIBMTR_Params$beta2.true
beta3 <- CIBMTR_Params$beta3.true
alpha1 <- CIBMTR_Params$alpha1.true
alpha2 <- CIBMTR_Params$alpha2.true
alpha3 <- CIBMTR_Params$alpha3.true
kappa1 <- CIBMTR_Params$kappa1.true
kappa2 <- CIBMTR_Params$kappa2.true
kappa3 <- CIBMTR_Params$kappa3.true
theta <- CIBMTR_Params$theta.true

set.seed(1405)
simCIBMTR <- simID(id=NULL, x, x, x, beta1, beta2, beta3, alpha1, alpha2, alpha3,
          kappa1, kappa2, kappa3, theta, SigmaV.true=NULL, cens=c(365,365))
          
names(simCIBMTR) <- c("time1", "event1", "time2", "event2")
CIBMTR <- cbind(simCIBMTR, CIBMTR) 
head(CIBMTR)

Estimates for model parameters from semi-competing risks analysis of the CIBMTR data using Weibull illness-death model.

Description

Estimates for model parameters from semi-competing risks analysis of the CIBMTR data using Weibull illness-death model.

Usage

data("CIBMTR_Params")

Format

The format is a list of 10 components corresponding to parameters for Weibull illness-death model.

See Also

CIBMTR

Examples

data(CIBMTR_Params)

The function to fit parametric Weibull models for the frequentist anlaysis of semi-competing risks data.

Description

Independent semi-competing risks data can be analyzed using hierarchical models. Markov or semi-Markov assumption can be adopted for the conditional hazard function for time to the terminal event given time to non-terminal event.

Usage

FreqID_HReg(Formula, data, model="semi-Markov", frailty = TRUE, na.action = "na.fail",
subset=NULL)

Arguments

Formula

a Formula object, with the outcome on the left of a \sim, and covariates on the right. It is of the form, time to non-terminal event + corresponding censoring indicator | time to terminal event + corresponding censoring indicator \sim covariates for h1h_1 | covariates for h2h_2 | covariates for h3h_3: i.e., y1y_1+δ1\delta_1 | y2y_2+δ2\delta_2 ~ x1x_1 | x2x_2 | x3x_3.

data

a data.frame in which to interpret the variables named in Formula.

model

a character value that specifies the type of a model based on the assumption on h3h_3: "semi-Markov" or "Markov".

frailty

a logical value to determine whether to include the subject-specific shared frailty term, γ\gamma, into the model.

na.action

how NAs are treated. See model.frame.

subset

a specification of the rows to be used: defaults to all rows. See model.frame.

Details

See BayesID_HReg for a detailed description of the models.

Value

FreqID_HReg returns an object of class Freq_HReg.

Author(s)

Sebastien Haneuse and Kyu Ha Lee
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2015), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, 64, 2, 253-273.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

print.Freq_HReg, summary.Freq_HReg, predict.Freq_HReg, BayesID_HReg.

Examples

## Not run: 
# loading a data set
data(scrData)

form <- Formula(time1 + event1 | time2 + event2 ~ x1 + x2 + x3 | x1 + x2 | x1 + x2)

fit_WB	<- FreqID_HReg(form, data=scrData, model="semi-Markov")

fit_WB
summ.fit_WB <- summary(fit_WB); names(summ.fit_WB)
summ.fit_WB
pred_WB <- predict(fit_WB, tseq=seq(from=0, to=30, by=5))
plot(pred_WB, plot.est="Haz")
plot(pred_WB, plot.est="Surv")

## End(Not run)

The function to fit parametric Weibull models for the frequentist analysis of univariate survival data.

Description

Independent univariate right-censored survival data can be analyzed using hierarchical models.

Usage

FreqSurv_HReg(Formula, data, na.action = "na.fail", subset=NULL)

Arguments

Formula

a Formula object, with the outcome on the left of a \sim, and covariates on the right. It is of the form, time to event + censoring indicator \sim covariates: i.e., yy+δ\delta ~ xx.

data

a data.frame in which to interpret the variables named in Formula.

na.action

how NAs are treated. See model.frame.

subset

a specification of the rows to be used: defaults to all rows. See model.frame.

Details

See BayesSurv_HReg for a detailed description of the models.

Value

FreqSurv_HReg returns an object of class Freq_HReg.

Author(s)

Sebastien Haneuse and Kyu Ha Lee
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2015), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, 64, 2, 253-273.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

print.Freq_HReg, summary.Freq_HReg, predict.Freq_HReg, BayesSurv_HReg.

Examples

## Not run: 	
# loading a data set	
data(survData)

form <- Formula(time + event ~ cov1 + cov2)

fit_WB <- FreqSurv_HReg(form, data=survData)
fit_WB
summ.fit_WB <- summary(fit_WB); names(summ.fit_WB)
summ.fit_WB
pred_WB <- predict(fit_WB, tseq=seq(from=0, to=30, by=5))
plot(pred_WB, plot.est="Haz")
plot(pred_WB, plot.est="Surv")

## End(Not run)

The function that initiates starting values for a single chain.

Description

The function initiates starting values for a single chain for accelrated failture time (AFT) models. Users are allowed to set some non-null values to starting values for a set of parameters. The function will automatically generate starting values for any parameters whose values are not specified.

Usage

initiate.startValues_AFT(Formula, data, model, nChain=1,
                            beta1=NULL, beta2=NULL, beta3=NULL, beta=NULL,
                            gamma=NULL, theta=NULL,
                            y1=NULL, y2=NULL, y=NULL,
                            LN.mu=NULL, LN.sigSq=NULL,
                            DPM.class1=NULL, DPM.class2=NULL, DPM.class3=NULL,
                            DPM.class=NULL, DPM.mu1=NULL, DPM.mu2=NULL,
                            DPM.mu3=NULL, DPM.mu=NULL, DPM.zeta1=NULL,
                            DPM.zeta2=NULL, DPM.zeta3=NULL, DPM.zeta=NULL,
                            DPM.tau=NULL)

Arguments

Formula

For BayesID_AFT, it is a data.frame containing semi-competing risks outcomes from n subjects. See BayesID_AFT. For BayesSurv_AFT, it is a data.frame containing univariate time-to-event outcomes from n subjects. See BayesSurv_AFT. For BayesID_AFT, it is a list containing three formula objects that correspond to the transition gg=1,2,3. For BayesSurv_AFT, it is a formula object that corresponds to log(t)log(t).

data

a data.frame in which to interpret the variables named in the formula(s) in lin.pred.

model

a character vector that specifies the type of components in a model. Check BayesID_AFT and BayesSurv_AFT.

nChain

The number of chains.

beta1

starting values of β1\beta_1 for BayesID_AFT.

beta2

starting values of β2\beta_2 for BayesID_AFT.

beta3

starting values of β3\beta_3 for BayesID_AFT.

beta

starting values of β\beta for BayesSurv_AFT.

gamma

starting values of γ\gamma for BayesID_AFT.

theta

starting values of θ\theta for BayesID_AFT.

y1

starting values of log(t1)log(t_1) for BayesID_AFT.

y2

starting values of log(t2)log(t_2) for BayesID_AFT.

y

starting values of log(t)log(t) for BayesSurv_AFT.

LN.mu

starting values of β0\beta_0 in logNormal models for BayesID_AFT and BayesSurv_AFT.

LN.sigSq

starting values of σ2\sigma^2 in logNormal models for BayesID_AFT and BayesSurv_AFT.

DPM.class1

starting values of the class membership for transition 1 in DPM models for BayesID_AFT.

DPM.class2

starting values of the class membership for transition 2 in DPM models for BayesID_AFT.

DPM.class3

starting values of the class membership for transition 3 in DPM models for BayesID_AFT.

DPM.class

starting values of the class membership in DPM models for BayesSurv_AFT.

DPM.mu1

starting values of μ1\mu_1 in DPM models for BayesID_AFT.

DPM.mu2

starting values of μ2\mu_2 in DPM models for BayesID_AFT.

DPM.mu3

starting values of μ3\mu_3 in DPM models for BayesID_AFT.

DPM.mu

starting values of μ\mu in DPM models for BayesSurv_AFT.

DPM.zeta1

starting values of ζ1\zeta_{1} in DPM models for BayesID_AFT.

DPM.zeta2

starting values of ζ2\zeta_{2} in DPM models for BayesID_AFT.

DPM.zeta3

starting values of ζ3\zeta_{3} in DPM models for BayesID_AFT.

DPM.zeta

starting values of ζ\zeta in DPM models for BayesSurv_AFT.

DPM.tau

starting values of τ\tau in DPM models for BayesID_AFT and BayesSurv_AFT.

Value

initiate.startValues_AFT returns a list containing starting values for a sigle chain that can be used for BayesID_AFT and BayesSurv_AFT.

Author(s)

Sebastien Haneuse and Kyu Ha Lee
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Rondeau, V., and Haneuse, S. (2017), Accelerated failure time models for semicompeting risks data in the presence of complex censoring, Biometrics, 73, 4, 1401-1412.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

BayesID_AFT, BayesSurv_AFT

Examples

## See Examples in \code{\link{BayesID_AFT}} and \code{\link{BayesSurv_AFT}}.

The function that initiates starting values for a single chain.

Description

The function initiates starting values for a single chain for hazard regression (HReg) models. Users are allowed to set some non-null values to starting values for a set of parameters. The function will automatically generate starting values for any parameters whose values are not specified.

Usage

initiate.startValues_HReg(Formula, data, model, id = NULL, nChain=1,
                   beta1 = NULL, beta2 = NULL, beta3 = NULL, beta = NULL,
                   gamma.ji = NULL, theta = NULL,
                   V.j1 = NULL, V.j2 = NULL, V.j3 = NULL, V.j = NULL,
                   WB.alpha = NULL, WB.kappa = NULL, 
                   PEM.lambda1=NULL, PEM.lambda2=NULL, PEM.lambda3=NULL, PEM.lambda=NULL,
                   PEM.s1=NULL, PEM.s2=NULL, PEM.s3=NULL, PEM.s=NULL,
                   PEM.mu_lam=NULL, PEM.sigSq_lam=NULL,
                   MVN.SigmaV = NULL, Normal.zeta = NULL, 
                   DPM.class = NULL, DPM.tau = NULL)

Arguments

Formula

For BayesID_HReg, it is a data.frame containing semi-competing risks outcomes from n subjects. For BayesSurv_HReg, it is a data.frame containing univariate time-to-event outcomes from n subjects. For BayesID_HReg, it is a list containing three formula objects that correspond to hg()h_g(), gg=1,2,3. For BayesSurv_HReg, it is a formula object that corresponds to h()h().

data

a data.frame in which to interpret the variables named in the formula(s) in lin.pred.

model

a character vector that specifies the type of components in a model. Check BayesID_HReg and BayesSurv_HReg.

id

a vector of cluster information for n subjects. The cluster membership must be set to consecutive positive integers, 1:J1:J.

nChain

The number of chains.

beta1

starting values of β1\beta_1 for BayesID_HReg.

beta2

starting values of β2\beta_2 for BayesID_HReg.

beta3

starting values of β3\beta_3 for BayesID_HReg.

beta

starting values of β\beta for BayesSurv_HReg.

gamma.ji

starting values of γ\gamma for BayesID_HReg.

theta

starting values of θ\theta for BayesID_HReg.

V.j1

starting values of Vj1V_{j1} for BayesID_HReg.

V.j2

starting values of Vj2V_{j2} for BayesID_HReg.

V.j3

starting values of Vj3V_{j3} for BayesID_HReg.

V.j

starting values of VjV_{j} for BayesSurv_HReg.

WB.alpha

starting values of the Weibull parameters, αg\alpha_g for BayesID_HReg. starting values of the Weibull parameter, α\alpha for BayesSurv_HReg.

WB.kappa

starting values of the Weibull parameters, κg\kappa_g for BayesID_HReg. starting values of the Weibull parameter, κ\kappa for BayesSurv_HReg.

PEM.lambda1

starting values of the PEM parameters, λ1\lambda_1 for BayesID_HReg.

PEM.lambda2

starting values of the PEM parameters, λ2\lambda_2 for BayesID_HReg.

PEM.lambda3

starting values of the PEM parameters, λ3\lambda_3 for BayesID_HReg.

PEM.lambda

starting values of λ\lambda for BayesSurv_HReg.

PEM.s1

starting values of the PEM parameters, s1s_1 for BayesID_HReg.

PEM.s2

starting values of the PEM parameters, s2s_2 for BayesID_HReg.

PEM.s3

starting values of the PEM parameters, s3s_3 for BayesID_HReg.

PEM.s

starting values of ss for BayesSurv_HReg.

PEM.mu_lam

starting values of the PEM parameters, μλ,g\mu_{\lambda,g} for BayesID_HReg. starting values of the PEM parameter, μλ\mu_{\lambda} for BayesSurv_HReg.

PEM.sigSq_lam

starting values of the PEM parameters, σλ,g2\sigma_{\lambda,g}^2 for BayesID_HReg. starting values of the PEM parameter, σλ2\sigma_{\lambda}^2 for BayesSurv_HReg.

MVN.SigmaV

starting values of ΣV\Sigma_V in DPM models for BayesID_HReg.

Normal.zeta

starting values of ζ\zeta in DPM models for BayesSurv_HReg.

DPM.class

starting values of the class membership in DPM models for BayesID_HReg and BayesSurv_HReg.

DPM.tau

starting values of τ\tau in DPM models for BayesID_HReg and BayesSurv_HReg.

Value

initiate.startValues_HReg returns a list containing starting values for a sigle chain that can be used for BayesID_HReg and BayesSurv_HReg.

Author(s)

Sebastien Haneuse and Kyu Ha Lee
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2015), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, 64, 2, 253-273.

Lee, K. H., Dominici, F., Schrag, D., and Haneuse, S. (2016), Hierarchical models for semicompeting risks data with application to quality of end-of-life care for pancreatic cancer, Journal of the American Statistical Association, 111, 515, 1075-1095.

Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019), SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.

See Also

BayesID_HReg, BayesSurv_HReg

Examples

## See Examples in \code{\link{BayesID_HReg}} and \code{\link{BayesSurv_HReg}}.

Methods for objects of classes, Bayes_HReg/Bayes_AFT/Freq_HReg.

Description

The Bayes_HReg class represents results from Bayesian analysis of semi-competing risks or univariate time-to-event data in the context of hazard regression models.
The Bayes_AFT class represents results from Bayesian analysis of semi-competing risks or univariate time-to-event data in the context of AFT models.
The Freq_HReg class represents results from Frequentist analysis of semi-competing risks or univariate time-to-event data in the context of hazard regression models.

Usage

## S3 method for class 'Bayes_HReg'
print(x, digits=3, alpha=0.05, ...)
## S3 method for class 'Bayes_AFT'
print(x, digits=3, alpha=0.05, ...)
## S3 method for class 'summ.Bayes_HReg'
print(x, digits=3, ...)
## S3 method for class 'summ.Bayes_AFT'
print(x, digits=3, ...)
## S3 method for class 'Freq_HReg'
print(x, digits=3, alpha=0.05, ...)
## S3 method for class 'summ.Freq_HReg'
print(x, digits=3, ...)
## S3 method for class 'Bayes_HReg'
summary(object, digits=3, alpha=0.05, ...)
## S3 method for class 'Bayes_AFT'
summary(object, digits=3, alpha=0.05, ...)
## S3 method for class 'Freq_HReg'
summary(object, digits=3, alpha=0.05, ...)
## S3 method for class 'Bayes_HReg'
coef(object, alpha=0.05, ...)
## S3 method for class 'Bayes_AFT'
coef(object, alpha=0.05, ...)
## S3 method for class 'Freq_HReg'
coef(object, alpha=0.05, ...)
## S3 method for class 'Bayes_HReg'
predict(object, xnew=NULL, x1new=NULL, x2new=NULL,
x3new=NULL, tseq=c(0, 5, 10), alpha=0.05, ...)
## S3 method for class 'pred.Bayes_HReg'
plot(x, plot.est="Haz", xlab=NULL, ylab=NULL, ...)
## S3 method for class 'Bayes_AFT'
predict(object, xnew=NULL, x1new=NULL, x2new=NULL,
x3new=NULL, time, tseq=c(0, 5, 10), alpha=0.05, ...)
## S3 method for class 'pred.Bayes_AFT'
plot(x, plot.est="Haz", xlab=NULL, ylab=NULL, ...)
## S3 method for class 'Freq_HReg'
predict(object, xnew=NULL, x1new=NULL, x2new=NULL,
x3new=NULL, tseq=c(0, 5, 10), alpha=0.05, ...)
## S3 method for class 'pred.Freq_HReg'
plot(x, plot.est="Haz", xlab=NULL, ylab=NULL, ...)
## S3 method for class 'Freq_HReg'
vcov(object, ...)

Arguments

x

an object of class Bayes_HReg or Bayes_AFT or Freq_HReg.

digits

a numeric value indicating the number of digits to display.

object

an object of class Bayes_HReg or Bayes_AFT orFreq_HReg.

time

the points at which the baseline survival/hazard functions are evaluated.

tseq

the points at which tick-marks are to be drawn. Required only if the object x is returned by parametric Weibull-HReg/log-Normal-AFT/DPM-AFT models.

plot.est

used only if plot is TRUE. If Surv (the default) then estimated survival functions are produced. If Haz then estimated hazard functions are produced.

xlab

a title for the x axis.

ylab

a title for the y axis.

xnew

a vector of covariate values with which to predict for which to predict for hh.

x1new

a vector of covariate values with which to predict for which to predict for h1h_1.

x2new

a vector of covariate values with which to predict for which to predict for h2h_2.

x3new

a vector of covariate values with which to predict for which to predict for h3h_3.

alpha

confidence/credibility level of the interval.

...

additional arguments.

See Also

BayesID_HReg, BayesID_AFT, BayesSurv_HReg, BayesSurv_AFT, FreqID_HReg, FreqSurv_HReg.


Old functions

Description

Since version 2.5, the functions BayesID(), BayesSurv(), FreqID(), FreqSurv(), initiate.startValues() have been renamed as BayesID_HReg(), BayesSurv_HReg(), FreqID_HReg(), FreqSurv_HReg(), initiate.startValues_HReg(), respectively. If one of the old functions is called, a warning message will be displayed with the corresponding new function name.

Usage

BayesID(...)
BayesSurv(...)
FreqID(...)
FreqSurv(...)
initiate.startValues(...)

Arguments

...

arguments used for the old functions.


Function to predict the joint probability involving two event times in Bayesian illness-death models

Description

PPD is a function to predict the joint probability involving two event times in Bayesian illness-death models.

Usage

PPD(fit, x1, x2, x3, t1, t2)

Arguments

fit

an object of class Bayes_HReg. Currently, the function is available for PEM illness-death models.

x1

a vector of covariates for h1h_1 with which to predict.

x2

a vector of covariates for h2h_2 with which to predict.

x3

a vector of covariates for h3h_3 with which to predict.

t1

time to non-terminal event for which the joint probability is calculated.

t2

time to terminal event for which the joint probability is calculated.

Details

Using the posterior predictive density, given (x1x_1, x2x_2, x3x_3), one can predict any joint probability involving the two event times such as P(T1<t1,T2<t2x1,x2,x3)P(T_1<t_1, T_2<t_2| x_1, x_2, x_3) for 0<t1t20<t_1\le t_2 and P(T1=,T2<t2x1,x2,x3)P(T_1=\infty, T_2<t_2| x_1, x_2, x_3) for t2>0t_2>0.

Value

F_u

Predicted P(T1t1,T2t2x1,x2,x3)P(T_1\le t_1, T_2\le t_2| x_1, x_2, x_3) in the upper wedge of the support of (T1,T2)(T_1, T_2).

F_l

Predicted P(T1=,T2t2x1,x2,x3)P(T_1=\infty, T_2\le t_2| x_1, x_2, x_3) in the lower wedge of the support of (t1,t2)(t1, t2).

Author(s)

Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

References

Lee, K. H., Haneuse, S., Schrag, D., and Dominici, F. (2015), Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis, Journal of the Royal Statistical Society: Series C, 64, 2, 253-273.

See Also

BayesID_HReg

Examples

## Not run:    
# loading a data set
data(scrData)
id=scrData$cluster

form <- Formula(time1 + event1 | time2 + event2 ~ x1 + x2 | x1 + x2 | x1 + x2)

#####################
## Hyperparameters ##
#####################

## Subject-specific frailty variance component
##  - prior parameters for 1/theta
##
theta.ab <- c(0.7, 0.7)

## PEM baseline hazard function
##
PEM.ab1 <- c(0.7, 0.7) # prior parameters for 1/sigma_1^2
PEM.ab2 <- c(0.7, 0.7) # prior parameters for 1/sigma_2^2
PEM.ab3 <- c(0.7, 0.7) # prior parameters for 1/sigma_3^2
##
PEM.alpha1 <- 10 # prior parameters for K1
PEM.alpha2 <- 10 # prior parameters for K2
PEM.alpha3 <- 10 # prior parameters for K3

##
hyperParams <- list(theta=theta.ab,
                   PEM=list(PEM.ab1=PEM.ab1, PEM.ab2=PEM.ab2, PEM.ab3=PEM.ab3,
                    PEM.alpha1=PEM.alpha1, PEM.alpha2=PEM.alpha2,
                    PEM.alpha3=PEM.alpha3))
                    
###################
## MCMC SETTINGS ##
###################

## Setting for the overall run
##
numReps    <- 2000
thin       <- 10
burninPerc <- 0.5

## Settings for storage
##
nGam_save <- 0

## Tuning parameters for specific updates
##
##  - those common to all models
mhProp_theta_var  <- 0.05
##
## - those specific to the Weibull specification of the baseline hazard functions
mhProp_alphag_var <- c(0.01, 0.01, 0.01)
##
## - those specific to the PEM specification of the baseline hazard functions
Cg        <- c(0.2, 0.2, 0.2)
delPertg  <- c(0.5, 0.5, 0.5)
rj.scheme <- 1
Kg_max    <- c(50, 50, 50)
sg_max    <- c(max(scrData$time1[scrData$event1 == 1]),
               max(scrData$time2[scrData$event1 == 0 & scrData$event2 == 1]),
               max(scrData$time2[scrData$event1 == 1 & scrData$event2 == 1]))

time_lambda1 <- seq(1, sg_max[1], 1)
time_lambda2 <- seq(1, sg_max[2], 1)
time_lambda3 <- seq(1, sg_max[3], 1)               

##
mcmc.PEM <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
                storage=list(nGam_save=nGam_save),
                tuning=list(mhProp_theta_var=mhProp_theta_var,
                Cg=Cg, delPertg=delPertg,
                rj.scheme=rj.scheme, Kg_max=Kg_max,
                time_lambda1=time_lambda1, time_lambda2=time_lambda2,
                time_lambda3=time_lambda3))
    
##						
myModel <- c("semi-Markov", "PEM")
myPath  <- "Output/02-Results-PEM/"

startValues      <- initiate.startValues_HReg(form, scrData, model=myModel, nChain=2)

##
fit_PEM <- BayesID_HReg(form, scrData, id=NULL, model=myModel,
                 hyperParams, startValues, mcmc.PEM, path=myPath)
				
PPD(fit_PEM, x1=c(1,1), x2=c(1,1), x3=c(1,1), t1=3, t2=6)
		

## End(Not run)

A simulated clustered semi-competing risks data set

Description

Simulated semi-competing risks data

Usage

data(scrData)

Format

a data frame with 2000 observations on the following 14 variables.

time1

the time to non-terminal event

event1

the censoring indicators for the non-terminal event time; 1=event observed, 0=censored/truncated

time2

the time to terminal event

event2

the censoring indicators for the terminal event time; 1=event observed, 0=censored

cluster

cluster numbers

x1

a vector of continuous covarate

x2

a vector of continuous covarate

x3

a vector of continuous covarate

Examples

data(scrData)

The function that simulates independent/cluster-correlated semi-competing risks data under semi-Markov Weibull/Weibull-MVN models.

Description

The function to simulate independent/cluster-correlated semi-competing risks data under semi-Markov Weibull/Weibull-MVN models.

Usage

simID(id=NULL, x1, x2, x3, beta1.true, beta2.true, beta3.true,
			alpha1.true, alpha2.true, alpha3.true, 
			kappa1.true, kappa2.true, kappa3.true, 
			theta.true, SigmaV.true=NULL, cens)

Arguments

id

a vector of cluster information for n subjects. The cluster membership must be set to consecutive positive integers, 1:J1:J. Required only when generating clustered data.

x1

covariate matrix, n observations by p1 variables.

x2

covariate matrix, n observations by p2 variables.

x3

covariate matrix, n observations by p3 variables.

beta1.true

true value for β1\beta_1.

beta2.true

true value for β2\beta_2.

beta3.true

true value for β3\beta_3.

alpha1.true

true value for α1\alpha_1.

alpha2.true

true value for α2\alpha_2.

alpha3.true

true value for α3\alpha_3.

kappa1.true

true value for κ1\kappa_1.

kappa2.true

true value for κ2\kappa_2.

kappa3.true

true value for κ3\kappa_3.

theta.true

true value for θ\theta.

SigmaV.true

true value for ΣV\Sigma_V. Required only when generating clustered data.

cens

a vector with two numeric elements. The right censoring times are generated from Uniform(cens[1]cens[1], cens[2]cens[2]).

Value

simIDcor returns a data.frame containing semi-competing risks outcomes from n subjects. It is of dimension n×4n\times 4: the columns correspond to y1y_1, δ1\delta_1, y2y_2, δ2\delta_2.

y1

a vector of n times to the non-terminal event

y2

a vector of n times to the terminal event

delta1

a vector of n censoring indicators for the non-terminal event time (1=event occurred, 0=censored)

delta2

a vector of n censoring indicators for the terminal event time (1=event occurred, 0=censored)

Author(s)

Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

Examples

library(MASS)
	set.seed(123456)
	
	J = 110
	nj = 50
	n = J * nj

	id <- rep(1:J, each = nj)

	kappa1.true <- 0.05
	kappa2.true <- 0.01
	kappa3.true <- 0.01
	alpha1.true <- 0.8
	alpha2.true <- 1.1
	alpha3.true <- 0.9
	beta1.true <- c(0.5, 0.8, -0.5)
	beta2.true <- c(0.5, 0.8, -0.5)
	beta3.true <- c(1, 1, -1)
	SigmaV.true <- matrix(0.25,3,3)
	theta.true <- 0.5
	cens <- c(90, 90)

	cov1 <- matrix(rnorm((length(beta1.true)-1)*n, 0, 1), n, length(beta1.true)-1)
	cov2 <- sample(c(0, 1), n, replace = TRUE)
	x1 <- as.data.frame(cbind(cov1, cov2))
	x2 <- as.data.frame(cbind(cov1, cov2))
	x3 <- as.data.frame(cbind(cov1, cov2))
	
	simData <- simID(id, x1, x2, x3, beta1.true, beta2.true, beta3.true, 
				alpha1.true, alpha2.true, alpha3.true, 
				kappa1.true, kappa2.true, kappa3.true, 
				theta.true, SigmaV.true, cens)

The function that simulates independent/cluster-correlated right-censored survival data under Weibull/Weibull-Normal model.

Description

The function to simulate independent/cluster-correlated right-censored survival data under Weibull/Weibull-Normal model.

Usage

simSurv(id=NULL, x, beta.true, alpha.true, kappa.true, sigmaV.true=NULL, cens)

Arguments

id

a vector of cluster information for n subjects. The cluster membership must be set to consecutive positive integers, 1:J1:J. Required only when generating clustered data.

x

covariate matrix, n observations by p variables.

beta.true

true value for β\beta.

alpha.true

true value for α\alpha.

kappa.true

true value for κ\kappa.

sigmaV.true

true value for σV\sigma_V. Required only when generating clustered data.

cens

a vector with two numeric elements. The right censoring times are generated from Uniform(cens[1]cens[1], cens[2]cens[2]).

Value

simSurv returns a data.frame containing univariate time-to-event outcomes from n subjects. It is of dimension n×2n\times 2: the columns correspond to yy, δ\delta.

y

a vector of n times to the event

delta

a vector of n censoring indicators for the event time (1=event occurred, 0=censored)

Author(s)

Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <[email protected]>

Examples

set.seed(123456)
	
	J = 110
	nj = 50
	n = J * nj

	id <- rep(1:J, each = nj)

	x	= matrix(0, n, 2)	
	x[,1]	= rnorm(n, 0, 2)	
	x[,2]	= sample(c(0, 1), n, replace = TRUE)

	beta.true = c(0.5, 0.5)
	
	alpha.true = 1.5		
	kappa.true = 0.02
	sigmaV.true = 0.1

	cens <- c(30, 40)		

	simData <- simSurv(id, x, beta.true, alpha.true, kappa.true, 
				sigmaV.true, cens)

A simulated clustered univariate survival data.

Description

Simulated univariate survival data.

Usage

data(survData)

Format

a data frame with 2000 observations on the following 4 variables.

time

the time to event

event

the censoring indicators for the event time; 1=event observed, 0=censored

cluster

cluster numbers

cov1

the first column of covariate matrix x

cov2

the second column of covariate matrix x

Examples

data(scrData)