Title: | Random Generation of Survival Data |
---|---|
Description: | Random generation of survival data from a wide range of regression models, including accelerated failure time (AFT), proportional hazards (PH), proportional odds (PO), accelerated hazard (AH), Yang and Prentice (YP), and extended hazard (EH) models. The package 'rsurv' also stands out by its ability to generate survival data from an unlimited number of baseline distributions provided that an implementation of the quantile function of the chosen baseline distribution is available in R. Another nice feature of the package 'rsurv' lies in the fact that linear predictors are specified via a formula-based approach, facilitating the inclusion of categorical variables and interaction terms. The functions implemented in the package 'rsurv' can also be employed to simulate survival data with more complex structures, such as survival data with different types of censoring mechanisms, survival data with cure fraction, survival data with random effects (frailties), multivariate survival data, and competing risks survival data. Details about the R package 'rsurv' can be found in Demarqui (2024) <doi:10.48550/arXiv.2406.01750>. |
Authors: | Fabio Demarqui [aut, cre, cph] |
Maintainer: | Fabio Demarqui <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.0.2 |
Built: | 2024-10-26 03:40:03 UTC |
Source: | CRAN |
Random generation of survival data based on survival regression models available in the literature, including the accelerated failure time (AFT), proportional hazards (PH), proportional odds (PO), accelerated hazard (AH), Yang and Prentice (YP), and extended hazard (EH) models.
Demarqui FN, Mayrink VD (2021). “Yang and Prentice model with piecewise exponential baseline distribution for modeling lifetime data with crossing survival curves.” Brazilian Journal of Probability and Statistics, 35(1), 172-186. doi:10.1214/20-BJPS471.
Yang S, Prentice RL (2005). “Semiparametric analysis of short-term and long-term hazard ratios with two-sample survival data.” Biometrika, 92(1), 1-17.
This function is used to specify different link functions for the count component of the mixture cure rate model.
bernoulli(link = "logit")
link |
desired link function; currently implemented links are: logit, probit, cloglog and cauchy. |
A list containing the codes associated with the count distribution assumed for the latent variable N and the chosen link.
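As a rough illustration of what the listed links do, the sketch below uses base R's make.link() rather than rsurv itself to map a linear predictor onto the probability scale. Two assumptions are worth flagging: base R names the Cauchy link "cauchit" (the name "cauchy" above is rsurv's), and the values in eta are made up for the example.

```r
# Hypothetical linear predictor values on the real line
eta <- c(-2, 0, 2)

# Base R link names; "cauchit" is base R's name for the Cauchy link
links <- c("logit", "probit", "cloglog", "cauchit")

# Apply each inverse link to map eta to probabilities in (0, 1)
probs <- sapply(links, function(l) make.link(l)$linkinv(eta))
rownames(probs) <- paste0("eta=", eta)
round(probs, 3)
```

Each column shows how the same linear predictor translates into a different probability depending on the link chosen.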
This function evaluates the inverse of the probability generating function of the latent count variable assumed by the incidence sub-model of a cure rate model.
inv_pgf(formula, incidence = "bernoulli", kappa = NULL, zeta = NULL, data, ...)
formula |
formula specifying the linear predictor for the incidence sub-model. |
incidence |
the desired incidence model. |
kappa |
vector of regression coefficients associated with the incidence sub-model. |
zeta |
extra negative-binomial parameter. |
data |
a data.frame containing the explanatory covariates passed to the formula. |
... |
further arguments passed to other methods. |
A vector with the values of the inverse of the desired probability generating function.
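To see why the inverse of a probability generating function is useful for simulation, consider a minimal base R sketch for the Bernoulli (mixture cure) case. It does not call rsurv; the helper names pgf_bern and inv_pgf_bern, the value of theta, and the unit-rate exponential baseline are all assumptions made for illustration. The population survival function is A(S0(t)), where A is the pgf of N, so a uniform draw u at or below A(0) corresponds to a cured subject, and otherwise the event time is the baseline right-tail quantile at the inverse pgf of u.

```r
set.seed(1)
n <- 1000
theta <- 0.7  # hypothetical P(N = 1), the probability of being susceptible

# pgf of a Bernoulli(theta) count and its inverse (illustrative helpers)
pgf_bern     <- function(s, theta) 1 - theta + theta * s
inv_pgf_bern <- function(y, theta) (y - (1 - theta)) / theta

u <- runif(n)

# A(0) = P(N = 0) is the cure fraction: draws at or below it never fail
cured <- u <= pgf_bern(0, theta)

time <- rep(Inf, n)
s <- inv_pgf_bern(u[!cured], theta)  # baseline survival probability
time[!cured] <- qexp(s, rate = 1, lower.tail = FALSE)

mean(cured)  # empirical cure fraction, close to 1 - theta
```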
Function to construct linear predictors.
lp(formula, coefs, data, ...)
formula |
formula specifying the linear predictors. |
coefs |
vector of regression coefficients. |
data |
data frame containing the covariates used to construct the linear predictors. |
... |
further arguments passed to other methods. |
a vector containing the linear predictors.
library(rsurv)
library(dplyr)

n <- 100
coefs <- c(1, 0.7, 2.3)

# Simulate covariates, then attach the linear predictor as a new column
simdata <- data.frame(
  age = rnorm(n),
  sex = sample(c("male", "female"), size = n, replace = TRUE)
) |>
  mutate(
    lp = lp(~ age + sex, coefs)
  )
glimpse(simdata)
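For readers who want to see what a formula-based linear predictor amounts to, here is a base R sketch built on model.matrix(). It is not the rsurv implementation: it assumes that the three coefficients correspond to the intercept, age, and the dummy variable for sex = "male", which is how model.matrix() expands the formula under default treatment contrasts.

```r
set.seed(1)
n <- 5
dat <- data.frame(
  age = rnorm(n),
  sex = factor(sample(c("female", "male"), size = n, replace = TRUE),
               levels = c("female", "male"))
)
coefs <- c(1, 0.7, 2.3)  # assumed order: intercept, age, sexmale

# Expand the formula into a design matrix (the factor becomes a dummy column)
X <- model.matrix(~ age + sex, data = dat)

# The linear predictor is the design matrix times the coefficient vector
eta <- drop(X %*% coefs)
cbind(dat, eta)
```

Interaction terms such as ~ age * sex would add further columns to X, which is why the length of the coefficient vector has to match the expanded design.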
This function is used to specify different link functions for the count component of the promotion time cure rate model.
negbin(zeta = stop("'theta' must be specified"), link = "log")
zeta |
The known value of the additional parameter. |
link |
desired link function; currently implemented links are: log, identity and sqrt. |
A list containing the codes associated with the count distribution assumed for the latent variable N and the chosen link.
Generic quantile function used internally to simulate from an arbitrary baseline survival distribution.
qsurv(p, baseline, package = NULL, ...)
p |
vector of probabilities giving the right-tail areas of the baseline survival distribution at which the quantiles are evaluated. |
baseline |
the name of the baseline distribution. |
package |
the name of the package where the baseline distribution is implemented. It ensures that the right quantile function from the right package is found, regardless of the current R search path. |
... |
further arguments passed to other methods. |
a vector of quantiles.
library(rsurv)
set.seed(1234567890)

u <- sort(runif(5))

# Four routes to right-tail quantiles of a unit-rate exponential;
# the columns are expected to agree
x1 <- qexp(u, rate = 1, lower.tail = FALSE)
x2 <- qsurv(u, baseline = "exp", rate = 1)
x3 <- qsurv(u, baseline = "exp", rate = 1, package = "stats")
x4 <- qsurv(u, baseline = "gengamma.orig", shape = 1, scale = 1, k = 1, package = "flexsurv")
cbind(x1, x2, x3, x4)
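The idea of resolving a quantile function by name, and optionally by package, can be sketched in base R as follows. The helper q_right_tail is hypothetical and is not the rsurv source; it assumes the target quantile function follows the usual R convention of being named "q" plus the distribution name and of accepting a lower.tail argument.

```r
# Hypothetical helper: look up "q<baseline>" in a given package namespace
# and evaluate it on right-tail probabilities
q_right_tail <- function(p, baseline, package = "stats", ...) {
  qfun <- getExportedValue(package, paste0("q", baseline))
  qfun(p, ..., lower.tail = FALSE)
}

u <- c(0.1, 0.5, 0.9)
q_right_tail(u, "exp", rate = 1)
q_right_tail(u, "weibull", shape = 1.5, scale = 1)
```

Looking the function up in an explicit namespace avoids picking up a masked function of the same name from the search path.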
Function to generate a random sample of survival data from accelerated failure time models.
raftreg(u, formula, baseline, beta, dist = NULL, package = NULL, data, ...)
u |
a numeric vector of probabilities in (0, 1), typically uniform random draws such as runif(n). |
formula |
formula specifying the linear predictors. |
baseline |
the name of the baseline survival distribution. |
beta |
vector of regression coefficients. |
dist |
an alternative way to specify the baseline survival distribution. |
package |
the name of the package where the assumed quantile function is implemented. |
data |
data frame containing the covariates used to generate the survival times. |
... |
further arguments passed to other methods. |
a numeric vector containing the generated random sample.
library(rsurv)
library(dplyr)

n <- 1000
simdata <- data.frame(
  age = rnorm(n),
  sex = sample(c("f", "m"), size = n, replace = TRUE)
) %>%
  mutate(
    # t: latent event times; c: uniform right-censoring times
    t = raftreg(runif(n), ~ age + sex, beta = c(1, 2),
                dist = "weibull", shape = 1.5, scale = 1),
    c = runif(n, 0, 10)
  ) %>%
  rowwise() %>%
  mutate(
    # observed time and event indicator (1 = event, 0 = censored)
    time = min(t, c),
    status = as.numeric(time == t)
  )
glimpse(simdata)
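The inverse transform idea behind AFT generation can be sketched in base R. The sketch assumes the parameterization S(t | x) = S0(t * exp(-lp)), so that a baseline time is simply rescaled by exp(lp); sign conventions differ across references, and this is not taken from the rsurv source. The single covariate x and the coefficient 0.5 are made up for the example.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
lp <- 0.5 * x  # hypothetical linear predictor
u <- runif(n)

# Baseline time: right-tail quantile of the Weibull at u
t0 <- qweibull(u, shape = 1.5, scale = 1, lower.tail = FALSE)

# AFT: covariates rescale the baseline time
t_aft <- t0 * exp(lp)

# Check: the conditional survival function evaluated at t_aft recovers u
u_check <- pweibull(t_aft * exp(-lp), shape = 1.5, scale = 1, lower.tail = FALSE)
all.equal(u_check, u)
```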
Function to generate a random sample of survival data from accelerated hazard models.
rahreg(u, formula, baseline, beta, dist = NULL, package = NULL, data, ...)
u |
a numeric vector of probabilities in (0, 1), typically uniform random draws such as runif(n). |
formula |
formula specifying the linear predictors. |
baseline |
the name of the baseline survival distribution. |
beta |
vector of regression coefficients. |
dist |
an alternative way to specify the baseline survival distribution. |
package |
the name of the package where the assumed quantile function is implemented. |
data |
data frame containing the covariates used to generate the survival times. |
... |
further arguments passed to other methods. |
a numeric vector containing the generated random sample.
library(rsurv)
library(dplyr)

n <- 1000
simdata <- data.frame(
  age = rnorm(n),
  sex = sample(c("f", "m"), size = n, replace = TRUE)
) %>%
  mutate(
    # t: latent event times; c: uniform right-censoring times
    t = rahreg(runif(n), ~ age + sex, beta = c(1, 2),
               dist = "weibull", shape = 1.5, scale = 1),
    c = runif(n, 0, 10)
  ) %>%
  rowwise() %>%
  mutate(
    # observed time and event indicator (1 = event, 0 = censored)
    time = min(t, c),
    status = as.numeric(time == t)
  )
glimpse(simdata)
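A base R sketch of inverse transform sampling under an accelerated hazard model follows. It assumes the parameterization h(t | x) = h0(t * exp(-lp)), which implies S(t | x) = S0(t * exp(-lp))^exp(lp); the sign convention is an assumption rather than something confirmed from the rsurv source, and the covariate and coefficient are made up.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
lp <- 0.5 * x  # hypothetical linear predictor
u <- runif(n)

# Solve u = S0(t * exp(-lp))^exp(lp) for t
s0 <- u^exp(-lp)
t_ah <- qweibull(s0, shape = 1.5, scale = 1, lower.tail = FALSE) * exp(lp)

# Check: plugging t_ah back into the conditional survival function gives u
u_check <- pweibull(t_ah * exp(-lp), shape = 1.5, scale = 1,
                    lower.tail = FALSE)^exp(lp)
all.equal(u_check, u)
```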
Function to generate a random sample of survival data from extended hazard models.
rehreg(u, formula, baseline, beta, phi, dist = NULL, package = NULL, data, ...)
u |
a numeric vector of probabilities in (0, 1), typically uniform random draws such as runif(n). |
formula |
formula specifying the linear predictors. |
baseline |
the name of the baseline survival distribution. |
beta |
vector of regression coefficients. |
phi |
vector of regression coefficients. |
dist |
an alternative way to specify the baseline survival distribution. |
package |
the name of the package where the assumed quantile function is implemented. |
data |
data frame containing the covariates used to generate the survival times. |
... |
further arguments passed to other methods. |
a numeric vector containing the generated random sample.
library(rsurv)
library(dplyr)

n <- 1000
simdata <- data.frame(
  age = rnorm(n),
  sex = sample(c("f", "m"), size = n, replace = TRUE)
) %>%
  mutate(
    # t: latent event times; c: uniform right-censoring times
    t = rehreg(runif(n), ~ age + sex, beta = c(1, 2), phi = c(-1, 2),
               dist = "weibull", shape = 1.5, scale = 1),
    c = runif(n, 0, 10)
  ) %>%
  rowwise() %>%
  mutate(
    # observed time and event indicator (1 = event, 0 = censored)
    time = min(t, c),
    status = as.numeric(time == t)
  )
glimpse(simdata)
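The extended hazard model combines a time-scale effect and a hazard-scale effect. The base R sketch below assumes the parameterization h(t | x) = h0(t * exp(-lp1)) * exp(lp2), where lp1 is built from beta and lp2 from phi; this pairing and the sign convention are assumptions, not confirmed from the rsurv source. Under it, S(t | x) = S0(t * exp(-lp1))^exp(lp1 + lp2), which reduces to PH when lp1 = 0, to AH when lp2 = 0, and to AFT when lp2 = -lp1.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
lp1 <- 0.5 * x   # hypothetical time-scale linear predictor (beta)
lp2 <- -0.3 * x  # hypothetical hazard-scale linear predictor (phi)
u <- runif(n)

# Solve u = S0(t * exp(-lp1))^exp(lp1 + lp2) for t
s0 <- u^exp(-(lp1 + lp2))
t_eh <- qweibull(s0, shape = 1.5, scale = 1, lower.tail = FALSE) * exp(lp1)

# Check: plugging t_eh back into the conditional survival function gives u
u_check <- pweibull(t_eh * exp(-lp1), shape = 1.5, scale = 1,
                    lower.tail = FALSE)^exp(lp1 + lp2)
all.equal(u_check, u)
```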
Function to generate frailties, that is, simple random effects terms to be added to the linear predictor of a given survival regression model.
rfrailty(cluster, frailty = c("gamma", "gaussian", "ps"), sigma = 1, alpha = NULL, ...)
cluster |
a vector determining the grouping of subjects (always converted to a factor object internally). |
frailty |
the frailty distribution; the current implementation includes the gamma (default), gaussian (which corresponds to a lognormal frailty) and positive stable (ps) distributions. |
sigma |
standard deviation assumed for the frailty distribution; sigma = 1 by default; this value is ignored for positive stable (ps) distribution. |
alpha |
stability parameter of the positive stable distribution; alpha must lie in the interval (0, 1), and NA is returned otherwise. |
... |
further arguments passed to other methods. |
a vector with the generated frailties.
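To make the role of a shared frailty concrete, here is a base R sketch that does not call rfrailty(). It assumes a gamma frailty with mean 1 and standard deviation sigma drawn once per cluster, added on the log scale to the linear predictor of a proportional hazards model with a Weibull baseline. Whether rfrailty() returns values on the frailty scale or the log scale is not stated above, so the log transformation here is an assumption, as are the cluster sizes and the coefficient.

```r
set.seed(1)
k <- 50  # number of clusters (hypothetical)
m <- 4   # subjects per cluster (hypothetical)
cluster <- factor(rep(seq_len(k), each = m))
sigma <- 1

# One gamma frailty per cluster, with mean 1 and standard deviation sigma
w <- rgamma(nlevels(cluster), shape = 1 / sigma^2, rate = 1 / sigma^2)

# Expand to subject level on the log scale so it can enter the linear predictor
log_w <- log(w)[as.integer(cluster)]

x <- rnorm(k * m)
lp <- 0.5 * x + log_w

# PH inverse transform: S(t | x, w) = S0(t)^exp(lp)
u <- runif(k * m)
time <- qweibull(u^exp(-lp), shape = 1.5, scale = 1, lower.tail = FALSE)
head(data.frame(cluster, x, log_w, time))
```

Subjects in the same cluster share one frailty value, which is what induces the within-cluster dependence.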
Function to generate a random sample of type I and type II interval-censored survival data.
rinterval(time, tau, type = c("I", "II"), prob)
time |
a numeric vector of survival times. |
tau |
either a vector of censoring times (for type I interval-censored survival data) or a time grid of scheduled visits (for type II interval-censored survival data). |
type |
type of interval-censored survival data (I or II). |
prob |
probability that a subject attends each scheduled visit (the documented value is 0.5); ignored when type = "I". |
a data.frame containing the generated random sample.
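The two censoring schemes can be sketched in base R without calling rinterval(). The column names left and right, the helper interval_II, and the visit grid are assumptions made for illustration; the layout of the data.frame actually returned by rinterval() is not described above. For type I, each subject is inspected once at tau, so the event is known only to lie before or after that time. For type II, each scheduled visit is attended with probability prob, and the event is bracketed by the last attended visit before it and the first attended visit at or after it.

```r
set.seed(1)
n <- 6
time <- rexp(n)  # latent event times (hypothetical)

# Type I: a single inspection time per subject
tau <- runif(n, 0, 2)
case1 <- data.frame(
  left  = ifelse(time <= tau, 0, tau),
  right = ifelse(time <= tau, tau, Inf)
)

# Type II: a common grid of visits, each attended with probability prob
visits <- seq(0.5, 3, by = 0.5)
prob <- 0.5
interval_II <- function(t, visits, prob) {
  attended <- visits[runif(length(visits)) < prob]
  c(left  = max(c(0, attended[attended < t])),
    right = min(c(Inf, attended[attended >= t])))
}
case2 <- t(sapply(time, interval_II, visits = visits, prob = prob))

cbind(time, case1, case2)
```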
Function to generate a random sample of survival data from proportional hazards models.
rphreg(u, formula, baseline, beta, dist = NULL, package = NULL, data, ...)
u |
a numeric vector of probabilities in (0, 1), typically uniform random draws such as runif(n). |
formula |
formula specifying the linear predictors. |
baseline |
the name of the baseline survival distribution. |
beta |
vector of regression coefficients. |
dist |
an alternative way to specify the baseline survival distribution. |
package |
the name of the package where the assumed quantile function is implemented. |
data |
data frame containing the covariates used to generate the survival times. |
... |
further arguments passed to other methods. |
a numeric vector containing the generated random sample.
library(rsurv)
library(dplyr)

n <- 1000
simdata <- data.frame(
  age = rnorm(n),
  sex = sample(c("f", "m"), size = n, replace = TRUE)
) %>%
  mutate(
    # t: latent event times; c: uniform right-censoring times
    t = rphreg(runif(n), ~ age + sex, beta = c(1, 2),
               dist = "weibull", shape = 1.5, scale = 1),
    c = runif(n, 0, 10)
  ) %>%
  rowwise() %>%
  mutate(
    # observed time and event indicator (1 = event, 0 = censored)
    time = min(t, c),
    status = as.numeric(time == t)
  )
glimpse(simdata)
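Under proportional hazards, S(t | x) = S0(t)^exp(lp), so inverse transform sampling only requires the baseline right-tail quantile function. The base R sketch below illustrates this with a Weibull baseline; it is a sketch of the general technique rather than the rsurv source, and the covariate and coefficient are made up.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
lp <- 0.5 * x  # hypothetical linear predictor
u <- runif(n)

# Solve u = S0(t)^exp(lp) for t: the baseline survival probability is u^exp(-lp)
t_ph <- qweibull(u^exp(-lp), shape = 1.5, scale = 1, lower.tail = FALSE)

# Check: the conditional survival function evaluated at t_ph recovers u
u_check <- pweibull(t_ph, shape = 1.5, scale = 1, lower.tail = FALSE)^exp(lp)
all.equal(u_check, u)
```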
Function to generate a random sample of survival data from proportional odds models.
rporeg(u, formula, baseline, beta, dist = NULL, package = NULL, data, ...)
u |
a numeric vector of probabilities in (0, 1), typically uniform random draws such as runif(n). |
formula |
formula specifying the linear predictors. |
baseline |
the name of the baseline survival distribution. |
beta |
vector of regression coefficients. |
dist |
an alternative way to specify the baseline survival distribution. |
package |
the name of the package where the assumed quantile function is implemented. |
data |
data frame containing the covariates used to generate the survival times. |
... |
further arguments passed to other methods. |
a numeric vector containing the generated random sample.
library(rsurv)
library(dplyr)

n <- 1000
simdata <- data.frame(
  age = rnorm(n),
  sex = sample(c("f", "m"), size = n, replace = TRUE)
) %>%
  mutate(
    # t: latent event times; c: uniform right-censoring times
    t = rporeg(runif(n), ~ age + sex, beta = c(1, 2),
               dist = "weibull", shape = 1.5, scale = 1),
    c = runif(n, 0, 10)
  ) %>%
  rowwise() %>%
  mutate(
    # observed time and event indicator (1 = event, 0 = censored)
    time = min(t, c),
    status = as.numeric(time == t)
  )
glimpse(simdata)
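A base R sketch of inverse transform sampling under proportional odds follows. It assumes that covariates act multiplicatively on the odds of failure, that is, (1 - S(t | x)) / S(t | x) = exp(lp) * (1 - S0(t)) / S0(t); this direction of the effect is an assumption, not confirmed from the rsurv source. Solving for the baseline survival probability gives S0(t) = u * exp(lp) / (u * exp(lp) + 1 - u).

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
lp <- 0.5 * x  # hypothetical linear predictor
u <- runif(n)

# Baseline survival probability implied by u under proportional odds
s0 <- u * exp(lp) / (u * exp(lp) + 1 - u)
t_po <- qweibull(s0, shape = 1.5, scale = 1, lower.tail = FALSE)

# Check: rebuild S(t | x) from the baseline odds of failure and compare with u
F0 <- pweibull(t_po, shape = 1.5, scale = 1)
S0 <- pweibull(t_po, shape = 1.5, scale = 1, lower.tail = FALSE)
u_check <- 1 / (1 + exp(lp) * F0 / S0)
all.equal(u_check, u)
```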
Function to generate a random sample of survival data from Yang and Prentice models.
rypreg(u, formula, baseline, beta, phi, dist = NULL, package = NULL, data, ...)
u |
a numeric vector of probabilities in (0, 1), typically uniform random draws such as runif(n). |
formula |
formula specifying the linear predictors. |
baseline |
the name of the baseline survival distribution. |
beta |
vector of short-term regression coefficients. |
phi |
vector of long-term regression coefficients. |
dist |
an alternative way to specify the baseline survival distribution. |
package |
the name of the package where the assumed quantile function is implemented. |
data |
data frame containing the covariates used to generate the survival times. |
... |
further arguments passed to other methods. |
a numeric vector containing the generated random sample.
library(rsurv)
library(dplyr)

n <- 1000
simdata <- data.frame(
  age = rnorm(n),
  sex = sample(c("f", "m"), size = n, replace = TRUE)
) %>%
  mutate(
    # t: latent event times; c: uniform right-censoring times
    t = rypreg(runif(n), ~ age + sex, beta = c(1, 2), phi = c(-1, 2),
               dist = "weibull", shape = 1.5, scale = 1),
    c = runif(n, 0, 10)
  ) %>%
  rowwise() %>%
  mutate(
    # observed time and event indicator (1 = event, 0 = censored)
    time = min(t, c),
    status = as.numeric(time == t)
  )
glimpse(simdata)
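For the Yang and Prentice model, a base R sketch of the inversion is given below. It assumes the survival function of Yang and Prentice (2005), S(t | x) = [1 + (lambda / theta) * R0(t)]^(-theta), where R0(t) = (1 - S0(t)) / S0(t) is the baseline odds of failure, lambda = exp(short-term linear predictor) and theta = exp(long-term linear predictor). The covariate and the coefficient values are made up, and the code is not the rsurv source.

```r
set.seed(1)
n <- 1000
x <- rnorm(n)
lambda <- exp(0.5 * x)   # short-term hazard ratio (hypothetical beta = 0.5)
theta  <- exp(-0.3 * x)  # long-term hazard ratio (hypothetical phi = -0.3)
u <- runif(n)

# Solve u = [1 + (lambda / theta) * R0]^(-theta) for the baseline odds R0
R0 <- (theta / lambda) * (u^(-1 / theta) - 1)
s0 <- 1 / (1 + R0)
t_yp <- qweibull(s0, shape = 1.5, scale = 1, lower.tail = FALSE)

# Check: rebuild S(t | x) from the baseline distribution and compare with u
F0 <- pweibull(t_yp, shape = 1.5, scale = 1)
S0 <- pweibull(t_yp, shape = 1.5, scale = 1, lower.tail = FALSE)
u_check <- (1 + (lambda / theta) * F0 / S0)^(-theta)
all.equal(u_check, u)
```

Setting theta = 1 recovers the proportional odds case, and setting lambda = theta recovers proportional hazards.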