| Title: | Parametric Survival Modeling in Bulk |
|---|---|
| Description: | A simple tool for the bulk creation and testing of parametric survival models. Simply provide 'fssg' with a formula and some data, and let it identify the best distributions for you. |
| Authors: | John Rothen [aut, cre, cph] (ORCID: <https://orcid.org/0009-0008-4897-8004>) |
| Maintainer: | John Rothen <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-06-03 18:41:50 UTC |
| Source: | https://github.com/cran/fssg |
Function to check if times can be calculated using the distribution with default inits.
check_inits(times, distribution)
times |
|
distribution |
A distribution object from |
This should work with all fssg custom distributions, but does not work with some of the native flexsurv distributions.
Boolean indicator for success. If true, then all values can be calculated, and life is good.
# choose a distribution
dist <- get_fssg_dist('gamma_gompertz')
# identify all of the actual survival times
times <- rpois(1000, 100) # simulated times
# check if the distribution can be calculated at each time with default inits
check_inits(times, dist)
Provides the probability density function at point x for an Erlang distribution with parameters k and lambda. x, k, and l can be vectors.
derlang(x, k, l, log = FALSE)
x |
vector of quantiles. |
k |
shape parameter, positive integer. |
l |
short for lambda, rate parameter, must be greater than zero. |
log |
logical: if TRUE, log of probability is returned. |
The density of the Erlang distribution at x with parameters k and lambda.
https://quarto.wessa.net/erlang.html
https://en.wikipedia.org/wiki/Erlang_distribution
derlang(1, 1, 1)
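As a cross-check on the definition, the Erlang density is a gamma density with an integer shape. The sketch below uses only base R; `erlang_density_sketch` is a hypothetical name used for illustration and is not the package's implementation of `derlang`.

```r
# Hypothetical base-R sketch of the Erlang density (not fssg's derlang itself).
# Closed form: f(x) = l^k * x^(k - 1) * exp(-l * x) / (k - 1)!  for x >= 0.
erlang_density_sketch <- function(x, k, l, log = FALSE) {
  stopifnot(all(k >= 1), all(k == round(k)), all(l > 0))
  # A gamma density with integer shape k and rate l is the Erlang density
  dgamma(x, shape = k, rate = l, log = log)
}

# Compare against the closed form for k = 3, l = 2
x <- c(0.5, 1, 2)
closed_form <- 2^3 * x^(3 - 1) * exp(-2 * x) / factorial(3 - 1)
all.equal(erlang_density_sketch(x, k = 3, l = 2), closed_form)
```

With k = 1 the Erlang distribution reduces to the exponential, so `erlang_density_sketch(1, 1, 1)` equals `exp(-1)`.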
Provides probability density function for Gamma-Gompertz distribution.
dgamgomp(x, b, sigma, beta, log = FALSE)
x |
vector of quantiles. |
b |
scale parameter, must be greater than 0. |
sigma, beta |
shape parameters, must be greater than 0. |
log |
logical: if TRUE, log of probability is returned. |
Density values at quantiles x.
https://en.wikipedia.org/wiki/Gamma/Gompertz_distribution
dgamgomp(1,1,1,1)
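For reference, the density can be sketched in base R from the closed form on the Wikipedia page cited above. This assumes the package uses the same parameterisation (b as scale, sigma and beta as shapes), which has not been checked against the fssg source; `gamgomp_density_sketch` is a hypothetical name.

```r
# Hypothetical sketch of the Gamma-Gompertz density (assumed parameterisation):
#   f(x) = b * sigma * exp(b * x) * beta^sigma / (beta - 1 + exp(b * x))^(sigma + 1)
gamgomp_density_sketch <- function(x, b, sigma, beta, log = FALSE) {
  stopifnot(all(b > 0), all(sigma > 0), all(beta > 0))
  x0 <- pmax(x, 0)  # evaluate on the support, then zero out x < 0 below
  log_f <- log(b) + log(sigma) + b * x0 + sigma * log(beta) -
    (sigma + 1) * log(beta - 1 + exp(b * x0))
  log_f[x < 0] <- -Inf
  if (log) log_f else exp(log_f)
}

# With beta = 1 the density reduces to an exponential with rate b * sigma
all.equal(gamgomp_density_sketch(1, b = 1, sigma = 1, beta = 1), dexp(1, rate = 1))

# The density should integrate to approximately 1
integrate(gamgomp_density_sketch, 0, Inf, b = 0.5, sigma = 2, beta = 3)$value
```

Working on the log scale keeps the computation stable when exp(b * x) becomes very large.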
Provides probability density function for Hypertabastic distribution.
dhypertab(x, a, b, log = FALSE)
x |
vector of quantiles. |
a |
alpha parameter. Must be greater than 0. |
b |
beta parameter. Must be greater than 0. |
log |
logical: if TRUE, log of probability is returned. |
Density values at quantiles x.
https://en.wikipedia.org/wiki/Hypertabastic_survival_models
dhypertab(1,1,1)
Provides probability density function for Inverse Lindley distribution.
dinvlind(x, theta, log = FALSE)
x |
vector of quantiles. |
theta |
parameter, must be greater than 0. |
log |
logical: if TRUE, log of probability is returned. |
Density values at quantiles x.
Sharma, V. K., Singh, S. K., Singh, U., & Agiwal, V. (2015). The inverse Lindley distribution: a stress-strength reliability model with application to head and neck cancer data. Journal of Industrial and Production Engineering, 32(3), 162-173. doi:10.1080/21681015.2015.1025901
Asgharzadeh, Akbar & Alizadeh Sangtarashani, Mojtaba. (2023). Inverse Lindley distribution: different methods for estimating their PDF and CDF. Journal of Statistical Computation and Simulation. 94. 1-20. doi:10.1080/21681015.2015.1025901
dinvlind(1,1)
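The density given by Sharma et al. (2015) can be written out in a few lines of base R. The sketch below is an illustration under that formula, not the package's own code; `invlind_density_sketch` is a hypothetical name.

```r
# Hypothetical sketch of the inverse Lindley density (Sharma et al., 2015):
#   f(x) = theta^2 / (1 + theta) * (1 + x) / x^3 * exp(-theta / x),  x > 0
invlind_density_sketch <- function(x, theta, log = FALSE) {
  stopifnot(all(theta > 0))
  log_f <- 2 * log(theta) - log1p(theta) + log1p(x) - 3 * log(x) - theta / x
  if (log) log_f else exp(log_f)
}

# At x = 1, theta = 1 the formula gives (1 / 2) * 2 * exp(-1) = exp(-1)
invlind_density_sketch(1, theta = 1)

# The density should integrate to approximately 1
integrate(invlind_density_sketch, 0, Inf, theta = 2)$value
```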
Provides probability density function for Log Cauchy distribution.
dlogcauchy(x, mu, sigma, log = FALSE)
x |
vector of quantiles. |
mu |
location parameter, must be real. |
sigma |
scale parameter, must be greater than 0. |
log |
logical: if TRUE, log of probability is returned. |
Density values at quantiles x.
https://en.wikipedia.org/wiki/Log-Cauchy_distribution
dlogcauchy(1,1,1)
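A log-Cauchy variable is one whose logarithm follows a Cauchy distribution, so its density follows from a change of variables. The base-R sketch below illustrates this; `logcauchy_density_sketch` is a hypothetical name and not the package's implementation.

```r
# Hypothetical sketch: if log(X) ~ Cauchy(mu, sigma), then
#   f(x) = dcauchy(log(x), mu, sigma) / x,  x > 0
logcauchy_density_sketch <- function(x, mu, sigma, log = FALSE) {
  stopifnot(all(sigma > 0))
  log_f <- dcauchy(log(x), location = mu, scale = sigma, log = TRUE) - log(x)
  if (log) log_f else exp(log_f)
}

# Agrees with the closed form sigma / (x * pi * ((log(x) - mu)^2 + sigma^2))
x <- c(0.5, 1, 4)
closed_form <- 1 / (x * pi * ((log(x) - 1)^2 + 1))
all.equal(logcauchy_density_sketch(x, mu = 1, sigma = 1), closed_form)
```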
A simple tool for the bulk creation and testing of parametric survival models.
fssg(
  formula,
  data = NA,
  models = NA,
  skip = c("default"),
  opt_method = "BFGS",
  spline = NA,
  max_knots = 1,
  dump_models = TRUE,
  detailed = FALSE,
  ibs = FALSE,
  progress = TRUE,
  warn = FALSE
)
formula |
Formula. Should be a survival formula, with a Surv object on the left hand side. |
data |
If your formula needs a dataset, provide that here. |
models |
Vector of strings. If you only want to run specific models, specify them here by their list name in |
skip |
Vector. If you want to skip any specific models, you can add their names here. By default, some of the repetitive or incredibly niche models are skipped. |
opt_method |
String. Name of the preferred optimization method. Default for fssg is 'Nelder-Mead', with 'BFGS' being used as a back-up in case of errors.
Can be any valid |
spline |
String or Vector of Strings. Include 'rp' or 'wy' for
Royston-Parmar natural cubic spline, or Wang-Yan alternative natural cubic spline respectively.
The Wang-Yan version requires the package |
max_knots |
Integer. Specifies the maximum number of knots to be considered in spline models. |
dump_models |
Logical. If TRUE, each successful model will be placed into a list and returned. |
detailed |
Logical. If TRUE, calculates a number of additional fit statistics for each model. |
ibs |
Logical. If TRUE, calculates the integrated Brier score for each model. Please note that this greatly increases run time and is not recommended for large data. |
progress |
Logical. If TRUE, prints progress updates while the function runs. |
warn |
Logical. If TRUE, also prints any warnings that appear. |
Please see vignette("fssg") for a more in-depth example of the function.
List containing a summary of the models generated. If dump_models is TRUE, also returns a list of generated models.
library(survival)
fssg(
  Surv(time, status) ~ 1,
  data = aml,
  models = c('genf','exp','dagum','lomax','rayleigh','gamma_gompertz'),
  spline = c('rp'),
  max_knots = 2,
  warn = TRUE
)$summary
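To illustrate the general idea behind bulk fitting, the sketch below loops over a few distributions with `survival::survreg` and ranks them by AIC. It is not how fssg is implemented and covers far fewer distributions; `candidates`, `fits`, and `comparison` are hypothetical names chosen for this example.

```r
library(survival)

# A small set of distributions supported by survreg
candidates <- c("exponential", "weibull", "lognormal", "loglogistic")

# Fit an intercept-only model for each candidate distribution
fits <- lapply(candidates, function(d) {
  survreg(Surv(time, status) ~ 1, data = aml, dist = d)
})

# Collect log-likelihood and AIC, then sort so the lowest AIC comes first
comparison <- data.frame(
  dist = candidates,
  loglik = vapply(fits, function(f) as.numeric(logLik(f)), numeric(1)),
  aic = vapply(fits, AIC, numeric(1))
)
comparison[order(comparison$aic), ]
```

A lower AIC indicates a better trade-off between fit and number of parameters, which is one common way to shortlist distributions before inspecting them more closely.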
flexsurv allows the creation of custom distribution objects, which must follow a specific format to be used in flexsurv functions.
fssg uses a modified version of this formatting, which specifies the relevant distribution functions directly.
fssg_dist(
  name,
  pars,
  location,
  transforms,
  inv.transforms,
  inits,
  d,
  p,
  q = NA,
  h = NA,
  H = NA,
  fullname = ""
)
name |
Simple shorthand name for the relevant distribution. |
pars |
Vector of parameter names which will be provided to the relevant distribution functions. |
location |
Name of the parameter which should be allowed to vary based on covariates.
The name 'location' is an artifact of the original |
transforms |
Vector of the functions which should be used to scale each parameter to the real number line. If a parameter must be positive, then 'log' would scale the parameter to the real line. This is used to pass parameters through to the optimization function. If no transformation is needed, use 'identity'. |
inv.transforms |
Vector of the inverse transformation functions for parameters. E.g. If transforms is 'log', inv.transforms would be 'exp'. |
inits |
Function which will take a vector of times t and provide the initial parameter estimates. Ideally these should be estimated using the times from the relevant dataset, but can also be arbitrary initial values such as 1. |
d |
Density function of the relevant distribution, such as dnorm.
If not supplied, |
p |
Distribution function of the relevant distribution, such as pnorm.
If not supplied, |
q |
Quantile function for the relevant distribution, such as qnorm.
fssg provides a helper function ( |
h |
Hazard function of the relevant distribution. Not required, but may be supplied instead of a d function if that is preferred. Will be estimated if not provided. |
H |
Cumulative hazard function. Will be estimated if not provided. |
fullname |
Alternative name(s) for the distribution. This is used for labeling of outputs generated by fssg. |
fssg distributions should be specified as a list, with the following attributes.
Important note: by default, flexsurv varies only one model parameter with covariates (the one specified as location in the distribution).
More than one parameter can be made to vary by using the anc argument of flexsurvreg.
Example: anc = list(shape1 = ~ var1 + var2, shape2 = ~ var3). This requires the ancillary parameters to be named explicitly, which is distribution specific.
A flexsurv-ready distribution object.
flexsurv vignette by Christopher H. Jackson https://CRAN.R-project.org/package=flexsurv
fssg_dist(
  name = 'betapr',
  pars = c('shape1','shape2','scale'),
  location = 'scale',
  transforms = c(log, log, log),
  inv.transforms = c(exp, exp, exp),
  inits = function(t){c(3, 2, 1)}, # can be improved
  d = extraDistr::dbetapr,
  p = extraDistr::pbetapr,
  fullname = 'beta_prime'
)
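The role of transforms and inv.transforms can be seen in a small base-R sketch: a positive parameter is optimized on the log scale, where it is unconstrained, and mapped back with exp afterwards. This is an illustration of the idea only and does not reproduce what fssg or flexsurv do internally; all object names here are hypothetical.

```r
set.seed(1)
times <- rexp(500, rate = 0.2)  # simulated survival times

transform <- log       # maps the positive rate onto the real line
inv_transform <- exp   # maps the optimizer's value back to a positive rate

# Negative log-likelihood written in terms of the unconstrained parameter
neg_loglik <- function(theta) {
  -sum(dexp(times, rate = inv_transform(theta), log = TRUE))
}

# Start from an arbitrary initial value of 1, as an inits function might supply
fit <- optim(par = transform(1), fn = neg_loglik, method = "BFGS")
rate_hat <- inv_transform(fit$par)

# The exponential MLE has the closed form 1 / mean(times), used here as a check
c(estimate = rate_hat, closed_form = 1 / mean(times))
```

Because the optimizer only ever sees the transformed value, it can never propose an invalid (negative) rate.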
Compiles list of available distributions
fssg_dist_list()
For details on all distributions, please see vignette("Distributions").
A list of all available distributions.
fdl <- fssg_dist_list()
Collect fit statistics for a parametric survival model
get_fit_stats(model, ibs = FALSE)
model |
Model object. Currently formatted to work with |
ibs |
Logical. If TRUE, calculates the integrated Brier score, which is a helpful fit statistic but is much slower to calculate than all other statistics. |
For a table of fit statistics and their sources, please see vignette("Fit_Statistics").
List of fit statistics for the model.
Please note that for concordance and AUC statistics, the ranks are arbitrarily sorted in one direction regardless of PH/AFT specification of the model. To account for this, the statistics returned are the max of (statistic, 1-statistic), which should always provide the correct value regardless of rank sort order.
library(survival)
library(flexsurv)
model <- flexsurvreg(Surv(time, status) ~ age + sex, data = cancer, dist = 'weibull')
get_fit_stats(model = model, ibs = FALSE)
Function to return a specific distribution object.
get_fssg_dist(dist_name)
dist_name |
Name of the distribution. |
Distribution object.
get_fssg_dist('weibull')
get_fssg_dist('frechet')
get_fssg_dist('gamma_gompertz')
fssg Helper Functions
Functions that derive the quantile, survival, hazard, and cumulative hazard functions for a distribution from the provided density and/or distribution function(s). These functions assume you have p and d functions that follow the standard format of native R distribution functions (such as pnorm and dnorm).
survivify(p_function)
hazardify(d_function, p_function)
cumhazardify(p_function)
quantilify(p_function)
p_function |
Distribution function. E.g. |
d_function |
Density function. E.g. |
survivify creates a function for 1-p(t).
hazardify creates a function for d(t)/S(t).
cumhazardify creates a function for -log(S(t)).
quantilify approximates a quantile function based on p(t) using numeric root finding (uniroot).
A new function of the desired type with the same parameters as the input functions.
survivify returns the survival function S(t).
hazardify returns the hazard function h(t).
cumhazardify returns the cumulative hazard function H(t).
quantilify returns an estimated quantile function q(t).
survivify(pnorm)
hazardify(dnorm, pnorm)
cumhazardify(pnorm)
quantilify(pnorm)
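The four relationships described above (1 - p(t), d(t)/S(t), -log(S(t)), and root finding on p(t)) can be sketched as closures in base R. The `*_sketch` functions below are hypothetical stand-ins written for illustration; they are not the package's implementations, and the search interval used for root finding is an assumption.

```r
# S(t) = 1 - p(t)
survivify_sketch <- function(p_function) {
  function(q, ...) 1 - p_function(q, ...)
}

# h(t) = d(t) / S(t)
hazardify_sketch <- function(d_function, p_function) {
  function(x, ...) d_function(x, ...) / (1 - p_function(x, ...))
}

# H(t) = -log(S(t))
cumhazardify_sketch <- function(p_function) {
  function(x, ...) -log(1 - p_function(x, ...))
}

# Quantiles by solving p(q) - target = 0 with uniroot, one probability at a time
quantilify_sketch <- function(p_function, interval = c(-10, 10)) {
  function(p, ...) {
    vapply(p, function(target) {
      uniroot(function(q) p_function(q, ...) - target,
              interval = interval, extendInt = "yes", tol = 1e-9)$root
    }, numeric(1))
  }
}

# The exponential distribution has a constant hazard equal to its rate
hazardify_sketch(dexp, pexp)(c(0.5, 1, 3), rate = 2)

# The approximated normal quantile should be close to qnorm
c(quantilify_sketch(pnorm)(0.975), qnorm(0.975))
```

Each returned function accepts the same distribution parameters as its inputs, passed through `...`.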
Provides the cumulative distribution function at point q for an Erlang distribution with parameters k and lambda. q, k, and l can be vectors.
perlang(q, k, l, lower.tail = TRUE, log.p = FALSE)
q |
vector of quantiles. |
k |
shape parameter, positive integer. |
l |
short for lambda, rate parameter, must be greater than zero. |
lower.tail |
logical: if TRUE, returns the lower-tail probability P(X <= q); otherwise, the upper-tail probability P(X > q). |
log.p |
logical: if TRUE, log of probability is returned. |
The cumulative probability of the Erlang distribution at quantile q with parameters k and lambda.
https://quarto.wessa.net/erlang.html
https://en.wikipedia.org/wiki/Erlang_distribution
perlang(1, 1, 1)
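The meaning of lower.tail and log.p can be illustrated with a base-R sketch that delegates to `pgamma`, since an Erlang distribution is a gamma distribution with integer shape. `erlang_cdf_sketch` is a hypothetical name and not the package's implementation of `perlang`.

```r
# Hypothetical base-R sketch of the Erlang CDF (not fssg's perlang itself)
erlang_cdf_sketch <- function(q, k, l, lower.tail = TRUE, log.p = FALSE) {
  stopifnot(all(k >= 1), all(k == round(k)), all(l > 0))
  pgamma(q, shape = k, rate = l, lower.tail = lower.tail, log.p = log.p)
}

# Closed form for integer k: F(q) = 1 - sum_{n = 0}^{k - 1} exp(-l q) (l q)^n / n!
q <- 1.5; k <- 3; l <- 2
n <- 0:(k - 1)
closed_form <- 1 - sum(exp(-l * q) * (l * q)^n / factorial(n))
all.equal(erlang_cdf_sketch(q, k, l), closed_form)

# lower.tail = FALSE gives P(X > q), so the two tails sum to 1
erlang_cdf_sketch(q, k, l) + erlang_cdf_sketch(q, k, l, lower.tail = FALSE)
```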
Provides cumulative distribution function for Gamma-Gompertz distribution.
pgamgomp(q, b, sigma, beta, lower.tail = TRUE, log.p = FALSE)
q |
vector of quantiles. |
b |
scale parameter, must be greater than 0. |
sigma, beta |
shape parameters, must be greater than 0. |
lower.tail |
logical: if TRUE, returns the lower-tail probability P(X <= q); otherwise, the upper-tail probability P(X > q). |
log.p |
logical: if TRUE, log of probability is returned. |
Probabilities for quantiles q.
https://en.wikipedia.org/wiki/Gamma/Gompertz_distribution
pgamgomp(1,1,1,1)
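The closed-form CDF on the Wikipedia page cited above can be sketched in base R as follows. This assumes the package follows the same parameterisation, which has not been verified against the fssg source; `gamgomp_cdf_sketch` is a hypothetical name.

```r
# Hypothetical sketch of the Gamma-Gompertz CDF (assumed parameterisation):
#   F(q) = 1 - beta^sigma / (beta - 1 + exp(b * q))^sigma,  q >= 0
gamgomp_cdf_sketch <- function(q, b, sigma, beta, lower.tail = TRUE, log.p = FALSE) {
  stopifnot(all(b > 0), all(sigma > 0), all(beta > 0))
  q <- pmax(q, 0)  # no probability mass below zero
  log_surv <- sigma * (log(beta) - log(beta - 1 + exp(b * q)))  # log P(X > q)
  p <- if (lower.tail) -expm1(log_surv) else exp(log_surv)
  if (log.p) log(p) else p
}

# With beta = 1 the distribution reduces to an exponential with rate b * sigma
all.equal(gamgomp_cdf_sketch(1, b = 1, sigma = 1, beta = 1), pexp(1, rate = 1))
```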
Provides cumulative distribution function for Hypertabastic distribution.
phypertab(q, a, b, lower.tail = TRUE, log.p = FALSE)
q |
vector of quantiles. |
a |
alpha parameter. Must be greater than 0. |
b |
beta parameter. Must be greater than 0. |
lower.tail |
logical: if TRUE, returns the lower-tail probability P(X <= q); otherwise, the upper-tail probability P(X > q). |
log.p |
logical: if TRUE, log of probability is returned. |
Probabilities for quantiles q.
https://en.wikipedia.org/wiki/Hypertabastic_survival_models
phypertab(1,1,1)
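As a rough illustration, the hypertabastic CDF described on the Wikipedia page cited above can be sketched in base R using hyperbolic functions. The parameter mapping (a as alpha, b as beta) is assumed from the argument descriptions, and `hypertab_cdf_sketch` is a hypothetical name rather than the package's code.

```r
# Hypothetical sketch of the hypertabastic CDF (assumed parameterisation):
#   F(t) = 1 - sech(a * (1 - t^b * coth(t^b)) / b),  t > 0
hypertab_cdf_sketch <- function(q, a, b, lower.tail = TRUE) {
  stopifnot(all(a > 0), all(b > 0))
  tb <- q^b
  w <- a * (1 - tb / tanh(tb)) / b  # coth(x) = 1 / tanh(x)
  surv <- 1 / cosh(w)               # sech(x) = 1 / cosh(x)
  surv[q <= 0] <- 1                 # no probability mass at or below zero
  if (lower.tail) 1 - surv else surv
}

# A valid CDF should be increasing and stay between 0 and 1
q <- c(0.1, 0.5, 1, 2, 5)
p <- hypertab_cdf_sketch(q, a = 1, b = 1)
all(diff(p) > 0) && all(p > 0 & p < 1)
```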
Provides cumulative distribution function for Inverse Lindley distribution.
pinvlind(q, theta, lower.tail = TRUE, log.p = FALSE)
q |
vector of quantiles. |
theta |
parameter, must be greater than 0. |
lower.tail |
logical: if TRUE, returns the lower-tail probability P(X <= q); otherwise, the upper-tail probability P(X > q). |
log.p |
logical: if TRUE, log of probability is returned. |
Probabilities for quantiles q.
Sharma, V. K., Singh, S. K., Singh, U., & Agiwal, V. (2015). The inverse Lindley distribution: a stress-strength reliability model with application to head and neck cancer data. Journal of Industrial and Production Engineering, 32(3), 162-173. doi:10.1080/21681015.2015.1025901
Asgharzadeh, Akbar & Alizadeh Sangtarashani, Mojtaba. (2023). Inverse Lindley distribution: different methods for estimating their PDF and CDF. Journal of Statistical Computation and Simulation. 94. 1-20. doi:10.1080/21681015.2015.1025901
pinvlind(1,1)
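The inverse Lindley CDF has a simple closed form (Sharma et al., 2015), sketched below in base R. `invlind_cdf_sketch` is a hypothetical name used for illustration and is not the package's implementation.

```r
# Hypothetical sketch of the inverse Lindley CDF (Sharma et al., 2015):
#   F(q) = (1 + theta / ((1 + theta) * q)) * exp(-theta / q),  q > 0
invlind_cdf_sketch <- function(q, theta, lower.tail = TRUE, log.p = FALSE) {
  stopifnot(all(theta > 0))
  p <- (1 + theta / ((1 + theta) * q)) * exp(-theta / q)
  p[q <= 0] <- 0  # no probability mass at or below zero
  if (!lower.tail) p <- 1 - p
  if (log.p) log(p) else p
}

# At q = 1, theta = 1 the formula gives 1.5 * exp(-1); the limits are 0 and 1
invlind_cdf_sketch(c(0, 1, Inf), theta = 1)
```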
Provides cumulative distribution function for Log Cauchy distribution.
plogcauchy(q, mu, sigma, lower.tail = TRUE, log.p = FALSE)
q |
vector of quantiles. |
mu |
location parameter, must be real. |
sigma |
scale parameter, must be greater than 0. |
lower.tail |
logical: if TRUE, returns the lower-tail probability P(X <= q); otherwise, the upper-tail probability P(X > q). |
log.p |
logical: if TRUE, log of probability is returned. |
Probabilities for quantiles q.
https://en.wikipedia.org/wiki/Log-Cauchy_distribution
plogcauchy(1,1,1)
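Because the logarithm of a log-Cauchy variable is Cauchy distributed, the CDF can be sketched in base R by evaluating `pcauchy` at log(q). `logcauchy_cdf_sketch` is a hypothetical name and not the package's implementation.

```r
# Hypothetical sketch: P(X <= q) = P(log X <= log q) with log X ~ Cauchy(mu, sigma)
logcauchy_cdf_sketch <- function(q, mu, sigma, lower.tail = TRUE, log.p = FALSE) {
  stopifnot(all(sigma > 0))
  # pmax() maps negative q to 0, and log(0) = -Inf gives probability 0
  pcauchy(log(pmax(q, 0)), location = mu, scale = sigma,
          lower.tail = lower.tail, log.p = log.p)
}

# Agrees with the closed form atan((log(q) - mu) / sigma) / pi + 1 / 2
q <- c(0.5, 1, 4)
closed_form <- atan((log(q) - 1) / 1) / pi + 0.5
all.equal(logcauchy_cdf_sketch(q, mu = 1, sigma = 1), closed_form)
```

The median of the distribution is exp(mu), so evaluating the sketch at `exp(1)` with `mu = 1` gives a probability of one half.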
This data is entirely fabricated. The source code for creating this data set can be found in the data-raw folder of the GitHub repository.
pseudo
Time until event.
Indicator for death. If TRUE, the patient died at corresponding time.
Arbitrary covariates.
fssg
head(pseudo)
Simple add-in which lets you keyboard map the writing of pipe + newline. Intended to make functional programming pipelines a little easier on the hands.
quick_pipe()