| Title: | Temporal Parametric Hazard Modeling |
|---|---|
| Description: | Provides native R implementations of the multiphase parametric hazard model of Blackstone, Naftel, and Turner (1986) <doi:10.1080/01621459.1986.10478314> with a focus on behavioral parity, transparent numerics, and reproducible validation against reference outputs from the original 'C'/'SAS' HAZARD program, originally developed at the University of Alabama at Birmingham (UAB). The 'SAS'/'C' code and this R package are currently developed and maintained at The Cleveland Clinic Foundation, and the R code was wholly developed at The Cleveland Clinic Foundation. The generalized temporal decomposition family extends to longitudinal mixed-effects settings (Rajeswaran et al. 2018 <doi:10.1177/0962280215623583>). The package is intentionally implemented in pure R first; performance-critical paths may later be accelerated with 'Rcpp' without changing the public interface. |
| Authors: | John Ehrlinger [aut, cre, cph] |
| Maintainer: | John Ehrlinger <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.0 |
| Built: | 2026-06-12 20:18:48 UTC |
| Source: | https://github.com/cran/TemporalHazard |
Survival data for 310 patients who underwent repair of atrioventricular septal defects (congenital heart disease) at the Cleveland Clinic between 1977 and 1993. Exhibits two identifiable hazard phases: an early post-operative risk and a constant late phase.
avcavc
A data frame with 310 rows and 11 variables:
Patient identifier
NYHA functional class (1–4)
Surgical grade of AV valve incompetence
Date of operation (months since January 1967)
Age at repair (months)
Malalignment indicator (0/1)
Interventricular communication indicator (0/1)
Associated cardiac anomaly indicator (0/1)
Death indicator (1 = dead, 0 = censored)
Follow-up interval to death or last contact (months)
Interaction term: opmos x age
Blackstone, Naftel, and Turner (1986) doi:10.1080/01621459.1986.10478314. Cleveland Clinic Foundation.
vignette("fitting-hazard-models"),
vignette("prediction-visualization")
Other datasets:
cabgkul,
omc,
tga,
valves
data(avc) avc <- na.omit(avc) # Kaplan-Meier survival km <- survival::survfit(survival::Surv(int_dead, dead) ~ 1, data = avc) plot(km, xlab = "Months after AVC repair", ylab = "Survival", main = "AVC: Kaplan-Meier survival estimate") # Two-phase hazard fit (early CDF + constant -- what AVC supports) fit <- hazard( survival::Surv(int_dead, dead) ~ 1, data = avc, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 1, m = 1), constant = hzr_phase("constant") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) summary(fit)data(avc) avc <- na.omit(avc) # Kaplan-Meier survival km <- survival::survfit(survival::Surv(int_dead, dead) ~ 1, data = avc) plot(km, xlab = "Months after AVC repair", ylab = "Survival", main = "AVC: Kaplan-Meier survival estimate") # Two-phase hazard fit (early CDF + constant -- what AVC supports) fit <- hazard( survival::Surv(int_dead, dead) ~ 1, data = avc, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 1, m = 1), constant = hzr_phase("constant") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) summary(fit)
Survival data for 5,880 patients who underwent primary isolated CABG at KU Leuven, Belgium, between 1971 and July 1987. The simplest dataset structure (intercept-only, right-censored) with large sample size exercising all three temporal hazard phases.
cabgkulcabgkul
A data frame with 5880 rows and 2 variables:
Follow-up interval to death or last contact (months)
Death indicator (1 = dead, 0 = censored)
KU Leuven cardiac surgery registry. Primary benchmark dataset for C binary parity testing.
vignette("fitting-hazard-models")
Other datasets:
avc,
omc,
tga,
valves
data(cabgkul) # Kaplan-Meier survival km <- survival::survfit(survival::Surv(int_dead, dead) ~ 1, data = cabgkul) plot(km, xlab = "Months after CABG", ylab = "Survival", main = "CABGKUL: Kaplan-Meier survival (n = 5,880)") # Single-phase Weibull fit with parametric overlay fit <- hazard(survival::Surv(int_dead, dead) ~ 1, data = cabgkul, dist = "weibull", theta = c(mu = 0.10, nu = 1.0), fit = TRUE) t_grid <- seq(0.01, max(cabgkul$int_dead) * 0.9, length.out = 200) surv <- predict(fit, newdata = data.frame(time = t_grid), type = "survival") plot(km, xlab = "Months after CABG", ylab = "Survival", main = "CABGKUL: Weibull vs. Kaplan-Meier") lines(t_grid, surv, col = "blue", lwd = 2) legend("bottomleft", c("KM", "Weibull"), col = c("black", "blue"), lty = 1, lwd = c(1, 2))data(cabgkul) # Kaplan-Meier survival km <- survival::survfit(survival::Surv(int_dead, dead) ~ 1, data = cabgkul) plot(km, xlab = "Months after CABG", ylab = "Survival", main = "CABGKUL: Kaplan-Meier survival (n = 5,880)") # Single-phase Weibull fit with parametric overlay fit <- hazard(survival::Surv(int_dead, dead) ~ 1, data = cabgkul, dist = "weibull", theta = c(mu = 0.10, nu = 1.0), fit = TRUE) t_grid <- seq(0.01, max(cabgkul$int_dead) * 0.9, length.out = 200) surv <- predict(fit, newdata = data.frame(time = t_grid), type = "survival") plot(km, xlab = "Months after CABG", ylab = "Survival", main = "CABGKUL: Weibull vs. Kaplan-Meier") lines(t_grid, surv, col = "blue", lwd = 2) legend("bottomleft", c("KM", "Weibull"), col = c("black", "blue"), lty = 1, lwd = c(1, 2))
Extract coefficients from hazard model
## S3 method for class 'hazard' coef(object, ...)## S3 method for class 'hazard' coef(object, ...)
object |
A |
... |
Unused; for S3 compatibility. |
A named numeric vector of fitted parameter estimates, or NULL
if the model has not been fitted (fit = FALSE).
fit <- hazard(time = rexp(30, 0.5), status = rep(1L, 30), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) coef(fit)fit <- hazard(time = rexp(30, 0.5), status = rep(1L, 30), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) coef(fit)
Creates a hazard object and optionally fits it via maximum likelihood.
This mirrors the argument-oriented workflow of the legacy HAZARD C/SAS
implementation: supply starting values in theta and the function will
optimize to produce fitted estimates.
hazard( formula = NULL, data = NULL, time = NULL, status = NULL, time_lower = NULL, time_upper = NULL, x = NULL, time_windows = NULL, theta = NULL, dist = "weibull", phases = NULL, fit = FALSE, weights = NULL, control = list(), ... )hazard( formula = NULL, data = NULL, time = NULL, status = NULL, time_lower = NULL, time_upper = NULL, x = NULL, time_windows = NULL, theta = NULL, dist = "weibull", phases = NULL, fit = FALSE, weights = NULL, control = list(), ... )
formula |
Optional formula of the form |
data |
Optional data frame containing variables referenced in formula. |
time |
Numeric follow-up time vector. |
status |
Numeric or logical event indicator vector. |
time_lower |
Optional numeric lower bound vector for censoring intervals.
Used when |
time_upper |
Optional numeric upper bound vector for censoring intervals.
Used when |
x |
Optional design matrix (or data frame coercible to matrix). |
time_windows |
Optional numeric vector of strictly positive cut points for
piecewise time-varying coefficients. When provided, each predictor column in
|
theta |
Optional numeric coefficient vector (starting values for optimization). |
dist |
Character baseline distribution label (default |
phases |
Optional named list of |
fit |
Logical; if TRUE, fit the model via maximum likelihood (default FALSE). |
weights |
Optional numeric vector of observation weights (non-negative).
Each observation's log-likelihood contribution is multiplied by its weight.
Use for severity-weighted repeated events. Default |
control |
Named list of control options (see Details). |
... |
Additional named arguments retained for parity with legacy calling conventions. |
Control parameters:
maxit: Maximum iterations (default 1000)
n_starts: Number of optimization starts for multiphase fits (default 5).
Each start after the first adds random noise to the initial values, drawn
from the ambient RNG stream; call set.seed() before fitting for
reproducible results.
reltol: Relative parameter change tolerance (default 1e-5)
abstol: Absolute gradient norm tolerance (default 1e-6)
method: Optimization method: "bfgs" or "nm" (default "bfgs")
condition: Condition number control (default 14)
nocov, nocor: Suppress covariance/correlation output (legacy; no-op in M2)
Censoring status coding:
1: Exact event at time
0: Right-censored at time
-1: Left-censored with upper bound at time_upper \(or time\)
2: Interval-censored in the interval \(time_lower, time_upper\)
Time-varying coefficients:
If time_windows is supplied, predictors are expanded to piecewise window
interactions so each window has its own coefficient vector.
This is implemented as design-matrix expansion, so the existing likelihood engines remain unchanged.
An object of class hazard, a named list with components:
call (the matched call),
spec (model specification: dist, control,
time_windows, phases),
data (input data: time, status, x,
weights, etc.),
fit (optimisation results: theta, objective,
converged, se, vcov, counts, message;
all NULL when fit = FALSE),
and engine (implementation tag, "native-r-m2").
For dist = "multiphase" the cumulative hazard, instantaneous hazard, and
survival are
where and the temporal shapes ,
are set by each phase's type (see hzr_phase()). The
proportional-hazards single-phase families ("weibull", "exponential")
are the special case , with covariates acting multiplicatively on
one temporal shape. The "loglogistic" (proportional-odds) and
"lognormal" (accelerated-failure-time) families place covariates
differently — on the odds of failure and the log-time location,
respectively — so they are separate parameterizations, not special cases
of this additive form. Parameters are estimated
on an unconstrained internal scale (e.g. , )
and transformed back for reporting; see
vignette("mf-mathematical-foundations").
The dist argument selects the parametric form of the baseline hazard. The
four single-distribution families differ in the shape the
hazard traces over follow-up; choose by what the risk is expected to do over
time. "multiphase" is the general additive model that lets several such
shapes coexist.
"weibull" — monotone rising or falling hazard (default)The
workhorse parametric model: , with hazard . The single shape
makes risk increase over time (), decrease
(), or stay flat (). Use it as the default when
a single monotone trend describes the hazard.
"exponential" — constant hazardThe memoryless special case
: a time-invariant baseline rate, . Use it when the event rate does not change with
follow-up time (the constant background risk also appears as the
"constant" phase in a multiphase model).
"loglogistic" — unimodal (rise-then-fall) hazardA log-logistic
proportional-odds form (covariates act multiplicatively on the odds of
failure, , not as an AFT time shift) whose hazard rises to
a single peak and then declines when the shape exceeds 1 (and is monotone
decreasing otherwise), with heavier tails than the log-normal. Use it
when risk climbs to an early peak and then eases off.
"lognormal" — early-peaking, resolving hazardAn
accelerated-failure-time form in which time is Gaussian; the
hazard rises to an early peak and then decays toward zero. Use it for
risk that is concentrated early and resolves over time.
"multiphase" — additive N-phase hazardSums several phase shapes
into one model, , so the
overall hazard can fall, level off, and rise again within one fit.
Requires phases; see hzr_phase() for the available phase shapes. This
is the form that reproduces the classic C/SAS HAZARD models.
Blackstone EH, Naftel DC, Turner ME Jr. The decomposition of time-varying hazard into phases, each incorporating a separate stream of concomitant information. J Am Stat Assoc. 1986;81(395):615–624. doi:10.1080/01621459.1986.10478314
Rajeswaran J, Blackstone EH, Ehrlinger J, Li L, Ishwaran H, Parides MK. Probability of atrial fibrillation after ablation: Using a parametric nonlinear temporal decomposition mixed effects model. Stat Methods Med Res. 2018;27(1):126–141. doi:10.1177/0962280215623583
predict.hazard() for survival/cumulative-hazard predictions,
summary.hazard() for model summaries,
hzr_phase() for specifying multiphase temporal shapes.
Vignettes with worked examples:
vignette("fitting-hazard-models") — single-phase through multiphase fitting,
vignette("prediction-visualization") — prediction types and decomposed hazard plots,
vignette("inference-diagnostics") — bootstrap CIs and model diagnostics.
# -- Univariable Weibull ---------------------------------------------- set.seed(1) time <- rexp(50, rate = 0.3) status <- sample(0:1, 50, replace = TRUE, prob = c(0.3, 0.7)) fit <- hazard(time = time, status = status, theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) summary(fit) # -- Formula interface with covariates -------------------------------- set.seed(1001) n <- 180 dat <- data.frame( time = rexp(n, rate = 0.35) + 0.05, status = rbinom(n, size = 1, prob = 0.6), age = rnorm(n, mean = 62, sd = 11), nyha = sample(1:4, n, replace = TRUE), shock = rbinom(n, size = 1, prob = 0.18) ) fit2 <- hazard( survival::Surv(time, status) ~ age + nyha + shock, data = dat, theta = c(mu = 0.25, nu = 1.10, beta1 = 0, beta2 = 0, beta3 = 0), dist = "weibull", fit = TRUE, control = list(maxit = 300) ) summary(fit2) # -- Parametric survival with Kaplan-Meier overlay ----------------- if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) # Parametric curve on a fine grid at median covariate profile t_grid <- seq(0.05, max(dat$time), length.out = 80) curve_df <- data.frame( time = t_grid, age = median(dat$age), nyha = 2, shock = 0 ) curve_df$survival <- predict(fit2, newdata = curve_df, type = "survival") * 100 # Kaplan-Meier empirical overlay km <- survival::survfit(survival::Surv(time, status) ~ 1, data = dat) km_df <- data.frame(time = km$time, survival = km$surv * 100) ggplot() + geom_step(data = km_df, aes(time, survival, colour = "Kaplan-Meier")) + geom_line(data = curve_df, aes(time, survival, colour = "Parametric (Weibull)")) + scale_colour_manual( values = c("Parametric (Weibull)" = "#0072B2", "Kaplan-Meier" = "#D55E00") ) + scale_y_continuous(limits = c(0, 100)) + labs(x = "Months after surgery", y = "Freedom from death (%)", colour = NULL) + theme_minimal() + theme(legend.position = "bottom") } # -- Multiphase model (two phases) --------------------------------- fit_mp <- hazard( survival::Surv(time, status) ~ 1, data = dat, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes"), late = hzr_phase("cdf", t_half = 5, nu = 1, m = 0, fixed = "shapes") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) summary(fit_mp) # -- Per-phase decomposed cumulative hazard ------------------------ if (requireNamespace("ggplot2", quietly = TRUE)) { t_grid <- seq(0.01, max(dat$time), length.out = 100) decomp <- predict(fit_mp, newdata = data.frame(time = t_grid), type = "cumulative_hazard", decompose = TRUE) df_long <- data.frame( time = rep(decomp$time, 3), cumhaz = c(decomp$total, decomp$early, decomp$late), component = rep(c("Total", "Early (cdf)", "Late (cdf)"), each = nrow(decomp)) ) df_long$component <- factor(df_long$component, levels = c("Total", "Early (cdf)", "Late (cdf)")) ggplot2::ggplot(df_long, ggplot2::aes(x = time, y = cumhaz, colour = component, linewidth = component)) + ggplot2::geom_line() + ggplot2::scale_colour_manual(values = c( "Total" = "black", "Early (cdf)" = "#0072B2", "Late (cdf)" = "#D55E00" )) + ggplot2::scale_linewidth_manual(values = c( "Total" = 1.2, "Early (cdf)" = 0.6, "Late (cdf)" = 0.6 )) + ggplot2::labs( x = "Time", y = "Cumulative hazard H(t)", colour = NULL, linewidth = NULL, title = "Multiphase decomposition: early + late" ) + ggplot2::theme_minimal() + ggplot2::theme(legend.position = "bottom") }# -- Univariable Weibull ---------------------------------------------- set.seed(1) time <- rexp(50, rate = 0.3) status <- sample(0:1, 50, replace = TRUE, prob = c(0.3, 0.7)) fit <- hazard(time = time, status = status, theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) summary(fit) # -- Formula interface with covariates -------------------------------- set.seed(1001) n <- 180 dat <- data.frame( time = rexp(n, rate = 0.35) + 0.05, status = rbinom(n, size = 1, prob = 0.6), age = rnorm(n, mean = 62, sd = 11), nyha = sample(1:4, n, replace = TRUE), shock = rbinom(n, size = 1, prob = 0.18) ) fit2 <- hazard( survival::Surv(time, status) ~ age + nyha + shock, data = dat, theta = c(mu = 0.25, nu = 1.10, beta1 = 0, beta2 = 0, beta3 = 0), dist = "weibull", fit = TRUE, control = list(maxit = 300) ) summary(fit2) # -- Parametric survival with Kaplan-Meier overlay ----------------- if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) # Parametric curve on a fine grid at median covariate profile t_grid <- seq(0.05, max(dat$time), length.out = 80) curve_df <- data.frame( time = t_grid, age = median(dat$age), nyha = 2, shock = 0 ) curve_df$survival <- predict(fit2, newdata = curve_df, type = "survival") * 100 # Kaplan-Meier empirical overlay km <- survival::survfit(survival::Surv(time, status) ~ 1, data = dat) km_df <- data.frame(time = km$time, survival = km$surv * 100) ggplot() + geom_step(data = km_df, aes(time, survival, colour = "Kaplan-Meier")) + geom_line(data = curve_df, aes(time, survival, colour = "Parametric (Weibull)")) + scale_colour_manual( values = c("Parametric (Weibull)" = "#0072B2", "Kaplan-Meier" = "#D55E00") ) + scale_y_continuous(limits = c(0, 100)) + labs(x = "Months after surgery", y = "Freedom from death (%)", colour = NULL) + theme_minimal() + theme(legend.position = "bottom") } # -- Multiphase model (two phases) --------------------------------- fit_mp <- hazard( survival::Surv(time, status) ~ 1, data = dat, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes"), late = hzr_phase("cdf", t_half = 5, nu = 1, m = 0, fixed = "shapes") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) summary(fit_mp) # -- Per-phase decomposed cumulative hazard ------------------------ if (requireNamespace("ggplot2", quietly = TRUE)) { t_grid <- seq(0.01, max(dat$time), length.out = 100) decomp <- predict(fit_mp, newdata = data.frame(time = t_grid), type = "cumulative_hazard", decompose = TRUE) df_long <- data.frame( time = rep(decomp$time, 3), cumhaz = c(decomp$total, decomp$early, decomp$late), component = rep(c("Total", "Early (cdf)", "Late (cdf)"), each = nrow(decomp)) ) df_long$component <- factor(df_long$component, levels = c("Total", "Early (cdf)", "Late (cdf)")) ggplot2::ggplot(df_long, ggplot2::aes(x = time, y = cumhaz, colour = component, linewidth = component)) + ggplot2::geom_line() + ggplot2::scale_colour_manual(values = c( "Total" = "black", "Early (cdf)" = "#0072B2", "Late (cdf)" = "#D55E00" )) + ggplot2::scale_linewidth_manual(values = c( "Total" = 1.2, "Early (cdf)" = 0.6, "Late (cdf)" = 0.6 )) + ggplot2::labs( x = "Time", y = "Cumulative hazard H(t)", colour = NULL, linewidth = NULL, title = "Multiphase decomposition: early + late" ) + ggplot2::theme_minimal() + ggplot2::theme(legend.position = "bottom") }
Returns a formal mapping table that defines how legacy SAS HAZARD/C-style
inputs map to hazard(...) arguments in this package.
hzr_argument_mapping(include_planned = TRUE)hzr_argument_mapping(include_planned = TRUE)
include_planned |
Logical; if |
A data frame with one row per mapping rule.
hazard() for the target arguments, hzr_phase() for phase
construction, and hzr_decompos()/hzr_decompos_g3() for the
parameter-level early/late translations.
hzr_argument_mapping() hzr_argument_mapping(include_planned = FALSE)hzr_argument_mapping() hzr_argument_mapping(include_planned = FALSE)
Resample data with replacement, refit the hazard model on each
replicate, and accumulate coefficient distributions. Returns a tidy
data frame of per-replicate estimates with summary statistics.
This is the R equivalent of the SAS bootstrap.hazard.sas macro.
hzr_bootstrap( object, n_boot = 200L, fraction = 1, seed = NULL, verbose = FALSE ) ## S3 method for class 'hzr_bootstrap' print(x, digits = 4, ...)hzr_bootstrap( object, n_boot = 200L, fraction = 1, seed = NULL, verbose = FALSE ) ## S3 method for class 'hzr_bootstrap' print(x, digits = 4, ...)
object |
A fitted |
n_boot |
Integer: number of bootstrap replicates (default 200). |
fraction |
Numeric in (0, 1]: fraction of data to sample per replicate (default 1.0 for full bootstrap; < 1 for bagging). |
seed |
Optional integer random seed for reproducibility. When
supplied, |
verbose |
Logical; if |
x |
An |
digits |
Number of decimal places for formatting. |
... |
Additional arguments (ignored). |
A list with class "hzr_bootstrap" containing:
Data frame with columns replicate, parameter,
and estimate – one row per parameter per successful replicate.
Data frame with columns parameter, n, pct,
mean, sd, min, max, ci_lower, ci_upper – one row per
parameter.
Number of successfully converged replicates.
Number of replicates that failed to converge.
hazard() for model fitting, vcov.hazard() for
Hessian-based standard errors.
data(avc) avc <- na.omit(avc) fit <- hazard( survival::Surv(int_dead, dead) ~ age + mal, data = avc, dist = "weibull", theta = c(mu = 0.01, nu = 0.5, 0, 0), fit = TRUE ) bs <- hzr_bootstrap(fit, n_boot = 50, seed = 123) print(bs)data(avc) avc <- na.omit(avc) fit <- hazard( survival::Surv(int_dead, dead) ~ age + mal, data = avc, dist = "weibull", theta = c(mu = 0.01, nu = 0.5, 0, 0), fit = TRUE ) bs <- hzr_bootstrap(fit, n_boot = 50, seed = 123) print(bs)
Group a continuous covariate into quantile bins, compute the event
probability (or hazard rate) per bin, and apply a link transform
(logit, Gompertz, or Cox).
This is the R equivalent of the SAS logit.sas and logitgr.sas
macros.
hzr_calibrate( x, event, groups = 10L, by = NULL, link = c("logit", "gompertz", "cox"), time = NULL )hzr_calibrate( x, event, groups = 10L, by = NULL, link = c("logit", "gompertz", "cox"), time = NULL )
x |
Numeric vector: the continuous covariate to calibrate. |
event |
Numeric vector: event indicator (1 = event, 0 = no event). |
groups |
Integer: number of quantile bins (default 10). |
by |
Optional factor or character vector for stratified calibration
(SAS |
link |
Character: transform to apply to event probabilities.
One of |
time |
Optional numeric vector: follow-up time, required when
|
Use this function before model entry to assess whether the covariate relationship with the outcome is approximately linear on the link scale. If the transformed probabilities are roughly linear against the group means, the covariate can enter the model untransformed. Curvature suggests a transformation (log, quadratic) may improve fit.
A data frame with one row per group (or per group-by-stratum combination) and columns:
Integer group label.
Stratum level (only present when by is provided).
Number of observations in the group.
Number of events.
Mean of x within the group.
Minimum of x within the group.
Maximum of x within the group.
Event probability (events / n), or for Cox link: events / sum(time).
Transformed probability on the chosen link scale.
hzr_deciles() for model-based calibration after fitting.
data(avc) avc <- na.omit(avc) # Logit calibration of age cal <- hzr_calibrate(avc$age, avc$dead, groups = 10) print(cal) if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) ggplot(cal, aes(mean, link_value)) + geom_point(size = 3) + geom_smooth(method = "lm", formula = y ~ x, se = FALSE, linetype = "dashed") + labs(x = "Age at repair (months)", y = "Logit(P(death))") + theme_minimal() }data(avc) avc <- na.omit(avc) # Logit calibration of age cal <- hzr_calibrate(avc$age, avc$dead, groups = 10) print(cal) if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) ggplot(cal, aes(mean, link_value)) + geom_point(size = 3) + geom_smooth(method = "lm", formula = y ~ x, se = FALSE, linetype = "dashed") + labs(x = "Age at repair (months)", y = "Logit(P(death))") + theme_minimal() }
Bounds a probability vector to so that
downstream or evaluations stay finite. Used
throughout the likelihood and prediction code to guard survival and CDF
values against exact 0 or 1.
hzr_clamp_prob(p, eps = 1e-12)hzr_clamp_prob(p, eps = 1e-12)
p |
Numeric vector of probabilities. |
eps |
Small positive tolerance in |
Numeric vector bounded to [eps, 1 - eps].
hzr_log1pexp() and hzr_log1mexp() for the companion
stable-logarithm primitives.
hzr_clamp_prob(c(0, 0.5, 1))hzr_clamp_prob(c(0, 0.5, 1))
Compute cumulative incidence functions for multiple competing event
types using the Aalen-Johansen estimator with Greenwood variance.
This is the R equivalent of the SAS markov.sas macro.
hzr_competing_risks(time, event) ## S3 method for class 'hzr_competing_risks' print(x, digits = 4, ...)hzr_competing_risks(time, event) ## S3 method for class 'hzr_competing_risks' print(x, digits = 4, ...)
time |
Numeric vector of follow-up times. |
event |
Integer vector of event type indicators: 0 = censored, 1 = event type 1, 2 = event type 2, etc. |
x |
An |
digits |
Number of decimal places for formatting. |
... |
Additional arguments (ignored). |
Unlike the naive 1 - KM estimator (which overestimates incidence when competing risks exist), this provides the correct marginal cumulative incidence for each event type.
A data frame with one row per unique event time and columns:
Event time.
Number at risk.
Events of each type at this time.
Number censored at this time.
Overall event-free survival (freedom from all events).
Cumulative incidence for each event type.
Standard error of overall survival.
Standard error of each cumulative incidence.
This estimator is unweighted: every observation contributes a
unit count to the at-risk and event tallies. There is no weights
argument, so case or inverse-probability weights are not yet supported
for competing-risks incidence (unlike hazard(), which accepts
weights). Pre-expand weighted rows to individual records if an
approximate weighted estimate is needed.
Aalen O, Johansen S (1978). An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scand J Statist 5(3):141–150.
Kalbfleisch JD, Prentice RL (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.
hzr_kaplan() for single-event survival estimation.
data(valves) valves_cc <- na.omit(valves) # Combine death and PVE into a competing risks event variable # 0 = censored, 1 = death, 2 = PVE event_cr <- ifelse(valves_cc$dead == 1, 1L, ifelse(valves_cc$pve == 1, 2L, 0L)) time_cr <- pmin(valves_cc$int_dead, valves_cc$int_pve) cr <- hzr_competing_risks(time_cr, event_cr) head(cr)data(valves) valves_cc <- na.omit(valves) # Combine death and PVE into a competing risks event variable # 0 = censored, 1 = death, 2 = PVE event_cr <- ifelse(valves_cc$dead == 1, 1L, ifelse(valves_cc$pve == 1, 2L, 0L)) time_cr <- pmin(valves_cc$int_dead, valves_cc$int_pve) cr <- hzr_competing_risks(time_cr, event_cr) head(cr)
Partition observations into groups (default 10) by predicted risk and compare observed vs. expected event counts in each group. Good calibration means the two track each other across the risk spectrum.
hzr_deciles(object, time, groups = 10L, status = NULL, event_time = NULL)hzr_deciles(object, time, groups = 10L, status = NULL, event_time = NULL)
object |
A fitted |
time |
Numeric scalar: the horizon at which predicted survival is used
to rank subjects into risk groups (e.g. |
groups |
Integer: number of risk groups (default 10 for deciles). |
status |
Optional numeric vector of event indicators (1 = event,
0 = censored).
If |
event_time |
Optional numeric vector of observed event/censoring
times. If |
This reproduces the SAS deciles.hazard.sas macro. All subjects are
ranked by predicted survival at the horizon time and split into equal-sized
risk groups. Within each group the expected event count is the sum of
each subject's predicted cumulative hazard at its own follow-up time, and
the observed count is its number of events; under conservation of events
the group totals sum to the total observed events. The horizon therefore only
stratifies subjects into risk groups – it does not restrict or exclude any
subject, and the expected/observed totals are independent of it.
A data frame with one row per risk group and columns:
Integer group label (1 = lowest risk, ranked by predicted
survival at time).
Number of observations in the group.
Observed event count in the group (all events over follow-up).
Expected event count: the sum of each subject's predicted cumulative hazard at its own follow-up time.
Observed event rate (events / n).
Expected event rate (expected / n).
Chi-square contribution: (events - expected)^2 / expected.
Upper-tail p-value from the chi-square test for this group (1 df).
Mean predicted survival probability at the horizon in the group.
Mean predicted cumulative hazard at follow-up in the group.
An attribute "overall" is attached with the overall chi-square
statistic, degrees of freedom, and p-value.
predict.hazard() for the prediction types used internally.
data(avc) avc <- na.omit(avc) fit <- hazard( survival::Surv(int_dead, dead) ~ age + mal, data = avc, dist = "weibull", theta = c(mu = 0.01, nu = 0.5, beta_age = 0, beta_mal = 0), fit = TRUE ) cal <- hzr_deciles(fit, time = 120) print(cal)data(avc) avc <- na.omit(avc) fit <- hazard( survival::Surv(int_dead, dead) ~ age + mal, data = avc, dist = "weibull", theta = c(mu = 0.01, nu = 0.5, beta_age = 0, beta_mal = 0), fit = TRUE ) cal <- hzr_deciles(fit, time = 120) print(cal)
Computes the cumulative distribution , density , and
hazard for the parametric family defined by
half-life, time exponent, and shape. This single function generates all
temporal phase shapes used in multiphase hazard models.
hzr_decompos(time, t_half, nu, m)hzr_decompos(time, t_half, nu, m)
time |
Numeric vector of times (must be > 0). |
t_half |
Half-life: time at which |
nu |
Time exponent controlling rate dynamics.
SAS early: |
m |
Shape exponent controlling the distributional form.
SAS early: |
A named list with three numeric vectors, each the same length
as time:
Cumulative distribution .
Density . The "early" phase
temporal pattern.
Hazard . The "late"
phase temporal pattern.
The original C code used separate parameterizations for early (DELTA,
RHO/THALF, NU, M) and late (TAU, GAMMA, ALPHA, ETA) phases. Both
collapse onto the three parameters here. See
hzr_argument_mapping() for the full translation table.
Six cases are defined by the signs of nu and m:
| Case | Sign | Behavior |
| 1 | m > 0, nu > 0 | Standard sigmoidal |
| 1L | m = 0, nu > 0 | Exponential-like (Weibull CDF) |
| 2 | m < 0, nu > 0 | Heavy-tailed |
| 2L | m < 0, nu = 0 | Exponential decay |
| 3 | m > 0, nu < 0 | Bounded cumulative |
| 3L | m = 0, nu < 0 | Bounded exponential |
The combination m < 0 and nu < 0 is undefined and raises an error.
nu = 0 is supported only with m < 0 (Case 2L, the exponential-decay
limit); nu = 0 with m >= 0 has no usable limiting form and raises an
error.
The construction fixes a rate so that
exactly. For the base case ():
The CDF and density are then
and the hazard is . The remaining five cases
in the table above arise as limits (, ) or sign
reflections of this base form; the implementation dispatches to the
appropriate branch after inspecting the signs of nu and m. See
vignette("mf-mathematical-foundations") for the full derivation of every
case.
Blackstone EH, Naftel DC, Turner ME Jr. The decomposition of time-varying hazard into phases, each incorporating a separate stream of concomitant information. J Am Stat Assoc. 1986;81(395):615–624. doi:10.1080/01621459.1986.10478314
Rajeswaran J, Blackstone EH, Ehrlinger J, Li L, Ishwaran H, Parides MK. Probability of atrial fibrillation after ablation: Using a parametric nonlinear temporal decomposition mixed effects model. Stat Methods Med Res. 2018;27(1):126–141. doi:10.1177/0962280215623583
hzr_phase_cumhaz() for the phase-level cumulative hazard
contribution, hzr_argument_mapping() for SAS/C parameter mapping,
hzr_phase() for specifying phases in hazard() models.
vignette("mf-mathematical-foundations") for the full derivation.
t_grid <- seq(0.1, 10, by = 0.1) # Case 1: standard sigmoidal (m > 0, nu > 0) d1 <- hzr_decompos(t_grid, t_half = 3, nu = 2, m = 1) plot(t_grid, d1$G, type = "l", main = "CDF (m=1, nu=2)") # Case 1L: Weibull-like (m = 0, nu > 0) d1L <- hzr_decompos(t_grid, t_half = 3, nu = 2, m = 0) # Case 2: heavy-tailed (m < 0, nu > 0) d2 <- hzr_decompos(t_grid, t_half = 3, nu = 2, m = -1)t_grid <- seq(0.1, 10, by = 0.1) # Case 1: standard sigmoidal (m > 0, nu > 0) d1 <- hzr_decompos(t_grid, t_half = 3, nu = 2, m = 1) plot(t_grid, d1$G, type = "l", main = "CDF (m=1, nu=2)") # Case 1L: Weibull-like (m = 0, nu > 0) d1L <- hzr_decompos(t_grid, t_half = 3, nu = 2, m = 0) # Case 2: heavy-tailed (m < 0, nu > 0) d2 <- hzr_decompos(t_grid, t_half = 3, nu = 2, m = -1)
Computes the cumulative intensity and its derivative
for the late-phase parametric family used in the
original Blackstone C/SAS HAZARD code. Unlike hzr_decompos() (which
computes the early-phase G1 – a bounded CDF), this function can produce
unbounded values, making it suitable for modelling increasing late risk.
hzr_decompos_g3(time, tau, gamma, alpha, eta)hzr_decompos_g3(time, tau, gamma, alpha, eta)
time |
Numeric vector of times (must be > 0). |
tau |
Positive scalar scale parameter. |
gamma |
Positive scalar time exponent. |
alpha |
Non-negative scalar shape parameter (0 selects limiting case). |
eta |
Positive scalar outer exponent. |
A named list with two numeric vectors, each the same length
as time:
Cumulative intensity (may exceed 1).
Derivative .
When :
When (limiting exponential case):
| SAS name | R argument | Role |
| TAU | tau |
Scale (time at which ) |
| GAMMA | gamma |
Power exponent on |
| ALPHA | alpha |
Shape (0 = exponential limiting case) |
| ETA | eta |
Outer power exponent |
Blackstone EH, Naftel DC, Turner ME Jr. The decomposition of time-varying hazard into phases, each incorporating a separate stream of concomitant information. J Am Stat Assoc. 1986;81(395):615–624. doi:10.1080/01621459.1986.10478314
hzr_decompos() for the early-phase (G1) decomposition,
hzr_phase_cumhaz() for phase-level cumulative hazard helpers.
t_grid <- seq(0.1, 10, by = 0.1) # Weibull-like: alpha = 1 gives G3(t) = (t/tau)^(gamma*eta) d <- hzr_decompos_g3(t_grid, tau = 1, gamma = 3, alpha = 1, eta = 1) plot(t_grid, d$G3, type = "l", main = "G3: power law (gamma=3)") # General case with alpha > 0 d2 <- hzr_decompos_g3(t_grid, tau = 2, gamma = 2, alpha = 0.5, eta = 1)t_grid <- seq(0.1, 10, by = 0.1) # Weibull-like: alpha = 1 gives G3(t) = (t/tau)^(gamma*eta) d <- hzr_decompos_g3(t_grid, tau = 1, gamma = 3, alpha = 1, eta = 1) plot(t_grid, d$G3, type = "l", main = "G3: power law (gamma=3)") # General case with alpha > 0 d2 <- hzr_decompos_g3(t_grid, tau = 2, gamma = 2, alpha = 0.5, eta = 1)
Compare a fitted hazard model against the nonparametric Kaplan-Meier
estimate by computing observed and expected (parametric) event counts
at each distinct event time. This is the R equivalent of the SAS
hazplot.sas macro and implements the conservation-of-events
diagnostic.
hzr_gof(object, time_grid = NULL)hzr_gof(object, time_grid = NULL)
object |
A fitted |
time_grid |
Optional numeric vector of time points at which to
evaluate the parametric model.
If |
At each observed event time the function computes:
The Kaplan-Meier survival and cumulative hazard.
The parametric survival and cumulative hazard from the fitted model (and per-phase components for multiphase models).
Cumulative observed events vs. cumulative expected events (sum of individual cumulative hazards for those exiting the risk set at each time).
The running residual (expected minus observed).
Perfect model fit implies the expected and observed event counts track each other (residual near zero). This is the conservation-of-events principle.
A data frame with one row per time point and columns:
Evaluation time.
Number at risk (Kaplan-Meier).
Number of events at this time.
Number censored at this time.
Kaplan-Meier survival estimate.
Kaplan-Meier cumulative hazard
().
Parametric survival from the fitted model.
Parametric cumulative hazard.
Cumulative observed events to this time.
Cumulative expected events (sum of individual cumulative hazards for observations exiting the risk set).
Expected minus observed
(cum_expected - cum_observed).
For multiphase models, additional columns are appended for each
phase: par_cumhaz_<phase>.
An attribute "summary" is attached with scalar diagnostics:
total observed events, total expected events, and the final residual.
hzr_deciles() for decile-of-risk calibration,
predict.hazard() for prediction types.
data(avc) avc <- na.omit(avc) fit <- hazard( survival::Surv(int_dead, dead) ~ age + mal, data = avc, dist = "weibull", theta = c(mu = 0.01, nu = 0.5, beta_age = 0, beta_mal = 0), fit = TRUE ) gof <- hzr_gof(fit) print(gof) # Plot observed vs expected events if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) ggplot(gof, aes(x = time)) + geom_line(aes(y = cum_observed), colour = "#D55E00") + geom_line(aes(y = cum_expected), colour = "#0072B2") + labs(x = "Time", y = "Cumulative events") + theme_minimal() }data(avc) avc <- na.omit(avc) fit <- hazard( survival::Surv(int_dead, dead) ~ age + mal, data = avc, dist = "weibull", theta = c(mu = 0.01, nu = 0.5, beta_age = 0, beta_mal = 0), fit = TRUE ) gof <- hzr_gof(fit) print(gof) # Plot observed vs expected events if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) ggplot(gof, aes(x = time)) + geom_line(aes(y = cum_observed), colour = "#D55E00") + geom_line(aes(y = cum_expected), colour = "#0072B2") + labs(x = "Time", y = "Cumulative events") + theme_minimal() }
Compute the product-limit (Kaplan-Meier) survival estimate with
logit-transformed confidence limits that respect the
boundary.
This is the R equivalent of the SAS kaplan.sas macro.
hzr_kaplan(time, status, conf_level = 0.95, event_only = TRUE)hzr_kaplan(time, status, conf_level = 0.95, event_only = TRUE)
time |
Numeric vector of follow-up times. |
status |
Numeric event indicator (1 = event, 0 = censored). |
conf_level |
Confidence level for the interval (default 0.95). The SAS default of 0.68268948 corresponds to a 1-SD interval. |
event_only |
Logical; if |
The standard Greenwood confidence interval can exceed in the
tails. The logit-transformed interval avoids this by working on the
log-odds scale:
where , is the
cumulative Greenwood variance product, and is the
normal quantile for the requested confidence level.
A data frame with one row per time point and columns:
Event/censoring time.
Number at risk at start of interval.
Number of events at this time.
Number censored at this time.
Kaplan-Meier survival estimate.
Standard error of survival (Greenwood).
Lower confidence limit (logit-transformed).
Upper confidence limit (logit-transformed).
Cumulative hazard .
Interval hazard rate
.
Probability density estimate
.
Restricted mean survival time (area under curve to this time).
Kaplan EL, Meier P (1958). Nonparametric estimation from incomplete observations. J Am Stat Assoc 53(282):457–481. doi:10.1080/01621459.1958.10501452
Greenwood M (1926). The natural duration of cancer. Reports on Public Health and Medical Subjects 33:1–26.
hzr_gof() for parametric vs. nonparametric comparison.
data(cabgkul) km <- hzr_kaplan(cabgkul$int_dead, cabgkul$dead) head(km) if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) ggplot(km, aes(time)) + geom_step(aes(y = survival * 100)) + geom_ribbon(aes(ymin = cl_lower * 100, ymax = cl_upper * 100), stat = "identity", alpha = 0.2) + labs(x = "Months", y = "Survival (%)") + theme_minimal() }data(cabgkul) km <- hzr_kaplan(cabgkul$int_dead, cabgkul$dead) head(km) if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) ggplot(km, aes(time)) + geom_step(aes(y = survival * 100)) + geom_ribbon(aes(ymin = cl_lower * 100, ymax = cl_upper * 100), stat = "identity", alpha = 0.2) + labs(x = "Months", y = "Survival (%)") + theme_minimal() }
Computes
accurately across the full range. Following Mächler (2012), the evaluation
switches at : for small it uses
(avoiding cancellation as ), and
for larger it uses (avoiding loss of
precision as ). Non-positive or non-finite inputs return
NA.
hzr_log1mexp(x)hzr_log1mexp(x)
x |
Numeric vector with positive values. |
Numeric vector with element-wise .
Mächler M (2012). Accurately Computing . R package
Rmpfr vignette.
https://CRAN.R-project.org/package=Rmpfr/vignettes/log1mexp-note.pdf
hzr_log1pexp() for the softplus ,
hzr_clamp_prob() for boundary-safe probabilities.
hzr_log1mexp(c(0.01, 0.5, 5))hzr_log1mexp(c(0.01, 0.5, 5))
Computes the "softplus" function
without overflow. A naive log(1 + exp(x)) overflows to Inf for
even though the true value is . The
identity for keeps
every intermediate finite.
hzr_log1pexp(x)hzr_log1pexp(x)
x |
Numeric vector. |
Numeric vector with element-wise .
Mächler M (2012). Accurately Computing . R package
Rmpfr vignette.
https://CRAN.R-project.org/package=Rmpfr/vignettes/log1mexp-note.pdf
hzr_log1mexp() for the complementary ,
hzr_clamp_prob() for boundary-safe probabilities.
hzr_log1pexp(c(-50, 0, 50))hzr_log1pexp(c(-50, 0, 50))
Compute the Nelson-Aalen cumulative hazard estimate with lognormal
confidence limits. Supports weighted events for severity-adjusted
analyses of repeated/recurrent events.
This is the R equivalent of the SAS nelsonl.sas macro.
hzr_nelson(time, event, weight = NULL, conf_level = 0.95) ## S3 method for class 'hzr_nelson' print(x, digits = 4, ...)hzr_nelson(time, event, weight = NULL, conf_level = 0.95) ## S3 method for class 'hzr_nelson' print(x, digits = 4, ...)
time |
Numeric vector of follow-up times. |
event |
Numeric event indicator (1 = event, 0 = censored). |
weight |
Optional numeric vector of event weights (default 1). Weights are applied only to events (censored observations contribute zero weight). Use for severity-weighted repeated events. |
conf_level |
Confidence level for the interval (default 0.95). |
x |
An |
digits |
Number of decimal places for formatting. |
... |
Additional arguments (ignored). |
Unlike survival::survfit() which uses the Breslow estimator with
Greenwood variance, this function uses the Wayne Nelson estimator
with lognormal confidence limits that are always non-negative.
A data frame with one row per unique event time and columns:
Event time.
Number at risk.
Number of events at this time.
Sum of event weights at this time.
Nelson cumulative hazard estimate.
Standard error.
Lower lognormal confidence limit.
Upper lognormal confidence limit.
Interval hazard rate.
Cumulative (weighted) event count.
Nelson W (1972). Theory and applications of hazard plotting for censored failure data. Technometrics 14(4):945–966. doi:10.1080/00401706.1972.10488991
Aalen O (1978). Nonparametric inference for a family of counting processes. Ann Statist 6(4):701–726. doi:10.1214/aos/1176344247
hzr_kaplan() for survival estimation.
data(cabgkul) nel <- hzr_nelson(cabgkul$int_dead, cabgkul$dead) head(nel)data(cabgkul) nel <- hzr_nelson(cabgkul$int_dead, cabgkul$dead) head(nel)
Creates an hzr_phase object describing one term in a multiphase additive
cumulative hazard model. Pass a list of these to the phases argument of
hazard() when dist = "multiphase".
hzr_phase( type = c("cdf", "hazard", "constant", "g3"), t_half = 1, nu = 1, m = 0, tau = 1, gamma = 1, alpha = 1, eta = 1, formula = NULL, fixed = character(0) ) ## S3 method for class 'hzr_phase' print(x, ...)hzr_phase( type = c("cdf", "hazard", "constant", "g3"), t_half = 1, nu = 1, m = 0, tau = 1, gamma = 1, alpha = 1, eta = 1, formula = NULL, fixed = character(0) ) ## S3 method for class 'hzr_phase' print(x, ...)
type |
Character; the phase's temporal shape — one of |
t_half |
Positive scalar; initial half-life (time at which
|
nu |
Numeric scalar; initial time exponent. Used for |
m |
Numeric scalar; initial shape exponent. Used for |
tau |
Positive scalar; scale parameter for |
gamma |
Positive scalar; time exponent for |
alpha |
Non-negative scalar; shape parameter for |
eta |
Positive scalar; outer exponent for |
formula |
Optional one-sided formula (e.g. |
fixed |
Character vector naming shape parameters to hold fixed during
optimization. Valid names for |
x |
An |
... |
Additional arguments (ignored). |
An S3 object of class "hzr_phase" with elements:
Phase type string.
Initial half-life (cdf/hazard phases).
Initial time exponent (cdf/hazard phases).
Initial shape exponent (cdf/hazard phases).
Scale parameter (g3 phases).
Time exponent (g3 phases).
Shape parameter (g3 phases).
Outer exponent (g3 phases).
Phase-specific formula or NULL.
Character vector of fixed parameter names (may be empty).
Each phase is one term in the additive cumulative hazard
where is the phase-specific log-linear scale and
is the temporal shape selected by type (below). The
t_half/nu/m (or g3 tau/gamma/alpha/eta) arguments set the
starting values for that shape; formula attaches the covariates
that enter .
The type argument chooses the temporal shape for the phase.
Each captures a qualitatively different pattern of risk over time; a typical
clinical model combines an early, a constant, and a late phase so that
the total hazard can fall, level off, and rise again.
"cdf" — early, resolving riskNamed for the cumulative
distribution function: the phase contributes ,
the bounded CDF of the temporal decomposition ( at ,
rising to a ceiling of ). Because it saturates, the hazard it
adds, , peaks early and then decays toward zero — the
signature of a one-time insult that patients either succumb to or survive
past, e.g. peri-operative mortality. Shape set by t_half, nu, m.
SAS/C equivalent: the Early (G1) phase.
"hazard" — accumulating aging risk (G1 family)Named because the
phase contributes a cumulative hazard built from the same G1 family:
, which is unbounded and monotone
increasing. Its hazard rises without leveling off, so it
models risk that grows as subjects age. This is an alternative late-risk
form derived from G1; for the original SAS/C late phase prefer "g3".
Shape set by t_half, nu, m.
"g3" — late, rising risk (original C/SAS late phase)Named for
the G3 (third) decomposition family used by the original HAZARD
program for the late phase. It contributes from
hzr_decompos_g3(), an unbounded intensity with its own four-parameter
shape (tau, gamma, alpha, eta) that is more flexible than the
G1-derived "hazard" form for capturing accelerating late mortality
(e.g. structural valve deterioration years after surgery). Use this when
reproducing classic three-phase HAZARD models. SAS/C equivalent: the
Late (G3) phase.
"constant" — flat background rateA time-invariant hazard:
, so the added hazard is constant (the
exponential model). It represents the steady, ongoing risk present at all
follow-up times, independent of how long ago the time origin was. Takes
no shape parameters — only its scale (and any covariates) is
estimated. SAS/C equivalent: the Constant (G2) phase.
The shape derivative (which forms the
instantaneous hazard contribution ) is for
"cdf", for "hazard", for "g3", and for
"constant".
hazard() for fitting multiphase models,
hzr_decompos() for the underlying parametric family,
hzr_phase_cumhaz() and hzr_phase_hazard() for computing
and from these specifications.
vignette("fitting-hazard-models") for multiphase fitting examples,
vignette("mf-mathematical-foundations") for the mathematical framework.
# Classic 3-phase Blackstone pattern early <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0) const <- hzr_phase("constant") late <- hzr_phase("g3", tau = 1, gamma = 3, alpha = 1, eta = 1) # Fix all shapes (C/SAS-style: only estimate mu) early_fixed <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes") late_fixed <- hzr_phase("g3", tau = 1, gamma = 3, alpha = 1, eta = 1, fixed = "shapes") # Fix only some parameters early_partial <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = c("nu", "m")) # Phase with specific covariates early_cov <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, formula = ~ age + shock) # Use in hazard(): # hazard(Surv(time, status) ~ age, data = dat, # dist = "multiphase", # phases = list(early = early, constant = const, late = late))# Classic 3-phase Blackstone pattern early <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0) const <- hzr_phase("constant") late <- hzr_phase("g3", tau = 1, gamma = 3, alpha = 1, eta = 1) # Fix all shapes (C/SAS-style: only estimate mu) early_fixed <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes") late_fixed <- hzr_phase("g3", tau = 1, gamma = 3, alpha = 1, eta = 1, fixed = "shapes") # Fix only some parameters early_partial <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = c("nu", "m")) # Phase with specific covariates early_cov <- hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, formula = ~ age + shock) # Use in hazard(): # hazard(Surv(time, status) ~ age, data = dat, # dist = "multiphase", # phases = list(early = early, constant = const, late = late))
Computes for one phase in the additive model
.
hzr_phase_cumhaz( time, t_half = 1, nu = 1, m = 0, type = c("cdf", "hazard", "constant") )hzr_phase_cumhaz( time, t_half = 1, nu = 1, m = 0, type = c("cdf", "hazard", "constant") )
time |
Numeric vector of times (> 0). |
t_half |
Half-life parameter (> 0). |
nu |
Time exponent. |
m |
Shape parameter. |
type |
Phase type: |
"cdf": . Bounded . Models early
risk that resolves over time.
"hazard": . Monotone increasing.
Models late or aging risk. This is the cumulative hazard derived from
the hazard function , since
.
"constant": . Ignores t_half, nu, m.
Equivalent to exponential (constant hazard rate).
Numeric vector of cumulative hazard contributions ,
same length as time.
hzr_decompos() for the underlying parametric family,
hzr_phase_hazard() for the instantaneous hazard contribution.
t_grid <- seq(0.1, 10, by = 0.1) phi_early <- hzr_phase_cumhaz(t_grid, t_half = 2, nu = 2, m = 0, type = "cdf") phi_late <- hzr_phase_cumhaz(t_grid, t_half = 5, nu = 1, m = 0, type = "hazard") phi_const <- hzr_phase_cumhaz(t_grid, type = "constant")t_grid <- seq(0.1, 10, by = 0.1) phi_early <- hzr_phase_cumhaz(t_grid, t_half = 2, nu = 2, m = 0, type = "cdf") phi_late <- hzr_phase_cumhaz(t_grid, t_half = 5, nu = 1, m = 0, type = "hazard") phi_const <- hzr_phase_cumhaz(t_grid, type = "constant")
Computes for one phase – the derivative of
the cumulative hazard contribution returned by hzr_phase_cumhaz().
hzr_phase_hazard( time, t_half = 1, nu = 1, m = 0, type = c("cdf", "hazard", "constant") )hzr_phase_hazard( time, t_half = 1, nu = 1, m = 0, type = c("cdf", "hazard", "constant") )
time |
Numeric vector of times (> 0). |
t_half |
Half-life parameter (> 0). |
nu |
Time exponent. |
m |
Shape parameter. |
type |
Phase type: |
"cdf": (density).
"hazard": .
"constant": .
Numeric vector of instantaneous hazard contributions ,
same length as time.
hzr_decompos() for the underlying parametric family,
hzr_phase_cumhaz() for the cumulative version.
t_grid <- seq(0.1, 10, by = 0.1) phi_early <- hzr_phase_hazard(t_grid, t_half = 2, nu = 2, m = 0, type = "cdf") phi_late <- hzr_phase_hazard(t_grid, t_half = 5, nu = 1, m = 0, type = "hazard")t_grid <- seq(0.1, 10, by = 0.1) phi_early <- hzr_phase_hazard(t_grid, t_half = 2, nu = 2, m = 0, type = "cdf") phi_late <- hzr_phase_hazard(t_grid, t_half = 5, nu = 1, m = 0, type = "hazard")
Run forward, backward, or two-way stepwise selection on an existing
hazard fit using Wald p-values or AIC deltas as the entry /
retention criterion. Phase-specific entry is supported for
multiphase models: a covariate can enter one phase and not another.
hzr_stepwise( fit, scope = NULL, data, direction = c("both", "forward", "backward"), criterion = c("wald", "aic"), slentry = 0.3, slstay = 0.2, max_steps = 50L, max_move = 4L, force_in = character(), force_out = character(), trace = TRUE, ... ) ## S3 method for class 'hzr_stepwise' print(x, ...) ## S3 method for class 'hzr_stepwise' summary(object, ...) ## S3 method for class 'summary.hzr_stepwise' print(x, ...) ## S3 method for class 'hzr_stepwise' as.data.frame(x, ...)hzr_stepwise( fit, scope = NULL, data, direction = c("both", "forward", "backward"), criterion = c("wald", "aic"), slentry = 0.3, slstay = 0.2, max_steps = 50L, max_move = 4L, force_in = character(), force_out = character(), trace = TRUE, ... ) ## S3 method for class 'hzr_stepwise' print(x, ...) ## S3 method for class 'hzr_stepwise' summary(object, ...) ## S3 method for class 'summary.hzr_stepwise' print(x, ...) ## S3 method for class 'hzr_stepwise' as.data.frame(x, ...)
fit |
A fitted |
scope |
Candidate set. |
data |
Data frame the base fit was built on. Required for refits. |
direction |
Search strategy — one of |
criterion |
Entry / retention rule — one of |
slentry |
Entry p-value threshold for the Wald criterion.
Default |
slstay |
Retention p-value threshold for the Wald criterion.
Default |
max_steps |
Hard cap on total accepted actions. Emits a
|
max_move |
Per-variable oscillation cap. When a variable has
entered + exited more than |
force_in |
Character vector of variables that must remain in the model. Such variables are still scored and reported in the selection trace, but are never dropped. |
force_out |
Character vector of variables that may never be considered as candidates. |
trace |
Logical; print step-by-step progress to the console.
Default |
... |
Unused. |
x |
An |
object |
An |
The steps data frame has columns:
step_numInteger sequence starting at 1.
action"enter", "drop", or "frozen".
variableVariable affected.
phasePhase name (multiphase) or NA_character_.
criterion"wald" or "aic".
scoreWinning score used for the decision.
stat, df
Wald statistic and degrees of freedom.
p_value, delta_aic
Always populated when computable, regardless of the active criterion.
logLik, aic, n_coef
Goodness-of-fit diagnostics of the model after this step.
An object of class c("hzr_stepwise", "hazard") – the
final fit augmented with:
stepsData frame with one row per accepted / frozen action; see Details.
scopeRecord of the candidate scope, plus
force_in, force_out, and the frozen set.
criteriaNamed list of the threshold / direction settings actually applied.
trace_msgCharacter vector of the trace lines,
captured regardless of the trace flag.
elapseddifftime from start to finish.
final_callThe call that produced this result.
print.hzr_stepwise returns x invisibly.
summary.hzr_stepwise returns a summary.hzr_stepwise object
(extends summary.hazard) with $stepwise_steps and $stepwise_trace
appended.
print.summary.hzr_stepwise returns x invisibly.
as.data.frame.hzr_stepwise returns the $steps data frame.
Two arguments shape the search. direction decides which moves are
allowed at each step; criterion decides how a candidate move is scored
and whether it is accepted.
direction = "forward"Start from the base model and only add variables — the best eligible candidate enters each step until none clears the entry rule. Variables never leave once in.
direction = "backward"Start from the full candidate model and only drop variables — the weakest term leaves each step until all survivors clear the retention rule.
direction = "both" (default)Two-way stepwise: after each
entry, already-selected variables are re-tested and may be dropped.
This is the SAS SELECTION = STEPWISE strategy. max_move caps how
often a single variable may oscillate before it is frozen.
criterion = "wald" (default)Accept moves on SAS-style
significance thresholds, using the Wald of the affected
coefficient(s): a candidate enters if its p-value is below slentry, and
a term is dropped if its p-value rises above slstay. Entry candidates
are scored from a refit that adds the candidate (so its new coefficient
can be tested); drop candidates are scored from the current model's
Wald p-values without a per-candidate refit, and a single refit is run
only after a drop is chosen. Note this differs algorithmically from
C/SAS HAZARD, which selects on a score (Q) statistic evaluated without
refitting; the two can take different step paths even when they converge
to a similar final model.
criterion = "aic"Accept any move with
(a strictly better penalised fit), ignoring
slentry / slstay. Entry candidates use the actual
from the candidate refit; drop candidates use a
Wald-to-likelihood-ratio approximation,
, computed from the
current model without a per-candidate refit (the chosen drop is refit
afterwards). Use this for a non-significance-based,
information-criterion search.
hazard() for the base model and hzr_phase() for multiphase
scopes; stepwise_trace() to retrieve the captured selection log.
data(avc) avc <- na.omit(avc) base <- hazard(survival::Surv(int_dead, dead) ~ age, data = avc, dist = "weibull", fit = TRUE) sw <- hzr_stepwise(base, scope = ~ age + nyha, data = avc, direction = "forward", control = list(n_starts = 1)) print(sw)data(avc) avc <- na.omit(avc) base <- hazard(survival::Surv(int_dead, dead) ~ age, data = avc, dist = "weibull", fit = TRUE) sw <- hzr_stepwise(base, scope = ~ age + nyha, data = avc, direction = "forward", control = list(n_starts = 1)) print(sw)
Test if an object is an hzr_phase
is_hzr_phase(x)is_hzr_phase(x)
x |
Object to test. |
Logical scalar.
is_hzr_phase(hzr_phase("cdf")) is_hzr_phase("not a phase")is_hzr_phase(hzr_phase("cdf")) is_hzr_phase("not a phase")
Data for 339 patients who underwent open mitral commissurotomy at the University of Alabama Birmingham. Contains repeated thromboembolic events (up to 3 per patient) with left censoring, exercising the interval censoring likelihood.
omcomc
A data frame with 339 rows and 7 variables:
Patient identifier
Indicator for first thromboembolic event
Indicator for second thromboembolic event
Indicator for third thromboembolic event
Follow-up interval to death or last contact (months)
Death indicator (1 = dead, 0 = censored)
Operation date (Julian)
University of Alabama Birmingham cardiac surgery registry.
Other datasets:
avc,
cabgkul,
tga,
valves
Produces prediction outputs from a hazard object. Supports multiple prediction
types including linear predictor, hazard, survival probability, and cumulative hazard.
## S3 method for class 'hazard' predict( object, newdata = NULL, type = c("hazard", "linear_predictor", "survival", "cumulative_hazard"), decompose = FALSE, se.fit = FALSE, level = 0.95, conf.type = c("log-log", "logit"), ... )## S3 method for class 'hazard' predict( object, newdata = NULL, type = c("hazard", "linear_predictor", "survival", "cumulative_hazard"), decompose = FALSE, se.fit = FALSE, level = 0.95, conf.type = c("log-log", "logit"), ... )
object |
A |
newdata |
Optional matrix or data frame of predictors. For types requiring
time (e.g., "survival", "cumulative_hazard"), newdata should include a |
type |
Prediction type:
|
decompose |
Logical; if |
se.fit |
Logical; if |
level |
Numeric confidence level in |
conf.type |
Transform for |
... |
Unused; included for S3 compatibility. |
For Weibull models with survival or cumulative_hazard predictions:
Cumulative hazard: H(t|x) = (mu*t)^nu * exp(eta)
Survival: S(t|x) = exp(-H(t|x))
Time values must be positive and finite. If newdata contains a time column,
it will be used; otherwise, the time vector from the fitted object is used.
For models fit with time_windows, predictions for type = "linear_predictor"
or "hazard" also require time values (via newdata$time or fitted-time fallback)
so window-specific coefficients can be selected.
When se.fit = FALSE (default), a numeric vector of predictions.
When se.fit = TRUE, a data frame with columns fit, se.fit, lower,
upper (delta-method point estimate, standard error, and confidence
limits at level). For multiphase type = "cumulative_hazard" with
decompose = TRUE, a long data frame (time, component, fit,
se.fit, lower, upper); with decompose = TRUE and se.fit = FALSE,
a wide data frame of per-phase contributions.
hazard() for model fitting,
summary.hazard() for model summaries,
hzr_phase() for multiphase temporal shapes.
vignette("prediction-visualization") for detailed prediction
workflows including decomposed hazard plots and patient-specific curves.
# -- Basic predictions ------------------------------------------------ set.seed(1) fit <- hazard(time = rexp(50, 0.3), status = rep(1L, 50), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) predict(fit, type = "survival") predict(fit, newdata = data.frame(time = c(1, 2, 5)), type = "cumulative_hazard") # -- Patient-specific survival curves --------------------------------- set.seed(1001) n <- 180 dat <- data.frame( time = rexp(n, rate = 0.35) + 0.05, status = rbinom(n, size = 1, prob = 0.6), age = rnorm(n, mean = 62, sd = 11), nyha = sample(1:4, n, replace = TRUE), shock = rbinom(n, size = 1, prob = 0.18) ) fit2 <- hazard( survival::Surv(time, status) ~ age + nyha + shock, data = dat, theta = c(mu = 0.25, nu = 1.10, beta1 = 0, beta2 = 0, beta3 = 0), dist = "weibull", fit = TRUE ) new_patients <- data.frame( time = c(0.5, 1.5, 3.0), age = c(50, 65, 75), nyha = c(1, 3, 4), shock = c(0, 0, 1) ) # Compute predictions from the clean covariate frame before adding columns surv <- predict(fit2, newdata = new_patients, type = "survival") cumhaz <- predict(fit2, newdata = new_patients, type = "cumulative_hazard") new_patients$survival <- surv new_patients$cumulative_hazard <- cumhaz new_patients # -- Grouped survival curves --------------------------------------- if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) t_grid <- seq(0.05, max(dat$time), length.out = 80) profiles <- data.frame( label = c("Low risk (age 50, NYHA I)", "High risk (age 75, NYHA IV)"), age = c(50, 75), nyha = c(1, 4), shock = c(0, 1) ) curve_list <- lapply(seq_len(nrow(profiles)), function(i) { nd <- data.frame( time = t_grid, age = profiles$age[i], nyha = profiles$nyha[i], shock = profiles$shock[i] ) nd$survival <- predict(fit2, newdata = nd, type = "survival") * 100 nd$profile <- profiles$label[i] nd }) curve_df <- do.call(rbind, curve_list) ggplot(curve_df, aes(time, survival, colour = profile)) + geom_line() + scale_y_continuous(limits = c(0, 100)) + labs(x = "Months after surgery", y = "Freedom from death (%)", title = "Predicted survival by risk profile", colour = NULL) + theme_minimal() } # -- Multiphase predictions with decomposition -------------------- set.seed(42) n <- 200 dat <- data.frame( time = rexp(n, rate = 0.25) + 0.01, status = rbinom(n, size = 1, prob = 0.65) ) fit_mp <- hazard( survival::Surv(time, status) ~ 1, data = dat, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes"), late = hzr_phase("cdf", t_half = 5, nu = 1, m = 0, fixed = "shapes") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) t_grid <- seq(0.01, max(dat$time) * 0.9, length.out = 100) nd <- data.frame(time = t_grid) # Overall survival predict(fit_mp, newdata = nd, type = "survival") # Per-phase decomposed cumulative hazard decomp <- predict(fit_mp, newdata = nd, type = "cumulative_hazard", decompose = TRUE) head(decomp)# -- Basic predictions ------------------------------------------------ set.seed(1) fit <- hazard(time = rexp(50, 0.3), status = rep(1L, 50), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) predict(fit, type = "survival") predict(fit, newdata = data.frame(time = c(1, 2, 5)), type = "cumulative_hazard") # -- Patient-specific survival curves --------------------------------- set.seed(1001) n <- 180 dat <- data.frame( time = rexp(n, rate = 0.35) + 0.05, status = rbinom(n, size = 1, prob = 0.6), age = rnorm(n, mean = 62, sd = 11), nyha = sample(1:4, n, replace = TRUE), shock = rbinom(n, size = 1, prob = 0.18) ) fit2 <- hazard( survival::Surv(time, status) ~ age + nyha + shock, data = dat, theta = c(mu = 0.25, nu = 1.10, beta1 = 0, beta2 = 0, beta3 = 0), dist = "weibull", fit = TRUE ) new_patients <- data.frame( time = c(0.5, 1.5, 3.0), age = c(50, 65, 75), nyha = c(1, 3, 4), shock = c(0, 0, 1) ) # Compute predictions from the clean covariate frame before adding columns surv <- predict(fit2, newdata = new_patients, type = "survival") cumhaz <- predict(fit2, newdata = new_patients, type = "cumulative_hazard") new_patients$survival <- surv new_patients$cumulative_hazard <- cumhaz new_patients # -- Grouped survival curves --------------------------------------- if (requireNamespace("ggplot2", quietly = TRUE)) { library(ggplot2) t_grid <- seq(0.05, max(dat$time), length.out = 80) profiles <- data.frame( label = c("Low risk (age 50, NYHA I)", "High risk (age 75, NYHA IV)"), age = c(50, 75), nyha = c(1, 4), shock = c(0, 1) ) curve_list <- lapply(seq_len(nrow(profiles)), function(i) { nd <- data.frame( time = t_grid, age = profiles$age[i], nyha = profiles$nyha[i], shock = profiles$shock[i] ) nd$survival <- predict(fit2, newdata = nd, type = "survival") * 100 nd$profile <- profiles$label[i] nd }) curve_df <- do.call(rbind, curve_list) ggplot(curve_df, aes(time, survival, colour = profile)) + geom_line() + scale_y_continuous(limits = c(0, 100)) + labs(x = "Months after surgery", y = "Freedom from death (%)", title = "Predicted survival by risk profile", colour = NULL) + theme_minimal() } # -- Multiphase predictions with decomposition -------------------- set.seed(42) n <- 200 dat <- data.frame( time = rexp(n, rate = 0.25) + 0.01, status = rbinom(n, size = 1, prob = 0.65) ) fit_mp <- hazard( survival::Surv(time, status) ~ 1, data = dat, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes"), late = hzr_phase("cdf", t_half = 5, nu = 1, m = 0, fixed = "shapes") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) t_grid <- seq(0.01, max(dat$time) * 0.9, length.out = 100) nd <- data.frame(time = t_grid) # Overall survival predict(fit_mp, newdata = nd, type = "survival") # Per-phase decomposed cumulative hazard decomp <- predict(fit_mp, newdata = nd, type = "cumulative_hazard", decompose = TRUE) head(decomp)
Print method for hzr_calibrate
## S3 method for class 'hzr_calibrate' print(x, digits = 3, ...)## S3 method for class 'hzr_calibrate' print(x, digits = 3, ...)
x |
An |
digits |
Number of decimal places for formatting. |
... |
Additional arguments (ignored). |
The object x of class c("hzr_calibrate", "data.frame"),
invisibly. The data frame has one row per quantile group and columns:
group (group index),
n (group size),
events (event count),
mean, min, max (covariate summary within group),
prob (observed event probability),
link_value (transformed probability on the link scale).
When stratified via the by argument, a by column is also
present. Attributes: "link" (the transform applied,
e.g. "logit") and "groups" (number of quantile bins).
Print method for hzr_deciles
## S3 method for class 'hzr_deciles' print(x, digits = 3, ...)## S3 method for class 'hzr_deciles' print(x, digits = 3, ...)
x |
An |
digits |
Number of decimal places for formatting. |
... |
Additional arguments (ignored). |
The object x of class c("hzr_deciles", "data.frame"),
invisibly. The data frame has one row per risk group and columns:
group (integer group index, 1 = lowest risk),
n (group size),
events (observed event count),
expected (expected event count from model predictions),
observed_rate, expected_rate (events / n),
chi_sq (per-group (O-E)^2/E contribution),
p_value (1-df chi-square upper-tail p),
mean_survival, mean_cumhaz (mean predicted values in group).
An "overall" attribute contains the omnibus chi-square test
(fields: chi_sq, df, p_value, time,
groups, total_events, total_expected,
n_included, n_excluded).
Print method for hzr_gof
## S3 method for class 'hzr_gof' print(x, digits = 3, ...)## S3 method for class 'hzr_gof' print(x, digits = 3, ...)
x |
An |
digits |
Number of decimal places for formatting. |
... |
Additional arguments (ignored). |
The object x of class c("hzr_gof", "data.frame"),
invisibly. The data frame has one row per time point and columns:
time, n_risk, n_event, n_censor,
km_surv (Kaplan-Meier survival), km_cumhaz,
par_surv (parametric survival), par_cumhaz,
cum_observed (cumulative observed events),
cum_expected (cumulative expected events from model),
residual (cum_expected - cum_observed).
Multiphase models additionally include par_cumhaz_<phase> columns
for per-phase cumulative hazard contributions.
A "summary" attribute contains scalar diagnostics:
total_observed, total_expected, final_residual,
dist, n.
Print method for hzr_kaplan
## S3 method for class 'hzr_kaplan' print(x, digits = 4, n = 20, ...)## S3 method for class 'hzr_kaplan' print(x, digits = 4, n = 20, ...)
x |
An |
digits |
Number of decimal places for formatting. |
n |
Maximum rows to print (default 20). |
... |
Additional arguments (ignored). |
The object x of class c("hzr_kaplan", "data.frame"),
invisibly. The data frame has one row per event time (or all times when
event_only = FALSE) and columns:
time (follow-up time),
n_risk (number at risk),
n_event (events in interval),
n_censor (censored observations in interval),
survival (Kaplan-Meier survival estimate),
std_err (Greenwood standard error on log-hazard scale),
cl_lower, cl_upper (logit-transformed confidence limits
on the survival scale),
cumhaz (Nelson-Aalen cumulative hazard),
hazard (interval hazard estimate),
density (estimated event density),
life (life-table life expectancy contribution).
hzr_stepwise fitEvery run of hzr_stepwise() records the header, per-step lines,
and final summary regardless of the trace flag. This accessor
returns the full character vector for display or logging.
stepwise_trace(fit)stepwise_trace(fit)
fit |
An |
Character vector, one element per console line.
hzr_stepwise(), which produces the object this accessor reads.
data(avc) avc <- na.omit(avc) base <- hazard(survival::Surv(int_dead, dead) ~ age, data = avc, dist = "weibull", fit = TRUE) sw <- hzr_stepwise(base, scope = ~ age + nyha, data = avc, direction = "forward", control = list(n_starts = 1)) cat(stepwise_trace(sw), sep = "\n")data(avc) avc <- na.omit(avc) base <- hazard(survival::Surv(int_dead, dead) ~ age, data = avc, dist = "weibull", fit = TRUE) sw <- hzr_stepwise(base, scope = ~ age + nyha, data = avc, direction = "forward", control = list(n_starts = 1)) cat(stepwise_trace(sw), sep = "\n")
Returns a compact summary of a hazard object, including model metadata,
fit diagnostics, and coefficient-level statistics when available.
## S3 method for class 'hazard' summary(object, ...)## S3 method for class 'hazard' summary(object, ...)
object |
A |
... |
Unused; for S3 compatibility. |
An object of class summary.hazard.
hazard() for model fitting, predict.hazard() for predictions.
vignette("fitting-hazard-models") for fitting workflows,
vignette("inference-diagnostics") for bootstrap CIs and diagnostics.
# -- Single-phase Weibull summary ------------------------------------ fit <- hazard(time = rexp(30, 0.5), status = rep(1L, 30), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) summary(fit) # -- Multiphase model summary ---------------------------------------- set.seed(42) n <- 200 dat <- data.frame( time = rexp(n, rate = 0.25) + 0.01, status = rbinom(n, size = 1, prob = 0.65) ) fit_mp <- hazard( survival::Surv(time, status) ~ 1, data = dat, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes"), late = hzr_phase("cdf", t_half = 5, nu = 1, m = 0, fixed = "shapes") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) summary(fit_mp)# -- Single-phase Weibull summary ------------------------------------ fit <- hazard(time = rexp(30, 0.5), status = rep(1L, 30), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) summary(fit) # -- Multiphase model summary ---------------------------------------- set.seed(42) n <- 200 dat <- data.frame( time = rexp(n, rate = 0.25) + 0.01, status = rbinom(n, size = 1, prob = 0.65) ) fit_mp <- hazard( survival::Surv(time, status) ~ 1, data = dat, dist = "multiphase", phases = list( early = hzr_phase("cdf", t_half = 0.5, nu = 2, m = 0, fixed = "shapes"), late = hzr_phase("cdf", t_half = 5, nu = 1, m = 0, fixed = "shapes") ), fit = TRUE, control = list(n_starts = 5, maxit = 1000) ) summary(fit_mp)
Survival data for 470 patients who underwent the arterial switch operation for transposition of the great arteries at Boston Children's Hospital and Children's Hospital of Philadelphia. Used for sensitivity analysis and internal validation examples.
tgatga
A data frame with 470 rows and 14 variables:
Patient identifier
Simple TGA indicator (0/1)
D-looped transposition indicator (0/1)
Coronary artery pattern indicator
Hybrid approach procedure indicator (0/1)
No total circulatory arrest indicator (0/1)
Total circulatory arrest time (minutes)
Age at operation (days)
Aortic cross-clamp time per year
Death indicator (1 = dead, 0 = censored)
Follow-up interval to death or last contact (months)
Source institution (BCH or CHOP)
Coronary artery configuration (1/2/L)
Year of operation
Boston Children's Hospital and Children's Hospital of Philadelphia.
Other datasets:
avc,
cabgkul,
omc,
valves
Data for 1,533 patients who underwent primary heart valve replacement. The largest multivariable example dataset with multiple endpoints including death, prosthetic valve endocarditis (PVE), bioprosthesis degeneration, and reoperation.
valvesvalves
A data frame with 1533 rows and 19 variables:
Age at operation (years)
NYHA functional class (1–4)
Mitral valve position indicator (0/1)
Double valve replacement indicator (0/1)
Aortic position, incompetence (0/1)
Black race indicator (0/1)
Ischemic pathology indicator (0/1)
Native valve endocarditis indicator (0/1)
Mechanical valve indicator (0/1)
Male sex indicator (0/1)
Follow-up interval to death or last contact (months)
Death indicator (1 = dead, 0 = censored)
Follow-up interval to PVE or last contact (months)
PVE indicator (1 = PVE, 0 = censored)
Bioprosthesis indicator (0/1)
Follow-up interval to degeneration or last contact (months)
Reoperation for degeneration indicator (0/1)
Follow-up interval to reoperation or last contact (months)
Reoperation indicator (0/1)
Cleveland Clinic Foundation heart valve replacement registry.
vignette("fitting-hazard-models"),
vignette("prediction-visualization")
Other datasets:
avc,
cabgkul,
omc,
tga
data(valves) valves_cc <- na.omit(valves) # Kaplan-Meier for two endpoints km_death <- survival::survfit( survival::Surv(int_dead, dead) ~ 1, data = valves_cc) km_pve <- survival::survfit( survival::Surv(int_pve, pve) ~ 1, data = valves_cc) plot(km_death, xlab = "Months after valve replacement", ylab = "Survival", main = "Valves: Death and PVE endpoints") lines(km_pve, col = "red") legend("bottomleft", c("Death", "PVE"), col = c("black", "red"), lty = 1)data(valves) valves_cc <- na.omit(valves) # Kaplan-Meier for two endpoints km_death <- survival::survfit( survival::Surv(int_dead, dead) ~ 1, data = valves_cc) km_pve <- survival::survfit( survival::Surv(int_pve, pve) ~ 1, data = valves_cc) plot(km_death, xlab = "Months after valve replacement", ylab = "Survival", main = "Valves: Death and PVE endpoints") lines(km_pve, col = "red") legend("bottomleft", c("Death", "PVE"), col = c("black", "red"), lty = 1)
Returns the estimated variance-covariance matrix of the fitted coefficients.
## S3 method for class 'hazard' vcov(object, ...)## S3 method for class 'hazard' vcov(object, ...)
object |
A |
... |
Unused; for S3 compatibility. |
A numeric matrix containing the estimated variance-covariance matrix
of the fitted coefficients, with rows and columns named by the coefficient
labels (phase-prefixed for multiphase models, e.g. early.x). Rows
and columns for parameters held fixed (e.g. fixed shape parameters) are
NA because they carry no variance; the finite free-parameter block
is still usable. For Conservation-of-Events fits the conserved phase
log_mu normally carries a variance: it is removed from the
optimizer search but the vcov is the full-information matrix at the optimum
(the CoE solution is the unconstrained MLE). That recomputation requires
numDeriv and an invertible Hessian; if either is unavailable the fit
emits a warning and the conserved log_mu stays NA (the rest
of the matrix is unaffected). Returns a scalar NA only when the
model has not been fitted or no covariance matrix is available.
fit <- hazard(time = rexp(30, 0.5), status = rep(1L, 30), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) vcov(fit)fit <- hazard(time = rexp(30, 0.5), status = rep(1L, 30), theta = c(0.3, 1.0), dist = "weibull", fit = TRUE) vcov(fit)