| Title: | Fit a Mixture Cure Rate Model with Custom Link Function |
|---|---|
| Description: | Tools to fit Mixture Cure Rate models via the Expectation-Maximization (EM) algorithm, allowing for flexible link functions in the cure component and various survival distributions in the latency part. The package supports user-specified link functions, includes methods for parameter estimation and model diagnostics, and provides residual analysis tailored for cure models. The underlying methods are described in Berkson, J. and Gage, R. P. (1952) <doi:10.2307/2281318>, Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) <https://www.jstor.org/stable/2984875>, and Bazán, J., Torres-Avilés, F., Suzuki, A. and Louzada, F. (2017) <doi:10.1002/asmb.2215>. |
| Authors: | Chaeyeon Yoo [aut], Dipak K. Dey [aut], Victor H. Lachos [aut], Jalmar M. F. Carrasco [aut, cre] |
| Maintainer: | Jalmar M. F. Carrasco <[email protected]> |
| License: | GPL-3 |
| Version: | 0.2.0 |
| Built: | 2026-05-20 09:20:36 UTC |
| Source: | https://github.com/cran/EMGCR |
A sample of 2,766 patients diagnosed with liver cancer between 2012 and 2016, whose cancer grades were well identified. Available individual-level covariates include age at diagnosis, sex, pathological grade of the liver cancer, number of relapses, and median household income.

    data(liver)

A data frame with 2,766 observations and 10 variables:
Unique patient identifier
Age as a factor (e.g., age group)
Age as a numeric variable
Pathological grade of liver cancer (I, II, III, IV)
Median household income as factor (possibly grouped)
Median household income as numeric
Number of relapses after first diagnosis
Sex of patient: 1 = male, 0 = female
Event indicator: 1 = death, 0 = censored
Survival time in months
The grade of the disease is categorized into four levels:
Grade I – Well differentiated
Grade II – Moderately differentiated
Grade III – Poorly differentiated
Grade IV – Undifferentiated/anaplastic

    data(liver)
    head(liver)

A sample of 1,736 patients diagnosed with liver cancer between 2012 and 2016, whose cancer grades were well identified. Available individual-level covariates include age at diagnosis, sex, race, grade of liver cancer, median household income, time to treatment, tumor size, radiation indicator, and chemotherapy indicator. The grade of the disease is categorized into four levels: well differentiated (Grade I), moderately differentiated (Grade II), poorly differentiated (Grade III), and undifferentiated/anaplastic (Grade IV).

    data(liver2)

A data frame with 1,736 observations and 10 variables:
Unique patient identifier
Survival time in months
Event indicator: 0 = censored, 1 = death due to liver cancer
Sex of patient: 1 = male, 0 = female
Scaled age at diagnosis
Scaled median household income of the subject
Race: 1 = White, 0 = other
Pathological grade of liver cancer
Chemotherapy indicator: 1 = received, 0 = not received
Radiation indicator: 1 = received, 0 = not received

    data(liver2)
    head(liver2)

Fits a cure rate model using a flexible link function and a variety of survival distributions. The cured fraction is modelled through a logistic-type link, and the parameters are estimated with an EM-like algorithm.

    MCRfit(
      formula,
      data,
      dist = "weibull",
      link = "logit",
      tau = 1,
      maxit = 1000,
      tol = 1e-05
    )

| Argument | Description |
|---|---|
| `formula` | A two-part formula of the form |
| `data` | A data frame containing the variables in the model. |
| `dist` | A character string indicating the baseline distribution. Supported values are |
| `link` | A character string specifying the link function for the cure fraction. Options are |
| `tau` | A numeric value used when |
| `maxit` | Maximum number of iterations for the EM-like algorithm. Defaults to 1000. |
| `tol` | Convergence tolerance. Defaults to 1e-5. |
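
The roles of `maxit` and `tol` can be illustrated with a deliberately simplified EM loop. This is a sketch, not the package's implementation: it assumes no covariates, a constant uncured probability `p`, and an exponential latency with rate `lambda`, so that both M-step updates have closed forms. The function name `em_cure_exp` and its return fields are hypothetical.

```r
# Simplified EM for a mixture cure model (illustration only, not EMGCR code).
# Assumptions: no covariates, constant uncured probability p,
# exponential latency with rate lambda.
em_cure_exp <- function(time, status, maxit = 1000, tol = 1e-5) {
  p <- 0.5
  lambda <- 1 / mean(time)
  for (iter in seq_len(maxit)) {
    # E-step: probability of being uncured given the observed data.
    # Events are uncured with certainty; censored cases get a posterior weight.
    s_u <- exp(-lambda * time)
    g <- status + (1 - status) * p * s_u / (1 - p + p * s_u)
    # M-step: closed-form updates for p and lambda.
    p_new <- mean(g)
    lambda_new <- sum(status) / sum(g * time)
    change <- max(abs(c(p_new - p, lambda_new - lambda)))
    p <- p_new
    lambda <- lambda_new
    if (change < tol) break   # stop once the parameters have stabilized
  }
  list(p_uncured = p, rate = lambda, iter = iter, converged = change < tol)
}

# Simulated data: 60% uncured, rate 0.5, uniform censoring on (0, 10).
set.seed(42)
n <- 2000
uncured <- rbinom(n, 1, 0.6)
latent <- ifelse(uncured == 1, rexp(n, rate = 0.5), Inf)
cens <- runif(n, 0, 10)
time <- pmin(latent, cens)
status <- as.integer(latent <= cens)

fit <- em_cure_exp(time, status)
str(fit)
```

The loop stops either when the largest parameter change falls below `tol` or after `maxit` iterations, whichever comes first.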
An object of class "MCR", which is a list containing:
| Component | Description |
|---|---|
| `coefficients` | Estimated regression coefficients for the survival part. |
| `coefficients_cure` | Estimated coefficients for the cure part. |
| `scale` | Estimated scale parameter of the baseline distribution. |
| `loglik` | Final log-likelihood value. |
| `n` | Number of observations used in the model. |
| `deleted` | Number of incomplete cases removed before fitting. |
| `ep` | Estimated standard errors. |
| `iter` | Number of iterations used for convergence. |
| `dist` | Distribution used. |
| `link` | Link function used. |
| `tau` | Tau parameter used (if applicable). |

    library(EMGCR)
    data(liver2)
    names(liver2)
    # Treat categorical covariates as factors
    liver2$sex <- factor(liver2$sex)
    liver2$grade <- factor(liver2$grade)
    liver2$radio <- factor(liver2$radio)
    liver2$chemo <- factor(liver2$chemo)
    str(liver2)
    # Fit a log-logistic model with the "plogit" link
    model <- MCRfit(
      survival::Surv(time, status) ~ age + sex + grade + radio + chemo |
        age + medh + grade + radio + chemo,
      dist = "loglogistic", link = "plogit", tau = 0.15,
      data = liver2
    )
    model

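
For orientation, the population survival function of a mixture cure model can be written out in a few lines of base R. The sketch below is illustrative and does not reproduce the package internals. It assumes that the logit link models the probability of being cured and that the latency follows a Weibull distribution. The helper names `cure_prob` and `pop_surv`, the design matrix, and the coefficient values are all hypothetical.

```r
# Illustrative sketch only; not the EMGCR internals.
# Assumption: the logit link models the probability of being cured.
cure_prob <- function(w, eta) plogis(drop(w %*% eta))

# Population survival: cured subjects never fail, uncured follow a Weibull.
pop_surv <- function(t, pi_cure, shape, scale) {
  pi_cure + (1 - pi_cure) *
    pweibull(t, shape = shape, scale = scale, lower.tail = FALSE)
}

w <- cbind(1, c(-1, 0, 1))   # hypothetical cure design matrix with intercept
eta <- c(0.2, 0.8)           # hypothetical cure coefficients
pi_cure <- cure_prob(w, eta)

pop_surv(t = 0, pi_cure, shape = 1.5, scale = 10)     # everyone alive at t = 0
pop_surv(t = 1e6, pi_cure, shape = 1.5, scale = 10)   # plateau at the cure probability
```

The plateau of the survival curve at large `t` is what distinguishes a cure model from a standard survival model, whose survival function decays to zero.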
Plots one or more fitted MCR models against the Kaplan-Meier curve.

    ## S3 method for class 'MCR'
    plot(...)

| Argument | Description |
|---|---|
| `...` | One or more fitted MCR objects from |
A ggplot object with Kaplan-Meier and survival curves for each model.
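
The idea behind this comparison can be sketched in base R without the package: estimate the Kaplan-Meier curve from the data and set it against a model-based population survival curve. The example below is an assumption-laden illustration, not the plotting method itself. It uses a hand-written Kaplan-Meier helper (`km_est`, hypothetical), simulated data, and the true simulation parameters in place of a fitted model.

```r
# Hand-rolled Kaplan-Meier estimate (base R only; illustration).
km_est <- function(time, status) {
  ev <- sort(unique(time[status == 1]))
  d <- vapply(ev, function(u) sum(time == u & status == 1), integer(1))  # events
  r <- vapply(ev, function(u) sum(time >= u), integer(1))                # at risk
  data.frame(time = ev, surv = cumprod(1 - d / r))
}

# Simulated cure data: 30% cured, Weibull latency, uniform censoring.
set.seed(7)
n <- 1000
cured <- rbinom(n, 1, 0.3)
latent <- ifelse(cured == 1, Inf, rweibull(n, shape = 1.5, scale = 3))
cens <- runif(n, 0, 15)
time <- pmin(latent, cens)
status <- as.integer(latent <= cens)

km <- km_est(time, status)
km_fun <- stepfun(km$time, c(1, km$surv))   # right-continuous step function

# Model-based population survival, evaluated here at the true parameters.
model_surv <- function(t) {
  0.3 + 0.7 * pweibull(t, shape = 1.5, scale = 3, lower.tail = FALSE)
}

grid <- c(1, 2, 4, 8)
comparison <- data.frame(time = grid, km = km_fun(grid), model = model_surv(grid))
print(comparison)

if (interactive()) {
  plot(km$time, km$surv, type = "s", ylim = c(0, 1),
       xlab = "Time", ylab = "Survival probability")
  curve(model_surv, add = TRUE, lty = 2)
}
```

A model that fits well tracks the Kaplan-Meier steps closely, including the plateau in the right tail.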
Produces a Q-Q plot of residuals from a Mixture Cure Rate (MCR) model fitted via MCRfit. Optionally, a simulation envelope can be included for Cox-Snell residuals.

    qqMCR(
      object,
      type = c("cox-snell", "quantile"),
      envelope = FALSE,
      nsim = 100,
      censor = NULL,
      ...
    )

| Argument | Description |
|---|---|
| `object` | An object of class |
| `type` | Character. Type of residual to use in the QQ-plot. Options are |
| `envelope` | Logical. Whether to add a simulation envelope to the QQ-plot. Default is |
| `nsim` | Integer. Number of simulations used to construct the envelope. Default is |
| `censor` | Logical vector or NULL. Censoring indicator used when simulating data for the envelope. Required only when |
| `...` | Additional arguments (currently ignored). |
The function generates QQ-plots of either Cox-Snell or quantile residuals. When envelope = TRUE and type = "cox-snell", a simulation envelope is added using Monte Carlo replications.
A QQ-plot is produced as a side effect. Nothing is returned.

    data(liver)
    fit <- MCRfit(
      survival::Surv(time, status) ~ age + medh + relapse + grade | sex + grade,
      data = liver, dist = "weibull", link = "logit"
    )
    qqMCR(fit, type = "quantile", envelope = TRUE, nsim = 50, censor = liver$status)

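
The general idea of a Monte Carlo simulation envelope can be sketched independently of the package. The example below is a generic illustration under stated assumptions: the residuals are stand-in standard normal draws, and each replicate is drawn directly from the reference distribution. How `qqMCR` builds its envelope internally (for example, whether it simulates from the fitted model) is not specified here, so treat this only as the underlying technique.

```r
# Generic Monte Carlo envelope for a normal QQ-plot (illustration only).
set.seed(1)
n <- 200
res <- rnorm(n)      # stand-in for residuals from a fitted model
nsim <- 100

# Each column holds the order statistics of one simulated sample.
sims <- replicate(nsim, sort(rnorm(n)))

# Pointwise 2.5% and 97.5% bounds for every order statistic (2 x n matrix).
band <- apply(sims, 1, quantile, probs = c(0.025, 0.975))

theo <- qnorm(ppoints(n))   # theoretical quantiles
obs <- sort(res)            # observed order statistics
inside <- mean(obs >= band[1, ] & obs <= band[2, ])
inside                      # share of points inside the envelope

if (interactive()) {
  plot(theo, obs, xlab = "Theoretical quantiles", ylab = "Sample quantiles")
  lines(theo, band[1, ], lty = 2)
  lines(theo, band[2, ], lty = 2)
}
```

Points that fall systematically outside the band suggest that the assumed model does not describe the residuals well.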
Computes global Cox-Snell and randomized quantile residuals for objects of class MCR.

    ## S3 method for class 'MCR'
    residuals(object, type = c("cox-snell", "quantile"), ...)

| Argument | Description |
|---|---|
| `object` | An object of class |
| `type` | Type of residual. |
| `...` | Additional arguments (not used). |
A numeric vector of residuals.

    data(liver)
    names(liver)
    model <- MCRfit(
      survival::Surv(time, status) ~ age + medh + relapse + grade |
        sex + age + medh + grade,
      data = liver
    )
    summary(residuals(model, type = "quantile"))

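
To make the two residual types concrete, the sketch below computes them by hand for simulated data, using the true model in place of a fitted one. It is an illustration under assumptions, not the package's code: the Cox-Snell-type residual is taken as the negative log of the population survival function, and the randomized quantile residual follows the usual construction for censored data, in which a censored case draws uniformly from the interval between its distribution function value and 1. The exact definitions used by `residuals()` may differ.

```r
# Illustration only: residuals computed from a known mixture cure model.
set.seed(3)
n <- 1000
pi_cure <- 0.3
cured <- rbinom(n, 1, pi_cure)
latent <- ifelse(cured == 1, Inf, rweibull(n, shape = 1.5, scale = 3))
cens <- runif(n, 0, 15)
time <- pmin(latent, cens)
status <- as.integer(latent <= cens)

# Population survival and (improper) distribution function at observed times.
s_pop <- pi_cure + (1 - pi_cure) *
  pweibull(time, shape = 1.5, scale = 3, lower.tail = FALSE)
f_pop <- 1 - s_pop

# Cox-Snell-type residuals: population cumulative hazard at the observed time.
r_cs <- -log(s_pop)

# Randomized quantile residuals: events use F(t); censored cases draw
# uniformly from (F(t), 1) before mapping to the normal scale.
u <- ifelse(status == 1, f_pop, runif(n, min = f_pop, max = 1))
r_q <- qnorm(u)

summary(r_cs)
summary(r_q)
```

Under a correctly specified model the quantile residuals behave approximately like a standard normal sample, which is what a normal QQ-plot checks.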
Simulates survival data from a mixture cure rate model with covariates and user-defined link and latency distributions. Censoring is applied randomly.

    rMCM(
      n,
      x,
      w,
      censor,
      alpha,
      beta,
      eta,
      dist = "weibull",
      link = "logit",
      tau = 1
    )

| Argument | Description |
|---|---|
| `n` | Integer. Number of observations to simulate. |
| `x` | Matrix or numeric. Covariate matrix for the latency component (must include intercept if needed). |
| `w` | Matrix or numeric. Covariate matrix for the cure component (no intercept assumed). |
| `censor` | Numeric. Maximum censoring time (uniformly distributed). |
| `alpha` | Numeric. Shape parameter for the survival distribution. |
| `beta` | Numeric vector. Coefficients for the latency part. |
| `eta` | Numeric vector. Coefficients for the cure part. |
| `dist` | Character. Distribution for the latency part. Options: |
| `link` | Character. Link function for cure component. Options: |
| `tau` | A numeric value used when |
A list with elements:
Observed (possibly censored) survival time.
Event indicator (1 = event, 0 = censored).
Covariate matrix for the latency component.
Covariate matrix for the cure component.
Percentage of cured individuals.
Percentage of censored cases among the uncured.

    # Example: Simulating survival data using the inverse Gaussian distribution
    library(EMGCR)
    n <- 500
    beta <- c(1, -1, -2)
    eta <- c(0.5, -0.5)
    alpha <- 1.5
    p <- length(beta)
    q <- length(eta)
    # Latency design matrix with an intercept column
    set.seed(10)
    X <- matrix(rnorm(n * (p - 1), 0, 1), n, p - 1)
    X <- cbind(1, X)
    # Cure design matrix, standardized, without intercept
    set.seed(20)
    W <- matrix(runif(n * q, -1, 1), n, q)
    W <- scale(W)
    max_censoring <- 10
    set.seed(1234)
    sim_data <- rMCM(n = n, x = X, w = W, censor = max_censoring,
                     beta = beta, eta = eta, alpha = alpha,
                     link = "logit", dist = "invgauss", tau = 1)
    names(sim_data)
    head(sim_data)
    attributes(sim_data)
    attr(sim_data, "pCcensur")
    attr(sim_data, "pUCcensur")

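
The data-generating mechanism described for this function can be sketched in base R. The version below is a hypothetical stand-in, not the code behind `rMCM()`. It assumes that the logit link models the cure probability, that the latency is Weibull with shape `alpha` and a log-linear scale, and that censoring times are uniform on (0, `censor`). The function name `sim_cure` and the parameterization are assumptions made for illustration.

```r
# Illustration only; the parameterization below is an assumption,
# not necessarily the one used by rMCM().
sim_cure <- function(n, x, w, censor, alpha, beta, eta) {
  p_cure <- plogis(drop(w %*% eta))        # assumed: logit link on P(cured)
  cured <- rbinom(n, 1, p_cure)
  scale <- exp(drop(x %*% beta))           # assumed: log-linear Weibull scale
  # Cured subjects never experience the event, so their latent time is Inf.
  latent <- ifelse(cured == 1, Inf, rweibull(n, shape = alpha, scale = scale))
  cens <- runif(n, 0, censor)              # uniform censoring on (0, censor)
  data.frame(time = pmin(latent, cens),
             status = as.integer(latent <= cens),
             cured = cured)
}

set.seed(99)
n <- 500
x <- cbind(1, rnorm(n))                        # latency covariates with intercept
w <- cbind(runif(n, -1, 1), runif(n, -1, 1))   # cure covariates, no intercept
d <- sim_cure(n, x, w, censor = 10, alpha = 1.5,
              beta = c(1, -1), eta = c(0.5, -0.5))

mean(d$cured == 1)                  # share of cured subjects
mean(d$status[d$cured == 0] == 0)   # share censored among the uncured
```

Cured subjects always appear as censored observations, which is why the cure status is not directly observable in real data.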