Title: | Regularization in Cox Frailty Models |
---|---|
Description: | Different regularization approaches for Cox Frailty Models by penalization methods are provided. |
Authors: | Andreas Groll |
Maintainer: | Andreas Groll <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.0.0 |
Built: | 2024-11-02 06:32:58 UTC |
Source: | CRAN |
The function generates a B-spline design matrix with equidistant knots for given degree of the splines and number of basis functions.
bs.design(x, xl, xr, spline.degree, nbasis, comp = NULL)
bs.design(x, xl, xr, spline.degree, nbasis, comp = NULL)
x |
the positions where spline to be evaluated. |
xl |
lower intervall boundary where spline functions are relevant. |
xr |
upper intervall boundary where spline functions are relevant. |
spline.degree |
(polynomial) degree of the B-splines. |
nbasis |
number of basis functions used. |
comp |
Specify if only specific columns of the B-spline design matrix should be returned. Default is NULL and the whole B-spline design matrix is returned. |
The B-spline design matrix is returned.
Andreas Groll [email protected]
x <- rnorm(100) B <- bs.design(x=x, xl=min(x), xr=max(x), spline.degree=3, nbasis=5)
x <- rnorm(100) B <- bs.design(x=x, xl=min(x), xr=max(x), spline.degree=3, nbasis=5)
A full likelihood approach for Cox Frailty Models based on the full likelihood is provided.
coxFL(fix=formula, rnd=NULL, vary.coef=NULL, data, control = list())
coxFL(fix=formula, rnd=NULL, vary.coef=NULL, data, control = list())
fix |
a two-sided linear formula object describing the unpenalized
fixed (time-constant) effects part of the model, with the response on the left of a
|
rnd |
a two-sided linear formula object describing the
random-effects part of the model, with the grouping factor on the left of a
|
vary.coef |
a one-sided linear formula object describing the
time-varying effects part of the model, with the time-varying terms, separated by |
data |
the data frame containing the variables named in the three preceding
|
control |
a list of control values for the estimation algorithm to replace the default values returned by the function |
The coxFL
algorithm is designed to investigate
the effect structure in the Cox frailty model, which is a
widely used model that accounts for heterogeneity in time-to-event data.
Since in survival models one has to account for possible variation of
the effect strength over time, some features can incorporated with time-varying effects.
Package: | pencoxfrail |
Type: | Package |
Version: | 1.1.2 |
Date: | 2023-08-25 |
License: | GPL-2 |
LazyLoad: | yes |
for loading a dataset type data(nameofdataset)
Generic functions such as print
, predict
, plot
and summary
have methods to show the results of the fit.
The predict
function uses also estimates of random effects for prediction, if possible (i.e. for known subjects of the grouping factor).
Either the survival stepfunction or the baseline hazard (not cumulative!) can be calculated by specifying one of two possible methods: method=c("hazard","survival")
. By default, for each new subject in new.data
an individual stepfunction is calculated on a pre-specified time grid, also accounting for covariate changes over time. Alternatively, for new.data
a single vector of a specific (time-constant) covariate combination can be specified.
Usage:
predict(coxlasso.obj,new.data,time.grid,method=c("hazard","survival"))
The plot
function plots all time-varying effects, including the baseline hazard.
call |
a list containing an image of the |
baseline |
a vector containing the estimated B-spline coefficients of the baseline hazard.
If the covariates corresponding to the time-varying effects are centered (and standardized, see |
time.vary |
a vector containing the estimated B-spline coefficients of all time-varying effects.
If the covariates corresponding to the time-varying effects are standardized (see |
coefficients |
a vector containing the estimated fixed effects. |
ranef |
a vector containing the estimated random effects. |
Q |
a scalar or matrix containing the estimates of the random effects standard deviation or variance-covariance parameters, respectively. |
Delta |
a matrix containing the estimates of fixed and random effects (columns) for each iteration (rows) of the main algorithm (i.e. before the final re-estimation step is performed, see details). |
Q_long |
a list containing the estimates of the random effects variance-covariance parameters for each iteration of the main algorithm. |
iter |
number of iterations until the main algorithm has converged. |
knots |
vector of knots used in the B-spline representation. |
Phi.big |
large B-spline design matrix corresponding to the baseline hazard and all time-varying effects. For the time-varying effects, the B-spline functions (as a function of time) have already been multiplied with their associated covariates. |
time.grid |
the time grid used in when approximating the (Riemann) integral involved in the model's full likelihood. |
m |
number of metric covariates with time-varying effects. |
m2 |
number of categorical covariates with time-varying effects. |
Andreas Groll [email protected]
Groll, A., T. Hastie and G. Tutz (2016). Regularization in Cox Frailty Models. Ludwig-Maximilians-University. Technical Report 191.
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") coxFL.obj <- coxFL(fix=fix.form, data=lung, control=list(print.iter=TRUE, exact = 1)) coef(coxFL.obj) # For comparison: coxph coxph.1 <- coxph(fix.form , data=lung) coef(coxph.1) # now add random institutional effect coxFL.obj2 <- coxFL(fix=fix.form, rnd = list(inst=~1), data=lung, control=list(print.iter=TRUE, exact = 1)) coef(coxFL.obj2) # print frailty Std.Dev. print(coxFL.obj2$Q) # print frailties print(coxFL.obj2$ranef) # For comparison: coxph fix.form.cox <- update(fix.form, ~ . + frailty(inst, distribution="gaussian")) coxph.2 <- coxph(fix.form.cox , data=lung) coef(coxph.2) # print frailty Std.Dev. print(sqrt(coxph.2$history[[1]]$history[nrow(coxph.2$history[[1]]$history), 1])) # print frailties print(coxph.2$frail) # now fit a time-varying effect for age fix.form <- as.formula("Surv(time, status) ~ 1 + ph.ecog + sex") vary.coef <- as.formula("~ age") coxFL.obj3 <- coxFL(fix=fix.form,vary.coef=vary.coef, data=lung, control=list(print.iter=TRUE)) summary(coxFL.obj3) # show fit plot(coxFL.obj3) # predict survival curve of new subject, institution 1 and up to time 300 pred.obj <- predict(coxFL.obj2, newdata=data.frame(inst=1, time=NA, status=NA, age=26, ph.ecog=2,sex=1), time.grid=seq(0,300,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") ## specify a larger new data set new.data <- data.frame(inst=c(1,1,6), time=c(20,40,200), status=c(NA,NA,NA), age=c(26,26,54), ph.ecog=c(0,0,2),sex=c(1,1,1)) ## as here no frailties have been specified, id.var needs to be given! pred.obj2 <- predict(coxFL.obj3, newdata=new.data,id.var = "inst") # plot predicted hazard functions (for the available time intervals) # for individual 1 and 3 plot(pred.obj2$time.grid[!is.na(pred.obj2$haz[,1])], pred.obj2$haz[,1][!is.na(pred.obj2$haz[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$haz,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$haz[,3])], pred.obj2$haz[,3][!is.na(pred.obj2$haz[,3])], col="red",lty=2,) # plot predicted survival functions (for the available time intervals) # for individual 1 and 3 plot(pred.obj2$time.grid[!is.na(pred.obj2$survival[,1])], pred.obj2$survival[,1][!is.na(pred.obj2$survival[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$survival,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$survival[,3])], pred.obj2$survival[,3][!is.na(pred.obj2$survival[,3])], col="red",lty=2,) ## End(Not run)
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") coxFL.obj <- coxFL(fix=fix.form, data=lung, control=list(print.iter=TRUE, exact = 1)) coef(coxFL.obj) # For comparison: coxph coxph.1 <- coxph(fix.form , data=lung) coef(coxph.1) # now add random institutional effect coxFL.obj2 <- coxFL(fix=fix.form, rnd = list(inst=~1), data=lung, control=list(print.iter=TRUE, exact = 1)) coef(coxFL.obj2) # print frailty Std.Dev. print(coxFL.obj2$Q) # print frailties print(coxFL.obj2$ranef) # For comparison: coxph fix.form.cox <- update(fix.form, ~ . + frailty(inst, distribution="gaussian")) coxph.2 <- coxph(fix.form.cox , data=lung) coef(coxph.2) # print frailty Std.Dev. print(sqrt(coxph.2$history[[1]]$history[nrow(coxph.2$history[[1]]$history), 1])) # print frailties print(coxph.2$frail) # now fit a time-varying effect for age fix.form <- as.formula("Surv(time, status) ~ 1 + ph.ecog + sex") vary.coef <- as.formula("~ age") coxFL.obj3 <- coxFL(fix=fix.form,vary.coef=vary.coef, data=lung, control=list(print.iter=TRUE)) summary(coxFL.obj3) # show fit plot(coxFL.obj3) # predict survival curve of new subject, institution 1 and up to time 300 pred.obj <- predict(coxFL.obj2, newdata=data.frame(inst=1, time=NA, status=NA, age=26, ph.ecog=2,sex=1), time.grid=seq(0,300,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") ## specify a larger new data set new.data <- data.frame(inst=c(1,1,6), time=c(20,40,200), status=c(NA,NA,NA), age=c(26,26,54), ph.ecog=c(0,0,2),sex=c(1,1,1)) ## as here no frailties have been specified, id.var needs to be given! pred.obj2 <- predict(coxFL.obj3, newdata=new.data,id.var = "inst") # plot predicted hazard functions (for the available time intervals) # for individual 1 and 3 plot(pred.obj2$time.grid[!is.na(pred.obj2$haz[,1])], pred.obj2$haz[,1][!is.na(pred.obj2$haz[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$haz,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$haz[,3])], pred.obj2$haz[,3][!is.na(pred.obj2$haz[,3])], col="red",lty=2,) # plot predicted survival functions (for the available time intervals) # for individual 1 and 3 plot(pred.obj2$time.grid[!is.na(pred.obj2$survival[,1])], pred.obj2$survival[,1][!is.na(pred.obj2$survival[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$survival,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$survival[,3])], pred.obj2$survival[,3][!is.na(pred.obj2$survival[,3])], col="red",lty=2,) ## End(Not run)
coxFL
fitThe values supplied in the function call replace the defaults and a list with all possible arguments is returned. The returned list is used as the control
argument to the coxFL
function.
coxFLControl(start = NULL, index=NULL, q_start = NULL, conv.eps = 1e-3, standardize = FALSE, center = FALSE, smooth=list(nbasis = 6, penal = 1e+2), print.iter = FALSE, max.iter = 100, exact = NULL, xr = NULL, eps = 1e-3, ...)
coxFLControl(start = NULL, index=NULL, q_start = NULL, conv.eps = 1e-3, standardize = FALSE, center = FALSE, smooth=list(nbasis = 6, penal = 1e+2), print.iter = FALSE, max.iter = 100, exact = NULL, xr = NULL, eps = 1e-3, ...)
start |
a vector of suitable length containing starting values for the spline-coefficients of the baseline hazard and the time-varying effects, followed by the fixed and random effects. The correct ordering is important. Default is a vector full of zeros. |
index |
vector which defines the grouping of the variables. Components sharing the same number build a group and factor variables get a single number (and are automatically treated as a group). Non-penalized coefficients are marked with NA. |
q_start |
a scalar or matrix of suitable dimension, specifying starting values for the random-effects variance-covariance matrix. Default is a scalar 0.1 or diagonal matrix with 0.1 in the diagonal, depending on the dimension of the random effects. |
conv.eps |
controls the speed of convergence. Default is 1e-3. |
smooth |
a list specifying the number of basis functions |
standardize |
logical. If true, the covariates corresponding to the fixed effects will be scaled to a variance equal to one. Default is TRUE. |
center |
logical. If true, the covariates corresponding to the time-varying effects will be centered. Default is FALSE (and centering is only recommended if really necessary; it can also have a strong effect on the baseline hazard, in particular, if a strong penalty is selected). |
print.iter |
logical. Should the number of iterations be printed? Default is FALSE. |
max.iter |
the number of iterations for the final Fisher scoring re-estimation procedure. Default is 200. |
exact |
controls the exactness of the (Riemann) integral approximations. If not set by the user to a specific value, it will be automatically chosen such that the time interval [0,t_max] will be divided in about 1000 equal sized Riemann bars. |
xr |
maximal time point that is regarded. Default is NULL and the maximal event or censoring time point in the data is used. |
eps |
Small epsilon that controls which fixed effects are set to zero: parameters with an absolute value smaller than epsilon are taken to be zero. Default is 1e-3. |
... |
Futher arguments to be passed. |
a list with components for each of the possible arguments.
Andreas Groll [email protected]
# Use different weighting of the two penalty parts # and lighten the convergence criterion coxFLControl(c.app = 1e-5, conv.eps=1e-3)
# Use different weighting of the two penalty parts # and lighten the convergence criterion coxFLControl(c.app = 1e-5, conv.eps=1e-3)
A LASSO approach for Cox Frailty Models based on the Cox full likelihood is provided.
coxlasso(fix=formula, rnd=NULL, vary.coef=NULL, data, xi, adaptive.weights = NULL, control = list())
coxlasso(fix=formula, rnd=NULL, vary.coef=NULL, data, xi, adaptive.weights = NULL, control = list())
fix |
a two-sided linear formula object describing the LASSO-penalized
fixed (time-constant) effects part of the model, with the response on the left of a
|
rnd |
a two-sided linear formula object describing the
random-effects part of the model, with the grouping factor on the left of a
|
vary.coef |
a one-sided linear formula object describing the
time-varying effects part of the model, with the time-varying terms, separated by |
data |
the data frame containing the variables named in the three preceding
|
xi |
the LASSO-penalty parameter that controls the strenght of the penalty term.
The optimal penalty parameter is a tuning parameter of the procedure that has to be determined,
e.g. by K-fold cross validation (see |
adaptive.weights |
for the LASSO-penalized fixed effects a vector of adaptive weights can be passed to the procedure. If no adaptive weights are specified, an unpenalized model (i.e. |
control |
a list of control values for the estimation algorithm to replace the default values returned by the function |
The coxlasso
algorithm is designed to investigate
the effect structure in the Cox frailty model, which is a
widely used model that accounts for heterogeneity in time-to-event data.
Since in survival models one has to account for possible variation of
the effect strength over time, some features can incorporated with time-varying effects.
The penalty is depending on the LASSO tuning parameter , which has to be determined by a suitable technique, e.g. by K-fold cross validation.
Package: | pencoxfrail |
Type: | Package |
Version: | 1.1.2 |
Date: | 2023-08-25 |
License: | GPL-2 |
LazyLoad: | yes |
for loading a dataset type data(nameofdataset)
Generic functions such as print
, predict
, plot
and summary
have methods to show the results of the fit.
The predict
function uses also estimates of random effects for prediction, if possible (i.e. for known subjects of the grouping factor).
Either the survival stepfunction or the baseline hazard (not cumulative!) can be calculated by specifying one of two possible methods: method=c("hazard","survival")
. By default, for each new subject in new.data
an individual stepfunction is calculated on a pre-specified time grid, also accounting for covariate changes over time. Alternatively, for new.data
a single vector of a specific (time-constant) covariate combination can be specified.
Usage:
predict(coxlasso.obj,new.data,time.grid,method=c("hazard","survival"))
The plot
function plots all time-varying effects, including the baseline hazard.
call |
a list containing an image of the |
baseline |
a vector containing the estimated B-spline coefficients of the baseline hazard.
If the covariates corresponding to the time-varying effects are centered (and standardized, see |
time.vary |
a vector containing the estimated B-spline coefficients of all time-varying effects.
If the covariates corresponding to the time-varying effects are standardized (see |
coefficients |
a vector containing the estimated fixed effects. |
ranef |
a vector containing the estimated random effects. |
Q |
a scalar or matrix containing the estimates of the random effects standard deviation or variance-covariance parameters, respectively. |
Delta |
a matrix containing the estimates of fixed and random effects (columns) for each iteration (rows) of the main algorithm (i.e. before the final re-estimation step is performed, see details). |
Q_long |
a list containing the estimates of the random effects variance-covariance parameters for each iteration of the main algorithm. |
iter |
number of iterations until the main algorithm has converged. |
adaptive.weights |
if not given as an argument by the user, a two-column matrix of adaptive weights is calculated by the |
knots |
vector of knots used in the B-spline representation. |
Phi.big |
large B-spline design matrix corresponding to the baseline hazard and all time-varying effects. For the time-varying effects, the B-spline functions (as a function of time) have already been multiplied with their associated covariates. |
time.grid |
the time grid used in when approximating the (Riemann) integral involved in the model's full likelihood. |
m |
number of metric covariates with time-varying effects. |
m2 |
number of categorical covariates with time-varying effects. |
Andreas Groll [email protected]
Maike Hohberg [email protected]
Groll, A., T. Hastie and G. Tutz (2017). Selection of Effects in Cox Frailty Models by Regularization Methods. Biometrics 73(3): 846-856.
coxlassoControl,cv.coxlasso,coxFL,Surv,pbc
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") lasso.obj <- coxlasso(fix=fix.form, data=lung, xi=10, control=list(print.iter=TRUE, exact = 1)) coef(lasso.obj) # now add random institutional effect lasso.obj2 <- coxlasso(fix=fix.form, rnd = list(inst=~1), data=lung, xi=10,control=list(print.iter=TRUE, exact = 1)) coef(lasso.obj2) # print frailty Std.Dev. print(lasso.obj2$Q) # print frailties print(lasso.obj2$ranef) # now fit a time-varying effect for age fix.form <- as.formula("Surv(time, status) ~ 1 + ph.ecog + sex") vary.coef <- as.formula("~ age") lasso.obj3 <- coxlasso(fix=fix.form,vary.coef=vary.coef, data=lung, xi=10,control=list(print.iter=TRUE)) summary(lasso.obj3) # show fit plot(lasso.obj3) # predict survival curve of new subject, institution 1 and up to time 300 pred.obj <- predict(lasso.obj2, newdata=data.frame(inst=1, time=NA, status=NA, age=26, ph.ecog=2,sex=1), time.grid=seq(0,300,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") ## specify a larger new data set new.data <- data.frame(inst=c(1,1,6), time=c(20,40,200), status=c(NA,NA,NA), age=c(26,26,54), ph.ecog=c(0,0,2),sex=c(1,1,1)) ## as here no frailties have been specified, id.var needs to be given! pred.obj2 <- predict(lasso.obj3, newdata=new.data,id.var = "inst") # plot predicted hazard functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$haz[,1])], pred.obj2$haz[,1][!is.na(pred.obj2$haz[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$haz,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$haz[,3])], pred.obj2$haz[,3][!is.na(pred.obj2$haz[,3])], col="red",lty=2,) # plot predicted survival functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$survival[,1])], pred.obj2$survival[,1][!is.na(pred.obj2$survival[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$survival,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$survival[,3])], pred.obj2$survival[,3][!is.na(pred.obj2$survival[,3])], col="red",lty=2,) # see also demo("coxlasso-lung") ## End(Not run)
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") lasso.obj <- coxlasso(fix=fix.form, data=lung, xi=10, control=list(print.iter=TRUE, exact = 1)) coef(lasso.obj) # now add random institutional effect lasso.obj2 <- coxlasso(fix=fix.form, rnd = list(inst=~1), data=lung, xi=10,control=list(print.iter=TRUE, exact = 1)) coef(lasso.obj2) # print frailty Std.Dev. print(lasso.obj2$Q) # print frailties print(lasso.obj2$ranef) # now fit a time-varying effect for age fix.form <- as.formula("Surv(time, status) ~ 1 + ph.ecog + sex") vary.coef <- as.formula("~ age") lasso.obj3 <- coxlasso(fix=fix.form,vary.coef=vary.coef, data=lung, xi=10,control=list(print.iter=TRUE)) summary(lasso.obj3) # show fit plot(lasso.obj3) # predict survival curve of new subject, institution 1 and up to time 300 pred.obj <- predict(lasso.obj2, newdata=data.frame(inst=1, time=NA, status=NA, age=26, ph.ecog=2,sex=1), time.grid=seq(0,300,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") ## specify a larger new data set new.data <- data.frame(inst=c(1,1,6), time=c(20,40,200), status=c(NA,NA,NA), age=c(26,26,54), ph.ecog=c(0,0,2),sex=c(1,1,1)) ## as here no frailties have been specified, id.var needs to be given! pred.obj2 <- predict(lasso.obj3, newdata=new.data,id.var = "inst") # plot predicted hazard functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$haz[,1])], pred.obj2$haz[,1][!is.na(pred.obj2$haz[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$haz,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$haz[,3])], pred.obj2$haz[,3][!is.na(pred.obj2$haz[,3])], col="red",lty=2,) # plot predicted survival functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$survival[,1])], pred.obj2$survival[,1][!is.na(pred.obj2$survival[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$survival,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$survival[,3])], pred.obj2$survival[,3][!is.na(pred.obj2$survival[,3])], col="red",lty=2,) # see also demo("coxlasso-lung") ## End(Not run)
coxlasso
fitThe values supplied in the function call replace the defaults and a list with all possible arguments is returned. The returned list is used as the control
argument to the coxlasso
function.
coxlassoControl(start = NULL, index=NULL, q_start = NULL, conv.eps = 1e-3, standardize = TRUE, center = FALSE, smooth=list(nbasis = 6, penal = 1e+2), print.iter = FALSE, max.iter = 100, c.app = 1e-6, exact = NULL, xr = NULL, eps = 1e-2, quant.knots = TRUE,...)
coxlassoControl(start = NULL, index=NULL, q_start = NULL, conv.eps = 1e-3, standardize = TRUE, center = FALSE, smooth=list(nbasis = 6, penal = 1e+2), print.iter = FALSE, max.iter = 100, c.app = 1e-6, exact = NULL, xr = NULL, eps = 1e-2, quant.knots = TRUE,...)
start |
a vector of suitable length containing starting values for the spline-coefficients of the baseline hazard and the time-varying effects, followed by the fixed and random effects. The correct ordering is important. Default is a vector full of zeros. |
index |
vector which defines the grouping of the variables. Components sharing the same number build a group and factor variables get a single number (and are automatically treated as a group). Non-penalized coefficients are marked with NA. |
q_start |
a scalar or matrix of suitable dimension, specifying starting values for the random-effects variance-covariance matrix. Default is a scalar 0.1 or diagonal matrix with 0.1 in the diagonal, depending on the dimension of the random effects. |
conv.eps |
controls the speed of convergence. Default is 1e-3. |
center |
logical. If true, the covariates corresponding to the time-varying effects will be centered. Default is FALSE (and centering is only recommended if really necessary; it can also have a strong effect on the baseline hazard, in particular, if a strong penalty is selected). |
standardize |
logical. If true, the covariates corresponding to the fixed effects will be scaled to a variance equal to one. Default is TRUE. |
smooth |
a list specifying the number of basis functions |
print.iter |
logical. Should the number of iterations be printed? Default is FALSE. |
max.iter |
the number of iterations for the final Fisher scoring re-estimation procedure. Default is 200. |
c.app |
The parameter controlling the exactness of the quadratic approximations of the penalties. Default is 1e-6. |
exact |
controls the exactness of the (Riemann) integral approximations. If not set by the user to a specific value, it will be automatically chosen such that the time interval [0,t_max] will be divided in about 1000 equal sized Riemann bars. |
xr |
maximal time point that is regarded. Default is NULL and the maximal event or censoring time point in the data is used. |
eps |
Small epsilon that controls which fixed effects are set to zero: parameters with an absolute value smaller than epsilon are taken to be zero. Default is 1e-2. |
quant.knots |
Shall the knots be defined based on quantiles? Default is TRUE. |
.
... |
Futher arguments to be passed. |
a list with components for each of the possible arguments.
Andreas Groll [email protected]
# Use different weighting of the two penalty parts # and lighten the convergence criterion coxlassoControl(c.app = 1e-5, conv.eps=1e-3)
# Use different weighting of the two penalty parts # and lighten the convergence criterion coxlassoControl(c.app = 1e-5, conv.eps=1e-3)
A ridge regression approach for Cox Frailty Models based on the Cox full likelihood is provided.
coxridge(fix=formula, rnd=NULL, vary.coef=NULL, xi.ridge, data, control = list())
coxridge(fix=formula, rnd=NULL, vary.coef=NULL, xi.ridge, data, control = list())
fix |
a two-sided linear formula object describing the LASSO-penalized
fixed (time-constant) effects part of the model, with the response on the left of a
|
rnd |
a two-sided linear formula object describing the
random-effects part of the model, with the grouping factor on the left of a
|
vary.coef |
a one-sided linear formula object describing the
time-varying effects part of the model, with the time-varying terms, separated by |
xi.ridge |
the ridge-penalty parameter that controls the strenght of the penalty term. The optimal penalty parameter is a tuning parameter of the procedure that has to be determined, e.g. by K-fold cross validation. |
data |
the data frame containing the variables named in the three preceding
|
control |
a list of control values for the estimation algorithm to replace the default values returned by the function |
The coxridge
algorithm is designed to investigate
the effect structure in the Cox frailty model, which is a
widely used model that accounts for heterogeneity in time-to-event data.
Since in survival models one has to account for possible variation of
the effect strength over time, some features can incorporated with time-varying effects.
The penalty is depending on the ridge tuning parameter .ridge, which has to be determined by a suitable technique, e.g. by K-fold cross validation.
Package: | pencoxfrail |
Type: | Package |
Version: | 1.1.2 |
Date: | 2023-08-25 |
License: | GPL-2 |
LazyLoad: | yes |
for loading a dataset type data(nameofdataset)
Generic functions such as print
, predict
, plot
and summary
have methods to show the results of the fit.
The predict
function uses also estimates of random effects for prediction, if possible (i.e. for known subjects of the grouping factor).
Either the survival stepfunction or the baseline hazard (not cumulative!) can be calculated by specifying one of two possible methods: method=c("hazard","survival")
. By default, for each new subject in new.data
an individual stepfunction is calculated on a pre-specified time grid, also accounting for covariate changes over time. Alternatively, for new.data
a single vector of a specific (time-constant) covariate combination can be specified.
Usage:
predict(coxlasso.obj,new.data,time.grid,method=c("hazard","survival"))
The plot
function plots all time-varying effects, including the baseline hazard.
call |
a list containing an image of the |
baseline |
a vector containing the estimated B-spline coefficients of the baseline hazard.
If the covariates corresponding to the time-varying effects are centered (and standardized, see |
time.vary |
a vector containing the estimated B-spline coefficients of all time-varying effects.
If the covariates corresponding to the time-varying effects are standardized (see |
coefficients |
a vector containing the estimated fixed effects. |
ranef |
a vector containing the estimated random effects. |
Q |
a scalar or matrix containing the estimates of the random effects standard deviation or variance-covariance parameters, respectively. |
Delta |
a matrix containing the estimates of fixed and random effects (columns) for each iteration (rows) of the main algorithm (i.e. before the final re-estimation step is performed, see details). |
Q_long |
a list containing the estimates of the random effects variance-covariance parameters for each iteration of the main algorithm. |
iter |
number of iterations until the main algorithm has converged. |
adaptive.weights |
if not given as an argument by the user, a two-column matrix of adaptive weights is calculated by the |
knots |
vector of knots used in the B-spline representation. |
Phi.big |
large B-spline design matrix corresponding to the baseline hazard and all time-varying effects. For the time-varying effects, the B-spline functions (as a function of time) have already been multiplied with their associated covariates. |
time.grid |
the time grid used in when approximating the (Riemann) integral involved in the model's full likelihood. |
m |
number of metric covariates with time-varying effects. |
m2 |
number of categorical covariates with time-varying effects. |
Andreas Groll [email protected]
Maike Hohberg [email protected]
Groll, A., T. Hastie and G. Tutz (2017). Selection of Effects in Cox Frailty Models by Regularization Methods. Biometrics 73(3): 846-856.
coxlassoControl,cv.coxlasso,coxFL,Surv,pbc
## Not run: # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") ridge.obj <- coxridge(fix=fix.form, data=lung, xi.ridge=10, control=list(print.iter=TRUE, exact = 1)) coef(ridge.obj) # now add random institutional effect ridge.obj2 <- coxridge(fix=fix.form, rnd = list(inst=~1), data=lung, xi.ridge=10,control=list(print.iter=TRUE, exact = 1)) coef(ridge.obj2) # print frailty Std.Dev. print(ridge.obj2$Q) # print frailties print(ridge.obj2$ranef) # now fit a time-varying effect for age fix.form <- as.formula("Surv(time, status) ~ 1 + ph.ecog + sex") vary.coef <- as.formula("~ age") ridge.obj3 <- ridgelasso(fix=fix.form,vary.coef=vary.coef, data=lung, xi.ridge=10,control=list(print.iter=TRUE)) summary(ridge.obj3) # show fit plot(ridge.obj3) # predict survival curve of new subject, institution 1 and up to time 300 pred.obj <- predict(ridge.obj2, newdata=data.frame(inst=1, time=NA, status=NA, age=26, ph.ecog=2,sex=1), time.grid=seq(0,300,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") ## specify a larger new data set new.data <- data.frame(inst=c(1,1,6), time=c(20,40,200), status=c(NA,NA,NA), age=c(26,26,54), ph.ecog=c(0,0,2),sex=c(1,1,1)) ## as here no frailties have been specified, id.var needs to be given! pred.obj2 <- predict(lasso.obj3, newdata=new.data,id.var = "inst") # plot predicted hazard functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$haz[,1])], pred.obj2$haz[,1][!is.na(pred.obj2$haz[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$haz,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$haz[,3])], pred.obj2$haz[,3][!is.na(pred.obj2$haz[,3])], col="red",lty=2,) # plot predicted survival functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$survival[,1])], pred.obj2$survival[,1][!is.na(pred.obj2$survival[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$survival,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$survival[,3])], pred.obj2$survival[,3][!is.na(pred.obj2$survival[,3])], col="red",lty=2,) # see also demo("coxlasso-lung") ## End(Not run)
## Not run: # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") ridge.obj <- coxridge(fix=fix.form, data=lung, xi.ridge=10, control=list(print.iter=TRUE, exact = 1)) coef(ridge.obj) # now add random institutional effect ridge.obj2 <- coxridge(fix=fix.form, rnd = list(inst=~1), data=lung, xi.ridge=10,control=list(print.iter=TRUE, exact = 1)) coef(ridge.obj2) # print frailty Std.Dev. print(ridge.obj2$Q) # print frailties print(ridge.obj2$ranef) # now fit a time-varying effect for age fix.form <- as.formula("Surv(time, status) ~ 1 + ph.ecog + sex") vary.coef <- as.formula("~ age") ridge.obj3 <- ridgelasso(fix=fix.form,vary.coef=vary.coef, data=lung, xi.ridge=10,control=list(print.iter=TRUE)) summary(ridge.obj3) # show fit plot(ridge.obj3) # predict survival curve of new subject, institution 1 and up to time 300 pred.obj <- predict(ridge.obj2, newdata=data.frame(inst=1, time=NA, status=NA, age=26, ph.ecog=2,sex=1), time.grid=seq(0,300,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") ## specify a larger new data set new.data <- data.frame(inst=c(1,1,6), time=c(20,40,200), status=c(NA,NA,NA), age=c(26,26,54), ph.ecog=c(0,0,2),sex=c(1,1,1)) ## as here no frailties have been specified, id.var needs to be given! pred.obj2 <- predict(lasso.obj3, newdata=new.data,id.var = "inst") # plot predicted hazard functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$haz[,1])], pred.obj2$haz[,1][!is.na(pred.obj2$haz[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$haz,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$haz[,3])], pred.obj2$haz[,3][!is.na(pred.obj2$haz[,3])], col="red",lty=2,) # plot predicted survival functions (for the available time intervals) plot(pred.obj2$time.grid[!is.na(pred.obj2$survival[,1])], pred.obj2$survival[,1][!is.na(pred.obj2$survival[,1])], type="l",xlab="time",ylab="hazard",xlim=c(0,200), ylim=c(0,max(pred.obj2$survival,na.rm=T))) lines(pred.obj2$time.grid[!is.na(pred.obj2$survival[,3])], pred.obj2$survival[,3][!is.na(pred.obj2$survival[,3])], col="red",lty=2,) # see also demo("coxlasso-lung") ## End(Not run)
performs k-fold cross-validation for coxlasso
, produces a plot, and returns a value for the LASSO tuning parameter .
cv.coxlasso(fix, rnd = NULL, vary.coef = NULL, n.folds = 10, xi = NULL, data, adaptive.weights = NULL, print.fold = TRUE, print.xi = FALSE, len.xi = 100, lgrid = TRUE, ran.seed = 1909, xi.factor = 1.01, min.fold = 4, pass.on.start = TRUE, control = list(print.iter = FALSE))
cv.coxlasso(fix, rnd = NULL, vary.coef = NULL, n.folds = 10, xi = NULL, data, adaptive.weights = NULL, print.fold = TRUE, print.xi = FALSE, len.xi = 100, lgrid = TRUE, ran.seed = 1909, xi.factor = 1.01, min.fold = 4, pass.on.start = TRUE, control = list(print.iter = FALSE))
fix |
a two-sided linear formula object describing the fixed (time-constant) effects part of the model, with the response on the left of a
|
rnd |
a two-sided linear formula object describing the
random-effects part of the model, with the grouping factor on the left of a
|
vary.coef |
a one-sided linear formula object describing the
time-varying effects part of the model, with the time-varying terms, separated by |
n.folds |
number of folds. Default is 10. |
xi |
Optional user-supplied xi sequence; default is NULL, and cv.coxlasso chooses its own sequence |
data |
the data frame containing the variables named in the three preceding
|
adaptive.weights |
for the LASSO-penalized fixed effects a vector of adaptive weights can be passed to the procedure. If no adaptive weights are specified, an unpenalized model (i.e. |
print.fold |
Should folds of CV be printed? Default is yes. |
print.xi |
Should current |
len.xi |
Length of |
lgrid |
Logical; shall a logarithmized grid version for the penalty parameter be used? Default is TRUE. |
ran.seed |
Random seed number to be set. Default is 1909, the year of birth of Borussia Dortmund football club. |
xi.factor |
A factor which increases xi.max once again to be sure that xi is large enough on all sets. Default is 1.01 |
min.fold |
Only those xi values are taken into account where at least min.fold folds are not NA. Default is 4. |
pass.on.start |
Shall starting values be passed onthroughout estimation? Default is TRUE |
control |
a list of control values for the estimation algorithm to replace the default values returned by the function |
The function runs coxlasso
over a grid of values for each training data set with one fold omitted.
For each run, the value for the full likelihood is calculated and the average for each on the grid is computed over the folds. The function choses the
that maximizes this likelihood value as the optimal tuning parameter value.
The function returns a list "cv.coxlasso"
which includes:
cv.error |
a vector of mean CV error (i.e., negative likelihood) values for each |
xi.opt |
a scalar value of |
xi.1se |
largest value of |
The plot
function plots the values of against the corresponding CV error (i.e., negative likelihood) values.
Andreas Groll [email protected]
Maike Hohberg [email protected]
To appear soon.
coxlasso, coxlassoControl, coxFL, Surv, pbc
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") # find optimal tuning paramater cv.coxlasso.obj <- cv.coxlasso(fix = fix.form, data = lung, n.folds = 5) # estimate coxlasso model with optimal xi lasso.obj <- coxlasso(fix=fix.form, data=lung, xi=cv.coxlasso.obj$xi.opt, control=list(print.iter=TRUE)) coef(lasso.obj) # see also demo("coxlasso-lung") ## End(Not run)
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # just for illustration, create factor with only three ph.ecog classes lung$ph.ecog[is.na(lung$ph.ecog)] <- 2 lung$ph.ecog[lung$ph.ecog==3] <- 2 lung$ph.ecog <- as.factor(lung$ph.ecog) fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex") # find optimal tuning paramater cv.coxlasso.obj <- cv.coxlasso(fix = fix.form, data = lung, n.folds = 5) # estimate coxlasso model with optimal xi lasso.obj <- coxlasso(fix=fix.form, data=lung, xi=cv.coxlasso.obj$xi.opt, control=list(print.iter=TRUE)) coef(lasso.obj) # see also demo("coxlasso-lung") ## End(Not run)
The function approximates the integral which appears in the (full) Cox likelihood
if the covariate
has a time-varying effect
, which is expanded in B-splines, i.e.
.
int.approx(z,time.grid,B,nbasis,alpha)
int.approx(z,time.grid,B,nbasis,alpha)
z |
a vector which contains at the first component a time point up to which it should be integrated and the covariates |
time.grid |
an equally-spaced time grid on which the B-spline design matrix |
B |
a B-spline design matrix, which has been created with the function |
nbasis |
number of basis functions used when the B-spline design matrix |
alpha |
vector of B-spline coefficients. |
The B-spline design matrix is returned.
Andreas Groll [email protected]
## generate time grid and corresponding B-spline design matrix time.grid <- seq(0,200,by=1) B <- bs.design(x=time.grid, xl=min(time.grid), xr=max(time.grid), spline.degree=3, nbasis=5) ## specify spline coefficients and covariate vector (with upper integral bound as first component) alpha <- c(0.1,0.2,0.05,0.1,0.15) z <- c(time=100,age=25) ## calculate intergal from 0 to 100 int.approx(z=z,time.grid=time.grid,B=B,nbasis=5,alpha=alpha)
## generate time grid and corresponding B-spline design matrix time.grid <- seq(0,200,by=1) B <- bs.design(x=time.grid, xl=min(time.grid), xr=max(time.grid), spline.degree=3, nbasis=5) ## specify spline coefficients and covariate vector (with upper integral bound as first component) alpha <- c(0.1,0.2,0.05,0.1,0.15) z <- c(time=100,age=25) ## calculate intergal from 0 to 100 int.approx(z=z,time.grid=time.grid,B=B,nbasis=5,alpha=alpha)
A regularization approach for Cox Frailty Models by penalization methods is provided.
pencoxfrail(fix=formula, rnd=formula, vary.coef=formula, data, xi, adaptive.weights = NULL, control = list())
pencoxfrail(fix=formula, rnd=formula, vary.coef=formula, data, xi, adaptive.weights = NULL, control = list())
fix |
a two-sided linear formula object describing the unpenalized
fixed (time-constant) effects part of the model, with the response on the left of a
|
rnd |
a two-sided linear formula object describing the
random-effects part of the model, with the grouping factor on the left of a
|
vary.coef |
a one-sided linear formula object describing the
time-varying effects part of the model, with the time-varying terms, separated by |
data |
the data frame containing the variables named in the three preceding
|
xi |
the overall penalty parameter that controls the strenght of both penalty terms in |
adaptive.weights |
a two-column matrix of adaptive weights passed to the procedure; the first column contains the weights |
control |
a list of control values for the estimation algorithm to replace the default values returned by the function |
The pencoxfrail
algorithm is designed to investigate
the effect structure in the Cox frailty model, which is a
widely used model that accounts for heterogeneity in survival data.
Since in survival models one has to account for possible variation of
the effect strength over time the selection of the relevant features distinguishes between the folllowing cases:
covariates can have time-varying effects, can have time-constant effects
or be irrelevant. For this purpose, the following specific penality is applied on the vectors of B-spline coefficients , assuming
different, potentially time-varying effects, each expanded in
B-spline basis functions:
This penalty is able to distinguish between these types of effects to obtain a sparse representation that includes the relevant effects in a proper form.
The penalty is depending on two tuning parameters, and
, which have to be determined by a suitable technique, e.g. by
(2-dimensional) K-fold cross validation.
The first term of the penalty controls the smoothness of the time-varying covariate effects, whereby for
values of and
large enough, all differences
, are removed from the model, resulting in constant covariate effects.
As the B-splines of each variable with varying coefficients sum up to one,
a constant effect is obtained if all spline coefficients are set equal.
Hence, the first penalty term does not affect the spline's global level.
The second term penalizes all spline coefficients belonging to a single time-varying effect in the way
of a group LASSO and, hence, controls the selection of covariates.
Package: | pencoxfrail |
Type: | Package |
Version: | 1.1.2 |
Date: | 2023-08-25 |
License: | GPL-2 |
LazyLoad: | yes |
for loading a dataset type data(nameofdataset)
Generic functions such as print
, predict
, plot
and summary
have methods to show the results of the fit.
The predict
function uses also estimates of random effects for prediction, if possible (i.e. for known subjects of the grouping factor).
Either the survival stepfunction or the baseline hazard (not cumulative!) can be calculated by specifying one of two possible methods: method=c("hazard","survival")
. By default, for each new subject in new.data
an individual stepfunction is calculated on a pre-specified time grid, also accounting for covariate changes over time. Alternatively, for new.data
a single vector of a specific (time-constant) covariate combination can be specified.
Usage:
predict(pencoxfrail.obj,new.data,time.grid,method=c("hazard","survival"))
The plot
function plots all time-varying effects, including the baseline hazard.
call |
a list containing an image of the |
baseline |
a vector containing the estimated B-spline coefficients of the baseline hazard.
If the covariates corresponding to the time-varying effects are centered (and standardized, see |
time.vary |
a vector containing the estimated B-spline coefficients of all time-varying effects.
If the covariates corresponding to the time-varying effects are standardized (see |
coefficients |
a vector containing the estimated fixed effects. |
ranef |
a vector containing the estimated random effects. |
Q |
a scalar or matrix containing the estimates of the random effects standard deviation or variance-covariance parameters, respectively. |
Delta |
a matrix containing the estimates of fixed and random effects (columns) for each iteration (rows) of the main algorithm (i.e. before the final re-estimation step is performed, see details). |
Q_long |
a list containing the estimates of the random effects variance-covariance parameters for each iteration of the main algorithm. |
iter |
number of iterations until the main algorithm has converged. |
adaptive.weights |
If |
knots |
vector of knots used in the B-spline representation. |
Phi.big |
large B-spline design matrix corresponding to the baseline hazard and all time-varying effects. For the time-varying effects, the B-spline functions (as a function of time) have already been multiplied with their associated covariates. |
time.grid |
the time grid used in when approximating the (Riemann) integral involved in the model's full likelihood. |
m |
number of metric covariates with time-varying effects. |
m2 |
number of categorical covariates with time-varying effects. |
Andreas Groll [email protected]
Groll, A., T. Hastie and G. Tutz (2017). Selection of Effects in Cox Frailty Models by Regularization Methods. Biometrics 73(3): 846-856.
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # Random institutional effect fix.form <- as.formula("Surv(time, status) ~ 1") vary.coef <- as.formula("~ age") pen.obj <- pencoxfrail(fix=fix.form,vary.coef=vary.coef, rnd = list(inst=~1), data=lung, xi=10,control=list(print.iter=TRUE)) # show fit plot(pen.obj) # predict survival curve of new subject, institution 1 and up to time 500 pred.obj <- predict(pen.obj,newdata=data.frame(inst=1,time=NA,status=NA,age=26), time.grid=seq(0,500,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") # see also demo("pencoxfrail-pbc") ## End(Not run)
## Not run: data(lung) # remove NAs lung <- lung[!is.na(lung$inst),] # transform inst into factor variable lung$inst <- as.factor(lung$inst) # Random institutional effect fix.form <- as.formula("Surv(time, status) ~ 1") vary.coef <- as.formula("~ age") pen.obj <- pencoxfrail(fix=fix.form,vary.coef=vary.coef, rnd = list(inst=~1), data=lung, xi=10,control=list(print.iter=TRUE)) # show fit plot(pen.obj) # predict survival curve of new subject, institution 1 and up to time 500 pred.obj <- predict(pen.obj,newdata=data.frame(inst=1,time=NA,status=NA,age=26), time.grid=seq(0,500,by=1)) # plot predicted hazard function plot(pred.obj$time.grid,pred.obj$haz,type="l",xlab="time",ylab="hazard") # plot predicted survival function plot(pred.obj$time.grid,pred.obj$survival,type="l",xlab="time",ylab="survival") # see also demo("pencoxfrail-pbc") ## End(Not run)
pencoxfrail
fitThe values supplied in the function call replace the defaults and a list with all possible arguments is returned. The returned list is used as the control
argument to the pencoxfrail
function.
pencoxfrailControl(start = NULL, q_start = NULL, conv.eps = 1e-4, standardize = FALSE, center = FALSE, smooth=list(nbasis = 6, penal = 0.1), ridge.pen = 1e-4, print.iter = FALSE, max.iter = 100, c.app = 1e-6, zeta = 0.5, exact = 1e-2, xr = NULL, ...)
pencoxfrailControl(start = NULL, q_start = NULL, conv.eps = 1e-4, standardize = FALSE, center = FALSE, smooth=list(nbasis = 6, penal = 0.1), ridge.pen = 1e-4, print.iter = FALSE, max.iter = 100, c.app = 1e-6, zeta = 0.5, exact = 1e-2, xr = NULL, ...)
start |
a vector of suitable length containing starting values for the spline-coefficients of the baseline hazard and the time-varying effects, followed by the fixed and random effects. The correct ordering is important. Default is a vector full of zeros. |
q_start |
a scalar or matrix of suitable dimension, specifying starting values for the random-effects variance-covariance matrix. Default is a scalar 0.1 or diagonal matrix with 0.1 in the diagonal, depending on the dimension of the random effects. |
conv.eps |
controls the speed of convergence. Default is 1e-4. |
center |
logical. If true, the covariates corresponding to the time-varying effects will be centered. Default is FALSE (and centering is only recommended if really necessary; it can also have a strong effect on the baseline hazard, in particular, if a strong penalty is selected). |
standardize |
logical. If true, the the covariates corresponding to the time-varying effects will be scaled to a variance equal to one (*after* possible centering). Default is FALSE. |
smooth |
a list specifying the number of basis functions |
ridge.pen |
On all time-varying effects (except for the baseline hazard) a slight ridge penalty is applied on the second order differences of the corresponding spline coefficients to stabilize estimation. Default is 1e-4. |
print.iter |
logical. Should the number of iterations be printed? Default is FALSE. |
max.iter |
the number of iterations for the final Fisher scoring re-estimation procedure. Default is 200. |
c.app |
The parameter controlling the exactness of the quadratic approximations of the penalties. Default is 1e-6. |
zeta |
The parameter controlling the weighting between the two penalty parts in the specific double-penalty |
exact |
controls the exactness of the (Riemann) integral approximations. Default is 1e-2. |
xr |
maximal time point that is regarded. Default is NULL and the maximal event or censoring time point in the data is used. |
... |
Futher arguments to be passed. |
a list with components for each of the possible arguments.
Andreas Groll [email protected]
# Use different weighting of the two penalty parts # and lighten the convergence criterion pencoxfrailControl(zeta=0.3, conv.eps=1e-3)
# Use different weighting of the two penalty parts # and lighten the convergence criterion pencoxfrailControl(zeta=0.3, conv.eps=1e-3)