Title: | Event History Analysis |
---|---|
Description: | Parametric proportional hazards fitting with left truncation and right censoring for common families of distributions, piecewise constant hazards, and discrete models. Parametric accelerated failure time models for left truncated and right censored data. Proportional hazards models for tabular and register data. Sampling of risk sets in Cox regression, selections in the Lexis diagram, bootstrapping. Broström (2022) <doi:10.1201/9780429503764>. |
Authors: | Göran Broström [aut, cre], Jianming Jin [ctb] |
Maintainer: | Göran Broström <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.11.5 |
Built: | 2024-11-20 07:01:40 UTC |
Source: | CRAN |
Parametric proportional hazards fitting with left truncation and right censoring for common families of distributions, piecewise constant hazards, and discrete models. Parametric accelerated failure time models for left truncated and right censored data. Proportional hazards models for tabular and register data. Sampling of risk sets in Cox regression, selections in the Lexis diagram, bootstrapping. Broström (2022) doi:10.1201/9780429503764.
Eha enhances the recommended survival package in several ways,
see the description. The main applications in mind are demography and
epidemiology. For standard Cox regression analysis the function
coxph
in survival is still recommended. The function
coxreg
in eha in fact calls coxph for the standard kind
of analyses.
Maintainer: Göran Broström [email protected]
Other contributors:
Jianming Jin [contributor]
Broström, G. (2012). Event History Analysis with R, Chapman and Hall/CRC Press, Boca Raton, FL.
Useful links:
The accelerated failure time model with parametric baseline hazard(s). Allows for stratification with different scale and shape in each stratum, and left truncated and right censored data.
aftreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), dist = "weibull", init, shape = 0, id, param = c("lifeAcc", "lifeExp"), control = list(eps = 1e-08, maxiter = 20, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE )
aftreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), dist = "weibull", init, shape = 0, id, param = c("lifeAcc", "lifeExp"), control = list(eps = 1e-08, maxiter = 20, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE )
formula |
a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
data |
a data.frame in which to interpret the variables named in the formula. |
na.action |
a missing-data filter function, applied to the model.frame,
after any subset argument has been used. Default is
|
dist |
Which distribution? Default is "weibull", with the alternatives
"gompertz", "ev", "loglogistic" and "lognormal". A special case like the
|
init |
vector of initial values of the iteration. Default initial value is zero for all variables. |
shape |
If positive, the shape parameter is fixed at that value. If
zero or negative, the shape parameter is estimated. Stratification is now
regarded as a meaningful option even if |
id |
If there are more than one spell per individual, it is essential to keep spells together by the id argument. This allows for time-varying covariates. |
param |
Which parametrization should be used? The |
control |
a list with components |
singular.ok |
Not used. |
model |
Not used. |
x |
Return the design matrix in the model object? |
y |
Return the response in the model object? |
The parameterization is different from the one used by
survreg
, when param = "lifeAcc"
. The result
is then true acceleration of time. Then the model is
where is some standardized
survivor function. The baseline parameters
and
are log shape
and log scale, respectively. This is for the
default
parametrization.
With the lifeExp
parametrization, some signs are changed:
is changed to
. For the Gompertz distribution, the
base parametrization is canonical
, a necessity for consistency with
the shape/scale paradigm (this is new in 2.3).
A list of class "aftreg"
with components
coefficients |
Fitted parameter estimates. |
var |
Covariance matrix of the estimates. |
loglik |
Vector of length two; first component is the value at the initial parameter values, the second componet is the maximized value. |
score |
The score test statistic (at the initial value). |
linear.predictors |
The estimated linear predictors. |
means |
Means of the columns of the design matrix. |
w.means |
Weighted (against exposure time) means of covariates; weighted relative frequencies of levels of factors. |
n |
Number of spells in indata (possibly after removal of cases with NA's). |
n.events |
Number of events in data. |
terms |
Used by extractor functions. |
assign |
Used by extractor functions. |
wald.test |
The Wald test statistic (at the initial value). |
y |
The Surv vector. |
isF |
Logical vector indicating the covariates that are factors. |
covars |
The covariates. |
ttr |
Total Time at Risk. |
levels |
List of levels of factors. |
formula |
The calling formula. |
call |
The call. |
method |
The method. |
convergence |
Did the optimization converge? |
fail |
Did the optimization fail? (Is |
pfixed |
TRUE if shape was fixed in the estimation. |
param |
The parametrization. |
Göran Broström
data(mort) aftreg(Surv(enter, exit, event) ~ ses, param = "lifeExp", data = mort)
data(mort) aftreg(Surv(enter, exit, event) ~ ses, param = "lifeExp", data = mort)
This function is called by aftreg
, but it can also be directly
called by a user.
aftreg.fit(X, Y, dist, param, strata, offset, init, shape, id, control, pfixed)
aftreg.fit(X, Y, dist, param, strata, offset, init, shape, id, control, pfixed)
X |
The design (covariate) matrix. |
Y |
A survival object, the response. |
dist |
Which baseline distribution? |
param |
Which parametrization? |
strata |
A stratum variable. |
offset |
Offset. |
init |
Initial regression parameter values. |
shape |
If positive, a fixed value of the shape parameter in the distribution. Otherwise, the shape is estimated. |
id |
See corresponding argument to |
control |
Controls convergence and output. |
pfixed |
A logical indicating fixed shape parameter(s). |
See aftreg
for more detail.
coefficients |
Estimated regression coefficients plus estimated scale and shape coefficients, sorted by strata, if present. |
df |
Degrees of freedom; No. of regression parameters. |
var |
Variance-covariance matrix |
loglik |
Vector of length 2. The first component is the maximized loglihood with only scale and shape in the model, the second the final maximum. |
conver |
TRUE if convergence |
fail |
TRUE if failure |
iter |
Number of Newton-Raphson iterates. |
n.strata |
The number of strata in the data. |
Göran Broström
For a given age interval, each spell is cut to fit into the given age interval.
age.window(dat, window, surv = c("enter", "exit", "event"))
age.window(dat, window, surv = c("enter", "exit", "event"))
dat |
Input data frame. Must contain survival data. |
window |
Vector of length two; the age interval. |
surv |
Vector of length three giving the names of the central variables in 'dat'. |
The window
must be in the order (begin, end)
A data frame of the same form as the input data frame, but 'cut' as
desired. Intervals exceeding window[2]
will be given event = 0
.
If the selection gives an empty result, NULL
is returned, with no warning.
Göran Broström
dat <- data.frame(enter = 0, exit = 5.731, event = 1, x = 2) window <- c(2, 5.3) dat.trim <- age.window(dat, window)
dat <- data.frame(enter = 0, exit = 5.731, event = 1, x = 2) window <- c(2, 5.3) dat.trim <- age.window(dat, window)
For a given time interval, each spell is cut so that it fully lies in the given time interval
cal.window(dat, window, surv = c("enter", "exit", "event", "birthdate"))
cal.window(dat, window, surv = c("enter", "exit", "event", "birthdate"))
dat |
Input data frame. Must contain survival data and a birth date. |
window |
Vector of length two; the time interval |
surv |
Vector of length four giving the names of the central variables in 'dat'. |
The window
must be in the order (begin, end)
A data frame of the same form as the input data frame, but 'cut' as
desired. Intervals exceeding window[2]
will be given event = 0
Göran Broström
dat <- data.frame(enter = 0, exit = 5.731, event = 1, birthdate = 1962.505, x = 2) window <- c(1963, 1965) dat.trim <- cal.window(dat, window)
dat <- data.frame(enter = 0, exit = 5.731, event = 1, birthdate = 1962.505, x = 2) window <- c(1963, 1965) dat.trim <- cal.window(dat, window)
Comparison of the cumulative hazards functions for a semi-parametric and a parametric model.
check.dist(sp, pp, main = NULL, col = 1:2, lty = 1:2, printLegend = TRUE)
check.dist(sp, pp, main = NULL, col = 1:2, lty = 1:2, printLegend = TRUE)
sp |
An object of type "coxreg", typically output from
|
pp |
An object of type "phreg", typically output from
|
main |
Header for the plot. Default is distribution and "cumulative hazard function" |
col |
Line colors. should be |
lty |
line types. |
printLegend |
Should a legend be printed? Default is |
For the moment only a graphical comparison. The arguments sp
and
pp
may be swapped.
No return value.
Göran Broström
data(mort) oldpar <- par(mfrow = c(2, 2)) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) fit.w <- phreg(Surv(enter, exit, event) ~ ses, data = mort) fit.g <- phreg(Surv(enter, exit, event) ~ ses, data = mort, dist = "gompertz") fit.ev <- phreg(Surv(enter, exit, event) ~ ses, data = mort, dist = "ev") check.dist(fit.cr, fit.w) check.dist(fit.cr, fit.g) check.dist(fit.cr, fit.ev) par(oldpar)
data(mort) oldpar <- par(mfrow = c(2, 2)) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) fit.w <- phreg(Surv(enter, exit, event) ~ ses, data = mort) fit.g <- phreg(Surv(enter, exit, event) ~ ses, data = mort, dist = "gompertz") fit.ev <- phreg(Surv(enter, exit, event) ~ ses, data = mort, dist = "ev") check.dist(fit.cr, fit.w) check.dist(fit.cr, fit.g) check.dist(fit.cr, fit.ev) par(oldpar)
Check that exit occurs after enter, that spells from an individual do not overlap, and that each individual experiences at most one event.
check.surv(enter, exit, event, id = NULL, eps = 1e-08)
check.surv(enter, exit, event, id = NULL, eps = 1e-08)
enter |
Left truncation time. |
exit |
Time of exit. |
event |
Indicator of event. Zero means 'no event'. |
id |
Identification of individuals. |
eps |
The smallest allowed spell length or overlap. |
Interval lengths must be strictly positive.
A vector of id's for the insane individuals. Of zero length if no errors.
Göran Broström
xx <- data.frame(enter = c(0, 1), exit = c(1.5, 3), event = c(0, 1), id = c(1,1)) check.surv(xx$enter, xx$exit, xx$event, xx$id)
xx <- data.frame(enter = c(0, 1), exit = c(1.5, 3), event = c(0, 1), id = c(1,1)) check.surv(xx$enter, xx$exit, xx$event, xx$id)
Children born in Skellefteå, Sweden, 1850-1884, are followed fifteen years or until death or out-migration.
data(child)
data(child)
A data frame with 26855 children born 1850-1884.
id
An identification number.
m.id
Mother's id.
sex
Sex.
socBranch
Working branch of family (father).
birthdate
Birthdate.
enter
Start age of follow-up, always zero.
exit
Age of departure, either by death or emigration.
event
Type of departure, death = 1, right censoring = 0.
illeg
Born out of marriage ("illegitimate")?
m.age
Mother's age.
The Skellefteå region is a large region in the northern part of Sweden.
Data originate from the Centre for Demographic and Ageing Research, Umeå University, Umeå, Sweden, https://www.umu.se/en/centre-for-demographic-and-ageing-research/.
fit <- coxreg(Surv(enter, exit, event) ~ sex + socBranch, data = child, coxph = TRUE) summary(fit)
fit <- coxreg(Surv(enter, exit, event) ~ sex + socBranch, data = child, coxph = TRUE) summary(fit)
Comparison of the estimated baseline cumulative hazards functions for two survival models.
compHaz( fit1, fit2, main = NULL, lty = 1:2, col = c("red", "blue"), printLegend = TRUE )
compHaz( fit1, fit2, main = NULL, lty = 1:2, col = c("red", "blue"), printLegend = TRUE )
fit1 |
An object of type "coxreg", "phreg", or other output from from survival fitters. |
fit2 |
An object of type "coxreg", "phreg", or other output from survival fitters. |
main |
Header for the plot. Default is |
lty |
line types. |
col |
Line colors. should be |
printLegend |
Should a legend be printed? Default is |
No return value.
Göran Broström
fit.cr <- coxreg(Surv(enter, exit, event) ~ sex, data = oldmort) fit.w <- phreg(Surv(enter, exit, event) ~ sex, data = oldmort) compHaz(fit.cr, fit.w)
fit.cr <- coxreg(Surv(enter, exit, event) ~ sex, data = oldmort) fit.w <- phreg(Surv(enter, exit, event) ~ sex, data = oldmort) compHaz(fit.cr, fit.w)
Calculates minus the log likelihood function and its first and second order
derivatives for data from a Cox regression model. It is used by coxreg
if the argument coxph = FALSE
coxfunk(beta, X, offset, rs, what = 2)
coxfunk(beta, X, offset, rs, what = 2)
beta |
Regression parameters |
X |
The design (covariate) matrix. |
offset |
Offset. |
rs |
Risk set created by |
what |
what = 0 means only loglihood, 1 means score vector as well, 2 loglihood, score and hessian. |
Note that the function returns log likelihood, score vector and minus hessian, i.e. the observed information. The model is
A list with components
loglik |
The log likelihood. |
dloglik |
The score vector. Nonzero if |
d2loglik |
The hessian. Nonzero if |
Göran Broström
Performs Cox regression with some special attractions, especially sampling of risksets and the weird bootstrap.
coxreg(formula = formula(data), data = parent.frame(), weights, subset, t.offset, na.action = getOption("na.action"), init = NULL, method = c("efron", "breslow", "mppl", "ml"), control = list(eps = 1e-08, maxiter = 25, trace = FALSE), singular.ok = TRUE, model = FALSE, center = NULL, x = FALSE, y = TRUE, hazards = NULL, boot = FALSE, efrac = 0, geometric = FALSE, rs = NULL, frailty = NULL, max.survs = NULL, coxph = TRUE)
coxreg(formula = formula(data), data = parent.frame(), weights, subset, t.offset, na.action = getOption("na.action"), init = NULL, method = c("efron", "breslow", "mppl", "ml"), control = list(eps = 1e-08, maxiter = 25, trace = FALSE), singular.ok = TRUE, model = FALSE, center = NULL, x = FALSE, y = TRUE, hazards = NULL, boot = FALSE, efrac = 0, geometric = FALSE, rs = NULL, frailty = NULL, max.survs = NULL, coxph = TRUE)
formula |
a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
data |
a data.frame in which to interpret the variables named in the formula. |
weights |
Case weights; time-fixed or time-varying. |
subset |
An optional vector specifying a subset of observations to be used in the fitting process. |
t.offset |
Case offsets; time-varying. |
na.action |
a missing-data filter function, applied to the model.frame,
after any subset argument has been used. Default is
|
init |
vector of initial values of the iteration. Default initial value is zero for all variables. |
method |
Method of treating ties, "efron" (default), "breslow", "mppl" (maximum partial partial likelihood), or "ml" (maximum likelihood). |
control |
a list with components |
singular.ok |
Not used |
model |
Not used |
center |
deprecated. See Details. |
x |
Return the design matrix in the model object? |
y |
return the response in the model object? |
hazards |
deprecated. Was: Calculate baseline hazards? Default is TRUE. Calculating hazards is better done separately, after fitting. In most cases. |
boot |
Number of boot replicates. Defaults to FALSE, no boot samples. |
efrac |
Upper limit of fraction failures in 'mppl'. |
geometric |
If TRUE, forces an 'ml' model with constant riskset probability. Default is FALSE. |
rs |
Risk set? |
frailty |
Grouping variable for frailty analysis. Not in use (yet). |
max.survs |
Sampling of risk sets? If given, it should be (the upper limit of) the number of survivors in each risk set. |
coxph |
Logical, defaults to |
The default method, efron
, and the alternative, breslow
, are
both the same as in coxph
in package
survival
. The methods mppl
and ml
are maximum
likelihood, discrete-model, based.
A list of class c("coxreg", "coxph")
with components
coefficients |
Fitted parameter estimates. |
var |
Covariance matrix of the estimates. |
loglik |
Vector of length two; first component is the value at the initial parameter values, the second component is the maximized value. |
score |
The score test statistic (at the initial value). |
linear.predictors |
The estimated linear predictors. |
residuals |
The martingale residuals. |
hazards |
The estimated baseline hazards, calculated at the value zero of the covariates (rather, columns of the design matrix). Is a list, with one component per stratum. Each component is a matrix with two columns, the first contains risk times, the second the corresponding hazard atom. |
means |
Means of the columns of
the design matrix corresponding to covariates, if |
w.means |
Weighted (against exposure time) means of covariates; weighted relative frequencies of levels of factors. |
n |
Number of spells in indata (possibly after removal of cases with NA's). |
n.events |
Number of events in data. |
terms |
Used by extractor functions. |
assign |
Used by extractor functions. |
y |
The Surv vector. |
isF |
Logical vector indicating the covariates that are factors. |
covars |
The covariates. |
ttr |
Total Time at Risk. |
levels |
List of levels of factors. |
formula |
The calling formula. |
bootstrap |
The (matrix of) bootstrap replicates, if requested on input. It is up to the user to do whatever desirable with this sample. |
call |
The call. |
method |
The method. |
n.strata |
Number of strata. |
convergence |
Did the optimization converge? |
fail |
Did the optimization fail? (Is |
The use of rs
is dangerous, see note. It can
however speed up computing time considerably for huge data sets.
This function starts by creating risksets, if no riskset is supplied
via rs
, with the aid of risksets
. Supplying output from
risksets
via rs
fails if there are any NA's in the data! Note
also that it depends on stratification, so rs
contains information
about stratification. Giving another strata variable in the formula is an
error. The same is ok, for instance to supply stratum interactions.
Göran Broström
Broström, G. and Lindkvist, M. (2008). Partial partial likelihood. Communications in Statistics: Simulation and Computation 37:4, 679-686.
dat <- data.frame(time= c(4, 3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x= c(0, 2,1,1,1,0,0), sex= c(0, 0,0,0,1,1,1)) coxreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model # Same as: rs <- risksets(Surv(dat$time, dat$status), strata = dat$sex) coxreg( Surv(time, status) ~ x, data = dat, rs = rs) #stratified model
dat <- data.frame(time= c(4, 3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x= c(0, 2,1,1,1,0,0), sex= c(0, 0,0,0,1,1,1)) coxreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model # Same as: rs <- risksets(Surv(dat$time, dat$status), strata = dat$sex) coxreg( Surv(time, status) ~ x, data = dat, rs = rs) #stratified model
Called by coxreg
, but a user can call it directly.
coxreg.fit( X, Y, rs, weights, t.offset = NULL, strats, offset, init, max.survs, method = "efron", boot = FALSE, efrac = 0, calc.martres = TRUE, control, verbose = TRUE, calc.hazards = NULL, center = NULL )
coxreg.fit( X, Y, rs, weights, t.offset = NULL, strats, offset, init, max.survs, method = "efron", boot = FALSE, efrac = 0, calc.martres = TRUE, control, verbose = TRUE, calc.hazards = NULL, center = NULL )
X |
The design matrix. |
Y |
The survival object. |
rs |
The risk set composition. If absent, calculated. |
weights |
Case weights; time-fixed or time-varying. |
t.offset |
Case offset; time-varying. |
strats |
The stratum variable. Can be absent. |
offset |
Offset. Can be absent. |
init |
Start values. If absent, equal to zero. |
max.survs |
Sampling of risk sets? If so, gives the maximum number of survivors in each risk set. |
method |
Either "efron" (default) or "breslow". |
boot |
Number of bootstrap replicates. Defaults to FALSE, no bootstrapping. |
efrac |
Upper limit of fraction failures in 'mppl'. |
calc.martres |
Should martingale residuals be calculated? |
control |
See |
verbose |
Should Warnings about convergence be printed? |
calc.hazards |
Deprecated. See |
center |
Deprecated. See |
rs
is dangerous to use when NA's are present.
A list with components
coefficients |
Estimated regression parameters. |
var |
Covariance matrix of estimated coefficients. |
loglik |
First component is value at |
score |
Score test statistic, at initial value. |
linear.predictors |
Linear predictors. |
residuals |
Martingale residuals. |
hazard |
Estimated baseline hazard. At value zero of design variables. |
means |
Means of the columns of the design matrix. |
bootstrap |
The bootstrap replicates, if requested on input. |
conver |
|
f.conver |
TRUE if variables converged. |
fail |
|
iter |
Number of performed iterations. |
It is the user's responsibility to check that indata is sane.
Göran Broström
X <- as.matrix(data.frame( x= c(0, 2,1,4,1,0,3), sex= c(1, 0,0,0,1,1,1))) time <- c(1,2,3,4,5,6,7) status <- c(1,1,1,0,1,1,0) stratum <- rep(1, length(time)) coxreg.fit(X, Surv(time, status), strats = stratum, max.survs = 6, control = list(eps=1.e-4, maxiter = 10, trace = FALSE))
X <- as.matrix(data.frame( x= c(0, 2,1,4,1,0,3), sex= c(1, 0,0,0,1,1,1))) time <- c(1,2,3,4,5,6,7) status <- c(1,1,1,0,1,1,0) stratum <- rep(1, length(time)) coxreg.fit(X, Surv(time, status), strats = stratum, max.survs = 6, control = list(eps=1.e-4, maxiter = 10, trace = FALSE))
Performs Cox regression with some special attractions, especially sampling of risksets and the weird bootstrap.
coxreg2(formula = formula(data), data = parent.frame(), weights, subset, t.offset, na.action = getOption("na.action"), init = NULL, method = c("efron", "breslow", "mppl", "ml"), control = list(eps = 1e-08, maxiter = 25, trace = FALSE), singular.ok = TRUE, model = FALSE, center = NULL, x = FALSE, y = TRUE, hazards = NULL, boot = FALSE, efrac = 0, geometric = FALSE, rs = NULL, frailty = NULL, max.survs = NULL, coxph = TRUE)
coxreg2(formula = formula(data), data = parent.frame(), weights, subset, t.offset, na.action = getOption("na.action"), init = NULL, method = c("efron", "breslow", "mppl", "ml"), control = list(eps = 1e-08, maxiter = 25, trace = FALSE), singular.ok = TRUE, model = FALSE, center = NULL, x = FALSE, y = TRUE, hazards = NULL, boot = FALSE, efrac = 0, geometric = FALSE, rs = NULL, frailty = NULL, max.survs = NULL, coxph = TRUE)
formula |
a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
data |
a data.frame in which to interpret the variables named in the formula. |
weights |
Case weights; time-fixed or time-varying. |
subset |
An optional vector specifying a subset of observations to be used in the fitting process. |
t.offset |
Case offsets; time-varying. |
na.action |
a missing-data filter function, applied to the model.frame,
after any subset argument has been used. Default is
|
init |
vector of initial values of the iteration. Default initial value is zero for all variables. |
method |
Method of treating ties, "efron" (default), "breslow", "mppl" (maximum partial partial likelihood), or "ml" (maximum likelihood). |
control |
a list with components |
singular.ok |
Not used |
model |
Not used |
center |
deprecated. See Details. |
x |
Return the design matrix in the model object? |
y |
return the response in the model object? |
hazards |
deprecated. Was: Calculate baseline hazards? Default is TRUE. Calculating hazards is better done separately, after fitting. In most cases. |
boot |
Number of boot replicates. Defaults to FALSE, no boot samples. |
efrac |
Upper limit of fraction failures in 'mppl'. |
geometric |
If TRUE, forces an 'ml' model with constant riskset probability. Default is FALSE. |
rs |
Risk set? |
frailty |
Grouping variable for frailty analysis. Not in use (yet). |
max.survs |
Sampling of risk sets? If given, it should be (the upper limit of) the number of survivors in each risk set. |
coxph |
Logical, defaults to |
The default method, efron
, and the alternative, breslow
, are
both the same as in coxph
in package
survival
. The methods mppl
and ml
are maximum
likelihood, discrete-model, based.
A list of class c("coxreg", "coxph")
with components
coefficients |
Fitted parameter estimates. |
var |
Covariance matrix of the estimates. |
loglik |
Vector of length two; first component is the value at the initial parameter values, the second component is the maximized value. |
score |
The score test statistic (at the initial value). |
linear.predictors |
The estimated linear predictors. |
residuals |
The martingale residuals. |
hazards |
The estimated baseline hazards, calculated at the value zero of the covariates (rather, columns of the design matrix). Is a list, with one component per stratum. Each component is a matrix with two columns, the first contains risk times, the second the corresponding hazard atom. |
means |
Means of the columns of
the design matrix corresponding to covariates, if |
w.means |
Weighted (against exposure time) means of covariates; weighted relative frequencies of levels of factors. |
n |
Number of spells in indata (possibly after removal of cases with NA's). |
n.events |
Number of events in data. |
terms |
Used by extractor functions. |
assign |
Used by extractor functions. |
y |
The Surv vector. |
isF |
Logical vector indicating the covariates that are factors. |
covars |
The covariates. |
ttr |
Total Time at Risk. |
levels |
List of levels of factors. |
formula |
The calling formula. |
bootstrap |
The (matrix of) bootstrap replicates, if requested on input. It is up to the user to do whatever desirable with this sample. |
call |
The call. |
method |
The method. |
n.strata |
Number of strata. |
convergence |
Did the optimization converge? |
fail |
Did the optimization fail? (Is |
The use of rs
is dangerous, see note. It can
however speed up computing time considerably for huge data sets.
This function starts by creating risksets, if no riskset is supplied
via rs
, with the aid of risksets
. Supplying output from
risksets
via rs
fails if there are any NA's in the data! Note
also that it depends on stratification, so rs
contains information
about stratification. Giving another strata variable in the formula is an
error. The same is ok, for instance to supply stratum interactions.
Göran Broström
Broström, G. and Lindkvist, M. (2008). Partial partial likelihood. Communications in Statistics: Simulation and Computation 37:4, 679-686.
dat <- data.frame(time= c(4, 3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x= c(0, 2,1,1,1,0,0), sex= c(0, 0,0,0,1,1,1)) coxreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model # Same as: rs <- risksets(Surv(dat$time, dat$status), strata = dat$sex) coxreg( Surv(time, status) ~ x, data = dat, rs = rs) #stratified model
dat <- data.frame(time= c(4, 3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x= c(0, 2,1,1,1,0,0), sex= c(0, 0,0,0,1,1,1)) coxreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model # Same as: rs <- risksets(Surv(dat$time, dat$status), strata = dat$sex) coxreg( Surv(time, status) ~ x, data = dat, rs = rs) #stratified model
Given a data frame with a defined response variable, this function creates a unique representation of the covariates in the data frame, vector (matrix) of responses, and a pointer vector, connecting the responses with the corresponding covariates.
cro(dat, response = 1)
cro(dat, response = 1)
dat |
A data frame |
response |
The column(s) where the response resides. |
The rows in the data frame are converted to text strings with paste
and compared with match
.
A list with components
y |
The response. |
covar |
A data frame with unique rows of covariates. |
keys |
Pointers from |
This function is based on suggestions by Anne York and Brian Ripley.
Göran Broström
dat <- data.frame(y = c(1.1, 2.3, 0.7), x1 = c(1, 0, 1), x2 = c(0, 1, 0)) cro(dat)
dat <- data.frame(y = c(1.1, 2.3, 0.7), x1 = c(1, 0, 1), x2 = c(0, 1, 0)) cro(dat)
Density, distribution function, quantile function, hazard function,
cumulative hazard function, and random generation for the EV distribution
with parameters shape
and scale
.
dEV(x, shape = 1, scale = 1, log = FALSE) pEV(q, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) qEV(p, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) hEV(x, shape = 1, scale = 1, log = FALSE) HEV(x, shape = 1, scale = 1, log.p = FALSE) rEV(n, shape = 1, scale = 1)
dEV(x, shape = 1, scale = 1, log = FALSE) pEV(q, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) qEV(p, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) hEV(x, shape = 1, scale = 1, log = FALSE) HEV(x, shape = 1, scale = 1, log.p = FALSE) rEV(n, shape = 1, scale = 1)
shape , scale
|
shape and scale parameters, both defaulting to 1. |
lower.tail |
logical; if TRUE (default), probabilities are |
x , q
|
vector of quantiles. |
p |
vector of probabilities. |
n |
number of observations. If |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
The EV distribution with scale
parameter and
shape
parameter has hazard function given by
for .
dEV
gives the density, pEV
gives the distribution
function, qEV
gives the quantile function, hEV
gives the
hazard function, HEV
gives the cumulative hazard function, and
rEV
generates random deviates.
Invalid arguments will result in return value NaN
, with a warning.
Birth intervals for married women with at least one birth, 19th northern Sweden
data(fert)
data(fert)
A data frame with 12169 observations the lengths (in years) of birth
intervals for 1859 married women with at least one birth. The first
interval (parity = 0
) is the interval from marriage to first birth.
id
Personal identification number for mother.
parity
Time order of birth interval for the present
mother. The interval with parity = 0
is the first, from
marriage to first birth.
age
The age of mother at start of interval.
year
The calendar year at start of interval.
next.ivl
The length of the coming time interval.
event
An indicator for whether the next.ivl
ends in a new birth (event = 1
) or is right censored
(event = 0
). Censoring occurs when the woman ends her
fertility period within her first marriage (marriage dissolution
or reaching the age of 48).
prev.ivl
The length of the previous time interval. May be used as explanatory variable in a Cox regression of birth intervals.
ses
Socio-economic status, a factor with levels
lower
, upper
, farmer
, and unknown
.
parish
The Skelleftea region consists of three parishes, Jorn, Norsjo, and Skelleftea.
The data set contain clusters of dependent observations defined by mother's id.
Data is coming from The Demographic Data Base, Umea University, Umea, Sweden.
https://www.umu.se/enheten-for-demografi-och-aldrandeforskning/
data(fert) fit <- coxreg(Surv(next.ivl, event) ~ ses + prev.ivl, data = fert, subset = (parity == 1)) summary(fit)
data(fert) fit <- coxreg(Surv(next.ivl, event) ~ ses + prev.ivl, data = fert, subset = (parity == 1)) summary(fit)
Utilizing GLMM models: Experimental, not exported (yet).
frail.fit(X, Y, rs, strats, offset, init, max.survs, frailty, control)
frail.fit(X, Y, rs, strats, offset, init, max.survs, frailty, control)
X |
design matrix |
Y |
survival object |
rs |
output from |
strats |
strata |
offset |
offset |
init |
start values |
max.survs |
for sampling of riskset survivors |
frailty |
grouping variable |
control |
control of optimization |
This function is called from coxreg
. A user may call it directly.
geome.fit(X, Y, rs, strats, offset, init, max.survs, method = "ml", control)
geome.fit(X, Y, rs, strats, offset, init, max.survs, method = "ml", control)
X |
The design matrix |
Y |
Survival object |
rs |
risk set produced by |
strats |
Stratum indicator |
offset |
Offset |
init |
Initial values |
max.survs |
Maximal survivors |
method |
"ml", always, i.e., this argument is ignored. |
control |
See |
See the code.
Nothing special
coxreg
is a defunct function
Göran Broström
See coxreg
.
Density, distribution function, quantile function, hazard function,
cumulative hazard function, and random generation for the Gompertz
distribution with parameters shape
and scale
.
dgompertz(x, shape = 1, scale = 1, rate, log = FALSE, param = c("default", "canonical", "rate")) pgompertz(q, shape = 1, scale = 1, rate, lower.tail = TRUE, log.p = FALSE, param = c("default", "canonical", "rate")) qgompertz(p, shape = 1, scale = 1, rate, lower.tail = TRUE, log.p = FALSE, param = c("default", "canonical", "rate")) hgompertz(x, shape = 1, scale = 1, rate, log = FALSE, param = c("default", "canonical", "rate")) Hgompertz(x, shape = 1, scale = 1, rate, log.p = FALSE, param = c("default", "canonical", "rate")) rgompertz(n, shape = 1, scale = 1, rate, param = c("default", "canonical", "rate"))
dgompertz(x, shape = 1, scale = 1, rate, log = FALSE, param = c("default", "canonical", "rate")) pgompertz(q, shape = 1, scale = 1, rate, lower.tail = TRUE, log.p = FALSE, param = c("default", "canonical", "rate")) qgompertz(p, shape = 1, scale = 1, rate, lower.tail = TRUE, log.p = FALSE, param = c("default", "canonical", "rate")) hgompertz(x, shape = 1, scale = 1, rate, log = FALSE, param = c("default", "canonical", "rate")) Hgompertz(x, shape = 1, scale = 1, rate, log.p = FALSE, param = c("default", "canonical", "rate")) rgompertz(n, shape = 1, scale = 1, rate, param = c("default", "canonical", "rate"))
shape , scale
|
shape and scale parameters, both defaulting to 1. |
rate |
the rate parameter for that parametrization, replaces scale. |
lower.tail |
logical; if TRUE (default), probabilities are |
param |
default or canonical or rate. |
x , q
|
vector of quantiles. |
p |
vector of probabilities. |
n |
number of observations. If |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
The Gompertz distribution with scale
parameter and
shape
parameter has hazard function given by
for .
If
param = "canonical"
, then then a –> a/b, so that b is a
true scale parameter (for any fixed a), and b is an 'AFT parameter'.
If param = "rate"
, then b –> 1/b.
dgompertz
gives the density, pgompertz
gives the distribution
function, qgompertz
gives the quantile function, hgompertz
gives the
hazard function, Hgompertz
gives the cumulative hazard function, and
rgompertz
generates random deviates.
Invalid arguments will result in return value NaN
, with a warning.
Get baseline hazards atoms from fits from
hazards(x, cum = TRUE, ...)
hazards(x, cum = TRUE, ...)
x |
A |
cum |
Logical: Should the cumulative hazards be returned? |
... |
Additional arguments for various methods. |
A list where each component is a two-column matrix representing hazard atoms from one stratum. The first column is event time, and the second column is the corresponding hazard atom.
HISCO to HISCLASS transformation
HiscoHisclass(hisco, status = NULL, relation = NULL, urban = NULL, debug = FALSE)
HiscoHisclass(hisco, status = NULL, relation = NULL, urban = NULL, debug = FALSE)
hisco |
Hisco codes to be transformed to hisclass. |
status |
Optional standard description of status. |
relation |
Relation between owner of occupation and self. |
urban |
Logical, "Is residence in an urban area?" |
debug |
Logical, prints intermediate values if TRUE. |
A vector of hisclass codes, same length as input hisco.
Göran Broström with translation and modification of a Stata do.
Van Leeuwen, M. and Maas, I. (2011). HISCLASS. A Historical International Social Class Scheme. Leuwen University Press.
strata
function imported from survival
This function is imported from the survival
package. See
strata
.
Surv
function imported from survival
This function is imported from the survival
package. See
Surv
.
Matched data on infant mortality, from seven parishes in Sweden, 1821–1894.
data(infants)
data(infants)
A data frame with 80 rows and five variables.
stratum
Triplet No. Each triplet consist of one infant whose mother died (a case), and two controls, i.e, infants whose mother did not die. Matched on covariates below.
enter
Age (in days) of case when its mother died.
exit
Age (in days) at death or right censoring (at age 365 days).
event
Follow-up ends with death (1) or right censoring (0).
mother
dead
for cases, alive
for controls.
age
Mother's age at infant's birth.
sex
The infant's sex.
parish
Birth parish, either Nedertornea or not Nedertornea.
civst
Civil status of mother, married
or
unmarried
.
ses
Socio-economic status of mother, either farmer or not farmer.
year
Year of birth of the infant.
From 5641 first-born in seven Swedish parishes 1820-1895, from Fleninge in the very south to Nedertorneå in the very north, those whose mother died during their first year of life were selected, in all 35 infants. To each of them, two controls were selected by matching on the given covariates.
Data originate from The Demographic Data Base, Umeå University, Umeå, Sweden, https://www.umu.se/enheten-for-demografi-och-aldrandeforskning/.
Broström, G. (1987). The influence of mother's death on infant mortality: A case study in matched data survival analysis. Scandinavian Journal of Statistics 14, 113-123.
data(infants) fit <- coxreg(Surv(enter, exit, event) ~ strata(stratum) + mother, data = infants) fit fit.w <- phreg(Surv(enter, exit, event) ~ mother + parish + ses, data = infants) summary(fit.w) ## Weibull proportional hazards model.
data(infants) fit <- coxreg(Surv(enter, exit, event) ~ strata(stratum) + mother, data = infants) fit fit.w <- phreg(Surv(enter, exit, event) ~ mother + parish + ses, data = infants) summary(fit.w) ## Weibull proportional hazards model.
Unnecessary cut spells are glued together, overlapping spells are "polished", etc.
join.spells(dat, strict = FALSE, eps = 1e-08)
join.spells(dat, strict = FALSE, eps = 1e-08)
dat |
A data frame with names enter, exit, event, id. |
strict |
If TRUE, nothing is changed if errors in spells (non-positive length, overlapping intervals, etc.) are detected. Otherwise (the default), bad spells are removed, with "earlier life" having higher priority. |
eps |
Tolerance for equality of two event times. Should be kept small. |
In case of overlapping intervals (i.e., a data error), the appropriate id's
are returned if strict
is TRUE
.
A data frame with the same variables as the input, but individual spells are joined, if possible (identical covariate values, and adjacent time intervals).
Göran Broström
Therneau, T.M. and Grambsch, P.M. (2000). Modeling Survival Data: Extending the Cox model. Springer.
Density, distribution function, quantile function, hazard function,
cumulative hazard function, and random generation for the Loglogistic distribution
with parameters shape
and scale
.
dllogis(x, shape = 1, scale = 1, log = FALSE) pllogis(q, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) qllogis(p, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) hllogis(x, shape = 1, scale = 1, prop = 1, log = FALSE) Hllogis(x, shape = 1, scale = 1, prop = 1, log.p = FALSE) rllogis(n, shape = 1, scale = 1)
dllogis(x, shape = 1, scale = 1, log = FALSE) pllogis(q, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) qllogis(p, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE) hllogis(x, shape = 1, scale = 1, prop = 1, log = FALSE) Hllogis(x, shape = 1, scale = 1, prop = 1, log.p = FALSE) rllogis(n, shape = 1, scale = 1)
shape , scale
|
shape and scale parameters, both defaulting to 1. |
lower.tail |
logical; if TRUE (default), probabilities are |
x , q
|
vector of quantiles. |
p |
vector of probabilities. |
n |
number of observations. If |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
prop |
proportionality constant in the extended Loglogistic distribution. |
The Loglogistic distribution with scale
parameter and
shape
parameter has hazard function given by
for .
dllogis
gives the density, pllogis
gives the distribution
function, qllogis
gives the quantile function, hllogis
gives the
hazard function, Hllogis
gives the cumulative hazard function, and
rllogis
generates random deviates.
Invalid arguments will result in return value NaN
, with a warning.
Density, distribution function, quantile function, hazard function,
cumulative hazard function, and random generation for the Lognormal distribution
with parameters shape
and scale
.
hlnorm(x, meanlog = 0, sdlog = 1, shape = 1 / sdlog, scale = exp(meanlog), prop = 1, log = FALSE) Hlnorm(x, meanlog = 0, sdlog = 1, shape = 1 / sdlog, scale = exp(meanlog), prop = 1, log.p = FALSE)
hlnorm(x, meanlog = 0, sdlog = 1, shape = 1 / sdlog, scale = exp(meanlog), prop = 1, log = FALSE) Hlnorm(x, meanlog = 0, sdlog = 1, shape = 1 / sdlog, scale = exp(meanlog), prop = 1, log.p = FALSE)
x |
vector of quantiles. |
meanlog |
mean in the Normal distribution. |
sdlog , shape
|
sdlog is standard deviation in the Normal distrimution, shape = 1/sdlog. |
scale |
is exp(meanlog). |
prop |
proportionality constant in the extended Lognormal distribution. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
The Lognormal distribution with scale
parameter and
shape
parameter has hazard function given by
for .
dlnorm
gives the density, plnorm
gives the distribution
function, qlnorm
gives the quantile function, hlnorm
gives the
hazard function, Hlnorm
gives the cumulative hazard function, and
rlnorm
generates random deviates.
Invalid arguments will result in return value NaN
, with a warning.
Performs the log-rank test on survival data, possibly stratified.
logrank(Y, group, data = parent.frame())
logrank(Y, group, data = parent.frame())
Y |
a survival object as returned by the |
group |
defines the groups to be compared. Coerced to a factor. |
data |
a data.frame in which to interpret the variables. |
A list of class logrank
with components
test.statistic |
The logrank (score) test statistic. |
df |
The degrees of freedom of the test statistic. |
p.value |
The p value of the test. |
hazards |
A list of two-column matrices, describing event times and corresponding hazard atoms in each stratum (class 'hazdata'). |
call |
The call |
The test is performed by fitting a Cox regression model and reporting
its score test
. With tied data, this might be slightly different from
the true logrank test, but the difference is unimportant in practice.
Göran Broström
fit <- logrank(Y = Surv(enter, exit, event), group = civ, data = oldmort[oldmort$region == "town", ]) fit
fit <- logrank(Y = Surv(enter, exit, event), group = civ, data = oldmort[oldmort$region == "town", ]) fit
The data consists of yearly rye prices from 1801 to 1894. Logged and detrended, so the time series is supposed to measure short term fluctuations in rye prices.
data(scania)
data(scania)
A data frame with 94 observations in two columns on the following 2 variables.
year
The year the price is recorded.
foodprices
Detrended log rye prices.
The Scanian area in southern Sweden was during the 19th century a mainly rural area.
The Scanian Economic Demographic Database.
Jörberg, L. (1972). A History of Prices in Sweden 1732-1914, CWK Gleerup, Lund.
data(logrye) summary(logrye)
data(logrye) summary(logrye)
This (generic) function prints the LaTeX code of the results of a fit from
coxreg
, phreg
, tpchreg
,
or aftreg
, similar
to what xtable
does for fits from other functions.
ltx( x, caption = NULL, label = NULL, dr = NULL, digits = max(options()$digits - 4, 3), ... )
ltx( x, caption = NULL, label = NULL, dr = NULL, digits = max(options()$digits - 4, 3), ... )
x |
The output from a call to |
caption |
A suitable caption for the table. |
label |
A label used in the LaTeX code. |
dr |
Output from a |
digits |
Number of digits to be printed. |
... |
Not used. |
The result is a printout which is (much) nicer than the standard printed
output from glm
and friends,
LaTeX code version of the results from a run with
coxreg
, phreg
, phreg
,
or aftreg
.
For printing confidence limits, use ltx2
.
Göran Broström.
ltx2
, coxreg
, phreg
,
phreg
, and aftreg
.
data(oldmort) fit <- coxreg(Surv(enter, exit, event) ~ civ + sex, data = oldmort) dr <- drop1(fit, test = "Chisq") ltx(fit, dr = dr, caption = "A test example.", label = "tab:test1")
data(oldmort) fit <- coxreg(Surv(enter, exit, event) ~ civ + sex, data = oldmort) dr <- drop1(fit, test = "Chisq") ltx(fit, dr = dr, caption = "A test example.", label = "tab:test1")
This (generic) function prints the LaTeX code of the results of a fit from
coxreg
, phreg
, tpchreg
,
or aftreg
.
ltx2( x, caption = NULL, label = NULL, dr = NULL, digits = max(options()$digits - 4, 4), conf = 0.95, keep = NULL, ... )
ltx2( x, caption = NULL, label = NULL, dr = NULL, digits = max(options()$digits - 4, 4), conf = 0.95, keep = NULL, ... )
x |
The output from a call to |
caption |
A suitable caption for the table. |
label |
A label used in the LaTeX code. |
dr |
Output from a |
digits |
Number of digits to be printed. |
conf |
Confidence intervals level. |
keep |
Number of covariates to present. |
... |
Not used. |
LaTeX code version of the results from a run with
coxreg
, phreg
, phreg
,
aftreg
.
Resulting tables contain estimated hazard ratios and confidence limits
instead of regression coefficients and standard errors as in ltx
.
Göran Broström.
xtable
, coxreg
, phreg
,
phreg
, aftreg
, and ltx
.
data(oldmort) fit <- coxreg(Surv(enter, exit, event) ~ sex, data = oldmort) ltx2(fit, caption = "A test example.", label = "tab:test1")
data(oldmort) fit <- coxreg(Surv(enter, exit, event) ~ sex, data = oldmort) ltx2(fit, caption = "A test example.", label = "tab:test1")
Given an ordinary data frame suitable for survival analysis, and a data frame with "communal" time series, this function includes the communal covariates as fixed, by the "cutting spells" method.
make.communal( dat, com.dat, communal = TRUE, start, period = 1, lag = 0, surv = c("enter", "exit", "event", "birthdate"), tol = 1e-04, fortran = TRUE )
make.communal( dat, com.dat, communal = TRUE, start, period = 1, lag = 0, surv = c("enter", "exit", "event", "birthdate"), tol = 1e-04, fortran = TRUE )
dat |
A data frame containing interval specified survival data and covariates, of which one must give a "birth date", the connection between duration and calendar time |
com.dat |
Data frame with communal covariates. They must have the same
start year and periodicity, given by |
communal |
Boolean; if TRUE, then it is a true communal (default),
otherwise a fixed. The first component is the first year (start date in
decimal form), and the second component is the period length. The third is
|
start |
Start date in decimal form. |
period |
Period length. Defaults to one. |
lag |
The lag of the effect. Defaults to zero. |
surv |
Character vector of length 4 giving the names of interval start,
interval end, event indicator, birth date, in that order. These names must
correspond to names in |
tol |
Largest length of an interval considered to be of zero length. The cutting sometimes produces zero length intervals, which we want to discard. |
fortran |
If |
The main purpose of this function is to prepare a data file for use with
coxreg
, aftreg
, and
coxph
.
The return value is a data frame with the same variables as in the
combination of dat
and com.dat
. Therefore it is an error to
have common name(s) in the two data frames.
Not very vigorously tested.
Göran Broström
coxreg
, aftreg
,
coxph
, cal.window
dat <- data.frame(enter = 0, exit = 5.731, event = 1, birthdate = 1962.505, x = 2) ## Birth date: July 2, 1962 (approximately). com.dat <- data.frame(price = c(12, 3, -5, 6, -8, -9, 1, 7)) dat.com <- make.communal(dat, com.dat, start = 1962.000)
dat <- data.frame(enter = 0, exit = 5.731, event = 1, birthdate = 1962.505, x = 2) ## Birth date: July 2, 1962 (approximately). com.dat <- data.frame(price = c(12, 3, -5, 6, -8, -9, 1, 7)) dat.com <- make.communal(dat, com.dat, start = 1962.000)
Density, distribution function, quantile function, hazard function,
cumulative hazard function, and random generation for the Gompertz-Makeham
distribution with parameters shape
and scale
.
dmakeham(x, shape = c(1, 1), scale = 1, log = FALSE) pmakeham(q, shape = c(1, 1), scale = 1, lower.tail = TRUE, log.p = FALSE) qmakeham(p, shape = c(1, 1), scale = 1, lower.tail = TRUE, log.p = FALSE) hmakeham(x, shape = c(1, 1), scale = 1, log = FALSE) Hmakeham(x, shape = c(1, 1), scale = 1, log.p = FALSE) rmakeham(n, shape = c(1, 1), scale = 1)
dmakeham(x, shape = c(1, 1), scale = 1, log = FALSE) pmakeham(q, shape = c(1, 1), scale = 1, lower.tail = TRUE, log.p = FALSE) qmakeham(p, shape = c(1, 1), scale = 1, lower.tail = TRUE, log.p = FALSE) hmakeham(x, shape = c(1, 1), scale = 1, log = FALSE) Hmakeham(x, shape = c(1, 1), scale = 1, log.p = FALSE) rmakeham(n, shape = c(1, 1), scale = 1)
shape |
A vector, default value c(1, 1). |
scale |
defaulting to 1. |
lower.tail |
logical; if TRUE (default), probabilities are |
x , q
|
vector of quantiles. |
p |
vector of probabilities. |
n |
number of observations. If |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
The Gompertz-Makeham distribution with shape
parameter and
scale
parameter has hazard function given by
for .
dmakeham
gives the density, pmakeham
gives the distribution
function, qmakeham
gives the quantile function, hmakeham
gives the
hazard function, Hmakeham
gives the cumulative hazard function, and
rmakeham
generates random deviates.
Invalid arguments will result in return value NaN
, with a warning.
Males born in the years 1800-1820 and surving at least 40 years in the parish Skellefteå in northern Sweden are followed from their fortieth birthday until death or the sixtieth birthday, whichever comes first.
data(male.mortality)
data(male.mortality)
A data frame with 2058 observations on the following 6 variables.
id
Personal identification number.
enter
Start of duration. Measured in years since the fortieth birthday.
exit
End of duration. Measured in years since the fortieth birthday.
event
a logical vector indicating death at end of interval.
birthdate
The birthdate in decimal form.
ses
Socio-economic status, a factor with levels
lower
, upper
The interesting explanatory
covariate is ses
(socioeconomic status), which is a
time-varying covariate. This explains why several individuals are
representated by more than one record each. Left trucation and right
censoring are introduced this way.
This data set is also known, and accessible, as mort
.
Data is coming from The Demographic Data Base, Umea University, Umeå, Sweden.
https://www.umu.se/enheten-for-demografi-och-aldrandeforskning/
data(male.mortality) fit <- coxreg(Surv(enter, exit, event) ~ ses, data = male.mortality) summary(fit)
data(male.mortality) fit <- coxreg(Surv(enter, exit, event) ~ ses, data = male.mortality) summary(fit)
Maximum Likelihood estimation of proportional hazards models. Is deprecated,
use coxreg
instead.
mlreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), init = NULL, method = c("ML", "MPPL"), control = list(eps = 1e-08, maxiter = 10, n.points = 12, trace = FALSE), singular.ok = TRUE, model = FALSE, center = TRUE, x = FALSE, y = TRUE, boot = FALSE, geometric = FALSE, rs = NULL, frailty = NULL, max.survs = NULL )
mlreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), init = NULL, method = c("ML", "MPPL"), control = list(eps = 1e-08, maxiter = 10, n.points = 12, trace = FALSE), singular.ok = TRUE, model = FALSE, center = TRUE, x = FALSE, y = TRUE, boot = FALSE, geometric = FALSE, rs = NULL, frailty = NULL, max.survs = NULL )
formula |
a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
data |
a data.frame in which to interpret the variables named in the formula. |
na.action |
a missing-data filter function, applied to the model.frame,
after any subset argument has been used. Default is
|
init |
vector of initial values of the iteration. Default initial value is zero for all variables. |
method |
Method of treating ties, "ML", the default, means pure maximum likelihood, i.e, data are treated as discrete. The choice "MPPL" implies that risk sets with no tied events are treated as in ordinary Cox regression. This is a cameleont that adapts to data, part discrete and part continuous. |
control |
a list with components |
singular.ok |
Not used. |
model |
Not used. |
center |
Should covariates be centered? Default is TRUE |
x |
Return the design matrix in the model object? |
y |
return the response in the model object? |
boot |
No. of bootstrap replicates. Defaults to FALSE, i.e., no bootstrapping. |
geometric |
If |
rs |
Risk set? If present, speeds up calculations considerably. |
frailty |
A grouping variable for frailty analysis. Full name is needed. |
max.survs |
Sampling of risk sets? |
Method ML
performs a true discrete analysis, i.e., one parameter per
observed event time. Method MPPL
is a compromize between the discrete
and continuous time approaches; one parameter per observed event time with
multiple events. With no ties in data, an ordinary Cox regression (as with
coxreg
) is performed.
A list of class c("mlreg", "coxreg", "coxph")
with components
coefficients |
Fitted parameter estimates. |
var |
Covariance matrix of the estimates. |
loglik |
Vector of length two; first component is the value at the initial parameter values, the second componet is the maximized value. |
score |
The score test statistic (at the initial value). |
linear.predictors |
The estimated linear predictors. |
residuals |
The martingale residuals. |
hazard |
The estimated baseline hazard. |
means |
Means of the columns of the design matrix. |
w.means |
Weighted (against exposure time) means of covariates; weighted relative frequencies of levels of factors. |
n |
Number of spells in indata (possibly after removal of cases with NA's). |
events |
Number of events in data. |
terms |
Used by extractor functions. |
assign |
Used by extractor functions. |
wald.test |
The Walt test statistic (at the initial value). |
y |
The Surv vector. |
isF |
Logical vector indicating the covariates that are factors. |
covars |
The covariates. |
ttr |
Total Time at Risk. |
levels |
List of levels of factors. |
formula |
The calling formula. |
call |
The call. |
bootstrap |
The bootstrap sample, if requested on input. |
sigma |
Present if a frailty model is fitted. Equals the estimated frailty standard deviation. |
sigma.sd |
The standard error of the estimated frailty standard deviation. |
method |
The method. |
convergence |
Did the optimization converge? |
fail |
Did the optimization fail? (Is |
The use of rs
is dangerous, see note above. It can
however speed up computing time.
This function starts by creating risksets, if no riskset is supplied
via rs
, with the aid of risksets
. This latter mechanism
fails if there are any NA's in the data! Note also that it depends on
stratification, so rs
contains information about stratification.
Giving another strata variable in the formula is an error. The same is ok,
for instance to supply stratum interactions.
Note futher that mlreg
is deprecated. coxreg
should be
used instead.
Göran Broström
Broström, G. (2002). Cox regression; Ties without tears. Communications in Statistics: Theory and Methods 31, 285–297.
dat <- data.frame(time= c(4, 3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x= c(0, 2,1,1,1,0,0), sex= c(0, 0,0,0,1,1,1)) mlreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model # Same as: rs <- risksets(Surv(dat$time, dat$status), strata = dat$sex) mlreg( Surv(time, status) ~ x, data = dat, rs = rs) #stratified model
dat <- data.frame(time= c(4, 3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x= c(0, 2,1,1,1,0,0), sex= c(0, 0,0,0,1,1,1)) mlreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model # Same as: rs <- risksets(Surv(dat$time, dat$status), strata = dat$sex) mlreg( Surv(time, status) ~ x, data = dat, rs = rs) #stratified model
Males born in the years 1800-1820 and surving at least 40 years in the parish Skellefteå in northern Sweden are followed from their fortieth birthday until death or the sixtieth birthday, whichever comes first.
data(mort)
data(mort)
A data frame with 2058 observations on the following 6 variables.
id
Personal identification number.
enter
Start of duration. Measured in years since the fortieth birthday.
exit
End of duration. Measured in years since the fortieth birthday.
event
a logical vector indicating death at end of interval.
birthdate
The birthdate in decimal form.
ses
Socio-economic status, a factor with levels
lower
, upper
The interesting explanatory
covariate is ses
(socioeconomic status), which is a
time-varying covariate. This explains why several individuals are
representated by more than one record each. Left trucation and right
censoring are introduced this way.
This data set is also known, and accessible, as male.mortality
Data is coming from The Demographic Data Base, Umea University, Umeå, Sweden.
https://www.umu.se/enheten-for-demografi-och-aldrandeforskning/
data(mort) fit <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) summary(fit)
data(mort) fit <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) summary(fit)
Create an oe ("occurrence/exposure") object, used as a response
variable in a model formula specifically in tpchreg
.
oe(count, exposure)
oe(count, exposure)
count |
Number of events, a non-negative integer-valued vector. |
exposure |
exposure time corresponding to count. A positive numeric vector. |
The data consists of old age life histories from 1 January 1860 to 31 december 1880, 21 years. Only (parts of) life histories above age 60 is considered.
data(oldmort)
data(oldmort)
A data frame with 6508 observations from 4603 persons on the following 13 variables.
id
Identification number.
enter
Start age for the interval.
exit
Stop age for the interval.
event
Indicator of death; equals TRUE
if the person died
at the end of the interval, FALSE
otherwise.
birthdate
Birthdate as a real number (i.e., "1765-06-27" is 1765.490).
m.id
Mother's identification number.
f.id
Father's identification number.
sex
Gender, a factor with levels male
female
civ
Civil status, a factor with levels unmarried
married
widow
ses.50
Socio-economic status at age 50, a factor with levels middle
unknown
upper
farmer
lower
birthplace
a factor with levels parish
region
remote
imr.birth
Infant mortality rate at birth in the region of birth
region
Subregion of Sundsvall, a factor with levels town
industry
rural
The Sundsvall area in mid-Sweden was during the 19th century a fast growing forest industry. At the end of the century, it was one of the largest sawmill area in Europe. The town Sundsvall is fast growing part of the region and center for the commerse.
The Demographic Data Base, Umeå University, Sweden.
Edvinsson, S. (2000). The Demographic Data Base at Umeå University: A resource for historical studies. In Hall, McKaa, and Thorvaldsen (eds), "Handbook of International Historical Microdata for Population Research", Minnesota Population Center, Minneapolis.
data(oldmort) summary(oldmort) ## maybe str(oldmort) ; plot(oldmort) ...
data(oldmort) summary(oldmort) ## maybe str(oldmort) ; plot(oldmort) ...
Density, distribution function, quantile function, hazard function, cumulative hazard function, mean, and random generation for the Piecewice Constant Hazards (pch) distribution.
ppch(q, cuts, levels, lower.tail = TRUE, log.p = FALSE) dpch(x, cuts, levels, log = FALSE) hpch(x, cuts, levels, log = FALSE) Hpch(x, cuts, levels, log.p = FALSE) qpch(p, cuts, levels, lower.tail = TRUE, log.p = FALSE) mpch(cuts, levels) rpch(n, cuts, levels)
ppch(q, cuts, levels, lower.tail = TRUE, log.p = FALSE) dpch(x, cuts, levels, log = FALSE) hpch(x, cuts, levels, log = FALSE) Hpch(x, cuts, levels, log.p = FALSE) qpch(p, cuts, levels, lower.tail = TRUE, log.p = FALSE) mpch(cuts, levels) rpch(n, cuts, levels)
cuts |
Vector of cut points defining the intervals where the hazard function is constant. |
levels |
Vector of levels (values of the hazard function). |
lower.tail |
logical; if TRUE (default), probabilities are
|
x , q
|
vector of quantiles. |
p |
vector of probabilities. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
n |
number of observations. If |
The pch distribution has a hazard function that is piecewise constant on intervals defined by cutpoints
If n = 0
, this reduces to an exponential distribution.
dpch
gives the density,
ppch
gives the distribution function,
qpch
gives the quantile function,
hpch
gives the hazard function,
Hpch
gives the cumulative hazard function,
mpch
gives the mean, and
rpch
generates random deviates.
the parameter levels
must have length at least 1, and the
number of cut points must be one less than the number of levels.
Proportional hazards model with piecewise constant baseline hazard(s). Allows for stratification and left truncated and right censored data.
pchreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), cuts = NULL, init, control = list(eps = 1e-08, maxiter = 20, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE )
pchreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), cuts = NULL, init, control = list(eps = 1e-08, maxiter = 20, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE )
formula |
a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
data |
a data.frame in which to interpret the variables named in the formula. |
na.action |
a missing-data filter function, applied to the model.frame,
after any subset argument has been used. Default is
|
cuts |
Specifies the points in time where the hazard function jumps. If omitted, an exponential model is fitted. |
init |
vector of initial values of the iteration. Default initial value is zero for all variables. |
control |
a list with components |
singular.ok |
Not used. |
model |
Not used. |
x |
Return the design matrix in the model object? |
y |
Return the response in the model object? |
A list of class "pchreg"
with components
coefficients |
Fitted parameter estimates. |
cuts |
Cut points ( |
hazards |
The estimated constant levels. |
var |
Covariance matrix of the estimates. |
loglik |
Vector of length two; first component is the value at the initial parameter values, the second component is the maximized value. |
score |
The score test statistic (at the initial value). |
linear.predictors |
The estimated linear predictors. |
means |
Means of the columns of the design matrix, except those columns corresponding to a factor level. Otherwise all zero. |
w.means |
Weighted (against exposure time) means of covariates; weighted relative frequencies of levels of factors. |
n |
Number of spells in indata (possibly after removal of cases with NA's). |
n.events |
Number of events in data. |
terms |
Used by extractor functions. |
assign |
Used by extractor functions. |
wald.test |
The Wald test statistic (at the initial value). |
y |
The Surv vector. |
isF |
Logical vector indicating the covariates that are factors. |
covars |
The covariates. |
ttr |
Total Time at Risk. |
levels |
List of levels of factors. |
formula |
The calling formula. |
call |
The call. |
method |
The method. |
convergence |
Did the optimization converge? |
fail |
Did the optimization fail? (Is |
Göran Broström
## Not run: dat <- age.window(oldmort, c(60, 80)) fit <- pchreg(Surv(enter, exit, event) ~ ses.50 + sex, data = dat, cuts = seq(60, 80, by = 4)) summary(fit) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses.50 + sex, data = dat) check.dist(fit.cr, fit, main = "Cumulative hazards") ## End(Not run)
## Not run: dat <- age.window(oldmort, c(60, 80)) fit <- pchreg(Surv(enter, exit, event) ~ ses.50 + sex, data = dat, cuts = seq(60, 80, by = 4)) summary(fit) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses.50 + sex, data = dat) check.dist(fit.cr, fit, main = "Cumulative hazards") ## End(Not run)
Calculates occurrence / exposure rates for time periods given by
period
and for ages given by age
.
perstat(surv, period, age = c(0, 200))
perstat(surv, period, age = c(0, 200))
surv |
An (extended) |
period |
A vector of dates (in decimal form) |
age |
A vector of length 2; lowest and highest age |
A list with components
events |
No. of events in eavh time period. |
exposure |
Exposure times in each period. |
intensity |
|
Göran Broström
Calculates minus the log likelihood function and its first and second order derivatives for data from a Weibull regression model.
phfunc( beta = NULL, lambda, p, X = NULL, Y, offset = rep(0, length(Y)), ord = 2, pfixed = FALSE, dist = "weibull" )
phfunc( beta = NULL, lambda, p, X = NULL, Y, offset = rep(0, length(Y)), ord = 2, pfixed = FALSE, dist = "weibull" )
beta |
Regression parameters |
lambda |
The scale paramater |
p |
The shape parameter |
X |
The design (covariate) matrix. |
Y |
The response, a survival object. |
offset |
Offset. |
ord |
ord = 0 means only loglihood, 1 means score vector as well, 2 loglihood, score and hessian. |
pfixed |
Logical, if TRUE the shape parameter is regarded as a known constant in the calculations, meaning that it is not cosidered in the partial derivatives. |
dist |
Which distribtion? The default is "weibull", with the alternatives "loglogistic" and "lognormal". |
Note that the function returns log likelihood, score vector and minus hessian, i.e. the observed information. The model is
A list with components
f |
The log likelihood. Present if
|
fp |
The score vector. Present if |
fpp |
The negative of the hessian. Present if |
Göran Broström
Proportional hazards model with parametric baseline hazard(s). Allows for stratification with different scale and shape in each stratum, and left truncated and right censored data.
phreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), dist = "weibull", cuts = NULL, init, shape = 0, param = c("canonical", "rate"), control = list(eps = 1e-08, maxiter = 20, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE )
phreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), dist = "weibull", cuts = NULL, init, shape = 0, param = c("canonical", "rate"), control = list(eps = 1e-08, maxiter = 20, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE )
formula |
a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
data |
a data.frame in which to interpret the variables named in the formula. |
na.action |
a missing-data filter function, applied to the model.frame,
after any subset argument has been used. Default is
|
dist |
Which distribution? Default is "weibull", with the alternatives
"ev" (Extreme value), "gompertz", "pch" (piecewise constant hazards
function), "loglogistic" and "lognormal". A special case like the
|
cuts |
Only used with |
init |
vector of initial values of the iteration. Default initial value is zero for all variables. |
shape |
If positive, the shape parameter is fixed at that value (in each stratum). If zero or negative, the shape parameter is estimated. If more than one stratum is present in data, each stratum gets its own estimate. Only relevant for the Weibull and Extreme Value distributions. |
param |
Applies only to the Gompertz distribution: "canonical" is
defined in the description of the |
control |
a list with components |
singular.ok |
Not used. |
model |
Not used. |
x |
Return the design matrix in the model object? |
y |
Return the response in the model object? |
The parameterization is the same as in coxreg
and
coxph
, but different from the one used by
survreg
(which is not a proportional hazards
modelling function). The model is
where S0 is some standardized survivor function.
A list of class c("phreg", "coxreg")
with components
coefficients |
Fitted parameter estimates. |
cuts |
Cut points for
the "pch" distribution. |
hazards |
The estimated
constant levels in the case of the "pch" distribution. |
var |
Covariance matrix of the estimates. |
loglik |
Vector of length two; first component is the value at the initial parameter values, the second componet is the maximized value. |
score |
The score test statistic (at the initial value). |
linear.predictors |
The estimated linear predictors. |
means |
Means of the columns of the design matrix, except those columns corresponding to a factor level. Otherwise all zero. |
w.means |
Weighted (against exposure time) means of covariates; weighted relative frequencies of levels of factors. |
n |
Number of spells in indata (possibly after removal of cases with NA's). |
n.events |
Number of events in data. |
terms |
Used by extractor functions. |
assign |
Used by extractor functions. |
wald.test |
The Wald test statistic (at the initial value). |
y |
The Surv vector. |
isF |
Logical vector indicating the covariates that are factors. |
covars |
The covariates. |
ttr |
Total Time at Risk. |
levels |
List of levels of factors. |
formula |
The calling formula. |
call |
The call. |
method |
The method. |
convergence |
Did the optimization converge? |
fail |
Did the optimization fail? (Is |
pfixed |
TRUE if shape was fixed in the estimation. |
The lognormal and loglogistic distributions are included on an experimental basis for the moment. Use with care, results may be unreliable!
The gompertz distribution has an exponentially increasing hazard function
under the canonical parametrization. This may cause instability in the
convergence of the fitting algorithm in the case of near-exponential data.
It may be resolved by using param = "rate"
.
The lognormal and loglogistic baseline distributions are extended to a three-parameter family by adding a "proportionality" parameter (multiplying the baseline hazard function). The log of the estimated parameter turns up as '(Intercept)' in the printed output. The reason for this extension is that the standard lognormal and loglogistic distributions are not closed under proportional hazards.
Göran Broström
coxreg
, check.dist
,
link{aftreg}
.
data(mort) fit <- phreg(Surv(enter, exit, event) ~ ses, data = mort) fit plot(fit) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) check.dist(fit.cr, fit)
data(mort) fit <- phreg(Surv(enter, exit, event) ~ ses, data = mort) fit plot(fit) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) check.dist(fit.cr, fit)
This function is called by phreg
, but it can also be directly
called by a user.
phreg.fit(X, Y, dist, strata, offset, init, shape, control)
phreg.fit(X, Y, dist, strata, offset, init, shape, control)
X |
The design (covariate) matrix. |
Y |
A survival object, the response. |
dist |
Which baseline distribution? |
strata |
A stratum variable. |
offset |
Offset. |
init |
Initial regression parameter values. |
shape |
If positive, a fixed value of the shape parameter in the distribution. Otherwise, the shape is estimated. |
control |
Controls convergence and output. |
See phreg
for more detail.
coefficients |
Estimated regression coefficients plus estimated scale and shape coefficients, sorted by strata, if present. |
var |
Variance-covariance matrix |
loglik |
Vector of length 2. The first component is the maximized loglihood with only scale and shape in the model, the second the final maximum. |
score |
Score test statistic at initial values |
linear.predictors |
Linear predictors for each interval. |
means |
Means of the covariates |
conver |
TRUE if convergence |
fail |
TRUE if failure |
iter |
Number of Newton-Raphson iterates. |
n.strata |
The number of strata in the data. |
Göran Broström
Calculate piecewise hazards, no. of events, and exposure times in each interval indicated by cutpoints.
piecewise(enter, exit, event, cutpoints)
piecewise(enter, exit, event, cutpoints)
enter |
Left interval endpoint |
exit |
Right interval endpoint |
event |
Indicator of event |
cutpoints |
Vector of cutpoints |
Exact calculation.
A list with components
events |
Vector of number of events |
exposure |
Vector of total exposure time |
intensity |
Vector of hazards, |
Göran Broström
Just a simple plot of the hazard (cumulative hazard, density, survival) functions for each stratum.
## S3 method for class 'aftreg' plot( x, fn = c("haz", "cum", "den", "sur"), main = NULL, xlim = NULL, ylim = NULL, xlab = "Duration", ylab = "", col, lty, printLegend = TRUE, new.data = x$means, ... )
## S3 method for class 'aftreg' plot( x, fn = c("haz", "cum", "den", "sur"), main = NULL, xlim = NULL, ylim = NULL, xlab = "Duration", ylab = "", col, lty, printLegend = TRUE, new.data = x$means, ... )
x |
A |
fn |
Which functions shoud be plotted! Default is all. They will scroll
by, so you have to take care of explicitly what you want to be produced.
See, eg, |
main |
Header for the plot |
xlim |
x limits |
ylim |
y limits |
xlab |
x label |
ylab |
y label |
col |
Colors? |
lty |
Line types? |
printLegend |
Should legend be printed? Default is yes. |
new.data |
At which covariate values? |
... |
Extra parameters passed to 'plot' |
The plot is drawn at the mean values of the covariates, by default.
No return value.
Göran Broström
y <- rllogis(40, shape = 1, scale = 1) x <- rep(c(1,1,2,2), 10) fit <- aftreg(Surv(y, rep(1, 40)) ~ x, dist = "loglogistic") plot(fit)
y <- rllogis(40, shape = 1, scale = 1) x <- rep(c(1,1,2,2), 10) fit <- aftreg(Surv(y, rep(1, 40)) ~ x, dist = "loglogistic") plot(fit)
coxreg
objectsA plot of a baseline function of a coxreg
fit is produced, one curve
for each stratum. A wrapper for plot.survfit
.
## S3 method for class 'coxreg' plot( x, fn = c("cum", "surv", "log", "loglog"), conf.int = FALSE, fig = TRUE, xlim = NULL, ylim = NULL, main = NULL, xlab = "Duration", ylab = "", col = 1, lty = 1, printLegend = TRUE, ... )
## S3 method for class 'coxreg' plot( x, fn = c("cum", "surv", "log", "loglog"), conf.int = FALSE, fig = TRUE, xlim = NULL, ylim = NULL, main = NULL, xlab = "Duration", ylab = "", col = 1, lty = 1, printLegend = TRUE, ... )
x |
A |
fn |
What should be plotted? Default is "cumhaz", and the other choices are "surv", "log", and "loglog". |
conf.int |
logical or a value like 0.95 (default for one curve). |
fig |
logical. If |
xlim |
Start and end of the x axis. |
ylim |
Start and end of the y axis. |
main |
A headline for the plot |
xlab |
Label on the x axis. |
ylab |
Label on the y axis. |
col |
Color of the curves. Defaults to 'black'. |
lty |
Line type(s). |
printLegend |
Either a logical or a text string; if |
... |
Other parameters to pass to the plot. |
An object of class hazdata
containing the coordinates of the
curve(s).
Baseline hazards estimates.
## S3 method for class 'hazdata' plot( x, strata = NULL, fn = c("cum", "surv", "log", "loglog"), fig = TRUE, xlim = NULL, ylim = NULL, main = NULL, xlab = "", ylab = "", col = "black", lty = 1, printLegend = TRUE, ... )
## S3 method for class 'hazdata' plot( x, strata = NULL, fn = c("cum", "surv", "log", "loglog"), fig = TRUE, xlim = NULL, ylim = NULL, main = NULL, xlab = "", ylab = "", col = "black", lty = 1, printLegend = TRUE, ... )
x |
A |
strata |
Stratum names if there are strata present. |
fn |
Which type of plot? Allowed values are "cum" (or "cumhaz"), "surv" (or "sur"), "log", or "loglog". The last two plots the cumulative hazards on a log (y) scale or a log-log (xy) scale, respectively. |
fig |
Should a plot actually be produced? Default is TRUE. |
xlim |
Horizontal plot limits. If NULL, calculated by the function. |
ylim |
Vertical plot limits. If NULL, set to |
main |
A heading for the plot. |
xlab |
Label on the x axis. |
ylab |
Label on the y-axis. |
col |
Color of the lines. May be a vector of length equal to No. of strata. |
lty |
Line type(s). May be a vector of length equal to No. of strata. |
printLegend |
Logical or character; should a legend be produced?
Defaults to TRUE. If character, it should be one of |
... |
Anything that |
It is also possible to have as first argument an object of type "coxreg", given that it contains a component of type "hazdata".
A list where the elements are two-column matrices, one for each stratum in the model. The first column contains risktimes, and the second the y coordinates for the requested curve(s).
x
is a list where each element is a two-column matrix. The first
column contains failure times, and the second column contains the
corresponding 'hazard atoms'.
Göran Broström
time0 <- numeric(50) group <- c(rep(0, 25), rep(1, 25)) x <- runif(50, -0.5, 0.5) time1 <- rexp( 50, exp(group) ) event <- rep(1, 50) fit <- coxreg(Surv(time0, time1, event) ~ x + strata(group), method = "ml") plot(fit$hazards, col = 1:2, fn = "surv", xlab = "Duration") ## Same result as: ## plot(fit, col = 1:2, fn = "sur", xlab = "Duration")
time0 <- numeric(50) group <- c(rep(0, 25), rep(1, 25)) x <- runif(50, -0.5, 0.5) time1 <- rexp( 50, exp(group) ) event <- rep(1, 50) fit <- coxreg(Surv(time0, time1, event) ~ x + strata(group), method = "ml") plot(fit$hazards, col = 1:2, fn = "surv", xlab = "Duration") ## Same result as: ## plot(fit, col = 1:2, fn = "sur", xlab = "Duration")
Baseline hazards estimates.
## S3 method for class 'logrank' plot( x, fn = c("cum", "surv", "log", "loglog"), xlim = NULL, ylim = NULL, main = NULL, xlab = "", ylab = "", col = "black", lty = 1, printLegend = TRUE, ... )
## S3 method for class 'logrank' plot( x, fn = c("cum", "surv", "log", "loglog"), xlim = NULL, ylim = NULL, main = NULL, xlab = "", ylab = "", col = "black", lty = 1, printLegend = TRUE, ... )
x |
A |
fn |
Which type of plot? Allowed values are "cum" (or "cumhaz"), "surv" (or "sur"), "log", or "loglog". The last two plots the cumulative hazards on a log (y) scale or a log-log (xy) scale, respectively. |
xlim |
Horizontal plot limits. If NULL, calculated by the function. |
ylim |
Vertical plot limits. If NULL, set to |
main |
A heading for the plot. |
xlab |
Label on the x axis. |
ylab |
Label on the y-axis. |
col |
Color of the lines. May be a vector of length equal to No. of strata. |
lty |
Line type(s). May be a vector of length equal to No. of strata. |
printLegend |
Logical or character; should a legend be produced?
Defaults to TRUE. If character, it should be one of |
... |
Anything that |
It is also possible to have as first argument an object of type "coxreg", given that it contains a component of type "hazdata".
A list where the elements are two-column matrices, one for each stratum in the model. The first column contains risktimes, and the second the y coordinates for the requested curve(s).
x
is a list where each element is a two-column matrix. The first
column contains failure times, and the second column contains the
corresponding 'hazard atoms'.
Göran Broström
fit <- logrank(Surv(enter, exit, event), group = civ, data = oldmort[oldmort$region == "town", ]) plot(fit)
fit <- logrank(Surv(enter, exit, event), group = civ, data = oldmort[oldmort$region == "town", ]) plot(fit)
Plot(s) of the hazard, density, cumulative hazards, and/or the survivor function(s) for each stratum.
## S3 method for class 'phreg' plot( x, fn = c("haz", "cum", "den", "sur"), main = NULL, xlim = NULL, ylim = NULL, xlab = "Duration", ylab = "", col, lty, printLegend = TRUE, score = 1, fig = TRUE, ... )
## S3 method for class 'phreg' plot( x, fn = c("haz", "cum", "den", "sur"), main = NULL, xlim = NULL, ylim = NULL, xlab = "Duration", ylab = "", col, lty, printLegend = TRUE, score = 1, fig = TRUE, ... )
x |
A |
fn |
Which function should be plotted? Default is the hazard function(s). |
main |
Header for the plot |
xlim |
x limits |
ylim |
y limits |
xlab |
x label |
ylab |
y label |
col |
Color(s) for the curves. Defaults to black. |
lty |
Line type for the curve(s). Defaults to 1:(No. of strata). |
printLegend |
Logical, or character ("topleft", "bottomleft",
"topright" or "bottomright"); if |
score |
Multiplication factor for the hazard function. |
fig |
logical, should the graph be drawn? If FALSE, data is returned. |
... |
Extra parameters passed to 'plot' and 'lines'. |
No return value if fig = TRUE, otherwise the cumulative
hazards function (coordinates), given fn = "cum"
.
Reference hazard is given by the fit; zero for all covariates, and the reference category for factors.
Göran Broström
y <- rllogis(40, shape = 1, scale = 1) x <- rep(c(1,1,2,2), 10) fit <- phreg(Surv(y, rep(1, 40)) ~ x, dist = "loglogistic") plot(fit)
y <- rllogis(40, shape = 1, scale = 1) x <- rep(c(1,1,2,2), 10) fit <- phreg(Surv(y, rep(1, 40)) ~ x, dist = "loglogistic") plot(fit)
Plot(s) of the hazard, cumulative hazards, and/or the survivor function(s) for each stratum.
## S3 method for class 'tpchreg' plot( x, fn = c("haz", "cum", "sur"), log = "", main = NULL, xlim = NULL, ylim = NULL, xlab = "Duration", ylab = "", col, lty, printLegend = TRUE, ... )
## S3 method for class 'tpchreg' plot( x, fn = c("haz", "cum", "sur"), log = "", main = NULL, xlim = NULL, ylim = NULL, xlab = "Duration", ylab = "", col, lty, printLegend = TRUE, ... )
x |
A |
fn |
Which functions should be plotted? Default is the hazard function. |
log |
character, "" (default), "y", or "xy". |
main |
Header for the plot |
xlim |
x limits |
ylim |
y limits |
xlab |
x label |
ylab |
y label |
col |
Color(s) for the curves. Defaults to black. |
lty |
Line type for the curve(s). Defaults to 1:(No. of strata). |
printLegend |
Logical, or character ("topleft", "bottomleft",
"topright" or "bottomright"); if |
... |
Extra parameters passed to 'plot' and 'lines'. |
No return value.
Göran Broström
Plot(s) of the hazard, density, cumulative hazards, and/or the survivor function(s) for each stratum.
## S3 method for class 'weibreg' plot( x, fn = c("haz", "cum", "den", "sur"), main = NULL, xlim = NULL, ylim = NULL, xlab = NULL, ylab = NULL, new.data = x$means, ... )
## S3 method for class 'weibreg' plot( x, fn = c("haz", "cum", "den", "sur"), main = NULL, xlim = NULL, ylim = NULL, xlab = NULL, ylab = NULL, new.data = x$means, ... )
x |
A |
fn |
Which functions shoud be plotted! Default is all. They will scroll
by, so you have to take care explicitely what you want to be produced. See,
eg, |
main |
Header for the plot |
xlim |
x limits |
ylim |
y limits |
xlab |
x label |
ylab |
y label |
new.data |
At which covariate values? |
... |
Extra parameters passed to 'plot' |
The plot is drawn at the mean values of the covariates.
No return value
Göran Broström
y <- rweibull(4, shape = 1, scale = 1) x <- c(1,1,2,2) fit <- weibreg(Surv(y, c(1,1,1,1)) ~ x) plot(fit)
y <- rweibull(4, shape = 1, scale = 1) x <- c(1,1,2,2) fit <- weibreg(Surv(y, c(1,1,1,1)) ~ x) plot(fit)
Comparison of the cumulative hazards functions for a semi-parametric and parametric models.
plotHaz( sp, pp, interval, main = NULL, xlab = "Time", ylab = "Cum. hazards", leglab, col = c("blue", "red"), lty = 1:2, ylim, log = "", printLegend = TRUE )
plotHaz( sp, pp, interval, main = NULL, xlab = "Time", ylab = "Cum. hazards", leglab, col = c("blue", "red"), lty = 1:2, ylim, log = "", printLegend = TRUE )
sp |
An object of type "coxreg" or "phreg", typically output from
|
pp |
An object of type "coxreg" or "phreg", typically output from
|
interval |
Time interval for the plot, if missing, calculated from |
main |
Header for the plot. Default is distribution and "cumulative hazard function" |
xlab |
Label on x axis (default "Time") |
ylab |
Label on y axis (default "Cum. Hazards") |
leglab |
Labels in legend. |
col |
Line colors. should be |
lty |
line types. |
ylim |
Y limits for the plot. |
log |
Argument sent to |
printLegend |
Should a legend be printed? Default is |
For the moment only a graphical comparison. The arguments sp
and
pp
may be swapped.
No return value.
Göran Broström
check.dist
, coxreg
and phreg
.
data(mort) op <- par(mfrow = c(1, 2)) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) fit.w <- phreg(Surv(enter, exit, event) ~ ses, data = mort) fit.g <- phreg(Surv(enter, exit, event) ~ ses, data = mort, dist = "gompertz") plotHaz(fit.cr, fit.w, interval = c(0, 20), main = "Weibull") plotHaz(fit.cr, fit.g, main = "Gompertz") par(op)
data(mort) op <- par(mfrow = c(1, 2)) fit.cr <- coxreg(Surv(enter, exit, event) ~ ses, data = mort) fit.w <- phreg(Surv(enter, exit, event) ~ ses, data = mort) fit.g <- phreg(Surv(enter, exit, event) ~ ses, data = mort, dist = "gompertz") plotHaz(fit.cr, fit.w, interval = c(0, 20), main = "Weibull") plotHaz(fit.cr, fit.g, main = "Gompertz") par(op)
The hazard, the cumulative hazard, the density, and the survivor baseline functions are plotted.
## S3 method for class 'aftreg' print(x, digits = max(options()$digits - 4, 3), ...)
## S3 method for class 'aftreg' print(x, digits = max(options()$digits - 4, 3), ...)
x |
A |
digits |
Precision in printing |
... |
Not used. |
No value is returned.
Doesn't work for threeway or higher order interactions. Use
print.coxph
in that case.
Göran Broström
More "pretty-printing" than print.coxph
, which is a fall-back for
'difficult' objects.
## S3 method for class 'coxreg' print(x, digits = max(options()$digits - 4, 3), ...)
## S3 method for class 'coxreg' print(x, digits = max(options()$digits - 4, 3), ...)
x |
A |
digits |
Output format. |
... |
Other arguments. |
Doesn't work with three-way and higher interactions, in which case
print.coxph
is used.
No value is returned.
Göran Broström
The result of logrank
is printed
## S3 method for class 'logrank' print(x, digits = max(options()$digits - 4, 6), ...)
## S3 method for class 'logrank' print(x, digits = max(options()$digits - 4, 6), ...)
x |
A |
digits |
Precision in printing |
... |
Not used. |
The input is returned invisibly.
Göran Broström
The hazard, the cumulative hazard, the density, and the survivor baseline functions are plotted.
## S3 method for class 'phreg' print(x, digits = max(options()$digits - 4, 3), ...)
## S3 method for class 'phreg' print(x, digits = max(options()$digits - 4, 3), ...)
x |
A |
digits |
Precision in printing |
... |
Not used. |
No value is returned.
Doesn't work for threeway or higher order interactions. Use
print.coxph
in that case.
Göran Broström
Given the output from risksets
, summary statistics are given for it.
## S3 method for class 'risksets' print(x, ...)
## S3 method for class 'risksets' print(x, ...)
x |
An object of class 'risksets'. |
... |
Not used for the moment. |
No value is returned; the function prints summary statistics of risk sets.
There is no summary.risksets
yet. On the TODO list.
Göran Broström
risksets
rs <- with(mort, risksets(Surv(enter, exit, event))) print(rs)
rs <- with(mort, risksets(Surv(enter, exit, event))) print(rs)
Prints summary.aftreg objects
## S3 method for class 'summary.aftreg' print(x, digits = max(getOption("digits") - 3, 3), short = FALSE, ...)
## S3 method for class 'summary.aftreg' print(x, digits = max(getOption("digits") - 3, 3), short = FALSE, ...)
x |
A |
digits |
Output format. |
short |
Logical: If TRUE, short output, only regression. |
... |
Other arguments. |
No value is returned.
Göran Broström
Prints summary.coxreg objects
## S3 method for class 'summary.coxreg' print(x, digits = 3, short = FALSE, ...)
## S3 method for class 'summary.coxreg' print(x, digits = 3, short = FALSE, ...)
x |
A |
digits |
Output format. |
short |
Logical, short or long (default) output? |
... |
Other arguments. |
No value is returned.
Göran Broström
Prints summary.phreg objects
## S3 method for class 'summary.phreg' print(x, digits = max(getOption("digits") - 3, 3), ...)
## S3 method for class 'summary.phreg' print(x, digits = max(getOption("digits") - 3, 3), ...)
x |
A |
digits |
Output format. |
... |
Other arguments. |
No value is returned.
Göran Broström
Prints summary.tpchreg objects
## S3 method for class 'summary.tpchreg' print(x, digits = max(getOption("digits") - 3, 3), ...)
## S3 method for class 'summary.tpchreg' print(x, digits = max(getOption("digits") - 3, 3), ...)
x |
A |
digits |
Output format. |
... |
Other arguments. |
No value is returned.
Göran Broström
More "pretty-printing"
## S3 method for class 'tpchreg' print(x, digits = max(options()$digits - 4, 3), ...)
## S3 method for class 'tpchreg' print(x, digits = max(options()$digits - 4, 3), ...)
x |
A |
digits |
Output format. |
... |
Other arguments. |
Doesn't work with three-way or higher interactions.
No value is returned.
Göran Broström
The hazard, the cumulative hazard, the density, and the survivor baseline functions are plotted.
## S3 method for class 'weibreg' print(x, digits = max(options()$digits - 4, 3), ...)
## S3 method for class 'weibreg' print(x, digits = max(options()$digits - 4, 3), ...)
x |
A |
digits |
Precision in printing |
... |
Not used. |
No value is returned.
Doesn't work for threeway or higher order interactions. Use
print.coxph
in that case.
Göran Broström
Retrieves regression tables
regtable(x, digits = 3, short = TRUE, ...)
regtable(x, digits = 3, short = TRUE, ...)
x |
A |
digits |
Output format. |
short |
If TRUE, return only coefficients table. |
... |
Other arguments. |
A character data frame, ready to print in various formats.
Should not be used if interactions present.
Göran Broström
Focus is on the risk set composition just prior to a failure.
risksets( x, strata = NULL, max.survs = NULL, members = TRUE, collate_sets = FALSE )
risksets( x, strata = NULL, max.survs = NULL, members = TRUE, collate_sets = FALSE )
x |
A |
strata |
Stratum indicator. |
max.survs |
Maximum number of survivors in each risk set. If smaller than the 'natural number', survivors are sampled from the present ones. No sampling if missing. |
members |
If TRUE, all members of all risk sets are listed in the resulting list, see below. |
collate_sets |
logical. If TRUE, group information by
risk sets in a list. Only if |
If the input argument max.survs is left alone, all survivors are accounted for in all risk sets.
A list with components (if collate_sets = FALSE
)
antrs |
No. of risk sets in each
stratum. The number of strata is given by |
risktimes |
Ordered distinct failure time points. |
eventset |
If 'members' is TRUE, a vector of pointers to events in each risk set, else NULL. |
riskset |
If 'members' is TRUE, a vector of pointers to the members of the risk sets, in order. The 'n.events' first are the events. If 'members' is FALSE, 'riskset' is NULL. |
size |
The sizes of the risk sets. |
n.events |
The number of events in each risk set. |
sample_fraction |
If 'members' is TRUE, the sampling fraction of survivors in each risk set. |
Can be used to "sample the risk sets".
Göran Broström
enter <- c(0, 1, 0, 0) exit <- c(1, 2, 3, 4) event <- c(1, 1, 1, 0) risksets(Surv(enter, exit, event))
enter <- c(0, 1, 0, 0) exit <- c(1, 2, 3, 4) event <- c(1, 1, 1, 0) risksets(Surv(enter, exit, event))
The data consists of old age life histories from 1 January 1813 to 31 december 1894. Only (parts of) life histories above age 50 is considered.
data(scania)
data(scania)
A data frame with 1931 observations from 1931 persons on the following 9 variables.
id
Identification number (enumeration).
enter
Start age for the interval.
exit
Stop age for the interval.
event
Indicator of death;
equals TRUE
if the person died
at the end of the interval, FALSE
otherwise.
birthdate
Birthdate as a real number (i.e., "1765-06-27" is 1765.490).
sex
Gender, a factor with levels male
female
.
parish
One of five parishes in Scania, coded 1, 2, 3, 4, 5. Factor.
ses
Socio-economic status at age 50, a factor with
levels upper
and lower
.
immigrant
a factor with levels no
region
and yes
.
The Scanian area in southern Sweden was during the 19th century a mainly rural area.
The Scanian Economic Demographic Database, Lund University, Sweden.
data(scania) summary(scania)
data(scania) summary(scania)
Prints aftreg objects
## S3 method for class 'aftreg' summary(object, ...)
## S3 method for class 'aftreg' summary(object, ...)
object |
A |
... |
Additional ... |
Göran Broström
## The function is currently defined as function (object, ...) print(object)
## The function is currently defined as function (object, ...) print(object)
A summary of coxreg objects.
## S3 method for class 'coxreg' summary(object, ...)
## S3 method for class 'coxreg' summary(object, ...)
object |
A |
... |
Additional ... |
Göran Broström
fit <- coxreg(Surv(enter, exit, event) ~ sex + civ, data = oldmort) summary(fit)
fit <- coxreg(Surv(enter, exit, event) ~ sex + civ, data = oldmort) summary(fit)
A summary of phreg objects.
## S3 method for class 'phreg' summary(object, ...)
## S3 method for class 'phreg' summary(object, ...)
object |
A |
... |
Additional ... |
Göran Broström
fit <- phreg(Surv(enter, exit, event) ~ sex + civ, data = oldmort[oldmort$region == "town", ]) summary(fit)
fit <- phreg(Surv(enter, exit, event) ~ sex + civ, data = oldmort[oldmort$region == "town", ]) summary(fit)
Summary for tpchreg objects
## S3 method for class 'tpchreg' summary(object, ci = FALSE, level = 0.95, ...)
## S3 method for class 'tpchreg' summary(object, ci = FALSE, level = 0.95, ...)
object |
A |
ci |
Logical. If TRUE, confidence limits are given instead of se's. |
level |
Confidence level, used if ci. |
... |
Additional ... |
Göran Broström
## The function is currently defined as ## function (object, ...)
## The function is currently defined as ## function (object, ...)
This is the same as print.weibreg
## S3 method for class 'weibreg' summary(object, ...)
## S3 method for class 'weibreg' summary(object, ...)
object |
A |
... |
Additional ... |
Göran Broström
## The function is currently defined as function (object, ...) print(object)
## The function is currently defined as function (object, ...) print(object)
Given a survival object, (a matrix with two or three columns) and a set of specified cut times, split each record into multiple subrecords at each cut time. The new survival object will be in ‘counting process’ format, with an enter time, exit time, and event status for each record.
SurvSplit(Y, cuts)
SurvSplit(Y, cuts)
Y |
A survival object, a matrix with two or three columns. |
cuts |
The cut points, must be strictly positive and distinct. |
A list with components
Y |
The new survival object with three columns, i.e., in 'counting process' form. |
ivl |
Interval No., starting from leftmost, (0, cuts[1]) or similar. |
idx |
Row number for original Y row. |
This function is used in phreg
for the piecewise
constant hazards model. It uses age.window
for each interval.
Göran Broström
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function(Y, cuts){ if (NCOL(Y) == 2) Y <- cbind(rep(0, NROW(Y)), Y) indat <- cbind(Y, 1:NROW(Y), rep(-1, NROW(Y))) colnames(indat) <- c("enter", "exit", "event", "idx", "ivl") n <- length(cuts) cuts <- sort(cuts) if ((cuts[1] <= 0) || (cuts[n] == Inf)) stop("'cuts' must be positive and finite.") cuts <- c(0, cuts, Inf) n <- n + 1 out <- list() indat <- as.data.frame(indat) for (i in 1:n){ out[[i]] <- age.window(indat, cuts[i:(i+1)]) out[[i]]$ivl <- i out[[i]] <- t(out[[i]]) } Y <- matrix(unlist(out), ncol = 5, byrow = TRUE) colnames(Y) <- colnames(indat) list(Y = Y[, 1:3], ivl = Y[, 5], idx = Y[, 4] ) }
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function(Y, cuts){ if (NCOL(Y) == 2) Y <- cbind(rep(0, NROW(Y)), Y) indat <- cbind(Y, 1:NROW(Y), rep(-1, NROW(Y))) colnames(indat) <- c("enter", "exit", "event", "idx", "ivl") n <- length(cuts) cuts <- sort(cuts) if ((cuts[1] <= 0) || (cuts[n] == Inf)) stop("'cuts' must be positive and finite.") cuts <- c(0, cuts, Inf) n <- n + 1 out <- list() indat <- as.data.frame(indat) for (i in 1:n){ out[[i]] <- age.window(indat, cuts[i:(i+1)]) out[[i]]$ivl <- i out[[i]] <- t(out[[i]]) } Y <- matrix(unlist(out), ncol = 5, byrow = TRUE) colnames(Y) <- colnames(indat) list(Y = Y[, 1:3], ivl = Y[, 5], idx = Y[, 4] ) }
A data frame containing data on the number of deaths by sex, age and year, Sweden 1969-2020.
data(swedeaths)
data(swedeaths)
A data frame with 5 variables and 10504 observations.
age
Numerical with integer values 0-100, representing achieved age in years during the actual calendar year. The highest value, 100, represents ages 100 and above.
sex
A factor with two levels, "women" and "men".
year
Calendar year.
deaths
Number of deaths by age, sex, and year.
id
Created by the reshape
procedure, see Details.
Data are downloaded from Statistics Sweden in the form of a csv file and and in that process converted to a data frame. Variable names are translated from Swedish, and some of them are coverted to factors. Each numeric column contains the number of deaths by sex and age. The original data set is in wide form and then converted to long format.
Statistics Sweden, https://www.scb.se.
summary(swedeaths) ## maybe str(swedeaths) ...
summary(swedeaths) ## maybe str(swedeaths) ...
A data frame containing data on the population size by sex, age and year, Sweden 1969-2020.
data(swepop)
data(swepop)
A data frame with 5 variables and 10504 observations.
age
Numerical with integer values 0-100, representing achieved age in years during the actual calendar year.The highest value, 100, represents ages 100 and above.
sex
A factor with two levels, "women" and "men".
year
Calendar year.
pop
Average population by age, sex, and year.
Created by the reshape
procedure, see Details.
Data are downloaded from Statistics Sweden in the form of a csv file and
converted to a data frame. Variable names are translated from Swedish,
and some of them are coverted to factors. The variable pop
contains
the average population by sex and age, calculated by taking the mean
value of the population size at December 31 the previous year and
December 31 the current year. The original data contain the sizes at the
end of each year. The original data set is in wide format and converted to
long format by reshape
.
Statistics Sweden, https://www.scb.se.
summary(swepop) ## maybe str(swepop) ...
summary(swepop) ## maybe str(swepop) ...
From input data of the 'interval' type, with an event indicator, summary statistics for each risk set (at an event time point) are calculated.
table.events(enter = rep(0, length(exit)), exit, event, strict = TRUE)
table.events(enter = rep(0, length(exit)), exit, event, strict = TRUE)
enter |
Left truncation time point. |
exit |
End time point, an event or a right censoring. |
event |
Event indicator. |
strict |
If TRUE, then tabulating is not done after a time point where all individuals in a riskset failed. |
A list with components
times |
Ordered distinct event time points. |
events |
Number of events at each event time point. |
riskset.sizes |
Number at risk at each event time point. |
Göran Broström
exit = c(1,2,3,4,5) event = c(1,1,0,1,1) table.events(exit = exit, event = event)
exit = c(1,2,3,4,5) event = c(1,1,0,1,1) table.events(exit = exit, event = event)
The result of the transformation can be used to do survival analysis via
logistic regression. If the cloglog
link is used, this corresponds to
a discrete time analogue to Cox's proportional hazards model.
toBinary( dat, surv = c("enter", "exit", "event"), strats, max.survs = NROW(dat) )
toBinary( dat, surv = c("enter", "exit", "event"), strats, max.survs = NROW(dat) )
dat |
A data frame with three variables representing the survival
response. The default is that they are named |
surv |
A character vector with the names of the three variables representing survival. |
strats |
An eventual stratification variable. |
max.survs |
Maximal number of survivors per risk set. If set to a (small) number, survivors are sampled from the risk sets. |
toBinary calls risksets
in the eha
package.
Returns a data frame expanded risk set by risk set. The three
"survival variables" are replaced by a variable named event
(which
overwrites an eventual variable by that name in the input). Two more
variables are created, riskset
and orig.row
.
event |
Indicates an event in the corresponding risk set. |
riskset |
Factor (with levels 1, 2, ...) indicating risk set. |
risktime |
The 'risktime' (age) in the corresponding riskset. |
orig.row |
The row number for this item in the original data frame. |
The survival variables must be three. If you only have exit and event, create a third containing all zeros.
Göran Broström
enter <- rep(0, 4) exit <- 1:4 event <- rep(1, 4) z <- rep(c(-1, 1), 2) dat <- data.frame(enter, exit, event, z) binDat <- toBinary(dat) dat binDat coxreg(Surv(enter, exit, event) ~ z, method = "ml", data = dat) ## Same as: summary(glm(event ~ z + riskset, data = binDat, family = binomial(link = cloglog)))
enter <- rep(0, 4) exit <- 1:4 event <- rep(1, 4) z <- rep(c(-1, 1), 2) dat <- data.frame(enter, exit, event, z) binDat <- toBinary(dat) dat binDat coxreg(Surv(enter, exit, event) ~ z, method = "ml", data = dat) ## Same as: summary(glm(event ~ z + riskset, data = binDat, family = binomial(link = cloglog)))
This function uses as.Date
and a simple linear transformation.
toDate(times)
toDate(times)
times |
a vector of durations |
A vector of dates as character strings of the type "1897-05-21".
Göran Broström
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. toDate(1897.357)
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. toDate(1897.357)
Given a vector of dates, the output is a vector of durations in years since "0000-01-01".
toTime(dates)
toTime(dates)
dates |
A vector of dates in character form or of class |
A vector of durations, as decribed above.
Göran Broström
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. toTime(c("1897-05-16", "1901-11-21"))
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. toTime(c("1897-05-16", "1901-11-21"))
Transform a "survival data frame" to tabular form aggregating number of events and exposure time by time intervals and covariates.
toTpch(formula, data, cuts, enter = "enter", exit = "exit", event = "event", episode = "age")
toTpch(formula, data, cuts, enter = "enter", exit = "exit", event = "event", episode = "age")
formula |
A model formula. |
data |
A data frame with survival data. |
cuts |
An ordered, non-negative vector of time points at which a hazard function changes value. Note that data are left truncated at cuts[1] (the smallest value) and right censored at c[n], where n is the length of cuts and cuts[n] == max(cuts). |
enter |
Character string with the name of the variable representing left truncation values. |
exit |
Character string with the name of the event/censoring time variable. |
event |
Character string with the name of the event indicator variable. |
episode |
Character string with the name of the output variable of the grouped time (a factor variable). |
If cuts
is missing, nothing is done. Internally, this function first calls
survival::survSplit
and then stats::aggregate
.
A data frame with exposure time and number of events aggregated by time intervals and covariates. If all covariates are factors,this usually results in a huge reduction of the size of thedata frame, but otherwise the size of the output may be larger than the size of the input data frame
Episodes, or parts of episodes, outside min(cuts), max(cuts)
are cut off.
With continuous covariates, consider rounding them so that the number of distinct oberved values is not too large.
Göran Broström
Proportional hazards regression with piecewise constant hazards and tabular data.
tpchreg(formula, data, time, weights, last, subset, na.action, contrasts = NULL, start.coef = NULL, control = list(epsilon = 1.e-8, maxit = 200, trace = FALSE))
tpchreg(formula, data, time, weights, last, subset, na.action, contrasts = NULL, start.coef = NULL, control = list(epsilon = 1.e-8, maxit = 200, trace = FALSE))
formula |
a formula with 'oe(count, exposure) ~ x1 + ...' |
data |
a data frame with occurrence/exposure data plus covariates. |
time |
the time variable, a factor character vector indicating time intervals, or numeric, indicating the start of intervals. |
weights |
Case weights. |
last |
If |
subset |
subset of data, not implemented yet. |
na.action |
Not implemented yet. |
contrasts |
Not implemented yet. |
start.coef |
For the moment equal to zero, not used. |
control |
list of control parameters for the optimization. |
The interpretation of cuts is different from that in hpch
.
This is intentional.
oe
.
sw <- swepop sw$deaths <- swedeaths$deaths fit <- tpchreg(oe(deaths, pop) ~ strata(sex) + I(year - 2000), time = age, last = 101, data = sw[sw$year >= 2000, ]) summary(fit)
sw <- swepop sw$deaths <- swedeaths$deaths fit <- tpchreg(oe(deaths, pop) ~ strata(sex) + I(year - 2000), time = age, last = 101, data = sw[sw$year >= 2000, ]) summary(fit)
Proportional hazards model with baseline hazard(s) from the Weibull family of distributions. Allows for stratification with different scale and shape in each stratum, and left truncated and right censored data.
weibreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), init, shape = 0, control = list(eps = 1e-04, maxiter = 10, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE, center = TRUE )
weibreg( formula = formula(data), data = parent.frame(), na.action = getOption("na.action"), init, shape = 0, control = list(eps = 1e-04, maxiter = 10, trace = FALSE), singular.ok = TRUE, model = FALSE, x = FALSE, y = TRUE, center = TRUE )
formula |
a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
data |
a data.frame in which to interpret the variables named in the formula. |
na.action |
a missing-data filter function, applied to the model.frame,
after any subset argument has been used. Default is
|
init |
vector of initial values of the iteration. Default initial value is zero for all variables. |
shape |
If positive, the shape parameter is fixed at that value (in each stratum). If zero or negative, the shape parameter is estimated. If more than one stratum is present in data, each stratum gets its own estimate. |
control |
a list with components |
singular.ok |
Not used. |
model |
Not used. |
x |
Return the design matrix in the model object? |
y |
Return the response in the model object? |
center |
Deprecated, and not used. Will be removed in the future. |
The parameterization is the same as in coxreg
and
coxph
, but different from the one used by
survreg
. The model is
This is in correspondence with Weibull
. To
compare regression coefficients with those from survreg
you need to
divide by estimated shape () and change sign. The p-values
and test statistics are however the same, with one exception; the score test
is done at maximized scale and shape in
weibreg
.
This model is a Weibull distribution with shape parameter and scale
parameter
A list of class c("weibreg", "coxreg")
with components
coefficients |
Fitted parameter estimates. |
var |
Covariance matrix of the estimates. |
loglik |
Vector of length two; first component is the value at the initial parameter values, the second componet is the maximized value. |
score |
The score test statistic (at the initial value). |
linear.predictors |
The estimated linear predictors. |
means |
Means of the columns of the design matrix. |
w.means |
Weighted (against exposure time) means of covariates; weighted relative frequencies of levels of factors. |
n |
Number of spells in indata (possibly after removal of cases with NA's). |
events |
Number of events in data. |
terms |
Used by extractor functions. |
assign |
Used by extractor functions. |
wald.test |
The Wald test statistic (at the initial value). |
y |
The Surv vector. |
isF |
Logical vector indicating the covariates that are factors. |
covars |
The covariates. |
ttr |
Total Time at Risk. |
levels |
List of levels of factors. |
formula |
The calling formula. |
call |
The call. |
method |
The method. |
convergence |
Did the optimization converge? |
fail |
Did the optimization fail? (Is |
pfixed |
TRUE if shape was fixed in the estimation. |
The print method print.weibreg
doesn't work
if threeway or higher order interactions are present.
Note further that covariates are internally centered, if center =
TRUE
, by this function, and this is not corrected for in the output. This
affects the estimate of , but nothing else. If
you don't like this, set
center = FALSE
.
This function is not maintained, and may behave in unpredictable ways.
Use phreg
with dist = "weibull"
(the default) instead!
Will soon be declared deprecated.
Göran Broström
dat <- data.frame(time = c(4, 3, 1, 1, 2, 2, 3), status = c(1, 1, 1, 0, 1, 1, 0), x = c(0, 2, 1, 1, 1, 0, 0), sex = c(0, 0, 0, 0, 1, 1, 1)) weibreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model
dat <- data.frame(time = c(4, 3, 1, 1, 2, 2, 3), status = c(1, 1, 1, 0, 1, 1, 0), x = c(0, 2, 1, 1, 1, 0, 0), sex = c(0, 0, 0, 0, 1, 1, 1)) weibreg( Surv(time, status) ~ x + strata(sex), data = dat) #stratified model
This function is called by weibreg
, but it can also be
directly called by a user.
weibreg.fit(X, Y, strata, offset, init, shape, control, center = TRUE)
weibreg.fit(X, Y, strata, offset, init, shape, control, center = TRUE)
X |
The design (covariate) matrix. |
Y |
A survival object, the response. |
strata |
A stratum variable. |
offset |
Offset. |
init |
Initial regression parameter values. |
shape |
If positive, a fixed value of the shape parameter in the Weibull distribution. Otherwise, the shape is estimated. |
control |
Controls convergence and output. |
center |
Should covariates be centered? |
See weibreg
for more detail.
coefficients |
Estimated regression coefficients plus estimated scale and shape coefficients, sorted by strata, if present. |
var |
|
loglik |
Vector of length 2. The first component is the maximized loglihood with only scale and shape in the model, the second the final maximum. |
score |
Score test statistic at initial values |
linear.predictors |
Linear predictors for each interval. |
means |
Means of the covariates |
conver |
TRUE if convergence |
fail |
TRUE if failure |
iter |
Number of Newton-Raphson iterates. |
n.strata |
The number of strata in the data. |
Göran Broström
hweibull
calculates the hazard function of a Weibull distribution,
and Hweibull
calculates the corresponding cumulative hazard function.
hweibull(x, shape, scale = 1, log = FALSE)
hweibull(x, shape, scale = 1, log = FALSE)
x |
Vector of quantiles. |
shape |
The shape parameter. |
scale |
The scale parameter, defaults to 1. |
log |
logical; if TRUE, the log of the hazard function is given. |
See dweibull.
The (cumulative) hazard function, evaluated at x.
Göran Broström
hweibull(3, 2, 1) dweibull(3, 2, 1) / pweibull(3, 2, 1, lower.tail = FALSE) Hweibull(3, 2, 1) -pweibull(3, 2, 1, log.p = TRUE, lower.tail = FALSE)
hweibull(3, 2, 1) dweibull(3, 2, 1) / pweibull(3, 2, 1, lower.tail = FALSE) Hweibull(3, 2, 1) -pweibull(3, 2, 1, log.p = TRUE, lower.tail = FALSE)
Calculates minus the log likelihood function and its first and second order
derivatives for data from a Weibull regression model. Is called by
weibreg
.
wfunk( beta = NULL, lambda, p, X = NULL, Y, offset = rep(0, length(Y)), ord = 2, pfixed = FALSE )
wfunk( beta = NULL, lambda, p, X = NULL, Y, offset = rep(0, length(Y)), ord = 2, pfixed = FALSE )
beta |
Regression parameters |
lambda |
The scale paramater |
p |
The shape parameter |
X |
The design (covariate) matrix. |
Y |
The response, a survival object. |
offset |
Offset. |
ord |
ord = 0 means only loglihood, 1 means score vector as well, 2 loglihood, score and hessian. |
pfixed |
Logical, if TRUE the shape parameter is regarded as a known constant in the calculations, meaning that it is not cosidered in the partial derivatives. |
Note that the function returns log likelihood, score vector and minus hessian, i.e. the observed information. The model is
This is in correspondence with dweibull
.
A list with components
f |
The log likelihood. Present if
|
fp |
The score vector. Present if |
fpp |
The negative of the hessian. Present if |
Göran Broström