Title: | Interval Censored Multi-State Models |
---|---|
Description: | Allows for the non-parametric estimation of transition intensities in interval-censored multi-state models using the approach of Gomon and Putter (2024) <doi:10.48550/arXiv.2409.07176> or Gu et al. (2023) <doi:10.1093/biomet/asad073>. |
Authors: | Daniel Gomon [aut, cre] , Hein Putter [aut] |
Maintainer: | Daniel Gomon <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.0 |
Built: | 2024-10-29 06:54:52 UTC |
Source: | CRAN |
Non-parametric estimation of transition intensities in interval-censored multi-state models
Allows for the estimation of transition intensities in interval-censored multi-state models using the approach of Gomon and Putter (2024) (Multinomial likelihood) or Gu et al. (2023) (Poisson likelihood).
Maintainer: Daniel Gomon <[email protected]> [aut, cre]
Hein Putter [aut]
Y. Gu, D. Zeng, G. Heiss, and D. Y. Lin, Maximum likelihood estimation for semiparametric regression models with interval-censored multistate data, Biometrika, Nov. 2023, doi:10.1093/biomet/asad073
D. Gomon and H. Putter, Non-parametric estimation of transition intensities in interval censored Markov multi-state models without loops, arXiv, 2024, doi:10.48550/arXiv.2409.07176
Function which takes as input a vector a with event times and a vector b with event times and checks whether each entry in a is larger or equal than the entries in b.
ageqb(a, b)
ageqb(a, b)
a |
Vector of event times |
b |
Vector of event times |
Matrix of size (length(a) * length(b)) with binary values indicating whether the event times in a are larger than the event times in b
Function which takes as input a vector a with event times and a vector b with event times and checks whether each entry in a is larger than the entries in b.
agreaterb(a, b)
agreaterb(a, b)
a |
Vector of event times |
b |
Vector of event times |
Matrix of size (length(a) * length(b)) with binary values indicating whether the event times in a are larger than the event times in b
Function which takes as input a vector a with event times and a 2 column matrix B representing half-open intervals (l, R] and checks whether each event time is contained in each half-open interval.
ainB(a, B)
ainB(a, B)
a |
Vector of event times |
B |
Two column matrix containing intervals |
Matrix of size (length(a) * nrow(B)) with binary values indicating whether the event times in a are contained in the intervals of B
Function which takes as input a 2 column matrix of half-open intervals and a vector of event times and returns the event times that intersect with the half-open intervals.
Aintersectb(A, b, A.left.open = FALSE)
Aintersectb(A, b, A.left.open = FALSE)
A |
Two column matrix containing intervals |
b |
Vector of event times |
A.left.open |
Are the intervals in A open on the left side? Default = FALSE. |
Numeric vector of event times from b that intersect with A.
Function which takes as input a matrix with 2 columns and a vector indicating left points of intervals [b, Infinity) and checks whether each interval in the matrix is contained within the corresponding interval derived from b.
Alargerb(A, b)
Alargerb(A, b)
A |
Two column matrix containing intervals to be checked for being contained in b |
b |
Vector indicating left point in corresponding [b, Infinity) interval |
Matrix of size (nrow(A) * length(b)) with binary values indicating whether the intervals in A are contained in the ones induced by b
Function which takes as input two matrices with 2 columns each and checks whether each interval in the first matrix is contained within each interval in the second matrix.
AsubsetB(A, B, B.left.open = FALSE, B.right.open = FALSE)
AsubsetB(A, B, B.left.open = FALSE, B.right.open = FALSE)
A |
Two column matrix containing intervals to be checked for being contained in B |
B |
Two column matrix containing intervals possibly overlapping the intervals in A |
B.left.open |
Are the intervals in B left-open? |
B.right.open |
Are the intervals in B right-open? |
Matrix of size (nrow(A) * nrow(B)) with binary values indicating whether the intervals in A are contained in B
Given observed transition intervals, determine the "worst" (least informative) possible direct transition intervals that could have occurred to form this sample.
direct_from_observed_intervals(observed_intervals, tmat, gd)
direct_from_observed_intervals(observed_intervals, tmat, gd)
observed_intervals |
Output from |
tmat |
A transition matrix as created by |
gd |
A
The true transition time between states is then interval censored between the times. |
A data.frame
with the following named columns
entry_time
:Time of entry into "from" state;
time_from
:Last time subject(id) was seen in state "from";
time_to
:First time subject(id) was seen in state "to";
from
:State from which a transition was observed;
to
:State to which the transition was observed;
id
:Subject identifier;
For right-censored observations, entry_time denotes the first time
seen in the censored state, time_from the last time seen in the censored state,
time_to is Inf
, from the censored state and to is NA
.
Given a realisation of a multi-state model, estimate the support of the different transitions possible in that MSM. The estimation is performed by viewing each possible state in a competing risks setting and applying the result of Hudgens (2001) to determine the support and left-truncation intervals and Hudgens (2005) to check whether a solution is possible.
estimate_support_msm(gd, tmat)
estimate_support_msm(gd, tmat)
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
TODO
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
M. G. Hudgens, G. A. Satten, and I. M. Longini, Nonparametric Maximum Likelihood Estimation for Competing Risks Survival Data Subject to Interval Censoring and Truncation, Biometrics, vol. 57, no. 1, Pages 74-80, March 2001, doi:10.1111/j.0006-341x.2001.00074.x
Given a markov chain multi state model with exactly observed transition times, sample from this chain at the observation times, giving interval censored observations (panel data).
evalstep(time, stepf, newtime, subst = -Inf, to.data.frame = FALSE)
evalstep(time, stepf, newtime, subst = -Inf, to.data.frame = FALSE)
time |
Times at which a transition occurs |
stepf |
States at which the chain is in at |
newtime |
Observation times of the chain, to create observed states |
subst |
State to return if observation time is before first transition time. Default = -Inf. |
to.data.frame |
Should the result be returned as a |
A numeric vector
or data.frame
(if to.data.frame = TRUE
)
containing either the observed states or the named columns newtime
and
res
, representing the observation times and observed states.
Hein Putter
obs_states <- evalstep(time = seq(0, 20, 2), stepf = sample(1:9, 11, replace = TRUE), newtime = c(-1, 1, 7, 9, 19)) obs_states
obs_states <- evalstep(time = seq(0, 20, 2), stepf = sample(1:9, 11, replace = TRUE), newtime = c(-1, 1, 7, 9, 19)) obs_states
Given a sample from a multi-state model, summarize the transitions that have been observed.
get_trans_intervals(gd, tmat)
get_trans_intervals(gd, tmat)
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
A data.frame
with the following named columns
entry_time
:Time of entry into "from" state;
time_from
:Last time subject(id) was seen in state "from";
time_to
:First time subject(id) was seen in state "to";
from
:State from which a transition was observed;
to
:State to which the transition was observed;
id
:Subject identifier;
For right-censored observations, entry_time denotes the first time
seen in the censored state, time_from the last time seen in the censored state,
time_to is Inf
, from the censored state and to is NA
.
Given intervals, construct a graph containing vertices representing these intervals and edges between the vertices if the intervals intersect. See Hudgens (2005).
graphfromIntervals(intervals)
graphfromIntervals(intervals)
intervals |
A data.frame with 3 columns containing half-open intervals (left open, right closed) and an indicator whether the interval results from a censored transition or truncation: #'
Note that the truncation intervals need to be in the form (N, Inf] with N a numeric value. |
Returns an 'igraph'
object containing the graph with vertices
representing the intervals and edges between the vertices if the intervals
intersect. The vertices will be named accordingly, starting with a 'T' when
representing a truncation interval and 'C' when representing a censoring
interval.
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
msfit
object, linearly interpolate the cumulative hazard
taking into account the support sets for msfit
objects.After using this function, use probtrans to get interpolated transition probabilities.
interpol_msfit(msfit, times)
interpol_msfit(msfit, times)
msfit |
A |
times |
Times at which to interpolate the |
An msfit
object containing the interpolated hazards
library(mstate) tmat <- trans.illdeath() times <- seq(0, 5, 0.1) ms_fit <- list(Haz = data.frame(time = rep(times, 3), Haz = c(replicate(3, cumsum(runif(length(times), 0, 0.02)))), trans = rep(1:3, each = length(times))), trans = tmat) class(ms_fit) <- "msfit" ms_fit_interpolated <- interpol_msfit(ms_fit, seq(0, 5, 0.01))
library(mstate) tmat <- trans.illdeath() times <- seq(0, 5, 0.1) ms_fit <- list(Haz = data.frame(time = rep(times, 3), Haz = c(replicate(3, cumsum(runif(length(times), 0, 0.02)))), trans = rep(1:3, each = length(times))), trans = tmat) class(ms_fit) <- "msfit" ms_fit_interpolated <- interpol_msfit(ms_fit, seq(0, 5, 0.01))
Determine NPMLE for Multi State illness death Markov model using Frydman (1995)
msm_frydman(data, tol = 1e-08)
msm_frydman(data, tol = 1e-08)
data |
A
|
tol |
Tolerance of the EM algorithm. Algorithm will stop when the absolute difference between current mass estimates and new estimates is smaller than the tolerance |
For an illness death model (1 = healthy, 2 = ill, 3 = dead) estimate the NPMLE in the following form:
F12
:Cumulative distribution function of 1->2 transition;
F13
:Cumulative distribution function of 1->3 transition;
Lambda23
:Cumulative intensity of 2->3 transition;
A list with the following entries:
data_idx
: A list containing the data used for the fit (matdata
),
the indices for which group a subject belongs to (GroupX_idx
), some computational
parameters (see Frydman(1995)) and the unique failure times of the 2->3 and 1->3
transitions respectively in t_n_star
and e_k_star
;
supportMSM
: A list containing all transition intervals in A
and
the theoretical support intervals in Q_mat
;
z_lambda
: Computational quantities, see Frydman(1995);
cdf
: A list of functions that allow to recover the cdf for
the 1->3 (F13
) and 1->2 (F12
) transition and the cumulative hazard
for the 2->3 (Lambda23
) transition.;
Frydman, H. (1995). Nonparametric Estimation of a Markov 'Illness-Death' Process from Interval- Censored Observations, with Application to Diabetes Survival Data. Biometrika, 82(4), 773-789. doi:10.2307/2337344
data <- data.frame(delta = c(0, 0, 1, 1), Delta = c(0, 1, 0, 1), L = c(NA, NA, 1, 1.5), R = c(NA, 3, 2, 3), time = c(4, 5, 6, 7)) mod_frydman <- msm_frydman(data) visualise_data(data, mod_frydman)
data <- data.frame(delta = c(0, 0, 1, 1), Delta = c(0, 1, 0, 1), L = c(NA, NA, 1, 1.5), R = c(NA, 3, 2, 3), time = c(4, 5, 6, 7)) mod_frydman <- msm_frydman(data) visualise_data(data, mod_frydman)
For a general Markov chain multi-state model with interval censored
transitions calculate the NPMLE of the transition intensities. The estimates
are returned as an msfit
object.
npmsm( gd, tmat, method = c("multinomial", "poisson"), inits = c("equalprob", "homogeneous", "unif", "beta"), beta_params, support_manual, exact, maxit = 100, tol = 1e-04, conv_crit = c("haz", "prob", "lik"), verbose = FALSE, manual = FALSE, newmet = FALSE, include_inf = FALSE, checkMLE = TRUE, checkMLE_tol = 1e-04, prob_tol = tol/10, remove_redundant = TRUE, remove_bins = FALSE, estimateSupport = FALSE, init_int = c(0, 0), ... )
npmsm( gd, tmat, method = c("multinomial", "poisson"), inits = c("equalprob", "homogeneous", "unif", "beta"), beta_params, support_manual, exact, maxit = 100, tol = 1e-04, conv_crit = c("haz", "prob", "lik"), verbose = FALSE, manual = FALSE, newmet = FALSE, include_inf = FALSE, checkMLE = TRUE, checkMLE_tol = 1e-04, prob_tol = tol/10, remove_redundant = TRUE, remove_bins = FALSE, estimateSupport = FALSE, init_int = c(0, 0), ... )
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
method |
Which method should be used for the EM algorithm. Choices are
|
inits |
Which distribution should be used to generate the initial estimates
of the intensities in the EM algorithm. One of c("equalprob", "unif", "beta"),
with "equalprob" assigning 1/K to each intensity, with K the number of distinct
observation times ( |
beta_params |
A vector of length 2 specifying the beta distribution parameters
for initial distribution generation. First entry will be used as |
support_manual |
Used for specifying a manual support region for the transitions.
A list of length the number of transitions in |
exact |
Numeric vector indicating to which states transitions are observed at exact times.
Must coincide with the column number in |
maxit |
Maximum number of iterations. Default = 100. |
tol |
Tolerance of the convergence procedure. A change in the value of
|
conv_crit |
Convergence criterion. Stops procedure when the difference
in the chosen quantity between two consecutive iterations is smaller
than the tolerance level
Default is "haz". The options "haz" and "lik" can be compared across different
|
verbose |
Should iteration messages be printed? Default is FALSE |
manual |
Manually specify starting transition intensities? If |
newmet |
Should contributions after last observation time also be used in the likelihood? Default is FALSE. |
include_inf |
Should an additional bin from the largest observed time to infinity be included in the algorithm? Default is FALSE. |
checkMLE |
Should a check be performed whether the estimate has converged
towards a true Maximum Likelihood Estimate? This is done by comparing
the reduced gradient to the value of |
checkMLE_tol |
Tolerance for checking whether the estimate has converged to MLE.
Whenever an estimated transition intensity is smaller than |
prob_tol |
If an estimated probability is smaller than |
remove_redundant |
Should redundant observations be removed before running the algorithm? An observation is redundant when the same state has been observed more than 3 times consecutively, or if it is a repeat observation of an absorbing state. Default is TRUE. |
remove_bins |
Should a bin be removed during the algorithm if all
estimated intensities are zero for a single bin? Can improve
computation speed for large data sets. Note that zero means the estimated intensities
are smaller than |
estimateSupport |
Should the support of the transitions be estimated using
the result of Hudgens (2005)? Currently produces incorrect support sets -
DO NOT USE. Default = |
init_int |
A vector of length 2, with the first entry indicating what percentage of mass should be distributed over (second entry) what percentage of all first bins. Default is c(0, 0), in which case the argument is ignored. This argument has no practical uses and only exists for demonstration purposes in the related article. |
... |
Further arguments to |
Denote the unique observation times in the data as
Let
denote the possible states in the model and
the state of the process at time t.
Then this function can be used to estimate the transition intensities
.
Having obtained these estimated, it is possible to recover the transition probabilities
for
using
the
transprob
functions.
A list with the following entries:
A
: A list of class msfit
containing
the cumulative intensities for each transition and the transition matrix used;
Ainit
: Initial intensities, in an object of class msfit
;
gd
: Data used for the estimation procedure;
ll
: Log-likelihood value of the procedure at the last iteration;
delta
: Change in log-likelihood value at the last iteration;
it
: Number of iterations of the procedure;
taus
: Unique time points of the data, the cumulative intensity only makes jumps at these time points.;
tmat
: The transition matrix used, see transMat
;
tmat2
: A summary of the transitions in the model, see to.trans2
;
ll_history
: The log-likelihood value at every iteration of the procedure;
KKT_violated
: How often were KKT conditions violated during maximisation of the likelihood? In other words, how often did we hit the optimization boundary during the procedure?;
min_time
: The smallest time of an observation in the used data. Note that the smallest time in the data is used as zero reference;
reduced_gradient
: The reduced gradient at the last iteration. Rows indicate the transitions and columns the unique observation times;
isMLE
: Has the procedure converged to the NPMLE? Checked
using checkMLE_tol
;
langrangemultiplier
: The lagrange multipliers at the last iteration;
aghmat
: A matrix representation of the transition intensities in A
.
Rows represent transitions and columns unique observation times;
Ygk
: The summed at-risk indicator for all subjects in the data at the last iteration. Rows represent transitions and columns unique observation times;
Dmk
: The summed probability of making a transition for all subjects at the last iteration. Rows represent transitions and columns unique observation times;
method
: Method used for the optimization procedure;
maxit
: Maximum number of allowed iterations;
tol
: Tolerance of the convergence procedure;
conv_crit
: Convergence criterion of the procedure;
checkMLE
: Was the reduced gradient checked at the last iteration to determine convergence?;
checkMLE_tol
: The tolerance of the checkMLE procedure;
prob_tol
: Tolerance for probabilities to be set to zero;
remove_redundant
: Were redundant observations removed before performing the procedure?;
D. Gomon and H. Putter, Non-parametric estimation of transition intensities in interval censored Markov multi-state models without loops, arXiv, 2024, doi:10.48550/arXiv.2409.07176
Y. Gu, D. Zeng, G. Heiss, and D. Y. Lin, Maximum likelihood estimation for semiparametric regression models with interval-censored multistate data, Biometrika, Nov. 2023, doi:10.1093/biomet/asad073
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
transprob
for calculating transition probabilities,
plot.npmsm
for plotting the cumulative intensities,
print.npmsm
for printing some output summaries,
visualise_msm
for visualising data,
msfit
for details on the output object.
#Create transition matrix using mstate functionality: illness-death if(require(mstate)){ tmat <- mstate::trans.illdeath() } #Write a function for evaluation times: observe at 0 and uniform inter-observation times. eval_times <- function(n_obs, stop_time){ cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Use built_in function to simulate illness-death data #from Weibull distributions for each transition sim_dat <- sim_id_weib(n = 50, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) tmat <- mstate::trans.illdeath() #Fit the model using method = "multinomial" mod_fit <- npmsm(gd = sim_dat, tmat = tmat, tol = 1e-2) #Plot the cumulative intensities for each transition plot(mod_fit)
#Create transition matrix using mstate functionality: illness-death if(require(mstate)){ tmat <- mstate::trans.illdeath() } #Write a function for evaluation times: observe at 0 and uniform inter-observation times. eval_times <- function(n_obs, stop_time){ cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Use built_in function to simulate illness-death data #from Weibull distributions for each transition sim_dat <- sim_id_weib(n = 50, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) tmat <- mstate::trans.illdeath() #Fit the model using method = "multinomial" mod_fit <- npmsm(gd = sim_dat, tmat = tmat, tol = 1e-2) #Plot the cumulative intensities for each transition plot(mod_fit)
npmsm
modelFor a fitted npmsm
model plot the transition probabilities
from a certain state for all possible (direct and indirect) transitions.
plot_probtrans( npmsmlist, from = NULL, to = NULL, transitions = NULL, landmark, interpolate = TRUE, facet = TRUE, times_interpol = NULL, c.legend = TRUE, c.names = NULL )
plot_probtrans( npmsmlist, from = NULL, to = NULL, transitions = NULL, landmark, interpolate = TRUE, facet = TRUE, times_interpol = NULL, c.legend = TRUE, c.names = NULL )
npmsmlist |
An "npmsm" object or a list containing multiple "npmsm" objects |
from |
A numeric value indicating the state from which we consider the transition probabilities. Default is NULL, meaning we consider transition probabilities from all states from which a direct transition is possible. |
to |
A numeric vector indicating to which states we consider the transition
probabilities. Only states that can be reached from the |
transitions |
A numeric vector indicating which transitions to consider (plot).
Can only be used if |
landmark |
A landmark time indicating from which time on survival should be determined. If missing, the smallest between the time in the first "npmsm" object or 0 will be used. |
interpolate |
Should the cumulative hazard be linearly interpolated before determining transition probabilities? Default is TRUE. |
facet |
Should the resulting plot be faceted (one panel per transition)? Default is TRUE. |
times_interpol |
At which times should the cumulative hazard be interpolated? Only necessary to specify if interpolate = TRUE. |
c.legend |
Should legend be displayed for colour (different entries in
|
c.names |
A character vector indicating the names to display in the legend.
These names should represent the entries in |
A plot will be produced in the plotting window. When assigning
the output to an object, the underlying data frame used for plotting
and a 'ggplot'
object will be returned in a list.
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0)) #Fit 2 models: 1 with at most 4 iterations and 1 with at most 20 mod1 <- npmsm(gd, trans.illdeath(), maxit = 4) mod2 <- npmsm(gd, trans.illdeath(), maxit = 20) #Plot the transition probabilities from state 1, without interpolating #the cumulative hazard for the npmsm runs with max 4 and 20 iterations. plot_probtrans(list(mod1, mod2), from = 1, interpolate = FALSE, c.names = c("4 iterations", "20 iterations"))
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0)) #Fit 2 models: 1 with at most 4 iterations and 1 with at most 20 mod1 <- npmsm(gd, trans.illdeath(), maxit = 4) mod2 <- npmsm(gd, trans.illdeath(), maxit = 20) #Plot the transition probabilities from state 1, without interpolating #the cumulative hazard for the npmsm runs with max 4 and 20 iterations. plot_probtrans(list(mod1, mod2), from = 1, interpolate = FALSE, c.names = c("4 iterations", "20 iterations"))
npmsm
modelFor a fitted npmsm
model plot the transition specific
survival probabilities. These are given by the product integral of the hazard
increments estimated for a single transition. This is equivalent to a Kaplan-Meier
estimator ignoring the existence of all other transitions.
plot_surv(npmsmlist, landmark, support = FALSE, sup_cutoff = 1e-08)
plot_surv(npmsmlist, landmark, support = FALSE, sup_cutoff = 1e-08)
npmsmlist |
An "npmsm" object or a list containing multiple "npmsm" objects |
landmark |
A landmark time indicating from which time on survival should be determined. If missing, the smallest time in the first "npmsm" object will be used. |
support |
Should the support regions be displayed as rectangles? |
sup_cutoff |
Cutoff to be used for determining the support intervals. |
A plot will be produced in the plotting window. When assigning
the output to an object, the underlying data frame used for plotting
and a 'ggplot'
object will be returned in a list.
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0)) mod1 <- npmsm(gd, trans.illdeath(), maxit = 4) mod2 <- npmsm(gd, trans.illdeath(), maxit = 20) #Plot the transition specific Kaplan-Meier estimators and their numerically #determined support sets. plot_surv(list(mod1, mod2), support = TRUE)
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0)) mod1 <- npmsm(gd, trans.illdeath(), maxit = 4) mod2 <- npmsm(gd, trans.illdeath(), maxit = 20) #Plot the transition specific Kaplan-Meier estimators and their numerically #determined support sets. plot_surv(list(mod1, mod2), support = TRUE)
Plot the cumulative intensities of a 'npmsm'
objects.
A wrapper for plot.msfit
from the
mstate
package.
## S3 method for class 'npmsm' plot(x, ...)
## S3 method for class 'npmsm' plot(x, ...)
x |
An object of class |
... |
Additional arguments to |
A plot will be produced in the plotting window
Print some details of a npmsm
fit
## S3 method for class 'npmsm' print(x, ...)
## S3 method for class 'npmsm' print(x, ...)
x |
An object of class |
... |
Additional arguments to print |
A summary of the fitted model will be displayed in the console
Determine transition probabilities for a multi-state model with Weibull hazards for the transitions.
probtrans_weib(transMat, times, shapes, scales, type = c("prodint", "ODE"))
probtrans_weib(transMat, times, shapes, scales, type = c("prodint", "ODE"))
transMat |
A transition matrix as generated by |
times |
The times at which the transition probabilities should be
determined. Will always determine the probabilities forward in time starting
from |
shapes |
The Weibull shapes corresponding to the numbered transitions
in |
scales |
The Weibull scales corresponding to the numbered transitions
in |
type |
Should the transition probabilities be determined using
product integration |
An object containing the "true" transition probabilities for the specified Weibull hazards.
#Illness-death model tmat <- mstate::trans.illdeath() IDM <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "prodint") IDM2 <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "ODE") plot(IDM) plot(IDM2) #Extended illness-death model tmat <- mstate::transMat(list(c(2, 3), c(4), c(), c())) IDM <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "prodint") IDM2 <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "ODE") plot(IDM) plot(IDM2)
#Illness-death model tmat <- mstate::trans.illdeath() IDM <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "prodint") IDM2 <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "ODE") plot(IDM) plot(IDM2) #Extended illness-death model tmat <- mstate::transMat(list(c(2, 3), c(4), c(), c())) IDM <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "prodint") IDM2 <- probtrans_weib(tmat, seq(0, 15, 0.01), shapes = c(0.5, 0.5, 2), scales = c(5, 10, 10/gamma(1.5)), type = "ODE") plot(IDM) plot(IDM2)
This function calculates G as defined in
Frydman (1995), with
the failure times. Note that length(t_n) must be
equal to length(lambda)
prod_lambda_G_base(lambda, t_n, Q, A)
prod_lambda_G_base(lambda, t_n, Q, A)
lambda |
Intensities of the 2->3 transition |
t_n |
Unique failure times, same length as lambda |
Q |
Matrix (2 column) containing support intervals as rows |
A |
Matrix (2 column) containing censoring intervals as rows |
Remove redundant observed states from a supplied data frame. Observations are redundant either when we observe an absorbing state multiple times (as we cannot leave an absorbing state), or when a transient state is observed multiple times between transitions (as we cannot have loops, therefore no extra information is provided when we observe a transient state multiple times).
remove_redundant_observations(gd, tmat)
remove_redundant_observations(gd, tmat)
gd |
A
The true transition time between states is then interval censored between the times. |
tmat |
A transition matrix as created by |
A data.frame
containing the information contained in the
input data.frame
gd
,
but without redundant observations. Depending on whether tmat
was
specified the function may remove more observations.
#We simulate some data #Function to generate evaluation times: at 0 and uniform inter-observation eval_times <- function(n_obs, stop_time){ cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Simulate illness-death model data with Weibull transitions sim_dat <- sim_id_weib(n = 20, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) visualise_msm(sim_dat) require(mstate) sim_dat_clean <- remove_redundant_observations(sim_dat, trans.illdeath()) visualise_msm(sim_dat_clean)
#We simulate some data #Function to generate evaluation times: at 0 and uniform inter-observation eval_times <- function(n_obs, stop_time){ cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Simulate illness-death model data with Weibull transitions sim_dat <- sim_id_weib(n = 20, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) visualise_msm(sim_dat) require(mstate) sim_dat_clean <- remove_redundant_observations(sim_dat, trans.illdeath()) visualise_msm(sim_dat_clean)
An illness-death model has 3 transitions:
1
:State 1 (Healthy) to State 2 (Illness);
2
:State 1 (Healthy) to State 3 (Death);
3
:State 2 (Illness) to State 3 (Death);
Using this function, it is possible to simulate data from an illness-death model with Weibull transition intensities. Requires the use of an external (self-written) function to generate observation times.
sim_id_weib( n, n_obs, stop_time, eval_times, start_state = c("stable", "equalprob"), shape, scale, ... )
sim_id_weib( n, n_obs, stop_time, eval_times, start_state = c("stable", "equalprob"), shape, scale, ... )
n |
Number of subjects to generate paths for. |
n_obs |
Number of observations in time period for each subject. |
stop_time |
Largest time at which the model is considered. |
eval_times |
A function which returns the evaluation times for a subject.
Must have as arguments at least |
start_state |
In which states can subjects start? Either everyone starts in state 1 ("stable") or equal probability to start in state 1 or 2 ("equalprob"). |
shape |
Vector of shape parameters for the 3 transitions. See |
scale |
Vector of scale parameters for the 3 transitions. See |
... |
Further parameters to |
Taking shape = 1
we get an exponential distribution with rate
1/scale
Panel data in the form of a data.frame
with 3 named columns
id, time and state. These represent the subject identifier, the observation
time and the state at the
observation time.
#Function to generate evaluation times: at 0 and uniform inter-observation eval_times <- function(n_obs, stop_time){ cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Simulate illness-death model data with Weibull transitions sim_dat <- sim_id_weib(n = 20, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) visualise_msm(sim_dat)
#Function to generate evaluation times: at 0 and uniform inter-observation eval_times <- function(n_obs, stop_time){ cumsum( c( 0, runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Simulate illness-death model data with Weibull transitions sim_dat <- sim_id_weib(n = 20, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) visualise_msm(sim_dat)
Simulate multiple trajectories from a multi-state model quantified by a transition matrix, with interval-censored transitions and Weibull distributed transition intensities. Allows for Weibull censoring in each of the states.
sim_weibmsm( data, tmat, startprobs, exact, shape, scale, censshape, censscale, n_subj, obs_pars, true_trajec = FALSE )
sim_weibmsm( data, tmat, startprobs, exact, shape, scale, censshape, censscale, n_subj, obs_pars, true_trajec = FALSE )
data |
A |
tmat |
A transition matrix as created by |
startprobs |
A numeric vector of length H indicating the probability of each subject to start in any of the possible states. Must sum to 1. By default, all subjects will start in state 1. |
exact |
A numeric vector indicating which states are exactly observed.
The transition time to exact states will be observed at exact times, regardless
of the times in |
shape |
A numeric vector of length M indicating the shape of the Weibull
transition intensity for the corresponding transition in |
scale |
A numeric vector of length M indicating the scale of the Weibull
transition intensity for the corresponding transition in |
censshape |
A numeric vector of length H indicating the Weibull
censoring shape in each of the states. If no censoring is required in some states,
set corresponding entries to |
censscale |
A numeric vector of length H indicating the Weibull censoring
scale in each of the states. If no censoring is required in some states,
set corresponding entries to |
n_subj |
(Optional) Instead of specifying |
obs_pars |
(Optional) A numeric vector of length 3 specifying what the
time is between planned assessments, what the uniform deviation from this
time is at the visits and the maximum visit time.
Specifying |
true_trajec |
Should the true (right-censored) trajectory be returned for
the subjects as well? Default = |
Taking (cens)shape
to be 1 for all transitions, we obtain exponential
(censoring)/transitions with rate 1/(cens)scale
.
If right-censoring parameters are specified, a right-censoring time is generated in
each of the visited states. If the subject is right-censored, we assume the subject
is no longer observed at later obstimes
. Due to the interval-censored
nature of the generation process, it may therefore appear as if the subject
was right-censored in an earlier state.
Suppose a subject arrives in state g at time s. If we wish to generate
a survival time from that state according to a Weibull intensity in a clock forward
model, we can use the inverse transform of the conditional Weibull intensity.
More specifically, letting denote the shape and
denote the scale,
the conditional survival function for
is given by
The corresponding cumulative intensity is then given by:
And the inverse cumulative intensity is then:
A conditional survival time is then generated by:
with a sample from the standard uniform distribution.
If we additionally have covariates (or frailties), the
above should be replaced by
with
and
the coefficients and covariates respectively.
A matrix
with 3 columns time
, state
and id
, indicating
the observation time, the corresponding state and subject identifier. If
true_trajec = TRUE
, a list
with the matrix described above and a matrix
representing the underlying right-censored trajectory.
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0), true_trajec = TRUE) #Observed trajectories visualise_msm(gd$observed) #True trajectories visualise_msm(gd$true) #Can supply data-frame with specified observation times obs_df <- data.frame(time = c(0, 1, 3, 5, 0.5, 6, 9), id = c(1, 1, 1, 1, 2, 2, 2)) gd <- sim_weibmsm(data = obs_df, tmat = trans.illdeath(), shape = c(1, 1, 1), scale = c(2, 10, 1)) visualise_msm(gd)
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0), true_trajec = TRUE) #Observed trajectories visualise_msm(gd$observed) #True trajectories visualise_msm(gd$true) #Can supply data-frame with specified observation times obs_df <- data.frame(time = c(0, 1, 3, 5, 0.5, 6, 9), id = c(1, 1, 1, 1, 2, 2, 2)) gd <- sim_weibmsm(data = obs_df, tmat = trans.illdeath(), shape = c(1, 1, 1), scale = c(2, 10, 1)) visualise_msm(gd)
Given only direct transition intervals, determine the support for each transition separately using Hudgens(2001) result. Each state is considered from a competing-risks viewpoint. Hudgens(2005) result is applied to see if the NPMLE for any of the transitions does not exist.
support_from_direct_intervals(direct_intervals, tmat)
support_from_direct_intervals(direct_intervals, tmat)
direct_intervals |
Output from |
tmat |
A transition matrix as created by |
A list containing the estimated support sets for each possible transition
in tmat
.
For each transition in tmat
, determine all consecutive
bins with non-zero (higher than cutoff
) transition intensities.
These then determine the numerical support of the transition.
support_npmsm(npmsm, cutoff = 1e-08)
support_npmsm(npmsm, cutoff = 1e-08)
npmsm |
|
cutoff |
Above which value is a mass in a bin considered to be non-zero? Default = 1e-8. Note that this is independent of bin size, so can be tricky!! |
A list containing a list for each transition. Each transition specific list contains the support intervals for that transition in a matrix with 3 named columns L, R and dA, indicating the left/right-endpoints of the support intervals and the change in the estimated intensities over this support interval.
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0)) #Fit 2 models: 1 with at most 4 iterations and 1 with at most 20 mod1 <- npmsm(gd, trans.illdeath(), maxit = 4) #Determine support numerically: mod1_supp <- support_npmsm(mod1) mod1_supp[[1]]
require(mstate) require(ggplot2) #Generate from an illness-death model with exponential transitions with #rates 1/2, 1/10 and 1 for 10 subjects over a time grid. gd <- sim_weibmsm(tmat = trans.illdeath(), shape = c(1,1,1), scale = c(2, 10, 1), n_subj = 10, obs_pars = c(2, 0.5, 20), startprobs = c(0.9, 0.1, 0)) #Fit 2 models: 1 with at most 4 iterations and 1 with at most 20 mod1 <- npmsm(gd, trans.illdeath(), maxit = 4) #Determine support numerically: mod1_supp <- support_npmsm(mod1) mod1_supp[[1]]
Given censoring/truncation intervals, find the maxcliques and determine the support of the interval censored problem.
supportHudgens(intervals, reduction = TRUE, existence = FALSE)
supportHudgens(intervals, reduction = TRUE, existence = FALSE)
intervals |
A data.frame with 3 columns containing half-open intervals (left open, right closed) and an indicator whether the interval results from a censored transition or truncation:
Note that the truncation intervals need to be in the form (N, Inf] with N a numeric value. |
reduction |
Should the support be reduced using Lemma 3 from Hudgens (2005)? This requires checking an extra condition. Default is TRUE. |
existence |
Should the existence of the NPMLE be checked using Theorem 1/Lemma 4 from
Hudgens (2005)? Requires |
graph
: An igraph
object representing the censoring/truncation intervals
support
: Support estimated from the censoring intervals
dir_graph
: A directed igraph
object used to determine whether the NPMLE
exists in the presence of left-truncation.
exist_mle
: Logical output indicating whether the NPMLE exists.
Michael G. Hudgens, On Nonparametric Maximum Likelihood Estimation with Interval Censoring and Left Truncation, Journal of the Royal Statistical Society Series B: Statistical Methodology, Volume 67, Issue 4, September 2005, Pages 573-587, doi:10.1111/j.1467-9868.2005.00516.x
probtrans
functionFor 'msm'
objects: determine transition probabilities
(as in probtrans
) from an
msm
object. Currently only direction = "forward" is supported.
For 'npmsm'
objects: Determine transition probabilities for an 'npmsm'
object
using the probtrans
function.
For 'msfit'
objects: Wrapper for probtrans
## S3 method for class 'msm' transprob(object, predt, times, ...) ## S3 method for class 'npmsm' transprob(object, ...) ## S3 method for class 'msfit' transprob(object, ...) transprob(object, ...)
## S3 method for class 'msm' transprob(object, predt, times, ...) ## S3 method for class 'npmsm' transprob(object, ...) ## S3 method for class 'msfit' transprob(object, ...) transprob(object, ...)
object |
Object of compatible class |
predt |
A positive number indicating the prediction time. This is
the time at which the prediction is made. If missing, smallest time of
|
times |
A vector of times at which the transition probabilities should be determined. |
... |
Further arguments to |
Can be used for objects of class 'npmsm', 'msm' and 'msfit'
A probtrans
object containing the estimated transition probabilities.
Visualise data for illness-death model, only applicable to Frydman(1995) setting.
visualise_data(data, msmFrydman)
visualise_data(data, msmFrydman)
data |
A
|
msmFrydman |
A fitted model from |
Returns a visualisation of illness-death data, with the transition
from healthy to illness interval-censored and the other two transitions
observed exactly or right-censored. If msmFrydman
is specified, the
support intervals from the fit are additionally plotted at the top of the
data visualisation.
Frydman, H. (1995). Nonparametric Estimation of a Markov 'Illness-Death' Process from Interval- Censored Observations, with Application to Diabetes Survival Data. Biometrika, 82(4), 773-789. doi:10.2307/2337344
See msm_frydman
for fitting a model.
data <- data.frame(delta = c(0, 0, 1, 1), Delta = c(0, 1, 0, 1), L = c(NA, NA, 1, 1.5), R = c(NA, 3, 2, 3), time = c(4, 5, 6, 7)) mod_frydman <- msm_frydman(data) visualise_data(data, mod_frydman)
data <- data.frame(delta = c(0, 0, 1, 1), Delta = c(0, 1, 0, 1), L = c(NA, NA, 1, 1.5), R = c(NA, 3, 2, 3), time = c(4, 5, 6, 7)) mod_frydman <- msm_frydman(data) visualise_data(data, mod_frydman)
Produce a plot with the y-axis representing subjects in the data and the x-axis representing the time at which states have been observed.
visualise_msm(gd, npmsm, tmat, neat = TRUE, cutoff)
visualise_msm(gd, npmsm, tmat, neat = TRUE, cutoff)
gd |
A
|
npmsm |
Output from |
tmat |
A transition matrix as created by |
neat |
Boolean indicating whether redundant observations should be removed in the plot. Default is TRUE |
cutoff |
cutoff value for numerically determining the support using
|
A plot will be produced in the plotting window.
#Write a function for evaluation times: observe at 0 and uniform inter-observation times. eval_times <- function(n_obs, stop_time){ cumsum( c( runif(1, 0, 0.5), runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Use built_in function to simulate illness-death data #from Weibull distributions for each transition sim_dat <- sim_id_weib(n = 50, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) #Visualise the data visualise_msm(sim_dat)
#Write a function for evaluation times: observe at 0 and uniform inter-observation times. eval_times <- function(n_obs, stop_time){ cumsum( c( runif(1, 0, 0.5), runif( n_obs-1, 0, 2*(stop_time-4)/(n_obs-1) ) ) ) } #Use built_in function to simulate illness-death data #from Weibull distributions for each transition sim_dat <- sim_id_weib(n = 50, n_obs = 6, stop_time = 15, eval_times = eval_times, start_state = "stable", shape = c(0.5, 0.5, 2), scale = c(5, 10, 10/gamma(1.5))) #Visualise the data visualise_msm(sim_dat)