Title: | Log-Concave Distribution Estimation with Interval-Censored Data |
---|---|
Description: | We consider the non-parametric maximum likelihood estimation of the underlying distribution function, assuming log-concavity, based on mixed-case interval-censored data. The algorithm implemented is based on Chi Wing Chu, Hok Kan Ling and Chaoyu Yuan (2024, <doi:10.48550/arXiv.2411.19878>). |
Authors: | Chi Wing Chu [aut], Hok Kan Ling [aut], Chaoyu Yuan [aut, cre] |
Maintainer: | Chaoyu Yuan <[email protected]> |
License: | GPL-3 |
Version: | 1.0.1 |
Built: | 2024-12-06 01:36:35 UTC |
Source: | CRAN |
This function constructs case II interval-censored data using the provided event times and censoring (survey) times. Each individual's event time is either left-censored, right-censored, or interval-censored based on two survey times: the left and right bounds of the interval.
case_II_X(event_times, survey_times)
event_times |
A numeric vector of event times for each individual. |
survey_times |
A numeric matrix with two columns, where each row contains the left and right censoring (survey) times for each individual. |
A matrix with two columns, where each row represents an individual's interval-censored data.
The first column is the left endpoint, and the second column is the right endpoint.
If the event time is before the left survey time, the interval is (0, left survey time].
If the event time is after the right survey time, the interval is (right survey time, Inf).
If the event time falls between the left and right survey times, the interval is (left survey time, right survey time].
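The three rules above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name make_case_II is hypothetical, and the handling of ties (an event time exactly equal to a survey time is placed in the interval that closes at that time) is an assumption based on the left-open, right-closed intervals described above.

```r
# Hypothetical helper (not the package's case_II_X): base-R sketch of the rules above.
# Assumption: ties are resolved so that every interval is left-open, right-closed.
make_case_II <- function(event_times, survey_times) {
  left  <- survey_times[, 1]
  right <- survey_times[, 2]
  lower <- ifelse(event_times <= left, 0,
                  ifelse(event_times <= right, left, right))
  upper <- ifelse(event_times <= left, left,
                  ifelse(event_times <= right, right, Inf))
  cbind(lower = lower, upper = upper)
}

# One left-censored, one interval-censored, one right-censored individual
event_times  <- c(0.5, 1.5, 4.0)
survey_times <- cbind(c(1, 1, 1), c(3, 3, 3))
X_demo <- make_case_II(event_times, survey_times)
print(X_demo)
```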
This function constructs case I interval-censored data (current status data) using the provided event times and censoring (survey) times. Each individual's event time is either left-censored or right-censored at their survey time, depending on whether the event has occurred by the survey time.
current_status_X(event_times, survey_times)
event_times |
A numeric vector of event times for each individual. |
survey_times |
A numeric vector of censoring (survey) times for each individual. |
A matrix with two columns, where each row represents an individual's interval-censored data.
The first column is the left endpoint, and the second column is the right endpoint.
If the event time is before the survey time, the interval is (0, survey_time].
If the event time is after the survey time, the interval is (survey_time, Inf).
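A minimal base-R sketch of the current status rule follows. The helper name make_current_status is hypothetical, and treating an event time equal to the survey time as "already occurred" is an assumption made for illustration.

```r
# Hypothetical helper (not the package's current_status_X).
# Assumption: an event exactly at the survey time counts as having occurred.
make_current_status <- function(event_times, survey_times) {
  occurred <- event_times <= survey_times
  cbind(lower = ifelse(occurred, 0, survey_times),
        upper = ifelse(occurred, survey_times, Inf))
}

# First individual: event before the survey; second: event after the survey
cs_demo <- make_current_status(event_times = c(0.5, 2.5), survey_times = c(1, 1))
print(cs_demo)
```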
This function processes interval-censored data and prepares various components needed for model fitting, including unique time points, censoring intervals, and weights.
data_prep(X)
X |
A matrix or data frame of interval-censored data where each row contains the lower and upper bounds of the interval for each observation. |
A list containing:
Unique time points.
The number of unique time points (excluding infinity if present).
Indices of observations where the event is in the intersection of the L group and the complement of the R group. The L group consists of observations whose left endpoint is <= the minimum of all right endpoints. The R group consists of observations whose right endpoint is infinite.
Indices of observations where the event is in the intersection of the complement of the L group and the R group.
Indices of observations where the event is in the intersection of the complement of the L group and the complement of the R group.
Indices corresponding to the right bounds of the intervals in tau.
Indices corresponding to the left bounds of the intervals in tau.
Unique time points excluding infinity.
Weights for each unique interval.
Processed matrix of interval-censored data with unique rows.
This function computes the directional derivatives for the active set algorithm used to estimate the distribution function under log-concavity with interval-censored data.
The calculation exploits the specific structure of the basis matrix, so it runs in O(n) time.
find_dir_deriv(diff_tau, first_order)
diff_tau |
A numeric vector containing the differences between consecutive time points (tau). |
first_order |
A numeric vector representing the first-order derivatives at each time point. |
A numeric vector of length length(diff_tau) + 1 representing the directional derivatives for the active set algorithm.
This function computes the qsi matrix for specified indices and time points.
find_qsi(is, tau_no_Inf)
is |
Indices of nodes. |
tau_no_Inf |
Unique time points excluding infinity. |
qsi matrix.
Computes the value of the estimated distribution function F(x), or its logarithm log F(x), for a given object of class iclogcondist.
This is a generic function for objects of class iclogcondist.
For usage details, refer to get_F_at_x.iclogcondist.
get_F_at_x(object, ...)
object |
An object for which the method is defined. |
... |
Additional arguments passed to the method. |
A numeric vector of values, either F(x) or log F(x).
# Example usage:
data(lgnm)
# Evaluate for LCMLE object
fit_LCMLE <- ic_LCMLE(lgnm)
get_F_at_x(fit_LCMLE)
# Evaluate for UMLE object
fit_UMLE <- ic_UMLE(lgnm)
x <- seq(0.001, 6, length.out = 1000)
get_F_at_x(fit_UMLE, x = x)
Computes the value of the estimated distribution function F(x), or its logarithm log F(x), for a given object of class iclogcondist.
## S3 method for class 'iclogcondist'
get_F_at_x(object, x = NA, log = FALSE, ...)
object |
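An object of class iclogcondist. |
x |
A numeric vector of values at which F(x) or log F(x) is evaluated. |
log |
Logical; if TRUE, log F(x) is returned; otherwise F(x) is returned. Default is FALSE. |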
... |
Additional arguments (not currently used). |
A numeric vector of values, either F(x) or log F(x).
# Example usage:
data(lgnm)
# Evaluate for LCMLE object
fit_LCMLE <- ic_LCMLE(lgnm)
get_F_at_x(fit_LCMLE)
# Evaluate for UMLE object
fit_UMLE <- ic_UMLE(lgnm)
x <- seq(0.001, 6, length.out = 1000)
get_F_at_x(fit_UMLE, x = x)
This function computes the Least Concave Majorant (LCM) of the log of the unconstrained MLE for interval-censored data.
ic_LCM_UMLE(X)
X |
A matrix with two columns, where each row represents an interval (L, R] for interval-censored data. |
A list containing:
est |
A list with |
knot_info |
A list with |
neg_log_likelihood |
The negative log-likelihood of the LCM fit. |
weight |
Vector of weights corresponding to each interval in the data. |
X |
The original interval-censored data matrix input. |
data(lgnm)
result <- ic_LCM_UMLE(X = lgnm)
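To illustrate what a least concave majorant is, the sketch below computes the upper convex hull of a set of points (x, y) with a single left-to-right scan. It is a generic base-R illustration under stated assumptions, not the routine used by ic_LCM_UMLE: the function name lcm_upper_hull is hypothetical, x is assumed strictly increasing, and the toy y values stand in for log F evaluated on a grid.

```r
# Hypothetical helper: least concave majorant of the points (x, y),
# assuming x is strictly increasing. Not the package's implementation.
lcm_upper_hull <- function(x, y) {
  keep <- integer(0)
  for (i in seq_along(x)) {
    # Drop the last kept point while it lies on or below the chord joining
    # the second-to-last kept point and the new point i
    while (length(keep) >= 2) {
      a <- keep[length(keep) - 1]
      b <- keep[length(keep)]
      cross <- (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
      if (cross >= 0) keep <- keep[-length(keep)] else break
    }
    keep <- c(keep, i)
  }
  # Linear interpolation between the retained knots gives the majorant
  list(knots = keep, values = approx(x[keep], y[keep], xout = x)$y)
}

x <- 1:5
y <- c(0, 2, 1, 3, 3.5)
res <- lcm_upper_hull(x, y)
print(res$knots)
print(res$values)
```

The returned values are piecewise linear between the retained knots and never fall below the input y, which is the defining property of a concave majorant.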
This function computes the maximum likelihood estimate of the cumulative distribution function for interval-censored data under the constraint that the underlying distribution function is log-concave, using an active set algorithm. The active set algorithm adjusts the set of knots based on certain directional derivatives.
ic_LCMLE(X, initial = "LCM", print = FALSE, max_iter = 500, tol_conv = 1e-07, tol_conv_like = 1e-10, tol_K = 1e-05)
X |
A matrix with two columns, where each row represents an interval (L, R] for interval-censored data. |
initial |
A character string specifying the method of obtaining an initial value ("LCM" or "MLE") for the estimation process. Default is "LCM". |
print |
Logical. If |
max_iter |
An integer specifying the maximum number of iterations for the algorithm. Default is 500. |
tol_conv |
A numeric tolerance level for convergence based on the directional derivatives. Default is 1e-07. |
tol_conv_like |
A numeric tolerance level for convergence based on log-likelihood difference. Default is 1e-10. |
tol_K |
A numeric tolerance for checking whether v^T phi is in K (the constraint set) in the active set algorithm. Default is 1e-05. |
A list with the following components:
est |
A list containing |
knot_info |
A list with |
neg_log_likelihood |
Vector of negative log-likelihood values for each iteration of the algorithm. |
dir_derivs |
Vector of directional derivatives for each iteration. |
iter_no |
Integer representing the total number of iterations. |
weight |
Vector of weights corresponding to each interval in the data. |
X |
The original interval-censored data matrix input. |
# Example usage:
data(lgnm)
result <- ic_LCMLE(X = lgnm, initial = "LCM", print = TRUE, max_iter = 500)
print(result$est)
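Log-concavity of F means that phi = log F is concave, so on a grid of time points the successive slopes of phi must be non-increasing. The base-R sketch below checks that property for a vector of values; it is only an illustration. The function name is_concave_on_grid and the tolerance are assumptions, and the sketch does not depend on the structure of the object returned by ic_LCMLE.

```r
# Hypothetical helper: checks concavity of phi on the grid tau by verifying
# that consecutive slopes are non-increasing (up to a small tolerance).
is_concave_on_grid <- function(tau, phi, tol = 1e-8) {
  slopes <- diff(phi) / diff(tau)
  all(diff(slopes) <= tol)
}

tau <- c(1, 2, 3, 4)
concave_case     <- is_concave_on_grid(tau, log(c(0.2, 0.5, 0.8, 1.0)))
non_concave_case <- is_concave_on_grid(tau, log(c(0.1, 0.11, 0.9, 1.0)))
print(c(concave_case, non_concave_case))
```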
This function computes the unconstrained maximum likelihood estimate (UMLE) for interval-censored data.
It uses the non-parametric MLE from the ic_np function in the icenReg package as a starting point and prepares key components such as cumulative probabilities, log-transformed values, and knot information.
ic_UMLE(X)
X |
A matrix with two columns, where each row represents an interval (L, R] for interval-censored data. |
The ic_np function from the icenReg package is used to compute the non-parametric MLE for interval-censored data. This provides initial estimates of probabilities (p_hat) and jump points (knot) in the cumulative distribution function. These are then processed to compute the cumulative probabilities (F_hat) and log-transformed values (phi_hat) at unique time points.
A list containing:
est |
A list with |
knot_info |
A list with |
neg_log_likelihood |
The negative log-likelihood of the MLE fit. |
weight |
Vector of weights corresponding to each interval in the data. |
X |
The original interval-censored data matrix input. |
Anderson-Bergman, C. (2016). An efficient implementation of the EMICM algorithm for the interval censored NPMLE. Journal of Computational and Graphical Statistics.
data(lgnm)
result <- ic_UMLE(X = lgnm)
This function visualizes a user-specified distribution true_dist (if available) and the estimated cumulative distribution function (CDF) F(t) and its logarithm log F(t) over a given range.
The function overlays the estimated functions from a list of fitted models on the same plot, allowing comparison with the user-specified distribution (if provided).
In a simulation study, the user-specified distribution can correspond to the true underlying distribution.
iclogcondist_visualization(X, range = NA, fit_list = list(), true_dist = NA)
X |
A dataset or input data used to prepare the plot range if |
range |
A numeric vector of length 2 specifying the range of |
fit_list |
A named list of fitted models, where each element is expected to contain
an |
true_dist |
Optional. A data frame or list containing the user-specified distribution values,
with components |
A list containing two ggplot objects: logF_plot for log F(t) and F_plot for F(t).
# Example usage
data(lgnm)
fit_LCMLE <- ic_LCMLE(lgnm)
fit_UMLE <- ic_UMLE(lgnm)
iclogcondist_visualization(
  X = lgnm,
  range = c(0, 10),
  fit_list = list(
    "UMLE" = fit_UMLE,
    "LCMLE" = fit_LCMLE
  )
)
This function implements the iterative convex minorant (ICM) algorithm for solving the sub-problem in the active set algorithm.
It supports the active set algorithm by computing the optimal values phi_tilde with a reduced number of knots in the sub-problem.
It uses backtracking to ensure convergence (Jongbloed, 1998).
icm_subset_cpp(phi_tilde_initial, is, tau_no_Inf, L_Rc, Lc_R, Lc_Rc, ri, li, weight, tol = 1e-10, max_iter = 500)
phi_tilde_initial |
A numeric vector representing the initial values of the reduced variables |
is |
A numeric vector indicating the nodes with unequal left-hand slope and right-hand slope. |
tau_no_Inf |
A numeric vector containing the unique time points, excluding infinity. |
L_Rc |
Indices of observations where the event is in the intersection of the L group and the complement of the R group. The L group consists of observations whose left endpoint is <= the minimum of all right endpoints. The R group consists of observations whose right endpoint is infinite. |
Lc_R |
Indices of observations where the event is in the intersection of the complement of the L group and the R group. |
Lc_Rc |
Indices of observations where the event is in the intersection of the complement of the L group and the complement of the R group. |
ri |
A numeric vector of indices corresponding to the right bounds of the intervals in tau. |
li |
A numeric vector of indices corresponding to the left bounds of the intervals in tau. |
weight |
A numeric vector representing the weights for each observation. |
tol |
A numeric value specifying the tolerance for convergence. Default is 1e-10. |
max_iter |
An integer specifying the maximum number of iterations. Default is 500. |
A list containing:
The estimated values of the reduced variable phi_tilde at the end of the ICM iterations.
Jongbloed, G.: The iterative convex minorant algorithm for nonparametric estimation. J. Comput. Gr. Stat. 7(3), 310–321 (1998)
This function obtains initial values for the maximum likelihood estimation under log-concavity with interval-censored data based on the unconstrained maximum likelihood estimate (MLE) or its least concave majorant (LCM). Alternatively, the user can provide a numeric vector of initial values.
initial_values(X, initial = "LCM")
X |
A matrix or data frame of interval-censored data, where each row contains the lower and upper bounds of the interval for each observation. |
initial |
A character string specifying the method for generating initial values. The default is "LCM". |
A list containing:
The initial values of the phi parameter based on the specified method.
Initial values based on the unconstrained MLE.
Initial values based on the least concave majorant.
This dataset, lgnm, provides an example of case II interval-censored data for illustrating the functions in this package.
The event time is simulated from a log-normal distribution with mean = 0 and standard deviation = 1 on the log scale.
The left censoring time is drawn from a uniform distribution between 0 and 2, and the right censoring time is drawn from a uniform distribution between the left censoring time and 20. Both the left and right censoring times are rounded to four decimal places.
A data frame with 100 observations on the following 2 variables:
The left censoring time.
The right censoring time.
Synthetic data generated for illustration purposes.
data(lgnm)
head(lgnm)
This function computes the negative log-likelihood of an interval-censored model based on the specified parameterization.
neg_log_like(x, weight, li, ri, L_Rc, Lc_R, Lc_Rc, type = "", tau_no_Inf)
x |
A numeric vector of parameter estimates (can be in terms of |
weight |
A numeric vector of weights for the observations. |
li |
A numeric vector of indices corresponding to the left bounds of the intervals in tau. |
ri |
A numeric vector of indices corresponding to the right bounds of the intervals in tau. |
L_Rc |
Indices of observations where the event is in the intersection of the L group and the complement of the R group. The L group consists of observations whose left endpoint is <= the minimum of all right endpoints. The R group consists of observations whose right endpoint is infinite. |
Lc_R |
Indices of observations where the event is in the intersection of the complement of the L group and the R group. |
Lc_Rc |
Indices of observations where the event is in the intersection of the complement of the L group and the complement of the R group. |
type |
A character string indicating the parameterization of |
tau_no_Inf |
A numeric vector of unique time points excluding infinity. |
The negative log-likelihood value.
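For interval-censored data, an observation known only to lie in (L, R] contributes F(R) - F(L) to the likelihood, with F(0) = 0 and F(Inf) = 1. The base-R sketch below evaluates the weighted negative log-likelihood directly from values of F at the interval endpoints. It is an illustration of the formula only: the function nll_interval and its arguments are hypothetical and do not mirror the indexing scheme (li, ri, L_Rc, and so on) used by neg_log_like.

```r
# Hypothetical helper: weighted negative log-likelihood for intervals (L, R],
# given F evaluated at the left and right endpoints of each interval.
nll_interval <- function(F_left, F_right, weight = rep(1, length(F_left))) {
  -sum(weight * log(F_right - F_left))
}

# Two distinct intervals: (0, t] observed twice and (t, Inf) observed once,
# with F(t) = 0.5, F(0) = 0 and F(Inf) = 1
nll_demo <- nll_interval(F_left = c(0, 0.5), F_right = c(0.5, 1), weight = c(2, 1))
print(nll_demo)
```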
This function generates a plot for objects of class iclogcondist, which are typically generated by ic_UMLE, ic_LCM_UMLE, or ic_LCMLE. The plot can display either the cumulative distribution function F(t) or the log cumulative distribution function logF(t), depending on the setting of the log parameter.
## S3 method for class 'iclogcondist'
plot(x, log = FALSE, ...)
x |
An object of class iclogcondist. |
log |
Logical; if TRUE, logF(t) is plotted; if FALSE, F(t) is plotted. Default is FALSE. |
... |
Additional arguments passed to the plotting function. |
An invisible ggplot object representing the plot. The plot is also displayed in the current graphics device.
# Example usage with ic_UMLE, ic_LCM_UMLE, and ic_LCMLE
data(lgnm)
X <- lgnm
fit_UMLE <- ic_UMLE(X)
fit_LCM_UMLE <- ic_LCM_UMLE(X)
fit_LCMLE <- ic_LCMLE(X)
plot(fit_UMLE, log = TRUE)       # Plot logF(t) for UMLE
plot(fit_LCM_UMLE, log = FALSE)  # Plot F(t) for LCM_UMLE
plot(fit_LCMLE, log = FALSE)     # Plot F(t) for LCMLE
This function computes the cumulative distribution function (CDF) of a truncated log-logistic distribution at a given point x.
ptllogis(x, shape = 1, scale = 1, upper_bound = Inf)
x |
A numeric vector at which to evaluate the CDF. |
shape |
A positive numeric value representing the shape parameter of the log-logistic distribution. Default is 1. |
scale |
A positive numeric value representing the scale parameter of the log-logistic distribution. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of the CDF values of the truncated log-logistic distribution at x.
# Evaluate the CDF at x = 2 for a truncated log-logistic distribution
ptllogis(2, shape = 2, scale = 1, upper_bound = 5)
This function computes the cumulative distribution function (CDF) of a truncated log-normal distribution at a given point x.
ptlnorm(x, meanlog = 0, sdlog = 1, upper_bound = Inf)
x |
A numeric vector at which to evaluate the CDF. |
meanlog |
A numeric value representing the mean of the log-normal distribution on the log scale. Default is 0. |
sdlog |
A positive numeric value representing the standard deviation of the log-normal distribution on the log scale. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of the CDF values of the truncated log-normal distribution at x.
# Evaluate the CDF at x = 2 for a truncated log-normal distribution
ptlnorm(2, meanlog = 0, sdlog = 1, upper_bound = 5)
This function computes the cumulative distribution function (CDF) of a truncated Weibull distribution at a given point x.
ptweibull(x, shape = 1, scale = 1, upper_bound = Inf)
x |
A numeric vector at which to evaluate the CDF. |
shape |
A positive numeric value representing the shape parameter of the Weibull distribution. Default is 1. |
scale |
A positive numeric value representing the scale parameter of the Weibull distribution. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of the CDF values of the truncated Weibull distribution at x.
# Evaluate the CDF at x = 2 for a truncated Weibull distribution
ptweibull(2, shape = 2, scale = 1, upper_bound = 5)
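For a distribution with CDF F truncated from above at upper_bound, the truncated CDF is F(x) / F(upper_bound) for x up to the truncation point and 1 beyond it. The sketch below applies that textbook formula using base R's pweibull. The name ptweibull_sketch is hypothetical, and whether the package's ptweibull is implemented in exactly this way is an assumption.

```r
# Hypothetical helper: CDF of a Weibull distribution truncated to (0, upper_bound],
# built from the standard formula F(x) / F(upper_bound).
ptweibull_sketch <- function(x, shape = 1, scale = 1, upper_bound = Inf) {
  # pmin() caps x at the truncation point so the CDF equals 1 beyond it
  pweibull(pmin(x, upper_bound), shape, scale) / pweibull(upper_bound, shape, scale)
}

p_inside  <- ptweibull_sketch(2, shape = 2, scale = 1, upper_bound = 5)
p_beyond  <- ptweibull_sketch(7, shape = 2, scale = 1, upper_bound = 5)
p_untrunc <- ptweibull_sketch(2, shape = 2, scale = 1)
print(c(p_inside, p_beyond, p_untrunc))
```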
This function computes the quantiles of a truncated log-logistic distribution for a given probability vector.
qtllogis(q, shape = 1, scale = 1, upper_bound = Inf)
q |
A numeric vector of probabilities for which to calculate the quantiles. |
shape |
A positive numeric value representing the shape parameter of the log-logistic distribution. Default is 1. |
scale |
A positive numeric value representing the scale parameter of the log-logistic distribution. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of quantiles corresponding to the given probabilities in q.
# Calculate the 0.5 quantile of a truncated log-logistic distribution
qtllogis(0.5, shape = 2, scale = 1, upper_bound = 5)
This function computes the quantiles of a truncated log-normal distribution for a given probability vector.
qtlnorm(q, meanlog = 0, sdlog = 1, upper_bound = Inf)
q |
A numeric vector of probabilities for which to calculate the quantiles. |
meanlog |
A numeric value representing the mean of the log-normal distribution on the log scale. Default is 0. |
sdlog |
A positive numeric value representing the standard deviation of the log-normal distribution on the log scale. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of quantiles corresponding to the given probabilities in q.
# Calculate the 0.5 quantile of a truncated log-normal distribution
qtlnorm(0.5, meanlog = 0, sdlog = 1, upper_bound = 5)
This function computes the quantiles of a truncated Weibull distribution for a given probability vector.
qtweibull(q, shape = 1, scale = 1, upper_bound = Inf)
q |
A numeric vector of probabilities for which to calculate the quantiles. |
shape |
A positive numeric value representing the shape parameter of the Weibull distribution. Default is 1. |
scale |
A positive numeric value representing the scale parameter of the Weibull distribution. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of quantiles corresponding to the given probabilities in q.
# Calculate the 0.5 quantile of a truncated Weibull distribution
qtweibull(0.5, shape = 2, scale = 1, upper_bound = 5)
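A quantile of an upper-truncated distribution can be obtained by rescaling the probability: if F is the untruncated CDF, the q-quantile of the truncated distribution is the untruncated quantile at q * F(upper_bound). The sketch below uses base R's pweibull and qweibull. The name qtweibull_sketch is hypothetical, and it is an assumption that the package's qtweibull follows the same formula.

```r
# Hypothetical helper: quantiles of a Weibull distribution truncated at upper_bound,
# obtained by rescaling the probabilities q by F(upper_bound).
qtweibull_sketch <- function(q, shape = 1, scale = 1, upper_bound = Inf) {
  qweibull(q * pweibull(upper_bound, shape, scale), shape, scale)
}

probs <- c(0.1, 0.5, 0.9)
quants <- qtweibull_sketch(probs, shape = 2, scale = 1, upper_bound = 5)

# Round trip: the truncated CDF evaluated at these quantiles recovers probs
round_trip <- pweibull(quants, 2, 1) / pweibull(5, 2, 1)
print(quants)
print(round_trip)
```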
This function generates random samples from a truncated log-logistic distribution using an acceptance-rejection method.
rtllogis(n, shape = 1, scale = 1, upper_bound = Inf)
n |
An integer specifying the number of random samples to generate. |
shape |
A positive numeric value representing the shape parameter of the log-logistic distribution. Default is 1. |
scale |
A positive numeric value representing the scale parameter of the log-logistic distribution. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of n random samples from the truncated log-logistic distribution.
# Generate 10 random samples from a truncated log-logistic distribution
rtllogis(10, shape = 2, scale = 1, upper_bound = 5)
This function generates random samples from a truncated log-normal distribution using an acceptance-rejection method.
rtlnorm(n, meanlog = 0, sdlog = 1, upper_bound = Inf)
n |
An integer specifying the number of random samples to generate. |
meanlog |
A numeric value representing the mean of the log-normal distribution on the log scale. Default is 0. |
sdlog |
A positive numeric value representing the standard deviation of the log-normal distribution on the log scale. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of n random samples from the truncated log-normal distribution.
# Generate 10 random samples from a truncated log-normal distribution
rtlnorm(10, meanlog = 0, sdlog = 1, upper_bound = 5)
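One simple form of acceptance-rejection for an upper-truncated distribution is to draw from the untruncated distribution and keep only the draws that do not exceed the truncation point. The base-R sketch below does this with rlnorm. The name rtlnorm_sketch is hypothetical, and the exact proposal and batching scheme used by the package's rtlnorm is not stated in this documentation, so this is an assumption.

```r
# Hypothetical helper: acceptance-rejection sampling from a log-normal
# distribution truncated at upper_bound. Proposals come from the untruncated
# log-normal; draws above upper_bound are rejected.
rtlnorm_sketch <- function(n, meanlog = 0, sdlog = 1, upper_bound = Inf) {
  out <- numeric(0)
  while (length(out) < n) {
    draw <- rlnorm(n, meanlog, sdlog)
    out <- c(out, draw[draw <= upper_bound])
  }
  out[seq_len(n)]
}

set.seed(1)
samples_ln <- rtlnorm_sketch(1000, meanlog = 0, sdlog = 1, upper_bound = 5)
summary(samples_ln)
```

This scheme becomes slow when upper_bound lies far in the lower tail, because most proposals are rejected.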
This function generates random samples from a truncated Weibull distribution using inverse transform sampling. When shape = 1, it reduces to a truncated exponential distribution.
rtweibull(n, shape = 1, scale = 1, upper_bound = Inf)
n |
An integer specifying the number of random samples to generate. |
shape |
A positive numeric value representing the shape parameter of the Weibull distribution. Default is 1. |
scale |
A positive numeric value representing the scale parameter of the Weibull distribution. Default is 1. |
upper_bound |
A positive numeric value indicating the upper truncation point. Default is Inf. |
A numeric vector of n random samples from the truncated Weibull distribution.
# Generate 10 random samples from a truncated Weibull distribution
rtweibull(10, shape = 2, scale = 1, upper_bound = 5)
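Inverse transform sampling for an upper-truncated distribution can be sketched as follows: draw U uniformly on (0, F(upper_bound)) and apply the untruncated quantile function. The example uses base R's runif, pweibull and qweibull. The name rtweibull_sketch is hypothetical, and it is an assumption that the package's rtweibull proceeds in exactly this way.

```r
# Hypothetical helper: inverse transform sampling from a Weibull distribution
# truncated at upper_bound.
rtweibull_sketch <- function(n, shape = 1, scale = 1, upper_bound = Inf) {
  # Uniform draws restricted to (0, F(upper_bound))
  u <- runif(n) * pweibull(upper_bound, shape, scale)
  qweibull(u, shape, scale)
}

set.seed(1)
samples_wb <- rtweibull_sketch(1000, shape = 2, scale = 1, upper_bound = 5)
summary(samples_wb)
```

Unlike acceptance-rejection, every uniform draw yields one sample, so the cost does not depend on how much probability mass the truncation removes.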
This function generates interval-censored data, where the event times are generated from one of the following distributions: Weibull, log-normal and log-logistic. It supports both case 1 and case 2 interval censoring.
simulate_ic_data(n, dist, para1, para2, upper_bound = Inf, C1_upper = 1, case = 2, rounding = FALSE, round_digit = 4)
n |
An integer specifying the number of observations to generate. |
dist |
A character string indicating the distribution to use for event times.
Options are |
para1 |
A numeric value representing the first parameter of the distribution:
|
para2 |
A numeric value representing the second parameter of the distribution:
|
upper_bound |
A numeric value specifying the upper bound for event times, corresponding to a truncated distribution. Default is Inf. |
C1_upper |
A numeric value specifying the upper limit for the first censoring time |
case |
An integer specifying the censoring case to simulate:
|
rounding |
A logical value. If |
round_digit |
An integer specifying the number of digits for rounding when |
**Censoring Times**:
In case = 1 (current status), one censoring time is generated from U(0, C1_upper).
In case = 2 (case 2 interval censoring), two censoring times are generated:
C1: sampled from U(0, C1_upper).
C2: sampled from U(C1, min(upper_bound, 20)).
**Distributions**:
**Weibull**: Parameterized by shape (para1) and scale (para2).
**Log-logistic**: Parameterized by shape (para1) and scale (para2).
**Log-normal**: Parameterized by mean (para1) and standard deviation (para2).
A matrix of interval-censored data where each row represents an interval (L, R] containing the unobserved event time.
# Simulate data with a truncated Weibull distribution and case II interval censoring
simulate_ic_data(n = 100, dist = "weibull", para1 = 2, para2 = 1, upper_bound = 5, case = 2)
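The case 2 scheme described above can be sketched end to end in base R: draw event times from a truncated Weibull, draw C1 from U(0, C1_upper) and C2 from U(C1, min(upper_bound, 20)), then record which of the three intervals contains each event. The function simulate_case2_sketch is hypothetical and covers only the Weibull, case 2 setting without rounding. The tie-handling and the requirement that C1_upper be smaller than min(upper_bound, 20) are assumptions made for this illustration.

```r
# Hypothetical helper: simulate case 2 interval-censored data with Weibull
# event times truncated at upper_bound. Not the package's simulate_ic_data().
simulate_case2_sketch <- function(n, shape, scale, upper_bound = Inf, C1_upper = 1) {
  stopifnot(C1_upper < min(upper_bound, 20))
  # Event times by inverse transform from the truncated Weibull
  u <- runif(n) * pweibull(upper_bound, shape, scale)
  event <- qweibull(u, shape, scale)
  # Two ordered censoring times per individual
  C1 <- runif(n, 0, C1_upper)
  C2 <- runif(n, C1, min(upper_bound, 20))
  # Record the interval (L, R] that contains the event time
  lower <- ifelse(event <= C1, 0, ifelse(event <= C2, C1, C2))
  upper <- ifelse(event <= C1, C1, ifelse(event <= C2, C2, Inf))
  cbind(lower = lower, upper = upper)
}

set.seed(1)
X_sim <- simulate_case2_sketch(n = 50, shape = 2, scale = 1, upper_bound = 5)
head(X_sim)
```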
This function finds the unique rows of a given matrix and calculates the frequency (weight) of each unique row. It returns both the unique rows and the weights (the number of occurrences of each row).
unique_X_weight(X)
X |
A matrix. The matrix whose unique rows are to be found. |
A list containing two components:
A matrix of the unique rows from the input matrix.
An integer vector containing the frequency (weight) of each unique row.
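Finding unique rows together with their frequencies can be sketched in base R by building one key per row and counting the keys. The helper name unique_rows_with_weight and the names of the returned components are hypothetical; the string key relies on R's default number-to-character conversion, which is an assumption that suits data rounded to a few decimal places.

```r
# Hypothetical helper (not the package's unique_X_weight): unique rows of a
# matrix and the number of times each one occurs, in order of first appearance.
unique_rows_with_weight <- function(X) {
  key <- apply(X, 1, paste, collapse = "\r")   # one string key per row
  first <- !duplicated(key)                      # first occurrence of each row
  counts <- table(factor(key, levels = key[first]))
  list(X = X[first, , drop = FALSE], weight = as.integer(counts))
}

X_dup <- rbind(c(0, 1), c(1, Inf), c(0, 1))
u <- unique_rows_with_weight(X_dup)
print(u$X)
print(u$weight)
```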