Package 'transmdl'

Title: Semiparametric Transformation Models
Description: To make the semiparametric transformation models easier to apply in real studies, we introduce this R package, in which the MLE in transformation models via an EM algorithm proposed by Zeng D, Lin DY(2007) <doi:10.1111/j.1369-7412.2007.00606.x> and adaptive lasso method in transformation models proposed by Liu XX, Zeng D(2013) <doi:10.1093/biomet/ast029> are implemented. C++ functions are used to compute complex loops. The coefficient vector and cumulative baseline hazard function can be estimated, along with the corresponding standard errors and P values.
Authors: Fengyu Wen [aut, cre], Baosheng Liang [aut]
Maintainer: Fengyu Wen <[email protected]>
License: GPL (>= 2)
Version: 0.1.0
Built: 2024-10-31 22:28:25 UTC
Source: CRAN

Help Index


Estimate parameters and hazard function via EM algorithm.

Description

Estimate the vector of parameters for baseline covariates β\beta and baseline cumulative hazard function Λ()\Lambda(\cdot) using the expectation-maximization algorithm. Λ(t)\Lambda(t) is estimated as a step function with jumps only at the observed failure times. Typically, it would only be used in a call to trans.m or Simu.

Usage

EM_est(Y, X, delta, alpha, Q = 60, EM_itmax = 250)

Arguments

Y

observed event times

X

design matrix

delta

censoring indicator. If YiY_i is censored, delta=0. If not, delta=1.

alpha

parameter in transformation function

Q

number of nodes and weights in Gaussian quadrature. Defaults to 60.

EM_itmax

maximum iteration of EM algorithm. Defaults to 250.

Value

a list containing

beta_new estimator of β\beta
Lamb_Y estimator of Λ(Y)\Lambda(Y)
lamb_Y estimator of λ(Y)\lambda(Y)
lamb_Ydot estimator of λ(Y)\lambda(Y')
Y_eq_Yhat a matrix used in trans.m and Simu
Y_geq_Yhat a matrix used in trans.m and Simu

References

Abramowitz, M., and Stegun, I.A. (1972). Handbook of Mathematical Functions (9th ed.). Dover Publications, New York.

Evans, M. and Swartz, T. (2000). Approximating Integrals via Monte Carlo and Deterministic Methods. Oxford University Press.

Liu, Q. and Pierce, D.A. (1994). A note on Gauss-Hermite quadrature. Biometrika 81: 624-629.

Examples

gen_data = generate_data(200, 1, 0.5, c(-0.5, 1))
  delta = gen_data$delta
  Y = gen_data$Y
  X = gen_data$X
  EM_est(Y, X, delta, alpha = 1)$beta_new - c(-0.5, 1)

Generate data for simulation.

Description

Generate observed event times, covariates and other data used for simulations in the paper.

Usage

generate_data(n, alpha, rho, beta_true, now_repeat = 1)

Arguments

n

number of subjects

alpha

parameter in transformation function

rho

parameter in baseline cumulative hazard function Λ(t)=ρlog(1+t)\Lambda(t) = \rho log(1+t) assumed in simulation

beta_true

parameter β\beta

now_repeat

number of duplication of simulation

Details

The survival function for tt of the iith observation takes the form

Si(tXi)=exp{H{Λ(t)exp(βTXi)}}.S_{i}(t| X_{i}) = \exp\left\{-H \{\Lambda(t) \exp ( \beta^T X_{i} ) \}\right\}.

The failure time TiT_i can be generated by

Ti={exp{Uα1αρexp{βTXi}}1α>0,exp{log(U)ρexp{βTXi}}1,α=0.}T_i = \left\{\begin{array}{l l} \exp\{ \frac{U^{-\alpha}-1}{\alpha\rho\exp\{\beta^TX_i \}} \}-1& \alpha > 0, \\ \exp\{ \frac{-log(U)}{\rho\exp\{\beta^TX_i \}} \}-1, & \alpha = 0. \end{array}\right\}

Value

a list containing

X design matrix
beta_X XβTX\cdot\beta^T
Y observed event time

References

Abramowitz, M., and Stegun, I.A. (1972). Handbook of Mathematical Functions (9th ed.). Dover Publications, New York. +- Evans, M. and Swartz, T. (2000). Approximating Integrals via Monte Carlo and Deterministic Methods. Oxford University Press.

Liu, Q. and Pierce, D.A. (1994). A note on Gauss-Hermite quadrature. Biometrika 81: 624-629.

Examples

generate_data(200,0.5,1,c(0.5,-1))

Adaptive LASSO for Semiparametric Transformation Models.

Description

Select the important variables in semiparametric transformation models for right censored data using adaptive lasso.

Usage

trans_alasso(Z, Y, delta_i, r, lamb_vec, solu = TRUE)

Arguments

Z

the baseline covariates

Y

observed event times

delta_i

censoring indicator. If YY is censored, delta_i=0. If not, delta_i=1.

r

parameter in transformation function

lamb_vec

the grad of the tuning parameter λ\lambda

solu

determines whether the solution path will be plotted. The default is TRUE.

Details

The initial value of the coefficient β\beta used as the adapting weights is EM estimator, which is computed by the function EM_est. The tuning parameter λ\lambda is data-dependent and we select it using generalized crossvalidation. There may be some errors for small λ\lambda, in which case the λ\lambda and the number of adaptive lasso iteration are recorded in the skip_para.

Value

a list containing

beta_res the estimated β\beta with the selected tuning parameter λ\lambda
GCV_res the value of GCV with the selected tuning parameter λ\lambda
lamb_res the selected tuning parameter λ\lambda
beta_all estimated β\beta with all tuning parameters
CSV_all value of GCV with all tuning parameters
skip_para a list containing the λ\lambda and the number of adaptive lasso iteration when adaptive lasso doesn't work.

References

Xiaoxi, L. , & Donglin, Z. . (2013). Variable selection in semiparametric transformation models for right-censored data. Biometrika(4), 859-876.

Examples

if(!requireNamespace("MASS", quietly = TRUE))
 {stop("package MASS needed for this example. Please install it.")}

 gen_lasdat = function(n,r,rho,beta_true,a,b,seed=66,std = FALSE)
{

  set.seed(seed)
  beta_len = length(beta_true)
  beta_len = beta_len
  sigm = matrix(0, nrow = beta_len, ncol = beta_len)
  for(i in 1:(beta_len-1))
  {
    diag(sigm[1:(beta_len+1-i),i:beta_len]) = rho^(i-1)
  }
  sigm[1,beta_len] = rho^(beta_len-1)
  sigm[lower.tri(sigm)] = t(sigm)[lower.tri(sigm)]

  Z = MASS::mvrnorm(n, mu = rep(0, beta_len), Sigma = sigm)
  beta_Z.true = c(Z %*% beta_true)
  U = runif(n)
  if(r>0)
  {
    t = ((U^(-r)-1)/(a*r*exp(beta_Z.true)))^(1/b)
  }else if(r == 0)
  {
    t = (-log(U)/(a*exp(beta_Z.true)))^(1/b)
    #t = (exp(-log(U)/(0.5 * exp(beta_Z.true))) - 1)
  }
  C = runif(n,0,8)
  Y = pmin(C,t)
  delta_i = ifelse( C >= t, 1, 0)
  if(std)
  {
    Z = apply(Z,2,normalize)
  }
  return(list(Z = Z, Y = Y, delta_i = delta_i,censor = mean(1-delta_i)))
}

now_rep=1
dat = gen_lasdat(100,1,0.5,c(0.3,0.5,0.7,0,0,0,0,0,0,0),2,5,seed= 6+60*now_rep,std = FALSE)
Z = dat$Z
Y = dat$Y
delta_i = dat$delta_i

tra_ala = trans_alasso(Z,Y,delta_i,lamb_vec = c(5,7),r=1)
tra_ala$GCV_res
tra_ala$beta_res
tra_ala$lamb_res

Regression Analysis of Right-censored Data using Semiparametric Transformation Models.

Description

This function is used to conduct the regression analysis of right-censored data using semiparametric transformation models. It calculates the estimators, standard errors and p values. A plot of estimated baseline cumulative hazard function and confidence intervals can be produced.

Usage

trans_m(
  X,
  delta,
  Y,
  plot.Lamb = TRUE,
  alpha = seq(0.1, 1.1, by = 0.1),
  trsmodel = TRUE,
  EM_itmax = 250,
  show_res = TRUE
)

Arguments

X

design matrix

delta

censoring indicator. If YiY_i is censored, delta=0. If not, delta=1.

Y

observed event times

plot.Lamb

If TRUE, plot the estimated baseline cumulative hazard function and confidence intervals. The default is TRUE.

alpha

parameter in transformation function. Generally, α\alpha can not be observed in medical applications. In that situation, alpha indicates the scale of choosing α\alpha. The default is (0.1,0.2,...,1.1)(0.1,0.2,...,1.1). If α\alpha is known, alpha indicates the true value of α\alpha.

trsmodel

logical value indicating whether to implement transformation models. The default is TRUE.

EM_itmax

maximum iteration of EM algorithm. Defaults to 250.

show_res

show results after trans_m.

Details

If α\alpha is unknown, we firse set α=\alpha=alpha. Then, for each α\alpha, we estimate the parameters and record the value of observed log-likelihood function. The α\alpha that maximizes the observed log-likelihood function and the corresponding β^\hat\beta and Λ^()\hat\Lambda(\cdot) are chosen as the best estimators. Nonparametric maximum likelihood estimators are developed for the regression parameters and cumulative intensity functions of these models based on censored data.

Value

a list containing

beta.est estimators of β\beta
SE.beta standard errors of the estimated β\beta
SE.Ydot standard errors of the estimated Λ(Y)\Lambda(Y')
Ydot vector of sorted event times with duplicate values removed
Lamb.est estimated baseline cumulative hazard
lamb.est estimated jump sizes of baseline cumulative hazard function
choose.alpha the chosen α\alpha
Lamb.upper upper confidence limits for the estimated baseline cumulative hazard function
Lamb.lower lower confidence limits for the estimated baseline cumulative hazard function
p.beta P values of estimated β\beta
p.Lamb P values of estimated baseline cumulative hazard
p.beta

References

Cheng, S.C., Wei, L.J., and Ying, Z. (1995). Analysis of transformation models with censored data. Biometrika 82, 835-845.

Zeng, D. and Lin, D.Y. (2007). Maximum likelihood estimation in semiparametric regression models with censored data. J. R. Statist. Soc. B 69, 507-564.

Abramowitz, M., and Stegun, I.A. (1972). Handbook of Mathematical Functions (9th ed.). Dover Publications, New York.

Evans, M. and Swartz, T. (2000). Approximating Integrals via Monte Carlo and Deterministic Methods. Oxford University Press.

Liu, Q. and Pierce, D.A. (1994). A note on Gauss-Hermite quadrature. Biometrika 81, 624-629.

Louis, T. (1982). Finding the Observed Information Matrix when Using the EM Algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 44(2), 226-233.

See Also

EM_est

Examples

gen_data = generate_data(200, 1, 0.5, c(-0.5,1))
 delta = gen_data$delta
 Y = gen_data$Y
 X = gen_data$X
 res.trans = trans_m(X, delta,  Y, plot.Lamb = TRUE, show_res = FALSE)