Title: | Accelerated Failure Time for High Dimensional Data with MCMC |
---|---|
Description: | Functions for posterior estimates of the Accelerated Failure Time (AFT) model with MCMC, and maximum likelihood estimates of the AFT model without MCMC, for univariate and multivariate analysis of high dimensional gene expression data are available in the 'afthd' package. An AFT model in a Bayesian framework for multivariate analysis of high dimensional data was proposed by Prabhash et al. (2016) <doi:10.21307/stattrans-2016-046>. |
Authors: | Atanu Bhattacharjee [aut, cre, ctb], Gajendra Kumar Vishwakarma [aut, ctb], Pragya Kumari [aut, ctb] |
Maintainer: | Atanu Bhattacharjee <[email protected]> |
License: | GPL-3 |
Version: | 1.1.0 |
Built: | 2024-11-13 06:36:49 UTC |
Source: | CRAN |
Selects the best-fitting model (the one with the minimum deviance, DIC) for survival data among the Weibull, log normal and log logistic parametric AFT models, fitted using MCMC for multivariable analysis (maximum 5 covariates at a time) in high dimensional data.
aftbybmv(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of MCMC chains. |
ni | Number of MCMC iterations used to update the outcome. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
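The mean replacement described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the helper `impute_column_means` and the toy data frame are hypothetical.

```r
# Hypothetical helper: replace each missing covariate value by the
# mean of the observed values in the same column.
impute_column_means <- function(df, cols) {
  for (j in cols) {
    col_mean <- mean(df[[j]], na.rm = TRUE)  # mean of the observed values
    df[[j]][is.na(df[[j]])] <- col_mean      # fill only the missing entries
  }
  df
}

# Toy data (illustrative values only, not taken from hdata)
toy <- data.frame(os    = c(10, 20, 30, 40),
                  death = c(1, 0, 1, 0),
                  g1    = c(1, NA, 3, 5),
                  g2    = c(NA, 2, 4, 6))

# Columns 3 to 4 play the role of the covariate range m to n
filled <- impute_column_means(toy, 3:4)
print(filled)
```

Only the covariate columns are touched; the survival time and event columns are left as they are.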
The AFT model is a log-linear regression model for the survival times $T_1, T_2, \ldots, T_n$, i.e.,
$$\log(T_i) = x_i'\beta + \sigma\epsilon_i, \qquad \epsilon_i \sim F_\epsilon(\cdot),$$
where $F_\epsilon$ is a known cdf defined on the real line.
When the baseline distribution is extreme value, T follows a Weibull distribution.
To make the interpretation of the regression coefficients simpler, an extreme value distribution with median 0 is used.
Using the Weibull distribution then leads to the AFT model when
$$T \sim \mathrm{Weibull}\left(\sqrt{\tau},\ \log(2)\exp\left(-x'\beta\sqrt{\tau}\right)\right), \qquad \tau = 1/\sigma^2.$$
When the baseline distribution is normal, T follows a log normal distribution.
When the baseline distribution is logistic, T follows a log logistic distribution.
A data frame containing the posterior estimates (mean, sd, credible intervals, n.eff and Rhat) for the beta's, sigma, tau and deviance of the model for the selected covariates. The beta's are the regression coefficients of the model: beta[1] is the intercept and the others correspond to the covariates, in the order of their columns in data. 'sigma' is the scale parameter of the distribution. DIC is an estimate of the expected predictive error, so a lower DIC denotes a better fit.
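Because a lower DIC denotes a better fit, the comparison among the three distributions amounts to picking the minimum. A minimal sketch, assuming the DIC of each fit has already been extracted; the numbers below are made-up placeholders, not results from hdata, and the vector name `dic` is hypothetical:

```r
# Placeholder DIC values for the three candidate AFT models
# (illustrative only, not output of the package)
dic <- c(weibull = 812.4, lognormal = 805.1, loglogistic = 809.7)

# The preferred model is the one with the smallest DIC
best_model <- names(dic)[which.min(dic)]
print(best_model)
```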
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysmv, lgnbymv, lgstbymv
data(hdata)
aftbybmv(10,12,STime="os",Event="death",2,100,hdata)
High dimensional head and neck cancer gene expression data
data(hdata)
A dataframe with 565 rows and 104 variables
ID of subjects
Initial censoring time
Survival event
death due to other causes
Duration of overall survival
Duration of progression free survival
Progression event
High dimensional covariates
data(hdata)
Provides posterior estimates of the AFT model with log normal distribution using a Bayesian approach for multivariate analysis (maximum 5 covariates at a time) in high dimensional gene expression data. It also handles covariates with missing values.
lgnbymv(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of MCMC chains. |
ni | Number of MCMC iterations used to update the outcome. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
The AFT model is a log-linear regression model for the survival times $T_1, T_2, \ldots, T_n$, i.e.,
$$\log(T_i) = x_i'\beta + \sigma\epsilon_i, \qquad \epsilon_i \sim F_\epsilon(\cdot),$$
where $F_\epsilon$ is a known cdf defined on the real line.
When the baseline distribution is normal, T follows a log normal distribution.
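The link between a normal baseline and a log normal survival time can be checked numerically. The sketch below assumes the log-linear form log(T) = mu + sigma * eps with eps standard normal, where `mu` stands for the linear predictor of one subject; all numeric values are illustrative.

```r
# Illustrative values only
mu     <- 1.2                  # linear predictor x'beta for one subject
sigma  <- 0.5                  # scale parameter
t_grid <- c(0.5, 1, 2, 5, 10)  # time points at which to compare the cdfs

# cdf of T implied by the log-linear model with standard normal error
cdf_from_aft <- pnorm((log(t_grid) - mu) / sigma)

# cdf of a log normal distribution with meanlog = mu and sdlog = sigma
cdf_lognormal <- plnorm(t_grid, meanlog = mu, sdlog = sigma)

same_cdf <- isTRUE(all.equal(cdf_from_aft, cdf_lognormal))
print(same_cdf)
```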
A data frame containing the mean, sd, n.eff, Rhat and credible intervals for the beta's, sigma, tau and deviance of the model for the chosen covariates. beta[1] is the intercept and the others correspond to the covariates (the columns chosen in data). sigma is the scale parameter of the distribution.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
lgnbyuni, wbysmv, lgstbymv
data(hdata)
lgnbymv(10,12,STime="os",Event="death",2,100,hdata)
Provides posterior estimates of the AFT model with log normal distribution using a Bayesian approach for univariate analysis in high dimensional gene expression data. It also handles covariates with missing values.
lgnbyuni(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of MCMC chains. |
ni | Number of MCMC iterations used to update the outcome. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
The AFT model is a log-linear regression model for the survival times $T_1, T_2, \ldots, T_n$, i.e.,
$$\log(T_i) = x_i'\beta + \sigma\epsilon_i, \qquad \epsilon_i \sim F_\epsilon(\cdot),$$
where $F_\epsilon$ is a known cdf defined on the real line.
When the baseline distribution is normal, T follows a log normal distribution.
A data frame containing the posterior estimates (Coef, SD, Credible Interval, Rhat, n.eff) of the regression coefficients of the selected covariates, and the deviance. Results are shown together for all covariates chosen from column m to n.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
lgnbymv, wbysuni, lgstbyuni
data(hdata)
lgnbyuni(10,12,STime="os",Event="death",2,10,hdata)
Provides estimates of the AFT model with log logistic distribution using MCMC for multivariable analysis (maximum 5 covariate columns at a time) in high dimensional gene expression data. It also handles covariates with missing values.
lgstbymv(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of MCMC chains. |
ni | Number of MCMC iterations used to update the outcome. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
The AFT model is a log-linear regression model for the survival times $T_1, T_2, \ldots, T_n$, i.e.,
$$\log(T_i) = x_i'\beta + \sigma\epsilon_i, \qquad \epsilon_i \sim F_\epsilon(\cdot),$$
where $F_\epsilon$ is a known cdf defined on the real line.
When the baseline distribution is logistic, T follows a log logistic distribution.
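The same kind of numerical check works for the logistic baseline. The sketch assumes log(T) = mu + sigma * eps with eps standard logistic, and writes the log logistic cdf by hand with scale exp(mu) and shape 1/sigma, since base R has no built-in log logistic distribution; the values are illustrative.

```r
# Illustrative values only
mu     <- 1.2                  # linear predictor x'beta for one subject
sigma  <- 0.5                  # scale parameter
t_grid <- c(0.5, 1, 2, 5, 10)  # time points at which to compare the cdfs

# cdf of T implied by the log-linear model with standard logistic error
cdf_from_aft <- plogis((log(t_grid) - mu) / sigma)

# Log logistic cdf with scale exp(mu) and shape 1 / sigma, written explicitly
cdf_loglogistic <- 1 / (1 + (t_grid / exp(mu))^(-1 / sigma))

same_cdf <- isTRUE(all.equal(cdf_from_aft, cdf_loglogistic))
print(same_cdf)
```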
A data frame containing the mean, sd, n.eff, Rhat and credible intervals (2.5%, 25%, 50%, 75% and 97.5%) for the beta's, sigma, tau and deviance of the model for the selected covariates. beta[1] is the intercept and the others correspond to the covariates (the columns chosen in data). sigma is the scale parameter of the distribution.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysmv, lgnbymv, lgstbyuni
data(hdata)
lgstbymv(10,12,STime="os",Event="death",5,100,hdata)
Provides estimates of the AFT model with log logistic distribution using MCMC for univariate analysis in high dimensional gene expression data. It also handles covariates with missing values.
lgstbyuni(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data. |
nc | Number of chains used in the model. |
ni | Number of iterations used in the model. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
The AFT model is a log-linear regression model for the survival times $T_1, T_2, \ldots, T_n$, i.e.,
$$\log(T_i) = x_i'\beta + \sigma\epsilon_i, \qquad \epsilon_i \sim F_\epsilon(\cdot),$$
where $F_\epsilon$ is a known cdf defined on the real line.
When the baseline distribution is logistic, T follows a log logistic distribution.
A data frame containing the posterior estimates (Coef, SD, Credible Interval, Rhat, n.eff) of the regression coefficients of the selected covariates, and the deviance. Results are shown together for all covariates chosen from column m to n.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysmv, lgnbymv, lgstbymv
data(hdata)
lgstbyuni(12,14,STime="os",Event="death",3,100,hdata)
Provides the list of covariates, and their estimates from a parametric AFT model with smooth time functions, whose p value is less than a chosen value (by default p=1, so that all chosen covariates appear in the result). The AFT model is fitted for univariate analysis in high dimensional data without MCMC.
pvaft(m, n, STime, Event, p = 1, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
p | p-value threshold restricting the selection of covariates; the default value is 1. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
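The survival time T for covariate x is modelled with the AFT model
$$S(T \mid x) = S_0\left(T\exp(-\eta(x;\beta))\right),$$
and the baseline survival function is modelled as
$$S_0(T) = \exp\left(-\exp\left(\eta_0(\log(T);\beta_0)\right)\right),$$
where $\eta$ and $\eta_0$ are linear predictors.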
A matrix containing the survival information of the selected covariates (those from the chosen columns whose p value is <= p) under the AFT model. Results are shown together for all covariates chosen from column m to n.
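The selection rule (keep a covariate when its p value is <= p) can be sketched in base R. The table `res`, its column names and its numbers are hypothetical placeholders used only to show the filter; they are not output of pvaft.

```r
# Hypothetical per-covariate results (illustrative values only)
res <- data.frame(covariate = c("g1", "g2", "g3", "g4"),
                  coef      = c(0.21, -0.05, 0.40, 0.02),
                  p_value   = c(0.03, 0.60, 0.08, 0.95),
                  stringsAsFactors = FALSE)

# Keep only the rows whose p value does not exceed the threshold p;
# with the default p = 1 every covariate is retained.
select_by_p <- function(res, p = 1) {
  res[res$p_value <= p, , drop = FALSE]
}

kept_strict  <- select_by_p(res, p = 0.1)
kept_default <- select_by_p(res)
print(kept_strict)
```

With the threshold 0.1 only g1 and g3 remain, while the default keeps all four rows.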
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
wbysuni, wbysmv, rglaft
data(hdata)
pvaft(9,30,STime="os",Event="death",0.1,hdata)
Provides estimates of the selected variables in a parametric AFT model with smooth time functions for univariate analysis in high dimensional gene expression data without MCMC. Variable selection is done with a regularization technique. It also handles covariates with missing values.
The survival time T for covariate x is modelled with the AFT model
$$S(T \mid x) = S_0\left(T\exp(-\eta(x;\beta))\right),$$
and the baseline survival function is modelled as
$$S_0(T) = \exp\left(-\exp\left(\eta_0(\log(T);\beta_0)\right)\right),$$
where $\eta$ and $\eta_0$ are linear predictors.
rglaft(m, n, STime, Event, alpha, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
alpha | A value between 0 and 1 that determines the regularization method: alpha=1 for Lasso, alpha=0 for Ridge, and alpha strictly between 0 and 1 for elastic net regularization. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
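For orientation, the role of alpha can be written out under the usual elastic net convention (the one used, for example, by the glmnet package). That this function follows exactly this parameterization is an assumption, and $\lambda$ denotes a penalty weight that is not among the arguments listed here.

```latex
P_\alpha(\beta) = \lambda \left[ \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 + \alpha \lVert \beta \rVert_1 \right]
```

With alpha=1 only the $\ell_1$ (Lasso) term remains, with alpha=0 only the squared $\ell_2$ (Ridge) term remains, and intermediate values mix the two.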
A matrix containing the survival information of the selected covariates (selected from the chosen columns using regularization) under the AFT model. Covariates nearer the top are more significant than those below, since covariates are listed in increasing order of p value.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
pvaft, rglwbysu, rglwbysm
data(hdata)
set.seed(1000)
rglaft(9,50,STime="os",Event="death",1,hdata)
Provides posterior estimates of the selected variables in the AFT model for multivariable analysis (maximum 5 at a time) in high dimensional gene expression data with MCMC. Variable selection is done with a regularization technique. It also handles covariates with missing values.
rglwbysm(m, n, STime, Event, nc, ni, alpha, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event status variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of Markov chains. |
ni | Number of MCMC iterations. |
alpha | A value between 0 and 1 that determines the regularization method: alpha=1 for Lasso, alpha=0 for Ridge, and alpha strictly between 0 and 1 for elastic net regularization. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
Here the Weibull distribution is used for the AFT model with MCMC. This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
A data frame containing the posterior estimates (mean, sd, credible intervals, n.eff and Rhat) for the beta's, sigma, alpha, tau and deviance (DIC information) of the model for the covariates selected using the regularization technique. The beta's are the regression coefficients of the model: beta[1] is the intercept and the others correspond to the covariates, in the order of their columns in data. alpha is the shape parameter of the distribution. 'sigma' is the scale parameter of the distribution.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysuni, wbysmv, rglwbysu, aftbybmv
data(hdata)
set.seed(1000)
rglwbysm(9,45,STime="os",Event="death",2,10,1,hdata)
Provides posterior estimates of the variables selected by a regularization technique in the AFT model for univariate analysis in high dimensional gene expression data with MCMC. It also handles covariates with missing values.
rglwbysu(m, n, STime, Event, nc, ni, alpha, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of chains used in the model. |
ni | Number of iterations used in the model. |
alpha | A value between 0 and 1 that determines the regularization method: alpha=1 for Lasso, alpha=0 for Ridge, and alpha strictly between 0 and 1 for elastic net regularization. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
Here the Weibull distribution is used for the AFT model with MCMC. This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
Posterior estimates (Coef, SD, Credible Interval) of the regression coefficients for all covariates selected (using regularization) in the model, and the deviance.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysuni, wbysmv, rglwbysm
data(hdata)
set.seed(1000)
rglwbysu(9,45,STime="os",Event="death",2,10,1,hdata)
Provides estimates of the AFT model, including survival time for augmented data, with Weibull distribution using MCMC for multivariable analysis (maximum 5 covariate columns at a time) in high dimensional gene expression data. It also handles covariates with missing values.
wbyAgmv(m, n, p, q, t, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in data. |
n | Ending column number of the study covariates in data. |
p | Starting row number of the augmented data in the input data. |
q | Last row number of the augmented data in the input data. |
t | Time (in the same unit as STime in data) after which the estimated STime is printed (for individuals p to q). |
STime | Name of the survival time variable in data. |
Event | Name of the event variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of Markov chains. |
ni | Number of MCMC iterations. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
Here the Weibull distribution is used for the AFT model with MCMC. This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
Posterior estimates of the beta's, sigma, tau and deviance, reported as mean, sd, credible intervals, effective sample size (n.eff) and Rhat. The beta's are the posterior estimates of the regression coefficients of the model: beta[1] is the intercept and the others correspond to the covariates (the columns chosen in data). 'sigma' is the scale parameter of the distribution. 'STime' in the output gives the estimated value of STime="os" in data for the individuals in rows p to q only. 'Overall_S' in the output gives an overall estimate of STime="os" in data for all nrow(data) individuals.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysmv
data(hdata)
wbyAgmv(9,13,p=560,q=565,t=200,STime="os",Event="death",2,10,hdata)
Provides multivariate (maximum 5 covariate columns at a time) posterior estimates of the AFT model using MCMC for competing risk high dimensional gene expression data. It also handles missing values.
wbyscrkm(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event status variable in data: 0 for censored, 1 for occurrence of the event of interest, and 2 for occurrence of an event due to other causes. |
nc | Number of Markov chains. |
ni | Number of MCMC iterations. |
data | High dimensional gene expression data containing the event status with competing risk, survival time and a set of covariates. |
Here the AFT model is used with the Weibull distribution. This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
A data frame containing the posterior estimates (mean, sd, credible intervals, n.eff and Rhat) for the beta's, sigma, alpha, tau and deviance (DIC information) of the model for the covariates chosen as columns between m and n. The beta's are the regression coefficients of the model: beta[1] is the intercept and the others correspond to the covariates, in the order of their columns in data. alpha is the shape parameter of the distribution. 'sigma' is the scale parameter of the distribution.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysmv, wbyscrku
data(hdata)
wbyscrkm(9,11,STime="os",Event="death2",2,10,hdata)
Provides univariate estimates of the AFT model using MCMC for competing risk high dimensional gene expression data. It also handles missing values.
wbyscrku(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event status variable in data: 0 for censored, 1 for occurrence of the event of interest, and 2 for occurrence of an event due to other causes. |
nc | Number of Markov chains. |
ni | Number of MCMC iterations. |
data | High dimensional gene expression data containing the event status with competing risk, survival time and a set of covariates. |
Here the AFT model is used with the Weibull distribution. This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
Posterior estimates (Coef, SD, Credible Interval) of the regression coefficients of the covariates (the columns chosen in data) in the model, and the deviance. Results are shown together for all covariates chosen from column m to n.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
wbysuni, wbyscrkm
data(hdata)
wbyscrku(9,13,STime="os",Event="death2",2,100,hdata)
Provides estimates of the AFT model with Weibull distribution using MCMC for multivariable analysis (maximum 5 covariate columns at a time) in high dimensional gene expression data. It also handles covariates with missing values.
wbysmv(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event status variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of Markov chains. |
ni | Number of MCMC iterations. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
The AFT model is a log-linear regression model for the survival times $T_1, T_2, \ldots, T_n$, i.e.,
$$\log(T_i) = x_i'\beta + \sigma\epsilon_i, \qquad \epsilon_i \sim F_\epsilon(\cdot),$$
where $F_\epsilon$ is a known cdf defined on the real line.
When the baseline distribution is extreme value, T follows a Weibull distribution.
To make the interpretation of the regression coefficients simpler, an extreme value distribution with median 0 is used.
Using the Weibull distribution then leads to the AFT model when
$$T \sim \mathrm{Weibull}\left(\sqrt{\tau},\ \log(2)\exp\left(-x'\beta\sqrt{\tau}\right)\right), \qquad \tau = 1/\sigma^2.$$
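The Weibull case can be checked numerically as well. The sketch assumes log(T) = mu + sigma * eps, an extreme value error with survival function exp(-log(2) * exp(e)) so that its median is 0, and a precision tau with sigma = 1 / sqrt(tau); these assumptions and all numeric values are illustrative rather than taken from the package source.

```r
# Illustrative values only (assumptions, not package defaults)
mu     <- 1.2                  # linear predictor x'beta for one subject
tau    <- 4                    # precision; sigma = 1 / sqrt(tau) is assumed
k      <- sqrt(tau)            # Weibull shape, equal to 1 / sigma
t_grid <- c(0.5, 1, 2, 5, 10)  # time points at which to compare

# Survival function of T implied by the log-linear model with the
# median-0 extreme value error: P(eps > e) = exp(-log(2) * exp(e))
surv_from_aft <- exp(-log(2) * exp((log(t_grid) - mu) * k))

# The same curve from R's Weibull (shape/scale parameterization)
w_scale      <- exp(mu) * log(2)^(-1 / k)
surv_weibull <- pweibull(t_grid, shape = k, scale = w_scale, lower.tail = FALSE)

same_curve <- isTRUE(all.equal(surv_from_aft, surv_weibull))
median_T   <- qweibull(0.5, shape = k, scale = w_scale)
print(c(same_curve = same_curve, median_equals_exp_mu = isTRUE(all.equal(median_T, exp(mu)))))
```

With the median-0 error, exp(mu) is the median survival time, which is what makes the regression coefficients easy to interpret.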
A data frame containing the mean, sd, n.eff, Rhat and credible intervals for the beta's, sigma, alpha, tau and deviance of the model for the chosen covariates. beta[1] is the intercept and the others correspond to the covariates (the columns chosen in data). sigma is the scale parameter of the distribution. alpha is the shape parameter of the distribution.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
pvaft, wbysuni, rglwbysm, wbyscrkm, wbyAgmv
data(hdata)
wbysmv(9,13,STime="os",Event="death",2,10,hdata)
Provides posterior estimates of the AFT model with Weibull distribution using MCMC for univariate analysis in high dimensional gene expression data. It also handles covariates with missing values.
wbysuni(m, n, STime, Event, nc, ni, data)
m | Starting column number of the study covariates in the high dimensional input data. |
n | Ending column number of the study covariates in the high dimensional input data. |
STime | Name of the survival time variable in data. |
Event | Name of the event status variable in data: 0 for censored, 1 for occurrence of the event. |
nc | Number of Markov chains. |
ni | Number of MCMC iterations. |
data | High dimensional gene expression data containing the event status, survival time and a set of covariates. |
This function handles covariates (in data) with missing values: a missing value in any covariate column is replaced by the mean of that covariate.
The AFT model is a log-linear regression model for the survival times $T_1, T_2, \ldots, T_n$, i.e.,
$$\log(T_i) = x_i'\beta + \sigma\epsilon_i, \qquad \epsilon_i \sim F_\epsilon(\cdot),$$
where $F_\epsilon$ is a known cdf defined on the real line.
When the baseline distribution is extreme value, T follows a Weibull distribution.
To make the interpretation of the regression coefficients simpler, an extreme value distribution with median 0 is used.
Using the Weibull distribution then leads to the AFT model when
$$T \sim \mathrm{Weibull}\left(\sqrt{\tau},\ \log(2)\exp\left(-x'\beta\sqrt{\tau}\right)\right), \qquad \tau = 1/\sigma^2.$$
A data frame containing the posterior estimates (Coef, SD, Credible Interval, Rhat, n.eff) of the regression coefficients of the covariates, and the deviance. Results are shown together for all covariates chosen from column m to n.
Atanu Bhattacharjee, Gajendra Kumar Vishwakarma and Pragya Kumari
Prabhash et al(2016) <doi:10.21307/stattrans-2016-046>
pvaft, wbysmv, rglwbysu, wbyscrku
data(hdata)
wbysuni(9,13,STime="os",Event="death",1,10,hdata)