| Title: | Optimal Categorisation of Continuous Variables in Prediction Models |
|---|---|
| Description: | Allows the user to categorise a continuous predictor variable in a logistic or a Cox proportional hazards regression setting, by maximising the discriminative ability of the model. I Barrio, I Arostegui, MX Rodriguez-Alvarez, JM Quintana (2015) <doi:10.1177/0962280215601873>. I Barrio, MX Rodriguez-Alvarez, L Meira-Machado, C Esteban, I Arostegui (2017) <https://www.idescat.cat/sort/sort411/41.1.3.barrio-etal.pdf>. |
| Authors: | Irantzu Barrio [aut, cre], Maria Xose Rodriguez-Alvarez [aut], Inmaculada Arostegui [ctb], Diana Marcela Perez [ctb] |
| Maintainer: | Irantzu Barrio <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 2.0 |
| Built: | 2026-05-08 11:15:06 UTC |
| Source: | https://github.com/cran/CatPredi |
Returns an object with the optimal cut points to categorise a continuous predictor variable in a logistic regression model
    catpredi(formula, cat.var, cat.points = 1, data,
        method = c("addfor", "genetic", "backaddfor"),
        range = NULL, correct.AUC = FALSE,
        control = controlcatpredi(), ...)

| Argument | Description |
|---|---|
| formula | An object of class |
| cat.var | Name of the continuous variable to categorise. |
| cat.points | Number of cut points to look for. |
| data | Data frame containing all needed variables. |
| method | The algorithm selected to search for the optimal cut points. |
| range | The range of the continuous variable in which to look for the cut points. By default |
| correct.AUC | A logical value. If |
| control | Output of the |
| ... | Further arguments for passing on to the function |
Returns an object of class "catpredi" with the following components:
The matched call.
The algorithm selected in the call.
The model formula used in the call.
Name of the continuous variable to categorise.
The data frame used in the call.
Logical value indicating whether bias-corrected AUC was used.
A list containing estimated cut points, AUC and bias-corrected AUC for each method.
The control parameters used in the call.
Irantzu Barrio, Maria Xose Rodriguez-Alvarez, Inmaculada Arostegui, Javier Roca-Pardinas and Xabier Amutxastegi.
I Barrio, J Roca-Pardinas and I Arostegui (2021). Selecting the number of categories of the lymph node ratio in cancer research: A bootstrap-based hypothesis test. Statistical Methods in Medical Research, 30(3), 926-940.
I Barrio, I Arostegui, M.X Rodriguez-Alvarez and J.M Quintana (2017). A new approach to categorising continuous variables in prediction models: proposal and validation. Statistical Methods in Medical Research, 26(6), 2586-2602.
S.N Wood (2006). Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.
controlcatpredi,
comp.cutpoints,
plot.catpredi,
summary.catpredi
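The idea of choosing cut points by maximising discriminative ability can be illustrated with a minimal base-R sketch. This is not the package's implementation: it only searches a grid for a single cut point and keeps the one with the highest AUC of the resulting logistic model. The helper names `auc_rank` and `best_single_cut` are hypothetical and exist only for this illustration.

```r
# Rank-based (Mann-Whitney) AUC of a numeric score for a 0/1 outcome
auc_rank <- function(score, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  r <- rank(score)  # average ranks are used for ties
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Hypothetical helper: exhaustive search for one cut point on a grid
best_single_cut <- function(x, y, grid = 20) {
  # Candidate cut points: equally spaced interior points of the range of x
  candidates <- seq(min(x), max(x), length.out = grid + 2)[-c(1, grid + 2)]
  aucs <- vapply(candidates, function(cp) {
    x_cat <- as.numeric(x > cp)               # dichotomised predictor
    fit <- glm(y ~ x_cat, family = binomial)  # logistic model
    auc_rank(fitted(fit), y)
  }, numeric(1))
  list(cutpoint = candidates[which.max(aucs)], auc = max(aucs))
}

# Simulated data in the style of the package examples
set.seed(127)
n <- 100
x <- c(rnorm(n, mean = 0, sd = 1), rnorm(n, mean = 1.5, sd = 1))
y <- c(rep(0, n), rep(1, n))

res <- best_single_cut(x, y, grid = 20)
print(res)
```

Searching for several cut points at once grows combinatorially with the grid size, which is presumably why the package offers dedicated search algorithms ("addfor", "backaddfor", "genetic") rather than an exhaustive search.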
    library(CatPredi)
    ## Not run:
    set.seed(127)
    # Simulate data
    n <- 100
    # Predictor variable
    xh <- rnorm(n, mean = 0, sd = 1)
    xd <- rnorm(n, mean = 1.5, sd = 1)
    x <- c(xh, xd)
    # Response
    y <- c(rep(0, n), rep(1, n))
    # Covariate
    zh <- rnorm(n, mean = 1.5, sd = 1)
    zd <- rnorm(n, mean = 1, sd = 1)
    z <- c(zh, zd)
    # Data frame
    df <- data.frame(y = y, x = x, z = z)
    # Select optimal cut points using the AddFor algorithm
    res.addfor <- catpredi(formula = y ~ z, cat.var = "x", cat.points = 2,
        data = df, method = "addfor", range = NULL, correct.AUC = FALSE,
        control = controlcatpredi(grid = 20))
    # Select optimal cut points using the BackAddFor algorithm
    res.backaddfor <- catpredi(formula = y ~ z, cat.var = "x", cat.points = 3,
        data = df, method = "backaddfor", range = NULL, correct.AUC = FALSE)
    ## End(Not run)
    ## Not run:
    set.seed(127)
    # Simulate data
    n <- 200
    # Predictor variable
    xh <- rnorm(n, mean = 0, sd = 1)
    xd <- rnorm(n, mean = 1.5, sd = 1)
    x <- c(xh, xd)
    # Response
    y <- c(rep(0, n), rep(1, n))
    # Covariate
    zh <- rnorm(n, mean = 1.5, sd = 1)
    zd <- rnorm(n, mean = 1, sd = 1)
    z <- c(zh, zd)
    # Data frame
    df <- data.frame(y = y, x = x, z = z)
    # Select optimal cut points using the AddFor algorithm
    res.addfor <- catpredi(formula = y ~ z, cat.var = "x", cat.points = 3,
        data = df, method = "addfor", range = NULL, correct.AUC = FALSE)
    # Select optimal cut points using the BackAddFor algorithm
    res.backaddfor <- catpredi(formula = y ~ z, cat.var = "x", cat.points = 3,
        data = df, method = "backaddfor", range = NULL, correct.AUC = FALSE)
    ## End(Not run)
Returns an object with the optimal cut points to categorise a continuous predictor variable in a Cox proportional hazards regression model
    catpredi.survival(formula, cat.var, cat.points = 1, data,
        method = c("addfor", "genetic", "backaddfor"),
        conc.index = c("cindex", "cpe"),
        range = NULL, correct.index = FALSE,
        control = controlcatpredi.survival(), ...)

| Argument | Description |
|---|---|
| formula | An object of class |
| cat.var | Name of the continuous variable to categorise. |
| cat.points | Number of cut points to look for. |
| data | Data frame containing all needed variables. |
| method | The algorithm selected to search for the optimal cut points. |
| conc.index | The concordance probability estimator selected for maximisation purposes: "cindex" if the c-index concordance probability is chosen and "cpe" otherwise. The c-index and CPE are estimated using the rms and CPE packages, respectively. |
| range | The range of the continuous variable in which to look for the cut points. By default |
| correct.index | A logical value. If TRUE, the bias corrected concordance probability is estimated. |
| control | Output of the |
| ... | Further arguments for passing on to the function |
Returns an object of class "catpredi.survival" with the following components:
The matched call.
The algorithm selected in the call.
An object of class formula giving the model to be fitted in addition to the continuous covariate to be categorised.
Name of the continuous variable to categorise.
The data frame with the variables used in the call.
The logical value used in the call.
A list with the estimated cut points, concordance probability and bias corrected concordance probability.
The control parameters used in the call.
When the c-index concordance probability is chosen, a list with the following components is obtained for each of the methods used in the call:
Estimated optimal cut points.
Estimated c-index.
Estimated bias corrected c-index.
When the CPE concordance probability is chosen, a list with the following components is obtained for each of the methods used in the call:
Estimated optimal cut points.
Estimated CPE.
Estimated bias corrected CPE.
Irantzu Barrio and Maria Xose Rodriguez-Alvarez
I Barrio, M.X Rodriguez-Alvarez, L Meira-Machado, C Esteban and I Arostegui (2017). Comparison of two discrimination indexes in the categorisation of continuous predictors in time-to-event studies. SORT, 41:73-92
M Gonen and G Heller (2005). Concordance probability and discriminatory power in proportional hazards regression. Biometrika, 92:965-970.
F Harrell (2001). Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer.
controlcatpredi.survival,
comp.cutpoints.survival,
plot.catpredi.survival,
catpredi
    library(CatPredi)
    library(survival)
    set.seed(123)
    # Simulate data
    n <- 500
    tauc <- 1
    X <- rnorm(n = n, mean = 0, sd = 2)
    SurvT <- exp(2*X + rweibull(n = n, shape = 1, scale = 1)) +
        rnorm(n, mean = 0, sd = 0.25)
    # Censoring time
    CensTime <- runif(n = n, min = 0, max = tauc)
    # Status
    SurvS <- as.numeric(SurvT <= CensTime)
    # Data frame
    dat <- data.frame(X = X, SurvT = pmin(SurvT, CensTime), SurvS = SurvS)
    # Select optimal cut points using the AddFor algorithm
    res <- catpredi.survival(formula = Surv(SurvT, SurvS) ~ 1, cat.var = "X",
        cat.points = 2, data = dat, method = "addfor", conc.index = "cindex",
        range = NULL, correct.index = FALSE)
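To make the quantity being maximised more concrete, here is a minimal sketch of how a concordance probability could be computed for one fixed categorisation. It is an illustration under stated assumptions, not the package's procedure: the cut points are hypothetical values chosen by hand, the simulated data differ from the example above, and the c-index is taken from `concordance()` in the survival package rather than from rms.

```r
library(survival)

set.seed(123)
n <- 500
X <- rnorm(n, mean = 0, sd = 2)
# Event times that shorten as X grows, with independent uniform censoring
event_time <- rexp(n, rate = exp(0.5 * X))
cens_time <- runif(n, min = 0, max = 3)
dat <- data.frame(
  X = X,
  time = pmin(event_time, cens_time),
  status = as.numeric(event_time <= cens_time)
)

# Hypothetical cut points (illustrative values only, not estimated by CatPredi)
cuts <- c(-1, 1)
dat$X_cat <- cut(dat$X, breaks = c(-Inf, cuts, Inf))

# Cox model with the categorised predictor and its concordance
fit <- coxph(Surv(time, status) ~ X_cat, data = dat)
cindex <- concordance(fit)$concordance
print(cindex)
```

A search algorithm would repeat this evaluation for many candidate sets of cut points and keep the set with the highest concordance.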
Compares two objects of class "catpredi".
    comp.cutpoints(obj1, obj2, V = 100)

| Argument | Description |
|---|---|
| obj1 | An object inheriting from class |
| obj2 | An object inheriting from class |
| V | Number of bootstrap resamples. By default V=100 |
This function returns an object of class "comp.cutpoints" with the following components:
The difference between the bias corrected AUCs for the two categorical variables.
Bootstrap based confidence interval for the bias corrected AUC difference.
Irantzu Barrio, Maria Xose Rodriguez-Alvarez and Inmaculada Arostegui.
I Barrio, I Arostegui, M.X Rodriguez-Alvarez and J.M Quintana (2017). A new approach to categorising continuous variables in prediction models: proposal and validation. Statistical Methods in Medical Research, 26(6), 2586-2602.
    library(CatPredi)
    set.seed(127)
    # Simulate data
    n <- 100
    # Predictor variable
    xh <- rnorm(n, mean = 0, sd = 1)
    xd <- rnorm(n, mean = 1.5, sd = 1)
    x <- c(xh, xd)
    # Response
    y <- c(rep(0, n), rep(1, n))
    # Data frame
    df <- data.frame(y = y, x = x)
    # Select 2 optimal cut points using the BackAddFor algorithm. Correct the AUC
    res.backaddfor.k2 <- catpredi(formula = y ~ 1, cat.var = "x", cat.points = 2,
        data = df, method = "backaddfor", range = NULL, correct.AUC = TRUE,
        control = controlcatpredi(grid = 100))
    # Select 3 optimal cut points using the BackAddFor algorithm. Correct the AUC
    res.backaddfor.k3 <- catpredi(formula = y ~ 1, cat.var = "x", cat.points = 3,
        data = df, method = "backaddfor", range = NULL, correct.AUC = TRUE,
        control = controlcatpredi(grid = 100))
    # Select optimal number of cut points
    comp <- comp.cutpoints(res.backaddfor.k2, res.backaddfor.k3, V = 100)
Compares two objects of class "catpredi.survival"
    comp.cutpoints.survival(obj1, obj2, V = 100)

| Argument | Description |
|---|---|
| obj1 | An object inheriting from class |
| obj2 | An object inheriting from class |
| V | Number of bootstrap resamples. By default V=100 |
This function returns an object of class "comp.cutpoints.survival" with the following components:
The difference between the bias corrected concordance probabilities for the two categorical variables.
Bootstrap based confidence interval for the bias corrected concordance probability difference.
Irantzu Barrio and Maria Xose Rodriguez-Alvarez.
I Barrio, M.X Rodriguez-Alvarez, L Meira-Machado, C Esteban and I Arostegui (2017). Comparison of two discrimination indexes in the categorisation of continuous predictors in time-to-event studies. SORT, 41:73-92
    library(CatPredi)
    library(survival)
    set.seed(123)
    # Simulate data
    n <- 300
    tauc <- 1
    X <- rnorm(n = n, mean = 0, sd = 2)
    SurvT <- exp(2*X + rweibull(n = n, shape = 1, scale = 1)) +
        rnorm(n, mean = 0, sd = 0.25)
    # Censoring time
    CensTime <- runif(n = n, min = 0, max = tauc)
    # Status
    SurvS <- as.numeric(SurvT <= CensTime)
    # Data frame
    dat <- data.frame(X = X, SurvT = pmin(SurvT, CensTime), SurvS = SurvS)
    # Select 2 optimal cut points using the AddFor algorithm. Correct the c-index
    res.k2 <- catpredi.survival(formula = Surv(SurvT, SurvS) ~ 1, cat.var = "X",
        cat.points = 2, data = dat, method = "addfor", conc.index = "cindex",
        range = NULL, correct.index = TRUE)
    # Select 3 optimal cut points using the AddFor algorithm. Correct the c-index
    res.k3 <- catpredi.survival(formula = Surv(SurvT, SurvS) ~ 1, cat.var = "X",
        cat.points = 3, data = dat, method = "addfor", conc.index = "cindex",
        range = NULL, correct.index = TRUE)
    # Select optimal number of cut points
    comp <- comp.cutpoints.survival(res.k2, res.k3, V = 100)
Compares two objects of class "catpredi" to evaluate the significance of the improvement in model performance (in terms of the AUC) by adding k+1 cut-off points to the predictor variable.
    compare.AUC.ht(obj1, obj2, level = 0.95, nb = 100,
        parallel = TRUE, plot = TRUE)

| Argument | Description |
|---|---|
| obj1 | An object inheriting from class |
| obj2 | An object inheriting from class |
| level | The confidence level required for the hypothesis test. By default level = 0.95. |
| nb | Number of bootstrap resamples. By default nb = 100. |
| parallel | A logical value. If TRUE, the bootstrap is processed in parallel. |
| plot | A logical value. If TRUE, the density plot for the bootstrap statistic is provided. |
This function returns an object of class "compare.AUC.ht" with the following components:
Test statistic: the difference between the AUCs of the two objects.
A vector with the nb bootstrap statistics.
Empirical level-percentile of the bootstrap statistics vector.
Irantzu Barrio, Inmaculada Arostegui, Javier Roca-Pardinas and Xabier Amutxastegi.
I Barrio, J Roca-Pardinas and I Arostegui (2021). Selecting the number of categories of the lymph node ratio in cancer research: A bootstrap-based hypothesis test. Statistical Methods in Medical Research, 30(3), 926-940.
    library(CatPredi)
    ## Not run:
    set.seed(127)
    # Simulate data
    n <- 100
    # Predictor variable
    xh <- rnorm(n, mean = 0, sd = 1)
    xd <- rnorm(n, mean = 1.5, sd = 1)
    x <- c(xh, xd)
    # Response
    y <- c(rep(0, n), rep(1, n))
    # Data frame
    df <- data.frame(y = y, x = x)
    # Select 2 optimal cut points using the AddFor algorithm. Correct the AUC
    res.addfor.k2 <- catpredi(formula = y ~ 1, cat.var = "x", cat.points = 2,
        data = df, method = "addfor", range = NULL, correct.AUC = TRUE,
        control = controlcatpredi(grid = 20))
    # Select 3 optimal cut points using the AddFor algorithm. Correct the AUC
    res.addfor.k3 <- catpredi(formula = y ~ 1, cat.var = "x", cat.points = 3,
        data = df, method = "addfor", range = NULL, correct.AUC = TRUE,
        control = controlcatpredi(grid = 20))
    comp <- comp.cutpoints(res.addfor.k2, res.addfor.k3, V = 10)
    # Select 1 optimal cut point using the BackAddFor algorithm.
    res.backaddfor.k1 <- catpredi(formula = y ~ 1, cat.var = "x", cat.points = 1,
        data = df, method = "backaddfor", range = NULL, correct.AUC = FALSE)
    # Select 2 optimal cut points using the BackAddFor algorithm.
    res.backaddfor.k2 <- catpredi(formula = y ~ 1, cat.var = "x", cat.points = 2,
        data = df, method = "backaddfor", range = NULL, correct.AUC = FALSE)
    # Test whether k = 1 cut-off point is enough to categorise x
    comp.k1.k2 <- compare.AUC.ht(res.backaddfor.k1, res.backaddfor.k2)
    ## End(Not run)
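The returned components (test statistic, bootstrap statistics and their empirical level-percentile) suggest a comparison along the following lines. This is only a sketch with made-up numbers: the values of `boot_stats` and `test_stat` are invented for illustration and do not come from the package, and the decision rule shown is an assumption about how such a percentile would typically be used.

```r
# Made-up values standing in for the output of a bootstrap-based test
set.seed(1)
level <- 0.95
nb <- 100
boot_stats <- abs(rnorm(nb, mean = 0, sd = 0.01))  # invented bootstrap statistics
test_stat <- 0.035                                  # invented observed AUC difference

# Empirical level-percentile of the bootstrap statistics
crit <- unname(quantile(boot_stats, probs = level))

# Assumed decision rule: the additional cut point is judged worthwhile
# when the observed statistic exceeds the percentile
reject <- test_stat > crit
print(c(statistic = test_stat, percentile = crit))
print(reject)
```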
Function used to set several parameters to control the selection of the optimal cut points in a logistic regression model
    controlcatpredi(min.p.cat = 1, grid = 100, B = 50, eps = 0.001,
        b.method = c("ncoutcome", "coutcome"), print.gen = 0)

| Argument | Description |
|---|---|
| min.p.cat | Minimum number of individuals in each category. |
| grid | Grid size for the AddFor and BackAddFor algorithms. |
| B | Number of bootstrap replicates for the AUC bias correction procedure. |
| eps | Tolerance for the BackAddFor algorithm, used to decide whether the improvement between iterations is considered significant. |
| b.method | Specifies whether the bootstrap resampling takes the outcome variable into account. The option "ncoutcome" resamples the data without regard to the response variable, while "coutcome" resamples the data with regard to the response variable. |
| print.gen | Corresponds to the argument print.level of the |
A list with components for each of the possible arguments.
Irantzu Barrio, Maria Xose Rodriguez-Alvarez, Inmaculada Arostegui, Javier Roca-Pardinas and Xabier Amutxastegi.
Mebane Jr, W. R., & Sekhon, J. S. (2011). Genetic optimization using derivatives: the rgenoud package for R. Journal of Statistical Software, 42(11), 1-26.
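The two b.method options can be pictured with a short base-R sketch. It assumes that "coutcome" means resampling separately within each outcome group; that reading is an interpretation of the description above, not something confirmed from the package source, and the data frame is simulated for illustration only.

```r
set.seed(127)
df <- data.frame(y = c(rep(0, 70), rep(1, 30)), x = rnorm(100))

# "ncoutcome"-style: resample rows without regard to the response
idx_nc <- sample.int(nrow(df), replace = TRUE)
boot_nc <- df[idx_nc, ]

# "coutcome"-style (assumed): resample rows within each outcome group
groups <- split(seq_len(nrow(df)), df$y)
idx_c <- unlist(lapply(groups, function(i) {
  i[sample.int(length(i), replace = TRUE)]
}), use.names = FALSE)
boot_c <- df[idx_c, ]

table(boot_nc$y)  # group sizes vary from one resample to the next
table(boot_c$y)   # group sizes equal those of the original data (70 and 30)
```

Resampling within groups keeps the observed outcome prevalence fixed in every bootstrap sample, which can matter when one outcome class is rare.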
Function used to set several parameters to control the selection of the optimal cut points in a Cox proportional hazards regression model.
    controlcatpredi.survival(min.p.cat = 5, grid = 100, B = 50,
        b.method = c("ncoutcome", "coutcome"), print.gen = 0)

| Argument | Description |
|---|---|
| min.p.cat | Minimum number of individuals in each category. |
| grid | Grid size for the AddFor algorithm. |
| B | Number of bootstrap replicates for the AUC bias correction procedure. |
| b.method | Specifies whether the bootstrap resampling takes the outcome variable into account. The option "ncoutcome" resamples the data without regard to the response variable, while "coutcome" resamples the data with regard to the response variable. |
| print.gen | Corresponds to the argument print.level of the |
A list with components for each of the possible arguments.
Irantzu Barrio and Maria Xose Rodriguez-Alvarez.
Mebane Jr, W. R., & Sekhon, J. S. (2011). Genetic optimization using derivatives: the rgenoud package for R. Journal of Statistical Software, 42(11), 1-26.
Plots the relationship between the predictor variable to be categorised and the response variable, based on a GAM model. Additionally, the optimal cut points obtained with the catpredi() function are drawn on the graph.
    ## S3 method for class 'catpredi'
    plot(x, ...)

| Argument | Description |
|---|---|
| x | An object of type catpredi. |
| ... | Additional arguments to be passed on to other functions. Not yet implemented. |
This function returns the plot of the relationship between the predictor variable and the outcome.
Irantzu Barrio, Maria Xose Rodriguez-Alvarez and Inmaculada Arostegui.
I Barrio, I Arostegui, M.X Rodriguez-Alvarez and J.M Quintana (2017). A new approach to categorising continuous variables in prediction models: proposal and validation. Statistical Methods in Medical Research, 26(6), 2586-2602.
I Barrio, J Roca-Pardinas and I Arostegui (2021). Selecting the number of categories of the lymph node ratio in cancer research: A bootstrap-based hypothesis test. Statistical Methods in Medical Research, 30(3), 926-940.
    library(CatPredi)
    ## Not run:
    set.seed(127)
    # Simulate data
    n <- 100
    # Predictor variable
    xh <- rnorm(n, mean = 0, sd = 1)
    xd <- rnorm(n, mean = 1.5, sd = 1)
    x <- c(xh, xd)
    # Response
    y <- c(rep(0, n), rep(1, n))
    # Data frame
    df <- data.frame(y = y, x = x)
    # Select optimal cut points using the BackAddFor algorithm
    res.backaddfor <- catpredi(formula = y ~ 1, cat.var = "x", cat.points = 3,
        data = df, method = "backaddfor", range = NULL, correct.AUC = FALSE)
    # Plot
    plot(res.backaddfor)
    ## End(Not run)
Plots the functional form of the predictor variable we want to categorise. Additionally, the optimal cut points obtained with the catpredi.survival() function are drawn on the graph.
    ## S3 method for class 'catpredi.survival'
    plot(x, ...)

| Argument | Description |
|---|---|
| x | An object of type catpredi.survival. |
| ... | Additional arguments to be passed on to other functions. Not yet implemented. |
This function returns the plot of the relationship between the predictor variable and the outcome.
Irantzu Barrio and Maria Xose Rodriguez-Alvarez.
I Barrio, M.X Rodriguez-Alvarez, L Meira-Machado, C Esteban and I Arostegui (2017). Comparison of two discrimination indexes in the categorisation of continuous predictors in time-to-event studies. SORT, 41:73-92
    library(CatPredi)
    library(survival)
    set.seed(123)
    # Simulate data
    n <- 500
    tauc <- 1
    X <- rnorm(n = n, mean = 0, sd = 2)
    SurvT <- exp(2*X + rweibull(n = n, shape = 1, scale = 1)) +
        rnorm(n, mean = 0, sd = 0.25)
    # Censoring time
    CensTime <- runif(n = n, min = 0, max = tauc)
    # Status
    SurvS <- as.numeric(SurvT <= CensTime)
    # Data frame
    dat <- data.frame(X = X, SurvT = pmin(SurvT, CensTime), SurvS = SurvS)
    # Select optimal cut points using the AddFor algorithm
    res <- catpredi.survival(formula = Surv(SurvT, SurvS) ~ 1, cat.var = "X",
        cat.points = 2, data = dat, method = "addfor", conc.index = "cindex",
        range = NULL, correct.index = FALSE)
    # Plot
    plot(res)
Produces a summary of a catpredi object. The following are printed: the call to the catpredi() function; the estimated optimal cut points obtained with the method selected and the estimated AUC and bias corrected AUC (if the argument correct.AUC is TRUE) for the categorised variable.
    ## S3 method for class 'catpredi'
    summary(object, digits = 4, ...)

| Argument | Description |
|---|---|
| object | An object of class catpredi as produced by catpredi() |
| digits | . |
| ... | Further arguments passed to or from other methods. |
Returns an object of class "summary.catpredi" with the same components as the catpredi function (see catpredi), plus:
The fitted model according to the model specified in the call, based on the function gam of the package mgcv.
Irantzu Barrio, Maria Xose Rodriguez-Alvarez and Inmaculada Arostegui.
I Barrio, I Arostegui, M.X Rodriguez-Alvarez and J.M Quintana (2017). A new approach to categorising continuous variables in prediction models: proposal and validation. Statistical Methods in Medical Research, 26(6), 2586-2602.
I Barrio, J Roca-Pardinas and I Arostegui (2021). Selecting the number of categories of the lymph node ratio in cancer research: A bootstrap-based hypothesis test. Statistical Methods in Medical Research, 30(3), 926-940.
    library(CatPredi)
    set.seed(127)
    # Simulate data
    n <- 200
    # Predictor variable
    xh <- rnorm(n, mean = 0, sd = 1)
    xd <- rnorm(n, mean = 1.5, sd = 1)
    x <- c(xh, xd)
    # Response
    y <- c(rep(0, n), rep(1, n))
    # Covariate
    zh <- rnorm(n, mean = 1.5, sd = 1)
    zd <- rnorm(n, mean = 1, sd = 1)
    z <- c(zh, zd)
    # Data frame
    df <- data.frame(y = y, x = x, z = z)
    # Select optimal cut points using the BackAddFor algorithm
    res.backaddfor <- catpredi(formula = y ~ z, cat.var = "x", cat.points = 2,
        data = df, method = "backaddfor", range = NULL, correct.AUC = FALSE)
    # Summary
    summary(res.backaddfor)
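Once cut points have been read off the summary, they would typically be applied to the data with base R's `cut()`. The sketch below uses hypothetical cut point values chosen by hand (they are not output from the package) and checks the resulting category sizes, which is the quantity that min.p.cat constrains.

```r
set.seed(127)
n <- 200
x <- c(rnorm(n, mean = 0, sd = 1), rnorm(n, mean = 1.5, sd = 1))
y <- c(rep(0, n), rep(1, n))

# Hypothetical cut points (illustrative values, not package output)
cuts <- c(0.2, 1.3)

# k cut points give k + 1 categories
x_cat <- cut(x, breaks = c(-Inf, cuts, Inf))
counts <- table(x_cat)
print(counts)

# Logistic model using the categorised predictor
fit <- glm(y ~ x_cat, family = binomial)
print(coef(fit))
```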
Produces a summary of a "catpredi.survival" object. The following are printed: the call to the catpredi.survival() function; the estimated optimal cut points obtained with the method and concordance probability estimator selected and the estimated and bias corrected concordance probability for the categorised variable (whenever the argument correct.index is set to TRUE).
    ## S3 method for class 'catpredi.survival'
    summary(object, digits = 4, ...)

| Argument | Description |
|---|---|
| object | An object of class "catpredi.survival" as produced by catpredi.survival() |
| digits | . |
| ... | Further arguments passed to or from other methods. |
Returns an object of class "summary.catpredi.survival" with the same components
as the catpredi.survival function (see catpredi.survival).
Irantzu Barrio and Maria Xose Rodriguez-Alvarez.
I Barrio, M.X Rodriguez-Alvarez, L Meira-Machado, C Esteban and I Arostegui (2017). Comparison of two discrimination indexes in the categorisation of continuous predictors in time-to-event studies. SORT, 41:73-92
    library(CatPredi)
    library(survival)
    set.seed(123)
    # Simulate data
    n <- 500
    tauc <- 1
    X <- rnorm(n = n, mean = 0, sd = 2)
    SurvT <- exp(2*X + rweibull(n = n, shape = 1, scale = 1)) +
        rnorm(n, mean = 0, sd = 0.25)
    # Censoring time
    CensTime <- runif(n = n, min = 0, max = tauc)
    # Status
    SurvS <- as.numeric(SurvT <= CensTime)
    # Data frame
    dat <- data.frame(X = X, SurvT = pmin(SurvT, CensTime), SurvS = SurvS)
    # Select optimal cut points using the AddFor algorithm
    res <- catpredi.survival(formula = Surv(SurvT, SurvS) ~ 1, cat.var = "X",
        cat.points = 2, data = dat, method = "addfor", conc.index = "cindex",
        range = NULL, correct.index = FALSE)
    # Summary
    summary(res)