Title: | Classification Trees for Ordinal Responses |
---|---|
Description: | Recursive partitioning methods to build classification trees for ordinal responses within the CART framework. Trees are grown using the Generalized Gini impurity function, where the misclassification costs are given by the absolute or squared differences in scores assigned to the categories of the response. Pruning is based on the total misclassification rate or on the total misclassification cost. |
Authors: | Giuliano Galimberti, Gabriele Soffritti, Matteo Di Maso |
Maintainer: | Giuliano Galimberti <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0-2 |
Built: | 2024-12-22 06:43:28 UTC |
Source: | CRAN |
This package contains functions that allow the user to build classification trees for ordinal responses within the CART framework.
The trees are grown using the Generalized Gini impurity function, where the misclassification costs are given by
the absolute or squared differences in scores assigned to the categories of the response.
Pruning is based on the total misclassification rate or on the total misclassification cost.
Package: | rpartScore |
Type: | Package |
Version: | 1.0-2 |
Date: | 2022-05-25 |
License: | GPL (>=2) |
LazyLoad: | yes |
This package contains functions that allow the user to build classification trees for ordinal responses within the CART framework.
It is assumed that a set of numerical scores has been assigned to the ordered categories of the response.
Two splitting functions are implemented, both based on the generalized Gini impurity function. They use the absolute
and the squared differences in scores, respectively, as misclassification costs.
In order to select the optimal tree size, pruning can be performed, using two different measures of prediction performance:
the total misclassification rate or the total misclassification cost.
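To make the role of the scores concrete, the following minimal sketch computes a generalized Gini impurity for a single node under both cost definitions. It assumes the usual form from Breiman et al. (1984): the sum, over pairs of distinct categories, of the misclassification cost times the product of the two node proportions. The function name `generalized_gini` and the toy node are hypothetical and not part of the package, whose internal implementation may differ (for example in how case weights are handled).

```r
# Hypothetical helper, not part of rpartScore: generalized Gini impurity of a node,
#   sum over k != l of C(k | l) * p(k) * p(l),
# where p(.) are the category proportions in the node and C(k | l) is the
# absolute ("abs") or squared ("quad") difference between scores s_k and s_l.
generalized_gini <- function(node_scores, scores, cost = c("abs", "quad")) {
  cost <- match.arg(cost)
  # Proportion of each score category among the observations in the node
  p <- as.numeric(table(factor(node_scores, levels = scores))) / length(node_scores)
  # Pairwise absolute differences in scores; the diagonal is zero,
  # so terms with k == l drop out automatically
  d <- abs(outer(scores, scores, "-"))
  C <- if (cost == "abs") d else d^2
  sum(C * outer(p, p))
}

scores <- 0:3                 # scores assigned to four ordered categories
node   <- c(0, 0, 1, 3)       # toy node: scores of the observations it contains

gini_abs  <- generalized_gini(node, scores, cost = "abs")   # 1.25
gini_quad <- generalized_gini(node, scores, cost = "quad")  # 3
```

With squared differences, the single observation three categories away from the majority contributes far more to the impurity than it does with absolute differences.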
This package requires the `rpart` package. The main function in this package is `rpartScore`.
The use of this function is almost the same as the `rpart` function. The main difference is the presence of two arguments (`split` and `prune`) instead of the `method` argument.
The argument `split` controls the splitting function used to grow the classification tree, by setting the misclassification costs equal to the absolute (`"abs"`, the default option) or to the squared (`"quad"`) differences in scores.
The argument `prune` allows the user to select the prediction performance measure used to prune the classification tree, and can take two values: `"mr"` (total misclassification rate) or `"mc"` (total misclassification cost, the default option).
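The difference between the two pruning measures can be illustrated with a small base R sketch. The observed and predicted scores below are made up, and the misclassification rate is taken here as a plain count of wrongly predicted observations; how the package scales or accumulates these quantities internally is not shown and is an assumption of this example.

```r
# Hypothetical observed and predicted scores for six observations
observed  <- c(0, 1, 2, 3, 3, 1)
predicted <- c(0, 2, 2, 1, 3, 1)

# "mr"-style measure: every wrong prediction counts the same
total_mr <- sum(observed != predicted)              # 2

# "mc"-style measure: a wrong prediction costs more the further it is
# from the observed score (absolute or squared differences in scores)
total_mc_abs  <- sum(abs(observed - predicted))     # 3
total_mc_quad <- sum((observed - predicted)^2)      # 5
```

Both errors count equally toward the misclassification rate, whereas the prediction that is two categories off dominates the misclassification cost.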
Giuliano Galimberti, Gabriele Soffritti, Matteo Di Maso
Maintainer: Giuliano Galimberti <[email protected]>
Breiman L., Friedman J.H., Olshen R.A., Stone C.J. 1984 Classification and Regression Trees. Wadsworth International.
Galimberti G., Soffritti G., Di Maso M. 2012 Classification Trees for Ordinal Responses in R: The rpartScore Package. Journal of Statistical Software, 47(10), 1-25. doi:10.18637/jss.v047.i10.
Piccarreta R. 2008 Classification Trees for Ordinal Variables. Computational Statistics, 23, 407-427. doi:10.1007/s00180-007-0077-5.
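
    library(rpartScore)  # also makes the rpart functions plotcp() and prune() available
    data("birthwt", package = "MASS")
    # Scores for the ordered birth weight categories (3 = lowest weight class)
    birthwt$Category.s <- ifelse(birthwt$bwt <= 2500, 3,
                          ifelse(birthwt$bwt <= 3000, 2,
                          ifelse(birthwt$bwt <= 3500, 1, 0)))
    # Default settings: split = "abs", prune = "mc"
    T.abs.mc <- rpartScore(Category.s ~ age + lwt + race + smoke + ptl + ht + ui + ftv,
                           data = birthwt)
    plotcp(T.abs.mc)
    T.abs.mc.pruned <- prune(T.abs.mc, cp = 0.02)
    plot(T.abs.mc.pruned)
    text(T.abs.mc.pruned)
    # Squared differences for splitting, misclassification rate for pruning
    T.quad.mr <- rpartScore(Category.s ~ age + lwt + race + smoke + ptl + ht + ui + ftv,
                            split = "quad", prune = "mr", data = birthwt)
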
This function is not invoked directly by the user but is used for its effects in the pruning procedure. See Galimberti et al. (2012) for further details.
Giuliano Galimberti, Gabriele Soffritti, Matteo Di Maso
Galimberti G., Soffritti G., Di Maso M. 2012 Classification Trees for Ordinal Responses in R: The rpartScore Package. Journal of Statistical Software, 47(10), 1-25. doi:10.18637/jss.v047.i10.
This function is not invoked directly by the user but is used for summarizing and visualizing a classification tree. See Galimberti et al. (2012) for further details.
Giuliano Galimberti, Gabriele Soffritti, Matteo Di Maso
Galimberti G., Soffritti G., Di Maso M. 2012 Classification Trees for Ordinal Responses in R: The rpartScore Package. Journal of Statistical Software, 47(10), 1-25. doi:10.18637/jss.v047.i10.
This function allows the user to build classification trees for ordinal responses within the CART framework. The trees are grown using the Generalized Gini impurity function, where the misclassification costs are given by the absolute or squared differences in scores assigned to the categories of the response. Pruning is based on the total misclassification rate or on the total misclassification cost.
rpartScore(formula, data, weights, subset, na.action = na.rpart, split = "abs", prune = "mc", model = FALSE, x = FALSE, y = TRUE, control, ...)
Argument | Description |
---|---|
formula | a formula, as in the |
data | an optional data frame in which to interpret the variables named in the formula. |
weights | optional case weights. |
subset | optional expression saying that only a subset of the rows of the data should be used in the fit. |
na.action | The default action deletes all observations for which |
split | One of "abs" or "quad"; the default is "abs". |
prune | One of "mr" or "mc"; the default is "mc". |
model | if logical: keep a copy of the model frame in the result? If the input value for model is a model frame (likely from an earlier call to the |
x | keep a copy of the |
y | keep a copy of the dependent variable in the result. If missing and |
control | options that control details of the |
... | arguments to |
The use of this function is almost the same as that of the `rpart` function.
It is assumed that a set of (not necessarily linear) numerical scores has been assigned to the ordered categories of the response.
The main difference with respect to the `rpart` function is the presence of two arguments (`split` and `prune`) instead of the `method` argument.
The argument `split` controls the splitting function used to grow the classification tree, by setting the misclassification costs in the generalized Gini impurity function equal to the absolute (`"abs"`, the default option) or to the squared (`"quad"`) differences in scores.
The argument `prune` allows the user to select the prediction performance measure used to prune the classification tree, and can take two values: `"mr"` (total misclassification rate) or `"mc"` (total misclassification cost, the default option).
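Because the scores need not be equally spaced, one convenient way to prepare a response is to map ordered categories to scores through a named lookup vector. The sketch below uses base R only. The weights, the category labels, and the non-linear score values are hypothetical and chosen purely for illustration; the cut points and the equally spaced scores 3, 2, 1, 0 follow the birth weight example of this package.

```r
# Hypothetical birth weights in grams (illustrative values only)
bwt <- c(2100, 2600, 3100, 3600, 2500, 3000, 3500)

# Ordered categories using right-closed intervals:
# <= 2500, (2500, 3000], (3000, 3500], > 3500
category <- cut(bwt,
                breaks = c(-Inf, 2500, 3000, 3500, Inf),
                labels = c("very_low", "low", "medium", "normal"),
                ordered_result = TRUE)

# Equally spaced scores, as in the package example
linear_scores <- c(very_low = 3, low = 2, medium = 1, normal = 0)
# A made-up non-linear alternative that penalizes the lowest class more heavily
nonlinear_scores <- c(very_low = 6, low = 3, medium = 1, normal = 0)

# Look up the score of each observation by its category label
score_linear    <- unname(linear_scores[as.character(category)])
score_nonlinear <- unname(nonlinear_scores[as.character(category)])
```

Either score vector could then be stored as a column of the data frame and used as the response in the model formula; the choice of scores changes the misclassification costs and can therefore change the fitted tree.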
An object of class `rpart`, a superset of class `tree`.
Giuliano Galimberti, Gabriele Soffritti, Matteo Di Maso
Breiman L., Friedman J.H., Olshen R.A., Stone C.J. 1984 Classification and Regression Trees. Wadsworth International.
Galimberti G., Soffritti G., Di Maso M. 2012 Classification Trees for Ordinal Responses in R: The rpartScore Package. Journal of Statistical Software, 47(10), 1-25. doi:10.18637/jss.v047.i10.
Piccarreta R. 2008 Classification Trees for Ordinal Variables. Computational Statistics, 23, 407-427. doi:10.1007/s00180-007-0077-5.
`rpart`, `rpart.control`, `rpart.object`, `summary.rpart`, `print.rpart`
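
    library(rpartScore)  # also makes the rpart functions plotcp() and prune() available
    data("birthwt", package = "MASS")
    # Scores for the ordered birth weight categories (3 = lowest weight class)
    birthwt$Category.s <- ifelse(birthwt$bwt <= 2500, 3,
                          ifelse(birthwt$bwt <= 3000, 2,
                          ifelse(birthwt$bwt <= 3500, 1, 0)))
    # Default settings: split = "abs", prune = "mc"
    T.abs.mc <- rpartScore(Category.s ~ age + lwt + race + smoke + ptl + ht + ui + ftv,
                           data = birthwt)
    plotcp(T.abs.mc)
    T.abs.mc.pruned <- prune(T.abs.mc, cp = 0.02)
    plot(T.abs.mc.pruned)
    text(T.abs.mc.pruned)
    # Squared differences for splitting, misclassification rate for pruning
    T.quad.mr <- rpartScore(Category.s ~ age + lwt + race + smoke + ptl + ht + ui + ftv,
                            split = "quad", prune = "mr", data = birthwt)
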
This function is not invoked directly by the user but is used for its effects in the tree growing procedure. See Galimberti et al. (2012) for further details.
Giuliano Galimberti, Gabriele Soffritti, Matteo Di Maso
Galimberti G., Soffritti G., Di Maso M. 2012 Classification Trees for Ordinal Responses in R: The rpartScore Package. Journal of Statistical Software, 47(10), 1-25. doi:10.18637/jss.v047.i10.