Title: Gradient Boosting
Description: Functional gradient descent algorithm for a variety of convex and non-convex loss functions, for both classical and robust regression and classification problems. See Wang (2011) <doi:10.2202/1557-4679.1304>, Wang (2012) <doi:10.3414/ME11-02-0020>, Wang (2018) <doi:10.1080/10618600.2018.1424635>, Wang (2018) <doi:10.1214/18-EJS1404>.
Authors: Zhu Wang [aut, cre], Torsten Hothorn [ctb]
Maintainer: Zhu Wang <[email protected]>
License: GPL (>= 2)
Version: 0.3-24
Built: 2024-11-02 06:29:08 UTC
Source: CRAN
Gradient boosting for optimizing loss functions with componentwise linear, smoothing splines, tree models as base learners.
bst(x, y, cost = 0.5, family = c("gaussian", "hinge", "hinge2", "binom", "expo",
    "poisson", "tgaussianDC", "thingeDC", "tbinomDC", "binomdDC", "texpoDC",
    "tpoissonDC", "huber", "thuberDC", "clossR", "clossRMM", "closs", "gloss",
    "qloss", "clossMM", "glossMM", "qlossMM", "lar"),
    ctrl = bst_control(), control.tree = list(maxdepth = 1),
    learner = c("ls", "sm", "tree"))
## S3 method for class 'bst'
print(x, ...)
## S3 method for class 'bst'
predict(object, newdata = NULL, newy = NULL, mstop = NULL,
        type = c("response", "all.res", "class", "loss", "error"), ...)
## S3 method for class 'bst'
plot(x, type = c("step", "norm"), ...)
## S3 method for class 'bst'
coef(object, which = object$ctrl$mstop, ...)
## S3 method for class 'bst'
fpartial(object, mstop = NULL, newdata = NULL)
x: a data frame containing the variables in the model.
y: vector of responses.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
family: a variety of loss functions; one of the choices listed in the usage above.
ctrl: an object of class bst_control.
type: in predict, the quantity to return: fitted response, all responses along the boosting path ("all.res"), predicted class, loss, or misclassification error; in plot, whether to plot coefficients against the boosting step ("step") or against the coefficient norm ("norm").
control.tree: control parameters of rpart.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
object: an object of class bst.
newdata: new data for prediction, with the same number of columns as x.
newy: new responses, used when computing loss or error.
mstop: boosting iteration at which to predict.
which: boosting iteration at which coefficients are extracted.
...: additional arguments.
Boosting algorithms for classification and regression problems. In a classification problem, suppose f is a classifier for a response y coded as +1/-1. A cost-sensitive or weighted loss function is

L(y, f, cost) = l(y, f, cost) * max(0, 1 - y*f).

For family="hinge", the weight depends on the class only:

l(y, f, cost) = 1 - cost, if y = +1; = cost, if y = -1.

For family="hinge2", the weight depends on the class and the sign of f:

l(y, f, cost) = 1, if y = +1 and f > 0; = 1 - cost, if y = +1 and f < 0; = cost, if y = -1 and f > 0; = 1, if y = -1 and f < 0.
For twin boosting (twinboost=TRUE), there are two types of adaptive boosting when learner="ls": for twintype=1, weights are based on the coefficients from the first round of boosting; for twintype=2, weights are based on the predictions from the first round of boosting. See Buehlmann and Hothorn (2010).
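A minimal sketch of the two variants with the linear base learner (toy data invented here for illustration; the control fields mirror the package example below, and which of coefir or f.init is required may depend on the learner):

x <- as.data.frame(matrix(rnorm(100*5), ncol = 5))
y <- ifelse(x[, 1] + rnorm(100) > 0, 1, -1)
fit1 <- bst(x, y, family = "hinge", learner = "ls", ctrl = bst_control(mstop = 50))
## second round, twintype = 1: weights from first-round coefficients
fit2 <- bst(x, y, family = "hinge", learner = "ls",
            ctrl = bst_control(twinboost = TRUE, twintype = 1, coefir = coef(fit1),
                               xselect.init = fit1$xselect, mstop = 50))
## second round, twintype = 2: weights from first-round predictions
fit3 <- bst(x, y, family = "hinge", learner = "ls",
            ctrl = bst_control(twinboost = TRUE, twintype = 2, coefir = coef(fit1),
                               xselect.init = fit1$xselect, mstop = 50))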
An object of class bst. For linear models the print, coef, plot and predict methods are available; for nonlinear models the print and predict methods are available. The object contains the following components.
x, y, cost, family, learner, control.tree, maxdepth: the input variables and parameters.
ctrl: the input ctrl object.
yhat: predicted function estimates.
ens: a list of length mstop; each element is the base learner fitted to the negative gradient (pseudo-residuals) at that iteration.
ml.fit: the last element of ens.
ensemble: a vector of length mstop giving the variable selected at each boosting iteration, where applicable.
xselect: variables selected up to iteration mstop.
coef: estimated coefficients at each iteration. Used internally only.
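The S3 methods can be applied to a fitted linear-learner object such as fit1 from the sketch above; a brief illustration (method arguments as documented in the usage section):

coef(fit1, which = 50)                    # coefficients at iteration 50
plot(fit1, type = "step")                 # boosting path plotted by step
predict(fit1, mstop = 30, type = "class") # predicted classes at an earlier iteration
fpartial(fit1, mstop = 30)                # per-covariate partial fits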
Zhu Wang
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Peter Buehlmann and Torsten Hothorn (2010), Twin Boosting: improved feature selection and prediction, Statistics and Computing, 20, 119-138.
cv.bst for cross-validated stopping iteration. Furthermore see bst_control.
x <- matrix(rnorm(100*5), ncol = 5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100, 1, p)
y[y != 1] <- -1
x <- as.data.frame(x)
dat.m <- bst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls")
predict(dat.m)
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost = TRUE, coefir = coef(dat.m),
              xselect.init = dat.m$xselect, mstop = 50))
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop = 50, s = 0, trace = TRUE),
               rfamily = "thinge", learner = "ls")
predict(dat.m2)
Specification of the number of boosting iterations, step size and other parameters for boosting algorithms.
bst_control(mstop = 50, nu = 0.1, twinboost = FALSE, twintype = 1,
            threshold = c("standard", "adaptive"), f.init = NULL, coefir = NULL,
            xselect.init = NULL, center = FALSE, trace = FALSE, numsample = 50,
            df = 4, s = NULL, sh = NULL, q = NULL, qh = NULL, fk = NULL,
            start = FALSE, iter = 10, intercept = FALSE, trun = FALSE)
mstop: an integer giving the number of boosting iterations.
nu: a small number (between 0 and 1) defining the step size or shrinkage parameter.
twinboost: a logical value: TRUE for twin boosting.
twintype: for twinboost=TRUE: 1 for weights based on the coefficients from the first round of boosting, 2 for weights based on the predictions from the first round. Only used with learner="ls".
threshold: one of "standard" or "adaptive"; thresholding scheme used in robust boosting.
f.init: the function estimate from the first round of twin boosting. Only used when twinboost=TRUE.
coefir: the estimated coefficients from the first round of twin boosting. Only used when twinboost=TRUE.
xselect.init: the variables selected in the first round of twin boosting. Only used when twinboost=TRUE.
center: a logical value: TRUE to center the covariates with their means.
trace: a logical value for printing more detailed information during the fitting process.
numsample: number of randomly sampled variables in the first round of twin boosting. Potentially useful in a future implementation.
df: degrees of freedom used in smoothing splines.
s, q: nonconvex loss tuning parameter s, or the frequency q of outliers used to determine s; see Details.
sh, qh: threshold value sh, or frequency qh used to determine it, analogous to s and q.
fk: predicted values at an iteration in the MM algorithm.
start: a logical value; if TRUE, boosting iterations start from the supplied fk values.
iter: number of iterations in the MM algorithm.
intercept: a logical value; if TRUE, an intercept is estimated in the linear predictor.
trun: a logical value; if TRUE, the predicted value at each boosting iteration is truncated at -1, 1, for the "closs"-type families.
Objects to specify parameters of the boosting algorithms implemented in bst, via the ctrl argument.

The s value is for robust nonconvex losses: a smaller s value is more robust to outliers with family="closs", "tbinom", "thinge", "tbinomd", and a larger s value is more robust with family="clossR", "gloss", "qloss".

For family="closs", if s=2, the loss is similar to the squared error loss; if s=1, the loss is an approximation of the hinge loss; for smaller values the loss approaches the 0-1 loss. If s<1, the loss is a nonconvex function of the margin.

The default value of s is -1 if family="thinge", -log(3) if family="tbinom", and 4 if family="binomd". If trun=TRUE, boosting classifiers can produce real values in [-1, 1] indicating their confidence in [-1, 1]-valued classification; cf. R. E. Schapire and Y. Singer, Improved boosting algorithms using confidence-rated predictions, In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pages 80-91, 1998.
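A minimal sketch of how these control fields are typically set, using the suggested values quoted above (the fits themselves would be produced by bst or rbst):

## truncated hinge loss: smaller s is more robust to outliers
ctrl1 <- bst_control(mstop = 100, s = -1)
## C-loss: s = 1 approximates the hinge loss; trun = TRUE truncates each
## boosting prediction to [-1, 1]
ctrl2 <- bst_control(mstop = 100, s = 1, trun = TRUE)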
An object of class bst_control, a list. Note that fk may be updated for robust boosting.
Function to determine the first q predictors in the boosting path, or perform (10-fold) cross-validation and determine the optimal set of parameters
bst.sel(x, y, q, type=c("firstq", "cv"), ...)
x: design matrix (without intercept).
y: continuous response vector for linear regression.
q: maximum number of predictors that should be selected when type="firstq".
type: if "firstq", return the first q predictors entering the boosting path; if "cv", use (10-fold) cross-validation to determine the optimal set of parameters.
...: additional arguments passed to the underlying boosting functions.
Function to determine the first q predictors in the boosting path, or perform (10-fold) cross-validation and determine the optimal set of parameters. This may be used for p-value calculation. See below.
Vector of selected predictors.
Zhu Wang
## Not run: 
x <- matrix(rnorm(100*100), nrow = 100, ncol = 100)
y <- x[,1] * 2 + x[,2] * 2.5 + rnorm(100)
sel <- bst.sel(x, y, q = 10)
library("hdi")
fit.multi <- hdi(x, y, method = "multi.split",
                 model.selector = bst.sel,
                 args.model.selector = list(type = "firstq", q = 10))
fit.multi
fit.multi$pval[1:10] ## the first 10 p-values
fit.multi <- hdi(x, y, method = "multi.split",
                 model.selector = bst.sel,
                 args.model.selector = list(type = "cv"))
fit.multi
fit.multi$pval[1:10] ## the first 10 p-values
## End(Not run)
Cross-validated estimation of the empirical risk/error for boosting parameter selection.
cv.bst(x, y, K = 10, cost = 0.5,
       family = c("gaussian", "hinge", "hinge2", "binom", "expo", "poisson",
                  "tgaussianDC", "thingeDC", "tbinomDC", "binomdDC", "texpoDC",
                  "tpoissonDC", "clossR", "closs", "gloss", "qloss", "lar"),
       learner = c("ls", "sm", "tree"), ctrl = bst_control(),
       type = c("loss", "error"), plot.it = TRUE, main = NULL, se = TRUE,
       n.cores = 2, ...)
x: a data frame containing the variables in the model.
y: vector of responses.
K: K-fold cross-validation.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
family: loss function, as in bst.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
ctrl: an object of class bst_control.
type: cross-validation criterion: for type="loss", the loss function values; for type="error", the misclassification error.
plot.it: a logical value; if TRUE, plot the cross-validated loss or error.
main: title of plot.
se: a logical value; if TRUE, plot standard errors.
n.cores: the number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
...: additional arguments.
An object with the following components.
residmat: empirical risks in each cross-validation fold at the boosting iterations.
mstop: boosting iteration steps at which the CV curve is computed.
cv: the CV curve at each value of mstop.
cv.error: the standard error of the CV curve.
family: loss function type.
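A minimal sketch of using these components to pick the stopping iteration and refit (toy data invented for illustration):

x <- as.data.frame(matrix(rnorm(100*5), ncol = 5))
y <- ifelse(x[, 1] + rnorm(100) > 0, 1, -1)
cvres <- cv.bst(x, y, K = 5, family = "hinge", learner = "ls",
                ctrl = bst_control(mstop = 100), type = "error", plot.it = FALSE)
best <- cvres$mstop[which.min(cvres$cv)]   # iteration minimizing the CV curve
fit <- bst(x, y, family = "hinge", learner = "ls", ctrl = bst_control(mstop = best))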
## Not run: 
x <- matrix(rnorm(100*5), ncol = 5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100, 1, p)
y[y != 1] <- -1
x <- as.data.frame(x)
cv.bst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls", type = "loss")
cv.bst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls", type = "error")
dat.m <- bst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls")
dat.m1 <- cv.bst(x, y, ctrl = bst_control(twinboost = TRUE, coefir = coef(dat.m),
                 xselect.init = dat.m$xselect, mstop = 50), family = "hinge", learner = "ls")
## End(Not run)
Cross-validated estimation of the empirical misclassification error for boosting parameter selection.
cv.mada(x, y, balance = FALSE, K = 10, nu = 0.1, mstop = 200, interaction.depth = 1,
        trace = FALSE, plot.it = TRUE, se = TRUE, ...)
x: a data matrix containing the variables in the model.
y: vector of multi-class responses.
balance: a logical value; if TRUE, the K folds are roughly balanced so that the classes are distributed proportionally among the folds.
K: K-fold cross-validation.
nu: a small number (between 0 and 1) defining the step size or shrinkage parameter.
mstop: number of boosting iterations.
interaction.depth: used in gbm to specify the depth of trees.
trace: if TRUE, iteration results are printed out.
plot.it: a logical value; if TRUE, plot the cross-validation error.
se: a logical value; if TRUE, plot 1 standard deviation curves.
...: additional arguments.
An object with the following components.
residmat: empirical risks in each cross-validation fold at the boosting iterations.
fraction: abscissa values at which the CV curve is computed.
cv: the CV curve at each value of fraction.
cv.error: the standard error of the CV curve.
Cross-validated estimation of the empirical multi-class loss for boosting parameter selection.
cv.mbst(x, y, balance = FALSE, K = 10, cost = NULL,
        family = c("hinge", "hinge2", "thingeDC", "closs", "clossMM"),
        learner = c("tree", "ls", "sm"), ctrl = bst_control(),
        type = c("loss", "error"), plot.it = TRUE, se = TRUE, n.cores = 2, ...)
x: a data frame containing the variables in the model.
y: vector of responses.
balance: a logical value; if TRUE, the K folds are roughly balanced so that the classes are distributed proportionally among the folds.
K: K-fold cross-validation.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
family: loss function, as in mbst.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
ctrl: an object of class bst_control.
type: for type="loss", the cross-validated loss; for type="error", the misclassification error.
plot.it: a logical value; if TRUE, plot the estimated risks.
se: a logical value; if TRUE, plot standard errors.
n.cores: the number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
...: additional arguments.
An object with the following components.
residmat: empirical risks in each cross-validation fold at the boosting iterations.
fraction: abscissa values at which the CV curve is computed.
cv: the CV curve at each value of fraction.
cv.error: the standard error of the CV curve.
Cross-validated estimation of the empirical multi-class hinge loss for boosting parameter selection.
cv.mhingebst(x, y, balance = FALSE, K = 10, cost = NULL, family = "hinge",
             learner = c("tree", "ls", "sm"), ctrl = bst_control(),
             type = c("loss", "error"), plot.it = TRUE, main = NULL, se = TRUE,
             n.cores = 2, ...)
x: a data frame containing the variables in the model.
y: vector of responses.
balance: a logical value; if TRUE, the K folds are roughly balanced so that the classes are distributed proportionally among the folds.
K: K-fold cross-validation.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
family: family = "hinge", implementing the negative gradient corresponding to the loss function to be minimized.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
ctrl: an object of class bst_control.
type: for type="loss", the cross-validated loss; for type="error", the misclassification error.
plot.it: a logical value; if TRUE, plot the cross-validated loss or error.
main: title of plot.
se: a logical value; if TRUE, plot standard errors.
n.cores: the number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
...: additional arguments.
An object with the following components.
residmat: empirical risks in each cross-validation fold at the boosting iterations.
fraction: abscissa values at which the CV curve is computed.
cv: the CV curve at each value of fraction.
cv.error: the standard error of the CV curve.
Cross-validated estimation of the empirical misclassification error for boosting parameter selection.
cv.mhingeova(x, y, balance = FALSE, K = 10, cost = NULL, nu = 0.1,
             learner = c("tree", "ls", "sm"), maxdepth = 1, m1 = 200,
             twinboost = FALSE, m2 = 200, trace = FALSE, plot.it = TRUE,
             se = TRUE, ...)
x: a data frame containing the variables in the model.
y: vector of multi-class responses.
balance: a logical value; if TRUE, the K folds are roughly balanced so that the classes are distributed proportionally among the folds.
K: K-fold cross-validation.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
nu: a small number (between 0 and 1) defining the step size or shrinkage parameter.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
maxdepth: tree depth used when learner="tree".
m1: number of boosting iterations in the first round.
twinboost: logical: twin boosting?
m2: number of twin boosting iterations.
trace: if TRUE, iteration results are printed out.
plot.it: a logical value; if TRUE, plot the estimated risks.
se: a logical value; if TRUE, plot standard errors.
...: additional arguments.
An object with the following components.
residmat: empirical risks in each cross-validation fold at the boosting iterations.
fraction: abscissa values at which the CV curve is computed.
cv: the CV curve at each value of fraction.
cv.error: the standard error of the CV curve.
The functions for balanced cross-validation were adapted from the R package pamr.
Cross-validated estimation of the empirical risk/error, which can be used for tuning parameter selection.
cv.rbst(x, y, K = 10, cost = 0.5,
        rfamily = c("tgaussian", "thuber", "thinge", "tbinom", "binomd", "texpo",
                    "tpoisson", "clossR", "closs", "gloss", "qloss"),
        learner = c("ls", "sm", "tree"), ctrl = bst_control(),
        type = c("loss", "error"), plot.it = TRUE, main = NULL, se = TRUE,
        n.cores = 2, ...)
x: a data frame containing the variables in the model.
y: vector of responses.
K: K-fold cross-validation.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
rfamily: nonconvex loss function type, as in rbst.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
ctrl: an object of class bst_control.
type: cross-validation criterion: for type="loss", the loss function values; for type="error", the misclassification error.
plot.it: a logical value; if TRUE, plot the cross-validated loss or error.
main: title of plot.
se: a logical value; if TRUE, plot standard errors.
n.cores: the number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
...: additional arguments.
An object with the following components.
residmat: empirical risks in each cross-validation fold at the boosting iterations.
mstop: boosting iteration steps at which the CV curve is computed.
cv: the CV curve at each value of mstop.
cv.error: the standard error of the CV curve.
rfamily: nonconvex loss function type.
Zhu Wang
## Not run: 
x <- matrix(rnorm(100*5), ncol = 5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100, 1, p)
y[y != 1] <- -1
x <- as.data.frame(x)
cv.rbst(x, y, ctrl = bst_control(mstop = 50), rfamily = "thinge", learner = "ls", type = "loss")
cv.rbst(x, y, ctrl = bst_control(mstop = 50), rfamily = "thinge", learner = "ls", type = "error")
dat.m <- rbst(x, y, ctrl = bst_control(mstop = 50), rfamily = "thinge", learner = "ls")
dat.m1 <- cv.rbst(x, y, ctrl = bst_control(twinboost = TRUE, coefir = coef(dat.m),
                  xselect.init = dat.m$xselect, mstop = 50), rfamily = "thinge", learner = "ls")
## End(Not run)
Cross-validated estimation of the empirical multi-class loss, which can be used for tuning parameter selection.
cv.rmbst(x, y, balance = FALSE, K = 10, cost = NULL, rfamily = c("thinge", "closs"),
         learner = c("tree", "ls", "sm"), ctrl = bst_control(),
         type = c("loss", "error"), plot.it = TRUE, main = NULL, se = TRUE,
         n.cores = 2, ...)
x: a data frame containing the variables in the model.
y: vector of responses.
balance: a logical value; if TRUE, the K folds are roughly balanced so that the classes are distributed proportionally among the folds.
K: K-fold cross-validation.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
rfamily: "thinge" or "closs", implementing the negative gradient corresponding to the loss function to be minimized.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
ctrl: an object of class bst_control.
type: loss value or misclassification error.
plot.it: a logical value; if TRUE, plot the cross-validated loss or error.
main: title of plot.
se: a logical value; if TRUE, plot standard errors.
n.cores: the number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
...: additional arguments.
An object with the following components.
residmat: empirical risks in each cross-validation fold at the boosting iterations.
fraction: abscissa values at which the CV curve is computed.
cv: the CV curve at each value of fraction.
cv.error: the standard error of the CV curve.
Zhu Wang
Randomly generate data for a three-class model.
ex1data(n.data, p=50)
n.data: number of data samples.
p: number of predictors.
The data are generated based on Example 1 described in Wang (2012).
A list with the n.data by p predictor matrix x, the three-class response y, and the conditional probabilities.
Zhu Wang
Zhu Wang (2012), Multi-class HingeBoost: Method and Application to the Classification of Cancer Types Using Gene Expression Data. Methods of Information in Medicine, 51(2), 162–7.
## Not run: 
dat <- ex1data(100, p = 5)
mhingebst(x = dat$x, y = dat$y)
## End(Not run)
One-vs-all multi-class AdaBoost
mada(xtr, ytr, xte=NULL, yte=NULL, mstop=50, nu=0.1, interaction.depth=1)
xtr: training data matrix containing the predictor variables in the model.
ytr: training vector of responses.
xte: test data matrix containing the predictor variables in the model.
yte: test vector of responses.
mstop: number of boosting iterations.
nu: a small number (between 0 and 1) defining the step size or shrinkage parameter.
interaction.depth: used in gbm to specify the depth of trees.
For a C-class problem (C > 2), each class is separately compared against all other classes with AdaBoost, and C functions are estimated to represent confidence for each class. The classification rule is to assign the class with the largest estimate.
A list containing the selected variables xselect and the training and test errors err.tr and err.te.
Zhu Wang
cv.mada for cross-validated stopping iteration.
data(iris)
mada(xtr = iris[, -5], ytr = iris[, 5])
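A hedged extension of the example above: holding out part of iris as a test set so that the test error err.te is returned alongside the training error.

idx <- sample(nrow(iris), 100)
fit <- mada(xtr = iris[idx, -5], ytr = iris[idx, 5],
            xte = iris[-idx, -5], yte = iris[-idx, 5], mstop = 50)
fit$err.tr   # training error over the boosting iterations
fit$err.te   # test error over the boosting iterations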
Gradient boosting for optimizing multi-class loss functions with componentwise linear, smoothing splines, tree models as base learners.
mbst(x, y, cost = NULL, family = c("hinge", "hinge2", "thingeDC", "closs", "clossMM"),
     ctrl = bst_control(), control.tree = list(fixed.depth = TRUE, n.term.node = 6,
     maxdepth = 1), learner = c("ls", "sm", "tree"))
## S3 method for class 'mbst'
print(x, ...)
## S3 method for class 'mbst'
predict(object, newdata = NULL, newy = NULL, mstop = NULL,
        type = c("response", "class", "loss", "error"), ...)
## S3 method for class 'mbst'
fpartial(object, mstop = NULL, newdata = NULL)
x: a data frame containing the variables in the model.
y: vector of responses.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
family: loss function; see Details.
ctrl: an object of class bst_control.
control.tree: control parameters of rpart.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
type: in predict, the quantity to return: fitted response, predicted class, loss, or misclassification error.
object: an object of class mbst.
newdata: new data for prediction, with the same number of columns as x.
newy: new responses, used when computing loss or error.
mstop: boosting iteration at which to predict.
...: additional arguments.
A linear or nonlinear classifier is fitted using a boosting algorithm for multi-class responses. This function differs from mhingebst in how the sum-to-zero constraint and the loss functions are handled. If family="hinge", the loss function is the same as in mhingebst but the boosting algorithm is different. If family="hinge2", the loss function differs from family="hinge": the response is not recoded as in Wang (2012). family="thingeDC" is the robust loss function used in the DCB algorithm; see rmbst.
An object of class mbst. For linear models the print, coef, plot and predict methods are available; for nonlinear models the print and predict methods are available. The object contains the following components.
x, y, cost, family, learner, control.tree, maxdepth: the input variables and parameters.
ctrl: the input ctrl object.
yhat: predicted function estimates.
ens: a list of length mstop; each element is the base learner fitted to the negative gradient (pseudo-residuals) at that iteration.
ml.fit: the last element of ens.
ensemble: a vector of length mstop giving the variable selected at each boosting iteration, where applicable.
xselect: variables selected up to iteration mstop.
coef: estimated coefficients at each iteration. Used internally only.
Zhu Wang
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Zhu Wang (2012), Multi-class HingeBoost: Method and Application to the Classification of Cancer Types Using Gene Expression Data. Methods of Information in Medicine, 51(2), 162–7.
cv.mbst for cross-validated stopping iteration. Furthermore see bst_control.
x <- matrix(rnorm(100*5), ncol = 5)
c <- quantile(x[,1], prob = c(0.33, 0.67))
y <- rep(1, 100)
y[x[,1] > c[1] & x[,1] < c[2]] <- 2
y[x[,1] > c[2]] <- 3
x <- as.data.frame(x)
dat.m <- mbst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls")
predict(dat.m)
dat.m1 <- mbst(x, y, ctrl = bst_control(twinboost = TRUE, f.init = predict(dat.m),
               xselect.init = dat.m$xselect, mstop = 50))
dat.m2 <- rmbst(x, y, ctrl = bst_control(mstop = 50, s = 1, trace = TRUE),
                rfamily = "thinge", learner = "ls")
predict(dat.m2)
Gradient boosting for optimizing multi-class hinge loss functions with componentwise linear least squares, smoothing splines and trees as base learners.
mhingebst(x, y, cost = NULL, family = c("hinge"), ctrl = bst_control(),
          control.tree = list(fixed.depth = TRUE, n.term.node = 6, maxdepth = 1),
          learner = c("ls", "sm", "tree"))
## S3 method for class 'mhingebst'
print(x, ...)
## S3 method for class 'mhingebst'
predict(object, newdata = NULL, newy = NULL, mstop = NULL,
        type = c("response", "class", "loss", "error"), ...)
## S3 method for class 'mhingebst'
fpartial(object, mstop = NULL, newdata = NULL)
x: a data frame containing the variables in the model.
y: vector of responses.
cost: equal costs for now; unequal costs will be implemented in the future.
family: family = "hinge", implementing the negative gradient corresponding to the multi-class hinge loss to be minimized.
ctrl: an object of class bst_control.
control.tree: control parameters of rpart.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
type: in predict, the quantity to return: fitted response, predicted class, loss, or misclassification error.
object: an object of class mhingebst.
newdata: new data for prediction, with the same number of columns as x.
newy: new responses, used when computing loss or error.
mstop: boosting iteration at which to predict.
...: additional arguments.
A linear or nonlinear classifier is fitted using a boosting algorithm based on component-wise base learners for multi-class responses.
An object of class mhingebst, with print and predict methods available for fitted models.
Zhu Wang
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Zhu Wang (2012), Multi-class HingeBoost: Method and Application to the Classification of Cancer Types Using Gene Expression Data. Methods of Information in Medicine, 51(2), 162–7.
cv.mhingebst for cross-validated stopping iteration. Furthermore see bst_control.
## Not run: 
dat <- ex1data(100, p = 5)
res <- mhingebst(x = dat$x, y = dat$y)
## End(Not run)
Multi-class algorithm with one-vs-all binary HingeBoost which optimizes the hinge loss functions with componentwise linear, smoothing splines, tree models as base learners.
mhingeova(xtr, ytr, xte = NULL, yte = NULL, cost = NULL, nu = 0.1,
          learner = c("tree", "ls", "sm"), maxdepth = 1, m1 = 200,
          twinboost = FALSE, m2 = 200)
## S3 method for class 'mhingeova'
print(x, ...)
xtr: training data containing the predictor variables.
ytr: vector of training data responses.
xte: test data containing the predictor variables.
yte: vector of test data responses.
cost: default is NULL for equal costs; otherwise a numeric vector giving the price to pay for a false positive in each one-vs-all binary problem, 0 < cost < 1.
nu: a small number (between 0 and 1) defining the step size or shrinkage parameter.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
maxdepth: tree depth used when learner="tree".
m1: number of boosting iterations in the first round.
twinboost: logical: twin boosting?
m2: number of twin boosting iterations.
x: an object of class mhingeova.
...: additional arguments.
For a C-class problem (C > 2), each class is separately compared against all other classes with HingeBoost, and C functions are estimated to represent confidence for each class. The classification rule is to assign the class with the largest estimate. A linear or nonlinear multi-class HingeBoost classifier is fitted using a boosting algorithm based on one-against-all binary HingeBoost with component-wise base learners for +1/-1 responses, with a possibly cost-sensitive hinge loss function.
An object of class mhingeova, with a print method available.
Zhu Wang
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Zhu Wang (2012), Multi-class HingeBoost: Method and Application to the Classification of Cancer Types Using Gene Expression Data. Methods of Information in Medicine, 51(2), 162–7.
bst for HingeBoost binary classification. Furthermore see cv.bst for stopping iteration selection by cross-validation, and bst_control for control parameters.
## Not run: 
dat1 <- read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/thyroid-disease/ann-train.data")
dat2 <- read.table("http://archive.ics.uci.edu/ml/machine-learning-databases/thyroid-disease/ann-test.data")
res <- mhingeova(xtr = dat1[,-22], ytr = dat1[,22], xte = dat2[,-22], yte = dat2[,22],
                 cost = c(2/3, 0.5, 0.5), nu = 0.5, learner = "ls", m1 = 100, K = 5,
                 cv1 = FALSE, twinboost = TRUE, m2 = 200, cv2 = FALSE)
res <- mhingeova(xtr = dat1[,-22], ytr = dat1[,22], xte = dat2[,-22], yte = dat2[,22],
                 cost = c(2/3, 0.5, 0.5), nu = 0.5, learner = "ls", m1 = 100, K = 5,
                 cv1 = FALSE, twinboost = TRUE, m2 = 200, cv2 = TRUE)
## End(Not run)
Find Number of Variables In Multi-class Boosting Iterations
nsel(object, mstop)
object: a fitted multi-class boosting object, for example from mhingebst.
mstop: boosting iteration number.
A vector of length mstop giving the number of variables selected at each boosting iteration.
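A minimal sketch, assuming (as suggested above) that nsel accepts a fitted multi-class boosting object such as one from mhingebst:

## Not run: 
dat <- ex1data(100, p = 5)
fit <- mhingebst(x = dat$x, y = dat$y, ctrl = bst_control(mstop = 20))
nsel(fit, mstop = 20)   # number of selected variables at iterations 1..20
## End(Not run)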
Zhu Wang
MM (majorization/minimization) algorithm based gradient boosting for optimizing nonconvex robust loss functions with componentwise linear, smoothing splines, tree models as base learners.
rbst(x, y, cost = 0.5,
     rfamily = c("tgaussian", "thuber", "thinge", "tbinom", "binomd", "texpo",
                 "tpoisson", "clossR", "closs", "gloss", "qloss"),
     ctrl = bst_control(), control.tree = list(maxdepth = 1),
     learner = c("ls", "sm", "tree"), del = 1e-10)
x: a data frame containing the variables in the model.
y: vector of responses.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
rfamily: robust loss function, see Details.
ctrl: an object of class bst_control.
control.tree: control parameters of rpart.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
del: convergence criterion.
An MM algorithm operates by creating a convex surrogate function that majorizes the nonconvex objective function. When the surrogate function is minimized with a gradient boosting algorithm, the desired objective function is decreased. The MM algorithm contains the difference of convex (DC) algorithm for rfamily=c("tgaussian", "thuber", "thinge", "tbinom", "binomd", "texpo", "tpoisson") and the quadratic majorization boosting algorithm (QMBA) for rfamily=c("clossR", "closs", "gloss", "qloss").

rfamily = "tgaussian" for truncated square error loss, "thuber" for truncated Huber loss, "thinge" for truncated hinge loss, "tbinom" for truncated logistic loss, "binomd" for logistic difference loss, "texpo" for truncated exponential loss, "tpoisson" for truncated Poisson loss, "clossR" for C-loss in regression, "closs" for C-loss in classification, "gloss" for G-loss, "qloss" for Q-loss.

s must be a numeric value specified in bst_control. For rfamily="thinge", "tbinom", "texpo", s < 0. For rfamily="binomd", "tpoisson", "closs", "qloss", "clossR", s > 0, and for rfamily="gloss", s > 1. Some suggested s values: "thinge" = -1, "tbinom" = -log(3), "binomd" = log(4), "texpo" = log(0.5), "closs" = 1, "gloss" = 1.5, "qloss" = 2, "clossR" = 1.
An object of class bst. For linear models the print, coef, plot and predict methods are available; for nonlinear models the print and predict methods are available. The object contains the following components.
x, y, cost, rfamily, learner, control.tree, maxdepth: the input variables and parameters.
ctrl: the input ctrl object.
yhat: predicted function estimates.
ens: a list of length mstop; each element is the base learner fitted to the negative gradient (pseudo-residuals) at that iteration.
ml.fit: the last element of ens.
ensemble: a vector of length mstop giving the variable selected at each boosting iteration, where applicable.
xselect: variables selected up to iteration mstop.
coef: estimated coefficients at each iteration. Used internally only.
Zhu Wang
Zhu Wang (2018), Quadratic Majorization for Nonconvex Loss with Applications to the Boosting Algorithm, Journal of Computational and Graphical Statistics, 27(3), 491-502, doi:10.1080/10618600.2018.1424635
Zhu Wang (2018), Robust boosting with truncated loss functions, Electronic Journal of Statistics, 12(1), 599-650, doi:10.1214/18-EJS1404
cv.rbst for cross-validated stopping iteration. Furthermore see bst_control.
x <- matrix(rnorm(100*5), ncol = 5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100, 1, p)
y[y != 1] <- -1
y[1:10] <- -y[1:10]
x <- as.data.frame(x)
dat.m <- bst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls")
predict(dat.m)
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost = TRUE, coefir = coef(dat.m),
              xselect.init = dat.m$xselect, mstop = 50))
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop = 50, s = 0, trace = TRUE),
               rfamily = "thinge", learner = "ls")
predict(dat.m2)
Gradient boosting path for optimizing robust loss functions with componentwise linear, smoothing splines, tree models as base learners. See details below before use.
rbstpath(x, y, rmstop=seq(40, 400, by=20), ctrl=bst_control(), del=1e-16, ...)
x: a data frame containing the variables in the model.
y: vector of responses.
rmstop: vector of boosting iterations.
ctrl: an object of class bst_control.
del: convergence criterion.
...: arguments passed to rbst.
This function invokes rbst with mstop being each element of the vector rmstop. It can provide different paths, so rmstop serves as another hyper-parameter. However, the most important hyper-parameter is the loss truncation point, i.e., the point that determines the level of nonconvexity. This is an experimental function and may not be needed in practice.
A list of length length(rmstop), with each element being an object of class rbst fitted with the corresponding number of boosting iterations.
Zhu Wang
x <- matrix(rnorm(100*5), ncol = 5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100, 1, p)
y[y != 1] <- -1
y[1:10] <- -y[1:10]
x <- as.data.frame(x)
dat.m <- bst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls")
predict(dat.m)
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost = TRUE, coefir = coef(dat.m),
              xselect.init = dat.m$xselect, mstop = 50))
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop = 50, s = 0, trace = TRUE),
               rfamily = "thinge", learner = "ls")
predict(dat.m2)
rmstop <- seq(10, 40, by = 10)
dat.m3 <- rbstpath(x, y, rmstop, ctrl = bst_control(s = 0), rfamily = "thinge", learner = "ls")
MM (majorization/minimization) based gradient boosting for optimizing nonconvex robust loss functions with componentwise linear, smoothing splines, tree models as base learners.
rmbst(x, y, cost = 0.5, rfamily = c("thinge", "closs"), ctrl = bst_control(),
      control.tree = list(maxdepth = 1), learner = c("ls", "sm", "tree"), del = 1e-10)
x: a data frame containing the variables in the model.
y: vector of responses.
cost: price to pay for a false positive, 0 < cost < 1; 1 - cost is the price for a false negative.
rfamily: "thinge" or "closs"; see Details.
ctrl: an object of class bst_control.
control.tree: control parameters of rpart.
learner: a character specifying the component-wise base learner: "ls" linear models, "sm" smoothing splines, "tree" regression trees.
del: convergence criterion.
An MM algorithm operates by creating a convex surrogate function that majorizes the nonconvex objective function. When the surrogate function is minimized with a gradient boosting algorithm, the desired objective function is decreased. The MM algorithm contains the difference of convex (DC) algorithm for rfamily="thinge" and the quadratic majorization boosting algorithm (QMBA) for rfamily="closs".
An object of class bst. For linear models the print, coef, plot and predict methods are available; for nonlinear models the print and predict methods are available. The object contains the following components.
x, y, cost, rfamily, learner, control.tree, maxdepth: the input variables and parameters.
ctrl: the input ctrl object.
yhat: predicted function estimates.
ens: a list of length mstop; each element is the base learner fitted to the negative gradient (pseudo-residuals) at that iteration.
ml.fit: the last element of ens.
ensemble: a vector of length mstop giving the variable selected at each boosting iteration, where applicable.
xselect: variables selected up to iteration mstop.
coef: estimated coefficients at each iteration. Used internally only.
Zhu Wang
Zhu Wang (2018), Quadratic Majorization for Nonconvex Loss with Applications to the Boosting Algorithm, Journal of Computational and Graphical Statistics, 27(3), 491-502, doi:10.1080/10618600.2018.1424635
Zhu Wang (2018), Robust boosting with truncated loss functions, Electronic Journal of Statistics, 12(1), 599-650, doi:10.1214/18-EJS1404
cv.rmbst for cross-validated stopping iteration. Furthermore see bst_control.
x <- matrix(rnorm(100*5), ncol = 5)
c <- quantile(x[,1], prob = c(0.33, 0.67))
y <- rep(1, 100)
y[x[,1] > c[1] & x[,1] < c[2]] <- 2
y[x[,1] > c[2]] <- 3
x <- as.data.frame(x)
dat.m <- mbst(x, y, ctrl = bst_control(mstop = 50), family = "hinge", learner = "ls")
predict(dat.m)
dat.m1 <- mbst(x, y, ctrl = bst_control(twinboost = TRUE, f.init = predict(dat.m),
               xselect.init = dat.m$xselect, mstop = 50))
dat.m2 <- rmbst(x, y, ctrl = bst_control(mstop = 50, s = 1, trace = TRUE),
                rfamily = "thinge", learner = "ls")
predict(dat.m2)