Title: | Probabilistic Regression Trees |
---|---|
Description: | Probabilistic Regression Trees (PRTree). Functions for fitting and predicting PRTree models with some adaptations to handle missing values. The main calculations are performed in 'FORTRAN', resulting in highly efficient algorithms. This package's implementation is based on the PRTree methodology described in Alkhoury, S.; Devijver, E.; Clausel, M.; Tami, M.; Gaussier, E.; Oppenheim, G. (2020) - "Smooth And Consistent Probabilistic Regression Trees" <https://proceedings.neurips.cc/paper_files/paper/2020/file/8289889263db4a40463e3f358bb7c7a1-Paper.pdf>. |
Authors: | Alisson Silva Neimaier [aut, cre], Taiane Schaedler Prass [aut, ths] |
Maintainer: | Alisson Silva Neimaier <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.2 |
Built: | 2024-11-28 06:51:27 UTC |
Source: | CRAN |
Probabilistic Regression Trees (PRTrees)
pr_tree(y, X, sigma_grid = NULL, max_terminal_nodes = 15L, cp = 0.01, max_depth = 5L, n_min = 5L, perc_x = 0.1, p_min = 0.05)
y |
a numeric vector corresponding to the dependent variable |
X |
a numeric vector, matrix or data frame corresponding to the independent variables, with the same number of observations as y |
sigma_grid |
optionally, a numeric vector with candidate values for the parameter sigma |
max_terminal_nodes |
a non-negative integer. The maximum number of regions in the output tree. The default is 15. |
cp |
a positive numeric value. The complexity parameter. Any split that does not decrease the MSE by a factor of |
max_depth |
a non-negative integer. The maximum depth of the decision tree. The depth is defined as the length of the longest path from the root to a leaf. The default is 5. |
n_min |
a non-negative integer. The minimum number of observations in a final node. The default is 5. |
perc_x |
a positive numeric value. Given a column of |
p_min |
a positive numeric value. A threshold probability that controls the splitting process. A splitting attempt is made in a given region only when the proportion of rows with probability higher than |
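As a rough illustration of how the arguments above fit together, the following sketch fits a deliberately small tree. It assumes the PRTree package is installed (the call is skipped otherwise); the specific values chosen for sigma_grid, max_terminal_nodes, max_depth and n_min are arbitrary examples, not recommendations.

```r
# Simulated data, following the package example.
set.seed(1234)
X <- matrix(runif(200, 0, 10), ncol = 1)
y <- matrix(cos(X) + rnorm(200, 0, 0.05), ncol = 1)

# Illustrative settings only: a short candidate grid and a shallow tree.
if (requireNamespace("PRTree", quietly = TRUE)) {
  reg_small <- PRTree::pr_tree(
    y, X,
    sigma_grid = c(0.25, 0.5, 1),   # candidate values (arbitrary choice)
    max_terminal_nodes = 6L,        # at most 6 regions
    max_depth = 3L,                 # at most 3 levels of splits
    n_min = 10L                     # at least 10 observations per final node
  )
  print(reg_small$sigma)            # value kept from the candidate grid
  print(reg_small$MSE)              # MSE of the returned tree
}
```

Tightening max_depth or max_terminal_nodes would be expected to give a coarser fit with a larger MSE.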
yhat |
the estimated values for y |
P |
the matrix of probabilities calculated with the observations in |
gamma |
the values of the |
MSE |
the mean squared error calculated for the returned tree |
sigma |
the |
nodes_matrix_info |
information related to each node of the returned tree |
regions |
information related to each region of the returned tree |
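To make the roles of P, gamma, yhat and MSE more concrete, here is a toy sketch in base R. It does not call PRTree and is not the package's FORTRAN implementation; it assumes, following my reading of the cited methodology, Gaussian noise on the covariate, least-squares region weights, and fitted values of the form P %*% gamma. The split point and the value of sigma are hypothetical.

```r
# Toy data: one covariate, noisy cosine response.
set.seed(1)
n <- 50
x <- sort(runif(n, 0, 10))
y <- cos(x) + rnorm(n, 0, 0.05)

# Two hypothetical regions, (-Inf, 5] and (5, Inf), with noise sd sigma.
sigma <- 0.5
split_point <- 5
p_left <- pnorm((split_point - x) / sigma)  # P(x + noise <= split_point)
P <- cbind(p_left, 1 - p_left)              # n x 2 matrix; each row sums to 1

# Least-squares weights, one per region.
gamma <- solve(crossprod(P), crossprod(P, y))

# Smooth fitted values and the resulting mean squared error.
yhat <- as.vector(P %*% gamma)
mse <- mean((y - yhat)^2)
```

Because each row of P sums to one, the fit is a convex combination of the region weights, which is what makes the prediction vary smoothly across the split.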
# Simulate a noisy cosine curve
set.seed(1234)
X = matrix(runif(200, 0, 10), ncol = 1)
eps = matrix(rnorm(200, 0, 0.05), ncol = 1)
y = matrix(cos(X) + eps, ncol = 1)

# Fit a tree with at most 9 regions
reg = PRTree::pr_tree(y, X, max_terminal_nodes = 9)

# Fitted curve (blue line) over the observed data (red points)
plot(X[order(X)], reg$yhat[order(X)], xlab = 'x', ylab = 'cos(x)', col = 'blue', type = 'l')
points(X[order(X)], y[order(X)], col = 'red')
Predicted values based on a prtree object.
## S3 method for class 'prtree'
predict(object, newdata, ...)
object |
Object of class inheriting from 'prtree' |
newdata |
A matrix with new values for the covariates. |
... |
further arguments passed to or from other methods. |
A list with the following elements
yhat |
The predicted values. |
newdata |
The matrix with the new covariate values. |
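A minimal sketch of the predict method in use, assuming the PRTree package is installed (the fit is skipped otherwise). The grid X_new is a hypothetical set of new covariate values, built as a one-column matrix to match the layout of X used for fitting.

```r
# Training data, following the package example.
set.seed(1234)
X <- matrix(runif(200, 0, 10), ncol = 1)
y <- matrix(cos(X) + rnorm(200, 0, 0.05), ncol = 1)

# Hypothetical new covariate values: an evenly spaced grid on [0, 10].
X_new <- matrix(seq(0, 10, length.out = 50), ncol = 1)

if (requireNamespace("PRTree", quietly = TRUE)) {
  reg <- PRTree::pr_tree(y, X, max_terminal_nodes = 9)
  pred <- predict(reg, newdata = X_new)

  # pred is a list: yhat holds the predictions, newdata echoes the input.
  str(pred)
}
```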