Title: | Plot a Model's Residuals, Response, and Partial Dependence Plots |
---|---|
Description: | Plot model surfaces for a wide variety of models using partial dependence plots and other techniques. Also plot model residuals and other information on the model. |
Authors: | Stephen Milborrow [aut, cre] |
Maintainer: | Stephen Milborrow <[email protected]> |
License: | GPL-3 |
Version: | 3.6.4 |
Built: | 2024-10-31 06:56:58 UTC |
Source: | CRAN |
Plot a gbm model showing the training and other error curves.
plot_gbm(object=stop("no 'object' argument"), smooth = c(0, 0, 0, 1), col = c(1, 2, 3, 4), ylim = "auto", legend.x = NULL, legend.y = NULL, legend.cex = .8, grid.col = NA, n.trees = NA, col.n.trees ="darkgray", ...)
object |
The |
smooth |
Four-element vector specifying if smoothing should be applied
to the train, test, CV, and OOB curves respectively.
When smoothing is specified, a smoothed curve is plotted and the
minimum is calculated from the smoothed curve. |
col |
Four-element vector specifying the colors for the train, test, CV, and OOB
curves respectively. |
ylim |
The default |
legend.x |
The x position of the legend.
The default positions the legend automatically. |
legend.y |
The y position of the legend. |
legend.cex |
The legend |
grid.col |
Default |
n.trees |
For use by |
col.n.trees |
For use by |
... |
Dot arguments are passed internally to
|
This function returns a four-element vector specifying the number of trees at the train, test, CV, and OOB minima respectively.
The minima are calculated after smoothing as specified by this function's smooth argument.
By default, only the OOB curve is smoothed.
The smoothing algorithm for the OOB curve differs slightly from gbm.perf, so it can give a slightly different number of trees.
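As a rough sketch of how the returned vector might be used (this assumes the gbm and plotmo packages are installed; the toy data mirrors the example further down, and the names ntrees and ntrees.smooth are only illustrative):

```r
if (require(gbm) && require(plotmo)) {
    set.seed(1)                          # make the toy data reproducible
    n <- 100
    data <- data.frame(x1 = 3 * runif(n),
                       x2 = 3 * runif(n),
                       x3 = sample(1:4, n, replace=TRUE))
    data$y <- data$x1 + data$x2 + data$x3 + rnorm(n, 0, .3)
    mod <- gbm(y~., data=data, distribution="gaussian",
               n.trees=300, shrinkage=.1, interaction.depth=3,
               train.fraction=.8, verbose=FALSE)

    # four-element vector: number of trees at the train, test, CV, and OOB minima
    ntrees <- plot_gbm(mod)
    print(ntrees)

    # additionally smooth the test curve before locating its minimum
    ntrees.smooth <- plot_gbm(mod, smooth=c(0, 1, 0, 1))
    print(ntrees.smooth)
}
```

Because the test minimum in the second call is taken from a smoothed curve, it need not match the test minimum from the first call.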
The OOB curve
The OOB curve is artificially rescaled to force it into the plot. See Chapter 7 in the plotres vignette.
Interaction with plotres
When invoking this function via plotres, prefix any argument of plotres with w1. to tell plotres to pass the argument to this function.
For example give w1.ylim=c(0,10) to plotres (plain ylim=c(0,10) in this context gets passed to the residual plots).
Acknowledgments
This function is derived from code in the gbm package authored by Greg Ridgeway and others.
Chapter 7 in the plotres vignette discusses this function.
if (require(gbm)) {
    n <- 100                            # toy model for quick demo
    x1 <- 3 * runif(n)
    x2 <- 3 * runif(n)
    x3 <- sample(1:4, n, replace=TRUE)
    y <- x1 + x2 + x3 + rnorm(n, 0, .3)
    data <- data.frame(y=y, x1=x1, x2=x2, x3=x3)
    mod <- gbm(y~., data=data, distribution="gaussian",
               n.trees=300, shrinkage=.1, interaction.depth=3,
               train.fraction=.8, verbose=FALSE)
    plot_gbm(mod)
    # plotres(mod)                      # plot residuals
    # plotmo(mod)                       # plot regression surfaces
}
Plot the coefficient paths of a glmnet model.
An enhanced version of plot.glmnet.
plot_glmnet(x = stop("no 'x' argument"), xvar = c("rlambda", "lambda", "norm", "dev"), label = 10, nresponse = NA, grid.col = NA, s = NA, ...)
x |
The |
xvar |
What gets plotted along the x axis. One of: |
label |
Default |
nresponse |
Which response to plot for multiple response models. |
grid.col |
Default |
s |
For use by |
... |
Dot arguments are passed internally to
Use |
Limitations
For multiple response models use the nresponse argument to specify which response should be plotted.
(Currently each response must be plotted one by one.)
The type.coef argument of plot.glmnet is currently not supported.
Currently xvar="norm" is not supported for multiple response models (you will get an error message).
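A minimal sketch of plotting each response in turn, assuming the glmnet and plotmo packages are installed. The two-column response and the choice of family="mgaussian" are illustrative assumptions, not taken from this page:

```r
if (require(glmnet) && require(plotmo)) {
    set.seed(1)                               # reproducible toy data
    x <- matrix(rnorm(100 * 5), 100, 5)       # n=100 p=5
    y <- cbind(y1 = x[,1] + rnorm(100),       # two responses
               y2 = x[,2] - x[,3] + rnorm(100))
    mod <- glmnet(x, y, family="mgaussian")   # multiple response model

    # each response must be plotted one by one
    plot_glmnet(mod, nresponse=1)
    plot_glmnet(mod, nresponse=2)
}
```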
Interaction with plotres
When invoking this function via plotres, prefix any argument of plotres with w1. to tell plotres to pass the argument to this function.
For example give w1.col=1:4 to plotres (plain col=1:4 in this context gets passed to the residual plots).
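The w1. prefix described above might be used as follows. This is a sketch only, assuming the glmnet and plotmo packages are installed and reusing the toy data from the example below:

```r
if (require(glmnet) && require(plotmo)) {
    set.seed(1)                               # reproducible toy data
    x <- matrix(rnorm(100 * 10), 100, 10)     # n=100 p=10
    y <- x[,1] + x[,2] + 2 * rnorm(100)
    mod <- glmnet(x, y)

    # w1.col is handed to plot_glmnet (the coefficient-path plot);
    # a plain col=1:4 would instead go to the residual plots
    plotres(mod, w1.col=1:4)
}
```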
Acknowledgments
This function is based on plot.glmnet in the glmnet package authored by Jerome Friedman, Trevor Hastie, and Rob Tibshirani.
This function incorporates the function spread.labs from the orphaned package TeachingDemos written by Greg Snow.
Chapter 6 in the plotres vignette discusses this function.
if (require(glmnet)) {
    x <- matrix(rnorm(100 * 10), 100, 10)   # n=100 p=10
    y <- x[,1] + x[,2] + 2 * rnorm(100)     # y depends only on x[,1] and x[,2]
    mod <- glmnet(x, y)
    plot_glmnet(mod)
    # plotres(mod)                          # plot the residuals
}
Plot model surfaces for a wide variety of models.
This function plots the model's response when varying one or two predictors while holding the other predictors constant (a poor man's partial-dependence plot).
It can also generate partial-dependence plots (by specifying pmethod="partdep").
Please see the plotmo vignette (also available here).
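To make the two plotting methods concrete, here is a small sketch on the built-in trees data (it assumes only that the plotmo package is installed; the choice of an lm model is for illustration):

```r
if (require(plotmo)) {
    mod <- lm(Volume ~ Girth + Height, data=trees)

    # default: vary one or two predictors, hold the others constant
    plotmo(mod)

    # partial-dependence plots
    plotmo(mod, pmethod="partdep")
}
```

For a purely additive linear model like this one, the two sets of curves should have the same shape and differ at most by a vertical shift; the difference becomes more interesting for models with interactions.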
plotmo(object=stop("no 'object' argument"), type=NULL, nresponse=NA, pmethod="plotmo", pt.col=0, jitter=.5, smooth.col=0, level=0, func=NULL, inverse.func=NULL, nrug=0, grid.col=0, type2="persp", degree1=TRUE, all1=FALSE, degree2=TRUE, all2=FALSE, do.par=TRUE, clip=TRUE, ylim=NULL, caption=NULL, trace=0, grid.func=NULL, grid.levels=NULL, extend=0, ngrid1=50, ngrid2=20, ndiscrete=5, npoints=3000, center=FALSE, xflip=FALSE, yflip=FALSE, swapxy=FALSE, int.only.ok=TRUE, ...)
object |
The model object. |
type |
Type parameter passed to |
nresponse |
Which column to use when |
pmethod |
Plotting method. One of:
|
pt.col |
The color of response points (or response sites in degree2 plots).
This refers to the response |
jitter |
Applies only if |
smooth.col |
Color of smooth line through the response points.
(The points themselves will not be plotted unless
    mod <- lm(Volume~Height, data=trees)
    plotmo(mod, pt.color=1, smooth.col=2)
You can adjust the amount of smoothing with |
level |
Draw estimated confidence or prediction interval bands at the given
    mod <- lm(log(Volume)~log(Girth), data=trees)
    plotmo(mod, level=.95)
You can modify the color of the bands with |
func |
Superimpose
    mod <- lm(Volume~Girth, data=trees)
    estimated.volume <- function(x) .17 * x$Girth^2
    plotmo(mod, pt.col=2, func=estimated.volume)
The |
inverse.func |
A function applied to the response before plotting.
Useful to transform a transformed response back to the original scale.
Example:
    mod <- lm(log(Volume)~., data=trees)
    plotmo(mod, inverse.func=exp) # exp() is inverse of log()
|
nrug |
Number of ticks in the |
grid.col |
Default is |
type2 |
Degree2 plot type.
One of
    plotmo(mod, persp.ticktype="detailed", persp.nticks=3)
    plotmo(mod, type2="image")
    plotmo(mod, type2="image", image.col=heat.colors(12))
    plotmo(mod, type2="contour", contour.col=2, contour.labcex=.4)
|
degree1 |
An index vector specifying which subset of degree1 (main effect) plots to include
(after selecting the relevant predictors as described in
“Which variables are plotted?” in the |
all1 |
Default is |
degree2 |
An index vector specifying which subset of degree2 (interaction) plots to include.
|
all2 |
Default is |
do.par |
One of
|
clip |
The default is |
ylim |
Three possibilities:
|
caption |
Overall caption. By default create the caption automatically.
Use |
trace |
Default is |
grid.func |
Function applied to columns of the
    plotmo(mod, grid.func=mean)
    grid.func <- function(x, ...) quantile(x)[2] # 25% quantile
    plotmo(mod, grid.func=grid.func)
This argument is not related to the |
grid.levels |
Default is
    plotmo(mod, grid.levels=list(sex="m", age=21))
|
extend |
Amount to extend the horizontal axis in each plot.
The default is |
ngrid1 |
Number of equally spaced x values in each degree1 plot.
Default is |
ngrid2 |
Grid size for degree2 plots ( |
npoints |
Number of response points to be plotted
(a sample of |
ndiscrete |
Default |
int.only.ok |
Plot the model even if it is an intercept-only model (no predictors are
used in the model).
Do this by plotting a single degree1 plot for the first predictor.
|
center |
Center the plotted response.
Default is |
xflip |
Default |
yflip |
Default |
swapxy |
Default |
... |
Dot arguments are passed to the predict and plot functions.
Dot argument names, whether prefixed or not, should be specified in full
and not abbreviated.
    plotmo(mod, s=1)         # error: arg matches multiple formal args
    plotmo(mod, predict.s=1) # ok now: s=1 will be passed to predict()
The prefixes recognized by
The
For backwards compatibility, some dot arguments are supported but not
explicitly documented. For example, the old argument |
In general this function won't work on models that don't save the call and data with the model in a standard way. For further discussion please see “Accessing the model data” in the plotmo vignette. Package authors may want to look at Guidelines for S3 Regression Models (also available here).
By default, plotmo tries to use sensible model-dependent defaults when calling predict.
Use trace=1 to see the arguments passed to predict.
You can change the defaults by using plotmo's type argument, and by using dot arguments prefixed with predict. (see the description of “...” above).
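A brief sketch of inspecting and overriding those defaults, assuming the plotmo package is installed. The logistic model on the built-in mtcars data is an illustrative choice, not taken from this page:

```r
if (require(plotmo)) {
    mod <- glm(am ~ wt + hp, data=mtcars, family=binomial)

    # trace=1 prints, among other things, the arguments passed to predict
    plotmo(mod, trace=1)

    # override the model-dependent default type:
    # plot on the link (log-odds) scale
    plotmo(mod, type="link")
}
```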
Please see the plotmo vignette (also available here).
if (require(rpart)) {
    data(kyphosis)
    rpart.model <- rpart(Kyphosis~., data=kyphosis)
    # pass type="prob" to plotmo's internal calls to predict.rpart, and
    # select the column named "present" from the matrix returned by predict.rpart
    plotmo(rpart.model, type="prob", nresponse="present")
}
if (require(earth)) {
    data(ozone1)
    earth.model <- earth(O3 ~ ., data=ozone1, degree=2)
    plotmo(earth.model)
    # plotmo(earth.model, pmethod="partdep") # partial dependence plots
}
Miscellaneous functions exported for internal use by earth and other packages.
You can ignore these.
# for earth
plotmo_fitted(object, trace, nresponse, type, ...)
plotmo_cum(rinfo, info, nfigs=1, add=FALSE, cum.col1, grid.col, jitter=0, cum.grid="percentages", ...)
plotmo_nresponse(y, object, nresponse, trace, fname, type="response")
plotmo_rinfo(object, type=NULL, residtype=type, nresponse=1, standardize=FALSE, delever=FALSE, trace=0, leverage.msg="returned as NA", expected.levs=NULL, labels.id=NULL, ...)
plotmo_predict(object, newdata, nresponse, type, expected.levs, trace, inverse.func=NULL, ...)
plotmo_prolog(object, object.name, trace, ...)
plotmo_resplevs(object, plotmo_fitted, yfull, trace)
plotmo_rsq(object, newdata, trace=0, nresponse=NA, type=NULL, ...)
plotmo_standardizescale(object)
plotmo_type(object, trace, fname="plotmo", type, ...)
plotmo_y(object, nresponse=NULL, trace=0, expected.len=NULL, resp.levs=NULL, convert.glm.response=!is.null(nresponse))
## Default S3 method:
plotmo.pairs(object, x, nresponse, trace, all2, ...)
## Default S3 method:
plotmo.singles(object, x, nresponse, trace, all1, ...)
## Default S3 method:
plotmo.y(object, trace, naked, expected.len, ...)
# plotmo methods
plotmo.convert.na.nresponse(object, nresponse, yhat, type="response", ...)
plotmo.pairs(object, x, nresponse, trace, all2, ...)
plotmo.pint(object, newdata, type, level, trace, ...)
plotmo.predict(object, newdata, type, ..., TRACE)
plotmo.prolog(object, object.name, trace, ...)
plotmo.residtype(object, ..., TRACE)
plotmo.singles(object, x, nresponse, trace, all1, ...)
plotmo.type(object, ..., TRACE)
plotmo.x(object, trace, ...)
plotmo.y(object, trace, naked, expected.len, nresponse=1, ...)
... |
- |
add |
- |
all1 |
- |
all2 |
- |
convert.glm.response |
- |
cum.col1 |
- |
cum.grid |
- |
delever |
- |
expected.len |
- |
expected.levs |
- |
fname |
- |
grid.col |
- |
info |
- |
inverse.func |
- |
jitter |
- |
labels.id |
- |
level |
- |
leverage.msg |
- |
naked |
- |
newdata |
- |
nfigs |
- |
nresponse |
- |
object.name |
- |
object |
- |
plotmo_fitted |
- |
residtype |
- |
resp.levs |
- |
rinfo |
- |
standardize |
- |
TRACE |
- |
trace |
- |
type |
- |
x |
- |
yfull |
- |
yhat |
- |
y |
- |
Plot the residuals of a regression model.
Please see the plotres vignette (also available here).
plotres(object = stop("no 'object' argument"), which = 1:4, info = FALSE, versus = 1, standardize = FALSE, delever = FALSE, level = 0, id.n = 3, labels.id = NULL, smooth.col = 2, grid.col = 0, jitter = 0, do.par = NULL, caption = NULL, trace = 0, npoints = 3000, center = TRUE, type = NULL, nresponse = NA, object.name = quote.deparse(substitute(object)), ...)
object |
The model object. |
which |
Which plots to draw. Default is
|
info |
Default is
i) Display the distribution of the residuals along the bottom of the plot.
ii) Display the training R-Squared.
iii) Display the Spearman Rank Correlation of the absolute residuals with the fitted values.
Actually, correlation is measured against the absolute values of whatever is on the horizontal axis; by default this is the fitted response, but may be something else if the
iv) In the Cumulative Distribution plot (
v) Only for
vi) Add various annotations to the other plots.
|
versus |
What do we plot the residuals against? One of:
Else a character vector specifying which predictors to plot against.
|
standardize |
Default is |
delever |
Default is |
level |
Draw estimated confidence or prediction interval bands at the given
    mod <- lm(log(Volume)~log(Girth), data=trees)
    plotres(mod, level=.90)
You can modify the color of the bands with |
id.n |
The largest |
labels.id |
Residual labels.
Only used if |
smooth.col |
Color of the smooth line through the residual points.
Default is |
grid.col |
Default is |
jitter |
Default is |
do.par |
One of
|
caption |
Overall caption. By default create the caption automatically.
Use |
trace |
Default is |
npoints |
Number of points to be plotted.
A sample of |
center |
Default is TRUE, meaning center the horizontal axis in the residuals plot, so asymmetry in the residual distribution is more obvious. |
type |
Type parameter passed first to |
nresponse |
Which column to use when |
object.name |
The name of the |
... |
Dot arguments are passed to the plot functions.
Dot argument names, whether prefixed or not, should be specified in full and not abbreviated.
“Prefixed” arguments are passed directly to the associated function.
For example the prefixed argument
The
For backwards compatibility, some dot arguments are supported but not explicitly documented. |
If the which=1 plot was plotted, the return value of that plot (model dependent).
Else if the which=3 plot was plotted, return list(x,y) where x and y are the coordinates of the points in that plot (but without jittering even if the jitter argument was used).
Else return NULL.
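A small sketch of capturing the return value, assuming the plotmo package is installed. It requests only the which=3 plot so that the list(x,y) case described above applies; the name pts is illustrative, and the exact element names of the list are not assumed here:

```r
if (require(plotmo)) {
    lm.model <- lm(Volume~., data=trees)

    # draw only the which=3 plot and keep the returned coordinates
    pts <- plotres(lm.model, which=3)
    str(pts)    # inspect the returned list
}
```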
This function is designed primarily for displaying standard response - fitted residuals for models with a single continuous response, although it will work for a few other models.
In general this function won't work on models that don't save the call and data with the model in a standard way.
It uses the same underlying mechanism to access the model data as plotmo.
For further discussion please see “Accessing the model data” in the plotmo vignette (also available here).
Package authors may want to look at Guidelines for S3 Regression Models (also available here).
Please see the plotres vignette (also available here).
# we use lm in this example, but plotres is more useful for models
# that don't have a function like plot.lm for plotting residuals
lm.model <- lm(Volume~., data=trees)
plotres(lm.model)