Title: | Estimation of Model-Based Predictions, Contrasts and Means |
---|---|
Description: | Implements a general interface for model-based estimations for a wide variety of models, used in the computation of marginal means, contrast analysis and predictions. For a list of supported models, see 'insight::supported_models()'. |
Authors: | Dominique Makowski [aut, cre] , Daniel Lüdecke [aut] , Mattan S. Ben-Shachar [aut] , Indrajeet Patil [aut] |
Maintainer: | Dominique Makowski <[email protected]> |
License: | GPL-3 |
Version: | 0.8.9 |
Built: | 2024-10-27 12:34:35 UTC |
Source: | CRAN |
This function summarises the smooth term trend in terms of linear segments. Using the approximate derivative, it separates a non-linear vector into quasi-linear segments (in which the trend is either positive or negative). Each of this segment its characterized by its beginning, end, size (in proportion, relative to the total size) trend (the linear regression coefficient) and linearity (the R2 of the linear regression).
describe_nonlinear(data, ...) ## S3 method for class 'data.frame' describe_nonlinear(data, x = NULL, y = NULL, ...) estimate_smooth(data, ...)
describe_nonlinear(data, ...) ## S3 method for class 'data.frame' describe_nonlinear(data, x = NULL, y = NULL, ...) estimate_smooth(data, ...)
data |
The data containing the link, as for instance obtained by
|
... |
Other arguments to be passed to or from. |
x , y
|
The name of the responses variable ( |
A data frame of linear description of non-linear terms.
# Create data data <- data.frame(x = rnorm(200)) data$y <- data$x^2 + rnorm(200, 0, 0.5) model <<- lm(y ~ poly(x, 2), data = data) link_data <- estimate_relation(model, length = 100) describe_nonlinear(link_data, x = "x")
# Create data data <- data.frame(x = rnorm(200)) data$y <- data$x^2 + rnorm(200, 0, 0.5) model <<- lm(y ~ poly(x, 2), data = data) link_data <- estimate_relation(model, length = 100) describe_nonlinear(link_data, x = "x")
Run a contrast analysis by estimating the differences between each level of a
factor. See also other related functions such as estimate_means()
and estimate_slopes()
.
estimate_contrasts( model, contrast = NULL, by = NULL, fixed = NULL, transform = "none", ci = 0.95, p_adjust = "holm", method = "pairwise", adjust = NULL, at = NULL, ... )
estimate_contrasts( model, contrast = NULL, by = NULL, fixed = NULL, transform = "none", ci = 0.95, p_adjust = "holm", method = "pairwise", adjust = NULL, at = NULL, ... )
model |
A statistical model. |
contrast |
A character vector indicating the name of the variable(s) for which to compute the contrasts. |
by |
The predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them). |
fixed |
A character vector indicating the names of the predictors to be "fixed" (i.e., maintained), so that the estimation is made at these values. |
transform |
Is passed to the |
ci |
Confidence Interval (CI) level. Default to |
p_adjust |
The p-values adjustment method for frequentist multiple
comparisons. Can be one of "holm" (default), "tukey", "hochberg", "hommel",
"bonferroni", "BH", "BY", "fdr" or "none". See the p-value adjustment
section in the |
method |
Contrast method. See same argument in emmeans::contrast. |
adjust |
Deprecated in favour of |
at |
Deprecated, use |
... |
Other arguments passed for instance to |
See the Details section below, and don't forget to also check out the Vignettes and README examples for various examples, tutorials and use cases.
The estimate_slopes()
, estimate_means()
and estimate_contrasts()
functions are forming a group, as they are all based on marginal
estimations (estimations based on a model). All three are also built on the
emmeans package, so reading its documentation (for instance for
emmeans::emmeans()
and emmeans::emtrends()
) is recommended to understand
the idea behind these types of procedures.
Model-based predictions is the basis for all that follows. Indeed,
the first thing to understand is how models can be used to make predictions
(see estimate_link()
). This corresponds to the predicted response (or
"outcome variable") given specific predictor values of the predictors (i.e.,
given a specific data configuration). This is why the concept of reference grid()
is so important for direct predictions.
Marginal "means", obtained via estimate_means()
, are an extension
of such predictions, allowing to "average" (collapse) some of the predictors,
to obtain the average response value at a specific predictors configuration.
This is typically used when some of the predictors of interest are factors.
Indeed, the parameters of the model will usually give you the intercept value
and then the "effect" of each factor level (how different it is from the
intercept). Marginal means can be used to directly give you the mean value of
the response variable at all the levels of a factor. Moreover, it can also be
used to control, or average over predictors, which is useful in the case of
multiple predictors with or without interactions.
Marginal contrasts, obtained via estimate_contrasts()
, are
themselves at extension of marginal means, in that they allow to investigate
the difference (i.e., the contrast) between the marginal means. This is,
again, often used to get all pairwise differences between all levels of a
factor. It works also for continuous predictors, for instance one could also
be interested in whether the difference at two extremes of a continuous
predictor is significant.
Finally, marginal effects, obtained via estimate_slopes()
, are
different in that their focus is not values on the response variable, but the
model's parameters. The idea is to assess the effect of a predictor at a
specific configuration of the other predictors. This is relevant in the case
of interactions or non-linear relationships, when the effect of a predictor
variable changes depending on the other predictors. Moreover, these effects
can also be "averaged" over other predictors, to get for instance the
"general trend" of a predictor over different factor levels.
Example: Let's imagine the following model lm(y ~ condition * x)
where
condition
is a factor with 3 levels A, B and C and x
a continuous
variable (like age for example). One idea is to see how this model performs,
and compare the actual response y to the one predicted by the model (using
estimate_response()
). Another idea is evaluate the average mean at each of
the condition's levels (using estimate_means()
), which can be useful to
visualize them. Another possibility is to evaluate the difference between
these levels (using estimate_contrasts()
). Finally, one could also estimate
the effect of x averaged over all conditions, or instead within each
condition (using [estimate_slopes]
).
A data frame of estimated contrasts.
## Not run: # Basic usage model <- lm(Sepal.Width ~ Species, data = iris) estimate_contrasts(model) # Dealing with interactions model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris) # By default: selects first factor estimate_contrasts(model) # Can also run contrasts between points of numeric estimate_contrasts(model, contrast = "Petal.Width", length = 4) # Or both estimate_contrasts(model, contrast = c("Species", "Petal.Width"), length = 2) # Or with custom specifications estimate_contrasts(model, contrast = c("Species", "Petal.Width=c(1, 2)")) # Can fixate the numeric at a specific value estimate_contrasts(model, fixed = "Petal.Width") # Or modulate it estimate_contrasts(model, by = "Petal.Width", length = 4) # Standardized differences estimated <- estimate_contrasts(lm(Sepal.Width ~ Species, data = iris)) standardize(estimated) # Other models (mixed, Bayesian, ...) data <- iris data$Petal.Length_factor <- ifelse(data$Petal.Length < 4.2, "A", "B") model <- lme4::lmer(Sepal.Width ~ Species + (1 | Petal.Length_factor), data = data) estimate_contrasts(model) library(rstanarm) data <- mtcars data$cyl <- as.factor(data$cyl) data$am <- as.factor(data$am) model <- stan_glm(mpg ~ cyl * am, data = data, refresh = 0) estimate_contrasts(model) estimate_contrasts(model, fixed = "am") model <- stan_glm(mpg ~ cyl * wt, data = data, refresh = 0) estimate_contrasts(model) estimate_contrasts(model, fixed = "wt") estimate_contrasts(model, by = "wt", length = 4) model <- stan_glm(Sepal.Width ~ Species + Petal.Width + Petal.Length, data = iris, refresh = 0) estimate_contrasts(model, by = "Petal.Length", test = "bf") ## End(Not run)
## Not run: # Basic usage model <- lm(Sepal.Width ~ Species, data = iris) estimate_contrasts(model) # Dealing with interactions model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris) # By default: selects first factor estimate_contrasts(model) # Can also run contrasts between points of numeric estimate_contrasts(model, contrast = "Petal.Width", length = 4) # Or both estimate_contrasts(model, contrast = c("Species", "Petal.Width"), length = 2) # Or with custom specifications estimate_contrasts(model, contrast = c("Species", "Petal.Width=c(1, 2)")) # Can fixate the numeric at a specific value estimate_contrasts(model, fixed = "Petal.Width") # Or modulate it estimate_contrasts(model, by = "Petal.Width", length = 4) # Standardized differences estimated <- estimate_contrasts(lm(Sepal.Width ~ Species, data = iris)) standardize(estimated) # Other models (mixed, Bayesian, ...) data <- iris data$Petal.Length_factor <- ifelse(data$Petal.Length < 4.2, "A", "B") model <- lme4::lmer(Sepal.Width ~ Species + (1 | Petal.Length_factor), data = data) estimate_contrasts(model) library(rstanarm) data <- mtcars data$cyl <- as.factor(data$cyl) data$am <- as.factor(data$am) model <- stan_glm(mpg ~ cyl * am, data = data, refresh = 0) estimate_contrasts(model) estimate_contrasts(model, fixed = "am") model <- stan_glm(mpg ~ cyl * wt, data = data, refresh = 0) estimate_contrasts(model) estimate_contrasts(model, fixed = "wt") estimate_contrasts(model, by = "wt", length = 4) model <- stan_glm(Sepal.Width ~ Species + Petal.Width + Petal.Length, data = iris, refresh = 0) estimate_contrasts(model, by = "Petal.Length", test = "bf") ## End(Not run)
After fitting a model, it is useful generate model-based estimates of the response variables for different combinations of predictor values. Such estimates can be used to make inferences about relationships between variables and to make predictions about individual cases.
Model-based response estimates and uncertainty can be generated for both the conditional average response values (the regression line or expectation) and for predictions about individual cases. See below for details.
estimate_expectation( model, data = NULL, ci = 0.95, keep_iterations = FALSE, ... ) estimate_response(...) estimate_link(model, data = "grid", ci = 0.95, keep_iterations = FALSE, ...) estimate_prediction( model, data = NULL, ci = 0.95, keep_iterations = FALSE, ... ) estimate_relation( model, data = "grid", ci = 0.95, keep_iterations = FALSE, ... )
estimate_expectation( model, data = NULL, ci = 0.95, keep_iterations = FALSE, ... ) estimate_response(...) estimate_link(model, data = "grid", ci = 0.95, keep_iterations = FALSE, ...) estimate_prediction( model, data = NULL, ci = 0.95, keep_iterations = FALSE, ... ) estimate_relation( model, data = "grid", ci = 0.95, keep_iterations = FALSE, ... )
model |
A statistical model. |
data |
A data frame with model's predictors to estimate the response. If
|
ci |
Confidence Interval (CI) level. Default to |
keep_iterations |
If |
... |
You can add all the additional control arguments from
|
A data frame of predicted values and uncertainty intervals, with
class "estimate_predicted"
. Methods for visualisation_recipe()
and plot()
are available.
The most important way that various types of response estimates differ is in terms of what quantity is being estimated and the meaning of the uncertainty intervals. The major choices are expected values for uncertainty in the regression line and predicted values for uncertainty in the individual case predictions.
Expected values refer to the fitted regression line - the estimated average response value (i.e., the "expectation") for individuals with specific predictor values. For example, in a linear model y = 2 + 3x + 4z + e, the estimated average y for individuals with x = 1 and z = 2 is 11.
For expected values, uncertainty intervals refer to uncertainty in the estimated conditional average (where might the true regression line actually fall)? Uncertainty intervals for expected values are also called "confidence intervals".
Expected values and their uncertainty intervals are useful for describing the relationship between variables and for describing how precisely a model has been estimated.
For generalized linear models, expected values are reported on one of two scales:
The link scale refers to scale of the fitted regression line, after transformation by the link function. For example, for a logistic regression (logit binomial) model, the link scale gives expected log-odds. For a log-link Poisson model, the link scale gives the expected log-count.
The response scale refers to the original scale of the response variable (i.e., without any link function transformation). Expected values on the link scale are back-transformed to the original response variable metric (e.g., expected probabilities for binomial models, expected counts for Poisson models).
In contrast to expected values, predicted values refer to predictions for individual cases. Predicted values are also called "posterior predictions" or "posterior predictive draws".
For predicted values, uncertainty intervals refer to uncertainty in the individual response values for each case (where might any single case actually fall)? Uncertainty intervals for predicted values are also called "prediction intervals" or "posterior predictive intervals".
Predicted values and their uncertainty intervals are useful for forecasting the range of values that might be observed in new data, for making decisions about individual cases, and for checking if model predictions are reasonable ("posterior predictive checks").
Predicted values and intervals are always on the scale of the original response variable (not the link scale).
modelbased provides 4 functions for generating model-based response estimates and their uncertainty:
estimate_expectation()
:
Generates expected values (conditional average) on the response scale.
The uncertainty interval is a confidence interval.
By default, values are computed using the data used to fit the model.
estimate_link()
:
Generates expected values (conditional average) on the link scale.
The uncertainty interval is a confidence interval.
By default, values are computed using a reference grid spanning the
observed range of predictor values (see visualisation_matrix()
).
estimate_prediction()
:
Generates predicted values (for individual cases) on the response scale.
The uncertainty interval is a prediction interval.
By default, values are computed using the data used to fit the model.
estimate_relation()
:
Like estimate_expectation()
.
Useful for visualizing a model.
Generates expected values (conditional average) on the response scale.
The uncertainty interval is a confidence interval.
By default, values are computed using a reference grid spanning the
observed range of predictor values (see visualisation_matrix()
).
estimate_response()
is a deprecated alias for estimate_expectation()
.
If the data = NULL
, values are estimated using the data used to fit the
model. If data = "grid"
, values are computed using a reference grid
spanning the observed range of predictor values with
visualisation_matrix()
. This can be useful for model visualization. The
number of predictor values used for each variable can be controlled with the
length
argument. data
can also be a data frame containing columns with
names matching the model frame (see insight::get_data()
). This can be used
to generate model predictions for specific combinations of predictor values.
These functions are built on top of insight::get_predicted()
and correspond
to different specifications of its parameters. It may be useful to read its
documentation,
in particular the description of the predict
argument for additional
details on the difference between expected vs. predicted values and link vs.
response scales.
Additional control parameters can be used to control results from
insight::get_datagrid()
(when data = "grid"
) and from
insight::get_predicted()
(the function used internally to compute
predictions).
For plotting, check the examples in visualisation_recipe()
. Also check out
the Vignettes and README examples for
various examples, tutorials and usecases.
library(modelbased) # Linear Models model <- lm(mpg ~ wt, data = mtcars) # Get predicted and prediction interval (see insight::get_predicted) estimate_response(model) # Get expected values with confidence interval pred <- estimate_relation(model) pred # Visualisation (see visualisation_recipe()) if (require("see")) { plot(pred) } # Standardize predictions pred <- estimate_relation(lm(mpg ~ wt + am, data = mtcars)) z <- standardize(pred, include_response = FALSE) z unstandardize(z, include_response = FALSE) # Logistic Models model <- glm(vs ~ wt, data = mtcars, family = "binomial") estimate_response(model) estimate_relation(model) # Mixed models if (require("lme4")) { model <- lmer(mpg ~ wt + (1 | gear), data = mtcars) estimate_response(model) estimate_relation(model) } # Bayesian models if (require("rstanarm")) { model <- suppressWarnings(rstanarm::stan_glm( mpg ~ wt, data = mtcars, refresh = 0, iter = 200 )) estimate_response(model) estimate_relation(model) }
library(modelbased) # Linear Models model <- lm(mpg ~ wt, data = mtcars) # Get predicted and prediction interval (see insight::get_predicted) estimate_response(model) # Get expected values with confidence interval pred <- estimate_relation(model) pred # Visualisation (see visualisation_recipe()) if (require("see")) { plot(pred) } # Standardize predictions pred <- estimate_relation(lm(mpg ~ wt + am, data = mtcars)) z <- standardize(pred, include_response = FALSE) z unstandardize(z, include_response = FALSE) # Logistic Models model <- glm(vs ~ wt, data = mtcars, family = "binomial") estimate_response(model) estimate_relation(model) # Mixed models if (require("lme4")) { model <- lmer(mpg ~ wt + (1 | gear), data = mtcars) estimate_response(model) estimate_relation(model) } # Bayesian models if (require("rstanarm")) { model <- suppressWarnings(rstanarm::stan_glm( mpg ~ wt, data = mtcars, refresh = 0, iter = 200 )) estimate_response(model) estimate_relation(model) }
Extract random parameters of each individual group in the context of mixed models. Can be reshaped to be of the same dimensions as the original data, which can be useful to add the random effects to the original data.
estimate_grouplevel(model, type = "random", ...) reshape_grouplevel(x, indices = "all", group = "all", ...)
estimate_grouplevel(model, type = "random", ...) reshape_grouplevel(x, indices = "all", group = "all", ...)
model |
A mixed model with random effects. |
type |
If |
... |
Other arguments passed to or from other methods. |
x |
The output of |
indices |
A list containing the indices to extract (e.g., "Coefficient"). |
group |
A list containing the random factors to select. |
# lme4 model data(mtcars) model <- lme4::lmer(mpg ~ hp + (1 | carb), data = mtcars) random <- estimate_grouplevel(model) random # Visualize random effects plot(random) # Show group-specific effects estimate_grouplevel(model, deviation = FALSE) # Reshape to wide data so that it matches the original dataframe... reshaped <- reshape_grouplevel(random, indices = c("Coefficient", "SE")) # ... and can be easily combined alldata <- cbind(mtcars, reshaped) # Use summary() to remove duplicated rows summary(reshaped) # Compute BLUPs estimate_grouplevel(model, type = "total")
# lme4 model data(mtcars) model <- lme4::lmer(mpg ~ hp + (1 | carb), data = mtcars) random <- estimate_grouplevel(model) random # Visualize random effects plot(random) # Show group-specific effects estimate_grouplevel(model, deviation = FALSE) # Reshape to wide data so that it matches the original dataframe... reshaped <- reshape_grouplevel(random, indices = c("Coefficient", "SE")) # ... and can be easily combined alldata <- cbind(mtcars, reshaped) # Use summary() to remove duplicated rows summary(reshaped) # Compute BLUPs estimate_grouplevel(model, type = "total")
Estimate average value of response variable at each factor levels. For
plotting, check the examples in visualisation_recipe()
. See also
other related functions such as estimate_contrasts()
and
estimate_slopes()
.
estimate_means( model, by = "auto", fixed = NULL, transform = "response", ci = 0.95, backend = "emmeans", at = NULL, ... )
estimate_means( model, by = "auto", fixed = NULL, transform = "response", ci = 0.95, backend = "emmeans", at = NULL, ... )
model |
A statistical model. |
by |
The predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them). |
fixed |
A character vector indicating the names of the predictors to be "fixed" (i.e., maintained), so that the estimation is made at these values. |
transform |
Is passed to the |
ci |
Confidence Interval (CI) level. Default to |
backend |
Whether to use 'emmeans' or 'marginaleffects' as a backend. The latter is experimental and some features might not work. |
at |
Deprecated, use |
... |
Other arguments passed for instance to |
See the Details section below, and don't forget to also check out the Vignettes and README examples for various examples, tutorials and use cases.
The estimate_slopes()
, estimate_means()
and estimate_contrasts()
functions are forming a group, as they are all based on marginal
estimations (estimations based on a model). All three are also built on the
emmeans package, so reading its documentation (for instance for
emmeans::emmeans()
and emmeans::emtrends()
) is recommended to understand
the idea behind these types of procedures.
Model-based predictions is the basis for all that follows. Indeed,
the first thing to understand is how models can be used to make predictions
(see estimate_link()
). This corresponds to the predicted response (or
"outcome variable") given specific predictor values of the predictors (i.e.,
given a specific data configuration). This is why the concept of reference grid()
is so important for direct predictions.
Marginal "means", obtained via estimate_means()
, are an extension
of such predictions, allowing to "average" (collapse) some of the predictors,
to obtain the average response value at a specific predictors configuration.
This is typically used when some of the predictors of interest are factors.
Indeed, the parameters of the model will usually give you the intercept value
and then the "effect" of each factor level (how different it is from the
intercept). Marginal means can be used to directly give you the mean value of
the response variable at all the levels of a factor. Moreover, it can also be
used to control, or average over predictors, which is useful in the case of
multiple predictors with or without interactions.
Marginal contrasts, obtained via estimate_contrasts()
, are
themselves at extension of marginal means, in that they allow to investigate
the difference (i.e., the contrast) between the marginal means. This is,
again, often used to get all pairwise differences between all levels of a
factor. It works also for continuous predictors, for instance one could also
be interested in whether the difference at two extremes of a continuous
predictor is significant.
Finally, marginal effects, obtained via estimate_slopes()
, are
different in that their focus is not values on the response variable, but the
model's parameters. The idea is to assess the effect of a predictor at a
specific configuration of the other predictors. This is relevant in the case
of interactions or non-linear relationships, when the effect of a predictor
variable changes depending on the other predictors. Moreover, these effects
can also be "averaged" over other predictors, to get for instance the
"general trend" of a predictor over different factor levels.
Example: Let's imagine the following model lm(y ~ condition * x)
where
condition
is a factor with 3 levels A, B and C and x
a continuous
variable (like age for example). One idea is to see how this model performs,
and compare the actual response y to the one predicted by the model (using
estimate_response()
). Another idea is evaluate the average mean at each of
the condition's levels (using estimate_means()
), which can be useful to
visualize them. Another possibility is to evaluate the difference between
these levels (using estimate_contrasts()
). Finally, one could also estimate
the effect of x averaged over all conditions, or instead within each
condition (using [estimate_slopes]
).
A data frame of estimated marginal means.
library(modelbased) # Frequentist models # ------------------- model <- lm(Petal.Length ~ Sepal.Width * Species, data = iris) estimate_means(model) estimate_means(model, fixed = "Sepal.Width") estimate_means(model, by = c("Species", "Sepal.Width"), length = 2) estimate_means(model, by = "Species=c('versicolor', 'setosa')") estimate_means(model, by = "Sepal.Width=c(2, 4)") estimate_means(model, by = c("Species", "Sepal.Width=0")) estimate_means(model, by = "Sepal.Width", length = 5) estimate_means(model, by = "Sepal.Width=c(2, 4)") # Methods that can be applied to it: means <- estimate_means(model, fixed = "Sepal.Width") plot(means) # which runs visualisation_recipe() standardize(means) data <- iris data$Petal.Length_factor <- ifelse(data$Petal.Length < 4.2, "A", "B") model <- lmer(Petal.Length ~ Sepal.Width + Species + (1 | Petal.Length_factor), data = data) estimate_means(model) estimate_means(model, by = "Sepal.Width", length = 3)
library(modelbased) # Frequentist models # ------------------- model <- lm(Petal.Length ~ Sepal.Width * Species, data = iris) estimate_means(model) estimate_means(model, fixed = "Sepal.Width") estimate_means(model, by = c("Species", "Sepal.Width"), length = 2) estimate_means(model, by = "Species=c('versicolor', 'setosa')") estimate_means(model, by = "Sepal.Width=c(2, 4)") estimate_means(model, by = c("Species", "Sepal.Width=0")) estimate_means(model, by = "Sepal.Width", length = 5) estimate_means(model, by = "Sepal.Width=c(2, 4)") # Methods that can be applied to it: means <- estimate_means(model, fixed = "Sepal.Width") plot(means) # which runs visualisation_recipe() standardize(means) data <- iris data$Petal.Length_factor <- ifelse(data$Petal.Length < 4.2, "A", "B") model <- lmer(Petal.Length ~ Sepal.Width + Species + (1 | Petal.Length_factor), data = data) estimate_means(model) estimate_means(model, by = "Sepal.Width", length = 3)
Estimate the slopes (i.e., the coefficient) of a predictor over or within
different factor levels, or alongside a numeric variable . In other words, to
assess the effect of a predictor at specific configurations data. Other
related functions based on marginal estimations includes
estimate_contrasts()
and estimate_means()
.
estimate_slopes(model, trend = NULL, by = NULL, ci = 0.95, at = NULL, ...)
estimate_slopes(model, trend = NULL, by = NULL, ci = 0.95, at = NULL, ...)
model |
A statistical model. |
trend |
A character indicating the name of the variable for which to compute the slopes. |
by |
The predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them). |
ci |
Confidence Interval (CI) level. Default to |
at |
Deprecated, use |
... |
Other arguments passed for instance to |
See the Details section below, and don't forget to also check out the Vignettes and README examples for various examples, tutorials and use cases.
The estimate_slopes()
, estimate_means()
and estimate_contrasts()
functions are forming a group, as they are all based on marginal
estimations (estimations based on a model). All three are also built on the
emmeans package, so reading its documentation (for instance for
emmeans::emmeans()
and emmeans::emtrends()
) is recommended to understand
the idea behind these types of procedures.
Model-based predictions is the basis for all that follows. Indeed,
the first thing to understand is how models can be used to make predictions
(see estimate_link()
). This corresponds to the predicted response (or
"outcome variable") given specific predictor values of the predictors (i.e.,
given a specific data configuration). This is why the concept of reference grid()
is so important for direct predictions.
Marginal "means", obtained via estimate_means()
, are an extension
of such predictions, allowing to "average" (collapse) some of the predictors,
to obtain the average response value at a specific predictors configuration.
This is typically used when some of the predictors of interest are factors.
Indeed, the parameters of the model will usually give you the intercept value
and then the "effect" of each factor level (how different it is from the
intercept). Marginal means can be used to directly give you the mean value of
the response variable at all the levels of a factor. Moreover, it can also be
used to control, or average over predictors, which is useful in the case of
multiple predictors with or without interactions.
Marginal contrasts, obtained via estimate_contrasts()
, are
themselves at extension of marginal means, in that they allow to investigate
the difference (i.e., the contrast) between the marginal means. This is,
again, often used to get all pairwise differences between all levels of a
factor. It works also for continuous predictors, for instance one could also
be interested in whether the difference at two extremes of a continuous
predictor is significant.
Finally, marginal effects, obtained via estimate_slopes()
, are
different in that their focus is not values on the response variable, but the
model's parameters. The idea is to assess the effect of a predictor at a
specific configuration of the other predictors. This is relevant in the case
of interactions or non-linear relationships, when the effect of a predictor
variable changes depending on the other predictors. Moreover, these effects
can also be "averaged" over other predictors, to get for instance the
"general trend" of a predictor over different factor levels.
Example: Let's imagine the following model lm(y ~ condition * x)
where
condition
is a factor with 3 levels A, B and C and x
a continuous
variable (like age for example). One idea is to see how this model performs,
and compare the actual response y to the one predicted by the model (using
estimate_response()
). Another idea is evaluate the average mean at each of
the condition's levels (using estimate_means()
), which can be useful to
visualize them. Another possibility is to evaluate the difference between
these levels (using estimate_contrasts()
). Finally, one could also estimate
the effect of x averaged over all conditions, or instead within each
condition (using [estimate_slopes]
).
A data.frame of class estimate_slopes
.
# Get an idea of the data ggplot(iris, aes(x = Petal.Length, y = Sepal.Width)) + geom_point(aes(color = Species)) + geom_smooth(color = "black", se = FALSE) + geom_smooth(aes(color = Species), linetype = "dotted", se = FALSE) + geom_smooth(aes(color = Species), method = "lm", se = FALSE) # Model it model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) # Compute the marginal effect of Petal.Length at each level of Species slopes <- estimate_slopes(model, trend = "Petal.Length", by = "Species") slopes # Plot it plot(slopes) standardize(slopes) model <- mgcv::gam(Sepal.Width ~ s(Petal.Length), data = iris) slopes <- estimate_slopes(model, by = "Petal.Length", length = 50) summary(slopes) plot(slopes) model <- mgcv::gam(Sepal.Width ~ s(Petal.Length, by = Species), data = iris) slopes <- estimate_slopes(model, trend = "Petal.Length", by = c("Petal.Length", "Species"), length = 20 ) summary(slopes) plot(slopes)
# Get an idea of the data ggplot(iris, aes(x = Petal.Length, y = Sepal.Width)) + geom_point(aes(color = Species)) + geom_smooth(color = "black", se = FALSE) + geom_smooth(aes(color = Species), linetype = "dotted", se = FALSE) + geom_smooth(aes(color = Species), method = "lm", se = FALSE) # Model it model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) # Compute the marginal effect of Petal.Length at each level of Species slopes <- estimate_slopes(model, trend = "Petal.Length", by = "Species") slopes # Plot it plot(slopes) standardize(slopes) model <- mgcv::gam(Sepal.Width ~ s(Petal.Length), data = iris) slopes <- estimate_slopes(model, by = "Petal.Length", length = 50) summary(slopes) plot(slopes) model <- mgcv::gam(Sepal.Width ~ s(Petal.Length, by = Species), data = iris) slopes <- estimate_slopes(model, trend = "Petal.Length", by = c("Petal.Length", "Species"), length = 20 ) summary(slopes) plot(slopes)
Find points of inversion of a curve.
find_inversions(x)
find_inversions(x)
x |
A numeric vector. |
Vector of inversion points.
x <- sin(seq(0, 4 * pi, length.out = 100)) plot(x, type = "b") find_inversions(x)
x <- sin(seq(0, 4 * pi, length.out = 100)) plot(x, type = "b") find_inversions(x)
The get_emmeans()
function is a wrapper to facilitate the usage of
emmeans::emmeans()
and emmeans::emtrends()
, providing a somewhat simpler
and intuitive API to find the specifications and variables of interest. It is
meanly made to for the developers to facilitate the organization and
debugging, and end-users should rather use the estimate_*()
series of
functions.
get_emcontrasts( model, contrast = NULL, by = NULL, fixed = NULL, transform = "none", method = "pairwise", at = NULL, ... ) model_emcontrasts( model, contrast = NULL, by = NULL, fixed = NULL, transform = "none", method = "pairwise", at = NULL, ... ) get_emmeans( model, by = "auto", fixed = NULL, transform = "response", levels = NULL, modulate = NULL, at = NULL, ... ) model_emmeans( model, by = "auto", fixed = NULL, transform = "response", levels = NULL, modulate = NULL, at = NULL, ... ) get_emtrends( model, trend = NULL, by = NULL, fixed = NULL, levels = NULL, modulate = NULL, at = NULL, ... ) model_emtrends( model, trend = NULL, by = NULL, fixed = NULL, levels = NULL, modulate = NULL, at = NULL, ... )
get_emcontrasts( model, contrast = NULL, by = NULL, fixed = NULL, transform = "none", method = "pairwise", at = NULL, ... ) model_emcontrasts( model, contrast = NULL, by = NULL, fixed = NULL, transform = "none", method = "pairwise", at = NULL, ... ) get_emmeans( model, by = "auto", fixed = NULL, transform = "response", levels = NULL, modulate = NULL, at = NULL, ... ) model_emmeans( model, by = "auto", fixed = NULL, transform = "response", levels = NULL, modulate = NULL, at = NULL, ... ) get_emtrends( model, trend = NULL, by = NULL, fixed = NULL, levels = NULL, modulate = NULL, at = NULL, ... ) model_emtrends( model, trend = NULL, by = NULL, fixed = NULL, levels = NULL, modulate = NULL, at = NULL, ... )
model |
A statistical model. |
contrast |
A character vector indicating the name of the variable(s) for which to compute the contrasts. |
by |
The predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them). |
fixed |
A character vector indicating the names of the predictors to be "fixed" (i.e., maintained), so that the estimation is made at these values. |
transform |
Is passed to the |
method |
Contrast method. See same argument in emmeans::contrast. |
at |
Deprecated, use |
... |
Other arguments passed for instance to |
levels , modulate
|
Deprecated, use |
trend |
A character indicating the name of the variable for which to compute the slopes. |
if (require("emmeans", quietly = TRUE)) { # Basic usage model <- lm(Sepal.Width ~ Species, data = iris) get_emcontrasts(model) # Dealing with interactions model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris) # By default: selects first factor get_emcontrasts(model) # Can also run contrasts between points of numeric get_emcontrasts(model, contrast = "Petal.Width", length = 3) # Or both get_emcontrasts(model, contrast = c("Species", "Petal.Width"), length = 2) # Or with custom specifications estimate_contrasts(model, contrast = c("Species", "Petal.Width=c(1, 2)")) # Can fixate the numeric at a specific value get_emcontrasts(model, fixed = "Petal.Width") # Or modulate it get_emcontrasts(model, by = "Petal.Width", length = 4) } model <- lm(Sepal.Length ~ Species + Petal.Width, data = iris) if (require("emmeans", quietly = TRUE)) { # By default, 'by' is set to "Species" get_emmeans(model) # Overall mean (close to 'mean(iris$Sepal.Length)') get_emmeans(model, by = NULL) # One can estimate marginal means at several values of a 'modulate' variable get_emmeans(model, by = "Petal.Width", length = 3) # Interactions model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) get_emmeans(model) get_emmeans(model, by = c("Species", "Petal.Length"), length = 2) get_emmeans(model, by = c("Species", "Petal.Length = c(1, 3, 5)"), length = 2) } if (require("emmeans")) { model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) get_emtrends(model) get_emtrends(model, by = "Species") get_emtrends(model, by = "Petal.Length") get_emtrends(model, by = c("Species", "Petal.Length")) model <- lm(Petal.Length ~ poly(Sepal.Width, 4), data = iris) get_emtrends(model) get_emtrends(model, by = "Sepal.Width") }
if (require("emmeans", quietly = TRUE)) { # Basic usage model <- lm(Sepal.Width ~ Species, data = iris) get_emcontrasts(model) # Dealing with interactions model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris) # By default: selects first factor get_emcontrasts(model) # Can also run contrasts between points of numeric get_emcontrasts(model, contrast = "Petal.Width", length = 3) # Or both get_emcontrasts(model, contrast = c("Species", "Petal.Width"), length = 2) # Or with custom specifications estimate_contrasts(model, contrast = c("Species", "Petal.Width=c(1, 2)")) # Can fixate the numeric at a specific value get_emcontrasts(model, fixed = "Petal.Width") # Or modulate it get_emcontrasts(model, by = "Petal.Width", length = 4) } model <- lm(Sepal.Length ~ Species + Petal.Width, data = iris) if (require("emmeans", quietly = TRUE)) { # By default, 'by' is set to "Species" get_emmeans(model) # Overall mean (close to 'mean(iris$Sepal.Length)') get_emmeans(model, by = NULL) # One can estimate marginal means at several values of a 'modulate' variable get_emmeans(model, by = "Petal.Width", length = 3) # Interactions model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) get_emmeans(model) get_emmeans(model, by = c("Species", "Petal.Length"), length = 2) get_emmeans(model, by = c("Species", "Petal.Length = c(1, 3, 5)"), length = 2) } if (require("emmeans")) { model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) get_emtrends(model) get_emtrends(model, by = "Species") get_emtrends(model, by = "Petal.Length") get_emtrends(model, by = c("Species", "Petal.Length")) model <- lm(Petal.Length ~ poly(Sepal.Width, 4), data = iris) get_emtrends(model) get_emtrends(model, by = "Sepal.Width") }
Modelbased-like API to create marginaleffects objects. This is Work-in-progress.
get_marginaleffects( model, trend = NULL, by = NULL, fixed = NULL, at = NULL, ... )
get_marginaleffects( model, trend = NULL, by = NULL, fixed = NULL, at = NULL, ... )
model |
A statistical model. |
trend |
A character indicating the name of the variable for which to compute the slopes. |
by |
The predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them). |
fixed |
A character vector indicating the names of the predictors to be "fixed" (i.e., maintained), so that the estimation is made at these values. |
at |
Deprecated, use |
... |
Other arguments passed for instance to |
if (require("marginaleffects")) { model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) get_marginaleffects(model, trend = "Petal.Length", by = "Species") get_marginaleffects(model, trend = "Petal.Length", by = "Petal.Length") get_marginaleffects(model, trend = "Petal.Length", by = c("Species", "Petal.Length")) }
if (require("marginaleffects")) { model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) get_marginaleffects(model, trend = "Petal.Length", by = "Species") get_marginaleffects(model, trend = "Petal.Length", by = "Petal.Length") get_marginaleffects(model, trend = "Petal.Length", by = c("Species", "Petal.Length")) }
Smoothing a vector or a time series. For data.frames, the function will smooth all numeric variables stratified by factor levels (i.e., will smooth within each factor level combination).
smoothing(x, method = "loess", strength = 0.25, ...)
smoothing(x, method = "loess", strength = 0.25, ...)
x |
A numeric vector. |
method |
Can be "loess" (default) or "smooth". A loess smoothing can be slow. |
strength |
This argument only applies when |
... |
Arguments passed to or from other methods. |
A smoothed vector or data frame.
x <- sin(seq(0, 4 * pi, length.out = 100)) + rnorm(100, 0, 0.2) plot(x, type = "l") lines(smoothing(x, method = "smooth"), type = "l", col = "blue") lines(smoothing(x, method = "loess"), type = "l", col = "red") x <- sin(seq(0, 4 * pi, length.out = 10000)) + rnorm(10000, 0, 0.2) plot(x, type = "l") lines(smoothing(x, method = "smooth"), type = "l", col = "blue") lines(smoothing(x, method = "loess"), type = "l", col = "red")
x <- sin(seq(0, 4 * pi, length.out = 100)) + rnorm(100, 0, 0.2) plot(x, type = "l") lines(smoothing(x, method = "smooth"), type = "l", col = "blue") lines(smoothing(x, method = "loess"), type = "l", col = "red") x <- sin(seq(0, 4 * pi, length.out = 10000)) + rnorm(10000, 0, 0.2) plot(x, type = "l") lines(smoothing(x, method = "smooth"), type = "l", col = "blue") lines(smoothing(x, method = "loess"), type = "l", col = "red")
This function is an alias (another name) for the insight::get_datagrid()
function. Same arguments apply.
visualisation_matrix(x, ...) ## S3 method for class 'data.frame' visualisation_matrix( x, by = "all", target = NULL, at = NULL, factors = "reference", numerics = "mean", preserve_range = FALSE, reference = x, ... ) ## S3 method for class 'numeric' visualisation_matrix(x, ...) ## S3 method for class 'factor' visualisation_matrix(x, ...)
visualisation_matrix(x, ...) ## S3 method for class 'data.frame' visualisation_matrix( x, by = "all", target = NULL, at = NULL, factors = "reference", numerics = "mean", preserve_range = FALSE, reference = x, ... ) ## S3 method for class 'numeric' visualisation_matrix(x, ...) ## S3 method for class 'factor' visualisation_matrix(x, ...)
x |
An object from which to construct the reference grid. |
... |
Arguments passed to or from other methods (for instance, |
by |
Indicates the focal predictors (variables) for the reference grid
and at which values focal predictors should be represented. If not specified
otherwise, representative values for numeric variables or predictors are
evenly distributed from the minimum to the maximum, with a total number of
There is a special handling of assignments with brackets, i.e. values
defined inside
For factor variables, the value(s) inside the brackets should indicate
one or more factor levels, like The remaining variables not specified in |
target , at
|
Deprecated name. Please use |
factors |
Type of summary for factors. Can be |
numerics |
Type of summary for numeric values. Can be |
preserve_range |
In the case of combinations between numeric variables
and factors, setting |
reference |
The reference vector from which to compute the mean and SD.
Used when standardizing or unstandardizing the grid using |
Reference grid data frame.
library(modelbased) # Add one row to change the "mode" of Species data <- rbind(iris, iris[149, ], make.row.names = FALSE) # Single variable is of interest; all others are "fixed" visualisation_matrix(data, by = "Sepal.Length") visualisation_matrix(data, by = "Sepal.Length", length = 3) visualisation_matrix(data, by = "Sepal.Length", range = "ci", ci = 0.90) visualisation_matrix(data, by = "Sepal.Length", factors = "mode") # Multiple variables are of interest, creating a combination visualisation_matrix(data, by = c("Sepal.Length", "Species"), length = 3) visualisation_matrix(data, by = c(1, 3), length = 3) visualisation_matrix(data, by = c("Sepal.Length", "Species"), preserve_range = TRUE) visualisation_matrix(data, by = c("Sepal.Length", "Species"), numerics = 0) visualisation_matrix(data, by = c("Sepal.Length = 3", "Species")) visualisation_matrix(data, by = c("Sepal.Length = c(3, 1)", "Species = 'setosa'")) # with list-style at-argument visualisation_matrix(data, by = list(Sepal.Length = c(1, 3), Species = "setosa")) # Standardize vizdata <- visualisation_matrix(data, by = "Sepal.Length") standardize(vizdata)
library(modelbased) # Add one row to change the "mode" of Species data <- rbind(iris, iris[149, ], make.row.names = FALSE) # Single variable is of interest; all others are "fixed" visualisation_matrix(data, by = "Sepal.Length") visualisation_matrix(data, by = "Sepal.Length", length = 3) visualisation_matrix(data, by = "Sepal.Length", range = "ci", ci = 0.90) visualisation_matrix(data, by = "Sepal.Length", factors = "mode") # Multiple variables are of interest, creating a combination visualisation_matrix(data, by = c("Sepal.Length", "Species"), length = 3) visualisation_matrix(data, by = c(1, 3), length = 3) visualisation_matrix(data, by = c("Sepal.Length", "Species"), preserve_range = TRUE) visualisation_matrix(data, by = c("Sepal.Length", "Species"), numerics = 0) visualisation_matrix(data, by = c("Sepal.Length = 3", "Species")) visualisation_matrix(data, by = c("Sepal.Length = c(3, 1)", "Species = 'setosa'")) # with list-style at-argument visualisation_matrix(data, by = list(Sepal.Length = c(1, 3), Species = "setosa")) # Standardize vizdata <- visualisation_matrix(data, by = "Sepal.Length") standardize(vizdata)
Visualisation Recipe for 'modelbased' Objects
## S3 method for class 'estimate_grouplevel' visualisation_recipe( x, hline = NULL, pointrange = NULL, facet_wrap = NULL, labs = NULL, ... ) ## S3 method for class 'estimate_means' visualisation_recipe( x, show_data = "jitter", point = NULL, jitter = point, boxplot = NULL, violin = NULL, line = NULL, pointrange = NULL, labs = NULL, ... ) ## S3 method for class 'estimate_predicted' visualisation_recipe( x, show_data = "points", point = NULL, density_2d = NULL, line = NULL, ribbon = NULL, labs = NULL, ... ) ## S3 method for class 'estimate_slopes' visualisation_recipe( x, hline = NULL, line = NULL, pointrange = NULL, ribbon = NULL, labs = NULL, facet_wrap = NULL, ... )
## S3 method for class 'estimate_grouplevel' visualisation_recipe( x, hline = NULL, pointrange = NULL, facet_wrap = NULL, labs = NULL, ... ) ## S3 method for class 'estimate_means' visualisation_recipe( x, show_data = "jitter", point = NULL, jitter = point, boxplot = NULL, violin = NULL, line = NULL, pointrange = NULL, labs = NULL, ... ) ## S3 method for class 'estimate_predicted' visualisation_recipe( x, show_data = "points", point = NULL, density_2d = NULL, line = NULL, ribbon = NULL, labs = NULL, ... ) ## S3 method for class 'estimate_slopes' visualisation_recipe( x, hline = NULL, line = NULL, pointrange = NULL, ribbon = NULL, labs = NULL, facet_wrap = NULL, ... )
x |
A modelbased object. |
... |
Other arguments passed to other functions. |
show_data |
Display the "raw" data as a background to the model-based
estimation. Can be set to |
point , jitter , boxplot , violin , pointrange , density_2d , line , hline , ribbon , labs , facet_wrap
|
Additional aesthetics and parameters for the geoms (see customization example). |
# ============================================== # estimate_grouplevel # ============================================== data <- lme4::sleepstudy data <- rbind(data, data) data$Newfactor <- rep(c("A", "B", "C", "D")) # 1 random intercept model <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = data) x <- estimate_grouplevel(model) layers <- visualisation_recipe(x) layers plot(layers) # 2 random intercepts model <- lme4::lmer(Reaction ~ Days + (1 | Subject) + (1 | Newfactor), data = data) x <- estimate_grouplevel(model) plot(visualisation_recipe(x)) model <- lme4::lmer(Reaction ~ Days + (1 + Days | Subject) + (1 | Newfactor), data = data) x <- estimate_grouplevel(model) plot(visualisation_recipe(x)) # Simple Model --------------- x <- estimate_means(lm(Sepal.Width ~ Species, data = iris)) layers <- visualisation_recipe(x) layers plot(layers) ## Not run: # Customize aesthetics layers <- visualisation_recipe(x, jitter = list(width = 0.03, color = "red"), line = list(linetype = "dashed") ) plot(layers) # Customize raw data plot(visualisation_recipe(x, show_data = c("violin", "boxplot", "jitter"))) # Two levels --------------- data <- mtcars data$cyl <- as.factor(data$cyl) data$new_factor <- as.factor(rep(c("A", "B"), length.out = nrow(mtcars))) # Modulations -------------- x <- estimate_means(model, by = c("new_factor", "wt")) plot(visualisation_recipe(x)) # x <- estimate_means(model, by =c("new_factor", "cyl", "wt")) # plot(visualisation_recipe(x)) # TODO: broken #' # GLMs --------------------- data <- data.frame(vs = mtcars$vs, cyl = as.factor(mtcars$cyl)) x <- estimate_means(glm(vs ~ cyl, data = data, family = "binomial")) plot(visualisation_recipe(x)) ## End(Not run) # ============================================== # estimate_relation, estimate_response, ... # ============================================== # Simple Model --------------- x <- estimate_relation(lm(mpg ~ wt, data = mtcars)) layers <- visualisation_recipe(x) layers plot(layers) ## Not run: # Customize aesthetics ---------- layers <- visualisation_recipe(x, point = list(color = "red", alpha = 0.6, size = 3), line = list(color = "blue", size = 3), ribbon = list(fill = "green", alpha = 0.7), labs = list(subtitle = "Oh yeah!") ) layers plot(layers) # Customize raw data ------------- plot(visualisation_recipe(x, show_data = "none")) plot(visualisation_recipe(x, show_data = c("density_2d", "points"))) plot(visualisation_recipe(x, show_data = "density_2d_filled")) plot(visualisation_recipe(x, show_data = "density_2d_polygon")) plot(visualisation_recipe(x, show_data = "density_2d_raster")) + scale_x_continuous(expand = c(0, 0)) + scale_y_continuous(expand = c(0, 0)) # Single predictors examples ----------- plot(estimate_relation(lm(Sepal.Length ~ Sepal.Width, data = iris))) plot(estimate_relation(lm(Sepal.Length ~ Species, data = iris))) # 2-ways interaction ------------ # Numeric * numeric x <- estimate_relation(lm(mpg ~ wt * qsec, data = mtcars)) layers <- visualisation_recipe(x) plot(layers) # Numeric * factor x <- estimate_relation(lm(Sepal.Width ~ Sepal.Length * Species, data = iris)) layers <- visualisation_recipe(x) plot(layers) # Factor * numeric x <- estimate_relation(lm(Sepal.Width ~ Species * Sepal.Length, data = iris)) layers <- visualisation_recipe(x) plot(layers) # 3-ways interaction ------------ data <- mtcars data$vs <- as.factor(data$vs) data$cyl <- as.factor(data$cyl) data$new_factor <- as.factor(rep(c("A", "B"), length.out = nrow(mtcars))) # Numeric * numeric * factor x <- estimate_relation(lm(mpg ~ wt * am * vs, data = data)) layers <- visualisation_recipe(x) plot(layers) # Numeric * factor * factor x <- estimate_relation(lm(mpg ~ wt * cyl * new_factor, data = data)) layers <- visualisation_recipe(x) plot(layers) # Factor * numeric * numeric x <- estimate_relation(lm(mpg ~ cyl * qsec * hp, data = data)) layers <- visualisation_recipe(x) plot(layers) + scale_size_continuous(range = c(0.2, 1)) # GLMs --------------------- x <- estimate_relation(glm(vs ~ mpg, data = mtcars, family = "binomial")) plot(visualisation_recipe(x)) plot(visualisation_recipe(x, show_data = "jitter", point = list(height = 0.03))) # Multiple CIs --------------------- plot(estimate_relation(lm(mpg ~ disp, data = mtcars), ci = c(.50, .80, .95) )) plot(estimate_relation(lm(Sepal.Length ~ Species, data = iris), ci = c(0.5, 0.7, 0.95) )) # Bayesian models --------------------- if (require("ggplot2") && require("rstanarm")) { model <- rstanarm::stan_glm(mpg ~ wt, data = mtcars, refresh = 0) # Plot individual draws instead of regular ribbon x <- estimate_relation(model, keep_iterations = 100) layers <- visualisation_recipe(x, ribbon = list(color = "red")) plot(layers) model <- rstanarm::stan_glm(Sepal.Width ~ Species * Sepal.Length, data = iris, refresh = 0) plot(estimate_relation(model, keep_iterations = 100)) } ## End(Not run) # ============================================== # estimate_slopes # ============================================== if (require("ggplot2")) { model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) x <- estimate_slopes(model, trend = "Petal.Length", by = "Species") layers <- visualisation_recipe(x) layers plot(layers) model <- lm(Petal.Length ~ poly(Sepal.Width, 4), data = iris) x <- estimate_slopes(model, by = "Sepal.Width", length = 20) plot(visualisation_recipe(x)) model <- lm(Petal.Length ~ Species * poly(Sepal.Width, 3), data = iris) x <- estimate_slopes(model, by = c("Sepal.Width", "Species")) plot(visualisation_recipe(x)) } # TODO: fails with latest emmeans (1.8.0) if (require("mgcv")) { data <- iris data$Petal.Length <- data$Petal.Length^2 model <- mgcv::gam(Sepal.Width ~ t2(Petal.Width, Petal.Length), data = data) x <- estimate_slopes(model, by = c("Petal.Width", "Petal.Length"), length = 20) plot(visualisation_recipe(x)) model <- mgcv::gam(Sepal.Width ~ t2(Petal.Width, Petal.Length, by = Species), data = data) x <- estimate_slopes(model, by = c("Petal.Width", "Petal.Length", "Species"), length = 10) plot(visualisation_recipe(x)) }
# ============================================== # estimate_grouplevel # ============================================== data <- lme4::sleepstudy data <- rbind(data, data) data$Newfactor <- rep(c("A", "B", "C", "D")) # 1 random intercept model <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = data) x <- estimate_grouplevel(model) layers <- visualisation_recipe(x) layers plot(layers) # 2 random intercepts model <- lme4::lmer(Reaction ~ Days + (1 | Subject) + (1 | Newfactor), data = data) x <- estimate_grouplevel(model) plot(visualisation_recipe(x)) model <- lme4::lmer(Reaction ~ Days + (1 + Days | Subject) + (1 | Newfactor), data = data) x <- estimate_grouplevel(model) plot(visualisation_recipe(x)) # Simple Model --------------- x <- estimate_means(lm(Sepal.Width ~ Species, data = iris)) layers <- visualisation_recipe(x) layers plot(layers) ## Not run: # Customize aesthetics layers <- visualisation_recipe(x, jitter = list(width = 0.03, color = "red"), line = list(linetype = "dashed") ) plot(layers) # Customize raw data plot(visualisation_recipe(x, show_data = c("violin", "boxplot", "jitter"))) # Two levels --------------- data <- mtcars data$cyl <- as.factor(data$cyl) data$new_factor <- as.factor(rep(c("A", "B"), length.out = nrow(mtcars))) # Modulations -------------- x <- estimate_means(model, by = c("new_factor", "wt")) plot(visualisation_recipe(x)) # x <- estimate_means(model, by =c("new_factor", "cyl", "wt")) # plot(visualisation_recipe(x)) # TODO: broken #' # GLMs --------------------- data <- data.frame(vs = mtcars$vs, cyl = as.factor(mtcars$cyl)) x <- estimate_means(glm(vs ~ cyl, data = data, family = "binomial")) plot(visualisation_recipe(x)) ## End(Not run) # ============================================== # estimate_relation, estimate_response, ... # ============================================== # Simple Model --------------- x <- estimate_relation(lm(mpg ~ wt, data = mtcars)) layers <- visualisation_recipe(x) layers plot(layers) ## Not run: # Customize aesthetics ---------- layers <- visualisation_recipe(x, point = list(color = "red", alpha = 0.6, size = 3), line = list(color = "blue", size = 3), ribbon = list(fill = "green", alpha = 0.7), labs = list(subtitle = "Oh yeah!") ) layers plot(layers) # Customize raw data ------------- plot(visualisation_recipe(x, show_data = "none")) plot(visualisation_recipe(x, show_data = c("density_2d", "points"))) plot(visualisation_recipe(x, show_data = "density_2d_filled")) plot(visualisation_recipe(x, show_data = "density_2d_polygon")) plot(visualisation_recipe(x, show_data = "density_2d_raster")) + scale_x_continuous(expand = c(0, 0)) + scale_y_continuous(expand = c(0, 0)) # Single predictors examples ----------- plot(estimate_relation(lm(Sepal.Length ~ Sepal.Width, data = iris))) plot(estimate_relation(lm(Sepal.Length ~ Species, data = iris))) # 2-ways interaction ------------ # Numeric * numeric x <- estimate_relation(lm(mpg ~ wt * qsec, data = mtcars)) layers <- visualisation_recipe(x) plot(layers) # Numeric * factor x <- estimate_relation(lm(Sepal.Width ~ Sepal.Length * Species, data = iris)) layers <- visualisation_recipe(x) plot(layers) # Factor * numeric x <- estimate_relation(lm(Sepal.Width ~ Species * Sepal.Length, data = iris)) layers <- visualisation_recipe(x) plot(layers) # 3-ways interaction ------------ data <- mtcars data$vs <- as.factor(data$vs) data$cyl <- as.factor(data$cyl) data$new_factor <- as.factor(rep(c("A", "B"), length.out = nrow(mtcars))) # Numeric * numeric * factor x <- estimate_relation(lm(mpg ~ wt * am * vs, data = data)) layers <- visualisation_recipe(x) plot(layers) # Numeric * factor * factor x <- estimate_relation(lm(mpg ~ wt * cyl * new_factor, data = data)) layers <- visualisation_recipe(x) plot(layers) # Factor * numeric * numeric x <- estimate_relation(lm(mpg ~ cyl * qsec * hp, data = data)) layers <- visualisation_recipe(x) plot(layers) + scale_size_continuous(range = c(0.2, 1)) # GLMs --------------------- x <- estimate_relation(glm(vs ~ mpg, data = mtcars, family = "binomial")) plot(visualisation_recipe(x)) plot(visualisation_recipe(x, show_data = "jitter", point = list(height = 0.03))) # Multiple CIs --------------------- plot(estimate_relation(lm(mpg ~ disp, data = mtcars), ci = c(.50, .80, .95) )) plot(estimate_relation(lm(Sepal.Length ~ Species, data = iris), ci = c(0.5, 0.7, 0.95) )) # Bayesian models --------------------- if (require("ggplot2") && require("rstanarm")) { model <- rstanarm::stan_glm(mpg ~ wt, data = mtcars, refresh = 0) # Plot individual draws instead of regular ribbon x <- estimate_relation(model, keep_iterations = 100) layers <- visualisation_recipe(x, ribbon = list(color = "red")) plot(layers) model <- rstanarm::stan_glm(Sepal.Width ~ Species * Sepal.Length, data = iris, refresh = 0) plot(estimate_relation(model, keep_iterations = 100)) } ## End(Not run) # ============================================== # estimate_slopes # ============================================== if (require("ggplot2")) { model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris) x <- estimate_slopes(model, trend = "Petal.Length", by = "Species") layers <- visualisation_recipe(x) layers plot(layers) model <- lm(Petal.Length ~ poly(Sepal.Width, 4), data = iris) x <- estimate_slopes(model, by = "Sepal.Width", length = 20) plot(visualisation_recipe(x)) model <- lm(Petal.Length ~ Species * poly(Sepal.Width, 3), data = iris) x <- estimate_slopes(model, by = c("Sepal.Width", "Species")) plot(visualisation_recipe(x)) } # TODO: fails with latest emmeans (1.8.0) if (require("mgcv")) { data <- iris data$Petal.Length <- data$Petal.Length^2 model <- mgcv::gam(Sepal.Width ~ t2(Petal.Width, Petal.Length), data = data) x <- estimate_slopes(model, by = c("Petal.Width", "Petal.Length"), length = 20) plot(visualisation_recipe(x)) model <- mgcv::gam(Sepal.Width ~ t2(Petal.Width, Petal.Length, by = Species), data = data) x <- estimate_slopes(model, by = c("Petal.Width", "Petal.Length", "Species"), length = 10) plot(visualisation_recipe(x)) }
Find zero crossings of a vector, i.e., indices when the numeric variable crosses 0.
zero_crossings(x)
zero_crossings(x)
x |
A numeric vector. |
Vector of zero crossings.
Based on the uniroot.all
function from the rootSolve package.
x <- sin(seq(0, 4 * pi, length.out = 100)) plot(x) zero_crossings(x)
x <- sin(seq(0, 4 * pi, length.out = 100)) plot(x) zero_crossings(x)