--- title: "A Quick Start Guide on Using semptools" author: "Shu Fai Cheung & Mark Hok Chio Lai" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{A Quick Start Guide on Using semptools} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} %\VignetteDepends{magrittr} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 6, fig.align = "center" ) ``` ```{r setup, echo = FALSE} library(semptools) ``` # Introduction The package [semptools](https://sfcheung.github.io/semptools/) ([CRAN page](https://cran.r-project.org/package=semptools)) contains functions that *post-process* an output from `semPlot::semPaths()`, to help users to customize the appearance of the graphs generated by `semPlot::semPaths()`. The following sections were written to be self-contained, with some elements repeated, such that each of them can be read individually. # Mark all parameter estimates by asterisks based on p-Value: `mark_sig` Let us consider a simple path analysis model: ```{r mark_sig01} library(lavaan) mod_pa <- 'x1 ~~ x2 x3 ~ x1 + x2 x4 ~ x1 + x3 ' fit_pa <- lavaan::sem(mod_pa, pa_example) parameterEstimates(fit_pa) ``` This is the plot from `semPaths`. ```{r} library(semPlot) m <- matrix(c("x1", NA, NA, NA, NA, "x3", NA, "x4", "x2", NA, NA, NA), byrow = TRUE, 3, 4) p_pa <- semPaths(fit_pa, whatLabels = "est", sizeMan = 10, edge.label.cex = 1.15, style = "ram", nCharNodes = 0, nCharEdges = 0, layout = m) ``` We know from the `lavaan::lavaan()` output that some paths are significant and some are not. In some disciplines, asterisks are conventionally added indicate this. However, `semPlot::semPaths()` does not do this. We can use `mark_sig()` to add asterisks based on the p-values of the free parameters. ```{r} library(semptools) p_pa2 <- mark_sig(p_pa, fit_pa) plot(p_pa2) ``` The first argument, `semPaths_plot`, is the output from `semPaths::semPlot()`. The second argument, `object`, is the `lavaan::lavaan()` output used to generate the plot. This output is needed to extract the *p*-values. The default labels follow the common convention: "\*" for *p* less than .05, "\*\*" for *p* less than .01, and "\*\*\*" for p less than .001. This can be changed by the argument `alpha` (this must be named as the it is not the second argument). E.g.: ```{r} p_pa3 <- mark_sig(p_pa, fit_pa, alpha = c("(n.s.)" = 1.00, "*" = .01)) plot(p_pa3) ``` # Add standard error estimates to parameter estimates: `mark_se` Let us consider a simple path analysis model: ```{r} library(lavaan) mod_pa <- 'x1 ~~ x2 x3 ~ x1 + x2 x4 ~ x1 + x3 ' fit_pa <- lavaan::sem(mod_pa, pa_example) parameterEstimates(fit_pa) ``` This is the plot from `semPlot::semPaths()`. ```{r} library(semPlot) m <- matrix(c("x1", NA, NA, NA, NA, "x3", NA, "x4", "x2", NA, NA, NA), byrow = TRUE, 3, 4) p_pa <- semPaths(fit_pa, whatLabels = "est", sizeMan = 10, edge.label.cex = 1.15, style = "ram", nCharNodes = 0, nCharEdges = 0, layout = m) ``` We can use `mark_se()` to add the standard errors for the parameter estimates: ```{r} library(semptools) p_pa2 <- mark_se(p_pa, fit_pa) plot(p_pa2) ``` The first argument, `semPaths_plot`, is the output from `semPaths::semPlot()`. The second argument, `object`, is the `lavaan::lavaan()` output used to generate the plot. This output is needed to extra the standard errors. By default, the standard errors are enclosed by parentheses and appended to the parameter estimates, separated by one space. The argument `sep` can be used to use another separator. For example, if `"\n"` is used, the standard errors will be displayed below the corresponding parameter estimates. ```{r} p_pa2 <- mark_se(p_pa, fit_pa, sep = "\n") plot(p_pa2) ``` # Rotate the residuals of selected variables: `rotate_resid` Let us consider a simple path analysis model: ```{r} library(lavaan) mod_pa <- 'x1 ~~ x2 x3 ~ x1 + x2 x4 ~ x1 + x3 ' fit_pa <- lavaan::sem(mod_pa, pa_example) ``` This is the plot from `semPlot::semPaths()`. ```{r} library(semPlot) m <- matrix(c("x1", NA, NA, NA, NA, "x3", NA, "x4", "x2", NA, NA, NA), byrow = TRUE, 3, 4) p_pa <- semPaths(fit_pa, whatLabels = "est", sizeMan = 10, edge.label.cex = 1.15, style = "ram", nCharNodes = 0, nCharEdges = 0, layout = m) ``` Suppose we want to rotate the residuals of some variables to improve readability. - For `x3`, we want to place the residual to top-right corner. - For `x4`, we want to place the residual to the top-left corner. - For `x2`, we want to place the residual to the left. We first need to decide the angle of placement, in degrees. Top is 0 degree. Clockwise position is positive, and anticlockwise position is negative. Therefore, top-right is 45, top-left is -45, and left is -90. We then use `rotate_resid()` to post-process the `semPlot::semPaths()` output. The first argument, `semPaths_plot`, is the `semPlot::semPaths()` output. The second argument, `rotate_resid_list`, is the vector to specify how the residuals should be rotated. The name is the node for which the residual will be rotated, and the value is the degree of rotation. For example, to achieve the results described above, the vector is `c(x3 = 45, x4 = -45, x2 = -90)`: ```{r} library(semptools) my_rotate_resid_list <- c(x3 = 45, x4 = -45, x2 = -90) p_pa3 <- rotate_resid(p_pa, my_rotate_resid_list) plot(p_pa3) ``` (Note: This function accepts named vectors since version 0.2.8. Lists of named list are still supported but not suggested. Please see `?rotate_resid` on how to use lists of named list.) # Set the curve attributes of selected arrows: `set_curve` Let us consider a simple path analysis model: ```{r} library(lavaan) mod_pa <- 'x1 ~~ x2 x3 ~ x1 + x2 x4 ~ x1 + x3 ' fit_pa <- lavaan::sem(mod_pa, pa_example) ``` This is the plot from `semPaths`. ```{r} library(semPlot) m <- matrix(c("x1", NA, NA, NA, NA, "x3", NA, "x4", "x2", NA, NA, NA), byrow = TRUE, 3, 4) p_pa <- semPaths(fit_pa, whatLabels = "est", sizeMan = 10, edge.label.cex = 1.15, style = "ram", nCharNodes = 0, nCharEdges = 0, layout = m) ``` Suppose we want to change the curvature of these two arrows (`edges`): - Have the `x1 ~~ x2` covariance curved "away" from the center. - Have the `x4 ~ x1` path curved upward. We then use `set_curve()` to post-process the `semPlot::semPaths()` output. The first argument, `semPaths_plot`, is the `semPlot::semPaths()` output. The second argument, ` curve_list`, is the list to specify the new curvature of the selected arrows. The "name" of each element is of the same form as `lhs-op-rhs` as in `lavaan::lavaan()` model syntax. In `lavaan`, `y ~ x` denotes an arrow from `x` to `y`. Therefore, if we want to change the curvature of the path *from* `x` *to* `y` to -3, then the element is `"y ~ x" = -3`. Note that whether `~` or `~~` is used does not matter. To achieve the changes described above, we can use `c("x2 ~~ x1" = -3, "x4 ~ x1" = 2)`, as shown below: ```{r} my_curve_list <- c("x2 ~~ x1" = -3, "x4 ~ x1" = 2) p_pa3 <- set_curve(p_pa, my_curve_list) plot(p_pa3) ``` Note that the meaning of the value depends on which variable is in the `from` field and which variable is in the `to` field. Therefore, `"x2 ~~ x1" = -3` and `"x1 ~~ x2" = -3` are two different changes. If we treat the `from` variable as the back and the `to` variable as the front, then a *positive* number bends the line to *left*, and a *negative* number bends the line to the *right*. It is not easy to decide what the value should be used to set the curve. Trial and error is needed for complicated models. The `curve` attributes of the corresponding arrows of the `qgraph` object will be updated. (Note: This function accepts named vectors since version 0.2.8. Lists of named list are still supported but not suggested. Please see `?set_curve` on how to use lists of named list.) # Set the positions of parameters of selected arrows: `set_edge_label_position` Let us consider a simple path analysis model: ```{r} library(lavaan) mod_pa <- 'x1 ~~ x2 x3 ~ x1 + x2 x4 ~ x1 + x3 ' fit_pa <- lavaan::sem(mod_pa, pa_example) ``` This is the plot from `semPlot::semPaths()`. ```{r} library(semPlot) m <- matrix(c("x1", NA, NA, NA, NA, "x3", NA, "x4", "x2", NA, NA, NA), byrow = TRUE, 3, 4) p_pa <- semPaths(fit_pa, whatLabels = "est", sizeMan = 10, edge.label.cex = 1.15, style = "ram", nCharNodes = 0, nCharEdges = 0, layout = m) ``` Suppose we want to move the parameter estimates this way: - For the `x4 ~ x1` path, move the parameter estimates closer to `x4`. - For the `x3 ~ x1` path, move the parameter estimates closer to `x1`. - For the `x3 ~ x2` path, move the parameter estimates closer to `x2`. We can use `set_edge_label_position()` to post-process the `semPlot::semPaths` output. The first argument, `semPaths_plot`, is the `semPlot::semPaths()` output. The second argument, `position_list`, is the list to specify the new position of the selected arrows. We can use a named vector to specify the changes. The "name" of each element is of the same form as `lhs-op-rhs` as in `lavaan::lavaan()` model syntax. In `lavaan`, `y ~ x` denotes an arrow from `x` to `y`. Therefore, if we want to change the curvature of the path *from* `x` *to* `y` to -3, then the element is `"y ~ x" = -3`. Note that whether `~` or `~~` is used does not matter. Therefore, the changes described above can be specified by `c("x2 ~~ x1" = -3, "x4 ~ x1" = 2)`, as shown below: ```{r} library(semptools) my_position_list <- c("x3 ~ x1" = .25, "x3 ~ x2" = .25, "x4 ~ x1" = .75) p_pa3 <- set_edge_label_position(p_pa, my_position_list) plot(p_pa3) ``` (Note: This function accept named vectors since version 0.2.8. Lists of named list are still supported but not suggested. Please see `?set_edge_label_position` on how to use lists of named list.) # Change one or more node labels: `change_node_label` `semPlot::semPaths()` supports changing the labels of nodes when generating a plot through the argument `nodeLabels`. However, if we want to use functions such as `mark_sig()` or `mark_se()`, which require information from the original results from the original `lavaan` output, then we cannot use `nodeLabels` because these functions do not (yet) know how to map a user-defined label to the variables in the `lavaan` output. One solution is to use `semptools` functions to process the `qgraph` generated by `semPlot::semPaths()`, and change the node labels in *last step* to create the final plot. This can be done by `change_node_label()`. Let us consider a simple path analysis model in which we use `marg_sig()` to add asterisks to denote significant parameters: ```{r} library(lavaan) library(semPlot) library(semptools) mod_pa <- 'x1 ~~ x2 x3 ~ x1 + x2 x4 ~ x1 + x3 ' fit_pa <- lavaan::sem(mod_pa, pa_example) m <- matrix(c("x1", NA, NA, NA, NA, "x3", NA, "x4", "x2", NA, NA, NA), byrow = TRUE, 3, 4) p_pa <- semPaths(fit_pa, whatLabels = "est", sizeMan = 10, edge.label.cex = 1.15, style = "ram", nCharNodes = 0, nCharEdges = 0, layout = m) p_pa2 <- mark_sig(p_pa, fit_pa, alpha = c("(n.s.)" = 1.00, "*" = .01)) plot(p_pa2) ``` Suppose we want change `x1`, `x2`, `x3`, and `x4` to `Attitude`, `SbjNorm`, `Intention`, and `Behavior`, we process the graph, `p_pa2` above, by `change_node_label()` as below: ```{r} p_pa3 <- change_node_label(p_pa2, c(x1 = "Attitude", x2 = "SbjNorm", x3 = "Intention", x4 = "Behavior"), label.cex = 1.1) plot(p_pa3) ``` The second argument can be a named vector or a named list. The name of each element is the original label (e.g., `x1` in this example), and the value is the new label (e.g., `"Attitude"` for `x1`). Only the labels of named nodes will be changed. Note that usually we also set the `label.cex` argument, which is identical to the same argument in `semPlot::semPaths()` because the new labels might not fit the nodes. # Using pipe-operator All the functions support the `%>%` operator from `magrittr` or the native pipe operator `|>` available since R 4.1.x. Therefore, we can chain the post-processing. ```{r} library(lavaan) mod_pa <- 'x1 ~~ x2 x3 ~ x1 + x2 x4 ~ x1 + x3 ' fit_pa <- lavaan::sem(mod_pa, pa_example) ``` This is the initial plot: ```{r} library(semPlot) m <- matrix(c("x1", NA, NA, NA, NA, "x3", NA, "x4", "x2", NA, NA, NA), byrow = TRUE, 3, 4) p_pa <- semPaths(fit_pa, whatLabels = "est", sizeMan = 10, edge.label.cex = 1.15, style = "ram", nCharNodes = 0, nCharEdges = 0, layout = m) ``` We will do this: - Change the curvature of `x1 ~~ x2` - Rotate the residuals of `x1`, `x2`, `x3`, and `x4`, - Add asterisks to denote significant test results - Add standard errors - Move the parameter estimate of the `x4 ~ x1` path closer to `x4`. ```{r eval = FALSE} my_position_list <- c("x4 ~ x1" = .75) my_curve_list <- c("x2 ~ x1" = -2) my_rotate_resid_list <- c(x1 = 0, x2 = 180, x3 = 140, x4 = 140) my_position_list <- c("x4 ~ x1" = .65) # If R version 4.1.0 or above p_pa3 <- p_pa |> set_curve(my_curve_list) |> rotate_resid(my_rotate_resid_list) |> mark_sig(fit_pa) |> mark_se(fit_pa, sep = "\n") |> set_edge_label_position(my_position_list) plot(p_pa3) ``` ```{r echo = FALSE} my_position_list <- c("x4 ~ x1" = .75) my_curve_list <- c("x2 ~ x1" = -2) my_rotate_resid_list <- c(x1 = 0, x2 = 180, x3 = 140, x4 = 140) my_position_list <- c("x4 ~ x1" = .65) # if ((compareVersion(as.character(getRversion()), "4.1.0")) >= 0) { # p_pa3 <- p_pa |> set_curve(my_curve_list) |> # rotate_resid(my_rotate_resid_list) |> # mark_sig(fit_pa) |> # mark_se(fit_pa, sep = "\n") |> # set_edge_label_position(my_position_list) # } else { require(magrittr) p_pa3 <- p_pa %>% set_curve(my_curve_list) %>% rotate_resid(my_rotate_resid_list) %>% mark_sig(fit_pa) %>% mark_se(fit_pa, sep = "\n") %>% set_edge_label_position(my_position_list) # } plot(p_pa3) ``` For most of the functions, the necessary argument beside the `semPlot::semPaths` output, if any, is the second element. Therefore, they can be included as unnamed arguments. For the third and other optional arguments, such as `sep` for `mark_se()`, it is better to name them. # Limitations - Currently, if a function needs the SEM output, only `lavaan` output is supported.