---
title: "5. Visualization: every plot, and how to read it"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{5. Visualization: every plot, and how to read it}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>",
                      fig.width = 8, fig.height = 5, out.width = "100%")
```

`transitiontrees` draws every static plot in pure `ggplot2` -- no extra
plotting dependency -- plus one optional interactive renderer (`visNetwork`).
Every plot returns a standard object you can theme, save, or further modify.
This vignette tours them all and reads each one.

A shared convention across the static tree styles: **node size = context
count**, **node fill = the most-recent state of the pathway**, and **edge
thickness = the volume of sequences flowing down that branch**.

## Setup

Every plot below is drawn from the same fitted, pruned tree on the bundled
`trajectories` data (138 learners, three engagement states). We fit it once
here and reuse it throughout.

```{r setup}
library(transitiontrees)
data(trajectories)
set.seed(1)

tree   <- context_tree(trajectories, max_depth = 3L, min_count = 5L)
pruned <- prune_tree(tree, criterion = "G2", alpha = 0.05)
pruned
```

## 1. The fitted tree, four ways

### Horizontal phylogram (default)

Root on the left, depth rightward; every leaf is labelled with its full
arrow-form pathway and the predicted next state. This is the style for a paper
when you need to cite specific pathways inline.

```{r horizontal, fig.width = 14, fig.height = 8}
plot(pruned, style = "horizontal")
```

`point_size_range` and `edge_size_range` exaggerate or compress the size
dynamic range -- useful for slides where the count contrast must read from the
back of the room. The encodings are unchanged; only the scales differ.


```{r horizontal-sized, fig.width = 14, fig.height = 8}
plot(pruned, style = "horizontal",
     point_size_range = c(3, 12),
     edge_size_range = c(0.4, 3.5))
```

### Radial dendrogram

The same tree wrapped into a circle: the eye goes to the thick central
branches (the corpus highways) versus the thin outer twigs (contexts pruning
kept on evidence, not volume).

```{r dendrogram, fig.height = 6}
plot(pruned, style = "dendrogram")
```

### Icicle / sunburst

A space-filling partition: arc angular width is proportional to count, so a
dominant state visually swallows the ring -- an honest depiction of class
imbalance.

```{r icicle, fig.height = 6}
plot(pruned, style = "icicle")
```

A fourth style, `style = "interactive"`, renders the same tree as a draggable,
zoomable `visNetwork` widget (collapse the dominant spine and the rare
informative branches become legible). It produces an HTML widget rather than a
static figure, so it is best run in an interactive session rather than shown
inline here.

## 2. Pathway-centric plots

These complement the tree by ranking *pathways* rather than drawing topology.

### Next-state heatmap

Each row is a context, each column a next state, each cell
`P(next | context)`, modal cell bold; a `>` prefix marks a context whose modal
next state flips versus its shorter parent. Sorting the **same** data two ways
is the single best "common vs informative" figure:

```{r heatmap-count, fig.height = 5.5}
plot_pathways(pruned, top = 12, sort_by = "count")       # the highways
```

```{r heatmap-div, fig.height = 5.5}
plot_pathways(pruned, top = 12, sort_by = "divergence")  # the informative ones
```

Sorted by count the bright cells stack on the most frequent next state; sorted
by divergence they move off it. That lateral shift is the thesis in one
comparison.

### Divergence lollipop

Per-context KL from the shorter parent, ranked, with orange points marking
modal-flip contexts -- the histories that genuinely *change* the prediction.
`min_count` removes small-sample mirages.

```{r divergence, fig.height = 5}
plot_divergence(pruned, top = 12, min_count = 5)
```

### Per-context distributions

The full next-state distribution for each context as small multiples --
peaked panels are near-settled continuations, flat panels are the decision
points where history does not resolve the next state.

```{r distributions, fig.height = 5.5}
plot_distributions(pruned, top = 6)
```

## 3. Diagnostic plots

### How much memory does one pathway need?

`plot_pruning()` walks a pathway's suffix chain -- the full context, then the
same context with its oldest move dropped, down to the root -- and marks which
contexts the pruning test keeps (solid) versus drops (faded). It answers, for
that one pathway, how far back history actually has to reach.

```{r pruning, fig.width = 9, fig.height = 4.5}
plot_pruning(tree, "Active -> Active -> Average")
```

### Predictive quality

`plot_predictive()` scores sequences against the fitted tree three ways. For
this tour we score the bundled `trajectories` themselves; in a real evaluation
pass genuinely held-out sequences (the *Advanced analysis* vignette shows the
cross-validated route).

`type = "logloss"` -- per-position surprise in bits against position; below
the uniform ceiling is structure the model exploited:

```{r predictive-logloss, fig.height = 4.5}
plot_predictive(pruned, trajectories, type = "logloss")
```

`type = "ecdf"` -- the distribution of the probability assigned to the state
that actually occurred; steep steps reveal calibration plateaus (e.g. a mass
of three-way-open branch points):

```{r predictive-ecdf, fig.height = 4.5}
plot_predictive(pruned, trajectories, type = "ecdf")
```

A third type, `type = "position"`, traces each individual sequence's
confidence move-by-move (one grey line per sequence).
It is a per-sequence view that only reads cleanly for a handful of sequences,
so it is omitted here; reach for it when you want to inspect a few specific
trajectories rather than the corpus as a whole.

## 4. Forward trajectory trees

The context tree reads backward; `plot_trajectories()` draws the same
sequences forward in time. Colour by **frequency** (how many sequences walk
each path) or by **predictability** (`P(state | history)` from the model).
Read together they separate traffic from predictability -- a wide-but-pale
edge is a high-traffic decision point.

Forward trajectories show their structure best on a richer alphabet, so this
section uses the bundled `ai_long` log (eight AI-prompting move types) rather
than the three-state engagement data above.

```{r traj-fit}
data(ai_long)
tree_ai <- context_tree(ai_long,
                        actor = "project", session = "session_id",
                        action = "code",
                        max_depth = 3L, min_count = 10L)
pruned_ai <- prune_tree(tree_ai)
```

```{r traj-freq, fig.width = 11, fig.height = 7}
plot_trajectories(tree_ai, measure = "frequency", min_count = 20L)
```

```{r traj-pred, fig.width = 11, fig.height = 7}
plot_trajectories(pruned_ai, measure = "predictability", min_count = 20L)
```

## 5. Inferential plots

### Bootstrap forest plot

Each pathway's 95% bootstrap interval on G-squared against the chi-square
critical value (dashed line); colour encodes the trust quadrant. A bar
entirely to the right is reproducibly informative.

```{r boot-plot, fig.height = 5.5}
boot <- bootstrap_pathways(pruned, iter = 100L, seed = 1L)
plot(boot)
```

### Per-pathway resample distributions

```{r boot-resamples, fig.height = 4.5}
plot_pathway_resamples(boot, stat = "divergence", top = 6)
```

### Cohort comparison: permutation null

We name an external group column (`Achiever`) on the bundled
`group_regulation_long` log; `context_tree(group = )` fits one tree per
cohort, and `compare_trees()` consumes the group directly.


```{r compare-plot, fig.height = 4.5}
data(group_regulation_long)
grp_reg <- context_tree(group_regulation_long,
                        actor = "Actor", time = "Time", action = "Action",
                        group = "Achiever",
                        max_depth = 2L, min_count = 10L)
cmp <- compare_trees(prune_tree(grp_reg), iter = 199L, seed = 1L)
plot(cmp)
```

The observed distance (orange line) sits in the right tail of the
label-shuffled null (grey) -- the visual form of the permutation p-value.

### Tuning surface

```{r tune-plot, fig.height = 5}
tg <- tune_tree(trajectories, max_depth = 1L:4L, folds = 5L, seed = 1L)
plot(tg)
```

A flat-then-rising perplexity curve is the picture of a short-memory process;
the orange star marks the cross-validated winner.

### Group difference map

`plot_difference()` draws the per-context residual map for the same
`group =`-fitted tree -- where two cohorts resolve the same history toward
different next states. `depth = 1L` keeps the map to the single-state contexts
so the rows stay legible (a deep tree has too many contexts to label).

```{r difference, fig.height = 5}
plot_difference(grp_reg, depth = 1L)
```

## Recap

| Goal | Function |
|---|---|
| The tree | `plot(style = c("horizontal", "dendrogram", "icicle", "interactive"))` (interactive = `visNetwork` widget) |
| Rank pathways | `plot_pathways()`, `plot_divergence()`, `plot_distributions()` |
| Memory of one pathway | `plot_pruning()` |
| Held-out quality | `plot_predictive(type = c("logloss", "ecdf", "position"))` |
| Forward trajectories | `plot_trajectories(measure = c("frequency", "predictability"))` |
| Reliability | `plot()`, `plot_pathway_resamples()` |
| Comparison | `plot()`, `plot_difference()` |
| Tuning | `plot()` |

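
Several plots above rank contexts by their KL divergence from the shorter parent, flag modal flips, or compare G-squared with a chi-square critical value. As a reading aid, the chunk below is a minimal base-R sketch of those three quantities for a single context. It assumes the textbook definitions (KL in bits, a likelihood-ratio G-squared with one fewer degrees of freedom than there are states); the package's own conventions for log base, smoothing, or degrees of freedom may differ. The state labels and counts are made up for illustration and are not taken from the bundled data.

```{r divergence-sketch}
# Hypothetical next-state counts (illustrative only, not from `trajectories`).
states <- c("A", "B", "C")
parent_counts  <- setNames(c(60, 30, 10), states)  # shorter (parent) context
context_counts <- setNames(c(5, 10, 25), states)   # longer (child) context

p_parent  <- parent_counts / sum(parent_counts)
p_context <- context_counts / sum(context_counts)

# KL divergence of the child distribution from the parent, in bits.
# Zero-probability child cells contribute nothing, so they are skipped.
nz <- p_context > 0
kl_bits <- sum(p_context[nz] * log2(p_context[nz] / p_parent[nz]))

# G-squared: observed child counts against the counts the parent would predict.
expected <- sum(context_counts) * p_parent
g2 <- 2 * sum(context_counts[nz] * log(context_counts[nz] / expected[nz]))

# Chi-square critical value at alpha = 0.05 (assumed df = number of states - 1).
critical <- qchisq(0.95, df = length(states) - 1)

# Modal flip: the most likely next state differs between child and parent.
modal_flip <- names(which.max(p_context)) != names(which.max(p_parent))

c(kl_bits = kl_bits, g2 = g2, critical = critical)
modal_flip
```

Under these definitions the two statistics are tied together: G-squared equals `2 * n * log(2) * kl_bits`, where `n` is the context count. That is why a context can show a large divergence yet a weak test when its count is small, which is the case the `min_count` filter in `plot_divergence()` screens out.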