---
title: "Introspecting Tables"
author: "Gabriel Becker and Adrian Waddell"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introspecting Tables}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  chunk_output_type: console
---

```{r, include = FALSE}
suggested_dependent_pkgs <- c("dplyr")
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = all(vapply(
    suggested_dependent_pkgs,
    requireNamespace,
    logical(1),
    quietly = TRUE
  ))
)
```

```{r, echo=FALSE}
knitr::opts_chunk$set(comment = "#")
```

The packages used in this vignette are `rtables` and `dplyr`:

```{r, message=FALSE}
library(rtables)
library(dplyr)
```

## Introduction

First, let's set up a simple table.

```{r}
lyt <- basic_table() %>%
  split_cols_by("ARMCD", show_colcounts = TRUE, colcount_format = "N=xx") %>%
  split_cols_by("STRATA2", show_colcounts = TRUE) %>%
  split_rows_by("STRATA1") %>%
  add_overall_col("All") %>%
  summarize_row_groups() %>%
  analyze("AGE", afun = max, format = "xx.x")

tbl <- build_table(lyt, ex_adsl)
tbl
```

## Getting Started

We can get basic table dimensions, the number of rows, and the number of columns with the following code:

```{r}
dim(tbl)
nrow(tbl)
ncol(tbl)
```

## Detailed Table Structure

The `table_structure()` function prints a summary of a table's row structure at one of two levels of detail. By default, it summarizes the structure at the subtable level.

```{r}
table_structure(tbl)
```

When the `detail` argument is set to `"row"`, however, it provides a more detailed row-level summary which acts as a useful alternative to how we might normally use the `str()` function to interrogate compound nested lists.

```{r}
table_structure(tbl, detail = "row") # or "subtable"
```

Similarly, for columns we can see how the tree is structured with the following call:

```{r}
coltree_structure(tbl)
```

Further information about the column structure can be found in the vignette on [`col_counts`](https://insightsengineering.github.io/rtables/latest-tag/articles/col_counts.html).

The `make_row_df()` and `make_col_df()` functions each create a `data.frame` with a variety of information about the table's structure. Most useful for introspection purposes are the `label`, `name`, `abs_rownumber`, `path` and `node_class` columns (the remainder of the information in the returned `data.frame` is used for pagination).

```{r}
make_row_df(tbl)[, c("label", "name", "abs_rownumber", "path", "node_class")]
```

There is also a wrapper function around `make_row_df()`, `row_paths()`, which displays only the row path structure:

```{r}
row_paths(tbl)
```

By default `make_row_df()` summarizes only visible rows, but setting `visible_only` to `FALSE` gives us a structural summary of the table with the full hierarchy of subtables, including those that are not represented directly by any visible rows:

```{r}
make_row_df(tbl, visible_only = FALSE)[, c("label", "name", "abs_rownumber", "path", "node_class")]
```

`make_col_df()` similarly accepts `visible_only`, though here the meaning is slightly different: it indicates whether only *leaf* columns should be summarized (the default, `TRUE`) or whether higher level groups of columns - analogous to subtables in row space - should be summarized as well.

```{r}
make_col_df(tbl)[, c("label", "name", "abs_pos", "path", "leaf_indices")]
```

```{r}
make_col_df(tbl, visible_only = FALSE)[, c("label", "name", "abs_pos", "path", "leaf_indices")]
```

Similarly, there is a wrapper function, `col_paths()`, which displays only the column path structure:

```{r}
col_paths(tbl)
```

The `row_paths_summary()` and `col_paths_summary()` functions wrap the respective `make_*_df` functions, printing the `name`, `node_class`, and `path` information (in the row case), or the `label` and `path` information (in the column case), indented to illustrate table structure:

```{r}
row_paths_summary(tbl)
```

```{r}
col_paths_summary(tbl)
```

## Insights on Value Format Structure

We can gain insight into the value formatting structure of a table using `table_shell()`, which returns a table with the same output as `print()` but with the cell values replaced by their underlying format strings (e.g. instead of `40.0`, `xx.x` is displayed, and so on). This is useful for understanding the structure of the table, and for debugging purposes. Another useful tool is the `value_formats()` function, which instead of a table returns a matrix of the format strings for each cell value in the table. The output of both functions for the table above is shown below:

```{r}
table_shell(tbl)
```

```{r}
value_formats(tbl)
```

## Applications

Knowing the structure of an `rtable` object is helpful for retrieving specific values from the table. For examples, see the [Path Based Cell Value Accessing](https://insightsengineering.github.io/rtables/latest-tag/articles/subsetting_tables.html#path-based-cell-value-accessing) section of the Subsetting and Manipulating Table Contents vignette.

Understanding table structure is also important for post-processing steps such as sorting and pruning. More details are covered in the [Pruning and Sorting Tables](https://insightsengineering.github.io/rtables/latest-tag/articles/sorting_pruning.html) vignette.

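As a minimal sketch of how the introspection output feeds into value retrieval, the chunk below takes a row path from `make_row_df()` and a column path from `col_paths()` and passes them to `value_at()`, the accessor described in the Subsetting and Manipulating Table Contents vignette. The objects `lyt_small` and `tbl_small` are hypothetical stand-ins built only for this illustration, and the filter assumes that data rows are reported with a `node_class` of `"DataRow"`.

```{r}
library(rtables)

# Hypothetical stand-in table, kept small so the paths are easy to read.
lyt_small <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("STRATA1") %>%
  analyze("AGE", afun = mean, format = "xx.x")
tbl_small <- build_table(lyt_small, ex_adsl)

# Row-level structure; `path` is a list column of character vectors.
rdf <- make_row_df(tbl_small)

# Keep only rows that carry cell values (assumes node_class "DataRow").
data_rows <- rdf[rdf$node_class == "DataRow", ]

# Take the first data row path and the first leaf column path.
rp <- data_rows$path[[1]]
cp <- col_paths(tbl_small)[[1]]

# Retrieve the raw (unformatted) value stored in that single cell.
val <- value_at(tbl_small, rp, cp)
val
```

Reading the paths from the table, rather than typing them by hand, keeps the lookup in step with the layout if split variables or factor levels change.
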
## Summary

In this vignette you have learned about a number of utility functions that are available for examining the underlying structure of `rtable` objects.