| Title: | Declarative EQUATOR-Style Flow Diagrams for Clinical Studies |
|---|---|
| Description: | Build EQUATOR-style flowcharts for clinical studies by sequentially defining inclusion and exclusion criteria, study arms, and endpoints. The pipe-friendly API supports CONSORT (randomized trials), STROBE (observational cohorts), STARD (diagnostic accuracy), PRISMA (systematic reviews), and MOOSE (observational meta-analysis) diagram layouts, as well as multi-source convergence, split-and-recombine, factorial, and hybrid topologies. Diagrams are rendered via 'grid' graphics in both data-driven (automatic counting) and manual-count modes, with optional 'DiagrammeR'/'Graphviz' output. |
| Authors: | Paul Hsin-ti McClelland [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-3119-6531>) |
| Maintainer: | Paul Hsin-ti McClelland <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.6.0 |
| Built: | 2026-06-24 13:29:39 UTC |
| Source: | https://github.com/cran/selecta |
Models a step where participants undergo (or fail to undergo) a test or procedure. This is the primary building block for STARD-style diagnostic accuracy diagrams. The side box shows who did not receive the procedure (with optional reasons), and the main flow continues with those who were assessed.
assess(
  .flow,
  label,
  criterion,
  not_received = NULL,
  reasons = NULL,
  show_zero = FALSE
)
.flow |
A |
label |
Character string naming the test or procedure
(e.g., |
criterion |
An unquoted logical expression that evaluates to
|
not_received |
Integer (manual mode). Number of participants who did not receive this test. |
reasons |
Named integer vector of reasons for non-receipt
(e.g., |
show_zero |
Logical. If |
assess() models a test or procedure that only part of the cohort
undergoes, the recurring motif of STARD diagnostic-accuracy diagrams. It
is implemented as an exclude() step with inverted label
semantics: the side box reads “Did not receive label” and
the continuing box reads “Received label”, so the main flow
carries those who were assessed. In data mode, criterion is
an unquoted logical expression that is TRUE for participants who
did not receive the test; in manual mode, not_received
gives that count and reasons an optional named breakdown. Chained
assess() steps commonly precede a stratify() split on
the index-test result, with each terminal box reporting its
target-condition breakdown.
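The manual-mode bookkeeping behind chained assess() steps can be checked by hand. The base-R sketch below does not call the package; it only reproduces the arithmetic of a STARD flow using the counts from this page's example, and every variable name in it is illustrative rather than part of the API.

```r
# Illustrative arithmetic only; none of these names belong to the package.
eligible <- 360L
not_received_index <- c("Refused" = 12L, "Contraindicated" = 10L)
not_received_reference <- 18L

# Each assess() step removes those who did not receive the test.
received_index <- eligible - sum(not_received_index)
received_reference <- received_index - not_received_reference

# The strata formed on the index-test result should partition the remainder.
strata <- c("Index test positive" = 150L, "Index test negative" = 170L)
stopifnot(sum(strata) == received_reference)

received_index
received_reference
```

If the final stopifnot() fails for a given set of counts, the stratum sizes do not account for everyone who received both tests.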
The updated selecta object with an assessment step
appended.
exclude for general exclusion steps,
endpoint for the terminal diagnosis boxes (STARD)
Other flow construction functions:
combine(),
endpoint(),
enroll(),
exclude(),
phase(),
sources(),
stratify()
# STARD diagnostic accuracy flow
enroll(n = 360, label = "Eligible patients") |>
  assess("Index test", not_received = 22,
         reasons = c("Refused" = 12, "Contraindicated" = 10)) |>
  assess("Reference standard", not_received = 18) |>
  stratify(labels = c("Index test positive", "Index test negative"),
           n = c(150, 170), label = "Index test result") |>
  endpoint("Final diagnosis",
           breakdown = list(c("Target +" = 130, "Target -" = 20),
                            c("Target +" = 15, "Target -" = 155)))
Returns the dataset remaining after all exclusion criteria have been
applied. When arms are defined via stratify(), the result
is either a single combined data.table or a named list of
per-arm data.table objects. Data mode only.
cohort(.flow, split = FALSE, arm = NULL)
.flow |
A |
split |
Logical. If |
arm |
Character. Name of a specific arm to extract. If supplied,
returns only that arm's |
cohort() replays the exclusion criteria of a data-mode flow
against the original dataset and returns the rows that survive to the
end, so the analyst can pass the exact analyzed population to downstream
modeling. It requires a flow created by supplying data to
enroll(); manual-mode flows carry only counts and therefore
raise an error. For an unsplit flow the result is a single
data.table; after stratify() or allocate(),
split = TRUE returns one table per arm and arm extracts a
single named arm. To inspect the cohort at every intermediate step rather
than only the end, use cohorts().
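The idea of replaying exclusion criteria against the original dataset can be illustrated in base R. This is a sketch under stated assumptions, not the package's implementation: the data frame, its column names, and the criteria list are all made up for the illustration, and a plain data.frame stands in for the data.table that cohort() returns.

```r
# Hypothetical data; column names are assumptions made for this sketch.
patients <- data.frame(
  patient_id   = 1:8,
  eligible     = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE),
  is_duplicate = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
)

# Each criterion is TRUE for the rows that a step removes.
criteria <- list(
  "Ineligible" = quote(eligible == FALSE),
  "Duplicate"  = quote(is_duplicate == TRUE)
)

# Replay the criteria in order, keeping the rows that are not flagged.
final <- patients
for (step in names(criteria)) {
  flagged <- eval(criteria[[step]], envir = final)
  # Rows where the criterion is NA are kept, since only TRUE removes a row.
  final <- final[!(flagged %in% TRUE), , drop = FALSE]
}

nrow(final)
final$patient_id
```

Treating NA as "not flagged" is a choice made for this sketch; how the package handles missing criterion values is not stated here.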
A data.table containing the participants remaining after
all exclusion criteria. When split = TRUE, a named list of
data.tables (one per arm). When arm is specified, a
single-arm data.table.
cohorts for stage-by-stage snapshots,
enroll for initializing a data-mode flow
Other cohort extraction functions:
cohorts()
flow <- enroll(selectaex2, id = "patient_id") |>
  exclude("Ineligible", criterion = eligible == FALSE) |>
  endpoint("Final")

final <- cohort(flow)
nrow(final)
Returns a named list of datasets, one per step of the enrollment flow and keyed by step label, enabling cross-cohort comparisons. Data mode only.
cohorts(.flow)
.flow |
A |
cohorts() replays a data-mode flow and captures the dataset
at every step, returning a named list keyed by step label (with
"_start" for the initial cohort). Each snapshot exposes both the
included and the excluded rows together with their counts,
which is useful for validating a diagram against the data, auditing why
particular participants were dropped, or extracting an intermediate
population. After a stratify() or allocate()
split, the included and excluded elements of a per-arm
step are themselves named lists with one entry per arm; after a factorial
(two-level) split the entries are the cells, keyed
"<parent>: <child>". A manual-mode flow has no underlying data and
therefore raises an error. To obtain only the final analyzed population,
use cohort().
A named list of cohort snapshots, keyed by step label. Each snapshot is itself a list with:
includedA data.table of participants still in
the flow after this step.
excludedA data.table of participants removed at
this step (for exclusion steps; NULL otherwise).
n_includedInteger count of included participants.
n_excludedInteger count of excluded participants (or
NA).
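The shape of one snapshot can be mocked up in base R to make the four elements concrete. The sketch below is an assumption-laden illustration: the snapshot() helper and the sample data are hypothetical, plain data.frames stand in for data.tables, and only the element names are taken from the description above.

```r
# Hypothetical data and step; the element names follow the description above.
patients <- data.frame(
  patient_id = 1:6,
  eligible   = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
)

# Hypothetical constructor for one snapshot (not a package function).
snapshot <- function(included, excluded = NULL) {
  list(
    included   = included,
    excluded   = excluded,
    n_included = nrow(included),
    n_excluded = if (is.null(excluded)) NA_integer_ else nrow(excluded)
  )
}

removed <- patients$eligible == FALSE
stages <- list(
  "_start"     = snapshot(patients),
  "Ineligible" = snapshot(patients[!removed, , drop = FALSE],
                          patients[removed, , drop = FALSE])
)

names(stages)
stages[["Ineligible"]]$n_excluded
stages[["Ineligible"]]$excluded$patient_id
```

Indexing by step label, as in the last two lines, is how an intermediate population or the set of dropped participants would be pulled out for auditing.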
cohort for extracting only the final cohort
Other cohort extraction functions:
cohort()
flow <- enroll(selectaex2, id = "patient_id") |>
  exclude("Ineligible", criterion = eligible == FALSE) |>
  endpoint("Final")

stages <- cohorts(flow)
names(stages)
stages[["Ineligible"]]$n_excluded
Converges all active parallel streams into a single flow. Used to handle
either source convergence or split-and-recombine topologies. After
stratify(), recombines strata that were characterized independently
back into a unified downstream flow.
combine(.flow, label, sublabel = NULL, n = NULL, reasons = NULL)
.flow |
A |
label |
Character string for the merged node. |
sublabel |
Optional character string rendered below |
n |
Integer. Explicit post-merge count (manual mode). If omitted, computed as the sum of all active stream counts. |
reasons |
Optional named integer vector of sub-items displayed below the count (e.g., outcome categories). |
combine() converges the active parallel streams into one node and
is the counterpart to both entry splits. After sources(), it
pools the identification streams of a systematic review; after
stratify() (or allocate()), it recombines strata
that were handled independently, producing a split-and-recombine diagram.
By default, the merged count is the sum of the incoming streams after
any per-arm exclusions applied since the split; an explicit n
overrides this in manual mode. When an explicit n is supplied and
getOption("selecta.check_arithmetic") is TRUE (the default), the
arithmetic is checked and an advisory warning is raised if the
explicit count disagrees with the computed one.
The optional sublabel parameter prints on a second line inside the
merged box, which is convenient for naming the recombined cohort.
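The default merged count can be verified by hand. The base-R sketch below uses the counts from this page's split-and-recombine example and mimics an advisory check with a plain warning(); it does not call the package, and the variable names and the warning text are illustrative assumptions.

```r
# Counts from the split-and-recombine example; names are illustrative.
arms     <- c("Not screened" = 82L, "Screened" = 76L)
excluded <- c("Not screened" = 44L, "Screened" = 66L)

# Default merged count: sum of the streams after per-arm exclusions.
remaining <- arms - excluded
merged <- sum(remaining)

# A hand-rolled advisory check against an explicit post-merge count.
explicit_n <- 48L
if (explicit_n != merged) {
  warning("explicit n (", explicit_n, ") differs from stream total (", merged, ")")
}

remaining
merged
```

Here the explicit count agrees with the computed one, so no warning is raised; changing explicit_n to any other value triggers it.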
The updated selecta object with a combine step
appended. All subsequent steps operate on the single merged stream.
sources for multi-source entry,
stratify for split-and-recombine flows
Other flow construction functions:
assess(),
endpoint(),
enroll(),
exclude(),
phase(),
sources(),
stratify()
# PRISMA: merge identification sources
sources(PubMed = 1234, Embase = 567) |>
  combine("Records after deduplication") |>
  exclude("Records removed", n = 352, show_count = FALSE,
          reasons = c("Duplicates" = 340, "Automation" = 12))

# Split-and-recombine: stratify, then combine
enroll(n = 158) |>
  stratify(labels = c("Not screened", "Screened"), n = c(82, 76),
           label = "Screening status") |>
  exclude("Condition not confirmed", n = c(44, 66)) |>
  combine("Confirmed cohort",
          sublabel = "Participants with confirmed diagnosis") |>
  exclude("Incomplete records", n = 7) |>
  endpoint("Final cohort")
Adds the terminal node(s) to the enrollment flow. If arms have been
defined via stratify(), one endpoint box appears per arm.
endpoint(
  .flow,
  label = "Final Analysis",
  breakdown = NULL,
  groups = NULL,
  n = NULL,
  variable = NULL
)
.flow |
A |
label |
Character string for the final box. With |
breakdown |
Optional named numeric vector (or, for a per-arm endpoint,
a list of them) itemizing the box total into parts printed within
the box, beneath the total. This is the STARD final-diagnosis form, where
each terminal box reports its target-condition composition, e.g.
|
groups |
Optional character vector of group labels (manual mode). When
supplied, the endpoint splits into one separate terminal box per
group, fanning from a shared distributor. Use this for study-design
diagrams that end by displaying the groups to be analyzed (“Group
A”, “Group B”, ...). A split endpoint requires a single incoming
stream; it cannot follow an unrecombined |
n |
Optional numeric vector of per-group counts (manual mode), parallel
to |
variable |
Optional character naming a grouping column (data mode).
Splits the terminal endpoint by that column, one box per level, with
counts tabulated automatically. The data-mode counterpart of
|
endpoint() closes the flow with its terminal node(s) and is usually
the last step in a pipeline. When the flow has been split with
stratify() or allocate() and not recombined, one
endpoint box is drawn per arm, and label and breakdown may be
supplied per arm.
Two mutually exclusive presentations of detail are available.
breakdown itemizes a single box's total as text lines
inside that box (the STARD final-diagnosis form, reporting each box's
target-condition composition). Conversely, groups divides the
endpoint into separate side-by-side boxes, one per group, fanning from a
shared distributor; this design favors study diagrams that end by
displaying the groups to be analyzed. The completed object is then passed
to flowchart(), flowsave(), or recdims().
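Because breakdown itemizes a box's total, its parts should add up to that total. The base-R sketch below checks this for the per-arm STARD example on this page; it does not call the package, and the text format of the itemized lines is an assumption made only for illustration.

```r
# Per-arm totals and breakdowns from the STARD-style example; names are illustrative.
arm_totals <- c("Positive" = 200L, "Negative" = 300L)
breakdown <- list(
  c("Target +" = 160L, "Target -" = 40L),
  c("Target +" = 25L,  "Target -" = 275L)
)

# Each breakdown should sum to the total of the box it itemizes.
parts <- vapply(breakdown, sum, integer(1))
stopifnot(all(parts == arm_totals))

# Text lines as they might appear beneath each total (format is assumed).
lines <- lapply(breakdown, function(b) paste0(names(b), ": ", b))
lines[[1]]
```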
The updated selecta object with an endpoint step appended.
assess for the diagnostic test-receipt steps that
precede a STARD endpoint, flowchart for rendering
Other flow construction functions:
assess(),
combine(),
enroll(),
exclude(),
phase(),
sources(),
stratify()
enroll(n = 300) |>
  exclude("Excluded", n = 40) |>
  endpoint("Included in analysis")

# STARD-style per-arm endpoint with a within-box breakdown
enroll(n = 500) |>
  stratify(labels = c("Positive", "Negative"), n = c(200, 300),
           label = "Index test result") |>
  endpoint("Final diagnosis",
           breakdown = list(c("Target +" = 160, "Target -" = 40),
                            c("Target +" = 25, "Target -" = 275)))

# Split endpoint into separate terminal group boxes (manual)
enroll(n = 300, label = "Eligible cohort") |>
  endpoint("Allocated to study group",
           groups = c("Group A", "Group B", "Group C"),
           n = c(100, 100, 100))

# Split endpoint by a grouping column (data mode)
df <- data.frame(id = 1:300, grp = sample(c("A", "B", "C"), 300, TRUE))
enroll(df, id = "id", label = "Eligible cohort") |>
  endpoint("Allocated to study group", variable = "grp")
Entry point for building an EQUATOR-style enrollment diagram from a single
starting population. Accepts either a data.frame (data mode,
where counts are computed automatically from exclusion expressions) or a
starting count n (manual mode, where counts are supplied explicitly
at each step).
enroll(data = NULL, id = NULL, n = NULL, label = "Study Population")
data |
A |
id |
Character string naming the participant ID column in |
n |
Integer. Starting population count for manual mode. Must be a
non-negative scalar. Ignored when |
label |
Character string for the top-level box in the diagram.
Default is |
enroll() begins every single-source pipeline and fixes the
operating mode for all subsequent steps. Supplying data (with
id) selects data mode, in which later exclude() and
stratify() steps filter and partition the dataset and counts are
derived from the data. Supplying n instead selects
manual mode, in which counts are taken from the numbers given at
each step. The two modes are mutually exclusive, and the resulting object
is intended to be extended with the pipe operator. For diagrams with
several entry sources that converge (PRISMA, MOOSE), use sources()
instead of enroll().
An object of class "selecta" containing the data (if
supplied), mode, starting count, label, and an empty step list.
Subsequent pipeline functions (exclude(), stratify(),
endpoint(), etc.) append steps to this object.
sources for multi-source entry,
exclude for adding exclusion criteria,
flowchart for rendering
Other flow construction functions:
assess(),
combine(),
endpoint(),
exclude(),
phase(),
sources(),
stratify()
# Manual mode
enroll(n = 500, label = "Assessed for eligibility")

# Data mode
enroll(selectaex2, id = "patient_id", label = "Study Population")

# Minimal CONSORT pipeline
enroll(n = 500) |>
  exclude("Ineligible", n = 65) |>
  allocate(labels = c("Treatment", "Control"), n = c(218, 217)) |>
  endpoint("Analyzed")
Appends an exclusion step to the enrollment flow. Participants matching the criteria are removed and shown in a side box. Optionally, itemized sub-reasons can be displayed below the total.
exclude(
  .flow,
  label,
  criterion,
  n = NULL,
  reasons = NULL,
  show_zero = FALSE,
  show_count = FALSE,
  included_label = NULL,
  collapse_singletons = FALSE
)
.flow |
A |
label |
Character. Human-readable description for the side box
(e.g., |
criterion |
An unquoted logical expression evaluated against the
data. Should evaluate to |
n |
Integer. Number of participants removed at this step. After
a |
reasons |
Exclusion sub-reasons. Accepts these forms:
|
show_zero |
Logical. If |
show_count |
Logical. If |
included_label |
Character string (or vector). Optional text for the
box showing the count remaining after exclusion. When provided, a
count box is always rendered regardless of |
collapse_singletons |
Logical. When |
exclude() records participants removed at a step and is the most
common pipeline verb. In data mode, criterion is an unquoted logical
expression evaluated against the dataset (rows for which it is
TRUE are removed) and reasons may name one column (a flat
breakdown) or two columns (a reason and a sub-reason, cross-tabulated into a
two-level breakdown); in manual mode, n gives
the number removed and reasons may be a named numeric vector.
After a stratify() or allocate() split the
exclusion applies per arm, in which case n, reasons, and
included_label accept per-arm vectors or lists. By default the
running count box is suppressed between consecutive exclusions for a
compact diagram; supplying included_label (or
show_count = TRUE) forces a count box to be drawn.
When getOption("selecta.check_arithmetic") is TRUE, the
manual counts of the whole flow are audited together before export: an
over-exclusion, a split or combine whose parts do not match the running
total, and sub-reasons that do not sum to their exclusion total each
raise an advisory warning without altering the figures. The audit runs
whenever the flow is computed; this includes calls to flowchart(),
flowsave(), and summary(), so a single call to any of
these functions reports every discrepancy at once.
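Two of the audited conditions, over-exclusion and sub-reasons that do not sum to their total, can be sketched in base R. The audit_exclusion() helper below is hypothetical and is not the package's implementation; it collects problems for a single step instead of raising warnings, so that several discrepancies can be reported together.

```r
# Hypothetical helper; not the package's implementation.
audit_exclusion <- function(running, n, reasons = NULL) {
  problems <- character(0)
  if (n > running) {
    problems <- c(problems, "over-exclusion: n exceeds the running total")
  }
  if (!is.null(reasons) && sum(reasons) != n) {
    problems <- c(problems, "sub-reasons do not sum to the exclusion total")
  }
  problems
}

# Consistent step, using the counts from the sub-reasons example on this page.
ok <- audit_exclusion(
  running = 500, n = 65,
  reasons = c("Did not meet criteria" = 22, "Ineligible comorbidities" = 18,
              "Declined to participate" = 15, "Lost to follow-up" = 10)
)

# Inconsistent step: the made-up reasons total 60, not 65.
bad <- audit_exclusion(running = 500, n = 65, reasons = c("A" = 30, "B" = 30))

ok
bad
```

Returning a character vector of problems, rather than stopping at the first one, mirrors the idea that the audit is advisory and leaves the figures unchanged.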
Eligibility that is more naturally framed as inclusion fits this same
model: express it as the exclusion of those who fail the criteria, and
use included_label to label the retained count (e.g.,
included_label = "Eligible cohort").
After a stratify() step, both label and
included_label accept character vectors (one element per arm)
for per-arm labeling—useful in observational designs where
attrition mechanisms differ across strata.
The updated selecta object with an exclusion step appended.
assess for assessment/procedure steps (STARD),
enroll for initializing a flow
Other flow construction functions:
assess(),
combine(),
endpoint(),
enroll(),
phase(),
sources(),
stratify()
enroll(n = 500) |>
  exclude("Ineligible", n = 65)

# With sub-reasons (manual)
enroll(n = 500) |>
  exclude("Excluded", n = 65,
          reasons = c("Did not meet criteria" = 22,
                      "Ineligible comorbidities" = 18,
                      "Declined to participate" = 15,
                      "Lost to follow-up" = 10))

# Show intermediate count box (opt-in)
enroll(n = 500) |>
  exclude("Ineligible", n = 65, show_count = TRUE) |>
  exclude("Declined", n = 20) |>
  endpoint("Final")

# Or use included_label (always shows count box)
enroll(n = 500) |>
  exclude("Ineligible", n = 65, included_label = "Eligible") |>
  endpoint("Final")

# Per-arm labels (observational)
enroll(n = 1000) |>
  stratify(labels = c("Exposed", "Unexposed"), n = c(500, 500),
           label = "Classified by exposure") |>
  exclude(c("Treatment discontinued", "Initiated treatment"), n = c(45, 52))

# Per-arm reasons (list of named vectors)
enroll(n = 900) |>
  allocate(labels = c("Drug A", "Placebo"), n = c(450, 450)) |>
  exclude("Discontinued", n = c(30, 25),
          reasons = list(
            c("Adverse event" = 18, "Withdrew consent" = 12),
            c("Adverse event" = 10, "Lost to follow-up" = 15)
          )) |>
  endpoint("Analyzed")

# Compound expression (data mode)
data(selectaex2)
enroll(selectaex2, id = "patient_id") |>
  exclude("Ineligible or duplicate",
          criterion = eligible == FALSE | is_duplicate == TRUE)
Computes counts from the pipeline, lays out nodes, and draws an
EQUATOR-style enrollment diagram. This is the primary rendering
function for interactive use; for saving to file with auto-sized
dimensions, see flowsave().
flowchart(.flow, engine = c("grid", "dot"), count_first = FALSE, ...)

## S3 method for class 'selecta'
plot(x, engine = c("grid", "dot"), ...)
.flow |
A |
engine |
Character. Rendering engine: |
count_first |
Logical. If |
... |
Additional styling and formatting arguments forwarded to the selected engine; arguments an engine does not recognize are ignored. For
For
|
x |
A |
flowchart() is the primary rendering entry point and accepts a
completed pipeline object. The grid engine draws the diagram to
the active graphics device using the grid system and is intended
for publication-quality figures with phase strips, precise dimensions,
and locale-aware counts; the dot engine instead returns a
Graphviz DOT-language string for prototyping or rendering through external
Graphviz tooling, and draws nothing itself. Styling, font, and
number-format options are forwarded to the chosen engine through
...; options unsupported by an engine (for example the phase
strips, which the dot engine does not draw) are ignored. flowchart()
is normally the last call in a pipeline; for direct file output use
flowsave(), and to size a canvas use recdims().
For engine = "grid": invisibly returns the computed graph
structure (a list of nodes, edges, and phases
data.tables). For engine = "dot": returns a DOT-language string.
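For readers unfamiliar with the DOT language, the base-R sketch below assembles a small DOT string by hand. It only illustrates the general form of such a string; the node identifiers, labels, and attributes are assumptions, and the text actually returned by the dot engine will differ.

```r
# Hand-built DOT text for a three-box flow; identifiers and labels are illustrative.
nodes <- c(
  start    = "Assessed for eligibility (n = 500)",
  excluded = "Ineligible (n = 65)",
  final    = "Analyzed (n = 435)"
)
edges <- data.frame(
  from = c("start", "start"),
  to   = c("excluded", "final"),
  stringsAsFactors = FALSE
)

# One statement per node and per edge, wrapped in a digraph block.
node_lines <- sprintf('  %s [label="%s"];', names(nodes), nodes)
edge_lines <- sprintf("  %s -> %s;", edges$from, edges$to)
dot <- paste(
  c("digraph flow {", "  node [shape=box];", node_lines, edge_lines, "}"),
  collapse = "\n"
)

cat(dot, "\n")
```

A string of this kind can be written to a .dot file and rendered with external Graphviz tooling.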
flowsave for saving to file,
recdims for dimension recommendations,
plot.selecta for S3 plot method
Other flowchart output functions:
flowsave(),
print.selecta(),
recdims(),
summary.selecta()
# Build a flow once, then render it. Most of the package's pipeline
# functions are modular and intended to be composed like this rather
# than run in isolation; see the vignettes for fuller treatments.
flow <- enroll(n = 1200) |>
  phase("Enrollment") |>
  exclude("Excluded", n = 150,
          reasons = c("Did not meet criteria" = 55,
                      "Declined to participate" = 48,
                      "Other reasons" = 47)) |>
  phase("Allocation") |>
  allocate(labels = c("Treatment", "Control"), n = c(520, 530)) |>
  phase("Analysis") |>
  endpoint("Final Analysis")

# The "dot" engine returns a Graphviz DOT string and draws nothing,
# so it runs anywhere without opening a graphics device.
dot <- flowchart(flow, engine = "dot")
substr(dot, 1, 50)

# The "grid" engine draws to the active graphics device. These calls are
# guarded with interactive() so they render in an interactive session but
# are skipped during non-interactive documentation builds, where the
# diagram cannot be sized to the page and would render incorrectly.
if (interactive()) {
  flowchart(flow)  # draws to the active device
  plot(flow)       # plot() is a thin wrapper around flowchart()

  # Locale-aware counts: a European thousands separator.
  enroll(n = 12500) |>
    exclude("Excluded", n = 1450) |>
    endpoint("Analyzed") |>
    flowchart(number_format = "eu")
}
Renders the enrollment diagram and saves it to a file. Supported
formats are PDF, PNG, SVG, and TIFF (inferred from the file
extension). The grid engine renders via R graphics devices; the
dot engine pipes Graphviz output through the system dot
binary. Dimensions are computed automatically from diagram content via
recdims() unless overridden.
flowsave(
  x,
  file,
  engine = c("grid", "dot"),
  width = NULL,
  height = NULL,
  dpi = 300,
  sans_serif = TRUE,
  ...
)
x |
A |
file |
Character string. Output file path. The format is inferred
from the file extension. Supported extensions: |
engine |
Character string. One of |
width |
Numeric or |
height |
Numeric or |
dpi |
Integer. Resolution in dots per inch for raster formats
(PNG, TIFF). Default 300. Honored by both engines. Mirrors the
|
sans_serif |
Logical. |
... |
Additional styling and formatting arguments forwarded to the
selected engine; see
|
flowsave() renders a flow directly to a file, inferring the format
from the extension and choosing dimensions automatically unless
width and height are given. With engine = "grid" it
draws through R's graphics devices, producing either vector formats
(.pdf, .svg) or raster formats (.png, .tiff).
For raster formats, flowsave() prefers the ragg device when
installed, with fallback to the base png()/tiff() devices
otherwise. Using these devices is generally advised for raster output
over other devices such as cairo since some cairo configurations drop
the plotmath italics in the count labels. The dpi argument mirrors
ggplot2::ggsave() for raster resolution.
With engine = "dot", flowsave() renders a graphic based on
a Graphviz DOT string: a .dot extension writes the source text
directly and needs no external software, whereas image output shells out
to the system dot binary and therefore requires Graphviz on the
PATH.
When sizing automatically, flowsave() calls recdims()
once and reuses the computed layout, so a separate recdims() call
is unnecessary. With the grid engine, leaving either dimension at
its default also reports the content-derived recommendation through a
message(); supply both width and height to size
manually and silence it. The dot engine instead lets Graphviz size
the output from the layout, so no recommendation is reported.
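Inferring the output format from a file extension can be sketched with base R's tools::file_ext(). The infer_format() helper below is hypothetical and is not part of the package; its list of extensions only restates the formats named on this page, and whether the package accepts other spellings (for example .tif) is not assumed.

```r
# Hypothetical helper; the extension list only restates the formats named above.
infer_format <- function(file) {
  ext <- tolower(tools::file_ext(file))
  supported <- c("pdf", "png", "svg", "tiff", "dot")
  if (!ext %in% supported) {
    stop("unsupported extension: '", ext, "'")
  }
  kind <- if (ext %in% c("png", "tiff")) {
    "raster"   # resolution (dpi) matters for these
  } else if (ext == "dot") {
    "source"   # DOT text, described above as needing no external software
  } else {
    "vector"
  }
  list(format = ext, kind = kind)
}

infer_format(file.path(tempdir(), "consort.PNG"))
infer_format("consort.dot")
```

Lower-casing the extension before matching is a design choice of this sketch, so that "consort.PNG" and "consort.png" are treated alike.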
Invisibly returns the output file path.
flowchart for interactive rendering,
recdims for dimension recommendations
Other flowchart output functions:
flowchart(),
print.selecta(),
recdims(),
summary.selecta()
flow <- enroll(n = 500) |>
  exclude("Ineligible", n = 50) |>
  endpoint("Analysis")

# Grid engine (default). Files are written under tempdir() here so
# the example respects CRAN's no-write policy; in practice any
# desired path may be supplied.
flowsave(flow, file.path(tempdir(), "consort.pdf"))
flowsave(flow, file.path(tempdir(), "consort.png"), width = 8, height = 10)

# DOT engine writing a .dot source file requires no external software.
flowsave(flow, file.path(tempdir(), "consort.dot"), engine = "dot")

# Image output from the DOT engine (.svg, .png, .pdf) requires the
# Graphviz 'dot' binary on the system PATH.
if (nzchar(Sys.which("dot"))) {
  flowsave(flow, file.path(tempdir(), "consort.svg"), engine = "dot")

  # DOT engine with Times typography for serif environments.
  flowsave(flow, file.path(tempdir(), "consort_times.svg"), engine = "dot",
           font_family = "Times-Roman", sans_serif = FALSE)
}
Adds a vertical phase label to the left margin of the diagram
(e.g., "Enrollment", "Allocation",
"Follow-up", "Analysis"). Phase labels span all
subsequent steps until the next phase() call or the end of
the flow.
phase(.flow, label)
.flow |
A |
label |
Character string. The phase label, rendered as rotated text on the left margin. |
phase() inserts a stage boundary rather than a flow node. Each
call opens a phase whose label is drawn in the left margin, spanning
every subsequent step until the next phase() or the end of the
flow. The purpose of these phase markers is to reflect the stages of
analysis in the diagram; as such, they are purely presentational, and
they do not alter counts or topology. In the grid engine,
phase labels are rendered vertically and are wrapped to fit their band
by default; conversely, the dot engine renders phase labels
horizontally due to engine limitations.
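The rule that a phase spans every subsequent step until the next marker can be expressed with a running count of markers. The base-R sketch below works on a hypothetical step table and is not the package's internal representation; in particular, leaving a step that precedes the first marker without a phase is an assumption of this sketch.

```r
# Hypothetical step table; "phase" rows mark boundaries, the rest are flow steps.
steps <- data.frame(
  type  = c("enroll", "phase", "exclude", "phase", "stratify", "phase", "endpoint"),
  label = c("Study Population", "Enrollment", "Ineligible", "Allocation",
            "Arms", "Analysis", "Final Analysis"),
  stringsAsFactors = FALSE
)

# A running count of markers gives, for each row, the most recent phase.
is_phase <- steps$type == "phase"
phase_index <- cumsum(is_phase)
phase_labels <- steps$label[is_phase]
steps$phase <- ifelse(phase_index == 0,
                      NA_character_,
                      phase_labels[pmax(phase_index, 1)])

# Markers are boundaries, not nodes, so they are dropped from the result.
flow_steps <- steps[!is_phase, c("label", "phase")]
flow_steps
```

Dropping the marker rows leaves the same number of flow steps as before, which reflects that phases do not alter counts or topology.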
The updated selecta object with a phase marker
appended.
flowchart for rendering with phase labels
Other flow construction functions:
assess(),
combine(),
endpoint(),
enroll(),
exclude(),
sources(),
stratify()
    # Phase labels divide a flow into labeled stages. The printed summary
    # marks each phase with a "--- Label ---" banner.
    enroll(n = 1200, label = "Records identified") |>
      phase("Enrollment") |>
      exclude("Duplicates", n = 84) |>
      phase("Allocation") |>
      stratify(labels = c("Drug A", "Placebo"), n = c(520, 533)) |>
      phase("Follow-up") |>
      exclude("Lost to follow-up", n = c(23, 31)) |>
      phase("Analysis") |>
      endpoint("Final Analysis")
Displays a concise text summary of the pipeline steps and their
parameters. Intended for interactive inspection of a selecta
object before rendering.
    ## S3 method for class 'selecta'
    print(x, ...)
| x | A |
|---|---|
| ... | Ignored. |
The print method gives a compact, text-only view of a
selecta object for interactive inspection before rendering. It
lists the operating mode, the starting count, and each pipeline step with
its key parameters (exclusion reasons, arm labels, endpoint sub-items),
and marks phase boundaries with a "--- Label ---" banner. It does
not draw the diagram or open a graphics device; for that use
flowchart() or flowsave().
Invisibly returns x.
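A small sketch of what the invisible return value means in practice: the summary is printed once, and the returned object can still be captured or passed on. The flow is illustrative; withVisible() is base R.

```r
library(selecta)

flow <- enroll(n = 500) |>
  exclude("Ineligible", n = 65) |>
  endpoint("Analyzed")

# print() writes the text summary and returns its argument invisibly,
# so capturing the result does not trigger a second printout.
res <- withVisible(print(flow))

res$visible                 # expected FALSE, per the documented return value
identical(res$value, flow)  # the returned object should be the flow itself
```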
summary.selecta for a tabular per-node summary,
flowchart for rendering
Other flowchart output functions:
flowchart(),
flowsave(),
recdims(),
summary.selecta()
    flow <- enroll(n = 500) |>
      exclude("Ineligible", n = 65,
              reasons = c("No consent" = 30, "Under 18" = 35)) |>
      allocate(labels = c("Drug A", "Placebo"), n = c(218, 217)) |>
      endpoint("Analyzed")

    flow
Computes recommended width and height in inches based on diagram content. A throwaway graphics device is opened to obtain accurate text measurements, then closed immediately.
    recdims(
      x,
      vpad = getOption("selecta.vpad", 0.25),
      pad = 0.08,
      line_height = 0.2,
      count_first = FALSE,
      cex = 0.85,
      cex_side = NULL,
      cex_phase = 0.9,
      phase_width = 0.22,
      margin = 0.25,
      phase_multiline = TRUE,
      phase_max_lines = 3L,
      font_family = "Helvetica",
      number_format = NULL,
      ...,
      .measure_dev = NULL,
      .return_graph = FALSE
    )
| x | A |
|---|---|
| vpad | Numeric. Vertical spacing between elements in inches. Default 0.25; override globally with |
| pad | Numeric. Internal padding within boxes in inches. Default 0.08. |
| line_height | Numeric. Vertical line spacing in inches. Default 0.20. |
| count_first | Logical. If |
| cex | Numeric. Font size multiplier for main text. Default 0.85. |
| cex_side | Numeric. Font size multiplier for side box text. Defaults to the value of |
| cex_phase | Numeric. Font size multiplier for phase labels. Default 0.9. |
| phase_width | Numeric. Width of phase label boxes in inches. Default 0.22. |
| margin | Numeric. Fixed margin on all four sides in inches. Default 0.25. |
| phase_multiline | Logical. If |
| phase_max_lines | Integer. Maximum wrapped lines per phase label when wrapping is active. Default 3. |
| font_family | Character. Font family for text measurement. Default |
| number_format | Character string or two-element character vector. Locale-aware count formatter passed through to |
| ... | Additional arguments. Styling-only parameters that do not affect text measurement (such as |
| .measure_dev | Optional zero-argument function that opens a graphics device for text measurement, matching the device that will render the diagram. When |
| .return_graph | Logical. If |
recdims() computes the canvas size a flow needs at a given
typography and layout, so the figure is neither clipped nor surrounded by
excess whitespace. It lays the diagram out and measures it on a throwaway
graphics device, returning width and height in inches without drawing
anything visible. Because text metrics are font- and device-dependent,
any sizing parameter passed here (cex, font_family,
phase_multiline, number_format, and so on) should match the
values used at render time; styling-only parameters are ignored so the
same call can be shared across recdims(), flowchart(),
and flowsave(). The advanced .measure_dev argument
supplies a custom device opener when measurement must match a non-default
device. flowsave() calls recdims() internally when
width or height is left unspecified, so explicit use is
only needed when the dimensions themselves are wanted.
A named numeric vector with elements width and
height (in inches), rounded up to the nearest tenth.
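The following sketch shows one way the measured size might be reused when the dimensions themselves are wanted, with the sizing parameter kept identical between measurement and rendering. Passing cex to flowsave() is an assumption drawn from the note that the same call can be shared across recdims(), flowchart(), and flowsave(); the file name is illustrative.

```r
library(selecta)

flow <- enroll(n = 500) |>
  exclude("Ineligible", n = 65) |>
  allocate(labels = c("Drug A", "Placebo"), n = c(220, 215)) |>
  endpoint("Analyzed")

# Measure once; the result is a named numeric vector (width, height) in inches.
dims <- recdims(flow, cex = 1)
dims

# Reuse the measured size when saving, with the same cex as at measurement.
# (Assumption: flowsave() forwards cex to the renderer.)
out <- file.path(tempdir(), "sized.pdf")
flowsave(flow, out,
         width = dims[["width"]], height = dims[["height"]], cex = 1)
```

When only the saved file matters, leaving width and height unspecified lets flowsave() perform this measurement itself.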
flowsave for saving to file,
flowchart for interactive rendering
Other flowchart output functions:
flowchart(),
flowsave(),
print.selecta(),
summary.selecta()
    flow <- enroll(n = 500) |>
      exclude("Ineligible", n = 65) |>
      allocate(labels = c("Drug A", "Placebo"), n = c(220, 215)) |>
      endpoint("Analyzed")

    recdims(flow)
A synthetic dataset of 3,000 patients in an observational study with no treatment arms. Includes eligibility flags, exclusion reasons, and follow-up loss indicators suitable for demonstrating STROBE-style enrollment diagrams in data mode.
    selectaex0
A data.table with 3,000 rows and the following columns:
Unique patient identifier.
Logical. Whether the record is a duplicate.
Logical. Whether the patient meets eligibility criteria.
Character. Reason for exclusion, if applicable.
Logical. Whether the patient was lost to follow-up.
Character. Reason for follow-up loss, if applicable.
    data(selectaex0)
    str(selectaex0)
A synthetic dataset of 2,400 patients in a two-arm randomized controlled trial. Includes screening, eligibility, treatment assignment, and discontinuation variables suitable for demonstrating CONSORT-style enrollment diagrams in data mode.
    selectaex2
A data.table with 2,400 rows and the following columns:
Unique patient identifier.
Logical. Whether the record is a duplicate.
Logical. Whether the patient meets eligibility criteria.
Character. Reason for exclusion, if applicable.
Character. Treatment arm assignment (e.g., "Drug A", "Placebo").
Logical. Whether the patient discontinued the study.
Character. Reason for discontinuation, if applicable.
    data(selectaex2)
    str(selectaex2)
    table(selectaex2$treatment)
A synthetic dataset of 2,400 patients in a three-arm randomized
controlled trial. Structure matches selectaex2 with an
additional treatment arm.
    selectaex3
A data.table with 2,400 rows. See selectaex2
for column descriptions.
    data(selectaex3)
    str(selectaex3)
    table(selectaex3$treatment)
A synthetic dataset of 3,600 patients in a six-arm dose-finding
trial. Structure matches selectaex2 with six treatment
arms.
    selectaex6
A data.table with 3,600 rows. See selectaex2
for column descriptions.
    data(selectaex6)
    str(selectaex6)
    table(selectaex6$treatment)
Entry point for flows that begin with multiple parallel identification streams, such as systematic review diagrams. Each named argument defines a source group (column). Individual databases or registers within each group are listed as sub-items inside a single box, mirroring the format of exclusion reasons.
    sources(..., headers = NULL)
| ... | Named integer vectors specifying sources. Each argument name identifies a group and its named elements are individual sources (e.g., |
|---|---|
| headers | Named character vector mapping group names to column header labels. For example, |
sources() initializes a multi-source flow of the kind used in the
identification stage of systematic-review diagrams (PRISMA, MOOSE), where
records arrive from several origins and are pooled. Counts are supplied
as named numeric values; passing named vectors instead of scalars groups
the sources into labeled columns, and at most three groups are
supported, matching the standard PRISMA layout. A sources() flow
operates in manual mode and is normally followed by combine()
to merge the streams into a single downstream node. For a conventional
single-entry study, use enroll() instead.
An object of class "selecta" with a sources step
pre-loaded. The total starting count is the sum of all source counts
across all groups.
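The starting-count rule can be checked by hand. The sketch below uses only base R and the counts from the grouped two-column sources() example; it illustrates the arithmetic and is not the package's internal code.

```r
# Source counts from the grouped two-column sources() example.
groups <- list(
  databases = c("PubMed" = 1234, "Embase" = 567, "CENTRAL" = 89),
  other     = c("Citation search" = 55, "Websites" = 34)
)

# Per-group subtotals, then the pooled starting count across all groups.
group_totals <- vapply(groups, sum, numeric(1))
start_total  <- sum(group_totals)

group_totals  # databases = 1890, other = 89
start_total   # 1979
```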
enroll for single-source entry,
combine to merge parallel streams into a single flow
Other flow construction functions:
assess(),
combine(),
endpoint(),
enroll(),
exclude(),
phase(),
stratify()
    # Simple multi-source (one column, no header)
    sources(PubMed = 1234, Embase = 567, CENTRAL = 89)

    # Grouped sources (PRISMA two-column layout)
    sources(
      databases = c("PubMed" = 1234, "Embase" = 567, "CENTRAL" = 89),
      other = c("Citation search" = 55, "Websites" = 34)
    )

    # Three columns with custom headers
    sources(
      previous = c("Previous review" = 12, "Previous reports" = 15),
      databases = c("PubMed" = 1234, "Embase" = 567, "CENTRAL" = 89),
      other = c("Citation search" = 55, "Websites" = 34),
      headers = c(previous = "Previous studies",
                  databases = "Databases and registers",
                  other = "Other methods")
    ) |>
      combine("Records after deduplication") |>
      exclude("Records removed", n = 352, show_count = FALSE,
              reasons = c("Duplicates" = 340, "Marked ineligible" = 12))
Divides the enrollment flow into parallel arms. This is the primary
function for splitting a population by any characteristic: treatment
assignment, exposure status, diagnostic test result, etc. Subsequent
exclude() calls apply within each arm independently.
allocate() is provided as a convenience alias with default label
"Randomized", suitable for interventional trials (CONSORT).
    stratify(.flow, variable = NULL, labels = NULL, n = NULL, label = "Stratified")

    allocate(.flow, variable = NULL, labels = NULL, n = NULL, label = "Randomized")
| .flow | A |
|---|---|
| variable | Character string naming the column that defines the arms. Data mode only. |
| labels | A character vector of arm labels. In data mode, this can be a named vector to relabel factor levels (e.g., |
| n | Integer vector. Number of participants in each arm, in the same order as |
| label | Character string for the split box. Defaults to |
stratify() splits the flow into parallel arms, after which each
exclude() (and the eventual endpoint()) applies
within every arm. In data mode, variable names a column whose
levels define the arms, optionally relabeled through a named
labels vector; in manual mode, labels and n give the
arm names and per-arm counts directly.
allocate() is an identical alias differing only in its default
label ("Randomized"), provided so that interventional
trials (CONSORT) read naturally; both record the same step type.
Parallel arms may later be merged with combine() to form a
split-and-recombine diagram, and a flow may be split again after
combining. A second stratify() or allocate() before
combining produces a factorial (two-level) split, supported in both
data and manual modes.
The updated selecta object with a stratification step
appended. All subsequent pipeline steps operate independently within
each arm.
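A minimal manual-mode sketch of the split-and-recombine pattern: two arms are created, a per-arm exclusion is applied, and the arms are merged again before the endpoint. The labels and counts are illustrative, and calling combine() with a single label follows its use in the sources() example; its full signature is not shown in this section.

```r
library(selecta)

flow <- enroll(n = 400) |>
  allocate(labels = c("Drug A", "Placebo"), n = c(200, 200)) |>
  exclude("Lost to follow-up", n = c(10, 15)) |>  # one count per arm
  combine("Pooled for analysis") |>               # merge the arms again
  endpoint("Analyzed")

flow           # text summary of the pipeline steps
summary(flow)  # one row per computed node
```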
exclude for per-arm exclusions after splitting,
endpoint for per-arm endpoints
Other flow construction functions:
assess(),
combine(),
endpoint(),
enroll(),
exclude(),
phase(),
sources()
    # Observational study (STROBE)
    enroll(n = 3860) |>
      stratify(labels = c("Exposed", "Unexposed"),
               n = c(1900, 1960),
               label = "Classified by exposure")

    # Randomized trial (CONSORT)
    enroll(n = 400) |>
      allocate(labels = c("Drug A", "Placebo"), n = c(200, 200))
Computes all counts from the pipeline and returns a data.table
summarizing each node in the diagram.
    ## S3 method for class 'selecta'
    summary(object, ...)
| object | A |
|---|---|
| ... | Ignored. |
The summary method runs the same count computation that underlies
rendering and returns the result as a clean data.table, one row per
node, rather than drawing anything. This is convenient for programmatic
checks (confirming arm totals, extracting the final analyzed count) and
for embedding flow counts in tables or reports. The returned object is a
plain data.table and may be filtered or joined like any other. For
a human-readable console view use print.selecta(); to render
the diagram use flowchart().
A data.table with columns phase, role,
arm, text, and n. Each row corresponds to one
node in the computed diagram.
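A short sketch of a programmatic check on the summary table. It relies only on the documented column names; the values that the role column takes are not listed in this manual, so the sketch inspects them rather than filtering on a guessed value.

```r
library(selecta)

flow <- enroll(n = 500) |>
  exclude("Ineligible", n = 65) |>
  allocate(labels = c("Drug A", "Placebo"), n = c(218, 217)) |>
  endpoint("Analyzed")

s <- summary(flow)

# Confirm the documented columns are present before relying on them.
stopifnot(all(c("phase", "role", "arm", "text", "n") %in% names(s)))

# See which role values occur before filtering on any of them.
unique(s$role)

# Keep only the columns needed for a report table. Converting to a plain
# data.frame avoids depending on data.table subsetting syntax.
report <- as.data.frame(s)[, c("arm", "text", "n")]
report
```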
print.selecta for a console summary,
flowchart for rendering
Other flowchart output functions:
flowchart(),
flowsave(),
print.selecta(),
recdims()
    flow <- enroll(n = 500) |>
      exclude("Ineligible", n = 65) |>
      allocate(labels = c("Drug A", "Placebo"), n = c(218, 217)) |>
      endpoint("Analyzed")

    summary(flow)