| Title: | Tidyverse-Compatible Fragility Index Calculations |
|---|---|
| Description: | Provides optimized, Tidyverse-compatible functions for calculating the Fragility Index and Reverse Fragility Index for 2x2 contingency tables from clinical trials. Uses customized hypergeometric and algebraic calculations along with binary search algorithms to achieve substantial speedups over standard implementations, with seamless integration into 'dplyr' pipelines. |
| Authors: | Tom Drake [aut, cre] |
| Maintainer: | Tom Drake <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.0 |
| Built: | 2026-06-25 19:57:01 UTC |
| Source: | https://github.com/cran/FragiliTidy |
Implements the Continuous Fragility Index (CFI) of Caldwell, Youssefzadeh and Limpisvasti (J Clin Epidemiol 2021;136:20-25) for two-arm trials with a continuous outcome compared via Welch's t-test.
The CFI is the minimum number of substitution iterations required to drive a significant Welch t-test result to non-significance, where on each iteration the data point in the higher-mean group that lies closest to but still above that group's mean is moved into the lower-mean group.
When raw data are unavailable, datasets matching the supplied summary
statistics are generated by rejection sampling and the procedure is
repeated n_sim times; the mean CFI across simulations is returned.
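
The substitution procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `cfi_sketch` is a hypothetical name, the higher-mean arm is fixed once at the start, and the loop stops early if fewer than three points would remain in that arm.

```r
# Illustrative sketch only; `cfi_sketch` is a hypothetical helper, not a package function.
cfi_sketch <- function(x, y, conf.level = 0.95) {
  alpha <- 1 - conf.level
  # Orient the arms so that `hi` is the higher-mean group.
  if (mean(x) >= mean(y)) {
    hi <- x
    lo <- y
  } else {
    hi <- y
    lo <- x
  }
  moves <- 0L
  # t.test() defaults to var.equal = FALSE, i.e. Welch's t-test.
  # Keep at least two points in `hi` so the test stays defined.
  while (length(hi) > 2L && t.test(hi, lo)$p.value < alpha) {
    above <- which(hi > mean(hi))
    if (length(above) == 0L) break
    # The smallest value above the mean is the one closest to it.
    pick <- above[which.min(hi[above])]
    lo <- c(lo, hi[pick])   # move the point into the lower-mean arm
    hi <- hi[-pick]
    moves <- moves + 1L
  }
  moves
}

set.seed(1)
x <- rnorm(50, 70, 10)
y <- rnorm(50, 50, 10)
cfi_sketch(x, y)
```

A result of 0 means the baseline Welch test was already non-significant, matching the convention stated for the raw-data function.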
Adds a Continuous Fragility Index column to a data frame of trial summary statistics. Supports tidy evaluation.
continuous_fragility_index(data, mean1, sd1, n1, mean2, sd2, n2, conf.level = 0.95, n_sim = 5L, tol_mean = 0.01, tol_sd = 0.01, col_name = "continuous_fragility_index")
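
| Argument | Description |
|---|---|
| `data` | A data frame or tibble. |
| `mean1`, `sd1`, `n1` | Unquoted column names for arm 1 mean, SD, and sample size. |
| `mean2`, `sd2`, `n2` | Unquoted column names for arm 2 mean, SD, and sample size. |
| `conf.level`, `n_sim`, `tol_mean`, `tol_sd` | Same meaning as in `continuous_fragility_index_summary()`. |
| `col_name` | Name of the output column (default `"continuous_fragility_index"`). |
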
The input data frame with an additional CFI column.
Direct (non-simulated) CFI when raw per-patient outcomes are available.
continuous_fragility_index_raw(x, y, conf.level = 0.95)

| Argument | Description |
|---|---|
| `x`, `y` | Numeric vectors of outcome values for arms 1 and 2. |
| `conf.level` | Confidence level for the Welch t-test (default `0.95`). |

A single integer: the number of substitution iterations required to
lose significance, or 0 if the baseline test was non-significant.
set.seed(1)
x <- rnorm(50, 70, 10)
y <- rnorm(50, 50, 10)
continuous_fragility_index_raw(x, y)
Calculates the Continuous Fragility Index (Caldwell et al. 2021) from two-arm summary statistics by simulating compatible datasets and applying the iterative substitution algorithm.
continuous_fragility_index_summary(mean1, sd1, n1, mean2, sd2, n2, conf.level = 0.95, n_sim = 5L, tol_mean = 0.01, tol_sd = 0.01, seed = NULL)

| Argument | Description |
|---|---|
| `mean1`, `sd1`, `n1` | Mean, standard deviation, and sample size of arm 1. |
| `mean2`, `sd2`, `n2` | Mean, standard deviation, and sample size of arm 2. |
| `conf.level` | Confidence level for the Welch t-test (default `0.95`). |
| `n_sim` | Number of simulated datasets to average over (default `5L`). |
| `tol_mean`, `tol_sd` | Relative tolerances for rejection sampling. |
| `seed` | Optional integer seed for reproducibility. |

A single numeric value: the mean CFI across n_sim simulations.
Returns 0 if the baseline Welch test is already non-significant and
NA_real_ if any input is missing.
Caldwell JE, Youssefzadeh K, Limpisvasti O. A method for calculating the fragility index of continuous outcomes. J Clin Epidemiol 2021;136:20-25.
continuous_fragility_index_summary(mean1 = 70, sd1 = 10, n1 = 100, mean2 = 50, sd2 = 10, n2 = 100, seed = 1)
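
The rejection-sampling step used when only summary statistics are available can be illustrated as follows. This is a sketch under assumptions rather than the package's code: `simulate_arm` and `max_tries` are hypothetical names, candidates are drawn from a normal distribution, and a draw is accepted when its sample mean and SD fall within the relative tolerances of the targets.

```r
# Illustrative sketch only; `simulate_arm` and `max_tries` are hypothetical.
simulate_arm <- function(n, target_mean, target_sd,
                         tol_mean = 0.01, tol_sd = 0.01, max_tries = 1e5) {
  for (i in seq_len(max_tries)) {
    x <- rnorm(n, target_mean, target_sd)
    # Relative tolerances: the allowed deviation scales with the target value.
    ok_mean <- abs(mean(x) - target_mean) <= tol_mean * abs(target_mean)
    ok_sd <- abs(sd(x) - target_sd) <= tol_sd * target_sd
    if (ok_mean && ok_sd) return(x)
  }
  stop("no acceptable sample found within max_tries")
}

set.seed(1)
arm1 <- simulate_arm(100, 70, 10)
c(mean = mean(arm1), sd = sd(arm1))
```

Because the mean tolerance is relative, a target mean of exactly 0 would never be matched by this sketch; a real implementation would need to handle that case separately.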
Vectorised wrapper around continuous_fragility_index_summary() for use
inside dplyr::mutate().
continuous_fragility_index_vec(mean1, sd1, n1, mean2, sd2, n2, conf.level = 0.95, n_sim = 5L, tol_mean = 0.01, tol_sd = 0.01)
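
| Argument | Description |
|---|---|
| `mean1`, `sd1`, `n1`, `mean2`, `sd2`, `n2` | Numeric vectors of summary statistics. |
| `conf.level`, `n_sim`, `tol_mean`, `tol_sd` | Same meaning as in `continuous_fragility_index_summary()`. |
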
A numeric vector of CFI values.
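
The vectorised functions take equal-length vectors of summary statistics and return one value per element. The same calling pattern can be shown in base R with the baseline Welch test that determines whether a CFI is 0. In this sketch `welch_p_from_summary` is a hypothetical helper, not a package function; it uses the standard Welch statistic and Welch-Satterthwaite degrees of freedom, and missing inputs propagate as `NA`.

```r
# Illustrative sketch only; `welch_p_from_summary` is a hypothetical helper.
welch_p_from_summary <- function(mean1, sd1, n1, mean2, sd2, n2) {
  v1 <- sd1^2 / n1                      # squared standard error, arm 1
  v2 <- sd2^2 / n2                      # squared standard error, arm 2
  t_stat <- (mean1 - mean2) / sqrt(v1 + v2)
  # Welch-Satterthwaite approximation to the degrees of freedom.
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  2 * pt(-abs(t_stat), df)              # two-sided p-value, elementwise
}

trials <- data.frame(
  mean1 = c(70, 55, NA), sd1 = c(10, 10, 10), n1 = c(100, 30, 40),
  mean2 = c(50, 50, 50), sd2 = c(10, 10, 10), n2 = c(100, 30, 40)
)
trials$p_welch <- with(trials, welch_p_from_summary(mean1, sd1, n1, mean2, sd2, n2))
trials$p_welch < 0.05   # which rows are significant at baseline
```

Every operation in the helper is elementwise, so it works unchanged on whole columns, which is the property that makes a function usable inside `dplyr::mutate()`.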
Computes the fragility index for columns in a data frame.
Supports tidy evaluation and integrates with %>% or |>.
fragility_index(data, intervention_event, control_event, intervention_n, control_n, conf.level = 0.95, verbose = FALSE, col_name = "fragility_index")

| Argument | Description |
|---|---|
| `data` | A data frame or tibble. |
| `intervention_event` | Column name (unquoted) for the intervention events. |
| `control_event` | Column name (unquoted) for the control events. |
| `intervention_n` | Column name (unquoted) for the intervention group totals. |
| `control_n` | Column name (unquoted) for the control group totals. |
| `conf.level` | Confidence level (default `0.95`). Can be a number or a column name. |
| `verbose` | Logical; if `TRUE`, returns a nested list-column with p-values for each iteration. |
| `col_name` | Name of the output column. Default is `"fragility_index"`. |

The original data frame with an added column for the fragility index.
Calculates the fragility index for vector inputs. This is useful for running
inside dplyr::mutate().
fragility_index_vec(intervention_event, control_event, intervention_n, control_n, conf.level = 0.95, verbose = FALSE)

| Argument | Description |
|---|---|
| `intervention_event` | Vector of events in the intervention group. |
| `control_event` | Vector of events in the control group. |
| `intervention_n` | Vector of total patients in the intervention group. |
| `control_n` | Vector of total patients in the control group. |
| `conf.level` | Confidence level (default `0.95`); the significance threshold is `1 - conf.level`. |
| `verbose` | Logical indicating whether the full progression of p-values should be returned. |

A numeric vector of fragility indices (if verbose = FALSE), or a list
of tibbles containing step-by-step p-values (if verbose = TRUE).
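
For orientation, a plain linear-search version of the fragility index can be written with `stats::fisher.test()`. This is a sketch under assumptions, not the package's optimized algorithm: `fi_sketch` is a hypothetical name, significance is judged by the two-sided Fisher exact test, and non-events are flipped to events one at a time in the arm that starts with fewer events.

```r
# Illustrative sketch only; `fi_sketch` is a hypothetical helper.
fi_sketch <- function(intervention_event, control_event,
                      intervention_n, control_n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  events <- c(intervention_event, control_event)
  totals <- c(intervention_n, control_n)
  arm <- which.min(events)   # arm with fewer events (ties: the first arm)
  # 2x2 table: row 1 = events, row 2 = non-events, one column per arm.
  p_value <- function(ev) fisher.test(rbind(ev, totals - ev))$p.value
  flips <- 0L
  while (p_value(events) < alpha && events[arm] < totals[arm]) {
    events[arm] <- events[arm] + 1   # convert one non-event into an event
    flips <- flips + 1L
  }
  flips
}

fi_sketch(intervention_event = 1, control_event = 20,
          intervention_n = 100, control_n = 100)
```

A return value of 0 indicates that the trial was not significant to begin with.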
Adds a reverse Continuous Fragility Index column to a data frame of trial summary statistics.
reverse_continuous_fragility_index(data, mean1, sd1, n1, mean2, sd2, n2, conf.level = 0.95, n_sim = 5L, tol_mean = 0.01, tol_sd = 0.01, max_iter = 10000L, col_name = "reverse_continuous_fragility_index")
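
| Argument | Description |
|---|---|
| `data` | A data frame or tibble. |
| `mean1`, `sd1`, `n1` | Unquoted column names for arm 1 summary statistics. |
| `mean2`, `sd2`, `n2` | Unquoted column names for arm 2 summary statistics. |
| `conf.level`, `n_sim`, `tol_mean`, `tol_sd`, `max_iter` | Same meaning as in `reverse_continuous_fragility_index_summary()`. |
| `col_name` | Output column name (default `"reverse_continuous_fragility_index"`). |
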
The input data frame with an additional reverse CFI column.
Estimates how many additional participants per arm would have been required to render a non-significant Welch t-test significant, given two-arm summary statistics. This is a continuous-outcome analogue of the reverse fragility index for dichotomous outcomes: a measure of how far a non-significant trial was from significance, expressed in participants per arm.
reverse_continuous_fragility_index_summary(mean1, sd1, n1, mean2, sd2, n2, conf.level = 0.95, n_sim = 5L, tol_mean = 0.01, tol_sd = 0.01, max_iter = 10000L, seed = NULL)

| Argument | Description |
|---|---|
| `mean1`, `sd1`, `n1` | Mean, standard deviation, and sample size of arm 1. |
| `mean2`, `sd2`, `n2` | Mean, standard deviation, and sample size of arm 2. |
| `conf.level` | Confidence level for the Welch t-test (default `0.95`). |
| `n_sim` | Number of simulated datasets to average over (default `5L`). |
| `tol_mean`, `tol_sd` | Relative tolerances for rejection sampling. |
| `max_iter` | Maximum additional participants per arm before giving up and returning … |
| `seed` | Optional integer seed for reproducibility. |

If the original test is already significant the function returns 0.
Otherwise additional participants are sampled from each arm's assumed
normal distribution (parameterised by the supplied mean and SD) and added
one per arm per iteration until significance is reached. The procedure is
repeated n_sim times and the mean is returned.
A single numeric value: mean additional participants per arm
required to reach significance across n_sim simulations. Returns 0
if the original test was already significant.
reverse_continuous_fragility_index_summary(mean1 = 55, sd1 = 10, n1 = 30, mean2 = 50, sd2 = 10, n2 = 30, seed = 1)
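
The add-one-participant-per-arm loop described above can be sketched in base R. Several things here are assumptions rather than the package's behaviour: `rcfi_one_run` is a hypothetical name, the starting datasets are rescaled normal draws that match the summary statistics exactly (a shortcut in place of rejection sampling), and returning `NA_integer_` when `max_iter` is reached is this sketch's own choice.

```r
# Illustrative sketch only; `rcfi_one_run` is a hypothetical helper.
rcfi_one_run <- function(mean1, sd1, n1, mean2, sd2, n2,
                         conf.level = 0.95, max_iter = 10000L) {
  alpha <- 1 - conf.level
  # Starting data: standardised draws rescaled to the target mean and SD.
  x <- mean1 + sd1 * as.numeric(scale(rnorm(n1)))
  y <- mean2 + sd2 * as.numeric(scale(rnorm(n2)))
  added <- 0L
  # t.test() defaults to Welch's test (var.equal = FALSE).
  while (t.test(x, y)$p.value >= alpha) {
    if (added >= max_iter) return(NA_integer_)  # assumption: NA at the cap
    x <- c(x, rnorm(1, mean1, sd1))             # one new participant per arm
    y <- c(y, rnorm(1, mean2, sd2))
    added <- added + 1L
  }
  added
}

set.seed(1)
runs <- replicate(5, rcfi_one_run(55, 10, 30, 50, 10, 30))
mean(runs)   # average additional participants per arm across 5 simulations
```

Averaging over several runs mirrors the role of `n_sim`: a single run depends heavily on the particular random participants that happen to be added.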
Vectorised wrapper around reverse_continuous_fragility_index_summary()
for use inside dplyr::mutate().
reverse_continuous_fragility_index_vec(mean1, sd1, n1, mean2, sd2, n2, conf.level = 0.95, n_sim = 5L, tol_mean = 0.01, tol_sd = 0.01, max_iter = 10000L)
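
| Argument | Description |
|---|---|
| `mean1`, `sd1`, `n1`, `mean2`, `sd2`, `n2` | Numeric vectors of summary statistics. |
| `conf.level`, `n_sim`, `tol_mean`, `tol_sd`, `max_iter` | Same meaning as in `reverse_continuous_fragility_index_summary()`. |
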
A numeric vector of reverse CFI values.
Computes the reverse fragility index for columns in a data frame.
Supports tidy evaluation and integrates with %>% or |>.
revfragility_index(data, intervention_event, control_event, intervention_n, control_n, conf.level = 0.95, verbose = FALSE, col_name = "revfragility_index", compatibility_mode = FALSE)

| Argument | Description |
|---|---|
| `data` | A data frame or tibble. |
| `intervention_event` | Column name (unquoted) for the intervention events. |
| `control_event` | Column name (unquoted) for the control events. |
| `intervention_n` | Column name (unquoted) for the intervention group totals. |
| `control_n` | Column name (unquoted) for the control group totals. |
| `conf.level` | Confidence level (default `0.95`). Can be a number or a column name. |
| `verbose` | Logical; if `TRUE`, returns a nested list-column with p-values for each iteration. |
| `col_name` | Name of the output column. Default is `"revfragility_index"`. |
| `compatibility_mode` | If `TRUE`, reproduces the original package's bug in verbose mode. |

The original data frame with an added column for the reverse fragility index.
Calculates the reverse fragility index for vector inputs. This is useful for running
inside dplyr::mutate().
revfragility_index_vec(intervention_event, control_event, intervention_n, control_n, conf.level = 0.95, verbose = FALSE, compatibility_mode = FALSE)

| Argument | Description |
|---|---|
| `intervention_event` | Vector of events in the intervention group. |
| `control_event` | Vector of events in the control group. |
| `intervention_n` | Vector of total patients in the intervention group. |
| `control_n` | Vector of total patients in the control group. |
| `conf.level` | Confidence level (default `0.95`); the significance threshold is `1 - conf.level`. |
| `verbose` | Logical indicating whether the full progression of p-values should be returned. |
| `compatibility_mode` | If `TRUE`, reproduces the original package's bug in verbose mode. |

A numeric vector of reverse fragility indices (if verbose = FALSE), or a list
of tibbles containing step-by-step p-values (if verbose = TRUE).
This file provides optimized, tidyverse-compatible functions for calculating the Fragility Index and the Reverse Fragility Index. It uses customized 2x2 hypergeometric and algebraic calculations to achieve a 25x speedup compared to standard stats package functions, and binary search algorithms to yield an additional 10x-1000x speedup for large trials.
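
The two ideas named here, a direct hypergeometric p-value and a binary search over the number of flipped outcomes, can be illustrated together. This is a hedged sketch, not the package's code: `fisher_p` and `fi_binary` are hypothetical names, the search assumes arm 1 has the lower event rate, and it assumes the p-value does not decrease as arm-1 events are raised towards the rate of arm 2. No timing claims are made for this sketch.

```r
# Illustrative sketch only; `fisher_p` and `fi_binary` are hypothetical helpers.

# Two-sided Fisher exact p-value from hypergeometric densities:
# sum the probabilities of all tables no more likely than the observed one.
fisher_p <- function(e1, n1, e2, n2) {
  k <- e1 + e2                              # total events across both arms
  support <- max(0, k - n2):min(k, n1)      # feasible arm-1 event counts
  d <- dhyper(support, n1, n2, k)
  obs <- dhyper(e1, n1, n2, k)
  sum(d[d <= obs * (1 + 1e-7)])             # small relative tolerance for ties
}

# Smallest number of arm-1 non-events to flip so that p >= alpha,
# found by bisection instead of one-at-a-time stepping.
fi_binary <- function(e1, n1, e2, n2, conf.level = 0.95) {
  alpha <- 1 - conf.level
  if (fisher_p(e1, n1, e2, n2) >= alpha) return(0L)
  stopifnot(e1 / n1 < e2 / n2)              # assumption: arm 1 has the lower rate
  lo <- 0                                   # invariant: p < alpha after `lo` flips
  hi <- ceiling(e2 * n1 / n2) - e1          # flips that roughly equalise the rates
  stopifnot(fisher_p(e1 + hi, n1, e2, n2) >= alpha)  # invariant: p >= alpha at `hi`
  while (hi - lo > 1) {
    mid <- floor((lo + hi) / 2)
    if (fisher_p(e1 + mid, n1, e2, n2) < alpha) lo <- mid else hi <- mid
  }
  as.integer(hi)
}

fi_binary(e1 = 1, n1 = 100, e2 = 20, n2 = 100)
```

Bisection needs a number of p-value evaluations that grows with the logarithm of the search range, whereas stepping one flip at a time grows linearly, which is why the difference matters most for large trials.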