| Title: | Calibration Weighting to Multiple Subgroup Pass-Rate Targets |
|---|---|
| Description: | Calibration weighting for binary-outcome pass rates against multiple overlapping subgroup targets. Adjusts initial positive weights so that the overall pass rate and subgroup pass rates approach (soft mode) or exactly match (exact mode) given targets, while preserving the initial weight structure and population margins. Provides a one-step interface, pre-solve data checks, target-table construction, effective sample size and design-effect diagnostics, and example data. The solver works on a bounded convex quadratic program over demographic-cell-by-outcome aggregates for efficiency on large samples. Methods follow the calibration approach of Deville and Saerndal (1992) <doi:10.1080/01621459.1992.10475217>. |
| Authors: | Kunxiang Ma [aut, cre] |
| Maintainer: | Kunxiang Ma <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3.0 |
| Built: | 2026-06-25 19:59:02 UTC |
| Source: | https://github.com/cran/ratecalib |
One-step convenience: read the sample data and the target table from an Excel workbook, infer the grouping variables from the targets, and solve.
    calibrate_from_excel(
      path,
      outcome,
      weight,
      data_sheet = 1,
      targets_sheet = "targets",
      group_vars = NULL,
      ...
    )
path |
Path to an |
outcome |
Name of the binary outcome column. |
weight |
Name of the initial weight column. |
data_sheet |
Sheet holding the sample data (default 1). |
targets_sheet |
Sheet holding the target table (default |
group_vars |
Optional grouping-variable names; inferred from the target
table when |
... |
Further arguments passed to |
An object of class pass_rate_calibration.
    if (requireNamespace("openxlsx", quietly = TRUE)) {
      path <- tempfile(fileext = ".xlsx")
      d <- example_rate_data(300)
      tg <- make_rate_targets(groups = list(sex = c(M = 0.72, F = 0.68)))
      openxlsx::write.xlsx(list(data = d, targets = tg), path)
      fit <- calibrate_from_excel(path, "qualified", "initial_weight",
                                  data_sheet = "data", targets_sheet = "targets")
      fit$target_check
    }
Adjust initial positive weights so that an overall binary outcome rate and subgroup outcome rates approach or exactly match specified targets. The optimization is performed on demographic-cell-by-outcome aggregates.
    calibrate_pass_rates(
      data,
      outcome,
      weight,
      group_vars,
      targets,
      lower = 0.25,
      upper = 4,
      mode = c("soft", "exact"),
      distance = c("chi2", "raking", "logit"),
      lambda = 10000,
      new_weight = "weight_calibrated",
      verbose = FALSE
    )
data |
A data frame containing one row per sampled unit. |
outcome |
Name of a binary 0/1 outcome column. |
weight |
Name of the initial positive weight column. |
group_vars |
Character vector naming grouping variables. |
targets |
A data frame with columns |
lower, upper |
Scalar lower and upper bounds on the multiplier applied to each initial cell weight. |
mode |
Either |
distance |
Distance function of the calibration family. |
lambda |
Positive soft-constraint penalty. Larger values emphasize target matching more strongly. |
new_weight |
Name of the calibrated weight column added to |
verbose |
Logical; passed to OSQP. |
An object of class pass_rate_calibration.
    d <- example_rate_data(300)
    targets <- make_rate_targets(groups = list(sex = c(M = 0.72, F = 0.68)))
    fit <- calibrate_pass_rates(d, "qualified", "initial_weight",
                                group_vars = "sex", targets = targets)
    fit$target_check
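The description above says the optimization runs on demographic-cell-by-outcome aggregates. As a rough illustration of why that loses nothing for rate targets, the base-R sketch below collapses a made-up toy data set to one row per cell and outcome and shows that the weighted pass rates are unchanged. The toy values are invented, and this is not the package's internal code.

```r
# Toy data; column names mirror the examples above, values are made up.
d <- data.frame(
  sex = c("M", "M", "F", "F", "M", "F"),
  qualified = c(1, 0, 1, 1, 0, 0),
  initial_weight = c(2, 1, 3, 1, 2, 1)
)

# One row per (cell, outcome) combination, carrying the summed initial weight.
agg <- aggregate(initial_weight ~ sex + qualified, data = d, FUN = sum)

# Weighted pass rate by sex, computed from the aggregates only.
pass_w <- with(agg[agg$qualified == 1, ], tapply(initial_weight, sex, sum))
total_w <- with(agg, tapply(initial_weight, sex, sum))
rate_from_cells <- pass_w / total_w[names(pass_w)]

# The same rates computed from the unit-level rows.
rate_from_units <- sapply(split(d, d$sex), function(g) {
  weighted.mean(g$qualified, g$initial_weight)
})

print(rate_from_cells)
print(rate_from_units)
```

Because every unit in the same cell with the same outcome contributes to the targets in the same way, a solver only needs one multiplier per aggregate row, which keeps the problem size independent of the number of sampled units.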
A convenience interface for everyday use. The user only supplies the data,
the outcome variable, the initial weights, an overall target and per-group
targets; the function builds the target table, identifies grouping
variables, runs the data checks and calls calibrate_pass_rates().
    calibrate_rates(
      data,
      outcome,
      weight,
      overall = NULL,
      groups = list(),
      priority = 5,
      group_priority = 1,
      lower = 0.25,
      upper = 4,
      mode = c("soft", "exact"),
      distance = c("chi2", "raking", "logit"),
      lambda = 10000,
      new_weight = "weight_calibrated",
      check = TRUE,
      verbose = FALSE
    )
data |
A data frame with one row per sampled unit. |
outcome |
Name of the binary outcome column (1 = pass, 0 = fail). |
weight |
Name of the initial weight column. |
overall |
Optional scalar overall target rate; may be |
groups |
Named list. Each element name is a grouping variable and each element is a named numeric vector of target rates. |
priority |
Priority of the overall target. Defaults to 5. |
group_priority |
Priority of the group targets; a single number or a vector named by grouping variable. |
lower, upper |
Lower and upper bounds on the weight-adjustment multiplier. |
mode |
|
distance |
Calibration distance, passed to |
lambda |
Soft-constraint penalty strength. |
new_weight |
Name of the new calibrated weight column. |
check |
Whether to run the data checks before solving. |
verbose |
Whether to print OSQP solver information. |
An object of class pass_rate_calibration.
    d <- example_rate_data(300)
    fit <- calibrate_rates(d, "qualified", "initial_weight",
                           groups = list(sex = c(M = 0.72, F = 0.68)))
    summary(fit)
Re-runs the calibration of a fitted pass_rate_calibration object for each
set of replicate weights (e.g. bootstrap, jackknife or BRR weights produced
elsewhere), holding the targets and solver settings fixed. The resulting
calibrated replicate weights feed replicate_variance() for design-consistent
variance estimation, mirroring the replicate-weights approach of the survey
package.
    calibrate_replicate_weights(
      fit,
      repweights,
      scale = 1,
      rscales = NULL,
      progress = FALSE
    )
    ## S3 method for class 'replicate_calibration'
    print(x, ...)
fit |
A fitted object of class |
repweights |
A numeric matrix or data frame of replicate weights with
one row per observation (matching |
scale, rscales |
Replication variance constants, as in
|
progress |
Logical; show a text progress bar over the replicates. |
x |
A |
... |
Ignored. |
An object of class replicate_calibration with the full-sample
calibrated weights, the matrix of calibrated replicate_weights, and the
scale/rscales constants.
    d <- example_rate_data(300)
    fit <- calibrate_rates(d, "qualified", "initial_weight",
                           groups = list(sex = c(M = 0.72, F = 0.68)))
    repw <- d$initial_weight * matrix(stats::runif(nrow(d) * 5, 0.8, 1.2), ncol = 5)
    rc <- calibrate_replicate_weights(fit, repw)
    replicate_variance(rc, d$qualified, statistic = "mean")
Extract calibration diagnostics
    calibration_diagnostics(x, sort_targets = TRUE)
x |
A |
sort_targets |
Whether to sort targets by absolute error, descending. |
A list with target, margin and weight diagnostics.
    d <- example_rate_data(300)
    fit <- calibrate_rates(d, "qualified", "initial_weight",
                           groups = list(sex = c(M = 0.72, F = 0.68)))
    calibration_diagnostics(fit)
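The package description mentions effective sample size and design-effect diagnostics for the weights. A common way to compute such quantities is Kish's approximation, sketched below in base R. Whether the package uses exactly this formula is an assumption, and `kish_diagnostics` is a hypothetical helper name, not a package function.

```r
# Kish's approximate effective sample size and the implied design effect
# due to unequal weighting. Hypothetical helper, not part of the package.
kish_diagnostics <- function(w) {
  stopifnot(is.numeric(w), all(w > 0))
  n <- length(w)
  n_eff <- sum(w)^2 / sum(w^2)
  list(n = n, n_eff = n_eff, deff = n / n_eff)
}

# Equal weights lose nothing; uneven weights shrink the effective sample.
equal <- kish_diagnostics(rep(2, 100))
uneven <- kish_diagnostics(c(rep(1, 50), rep(3, 50)))
print(equal$n_eff)
print(uneven$n_eff)
print(uneven$deff)
```

Comparing these values before and after calibration gives a quick sense of how much precision the adjusted weights cost.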
Runs two deterministic, closed-form feasibility checks before calibration. Both rely only on the initial weight marginals that the solver preserves, so they are cheap and exact within their stated scope.
    calibration_feasibility(
      data,
      outcome,
      weight,
      group_vars,
      targets,
      lower = 0.25,
      upper = 4,
      tol = 1e-08
    )
    ## S3 method for class 'ratecalib_feasibility'
    print(x, ...)
data |
A data frame with one row per sampled unit. |
outcome |
Name of the binary outcome column (1 = pass, 0 = fail). |
weight |
Name of the initial weight column. |
group_vars |
Character vector of grouping-variable names. |
targets |
Target table with columns |
lower, upper |
Lower and upper bounds on the weight-adjustment multiplier. |
tol |
Numeric tolerance for the consistency comparison. |
x |
A |
... |
Further arguments (ignored by the print method). |
Overall-vs-group consistency. Marginal totals are held fixed
during calibration, so if every level of a grouping variable carries an
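exact target, the overall rate is uniquely pinned to
$\sum_g W_g t_g / \sum_g W_g$, where $W_g$ is the initial weight total of level $g$ and $t_g$ is its target rate. Two such "complete" variables, or one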
plus an explicit overall target, can disagree; that disagreement is a
guaranteed conflict under mode = "exact".
Single-target marginal interval. With the group total fixed
and per-unit multipliers bounded by [lower, upper], a group's reachable
weighted rate lies in a closed interval (two-block water-filling). A
target outside that interval can never be met. This is a necessary
condition only: passing every single-target check does not guarantee the
targets are jointly feasible, because overlapping units couple the groups.
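The single-target interval described above can be sketched for one group in isolation. The sketch assumes the group splits into two blocks (passing and failing units), that every multiplier lies in `[lower, upper]`, and that the group's weight total is held fixed. `reachable_rate_interval` and `target_reachable` are hypothetical names used for illustration only, not the package's implementation.

```r
# Reachable weighted pass rate for one group whose weight total is fixed.
# pass_w / fail_w: initial weight totals of passing and failing units.
reachable_rate_interval <- function(pass_w, fail_w, lower = 0.25, upper = 4) {
  stopifnot(pass_w >= 0, fail_w >= 0, lower > 0, lower <= 1, upper >= 1)
  total <- pass_w + fail_w
  # Pass weight is capped by its own upper bound and by the failing block
  # not being allowed to shrink below lower * fail_w (and symmetrically).
  max_pass <- min(upper * pass_w, total - lower * fail_w)
  min_pass <- max(lower * pass_w, total - upper * fail_w)
  c(lower = min_pass / total, upper = max_pass / total)
}

target_reachable <- function(target, interval) {
  target >= interval[["lower"]] && target <= interval[["upper"]]
}

iv <- reachable_rate_interval(pass_w = 60, fail_w = 40)
print(iv)
print(target_reachable(0.70, iv))
print(target_reachable(0.95, iv))
```

With 60 units of passing weight and 40 of failing weight, the default bounds allow rates between 0.15 and 0.90, so a 0.95 target fails this necessary check regardless of the other targets.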
A list of class ratecalib_feasibility with elements consistency
(a list with pins, consistent and detail), marginal (a data frame
of per-target achievable intervals), necessary_ok (logical) and note.
    d <- example_rate_data(300)
    targets <- make_rate_targets(overall = 0.62,
                                 groups = list(sex = c(M = 0.66, F = 0.60)))
    calibration_feasibility(d, "qualified", "initial_weight", "sex", targets)
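To make the overall-vs-group consistency check concrete: when every level of a variable has an exact target and the level weight totals are fixed, the implied overall rate is the weight-share average of the level targets. The base-R sketch below uses made-up weight totals (480 and 520) and a hypothetical helper `pinned_overall`; it illustrates the arithmetic only, not the package's code.

```r
# Overall rate implied by a "complete" set of group targets, given fixed
# initial weight totals per level. Hypothetical helper for illustration.
pinned_overall <- function(group_w, group_target) {
  stopifnot(setequal(names(group_w), names(group_target)))
  sum(group_w * group_target[names(group_w)]) / sum(group_w)
}

w_sex <- c(M = 480, F = 520)  # made-up initial weight totals
pin <- pinned_overall(w_sex, c(M = 0.66, F = 0.60))

# Compare the pin with an explicit overall target of 0.62.
conflict <- abs(pin - 0.62) > 1e-8
print(pin)
print(conflict)
```

Here the group targets pin the overall rate at 0.6288, so an explicit overall target of 0.62 could not be met at the same time under `mode = "exact"`.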
Checks variables, weights, the binary outcome, group coverage and target supportability, and reports the current weighted pass rates.
    check_calibration_data(
      data,
      outcome,
      weight,
      group_vars,
      targets = NULL,
      consistency_tol = 0.01
    )
    ## S3 method for class 'ratecalib_check'
    print(x, ...)
data |
A data frame. |
outcome |
Name of the binary outcome column. |
weight |
Name of the initial weight column. |
group_vars |
Character vector of grouping-variable names. |
targets |
Optional target table. |
consistency_tol |
Tolerance (on the rate scale) for the overall-vs-group
consistency warning. Only inconsistencies larger than this are reported, so
that sub-tolerance rounding (round-number targets that do not divide the
weighted marginals exactly) does not trigger noise. For a precise,
exact-mode feasibility analysis call |
x |
A |
... |
Further arguments (ignored by the print method). |
A list of class ratecalib_check with ok, errors, warnings,
overview, group_summary and target_support.
    d <- example_rate_data(300)
    check_calibration_data(d, "qualified", "initial_weight", group_vars = "sex")
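For readers who want a feel for what pre-solve checks of this kind involve, the base-R sketch below tests three of the conditions named above: a 0/1 outcome, positive finite weights, and target levels that actually occur in the data. `basic_checks` is a hypothetical helper, and the real function reports more than this.

```r
# Minimal pre-solve checks; returns a character vector of problems found.
# Hypothetical helper, not the package's check_calibration_data().
basic_checks <- function(data, outcome, weight, group_var, target_levels) {
  problems <- character(0)
  if (!all(data[[outcome]] %in% c(0, 1))) {
    problems <- c(problems, "outcome is not coded 0/1")
  }
  w <- data[[weight]]
  if (any(!is.finite(w) | w <= 0)) {
    problems <- c(problems, "weights must be positive and finite")
  }
  absent <- setdiff(target_levels, unique(as.character(data[[group_var]])))
  if (length(absent) > 0) {
    problems <- c(problems, paste("target levels absent from data:",
                                  paste(absent, collapse = ", ")))
  }
  problems
}

good <- data.frame(sex = c("M", "F"), qualified = c(1, 0),
                   initial_weight = c(1.5, 2))
bad <- data.frame(sex = c("M", "M"), qualified = c(1, 2),
                  initial_weight = c(1, -1))

good_problems <- basic_checks(good, "qualified", "initial_weight", "sex", c("M", "F"))
bad_problems <- basic_checks(bad, "qualified", "initial_weight", "sex", c("M", "F"))
print(bad_problems)
```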
Creates a simulated data set with sex, residence, a 5-level education variable, a 5-level age variable, a pass indicator and initial weights.
    example_rate_data(n = 5000L, seed = 2026L)
n |
Sample size. |
seed |
Random seed. |
A data frame.
    d <- example_rate_data(200)
    head(d)
Writes a multi-sheet workbook: data (with the calibrated weight column),
target_check, margin_check, diagnostics and settings.
    export_calibration_xlsx(fit, path, overwrite = TRUE)
fit |
An object of class |
path |
Output |
overwrite |
Whether to overwrite an existing file. |
path, invisibly.
    if (requireNamespace("openxlsx", quietly = TRUE)) {
      d <- example_rate_data(300)
      fit <- calibrate_rates(d, "qualified", "initial_weight",
                             groups = list(sex = c(M = 0.72, F = 0.68)))
      export_calibration_xlsx(fit, tempfile(fileext = ".xlsx"))
    }
Build a pass-rate target table
    make_rate_targets(
      overall = NULL,
      groups = list(),
      interactions = list(),
      overall_priority = 5,
      group_priority = 1,
      interaction_priority = 1,
      means = NULL,
      totals = NULL,
      proportions = NULL
    )
overall |
Optional scalar overall target rate. |
groups |
Named list. Each element is a named numeric vector containing target rates for one grouping variable. |
interactions |
Named list of cross-classification (interaction) targets.
Each element name is a colon-joined set of grouping variables
(e.g. |
overall_priority |
Positive priority for the overall target. |
group_priority |
Either one positive scalar or a named positive vector indexed by group-variable name. |
interaction_priority |
Either one positive scalar or a named positive vector indexed by interaction key. |
means, totals |
Optional data frames of mean/total targets for a numeric
variable, each with columns |
proportions |
Optional data frame of proportion targets for a value of a
categorical variable, with columns |
A data frame suitable for calibrate_pass_rates().
    make_rate_targets(overall = 0.70, groups = list(sex = c(M = 0.72, F = 0.68)))
S3 methods that print, summarize and plot calibration results.
    ## S3 method for class 'pass_rate_calibration'
    weights(object, ...)
    ## S3 method for class 'pass_rate_calibration'
    as.data.frame(x, row.names = NULL, optional = FALSE, ...)
    ## S3 method for class 'pass_rate_calibration'
    print(x, digits = 4, ...)
    ## S3 method for class 'pass_rate_calibration'
    summary(object, top = 10L, ...)
    ## S3 method for class 'summary_pass_rate_calibration'
    print(x, digits = 4, ...)
    ## S3 method for class 'pass_rate_calibration'
    plot(x, type = c("target_error", "multipliers"), top = 20L, ...)
... |
Further arguments passed to the underlying print or graphics functions. |
x, object |
A |
row.names, optional |
Standard |
digits |
Number of decimal places to display. |
top |
Number of largest-error targets to display. |
type |
Plot type: |
print and plot return their input invisibly; summary returns a
summary_pass_rate_calibration object; weights returns the calibrated
weight vector; as.data.frame returns the data with the calibrated weight
column.
    d <- example_rate_data(300)
    fit <- calibrate_rates(d, "qualified", "initial_weight",
                           groups = list(sex = c(M = 0.72, F = 0.68)))
    print(fit)
    summary(fit)
    head(weights(fit))
    head(as.data.frame(fit))
    plot(fit, type = "target_error")
Read calibration sample data from an Excel workbook
    read_calibration_data(path, sheet = 1)
path |
Path to an |
sheet |
Sheet name or index (default 1). |
A data frame with one row per sampled unit.
    if (requireNamespace("openxlsx", quietly = TRUE)) {
      path <- tempfile(fileext = ".xlsx")
      openxlsx::write.xlsx(example_rate_data(50), path)
      head(read_calibration_data(path))
    }
Reads a worksheet of targets and maps its headers (English or Chinese, in
any letter case) onto the canonical columns variable, level,
target_rate and the optional priority.
    read_targets_xlsx(path, sheet = 1)
path |
Path to an |
sheet |
Sheet name or index (default 1). |
A data frame suitable for calibrate_pass_rates().
    if (requireNamespace("openxlsx", quietly = TRUE)) {
      path <- tempfile(fileext = ".xlsx")
      openxlsx::write.xlsx(
        make_rate_targets(groups = list(sex = c(M = 0.72, F = 0.68))), path)
      read_targets_xlsx(path)
    }
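The header mapping described above (any letter case onto the canonical columns `variable`, `level`, `target_rate` and optional `priority`) can be approximated in a few lines of base R. The alias table below is entirely hypothetical and covers English spellings only; the aliases the package actually recognizes, including the Chinese ones, are not listed in this manual.

```r
# Hypothetical alias table: canonical name -> accepted lower-case headers.
aliases <- list(
  variable = c("variable", "var", "group"),
  level = c("level", "category"),
  target_rate = c("target_rate", "target", "rate"),
  priority = c("priority")
)

# Rename recognized headers case-insensitively and require the three
# mandatory columns. Illustrative only, not the package's reader.
normalize_headers <- function(df) {
  key <- tolower(trimws(names(df)))
  for (canon in names(aliases)) {
    names(df)[key %in% aliases[[canon]]] <- canon
  }
  missing_cols <- setdiff(c("variable", "level", "target_rate"), names(df))
  if (length(missing_cols) > 0) {
    stop("missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  df
}

raw <- data.frame(Variable = "sex", LEVEL = c("M", "F"), Target = c(0.72, 0.68))
targets <- normalize_headers(raw)
print(names(targets))
```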
Computes the point estimate, variance and standard error of a weighted total or mean of a study variable, using calibrated replicate weights.
    replicate_variance(object, x, statistic = c("total", "mean"))
object |
A |
x |
Numeric study variable, one value per observation. |
statistic |
Either |
A list with estimate, variance and se.
    d <- example_rate_data(300)
    fit <- calibrate_rates(d, "qualified", "initial_weight",
                           groups = list(sex = c(M = 0.72, F = 0.68)))
    repw <- d$initial_weight * matrix(stats::runif(nrow(d) * 5, 0.8, 1.2), ncol = 5)
    rc <- calibrate_replicate_weights(fit, repw)
    replicate_variance(rc, d$qualified, statistic = "mean")
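As a rough guide to what a replicate-weight variance for a weighted mean involves, the base-R sketch below applies the usual form `scale * sum(rscales * (theta_r - theta)^2)`. It assumes deviations are taken about the full-sample estimate; centring on the mean of the replicate estimates is another common convention, and this manual does not say which one the package uses. `replicate_variance_sketch` is a hypothetical name.

```r
# Replicate-weight variance of a weighted mean (illustrative sketch).
# full_w: full-sample weights; rep_w: matrix with one column per replicate.
replicate_variance_sketch <- function(full_w, rep_w, x, scale = 1, rscales = NULL) {
  rep_w <- as.matrix(rep_w)
  stopifnot(nrow(rep_w) == length(x), length(full_w) == length(x))
  if (is.null(rscales)) rscales <- rep(1, ncol(rep_w))
  estimate <- sum(full_w * x) / sum(full_w)
  # x is recycled down each column, giving one weighted mean per replicate.
  rep_est <- colSums(rep_w * x) / colSums(rep_w)
  variance <- scale * sum(rscales * (rep_est - estimate)^2)
  list(estimate = estimate, variance = variance, se = sqrt(variance))
}

x <- c(1, 0, 1, 0)
full_w <- c(1, 1, 1, 1)
rep_w <- cbind(c(2, 1, 1, 1), c(1, 2, 1, 1))  # made-up replicate weights
out <- replicate_variance_sketch(full_w, rep_w, x)
print(out)
```

In practice the appropriate `scale` and `rscales` depend on how the replicate weights were generated (bootstrap, jackknife or BRR), so the defaults of 1 used here are placeholders.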