Package: riskutility 0.1.0

Matthias Templ

riskutility: Disclosure Risk and Data Utility Metrics for Synthetic and Anonymized Data

Provides comprehensive methods to measure disclosure risk and data utility for anonymized and synthetic data. Implements attribution-based risk metrics including Correct Attribution Probability (CAP), Targeted CAP (TCAP), Within Equivalence Class Attribution Probability (WEAP), and RAPID (Risk of Attribute Prediction-Induced Disclosure). Also provides distance-based privacy metrics such as Distance to Closest Record (DCR), Nearest Neighbor Distance Ratio (NNDR), and Identical Match Share (IMS). Utility assessment includes propensity score analysis, distribution comparisons, and various statistical tests. Methods are based on Taub et al. (2018) <doi:10.1007/978-3-319-99771-1_9> and related literature. Designed for integration with 'simPop' S4 classes.
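
To make the distance-based metrics named above concrete, here is a minimal base-R sketch of DCR, NNDR, and IMS on a toy numeric example. The helper names (`cross_dist`, `dcr_sketch`, `nndr_sketch`, `ims_sketch`) are hypothetical and are not the riskutility API. The sketch assumes purely numeric columns, unscaled Euclidean distance, and at least two original records; the package's own functions may define and normalize these quantities differently.

```r
# Toy illustration only: hypothetical helpers, not the riskutility API.

# Euclidean distances: rows are synthetic records, columns are original records.
cross_dist <- function(synth, orig) {
  s   <- as.matrix(synth)
  o_t <- t(as.matrix(orig))  # each column is one original record
  # For every synthetic record, compute its distance to all original records.
  d <- apply(s, 1, function(row) sqrt(colSums((o_t - row)^2)))
  t(d)
}

# DCR: distance from each synthetic record to its closest original record.
dcr_sketch <- function(synth, orig) {
  apply(cross_dist(synth, orig), 1, min)
}

# NNDR: ratio of the closest to the second-closest distance.
# (Gives NaN when the two closest distances are both zero.)
nndr_sketch <- function(synth, orig) {
  apply(cross_dist(synth, orig), 1, function(d) {
    two <- sort(d)[1:2]
    two[1] / two[2]
  })
}

# IMS: share of synthetic records that exactly match an original record.
ims_sketch <- function(synth, orig) {
  mean(dcr_sketch(synth, orig) == 0)
}

orig  <- data.frame(x = c(0, 3, 10), y = c(0, 4, 0))
synth <- data.frame(x = c(0, 3),     y = c(0, 0))

dcr_sketch(synth, orig)   # 0 and 3
nndr_sketch(synth, orig)  # 0 and 0.75
ims_sketch(synth, orig)   # 0.5
```

In this toy data the first synthetic record copies an original record exactly, so its DCR is 0 and it accounts for the IMS of 0.5. Real data with categorical or mixed-scale variables would need a different distance, such as Gower.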

Authors: Matthias Templ [aut, cre], Oscar Thees [ctb]

riskutility_0.1.0.tar.gz
riskutility_0.1.0.tar.gz (r-4.7-any), riskutility_0.1.0.tar.gz (r-4.6-any)
riskutility_0.1.0.tgz (r-4.6-emscripten)

# Install 'riskutility' in R:
install.packages('riskutility', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
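
After installation, a quick sanity check is to list what the package exports. The sketch below uses only base R (`requireNamespace`, `getNamespaceExports`); `list_exports` is a hypothetical helper, not part of riskutility, and the count it reports may differ from the figure shown on this page depending on how exports are counted.

```r
# Hypothetical helper (not part of riskutility): list a package's exports,
# or return an empty character vector when the package is not installed.
list_exports <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed.")
    return(character(0))
  }
  sort(getNamespaceExports(pkg))
}

ex <- list_exports("riskutility")
length(ex)  # number of exported names, or 0 if the package is missing
head(ex)
```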

Bug tracker: https://github.com/matthias-da/riskutility/issues

Score: 2.70 | 6 scripts | 99 exports | 111 dependencies

Last updated from: 437449adf4. Checks: 4 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 381
source / vignettes | OK | 371
linux-release-x86_64 | OK | 388
wasm-release | OK | 203

Exports: ait, attacker_risk, chisq_utility, ci_overlap, ci_proximity, compare_boxplots, compare_chisq_gof, compare_correlation_matrices, compare_distributions_cont, compare_embedding, compare_feature_importance, compare_histograms, compare_ks_test, compare_means_frequencies, compare_missing_values, compare_model_performance, compare_multivariate_distribution, compare_multivariate_summary_statistics, compare_outliers, compare_pca, compare_wasserstein, ConditionalEntropy, contingency_fidelity, copula_fidelity, CrossEntropy, CumulativeEntropy, dcap, dcr, delta_presence, densitydiff_1d_num, densitydiff_kl_num, densitydiff_pca, disclosure_report, disco, domias, drisk, energy_distance, epsilon_identifiability, from_sdcMicro, from_simPop, from_synthpop, gower, hellinger, high_risk_records, hitting_rate, ims, individual_risk, information_surprisal, inspect_record, JSDiv, JSDiv_bayes, kanonymity, KLDiv, KLDiv_bayes, ldiversity, linkability, mae, mape, max_info_leakage, MaxEntropy, merge_per_record, mia_classifier, MinEntropy, mmd, mqs, mse, mutualInformation, nnaa, nndr, NormalizedEntropy, pMSE, population_uniqueness, positive_information_disclosure, privacy_score, propscore, rapid, rapid_synthesizer_cv, rapid_test, rapid_threshold_select, recordLinkage, regression_fidelity, RenyiEntropy, repu, rf_privacy, risk_by_group, rmse, rumap, singling_out, specks, subgroup_utility, suda, synth_pair, systemAnonymityLevel, tail_fidelity, tcap, tcloseness, top_at_risk, tstr, weap

Dependencies: abind, backports, bbotk, boot, broom, car, carData, checkmate, class, cli, codetools, colorspace, cowplot, cpp11, data.table, DEoptimR, Deriv, digest, doBy, dplyr, e1071, evaluate, farver, forecast, Formula, fracdiff, future, future.apply, generics, ggplot2, globals, glue, gtable, isoband, jsonlite, labeling, laeken, lattice, lgr, lifecycle, listenv, lme4, lmtest, magrittr, MASS, Matrix, MatrixModels, matrixStats, mgcv, microbenchmark, minqa, mirai, mlbench, mlr3, mlr3learners, mlr3measures, mlr3misc, mlr3pipelines, mlr3tuning, modelr, moocore, nanonext, nlme, nloptr, nnet, numDeriv, palmerpenguins, paradox, parallelly, pbkrtest, pillar, pkgconfig, plyr, proxy, PRROC, purrr, quantreg, R6, randomForest, ranger, rbibutils, RColorBrewer, Rcpp, RcppArmadillo, RcppEigen, Rdpack, reformulas, reshape2, rlang, robustbase, S7, scales, sp, SparseM, stringi, stringr, survival, tibble, tidyr, tidyselect, timeDate, urca, utf8, uuid, vcd, vctrs, VIM, viridisLite, withr, xgboost, zoo

riskutility: Comprehensive Disclosure Risk and Data Utility Assessment for Anonymized and Synthetic Data in R

Rendered from riskutility.Rmd using knitr::rmarkdown on Jun 22 2026.

Last update: 2026-06-22
Started: 2026-06-22

Readme and manuals

Help Manual

Help page | Topics
riskutility: Disclosure Risk and Data Utility for Anonymized and Synthetic Data | riskutility-package riskutility
Attacker Risk Models (Prosecutor/Journalist/Marketer) | attacker_risk attacker_risk.default attacker_risk.synth_pair
Chi-Square Utility Measures | chisq_utility chisq_utility.default chisq_utility.synth_pair
Comparison of confidence intervals | ci_overlap
Confidence Interval Proximity | ci_proximity ci_proximity.default ci_proximity.synth_pair
Compare Boxplots of a Numeric Variable in Two Datasets | compare_boxplots compare_boxplots.default compare_boxplots.synth_pair
Compare Frequencies using Chi-square Goodness-of-Fit Test with Optional Grouping, Weights, and Simulated p-values | compare_chisq_gof compare_chisq_gof.default compare_chisq_gof.synth_pair
Compare Correlation Matrices between Two Datasets | compare_correlation_matrices compare_correlation_matrices.default compare_correlation_matrices.synth_pair
Compare continuous distributions conditionally | compare_distributions_cont compare_distributions_cont.default compare_distributions_cont.synth_pair
Compare Dimensionality Reduction Methods (t-SNE, UMAP, MDS-Sammon) for Two Datasets | compare_embedding compare_embedding.default compare_embedding.synth_pair
Compare Feature Importance between Real and Synthetic Data | compare_feature_importance compare_feature_importance.default compare_feature_importance.synth_pair
Compare Histograms of a Numeric Variable in Two Datasets | compare_histograms compare_histograms.default compare_histograms.synth_pair
Compare Distributions using the Kolmogorov-Smirnov Test | compare_ks_test compare_ks_test.default compare_ks_test.synth_pair
Compare Means and Frequencies between Two Datasets | compare_means_frequencies compare_means_frequencies.default compare_means_frequencies.synth_pair
Compare Missing Value Patterns Between Original and Synthetic Data | compare_missing_values compare_missing_values.default compare_missing_values.synth_pair
Compare Predictive Model Performance between Two Datasets | compare_model_performance compare_model_performance.default compare_model_performance.synth_pair
Compare Multivariate Distributions using Mahalanobis Distance or Jensen-Shannon Divergence | compare_multivariate_distribution compare_multivariate_distribution.default compare_multivariate_distribution.synth_pair
Compare Multivariate Summary Statistics between Two Datasets | compare_multivariate_summary_statistics compare_multivariate_summary_statistics.default compare_multivariate_summary_statistics.synth_pair
Compare Outlier Detection Between Original and Synthetic Data | compare_outliers compare_outliers.default compare_outliers.synth_pair
Compare Principal Component Analysis (PCA) between Two Datasets with Separate Loadings | compare_pca compare_pca.default compare_pca.synth_pair
Compare Distributions using the Wasserstein Distance | compare_wasserstein compare_wasserstein.default compare_wasserstein.synth_pair
Confidence Intervals for RAPID Risk Estimates | confint.rapid
Contingency Table Fidelity for Categorical Dependence Comparison | contingency_fidelity contingency_fidelity.default contingency_fidelity.synth_pair
Copula Fidelity for Dependence Structure Comparison | copula_fidelity copula_fidelity.default copula_fidelity.synth_pair
Correct Attribution Probability (CAP/DCAP) | dcap dcap.default dcap.synth_pair
Distance to Closest Record (DCR) | dcr dcr.default dcr.synth_pair
delta-Presence Risk Assessment | delta_presence delta_presence.default delta_presence.synth_pair
Density ratio and entropy | densitydiff_1d_num
Kullback-Leibler divergence | densitydiff_kl_num
Density ratio and entropy of first PC's | densitydiff_pca
Comprehensive Disclosure Risk Report | disclosure_report disclosure_report.default disclosure_report.synth_pair
Disclosive in Synthetic Correct Original (DiSCO) | disco disco.default disco.synth_pair
Density-based Membership Inference Attack (DOMIAS) | domias domias.default domias.synth_pair
Disclosure Risk for Continuous Variables (dRisk / dRiskRMD) | drisk drisk.default drisk.synth_pair
Energy Distance for Multivariate Numeric Data | energy_distance energy_distance.default energy_distance.synth_pair
Entropy measures | CrossEntropy Entropy JSDiv JSDiv_bayes KLDiv KLDiv_bayes
Additional Entropy-Based Privacy Measures | ConditionalEntropy CumulativeEntropy EntropyMeasures MaxEntropy MinEntropy NormalizedEntropy RenyiEntropy
Epsilon Identifiability | epsilon_identifiability epsilon_identifiability.default epsilon_identifiability.synth_pair
Evaluation statistics | ait evaluation_stats mae mape mse rmse
Extract comparison data from sdcMicro object | from_sdcMicro
Extract comparison data from simPop object | from_simPop
Extract comparison data from synthpop object | from_synthpop
Gower distance between two data frames | gower gower.default gower.synth_pair
Hellinger Distance for Categorical Distributions | hellinger hellinger.default hellinger.synth_pair
Get high-risk records | high_risk_records
Hitting Rate | hitting_rate hitting_rate.default hitting_rate.synth_pair
Identical Match Share (IMS) | ims ims.default ims.synth_pair
Individual Re-identification Risk | individual_risk individual_risk.default individual_risk.synth_pair
Information Surprisal | information_surprisal
Inspect a Single Record's Linkage Detail | inspect_record inspect_record.recordLinkageRisk print.inspect_record
k-Anonymity Assessment | kanonymity kanonymity.default kanonymity.synth_pair
l-Diversity Assessment | ldiversity ldiversity.default ldiversity.synth_pair
Linkability Risk | linkability linkability.default linkability.synth_pair
Maximum Information Leakage | max_info_leakage
Merge Per-Record Risks Back to Data | merge_per_record merge_per_record.recordLinkageRisk
Membership Inference Attack metric via classification | mia_classifier mia_classifier.default mia_classifier.synth_pair
Maximum Mean Discrepancy (MMD) for Multivariate Data | mmd mmd.default mmd.synth_pair
Model Quality Score | mqs mqs.default mqs.synth_pair
Mutual Information Privacy Metrics | mutualInformation
Nearest-Neighbor Adversarial Accuracy (NNAA) | nnaa nnaa.default nnaa.synth_pair
Nearest Neighbor Distance Ratio (NNDR) | nndr nndr.default nndr.synth_pair
Plot method for attacker_risk objects | plot.attacker_risk
Plot method for chisq_utility objects | plot.chisq_utility
Plot method for ci_proximity objects | plot.ci_proximity
Plot Method for Objects of Class "compare_distributions_cont" | plot.compare_distributions_cont
Plot method for compare_feature_importance objects | plot.compare_feature_importance
Plot method for contingency_fidelity objects | plot.contingency_fidelity
Plot method for copula_fidelity objects | plot.copula_fidelity
Plot method for dcap objects | plot.dcap
Plot method for dcr objects | plot.dcr
Plot method for delta_presence objects | plot.delta_presence
Plot method for denpca objects | plot.denpca
Plot method for denratio objects | plot.denratio
Plot method for disclosure_report objects | plot.disclosure_report
Plot method for disco objects | plot.disco
Plot method for domias objects | plot.domias
Plot method for drisk objects | plot.drisk
Plot method for energy_distance objects | plot.energy_distance
Plot method for epsilon_identifiability objects | plot.epsilon_identifiability
Plot method for gower objects | plot.gower
Plot method for hellinger objects | plot.hellinger
Plot method for hitting_rate objects | plot.hitting_rate
Plot method for ims objects | plot.ims
Plot method for individual_risk objects | plot.individual_risk
Plot method for kanonymity objects | plot.kanonymity
Plot method for ldiversityRisk objects | plot.ldiversityRisk
Plot method for linkability objects | plot.linkability
Plot method for mia objects | plot.mia
Plot method for missingCompare objects | plot.missingCompare
Plot method for mmd objects | plot.mmd
Plot method for mqs objects | plot.mqs
Plot method for nnaa objects | plot.nnaa
Plot method for nndr objects | plot.nndr
Plot method for pMSE objects | plot.pMSE
Plot method for population_uniqueness objects | plot.population_uniqueness
Plot method for propscore objects | plot.propscore
Plot method for rapid objects | plot.rapid
Plot method for recordLinkageRisk | plot.recordLinkageRisk
Plot method for regression_fidelity objects | plot.regression_fidelity
Plot method for rumap objects | plot.rumap
Plot method for singling_out objects | plot.singling_out
Plot method for specks objects | plot.specks
Plot method for subgroup_utility objects | plot.subgroup_utility
Plot method for suda objects | plot.suda
Plot method for tail_fidelity objects | plot.tail_fidelity
Plot method for tcap objects | plot.tcap
Plot method for tcloseness objects | plot.tcloseness
Plot method for weap objects | plot.weap
Propensity Score Mean Squared Error (pMSE) Utility Measure | pMSE pMSE.default pMSE.synth_pair
Population Uniqueness Risk | population_uniqueness population_uniqueness.default population_uniqueness.synth_pair
Positive Information Disclosure (PID) | positive_information_disclosure
Print method for attacker_risk objects | print.attacker_risk
Print method for chisq_utility objects | print.chisq_utility
Print method for ci_proximity objects | print.ci_proximity
Print method for compare_distributions_cont objects | print.compare_distributions_cont
Print method for compare_feature_importance objects | print.compare_feature_importance
Print method for contingency_fidelity objects | print.contingency_fidelity
Print method for copula_fidelity objects | print.copula_fidelity
Print method for dcap objects | print.dcap
Print method for dcr objects | print.dcr
Print method for delta_presence objects | print.delta_presence
Print method for denpca objects | print.denpca
Print method for denratio objects | print.denratio
Print method for disclosure_report objects | print.disclosure_report
Print method for disco objects | print.disco
Print method for domias objects | print.domias
Print method for drisk objects | print.drisk
Print method for energy_distance objects | print.energy_distance
Print method for epsilon_identifiability objects | print.epsilon_identifiability
Print method for gower objects | print.gower
Print method for hellinger objects | print.hellinger
Print method for hitting_rate objects | print.hitting_rate
Print method for ims objects | print.ims
Print method for individual_risk objects | print.individual_risk
Print method for kanonymity objects | print.kanonymity
Print method for ldiversityRisk objects | print.ldiversityRisk
Print method for linkability objects | print.linkability
Print method for mia objects | print.mia
Print method for missingCompare objects | print.missingCompare
Print method for mmd objects | print.mmd
Print method for mqs objects | print.mqs
Print method for nnaa objects | print.nnaa
Print method for nndr objects | print.nndr
Print method for pMSE objects | print.pMSE
Print method for population_uniqueness objects | print.population_uniqueness
Print method for propscore objects | print.propscore
Print method for rapid objects | print.rapid
Print method for recordLinkageRisk | print.recordLinkageRisk
Print method for regression_fidelity objects | print.regression_fidelity
Print method for rumap objects | print.rumap
Print method for singling_out objects | print.singling_out
Print method for specks objects | print.specks
Print method for subgroup_utility objects | print.subgroup_utility
Print method for suda objects | print.suda
Print method for summary.attacker_risk objects | print.summary.attacker_risk
Print method for summary.chisq_utility objects | print.summary.chisq_utility
Print method for summary.ci_proximity objects | print.summary.ci_proximity
Print method for summary.compare_distributions_cont objects | print.summary.compare_distributions_cont
Print method for summary.compare_feature_importance objects | print.summary.compare_feature_importance
Print method for summary.contingency_fidelity objects | print.summary.contingency_fidelity
Print method for summary.copula_fidelity objects | print.summary.copula_fidelity
Print method for summary.dcap objects | print.summary.dcap
Print method for summary.dcr objects | print.summary.dcr
Print method for summary.delta_presence objects | print.summary.delta_presence
Print method for summary.denpca objects | print.summary.denpca
Print method for summary.denratio objects | print.summary.denratio
Print method for summary.disclosure_report objects | print.summary.disclosure_report
Print method for summary.disco objects | print.summary.disco
Print method for summary.domias objects | print.summary.domias
Print method for summary.drisk objects | print.summary.drisk
Print method for summary.energy_distance objects | print.summary.energy_distance
Print method for summary.epsilon_identifiability objects | print.summary.epsilon_identifiability
Print method for summary.gower objects | print.summary.gower
Print method for summary.hellinger objects | print.summary.hellinger
Print method for summary.hitting_rate objects | print.summary.hitting_rate
Print method for summary.ims objects | print.summary.ims
Print method for summary.individual_risk objects | print.summary.individual_risk
Print method for summary.kanonymity objects | print.summary.kanonymity
Print method for summary.ldiversityRisk objects | print.summary.ldiversityRisk
Print method for summary.linkability objects | print.summary.linkability
Print method for summary.mia objects | print.summary.mia
Print method for summary.missingCompare objects | print.summary.missingCompare
Print method for summary.mmd objects | print.summary.mmd
Print method for summary.mqs objects | print.summary.mqs
Print method for summary.nnaa objects | print.summary.nnaa
Print method for summary.nndr objects | print.summary.nndr
Print method for summary.pMSE objects | print.summary.pMSE
Print method for summary.population_uniqueness objects | print.summary.population_uniqueness
Print method for summary.propscore objects | print.summary.propscore
Print method for summary.rapid objects | print.summary.rapid
Print method for summary.rapid_cv objects | print.summary.rapid_cv
Print method for summary.rapid_test objects | print.summary.rapid_test
Print method for summary.rapid_threshold objects | print.summary.rapid_threshold
Print method for summary.recordLinkageRisk | print.summary.recordLinkageRisk
Print method for summary.regression_fidelity objects | print.summary.regression_fidelity
Print method for summary.rumap objects | print.summary.rumap
Print method for summary.singling_out objects | print.summary.singling_out
Print method for summary.specks objects | print.summary.specks
Print method for summary.subgroup_utility objects | print.summary.subgroup_utility
Print method for summary.suda objects | print.summary.suda
Print method for summary.synth_pair objects | print.summary.synth_pair
Print method for summary.tail_fidelity objects | print.summary.tail_fidelity
Print method for summary.tcap objects | print.summary.tcap
Print method for summary.tcloseness objects | print.summary.tcloseness
Print method for summary.weap objects | print.summary.weap
Print method for synth_pair objects | print.synth_pair
Print method for tail_fidelity objects | print.tail_fidelity
Print method for tcap objects | print.tcap
Print method for tcloseness objects | print.tcloseness
Print method for weap objects | print.weap
Privacy Score | privacy_score
Propensity score utility measure | propscore propscore.default propscore.synth_pair
RAPID: Risk of Attribute Prediction-Induced Disclosure | rapid rapid.default rapid.synth_pair
Cross-Validation of Synthesizer Disclosure Risk | print.rapid_cv rapid_synthesizer_cv
Permutation Test for RAPID Disclosure Risk | print.rapid_test rapid_test
Data-Driven Threshold Selection for RAPID | plot.rapid_threshold print.rapid_threshold rapid_threshold_select
Record Linkage Risk After Perturbation | recordLinkage recordLinkage.default recordLinkage.synth_pair
Regression Fidelity - Coefficient Comparison | regression_fidelity regression_fidelity.default regression_fidelity.synth_pair
Replicated Uniques (RepU) | repu
RF Proximity Privacy Assessment | plot.rf_privacy print.rf_privacy print.summary.rf_privacy rf_privacy rf_privacy.default rf_privacy.synth_pair summary.rf_privacy
Aggregate Risk by Group | risk_by_group risk_by_group.recordLinkageRisk
Multivariate Risk-Utility Map (RU-Map) | rumap rumap.default rumap.synth_pair
Singling Out Risk | singling_out singling_out.default singling_out.synth_pair
SPECKS - Propensity Score Comparison via Kolmogorov-Smirnov Test | specks specks.default specks.synth_pair
Stratified Utility Assessment Across Subgroups | subgroup_utility subgroup_utility.default subgroup_utility.synth_pair
SUDA - Special Uniques Detection Algorithm | suda suda.default suda.synth_pair
Summary method for attacker_risk objects | summary.attacker_risk
Summary method for chisq_utility objects | summary.chisq_utility
Summary method for ci_proximity objects | summary.ci_proximity
Summary method for compare_distributions_cont objects | summary.compare_distributions_cont
Summary method for compare_feature_importance objects | summary.compare_feature_importance
Summary method for contingency_fidelity objects | summary.contingency_fidelity
Summary method for copula_fidelity objects | summary.copula_fidelity
Summary method for dcap objects | summary.dcap
Summary method for dcr objects | summary.dcr
Summary method for delta_presence objects | summary.delta_presence
Summary method for denpca objects | summary.denpca
Summary method for denratio objects | summary.denratio
Summary method for disclosure_report objects | summary.disclosure_report
Summary method for disco objects | summary.disco
Summary method for domias objects | summary.domias
Summary method for drisk objects | summary.drisk
Summary method for energy_distance objects | summary.energy_distance
Summary method for epsilon_identifiability objects | summary.epsilon_identifiability
Summary method for gower objects | summary.gower
Summary method for hellinger objects | summary.hellinger
Summary method for hitting_rate objects | summary.hitting_rate
Summary method for ims objects | summary.ims
Summary method for individual_risk objects | summary.individual_risk
Summary method for kanonymity objects | summary.kanonymity
Summary method for ldiversityRisk objects | summary.ldiversityRisk
Summary method for linkability objects | summary.linkability
Summary method for mia objects | summary.mia
Summary method for missingCompare objects | summary.missingCompare
Summary method for mmd objects | summary.mmd
Summary method for mqs objects | summary.mqs
Summary method for nnaa objects | summary.nnaa
Summary method for nndr objects | summary.nndr
Summary method for pMSE objects | summary.pMSE
Summary method for population_uniqueness objects | summary.population_uniqueness
Summary method for propscore objects | summary.propscore
Summary method for rapid objects | summary.rapid
Summary method for rapid_cv objects | summary.rapid_cv
Summary method for rapid_test objects | summary.rapid_test
Summary method for rapid_threshold objects | summary.rapid_threshold
Summary method for recordLinkageRisk | summary.recordLinkageRisk
Summary method for regression_fidelity objects | summary.regression_fidelity
Summary method for rumap objects | summary.rumap
Summary method for singling_out objects | summary.singling_out
Summary method for specks objects | summary.specks
Summary method for subgroup_utility objects | summary.subgroup_utility
Summary method for suda objects | summary.suda
Summary method for synth_pair objects | summary.synth_pair
Summary method for tail_fidelity objects | summary.tail_fidelity
Summary method for tcap objects | summary.tcap
Summary method for tcloseness objects | summary.tcloseness
Summary method for weap objects | summary.weap
Create a Synthetic Data Comparison Pair | synth_pair
System Anonymity Level | systemAnonymityLevel
Tail Fidelity - Tail Preservation Utility Measure | tail_fidelity tail_fidelity.default tail_fidelity.synth_pair
Targeted Correct Attribution Probability (TCAP) | tcap tcap.default tcap.synth_pair
t-Closeness Assessment | tcloseness tcloseness.default tcloseness.synth_pair
Return Highest-Risk Records | top_at_risk top_at_risk.recordLinkageRisk
Train on Synthetic, Test on Real (TSTR) Utility Measure | plot.tstr print.summary.tstr print.tstr summary.tstr tstr tstr.default tstr.synth_pair
Within Equivalence Class Attribution Probability (WEAP) | weap weap.default weap.synth_pair
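
Most topics in the manual come in three forms: a generic, a `.default` method, and a `.synth_pair` method, with matching `print` and `summary` methods. The sketch below shows how that S3 pattern typically works in R, using a toy propensity-score MSE as the metric. Every name in it (`toy_pair`, `toy_pmse`, the `synth_flag` column) is hypothetical; it does not reproduce the signatures or internals of `synth_pair()` or `pMSE()` in riskutility, and it assumes a plain logistic regression on all columns as the propensity model.

```r
# Hypothetical names throughout; this is not the riskutility API.

# A small container holding an original and a synthetic data frame.
toy_pair <- function(orig, synth) {
  stopifnot(is.data.frame(orig), is.data.frame(synth),
            identical(names(orig), names(synth)))
  structure(list(orig = orig, synth = synth), class = "toy_pair")
}

# Generic: dispatches on the class of the first argument.
toy_pmse <- function(x, ...) UseMethod("toy_pmse")

# Default method: original and synthetic data are passed separately.
toy_pmse.default <- function(x, synth, ...) {
  combined <- rbind(x, synth)
  combined$synth_flag <- rep(c(0, 1), c(nrow(x), nrow(synth)))
  # Propensity model: probability that a record is synthetic.
  fit   <- glm(synth_flag ~ ., data = combined, family = binomial())
  p     <- fitted(fit)
  share <- nrow(synth) / nrow(combined)  # expected propensity if indistinguishable
  structure(list(pMSE = mean((p - share)^2), n = nrow(combined)),
            class = "toy_pmse")
}

# Pair method: unpack the container and reuse the default method.
toy_pmse.toy_pair <- function(x, ...) {
  toy_pmse.default(x$orig, x$synth, ...)
}

# Print method for the result class.
print.toy_pmse <- function(x, ...) {
  cat("pMSE:", format(x$pMSE, digits = 4), "on", x$n, "records\n")
  invisible(x)
}

set.seed(1)
orig  <- data.frame(a = rnorm(50), b = rnorm(50))
synth <- data.frame(a = rnorm(50), b = rnorm(50))

res_default <- toy_pmse(orig, synth)            # dispatches to toy_pmse.default
res_pair    <- toy_pmse(toy_pair(orig, synth))  # dispatches to toy_pmse.toy_pair
print(res_pair)
```

Both calls give the same value because the pair method only unpacks its argument. A pMSE near 0 means the propensity model cannot tell synthetic records from original ones.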