Package: StatMatch 1.4.3

Marcello D'Orazio

StatMatch: Statistical Matching or Data Fusion

Integration of two data sources that refer to the same target population and share a number of variables. Some functions can also be used to impute missing values in data sets through hot deck imputation methods. Methods for statistical matching of data from complex sample surveys are also available.

Authors: Marcello D'Orazio [aut, cre]

StatMatch_1.4.3.tar.gz
StatMatch_1.4.3.tar.gz (r-4.5-noble) | StatMatch_1.4.3.tar.gz (r-4.4-noble)
StatMatch_1.4.3.tgz (r-4.4-emscripten) | StatMatch_1.4.3.tgz (r-4.3-emscripten)
StatMatch.pdf | StatMatch.html
StatMatch/json (API)
NEWS

# Install 'StatMatch' in R:
install.packages('StatMatch', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/marcellodo/statmatch/issues

Datasets:
  • samp.A - Artificial data set resembling EU-SILC survey
  • samp.B - Artificial data set resembling EU-SILC survey
  • samp.C - Artificial data set resembling EU-SILC survey
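
A common first step in statistical matching is to check which variables the two samples share. The sketch below loads the bundled data sets and lists the column names that samp.A and samp.B have in common; it assumes StatMatch is installed and otherwise uses only base R functions. The object names `common.vars`, `only.A` and `only.B` are illustrative, not part of the package.

```r
# Sketch: load the bundled samples and list the variables they share.
library(StatMatch)

data(samp.A)
data(samp.B)
data(samp.C)

# Columns present in both samp.A and samp.B are candidate matching variables.
common.vars <- intersect(names(samp.A), names(samp.B))
print(common.vars)

# Columns observed in only one of the two samples.
only.A <- setdiff(names(samp.A), names(samp.B))
only.B <- setdiff(names(samp.B), names(samp.A))
print(only.A)
print(only.B)
```

The variables found in only one sample are the ones a matching procedure would try to bring together in a single fused data set.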

4.03 score | 10 packages | 1.8k downloads | 11 mentions | 25 exports | 41 dependencies

Last updated 2 months ago from: 3cfab59911. Checks: 3 OK. Indexed: yes.

Target | Result | Latest binary
Doc / Vignettes | OK | Mar 10 2025
R-4.5-linux | OK | Mar 10 2025
R-4.4-linux | OK | Mar 10 2025

Exports: comb.samples, comp.cont, comp.prop, create.fused, create.imputed, fact2dummy, Fbounds.pred, Fbwidths.by.x, Frechet.bounds.cat, gower.dist, harmonize.x, mahalanobis.dist, maximum.dist, mixed.mtc, NND.hotdeck, pBayes, plotBounds, plotCont, plotTab, pw.assoc, RANDwNND.hotdeck, rankNND.hotdeck, rho.bounds, rho.bounds.pred, selMtc.by.unc

Dependencies: cli, colorspace, DBI, dplyr, fansi, farver, generics, ggplot2, glue, gtable, isoband, labeling, lattice, lifecycle, lpSolve, magrittr, MASS, Matrix, mgcv, minqa, mitools, munsell, nlme, numDeriv, pillar, pkgconfig, proxy, R6, RColorBrewer, Rcpp, RcppArmadillo, rlang, scales, survey, survival, tibble, tidyselect, utf8, vctrs, viridisLite, withr

Readme and manuals

Help Manual

Help page | Topics
Statistical Matching or Data Fusion | StatMatch-package, StatMatch
Statistical matching of data from complex sample surveys | comb.samples
Compares two distributions of the same continuous variable | comp.cont
Compares two distributions of the same categorical variable | comp.prop
Creates a matched (synthetic) dataset | create.fused
Fills in missing values in the recipient dataset with values observed on the donor units | create.imputed
Transforms a categorical variable into a set of dummy variables | fact2dummy
Estimates Frechet bounds for cells in the contingency table crossing two categorical variables observed in distinct samples referring to the same target population | Fbounds.pred
Computes the Frechet bounds of cells in a contingency table by considering all possible subsets of the common variables | Fbwidths.by.x
Frechet bounds of cells in a contingency table | Frechet.bounds.cat
Computes Gower's distance | gower.dist
Harmonizes the marginal (joint) distribution of a set of variables observed independently in two sample surveys referring to the same target population | harmonize.x
Computes the Mahalanobis distance | mahalanobis.dist
Computes the maximum distance | maximum.dist
Statistical matching via mixed methods | mixed.mtc
Distance hot deck method | NND.hotdeck
Pseudo-Bayes estimates of cell probabilities | pBayes
Graphical representation of the uncertainty bounds estimated through the 'Frechet.bounds.cat' function | plotBounds
Graphical comparison of the estimated distributions for the same continuous variable | plotCont
Graphical comparison of the estimated distributions for the same categorical variable | plotTab
Pairwise measures between categorical variables | pw.assoc
Random distance hot deck | RANDwNND.hotdeck
Rank distance hot deck method | rankNND.hotdeck
Estimates plausible values of Pearson's correlation coefficient between two variables observed in distinct samples referring to the same target population | rho.bounds
Estimates plausible values of Pearson's correlation coefficient between two variables observed in distinct samples referring to the same target population | rho.bounds.pred
Artificial data set resembling EU-SILC survey | samp.A
Artificial data set resembling EU-SILC survey | samp.B
Artificial data set resembling EU-SILC survey | samp.C
Identifies the best combination of matching variables for reducing uncertainty in estimating the contingency table Y vs. Z | selMtc.by.unc
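
As a rough illustration of how the hot deck and fusion topics listed above fit together, the sketch below uses samp.A as the recipient and samp.B as the donor, finds the nearest donor for each recipient with `NND.hotdeck`, and then attaches the donated variable with `create.fused`. The column names used here (`area5`, `sex`, `age`, `labour5`) are assumptions about the bundled data sets, so the code checks that they exist before matching; the choice of matching variables is arbitrary and only for demonstration.

```r
# Sketch: nearest neighbour distance hot deck, then build the fused file.
library(StatMatch)

data(samp.A)  # recipient sample
data(samp.B)  # donor sample

# Assumed column names; adjust if your copy of the data differs.
group.v <- c("area5", "sex")  # donation classes (donors sought within class)
x.mtc   <- "age"              # matching variable used to compute distances
z.var   <- "labour5"          # variable to donate from samp.B to samp.A

stopifnot(all(c(group.v, x.mtc) %in% names(samp.A)),
          all(c(group.v, x.mtc, z.var) %in% names(samp.B)))

# For each recipient record, find the closest donor within its donation class.
out.nnd <- NND.hotdeck(data.rec = samp.A, data.don = samp.B,
                       match.vars = x.mtc, don.class = group.v)

# Append the donated variable to the recipient records.
fused <- create.fused(data.rec = samp.A, data.don = samp.B,
                      mtc.ids = out.nnd$mtc.ids, z.vars = z.var)

print(dim(fused))
print(head(fused[, c(group.v, x.mtc, z.var)]))
```

The fused data frame keeps one row per recipient record, with the donated variable added as an extra column. `RANDwNND.hotdeck` and `rankNND.hotdeck` are alternative ways of selecting donors; see their help pages for the arguments they accept.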