Package: pickmax 0.1.0

Sbonelo Chamane

pickmax: Split and Coalesce Duplicated Records

Deduplicates datasets by retaining the most complete and informative records. Identifies duplicated entries based on a specified key column, calculates completeness scores for each row, and compares values within groups. When differences between duplicates exceed a user-defined threshold, records are split into unique IDs; otherwise, they are coalesced into a single, most complete entry. Returns a list containing the original duplicates, the split entries, and the final coalesced dataset. Useful for cleaning survey or administrative data where duplicated IDs may reflect minor data entry inconsistencies.
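
The description above can be illustrated with a minimal base R sketch. This is not the pickmax implementation and does not use its API: the function name dedupe_sketch, the max_diff argument, the "_<n>" ID suffix, and the sample data are all hypothetical, chosen only to show the score, compare, then split-or-coalesce idea.

```r
# Illustrative sketch only; NOT the pickmax implementation or its API.
# dedupe_sketch, max_diff and the "_<n>" suffix are hypothetical names.
dedupe_sketch <- function(df, key, max_diff = 1) {
  df[[key]] <- as.character(df[[key]])
  value_cols <- setdiff(names(df), key)
  # Completeness score: number of non-missing values in each row.
  df$.score <- rowSums(!is.na(df[, value_cols, drop = FALSE]))

  out <- lapply(split(df, df[[key]]), function(g) {
    g <- g[order(-g$.score), , drop = FALSE]  # most complete row first
    best <- g[1, , drop = FALSE]
    extra <- list()
    for (i in seq_len(nrow(g))[-1]) {
      row <- g[i, , drop = FALSE]
      # Count columns where both rows have a value and the values disagree.
      n_diff <- sum(vapply(value_cols, function(col) {
        a <- best[[col]]
        b <- row[[col]]
        !is.na(a) && !is.na(b) && a != b
      }, logical(1)))
      if (n_diff > max_diff) {
        # Too different: keep as a separate record under a new ID.
        row[[key]] <- paste0(row[[key]], "_", i)
        extra[[length(extra) + 1]] <- row
      } else {
        # Similar enough: fill gaps in the best row from this row.
        for (col in value_cols) {
          if (is.na(best[[col]])) best[[col]] <- row[[col]]
        }
      }
    }
    do.call(rbind, c(list(best), extra))
  })

  res <- do.call(rbind, out)
  res$.score <- NULL
  rownames(res) <- NULL
  res
}

# Made-up sample data: ID "A" has two compatible rows, ID "B" two conflicting ones.
records <- data.frame(
  id   = c("A", "A", "B", "B", "C"),
  age  = c(34, 34, 51, 27, 40),
  sex  = c("F", NA, "M", "F", "M"),
  town = c(NA, "Durban", "Pretoria", "Cape Town", NA)
)
result <- dedupe_sketch(records, key = "id", max_diff = 1)
print(result)
```

In this sketch the two "A" rows never disagree on a shared value, so they are merged into one row with the missing town filled in. The two "B" rows disagree on more than max_diff columns, so the second one is kept under a new ID. The real package may score, compare, and rename records differently; consult its manual for the actual arguments and return value.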

Authors: Sbonelo Chamane [aut, cre], Musawenkosi Mabaso [aut], Ronel Sewpaul [aut], Sean Jooste [aut], Kutloano Skhosana [aut], Khangelani Zuma [aut]

Source: pickmax_0.1.0.tar.gz
Binaries:
pickmax_0.1.0.tar.gz (r-4.7-any)
pickmax_0.1.0.tar.gz (r-4.6-any)
pickmax_0.1.0.tgz (r-4.6-emscripten)

# Install 'pickmax' in R:
install.packages('pickmax', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

Score: 1.00, 170 downloads, 1 export, 15 dependencies

Last updated from: 2df14f241e. Checks: 4 OK. Indexed: yes.

Target                Result  Time
linux-devel-x86_64    OK      107
source / vignettes    OK      134
linux-release-x86_64  OK      120
wasm-release          OK      97

Exports: pickmax

Dependencies: cli, dplyr, generics, glue, lifecycle, magrittr, pillar, pkgconfig, R6, rlang, tibble, tidyselect, utf8, vctrs, withr