# --------------------------------------------
# CITATION file created with {cffr} R package
# See also: https://docs.ropensci.org/cffr/
# --------------------------------------------

cff-version: 1.2.0
message: 'To cite package "pickmax" in publications use:'
type: software
license: GPL-3.0-only
title: 'pickmax: Split and Coalesce Duplicated Records'
version: 0.1.0
doi: 10.32614/CRAN.package.pickmax
abstract: Deduplicates datasets by retaining the most complete and informative records. Identifies duplicated entries based on a specified key column, calculates completeness scores for each row, and compares values within groups. When differences between duplicates exceed a user-defined threshold, records are split into unique IDs; otherwise, they are coalesced into a single, most complete entry. Returns a list containing the original duplicates, the split entries, and the final coalesced dataset. Useful for cleaning survey or administrative data where duplicated IDs may reflect minor data entry inconsistencies.
authors:
- family-names: Chamane
  given-names: Sbonelo
  email: SChamane@hsrc.ac.za
  orcid: https://orcid.org/0000-0001-5350-5203
- family-names: Mabaso
  given-names: Musawenkosi
- family-names: Sewpaul
  given-names: Ronel
- family-names: Jooste
  given-names: Sean
- family-names: Skhosana
  given-names: Kutloano
- family-names: Zuma
  given-names: Khangelani
repository: https://cran.r-universe.dev
commit: 2df14f241ea011b3776e96265415f6b763ccb95c
date-released: '2025-07-15'
contact:
- family-names: Chamane
  given-names: Sbonelo
  email: SChamane@hsrc.ac.za
  orcid: https://orcid.org/0000-0001-5350-5203