Package: zoomerjoin 0.1.5
zoomerjoin: Superlatively Fast Fuzzy Joins
Empowers users to fuzzily merge data frames with millions or tens of millions of rows in minutes with low memory usage. The package uses the locality-sensitive hashing algorithms developed by Datar, Immorlica, Indyk and Mirrokni (2004) <doi:10.1145/997817.997857> and Broder (1998) <doi:10.1109/SEQUEN.1997.666900> to avoid comparing every pair of records across the two datasets, so that fuzzy merges finish in linear time.
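To illustrate the idea in the description, the following is a minimal, language-neutral sketch of MinHash-based locality-sensitive hashing for a Jaccard fuzzy join. It is written in Python purely for illustration and is not the package's implementation or API: every function name (`shingles`, `minhash_signature`, `candidate_pairs`, `fuzzy_inner_join`) and every parameter and default value here is hypothetical. Records are hashed into buckets band by band, only records that share a bucket are compared, and the exact Jaccard similarity then filters out false candidates.

```python
import hashlib
from collections import defaultdict


def shingles(text, width=2):
    """Return the set of lower-cased character n-grams of a string."""
    text = text.lower()
    return {text[i:i + width] for i in range(len(text) - width + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, n_hashes):
    """MinHash signature: the minimum hash value under each seeded hash."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(n_hashes))


def _band_keys(text, n_bands, band_width, width):
    """Split a record's signature into bands; each band is one bucket key."""
    tokens = shingles(text, width)
    if not tokens:  # strings shorter than the n-gram width cannot be hashed
        return []
    sig = minhash_signature(tokens, n_bands * band_width)
    return [(b, sig[b * band_width:(b + 1) * band_width]) for b in range(n_bands)]


def candidate_pairs(left, right, n_bands=20, band_width=2, width=2):
    """Index pairs (i, j) whose records share at least one band bucket."""
    buckets = defaultdict(list)
    for i, text in enumerate(left):
        for key in _band_keys(text, n_bands, band_width, width):
            buckets[key].append(i)
    pairs = set()
    for j, text in enumerate(right):
        for key in _band_keys(text, n_bands, band_width, width):
            for i in buckets.get(key, ()):
                pairs.add((i, j))
    return pairs


def fuzzy_inner_join(left, right, threshold=0.6, n_bands=20, band_width=2, width=2):
    """Keep candidate pairs whose exact Jaccard similarity meets the threshold."""
    matches = []
    for i, j in sorted(candidate_pairs(left, right, n_bands, band_width, width)):
        if jaccard(shingles(left[i], width), shingles(right[j], width)) >= threshold:
            matches.append((left[i], right[j]))
    return matches


if __name__ == "__main__":
    left = ["Acme Corporation", "Globex Inc"]
    right = ["Acme Corporaton", "Initech LLC"]
    print(fuzzy_inner_join(left, right))
```

In this sketch, two records with Jaccard similarity s become candidates with probability 1 - (1 - s^band_width)^n_bands, so more bands raise recall and wider bands reduce spurious comparisons. The work grows with the number of records plus the number of bucket collisions rather than with the number of all possible pairs.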
Authors:
zoomerjoin_0.1.5.tar.gz
zoomerjoin_0.1.5.tar.gz (r-4.5-noble) | zoomerjoin_0.1.5.tar.gz (r-4.4-noble)
zoomerjoin.pdf | zoomerjoin.html
zoomerjoin/json (API)
NEWS
# Install zoomerjoin in R:
install.packages('zoomerjoin', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/beniaminogreen/zoomerjoin/issues
Datasets:
- dime_data - Donors from DIME Database
Last updated 3 days ago from: ef0bae5e05
Exports: em_link, euclidean_anti_join, euclidean_full_join, euclidean_inner_join, euclidean_left_join, euclidean_probability, euclidean_right_join, hamming_anti_join, hamming_distance, hamming_full_join, hamming_inner_join, hamming_left_join, hamming_probability, hamming_right_join, jaccard_anti_join, jaccard_curve, jaccard_full_join, jaccard_hyper_grid_search, jaccard_inner_join, jaccard_left_join, jaccard_probability, jaccard_right_join, jaccard_similarity, jaccard_string_group
Dependencies: cli, collapse, cpp11, dplyr, fansi, generics, glue, lifecycle, magrittr, pillar, pkgconfig, purrr, R6, Rcpp, rlang, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr
A Zoomerjoin Guided Tour
Rendered from guided_tour.Rmd using knitr::rmarkdown on Jul 03 2024. Last update: 2024-07-03. Started: 2024-02-01
Benchmarks
Rendered from benchmarks.Rmd using knitr::rmarkdown on Jul 03 2024. Last update: 2024-07-03. Started: 2024-02-01
Matching Vectors Based on Euclidean Distance
Rendered from matching_vectors.Rmd using knitr::rmarkdown on Jul 03 2024. Last update: 2024-07-03. Started: 2024-02-01