Package: stringdist 0.9.14
stringdist: Approximate String Matching, Fuzzy Text Search, and String Distance Functions
Implements an approximate string matching version of R's native 'match' function. Also offers fuzzy text search based on various string distance measures. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors while taking proper care of encoding, or between integer vectors representing generic sequences. This package is built for speed and runs in parallel by using 'OpenMP'. An API for C or C++ is exposed as well. Reference: MPJ van der Loo (2014) <doi:10.32614/RJ-2014-011>.
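As a rough illustration of the distance families named above, the following sketch calls `stringdist()` and `stringsim()` with a few `method` values. The input strings are arbitrary examples chosen for this page, not taken from the package documentation, and the comments give the values these standard metrics are expected to produce.

```r
library(stringdist)

# Levenshtein: substitutions, insertions and deletions only
d_lv <- stringdist("kitten", "sitting", method = "lv")    # 3

# Optimal string alignment vs. full Damerau-Levenshtein:
# both allow transpositions, but osa edits each substring at most once
d_osa <- stringdist("ca", "abc", method = "osa")          # 3
d_dl  <- stringdist("ca", "abc", method = "dl")           # 2

# Hamming: substitutions only, for strings of equal length
d_ham <- stringdist("abc", "abd", method = "hamming")     # 1

# Jaro (p = 0, the default) and Jaro-Winkler (p > 0, prefix bonus)
d_jaro <- stringdist("MARTHA", "MARHTA", method = "jw")
d_jw   <- stringdist("MARTHA", "MARHTA", method = "jw", p = 0.1)

# Similarity in [0, 1]; for Levenshtein this is 1 - distance / longest length
s_lv <- stringsim("kitten", "sitting", method = "lv")
```

The Jaro-Winkler call should give a smaller distance than the plain Jaro call here, because the two strings share the prefix "MAR".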
Authors:
stringdist_0.9.14.tar.gz
stringdist_0.9.14.tar.gz (r-4.5-noble), stringdist_0.9.14.tar.gz (r-4.4-noble)
stringdist_0.9.14.tgz (r-4.4-emscripten), stringdist_0.9.14.tgz (r-4.3-emscripten)
stringdist.pdf | stringdist.html
stringdist/json (API)
NEWS
# Install 'stringdist' in R:
install.packages('stringdist', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/markvanderloo/stringdist/issues
Last updated 21 days ago from: 37a6f2948b. Checks: OK: 2. Indexed: no.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Dec 10 2024 |
R-4.5-linux-x86_64 | OK | Dec 10 2024 |
Exports: afind, ain, amatch, extract, grab, grabl, phonetic, printable_ascii, qgrams, seq_ain, seq_amatch, seq_dist, seq_distmatrix, seq_qgrams, seq_sim, stringdist, stringdistmatrix, stringsim, stringsimmatrix
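A minimal sketch of how a few of the exported matching and search functions might be used together. The lookup table and texts are illustrative inputs only, and all calls rely on the default `method` (optimal string alignment).

```r
library(stringdist)

lookup <- c("uhura", "leela")

# amatch() is the approximate counterpart of match():
# it returns the index of the closest entry within maxDist, or NA
idx  <- amatch("leia", lookup, maxDist = 2)   # 2 ("leela" is 2 edits away)
none <- amatch("leia", lookup, maxDist = 1)   # NA (nothing within 1 edit)

# ain() is the approximate counterpart of %in%
present <- ain("leia", lookup, maxDist = 2)   # TRUE

# grabl() is a fuzzy analogue of grepl(): it slides a window over each
# text and reports whether any window is within maxDist of the pattern
texts <- c("When I grow up, I want to be",
           "I want to be a fisherman")
hits <- grabl(texts, "grew", maxDist = 1)
```

Here `hits` should flag the first text, since "grow" differs from "grew" by a single substitution.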
Dependencies:
Readme and manuals
Help Manual
Help page | Topics |
---|---|
A package for string distance calculation and approximate string matching. | stringdist-package |
Stringdist-based fuzzy text search | afind extract grab grabl |
Approximate string matching | ain amatch |
Phonetic algorithms | phonetic |
Detect the presence of non-printable or non-ascii characters | printable_ascii |
Get a table of qgram counts from one or more character vectors. | qgrams |
Approximate matching for integer sequences. | seq_ain seq_amatch |
Compute distance metrics between integer sequences | seq_dist seq_distmatrix |
Get a table of qgram counts for integer sequences | seq_qgrams |
Compute similarity scores between sequences of integers | seq_sim |
Compute distance metrics between strings | stringdist stringdistmatrix |
Calling stringdist from 'C' or 'C++' | stringdist_api |
String metrics in 'stringdist' | stringdist-encoding |
String metrics in 'stringdist' | stringdist-metrics |
Multithreading and parallelization in 'stringdist' | stringdist-parallelization |
Compute similarity scores between strings | stringsim stringsimmatrix |
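The help topics above also cover integer sequences, q-gram tables, phonetic codes and multithreading. The sketch below touches each briefly. The inputs are made up for illustration, and `nthread = 1` is used only to show where the thread count is set, not as a recommended value.

```r
library(stringdist)

# Distances between integer sequences (lists of integer vectors)
a <- list(c(1L, 2L, 3L))
b <- list(c(1L, 3L, 2L))
seq_osa <- seq_dist(a, b, method = "osa")   # 1: one transposition
seq_lv  <- seq_dist(a, b, method = "lv")    # 2: two substitutions

# Table of q-gram counts; columns are named by q-gram
qg <- qgrams("abab", q = 2)                 # "ab" occurs twice, "ba" once

# Soundex codes: similar-sounding names map to the same code
codes <- phonetic(c("Robert", "Rupert"))

# Pairwise distance matrix, rows for the first argument and
# columns for the second, with the number of threads set explicitly
m <- stringdistmatrix(c("foo", "bar"), c("fu", "baz", "boo"),
                      method = "lv", nthread = 1)
```

When `nthread` is omitted, the package chooses a default, as described under stringdist-parallelization.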