fuzzystring: Fast Fuzzy String Joins for Data Frames

Perform fuzzy joins on data frames using approximate string matching. Implements all standard join types (inner, left, right, full, semi, anti) with support for multiple string distance metrics from the 'stringdist' package including Levenshtein, Damerau-Levenshtein, Jaro-Winkler, and Soundex. Features a high-performance 'data.table' backend with 'C++' row binding for efficient processing of large datasets. Ideal for matching misspellings, inconsistent labels, messy user input, or reconciling datasets with slight variations in identifiers. Optionally returns distance metrics alongside matched records.
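To make the idea of a fuzzy join concrete, the following is a minimal sketch in base R only. It does not use or reproduce this package's API: it relies on `utils::adist()` (generalized Levenshtein distance) instead of 'stringdist', and the data, column names, and the `max_dist` threshold are all assumptions chosen for illustration. It shows an inner and a left join on approximately equal keys, with the distance returned alongside the matched records.

```r
# Two small tables whose keys differ by typos (illustrative data).
x <- data.frame(name = c("apple", "banana", "cherry"), qty = c(3, 5, 7))
y <- data.frame(label = c("aple", "bananna", "grape"), price = c(1.2, 0.5, 2.0))

# Levenshtein distance between every pair of keys (rows: x, columns: y).
d <- utils::adist(x$name, y$label)

# Keep pairs within the threshold; max_dist is a hypothetical name.
max_dist <- 2
hits <- which(d <= max_dist, arr.ind = TRUE)

# Inner join: bind the matched rows and keep the distance next to them.
inner <- cbind(x[hits[, "row"], , drop = FALSE],
               y[hits[, "col"], , drop = FALSE],
               dist = d[hits])
rownames(inner) <- NULL

# Left join: append the rows of x that matched nothing, padded with NA.
unmatched <- setdiff(seq_len(nrow(x)), hits[, "row"])
pad <- cbind(x[unmatched, , drop = FALSE],
             y[rep(NA_integer_, length(unmatched)), , drop = FALSE],
             dist = rep(NA_real_, length(unmatched)))
left <- rbind(inner, pad)
rownames(left) <- NULL

print(left)
```

This sketch computes the full distance matrix, so its cost grows with the product of the two table sizes; it is meant only to illustrate the join semantics, not the performance characteristics described above.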

Version: 0.0.1
Depends: R (≥ 4.1)
Imports: data.table, Rcpp, stringdist
LinkingTo: Rcpp
Suggests: dplyr, ggplot2, knitr, qdapDictionaries, readr, rmarkdown, rvest, stringr, testthat (≥ 3.0.0), tidyr
Published: 2026-02-08
DOI: 10.32614/CRAN.package.fuzzystring
Author: Paul E. Santos Andrade [aut, cre], David Robinson [ctb] (aut of fuzzyjoin)
Maintainer: Paul E. Santos Andrade
BugReports: https://github.com/PaulESantos/fuzzystring/issues
License: MIT + file LICENSE
URL: https://github.com/PaulESantos/fuzzystring, https://paulesantos.github.io/fuzzystring/
NeedsCompilation: yes
Materials: README
CRAN checks: fuzzystring results

Linking:

Please use the canonical form https://CRAN.R-project.org/package=fuzzystring to link to this page.