clj-fuzzy
clj-fuzzy is a native Clojure library providing a collection of well-known algorithms for fuzzy string matching and phonetic encoding.
It can be used in Clojure, ClojureScript, client-side JavaScript and Node.js.
Available algorithms
clj-fuzzy ships three kinds of algorithms:
- Distance metrics for strings and other sequences.
- Stemmers, which try to reduce a given word to its stem (root form).
- Phonetic algorithms, which return an approximate phonetic representation of a given string.
Metrics
- Sorensen / Dice coefficient
- Levenshtein distance
- Hamming distance
- Jaccard / Tanimoto distance
- Jaro-Winkler distance
- MRA comparison
- Tversky index
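To make the idea of a distance metric concrete, here is a minimal, self-contained sketch of two of the metrics listed above, written in plain JavaScript so that it runs under Node.js without any dependency. It does not use or reproduce clj-fuzzy's API: the function names `levenshtein` and `hamming` are chosen for illustration only, and the library's own implementations may differ in details such as argument types or error handling.

```javascript
// Illustrative sketch only: NOT clj-fuzzy's implementation or API.

// Levenshtein distance: the minimum number of single-character insertions,
// deletions and substitutions needed to turn `a` into `b`.
// Classic dynamic-programming recurrence, keeping one row in memory.
function levenshtein(a, b) {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;

  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);

  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from the first i chars of a to the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hamming distance: the number of positions at which two equal-length
// strings differ. Throwing on unequal lengths is a choice made for this sketch.
function hamming(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("hamming: inputs must have the same length");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}

console.log(levenshtein("kitten", "sitting")); // 3
console.log(hamming("karolin", "kathrin"));    // 3
```

Levenshtein accepts strings of different lengths, whereas Hamming is only defined for sequences of equal length, which is why the two are usually offered side by side.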
Stemmers
- Lancaster stemmer (en)
- Lovins stemmer (en)
- Porter stemmer (en)
- Schinke stemmer (Latin)
Phonetics
- Metaphone
- Double Metaphone
- Soundex
- NYSIIS (original & refined)
- Caverphone (original & revisited)
- Cologne Phonetic
- MRA codex
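As an illustration of what a phonetic algorithm produces, the following is a minimal sketch of standard American Soundex in plain JavaScript. It is an assumption-laden example, not clj-fuzzy's code: the `soundex` function and the `CODES` table are hypothetical names used here, and the library's Soundex may handle edge cases (empty input, non-ASCII letters) differently.

```javascript
// Illustrative sketch only: NOT clj-fuzzy's implementation or API.
// Standard American Soundex: keep the first letter, encode the following
// consonants as digits, collapse adjacent duplicates, pad to 4 characters.

// Build the letter -> digit table from the usual Soundex consonant groups.
const CODES = {};
[["bfpv", "1"], ["cgjkqsxz", "2"], ["dt", "3"], ["l", "4"], ["mn", "5"], ["r", "6"]]
  .forEach(([letters, digit]) => {
    for (const ch of letters) CODES[ch] = digit;
  });

function soundex(word) {
  // Keep ASCII letters only (a simplification made for this sketch).
  const letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  let result = letters[0].toUpperCase();
  // The first letter's own code takes part in duplicate collapsing.
  let last = CODES[letters[0]] || "";

  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters[i];
    // 'h' and 'w' are transparent: they do not separate equal codes.
    if (ch === "h" || ch === "w") continue;
    const code = CODES[ch] || ""; // vowels and 'y' have no code
    if (code !== "" && code !== last) result += code;
    last = code; // a vowel resets `last`, so a repeated code is kept after it
  }
  return result.padEnd(4, "0");
}

console.log(soundex("Robert"));   // R163
console.log(soundex("Rupert"));   // R163
console.log(soundex("Ashcraft")); // A261
```

Names that sound alike, such as "Robert" and "Rupert", map to the same code, which is what makes phonetic keys useful for matching misspelled or variant names.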
Installation & Usage
Contribution
Please feel free to contribute by forking this repo. Just be sure to add relevant unit tests and pass them all before submitting any code.
License
MIT