fuzzystrmatch
fuzzystrmatch : determine similarities and distance between strings
Overview
| ID | Extension | Package | Version | Category | License | Language |
|---|---|---|---|---|---|---|
| 2380 | fuzzystrmatch
|
fuzzystrmatch
|
1.2 |
FTS
|
PostgreSQL
|
C
|
| Attribute | Has Binary | Has Library | Need Load | Has DDL | Relocatable | Trusted |
|---|---|---|---|---|---|---|
--s-dt-
|
No
|
Yes
|
No
|
Yes
|
no
|
yes
|
| Relationships | |
|---|---|
| Need By | postgis_tiger_geocoder
|
| See Also | pg_similarity
smlar
pg_trgm
unaccent
pg_bigm
citext
btree_gist
btree_gin
|
Packages
| PG18 | PG17 | PG16 | PG15 | PG14 |
|---|---|---|---|---|
1.2
|
1.2
|
1.2
|
1.2
|
1.2
|
This is a built-in contrib extension ship with the PostgreSQL kernel
Install
Create this extension with:
CREATE EXTENSION fuzzystrmatch;Usage
fuzzystrmatch: Determine similarities and distance between strings
The fuzzystrmatch module provides functions to determine similarities and distance between strings.
CREATE EXTENSION fuzzystrmatch;Soundex
Converts a string to its Soundex code (useful for matching similar-sounding names):
SELECT soundex('Anne'), soundex('Ann'), difference('Anne', 'Ann');
-- A500, A500, 4
SELECT soundex('Anne'), soundex('Andrew'), difference('Anne', 'Andrew');
-- A500, A536, 2
SELECT soundex('Anne'), soundex('Margaret'), difference('Anne', 'Margaret');
-- A500, M626, 0The difference function returns 0–4, where 4 means most similar.
Daitch-Mokotoff Soundex
Returns a set of Daitch-Mokotoff soundex codes (better for non-English names):
SELECT daitch_mokotoff('George');
-- {595000}
SELECT daitch_mokotoff('John');
-- {160000,460000}
-- Find names sounding like 'Schwartzenegger'
SELECT * FROM s WHERE daitch_mokotoff(nm) && daitch_mokotoff('Schwartzenegger');Supports indexing with GIN:
CREATE INDEX ON s USING gin (daitch_mokotoff(nm) gin__int_ops);Levenshtein Distance
Computes edit distance between two strings (insertions, deletions, substitutions):
SELECT levenshtein('GUMBO', 'GAMBOL');
-- 2
SELECT levenshtein('GUMBO', 'GAMBOL', 2, 1, 1);
-- 3 (custom costs: insert=2, delete=1, substitute=1)
-- Bounded version (faster, stops early)
SELECT levenshtein_less_equal('extensive', 'exhaustive', 2);
-- 3 (actual distance exceeds threshold, returns actual)
SELECT levenshtein_less_equal('extensive', 'exhaustive', 4);
-- 4Metaphone
Returns a metaphone code for a string:
SELECT metaphone('GUMBO', 4);
-- KMDouble Metaphone
Returns primary and alternate codes (handles more name variations):
SELECT dmetaphone('gumbo');
-- KMP
SELECT dmetaphone_alt('gumbo');
-- KMP