djm's scribble

py-editdist

written by djm, on Jul 6, 2006 12:31:00 AM.

I just wrote and released py-editdist, a fast CPython module that calculates the Levenshtein edit distance between two strings. Wikipedia has a good description of the algorithm and quite a few sample implementations. I looked at using the Python one from there, but it was too slow: it took around ten milliseconds to calculate the distance between two ~50 character strings, which is a problem because I need to do it about 10 million times per day (that works out to roughly 28 hours of CPU time per day) :) py-editdist does the same calculation in ~70 microseconds, or about 12 minutes per day, which is at least in the realm of usable…
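
For readers who haven't seen the algorithm, here is a minimal pure-Python sketch of the usual dynamic-programming approach. It is neither the Wikipedia sample verbatim nor py-editdist's code or API; the function name `levenshtein` is just an illustrative choice, and I am assuming unit cost for insertions, deletions and substitutions.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance between a and b (unit-cost edits assumed).

    Only two rows of the dynamic-programming table are kept in memory.
    """
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the row short

    # Distance from the empty prefix of a to each prefix of b.
    previous = list(range(len(b) + 1))

    for i, ca in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current

    return previous[-1]
```

The inner loop runs once per pair of characters, so two ~50 character strings need about 2,500 iterations of interpreted Python per call. That per-cell interpreter overhead is the kind of cost that moving the loop into C avoids. If you want numbers for your own machine, the standard `timeit` module is enough to measure a function like this.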
