“A Bit-Vector Algorithm for Computing Levenshtein and Damerau Edit Distances”, Heikki Hyyrö2002 ()⁠:

The edit distance between strings A and B is defined as the minimum number of edit operations needed in converting A into B or vice versa. The Levenshtein edit distance allows 3 types of operations: an insertion, a deletion or a substitution of a character. The Damerau edit distance allows the previous 3 plus a transposition between two adjacent characters.

To our best knowledge the best current practical algorithms for computing these edit distances run in time 𝒪(dm) and 𝒪(⌈m/w⌉(n + σ)), where d is the edit distance between the two strings, m and n are their lengths (mn), w is the computer word size and σ is the size of the alphabet.

In this paper we present an algorithm that runs in time 𝒪(⌈d/wm + ⌈n/wσ), or 𝒪(⌈d/wn + ⌈m/wσ). The structure of the algorithm is such, that in practice it is mostly suitable for testing whether the edit distance between two strings is within some pre-determined error threshold.

We also present some initial test results with thresholded edit distance computation. In them our algorithm works faster than the original algorithm of Myers1999.

[Keywords: Levenshtein edit distance, Damerau edit distance, bit-parallelism, approximate string matching, dynamic programming]