public class LevenshteinAutomata
extends java.lang.Object
Implements the algorithm described in Schulz and Mihov, "Fast String Correction with Levenshtein Automata".
Modifier and Type | Class and Description |
---|---|
(package private) static class | LevenshteinAutomata.ParametricDescription - A ParametricDescription describes the structure of a Levenshtein DFA for some degree n. |
Modifier and Type | Field and Description |
---|---|
(package private) int[] | alphabet |
(package private) int | alphaMax |
(package private) LevenshteinAutomata.ParametricDescription[] | descriptions |
static int | MAXIMUM_SUPPORTED_DISTANCE - Maximum edit distance this class can generate an automaton for. |
(package private) int | numRanges |
(package private) int[] | rangeLower |
(package private) int[] | rangeUpper |
(package private) int[] | word |
Constructor and Description |
---|
LevenshteinAutomata(int[] word, int alphaMax, boolean withTranspositions) - Expert: specify a custom maximum possible symbol (alphaMax); default is Character.MAX_CODE_POINT. |
LevenshteinAutomata(java.lang.String input, boolean withTranspositions) - Create a new LevenshteinAutomata for some input String. |
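The two constructors differ in how the word is supplied: as a String, or as an int[] of symbols together with alphaMax. The following is a minimal sketch of the String-to-int[] step, assuming the private codePoints helper yields one int per Unicode code point. The class CodePointSketch and its method toCodePoints are hypothetical names used only for illustration and are not part of this API.

```java
// Hypothetical illustration; not part of LevenshteinAutomata.
final class CodePointSketch {
    // Assumption: the input String is turned into one int per Unicode code
    // point, so a supplementary character counts as a single symbol.
    static int[] toCodePoints(String input) {
        return input.codePoints().toArray();
    }

    public static void main(String[] args) {
        String input = "a\uD83D\uDE00b"; // 'a', U+1F600, 'b'
        int[] word = toCodePoints(input);
        System.out.println("UTF-16 code units: " + input.length());
        System.out.println("code points:       " + word.length);
        // Documented default for alphaMax when the String constructor is used.
        System.out.println("default alphaMax:  " + Character.MAX_CODE_POINT);
    }
}
```

Passing an int[] directly may be useful when the symbols are not Unicode text, in which case a smaller alphaMax bounds the symbol range.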
Modifier and Type | Method and Description |
---|---|
private static int[] | codePoints(java.lang.String input) |
(package private) int | getVector(int x, int pos, int end) - Get the characteristic vector X(x, V) where V is substring(pos, end). |
Automaton | toAutomaton(int n) - Compute a DFA that accepts all strings within an edit distance of n. |
Automaton | toAutomaton(int n, java.lang.String prefix) - Compute a DFA that accepts all strings within an edit distance of n, matching the specified exact prefix. |
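To make "within an edit distance of n" concrete, the sketch below checks candidates one at a time with a plain dynamic-programming table. It is a reference check, not the automaton construction this class performs. It assumes that withTranspositions means a swap of two adjacent symbols counts as one edit (the optimal string alignment variant), and that the prefix form matches the prefix exactly and applies the distance bound only to the remainder. EditDistanceSketch and its methods are hypothetical names.

```java
// Hypothetical reference check; not part of LevenshteinAutomata.
final class EditDistanceSketch {
    // Edit distance over code points. With withTranspositions set, swapping
    // two adjacent symbols costs one edit; without it, the same swap needs two.
    static int distance(int[] a, int[] b, boolean withTranspositions) {
        int[][] d = new int[a.length + 1][b.length + 1];
        for (int i = 0; i <= a.length; i++) d[i][0] = i; // delete all of a
        for (int j = 0; j <= b.length; j++) d[0][j] = j; // insert all of b
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int best = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                    d[i - 1][j - 1] + cost);
                if (withTranspositions && i > 1 && j > 1
                        && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                    best = Math.min(best, d[i - 2][j - 2] + 1); // adjacent swap
                }
                d[i][j] = best;
            }
        }
        return d[a.length][b.length];
    }

    // The set of strings toAutomaton(n) is described as accepting.
    static boolean within(String word, String candidate, int n, boolean withTranspositions) {
        return distance(word.codePoints().toArray(),
                        candidate.codePoints().toArray(),
                        withTranspositions) <= n;
    }

    // Assumed reading of toAutomaton(n, prefix): the prefix must match
    // exactly, and only the rest of the candidate is compared to word.
    static boolean withinWithPrefix(String prefix, String word, String candidate,
                                    int n, boolean withTranspositions) {
        return candidate.startsWith(prefix)
                && within(word, candidate.substring(prefix.length()), n, withTranspositions);
    }
}
```

Under these assumptions, "lucnee" is within distance 1 of "lucene" only when transpositions are enabled. A DFA answers the same question in a single pass over the candidate, which is why the class builds an automaton rather than filling a table per candidate.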
public static final int MAXIMUM_SUPPORTED_DISTANCE
final int[] word
final int[] alphabet
final int alphaMax
final int[] rangeLower
final int[] rangeUpper
int numRanges
LevenshteinAutomata.ParametricDescription[] descriptions
public LevenshteinAutomata(java.lang.String input, boolean withTranspositions)
public LevenshteinAutomata(int[] word, int alphaMax, boolean withTranspositions)
private static int[] codePoints(java.lang.String input)
public Automaton toAutomaton(int n)
Compute a DFA that accepts all strings within an edit distance of n.
All automata have the following properties:
public Automaton toAutomaton(int n, java.lang.String prefix)
Compute a DFA that accepts all strings within an edit distance of n, matching the specified exact prefix.
All automata have the following properties:
int getVector(int x, int pos, int end)
Get the characteristic vector X(x, V) where V is substring(pos, end).
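The characteristic vector X(x, V) records, for each position of V, whether the symbol at that position equals x. The sketch below packs those bits into an int. The bit order (first position of V in the most significant used bit) and the names CharacteristicVectorSketch and characteristicVector are assumptions of this example, and the window end - pos must stay below 32 to fit.

```java
// Hypothetical illustration; not part of LevenshteinAutomata.
final class CharacteristicVectorSketch {
    // Builds X(x, V) for V = word[pos..end): reading the result from its
    // highest used bit down, bit i is 1 exactly when word[pos + i] == x.
    static int characteristicVector(int[] word, int x, int pos, int end) {
        int vector = 0;
        for (int i = pos; i < end; i++) {
            vector <<= 1;            // make room for the next position
            if (word[i] == x) {
                vector |= 1;         // symbol matches at this position
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        int[] word = "banana".codePoints().toArray();
        // 'a' occurs at positions 1, 3 and 5 of "banana": bits 010101.
        System.out.println(Integer.toBinaryString(characteristicVector(word, 'a', 0, 6)));
    }
}
```

Because the vector depends only on where x matches inside the window, many different symbols share the same vector, which is what lets a parametric description be reused across words.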