public class TokenStreamOffsetStrategy extends AnalysisOffsetStrategy
Analyzes the text, producing a single OffsetsEnum that wraps the TokenStream, filtered to the terms in the query, including wildcards. It cannot handle position-sensitive queries (phrases). Passage accuracy suffers because freq() is unknown; it is always reported as Integer.MAX_VALUE.

Modifier and Type | Class and Description |
---|---|
private static class | TokenStreamOffsetStrategy.TokenStreamOffsetsEnum |
Modifier and Type | Field and Description |
---|---|
private static BytesRef[] | ZERO_LEN_BYTES_REF_ARRAY |
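Fields inherited from class AnalysisOffsetStrategy: analyzer
Fields inherited from class FieldOffsetStrategy: components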
Constructor and Description |
---|
TokenStreamOffsetStrategy(UHComponents components, Analyzer indexAnalyzer) |
Modifier and Type | Method and Description |
---|---|
private static CharacterRunAutomaton[] | convertTermsToAutomata(BytesRef[] terms, CharacterRunAutomaton[] automata) |
OffsetsEnum | getOffsetsEnum(LeafReader reader, int docId, java.lang.String content) The primary method: returns offsets for highlightable words in the specified document. |
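Methods inherited from class AnalysisOffsetStrategy: getOffsetSource, tokenStream
Methods inherited from class FieldOffsetStrategy: createOffsetsEnumFromReader, createOffsetsEnumsForAutomata, createOffsetsEnumsForTerms, createOffsetsEnumsWeightMatcher, getField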
private static final BytesRef[] ZERO_LEN_BYTES_REF_ARRAY
public TokenStreamOffsetStrategy(UHComponents components, Analyzer indexAnalyzer)
private static CharacterRunAutomaton[] convertTermsToAutomata(BytesRef[] terms, CharacterRunAutomaton[] automata)
public OffsetsEnum getOffsetsEnum(LeafReader reader, int docId, java.lang.String content) throws java.io.IOException
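Description copied from class FieldOffsetStrategy: The primary method: returns offsets for highlightable words in the specified document.
Specified by: getOffsetsEnum in class FieldOffsetStrategy
Throws: java.io.IOException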
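
The class summary says this strategy filters an analyzed token stream down to the query terms and reports a freq() of Integer.MAX_VALUE because the true count is not known up front. The following is only a rough, Lucene-free sketch of that idea, not the library's implementation. Every name in it (TokenOffsetSketch, Match, matches) is hypothetical, and whitespace splitting with lowercasing stands in for a real Analyzer. Wildcard (automaton) matching is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration only; none of these names exist in Lucene. */
final class TokenOffsetSketch {

    /** One matched token with its character offsets (endOffset is exclusive). */
    static final class Match {
        final String term;
        final int startOffset;
        final int endOffset;
        // The real frequency is unknown while streaming, so a sentinel is reported.
        final int freq = Integer.MAX_VALUE;

        Match(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Tokenizes on whitespace, lowercases, and keeps only tokens that are query terms. */
    static List<Match> matches(String content, Set<String> queryTerms) {
        List<Match> result = new ArrayList<>();
        int i = 0;
        int n = content.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(content.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token.
            while (i < n && !Character.isWhitespace(content.charAt(i))) {
                i++;
            }
            if (i > start) {
                String token = content.substring(start, i).toLowerCase(Locale.ROOT);
                if (queryTerms.contains(token)) {
                    result.add(new Match(token, start, i));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(Arrays.asList("quick", "fox"));
        for (Match m : matches("The quick brown fox", terms)) {
            System.out.println(m.term + " [" + m.startOffset + "," + m.endOffset + ") freq=" + m.freq);
        }
    }
}
```

Because each token is checked independently, a sketch like this has no way to verify that two terms are adjacent, which mirrors the stated limitation for phrase queries.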