public class WeightedSpanTermExtractor
extends java.lang.Object
Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.
To support additional queries that are unsupported by default, subclasses can override extract(Query, float, Map) to extract wrapped or delegate queries and extractUnknownQuery(Query, Map) to process custom leaf queries:
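import java.io.IOException;
import java.util.Map;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.WeightedSpanTerm;
import org.apache.lucene.search.highlight.WeightedSpanTermExtractor;

// QueryWrapper and CustomTermQuery are placeholders for application-defined query types.
WeightedSpanTermExtractor extractor = new WeightedSpanTermExtractor() {
    @Override
    protected void extract(Query query, float boost, Map<String, WeightedSpanTerm> terms) throws IOException {
        if (query instanceof QueryWrapper) {
            // Unwrap the wrapper and extract from the query it delegates to.
            extract(((QueryWrapper) query).getQuery(), boost, terms);
        } else {
            super.extract(query, boost, terms);
        }
    }

    @Override
    protected void extractUnknownQuery(Query query, Map<String, WeightedSpanTerm> terms) throws IOException {
        if (query instanceof CustomTermQuery) {
            Term term = ((CustomTermQuery) query).getTerm();
            // Keyed by term text here, on the assumption that the terms map is looked up by term text.
            terms.put(term.text(), new WeightedSpanTerm(1, term.text()));
        }
    }
};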
Modifier and Type | Class and Description |
---|---|
(package private) static class | WeightedSpanTermExtractor.DelegatingLeafReader |
protected static class | WeightedSpanTermExtractor.PositionCheckingMap<K> This class makes sure that if both position sensitive and insensitive versions of the same term are added, the position insensitive one wins. |
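
The rule stated for PositionCheckingMap can be illustrated with a small self-contained sketch. It does not use the Lucene classes: TermStub and InsensitiveWinsMap are hypothetical stand-ins invented for this illustration, and the real class may differ in detail. The sketch only shows one way a map can guarantee that a position-insensitive entry is never overridden by a position-sensitive one for the same key.

```java
import java.util.HashMap;
import java.util.Map;

public class PositionCheckingSketch {

    /** Hypothetical stand-in for WeightedSpanTerm: only the fields this sketch needs. */
    static final class TermStub {
        final String text;
        boolean positionSensitive;

        TermStub(String text, boolean positionSensitive) {
            this.text = text;
            this.positionSensitive = positionSensitive;
        }
    }

    /** Map in which a position-insensitive entry is never displaced by a sensitive one. */
    static final class InsensitiveWinsMap<K> extends HashMap<K, TermStub> {
        @Override
        public TermStub put(K key, TermStub value) {
            TermStub previous = super.put(key, value);
            // If the term was already stored as position-insensitive, keep it that way.
            if (previous != null && !previous.positionSensitive) {
                value.positionSensitive = false;
            }
            return previous;
        }

        @Override
        public void putAll(Map<? extends K, ? extends TermStub> other) {
            // Route bulk inserts through put() so the same rule applies.
            for (Map.Entry<? extends K, ? extends TermStub> e : other.entrySet()) {
                put(e.getKey(), e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        InsensitiveWinsMap<String> terms = new InsensitiveWinsMap<>();
        terms.put("lucene", new TermStub("lucene", false)); // e.g. from a plain term query
        terms.put("lucene", new TermStub("lucene", true));  // e.g. from a phrase or span query
        System.out.println(terms.get("lucene").positionSensitive); // prints "false"
    }
}
```

The order of insertion does not matter in this sketch: if the sensitive version arrives first, the later insensitive version simply replaces it.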
Modifier and Type | Field and Description |
---|---|
private boolean | cachedTokenStream |
private java.lang.String | defaultField |
private boolean | expandMultiTermQuery |
private java.lang.String | fieldName |
private LeafReader | internalReader |
private int | maxDocCharsToAnalyze |
private TokenStream | tokenStream |
private boolean | usePayloads |
private boolean | wrapToCaching |
Constructor and Description |
---|
WeightedSpanTermExtractor() |
WeightedSpanTermExtractor(java.lang.String defaultField) |
Modifier and Type | Method and Description |
---|---|
protected void | collectSpanQueryFields(SpanQuery spanQuery, java.util.Set<java.lang.String> fieldNames) |
protected void | extract(Query query, float boost, java.util.Map<java.lang.String,WeightedSpanTerm> terms) |
protected void | extractUnknownQuery(Query query, java.util.Map<java.lang.String,WeightedSpanTerm> terms) |
protected void | extractWeightedSpanTerms(java.util.Map<java.lang.String,WeightedSpanTerm> terms, SpanQuery spanQuery, float boost) |
protected void | extractWeightedTerms(java.util.Map<java.lang.String,WeightedSpanTerm> terms, Query query, float boost) |
protected boolean | fieldNameComparator(java.lang.String fieldNameToCheck) Necessary to implement matches for queries against defaultField |
boolean | getExpandMultiTermQuery() |
protected LeafReaderContext | getLeafContext() |
TokenStream | getTokenStream() Returns the tokenStream which may have been wrapped in a CachingTokenFilter. |
java.util.Map<java.lang.String,WeightedSpanTerm> | getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream) Creates a Map of WeightedSpanTerms from the given Query and TokenStream. |
java.util.Map<java.lang.String,WeightedSpanTerm> | getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, java.lang.String fieldName) Creates a Map of WeightedSpanTerms from the given Query and TokenStream. |
java.util.Map<java.lang.String,WeightedSpanTerm> | getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, java.lang.String fieldName, IndexReader reader) Creates a Map of WeightedSpanTerms from the given Query and TokenStream. |
boolean | isCachedTokenStream() |
protected boolean | isQueryUnsupported(java.lang.Class<? extends Query> clazz) |
boolean | isUsePayloads() |
protected boolean | mustRewriteQuery(SpanQuery spanQuery) |
void | setExpandMultiTermQuery(boolean expandMultiTermQuery) |
protected void | setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze) A threshold of number of characters to analyze. |
void | setUsePayloads(boolean usePayloads) |
void | setWrapIfNotCachingTokenFilter(boolean wrap) By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream impl and you don't want it to be wrapped, set this to false. |
private java.lang.String fieldName
private TokenStream tokenStream
private java.lang.String defaultField
private boolean expandMultiTermQuery
private boolean cachedTokenStream
private boolean wrapToCaching
private int maxDocCharsToAnalyze
private boolean usePayloads
private LeafReader internalReader
public WeightedSpanTermExtractor()
public WeightedSpanTermExtractor(java.lang.String defaultField)
protected void extract(Query query, float boost, java.util.Map<java.lang.String,WeightedSpanTerm> terms) throws java.io.IOException
query - Query to extract Terms from
terms - Map to place created WeightedSpanTerms in
java.io.IOException - If there is a low-level I/O error
protected boolean isQueryUnsupported(java.lang.Class<? extends Query> clazz)
protected void extractUnknownQuery(Query query, java.util.Map<java.lang.String,WeightedSpanTerm> terms) throws java.io.IOException
java.io.IOException
protected void extractWeightedSpanTerms(java.util.Map<java.lang.String,WeightedSpanTerm> terms, SpanQuery spanQuery, float boost) throws java.io.IOException
terms - Map to place created WeightedSpanTerms in
spanQuery - SpanQuery to extract Terms from
java.io.IOException - If there is a low-level I/O error
protected void extractWeightedTerms(java.util.Map<java.lang.String,WeightedSpanTerm> terms, Query query, float boost) throws java.io.IOException
terms - Map to place created WeightedSpanTerms in
query - Query to extract Terms from
java.io.IOException - If there is a low-level I/O error
protected boolean fieldNameComparator(java.lang.String fieldNameToCheck)
Necessary to implement matches for queries against defaultField
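
To make the role of defaultField in field matching more concrete, here is a minimal, self-contained sketch of such a comparison. It is an illustration only, not the library's code: the class FieldMatchSketch is hypothetical, and the rule that a null fieldName matches every field is an assumption made for this sketch.

```java
public class FieldMatchSketch {
    private final String fieldName;    // field being highlighted; null is assumed to mean "any field"
    private final String defaultField; // field that unqualified query terms fall back to; may be null

    FieldMatchSketch(String fieldName, String defaultField) {
        this.fieldName = fieldName;
        this.defaultField = defaultField;
    }

    /** Returns true if a query term on fieldNameToCheck should be considered for highlighting. */
    boolean fieldNameComparator(String fieldNameToCheck) {
        return fieldName == null
                || fieldName.equals(fieldNameToCheck)
                || (defaultField != null && defaultField.equals(fieldNameToCheck));
    }

    public static void main(String[] args) {
        FieldMatchSketch matcher = new FieldMatchSketch("title", "body");
        System.out.println(matcher.fieldNameComparator("title"));  // prints "true"
        System.out.println(matcher.fieldNameComparator("body"));   // prints "true"
        System.out.println(matcher.fieldNameComparator("author")); // prints "false"
    }
}
```

Under these assumptions, terms from queries against the default field still match even when highlighting is restricted to a different, explicitly named field.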
protected LeafReaderContext getLeafContext() throws java.io.IOException
java.io.IOException
public java.util.Map<java.lang.String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream) throws java.io.IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
query - Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
java.io.IOException - If there is a low-level I/O error
public java.util.Map<java.lang.String,WeightedSpanTerm> getWeightedSpanTerms(Query query, float boost, TokenStream tokenStream, java.lang.String fieldName) throws java.io.IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
query - Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
fieldName - restricts the Terms used based on field name
java.io.IOException - If there is a low-level I/O error
public java.util.Map<java.lang.String,WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query, float boost, TokenStream tokenStream, java.lang.String fieldName, IndexReader reader) throws java.io.IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream. Uses a supplied IndexReader to properly weight terms (for gradient highlighting).
query - Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
fieldName - restricts the Terms used based on field name
reader - IndexReader to use for scoring
java.io.IOException - If there is a low-level I/O error
protected void collectSpanQueryFields(SpanQuery spanQuery, java.util.Set<java.lang.String> fieldNames)
protected boolean mustRewriteQuery(SpanQuery spanQuery)
public boolean getExpandMultiTermQuery()
public void setExpandMultiTermQuery(boolean expandMultiTermQuery)
public boolean isUsePayloads()
public void setUsePayloads(boolean usePayloads)
public boolean isCachedTokenStream()
public TokenStream getTokenStream()
public void setWrapIfNotCachingTokenFilter(boolean wrap)
By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream impl and you don't want it to be wrapped, set this to false. This setting is ignored when a term vector based TokenStream is supplied, since it can be reset efficiently.
protected final void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
A threshold of number of characters to analyze.
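
The wrapping behaviour described for setWrapIfNotCachingTokenFilter can be sketched without Lucene to show why a caching wrapper makes resets cheap. Everything in this sketch is hypothetical: Tokens, AnalyzingTokens, CachingTokens, and wrapIfNeeded are stand-ins invented for the illustration and are not the Lucene TokenStream or CachingTokenFilter APIs. The idea is only that a stream is wrapped once unless it is already a caching one or wrapping has been turned off.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CachingWrapSketch {

    /** Hypothetical stand-in for a token stream that can be read more than once. */
    interface Tokens {
        String next();  // next token, or null when exhausted
        void reset();   // rewind to the first token
    }

    /** Expensive source: every reset re-runs the (simulated) analysis. */
    static final class AnalyzingTokens implements Tokens {
        private final String text;
        private Iterator<String> it;
        int analysisRuns = 0;

        AnalyzingTokens(String text) {
            this.text = text;
            reset();
        }

        public String next() {
            return it.hasNext() ? it.next() : null;
        }

        public void reset() {
            analysisRuns++;
            it = Arrays.asList(text.split("\\s+")).iterator();
        }
    }

    /** Caches tokens on the first pass so that later resets replay from memory. */
    static final class CachingTokens implements Tokens {
        private final Tokens input;
        private final List<String> cache = new ArrayList<>();
        private boolean filled = false;
        private int pos = 0;

        CachingTokens(Tokens input) {
            this.input = input;
        }

        public String next() {
            if (!filled) {
                // Drain the wrapped stream exactly once.
                for (String t = input.next(); t != null; t = input.next()) {
                    cache.add(t);
                }
                filled = true;
            }
            return pos < cache.size() ? cache.get(pos++) : null;
        }

        public void reset() {
            pos = 0; // no call to input.reset(): replay comes from the cache
        }
    }

    /** Wraps the stream unless it is already caching or wrapping is disabled. */
    static Tokens wrapIfNeeded(Tokens stream, boolean wrapIfNotCaching) {
        if (wrapIfNotCaching && !(stream instanceof CachingTokens)) {
            return new CachingTokens(stream);
        }
        return stream;
    }

    public static void main(String[] args) {
        AnalyzingTokens source = new AnalyzingTokens("quick brown fox");
        Tokens stream = wrapIfNeeded(source, true);
        for (int pass = 0; pass < 3; pass++) {
            stream.reset();
            while (stream.next() != null) {
                // consume all tokens
            }
        }
        System.out.println(source.analysisRuns); // prints "1"
    }
}
```

Passing false to wrapIfNeeded in this sketch returns the original stream unchanged, which mirrors the case where a caller already supplies its own caching implementation.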