public class ProtectedTermFilterFactory extends ConditionalTokenFilterFactory implements ResourceLoaderAware
Factory for a ProtectedTermFilter.
CustomAnalyzer example:
Analyzer ana = CustomAnalyzer.builder()
    .withTokenizer("standard")
    .when("protectedterm", "ignoreCase", "true", "protected", "protectedTerms.txt")
      .addTokenFilter("truncate", "prefixLength", "4")
      .addTokenFilter("lowercase")
    .endwhen()
    .build();
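The effect of the chain above can be illustrated without Lucene. The following is a minimal sketch, not the library's implementation: the class `ProtectedTermSketch` and its methods are hypothetical, whitespace splitting stands in for the standard tokenizer, and a case-insensitive `TreeSet` stands in for the protected-terms file loaded with `ignoreCase="true"`. It assumes the documented behaviour of ProtectedTermFilter, namely that the wrapped filters are applied only to tokens that are not in the protected set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the analysis chain above; it uses no Lucene classes.
final class ProtectedTermSketch {

  // Applies truncate(prefixLength) and then lowercase to every token that is
  // NOT protected; protected tokens pass through unchanged.
  static List<String> analyze(String text, Set<String> protectedTerms, int prefixLength) {
    List<String> out = new ArrayList<>();
    // Assumption: whitespace splitting replaces the "standard" tokenizer.
    for (String token : text.trim().split("\\s+")) {
      if (token.isEmpty()) {
        continue; // empty input yields no tokens
      }
      if (protectedTerms.contains(token)) {
        out.add(token); // wrapped filters are skipped
      } else {
        String truncated =
            token.length() > prefixLength ? token.substring(0, prefixLength) : token;
        out.add(truncated.toLowerCase(Locale.ROOT));
      }
    }
    return out;
  }

  // Builds a case-insensitive set, mirroring ignoreCase="true".
  static Set<String> caseInsensitiveSet(String... terms) {
    Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    for (String term : terms) {
      set.add(term);
    }
    return set;
  }

  public static void main(String[] args) {
    Set<String> protectedTerms = caseInsensitiveSet("Lucene");
    // prints [LUCENE, toke, are, fun]
    System.out.println(analyze("LUCENE Tokenizers Are FUN", protectedTerms, 4));
  }
}
```

In this sketch "LUCENE" matches the protected entry "Lucene" regardless of case and is left untouched, while every other token is truncated to four characters and lowercased.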
Solr example, in which conditional filters are specified via the wrappedFilters parameter (a comma-separated list of case-insensitive TokenFilter SPI names) and conditional filter args are specified via filterName.argName parameters:
<fieldType name="reverse_lower_with_exceptions" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
            wrappedFilters="truncate,lowercase" truncate.prefixLength="4" />
  </analyzer>
</fieldType>
When using the wrappedFilters parameter, each filter name must be unique. To specify the same filter more than once, add case-insensitive unique '-id' suffixes (the '-id' suffix is stripped prior to SPI lookup), e.g.:
<fieldType name="double_synonym_with_exceptions" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
            wrappedFilters="synonymgraph-A,synonymgraph-B"
            synonymgraph-A.synonyms="synonyms-1.txt"
            synonymgraph-B.synonyms="synonyms-2.txt"/>
  </analyzer>
</fieldType>
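The naming rules for wrappedFilters can be sketched in plain Java. This is an illustration under stated assumptions, not the factory's actual code: the class `WrappedFilterArgsSketch` and its methods are hypothetical, and the separator characters `'.'` and `'-'` are inferred from the examples above rather than taken from the FILTER_ARG_SEPARATOR and FILTER_NAME_ID_SEPARATOR constants. It groups filterName.argName entries per wrapped filter, rejects duplicate filter names, and strips the '-id' suffix to obtain the name used for SPI lookup.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the wrappedFilters naming rules; no Lucene classes are used.
final class WrappedFilterArgsSketch {
  static final char ARG_SEPARATOR = '.'; // assumed, as in truncate.prefixLength
  static final char ID_SEPARATOR = '-';  // assumed, as in synonymgraph-A

  // Groups "filterName.argName" entries under the filter names listed in
  // wrappedFilters, preserving the listed order. Names are compared case-insensitively.
  static LinkedHashMap<String, Map<String, String>> groupArgs(
      String wrappedFilters, Map<String, String> args) {
    LinkedHashMap<String, Map<String, String>> grouped = new LinkedHashMap<>();
    for (String raw : wrappedFilters.split(",")) {
      String name = raw.trim().toLowerCase(Locale.ROOT);
      if (name.isEmpty()) {
        continue;
      }
      if (grouped.putIfAbsent(name, new LinkedHashMap<>()) != null) {
        throw new IllegalArgumentException(
            "Duplicate wrapped filter '" + name + "'; add unique '-id' suffixes");
      }
    }
    for (Map.Entry<String, String> entry : args.entrySet()) {
      int sep = entry.getKey().indexOf(ARG_SEPARATOR);
      if (sep <= 0) {
        continue; // not of the form filterName.argName
      }
      String filter = entry.getKey().substring(0, sep).toLowerCase(Locale.ROOT);
      Map<String, String> filterArgs = grouped.get(filter);
      if (filterArgs != null) {
        filterArgs.put(entry.getKey().substring(sep + 1), entry.getValue());
      }
    }
    return grouped;
  }

  // Strips an optional "-id" suffix, leaving the name that would be used for SPI lookup.
  static String spiName(String filterName) {
    int idx = filterName.indexOf(ID_SEPARATOR);
    return idx < 0 ? filterName : filterName.substring(0, idx);
  }

  public static void main(String[] argv) {
    Map<String, String> args = new LinkedHashMap<>();
    args.put("synonymgraph-A.synonyms", "synonyms-1.txt");
    args.put("synonymgraph-B.synonyms", "synonyms-2.txt");
    groupArgs("synonymgraph-A,synonymgraph-B", args)
        .forEach((name, filterArgs) ->
            System.out.println(spiName(name) + " <- " + name + " " + filterArgs));
  }
}
```

Under these assumptions, both `synonymgraph-A` and `synonymgraph-B` resolve to the same `synonymgraph` SPI name while keeping separate argument maps, which is why the suffixes are needed only to tell the two entries apart.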
See related CustomAnalyzer.Builder.whenTerm(Predicate)
Modifier and Type | Field and Description |
---|---|
static char | FILTER_ARG_SEPARATOR |
static char | FILTER_NAME_ID_SEPARATOR |
private boolean | ignoreCase |
static java.lang.String | PROTECTED_TERMS |
private CharArraySet | protectedTerms |
private java.lang.String | termFiles |
private java.lang.String | wrappedFilters |
Fields inherited from class AbstractAnalysisFactory: LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor and Description |
---|
ProtectedTermFilterFactory(java.util.Map<java.lang.String,java.lang.String> args) |
Modifier and Type | Method and Description |
---|---|
protected ConditionalTokenFilter | create(TokenStream input, java.util.function.Function<TokenStream,TokenStream> inner): Modify the incoming TokenStream with a ConditionalTokenFilter |
void | doInform(ResourceLoader loader): Initialises this component with the corresponding ResourceLoader |
CharArraySet | getProtectedTerms() |
private void | handleWrappedFilterArgs(java.util.Map<java.lang.String,java.lang.String> args) |
boolean | isIgnoreCase() |
private void | populateInnerFilters(java.util.LinkedHashMap<java.lang.String,java.util.Map<java.lang.String,java.lang.String>> wrappedFilterArgs) |
Methods inherited from class ConditionalTokenFilterFactory: create, inform, setInnerFilters
Methods inherited from class TokenFilterFactory: availableTokenFilters, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class AbstractAnalysisFactory: get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface ResourceLoaderAware: inform
public static final java.lang.String PROTECTED_TERMS
public static final char FILTER_ARG_SEPARATOR
public static final char FILTER_NAME_ID_SEPARATOR
private final java.lang.String termFiles
private final boolean ignoreCase
private final java.lang.String wrappedFilters
private CharArraySet protectedTerms
public ProtectedTermFilterFactory(java.util.Map<java.lang.String,java.lang.String> args)
private void handleWrappedFilterArgs(java.util.Map<java.lang.String,java.lang.String> args)
private void populateInnerFilters(java.util.LinkedHashMap<java.lang.String,java.util.Map<java.lang.String,java.lang.String>> wrappedFilterArgs)
public boolean isIgnoreCase()
public CharArraySet getProtectedTerms()
protected ConditionalTokenFilter create(TokenStream input, java.util.function.Function<TokenStream,TokenStream> inner)
Description copied from class ConditionalTokenFilterFactory: Modify the incoming TokenStream with a ConditionalTokenFilter.
Specified by: create in class ConditionalTokenFilterFactory
public void doInform(ResourceLoader loader) throws java.io.IOException
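Description copied from class ConditionalTokenFilterFactory: Initialises this component with the corresponding ResourceLoader.
Overrides: doInform in class ConditionalTokenFilterFactory
Throws: java.io.IOException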