public class StopFilter extends FilteringTokenFilter
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State
Modifier and Type | Field and Description |
---|---|
private CharArraySet | stopWords |
private CharTermAttribute | termAtt |
Fields inherited from class TokenFilter: input
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
StopFilter(TokenStream in, CharArraySet stopWords): Constructs a filter which removes words from the input TokenStream that are named in the Set. |
Modifier and Type | Method and Description |
---|---|
protected boolean | accept(): Returns true if the current token's term is not a stop word, so that the token is kept. |
static CharArraySet | makeStopSet(java.util.List<?> stopWords): Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. |
static CharArraySet | makeStopSet(java.util.List<?> stopWords, boolean ignoreCase): Creates a stopword set from the given stopword list. |
static CharArraySet | makeStopSet(java.lang.String... stopWords): Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. |
static CharArraySet | makeStopSet(java.lang.String[] stopWords, boolean ignoreCase): Creates a stopword set from the given stopword array. |
Methods inherited from class FilteringTokenFilter: end, incrementToken, reset
Methods inherited from class TokenFilter: close
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
private final CharArraySet stopWords
private final CharTermAttribute termAtt
public StopFilter(TokenStream in, CharArraySet stopWords)
Parameters:
in - Input stream
stopWords - A CharArraySet representing the stopwords.
See Also:
makeStopSet(java.lang.String...)
public static CharArraySet makeStopSet(java.lang.String... stopWords)
Parameters:
stopWords - An array of stopwords
See Also:
the two-argument makeStopSet overload, passing false to ignoreCase
public static CharArraySet makeStopSet(java.util.List<?> stopWords)
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
the two-argument makeStopSet overload, passing false to ignoreCase
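The effect of ignoreCase on set construction can be illustrated with a small stdlib-only model. This is a sketch, not the Lucene implementation: the class name StopSetSketch is hypothetical, a plain HashSet<String> stands in for CharArraySet, and only the lower-casing described in this page is modelled. The real CharArraySet also matches case-insensitively at lookup time when built with ignoreCase set to true, which this sketch does not reproduce.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the makeStopSet overloads; NOT the Lucene classes.
class StopSetSketch {

    // Two-argument form: lower-cases every word first when ignoreCase is true.
    static Set<String> makeStopSet(List<?> stopWords, boolean ignoreCase) {
        Set<String> set = new HashSet<>();
        for (Object word : stopWords) {
            // char[] entries are converted by content; everything else via toString().
            String s = (word instanceof char[])
                    ? new String((char[]) word)
                    : word.toString();
            set.add(ignoreCase ? s.toLowerCase(Locale.ROOT) : s);
        }
        return set;
    }

    // Single-argument forms delegate with ignoreCase = false.
    static Set<String> makeStopSet(List<?> stopWords) {
        return makeStopSet(stopWords, false);
    }

    static Set<String> makeStopSet(String... stopWords) {
        return makeStopSet(Arrays.asList(stopWords), false);
    }
}
```

With this model, makeStopSet("The", "AND") keeps the words exactly as given, while passing true for ignoreCase stores "the" and "and" instead.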
public static CharArraySet makeStopSet(java.lang.String[] stopWords, boolean ignoreCase)
Parameters:
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
public static CharArraySet makeStopSet(java.util.List<?> stopWords, boolean ignoreCase)
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - if true, all words are lower cased first
Returns:
A Set (CharArraySet) containing the words
protected boolean accept()
Overrides:
accept in class FilteringTokenFilter
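The relationship between accept() and the inherited incrementToken() can be sketched with plain Java. This is a minimal model under stated assumptions, not Lucene code: SketchFilteringFilter and SketchStopFilter are hypothetical stand-ins, tokens are plain Strings rather than attribute-based tokens, and the position-increment bookkeeping that the real filter performs for skipped tokens is omitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for FilteringTokenFilter; NOT the Lucene class.
abstract class SketchFilteringFilter {
    private final Iterator<String> input;
    protected String term; // current term, playing the role of termAtt

    SketchFilteringFilter(Iterator<String> input) {
        this.input = input;
    }

    // Subclasses decide whether the current term is kept.
    protected abstract boolean accept();

    // Advances to the next accepted term; returns false once input is exhausted.
    public final boolean incrementToken() {
        while (input.hasNext()) {
            term = input.next();
            if (accept()) {
                return true;
            }
        }
        return false;
    }
}

// Hypothetical stand-in for StopFilter, backed by a plain Set<String>.
class SketchStopFilter extends SketchFilteringFilter {
    private final Set<String> stopWords;

    SketchStopFilter(Iterator<String> input, Set<String> stopWords) {
        super(input);
        this.stopWords = stopWords;
    }

    // A term is accepted only if it is not in the stop word set.
    @Override
    protected boolean accept() {
        return !stopWords.contains(term);
    }

    // Helper for the sketch: drains the filter into a list of kept terms.
    static List<String> filter(List<String> tokens, Set<String> stopWords) {
        SketchStopFilter f = new SketchStopFilter(tokens.iterator(), stopWords);
        List<String> kept = new ArrayList<>();
        while (f.incrementToken()) {
            kept.add(f.term);
        }
        return kept;
    }
}
```

The subclass supplies only the predicate, and the base class owns the loop that skips rejected tokens. This mirrors why StopFilter overrides accept() and inherits incrementToken() rather than reimplementing it.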