public final class SnowballFilter extends TokenFilter
A filter that stems words using a Snowball-generated stemmer. Available stemmers are listed in org.tartarus.snowball.ext.
NOTE: SnowballFilter expects lowercased text. For the Turkish language, see TurkishLowerCaseFilter. For other languages, use LowerCaseFilter.
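Why Turkish gets its own lowercasing filter can be illustrated with plain `java.lang.String` lowercasing. This is only an analogy: the sketch below does not call Lucene's `LowerCaseFilter` or `TurkishLowerCaseFilter`, and the class name `LowercaseDemo` is hypothetical. Under Turkish rules, uppercase `I` lowercases to dotless `ı` (U+0131) rather than `i`, so a stemmer would see a different term depending on which rules were applied.

```java
import java.util.Locale;

// Hypothetical demo class; it uses only java.lang.String, not Lucene's filters.
final class LowercaseDemo {

    // Lowercase a term under the rules of the given locale.
    static String lower(String term, Locale locale) {
        return term.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Locale-neutral rules map 'I' to 'i'.
        System.out.println(lower("TITLE", Locale.ROOT));
        // Turkish rules map 'I' to dotless 'ı' (U+0131).
        System.out.println(lower("TITLE", turkish));
    }
}
```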
Note: This filter is aware of the KeywordAttribute. To prevent certain terms from being passed to the stemmer, KeywordAttribute.isKeyword() should be set to true in a previous TokenStream.
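The keyword check can be modeled in plain Java as follows. This is a minimal sketch of the behavior described in the note, not Lucene's implementation: `Token`, `toyStem`, and `stemAll` are hypothetical names, and the toy suffix stripper stands in for a real Snowball stemmer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Plain-Java model of keyword-aware stemming; none of these types are Lucene classes.
final class KeywordAwareStemming {

    // Hypothetical stand-in for a token carrying a term and a keyword flag.
    static final class Token {
        final String term;
        final boolean keyword;

        Token(String term, boolean keyword) {
            this.term = term;
            this.keyword = keyword;
        }
    }

    // Toy suffix stripper used for illustration only; it is NOT a Snowball stemmer.
    static String toyStem(String term) {
        if (term.length() > 4 && term.endsWith("ing")) {
            return term.substring(0, term.length() - 3);
        }
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    // Stem every token except those an earlier stage flagged as keywords.
    static List<String> stemAll(List<Token> tokens, UnaryOperator<String> stemmer) {
        List<String> out = new ArrayList<>();
        for (Token t : tokens) {
            out.add(t.keyword ? t.term : stemmer.apply(t.term));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("running", false));
        tokens.add(new Token("kings", true));  // flagged as keyword: passes through unchanged
        tokens.add(new Token("kings", false)); // not flagged: handed to the stemmer
        System.out.println(stemAll(tokens, KeywordAwareStemming::toyStem));
    }
}
```

The same term appears twice in the usage example to show that the flag, not the term text, decides whether stemming happens.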
Note: To include the original term as well as the stemmed version, see KeywordRepeatFilterFactory.
Nested classes inherited from class AttributeSource: AttributeSource.State
Modifier and Type | Field and Description |
---|---|
private KeywordAttribute | keywordAttr |
private SnowballProgram | stemmer |
private CharTermAttribute | termAtt |
Fields inherited from class TokenFilter: input
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
SnowballFilter(TokenStream input, SnowballProgram stemmer) |
SnowballFilter(TokenStream in, java.lang.String name) Construct the named stemming filter. |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() Returns the next input Token, after being stemmed |
Methods inherited from class TokenFilter: close, end, reset
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
private final SnowballProgram stemmer
private final CharTermAttribute termAtt
private final KeywordAttribute keywordAttr
public SnowballFilter(TokenStream input, SnowballProgram stemmer)
public SnowballFilter(TokenStream in, java.lang.String name)
Construct the named stemming filter. Available stemmers are listed in org.tartarus.snowball.ext. The name of a stemmer is the part of the class name before "Stemmer", e.g., the stemmer in EnglishStemmer is named "English".
Parameters:
in - the input tokens to stem
name - the name of a stemmer
public final boolean incrementToken() throws java.io.IOException
Returns the next input Token, after being stemmed.
Specified by: incrementToken in class TokenStream
Throws: java.io.IOException
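The stemmer naming rule for the `SnowballFilter(TokenStream, String)` constructor can be sketched as a pair of string helpers. This is an illustrative sketch only: `StemmerNames`, `nameOf`, and `classNameFor` are hypothetical, the code loads no classes, and it is an assumption that the constructor resolves names to classes in this package in a similar way.

```java
// Sketch of the naming rule; it builds names only and loads no classes.
final class StemmerNames {

    // Package documented as holding the available stemmers.
    static final String PACKAGE = "org.tartarus.snowball.ext";
    static final String SUFFIX = "Stemmer";

    // Derive the stemmer name from a simple class name: the part before "Stemmer".
    static String nameOf(String simpleClassName) {
        if (!simpleClassName.endsWith(SUFFIX) || simpleClassName.length() == SUFFIX.length()) {
            throw new IllegalArgumentException("Not a stemmer class name: " + simpleClassName);
        }
        return simpleClassName.substring(0, simpleClassName.length() - SUFFIX.length());
    }

    // Build the fully qualified class name a stemmer name would correspond to.
    static String classNameFor(String stemmerName) {
        return PACKAGE + "." + stemmerName + SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(nameOf("EnglishStemmer"));
        System.out.println(classNameFor("English"));
    }
}
```

Passing `"English"` as the `name` argument therefore corresponds to the class `EnglishStemmer` in `org.tartarus.snowball.ext`.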