public final class CodepointCountFilter extends FilteringTokenFilter
Note: Length is calculated as the number of Unicode codepoints.
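The difference between code points and UTF-16 `char` units matters for characters outside the Basic Multilingual Plane. The following is a minimal stdlib-only sketch of that distinction; `CodepointLengthDemo` and `codepointLength` are illustrative names and are not part of Lucene.

```java
// Illustrative sketch only: these names are hypothetical, not Lucene API.
final class CodepointLengthDemo {

    /** Counts Unicode code points in the term rather than UTF-16 char units. */
    static int codepointLength(String term) {
        char[] buffer = term.toCharArray();
        return Character.codePointCount(buffer, 0, buffer.length);
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) is one code point encoded as two chars.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(clef.length());          // prints 2 (UTF-16 units)
        System.out.println(codepointLength(clef));  // prints 1 (code points)
        System.out.println(codepointLength("abc")); // prints 3
    }
}
```

A filter that measured length with `String.length()` would treat the single-character clef term as two characters long, which is the case this note guards against.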
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State
Modifier and Type | Field and Description |
---|---|
private int | max |
private int | min |
private CharTermAttribute | termAtt |
Fields inherited from class TokenFilter: input
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
CodepointCountFilter(TokenStream in, int min, int max) Create a new CodepointCountFilter. |
Modifier and Type | Method and Description |
---|---|
boolean | accept() Override this method and return true if the current input token should be returned by FilteringTokenFilter.incrementToken(). |
Methods inherited from class FilteringTokenFilter: end, incrementToken, reset
Methods inherited from class TokenFilter: close
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
private final int min
private final int max
private final CharTermAttribute termAtt
public CodepointCountFilter(TokenStream in, int min, int max)
Create a new CodepointCountFilter. This will filter out tokens whose CharTermAttribute is either too short (Character.codePointCount(char[], int, int) < min) or too long (Character.codePointCount(char[], int, int) > max).
Parameters:
in - the TokenStream to consume
min - the minimum length, in code points (inclusive)
max - the maximum length, in code points (inclusive)
public boolean accept()
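Description copied from class: FilteringTokenFilter
Override this method and return true if the current input token should be returned by FilteringTokenFilter.incrementToken().
Specified by:
accept in class FilteringTokenFilter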
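The min/max rule described for the constructor can be sketched without Lucene on the classpath. The class below is a hypothetical stand-in, not the filter's actual source: `CodepointRangeSketch` and its `accept(char[], int)` signature are assumptions made for illustration, with the buffer and length standing in for what a CharTermAttribute would supply.

```java
// Hypothetical stand-in for the documented rule; not the Lucene implementation.
final class CodepointRangeSketch {
    private final int min;
    private final int max;

    CodepointRangeSketch(int min, int max) {
        this.min = min;
        this.max = max;
    }

    /**
     * Keeps a term unless its code point count is below min (too short)
     * or above max (too long). Both bounds are inclusive.
     */
    boolean accept(char[] buffer, int length) {
        int codepoints = Character.codePointCount(buffer, 0, length);
        return codepoints >= min && codepoints <= max;
    }

    public static void main(String[] args) {
        CodepointRangeSketch range = new CodepointRangeSketch(2, 4);
        for (String term : new String[] {"a", "ab", "abcd", "abcde"}) {
            char[] buffer = term.toCharArray();
            // Prints false, true, true, false for the four terms in order.
            System.out.println(term + " -> " + range.accept(buffer, buffer.length));
        }
    }
}
```

In the real filter, a `false` result from `accept()` causes FilteringTokenFilter.incrementToken() to skip the token and move on to the next one from the input stream.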