public class ArabicStemmer extends Object
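Stemming is done in place for efficiency, operating on a term buffer.
Stemming is defined as removing a prefix from an Arabic word (see stemPrefix) and then removing common suffix(es) (see stemSuffix).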
Modifier and Type | Field and Description |
---|---|
static char | ALEF |
static char | BEH |
static char | FEH |
static char | HEH |
static char | KAF |
static char | LAM |
static char | NOON |
static char[][] | prefixes |
static char[][] | suffixes |
static char | TEH |
static char | TEH_MARBUTA |
static char | WAW |
static char | YEH |
Constructor and Description |
---|
ArabicStemmer() |
Modifier and Type | Method and Description |
---|---|
protected int | delete(char[] s, int pos, int len): Delete a character in place. |
protected int | deleteN(char[] s, int pos, int len, int nChars): Delete n characters in place. |
int | stem(char[] s, int len): Stem an input buffer of Arabic text. |
int | stemPrefix(char[] s, int len): Stem a prefix off an Arabic word. |
int | stemSuffix(char[] s, int len): Stem suffix(es) off an Arabic word. |
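The delete methods above take a buffer plus its logical length and return an int. The following is a minimal sketch of that in-place pattern, assuming the returned value is the new logical length after deletion. The class name `InPlaceDeleteSketch` and the bounds check are illustrative assumptions, not the Lucene source.

```java
// Illustrative sketch only: not the Lucene implementation.
final class InPlaceDeleteSketch {

    /**
     * Removes nChars characters starting at pos by shifting the tail left.
     * Assumption: the return value is the new logical length of the buffer.
     */
    static int deleteN(char[] s, int pos, int len, int nChars) {
        if (pos < 0 || nChars < 0 || len > s.length || pos + nChars > len) {
            throw new IllegalArgumentException("range outside the logical buffer");
        }
        // Copy everything after the deleted range over the deleted range.
        System.arraycopy(s, pos + nChars, s, pos, len - pos - nChars);
        return len - nChars;
    }

    /** Removes the single character at pos. */
    static int delete(char[] s, int pos, int len) {
        return deleteN(s, pos, len, 1);
    }

    public static void main(String[] args) {
        char[] buf = "xhello!!".toCharArray();
        int len = buf.length;
        len = delete(buf, 0, len);            // drop the leading 'x'
        len = deleteN(buf, len - 2, len, 2);  // drop the trailing "!!"
        // Only the first len characters are meaningful after the edits.
        System.out.println(new String(buf, 0, len)); // prints "hello"
    }
}
```

Because the array itself never shrinks, the caller has to carry the returned length forward, for example by building the result with `new String(buf, 0, len)`.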
public static final char ALEF
public static final char BEH
public static final char TEH_MARBUTA
public static final char TEH
public static final char FEH
public static final char KAF
public static final char LAM
public static final char NOON
public static final char HEH
public static final char WAW
public static final char YEH
public static final char[][] prefixes
public static final char[][] suffixes
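The `prefixes` and `suffixes` fields are tables of character sequences. As a rough sketch of how such a table could be applied to a term buffer, the example below strips the first matching prefix in place and returns the new length. The table contents, the minimum-length guard, and the names `PrefixTableSketch`, `startsWith`, and `stripPrefix` are assumptions made for illustration; this page does not list the actual entries or matching rules.

```java
// Illustrative sketch only: the table and guard below are assumptions.
final class PrefixTableSketch {
    // Unicode code points of the Arabic letters used here.
    static final char ALEF = '\u0627';
    static final char LAM = '\u0644';
    static final char WAW = '\u0648';

    // Hypothetical prefix table, longest entry first.
    static final char[][] PREFIXES = {
        {WAW, ALEF, LAM},
        {ALEF, LAM},
    };

    /** True if the buffer starts with prefix and enough characters would remain. */
    static boolean startsWith(char[] s, int len, char[] prefix) {
        // Assumed guard: keep at least two characters after removing the prefix.
        if (len < prefix.length + 2) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (s[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Strips the first matching prefix in place; returns the new logical length. */
    static int stripPrefix(char[] s, int len) {
        for (char[] prefix : PREFIXES) {
            if (startsWith(s, len, prefix)) {
                int remaining = len - prefix.length;
                System.arraycopy(s, prefix.length, s, 0, remaining);
                return remaining;
            }
        }
        return len; // no prefix matched, buffer unchanged
    }

    public static void main(String[] args) {
        // ALEF LAM followed by a four-letter word, written with escapes.
        char[] buf = "\u0627\u0644\u0643\u062A\u0627\u0628".toCharArray();
        int len = stripPrefix(buf, buf.length);
        System.out.println(len); // prints 4
    }
}
```

Ordering the table longest entry first keeps a short entry from matching before a longer one that shares its leading characters gets a chance.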
public int stem(char[] s, int len)
s - input buffer
len - length of input buffer

public int stemPrefix(char[] s, int len)
s - input buffer
len - length of input buffer

public int stemSuffix(char[] s, int len)
s - input buffer
len - length of input buffer

protected int deleteN(char[] s, int pos, int len, int nChars)
s - input buffer
pos - position of character to delete
len - length of input buffer
nChars - number of characters to delete

protected int delete(char[] s, int pos, int len)
s - input buffer
pos - position of character to delete
len - length of input buffer

Copyright © 2000-2012 Apache Software Foundation. All Rights Reserved.