public class IndonesianStemmer
extends java.lang.Object
Stems Indonesian words using the algorithm presented in "A Study of Stemming Effects on Information Retrieval in Bahasa Indonesia" by Fadillah Z Tala: http://www.illc.uva.nl/Publications/ResearchReports/MoL-2003-02.text.pdf
Modifier and Type | Field and Description |
---|---|
private int | flags |
private int | numSyllables |
private static int | REMOVED_BER |
private static int | REMOVED_DI |
private static int | REMOVED_KE |
private static int | REMOVED_MENG |
private static int | REMOVED_PE |
private static int | REMOVED_PENG |
private static int | REMOVED_TER |
Constructor and Description |
---|
IndonesianStemmer() |
Modifier and Type | Method and Description |
---|---|
private boolean | isVowel(char ch) |
private int | removeFirstOrderPrefix(char[] text, int length) |
private int | removeParticle(char[] text, int length) |
private int | removePossessivePronoun(char[] text, int length) |
private int | removeSecondOrderPrefix(char[] text, int length) |
private int | removeSuffix(char[] text, int length) |
int | stem(char[] text, int length, boolean stemDerivational) Stem a term (returning its new length). |
private int | stemDerivational(char[] text, int length) |
private int numSyllables
private int flags
private static final int REMOVED_KE
private static final int REMOVED_PENG
private static final int REMOVED_DI
private static final int REMOVED_MENG
private static final int REMOVED_TER
private static final int REMOVED_BER
private static final int REMOVED_PE
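The `flags` field together with the `REMOVED_*` constants suggests a bitmask that records which prefixes have already been stripped from a word, so that later steps can check for disallowed prefix and suffix combinations. This page does not list the constant values, so the sketch below assumes distinct powers of two; the class `PrefixFlagsSketch` and its methods `markRemoved`, `wasRemoved`, and `reset` are hypothetical names used only for illustration.

```java
/** Hypothetical sketch of prefix-removal bookkeeping with bit flags. */
final class PrefixFlagsSketch {
    // Assumed values: this page does not show them. Any distinct powers of two would work.
    static final int REMOVED_KE = 1;
    static final int REMOVED_PENG = 1 << 1;
    static final int REMOVED_DI = 1 << 2;
    static final int REMOVED_MENG = 1 << 3;
    static final int REMOVED_TER = 1 << 4;
    static final int REMOVED_BER = 1 << 5;
    static final int REMOVED_PE = 1 << 6;

    private int flags;

    /** Records that the given prefix was removed. */
    void markRemoved(int flag) {
        flags |= flag;
    }

    /** Reports whether the given prefix was removed earlier. */
    boolean wasRemoved(int flag) {
        return (flags & flag) != 0;
    }

    /** Clears all recorded removals, for example before stemming the next term. */
    void reset() {
        flags = 0;
    }

    public static void main(String[] args) {
        PrefixFlagsSketch state = new PrefixFlagsSketch();
        state.markRemoved(REMOVED_MENG);
        System.out.println(state.wasRemoved(REMOVED_MENG)); // prints "true"
        System.out.println(state.wasRemoved(REMOVED_KE));   // prints "false"
    }
}
```

Packing the seven markers into one `int` keeps the per-term state small and makes resetting it a single assignment.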
public int stem(char[] text, int length, boolean stemDerivational)
Stem a term in place, returning its new length. Use stemDerivational to control whether full stemming (true) or only light inflectional stemming (false) is done.
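Because `stem` works on a `char[]` and returns a new length instead of a new string, the caller must treat only the first `newLength` characters as valid afterwards. The sketch below illustrates that calling convention. `BufferStemmer` and `stemToString` are hypothetical helpers, and the toy lambda is a stand-in, not the real stemming algorithm.

```java
/** Sketch of the in-place "char[] plus length" calling convention. */
final class StemCallSketch {

    /** Hypothetical stand-in with the same signature shape as stem(). */
    interface BufferStemmer {
        int stem(char[] text, int length, boolean stemDerivational);
    }

    /** Copies the term into a scratch buffer, stems it, and keeps only the valid prefix. */
    static String stemToString(BufferStemmer stemmer, String term, boolean stemDerivational) {
        char[] buffer = term.toCharArray();
        int newLength = stemmer.stem(buffer, buffer.length, stemDerivational);
        // Characters at index >= newLength are stale and must be ignored.
        return new String(buffer, 0, newLength);
    }

    public static void main(String[] args) {
        // Toy stemmer (assumption, not the real algorithm): drops a trailing "lah"
        // and ignores the stemDerivational flag.
        BufferStemmer toy = (text, length, derivational) ->
            length > 3
                && text[length - 3] == 'l'
                && text[length - 2] == 'a'
                && text[length - 1] == 'h'
                ? length - 3
                : length;
        System.out.println(stemToString(toy, "pergilah", false)); // prints "pergi"
    }
}
```

Returning a length rather than allocating a new string lets a caller reuse one buffer across many terms.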
private int stemDerivational(char[] text, int length)
private boolean isVowel(char ch)
private int removeParticle(char[] text, int length)
private int removePossessivePronoun(char[] text, int length)
private int removeFirstOrderPrefix(char[] text, int length)
private int removeSecondOrderPrefix(char[] text, int length)
private int removeSuffix(char[] text, int length)
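The helper signatures above all follow the same pattern: take the buffer and its current length, and return the length after removal. As a rough illustration of how `isVowel`, `numSyllables`, and `removeParticle` might fit together, the sketch below counts vowels as an approximation of syllables and strips a particle only from words with more than two syllables. The particle list (-kah, -lah, -pun), the two-syllable threshold, and the helpers `countSyllables` and `endsWith` are assumptions made for this example and are not taken from this page.

```java
/** Sketch: a vowel-based syllable count guarding particle removal. */
final class ParticleSketch {

    /** Treats the five plain Latin vowels as vowels (assumption). */
    static boolean isVowel(char ch) {
        switch (ch) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                return true;
            default:
                return false;
        }
    }

    /** Approximates the syllable count as the number of vowels in text[0..length). */
    static int countSyllables(char[] text, int length) {
        int count = 0;
        for (int i = 0; i < length; i++) {
            if (isVowel(text[i])) {
                count++;
            }
        }
        return count;
    }

    /** Checks whether text[0..length) ends with the given suffix. */
    static boolean endsWith(char[] text, int length, String suffix) {
        int suffixLength = suffix.length();
        if (suffixLength > length) {
            return false;
        }
        for (int i = 0; i < suffixLength; i++) {
            if (text[length - suffixLength + i] != suffix.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /** Returns the length after stripping -kah, -lah, or -pun from words with more than two syllables. */
    static int removeParticle(char[] text, int length) {
        if (countSyllables(text, length) <= 2) {
            return length; // too short to stem safely
        }
        if (endsWith(text, length, "kah")
                || endsWith(text, length, "lah")
                || endsWith(text, length, "pun")) {
            return length - 3;
        }
        return length;
    }

    public static void main(String[] args) {
        char[] word = "bukukah".toCharArray();
        int newLength = removeParticle(word, word.length);
        System.out.println(new String(word, 0, newLength)); // prints "buku"
    }
}
```

The syllable guard keeps short words such as "pun" from being reduced to nothing.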