public class DefaultSDContextGenerator extends Object implements SDContextGenerator
Modifier and Type | Field and Description |
---|---|
protected StringBuffer | buf - String buffer for generating features. |
protected List<String> | collectFeats - List for holding features as they are generated. |
Constructor and Description |
---|
DefaultSDContextGenerator(char[] eosCharacters) - Creates a new SDContextGenerator instance with no induced abbreviations. |
DefaultSDContextGenerator(Set<String> inducedAbbreviations, char[] eosCharacters) - Creates a new SDContextGenerator instance which uses the set of induced abbreviations. |
Modifier and Type | Method and Description |
---|---|
protected void | collectFeatures(String prefix, String suffix, String previous, String next) - Deprecated. Use collectFeatures(String, String, String, String, Character) instead. |
protected void | collectFeatures(String prefix, String suffix, String previous, String next, Character eosChar) - Determines some of the features for the sentence detector and adds them to the list of features. |
String[] | getContext(CharSequence sb, int position) - Returns an array of contextual features for the potential sentence boundary at the specified position within the specified string buffer. |
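The collectFeatures methods work with four strings taken from around a candidate eos character: the prefix and suffix (the text before and after the eos character inside its own token) and the previous and next space-delimited tokens. The sketch below illustrates that decomposition only. It is a simplified, self-contained approximation and not the OpenNLP implementation: the class name EosContextSketch and the method split are hypothetical, tokens are split on whitespace alone, and no features are generated.

```java
import java.util.Arrays;

/** Hypothetical illustration only; this is not code from OpenNLP. */
class EosContextSketch {

    /**
     * Returns {prefix, suffix, previous, next} for the eos character at
     * the given position, treating whitespace as the only token delimiter.
     */
    static String[] split(CharSequence text, int position) {
        // Find the bounds of the whitespace-delimited token holding the eos character.
        int tokenStart = position;
        while (tokenStart > 0 && !Character.isWhitespace(text.charAt(tokenStart - 1))) {
            tokenStart--;
        }
        int tokenEnd = position + 1;
        while (tokenEnd < text.length() && !Character.isWhitespace(text.charAt(tokenEnd))) {
            tokenEnd++;
        }

        // prefix and suffix: the parts of that token before and after the eos character.
        String prefix = text.subSequence(tokenStart, position).toString();
        String suffix = text.subSequence(position + 1, tokenEnd).toString();

        // previous: skip whitespace to the left, then take the token found there.
        int prevEnd = tokenStart;
        while (prevEnd > 0 && Character.isWhitespace(text.charAt(prevEnd - 1))) {
            prevEnd--;
        }
        int prevStart = prevEnd;
        while (prevStart > 0 && !Character.isWhitespace(text.charAt(prevStart - 1))) {
            prevStart--;
        }
        String previous = text.subSequence(prevStart, prevEnd).toString();

        // next: skip whitespace to the right, then take the token found there.
        int nextStart = tokenEnd;
        while (nextStart < text.length() && Character.isWhitespace(text.charAt(nextStart))) {
            nextStart++;
        }
        int nextEnd = nextStart;
        while (nextEnd < text.length() && !Character.isWhitespace(text.charAt(nextEnd))) {
            nextEnd++;
        }
        String next = text.subSequence(nextStart, nextEnd).toString();

        return new String[] {prefix, suffix, previous, next};
    }

    public static void main(String[] args) {
        // Index 6 is the '.' in "Mr."; the suffix is empty because the token ends there.
        System.out.println(Arrays.toString(split("See Mr. Smith now.", 6)));
        // prints [Mr, , See, Smith]
    }
}
```

In this sketch an empty string stands for a missing part, for example the next token when the eos character is the last character of the input.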
protected StringBuffer buf
public DefaultSDContextGenerator(char[] eosCharacters)
Creates a new SDContextGenerator instance with no induced abbreviations.
eosCharacters - the end-of-sentence (eos) characters.
public DefaultSDContextGenerator(Set<String> inducedAbbreviations, char[] eosCharacters)
Creates a new SDContextGenerator instance which uses the set of induced abbreviations.
inducedAbbreviations - a Set of Strings representing induced abbreviations in the training data. Example: "Mr."
eosCharacters - the end-of-sentence (eos) characters.
public String[] getContext(CharSequence sb, int position)
Returns an array of contextual features for the potential sentence boundary at the specified position within the specified string buffer.
Specified by: getContext in interface SDContextGenerator
sb - The String for which sentences are being determined.
position - An index into the specified string buffer where a sentence boundary may occur.
protected void collectFeatures(String prefix, String suffix, String previous, String next)
Deprecated. Use collectFeatures(String, String, String, String, Character) instead.
prefix - String preceding the eos character in the eos token.
suffix - String following the eos character in the eos token.
previous - Space-delimited token preceding the token containing the eos character.
next - Space-delimited token following the token containing the eos character.
protected void collectFeatures(String prefix, String suffix, String previous, String next, Character eosChar)
Determines some of the features for the sentence detector and adds them to the list of features.
prefix - String preceding the eos character in the eos token.
suffix - String following the eos character in the eos token.
previous - Space-delimited token preceding the token containing the eos character.
next - Space-delimited token following the token containing the eos character.
eosChar - The eos character being analyzed.
Copyright © 2017 The Apache Software Foundation. All rights reserved.