public class Chset extends Parser<java.lang.Object> implements java.lang.Cloneable

The `Chset` (character set) parser matches the current character in the parse buffer against an arbitrary character set. The character set is represented as a sorted array of ranges for which a match should be successful, so matching takes O(log nranges) time. There are predefined character sets for matching any character (`ANYCHAR`), no characters (`NOTHING`), some standard 7-bit ASCII ranges (`ALNUM`, `ALPHA`, `DIGIT`, `XDIGIT`, `LOWER`, `UPPER`, `WHITESPACE`), and `ASCII`.

Note that the character set parser only matches a single character of the parse buffer. The `Sequence` or `Repeat` parsers need to be used to match more than one character.

The following matches vowels and digits:

    Parser p = new Chset("uoiea0-9");
    p.parse("a"); // matches "a"
    p.parse("3"); // matches "3"
    p.parse("b"); // no match
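The sorted-range representation and its O(log nranges) lookup can be illustrated with a small stand-alone sketch. This is not the library's implementation: `RangeSetSketch`, `fromSpec`, and the `{min, max}` pair layout are hypothetical names chosen for illustration, and the spec syntax (`a-b` as an inclusive range, anything else as a literal) is an assumption inferred only from the `"uoiea0-9"` example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in for a range-based character set; this is not the library's Chset. */
final class RangeSetSketch {
    // Sorted, non-overlapping, inclusive ranges stored as {min, max} pairs.
    private final char[][] ranges;

    RangeSetSketch(List<char[]> input) {
        char[][] sorted = new char[input.size()][];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = input.get(i).clone(); // copy so the caller's arrays are never mutated
        }
        Arrays.sort(sorted, (a, b) -> Character.compare(a[0], b[0]));
        List<char[]> merged = new ArrayList<>();
        for (char[] r : sorted) {
            char[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && r[0] <= last[1] + 1) {
                // Overlapping or adjacent range: extend the previous one.
                if (r[1] > last[1]) {
                    last[1] = r[1];
                }
            } else {
                merged.add(r);
            }
        }
        ranges = merged.toArray(new char[0][]);
    }

    /** ASSUMED spec syntax: "a-b" is an inclusive range, any other character is a literal. */
    static RangeSetSketch fromSpec(String spec) {
        List<char[]> parsed = new ArrayList<>();
        int i = 0;
        while (i < spec.length()) {
            char first = spec.charAt(i);
            if (i + 2 < spec.length() && spec.charAt(i + 1) == '-') {
                char second = spec.charAt(i + 2);
                parsed.add(new char[] {(char) Math.min(first, second), (char) Math.max(first, second)});
                i += 3;
            } else {
                parsed.add(new char[] {first, first});
                i++;
            }
        }
        return new RangeSetSketch(parsed);
    }

    /** Binary search over the sorted ranges: O(log nranges) comparisons. */
    boolean test(char ch) {
        int lo = 0;
        int hi = ranges.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (ch < ranges[mid][0]) {
                hi = mid - 1;
            } else if (ch > ranges[mid][1]) {
                lo = mid + 1;
            } else {
                return true;
            }
        }
        return false;
    }

    /** Number of ranges after merging. */
    int size() {
        return ranges.length;
    }

    public static void main(String[] args) {
        RangeSetSketch vowelsAndDigits = fromSpec("uoiea0-9");
        System.out.println(vowelsAndDigits.test('a')); // true
        System.out.println(vowelsAndDigits.test('3')); // true
        System.out.println(vowelsAndDigits.test('b')); // false
    }
}
```

Merging overlapping and adjacent ranges at construction time keeps the array minimal, which is what lets a plain binary search decide membership without scanning neighbours.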

Modifier and Type | Field and Description |
---|---|
static Chset | ALNUM |
static Chset | ALPHA |
static Chset | ANYCHAR |
static Chset | ASCII |
static Chset | DIGIT |
static Chset | LOWER |
protected static char | MAX_CHAR |
protected static char | MIN_CHAR |
static Chset | NOTHING |
static Chset | UPPER |
static Chset | WHITESPACE |
static Chset | XDIGIT |

Constructor and Description |
---|
Chset(): Class constructor for an empty character set. |
Chset(char ch): Class constructor for a character literal. |
Chset(char min, char max): Class constructor for a single character range. |
Chset(java.lang.String spec): Class constructor that initializes a Chset from a string specification. |

Modifier and Type | Method and Description |
---|---|
protected void | clear(char min, char max) |
java.lang.Object | clone(): Returns a clone of this character set. |
static Chset | difference(Chset left, Chset right): Creates a new character set which matches a character if that character matches the left character set but does not match the right character set. |
static Chset | intersection(Chset left, Chset right): Creates a new character set which matches a character if that character matches both the left and right character sets. |
static Chset | not(Chset subject): Creates a new character set which matches a character if that character does not match the subject character set. |
int | parse(char[] buf, int start, int end, java.lang.Object data): Matches buf[start] against the character set. |
protected void | set(char min, char max) |
protected int | size(): Returns the size of the range array. |
boolean | test(char ch): Tests to see if a single character matches the character set. |
protected boolean | testRanges(char ch): Tests to see if a single character matches the character set, but only looks at the ranges representation. |
java.lang.String | toString() |
static Chset | union(Chset left, Chset right): Creates a new character set which matches a character if that character matches either the left or right character sets. |
static Chset | xor(Chset left, Chset right): Creates a new character set which matches a character if that character matches the left character set or the right character set, but not both. |

protected static final char MIN_CHAR
protected static final char MAX_CHAR
public static final Chset ANYCHAR
public static final Chset NOTHING
public static final Chset ALNUM
public static final Chset ALPHA
public static final Chset DIGIT
public static final Chset XDIGIT
public static final Chset LOWER
public static final Chset UPPER
public static final Chset WHITESPACE
public static final Chset ASCII
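
public Chset()
Class constructor for an empty character set.

public Chset(char ch)
Class constructor for a character literal.
ch - The character literal for this character set to match against.

public Chset(char min, char max)
Class constructor for a single character range. Characters between `min` and `max` match.
min - The beginning of the character range.
max - The end of the character range.

public Chset(java.lang.String spec)
Class constructor that initializes a `Chset` from a string specification.
spec - The string specification to initialize the `Chset` from.

public java.lang.Object clone()
Returns a clone of this character set.
Overrides: clone in class java.lang.Object
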
public int parse(char[] buf, int start, int end, java.lang.Object data)
Matches `buf[start]` against the character set.
parse in class Parser<java.lang.Object>
buf - The character array to match against.
start - The start offset of data within the character array to match against.
end - The end offset of data within the character array to match against.
data - User defined object that is passed to `Callback.handle` when an `Action` fires.
See also: Parser.parse(char[], int, int, T)
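
Because `parse` only examines `buf[start]`, consuming a run of characters requires some form of repetition. The following is a minimal sketch of that idea using plain Java rather than the library's `Repeat` parser. `RunMatchSketch` and `matchRun` are hypothetical names, the character set is stood in for by an `IntPredicate`, and treating `end` as an exclusive bound is an assumption that this page does not confirm.

```java
import java.util.function.IntPredicate;

/** Illustrative only: repeats a single-character test to consume a run of characters. */
final class RunMatchSketch {
    /**
     * Counts how many consecutive characters, starting at buf[start], satisfy the predicate.
     * ASSUMPTION: end is treated as exclusive here.
     */
    static int matchRun(char[] buf, int start, int end, IntPredicate set) {
        int pos = start;
        while (pos < end && set.test(buf[pos])) {
            pos++;
        }
        return pos - start;
    }

    public static void main(String[] args) {
        char[] buf = "2024-01".toCharArray();
        IntPredicate digit = c -> c >= '0' && c <= '9';
        System.out.println(matchRun(buf, 0, buf.length, digit)); // 4
    }
}
```

A single-character matcher stays simple and cheap this way, and the policy for how many characters to consume lives in the wrapping construct.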
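
public boolean test(char ch)
Tests to see if a single character matches the character set.
ch - The character to test.

protected boolean testRanges(char ch)
Tests to see if a single character matches the character set, but only looks at the ranges representation.
ch - The character to test.

protected void set(char min, char max)
set(char, char)

protected void clear(char min, char max)
clear(char, char)

protected int size()
Returns the size of the range array.
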
public static Chset not(Chset subject)
Creates a new character set which matches a character if that character does not match the `subject` character set. This operation is implemented by taking the difference of the `ANYCHAR` character set and the `subject` character set.

    ~subject --> anychar - subject

subject - The source character set.

public static Chset union(Chset left, Chset right)
Creates a new character set which matches a character if that character matches either the `left` or `right` character sets.

    left | right

left - The left source character set.
right - The right source character set.

public static Chset difference(Chset left, Chset right)
Creates a new character set which matches a character if that character matches the `left` character set but does not match the `right` character set.

    left - right

left - The left source character set.
right - The right source character set.

public static Chset intersection(Chset left, Chset right)
Creates a new character set which matches a character if that character matches both the `left` and `right` character sets.

    left & right --> left - ~right

left - The left source character set.
right - The right source character set.

public static Chset xor(Chset left, Chset right)
Creates a new character set which matches a character if that character matches the `left` character set or the `right` character set, but not both.

    left ^ right --> (left - right) | (right - left)

left - The left source character set.
right - The right source character set.

public java.lang.String toString()
Overrides: toString in class java.lang.Object
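
The identities used by `not`, `intersection`, and `xor` reduce every operation to `union` and `difference`. The sketch below checks those identities on `java.util.BitSet` values restricted to 7-bit ASCII. It is an illustration of the set algebra only, not of how `Chset` manipulates its range array; `SetAlgebraSketch`, its helper names, and the 128-character universe are assumptions made for the example.

```java
import java.util.BitSet;

/** Illustrative only: the Chset set identities expressed on BitSets over 7-bit ASCII. */
final class SetAlgebraSketch {
    static final int UNIVERSE = 128; // ASSUMPTION: 7-bit ASCII only, to keep the sketch small

    static BitSet range(char min, char max) {
        BitSet s = new BitSet(UNIVERSE);
        s.set(min, max + 1); // toIndex is exclusive, so add 1 for an inclusive range
        return s;
    }

    static BitSet anychar() {
        BitSet s = new BitSet(UNIVERSE);
        s.set(0, UNIVERSE);
        return s;
    }

    // The two primitive operations.
    static BitSet union(BitSet left, BitSet right) {
        BitSet out = (BitSet) left.clone();
        out.or(right);
        return out;
    }

    static BitSet difference(BitSet left, BitSet right) {
        BitSet out = (BitSet) left.clone();
        out.andNot(right);
        return out;
    }

    // ~subject --> anychar - subject
    static BitSet not(BitSet subject) {
        return difference(anychar(), subject);
    }

    // left & right --> left - ~right
    static BitSet intersection(BitSet left, BitSet right) {
        return difference(left, not(right));
    }

    // left ^ right --> (left - right) | (right - left)
    static BitSet xor(BitSet left, BitSet right) {
        return union(difference(left, right), difference(right, left));
    }

    public static void main(String[] args) {
        BitSet lower = range('a', 'z');
        BitSet hex = union(range('0', '9'), range('a', 'f'));
        System.out.println(intersection(lower, hex).cardinality()); // 6
        System.out.println(xor(lower, hex).cardinality());          // 30
        System.out.println(not(lower).cardinality());               // 102
    }
}
```

Deriving the other operations from `union` and `difference` means only two primitives have to be implemented against the underlying representation.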