A regular expression compiler class. This class compiles a pattern string into a
regular expression program interpretable by the RE evaluator class. The 'recompile'
command line tool uses this compiler to pre-compile regular expressions for use
with RE. For a description of the syntax accepted by RECompiler and what you can
do with regular expressions, see the documentation for the RE matcher class.
ESC_BACKREF
(package private) static final int ESC_BACKREF
ESC_CLASS
(package private) static final int ESC_CLASS
ESC_COMPLEX
(package private) static final int ESC_COMPLEX
ESC_MASK
(package private) static final int ESC_MASK
NODE_NORMAL
(package private) static final int NODE_NORMAL
NODE_NULLABLE
(package private) static final int NODE_NULLABLE
NODE_TOPLEVEL
(package private) static final int NODE_TOPLEVEL
bracketEnd
(package private) int[] bracketEnd
bracketMin
(package private) int[] bracketMin
bracketOpt
(package private) int[] bracketOpt
bracketStart
(package private) int[] bracketStart
bracketUnbounded
(package private) static final int bracketUnbounded
brackets
(package private) int brackets
hashPOSIX
(package private) static Hashtable hashPOSIX
idx
(package private) int idx
instruction
(package private) char[] instruction
len
(package private) int len
lenInstruction
(package private) int lenInstruction
maxBrackets
(package private) int maxBrackets
parens
(package private) int parens
pattern
(package private) String pattern
allocBrackets
(package private) void allocBrackets()
Allocate storage for brackets only as needed
atom
(package private) int atom()
throws RESyntaxException
Absorb an atomic character string. This method is a little tricky because
it can un-include the last character of string if a closure operator follows.
This is correct because *+? have higher precedence than concatenation (thus
ABC* means AB(C*) and NOT (ABC)*).
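The precedence rule described above can be checked with any regular expression engine. The sketch below uses the JDK's java.util.regex rather than RECompiler itself (a swapped-in engine, chosen only so the example runs without extra libraries); the class name ClosurePrecedenceDemo and the helper matchesWhole are hypothetical.

```java
import java.util.regex.Pattern;

public class ClosurePrecedenceDemo {
    // Returns true if the whole input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // The closure binds to the last character only: ABC* is AB(C*).
        System.out.println(matchesWhole("ABC*", "AB"));       // true
        System.out.println(matchesWhole("ABC*", "ABCCC"));    // true
        System.out.println(matchesWhole("ABC*", "ABCABC"));   // false
        // Explicit grouping is needed to repeat the whole string.
        System.out.println(matchesWhole("(ABC)*", "ABCABC")); // true
    }
}
```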
bracket
(package private) void bracket()
throws RESyntaxException
Match a bracket expression {m,n} and put the results in the bracket member variables.
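A minimal sketch of parsing a {m}, {m,} or {m,n} expression follows. It is not the RECompiler implementation: the class BracketSketch, the method parseBracket, and the UNBOUNDED sentinel are hypothetical, and the result layout (a minimum count plus the number of optional extra repetitions) is an assumption made for illustration.

```java
public class BracketSketch {
    // Assumed sentinel meaning "no upper bound", as in {m,}.
    static final int UNBOUNDED = -1;

    // Parses text such as "{2,5}" and returns {min, opt}, where opt is
    // assumed to be max - min, 0 for {m}, or UNBOUNDED for {m,}.
    static int[] parseBracket(String s) {
        if (s.isEmpty() || s.charAt(0) != '{') {
            throw new IllegalArgumentException("Expected '{'");
        }
        int i = 1;
        int start = i;
        while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
        if (i == start || i >= s.length()) {
            throw new IllegalArgumentException("Expected digits and '}'");
        }
        int min = Integer.parseInt(s.substring(start, i));
        if (s.charAt(i) == '}') {
            return new int[] { min, 0 };          // {m}: exactly m times
        }
        if (s.charAt(i) != ',') {
            throw new IllegalArgumentException("Expected ',' or '}'");
        }
        i++;
        if (i < s.length() && s.charAt(i) == '}') {
            return new int[] { min, UNBOUNDED };  // {m,}: m or more
        }
        start = i;
        while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
        if (i == start || i >= s.length() || s.charAt(i) != '}') {
            throw new IllegalArgumentException("Expected digits and '}'");
        }
        int max = Integer.parseInt(s.substring(start, i));
        if (max < min) {
            throw new IllegalArgumentException("Bad range: max < min");
        }
        return new int[] { min, max - min };      // {m,n}: between m and n
    }
}
```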
branch
(package private) int branch(int[] flags)
throws RESyntaxException
Compile one branch of an or operator (implements concatenation)
flags
- Flags passed by reference
characterClass
(package private) int characterClass()
throws RESyntaxException
Compile a character class
closure
(package private) int closure(int[] flags)
throws RESyntaxException
Compile a possibly closured terminal
flags
- Flags passed by reference
compile
public REProgram compile(String pattern)
throws RESyntaxException
Compiles a regular expression pattern into a program runnable by the pattern
matcher class 'RE'.
pattern
- Regular expression pattern to compile (see RECompiler class
for details).
- A compiled regular expression program.
emit
(package private) void emit(char c)
Emit a single character into the program stream.
ensure
(package private) void ensure(int n)
Ensures that n more characters can fit in the program buffer.
If n more can't fit, then the size is doubled until it can.
n
- Number of additional characters to ensure will fit.
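The doubling strategy described for ensure can be sketched as below. This is an illustration under stated assumptions, not the library's code: the class name ProgramBuffer and the initial capacity of 8 are hypothetical, while the field names instruction and lenInstruction follow the fields listed above.

```java
public class ProgramBuffer {
    char[] instruction = new char[8]; // assumed initial capacity
    int lenInstruction = 0;           // number of characters in use

    // Ensures that n more characters fit, doubling the capacity until they do.
    void ensure(int n) {
        int capacity = instruction.length;
        if (lenInstruction + n > capacity) {
            while (lenInstruction + n > capacity) {
                capacity *= 2;
            }
            char[] grown = new char[capacity];
            System.arraycopy(instruction, 0, grown, 0, lenInstruction);
            instruction = grown;
        }
    }

    // Emits a single character into the program stream.
    void emit(char c) {
        ensure(1);
        instruction[lenInstruction++] = c;
    }
}
```

Doubling keeps the amortized cost of each emit constant, at the price of up to half the buffer being unused.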
escape
(package private) int escape()
throws RESyntaxException
Match an escape sequence. Handles quoted chars and octal escapes as well
as normal escape characters. Always advances the input stream by the
right amount. This code "understands" the subtle difference between an
octal escape and a backref. You can access the type of ESC_CLASS or
ESC_COMPLEX or ESC_BACKREF by looking at pattern.charAt(idx - 1).
- ESC_* code or character if simple escape
expr
(package private) int expr(int[] flags)
throws RESyntaxException
Compile an expression with possible parens around it. Paren matching
is done at this level so we can tie the branch tails together.
flags
- Flag value passed by reference
- Node index of expression in instruction array
internalError
(package private) void internalError()
throws Error
Throws a new internal error exception
node
(package private) int node(char opcode,
int opdata)
Adds a new node
opcode
- Opcode for node
opdata
- Opdata for node (only the low 16 bits are currently used)
- Index of new node in program
nodeInsert
(package private) void nodeInsert(char opcode,
int opdata,
int insertAt)
Inserts a node with a given opcode and opdata at insertAt. The node relative next
pointer is initialized to 0.
opcode
- Opcode for new node
opdata
- Opdata for new node (only the low 16 bits are currently used)
insertAt
- Index at which to insert the new node in the program
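A minimal sketch of node insertion into a flat instruction array follows. The three-character node layout (opcode, opdata, relative next pointer), the constant NODE_SIZE, the class name NodeInsertSketch, and the fixed 64-character buffer are all assumptions made for illustration; buffer growth is omitted for brevity.

```java
public class NodeInsertSketch {
    // Assumed node layout: [opcode, opdata, relative next pointer].
    static final int NODE_SIZE = 3;

    char[] instruction = new char[64]; // fixed size; no growth in this sketch
    int lenInstruction = 0;

    // Appends a node and returns its index in the program.
    int node(char opcode, int opdata) {
        instruction[lenInstruction] = opcode;
        instruction[lenInstruction + 1] = (char) opdata; // low 16 bits only
        instruction[lenInstruction + 2] = 0;             // next pointer unset
        lenInstruction += NODE_SIZE;
        return lenInstruction - NODE_SIZE;
    }

    // Inserts a node at insertAt, shifting later nodes up by one node.
    void nodeInsert(char opcode, int opdata, int insertAt) {
        // System.arraycopy handles the overlapping ranges correctly.
        System.arraycopy(instruction, insertAt,
                         instruction, insertAt + NODE_SIZE,
                         lenInstruction - insertAt);
        instruction[insertAt] = opcode;
        instruction[insertAt + 1] = (char) opdata;
        instruction[insertAt + 2] = 0; // relative next pointer starts at 0
        lenInstruction += NODE_SIZE;
    }
}
```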
reallocBrackets
(package private) void reallocBrackets()
Enlarge storage for brackets only as needed.
setNextOfEnd
(package private) void setNextOfEnd(int node,
int pointTo)
Appends a node to the end of a node chain
node
- Start of node chain to traverse
pointTo
- Node to have the tail of the chain point to
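The chain traversal can be sketched as follows, assuming each node stores a relative next pointer and that a value of 0 marks the end of a chain. The node layout, the constants NODE_SIZE and OFFSET_NEXT, and the class name NodeChainSketch are hypothetical and serve only to make the example self-contained.

```java
public class NodeChainSketch {
    // Assumed node layout: [opcode, opdata, relative next pointer].
    static final int NODE_SIZE = 3;
    static final int OFFSET_NEXT = 2;

    char[] instruction = new char[64]; // fixed size; no growth in this sketch
    int lenInstruction = 0;

    // Appends a node and returns its index in the program.
    int node(char opcode, int opdata) {
        instruction[lenInstruction] = opcode;
        instruction[lenInstruction + 1] = (char) opdata;
        instruction[lenInstruction + OFFSET_NEXT] = 0;
        lenInstruction += NODE_SIZE;
        return lenInstruction - NODE_SIZE;
    }

    // Walks the chain starting at node and points its tail at pointTo.
    void setNextOfEnd(int node, int pointTo) {
        int next;
        // Read the pointer as a signed 16-bit offset; 0 means end of chain.
        while ((next = (short) instruction[node + OFFSET_NEXT]) != 0) {
            node += next;
        }
        instruction[node + OFFSET_NEXT] = (char) (short) (pointTo - node);
    }
}
```

For example, after creating three nodes a, b and c, calling setNextOfEnd(a, b) and then setNextOfEnd(a, c) links them as a, b, c, because the second call walks past a to the current tail b.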
syntaxError
(package private) void syntaxError(String s)
throws RESyntaxException
Throws a new syntax error exception
terminal
(package private) int terminal(int[] flags)
throws RESyntaxException
Match a terminal node.
- Index of terminal node (closeable)