org.apache.oro.text.regex
public final class Perl5Matcher extends Object implements PatternMatcher
Perl5Compiler and Perl5Matcher are designed with the intent that you use a separate instance of each per thread to avoid the overhead of both synchronization and concurrent access (e.g., a match that takes a long time in one thread will block the progress of another thread with a shorter match). If you want to use a single instance of each in a concurrent program, you must appropriately protect access to the instances with critical sections. If you want to share Perl5Pattern instances between concurrently executing instances of Perl5Matcher, you must compile the patterns with READ_ONLY_MASK.
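The threading advice above could be followed roughly as in the sketch below. It assumes the Jakarta ORO 2.0.8 jar is on the classpath and Java 8 or later; the class name `PerThreadMatcherSketch` and the helpers `compileReadOnly` and `hasDigits` are hypothetical names chosen for illustration. Each thread gets its own Perl5Matcher through a ThreadLocal, while a single pattern compiled with READ_ONLY_MASK is shared.

```java
import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;

public class PerThreadMatcherSketch {
    // One matcher per thread, so no critical section is needed around match calls.
    private static final ThreadLocal<Perl5Matcher> MATCHER =
        ThreadLocal.withInitial(Perl5Matcher::new);

    // Compiled once with READ_ONLY_MASK so it may be shared between matchers.
    private static final Pattern DIGITS = compileReadOnly("[0-9]+");

    // Hypothetical helper: wraps the checked exception for brevity.
    static Pattern compileReadOnly(String regex) {
        try {
            return new Perl5Compiler().compile(regex, Perl5Compiler.READ_ONLY_MASK);
        } catch (MalformedPatternException e) {
            throw new IllegalArgumentException("Bad pattern: " + regex, e);
        }
    }

    // Uses the calling thread's own matcher against the shared pattern.
    static boolean hasDigits(String input) {
        return MATCHER.get().contains(input, DIGITS);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
            Thread.currentThread().getName() + ": " + hasDigits("build 42"));
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

A ThreadLocal is only one way to get one matcher per thread; creating a matcher inside each task works equally well when tasks are long-lived.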
Since: 1.0
Version: 2.0.8
See Also: PatternMatcher Perl5Compiler
Method Summary

Return type | Method | Description
---|---|---
boolean | contains(String input, Pattern pattern) | Determines if a string contains a pattern.
boolean | contains(char[] input, Pattern pattern) | Determines if a string (represented as a char[]) contains a pattern.
boolean | contains(PatternMatcherInput input, Pattern pattern) | Determines if the contents of a PatternMatcherInput, starting from the current offset of the input, contain a pattern.
MatchResult | getMatch() | Fetches the last match found by a call to a matches() or contains() method.
boolean | isMultiline() | Reports whether input is treated as consisting of multiple lines with respect to the ^ and $ metacharacters.
boolean | matches(char[] input, Pattern pattern) | Determines if a string (represented as a char[]) exactly matches a given pattern.
boolean | matches(String input, Pattern pattern) | Determines if a string exactly matches a given pattern.
boolean | matches(PatternMatcherInput input, Pattern pattern) | Determines if the contents of a PatternMatcherInput instance exactly match a given pattern.
boolean | matchesPrefix(char[] input, Pattern pattern, int offset) | Determines if a prefix of a string (represented as a char[]) matches a given pattern, starting from a given offset into the string.
boolean | matchesPrefix(char[] input, Pattern pattern) | Determines if a prefix of a string (represented as a char[]) matches a given pattern.
boolean | matchesPrefix(String input, Pattern pattern) | Determines if a prefix of a string matches a given pattern.
boolean | matchesPrefix(PatternMatcherInput input, Pattern pattern) | Determines if a prefix of a PatternMatcherInput instance matches a given pattern.
void | setMultiline(boolean multiline) | Sets whether or not subsequent calls to matches() or contains() should treat the input as consisting of multiple lines.
The pattern must be a Perl5Pattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use a Perl5Pattern as the pattern parameter.
Parameters: input The String to test for a match. pattern The Perl5Pattern to be matched.
Returns: True if the input contains a pattern match, false otherwise.
Throws: ClassCastException If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter.
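As a minimal sketch of the call described above, the following searches a String and reads the captured groups from the resulting MatchResult. It assumes the Jakarta ORO 2.0.8 jar is on the classpath; the class `ContainsSketch`, the method `keyValue`, and the sample pattern are hypothetical and exist only for illustration.

```java
import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.MatchResult;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;

public class ContainsSketch {
    // Returns {key, value} for the first "name=digits" pair found in input,
    // or null if the input contains no such pair.
    static String[] keyValue(String input) {
        Pattern pattern;
        try {
            pattern = new Perl5Compiler().compile("([a-z]+)=([0-9]+)");
        } catch (MalformedPatternException e) {
            throw new IllegalStateException(e);
        }
        Perl5Matcher matcher = new Perl5Matcher();
        if (!matcher.contains(input, pattern)) {
            return null;
        }
        MatchResult result = matcher.getMatch();
        // group(0) is the whole match; groups 1 and 2 are the parenthesized parts.
        return new String[] { result.group(1), result.group(2) };
    }

    public static void main(String[] args) {
        String[] pair = keyValue("x; width=80;");
        System.out.println(pair == null ? "no match" : pair[0] + " -> " + pair[1]);
    }
}
```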
The pattern must be a Perl5Pattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use a Perl5Pattern as the pattern parameter.
Parameters: input The char[] to test for a match. pattern The Perl5Pattern to be matched.
Returns: True if the input contains a pattern match, false otherwise.
Throws: ClassCastException If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter.
As a side effect, if a match is found, the match offset information of the PatternMatcherInput is updated. See the PatternMatcherInput class for more details.
The pattern must be a Perl5Pattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use a Perl5Pattern as the pattern parameter.
This method is usually used in a loop as follows:
    PatternMatcher matcher;
    PatternCompiler compiler;
    Pattern pattern;
    PatternMatcherInput input;
    MatchResult result;

    compiler = new Perl5Compiler();
    matcher = new Perl5Matcher();

    try {
      pattern = compiler.compile(somePatternString);
    } catch(MalformedPatternException e) {
      System.err.println("Bad pattern.");
      System.err.println(e.getMessage());
      return;
    }

    input = new PatternMatcherInput(someStringInput);

    while(matcher.contains(input, pattern)) {
      result = matcher.getMatch();
      // Perform whatever processing on the result you want.
    }
Parameters: input The PatternMatcherInput to test for a match. pattern The Pattern to be matched.
Returns: True if the input contains a pattern match, false otherwise.
Throws: ClassCastException If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter.
Returns: A MatchResult instance containing the pattern match found by the last call to any one of the matches() or contains() methods. If no match was found by the last call, returns null.
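Because getMatch() reflects only the most recent call, a null check is needed whenever the boolean result of that call was not inspected. The sketch below illustrates this under the assumption that the Jakarta ORO 2.0.8 jar is on the classpath; `GetMatchSketch` and `lastMatchAfter` are hypothetical names.

```java
import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.MatchResult;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;

public class GetMatchSketch {
    // Runs contains() on each input in order with a single matcher, then
    // returns the text of the match left by the final call, or null if the
    // final call found nothing.
    static String lastMatchAfter(String regex, String... inputs) {
        Pattern pattern;
        try {
            pattern = new Perl5Compiler().compile(regex);
        } catch (MalformedPatternException e) {
            throw new IllegalArgumentException("Bad pattern: " + regex, e);
        }
        Perl5Matcher matcher = new Perl5Matcher();
        for (String input : inputs) {
            matcher.contains(input, pattern); // boolean result deliberately ignored
        }
        MatchResult result = matcher.getMatch();
        return result == null ? null : result.group(0);
    }

    public static void main(String[] args) {
        // An earlier success does not survive a later failed call.
        System.out.println(lastMatchAfter("[0-9]+", "a1", "b"));
        System.out.println(lastMatchAfter("[0-9]+", "b", "a12"));
    }
}
```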
Returns: True if the matcher is treating input as consisting of multiple lines with respect to the ^ and $ metacharacters, false otherwise.
Note: matches() is not the same as sticking a ^ in front of your expression and a $ at the end of your expression in Perl5 and using the =~ operator, even though in many cases it will be equivalent. matches() literally looks for an exact match according to the rules of Perl5 expression matching. Therefore, if you have a pattern foo|foot and are matching the input foot it will not produce an exact match. But foot|foo will produce an exact match for either foot or foo. Remember, Perl5 regular expressions do not match the longest possible match. From the perlre manpage:
Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen. This means that alternatives are not necessarily greedy. For example: when matching foo|foot against "barefoot", only the "foo" part will match, as that is the first alternative tried, and it successfully matches the target string.
Parameters: input The char[] to test for an exact match. pattern The Perl5Pattern to be matched.
Returns: True if input matches pattern, false otherwise.
Throws: ClassCastException If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter.
Note: matches() is not the same as sticking a ^ in front of your expression and a $ at the end of your expression in Perl5 and using the =~ operator, even though in many cases it will be equivalent. matches() literally looks for an exact match according to the rules of Perl5 expression matching. Therefore, if you have a pattern foo|foot and are matching the input foot it will not produce an exact match. But foot|foo will produce an exact match for either foot or foo. Remember, Perl5 regular expressions do not match the longest possible match. From the perlre manpage:
Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen. This means that alternatives are not necessarily greedy. For example: when matching foo|foot against "barefoot", only the "foo" part will match, as that is the first alternative tried, and it successfully matches the target string.
Parameters: input The String to test for an exact match. pattern The Perl5Pattern to be matched.
Returns: True if input matches pattern, false otherwise.
Throws: ClassCastException If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter.
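The alternation example from the note above can be tried with a small sketch like the following. It assumes the Jakarta ORO 2.0.8 jar is on the classpath; `ExactMatchSketch` and `exactMatch` are hypothetical names, and the results mentioned in the comments simply restate what the note describes.

```java
import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;

public class ExactMatchSketch {
    // Compiles regex and reports whether matches() accepts the whole input.
    static boolean exactMatch(String regex, String input) {
        Pattern pattern;
        try {
            pattern = new Perl5Compiler().compile(regex);
        } catch (MalformedPatternException e) {
            throw new IllegalArgumentException("Bad pattern: " + regex, e);
        }
        return new Perl5Matcher().matches(input, pattern);
    }

    public static void main(String[] args) {
        // Per the note: the first alternative "foo" succeeds but leaves the
        // trailing "t" unmatched, so this is not an exact match.
        System.out.println(exactMatch("foo|foot", "foot"));
        // Per the note: listing the longer alternative first gives an exact
        // match for either input.
        System.out.println(exactMatch("foot|foo", "foot"));
        System.out.println(exactMatch("foot|foo", "foo"));
    }
}
```

When alternatives share a prefix, ordering them from longest to shortest is the usual way to get the behavior most callers expect from matches().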
The pattern must be a Perl5Pattern instance, otherwise a ClassCastException will be thrown. You are not required to, and indeed should NOT try to (for performance reasons), catch a ClassCastException because it will never be thrown as long as you use a Perl5Pattern as the pattern parameter.
Note: matches() is not the same as sticking a ^ in front of your expression and a $ at the end of your expression in Perl5 and using the =~ operator, even though in many cases it will be equivalent. matches() literally looks for an exact match according to the rules of Perl5 expression matching. Therefore, if you have a pattern foo|foot and are matching the input foot it will not produce an exact match. But foot|foo will produce an exact match for either foot or foo. Remember, Perl5 regular expressions do not match the longest possible match. From the perlre manpage:
Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen. This means that alternatives are not necessarily greedy. For example: when matching foo|foot against "barefoot", only the "foo" part will match, as that is the first alternative tried, and it successfully matches the target string.
Parameters: input The PatternMatcherInput to test for a match. pattern The Perl5Pattern to be matched.
Returns: True if input matches pattern, false otherwise.
Throws: ClassCastException If a Pattern instance other than a Perl5Pattern is passed as the pattern parameter.
This method is useful for certain common token identification tasks that are made more difficult without this functionality.
Parameters: input The char[] to test for a prefix match. pattern The Pattern to be matched. offset The offset at which to start searching for the prefix.
Returns: True if a prefix of input, starting at offset, matches pattern, false otherwise.
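One token identification task of the kind mentioned above is a simple scanner that tries each token pattern at the current position and advances by the length of whatever matched. The sketch below assumes the Jakarta ORO 2.0.8 jar is on the classpath; `PrefixTokenizerSketch`, `tokenize`, and the two token patterns are hypothetical choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;

public class PrefixTokenizerSketch {
    // Splits text into runs of lowercase letters and runs of digits,
    // skipping any character that starts neither kind of token.
    static List<String> tokenize(String text) {
        Perl5Compiler compiler = new Perl5Compiler();
        Pattern[] tokenPatterns;
        try {
            tokenPatterns = new Pattern[] {
                compiler.compile("[a-z]+"),
                compiler.compile("[0-9]+")
            };
        } catch (MalformedPatternException e) {
            throw new IllegalStateException(e);
        }

        Perl5Matcher matcher = new Perl5Matcher();
        char[] input = text.toCharArray();
        List<String> tokens = new ArrayList<>();
        int offset = 0;

        while (offset < input.length) {
            int length = 0;
            for (Pattern pattern : tokenPatterns) {
                // Only a match that begins exactly at offset counts.
                if (matcher.matchesPrefix(input, pattern, offset)) {
                    length = matcher.getMatch().length();
                    break;
                }
            }
            if (length == 0) {
                offset++; // no token starts here; move past this character
                continue;
            }
            tokens.add(new String(input, offset, length));
            offset += length;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("abc123 def"));
    }
}
```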
This method is useful for certain common token identification tasks that are made more difficult without this functionality.
Parameters: input The char[] to test for a prefix match. pattern The Pattern to be matched.
Returns: True if a prefix of input matches pattern, false otherwise.
This method is useful for certain common token identification tasks that are made more difficult without this functionality.
Parameters: input The String to test for a prefix match. pattern The Pattern to be matched.
Returns: True if a prefix of input matches pattern, false otherwise.
Unlike the matches(PatternMatcherInput, Pattern) method, matchesPrefix() will start its search from the current offset rather than the begin offset of the PatternMatcherInput.
This method is useful for certain common token identification tasks that are made more difficult without this functionality.
Parameters: input The PatternMatcherInput to test for a prefix match. pattern The Pattern to be matched.
Returns: True if a prefix of input, starting at its current offset, matches pattern, false otherwise.
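The role of the current offset can be illustrated with a sketch that positions a PatternMatcherInput explicitly before testing for a prefix. It assumes the Jakarta ORO 2.0.8 jar is on the classpath; `PrefixFromOffsetSketch` and `prefixAt` are hypothetical names. The sketch makes no assumption about whether matchesPrefix() itself moves the current offset.

```java
import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.PatternMatcherInput;
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;

public class PrefixFromOffsetSketch {
    // Returns the text matched by regex starting exactly at the given offset
    // of text, or null if no prefix match begins there.
    static String prefixAt(String text, int offset, String regex) {
        Pattern pattern;
        try {
            pattern = new Perl5Compiler().compile(regex);
        } catch (MalformedPatternException e) {
            throw new IllegalArgumentException("Bad pattern: " + regex, e);
        }
        PatternMatcherInput input = new PatternMatcherInput(text);
        input.setCurrentOffset(offset); // the prefix test starts here

        Perl5Matcher matcher = new Perl5Matcher();
        return matcher.matchesPrefix(input, pattern)
            ? matcher.getMatch().group(0)
            : null;
    }

    public static void main(String[] args) {
        System.out.println(prefixAt("key=value", 0, "[a-z]+"));
        System.out.println(prefixAt("key=value", 4, "[a-z]+"));
    }
}
```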
Set whether or not subsequent calls to matches() or contains() should treat the input as consisting of multiple lines. The default behavior is for input to be treated as consisting of multiple lines. This method should only be called if the Perl5Pattern used for a match was compiled without either of the Perl5Compiler.MULTILINE_MASK or Perl5Compiler.SINGLELINE_MASK flags, and you want to alter on the fly how the ^, $, and . metacharacters are interpreted. The compilation options used when compiling a pattern ALWAYS override the behavior specified by setMultiline(). See Perl5Compiler for more details.
Parameters: multiline If set to true treats the input as consisting of multiple lines with respect to the ^ and $ metacharacters. If set to false treats the input as consisting of a single line with respect to the ^ and $ metacharacters.
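A short sketch of toggling this setting on a pattern compiled without MULTILINE_MASK or SINGLELINE_MASK follows. It assumes the Jakarta ORO 2.0.8 jar is on the classpath; `MultilineSketch` and `containsWithMode` are hypothetical names. The mode is always set explicitly so that the sketch does not depend on the matcher's default.

```java
import org.apache.oro.text.regex.MalformedPatternException;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;

public class MultilineSketch {
    // Compiles regex with the default options, sets the matcher's multiline
    // mode explicitly, and reports whether text contains a match.
    static boolean containsWithMode(String text, String regex, boolean multiline) {
        Pattern pattern;
        try {
            pattern = new Perl5Compiler().compile(regex);
        } catch (MalformedPatternException e) {
            throw new IllegalArgumentException("Bad pattern: " + regex, e);
        }
        Perl5Matcher matcher = new Perl5Matcher();
        matcher.setMultiline(multiline);
        return matcher.contains(text, pattern);
    }

    public static void main(String[] args) {
        String text = "foo\nbar";
        // In multiline mode ^ may match just after the embedded newline.
        System.out.println(containsWithMode(text, "^bar", true));
        // In single-line mode ^ matches only at the very start of the input.
        System.out.println(containsWithMode(text, "^bar", false));
    }
}
```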