public class XMLTokenizer extends Object
Use the tokenizer directly to parse XML yourself, or use XMLParser to have it parse XML into a Document.
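A minimal sketch of the pull-style loop this implies, assuming only what the method summary states: next() returns one token per call and null once the input is exhausted. MiniTokenizer is a hypothetical stand-in, not the library's class, so the example runs without the library on the classpath. It only separates markup from text and returns raw strings rather than Token objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a pull tokenizer; NOT the library's XMLTokenizer.
class MiniTokenizer {
    private final String source;
    private int pos = 0;

    MiniTokenizer(String source) {
        this.source = source;
    }

    // Returns the next raw token, or null when there is no more input.
    String next() {
        if (pos >= source.length()) {
            return null;
        }
        int start = pos;
        if (source.charAt(pos) == '<') {
            int end = source.indexOf('>', pos);
            // Unterminated markup: treat the rest of the input as one token.
            pos = (end < 0) ? source.length() : end + 1;
        } else {
            int end = source.indexOf('<', pos);
            pos = (end < 0) ? source.length() : end;
        }
        return source.substring(start, pos);
    }

    // Typical consumer: keep calling next() until it returns null.
    static List<String> tokenize(String xml) {
        MiniTokenizer tokenizer = new MiniTokenizer(xml);
        List<String> tokens = new ArrayList<>();
        String token;
        while ((token = tokenizer.next()) != null) {
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("<a>text</a>")) {
            System.out.println(token);
        }
    }
}
```

With the real class, the same loop shape would apply to the Token objects returned by next().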
Modifier and Type | Class and Description |
---|---|
static class | XMLTokenizer.Type: Types of tokens the tokenizer can return |
Modifier and Type | Field and Description |
---|---|
protected boolean | inStartElement: true if we're currently inside of a start tag |
protected int | pos: The current position in the source |
protected XMLSource | source |
Constructor and Description |
---|
XMLTokenizer(XMLSource source) |
Modifier and Type | Method and Description |
---|---|
protected Token | createToken(): All tokens are created here. |
protected void | expect(char expected): Check that the next character is expected and skip it. |
CharValidator | getCharValidator() |
EntityResolver | getEntityResolver() |
int | getOffset(): Get the current parsing position (for error handling, for example). |
XMLSource | getSource() |
boolean | isTreatEntitiesAsText() |
protected String | lookAheadForErrorMessage(String conditionalPrefix, int pos, int len) |
Token | next(): Fetch the next token from the source. |
protected char | nextChar(String errorMessage) |
protected void | nextChars(String expected, int startPos, String errorMessage) |
protected void | parseAttribute(Token token): Read the attribute of an element. |
protected void | parseBeginElement(Token token): Read the name of an element. |
protected void | parseBeginSomething(Token token): Read one of "<tag", "<?pi", "<!--", "<![CDATA[" or an end tag. |
protected void | parseCData(Token token): Parse a CDATA element. |
protected void | parseComment(Token token): Read a comment. |
protected void | parseDocType(Token token): Parse a doctype declaration. |
protected void | parseEndElement(Token token): Read an end tag. |
protected void | parseEntity(Token token) |
protected void | parseExcalamation(Token token): Parse "<!--" or "<![CDATA[". |
protected void | parseName(String objectName): Read an XML name. |
protected void | parseProcessingInstruction(Token token): Read a processing instruction. |
protected void | parseText(Token token): Read a piece of text. |
XMLTokenizer | setCharValidator(CharValidator charValidator) |
XMLTokenizer | setEntityResolver(EntityResolver resolver) |
void | setOffset(int offset): Set the current parsing position. |
XMLTokenizer | setTreatEntitiesAsText(boolean treatEntitiesAsText) |
protected void | skipChar(char c): Advance one or two positions, depending on whether the current character is the high part of a surrogate pair. |
protected void | skipWhiteSpace(): Advance the current position past any whitespace in the input. |
protected void | verifyEntity(int start, int end): Verify an entity. |
protected final XMLSource source
protected int pos
protected boolean inStartElement
public XMLTokenizer(XMLSource source)
public XMLTokenizer setTreatEntitiesAsText(boolean treatEntitiesAsText)
public boolean isTreatEntitiesAsText()
public CharValidator getCharValidator()
public XMLTokenizer setCharValidator(CharValidator charValidator)
public EntityResolver getEntityResolver()
public XMLTokenizer setEntityResolver(EntityResolver resolver)
public Token next()
Returns null if there are no more tokens in the input (at EOF).
protected Token createToken()
Use this method to create custom tokens with additional information.
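One way to read this is that createToken() acts as a factory hook: a subclass overrides it to return its own token type. The sketch below illustrates that pattern with hypothetical stand-in classes (BaseToken, BaseTokenizer, LineToken, LineTokenizer). They are not the library's types, and nothing about the real Token API is assumed.

```java
// Hypothetical stand-ins illustrating the factory-method hook; NOT the library's classes.
class BaseToken {
    int startOffset;
}

class BaseTokenizer {
    protected final String source;
    protected int pos = 0;

    BaseTokenizer(String source) {
        this.source = source;
    }

    // All tokens are created here, so subclasses have a single place to override.
    protected BaseToken createToken() {
        return new BaseToken();
    }

    // Produces one token per character, just to exercise the hook.
    BaseToken next() {
        if (pos >= source.length()) {
            return null;
        }
        BaseToken token = createToken();
        token.startOffset = pos;
        pos++;
        return token;
    }
}

// A custom token carrying additional information: the line it starts on.
class LineToken extends BaseToken {
    int line;
}

class LineTokenizer extends BaseTokenizer {
    LineTokenizer(String source) {
        super(source);
    }

    @Override
    protected BaseToken createToken() {
        LineToken token = new LineToken();
        token.line = lineAt(pos);
        return token;
    }

    // Counts the newlines before the given offset and returns a 1-based line number.
    private int lineAt(int offset) {
        int line = 1;
        for (int i = 0; i < offset; i++) {
            if (source.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }
}
```

Because the base class never calls a token constructor anywhere else, the subclass does not need to touch any of the parsing methods.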
public XMLSource getSource()
public int getOffset()
This value is not very accurate because the tokenizer might be anywhere in the stream.
public void setOffset(int offset)
protected void parseBeginSomething(Token token)
protected void parseBeginElement(Token token)
The resulting token will contain the '<' plus any whitespace between it and the name plus the name itself but no whitespace after the name.
protected void parseEndElement(Token token)
The resulting token will contain the '</' and '>' plus the name plus any whitespace between those three.
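The token boundaries described for parseBeginElement and parseEndElement can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the helper names are hypothetical, whitespace is taken to be the four XML whitespace characters, and a name is simplified to "everything up to whitespace, '/' or '>'".

```java
// Hypothetical helpers showing which characters end up in each token.
class ElementTokenSketch {
    private static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // '<' plus any whitespace before the name plus the name; no trailing whitespace.
    static String beginElementToken(String src, int pos) {
        if (src.charAt(pos) != '<') {
            throw new IllegalArgumentException("Expected '<' at offset " + pos);
        }
        int i = pos + 1;
        while (i < src.length() && isXmlWhitespace(src.charAt(i))) {
            i++;
        }
        // Simplified name scan: stop at whitespace, '/' or '>'.
        while (i < src.length()) {
            char c = src.charAt(i);
            if (isXmlWhitespace(c) || c == '/' || c == '>') {
                break;
            }
            i++;
        }
        return src.substring(pos, i);
    }

    // '</' through '>' inclusive, with the name and any whitespace in between.
    static String endElementToken(String src, int pos) {
        if (!src.startsWith("</", pos)) {
            throw new IllegalArgumentException("Expected '</' at offset " + pos);
        }
        int close = src.indexOf('>', pos + 2);
        if (close < 0) {
            throw new IllegalArgumentException("Missing '>' for end tag at offset " + pos);
        }
        return src.substring(pos, close + 1);
    }
}
```

For example, beginElementToken("<root id='1'>", 0) yields "<root", while endElementToken("</ root >tail", 0) yields "</ root >".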
protected void parseExcalamation(Token token)
protected void parseDocType(Token token)
The resulting token will contain "
protected void parseCData(Token token)
The resulting token will contain the "<![CDATA[" plus the terminating "]]>".
protected void parseComment(Token token)
The resulting token will contain the "<!--" plus the terminating "-->".
protected void parseProcessingInstruction(Token token)
The resulting token will contain the "<?" plus the terminating "?>".
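CDATA sections, comments and processing instructions are all described as tokens that include their opening and terminating delimiters. A minimal sketch of that idea, assuming a simple search for the terminator; delimitedToken is a hypothetical helper, and the real methods may validate content that this sketch ignores.

```java
// Hypothetical helper: extracts a token that includes both of its delimiters.
class DelimitedTokenSketch {
    static String delimitedToken(String src, int pos, String open, String close) {
        if (!src.startsWith(open, pos)) {
            throw new IllegalArgumentException("Expected " + open + " at offset " + pos);
        }
        // Search after the opening delimiter so the two delimiters cannot overlap.
        int end = src.indexOf(close, pos + open.length());
        if (end < 0) {
            throw new IllegalArgumentException("Missing terminating " + close);
        }
        return src.substring(pos, end + close.length());
    }
}
```

For example, delimitedToken("<!-- hi --><a/>", 0, "<!--", "-->") yields "<!-- hi -->", delimiters included.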
protected void parseAttribute(Token token)
The resulting token will contain the name, "=" plus the quotes and the value.
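As an illustration of that token span, the sketch below scans a name, the "=" and a quoted value, and returns everything from the first character of the name to the closing quote. attributeToken is a hypothetical helper. Allowing whitespace around "=" and accepting either quote character follow the XML grammar and are assumptions about the tokenizer, not facts taken from this page.

```java
// Hypothetical helper: returns name, "=", quotes and value as one raw token.
class AttributeTokenSketch {
    private static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    private static int skipWhitespace(String src, int i) {
        while (i < src.length() && isXmlWhitespace(src.charAt(i))) {
            i++;
        }
        return i;
    }

    static String attributeToken(String src, int pos) {
        int i = pos;
        // Simplified name scan: everything up to whitespace or '='.
        while (i < src.length() && src.charAt(i) != '=' && !isXmlWhitespace(src.charAt(i))) {
            i++;
        }
        i = skipWhitespace(src, i);
        if (i >= src.length() || src.charAt(i) != '=') {
            throw new IllegalArgumentException("Expected '=' at offset " + i);
        }
        i = skipWhitespace(src, i + 1);
        if (i >= src.length() || (src.charAt(i) != '"' && src.charAt(i) != '\'')) {
            throw new IllegalArgumentException("Expected a quote at offset " + i);
        }
        char quote = src.charAt(i);
        // The value ends at the next occurrence of the same quote character.
        int close = src.indexOf(quote, i + 1);
        if (close < 0) {
            throw new IllegalArgumentException("Missing closing quote for attribute at offset " + pos);
        }
        return src.substring(pos, close + 1);
    }
}
```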
protected void parseName(String objectName)
protected void parseText(Token token)
The resulting token will contain the text as is with all the entity and numeric character references.
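In other words, the text token is the raw source slice, with references such as &lt; or &#65; left unexpanded. A minimal sketch of that behavior, assuming text simply runs up to the next '<'; textToken is a hypothetical helper and does no entity handling at all.

```java
// Hypothetical helper: returns the raw text up to the next '<' without decoding anything.
class TextTokenSketch {
    static String textToken(String src, int pos) {
        int end = src.indexOf('<', pos);
        if (end < 0) {
            end = src.length();
        }
        // Entity and numeric character references stay exactly as written in the source.
        return src.substring(pos, end);
    }
}
```

For example, textToken("a &lt; b &#65;<c/>", 0) yields "a &lt; b &#65;", so a caller that needs the decoded characters has to resolve the references itself.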
protected void skipChar(char c)
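The method summary describes skipChar as advancing one or two positions, depending on whether the current character is the high part of a surrogate pair. A minimal sketch of that rule using the standard Character.isHighSurrogate check. The static form and the explicit pos parameter are assumptions made for illustration; the real method is an instance method and presumably updates the pos field.

```java
// Hypothetical static version of the surrogate-aware skip.
class SkipCharSketch {
    // Returns the position after c: two chars for a high surrogate, one otherwise.
    static int skipChar(char c, int pos) {
        return Character.isHighSurrogate(c) ? pos + 2 : pos + 1;
    }

    // Counts code points by repeatedly skipping; assumes well-formed surrogate pairs.
    static int countCodePoints(String s) {
        int pos = 0;
        int count = 0;
        while (pos < s.length()) {
            pos = skipChar(s.charAt(pos), pos);
            count++;
        }
        return count;
    }
}
```

Skipping this way keeps the position from landing between the two halves of a character outside the Basic Multilingual Plane.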
protected void verifyEntity(int start, int end)
protected void parseEntity(Token token)
protected char nextChar(String errorMessage)
protected void expect(char expected)
Check that the next character is expected and skip it.
protected String lookAheadForErrorMessage(String conditionalPrefix, int pos, int len)
protected void skipWhiteSpace()
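The method summary describes skipWhiteSpace as advancing the current position past any whitespace in the input. A minimal sketch, assuming "whitespace" means the four characters the XML specification allows (space, tab, carriage return, line feed). The static form and the returned position are illustrative; the real method takes no arguments and works on the tokenizer's own state.

```java
// Hypothetical static version of the whitespace skip.
class SkipWhiteSpaceSketch {
    private static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // Returns the first position at or after pos that is not XML whitespace.
    static int skipWhiteSpace(String src, int pos) {
        while (pos < src.length() && isXmlWhitespace(src.charAt(pos))) {
            pos++;
        }
        return pos;
    }
}
```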
Copyright © 2008–2014. All rights reserved.