public class CompositeTag extends TagNode
Tag
provides. Also handles the conversion of its children for
the toHtml
method.
Modifier and Type | Field and Description |
---|---|
protected static CompositeTagScanner |
mDefaultCompositeScanner
The default scanner for non-composite tags.
|
protected Tag |
mEndTag
The tag that causes this tag to finish.
|
Fields inherited from class TagNode: breakTags, mAttributes, mDefaultScanner
Constructor and Description |
---|
CompositeTag()
Create a composite tag.
|
Modifier and Type | Method and Description |
---|---|
void |
accept(NodeVisitor visitor)
Tag visiting code.
|
Node |
childAt(int index)
Get the child at the given index.
|
SimpleNodeIterator |
children()
Get an iterator over the children of this node.
|
void |
collectInto(NodeList list,
NodeFilter filter)
Collect this node and its child nodes (if applicable) into the list parameter,
provided the node satisfies the filtering criteria.
|
Text[] |
digupStringNode(java.lang.String searchText)
Finds a text node, however embedded it might be, and returns
it.
|
SimpleNodeIterator |
elements()
Return the child tags as an iterator.
|
int |
findPositionOf(Node searchNode)
Returns the node number of a child node given the node object.
|
int |
findPositionOf(java.lang.String text)
Returns the node number of the first node containing the given text.
|
int |
findPositionOf(java.lang.String text,
java.util.Locale locale)
Returns the node number of the first node containing the given text.
|
Node |
getChild(int index)
Get the child of this node at the given position.
|
int |
getChildCount()
Return the number of child nodes in this tag.
|
Node[] |
getChildrenAsNodeArray()
Get the children as an array of
Node objects. |
java.lang.String |
getChildrenHTML()
Return the HTML code for the children of this tag.
|
Tag |
getEndTag()
Get the end tag for this tag.
|
java.lang.String |
getStringText()
Return the text between the start tag and the end tag.
|
java.lang.String |
getText()
Return the text contained in this tag.
|
protected void |
putChildrenInto(java.lang.StringBuffer sb)
Add the textual contents of the children of this node to the buffer.
|
protected void |
putEndTagInto(java.lang.StringBuffer sb)
Add the textual contents of the end tag of this node to the buffer.
|
void |
removeChild(int i)
Remove the child at the position given.
|
Tag |
searchByName(java.lang.String name)
Searches the children for a tag whose name attribute matches.
|
NodeList |
searchFor(java.lang.Class classType,
boolean recursive)
Collect all objects that are of a certain type.
Note that this will not check for parent types, and will
recurse through child tags only if recursive is true.
|
NodeList |
searchFor(java.lang.String searchString)
Searches for all nodes whose text representation contains the search string.
|
NodeList |
searchFor(java.lang.String searchString,
boolean caseSensitive)
Searches for all nodes whose text representation contains the search string.
|
NodeList |
searchFor(java.lang.String searchString,
boolean caseSensitive,
java.util.Locale locale)
Searches for all nodes whose text representation contains the search string.
|
void |
setEndTag(Tag tag)
Set the end tag for this tag.
|
java.lang.String |
toHtml()
Return this tag as HTML code.
|
java.lang.String |
toPlainTextString()
Return the textual contents of this tag and its children.
|
java.lang.String |
toString()
Return a string representation of the contents of this tag, its children and its end tag, suitable for debugging.
|
void |
toString(int level,
java.lang.StringBuffer buffer)
Return a string representation of the contents of this tag, its children and its end tag, suitable for debugging.
|
Methods inherited from class TagNode: breaksFlow, getAttribute, getAttributeEx, getAttributes, getAttributesEx, getEnders, getEndingLineNumber, getEndTagEnders, getIds, getParsed, getRawTagName, getStartingLineNumber, getTagBegin, getTagEnd, getTagName, getThisScanner, isEmptyXmlTag, isEndTag, removeAttribute, setAttribute, setAttribute, setAttribute, setAttributeEx, setAttributes, setAttributesEx, setEmptyXmlTag, setTagBegin, setTagEnd, setTagName, setText, setThisScanner
Methods inherited from class AbstractNode: clone, doSemanticAction, getChildren, getEndPosition, getPage, getParent, getStartPosition, setChildren, setEndPosition, setPage, setParent, setStartPosition
Methods inherited from class java.lang.Object: equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
protected Tag mEndTag
protected static final CompositeTagScanner mDefaultCompositeScanner
public SimpleNodeIterator children()
public Node getChild(int index)
index - The index in the node list of the child.
public Node[] getChildrenAsNodeArray()
Get the children as an array of Node objects.
public void removeChild(int i)
i - The index of the child to remove.
public SimpleNodeIterator elements()
public java.lang.String toPlainTextString()
Specified by: toPlainTextString in interface Node
Overrides: toPlainTextString in class TagNode
protected void putChildrenInto(java.lang.StringBuffer sb)
sb - The buffer to append to.
protected void putEndTagInto(java.lang.StringBuffer sb)
sb - The buffer to append to.
public java.lang.String toHtml()
Specified by: toHtml in interface Node
Overrides: toHtml in class TagNode
See Also: Node.toHtml()
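The relationship between putChildrenInto, putEndTagInto and toHtml described above can be illustrated with a small, self-contained sketch. It assumes the rendering order is start tag, then children, then end tag. RenderSketch and its text/tag factory methods are hypothetical stand-ins written for this illustration; they are not the library's classes and do not reproduce its attribute handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a node tree; not the library's CompositeTag. */
final class RenderSketch {
    private final String tagName; // null marks a plain text node
    private final String text;    // used only by text nodes
    private final List<RenderSketch> children = new ArrayList<>();

    private RenderSketch(String tagName, String text) {
        this.tagName = tagName;
        this.text = text;
    }

    static RenderSketch text(String text) {
        return new RenderSketch(null, text);
    }

    static RenderSketch tag(String tagName, RenderSketch... kids) {
        RenderSketch node = new RenderSketch(tagName, null);
        node.children.addAll(Arrays.asList(kids));
        return node;
    }

    /** Appends the HTML of every child, in order. */
    void putChildrenInto(StringBuffer sb) {
        for (RenderSketch child : children) {
            sb.append(child.toHtml());
        }
    }

    /** Appends the closing tag (only called for tag nodes). */
    void putEndTagInto(StringBuffer sb) {
        sb.append("</").append(tagName).append('>');
    }

    /** Start tag, then children, then end tag; text nodes render as-is. */
    String toHtml() {
        if (tagName == null) {
            return text;
        }
        StringBuffer sb = new StringBuffer();
        sb.append('<').append(tagName).append('>');
        putChildrenInto(sb);
        putEndTagInto(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        RenderSketch p = tag("P", text("Hello "), tag("B", text("World")));
        System.out.println(p.toHtml()); // prints <P>Hello <B>World</B></P>
    }
}
```

StringBuffer is used only to mirror the documented method signatures; a StringBuilder would serve equally well in new code.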
public Tag searchByName(java.lang.String name)
name - Attribute to match in tag.
public NodeList searchFor(java.lang.String searchString)
NodeList nodeList = formTag.searchFor("Hello World");
searchString - Search criterion.
Returns: The nodes that have searchString in them.
public NodeList searchFor(java.lang.String searchString, boolean caseSensitive)
NodeList nodeList = formTag.searchFor("Hello World");
searchString - Search criterion.
caseSensitive - If true, this search is case sensitive. Otherwise, the search string and the node text are converted to uppercase using an English locale.
Returns: The nodes that have searchString in them.
public NodeList searchFor(java.lang.String searchString, boolean caseSensitive, java.util.Locale locale)
NodeList nodeList = formTag.searchFor("Hello World");
searchString - Search criterion.
caseSensitive - If true, this search is case sensitive. Otherwise, the search string and the node text are converted to uppercase using the locale provided.
locale - The locale for uppercase conversion.
Returns: The nodes that have searchString in them.
public NodeList searchFor(java.lang.Class classType, boolean recursive)
classType - The class to search for.
recursive - If true, recursively search through the children.
public int findPositionOf(java.lang.String text)
text - The text to search for.
See Also: findPositionOf(String, Locale)
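The case-insensitive searches described above convert both the search string and the node text to uppercase using a locale. The sketch below shows why the choice of locale can change the result. It uses only java.lang.String and java.util.Locale; LocaleSearchSketch and containsIgnoreCase are hypothetical names for this illustration, and the comparison is an assumption about the general technique, not the library's actual implementation.

```java
import java.util.Locale;

final class LocaleSearchSketch {
    /** Case-insensitive containment: both strings are uppercased with the given locale. */
    static boolean containsIgnoreCase(String nodeText, String search, Locale locale) {
        return nodeText.toUpperCase(locale).contains(search.toUpperCase(locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        System.out.println(containsIgnoreCase("Hello World", "hello", Locale.ENGLISH)); // true
        System.out.println(containsIgnoreCase("TITLE", "title", Locale.ENGLISH));       // true
        // In Turkish, lowercase 'i' uppercases to dotted 'İ' (U+0130), not 'I',
        // so the uppercased search string no longer matches "TITLE".
        System.out.println(containsIgnoreCase("TITLE", "title", turkish));              // false
    }
}
```

Passing an explicit locale keeps the match independent of the default locale of the machine running the search.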
public int findPositionOf(java.lang.String text, java.util.Locale locale)
locale - The locale to use in converting to uppercase.
text - The text to search for.
public int findPositionOf(Node searchNode)
searchNode - The child node to find.
public Node childAt(int index)
index - The index into the child node list.
public void collectInto(NodeList list, NodeFilter filter)
This mechanism allows powerful filtering code to be written very easily, without having to collect embedded tags separately. For example, when we try to get all the links on a page, it is not possible to get them all at the top level, because many tags (such as form tags) can contain links embedded in them. We could get the links out by checking whether the current node is a CompositeTag and going through its children, so this method provides a convenient way to do that.
Using collectInto(), programs get a lot shorter. Now, the code to extract all links from a page would look like:
NodeList list = new NodeList();
NodeFilter filter = new TagNameFilter("A");
for (NodeIterator e = parser.elements(); e.hasMoreNodes();)
    e.nextNode().collectInto(list, filter);
Thus, list will hold all the link nodes, irrespective of how deep the links are embedded.
Another way to accomplish the same objective is:
NodeList list = new NodeList();
NodeFilter filter = new TagClassFilter(LinkTag.class);
for (NodeIterator e = parser.elements(); e.hasMoreNodes();)
    e.nextNode().collectInto(list, filter);
This is slightly less specific because the LinkTag class may be registered for more than one node name, e.g. <LINK> tags too.
Specified by: collectInto in interface Node
Overrides: collectInto in class AbstractNode
list - The list to add nodes to.
filter - The filter to apply.
See Also: org.htmlparser.filters
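To make the recursion behind collectInto concrete, here is a minimal, self-contained sketch of the idea: a node adds itself to the list if it passes the filter, then hands the same list and filter to each of its children. SketchNode and CollectIntoSketch are hypothetical stand-ins for this illustration, with java.util.function.Predicate playing the role of the filter; none of this is the library's own code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a parsed node; not the library's Node interface. */
class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, SketchNode... kids) {
        this.name = name;
        for (SketchNode kid : kids) {
            children.add(kid);
        }
    }

    /** Adds this node if it passes the filter, then recurses into every child. */
    void collectInto(List<SketchNode> list, Predicate<SketchNode> filter) {
        if (filter.test(this)) {
            list.add(this);
        }
        for (SketchNode child : children) {
            child.collectInto(list, filter);
        }
    }
}

final class CollectIntoSketch {
    /** Builds a small tree with one top-level link and one nested link, then counts the links. */
    static int countLinks() {
        SketchNode page = new SketchNode("BODY",
            new SketchNode("A"),
            new SketchNode("FORM",
                new SketchNode("INPUT"),
                new SketchNode("TABLE", new SketchNode("A"))));
        List<SketchNode> links = new ArrayList<>();
        page.collectInto(links, node -> node.name.equals("A"));
        return links.size();
    }

    public static void main(String[] args) {
        System.out.println(countLinks()); // prints 2
    }
}
```

The caller never inspects the tree's shape: the link nested inside the form's table is found by the same single call that finds the top-level link.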
public java.lang.String getChildrenHTML()
public void accept(NodeVisitor visitor)
Invokes accept() on the start tag and then walks the child list invoking accept() on each of the children, finishing up with an accept() call on the end tag. If shouldRecurseSelf() returns true it then asks the visitor to visit itself.
public int getChildCount()
public Tag getEndTag()
public void setEndTag(Tag tag)
public Text[] digupStringNode(java.lang.String searchText)
searchText - The text to search for.
public java.lang.String toString()
public java.lang.String getText()
public java.lang.String getStringText()
public void toString(int level, java.lang.StringBuffer buffer)
level - The indentation level to use.
buffer - The buffer to append to.
HTML Parser is an open source library released under LGPL.