| Interface | Description |
|---|---|
| InputSourceable | An InputSourceable can return an arbitrary number of new InputSources for a given document. |
| TagAction | Defines an action that is to be performed whenever a particular tag occurs during HTML parsing. |
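
The idea behind a tag action, running some code whenever a particular tag is seen during parsing, can be sketched with the JDK's own SAX parser. This is only an illustration under stated assumptions: `SimpleTagAction`, `LoggingAction`, and `run` are hypothetical names invented for this sketch and are not Boilerpipe's `TagAction` API, and the input is assumed to be well-formed XHTML rather than arbitrary HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TagActionSketch {

    /** Hypothetical stand-in for a tag action; NOT Boilerpipe's TagAction interface. */
    public interface SimpleTagAction {
        void onStart(String tagName, List<String> log);
        void onEnd(String tagName, List<String> log);
    }

    /** Example action that records every start and end of the tag it is registered for. */
    public static class LoggingAction implements SimpleTagAction {
        @Override
        public void onStart(String tagName, List<String> log) {
            log.add("start:" + tagName);
        }

        @Override
        public void onEnd(String tagName, List<String> log) {
            log.add("end:" + tagName);
        }
    }

    /** Parses well-formed markup and runs the action registered for each tag name, if any. */
    public static List<String> run(String xhtml, Map<String, SimpleTagAction> actions) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // The default parser is not namespace-aware, so qName carries the tag name.
                SimpleTagAction action = actions.get(qName.toLowerCase(Locale.ROOT));
                if (action != null) {
                    action.onStart(qName, log);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                SimpleTagAction action = actions.get(qName.toLowerCase(Locale.ROOT));
                if (action != null) {
                    action.onEnd(qName, log);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        Map<String, SimpleTagAction> actions = new HashMap<>();
        actions.put("h1", new LoggingAction());
        actions.put("script", new LoggingAction());

        String page = "<html><body><h1>Title</h1><p>Text</p><script>x</script></body></html>";
        // Only tags with a registered action show up in the log.
        System.out.println(run(page, actions));
        // prints [start:h1, end:h1, start:script, end:script]
    }
}
```

Keying the actions by tag name keeps the parser loop generic: changing what happens for a tag means changing a map entry, not the handler.
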
| Class | Description |
|---|---|
| BoilerpipeHTMLContentHandler | A simple SAX ContentHandler, used by BoilerpipeSAXInput. |
| BoilerpipeHTMLParser | A simple SAX Parser, used by BoilerpipeSAXInput. |
| BoilerpipeSAXInput | Parses an InputSource using SAX and returns a TextDocument. |
| CommonTagActions | Defines an action that is to be performed whenever a particular tag occurs during HTML parsing. |
| CommonTagActions.BlockTagLabelAction | CommonTagActions for block-level elements, which triggers some LabelAction on the generated TextBlock. |
| CommonTagActions.Chained | |
| CommonTagActions.InlineTagLabelAction | |
| DefaultTagActionMap | Default TagActions. |
| HTMLDocument | An InputSourceable for HTMLFetcher. |
| HTMLFetcher | A very simple HTTP/HTML fetcher, really just for demo purposes. |
| HTMLHighlighter | Highlights text blocks in an HTML document that have been marked as "content" in the corresponding TextDocument. |
| MarkupTagAction | Assigns labels for element CSS classes and ids to the corresponding TextBlock. |
| TagActionMap | Base class for defining a set of TagActions that are to be used for the HTML parsing process. |
Classes for parsing HTML into Boilerpipe TextDocuments and for producing HTML from them.
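
To make the "parse an InputSource using SAX and return a document of text blocks" idea concrete, here is a minimal sketch that uses only the JDK's SAX parser. It does not use or reproduce Boilerpipe's classes: `SaxTextBlocksSketch`, `parse`, and the `BLOCK_TAGS` set are hypothetical, the "document" is simplified to a plain list of strings, and the input is assumed to be well-formed XHTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTextBlocksSketch {

    // Assumption for this sketch: these tags start and end a text block.
    private static final Set<String> BLOCK_TAGS =
            new HashSet<>(Arrays.asList("p", "div", "h1", "h2", "h3", "li", "td"));

    /** Parses the source with SAX and returns the text of each block-level element. */
    public static List<String> parse(InputSource source) {
        List<String> blocks = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            // Closes the block collected so far, normalizing whitespace.
            private void flush() {
                String text = current.toString().trim().replaceAll("\\s+", " ");
                if (!text.isEmpty()) {
                    blocks.add(text);
                }
                current.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (BLOCK_TAGS.contains(qName.toLowerCase(Locale.ROOT))) {
                    flush();
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (BLOCK_TAGS.contains(qName.toLowerCase(Locale.ROOT))) {
                    flush();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Inline tags such as <b> do not flush, so their text joins the current block.
                current.append(ch, start, length);
            }

            @Override
            public void endDocument() {
                flush();
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return blocks;
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Title</h1>"
                + "<p>First <b>bold</b> paragraph.</p><p>Second.</p></body></html>";
        for (String block : parse(new InputSource(new StringReader(page)))) {
            System.out.println(block);
        }
    }
}
```

Flushing on block-level boundaries only is what keeps inline markup inside a single block; a real HTML pipeline would also need a parser that tolerates malformed markup.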