public interface BoilerpipeExtractor extends BoilerpipeFilter
| Modifier and Type | Method and Description |
|---|---|
| `java.lang.String` | `getText(org.xml.sax.InputSource is)`: Extracts text from the HTML code available from the given `InputSource`. |
| `java.lang.String` | `getText(java.io.Reader r)`: Extracts text from the HTML code available from the given `Reader`. |
| `java.lang.String` | `getText(java.lang.String html)`: Extracts text from the HTML code given as a `String`. |
| `java.lang.String` | `getText(TextDocument doc)`: Extracts text from the given `TextDocument` object. |
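
Methods inherited from interface BoilerpipeFilter: `process`

`java.lang.String getText(java.lang.String html) throws BoilerpipeProcessingException`

Extracts text from the HTML code given as a `String`.

- Parameters: `html` - The HTML code as a `String`.
- Throws: `BoilerpipeProcessingException`

`java.lang.String getText(org.xml.sax.InputSource is) throws BoilerpipeProcessingException`

Extracts text from the HTML code available from the given `InputSource`.

- Parameters: `is` - The `InputSource` containing the HTML.
- Throws: `BoilerpipeProcessingException`

`java.lang.String getText(java.io.Reader r) throws BoilerpipeProcessingException`

Extracts text from the HTML code available from the given `Reader`.

- Parameters: `r` - The `Reader` containing the HTML.
- Throws: `BoilerpipeProcessingException`

`java.lang.String getText(TextDocument doc) throws BoilerpipeProcessingException`

Extracts text from the given `TextDocument` object.

- Parameters: `doc` - The `TextDocument`.
- Throws: `BoilerpipeProcessingException`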
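
The following is a minimal, self-contained sketch of how the `String`, `Reader`, and `InputSource` overloads of `getText` might relate to one another. It does not use the boilerpipe library: `Extractor`, `ProcessingException`, and `TagStrippingExtractor` are hypothetical stand-ins invented for illustration, the `TextDocument` overload is left out, and the regex-based tag stripping is a placeholder rather than boilerpipe's extraction algorithm. The delegation chain (`String` to `Reader` to `InputSource`) is an assumption, since the interface only specifies the signatures.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import org.xml.sax.InputSource;

public class GetTextOverloadsSketch {

    /** Hypothetical stand-in for BoilerpipeProcessingException (a checked exception). */
    static class ProcessingException extends Exception {
        ProcessingException(Throwable cause) {
            super(cause);
        }
    }

    /** Hypothetical stand-in mirroring three of the four getText overloads. */
    interface Extractor {
        String getText(String html) throws ProcessingException;
        String getText(Reader r) throws ProcessingException;
        String getText(InputSource is) throws ProcessingException;
    }

    /** Toy implementation: strips tags with a regex. NOT boilerpipe's algorithm. */
    static class TagStrippingExtractor implements Extractor {

        @Override
        public String getText(String html) throws ProcessingException {
            // Assumed delegation: wrap the String in a Reader.
            return getText(new StringReader(html));
        }

        @Override
        public String getText(Reader r) throws ProcessingException {
            // Assumed delegation: wrap the Reader in an InputSource.
            return getText(new InputSource(r));
        }

        @Override
        public String getText(InputSource is) throws ProcessingException {
            Reader r = is.getCharacterStream();
            if (r == null) {
                // This sketch only handles character-stream sources.
                throw new ProcessingException(
                        new IOException("InputSource has no character stream"));
            }
            try {
                // Read the whole stream into memory.
                StringBuilder sb = new StringBuilder();
                char[] buf = new char[4096];
                int n;
                while ((n = r.read(buf)) != -1) {
                    sb.append(buf, 0, n);
                }
                // Replace tags with spaces, then collapse runs of whitespace.
                return sb.toString()
                        .replaceAll("<[^>]*>", " ")
                        .replaceAll("\\s+", " ")
                        .trim();
            } catch (IOException e) {
                // Wrap low-level I/O failures in the processing exception.
                throw new ProcessingException(e);
            }
        }
    }

    public static void main(String[] args) {
        Extractor extractor = new TagStrippingExtractor();
        String html = "<p>Hello <b>world</b></p>";
        try {
            // All three calls yield the same text for the same markup.
            System.out.println(extractor.getText(html));
            System.out.println(extractor.getText(new StringReader(html)));
            System.out.println(extractor.getText(new InputSource(new StringReader(html))));
        } catch (ProcessingException e) {
            System.err.println("Extraction failed: " + e.getCause());
        }
    }
}
```

Because every overload declares a checked exception, callers need a `try`/`catch` (or a `throws` clause) regardless of which input type they pass, as `main` illustrates.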