| Class | Description |
|---|---|
| ArticleExtractor | A full-text extractor which is tuned towards news articles. |
| ArticleSentencesExtractor | A full-text extractor which is tuned towards extracting sentences from news articles. |
| CanolaExtractor | |
| CommonExtractors | Provides quick access to common BoilerpipeExtractors. |
| DefaultExtractor | A quite generic full-text extractor. |
| ExtractorBase | The base class of Extractors. |
| KeepEverythingExtractor | Marks everything as content. |
| KeepEverythingWithMinKWordsExtractor | Marks everything as content, then keeps only the blocks that contain at least k words. |
| LargestContentExtractor | A full-text extractor which extracts the largest text component of a page. |
| NumWordsRulesExtractor | A quite generic full-text extractor solely based upon the number of words per block (the current, the previous and the next block). |
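
The NumWordsRulesExtractor row describes classifying a block from the word counts of the current, previous and next block. The following is a minimal sketch of that idea in plain Java. It does not use boilerpipe: `WordCountRuleSketch`, `Rule`, and the threshold in `main` are hypothetical names and values chosen for illustration, not the library's actual rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordCountRuleSketch {

    /** Hypothetical rule: decides from three word counts whether the current block is content. */
    public interface Rule {
        boolean isContent(int prevWords, int curWords, int nextWords);
    }

    /** Counts whitespace-separated tokens in a block of text. */
    public static int countWords(String block) {
        String trimmed = block.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Applies the rule to each block; a missing neighbour counts as 0 words. */
    public static List<String> extract(List<String> blocks, Rule rule) {
        List<String> content = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) {
            int prev = i > 0 ? countWords(blocks.get(i - 1)) : 0;
            int cur = countWords(blocks.get(i));
            int next = i + 1 < blocks.size() ? countWords(blocks.get(i + 1)) : 0;
            if (rule.isContent(prev, cur, next)) {
                content.add(blocks.get(i));
            }
        }
        return content;
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList(
            "Home News Sport",
            "The council voted on Tuesday to approve the new budget after a long debate.",
            "Members said the plan would fund road repairs across the city next year.",
            "Share");
        // Made-up rule for illustration only: keep long blocks,
        // or medium-length blocks that sit next to a long one.
        Rule rule = (prev, cur, next) -> cur >= 10 || (cur >= 5 && (prev >= 10 || next >= 10));
        for (String kept : extract(blocks, rule)) {
            System.out.println(kept);
        }
    }
}
```

Passing the rule in as a parameter keeps the sliding-window mechanics separate from the thresholds, which in a real extractor would be tuned on data rather than picked by hand.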
This package contains some standard extractors (i.e., complete pipelines of BoilerpipeFilters).
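
To illustrate what an extractor built as a pipeline of filters can look like, here is a small self-contained sketch. It is an assumption-laden model, not boilerpipe's code: `Block`, `Filter`, `MARK_ALL`, `minWords`, and `extract` are hypothetical stand-ins written for this example. The pipeline shown mirrors the idea behind the KeepEverythingWithMinKWordsExtractor row: mark everything as content, then drop short blocks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PipedExtractorSketch {

    /** Hypothetical stand-in for a text block carrying a content flag. */
    public static final class Block {
        final String text;
        boolean content;
        Block(String text) { this.text = text; }
    }

    /** Hypothetical stand-in for a filter: updates flags, returns true if anything changed. */
    public interface Filter {
        boolean process(List<Block> blocks);
    }

    /** Counts whitespace-separated tokens; blank text has 0 words. */
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Marks every block as content. */
    public static final Filter MARK_ALL = blocks -> {
        boolean changed = false;
        for (Block b : blocks) {
            if (!b.content) { b.content = true; changed = true; }
        }
        return changed;
    };

    /** Un-marks content blocks that have fewer than k words. */
    public static Filter minWords(int k) {
        return blocks -> {
            boolean changed = false;
            for (Block b : blocks) {
                if (b.content && countWords(b.text) < k) {
                    b.content = false;
                    changed = true;
                }
            }
            return changed;
        };
    }

    /** Runs the filters in order, then joins the text of all content blocks. */
    public static String extract(List<String> texts, Filter... pipeline) {
        List<Block> blocks = new ArrayList<>();
        for (String t : texts) {
            blocks.add(new Block(t));
        }
        for (Filter f : pipeline) {
            f.process(blocks);
        }
        StringBuilder sb = new StringBuilder();
        for (Block b : blocks) {
            if (b.content) {
                sb.append(b.text).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList("Menu", "A short paragraph of real text.", "Login");
        System.out.print(extract(texts, MARK_ALL, minWords(3)));
    }
}
```

In this model the order of the filters matters: `minWords` only un-marks blocks that an earlier filter has already marked as content.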