Class | Description |
---|---|
ExtractReuters | Split the Reuters SGML documents into Simple Text files containing: Title, Date, Dateline, Body |
ExtractWikipedia | Extract the downloaded Wikipedia dump into separate files for indexing. |
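
To make the ExtractReuters description concrete, here is a minimal sketch of how the four listed fields might be pulled out of one Reuters SGML article. It is not the Lucene source: the class name `ReutersSplitSketch` and the method `extractFields` are hypothetical, and the tag names `TITLE`, `DATE`, `DATELINE`, and `BODY` are an assumption based on the field names in the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; hypothetical class, not the Lucene ExtractReuters code.
class ReutersSplitSketch {
    // The four fields named in the ExtractReuters description (tag names assumed).
    private static final String[] FIELDS = {"TITLE", "DATE", "DATELINE", "BODY"};

    // Returns the trimmed text of each field found in one article, in FIELDS order.
    // Fields that do not occur in the input are simply absent from the map.
    static Map<String, String> extractFields(String article) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String field : FIELDS) {
            // DOTALL lets a field span several lines; the reluctant
            // quantifier stops at the first matching closing tag.
            Pattern p = Pattern.compile(
                "<" + field + ">(.*?)</" + field + ">", Pattern.DOTALL);
            Matcher m = p.matcher(article);
            if (m.find()) {
                out.put(field, m.group(1).trim());
            }
        }
        return out;
    }
}
```

Calling `ReutersSplitSketch.extractFields(text)` on the text of a single article returns a map keyed by field name. Writing each map to its own text file would give output of the kind the table describes.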
Copyright © 2000–2015 The Apache Software Foundation. All rights reserved.