Package | Description |
---|---|
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. |
org.apache.lucene.analysis.core | Basic, general-purpose analysis components. |
org.apache.lucene.analysis.custom | A general-purpose Analyzer that can be created with a builder-style API. |
org.apache.lucene.analysis.icu.segmentation | Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm. |
org.apache.lucene.analysis.ja | Analyzer for Japanese. |
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. |
org.apache.lucene.analysis.path | Analysis components for path-like strings such as filenames. |
org.apache.lucene.analysis.pattern | Set of components for pattern-based (regex) analysis. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizers. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
org.apache.lucene.analysis.uima | Classes that integrate UIMA with Lucene's analysis API. |
org.apache.lucene.analysis.util | Utility functions for text analysis. |
org.apache.lucene.analysis.wikipedia | Tokenizer that is aware of Wikipedia syntax. |
org.apache.lucene.benchmark.byTask.utils | Utilities used for the benchmark, and for the reports. |
Modifier and Type | Class and Description |
---|---|
class | HMMChineseTokenizerFactory: Factory for HMMChineseTokenizer. |
class | SmartChineseSentenceTokenizerFactory: Deprecated. Use HMMChineseTokenizerFactory instead. |
Modifier and Type | Class and Description |
---|---|
class | KeywordTokenizerFactory: Factory for KeywordTokenizer. |
class | LetterTokenizerFactory: Factory for LetterTokenizer. |
class | LowerCaseTokenizerFactory: Factory for LowerCaseTokenizer. |
class | WhitespaceTokenizerFactory: Factory for WhitespaceTokenizer. |
Modifier and Type | Method and Description |
---|---|
TokenizerFactory | CustomAnalyzer.getTokenizerFactory(): Returns the tokenizer factory that is used in this analyzer. |
Modifier and Type | Class and Description |
---|---|
class | ICUTokenizerFactory: Factory for ICUTokenizer. |
Modifier and Type | Class and Description |
---|---|
class | JapaneseTokenizerFactory: Factory for JapaneseTokenizer. |
Modifier and Type | Class and Description |
---|---|
class | EdgeNGramTokenizerFactory: Creates new instances of EdgeNGramTokenizer. |
class | NGramTokenizerFactory: Factory for NGramTokenizer. |
Modifier and Type | Class and Description |
---|---|
class | PathHierarchyTokenizerFactory: Factory for PathHierarchyTokenizer. |
Modifier and Type | Class and Description |
---|---|
class | PatternTokenizerFactory: Factory for PatternTokenizer. |
Modifier and Type | Class and Description |
---|---|
class | ClassicTokenizerFactory: Factory for ClassicTokenizer. |
class | StandardTokenizerFactory: Factory for StandardTokenizer. |
class | UAX29URLEmailTokenizerFactory: Factory for UAX29URLEmailTokenizer. |
Modifier and Type | Class and Description |
---|---|
class | ThaiTokenizerFactory: Factory for ThaiTokenizer. |
Modifier and Type | Class and Description |
---|---|
class | UIMAAnnotationsTokenizerFactory |
class | UIMATypeAwareAnnotationsTokenizerFactory |
Modifier and Type | Method and Description |
---|---|
static TokenizerFactory | TokenizerFactory.forName(String name, Map<String,String> args): Looks up a tokenizer factory by name from the context classpath. |
Modifier and Type | Method and Description |
---|---|
static Class<? extends TokenizerFactory> | TokenizerFactory.lookupClass(String name): Looks up a tokenizer factory class by name from the context classpath. |
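
The two lookup methods listed above follow a common pattern: a short name is mapped to a factory class, and the class is then instantiated with a map of string arguments. The following is a minimal, self-contained sketch of that pattern only. It does not use Lucene and is not Lucene's implementation: `FactoryLookupSketch`, `SketchTokenizerFactory`, `WhitespaceSketchFactory`, `KeywordSketchFactory`, and the hard-coded registry are all hypothetical names introduced for illustration, and the case-insensitive name matching is an assumption of this sketch. A real lookup would discover factories on the classpath rather than from a static map.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class FactoryLookupSketch {

    /** Hypothetical stand-in for a tokenizer factory; not a Lucene class. */
    public abstract static class SketchTokenizerFactory {
        protected final Map<String, String> args;

        protected SketchTokenizerFactory(Map<String, String> args) {
            this.args = new HashMap<>(args); // defensive copy of the configuration
        }

        /** Simplification: returns tokens directly instead of creating a tokenizer object. */
        public abstract String[] tokenize(String text);
    }

    /** Splits on runs of whitespace. */
    public static final class WhitespaceSketchFactory extends SketchTokenizerFactory {
        public WhitespaceSketchFactory(Map<String, String> args) { super(args); }

        @Override
        public String[] tokenize(String text) {
            String trimmed = text.trim();
            return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }
    }

    /** Emits the whole input as a single token. */
    public static final class KeywordSketchFactory extends SketchTokenizerFactory {
        public KeywordSketchFactory(Map<String, String> args) { super(args); }

        @Override
        public String[] tokenize(String text) { return new String[] { text }; }
    }

    // Hard-coded registry (assumption of this sketch; stands in for classpath discovery).
    private static final Map<String, Class<? extends SketchTokenizerFactory>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("whitespace", WhitespaceSketchFactory.class);
        REGISTRY.put("keyword", KeywordSketchFactory.class);
    }

    /** Resolves a short name to a factory class, failing loudly for unknown names. */
    public static Class<? extends SketchTokenizerFactory> lookupClass(String name) {
        Class<? extends SketchTokenizerFactory> clazz = REGISTRY.get(name.toLowerCase(Locale.ROOT));
        if (clazz == null) {
            throw new IllegalArgumentException("No factory registered under name '" + name + "'");
        }
        return clazz;
    }

    /** Resolves the class by name, then instantiates it through its Map constructor. */
    public static SketchTokenizerFactory forName(String name, Map<String, String> args) {
        try {
            return lookupClass(name).getConstructor(Map.class).newInstance(args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate factory '" + name + "'", e);
        }
    }

    public static void main(String[] args) {
        SketchTokenizerFactory factory = forName("whitespace", new HashMap<>());
        System.out.println(String.join("|", factory.tokenize("path like strings")));
    }
}
```

Splitting the lookup into a class-resolving step and an instantiating step mirrors the division between `lookupClass` and `forName` in the tables above: callers that only need to check whether a name resolves can stop at the class, while callers that need a configured factory supply the argument map.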
Modifier and Type | Class and Description |
---|---|
class | WikipediaTokenizerFactory: Factory for WikipediaTokenizer. |
Constructor and Description |
---|
AnalyzerFactory(List<CharFilterFactory> charFilterFactories, TokenizerFactory tokenizerFactory, List<TokenFilterFactory> tokenFilterFactories) |
Copyright © 2000–2015 The Apache Software Foundation. All rights reserved.