Package org.apache.lucene.analysis.ja
Class JapaneseAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
org.apache.lucene.analysis.ja.JapaneseAnalyzer
- All Implemented Interfaces:
Closeable, AutoCloseable
Analyzer for Japanese that uses morphological analysis.
- Since:
- 3.6.0
-
Nested Class Summary
Nested Classes
private static class
Atomically loads DEFAULT_STOP_SET and DEFAULT_STOP_TAGS lazily, the first time the outer class accesses the static final sets.
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
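The lazy, atomic loading described for the private holder class reads like the initialization-on-demand holder idiom. The following is a minimal sketch of that idiom in plain Java, not the Lucene source: the names `LazyDefaults`, `DefaultsHolder`, and `LOAD_COUNT`, as well as the placeholder set contents, are assumptions made purely for illustration.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an analyzer class; not the Lucene implementation.
class LazyDefaults {
  // Counts how often the holder's static initializer runs (illustration only).
  static final AtomicInteger LOAD_COUNT = new AtomicInteger();

  // The JVM initializes this nested class at most once, under its class
  // initialization lock, and only when one of its static fields is first used.
  private static class DefaultsHolder {
    static final Set<String> DEFAULT_STOP_SET;
    static final Set<String> DEFAULT_STOP_TAGS;

    static {
      LOAD_COUNT.incrementAndGet();
      // Placeholder values; a real analyzer would load these from resources.
      DEFAULT_STOP_SET = Set.of("example-stopword-1", "example-stopword-2");
      DEFAULT_STOP_TAGS = Set.of("example-tag");
    }
  }

  // First call to either getter triggers the one-time load of both sets.
  static Set<String> getDefaultStopSet() {
    return DefaultsHolder.DEFAULT_STOP_SET;
  }

  static Set<String> getDefaultStopTags() {
    return DefaultsHolder.DEFAULT_STOP_TAGS;
  }

  public static void main(String[] args) {
    System.out.println("loads before first access: " + LOAD_COUNT.get());
    getDefaultStopSet();
    getDefaultStopTags();
    System.out.println("loads after two accesses: " + LOAD_COUNT.get());
  }
}
```

Because both sets live in the same holder, they are loaded together and no explicit locking is needed in the getters.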
-
Field Summary
Fields
private final JapaneseTokenizer.Mode mode
private final UserDictionary userDict
Fields inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase
stopwords
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
Constructor Summary
Constructors
JapaneseAnalyzer()
JapaneseAnalyzer(UserDictionary userDict, JapaneseTokenizer.Mode mode, CharArraySet stopwords, Set<String> stoptags)
-
Method Summary
Modifier and Type, Method, Description
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
static CharArraySet getDefaultStopSet()
protected Reader initReader(String fieldName, Reader reader)
Override this if you want to add a CharFilter chain.
protected Reader initReaderForNormalization(String fieldName, Reader reader)
Wrap the given Reader with CharFilters that make sense for normalization.
protected TokenStream normalize(String fieldName, TokenStream in)
Wrap the given TokenStream in order to apply normalization filters.
Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase
getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, normalize, tokenStream, tokenStream
-
Field Details
-
mode
-
stoptags
-
userDict
-
-
Constructor Details
-
JapaneseAnalyzer
public JapaneseAnalyzer()
-
JapaneseAnalyzer
public JapaneseAnalyzer(UserDictionary userDict, JapaneseTokenizer.Mode mode, CharArraySet stopwords, Set<String> stoptags)
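As a hedged illustration of the two constructors, the sketch below builds one analyzer with the defaults and one with every argument spelled out, then collects the terms each produces. It assumes the Lucene kuromoji module is on the classpath and that passing `null` for `userDict` means "no user dictionary". The class name `JapaneseAnalyzerUsage`, the helper `tokenize`, the field name `"body"`, and the sample text are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.ja.JapaneseAnalyzer;
import org.apache.lucene.analysis.ja.JapaneseTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical usage sketch; names other than the Lucene classes are illustrative.
class JapaneseAnalyzerUsage {

  // Runs the analyzer over the text and returns the emitted terms in order.
  static List<String> tokenize(Analyzer analyzer, String field, String text)
      throws IOException {
    List<String> terms = new ArrayList<>();
    try (TokenStream ts = analyzer.tokenStream(field, text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset(); // required before the first incrementToken()
      while (ts.incrementToken()) {
        terms.add(term.toString());
      }
      ts.end(); // finalizes end-of-stream state before close()
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    String text = "寿司が食べたい";
    try (Analyzer defaults = new JapaneseAnalyzer();
        // Assumption: null user dictionary is accepted and means "none".
        Analyzer explicit = new JapaneseAnalyzer(
            null,
            JapaneseTokenizer.Mode.SEARCH,
            JapaneseAnalyzer.getDefaultStopSet(),
            JapaneseAnalyzer.getDefaultStopTags())) {
      System.out.println("defaults: " + tokenize(defaults, "body", text));
      System.out.println("explicit: " + tokenize(explicit, "body", text));
    }
  }
}
```

The four-argument form is the one to reach for when a custom stopword set, a different set of stop tags, or a different tokenizer mode is needed.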
-
-
Method Details
-
getDefaultStopSet
-
getDefaultStopTags
-
createComponents
Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
- Specified by:
createComponents in class Analyzer
- Parameters:
fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
- Returns:
the Analyzer.TokenStreamComponents for this analyzer.
-
normalize
Description copied from class: Analyzer
Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
-
initReader
Description copied from class: Analyzer
Override this if you want to add a CharFilter chain.
The default implementation returns reader unchanged.
- Overrides:
initReader in class Analyzer
- Parameters:
fieldName - IndexableField name being indexed
reader - original Reader
- Returns:
reader, optionally decorated with CharFilter(s)
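One way to add a CharFilter chain is to override initReader in a subclass. The sketch below is an assumption-laden example rather than a recommended configuration: it assumes JapaneseAnalyzer can be subclassed, that `HTMLStripCharFilter` from the Lucene analysis-common module is on the classpath, and it uses the hypothetical class name `HtmlStrippingJapaneseAnalyzer`.

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.charfilter.HTMLStripCharFilter;
import org.apache.lucene.analysis.ja.JapaneseAnalyzer;

// Hypothetical factory; only the Lucene types are real API.
class HtmlStrippingJapaneseAnalyzer {

  // Returns a JapaneseAnalyzer whose input is stripped of HTML markup
  // before it reaches the analyzer's own reader handling.
  static Analyzer create() {
    return new JapaneseAnalyzer() {
      @Override
      protected Reader initReader(String fieldName, Reader reader) {
        // Wrap the raw reader first, then delegate so that whatever
        // JapaneseAnalyzer itself applies in initReader is preserved.
        return super.initReader(fieldName, new HTMLStripCharFilter(reader));
      }
    };
  }
}
```

Delegating to `super.initReader` rather than returning the wrapped reader directly keeps the behavior this class already overrides from Analyzer.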
-
initReaderForNormalization
Description copied from class: Analyzer
Wrap the given Reader with CharFilters that make sense for normalization. This is typically a subset of the CharFilters that are applied in Analyzer.initReader(String, Reader). This is used by Analyzer.normalize(String, String).
- Overrides:
initReaderForNormalization in class Analyzer
-