java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
ArabicAnalyzer, ArmenianAnalyzer, BasqueAnalyzer, BengaliAnalyzer, BrazilianAnalyzer, BulgarianAnalyzer, CatalanAnalyzer, CJKAnalyzer, ClassicAnalyzer, CzechAnalyzer, DanishAnalyzer, EnglishAnalyzer, EstonianAnalyzer, FinnishAnalyzer, FrenchAnalyzer, GalicianAnalyzer, GermanAnalyzer, GreekAnalyzer, HindiAnalyzer, HungarianAnalyzer, IndonesianAnalyzer, IrishAnalyzer, ItalianAnalyzer, JapaneseAnalyzer, LatvianAnalyzer, LithuanianAnalyzer, NepaliAnalyzer, NorwegianAnalyzer, PersianAnalyzer, PolishAnalyzer, PortugueseAnalyzer, RomanianAnalyzer, RussianAnalyzer, SerbianAnalyzer, SoraniAnalyzer, SpanishAnalyzer, StandardAnalyzer, StopAnalyzer, SwedishAnalyzer, TamilAnalyzer, TeluguAnalyzer, ThaiAnalyzer, TurkishAnalyzer, UAX29URLEmailAnalyzer
Base class for Analyzers that need to make use of stopword sets.
- Since:
- 3.1
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
Field Summary
Fields
protected final CharArraySet stopwords
An immutable stopword set
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY, storedValue
-
Constructor Summary
Constructors
protected StopwordAnalyzerBase()
Creates a new Analyzer with an empty stopword set
protected StopwordAnalyzerBase(CharArraySet stopwords)
Creates a new instance initialized with the given stopword set
-
Method Summary
Methods
CharArraySet getStopwordSet()
Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
protected static CharArraySet loadStopwordSet(Reader stopwords)
Creates a CharArraySet from a reader.
protected static CharArraySet loadStopwordSet(Path stopwords)
Creates a CharArraySet from a path.
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, createComponents, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReader, initReaderForNormalization, normalize, normalize, tokenStream, tokenStream
-
Field Details
-
stopwords
protected final CharArraySet stopwords
An immutable stopword set
-
-
Constructor Details
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase(CharArraySet stopwords)
Creates a new instance initialized with the given stopword set
- Parameters:
stopwords
- the analyzer's stopword set
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase()
Creates a new Analyzer with an empty stopword set
-
-
Method Details
-
getStopwordSet
public CharArraySet getStopwordSet()
Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
- Returns:
- the analyzer's stopword set or an empty set if the analyzer has no stopwords
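The two constructors and getStopwordSet() suggest a simple pattern: the base class holds an immutable set that defaults to empty. The following is a minimal sketch of that pattern using only java.util types, not Lucene code. The class name StopwordHolderSketch is hypothetical, Set<String> stands in for CharArraySet, and treating a null argument as "no stopwords" is an assumption that this page does not state.

```java
import java.util.Set;

// Hypothetical stand-in for the pattern described above; not Lucene code.
// Set<String> replaces CharArraySet to keep the sketch dependency-free.
abstract class StopwordHolderSketch {
  // Immutable stopword set, mirroring the documented "stopwords" field.
  protected final Set<String> stopwords;

  // Mirrors the no-argument constructor: start with an empty stopword set.
  protected StopwordHolderSketch() {
    this(null);
  }

  // Mirrors the one-argument constructor.
  // Assumption: a null argument means "no stopwords".
  protected StopwordHolderSketch(Set<String> stopwords) {
    // Set.copyOf returns an unmodifiable copy, so later changes to the
    // caller's set do not leak into this instance.
    this.stopwords = (stopwords == null) ? Set.of() : Set.copyOf(stopwords);
  }

  // Never returns null: callers get the configured set or an empty one.
  public Set<String> getStopwordSet() {
    return stopwords;
  }
}
```

Because the accessor never returns null in this sketch, a caller can pass the result straight to a filter without a null check.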
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Path stopwords) throws IOException
Creates a CharArraySet from a path.
- Parameters:
stopwords
- the stopwords file to load
- Returns:
- a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException
- if loading the stopwords throws an IOException
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Reader stopwords) throws IOException
Creates a CharArraySet from a reader.
- Parameters:
stopwords
- the stopwords reader to load
- Returns:
- a CharArraySet containing the distinct stopwords from the given reader
- Throws:
IOException
- if loading the stopwords throws an IOException
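To make "the distinct stopwords from the given reader" concrete, here is a rough, dependency-free sketch of what such a loader could look like. It is not Lucene's implementation. StopwordLoadingSketch is a hypothetical name, Set<String> stands in for CharArraySet, and the one-word-per-line format, the whitespace trimming, and the UTF-8 encoding are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, dependency-free illustration; not Lucene's implementation.
final class StopwordLoadingSketch {
  private StopwordLoadingSketch() {}

  // Reads one stopword per line (an assumed format), trims whitespace,
  // skips blank lines, and collapses duplicates.
  // In this sketch the caller remains responsible for closing the reader.
  static Set<String> load(Reader stopwords) throws IOException {
    Set<String> words = new LinkedHashSet<>();
    BufferedReader in = new BufferedReader(stopwords);
    String line;
    while ((line = in.readLine()) != null) {
      String word = line.trim();
      if (!word.isEmpty()) {
        words.add(word);
      }
    }
    // Wrap the result so callers cannot modify it.
    return Collections.unmodifiableSet(words);
  }

  // Opens the file as UTF-8 (an assumption) and delegates to the Reader
  // variant; try-with-resources closes the file even if reading fails.
  static Set<String> load(Path stopwords) throws IOException {
    try (Reader reader = Files.newBufferedReader(stopwords, StandardCharsets.UTF_8)) {
      return load(reader);
    }
  }
}
```

Both variants propagate any IOException from the underlying read, which matches the Throws clauses documented above.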
-