Package org.apache.lucene.monitor
Class SuffixingNGramTokenFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.monitor.SuffixingNGramTokenFilter
All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.State
Field Summary
Fields
Modifier and Type, Field
private final String anyToken
private int curCodePointCount
private int curGramSize
private int curPos
private int curPosInc
private int curPosLen
private char[] curTermBuffer
private int curTermLength
private final KeywordAttribute keywordAtt
private final int maxTokenLength
private final OffsetAttribute offsetAtt
private final PositionIncrementAttribute posIncAtt
private final PositionLengthAttribute posLenAtt
private final CharArraySet seenInfixes
private final CharArraySet seenSuffixes
private final String suffix
private final CharTermAttribute termAtt
private int tokEnd
private int tokStart
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor Summary
Constructors
Constructor: SuffixingNGramTokenFilter(TokenStream input, String suffix, String wildcardToken, int maxTokenLength)
Description: Creates a SuffixingNGramTokenFilter.
Method Summary
Modifier and Type, Method, Description
final boolean incrementToken()
Advances the stream to the next token; returns false at end of stream.
void reset()
This method is called by a consumer before it begins consumption using TokenStream.incrementToken().
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, end, unwrap
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
Field Details
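suffix
private final String suffix
maxTokenLength
private final int maxTokenLength
anyToken
private final String anyToken
curTermBuffer
private char[] curTermBuffer
curTermLength
private int curTermLength
curCodePointCount
private int curCodePointCount
curGramSize
private int curGramSize
curPos
private int curPos
curPosInc
private int curPosInc
curPosLen
private int curPosLen
tokStart
private int tokStart
tokEnd
private int tokEnd
termAtt
private final CharTermAttribute termAtt
posIncAtt
private final PositionIncrementAttribute posIncAtt
posLenAtt
private final PositionLengthAttribute posLenAtt
offsetAtt
private final OffsetAttribute offsetAtt
keywordAtt
private final KeywordAttribute keywordAtt
seenSuffixes
private final CharArraySet seenSuffixes
seenInfixes
private final CharArraySet seenInfixes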
Constructor Details
SuffixingNGramTokenFilter
public SuffixingNGramTokenFilter(TokenStream input, String suffix, String wildcardToken, int maxTokenLength)
Creates a SuffixingNGramTokenFilter.
Parameters:
input - TokenStream holding the input to be tokenized
suffix - a string appended to every n-gram
wildcardToken - a token to emit if the input token is longer than maxTokenLength
maxTokenLength - tokens longer than this will not be split into n-grams
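The interplay of suffix, wildcardToken, and maxTokenLength can be illustrated with a small stand-alone sketch. It does not use Lucene and is not the filter's actual algorithm: the class name SuffixedNGramSketch and the method ngrams are hypothetical, the choice to emit every contiguous substring is an assumption, lengths are counted in chars rather than code points, and no deduplication is performed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene and not the real filter logic.
class SuffixedNGramSketch {

  // Returns the wildcard token for over-long input; otherwise returns every
  // contiguous substring of the token with the suffix appended.
  static List<String> ngrams(String token, String suffix, String wildcardToken, int maxTokenLength) {
    List<String> out = new ArrayList<>();
    if (token.length() > maxTokenLength) {
      // Assumption: an over-long token is replaced by the single wildcard token.
      out.add(wildcardToken);
      return out;
    }
    for (int start = 0; start < token.length(); start++) {
      for (int end = start + 1; end <= token.length(); end++) {
        // Assumption: all substrings are emitted, in start-then-length order.
        out.add(token.substring(start, end) + suffix);
      }
    }
    return out;
  }
}
```

With the arbitrary example values suffix "XX" and wildcard "__ANY__", ngrams("ab", "XX", "__ANY__", 10) yields [aXX, abXX, bXX], while ngrams("abcdef", "XX", "__ANY__", 3) yields only [__ANY__] because the token exceeds the length limit.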
Method Details
incrementToken
public final boolean incrementToken() throws IOException
Advances the stream to the next token.
Specified by:
incrementToken in class TokenStream
Returns:
false for end of stream; true otherwise
Throws:
IOException
reset
public void reset() throws IOException
Description copied from class: TokenFilter
This method is called by a consumer before it begins consumption using TokenStream.incrementToken().
Resets this stream to a clean state. Stateful implementations must implement this method so that they can be reused, just as if they had been created fresh.
If you override this method, always call super.reset(), otherwise some internal state will not be correctly reset (e.g., Tokenizer will throw IllegalStateException on further usage).
NOTE: The default implementation chains the call to the input TokenStream, so be sure to call super.reset() when overriding this method.
Overrides:
reset in class TokenFilter
Throws:
IOException
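The reset contract described above can be sketched without Lucene. The classes SketchListStream and SketchDedupStream below are hypothetical stand-ins, not Lucene types, and they use inheritance where the real TokenFilter wraps an input stream. The point they illustrate is the same: an overriding reset() should first call super.reset() and then clear its own per-pass state, so the stream behaves as if freshly created.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical base stream over a fixed token list; not Lucene's TokenStream.
class SketchListStream {
  private final List<String> tokens;
  private int pos = 0;
  String term; // current token, set by incrementToken()

  SketchListStream(List<String> tokens) {
    this.tokens = tokens;
  }

  // Rewinds the stream so it can be consumed again.
  void reset() {
    pos = 0;
  }

  // Advances to the next token; returns false once the list is exhausted.
  boolean incrementToken() {
    if (pos >= tokens.size()) {
      return false;
    }
    term = tokens.get(pos++);
    return true;
  }
}

// Hypothetical stateful stream that drops tokens it has already emitted.
class SketchDedupStream extends SketchListStream {
  private final Set<String> seen = new HashSet<>();

  SketchDedupStream(List<String> tokens) {
    super(tokens);
  }

  @Override
  void reset() {
    super.reset(); // rewind the inherited state first
    seen.clear();  // then clear this class's own per-pass state
  }

  @Override
  boolean incrementToken() {
    while (super.incrementToken()) {
      if (seen.add(term)) {
        return true; // first occurrence of this token in the current pass
      }
    }
    return false;
  }

  // Consumer workflow: reset, then call incrementToken until it returns false.
  List<String> drain() {
    List<String> out = new ArrayList<>();
    reset();
    while (incrementToken()) {
      out.add(term);
    }
    return out;
  }
}
```

In this sketch, omitting either super.reset() or seen.clear() would make a second call to drain() return an empty list, which is the kind of stale-state failure the contract is meant to prevent.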