Class SuffixingNGramTokenFilter

All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>

final class SuffixingNGramTokenFilter extends TokenFilter
  • Field Details

    • suffix

      private final String suffix
    • maxTokenLength

      private final int maxTokenLength
    • anyToken

      private final String anyToken
    • curTermBuffer

      private char[] curTermBuffer
    • curTermLength

      private int curTermLength
    • curCodePointCount

      private int curCodePointCount
    • curGramSize

      private int curGramSize
    • curPos

      private int curPos
    • curPosInc

      private int curPosInc
    • curPosLen

      private int curPosLen
    • tokStart

      private int tokStart
    • tokEnd

      private int tokEnd
    • termAtt

      private final CharTermAttribute termAtt
    • posIncAtt

      private final PositionIncrementAttribute posIncAtt
    • posLenAtt

      private final PositionLengthAttribute posLenAtt
    • offsetAtt

      private final OffsetAttribute offsetAtt
    • keywordAtt

      private final KeywordAttribute keywordAtt
    • seenSuffixes

      private final CharArraySet seenSuffixes
    • seenInfixes

      private final CharArraySet seenInfixes
  • Constructor Details

    • SuffixingNGramTokenFilter

      public SuffixingNGramTokenFilter(TokenStream input, String suffix, String wildcardToken, int maxTokenLength)
      Creates a SuffixingNGramTokenFilter.
      Parameters:
      input - TokenStream holding the input to be tokenized
      suffix - a string appended to every ngram
      wildcardToken - a token to emit if the input token is longer than maxTokenLength
      maxTokenLength - tokens longer than this will not be ngrammed
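
      The parameter descriptions above can be illustrated with a small plain-Java sketch. It is not the filter's implementation: the helper name grams is hypothetical, and the gram sizes and ordering are an assumption (every contiguous code-point substring), since this page does not state them. Emitting the wildcard token without the suffix is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch, NOT the real filter: it models only the documented parameter semantics.
final class SuffixingNGramSketch {

    // Hypothetical helper. Assumption: every contiguous code-point substring is a gram.
    static List<String> grams(String token, String suffix, String wildcardToken, int maxTokenLength) {
        int codePoints = token.codePointCount(0, token.length());
        List<String> out = new ArrayList<>();
        if (codePoints > maxTokenLength) {
            // Documented: tokens longer than maxTokenLength are not ngrammed;
            // the wildcard token is emitted instead (assumed here to carry no suffix).
            out.add(wildcardToken);
            return out;
        }
        for (int start = 0; start < codePoints; start++) {
            for (int size = 1; start + size <= codePoints; size++) {
                // Convert code-point positions to char indexes so surrogate pairs stay intact.
                int from = token.offsetByCodePoints(0, start);
                int to = token.offsetByCodePoints(from, size);
                // Documented: the suffix is added to every gram.
                out.add(token.substring(from, to) + suffix);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(grams("abc", "$", "*", 5));      // short token: grams with suffix
        System.out.println(grams("abcdefgh", "$", "*", 5)); // long token: wildcard only
    }
}
```

      A token whose length equals maxTokenLength is still ngrammed in this sketch, because the description says only tokens longer than the limit are skipped.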
  • Method Details

    • incrementToken

      public final boolean incrementToken() throws IOException
      Advances this stream to the next token, returning false once the end of the stream is reached.
      Specified by:
      incrementToken in class TokenStream
      Returns:
      false for end of stream; true otherwise
      Throws:
      IOException
    • reset

      public void reset() throws IOException
      Description copied from class: TokenFilter
      This method is called by a consumer before it begins consumption using TokenStream.incrementToken().

      Resets this stream to a clean state. Stateful implementations must implement this method so that they can be reused, just as if they had been created fresh.

      If you override this method, always call super.reset(), otherwise some internal state will not be correctly reset (e.g., Tokenizer will throw IllegalStateException on further usage).

      NOTE: The default implementation chains the call to the input TokenStream, so be sure to call super.reset() when overriding this method.

      Overrides:
      reset in class TokenFilter
      Throws:
      IOException
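
      The reset contract described above can be sketched with stand-in classes. None of the types below are Lucene classes: SketchSource, SketchFilter and CountingSketchFilter are hypothetical names that only mimic the relevant behavior, namely that the base filter's reset() chains to its input, that a source fails when consumed before reset(), and that a stateful subclass calls super.reset() before clearing its own state.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for a token source; it refuses to be consumed before reset().
class SketchSource {
    private final List<String> tokens;
    private Iterator<String> cursor; // null until reset() is called

    SketchSource(List<String> tokens) { this.tokens = tokens; }

    void reset() { cursor = tokens.iterator(); }

    boolean incrementToken() {
        if (cursor == null) throw new IllegalStateException("reset() was not called");
        if (!cursor.hasNext()) return false;
        cursor.next();
        return true;
    }
}

// Stand-in for a base filter: the default reset() chains the call to the input.
class SketchFilter {
    protected final SketchSource input;

    SketchFilter(SketchSource input) { this.input = input; }

    void reset() { input.reset(); }

    boolean incrementToken() { return input.incrementToken(); }
}

// A stateful filter: it overrides reset(), so it must call super.reset() first.
class CountingSketchFilter extends SketchFilter {
    private int emitted;

    CountingSketchFilter(SketchSource input) { super(input); }

    @Override
    void reset() {
        super.reset(); // resets the wrapped input
        emitted = 0;   // then clears this filter's own per-stream state
    }

    @Override
    boolean incrementToken() {
        if (!super.incrementToken()) return false;
        emitted++;
        return true;
    }

    int emitted() { return emitted; }
}

final class ResetContractSketch {
    // Consumer side: reset() before the first incrementToken(), then loop until false.
    static int consume(CountingSketchFilter filter) {
        filter.reset();
        while (filter.incrementToken()) {
            // a real consumer would read the token attributes here
        }
        return filter.emitted();
    }

    public static void main(String[] args) {
        CountingSketchFilter filter =
                new CountingSketchFilter(new SketchSource(Arrays.asList("foo", "bar")));
        System.out.println(consume(filter)); // first pass
        System.out.println(consume(filter)); // reuse after reset() behaves like a fresh stream
    }
}
```

      If CountingSketchFilter.reset() skipped the super.reset() call, the wrapped source would never be rewound, so a second pass would yield no tokens and a first pass would fail with IllegalStateException.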