Class HHMMSegmenter

java.lang.Object
org.apache.lucene.analysis.cn.smart.hhmm.HHMMSegmenter

public class HHMMSegmenter extends Object
Finds the optimal segmentation of a sentence into Chinese words.
  • Constructor Details

    • HHMMSegmenter

      public HHMMSegmenter()
  • Method Details

    • createSegGraph

      private SegGraph createSegGraph(String sentence)
      Create the SegGraph for a sentence.
      Parameters:
      sentence - input sentence, without start and end markers
      Returns:
      SegGraph corresponding to the input sentence.
    • getCharTypes

      private static int[] getCharTypes(String sentence)
      Get the character types for every character in a sentence.
      Parameters:
      sentence - input sentence
      Returns:
      array of character types corresponding to character positions in the sentence
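
To illustrate what a per-position character-type array looks like, here is a minimal, self-contained sketch. It is not the smartcn implementation: the class name `CharTypeSketch`, the type codes, and the classification rules are all assumptions made for the example, and the real codes and rules may differ.

```java
import java.util.Arrays;

public class CharTypeSketch {
    // Hypothetical type codes for illustration only; not Lucene's constants.
    static final int OTHER = 0, LETTER = 1, DIGIT = 2, HAN = 3, SPACE = 4;

    // Classify a single UTF-16 code unit into one of the toy types above.
    static int charType(char ch) {
        if (Character.isWhitespace(ch)) return SPACE;
        if (ch >= '0' && ch <= '9') return DIGIT;
        if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) return LETTER;
        if (Character.UnicodeScript.of(ch) == Character.UnicodeScript.HAN) return HAN;
        return OTHER;
    }

    // One type code per character position, in sentence order.
    static int[] charTypes(String sentence) {
        int[] types = new int[sentence.length()];
        for (int i = 0; i < sentence.length(); i++) {
            types[i] = charType(sentence.charAt(i));
        }
        return types;
    }

    public static void main(String[] args) {
        // The first two characters are Han ideographs ("wo", "ai"),
        // written as Unicode escapes to keep the source ASCII-only.
        int[] types = charTypes("\u6211\u7231 Java 8!");
        System.out.println(Arrays.toString(types)); // [3, 3, 4, 1, 1, 1, 1, 4, 2, 0]
    }
}
```

The result has the same length as the input, so index i of the array describes character i of the sentence.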
    • process

      public List<SegToken> process(String sentence)
      Return a list of SegToken objects representing the best segmentation of a sentence.
      Parameters:
      sentence - input sentence
      Returns:
      the best segmentation, as a List of SegToken
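
The idea of a "best segmentation" can be sketched as a lowest-cost path through a graph of candidate words. The example below is a simplified stand-in, not the algorithm HHMMSegmenter actually uses: the class `SegmentationSketch`, the toy dictionary, the word costs, and `UNKNOWN_COST` are all hypothetical, and it returns plain strings rather than SegToken objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SegmentationSketch {
    // Hypothetical cost of a single character that is not in the dictionary.
    static final double UNKNOWN_COST = 10.0;

    // Toy dictionary: word -> cost (lower means more likely). Values are made up.
    static Map<String, Double> toyDictionary() {
        Map<String, Double> dict = new HashMap<>();
        dict.put("\u7814\u7a76", 2.0);       // yanjiu: research
        dict.put("\u7814\u7a76\u751f", 3.0); // yanjiusheng: graduate student
        dict.put("\u751f\u547d", 2.0);       // shengming: life
        dict.put("\u547d", 5.0);             // ming: fate
        dict.put("\u8d77\u6e90", 2.0);       // qiyuan: origin
        return dict;
    }

    // Dynamic programming over character positions: best[i] is the lowest
    // total cost of segmenting the first i characters.
    static List<String> segment(String sentence, Map<String, Double> dict, int maxWordLen) {
        int n = sentence.length();
        double[] best = new double[n + 1];
        int[] prev = new int[n + 1];
        Arrays.fill(best, Double.POSITIVE_INFINITY);
        best[0] = 0.0;
        for (int end = 1; end <= n; end++) {
            for (int start = Math.max(0, end - maxWordLen); start < end; start++) {
                Double cost = dict.get(sentence.substring(start, end));
                if (cost == null) {
                    if (end - start != 1) continue; // only single chars may be unknown
                    cost = UNKNOWN_COST;
                }
                if (best[start] + cost < best[end]) {
                    best[end] = best[start] + cost;
                    prev[end] = start;
                }
            }
        }
        // Walk the back-pointers from the end, then reverse into sentence order.
        List<String> words = new ArrayList<>();
        for (int end = n; end > 0; end = prev[end]) {
            words.add(sentence.substring(prev[end], end));
        }
        Collections.reverse(words);
        return words;
    }

    public static void main(String[] args) {
        // "yanjiu shengming qiyuan" written without spaces.
        String sentence = "\u7814\u7a76\u751f\u547d\u8d77\u6e90";
        System.out.println(segment(sentence, toyDictionary(), 3));
    }
}
```

With these toy costs, the path yanjiu / shengming / qiyuan (total 6.0) beats yanjiusheng / ming / qiyuan (total 10.0), even though a greedy longest-match strategy would pick the three-character word first. This is why a globally optimal search over the whole sentence can do better than matching word by word.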