lingscope.algorithms
Class AbnerTokenizer

java.lang.Object
  extended by lingscope.algorithms.AbnerTokenizer

public class AbnerTokenizer
extends java.lang.Object


Constructor Summary
AbnerTokenizer()
           
 
Method Summary
static java.lang.String splitTermsByPunctuation(java.lang.String input)
          Splits the input sentence on punctuation and spaces, then joins the pieces back together with a space between them and returns the result.
static java.lang.String tokenize(java.lang.String s)
          Takes raw text and applies ABNER's built-in tokenization to it.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbnerTokenizer

public AbnerTokenizer()
Method Detail

tokenize

public static java.lang.String tokenize(java.lang.String s)

Takes raw text and applies ABNER's built-in tokenization to it.


splitTermsByPunctuation

public static java.lang.String splitTermsByPunctuation(java.lang.String input)
Splits the input sentence on punctuation and spaces, then joins the pieces back together with a space between them and returns the result.

Parameters:
input - the input string to process
Returns:
the processed input string, in which all words and punctuation marks are separated by a space
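
The behavior described for splitTermsByPunctuation can be illustrated with a minimal, self-contained sketch. This is not the lingscope implementation: the class name PunctuationSplitSketch, the token pattern, and the choices to treat every non-alphanumeric, non-space character as its own token and to collapse runs of whitespace are all assumptions made for illustration. The real method may classify characters differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for AbnerTokenizer.splitTermsByPunctuation;
// the actual lingscope code is not shown in this documentation.
class PunctuationSplitSketch {

    // Assumed token definition: a run of letters/digits, or exactly one
    // character that is neither a letter, a digit, nor whitespace.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|[^\\p{L}\\p{N}\\s]");

    static String splitTermsByPunctuation(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        // Stitch the pieces back together with a single space between them.
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        // prints: No evidence of IL - 2 , however .
        System.out.println(splitTermsByPunctuation("No evidence of IL-2, however."));
    }
}
```

Under these assumptions, hyphenated terms such as "IL-2" are split into three tokens, which may or may not match what the library does with biomedical names.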