class JapaneseNames::Util::Ngram

Provides methods for parsing Japanese name strings.

Public Class Methods

index_partition(str, i)

Splits a string at index i and returns a two-element array: the first i characters, followed by the rest of the string.

# File lib/japanese_names/util/ngram.rb, line 17
def index_partition(str, i)
  [str[0...i], str[i..-1]]
end
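
As an illustration of the split, here is a minimal sketch that copies the method to the top level so it runs without the gem loaded. The sample name "山田太郎" and the variable names are assumptions chosen for the example, not taken from the library.

```ruby
# Standalone copy of index_partition, defined at top level for illustration only.
def index_partition(str, i)
  [str[0...i], str[i..-1]]
end

# Split a four-character name after its second character.
left, right = index_partition("山田太郎", 2)
# left  => "山田"
# right => "太郎"

# An index of 0 leaves the left part empty.
index_partition("abcde", 0) # => ["", "abcde"]
```

Note that an index greater than the string length makes `str[i..-1]` return nil, so callers would likely want to keep i within 0..str.size.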
mask_left(str, mask)

Removes mask from the start of str, if present, and returns the remaining (right) portion of the String. If str does not start with mask, str is returned unchanged.

Example: mask_left("abcde", "ab") #=> "cde"

# File lib/japanese_names/util/ngram.rb, line 36
def mask_left(str, mask)
  str.gsub(/\A#{mask}/, '')
end
mask_right(str, mask)

Removes mask from the end of str, if present, and returns the remaining (left) portion of the String. If str does not end with mask, str is returned unchanged.

Example: mask_right("abcde", "de") #=> "abc"

# File lib/japanese_names/util/ngram.rb, line 43
def mask_right(str, mask)
  str.gsub(/#{mask}\z/, '')
end
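
Both masking methods interpolate mask directly into a regular expression, so a mask containing regex metacharacters (such as "." or "(") is interpreted as a pattern rather than as literal text. For kanji and kana input this rarely matters. The sketch below shows one possible way to force literal matching with Regexp.escape. The names mask_left_literal and mask_right_literal are hypothetical and are not part of the library.

```ruby
# Hypothetical variants, not part of JapaneseNames: the mask is escaped
# so that it is always matched as literal text.
def mask_left_literal(str, mask)
  str.sub(/\A#{Regexp.escape(mask)}/, '')
end

def mask_right_literal(str, mask)
  str.sub(/#{Regexp.escape(mask)}\z/, '')
end

mask_left_literal("a.cde", "a.")       # => "cde"
mask_left_literal("abcde", "a.")       # => "abcde" (no literal "a." prefix)
mask_right_literal("山田太郎", "太郎") # => "山田"
```

With the unescaped pattern, the second call would instead return "cde", because "." matches any character.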
ngram_partition(str)

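Generates the two-part splits of a string in middle-out order: the split at the midpoint comes first, followed by splits that alternate outward toward both ends. Each split is a [left, right] pair.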

# File lib/japanese_names/util/ngram.rb, line 9
def ngram_partition(str)
  size = str.size
  spiral_partition_indexes(size).map do |i|
    index_partition(str, i)
  end
end
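
To make the ordering concrete, the following sketch copies the three methods involved to the top level (so it runs without the gem) and applies them to two sample strings. The inputs are assumptions chosen for illustration.

```ruby
# Standalone copies of the library methods, for illustration only.
def index_partition(str, i)
  [str[0...i], str[i..-1]]
end

def spiral_partition_indexes(size)
  ary = []
  last = size / 2
  ary << last
  (size - 2).times do |i|
    last += (i + 1) * (-1)**i
    ary << last
  end
  ary
end

def ngram_partition(str)
  spiral_partition_indexes(str.size).map { |i| index_partition(str, i) }
end

ngram_partition("abcde")
# => [["ab", "cde"], ["abc", "de"], ["a", "bcde"], ["abcd", "e"]]

ngram_partition("山田太郎")
# => [["山田", "太郎"], ["山田太", "郎"], ["山", "田太郎"]]
```

For the four-character name, the even 2/2 split is produced first and the more lopsided splits follow.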
spiral_partition_indexes(size)

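Lists the split indexes for a string of the given length in middle-out order: the midpoint first, then indexes alternating to its right and left. For example, a size of 5 yields [2, 3, 1, 4].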

# File lib/japanese_names/util/ngram.rb, line 22
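def spiral_partition_indexes(size)
  ary = []
  # Start at the midpoint (integer division).
  last = size / 2
  ary << last
  (size - 2).times do |i|
    # Step by +1, -2, +3, -4, ... so the index alternates around the midpoint.
    last += (i + 1) * (-1)**i
    ary << last
  end
  ary
end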
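
The sketch below, again a top-level copy for illustration, prints the index sequence for a few sizes and checks one property of the listing above: for sizes of 2 or more, the result contains every interior split point from 1 to size - 1 exactly once. The edge-case results for sizes 0 and 1 follow from the code as listed, since a negative receiver to Integer#times runs the block zero times.

```ruby
# Standalone copy of spiral_partition_indexes, for illustration only.
def spiral_partition_indexes(size)
  ary = []
  last = size / 2
  ary << last
  (size - 2).times do |i|
    last += (i + 1) * (-1)**i
    ary << last
  end
  ary
end

spiral_partition_indexes(2) # => [1]
spiral_partition_indexes(3) # => [1, 2]
spiral_partition_indexes(6) # => [3, 4, 2, 5, 1]

# For size >= 2 the result is a permutation of 1..size-1.
(2..10).all? { |n| spiral_partition_indexes(n).sort == (1...n).to_a } # => true

# Sizes 0 and 1 yield a single index of 0, i.e. a split with an empty left part.
spiral_partition_indexes(1) # => [0]
spiral_partition_indexes(0) # => [0]
```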