class JapaneseNames::Util::Ngram
Provides methods for parsing Japanese name strings.
Public Class Methods
index_partition(str, i)
Splits a string at index i and returns a two-element Array: the first i characters and the remainder of the String.
# File lib/japanese_names/util/ngram.rb, line 17
def index_partition(str, i)
  [str[0...i], str[i..-1]]
end
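A minimal sketch of how the split behaves at the edges of the string. It uses a local copy of the method body from the listing so it runs without the gem; the top-level `index_partition` defined here is an assumption for illustration, not the library's entry point.

```ruby
# Local copy of the documented method body (assumption: defined at top
# level here only so the example is self-contained).
def index_partition(str, i)
  [str[0...i], str[i..-1]]
end

p index_partition("abcde", 2) #=> ["ab", "cde"]
p index_partition("abcde", 0) #=> ["", "abcde"]
p index_partition("abcde", 5) #=> ["abcde", ""]

# An index past the end of the string yields nil for the right part,
# because String#[] returns nil when the range starts beyond the length.
p index_partition("abcde", 6) #=> ["abcde", nil]
```

Callers that may pass an out-of-range index should be prepared for the `nil` in the second slot.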
mask_left(str, mask)
Masks a String from the left side and returns the remaining (right) portion of the String.
Example: mask_left("abcde", "ab") #=> "cde"
# File lib/japanese_names/util/ngram.rb, line 36
def mask_left(str, mask)
  str.gsub(/\A#{mask}/, '')
end
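Because the mask is interpolated directly into a regular expression, any regex metacharacters in it are interpreted as pattern syntax rather than literal text. The sketch below shows that behaviour with a local copy of the method, next to a hypothetical `literal_mask_left` variant (not part of the library) that escapes the mask first.

```ruby
# Local copy of the documented method body.
def mask_left(str, mask)
  str.gsub(/\A#{mask}/, '')
end

# Hypothetical variant (an assumption, not library code): treat the mask
# as literal text by escaping it before interpolation.
def literal_mask_left(str, mask)
  str.sub(/\A#{Regexp.escape(mask)}/, '')
end

p mask_left("abcde", "ab")       #=> "cde"
p mask_left("abcde", "xy")       #=> "abcde" (no match, string unchanged)

# "." is a regex wildcard, so it strips the first character, whatever it is.
p mask_left("abc", ".")          #=> "bc"
p literal_mask_left("abc", ".")  #=> "abc"
```

For name strings made of kana and kanji this difference rarely matters, but it is worth knowing if the mask can come from untrusted input.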
mask_right(str, mask)
Masks a String from the right side and returns the remaining (left) portion of the String.
Example: mask_right("abcde", "de") #=> "abc"
# File lib/japanese_names/util/ngram.rb, line 43
def mask_right(str, mask)
  str.gsub(/#{mask}\z/, '')
end
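As a rough illustration of how the two masks complement each other on a name string, the sketch below uses local copies of both method bodies. The sample name and the idea of masking out an already-identified part are assumptions for illustration; the source does not show how the library calls these methods.

```ruby
# Local copies of the documented method bodies.
def mask_left(str, mask)
  str.gsub(/\A#{mask}/, '')
end

def mask_right(str, mask)
  str.gsub(/#{mask}\z/, '')
end

full_name = "上原望" # sample input chosen for this example

# Remove a known trailing part to leave the leading part, and vice versa.
p mask_right(full_name, "望")   #=> "上原"
p mask_left(full_name, "上原")  #=> "望"

# The \z anchor means only a match at the very end is removed.
p mask_right("abcde", "bc")     #=> "abcde"
```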
ngram_partition(str)
Generates two-part partitions of a string, starting at the middle split point and alternating outward toward both ends.
# File lib/japanese_names/util/ngram.rb, line 9
def ngram_partition(str)
  size = str.size
  spiral_partition_indexes(size).map do |i|
    index_partition(str, i)
  end
end
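To make the ordering concrete, here is a small runnable sketch built from local copies of the three method bodies in this class. Defining them at top level is an assumption made only so the example is self-contained.

```ruby
# Local copies of the documented method bodies.
def index_partition(str, i)
  [str[0...i], str[i..-1]]
end

def spiral_partition_indexes(size)
  ary = []
  last = size / 2
  ary << last
  (size - 2).times do |i|
    last += (i + 1) * (-1)**i
    ary << last
  end
  ary
end

def ngram_partition(str)
  spiral_partition_indexes(str.size).map { |i| index_partition(str, i) }
end

# The middle split comes first, then splits alternate right and left.
p ngram_partition("abcde")
#=> [["ab", "cde"], ["abc", "de"], ["a", "bcde"], ["abcd", "e"]]

p ngram_partition("上原望")
#=> [["上", "原望"], ["上原", "望"]]
```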
spiral_partition_indexes(size)
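Lists the partition indexes for a given string length in middle-out order: it starts at size / 2 and then alternates right and left with growing steps (+1, -2, +3, ...).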
# File lib/japanese_names/util/ngram.rb, line 22
def spiral_partition_indexes(size)
  ary = []
  last = size / 2
  ary << last
  (size - 2).times do |i|
    last += (i + 1) * (-1)**i
    ary << last
  end
  ary
end
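The alternating step is easier to follow with a few traced values. The sketch below runs a local copy of the method body for several sizes and checks that, for sizes of 2 or more, every interior split point appears exactly once.

```ruby
# Local copy of the documented method body.
def spiral_partition_indexes(size)
  ary = []
  last = size / 2
  ary << last
  (size - 2).times do |i|
    last += (i + 1) * (-1)**i # steps: +1, -2, +3, -4, ...
    ary << last
  end
  ary
end

p spiral_partition_indexes(2) #=> [1]
p spiral_partition_indexes(3) #=> [1, 2]
p spiral_partition_indexes(4) #=> [2, 3, 1]
p spiral_partition_indexes(5) #=> [2, 3, 1, 4]
p spiral_partition_indexes(6) #=> [3, 4, 2, 5, 1]

# For size >= 2 the result is a permutation of 1..size-1.
(2..10).each do |n|
  raise "unexpected indexes for #{n}" unless spiral_partition_indexes(n).sort == (1...n).to_a
end

# A one-character string yields index 0, so its left part is empty.
p spiral_partition_indexes(1) #=> [0]
```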