class Sawarineko::Converter

Convert plain text to Sawarineko text.

Constants

CONVERTIBLE_EUC_KR

Array of convertible character codes in the EUC-KR encoding.

Public Class Methods

new(encoding = Encoding::UTF_8)

Initialize a Converter. Resolves the encoding of the source and builds the Regexps used for conversion.

encoding - The Encoding of the source (default: Encoding::UTF_8).

# File lib/sawarineko/converter.rb, line 15
def initialize(encoding = Encoding::UTF_8)
  @encoding = Encoding.find(encoding)
  @hiragana_regex = Regexp.new('な'.encode(@encoding)).freeze
  @katakana_regex = Regexp.new('ナ'.encode(@encoding)).freeze
  @hangul_regex = case @encoding
                  when Encoding::UTF_8, Encoding::UTF_16BE,
                       Encoding::UTF_16LE, Encoding::EUC_KR
                    Regexp.new('[나-낳]'.encode(@encoding)).freeze
                  when Encoding::CP949
                    Regexp.new('[나-낳낛-낤낥-낲]'.encode(@encoding)).freeze
                  end
end
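
As a rough illustration of what the constructor sets up, the sketch below reproduces its UTF-8 branch outside the class. It assumes a UTF-8 source file; the local variable names are made up for this example and are not part of Sawarineko.

```ruby
# Encoding.find accepts either an encoding name or an Encoding object,
# so both calls resolve to the same Encoding.
utf8_by_name = Encoding.find('UTF-8')
utf8_by_const = Encoding.find(Encoding::UTF_8)

# The same character class the constructor builds for UTF-8 sources:
# every syllable from 나 (U+B098) through 낳 (U+B0B3).
hangul_regex = Regexp.new('[나-낳]'.encode(utf8_by_name)).freeze

matches_na = hangul_regex.match?('나')   # first syllable of the range
matches_nah = hangul_regex.match?('낳')  # last syllable of the range
matches_ga = hangul_regex.match?('가')   # outside the range
```

For encodings not listed in the case expression, no branch matches and @hangul_regex stays nil, which is why convert checks it before using it.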

Public Instance Methods

convert(source)

Convert the source.

source - The String source to convert.

Returns the String converted to Sawarineko.

# File lib/sawarineko/converter.rb, line 33
def convert(source)
  new_source = source.gsub(@hiragana_regex, 'にゃ'.encode(@encoding).freeze)
                     .gsub(@katakana_regex, 'ニャ'.encode(@encoding).freeze)
  if @hangul_regex
    new_source.gsub(@hangul_regex) { |ch| convert_hangul(ch) }
  else
    new_source
  end
end
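
The kana part of the conversion can be sketched on its own as two chained substitutions. This is a simplified, UTF-8-only illustration; kana_to_sawarineko is a hypothetical helper name, not a method of the library.

```ruby
# Hypothetical standalone helper mirroring the two kana substitutions
# performed by convert, for UTF-8 strings only.
def kana_to_sawarineko(source)
  source.gsub(/な/, 'にゃ')   # hiragana な becomes にゃ
        .gsub(/ナ/, 'ニャ')   # katakana ナ becomes ニャ
end

hiragana_result = kana_to_sawarineko('ななめ')  # "にゃにゃめ"
katakana_result = kana_to_sawarineko('バナナ')  # "バニャニャ"
```

Characters other than な and ナ pass through unchanged.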

Private Instance Methods

convert_hangul(ch)

Convert a hangul character to Sawarineko.

ch - A hangul character to convert.

Returns the converted character.

# File lib/sawarineko/converter.rb, line 50
def convert_hangul(ch)
  if @encoding == Encoding::EUC_KR && !CONVERTIBLE_EUC_KR.include?(ch.ord)
    return ch
  end
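  # In Unicode, adding 56 (2 vowels x 28 finals) moves a syllable from the
  # 나 block to the 냐 block while keeping its final consonant.
  (ch.encode(Encoding::UTF_8).ord + 56).chr(Encoding::UTF_8)
                                       .encode(@encoding)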
end
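
The offset of 56 follows from the layout of Unicode Hangul syllables: each initial-plus-vowel block holds 28 code points (one per final consonant, including none), and ㅑ sits two vowels after ㅏ, so the shift is 2 x 28. A minimal sketch of that arithmetic, assuming UTF-8 input and using a hypothetical helper name shift_hangul:

```ruby
# Hypothetical helper showing only the code point shift, without the
# EUC-KR check or the re-encoding step of convert_hangul.
def shift_hangul(ch)
  (ch.ord + 56).chr(Encoding::UTF_8)
end

shifted_na = shift_hangul('나')   # U+B098 + 56 = U+B0D0, "냐"
shifted_nan = shift_hangul('난')  # U+B09C + 56 = U+B0D4, "냔"
```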