class Sawarineko::Converter
Convert plain text to Sawarineko text.
Constants
- CONVERTIBLE_EUC_KR
Array of convertible character codes in the EUC-KR encoding.
Public Class Methods
new(encoding = Encoding::UTF_8)
Initialize a Converter. Get the encoding of source. Initialize Regexps for conversion.
encoding - The Encoding of source (default: Encoding::UTF_8).
# File lib/sawarineko/converter.rb, line 15
def initialize(encoding = Encoding::UTF_8)
  @encoding = Encoding.find(encoding)
  @hiragana_regex = Regexp.new('な'.encode(@encoding)).freeze
  @katakana_regex = Regexp.new('ナ'.encode(@encoding)).freeze
  @hangul_regex =
    case @encoding
    when Encoding::UTF_8, Encoding::UTF_16BE, Encoding::UTF_16LE,
         Encoding::EUC_KR
      Regexp.new('[나-낳]'.encode(@encoding)).freeze
    when Encoding::CP949
      Regexp.new('[나-낳낛-낤낥-낲]'.encode(@encoding)).freeze
    end
end
Public Instance Methods
convert(source)
Convert the source.
source - The String source to convert.
Returns the String converted to Sawarineko.
# File lib/sawarineko/converter.rb, line 33
def convert(source)
  new_source = source.gsub(@hiragana_regex, 'にゃ'.encode(@encoding).freeze)
                     .gsub(@katakana_regex, 'ニャ'.encode(@encoding).freeze)
  if @hangul_regex
    new_source.gsub(@hangul_regex) { |ch| convert_hangul(ch) }
  else
    new_source
  end
end
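The kana half of convert can be tried in isolation. The following is a minimal standalone sketch, assuming UTF-8 input; it does not load the sawarineko gem, and kana_to_nya is a hypothetical name used only for illustration.

```ruby
# Hypothetical helper (not part of the gem): replaces hiragana な with にゃ
# and katakana ナ with ニャ, the same two substitutions convert chains.
def kana_to_nya(source)
  source.gsub('な', 'にゃ')
        .gsub('ナ', 'ニャ')
end

puts kana_to_nya('ななめ')   # prints にゃにゃめ
puts kana_to_nya('バナナ')   # prints バニャニャ
```

Characters other than な and ナ pass through unchanged, so text with no matching characters is returned as an equal String.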
Private Instance Methods
convert_hangul(ch)
Convert a hangul character to Sawarineko.
ch - A hangul character to convert.
Returns the converted character.
# File lib/sawarineko/converter.rb, line 50
def convert_hangul(ch)
  if @encoding == Encoding::EUC_KR &&
     !CONVERTIBLE_EUC_KR.include?(ch.ord)
    return ch
  end
  (ch.encode(Encoding::UTF_8).ord + 56).chr(Encoding::UTF_8)
                                       .encode(@encoding)
end
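The offset of 56 has a simple reading in Unicode: precomposed Hangul syllables are laid out in blocks of 28 per medial vowel, so adding 2 × 28 = 56 moves a syllable from the vowel ㅏ to ㅑ while keeping its initial and final consonants. The sketch below illustrates that arithmetic for UTF-8 strings only; shift_medial is a hypothetical name, and the EUC-KR guard from the listing is left out.

```ruby
# Hypothetical helper (not part of the gem): adds 56 to the code point,
# which turns 나..낳 into 냐..냫 in Unicode's Hangul syllable layout.
def shift_medial(ch)
  (ch.ord + 56).chr(Encoding::UTF_8)
end

puts shift_medial('나')                                # prints 냐
puts '난다'.gsub(/[나-낳]/) { |ch| shift_medial(ch) }  # prints 냔다
```

Only syllables in the range 나-낳 are matched by the regular expression here, so 다 in the second example is left as it is.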