class UnicodeUtils::Codepoint
A Codepoint instance represents a single Unicode code point.
UnicodeUtils::Codepoint.new(0x20ac) => #<U+20AC "€" EURO SIGN utf8:e2,82,ac>
Constants
- RANGE
The Unicode codespace. Any integer in this range is a Unicode code point.
Public Class Methods
new(int)
Create a Codepoint instance that wraps the given Integer. int must be in Codepoint::RANGE; otherwise an ArgumentError is raised.
# File lib/unicode_utils/codepoint.rb, line 17
def initialize(int)
  unless RANGE.include?(int)
    raise ArgumentError, "#{int} not in codespace"
  end
  @int = int
end
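To illustrate the range check performed by the constructor, here is a minimal sketch that mirrors the source above in a stand-in class. The class name SketchCodepoint is hypothetical, and the bounds 0..0x10FFFF are an assumption based on the codespace defined by the Unicode Standard; the actual value of Codepoint::RANGE is not shown on this page.

```ruby
# Hypothetical stand-in that mirrors the constructor shown above.
# It is not the real UnicodeUtils::Codepoint class.
class SketchCodepoint
  # Assumed bounds of the Unicode codespace (U+0000 through U+10FFFF).
  RANGE = 0..0x10FFFF

  def initialize(int)
    raise ArgumentError, "#{int} not in codespace" unless RANGE.include?(int)
    @int = int
  end

  # Return the wrapped Integer.
  def ord
    @int
  end
end

SketchCodepoint.new(0x20ac).ord # => 8364

begin
  SketchCodepoint.new(0x110000) # one past the last code point
rescue ArgumentError => e
  e.message # => "1114112 not in codespace"
end
```

Note that the error message interpolates the integer in decimal, not in hexadecimal.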
Public Instance Methods
hexbytes()
Get the bytes used to encode this code point in UTF-8, hex-formatted.
Codepoint.new(0xe4).hexbytes => "c3,a4"
# File lib/unicode_utils/codepoint.rb, line 54
def hexbytes
  to_s.bytes.map { |b| sprintf("%02x", b) }.join(",")
end
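The number of bytes in the result depends on the code point, since UTF-8 uses one to four bytes per code point. The sketch below applies the same steps as the source above to a plain Integer, using only core Ruby. The helper name utf8_hexbytes is hypothetical and not part of UnicodeUtils.

```ruby
# Hypothetical helper: encode the integer as UTF-8, then format each
# byte as two lowercase hex digits, separated by commas.
def utf8_hexbytes(int)
  int.chr(Encoding::UTF_8).bytes.map { |b| format("%02x", b) }.join(",")
end

utf8_hexbytes(0x41)    # => "41"           (1 byte)
utf8_hexbytes(0xe4)    # => "c3,a4"        (2 bytes)
utf8_hexbytes(0x20ac)  # => "e2,82,ac"     (3 bytes)
utf8_hexbytes(0x1f600) # => "f0,9f,98,80"  (4 bytes)
```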
inspect()
Returns a string of the form #<U+… char name utf8-hexbytes>.
# File lib/unicode_utils/codepoint.rb, line 59
def inspect
  "#<#{uplus} #{to_s.inspect} #{name || "nil"} utf8:#{hexbytes}>"
end
name()
Get the normative Unicode name of this code point.
See also: UnicodeUtils.char_name
# File lib/unicode_utils/codepoint.rb, line 39
def name
  UnicodeUtils.char_name(@int)
end
ord()
Convert to Integer.
# File lib/unicode_utils/codepoint.rb, line 25
def ord
  @int
end
to_s()
Convert this code point to a UTF-8 encoded string. Each call returns a new string, so the caller may safely mutate the return value.
# File lib/unicode_utils/codepoint.rb, line 46
def to_s
  @int.chr(Encoding::UTF_8)
end
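The "new string on each call" guarantee follows from Integer#chr, which allocates a fresh String every time. The sketch below demonstrates this with core Ruby only. The helper name utf8_string is hypothetical and stands in for Codepoint#to_s.

```ruby
# Hypothetical stand-in for Codepoint#to_s: same expression as the source.
def utf8_string(int)
  int.chr(Encoding::UTF_8)
end

a = utf8_string(0x20ac)
b = utf8_string(0x20ac)

a == b        # => true, same content
a.equal?(b)   # => false, two distinct String objects
a << "!"      # mutating one result...
b             # => "€", ...leaves the other untouched
a.encoding    # => #<Encoding:UTF-8>
```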
uplus()
Format in U+ notation.
Codepoint.new(0xc5).uplus => "U+00C5"
# File lib/unicode_utils/codepoint.rb, line 32
def uplus
  sprintf('U+%04X', @int)
end
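The %04X directive pads to at least four uppercase hex digits but does not truncate, so code points above U+FFFF print with five or six digits. A small sketch using the same format string on plain Integers follows. The helper name uplus_for is hypothetical.

```ruby
# Hypothetical helper using the same format string as the source above.
def uplus_for(int)
  format("U+%04X", int)
end

uplus_for(0x41)     # => "U+0041"
uplus_for(0xc5)     # => "U+00C5"
uplus_for(0x20ac)   # => "U+20AC"
uplus_for(0x1f600)  # => "U+1F600"
uplus_for(0x10ffff) # => "U+10FFFF"
```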