module Addressable::IDNA

Constants

ACE_MAX_LENGTH
ACE_PREFIX
COMPOSITION_TABLE
PUNYCODE_BASE
PUNYCODE_DAMP
PUNYCODE_DELIMITER
PUNYCODE_INITIAL_BIAS
PUNYCODE_INITIAL_N
PUNYCODE_MAXINT
PUNYCODE_PRINT_ASCII
PUNYCODE_SKEW
PUNYCODE_TMAX
PUNYCODE_TMIN
UNICODE_DATA

This is a sparse Unicode table. Codepoints without entries are assumed to have the value: [0, 0, nil, nil, nil, nil, nil]
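
A minimal sketch of how a lookup against a sparse table like this can fall back to the default row. SAMPLE_DATA, DEFAULT_ROW, LOWERCASE_INDEX and row_for are hypothetical names for illustration, the table is assumed to be Hash-like, and the column position of the lowercase mapping is an assumption rather than something stated above.

```ruby
# Default row for codepoints with no entry, as quoted in the description.
DEFAULT_ROW = [0, 0, nil, nil, nil, nil, nil].freeze

# Assumed column of the lowercase mapping (hypothetical, for illustration).
LOWERCASE_INDEX = 5

# A tiny stand-in table: only U+0041 ("A") has an entry.
SAMPLE_DATA = {
  0x41 => [0, 0, nil, nil, nil, 0x61, nil]
}.freeze

# Fetch the row for a codepoint, using the default row when it is absent.
def row_for(codepoint)
  SAMPLE_DATA.fetch(codepoint, DEFAULT_ROW)
end

row_for(0x41)[LOWERCASE_INDEX] # => 97 (0x61, "a")
row_for(0x42)[LOWERCASE_INDEX] # => nil, because U+0042 has no entry here
```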

UNICODE_DATA_CANONICAL
UNICODE_DATA_COMBINING_CLASS
UNICODE_DATA_COMPATIBILITY
UNICODE_DATA_EXCLUSION
UNICODE_DATA_LOWERCASE
UNICODE_DATA_TITLECASE
UNICODE_DATA_UPPERCASE
UNICODE_MAX_LENGTH
UNICODE_TABLE

This module is loosely based on idn_actionmailer by Mick Staugaard, the unicode library by Yoshida Masato, and the punycode implementation by Kazuhiro Nishiyama. Most of the code was copied verbatim, with some reformatting and some translation from C.

Without their code to work from as a base, we'd all still be relying on the presence of libidn, which nobody ever seems to have installed.

Original sources: github.com/staugaard/idn_actionmailer, www.yoshidam.net/Ruby.html#unicode, rubyforge.org/frs/?group_id=2550

UTF8_REGEX
UTF8_REGEX_MULTIBYTE

Public Class Methods

to_ascii(value)
# File lib/addressable/idna/native.rb, line 42
def self.to_ascii(value)
  value.to_s.split('.', -1).map do |segment|
    if segment.size > 0 && segment.size < 64
      IDN::Idna.toASCII(segment, IDN::Idna::ALLOW_UNASSIGNED)
    elsif segment.size >= 64
      segment
    else
      ''
    end
  end.join('.')
end
to_unicode(value)
# File lib/addressable/idna/native.rb, line 54
def self.to_unicode(value)
  value.to_s.split('.', -1).map do |segment|
    if segment.size > 0 && segment.size < 64
      IDN::Idna.toUnicode(segment, IDN::Idna::ALLOW_UNASSIGNED)
    elsif segment.size >= 64
      segment
    else
      ''
    end
  end.join('.')
end
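
Both to_ascii and to_unicode share the same per-label loop. The sketch below isolates that loop so it can run without the idn bindings. map_labels is a hypothetical helper, and the block stands in for the IDN::Idna call.

```ruby
# Hypothetical helper mirroring the per-label loop shown above.
# The block replaces the IDN::Idna.toASCII / toUnicode call.
def map_labels(value)
  # The -1 limit keeps trailing empty segments, so a trailing dot survives.
  value.to_s.split('.', -1).map do |segment|
    if segment.size > 0 && segment.size < 64
      yield segment
    else
      segment # empty and over-long labels pass through unchanged
    end
  end.join('.')
end

map_labels('www.example.com.') { |label| label.upcase }
# => "WWW.EXAMPLE.COM."
```

Labels of 64 or more characters are returned untouched, which matches the elsif branch in the listings above.
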
unicode_normalize_kc(value)

@deprecated Use {String#unicode_normalize(:nfkc)} instead

# File lib/addressable/idna/native.rb, line 34
def unicode_normalize_kc(value)
  value.to_s.unicode_normalize(:nfkc)
end
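
As a short illustration of the replacement named in the deprecation note, String#unicode_normalize(:nfkc) from Ruby's standard library folds compatibility characters into their plain equivalents. The sample strings are chosen only for illustration.

```ruby
ligature  = "\uFB01" # LATIN SMALL LIGATURE FI
fullwidth = "\uFF21" # FULLWIDTH LATIN CAPITAL LETTER A

ligature.unicode_normalize(:nfkc)  # => "fi"
fullwidth.unicode_normalize(:nfkc) # => "A"
```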

Private Class Methods

lookup_unicode_lowercase(codepoint)
# File lib/addressable/idna/pure.rb, line 140
def self.lookup_unicode_lowercase(codepoint)
  codepoint_data = UNICODE_DATA[codepoint]
  (codepoint_data ?
    (codepoint_data[UNICODE_DATA_LOWERCASE] || codepoint) :
    codepoint)
end
punycode_adapt(delta, numpoints, firsttime)

Bias adaptation method

# File lib/addressable/idna/pure.rb, line 488
def self.punycode_adapt(delta, numpoints, firsttime)
  delta = firsttime ? delta / PUNYCODE_DAMP : delta >> 1
  # delta >> 1 is a faster way of doing delta / 2
  delta += delta / numpoints
  difference = PUNYCODE_BASE - PUNYCODE_TMIN

  k = 0
  while delta > (difference * PUNYCODE_TMAX) / 2
    delta /= difference
    k += PUNYCODE_BASE
  end

  k + (difference + 1) * delta / (delta + PUNYCODE_SKEW)
end
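
To make the bias adaptation concrete, here is a standalone sketch of the same arithmetic. The constants are the Punycode parameters from RFC 3492; that the module's PUNYCODE_* constants hold these values is an assumption, and adapt is a hypothetical name.

```ruby
# Punycode parameters from RFC 3492 (assumed to match PUNYCODE_*).
BASE = 36
TMIN = 1
TMAX = 26
SKEW = 38
DAMP = 700

def adapt(delta, numpoints, firsttime)
  # Damp heavily on the first delta, halve on later ones.
  delta = firsttime ? delta / DAMP : delta / 2
  delta += delta / numpoints
  k = 0
  # Scale delta down until it fits the threshold, counting BASE per step.
  while delta > ((BASE - TMIN) * TMAX) / 2
    delta /= BASE - TMIN
    k += BASE
  end
  k + (BASE - TMIN + 1) * delta / (delta + SKEW)
end

adapt(745, 6, true)       # => 0
adapt(100_000, 10, false) # => 91
```

In the second call, delta becomes 55000 after halving and adding delta / numpoints, is divided by 35 twice (k reaches 72), and the final term contributes 19.
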
punycode_basic?(codepoint)
# File lib/addressable/idna/pure.rb, line 456
def self.punycode_basic?(codepoint)
  codepoint < 0x80
end
punycode_decode(value)
# File lib/addressable/idna/native.rb, line 28
def self.punycode_decode(value)
  IDN::Punycode.decode(value.to_s)
end
punycode_decode_digit(codepoint)

Returns the numeric value of a basic codepoint (for use in representing integers) in the range 0 to base - 1, or PUNYCODE_BASE if codepoint does not represent a value.

# File lib/addressable/idna/pure.rb, line 474
def self.punycode_decode_digit(codepoint)
  if codepoint - 48 < 10
    codepoint - 22
  elsif codepoint - 65 < 26
    codepoint - 65
  elsif codepoint - 97 < 26
    codepoint - 97
  else
    PUNYCODE_BASE
  end
end
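
The subtract-and-compare tests in this listing follow the C idiom from the RFC 3492 sample code, which uses unsigned integers. With Ruby's signed integers, `codepoint - 48 < 10` is also true for codepoints below 48, so the result is only meaningful for codepoints that are already known to be digits or letters. A range-checked variant, offered as a sketch and not as the library's code, could look like this. decode_digit is a hypothetical name, and BASE is assumed to equal PUNYCODE_BASE.

```ruby
BASE = 36 # Punycode base from RFC 3492 (assumed value of PUNYCODE_BASE)

def decode_digit(codepoint)
  case codepoint
  when 48..57  then codepoint - 22 # "0".."9" => 26..35
  when 65..90  then codepoint - 65 # "A".."Z" => 0..25
  when 97..122 then codepoint - 97 # "a".."z" => 0..25
  else BASE                        # not a Punycode digit
  end
end

decode_digit("a".ord) # => 0
decode_digit("9".ord) # => 35
decode_digit("-".ord) # => 36
```
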
punycode_delimiter?(codepoint)
# File lib/addressable/idna/pure.rb, line 461
def self.punycode_delimiter?(codepoint)
  codepoint == PUNYCODE_DELIMITER
end
punycode_encode(value)
# File lib/addressable/idna/native.rb, line 24
def self.punycode_encode(value)
  IDN::Punycode.encode(value.to_s)
end
punycode_encode_digit(d)
# File lib/addressable/idna/pure.rb, line 466
def self.punycode_encode_digit(d)
  d + 22 + 75 * ((d < 26) ? 1 : 0)
end
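
The arithmetic above maps digit values 0..25 to the lowercase letters and 26..35 to the decimal digits. A standalone copy of the expression, under the hypothetical name encode_digit, shows the endpoints:

```ruby
def encode_digit(d)
  # 0..25 => 97..122 ("a".."z"), 26..35 => 48..57 ("0".."9")
  d + 22 + 75 * (d < 26 ? 1 : 0)
end

encode_digit(0).chr  # => "a"
encode_digit(25).chr # => "z"
encode_digit(26).chr # => "0"
encode_digit(35).chr # => "9"
```
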
unicode_downcase(input)

Unicode aware downcase method.

@api private

@param [String] input The input string.

@return [String] The downcased result.

# File lib/addressable/idna/pure.rb, line 132
def self.unicode_downcase(input)
  input = input.to_s unless input.is_a?(String)
  unpacked = input.unpack("U*")
  unpacked.map! { |codepoint| lookup_unicode_lowercase(codepoint) }
  return unpacked.pack("U*")
end
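
The unpack, map, pack pattern used here can be tried in isolation. This sketch replaces the module's table lookup with a hypothetical three-entry mapping (SAMPLE_LOWERCASE), so it only illustrates the technique and does not reproduce the real Unicode data.

```ruby
# Hypothetical lowercase mapping covering three codepoints only.
SAMPLE_LOWERCASE = {
  0x41  => 0x61,  # LATIN CAPITAL LETTER A => a
  0xC9  => 0xE9,  # LATIN CAPITAL LETTER E WITH ACUTE => lowercase form
  0x3A9 => 0x3C9  # GREEK CAPITAL LETTER OMEGA => lowercase form
}.freeze

def sample_downcase(input)
  # Split into codepoints, map each one, and reassemble as UTF-8.
  codepoints = input.to_s.unpack("U*")
  codepoints.map { |cp| SAMPLE_LOWERCASE.fetch(cp, cp) }.pack("U*")
end

sample_downcase("\u00C9A\u03A9!") # => "\u00E9a\u03C9!"
```

Codepoints without a mapping, such as "!", are passed through unchanged, which mirrors the fallback in lookup_unicode_lowercase.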