module HumanReadable

Human-readable random tokens without ambiguous characters, with optional emoji support

Copyright 2020 Mack Earnhardt

Constants

Configuration

HumanReadable configuration

SKIN_TONE_REGEXP
VERSION

Gem version

Public Class Methods

charset()

Characters available for token generation

DEFAULT: Digits 0-9 and uppercase letters A-Z except for ILOU

@note Manipulate via {#configure}
@return [Array] of available characters

# File lib/human_readable.rb, line 104
def charset
  @charset ||=
    begin
      array = (
        ('0'..'9').to_a +
        ('A'..'Z').to_a +
        extend_chars -
        exclude_chars -
        validation_hash.keys +
        validation_hash.values
      )
      array.uniq!
      array.sort!
    end
end
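
As a rough illustration of the default described above, the following standalone sketch rebuilds the character set without loading the gem. The variable names are assumptions chosen for this example, and the substitution hash is written in its already-flattened string form.

```ruby
# Standalone sketch of the default character set; it does not load the gem.
digits_and_letters = ('0'..'9').to_a + ('A'..'Z').to_a

# Default substitutions, written here as plain strings for simplicity.
substitutions = { 'I' => '1', 'L' => '1', 'O' => '0', 'U' => 'V' }

# Keys are removed from the set; values are added back (already present here).
default_charset = (digits_and_letters - substitutions.keys + substitutions.values).uniq.sort

puts default_charset.join # prints 0123456789ABCDEFGHJKMNPQRSTVWXYZ
puts default_charset.size # prints 32
```

With the defaults, the set has exactly 32 entries, so each character carries 5 bits.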

configure() { |configuration| ... }

Yields block for configuration

HumanReadable.configure do |c|
  c.substitution_hash = { %w[I L] => 1, O: 0, U: :V } # Default
  c.output_size = 10                                  # Default

  # Substitution hash
  c.substitution_hash[:B] = 8
  c.substitution_hash[:U] = nil
  # or equivalently
  c.substitution_hash = { %w[I L] => 1, O: 0, U: nil, B: 8}

  # Extend charset
  c.extend_chars = %w[~ ! @ $]

  # Exclude charset
  c.exclude_chars = %w[X Y Z]

  # Supports Emoji!!
  c.extend_chars << %w[⛰️ 🧻 ✂️ 🦎 🖖]
  c.substitution_hash['🖤'] = '❤️'

  # And understands skin tones
  c.remove_skin_tones = false                         # Default
  c.substitution_hash[%w[👍🏻 👍🏼 👍🏽 👍🏾 👍🏿]] = '👍'
  # -or-
  c.remove_skin_tones = true
  c.extend_chars << '👍'
end

Keys in the substitution hash are never used during generation. During validation, each key is replaced by its value, increasing the likelihood that a misread character can be restored. Extend or replace the substitutions to alter the character set. For convenience, the hash accepts digits and symbols, which are converted to strings when used.

@note Changing substitution_hash keys alters the check character, invalidating previous tokens.
@return [nil]

# File lib/human_readable.rb, line 52
def configure
  yield(configuration)
  nil
end
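
To make the handling of mixed key and value types concrete, here is a minimal standalone sketch of how a substitution hash could be flattened into string pairs. `flatten_substitutions` is a hypothetical helper name used only for this illustration; it is not part of the gem's API.

```ruby
# Standalone sketch; does not load the gem.
# `flatten_substitutions` is a hypothetical name for illustration only.
def flatten_substitutions(hash)
  hash.each_with_object({}) do |(keys, value), flat|
    next if value.nil? # a nil value excludes the key without a replacement

    # Array() lets a single key and an array of keys share one code path.
    Array(keys).each { |key| flat[key.to_s.upcase] = value.to_s.upcase }
  end
end

defaults = { %w[I L] => 1, O: 0, U: :V }
p flatten_substitutions(defaults)
```

Array keys map several look-alike characters to one replacement, while symbols and integers are converted to uppercase strings.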

generate(output_size: configuration.output_size)

Generates a random token of the requested size

@note Minimum size is 2 since the last character is a check character
@param output_size [Integer] desired number of printable characters
@return [String] random token with check character

# File lib/human_readable.rb, line 62
def generate(output_size: configuration.output_size)
  raise(MinSizeTwo) if output_size < 2

  "#{token = generate_random(output_size - 1)}#{check_character(token)}"
end

reset()

Reset configuration and memoizations

@return [Array] list of variables reset

# File lib/human_readable.rb, line 123
def reset
  instance_variables.each { |sym| remove_instance_variable(sym) }
end

valid_token?(input)

Clean and validate a candidate token

  • Upcases

  • Applies substitutions

  • Removes characters not in the available character set

  • Validates the check character

@param input [String] the candidate token
@return [String, nil] possibly modified token if valid, else nil

# File lib/human_readable.rb, line 77
def valid_token?(input)
  return unless input.is_a?(String)

  codepoints =
    input.upcase.each_grapheme_cluster.map do |c|
      c.gsub!(SKIN_TONE_REGEXP, '') if configuration.remove_skin_tones
      charset_hash[validation_hash[c] || c]
    end
  codepoints.compact!

  return if codepoints.size < 2

  array =
    codepoints.reverse.each_with_index.map do |codepoint, i|
      codepoint *= 2 if i.odd?
      codepoint / charset_size + codepoint % charset_size
    end

  codepoints.map { |codepoint| charset[codepoint] }.join if (array.sum % charset_size).zero?
end
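
The cleaning and validation steps listed above can be sketched in a standalone form, assuming the default 32-character set and default substitutions. The constant and method names are assumptions made for this example, and it iterates over plain characters rather than grapheme clusters, so it ignores emoji and skin tones.

```ruby
# Standalone sketch of the validation steps; does not load the gem.
# Assumes the default character set and substitutions.
CHARSET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'.chars.freeze
INDEX = CHARSET.each_with_index.to_h.freeze
SUBSTITUTIONS = { 'I' => '1', 'L' => '1', 'O' => '0', 'U' => 'V' }.freeze

# `clean_and_validate` is a hypothetical name for illustration only.
def clean_and_validate(input)
  return unless input.is_a?(String)

  # Upcase, substitute look-alikes, and drop anything outside the set.
  values = input.upcase.each_char.map { |c| INDEX[SUBSTITUTIONS[c] || c] }.compact
  return if values.size < 2

  n = CHARSET.size
  # Luhn mod N: starting from the check character, double every second value.
  terms = values.reverse.each_with_index.map do |value, i|
    value *= 2 if i.odd?
    value / n + value % n
  end

  values.map { |value| CHARSET[value] }.join if (terms.sum % n).zero?
end

p clean_and_validate('A1M')   # => "A1M"
p clean_and_validate('a-i m') # => "A1M"
p clean_and_validate('A2M')   # => nil
```

The second call shows a lowercase, misread, and punctuated input being restored to the canonical token before the check character is verified.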

Private Class Methods

byte_multiplier()
# File lib/human_readable.rb, line 206
def byte_multiplier
  @byte_multiplier ||=
    begin
      bit_multiplier = char_bits / 8.0
      # The extra 1.1 helps performance due to randomness of misses
      miss_percentage = 2**char_bits * 1.0 / charset_size * 1.1
      bit_multiplier * miss_percentage
    end
end
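
A worked example may help with the arithmetic above. The sketch below repeats the same formula for two character set sizes; `sampling_estimate` and its hash keys are hypothetical names introduced only for this illustration.

```ruby
# Standalone sketch of the byte estimate; does not load the gem.
# `sampling_estimate` is a hypothetical name for illustration only.
def sampling_estimate(charset_size, token_length)
  # Bits needed to index the largest position in the character set.
  char_bits = (charset_size - 1).to_s(2).size
  bit_multiplier = char_bits / 8.0
  # Ratio of possible bit patterns to usable ones, padded by 10 percent.
  miss_factor = 2**char_bits * 1.0 / charset_size * 1.1
  byte_multiplier = bit_multiplier * miss_factor

  {
    char_bits: char_bits,
    byte_multiplier: byte_multiplier,
    bytes_requested: (token_length * byte_multiplier).ceil
  }
end

p sampling_estimate(32, 9) # 5 bits per character, 7 bytes requested
p sampling_estimate(36, 9) # 6 bits per character, 14 bytes requested
```

With 32 characters every 5-bit pattern is usable, so only the 10 percent padding applies. With 36 characters, 28 of the 64 possible 6-bit patterns are rejected, so roughly twice as many bytes are requested up front.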

char_bits()
# File lib/human_readable.rb, line 198
def char_bits
  @char_bits ||= (charset_size - 1).to_s(2).size
end

char_cleanup(array)
# File lib/human_readable.rb, line 220
def char_cleanup(array)
  array.compact!
  array.flatten!
  array.map!(&:to_s)
  array.map! { |element| element.gsub(SKIN_TONE_REGEXP, '') } if configuration.remove_skin_tones
  array.map!(&:upcase)
end

charset_hash()
# File lib/human_readable.rb, line 251
def charset_hash
  @charset_hash ||= Hash[charset.each_with_index.map { |char, i| [char, i] }]
end

charset_size()
# File lib/human_readable.rb, line 216
def charset_size
  @charset_size ||= charset.size
end

check_character(input)

Compute check character using Luhn mod N algorithm

CAUTION: Changing charset alters the output

# File lib/human_readable.rb, line 185
def check_character(input)
  array =
    input.each_grapheme_cluster.to_a.reverse.each_with_index.map do |c, i|
      d = charset_hash[c]
      d *= 2 if i.even?
      d / charset_size + d % charset_size
    end

  mod = (charset_size - array.sum % charset_size) % charset_size

  charset[mod]
end
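
For a concrete feel of the Luhn mod N calculation, here is a standalone sketch that assumes the default 32-character set. `ALPHABET`, `POSITION`, and `luhn_check_character` are hypothetical names for this example only.

```ruby
# Standalone sketch of a Luhn mod N check character; does not load the gem.
# Assumes the default 32-character set.
ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'.chars.freeze
POSITION = ALPHABET.each_with_index.to_h.freeze

# `luhn_check_character` is a hypothetical name for illustration only.
def luhn_check_character(token)
  n = ALPHABET.size
  terms = token.each_char.to_a.reverse.each_with_index.map do |char, i|
    value = POSITION.fetch(char)
    value *= 2 if i.even? # double every second value, starting at the right
    value / n + value % n # fold the doubled value back into one base-n term
  end

  ALPHABET[(n - terms.sum % n) % n]
end

puts luhn_check_character('A1') # prints M
```

For the token `A1`, the rightmost character `1` is doubled to 2 and `A` contributes 10, giving a sum of 12. The check value is 32 - 12 = 20, which is `M` in this character set, so the full token is `A1M`.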

configuration()
# File lib/human_readable.rb, line 140
def configuration
  @configuration ||= Configuration.new(
    { %w[I L] => 1, O: 0, U: :V },
    [],
    [],
    10,
    false
  )
end

exclude_chars()
# File lib/human_readable.rb, line 232
def exclude_chars
  @exclude_chars ||= char_cleanup(
    configuration.exclude_chars + configuration.substitution_hash.each.map { |k, v| k if v.nil? }
  )
end

extend_chars()
# File lib/human_readable.rb, line 228
def extend_chars
  @extend_chars ||= char_cleanup(configuration.extend_chars)
end

generate_random(random_size)

Generates a random string of the requested length from the charset

We could use one of the routines below instead, but the first increases the chances of token collisions and the second is too slow.

Array.new(random_size) { charset.sample }
# or
Array.new(random_size) { charset.sample(random: SecureRandom) }

Instead we attempt to optimize the number of bytes generated with each call to SecureRandom.

# File lib/human_readable.rb, line 161
def generate_random(random_size)
  codepoints = []

  while codepoints.size < random_size
    bytes_needed = ((random_size - codepoints.size) * byte_multiplier).ceil

    codepoints +=
      begin
        array =
          SecureRandom
          .random_bytes(bytes_needed)
          .unpack1('B*')
          .scan(scan_regexp)
          .map! { |bin_string| bin_string.to_i(2) }
        array.select { |codepoint| codepoint < charset_size }
      end
  end

  codepoints[0, random_size].map { |codepoint| charset[codepoint] }.join
end
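
The approach above amounts to rejection sampling over fixed-width bit chunks. The following standalone sketch illustrates the idea with a 36-character set so that some chunks are rejected; `SAMPLE_CHARSET` and `random_string` are hypothetical names, and the byte estimate is simplified relative to the gem's `byte_multiplier`.

```ruby
require 'securerandom'

# Standalone sketch of rejection sampling; does not load the gem.
# 36 characters need 6 bits each, so values 36..63 must be discarded.
SAMPLE_CHARSET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ~!@$'.chars.freeze

# `random_string` is a hypothetical name for illustration only.
def random_string(length, charset = SAMPLE_CHARSET)
  bits = (charset.size - 1).to_s(2).size
  chunk = /.{#{bits}}/
  picks = []

  while picks.size < length
    # Simplified estimate: just enough bytes for the remaining characters.
    bytes = ((length - picks.size) * bits / 8.0).ceil + 1
    values = SecureRandom.random_bytes(bytes).unpack1('B*').scan(chunk).map { |b| b.to_i(2) }
    # Discard out-of-range values instead of wrapping them, which avoids bias.
    picks += values.select { |value| value < charset.size }
  end

  picks.first(length).map { |value| charset[value] }.join
end

puts random_string(12)
```

Rejecting out-of-range values, rather than reducing them modulo the set size, keeps every character equally likely at the cost of occasionally requesting more bytes.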

scan_regexp()
# File lib/human_readable.rb, line 202
def scan_regexp
  @scan_regexp ||= /.{#{char_bits}}/
end

validation_hash()

Flattened version of substitution_hash

# File lib/human_readable.rb, line 239
def validation_hash
  @validation_hash ||=
    begin
      array =
        configuration.substitution_hash.map do |k, v|
          (k.is_a?(Array) ? k.map { |k1| [k1, v] } : [k, v]) unless v.nil?
        end
      array = char_cleanup(array)
      Hash[*array]
    end
end