class TwitterCldr::Utils::RegexpAst::CharacterSet

Attributes

members[R]
negated[R]
negated?[R]

Public Class Methods

from_parser_node(node, expressions)
# File lib/twitter_cldr/utils/regexp_ast.rb, line 46
def self.from_parser_node(node, expressions)
  new(
    expressions, Quantifier.from_parser_node(node),
    fix_members(node.members), node.negative?
  )
end
new(expressions, quantifier, members, negated)
# File lib/twitter_cldr/utils/regexp_ast.rb, line 41
def initialize(expressions, quantifier, members, negated)
  @members = members; @negated = negated
  super(expressions, quantifier)
end

Private Class Methods

fix_members(members)

CLDR occasionally uses \d and other escapes (such as \w) inside character classes to signify 0-9 and friends. This is legal regex syntax, but the regexp_parser gem doesn't handle it correctly, so the members are repaired here: \d is expanded to 0-9, and \w to A-Z, a-z, 0-9 and _.

# File lib/twitter_cldr/utils/regexp_ast.rb, line 59
def self.fix_members(members)
  members.join.scan(/(\\[wd]|\w-\w|\w|-)/).to_a.flatten.inject([]) do |ret, member|
    case member
      when '\d' then ret << '0-9'
      when '\w' then ret += ['A-Z', 'a-z', '0-9', '_']
      else ret << member
    end

    ret
  end
end
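
The expansion can be tried outside the library with a small standalone sketch. The method name expand_members is hypothetical and not part of TwitterCldr; it is assumed here to mirror the scan-and-expand logic of fix_members shown above, using each_with_object instead of inject.

```ruby
# Standalone sketch of the member repair; expand_members is a hypothetical
# name used for illustration only, not a TwitterCldr method.
def expand_members(members)
  # Split the joined members into tokens: \w or \d escapes, ranges such as
  # a-z, single word characters, or a literal hyphen.
  tokens = members.join.scan(/\\[wd]|\w-\w|\w|-/)

  tokens.each_with_object([]) do |member, ret|
    case member
    when '\d' then ret << '0-9'                           # digit escape
    when '\w' then ret.concat(['A-Z', 'a-z', '0-9', '_']) # word escape
    else ret << member                                    # already literal
    end
  end
end

p expand_members(['\d', 'a-f'])
p expand_members(['\w', '-'])
```

Because the tokens come from scan, any character that matches none of the alternatives (for example a period) is silently dropped rather than passed through.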