class Oga::XPath::Lexer

Lexer for turning XPath expressions into a set of tokens. Tokens are returned as arrays with every array having two values:

  1. The token type as a symbol

  2. The token value or nil if there is no value

Basic usage of this lexer is as follows:

lexer  = Oga::XPath::Lexer.new('//foo/bar')
tokens = lexer.lex

Alternatively you can stream tokens instead of returning them as a whole:

lexer = Oga::XPath::Lexer.new('//foo/bar')

lexer.advance do |type, value|
  # handle each token here; value may be nil
end

Unlike the XML lexer, the XPath lexer does not support IO instances; it can only lex strings.

## Thread Safety

This class keeps track of an internal state. As a result it's not safe to share a single instance between multiple threads. However, you're free to use separate instances per thread as there is no global (= class level) shared state.
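The one-instance-per-thread pattern described above could look roughly like the sketch below. `ToyLexer` is a hypothetical stand-in written for this illustration, not the real `Oga::XPath::Lexer`; it only mimics the relevant property of keeping state in instance variables.

```ruby
# Hypothetical stand-in for a stateful lexer (not part of Oga).
class ToyLexer
  def initialize(data)
    @data     = data
    @position = 0 # per-instance state, which is why instances must not be shared
  end

  # Returns [type, value] pairs for every non-empty path segment.
  def lex
    tokens = []
    parts  = @data.split('/').reject(&:empty?)

    while @position < parts.length
      tokens << [:T_IDENT, parts[@position]]
      @position += 1
    end

    tokens
  end
end

expressions = ['//foo/bar', '/a/b/c']

# Each thread builds its own lexer, so no state is shared between threads.
threads = expressions.map do |expression|
  Thread.new { ToyLexer.new(expression).lex }
end

# Thread#value waits for the thread and returns the result of its block.
results = threads.map(&:value)
```

Because every thread owns its lexer, no locking is needed around the calls to `lex`.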

@api private

Constants

AXIS_EMIT_EXTRA_SLASH

Axes that require an extra T_SLASH token to be emitted.

@return [Array]

AXIS_EMIT_NODE

Axes that require a separate `node()` call to be emitted.

@return [Array]

AXIS_MAPPING

Maps certain XPath axes written in their short form to their long form equivalents.

@return [Hash]
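The way these three constants interact can be sketched as follows. The constant values shown here are assumptions made for illustration (this page documents the constants but does not list their contents), and `AxisSketch.expand` is a hypothetical helper that mirrors the short-axis branch in the source of `advance`.

```ruby
module AxisSketch
  # Assumed values; check the actual constants before relying on them.
  AXIS_MAPPING = {
    '@'  => 'attribute',
    '//' => 'descendant-or-self',
    '..' => 'parent',
    '.'  => 'self'
  }.freeze

  AXIS_EMIT_NODE        = %w[descendant-or-self parent self].freeze
  AXIS_EMIT_EXTRA_SLASH = %w[descendant-or-self].freeze

  # Hypothetical helper: returns the tokens produced for one short axis.
  # `at_end` plays the role of the `te != eof` check in the lexer.
  def self.expand(short, at_end)
    value  = AXIS_MAPPING[short]
    tokens = [[:T_AXIS, value]]

    if AXIS_EMIT_NODE.include?(value)
      # These axes get an implicit node() type test.
      tokens << [:T_TYPE_TEST, 'node']

      if AXIS_EMIT_EXTRA_SLASH.include?(value) && !at_end
        tokens << [:T_SLASH, nil]
      end
    end

    tokens
  end
end

double_slash = AxisSketch.expand('//', false)
attribute    = AxisSketch.expand('@', false)
```

Under these assumed values, `//` expands to an axis, a `node()` test and a slash, while `@` expands to the axis token only.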

Attributes

_xpath_lexer_eof_trans[RW]
_xpath_lexer_from_state_actions[RW]
_xpath_lexer_index_offsets[RW]
_xpath_lexer_indicies[RW]
_xpath_lexer_key_spans[RW]
_xpath_lexer_to_state_actions[RW]
_xpath_lexer_trans_actions[RW]
_xpath_lexer_trans_keys[RW]
_xpath_lexer_trans_targs[RW]
xpath_lexer_en_main[RW]
xpath_lexer_error[RW]
xpath_lexer_first_final[RW]
xpath_lexer_start[RW]

Public Class Methods

new(data)

@param [String] data The data to lex.

# File lib/oga/xpath/lexer.rb, line 1991
def initialize(data)
  @data = data
end

Public Instance Methods

advance(&block)

Advances through the input and generates the corresponding tokens. Each token is yielded to the supplied block.

Each token is an Array in the following format:

[TYPE, VALUE]

The type is a symbol, the value is either nil or a String.

This method stores the supplied block in `@block` and resets it after the lexer loop has finished.
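The store-then-reset behaviour can be illustrated with a minimal sketch. `StreamingSketch` is a hypothetical class, not the generated lexer; it only shows how a method-level `ensure` clears `@block` even when the caller's block raises.

```ruby
# Hypothetical class demonstrating the @block handling described above.
class StreamingSketch
  def initialize(items)
    @items = items
  end

  def advance(&block)
    @block = block

    @items.each { |item| add_token(:T_IDENT, item) }
  ensure
    # Runs on normal completion and when an exception propagates.
    @block = nil
  end

  # Helper for inspection only; true while a block is stored.
  def block_set?
    !@block.nil?
  end

  private

  # Yields one token to the stored block.
  def add_token(type, value = nil)
    @block.call(type, value)
  end
end

sketch = StreamingSketch.new(%w[foo bar])
seen   = []

sketch.advance { |type, value| seen << [type, value] }
```

Resetting the block in `ensure` means a failed run does not leave a stale block behind for the next call.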

@see [#add_token]

# File lib/oga/xpath/lexer.rb, line 2022
      def advance(&block)
        @block = block

        data  = @data # saves ivar lookups while lexing.
        ts    = nil
        te    = nil
        stack = []
        top   = 0
        cs    = self.class.xpath_lexer_start
        act   = 0
        eof   = @data.bytesize
        p     = 0
        pe    = eof

        _xpath_lexer_eof_trans          = self.class.send(:_xpath_lexer_eof_trans)
        _xpath_lexer_from_state_actions = self.class.send(:_xpath_lexer_from_state_actions)
        _xpath_lexer_index_offsets      = self.class.send(:_xpath_lexer_index_offsets)
        _xpath_lexer_indicies           = self.class.send(:_xpath_lexer_indicies)
        _xpath_lexer_key_spans          = self.class.send(:_xpath_lexer_key_spans)
        _xpath_lexer_to_state_actions   = self.class.send(:_xpath_lexer_to_state_actions)
        _xpath_lexer_trans_actions      = self.class.send(:_xpath_lexer_trans_actions)
        _xpath_lexer_trans_keys         = self.class.send(:_xpath_lexer_trans_keys)
        _xpath_lexer_trans_targs        = self.class.send(:_xpath_lexer_trans_targs)

        
# line 2048 "lib/oga/xpath/lexer.rb"
begin
        testEof = false
        _slen, _trans, _keys, _inds, _acts, _nacts = nil
        _goto_level = 0
        _resume = 10
        _eof_trans = 15
        _again = 20
        _test_eof = 30
        _out = 40
        while true
        if _goto_level <= 0
        if p == pe
                _goto_level = _test_eof
                next
        end
        if cs == 0
                _goto_level = _out
                next
        end
        end
        if _goto_level <= _resume
        case _xpath_lexer_from_state_actions[cs] 
        when 11 then
# line 1 "NONE"
                begin
ts = p
                end
# line 2076 "lib/oga/xpath/lexer.rb"
        end
        _keys = cs << 1
        _inds = _xpath_lexer_index_offsets[cs]
        _slen = _xpath_lexer_key_spans[cs]
        _wide = ( (data.getbyte(p) || 0))
        _trans = if (   _slen > 0 && 
                        _xpath_lexer_trans_keys[_keys] <= _wide && 
                        _wide <= _xpath_lexer_trans_keys[_keys + 1] 
                    ) then
                        _xpath_lexer_indicies[ _inds + _wide - _xpath_lexer_trans_keys[_keys] ] 
                 else 
                        _xpath_lexer_indicies[ _inds + _slen ]
                 end
        end
        if _goto_level <= _eof_trans
        cs = _xpath_lexer_trans_targs[_trans]
        if _xpath_lexer_trans_actions[_trans] != 0
        case _xpath_lexer_trans_actions[_trans]
        when 19 then
# line 166 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_SLASH)            end
        when 35 then
# line 279 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_ADD)              end
        when 13 then
# line 1 "NONE"
                begin
te = p+1
                end
        when 12 then
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
                end
        when 9 then
# line 332 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
 begin 
          emit(:T_TYPE_TEST, ts, te - 2)
         end
                end
        when 3 then
# line 344 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
 begin 
          emit(:T_VAR, ts + 1, te)
         end
                end
        when 2 then
# line 223 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
 begin 
          emit(:T_STRING, ts + 1, te - 1)
         end
                end
        when 8 then
# line 244 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
 begin 
          emit(:T_AXIS, ts, te - 2)
         end
                end
        when 21 then
# line 255 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
 begin 
          value = AXIS_MAPPING[slice_input(ts, te)]

          add_token(:T_AXIS, value)

          # Short axes that use node() as their default, implicit test. This is
          # added on lexer level to make it easier to handle these cases on
          # parser/evaluator level.
          if AXIS_EMIT_NODE.include?(value)
            add_token(:T_TYPE_TEST, 'node')

            if AXIS_EMIT_EXTRA_SLASH.include?(value) and te != eof
              add_token(:T_SLASH)
            end
          end
         end
                end
        when 16 then
# line 185 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
 begin 
          emit(:T_IDENT, ts, te)
         end
                end
        when 25 then
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 33 then
# line 344 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1; begin 
          emit(:T_VAR, ts + 1, te)
         end
                end
        when 37 then
# line 199 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1; begin 
          value = slice_input(ts, te).to_i

          add_token(:T_INT, value)
         end
                end
        when 38 then
# line 205 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1; begin 
          value = slice_input(ts, te).to_f

          add_token(:T_FLOAT, value)
         end
                end
        when 39 then
# line 255 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1; begin 
          value = AXIS_MAPPING[slice_input(ts, te)]

          add_token(:T_AXIS, value)

          # Short axes that use node() as their default, implicit test. This is
          # added on lexer level to make it easier to handle these cases on
          # parser/evaluator level.
          if AXIS_EMIT_NODE.include?(value)
            add_token(:T_TYPE_TEST, 'node')

            if AXIS_EMIT_EXTRA_SLASH.include?(value) and te != eof
              add_token(:T_SLASH)
            end
          end
         end
                end
        when 24 then
# line 185 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1; begin 
          emit(:T_IDENT, ts, te)
         end
                end
        when 1 then
# line 350 "lib/oga/xpath/lexer.rl"
                begin
 begin p = ((te))-1; end
                end
        when 7 then
# line 185 "lib/oga/xpath/lexer.rl"
                begin
 begin p = ((te))-1; end
 begin 
          emit(:T_IDENT, ts, te)
         end
                end
        when 4 then
# line 1 "NONE"
                begin
        case act
        when 0 then
        begin  begin
                cs = 0
                _goto_level = _again
                next
        end
end
        when 6 then
        begin begin p = ((te))-1; end

          value = slice_input(ts, te).to_i

          add_token(:T_INT, value)
        end
        when 7 then
        begin begin p = ((te))-1; end

          value = slice_input(ts, te).to_f

          add_token(:T_FLOAT, value)
        end
        else
        begin begin p = ((te))-1; end
end
end 
                        end
        when 14 then
# line 167 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_LPAREN)           end
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
                end
        when 15 then
# line 168 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_RPAREN)           end
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
                end
        when 18 then
# line 169 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_COMMA)            end
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
                end
        when 20 then
# line 170 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_COLON)            end
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
                end
        when 22 then
# line 171 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_LBRACK)           end
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
                end
        when 23 then
# line 172 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_RBRACK)           end
# line 350 "lib/oga/xpath/lexer.rl"
                begin
te = p+1
                end
        when 45 then
# line 278 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_PIPE)             end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 34 then
# line 279 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_ADD)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 42 then
# line 280 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_EQ)               end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 32 then
# line 281 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_NEQ)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 40 then
# line 282 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_LT)               end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 43 then
# line 283 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_GT)               end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 41 then
# line 284 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_LTE)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 44 then
# line 285 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_GTE)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 28 then
# line 295 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_AND)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 31 then
# line 296 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_OR)               end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 29 then
# line 297 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_DIV)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 30 then
# line 298 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_MOD)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 26 then
# line 299 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_MUL)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 27 then
# line 300 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_SUB)              end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
te = p
p = p - 1;              end
        when 17 then
# line 1 "NONE"
                begin
te = p+1
                end
# line 349 "lib/oga/xpath/lexer.rl"
                begin
act = 1;                end
        when 5 then
# line 1 "NONE"
                begin
te = p+1
                end
# line 199 "lib/oga/xpath/lexer.rl"
                begin
act = 6;                end
        when 6 then
# line 1 "NONE"
                begin
te = p+1
                end
# line 205 "lib/oga/xpath/lexer.rl"
                begin
act = 7;                end
        when 36 then
# line 1 "NONE"
                begin
te = p+1
                end
# line 279 "lib/oga/xpath/lexer.rl"
                begin
 add_token(:T_ADD)              end
# line 199 "lib/oga/xpath/lexer.rl"
                begin
act = 6;                end
# line 2474 "lib/oga/xpath/lexer.rb"
        end
        end
        end
        if _goto_level <= _again
        case _xpath_lexer_to_state_actions[cs] 
        when 10 then
# line 1 "NONE"
                begin
ts = nil;               end
# line 1 "NONE"
                begin
act = 0
                end
# line 2488 "lib/oga/xpath/lexer.rb"
        end

        if cs == 0
                _goto_level = _out
                next
        end
        p += 1
        if p != pe
                _goto_level = _resume
                next
        end
        end
        if _goto_level <= _test_eof
        if p == eof
        if _xpath_lexer_eof_trans[cs] > 0
                _trans = _xpath_lexer_eof_trans[cs] - 1;
                _goto_level = _eof_trans
                next;
        end
        end

        end
        if _goto_level <= _out
                break
        end
end
        end

# line 118 "lib/oga/xpath/lexer.rl"

        # % fix highlight
      ensure
        @block = nil
      end

lex()

Gathers all the tokens for the input and returns them as an Array.

@see [#advance]
@return [Array]

# File lib/oga/xpath/lexer.rb, line 1999
def lex
  tokens = []

  advance do |type, value|
    tokens << [type, value]
  end

  return tokens
end

Private Instance Methods

add_token(type, value = nil)

Yields a new token to the supplied block.

@param [Symbol] type The token type.
@param [String] value The token value.

@yieldparam [Symbol] type
@yieldparam [String|NilClass] value

# File lib/oga/xpath/lexer.rb, line 2556
def add_token(type, value = nil)
  @block.call(type, value)
end

emit(type, start, stop)

Emits a token of which the value is based on the supplied start/stop position.

@param [Symbol] type The token type.
@param [Fixnum] start
@param [Fixnum] stop

@see [#slice_input]
@see [#add_token]

# File lib/oga/xpath/lexer.rb, line 2534
def emit(type, start, stop)
  value = slice_input(start, stop)

  add_token(type, value)
end

slice_input(start, stop)

Returns the text between the specified start and stop position.

@param [Fixnum] start
@param [Fixnum] stop
@return [String]

# File lib/oga/xpath/lexer.rb, line 2545
def slice_input(start, stop)
  return @data.byteslice(start, stop - start)
end
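
Since `advance` tracks positions with `getbyte` and `bytesize`, the start and stop values are byte offsets, which is presumably why `byteslice` is used here rather than character indexing. A small sketch of the difference, using a hypothetical `slice_bytes` helper and a made-up input string:

```ruby
# Hypothetical helper with the same arithmetic as slice_input:
# `stop` is exclusive, so the length is stop - start.
def slice_bytes(data, start, stop)
  data.byteslice(start, stop - start)
end

data = "\u00e9/foo" # the accented "e" takes two bytes in UTF-8

byte_length = data.bytesize                 # counts bytes, not characters
by_bytes    = slice_bytes(data, 3, 6)       # bytes 3..5
by_chars    = data[3, 3]                    # characters from index 3 onwards
```

With multi-byte input the two approaches diverge: the byte-based slice yields `"foo"`, whereas the character-based slice starts one character later and yields `"oo"`.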