class ANTLR3::CommonTokenStream
CommonTokenStream serves as the primary token stream implementation for feeding sequential token input into parsers.
Using some TokenSource (such as a lexer), the stream collects a token sequence, setting each token’s index attribute to indicate the token’s position within the stream. The stream may be tuned to some channel value; off-channel tokens will be filtered out by the peek, look, and consume methods.
Sample Usage
    source_input = ANTLR3::StringStream.new("35 * 4 - 1")
    lexer = Calculator::Lexer.new(source_input)
    tokens = ANTLR3::CommonTokenStream.new(lexer)

    # assume this grammar defines whitespace as tokens on channel HIDDEN
    # and numbers and operations as tokens on channel DEFAULT
    tokens.look         # => 0 INT['35'] @ line 1 col 0 (0..1)
    tokens.look(2)      # => 2 MULT["*"] @ line 1 col 3 (3..3)
    tokens.tokens(0, 2)
    # => [0 INT["35"] @ line 1 col 0 (0..1),
    #     1 WS[" "] @ line 1 col 2 (2..2),
    #     2 MULT["*"] @ line 1 col 3 (3..3)]
    # notice the #tokens method does not filter off-channel tokens

    lexer.reset
    hidden_tokens =
      ANTLR3::CommonTokenStream.new(lexer, :channel => ANTLR3::HIDDEN)
    hidden_tokens.look  # => 1 WS[' '] @ line 1 col 2 (2..2)
Public Class Methods
constructs a new token stream using the token_source provided. token_source is usually a lexer, but can be any object that implements next_token and includes ANTLR3::TokenSource.
If a block is provided, each token harvested will be yielded, and if the block returns a nil or false value, the token will not be added to the stream; it will be discarded.
Options
- :channel
  The channel value the stream should be tuned to initially
- :source_name
  The source name (file name) attribute of the stream
Example
    # create a new token stream that is tuned to channel :comment, and
    # discard all WHITE_SPACE tokens
    ANTLR3::CommonTokenStream.new(lexer, :channel => :comment) do |token|
      token.name != 'WHITE_SPACE'
    end
    # File lib/antlr3/streams.rb, line 780
    def initialize( token_source, options = {} )
      case token_source
      when CommonTokenStream
        # this is useful in cases where you want to convert a CommonTokenStream
        # to a RewriteTokenStream or other variation of the standard token stream
        stream = token_source
        @token_source = stream.token_source
        @channel = options.fetch( :channel ) { stream.channel or DEFAULT_CHANNEL }
        @source_name = options.fetch( :source_name ) { stream.source_name }
        tokens = stream.tokens.map { | t | t.dup }
      else
        @token_source = token_source
        @channel = options.fetch( :channel, DEFAULT_CHANNEL )
        @source_name = options.fetch( :source_name ) { @token_source.source_name rescue nil }
        tokens = @token_source.to_a
      end
      @last_marker = nil
      @tokens = block_given? ? tokens.select { | t | yield( t, self ) } : tokens
      @tokens.each_with_index { |t, i| t.index = i }
      @position =
        if first_token = @tokens.find { |t| t.channel == @channel }
          @tokens.index( first_token )
        else @tokens.length
        end
    end
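The constructor does three things with the harvested tokens: it filters them through the optional block, numbers the survivors, and parks the stream on the first on-channel token. A minimal standalone sketch of those steps follows; `SketchToken` and `harvest` are hypothetical names invented for illustration and are not part of the ANTLR3 runtime.

```ruby
# Hypothetical stand-in for an ANTLR3 token; not the library's class.
SketchToken = Struct.new(:name, :text, :channel, :index)

# Mirrors the three steps in #initialize: filter with the optional block,
# number the surviving tokens, then locate the first on-channel token.
def harvest(raw_tokens, channel)
  kept = block_given? ? raw_tokens.select { |t| yield(t) } : raw_tokens.dup
  kept.each_with_index { |t, i| t.index = i }
  position = kept.index { |t| t.channel == channel } || kept.length
  [kept, position]
end

raw = [
  SketchToken.new('WHITE_SPACE', ' ',  :hidden),
  SketchToken.new('INT',         '35', :default),
  SketchToken.new('WHITE_SPACE', ' ',  :hidden),
  SketchToken.new('MULT',        '*',  :default)
]

# Without a block every token is kept, so the stream starts at index 1.
all_tokens, all_position = harvest(raw.map(&:dup), :default)

# With a block, rejected tokens never enter the buffer and the
# remaining tokens are numbered from zero.
filtered, filtered_position =
  harvest(raw.map(&:dup), :default) { |t| t.name != 'WHITE_SPACE' }
```

Note that a token discarded by the block is gone for good: it receives no index and cannot be reached later by retuning the channel.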
Public Instance Methods
    # File lib/antlr3/streams.rb, line 938
    def << k
      self >> -k
    end
identical to Array#[], as applied to the stream’s token buffer
    # File lib/antlr3/streams.rb, line 1064
    def []( i, *args )
      @tokens[ i, *args ]
    end

    # File lib/antlr3/streams.rb, line 1057
    def at( i )
      @tokens.at i
    end
advance the stream one step to the next on-channel token
    # File lib/antlr3/streams.rb, line 901
    def consume
      token = @tokens[ @position ] || EOF_TOKEN
      if @position < @tokens.length
        @position = future?( 2 ) || @tokens.length
      end
      return( token )
    end
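To make the skipping behaviour concrete, here is a small self-contained sketch of a stream whose consume steps over off-channel tokens. `SketchToken` and `SketchStream` are hypothetical stand-ins, and the sketch returns nil at the end of the buffer where the library returns EOF_TOKEN.

```ruby
# Hypothetical stand-in token; only the fields this sketch needs.
SketchToken = Struct.new(:text, :channel)

class SketchStream
  attr_reader :position

  def initialize(tokens, channel)
    @tokens = tokens
    @channel = channel
    @position = tokens.index { |t| t.channel == channel } || tokens.length
  end

  # Return the current token (nil at the end in this sketch) and move
  # the position to the next on-channel token, or to the buffer length
  # when no such token remains.
  def consume
    token = @tokens[@position]
    if @position < @tokens.length
      next_index = ((@position + 1)...@tokens.length).find do |i|
        @tokens[i].channel == @channel
      end
      @position = next_index || @tokens.length
    end
    token
  end
end

stream = SketchStream.new(
  [SketchToken.new('35', :default), SketchToken.new(' ', :hidden),
   SketchToken.new('*', :default),  SketchToken.new(' ', :hidden)],
  :default
)
consumed = 3.times.map { stream.consume }
```

The third call finds nothing left to consume, and the position stays pinned at the buffer length rather than advancing further.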
yields each token in the stream (including off-channel tokens). If no block is provided, the method returns an Enumerator object. each accepts the same arguments as tokens
    # File lib/antlr3/streams.rb, line 996
    def each( *args )
      block_given? or return enum_for( :each, *args )
      tokens( *args ).each { |token| yield( token ) }
    end
yields each token in the stream with the given channel value. If no channel value is given, the stream’s tuned channel value will be used. If no block is given, an enumerator will be returned.
    # File lib/antlr3/streams.rb, line 1007
    def each_on_channel( channel = @channel )
      block_given? or return enum_for( :each_on_channel, channel )
      for token in @tokens
        token.channel == channel and yield( token )
      end
    end
fetches the text content of all tokens between start and stop and joins the chunks into a single string
    # File lib/antlr3/streams.rb, line 1081
    def extract_text( start = 0, stop = @tokens.length - 1 )
      start = start.to_i.at_least( 0 )
      stop = stop.to_i.at_most( @tokens.length )
      @tokens[ start..stop ].map! { |t| t.text }.join( '' )
    end
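The at_least and at_most calls in the listing are not core Ruby methods; they appear to be helper extensions that ship with the library. The sketch below reproduces the same clamping with plain max and min so it runs on its own. `SketchToken` and `extract_text_sketch` are hypothetical names used only for this example.

```ruby
# Hypothetical stand-in token; only the text attribute matters here.
SketchToken = Struct.new(:text)

# Clamp the bounds to the buffer, slice it, and join the token texts.
# Off-channel tokens are included, since the slice is taken from the
# raw buffer.
def extract_text_sketch(tokens, start = 0, stop = tokens.length - 1)
  start = [start.to_i, 0].max
  stop  = [stop.to_i, tokens.length].min
  tokens[start..stop].map(&:text).join
end

buffer = ['35', ' ', '*', ' ', '4'].map { |text| SketchToken.new(text) }

whole  = extract_text_sketch(buffer)        # every token, whitespace included
middle = extract_text_sketch(buffer, 2, 4)  # from '*' through '4'
```

Because whitespace tokens are part of the buffer, the extracted text matches the original input for that span.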
returns the index of the on-channel token at look-ahead position k or nil if no other on-channel tokens exist
    # File lib/antlr3/streams.rb, line 946
    def future?( k = 1 )
      @position == -1 and fill_buffer

      case
      when k == 0 then nil
      when k < 0 then past?( -k )
      when k == 1 then @position
      else
        # since the stream only yields on-channel
        # tokens, the stream can't just go to the
        # next position, but rather must skip
        # over off-channel tokens
        ( k - 1 ).times.inject( @position ) do |cursor, |
          begin
            tk = @tokens.at( cursor += 1 ) or return( cursor )
            # ^- if tk is nil (i.e. i is outside array limits)
          end until tk.channel == @channel
          cursor
        end
      end
    end
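The look-ahead walk can be hard to follow in its inject form. The following sketch expresses the same idea with explicit loops over an array of channel values; `future_index` is a hypothetical helper written for this example, and it simply returns nil for k <= 0 instead of delegating to past? as the library does.

```ruby
# Given the channel of every buffered token, the current position, and a
# look-ahead distance k, return the buffer index of the k-th on-channel
# token, counting the current token as k = 1.
def future_index(channels, position, k, tuned)
  return nil if k <= 0            # the library hands k < 0 to past?
  cursor = position
  (k - 1).times do
    loop do
      cursor += 1
      # Ran off the end of the buffer: report the out-of-range index.
      return cursor if cursor >= channels.length
      break if channels[cursor] == tuned
    end
  end
  cursor
end

# Channels for the tokens of "35 * 4": INT, WS, MULT, WS, INT
channels = [:default, :hidden, :default, :hidden, :default]

first  = future_index(channels, 0, 1, :default)
second = future_index(channels, 0, 2, :default)
third  = future_index(channels, 0, 3, :default)
beyond = future_index(channels, 0, 4, :default)
```

When the look-ahead runs past the last token, the returned index equals the buffer length, which is how look ends up falling back to EOF_TOKEN through its fetch default.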
saves the current stream position, yields to the block, and then ensures the stream’s position is restored before returning the value of the block
    # File lib/antlr3/streams.rb, line 887
    def hold( pos = @position )
      block_given? or return enum_for( :hold, pos )
      begin
        yield
      ensure
        seek( pos )
      end
    end
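Because the restore happens in an ensure clause, the position is put back even when the block raises, which makes hold convenient for speculative look-ahead. A minimal sketch under that reading follows; `SketchCursor` is a hypothetical class, and Integer#clamp stands in for the library's bound helper.

```ruby
# Hypothetical cursor that tracks only a position within a buffer of
# the given length.
class SketchCursor
  attr_reader :position

  def initialize(length)
    @length = length
    @position = 0
  end

  def seek(index)
    @position = index.to_i.clamp(0, @length)
    self
  end

  # Run the block, then restore the saved position no matter how the
  # block exits.
  def hold(pos = @position)
    yield
  ensure
    seek(pos)
  end
end

cursor = SketchCursor.new(9)
cursor.seek(2)

inside = nil
result = cursor.hold do
  cursor.seek(6)
  inside = cursor.position
  :block_value
end

# The position is also restored when the block raises.
after_error =
  begin
    cursor.hold { cursor.seek(8); raise 'speculation failed' }
  rescue RuntimeError
    cursor.position
  end
```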
Standard Conversion Methods
    # File lib/antlr3/streams.rb, line 1069
    def inspect
      string = "#<%p: @token_source=%p @ %p/%p" %
        [ self.class, @token_source.class, @position, @tokens.length ]
      tk = look( -1 ) and string << " #{ tk.inspect } <--"
      tk = look( 1 ) and string << " --> #{ tk.inspect }"
      string << '>'
    end
operates similarly to peek, but returns the full token object at look-ahead position k
    # File lib/antlr3/streams.rb, line 932
    def look( k = 1 )
      index = future?( k ) or return nil
      @tokens.fetch( index, EOF_TOKEN )
    end
bookmark the current position of the input stream
    # File lib/antlr3/streams.rb, line 869
    def mark
      @last_marker = @position
    end
returns the index of the on-channel token at look-behind position k or nil if no other on-channel tokens exist before the current token
    # File lib/antlr3/streams.rb, line 972
    def past?( k = 1 )
      @position == -1 and fill_buffer

      case
      when k == 0 then nil
      when @position - k < 0 then nil
      else
        k.times.inject( @position ) do |cursor, |
          begin
            cursor <= 0 and return( nil )
            tk = @tokens.at( cursor -= 1 ) or return( nil )
          end until tk.channel == @channel
          cursor
        end
      end
    end
returns the type of the on-channel token at look-ahead distance k. k = 1 refers to the current token, and values of k greater than 1 refer to upcoming on-channel tokens. A negative value of k refers to previously consumed on-channel tokens, where k = -1 is the last on-channel token consumed. k = 0 does not name a token position; peek returns nil for it
    # File lib/antlr3/streams.rb, line 925
    def peek( k = 1 )
      tk = look( k ) and return( tk.type )
    end
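The meaning of k is easiest to see on a list that holds only the on-channel tokens. The sketch below uses that simplification rather than the library's approach of skipping off-channel tokens inside the full buffer, and it returns nil past the end where the library's look falls back to EOF_TOKEN. `SketchToken` and `peek_sketch` are hypothetical names.

```ruby
# Hypothetical stand-in token; only the fields this sketch needs.
SketchToken = Struct.new(:type, :text)

# on_channel holds only the on-channel tokens, in stream order, and
# cursor is the index of the current token within that list.
def peek_sketch(on_channel, cursor, k = 1)
  return nil if k == 0                       # no token sits at k = 0
  offset = k > 0 ? cursor + k - 1 : cursor + k
  return nil if offset < 0                   # nothing that far back
  token = on_channel[offset]                 # nil past the end in this sketch
  token && token.type
end

on_channel = [
  SketchToken.new(:INT,  '35'),
  SketchToken.new(:MULT, '*'),
  SketchToken.new(:INT,  '4')
]
cursor = 1  # '35' has been consumed; '*' is the current token

current  = peek_sketch(on_channel, cursor, 1)
upcoming = peek_sketch(on_channel, cursor, 2)
previous = peek_sketch(on_channel, cursor, -1)
```

The asymmetry in the offset calculation reflects the convention that k = 1 is the current token while k = -1 is the token just before it.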
resets the token stream and rebuilds it with a potentially new token source. If no token_source value is provided, the stream will attempt to reset the current token_source by calling reset on the object. The stream will then clear the token buffer and attempt to harvest new tokens. As with CommonTokenStream.new, if a block is provided, tokens will be yielded and discarded if the block returns a false or nil value.
    # File lib/antlr3/streams.rb, line 814
    def rebuild( token_source = nil )
      if token_source.nil?
        @token_source.reset rescue nil
      else @token_source = token_source
      end
      @tokens = block_given? ? @token_source.select { |token| yield( token ) } :
                               @token_source.to_a
      @tokens.each_with_index { |t, i| t.index = i }
      @last_marker = nil
      @position =
        if first_token = @tokens.find { |t| t.channel == @channel }
          @tokens.index( first_token )
        else @tokens.length
        end
      return self
    end

    # File lib/antlr3/streams.rb, line 873
    def release( marker = nil )
      # do nothing
    end
rewind the stream to its initial state
    # File lib/antlr3/streams.rb, line 858
    def reset
      @position = 0
      @position += 1 while token = @tokens[ @position ] and
                           token.channel != @channel
      @last_marker = nil
      return self
    end

    # File lib/antlr3/streams.rb, line 878
    def rewind( marker = @last_marker, release = true )
      seek( marker )
    end
jump to the stream position specified by index
note: seek does not check whether or not the token at the specified position is on-channel.
    # File lib/antlr3/streams.rb, line 914
    def seek( index )
      @position = index.to_i.bound( 0, @tokens.length )
      return self
    end

    # File lib/antlr3/streams.rb, line 847
    def size
      @tokens.length
    end

    # File lib/antlr3/streams.rb, line 838
    def token_class
      @token_source.token_class
    rescue NoMethodError
      @position == -1 and fill_buffer
      @tokens.empty? ? CommonToken : @tokens.first.class
    end
returns a copy of the token buffer. If start and stop are provided, tokens returns a slice of the token buffer from start..stop. The parameters are converted to integers with their to_i methods, and thus tokens can be provided to specify start and stop. If a block is provided, tokens are yielded and filtered out of the returned array if the block returns a false or nil value.
    # File lib/antlr3/streams.rb, line 1044
    def tokens( start = nil, stop = nil )
      stop.nil? || stop >= @tokens.length and stop = @tokens.length - 1
      start.nil? || stop < 0 and start = 0
      tokens = @tokens[ start..stop ]

      if block_given?
        tokens.delete_if { |t| not yield( t ) }
      end

      return( tokens )
    end
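As a rough illustration of the slicing and filtering described above, the sketch below works on a plain array. `SketchToken` and `tokens_sketch` are hypothetical names, and the bounds handling is simplified relative to the listing: stop is clamped to the last index and a missing start defaults to 0.

```ruby
# Hypothetical stand-in token; only the fields this sketch needs.
SketchToken = Struct.new(:text, :channel)

# Return a new array holding buffer[start..stop], optionally keeping
# only the tokens for which the block returns a truthy value.
def tokens_sketch(buffer, start = nil, stop = nil)
  stop = buffer.length - 1 if stop.nil? || stop >= buffer.length
  start = 0 if start.nil?
  slice = buffer[start..stop]
  slice = slice.select { |t| yield(t) } if block_given?
  slice
end

buffer = [
  SketchToken.new('35', :default),
  SketchToken.new(' ',  :hidden),
  SketchToken.new('*',  :default)
]

everything = tokens_sketch(buffer)
first_two  = tokens_sketch(buffer, 0, 1)
on_default = tokens_sketch(buffer) { |t| t.channel == :default }
```

Since the result is a fresh array, adding or removing elements in it leaves the stream's own buffer untouched, although the token objects themselves are shared.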
tune the stream to a new channel value
    # File lib/antlr3/streams.rb, line 834
    def tune_to( channel )
      @channel = channel
    end
iterates through the token stream, yielding each on-channel token along the way. After iteration has completed, the stream’s position will be restored to where it was before walk was called. While each and each_on_channel do not change the stream’s position during iteration, walk advances through the stream. This makes it possible to look ahead and behind the current token during iteration. If no block is given, an enumerator will be returned.
    # File lib/antlr3/streams.rb, line 1022
    def walk
      block_given? or return enum_for( :walk )
      initial_position = @position
      begin
        while token = look and token.type != EOF
          consume
          yield( token )
        end
        return self
      ensure
        @position = initial_position
      end
    end
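The practical benefit of walk is that the stream has already advanced when the block runs, so look inside the block sees the token that follows the one being yielded. The sketch below demonstrates this with a hypothetical `WalkSketch` class; it stops when look returns nil, whereas the library checks for the EOF token type.

```ruby
# Hypothetical stand-in token; only the fields this sketch needs.
SketchToken = Struct.new(:text, :channel)

class WalkSketch
  attr_reader :position

  def initialize(tokens, channel)
    @tokens = tokens
    @channel = channel
    @position = next_on_channel(0)
  end

  # Current on-channel token, or nil once the buffer is exhausted.
  def look
    @tokens[@position]
  end

  def consume
    @position = next_on_channel(@position + 1)
  end

  # Yield each on-channel token after advancing past it, then restore
  # the starting position even if the block raises.
  def walk
    initial_position = @position
    begin
      while (token = look)
        consume
        yield token
      end
      self
    ensure
      @position = initial_position
    end
  end

  private

  def next_on_channel(from)
    found = (from...@tokens.length).find { |i| @tokens[i].channel == @channel }
    found || @tokens.length
  end
end

stream = WalkSketch.new(
  [SketchToken.new('35', :default), SketchToken.new(' ', :hidden),
   SketchToken.new('*', :default),  SketchToken.new(' ', :hidden),
   SketchToken.new('4', :default)],
  :default
)

# Pair every on-channel token with the one that follows it.
pairs = []
stream.walk do |token|
  following = stream.look
  pairs << [token.text, following && following.text]
end
```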