class ANTLR3::StringStream
A StringStream’s purpose is to wrap the basic, naked text input of a recognition system. Like all other stream types, it provides serial navigation of the input; a recognizer can arbitrarily step forward and backward through the stream’s symbols as it requires. StringStream and its subclasses are the main way to feed text input into an ANTLR Lexer for token processing.
The stream’s symbols of interest, of course, are character values. Thus, the peek method returns the integer character value at look-ahead position k and the look method returns the character value as a String. Streams also track various pieces of information such as the line and column numbers at the current position.
Note About Text Encoding
This version of the runtime library primarily targets Ruby version 1.8, which does not have strong built-in support for multi-byte character encodings. Thus, characters are assumed to be represented by a single byte, that is, an integer between 0 and 255. Ruby 1.9 does provide built-in encoding support for multi-byte characters, but this library currently does not provide any streams to handle non-ASCII encodings. However, encoding-savvy recognition code is a future development goal for this project.
Constants
- NEWLINE
Attributes
current integer character index of the stream
the current character position within the current line, indexed upward from 0
the entire string that is wrapped by the stream
current integer character index of the stream
the current line number of the input, indexed upward from 1
the name associated with the stream, usually a file name; defaults to "(string)"
current integer character index of the stream
the name associated with the stream, usually a file name; defaults to "(string)"
Public Class Methods
creates a new StringStream object where data is the string data to stream. accepts the following options in a symbol-to-value hash:
- :file or :name: the (file) name to associate with the stream; default: '(string)'
- :line: the initial line number; default: 1
- :column: the initial column number; default: 0
# File lib/antlr3/streams.rb, line 397
def initialize( data, options = {} )      # for 1.9
  @string   = data.to_s.encode( Encoding::UTF_8 ).freeze
  @data     = @string.codepoints.to_a.freeze
  @position = options.fetch :position, 0
  @line     = options.fetch :line, 1
  @column   = options.fetch :column, 0
  @markers  = []
  @name   ||= options[ :file ] || options[ :name ] # || '(string)'
  mark
end
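The documented option defaults can be illustrated without loading the library. The sketch below is only an approximation: `stream_settings` is a hypothetical helper, not part of the ANTLR3 API, and it follows the documented default name of '(string)' even though that fallback appears commented out in the listing above.

```ruby
# Hypothetical helper that mirrors the documented option handling.
# It is not part of the ANTLR3 runtime; it only shows how the defaults resolve.
def stream_settings( options = {} )
  {
    :name   => options[ :file ] || options[ :name ] || '(string)',
    :line   => options.fetch( :line, 1 ),     # line numbers start at 1
    :column => options.fetch( :column, 0 )    # column numbers start at 0
  }
end

stream_settings
# => { :name => "(string)", :line => 1, :column => 0 }

stream_settings( :file => 'input.txt', :line => 10 )
# => { :name => "input.txt", :line => 10, :column => 0 }
```

Starting at a line other than 1 can be useful when the text handed to the stream is a fragment cut out of a larger file, so that reported locations still match the original file.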
Public Instance Methods
operator style look-behind
# File lib/antlr3/streams.rb, line 521
def <<( k )
  look( -k )   # look behind by negating k; calling self << -k here would recurse without end
end
identical to String#[]
# File lib/antlr3/streams.rb, line 659
def []( start, *args )
  @string[ start, *args ]
end
Returns true if the stream appears to be at the beginning of a new line. This is an extra utility method for use inside lexer actions if needed.
# File lib/antlr3/streams.rb, line 534
def beginning_of_line?
  @position.zero? or @data[ @position - 1 ] == NEWLINE
end
Returns true if the stream appears to be at the beginning of a stream (position = 0). This is an extra utility method for use inside lexer actions if needed.
# File lib/antlr3/streams.rb, line 558
def beginning_of_string?
  @position == 0
end
advance the stream by one character; returns the character consumed
# File lib/antlr3/streams.rb, line 478
def consume
  c = @data[ @position ] || EOF
  if @position < @data.length
    @column += 1
    if c == NEWLINE
      @line += 1
      @column = 0
    end
    @position += 1
  end
  return( c )
end
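The line and column bookkeeping performed by consume can be tried outside the library with a small stand-in class. This is a minimal sketch, not the library class: `CursorSketch` is a hypothetical name, and the `EOF = -1` sentinel is an assumption made for the example.

```ruby
# Hypothetical miniature cursor that mirrors the bookkeeping in the listing above.
class CursorSketch
  NEWLINE = "\n".ord
  EOF     = -1   # assumed sentinel value for this sketch

  attr_reader :position, :line, :column

  def initialize( text )
    @data = text.codepoints
    @position, @line, @column = 0, 1, 0
  end

  # advance by one character and return the character consumed
  def consume
    c = @data[ @position ] || EOF
    if @position < @data.length
      @column += 1
      if c == NEWLINE      # a newline starts a fresh line at column 0
        @line  += 1
        @column = 0
      end
      @position += 1
    end
    c
  end
end

cursor = CursorSketch.new( "a\nb" )
3.times { cursor.consume }
[ cursor.position, cursor.line, cursor.column ]   # => [3, 2, 1]
cursor.consume                                    # => -1, and the position stays at 3
```

Once the input is exhausted, further calls keep returning the end-of-input value and leave the location untouched.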
Returns true if the stream appears to be at the end of a line, that is, if the current character is a newline. This is an extra utility method for use inside lexer actions if needed.
# File lib/antlr3/streams.rb, line 542
def end_of_line?
  @data[ @position ] == NEWLINE #if @position < @data.length
end
Returns true if the stream has been exhausted. This is an extra utility method for use inside lexer actions if needed.
# File lib/antlr3/streams.rb, line 550
def end_of_string?
  @position >= @data.length
end
customized object inspection that shows:
- the stream class
- the stream’s location in index / line:column format
- before_chars characters before the cursor (6 characters by default)
- after_chars characters after the cursor (10 characters by default)
# File lib/antlr3/streams.rb, line 638
def inspect( before_chars = 6, after_chars = 10 )
  before = through( -before_chars ).inspect
  @position - before_chars > 0 and before.insert( 0, '... ' )

  after = through( after_chars ).inspect
  @position + after_chars + 1 < @data.length and after << ' ...'

  location = "#@position / line #@line:#@column"
  "#<#{ self.class }: #{ before } | #{ after } @ #{ location }>"
end
the last marker value created by a call to mark
# File lib/antlr3/streams.rb, line 597
def last_marker
  @markers.length - 1
end
identical to peek, except it returns the character value as a String
# File lib/antlr3/streams.rb, line 411
def look( k = 1 )               # for 1.9
  k == 0 and return nil
  k += 1 if k < 0

  index = @position + k - 1
  index < 0 and return nil

  @string[ index ]
end
record the current stream location parameters in the stream’s marker table and return an integer-valued bookmark that may be used to restore the stream’s position with the rewind method. This method is used to implement backtracking.
# File lib/antlr3/streams.rb, line 570
def mark
  state = [ @position, @line, @column ].freeze
  @markers << state
  return @markers.length - 1
end
the total number of markers currently in existence
# File lib/antlr3/streams.rb, line 590
def mark_depth
  @markers.length
end
return the character at look-ahead distance k as an integer. k = 1 represents the current character. k greater than 1 represents upcoming characters. A negative value of k returns previous characters consumed, where k = -1 is the last character consumed. k = 0 is not a meaningful distance and returns nil
# File lib/antlr3/streams.rb, line 497
def peek( k = 1 )
  k == 0 and return nil
  k += 1 if k < 0
  index = @position + k - 1
  index < 0 and return nil
  @data[ index ] or EOF
end
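The index arithmetic behind the look-ahead distance is easier to follow with concrete values. The following is a free-standing sketch of the same calculation rather than the library method: `peek_at` is a hypothetical name, the data array and position are passed in explicitly, and the end-of-input value of -1 is an assumption.

```ruby
# Hypothetical free-standing version of the look-ahead index arithmetic.
# eof defaults to -1, which is an assumption made for this sketch.
def peek_at( data, position, k = 1, eof = -1 )
  return nil if k == 0
  k += 1 if k < 0                  # shift so that k = -1 lands on the last consumed character
  index = position + k - 1
  return nil if index < 0          # nothing exists before the start of the input
  data[ index ] || eof
end

data = "abc".codepoints            # [97, 98, 99]

# with the cursor at position 1, "a" has been consumed and "b" is current
peek_at( data, 1,  1 )             # => 98  ("b", the current character)
peek_at( data, 1,  2 )             # => 99  ("c", the next character)
peek_at( data, 1, -1 )             # => 97  ("a", the last character consumed)
peek_at( data, 1, -2 )             # => nil (before the start of the input)
peek_at( data, 1,  3 )             # => -1  (past the end of the input)
```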
let go of the bookmark data for the marker and all marker values created after the marker.
# File lib/antlr3/streams.rb, line 605
def release( marker = @markers.length - 1 )
  marker.between?( 1, @markers.length - 1 ) or return
  @markers.pop( @markers.length - marker )
  return self
end
rewinds the stream back to the start and clears out any existing marker entries
# File lib/antlr3/streams.rb, line 467
def reset
  initial_location = @markers.first
  @position, @line, @column = initial_location
  @markers.clear
  @markers << initial_location
  return self
end
restore the stream to an earlier location recorded by mark. If no marker value is provided, the last marker generated by mark will be used.
# File lib/antlr3/streams.rb, line 580
def rewind( marker = @markers.length - 1, release = true )
  ( marker >= 0 and location = @markers[ marker ] ) or return( self )
  @position, @line, @column = location
  release( marker ) if release
  return self
end
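The interplay of mark and rewind during backtracking can be sketched with a reduced marker table. This is an illustration under simplifying assumptions, not the library class: `MarkerSketch` and its `advance` method are hypothetical, no text is actually read, and rewinding always releases the marker.

```ruby
# Hypothetical miniature of the marker table; it tracks only the location triple.
class MarkerSketch
  attr_reader :position, :line, :column

  def initialize
    @position, @line, @column = 0, 1, 0
    @markers = []
    mark                     # marker 0 always records the initial location
  end

  # pretend to consume n characters on the current line
  def advance( n )
    @position += n
    @column   += n
  end

  # record the current location and return its bookmark
  def mark
    @markers << [ @position, @line, @column ].freeze
    @markers.length - 1
  end

  # restore a recorded location, then drop that marker and any later ones
  def rewind( marker = @markers.length - 1 )
    location = ( marker >= 0 ? @markers[ marker ] : nil )
    return self unless location
    @position, @line, @column = location
    @markers.pop( @markers.length - marker ) if marker >= 1
    self
  end

  def mark_depth
    @markers.length
  end
end

s = MarkerSketch.new
s.advance( 3 )
m = s.mark          # => 1
s.advance( 4 )      # a speculative match moves the cursor to position 7
s.rewind( m )       # the match failed, so return to the bookmark
s.position          # => 3
s.mark_depth        # => 1, only the initial marker remains
```

Marker 0 is never released in this sketch, which matches the listing above, where release ignores marker values below 1.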
jump to the absolute position value given by index. note: if index is before the current position, the line and column attributes of the stream will probably be incorrect
# File lib/antlr3/streams.rb, line 616
def seek( index )
  index = index.bound( 0, @data.length )  # ensures index is within the stream's range
  if index > @position
    skipped = through( index - @position )
    if lc = skipped.count( "\n" ) and lc.zero?
      @column += skipped.length
    else
      @line += lc
      @column = skipped.length - skipped.rindex( "\n" ) - 1
    end
  end
  @position = index
  return nil
end
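When seeking forward, the new line and column are derived from the text that was skipped. The sketch below isolates that calculation; `advance_location` is a hypothetical helper that takes the skipped text and the current line and column as arguments instead of reading them from a stream.

```ruby
# Hypothetical helper isolating the forward-seek location update.
# Returns the new [ line, column ] pair after skipping over the given text.
def advance_location( skipped, line, column )
  newlines = skipped.count( "\n" )
  if newlines.zero?
    # same line: the column simply grows by the number of skipped characters
    [ line, column + skipped.length ]
  else
    # new line: the column is the number of characters after the last newline
    [ line + newlines, skipped.length - skipped.rindex( "\n" ) - 1 ]
  end
end

advance_location( "abc", 1, 4 )          # => [1, 7]
advance_location( "ab\ncd\ne", 1, 4 )    # => [3, 1]
```

No equivalent update happens when seeking backward, which is why the line and column cannot be trusted after a backward seek.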
# File lib/antlr3/streams.rb, line 458
def size
  @data.length
end
return the string slice between position start and stop
# File lib/antlr3/streams.rb, line 652
def substring( start, stop )
  @string[ start, stop - start + 1 ]
end
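Judging from the listing, the slice includes the character at stop. A brief sketch of the same arithmetic on a plain String makes the difference from an exclusive range visible; `inclusive_slice` is a hypothetical name used only for this example.

```ruby
# Hypothetical helper using the same start/length arithmetic as the listing above.
def inclusive_slice( string, start, stop )
  string[ start, stop - start + 1 ]   # length is stop - start + 1, so stop is included
end

inclusive_slice( "hello world", 0, 4 )   # => "hello"
inclusive_slice( "hello world", 6, 6 )   # => "w"
"hello world"[ 0...4 ]                   # => "hell" (an exclusive range drops index 4)
```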
return a substring around the stream cursor at a distance k: if k >= 0, return the next k characters; if k < 0, return the previous |k| characters
# File lib/antlr3/streams.rb, line 510
def through( k )
  if k >= 0 then @string[ @position, k ] else
    start = ( @position + k ).at_least( 0 ) # start cannot be negative or index will wrap around
    @string[ start ... @position ]
  end
end
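The two directions can be compared side by side on a plain String. In this sketch `window` is a hypothetical stand-in that takes the string and cursor position as arguments, and the clamping done by `at_least` in the listing (which appears to be a core extension supplied by the library) is approximated with `Array#max`.

```ruby
# Hypothetical stand-in for the forward/backward slicing around the cursor.
def window( string, position, k )
  if k >= 0
    string[ position, k ]                 # the next k characters
  else
    start = [ position + k, 0 ].max       # clamp so a negative index cannot wrap around
    string[ start...position ]            # the previous |k| characters
  end
end

window( "hello world", 5,  3 )   # => " wo"
window( "hello world", 5, -3 )   # => "llo"
window( "hello world", 2, -5 )   # => "he" (only two characters precede the cursor)
```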