module ANTLR3::Token
At a minimum, tokens are data structures that bind together a chunk of text and a corresponding type symbol, which categorizes/characterizes the content of the text. Tokens also usually carry information about their location in the input, such as absolute character index, line number, and position within the line (or column).
Furthermore, ANTLR tokens are assigned a “channel” number, an extra degree of categorization that groups things on a larger scale. Parsers will usually ignore tokens that have channel value 99 (the HIDDEN_CHANNEL), so you can keep things like comments and whitespace huddled together with neighboring tokens, effectively ignoring them without discarding them.
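As a rough illustration of channel filtering, the sketch below uses a stand-in SketchToken struct rather than the library's own token class. The value 99 for the hidden channel is taken from the description above; the default channel value of 0 and the visible_tokens helper are assumptions made for this example.

```ruby
# Stand-in token for illustration only; not the library's token class.
SketchToken = Struct.new(:type, :text, :channel)

HIDDEN_CHANNEL  = 99 # value quoted in the text above
DEFAULT_CHANNEL = 0  # assumed value for this sketch

# Hypothetical helper: keep only the tokens a parser would look at.
def visible_tokens(tokens)
  tokens.reject { |tk| tk.channel == HIDDEN_CHANNEL }
end

tokens = [
  SketchToken.new(:ID,      "x",    DEFAULT_CHANNEL),
  SketchToken.new(:WS,      " ",    HIDDEN_CHANNEL),
  SketchToken.new(:ASSIGN,  "=",    DEFAULT_CHANNEL),
  SketchToken.new(:COMMENT, "# hi", HIDDEN_CHANNEL),
  SketchToken.new(:INT,     "1",    DEFAULT_CHANNEL)
]

# The parser-facing view skips hidden tokens, yet the full list keeps them.
puts visible_tokens(tokens).map(&:text).join(" ")  # prints: x = 1
puts tokens.length                                 # prints: 5
```

Nothing is deleted here: the whitespace and comment tokens stay in the list, so a tool that needs them (a pretty-printer, for instance) can still find them next to their neighbors.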
ANTLR tokens also keep a reference to the source stream from which they originated. Token streams will also provide an index value for the token, which indicates the position of the token relative to other tokens in the stream, starting at zero. For example, the 22nd token pulled from a lexer by CommonTokenStream will have index value 21.
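The zero-based numbering can be sketched as follows. IndexedToken and number_tokens are hypothetical names for this example; a real token stream assigns the index values itself.

```ruby
# Stand-in token; a real stream assigns index values itself.
IndexedToken = Struct.new(:text, :index)

# Hypothetical helper mimicking how a stream numbers tokens from zero.
def number_tokens(texts)
  texts.each_with_index.map { |text, i| IndexedToken.new(text, i) }
end

stream = number_tokens(("a".."z").to_a)
twenty_second = stream[21]   # the 22nd token pulled from the stream
puts twenty_second.index     # prints: 21
puts twenty_second.text      # prints: v
```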
Token as an Interface
This library provides a token implementation (see CommonToken). Additionally, you may write your own token class as long as you provide methods that give access to the attributes expected by a token. Even though most of the ANTLR library tries to use duck-typing techniques instead of pure object-oriented type checking, it’s a good idea to include the ANTLR3::Token module in your customized token class.
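A minimal sketch of a duck-typed custom token might look like the following. MyToken and its constructor signature are hypothetical, the default channel of 0 is an assumption, and the include of ANTLR3::Token recommended above is left out only so the sketch runs without the antlr3 gem installed.

```ruby
# Hypothetical custom token. In a real project the text above recommends
# adding `include ANTLR3::Token` here; it is omitted so that this sketch
# runs without the antlr3 gem.
class MyToken
  # The attributes a token is expected to expose.
  attr_accessor :type, :channel, :text, :input,
                :start, :stop, :line, :column, :index

  def initialize(type, text, **attrs)
    @type, @text = type, text
    @channel = attrs.fetch(:channel, 0)  # 0 is an assumed default channel
    @index   = attrs.fetch(:index, -1)   # assumed: -1 means "not in a stream"
    @line    = attrs.fetch(:line, 0)
    @column  = attrs.fetch(:column, -1)
    @input, @start, @stop = attrs.values_at(:input, :start, :stop)
  end
end

tk = MyToken.new(5, "while", line: 3, column: 4)
p [tk.type, tk.text, tk.line, tk.column]  # prints: [5, "while", 3, 4]
```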
Attributes
channel: the integer value of the channel to which the token is assigned
column: the text’s starting position in the line within the source (indexed starting at 0)
index: the index of the token with respect to the other tokens produced during lexing
input: a reference to the input stream from which the token was extracted
line: the text’s starting line number within the source (indexed starting at 1)
start: the absolute character index in the input at which the text starts
stop: the absolute character index in the input at which the text ends
text: the token’s associated chunk of text
type: the integer value associated with the token’s type
Public Instance Methods
Tokens are comparable by their stream index values.
# File lib/antlr3/token.rb, line 130
def <=> tk2
  index <=> tk2.index
end
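To show what ordering by index gives you in practice, here is a small stand-in class. OrderedToken is a hypothetical name, and mixing in Comparable is an assumption of this sketch; the listing above only shows the <=> method itself.

```ruby
# Stand-in token ordered by stream index, mirroring the <=> shown above.
class OrderedToken
  include Comparable   # assumption: derives <, >, == and friends from <=>
  attr_reader :text, :index

  def initialize(text, index)
    @text, @index = text, index
  end

  def <=>(other)
    index <=> other.index
  end
end

a = OrderedToken.new("if", 0)
b = OrderedToken.new("x", 1)
c = OrderedToken.new("then", 2)

puts [c, a, b].sort.map(&:text).join(" ")  # prints: if x then
puts a < c                                 # prints: true
puts [b, c, a].min.text                    # prints: if
```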
The match operator has been implemented to match against several different attributes of a token, for convenience in quick scripts.
@example Match against an integer token type constant
  token =~ VARIABLE_NAME   # => true/false
@example Match against a token type name as a Symbol
  token =~ :FLOAT   # => true/false
@example Match the token text against a Regular Expression
  token =~ /^@[a-z_]\w*$/i
@example Compare the token’s text to a string
  token =~ "class"
# File lib/antlr3/token.rb, line 117
def =~ obj
  case obj
  when Integer then type == obj
  when Symbol then name == obj.to_s
  when Regexp then obj =~ text
  when String then text == obj
  else super
  end
end
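The dispatch can be tried out with a stand-in class. MatchToken and its NAMES table are hypothetical, and the super fallback from the listing is replaced with false so the sketch does not depend on a parent implementation of =~.

```ruby
class MatchToken
  # Hypothetical type-number-to-name table, for this sketch only.
  NAMES = { 4 => "FLOAT", 5 => "VARIABLE_NAME" }.freeze

  attr_reader :type, :text

  def initialize(type, text)
    @type, @text = type, text
  end

  def name
    NAMES[type]
  end

  # Same dispatch shape as the listing above, minus the `super` fallback.
  def =~(obj)
    case obj
    when Integer then type == obj
    when Symbol  then name == obj.to_s
    when Regexp  then obj =~ text
    when String  then text == obj
    else false
    end
  end
end

tk = MatchToken.new(5, "@count")
p(tk =~ 5)                  # prints: true
p(tk =~ :VARIABLE_NAME)     # prints: true
p(tk =~ :FLOAT)             # prints: false
p(tk =~ /^@[a-z_]\w*$/i)    # prints: 0
p(tk =~ "class")            # prints: false
```

Note that the Regexp branch returns the match position (or nil) rather than true or false, so the result is best used in a condition and not compared against true.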
# File lib/antlr3/token.rb, line 146
def concrete?
  input && start && stop ? true : false
end

Sets the token’s channel value to HIDDEN_CHANNEL.
# File lib/antlr3/token.rb, line 173
def hide!
  self.channel = HIDDEN_CHANNEL
end

# File lib/antlr3/token.rb, line 150
def imaginary?
  input && start && stop ? false : true
end
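Going by the listings, concrete? and imaginary? are complements: a token counts as concrete only when it has an input reference plus both start and stop positions. The sketch below uses a hypothetical PlacedToken struct, with a plain String standing in for the input stream.

```ruby
# Stand-in token with only the three attributes the predicates consult.
PlacedToken = Struct.new(:input, :start, :stop) do
  # Concrete: has an input plus start and stop positions.
  def concrete?
    input && start && stop ? true : false
  end

  # Imaginary: missing at least one of the three.
  def imaginary?
    !concrete?
  end
end

source   = "x = 1"                         # a String stands in for the stream
from_src = PlacedToken.new(source, 0, 0)   # covers the "x"
made_up  = PlacedToken.new(nil, nil, nil)  # has no position in any input

p from_src.concrete?   # prints: true
p from_src.imaginary?  # prints: false
p made_up.concrete?    # prints: false
p made_up.imaginary?   # prints: true
```

A start or stop of 0 still counts as present, because 0 is truthy in Ruby; only nil or false makes a token imaginary.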
# File lib/antlr3/token.rb, line 134
def initialize_copy( orig )
  self.index = -1
  self.type = orig.type
  self.channel = orig.channel
  self.text = orig.text.clone if orig.text
  self.start = orig.start
  self.stop = orig.stop
  self.line = orig.line
  self.column = orig.column
  self.input = orig.input
end
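Ruby calls initialize_copy on the new object whenever dup or clone is used, so the listing above means a copied token starts with index -1 and its own copy of the text. A trimmed-down sketch with a hypothetical CopyToken class, carrying only three of the attributes:

```ruby
class CopyToken
  attr_accessor :index, :type, :text

  def initialize(type, text, index)
    @type, @text, @index = type, text, index
  end

  # Invoked by dup and clone on the new object, with the original as `orig`.
  def initialize_copy(orig)
    self.index = -1                            # the copy has no stream position
    self.type  = orig.type
    self.text  = orig.text.clone if orig.text  # separate String object
  end
end

original = CopyToken.new(4, "name", 21)
copy     = original.dup

p copy.index                        # prints: -1
p original.index                    # prints: 21
p copy.text == original.text        # prints: true
p copy.text.equal?(original.text)   # prints: false
```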
# File lib/antlr3/token.rb, line 177
def inspect
  text_inspect = text ? "[#{ text.inspect }] " : ' '
  text_position = line > 0 ? "@ line #{ line } col #{ column } " : ''
  stream_position = start ? "(#{ range.inspect })" : ''
  front = index >= 0 ? "#{ index } " : ''
  rep = front << name << text_inspect << text_position << stream_position
  rep.strip!
  channel == DEFAULT_CHANNEL or rep << " (#{ channel.to_s })"
  return( rep )
end

# File lib/antlr3/token.rb, line 154
def name
  token_name( type )
end

# File lib/antlr3/token.rb, line 190
def pretty_print( printer )
  printer.text( inspect )
end

# File lib/antlr3/token.rb, line 194
def range
  start..stop rescue nil
end

# File lib/antlr3/token.rb, line 158
def source_name
  i = input and i.source_name
end
# File lib/antlr3/token.rb, line 166
def source_text
  concrete? ? input.substring( start, stop ) : text
end
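The fallback in source_text can be seen with two stand-ins. SketchStream and SourceToken are hypothetical names, and treating substring( start, stop ) as inclusive of the stop index is an assumption of this sketch.

```ruby
# Minimal stand-in for an input stream; only `substring` is needed here.
# Assumption: substring(start, stop) includes the character at `stop`.
class SketchStream
  def initialize(data)
    @data = data
  end

  def substring(start, stop)
    @data[start..stop]
  end
end

class SourceToken
  attr_accessor :input, :start, :stop, :text

  def initialize(text, input = nil, start = nil, stop = nil)
    @text, @input, @start, @stop = text, input, start, stop
  end

  def concrete?
    input && start && stop ? true : false
  end

  # Prefer the original characters; fall back to the stored text.
  def source_text
    concrete? ? input.substring(start, stop) : text
  end
end

stream = SketchStream.new("Hello World")
tk = SourceToken.new("hello", stream, 0, 4)  # stored text differs from source
puts tk.source_text                          # prints: Hello
puts SourceToken.new("BLOCK").source_text    # prints: BLOCK
```

The first token reports the characters as they appear in the input even though its text attribute was changed; the second has no input to consult, so its own text is returned.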
# File lib/antlr3/token.rb, line 198
def to_i
  index.to_i
end

# File lib/antlr3/token.rb, line 202
def to_s
  text.to_s
end
Private Instance Methods
# File lib/antlr3/token.rb, line 208
def token_name( type )
  BUILT_IN_TOKEN_NAMES[ type ]
end