class Honeycomb::Redis::Fields
This structure contains the fields we'll add to each Redis span.
The logic is in this class to avoid monkey-patching extraneous APIs into the Redis::Client via {Client}.
@private
Constants
- BACKSLASH
A lookup table for backslash-escaped characters.
This is used by {#prettify} to replicate the hard-coded `case` statements in redis-cli. As of this writing, Redis recognizes a handful of standard C escape sequences, like "\n" for newlines.
Because {#prettify} will output double-quoted strings if any escaping is needed, this table must additionally consider the double quote to be a backslash-escaped character. For example, instead of generating
'"hello"'
we'll generate
"\"hello\""
even though redis-cli would technically recognize the single-quoted version.
@see github.com/antirez/redis/blob/0f026af185e918a9773148f6ceaa1b084662be88/src/sds.c#L888-L896
The redis-cli algorithm for outputting standard escape sequences
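As a rough illustration of how such a lookup table can drive a substitution, here is a minimal sketch. The name BACKSLASH_SKETCH and its exact entries are assumptions made for this example, not a copy of the library's constant; it relies only on the fact that String#gsub accepts a Hash as the replacement.

```ruby
# Hypothetical stand-in for the BACKSLASH table. The exact set of entries
# in the library is an assumption here.
BACKSLASH_SKETCH = {
  "\\" => "\\\\", # backslash    -> \\
  '"'  => '\\"',  # double quote -> \"
  "\n" => "\\n",  # newline      -> \n
  "\r" => "\\r",  # carriage return
  "\t" => "\\t",  # tab
  "\a" => "\\a",  # bell
  "\b" => "\\b",  # backspace
}.freeze

# gsub with a Hash replaces each match with the value stored under it.
escaped = "say \"hi\"\n".gsub(/["\n]/, BACKSLASH_SKETCH)
puts escaped # prints: say \"hi\"\n
```

Keeping the mapping in a frozen Hash lets the substitution stay a single gsub call rather than a chain of conditionals.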
- NEEDS_BACKSLASH
A regular expression for characters that need to be backslash-escaped.
Any match of this regexp will be substituted according to the {BACKSLASH} table. This includes standard C escape sequences (newlines, tabs, etc) as well as a couple special considerations:
- Because {#prettify} will output double-quoted strings if any escaping is needed, we must match double quotes (") so they'll be replaced by escaped quotes (\").
- Backslashes themselves get backslash-escaped, so \ becomes \\. However, strings with invalid UTF-8 encoding will blow up when we try to use String#gsub!, so {#prettify} must first use String#encode! to scrub out invalid characters. It does this by replacing invalid bytes with hex-encoded escape sequences using {#hex}. This will insert sequences like \xhh, which contain a backslash that we *don't* want to escape.
  Unfortunately, this regexp can't really distinguish between backslashes in the original input and backslashes resulting from the UTF-8 fallback. We make an effort by using a negative lookahead. That way, only backslashes that *aren't* followed by x + hex digit + hex digit will be escaped.
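The negative lookahead can be sketched as follows. The pattern below is an assumed approximation written for illustration (the name NEEDS_BACKSLASH_SKETCH and the helper needs_backslash? are hypothetical); the library's actual regexp may differ in detail.

```ruby
# Hypothetical approximation of NEEDS_BACKSLASH: the standard escapes and
# the double quote, plus any backslash NOT followed by "x" and two hex digits.
NEEDS_BACKSLASH_SKETCH = /["\n\r\t\a\b]|\\(?!x\h\h)/

# Helper for the example only: true if the string contains something to escape.
def needs_backslash?(str)
  !(str =~ NEEDS_BACKSLASH_SKETCH).nil?
end

puts needs_backslash?('a\b')  # literal backslash followed by "b"
puts needs_backslash?('\xff') # looks like a hex escape, left alone
puts needs_backslash?('\xzz') # "zz" is not hex, so the backslash is escaped
```

Note the limitation described above still applies: an input that already contains the literal text \xff is indistinguishable from a fallback-generated escape.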
- NEEDS_HEX
A regular expression matching characters that need to be hex-encoded.
This replicates the C isprint() function that redis-cli uses to decide whether to escape a character in hexadecimal notation, "\xhh". Any non-printable character must be represented as a hex escape sequence.
Normally, we could match this using a negated POSIX bracket expression:
/[^[:print:]]/
You can read that as “not printable”.
However, in Ruby, these character classes also encompass non-ASCII characters. In contrast, since most platforms have 8-bit `char` types, the C isprint() function generally does not recognize any Unicode code points. This effectively limits the redis-cli interpretation of the printable character range to just printable ASCII characters.
Thus, we match using a combination of the previous regexp with a non-POSIX character class that Ruby defines:
/[^[:print:]&&[:ascii:]]/
You can read this like
NOT (printable AND ascii)
which by De Morgan's law is equivalent to
(NOT printable) OR (NOT ascii)
That is, if the character is not printable (even in Unicode), we'll escape it; if the character is printable but non-ASCII, we'll also escape it.
What's more, Ruby's Regexp#=~ method will blow up if the string does not have a valid encoding (e.g., in UTF-8). We handle this case separately, though, using String#encode! with a :fallback option to hex-encode invalid UTF-8 byte sequences with {#hex}.
@see ruby-doc.org/core-2.6.5/Regexp.html
@see github.com/antirez/redis/blob/0f026af185e918a9773148f6ceaa1b084662be88/src/sds.c#L878-L880
@see github.com/antirez/redis/blob/0f026af185e918a9773148f6ceaa1b084662be88/src/sds.c#L898-L901
@see www.justinweiss.com/articles/3-steps-to-fix-encoding-problems-in-ruby/
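To make the "NOT (printable AND ascii)" reading concrete, the following sketch applies the regexp quoted above to a few sample characters. The constant name NEEDS_HEX_SKETCH and the helper needs_hex? are hypothetical names for this example; the strings are assumed to be valid UTF-8.

```ruby
# The pattern quoted in the text above, under a hypothetical name.
NEEDS_HEX_SKETCH = /[^[:print:]&&[:ascii:]]/

# Helper for the example only: true if any character would be hex-encoded.
def needs_hex?(str)
  !(str =~ NEEDS_HEX_SKETCH).nil?
end

puts needs_hex?("a")      # printable ASCII
puts needs_hex?(" ")      # space counts as printable
puts needs_hex?("\u0000") # NUL is not printable
puts needs_hex?("\u00e9") # "e" with acute accent: printable, but not ASCII
```

Only the first two samples are left alone; the NUL byte fails the "printable" half and the accented character fails the "ascii" half.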
- NEEDS_QUOTES
If the final escaped string needs quotes, it will match this regexp.
The overall string returned by {#prettify} should only be quoted if at least one of the following holds:
- The string contains an escape sequence, broadly demarcated by a backslash. This includes standard escape sequences like "\n" and "\t" as well as hex-encoded bytes using the "\x" escape sequence. Since {#prettify} uses double quotes on its output string, we must also force quotes if the string itself contains a literal double quote. This double quote behavior is handled tacitly by the {NEEDS_BACKSLASH} + {BACKSLASH} replacement.
- The string contains a single quote. Since redis-cli recognizes single-quoted strings, we want to wrap the {#prettify} output in double quotes so that the literal single quote character isn't mistaken as the delimiter of a new string.
- The string contains any whitespace characters. If the {#prettify} output weren't wrapped in quotes, whitespace would act as a separator between arguments to the Redis command. To group things together, we need to quote the string.
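The three conditions above can be approximated with a single character class. This is a sketch under the assumption that checking for a backslash, a single quote, or whitespace is sufficient; NEEDS_QUOTES_SKETCH and needs_quotes? are hypothetical names, and the input is assumed to be already escaped.

```ruby
# Hypothetical approximation of NEEDS_QUOTES: a backslash (any escape
# sequence), a single quote, or any whitespace character.
NEEDS_QUOTES_SKETCH = /[\\'\s]/

# Helper for the example only. Expects an already-escaped string.
def needs_quotes?(escaped)
  !(escaped =~ NEEDS_QUOTES_SKETCH).nil?
end

puts needs_quotes?("hello")       # nothing special
puts needs_quotes?("hello world") # whitespace would split the argument
puts needs_quotes?("it's")        # single quote
puts needs_quotes?('line\n')      # contains an escape sequence
```

A literal double quote needs no entry of its own here because, by the time this check runs, it has already been turned into a backslash-escaped quote.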
Public Class Methods
# File lib/honeycomb/integrations/redis.rb, line 125
def initialize(client)
  @client = client
end
Public Instance Methods
# File lib/honeycomb/integrations/redis.rb, line 135
def command=(commands)
  commands = Array(commands)
  values["redis.command"] = commands.map { |cmd| format(cmd) }.join("\n")
end
# File lib/honeycomb/integrations/redis.rb, line 129
def options=(options)
  options.each do |option, value|
    values["redis.#{option}"] ||= value unless ignore?(option)
  end
end
# File lib/honeycomb/integrations/redis.rb, line 140
def to_hash
  values
end
Private Instance Methods
# File lib/honeycomb/integrations/redis.rb, line 175
def format(cmd)
  name, *args = cmd.flatten(1)
  name = resolve(name)
  sanitize(args) if name.casecmp("auth").zero?
  [name.upcase, *args.map { |arg| prettify(arg) }].join(" ")
end
Hex-encodes a (presumably non-printable or non-ASCII) character.
Aside from standard backslash escape sequences, redis-cli also recognizes "\xhh" notation, where `hh` is a hexadecimal number.
Of note is that redis-cli only recognizes exactly two-digit hexadecimal numbers. This is in accordance with IEEE Std 1003.1-2001, Chapter 7, Locale:
> A character can be represented as a hexadecimal constant. A
> hexadecimal constant shall be specified as the escape character
> followed by an 'x' followed by two hexadecimal digits. Each constant
> shall represent a byte value. Multi-byte values can be represented by
> concatenated constants specified in byte order with the last constant
> specifying the least significant byte of the character.
Unlike the C `char` type, Ruby's conception of a character can span multiple bytes (and possibly bytes that aren't valid in Ruby's string encoding). So we take care to escape the input properly into the redis-cli compatible version by iterating through each byte and formatting it as a (zero-padded) 2-digit hexadecimal number prefixed by `\x`.
@see pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
@see github.com/antirez/redis/blob/0f026af185e918a9773148f6ceaa1b084662be88/src/sds.c#L878-L880
@see github.com/antirez/redis/blob/0f026af185e918a9773148f6ceaa1b084662be88/src/sds.c#L898-L901
# File lib/honeycomb/integrations/redis.rb, line 357
def hex(char)
  char.bytes.map { |b| Kernel.format("\\x%02x", b) }.join
end
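For a sense of what the byte-by-byte escaping produces, the sketch below copies the method body into a standalone helper (named hex_sketch here, a hypothetical name, since the original is a private instance method) and applies it to a multi-byte character and a control character.

```ruby
# Standalone copy of the hex logic, for illustration only.
def hex_sketch(char)
  char.bytes.map { |b| Kernel.format("\\x%02x", b) }.join
end

puts hex_sketch("\u00e9") # two UTF-8 bytes; prints: \xc3\xa9
puts hex_sketch("\n")     # zero-padded single byte; prints: \x0a
```

Each byte becomes its own two-digit constant, which matches the "concatenated constants specified in byte order" wording of the quoted standard.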
Do we ignore this Redis::Client option?
- :url - unsafe because it might contain a password
- :password - unsafe
- :logger - just some Ruby object, not useful
- :_parsed - implementation detail
# File lib/honeycomb/integrations/redis.rb, line 161
def ignore?(option)
  # Redis options may be symbol or string keys.
  #
  # This normalizes `option` using `to_sym` as benchmarking on Ruby MRI
  # v2.6.6 and v2.7.3 has shown that was faster compared to `to_s`.
  # However, `nil` does not support `to_sym`. This uses a guard clause to
  # handle the `nil` case because this is still faster than safe
  # navigation. Also this lib still supports Ruby 2.2.0; which does not
  # include safe navigation.
  return true unless option
  %i[url password logger _parsed].include?(option.to_sym)
end
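The effect of this filter on a typical options hash can be sketched as below. The helper ignore_option?, the constant IGNORED_OPTIONS_SKETCH, and the sample options are all hypothetical and exist only for this example; the loop mirrors the shape of the options= writer shown earlier.

```ruby
# Hypothetical standalone version of the filter, for illustration only.
IGNORED_OPTIONS_SKETCH = %i[url password logger _parsed].freeze

def ignore_option?(option)
  return true unless option # nil keys are always ignored
  IGNORED_OPTIONS_SKETCH.include?(option.to_sym)
end

# Made-up options with mixed symbol, string, and nil keys.
options = {
  url: "redis://:secret@localhost:6379",
  "password" => "secret",
  host: "localhost",
  port: 6379,
  nil => "dropped",
}

fields = {}
options.each do |option, value|
  fields["redis.#{option}"] ||= value unless ignore_option?(option)
end

p fields # only the host and port entries remain
```

Because the key is normalized with to_sym, the string key "password" is filtered exactly like the symbol :password would be.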
This aims to replicate the algorithms used by redis-cli.
@see github.com/antirez/redis/blob/0f026af185e918a9773148f6ceaa1b084662be88/src/sds.c#L940-L1067
The redis-cli parsing algorithm
@see github.com/antirez/redis/blob/0f026af185e918a9773148f6ceaa1b084662be88/src/sds.c#L878-L907
The redis-cli printing algorithm
# File lib/honeycomb/integrations/redis.rb, line 197
def prettify(arg)
  pretty = arg.to_s.dup
  pretty.encode!("UTF-8", "binary", fallback: ->(c) { hex(c) })
  pretty.gsub!(NEEDS_BACKSLASH, BACKSLASH)
  pretty.gsub!(NEEDS_HEX) { |c| hex(c) }
  pretty =~ NEEDS_QUOTES ? "\"#{pretty}\"" : pretty
end
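To see the four steps working together, here is a self-contained sketch. The module name PrettifySketch is hypothetical, and the constant definitions inside it are assumed approximations of the documented constants rather than verbatim copies; only the two method bodies are taken from the source shown on this page.

```ruby
# Hypothetical namespace so the example does not clash with the library.
module PrettifySketch
  # Assumed approximations of the documented constants.
  BACKSLASH = {
    "\\" => "\\\\", '"' => '\\"', "\n" => "\\n", "\r" => "\\r",
    "\t" => "\\t", "\a" => "\\a", "\b" => "\\b",
  }.freeze
  NEEDS_BACKSLASH = /["\n\r\t\a\b]|\\(?!x\h\h)/
  NEEDS_HEX = /[^[:print:]&&[:ascii:]]/
  NEEDS_QUOTES = /[\\'\s]/

  module_function

  def hex(char)
    char.bytes.map { |b| Kernel.format("\\x%02x", b) }.join
  end

  def prettify(arg)
    pretty = arg.to_s.dup
    # 1. Reinterpret as binary; bytes with no UTF-8 mapping become \xhh.
    pretty.encode!("UTF-8", "binary", fallback: ->(c) { hex(c) })
    # 2. Backslash-escape quotes, standard escapes, and stray backslashes.
    pretty.gsub!(NEEDS_BACKSLASH, BACKSLASH)
    # 3. Hex-encode whatever is still non-printable.
    pretty.gsub!(NEEDS_HEX) { |c| hex(c) }
    # 4. Quote only when the result needs it.
    pretty =~ NEEDS_QUOTES ? "\"#{pretty}\"" : pretty
  end
end

puts PrettifySketch.prettify("hello")       # prints: hello
puts PrettifySketch.prettify("hello world") # prints: "hello world"
puts PrettifySketch.prettify('say "hi"')    # prints: "say \"hi\""
puts PrettifySketch.prettify("\u00e9")      # prints: "\xc3\xa9"
```

Because the first step converts from "binary", every byte above 0x7f goes through the fallback, so non-ASCII input ends up as hex escapes even when it was valid UTF-8 to begin with.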
# File lib/honeycomb/integrations/redis.rb, line 182
def resolve(name)
  @client.command_map.fetch(name, name).to_s
end
# File lib/honeycomb/integrations/redis.rb, line 186
def sanitize(args)
  args.map! { "[sanitized]" }
end
# File lib/honeycomb/integrations/redis.rb, line 146
def values
  @values ||= {
    "meta.package" => "redis",
    "meta.package_version" => ::Redis::VERSION,
    "redis.id" => @client.id,
    "redis.location" => @client.location,
  }
end