module Rex::Text
This module formats text in various ways and also provides a mechanism for wrapping text at a given column.
Constants
- AllChars
- Alpha
- AlphaNumeric
- Base32
- Base64
- Base64Url
- DefaultPatternSets
- DefaultWrap
- HighAscii
- Iconv_ASCII
- Iconv_EBCDIC
The Iconv translation table. The Iconv gem is deprecated in favor of String#encode, yet there is no encoding for EBCDIC. See #4525
- Iconv_IBM1047
The Iconv translation table for IBM's mainframe / System Z (z/OS, s390, MVS, etc.). This is a different implementation of EBCDIC than Iconv_EBCDIC above; it is technically referred to as Code Page IBM1047. This will be net new (until Ruby supports the 1047 code page) for all mainframe / System Z based modules that need to convert ASCII to EBCDIC. The bytes are indexed by ASCII conversion number, e.g. Iconv_IBM1047[0x41] == \xc1 for the letter “A”.
Note that the characters CANNOT be assumed to be in any logical order, nor are the tables reversible. Lookups must be done byte by byte. See gist.github.com/bigendiansmalls/b08483ecedff52cc8fa3
- Iconv_ISO8859_1
This is the reverse of the above: it converts EBCDIC -> ASCII. The bytes are indexed by IBM1047 (EBCDIC) conversion number, e.g. Iconv_ISO8859_1[0xc1] == \x41 for the letter “A”.
Note that the characters CANNOT be assumed to be in any logical (e.g. sequential) order, nor are the tables reversible. Lookups must be done byte by byte.
- LowAscii
- LowerAlpha
- Names_Female
- Names_Male
- Numerals
- Punctuation
- States
- Surnames
The 100 most common surnames, male names, and female names in the U.S. (names.mongabay.com/)
- TLDs
- UpperAlpha
Public Class Methods
Turn non-printable chars into hex representations, leaving others alone.
If whitespace is true, converts whitespace (0x20, 0x09, etc) to hex as well.
@see hexify
@see to_hex
# File lib/rex/text.rb, line 1061
def self.ascii_safe_hex(str, whitespace=false)
  if whitespace
    str.gsub(/([\x00-\x20\x80-\xFF])/n){ |x| "\\x%.2x" % x.unpack("C*")[0] }
  else
    str.gsub(/([\x00-\x08\x0b\x0c\x0e-\x1f\x80-\xFF])/n){ |x| "\\x%.2x" % x.unpack("C*")[0]}
  end
end
Base32 decoder
# File lib/rex/text.rb, line 1239
def self.b32decode(bytes_in)
  bytes = bytes_in.take_while {|c| c != 61} # strip padding
  n = (bytes.length * 5.0 / 8.0).floor
  p = bytes.length < 8 ? 5 - (n * 8) % 5 : 0
  c = bytes.inject(0) {|m,o| (m << 5) + Base32.index(o.chr)} >> p
  (0..n-1).to_a.reverse.collect {|i| ((c >> i * 8) & 0xff).chr}
end
Base32 encoder
# File lib/rex/text.rb, line 1214
def self.b32encode(bytes_in)
  n = (bytes_in.length * 8.0 / 5.0).ceil
  p = n < 8 ? 5 - (bytes_in.length * 8) % 5 : 0
  c = bytes_in.inject(0) {|m,o| (m << 8) + o} << p
  [(0..n-1).to_a.reverse.collect {|i| Base32[(c >> i * 5) & 0x1f].chr},
   ("=" * (8-n))]
end
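The per-block arithmetic in b32encode and b32decode can be tried outside of Rex with a small standalone sketch. The module name B32Sketch and its ALPHABET constant are made up for this illustration, and the alphabet is assumed to be the RFC 4648 one, which may not match the Base32 constant in every Rex version.

```ruby
# Standalone sketch of the 5-byte / 8-character block logic shown above.
# B32Sketch and ALPHABET are hypothetical names; ALPHABET is assumed to be
# the RFC 4648 Base32 alphabet.
module B32Sketch
  ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

  # Encode one block of up to 5 bytes (an array of integers).
  def self.encode_block(bytes)
    n = (bytes.length * 8.0 / 5.0).ceil               # output characters needed
    pad_bits = n < 8 ? 5 - (bytes.length * 8) % 5 : 0 # zero bits appended to fill the last character
    c = bytes.inject(0) { |m, o| (m << 8) + o } << pad_bits
    (0..n - 1).to_a.reverse.map { |i| ALPHABET[(c >> i * 5) & 0x1f] }.join + "=" * (8 - n)
  end

  # Decode one block of up to 8 characters (an array of integers).
  def self.decode_block(chars)
    chars = chars.take_while { |c| c != 61 }          # strip '=' padding
    n = (chars.length * 5.0 / 8.0).floor              # output bytes
    pad_bits = chars.length < 8 ? 5 - (n * 8) % 5 : 0
    c = chars.inject(0) { |m, o| (m << 5) + ALPHABET.index(o.chr) } >> pad_bits
    (0..n - 1).to_a.reverse.map { |i| ((c >> i * 8) & 0xff).chr }.join
  end

  # Whole-string helpers: 5 input bytes map to 8 output characters.
  def self.encode(str)
    str.bytes.each_slice(5).map { |a| encode_block(a) }.join
  end

  def self.decode(str)
    str.bytes.each_slice(8).map { |a| decode_block(a) }.join
  end
end

encoded = B32Sketch.encode("foobar")
decoded = B32Sketch.decode(encoded)
```

Slicing the input into 5-byte blocks keeps every intermediate integer at 40 bits or fewer, which is why only the final, shorter block ever needs padding.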
Return the index of the first badchar in data, otherwise return nil if there weren't any badchar occurrences.
@param data [String] The string to check for bad characters
@param badchars [String] A list of characters considered to be bad
@return [Fixnum] Index of the first bad character if any exist in data
@return [nil] If data contains no bad characters
# File lib/rex/text.rb, line 1690
def self.badchar_index(data, badchars = '')
  badchars.unpack("C*").each { |badchar|
    pos = data.index(badchar.chr)
    return pos if pos
  }
  return nil
end
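One detail of the loop above is easy to miss: it walks badchars, not data, so the value returned is the position of the first listed bad character that occurs anywhere in data. The sketch below is a standalone copy of that loop (defined at top level rather than on Rex::Text, purely for illustration) that makes the ordering visible.

```ruby
# Standalone copy of the lookup loop, to show the iteration order.
def badchar_index(data, badchars = '')
  badchars.unpack("C*").each do |badchar|
    pos = data.index(badchar.chr)
    return pos if pos
  end
  nil
end

data   = "a\nb\x00"
first  = badchar_index(data, "\x00\n")  # NUL is looked up first
second = badchar_index(data, "\n\x00")  # newline is looked up first
none   = badchar_index(data, "xyz")     # no bad characters present
```

Callers that need the lowest offset regardless of ordering would have to compare the positions of all bad characters themselves.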
Calculate the block API hash for the given module/function
@param mod [String] The name of the module containing the target function.
@param fun [String] The name of the function.
@return [String] The hash of the mod/fun pair in string format
# File lib/rex/text.rb, line 1837
def self.block_api_hash(mod, fun)
  unicode_mod = (mod.upcase + "\x00").unpack('C*').pack('v*')
  mod_hash = self.ror13_hash(unicode_mod)
  fun_hash = self.ror13_hash(fun + "\x00")
  "0x#{(mod_hash + fun_hash & 0xFFFFFFFF).to_s(16)}"
end
Returns all chars that are not in the supplied set
@param keepers [String]
@return [String] All characters not contained in keepers
# File lib/rex/text.rb, line 1717
def self.charset_exclude(keepers)
  excluded_bytes = [*(0..255)] - keepers.unpack("C*")
  excluded_bytes.pack("C*")
end
Compresses a string, eliminating all superfluous whitespace before and after lines and eliminating all newlines.
@param str [String] The string in which to crunch whitespace
@return [String] Just like str, but with repeated whitespace characters trimmed down to a single space
# File lib/rex/text.rb, line 1579
def self.compress(str)
  str.gsub(/\n/m, ' ').gsub(/\s+/, ' ').gsub(/^\s+/, '').gsub(/\s+$/, '')
end
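The four substitutions run in a fixed order: newlines become spaces, runs of whitespace collapse to one space, and then leading and trailing whitespace is dropped. A minimal sketch, using a top-level copy of the method so it runs without Rex loaded:

```ruby
# Top-level copy of the method body shown above, for illustration only.
def compress(str)
  str.gsub(/\n/m, ' ').gsub(/\s+/, ' ').gsub(/^\s+/, '').gsub(/\s+$/, '')
end

crunched = compress("  foo \n  bar\t baz  ")
```

Because every newline is replaced first, the result is always a single line.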
Converts a string to one similar to what would be used by cowsay(1), a UNIX utility for displaying text as if it was coming from an ASCII-cow's mouth:
 __________________
< the cow says moo >
 ------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
@param text [String] The string to cowsay
@param width [Fixnum] Width of the cow's cloud. Defaults to cowsay(1)'s default, 39.
# File lib/rex/text.rb, line 1143 def self.cowsay(text, width=39) # cowsay(1) chunks a message up into 39-byte chunks and wraps it in '| ' and ' |' # Rex::Text.wordwrap(text, 0, 39, ' |', '| ') almost does this, but won't # split a word that has > 39 characters in it which results in oddly formed # text in the cowsay banner, so just do it by hand. This big mess wraps # the provided text in an ASCII-cloud and then makes it look like the cloud # is a thought/word coming from the ASCII-cow. Each line in the # ASCII-cloud is no more than the specified number-characters long, and the # cloud corners are made to look rounded text_lines = text.scan(Regexp.new(".{1,#{width-4}}")) max_length = text_lines.map(&:size).sort.last cloud_parts = [] cloud_parts << " #{'_' * (max_length + 2)}" if text_lines.size == 1 cloud_parts << "< #{text} >" else cloud_parts << "/ #{text_lines.first.ljust(max_length, ' ')} \\" if text_lines.size > 2 text_lines[1, text_lines.length - 2].each do |line| cloud_parts << "| #{line.ljust(max_length, ' ')} |" end end cloud_parts << "\\ #{text_lines.last.ljust(max_length, ' ')} /" end cloud_parts << " #{'-' * (max_length + 2)}" cloud_parts << <<EOS \\ ,__, \\ (oo)____ (__) )\\ ||--|| * EOS cloud_parts.join("\n") end
# File lib/rex/text.rb, line 1247 def self.decode_base32(str) bytes = str.bytes result = '' size= 8 while bytes.any? do bytes.each_slice(size) do |a| bytes_out = b32decode(a).flatten.join result << bytes_out bytes = bytes.drop(size) end end return result end
Base64 decoder
# File lib/rex/text.rb, line 1271
def self.decode_base64(str)
  str.to_s.unpack("m")[0]
end
Base64 decoder (URL-safe RFC6920, ignores invalid characters)
# File lib/rex/text.rb, line 1287
def self.decode_base64url(str)
  decode_base64(
    str.gsub(/[^a-zA-Z0-9_\-]/, '').
      tr('-_', '+/'))
end
Convert hex-encoded characters to literals.
@example
Rex::Text.dehex("AA\\x42CC") # => "AABCC"
@see hex_to_raw
@param str [String]
# File lib/rex/text.rb, line 1329
def self.dehex(str)
  return str unless str.respond_to? :match
  return str unless str.respond_to? :gsub
  regex = /\x5cx[0-9a-f]{2}/nmi
  if str.match(regex)
    str.gsub(regex) { |x| x[2,2].to_i(16).chr }
  else
    str
  end
end
Convert and replace hex-encoded characters to literals.
@param (see dehex)
# File lib/rex/text.rb, line 1344
def self.dehex!(str)
  return str unless str.respond_to? :match
  return str unless str.respond_to? :gsub
  regex = /\x5cx[0-9a-f]{2}/nmi
  str.gsub!(regex) { |x| x[2,2].to_i(16).chr }
end
# File lib/rex/text.rb, line 1222 def self.encode_base32(str) bytes = str.bytes result = '' size= 5 while bytes.any? do bytes.each_slice(size) do |a| bytes_out = b32encode(a).flatten.join result << bytes_out bytes = bytes.drop(size) end end return result end
Base64 encoder
# File lib/rex/text.rb, line 1264
def self.encode_base64(str, delim='')
  [str.to_s].pack("m").gsub(/\s+/, delim)
end
Base64 encoder (URL-safe RFC6920)
# File lib/rex/text.rb, line 1278
def self.encode_base64url(str, delim='')
  encode_base64(str, delim).
    tr('+/', '-_').
    gsub('=', '')
end
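The URL-safe variants only post-process the plain Base64 output: '+' and '/' are swapped for '-' and '_', and '=' padding is dropped. The sketch below copies the four method bodies to top-level functions (an arrangement chosen only so the example runs without Rex) and round-trips two bytes that exercise both substituted characters.

```ruby
# Top-level copies of the four Base64 helpers shown in this section.
def encode_base64(str, delim = '')
  [str.to_s].pack("m").gsub(/\s+/, delim)
end

def decode_base64(str)
  str.to_s.unpack("m")[0]
end

def encode_base64url(str, delim = '')
  encode_base64(str, delim).tr('+/', '-_').gsub('=', '')
end

def decode_base64url(str)
  # Anything outside the URL-safe alphabet is discarded before decoding.
  decode_base64(str.gsub(/[^a-zA-Z0-9_\-]/, '').tr('-_', '+/'))
end

raw     = "\xfb\xff".b             # bytes chosen so '+' and '/' both appear
plain   = encode_base64(raw)
urlsafe = encode_base64url(raw)
back    = decode_base64url(urlsafe)
```

The decoder relies on the non-strict "m" unpack directive accepting input whose padding has been removed.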
A native implementation of the EBCDIC to ASCII conversion table, since EBCDIC isn't available to String#encode as of Ruby 2.1
@param str [String] an EBCDIC encoded string
@return [String] An encodable ASCII string
@note This method will raise in the event of invalid characters
# File lib/rex/text.rb, line 491
def self.from_ebcdic(str)
  new_str = []
  str.each_byte do |x|
    if Iconv_EBCDIC.index(x.chr)
      new_str << Iconv_ASCII[Iconv_EBCDIC.index(x.chr)]
    else
      raise Rex::Text::IllegalSequence, ("\\x%x" % x)
    end
  end
  new_str.join
end
Same idea as from_ebcdic, except strictly for z/OS conversions: IBM1047 -> ISO8859-1.
A native implementation of the IBM1047 (EBCDIC) -> ISO8859-1 (ASCII) conversion table, since EBCDIC isn't available to String#encode as of Ruby 2.1
# File lib/rex/text.rb, line 527
def self.from_ibm1047(str)
  return str if str.nil?
  new_str = []
  str.each_byte do |x|
    new_str << Iconv_ISO8859_1[x.ord]
  end
  new_str.join
end
Compresses a string using gzip
@param str (see zlib_deflate)
@param level [Fixnum] Compression level, 1 (fast) to 9 (best)
@return (see zlib_deflate)
# File lib/rex/text.rb, line 1654
def self.gzip(str, level = 9)
  raise RuntimeError, "Gzip support is not present." if (!zlib_present?)
  raise RuntimeError, "Invalid gzip compression level" if (level < 1 or level > 9)

  s = ""
  s.force_encoding('ASCII-8BIT') if s.respond_to?(:encoding)
  gz = Zlib::GzipWriter.new(StringIO.new(s, 'wb'), level)
  gz << str
  gz.close
  return s
end
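The same GzipWriter-over-StringIO approach can be checked with nothing but the standard library. In the sketch below, gzip_sketch is a hypothetical stand-in for the method above (without the zlib_present? guard, which is not shown in this section), and the result is read back with Zlib::GzipReader.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical stand-in for Rex::Text.gzip, using only the standard library.
def gzip_sketch(str, level = 9)
  raise RuntimeError, "Invalid gzip compression level" if level < 1 || level > 9

  s = String.new                      # mutable binary (ASCII-8BIT) buffer
  gz = Zlib::GzipWriter.new(StringIO.new(s, 'wb'), level)
  gz << str
  gz.close                            # flushes the gzip trailer into s
  s
end

original   = "A" * 1000
compressed = gzip_sketch(original)
restored   = Zlib::GzipReader.new(StringIO.new(compressed)).read
```

Closing the writer matters: until close is called, the gzip trailer has not been written and the buffer is not a complete stream.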
backwards compat for just a bit…
# File lib/rex/text.rb, line 1610 def self.gzip_present? self.zlib_present? end
Converts a hex string to a raw string
@example
Rex::Text.hex_to_raw("\\x41\\x7f\\x42") # => "A\x7fB"
# File lib/rex/text.rb, line 1048
def self.hex_to_raw(str)
  [ str.downcase.gsub(/'/,'').gsub(/\\?x([a-f0-9][a-f0-9])/, '\1') ].pack("H*")
end
Converts a string to a hex version with wrapping support
# File lib/rex/text.rb, line 1080 def self.hexify(str, col = DefaultWrap, line_start = '', line_end = '', buf_start = '', buf_end = '') output = buf_start cur = 0 count = 0 new_line = true # Go through each byte in the string str.each_byte { |byte| count += 1 append = '' # If this is a new line, prepend with the # line start text if (new_line == true) append << line_start new_line = false end # Append the hexified version of the byte append << sprintf("\\x%.2x", byte) cur += append.length # If we're about to hit the column or have gone past it, # time to finish up this line if ((cur + line_end.length >= col) or (cur + buf_end.length >= col)) new_line = true cur = 0 # If this is the last byte, use the buf_end instead of # line_end if (count == str.length) append << buf_end + "\n" else append << line_end + "\n" end end output << append } # If we were in the middle of a line, finish the buffer at this point if (new_line == false) output << buf_end + "\n" end return output end
Decode a string that's html encoded
# File lib/rex/text.rb, line 934 def self.html_decode(str) decoded_str = CGI.unescapeHTML(str) return decoded_str end
Encode a string in a manner useful for HTTP URIs and URI Parameters.
@param str [String] The string to be encoded
@param mode [“hex”,“int”,“int-wide”]
@return [String]
@raise [TypeError] if mode is not one of the three available modes
# File lib/rex/text.rb, line 918
def self.html_encode(str, mode = 'hex')
  case mode
  when 'hex'
    return str.unpack('C*').collect{ |i| "&#x" + ("%.2x" % i) + ";"}.join
  when 'int'
    return str.unpack('C*').collect{ |i| "&#" + i.to_s + ";"}.join
  when 'int-wide'
    return str.unpack('C*').collect{ |i| "&#" + ("0" * (7 - i.to_s.length)) + i.to_s + ";" }.join
  else
    raise TypeError, 'invalid mode'
  end
end
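To make the three modes concrete, the sketch below copies the method body to a top-level function (so it runs without Rex) and encodes the same two-byte string in each mode.

```ruby
# Top-level copy of html_encode, for illustration only.
def html_encode(str, mode = 'hex')
  case mode
  when 'hex'
    str.unpack('C*').collect { |i| "&#x" + ("%.2x" % i) + ";" }.join
  when 'int'
    str.unpack('C*').collect { |i| "&#" + i.to_s + ";" }.join
  when 'int-wide'
    # Zero-pad each decimal value to seven digits.
    str.unpack('C*').collect { |i| "&#" + ("0" * (7 - i.to_s.length)) + i.to_s + ";" }.join
  else
    raise TypeError, 'invalid mode'
  end
end

hex_form  = html_encode("A<")
int_form  = html_encode("A<", 'int')
wide_form = html_encode("A<", 'int-wide')
```

Every byte is encoded, including plain alphanumerics, so the output is several times longer than the input.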
Hexadecimal MD5 digest of the supplied string
# File lib/rex/text.rb, line 1303
def self.md5(str)
  Digest::MD5.hexdigest(str)
end
Raw MD5 digest of the supplied string
# File lib/rex/text.rb, line 1296 def self.md5_raw(str) Digest::MD5.digest(str) end
Pack a value as 64 bit little endian; does not exist for Array.pack
# File lib/rex/text.rb, line 1899
def self.pack_int64le(val)
  [val & 0x00000000ffffffff, val >> 32].pack("V2")
end
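The method splits the value into its low and high 32-bit halves and packs each as a little-endian dword. On Ruby versions that support the "Q<" pack directive, the result can be cross-checked against it, as in this sketch (the top-level function is a copy made for illustration):

```ruby
# Top-level copy of pack_int64le, for illustration only.
def pack_int64le(val)
  [val & 0x00000000ffffffff, val >> 32].pack("V2")
end

val     = 0x1122334455667788
manual  = pack_int64le(val)   # low dword first, then high dword
builtin = [val].pack("Q<")    # 64-bit little-endian directive
```

Both strings hold the least significant byte first.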
Step through an arbitrary number of sets of bytes to build up a findable pattern. This is mostly useful for experimentally determining offset lengths into memory structures. Note that the supplied sets should never contain duplicate bytes, or else it can become impossible to measure the offset accurately.
# File lib/rex/text.rb, line 1535 def self.patt2(len, sets = nil) buf = "" counter = [] sets ||= [ UpperAlpha, LowerAlpha, Numerals ] len ||= len.to_i return "" if len.zero? sets = sets.map {|a| a.split(//)} sets.size.times { counter << 0} 0.upto(len-1) do |i| setnum = i % sets.size #puts counter.inspect end return buf end
Creates a pattern that can be used for offset calculation purposes. This routine is capable of generating patterns using a supplied set and a supplied number of identifiable characters (slots). The supplied sets should not contain any duplicate characters or the logic will fail.
@param length [Fixnum]
@param sets [Array<(String,String,String)>] The character sets to choose from. Should have 3 elements, each of which must be a string containing no characters contained in the other sets.
@return [String] A pattern of length bytes, in which any 4-byte chunk is unique
@see pattern_offset
# File lib/rex/text.rb, line 1504 def self.pattern_create(length, sets = nil) buf = '' offsets = [] # Make sure there's something in sets even if we were given an explicit nil sets ||= [ UpperAlpha, LowerAlpha, Numerals ] # Return stupid uses return "" if length.to_i < 1 return sets[0][0].chr * length if sets.size == 1 and sets[0].size == 1 sets.length.times { offsets << 0 } until buf.length >= length begin buf << converge_sets(sets, 0, offsets, length) end end # Maximum permutations reached, but we need more data if (buf.length < length) buf = buf * (length / buf.length.to_f).ceil end buf[0,length] end
Calculate the offset to a pattern
@param pattern [String] The pattern to search. Usually the return value from {.pattern_create}
@param value [String,Fixnum,Bignum]
@return [Fixnum] Index of the given value within pattern, if it exists
@return [nil] if pattern does not contain value
@see pattern_create
# File lib/rex/text.rb, line 1562
def self.pattern_offset(pattern, value, start=0)
  if (value.kind_of?(String))
    pattern.index(value, start)
  elsif (value.kind_of?(Fixnum) or value.kind_of?(Bignum))
    pattern.index([ value ].pack('V'), start)
  else
    raise ::ArgumentError, "Invalid class for value: #{value.class}"
  end
end
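The create/offset workflow can be sketched without Rex. The generator below is not pattern_create: converge_sets is not shown in this section, so a plain nested loop over upper-case, lower-case and digit sets is substituted, on the assumption that it resembles the default pattern. All names (pattern_sketch, offset_sketch, UPPER, LOWER, DIGITS) are hypothetical, and Integer is used in place of Fixnum/Bignum.

```ruby
UPPER  = ('A'..'Z').to_a
LOWER  = ('a'..'z').to_a
DIGITS = ('0'..'9').to_a

# Hypothetical generator: emits "Aa0Aa1Aa2..." until length is reached.
def pattern_sketch(length)
  buf = +''
  UPPER.each do |u|
    LOWER.each do |l|
      DIGITS.each do |d|
        buf << u << l << d
        return buf[0, length] if buf.length >= length
      end
    end
  end
  buf[0, length]
end

# Hypothetical lookup mirroring pattern_offset: integers are treated as
# little-endian dwords, as they would appear in a register dump.
def offset_sketch(pattern, value, start = 0)
  needle = value.is_a?(Integer) ? [value].pack('V') : value
  pattern.index(needle, start)
end

pattern     = pattern_sketch(64)
by_string   = offset_sketch(pattern, "a1Aa")
by_register = offset_sketch(pattern, 0x61413161)  # bytes "a1Aa" read as a little-endian dword
```

Passing the register value directly avoids having to reverse the byte order by hand.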
Permute the case of a word
# File lib/rex/text.rb, line 1755 def self.permute_case(word, idx=0) res = [] if( (UpperAlpha+LowerAlpha).index(word[idx,1])) word_ucase = word.dup word_ucase[idx, 1] = word[idx, 1].upcase word_lcase = word.dup word_lcase[idx, 1] = word[idx, 1].downcase if (idx == word.length) return [word] else res << permute_case(word_ucase, idx+1) res << permute_case(word_lcase, idx+1) end else res << permute_case(word, idx+1) end res.flatten end
Generate a valid random 4 byte UTF-8 character. Valid codepoints for 4-byte UTF-8 chars: U+010000 - U+10FFFF
@example
Rex::Text.rand_4byte_utf8 # => "\u{108CF3}"
@return [String]
# File lib/rex/text.rb, line 1487 def self.rand_4byte_utf8 [rand(0x10000..0x10ffff)].pack('U*') end
Base text generator method
# File lib/rex/text.rb, line 1364
def self.rand_base(len, bad, *foo)
  cset = (foo.join.unpack("C*") - bad.to_s.unpack("C*")).uniq
  return "" if cset.length == 0
  outp = []
  len.times { outp << cset[rand(cset.length)] }
  outp.pack("C*")
end
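Every rand_text_* helper in this module funnels into rand_base, which subtracts the bad bytes from the candidate set and then samples from what is left. A minimal sketch using a top-level copy of the method:

```ruby
# Top-level copy of rand_base, for illustration only.
def rand_base(len, bad, *foo)
  cset = (foo.join.unpack("C*") - bad.to_s.unpack("C*")).uniq
  return "" if cset.length == 0   # nothing left to choose from
  outp = []
  len.times { outp << cset[rand(cset.length)] }
  outp.pack("C*")
end

sample = rand_base(16, "ab", *('a'..'f').to_a)  # only c..f can appear
empty  = rand_base(16, "abc", "a", "b", "c")    # every candidate is bad
```

Note that an empty string comes back, rather than an error, when the bad characters eliminate the whole set.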
Generates a random character.
# File lib/rex/text.rb, line 1359 def self.rand_char(bad, chars = AllChars) rand_text(1, bad, chars) end
Generate a random GUID
@example
Rex::Text.rand_guid # => "{ca776ced-4ab8-2ed6-6510-aa71e5e2508e}"
@return [String]
# File lib/rex/text.rb, line 1451 def self.rand_guid "{#{[8,4,4,4,12].map {|a| rand_text_hex(a) }.join("-")}}" end
Generate a random hostname
@return [String] A random string conforming to the rules of FQDNs
# File lib/rex/text.rb, line 1782 def self.rand_hostname host = [] (rand(5) + 1).times { host.push(Rex::Text.rand_text_alphanumeric(rand(10) + 1)) } host.push(TLDs.sample) host.join('.').downcase end
Generate a random mail address
# File lib/rex/text.rb, line 1821 def self.rand_mail_address mail_address = '' mail_address << Rex::Text.rand_name mail_address << '.' mail_address << Rex::Text.rand_surname mail_address << '@' mail_address << Rex::Text.rand_hostname end
Generate a name
# File lib/rex/text.rb, line 1802 def self.rand_name if rand(10) % 2 == 0 Names_Male.sample else Names_Female.sample end end
Generate a female name
# File lib/rex/text.rb, line 1816 def self.rand_name_female Names_Female.sample end
Generate a male name
# File lib/rex/text.rb, line 1811 def self.rand_name_male Names_Male.sample end
Generate a state
# File lib/rex/text.rb, line 1792 def self.rand_state() States.sample end
Generate a surname
# File lib/rex/text.rb, line 1797 def self.rand_surname Surnames.sample end
Generate random bytes of data
# File lib/rex/text.rb, line 1373 def self.rand_text(len, bad='', chars = AllChars) foo = chars.split('') rand_base(len, bad, *foo) end
Generate random bytes of alpha data
# File lib/rex/text.rb, line 1379 def self.rand_text_alpha(len, bad='') foo = [] foo += ('A' .. 'Z').to_a foo += ('a' .. 'z').to_a rand_base(len, bad, *foo ) end
Generate random bytes of lowercase alpha data
# File lib/rex/text.rb, line 1387 def self.rand_text_alpha_lower(len, bad='') rand_base(len, bad, *('a' .. 'z').to_a) end
Generate random bytes of uppercase alpha data
# File lib/rex/text.rb, line 1392 def self.rand_text_alpha_upper(len, bad='') rand_base(len, bad, *('A' .. 'Z').to_a) end
Generate random bytes of alphanumeric data
# File lib/rex/text.rb, line 1397 def self.rand_text_alphanumeric(len, bad='') foo = [] foo += ('A' .. 'Z').to_a foo += ('a' .. 'z').to_a foo += ('0' .. '9').to_a rand_base(len, bad, *foo ) end
Generate random bytes of base64 data
# File lib/rex/text.rb, line 1434 def self.rand_text_base64(len, bad='') foo = Base64.unpack('C*').map{ |c| c.chr } rand_base(len, bad, *foo ) end
Generate random bytes of base64url data
# File lib/rex/text.rb, line 1440 def self.rand_text_base64url(len, bad='') foo = Base64Url.unpack('C*').map{ |c| c.chr } rand_base(len, bad, *foo ) end
Generate random bytes of english-like data
# File lib/rex/text.rb, line 1420 def self.rand_text_english(len, bad='') foo = [] foo += (0x21 .. 0x7e).map{ |c| c.chr } rand_base(len, bad, *foo ) end
Generate random bytes of alphanumeric hex.
# File lib/rex/text.rb, line 1406 def self.rand_text_hex(len, bad='') foo = [] foo += ('0' .. '9').to_a foo += ('a' .. 'f').to_a rand_base(len, bad, *foo) end
Generate random bytes of high ascii data
# File lib/rex/text.rb, line 1427 def self.rand_text_highascii(len, bad='') foo = [] foo += (0x80 .. 0xff).map{ |c| c.chr } rand_base(len, bad, *foo ) end
Generate random bytes of numeric data
# File lib/rex/text.rb, line 1414 def self.rand_text_numeric(len, bad='') foo = ('0' .. '9').to_a rand_base(len, bad, *foo ) end
Randomize the whitespace in a string
# File lib/rex/text.rb, line 1586 def self.randomize_space(str) set = ["\x09", "\x20", "\x0d", "\x0a"] str.gsub(/\s+/) { |s| len = rand(50)+2 buf = '' while (buf.length < len) buf << set.sample end buf } end
Removes noise from 2 Strings and returns a refined String version.
# File lib/rex/text.rb, line 550 def self.refine( str1, str2 ) return str1 if str1 == str2 # get the words of the first str in an array s_words = to_words( str1 ) # get the words of the second str in an array o_words = to_words( str2 ) # get what hasn't changed (the rdiff, so to speak) as a string (s_words - (s_words - o_words)).join end
Removes bad characters from a string.
Modifies data in place
@param data [#delete]
@param badchars [String] A list of characters considered to be bad
# File lib/rex/text.rb, line 1705
def self.remove_badchars(data, badchars = '')
  return data if badchars.length == 0
  badchars_pat = badchars.unpack("C*").map{|c| "\\x%.2x" % c}.join
  data.gsub!(/[#{badchars_pat}]/n, '')
  data
end
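Since the method uses gsub!, the caller's string is changed and the same object is returned. The sketch below (a top-level copy for illustration) shows both effects:

```ruby
# Top-level copy of remove_badchars, for illustration only.
def remove_badchars(data, badchars = '')
  return data if badchars.length == 0
  # Build a character class such as [\x6c\x6f] from the bad bytes.
  badchars_pat = badchars.unpack("C*").map { |c| "\\x%.2x" % c }.join
  data.gsub!(/[#{badchars_pat}]/n, '')
  data
end

buf    = +"hello world"            # unary plus yields a mutable string
result = remove_badchars(buf, "lo")
```

Callers that need to keep the original should pass a dup.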
Rotate a 32-bit value to the left by cnt bits
@param val (see ror)
@param cnt (see ror)
@return (see ror)
# File lib/rex/text.rb, line 1873
def self.rol(val, cnt)
  bits = [val].pack("N").unpack("B32")[0].split(//)
  1.upto(cnt) do |c|
    bits.push( bits.shift )
  end
  [bits.join].pack("B32").unpack("N")[0]
end
Rotate a 32-bit value to the right by cnt bits
@param val [Fixnum] The value to rotate
@param cnt [Fixnum] Number of bits to rotate by
# File lib/rex/text.rb, line 1859
def self.ror(val, cnt)
  bits = [val].pack("N").unpack("B32")[0].split(//)
  1.upto(cnt) do |c|
    bits.unshift( bits.pop )
  end
  [bits.join].pack("B32").unpack("N")[0]
end
Calculate the ROR13 hash of a given string
@return [Fixnum]
# File lib/rex/text.rb, line 1848
def self.ror13_hash(name)
  hash = 0
  name.unpack("C*").each {|c| hash = ror(hash, 13); hash += c }
  hash
end
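The rotate-then-add loop can be followed by hand for short inputs. The sketch below swaps the bit-string rotation used by ror and rol for plain shift-and-mask arithmetic, which is a different technique assumed to give the same 32-bit results; ror32, rol32, ror13_sketch and MASK32 are names invented for the example.

```ruby
MASK32 = 0xFFFFFFFF

# Rotate right within 32 bits using shifts instead of a bit-string.
def ror32(val, cnt)
  val &= MASK32
  cnt %= 32
  ((val >> cnt) | (val << (32 - cnt))) & MASK32
end

# Rotate left within 32 bits.
def rol32(val, cnt)
  val &= MASK32
  cnt %= 32
  ((val << cnt) | (val >> (32 - cnt))) & MASK32
end

# Rotate the running hash right by 13, then add the next byte.
def ror13_sketch(name)
  name.unpack("C*").inject(0) { |hash, c| ror32(hash, 13) + c }
end

one_char  = ror13_sketch("A")   # ror(0, 13) + 0x41
two_chars = ror13_sketch("AB")  # ror(0x41, 13) + 0x42
```

As in the listing above, the final addition is not masked, which is why block_api_hash applies & 0xFFFFFFFF to the sum.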
Hexadecimal SHA1 digest of the supplied string
# File lib/rex/text.rb, line 1317
def self.sha1(str)
  Digest::SHA1.hexdigest(str)
end
Raw SHA1 digest of the supplied string
# File lib/rex/text.rb, line 1310 def self.sha1_raw(str) Digest::SHA1.digest(str) end
Performs a Fisher-Yates shuffle on an array
Modifies arr in place
@param arr [Array] The array to be shuffled
@return [Array]
# File lib/rex/text.rb, line 1739
def self.shuffle_a(arr)
  len = arr.length
  max = len - 1
  cyc = [* (0..max) ]
  for d in cyc
    e = rand(d+1)
    next if e == d
    f = arr[d];
    g = arr[e];
    arr[d] = g;
    arr[e] = f;
  end
  return arr
end
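The loop visits each index d in ascending order and swaps it with a random index e in 0..d. A compact sketch of the same walk, written with parallel assignment instead of temporaries (shuffle_sketch is a hypothetical name):

```ruby
# Same index walk as shuffle_a, using parallel assignment for the swap.
def shuffle_sketch(arr)
  (0...arr.length).each do |d|
    e = rand(d + 1)             # random index in 0..d
    next if e == d
    arr[d], arr[e] = arr[e], arr[d]
  end
  arr
end

input    = (1..10).to_a
shuffled = shuffle_sketch(input)
```

The array is permuted in place, so input and shuffled refer to the same object.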
Shuffles a byte stream
@param str [String]
@return [String] The shuffled result
@see shuffle_a
# File lib/rex/text.rb, line 1728
def self.shuffle_s(str)
  shuffle_a(str.unpack("C*")).pack("C*")
end
Split a string into an array of n-character chunks
# File lib/rex/text.rb, line 1884
def self.split_to_a(str, n)
  if n > 0
    s = str.dup
    until s.empty?
      (ret ||= []).push s.slice!(0, n)
    end
  else
    ret = str
  end
  ret
end
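The return type depends on the arguments, which the listing makes easy to overlook: a positive n yields an array, a non-positive n hands back the original string, and an empty input with a positive n yields nil because ret is never assigned. A sketch with a top-level copy of the method:

```ruby
# Top-level copy of split_to_a, for illustration only.
def split_to_a(str, n)
  if n > 0
    s = str.dup
    until s.empty?
      (ret ||= []).push s.slice!(0, n)
    end
  else
    ret = str
  end
  ret
end

chunks    = split_to_a("abcdefg", 3)
unchanged = split_to_a("abc", 0)
nothing   = split_to_a("", 2)
```

Callers that always expect an array may want to wrap the result with Array().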
Converts a unicode string to standard ASCII text.
# File lib/rex/text.rb, line 836 def self.to_ascii(str='', type = 'utf-16le', mode = '', size = '') return '' if not str case type when 'utf-16le' return str.unpack('v*').pack('C*') when 'utf-16be' return str.unpack('n*').pack('C*') when 'utf-32le' return str.unpack('V*').pack('C*') when 'utf-32be' return str.unpack('N*').pack('C*') when 'utf-7' raise TypeError, 'invalid utf type, not yet implemented' when 'utf-8' raise TypeError, 'invalid utf type, not yet implemented' when 'uhwtfms' # suggested name from HD :P raise TypeError, 'invalid utf type, not yet implemented' when 'uhwtfms-half' # suggested name from HD :P raise TypeError, 'invalid utf type, not yet implemented' else raise TypeError, 'invalid utf type' end end
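For the four fixed-width types, the conversion is just an unpack with one integer width and a pack with another. The sketch below reproduces the utf-16le and utf-16be cases with plain String#unpack and Array#pack, the same calls the method bodies use, so it runs without Rex.

```ruby
ascii = "ABC"

# Widen each byte to a 16-bit unit (what to_unicode does for these types).
utf16le = ascii.unpack('C*').pack('v*')
utf16be = ascii.unpack('C*').pack('n*')

# Narrow each 16-bit unit back to a byte (what to_ascii does).
restored = utf16le.unpack('v*').pack('C*')
```

This is a byte-widening trick rather than a real transcoding step, so it is only lossless for code points that fit in a single byte.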
Converts a raw string into a Bash buffer
# File lib/rex/text.rb, line 357 def self.to_bash(str, wrap = DefaultWrap, name = "buf") return hexify(str, wrap, '$\'', '\'\\', "export #{name}=\\\n", '\'') end
Creates a Bash-style comment
# File lib/rex/text.rb, line 443 def self.to_bash_comment(str, wrap = DefaultWrap) return wordwrap(str, 0, wrap, '', '# ') end
Converts a raw string into a C buffer
# File lib/rex/text.rb, line 311 def self.to_c(str, wrap = DefaultWrap, name = "buf") return hexify(str, wrap, '"', '"', "unsigned char #{name}[] = \n", '";') end
Creates a c-style comment
# File lib/rex/text.rb, line 329 def self.to_c_comment(str, wrap = DefaultWrap) return "/*\n" + wordwrap(str, 0, wrap, '', ' * ') + " */\n" end
# File lib/rex/text.rb, line 315 def self.to_csharp(str, wrap = DefaultWrap, name = "buf") ret = "byte[] #{name} = new byte[#{str.length}] {" i = -1; while (i += 1) < str.length ret << "\n" if i%(wrap/4) == 0 ret << "0x" << str[i].unpack("H*")[0] << "," end ret = ret[0..ret.length-2] #cut off last comma ret << " };\n" end
Creates a comma separated list of dwords
# File lib/rex/text.rb, line 276 def self.to_dword(str, wrap = DefaultWrap) code = str alignnr = str.length % 4 if (alignnr > 0) code << "\x00" * (4 - alignnr) end codevalues = Array.new code.split("").each_slice(4) do |chars4| chars4 = chars4.join("") dwordvalue = chars4.unpack('*V') codevalues.push(dwordvalue[0]) end buff = "" 0.upto(codevalues.length-1) do |byte| if(byte % 8 == 0) and (buff.length > 0) buff << "\r\n" end buff << sprintf('0x%.8x, ', codevalues[byte]) end # strip , at the end buff = buff.chomp(', ') buff << "\r\n" return buff end
A native implementation of the ASCII to EBCDIC conversion table, since EBCDIC isn't available to String#encode as of Ruby 2.1
@param str [String] An encodable ASCII string @return [String] an EBCDIC encoded string @note This method will raise in the event of invalid characters
# File lib/rex/text.rb, line 473 def self.to_ebcdic(str) new_str = [] str.each_byte do |x| if Iconv_ASCII.index(x.chr) new_str << Iconv_EBCDIC[Iconv_ASCII.index(x.chr)] else raise Rex::Text::IllegalSequence, ("\\x%x" % x) end end new_str.join end
Convert 16-byte string to a GUID string
@example
str = "ABCDEFGHIJKLMNOP"
Rex::Text.to_guid(str) #=> "{44434241-4645-4847-494a-4b4c4d4e4f50}"
@param bytes [String] 16 bytes which represent a GUID in the proper order.
@return [String]
# File lib/rex/text.rb, line 1466
def self.to_guid(bytes)
  return nil unless bytes
  s = bytes.unpack('H*')[0]
  parts = [
    s[6, 2] + s[4, 2] + s[2, 2] + s[0, 2],
    s[10, 2] + s[8, 2],
    s[14, 2] + s[12, 2],
    s[16, 4],
    s[20, 12]
  ]
  "{#{parts.join('-')}}"
end
Returns the escaped hex version of the supplied string
@example
Rex::Text.to_hex("asdf") # => "\\x61\\x73\\x64\\x66"
@param str (see to_octal)
@param prefix (see to_octal)
@param count [Fixnum] Number of bytes to put in each escape chunk
@return [String] The escaped hex version of str
# File lib/rex/text.rb, line 613
def self.to_hex(str, prefix = "\\x", count = 1)
  raise ::RuntimeError, "unable to chunk into #{count} byte chunks" if ((str.length % count) > 0)

  # XXX: Regexp.new is used here since using /.{#{count}}/o would compile
  # the regex the first time it is used and never check again. Since we
  # want to know how many to capture on every instance, we do it this
  # way.
  return str.unpack('H*')[0].gsub(Regexp.new(".{#{count * 2}}", nil, 'n')) { |s| prefix + s }
end
Returns the string with nonprintable hex characters sanitized to ascii. Similar to {.to_hex}, but regular ASCII is not translated if count is 1.
@example
Rex::Text.to_hex_ascii("\x7fABC\0") # => "\\x7fABC\\x00"
@param str (see to_hex)
@param prefix (see to_hex)
@param count (see to_hex)
@param suffix [String,nil] A string to append to the converted bytes
@return [String] The original string with non-printables converted to their escaped hex representation
# File lib/rex/text.rb, line 636
def self.to_hex_ascii(str, prefix = "\\x", count = 1, suffix=nil)
  raise ::RuntimeError, "unable to chunk into #{count} byte chunks" if ((str.length % count) > 0)
  return str.unpack('H*')[0].gsub(Regexp.new(".{#{count * 2}}", nil, 'n')) { |s|
    (0x20..0x7e) === s.to_i(16) ? s.to_i(16).chr : prefix + s + suffix.to_s
  }
end
Converts a string to a nicely formatted hex dump
@param str [String] The string to convert @param width [Fixnum] Number of bytes to convert before adding a newline @param base [Fixnum] The base address of the dump
# File lib/rex/text.rb, line 1002 def self.to_hex_dump(str, width=16, base=nil) buf = '' idx = 0 cnt = 0 snl = false lst = 0 lft_col_len = (base.to_i+str.length).to_s(16).length lft_col_len = 8 if lft_col_len < 8 while (idx < str.length) chunk = str[idx, width] addr = base ? "%0#{lft_col_len}x " %(base.to_i + idx) : '' line = chunk.unpack("H*")[0].scan(/../).join(" ") buf << addr + line if (lst == 0) lst = line.length buf << " " * 4 else buf << " " * ((lst - line.length) + 4).abs end buf << "|" chunk.unpack("C*").each do |c| if (c > 0x1f and c < 0x7f) buf << c.chr else buf << "." end end buf << "|\n" idx += width end buf << "\n" end
Same idea as to_ebcdic, except strictly for z/OS conversions: ISO8859-1 -> IBM1047.
A native implementation of the ISO8859-1 (ASCII) -> IBM1047 (EBCDIC) conversion table, since EBCDIC isn't available to String#encode as of Ruby 2.1. All 256 bytes are defined.
# File lib/rex/text.rb, line 511
def self.to_ibm1047(str)
  return str if str.nil?
  new_str = []
  str.each_byte do |x|
    new_str << Iconv_IBM1047[x.ord]
  end
  new_str.join
end
Converts a raw string into a java byte array
# File lib/rex/text.rb, line 364 def self.to_java(str, name = "shell") buff = "byte #{name}[] = new byte[]\n{\n" cnt = 0 max = 0 str.unpack('C*').each do |c| buff << ", " if max > 0 buff << "\t" if max == 0 buff << sprintf('(byte) 0x%.2x', c) max +=1 cnt +=1 if (max > 7) buff << ",\n" if cnt != str.length max = 0 end end buff << "\n};\n" return buff end
Creates a javascript-style comment
# File lib/rex/text.rb, line 336 def self.to_js_comment(str, wrap = DefaultWrap) return wordwrap(str, 0, wrap, '', '// ') end
Takes a string, and returns an array of all mixed case versions.
@example
>> Rex::Text.to_mixed_case_array "abc1"
=> ["abc1", "abC1", "aBc1", "aBC1", "Abc1", "AbC1", "ABc1", "ABC1"]
@param str [String] The string to randomize
@return [Array<String>]
@see permute_case
# File lib/rex/text.rb, line 981 def self.to_mixed_case_array(str) letters = [] str.scan(/./).each { |l| letters << [l.downcase, l.upcase] } coords = [] (1 << str.size).times { |i| coords << ("%0#{str.size}b" % i) } mixed = [] coords.each do |coord| c = coord.scan(/./).map {|x| x.to_i} this_str = "" c.each_with_index { |d,i| this_str << letters[i][d] } mixed << this_str end return mixed.uniq end
Creates a comma separated list of numbers
# File lib/rex/text.rb, line 258 def self.to_num(str, wrap = DefaultWrap) code = str.unpack('C*') buff = "" 0.upto(code.length-1) do |byte| if(byte % 15 == 0) and (buff.length > 0) buff << "\r\n" end buff << sprintf('0x%.2x, ', code[byte]) end # strip , at the end buff = buff.chomp(', ') buff << "\r\n" return buff end
Returns the escaped octal version of the supplied string
@example
Rex::Text.to_octal("asdf") # => "\\141\\163\\144\\146"
@param str [String] The string to be converted
@param prefix [String]
@return [String] The escaped octal version of str
# File lib/rex/text.rb, line 594
def self.to_octal(str, prefix = "\\")
  octal = ""
  str.each_byte { |b|
    octal << "#{prefix}#{b.to_s 8}"
  }
  return octal
end
Converts a raw string into a perl buffer
# File lib/rex/text.rb, line 343 def self.to_perl(str, wrap = DefaultWrap, name = "buf") return hexify(str, wrap, '"', '" .', "my $#{name} = \n", '";') end
Creates a perl-style comment
# File lib/rex/text.rb, line 436 def self.to_perl_comment(str, wrap = DefaultWrap) return wordwrap(str, 0, wrap, '', '# ') end
Converts a raw string to a powershell byte array
# File lib/rex/text.rb, line 387 def self.to_powershell(str, name = "buf") return Rex::Powershell::Script.to_byte_array(str, name) end
Converts a raw string into a python buffer
# File lib/rex/text.rb, line 350 def self.to_python(str, wrap = DefaultWrap, name = "buf") return hexify(str, wrap, "#{name} += \"", '"', "#{name} = \"\"\n", '"') end
Converts a string to random case
@example
Rex::Text.to_rand_case("asdf") # => "asDf"
@param str [String] The string to randomize
@return [String]
@see permute_case
@see to_mixed_case_array
# File lib/rex/text.rb, line 963
def self.to_rand_case(str)
  buf = str.dup
  0.upto(str.length) do |i|
    buf[i,1] = rand(2) == 0 ? str[i,1].upcase : str[i,1].downcase
  end
  return buf
end
Returns the raw string
# File lib/rex/text.rb, line 450 def self.to_raw(str) return str end
Converts a raw string into a ruby buffer
# File lib/rex/text.rb, line 251 def self.to_ruby(str, wrap = DefaultWrap, name = "buf") return hexify(str, wrap, '"', '" +', "#{name} = \n", '"') end
Creates a ruby-style comment
# File lib/rex/text.rb, line 304 def self.to_ruby_comment(str, wrap = DefaultWrap) return wordwrap(str, 0, wrap, '', '# ') end
Returns a unicode escaped string for Javascript
# File lib/rex/text.rb, line 566 def self.to_unescape(data, endian=ENDIAN_LITTLE, prefix='%%u') data << "\x41" if (data.length % 2 != 0) dptr = 0 buff = '' while (dptr < data.length) c1 = data[dptr,1].unpack("C*")[0] dptr += 1 c2 = data[dptr,1].unpack("C*")[0] dptr += 1 if (endian == ENDIAN_LITTLE) buff << sprintf("#{prefix}%.2x%.2x", c2, c1) else buff << sprintf("#{prefix}%.2x%.2x", c1, c2) end end return buff end
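Two details of to_unescape are worth seeing in isolation: the default prefix is written as '%%u' because it is interpolated into an sprintf format string, and odd-length input is padded with "A" before pairing. The sketch below is a condensed rewrite rather than the original loop; the constant names and their values are hypothetical stand-ins for ENDIAN_LITTLE and its counterpart, and unlike the original it pads a copy instead of modifying the caller's string.

```ruby
# Hypothetical stand-ins for the Rex endianness constants.
ENDIAN_LITTLE_SKETCH = 0
ENDIAN_BIG_SKETCH    = 1

def to_unescape_sketch(data, endian = ENDIAN_LITTLE_SKETCH, prefix = '%%u')
  data = data.dup                         # the original pads in place instead
  data << "\x41" if data.length.odd?      # pad to a whole number of 16-bit units
  data.unpack('C*').each_slice(2).map do |c1, c2|
    if endian == ENDIAN_LITTLE_SKETCH
      sprintf("#{prefix}%.2x%.2x", c2, c1)  # "%%" renders as a literal "%"
    else
      sprintf("#{prefix}%.2x%.2x", c1, c2)
    end
  end.join
end

little = to_unescape_sketch("ABCD")
big    = to_unescape_sketch("ABCD", ENDIAN_BIG_SKETCH)
padded = to_unescape_sketch("ABC")
```

A caller supplying a custom prefix needs to escape any percent signs in it for the same reason.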
Converts standard ASCII text to a unicode string.
Supported unicode types include: utf-16le, utf-16be, utf-32le, utf-32be, utf-7, and utf-8
Providing 'mode' gives the encoder a hint as to how it should encode the string.
Only UTF-7 and UTF-8 use “mode” as an encoding hint (the 'uhwtfms' types reuse it as a codepage number).
By default, utf-7 does not encode alphanumeric and a few other characters. Specifying the mode “all” encodes every character, not just the non-alphanumeric set.
to_unicode(str, 'utf-7', 'all')
By default, utf-8 uses alphanumeric characters directly, e.g. “a” is just “a”. However, there exist 6 different overlong encodings of “a” that are technically not valid, but parse just fine in most utf-8 parsers (0xC1A1, 0xE081A1, 0xF08081A1, 0xF8808081A1, 0xFC80808081A1, 0xFE8080808081A1). The number of bytes to use for the overlong encoding is specified by 'size'.
to_unicode(str, 'utf-8', 'overlong', 2)
Many utf-8 parsers also allow invalid overlong encodings, where bits that are unused when encoding a single byte are modified. Many parsers will ignore these bits, rendering simple string matching ineffective for dealing with UTF-8 strings. There are many more invalid overlong encodings possible for “a”. For example, three encodings are available for an invalid 2 byte encoding of “a” (0xC1E1, 0xC161, 0xC121).
By specifying “invalid”, a random invalid encoding is chosen for the given byte size.
to_unicode(str, 'utf-8', 'invalid', 2)
utf-7 defaults to 'normal' utf-7 encoding. utf-8 defaults to 2 byte 'normal' encoding.
# File lib/rex/text.rb, line 680
def self.to_unicode(str='', type = 'utf-16le', mode = '', size = '')
  return '' if not str
  case type
  when 'utf-16le'
    return str.unpack('C*').pack('v*')
  when 'utf-16be'
    return str.unpack('C*').pack('n*')
  when 'utf-32le'
    return str.unpack('C*').pack('V*')
  when 'utf-32be'
    return str.unpack('C*').pack('N*')
  when 'utf-7'
    case mode
    when 'all'
      return str.gsub(/./){ |a|
        out = ''
        if 'a' != '+'
          out = encode_base64(to_unicode(a, 'utf-16be')).gsub(/[=\r\n]/, '')
        end
        '+' + out + '-'
      }
    else
      return str.gsub(/[^\n\r\t\ A-Za-z0-9\'\(\),-.\/\:\?]/){ |a|
        out = ''
        if a != '+'
          out = encode_base64(to_unicode(a, 'utf-16be')).gsub(/[=\r\n]/, '')
        end
        '+' + out + '-'
      }
    end
  when 'utf-8'
    if size == ''
      size = 2
    end
    if size >= 2 and size <= 7
      string = ''
      str.each_byte { |a|
        if (a < 21 || a > 0x7f) || mode != ''
          # ugh. turn a single byte into the binary representation of it, in array form
          bin = [a].pack('C').unpack('B8')[0].split(//)
          # even more ugh.
          bin.collect!{|a_| a_.to_i}
          out = Array.new(8 * size, 0)
          0.upto(size - 1) { |i|
            out[i] = 1
            out[i * 8] = 1
          }
          i = 0
          byte = 0
          bin.reverse.each { |bit|
            if i < 6
              mod = (((size * 8) - 1) - byte * 8) - i
              out[mod] = bit
            else
              byte = byte + 1
              i = 0
              redo
            end
            i = i + 1
          }
          if mode != ''
            case mode
            when 'overlong'
              # do nothing, since we already handle this as above...
            when 'invalid'
              done = 0
              while done == 0
                # the ghetto...
                bits = [7, 8, 15, 16, 23, 24, 31, 32, 41]
                bits.each { |bit|
                  bit = (size * 8) - bit
                  if bit > 1
                    set = rand(2)
                    if out[bit] != set
                      out[bit] = set
                      done = 1
                    end
                  end
                }
              end
            else
              raise TypeError, 'Invalid mode. Only "overlong" and "invalid" are acceptable modes for utf-8'
            end
          end
          string << [out.join('')].pack('B*')
        else
          string << [a].pack('C')
        end
      }
      return string
    else
      raise TypeError, 'invalid utf-8 size'
    end
  when 'uhwtfms' # suggested name from HD :P
    load_codepage()
    string = ''
    # overloading mode as codepage
    if mode == ''
      mode = 1252 # ANSI - Latan 1, default for US installs of MS products
    else
      mode = mode.to_i
    end
    if @@codepage_map_cache[mode].nil?
      raise TypeError, "Invalid codepage #{mode}"
    end
    str.each_byte {|byte|
      char = [byte].pack('C*')
      possible = @@codepage_map_cache[mode]['data'][char]
      if possible.nil?
        raise TypeError, "codepage #{mode} does not provide an encoding for 0x#{char.unpack('H*')[0]}"
      end
      string << possible[ rand(possible.length) ]
    }
    return string
  when 'uhwtfms-half' # suggested name from HD :P
    load_codepage()
    string = ''
    # overloading mode as codepage
    if mode == ''
      mode = 1252 # ANSI - Latan 1, default for US installs of MS products
    else
      mode = mode.to_i
    end
    if mode != 1252
      raise TypeError, "Invalid codepage #{mode}, only 1252 supported for uhwtfms_half"
    end
    str.each_byte {|byte|
      if ((byte >= 33 && byte <= 63) || (byte >= 96 && byte <= 126))
        string << "\xFF" + [byte ^ 32].pack('C')
      elsif (byte >= 64 && byte <= 95)
        string << "\xFF" + [byte ^ 96].pack('C')
      else
        char = [byte].pack('C')
        possible = @@codepage_map_cache[mode]['data'][char]
        if possible.nil?
          raise TypeError, "codepage #{mode} does not provide an encoding for 0x#{char.unpack('H*')[0]}"
        end
        string << possible[ rand(possible.length) ]
      end
    }
    return string
  else
    raise TypeError, 'invalid utf type'
  end
end
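Three of the encodings described above can be reproduced with plain Ruby, as a rough sketch rather than a replacement for the method. The helper names are hypothetical, and the overlong helper covers only the 2-byte case, computed with bit arithmetic instead of the bit-array approach in the listing.

```ruby
# Hypothetical stand-alone sketches (not part of Rex::Text).

# UTF-16LE widening: every input byte becomes a 16-bit little-endian unit.
def utf16le_sketch(str)
  str.unpack('C*').pack('v*')
end

# UTF-7 in "all" mode: each character is widened to UTF-16BE,
# base64 encoded without padding, and wrapped in "+" ... "-".
def utf7_all_sketch(str)
  str.each_char.map { |ch|
    wide = ch.unpack('C*').pack('n*')
    '+' + [wide].pack('m0').delete('=') + '-'
  }.join
end

# 2-byte overlong UTF-8 form of a single 7-bit byte:
# 110000xx 10xxxxxx, carrying the top 2 and low 6 bits of the byte.
def overlong2_sketch(byte)
  [0xC0 | (byte >> 6), 0x80 | (byte & 0x3F)].pack('C*')
end

p utf16le_sketch("AB")         # "A\x00B\x00"
p utf7_all_sketch("a")         # "+AGE-"
p overlong2_sketch("a".ord)    # "\xC1\xA1"
```

The last result matches the first overlong encoding of “a” listed in the description (0xC1A1).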
Converts US-ASCII to UTF-8, skipping over any characters which don't convert cleanly. This is a convenience method that wraps String#encode with non-raising default parameters.
@param str [String] An encodable ASCII string
@return [String] a UTF-8 equivalent
@note This method will discard invalid characters
# File lib/rex/text.rb, line 461 def self.to_utf8(str) str.encode('utf-8', { :invalid => :replace, :undef => :replace, :replace => '' }) end
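The discarding behaviour can be seen with String#encode directly. This is a minimal sketch: `to_utf8_sketch` is a hypothetical name, and it passes the options as keyword arguments rather than as the positional hash used in the listing.

```ruby
# Hypothetical stand-alone sketch (not part of Rex::Text).
# Invalid or undefined bytes are replaced with an empty string, i.e. dropped.
def to_utf8_sketch(str)
  str.encode('utf-8', invalid: :replace, undef: :replace, replace: '')
end

# 0xE9 is not a valid US-ASCII byte, so it is discarded.
raw   = "caf\xE9".dup.force_encoding('US-ASCII')
clean = to_utf8_sketch(raw)
p clean          # "caf"
p clean.encoding # #<Encoding:UTF-8>
```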
Converts a raw string into a VBA buffer
# File lib/rex/text.rb, line 415 def self.to_vbapplication(str, name = "buf") return "#{name} = Array()" if str.nil? or str.empty? code = str.unpack('C*') buff = "#{name} = Array(" maxbytes = 20 1.upto(code.length) do |idx| buff << code[idx].to_s buff << "," if idx < code.length - 1 buff << " _\r\n" if (idx > 1 and (idx % maxbytes) == 0) end buff << ")\r\n" return buff end
Converts a raw string to a VBScript byte array
# File lib/rex/text.rb, line 394
def self.to_vbscript(str, name = "buf")
  return "#{name}" if str.nil? or str.empty?
  code = str.unpack('C*')
  buff = "#{name}=Chr(#{code[0]})"
  1.upto(code.length-1) do |byte|
    if(byte % 100 == 0)
      buff << "\r\n#{name}=#{name}"
    end
    # exe is an Array of bytes, not a String, thanks to the unpack
    # above, so the following line is not subject to the different
    # treatments of String#[] between ruby 1.8 and 1.9
    buff << "&Chr(#{code[byte]})"
  end
  return buff
end
Returns the words in str as an Array.
strict - include only words, no boundary characters (like spaces, etc.)
# File lib/rex/text.rb, line 541 def self.to_words( str, strict = false ) splits = str.split( /\b/ ) splits.reject! { |w| !(w =~ /\w/) } if strict splits end
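The effect of the strict flag is easiest to see on a short input. The following sketch uses a hypothetical helper name and the same split on word boundaries as the listing.

```ruby
# Hypothetical stand-alone sketch (not part of Rex::Text).
# Splitting on \b keeps the separators as their own elements;
# strict mode then drops every element without a word character.
def words_sketch(str, strict = false)
  parts = str.split(/\b/)
  parts.reject! { |w| w !~ /\w/ } if strict
  parts
end

p words_sketch("hello, world")       # ["hello", ", ", "world"]
p words_sketch("hello, world", true) # ["hello", "world"]
```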
Uncompresses a string using gzip
@param str (see zlib_inflate)
@return (see zlib_inflate)
# File lib/rex/text.rb, line 1671 def self.ungzip(str) raise RuntimeError, "Gzip support is not present." if (!zlib_present?) s = "" s.force_encoding('ASCII-8BIT') if s.respond_to?(:encoding) gz = Zlib::GzipReader.new(StringIO.new(str, 'rb')) s << gz.read gz.close return s end
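A round trip through Ruby's Zlib gzip classes shows the reader side in context. Both helper names below are hypothetical; the compression half is included only so the sketch has something to decompress.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical stand-alone sketches (not part of Rex::Text).
def gzip_sketch(str)
  io = StringIO.new(String.new) # binary, unfrozen buffer
  gz = Zlib::GzipWriter.new(io)
  gz.write(str)
  gz.close
  io.string
end

def ungzip_sketch(str)
  gz = Zlib::GzipReader.new(StringIO.new(str))
  out = gz.read
  gz.close
  out
end

packed = gzip_sketch("hello gzip")
puts ungzip_sketch(packed) # hello gzip
```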
# File lib/rex/text.rb, line 1916 def self.unicode_filter_decode(str) str.to_s.gsub( /\$U\$([\x20-\x2c\x2e-\x7E]*)\-0x([A-Fa-f0-9]+)/n ){|m| [$2].pack("H*") } end
A custom unicode filter for dealing with multi-byte strings on an 8-bit console. Punycode would have been more “standard”, but it requires valid Unicode chars.
# File lib/rex/text.rb, line 1908 def self.unicode_filter_encode(str) if (str.to_s.unpack("C*") & ( LowAscii + HighAscii + "\x7f" ).unpack("C*")).length > 0 str = "$U$" + str.unpack("C*").select{|c| c < 0x7f and c > 0x1f and c != 0x2d}.pack("C*") + "-0x" + str.unpack("H*")[0] else str end end
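The encode and decode filters form a round trip, which the following sketch tries to show. The helper names are hypothetical, and the sketch assumes that LowAscii covers bytes 0x00-0x1f and HighAscii covers 0x80-0xff, which is not stated on this page.

```ruby
# Hypothetical stand-alone sketches (not part of Rex::Text).
# Assumption: LowAscii is 0x00-0x1f and HighAscii is 0x80-0xff.
def filter_encode_sketch(str)
  bytes = str.to_s.unpack('C*')
  return str unless bytes.any? { |c| c < 0x20 || c >= 0x7f }
  # Readable preview: printable ASCII only, with "-" removed.
  printable = bytes.select { |c| c > 0x1f && c < 0x7f && c != 0x2d }.pack('C*')
  '$U$' + printable + '-0x' + str.unpack('H*')[0]
end

def filter_decode_sketch(str)
  # The hex tail carries the full original bytes; the preview is discarded.
  str.to_s.b.gsub(/\$U\$([\x20-\x2c\x2e-\x7E]*)\-0x([A-Fa-f0-9]+)/n) { [$2].pack('H*') }
end

original = "caf\xC3\xA9".b
encoded  = filter_encode_sketch(original)
puts encoded                                   # $U$caf-0x636166c3a9
p filter_decode_sketch(encoded) == original    # true
```

Dropping "-" from the preview keeps the "-0x" marker unambiguous when decoding.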
Decode a URI encoded string
# File lib/rex/text.rb, line 949 def self.uri_decode(str) str.gsub(/(%[a-z0-9]{2})/i){ |c| [c[1,2]].pack("H*") } end
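A short stand-alone sketch of the same substitution, under a hypothetical helper name. Note that the pattern accepts any two alphanumeric characters after the percent sign, not only hex digits.

```ruby
# Hypothetical stand-alone sketch (not part of Rex::Text).
# Each %XX escape is replaced by the byte whose hex value is XX.
def uri_decode_sketch(str)
  str.gsub(/%[a-z0-9]{2}/i) { |c| [c[1, 2]].pack('H*') }
end

puts uri_decode_sketch("%41%42c")    # ABc
puts uri_decode_sketch("a%20b%2Fc")  # a b/c
```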
Encode a string in a manner useful for HTTP URIs and URI Parameters.
# File lib/rex/text.rb, line 863
def self.uri_encode(str, mode = 'hex-normal')
  return "" if str == nil
  return str if mode == 'none' # fast track no encoding
  all = /./
  noslashes = /[^\/\\]+/
  # http://tools.ietf.org/html/rfc3986#section-2.3
  normal = /[^a-zA-Z0-9\/\\\.\-_~]+/
  case mode
  when 'hex-all'
    return str.gsub(all) { |s| Rex::Text.to_hex(s, '%') }
  when 'hex-normal'
    return str.gsub(normal) { |s| Rex::Text.to_hex(s, '%') }
  when 'hex-noslashes'
    return str.gsub(noslashes) { |s| Rex::Text.to_hex(s, '%') }
  when 'hex-random'
    res = ''
    str.each_byte do |c|
      b = c.chr
      res << ((rand(2) == 0) ?
        b.gsub(all) { |s| Rex::Text.to_hex(s, '%') } :
        b.gsub(normal){ |s| Rex::Text.to_hex(s, '%') } )
    end
    return res
  when 'u-all'
    return str.gsub(all) { |s| Rex::Text.to_hex(Rex::Text.to_unicode(s, 'uhwtfms'), '%u', 2) }
  when 'u-normal'
    return str.gsub(normal) { |s| Rex::Text.to_hex(Rex::Text.to_unicode(s, 'uhwtfms'), '%u', 2) }
  when 'u-noslashes'
    return str.gsub(noslashes) { |s| Rex::Text.to_hex(Rex::Text.to_unicode(s, 'uhwtfms'), '%u', 2) }
  when 'u-random'
    res = ''
    str.each_byte do |c|
      b = c.chr
      res << ((rand(2) == 0) ?
        b.gsub(all) { |s| Rex::Text.to_hex(Rex::Text.to_unicode(s, 'uhwtfms'), '%u', 2) } :
        b.gsub(normal){ |s| Rex::Text.to_hex(Rex::Text.to_unicode(s, 'uhwtfms'), '%u', 2) } )
    end
    return res
  when 'u-half'
    return str.gsub(all) { |s| Rex::Text.to_hex(Rex::Text.to_unicode(s, 'uhwtfms-half'), '%u', 2) }
  else
    raise TypeError, "invalid mode #{mode.inspect}"
  end
end
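The difference between the 'hex-normal' and 'hex-all' modes can be sketched without Rex. The helper names are hypothetical, and `percent_hex_sketch` stands in for `Rex::Text.to_hex(s, '%')` under the assumption that it emits one lowercase `%xx` escape per byte.

```ruby
# Hypothetical stand-alone sketches (not part of Rex::Text).
# Assumption: to_hex(s, '%') yields a lowercase %xx escape for every byte.
def percent_hex_sketch(s)
  s.unpack('C*').map { |b| format('%%%02x', b) }.join
end

def uri_encode_sketch(str, mode = 'hex-normal')
  normal = /[^a-zA-Z0-9\/\\\.\-_~]+/ # same unreserved set as the listing
  case mode
  when 'none'       then str
  when 'hex-all'    then str.gsub(/./) { |s| percent_hex_sketch(s) }
  when 'hex-normal' then str.gsub(normal) { |s| percent_hex_sketch(s) }
  else raise TypeError, "invalid mode #{mode.inspect}"
  end
end

puts uri_encode_sketch("a b/c?d=1")     # a%20b/c%3fd%3d1
puts uri_encode_sketch("ab", 'hex-all') # %61%62
```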
Wraps text at a given column using a supplied indentation
# File lib/rex/text.rb, line 1072 def self.wordwrap(str, indent = 0, col = DefaultWrap, append = '', prepend = '') return str.gsub(/.{1,#{col - indent}}(?:\s|\Z)/){ ( (" " * indent) + prepend + $& + append + 5.chr).gsub(/\n\005/,"\n").gsub(/\005/,"\n")} end
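The one-line implementation relies on a sentinel byte: `5.chr` is appended to every wrapped chunk, a sentinel that directly follows an existing newline is dropped, and every remaining sentinel becomes a newline. The sketch below copies that approach under a hypothetical name; its default column of 60 is a placeholder, since the value of DefaultWrap is not given on this page.

```ruby
# Hypothetical stand-alone sketch (not part of Rex::Text).
# The default col of 60 is an assumption, not the documented DefaultWrap.
def wordwrap_sketch(str, indent = 0, col = 60, append = '', prepend = '')
  str.gsub(/.{1,#{col - indent}}(?:\s|\Z)/) {
    ((' ' * indent) + prepend + $& + append + 5.chr)
      .gsub(/\n\005/, "\n") # chunk already ended in a newline
      .gsub(/\005/, "\n")   # otherwise the sentinel becomes the line break
  }
end

print wordwrap_sketch("one two three", 0, 8)
# one two
# three
print wordwrap_sketch("one two three", 0, 8, '', '# ')
# # one two
# # three
```

The whitespace character that ends a chunk is kept, so wrapped lines can carry a trailing space.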
Encode an ASCII string so it's safe for XML. It's a wrapper for to_hex_ascii.
# File lib/rex/text.rb, line 942 def self.xml_char_encode(str) self.to_hex_ascii(str, "&#x", 1, ";") end
Compresses a string using zlib
@param str [String] The string to be compressed
@param level [Fixnum] One of the Zlib compression level constants
@return [String] The compressed version of str
# File lib/rex/text.rb, line 1620 def self.zlib_deflate(str, level = Zlib::BEST_COMPRESSION) if self.zlib_present? z = Zlib::Deflate.new(level) dst = z.deflate(str, Zlib::FINISH) z.close return dst else raise RuntimeError, "Gzip support is not present." end end
Uncompresses a string using zlib
@param str [String] Compressed string to inflate
@return [String] The uncompressed version of str
# File lib/rex/text.rb, line 1636 def self.zlib_inflate(str) if(self.zlib_present?) zstream = Zlib::Inflate.new buf = zstream.inflate(str) zstream.finish zstream.close return buf else raise RuntimeError, "Gzip support is not present." end end
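The deflate and inflate methods are inverses, which a short round trip can confirm. The helper names are hypothetical; the calls mirror the Zlib::Deflate and Zlib::Inflate usage in the two listings above.

```ruby
require 'zlib'

# Hypothetical stand-alone sketches (not part of Rex::Text).
def deflate_sketch(str, level = Zlib::BEST_COMPRESSION)
  z = Zlib::Deflate.new(level)
  out = z.deflate(str, Zlib::FINISH)
  z.close
  out
end

def inflate_sketch(str)
  z = Zlib::Inflate.new
  out = z.inflate(str)
  z.finish
  z.close
  out
end

data   = "A" * 1000
packed = deflate_sketch(data)
puts packed.length < data.length      # true
puts inflate_sketch(packed) == data   # true
```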
Returns true if zlib can be used.
# File lib/rex/text.rb, line 1600 def self.zlib_present? begin temp = Zlib return true rescue return false end end
Protected Class Methods
@param str [String] Big-endian data to checksum
@return [Fixnum] 16-bit checksum
# File lib/rex/text.rb, line 1992 def self.checksum16_be(str) (str.unpack("n*").inject(:+) || 0) % 0x10000 end
@param str [String] Little-endian data to checksum
@return [Fixnum] 16-bit checksum
# File lib/rex/text.rb, line 1986 def self.checksum16_le(str) (str.unpack("v*").inject(:+) || 0) % 0x10000 end
@param str [String] Big-endian data to checksum
@return [Fixnum] 32-bit checksum
# File lib/rex/text.rb, line 2004 def self.checksum32_be(str) (str.unpack("N*").inject(:+) || 0) % 0x100000000 end
@param str [String] Little-endian data to checksum
@return [Fixnum] 32-bit checksum
# File lib/rex/text.rb, line 1998 def self.checksum32_le(str) (str.unpack("V*").inject(:+) || 0) % 0x100000000 end
@param str [String] Data to checksum
@return [Fixnum] 8-bit checksum
# File lib/rex/text.rb, line 1980 def self.checksum8(str) (str.unpack("C*").inject(:+) || 0) % 0x100 end
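All five checksum methods share one pattern: unpack the data into fixed-width integers, sum them, and reduce modulo the word size. The sketch below folds that pattern into a single hypothetical helper that takes the unpack directive and modulus as parameters.

```ruby
# Hypothetical stand-alone sketch (not part of Rex::Text).
# directive is a String#unpack code: "C" (8-bit), "v"/"n" (16-bit LE/BE),
# "V"/"N" (32-bit LE/BE). An empty input sums to 0.
def checksum_sketch(str, directive, modulus)
  (str.unpack("#{directive}*").inject(:+) || 0) % modulus
end

p checksum_sketch("\xff\x02".b, 'C', 0x100)              # 1   (0xff + 0x02 wraps)
p checksum_sketch("\x01\x00\x02\x00".b, 'v', 0x10000)    # 3   (little-endian words 1 + 2)
p checksum_sketch("\x01\x00\x02\x00".b, 'n', 0x10000)    # 768 (big-endian words 0x0100 + 0x0200)
p checksum_sketch("\x00\x00\x00\x01".b, 'N', 0x100000000) # 1
```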
# File lib/rex/text.rb, line 1946 def self.load_codepage() return if (!@@codepage_map_cache.nil?) file = File.join(File.dirname(__FILE__),'codepage.map') page = '' name = '' map = {} File.open(file).each { |line| next if line =~ /^#/ next if line =~ /^\s*$/ data = line.split if data[1] =~ /^\(/ page = data.shift.to_i name = data.join(' ').sub(/^\(/,'').sub(/\)$/,'') map[page] = {} map[page]['name'] = name map[page]['data'] = {} else data.each { |entry| wide, char = entry.split(':') char = [char].pack('H*') wide = [wide].pack('H*') if map[page]['data'][char].nil? map[page]['data'][char] = [wide] else map[page]['data'][char].push(wide) end } end } @@codepage_map_cache = map end