module Compress

This module is a mixin for classes that want to use a very basic alphabetic shorthand to reduce text size. The module performs in-place operations, and expects to find a @compressed instance variable to work from.

Example:

include Compress
@original = 'JavaScript'
@compressed = @original.dup
abbr
@compressed
# => "JS"

Constants

URL_HOLDER
URL_LENGTH
URL_PATTERN

Public Instance Methods

char_count()

Calculate the current character count, taking the “virtual size” of Twitter-shortened URLs into account.

# File lib/tweet_compressor/compress.rb, line 37
def char_count
  real_url_chars = @urls.join.size
  virt_url_chars = @urls.count * URL_LENGTH
  @compressed.size - real_url_chars + virt_url_chars
end
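
As a rough illustration of the arithmetic, the sketch below repeats the calculation as a standalone function. The constant value of 23, the name `virtual_char_count`, and the sample tweet are all assumptions made for the example; the module's real URL_LENGTH is not shown in this documentation.

```ruby
# Hypothetical stand-in for the module's URL_LENGTH constant (value assumed).
SKETCH_URL_LENGTH = 23

# Swap the real length of each URL for the fixed length that a
# shortened URL would occupy.
def virtual_char_count(text, urls)
  real_url_chars = urls.join.size
  virt_url_chars = urls.count * SKETCH_URL_LENGTH
  text.size - real_url_chars + virt_url_chars
end

url   = 'http://example.com/a/very/long/path/to/some/resource'
text  = "Read this: #{url}"
count = virtual_char_count(text, [url])
# 11 characters of plain text plus 23 for the URL
# => 34
```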

Private Instance Methods

abbr()

Special abbreviations to increase clarity.

TODO: A YAML dictionary would be preferable to case statements if the list grows to any significant length.

# File lib/tweet_compressor/compress.rb, line 49
def abbr
  @compressed = @compressed.split.map do |word|
    case word.downcase
    when 'and'        then '&'
    when 'javascript' then 'JS'
    when 'string'     then 'str'
    when 'one'        then '1'
    when 'two'        then '2'
    when 'three'      then '3'
    when 'four'       then '4'
    when 'five'       then '5'
    when 'six'        then '6'
    when 'seven'      then '7'
    when 'eight'      then '8'
    when 'nine'       then '9'
    when 'ten'        then '10'
    when 'eleven'     then '11'
    when 'twelve'     then '12'
    when 'thirteen'   then '13'
    when 'fourteen'   then '14'
    when 'fifteen'    then '15'
    when 'sixteen'    then '16'
    when 'seventeen'  then '17'
    when 'eighteen'   then '18'
    when 'nineteen'   then '19'
    when 'twenty'     then '20'
    else word
    end
  end.join ' '
  @compressed.gsub! /is (?:an?|the)/, '='
  @compressed.gsub! /(in|with)? regards? (to)?/i, 're'
  @compressed.gsub! /about|regarding|related( to)?|(in response to)/, 're'
end
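
The TODO above mentions a YAML dictionary. One way that lookup might look is sketched below, assuming an inline YAML document and a hypothetical `abbreviate` helper; neither is part of the module.

```ruby
require 'yaml'

# Hypothetical dictionary; in practice it would probably live in its own file.
# Values that YAML would otherwise read as numbers or symbols are quoted.
ABBREVIATIONS = YAML.safe_load(<<~YAML)
  and: "&"
  javascript: JS
  string: str
  one: "1"
  two: "2"
YAML

# Replace each whitespace-delimited word found in the dictionary,
# matching case-insensitively as the case statement does.
def abbreviate(text)
  text.split.map { |word| ABBREVIATIONS.fetch(word.downcase, word) }.join(' ')
end

abbreviated = abbreviate('JavaScript and one string')
# => "JS & 1 str"
```

A hash lookup keeps the word list out of the method body, so new entries need no code change.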
apostrophes()

Remove apostrophes from contractions to save more space.

# File lib/tweet_compressor/compress.rb, line 84
def apostrophes
  @compressed.gsub! /n't/i, 'nt'
end
contractions()

Identify common contractions, taking a few pains to preserve capitalization of the initial letter.

# File lib/tweet_compressor/compress.rb, line 90
def contractions
  @compressed.gsub! /I would/i, %q{I'd}
  @compressed.gsub! /i will(?! ?not)/i, %q{I'll}
  @compressed.gsub! /(i)t is/i, %q{\1t's}
  @compressed.gsub! /(i)s not/i, %q{\1sn't}
  @compressed.gsub! /(w)ill not/i, %q{\1on't}
  @compressed.gsub! /(c)an ?not/i, %q{\1an't}
  @compressed.gsub! /(d)o(es)? not/i, %q{\1o\2n't}
  @compressed.gsub! /(s)hould not/i, %q{\1houldn't}
  @compressed.gsub! /(m)ust not/i, %q{\1ustn't}
end
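
To show how the capture group preserves capitalization, here is a small sketch that applies a single rule. The helper name `contract_will_not` and the added word boundaries are assumptions for the example, not part of the module.

```ruby
# Hypothetical helper applying one contraction rule.
def contract_will_not(text)
  # (w) captures the first letter in whatever case it appeared, and \1
  # writes it back, so "Will not" becomes "Won't" rather than "won't".
  text.gsub(/\b(w)ill not\b/i, %q{\1on't})
end

upper = contract_will_not('Will not work')     # => "Won't work"
lower = contract_will_not('it will not work')  # => "it won't work"
```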
correct_grammar()

Fix common grammar mistakes that also save space.

# File lib/tweet_compressor/compress.rb, line 103
def correct_grammar
  # Keep the final "s" (in its original case) and drop only the trailing "s".
  @compressed.gsub! /(s)'s/i, %q{\1'}
end
dedupe_consonants()

Remove duplicate lowercase consonants. Assume duplicate capital letters like 'LLC' are intentional.

# File lib/tweet_compressor/compress.rb, line 109
def dedupe_consonants
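  # Join the consonants into a string so they interpolate as a character class.
  consonants = ('a'..'z').reject { |c| c =~ /[aeiou]/ }.join
  regex = /([#{consonants}])\1+/
  @compressed = @compressed.split.map do |word|
    # '\1' replaces each run with its own letter, not the first match's letter.
    word.gsub regex, '\1'
  end.join ' '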
end
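
The following standalone sketch shows the intended effect on a sample string. The constant and method names are hypothetical and chosen only for this example.

```ruby
# Character class of lowercase consonants: "bcdfghjklmnpqrstvwxyz".
SKETCH_CONSONANTS = ('a'..'z').reject { |c| 'aeiou'.include?(c) }.join
SKETCH_DOUBLED    = /([#{SKETCH_CONSONANTS}])\1+/

# Collapse each run of a repeated lowercase consonant to a single letter.
def squeeze_consonants(text)
  text.gsub(SKETCH_DOUBLED, '\1')
end

squeezed = squeeze_consonants('address book LLC')
# => "adres book LLC"
```

Doubled vowels ("oo") and capital letters ("LL") are left alone because neither appears in the character class.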
dedupe_punct()

Remove duplicate punctuation characters. Make an exception for ellipses and dashes.

# File lib/tweet_compressor/compress.rb, line 120
def dedupe_punct
  regex = /([[:punct:]])\1+/
  @compressed = @compressed.split.map do |word|
    word.gsub! /\.{4,}/, '...'
    word.gsub! /-{3,}/, '--'
    next word if word.include? '...' or word.match /-{2,3}/
    next word unless word =~ regex
    word.gsub! regex, '\1'
  end.join ' '
end
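
The order of the rules matters here: long runs of dots and dashes are normalized first, and only then is the general rule applied. A per-word sketch of that sequence, using a hypothetical `squeeze_punct` helper:

```ruby
# Hypothetical per-word version of the punctuation rules.
def squeeze_punct(word)
  # Normalize over-long ellipses and dashes first.
  word = word.gsub(/\.{4,}/, '...').gsub(/-{3,}/, '--')
  # Ellipses and double dashes are intentional, so stop here.
  return word if word.include?('...') || word.match?(/--/)
  # Otherwise collapse any repeated punctuation character.
  word.gsub(/([[:punct:]])\1+/, '\1')
end

bang     = squeeze_punct('What?!!')     # => "What?!"
ellipsis = squeeze_punct('Wait.....')   # => "Wait..."
dashes   = squeeze_punct('so----then')  # => "so--then"
```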
ing()

Replace 'ing' with 'g'. Excludes short words like “ring” and “sing,” and checks an exception list for special cases.

# File lib/tweet_compressor/compress.rb, line 133
def ing
  exceptions = %w[fling]
  @compressed = @compressed.split.map do |word|
    next word unless word.end_with? 'ing'
    next word if word.start_with? '#'
    next word if word.size <= 4
    next word if exceptions.include? word
    word.sub(/ing$/, 'g')
  end.join ' '
end
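
The guard clauses can be read as a short decision list. A per-word sketch, with hypothetical names (`SKETCH_ING_EXCEPTIONS`, `shorten_ing`) chosen for the example:

```ruby
# Hypothetical exception list mirroring the one in the method above.
SKETCH_ING_EXCEPTIONS = %w[fling].freeze

def shorten_ing(word)
  return word unless word.end_with?('ing')
  return word if word.start_with?('#')               # leave hashtags alone
  return word if word.size <= 4                      # "ring", "sing"
  return word if SKETCH_ING_EXCEPTIONS.include?(word)
  word.sub(/ing\z/, 'g')
end

shortened = %w[running sing fling #testing].map { |w| shorten_ing(w) }
# => ["runng", "sing", "fling", "#testing"]
```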
remove_vowels()

Remove lowercase vowels from words of four or more characters, except a vowel that starts the word. Hashtags are left untouched.

# File lib/tweet_compressor/compress.rb, line 145
def remove_vowels
  @compressed = @compressed.split.map do |word|
    next word if word.start_with? '#'
    word.size >= 4 ? word.gsub(/(?<!\A)[aeiou]/, '') : word
  end.join ' '
end
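
The negative lookbehind `(?<!\A)` is what protects a leading vowel. A per-word sketch with a hypothetical `strip_vowels` helper shows the effect:

```ruby
# Hypothetical per-word version of the vowel rule.
def strip_vowels(word)
  return word if word.start_with?('#')  # leave hashtags alone
  return word if word.size < 4          # short words stay readable
  # (?<!\A) skips a vowel sitting at the very start of the word.
  word.gsub(/(?<!\A)[aeiou]/, '')
end

stripped = %w[important the #ruby].map { |w| strip_vowels(w) }
# => ["imprtnt", "the", "#ruby"]
```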
sentences()

Remove spaces between punctuation marks and the following words.

# File lib/tweet_compressor/compress.rb, line 153
def sentences
  @compressed.gsub! /([[:punct:]])\s*(\S)/, '\1\2'
end
texting()

Abbreviations common in texting, but with a higher cognitive load.

# File lib/tweet_compressor/compress.rb, line 158
def texting
  @compressed.gsub! /is (?:an?|the)/, '='
  @compressed.gsub! /:.\)|\(.:/, ':)'
  @compressed.gsub! /(in|with)? regards? (to)?/i, 're'
  @compressed.gsub! /about|regarding|related( to)?|(in response to)/i, 're'
  @compressed.gsub! /(RT @[^:\b]+):?/, '\1'
  @compressed.gsub! /\bare\b/, 'r'
  @compressed.gsub! /\bfor\b/, '4'
  @compressed.gsub! /\bto/, '2'
  @compressed.gsub! /why/, 'y'
  @compressed.gsub! /you/, 'u'
end
url_preserve()

Temporarily remove URLs from the pattern space so that they don't get horked during other text transformations.

# File lib/tweet_compressor/compress.rb, line 178
def url_preserve
  @urls = @compressed.scan(/#{URL_PATTERN}/).flatten.compact
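  # Match each URL as a literal string; interpolating it into a regex would
  # treat characters such as "." and "?" as metacharacters.
  @urls.each { |url| @compressed.gsub! url, URL_HOLDER }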
end
url_restore()

Return stored URLs to the pattern space.

# File lib/tweet_compressor/compress.rb, line 184
def url_restore
  @urls.each { |url| @compressed.sub! URL_HOLDER, url }
end
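
The preserve and restore steps are meant to bracket the other transformations. The round trip below is a sketch only: the placeholder string, the simplified URL regex, and the `with_urls_preserved` helper are assumptions, since the module's actual URL_HOLDER and URL_PATTERN values are not shown in this documentation.

```ruby
# Hypothetical stand-ins for the module's URL_HOLDER and URL_PATTERN.
SKETCH_URL_HOLDER  = '__URL__'
SKETCH_URL_PATTERN = %r{https?://\S+}

# Swap URLs for placeholders, yield the text for transformation,
# then put the URLs back in their original order.
def with_urls_preserved(text)
  urls = text.scan(SKETCH_URL_PATTERN)
  held = urls.reduce(text) { |t, url| t.sub(url, SKETCH_URL_HOLDER) }
  changed = yield held
  # Block form of sub inserts the URL literally, with no backreference handling.
  urls.reduce(changed) { |t, url| t.sub(SKETCH_URL_HOLDER) { url } }
end

result = with_urls_preserved('Running to https://example.com/running now') do |t|
  t.gsub('ing', 'g')
end
# => "Runng to https://example.com/running now"
```

The "ing" inside the URL survives because the transformation only ever sees the placeholder.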
whitespace()

Regularize whitespace.

# File lib/tweet_compressor/compress.rb, line 172
def whitespace
  @compressed = @compressed.split.join ' '
end