class SiteDiff::UriWrapper

SiteDiff URI Wrapper.

Constants

DEFAULT_CURL_OPTS

TODO: Move these CURL OPTS to Config.DEFAULT_CONFIG.

Public Class Methods

canonicalize(path)

Canonicalize a path.

@param [String] path

A base relative path. Example: /foo/bar
# File lib/sitediff/uriwrapper.rb, line 193
def self.canonicalize(path)
  # Ignore trailing slashes for all paths except "/" (front page).
  path = path.chomp('/') unless path == '/'
  # If the path is empty, assume that it's the front page.
  path.empty? ? '/' : path
end
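As a quick illustration of the rule above, here is a minimal sketch that copies the method body into a standalone function (the top-level `canonicalize` helper is only for demonstration; in SiteDiff it is a class method on UriWrapper).

```ruby
# Standalone copy of the canonicalization rule, for illustration only.
def canonicalize(path)
  # Drop one trailing slash, except for the front page itself.
  path = path.chomp('/') unless path == '/'
  # An empty path is treated as the front page.
  path.empty? ? '/' : path
end

puts canonicalize('/foo/bar/') # trailing slash is dropped
puts canonicalize('/')         # front page is kept as-is
puts canonicalize('')          # empty path becomes the front page
```

Note that `String#chomp` removes only a single trailing separator, so a path ending in several slashes keeps all but the last one.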
new(uri, curl_opts = DEFAULT_CURL_OPTS, debug = true)

Creates a UriWrapper.

# File lib/sitediff/uriwrapper.rb, line 51
def initialize(uri, curl_opts = DEFAULT_CURL_OPTS, debug = true)
  @uri = uri.respond_to?(:scheme) ? uri : Addressable::URI.parse(uri)
  # remove trailing '/'s from local URIs
  @uri.path.gsub!(%r{/*$}, '') if local?
  @curl_opts = curl_opts
  @debug = debug
end

Public Instance Methods

+(other)
Returns a new UriWrapper with other appended to this URI.

FIXME: this is not used anymore

# File lib/sitediff/uriwrapper.rb, line 88
def +(other)
  # 'path' for SiteDiff includes (parts of) path, query, and fragment.
  sep = ''
  sep = '/' if local? || @uri.path.empty?
  self.class.new(@uri.to_s + sep + other)
end
charset_encoding(http_headers)

Returns the encoding of an HTTP response from its headers, or nil if not specified.

# File lib/sitediff/uriwrapper.rb, line 105
def charset_encoding(http_headers)
  if (content_type = http_headers['Content-Type'])
    if (md = /;\s*charset=([-\w]*)/.match(content_type))
      md[1]
    end
  end
end
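The header parsing can be tried in isolation. The sketch below reuses the same regular expression in a standalone function and then applies the result to a binary string, roughly as typhoeus_request does with a response body; the sample headers and body are made up for the example.

```ruby
# Standalone version of the charset lookup, for illustration only.
def charset_encoding(http_headers)
  content_type = http_headers['Content-Type']
  return nil unless content_type

  md = /;\s*charset=([-\w]*)/.match(content_type)
  md && md[1]
end

# A body as it might arrive from the network: raw bytes, binary encoding.
body = "caf\xC3\xA9".b
encoding = charset_encoding('Content-Type' => 'text/html; charset=UTF-8')

# Relabel the bytes only when the server declared a charset.
body.force_encoding(encoding) if encoding
puts body.encoding
```

With a plain Hash the header lookup is case-sensitive, and the pattern only matches a lowercase `charset=` parameter.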
local?()

Is this a local filesystem path?

# File lib/sitediff/uriwrapper.rb, line 82
def local?
  @uri.scheme.nil?
end
password()

Returns the “password” part of the URI.

# File lib/sitediff/uriwrapper.rb, line 67
def password
  @uri.password
end
queue(hydra, &handler)

Queue reading this URL, with a completion handler to run after.

The handler should be callable with a single ReadResult argument.

This method may choose not to queue the request at all, but simply execute right away.

# File lib/sitediff/uriwrapper.rb, line 180
def queue(hydra, &handler)
  if local?
    read_file(&handler)
  else
    hydra.queue(typhoeus_request(&handler))
  end
end
read_file() { |read_result| ... }

Reads a file and yields a ReadResult to the completion handler; see queue.

# File lib/sitediff/uriwrapper.rb, line 97
def read_file
  File.open(@uri.to_s, 'r:UTF-8') { |f| yield ReadResult.new(f.read) }
rescue Errno::ENOENT, Errno::ENOTDIR, Errno::EACCES, Errno::EISDIR => e
  yield ReadResult.error(e.message)
end
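To show how a handler sees both outcomes, here is a minimal sketch of the same read-or-report-error pattern. The `ReadResult` struct below is a hypothetical stand-in (the real SiteDiff class may have different fields), and `read_file` takes the path as an argument instead of reading @uri.

```ruby
require 'tempfile'

# Hypothetical stand-in for SiteDiff's ReadResult; the real class may differ.
ReadResult = Struct.new(:content, :error) do
  def self.error(message)
    new(nil, message)
  end
end

# Same shape as the method above, with the path passed in explicitly.
def read_file(path)
  File.open(path, 'r:UTF-8') { |f| yield ReadResult.new(f.read) }
rescue Errno::ENOENT, Errno::ENOTDIR, Errno::EACCES, Errno::EISDIR => e
  yield ReadResult.error(e.message)
end

results = []
Tempfile.create('sitediff-example') do |tmp|
  tmp.write('<p>hello</p>')
  tmp.flush
  # Existing file: the handler receives the content.
  read_file(tmp.path) { |r| results << r }
  # Missing file: the handler receives an error result instead of an exception.
  read_file("#{tmp.path}.missing") { |r| results << r }
end

results.each { |r| puts r.error ? "error: #{r.error}" : "content: #{r.content}" }
```

The handler is invoked exactly once in either case, so callers can treat local files and remote requests uniformly.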
to_s()

Converts the URI to a string.

# File lib/sitediff/uriwrapper.rb, line 73
def to_s
  uri = @uri.dup
  uri.user = nil
  uri.password = nil
  uri.to_s
end
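The effect of clearing the credentials on a copy can be sketched as follows. This example assumes Ruby's standard-library URI as a stand-in for Addressable::URI, and `display_string` is a hypothetical helper name, not part of SiteDiff.

```ruby
require 'uri'

# Assumption: stdlib URI stands in for Addressable::URI in this sketch.
def display_string(uri)
  copy = uri.dup      # work on a copy so the original keeps its credentials
  copy.user = nil
  copy.password = nil
  copy.to_s
end

uri = URI.parse('https://alice:secret@example.com/private')
puts display_string(uri) # credentials are omitted from the printed form
puts uri.user            # the original URI is unchanged
```

Because only the copy is modified, the wrapper can still use the stored user and password for basic auth while keeping them out of logs and reports.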
typhoeus_request() { |read_result| ... }

Returns a Typhoeus::Request to fetch @uri.

The request's completion callbacks wrap the given handler, which is assumed to accept a single ReadResult argument.

# File lib/sitediff/uriwrapper.rb, line 117
def typhoeus_request
  params = @curl_opts.dup
  # Allow basic auth
  params[:userpwd] = @uri.user + ':' + @uri.password if @uri.user

  req = Typhoeus::Request.new(to_s, params)

  req.on_success do |resp|
    body = resp.body
    # Typhoeus does not respect HTTP headers when setting the encoding
    # of resp.body; coerce if possible.
    if (encoding = charset_encoding(resp.headers))
      body.force_encoding(encoding)
    end
    # Should be wrapped with rescue I guess? Maybe this entire function?
    # Should at least be an option in the Cli to disable this.
    # "stop on first error"
    begin
      yield ReadResult.new(body, encoding)
    rescue ArgumentError => e
      raise if @debug

      yield ReadResult.error(
        "Parsing error for #{@uri}: #{e.message}"
      )
    rescue StandardError => e
      raise if @debug

      yield ReadResult.error(
        "Unknown parsing error for #{@uri}: #{e.message}"
      )
    end
  end

  req.on_failure do |resp|
    if resp&.status_message
      msg = resp.status_message
      yield ReadResult.error(
        "HTTP error when loading #{@uri}: #{msg}",
        resp.response_code
      )
    elsif (msg = resp.options[:return_code])
      yield ReadResult.error(
        "Connection error when loading #{@uri}: #{msg}",
        resp.response_code
      )
    else
      yield ReadResult.error(
        "Unknown error when loading #{@uri}: #{msg}",
        resp.response_code
      )
    end
  end

  req
end
user()

Returns the “user” part of the URI.

# File lib/sitediff/uriwrapper.rb, line 61
def user
  @uri.user
end