class Mechanize::HTTP::Agent

Extracted from scottwb.com/blog/2013/11/09/defeating-the-infamous-mechanize-too-many-connection-resets-bug/

Constants

MAX_RESET_RETRIES

Public Instance Methods

fetch( uri, method = :get, headers = {}, params = [], referer = current_page, redirects = 0 )
Also aliased as: fetch_without_retry
Alias for: fetch_with_retry
fetch_with_retry( uri, method = :get, headers = {}, params = [], referer = current_page, redirects = 0 )

We need to replace the core Mechanize HTTP method:

Mechanize::HTTP::Agent#fetch

with a wrapper that handles the infamous “too many connection resets” Mechanize bug that is described here:

https://github.com/sparklemotion/mechanize/issues/123

When a request fails with this error, the wrapper shuts down the persistent HTTP connection and retries the request. In practice a single retry has always been enough, but I am going to let it retry up to MAX_RESET_RETRIES times, just in case. Any other Net::HTTP::Persistent::Error is re-raised untouched.

# File lib/monkey_patch/mechanize.rb, line 20
def fetch_with_retry(
  uri,
  method    = :get,
  headers   = {},
  params    = [],
  referer   = current_page,
  redirects = 0
)
  action      = "#{method.to_s.upcase} #{uri.to_s}"
  retry_count = 0

  begin
    fetch_without_retry(uri, method, headers, params, referer, redirects)
  rescue Net::HTTP::Persistent::Error => e
    # Pass on any other type of error.
    raise unless e.message =~ /too many connection resets/

    # Pass on the error if we've tried too many times.
    if retry_count >= MAX_RESET_RETRIES
      puts "**** WARN: Mechanize retried connection reset #{MAX_RESET_RETRIES} times and never succeeded: #{action}" if Bankscrap.debug
      raise
    end

    # Otherwise, shut down the persistent HTTP connection and try again.
    puts "**** WARN: Mechanize retrying connection reset error: #{action}" if Bankscrap.debug
    retry_count += 1
    self.http.shutdown
    retry
  end
end
Also aliased as: fetch
fetch_without_retry( uri, method = :get, headers = {}, params = [], referer = current_page, redirects = 0 )

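The original fetch is kept under the name fetch_without_retry, and fetch is then re-aliased to fetch_with_retry, so every caller of fetch goes through the retry wrapper, which in turn delegates to the original implementation.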

Alias for: fetch
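
The alias chain described above can be tried without Mechanize or a network connection. The following is a minimal sketch, not the patch itself: FlakyAgent, ResetError, the single-argument fetch, and the MAX_RESET_RETRIES value of 3 are all hypothetical stand-ins chosen for illustration (the real constant's value is not shown on this page).

```ruby
# Hypothetical stand-ins: none of these names come from Mechanize.
class ResetError < StandardError; end

class FlakyAgent
  MAX_RESET_RETRIES = 3 # assumed value, for illustration only

  attr_reader :attempts, :shutdowns

  def initialize(failures)
    @failures  = failures # number of calls that fail before one succeeds
    @attempts  = 0
    @shutdowns = 0
  end

  # Plays the role of the original Mechanize::HTTP::Agent#fetch.
  def fetch(uri)
    @attempts += 1
    raise ResetError, "too many connection resets" if @attempts <= @failures
    "fetched #{uri}"
  end

  # Plays the role of shutting down the persistent HTTP connection.
  def shutdown
    @shutdowns += 1
  end
end

# Reopen the class, as a monkey patch would.
class FlakyAgent
  def fetch_with_retry(uri)
    retry_count = 0
    begin
      fetch_without_retry(uri)
    rescue ResetError => e
      # Pass on any other type of error.
      raise unless e.message =~ /too many connection resets/
      # Give up once the retry budget is spent.
      raise if retry_count >= MAX_RESET_RETRIES
      retry_count += 1
      shutdown
      retry
    end
  end

  # Order matters: save the original first, then repoint the public name.
  alias_method :fetch_without_retry, :fetch
  alias_method :fetch, :fetch_with_retry
end

agent  = FlakyAgent.new(2)                  # first two attempts fail
result = agent.fetch("http://example.com/") # callers still just call fetch
```

The two alias_method calls must stay in that order. Reversing them would point fetch_without_retry at the wrapper itself, and the wrapper would then call itself without end.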