module Web

Find the contact

Constants

HTTP_REGEX

Captures http:// and https://

Attributes

agent[RW]
page[RW]
url[RW]

Public Instance Methods

blind_test(url)

Blindly tests whether a URL goes through. If the request fails with a 404 error, this returns nil.

TODO: Sometimes DNS will redirect instead of returning a 404, so redirects need to be prevented.

# File lib/gimme_poc/web.rb, line 93
def blind_test(url)
  LogMessages.blind_testing(url)
  get(url)
end
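
The redirect concern in the TODO can be illustrated with a minimal sketch. It uses Ruby's standard Net::HTTP rather than Mechanize, because Net::HTTP.get_response does not follow redirects on its own. The names direct_hit? and blind_fetch are hypothetical and are not part of GimmePOC.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: true only for a 2xx response.
# Redirects (3xx) and errors (4xx/5xx) both count as a miss.
def direct_hit?(response)
  response.is_a?(Net::HTTPSuccess)
end

# Hypothetical stand-in for blind_test: returns the response on a
# direct 2xx hit, nil on a redirect, an error status, or a bad URL.
def blind_fetch(url)
  response = Net::HTTP.get_response(URI(url))
  direct_hit?(response) ? response : nil
rescue StandardError
  nil
end
```

With this approach a redirect is treated the same as a 404, which is one possible reading of "prevent redirects"; whether that is the right policy for GimmePOC is an assumption.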
format_url(str)

Mechanize needs absolute URLs to work. If http:// or https:// isn't present, prepend http://.

# File lib/gimme_poc/web.rb, line 40
def format_url(str)
  LazyDomain.autohttp(str)
end
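
The behavior described above can be sketched without LazyDomain. This is only an illustration: SCHEME_PATTERN and with_scheme are hypothetical names, and the pattern is an assumption, since the actual HTTP_REGEX and the internals of LazyDomain.autohttp are not shown here.

```ruby
# Assumed pattern: a leading http:// or https://, case-insensitive.
SCHEME_PATTERN = %r{\Ahttps?://}i

# Hypothetical helper: returns str unchanged if it already has a
# scheme, otherwise returns it with http:// in front.
def with_scheme(str)
  str = str.strip
  str.match?(SCHEME_PATTERN) ? str : "http://#{str}"
end

with_scheme('example.com')          # => "http://example.com"
with_scheme('https://example.com')  # => "https://example.com"
```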
get(str)

Fetches a page using Mechanize. Sleeps for 0.1 seconds before each request to avoid overloading servers.

Returns nil if a bad URL is given.

# File lib/gimme_poc/web.rb, line 13
def get(str)
  prepare_get_request(str)
  @page = @agent.get(@url)
rescue StandardError => e
  # Rescue StandardError rather than Exception so Interrupt and SystemExit still propagate.
  LogMessages.warn_err(e)
  nil
end
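
One way to get the nil-on-failure behavior described above is to rescue StandardError and return nil explicitly. The following is a minimal sketch under that assumption. fetch_or_nil and the log array are hypothetical stand-ins for get and LogMessages, and the block stands in for the Mechanize request.

```ruby
# Hypothetical helper: runs the block and returns its value, or logs
# the error and returns nil when a StandardError is raised.
def fetch_or_nil(log = [])
  yield
rescue StandardError => e
  log << "#{e.class}: #{e.message}"
  nil
end

log = []
fetch_or_nil(log) { 'page body' }                     # => "page body"
fetch_or_nil(log) { raise ArgumentError, 'bad url' }  # => nil
log                                                   # => ["ArgumentError: bad url"]
```

Because only StandardError is rescued, signals such as Interrupt still propagate, so Ctrl-C can stop a long crawl.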
mech_setup()
# File lib/gimme_poc/web.rb, line 27
def mech_setup
  @agent = Mechanize.new do |a|
    a.user_agent_alias = 'Mac Safari'
    a.open_timeout = 7
    a.read_timeout = 7
    a.idle_timeout = 7
    a.redirect_ok = true
  end
end
orig_domain(str)

Returns the domain of a URL. Useful when a subdomain is given to GimmePOC and it doesn't work.

For example: given maps.google.com, returns 'google.com'.

# File lib/gimme_poc/web.rb, line 55
def orig_domain(str)
  LazyDomain.parse(str).domain
rescue PublicSuffix::DomainInvalid => err
  LogMessages.invalid_domain(err)
end
prepare_get_request(str)
# File lib/gimme_poc/web.rb, line 20
def prepare_get_request(str)
  mech_setup
  @url = format_url(str)
  LogMessages.sending_get_request(url)
  sleep(0.1)
end
subdomain?(str)

Returns true if the host in the URL is not identical to its original domain, i.e. the URL points at a subdomain.

If the URL has a path, the string is split on forward slashes and the leftmost item (the host) is used for the comparison.

# File lib/gimme_poc/web.rb, line 84
def subdomain?(str)
  (unformat_url(str).split('/')[0] != orig_domain(str))
end
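
The comparison above can be traced step by step in a small standalone sketch. All names here (SCHEME, host_of, naive_domain, naive_subdomain?) are hypothetical. In particular, naive_domain simply keeps the last two labels of the host, which is an assumption made for illustration; it gives wrong answers for multi-part suffixes such as co.uk, which is the kind of case a public suffix list exists to handle.

```ruby
# Assumed pattern for a leading http:// or https://.
SCHEME = %r{\Ahttps?://}i

# Strip the scheme, split on '/', and keep the leftmost item (the host).
def host_of(url)
  url.sub(SCHEME, '').split('/').first
end

# Naive stand-in for orig_domain: keep only the last two labels.
def naive_domain(host)
  host.split('.').last(2).join('.')
end

# True when the host differs from its (naively computed) domain.
def naive_subdomain?(url)
  host = host_of(url)
  host != naive_domain(host)
end

host_of('https://maps.google.com/a/b')           # => "maps.google.com"
naive_subdomain?('http://maps.google.com/maps')  # => true
naive_subdomain?('google.com/about')             # => false
```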
unformat_url(str)

Strips http:// or https:// from str. Used for the subdomain check. Returns a new string and does not change the url attribute.

# File lib/gimme_poc/web.rb, line 45
def unformat_url(str)
  str.gsub(HTTP_REGEX, '')
end