class Wayfarer::Job

A {Job} is a class that owns a {Routing::Router} with many {Routing::Rule}s, which are matched against a URI. Rules map URIs onto job instance methods. Under the hood, jobs are instantiated within separate threads by a {Processor}, and every instance gets its own thread. If a URI is matched, its {Page} is retrieved and made available to instance methods via {#page}.

Jobs implement ActiveJob's Job API and are therefore compatible with a wide range of job queues. To run a job immediately, call ::perform_now. To enqueue a job, call ::perform_later.

@see github.com/rails/rails/tree/master/activejob rails/activejob
@see edgeguides.rubyonrails.org/active_job_basics.html ActiveJob Basics

Attributes

config[W]

@!attribute [w] config

router[W]

@!attribute [w] router

adapter[RW]

@!attribute [rw] adapter

page[W]

@!attribute [rw] page

params[RW]

@!attribute [rw] params

staged_uris[R]

@!attribute [r] staged_uris
@return [Array<String>, Array<URI>] URIs to stage for the next cycle.
@see #stage

Public Class Methods

config() { |config| ... }

A per-class configuration cloned from the global {Wayfarer.config} on first access. If a block is given, the configuration is yielded to it. @yield [Configuration] @return [Configuration]

# File lib/wayfarer/job.rb, line 83
def config
  @config ||= Wayfarer.config.clone
  yield(@config) if block_given?
  @config
end
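The clone-then-yield pattern above can be tried in isolation. The following is a minimal sketch, not Wayfarer's code: `GlobalConfig`, `GLOBAL` and `SketchJob` are hypothetical stand-ins, and a plain Struct takes the place of the real configuration class.

```ruby
# Hypothetical stand-in for the global configuration (not Wayfarer's real class).
GlobalConfig = Struct.new(:connection_count, :http_adapter)
GLOBAL = GlobalConfig.new(4, :net_http)

class SketchJob
  # Lazily clones the global config, yields the copy when a block is given,
  # and always returns the same per-class copy afterwards.
  def self.config
    @config ||= GLOBAL.clone
    yield(@config) if block_given?
    @config
  end
end

# Changes made in the block only affect the class-level copy.
SketchJob.config { |c| c.http_adapter = :selenium }
```

Because the clone is memoized, later calls without a block return the already customized copy, while the global stand-in keeps its original values.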
new(*argv)
Calls superclass method
# File lib/wayfarer/job.rb, line 119
def initialize(*argv)
  @halts = false
  @staged_uris = []
  super(*argv)
end
prepare()

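Returns a copy of the class with its own router, locals and config. Locals are replaced by their thread-safe counterparts and exposed as both class and instance methods on the copy.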

# File lib/wayfarer/job.rb, line 60
def prepare
  duplicate = dup
  duplicate.router = router.dup
  duplicate.locals = locals.deep_dup
  duplicate.config = config.dup

  duplicate.locals.each do |(key, val)|
    duplicate.locals[key] = Locals.thread_safe_counterpart(val)
  end

  duplicate.locals.each do |(key, _)|
    duplicate.send(:define_method, key) do
      duplicate.locals[key]
    end
    duplicate.send(:define_singleton_method, key) do
      duplicate.locals[key]
    end
  end

  duplicate
end
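The core of this method, duplicating a class and defining accessor methods for each local on the copy, can be sketched without Wayfarer. In the sketch below, `PreparableBase` and `GreetingJob` are hypothetical names, `Hash#dup` stands in for ActiveSupport's `deep_dup`, and the thread-safe wrapping step is left out.

```ruby
class PreparableBase
  class << self
    attr_writer :locals

    def locals
      @locals ||= {}
    end

    # Returns an anonymous copy of the class that carries its own locals
    # hash and exposes every key as a class and an instance method.
    def prepare
      duplicate = dup
      duplicate.locals = locals.dup # shallow copy; the original uses deep_dup

      duplicate.locals.each_key do |key|
        duplicate.send(:define_method, key) { duplicate.locals[key] }
        duplicate.define_singleton_method(key) { duplicate.locals[key] }
      end

      duplicate
    end
  end
end

class GreetingJob < PreparableBase; end

GreetingJob.locals[:greeting] = "hello"
copy = GreetingJob.prepare
copy.locals[:greeting] = "hi" # only the copy sees this change
```

The generated methods close over the copy, so `copy.greeting` and `copy.new.greeting` both read from the copy's hash, and the original class gains no new methods.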
route(&proc)
Alias for: router
router(&proc)

Returns the job's router, creating it on first access. If a block is passed in, it is evaluated within the {Routing::Router}'s instance. @return [Routing::Router]

# File lib/wayfarer/job.rb, line 92
def router(&proc)
  @router ||= Routing::Router.new
  @router.instance_eval(&proc) if block_given?
  @router
end
Also aliased as: route, routes
routes(&proc)
Alias for: router
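Evaluating the block with `instance_eval` is what lets routing rules be written without an explicit receiver. A minimal sketch of that mechanism follows; `SketchRouter`, its `draw` method and `SketchCrawler` are hypothetical stand-ins and do not reflect the real {Routing::Router} interface.

```ruby
# Hypothetical stand-in router; `draw` is an illustrative method name only.
class SketchRouter
  attr_reader :rules

  def initialize
    @rules = []
  end

  def draw(action, pattern)
    @rules << [action, pattern]
  end
end

class SketchCrawler
  # Memoizes one router per class and evaluates the block, if any,
  # with the router as `self`.
  def self.router(&block)
    @router ||= SketchRouter.new
    @router.instance_eval(&block) if block
    @router
  end

  class << self
    alias route router
    alias routes router
  end
end

# Inside the block, `draw` resolves against the router instance.
SketchCrawler.routes do
  draw :index, %r{\Ahttps://example\.com/\z}
end
SketchCrawler.route { draw :show, %r{/items/\d+} }
```

All three names return the same memoized router, so rules declared through different aliases accumulate in one place.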

Public Instance Methods

halts?()

Whether this job will stop processing.

# File lib/wayfarer/job.rb, line 126
def halts?
  @halts
end
perform(*uris)

Performs this job. @note ActiveJob API @override

# File lib/wayfarer/job.rb, line 133
def perform(*uris)
  Crawl.new(self.class, *uris).execute
end

Protected Instance Methods

halt()

Sets a halting flag that signals the processor to stop its work.

# File lib/wayfarer/job.rb, line 142
def halt
  @halts = true
end
page()

The {Page} representing the URI currently processed by an action. When using the Selenium adapter, {Page#body} is refreshed on every call. Otherwise, subsequent DOM updates (e.g. those induced by JavaScript) would be invisible. @return [Page]

# File lib/wayfarer/job.rb, line 178
def page
  return @page unless self.class.config.http_adapter == :selenium

  Page.new(
    uri: @page.uri,
    status_code: @page.status_code,
    body: driver.page_source,
    headers: @page.headers
  )
end
stage(*uris)

Adds URIs to process in the next cycle. If a relative path is given, an absolute URI is constructed by appending it to the path of the current {#page}'s URI. URIs of unsupported types are silently dropped. @param uris [String, URI, Array<String>, Array<URI>]

# File lib/wayfarer/job.rb, line 150
def stage(*uris)
  expanded = uris.flatten.map do |u|
    if (uri = URI(u)).absolute?
      uri
    else
      # URI.join would discard the path of page.uri
      current = page.uri.dup
      current.path = File.join(page.uri.path, uri.path)
      current
    end
  end

  # This method has more or less become the gatekeeper for invalid URIs
  # that would otherwise lead to exceptions down the line
  supported = expanded.select do |uri|
    HTTPAdapters::NetHTTPAdapter::RECOGNIZED_URI_TYPES.any? do |type|
      uri.is_a?(type)
    end
  end

  @staged_uris.push(*supported)
end
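The expansion and filtering steps can be reproduced with the standard library alone. The sketch below is an approximation under stated assumptions: `expand_and_filter` is a hypothetical free function, and `SUPPORTED_TYPES` is only a guess at what `RECOGNIZED_URI_TYPES` contains (HTTP and HTTPS).

```ruby
require "uri"

# Assumption: only HTTP(S) URIs are supported; the real constant may differ.
SUPPORTED_TYPES = [URI::HTTP, URI::HTTPS].freeze

# Expands relative paths against current_uri by appending to its path,
# then keeps only URIs of a supported type.
def expand_and_filter(current_uri, *uris)
  expanded = uris.flatten.map do |u|
    uri = URI(u)
    next uri if uri.absolute?

    absolute = current_uri.dup
    absolute.path = File.join(current_uri.path, uri.path)
    absolute
  end

  expanded.select { |uri| SUPPORTED_TYPES.any? { |type| uri.is_a?(type) } }
end

current = URI("https://example.com/foo/bar")
staged = expand_and_filter(
  current, "baz", ["https://example.org/"], "mailto:someone@example.com"
)

# For comparison: URI.join replaces the last path segment instead of appending.
joined = URI.join(current.to_s, "baz")
```

Here `staged` holds `https://example.com/foo/bar/baz` and `https://example.org/`, the `mailto:` URI is filtered out, and `joined` is `https://example.com/foo/baz`, which shows why `File.join` is applied to the path instead.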