class EagleClaw::Scraper
Attributes
An `Array` which holds data collected during a run.
@see initialize
@see reset
An `Array` which collects
Public Class Methods
Define a post-processor to run in a certain context.
@param [Symbol] context either `:each` or `:all`.
@param [optional, Symbol] meth name of method to call.
@return [nil]
@overload after(:each, :method_name)
Run the given method after each component of the run.
@overload after(:all, :method_name)
Run the given method after the run itself.
@overload after(:each, &block)
Run the given block (using `instance_eval`) after each component of the run.
@overload after(:all, &block)
Run the given block (using `instance_eval`) after the entire run.
@see before
# File lib/eagleclaw.rb, line 65
def after(context, meth = nil, &block)
  register([:after, context], meth, &block)
end
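The method-name and block forms described in the overloads can be illustrated with a small stand-in class. This is only a sketch: `register` and `run_callbacks` are not listed in this section, so the versions below, along with `MiniScraper`, `callbacks`, `log`, and `note_component`, are hypothetical and may differ from what EagleClaw actually does.

```ruby
# Hypothetical stand-in for illustration; not the EagleClaw implementation.
class MiniScraper
  class << self
    # Callbacks keyed by [kind, context], e.g. [:after, :each].
    def callbacks
      @callbacks ||= Hash.new { |hash, key| hash[key] = [] }
    end

    # Assumed behavior: store either a method name or a block under the key.
    def register(key, meth = nil, &block)
      callbacks[key] << (meth || block)
      nil
    end

    # Same body as the `after` listing above.
    def after(context, meth = nil, &block)
      register([:after, context], meth, &block)
    end

    # Assumed behavior: Symbols are sent to the instance,
    # blocks are evaluated with instance_eval.
    def run_callbacks(key, instance)
      callbacks[key].each do |callback|
        if callback.is_a?(Symbol)
          instance.send(callback)
        else
          instance.instance_eval(&callback)
        end
      end
    end
  end

  attr_reader :log

  def initialize
    @log = []
  end

  after(:each, :note_component)     # method-name form
  after(:all) { log << :finished }  # block form, run via instance_eval

  def note_component
    log << :component_done
  end
end

scraper = MiniScraper.new
MiniScraper.run_callbacks([:after, :each], scraper)
MiniScraper.run_callbacks([:after, :all], scraper)
```

Because the block form is evaluated with `instance_eval`, `log` inside the block refers to the scraper instance rather than to the class where the block was written.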
Define a pre-processor to run in a certain context.
@param [Symbol] context either `:each` or `:all`.
@param [optional, Symbol] meth name of method to call.
@return [nil]
@overload before(:each, :method_name)
Run the given method before each component of the run.
@overload before(:all, :method_name)
Run the given method before the run itself.
@overload before(:each, &block)
Run the given block (using `instance_eval`) before each component of the run.
@overload before(:all, &block)
Run the given block (using `instance_eval`) before the run itself.
@example Fetch a page before the run
  before(:all) do
    agent.get("http://google.com/")
  end
@example Reset the page before each component of the run
  before(:each) do
    agent.get("http://google.com/")
  end
# File lib/eagleclaw.rb, line 42
def before(context, meth = nil, &block)
  register([:before, context], meth, &block)
end
Create a new {Scraper} instance.
By default, just sets {#data @data} and {#problems @problems} to empty `Array`s.
# File lib/eagleclaw.rb, line 94
def initialize
  @data = []
  @problems = []
end
# File lib/eagleclaw.rb, line 69
def prop(prop_name, meth = nil, &block)
  (@properties ||= []) << prop_name.to_sym
  register([:property, prop_name.to_sym], meth, &block)
end
Public Instance Methods
Reset this scraper instance’s state.
The default version of this method just clears {#data @data} and {#problems @problems}.
@return [nil]
@abstract Subclass and extend to reset the scraper state.
# File lib/eagleclaw.rb, line 108
def reset
  data.clear
  problems.clear
end
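One way a subclass might extend `reset` is sketched below. `BaseScraper` is a stand-in that copies only the `initialize` and `reset` bodies listed above so the example runs without the library; `CountingScraper`, `pages_seen`, and `visit` are hypothetical names introduced for illustration.

```ruby
# Stand-in base class; mirrors the initialize and reset listings above.
class BaseScraper
  attr_reader :data, :problems

  def initialize
    @data = []
    @problems = []
  end

  def reset
    data.clear
    problems.clear
  end
end

# Hypothetical subclass that tracks extra state between runs.
class CountingScraper < BaseScraper
  attr_reader :pages_seen

  def initialize
    super
    @pages_seen = 0
  end

  # Extend rather than replace: let the default clear data and problems,
  # then reset the subclass's own state.
  def reset
    super
    @pages_seen = 0
    nil # return nil explicitly, as the @return tag documents
  end

  def visit(url)
    @pages_seen += 1
    data << url
  end
end

scraper = CountingScraper.new
scraper.visit("http://example.com/")
scraper.reset
```

Calling `super` first keeps the default clearing behavior, so the subclass only has to handle the state it added.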
Run the scraper.
Operating procedure:
- Run {Scraper.before before(:all)} blocks.
- For each property (defined with {Scraper.prop prop(:prop_name)}):
  - Run {Scraper.before before(:each)} blocks.
  - Run the property itself.
  - Run {Scraper.after after(:each)} blocks.
- Run {Scraper.after after(:all)} blocks.
- Return {#data data}.
# File lib/eagleclaw.rb, line 128
def run
  self.class.run_callbacks([:before, :all], self)
  self.class.properties.each do |property|
    self.class.run_callbacks([:before, :each], self)
    self.class.run_callbacks([:property, property], self)
    self.class.run_callbacks([:after, :each], self)
  end
  self.class.run_callbacks([:after, :all], self)
  data
end
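The order of operations can be traced with a self-contained stand-in. In this sketch `TraceScraper` reuses the `run`, `prop`, `before`, and `after` bodies from the listings above, while `callbacks`, `register`, `run_callbacks`, and `properties` are assumed implementations (they are not shown in this section) that handle only the block form.

```ruby
# Hypothetical stand-in for illustration; not the EagleClaw source.
class TraceScraper
  class << self
    def callbacks
      @callbacks ||= Hash.new { |hash, key| hash[key] = [] }
    end

    def properties
      @properties ||= []
    end

    # Assumed: blocks are stored per key and later run with instance_eval.
    def register(key, _meth = nil, &block)
      callbacks[key] << block
      nil
    end

    def run_callbacks(key, instance)
      callbacks[key].each { |block| instance.instance_eval(&block) }
    end

    def before(context, meth = nil, &block)
      register([:before, context], meth, &block)
    end

    def after(context, meth = nil, &block)
      register([:after, context], meth, &block)
    end

    def prop(prop_name, meth = nil, &block)
      properties << prop_name.to_sym
      register([:property, prop_name.to_sym], meth, &block)
    end
  end

  attr_reader :data

  def initialize
    @data = []
  end

  # Same sequence as the run listing above.
  def run
    self.class.run_callbacks([:before, :all], self)
    self.class.properties.each do |property|
      self.class.run_callbacks([:before, :each], self)
      self.class.run_callbacks([:property, property], self)
      self.class.run_callbacks([:after, :each], self)
    end
    self.class.run_callbacks([:after, :all], self)
    data
  end

  before(:all)  { data << "before all" }
  before(:each) { data << "before each" }
  prop(:title)  { data << "title" }
  prop(:links)  { data << "links" }
  after(:each)  { data << "after each" }
  after(:all)   { data << "after all" }
end

trace = TraceScraper.new.run
# trace lists the steps in the order they ran:
# before all, before each, title, after each,
# before each, links, after each, after all
```

Properties run in the order they were declared with `prop`, and the `:each` callbacks wrap every property individually.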