module ScrapCbf::Findable

This module provides helper methods for finding specific component views (e.g. tables).

Public Instance Methods

find_table_by_header( elems, compare, regex = '[[:alpha:]]+', accuracy = 1.0 )

Finds the first table in the document whose header has a single level (one row) and matches the compare argument with at least the given accuracy.

This method uses Array#find, so it returns the first matching object, or nil if no table matches.

@return [Nokogiri::XML::Element, nil]
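The return contract follows directly from Array#find. As a minimal sketch, using plain strings as hypothetical stand-ins for the Nokogiri table elements:

```ruby
# Hypothetical stand-in data: plain strings instead of Nokogiri elements.
headers = ['Pos Team', 'Date Venue', 'Name Goals']

# Array#find returns the first element for which the block is truthy.
first_match = headers.find { |h| h.include?('Venue') }

# When no element satisfies the block, Array#find returns nil.
no_match = headers.find { |h| h.include?('Referee') }

p first_match
p no_match
```

Callers should therefore check the result for nil before using it.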

# File lib/scrap_cbf/helpers/lib/findable.rb, line 13
def find_table_by_header(
  elems,
  compare,
  regex = '[[:alpha:]]+',
  accuracy = 1.0
)
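  elems.find do |table|
    # Consider only tables whose header has a single row (single level).
    thead = table.css('thead').first
    # Use `next` to skip this table; `return` would exit the whole method.
    next false if !thead || thead.css('tr').length > 1

    # Extract the header words using the given pattern.
    header = thead.text.scan(Regexp.new(regex))

    next false if header.empty?

    # Fraction of header words also present in compare (float division).
    matches = (compare & header).length
    (matches.to_f / header.length) >= accuracy
  end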
end
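The accuracy check can be illustrated without Nokogiri. The sketch below assumes that accuracy is meant as the fraction of header words that also appear in compare; header_match_ratio is a hypothetical helper, not part of ScrapCbf, and it takes the header text as a plain string:

```ruby
# Hypothetical helper mirroring the header check, using plain text
# in place of a Nokogiri <thead> element.
def header_match_ratio(header_text, compare, regex = '[[:alpha:]]+')
  # Split the header text into words using the same default pattern.
  header = header_text.scan(Regexp.new(regex))
  return 0.0 if header.empty?

  # Array#& keeps the elements common to both arrays (without duplicates).
  matches = (compare & header).length
  matches.to_f / header.length
end

compare = %w[Pos Team Pts]

# Every header word is in compare.
full = header_match_ratio('Pos Team Pts', compare)

# Three of the four header words are in compare.
partial = header_match_ratio('Pos Team Pts GD', compare)

puts full
puts partial
```

Under this reading, the default accuracy of 1.0 accepts only a header whose words are all present in compare, while a lower value such as 0.75 tolerates extra columns.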