class Krikri::Harvesters::MarcXMLHarvester

A harvester implementation for reading MARC XML documents from a source.

Subclasses of MarcXMLHarvester are expected to implement the following methods:

- #each_collection (yields one or more IO objects, each containing a MARC XML collection)
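
A subclass might satisfy this contract roughly as sketched below. `SketchHarvester` is a hypothetical stand-in for the abstract part of the Krikri base class, so the snippet runs without Krikri. `InMemoryHarvester` and its sample XML are likewise invented for illustration.

```ruby
require 'stringio'

# Hypothetical stand-in for the abstract part of MarcXMLHarvester;
# the real class lives in Krikri and does considerably more.
class SketchHarvester
  def each_collection
    raise NotImplementedError
  end
end

# Hypothetical subclass: yields each in-memory XML string as an IO object.
class InMemoryHarvester < SketchHarvester
  def initialize(xml_strings)
    @xml_strings = xml_strings
  end

  def each_collection
    # Return an Enumerator when no block is given, a common Ruby convention.
    return enum_for(:each_collection) unless block_given?

    @xml_strings.each { |xml| yield StringIO.new(xml) }
  end
end

harvester = InMemoryHarvester.new(['<collection><record/></collection>'])
harvester.each_collection { |io| puts io.read }
```

A real subclass would more likely yield file handles or HTTP response bodies. The only requirement stated above is that each yielded object is an IO containing a MARC XML collection.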

Public Instance Methods

content_type()

@return [String] the content type for the records generated by this harvester
# File lib/krikri/harvesters/marc_xml_harvester.rb, line 39
def content_type
  'text/xml'
end
each_collection()

@abstract @yield [Enumerable<IO>] gives a collection of IO objects representing the XML to be parsed into records.
# File lib/krikri/harvesters/marc_xml_harvester.rb, line 18
def each_collection
  raise NotImplementedError
end
record_ids()

@return [Enumerator::Lazy] an enumerator of the 001 control fields from the records targeted by this harvester.
# File lib/krikri/harvesters/marc_xml_harvester.rb, line 32
def record_ids
  enumerate_records.lazy.map { |rec| rec.identifier }
end
records()

@return [Enumerator::Lazy] an enumerator of the records targeted by this harvester.
# File lib/krikri/harvesters/marc_xml_harvester.rb, line 25
def records
  enumerate_records.lazy.map { |rec| build_record(rec) }
end
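
Because `record_ids` and `records` both return an `Enumerator::Lazy`, the mapping block runs only for elements a caller actually consumes. The sketch below illustrates that behavior in plain Ruby. The `force_first` helper, the fake identifiers, and the `built` log are illustrative assumptions, not part of Krikri.

```ruby
# Hypothetical helper: forces `n` elements from a lazy pipeline and returns
# both the results and a log of which elements were actually built.
def force_first(n)
  built = []

  # Stand-in for enumerate_records: an Enumerator over three fake identifiers.
  source = Enumerator.new do |yielder|
    %w[rec1 rec2 rec3].each { |id| yielder << id }
  end

  # Mirrors the shape of `enumerate_records.lazy.map { |rec| ... }`.
  records = source.lazy.map do |id|
    built << id # record each time the map block really runs
    "record:#{id}"
  end

  [records.first(n), built]
end

results, built = force_first(2)
puts results.inspect
puts built.inspect
```

Requesting two elements runs the block only twice, so the third identifier is never touched. For a harvester this suggests that taking a small sample of records should not require building every record in the source.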

Private Instance Methods

build_record(marcxml_doc)

Builds an instance of `@record_class` with the given doc's MARC XML as content.

@param marcxml_doc [MarcXMLDoc] the MarcXML document to serialize as `#content`

@return [#to_s] an instance of `@record_class` with a minted id and the given content
# File lib/krikri/harvesters/marc_xml_harvester.rb, line 68
def build_record(marcxml_doc)
  @record_class.build(mint_id(marcxml_doc.identifier),
                      marcxml_doc.source,
                      content_type)
end
enumerate_records()

@return [Enumerator] an enumerator over the records

# File lib/krikri/harvesters/marc_xml_harvester.rb, line 47
def enumerate_records
  Enumerator.new do |yielder|
    each_collection do |marcxml_io|
      Nokogiri::XML::Reader(marcxml_io).each do |node|
        if node.name == 'record' &&
           node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
          yielder << MarcXMLDoc.new(node)
        end
      end
    end
  end
end
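
The pattern used by `enumerate_records`, wrapping a block-yielding method in `Enumerator.new` so that several collections appear as one filtered stream, can be sketched without Nokogiri. Everything below is a hypothetical stand-in: `SketchNode` imitates the nodes a streaming XML reader emits, and `SKETCH_COLLECTIONS` replaces real MARC XML input.

```ruby
# Hypothetical node type standing in for what a streaming XML reader emits.
SketchNode = Struct.new(:name, :type, :id)

# Hypothetical data: two "collections", each a list of reader nodes.
SKETCH_COLLECTIONS = [
  [SketchNode.new('collection', :element, nil),
   SketchNode.new('record', :element, 'a1'),
   SketchNode.new('record', :end_element, 'a1')],
  [SketchNode.new('record', :element, 'b2'),
   SketchNode.new('leader', :element, nil)]
].freeze

# Stand-in for a subclass's #each_collection.
def each_collection
  SKETCH_COLLECTIONS.each { |nodes| yield nodes }
end

# Same shape as the method above: one Enumerator spanning all collections.
def enumerate_records
  Enumerator.new do |yielder|
    each_collection do |nodes|
      nodes.each do |node|
        # Keep only opening <record> elements; closing tags and other
        # element names are skipped, as in the original filter.
        yielder << node if node.name == 'record' && node.type == :element
      end
    end
  end
end

puts enumerate_records.map(&:id).inspect
```

Filtering on the node type matters because a streaming reader visits both the opening and closing tag of each element. Without that check, each record would be yielded twice.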