class SolrEad::Indexer

The main entry point for your ead going into solr.

SolrEad uses RSolr to connect to your solr server and then gives you a couple of simple methods for creating, updating and deleting your ead documents.

You'll need to have your solr configuration defined in config/solr.yml. If you're working within the Rails environment, it will obey your environment settings. If you are using the gem by itself outside of Rails, you can set the RAILS_ENV environment variable to choose an environment; otherwise, it will default to the development url. If the SOLR_URL environment variable is set, that url is used instead of the one in config/solr.yml.
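The environment lookup can be sketched in isolation. This is a minimal illustration rather than the gem's own code path: the YAML content and its URLs are made-up placeholders, and `solr_url_for` is a hypothetical helper. It assumes only that each environment key holds a `url` entry and that the file is passed through ERB before parsing.

```ruby
require "yaml"
require "erb"

# Hypothetical contents of config/solr.yml; the URLs are placeholders.
SAMPLE_SOLR_YML = <<~YAML
  development:
    url: http://localhost:8983/solr/development
  test:
    url: <%= "http://localhost:8983/solr/" + "test" %>
  production:
    url: http://solr.example.org:8983/solr/production
YAML

# Hypothetical helper: render ERB, parse the YAML, and pick the url for the
# requested environment, falling back to "development" when none is given.
def solr_url_for(yaml_text, env = nil)
  config = YAML.load(ERB.new(yaml_text).result)
  config.fetch(env || "development").fetch("url")
end

puts solr_url_for(SAMPLE_SOLR_YML)          # no environment given: development url
puts solr_url_for(SAMPLE_SOLR_YML, "test")  # url assembled by the ERB tag
```

Using `fetch` instead of `[]` makes a missing environment or a missing `url` key fail loudly with a `KeyError` instead of quietly returning nil.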

Default indexing

This will index your ead into one solr document for the main portion of ead and then multiple documents for the component documents. The fields for the main document are defined in SolrEad::Document and fields for the component are defined in SolrEad::Component.

> file = File.new("path/to/your/ead.xml")
> indexer = SolrEad::Indexer.new
> indexer.create(file)
> indexer.delete("EAD-ID")

Simple indexing

By using the :simple option, SolrEad will create only one solr document from one ead. The default implementation of SolrEad is to create multiple documents, so fields defined in SolrEad::Document reflect this. For example, no component fields are defined in SolrEad::Document, so none would be indexed. If you elect to use the :simple option, you'll want to override SolrEad::Document with your own and define any additional component fields you want to appear in your index.

> file = File.new("path/to/your/ead.xml")
> indexer = SolrEad::Indexer.new(:simple => true)
> indexer.create(file)
> indexer.delete("EAD-ID")

Attributes

options[RW]
solr[RW]

Public Class Methods

new(opts={})

Creates a new instance of SolrEad::Indexer and connects to your solr server.

# File lib/solr_ead/indexer.rb, line 42
def initialize opts={}
  self.solr = solr_connection
  self.options = opts
end

Public Instance Methods

create(file)

Indexes ead xml and commits the results to your solr server.

# File lib/solr_ead/indexer.rb, line 48
def create file
  solr.add om_document(File.new(file)).to_solr
  add_components(file) unless options[:simple]
  solr.commit
end

delete(id)

Deletes the ead document and any component documents from your solr index and commits the results.

# File lib/solr_ead/indexer.rb, line 67
def delete id
  solr.delete_by_query( Solrizer.solr_name("ead", :stored_sortable)+':"' + id + '"')
  solr.commit
end
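The query above is built by plain string concatenation, so an id that contains a double quote would end the quoted phrase early. The following is a hedged sketch of the same query with escaping added. `ead_delete_query` is a hypothetical helper, not part of SolrEad, and the field name `ead_ssi` is only an assumed stand-in for whatever `Solrizer.solr_name("ead", :stored_sortable)` returns in your Solrizer version.

```ruby
# Hypothetical helper: builds a quoted-phrase query for the given field,
# backslash-escaping double quotes and backslashes inside the id.
def ead_delete_query(field, id)
  escaped = id.to_s.gsub(/(["\\])/) { "\\#{$1}" }
  %(#{field}:"#{escaped}")
end

puts ead_delete_query("ead_ssi", "EAD-ID")
puts ead_delete_query("ead_ssi", 'odd"id')
```

The resulting string could be passed to `solr.delete_by_query` in the same way as the concatenated version.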

update(file)

Updates your ead from a given file by first deleting the existing ead document and any component documents, then creating a new index from the supplied file. This method will also commit the results to your solr server when complete.

# File lib/solr_ead/indexer.rb, line 57
def update file
  solr_doc = om_document(File.new(file)).to_solr
  delete solr_doc["id"]
  solr.add solr_doc
  add_components(file) unless options[:simple]
  solr.commit
end
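The order of operations in update can be traced with a stand-in for the solr connection. This is a sketch under stated assumptions: `RecordingSolr` and `SketchIndexer` are hypothetical classes written for illustration, the field name `ead_ssi` is assumed, and component indexing is left out (as it would be with the :simple option).

```ruby
# Stand-in for the RSolr connection: records each call instead of
# talking to a solr server.
class RecordingSolr
  attr_reader :calls

  def initialize
    @calls = []
  end

  def add(doc)
    @calls << [:add, doc["id"]]
  end

  def delete_by_query(query)
    @calls << [:delete_by_query, query]
  end

  def commit
    @calls << [:commit]
  end
end

# Hypothetical miniature of the indexer, mirroring only delete and update.
class SketchIndexer
  def initialize(solr)
    @solr = solr
  end

  def delete(id)
    @solr.delete_by_query(%(ead_ssi:"#{id}")) # field name is an assumption
    @solr.commit
  end

  def update(solr_doc)
    delete solr_doc["id"]
    @solr.add solr_doc
    @solr.commit
  end
end

solr = RecordingSolr.new
SketchIndexer.new(solr).update({ "id" => "EAD-ID" })
solr.calls.each { |call| p call }
```

Because delete commits on its own, one update results in two commits: one after the old documents are removed and one after the new ones are added.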

Private Instance Methods

add_components(file, counter = 1)

Creates solr documents for each individual component node in the ead. Field names and values are determined according to the OM terminology outlined in SolrEad::Component as well as additional fields taken from the rest of the ead document as described in SolrEad::Behaviors#additional_component_fields.

Fields from both the terminology and additional_component_fields are all assembled into one solr document via the SolrEad::Component#to_solr method. Any customizations to the contents or appearance of the fields can take place within that method.

Furthermore, one final field is added to the solr document after the to_solr method. A sorting field sort_i is added to the document using the index values from the array of <c> nodes. This allows us to preserve the order of <c> nodes as they appear in the original ead document.

# File lib/solr_ead/indexer.rb, line 105
def add_components file, counter = 1
  components(file).each do |node|
    solr_doc = om_component_from_node(node).to_solr(additional_component_fields(node))
    solr_doc.merge!({Solrizer.solr_name("sort", :sortable, :type => :integer) => counter.to_s})
    solr.add solr_doc
    counter = counter + 1
  end
end
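The ordering step can be illustrated without a solr server. The sketch below assigns sort values to a list of plain hashes; the component documents and `with_sort_values` are hypothetical, and the field name `sort_i` is taken from the description above rather than computed by Solrizer. It uses each_with_index in place of the manual counter, which is a stylistic alternative and not the gem's implementation.

```ruby
# Hypothetical component documents, in the order their <c> nodes appear.
component_docs = [
  { "id" => "eadref1", "title" => "Series I" },
  { "id" => "eadref2", "title" => "Series II" },
  { "id" => "eadref3", "title" => "Series III" }
]

# Returns copies of the documents with a sort value added to each one,
# starting at `start` and stored as a string like the original counter.
def with_sort_values(docs, sort_field = "sort_i", start = 1)
  docs.each_with_index.map do |doc, index|
    doc.merge({ sort_field => (start + index).to_s })
  end
end

sorted_docs = with_sort_values(component_docs)
sorted_docs.each { |doc| puts "#{doc[sort_field = 'sort_i']} #{doc['title']}" }
```

Sorting query results on this field should then return components in document order.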

om_component_from_node(node)

Returns an OM document from a given Nokogiri node.

Determines if you have specified a custom definition for your ead component. If you've defined a class CustomComponent, and have passed it as an option to your indexer, then SolrEad will use that class instead of SolrEad::Component.

# File lib/solr_ead/indexer.rb, line 88
def om_component_from_node node
  options[:component] ? options[:component].from_xml(prep(node)) : SolrEad::Component.from_xml(prep(node))
end

om_document(file)

Returns an OM document from a given file.

Determines if you have specified a custom definition for your ead document. If you've defined a class CustomDocument, and have passed it as an option to your indexer, then SolrEad will use that class instead of SolrEad::Document.

# File lib/solr_ead/indexer.rb, line 79
def om_document file
  options[:document] ? options[:document].from_xml(File.new(file)) : SolrEad::Document.from_xml(File.new(file))
end
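The class selection used by om_document and om_component_from_node comes down to "use the option if given, otherwise the default". A minimal sketch with stub classes follows; `DefaultDocument`, `CustomDocument` and `document_class` are hypothetical names for illustration, and the stubs only mimic the `from_xml` class method, not real OM behavior.

```ruby
# Stub standing in for SolrEad::Document.
class DefaultDocument
  attr_reader :xml

  def self.from_xml(xml)
    new(xml)
  end

  def initialize(xml)
    @xml = xml
  end
end

# Stub standing in for a user-defined replacement.
class CustomDocument < DefaultDocument
end

# Picks the class passed in the options hash, or the default when the
# :document key is missing or nil.
def document_class(options)
  options[:document] || DefaultDocument
end

puts document_class({}).from_xml("<ead/>").class
puts document_class({ :document => CustomDocument }).from_xml("<ead/>").class
```

With the real gem, the equivalent call would presumably look like `SolrEad::Indexer.new(:document => CustomDocument, :component => CustomComponent)`, using the option keys shown in the source above.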

solr_connection()

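Returns a connection to solr using RSolr. If the SOLR_URL environment variable is set, it takes precedence over the url from config/solr.yml.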

# File lib/solr_ead/indexer.rb, line 115
def solr_connection
  if ENV['SOLR_URL']
    RSolr.connect :url => ENV['SOLR_URL']
  else
    RSolr.connect :url => solr_url
  end
end

solr_url()

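Determines the url to our solr service by reading config/solr.yml, which is processed with ERB before being parsed as yaml. The environment is taken from Rails.env, then RAILS_ENV, and otherwise defaults to development.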

# File lib/solr_ead/indexer.rb, line 124
def solr_url
  if defined?(Rails.root)
    ::YAML.load(ERB.new(File.read(File.join(Rails.root,"config","solr.yml"))).result)[Rails.env]['url']
  elsif ENV['RAILS_ENV']
    ::YAML.load(ERB.new(File.read("config/solr.yml")).result)[ENV['RAILS_ENV']]['url']
  else
    ::YAML.load(ERB.new(File.read("config/solr.yml")).result)['development']['url']
  end
end