class Krikri::Enrichments::DcmiTypeMap

Maps string values to DCMI Type vocabulary terms.

Mapping is performed by comparing the pre-enrichment value against the string keys of a hash, using White Similarity to determine the closest match. If a suitable match is found, a {DPLA::MAP::Controlled::DCMIType} object is built from the corresponding hash value and set as the new value.

Values that are not strings, and strings without a sufficiently close DCMI Type match, are returned unchanged.

@example

type_mapper = DcmiTypeMap.new
type_mapper.enrich_value('image') # finds RDF::DCMITYPE.Image
type_mapper.enrich_value('a book') # finds RDF::DCMITYPE.Text
type_mapper.enrich_value('a really cool book') # => 'a really cool book'

@example

type_mapper = DcmiTypeMap.new('poloriod' => RDF::DCMITYPE.Image)
type_mapper.enrich_value('poloroid.') # finds RDF::DCMITYPE.Image

Constants

DEFAULT_MAP

Public Class Methods

new(map = nil)

@param map [Hash<String, RDF::Vocabulary::Term>]

# File lib/krikri/enrichments/dcmi_type_map.rb, line 82
def initialize(map = nil)
  @map = map || DEFAULT_MAP
end

Public Instance Methods

enrich_value(value)

@param value [Object] the value to enrich

@return [DPLA::MAP::Controlled::DCMIType, Object] the matching DCMI Type term. The original value is returned unchanged if it is not a String or if no match is found.

# File lib/krikri/enrichments/dcmi_type_map.rb, line 91
def enrich_value(value)
  return value unless value.is_a? String

  match = @map.fetch(value.downcase) { most_similar(value) }
  return value if match.nil?
  dcmi = DPLA::MAP::Controlled::DCMIType.new(match)
  dcmi.prefLabel = match.label
  dcmi
end
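The lookup order in `enrich_value` (exact key first, fuzzy fallback second, pass-through otherwise) can be illustrated with a small standalone sketch. The names `SKETCH_MAP` and `sketch_enrich` are hypothetical, plain symbols stand in for the RDF terms, and the fuzzy matcher is supplied as an optional block rather than the class's `most_similar`.

```ruby
# Hypothetical stand-in map: symbols replace RDF vocabulary terms.
SKETCH_MAP = { 'image' => :Image, 'book' => :Text }.freeze

# Sketch of the lookup flow only; not the Krikri implementation.
def sketch_enrich(value, map = SKETCH_MAP, &fallback)
  # Non-string values pass through untouched.
  return value unless value.is_a?(String)

  # Hash#fetch runs its block only when the downcased key is missing,
  # so the (possibly expensive) fuzzy fallback is skipped on exact hits.
  match = map.fetch(value.downcase) { fallback ? fallback.call(value) : nil }

  # No match at all: hand back the original value.
  match.nil? ? value : match
end

sketch_enrich('IMAGE')                   # => :Image
sketch_enrich(42)                        # => 42
sketch_enrich('unknown')                 # => "unknown"
sketch_enrich('unknown') { |_| :Text }   # => :Text
```

Because the exact lookup uses the downcased value, map keys are assumed to be lowercase, as they are in the examples above.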

Private Instance Methods

most_similar(value, threshold = 0.5)

Performs White Similarity comparison against the keys, and gives the value of the closest match.

@param value [String] a string value to compare to the hash map keys

@param threshold [Float] the similarity score a key must exceed to be considered a match

@return [RDF::Vocabulary::Term, nil] the closest DCMI type match, or `nil` if none is sufficiently close

@see Text::WhiteSimilarity

@see www.catalysoft.com/articles/strikeamatch.html article defining the White Similarity algorithm

@todo consider text similarity algorithms/strategies and move text matching to a utility and behind a Facade interface.

# File lib/krikri/enrichments/dcmi_type_map.rb, line 121
def most_similar(value, threshold = 0.5)
  @white ||= Text::WhiteSimilarity.new
  result = @map.max_by { |str, _| @white.similarity(value, str) }

  return result[1] if @white.similarity(value, result.first) > threshold
  nil
end
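For readers unfamiliar with White Similarity, the following is a rough, dependency-free sketch of the letter-pair comparison described in the linked article, followed by the same "best key above a threshold" selection. The helper names `letter_pairs`, `pair_similarity`, and `closest` are hypothetical, and the scoring is an approximation written for illustration; it is not guaranteed to match `Text::WhiteSimilarity` in every edge case.

```ruby
# Adjacent two-letter pairs of each whitespace-separated word, upcased.
# Single-character words contribute no pairs.
def letter_pairs(str)
  str.upcase.split(/\s+/).flat_map do |word|
    (0...(word.length - 1)).map { |i| word[i, 2] }
  end
end

# Twice the number of shared pairs divided by the total number of pairs,
# giving a score between 0.0 and 1.0.
def pair_similarity(a, b)
  pairs_a = letter_pairs(a)
  pairs_b = letter_pairs(b)
  total = pairs_a.length + pairs_b.length
  return 0.0 if total.zero?

  shared = 0
  pairs_a.each do |pair|
    idx = pairs_b.index(pair)
    next unless idx

    shared += 1
    pairs_b.delete_at(idx) # each pair in b may be matched only once
  end
  (2.0 * shared) / total
end

# Value of the most similar key, or nil if the best score does not
# exceed the threshold (or the map is empty).
def closest(value, map, threshold = 0.5)
  key, term = map.max_by { |str, _| pair_similarity(value, str) }
  return nil if key.nil?

  pair_similarity(value, key) > threshold ? term : nil
end

terms = { 'book' => :Text, 'image' => :Image }
pair_similarity('a book', 'book')     # => 1.0
closest('a book', terms)              # => :Text
closest('a really cool book', terms)  # => nil
```

Under this sketch, 'a really cool book' shares only three of fourteen letter pairs with 'book', which is why it falls below the default threshold of 0.5 and is left unmatched.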