class Krikri::Enrichments::DcmiTypeMap
Maps string values to DCMI Type vocabulary terms.
Mapping is performed by comparing the pre-enriched value against the string keys of a hash, using White Similarity comparisons to determine the closest match. If a suitable match is found, a {DPLA::MAP::Controlled::DCMIType} object is built from the corresponding hash value and set as the new value.
This enrichment ignores strings without a DCMI Type match.
@example
  type_mapper = DcmiTypeMap.new
  type_mapper.enrich_value('image') # finds RDF::DCMITYPE.Image
  type_mapper.enrich_value('a book') # finds RDF::DCMITYPE.Text
  type_mapper.enrich_value('a really cool book') # => 'a really cool book'
@example
  type_mapper = DcmiTypeMap.new('poloriod' => RDF::DCMITYPE.Image)
  type_mapper.enrich_value('poloroid.') # finds RDF::DCMITYPE.Image
Constants
- DEFAULT_MAP
Public Class Methods
@param map [Hash<String, RDF::Vocabulary::Term>]
# File lib/krikri/enrichments/dcmi_type_map.rb, line 82
def initialize(map = nil)
  @map = map || DEFAULT_MAP
end
Public Instance Methods
@param value [Object] the value to enrich
@return [DPLA::MAP::Controlled::DCMIType, Object] the matching DCMI Type term, or the original value unchanged if it is not a String or no match is found.
# File lib/krikri/enrichments/dcmi_type_map.rb, line 91
def enrich_value(value)
  return value unless value.is_a? String

  match = @map.fetch(value.downcase) { most_similar(value) }
  return value if match.nil?

  dcmi = DPLA::MAP::Controlled::DCMIType.new(match)
  dcmi.prefLabel = match.label
  dcmi
end
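The lookup in `enrich_value` relies on `Hash#fetch` with a block: the block runs only when the downcased key is absent, so the fuzzy comparison is a fallback rather than the default path. The following is a minimal sketch of that control flow only. `TYPE_MAP`, `fallback`, and `lookup` are hypothetical names, symbols stand in for RDF terms, and the prefix check is a placeholder for the real similarity measure.

```ruby
# Hypothetical stand-ins: symbols replace RDF terms, and the fallback is a
# trivial prefix check, not the White Similarity comparison Krikri uses.
TYPE_MAP = { 'image' => :image, 'text' => :text }.freeze

# Placeholder for a fuzzy matcher: returns a mapped value or nil.
def fallback(value)
  key = TYPE_MAP.keys.find { |k| value.downcase.start_with?(k) }
  key && TYPE_MAP[key]
end

def lookup(value)
  # Non-string values pass through untouched.
  return value unless value.is_a?(String)

  # The block is evaluated only when the exact (downcased) key is missing.
  match = TYPE_MAP.fetch(value.downcase) { fallback(value) }

  # No match at all: hand back the original value.
  match.nil? ? value : match
end
```

Under these assumptions, `lookup('IMAGE')` resolves through the exact key, `lookup('images')` resolves through the fallback, and `lookup('sound')` and `lookup(42)` come back unchanged.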
Private Instance Methods
Performs White Similarity comparison against the keys, and gives the value of the closest match.
@param value [String] a string value to compare to the hash map keys.
@param threshold [Float] the similarity score above which a string is considered to be a match
@return [RDF::Vocabulary::Term, nil] the closest DCMI type match, or `nil` if none is sufficiently close
@see Text::WhiteSimilarity
@see www.catalysoft.com/articles/strikeamatch.html article defining the White Similarity algorithm
@todo consider text similarity algorithms/strategies and move text matching to a utility and behind a Facade interface.
# File lib/krikri/enrichments/dcmi_type_map.rb, line 121
def most_similar(value, threshold = 0.5)
  @white ||= Text::WhiteSimilarity.new
  result = @map.max_by { |str, _| @white.similarity(value, str) }
  return result[1] if @white.similarity(value, result.first) > threshold
  nil
end
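To make the threshold easier to reason about, here is a rough, dependency-free sketch of the pair-based similarity described in the "strike a match" article, together with a closest-key selection in the spirit of `most_similar`. It is not the `text` gem's code, and `letter_pairs`, `pair_similarity`, and `closest` are hypothetical names. The empty-map guard is an addition of this sketch, and the sample map with a `'book'` key is an assumption, not a claim about `DEFAULT_MAP`.

```ruby
# Hypothetical helper: upcased adjacent-letter pairs, computed word by word.
def letter_pairs(str)
  str.upcase.split(/\s+/).flat_map do |word|
    (0...(word.length - 1)).map { |i| word[i, 2] }
  end
end

# Twice the number of shared pairs divided by the total number of pairs.
# Each pair in the second string can be matched at most once.
def pair_similarity(a, b)
  pairs_a = letter_pairs(a)
  pairs_b = letter_pairs(b)
  total = pairs_a.size + pairs_b.size
  return 0.0 if total.zero?

  remaining = pairs_b.dup
  shared = pairs_a.count do |pair|
    idx = remaining.index(pair)
    remaining.delete_at(idx) if idx
  end
  (2.0 * shared) / total
end

# Picks the value whose key scores highest, if that score beats the threshold.
# The empty-map guard is an assumption of this sketch.
def closest(map, value, threshold = 0.5)
  return nil if map.empty?

  key, term = map.max_by { |k, _| pair_similarity(value, k) }
  pair_similarity(value, key) > threshold ? term : nil
end
```

With an assumed map of `{ 'book' => :text, 'image' => :image }`, the string `'a book'` shares all three letter pairs with `'book'` and scores 1.0, while `'a really cool book'` shares only 3 of 14 total pairs (about 0.43) and falls below the default threshold of 0.5.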