class NameFinder
Find names from a known list in a text, taking account of names that may overlap. For example, Waterloo and Waterloo East are separate stations; NameFinder, knowing both, will not give a false match for Waterloo in a text that mentions Waterloo East.
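The overlap handling described above can be illustrated with a minimal, self-contained sketch. TinyNameFinder and its helpers are hypothetical stand-ins written for this example; it assumes a word-level nested Hash and a longest-match rule, which may differ from how NameFinder stores and walks its own tree.

```ruby
# Hypothetical stand-in for illustration only; not the gem's implementation.
class TinyNameFinder
  def initialize
    # Nested Hash: word => subtree. The :name key marks a complete entry.
    @tree = {}
  end

  # Store a term word by word, remembering its original spelling.
  def add(term)
    node = words(term).inject(@tree) { |n, w| n[w] ||= {} }
    node[:name] = term
  end

  # Return each known name once, in order of first appearance.
  def find_all_in(text)
    tokens = words(text)
    found = []
    i = 0
    while i < tokens.length
      name, length = longest_match(tokens, i)
      if name
        found << name unless found.include?(name)
        i += length # skip the whole match, so "Waterloo East" is not re-read
      else
        i += 1
      end
    end
    found
  end

  private

  # Lowercase and split on anything that is not a-z.
  def words(text)
    text.downcase.gsub(/[^a-z]+/, " ").split
  end

  # Walk the tree from tokens[start], keeping the longest complete name seen.
  def longest_match(tokens, start)
    node = @tree
    best = nil
    tokens[start..-1].each_with_index do |word, offset|
      node = node[word]
      break unless node
      best = [node[:name], offset + 1] if node[:name]
    end
    best
  end
end

finder = TinyNameFinder.new
finder.add "Waterloo"
finder.add "Waterloo East"
finder.find_all_in("Meet me at Waterloo East")
# => ["Waterloo East"]
finder.find_all_in("Change at Waterloo East for Waterloo.")
# => ["Waterloo East", "Waterloo"]
```

Because the longer entry is preferred and the scan resumes after it, the shorter name is reported only where it appears on its own.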
Constants
- VERSION
Attributes
root[R]
Public Class Methods
new(tree={})
Initialize a new NameFinder. tree, if supplied, should be the data generated by the export method.
# File lib/name_finder.rb, line 16
def initialize(tree={})
  @tree = tree
  @root = NodeProxy.new(tree, delimiter)
end
Public Instance Methods
add(term)
Add a term to NameFinder’s dictionary.
# File lib/name_finder.rb, line 25
def add(term)
  root.add Buffer.new(normalize(term) + delimiter), term
end
export()
Export the tree of the current dictionary for later re-importing.
# File lib/name_finder.rb, line 50
def export
  @tree
end
find_all_in(haystack)
Find all names from the dictionary in haystack.
# File lib/name_finder.rb, line 40
def find_all_in(haystack)
  Set.new.tap { |all|
    find(haystack) do |found|
      all << found
    end
  }.to_a
end
find_in(haystack)
Find the first name from the dictionary in haystack, or nil if none is found.
# File lib/name_finder.rb, line 31
def find_in(haystack)
  find(haystack) do |found|
    return found
  end
  nil
end
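The difference between find_all_in and find_in comes down to how each consumes the matches yielded by the private find method. The sketch below reproduces both idioms on their own; each_match and first_match are hypothetical helpers, and the list of mentions is made-up input standing in for what find would yield.

```ruby
require "set"

# Hypothetical stand-in for the private find method: yields each match in order.
def each_match(matches)
  matches.each { |m| yield m }
end

mentions = ["Waterloo East", "Waterloo", "Waterloo East"]

# Same collection idiom as find_all_in: the Set drops repeated names,
# and to_a converts the result to an Array.
all = Set.new.tap { |set|
  each_match(mentions) { |found| set << found }
}.to_a
# all contains "Waterloo East" and "Waterloo" once each

# Same early-exit idiom as find_in: return from inside the block on the
# first match, and fall through to nil when nothing is yielded.
def first_match(matches)
  each_match(matches) { |found| return found }
  nil
end

first_match(mentions) # => "Waterloo East"
first_match([])       # => nil
```

So a name mentioned several times in haystack appears once in the result of find_all_in, while find_in stops scanning as soon as one name is found.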
Private Instance Methods
delimiter()
# File lib/name_finder.rb, line 72
def delimiter
  " "
end
find(haystack) { |found| ... }
# File lib/name_finder.rb, line 55
def find(haystack)
  remaining = Buffer.new(normalize(haystack) + delimiter)
  while !remaining.at_end?
    found = root.find(remaining)
    if found
      yield found
      remaining = remaining.advance_by(found.length)
    else
      remaining = remaining.advance_past(delimiter)
    end
  end
end
normalize(term)
# File lib/name_finder.rb, line 68
def normalize(term)
  term.downcase.gsub(/[^a-z]+/, delimiter)
end
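The effect of normalize is easiest to see on a few inputs. The snippet below is a standalone copy of the method with the delimiter inlined as a single space, so it runs without the library; the sample strings are chosen for illustration.

```ruby
# Standalone copy of normalize, with delimiter inlined as " ".
def normalize(term)
  term.downcase.gsub(/[^a-z]+/, " ")
end

normalize("Waterloo East")             # => "waterloo east"
normalize("King's Cross St. Pancras")  # => "king s cross st pancras"
normalize("Platform 9")                # => "platform "
```

Every run of characters outside a-z, including punctuation and digits, collapses to one delimiter, so terms and texts that differ only in case or punctuation normalize to the same string.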