class WordNet::Lemma
Represents a single word in the WordNet lexicon, which can be used to look up a set of synsets.
Constants
- POS_SHORTHAND
- SPACE
Attributes
- id: A unique integer id that references this lemma. Used internally within WordNet's database.
- pointer_symbols: An array of valid pointer symbols for this lemma. The list of all valid pointer symbols is defined in pointers.rb.
- pos: The part of speech of this lemma. One of 'n' (noun), 'v' (verb), 'a' (adjective), or 'r' (adverb).
- synset_offsets: The offsets, in bytes, at which the synsets contained in this lemma are stored in WordNet's internal database.
- word: The word this lemma represents.
Public Class Methods
Find a lemma for a given word and part of speech. Valid parts of speech are 'adj', 'adv', 'noun', and 'verb'; the shorthand forms 'a', 'r', 'n', and 'v' are also accepted. Returns nil if the word is not found.
    # File lib/rwordnet/lemma.rb, line 65
    def find(word, pos)
      # Map shorthand POS to full forms
      pos = POS_SHORTHAND[pos] || pos

      cache = @@cache[pos] ||= build_cache(pos)
      if found = cache[word]
        Lemma.new(*found)
      end
    end
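The lookup above combines two ideas: normalizing shorthand parts of speech and building each per-POS cache only once. A minimal standalone sketch of that pattern follows. The names `POS_SHORTHAND_SKETCH`, `FAKE_INDEX`, and `LookupSketch` are hypothetical stand-ins, and the shorthand table's contents are an assumption, since the real `POS_SHORTHAND` constant is not shown on this page.

```ruby
# Hypothetical shorthand table; the real POS_SHORTHAND may use different keys.
POS_SHORTHAND_SKETCH = { "a" => "adj", "r" => "adv", "n" => "noun", "v" => "verb" }.freeze

# Made-up index contents standing in for the dict/index.* files.
FAKE_INDEX = {
  "verb" => { "fall" => :fall_verb_entry },
  "noun" => { "fall" => :fall_noun_entry }
}.freeze

class LookupSketch
  attr_reader :builds

  def initialize
    @cache = {}
    @builds = Hash.new(0) # counts how often each per-POS cache is built
  end

  def find(word, pos)
    pos = POS_SHORTHAND_SKETCH[pos] || pos  # "v" becomes "verb"; "verb" is left alone
    cache = @cache[pos] ||= build_cache(pos) # built on first use, reused afterwards
    cache[word]                              # nil when the word is unknown
  end

  private

  def build_cache(pos)
    @builds[pos] += 1
    FAKE_INDEX.fetch(pos, {})
  end
end

lookup = LookupSketch.new
lookup.find("fall", "v")       # same entry as the next call
lookup.find("fall", "verb")
lookup.find("missing", "verb") # nil
lookup.builds["verb"]          # the verb cache was built only once
```

Because the cache is keyed by the normalized part of speech, the shorthand and full forms share one cache entry in this sketch.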
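Find all lemmas for this word across all known parts of speech (noun, verb, adjective, and adverb). Returns an empty array if the word is not found under any of them.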
    # File lib/rwordnet/lemma.rb, line 56
    def find_all(word)
      [:noun, :verb, :adj, :adv].flat_map do |pos|
        find(word, pos) || []
      end
    end
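The `|| []` inside `flat_map` is what drops parts of speech with no match: a nil result becomes an empty array, which `flat_map` flattens away. The sketch below illustrates this with a hypothetical lookup table (`KNOWN_ENTRIES`, `find_sketch`, and `find_all_sketch` are invented for illustration and are not part of rwordnet).

```ruby
# Hypothetical stand-in for Lemma.find: returns a string or nil.
KNOWN_ENTRIES = {
  ["fall", :noun] => "fall,n",
  ["fall", :verb] => "fall,v"
}.freeze

def find_sketch(word, pos)
  KNOWN_ENTRIES[[word, pos]]
end

def find_all_sketch(word)
  [:noun, :verb, :adj, :adv].flat_map do |pos|
    # nil turns into [], which flat_map flattens to nothing
    find_sketch(word, pos) || []
  end
end

find_all_sketch("fall") # => ["fall,n", "fall,v"]
find_all_sketch("zzz")  # => []
```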
Create a lemma from a line in a lexicon file. You should not be creating Lemmas by hand; instead, use the WordNet::Lemma.find and WordNet::Lemma.find_all methods to find the Lemma for a word.
    # File lib/rwordnet/lemma.rb, line 28
    def initialize(lexicon_line, id)
      @id = id
      line = lexicon_line.split(" ")

      @word = line.shift
      @pos = line.shift
      synset_count = line.shift.to_i
      @pointer_symbols = line.slice!(0, line.shift.to_i)
      line.shift # Throw away redundant sense_cnt
      @tagsense_count = line.shift.to_i
      @synset_offsets = line.slice!(0, synset_count).map(&:to_i)
    end
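To make the field order concrete, here is a standalone sketch that consumes a line in the same sequence as the constructor above: word, part of speech, synset count, pointer count and symbols, the discarded sense count, tagsense count, and finally the offsets. The helper `parse_index_line` and the sample line are hypothetical; the line is invented for illustration and not copied from a WordNet index file.

```ruby
# Mirrors the order in which the constructor shifts fields off the line.
def parse_index_line(lexicon_line)
  fields = lexicon_line.split(" ")

  word            = fields.shift
  pos             = fields.shift
  synset_count    = fields.shift.to_i
  pointer_count   = fields.shift.to_i
  pointer_symbols = fields.slice!(0, pointer_count)
  fields.shift # discard the redundant sense count
  tagsense_count  = fields.shift.to_i
  synset_offsets  = fields.slice!(0, synset_count).map(&:to_i)

  {
    word: word,
    pos: pos,
    pointer_symbols: pointer_symbols,
    tagsense_count: tagsense_count,
    synset_offsets: synset_offsets
  }
end

# Made-up sample: 2 synsets, 3 pointer symbols, sense count 2, tagsense count 1.
SAMPLE_LINE = "fall v 2 3 @ ~ + 2 1 01970826 01835496"

parsed = parse_index_line(SAMPLE_LINE)
parsed[:pointer_symbols] # => ["@", "~", "+"]
parsed[:synset_offsets]  # => [1970826, 1835496]
```

Note that `to_i` drops the leading zeros, so offsets are stored as plain integers.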
Private Class Methods
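    # File lib/rwordnet/lemma.rb, line 77
    def build_cache(pos)
      cache = {}
      DB.open(File.join("dict", "index.#{pos}")).each_line.each_with_index do |line, index|
        # The word is everything before the first space on the line
        word = line.slice(0, line.index(SPACE))
        # Store the raw line and its 1-based line number, used as the lemma id
        cache[word] = [line, index+1]
      end
      cache
    end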
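The cache maps each word to a pair of the raw index line and its 1-based line number. The sketch below reproduces that loop against an in-memory `StringIO` instead of a real index file. `build_cache_sketch`, `SPACE_SKETCH`, and the two sample lines are hypothetical, and the value of the real `SPACE` constant is assumed to be a single space.

```ruby
require "stringio"

SPACE_SKETCH = " " # assumed value of the SPACE constant

def build_cache_sketch(io)
  cache = {}
  io.each_line.each_with_index do |line, index|
    word = line.slice(0, line.index(SPACE_SKETCH)) # text before the first space
    cache[word] = [line, index + 1]                # raw line plus 1-based line number
  end
  cache
end

# Two made-up index lines standing in for a dict/index.* file.
fake_index = StringIO.new("fall v 1 0 1 0 100\nfruit n 1 0 1 0 200\n")

cache = build_cache_sketch(fake_index)
cache["fruit"] # => ["fruit n 1 0 1 0 200\n", 2]
```

The stored pair matches the two arguments the constructor takes, which is why `find` can pass it straight through with a splat.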
Public Instance Methods
Return a list of synsets for this Lemma. Each synset represents a different sense, or meaning, of the word.
    # File lib/rwordnet/lemma.rb, line 42
    def synsets
      @synset_offsets.map { |offset| Synset.new(@pos, offset) }
    end
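Since each offset is a byte position in a data file, one record can be fetched per offset by seeking to it. The sketch below shows that idea on an in-memory `StringIO`. It is only an illustration of byte-offset lookup under that assumption, not a description of how `Synset` is implemented; `build_fake_data` and `read_record_at` are hypothetical helpers, and the records are made up.

```ruby
require "stringio"

# Builds an in-memory "data file" and records the byte offset of each record.
def build_fake_data(records)
  offsets = []
  data = +""
  records.each do |record|
    offsets << data.bytesize # where this record will start
    data << record
  end
  [StringIO.new(data), offsets]
end

# Jumps to a byte offset and reads one line, as a lookup by offset would.
def read_record_at(io, offset)
  io.seek(offset)
  io.gets.chomp
end

io, offsets = build_fake_data(["alpha\n", "beta\n"])
senses = offsets.map { |offset| read_record_at(io, offset) }
senses # => ["alpha", "beta"]
```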
Returns a compact string representation of this lemma, e.g. “fall,v” for the verb form of the word “fall”.
    # File lib/rwordnet/lemma.rb, line 48
    def to_s
      [@word, @pos].join(",")
    end