class WordNet::Lemma

Represents a single word in the WordNet lexicon, which can be used to look up a set of synsets.

Constants

POS_SHORTHAND
SPACE

Attributes

id[RW]

A unique integer id that references this lemma. Used internally within WordNet's database.

pointer_symbols[RW]

An array of valid pointer symbols for this lemma. The list of all valid pointer symbols is defined in pointers.rb.

pos[RW]

The part of speech of this lemma. One of 'n' (noun), 'v' (verb), 'a' (adjective), or 'r' (adverb).

synset_offsets[RW]

The offsets, in bytes, at which the synsets for this lemma are stored in WordNet's internal database.

tagsense_count[RW]

The number of times the sense is tagged in various semantic concordance texts. A tagsense_count of 0 indicates that the sense has not been semantically tagged.

word[RW]

The word this lemma represents.

Public Class Methods

find(word, pos)

Find a lemma for a given word and pos. Valid parts of speech are: 'adj', 'adv', 'noun', 'verb'. Additionally, you can use the shorthand forms of each of these ('a', 'r', 'n', 'v'). Returns nil if the word is not found for that part of speech.

# File lib/rwordnet/lemma.rb, line 65
def find(word, pos)
  # Map shorthand POS to full forms
  pos = POS_SHORTHAND[pos] || pos

  cache = @@cache[pos] ||= build_cache(pos)
  if found = cache[word]
    Lemma.new(*found)
  end
end
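The lookup in find has two steps: normalise the part of speech, then consult a per-POS index that is built at most once. The following is a minimal standalone sketch of that pattern, not the library's code. The SHORTHAND table is an assumption inferred from the description above (the contents of the real POS_SHORTHAND are not shown on this page), and fake_build is a hypothetical stand-in for build_cache that avoids reading WordNet's files.

```ruby
# Assumed mapping from shorthand to full part-of-speech names.
# The real POS_SHORTHAND constant is not shown on this page.
SHORTHAND = { "a" => "adj", "r" => "adv", "n" => "noun", "v" => "verb" }.freeze

# Hypothetical stand-in for build_cache: counts how often it runs and
# returns a tiny in-memory index instead of reading dict/index.*.
BUILD_CALLS = Hash.new(0)

def fake_build(pos)
  BUILD_CALLS[pos] += 1
  pos == "verb" ? { "fall" => ["fall v (rest of index line)", 1] } : {}
end

CACHE = {}

def lookup(word, pos)
  pos = SHORTHAND[pos] || pos             # "v" becomes "verb"; "verb" passes through
  cache = CACHE[pos] ||= fake_build(pos)  # build the per-POS index only once
  cache[word]                             # nil when the word is absent
end

p lookup("fall", "v")     # found via the shorthand form
p lookup("fall", "verb")  # same entry via the full form
p lookup("fall", "n")     # nil: no such noun in the fake index
p BUILD_CALLS             # each part of speech was built once
```

Because an empty Hash is truthy in Ruby, the `||=` guard also prevents rebuilding an index that turned out to contain no entries.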
find_all(word)

Find all lemmas for this word across all known parts of speech.

# File lib/rwordnet/lemma.rb, line 56
def find_all(word)
  [:noun, :verb, :adj, :adv].flat_map do |pos|
    find(word, pos) || []
  end
end
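The `|| []` inside flat_map is what keeps misses out of the result: find returns nil for a part of speech that lacks the word, and an empty array flattens to nothing. A small sketch of that idiom follows, using fake_find as a hypothetical stand-in for find and plain strings in place of Lemma objects.

```ruby
# Hypothetical stand-in for Lemma.find: returns a value for some
# parts of speech and nil for the rest.
FAKE_INDEX = { noun: "fall,n", verb: "fall,v" }.freeze

def fake_find(pos)
  FAKE_INDEX[pos]
end

results = [:noun, :verb, :adj, :adv].flat_map do |pos|
  fake_find(pos) || []  # a miss contributes an empty array, which flat_map flattens away
end

p results
```

The result holds only the two hits, with no nil placeholders for the adjective and adverb lookups.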
new(lexicon_line, id)

Create a lemma from a line in a lexicon file. You should not be creating Lemmas by hand; instead, use the WordNet::Lemma.find and WordNet::Lemma.find_all methods to find the Lemma for a word.

# File lib/rwordnet/lemma.rb, line 28
def initialize(lexicon_line, id)
  @id = id
  line = lexicon_line.split(" ")

  @word = line.shift
  @pos = line.shift
  synset_count = line.shift.to_i
  @pointer_symbols = line.slice!(0, line.shift.to_i)
  line.shift # Throw away redundant sense_cnt
  @tagsense_count = line.shift.to_i
  @synset_offsets = line.slice!(0, synset_count).map(&:to_i)
end
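To make the field order concrete, here is a standalone sketch that parses one line the same way initialize does, returning a Hash instead of setting instance variables. The method name parse_index_line and the sample line are both made up for illustration; the line is not copied from a real WordNet index file.

```ruby
# Parse one index line into a Hash, following the field order used by initialize.
def parse_index_line(lexicon_line)
  fields = lexicon_line.split(" ")

  word            = fields.shift
  pos             = fields.shift
  synset_count    = fields.shift.to_i
  pointer_count   = fields.shift.to_i
  pointer_symbols = fields.slice!(0, pointer_count)  # next pointer_count tokens
  fields.shift                                       # redundant sense count, discarded
  tagsense_count  = fields.shift.to_i
  synset_offsets  = fields.slice!(0, synset_count).map(&:to_i)

  { word: word, pos: pos, pointer_symbols: pointer_symbols,
    tagsense_count: tagsense_count, synset_offsets: synset_offsets }
end

# Made-up line for illustration only.
sample = "example n 2 3 @ ~ + 2 1 00001740 00002137"
parsed = parse_index_line(sample)
p parsed
```

Note that String#to_i reads the zero-padded offsets as decimal integers, so the leading zeros are simply dropped.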

Private Class Methods

build_cache(pos)
# File lib/rwordnet/lemma.rb, line 77
def build_cache(pos)
  cache = {}
  DB.open(File.join("dict", "index.#{pos}")).each_line.each_with_index do |line, index|
    word = line.slice(0, line.index(SPACE))
    cache[word] = [line, index+1]
  end
  cache
end
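The cache maps each word to the pair that find later splats into Lemma.new: the raw index line and its 1-based line number, which becomes the lemma's id. The sketch below reproduces that loop over an in-memory StringIO rather than a real index file. SPACE_CHAR assumes that the SPACE constant is a single space (its value is not shown on this page), and the two index lines are made up.

```ruby
require "stringio"

# Assumed value of the SPACE constant; not confirmed by this page.
SPACE_CHAR = " "

# Made-up index content. StringIO stands in for the object returned by DB.open.
io = StringIO.new("fall n 1 0 1 0 00000001\nfall_back v 1 0 1 0 00000002\n")

cache = {}
io.each_line.each_with_index do |line, index|
  word = line.slice(0, line.index(SPACE_CHAR))  # text before the first space
  cache[word] = [line, index + 1]               # ids are 1-based line numbers
end

p cache.keys
p cache["fall_back"].last
```

Each stored line keeps its trailing newline; that is harmless here because initialize splits the line on whitespace.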

Public Instance Methods

synsets()

Return a list of synsets for this Lemma. Each synset represents a different sense, or meaning, of the word.

# File lib/rwordnet/lemma.rb, line 42
def synsets
  @synset_offsets.map { |offset| Synset.new(@pos, offset) }
end
to_s()

Returns a compact string representation of this lemma, e.g. “fall,v” for the verb form of the word “fall”.

# File lib/rwordnet/lemma.rb, line 48
def to_s
  [@word, @pos].join(",")
end