class WordNet::Synset

WordNet synonym-set object class

Instances of this class encapsulate the data for a synonym set ('synset') in a WordNet lexical database. A synonym set is a set of words that are interchangeable in some context.

You can either fetch a synset from a connected Lexicon:

lexicon = WordNet::Lexicon.new( 'postgres://localhost/wordnet31' )
ss = lexicon[ :first, 'time' ]
# => #<WordNet::Synset:0x7ffbf2643bb0 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

or, if you've already created a Lexicon, use its connection indirectly to look up a Synset by its ID:

ss = WordNet::Synset[ 115265518 ]
# => #<WordNet::Synset:0x7ffbf257e928 {115265518} 'commencement, first,
#       get-go, offset, outset, start, starting time, beginning, kickoff,
#       showtime' (noun): [noun.time] the time at which something is
#       supposed to begin>

You can fetch a list of the lemmas (base forms) of the words included in the synset:

ss.words.map( &:lemma )
# => ["commencement", "first", "get-go", "offset", "outset", "start",
#     "starting time", "beginning", "kickoff", "showtime"]

But the primary reason for a synset is its lexical and semantic links to other words and synsets. For instance, its hypernym is the equivalent of its superclass: it's the class of things of which the receiving synset is a member.

ss.hypernyms
# => [#<WordNet::Synset:0x7ffbf25c76c8 {115180528} 'point, point in
#        time' (noun): [noun.time] an instant of time>]

The synset's hyponyms, on the other hand, are roughly analogous to its subclasses:

ss.hyponyms
# => [#<WordNet::Synset:0x7ffbf25d83b0 {115142167} 'birth' (noun):
#       [noun.time] the time when something begins (especially life)>,
#     #<WordNet::Synset:0x7ffbf25d8298 {115268993} 'threshold' (noun):
#       [noun.time] the starting point for a new state or experience>,
#     #<WordNet::Synset:0x7ffbf25d8180 {115143012} 'incipiency,
#       incipience' (noun): [noun.time] beginning to exist or to be
#       apparent>,
#     #<WordNet::Synset:0x7ffbf25d8068 {115266164} 'starting point,
#       terminus a quo' (noun): [noun.time] earliest limiting point>]

Traversal

Synset also provides a few 'traversal' methods that recursively search a Synset's semantic links:

# Recursively search for more-general terms for the synset, and print out
# each one with indentation according to how distantly it's related.
lexicon[ :fencing, 'sword' ].
    traverse(:hypernyms).with_depth.
    each {|ss, depth| puts "%s%s [%d]" % ['  ' * (depth-1), ss.words.first, ss.synsetid] }
# (outputs:)
play [100041468]
  action [100037396]
    act [100030358]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
combat [101170962]
  battle [100958896]
    group action [101080366]
      event [100029378]
        psychological feature [100023100]
          abstract entity [100002137]
            entity [100001740]
      act [100030358]
        event [100029378]
          psychological feature [100023100]
            abstract entity [100002137]
              entity [100001740]

See the Traversal Methods section for more details.

Low-Level API

This library is implemented using Sequel::Model, an ORM layer on top of the excellent Sequel database toolkit. This means that in addition to the high-level methods above, you can also make use of a database-oriented API if you need to do something not provided by a high-level method.

In order to make use of this API, you'll need to be familiar with Sequel, especially Datasets and Model Associations. Most of Ruby-WordNet's functionality is implemented in terms of one or both of these.

Datasets

The main dataset is available from WordNet::Synset.dataset:

WordNet::Synset.dataset
# => #<Sequel::SQLite::Dataset: "SELECT * FROM `synsets`">

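In addition to this, Synset also defines a few other canned datasets. Some facilitate searching by part of speech on the Synset class (nouns, verbs, adjectives, adverbs, and adjective_satellites; see the Dataset Methods section), and others are scoped to the semantic links of a particular Synset.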

Constants

SEMANTIC_TYPEKEYS

Semantic link type keys; maps what the API calls them to what they are in the DB.

Public Class Methods

db=( newdb )

Overridden to reset any lookup tables that may have been loaded from the previous database.

Calls superclass method
# File lib/wordnet/synset.rb, line 297
def self::db=( newdb )
        self.reset_lookup_tables
        super
end
lexdomain_table()

Return the table of lexical domains, keyed by id.

# File lib/wordnet/synset.rb, line 315
def self::lexdomain_table
        @lexdomain_table ||= self.db[:lexdomains].to_hash( :lexdomainid )
end
lexdomains()

Return the table of lexical domains, keyed by name as a String (e.g., “verb.cognition”).

# File lib/wordnet/synset.rb, line 321
def self::lexdomains
        @lexdomains ||= self.lexdomain_table.inject({}) do |hash,(id,domain)|
                hash[ domain[:lexdomainname] ] = domain
                hash
        end
end
linktype_table()

Return the table of link types, keyed by linkid.

# File lib/wordnet/synset.rb, line 330
def self::linktype_table
        @linktype_table ||= self.db[:linktypes].inject({}) do |hash,row|
                hash[ row[:linkid] ] = {
                        id: row[:linkid],
                        typename: row[:link],
                        type: row[:link].gsub( /\s+/, '_' ).to_sym,
                        recurses: row[:recurses] && row[:recurses] != 0,
                }
                hash
        end
end
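
To make the shape of this lookup table concrete, here is a minimal sketch that applies the same transformation to a few hand-written rows instead of the linktypes table. The rows and their ids are illustrative assumptions, not data read from a WordNet database, and the second step mirrors the re-keying that linktypes performs.

```ruby
# Hypothetical rows standing in for self.db[:linktypes]; the values are
# made up for illustration and are not read from a real database.
sample_rows = [
  { linkid: 1,  link: 'hypernym',       recurses: 1 },
  { linkid: 12, link: 'member meronym', recurses: 1 },
  { linkid: 30, link: 'antonym',        recurses: 0 },
]

# Same transformation as linktype_table: key by linkid, turn the link name
# into a Symbol, and coerce the recurses column into true or false.
linktype_table = sample_rows.inject({}) do |hash, row|
  hash[ row[:linkid] ] = {
    id: row[:linkid],
    typename: row[:link],
    type: row[:link].gsub( /\s+/, '_' ).to_sym,
    recurses: row[:recurses] && row[:recurses] != 0,
  }
  hash
end

# Re-key the same entries by their Symbol type, as linktypes does.
linktypes = linktype_table.inject({}) do |hash, (id, link)|
  hash[ link[:type] ] = link
  hash
end

p linktype_table[12][:type]       # => :member_meronym
p linktypes[:antonym][:recurses]  # => false
```

The explicit comparison against 0 matters because 0 is truthy in Ruby, so a bare row[:recurses] would mark every link type as recursive.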
linktypes()

Return the table of link types, keyed by name.

# File lib/wordnet/synset.rb, line 344
def self::linktypes
        @linktypes ||= self.linktype_table.inject({}) do |hash,(id,link)|
                hash[ link[:type] ] = link
                hash
        end
end
postype_table()

Return the table of part-of-speech types, keyed by letter identifier.

# File lib/wordnet/synset.rb, line 353
def self::postype_table
        @postype_table ||= self.db[:postypes].inject({}) do |hash, row|
                hash[ row[:pos].untaint.to_sym ] = row[:posname]
                hash
        end
end
postypes()

Return the table of part-of-speech names to letter identifiers (both Symbols).

# File lib/wordnet/synset.rb, line 362
def self::postypes
        @postypes ||= self.postype_table.invert
end
reset_lookup_tables()

Unload all of the cached lookup tables that have been loaded.

# File lib/wordnet/synset.rb, line 304
def self::reset_lookup_tables
        @lexdomain_table = nil
        @lexdomains      = nil
        @linktype_table  = nil
        @linktypes       = nil
        @postype_table   = nil
        @postypes        = nil
end
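
The interplay between db= and reset_lookup_tables can be illustrated with a small stand-alone sketch. ToyModel and the Hash-based "databases" below are hypothetical stand-ins rather than part of the library; the point is only that a table memoized with ||= has to be cleared when the underlying database changes, or stale rows would be returned.

```ruby
# A toy class (not part of the library) with a memoized lookup table that
# is discarded whenever the database handle is replaced.
class ToyModel
  class << self
    attr_reader :db

    # Swap the database and drop anything cached from the old one.
    def db=( newdb )
      reset_lookup_tables
      @db = newdb
    end

    # Built on first use, then reused until the next reset.
    def postype_table
      @postype_table ||= db.fetch( :postypes ).dup
    end

    def reset_lookup_tables
      @postype_table = nil
    end
  end
end

# Plain Hashes stand in for database connections here.
ToyModel.db = { postypes: { n: 'noun' } }
first = ToyModel.postype_table

ToyModel.db = { postypes: { n: 'noun', v: 'verb' } }
second = ToyModel.postype_table

p first   # => {:n=>"noun"} (printed as {n: "noun"} on newer Rubies)
p second.keys  # => [:n, :v]
```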

Public Instance Methods

inspect()

Return a human-readable representation of the object, suitable for debugging.

# File lib/wordnet/synset.rb, line 689
def inspect
        return "#<%p:%0#x {%d} '%s' (%s): [%s] %s>" % [
                self.class,
                self.object_id * 2,
                self.synsetid,
                self.wordlist.join(', '),
                self.part_of_speech,
                self.lexical_domain,
                self.definition,
        ]
end
lexical_domain()

Return the name of the lexical domain the synset belongs to; this also corresponds to the lexicographer's file the synset was originally loaded from.

# File lib/wordnet/synset.rb, line 447
def lexical_domain
        return self.class.lexdomain_table[ self.lexdomainid ][ :lexdomainname ]
end
part_of_speech()

Return the name of the Synset's part of speech (pos).

# File lib/wordnet/synset.rb, line 417
def part_of_speech
        return self.class.postype_table[ self.pos.to_sym ]
end
samples()

Return any sample sentences.

# File lib/wordnet/synset.rb, line 453
def samples
        return self.db[:samples].
                filter( synsetid: self.synsetid ).
                order( :sampleid ).
                map( :sample )
end
senses()

The WordNet::Senses associated with the receiver.

# File lib/wordnet/synset.rb, line 188
one_to_many :senses,
        key: :synsetid,
        primary_key: :synsetid
sumo_terms()

Terms from the Suggested Upper Merged Ontology.

# File lib/wordnet/synset.rb, line 213
many_to_many :sumo_terms,
        join_table: :sumomaps,
        left_key: :synsetid,
        right_key: :sumoid
to_s()

Stringify the synset.

# File lib/wordnet/synset.rb, line 423
def to_s

        # Make a sorted list of the semantic link types from this synset
        semlink_list = self.semlinks_dataset.
                group_and_count( :linkid ).
                to_hash( :linkid, :count ).
                collect do |linkid, count|
                        '%s: %d' % [ self.class.linktype_table[linkid][:typename], count ]
                end.
                sort.
                join( ', ' )

        return "%s (%s): [%s] %s (%s)" % [
                self.words.map( &:to_s ).join(', '),
                self.part_of_speech,
                self.lexical_domain,
                self.definition,
                semlink_list
        ]
end
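
The link summary at the end of the string is the least obvious part of to_s. As a rough sketch, the same formatting steps can be run on plain Hashes; the typenames and counts below are invented stand-ins for linktype_table and the result of the group_and_count query.

```ruby
# Hypothetical link-type names and per-type counts, standing in for
# linktype_table and the grouped semlinks query result.
typenames = { 1 => 'hypernym', 2 => 'hyponym', 30 => 'antonym' }
counts    = { 2 => 33, 30 => 2, 1 => 1 }

# Format each pair as "name: count", sort alphabetically, and join.
semlink_list = counts.
  collect {|linkid, count| '%s: %d' % [ typenames[linkid], count ] }.
  sort.
  join( ', ' )

puts semlink_list
# antonym: 2, hypernym: 1, hyponym: 33
```

Because the sort happens after formatting, entries are ordered by link type name rather than by linkid or count.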
wordlist()

Return the Synset's Words as an Array of Strings.

# File lib/wordnet/synset.rb, line 683
def wordlist
        return self.words.map( &:to_s )
end
words()

The WordNet::Words associated with the receiver.

# File lib/wordnet/synset.rb, line 180
many_to_many :words,
        join_table: :senses,
        left_key: :synsetid,
        right_key: :wordid

Dataset Methods

Public Instance Methods

adjective_satellites()

Limit results to adjective satellites.

# File lib/wordnet/synset.rb, line 286
def adjective_satellites
        return self.where( pos: 's' )
end
adjectives()

Limit results to adjectives.

# File lib/wordnet/synset.rb, line 272
def adjectives
        return self.where( pos: 'a' )
end
adverbs()

Limit results to adverbs.

# File lib/wordnet/synset.rb, line 279
def adverbs
        return self.where( pos: 'r' )
end
nouns()

Limit results to nouns.

# File lib/wordnet/synset.rb, line 258
def nouns
        return self.where( pos: 'n' )
end
verbs()

Limit results to verbs.

# File lib/wordnet/synset.rb, line 265
def verbs
        return self.where( pos: 'v' )
end
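
Each of these methods is a thin wrapper around a where filter on the pos column, which is why they can be chained with further filters. The sketch below imitates that shape without a database: ToyDataset is a hypothetical stand-in for a Sequel dataset that holds an Array of row Hashes, and the rows themselves are made up.

```ruby
# ToyDataset is a hypothetical stand-in for a Sequel dataset: it keeps an
# Array of row Hashes and returns a new, narrower ToyDataset from #where.
class ToyDataset
  attr_reader :rows

  def initialize( rows )
    @rows = rows
  end

  # Keep only rows whose columns equal every value in +conditions+.
  def where( conditions )
    matching = rows.select {|row| conditions.all? {|col, val| row[col] == val } }
    return self.class.new( matching )
  end

  # Same shape as the dataset methods listed above.
  def nouns
    return self.where( pos: 'n' )
  end

  def verbs
    return self.where( pos: 'v' )
  end
end

synsets = ToyDataset.new([
  { synsetid: 1, pos: 'n', lexdomainid: 28 },
  { synsetid: 2, pos: 'v', lexdomainid: 30 },
  { synsetid: 3, pos: 'n', lexdomainid: 4 },
])

noun_ids = synsets.nouns.rows.map {|row| row[:synsetid] }
filtered = synsets.nouns.where( lexdomainid: 28 ).rows

p noun_ids                               # => [1, 3]
p filtered.map {|row| row[:synsetid] }   # => [1]
```

A real Sequel dataset builds SQL lazily instead of filtering rows in memory, so this only illustrates the chaining, not the execution model.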

Traversal Methods

Public Instance Methods

traverse( type, &block )

With a block, yield each WordNet::Synset related to the receiver via a link of the specified type, recursing depth first into each of its links if the link type is recursive. To exit from the traversal at any depth, throw :stop_traversal.

If no block is given, return an Enumerator that will do the same thing instead.

# Print all the parts of a boot
puts lexicon[:boot].traverse( :member_meronyms ).to_a

You can also have the depth of recursion passed to the block as an additional argument by calling with_depth on the Enumerator:

lexicon[:fencing].traverse( :hypernyms ).with_depth.each {|ss,d| puts "%02d: %s" % [d,ss] }
# (outputs:)

01: play, swordplay (noun): [noun.act] the act using a sword (or other weapon) vigorously
  and skillfully (hypernym: 1, hyponym: 1)
02: action (noun): [noun.act] something done (usually as opposed to something said)
  (hypernym: 1, hyponym: 33)
03: act, deed, human action, human activity (noun): [noun.tops] something that people do
  or cause to happen (hypernym: 1, hyponym: 40)
...
# File lib/wordnet/synset.rb, line 625
def traverse( type, &block )
        enum = Enumerator.new do |yielder|
                traversals = [ self.semanticlink_enum(type) ]
                syn        = nil
                typekey    = SEMANTIC_TYPEKEYS[ type ]
                recurses   = self.class.linktypes[ typekey ][:recurses]

                self.log.debug "Traversing %s semlinks%s" % [ type, recurses ? " (recursive)" : ''  ]

                catch( :stop_traversal ) do
                        until traversals.empty?
                                begin
                                        self.log.debug "  %d traversal/s left" % [ traversals.length ]
                                        syn = traversals.last.next

                                        if enum.with_depth?
                                                yielder.yield( syn, traversals.length )
                                        else
                                                yielder.yield( syn )
                                        end

                                        traversals << syn.semanticlink_enum( type ) if recurses
                                rescue StopIteration
                                        traversals.pop
                                end
                        end
                end
        end

        def enum.with_depth?
                @with_depth = false if !defined?( @with_depth )
                return @with_depth
        end

        def enum.with_depth
                @with_depth = true
                self
        end

        return enum.each( &block ) if block
        return enum
end
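
The core of traverse is a stack of external enumerators: the depth reported to the block is simply the height of that stack. The following stand-alone sketch reproduces that technique on a toy graph. ToyNode and traverse_with_depth are hypothetical names introduced for illustration; unlike the real method, this version always recurses and always yields the depth.

```ruby
# ToyNode is a hypothetical stand-in for a synset: a name plus the nodes it
# links to (think of +links+ as its hypernyms).
ToyNode = Struct.new( :name, :links )

# Depth-first traversal driven by a stack of external enumerators. Each node
# is yielded with its depth, which is the current height of the stack.
def traverse_with_depth( start )
  enum = Enumerator.new do |yielder|
    traversals = [ start.links.each ]

    catch( :stop_traversal ) do
      until traversals.empty?
        begin
          node = traversals.last.next
          yielder.yield( node, traversals.length )
          traversals << node.links.each
        rescue StopIteration
          # The innermost enumerator is exhausted; back up one level.
          traversals.pop
        end
      end
    end
  end

  return enum
end

entity = ToyNode.new( 'entity', [] )
event  = ToyNode.new( 'event', [entity] )
act    = ToyNode.new( 'act', [event] )
group  = ToyNode.new( 'group action', [event, act] )

# Collect [name, depth] pairs for the whole traversal and print them indented.
pairs = traverse_with_depth( group ).map {|node, depth| [ node.name, depth ] }
pairs.each {|name, depth| puts '%s%s' % [ '  ' * (depth - 1), name ] }

# Stop early by throwing :stop_traversal from the block.
seen = []
traverse_with_depth( group ).each do |node, depth|
  seen << node.name
  throw :stop_traversal if node.name == 'act'
end
p seen  # => ["event", "entity", "act"]
```

Note that a node reachable by more than one path (event and entity here) is yielded once per path, which matches the repeated entries in the hypernym listing earlier on this page.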
|( othersyn )

Union: Return the least general synset that the receiver and othersyn have in common as a hypernym, or nil if they don't share any.

# File lib/wordnet/synset.rb, line 583
def |( othersyn )

        # Find all of this syn's hypernyms
        hypersyns = self.traverse( :hypernyms ).to_a
        commonsyn = nil

        # Now traverse the other synset's hypernyms looking for one of our
        # own hypernyms.
        othersyn.traverse( :hypernyms ) do |syn|
                if hypersyns.include?( syn )
                        commonsyn = syn
                        throw :stop_traversal
                end
        end

        return commonsyn
end
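
As an illustration of the same idea on a toy hierarchy, the sketch below collects one concept's ancestors and then walks the other concept's ancestors, nearest first, until it finds a shared one. ToyConcept and ancestors_list are hypothetical names, and the hierarchy is invented rather than taken from WordNet.

```ruby
# ToyConcept is a hypothetical stand-in for a synset with a list of
# more-general concepts (its hypernyms).
ToyConcept = Struct.new( :name, :hypernyms ) do
  # All ancestors in depth-first order, nearest first.
  def ancestors_list
    return hypernyms.flat_map {|hyper| [ hyper, *hyper.ancestors_list ] }
  end

  # First ancestor of +other+ that is also an ancestor of the receiver,
  # or nil if there is none.
  def |( other )
    mine = self.ancestors_list
    return other.ancestors_list.find {|concept| mine.include?( concept ) }
  end
end

entity = ToyConcept.new( 'entity', [] )
animal = ToyConcept.new( 'animal', [entity] )
canine = ToyConcept.new( 'canine', [animal] )
feline = ToyConcept.new( 'feline', [animal] )
dog    = ToyConcept.new( 'dog', [canine] )
cat    = ToyConcept.new( 'cat', [feline] )
rock   = ToyConcept.new( 'rock', [] )

common = dog | cat
p common.name   # => "animal"
p( dog | rock ) # => nil
```

As in the method above, neither operand counts as its own hypernym, so in this sketch dog | canine returns animal rather than canine.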