class WordNet::Synset
WordNet synonym-set object class
Instances of this class encapsulate the data for a synonym set ('synset') in a WordNet
lexical database. A synonym set is a set of words that are interchangeable in some context.
We can either fetch the synset from a connected Lexicon:
    lexicon = WordNet::Lexicon.new( 'postgres://localhost/wordnet31' )
    ss = lexicon[ :first, 'time' ]
    # => #<WordNet::Synset:0x7ffbf2643bb0 {115265518} 'commencement, first,
    #    get-go, offset, outset, start, starting time, beginning, kickoff,
    #    showtime' (noun): [noun.time] the time at which something is
    #    supposed to begin>
or if you've already created a Lexicon, use its connection indirectly to look up a Synset
by its ID:
    ss = WordNet::Synset[ 115265518 ]
    # => #<WordNet::Synset:0x7ffbf257e928 {115265518} 'commencement, first,
    #    get-go, offset, outset, start, starting time, beginning, kickoff,
    #    showtime' (noun): [noun.time] the time at which something is
    #    supposed to begin>
You can fetch a list of the lemmas (base forms) of the words included in the synset:
    ss.words.map( &:lemma )
    # => ["commencement", "first", "get-go", "offset", "outset", "start",
    #     "starting time", "beginning", "kickoff", "showtime"]
But the primary reason for a synset is its lexical and semantic links to other words and synsets. For instance, its hypernym is the equivalent of its superclass: it's the class of things of which the receiving synset is a member.
    ss.hypernyms
    # => [#<WordNet::Synset:0x7ffbf25c76c8 {115180528} 'point, point in
    #     time' (noun): [noun.time] an instant of time>]
The synset's hyponyms, on the other hand, are kind of like its subclasses:
    ss.hyponyms
    # => [#<WordNet::Synset:0x7ffbf25d83b0 {115142167} 'birth' (noun):
    #      [noun.time] the time when something begins (especially life)>,
    #     #<WordNet::Synset:0x7ffbf25d8298 {115268993} 'threshold' (noun):
    #      [noun.time] the starting point for a new state or experience>,
    #     #<WordNet::Synset:0x7ffbf25d8180 {115143012} 'incipiency,
    #      incipience' (noun): [noun.time] beginning to exist or to be
    #      apparent>,
    #     #<WordNet::Synset:0x7ffbf25d8068 {115266164} 'starting point,
    #      terminus a quo' (noun): [noun.time] earliest limiting point>]
Traversal
Synset also provides a few 'traversal' methods which provide recursive searching of a Synset's semantic links:
    # Recursively search for more-general terms for the synset, and print out
    # each one with indentation according to how distantly it's related.
    lexicon[ :fencing, 'sword' ].
      traverse(:hypernyms).with_depth.
      each {|ss, depth| puts "%s%s [%d]" % [' ' * (depth-1), ss.words.first, ss.synsetid] }
    # (outputs:)

    play [100041468]
     action [100037396]
      act [100030358]
       event [100029378]
        psychological feature [100023100]
         abstract entity [100002137]
          entity [100001740]
    combat [101170962]
     battle [100958896]
      group action [101080366]
       event [100029378]
        psychological feature [100023100]
         abstract entity [100002137]
          entity [100001740]
       act [100030358]
        event [100029378]
         psychological feature [100023100]
          abstract entity [100002137]
           entity [100001740]
See the Traversal Methods section for more details.
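To get a feel for what a depth-aware traversal yields without connecting to a database, here is a minimal self-contained sketch. The `HYPERNYMS` hash and the `each_hypernym` and `hypernym_outline` helpers are hypothetical stand-ins invented for illustration; they are not part of Ruby-WordNet, and the graph is only loosely modelled on the output above.

```ruby
# Hypothetical in-memory stand-in for hypernym links (not the real data model).
HYPERNYMS = {
  'fencing'      => ['play', 'combat'],
  'play'         => ['action'],
  'action'       => ['act'],
  'combat'       => ['battle'],
  'battle'       => ['group action'],
  'group action' => ['event', 'act'],
  'act'          => ['event'],
  'event'        => [],
}.freeze

# Depth-first walk that yields (term, depth) pairs.
# Without a block it returns an Enumerator over the same pairs.
def each_hypernym( term, depth = 1, &block )
  return enum_for( :each_hypernym, term, depth ) unless block

  HYPERNYMS.fetch( term, [] ).each do |parent|
    yield parent, depth
    each_hypernym( parent, depth + 1, &block )
  end
end

# Render the walk as an indented outline, one line per visited term.
def hypernym_outline( term )
  each_hypernym( term ).map {|name, depth| "%s%s" % [ '  ' * (depth - 1), name ] }
end

puts hypernym_outline( 'fencing' )
```

A term reachable by more than one path (here `act` and `event`) is visited once per path, which is also why the real output above repeats whole chains.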
Low-Level API
This library is implemented using Sequel::Model, an ORM layer on top of the excellent Sequel database toolkit. This means that in addition to the high-level methods above, you can also make use of a database-oriented API if you need to do something not provided by a high-level method.
In order to make use of this API, you'll need to be familiar with Sequel, especially Datasets and Model Associations. Most of Ruby-WordNet's functionality is implemented in terms of one or both of these.
Datasets
The main dataset is available from WordNet::Synset.dataset:
    WordNet::Synset.dataset
    # => #<Sequel::SQLite::Dataset: "SELECT * FROM `synsets`">
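In addition to this, Synset also defines a few other canned datasets. To facilitate searching by part of speech on the Synset class:
- WordNet::Synset.nouns
- WordNet::Synset.verbs
- WordNet::Synset.adjectives
- WordNet::Synset.adverbs
- WordNet::Synset.adjective_satellites
or by the semantic links for a particular Synset: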
- WordNet::Synset#also_see_dataset
- WordNet::Synset#attributes_dataset
- WordNet::Synset#causes_dataset
- WordNet::Synset#domain_categories_dataset
- WordNet::Synset#domain_member_categories_dataset
- WordNet::Synset#domain_member_regions_dataset
- WordNet::Synset#domain_member_usages_dataset
- WordNet::Synset#domain_regions_dataset
- WordNet::Synset#domain_usages_dataset
- WordNet::Synset#entailments_dataset
- WordNet::Synset#hypernyms_dataset
- WordNet::Synset#hyponyms_dataset
- WordNet::Synset#instance_hypernyms_dataset
- WordNet::Synset#instance_hyponyms_dataset
- WordNet::Synset#member_holonyms_dataset
- WordNet::Synset#member_meronyms_dataset
- WordNet::Synset#part_holonyms_dataset
- WordNet::Synset#part_meronyms_dataset
- WordNet::Synset#semlinks_dataset
- WordNet::Synset#semlinks_to_dataset
- WordNet::Synset#senses_dataset
- WordNet::Synset#similar_words_dataset
- WordNet::Synset#substance_holonyms_dataset
- WordNet::Synset#substance_meronyms_dataset
- WordNet::Synset#sumo_terms_dataset
- WordNet::Synset#verb_groups_dataset
- WordNet::Synset#words_dataset
Constants
- SEMANTIC_TYPEKEYS
Semantic link type keys; maps what the API calls them to what they are in the DB.
Attributes
Public Class Methods
Overridden to reset any lookup tables that may have been loaded from the previous database.
    # File lib/wordnet/synset.rb, line 297
    def self::db=( newdb )
      self.reset_lookup_tables
      super
    end
Return the table of lexical domains, keyed by id.
    # File lib/wordnet/synset.rb, line 315
    def self::lexdomain_table
      @lexdomain_table ||= self.db[:lexdomains].to_hash( :lexdomainid )
    end
Lexical domains, keyed by name as a String (e.g., “verb.cognition”)
    # File lib/wordnet/synset.rb, line 321
    def self::lexdomains
      @lexdomains ||= self.lexdomain_table.inject({}) do |hash,(id,domain)|
        hash[ domain[:lexdomainname] ] = domain
        hash
      end
    end
Return the table of link types, keyed by linkid
    # File lib/wordnet/synset.rb, line 330
    def self::linktype_table
      @linktype_table ||= self.db[:linktypes].inject({}) do |hash,row|
        hash[ row[:linkid] ] = {
          id: row[:linkid],
          typename: row[:link],
          type: row[:link].gsub( /\s+/, '_' ).to_sym,
          recurses: row[:recurses] && row[:recurses] != 0,
        }
        hash
      end
    end
Return the table of link types, keyed by name.
    # File lib/wordnet/synset.rb, line 344
    def self::linktypes
      @linktypes ||= self.linktype_table.inject({}) do |hash,(id,link)|
        hash[ link[:type] ] = link
        hash
      end
    end
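The two link-type tables hold the same records under different keys: one by numeric id, the other by a Symbol derived from the link name. The following sketch reproduces that transformation on plain Ruby data so it can be run without a database. The `LINKTYPE_ROWS` values and the `build_linktype_table` and `build_linktypes` helpers are assumptions made for illustration, not part of the library.

```ruby
# Hypothetical rows, shaped like what a query on a `linktypes` table might return.
LINKTYPE_ROWS = [
  { linkid: 1,  link: 'hypernym', recurses: 1 },
  { linkid: 2,  link: 'hyponym',  recurses: 1 },
  { linkid: 40, link: 'also see', recurses: 0 },
].freeze

# Build a table keyed by numeric id, normalizing the name into a Symbol
# and the integer flag into a boolean.
def build_linktype_table( rows )
  rows.each_with_object( {} ) do |row, hash|
    hash[ row[:linkid] ] = {
      id:       row[:linkid],
      typename: row[:link],
      type:     row[:link].gsub( /\s+/, '_' ).to_sym,
      recurses: row[:recurses] && row[:recurses] != 0,
    }
  end
end

# Re-key the same records by their Symbol type.
def build_linktypes( table )
  table.each_with_object( {} ) {|(_id, link), hash| hash[ link[:type] ] = link }
end

LINKTYPE_TABLE = build_linktype_table( LINKTYPE_ROWS )
LINKTYPES      = build_linktypes( LINKTYPE_TABLE )

p LINKTYPES[:also_see]
```

Because both tables share the same inner Hash objects, a lookup by id and a lookup by type return the identical record.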
Return the table of part-of-speech types, keyed by letter identifier.
    # File lib/wordnet/synset.rb, line 353
    def self::postype_table
      @postype_table ||= self.db[:postypes].inject({}) do |hash, row|
        hash[ row[:pos].untaint.to_sym ] = row[:posname]
        hash
      end
    end
Return the table of part-of-speech names to letter identifiers (both Symbols).
    # File lib/wordnet/synset.rb, line 362
    def self::postypes
      @postypes ||= self.postype_table.invert
    end
Unload all of the cached lookup tables that have been loaded.
    # File lib/wordnet/synset.rb, line 304
    def self::reset_lookup_tables
      @lexdomain_table = nil
      @lexdomains = nil
      @linktype_table = nil
      @linktypes = nil
      @postype_table = nil
      @postypes = nil
    end
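The lookup tables are memoized with `||=` and discarded whenever the database is swapped. A minimal sketch of that pattern follows, assuming a hypothetical `CachedLookup` class in which a plain Hash plays the role of the database handle. The real class delegates to `super` in `db=`; this stand-in simply stores the value.

```ruby
# Hypothetical stand-in for a model class that caches lookup tables per database.
class CachedLookup
  class << self
    attr_reader :db

    # Swapping the database discards anything cached from the previous one.
    def db=( newdb )
      reset_lookup_tables
      @db = newdb
    end

    # Built on first access, then reused until the next reset.
    def postype_table
      @postype_table ||= db.fetch( :postypes ).to_h {|row| [ row[:pos].to_sym, row[:posname] ] }
    end

    # The inverse mapping, derived from the cached table.
    def postypes
      @postypes ||= postype_table.invert
    end

    def reset_lookup_tables
      @postype_table = nil
      @postypes = nil
    end
  end
end

CachedLookup.db = { postypes: [ { pos: 'n', posname: 'noun' }, { pos: 'v', posname: 'verb' } ] }
p CachedLookup.postypes

CachedLookup.db = { postypes: [ { pos: 'r', posname: 'adverb' } ] }
p CachedLookup.postypes
```

Without the reset in `db=`, the second call would still return the table built from the first database.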
Generate methods that will return Synsets related by the given semantic pointer type.
    # File lib/wordnet/synset.rb, line 376
    def self::semantic_link( type )
      self.log.debug "Generating a %p method" % [ type ]

      ds_method_body = Proc.new do
        self.semanticlink_dataset( type )
      end
      define_method( "#{type}_dataset", &ds_method_body )

      ss_method_body = Proc.new do
        self.semanticlink_dataset( type ).all
      end
      define_method( type, &ss_method_body )

      self.semantic_link_methods << type.to_sym
    end
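The macro above generates two instance methods per link type: one returning a lazy dataset and one returning the materialized results. The sketch below shows the same `define_method` technique on a toy class. `ToySynset`, its `LINKS` data, and the use of a lazy enumerator in place of a Sequel dataset are all assumptions made for illustration.

```ruby
# Hypothetical toy class showing a class-level macro that generates
# a pair of instance methods per link type.
class ToySynset
  LINKS = {
    hypernyms: %w[point],
    hyponyms:  %w[birth threshold],
  }.freeze

  # Names of the generated reader methods, in declaration order.
  def self.semantic_link_methods
    @semantic_link_methods ||= []
  end

  # Define `<type>_dataset` (lazy) and `<type>` (materialized) methods.
  def self.semantic_link( type )
    define_method( "#{type}_dataset" ) { semanticlink_dataset( type ) }
    define_method( type ) { semanticlink_dataset( type ).to_a }
    semantic_link_methods << type.to_sym
  end

  # Stand-in for a dataset: a lazy enumerator over the linked names.
  def semanticlink_dataset( type )
    LINKS.fetch( type ).lazy
  end

  semantic_link :hypernyms
  semantic_link :hyponyms
end

toy = ToySynset.new
p toy.hyponyms
p ToySynset.semantic_link_methods
```

Keeping the `_dataset` variant lazy lets callers refine the query before any results are loaded, which is the point of exposing both forms.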
Public Instance Methods
Return a human-readable representation of the objects, suitable for debugging.
    # File lib/wordnet/synset.rb, line 689
    def inspect
      return "#<%p:%0#x {%d} '%s' (%s): [%s] %s>" % [
        self.class,
        self.object_id * 2,
        self.synsetid,
        self.wordlist.join(', '),
        self.part_of_speech,
        self.lexical_domain,
        self.definition,
      ]
    end
Return the name of the lexical domain the synset belongs to; this also corresponds to the lexicographer's file the synset was originally loaded from.
    # File lib/wordnet/synset.rb, line 447
    def lexical_domain
      return self.class.lexdomain_table[ self.lexdomainid ][ :lexdomainname ]
    end
Return the name of the Synset's part of speech (pos).
    # File lib/wordnet/synset.rb, line 417
    def part_of_speech
      return self.class.postype_table[ self.pos.to_sym ]
    end
Return any sample sentences.
    # File lib/wordnet/synset.rb, line 453
    def samples
      return self.db[:samples].
        filter( synsetid: self.synsetid ).
        order( :sampleid ).
        map( :sample )
    end
Return a Sequel::Dataset for synsets related to the receiver via the semantic link of the specified type.
    # File lib/wordnet/synset.rb, line 399
    def semanticlink_dataset( type )
      typekey  = SEMANTIC_TYPEKEYS[ type ]
      linkinfo = self.class.linktypes[ typekey ] or
        raise ArgumentError, "no such link type %p" % [ typekey ]
      ssids    = self.semlinks_dataset.filter( linkid: linkinfo[:id] ).select( :synset2id )

      return self.class.filter( synsetid: ssids )
    end
Return an Enumerator that will iterate over the Synsets related to the receiver via the semantic links of the specified linktype.
    # File lib/wordnet/synset.rb, line 411
    def semanticlink_enum( linktype )
      return self.semanticlink_dataset( linktype ).to_enum
    end
The WordNet::SemanticLinks indicating a relationship with other WordNet::Synsets
    # File lib/wordnet/synset.rb, line 196
    one_to_many :semlinks,
      class: 'WordNet::SemanticLink',
      key: :synset1id,
      primary_key: :synsetid,
      eager: :target
The WordNet::SemanticLinks pointing to this Synset
    # File lib/wordnet/synset.rb, line 205
    many_to_one :semlinks_to,
      class: 'WordNet::SemanticLink',
      key: :synsetid,
      primary_key: :synset2id
The WordNet::Senses associated with the receiver
    # File lib/wordnet/synset.rb, line 188
    one_to_many :senses,
      key: :synsetid,
      primary_key: :synsetid
Terms from the Suggested Upper Merged Ontology
    # File lib/wordnet/synset.rb, line 213
    many_to_many :sumo_terms,
      join_table: :sumomaps,
      left_key: :synsetid,
      right_key: :sumoid
Stringify the synset.
    # File lib/wordnet/synset.rb, line 423
    def to_s
      # Make a sorted list of the semantic link types from this synset
      semlink_list = self.semlinks_dataset.
        group_and_count( :linkid ).
        to_hash( :linkid, :count ).
        collect do |linkid, count|
          '%s: %d' % [ self.class.linktype_table[linkid][:typename], count ]
        end.
        sort.
        join( ', ' )

      return "%s (%s): [%s] %s (%s)" % [
        self.words.map( &:to_s ).join(', '),
        self.part_of_speech,
        self.lexical_domain,
        self.definition,
        semlink_list
      ]
    end
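The trailing parenthesized part of the stringified form is a sorted count of outgoing links per type. As a rough illustration of that summary step only, the sketch below counts link type names with `Enumerable#tally` in place of the database-side `group_and_count`. The `summarize_links` helper and its input are hypothetical.

```ruby
# Summarize a list of link type names (one entry per outgoing link)
# as a sorted "name: count" string.
def summarize_links( typenames )
  typenames.tally.
    map {|name, count| '%s: %d' % [ name, count ] }.
    sort.
    join( ', ' )
end

puts summarize_links( %w[hyponym hypernym hyponym] )
```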
Return the Synset's Words as an Array of Strings.
    # File lib/wordnet/synset.rb, line 683
    def wordlist
      return self.words.map( &:to_s )
    end
The WordNet::Words associated with the receiver
    # File lib/wordnet/synset.rb, line 180
    many_to_many :words,
      join_table: :senses,
      left_key: :synsetid,
      right_key: :wordid
Dataset Methods
Public Instance Methods
:singleton-method: adjective_satellites
Limit results to adjective satellites.
    # File lib/wordnet/synset.rb, line 286
    def adjective_satellites
      return self.where( pos: 's' )
    end
:singleton-method: adjectives
Limit results to adjectives.
    # File lib/wordnet/synset.rb, line 272
    def adjectives
      return self.where( pos: 'a' )
    end
:singleton-method: adverbs
Limit results to adverbs.
    # File lib/wordnet/synset.rb, line 279
    def adverbs
      return self.where( pos: 'r' )
    end
:singleton-method: nouns
Limit results to nouns.
    # File lib/wordnet/synset.rb, line 258
    def nouns
      return self.where( pos: 'n' )
    end
:singleton-method: verbs
Limit results to verbs.
    # File lib/wordnet/synset.rb, line 265
    def verbs
      return self.where( pos: 'v' )
    end
Semantic Links
Public Instance Methods
“See Also” synsets
    # File lib/wordnet/synset.rb, line 467
    semantic_link :also_see
Attribute synsets
    # File lib/wordnet/synset.rb, line 471
    semantic_link :attributes
Cause synsets
    # File lib/wordnet/synset.rb, line 475
    semantic_link :causes
Domain category synsets
    # File lib/wordnet/synset.rb, line 479
    semantic_link :domain_categories
Domain member category synsets
    # File lib/wordnet/synset.rb, line 483
    semantic_link :domain_member_categories
Domain member region synsets
    # File lib/wordnet/synset.rb, line 487
    semantic_link :domain_member_regions
Domain member usage synsets
    # File lib/wordnet/synset.rb, line 491
    semantic_link :domain_member_usages
Domain region synsets
    # File lib/wordnet/synset.rb, line 495
    semantic_link :domain_regions
Domain usage synsets
    # File lib/wordnet/synset.rb, line 499
    semantic_link :domain_usages
Verb entailment synsets
    # File lib/wordnet/synset.rb, line 503
    semantic_link :entailments
Hypernym synsets
    # File lib/wordnet/synset.rb, line 507
    semantic_link :hypernyms
Hyponym synsets
    # File lib/wordnet/synset.rb, line 511
    semantic_link :hyponyms
Instance hypernym synsets
    # File lib/wordnet/synset.rb, line 515
    semantic_link :instance_hypernyms
Instance hyponym synsets
    # File lib/wordnet/synset.rb, line 519
    semantic_link :instance_hyponyms
Member holonym synsets
    # File lib/wordnet/synset.rb, line 523
    semantic_link :member_holonyms
Member meronym synsets
    # File lib/wordnet/synset.rb, line 527
    semantic_link :member_meronyms
Part holonym synsets
    # File lib/wordnet/synset.rb, line 531
    semantic_link :part_holonyms
Part meronym synsets
    # File lib/wordnet/synset.rb, line 535
    semantic_link :part_meronyms
Similar word synsets
    # File lib/wordnet/synset.rb, line 539
    semantic_link :similar_words
Substance holonym synsets
    # File lib/wordnet/synset.rb, line 543
    semantic_link :substance_holonyms
Substance meronym synsets
    # File lib/wordnet/synset.rb, line 547
    semantic_link :substance_meronyms
Verb group synsets
    # File lib/wordnet/synset.rb, line 551
    semantic_link :verb_groups
Traversal Methods
Public Instance Methods
Search for the specified synset in the semantic links of the given type of the receiver, returning the depth it was found at if it's found, or nil if it wasn't found.
    # File lib/wordnet/synset.rb, line 672
    def search( type, synset )
      found, depth = self.traverse( type ).with_depth.find {|ss,depth| synset == ss }
      return depth
    end
With a block, yield a WordNet::Synset related to the receiver via a link of the specified type, recursing depth first into each of its links if the link type is recursive. To exit from the traversal at any depth, throw :stop_traversal.
If no block is given, return an Enumerator that will do the same thing instead.
    # Print all the parts of a boot
    puts lexicon[:boot].traverse( :part_meronyms ).to_a
You can also traverse with an additional argument that indicates the depth of recursion by calling with_depth on the Enumerator:
    $lex[:fencing].traverse( :hypernyms ).with_depth.each {|ss,d| puts "%02d: %s" % [d,ss] }
    # (outputs:)

    01: play, swordplay (noun): [noun.act] the act using a sword (or other weapon)
        vigorously and skillfully (hypernym: 1, hyponym: 1)
    02: action (noun): [noun.act] something done (usually as opposed to something
        said) (hypernym: 1, hyponym: 33)
    03: act, deed, human action, human activity (noun): [noun.tops] something that
        people do or cause to happen (hypernym: 1, hyponym: 40)
    ...
    # File lib/wordnet/synset.rb, line 625
    def traverse( type, &block )
      enum = Enumerator.new do |yielder|
        traversals = [ self.semanticlink_enum(type) ]
        syn        = nil
        typekey    = SEMANTIC_TYPEKEYS[ type ]
        recurses   = self.class.linktypes[ typekey ][:recurses]

        self.log.debug "Traversing %s semlinks%s" % [ type, recurses ? " (recursive)" : '' ]

        catch( :stop_traversal ) do
          until traversals.empty?
            begin
              self.log.debug " %d traversal/s left" % [ traversals.length ]
              syn = traversals.last.next

              if enum.with_depth?
                yielder.yield( syn, traversals.length )
              else
                yielder.yield( syn )
              end

              traversals << syn.semanticlink_enum( type ) if recurses
            rescue StopIteration
              traversals.pop
            end
          end
        end
      end

      def enum.with_depth?
        @with_depth = false if !defined?( @with_depth )
        return @with_depth
      end

      def enum.with_depth
        @with_depth = true
        self
      end

      return enum.each( &block ) if block
      return enum
    end
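The traversal keeps a stack of external enumerators, so the stack size at the moment of a yield is the depth of the yielded item, and a `throw :stop_traversal` from the caller's block unwinds the whole walk. Here is a minimal sketch of that technique on plain Ruby data. `PART_LINKS`, `walk_links`, and `parts_before` are hypothetical names, and the part names are invented rather than taken from WordNet.

```ruby
# Hypothetical link graph; each key maps to its directly linked terms.
PART_LINKS = {
  'boot'   => ['heel', 'upper'],
  'heel'   => ['lift'],
  'upper'  => ['tongue'],
  'lift'   => [],
  'tongue' => [],
}.freeze

# Iterative depth-first walk. The stack holds one external enumerator per
# level, so its length when a term is yielded is that term's depth.
def walk_links( start )
  return enum_for( :walk_links, start ) unless block_given?

  stack = [ PART_LINKS.fetch( start, [] ).each ]
  catch( :stop_traversal ) do
    until stack.empty?
      begin
        term = stack.last.next
        yield term, stack.length
        stack << PART_LINKS.fetch( term, [] ).each
      rescue StopIteration
        # The innermost enumerator is exhausted; resume its parent.
        stack.pop
      end
    end
  end
end

# Collect terms until stop_term is reached, then abandon the rest of the walk.
def parts_before( start, stop_term )
  seen = []
  walk_links( start ) do |term, _depth|
    throw :stop_traversal if term == stop_term
    seen << term
  end
  seen
end

p walk_links( 'boot' ).to_a
p parts_before( 'boot', 'upper' )
```

Using `catch`/`throw` rather than `break` lets the caller abort from any depth, since the block may be running several enumerators deep when it decides to stop.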
Union: Return the least general synset that the receiver and othersyn have in common as a hypernym, or nil if it doesn't share any.
    # File lib/wordnet/synset.rb, line 583
    def |( othersyn )

      # Find all of this syn's hypernyms
      hypersyns = self.traverse( :hypernyms ).to_a
      commonsyn = nil

      # Now traverse the other synset's hypernyms looking for one of our
      # own hypernyms.
      othersyn.traverse( :hypernyms ) do |syn|
        if hypersyns.include?( syn )
          commonsyn = syn
          throw :stop_traversal
        end
      end

      return commonsyn
    end
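The union operator collects the receiver's hypernyms and then walks the other synset's hypernyms until it meets one of them. The sketch below applies the same idea to a hypothetical single-parent hierarchy. `ANIMAL_PARENT`, `hypernym_chain`, and `common_hypernym` are illustrative names only, and real synsets can have more than one hypernym.

```ruby
# Hypothetical single-parent hierarchy: each term maps to its one hypernym.
ANIMAL_PARENT = {
  'dog'       => 'canine',
  'canine'    => 'carnivore',
  'cat'       => 'feline',
  'feline'    => 'carnivore',
  'carnivore' => 'animal',
}.freeze

# All hypernyms of a term, nearest first.
def hypernym_chain( term )
  chain = []
  while ( term = ANIMAL_PARENT[term] )
    chain << term
  end
  chain
end

# The first hypernym of `other` (nearest first) that is also a hypernym
# of `term`, or nil if the two chains never meet.
def common_hypernym( term, other )
  mine = hypernym_chain( term )
  hypernym_chain( other ).find {|candidate| mine.include?( candidate ) }
end

p common_hypernym( 'dog', 'cat' )
p common_hypernym( 'dog', 'rock' )
```

Searching the other chain nearest-first is what makes the result the least general shared hypernym rather than the root.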