class Glove::Workers::CooccurrenceWorker

Constructs the co-occurrence matrix for {Glove::Model}

Attributes

token_index[R]

@!attribute [r] token_index

@return [Hash{String=>Integer}] Clone of @caller.token_index

token_pairs[R]

@!attribute [r] token_pairs

@return [Array<Glove::TokenPair>] Clone of @caller.token_pairs

Public Class Methods

new(caller)

Creates an instance of the class and copies the caller's token index and token pairs.

@param [Glove::Model] caller The model instance that invokes the worker

# File lib/glove/workers/cooccurrence_worker.rb, line 18
def initialize(caller)
  @caller = caller
  @token_index = @caller.token_index.dup
  @token_pairs = @caller.token_pairs.dup
end
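
The constructor copies the caller's collections with `dup`, which in Ruby is a shallow copy. The sketch below illustrates what that implies, using a plain Hash and Array as hypothetical stand-ins for the model's `token_index` and `token_pairs` (the nested arrays are not the real Glove::TokenPair objects):

```ruby
# Hypothetical stand-ins for the model's collections.
model_index = { "quick" => 0, "fox" => 1 }
model_pairs = [["quick", ["fox"]], ["fox", ["quick"]]]

# What the constructor does with each collection.
worker_index = model_index.dup
worker_pairs = model_pairs.dup

# Adding a key to the copy leaves the original Hash untouched.
worker_index["dog"] = 2
index_unchanged = !model_index.key?("dog")

# The copy is shallow: both arrays still reference the same pair objects.
pairs_shared = worker_pairs.first.equal?(model_pairs.first)
```

Under these assumptions, structural changes to the worker's copies do not reach the model, while in-place changes to an individual pair would be visible to both.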

Public Instance Methods

build_cooc_matrix_col(slice)

Creates a vector column of the cooc_matrix for the given token. For each token in the vocabulary, the column holds the number of times the given token appears in that token's context (its neighbors), summed over all token pairs.

@param [Array<(String, Integer)>] slice Token with index

@return [Array] GSL::Vector#to_a representation of the column

# File lib/glove/workers/cooccurrence_worker.rb, line 41
def build_cooc_matrix_col(slice)
  # slice is a [token, index] pair; only the token is needed here
  token = slice[0]
  vector = GSL::Vector.alloc(token_index.size)

  token_pairs.each do |pair|
    key = token_index[pair.token]
    # Count how often token occurs among this pair's neighbors
    sum = pair.neighbors.count(token)
    vector[key] += sum
  end

  vector.to_a
end
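
To make the counting concrete, here is a minimal pure-Ruby sketch of the same logic on toy data. It is not the library's implementation: `ToyPair` is a hypothetical stand-in for Glove::TokenPair, `cooc_column` is an illustrative helper name, and a plain Array replaces GSL::Vector so the example runs without the GSL bindings.

```ruby
# Hypothetical stand-in for Glove::TokenPair: a token plus its context words.
ToyPair = Struct.new(:token, :neighbors)

# Toy vocabulary index (token => position) and token pairs.
token_index = { "quick" => 0, "brown" => 1, "fox" => 2 }
token_pairs = [
  ToyPair.new("quick", ["brown", "fox"]),
  ToyPair.new("brown", ["quick", "fox"]),
  ToyPair.new("fox",   ["quick", "brown"]),
  ToyPair.new("quick", ["fox"])
]

# Same counting as build_cooc_matrix_col, with an Array instead of a GSL::Vector.
def cooc_column(token, token_index, token_pairs)
  column = Array.new(token_index.size, 0)
  token_pairs.each do |pair|
    position = token_index[pair.token]
    column[position] += pair.neighbors.count(token)
  end
  column
end

fox_column = cooc_column("fox", token_index, token_pairs)
# fox_column is [2, 1, 0]: "fox" is a neighbor of "quick" twice,
# of "brown" once, and never of itself.
```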

run()

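Builds the co-occurrence matrix by computing one column per token in token_index, with the work distributed across parallel processes.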

@return [GSL::Matrix] The co-occurrence matrix

# File lib/glove/workers/cooccurrence_worker.rb, line 27
def run
  vectors = Parallel.map(token_index, in_processes: threads) do |slice|
    build_cooc_matrix_col(slice)
  end

  GSL::Matrix.alloc(*vectors)
end
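
The overall flow of `run` can be sketched without the Parallel and GSL dependencies. The example below is an approximation under stated assumptions: a sequential `Hash#map` stands in for `Parallel.map`, a nested Array stands in for GSL::Matrix, and `ContextPair` is a hypothetical replacement for Glove::TokenPair. It mainly shows that iterating a Hash yields `[token, index]` slices, which is why the worker reads the token from `slice[0]`.

```ruby
# Hypothetical stand-in for Glove::TokenPair.
ContextPair = Struct.new(:token, :neighbors)

vocab_index = { "quick" => 0, "brown" => 1, "fox" => 2 }
context_pairs = [
  ContextPair.new("quick", ["brown", "fox"]),
  ContextPair.new("brown", ["quick", "fox"]),
  ContextPair.new("fox",   ["quick", "brown"])
]

# Sequential stand-in for Parallel.map: each slice is a [token, index] pair.
columns = vocab_index.map do |slice|
  token = slice[0]
  column = Array.new(vocab_index.size, 0)
  context_pairs.each do |pair|
    column[vocab_index[pair.token]] += pair.neighbors.count(token)
  end
  column
end

# columns now holds one count vector per vocabulary token, in index order:
# [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
```

Because every column depends only on the read-only index and pairs, the per-token computations are independent of one another, which is what makes them suitable for running in separate processes.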